> Is there a way to get a list of all the Fail frames
> for *all* jobs on the network inside Irush?
>
> Sometimes we get a fair number of Fail frames, and want an easy way to see
> if there's a network-wide trend across all jobs causing the problem.
> (e.g. if one particular host is responsible for all the failures,
> or if there's a problem with one of the file servers or license managers.)

Yes. You can write a script that generates this report, and the
report will show up in irush, so you can view logs and requeue
frames using the Frame controls as you'd expect.

The script would a) get a list of all the jobids in the shop,
and b) run 'rush -lf | grep ^Fail' to get a list of all the
fail frames from each job.

To get such a report, save a script like the one below and add it
as a hotkey in Irush, so that its output appears as its own
"Frames" report:

1) Save the script below as '/your/server/bin/all-fail-frames.pl'
2) Go into irush Hotkeys -> Edit
3) Click New, and fill out the form:

       Name: "All Fail Frames"
       Command: perl /your/server/bin/all-fail-frames.pl
       Hotkey: F1
       Output: Upper:Frame

4) Hit Done

Then, when you hit F1, you'll get a list of all the failed frames
in Irush with the 'Frames' controls you're used to, so you can
double click frames to view the logs, etc.
Here's the 'all-fail-frames.pl' script:
--- snip
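#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # unbuffered output

# LIST ALL JOBS
#    Collect the jobid of every job whose status is Run, Fail or Pause.
my $jobids = "";
open(LAJ, "rush -laj|") || die "rush -laj: $!\n";
while ( <LAJ> )
{
    if ( /^(?:Run|Fail|Pause)\s+(\S+)/ )
        { $jobids .= "$1 "; }
}
close(LAJ);

# No jobs found: avoid calling 'rush -lf' with no jobids
exit(0) if ( $jobids eq "" );

# LIST FAIL FRAMES IN ALL JOBS
open(LF, "rush -lf $jobids -t 5|") || die "rush -lf: $!\n";
my $header = 0;
while ( <LF> )
{
    # Print header only once
    if ( /^Status/ && ! $header )
        { $header = 1; print $_; }

    # Print any failed frames
    if ( /^Fail/ )
        { print $_; }
}
close(LF);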
--- snip
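
To check whether one particular host is behind most of the failures, the
report can also be tallied per host from the command line. The following
is only a rough sketch: the function name 'tally_fail_hosts' is made up
for this example, and it assumes that the "Status" header line of the
report contains a column literally named "Hostname" and that the columns
up to and including it are whitespace-separated. Check a sample of your
own 'rush -lf' output before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: count failed frames per host.
# Reads an all-fail-frames style report on stdin.
# ASSUMPTION: the header line starts with "Status" and has a column
# named "Hostname"; that column's position is used for the Fail rows.
tally_fail_hosts() {
    awk '
        # Find the position of the Hostname column from the first header
        /^Status/ && !col {
            for (i = 1; i <= NF; i++)
                if (tolower($i) == "hostname") col = i
            next
        }
        # Count each failed frame against the host that ran it
        /^Fail/ && col { count[$col]++ }
        END { for (h in count) printf "%d %s\n", count[h], h }
    ' | sort -rn    # hosts with the most failures first
}
```

Usage, once the function is defined in your shell:

    perl /your/server/bin/all-fail-frames.pl | tally_fail_hosts

If no "Hostname" column is found in the header, nothing is printed.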