Rush Render Queue - cpu.acct File
(C) Copyright 1995,2000 Greg Ercolano. All rights reserved.
V 102.43 06/14/08
Strikeout text indicates features not yet implemented
$RUSH_DIR/etc/cpu.acct

File Format

    r start 940000000 online
    p 940001000 tahoe.798 WERNER/C33 erco 0106 superior 100k 122 0 0 0 27823
    p 940001123 tahoe.797 KILLER erco 0504 superior 200 121 0 0 0 27846
    s 940007123 offline fred@tahoe[192.17.1.34] -
    m 940100000
    r end 940100000 offline
    d stop 940100005 offline
    d start 940100008 offline
    ^
   /|\
    |_______
            |
            'r' - Log rotation
            'p' - Process completed
            's' - State change of daemon (online/offline)
            'm' - Midnight marker
            'd' - Daemon start/stop

    p 948242783 tahoe.798 WERNER/C33 erco 0106 superior 100k 122 0 0 0 27822
    p 948242783 tahoe.798 WERNER/C33 erco 0107 superior 100k 122 0 0 0 27834
    p 948242865 tahoe.797 KILLER     erco 0504 superior 200  121 0 0 0 27846
    - --------- --------- ---------- ---- ---- -------- ---- --- - - - -----
    | |         |         |          |    |    |        |    |   | | | |
    | |         |         |          |    |    |        |    |   | | | Pid
    | |         |         |          |    |    |        |    |   | | Exit code
    | |         |         |          |    |    |        |    |   | #Secs User Time
    | |         |         |          |    |    |        |    |   #Secs System Time
    | |         |         |          |    |    |        |    #Secs Wall Clock Time
    | |         |         |          |    |    |        Priority
    | |         |         |          |    |    Host that ran the process
    | |         |         |          |    Frame that ran
    | |         |         |          *Job owner
    | |         |         Title of job
    | |         Jobid
    | time(2) process started
    'p' indicates 'process entry'

* The job owner is not necessarily the owner of the process. This happens when Windows jobs run frames on Unix machines, or when 'forceuid' is configured in the rush.conf file.

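The field layout above can be read mechanically. The following is a minimal sketch, assuming every 'p' entry has exactly 13 whitespace-separated fields in the order shown in the diagram, and that job titles contain no spaces. The names `ProcessEntry` and `parse_process_entry` are hypothetical and are not part of Rush.

```python
from typing import NamedTuple


class ProcessEntry(NamedTuple):
    """One 'p' line from cpu.acct (hypothetical container, not part of Rush)."""
    started: int     # time(2) the process started
    jobid: str
    title: str
    owner: str       # job owner; not necessarily the owner of the process
    frame: str
    host: str
    priority: str    # kept as text, since values such as "100k" are not plain integers
    wall_secs: int
    sys_secs: int
    user_secs: int
    exit_code: int
    pid: int


def parse_process_entry(line: str) -> ProcessEntry:
    """Split a 'p' entry into named fields; assumes 13 whitespace-separated fields."""
    f = line.split()
    if len(f) != 13 or f[0] != "p":
        raise ValueError(f"not a 13-field 'p' entry: {line!r}")
    return ProcessEntry(
        int(f[1]), f[2], f[3], f[4], f[5], f[6], f[7],
        int(f[8]), int(f[9]), int(f[10]), int(f[11]), int(f[12]),
    )
```

The priority is left as a string because the sample entries mix forms such as `100k` and `200`.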
Rotation entries help applications determine the start/end range of times a particular log file covers.

    r start 948200000 online
    r end 948300001 online
    - ----- --------- ------
    | |     |         |
    | |     |         (New in 102.42a9)
    | |     |         Indicates 'online' or 'offline' state of daemon
    | |     |
    | |     time(2) file was rotated
    | |
    | 'start' indicates the time the new log was created
    | 'end' indicates time log was rotated out (Ocpu.acct files only)
    |
    'r' indicates log file was rotated, either manually or automatically

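As an illustration of how an application might use rotation entries, here is a minimal sketch that scans the lines of one log file and returns the start and end times it covers. The function name `log_time_range` is hypothetical. It assumes at most one 'r start' and one 'r end' entry per file, and returns `None` for whichever is absent (for example, a live cpu.acct that has not been rotated out yet).

```python
def log_time_range(lines):
    """Return (start, end) times from the 'r' entries, or None where missing."""
    start = end = None
    for line in lines:
        f = line.split()
        if len(f) >= 3 and f[0] == "r":
            if f[1] == "start":
                start = int(f[2])
            elif f[1] == "end":
                end = int(f[2])
    return start, end
```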
(New in 102.42a9) An 's' entry is logged when someone changes the online/offline state of the daemon. It records when the change was made, the new state (online or offline), who made it, the machine the change was issued from, and optional remarks (if any).

    s 982330201 online jerry@tahoe[192.17.1.34] -
    s 982334241 offline root@meade[192.15.0.177] Offline for maintenance
    - --------- ------- ------------------------ ---------------------------
    | |         |       |                        |
    | |         |       |                        Optional remarks ('-' if none)
    | |         |       |
    | |         |       User@host who invoked the online/offline command
    | |         |
    | |         'online' or 'offline'
    | |
    | time(2) state was changed
    |
    's' indicates daemon's online/offline state was changed

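Because the remarks field may contain spaces, an 's' entry cannot be split on whitespace alone. The sketch below, with the hypothetical name `parse_state_entry`, splits off the first four fields and treats the rest of the line as remarks. It assumes the invoker always has the `user@host[address]` form shown in the samples.

```python
import re

# user@host[address], as in the sample entries
_WHO = re.compile(r"(?P<user>[^@\s]+)@(?P<host>[^\[\s]+)\[(?P<ip>[^\]]+)\]$")


def parse_state_entry(line):
    """Parse an 's' entry into a dict; remarks is None when logged as '-'."""
    f = line.strip().split(None, 4)   # at most 5 pieces; remarks keep their spaces
    if len(f) < 4 or f[0] != "s":
        raise ValueError(f"not an 's' entry: {line!r}")
    who = _WHO.match(f[3])
    if who is None:
        raise ValueError(f"unexpected user@host field: {f[3]!r}")
    remarks = f[4] if len(f) == 5 and f[4] != "-" else None
    return {"time": int(f[1]), "state": f[2], **who.groupdict(), "remarks": remarks}
```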
Midnight marks are useful for applications to determine days that were completely idle, such as when a log isn't rotated for several days.

    m 1114326000
    - ----------
    | |
    | time(2) mark occurred
    |
    'm' indicates a midnight time marker.

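One way an application might detect idle days is to look for two consecutive midnight marks with no 'p' entry between them. The sketch below does that. The name `idle_days` is hypothetical, and "idle" here only means that no process entry was logged between the two marks. The span before the first mark in a file is ignored because it may cover only part of a day.

```python
def idle_days(lines):
    """Return (prev_midnight, midnight) pairs that have no 'p' entry between them."""
    idle = []
    prev_mark = None
    saw_process = False
    for line in lines:
        f = line.split()
        if not f:
            continue
        if f[0] == "p":
            saw_process = True
        elif f[0] == "m" and len(f) >= 2:
            mark = int(f[1])
            if prev_mark is not None and not saw_process:
                idle.append((prev_mark, mark))
            prev_mark = mark
            saw_process = False
    return idle
```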
Daemon boot messages indicate when the daemon was started/stopped.

    d 1114326000 start "reason"
    d 1114326005 stop "reason"
    - ---------- ----- --------
    | |          |     |
    | |          |     (Optional) user supplied reason why daemon was stopped/started
    | |          |
    | |          start|stop
    | |
    | time(2) mark occurred
    |
    'd' indicates a daemon boot message

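The optional reason is quoted, so a shell-style tokenizer is a convenient way to read a 'd' entry. This is a minimal sketch using Python's `shlex.split`. It assumes the field order shown in the diagram above (type, time, start|stop, optional quoted reason), and the name `parse_daemon_entry` is hypothetical.

```python
import shlex


def parse_daemon_entry(line):
    """Parse a 'd' entry; reason is None when no reason was supplied."""
    f = shlex.split(line)   # keeps a double-quoted reason together as one field
    if len(f) < 3 or f[0] != "d" or f[2] not in ("start", "stop"):
        raise ValueError(f"not a 'd' entry: {line!r}")
    return {
        "time": int(f[1]),
        "action": f[2],
        "reason": f[3] if len(f) > 3 else None,
    }
```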
    -15 - process killed with SIGTERM; someone probably manually killed it
     -9 - process killed with SIGKILL; probably bumped in a priority battle
     -3 - process killed with SIGINT; someone sent it a ^C
      0 - process did an exit(0); frame Done
      1 - process did an exit(1); frame Fail
      2 - process did an exit(2); frame Requeue

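For reporting tools, the table above can be carried as a simple lookup. The sketch below copies the listed codes and their meanings verbatim. The names `EXIT_CODES` and `describe_exit_code` are hypothetical, and codes not in the table are reported as unlisted rather than guessed at.

```python
# Meanings copied from the exit code table above
EXIT_CODES = {
    -15: "process killed with SIGTERM; someone probably manually killed it",
    -9: "process killed with SIGKILL; probably bumped in a priority battle",
    -3: "process killed with SIGINT; someone sent it a ^C",
    0: "process did an exit(0); frame Done",
    1: "process did an exit(1); frame Fail",
    2: "process did an exit(2); frame Requeue",
}


def describe_exit_code(code):
    """Return the documented meaning of an exit code, or a generic note."""
    return EXIT_CODES.get(code, f"unlisted exit code {code}")
```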
NFS is the 'kiss of death' for daemons (rush, cron) if the NFS server hangs or goes down. As soon as the daemon tries to touch a hung NFS mount (e.g. rush appending a line to cpu.acct when a frame finishes), the daemon hangs completely. In the case of rush, this not only makes the daemon unresponsive to irush during the outage; if the mounts are 'hard', the daemon also becomes unkillable.

To properly bill for cpu time, you would need to either enable full Unix process accounting to obtain accumulated cpu time for all sub-processes in the user's render script, or create wrapper scripts that use programs like timex(1) to monitor the execution time of the critical render/compositor processes.

Tools like timex(1) indicate in their documentation that Unix process accounting must be enabled for them to show sub-process totals. Enabling it is usually prohibitive on production machines because of the disk resources the Unix process accounting system consumes.
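
As a rough illustration of the wrapper-script approach, the sketch below times a single command from Python on a Unix machine. It uses `getrusage(RUSAGE_CHILDREN)` instead of timex(1); this is a substitute technique and is not something Rush provides. The name `run_timed` is hypothetical. The totals include only the child and any descendants that were waited for, so a render script that leaves background processes behind would be undercounted.

```python
import resource
import subprocess
import time


def run_timed(argv):
    """Run argv and return its exit code plus wall, user and system seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    t0 = time.monotonic()
    proc = subprocess.run(argv)
    wall = time.monotonic() - t0
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "exit_code": proc.returncode,
        "wall_secs": wall,
        # CPU time accumulated by waited-for children during this call
        "user_secs": after.ru_utime - before.ru_utime,
        "sys_secs": after.ru_stime - before.ru_stime,
    }
```

A wrapper would call this with the render command and its arguments as a list, for example `run_timed(["/path/to/render", "scene.file"])` (placeholder values), and log the returned figures alongside the job.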