RUSH RENDER QUEUE: FILES
(C) Copyright 1995,2000 Greg Ercolano. All rights reserved.
V 102.30 04/25/01
Strikeout text indicates features not yet implemented


Configuration File
$RUSH_DIR/etc/rush.conf


Config File Commands

AdminUser
AppHostCache
ClientPort
CpuAcctPath
DaemonHostCache
DisableFu
ExpandCpus
ForceGid
ForceUid
GidRange
HourlyConsole
InMaxMsgs (Version 101.83+)
JobPassTimeout
JobUpdateThrottle
LogFlags
LogRotateHour (Version 101.81+)
NewTaskMsgs
NtRushGid
NtRushUid
ServerPort
TaskCleanupHours (Version 102.17c+)
TcpSockOpts
TmpDir
UdpMaxRetries
UdpRestTimeOut
UdpTimeout
UidRange
WebUser




Hosts File
$RUSH_DIR/etc/hosts


The $RUSH_DIR/etc/hosts file must contain the names of all hosts that participate in rendering.

Example Hosts File
# RUSH HOSTS
#
# The 'Host' field should contain short names for hosts (aliases are ok),
# and must be unique.
#
# The 'Criteria' field must *NOT* contain white space, and words are 
# comma delimited. All hosts should contain '+any' in the criteria field.
#
#Host    Cpus Ram   MinPri Criteria
#-----   ---- ----  ------ -----------
tahoe    2    256   0      +any,+work,sgi,irix,irix6.5
superior 2    256   0      +any,+work,sgi,irix,irix6.2
ontario  1    128   0      +any,+work,linux,linux6.0,intel
erie     1    128   0      +any,+work,sgi,irix,irix6.4
rf1      1    512   0      +any,+farm,linux
rf2      1    512   0      +any,+farm,linux
rf3      1    512   0      +any,+farm,linux
rf4      1    512   0      +any,+farm,linux
rf5      1    512   0      +any,+farm,linux

The hosts file can be updated on the fly. Simply edit a copy, then rdist(1) the copy to all the machines. The daemons will pick up your changes within one minute.

To make changes to this file and update this to the network, use these commands.

The format of the hosts file is single lines of 5 white space separated fields, one line per host:

<Hostname> <#Cpus> <Ram> <Minimum Priority> <Criteria>

Blank lines and lines starting with '#' are ignored.
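
As a rough illustration of the format described above, the following Python sketch splits hosts file text into the five fields. The function name parse_hosts and the dictionary keys are made up for this example and are not part of rush. It assumes the Cpus, Ram and MinPri fields are plain integers.

```python
def parse_hosts(text):
    """Parse hosts-file text into a list of dicts (illustrative only)."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and lines starting with '#' are ignored
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 5:
            raise ValueError("expected 5 fields, got %d: %r" % (len(fields), line))
        name, cpus, ram, minpri, criteria = fields
        hosts.append({
            "host": name,
            "cpus": int(cpus),        # assumption: plain integer
            "ram": int(ram),          # assumption: plain integer
            "minpri": int(minpri),    # assumption: plain integer
            "criteria": criteria.split(","),
        })
    return hosts

sample = """\
#Host    Cpus Ram   MinPri Criteria
tahoe    2    256   0      +any,+work,sgi,irix,irix6.5

rf1      1    512   0      +any,+farm,linux
"""

for h in parse_hosts(sample):
    print(h["host"], h["cpus"], h["ram"], h["minpri"], h["criteria"])
```

A line with more or fewer than five fields raises an error here; how the rush daemons themselves treat such lines is not covered by this sketch.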

Hosts File Field Descriptions


<Hostname>

This is the name of the host, and should be the shortest name possible (e.g., host aliases can be used here).

This is the name that will be used in jobids and other cpu reports, so it is best if short names are used (10 chars or less). Longer names are ok, but will misalign columnar reports. Avoid using FQDN hostnames (e.g., foo.domain.com).

As of version 102.13, you can optionally specify a network interface other than the default. Just append a ':' to the hostname, followed by the name of the interface, e.g.:

tahoe:tahoe-eth

This says 'tahoe' is the actual name of the machine (i.e., hostname(1)), but rush should use tahoe's 'tahoe-eth' network interface for all communications.
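
The optional interface suffix can be separated from the hostname with a single split on ':'. The helper below is a hypothetical sketch for scripts that read the hosts file; the name split_host_interface is an assumption and is not a rush function.

```python
def split_host_interface(field):
    """Split '<host>[:<interface>]' into (host, interface or None)."""
    host, sep, iface = field.partition(":")
    # No ':' present means the default interface is used
    return host, (iface if sep else None)

print(split_host_interface("tahoe:tahoe-eth"))  # ('tahoe', 'tahoe-eth')
print(split_host_interface("tahoe"))            # ('tahoe', None)
```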


<#Cpus>

This should be the number of cpus the host has. This is how many processes the host will run at the same time. This value can be larger or smaller than the actual number of physical cpus the machine has. 

'0' is an acceptable value that essentially disables the machine from participating in rendering, while allowing the host to be specified in submit scripts.


<Ram>

This is the amount of ram the machine has. This value can be less or more than the actual ram the machine has; usually this value takes into account some percentage of the host's swap space as well. This value is used when accepting frames to render; a frame that asks for more ram than the machine has will be turned away. 

On multiprocessor machines, this value is a total from which rendering frames subtract their estimated ram use. For instance, if a 4 cpu machine is configured with a Ram value of 512, and 2 frames are currently rendering each with ram values 200, then only 112 will be left for rendering on the other two processors (112 = 512 - ( 200 x 2 ) ).
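
The ram bookkeeping described above amounts to simple subtraction. The Python sketch below reproduces the 4 cpu example; the function names are invented for illustration and only model the arithmetic, not the daemon's actual code.

```python
def ram_available(configured_ram, running_frame_rams):
    """Ram left after subtracting each rendering frame's estimated use."""
    return configured_ram - sum(running_frame_rams)

def can_accept(configured_ram, running_frame_rams, requested_ram):
    """A frame asking for more ram than is left is turned away."""
    return requested_ram <= ram_available(configured_ram, running_frame_rams)

print(ram_available(512, [200, 200]))    # 112
print(can_accept(512, [200, 200], 100))  # True
print(can_accept(512, [200, 200], 200))  # False
```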


<Minimum Priority>

Use this value to set a limit on the minimum priority a job must have to render on this machine.

Useful where you want to prevent people from rendering on workstations unless they are of at least a certain priority, or if you want to allow only the local workstation user to submit to their own workstation using a policy enforced priority value.

A value of '0' allows all jobs. A value of '900' will only allow renders with a priority of 900 or above; renders with less than that will be turned away.
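
Expressed as code, the minimum priority rule is a single comparison. This is a minimal sketch with a made-up function name, assuming priorities are compared as plain integers.

```python
def meets_min_priority(job_priority, host_min_priority):
    """True if the job's priority is at or above the host's minimum."""
    return job_priority >= host_min_priority

print(meets_min_priority(1, 0))      # True: '0' allows all jobs
print(meets_min_priority(900, 900))  # True: 900 or above is allowed
print(meets_min_priority(899, 900))  # False: turned away
```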


<Criteria>

This is a list of comma separated strings that define platform or operating system specific features for the host. These can be arbitrary alpha-numeric strings that may also contain dashes, underbars and periods, but must not contain any white space. '+' characters have the special purpose of leading off a Host Group specification.

For example, the <Criteria> field might be set to:

+any,linux,linux6.1,prman3.7

These strings can then be used in TDs' submit scripts to limit which hosts will render their frames. See the Criteria Submit Script command for more info. All hosts should have a criteria entry that contains at least +any.

Host Group names are configured in this field, too. To add a hostgroup called +servers to the above example:

+any,linux,linux6.1,prman3.7,+servers
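
To show how a criteria field breaks down, here is a small Python sketch that separates Host Group names (words with a leading '+') from ordinary criteria words. The function name split_criteria is hypothetical, and the sketch only checks the 'no white space' rule, not the full set of allowed characters.

```python
def split_criteria(field):
    """Return (host_groups, plain_words) from a comma delimited criteria field."""
    if any(ch.isspace() for ch in field):
        raise ValueError("criteria field must not contain white space")
    words = field.split(",")
    groups = [w for w in words if w.startswith("+")]     # Host Group names
    plain = [w for w in words if not w.startswith("+")]  # platform/OS features
    return groups, plain

groups, plain = split_criteria("+any,linux,linux6.1,prman3.7,+servers")
print(groups)  # ['+any', '+servers']
print(plain)   # ['linux', 'linux6.1', 'prman3.7']
```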






Cpu Accounting File
$RUSH_DIR/var/cpu.acct


The cpu accounting file is configured with the rush.conf file's CpuAcctPath command. Each time a frame finishes executing, a new entry is created in the Cpu Accounting file, logging the name of the job, how long the frame ran, etc.

Cpu Accounting File Example

u  948242700 53
p  948242783 tahoe-798    WERNER/C33 erco     0106  superior 100k  122  0   0   0 27823
p  948242783 tahoe-798    WERNER/C33 erco     0107  superior 100k  122  0   0   0 27834
p  948242865 tahoe-797    KILLER     erco     0504  superior 200   121  0   0   0 27846
u  948246300 5
u  948249900 0

Process Entries


p  948242783 tahoe-798 WERNER/C33 erco  0106  superior  100k  122  0   0   0 27822
p  948242783 tahoe-798 WERNER/C33 erco  0107  superior  100k  122  0   0   0 27834
p  948242865 tahoe-797 KILLER     erco  0504  superior  200   121  0   0   0 27846
-  --------- --------- ---------- ----  ----  --------  ----  ---  -   -   - -----
|      |         |         |       |     |       |       |     |   |   |   |   |
|      |         |         |       |     |       |       |     |   |   |   |   Pid
|      |         |         |       |     |       |       |     |   |   |   Exit code
|      |         |         |       |     |       |       |     |   |   #Secs User Time
|      |         |         |       |     |       |       |     |   #Secs System Time
|      |         |         |       |     |       |       |     #Secs Wall Clock Time
|      |         |         |       |     |       |       Priority
|      |         |         |       |     |       Host that ran the process
|      |         |         |       |     Frame that ran
|      |         |         |       User
|      |         |         Title of job
|      |         Jobid
|      time(2) process started
'p' indicates 'process entry'
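
The field layout above can be read with a plain white space split. The following Python sketch is one possible way to do that; the ProcEntry type and its field names are invented for this example. It assumes job titles contain no white space, and it keeps the priority as a string because values such as '100k' appear in the file.

```python
from collections import namedtuple

# Field names are hypothetical; the order follows the diagram above.
ProcEntry = namedtuple(
    "ProcEntry",
    "start_time jobid title user frame host priority "
    "wall_secs sys_secs user_secs exit_code pid")

def parse_process_entry(line):
    """Parse one 'p' line of the cpu accounting file."""
    f = line.split()
    if len(f) != 13 or f[0] != "p":
        raise ValueError("not a process entry: %r" % line)
    return ProcEntry(
        start_time=int(f[1]), jobid=f[2], title=f[3], user=f[4],
        frame=f[5], host=f[6],
        priority=f[7],  # kept as text, e.g. '100k'
        wall_secs=int(f[8]), sys_secs=int(f[9]), user_secs=int(f[10]),
        exit_code=int(f[11]), pid=int(f[12]))

e = parse_process_entry(
    "p  948242865 tahoe-797 KILLER     erco  0504  superior  200   121  0   0   0 27846")
print(e.jobid, e.frame, e.host, e.wall_secs, e.exit_code, e.pid)
```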

Utilization Entries


u  948242700 53
u  948246300 5
-  --------- --
|      |      |
|      |      Percent of time processor(s) were busy rendering. (0-100)
|      |
|      time(2) utilization recorded
|
'u' indicates 'utilization entry' 
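
Utilization entries can be pulled out of the file the same way. The sketch below collects the 'u' lines and takes a simple average of the percentages. The function name is made up, and a plain mean is only meaningful if the entries are evenly spaced in time, as they happen to be in the sample (one hour apart).

```python
def parse_utilization_entries(text):
    """Return a list of (time, percent_busy) tuples from 'u' lines."""
    entries = []
    for line in text.splitlines():
        f = line.split()
        if len(f) == 3 and f[0] == "u":
            entries.append((int(f[1]), int(f[2])))
    return entries

acct_sample = """\
u  948242700 53
p  948242783 tahoe-798 WERNER/C33 erco  0106  superior  100k  122  0   0   0 27822
u  948246300 5
u  948249900 0
"""

entries = parse_utilization_entries(acct_sample)
print(entries)
average = sum(pct for _, pct in entries) / len(entries)
print(round(average, 1))  # 19.3
```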

CAVEATS

  • 'Exit code' is normally zero or a positive number representing the actual exit code of the process. This value will be negative if the process was signaled, the absolute value being the signal number. A negative value usually means the process was killed, segfaulted, or was bumped by a higher priority process. Commonly, the 'Exit code' will be one of:
    
      -15 - process killed with SIGTERM; someone probably manually killed it
       -9 - process killed with SIGKILL; probably bumped in a priority battle
       -2 - process killed with SIGINT; someone sent it a ^C
        0 - process did an exit(0); frame Done
        1 - process did an exit(1); frame Fail
        2 - process did an exit(2); frame Requeue
    

  • Although tempting, it is not recommended to use process execution times for cpu billing purposes. Wall clock time includes time the process may have spent waiting for network load. User and System times report the respective times spent for the Render Script only, not its sub-processes (e.g., the renderer).

    To properly bill for cpu time, you would need to either enable full-on Unix process accounting to obtain accumulated cpu time for all sub-processes in the user's render script, or create wrapper scripts that use programs like timex(1) to monitor the binary execution time of the critical render/compositor processes.

    Tools like timex(1) indicate in their documentation that they must have Unix process accounting enabled to show sub-process totals. This is usually prohibitive on production machines, due to disk resources used by the Unix process accounting system.
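
For scripts that summarize the accounting file, the 'Exit code' convention from the first caveat can be decoded roughly as follows. This is a sketch, not rush code: the function name is hypothetical, signal names come from Python's standard signal module on the machine running the script, and exit codes other than 0, 1 and 2 are simply reported as having no listed frame state.

```python
import signal

def describe_exit_code(code):
    """Turn a cpu.acct 'Exit code' value into a short description."""
    if code < 0:
        signum = -code  # negative values carry the signal number
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "signal %d" % signum
        return "killed by %s" % name
    frame_states = {0: "Done", 1: "Fail", 2: "Requeue"}
    return "exit(%d): %s" % (code, frame_states.get(code, "no listed frame state"))

for code in (-15, -9, 0, 1, 2):
    print(code, describe_exit_code(code))
```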