RUSH RENDER QUEUE: FILES
(C) Copyright 1995,2000 Greg Ercolano. All rights reserved.
V 102.30 04/25/01

Strikeout text indicates features not yet implemented

Configuration File
`$RUSH_DIR/etc/rush.conf`

The configuration file should be customized by the systems administrator. Most settings are used only for fine tuning, but some control important security settings (uidrange/gidrange/forceuid/forcegid), and process auditing/logging (cpuacctpath).

The rush.conf file can be updated on the fly. Simply edit a copy, make changes, then rdist(1) the copy to all the machines, and the daemons will pick up your changes within one minute.

To make changes to this file and update this to the network, use these commands.

Config File Commands

AdminUser
AppHostCache
ClientPort
CpuAcctPath
DaemonHostCache
DisableFu
ExpandCpus
ForceGid
ForceUid
GidRange
HourlyConsole
InMaxMsgs
JobPassTimeout
JobUpdateThrottle
LogFlags
LogRotateHour
MaxNewTaskMsgs
NtRushGid
NtRushUid
ServerPort
TaskCleanupHours
TcpSockOpts
TmpDir
UdpMaxRetries
UdpRestTimeOut 
UdpTimeout
UidRange
WebUser

AdminUser

Sets login name for user allowed to administer the rush daemons. The adminuser can always manipulate jobs, regardless of the value of DisableFu. Also, commands such as 'rush -dexit', 'rush -dlog a' and others are limited to root and 'adminuser'.

Set to 'root' if there is no special rush administrative login.

adminuser root

AppHostCache

rush(1)'s host caching option. Only affects the rush(1) client application's method of host caching. Can be none or demand.

apphostcache demand

ClientPort

Obsolete. Remove from all rush.conf files.

CpuAcctPath

Path to cpu accounting file.

Set to '-' to disable generation of cpu accounting data.

cpuacctpath /var/logs/cpu.acct

DaemonHostCache

rushd(8)'s hostname caching options. Only affects the way the daemon caches information.

Options can be none, demand or boot:

none - No caching. Use this if only /etc/hosts is used, or if hostnames change a lot.

demand - Cache on demand, or whenever new hostlist is reloaded. Prevents repetitive NIS/DNS traffic.

boot - Cache entire hostlist and IP mappings on boot, or whenever hostlist changes.

Example: daemonhostcache boot

DisableFu

Allows administrator to control whether users can use 'rush -fu' and $RUSH_FU to control other people's jobs.

disablefu 1 prevents users from controlling each other's jobs by disabling 'rush -fu' and $RUSH_FU; only root and adminuser can control any jobs.

disablefu 0 allows users to control each other's jobs, as well as root and adminuser.

Normally, users should be able to control each other's jobs, allowing local policies, peer pressure (and auditing daemon logs) to prevent pandemonium.

disablefu 0
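
The permission rule described above can be sketched as a small predicate. This is only an illustration, not rush's actual code: the function name `may_control` and its parameters are hypothetical, and it assumes that job owners can always control their own jobs.

```python
def may_control(user, job_owner, disablefu, adminuser="root", fu_requested=False):
    """Return True if `user` may manipulate a job owned by `job_owner`.

    Hypothetical helper: `fu_requested` stands for the user having passed
    'rush -fu' or set $RUSH_FU.
    """
    # root and the adminuser can always manipulate jobs, whatever disablefu says.
    if user in ("root", adminuser):
        return True
    # Assumption: owners can always control their own jobs.
    if user == job_owner:
        return True
    # Everyone else needs -fu/$RUSH_FU, which 'disablefu 1' switches off.
    return bool(fu_requested) and not disablefu


print(may_control("bob", "erco", disablefu=0, fu_requested=True))
print(may_control("bob", "erco", disablefu=1, fu_requested=True))
```

With `disablefu 0` the first call is allowed; with `disablefu 1` the second is refused unless 'bob' is root or the adminuser.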

ExpandCpus

(102.17b+) Allows administrator to control whether hostgroups are expanded to include one or all processors on each host.

expandcpus 1 causes hostgroups to expand to include all processors on each host.

expandcpus 0 causes hostgroups to expand to only one processor for each host.

expandcpus 1
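
As a rough illustration of the two settings, the sketch below expands a hostgroup into (host, processors) pairs. The function name `expand_hostgroup` and the list-of-tuples representation are assumptions made for this example; rush's internal representation is not documented here.

```python
def expand_hostgroup(hosts, expandcpus):
    """Expand a hostgroup into (hostname, processors_requested) pairs.

    `hosts` is a list of (hostname, cpus) pairs taken from the hosts file.
    Hypothetical helper, for illustration only.
    """
    # expandcpus 1: every processor on each host; expandcpus 0: at most one.
    return [(name, cpus if expandcpus else min(cpus, 1)) for name, cpus in hosts]


work = [("tahoe", 2), ("superior", 2), ("erie", 1)]
print(expand_hostgroup(work, expandcpus=1))
print(expand_hostgroup(work, expandcpus=0))
```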

ForceGid

Same as ForceUid for GID values.

forcegid -1

forcegid 100

ForceUid

Forces all user processes to run as this uid. Default is -1, allowing user processes to run as the UID of the user who submitted the job.

forceuid -1

forceuid 100

GidRange

Controls gid values the same way UidRange controls uid values.

gidrange 100 65000

HourlyConsole

Controls whether hourly console messages are printed or not. Can be either 'on' or 'off'.

hourlyconsole on

InMaxMsgs (Version 101.83+)

The maximum number of messages (tcp/udp) that are received and handled at a time before the daemon goes on to other operations, e.g.:

    while ( daemon loop )
    {
        // Read messages from remotes
        for ( t=0; t < inmaxmsgs; t++ )
        {
            select(.. inbuf ..)
            if ( inbuf is empty ) break;
            HandleMessage(inbuf);
        }

        // Do other stuff
        ...
    }

This puts a limit on how many messages to read, so rush doesn't spend too much time reading data from remotes. Since the whole network can potentially generate more traffic than the rush daemon can possibly deal with, a limit must be enforced.

inmaxmsgs 30

JobPassTimeout

The 'jobpasstimeout' value configures how many seconds the task will remain in the JOBPASS state before re-entering an IDLE state by itself.

When a task on a remote cpu becomes IDLE, it tries to convince a job to use its cpu. If the job 'passes' on this request (no more frames to render, etc), the remote task enters a JOBPASS state, to avoid contacting the job again for a while. After the timeout period, the task re-enters an IDLE state to see if maybe the job had a FAIL frame, and has more frames to render after all.

jobpasstimeout 150
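
The IDLE/JOBPASS cycle described above can be sketched as a tiny state machine. This is a minimal sketch, not the daemon's implementation: the `Task` class and its method names are hypothetical, and times are plain seconds supplied by the caller.

```python
IDLE, JOBPASS = "IDLE", "JOBPASS"


class Task:
    """Hypothetical model of one remote task slot."""

    def __init__(self, jobpasstimeout=150):
        self.jobpasstimeout = jobpasstimeout
        self.state = IDLE
        self.passed_at = None

    def job_passed(self, now):
        """The job declined this cpu; stop contacting it for a while."""
        self.state = JOBPASS
        self.passed_at = now

    def tick(self, now):
        """Re-enter IDLE once jobpasstimeout seconds have elapsed."""
        if self.state == JOBPASS and now - self.passed_at >= self.jobpasstimeout:
            self.state = IDLE
            self.passed_at = None
        return self.state


task = Task(jobpasstimeout=150)
task.job_passed(now=0)
print(task.tick(now=100))
print(task.tick(now=150))
```

After the timeout the task is IDLE again and may ask the job once more, which covers the case where a frame failed in the meantime and needs re-rendering.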

JobUpdateThrottle

The daemon does not advertise a job's cpus more often than once every 'jobupdatethrottle' seconds. It re-advertises cpus that haven't been acknowledged by the remotes at about this rate.

jobupdatethrottle 10

LogFlags

Not to be confused with the submit script LogFlags command. Configures daemon logging features. Most are debugging flags used to track operation of the system.

Flags can be combined to enable multiple debugging features.

LogFlags affect both the daemon AND user applications. To affect only the daemon, specify flags on daemon's command line, or use 'rush -dlog <flags>'.

See Logging Flags Table for a complete list of all the one letter log flags.

logflags jE

LogRotateHour (Version 101.81+)

Sets the hour (0-23) that the logs automatically rotate. A value of -1 disables automatic log rotation.

logrotatehour 0

MaxNewTaskMsgs

Chokes the maximum number of 'newtask' advertisements the job server sends out to the remotes at a time.

Absolutely must be greater than 0.

Should be around 20 or so.

Too low a value causes new jobs to take a while to establish processors. Too high a value floods the remotes' input buffers with 'newtask' advertisements.

maxnewtaskmsgs 20

NtRushGid

The gid used if an NT submitted job is to run on Unix machines.

Since the NT version of rush doesn't know how to map the name 'ntrush' to the equivalent gid value, NtRushGid is used to resolve it. Basically, this value should be the same as the gid value for the Unix user 'ntrush'.

ntrushgid 100

NtRushUid

The uid used if an NT submitted job is to run on Unix machines.

Since the NT version of rush doesn't know how to map the name 'ntrush' to the equivalent uid value, NtRushUid is used to resolve it. Basically, this value should be the same as the uid value for the Unix user 'ntrush'.

ntrushuid 100

ServerPort

Sets the rushd(8) server daemon's port number for UDP/TCP connections.

Though unnecessary for proper operation of the render queue, you should register the ServerPort value in your /etc/services file, e.g.:

	    rushd  696/tcp   # rush server
	    rushd  696/udp   # rush server

serverport 696

TaskCleanupHours (102.17c+)

Sets up the hour(s) of the day that rush purges orphaned tasks from the 'rush -tasklist'. This should be done once a day during early morning hours.

For example, 'taskcleanuphours 5' means cleanup occurs between 5:00am and 5:45am. Rush disperses the cleanup operation on a per-host basis over a 40 minute period to prevent network load.

taskcleanuphours 5

TcpSockOpts

Allows administrator to set various TCP tuning values for all tcp-based connections (rush -lf/-lj/-log/-ping, etc).

Several instances of 'tcpsockopt' can be specified, to set multiple flags.

Currently, only TCP_NODELAY is recommended. Usage:

tcpsockopt <SO_OPTION> <value>

..where <SO_OPTION> is one of:

TCP_NODELAY

Disables the Nagle algorithm. Speeds up connection time for all TCP based connections by a noticeable factor when small amounts of data are involved (rush -ping, etc). See "Nagle" RFC 896, "delayed ACK" RFC 813, and "Hosts Communication Requirements" RFC 1122.

SO_LINGER

If specified, lingering is ALWAYS enabled; the argument is the linger 'time'. See setsockopt(2) for more info.

SO_KEEPALIVE

Enable connected sockets to be 'kept alive'. <value> must be '1' to enable, 0 to disable.

SO_DONTROUTE

Not recommended. If enabled, socket connections bypass normal routing. <value> must be '1' to enable, 0 to disable.

SO_REUSEADDR

Not recommended. Enables reuse of port addresses. <value> must be '1' to enable, 0 to disable.

SO_REUSEPORT

Not recommended. Enables reuse of port addresses. Some platforms (Redhat 6.2) don't even support this. On those platforms, the option is ignored. <value> must be '1' to enable, 0 to disable.

SO_SNDBUF
SO_RCVBUF

Not recommended. Sets the send/receive buffer size. <value> is the number of bytes in the buffer. Rush may override the value you specify in some cases, as it may know it needs large buffers.

tcpsockopts TCP_NODELAY 1
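
For readers unfamiliar with the underlying socket call, the sketch below shows what enabling TCP_NODELAY amounts to at the setsockopt(2) level, using Python's standard socket module. It only illustrates the option itself; how rush applies the setting internally is not shown here.

```python
import socket

# Create an unconnected TCP socket, set TCP_NODELAY, and read the flag back.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # Equivalent in spirit to a TCP_NODELAY setting with <value> 1.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
finally:
    sock.close()

print("TCP_NODELAY enabled:", bool(nodelay))
```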

TmpDir

Allows administrator to set where rush creates the $RUSH_TMPDIR for user's jobs.

Be aware that when every frame executes, a subdirectory is created in 'tmpdir', and on completion the subdir is 'rm -rf'ed. Both are done *as the user*, so users must have write permission to this directory.

tmpdir /var/tmp

UdpMaxRetries

The number of re-transmissions until a 'retry time-out' occurs.

udpmaxretries 5

UdpRestTimeOut

How many seconds to rest before recovering from a 'retry time-out'.

udpresttimeout 40

UdpTimeout

The number of seconds between udp re-transmissions.

udptimeout 8
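
One way to picture how UdpTimeout, UdpMaxRetries and UdpRestTimeOut fit together is the timeline sketch below. It is an interpretation under stated assumptions, not a description of rush's actual protocol code: the function name `retry_schedule` is hypothetical, and it assumes the 'retry time-out' is declared one more interval after the last unanswered re-transmission.

```python
def retry_schedule(udptimeout=8, udpmaxretries=5, udpresttimeout=40):
    """Return (retransmit_times, retry_timeout_at, recover_at).

    All times are seconds after the first, unacknowledged send.
    """
    # One re-transmission every udptimeout seconds, udpmaxretries times.
    retransmits = [udptimeout * n for n in range(1, udpmaxretries + 1)]
    # Assumption: give the last re-transmission one full interval to be answered.
    retry_timeout_at = udptimeout * (udpmaxretries + 1)
    # Rest before recovering from the 'retry time-out'.
    recover_at = retry_timeout_at + udpresttimeout
    return retransmits, retry_timeout_at, recover_at


print(retry_schedule())
```

Under these assumptions, the example values in this section give re-transmissions at 8, 16, 24, 32 and 40 seconds, a 'retry time-out' at 48 seconds, and recovery at 88 seconds.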

UidRange

Prevents the render queue from running processes with a uid outside this range. The first value is the minimum, the second value is the maximum.

When a job is submitted, if the user's uid value is outside the range specified here, an error message is printed and the job will not be submitted.

uidrange 100 65000
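
The submit-time check can be sketched as follows. The function name `uid_allowed` is hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
def uid_allowed(uid, uidrange=(100, 65000)):
    """Return True if `uid` falls inside the configured range.

    Assumption: the minimum and maximum are both inclusive.
    """
    minimum, maximum = uidrange
    return minimum <= uid <= maximum


for uid in (0, 100, 1234, 65001):
    print(uid, "ok" if uid_allowed(uid) else "rejected: uid outside uidrange")
```

The same shape of check would apply to GidRange.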

WebUser

~~Sets login name for user the httpd daemon runs as, in cases where rush is being controlled by web interfaces.~~

~~This user is allowed to use the RUSH_USER environment variable to pose as other users for the purpose of cgi-bin scripts being able to submit jobs as the user on the other end of Netscape.~~

~~Set to "root" to disable this feature (default).~~

~~Example: webuser guest~~

Hosts File
`$RUSH_DIR/etc/hosts`

The $RUSH_DIR/etc/hosts file must contain the names of all hosts that participate in rendering.

Example Hosts File

# RUSH HOSTS
#
# The 'Host' field should contain short names for hosts (aliases are ok),
# and must be unique.
#
# The 'Criteria' field must *NOT* contain white space, and words are 
# comma delimited. All hosts should contain '+any' in the criteria field.
#
#Host    Cpus Ram   MinPri Criteria
#-----   ---- ----  ------ -----------
tahoe    2    256   0      +any,+work,sgi,irix,irix6.5
superior 2    256   0      +any,+work,sgi,irix,irix6.2
ontario  1    128   0      +any,+work,linux,linux6.0,intel
erie     1    128   0      +any,+work,sgi,irix,irix6.4
rf1      1    512   0      +any,+farm,linux
rf2      1    512   0      +any,+farm,linux
rf3      1    512   0      +any,+farm,linux
rf4      1    512   0      +any,+farm,linux
rf5      1    512   0      +any,+farm,linux

The hosts file can be updated on the fly. Simply edit a copy, make changes, then rdist(1) the copy to all the machines, and the daemons will pick up your changes within one minute.

To make changes to this file and update this to the network, use these commands.

The format of the hosts file is single lines of 5 white space separated fields, one line per host:

<Hostname> <#Cpus> <Ram> <Minimum Priority> <Criteria>

Blank lines and lines starting with '#' are ignored.

Hosts File Field Descriptions

<Hostname>
<#Cpus>
<Ram>
<Minimum Priority>
<Criteria>

<Hostname>

This is the name of the host, and should be the shortest name possible (e.g., host aliases can be used here).
This is the name that will be used in jobids and other cpu reports, so it is best if short names are used (10 chars or less). Longer names are ok, but will misalign columnar reports. Avoid using FQDN hostnames (e.g., foo.domain.com).
As of version 102.13, you can optionally specify an alternate network interface, other than the default. Just append to the hostname a ':' followed by the name of the interface, e.g.:

tahoe:tahoe-eth

This says 'tahoe' is the actual name of the machine (i.e., hostname(1)), but rush should use tahoe's 'tahoe-eth' network interface for all communications.

<#Cpus>

This should be the number of cpus the host has. This is how many processes the host will run at the same time. This value can be larger or smaller than the actual number of physical cpus the machine has.
'0' is an acceptable value that essentially disables the machine from participating in rendering, while allowing the host to be specified in submit scripts.

<Ram>

This is the amount of ram the machine has. This value can be less or more than the actual ram the machine has; usually this value takes into account some percentage of the host's swap space as well. This value is used when accepting frames to render; a frame that asks for more ram than the machine has will be turned away.
On multiprocessor machines, this value is a total from which rendering frames subtract their estimated ram use. For instance, if a 4 cpu machine is configured with a Ram value of 512, and 2 frames are currently rendering each with ram values 200, then only 112 will be left for rendering on the other two processors (112 = 512 - ( 200 x 2 ) ).
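
The arithmetic in the example above can be written out as a short sketch. The function names are hypothetical, and accepting a frame that asks for exactly the remaining amount is an assumption.

```python
def ram_available(host_ram, running_frame_rams):
    """Ram left on a host after subtracting each running frame's estimate."""
    return host_ram - sum(running_frame_rams)


def accepts_frame(host_ram, running_frame_rams, frame_ram):
    """A frame asking for more ram than is left is turned away."""
    return frame_ram <= ram_available(host_ram, running_frame_rams)


# The 4 cpu host from the text: Ram 512, two frames rendering at 200 each.
print(ram_available(512, [200, 200]))
print(accepts_frame(512, [200, 200], 200))
```

A third frame asking for 200 would be turned away, since only 112 is left.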

<Minimum Priority>

Use this value to set a limit on the minimum priority a job must have to render on this machine.
Useful where you want to prevent people from rendering on workstations unless they are of at least a certain priority, or if you want to allow only the local workstation user to submit to their own workstation using a policy enforced priority value.
A value of '0' allows all jobs. A value of '900' will only allow renders with a priority of 900 or above; renders with less than that will be turned away.

<Criteria>

This is a list of comma separated strings that define platform or operating system specific features for the host. These can be arbitrary alpha-numeric strings that may also contain dashes, underbars and periods, but must not contain any white space. '+' characters have the special purpose of leading off a Host Group specification.

The <Criteria> field might be set to:

+any,linux,linux6.1,prman3.7

These strings can then be used in TD's submit scripts to limit which hosts will render their frames. See the Criteria Submit Script command for more info. All hosts should have a criteria entry that at least contains +any.

Host Group names are configured in this field, too. To add a hostgroup called +servers to the above example:

+any,linux,linux6.1,prman3.7,+servers
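
The field rules in this section can be pulled together in a small parser. This is a minimal sketch for site tools that want to read the hosts file, not rush's own parser: `HostEntry` and `parse_hosts` are hypothetical names, and the sketch also skips comment lines that have leading white space.

```python
from typing import List, NamedTuple, Optional


class HostEntry(NamedTuple):
    name: str                 # short hostname
    interface: Optional[str]  # optional ':interface' suffix (102.13+)
    cpus: int
    ram: int
    minpri: int
    criteria: List[str]       # every comma separated word
    hostgroups: List[str]     # the words that start with '+'


def parse_hosts(text):
    """Parse hosts file text into a list of HostEntry records."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        fields = line.split()
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
        host, cpus, ram, minpri, criteria = fields
        name, _, iface = host.partition(":")
        words = criteria.split(",")
        entries.append(HostEntry(
            name=name,
            interface=iface or None,
            cpus=int(cpus),
            ram=int(ram),
            minpri=int(minpri),
            criteria=words,
            hostgroups=[w for w in words if w.startswith("+")],
        ))
    return entries


sample = """\
# RUSH HOSTS
tahoe:tahoe-eth 2 256 0 +any,+work,sgi,irix,irix6.5
rf1             1 512 0 +any,+farm,linux
"""
for entry in parse_hosts(sample):
    print(entry.name, entry.interface, entry.cpus, entry.hostgroups)
```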

Cpu Accounting File
`$RUSH_DIR/var/cpu.acct`

The cpu accounting file is configured with the rush.conf file's CpuAcctPath command. Each time a frame finishes executing, a new entry is created in the Cpu Accounting file, logging the name of the job, how long the frame ran, etc.

Cpu Accounting File Example

u  948242700 53
p  948242783 tahoe-798    WERNER/C33 erco     0106  superior 100k  122  0   0   0 27823
p  948242783 tahoe-798    WERNER/C33 erco     0107  superior 100k  122  0   0   0 27834
p  948242865 tahoe-797    KILLER     erco     0504  superior 200   121  0   0   0 27846
u  948246300 5
u  948249900 0

Process Entries


p  948242783 tahoe-798 WERNER/C33 erco  0106  superior  100k  122  0   0   0 27823
p  948242783 tahoe-798 WERNER/C33 erco  0107  superior  100k  122  0   0   0 27834
p  948242865 tahoe-797 KILLER     erco  0504  superior  200   121  0   0   0 27846
-  --------- --------- ---------- ----  ----  --------  ----  ---  -   -   - -----
|      |         |          |      |     |       |       |     |   |   |   |   |
|      |         |          |      |     |       |       |     |   |   |   |   Pid
|      |         |          |      |     |       |       |     |   |   |   |
|      |         |          |      |     |       |       |     |   |   |   Exit code
|      |         |          |      |     |       |       |     |   |   |
|      |         |          |      |     |       |       |     |   |   #Secs User Time
|      |         |          |      |     |       |       |     |   |                 
|      |         |          |      User  |       |       |     |   #Secs System Time
|      |         |          |            |       |       |     |
|      |         |          Title of job |       |       |     #Secs Wall Clock Time
|      |         Jobid                   |       |       |
|      |                                 |       |       Priority
|      time(2) process started           |       |
|                                        |       Host that ran the process
'p' indicates 'process entry'            |
                                         Frame that ran

Utilization Entries


u  948242700 53
u  948246300 5
-  --------- --
|      |      |
|      |      Percent of time processor(s) were busy rendering. (0-100)
|      |
|      time(2) utilization recorded
|
'u' indicates 'utilization entry'
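
For sites that post-process this file, the two entry layouts above can be read with a sketch like the following. `parse_acct_line` is a hypothetical helper, and it assumes job titles contain no white space, so that a plain white-space split recovers the 13 fields of a process entry.

```python
def parse_acct_line(line):
    """Parse one cpu accounting line into a dict, or None for a blank line."""
    fields = line.split()
    if not fields:
        return None
    kind = fields[0]
    if kind == "u":
        return {
            "type": "utilization",
            "time": int(fields[1]),          # time(2) value when recorded
            "percent_busy": int(fields[2]),  # 0-100
        }
    if kind == "p":
        return {
            "type": "process",
            "time": int(fields[1]),          # time(2) value when process started
            "jobid": fields[2],
            "title": fields[3],
            "user": fields[4],
            "frame": fields[5],
            "host": fields[6],
            "priority": fields[7],           # kept as text: '100k' is not a plain integer
            "wall_secs": int(fields[8]),
            "sys_secs": int(fields[9]),
            "user_secs": int(fields[10]),
            "exit_code": int(fields[11]),
            "pid": int(fields[12]),
        }
    raise ValueError(f"unknown entry type {kind!r}")


print(parse_acct_line("u  948242700 53"))
print(parse_acct_line(
    "p  948242865 tahoe-797 KILLER erco 0504 superior 200 121 0 0 0 27846"))
```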

CAVEATS
'Exit code' is normally a non-negative number representing the actual exit code of the process. This value will be negative if the process was signaled, the value being the negated signal number. A negative value usually means the process was killed, segfaulted, or was bumped by a higher priority process. Commonly, the 'Exit code' will be one of:
  -15 - process killed with SIGTERM; someone probably manually killed it
   -9 - process killed with SIGKILL; probably bumped in a priority battle
   -2 - process killed with SIGINT; someone sent it a ^C
    0 - process did an exit(0); frame Done
    1 - process did an exit(1); frame Fail
    2 - process did an exit(2); frame Requeue
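
When summarizing accounting data, the sign convention above can be decoded with Python's standard signal module. This is a sketch: `describe_exit` is a hypothetical helper, and signal names are looked up on the machine running the script, which may number some signals differently from the render host.

```python
import signal

# Frame states for the three documented exit codes.
FRAME_STATES = {0: "Done", 1: "Fail", 2: "Requeue"}


def describe_exit(code):
    """Turn an 'Exit code' field into a short human-readable description."""
    if code < 0:
        signum = -code  # negative values carry the signal number
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    state = FRAME_STATES.get(code)
    return f"exit({code})" + (f"; frame {state}" if state else "")


for code in (-15, -9, 0, 1, 2):
    print(code, describe_exit(code))
```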
Although tempting, it is not recommended to use process execution times for cpu billing purposes. Wall clock time includes time the process may have spent waiting for network load. User and System times report the respective times spent for the Render Script only; not its sub-processes (e.g., the renderer).
To properly bill for cpu time, you would either need to enable full-on Unix process accounting to obtain accumulated cpu time for all sub-processes in the user's render script, or create wrapper scripts that use programs like timex(1) to monitor the binary execution time of the critical render/compositor processes.
Tools like timex(1) indicate in their documentation that they must have Unix process accounting enabled to show sub-process totals. This is usually prohibitive on production machines, due to disk resources used by the Unix process accounting system.
