RUSH RENDER QUEUE: RUSH.CONF FILE
(C) Copyright 1995,2000 Greg Ercolano. All rights reserved.
V 102.41 03/05/04
Strikeout text indicates features not yet implemented


Configuration File
$RUSH_DIR/etc/rush.conf


Rush Config File
Index

File Format *     -- rush.conf general file format description
----------------------------------------------------------------------------
AdminUser         -- Sets user allowed to administer rush
AllCpusFmt        -- Controls the 'rush -lac' output format
AllCpusHeader1    -- Sets header for line #1 of 'rush -lac' report
AllCpusHeader2    -- Sets header for line #2 of 'rush -lac' report
AllJobsFmt        -- Controls the 'rush -laj' output format
AllJobsHeader1    -- Sets header for line #1 of 'rush -laj' report
AllJobsHeader2    -- Sets header for line #2 of 'rush -laj' report
AllowPush         -- Enable/disable 'rush -push'
AppHostCache      -- Rush hostname caching method for rush(1) client
CheckPoint.Log    -- Enables log messages for checkpointing
CheckPoint.OnBoot -- Enables automatic loading of checkpoint snapshots
CheckPoint.Secs   -- Enables automatic checkpoint snapshots
CpuAcctPath       -- Path to cpu accounting file
DaemonHostCache   -- Rush hostname caching method for rushd(1) server
DisableFu         -- Enable/disable rush(1) '-fu' flag
DisablePflags     -- Enable/disable k/a priority flags
ExpandCpus        -- Enable/disable +hostgroup expansions to include all procs
ForceGid          -- Forces processes to run with a particular GID
ForceUid          -- Forces processes to run with a particular UID
GidRange          -- Prevents processes from running with a GID outside this range
HourlyConsole     -- Enable/disable hourly console messages
InMaxMsgs         -- Number of tcp/udp messages handled at a time
JobFmt            -- Controls the 'rush -lj' output format
JobHeader1        -- Sets header for line #1 of 'rush -lj' report
JobHeader2        -- Sets header for line #2 of 'rush -lj' report
JobPassTimeout    -- Seconds tasks remain JOBPASS before re-entering IDLE state 
JobUpdateThrottle -- Throttle job updates to remote cpus
KillCommand       -- Command used to kill Windows frames when 'usejobobjects' disabled
LogFlags          -- Configures daemon debug logging features
LogRotateHour     -- Sets the hour daemon log automatically rotates
MaxNewTaskMsgs    -- Throttle number of tasks advertised at a time to remote cpus
NtRushGid         -- Sets the GID windows jobs run as under unix
NtRushUid         -- Sets the UID windows jobs run as under unix
Permit *          -- Permit users to use various rush operations
Rush.(options) *  -- Change the defaults for rush options
ServerPort        -- Sets the network port number rush uses for communications
SmtpDebug         -- Enables debugging of mail SMTP transactions
SmtpFrom          -- Sets the 'from' address for all emails sent by rush
SmtpPort          -- Sets the TCP port rush uses to initiate SMTP transactions
SmtpServer        -- Sets the hostname of the SMTP mail server
TaskCleanupHours  -- Sets the hour(s) rush reaps orphaned tasks
TaskKeepaliveSecs -- Sets number of seconds cpu server checks jobs still active
TcpSockOpts       -- Sets the TCP tuning flags
TmpDir            -- Sets the temp directory rush uses for the RUSH_TMPDIR
UdpMaxRetries     -- Sets how many re-transmissions until 'retry time-out' occurs
UdpRestTimeOut    -- Sets how many secs to wait before recovering from 'retry time-out'
UdpTimeout        -- Sets how many seconds between udp re-transmissions.
UidRange          -- Prevents processes from running with a UID outside this range
UseJobObjects     -- Enables use of Windows 2000 'Job Objects'
    

  File Format  
The configuration file should be customized by the systems administrator. Most settings are used only for fine tuning, but some control important security settings (uidrange/gidrange/forceuid/forcegid), and process auditing/logging (cpuacctpath).

The rush.conf file can be updated on the fly. Simply edit a copy, make changes, then rdist(1) the copy to all the machines, and the daemons will pick up your changes within one minute.

You can prefix any rush.conf commands with 'os=xxx', where 'xxx' can be any of:

    os=windows  -- Windows: NT, 2000, etc.
    os=unix     -- Unix: irix, linux, mac
    os=irix     -- Irix only
    os=mac      -- Mac only
    os=linux    -- Linux only
    os=all      -- All platforms
    
The 'os=xxx' prefix restricts a command so that it runs only on the named operating systems. This way you can maintain one file for both windows and unix operating systems. If several commands match, it is the LAST one that takes effect, eg:
    os=windows tmpdir C:/TEMP         # Windows
    os=unix    tmpdir /var/tmp        # Unix (mac|linux|irix)
    
..in this case 'tmpdir C:/TEMP' is set on all the windows machines, and 'tmpdir /var/tmp' is set on all the unix machines.

(New in rush 102.40f and up) Similarly, you can prefix any rush.conf command with 'host=xyz', where xyz is the hostname that should execute the command. This way, certain commands can be executed only on a particular host, eg:

    os=windows tmpdir C:/TEMP         # default for all windows machines
    os=unix    tmpdir /var/tmp        # default for all unix machines
    host=tahoe tmpdir /mnt/ext1       # host 'tahoe' has a special tmpdir
    host=bart  tmpdir /mnt/ramdisk    # host 'bart' has a special tmpdir
    host=r07   tmpdir F:/TEMP         # host 'r07' has a special tmpdir
    
..in this case the lower 'tahoe', 'bart' and 'r07' settings override the preceding 'windows' and 'unix' general cases above, because the commands execute in the order they appear; the lower commands overriding the higher ones. So if there's overlap, put the more general entries (os=xxx) first, and follow them with the more specific entries (host=xxx).
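
The ordering rule above can be sketched in a few lines of Python. This is only an illustration of "last matching line wins", not Rush's actual parser: the function names (prefix_matches, effective_settings) are hypothetical, host groups such as '+farm' are not handled, and comments are stripped naively at the first '#'.

```python
# Hypothetical sketch: resolve 'os=' / 'host=' prefixed lines for one host.
# Illustrates the "last matching line wins" rule only; not Rush's parser.

UNIX_FLAVORS = {"irix", "linux", "mac"}  # 'os=unix' covers these, per the table above


def prefix_matches(prefix, hostname, host_os):
    """Return True if an 'os=xxx' or 'host=xxx' prefix applies to this host."""
    kind, _, value = prefix.partition("=")
    if kind == "os":
        if value == "all":
            return True
        if value == "unix":
            return host_os in UNIX_FLAVORS
        return value == host_os
    if kind == "host":
        return value == hostname
    return False


def effective_settings(conf_text, hostname, host_os):
    """Return {command: value} as seen by one host, later lines overriding earlier."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        words = line.split()
        if words[0].startswith(("os=", "host=")):
            if not prefix_matches(words[0], hostname, host_os):
                continue  # line is meant for some other machine
            words = words[1:]
        if len(words) < 2:
            continue  # this sketch only handles 'command value' lines
        settings[words[0]] = " ".join(words[1:])
    return settings


CONF = """\
os=windows tmpdir C:/TEMP         # default for all windows machines
os=unix    tmpdir /var/tmp        # default for all unix machines
host=tahoe tmpdir /mnt/ext1       # host 'tahoe' has a special tmpdir
host=r07   tmpdir F:/TEMP         # host 'r07' has a special tmpdir
"""

print(effective_settings(CONF, "tahoe", "linux"))     # {'tmpdir': '/mnt/ext1'}
print(effective_settings(CONF, "r07", "windows"))     # {'tmpdir': 'F:/TEMP'}
print(effective_settings(CONF, "render12", "mac"))    # {'tmpdir': '/var/tmp'}
```

If the 'host=' lines were moved above the 'os=' lines, the general entries would override them, which is why the more specific entries belong last.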

To make changes to this file and update this to the network, use these commands.

  AdminUser  
Deprecated in 102.40f: Use permit.

Sets login name for user allowed to administer the rush daemons. The adminuser can always manipulate jobs, regardless of the value of DisableFu. Also, commands such as 'rush -dexit', 'rush -dlog a' and others are limited to root and 'adminuser'.

Set to 'root' if there is no special rush administrative login.

    Example: adminuser root

  AllCpusFmt  
Controls the 'rush -lac' report format, using a printf(3) style format string.

You may change the column width values, and can add or remove truncation specifications. eg. %s can be changed to %12s, %12.12s, %-12.12s, etc.
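
To see what these width and truncation specifications do before touching rush.conf, they can be tried in Python, whose '%' operator follows the same printf-style rules for '%s'. This is only a way to preview the padding; the helper name 'show' and the sample values are made up for the illustration.

```python
def show(spec, value):
    """Format value with a printf-style spec; brackets make the padding visible."""
    return "[" + (spec % value) + "]"


long_name = "render-node-with-long-name"  # sample value, longer than 12 characters

for spec in ("%s", "%12s", "%12.12s", "%-12.12s"):
    print(f"{spec:10} {show(spec, 'tahoe'):16} {show(spec, long_name)}")
```

Note that a width alone ('%12s') pads short values but does not truncate long ones; only the precision part ('.12') cuts a long value down to the column width.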

You may not add or remove '%s' fields, or change the data types (eg. do not change '%s' to '%d', etc). It is recommended that you do not use control sequences such as '\n', '\r', or ansi sequences such as '\033[1m', as these will negatively affect irush(1)'s presentations.

This string must be enclosed in double quotes. To represent double quotes in the string, you can use \".

See the caveats and warnings under AllJobsFmt for other relevant warnings.

  AllCpusHeader1  
Sets the header string for line #1 of 'rush -lac' reports.

Basically this should be a string that correctly identifies the column headings for the 'rush -lac' report.

This setting is printed literally to the screen; no printf() style formatting expansions will be made, ie. \n, \r, \t, %s, will all be printed literally.

See the caveats and warnings under AllJobsFmt for other relevant warnings.

Each column heading should be a single word (ie. column heading names should NOT contain spaces), as this will affect the ability of irush to detect the report columns for sorting.

This string must be enclosed in double quotes. To represent double quotes in the string, you must use \".

This heading can be disabled from the reports by setting this field to "-", which means this header line WILL NOT be printed. However, doing so may negatively impact programs such as irush(1), www-rush(1), and other tools that parse the reports, expecting to skip over headers.

  AllCpusHeader2  
Sets the header string for line #2 of 'rush -lac' reports. By default, this is disabled for backwards compatibility: the 'rush -lac' report never had dotted lines on the second line. This may change in a future release.

This is normally the 'dotted line' header that delineates columns for easy viewing. Try to follow the example of the default setting.

This setting is printed literally to the screen; no printf() style formatting expansions will be made, ie. \n, \r, \t, %s, will all be printed literally.

See the caveats and warnings under AllJobsFmt for other relevant warnings.

This string must be enclosed in double quotes. To represent double quotes in the string, you must use \".

This heading can be disabled from the reports by setting this field to "-", which means this header line WILL NOT be printed. Currently, this is the recommended setting, to be backwards compatible with older 'rush -lac' reports.

  AllJobsFmt  
Controls the 'rush -laj' report format, using a printf(3) style format string.

You may change the column width values, and can add or remove truncation specifications. eg. %s can be changed to %12s, %12.12s, %-12.12s, etc.

You may not add or remove '%s' fields, or change the data types (eg. do not change '%s' to '%d', etc). It is recommended that you do not use control sequences such as '\n', '\r', or ansi sequences such as '\033[1m', as these will negatively affect irush(1)'s presentations.

This string must be enclosed in double quotes. To represent double quotes in the string, you can use \".

Warnings and Caveats:

  • Making typos or improper modifications to this string can cause the rushd(1) daemon and/or rush(1) to memory fault. Change this string as carefully as you would change a printf() statement in a C program. No type checking is done to your modifications, so make changes to this string carefully, and double check your work.

  • AllJobsFmt, AllJobsHeader1 and AllJobsHeader2 all must agree on the width of columns, or irush(1) will not parse columns correctly. Make sure when you modify column widths that the headings correctly align with your format string.

  • Heading names must not contain spaces. Irush uses the first character of each word in the AllJobsHeader1 field to determine column boundaries for the purposes of sorting.

  • Changing the left-hand versus right-hand justification of any field from the defaults *may* affect irush's ability to parse fields correctly from columns. Make sure irush(1) and other such tools that parse the report all still work correctly after you make modifications. Report white space parsing problems if you encounter them.
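
The caveats about column boundaries can be illustrated with a short Python sketch. It shows how a tool might derive column positions from the first character of each header word, and why a heading containing a space throws the columns off. This is not irush's code: the function names, the three-column format string and the sample row are all hypothetical.

```python
import re


def column_starts(header):
    """Offsets where each whitespace-separated header word begins."""
    return [m.start() for m in re.finditer(r"\S+", header)]


def split_columns(header, row):
    """Cut a report row at the column boundaries implied by the header."""
    starts = column_starts(header)
    ends = starts[1:] + [None]
    return [row[s:e].strip() for s, e in zip(starts, ends)]


FMT = "%-10.10s %-8.8s %-5.5s"              # hypothetical format, not a Rush default
HEADER = FMT % ("JOBID", "OWNER", "BUSY")    # header built with the same widths
ROW = FMT % ("tahoe.123", "erco", "4")

print(column_starts(HEADER))                 # [0, 11, 20]
print(split_columns(HEADER, ROW))            # ['tahoe.123', 'erco', '4']

# A heading with a space looks like an extra column to this kind of parser.
BAD_HEADER = FMT % ("JOB ID", "OWNER", "BUSY")
print(len(column_starts(BAD_HEADER)))        # 4
```

Building the header from the same widths as the format string, as done here, is one simple way to keep the two in agreement while experimenting.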

  AllJobsHeader1  
Sets the header string for line #1 of 'rush -laj' reports.

Basically this should be a string that correctly identifies the column headings for the 'rush -laj' report.

This setting is printed literally to the screen; no printf() style formatting expansions will be made, ie. \n, \r, \t, %s, will all be printed literally.

See the caveats and warnings for the AllJobsFmt command for other relevant warnings.

Each column heading should be a single word (ie. column heading names should NOT contain spaces), as this will affect the ability of irush to detect the report columns for sorting.

This string must be enclosed in double quotes. To represent double quotes in the string, you must use \".

This heading can be disabled from the reports by setting this field to "-", which means this header line WILL NOT be printed. However, doing so may negatively impact programs such as irush(1), www-rush(1), and other tools that parse the reports, expecting to skip over headers.

  AllJobsHeader2  
Sets the header string for line #2 of 'rush -laj' reports.

This is normally the 'dotted line' header that delineates columns for easy viewing. Try to follow the example of the default setting.

This setting is printed literally to the screen; no printf() style formatting expansions will be made, ie. \n, \r, \t, %s, will all be printed literally.

See the caveats and warnings for the AllJobsFmt command for other relevant warnings.

This string must be enclosed in double quotes. To represent double quotes in the string, you must use \".

This heading can be disabled from the reports by setting this field to "-", which means this header line WILL NOT be printed. However, doing so may negatively impact programs such as irush(1), www-rush(1), and other tools that parse the reports, expecting to skip over headers.

  AllowPush  
Allows the sysadmin to enable (or disable) the 'rush -push' feature. If enabled, the admin can release files like the rush.conf, license.dat, and rush 'hosts' files to an entire network easily and quickly.

    Example: allowpush yes

  AppHostCache  
rush(1)'s host caching option. Only affects the rush(1) client application's method of host caching. Can be 'none' or 'demand'.

    Example: apphostcache demand

  CheckPoint.Log  
Enables log messages when checkpoint events occur.

Examples:

    checkpoint.log 1    -- enable checkpoint logging (default)
    checkpoint.log 0    -- disable checkpoint logging
        

  CheckPoint.OnBoot  
Enables checkpoint files to be loaded automatically when the daemon starts.

Examples:

    checkpoint.onboot 1    -- checkpoint files automatically load 
                              when daemon starts (default)

    checkpoint.onboot 0    -- do not load checkpoint files
    

  CheckPoint.Secs  
Sets how frequently daemon automatically writes snapshots of job checkpoint information to disk.

Examples:

    checkpoint.secs 600  -- write checkpoints every 10 minutes (default)
    checkpoint.secs 0    -- disable automatic checkpoint snapshots
    

Caveats

    Checkpoints may be written more frequently, eg. if user invokes 'rush -jobcheckpoint'. Also, when someone submits a job, the next checkpoint will be scheduled within 30 seconds of the submission.

  CpuAcctPath  
Path to cpu accounting file.

Set to '-' to disable generation of cpu accounting data.

Examples:

    os=windows cpuacctpath c:/rush/var/cpu.acct          # WinNT
    os=unix    cpuacctpath /usr/local/rush/var/cpu.acct  # Unix
    

CAVEATS

  • Do /not/ attempt to redirect the cpu.acct log to an NFS server or remote file system; keep the files local. If you want to centralize the data, make a crontab(1) that sweeps the data to a central server, using either sendmail(8), rcp(1), rdist(1), or some other more forgiving mechanism than NFS.

    NFS is the 'kiss of death' for daemons (rush, cron) if the NFS server hangs or goes down; as soon as the daemon tries to touch a hung NFS (e.g. rush adding a line to cpu.acct when a frame finishes), the daemon will hang up completely. In the case of rush, it will not only make the daemon unresponsive via irush during the outage, it will also be unkillable if the mounts are 'hard'.

  DaemonHostCache  
rushd(8)'s hostname caching options. Only affects the way the daemon caches information. 

Options can be none, demand or boot:

  • none  - No caching. Use local OS for lookups.
  • demand - Cache on demand, or whenever new hostlist is reloaded.
  • boot - Cache on boot, or whenever hostlist reloaded.

Example: daemonhostcache boot

  DisableFu  
Allows administrator to control whether users can use 'rush -fu' and $RUSH_FU to control other people's jobs.

disablefu 1 prevents users from controlling each other's jobs by disabling 'rush -fu' and $RUSH_FU; only root and adminuser can control any jobs.

disablefu 0 allows users to control each other's jobs, as well as root and adminuser.

Normally, users should be able to control each other's jobs, allowing local policies, peer pressure (and auditing daemon logs) to prevent pandemonium.

Example: disablefu 0

  DisablePflags  
Allows administrator to control whether users can use 'k' or 'a' priorities in their cpu specifications.

    DisablePflags Settings
    disablepflags -   -- Allows users to use any priority flags (default).
    disablepflags ka  -- Disables users from being able to use either the 'k' (kill)
                         or 'a' (almighty) priority flags.
    disablepflags k   -- Disables users from being able to use the 'k' (kill)
                         priority flag.

  ExpandCpus  
Allows administrator to control whether hostgroups are expanded to include one or all processors on each host.

  • expandcpus 1 causes hostgroups to expand all processors on each host.
  • expandcpus 0 causes hostgroups to expand to only one processor for each host.

Example: expandcpus 1

  ForceGid  
Forces all user processes to run as this GID. Default is -1, allowing user processes to run as the GID of the user who submitted the job.

Examples:

    forcegid -1     # Disabled
    forcegid 100    # Force the gid to 100 for all processes
	

  ForceUid  

Forces all user processes to run as this UID. Default is -1, allowing user processes to run as the UID of the user who submitted the job.

Examples:

    forceuid -1     # Disabled
    forceuid 100    # Force the UID to 100 for all processes
        

  GidRange  
Controls gid values the same way UidRange controls uid values.

Example: gidrange 100 65000

  HourlyConsole  
Controls whether hourly console messages are printed or not. Can be either 'on' or 'off'.

    Example: hourlyconsole on

  InMaxMsgs  
The maximum number of messages (tcp/udp) that are received and handled at a time before the daemon goes on to other operations, e.g.:

    while ( daemon loop )
    {
        // Read messages from remotes, at most 'inmaxmsgs' per pass
        for ( t=0; t < inmaxmsgs; t++ )
        {
            select(.. inbuf ..)
            if ( inbuf is empty ) break;
            HandleMessage(inbuf);
        }

        // Do other stuff
        ...
    }

This puts a limit on how many messages to read, so rush doesn't spend too much time reading data from remotes. Since the whole network can potentially generate more traffic than the rush daemon can possibly deal with, a limit must be enforced.

    Example: inmaxmsgs 30
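
The effect of the limit can be simulated with a small Python sketch. It is not the daemon's code: the queue stands in for the network input buffer, and the names 'drain' and 'handle' are made up for the illustration.

```python
from collections import deque


def drain(inbuf, inmaxmsgs, handle):
    """Handle at most inmaxmsgs queued messages; return how many were handled."""
    handled = 0
    while handled < inmaxmsgs and inbuf:
        handle(inbuf.popleft())
        handled += 1
    return handled


# Simulate 75 messages arriving at once with 'inmaxmsgs 30'.
inbuf = deque(f"msg{i}" for i in range(75))
passes = []
while inbuf:
    passes.append(drain(inbuf, 30, lambda msg: None))
    # ... other daemon work would run here between passes ...

print(passes)   # [30, 30, 15]
```

With the limit in place, a burst of traffic is spread over several passes of the loop, so the other daemon operations still get their turn between reads.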

  JobFmt  
TBD.

  JobHeader1  
TBD.

  JobHeader2  
TBD.

  JobPassTimeout  
The 'jobpasstimeout' value configures how many seconds the task will remain in the JOBPASS state before re-entering an IDLE state by itself.

When a task on a remote cpu becomes IDLE, it tries to convince a job to use its cpu. If the job 'passes' on this request (no more frames to render, etc), the remote task enters a JOBPASS state, to avoid contacting the job again for a while. After the timeout period, the task re-enters an IDLE state to see if maybe the job had a FAIL frame, and has more frames to render after all.

Example: jobpasstimeout 150
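
The IDLE / JOBPASS cycle described above can be sketched as a tiny state machine. This is an illustration under stated assumptions, not the daemon's implementation: the class and method names are hypothetical, time is passed in explicitly as seconds, and the task is assumed to return to IDLE as soon as the elapsed time reaches the timeout.

```python
class Task:
    """Hypothetical model of one remote task's IDLE / JOBPASS behavior."""

    def __init__(self, jobpasstimeout):
        self.jobpasstimeout = jobpasstimeout
        self.state = "IDLE"
        self.passed_at = None

    def job_passed(self, now):
        """The job declined this cpu (e.g. no more frames to render)."""
        self.state = "JOBPASS"
        self.passed_at = now

    def tick(self, now):
        """Called periodically; re-enter IDLE once the timeout has elapsed."""
        if self.state == "JOBPASS" and now - self.passed_at >= self.jobpasstimeout:
            self.state = "IDLE"
            self.passed_at = None
        return self.state


task = Task(jobpasstimeout=150)
task.job_passed(now=0)
print(task.tick(now=149))   # JOBPASS
print(task.tick(now=150))   # IDLE
```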

  JobUpdateThrottle  
Prevents the daemon from advertising a job's cpus more often than the jobupdatethrottle interval (in seconds). The daemon will re-advertise cpus that haven't been acknowledged by the remotes at about this rate.

Example: jobupdatethrottle 10

  KillCommand  
This is the command rush uses to kill renders under Windows when 'usejobobjects' is disabled. More than one command can be specified in the order you want executed.

Use '%ld' in place of where you want the process ID to appear.

Enable 'logflags kKE' to log kill commands, and see errors.

Use special 'killcommand ' to use the old internal kill which is 'supposed' to kill the process group according to Microsoft, but it doesn't work in real life. Use 'killtree.pl' perl script, or the new 'killtree.exe', which is a binary equivalent (default).

Example: os=windows killcommand c:/rush/etc/bin/killtree %ld
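
The '%ld' substitution can be pictured with a few lines of Python. This is a sketch of the placeholder expansion only: the function name is hypothetical, the process ID is made up, and how rush itself stores multiple kill commands is not shown here.

```python
def expand_kill_commands(templates, pid):
    """Substitute the process ID for each '%ld' placeholder, keeping command order."""
    return [template.replace("%ld", str(pid)) for template in templates]


commands = expand_kill_commands(
    ["c:/rush/etc/bin/killtree %ld", "kill %ld"],
    1234,  # example process ID
)
for command in commands:
    print(command)
```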

Caveats:

	 prodplant: Needed 'kill %ld' to kill chalice renders
	 digiscope: Needed 'killcommand perl c:/rush/etc/bin/killtree.pl %ld' for houdini
	       525: Needed 'killcommand perl c:/rush/etc/bin/killtree.pl %ld' for maya
      

  LogFlags  
Configures daemon logging features; not to be confused with submit script LogFlags.

Most flags are debugging flags used to track operation of the system. Flags can be combined to enable multiple debugging features. 

LogFlags affect both the daemon AND user applications. To affect only the daemon, specify flags on daemon's command line, or use 'rush -dlog <flags>'.

See Logging Flags Table for a complete list of all the one letter log flags.

    Example: logflags jE

  LogRotateHour  
Sets the hour (0-23) that the logs automatically rotate. A value of -1 disables automatic log rotation.

    Example: logrotatehour 0

  MaxNewTaskMsgs  
Limits the maximum number of 'newtask' advertisements the job server sends out to the remotes at a time.

Absolutely must be greater than 0.

Should be a value around 30 or so.

Too low causes new jobs to take a while to establish processors. Too high floods remote's input buffers with 'newtask' advertisements.

    Example: maxnewtaskmsgs 20

  NtRushGid  
The gid used if an NT submitted job is to run on Unix machines.

Since the NT version of rush doesn't know how to map the name 'ntrush' to the equivalent gid value, NtRushGid is used to resolve it. Basically, this value should be the same as the gid value for the Unix user 'ntrush'.

    Example: ntrushgid 100

  NtRushUid  
The uid used if an NT submitted job is to run on Unix machines.

Since the NT version of rush doesn't know how to map the name 'ntrush' to the equivalent uid value, NtRushUid is used to resolve it. Basically, this value should be the same as the uid value for the Unix user 'ntrush'.

    Example: ntrushuid 100

  Permit  
(New in rush 102.40f and up)

Permit users to access certain rush functions.

The syntax of the 'permit' command is as follows:

    permit
    {
	functionlist:
	{
	    userlist
	}
	functionlist:
	{
	    userlist
	}
    }

Comments can be interspersed within the 'permit' command, and must be delimited with '#'.

To permit users to only online/offline/getoff their own machines, or a specific list of machines, see below for examples of how to do this.

'userlist' is a list of users who will be granted access to the functions in the preceding 'functionlist' described below. User names can be separated by commas (,) spaces ( ) or can appear on separate lines, or any combination of commas, spaces and lines. '*' is special in that it matches 'all users'. No verification is done to check if user names are actually valid, so it's not an error to specify nonexistent users. The rush debugging flag 'F' can be used to debug 'permit' settings, e.g. 'rush -d F -ping |& grep permit:'.

'functionlist' is a comma or space separated list of function names from the table below, which specifies the functions that will be granted to the users in 'userlist'. 'functionlist' can contain any of:

    Permit Functions
    everything  -- /All/ operations in this table, including administrative commands,
                   eg: rush -push, rush -dexit, rush -rotate..
    online      -- Lets users use the 'rush -online' command, or the same function in onrush(1)
    offline     -- Lets users use the 'rush -offline' command, or the same function in onrush(1)
    getoff      -- Lets users use the 'rush -getoff' command, or the same function in onrush(1)
    kill        -- Lets users use the 'k' kill priority (eg. +any=100k)
                   (This setting can be overridden by 'disablepflags k')
    almighty    -- Lets users use the 'a' almighty priority (eg. +any=100a)
                   (This setting can be overridden by 'disablepflags a')
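
To make the syntax concrete, here is a rough Python sketch that reads a 'permit' block into a table of function names and users, then answers "may this user do that?". It is a reading aid only, not Rush's parser: the names parse_permit and is_permitted are hypothetical, 'host=' and 'os=' prefixes inside the block are not handled, and 'disablepflags' overrides are ignored.

```python
import re


def parse_permit(text):
    """Return {function_name: set_of_users} from a 'permit { ... }' block."""
    # Strip '#' comments, then split into names and the structural characters { } :
    body = "\n".join(line.split("#", 1)[0] for line in text.splitlines())
    tokens = re.findall(r"[{}:]|[^\s,{}:]+", body)
    if tokens[:2] != ["permit", "{"]:
        raise ValueError("expected 'permit {'")
    grants = {}
    i = 2
    while i < len(tokens) and tokens[i] != "}":
        funcs = []
        while i < len(tokens) and tokens[i] not in ("{", "}", ":"):
            funcs.append(tokens[i])
            i += 1
        if tokens[i:i + 2] != [":", "{"]:
            raise ValueError("expected ':' and '{' after function list")
        i += 2
        users = set()
        while i < len(tokens) and tokens[i] not in ("{", "}", ":"):
            users.add(tokens[i])
            i += 1
        if tokens[i:i + 1] != ["}"]:
            raise ValueError("expected '}' after user list")
        i += 1
        for func in funcs:
            grants.setdefault(func, set()).update(users)
    if tokens[i:i + 1] != ["}"]:
        raise ValueError("missing closing '}'")
    return grants


def is_permitted(grants, user, function):
    """True if the user is listed for the function or for 'everything'; '*' matches all."""
    for name in (function, "everything"):
        users = grants.get(name, set())
        if user in users or "*" in users:
            return True
    return False


SAMPLE = """
permit
{
    everything:
    {
        root,administrator      # admins
    }
    online,offline:
    {
        fred fez                # spaces, commas and newlines all separate names
    }
}
"""

grants = parse_permit(SAMPLE)
print(is_permitted(grants, "fez", "online"))    # True
print(is_permitted(grants, "fez", "kill"))      # False
print(is_permitted(grants, "root", "kill"))     # True
```

A function list that is missing its trailing ':' raises an error in this sketch, which is a handy way to catch that typo when drafting a block by hand.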


Permit Examples


    Default Permissions
    The default rush permissions.
    
    # Example. The default permit behavior:
    #    1. 'root' and 'administrator' can do /everything/
    #    2. everyone else can do only normal user stuff (not admin commands)
    #
        
    permit
    {
        everything:
        {
            root                        # unix 'root' user
            administrator               # windows 'administrator' user
        }
    
        online,offline,getoff,kill,almighty:
        {
            *                           # allow everyone to do these functions      
        }
    }

    Wide Open Permissions
    Let everyone do everything.
        permit
        {
            everything:
            {
                *             # everyone can do admin functions (everything)  
            }
        }

    Specific User Permissions
    Allow certain users to have specific permissions
    
    # Example. Configure specific user permissions:
    #     1) 'root' and 'administrator' can do /everything/
    #     2) 'fred' and 'fez' can online/offline
    #     3) 'jack' 'jane' and 'fred' can use kill/almighty priorities
    #     4) 'bill' and 'ted' can use online/offline/getoff/kill
    #
    
    permit
    {
        everything:
        {
            root,administrator      # root,administrator can do everything
        }

        online,offline:
        {
            fred,fez                # fred,fez can online and offline machines
        }

        kill,almighty:
        {
            jack,jane,fred          # jack,jane and fred can use k/a priority
        }

        online,offline,getoff,kill:
        {
            bill,ted                # bill and ted can online/offline/getoff
                                    # and use 'k' priority
        }
    }

    Real World Example
    
    permit
    {
         everything:
         {
    	root,administrator
         }
    
         online,offline:
         {
    	*
         }
    
         getoff:
         {
            # Only production TDs can getoff. *ahem*
    
    	# "FIFTH"
    	fifth,jendy,rinbow,mia,kang,ty,karl,markip,ochere,bchavez,jge
    	klovance,amby,kweith,ezimmerman,jhl,jinx,benbower,kholzman,
    	pshino,klm,ronan,bmittle,kenbergman,jw
    
    	# "HONDA"
            honda,zaustin,justinp,avio,mia,bks,mdavis,adamk,gutzin,rga
    	jmilburn,jenn,aglass,orink,kcb,ronan,kglass,andrew
    
    	# PRODUCERS
    	lisa,bonk,wandas,dan
    
    	# RENDER WATCHERS
    	dannyb,nick,hellerman,donovan
    
         }
    
         kill,almighty:
         {
    	# PRODUCERS
    	lisa,bonk,wandas,dan
    
    	# DATA I/O
    	catlin,dman
         }
    
         # USERS WHO CAN USE ONLINE/OFFLINE/GETOFF ON THEIR OWN MACHINES ONLY
         #   Note use of new 'host=<hostname>' to limit commands to run
         #   only on the machines specified.
         #
         online,offline,getoff:
         {
    host=hollywood  fred                            # fred can control host hollywood
    host=fenway     jenna                           # jenna can control host fenway
    host=oaklawn    bks,fred                        # bks and fred can control host oaklawn
    host=+farm      dannyb,nick,hellerman,donovan   # render watchers can control farm hosts
         }
    }

Permitting Users To Only Control Their Own Hosts

The 'host=' and 'os=' prefixes (described in the rush.conf file format description) can be used to cause lines to be executed only on specific hosts.

Example. This shows how to configure 'permit' to allow users to online/offline/getoff certain machines:

    Permit Workstation Online/Offline
    Allow users to online/offline their own workstations.
    permit
    {
        [..]
    
        online,offline,getoff:
        {
    host=tahoe      erco,jack       # erco and jack can control host tahoe
    host=ontario    reid            # reid can control host ontario
    host=+farm      erco            # erco can control all hosts in the +farm host group
        }
    }
            

  rush.(options)  
(New in Rush 102.40h and up)

Rush command line arguments have various defaults, some of which are configurable by the sysadmin in the rush.conf file. Here is a table of the supported values that can be changed:

    rush.(options)

    rush.lac_count 1
        Changes default [-c cnt] value for 'rush -lac'.
        Default is 1, and should not be changed.

    rush.lac_secs 4
        Changes default [-s secs] value for 'rush -lac'.
        Default is 4. Larger values cause longer wait for responses.

    rush.laj_count 1
        Changes default [-c cnt] value for 'rush -laj'.
        Default is 1, and should not be changed.

    rush.laj_secs 4
        Changes default [-s secs] value for 'rush -laj'.
        Default is 4. Larger values cause longer wait for responses.

    rush.status_count 1
        Changes default [-c cnt] value for 'rush -status'.
        Default is 1, and should not be changed.

    rush.status_secs 4
        Changes default [-s secs] value for 'rush -status'.
        Default is 4. Larger values cause longer wait for responses.

    rush.status_backoff_min 5
        Changes the minimum number of times to contact a host
        before kicking in the packet transmission backoff algorithm
        for 'rush -status'. Default is 5, indicating 5 consecutive attempts must fail
        before Rush starts backing off transmissions to this host.

    rush.status_backoff_max 15
        Sets the maximum number of times to skip transmissions for 'rush -status'.
        Default is 15, which indicates up to 15 transmissions will be
        skipped when 'backoff' is in effect. The maximum backoff between
        transmissions when a host is down is: (status_backoff_max * status_secs).

        Use larger values to increase the backoff time, to prevent
        rushtop(1) from sending Arp packets to powered off machines.
        See also this FAQ entry.

    rush.push_count 2
        Changes default [-c cnt] value for 'rush -push'.
        Default is 2, which indicates a second attempt is made to machines
        that don't respond the first time.

    rush.push_secs 4
        Changes default [-s secs] value for 'rush -push'.
        Default is 4, which indicates rush will wait up to 4 seconds
        for machines to respond to 'rush -push' operations.

    rush.dlogstats_count 1
        Changes default [-c cnt] value for 'rush -dlogstats'.
        Default is 1, and should not be changed.

    rush.dlogstats_secs 4
        Changes default [-s secs] value for 'rush -dlogstats'.
        Default is 4. Larger values cause longer wait for responses.
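
As a worked example of the backoff formula given for rush.status_backoff_max: with the defaults, the longest gap between transmissions to a down host is 15 * 4 = 60 seconds. The small Python helper below just evaluates that formula for other settings; the function name is made up, and the non-default values are arbitrary examples rather than recommendations.

```python
def max_backoff_secs(status_backoff_max=15, status_secs=4):
    """Longest gap between transmissions to a down host, per the formula above."""
    return status_backoff_max * status_secs


print(max_backoff_secs())          # 60  (defaults: 15 * 4)
print(max_backoff_secs(75, 4))     # 300 (five minutes)
print(max_backoff_secs(15, 10))    # 150
```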

  ServerPort  
Set the rushd(1) server daemon's port numbers for UDP/TCP connections. 

Though unnecessary for proper operation of the render queue, you should register the ServerPort value in your /etc/services file, e.g.:

	rushd 696/tcp    # rush render queue
	rushd 696/udp    # rush render queue

Example: serverport 696

  SmtpDebug  
Enables debugging of the SMTP mail transfers (eg. 'donemail' deliveries) to the log file $RUSH_DIR/var/mail.log.

This aids in debugging problems with email delivery, so one can see the entire email transaction with the configured SmtpServer.

This is a Windows specific flag.

Examples:

    os=windows smtpdebug -  # Disable SMTP debugging
    os=windows smtpdebug t  # Enable SMTP tcp transactions
	

  SmtpFrom  
Sets the 'from' address for all emails sent by rush. This also affects the "Errors-To:", "Reply-To", and "Return-Path" fields of the messages, controlling where bounced email is sent if delivery fails in transit.

NOTE: If rush is unable to deliver the mail to the server, the mail is simply dropped, and an error message will appear in the $RUSH_DIR/var/mail.log.

This is a Windows specific flag.

Example:

    os=windows smtpfrom ntrush@yourdomain.com
        

  SmtpPort  
The TCP port rush uses to contact the SmtpServer to deliver mail. Unless your mail server is listening on some other port, leave this at 25.

This is a Windows specific flag.

Example:

    os=windows smtpport 25
        

  SmtpServer  
This should be set to the hostname of the SMTP server for your local network. Rush will use this server to deliver DoneMail messages.

If set to '-', mail delivery is disabled. Error messages will appear in the rushd.log whenever a user attempts to use DoneMail.

This is a Windows specific flag. Windows does not have a command line oriented mail delivery agent, so rush uses its own ($RUSH_DIR/bin/rushsendmail).

Example:

    os=windows smtpserver mail.yourdomain.com
        

  TaskCleanupHours  
Sets up the hour(s) of the day that rush purges orphaned tasks from the 'rush -tasklist'. (eg. a job server reboots; all remote machines end up with tasks that won't go away until they're next up for scheduling) This should be done at least once a day during early morning hours.

For example, 'taskcleanuphours 5' means cleanup occurs between 5:00am and 5:40am. Rush disperses the cleanup operation on a per-host basis over a 40 minute period to prevent network load.

The number of minutes delay from the hour is the position in the rush hosts file mod 40.

Examples:

    # This example sets cleanups to run at 5am each day (Default)
    taskcleanuphours 5

    # This example sets cleanups to run 4 times per day: midnight, 6am, noon, and 6pm
    taskcleanuphours 0
    taskcleanuphours 6
    taskcleanuphours 12
    taskcleanuphours 18
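
The per-host stagger described above ("position in the rush hosts file mod 40") can be sketched in Python. This is only an illustration of the arithmetic, not rush's actual code; the function name is hypothetical, and whether host positions are counted from 0 or 1 is an assumption not stated here.

```python
def cleanup_time(hour, host_position):
    """Return (hour, minute) at which a host would begin its task cleanup.

    hour          -- a value given to 'taskcleanuphours' (0-23)
    host_position -- the host's position in the rush hosts file
                     (assumption: counted from 0; the text does not say)
    """
    minute = host_position % 40   # delay in minutes past the hour
    return hour, minute


# Show where a few hosts would land for 'taskcleanuphours 5'
for pos in (0, 7, 39, 40, 95):
    h, m = cleanup_time(5, pos)
    print(f"host #{pos}: cleanup at {h:02d}:{m:02d}")
```

Under this reading, hosts 40 positions apart in the hosts file share the same minute.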
        

  TaskKeepaliveSecs  
Sets the interval, in seconds, at which tasks in the cpu server tasklists check in with their jobs to make sure the jobs are still active. (A job might 'disappear' due to a shutdown.)

Note that this value is a minimum; a random value of up to 40 minutes is added to this value to prevent 'packet storming'.

Normally this is set to 8 hours (28800 seconds), which means that once a job is submitted, every 8 hours (or so) the remote cpu servers will check back in with their job server to ensure the job is still alive.

    Example: taskkeepalivesecs 28800
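
As a rough illustration of "minimum plus a random value of up to 40 minutes", the delay before a check-in could be modeled as below. The function name is hypothetical, and the uniform distribution is an assumption; the text only says the added value is random.

```python
import random

MAX_JITTER_SECS = 40 * 60   # "up to 40 minutes", per the text


def next_checkin_delay(taskkeepalivesecs, rng=random):
    """Return seconds until the next keepalive check-in.

    Assumption: the random part is uniformly distributed; rush's
    actual distribution is not documented here.
    """
    return taskkeepalivesecs + rng.uniform(0, MAX_JITTER_SECS)


delay = next_checkin_delay(28800)
print(f"next check-in in {delay / 3600:.2f} hours")
```

With the usual value of 28800, this gives a delay between 8 hours and 8 hours 40 minutes.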

  TcpSockOpts  
Allows administrator to set various TCP tuning values for all tcp-based connections (rush -lf/-lj/-log/-ping, etc).

Several instances of 'tcpsockopt' can be specified, to set multiple flags.

Currently, only TCP_NODELAY is recommended. Usage:

tcpsockopt <SO_OPTION> <value>

..where <SO_OPTION> is one of:

    TCP_NODELAY
      Disables the Nagle algorithm, which noticeably speeds up all TCP based connections when small amounts of data are involved (rush -ping, etc). See "Nagle" RFC 896, "delayed ACK" RFC 813, and "Hosts Communication Requirements" RFC 1122.

    SO_LINGER
      If specified, lingering is ALWAYS enabled; <value> is the linger 'time'. See setsockopt(2) for more info.

    SO_KEEPALIVE
      Enable connected sockets to be 'kept alive'. <value> must be '1' to enable, 0 to disable.

    SO_DONTROUTE
      Not recommended. If enabled, socket connections bypass normal routing. <value> must be '1' to enable, 0 to disable.

    SO_REUSEADDR
      Not recommended. Enables reuse of port addresses. <value> must be '1' to enable, 0 to disable.

    SO_REUSEPORT
      Not recommended. Enables reuse of port addresses. Some platforms (Redhat 6.2) don't support this at all; on those platforms, the option is ignored. <value> must be '1' to enable, 0 to disable.

    SO_SNDBUF
    SO_RCVBUF
      Not recommended. Sets the send/receive buffer size. <value> is the number of bytes in the buffer. Rush may override the value you specify in some cases, as it may know it needs large buffers.

Example: tcpsockopts TCP_NODELAY 1
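
For readers unfamiliar with setsockopt(2), the following Python sketch shows roughly what such a setting amounts to at the socket level. It is not rush's implementation; the lookup table and the apply_sockopt() helper are hypothetical, and only a few of the options listed above are included.

```python
import socket

# Option names from the text, mapped to (level, option) pairs for setsockopt().
SOCKOPTS = {
    "TCP_NODELAY":  (socket.IPPROTO_TCP, socket.TCP_NODELAY),
    "SO_KEEPALIVE": (socket.SOL_SOCKET,  socket.SO_KEEPALIVE),
    "SO_REUSEADDR": (socket.SOL_SOCKET,  socket.SO_REUSEADDR),
}


def apply_sockopt(sock, line):
    """Apply one '<command> <SO_OPTION> <value>' config line to a socket.

    Hypothetical helper for illustration; the first word of the line
    (the command name) is ignored.
    """
    _command, name, value = line.split()
    level, option = SOCKOPTS[name]
    sock.setsockopt(level, option, int(value))


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
apply_sockopt(sock, "tcpsockopts TCP_NODELAY 1")
# A non-zero value read back means the Nagle algorithm is disabled.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```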

  TmpDir  
Allows administrator to set where rush creates the $RUSH_TMPDIR for user's jobs.

Be aware that when every frame executes, a subdirectory is created in 'tmpdir', and on completion the subdir is 'rm -rf'ed. Both are done *as the user*, so users must have write permission to this directory.

Examples:

    os=windows tmpdir c:/temp         # WinNT
    os=unix    tmpdir /var/tmp        # Unix
        

  UdpMaxRetries  
The number of re-transmissions until a 'retry time-out' occurs.

    Example: udpmaxretries 5

  UdpRestTimeOut  
The number of seconds to rest before recovering from a 'retry time-out'.

    Example: udpresttimeout 40

  UdpTimeout  
The number of seconds between udp re-transmissions.

    Example: udptimeout 8
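
The three Udp* settings interact. The sketch below shows one possible reading of how they combine, using the example values above. It is an assumption-laden illustration, not rush's actual retry logic: in particular, the point at which the 'retry time-out' is declared (here, one more udptimeout after the last re-transmission) is a guess, and the function name is hypothetical.

```python
def retry_schedule(udptimeout, udpmaxretries, udpresttimeout):
    """Return (resend_times, timeout_at, resume_at), in seconds after the first send.

    Assumptions (not confirmed by the text):
      - re-transmissions go out every 'udptimeout' seconds
      - the 'retry time-out' is declared one more 'udptimeout' after
        the last re-transmission goes unanswered
      - sending resumes 'udpresttimeout' seconds after that
    """
    resend_times = [udptimeout * n for n in range(1, udpmaxretries + 1)]
    timeout_at = udptimeout * (udpmaxretries + 1)
    resume_at = timeout_at + udpresttimeout
    return resend_times, timeout_at, resume_at


resends, timeout_at, resume_at = retry_schedule(8, 5, 40)
print("re-transmissions at:", resends)
print("retry time-out at:  ", timeout_at)
print("recovery at:        ", resume_at)
```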

  UidRange  
Prevents the render queue from running processes with a uid outside this range. The first value is the minimum, the second value is the maximum.

When a job is submitted, if the user's uid value is outside the range specified here, an error message is printed and the job will not be submitted.

    Example: uidrange 100 65000
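
The submit-time check described above amounts to a simple range test. A minimal Python sketch follows, assuming both bounds are inclusive (the text does not say); the function names are hypothetical.

```python
def parse_uidrange(line):
    """Parse a 'uidrange <min> <max>' config line into (min, max)."""
    _command, lo, hi = line.split()
    return int(lo), int(hi)


def uid_allowed(uid, uid_min, uid_max):
    """True if uid is within the range (assumption: bounds are inclusive)."""
    return uid_min <= uid <= uid_max


uid_min, uid_max = parse_uidrange("uidrange 100 65000")
for uid in (0, 99, 100, 1234, 65001):
    verdict = "ok" if uid_allowed(uid, uid_min, uid_max) else "rejected"
    print(f"uid {uid}: {verdict}")
```

With the example range, root (uid 0) and other low-numbered system accounts fall outside the range and would be rejected.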

  UseJobObjects  
Windows Only.

Enables the use of Windows 2000 'Job Objects', which are supposedly better at killing renders when rush needs to (when requeueing frames, dumping jobs, etc).

If enabled, Job Objects will be used (if the windows platform supports them), and rush will ignore the KillCommands altogether. Some older windows platforms don't support Job Objects (eg. Windows NT); these systems automatically fall back to the older KillCommand approach to job control.

If disabled, job objects will not be used, even if available.

It is recommended this setting be left on. The only reason to turn it off is if you suspect it of causing problems; it is relatively new (as of 102.31p).

To have this flag take effect, you must restart the daemon.

Examples:

    os=windows usejobobjects yes     # Enable job objects if available (default)
    os=windows usejobobjects no      # Disable job objects




Cpu Accounting File
$RUSH_DIR/var/cpu.acct


The cpu accounting file is configured with the rush.conf file's CpuAcctPath command. Each time a frame finishes executing, a new entry is created in the Cpu Accounting file, logging the name of the job, how long the frame ran, etc.

Cpu Accounting File Example

u  948242700 53
p  948242783 tahoe.798    WERNER/C33 erco     0106  superior 100k  122  0   0   0 27823
p  948242783 tahoe.798    WERNER/C33 erco     0107  superior 100k  122  0   0   0 27834
p  948242865 tahoe.797    KILLER     erco     0504  superior 200   121  0   0   0 27846
u  948246300 5
u  948249900 0

Process Entries


p  948242783 tahoe.798 WERNER/C33 erco  0106  superior  100k  122  0   0   0 27822
p  948242783 tahoe.798 WERNER/C33 erco  0107  superior  100k  122  0   0   0 27834
p  948242865 tahoe.797 KILLER     erco  0504  superior  200   121  0   0   0 27846
-  --------- --------- ---------- ----  ----  --------  ----  ---  -   -   - -----
|      |         |          |      |     |       |       |     |   |   |   |   |
|      |         |          |      |     |       |       |     |   |   |   |   Pid
|      |         |          |      |     |       |       |     |   |   |   |
|      |         |          |      |     |       |       |     |   |   |   Exit code
|      |         |          |      |     |       |       |     |   |   |
|      |         |          |      |     |       |       |     |   |   #Secs User Time
|      |         |          |      |     |       |       |     |   |                 
|      |         |          |      *Job  |       |       |     |   #Secs System Time
|      |         |          |      Owner |       |       |     |
|      |         |          |            |       |       |     |
|      |         |          Title of job |       |       |     #Secs Wall Clock Time
|      |         Jobid                   |       |       |
|      |                                 |       |       Priority
|      time(2) process started           |       |
|                                        |       Host that ran the process
'p' indicates 'process entry'            |
                                         Frame that ran

* The job owner is not necessarily the owner of the process.
  Such is the case in windows jobs running frames on unix machines,
  or 'forceuid' configured in the rush.conf file.


Utilization Entries


u  948242700 53
u  948246300 5
-  --------- --
|      |      |
|      |      Percent of time processor(s) were busy rendering. (0-100)
|      |
|      time(2) utilization recorded
|
'u' indicates 'utilization entry' 
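
The two entry layouts above are simple enough to read with a short script. The Python sketch below is one way to do it, not an official tool; the tuple and function names are hypothetical. It assumes fields are whitespace-separated, that job titles contain no spaces, and that the time fields are whole seconds, as in the samples.

```python
from collections import namedtuple

# Field names follow the diagrams above; the names themselves are illustrative.
ProcessEntry = namedtuple(
    "ProcessEntry",
    "started jobid title owner frame host priority "
    "wall_secs sys_secs user_secs exit_code pid")
UtilEntry = namedtuple("UtilEntry", "recorded percent_busy")


def parse_acct_line(line):
    """Parse one cpu.acct line; return an entry, or None if unrecognized."""
    fields = line.split()
    if not fields:
        return None
    if fields[0] == "p" and len(fields) == 13:
        (started, jobid, title, owner, frame, host, priority,
         wall, sys_t, user_t, exit_code, pid) = fields[1:]
        # Frame and priority are kept as strings ('0106', '100k').
        return ProcessEntry(int(started), jobid, title, owner, frame, host,
                            priority, int(wall), int(sys_t), int(user_t),
                            int(exit_code), int(pid))
    if fields[0] == "u" and len(fields) == 3:
        return UtilEntry(int(fields[1]), int(fields[2]))
    return None


sample = [
    "u  948242700 53",
    "p  948242865 tahoe.797 KILLER erco 0504 superior 200 121 0 0 0 27846",
]
for entry in map(parse_acct_line, sample):
    print(entry)
```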

CAVEATS

  • 'Exit code' is normally a non-negative number representing the actual exit code of the process. The value will be negative if the process was signaled, in which case its magnitude is the signal number. A negative value usually means the process was killed, segfaulted, or was bumped by a higher priority process. Commonly, the 'Exit code' will be one of:
    
      -15 - process killed with SIGTERM; someone probably manually killed it
       -9 - process killed with SIGKILL; probably bumped in a priority battle
       -2 - process killed with SIGINT; someone sent it a ^C
        0 - process did an exit(0); frame Done
        1 - process did an exit(1); frame Fail
        2 - process did an exit(2); frame Requeue
        

  • Do /not/ attempt to redirect the cpu.acct log to an NFS server or remote file system; keep the files local. If you want to centralize the data, make a crontab(1) that sweeps the data to a central server, using either sendmail(8), rcp(1), rdist(1), or some other more forgiving mechanism than NFS.

    NFS is the 'kiss of death' for daemons (rush, cron) if the NFS server hangs or goes down; as soon as the daemon tries to touch a hung NFS (e.g. rush adding a line to cpu.acct when a frame finishes), the daemon will hang up completely. In the case of rush, it will not only make the daemon unresponsive via irush during the outage, it will also be unkillable if the mounts are 'hard'.

  • Although tempting, it is not recommended to use process execution times for cpu billing purposes. Wall clock time includes time the process may have spent waiting for network load. User and System times report the respective times spent for the Render Script only; not its sub-processes (e.g., the renderer).

    To properly bill for cpu time, you would need to either enable full-on Unix process accounting to obtain accumulated cpu time for all sub-processes in the user's render script, or create wrapper scripts that use programs like timex(1) to monitor the binary execution time of the critical render/compositor processes.

    Tools like timex(1) indicate in their documentation that they must have Unix process accounting enabled to show sub-process totals. This is usually prohibitive on production machines, due to the disk resources used by the Unix process accounting system.
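
The 'Exit code' convention from the first caveat (negative values are signal numbers, 0/1/2 map to frame states) can be decoded with a few lines of Python. This is a sketch for reading cpu.acct data, not part of rush; the function name is hypothetical. Signal names are looked up on the local platform, which may number some signals differently from the machine that ran the frame.

```python
import signal

# Frame states for the exit codes listed in the first caveat.
FRAME_STATES = {0: "Done", 1: "Fail", 2: "Requeue"}


def describe_exit_code(code):
    """Return a short description of a cpu.acct 'Exit code' value."""
    if code < 0:
        # Negative: the process was signaled; magnitude is the signal number.
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = f"signal {-code}"      # not a signal known on this platform
        return f"killed by {name}"
    state = FRAME_STATES.get(code)
    if state:
        return f"exit({code}); frame {state}"
    return f"exit({code})"


for code in (-15, -9, 0, 1, 2, 7):
    print(f"{code:4d}  {describe_exit_code(code)}")
```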