RUSH RENDER QUEUE - JOB PRIORITIES
(C) Copyright 1995,2000 Greg Ercolano. All rights reserved.
V 102.41 03/05/04


Priority Description

    In general, higher priority values are 'more important'.

    Priority values are in the range 1-999. Values outside this range cause an error.

    Priority values are generally specified in the 'cpus' command, such as:

      cpus tahoe@100
      cpus tahoe=1@100

    Both of the above are equivalent, asking for one cpu on tahoe at 100 priority. (If the number of cpus is not specified, '1' is the default.)
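
    To make the shape of these specifications concrete, here is a minimal Python sketch of a parser for them. It is not part of rush: the function name parse_cpu_spec, the regular expression, and the returned dictionary are all assumptions made for illustration. Only the 'host=cpus@priority' form, the default of 1 cpu, the 1-999 range, and the 'k' and 'a' flags (described further down) are taken from this document. The host part is kept as an opaque string.

```python
import re

# Hypothetical pattern for "host[=cpus]@priority[flags]"; this is an
# illustration only, not rush's own parser.
_SPEC = re.compile(
    r"^(?P<host>[^=@\s]+)(?:=(?P<count>\d+))?@(?P<prio>\d+)(?P<flags>[ka]*)$"
)

def parse_cpu_spec(spec):
    """Split a spec such as 'tahoe=1@100' into host, cpus, priority, flags."""
    m = _SPEC.match(spec.strip())
    if m is None:
        raise ValueError(f"unrecognized cpu spec: {spec!r}")
    priority = int(m.group("prio"))
    if not 1 <= priority <= 999:
        raise ValueError(f"priority {priority} is outside the range 1-999")
    return {
        "host": m.group("host"),
        "cpus": int(m.group("count") or 1),   # '1' is the default
        "priority": priority,
        "flags": set(m.group("flags")),       # e.g. {'k'}, {'a'}, {'k', 'a'}
    }

print(parse_cpu_spec("tahoe@100"))
print(parse_cpu_spec("tahoe=1@100"))
```

    Both calls print the same dictionary, which mirrors the equivalence of the two 'cpus' lines above.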

    Jobs contend for cpus based primarily on priority. When priority values are equal, the system uses a round robin scheme, on a 'first come, first served' basis.

    Priority values are 'relative'. If all other jobs on the network are 100 but your job is 101, it will always win battles for an idle cpu. (Making your job 200 priority will not make it execute any quicker.)

    When priority values differ, cpus are arbitrated using these rules:

    • Higher Priority Always Wins.
      If two jobs contend for a cpu, the higher priority job always wins, which implies the next rule:

    • Lower Priority Jobs Always Lose.
      The lower priority job will always lose.

    • Equal Priority Jobs Share.
      If two jobs of equal priority contend for a cpu, they alternate execution on the cpu.
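
    The three rules above can be sketched as a toy arbiter for a single idle cpu. This is only an illustration under stated assumptions: the Arbiter class, its pick() method, and the 'least recently served goes first' tie-break are hypothetical and say nothing about how rush implements its round robin internally.

```python
from itertools import count

class Arbiter:
    """Toy arbiter for one cpu: highest priority wins, equals take turns."""

    def __init__(self):
        self._clock = count(1)
        self._last_served = {}   # job name -> tick at which it last got the cpu

    def pick(self, jobs):
        """jobs maps job name -> priority; returns the job that gets the cpu."""
        if not jobs:
            return None
        top = max(jobs.values())
        contenders = [name for name, prio in jobs.items() if prio == top]
        # Among equal priorities, serve whoever was served longest ago
        # (a job that has never been served sorts first).
        winner = min(contenders, key=lambda name: self._last_served.get(name, 0))
        self._last_served[winner] = next(self._clock)
        return winner

arb = Arbiter()
print([arb.pick({"a": 100, "b": 100}) for _ in range(4)])   # equal: they alternate
print([arb.pick({"a": 100, "b": 101}) for _ in range(3)])   # 101 always wins
```

    Note that in the second call a priority of 101 is enough to win every time, which is the sense in which priorities are 'relative'.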

    In addition to the above, 'priority flags' may be appended to the priority values (e.g. 100k, 100a, 100ka). These flags augment the above behavior in the following ways:

    • Kill Flag ('k').
      The job will kill lower priority jobs immediately, rather than wait for them to finish rendering frames already in progress (the default behavior). A killed frame will automatically be re-rendered on the next available cpu. The 'k' flag is only effective against jobs of lower priority. Where the priorities are equal, the 'k' flag has no effect.

    • Almighty Flag ('a').
      Prevents higher priority jobs from killing this job's frames already in progress during priority battles. Basically, this disables other jobs' Kill ('k') flags, causing those jobs to revert to the default 'passive' behavior of waiting for an in-progress frame to finish.

    Priority flags are normally used separately, but can be combined (e.g., 100ka) to create the situation known as 'Kick Ass' mode.

      Beware: abuse of these flags can be tracked by sysadmins. If you cause trouble by submitting jobs with killer priorities that are not assigned to you, you can be tracked down via the system's auditing logs.
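
    As a rough summary of how the priority value and the two flags interact, the decision for one cpu can be sketched as a small Python function. The name takeover_mode and its return strings are hypothetical; the logic only restates the rules in this document (higher priority wins, 'k' only works against lower priorities, and 'a' on the running job cancels the newcomer's 'k').

```python
def takeover_mode(new_prio, new_flags, running_prio, running_flags):
    """Describe what happens when a new job contends for a busy cpu.

    Flags are strings such as '', 'k', 'a' or 'ka'. Returns one of:
      'share' - equal priority, the jobs alternate on the cpu
      'lose'  - the new job has lower priority and does not get the cpu
      'kill'  - the running frame is killed at once (and requeued)
      'wait'  - the new job takes over after the current frame finishes
    """
    if new_prio == running_prio:
        return "share"                 # 'k' has no effect between equals
    if new_prio < running_prio:
        return "lose"
    if "k" in new_flags and "a" not in running_flags:
        return "kill"
    return "wait"                      # default 'passive' behavior

print(takeover_mode(200, "k", 100, ""))    # kill
print(takeover_mode(200, "k", 100, "a"))   # wait
print(takeover_mode(100, "k", 100, ""))    # share
```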

    Here are some example situations to demonstrate the above rules.

    Priority Scenarios
    Example: Passive Higher Priority (Non-Killer)

      A 100 priority job is running on a cpu. No other jobs are using the cpu, so the job continues to render on that cpu, one frame after the other.

      Suddenly, someone submits a 200 priority job to the same cpu. The 100 priority job will be allowed to finish rendering the current frame, and then the 200 priority job takes over the cpu, rendering all its frames. Once the 200 priority job has completed, the 100 priority job continues to render the remaining frames.

    Example: Aggressive Higher Priority (Killer)

      Similar to the above, a 100 priority job is running on a cpu. No other jobs are active, so the job continues to render on that cpu, one frame after the other.

      But this time, someone submits a 200k priority job (kill flag is set). The 100 priority job's frame is immediately killed, and the 200k priority job takes over the cpu until all its frames are rendered, at which point the 100 priority job resumes on the cpu.

    Example: Equal Priority (Round Robin)

      Again, a 100 priority job is running, with no other jobs active.

      Then someone submits a different job at 100 priority. Both jobs will alternate using the cpu, yielding to each other.

      Note: Even if either or both jobs had their 'k' flags set, the behavior would still be the same, since the priority of both jobs is equal (killer jobs will only kill lower priority jobs, not jobs of equal priority).
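
    The three scenarios can be replayed with a small single-cpu simulation. This is a simplified sketch, not rush's scheduler: the job() and simulate() helpers, the fixed frame length of 2 ticks, and the 'least recently started' tie-break for equal priorities are all assumptions chosen to keep the example short.

```python
def job(name, prio, frames, submit, kill=False):
    """Hypothetical job record: priority, frame count, submit tick, 'k' flag."""
    return {"name": name, "prio": prio, "frames": frames,
            "submit": submit, "kill": kill}

def simulate(jobs, frame_ticks=2, max_ticks=1000):
    """Simulate one cpu; return a log of (tick, event, job name) tuples."""
    info = {j["name"]: j for j in jobs}
    left = {j["name"]: j["frames"] for j in jobs}   # frames still to render
    last_start = {j["name"]: -1 for j in jobs}      # used for round robin turns
    running, started, log = None, 0, []

    for tick in range(max_ticks):
        # A frame that has run for frame_ticks is finished.
        if running is not None and tick - started == frame_ticks:
            left[running] -= 1
            log.append((tick, "done", running))
            running = None

        ready = [n for n in left if left[n] > 0 and info[n]["submit"] <= tick]
        if not ready:
            if all(n == 0 for n in left.values()):
                break
            continue

        if running is not None:
            # Only a strictly higher priority job with the kill flag interrupts.
            killers = [n for n in ready
                       if info[n]["prio"] > info[running]["prio"] and info[n]["kill"]]
            if not killers:
                continue
            log.append((tick, "killed", running))   # frame will be re-rendered
            running = None

        # Idle cpu: highest priority wins, equal priorities take turns.
        top = max(info[n]["prio"] for n in ready)
        contenders = [n for n in ready if info[n]["prio"] == top]
        running = min(contenders, key=lambda n: last_start[n])
        last_start[running], started = tick, tick
        log.append((tick, "start", running))
    return log

# Job A (100 priority) is already rendering when job B arrives at tick 1.
for label, b in [("passive", job("B", 200, 1, 1)),
                 ("killer", job("B", 200, 1, 1, kill=True)),
                 ("equal", job("B", 100, 2, 1, kill=True))]:
    print(label, simulate([job("A", 100, 2, 0), b]))
```

    In the 'passive' run B starts only after A's first frame is done; in the 'killer' run A's frame is killed at tick 1 and rendered again later; in the 'equal' run no frame is killed and the two jobs alternate.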




Priority Staircasing

    To use rush effectively, assign jobs with some cpus at high priority, and some at low. This is called 'staircasing' the priorities, because when you graph it out, it resembles a staircase:
    
              ____________________________________________________________________________
             |                                                                            |
             |     +any=5@800k                                                            |
     P   200-|      _________                                                             |
     r       |     | | | | | |                                                            |
     i       |     | | | | | |    +any=10@100                                             |
     o   100-|     | | | | | |_________________                                           |
     r       |     | | | | | | | | | | | | | | |                                          |
     i       |     | | | | | | | | | | | | | | |             +any=20@1                    |
     t     1-|     | | | | | | | | | | | | | | |_______________________________________   |
     y       |     |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|  |
             |                                                                            |
             |____________________________________________________________________________|
                   |         |                 |                                       |
                   0         5                 15                                      35
    
                                              C p u s
        
    Most companies find they only need two specifications per job: a high priority request for cpus (+any=5@800k) and a low priority request (+any=20@1).

    This way, if someone else submits with similar priorities, their high priority request will bump the other guy's low priority cpus.

    This ensures everyone gets at least a few cpus at high priority (the number asked for in their high priority request). The only problem is if the network is completely saturated with high priority frames, in which case you probably need more machines.

    There are basically two ways to use 'staircased' priorities: 'Passive' and 'Aggressive'.

    Staircased Priorities: Passive

    If everyone submits with:
    	    +any=2@800
    	    +any=50@1
        
    ..then they'll all get at least 2 cpus at high priority (@800), and the rest, up to 50 cpus, at low priority (@1).

    The idea here is each job asks for any 2 processors at high priority, and the rest at low.

    But if the network is saturated with jobs, then a new guy submitting won't be able to get a frame running until one of the running frames finishes. If the renders are long, that might be a while to wait.
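
    To see how the passive staircase divides up a network, here is an idealized Python sketch. It ignores frames already in progress and simply deals out cpus from the highest priority step down, one cpu at a time among equal requests. The allocate() function and its data layout are hypothetical, and real results depend on when each job was submitted.

```python
def allocate(total_cpus, jobs):
    """Idealized split of cpus among staircased jobs.

    jobs maps job name -> list of (cpus, priority) requests.
    Returns job name -> {priority: cpus granted at that priority}.
    """
    wanted = {}   # priority -> {job name: cpus wanted at that priority}
    for name, requests in jobs.items():
        for cpus, prio in requests:
            level = wanted.setdefault(prio, {})
            level[name] = level.get(name, 0) + cpus

    granted = {name: {} for name in jobs}
    free = total_cpus
    for prio in sorted(wanted, reverse=True):       # highest step first
        pending = dict(wanted[prio])
        # Deal out one cpu at a time so equal priority requests share evenly.
        while free > 0 and any(pending.values()):
            for name in pending:
                if free > 0 and pending[name] > 0:
                    pending[name] -= 1
                    granted[name][prio] = granted[name].get(prio, 0) + 1
                    free -= 1
    return granted

stair = [(2, 800), (50, 1)]        # +any=2@800 and +any=50@1
print(allocate(10, {"a": stair, "b": stair, "c": stair}))
print(allocate(4, {"a": stair, "b": stair, "c": stair}))
```

    With 10 cpus and three jobs, each job gets its 2 cpus at 800 and the remaining 4 are shared at priority 1. With only 4 cpus the high priority step alone saturates the network and nothing is left for the low priority requests.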

    Staircased Priorities: Aggressive

    Someone in more of a hurry won't want to wait around for low priority frames to get done. They want their job to take its high priority cpus right away.

    This is where the Kill flag ('k') becomes useful on the high priority submission: it bumps lower priority frames out of the way instead of waiting for them to finish, so the job can kick in its high priority frames without waiting.

    So instead, if everyone 'that's in a hurry' submits jobs with:

    	    +any=2@800k		-- note the 'k'
    	    +any=50@1
        
    ..then that ensures that if the network is mostly saturated with low priority renders, this new submit will bump at least two low priority renders to get a couple at high priority.

    Note that the above will only bump for two procs; the rest will wait in round robin.

    High priority renders won't bump other high priority renders (as long as the 'high' numbers are all the same, e.g. 800k).

    Feel free to ask questions, but first check out the docs above to get an understanding of how the priority stuff works.

    You may find you want to increase or decrease the number after the '=' sign as needed.

    The priority numbers themselves are relativistic; there's no magic about the numbers '100' or '1'. They could just as well be '499' and '500', as long as the entire shop agrees on which numbers are considered 'high priority' and which are considered 'low'.

    It is recommended to think of numbers above 99 as 'high', and 99 and lower as low. This is arbitrary from the software's point of view, since values are relativistic. But using these arbitrary values leaves some elbow room in both directions.

    Some companies reserve 999k priority for 'it must go through' jobs. For instance, submitting a job with '+any=3@999k' will ensure 3 frames start rendering as soon as the job is submitted. And of course, to 'take over' the entire network, this would be nasty: +any=100@999k, which would kill everything running (requeuing the frames, of course) and push the current job through the pipe.