From: Greg Ercolano <erco@(email suppressed)>
Subject: [Q+A] How can I run commands when a render has exceeded its 'maxtime'?
   Date: Mon, 29 Sep 2008 19:23:56 -0400

Msg# 1790

> We've been using rush's 'maxtime' setting to kill renders that
> take too long to run.
>
> But we have a situation where one of our render scripts interacts
> with a database at the start of a render, and again at the end.
>
> We'd like to be able to run special database commands if the render
> took too long.
>
> Is there a way to do that?

    You really wouldn't want to use rush's 'maxtime' to do this.

    To do what you want, I'd advise modifying the submit script so
    that if someone does specify the 'Max Time:' field in the submit form,
    the script handles it itself, instead of passing it to rush's
    'maxtime' command.

    To do this, pass the maxtime value as a command line argument
    to your render script, and have the render script parse it.
    The render script then runs the render process in the background
    and keeps an eye on the wallclock time in a while() loop. When
    the time goes above the maxtime value, the render script can
    stop the child and do the database tweak as a post-operation.

    Do this instead of using rush's own builtin 'maxtime' submit command,
    which is not trappable; when rush's maxtime is triggered, the render
    process is simply killed. This is because there is no clean 'cross
    platform' mechanism rush can use to reliably notify a process it is
    about to be killed. Windows and unix vary widely in their techniques,
    and some work better than others, depending on the job being run.

    This is fairly easy to do in your script, and this way your script
    can do all kinds of complex handling when the timer expires.
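
    As a rough idea of the argument parsing side, here is a minimal
    sketch. The '-maxtime' flag name, the ParseMaxTime() helper, and the
    two accepted formats (plain seconds, or HH:MM:SS) are all assumptions
    made up for illustration; use whatever your submit script actually
    passes to the render script.

```perl
use strict;
use warnings;

# HYPOTHETICAL HELPER: FIND "-maxtime VALUE" IN THE RENDER SCRIPT'S ARGUMENTS
#    Accepts plain seconds ("90") or HH:MM:SS ("00:01:30"); both are assumptions.
#    Returns: seconds, 0 if no -maxtime was given (no limit), undef if malformed
#
sub ParseMaxTime
{
    my @args = @_;
    for ( my $i = 0; $i < @args; $i++ ) {
        next unless ( $args[$i] eq "-maxtime" );
        my $val = $args[$i+1];
        return undef unless ( defined($val) );          # flag with no value
        if ( $val =~ /^\d+$/ ) { return($val + 0); }    # plain seconds
        if ( $val =~ /^(\d+):(\d\d):(\d\d)$/ ) {        # HH:MM:SS
            return($1 * 3600 + $2 * 60 + $3);
        }
        return undef;                                   # unrecognized format
    }
    return 0;                                           # no -maxtime specified
}

### MAIN ###
my $maxsecs = ParseMaxTime(@ARGV);
die "Bad -maxtime value\n" unless ( defined($maxsecs) );
print "--- MAXSECS=$maxsecs\n";
```

    A value of 0 here means no limit was requested, so the render script
    can skip the timer entirely in that case.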

    The following working Unix perl example should give you an idea
    of how to approach the render script side; this way you'll have
    full control over what to do when the child process takes too long,
    what kill signal to use, etc:

----------------------------------------------------------------------

#!/usr/bin/perl -w
use strict;
use POSIX ":sys_wait_h";        # unix only

###
### UNIX EXAMPLE: HOW TO LIMIT A CHILD PROCESS'S EXECUTION TIME
### erco 09/29/08
###

# SUBROUTINE: RUN CHILD, KILL IF TAKES LONGER THAN 'maxsecs'
#    $1 - command to run
#    $2 - max time in seconds
#
sub RunChild($$)
{
    my ($command, $maxsecs) = @_;
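    my $pid = fork();
    if ( ! defined($pid) ) { die "fork() failed: $!"; }
    if ( $pid == 0 ) {
        # CHILD -- become leader of a new process group, so the parent
        #          can kill the command and anything it spawns
        setpgrp(0, 0);
        exec($command) || die "Could not exec($command): $!";
    } else {
        # PARENT -- if child takes longer than maxsecs, kill it
        setpgrp($pid, $pid);    # same as child; avoids a race if we kill right away
        print "--- CHILD STARTED: PID=$pid, COMMAND='$command'\n";
        while ( 1 ) {
            # CHECK IF CHILD DONE
            my $ret = waitpid($pid, WNOHANG);
            if ( $ret == -1 ) { print "waitpid() failed: $!\n"; return(-1); }
            if ( $ret == $pid ) {
                # CHILD DONE
                printf("--- CHILD EXITED: STATUS=%d, EXITCODE=%d\n", $?, ($?>>8));
                return($?);
            }
            # MAX TIME EXPIRED? KILL CHILD, DONE
            if ( $maxsecs-- <= 0 ) {
                print "--- CHILD TOOK TOO LONG: KILLING PID $pid\n";
                kill(-9, $pid);         # negative signal: kills the child's process group
                waitpid($pid, 0);       # reap the child so it doesn't linger as a zombie
                return(-1);
            }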
            # KEEP WAITING
            sleep(1);
            print "--- Waiting for child.. $maxsecs\n";
        }
    }
    #NOTREACHED#
}

### MAIN ###
{
    # TWEAK THESE AS NEEDED
    my $command = "(sleep 30 && exit 33)";    # command to run and exit code
    my $maxsecs = 5;                          # max time to wait

    # RUN CHILD PROCESS
    my $ret = RunChild($command, $maxsecs);

    # DONE
    print "--- DONE: RET=$ret\n";
    exit(0);
}

----------------------------------------------------------------------

        The above example is written to assume the child takes too long
        so as to show what happens in that case.

        If you change the 'sleep 30' to 'sleep 3', then the child
        will execute normally, and the process's exit code of 33 will
        be printed.

        Note the use of ($? >> 8) to get the exit code, as opposed
        to the raw 'status' returned in $?. Read the unix docs on
        wait() and friends (wait4, waitpid, etc.) for more info on this.
        The low bits of the raw status tell you if the process core dumped
        or was signaled, instead of exiting on its own, should you need to
        make a distinction.
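
        If you do need that distinction, the raw status can be picked
        apart roughly like this. DecodeStatus() is a name made up for
        this sketch; the bit layout is the usual one described in perl's
        docs for $?. The -1 case covers both perl's own "could not run"
        value and the -1 that RunChild() above returns when it kills
        the child.

```perl
use strict;
use warnings;

# HYPOTHETICAL HELPER: DESCRIBE A RAW WAIT STATUS (AS FOUND IN $?)
#    $1 - raw status
#    Returns: a short description string
#
sub DecodeStatus
{
    my ($status) = @_;
    if ( $status == -1 ) { return("not run, or killed by parent"); }
    my $sig  = $status & 127;       # low 7 bits: signal that ended the process
    my $core = $status & 128;       # next bit: set if a core was dumped
    my $exit = $status >> 8;        # high byte: the process's own exit code
    if ( $sig ) {
        return(sprintf("killed by signal %d%s", $sig, $core ? " (core dumped)" : ""));
    }
    return("exited with code $exit");
}

### MAIN ###
system("sh", "-c", "exit 33");          # child exits on its own
print DecodeStatus($?), "\n";           # prints "exited with code 33"
system("sh", "-c", 'kill -9 $$');       # child kills itself with SIGKILL
print DecodeStatus($?), "\n";           # prints "killed by signal 9"
```

        This is where a render script could decide which database
        command to run: one for a clean exit, another for a crash,
        another for a timeout.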

WINDOWS CAVEAT
        If you're on Windows, I think you can do a timer technique
        similar to the above using perl's Win32::Process::Create();
        see .common.pl for an example, and ActiveState perl's docs
        on the Win32::Process module for more info.
