From: Greg Ercolano <erco@(email suppressed)>
Subject: Re: Rush didn't start jobs after system reboot
   Date: Thu, 09 Nov 2006 20:21:01 -0500
Msg# 1427
Patrick Boucher wrote:
[posted to rush.general]

I've never ever had this happen to me and I've been using Rush for a while now.

Jobs were submitted from a workstation (3DFX037), the checkpoint was done, and then the workstation rebooted.

The workstation tried to read its checkpoint but didn't restart a bunch of jobs. Here is an excerpt of the log on the system.

Any help would be appreciated.

    This is a bug that was fixed in a July 2005 release (102.42).

    From the release notes for 102.42, which describes the problem:

        o Fixed problem with loading checkpoint files.
	  On reboot, job was not loading if it contained a hostname that was no longer
	  in the rush hosts file. Example:
	
	      1) Job requests hosts a,b,c
	      2) Sysadmin removes host 'b' from hosts file
	      3) Daemon reboots
	      4) On reloading job requesting hosts "a,b,c", job fails to load
		 because 'b' is no longer a valid host.
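
    The failure mode in steps 1-4 can be sketched roughly as below. This is
    only an illustration: the function names and data shapes are hypothetical,
    not Rush's actual code or checkpoint format, and the release notes do not
    say how 102.42 treats the stale host, so skipping it here is an assumption.

```python
# Hypothetical sketch of the checkpoint-reload problem described above.
# None of these names come from Rush itself.

def load_job_strict(requested_hosts, valid_hosts):
    """Pre-fix behavior as described: one unknown host aborts the whole load."""
    for host in requested_hosts:
        if host not in valid_hosts:
            raise ValueError(f"unknown host {host!r}")
    return list(requested_hosts)


def load_job_tolerant(requested_hosts, valid_hosts):
    """One possible fix (an assumption): keep the job, skip unknown hosts."""
    kept = [h for h in requested_hosts if h in valid_hosts]
    skipped = [h for h in requested_hosts if h not in valid_hosts]
    return kept, skipped


valid_hosts = {"a", "c"}        # 'b' was removed from the hosts file
requested = ["a", "b", "c"]     # the job still asks for a,b,c

try:
    load_job_strict(requested, valid_hosts)
    strict_loaded = True
except ValueError:
    strict_loaded = False       # the job silently goes missing after reboot

kept, skipped = load_job_tolerant(requested, valid_hosts)
```

    With the strict loader the job is lost on reload; with the tolerant one
    it comes back on the remaining hosts and the stale name can be reported.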

    Looks like you're currently running 102.41, which is very old:

11/09,18:05:08 START     3dfx037 RUSHD 102.41 PID=..
                                       ^^^^^^
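
    To check several machines for the same problem, the version can be pulled
    out of the START line of each log. A minimal sketch, assuming every START
    line has the same layout as the single line quoted above (that is all the
    pattern is based on); rushd_version is a made-up helper name.

```python
import re

def rushd_version(log_line):
    """Return (major, minor) from a 'RUSHD <version>' log line, or None."""
    m = re.search(r"\bRUSHD\s+(\d+)\.(\d+)", log_line)
    return (int(m.group(1)), int(m.group(2))) if m else None


# The line quoted above, with the PID left elided as in the original post.
line = "11/09,18:05:08 START     3dfx037 RUSHD 102.41 PID=.."

version = rushd_version(line)
# Anything older than 102.42 predates the checkpoint fix.
needs_upgrade = version is not None and version < (102, 42)
```

    Any suffix after the minor number (such as the "a7" in 102.42a7) is
    ignored by this pattern, which is enough for the comparison against 102.42.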

    When production allows, upgrade to the current version, 102.42a7;
    the upgrade is free.

    Contact me directly via email, and I'll send you the upgrade
    instructions.

--
Greg Ercolano, erco@(email suppressed)
Rush Render Queue, http://seriss.com/rush/
Tel: (Tel# suppressed)
Fax: (Tel# suppressed)
Cel: (Tel# suppressed)
