From: Marco Recuay <marco@(email suppressed)>
Subject: Re: First In-First Out Tip
   Date: Mon, 20 Jul 2009 19:37:49 -0400
Msg# 1874
Thanks for the correction, Greg.
I guess I'll have to do some more testing on this. The clocks appear to be synchronized on all the machines, and they all have the same rush version and the updated rush.conf file.

Using the same job server is a workaround for now, but when we have a free moment I'll see how they react to the tests you outlined to pinpoint where the problem lies.

On 2009-07-20 08:40:37 -0700, Greg Ercolano <erco@(email suppressed)> said:

	BTW, Marco, if you're having strange behavior with different
	job servers, I'll be happy to help.

	Make sure you've got 'sched fifo' in ALL the rush.conf files,
	and are running the same version of rush that supports FIFO
	(102.42a9 or higher) on all the machines, since the older
	releases won't understand FIFO scheduling.
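As a rough way to confirm the first point on each machine, something like the sketch below could be used. The helper name `has_sched_fifo` is hypothetical, the match pattern assumes `#` starts a comment in rush.conf, and the path in the usage comment is an assumption; point it at wherever your rush.conf actually lives.

```shell
# Hypothetical helper (not part of rush): succeed if the given
# rush.conf-style file contains an uncommented 'sched fifo' line.
# Usage: has_sched_fifo /path/to/rush.conf
has_sched_fifo() {
    # Allow leading whitespace and any spacing between the two words;
    # a line starting with '#' does not match, so commented-out
    # settings are ignored.
    grep -Eq '^[[:space:]]*sched[[:space:]]+fifo([[:space:]]|$)' "$1"
}

# Example (the path is an assumption, adjust for your install):
# has_sched_fifo /usr/local/rush/etc/rush.conf && echo "FIFO enabled"
```

This only checks the config file on the machine where it is run; it says nothing about the installed rush version, which still has to be compared separately.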

	There are some easy tests you can do from the command line
	to submit a couple of jobs and watch them compete.

	For instance, from a Unix machine (Linux or OS X), you can
	submit two 100-frame jobs as a test:

(echo title AAA; echo frames 1-100; echo cpus +any=500@10; echo command rush -sleep 10) | rush -submit

(echo title BBB; echo frames 1-100; echo cpus +any=500@10; echo command rush -sleep 10) | rush -submit

	Each line will submit a job that does nothing but sleep 10 seconds
	per frame.

	Those should be two separate lines; make sure your newsreader
	is wide enough to show those as complete lines before copy/pasting.

	Run the lines one at a time, with at least 1 or 2 seconds between
	each submit, so the FIFO system can tell which job was first.

	When running, the AAA job should get all the cpus, and the BBB
	job should wait until the AAA job's frames start finishing.

	You should also be able to run similar command lines on other job
	servers, or tack a hostname on the end, after the 'rush -submit',
	to tell it to use some other machine as the job server, e.g. to use
	'tahoe' as the job server:

(echo title CCC; echo frames 1-100; echo cpus +any=500@10; echo command rush -sleep 10) | rush -submit tahoe
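
The test procedure above (submit one job, wait a second or two, submit the next, optionally naming a job server) could be wrapped in a small script along these lines. This is only a sketch: the function names `make_test_job` and `submit_in_order` are made up for illustration, and the job fields and `rush -submit [hostname]` usage are copied from the command lines above rather than from any other rush documentation.

```shell
# Print the job description that 'rush -submit' reads on stdin.
# The fields are the same ones used in the AAA/BBB/CCC test jobs.
# $1 = job title
make_test_job() {
    printf 'title %s\n' "$1"
    printf 'frames 1-100\n'
    printf 'cpus +any=500@10\n'
    printf 'command rush -sleep 10\n'
}

# Submit one test job per title, pausing between submits so the
# FIFO scheduler can tell which job came first.
# $1 = job server hostname, or '' to use the default job server
# remaining args = job titles, in the order they should be submitted
submit_in_order() {
    server=$1
    shift
    for title in "$@"; do
        if [ -n "$server" ]; then
            make_test_job "$title" | rush -submit "$server"
        else
            make_test_job "$title" | rush -submit
        fi
        sleep 2   # at least 1 or 2 seconds between submits
    done
}

# Examples (left commented out so nothing is submitted by accident):
# submit_in_order '' AAA BBB
# submit_in_order tahoe CCC DDD
```

Running `make_test_job AAA` on its own prints the four-line job description without submitting anything, which is a cheap way to check the text before piping it to rush.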
