From: Greg Ercolano <erco@(email surpressed)>
Subject: Re: rush slowness/timeouts
   Date: Mon, 02 Jan 2006 20:27:31 -0500
Msg# 1161
Hi Luke,

Need some more info: regarding the unresponsive machines 'loaner5'
and 'loaner16', do you think rush is slow because the rush daemon is busy,
or because the machine is thrashing due to rendering?

It's important to determine if the rushd is busy, or if the machine
is busy due to rendering.

When a machine is not responding to 'rush -ping', try ssh/rsh'ing
over to that machine and look at 'top' and/or the output of eg. 'vmstat 3'.
Is rushd using up all the cpu, or is a render? Is the machine swapping due
to unavailable ram? Does rsh/ssh not even respond when trying to connect
to the machine? If it's swapping or won't even accept a connection, the
renders may be using too much ram, swapping the machine to death.
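
If eyeballing 'vmstat 3' across many hosts gets tedious, the swap check can be
scripted. The following is only a rough sketch in Python: it assumes
Linux-style vmstat output whose header row has 'si' and 'so' (swap-in and
swap-out) columns, which other operating systems may not provide. The function
names are made up for illustration and are not part of rush.

```python
def swap_activity(vmstat_text):
    """Return a list of (si, so) pairs parsed from vmstat output.

    Assumes a header row containing 'si' and 'so' column names
    (Linux-style vmstat); returns [] if no such header is found.
    """
    samples = []
    si_idx = so_idx = None
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            # Column header row: remember where the swap columns are.
            si_idx, so_idx = fields.index("si"), fields.index("so")
            continue
        if si_idx is None or len(fields) <= max(si_idx, so_idx):
            continue  # banner line, blank line, or no header seen yet
        try:
            samples.append((int(fields[si_idx]), int(fields[so_idx])))
        except ValueError:
            continue  # not a numeric data row
    return samples


def is_swapping(vmstat_text):
    """True if any sample after the first shows swap-in or swap-out.

    The first vmstat sample reports averages since boot, so it is
    skipped whenever more than one sample is available.
    """
    samples = swap_activity(vmstat_text)
    recent = samples[1:] if len(samples) > 1 else samples
    return any(si > 0 or so > 0 for si, so in recent)
```

Feed it the captured text of something like 'vmstat 3 5' run on the slow
machine. Sustained non-zero si/so values suggest the renders are pushing the
machine into swap, rather than rushd itself being the bottleneck.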

Or, possibly rush is being kept busy; what is the output of
'rush -tasklist loaner16' when it does succeed? If the list is huge, possibly
users are submitting with too many +any specifications. For instance if there
are 250 jobs each asking for:

	+any=3@200 +any=5@150 +any=10@100 +any=20@50

..that will make four entries per job on each host, multiplying the work
rush has to do by 4 (4 specs per job * 250 jobs = 1000 active tasks)

..consider instead using just a two-tier submission:

	+any=3@200 +any=20@50
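
To put numbers on the difference, here is a small Python sketch of the
arithmetic above. It is a simplified model that counts one entry per +any
spec per job, as in the example, and does not claim to reflect how rush counts
things internally. The function name is made up for illustration.

```python
def tasklist_entries(num_jobs, cpu_specs):
    """Estimate entries per host: one per whitespace-separated spec, per job."""
    specs_per_job = len(cpu_specs.split())
    return num_jobs * specs_per_job


four_tier = "+any=3@200 +any=5@150 +any=10@100 +any=20@50"
two_tier = "+any=3@200 +any=20@50"

print(tasklist_entries(250, four_tier))  # 1000
print(tasklist_entries(250, two_tier))   # 500
```

Under that model, going from four tiers to two halves the number of entries
each host has to keep track of for the same 250 jobs.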

> Our rush license server is often very heavily loaded up (it also serves
> files) - could this be a factor?

Not likely, as the rushd daemons only communicate with the license
server on boot.

Unless, that is, your license server is also acting as a job server
(ie. jobs are being submitted to the license server, such that they have
jobids with the license server's hostname in them)



--
Greg Ercolano, erco@(email surpressed)
Rush Render Queue, http://seriss.com/rush/
Tel: (Tel# suppressed)
Cel: (Tel# suppressed)
Fax: (Tel# suppressed)
