RUSH RENDER QUEUE
(C) Copyright 1995,2000 Greg Ercolano. All rights reserved.
V 102.40g 08/08/03
Strikeout text indicates features not yet implemented
Known Maya Issues
Maya Licensing Problem Under Windows, The "exit 128" Bug
REPLICATION
-----------
Under Windows only, it appears that Maya cannot be run as different
users on the same machine at the same time, even though enough Maya
licenses are available to do so.
This problem was reported by several studios, including 525 Studios
and Turner Broadcasting. I told each studio to open a case with
Maya support; each case was 'resolved' as a known bug.
Situation: When a Windows NT or 2K machine is rendering with Maya from the
command line, and a different user tries to render with Maya on the same
machine, either remotely or via the 'runas' command, the second render command
exits without an error message but returns a 128 exit code.
To replicate, in one DOS terminal window, run:
    render <all args>
It starts rendering normally.
In a new DOS window, while the above command is still rendering,
log in as a different user:
    runas /user:DOMAIN\someuser cmd
and in the DOS window that 'runas' opens, type:
    render <all args>
The command returns immediately to the prompt with no error message:
    echo %ERRORLEVEL%
    128
One can't even get usage information on the render command with
"render -help". Instead, Maya immediately returns to the command line
with no output whatsoever.
It appears that while one user's render is running, a different user
login cannot run a render on that machine.
STATUS
------
All cases opened with Alias/Wavefront report this as a bug.
A/W confirmed it's a problem with the license system only on
Windows platforms.
WORKAROUNDS
-----------
As of 01/29/2002, Alias|Wavefront has not suggested a solution or
workaround in any of the cases I am aware of.
The earliest case I'm aware of was opened by 525 Studios in June 2001,
case #68182. Another was opened by Turner, case #81526.
However, it's pretty clear that any one of these is a possible
workaround:
o Don't try to render on machines that are being used
interactively. Many people take this approach, using
'onrush' to disable/enable the processor.
Before making their cpus available for rendering, users
first exit out of Maya.
Since rush runs all renders as the same user, there's
no problem with two different people's jobs running
on one dual-proc machine.
-- OR --
o Have the render script automatically detect the
exit 128 error, take that processor offline, and requeue
the frame.
-- OR --
o Have all Windows users log in as the same user
(e.g. a 'render' user), and also have rush run as this
user. This is ugly in general, unless your production
is already operating this way. You'd be surprised how
many productions I've seen where everyone is logged in
as Administrator, as a shortcut around administration
overhead.
-- OR --
o Create a wrapper script that causes Maya to run
as a certain user. This can be done, I am told,
with the 'SU' command that comes with one of the
Windows resource kits. It can also be done with
the WinNT 'runas' program, but you must answer its
prompt for a password. (With the SU program, I'm told
the password can be supplied on the command line, and
thus tied into the wrapper script.)
This lets people log in with their own names, and have
their own desktops, while Maya runs as a consistent
user. Again, rush needs to run as that same user.
-- OR --
o Use Linux/Unix, which doesn't appear to have this
problem. I know of a few companies actively running
Redhat workstations with Maya, using FireGL2 cards
for fast graphics, and Redhat 7.1 and 7.2 with
patched 2.4.17 kernels (latest stable release).
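The "detect the exit 128 error, offline, and requeue" workaround above
can be sketched roughly as follows. This is only an illustration in
Python, not code from the Rush render scripts (which are Perl). It
assumes the Rush render-script exit code convention is 0 = done,
1 = fail, 2 = retry. 'offline_host' is a hypothetical callback standing
in for whatever your site uses to take the processor offline, and
'run' is an illustrative hook so the logic can be exercised without
actually launching a render.

```python
import subprocess

# Exit code Maya's render command returns when it hits the licensing problem
# described in this document.
MAYA_LICENSE_EXIT = 128

# ASSUMPTION: Rush render scripts signal the frame's fate with these exit codes.
EXIT_DONE, EXIT_FAIL, EXIT_RETRY = 0, 1, 2


def classify_render_exit(code):
    """Map the render command's exit code to an action name."""
    if code == 0:
        return "done"
    if code == MAYA_LICENSE_EXIT:
        return "offline_and_requeue"
    return "fail"


def run_frame(render_cmd, offline_host, run=subprocess.call):
    """Run one frame and return the exit code the render script should use.

    render_cmd   -- argument list for the render command
    offline_host -- HYPOTHETICAL callback that takes this processor offline
    run          -- callable that executes render_cmd and returns its exit code
    """
    code = run(render_cmd)
    action = classify_render_exit(code)
    if action == "offline_and_requeue":
        offline_host()        # stop sending frames to this processor
        return EXIT_RETRY     # ask the queue to requeue the frame
    return EXIT_DONE if action == "done" else EXIT_FAIL
```

The render script would then exit with the value 'run_frame' returns,
so a frame that hit the license problem goes back into the queue
instead of being marked as failed.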
Maya Hanging under Linux with UNC paths
DESCRIPTION
-----------
In Maya 5.0 (and possibly 4.x), a combination of two conditions
can cause Maya under Linux to hang indefinitely during rendering,
using 99% of the cpu.
Both conditions must be met:
1) A UNC-style path (e.g. "//some/path") is supplied as either
the scene or project path.
2) A file referred to by the scene file does not exist.
When both occur, maya.bin starts using 99% of the cpu when it tries
to open the file that does not exist. Running strace(1) on maya.bin
reveals it enters an infinite loop, trying to open the nonexistent
file over and over again without printing an error. This was reported
by Dylan Penhale, and the following solution was verified to prevent
the problem.
SOLUTION
--------
Modify the submit-maya.pl render script to automatically replace each
double slash ("//") in the project and scene pathnames with a single
slash before supplying them to the 'maya -render' command line.
This fix can be applied in Rush 102.40f scripts (and earlier)
with the following 2 steps:
1) Edit submit-maya.pl, and find the "# MAYA RENDER COMMAND"
section.
2) In the "else" block, add these two lines above the $command:
    ...
    else
    {
        $project   =~ s%//%/%g;    # ADD THIS
        $scenepath =~ s%//%/%g;    # ADD THIS
        $command =
            "maya -render " .
            "-verbose 1 " .
    ...
This fix has already been applied to Rush versions 102.40g and up.
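For sites whose render scripts are not written in Perl, the same path
cleanup can be sketched in Python. This is an illustration only, and
'collapse_slashes' is a hypothetical helper name. It differs slightly
from the Perl substitution above: s%//%/%g replaces each "//" pair
once, so "///" would become "//", whereas this version collapses any
run of slashes down to one.

```python
import re


def collapse_slashes(path):
    """Collapse every run of two or more '/' in path into a single '/'."""
    return re.sub(r"/{2,}", "/", path)


# Paths taken from the example command line in this document.
project = collapse_slashes("//var/tmp/Maya")
scenepath = collapse_slashes("//var/tmp/Maya/scenes/ee064_010_v041_farm.ma")
```

After this, 'project' is "/var/tmp/Maya" and 'scenepath' starts with a
single slash, so neither is passed to 'maya -render' as a UNC-style
path.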
DEBUGGING PROCESS
-----------------
What follows is the relevant excerpt from emails with the customer
used to determine the cause of the problem:
(dp) However when I try rendering some more complex scenes (i.e lots of external textures)
(dp) (with "//" in the pathname) I get to the following stage..
(dp)
(dp) [root@Render07 scenes]# time maya -render -verbose 1 -proj //var/tmp/Maya -s 2 -e 2 \
-b 1 //var/tmp/Maya/scenes/ee064_010_v041_farm.ma
(dp) Maya (R), Version 5.0, 2003 04 01 00 02
(dp) Copyright 1997-2003 Alias|Wavefront, a division of Silicon Graphics Limited.
(dp) All rights reserved.
(dp)
(dp) .. and thats about it. I have left it in this state for over 12 hours
(dp) before giving up.
(ge) You mentioned AW asked if it happens when you use 'Render'
(ge) instead of 'maya -render'; does that make a difference in
(ge) this case?
(dp) Nope
(ge) Also, this might be a good time to whip out strace(1)
(ge) to see what the program is hanging on. With the render
(ge) hung, open another terminal to that machine and:
(ge)
(ge) 1) Use 'ps fax' to find the PID of the hung process.
(ge)
(ge) 2) Include the relevant part of that report here;
(ge) just the maya process hierarchy.
(dp) Appears to be running:
(dp) 1604 ? S 14:24 /usr/local/rush/bin/rushd
(dp) 2500 ? S 0:00 /usr/sbin/sshd
(dp) 29552 ? S 0:00 \_ /usr/sbin/sshd
(dp) 29553 pts/9 S 0:00 | \_ -bash
(dp) 481 ? S 0:00 \_ /usr/sbin/sshd
(dp) 482 pts/11 S 0:00 | \_ -bash
(dp) 527 pts/11 S 0:00 | \_ /bin/csh -f /usr/local/bin/maya -render -verbose 1 -proj //var/tmp/
(dp) 551 pts/11 R 4:26 | \_ /usr/aw/maya5.0/bin/maya.bin -render -verbose 1 -proj //var/tmp
(ge) My guess is the lowest process in that hierarchy
(ge) is the one you want to focus on.
(ge)
(ge) 3) Use 'strace -p <pid_of_hung_process>' to see
(ge) what it's hung on. Include (some of) that output
(ge) here.
(dp) Output from strace:
(dp)
(dp) access("//images/ee064_010_bg_v030.0698.iff", F_OK) = -1 ENOENT (No such file or directory)
(dp) Repeating....
(dp) access("//images/ee064_010_bg_v030.0698.iff", F_OK [unfinished ...]
(dp)
(dp) Looks like your right. I'll look at the scene file to see if I can see
(dp) anything.
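The "Repeating...." pattern in the strace output above can also be
spotted mechanically when the log is long. The following is a rough
Python sketch, not part of Rush. It assumes the strace output has been
saved as text lines in the format shown above, and 'repeated_failures'
and its 'threshold' parameter are hypothetical names chosen for
illustration.

```python
import re
from collections import Counter

# Matches a completed, failed system call whose first argument is a quoted
# string, e.g.:
#   access("//images/x.iff", F_OK) = -1 ENOENT (No such file or directory)
FAILED_CALL = re.compile(r'^(\w+)\("([^"]*)".*\)\s*=\s*-1\s+(E[A-Z0-9]+)')


def repeated_failures(strace_lines, threshold=3):
    """Return (syscall, path, errno, count) for failures seen >= threshold times."""
    counts = Counter()
    for line in strace_lines:
        match = FAILED_CALL.match(line.strip())
        if match:
            counts[match.groups()] += 1
    return [
        (call, path, err, n)
        for (call, path, err), n in counts.most_common()
        if n >= threshold
    ]
```

A single system call that fails on the same path many times in a row,
as access() does here, is a strong hint that the process is spinning on
a missing file rather than making progress.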