RUSH RENDER QUEUE - MAYA ISSUES
(C) Copyright 1995,2000 Greg Ercolano. All rights reserved.
V 102.41 03/05/04

  Known Maya Issues  





  Maya Licensing Problem Under Windows, The 'exit 128' Bug  
    REPLICATION
    -----------
    Under Windows only, it appears that maya can't be run by different
    users on the same machine at the same time, even though maya licenses
    are available to do so.

    This problem was reported by several studios, including 525 Studios
    and Turner Broadcasting. I told each studio to open a case with 
    Maya support; each case was 'resolved' as a known bug.

    Situation: When a Windows NT or 2K machine is rendering maya via command 
    line, and a different user tries to render maya on the same machine remotely
    or via the 'runas' command, the render command exits without an error message,
    but returns a 128 exit code.

    To replicate, in one DOS terminal window:

	render <all args> 

    ..it starts rendering normally.
  
    In a new DOS window, while the above command is still rendering:

	runas /user:DOMAIN\someuser cmd   (logging in as a new user) 

    and in the new DOS window type: 

	render <all args> 

    ..it immediately returns to the command line with no errors: 

	echo %ERRORLEVEL% 
	128 
  
    One can't even get usage information on the render command with 
    "render -help".  Instead maya immediately returns to the command line 
    with no output whatsoever.

    It appears that when a render is going, another user-login cannot run 
    a render. 

    STATUS
    ------
    All cases opened with Alias/Wavefront report this as a bug.
    A/W confirmed it's a problem with the license system only on 
    Windows platforms.

    WORKAROUNDS
    -----------
    As of 01/29/2002, Alias|Wavefront has not suggested solutions or
    workarounds in any of the cases that I am aware of.
    The earliest case I'm aware of was opened by 525 Studios in June 2001,
    case #68182. Another was opened by Turner, case #81526.

    However, it's pretty clear that any one of these is a possible
    workaround:

	    o Don't try to render on machines that are being used
	      interactively. Many people take this approach, using
	      'onrush' to disable/enable the processor.

	      Before cpus are made available for rendering, users
	      first exit out of maya.

	      Since rush runs all renders as the same user, there's
	      no problem with two different people's jobs running
	      on one dual proc machine.

    -- OR --

	    o Have the render script automatically detect the
	      exit 128 error, offline that processor, and requeue
	      the frame.

    -- OR --

	    o Have all Windows users log in as the same user
	      (e.g. a 'render' user), and also have rush run as this
	      user. This is ugly in general, unless your production
	      is already operating this way. You'd be surprised how 
	      many productions I've seen where everyone is logged in
	      as Administrator, as a shortcut around administration
	      overhead.

    -- OR --

	    o Create a wrapper script that causes maya to run
	      as a certain user. This can be done, I am told,
	      with the 'SU' command that comes with one of the
	      Windows resource kits. It can also be done with
	      the WinNT 'runas' program, but you must answer its
	      prompt for a password. (With the SU program, I'm told
	      the password can be supplied on the command line, and
	      thus tied into the wrapper script.)

	      This lets people log in with their own names, and have
	      their own desktops, while maya runs as a consistent
	      user. Again, rush needs to run as that same user.

    -- OR --

	    o Use Linux/Unix, which doesn't appear to have this
	      problem. There are a few companies I know of actively
	      running Redhat workstations with maya, using FireGL2
	      cards for fast graphics, and Redhat 7.1 and 7.2 with 
	      patched 2.4.17 kernels (latest stable release).
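
    As a rough illustration of the second workaround above (detecting
    the exit 128 error in the render script), here is a minimal sketch.
    Rush render scripts such as submit-maya.pl are Perl; Python is used
    here only for illustration. The names classify_render_exit and
    run_render, and the action strings, are hypothetical. The actual
    commands to offline a processor and requeue a frame depend on your
    rush setup and are deliberately not shown.

```python
import subprocess
import sys

MAYA_LICENSE_EXIT = 128   # exit code seen when the Windows licensing bug hits


def classify_render_exit(exit_code):
    """Map the render command's exit code to a (hypothetical) action name."""
    if exit_code == 0:
        return "done"
    if exit_code == MAYA_LICENSE_EXIT:
        # Not a real render failure: this cpu should be taken out of
        # service and the frame handed to another machine.
        return "offline_and_requeue"
    return "fail"


def run_render(argv):
    """Run the render command given as a list of args; return the action."""
    result = subprocess.run(argv)
    return classify_render_exit(result.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python this_script.py render <all args>
    print(run_render(sys.argv[1:]))
```

    The point of separating the 128 case is that the frame itself is
    not bad; only the machine is temporarily unusable for that user.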

  Maya Hanging under Linux with UNC paths  
    DESCRIPTION
    -----------
    It appears in Maya 5.0 (and possibly 4.x), a combination
    of two conditions can cause Maya under Linux to hang 
    indefinitely during rendering, using up 99% of the cpu. 

    The two conditions that have to be met are:

	1) A UNC style path is supplied as either the scene 
	   or project path (e.g. "//some/path").

	2) A file referred to by the scene file does not exist.

    When these two conditions occur, maya.bin starts using 99% 
    of the cpu when it tries to open the file that does not exist.

    Running strace(1) on maya.bin reveals it enters an infinite loop,
    trying to open the non-existent file over and over again, 
    without printing an error. This was reported by Dylan Penhale,
    and the following solution was verified to prevent the problem.

    SOLUTION
    --------
    Modify the submit-maya.pl render script to automatically collapse the
    doubled slashes ("//") in the project and scene pathnames into single
    slashes, before supplying them to the 'maya -render' command line.

    This fix can be applied in Rush 102.40f scripts (and earlier)
    with the following 2 steps:

	1) Edit submit-maya.pl, and find the "# MAYA RENDER COMMAND" 
	   section.

	2) In the "else" block, add these two lines above the $command:

    Patch to submit-maya.pl for UNC Hangs
        ...
        else 
        {
            $project =~ s%//%/%g;       # ADD THIS
            $scenepath =~ s%//%/%g;     # ADD THIS
            $command =
                "maya -render " .
                "-verbose 1 " .
        ...
        

    This fix has already been applied to Rush versions 102.40g and up.
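
    To see what the two added substitutions do to a path, here is a
    small Python sketch of the same replacement. It is only an
    illustration of the Perl s%//%/%g expression in the patch above;
    the function name collapse_double_slashes is hypothetical.

```python
import re


def collapse_double_slashes(path):
    """Replace each "//" with "/", scanning left to right without overlap,
    as the Perl substitution s%//%/%g does."""
    return re.sub(r"//", "/", path)


# The scene path from the debugging session loses its UNC style prefix:
print(collapse_double_slashes("//var/tmp/Maya/scenes/ee064_010_v041_farm.ma"))
# prints /var/tmp/Maya/scenes/ee064_010_v041_farm.ma
```

    Note the substitution is applied to the whole path, not just the
    front, so a doubled slash in the middle of a path is collapsed too.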

    DEBUGGING PROCESS
    -----------------
    What follows is the relevant excerpt from emails with the customer
    used to determine the cause of the problem:

    Debugging Session for Maya UNC Hanging Problem
    
    (dp) However when I try rendering some more complex scenes (i.e lots of external textures)
    (dp) (with "//" in the pathname) I get to the following stage..  
    (dp)
    (dp) [root@Render07 scenes]# time maya -render -verbose 1 -proj //var/tmp/Maya -s 2 -e 2 \
    			     -b 1 //var/tmp/Maya/scenes/ee064_010_v041_farm.ma
    (dp) Maya (R), Version 5.0, 2003 04 01 00 02
    (dp) Copyright 1997-2003 Alias|Wavefront, a division of Silicon Graphics Limited.
    (dp) All rights reserved.
    (dp)
    (dp) .. and thats about it. I have left it in this state for over 12 hours 
    (dp) before giving up.
    
    (ge) 	You mentioned AW asked if it happens when you use 'Render'
    (ge)	instead of 'maya -render'; does that make a difference in
    (ge) 	this case?
    
    (dp) Nope
    
    (ge)	Also, this might be a good time to whip out strace(1)
    (ge)	to see what the program is hanging on. With the render
    (ge)	hung, open another terminal to that machine and:
    (ge)
    (ge)		1) Use 'ps fax' to find the PID of the hung process.
    (ge)
    (ge)		2) Include the relevant part of that report here;
    (ge)		   just the maya process hierarchy.
    
    (dp) Appears to be running:
    (dp)  1604 ?      S 14:24 /usr/local/rush/bin/rushd
    (dp)  2500 ?      S  0:00 /usr/sbin/sshd
    (dp) 29552 ?      S  0:00  \_ /usr/sbin/sshd
    (dp) 29553 pts/9  S  0:00  |   \_ -bash
    (dp)   481 ?      S  0:00  \_ /usr/sbin/sshd
    (dp)   482 pts/11 S  0:00  |   \_ -bash
    (dp)   527 pts/11 S  0:00  |       \_ /bin/csh -f /usr/local/bin/maya -render -verbose..
    (dp)   551 pts/11 R  4:26  |           \_ /usr/aw/maya5.0/bin/maya.bin -render -verbos..
    
    (ge) 		   My guess is the lowest process in that hierarchy
    (ge) 		   is the one you want to focus on.
    (ge) 
    (ge) 		3) Use 'strace -p <pid_of_hung_process>' to see
    (ge) 		   what it's hung on. Include (some of) that output
    (ge) 		   here.
    
    (dp) Output from strace:
    (dp) 
    (dp) access("//images/ee064_010_bg_v030.0698.iff", F_OK) = -1 ENOENT (No such file or directory)
    (dp) Repeating....
    (dp) access("//images/ee064_010_bg_v030.0698.iff", F_OK [unfinished ...]
    (dp) 
    (dp) Looks like you're right. I'll look at the scene file to see if I can see
    (dp) anything. 
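
    In the session above the repeating access() call was easy to spot
    by eye. For longer traces (for instance, output saved to a file
    with strace's -o option), a small script can point out the call
    being retried. The following Python sketch is one way to do that;
    the helper name most_repeated_call is hypothetical.

```python
from collections import Counter


def most_repeated_call(strace_lines):
    """Return (line, count) for the most frequent non-empty line,
    or None if there are no non-empty lines."""
    counts = Counter(line.strip() for line in strace_lines if line.strip())
    if not counts:
        return None
    return counts.most_common(1)[0]
```

    A line that repeats far more often than any other, such as the
    failing access() above, is a good hint at what the process is
    spinning on.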
    
    

  Multiple Maya 5.0 Issues Under OSX  
Maya 5.0 under Mac OSX appears to have some very limiting problems when it comes to rendering on a network of OSX machines.

At least three customers have confirmed these problems with Maya 5.0, all of which can be replicated through rsh(1) and telnet(1), as well as with rush:

  1. /usr/sbin/maya always returns an exit code of 1. It should be passing through the exit code of the renderer. (This can be fixed if you hack the /usr/sbin/maya script.)

  2. /usr/sbin/maya will fail if the invoking user is not the same as the user logged in. This can be demonstrated with 'telnet' or 'rsh'; telnet to a machine, logging in as a user different than the user currently logged into the window manager, and try to run a command line render with 'maya -render ..' or 'Render -verbose..', and you'll get an error like:

        INIT_Processes(), could not establish the default connection 
        to the WindowServer. Abort 
    	 

    The problem appears to be that command line rendering still makes a connection to the window manager, even though a window is never opened. Command line rendering should not involve the window manager at all. This CAN'T be fixed by hacking the script; it's a problem with the Maya 5.0 binary.

  3. /usr/sbin/maya can't run multiple instances on the same machine as the same user. This can also be demonstrated with two telnet(1) or rsh(1) connections. If both instances of 'maya -render' are invoked at the same time, one of them fails. This CAN'T be entirely fixed by hacking the script, as the maya binary itself appears to open files that the two instances share and overwrite on each other.

  4. Since Maya appears to write files to the user's home directory, this creates problems if the user is a 'network user', whose home directory is on a central server. This user then can't render on more than one machine concurrently, because each instance of Maya on each machine will be writing over the same user files. So it appears the user must have a local '/Users/[name]' directory on each machine for network rendering with Maya 5.0 to work correctly, until this problem is fixed. This CAN'T be fixed by hacking the script; the problem appears to be in the maya binary.

  5. /usr/sbin/maya should not be unconditionally trying to open the GUI text editor 'TextEdit' to show error messages. During command line rendering, the error should be printed to stdout/stderr. The script can be hacked to do this; see the patch below.

When you encounter these problems and can verify them, please open a case with Maya support and keep the case ID, so you can be updated on fixes.

Preventing TextEdit From Opening During Command Line Rendering

    When Maya 5.0 is invoked to do command line rendering via:
        maya -render ..
    
    ..the 'maya' script (/usr/sbin/maya) will still attempt to open TextEdit to display error messages, rather than printing them to the terminal.

    Using TextEdit to display the errors is useful if Maya is invoked interactively from a desktop shortcut, where there is no terminal to see error messages.

    But there is no reason to open TextEdit if the "-render" option is supplied to maya; this makes remote execution of renders difficult.

    The following patch modifies the Maya 5.0 'maya' script (/usr/sbin/maya) to check to see if "-render" is specified, and if so, to print the errors to the terminal instead.
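
    The patch below makes this decision by looking only at the first
    command line argument (the csh test "$1" == "-render"). The
    following Python sketch restates that logic so its behavior is
    easy to check; the function name is_interactive is hypothetical,
    and the sketch is not part of the patch.

```python
def is_interactive(args):
    """Mirror the patch's test: a run counts as interactive unless
    the FIRST argument is exactly "-render"."""
    return not (len(args) > 0 and args[0] == "-render")
```

    One consequence worth knowing: if "-render" is given anywhere other
    than the first position, the run is still treated as interactive
    and TextEdit will still be opened.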

Preserving Exit Codes

    The 5.0 version of the /usr/sbin/maya script does not preserve exit codes from the actual maya application. This makes it impossible for scripts that invoke maya to check if the renders succeeded or failed.

    The following patch for /usr/sbin/maya also helps preserve the exit codes from the maya main application.

Patching /usr/sbin/maya

    The following patch to /usr/sbin/maya may help you solve the 'exit 1' and 'TextEdit' problems listed above.

    Definitely keep a copy of the original 'maya' script (e.g. maya.orig) in case you need it.

      Patch to OSX Maya 5.0 /usr/sbin/maya
      
      --- maya	2003-12-17 20:42:08.000000000 -0800
      +++ maya.fixed	2003-12-17 20:41:09.000000000 -0800
      @@ -11,6 +11,9 @@
       #*-***********************************************************************
       #*
       
      +# Interactive rendering? Not if -render specified
      +set interactive = 1 ; if ( "$1" == "-render" ) set interactive = 0
      +
       set LAUNCHCFMAPP=/System/Library/Frameworks/Carbon.framework/Versions/A/Support/LaunchCFMApp
       set PREF_LOCATION=~/Library/Preferences/AliasWavefront/maya/5.0
       set BATCHENV_PATH="~/Documents/temp"
      @@ -99,12 +102,22 @@
          endif
       
          if ($?RENDER_LOCATION) then
      -       eval "$LAUNCHCFMAPP $RENDER_LOCATION" 
      +       # Remove old mayaRenderLog.txt first
      +       if ( -e $PREF_LOCATION/mayaRenderLog.txt ) rm -f $PREF_LOCATION/mayaRenderLog.txt
      +       eval "$LAUNCHCFMAPP $RENDER_LOCATION"
      +       set err = $status               # Preserve exit codes from CFMapp
              if ($?PREF_LOCATION) then
                  cat $PREF_LOCATION/mayaRenderStdout.txt >> $PREF_LOCATION/mayaRenderLog.txt
                  rm $PREF_LOCATION/mayaRenderStdout.txt
      -           open -a TextEdit $PREF_LOCATION/mayaRenderLog.txt &
      +           if ( $interactive ) then
      +               # Interactive renders: ok to open TextEdit
      +                open -a TextEdit $PREF_LOCATION/mayaRenderLog.txt &
      +           else
      +               # Non-interactive renders: errors to stdout
      +               cat $PREF_LOCATION/mayaRenderLog.txt
      +           endif
              endif
      +       exit $err                       # Return CFMapp exit code to caller
           else
             echo "Invalid Path for Maya : $MAYA_LOCATION"
           endif
      
      

    If you have problems applying this patch, please post a message on the Rush newsgroup, including details and error messages as appropriate.