From: Greg Ercolano <erco@(email suppressed)>
Subject: Re: aerender CS5 write permissions
   Date: Wed, 09 Mar 2011 06:51:24 -0500
Msg# 2047
Victor DiMichina wrote:
Gary, I knew this sounded familiar. I have what sounds like the exact
setup as you: OS X servers, AFP shares, AE, etc.
    Thanks for the post, Victor!

    I was actually waiting to see Gary's frame log to see if the actual error
    messages had those same 'exit 9' errors.

    Do you know if you were getting these same random errors with file servers
    running Tiger? I'm curious if the problem correlates to newer OSX servers
    (10.6), or if it happened with Tiger/10.4 as well.
I got those permissions problems you described, and with the help of a
certain perl expert I know (cough cough...greg...cough), I put the
following into my submit.afterfx.pl to parse for that error. It's since
become a distant memory.
    It's a reasonable workaround, but be careful:

    In cases where there might actually be a /real/ permission error,
    the logic shown would just keep retrying. But based on Victor's request,
    we coded it that way.

    The 'exit 9' might make it unique, but I don't think we tested
    whether an actual perm error throws the same exit code.

    You might want to have it fail after some number of retries
    so it doesn't retry forever. And perhaps a sleep(5) in there too
    so that it doesn't spin.

    If you wanted that behavior, in place of these three lines:


        print STDERR "--- AE EXIT 9: FALSE ERROR DETECTED: $logmsg\n";
        system("rush -fu -notes $ENV{RUSH_FRAME}:\"RETRY: AE EXIT 9 FALSE ERROR\"");
        exit(2);         # RETRY


    ..you could add this extra logic:

        if ( $ENV{RUSH_TRY} < 10 )         # limit to x10 retries
        {
            print STDERR "--- AE EXIT 9: FALSE ERROR DETECTED: $logmsg\n";
            system("rush -fu -notes $ENV{RUSH_FRAME}:\"RETRY: AE EXIT 9 FALSE ERROR\"");
            sleep(5);                      # prevent spin
            exit(2);                       # retry (up to 10 times)
        }
        print STDERR "--- AE EXIT 9: FAILING AFTER 10 RETRIES\n";
        system("rush -fu -notes $ENV{RUSH_FRAME}:\"FAIL AFTER 10 TRIES: AE EXIT 9 FALSE ERROR\"");
        exit(1);                           # fail
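
    For context, the detection that produces $logmsg upstream of that snippet
    isn't shown in this thread. Here's a minimal sketch of what it might look
    like; the sub name (ParseLogForFalseExit9) and the exact string matched
    are my assumptions, not code from an actual submit.afterfx.pl:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sketch: scan the AE render log text for the bogus
# 'write permissions' error so the frame can be requeued.
# Returns the matching log line (usable as $logmsg), or undef if clean.
sub ParseLogForFalseExit9 {
    my ($logtext) = @_;
    foreach my $line ( split( /\n/, $logtext ) ) {
        if ( $line =~ /Can not create a file in directory.*Try checking write permissions/ ) {
            return $line;    # becomes $logmsg for the retry logic
        }
    }
    return undef;            # no false error found in the log
}
```

    In the wrapper, if aerender exits 9 /and/ this returns a line, you'd
    treat it as the (probably) false error and apply the retry logic;
    anything else would fall through to a normal failure.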


    In the cases where I've actually looked into these errors with sysadmins,
    the file server /was/ throwing real OS errors in its own system logs
    (eg. /var/log/system.log).

    In one case I helped troubleshoot where we were testing AE for this intermittent
    perm error:

----
aerender Error: After Effects error: Error in output for render queue item 2, output module 1.
                Can not create a file in directory /snowserver/foo/bar. Try checking write permissions.
----

    ..we found the following error messages in the system.log of the Snow Leopard
    file server, while we had an otherwise idle network to ourselves for the tests:

----
 Feb 23 15:32:20 snowserver gssd[35670]: Error returned by svc_mach_gss_init_sec_context:
 Feb 23 15:32:20 snowserver gssd[35670]:      Major error = 851968: Unspecified GSS failure.  Minor code may provide more information
 Feb 23 15:32:20 snowserver gssd[35670]:      Minor error = 100006:
 Feb 23 15:40:14 snowserver sshd[35675]: USER_PROCESS: 35680 ttys000
 Feb 23 15:42:23 snowserver gssd[35699]: Error returned by svc_mach_gss_init_sec_context:
 Feb 23 15:42:23 snowserver gssd[35699]:      Major error = 851968: Unspecified GSS failure.  Minor code may provide more information
 Feb 23 15:42:23 snowserver gssd[35699]:      Minor error = 100006:

----

    The timestamps correlated to the random AE perm errors.
    Googling these gssd errors suggested they were kerberos related (even though
    in our case kerberos was not enabled; these were static NFS mounts with local
    user accounts).
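
    If you want to check that correlation yourself, you can pull the timestamps
    of the gssd errors out of the server's system.log and eyeball them against
    the times AE threw its perm errors. A minimal sketch, assuming the log
    format shown above (the sub name and regex are mine):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sketch: return the timestamps of gssd
# 'svc_mach_gss_init_sec_context' errors found in a system.log,
# for comparison against the AE error times.
sub GssdErrorTimes {
    my ($logfile) = @_;
    my @times;
    open( my $fh, "<", $logfile ) or die "can't open $logfile: $!";
    while ( my $line = <$fh> ) {
        # eg. "Feb 23 15:32:20 snowserver gssd[35670]: Error returned by svc_mach_gss_init_sec_context:"
        if ( $line =~ /^\s*(\w+\s+\d+\s+[\d:]+)\s+\S+\s+gssd\[\d+\]: Error returned/ ) {
            push( @times, $1 );    # keep "Mon DD HH:MM:SS"
        }
    }
    close($fh);
    return @times;
}
```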

    Google showed many other folks were encountering this random perm behavior
    in other contexts outside render farms.

    These errors were truly random; requeueing the same job over and over, clearing
    the output directory before each test, random frames on random machines would
    have this problem. When a machine 'caught' the problem, it would have it several
    times in a row, then it would go back to working again.

    It was interesting that on the same farm, maya renders never had a problem;
    only AE renders had the issue. So it seemed only AE triggered it, but the
    problem was traceable to actual file system errors (gssd in this case).

    In another case, it was with samba; complaints about oplocks failing caused
    not perm errors, but 'drop outs' in connections and truncated render logs.
    (This was with maya + a windows farm + a Snow Leopard server running samba.)

    An interesting test for folks having this intermittent behavior from AE
    with an OSX file server would be to put the test data on a /non-OSX server/
    (eg. linux) and see if you can replicate the problem. If you can get random
    perm errors with that too, then that would nail AE as being nutty. But if
    it goes away... might be the server! (Apple)

I could sit with you and discuss *many* things about aerender that
bother me,  it seems to get worse with each version.
    Me too, don't get me wrong ;)

    My biggest peeves: inconsistently printing filenames during rendering,
    not printing actual OS error messages, disconnecting processes from the
    process hierarchy (re-parenting AfterFx to launchd! CS4 and CS5),
    interacting with the window manager for command line rendering (!),
    and ignoring frame ranges when comp names aren't specified.
-- 
Greg Ercolano, erco@(email suppressed)
Seriss Corporation
Rush Render Queue, http://seriss.com/rush/
Tel: (Tel# suppressed)ext.23
Fax: (Tel# suppressed)
Cel: (Tel# suppressed)
