From: Mat X <info@matx.ca>
Subject: Nuke GenArts Sapphire renders failing
Date: Mon, 07 Feb 2011 22:53:30 -0800
Msg# 2010
I wanted to give a heads up to anyone else having issues with GenArts Sapphire (I know of at least one other facility) and failing frames in Nuke.

The solution, for those not wanting to read through a long rambling post, is to set the nuke disk cache to the rush temp dir, and everyone lives happily ever after.

For the long story version read on....

I ran into this really weird issue when we upgraded our farm to Mac OS X 10.6.4 and our GenArts Sapphire Nuke renders started failing.

We had upgraded to Sapphire v5 previously so I didn't think that was the issue, but the errors were a mixture of "plugin not installed", "unknown plugin", "unknown command" and "corrupt nuke script".

Of course I contacted The Foundry and GenArts support, but they were stumped and could not really reproduce the errors (The Foundry did release new Nuke versions that supposedly fixed some Sapphire issues, but they didn't fix my failing frames).

What I tried:

I reinstalled Sapphire.

	- seemed to work, but would start failing again soon enough

I copied the Sapphire plugin bundle into the Nuke built-in plugins folder

	- seemed to work, but would start failing again soon enough

I set the SAPPHIRE_OFX_DIR and the RLM_LICENSE variables in the submit script and moved the Sapphire bundle properly to our central plugin fileserver

	- seemed to work, but would start failing again soon enough
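For reference, that third attempt looked roughly like the fragment below in the submit script. The fileserver path and license port/host are made up; substitute your own.

```shell
#!/bin/sh
# Hypothetical submit-script fragment (example paths, not our real ones):
# point renders at the central Sapphire bundle and the RLM license server.
export SAPPHIRE_OFX_DIR="/Volumes/plugins/GenArts/Sapphire"   # central plugin fileserver (example)
export RLM_LICENSE="5053@licserver"                           # RLM port@host (example)
```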


In conclusion: every one of these "seemed to work", but would start failing again soon enough.

Then I noticed an artist clearing his local disk cache because his local renders were failing. So, on a hunch, I wrote a simple script to clear the local disk cache on the render nodes and set it up as a Rush submit-generic so the artists could stop their renders from failing frames. And it worked: when a render started failing frames, they would run the submit-generic script and the renders would work again.
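The clearing script was along these lines (a sketch; the default cache path is an example -- point it at wherever your Nuke disk cache actually lives):

```shell
#!/bin/sh
# Sketch of the cache-clearing workaround run via rush submit-generic.
# The default path is hypothetical -- use the disk cache location set
# in your Nuke preferences.

clear_nuke_cache() {
    cache_dir="${1:-$HOME/.nuke/cache}"    # example default location
    if [ -d "$cache_dir" ]; then
        rm -rf "$cache_dir"/*              # wipe cached blocks, keep the dir itself
        echo "cleared $cache_dir"
    else
        echo "no cache at $cache_dir"
    fi
}
```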

The problem was that it was a manual procedure, and the artists did not find it simple enough. Fair enough; it was a workaround. But I couldn't automate it, because if one person cleared the cache on a node while other renders were running, those renders would fail their frames too. I did not want to set it up as a pre- or post-render action for that reason.

The other solution was to go back to rendering as unique users, instead of forcing renders to run as one user (set in rush.conf). But I had Linux, Windows and Mac renders all working as the same user, and I didn't want to change that now.

The best solution to all this was Greg Ercolano's idea to tie the Nuke temp directory to the Rush temp directory. Since each launch of the submit process brings up a new Rush process with its own temp dir, that is a handy place to stash the Nuke disk cache too. And Rush cleans up its temp dir after the job is done, which removes the need to run cleanup scripts afterwards.
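A minimal sketch of that idea, assuming your Rush setup exposes its per-job temp dir in an environment variable (RUSH_TMPDIR here is an assumed name -- check your Rush docs) and using Nuke's NUKE_TEMP_DIR and NUKE_DISK_CACHE variables:

```shell
#!/bin/sh
# Hypothetical render-script fragment launched by Rush per job.
# RUSH_TMPDIR, FRAME and NUKE_SCRIPT are assumed to be set by the
# submit setup; adjust to however your scripts pass them in.
export NUKE_TEMP_DIR="$RUSH_TMPDIR/nuke"     # Nuke temp files live in Rush's temp dir
export NUKE_DISK_CACHE="$RUSH_TMPDIR/nuke"   # so the disk cache dies with the job
mkdir -p "$NUKE_TEMP_DIR"
nuke -x -F "$FRAME" "$NUKE_SCRIPT"
```

Each render then starts with a clean cache, and Rush's own cleanup throws it away when the job ends.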

