> We're writing new custom python submit/render scripts.
> We want to send a bunch of dynamically generated data (in the form
> of a dict) from the user's submit to the renders.
>
> We could use the rush 'log directory' for this, but we're using the
> '%s' feature in the "logdir" command so that the log directory has
> the jobid included in the pathname.
> Since the jobid isn't known until the job is submitted, we're not sure
> how to pass the dict through the logdir and avoid the "catch-22"
> situation:
>
>     can't save the data until we have the jobid,
>     can't have the jobid until we submit the job,
>     can't submit the job until we save the data
>
> Do you have any pointers on how to do this?

Sure, the best way is to save the 'dict' as a pickle file to the job's
log directory, and have the render expect to find the file there.
(Another way is to submit the job into a paused state, write the file,
then unpause the job, but this can be avoided.)
To deal with the catch-22, submit the job first, and include a wait loop
in the render script to *wait* for the pickle file to appear if it hasn't
been saved yet. This gives the submit side time to save the file out.
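As a rough sketch of such a wait loop (written for Python 3), the helper below polls for the file and gives up after a timeout, so a render doesn't hang forever if the submit side died before saving the file. The name wait_for_file(), the 5 second poll interval and the 10 minute timeout are assumptions for illustration only; none of them come from rush.

```python
import os
import time

def wait_for_file(path, timeout=600.0, poll=5.0):
    """Poll until 'path' exists; return True if it appeared, False on timeout.

    Hypothetical helper: the timeout and poll interval are illustrative defaults.
    """
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False              # gave up: the file never appeared
        time.sleep(poll)
    return True
```

A render script could call wait_for_file(envfile) before loading the pickle, and exit with an error if it returns False.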
To prevent a race on the file while it's being written, write the
pickle file to a *temp filename* first, then when the file has been
written and closed, rename the file into its final destination.
The rename will be atomic, such that when the renders finally 'see'
the file appear, it won't have partial contents.
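A minimal Python 3 sketch of the write-then-rename step, assuming the temp file is created in the same directory as the final file (a rename is only atomic within a single filesystem). The names save_atomic() and load() are hypothetical; os.replace() is used instead of os.rename() because it also overwrites an existing destination on Windows.

```python
import os
import pickle

def save_atomic(path, data):
    """Pickle 'data' to a temp file next to 'path', then rename it into place."""
    tmp = "%s.tmp.%d" % (path, os.getpid())   # same directory, so same filesystem
    try:
        with open(tmp, "wb") as f:            # pickle needs a binary-mode file
            pickle.dump(data, f)
        os.replace(tmp, path)     # atomic on POSIX; overwrites an existing 'path'
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)        # don't leave a stray temp file behind
        raise

def load(path):
    """Load the object saved by save_atomic()."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

The file is closed (by the 'with' block) before the rename happens, so a reader that sees the final name can load it right away.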
Here's some example code that shows how to do this: a single python
script that, if run without arguments, will submit the job, and uses
itself to render the job as well. (There are no render commands in this
example; just a print statement.)
Where the code runs:
    ParseJobid(), SaveDict() and SubmitJob() run on submit,
    GetDict() and the "-render" branch of MAIN run during rendering.
    SaveDict() holds the tmpfile/rename logic that prevents races on the pickle file.
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
#!/usr/bin/python
import os, sys, re

def ParseJobid(rushoutfile):
    '''PARSE JOBID FROM A SUBMIT
    Returns jobid, or "" if none (job probably failed, see stderr for reason)
    '''
    f = open(rushoutfile, 'r')
    lines = f.readlines()
    f.close()
    try: jobid = re.search(r"RUSH_JOBID.(\S+)", "\n".join(lines)).groups()[0]
    except AttributeError: jobid = ""
    return jobid

def SaveDict(envfile, mydict):
    '''SAVE DICT TO A FILE'''
    # Save to tempfile first, then do an atomic rename so render script
    # never sees partial contents..
    #
    import pickle
    envtmp = envfile + ".tmp"
    f = open(envtmp, "wb")                # binary mode: pickle needs it on python 3
    pickle.dump(mydict, f)
    f.close()
    os.rename(envtmp, envfile)            # atomic rename

def GetDict(envfile):
    '''LOAD DICT FROM FILE'''
    import time, pickle
    while not os.path.exists(envfile):
        # Wait for file to appear via an atomic rename..
        print("--- Waiting for envfile to appear..")
        time.sleep(5)
    # Load pickle file, return dict
    f = open(envfile, 'rb')               # binary mode, to match SaveDict()
    d = pickle.load(f)
    f.close()
    return d

def SubmitJob(logdir, mydict=None):
    '''SUBMIT JOB TO RUSH, PASSING DICT
    logdir -- path to log directory (jobid will be appended)
    mydict -- an optional, arbitrary dict to pass to the renders
    '''
    # Submit the job
    import tempfile
    (fd, tmpout) = tempfile.mkstemp(suffix=".tmp-submit")
    os.close(fd)                                          # shell redirect below writes the file
    submit = os.popen("rush -submit > " + tmpout, 'w')    # stdout -> file, stderr -> tty
    submit.write("title TEST\n" +
                 "frames 1-10\n" +
                 "logdir " + logdir + "/%s\n" +
                 "command python %s -render\n" % sys.argv[0] +
                 "cpus +any=5@1\n")
    submit.close()
    jobid = ParseJobid(tmpout)            # get jobid
    os.remove(tmpout)                     # remove tmpfile
    if jobid == "": return jobid          # submit failed? (no jobid)
    # Save mydict to envfile in the job's logdir for render script to read.
    # Always save (even an empty dict), since the render waits for this file.
    envfile = logdir + "/" + jobid + "/envfile"
    SaveDict(envfile, mydict or {})
    return jobid

### MAIN
if __name__ == "__main__":
    if len(sys.argv) == 1:
        # SUBMITTED BY THE USER
        logdir = "/net/tmp/logs"
        mydict = { "lots": "of stuff",        # <-- THIS WOULD BE YOUR DICT
                   "even": "more stuff" }     # <-- TO SEND TO THE RENDERS
        print("--- Submitting job..")
        jobid = SubmitJob(logdir, mydict)
        if jobid == "": sys.exit(1)           # submit failed? error will be on stderr
        print("--- Jobid: " + jobid)
        sys.exit(0)                           # success
    elif sys.argv[1] == "-render":
        # RENDERING ON THE RENDER NODES
        #    Load the dict the submitter saved for us
        envfile = os.path.dirname(os.environ["RUSH_LOGFILE"]) + "/envfile"
        print("--- Loading submitted environment from envfile: " + envfile)
        d = GetDict(envfile)
        print("--- Got dict: " + str(d))      # show what we got
        print("--- Rendering..")
        # Do rendering here..
        sys.exit(0)