From: Greg Ercolano <erco@(email suppressed)>
Subject: Re: Question about Max time behavior
   Date: Thu, 22 May 2008 23:51:17 -0400

Msg# 1741

> One of our artists ran a test job through the farm, and we noted that  
> the following message seemed to be generated for the first failed  
> frame only, subsequent frames contained no such information in the logs:
> 
>> --- KILLED: MAXTIME EXCEEDED 00:01:00

	Hmm, the 'KILLED: MAXTIME..' message should always get
	appended to the log when a frame is killed.. but it might
	somehow be getting overwritten when buffers are flushed during
	the kill operation.. not sure what to do about that.

	What platform are you running on?

> We were hoping to find out if there was a simple way for our render  
> script to determine if maxtime has been exceeded for a frame?    
> Nothing obvious seems to stand out in the docs.

	Hmm, not with maxtime.. I don't see how it would be possible
	to kill the frame without also killing the script.

	What you probably want instead is a way to fire off the
	render from within the script and time its execution;
	if it takes too long, the submit script can kill it,
	print a nice message, and do other stuff if need be.

	I tried hard to modify 'logtrim' to take a time limit argument,
	so that logtrim could limit not only the renderer's output
	but also how long it is allowed to run.

	Trouble is Windows really sucks at this. The only way
	to implement it properly is to use 'Job Objects' (the
	windows equivalent of process groups), and Microsoft
	did not implement job objects to be nestable. (!)

	In other words, a process that is part of a job object
	(like the render script) cannot create another job object
	for a child, and manage it. I asked Microsoft about this,
	and the engineers confirmed job objects cannot be nested.
	I told them this sucked (in so many words) and was poorly
	implemented. Problem with windows is there really isn't
	a parent/child relationship between processes.. job objects
	are a hack around a poorly implemented kernel design (IMHO).

	In unix there's no such problem: a process that is already in
	a process group can put its children in a new group of their
	own, and it works fine. You can just fork() the render and have
	the parent wait for either the child to finish or a time limit
	to expire, at which point it kills the child and logs the error.

	I can probably post some code that shows how to do this.
	But as mentioned, it would only work on unix.
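
	As a rough idea of the unix approach described above, here is a
	minimal sketch in Python. It is an illustration under assumptions,
	not code from Rush: the function name run_with_time_limit() and the
	wording of the log message are made up for this example, and it uses
	Python's subprocess module to start the child instead of calling
	fork() by hand. The child is started in its own session (and so its
	own process group), so that on timeout the whole group can be killed,
	including anything the renderer itself spawned.

```python
import os
import signal
import subprocess
import sys

def run_with_time_limit(cmd, max_secs):
    """Run cmd (a list of args); kill it if it runs longer than max_secs.

    Returns (exit_code, timed_out). Unix only. Hypothetical helper,
    not part of Rush.
    """
    # start_new_session=True makes the child call setsid(), so the child
    # leads a new process group whose id equals the child's pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=max_secs), False
    except subprocess.TimeoutExpired:
        try:
            # Kill the whole group, not just the immediate child.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group went away on its own in the meantime
        proc.wait()  # reap the child so it doesn't linger as a zombie
        # Made-up message format for this sketch.
        print("--- render killed: exceeded %d secs" % max_secs,
              file=sys.stderr, flush=True)
        return proc.returncode, True
```

	A render script could call it with something like
	run_with_time_limit(["render", "scene.rib"], 3600), where the
	command and the limit are placeholders, then check the second
	return value to decide whether to print its own message or do
	other cleanup.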

-- 
Greg Ercolano, erco@(email suppressed)
Seriss Corporation
Rush Render Queue, http://seriss.com/rush/
Tel: (Tel# suppressed)
Fax: (Tel# suppressed)
Cel: (Tel# suppressed)
