From: Greg Ercolano <erco@(email suppressed)>
Subject: Re: cpu balancing script
   Date: Wed, 06 Jun 2007 20:07:30 -0400
Msg# 1587
Antoine Durr wrote:
> Hmm, not sure that's feasible for 1000+ node networks, when the 
> packages are multiple thousands per node.

	With deals that large, e.g. $100k ~ $500k and up,
	special deals are arranged with the big vendors.

> That's why having the queue monitor what your process is doing is 
> really the only solution.  Reports by the software are what *it* 
> thinks, not what the OS thought.

	Yeah, I wonder what's up with getrusage() being so poorly
	supported across all unix platforms.

	You'd think keeping count of RAM use wouldn't be such
	a big deal.. we know the kernel does it for ps(1) reports and
	/proc/<pid>/stat (in DD's 'race', I used the latter)
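
	For illustration, here's that /proc approach as a small Python
	sketch (Linux-only; field layout per proc(5) -- a rough stand-in
	for what ps(1) does, not anything from Rush itself):

```python
import os

def rss_kb(pid):
    """Resident set size in kB, read from /proc/<pid>/statm (Linux-only)."""
    with open("/proc/%d/statm" % pid) as f:
        resident_pages = int(f.read().split()[1])  # 2nd field: resident pages
    return resident_pages * os.sysconf("SC_PAGE_SIZE") // 1024

# Our own RSS -- roughly the RSS column ps(1) would show for this PID.
print(rss_kb(os.getpid()))
```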

> I'll definitely migrate to that.  My "exitnotes" also show cpu 
> efficiency (I run my commands via /usr/bin/time), so that compositors 
> can get a general sense of whether their jobs are cpu or i/o bound, 
> i.e. low efficiency most likely indicates mostly waiting for disk.  
> Renderer low efficiency could be many texture-map accesses or undue 
> swapping.

	Right.. it's been on my TODO list for a really long time..
	Laika finally got me off my ass on that one a few weeks ago
	(thanks John!) Turned out to be easy to add.

> I can get that.  What might be useful is having hooks for a bunch of 
> these things, and users can install them if they feel they're needed.  
> A user (me!) shouldn't really have to go and write a memory checking 
> script on their own.

	The 'hook' is that the render scripts are highly hackable.
	So you can, for instance, in perl do a fork() to start the
	renderer, and meanwhile monitor the fork()'ed PID's tree
	via /proc, and come up with the totals.
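
	A rough sketch of that fork-and-poll idea, in Python rather than
	perl for brevity (it watches only the direct child, not the whole
	process tree, and it can miss surges between samples -- the very
	caveats at issue here; Linux /proc assumed):

```python
import os, time

def run_and_watch(argv, interval=0.5):
    """fork()/exec() a command, polling its RSS via /proc until it exits."""
    pid = os.fork()
    if pid == 0:                              # child: become the renderer
        os.execvp(argv[0], argv)
    peak_pages = 0
    while True:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done:                              # child reaped: report peak
            return peak_pages, status
        try:
            with open("/proc/%d/statm" % pid) as f:
                pages = int(f.read().split()[1])   # resident pages right now
            peak_pages = max(peak_pages, pages)
        except FileNotFoundError:
            pass                              # raced with the child's exit
        time.sleep(interval)

# Stand-in "renderer": just sleep for a second.
peak, status = run_and_watch(["sleep", "1"], interval=0.1)
```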

	I'd supply the process tree monitoring code if I thought
	it worked well, but as I've said I don't trust the code.

	My feeling about all this is that until the OSes support
	getrusage() properly, I won't bother. Too much hackery for
	a multiplatform system.

	Back in 2001 I thought 'well, they'll figure this out soon,
	how hard can it be', but here it is 2007 and Linux is still
	arguing over what probably amounts to a 20 line kernel patch:
	http://groups.google.com/group/fa.linux.kernel/tree/browse_frm/thread/9126646bb52e36ae/b2e5a16ae19dad02?rnum=1&hl=en&q=linux+getrusage&_done=%2Fgroup%2Ffa.linux.kernel%2Fbrowse_frm%2Fthread%2F9126646bb52e36ae%2F25a450f891fea844%3Flnk%3Dst%26q%3Dlinux%2Bgetrusage%26rnum%3D8%26hl%3Den%26#doc_b2e5a16ae19dad02

	I mean, even with frigging Alan Cox on the thread saying
	this looks OK. I agree 100% with the subject of that article. >;)

>> 	It's too bad that most of the OS's (esp unix!) don't let
>> 	a parent program get back the memory use and usr/sys time
> 
> Yeah, I was floored by that when I found out just how bad memory 
> accounting is!  The structures are there, for Pete's sake!

	Yeah, and not even that.. apparently the process still
	can't even get memory use about *itself*, let alone children!!

	I just did a test now on Fedora 3: a giant malloc() and
	memset(), and getrusage(RUSAGE_SELF) reports all zeros for
	memory. Pathetic!

            ru_utime:0/1347     user time used (secs/usecs)
            ru_stime:0/4963     system time used (secs/usecs)
           ru_maxrss:0          <-- max resident set size
            ru_ixrss:0          <-- integral shared text memory size
            ru_idrss:0          <-- integral unshared data size
            ru_isrss:0          <-- integral unshared stack size

	/Same/ results on OSX 10.4.9, too. So both linux and OSX
	have brain dead getrusage(2).

	To show I'm not crazy, I compiled and ran the same code
	on my FreeBSD webserver, and it /worked/:

            ru_utime:0/3989     user time used (secs/usecs)
            ru_stime:0/0        system time used (secs/usecs)
           ru_maxrss:604        <-- max resident set size
            ru_ixrss:4          <-- integral shared text memory size
            ru_idrss:988        <-- integral unshared data size
            ru_isrss:128        <-- integral unshared stack size

	(shrug)
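
	The same experiment is easy to rerun from Python's resource
	module, which just wraps getrusage(2). (On the 2007-era kernels
	above the memory fields came back zero; current Linux at least
	fills in ru_maxrss, while ru_ixrss/ru_idrss/ru_isrss are still
	typically left at zero. Units also vary: ru_maxrss is kB on
	Linux, bytes on OS X.)

```python
import resource

# A giant allocation, zero-filled -- the malloc()+memset() of the test above.
buf = bytearray(64 * 1024 * 1024)

ru = resource.getrusage(resource.RUSAGE_SELF)
print("ru_utime :", ru.ru_utime)      # user time used (secs)
print("ru_stime :", ru.ru_stime)      # system time used (secs)
print("ru_maxrss:", ru.ru_maxrss)     # max resident set size
print("ru_ixrss :", ru.ru_ixrss)      # integral shared text memory size
print("ru_idrss :", ru.ru_idrss)      # integral unshared data size
print("ru_isrss :", ru.ru_isrss)      # integral unshared stack size
```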

> My script has a ramp-down of polling frequency: for the first 10 
> seconds, it polls every 2 seconds, then every 5 seconds for another 30, 
> eventually down to once a minute for the life of the process.  This has 
> worked pretty darned well, as it captures the fast frames decently.  It 
> does miss out on last-second memory surges (Shake tends to do that once 
> in a while, it seems).

	Often those last surges are due to 'exit()' being called
	instead of _exit().

	This is something C++ made common; when you call exit(), all
	the global/static destructors run, and so this often creates
	a lot of activity pulling memory pages back in just so they
	can be free()d.

	Just calling _exit() bypasses all the destructors, so it
	just frees everything, and the process exits instantly.

	One runs into this with vfork() too.
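
	You can see the same exit()-vs-_exit() split from Python:
	sys.exit() runs registered cleanup handlers (the analog of C's
	atexit handlers and C++ static destructors), while os._exit()
	drops straight out of the process. A sketch, using a pipe to
	detect whether the handler ever ran:

```python
import atexit, os

def exit_vs__exit():
    """Show that os._exit() (like C _exit()) skips exit-time cleanup."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                                  # child
        os.close(r)
        atexit.register(os.write, w, b"cleanup")  # stand-in "destructor"
        os._exit(0)     # bypasses atexit handlers: nothing gets written
    os.close(w)
    data = os.read(r, 64)     # immediate EOF -- the handler never ran
    os.waitpid(pid, 0)
    return data

print(exit_vs__exit())  # b'' -- swap os._exit(0) for sys.exit(0)
                        # and you'd get b'cleanup' instead
```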

> But for the most part, the frame-to-frame 
> correlation is pretty strong.  Oddly enough, I haven't seen the 
> data-doublings due to forks, maybe because the polling is infrequent.

	Ya, most likely. Also, possibly shake isn't forking off child
	processes, and likely hasn't got a very large memory footprint.

	Tell me how it goes with RAM-heavy renders that fork children.
	At DD I was backing off the sample rate too, but then you'd
	get that doubled-memory fork() snapshot towards the end, when
	the display driver would fire. To 'solve' it, I tried bumping
	the sample rate back up whenever I saw what looked like a
	doubling of RAM use, to see if it was an aberration. That sorta
	worked, but there were other reasons it would get wild values,
	so I just didn't bother with it in Rush, waiting for the OS
	to deliver the goods.
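
	For what it's worth, that ramp-down schedule is a few lines to
	codify. A sketch using the intervals quoted above (2s polls for
	the first 10s, 5s polls for the next 30s, then once a minute --
	collapsing the "eventually" into a single step):

```python
def poll_interval(elapsed_secs):
    """Seconds to wait before the next sample, given process age."""
    if elapsed_secs < 10:
        return 2      # first 10 seconds: poll every 2s
    if elapsed_secs < 40:
        return 5      # next 30 seconds: poll every 5s
    return 60         # thereafter: once a minute

print([poll_interval(t) for t in (0, 9, 10, 39, 40, 3600)])
# [2, 2, 5, 5, 60, 60]
```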

-- 
Greg Ercolano, erco@(email suppressed)
Rush Render Queue, http://seriss.com/rush/
Tel: (Tel# suppressed)
Fax: (Tel# suppressed)
Cel: (Tel# suppressed)
