From: Abraham Schneider <aschneider@(email surpressed)>
Subject: strange retries
   Date: Mon, 26 Jun 2006 05:47:19 -0400
Msg# 1317
Hi!

We are having trouble with our network/pipeline (Linux-based Ethernet servers connected to SAN storage): black rendered frames from Shake, corrupted rendered frames, and several other problems. We are trying to figure out which part of the pipeline causes these problems, but that's not easy because they can't be reproduced reliably.

There is one strange thing that I see when rendering with Rush and Shake: when I look in iRush->Frames, I sometimes see retries for some packets. The strange thing is that the log for such a packet looks as if there was no problem or retry at all.

For example: I have "retry #2 of 5" in the notes of a packet, but the log looks like this:

###
### lion.700: 0006
###
--------------- Rush 102.42a --------------
--      Host: scarecrow
--       Pid: 14415
--     Title: servertest1
--     Jobid: lion.717
--     Frame: 0006
--     Tries: 0
--     Owner: aschneid (1054/2001)
-- RunningAs: aschneid (1054/2001)
--  Priority: 81
--      Nice: 10
--    Tmpdir: /var/tmp/.RUSH_TMP.142
--   LogFile: /mnt/frozone/projects/servertest/servertest1.shk.log/0006
-- Command: perl /mnt/libs/rushlib/submit-shake.pl -render /mnt/frozone/projects/servertest/servertest1.shk 5 300 5 AddNever+Requeue 60000 off -v -motion 1.0 1 -cpus 4 -proxyscale Base
--   Started: Sat Jun 24 04:47:00 2006
------------------------------------------
    SHAKEPATH: /mnt/frozone/projects/servertest/servertest1.shk
  RENDERFLAGS: -v -motion 1.0 1 -cpus 4 -proxyscale Base
  BATCHFRAMES: 5 (6-10)
      RETRIES: 5 (AddNever+Requeue after 5 retries)
   MAXLOGSIZE: 60000
PATH: /usr/nreal/shake/bin:/usr/local/rush/bin:/usr/local/rush/bin:/sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin

Executing: logtrim -s 60000 -c shake -exec /mnt/frozone/projects/servertest/servertest1.shk -t 6-10 -v -motion 1.0 1 -cpus 4 -proxyscale Base
info: rendering frame 6
info: frame 6 rendered in 26.94s
info: rendering frame 7
info: frame 7 rendered in 29.38s
info: rendering frame 8
info: frame 8 rendered in 27.35s
info: rendering frame 9
info: frame 9 rendered in 23.67s
info: rendering frame 10
info: frame 10 rendered in 26.74s
--- SHAKE SUCCEEDS: EXITCODE=0

Any idea why there was a retry? How can I check what forced the retry? Any other general ideas on how to test my pipeline to find the problematic part?
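
To check many frame logs for this kind of mismatch at once, something like the following Python sketch might help. It only relies on what is visible in the log above: header lines of the form "-- Key: value", the "EXITCODE=" line, and a note text like "retry #2 of 5". The function names are hypothetical, and it is an assumption that the "Tries:" header field counts the same thing as the retry number shown in the frame notes.

```python
import re

# Header lines in the log look like "--     Tries: 0".
HEADER_RE = re.compile(r"^--\s*([A-Za-z]+):\s*(.*)$")
# The render script ends with a line such as "--- SHAKE SUCCEEDS: EXITCODE=0".
EXIT_RE = re.compile(r"EXITCODE=(-?\d+)")
# Frame note as shown in iRush->Frames, e.g. "retry #2 of 5".
NOTE_RE = re.compile(r"retry #(\d+) of (\d+)")


def parse_frame_log(text):
    """Return (header fields, exit code or None) for one frame log."""
    fields = {}
    exitcode = None
    for line in text.splitlines():
        m = HEADER_RE.match(line.strip())
        if m:
            # Keep the first occurrence of each header key.
            fields.setdefault(m.group(1), m.group(2).strip())
            continue
        m = EXIT_RE.search(line)
        if m:
            exitcode = int(m.group(1))
    return fields, exitcode


def retry_mismatch(note, fields):
    """True if the note reports a retry but the log header says Tries: 0."""
    m = NOTE_RE.search(note)
    if not m:
        return False
    return int(m.group(1)) > 0 and int(fields.get("Tries", "0")) == 0


sample = """--------------- Rush 102.42a --------------
--      Host: scarecrow
--     Jobid: lion.717
--     Frame: 0006
--     Tries: 0
------------------------------------------
info: rendering frame 6
--- SHAKE SUCCEEDS: EXITCODE=0
"""

fields, exitcode = parse_frame_log(sample)
print(fields["Host"], fields["Tries"], exitcode)
print(retry_mismatch("retry #2 of 5", fields))
```

Run over every file in the job's log directory, this would at least list which hosts and frames show a retry in the notes without any trace of it in the log.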

Thanks for any help.

Abraham



--
Abraham Schneider
VFX Compositor

ARRI Film & TV Services GmbH
Tuerkenstr. 89
D-80799 Muenchen

Phone: +49 89 3809-1269
Mobile: +49 173 5719842
Email: aschneider@(email surpressed)
