> We've recently been getting these errors intermittently
> in the 'Frames' report:
>
> STATE FRAME TRY HOSTNAME PID JOBID [..] NOTES
> ----- ----- --- -------- ---- ------- ---------------------------------------------------------------------------------------------------------------
> Fail 0009 1 blade10 3268 svr.457 [..] ERROR: can't create log as rush@blade10:\\jobserver\share1\jobs\BUILDLOG: An unexpected network error occurred.
>
> This occurs randomly on different hosts. Our server \\jobserver is a CIFS share
> on our SAN. Is this a Rush problem, or one with the Windows stack?
That's a Microsoft error, not a Rush one; it seems the OS returned an error
when rush tried to check whether the log directory exists.
Microsoft has a few links for this error; maybe one of them
will help you discover the problem:
http://support.microsoft.com/kb/121913
http://support.microsoft.com/kb/897089
One non-obvious cause could simply be a DNS problem; when the OS sees the
UNC path \\jobserver\share1, it has to resolve "jobserver" into an IP address,
and possibly that hostname resolution is failing intermittently.
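If you'd rather not wait around for the failure, here's a rough Python sketch
that resolves the name repeatedly and counts the misses. It's only an
illustration: probe_resolution() and failure_count() are names I made up, the
attempt count and delay are arbitrary, and it assumes Python is installed on
the client. It only tests name lookup, not the share itself.

```python
import socket
import time


def probe_resolution(hostname, attempts=5, delay=1.0,
                     resolver=socket.gethostbyname):
    """Resolve `hostname` several times; return a list of (ok, detail)."""
    results = []
    for i in range(attempts):
        try:
            # On success, detail is the IP address string that came back.
            results.append((True, resolver(hostname)))
        except OSError as exc:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            results.append((False, str(exc)))
        if i < attempts - 1:
            time.sleep(delay)
    return results


def failure_count(results):
    """Count how many of the lookups failed."""
    return sum(1 for ok, _ in results if not ok)


# Example use on a render node ("jobserver" is the name from the error):
#   results = probe_resolution("jobserver", attempts=20, delay=2.0)
#   print(failure_count(results), "failures out of", len(results))
```

If that reports any failures at all on a node that's been throwing the error,
I'd look at DNS/WINS before anything else.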
How long has this been happening? If it's recent, does its appearance
coincide with any administrative changes on the server (\\jobserver\share1)
or network infrastructure?
When the problem is happening, try walking up to that client (e.g. "blade10",
based on the above error message), and try accessing the "backslash" version
of that pathname with the DIR command from a DOS window, e.g.:
dir \\jobserver\share1\jobs\BUILDLOG
..if you get the same error, you can probably at least debug it that way.
If so, I'd probably try 'ping jobserver' first, just to see if the hostname
lookup is the issue (i.e. see whether ping reports that it can't find the host),
and if not that, probe around with NET USE and friends to see what's up.
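The same order of checks (name lookup first, then the share) can be roughed
out in Python if you want to run it from a script on the affected node. This
is just a sketch under my own assumptions: unc_host() and diagnose() are
hypothetical helper names, and a directory listing stands in for the DIR
command. It won't tell you why the share fails, only which step fails first.

```python
import os
import socket


def unc_host(path):
    r"""Return the server name from a UNC path like \\server\share\dir."""
    # Accept the forward-slash form too, and treat it the same way.
    normalized = path.replace("/", "\\")
    if not normalized.startswith("\\\\"):
        raise ValueError("not a UNC path: %r" % path)
    host = normalized[2:].split("\\", 1)[0]
    if not host:
        raise ValueError("no server name in: %r" % path)
    return host


def diagnose(path, resolver=socket.gethostbyname, lister=os.listdir):
    """Return 'dns', 'share', or 'ok' depending on which step fails first."""
    try:
        resolver(unc_host(path))   # step 1: can the server name be resolved?
    except OSError:
        return "dns"
    try:
        lister(path)               # step 2: can the directory be listed?
    except OSError:
        return "share"
    return "ok"


# Example use on the failing client:
#   print(diagnose(r"\\jobserver\share1\jobs\BUILDLOG"))
```

A result of 'share' with working name lookup would point back at the
server/SAN side or the network in between.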