From: Greg Ercolano <erco@(email surpressed)>
Subject: [SysAdmin/Windows] Getting "ERROR #1398 (has no error message)" intermittently
   Date: Thu, 18 Sep 2008 15:22:40 -0400
Msg# 1779
This problem came up from two different companies within the last month,
so I thought I'd post the issue / solution here.

*** Problem: Company #1 ***
> I'm seeing these errors in the rushd.log intermittently:
>
> 08/21,16:06:38 ERROR      //zserver/rush_logs/CMM/3D/logs/fex_007_scn_031): ERROR #1398 (has no error message)
> 08/21,16:06:46 ERROR      //zserver/rush_logs/CMM/3D/logs/fex_003_scn_013): ERROR #1398 (has no error message)
> 08/21,16:06:46 ERROR      //zserver/rush_logs/CMM/3D/logs/fex_003_scn_013): ERROR #1398 (has no error message)
>
> When it happens, it's with our file server running CIFS in guest mode,
> and use Kerberos authentication. Rush is running as a domain user.
> Sometimes it fixes itself after a reboot.
> 
*** Problem: Company #2 ***
> We're intermittently getting these errors in the 'NOTES' field in irush's
> 'Frames' report.. seems to be complaining about the unc path to our BlueArc
> file server:
>
> //rtserver/projects/2008_US_VEH/3D/scenes/interior_fly/veh_mask.ma.log: ERROR #1398 (has no error message).
>
> Rush is configured to run as a domain user, and Kerberos authentication is used
> with our PDC.

*** Response ***

	Error #1398 is a Microsoft filesystem authentication error number
	that translates to:

		"There is a time and/or date difference between the client and server."

	..which means there's more than a 5 minute(*) drift between the clocks
	on the client and PDC, or client and file server.

	This is not a problem with Rush, but with Windows authentication.

	Microsoft's "Troubleshooting Kerberos Errors" page has info wrt time drift:
	http://www.microsoft.com/downloads/details.aspx?FamilyID=7dfeb015-6043-47db-8238-dc7af89c93f1&displaylang=en

	Regarding the 5 minute(*) citation, this tolerance is a default
	that apparently can be changed in the group policy.
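
	As a rough illustration of the tolerance check, here is a minimal Python
	sketch. The function names and the sample timestamps are made up for
	this example; the 5 minute figure is just the default mentioned above,
	so pass your own tolerance if your policy differs. How you obtain the
	two clock readings is up to you and is not shown here.

```python
from datetime import datetime, timedelta

# Default tolerance cited above; an assumption if your policy changes it.
DEFAULT_TOLERANCE = timedelta(minutes=5)


def clock_skew(client_time, server_time):
    """Return the absolute difference between two clock readings."""
    return abs(client_time - server_time)


def exceeds_tolerance(client_time, server_time, tolerance=DEFAULT_TOLERANCE):
    """True if the two clocks differ by more than the allowed tolerance."""
    return clock_skew(client_time, server_time) > tolerance


if __name__ == "__main__":
    # Hypothetical readings taken from a client and a file server.
    client = datetime(2008, 9, 18, 15, 22, 40)
    server = datetime(2008, 9, 18, 15, 29, 10)
    print("skew:", clock_skew(client, server))
    print("too far apart:", exceeds_tolerance(client, server))
```

	With the sample readings above the clocks are 6 minutes 30 seconds
	apart, so the check reports them as too far apart.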

	BTW, you can convert Microsoft error numbers like "1398" using 'net helpmsg',
	eg:

		net helpmsg 1398

	..which in this case prints the time/date error as shown above:

		There is a time and/or date difference between the client and server.
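
	If you have a lot of these in a log, a small script can do the lookup
	for you. This is only a sketch: the function names, the regex, and the
	lookup table are my own for illustration (not part of Rush), and the
	table only holds the one message quoted above. On Windows it falls back
	to Python's ctypes.FormatError() for codes not in the table; on other
	platforms unknown codes are left alone.

```python
import ctypes
import re

# Only the message quoted in this post; extend the table as needed.
KNOWN_ERRORS = {
    1398: "There is a time and/or date difference between the client and server.",
}

# Matches the unnamed-error text as it appears in the log excerpts above.
ERROR_RE = re.compile(r"ERROR #(\d+) \(has no error message\)")


def describe_error(code):
    """Return a message for a Windows error code, or None if unknown."""
    if code in KNOWN_ERRORS:
        return KNOWN_ERRORS[code]
    # ctypes.FormatError only exists on Windows; elsewhere we give up.
    format_error = getattr(ctypes, "FormatError", None)
    return format_error(code) if format_error else None


def annotate(line):
    """Append the decoded message to a log line that has an unnamed error."""
    match = ERROR_RE.search(line)
    if not match:
        return line
    message = describe_error(int(match.group(1)))
    return f"{line}  [{message}]" if message else line


if __name__ == "__main__":
    sample = ("08/21,16:06:38 ERROR      //zserver/rush_logs/CMM/3D/logs/"
              "fex_007_scn_031): ERROR #1398 (has no error message)")
    print(annotate(sample))
```

	Lines without an unnamed error pass through unchanged, so you can run
	a whole log through annotate() one line at a time.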

	Rush may print "(has no error message)" for error 1398 because that build
	of Rush didn't have that error message at the time it was built. Microsoft
	creates new error messages from time to time, as they 'embrace and extend'
	their way across the computing landscape.

*** Solution ***

	In the case of Company #1, they fixed the problem by syncing their clocks,
	and converting Rush to run on their Windows machines as a local user. They
	mention they did have to reboot the clients because the mounts would not
	pick up due to the time slip.

	In the case of Company #2, they fixed the clock skew problem on their
	BlueArc file server; the PDC and clients were all properly synchronized.

	I suggested in both cases they switch to having Rush run as a local user
	to take away the 24/7 dependency on PDC authentication that domain logins
	impose.

	Domain accounts make things easier for the sysadmin, but make things
	more complicated for the machines. In normal cases such as users manually
	logging in, this isn't hard on the machines.

	But in high load cases, it's best to make things easier for the machines,
	otherwise problems crop up. Rush is heavily exercising the machines constantly,
	so the weaknesses of network authentication get exercised too. You may actually
	save yourself administration tasks by taking the extra labor effort of
	configuring /local/ accounts for Rush to run as on each machine, just to
	take away the dependency on PDC authentication for network rendering.
