Domain Controller in Safe mode? Marcos brings the Vet office back online.
Here is a guest post from my friend Marcos Velez. He worked out a pretty perplexing issue involving a nonresponsive Domain Controller at his sister’s veterinary office. Follow him and tweet at him on Twitter: @RunMarcosRun.
Last night I spent 4 hours troubleshooting logon issues for a domain that has a single DC running Windows Server 2008 R2. I figured I would write up the experience to share with the blog, in the hope of saving someone else a lot of time if they run into a similar problem.
The person affected was my sister, whose veterinary clinic is a small environment with about 10 client computers and a domain with a single Domain Controller. The DC runs all the major services (DNS, AD, DHCP, etc.), and at some point yesterday it started having problems after some automatic Windows Updates seem to have gone awry. Any attempt to log in at a client computer would fail with the standard “There are currently no logon servers available to service the logon request”. Any attempt to log into the DC would get past the logon screen and then display the message “Please wait for the Windows Modules Installer…”, which never went away, even after patiently waiting several hours. Those were the basic, initial symptoms.
The first step I took (besides giving the server time to try and finish) was to force the server to reboot once, and then give it more time to finish applying the updates. That didn’t work, so the next step was to boot into Safe Mode (imagine the fun of having to do this “hands off” and guiding a non-technical person through the process!), take ownership of the TrustedInstaller process so I could remove it (try as I might, I couldn’t rename it), and remove the pending.xml file to prevent any further attempts to install pending updates. A reboot after these changes took care of the annoying Modules Installer message, but I still could NOT log into the DC. Now, how is that possible?!?
I was using the administrator account to log into the DC, but the server would respond with a message stating there weren’t any logon servers available to service the logon request. I tried to log in as “.\Administrator” by using the “Switch User” functionality, but that didn’t work either: every attempt resulted in an error message stating that the username or password was incorrect. After multiple attempts, it occurred to me that the AD database could be corrupted, so I booted back into Safe Mode to try and address that issue. While in Safe Mode, I thought about using DSRM (Directory Services Restore Mode), but I wasn’t sure I knew the DSRM administrator password, and it dawned on me that this was why the attempts to log in as “.\Administrator” were failing: on a DC, the local Administrator account is the DSRM administrator account. Using the NTDSUTIL utility, I reset the DSRM administrator password and then rebooted the server to try and log in again. Once I was able to log in, I noticed two things: no relevant errors were being recorded in the Event Viewer, and the desktop indicated that the server was running in Safe Mode. This was completely unexpected. I did another reboot to see if this would go away, but it didn’t.
After some research, I learned (or was forced to remember) that while in Safe Mode, a DC will not start the AD services. As such, the server is unable to service logon requests. This neatly explained why logins from client computers (and even at the server console) could not succeed. So, I knew the reason behind the issues, and I knew the cause (a failed update), but I still didn’t know how to fix it. What was causing the server to boot into Safe Mode? A quick check of the MSCONFIG utility revealed that the server was set to boot into Safe Mode every time, with the Active Directory repair option selected. See below:
After unchecking the Safe Mode option, I rebooted the server and everything went back to normal.
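The same setting can also be checked without the MSCONFIG GUI, which helps when you are talking someone through it over the phone. The Safe Mode checkbox is stored as a `safeboot` value on the boot entry, and `bcdedit /enum {current}` lists it. Below is a minimal Python sketch that scans that output for the value. The sample text is illustrative only (it was not captured from the server in this post), and the function name `find_safeboot` is made up for the example.

```python
import re
from typing import Optional

# Illustrative text only: roughly what `bcdedit /enum {current}` prints when
# the Safe Mode "Active Directory repair" option is set. Not real captured output.
SAMPLE_BCDEDIT_OUTPUT = """\
Windows Boot Loader
-------------------
identifier              {current}
device                  partition=C:
path                    \\Windows\\system32\\winload.exe
description             Windows Server 2008 R2
safeboot                DsRepair
"""


def find_safeboot(bcdedit_text: str) -> Optional[str]:
    """Return the safeboot value (e.g. 'DsRepair' or 'Minimal'), or None if absent."""
    for line in bcdedit_text.splitlines():
        match = re.match(r"^\s*safeboot\s+(\S+)\s*$", line, flags=re.IGNORECASE)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    value = find_safeboot(SAMPLE_BCDEDIT_OUTPUT)
    if value:
        print(f"Boot entry is pinned to Safe Mode ({value}).")
        print("To clear it from an elevated prompt: bcdedit /deletevalue safeboot")
    else:
        print("No safeboot value found; the server should boot normally.")
```

On a real server you would paste in (or capture) the actual `bcdedit` output instead of the sample string. Clearing the value with `bcdedit /deletevalue safeboot` should have the same effect as unchecking the box in MSCONFIG.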
Here is another information nugget that proved SUPER useful: starting with Windows Server 2008, it is possible to enable console logins using the DSRM administrator account by setting a single registry value. This is incredibly helpful in this type of situation. To log in with the DSRM administrator credentials, one need only click on Switch User and then enter .\Administrator in the username field. It is the ultimate back door into a DC. Apparently, in Windows 2008 SBS this is enabled by default. For other flavors, it has to be set manually.
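For reference, the registry value in question is most likely `DsrmAdminLogonBehavior`, a DWORD under `HKLM\System\CurrentControlSet\Control\Lsa`. The post does not name it, so treat that as an assumption. The Python sketch below only builds the matching `reg add` command line and does not touch the registry itself; the helper name `build_reg_command` is hypothetical.

```python
from typing import List

# Assumed location and name of the value that controls DSRM administrator logon.
LSA_KEY = r"HKLM\System\CurrentControlSet\Control\Lsa"
VALUE_NAME = "DsrmAdminLogonBehavior"

# Meaning of each supported setting.
BEHAVIORS = {
    0: "DSRM administrator can log on only when the DC is started in DSRM (default)",
    1: "DSRM administrator can log on when the AD DS service is stopped",
    2: "DSRM administrator can always log on",
}


def build_reg_command(behavior: int) -> List[str]:
    """Build the `reg add` argument list that sets the logon behavior."""
    if behavior not in BEHAVIORS:
        raise ValueError(f"behavior must be one of {sorted(BEHAVIORS)}")
    return [
        "reg", "add", LSA_KEY,
        "/v", VALUE_NAME,
        "/t", "REG_DWORD",
        "/d", str(behavior),
        "/f",  # overwrite an existing value without prompting
    ]


if __name__ == "__main__":
    cmd = build_reg_command(1)
    print(BEHAVIORS[1])
    print(" ".join(cmd))
```

A value of 1 is probably the more conservative choice for a situation like this one, since it only opens the door while AD DS is not running. A value of 2 leaves the DSRM account usable at all times, so it deserves a strong password.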
So, if you are faced with a DC that refuses to allow logon requests, check first to make sure that it is not running in Safe Mode. I learned from this experience that the server may not always indicate that it is running in Safe Mode, certainly not at the logon screen. I also learned how to block pending updates from getting installed. It was a very stressful four hours of work, but all’s well that ends well. Another takeaway from the experience: Microsoft is still releasing updates that can break servers, which is rather unfortunate.
That’s it. (Free Vet services forever!)
If you have something to add, or some feedback on the process and steps I followed, please let me know! I am curious to know if anyone else here has experienced a similar issue.