Citrix Provisioning Services & Cisco UCS Blades

By Jacques Bensimon:

I recently had to make a provisioned XenApp 6.5 vDisk boot on Cisco UCS B-Series blades outfitted with a Cisco VIC network interface card (probably the most common UCS configuration).  Not much information out there (actually none) about PVS & UCS, so I thought I’d share the couple of roadblocks I hit and their solutions:  (to be clear, I’m referring here to streaming vDisks directly to UCS hardware, not to some wussy VM-on-hypervisor-on-UCS embarrassment! [Politically Correct edit by ME 🙂] 😉)

(1) The first issue was that my first blade refused to PXE boot with Provisioning Services –  after making contact with PVS and displaying the “vDisk found” message, it just sat there like a lox and went no further.  That turned out to be a firmware version issue requiring downgrading to a previous firmware release (which actually is the sort of thing that’s easy to do with UCS since the desired firmware package for a particular blade is specified as part of the blade’s “profile” in the UCS management console).  The initial (non-working) firmware release was 2.1(2a) and the release that resolved the PXE issue was the earlier 2.1(1b) which we happened to already have available in the UCS console.  My (unconfirmed) suspicion is that we could successfully have gone as high as 2.1(1f), the release immediately before  2.1(2a), because the latter’s release notes trumpet the implementation of “VIC PXE boot optimization”, which is probably what broke PXE as far as PVS is concerned.

(2) Once the vDisk was successfully booting on that first UCS blade, the next issue was that it refused to boot on an identically configured second blade.  It would get through PXE, then through the scrolling Windows boot screen, but at the point at which the screen would normally go gray and the mouse pointer appear (indicating the switch to Windows drivers), it would instead blue screen with stop code 0x7B (INACCESSIBLE_BOOT_DEVICE).  That would normally suggest that the new blade’s NIC was somehow different from the original blade’s:  even minor PnP ID differences, such as a different revision code, or a different bus slot, would cause the NIC to be seen as a new device and, since a new device doesn’t yet have a corresponding network connection (with its myriad bindings) registered in the image, the PVS target software has nothing to “latch on to” to continue the boot process.  Yet when I reviewed the two blades’ VICs in Device Manager (after booting the second blade from a local disk), I could see no difference in PnP ID, bus ID, slot ID or function ID, i.e. they appeared to be identical.

But the Registry told a different story:  although the PnP ID key for the two VICs was identical (i.e. the corresponding subkey name under HKLM\SYSTEM\CurrentControlSet\Enum\PCI was the same for both), the “instance ID” (the subkey of the PnP ID key) was different!  With most hardware, that’s not supposed to happen:  identical devices plugged into the same bus slot on identical hardware should generate the same instance ID, based as it usually is on some private “recipe” that involves the bus ID, slot ID and function ID.  Upon closer examination, those instance IDs were in this case made up of each VIC’s hex MAC address, along with some additional pad digits.  The prospects for booting a single vDisk on multiple UCS blades were at that point looking bleak, especially after a call to Cisco (during which a detailed education on PnP and instance IDs had to be administered 🙂) led to the conclusion that nothing is available for configuration in a UCS profile or BIOS to yield consistent (non-unique) device instance IDs.
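If you want to check whether your own NICs show the same pattern, the comparison boils down to asking whether the MAC address is embedded in the instance ID.  Here is a minimal Python sketch of just that string check.  The sample instance IDs and MAC addresses are made up for illustration (real values would come from the instance subkeys under HKLM\SYSTEM\CurrentControlSet\Enum\PCI), and the assumption that pad digits may sit around or between the two halves of the MAC is mine, not something confirmed by Cisco.

```python
import re


def normalize_hex(value: str) -> str:
    """Keep only hex digits, upper-cased (drops '&', ':', '-' and the like)."""
    return re.sub(r"[^0-9A-Fa-f]", "", value).upper()


def instance_id_embeds_mac(instance_id: str, mac: str) -> bool:
    """Return True if both halves of the MAC appear, in order, in the instance ID.

    Assumption: pad digits may precede, follow, or separate the two
    3-byte halves of the MAC. This is a heuristic, so a short conventional
    instance ID could in principle match by coincidence.
    """
    mac_hex = normalize_hex(mac)
    if len(mac_hex) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")

    id_hex = normalize_hex(instance_id)
    first_half, second_half = mac_hex[:6], mac_hex[6:]

    # The earliest match of the first half leaves the most room for the second.
    start = id_hex.find(first_half)
    if start == -1:
        return False
    return id_hex.find(second_half, start + 6) != -1


# Hypothetical values, for illustration only.
mac_derived_id = "0025B5FFFF00001A00"
conventional_id = "4&2F1A3B7C&0&00E0"

print(instance_id_embeds_mac(mac_derived_id, "00:25:B5:00:00:1A"))
print(instance_id_embeds_mac(conventional_id, "00:25:B5:00:00:1A"))
```

An instance ID that embeds the MAC is by definition unique per blade, which is exactly what gets in the way of sharing one vDisk across several of them.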

Thankfully, Microsoft had previously encountered a similar issue (see Microsoft KB 2550978, “0x0000007B Stop error after you replace an identical iSCSI network adapter…” and Citrix KB CTX125317, “Unable to Stream to Non-Master Target Dell Physical Hosts”) and this had resulted in hotfix KB2550978 (for Windows Server 2008 R2 SP1) which fundamentally changes the way Windows looks at hardware devices, basically changing from the rigid “if PnP ID and instance ID don’t match what I already have, then it’s a new device” to the looser “if PnP ID and bus/slot/function IDs match something I have, I’m willing to assume it’s the same device”.  The further good news is that, despite Citrix stating the contrary, this hotfix can in fact be installed successfully on a streamed vDisk (i.e. after the PVS Target software), so no tedious reverse image was required to apply the hotfix and make the vDisk bootable on multiple UCS blades.  By the way, that hotfix goes to the top of my “must have” list for all new builds (may not always be immediately necessary, but sure can’t hurt!).
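To make the before/after difference concrete, here is a toy Python model of the two matching rules as I described them above.  It is only a sketch of the idea, not how Windows actually implements PnP matching, and the PnP ID, instance IDs and bus/slot/function numbers are placeholders I invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NicIdentity:
    pnp_id: str       # hardware ID key name (placeholder value below)
    instance_id: str  # subkey under the PnP ID key
    bus: int
    slot: int
    function: int


def same_device_strict(known: NicIdentity, seen: NicIdentity) -> bool:
    """Rigid rule: PnP ID and instance ID must both match."""
    return known.pnp_id == seen.pnp_id and known.instance_id == seen.instance_id


def same_device_relaxed(known: NicIdentity, seen: NicIdentity) -> bool:
    """Looser rule: PnP ID plus bus/slot/function location must match."""
    return known.pnp_id == seen.pnp_id and (
        (known.bus, known.slot, known.function)
        == (seen.bus, seen.slot, seen.function)
    )


# Two "identical" blades whose instance IDs differ because they are MAC-derived.
blade1 = NicIdentity("VEN_XXXX&DEV_YYYY", "0025B5FFFF00001A00", 6, 0, 0)
blade2 = NicIdentity("VEN_XXXX&DEV_YYYY", "0025B5FFFF00002B00", 6, 0, 0)

print(same_device_strict(blade1, blade2))   # False: seen as a new device
print(same_device_relaxed(blade1, blade2))  # True: treated as the same NIC
```

Under the rigid rule the second blade's VIC looks like a brand new NIC with no network connection in the image, hence the 0x7B; under the looser rule it inherits the connection already registered for the first blade.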

(3) Lastly, a minor (possibly irresolvable) issue:  Cisco UCS B-Series blades with Cisco VICs don’t appear to support Wake-On-LAN, so  they can’t be powered on from the PVS console.  If the case currently open with Cisco uncovers a solution to that, I’ll be sure to share it.
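For anyone who wants to test Wake-On-LAN against a blade independently of the PVS console, the generic magic packet is simply 6 bytes of 0xFF followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast.  A minimal Python sketch follows.  The broadcast address and port 9 are common defaults I'm assuming here, and this is the generic WoL format, not necessarily exactly what the PVS console sends.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a standard Wake-On-LAN magic packet for the given MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 bytes of 0xFF, then the MAC repeated 16 times (102 bytes total).
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling send_magic_packet("00:25:B5:00:00:1A") (a made-up MAC) from a machine on the same subnet as a powered-off blade is a quick way to confirm whether the blade responds at all.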

Later,
Jacques.


Be sure to follow Jacques on twitter @JacqBens
