Bricked QLogic Broadcom BCM57840 after driver update

Two weeks ago, we started a round of regular updates on our Flex System servers, including drivers and firmware. It is good practice to update drivers first, so I followed that procedure as usual.

Our update list included the ESXi driver for the CN4022 (QLogic Broadcom BCM57840). Since IBM/Lenovo doesn’t provide VMware drivers for those cards on their site, I downloaded the latest driver available on the VMware site for ESXi 6.0 (bnx2x version 2.713.30.v60.8).
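Before installing a driver like this, it can help to record which driver and firmware versions the NIC currently reports. Below is a minimal Python sketch that turns "Key: Value" text into a dictionary. The helper name `parse_driver_info` and the sample text are hypothetical: the layout is only modeled on what a command such as `esxcli network nic get` prints, it was not captured from a real host, and the version strings are simply the ones from this post.

```python
def parse_driver_info(text: str) -> dict:
    """Collect 'Key: Value' lines into a dict; section headers with no value are skipped."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


# Hypothetical sample text (assumption, not real command output).
SAMPLE = """\
Driver Info:
   Driver: bnx2x
   Firmware Version: 7.12e.4.2d
   Version: 2.713.30.v60.8
"""

if __name__ == "__main__":
    info = parse_driver_info(SAMPLE)
    print(info["Driver"], info["Version"], info["Firmware Version"])
```

Saving this kind of output before and after each update step would at least make it easy to tell which driver and firmware pair a host was running when something went wrong.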

The update was successful and everything looked fine after the reboot, but only until I rebooted the server a second time.

I was getting the following error in the IMM, and the system halted during uEFI initialization (I had to reset the uEFI settings to boot it into the OS):

Sensor Mezz Exp 1 Fault has transitioned to critical from a less severe state.

I’m guessing some register or setting on the card changed after ESXi loaded the driver for the first time, and the card firmware wasn’t able to handle it.

ESXi couldn’t load the bnx2x module anymore, although the cards were still visible in the PCI device list.

I also tried to update the firmware to the new version using bootable media, but with no luck.

I managed to brick 3 servers and 6 cards before I discovered what was causing it, since everything looked good after the first reboot.

The only fix was to contact the vendor and have the cards replaced!

Don’t forget to downgrade the driver in ESXi first, because I’m pretty sure the newer driver will brick the replacement cards again!

All of this happened with the following combination:

  • Driver bnx2x version 2.713.30.v60.8
  • Firmware 7.12e.4.2d
    • Bootcode for 578xx (MFW) 7.13.24
    • UEFI Driver NX2_Ev 7.13.4

Please note that this driver (bnx2x version 2.713.30.v60.8) works fine after I updated the firmware to the newer version 7.13b.4.1c (Bootcode for 578xx (MFW) 7.13.75, UEFI Driver NX2_Ev 7.15.0), which I was planning to do right after the driver update anyway.
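The two combinations above can be written down as a small lookup, so a pair of versions can be checked before an update. This is only a sketch in Python: the names `KNOWN_BAD`, `KNOWN_GOOD` and `classify_combo` are made up for illustration, and the only data in it are the two driver/firmware pairs reported in this post. Any other pair is reported as unknown rather than guessed at.

```python
# Driver/firmware pairs reported in this post; nothing else is assumed.
KNOWN_BAD = {("2.713.30.v60.8", "7.12e.4.2d")}
KNOWN_GOOD = {("2.713.30.v60.8", "7.13b.4.1c")}


def classify_combo(driver: str, firmware: str) -> str:
    """Return 'known-bad', 'known-good', or 'unknown' for a driver/firmware pair."""
    pair = (driver.strip(), firmware.strip())
    if pair in KNOWN_BAD:
        return "known-bad"
    if pair in KNOWN_GOOD:
        return "known-good"
    return "unknown"


if __name__ == "__main__":
    # "some-other-version" is a placeholder, not a real firmware release.
    for fw in ("7.12e.4.2d", "7.13b.4.1c", "some-other-version"):
        print(fw, "->", classify_combo("2.713.30.v60.8", fw))
```

An "unknown" result is not a green light. It only means this post has no data for that pair.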

 

About Dusan Tekeljak

Dusan has over 6 years of experience in the virtualization field. He currently works as a Senior VMware Platform Architect at one of the biggest retail banks in Slovakia. He has a background in closely related technologies, including server operating systems, networking and storage. He used to be a member of the VMware Center of Excellence at IBM and is a co-author of several Redpapers. His main scope of work consists of designing and performance-tuning business-critical virtualized solutions on vSphere, including, but not limited to, Oracle WebLogic, MSSQL and others. He holds several industry-leading IT certifications such as VCAP-DCD, VCAP-DCA, MCITP and others. Honored with #vExpert2015 and 2016 awards by VMware for his contribution to the community. Opinions are my own!

5 Comments

  1. Hi Dusan,

    did the vendor replace the bricked cards free of charge? (I assume the servers were under a support contract?)

    Pavol

  2. So in this instance are you saying you should update the firmware first before the drivers?

    • Hi, I’m not sure what will happen if you are running a different firmware, but I would strongly recommend updating the firmware first if you are running the version mentioned.

  3. I’d have to see if I can check the version #s, but this sounds an awful lot like what I’ve had happen on a couple of HP ProLiant BL460c Gen8 with a BCM57810S. Interestingly, it’s only happened on machines where we used Update Manager to upgrade from 5.5 -> 6.5 -> 6.5U1, and not on ones we wiped and reinstalled with 6.5, then took to U1 with Update Manager.
