Saturday, February 5, 2011

Fixing Seagate 7200.11 BSY + 0 LBA FW Bug.

From Seagate:


An issue exists that may cause some Seagate hard drives to become inoperable immediately after a power-on operation. 


Root Cause
This condition was introduced by a firmware issue that sets the drive event log to an invalid location causing the drive to become inaccessible.
The firmware issue is that the end boundary of the event log circular buffer (320) was set incorrectly. During Event Log initialization, the boundary condition that defines the end of the Event Log is off by one. During power up, if the Event Log counter is at entry 320, or a multiple of (320 + x*256), and if a particular data pattern (dependent on the type of tester used during the drive manufacturing test process) had been present in the reserved-area system tracks when the drive's reserved-area file system was created during manufacturing, firmware will increment the Event Log pointer past the end of the event log data structure. This error is detected and results in an "Assert Failure", which causes the drive to hang as a failsafe measure. When the drive enters failsafe, further updates to the counter become impossible and the condition will remain through subsequent power cycles. The problem only arises if a power cycle initialization occurs when the Event Log is at 320 or some multiple of 256 thereafter. Once a drive is in this state, there is no path to resolve/recover existing failed drives without Seagate technical intervention. For a drive to be susceptible to this issue, it must have both the firmware that contains the issue and have been tested through the specific manufacturing process.
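To make the counter condition concrete, here's a small sketch of which Event Log counts would line up with the bug. It reads Seagate's wording as "320, or 320 plus a whole multiple of 256", which is my interpretation of their note. The function names are made up for illustration, and the data-pattern condition from manufacturing is not modelled at all.

```python
def is_trigger_count(count: int) -> bool:
    """True if the count is 320, or 320 plus a whole multiple of 256."""
    return count >= 320 and (count - 320) % 256 == 0


def next_trigger_count(count: int) -> int:
    """Smallest trigger count that is greater than or equal to count."""
    if count <= 320:
        return 320
    steps = -(-(count - 320) // 256)  # ceiling division without floats
    return 320 + steps * 256


# List every trigger count below 1200.
print([n for n in range(1200) if is_trigger_count(n)])  # [320, 576, 832, 1088]
```

So under this reading a drive is only at risk if it gets power cycled while the counter sits on one of those values, which is why the failure looks so random from the outside.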



So much for the technical mumbo jumbo, but how do I get my freaking data back? It may be possible for you to send the HDD back to Seagate and get them to fix it for you, or if you're feeling brave you can DIY the fix. 


You're gonna need some kinda TTL interface to the HDD. RS232 to TTL, or USB to RS232 to TTL, or USB to TTL, will all work, so long as you can use some kinda terminal program to address it properly. Test the adapter first by hooking its Tx and Rx lines together and checking that the terminal echoes whatever you type.
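If you'd rather script the loopback check than eyeball it, here's a minimal sketch. `loopback_ok` and the two fake port classes are names made up for illustration. The only assumption about a real port object is that it has `write(bytes)` and `read(size)` methods with a read timeout set, the way a pyserial `Serial` object does. Nothing below talks to real hardware.

```python
class FakeLoopbackPort:
    """Stand-in for an adapter whose Tx is wired straight to its own Rx."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def write(self, data: bytes) -> int:
        self._buffer.extend(data)  # whatever goes out comes straight back
        return len(data)

    def read(self, size: int = 1) -> bytes:
        chunk = bytes(self._buffer[:size])
        del self._buffer[:size]
        return chunk


class FakeOpenPort:
    """Stand-in for an adapter with nothing connected: reads time out empty."""

    def write(self, data: bytes) -> int:
        return len(data)

    def read(self, size: int = 1) -> bytes:
        return b""


def loopback_ok(port, probe: bytes = b"loopback-test\r\n") -> bool:
    """Send probe and report whether the very same bytes were read back."""
    port.write(probe)
    return port.read(len(probe)) == probe


print(loopback_ok(FakeLoopbackPort()))  # True
print(loopback_ok(FakeOpenPort()))      # False
```

With a real adapter you'd pass in the opened port object instead of a fake one. If the check fails with Tx and Rx bridged, sort out the adapter or driver before going anywhere near the drive.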


[Image: Seagate Terminal Connections]
Next hook it up like this to the HDD. Your PC's Rx goes to the HDD's Tx and vice versa. Don't worry about it too much; if it doesn't work (nothing on screen upon HDD powerup), just swap the connections around. Don't power up your HDD just yet. You gotta 1st determine what kinda problem you have. If your HDD is being detected as 0MB, you have the 0 LBA problem; if it's not detected at all, you have the BSY problem. The BSY is slightly more involved to fix, but most of the HDDs I've seen are stuck in BSY.
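The two symptoms boil down to a tiny decision rule. Here it is as a sketch, with `classify_fault` and its parameters being hypothetical names. How you find out whether the drive is detected, and at what capacity, is up to your BIOS or OS and isn't covered here.

```python
from typing import Optional


def classify_fault(detected: bool, capacity_bytes: Optional[int] = None) -> str:
    """Map the symptoms described in the post onto a fault name."""
    if not detected:
        return "BSY"      # drive never shows up at all
    if capacity_bytes == 0:
        return "0 LBA"    # drive shows up, but as 0MB
    return "other"        # detected with a real capacity: not this bug


print(classify_fault(detected=False))                   # BSY
print(classify_fault(detected=True, capacity_bytes=0))  # 0 LBA
```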


[Image: BSY Error Msg]
Configure the terminal program for 38400 baud, 8 data bits, 1 stop bit, no parity. I'm using RealTerm here (Windows 7 no longer comes with HyperTerminal). Once powered up, a drive stuck in BSY will keep flashing the error message as shown here. This prevents you from issuing commands, so we gotta work around that. Note that commands are case sensitive.
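For anyone scripting this instead of using a terminal program, the same settings can be written down as a plain dictionary. The key names are chosen to line up with the keyword arguments of pyserial's `Serial` class, but that's only a convenience: the sketch doesn't import pyserial, and the helper functions are made up for illustration.

```python
# 38400 baud, 8 data bits, no parity, 1 stop bit (often written 38400 8N1).
SERIAL_SETTINGS = {
    "baudrate": 38400,
    "bytesize": 8,
    "parity": "N",
    "stopbits": 1,
}


def bits_per_frame(settings: dict) -> int:
    """Bits on the wire per byte: start bit + data + optional parity + stop."""
    parity_bits = 0 if settings["parity"] == "N" else 1
    return 1 + settings["bytesize"] + parity_bits + settings["stopbits"]


def bytes_per_second(settings: dict) -> float:
    """Best-case throughput of the link in bytes per second."""
    return settings["baudrate"] / bits_per_frame(settings)


print(bits_per_frame(SERIAL_SETTINGS), bytes_per_second(SERIAL_SETTINGS))  # 10 3840.0
```

If the terminal shows garbage instead of readable text, a mismatch in one of these four values is the first thing to check.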


[Image: Shimmed]
To do that, loosen the controller board screws a little around the head connector area, and slip in a little piece of paper or card to stop the board from making electrical contact there. Then power up the drive. After that, hit Ctrl+Z and you should see a prompt like this.


F3 T>


Now access Level 2. Type


F3 T>/2 (enter)
F3 2>
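The prompt tells you which level you're on, which matters because the later commands only make sense at the right level. Here's a rough sketch that pulls the level out of captured terminal text. The regular expression is based only on the two prompt shapes shown in this post (`F3 T>` and `F3 2>`), and `current_level` is a made-up name.

```python
import re
from typing import Optional

# Matches prompts such as "F3 T>" or "F3 2>" and captures the level character.
PROMPT_RE = re.compile(r"F3 ([0-9A-Za-z])>")


def current_level(terminal_text: str) -> Optional[str]:
    """Return the level shown in the most recent prompt, or None if absent."""
    levels = PROMPT_RE.findall(terminal_text)
    return levels[-1] if levels else None


print(current_level("F3 T>/2\r\nF3 2>"))  # 2
print(current_level("no prompt here"))    # None
```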


You'll want to give it some time before issuing the next command to prevent it from error-ing out. 30 secs should be sufficient. After 30 secs, issue the command to spin down the HDD.


F3 2>Z (enter)


Spin Down Complete 
Elapsed Time blah blah blah
F3 2>


[Image: Together]
Next, carefully remove the paper/card shim and screw the board back down. You'll want to screw it down good (I've had it error out because there wasn't good contact), and be careful not to short out your controller board with dropped screws.


After that's done, we'll now issue a command to spin the drive back up. 


F3 2>U (enter) 


Spin Up Complete 
Elapsed Time blah blah blah


Now we reset the SMART so it won't be stuck in BSY. 1st go up to Level 1. 


F3 2>/1 (enter)


F3 1>N1 (enter)


Next you can choose to clear the defect list (G-list). This step is entirely optional. In fact I don't recommend it unless you face problems later.


F3 1>/T (enter)


F3 T>i4,1,22 (enter)


After that's done, power down the drive, wait a couple of seconds, then power it back up. Hit Ctrl+Z again to init the terminal session. If everything looks good, issue this command. If you have the 0 LBA problem, you'd be at this step straight away.


F3 T>m0,2,2,0,0,0,0,22 (enter)


This command will quick format the user partition, thus regenerating it. The command must be typed exactly as shown; anything wrong WILL lead to data loss. You gotta have patience here, it's gonna take awhile.
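Since the order matters so much, here's the whole procedure written out as a checklist builder. It only restates the steps from this post as data and doesn't send anything to a drive. `build_steps` and the step kinds are made-up names, and the assumption that a 0 LBA drive still needs a power up and Ctrl+Z before the final command is my reading of "straight away".

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (kind, detail); kind is "manual", "key", "wait" or "cmd"

REGENERATE: Step = ("cmd", "m0,2,2,0,0,0,0,22")

BSY_PREP: List[Step] = [
    ("manual", "shim the head connector contacts, then power up"),
    ("key", "Ctrl+Z"),
    ("cmd", "/2"),
    ("wait", "about 30 seconds"),
    ("cmd", "Z"),
    ("manual", "remove the shim and screw the board back down"),
    ("cmd", "U"),
    ("cmd", "/1"),
    ("cmd", "N1"),
]
OPTIONAL_GLIST: List[Step] = [("cmd", "/T"), ("cmd", "i4,1,22")]
POWER_CYCLE: List[Step] = [
    ("manual", "power down, wait a couple of seconds, power back up"),
    ("key", "Ctrl+Z"),
]


def build_steps(fault: str, clear_glist: bool = False) -> List[Step]:
    """Return the ordered checklist for a 'BSY' or '0 LBA' drive."""
    if fault == "0 LBA":
        return [("manual", "power up the drive"), ("key", "Ctrl+Z"), REGENERATE]
    if fault != "BSY":
        raise ValueError(f"unknown fault: {fault!r}")
    steps = list(BSY_PREP)
    if clear_glist:  # optional, and not recommended unless problems show up later
        steps += OPTIONAL_GLIST
    steps += POWER_CYCLE
    steps.append(REGENERATE)
    return steps


for number, (kind, detail) in enumerate(build_steps("BSY"), start=1):
    print(f"{number:2d}. [{kind}] {detail}")
```

Keeping the commands as exact strings in one place is the point of the exercise: they're case sensitive, and a typo in the last one is the dangerous kind.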


[Image: Fixed]
After some time you should see something like what's shown on the right. 


Only after you've seen that (Max Wr retries blah blah blah) can you power down the drive. It should now be operable, so go ahead and hook it up and back up whatever data is on there. Then to fix the issue once and for all, head over to the Seagate website and download a firmware update suitable for your drive.
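If you're logging the terminal session to a file or a string, the "don't power down yet" rule can be turned into a simple check. This sketch only looks for the "Max Wr retries" text mentioned above, ignoring case. The exact wording of the drive's final output isn't reproduced in this post, so treat the marker as an assumption, and `safe_to_power_down` as a made-up name.

```python
# Marker taken from the post's description of the final output (assumption).
DONE_MARKER = "max wr retries"


def safe_to_power_down(terminal_text: str) -> bool:
    """True once the captured output contains the completion marker."""
    return DONE_MARKER in terminal_text.lower()


print(safe_to_power_down("F3 T>m0,2,2,0,0,0,0,22"))          # False
print(safe_to_power_down("Max Wr retries blah blah blah"))  # True
```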

3 comments:

  1. You gotta have patience here, it's gonna take awhile. ...huh how much did you wait? the screenshot shows 5 sec but i'm pretty sure it was much longer,isnt it?

  2. Why is it that before the m0,2... command, when I disconnect and reconnect power, I can't type anything into HyperTerminal?
