Question: On my DELL PowerEdge server, in the front panel, I see this error message: W1228 Raid controller battery capacity <24hr error message. But, everything seems to be working fine on the server. What should I do to fix this issue? Is this something I should be worried about?
Answer: In most situations, the solution to the problem is to replace the RAID battery, as it is dying or dead.
This problem typically shows up after a few years of running a PowerEdge server. In my experience, it happens on servers that are around 5 years old. But, there have been a few cases where the RAID battery died within the first two years.
The RAID battery is not that expensive. I always recommend that you keep a spare RAID battery and RAID controller memory handy for this kind of situation.
If you have the DELL OMSA (OpenManage Server Administrator) tools installed, you can view the battery status from the command line on your server using the omreport command as shown below. I’ve seen this error message on various PowerEdge servers including the 2950, R610, and R710, and on different PERC controllers including the PERC 6/i, H700, etc.
For example, on one of my PowerEdge R610 servers, I saw the following, indicating a RAID ROMB battery failure.
    # omreport storage battery
    List of Batteries in the System

    Controller PERC H700 Integrated (Slot Embedded)
    ID                  : 0
    Status              : Critical
    Name                : Battery 0
    State               : Failed
    Recharge Count      : Not Applicable
    Max Recharge Count  : Not Applicable
    Learn State         : Idle
    Next Learn Time     : 57 days 5 hours
    Maximum Learn Delay : 7 days 0 hours
    Learn Mode          : Auto
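If you want to check this from a script (for example, from a cron job) instead of reading the output by eye, here is a minimal sketch. It assumes the omreport output keeps the "Label : Value" layout shown above. The function name battery_field is made up for this illustration and is not part of OMSA.

```shell
#!/bin/sh
# battery_field FIELD (hypothetical helper, not part of OMSA):
# read `omreport storage battery` output on stdin and print the value of
# the first line whose label is exactly FIELD.
# Assumes the "Label : Value" layout shown in this article.
battery_field() {
    awk -v want="$1" '
        {
            i = index($0, ":")            # split on the first colon only
            if (i == 0) next              # skip lines without a colon
            key = substr($0, 1, i - 1)
            val = substr($0, i + 1)
            gsub(/^[ \t]+|[ \t\r]+$/, "", key)   # trim both sides
            gsub(/^[ \t]+|[ \t\r]+$/, "", val)
            if (key == want) { print val; exit }
        }
    '
}

# Usage on a server with OMSA installed:
#   omreport storage battery | battery_field State
#   omreport storage battery | battery_field Status
```

The label is matched exactly, so asking for State does not accidentally pick up the Learn State line. If your controller has more than one battery, this only reports the first one.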
In this case, I purchased a new RAID controller battery from DELL and did the following:
- Shut down the server gracefully (don’t do an abrupt shutdown). As always, it is a good idea to have a backup of your critical applications on a different server, in case something goes wrong with this one.
- Replace the RAID battery. This is fairly easy. Refer to the DELL PowerEdge manual for your model on how to do this. You don’t need to remove the whole RAID controller from the system. You just have to remove the old battery from the battery bay on the controller, unplug the connection cable from the old battery, and plug it into the new battery. Finally, insert the new battery into the battery bay.
- Start the server.
- After you replace the battery, you might still see the error message for a little while, i.e. until the controller goes through a couple of learn cycles. This might take around 24 hours. Typically, within 24 hours of installing the new RAID battery, the W1228 message on the front panel should go away.
After changing the RAID battery, the status will change to “Non-Critical”, and the state will say that the new battery is currently “Charging” as shown below:
    # omreport storage battery
    List of Batteries in the System

    Controller PERC H700 Integrated (Slot Embedded)
    ID                  : 0
    Status              : Non-Critical
    Name                : Battery 0
    State               : Charging
    Recharge Count      : Not Applicable
    Max Recharge Count  : Not Applicable
    Learn State         : Idle
    Next Learn Time     : 89 days 6 hours
    Maximum Learn Delay : 7 days 0 hours
    Learn Mode          : Auto
Once the charging is complete (about 24 hours after replacing the RAID battery), you’ll see the following, which indicates that the battery is in good condition now.
    # omreport storage battery
    List of Batteries in the System

    Controller PERC H700 Integrated (Slot Embedded)
    ID                  : 0
    Status              : Ok
    Name                : Battery 0
    State               : Ready
    Recharge Count      : Not Applicable
    Max Recharge Count  : Not Applicable
    Learn State         : Idle
    Next Learn Time     : 80 days 18 hours
    Maximum Learn Delay : 7 days 0 hours
    Learn Mode          : Auto
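To summarize the three states seen in this article (Failed, Charging, Ready), here is a small sketch that maps a State value to the next step I would take. The function name battery_advice and the wording of the messages are my own. Your controller may report other states, which this sketch simply flags as unknown.

```shell
#!/bin/sh
# battery_advice STATE (hypothetical helper, not part of OMSA):
# map the State value from `omreport storage battery` to a suggested action.
# Only the three states shown in this article are handled.
battery_advice() {
    case "$1" in
        Ready)    echo "OK: battery is healthy" ;;
        Charging) echo "WAIT: battery is charging, check again in about 24 hours" ;;
        Failed)   echo "REPLACE: reseat the battery cable or replace the battery" ;;
        *)        echo "UNKNOWN: state '$1', check the controller manually" ;;
    esac
}

# Usage on a server with OMSA installed (assumes the State line starts
# with the word "State", as in the output shown in this article):
#   state=$(omreport storage battery | awk -F' *: *' '/^[ \t]*State/ {print $2; exit}')
#   battery_advice "$state"
```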
But, there are a few other things you should be aware of:
- Sometimes the issue might just be a loose cable on the battery. Reseating the RAID battery cable might help. If you do this, the note in the last step above still applies: the error message may take around 24 hours to clear.
- Changing the RAID battery does not erase any of the RAID configuration settings. They are safe. You can replace the RAID battery without having to worry about losing your RAID config.
- The RAID battery is used by the controller only when something goes wrong, such as an unexpected power loss. It keeps the pending write data in the controller cache alive, so that the controller can write it to disk once power is back. So, a failed battery by itself doesn’t mean data corruption. But, as you can imagine, it is a good idea to fix this before you hit a real failure where the controller loses the cached data because it didn’t have a working battery.
- But, on the other hand, disk performance might be slower with a failed battery. Normally the controller writes the data to the cache first, and the cached data is written to disk later. Without a working RAID battery, the controller will not use the write cache (it will go from write-back to write-through mode), and it will write data directly to disk every time, which might slow down disk writes. So, fix this problem by changing the battery as early as possible.
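If you want to see which write policy your virtual disks report, the following sketch pulls it out of the omreport storage vdisk output. It assumes that output has one "Name : ..." line and one "Write Policy : ..." line per virtual disk, which is what I’d expect from OMSA but may differ on your version. The function name vdisk_write_policy is made up for this illustration.

```shell
#!/bin/sh
# vdisk_write_policy (hypothetical helper, not part of OMSA):
# read `omreport storage vdisk` output on stdin and print one
# "<virtual disk name>: <write policy>" line per virtual disk.
# Assumes "Name : ..." and "Write Policy : ..." lines for each disk.
vdisk_write_policy() {
    awk '
        {
            i = index($0, ":")            # split on the first colon only
            if (i == 0) next
            key = substr($0, 1, i - 1)
            val = substr($0, i + 1)
            gsub(/^[ \t]+|[ \t\r]+$/, "", key)   # trim both sides
            gsub(/^[ \t]+|[ \t\r]+$/, "", val)
            if (key == "Name") name = val        # remember the current disk
            if (key == "Write Policy") print name ": " val
        }
    '
}

# Usage on a server with OMSA installed:
#   omreport storage vdisk | vdisk_write_policy
```

I’m not certain whether every OMSA version shows the configured policy or the one currently in effect, so treat the result as a hint and compare it before and after the battery replacement.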