RAID 10 Vs RAID 01 (RAID 1+0 Vs RAID 0+1) Explained with Diagram

by Ramesh Natarajan on October 24, 2011

RAID 10 is not the same as RAID 01.

This article explains the difference between the two with a simple diagram.

I’m going to keep this explanation very simple so that you can understand the basic concepts well. In the following diagrams, A, B, C, D, E and F represent blocks.

RAID 10

  • RAID 10 is also called RAID 1+0.
  • It is also called a “stripe of mirrors”.
  • It requires a minimum of 4 disks.
  • To understand this better, group the disks in pairs (for mirroring). For example, if you have a total of 6 disks in RAID 10, there will be three groups (Group 1, Group 2 and Group 3), as shown in the diagram.
  • Within a group, the data is mirrored. In this example, Disk 1 and Disk 2 belong to Group 1, so the data on Disk 1 will be exactly the same as the data on Disk 2. Block A written on Disk 1 will be mirrored on Disk 2. Block B written on Disk 3 will be mirrored on Disk 4.
  • Across the groups, the data is striped, i.e. Block A is written to Group 1, Block B to Group 2, and Block C to Group 3.
  • This is why it is called a “stripe of mirrors”: the disks within a group are mirrored, but the groups themselves are striped.
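
The placement rule in the list above can be sketched in a few lines of Python. This is only an illustration, not how any particular controller works: the function name `raid10_layout` is hypothetical, and the sketch assumes one block per stripe unit and an even number of disks.

```python
def raid10_layout(blocks, n_disks):
    """Map each disk number (1-based) to the blocks it stores in RAID 10."""
    if n_disks < 4 or n_disks % 2:
        raise ValueError("this sketch expects an even number of disks, at least 4")
    n_groups = n_disks // 2                  # each group is one mirrored pair
    layout = {disk: [] for disk in range(1, n_disks + 1)}
    for i, block in enumerate(blocks):
        group = i % n_groups                 # stripe: rotate across the groups
        first = 2 * group + 1                # first disk of that mirrored pair
        layout[first].append(block)          # mirror: same block on both disks
        layout[first + 1].append(block)
    return layout


for disk, stored in raid10_layout("ABCDEF", 6).items():
    print(f"Disk {disk}: {' '.join(stored)}")
```

With six disks and blocks A to F, Disk 1 and Disk 2 both end up holding A and D, Disk 3 and Disk 4 hold B and E, and Disk 5 and Disk 6 hold C and F.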

If you are new to this, make sure you understand how RAID 0, RAID 1 and RAID 5 work, and how RAID 2, RAID 3, RAID 4 and RAID 6 work.

RAID 01

  • RAID 01 is also called RAID 0+1.
  • It is also called a “mirror of stripes”.
  • It requires a minimum of 3 disks, but in most cases it is implemented with a minimum of 4 disks.
  • To understand this better, create two groups. For example, if you have a total of 6 disks, create two groups with 3 disks each, as shown in the diagram. In this example, Group 1 has 3 disks and Group 2 has 3 disks.
  • Within a group, the data is striped, i.e. in Group 1, which contains three disks, the 1st block is written to the 1st disk, the 2nd block to the 2nd disk, and the 3rd block to the 3rd disk. So, block A is written to Disk 1, block B to Disk 2, and block C to Disk 3.
  • Across the groups, the data is mirrored, i.e. Group 1 and Group 2 will look exactly the same: Disk 1 is mirrored to Disk 4, Disk 2 to Disk 5, and Disk 3 to Disk 6.
  • This is why it is called a “mirror of stripes”: the disks within a group are striped, but the groups are mirrored.
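
As a rough sketch of the list above, the same idea in Python looks like this. The function name `raid01_layout` is hypothetical, and the sketch assumes two equal groups and one block per stripe unit; real implementations differ in detail.

```python
def raid01_layout(blocks, n_disks):
    """Map each disk number (1-based) to the blocks it stores in RAID 01."""
    if n_disks < 4 or n_disks % 2:
        raise ValueError("this sketch expects two equal groups, at least 4 disks")
    per_group = n_disks // 2                       # disks in each striped group
    layout = {disk: [] for disk in range(1, n_disks + 1)}
    for i, block in enumerate(blocks):
        position = i % per_group                   # stripe: rotate inside the group
        layout[position + 1].append(block)             # copy in Group 1
        layout[position + 1 + per_group].append(block)  # mirrored copy in Group 2
    return layout


for disk, stored in raid01_layout("ABCDEF", 6).items():
    print(f"Disk {disk}: {' '.join(stored)}")
```

Here Disk 1 and Disk 4 hold A and D, Disk 2 and Disk 5 hold B and E, and Disk 3 and Disk 6 hold C and F.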

Main difference between RAID 10 vs RAID 01

  • Performance on both RAID 10 and RAID 01 will be the same.
  • The storage capacity on these will be the same.
  • The main difference is the level of fault tolerance. On most RAID controller implementations, RAID 01 has lower fault tolerance. RAID 01 has only two RAID 0 groups, and a RAID 0 group goes down as soon as any one of its disks fails. So if two drives fail, one in each group, the entire RAID 01 fails. In the RAID 01 diagram, if Disk 1 and Disk 4 fail, both groups will be down, so the whole RAID 01 will fail.
  • RAID 10 has higher fault tolerance. Since each group is only two disks, there are many groups, and a group stays up as long as one of its two disks works. Even if three disks fail (one in each group), RAID 10 is still functional. In the RAID 10 example, even if Disk 1, Disk 3 and Disk 5 fail, RAID 10 will still be functional.
  • So, given a choice between RAID 10 and RAID 01, always choose RAID 10.
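
The failure scenarios above can be checked with a small Python sketch. It assumes the 6-disk grouping from the diagrams and the behaviour described in the list (a RAID 0 group is treated as down once any one of its disks fails); the function names are hypothetical.

```python
RAID10_GROUPS = [{1, 2}, {3, 4}, {5, 6}]   # mirrored pairs, striped together
RAID01_GROUPS = [{1, 2, 3}, {4, 5, 6}]     # striped groups, mirrored together


def raid10_survives(failed):
    """RAID 10 is up while every mirrored pair has at least one working disk."""
    failed = set(failed)
    return all(not (group <= failed) for group in RAID10_GROUPS)


def raid01_survives(failed):
    """RAID 01 is up while at least one striped group has no failed disk."""
    failed = set(failed)
    return any(not (group & failed) for group in RAID01_GROUPS)


for failed in [(1, 4), (1, 3, 5), (1, 2)]:
    print(f"failed disks {failed}: "
          f"RAID 10 {'up' if raid10_survives(failed) else 'down'}, "
          f"RAID 01 {'up' if raid01_survives(failed) else 'down'}")
```

The last case, both disks of one mirrored pair, is the one combination of two failures that takes RAID 10 down as well.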


1 DarioMor October 24, 2011 at 3:01 am

Something is still out of my sight… if performance, storage and # of disks (=~cost) are the same, and RAID10 is better in fault tolerance than RAID01, why should any vendor offer RAID01? Is someone offering it?

Or is there some area where RAID01 is better than RAID10?

oh, I almost forget: YOUR BLOG IS EXCELLENT. Thank you.

2 Pardeep Rathi October 24, 2011 at 4:37 am

Hi Ramesh,

If Disk 1 and Disk 2 both fail in RAID 10, what will happen?

3 Ken Whitesell October 24, 2011 at 5:23 am

I’m sorry, but your logic doesn’t hold here.

You wrote – “In the above RAID 01 diagram, if Disk 1 and Disk 4 fails, both the groups will be down. So, the whole RAID 01 will fail.” Well, that’s also true for the RAID 10 diagram you presented. If disks 1 and 2 fail, your entire RAID is down.

Likewise you wrote – “In the above RAID 10 example, even if Disk 1, Disk 3, Disk 5 fails, the RAID 10 will still be functional.” Again, that also holds true for your RAID 01 diagram. Disks 1, 2 and 3 can all fail in RAID 01 and your system will remain functional.

It’s an issue of probabilities. In both cases, each block exists on two physical drives. The system will fail any time a block becomes unavailable, regardless of grouping. Likewise, the system will remain functional if all blocks are available, regardless of how many drives have failed – assuming your structure can retrieve the unavailable blocks from the other drive on which it resides.

4 jalal hajigholamali October 24, 2011 at 12:15 pm

Hi,

thanks a lot

i think raid 5 is the best one…

5 gUI October 24, 2011 at 2:45 pm

Hi !

I did the maths with my wife (safety expert):

- ‘f’ is the failure probability of one single disk.
- Gx is the name of the groups
- Dx is the name of the disks

RAID 10 :
To lose your file, you need to lose G1 OR G2 OR G3. To lose G1, you need to lose D1 AND D2; to lose G2, you need to lose D3 AND D4; and to lose G3, you need to lose D5 AND D6.
=> probability of losing your file : (f*f)+(f*f)+(f*f) = f²+f²+f² = 3f²

RAID 01 :
To lose your file, you need to lose G1 AND G2. To lose G1 you need to lose D1 OR D2 OR D3, and to lose G2 you need to lose D4 OR D5 OR D6.
=> probability of losing your file : (f+f+f)*(f+f+f)=3f*3f=9f²

In this particular case (6 blocks, 6 disks), you are 3 times more likely to lose your file on RAID01 than on RAID10.

More basically, you can think of it like this:
- on RAID 10, if one disk fails, then when the second failure occurs, there is 1 chance in 5 that it makes my entire system fail (the other disk in the group)
- on RAID 01, if one disk fails, then when the second failure occurs, there are 3 chances in 5 that it makes my entire system fail (any disk in the other group)

6 geeknik October 24, 2011 at 5:01 pm

The Intel RAID controller in my system (Gigabyte X58A-UD7 motherboard) shows the RAID as RAID 10 and then (RAID0+1) shortly thereafter while the machine is booting up, and then the Intel RST software shows RAID10 while in Windows. I’m honestly not sure which I have at this point. I’ll just assume I have RAID10 and hope for the best. =)

7 weebl October 25, 2011 at 11:55 am

Ken, I agree the logic is flawed.
gUI, you may have done the math but you are following the flawed logic.

In either of the setups if disk 1 fails you lose the redundancy on blocks A and D, so in order to have a complete failure you would have to lose the other disk that contains the A and D blocks.
In the case of RAID 10 it would be losing disk 1 and 2, in the case of RAID 01 it would be losing disks 1 and 4.
Both have the same probability after loss of the first disk. 1 out of 5.

8 ramsse October 25, 2011 at 6:54 pm

Hi Ramesh,

In Raid10 section, the last bullet says:
This is why it is called “stripe or mirrors”. i.e the disks within the group are mirrored. But, the groups themselves are striped.

But I think you try to say “stripe of mirror”.

9 Ramesh Natarajan October 25, 2011 at 8:43 pm

@ramsee,

Thanks for pointing it out. It is corrected now.

10 gUI October 25, 2011 at 11:49 pm

@weebl

No, in 01, the problem is the group. If you lose disk 1, group 1 is dead; it won’t help you even for getting pieces C, D, E or F. So if the next failure is in group 2, whatever the disk, group 2 is also dead, and you won’t get any file.

The disks are not independent.

11 weebl October 26, 2011 at 4:57 am

@gUI thanks for clarifying. It makes sense now, and re-reading the article I see the author also states the same thing. I agree RAID 10 is better.

12 Jidifi November 21, 2011 at 8:01 am

Comparing raid10 and raid01 seems easier with
the same number of disks (4) for each raid:

- raid10: 2 disks in each group for mirroring,
2 groups for striping
(D1G1 D2G1, D1G2 D2G2)

- raid01: 2 disks in each group for striping,
2 groups for mirroring
(D1G1 D1G2, D2G1 D2G2).

Then, complete failure if:

- raid 10: if 2 disks of one or more group failed
((D1G1 and D2G1) and/or (D1G2 and D2G2))

- raid 01: if 2 disks of one or more mirror failed
((D1G1 and D1G2) and/or (D2G1 and D2G2)).

But probability of “((D1G1 and D2G1) and/or (D1G2 and D2G2)) failed”
is equal to
probability of “((D1G1 and D1G2) and/or (D2G1 and D2G2)) failed”.

And then raid10 == raid01.

13 gUI January 27, 2012 at 4:44 am

@Jidifi
For 2 disks, maybe it’s the same (don’t want to redo the maths), but for more, it’s no longer equivalent. Please re-read all the comments carefully.

14 Jidifi January 27, 2012 at 4:12 pm

@gUI

In your first comment, you write:
“RAID 10 :
To lose your file, you need to lose G1 OR G2 OR G3. To lose G1, you need to lose D1 AND D2 …”.

Question:
“RAID 10 :
If you lose D1 only (or D2 only), have you lost your file ? “

15 Jidifi January 27, 2012 at 5:43 pm

@gUI

In your first comment, you write:
> ‘f’ is the failure probability of one single disk.
> RAID 10 : probability of losing your file : (f*f)+(f*f)+(f*f) = f²+f²+f² = 3f²
> RAID 01 : probability of losing your file : (f+f+f)*(f+f+f)=3f*3f=9f²

With f = 0.8:
RAID 10 : => probability of losing your file : 3 x 0.8 x 0.8 = 1.92 !!
RAID 01 : => probability of losing your file : 9 x 0.8 x 0.8 = 5.76 !!

gUI, probabilities greater than one do not exist in our world.
(Please re-read carefully all your first mathematical books)

16 gUI January 28, 2012 at 4:33 am

I know, thank you very much… but why did you choose f=1 ? I think you also should have a look at your maths books…

17 gUI January 28, 2012 at 4:42 am

Oops, no, you did not choose f=1, you chose 0.8, yes… But the formula is still true. It just means that with a hard drive that has an 80% probability of failing within an hour, you are sure that you will lose your system. Not very interesting.

Now check with another, more realistic value, like 1/10^4 (around one failure in one year).

18 Vasudev March 25, 2012 at 6:46 am

Thanks a lot for this post. You described it very simple and easy to understand.

19 raj April 8, 2012 at 12:38 pm

Very good article
Thanks to all for sharing their views

20 Laurence May 3, 2012 at 10:01 am

While the above definitions are 100% correct, it is important to recognise that RAID 10 is a relatively new term, and there are many RAID 10 implementations out there that call themselves RAID 0+1. As late as 2006/7, manufacturers were documenting RAID 10 implementations as RAID 0+1.

21 karl July 2, 2012 at 9:31 pm

Are you saying that (in raid 01 diagram) if disk 1 and disk 6 both fail, then the files cannot be reconstructed from disks 2, 3, 4, & 5, which together contain the complete set of blocks?

22 Laurence July 5, 2012 at 4:12 pm

In answer to Karl, Yes in Raid 01 If disk 1 fails then the Stripe Group1 has failed, so the only data access available is via the Group 2 stripe. If any disk in Group 2 fails then both Groups have failed.

This is why RAID 10 is far superior to RAID 01 in terms of availability.

23 Stoinov August 10, 2012 at 10:54 am

For all the people that can’t get why 01 is worse here is a more detailed description:

The gist is basically that when a Drive fails in RAID 01 the whole Group fails. In the case above imagine failure of Disk1 then the whole Group1 is down and you’re left with only three disks that make up Group2. If ANY of them fails then I hope you have recent backups :)

As for RAID 10, it needs to lose two mirrored disks (for instance Disk1 and Disk2) in order to lose a whole group (in this case Group1) before the RAID can fail. And the probability is quite a bit smaller, even without the math: with 01 you have a simple mirror, so when any of the drives fails you have only one group left, and if any of those drives fails while you’re replacing or rebuilding the RAID, it’s all gone.

24 Dev Mgr August 17, 2012 at 11:24 am

A decent raid controller should be able to pick up that, if disk 1 and 5 had failed, it still had access to all data. It would basically piece the 2 raid 0′s together to make a working set of data.

The problem is that some of the low end raid controllers don’t have this level of intelligence.

Also, if you were to resort the drive order in the picture to show “Disk 1 and Disk 4, then Disk 2 and Disk 5, and then Disk 3 and Disk 6″ you would still have the same raid 0, and you could see that data-wise, raid 10 and raid 01 should be the same; it’s just a question of how smart is the raid controller (to understand that all I need are blocks A to F, and it doesn’t matter which drive in which raid set they are on).
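
The point about the data itself can be illustrated with a short Python sketch. It is a sketch under assumptions: the block layout is the one implied by the article's 6-disk examples, and "recoverable" here only means that every block still has a copy on some working disk, which says nothing about whether a given controller can actually use those copies.

```python
from itertools import combinations

# Blocks held by each disk, as implied by the article's 6-disk examples.
RAID10 = {1: "AD", 2: "AD", 3: "BE", 4: "BE", 5: "CF", 6: "CF"}
RAID01 = {1: "AD", 2: "BE", 3: "CF", 4: "AD", 5: "BE", 6: "CF"}


def data_recoverable(layout, failed):
    """True if every block still has a copy on at least one working disk."""
    all_blocks = set("".join(layout.values()))
    surviving = set()
    for disk, blocks in layout.items():
        if disk not in failed:
            surviving.update(blocks)
    return surviving == all_blocks


def fatal_pairs(layout):
    """All two-disk failures after which some block has no copy left."""
    return [pair for pair in combinations(sorted(layout), 2)
            if not data_recoverable(layout, set(pair))]


print(data_recoverable(RAID01, {1, 5}))   # the disk 1 and disk 5 case
print(fatal_pairs(RAID10))
print(fatal_pairs(RAID01))
```

Counted this way, each layout has exactly three fatal two-disk combinations: the three mirror pairs (1,2), (3,4), (5,6) for RAID 10, and (1,4), (2,5), (3,6) for RAID 01.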

25 Laurence August 21, 2012 at 6:03 am

Dev Mgr, you miss the point of this discussion. If the RAID controller implements RAID 0+1, it cannot handle that failure mode; if it implements RAID 10, it can. It has nothing to do with the level of intelligence of the controller, or whether it is a low-end or high-end controller, just the RAID algorithm chosen. The selection of the algorithm has a significant amount to do with the level of intelligence of the firmware developer, but that said, there can be good reasons to choose RAID 0+1 over RAID 10!

26 grant August 24, 2012 at 2:17 pm

I want to understand more the availability of the data. As Dev Mgr points out, depending on the failure mode/sequence of events, your file may actually still exist, but limitations in the RAID algorithm prevent you from accessing it.

Does this mean that with some manual intervention, e.g. physically juggling the drives and putting them in a certain sequence, one could restore their file for use once more? Once restored, further manual steps could be taken to restore your full striped mirror (or mirrored stripe), and your RAID array of choice would once again be live?

27 Hugh August 26, 2012 at 12:57 pm

I found this link looking for the benefit of RAID 10 vs. RAID 01, but using FOUR drives in total. With all things being equal, in a four-drive (2 pairs) array, RAID 01 & 10 should be equal. I’m running an older 3ware controller, and I’ve learned never to trust luck.

It seems to me that the redundancy depends on the controller you use.
If I do RAID 01, and one drive from each stripe fails, even if they weren’t holding the same data, I’m depending on the controller being smart enough to “marry” up one drive from each stripe to form a whole effective non-RAID stripe pair. In essence, to re-build one striped pair dynamically. That isn’t a bet I’m anxious to make!

With the fundamental element being a mirror, the controller can be a good deal stupider and the low-level mirrors still dutifully cough-up their contribution to the over-arching stripe, and my day isn’t ruined.

Since both 01 & 10 use BOTH striping and mirroring, they are likely to provide the same performance. So I’m going with RAID 10. (now if there were only some way to make it go to ELEVEN! :)

28 Laurence August 26, 2012 at 11:54 pm

Grant, what you say may be possible; however, in a RAID 0+1 scenario you really only want to consider this if you had multiple simultaneous failures. The longer the period between the failures, the more out of step the remaining good drives in the failed stripe get with the drives in the live stripe.

Whether manual intervention as you describe will work really depends on the RAID software (firmware), what controls it allows and what metadata it stores. Do not assume the RAID software will allow such activity.

29 Laurence August 27, 2012 at 12:16 am

Hugh, I’m afraid to say RAID 0+1 and RAID 10 are not the same and will NOT give the same availability.

In RAID 0+1, when the first disk is lost, the first stripe is lost, which means both disks in that stripe are ejected from membership of the RAID volume, and so the data is no longer mirrored at all. At this point the data on the other member(s) of the failed stripe starts to become increasingly stale, as it is no longer being updated. If either of the disks in the remaining stripe is lost, the complete volume is lost… the controller can’t have the smarts to sort the issue out; we have lost data…

In RAID 10, when the first disk is lost, only 1 of the 2 mirrors is affected and the stripe between the mirrors is unaffected. So provided your next disk failure comes from a disk in the remaining good mirror, the stripe will still be good.

So if RAID 0+1 has two disk failures, the volume is toast…
RAID 10 can tolerate at least 1 disk failure and potentially 2 if they are the “right disks”, so RAID 10 has better availability.

30 Grant August 27, 2012 at 5:35 am

Hi Laurence,

I’m stuck searching for a backup solution, wondering if you or any of the readers can help.

I have an industrial slim PC at a remote site, no internet connection is permissible (corporate regulation), and I’m nervous of hdd failure. If I could completely mirror the drive, even once a week, the local techs could simply swap the drives to get the system running again and then contact me for warranty support – this keeps downtime very low. The PC box does not have a RAID controller, and does not have room for a 2nd hdd in the box.

Does there exist an external mirror device that works in parallel to the primary disc, or perhaps an external 2 disc RAID 1 device that I can use instead of the internal disc (I’m picturing a 30cm IDE extension cord, so the PC thinks the RAID is internal)?

I prefer a hardware sol’n over software, simply because I cannot login and confirm the software is working at all times.

Thx

31 Leon September 10, 2012 at 10:51 am

Please correct me if my logic is wrong, but it seems the math was made much more complicated than it needs to be. Here is my explanation:

Based on the diagrams for Raid 10 with 6 disks:
In Raid 10 if 2 disks fail the probability of complete failure is 1/5.
The 1st disk has a probability of 1, since it does not matter which disk fails first, and the 2nd disk must be part of the same group as the first disk to fail. Therefore, out of the remaining 5 disks, there is only 1 that can cause complete failure. The probability is 1 out of 5.

In Raid 01 if 2 disks fail the probability of complete failure is 3/5.
The 1st disk can be anything once again, and to cause complete failure the 2nd disk can be any of the 3 disks in the other group. Therefore, out of the remaining 5 disks, the probability is 3 out of 5.

Based on this for a 2 disk failing scenario Raid 10 is better. Writing out the probability for 3 disk failure would prove the same result.

The same applies for 4 disk, for Raid 10 probability of 2 disks failing to cause complete failure is 1/3 and for Raid 01 it is 2/3.
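
The counting argument above can be written down as a small Python sketch. It assumes the second failure is equally likely to hit any of the remaining disks, and that a RAID 01 group is down as soon as one of its disks fails; the function name is hypothetical.

```python
from fractions import Fraction


def second_failure_fatal(level, n_disks):
    """Chance that a second, random disk failure takes the whole array down."""
    remaining = n_disks - 1                # disks still working after failure 1
    if level == "10":
        fatal = 1                          # only the mirror partner is fatal
    elif level == "01":
        fatal = n_disks // 2               # any disk in the other striped group
    else:
        raise ValueError(f"unknown level: {level}")
    return Fraction(fatal, remaining)


for n in (6, 4):
    print(f"{n} disks: RAID 10 {second_failure_fatal('10', n)}, "
          f"RAID 01 {second_failure_fatal('01', n)}")
```

For 6 disks this gives 1/5 and 3/5, and for 4 disks 1/3 and 2/3, matching the figures in the comment.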

32 Laurence September 11, 2012 at 2:52 am

Hi Leon, your maths fails to take into account that in RAID 0+1, after the first failure, half the disks are no longer available (in the above example two good disks will have effectively been failed along with the bad disk).

In RAID 0+1 the first disk failure fails all disks in that stripe; the failure of any subsequent disk means a member of the other stripe has failed, and so data access is lost. Two disk failures = total data access loss.

In RAID 1+0 there is potential for loss of up to half the members while data access remains.

As stated earlier this is why RAID 1+0 is so superior to Raid0+1

33 Ven September 24, 2012 at 3:36 pm

Guys it’s a Simple logic don’t make it complex.
RAID 1+0 has a mirrored pair as its basic element. If a drive fails and is replaced, only the mirror needs to be rebuilt. In other words, the disk array controller uses the surviving drive in the mirrored pair for data recovery and continuous operation. Data from one surviving disk will be copied to the replacement disk.
Note: If there is a hot spare, data is rebuilt onto the hot spare from the surviving drive in the mirrored pair . When the failed disk is replaced, data from the surviving drive in the mirrored pair is used to rebuild the data on the replaced disk.

RAID 0+1 uses a stripe as its basic element. The stripe has no protection (RAID 0). If a single drive fails, the entire stripe is faulted, meaning that only half the disks in the RAID set are available for data access. A rebuild operation rebuilds the entire stripe, copying data from each disk in the healthy stripe to the equivalent disk in the failed stripe. This causes increased, and unneeded, I/O load on the backend, and also makes the RAID set more vulnerable to a second disk failure.

RAID 0+1 is less common than RAID 1+0, and is a poorer solution.
Hope above helps

34 Sammitch September 26, 2012 at 12:14 pm

Also, after disk failure in 0+1 you have to rebuild fully half of the disks in the array, which means you have to do a lot of reads on the other, non-failed half. You really end up putting the system through a large amount of disk I/O, and risking failures in other disks.

In 1+0 you’re only involving the one other disk in the affected mirror.
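
The rebuild cost described in the last two comments can be put into numbers with a tiny Python sketch. It simply encodes the description above (RAID 1+0 copies one surviving disk onto the replacement, RAID 0+1 copies the whole healthy stripe onto the failed one); the function name and the returned keys are hypothetical, and real controllers may behave differently.

```python
def rebuild_cost(level, n_disks):
    """Disks read and written in full to rebuild after a single disk failure."""
    if level == "10":
        # Only the mirrored pair is involved: read the survivor, write the new disk.
        return {"disks_read": 1, "disks_written": 1}
    if level == "01":
        # The whole failed stripe is rebuilt from the healthy stripe.
        half = n_disks // 2
        return {"disks_read": half, "disks_written": half}
    raise ValueError(f"unknown level: {level}")


for level in ("10", "01"):
    print(f"RAID {level} with 6 disks: {rebuild_cost(level, 6)}")
```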

35 Michael October 12, 2012 at 12:11 pm

In case this has not been settled yet, here is every possibility for a 2 drive failure:

Drives  R 0+1  R 1+0
1,2     Up     Down
1,3     Up     Up
1,4     Down   Up
1,5     Down   Up
1,6     Down   Up
2,3     Up     Up
2,4     Down   Up
2,5     Down   Up
2,6     Down   Up
3,4     Down   Down
3,5     Down   Up
3,6     Down   Up
4,5     Up     Up
4,6     Up     Up
5,6     Up     Down
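
The table can be tallied, and extended to three failed drives, with a short Python sketch. It assumes the 6-disk grouping from the article and the strict rule that a striped group is unusable once any of its disks fails; the function names are hypothetical.

```python
from itertools import combinations

RAID01_GROUPS = [{1, 2, 3}, {4, 5, 6}]    # two striped groups, mirrored
RAID10_PAIRS = [{1, 2}, {3, 4}, {5, 6}]   # three mirrored pairs, striped


def raid01_up(failed):
    # Strict model: a striped group is unusable once any of its disks fails.
    return any(not (group & failed) for group in RAID01_GROUPS)


def raid10_up(failed):
    # A mirrored pair is lost only when both of its disks have failed.
    return all(not (pair <= failed) for pair in RAID10_PAIRS)


def count_down(k):
    """Return (RAID 0+1 down, RAID 1+0 down, total) over all k-disk failures."""
    cases = [set(c) for c in combinations(range(1, 7), k)]
    down01 = sum(not raid01_up(c) for c in cases)
    down10 = sum(not raid10_up(c) for c in cases)
    return down01, down10, len(cases)


for k in (2, 3):
    down01, down10, total = count_down(k)
    print(f"{k} failed drives: RAID 0+1 down in {down01}/{total}, "
          f"RAID 1+0 down in {down10}/{total}")
```

For two failed drives this gives 9 of 15 combinations down for RAID 0+1 and 3 of 15 for RAID 1+0, the same totals as the table.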

36 jake12 January 14, 2013 at 3:22 am

Raid 10 tolerance is not better, and here’s why. Let’s discuss the failure scenarios the article mentions:

Scenario #1: Disks 1 and 4 both fail simultaneously. The entire RAID 01 array would fail, but the RAID 10 array would survive.

My response: True, but if disk 1 and disk 2 both fail, RAID 10 fails, while RAID 01 survives. Since the chances of disks 1 and 2 both failing simultaneously are exactly the same as disks 1 and 4 doing the same, it’s a wash; RAID 10 provides no advantage.

Scenario #2: Disks 1, 3, and 5 all fail simultaneously. According to the article, the RAID 01 array would fail, while the RAID 10 array would survive.

My response: NOT TRUE. Data on the RAID 01 array would NOT be lost. Why? Because the data on disks 2 and 5 are identical. So just take the (still working) disk 2, substitute it for disk 5 in the “group 2″ array, and voila, the array is working again, with all data still intact.

So in conclusion, it doesn’t matter too much whether you do RAID 10 or RAID 01; the fault tolerance is the same either way.

37 gUI January 14, 2013 at 4:10 pm

@jake12 : in conclusion, you’re wrong :D

RAID does not work this way, and that’s precisely why 01 has lower availability. As soon as you lose one disk (say #1), you lose the whole group (Group1), even if the other disks are still OK. That’s the implementation; you cannot change this.
So once this happens, if the next failure is in the other group, you’re dead (even if all the data is still available globally, I agree).

38 jake12 January 15, 2013 at 2:13 am

@gUI

I don’t think that’s true.

If one drive in a Raid 0 array dies you lose the “entire array” because it’s inaccessible, but the data on the surviving drives doesn’t go anywhere. It’s not like the RAID controller says “Oh crap, a drive died; I’d better format all the other drives in the array just in case”.

When a drive dies, the RAID 0 *ARRAY* dies, but the *DRIVES* themselves are still perfectly good, and all the data is still there. The only reason the array “dies” is because it needs all drives present in order to access the data, and if one drive is dead/missing, it can’t access the data on the other drives. But if you could somehow magically replace the failed drive with an identical drive with identical data (as is the case when the array is mirrored across another, identical RAID 0 array on the other side of the RAID 01), then the controller has no way of knowing that you just switched drives, and the array lives to see another day.

39 gUI January 15, 2013 at 2:48 am

In that way, you are totally right. But you’ll have to switch off your system, modify the array structure, and switch it back on. Then we can’t talk about availability of the system, and that’s why 10 is more reliable, meaning it guarantees better availability of the service.

In case #2, your data are still there (you can retrieve them at the cost of a little effort), but the service is down.

This being said, I’m not sure that you can switch the drives between them and simply restart your RAID, I wonder if they are tagged or something. But even if this is the case, there will always be a way to retrieve your data, yes.

40 jake12 January 15, 2013 at 3:26 am

@gUI

So then I think we’re in agreement – Raid 10 provides no extra redundancy against catastrophic data loss, only against system uptime regarding how long it takes to swap drive slots and reboot the system.

But even then, that’s assuming that the RAID controller/driver/firmware can’t figure out that there are still enough “good” drives left in the array to get a 100% intact RAID array going. I’ll admit I don’t know a whole lot about server-level RAID controllers, but wouldn’t it be logical for the RAID card manufacturer to put this incredibly simple bit of programming into whatever function monitors the drive status? They’ve been putting all kinds of upgrades/improvements into RAID cards over the years (caching, battery backup, etc.); it wouldn’t be very much trouble to make it so that a RAID 01 array had the same redundancy as a RAID 10, so that RAID noobs didn’t have to worry about how to set the array up.

41 Laurence January 15, 2013 at 3:49 am

Hi Jake12, I have to disagree, because the drive failures are not likely to be simultaneous. In a RAID 0+1 model, after the first drive failure the other two drives that are “marked” bad are no longer being updated, which means that very quickly you will not be able to reconstitute any file that has been written to, and there will be real data-consistency issues.

Secondly, the idea of reconstituting the data from the surviving drives may not work, as some RAID implementations will not allow such an action because they track the membership using metadata stored either on the drive or in the controller.

Regards, Laurence

42 jake12 January 15, 2013 at 4:07 am

@Laurence

Not a bad point. Theoretically, a RAID controller could alleviate this problem by continuing to update the other drives in the “failed” RAID 0 array, but the particular RAID controller device may or may not do this in practice.

So in theory, there should be no difference in redundancy between RAID 01 and RAID 10, but in practice, you should stick with RAID 10 just to play it safe.

43 good_weather January 21, 2013 at 8:02 am

> @gUI
>
> RAID 10 :
> To lose your file, you need to lose G1 OR G2 OR G3. To lose G1, you need to lose D1 AND D2; to lose G2, you need to lose D3 AND D4; and to lose G3, you need to lose D5 AND D6.
> => probability of losing your file : (f*f)+(f*f)+(f*f) = f²+f²+f² = 3f²

> RAID 01 :
> To lose your file, you need to lose G1 AND G2. To lose G1 you need to lose D1 OR D2 OR D3, and to lose G2 you need to lose D4 OR D5 OR D6.
> => probability of losing your file : (f+f+f)*(f+f+f)=3f*3f=9f²

You can’t add up probabilities like that because their events overlap. Also probabilities will never ever be higher than one if you do proper calculations.

Considering the first case RAID 10:
You are asking for the probability P(G1=d or G2=d or G3=d). To calculate this you could break this down to each _disjoint_ case:
(G1=d and G2=l and G3=l) or (G1=d and G2=d and G3=l) or (G1=d and G2=l and G3=d) or (G1=l and G2=d and G3=l) …
Here you could add each probability because their events are disjoint. But it is simpler to calculate P(G1=l and G2=l and G3=l) which is the only event left out, the opposite case.

Therefore the probability of P(G1=d or G2=d or G3=d) = 1 – P(G1=l and G2=l and G3=l) = 1-3(f-1)^2.

Another approach would be to negate the boolean expression “G1=d or G2=d or G3=d” which actually is “G1=l and G2=l and G3=l”.

If you now let f be 0.8 the result would be 1 – 0.2^2 = 99.2 %, which of course is within the range [0, 1].

44 good_weather January 21, 2013 at 8:19 am

Wait, i’m stupid. I messed up the probability for a group to be alive:

P(G=l) = 1-f^2

Therefore P(G1=l and G2=l and G3=l) is (1-f^2)^3. The hopefully correct formula is P(G1=d or G2=d or G3=d) = 1-(1-f^2)^3. Let f=.8 then the result is ~ 95.3 %.
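
The corrected formula can be evaluated with a few lines of Python. The RAID 10 function is the formula above; the RAID 01 function is not from the thread but is an assumed counterpart built the same way, on the assumptions that disks fail independently with probability f and that a striped group is down once any of its disks fails. The 3f² and 9f² columns are the earlier estimates, shown for comparison.

```python
def p_raid10_down(f):
    """Down if any of the 3 mirrored pairs has lost both of its disks."""
    return 1 - (1 - f**2) ** 3


def p_raid01_down(f):
    """Down if both 3-disk striped groups have lost at least one disk."""
    return (1 - (1 - f) ** 3) ** 2


for f in (0.8, 1e-4):
    print(f"f={f}: RAID 10 {p_raid10_down(f):.6g} (3f^2 = {3 * f * f:.6g}), "
          f"RAID 01 {p_raid01_down(f):.6g} (9f^2 = {9 * f * f:.6g})")
```

For f = 0.8 the RAID 10 result is about 0.953, as computed above. For small f the exact values stay close to 3f² and 9f², so the earlier estimates work as small-f approximations even though they cannot be used as probabilities for large f.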

Sorry for the confusion :/

45 jake12 January 22, 2013 at 2:21 am

To simplify the point good_weather is making, consider rolling six dice (regular, 6-sided cubic dice). What are the chances of rolling a “5″ on at least one of them?

Let’s see, the chances of getting a 5 (or any other number) on a single die is 1 in 6, because there are 6 possible numbers. So the chances of getting a “5″ on at least one of the six dice rolled is:
(1/6) + (1/6) + (1/6) + (1/6) + (1/6) + (1/6) = 6 * (1/6) = 1
So by that logic, you have a 100% chance of rolling at least one “5″, which is obviously wrong.

While it’s true that *on average* you’ll get a 5 on one of the six dice, you’re not guaranteed to have exactly one on every roll, because some rolls will produce more than one “5″, while others will produce none.

The easiest way to correctly calculate the odds, as good_weather points out, is to calculate the odds of getting *NO* 5′s; i.e. the odds of all six dice resulting on something *other* than a 5, which is easy. The odds of getting a non-5 roll on a single die is 5/6, so the odds of getting a non-5 roll on all 6 dice is (5/6)^6, or 15,625/46,656, or about 1 in 3.

So you have a 1 in 3 chance of NOT getting a 5, which means you have a 2 in 3 chance of rolling at least one. p(at least one) = 1 – p(none).

I think that simplifies it. Either that, or makes it more confusing ;P
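
The dice calculation can be reproduced exactly with Python's `fractions` module, as a quick check of the arithmetic above.

```python
from fractions import Fraction

# Chance that none of six fair dice shows a 5.
p_no_five = Fraction(5, 6) ** 6
# Complement rule: p(at least one) = 1 - p(none).
p_at_least_one = 1 - p_no_five

print(p_no_five, float(p_no_five))
print(p_at_least_one, float(p_at_least_one))
```

This prints 15625/46656 (about 0.335) for no 5 at all, and 31031/46656 (about 0.665) for at least one 5.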

46 Anonymous February 3, 2013 at 7:25 am

Very Good Info Ramesh

47 john Johnson February 20, 2013 at 2:32 am

Raid 10 is the same performance as Raid 1?

No.

48 Witek March 12, 2013 at 2:20 pm

gui@ – almost correct, but you made mistakes in calculations.
good_weather@ – correct, but still there are significant assumptions in your calculations.

Distinction between 10 and 01 matters only if you are using them in different layers. For example, doing RAID 0 in software on top of hardware RAID 1, or the reverse. Or doing everything in software, but using two SATA/SAS controllers (each one for half of your drives). Then, losing one of the drives in 01 may mean the whole group is unavailable (because the hardware RAID will not know about the other group, or because the whole group failed because of a faulty controller or a faulty shared PSU), and then it is just a matter of losing any of the remaining disks in the remaining group to lose the whole array. However, if all disks are behind one controller, and you are doing software RAID or smart hardware RAID, then there is no difference: each block is available on some 2 devices, and shuffling the order of the drives doesn’t matter.

Your analysis of failure probabilities for 10 and 01 assumes there exists a dependency between disks in a single group, but that is only true in some cases (as explained above). It is still valid in many situations, as also explained above, such as when using 2 controllers or doing replication across machines. Think about it. Is it better to have the first mirror on machine 1 and the second mirror on machine 2, and perform striping on the client (raid 01)? Not really: if one of the machines dies, you lose access to the data, because having at least one drive in each machine is necessary. Contrast that with the opposite. Stripe non-mirrored drives on each machine, and mirror them (in the client) across machines (raid 10)? Yes, even losing a whole single machine will leave the array available to the client. Losing some disks will also be fine in such a scenario, because you can choose which machine to use for each block.

49 Claireron April 2, 2013 at 9:21 am

real world working with 5x9s the array RAID theory is all change – classic RAID level is going … going nearly gone, chunklets as in HP 3PAR’s array are the way ahead, and in regard to the posting classic RAID 10 vs 01 – I would always put in two arrays and have layered either

A) array based replication
or
B) HBM (software raid) over the two arrays.

(separate fabrics of course). Within my SAN (2PB 14 arrays stretched over 2 sites), we have had a (multi disk failure) just (once) in three years (that was 2 disk). and the Hot spares kick in very fast so the exposure time is very low, and we run RAID 10.

As for asking the array to choose which RAID 10 or 01 – for the question I would go with suppliers best practice.

Claire
