RAID 10 is one of the most important and commonly used RAID levels today. RAID 10 is, of course, what is known as compound or nested RAID, where one RAID level is nested within another. In the case of RAID 10, the “lowest” level of RAID, the one touching the physical drives, is RAID 1. The nomenclature of nested RAID is that the leftmost number is the level touching the physical drives, and each number to its right is the RAID level applied on top of the arrays formed by the number before it.
So RAID 10 is a number of RAID 1 (mirror) sets that are joined together in a RAID 0 (non-parity stripe) set. There is a certain common terminology, principally championed by HP, that treats even RAID 1 as simply a subset of RAID 10: a RAID 10 array where the RAID 0 length is one, meaning a stripe across a single mirror set. A quirky way to think of RAID 1, to be sure, but it actually makes many discussions and comparative calculations easier and makes sense in a practical way for most storage practitioners. Thinking of RAID 1 as a “special name” for the smallest possible RAID 10, and then allowing all RAID 10 permutations to exist on one continuum for calculations, makes sense.
Likewise, HP refers to a solitary drive attached to a RAID controller as a RAID 0 set with a stripe of one. So the application of that terminology to the RAID 10 world is actually more obvious and sensible when looked at in that light. However, neither HP nor any other vendor today applies this same naming oddity to other array types, such as treating RAID 5 as a subset of RAID 50 or RAID 6 as a subset of RAID 60, even though they can be thought of in exactly the same way that RAID 1 relates to RAID 10.
If we carry that same logic to the next level, figuratively and literally, we can take multiple RAID 10 arrays and stripe them together in another RAID 0. This seems odd but can make sense. The result is a stripe of RAID 10s or, to write it out, a stripe of stripes of mirrors (we generally state RAID from the top down, but the nomenclature is written from the bottom up). So, as this is RAID 1 on the physical drives, then a stripe of those mirrors, and then a stripe of those resultant arrays, we get RAID 100 (R100).
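The naming rule is mechanical enough to sketch in a few lines of Python. This is only an illustration of the convention described above: the function names and the digit-to-name table are made up for this example and are not part of any vendor tool.

```python
# Hypothetical lookup table: names chosen for illustration only.
LEVEL_NAMES = {
    "0": "stripe",
    "1": "mirror",
    "5": "parity set",
    "6": "double-parity set",
}


def layers_bottom_up(name: str) -> list[str]:
    """Read a nested RAID name left to right; the first digit touches the drives."""
    layers = []
    for depth, digit in enumerate(name):
        # Layers are counted from 1, so "layer 1 arrays" are the arrays
        # built directly on the physical drives.
        below = "physical drives" if depth == 0 else f"layer {depth} arrays"
        layers.append(f"RAID {digit} ({LEVEL_NAMES[digit]}) over {below}")
    return layers


def spoken_top_down(name: str) -> str:
    """State the same array from the top down, the way it is usually spoken."""
    kinds = [LEVEL_NAMES[digit] for digit in reversed(name)]
    return " of ".join([kinds[0]] + [kind + "s" for kind in kinds[1:]])


print(layers_bottom_up("100"))
print(spoken_top_down("100"))
```

Reading "100" left to right gives the bottom-up construction, while reversing the digits gives the spoken, top-down form.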
RAID 100 is, of course, rare and odd. However, one extremely important RAID controller manufacturer, LSI, utilizes R100 and, consequently, so does its downstream integration vendor, Dell.
Fortunately, because non-parity stripes introduce few behavioral oddities and have near zero overhead or latency, this approach is really not a problem, although it can lead to a great deal of confusion. For all intents and purposes, RAID 100 behaves exactly like RAID 10 when all of the RAID 10 subsets are identical to one another.
In theory, a RAID 100 could be made up of many disparate RAID 10 sets of varying drive types, spindle counts and speeds. In theory, a RAID 10 could also be made up of disparate RAID 1 sets, but this is far more limited in potential or likely variation. RAID 100 could, theoretically, do some pretty bizarre things if left unchecked. In practice, though, any RAID 100 implementation will likely, as LSI’s implementation does, enforce standardization and require that the RAID 10 subsets be as identical as the controller is capable of enforcing. So each subset will be effectively uniform, keeping the overall behavior the same as if the same drives were set up as RAID 10.
Because the behavior remains identical to RAID 10, there is an extremely strong tendency to avoid the confusion of calling the array RAID 100 and to simply refer to it as RAID 10. This would work fine except for the semi-necessary quirk of needing to specify the geometry of the underlying RAID 10 sets when building a RAID 100. LSI, and therefore Dell, requires that you specify the underlying RAID 10 geometry at the time of setting up a RAID 100 set, but since the array is labeled as RAID 10, this makes no sense. A bizarre situation indeed.
To further complicate matters, because of the desire to maintain a façade of using RAID 10 rather than RAID 100, proper terminology is eschewed and, instead of referring to the underlying RAID 10 members as “RAID 10 arrays” or “RAID 10 subsets,” they are simply called “spans.” Span, however, is a term already used for something else in storage, and it doesn’t apply properly here. Span is in no way a proper description of a RAID 10 set under any condition.
But if we agree to use the term span to refer to a RAID 10 subset of a RAID 100 array, we can move forward pretty easily. We want as many spans as we can get, so that the underlying RAID 10 subsets are as small as possible. If we make them small enough they actually collapse into RAID 1 sets (HP’s odd RAID 10 with a stripe length of one) and our RAID 100 collapses into a RAID 10 with the middle stripe, rather than the outside stripe, being the one that disappears! Bizarre, yes, but practical.
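To make the collapse concrete, here is a small Python sketch that groups drives into spans of mirror pairs. The function name and the sequential pairing of drive IDs are assumptions made purely for illustration; a real controller decides for itself which drives land in which span.

```python
def build_spans(drive_ids: list[int], span_count: int) -> list[list[tuple[int, int]]]:
    """Group drives into spans; each span is a list of two-drive mirror pairs.

    Hypothetical layout: drives are paired in the order given, which is an
    assumption for this sketch and not how any specific controller assigns them.
    """
    if not drive_ids or span_count < 1 or len(drive_ids) % (2 * span_count) != 0:
        raise ValueError("drives must divide evenly into spans of whole mirror pairs")
    # Bottom layer: RAID 1 mirror pairs on the physical drives.
    pairs = [(drive_ids[i], drive_ids[i + 1]) for i in range(0, len(drive_ids), 2)]
    # Middle layer: each span stripes across its share of the mirror pairs.
    pairs_per_span = len(pairs) // span_count
    return [pairs[i:i + pairs_per_span] for i in range(0, len(pairs), pairs_per_span)]


drives = list(range(8))
print(build_spans(drives, 2))  # two spans, each a RAID 10 of two mirrors
print(build_spans(drives, 4))  # four spans, each a lone RAID 1 mirror
```

With eight drives and four spans, every span holds exactly one mirror pair, so the middle stripe has nothing left to stripe across and the whole array is, in effect, a plain RAID 10.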
So how do we apply this in real life? Quite easily. In a RAID 100 array we must specify a count of spans to be used. Since we want each span to contain two physical drives, so that each span is a simple RAID 1, we simply take the total number of drives in our RAID 100 array, which we will call N, and divide that by two. So the desired span count for a normal RAID 100 array is simply N/2. This means if you have a two drive array, you want one span. Four drives, two spans. Six drives, three spans. Twenty-four drives, twelve spans. And so on.
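The N/2 rule is trivial, but writing it down as a tiny Python function makes the one real constraint explicit: the drive count must be even. The function name is just a label for this sketch and does not correspond to any controller utility.

```python
def desired_span_count(total_drives: int) -> int:
    """Return N/2 so that every span is a single two-drive RAID 1 mirror."""
    if total_drives < 2 or total_drives % 2 != 0:
        raise ValueError("a mirrored array needs an even number of drives, at least two")
    return total_drives // 2


for n in (2, 4, 6, 24):
    print(n, "drives ->", desired_span_count(n), "spans")
```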
Do not be afraid of RAID 100. For normal users it simply requires some additional knowledge of how to select the proper number of spans. It would be ideal if this were calculated automatically and kept hidden, allowing end users to think of the arrays in terms of RAID 10. Alternatively, the arrays could be labeled consistently as RAID 100 to make it clear what the span must represent. Or, of course, vendors could simply use RAID 10 instead of RAID 100. But given the practical state of reality, dealing with RAID 100, once it is understood, is easy.