Sunday, November 14, 2010

The Thick and the Thin of Provisioning

Last week there was an interesting exchange between two storage industry analysts on the topic of thinly provisioned storage. The discussion revolved around the value of thin provisioning in the most common sense - lowering the cost of storage by avoiding the purchase of tomorrow's storage at today's prices. I'm not going to rehash that discussion other than to say that I agree with the proposition that thin provisioning can lower total cost of ownership. However, I came to realize that there's more to be said for thin provisioning than cost alone.

First, let's agree on something which should really be obvious but I think needs to be stated up front in any discussion on thin provisioning.  


If this seems contrary to what you understand thin provisioning to be, then blame the marketing hype. One of the questions I'm most frequently asked is, "What happens when my thinly provisioned volumes fill up?" and my answer is, "The same thing that happens when your thickly provisioned volumes fill up!" In other words, don't use thin provisioning to try to avoid best practices and planning. Unless you have a very firm grip on your storage growth and a smooth, responsive procurement process (and a planned budget), you're better off not over-allocating your storage. That's my two cents.

Now that I've made at least one industry analyst happy, let me explain why I still think thin provisioning is a feature worth having and a necessary part of a storage virtualization solution's feature set (yes, on the array).

As a storage administrator, having information about actual capacity utilization is pure gold. It's not good enough to know how much storage you've provisioned - you really need to understand how that storage is being used in order to drive efficiency and control cost (not to mention plan and justify upgrades). In many shops the storage and application teams are siloed, and information about storage utilization above block-level provisioning is usually difficult to obtain and very likely inaccurate. Consider also that storage consumption can be reported at many different levels and with many different tools. Collecting and reconciling that information can be time-consuming and frustrating.

In a thinly provisioned storage array the administrator can tell in an instant what the utilization rates are and can also trend utilization for planning and budgeting purposes. And, yes, the information can also be used to question new storage requests when existing storage assignments are poorly utilized or oversized.
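To make the reporting idea concrete, here is a minimal sketch of the kind of arithmetic involved. It isn't any vendor's API: the `Volume` record, the 25% threshold and the sample numbers are all assumptions for illustration, and the trend is just a least-squares line through daily "used" readings.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    # Hypothetical record; a real array would report these figures itself.
    name: str
    provisioned_gb: float
    used_gb: float


def utilization(vol):
    """Fraction of the provisioned size that actually holds data."""
    return vol.used_gb / vol.provisioned_gb


def poorly_utilized(volumes, threshold=0.25):
    """Names of volumes using less than `threshold` of what they were given."""
    return [v.name for v in volumes if utilization(v) < threshold]


def growth_per_day(samples):
    """Least-squares slope of (day, used_gb) samples, in GB per day."""
    n = len(samples)
    mean_x = sum(day for day, _ in samples) / n
    mean_y = sum(used for _, used in samples) / n
    num = sum((day - mean_x) * (used - mean_y) for day, used in samples)
    den = sum((day - mean_x) ** 2 for day, _ in samples)
    return num / den


def days_until_full(samples, pool_capacity_gb):
    """Days until the pool fills at the current trend, or None if not growing."""
    rate = growth_per_day(samples)
    if rate <= 0:
        return None
    last_used = samples[-1][1]
    return (pool_capacity_gb - last_used) / rate


volumes = [Volume("sql01", 500, 400), Volume("files02", 1000, 120)]
history = [(0, 100.0), (10, 110.0), (20, 120.0)]  # (day, GB used in the pool)
print(poorly_utilized(volumes))
print(days_until_full(history, pool_capacity_gb=200.0))
```

The first print flags the volume I'd ask about before approving another request; the second is the sort of number that goes into a budget conversation.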

Although thin provisioning is available at other layers of the stack outside the array, that doesn't devalue the single-pane management benefits of array-based thin provisioning. For example, VMware administrators must choose among three storage provisioning options when creating a virtual disk (zeroedthick, thin and eagerzeroedthick). There may be a rationale for using thin provisioning tools within the LVM or application, but that should only apply to use cases within that system or solution - in Compellent's case it doesn't matter, because the block-level device will be thinly provisioned regardless of any higher-layer sparse allocation. In short, a shared storage model requires thin provisioning at the storage layer to drive efficiency for the entire environment (which, by the way, is justification for storage virtualization at the array in general).

Thin provisioning could be considered a side effect of virtualizing storage, and it actually assists in the delivery of other virtualization features such as snapshots, automated tiering, cloning and replication. Foremost is a reduction in the amount of work needed to manage storage on behalf of those features. With thick volumes the zeroed space must be handled as if it were "load bearing" - and for volumes sized for growth this could be significant on a large scale. For example, a new 100GB LUN, thickly provisioned, would need to be allocated storage from some tier. Maybe that's all tier 1 storage, which would eat up expensive capacity. Maybe it's tier 3 storage, which means performance might suffer while the system figures out that some pages just got very active and need to be promoted to a higher tier. Even if some rough assumptions were made and the LUN were provisioned from a 50/50 split of high-performance and lower-cost storage, there would still be some inefficient use of the overall array.
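Here's a rough sketch of the 100GB LUN example, just to show where the difference comes from. The 2MB page size, the tier names and the `ThinVolume` class are my own assumptions for illustration, not how any particular array implements it.

```python
class ThinVolume:
    """Toy thin volume: a page is allocated only when it is first written."""

    def __init__(self, size_gb, page_mb=2):
        # page_mb is an assumed page size, not a vendor figure.
        self.page_bytes = page_mb * 1024 * 1024
        self.total_pages = size_gb * 1024 // page_mb
        self.pages = {}  # page index -> tier the page was allocated from

    def write(self, offset_bytes, length_bytes, tier="tier1"):
        if length_bytes <= 0:
            return
        first = offset_bytes // self.page_bytes
        last = (offset_bytes + length_bytes - 1) // self.page_bytes
        for idx in range(first, last + 1):
            # Keep the existing placement if the page is already allocated.
            self.pages.setdefault(idx, tier)

    def allocated_gb(self):
        return len(self.pages) * self.page_bytes / 1024 ** 3


def thick_allocated_pages(size_gb, page_mb=2):
    """A thick volume claims every page up front, written or not."""
    return size_gb * 1024 // page_mb


lun = ThinVolume(size_gb=100)
lun.write(0, 10 * 1024 ** 3)       # the host has written 10GB so far
print(thick_allocated_pages(100))  # pages a thick LUN must place in some tier
print(len(lun.pages))              # pages the thin LUN actually occupies
print(lun.allocated_gb())
```

Only the written pages have to live on a tier at all, so tiering, snapshots and replication have that much less to place, track and move.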

Likewise, a feature that involves making a copy of stored data, such as cloning or replication, would be more costly if the entire thick (and underutilized) volume were being copied. Many storage virtualization products can create a volume as a golden image for booting and then assign thinly provisioned copies of that boot image to new servers, conserving storage. Without thin provisioning you could still dole out servers from a golden image, of course, but why not deduplicate? Yes, thin provisioning is a form of deduplication when you think about it.
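The golden image case can be sketched the same way. This is a simplified copy-on-write model with made-up names (`GoldenImage`, `ThinClone`) and block counts, assuming each clone stores only the blocks it has changed and reads everything else from the shared image.

```python
class GoldenImage:
    """Read-only boot image shared by every clone."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block index -> data


class ThinClone:
    """Clone that stores only the blocks it has overwritten."""

    def __init__(self, base):
        self.base = base
        self.delta = {}  # block index -> this clone's private data

    def read(self, idx):
        # Private copy wins; otherwise fall through to the shared image.
        # None means the block was never written anywhere.
        if idx in self.delta:
            return self.delta[idx]
        return self.base.blocks.get(idx)

    def write(self, idx, data):
        self.delta[idx] = data


def stored_blocks(base, clones):
    """Blocks physically stored: the image once, plus each clone's changes."""
    return len(base.blocks) + sum(len(c.delta) for c in clones)


image = GoldenImage({i: b"os" for i in range(1000)})
servers = [ThinClone(image) for _ in range(10)]
for n, server in enumerate(servers):
    for idx in range(5):  # each server changes a handful of blocks
        server.write(idx, b"host%d" % n)

print(stored_blocks(image, servers))          # thin clones
print(len(image.blocks) * (len(servers) + 1)) # image plus ten full copies
```

In this toy case ten servers cost 1,050 stored blocks instead of 11,000, which is the deduplication effect I'm talking about.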

Far from being a new problem to deal with, thin provisioning is a key feature in any virtualized storage solution, and you don't have to over-allocate your storage to get value from it.
