Wednesday, November 16, 2011

vCenter Orchestrator - You Own It, Use It!

At lunch with a customer this week, the subject of automation came up, as it seems to more and more frequently these days.  They were looking for a way to automate server provisioning so that they could reduce time to market.  What product would help them with that, they asked.

You already own it, was my answer.

If you're a VMware administrator you're probably aware of vCenter Orchestrator at least in passing.  If you're like me, you may have even played with it a little and moved on to other things because you couldn't really figure out how it all works.

On the other hand, you may have never even heard of vCenter Orchestrator and you don't know why you'd want to learn about it, much less use it.

What is it?

vCenter Orchestrator, or vCO, is a workflow engine that can help administrators automate tasks.  Simple, yet powerful statement, right?

In fact, vCO comes out of the box with lots of pre-built workflows for vSphere administration as well as plugins for things like Active Directory, SOAP, REST ... there's even a plugin by Cisco for UCS.  Naturally, VMware also provides plugins for other products like vCloud Director and Update Manager.  The community is another source of plugins, such as Joerg Lew's PowerSSHell plugin, and VMware Labs offers some unsupported/experimental plugins as well.


With vCO, a VMware administrator can create, using a drag-and-drop interface, a workflow to provision new servers - including deploying a template, customizing the guest and installing applications.

Did I say it was drag and drop?  That's the best part for salty old infrastructure folks like me - you know, the kind of guy who does his scripting through the Google search interface.  vCO doesn't require that you be a hacker, but it also doesn't limit you from creating some pretty complex workflows that include custom scripts.  The good news is, you can do a lot of powerful things, pretty easily, with the built in workflows.

What can it do?

Here's a list of examples from the vCO community --




How do I get it?

As the title of this post indicates, if you own vCenter Server you already have vCO.  Recently, VMware released a vCO virtual appliance so installation is super easy.

Burke has a good guide for installing vCO on a Windows server if you like to do things the hard way.


How do I get started with vCO?

There's actually a pretty good community out there already with example workflows, step-by-step guides and video demonstrations...


VMware vCenter Orchestrator Blog
VMware Orchestrator Community
vCO Portal
VCO Team
Mighty Virtualization
Cody Bunch - you can pre-order his book "Automating vSphere"

Tuesday, November 15, 2011

Setting the Citrix CLIENTNAME to match the View client host name

Often, applications will use the Citrix client name to assign location-based resources (printers, for example).  In the case of a Citrix XenApp session launched from a Windows-based View desktop, location awareness is lost because the Citrix client name takes on the Windows host name of the VM, not of the endpoint (a thin client or the View client running on a physical desktop).

There are two ways that I have seen to address this - and personally I have a preference for one of those methods, but you can decide which works best for you.

First, it's important to understand a couple of things.

The View agent on the VM captures information about the client and stores that data under the registry key "HKCU\Volatile Environment\"; one of the values written there is "ViewClient_Machine_Name".

Citrix clients store the CLIENTNAME attribute in "HKLM\Software\Citrix\ICAClient\CLIENTNAME".

By default, the Citrix plugin installs with dynamic client name enabled.  This means that the CLIENTNAME registry value is populated with the host name of the machine running the Citrix client.  In other words, by default CLIENTNAME will be the VM's OS host name.

One method of changing the CLIENTNAME to the endpoint host name is to read the "ViewClient_Machine_Name" value and overwrite "CLIENTNAME" with that string.  A script to do just that might look like this:

Option Explicit
Dim SysVarReg, Value
Set SysVarReg = WScript.CreateObject("WScript.Shell")
' Read the endpoint host name captured by the View agent
Value = SysVarReg.RegRead("HKCU\Volatile Environment\ViewClient_Machine_Name")
' Overwrite the Citrix client name with the endpoint host name
' Note: writing to HKLM requires rights to that key - adjust permissions if users aren't local admins
SysVarReg.RegWrite "HKLM\Software\Citrix\ICAClient\CLIENTNAME", Value

You would want this script to run on connection and reconnection, so that when a roaming desktop user changes endpoints the CLIENTNAME is updated to match for location awareness.
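
One way to wire that up is through the View agent's ability to run commands at connect and reconnect.  The sketch below assumes the CommandsToRunOnConnect and CommandsToRunOnReconnect values under the agent's Configuration registry key do that job - verify the exact path, value names and value type against the View documentation for your agent version, and note the script path is a hypothetical example.  A minimal Python sketch to register the script (run once, elevated, in the parent image):

import winreg  # Windows-only; run elevated so HKLM is writable

# Assumed View agent configuration key and value names - verify before use
AGENT_KEY = r"SOFTWARE\VMware, Inc.\VMware VDM\Agent\Configuration"
# Hypothetical path to the VBScript above
SCRIPT_CMD = r"wscript.exe C:\Scripts\SetCitrixClientName.vbs"

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, AGENT_KEY, 0,
                        winreg.KEY_SET_VALUE) as key:
    # Some agent versions expect REG_MULTI_SZ here (one command per line);
    # adjust the value type if that's the case for yours.
    winreg.SetValueEx(key, "CommandsToRunOnConnect", 0, winreg.REG_SZ, SCRIPT_CMD)
    winreg.SetValueEx(key, "CommandsToRunOnReconnect", 0, winreg.REG_SZ, SCRIPT_CMD)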

To use this method, you would need to reinstall the Citrix plugin on the VM to disable dynamic client name.

I like this approach because it's elegant and simple to troubleshoot.  I would recommend using BGInfo to add three bits of info to the VM desktop wallpaper to help validate that the CLIENTNAME is correct and to aid in any troubleshooting.  For example, I would have the following appear in a small font in the lower left corner of the desktop:

  View VM -
  View Client -
  Citrix Client Name -

The other method would be to leave Citrix dynamic client name enabled and change the host name of the VM on connect/reconnect to match the endpoint host name (or include the endpoint host name in the VM host name).  I'm not a huge fan of this approach because it involves constantly changing the host name of the VM, which could impact other applications as well as complicate troubleshooting.  You also need to make sure that the TCP/IP host name is changed as well, resulting in a DNS update if you're running dynamic DNS.

I do have access to a script to perform a VM host name change, but since I didn't write it I don't want to post it here.  I will check with the author to see if I can add it to this post later, but the script above should give you a start on creating that script.

Thursday, January 27, 2011

Changes

Tomorrow I will make my last sales call for Compellent.  It's been a great, albeit short, ride with an up-and-coming storage company and I appreciate everything about the past 13 months.  There are so many great people at Compellent whom I will miss and I wish them the very best in whatever the future holds - this year will be very exciting at Compellent.  Godspeed to the entire team and thanks for the memories.

I will continue this blog (with hopefully more frequency - but we'll see).  Although most postings were related to Compellent's products and value I don't have any regrets about anything I've posted and still stand behind each and every post.  My new role will afford me the opportunity to champion not only Compellent but also other storage technologies - particularly as they relate to virtualization and cloud infrastructure.

Beginning next week I will join VMware as a Systems Engineer - doing pretty much the same thing I was doing with Compellent but with a much broader canvas, more brushes and a large palette of colors.

Stay tuned!

As always, my opinions are my own and not those of my employer.

Tuesday, January 25, 2011

A Pool of Storage

I saw a question pop up on Twitter regarding the difference between "free allocated vs unallocated disk space" on a Compellent Storage Center.  I believe the question arises out of this screen from the System Manager GUI:

The bar graph shows the storage utilization for disks in the default "Assigned" disk folder (which is typically the way a system is deployed).

The blue shaded area shows disk space used by volumes - easy enough.  These are allocated pages with user data on them.

The light blue (or clear, I haven't ever figured that out since the background is the same color!) area represents something called "Free allocated space"...

Huh? 

In a Compellent storage system (and, as far as I'm aware, in pretty much all virtualized storage systems) disk blocks are carved up into pages or extents - the terminology differs but it's essentially the same concept.  Compellent calls them pages, and the aggregation of all pages in the system is referred to as the page pool.  The default page size for Compellent is 2MB (4,096 512-byte blocks), although 4MB and 512KB pages can be used in specific cases.

You may be aware that Storage Center uses variable RAID protection based on pages being writeable or part of a read-only replay (snapshot).  In general writeable pages are RAID 10 protected and replay pages are RAID 5 (dual parity schemes may also be configured as in the graphic below).  Storage Center prepares pages for user data ahead of time for use by volumes - some will be prepared at each RAID protection level on each disk tier.  The graphic below from the Data Progression Pressure Reports shows this.


Here again we see the dark blue/light blue bar graph.  Note the RAID 10-DM (dual mirrored) fast track allocation.  1.39GB of space is in use by volumes - but another 48GB of disk space has been prepared at this protection scheme (using the outermost tracks of all Tier 1 disks).  This is actually from the Storage Center 5.3 manual, so I'm not sure what was going on with this system, but your typical real world system would likely have more "disk used" of course.

But hopefully you get the point.  The "free allocated" or light blue shaded section of the bar on the first graph is the amount of disk space consumed by all of these prepared pages, across all tiers.

So what's unallocated?  That's simply disk space that hasn't been turned into pages and assigned to a RAID level - or hasn't been "formatted for use" if you will.  No worries: as you consume "free allocated" pages you'll see the "unallocated" space decrease as it's used to back-fill the "free allocated" pool.  By the way, expired replays and deleted volumes return pages to "free allocated" status.
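
To make the bookkeeping concrete, here's a quick back-of-the-napkin sketch in Python (the capacity figures are made up for illustration, not pulled from a real Storage Center):

# Illustrative page-pool math - the capacity figures are made up, not from a real system
PAGE_SIZE_MB = 2                     # default Compellent page size (4,096 x 512-byte blocks)

raw_disk_gb       = 10_000           # total capacity in the Assigned disk folder
used_gb           = 3_200            # dark blue: allocated pages holding user data
free_allocated_gb = 1_800            # light blue: pages prepared at a RAID level, not yet written
unallocated_gb    = raw_disk_gb - used_gb - free_allocated_gb  # not yet carved into pages

# Page counts ignore RAID overhead to keep the arithmetic simple
print(f"Pages holding data:  {used_gb * 1024 // PAGE_SIZE_MB:,}")
print(f"Pages prepared/free: {free_allocated_gb * 1024 // PAGE_SIZE_MB:,}")
print(f"Unallocated:         {unallocated_gb} GB, carved into pages as the free pool is consumed")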

Sunday, December 12, 2010

Sub-LUN Tiering Design Discussion

Nigel Poulton's recent blog posting on storage architecture considerations when using sub-LUN tiering is very thoughtful and I appreciate his approach and concern for the subject. Indeed, one of the challenges for me in working with new Compellent customers is helping them understand the different approach to storage design using automated tiering with sub-LUN granularity.

I wanted to address one point which is particular to Compellent (and maybe some others, I'm not certain), and that is that RAID striping is dynamic, variable and part of the Data Progression process. In Nigel's example, he shows a three-tiered system with various drive types and sizes already pre-carved to suit RAID protection (in the example case, all RAID 6 protected). In the Compellent world, the only decision administrators need to make is an appropriate level of parity protection per tier (typically based on drive size and speed, which all goes back to rebuild times). As a best practice, customers are advised to use dual parity protection (which includes both RAID 6 and RAID 110*) for drives 1TB or larger.

That aside, I tend to agree with Nigel on a three-tiered approach when bringing SSD into the picture. However, in configurations with spinning rust only, there's usually no need for both 15K and 10K drives, particularly given the capacities now available in 15K and the density of 2.5" 10K drives.

Two rules of thumb can help administrators plan for sub-LUN tiering -

Size for performance first and capacity second
Performance should never be addressed with slow, fat disks

Sizing for performance first allows you to potentially address all of your capacity in a single tier. Using 7200 RPM drives as your primary storage brings up issues of performance degradation during rebuilds, lower reliability and decreased performance with dual parity protection schemes. Rules of thumb, as I stated, so please no comments about exceptions - I know they exist.

Point is, using the rules above you can pretty easily draft a solution design if you understand the performance and capacity requirements.

For example, a solution requiring 4000 IOPS and 8TB of storage could be configured as:

Tier 1 - Qty 24 146GB 15K SAS drives (RAID10/RAID5-5)
Tier 2 - Null
Tier 3 - Qty 12 1TB 7200 SAS drives (RAID110/RAID6-6)

On the other hand, a solution needing only 2500 IOPS and 6TB could be designed with:

Tier 1 - Qty 24 450GB 10K SAS drives (RAID10/RAID5-5)
Tier 2 - Null
Tier 3 - Null

Additional capacity tiering could be added in the future as needed, provided that performance requirements don't change (grow). These are simplistic examples and they really only provide starting points for the solution; they would be tweaked for initial cost and for anticipated growth in demand.
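
If you want to sanity check designs like these, a rough sketch helps - the per-drive IOPS figures and the usable-capacity fudge factor below are generic rules of thumb of my own, not Compellent sizing numbers:

# Back-of-the-napkin tier check - generic per-drive IOPS assumptions, not vendor sizing figures
DRIVE_IOPS = {"15K": 180, "10K": 125, "7200": 75}

def tier_check(qty, size_gb, speed, usable_fraction=0.8):
    """Return (raw IOPS, rough usable GB); usable_fraction is a hand-wavy RAID/spare allowance."""
    return qty * DRIVE_IOPS[speed], qty * size_gb * usable_fraction

# Example 1: 4000 IOPS and 8TB required
t1 = tier_check(24, 146, "15K")     # ~4320 IOPS, ~2.8TB usable
t3 = tier_check(12, 1000, "7200")   # ~900 IOPS,  ~9.6TB usable
print("Example 1:", t1[0] + t3[0], "IOPS,", round((t1[1] + t3[1]) / 1000, 1), "TB usable")

# Example 2: 2500 IOPS and 6TB required, single 10K tier
t2 = tier_check(24, 450, "10K")     # ~3000 IOPS, ~8.6TB usable
print("Example 2:", t2[0], "IOPS,", round(t2[1] / 1000, 1), "TB usable")
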
So far those examples don't include SSD, I know. However, that's going to depend a good bit on the application requirements and behavior and this is where sub-LUN tiering adds value but system design gets a bit more difficult.

Consider an example where 600 virtual desktops are being deployed using VMware View 4.5 - we have the ability to create data stores for three different components in that environment:

  • Read only boot objects can be isolated and stored on the fastest storage (SSD) and are space efficient since multiple guests can read from the same data set.
  • Writeable machine delta files can be stored on 15K or even 10K drives to avoid problems associated with SSD overwrite performance.
  • User data can be stored on two tiers - high performing disk for active data and capacity drives for inactive pages.
So in this case we might deploy a solution similar to this very high level design (I'm assuming some things here and really won't go into details about images per replica, IOPS per image, or parent image size); a rough sanity check follows the tier layout below:

Tier 1 (Replica boot objects) Qty 6 200GB SSD SAS
Tier 2 (Delta and user disks) Qty 48 600GB 10K SAS
Tier 3 Null (may be added in future for capacity growth)
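
Purely for illustration (the per-desktop figures below are hypothetical placeholders, not the assumptions behind this design), here's how the I/O might land on each tier:

# Hypothetical VDI I/O split - the per-desktop figures are placeholders, not design assumptions
desktops         = 600
iops_per_desktop = 6        # steady state, hypothetical
read_fraction    = 0.2      # steady-state VDI I/O is mostly writes

total_iops = desktops * iops_per_desktop
reads      = total_iops * read_fraction        # largely absorbed by the SSD replica tier
writes     = total_iops * (1 - read_fraction)  # lands on the 10K delta/user-data tier

tier2_raw_iops = 48 * 125   # 48 x 10K drives at a rough 125 IOPS each
print(f"Reads toward Tier 1 (SSD replicas): {reads:.0f} IOPS")
print(f"Writes toward Tier 2 (10K SAS):     {writes:.0f} IOPS vs ~{tier2_raw_iops} raw drive IOPS")
# With a RAID 10 write penalty of 2, ~2880 front-end writes become ~5760 back-end IOs,
# which is right at the edge of this shelf - boot storms lean on the SSD tier instead.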

In the end, the administrator really only has to watch two overall metrics in the environment for planning and growth trending: performance and capacity. Either could be added to the solution to address that specific need only.

Again, this is all slanted toward the Compellent architecture but I do appreciate Nigel bringing this up as storage customers are going to be facing this more often and should start to get a handle on it sooner rather than later.

* My term for Compellent's Dual Mirrored RAID 10 scheme; I'm always trying to get a new phrase to catch on :)

Tuesday, November 23, 2010

The Morning After

Compellent's big news yesterday generated a lot of traffic and I'm just catching up having otherwise been engaged in pre-sales meetings.  Overall, I think the Compellent marketing crew did a great job and is to be commended for delivering the message globally and in a consistent manner.  And the message seems to have been well received. 


Chris Mellor quoted Chris Evans in a great writeup over at The Register.  I have a great deal of respect for both of them but wanted to respond specifically to one key point concerning Live Volume (red font coloring is my doing):

Live Volume will place an extra processing burden on the controller. Storage consultant Chris Evans said: "With Live Volume, as a LUN is moved, the new target array has to both serve data and move data to the new location.  This put significant extra load on the new target. I don't know how many arrays can be in Live Volume, but I would imagine the intention from Compellent would be to have a many to many relationship. If that's the case then I can see a lot of that extra [controller] horsepower being taken up moving data between arrays and handling I/O for non-local data."
Keep in mind that Live Volume is an extension of Remote Instant Replay (Compellent's replication suite) with the ability to mount the target while replication is underway.  In other words, no data is being moved that wouldn't normally be moved during a replication job.  The additional functionality serves IO at the target site by having the target replication device become a pass-through to the source.  The cut over of a volume from one system (array) to a target basically involves the same computational workload as activating the DR site under a traditional replication scheme.  I guess maybe Chris (Evans) is referring to the pass-through IO on the target side being an extra burden but if you consider that the whole point is to transfer workloads then I don't see an undue burden being placed on the target system - it will assume the source role if IO or throughput exceeds the configured threshold anyway.

Like Chris Evans, I can see Live Volume evolving into a many-to-many product eventually, since Remote Instant Replay already supports this type of replication.  In fact, the possibilities are exciting and I'm sure (but not in the loop sad to say) that more enhancements will be coming - personally I'd like to see some level of OS awareness of this movement so that outside events could trigger Live Volume movement.

Sunday, November 14, 2010

The Thick and the Thin of Provisioning


Last week there was an interesting exchange between two storage industry analysts on the topic of thinly provisioned storage.  The discussion revolved around the value of thin provisioning in the most common sense - lowering cost of storage by avoiding the purchase of tomorrow's storage at today's prices.  I'm not going to rehash that discussion other than to say that I agree with the proposition that thin provisioning can lower total cost of ownership.  However, I came to realize that there's really more to be said for thin provisioning.

First, let's agree on something which should really be obvious but I think needs to be stated up front in any discussion on thin provisioning.  

THIN PROVISIONING ≠ OVER ALLOCATION

If this seems contrary to what you understand thin provisioning to be then blame the marketing hype.  One of the questions I'm most frequently asked is, "What happens when my thinly provisioned volumes fill up?" and my answer is "The same thing that happens when your thickly provisioned volumes fill up!"  In other words, don't use thin provisioning to try and avoid best practices and planning.  Unless you have a very firm grip on your storage growth and have a smooth and responsive procurement process (and planned budget) you're better off not over allocating your storage.  That's my two cents.

Now that I've made at least one industry analyst happy let me explain why I still think thin provisioning is a feature worth having and a necessary part of a storage virtualization solution's feature set (yes, on the array). 

As a storage administrator, having information about actual capacity utilization is pure gold.  It's not good enough to know how much storage you've provisioned - you really need to understand how that storage is being used in order to drive efficiency and control cost (not to mention plan and justify upgrades).  In many shops, storage and the application teams are siloed and obtaining information about storage utilization above the block level provisioning is usually difficult and very likely not accurate.  Consider also that storage consumption can be reported on a lot of different levels and with many different tools.  Collecting and coalescing that information can be time consuming and frustrating.

In a thinly provisioned storage array the administrator can tell in an instant what the utilization rates are and also trend utilization for planning and budgeting purposes.  And, yes, the information can also be used to question new storage requests when existing storage assignments are poorly utilized or over sized.

Although thin provisioning is provided at various other layers of the stack outside of the array, that doesn't devalue the single-pane management benefits associated with array-based thin provisioning.  For example, VMware administrators must choose from three storage provisioning options when creating a virtual disk (zeroedthick, thin and eagerzeroedthick).   There may be a rationale for using thin provisioning tools within the LVM or application, but that should only apply to use cases within that system or solution - in Compellent's case it matters not, because the block level device will be thinly provisioned regardless of any higher layer sparse allocation.  In short, a shared storage model requires thin provisioning at the storage layer to drive efficiency for the entire environment (which, by the way, is justification for storage virtualization at the array in general).

Thin provisioning could be considered a side effect of virtualizing storage, and it actually assists in the delivery of other virtualization features such as snapshots, automated tiering, cloning and replication.  Foremost is the reduction in the amount of work that must be done to manage storage for those other features.  With thick volumes the zero space must be manipulated as if it were "load bearing" - and in the case of volumes sized for growth this could be significant on a large scale.  For example, a new 100GB LUN, thickly provisioned, would need to be allocated storage from some tier.  Maybe that's all tier 1 storage, which would eat up expensive capacity.  Maybe it's tier 3 storage, which means performance might suffer while the system figures out that some pages just got very active and need to be promoted to a higher tier.  Even if some rough assumptions were made and the LUN was provisioned out of a 50/50 split of high performance and lower cost storage, there's still going to be some inefficient use of the overall array.
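
To put some rough numbers on the 100GB example (the written fraction and the 50/50 tier split here are illustrative assumptions of mine, not measured data):

# Illustrative thick vs. thin comparison for the 100GB LUN above - the written
# fraction and the 50/50 tier split are assumptions, not measured data
lun_gb     = 100
written_gb = 20            # what the host has actually written so far

thick_consumed = lun_gb            # thick: every page allocated up front
thick_tier1    = lun_gb * 0.5      # a 50/50 split still parks 50GB on Tier 1 holding zeroes

thin_consumed = written_gb         # thin: only written pages consume capacity, and only
                                   # real pages get snapped, replicated or tiered

print(f"Thick: {thick_consumed} GB consumed ({thick_tier1:.0f} GB of it on Tier 1)")
print(f"Thin:  {thin_consumed} GB consumed; {lun_gb - written_gb} GB stays in the shared pool")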

Likewise, a feature which involves making a copy of stored data, such as cloning and replication, would be more costly if the entire thick (and underutilized) volume were being copied.  Many storage virtualization products provide a capability to create a volume as a golden image for booting and then assign thinly provisioned copies of that boot image to new servers, conserving storage.  Without thin provisioning you could still dole out servers from a golden image, of course but why not deduplicate?  Yes, thin provisioning is a form of deduplication when you think about it.

Far from being a new problem to deal with, thin provisioning is a key feature in any virtualized storage solution and you don't have to over allocate your storage to get value from it.