Monday, December 5, 2016

The Self-Healing Data Center Part 5: Configure an Alert to Trigger the vRO Workflow

Now that we have configured the Translation Shim and vRealize Orchestrator, the next task before testing is to configure vR Ops to send an alert to the shim.  This will be the final post in this series, and when you complete it you will be ready to apply this solution to automate any alerts you desire by simply creating the appropriate workflow and alert settings.

I am using vR Ops version 6.4 in the steps below, but this should work with any 6.0 or higher version of vR Ops.  I am also using Endpoint Operations to monitor the state of a service on a Linux OS.  Endpoint Operations is NOT required to use this shim; I am only using it because it provides an easy way to trigger an alert by stopping a service on a monitored OS.  It also shows that any automated remediation or activity is possible, not just automation of virtual infrastructure.

Saturday, December 3, 2016

The Self-Healing Data Center Part 4: Configuration of the vRO Example Workflow

So far we have discussed the reasoning behind the Translation Shim, installed it, configured it, and started and tested the shim server.  In this blog post, we will set up things on the Orchestrator side so that the workflow is ready to go when an alert is fired from vR Ops.

This blog post assumes you have familiarity with vRealize Orchestrator and some experience with importing workflows and working with the HTTP-REST plugin.  If these are new concepts for you, please don't be discouraged.  Some great references to get comfortable with Orchestrator are:

HOL-1721-SDC-5 - Introduction to vRealize Orchestrator
Blog from on using the REST plugin
Postman + vRO = HTTP-REST Plug-in Operations

Let's dive in.

Wednesday, November 30, 2016

The Self-Healing Data Center Part 3: Configuring and Testing the Orchestrator Translation Shim

So far, we've walked through installing the Translation Shims.  In this blog post we'll configure the Orchestrator shim for use.  The Orchestrator shim is but one of the handful of shims included in the solution.  More are being added via the community.  Participation is encouraged!

The Self-Healing Data Center Part 2: Installing the Translation Shims for Automating vR Ops Alerts

In the previous post, I explained some of the capabilities and limitations of automating vR Ops alert notifications via the REST Notification Plugin.  I also introduced the Translation Shims for Log Insight and/or vRealize Operations Manager Webhooks as a solution to these current limitations.  By the way, as the name indicates, this solution works great with Log Insight webhooks as well!  In fact, it was originally created for that purpose and vR Ops support was added later.

Monday, November 28, 2016

The Self-Healing Data Center Part 1: Using vR Ops with vRO to Automatically Remediate Alerts

If you are a user of vR Ops, you know that it can monitor your infrastructure, server OS, applications and more.  But as this commercial suggests, monitoring is only part of the answer.  Wouldn't it be much better to have vR Ops attempt some simple fixes before giving up and calling for human intervention?

In this blog post series, I will explain how to activate a vRealize Orchestrator workflow based on a vR Ops alert to fix an issue instead of just alerting you.  First some background.

Wednesday, August 24, 2016

A Postman Collection for Upgrading vR Ops Endpoint Operations Agent via REST API

With the release of vRealize Operations Manager (vR Ops) 6.3 this week, I noticed that I had not updated the Endpoint Operations (EP Ops) agents running in my lab since version 6.1.  As a peer pointed out, the 6.3 release notes specifically state that you should upgrade the EP Ops agents to 6.3 before upgrading vR Ops.

As there is already a KB referenced in the release notes, I won't go into the "supported" way to do an agent upgrade.  Rather, in this blog post I want to show how I used the Postman REST client to do the upgrades, as customers may wish to leverage something other than the provided Python script to perform the upgrades in bulk.  Using Postman, you can generate a number of different code snippets to use with your favorite automation tool (js, Ruby, shell script, etc.).

The upgrades for the supported method use the same API call - which is an "internal" API call (meaning, it's available but may be changed or removed in the future).

POST /internal/agent/upgrade

The body of the POST includes a payload with three elements (JSON example shown below).

{
  "agentId" : "1432528944061-6735281266450674401-1746278254068293921",
  "fileLocation" : "bundles/6.2.1",
  "agentBundleFile" : ""
}

Note that since this is an internal API endpoint, you need an additional header to permit the operation.

X-vRealizeOps-API-use-unsupported : True
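
For readers who would rather script the call than use Postman, here is a minimal Python sketch of how the request described above could be assembled.  Only the endpoint, the extra header and the three payload keys come from this post; the /suite-api base path, the build_upgrade_request helper, the host name and the bundle file name are assumptions for illustration.  Authentication is left out and nothing is actually sent.

```python
import json


def build_upgrade_request(vrops_host, agent_id, file_location, bundle_file):
    """Assemble (but do not send) the internal agent-upgrade request."""
    # Assumption: the REST API is served under /suite-api on the vR Ops node.
    url = "https://{}/suite-api/internal/agent/upgrade".format(vrops_host)
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        # Needed because this is an internal (unsupported) endpoint.
        "X-vRealizeOps-API-use-unsupported": "True",
    }
    payload = {
        "agentId": agent_id,
        "fileLocation": file_location,
        "agentBundleFile": bundle_file,
    }
    return {"method": "POST", "url": url, "headers": headers,
            "body": json.dumps(payload)}


request = build_upgrade_request(
    "vrops.example.local",            # hypothetical host name
    "1432528944061-6735281266450674401-1746278254068293921",
    "bundles/6.2.1",
    "example-agent-bundle.tar.gz",    # hypothetical bundle file name
)
print(request["method"], request["url"])
print(request["body"])
```

The returned dictionary can be handed to whichever HTTP client you prefer, which keeps the sketch independent of any one library.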

Easy enough, and in the Postman collection I have created for you there are three REST operations that can be run together to perform an upgrade on a single agent.  First, grab the collection from the link below to import into your Postman client (assuming you have Postman installed already).

Upgrade EP Ops Agent Collection

Also, grab the vR Ops environment I have created for the variables used in the collection.


Import that environment into your Postman client and edit the following keys:

{{user}} = vR Ops user name
{{pass}} = vR Ops password
{{vrops}} = vR Ops FQDN or IP address

I'll come back to the other keys in a moment, but I want to explain why they are required.

As you can see below, the collection includes a GET for the agentID based on a search against the FQDN of the agent's host system, then performs the upgrade for the agent based on that ID, and finally checks the status of the agent upgrade.

Additionally, the POST Upgrade Agent operation has some parameters in the request payload.  As you can see, the values for "fileLocation" and "agentBundleFile" are based on env variables as well.  The "fileLocation" is the path under the following directory structure on the vR Ops virtual appliance:


That location is where you will place the "agentBundleFile" for the upgrade (available from the vR Ops download page).  By the way, if you have a cluster deployment then you must have the bundle files installed on each node in the cluster.

Now back to the environment keys you need to update.

{{agentFQDN}} = agent's host FQDN (case sensitive)
{{fileLocation}} = truncated path for bundle files
{{agentBundleFile}} = complete filename of agent bundle to use for the upgrade (OS/arch specific).

Example vrops environment values - the value for agentID is updated by the Postman tests script

Once you have the {{agentFQDN}} value set, you can run the collection.  The test script on the GET Get Agent Status on Upgrade will fail, and that is to be expected as the upgrade will take a few minutes to complete.  You can run that operation independently as often as you wish to validate the success of the upgrade.  The test is looking for "COMPLETED" in "agent_upgrade_status" within the response.  Other values, such as "IN_PROGRESS_DOWNLOADING", "IN_PROGRESS_UPGRADING" and "FAILED TIMED OUT", are not evaluated in the script I provide, but be aware of them if you create a bulk upgrade script that evaluates the upgrade state for remediation or logging.

Example of the agent_upgrade_status in the response body of the Get Update Status operation
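
If you do build a bulk upgrade script, the status handling might look something like the Python sketch below.  It is only a sketch: the function names are hypothetical, the status GET itself is left to a caller-supplied function, and any value other than "COMPLETED" or an "IN_PROGRESS" state is simply treated as a failure.

```python
import time


def classify_upgrade_status(status):
    """Map an agent_upgrade_status value to 'done', 'wait' or 'failed'."""
    if status == "COMPLETED":
        return "done"
    if status.startswith("IN_PROGRESS"):
        return "wait"
    # Anything else (for example a failed or timed-out upgrade) counts as failed.
    return "failed"


def wait_for_upgrade(fetch_status, attempts=10, delay_seconds=30):
    """Poll fetch_status() until the upgrade finishes, fails, or attempts run out.

    fetch_status is a caller-supplied function (hypothetical here) that
    performs the status GET and returns the agent_upgrade_status string.
    """
    for _ in range(attempts):
        outcome = classify_upgrade_status(fetch_status())
        if outcome != "wait":
            return outcome
        time.sleep(delay_seconds)
    return "wait"  # still in progress when we stopped polling


# Demonstration with canned responses instead of a live vR Ops instance.
canned = iter(["IN_PROGRESS_DOWNLOADING", "IN_PROGRESS_UPGRADING", "COMPLETED"])
result = wait_for_upgrade(lambda: next(canned), delay_seconds=0)
print(result)
```

Passing the status lookup in as a function keeps the polling logic testable without a vR Ops instance.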

This should give you a general understanding of how you can use the vR Ops REST API to upgrade EP Ops agents.

Monday, August 22, 2016

vR Ops Alert "VMware Virtual Data Service" Is Not Available - What to Do?

While running vR Ops in my home lab, I noticed (thanks to the EP Ops agent) that a couple of VMware services were down.  One of those was sort of a mystery: the "VMware Virtual Data Service", which I will shorten to "vdcs" for this post.

Turns out that this service is responsible for the Content Library and Transfer services.  When you look for the service within the vSphere web client you won't find it listed under a friendly name but rather by the "vdcs" abbreviation (the full name is com.vmware.vdcs.cls-main).

I attempted to start this service from the web client and it didn't return an error in the UI but when I refreshed, it was still not running.  So, off to the logs!  The log file for this service can be found in /var/log/vmware/vdcs/wrapper.log and there I saw the problem!

ERROR  | wrapper  | 2016/08/22 19:28:26 | 4728 pid file, var/log/vmware/vdcs/vm, already exists.

FATAL  | wrapper  | 2016/08/22 19:28:26 | ERROR: Could not write pid file /var/log/vmware/vdcs/ Inappropriate ioctl for device

So, I have an orphaned PID file.  Just to validate, I run

ps -ef | grep vmware-vdcs

to make sure that process isn't running.  Then I back up the current PID file, delete it and start the vdcs service.

And give it 5 minutes for vR Ops to check on things.  My error is now cleared!
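
The manual check above (confirm that no process is running before removing the PID file) could also be scripted.  Here is a minimal Python sketch, assuming a POSIX system and a PID file that contains just the process ID.  The helper names are hypothetical, and since the vdcs PID file path is cut off in the log output above, the demo uses a temporary file instead.

```python
import os
import tempfile


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pid_file_is_orphaned(path):
    """Return True if the PID file does not name a running process."""
    with open(path) as handle:
        content = handle.read().strip()
    if not content.isdigit() or int(content) <= 0:
        return True  # nothing usable in the file, so it cannot match a process
    return not pid_is_running(int(content))


# Demo: a PID file holding this script's own PID is not orphaned.
with tempfile.NamedTemporaryFile("w", suffix=".pid", delete=False) as tmp:
    tmp.write(str(os.getpid()))
own_pid_orphaned = pid_file_is_orphaned(tmp.name)
print(own_pid_orphaned)
os.remove(tmp.name)
```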

Fortunately, I have vR Ops to report this - otherwise I would not have known that this service was down until I needed it to be up!

Thursday, July 28, 2016

Starting vRO Workflows with Log Insight Webhooks

Beginning with version 3.3 of Log Insight, alerts can be forwarded via a webhook.  Basically, any URL you designate will have an HTTP POST issued with the alert contents as a JSON body.  This feature provides some very basic capability and most use cases will require a bit more functionality.

For example, vRO workflows can be started from the vRO REST API, but you need to authenticate and prepare a JSON body with expected inputs at a minimum.

I learned of a nice shim, written in Python by some of my peers here at VMware.  You can read more about the general capability of the shim at this link.  The shim had the basic capability I needed; it just didn't support a vRO endpoint.  The authors (Alan Castonguay and Steven Flanders) invited contributions via a pull request, so I added the vRO shim, and I want to provide a little more information about its usage here in this blog post.  General install and usage instructions are on the GitHub page for the shim, so I won't cover that.


The "" shim does not include any authentication information.  By default, the Requests library falls back to the user's .netrc file if no auth options are given.  So, you will need to set up a .netrc in the user's home directory with the hostname, username and password, i.e.

machine vro-01a.corp.local
login administrator@vsphere.local
password VMware1!

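This has the benefit of keeping the credentials out of the shim code without adding additional code to handle them.  Keep in mind that .netrc stores the password in plain text, so restrict the file's permissions to the user running the shim.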
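
If you want to sanity-check a .netrc entry before pointing the shim at it, the netrc module in the Python standard library can parse the file.  The sketch below writes the three example lines above to a throwaway file and looks up the host.  It only checks that the entry parses and matches the host name; it is not how the shim itself reads the file.

```python
import netrc
import os
import tempfile

# The same three lines as the example entry, written to a temporary file so
# the sketch does not touch a real ~/.netrc.
sample = (
    "machine vro-01a.corp.local\n"
    "login administrator@vsphere.local\n"
    "password VMware1!\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".netrc", delete=False) as tmp:
    tmp.write(sample)

entries = netrc.netrc(tmp.name)

# authenticators() returns a (login, account, password) tuple for a known host.
credentials = entries.authenticators("vro-01a.corp.local")
login, password = credentials[0], credentials[2]
print(login)

# A host with no entry (and no "default" entry in the file) yields None.
missing = entries.authenticators("other-host.corp.local")
print(missing)

os.remove(tmp.name)
```

The host name in the "machine" line has to match the host name the shim uses in the vRO URL, otherwise no credentials are found.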

To Parse or Not to Parse?

The example code shows a workflow that accepts two inputs, both string.  One is "value" which I will talk about below in the next section.  The other, "alertName" is a value that can be retrieved via the parse() function that is part of the main file "" and does a nice job of returning a Python object that you can pull various bits of the alert payload from.  

Do you need to use the parsed alert?  Depends.  In the use case I was writing this shim for, I needed the entire payload in vRO so I could parse it there.  This is because the alert itself has a lot of variability - it is an alert that watches for NSX distributed firewall drops on port 22, by src/dst pair.  That src/dst pair could be different each time and there could be any number of those pairs in a single alert - practically impossible to anticipate for workflow input.  On the other hand, if your alert watches a specific system for a specific event... well, you probably don't even need to pass inputs.  Or maybe you just need the system name.  My recommendation is to try to make your Log Insight alerts as specific as possible and then deal with the remaining variables either in the Python code via the parse() function or, in the case of multiple variables, by passing the entire alert payload on to vRO for evaluation.

JSON Payload Serialization

I ran into a fun problem with trying to pass a JSON string to vRO as an input.  Basically, it confuses vRO because it reads the JSON string input value as part of the parameter input body.  My work-around was to base64-encode the JSON string from Log Insight so that vRO would just see one long string input and be happy.

So, this is why you see the line:

"value": base64.b64encode(request.get_data())

As discussed above, you might not need to do this.  However, if you do, you'll want to grab the CryptoJS actions package for vRO.  It has a base64 decode action and I tested it with the shim.
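
To make the encoding step concrete, here is a minimal Python 3 sketch of how a request body with the two string inputs could be built.  The nested "parameters" layout reflects my understanding of the vRO REST API execution body, so verify it against your vRO version.  The build_execution_body helper and the alert payload are made up for illustration.  Note that on Python 3 b64encode returns bytes, so the result is decoded to text before being placed in the JSON.

```python
import base64
import json


def build_execution_body(alert_name, raw_alert):
    """Build a JSON-serializable body with two string workflow inputs."""
    # b64encode returns bytes on Python 3; decode so json.dumps accepts it.
    encoded = base64.b64encode(raw_alert).decode("ascii")

    def string_param(name, value):
        return {"name": name, "type": "string",
                "value": {"string": {"value": value}}}

    return {"parameters": [string_param("alertName", alert_name),
                           string_param("value", encoded)]}


raw = b'{"AlertName": "example alert", "messages": []}'  # made-up alert payload
body = build_execution_body("example alert", raw)
print(json.dumps(body, indent=2))

# The receiving side reverses the encoding to get the original JSON back.
encoded_value = body["parameters"][1]["value"]["string"]["value"]
recovered = json.loads(base64.b64decode(encoded_value))
print(recovered["AlertName"])
```

Because the encoded value contains only base64 characters, it cannot be mistaken for part of the surrounding parameter JSON.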

Comments and feedback are welcome, of course.  The shim is provided as an example and should not be used in production.

Wednesday, March 23, 2016

Retooling the Infoblox vRA Plugin to Support Event Broker

In this blog post I will provide information on using the Infoblox "VMware Cloud Adapter" version 3.2 with the new Event Broker feature of vRA 7.

This does not require any modification of the Infoblox workflows and the adapter will continue to work as written by Infoblox.

This content is intended for readers comfortable with vRealize Automation and vRealize Orchestrator.  As always, I do not assume any risk nor do I provide support for using the modifications outlined below in your environment.


I will create a workflow wrapper for one of the IPAM workflows to allow it to be executed from an Event Broker subscription.

You can download an example of the workflow from VMware's Sample Exchange.


I just finished a proof-of-concept with a customer where the Infoblox IPAM plugin for vRA/vRO was used to integrate DDI with vRA.  It's a great solution for customers who own or are considering Infoblox IPAM, with one drawback - it currently supports only using Workflow Stubs in vRA version 7.

While that's not a huge problem, it lacks the flexibility and usefulness of the Event Broker's Event Subscription capability.  If you're not familiar with the Event Broker, you can learn more in this webinar I conducted on vRA 7 extensibility.  Briefly though, the Event Broker replaces the Workflow Stubs beginning in vRA version 7.  Workflow Stubs are still supported for integrations that have already been built around them, but customers and partners are encouraged to transition to the Event Broker.


I am using:

  • vRealize Automation 7.0.1
    • simple install
    • embedded vRealize Orchestrator instance
  • Infoblox DDI Evaluation
    • NIOS for vSphere 7.2.6
  • Infoblox VMware Cloud Adapter version 3.2
  • "Reserve an IP in a network" will be the methodology used

You can follow the instructions from Infoblox for installation of the VMware Cloud Adapter (which I will refer to as "the plugin" or "IPAM plugin" in this post).  The installation will set up the plugin to use Workflow Stubs, and that's fine.  If you wish you can skip the steps to create the Workflow Stubs but you will still want to create the property group (we will make some small edits later).

The Event Broker Wrapper Workflow

If you watched the video I linked above, you will be familiar with a "wrapper" workflow that I provide for usage with Event Broker subscriptions.  This wrapper will extract the properties provided by the Event Broker input, allowing you to capture any information you need as inputs for the IPAM workflows.

For example, the IPAM workflow "Reserve an IP in network for VM" has the following inputs:

While the Event Broker provides a single input:

That single input contains most of what we need for the IPAM workflow.  There are a couple of exceptions, and I'll cover how to find those.  For now, let's start with the easy ones.

The Event Broker wrapper workflow simply prints the key:value pairs found in the eventPayloadProperties input, but it's easy enough to assign them to an attribute for further processing.  For starters, I make a duplicate copy of the Event Broker wrapper workflow and add the IPAM workflow "Get IP in network for VM" into the schema as follows:

Next, I'll promote all of the inputs for the IPAM workflow as attributes:

Next, since I will use the script element "Get payload and execution context" from the EB wrapper, I need to map the output of the script element to those global attributes so that I can update them.

Now, I will make changes to the script in the "Get payload..." element so that I can set the attributes as needed.  Luckily, Infoblox uses the same parameter names as the EB wrapper script!  So, once I have made the attribute bindings the following are already retrieved by the script:

  • vcacVm
  • vCACVmProperties

However, you will need to make some changes.  For starters, the script retrieves the "vCACHost" object but stores it in the var "host".  All I did to fix this was search for instances of "host" in the script and replace them with "vCACHost", as below (the red boxes indicate the two occurrences).

Next, for the "virtualMachineEntity" I added a single line of code:

I inserted that just below the block in the last image:


For what it's worth, the IPAM workflow I am using has as input the vSphere virtual machine object, but it doesn't actually use that for anything - which is great because at the stage this is called from vRA the VM doesn't yet exist in vSphere!  But I'm including the code anyway since it will be helpful for other IPAM workflows.

Now to get the "vCenterVm" I need to look that up by the virtual machine ID provided in the machine properties from the Event Broker, using the vSphere plugin.  The following code block will handle that just fine:

I inserted that into the script just below the last line I added:

That leaves "externalWFStub" and honestly,  you could just leave this alone and not assign a value.  It is only used in the IPAM workflow for logging information, as shown below in this example workflow log:

I promised no modification of the Infoblox workflows, so I'll stick to that. Ideally, I could change the IPAM workflow to display "Workflow started from Event Broker" and provide the event information.  But, just to provide some logging info, I will get the Event Broker payload properties and use that for the "externalWFStub" value so at least that information is logged on run.

So, I added the lines below to my wrapper script:

Inserted below the last code block I added above.

With those changes, I now have a wrapper for the IPAM workflow that can be used with Event Broker subscriptions.  The other IPAM workflows can be used within the same wrapper, although you should verify that the inputs are handled by the wrapper script as I did for this IPAM workflow.

Preparing vRA 7

If you followed the installation instructions for the Infoblox VMware Cloud Adapter, then you should have one or more Property Groups in vRA.  These property groups are used to pass the property values that the IPAM workflows need.  However, these groups also activate the WFStubs and since I am using Event Broker, I don't want those WFStubs to execute.

I made a copy of the property group "Infoblox Reserve IP in Network" as below:

I renamed the new Property Group "EB InfoBlox Reserve IP in Network" and removed any properties that reference WFStubs and saved the Property Group.

Now I can associate this Property Group with a blueprint (and disassociate the Infoblox created Property Groups if needed).

Creating an Event Broker subscription is beyond the scope of this post, but you can find more information about that in the video recording I linked above.  A few things to note on the subscription:

  • I used the Lifecycle State Name "VMPSMasterWorkflow32.BuildingMachine" and Lifecycle State Phase "PRE" as this equates to the WFStub.BuildingMachine.
  • I made the Event Broker subscription a blocking subscription with a suitable timeout.  Obviously you need IP info or you don't want to continue with a build.

There's a lot of duplicate logging of properties between the workflow wrapper and the IPAM workflow spitting out the machine properties.  If I were running in production, I'd probably trim a lot of that logging information out or set it for debug only.

Finally, keep in mind that this is only for the IP provisioning IPAM workflow.  You will need to wrap the IP deprovisioning IPAM workflow as well and create an Event Broker subscription for that.

**UPDATE** Some readers have reported that the workflow package I linked will not import into vRO.  So, here is a link to the scriptable task within the wrapper workflow: