<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>vPivot &#187; vcenter</title>
	<atom:link href="http://vpivot.com/tag/vcenter/feed/" rel="self" type="application/rss+xml" />
	<link>http://vpivot.com</link>
	<description>Scott Drummonds on Virtualization</description>
	<lastBuildDate>Wed, 01 Feb 2012 06:46:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Custom Alarms for VMware SIOC</title>
		<link>http://vpivot.com/2011/09/21/custom-alarms-for-vmware-sioc/</link>
		<comments>http://vpivot.com/2011/09/21/custom-alarms-for-vmware-sioc/#comments</comments>
		<pubDate>Wed, 21 Sep 2011 02:14:17 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[sioc]]></category>
		<category><![CDATA[vcenter]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=998</guid>
		<description><![CDATA[Over a year and a half ago I previewed VMware&#8217;s unreleased feature, Storage IO Control (SIOC).  SIOC creates new intelligent latency metrics to evaluate the health of VMFS volumes.  The same latency measurements are used in storage DRS, which VMware released in vSphere 5.  While automated performance correction is great, vCenter should warn VMware admins [...]]]></description>
			<content:encoded><![CDATA[<p>Over a year and a half ago <a href="http://vpivot.com/2010/05/04/storage-io-control/">I previewed VMware&#8217;s unreleased feature, Storage IO Control</a> (SIOC).  SIOC creates new intelligent latency metrics to evaluate the health of VMFS volumes.  The same latency measurements are used in storage DRS, which VMware released in vSphere 5.  While automated performance correction is great, vCenter should warn VMware admins when latency crosses defined thresholds.  Custom vCenter alarms can do this.</p>
<p>With the hardest work of making vSphere 5 generally available behind him, one of VMware&#8217;s engineers, Balaji Parimi, recently sent me <a href="http://e-scott.net/share/siocalarms.zip" onClick="javascript: _gaq.push(['_trackPageview', '/downloads/map']);" >scripts he wrote to create SIOC alarms</a>.  These alarms can be used to tell administrators that SIOC is throttling some virtual machines to save high priority applications from ailing datastores.</p>
<p><span id="more-998"></span>Balaji&#8217;s script creates vCenter alarms that trigger when:</p>
<ol>
<li>SIOC configuration is changed (enable / disable, congestionThreshold) on any datastore. The user does not need to do anything if a new datastore is added. The current vCenter alarm infrastructure does not allow creating alarms based on specific properties (e.g. iorm configuration enabled / disabled or congestion threshold etc.). So, I created one alarm that gets triggered when DatastoreIORMReconfiguredEvent is generated. This event gets generated whenever the StorageIORMConfiguration on any datastore in the vCenter inventory is changed.</li>
<li>SIOC Normalized Datastore Latency exceeds the congestionThreshold for a given datastore on a host. This alarm is a bit tricky as it depends on the performance metric. The performance metric based alarms are per instance only. So, this creates one alarm object per datastore in the vCenter. The script allows you to create alarms for all datastores in the vCenter inventory in one shot. If more datastores are added later, the user needs to create an alarm for each of those datastores. If the user changes the congestionThreshold for a datastore, the user needs to delete the old alarm and re-create the alarm on that datastore for the new congestionThreshold value to be recognized.</li>
</ol>
<p>The second alarm will tell you that SIOC is taking action, slowing some virtual machines to improve the performance of others.  This alarm only works on the datastores in existence at the time of its creation.  That is why the first alarm matters: it tells the administrator that SIOC or the vSphere storage configuration has changed.  When the first alarm triggers it is time to re-register the second one.</p>
<p>Balaji included in his email to me the following instructions on setting the alarms:</p>
<blockquote><p>Here are the two commands:</p>
<p>./siocalarms.sh 10.132.98.205 Administrator password all</p>
<p>This covers all the Datastores in the vCenter. This creates the SIOC configuration changed Alarm as well.</p>
<p>I added a new Datastore with the name FCLun0. And ran this command to cover the new Datastore.</p>
<p>./siocalarms.sh 10.132.98.205 Administrator password FCLun0</p>
<p>You can use the GUI to delete any of these Alarms. You can use the GUI to set up a specific action (like sending an email) for any of these Alarms.</p>
<p>I am attaching <a href="http://e-scott.net/share/siocalarms.zip" onClick="javascript: _gaq.push(['_trackPageview', '/downloads/map']);" >a zip file containing a README, shell and a batch script</a>. All you need is JRE 1.6 or higher to run this.</p>
<p>Please give it a try and let me know your comments.</p></blockquote>
<p>You may notice that my corporate Exchange server&#8217;s malware detection software removed the batch file Balaji references above.  But the batch file only contained a single line:</p>
<blockquote><p>java -jar SIOCAlarms.jar &lt;VCServer&gt; &lt;Username&gt; &lt;Password&gt; &lt;datastoreName&gt;</p></blockquote>
<p>As the instructions above state, the datastoreName &#8220;all&#8221; will add the alarms for all datastores.</p>
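<p>For anyone who wants to script this, here is a minimal Python sketch of that invocation.  The function names and the jar_path parameter are my own assumptions and are not part of Balaji&#8217;s package; only the argument order is taken from the batch file line above.</p>

```python
import subprocess


def build_sioc_alarm_command(vc_server, username, password,
                             datastore_name="all", jar_path="SIOCAlarms.jar"):
    """Build the argument list for the one-line batch file quoted above.

    jar_path is a hypothetical parameter; point it at wherever the jar
    from the zip file was unpacked.
    """
    return ["java", "-jar", jar_path, vc_server, username, password, datastore_name]


def create_sioc_alarms(vc_server, username, password, datastore_name="all"):
    """Run the jar. Needs JRE 1.6 or higher on the PATH, per the quoted email."""
    command = build_sioc_alarm_command(vc_server, username, password, datastore_name)
    return subprocess.run(command, check=True)


# Show the command that would cover every datastore, without running it.
print(" ".join(build_sioc_alarm_command("10.132.98.205", "Administrator", "password")))
```

<p>Passing a datastore name such as FCLun0 instead of the default covers a single datastore, matching the second command in the email.</p>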
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/09/21/custom-alarms-for-vmware-sioc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>vShield Clarification</title>
		<link>http://vpivot.com/2011/02/14/vshield-clarification/</link>
		<comments>http://vpivot.com/2011/02/14/vshield-clarification/#comments</comments>
		<pubDate>Mon, 14 Feb 2011 14:02:27 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ha]]></category>
		<category><![CDATA[vcenter]]></category>
		<category><![CDATA[vshield]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=798</guid>
		<description><![CDATA[A couple of weeks ago I wrote on vShield. As I said then, the Asia Pacific and Japan vSpecialists have spent a lot of time on this wonderful product. We love it. The purpose of my blog entry was to highlight a best practice that would avoid an annoying issue. That issue is that vShield [...]]]></description>
			<content:encoded><![CDATA[<p>A couple of weeks ago <a href="http://vpivot.com/2011/02/02/vshield-vcenter-and-management-clusters/">I wrote on vShield</a>.  As I said then, the Asia Pacific and Japan vSpecialists have spent a lot of time on this wonderful product.  We love it.  The purpose of my blog entry was to highlight a best practice that would avoid an annoying issue.  That issue is that vShield App installation on ESX hosts running a vCenter virtual machine can disconnect vCenter from the network.  The workaround&#8211;documented in <a href="http://www.vmware.com/pdf/vshield_410U1_admin.pdf">the vShield Administration Guide</a>&#8211;is to run vCenter and vShield Manager on a management cluster.  Case closed.  Well, not yet.<br />
<a href="http://www.vmware.com/products/vshield/"><img class="alignleft" title="vShield Products" src="http://www.vmware.com/files/images/vshieldfamily-diagram-large-02.jpg" alt="" width="266" height="200" /></a></p>
<p><span id="more-798"></span>Last week Beth Pariseau at <a href="http://searchservervirtualization.techtarget.com/">SearchServerVirtualization.com</a> wrote <a href="http://searchservervirtualization.techtarget.com/news/2240031993/VMware-vShield-Manager-design-raises-availability-concerns">a piece on vShield that quoted my blog</a>.  Beth kindly asked me for comment before her article went live but we were unable to connect.  After reading her article, I wanted to add my thoughts to the discussion.  I mailed these to her and want to share them with you here.</p>
<p>First, one of her claims is incorrect.  Beth wrote, and cited an anonymous systems integrator in support, that &#8220;If vShield Manager is down, everything on its network is down.&#8221;  The helpful technologists at VMware confirmed my understanding that the vShield architecture will not allow this.  A vShield Manager failure will not affect network traffic.</p>
<p>vShield Manager is a central point of management, but not a nexus through which traffic flows.  The vShield architecture uses traffic-controlling security appliances on each ESX host managed by vShield.  In the event of a vShield Manager failure, the vShield security appliances continue with policies they previously received.  A vShield Manager failure temporarily makes it difficult to change security policies but in no way affects connectivity or existing policies.</p>
<p>Furthermore, vShield Manager is protected by HA&#8217;s VM-aware monitoring.  If the appliance&#8217;s guest operating system hangs, vpxa will restart the vShield Manager appliance.  If the host running vShield Manager fails, then HA will restart the appliance on another host in the cluster.  This means that in the event of a vShield Manager failure, not only will the network continue to work with its existing security policies, but the network will be unmanageable for only the short duration measured by the appliance&#8217;s boot time.</p>
<p>This brings us to the issue I blogged about: installing vShield App on an ESX host running a vCenter virtual machine can disconnect the vCenter&#8217;s virtual NIC.  VMware&#8217;s straightforward guidance for its management products is to run them in a separate management cluster.  Following this recommendation vShield will only secure non-management virtual machines and this vCenter VM network connectivity issue cannot occur.  While <a href="http://vpivot.com/2011/02/09/physically-separate-management-cluster/">my support of this best practice is lukewarm</a>, I recognize that some of VMware&#8217;s customers do follow it.  And I am told that VMware is working on ensuring that all management products can be run as virtual machines on any host.</p>
<p>At the risk of belaboring the central point of this post, I will again say that no single appliance failure can bring down the network.  I have already named vShield Manager in this regard and I reiterate with vCenter.</p>
<p>Beth solicited my comments and I explained my understanding of the article&#8217;s mistake.  I also recommended she engage our friends at VMware, who I know from experience are always thrilled to share in limitless detail the workings of their fine products.  I trust that Beth and VMware will get the right word out to customers.  But as a constant commentator on VMware&#8217;s continued development on vShield, I wanted to append my thoughts to the ongoing vShield discussion.</p>
<p>VMware&#8217;s continued streak of incredible innovation is a constant source of marvel.  When I expose warts on their products it is to help their faithful enthusiasts get the most out of VMware&#8217;s offerings.  I believe Beth&#8217;s article was in that same vein: raise awareness and inspire continued improvement.  Anyone who reads her article or mine should consider these tiny warts in the ongoing epic of datacenter rebirth.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/02/14/vshield-clarification/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>vShield, vCenter, and Management Clusters</title>
		<link>http://vpivot.com/2011/02/02/vshield-vcenter-and-management-clusters/</link>
		<comments>http://vpivot.com/2011/02/02/vshield-vcenter-and-management-clusters/#comments</comments>
		<pubDate>Wed, 02 Feb 2011 03:32:08 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[vcenter]]></category>
		<category><![CDATA[vshield]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=761</guid>
		<description><![CDATA[I recently tweeted looking for VMware customers using the vShield products in production. There was very little response. It seems that vShield Edge and App are barely being used at all in production. It is possible that some rough patches need to be sanded down to polish vShield to its appropriate luster. But if you [...]]]></description>
			<content:encoded><![CDATA[<p>I recently tweeted looking for VMware customers using the vShield products in production.  There was very little response.  It seems that vShield Edge and App are barely being used at all in production.  It is possible that some rough patches need to be sanded down to polish vShield to its appropriate luster.  But if you are not playing with these products now, you owe it to yourself to download the trial versions.</p>
<p>Our fantastic APJ vSpecialist organization has been spending a lot of time with vShield.  Many of us believe vShield to be one of the most exciting additions to the VMware portfolio in a long time.  It is simple, elegant, and very powerful.  But there is a very big danger with vShield: improper use can disable the network connection of vCenter virtual machines.  Fixing this problem is not intuitive.</p>
<p><span id="more-761"></span>We first observed this issue in our quarterly hands-on workshop in Tokyo with multiple people toying with the vShield products (Edge, App, EndPoint) on a single cluster.  Some sequence of operations disconnected the vCenter&#8217;s vNIC.  We opened the VM&#8217;s settings, clicked the vNIC&#8217;s &#8220;Connected&#8221; checkbox, and tried to go back to work.  But the vNIC remained unconnected.  No matter how many times we repeated this process, the vCenter VM remained offline.  With vCenter offline, the infrastructure was unmanageable.</p>
<p>The general problem is that installing vShield App or Zones on an ESX host running vCenter introduces a circular dependency that can make installation, operation, or uninstallation fail.  A warning against doing this is documented in a small cautionary note on the bottom of page 13 of the current <a href="http://www.vmware.com/pdf/vshield_410U1_admin.pdf">vShield Administration guide</a>.  Unfortunately, it is an <a href="http://www.vmware.com/pdf/vshield_41_admin.pdf">older version of this administration guide</a> (which lacks this critical warning!!!) that appears first in response to the obvious Google search.  </p>
<p>This same problem is described as an uninstallation issue in <a href="http://kb.vmware.com/kb/1028151">VMware KB article (1028151)</a>.  That KB also includes suggestions on resolution.</p>
<p>While vShield App is not fundamentally incompatible with management of a vCenter VM, this problem raises an interesting point.  In even moderate environments it makes sense to create a separate management cluster for VMware&#8217;s management products.  This includes vCenter, vCloud Director, CapacityIQ, the vShield Manager, and a large number of helper VMs for other products.  The key here is that these clusters are protected by HA and vCenter Server Heartbeat, but they are not managed by the tools in the cluster.</p>
<p>Tools like AppSpeed, CapacityIQ, vShield and others should be deployed to measure and manage the production and test clusters, but not the management clusters.  This prevents circular dependencies that could result in difficult-to-correct network problems or inefficiencies like AppSpeed monitoring management traffic.  This is obviously a very high level description of something that needs a reference architecture.  More to come on that next week.</p>
<p>For those of you that share our interest in vShield, I want to direct you to the blog of a colleague of mine in Australia.  Roman Tarnavski, a Sydney-based vSpecialist, has been digging into vShield and publishing some tutorials and hacking guides.  Roman and I met with VMware engineering and we think that Roman will likely soon be showing you some amazing possibilities when hacking to improve vShield Edge.</p>
<p>Roman has already written a couple of articles on vShield and networking:</p>
<ul>
<li><a href="http://blog.romant.net/vmware/by-example-networks-in-vcloud-director/">Three types of network configurations for vCloud Director virtual machines</a>.</li>
<li><a href="http://blog.romant.net/vmware/by-example-enabling-ssh-in-vshield-edge/">Enabling SSH in the vShield Edge appliance</a>.</li>
</ul>
<p>Check back here and read <a href="http://blog.romant.net/">Roman&#8217;s blog</a> regularly if you want to hear more about vShield.  Roman will have some incredible information on the vShield Edge internals that all of you hackers out there will not want to miss.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/02/02/vshield-vcenter-and-management-clusters/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>vCenter Custom Alarms: Instruction, Tips, Tools</title>
		<link>http://vpivot.com/2011/01/11/vcenter-custom-alarms-instruction-tips-tools/</link>
		<comments>http://vpivot.com/2011/01/11/vcenter-custom-alarms-instruction-tips-tools/#comments</comments>
		<pubDate>Tue, 11 Jan 2011 08:50:04 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[alarms]]></category>
		<category><![CDATA[vcenter]]></category>
		<category><![CDATA[vspecialists]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=744</guid>
		<description><![CDATA[Half a year ago the Asia Pacific vSpecialist team made a fantastic acquisition in one David &#8220;Two Screws&#8221; Lloyd. David previously worked for a wonderful VMware and EMC customer in the UK and moved back to Australia. That is when we jumped on the opportunity to snap him up. In working with him in the [...]]]></description>
			<content:encoded><![CDATA[<p>Half a year ago the Asia Pacific vSpecialist team made a fantastic acquisition in one David &#8220;Two Screws&#8221; Lloyd.  David previously worked for a wonderful VMware and EMC customer in the UK and moved back to Australia.  That is when we jumped on the opportunity to snap him up.  In working with him in the labs recently, I have come to realize that David has incredible depth and breadth in the space of virtual infrastructure management.</p>
<p>Recently David sat down to share his tips on setting up alarms in vCenter.  Most of VMware&#8217;s customers understand the power of custom alarming but few harness its value.  Using an example of storage path failures, David created a video that walks its audience through the process of configuring a custom alarm using a tailor-made executable that writes useful log messages to the vCenter OS&#8217;s event log.</p>
<p><span id="more-744"></span>Here is David&#8217;s demo on YouTube:</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="600" height="361" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/d7NRrGgV1J0?fs=1&amp;hl=en_US" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="600" height="361" src="http://www.youtube.com/v/d7NRrGgV1J0?fs=1&amp;hl=en_US" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>(A high resolution version of the video is available <a href="http://e-scott.net/share/emc/demos/Monitoring Storage Port failures in vCenter HD.mp4"  onClick="javascript: pageTracker._trackPageview('/downloads/map'); ">here</a>.)</p>
<p>David was kind enough to provide me a write-up of his work with details on the why and how of this process.  From here on, this entry will be in his words.</p>
<h2>vCenter Based Storage Port Monitoring</h2>
<p>The above video demonstrates the creation of custom alarms in vCenter to monitor host level storage port failures.  To add a little bit more functionality to the alarms I wrote a little console application. When executed as a custom action assigned to an alarm, this application writes alarm details to the vCenter OS&#8217;s application event log. By adding entries to the local event log, third party monitoring solutions such as Microsoft’s System Center Operations Manager can be utilised to effectively monitor a VMware virtual environment. This is done by monitoring for specific events in the event log.  This is a core function of most Windows monitoring solutions.</p>
<p>When a command is executed as a custom action, it is run in a Windows shell.  That shell has several environment variables which contain details related to the triggered alarm. These details can then be used to provide a rich level of information to assist in problem identification and resolution. The table below shows the environment variables I have utilised to form the description field of the events raised.</p>
<table>
<tbody>
<tr>
<th>Variable</th>
<th>Description</th>
</tr>
<tr>
<td>VMWARE_ALARM_NAME</td>
<td>The name of the triggered alarm.</td>
</tr>
<tr>
<td>VMWARE_ALARM_EVENTDESCRIPTION</td>
<td>The textual description of the alarm condition.</td>
</tr>
<tr>
<td>VMWARE_ALARM_EVENT_COMPUTERESOURCE</td>
<td>The cluster name of the impacted object.</td>
</tr>
<tr>
<td>VMWARE_ALARM_EVENT_HOST</td>
<td>The hostname of the impacted object.</td>
</tr>
<tr>
<td>VMWARE_ALARM_EVENT_DATASTORE</td>
<td>The datastore name of the impacted object.</td>
</tr>
<tr>
<td>VMWARE_ALARM_EVENT_DATACENTER</td>
<td>The datacenter name of the impacted object.</td>
</tr>
<tr>
<td>VMWARE_ALARM_EVENT_NETWORK</td>
<td>The network port group name of the impacted object.</td>
</tr>
</tbody>
</table>
<p>In addition to the custom description, I have added additional flexibility to further customize the event.  This includes:</p>
<ul>
<li>Event Source – A custom source can be assigned by the executable to events it raises.  This makes the events easy to identify. The source name is specified within the application’s configuration file ‘vCenter-AlarmLog.exe.config’, which comes with both the binary and source distributions (below).</li>
<li>Event ID – Each alarm condition can be assigned a unique ID to help direct problem tickets resulting from the event. The unique ID is provided via the command line.</li>
<li>Event Level – Provides the ability to classify an event as an Error, Warning or Information. The event level is specified from the command line.</li>
</ul>
<p>These can assist in the filtering of any events raised for problem ticket routing and classification.</p>
<p>An example of an event created by the tool is shown below. Note the source, ID and level details in addition to the general description.  The event entry below was created when a storage port was disabled resulting in a lost storage port.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2011/01/event_example.png"><img class="size-full wp-image-745" title="Custom Event Example" src="http://vpivot.com/wp-content/uploads/2011/01/event_example.png" alt="Example of custom event created by David Lloyd's tool." width="600" /></a></p>
<p>Each event description is formatted identically with the field identifier and value (if present) separated by a TAB to help with parsing the details.</p>
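<p>As a rough illustration of that format, the sketch below builds a TAB-separated description from the environment variables in the table above.  It is written in Python rather than the C# of the actual tool, and the function name, the use of the variable names as field identifiers, and the one-field-per-line layout are assumptions made for the example, not the tool&#8217;s real output.</p>

```python
import os

# Environment variables from the table above, in the same order.
ALARM_VARIABLES = [
    "VMWARE_ALARM_NAME",
    "VMWARE_ALARM_EVENTDESCRIPTION",
    "VMWARE_ALARM_EVENT_COMPUTERESOURCE",
    "VMWARE_ALARM_EVENT_HOST",
    "VMWARE_ALARM_EVENT_DATASTORE",
    "VMWARE_ALARM_EVENT_DATACENTER",
    "VMWARE_ALARM_EVENT_NETWORK",
]


def format_event_description(environ=None):
    """Return one line per field: identifier, TAB, then the value if present.

    The field identifiers used here are simply the variable names; the
    labels written by the real tool may differ (assumption).
    """
    environ = os.environ if environ is None else environ
    lines = []
    for name in ALARM_VARIABLES:
        value = environ.get(name, "")  # a missing variable yields an empty value
        lines.append(name + "\t" + value)
    return "\n".join(lines)


# Example with a hand-made environment instead of a real triggered alarm.
sample = {"VMWARE_ALARM_NAME": "Lost storage path", "VMWARE_ALARM_EVENT_HOST": "esx01"}
print(format_event_description(sample))
```

<p>Keeping every field on its own line, even when the value is empty, means a monitoring tool can split on the TAB without checking which fields are present.</p>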
<h2>Command Line Arguments</h2>
<p>Both Event ID and message type can utilise application defaults or can be specified through the command line. By default the application will raise an ‘Error’ entry with an event id of ‘9999’, either of which can be changed as shown below:</p>
<table>
<tbody>
<tr>
<th width="190">Command line</th>
<th>Description</th>
<th width="210">Example</th>
</tr>
<tr>
<td>vCenter-AlarmLog.exe</td>
<td>This will raise an ‘error’ event with the id of ‘9999’.</td>
<td>vCenter-AlarmLog.exe</td>
</tr>
<tr>
<td>vCenter-AlarmLog.exe &lt;id&gt;</td>
<td>This will raise an ‘error’ event with the numeric id as specified.</td>
<td>vCenter-AlarmLog.exe 1234</td>
</tr>
<tr>
<td>vCenter-AlarmLog.exe &lt;id&gt; &lt;type&gt;</td>
<td>Will raise either an ‘error’, ‘warning’ or ‘information’ entry with the numeric id as specified.</td>
<td>vCenter-AlarmLog.exe 1234 warning</td>
</tr>
</tbody>
</table>
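<p>The defaulting rules in the table can be sketched as follows.  This is a Python approximation and not the tool&#8217;s C# source; the function name, the case-insensitive type matching, and the decision to reject unknown types or extra arguments are assumptions.</p>

```python
DEFAULT_EVENT_ID = 9999
DEFAULT_EVENT_TYPE = "error"
VALID_EVENT_TYPES = ("error", "warning", "information")


def parse_alarm_args(argv):
    """Map the optional id and type arguments to an (event_id, event_type) pair.

    argv excludes the program name. How the real executable reacts to bad
    input is not documented above, so raising ValueError is an assumption.
    """
    if len(argv) not in (0, 1, 2):
        raise ValueError("expected at most two arguments: id and type")
    event_id = int(argv[0]) if argv else DEFAULT_EVENT_ID
    event_type = argv[1].lower() if len(argv) == 2 else DEFAULT_EVENT_TYPE
    if event_type not in VALID_EVENT_TYPES:
        raise ValueError("type must be error, warning or information")
    return event_id, event_type


# The three rows of the table above.
print(parse_alarm_args([]))
print(parse_alarm_args(["1234"]))
print(parse_alarm_args(["1234", "warning"]))
```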
<h2>Application Requirements</h2>
<h3>Custom Event Source</h3>
<p>The custom event source is defined in the application configuration file ‘vCenter-AlarmLog.exe.config’.  See the ‘appSettings’ entry ‘source’ in this file.  The file is included in both packages but if it does not exist it will be created upon first execution of the application.  For this to work properly, the application needs to be executed once with ‘Administrative’ level access. Failure to do this will prevent the application from writing any events.</p>
<h3>Runtime Environment</h3>
<p>Requires Microsoft .NET 3.0.  This is also a prerequisite for vCenter, so it should already exist on the vCenter server.</p>
<h3>Source Code</h3>
<p>Not a requirement but available should you wish to tinker with the script.  The tool was written with Visual Studio 2010 in C# and its source is available as a download <a href="http://e-scott.net/share/emc/demos/vCenter-AlarmLog-src-v1.zip" onClick="javascript: pageTracker._trackPageview('/downloads/map'); ">here</a>.</p>
<h3>Executable</h3>
<p>The executable is available <a href="http://e-scott.net/share/emc/demos/vCenter-AlarmLog-bin-v1.zip" onClick="javascript: pageTracker._trackPageview('/downloads/map'); ">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/01/11/vcenter-custom-alarms-instruction-tips-tools/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Alternative to DRS</title>
		<link>http://vpivot.com/2010/12/03/alternative-to-drs/</link>
		<comments>http://vpivot.com/2010/12/03/alternative-to-drs/#comments</comments>
		<pubDate>Fri, 03 Dec 2010 03:50:21 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[drs]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[pricing]]></category>
		<category><![CDATA[vcenter]]></category>
		<category><![CDATA[vmotion]]></category>
		<category><![CDATA[vscsistats]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=706</guid>
		<description><![CDATA[Now that I am six months removed from VMware, I will admit that we executed poorly in the space of performance management.  I know that there is intense work going on right now in acquisitions, unification of performance management tools, and vCenter improvement through folding in vscsiStats and esxtop data.  But in the area of [...]]]></description>
			<content:encoded><![CDATA[<p>Now that I am six months removed from VMware, I will admit that we executed poorly in the space of performance management.  I know that there is intense work going on right now in acquisitions, unification of performance management tools, and vCenter improvement through folding in vscsiStats and esxtop data.  But in the area of performance reporting and visualization, VMware&#8217;s success has been minimal.  VMware hopes its acquisition of <a href="http://www.alivevm.com/">AliveVM</a> will plug part of this gap but today it is safe to say the field is wide open for VMware&#8217;s partners.</p>
<p>This morning one such partner, <a href="http://www.vmturbo.com/">VMTurbo</a>, gave me a demonstration of their offering in this field.  Their product provides an obvious improvement on vSphere&#8217;s performance visualization capabilities.  But given the state of VMware&#8217;s visualization capabilities virtually any graphical front-end provides an improvement.  But what really set off my imagination were two features I had not seen before:</p>
<ul>
<li>A third-party alternative to DRS.</li>
<li>Cross-cluster resource optimization.</li>
</ul>
<p><span id="more-706"></span>VMTurbo provides a variety of monitoring and analysis capabilities but I want to focus most on optimization, in particular load balancing.  But before describing what VMTurbo has done, I want to point out the economics of competing with VMware&#8217;s DRS.</p>
<p>VMware provides four <a href="http://www.vmware.com/products/vsphere/buy/editions_comparison.html">vSphere editions</a> for its customers.  The cheapest edition that offers DRS is Enterprise at a list price of $2,875USD per socket.  The cheapest edition with vMotion is Standard at $995USD per socket.  There are plenty of cool features that come with upgrading from Standard to Enterprise: DRS, VAAI, Fault Tolerance, Storage vMotion, vShield Zones, and others.  But certainly DRS is one of the most valuable of that list.</p>
<p>By leaving such a big price gap between the cheapest vMotion edition and the cheapest DRS edition, VMware has provided its partners an economic incentive to innovate and provide DRS value to customers at a discount.  VMTurbo may capitalize on this incentive and it would not surprise me if numerous other ISVs are already doing so or soon will.  Once a vendor has built a robust monitoring environment, it is only a clever algorithm away from implementing DRS.  And then a trivial API call away from extending DRS to DPM.</p>
<p>The VMTurbo guys explained that their algorithm uses more resources than just CPU and memory and could therefore be better than DRS.  But I know so much work has gone into VMware&#8217;s memory-and-CPU DRS that I will only believe VMTurbo&#8217;s claims when I see the data.</p>
<p>Another area in which VMTurbo is tinkering is inter-cluster load balancing.  The demo I received this morning showed a precursor step to datacenter-wide load balancing by modeling the merge of two DRS clusters.  As the <a href="http://vpivot.com/2010/11/29/maximum-hosts-per-cluster/#comments">discussion in my maximum cluster size entry</a> showed, choosing and changing cluster sizes is not easy.  And fluidly moving virtual machines between different clusters is not often possible for a variety of reasons.  But modeling cluster merging is the first step in considering cross-cluster operations.  And I think that there is a huge opportunity in the industry for someone to innovate in datacenter-wide optimization.</p>
<p>I would be curious to see what other vendors are doing with DRS, DPM, or datacenter-wide load balancing.  Can anyone refer me to any ISVs that are trying to crack these difficult problems?</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/12/03/alternative-to-drs/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Maximum Hosts Per Cluster</title>
		<link>http://vpivot.com/2010/11/29/maximum-hosts-per-cluster/</link>
		<comments>http://vpivot.com/2010/11/29/maximum-hosts-per-cluster/#comments</comments>
		<pubDate>Mon, 29 Nov 2010 02:08:20 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[drs]]></category>
		<category><![CDATA[ha]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[vcenter]]></category>
		<category><![CDATA[vforum]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=702</guid>
		<description><![CDATA[I just returned from a one week vacation to a warm sunny beach on a small island not too far from Singapore.  Even on my vacations my conversations often migrate to technology and my travel mate is an old friend and current employee at VMware, Dave Korsunsky.  Sitting by a pool with a cocktail in [...]]]></description>
			<content:encoded><![CDATA[<p>I just returned from a one week vacation to a warm sunny beach on a small island not too far from Singapore.  Even on my vacations my conversations often migrate to technology and my travel mate is an old friend and current employee at VMware, <a href="http://twitter.com/#!/VMW_Dave">Dave Korsunsky</a>.  Sitting by a pool with a cocktail in hand at a fantastic hotel I asked my friend, &#8220;what is the right number of hosts per DRS/HA cluster?&#8221;  Great conversation for a vacation, right?</p>
<p><span id="more-702"></span>I started thinking about this topic at Sydney&#8217;s vForum a month ago.  VMware&#8217;s Dan Anderson suggested that designs that implemented maximum cluster sizes (32 hosts per cluster) were the result of misguided reasoning.  Dan insisted that clusters need never be larger than eight hosts per cluster.  And on this subject we bantered for a few minutes.  Dan convinced me that there are few compelling reasons to implement large clusters.  And we could think of many reasons to avoid them.  I do not think it easy to assign one number as the &#8220;right&#8221; cluster size.  But there are many principles that suggest small to medium sized clusters are good choices.</p>
<p>First, the argument for the largest clusters: DRS efficiency.  This was my primary claim in favor of 32-host clusters.  My reasoning is simple: with more hosts in the cluster there are more CPU and memory resource holes into which DRS can place running virtual machines to optimize the cluster&#8217;s performance.  The more hosts, the more options to the scheduler.</p>
<p>But in retrospect I think this is a weak argument.  It&#8217;s not backed by data and in practice I cannot imagine a 16 host cluster being much more efficient than an eight host cluster.  Once vCenter is managing hundreds or more virtual machines per cluster, it has an astronomical number of combinations for VM placement.  So, doubling the host count (and the virtual machine count) should have little impact on cluster efficiency.</p>
<p>More importantly, with respect to the efficiency argument, maximum CPU and memory utilization will be bound either by the failover capacity or the target utilization, which is usually about 80%.  With 20% reserved for resource spikes, the failover capacity is equal to the reserved resources in a 4+1 HA cluster.  In any cluster larger than this, the failover capacity is less than 20%.  This means that only target utilization bounds resource efficiency.</p>
<p>The efficiency calculation is a little trickier if you want to size your cluster for target resource utilization <em>after</em> a host failure.  In this case each additional host provides some incremental value to the cluster&#8217;s utilization.  To size a 4+1 cluster to 80% utilization after host failure, you will want to restrict CPU usage in the five hosts to 64% (four-fifths of 80%).  Going to a 5+1 cluster results in a pre-failure CPU utilization target of about 67%.  The increases slowly approach 80% as the clusters get larger and larger.  But you can see that the incremental resource utilization improvement is never more than about three percentage points per added host, and it shrinks with every host.  So, growing a cluster slightly provides very little value in terms of resource utilization.</p>
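<p>To make that arithmetic easy to repeat for other cluster sizes, here is a minimal Python sketch.  The function name and its parameters are my own illustration, and it assumes identical hosts, a single failed host, and load that spreads evenly across the survivors.</p>

```python
def pre_failure_target(active_hosts, spare_hosts=1, post_failure_target=0.80):
    """Utilization cap before a failure so survivors land on the target afterwards.

    Assumes identical hosts and evenly spread load (a simplification).
    """
    total_hosts = active_hosts + spare_hosts
    return post_failure_target * active_hosts / total_hosts


# Print the pre-failure cap for a few N+1 cluster sizes.
for n in (4, 5, 8, 16, 31):
    print(f"{n}+1 cluster: {pre_failure_target(n):.1%}")
```

<p>The cap climbs toward the 80% target as hosts are added, but each additional host contributes less than the one before it.</p>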
<p>Now why might you want to keep a cluster small?  I can think of a few reasons.</p>
<p>It is generally wise to avoid mixing different classes of servers in a single pool.  DRS does not make scheduling decisions based on the performance characteristics of the server so a new, powerful server in a cluster is just as likely to receive a mission-critical virtual machine as an older, slower host.  This would be unfortunate if a cluster contained servers with radically different&#8211;although EVC compatible&#8211;CPUs like the Intel Xeon 5400 and Xeon 5500 series.  In the former case ESX would be using its software memory management unit which could perform as much as 40% worse than the hardware MMU in the Xeon 5500.</p>
<p>(I will momentarily digress to answer a question I often get in my performance talks: what is the impact of Enhanced vMotion Compatibility (EVC) on virtual machine performance?  Briefly: very little to none.  The instructions that are disabled on newer processors only benefit applications that were compiled to use those new instructions.  Those applications are rare in the enterprise space.)</p>
<p>Given my recommendation that servers in a cluster should be of a similar class of performance, you will soon find that your purchasing patterns will influence your cluster size.  If you are one of the few people lucky enough to work at a company that is buying servers by the truckload, you can size your clusters however you want.  But the vast majority of VMware&#8217;s customers make smaller purchases of anywhere from four to 16 servers at a time.  These will make nice, homogenous clusters of moderate size.</p>
<p>One more argument Dave offered for keeping clusters small is to use clusters for logical separation of applications of different classes.  By putting your mission-critical applications in a cluster of their own your &#8220;server huggers&#8221; will sleep better at night.  They will be able to keep one eye on the iron that can make or break their job.  In my opinion, using physical separation in a virtual world is resisting the complete cloud and hardware independent virtualization that we are all striving for.  But I cannot begrudge an administrator who wants to hold onto some semblance of physical hardware best practices while traveling the multi-year journey to the private cloud.</p>
<p>Another of Dan&#8217;s arguments against large clusters is the cumbersome nature of their change control.  Clusters have to be managed to a consistent state and the complexity of this process is dependent on the number of items being managed.  A very large cluster will present unique challenges when managing change.</p>
<p>So, have I given a recommendation?  I am not sure.  If anything I feel that Dave, Dan and I believe that a minimum cluster size should be set to guarantee that the CPU utilization target, and not the HA failover capacity, defines the amount of wasted resources.  This means a minimum cluster of something like four or five hosts.  While none of us claims a specific problem that will occur with very large clusters, we cannot imagine the value of a 32-host cluster.  So, we think the right cluster size is somewhere shy of 10.</p>
<p>I am quite interested to hear your thoughts on this.  Perhaps the best guidance will grow out of the crucible of debate.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/11/29/maximum-hosts-per-cluster/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>Storage Consolidation (or: How Many VMDKs Per Volume?)</title>
		<link>http://vpivot.com/2010/11/07/storage-consolidation-or-how-many-vmdks-per-volume/</link>
		<comments>http://vpivot.com/2010/11/07/storage-consolidation-or-how-many-vmdks-per-volume/#comments</comments>
		<pubDate>Sun, 07 Nov 2010 08:15:23 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[vcenter]]></category>
		<category><![CDATA[vmkernel]]></category>
		<category><![CDATA[vmworld]]></category>
		<category><![CDATA[vmworld europe]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=696</guid>
		<description><![CDATA[Part of the performance best practices talk I co-presented at VMworld in San Francisco and Copenhagen focused on answering the question, &#8220;How many virtual machines can be placed on a single VMFS volume?&#8221;  There are a lot of theories as to a best answer.  It will not surprise you to learn that no single consolidation [...]]]></description>
			<content:encoded><![CDATA[<p>Part of the performance best practices talk I co-presented at VMworld in San Francisco and Copenhagen focused on answering the question, &#8220;How many virtual machines can be placed on a single VMFS volume?&#8221;  There are a lot of theories as to a best answer.  It will not surprise you to learn that no single consolidation ratio works in every environment.  Your workloads will influence the maximum consolidation.  But we know enough about how ESX virtualizes storage to provide guidance as to the right storage consolidation ratios.</p>
<p><span id="more-696"></span>First, a little background on ESX&#8217;s storage queues.  There are two relevant queues in ESX.  First is the device queue, which has one instantiation at each HBA for each LUN.  Second is the kernel queue, which handles &#8220;overflowed&#8221; IOs that are waiting to be placed in a full device queue.</p>
<p>For Fibre Channel HBAs, the device queue&#8217;s default length is 32 commands.  It is much larger for iSCSI. No HBA, and thus no device queue, exists for NFS.  A 32 command queue is capable of opening 32 commands at a time.  Obviously, if you double this queue length then the queue will drive twice as many IOs to the volume.  For the rest of this article I will discuss queues in terms of the 32 element Fibre Channel queue.</p>
<p>Because one device queue is instantiated at each HBA for each LUN, a storage reconfiguration at an array can change the number of queues at an ESX host.  Increasing the number of queues increases the total number of IOs that the host can open against the array.  I demonstrated this in my VMworld presentation with the following figure.</p>
<div id="attachment_697" class="wp-caption aligncenter" style="width: 489px"><a href="http://vpivot.com/wp-content/uploads/2010/11/device-queues.png"><img class="size-full wp-image-697" title="Example: Two Storage Configurations" src="http://vpivot.com/wp-content/uploads/2010/11/device-queues.png" alt="Two VMFS volumes means two queues.  One volume one queue." width="479" height="519" /></a><p class="wp-caption-text">Putting two VMs on two volumes results in up to 64 commands being opened from the pair of them at one time.</p></div>
<p>This figure shows the simple difference between two virtual machines sharing a single VMFS volume and two that each get their own.  In the first configuration, only 32 commands can be opened from the host and that single queue is shared between the virtual machines.  In the second configuration, the host can open up 64 total commands and each virtual machine can open up to 32.</p>
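<p>The queue arithmetic in the figure can be sketched in a few lines of Python.  The function name is hypothetical, and the sketch assumes the default Fibre Channel queue length of 32 and one device queue per HBA per LUN, as described above.</p>

```python
def max_outstanding_commands(lun_count, hba_count=1, queue_depth=32):
    """Upper bound on commands a host can have open against the array."""
    # One device queue is instantiated at each HBA for each LUN.
    device_queues = lun_count * hba_count
    return device_queues * queue_depth


# Two VMs sharing one VMFS volume: a single shared queue.
print(max_outstanding_commands(lun_count=1))
# Two VMs on two VMFS volumes: two queues, one per volume.
print(max_outstanding_commands(lun_count=2))
```

<p>This is only the ceiling set by the device queues.  The HBA and its driver impose their own limits, which I discuss below.</p>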
<p>Your first reaction to this might be, &#8220;Wow! I should put every VMDK on a VMFS volume of its own!  Then imagine the total throughput that the host could drive!!&#8221;  My first response to this is stop using so many exclamation points.  Nobody likes an overenthusiastic writer.  But second, you should consider that more is not always better.  In fact, I can think of several reasons why you should not reconfigure storage to multiply the number of queues:</p>
<ol>
<li>Allowing a host to open many commands simultaneously may be good for the individual virtual machines but is likely to be dangerous for the shared infrastructure.  This could result in short but extremely intense <a href="http://virtualgeek.typepad.com/virtual_geek/2009/06/vmware-io-queues-micro-bursting-and-multipathing.html">microbursts</a> of IO that could present challenges to your fabric or storage processors.</li>
<li>The device driver (and the HBA) can only open a fixed number of commands depending on the device&#8217;s implementation.  You have to use these sparingly.</li>
<li>The configuration that results in more queues necessarily requires more VMFS volumes which results in a greater administration cost.</li>
</ol>
<p>In addition to reconfiguring storage to increase the number of device queues, you always have the option of increasing the length of ESX&#8217;s device queues.  This is documented on page 71 of the <a href="http://www.vmware.com/pdf/vsphere4/r40/vsp_40_san_cfg.pdf">Fibre Channel SAN Configuration Guide</a>.  But I will caution you against reconfiguring storage queues, too.  This requires manual changes at every host, produces longer queues that more quickly eat into the fixed number of commands each HBA can support, and increases the possible IO intensity of every virtual machine on the host.</p>
<p>And if these detailed explanations are insufficient to explain why storage queue manipulation is unproductive or even counterproductive towards your goal of optimizing your infrastructure, let me point out that VMware has years of experience in consolidating storage and they chose 32 commands per queue as the right number for most environments.  Trust their experience on this one.</p>
<p>Of course I would be remiss if I did not mention that there are rare times that a storage reconfiguration may help performance.  Redistributing virtual machines across different VMFS volumes or increasing queue depths can correct some issues.  And you can identify occasions where this change may help by looking for a large kernel latency.</p>
<p>As I mentioned above, commands that are waiting for access to a full device queue reside in the kernel queue until a device queue slot becomes available.  On the whole, commands should only spend a fraction of a millisecond in the kernel queue on their way to the device queue.  A kernel queuing time of over one millisecond and certainly over two milliseconds suggests the virtual machines are not having their IO needs served fast enough.</p>
<p>You can see kernel queuing times in the kernel latency statistic reported in esxtop (counter: KAVG) and vCenter (counter: Kernel Latency).  When these latencies consistently average one millisecond or more it&#8217;s time to investigate storage.  But know that slow storage can result in high kernel queuing times.  So, before you go manipulating queues, or reconfiguring your storage layout, make sure your storage is serving IOs in periods deemed acceptable by the storage teams (usually 5-10 ms).</p>
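<p>As a rough illustration of that order of checks, here is a minimal Python sketch.  The function, its verdict strings, and the 10 ms device limit are my own assumptions for the example.  Use whatever device latency your storage team considers acceptable.</p>

```python
def assess_kernel_latency(kavg_ms, device_latency_ms, device_limit_ms=10.0):
    """Return a short verdict from sustained average latencies in milliseconds."""
    # Slow storage inflates kernel queuing time, so rule it out first.
    if device_latency_ms > device_limit_ms:
        return "slow storage: fix device latency before touching queues"
    if kavg_ms > 2.0:
        return "investigate: IOs are clearly waiting in the kernel queue"
    if kavg_ms > 1.0:
        return "watch: kernel queuing is higher than expected"
    return "ok"


print(assess_kernel_latency(kavg_ms=0.3, device_latency_ms=6.0))
print(assess_kernel_latency(kavg_ms=2.5, device_latency_ms=6.0))
```

<p>Feed it averages taken over a sustained period, not a single sample, since a brief spike proves little.</p>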
<p>This is kind of a long article by vPivot standards, I know.  But cut me some slack.  <a href="http://virtualgeek.typepad.com/">Chad Sakac</a> bangs out footnotes and parenthetical digressions that are longer than this entry.  This content has already been covered in my VMworld presentations so if you have access to those recordings go listen to Kaushik and me present it there.  But for those of you who were unable to attend I wanted to present this important guidance for your consideration.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/11/07/storage-consolidation-or-how-many-vmdks-per-volume/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Performance Troubleshooting Made Simple</title>
		<link>http://vpivot.com/2010/05/10/performance-troubleshooting-made-simple/</link>
		<comments>http://vpivot.com/2010/05/10/performance-troubleshooting-made-simple/#comments</comments>
		<pubDate>Mon, 10 May 2010 13:27:05 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[troubleshooting]]></category>
		<category><![CDATA[vcenter]]></category>
		<category><![CDATA[vscsistats]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=525</guid>
		<description><![CDATA[I have struggled for years to give VMware&#8217;s customers a framework for diagnosing performance problems. People want a simple system to troubleshoot the unknown sources of poorly performing applications. The best attempt at documenting such a flow is Hal Rosenberg&#8217;s document on vSphere performance troubleshooting. Elegant as it may be, Hal&#8217;s document remains complex for [...]]]></description>
			<content:encoded><![CDATA[<p>I have struggled for years to give VMware&#8217;s customers a framework for diagnosing performance problems.  People want a simple system to troubleshoot the unknown sources of poorly performing applications.  The best attempt at documenting such a flow is <a href="http://communities.vmware.com/docs/DOC-10352">Hal Rosenberg&#8217;s document on vSphere performance troubleshooting</a>. Elegant as it may be, Hal&#8217;s document remains complex for the novice VI administrator.  And it is because that document is so complex that performance people maintain their job security.  <img src='http://vpivot.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />   But in an effort to further obviate my own job, I will try and generalize the troubleshooting flow to add more clarity to the process.</p>
<p><span id="more-525"></span>The first tool in the VI administrator&#8217;s toolbox should always be vCenter.  Through the vSphere client you can use vCenter&#8217;s performance counters to confirm a problem with any of the four resources (storage, CPU, memory, network).  vCenter&#8217;s 20 second sample window impedes its ability to eliminate a resource as a problem.  This is because a three second spike in any resource will be smoothed and missed over the 20 second window.  But when vCenter confirms a sustained resource bottleneck, it is sure to be the performance problem&#8217;s cause.</p>
<p>If vCenter fails to confirm an obvious performance problem, the administrator must next go to more precise, more time-intensive, and more knowledge-intensive tools such as esxtop and vscsiStats.  esxtop takes more skill and time than vCenter but provides better resolution and more visibility into the system.  vscsiStats is the most time-intensive tool and has limits with ESXi hosts but can uncover a world of detail invisible to esxtop and vCenter.</p>
<p>I estimate each tool&#8217;s chance of identifying a random performance problem as follows:</p>
<ul>
<li>vCenter: used in 90% of performance problems</li>
<li>esxtop: used in 9% of problems</li>
<li>vscsiStats: used 0.9% of the time</li>
</ul>
<p>The remaining 0.1% of the time is when you engage your account team or your local VMware performance expert.</p>
<p>Even within each tool&#8217;s usage there is a hierarchy of investigation: storage, CPU, memory and network.  My experience with troubleshooting has informed this decision.  Storage causes the most problems, then CPU, then memory, and lastly (and rarely) network. After each resource level is inspected in vCenter, a repeat of the inspection should occur on esxtop.  Guest tools may be a third option for memory, CPU, and network but vscsiStats should always be consulted if the performance problem persists.</p>
<p>VMware&#8217;s growing array of performance management tools will change this flow somewhat.  AppSpeed, for instance, adds the ability to make very educated guesses about resource bottlenecks based on inside information into the application execution.  Hyperic can provide in-guest process visibility and Ionix ADM will map application interdependencies to focus the investigation.  But, I will abstain from providing best practices on these tools until I have used them more.  In all cases, however, the fundamental relationship of &#8220;easy first, precise later&#8221; remains.</p>
<p>VMware continues to work towards integrating all of these tools into a single view within the vSphere client.  I expect that integration will improve the success rate of the performance layman in troubleshooting these problems.  But I am sure that even into the distant future performance people will find their jobs secure.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/05/10/performance-troubleshooting-made-simple/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>How Many Virtual CPUs Per VM?</title>
		<link>http://vpivot.com/2010/04/30/how-many-virtual-cpus-per-vm/</link>
		<comments>http://vpivot.com/2010/04/30/how-many-virtual-cpus-per-vm/#comments</comments>
		<pubDate>Fri, 30 Apr 2010 04:22:42 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[scheduler]]></category>
		<category><![CDATA[vcenter]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=403</guid>
		<description><![CDATA[Virtual machine sizing is a tricky issue for many VMware administrators. It is important to find the right number of virtual CPUs to maximize application performance and minimize wasted CPU cycles. The optimal number of vCPUs can never be easily identified. But I can offer a few suggestions to help get this number right. ESX [...]]]></description>
			<content:encoded><![CDATA[<p>Virtual machine sizing is a tricky issue for many VMware administrators.  It is important to find the right number of virtual CPUs to maximize application performance and minimize wasted CPU cycles.  The optimal number of vCPUs can never be easily identified.  But I can offer a few suggestions to help get this number right.</p>
<p><span id="more-403"></span><br />
ESX must expend CPU cycles to maintain running virtual CPUs whether they are being used by an application or not.  This means that host efficiency drops as more vCPUs are put on the server.  But applications that scale well with CPUs will deliver greater performance when their virtual machines have been given more CPUs.  The administrator must therefore balance the desires of an individual application&#8217;s owner with the needs of the entire cluster of applications.</p>
<p>There are several resources that VI administrators can use to inform their decisions in virtual machine sizing.  I have listed some of them below.</p>
<h2>Bruce Herndon&#8217;s Cost-of-SMP Article</h2>
<p>Last summer the VMmark team&#8217;s Bruce Herndon published <a href="http://blogs.vmware.com/performance/2009/06/measuring-the-cost-of-smp-with-mixed-workloads.html">an article on the cost of SMP</a>.  I summarized his findings in <a href="http://vpivot.com/2009/09/29/four-things-you-should-know-about-esx-4s-scheduler/">a vPivot article I wrote on the ESX 4 scheduler</a>.  There are two key messages that you can take away from these posts to inform your decisions on virtual machine sizing:</p>
<ul>
<li>Over-sized virtual machines only hurt system performance when the server&#8217;s CPUs are saturated.  When utilization is low, unneeded vCPUs only penalize the system&#8217;s CPU utilization, not the applications&#8217; performance.</li>
<li>Unneeded 2-way virtual machines are not very harmful to the environment.  But administrators should be very careful with 4-way virtual machines and larger.</li>
</ul>
<h2>Co-stop and Ready Time</h2>
<p>Ready time indicates a vCPU waiting for an available core when it has work to perform.  Co-scheduling stop time (or co-stop time) indicates a vCPU being paused by the scheduler to allow its sibling vCPUs to catch up.  These two counters can help administrators recognize a certain kind of stress due to limited CPU resources.</p>
<p>Ready time is generally a sign of the unavailability of CPU.  Correction usually requires the administrator reducing work on the host (migrating virtual machines, decreasing vCPU count, etc.) or increasing CPU capacity (more hosts or faster CPUs).  Co-stop time is a sign that the scheduler is allowing vCPUs to develop skew while it runs portions of virtual machines on available cores.  Considerable numbers for these counters are 10% ready time and 3% co-stop time.  There is no guarantee that application performance is suffering if these thresholds are crossed, but a problem may be present.</p>
<p>The important thing about ready time and co-stop time is that they are signs that you are using all of the CPU you have available to you.  This could be a Good Thing.  But it could also be a surprise to you.  When these counters get high it is a good time to start asking yourself if your capacity usage meets your expectations.  If not, you should inspect your virtual machines to be sure that the applications are using the vCPUs you have given them.  If your guest tools show poor in-guest utilization then decrease those VM sizes.  That will free up resources in the cluster for more virtual machines.</p>
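<p>The two thresholds mentioned above (10% ready time and 3% co-stop time) are simple enough to check in a script.  This Python sketch is only an illustration: the function name is hypothetical, and it expects you to supply the percentages you read from esxtop or vCenter.</p>

```python
READY_THRESHOLD_PCT = 10.0   # ready time worth a closer look
COSTOP_THRESHOLD_PCT = 3.0   # co-stop time worth a closer look


def cpu_pressure_flags(ready_pct, costop_pct):
    """List which counters crossed the thresholds suggested in the text."""
    flags = []
    if ready_pct >= READY_THRESHOLD_PCT:
        flags.append("ready")
    if costop_pct >= COSTOP_THRESHOLD_PCT:
        flags.append("co-stop")
    return flags


print(cpu_pressure_flags(ready_pct=12.0, costop_pct=1.0))
print(cpu_pressure_flags(ready_pct=4.0, costop_pct=0.5))
```

<p>A non-empty result is a prompt to compare usage against your expectations.  It is not proof that application performance is suffering.</p>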
<h2>Application Scalability Information</h2>
<p>I wish we lived in a world where every ISV published data showing their applications&#8217; abilities to scale with cores.  Unfortunately for us, many software vendors have for years allowed their customers to assume that each doubling of cores would double the performance of the application.  VMware has chosen to provide some scalability information so our customers know <a href="http://www.vmware.com/pdf/Perf_ESX40_Oracle-eval.pdf">how well</a> or <a href="http://www.vmware.com/files/pdf/consolidating_webapps_vi3_wp.pdf">how poorly</a> applications scale.  But every customer of a software company deserves to have the vendor provide guidance on sizing the server.  And those vendors deserve the right to put these results out on their own products.  Go talk to your ISV to get the information you need to size your virtual machines.</p>
<h2>CPU Usage Calculations and CapacityIQ</h2>
<p>I am belatedly updating this post with a fourth way of identifying oversized virtual machines: mathematical calculation or CapacityIQ.</p>
<p>When a virtual machine consistently uses only a fraction of its vCPU resources it is possible that the virtual machine can be downsized and still deliver the same application performance.  The calculation to determine this is simple: multiply the vCPU count by utilization and round up.  Set the virtual machine&#8217;s vCPU count to the result of that calculation.</p>
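<p>For example, here is that calculation as a minimal Python sketch.  The function name is hypothetical, utilization is expressed as a fraction between 0 and 1, and the floor of one vCPU is my own addition so that an idle virtual machine never gets a result of zero.</p>

```python
import math


def recommended_vcpus(current_vcpus, avg_utilization):
    """Multiply vCPU count by utilization and round up, with a floor of one."""
    return max(1, math.ceil(current_vcpus * avg_utilization))


# A 4-way VM that averages 30% utilization needs 1.2 vCPUs, so round up to 2.
print(recommended_vcpus(4, 0.30))
```

<p>Use utilization measured over a representative period that includes the application&#8217;s peaks, or the result may undersize the virtual machine.</p>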
<p>If you own CapacityIQ it will make this calculation for you for every virtual machine in your data center.  Here is a screenshot of its recommendations based on virtual machine CPU and memory utilization.  Click for a clearer picture.</p>
<div id="attachment_512" class="wp-caption alignnone" style="width: 310px"><a href="http://vpivot.com/wp-content/uploads/2010/04/capiq_vm_size_recs.png"><img src="http://vpivot.com/wp-content/uploads/2010/04/capiq_vm_size_recs-300x102.png" alt="" title="Capacity IQ Recommending VM Resize" width="300" class="size-medium wp-image-512" /></a><p class="wp-caption-text">CapacityIQ monitors CPU and memory utilization to recommend VM downsizing.</p></div>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/04/30/how-many-virtual-cpus-per-vm/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Optimizing Memory Utilization</title>
		<link>http://vpivot.com/2010/01/06/optimizing-memory-utilization/</link>
		<comments>http://vpivot.com/2010/01/06/optimizing-memory-utilization/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 21:52:45 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[swap]]></category>
		<category><![CDATA[vcenter]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=198</guid>
		<description><![CDATA[My recent series of blog articles has discussed ESX memory management and the performance specter of host swapping. My last article attempts to correct the misconception that VMware recommends against over-committing memory.  In that article I suggested that memory over-commit is a requirement in optimizing memory utilization. Today I want to provide a specific example to [...]]]></description>
			<content:encoded><![CDATA[<p>My recent series of blog articles has discussed ESX memory management and the performance specter of host swapping.  My last article attempts to <a href="http://vpivot.com/2010/01/04/misunderstanding-memory-management/">correct the misconception that VMware recommends against over-committing memory</a>.   In that article I suggested that memory over-commit is a requirement in optimizing memory utilization. Today I want to provide a specific example to show why this is true.   I have also included tips for identifying host swapping in your environments.<br />
<span id="more-198"></span></p>
<h2>Understanding the Bottleneck</h2>
<p>Let me show the value of over-commit and danger of swapping by way of an example.  I will choose the following typical values to demonstrate my point:</p>
<ul>
<li>All virtual machines are on a single host which has <strong>32 GB of RAM</strong> installed.</li>
<li>Each virtual machine is sized to <strong>8 GB of RAM</strong>.</li>
<li>Each virtual machine has <strong>25% active memory</strong> (%ACTV in esxtop and &#8220;Active&#8221; in vCenter).</li>
</ul>
<table id="newspaper-a">
<tbody>
<tr>
<th>VM Count</th>
<th>Active Memory in Host</th>
<th>Comments</th>
</tr>
<tr>
<td>3</td>
<td>3 * 8 GB * 25% = <strong>6 GB</strong></td>
<td>Without memory over-commit, <em>only about 19% of the host&#8217;s memory is actively in use</em>.   What a waste!</td>
</tr>
<tr>
<td>12</td>
<td>12 * 8 GB * 25% = <strong>24 GB</strong></td>
<td>Memory is over-committed by 200% but only 75% is actively being used.  In this aggressive consolidation <em>virtual machines will run at full speed</em> until usage exceeds 100% of host memory.</td>
</tr>
<tr>
<td>18</td>
<td>18 * 8 GB * 25% = <strong>36 GB</strong>, limited to <strong>32 GB</strong> by host</td>
<td>These virtual machines want 36 GB of RAM but are limited to the 32 GB that is installed on the host.  ESX must swap to allow these machines to run and <em>performance will suffer greatly</em>.</td>
</tr>
</tbody>
</table>
<p>A virtual machine&#8217;s active memory is dictated by the application and its usage.  But the VI admin has complete control over the number of virtual machines in the environment which means host active memory can be influenced by adding or removing virtual machines.  Because virtual machine active memory is always equal to or less than 100% the only way to drive the host active memory to 100% is to over-commit memory.   <em>This is why hypervisors that do not support memory over-commit are simply not viable for data centers where memory optimization is a priority.</em></p>
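<p>The table&#8217;s arithmetic can be reproduced with a short Python sketch.  The function and field names are hypothetical, the defaults are the example values listed above, and treating &#8220;active memory exceeds installed RAM&#8221; as the point where swapping begins is a deliberate simplification of what ESX actually does.</p>

```python
def host_memory_summary(vm_count, vm_size_gb=8, active_fraction=0.25, host_ram_gb=32):
    """Summarize over-commit and active memory for identical VMs on one host."""
    configured_gb = vm_count * vm_size_gb
    active_gb = configured_gb * active_fraction
    return {
        # Configured memory beyond installed RAM, as a percentage of installed RAM.
        "overcommit_pct": max(0.0, (configured_gb - host_ram_gb) / host_ram_gb * 100),
        "active_gb": active_gb,
        "active_pct_of_host": active_gb / host_ram_gb * 100,
        # Simplification: swapping is assumed once active memory exceeds RAM.
        "swapping_expected": active_gb > host_ram_gb,
    }


for count in (3, 12, 18):
    print(count, host_memory_summary(count))
```

<p>Changing the active fraction shows how sensitive the safe consolidation ratio is to what the applications really touch.</p>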
<h2>Identifying and Correcting the Bottleneck</h2>
<p>The ongoing occurrence of swapping is identified by a non-zero swap rate in either esxtop or vCenter.  In addition to swap rate, esxtop provides a swap wait time in its CPU panel.  When swap rate exceeds hundreds of kilobytes per second or swap wait time exceeds a couple percentage points, it is time for corrective action.</p>
<p>There are three possible solutions to this problem:</p>
<ol>
<li>Balance the virtual machines&#8217; memory usage by moving virtual machines from hosts with higher amounts of memory usage to hosts with lower amounts of memory usage.</li>
<li>Run fewer virtual machines.</li>
<li>Buy more memory.</li>
</ol>
<h2>Designing Your Infrastructure to Simplify Memory Management</h2>
<p>Ultimately I owe you a full white paper on memory management to provide a sufficient answer.  But I want to give you two ideas of the tools and techniques that I will be describing in this future paper.  First, place <a href="http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/">host swap files on solid state disk (SSD) stores</a> to improve their performance.  With the right SSD device it may be possible to eliminate swap penalties.  Second, even if SSDs are unavailable consider consolidating multiple swap files onto a single store.  This will make swap rate monitoring very easy but may compound the performance penalties of swapping.</p>
<p>Stay tuned and VMware will provide more documentation on memory management in 2010.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/01/06/optimizing-memory-utilization/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
	</channel>
</rss>

