<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>vPivot &#187; memory</title>
	<atom:link href="http://vpivot.com/tag/memory/feed/" rel="self" type="application/rss+xml" />
	<link>http://vpivot.com</link>
	<description>Scott Drummonds on Virtualization</description>
	<lastBuildDate>Wed, 01 Feb 2012 06:46:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Maximum Hosts Per Cluster</title>
		<link>http://vpivot.com/2010/11/29/maximum-hosts-per-cluster/</link>
		<comments>http://vpivot.com/2010/11/29/maximum-hosts-per-cluster/#comments</comments>
		<pubDate>Mon, 29 Nov 2010 02:08:20 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[drs]]></category>
		<category><![CDATA[ha]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[vcenter]]></category>
		<category><![CDATA[vforum]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=702</guid>
		<description><![CDATA[I just returned from a one week vacation to a warm sunny beach on a small island not too far from Singapore.  Even on my vacations my conversations often migrate to technology and my travel mate is an old friend and current employee at VMware, Dave Korsunsky.  Sitting by a pool with a cocktail in [...]]]></description>
			<content:encoded><![CDATA[<p>I just returned from a one-week vacation to a warm sunny beach on a small island not too far from Singapore.  Even on my vacations my conversations often migrate to technology and my travel mate is an old friend and current employee at VMware, <a href="http://twitter.com/#!/VMW_Dave">Dave Korsunsky</a>.  Sitting by a pool with a cocktail in hand at a fantastic hotel I asked my friend, &#8220;what is the right number of hosts per DRS/HA cluster?&#8221;  Great conversation for a vacation, right?</p>
<p><span id="more-702"></span>I started thinking about this topic at Sydney&#8217;s vForum a month ago.  VMware&#8217;s Dan Anderson suggested that designs that implemented maximum cluster sizes (32 hosts per cluster) were the result of misguided reasoning.  Dan insisted that clusters need never be larger than eight hosts per cluster.  And on this subject we bantered for a few minutes.  Dan convinced me that there are few compelling reasons to implement large clusters.  And we could think of many reasons to avoid them.  I do not think it easy to assign one number as the &#8220;right&#8221; cluster size.  But there are many principles that suggest small to medium-sized clusters are the better choice.</p>
<p>First, the argument for the largest clusters: DRS efficiency.  This was my primary claim in favor of 32-host clusters.  My reasoning is simple: with more hosts in the cluster there are more CPU and memory resource holes into which DRS can place running virtual machines to optimize the cluster&#8217;s performance.  The more hosts, the more options to the scheduler.</p>
<p>But in retrospect I think this is a weak argument.  It is not backed by data and in practice I cannot imagine a 16-host cluster being much more efficient than an eight-host cluster.  Once vCenter is managing hundreds or more virtual machines per cluster, it has an astronomical number of combinations for VM placement.  So, doubling the host count (and the virtual machine count) should have little impact on cluster efficiency.</p>
<p>More importantly, with respect to the efficiency argument, maximum CPU and memory utilization will be bound either by the failover capacity or the target utilization, which is usually about 80%.  With 20% reserved for resource spikes, the failover capacity is equal to the reserved resources in a 4+1 HA cluster.  In any cluster larger than this, the failover capacity is less than 20%.  This means that only target utilization bounds resource efficiency.</p>
<p>The efficiency calculation is a little more tricky if you want to size your cluster for target resource utilization <em>after</em> a host failure.  In this case each additional host provides some incremental value to the cluster&#8217;s utilization.  To size a 4+1 cluster to 80% utilization after host failure, you will want to restrict CPU usage in the five hosts to 64%.  Going to a 5+1 cluster results in a pre-failure CPU utilization target of about 67%.  The increases slowly approach 80% as the clusters get larger and larger.  But you can see that the incremental resource utilization improvement is less than three percentage points for that first added host and shrinks with every host after it.  So, growing a cluster slightly provides very little value in terms of resource utilization.</p>
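<p>The arithmetic behind those numbers can be sketched in a few lines of Python.  This is only an illustration of the reasoning above: the function name and the 80% default are my own choices, and it assumes identical hosts and a single host failure.</p>

```python
def pre_failure_target(active_hosts, post_failure_target=0.80):
    """Utilization to hold an N+1 cluster to before a failure.

    Assumes identical hosts and one failed host: the load carried by
    N+1 hosts must fit on N hosts at the post-failure target.
    """
    total_hosts = active_hosts + 1
    return post_failure_target * active_hosts / total_hosts


# Print the pre-failure target for 4+1 through 8+1 clusters.
for n in range(4, 9):
    print(f"{n}+1 cluster: {pre_failure_target(n):.1%} before failure")
```

<p>Each added host moves the target a little closer to 80%, and each step is smaller than the one before it.</p>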
<p>Now why might you want to keep a cluster small?  I can think of a few reasons.</p>
<p>It is generally wise to avoid mixing different classes of servers in a single pool.  DRS does not make scheduling decisions based on the performance characteristics of the server so a new, powerful server in a cluster is just as likely to receive a mission-critical virtual machine as an older, slower host.  This would be unfortunate if a cluster contained servers with radically different&#8211;although EVC compatible&#8211;CPUs like the Intel Xeon 5400 and Xeon 5500 series.  In the former case ESX would be using its software memory management unit which could perform as much as 40% worse than the hardware MMU in the Xeon 5500.</p>
<p>(I will momentarily digress to answer a question I often get in my performance talks: what is the impact of Enhanced vMotion Compatibility (EVC) on virtual machine performance?  Briefly: very little to none.  The instructions that are disabled on newer processors only benefit applications that were compiled to use those new instructions.  Those applications are rare in the enterprise space.)</p>
<p>Given my recommendation that servers in a cluster should be of a similar class of performance, you will soon find that your purchasing patterns will influence your cluster size.  If you are one of the few people lucky enough to work at a company that is buying servers by the truckload, you can size your clusters however you want.  But the vast majority of VMware&#8217;s customers make smaller purchases of anywhere from four to 16 servers at a time.  These will make nice, homogenous clusters of moderate size.</p>
<p>One more argument Dave offered for keeping clusters small is to use clusters for logical separation of applications of different classes.  By putting your mission-critical applications in a cluster of their own your &#8220;server huggers&#8221; will sleep better at night.  They will be able to keep one eye on the iron that can make or break their job.  In my opinion, using physical separation in a virtual world is resisting the complete cloud and hardware independent virtualization that we are all striving for.  But I cannot begrudge an administrator who wants to hold onto some semblance of physical hardware best practices while traveling the multi-year journey to the private cloud.</p>
<p>Another of Dan&#8217;s arguments against large clusters is the cumbersome nature of their change control.  Clusters have to be managed to a consistent state and the complexity of this process is dependent on the number of items being managed.  A very large cluster will present unique challenges when managing change.</p>
<p>So, have I given a recommendation?  I am not sure.  If anything I feel that Dave, Dan and I believe that a minimum cluster size should be set to guarantee that the CPU utilization target, and not the HA failover capacity, defines the amount of wasted resources.  This means a minimum cluster of something like four or five hosts.  While none of us claims a specific problem that will occur with very large clusters, we cannot imagine the value of a 32-host cluster.  So, we think the right cluster size is somewhere shy of 10.</p>
<p>I am quite interested to hear your thoughts on this.  Perhaps the best guidance will grow out of the crucible of debate.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/11/29/maximum-hosts-per-cluster/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>Optimizing vSphere for Hyper-threading</title>
		<link>http://vpivot.com/2010/09/13/optimizing-vsphere-for-hyper-threading/</link>
		<comments>http://vpivot.com/2010/09/13/optimizing-vsphere-for-hyper-threading/#comments</comments>
		<pubDate>Mon, 13 Sep 2010 14:23:57 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[hyper-threading]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[numa]]></category>
		<category><![CDATA[scheduler]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=647</guid>
		<description><![CDATA[VMware&#8217;s Jeff Buell has been looking into High Performance Computing (HPC) in support of a new addition to the office of the CTO.  Jeff just posted an article on VROOM! showing outstanding memory bandwidth in vSphere virtual machines.  No one should be surprised by this&#8211;virtual machine memory bandwidth has rarely been a problem.  But Jeff [...]]]></description>
			<content:encoded><![CDATA[<p>VMware&#8217;s Jeff Buell has been looking into High Performance Computing (HPC) in support of <a href="http://communities.vmware.com/community/cto/high-performance">a new addition to the office of the CTO</a>.  Jeff just posted <a href="http://blogs.vmware.com/performance/2010/09/hpc-application-performance-on-esx-41-stream.html">an article on VROOM! showing outstanding memory bandwidth</a> in vSphere virtual machines.  No one should be surprised by this&#8211;virtual machine memory bandwidth has rarely been a problem.  But Jeff did discuss an advanced configuration parameter that should pique everyone&#8217;s curiosity: NUMA.preferHT.</p>
<p><span id="more-647"></span>Hyper-threading presents an interesting dilemma to any software running on Nehalem-based processors.  For some multithreaded workloads, an operating system scheduler can spread threads across multiple NUMA nodes or co-locate them to a single node.  Consider the following figure, which depicts a single 8-way virtual machine being scheduled to all of the eight physical cores on a server.</p>
<div id="attachment_664" class="wp-caption aligncenter" style="width: 282px"><a href="http://vpivot.com/wp-content/uploads/2010/09/Screen-shot-2010-09-17-at-2.12.25-PM.png"><img class="size-full wp-image-664" title="8-way Virtual Machine Using Eight Cores" src="http://vpivot.com/wp-content/uploads/2010/09/Screen-shot-2010-09-17-at-2.12.25-PM.png" alt="" width="272" height="287" /></a><p class="wp-caption-text">This figure depicts the eight vCPUs of a single virtual machine being spread across two NUMA nodes&#39; eight cores.</p></div>
<p>In this case the threads (vCPUs for vSphere) are each given their own physical core.  The benefit is that the vCPUs get unfettered access to their physical cores and the resulting additional computational power.  The drawback is that common memory is remote for half the vCPUs and will have to go through the other NUMA node.  This means memory-intensive workloads might run slower.</p>
<div id="attachment_665" class="wp-caption aligncenter" style="width: 366px"><a href="http://vpivot.com/wp-content/uploads/2010/09/Screen-shot-2010-09-17-at-2.12.35-PM.png"><img class="size-full wp-image-665" title="8-way Virtual Machine Using Four Cores" src="http://vpivot.com/wp-content/uploads/2010/09/Screen-shot-2010-09-17-at-2.12.35-PM.png" alt="" width="356" height="286" /></a><p class="wp-caption-text">This figure depicts the eight vCPUs of a single virtual machine being consolidated to one NUMA node&#39;s four cores.</p></div>
<p>This second configuration places the same virtual machine&#8217;s eight vCPUs on a single NUMA node.  This means physical cores are shared but all memory access is local.  The vCPUs are contending for fewer CPU cycles, although they are benefiting from Hyper-threading.  This will result in less computational power than dedicated physical cores.  On the other hand, assuming the virtual machine was sized to fit in a single node, 100% of memory access will go to fast, local memory.  This could produce better performance for memory intensive workloads.</p>
<p>vSphere will prefer to spread virtual CPUs across NUMA nodes (option one above) to gain the benefit of more physical cores.  But if you are running an application where memory throughput is more important than processor speed, you should consider testing a change to vSphere&#8217;s default behavior.  You can do this by setting the ESX 4.1 advanced parameter NUMA.preferHT to 1.  This will configure the scheduler to prefer consolidating threads on logical processors on a single NUMA node instead of using more physical cores across multiple nodes.</p>
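<p>To make the two placements concrete, here is a toy Python model of the choice.  It is not the ESX scheduler: the function, its parameters and the fill-one-node-at-a-time rule are all hypothetical, chosen only to reproduce the two figures above for an 8-way virtual machine on a host with two four-core, Hyper-threaded NUMA nodes.</p>

```python
def place_vcpus(vcpus, nodes, cores_per_node, prefer_ht=False, threads_per_core=2):
    """Return the vCPU count per NUMA node in a toy placement model.

    Hypothetical illustration only. With prefer_ht=False a node holds one
    vCPU per physical core; with prefer_ht=True it holds one vCPU per
    logical processor, so more vCPUs stay on the first node.
    """
    if prefer_ht:
        capacity = cores_per_node * threads_per_core
    else:
        capacity = cores_per_node
    placement = []
    remaining = vcpus
    for _ in range(nodes):
        used = min(remaining, capacity)
        placement.append(used)
        remaining -= used
    if remaining:
        raise ValueError("VM does not fit on this host in the toy model")
    return placement


# Default behavior: one vCPU per physical core, spread over both nodes.
print(place_vcpus(8, nodes=2, cores_per_node=4))
# With the preference for Hyper-threading: everything on one node.
print(place_vcpus(8, nodes=2, cores_per_node=4, prefer_ht=True))
```

<p>The first call returns four vCPUs on each node and the second returns all eight on one node, which is the trade between dedicated cores and local memory described above.</p>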
<p>It would be nice if VMware provided definitive guidance on when virtual machines should be configured to prefer more physical cores (the default setting) or local memory access (NUMA.preferHT=1).  But this guidance would be dependent on application, CPU, virtual machine size, consolidation ratios and utilization.  The complexity of this guidance likely means that we will not see an authoritative word on this any time soon.  But that does not stop you from experimenting on your own and sharing results.  I would love to see any results of experiments posted here.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/09/13/optimizing-vsphere-for-hyper-threading/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Designing VMs with Performance SLAs</title>
		<link>http://vpivot.com/2010/08/09/designing-vms-with-performance-slas/</link>
		<comments>http://vpivot.com/2010/08/09/designing-vms-with-performance-slas/#comments</comments>
		<pubDate>Mon, 09 Aug 2010 13:56:50 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[netioc]]></category>
		<category><![CDATA[sioc]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=614</guid>
		<description><![CDATA[Consolidation amplifies the uncertainty of application performance. Still, VI administrators need a means of guaranteeing performance SLAs to their applications&#8217; users. But the best VMware has been able to offer are resource controls, which are at best an indirect mechanism for sustaining application performance. With the acquisition of B-hive, now AppSpeed, VMware moved a step [...]]]></description>
			<content:encoded><![CDATA[<p>Consolidation amplifies the uncertainty of application performance.  Still, VI administrators need a means of guaranteeing performance SLAs to their applications&#8217; users.  But the best VMware has been able to offer are resource controls, which are at best an indirect mechanism for sustaining application performance.  With the acquisition of B-hive, now AppSpeed, VMware moved a step closer to allowing VI administrators to guarantee a performance SLA.  As an application-aware latency measurement tool, AppSpeed may eventually provide feedback to vCenter to guarantee throughput levels.  But it does not today.  So how are VI administrators to guarantee application performance?</p>
<p><span id="more-614"></span>It was during discussions with advanced VMware customers in Melbourne that a solution to this problem occurred to me.  I have reasoned it through and I think it holds water.  I have socialized it with more customers and my colleagues and we think it stands.  So I want to introduce a system for implementing virtual machines with a better assurance of a performance SLA.</p>
<p>The key to this process is that minimum performance can be measured using limits and that performance can be assured using reservations.  You can develop and document virtual machines with performance SLAs using the following procedure:</p>
<ul>
<li>First, as always, define a small number of strictly-sized virtual machines to be used by all applications in your environment.  Often these look something like small VMs of 1 vCPU and 4 GB RAM, medium VMs of 2 vCPUs and 8 GB of RAM, and large VMs of 4 vCPUs and 16 GB of RAM.  Tune these numbers for your environment, as needed.</li>
<li>For any application, benchmark its maximum performance against each of these virtual machine configurations on an unloaded system.  Choose an ISV-supplied benchmark or a well-known third party tool.  This sets your high water mark for throughput for each application in its virtual machine.</li>
<li>For each configuration, set a CPU limit at 50% of the available CPU and a memory limit of 50% of the available memory.  Retest the application against this smaller, limited configuration.</li>
<li>During the applications&#8217; deployment, change the limits to reservations.  That is, remove limits and set reservations equal to the limits&#8217; previous values, in this case 50%.</li>
<li>Your application now has a maximum performance defined in bullet two, and a &#8220;guaranteed&#8221; performance measured in bullet three.  This is your application&#8217;s performance SLA.</li>
</ul>
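<p>The bookkeeping in this procedure can be sketched in Python.  Everything here is illustrative: the class and function names are made up, the three sizes are the examples from the first bullet, and the 2600 MHz per-core clock is an assumption you would replace with your own hosts&#8217; value.</p>

```python
from dataclasses import dataclass


@dataclass
class VmSize:
    name: str
    vcpus: int
    memory_gb: int


# Assumed per-core clock speed; substitute the value for your hosts.
CORE_MHZ = 2600

SIZES = [VmSize("small", 1, 4), VmSize("medium", 2, 8), VmSize("large", 4, 16)]


def sla_settings(size, fraction=0.5):
    """Limits for the retest phase and the equal reservations for deployment."""
    cpu_mhz = int(size.vcpus * CORE_MHZ * fraction)
    mem_mb = int(size.memory_gb * 1024 * fraction)
    return {
        "test_phase": {"cpu_limit_mhz": cpu_mhz, "mem_limit_mb": mem_mb},
        "deployment": {"cpu_reservation_mhz": cpu_mhz, "mem_reservation_mb": mem_mb},
    }


for size in SIZES:
    print(size.name, sla_settings(size))
```

<p>The point of the structure is that the same two numbers appear twice: first as limits while you measure, then as reservations once you deploy.</p>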
<p>The concept is simple: limits can be used to measure the performance of an application in the presence of that degree of contention.  Reservations ensure that those resource amounts are always present.  Here are some notes on this process:</p>
<ul>
<li>This is not a true guarantee since network and storage throughput may drop.  No tool can eliminate this risk entirely but <a href="http://vpivot.com/2010/05/04/storage-io-control/">SIOC</a> and <a href="http://www.vmware.com/resources/techresources/10119">NetIOC</a> can reduce the risk of a network- or storage-induced performance failure.</li>
<li>The memory test is going to be highly dependent on the working set created by your load generation tool.  Your mileage will vary depending on your application owners&#8217; use of the virtual machine.</li>
<li>vCenter will guarantee that the reservations are always available through a process called admission control, which checks the cluster to ensure that enough CPU or memory is available to run the virtual machine immediately and in the event of a server failure.</li>
</ul>
<p>As I said above, this is not a true guarantee of application performance.  But it is as close as we can get until AppSpeed or a replacement evolves into universal application latency measurement that is fed into vCenter.  And this is another in a growing list of reasons  why <a href="http://vpivot.com/2010/03/31/memory-reservations-drive-over-commit/">CPU and memory reservations should be part of all VMware deployments</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/08/09/designing-vms-with-performance-slas/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>vSphere 4.1: Performance Improvements</title>
		<link>http://vpivot.com/2010/07/22/vsphere-4-1-performance-improvements/</link>
		<comments>http://vpivot.com/2010/07/22/vsphere-4-1-performance-improvements/#comments</comments>
		<pubDate>Thu, 22 Jul 2010 03:28:58 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[netioc]]></category>
		<category><![CDATA[network]]></category>
		<category><![CDATA[numa]]></category>
		<category><![CDATA[sioc]]></category>
		<category><![CDATA[vmotion]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=605</guid>
		<description><![CDATA[Last week I took my first vacation in a year and a half.  I had not missed a single day of work in 18 months.  So last week, when I was galavanting through Spain and running terrified, screaming, and covered in sangria through the streets of Pamplona, VMware made its biggest announcement in over a [...]]]></description>
			<content:encoded><![CDATA[<p>Last week I took my first vacation in a year and a half.  I had not missed a single day of work in 18 months.  So last week, when I was gallivanting through Spain and <a href="http://www.e-scott.net/blog/?p=332">running terrified, screaming, and covered in sangria through the streets of Pamplona</a>, VMware made its biggest announcement in over a year: <a href="http://www.vmware.com/company/news/releases/vsphere-4-1.html">the launch of vSphere 4.1</a>.  My old team put out what looks to be a wonderful &#8220;<a href="http://www.vmware.com/resources/techresources/10116">What&#8217;s New in Performance</a>&#8221; paper so I want to take a few minutes to add my thoughts to some of the great work VMware has done.</p>
<p><span id="more-605"></span>Calling attention to a subset of the performance features in this launch, I will augment the published documentation with my own comments.</p>
<h2>Wide VM NUMA Support</h2>
<p>A &#8220;wide VM&#8221; is defined by VMware as a virtual machine whose memory is too large for a single NUMA node.  In this case, some of the memory must be placed on a remote node, which has a relatively higher memory latency.  ESX 4.0 would place as much memory as possible on a single node, then arbitrarily spill the rest over to other nodes.  ESX 4.1 now recognizes memory locality of reference and places frequently accessed memory on the local node, potentially eliminating remote memory access penalties.  Expect big gains with wide virtual machines running Java or very active databases.</p>
<h2>Memory Compression</h2>
<p>When I wrote on <a href="http://vpivot.com/2010/03/01/memory-compression/">Steve Herrod&#8217;s preview of memory compression at PEX</a>, I am sure you knew this feature&#8217;s release was imminent.  VMware&#8217;s documentation is sufficient on the gains provided by this feature, so I will not repeat those gains here.  The key thing to remember about memory compression is that it greatly reduces the need for swap.  For years VMware administrators have feared the spectre of memory swapping and have left memory woefully underutilized, even in consolidated environments.  With memory compression in place, you should more confidently push active memory closer to 100%.</p>
<h2>Storage IO Control (SIOC)</h2>
<p>Before the new VMware documentation, you had <a href="http://vpivot.com/2010/05/04/storage-io-control/">my article on SIOC</a> and a <a href="http://www.youtube.com/watch?v=5GN5f1u7pcc">delightful video</a> previewing the feature to whet your appetites.  Now that the feature is out, I want to repeat the moral of the story: SIOC will save your high priority applications in the event of storage contention.  But if your storage performance stinks before SIOC, it will continue to stink with SIOC.  SIOC just buys you time for your mission critical applications so you can correct that storage problem.</p>
<h2>Faster vMotion</h2>
<p>(I&#8217;ll take a break here and point out the change of spelling from VMotion to vMotion.  This innocuous change will surely be missed by the large numbers of people that misspell VMWare [sic].  In truth, the case of vMotion is not particularly critical, but those of you grammatical pedants like myself take note.)</p>
<p>Many customers, already happy with vMotion, will scratch their heads as to what is left to be improved in this feature.  But a large number of you have tried evacuating 100 virtual machines from a host.  At two virtual machines at a time, this evacuation would have taken tens of minutes.  VMware was not limiting the vMotion concurrency for no good reason; they wanted to guarantee 100% correctness.  Careful evaluation, experimentation, and critical code improvements allowed the vMotion engineering team to greatly improve the efficiency of a migration in vSphere 4.1.  The result is that virtual machines use the vMotion network more efficiently, which means VMware can qualify and support more virtual machines being concurrently migrated.</p>
<p>Part of this efficiency change included a decrease in the virtual machine switchover time, during which the application is unresponsive.  In every production environment I have seen, this switchover time was quite small, resulting in no application downtime.  But as processor performance and memory access time improved, and with vMotion efficiency remaining flat, eventually pages would be touched faster than vMotion could migrate them.  This would result in vMotion failures.</p>
<p>The new vMotion efficiency improvements have dropped application switchover times to minuscule levels, guaranteeing zero application downtime for many years to come.</p>
<h2>Network IO Control</h2>
<p>Missing from the recently published performance document is an overview on Network IO Control (NetIOC).  In truth, I may be responsible for its lack of inclusion.  Apologies.  But luckily performance engineering released a <a href="https://docs.google.com/viewer?url=http://www.vmware.com/files/pdf/techpaper/VMW_Netioc_BestPractices.pdf">best practices document</a> on this wonderful new feature.</p>
<p>NetIOC is the network version of SIOC and may be even more important than SIOC in 10 Gb network environments and infrastructure using converged network adapters.  Let us be honest: the best practice of giving dedicated network hardware to each vSphere network traffic stream is so 2007.  It&#8217;s time to consolidate network and put everything on fewer 10 Gb adapters.  But this is going to create occasional network contention that would benefit from the same resource prioritization that CPU and memory shares have provided for years.</p>
<p>NetIOC will help prioritize your network streams in such an environment.  Converge your networks and investigate NetIOC.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/07/22/vsphere-4-1-performance-improvements/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Hyper-V&#039;s Lack of Memory Over-commit</title>
		<link>http://vpivot.com/2010/04/01/hyper-vs-lack-of-memory-over-commit/</link>
		<comments>http://vpivot.com/2010/04/01/hyper-vs-lack-of-memory-over-commit/#comments</comments>
		<pubDate>Thu, 01 Apr 2010 17:52:38 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[hyper-v]]></category>
		<category><![CDATA[memory]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=371</guid>
		<description><![CDATA[I find it interesting that one day after I wrote about memory over-commitment in vSphere, Greg Shields wrote about the lack of memory over-commitment in Hyper-V.  In today&#8217;s short blog entry, I want provide one paragraph that Greg&#8217;s article currently lacks: While memory over-subscription is a critical feature for production environments, balancing the demands of [...]]]></description>
			<content:encoded><![CDATA[<p>I find it interesting that one day after I wrote about <a href="http://vpivot.com/2010/03/31/memory-reservations-drive-over-commit/">memory over-commitment in vSphere</a>, Greg Shields wrote about <a href="http://virtualizationreview.com/articles/2010/04/01/hypervs-missing-feature.aspx">the lack of memory over-commitment in Hyper-V</a>.  In today&#8217;s short blog entry, I want to provide one paragraph that Greg&#8217;s article currently lacks:</p>
<blockquote><p>While memory over-subscription is a critical feature for production environments, balancing the demands of heterogeneous applications of varying demands in a resource-starved environment is difficult.  Without guidance from administrators on the relative importance of the virtual machines running these applications, a hypervisor will be forced to make arbitrary decisions in assigning limited resources.  Effective use of over-commitment requires a sound resource control system.  The only product on the market that does this well is VMware vSphere.</p></blockquote>
<p>Both Greg&#8217;s article and mine talked only of memory over-commitment, but the rules apply for CPU over-commitment, too.  Microsoft will realize how important resource controls are somewhere between year two and five of their product&#8217;s life.  I can only imagine where vSphere will be by then.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/04/01/hyper-vs-lack-of-memory-over-commit/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Memory Reservations Drive Over-commit</title>
		<link>http://vpivot.com/2010/03/31/memory-reservations-drive-over-commit/</link>
		<comments>http://vpivot.com/2010/03/31/memory-reservations-drive-over-commit/#comments</comments>
		<pubDate>Wed, 31 Mar 2010 20:59:50 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[memory]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=361</guid>
		<description><![CDATA[Many of VMware&#8217;s customers use memory reservations during troubleshooting only in a final attempt to fix performance problems. It is true that memory reservations can limit ballooning and host swapping. But if you are only using reservations to anticipate and avoid memory bottlenecks, you are missing one of the great uses of the feature: memory [...]]]></description>
			<content:encoded><![CDATA[<p>Many of VMware&#8217;s customers use memory reservations during troubleshooting only in a final attempt to fix performance problems.  It is true that memory reservations can limit ballooning and host swapping.  But if you are only using reservations to anticipate and avoid memory bottlenecks, you are missing one of the great uses of the feature: memory reservations can drive over-commitment.</p>
<p><span id="more-361"></span>What do I mean by &#8220;drive over-commitment&#8221;?  I mean that, when properly used, memory reservations allow a VI admin to optimally pack virtual machines across a cluster&#8217;s memory.  With properly set reservations, an admin can continue to power on a cluster&#8217;s VMs until vCenter&#8217;s admission control refuses to allow more.  At that point you know that the optimal number of virtual machines is on your hosts.</p>
<p>The first step in using vCenter&#8217;s admission control to drive consolidation is to properly reserve memory.  You do this by summing the minimum memory needs for everything in the virtual machine and setting reservations to that number.  Some thoughts on reservations follow:</p>
<table id="newspaper-a">
<tbody>
<tr>
<th>Memory Consumer</th>
<th>Amount of Memory</th>
<th>Comments</th>
</tr>
<tr>
<td>Operating system</td>
<td>200 MB &#8211; 1 GB</td>
<td>Newer OSes tend to use more than older OSes, Windows tends to use more than Linux.</td>
</tr>
<tr>
<td>Application code (not user data)</td>
<td>50 MB &#8211; 1 GB</td>
<td>This number varies widely from application to application.  Check with your ISV for the application&#8217;s needs.</td>
</tr>
<tr>
<td>Heap (user data)</td>
<td>0 &#8211; 255 GB</td>
<td>I am generally calling the amount of user-specific data collected by a running application the &#8220;heap&#8221;.  This applies to databases, Java applications, web server caches, etc.  You should size this based on your application&#8217;s data set, usually aiming for a heap size of 2-10% of the total data size.  Consult your ISV.</td>
</tr>
</tbody>
</table>
<p>This table will get you started in setting a virtual machine&#8217;s memory.  For example, consider a virtual machine running Windows Server 2003 (500 MB minimum), with SQL Server 2005 (500 MB), running a database that needs at least a 500 MB cache.  This VM&#8217;s reservation should start at 1.5 GB.  But for future growth, small spikes of peak usage, and additional administrator tools, it is entirely possible that an additional 2 GB will be needed.  So, the VM should be sized to at least 3.5 GB with a 1.5 GB reservation.</p>
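<p>The arithmetic in that example can be written down as a small sketch.  The function names here are hypothetical, and I am treating 1 GB as 1,000 MB to keep the numbers as round as they are in the text.</p>

```python
# Hypothetical helpers illustrating the sizing arithmetic from the example.

def starting_reservation_mb(os_mb, app_mb, heap_mb):
    """Starting reservation: the sum of the minimum needs of each memory consumer."""
    return os_mb + app_mb + heap_mb


def vm_size_mb(reservation_mb, headroom_mb):
    """Configured VM size: the reservation plus headroom for growth, spikes, and tools."""
    return reservation_mb + headroom_mb


# Figures from the example: Windows Server 2003, SQL Server 2005, a 500 MB cache,
# and 2 GB (taken as 2,000 MB) of headroom.
reservation = starting_reservation_mb(os_mb=500, app_mb=500, heap_mb=500)
size = vm_size_mb(reservation, headroom_mb=2000)
print(reservation, size)
```

<p>Treat the result only as a starting point.  The inputs are estimates until you have observed the workload.</p>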
<p>Given the wide variation in memory usage from OS to OS, application to application, and instance to instance, I would be lying if I told you a formula worked right without later revision.  The truth is you can only estimate these values before deployment.  You will have to monitor your memory-hungry applications and fine tune reservations over time.</p>
<p>You do this by checking vCenter&#8217;s stats for each VM&#8217;s active memory.  The active memory should be less than your selected reservations except for rare occasions.  So, if active memory is much higher than your reservations, increase reservations.  If your VM is consistently using less memory than you have reserved, reduce the reservations.</p>
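<p>As a rough illustration of that tuning rule, here is a minimal sketch.  The two thresholds are my own assumptions, not VMware guidance: <code>rare_fraction</code> stands in for &#8220;rare occasions&#8221; and <code>slack</code> for &#8220;consistently using less&#8221;.  The samples are assumed to be active memory readings you have collected from vCenter yourself.</p>

```python
def reservation_advice(reservation_mb, active_samples_mb, rare_fraction=0.05, slack=0.25):
    """Suggest a direction for a reservation from observed active memory samples.

    rare_fraction and slack are hypothetical thresholds chosen for illustration.
    """
    # Count how often active memory rose above the reservation.
    over = sum(1 for active in active_samples_mb if active > reservation_mb)
    if over / len(active_samples_mb) > rare_fraction:
        return "increase"
    # If even the peak stays well under the reservation, memory is being held idle.
    if reservation_mb * (1 - slack) > max(active_samples_mb):
        return "reduce"
    return "keep"


print(reservation_advice(1500, [1600, 1700, 1400, 1800]))
print(reservation_advice(1500, [600, 700, 650]))
print(reservation_advice(1500, [1300, 1400, 1350]))
```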
<p>When care has been taken to properly size and reserve memory, the administrator can power on virtual machines with confidence that memory is available to support the application.  When vCenter is unable to reserve enough memory to support your reservations, it will refuse to power on VMs.  This means that memory management is no longer about trend analysis and capacity management.  Instead you will investigate application needs with their owners and trust vCenter to give you a simple &#8220;yes&#8221; or &#8220;no&#8221;.</p>
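<p>To make the &#8220;yes&#8221; or &#8220;no&#8221; idea concrete, here is a deliberately simplified model of reservation-based admission.  It is not vCenter&#8217;s actual algorithm: it looks at a single host, ignores per-VM overhead memory and anything else the real admission control considers, and the function name is hypothetical.</p>

```python
def power_on_until_refused(host_capacity_mb, vm_reservations_mb):
    """Admit VMs in order until one reservation no longer fits in unreserved memory."""
    admitted = []
    unreserved = host_capacity_mb
    for reservation in vm_reservations_mb:
        if reservation > unreserved:
            break  # the simplified admission check answers "no"
        admitted.append(reservation)
        unreserved -= reservation
    return admitted


# A 32 GB host (32,768 MB) and a queue of identical VMs with 1,500 MB reservations.
admitted = power_on_until_refused(32768, [1500] * 25)
print(len(admitted))
```

<p>The point of the sketch is that the stopping condition depends only on reservations, not on how large each VM is configured to be.</p>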
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/03/31/memory-reservations-drive-over-commit/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Memory Compression</title>
		<link>http://vpivot.com/2010/03/01/memory-compression/</link>
		<comments>http://vpivot.com/2010/03/01/memory-compression/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 01:36:15 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[pex]]></category>
		<category><![CDATA[vmworld]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=313</guid>
		<description><![CDATA[Steve Herrod&#8217;s keynote at Partner Exchange 2010 included a tantalizing slide on an upcoming memory maximization technology: memory compression.  A few of you have already seen the overview of this technology that Kit Colbert and Fei Guo previewed at VMworld 2009.   Today I want to tell you how this upcoming feature will help you pack [...]]]></description>
			<content:encoded><![CDATA[<p>Steve Herrod&#8217;s keynote at Partner Exchange 2010 included a tantalizing slide on an upcoming memory maximization technology: memory compression.  A few of you have already seen the overview of this technology that <a href="http://www.vmworld2009.com/docs/DOC-3817">Kit Colbert and Fei Guo previewed at VMworld 2009</a>.   Today I want to tell you how this upcoming feature will help you pack even more virtual machines onto your existing servers.</p>
<p><span id="more-313"></span>To get the most out of your servers&#8217; memory you have to over-commit it.  Go too far and the host will swap.  This fact is immutable in enterprise memory management.  Because rotating disks have seek times that are <em>six orders of magnitude</em> larger than memory, when your host swaps your applications&#8217; performance suffers catastrophically.  <a href="http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/">Solid state drives (SSD) can mitigate the performance cost</a> by reducing swap latency by a couple orders of magnitude.  But SSDs still have delays tens of thousands of times worse than memory.</p>
<p>VMware engineers have been working on a technology that we unofficially call on-demand memory compression (ODMC) or compression cache, depending on who you talk to.  The idea of ODMC is to avoid swapping by compressing a set of target pages to a special region.  We have measured the additional latency&#8211;the compression time&#8211;to be over a hundred times better than rotating disk latencies.  And this will decrease as CPU performance increases.</p>
<p>SSD devices will continue to play an important part in solving the performance problems of extreme memory over-commit because of the great volume of data they can serve at a speed much faster than rotating disks.  But they will be a secondary solution to the faster ODMC, which uses a small dedicated area of system memory.  VMware&#8217;s long-term prioritization for managing the most aggressively over-committed memory looks like this:</p>
<ol>
<li>Do not swap if possible.  We will continue to leverage transparent page sharing and ballooning to make swapping a last resort.</li>
<li>Use ODMC to compress pages into a predefined cache to decrease memory utilization.*</li>
<li>Swap to persistent memory (SSD) installed locally in the server.**</li>
<li>Swap to the array, which may benefit from installed SSDs.</li>
</ol>
<p>(*) Demonstrated in the lab and coming in a future product.</p>
<p>(**) Part of our vision and not yet demonstrated.</p>
<p>The end goal of this prioritized use of different technologies is the reduction of the performance penalty of swap.  If VMware can reduce the penalty of swap, the virtual machines in extremely over-committed environments will not slow much when they need more memory than the host has available.  When this performance cost of swapping is reduced, you can safely drive your consolidation ratios even higher.</p>
<p>I cannot wait for the general availability of ODMC.  It is another technological accomplishment in a long line of innovation out of our phenomenal engineering organization.  At times I wonder when the last time was that another operating system software company introduced something really awesome to the industry.  But of course, I am biased.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/03/01/memory-compression/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>Optimizing Memory Utilization</title>
		<link>http://vpivot.com/2010/01/06/optimizing-memory-utilization/</link>
		<comments>http://vpivot.com/2010/01/06/optimizing-memory-utilization/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 21:52:45 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[swap]]></category>
		<category><![CDATA[vcenter]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=198</guid>
		<description><![CDATA[My recent series of blog articles has discussed ESX memory management and the performance specter of host swapping. My last article attempts to correct the misconception that VMware recommends against over-committing memory.  In that article I suggested that memory over-commit is a requirement in optimizing memory utilization. Today I want to provide a specific example to [...]]]></description>
			<content:encoded><![CDATA[<p>My recent series of blog articles has discussed ESX memory management and the performance specter of host swapping.  My last article attempts to <a href="http://vpivot.com/2010/01/04/misunderstanding-memory-management/">correct the misconception that VMware recommends against over-committing memory</a>.   In that article I suggested that memory over-commit is a requirement in optimizing memory utilization. Today I want to provide a specific example to show why this is true.   I have also included tips for identifying host swapping in your environments.<br />
<span id="more-198"></span></p>
<h2>Understanding the Bottleneck</h2>
<p>Let me show the value of over-commit and danger of swapping by way of an example.  I will choose the following typical values to demonstrate my point:</p>
<ul>
<li>All virtual machines are on a single host which has <strong>32 GB of RAM</strong> installed.</li>
<li>Each virtual machine is sized to <strong>8 GB of RAM</strong>.</li>
<li>Each virtual machine has <strong>25% active memory</strong> (%ACTV in esxtop and &#8220;Active&#8221; in vCenter).</li>
</ul>
<table id="newspaper-a">
<tbody>
<tr>
<th>VM Count</th>
<th>Active Memory in Host</th>
<th>Comments</th>
</tr>
<tr>
<td>3</td>
<td>3 * 8 GB * 25% = <strong>6 GB</strong></td>
<td>Without memory over-commit, <em>only about 19% of the host&#8217;s memory is actively in use</em>.   What a waste!</td>
</tr>
<tr>
<td>12</td>
<td>12 * 8 GB * 25% = <strong>24 GB</strong></td>
<td>Memory is over-committed by 200% but only 75% is actively being used.  In this aggressive consolidation <em>virtual machines will run at full speed</em> until usage exceeds 100% of host memory.</td>
</tr>
<tr>
<td>18</td>
<td>18 * 8 GB * 25% = <strong>36 GB</strong>, limited to <strong>32 GB</strong> by host</td>
<td>These virtual machines want 36 GB of RAM but are limited to the 32 GB that is installed on the host.  ESX must swap to allow these machines to run and <em>performance will suffer greatly</em>.</td>
</tr>
</tbody>
</table>
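<p>If you want to try other VM counts, the table&#8217;s arithmetic is easy to reproduce.  This is a minimal sketch using the three example values above; the constant and function names are my own.</p>

```python
HOST_RAM_GB = 32        # memory installed in the host
VM_SIZE_GB = 8          # configured size of each virtual machine
ACTIVE_FRACTION = 0.25  # share of each VM's memory that is active


def host_memory_summary(vm_count):
    """Compute allocated and active memory for a given number of identical VMs."""
    allocated = vm_count * VM_SIZE_GB
    active = allocated * ACTIVE_FRACTION
    return {
        "allocated_gb": allocated,
        "active_gb": active,
        "allocated_pct_of_host": 100 * allocated / HOST_RAM_GB,
        "active_pct_of_host": 100 * active / HOST_RAM_GB,
        # Swapping is expected once active memory exceeds what the host has installed.
        "swapping_expected": active > HOST_RAM_GB,
    }


for count in (3, 12, 18):
    print(count, host_memory_summary(count))
```

<p>With 12 virtual machines the allocation is 300% of host memory, which is the same thing as being over-committed by 200%.</p>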
<p>A virtual machine&#8217;s active memory is dictated by the application and its usage.  But the VI admin has complete control over the number of virtual machines in the environment which means host active memory can be influenced by adding or removing virtual machines.  Because virtual machine active memory is always equal to or less than 100% the only way to drive the host active memory to 100% is to over-commit memory.   <em>This is why hypervisors that do not support memory over-commit are simply not viable for data centers where memory optimization is a priority.</em></p>
<h2>Identifying and Correcting the Bottleneck</h2>
<p>The ongoing occurrence of swapping is identified by a non-zero swap rate in either esxtop or vCenter.  In addition to swap rate, esxtop provides a swap wait time in its CPU panel.  When swap rate exceeds hundreds of kilobytes per second or swap wait time exceeds a couple percentage points, it is time for corrective action.</p>
<p>There are three possible solutions to this problem:</p>
<ol>
<li>Balance the virtual machines&#8217; memory usage by moving virtual machines from hosts with higher amounts of memory usage to hosts with lower amounts of memory usage.</li>
<li>Run fewer virtual machines.</li>
<li>Buy more memory.</li>
</ol>
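<p>The swap check described in this section can be sketched as a pair of simple tests.  The two limits are assumptions I picked to stand in for &#8220;hundreds of kilobytes per second&#8221; and &#8220;a couple percentage points&#8221;, and the inputs are assumed to be values you have already read from esxtop or vCenter.</p>

```python
SWAP_RATE_LIMIT_KBPS = 200.0  # assumed stand-in for "hundreds of kilobytes per second"
SWAP_WAIT_LIMIT_PCT = 2.0     # assumed stand-in for "a couple percentage points"


def is_swapping(swap_rate_kbps):
    """Ongoing swapping shows up as any non-zero swap rate."""
    return swap_rate_kbps > 0


def needs_corrective_action(swap_rate_kbps, swap_wait_pct):
    """Flag a host when either the swap rate or the swap wait time passes its limit."""
    return swap_rate_kbps > SWAP_RATE_LIMIT_KBPS or swap_wait_pct > SWAP_WAIT_LIMIT_PCT


print(is_swapping(15.0), needs_corrective_action(15.0, 0.5))
print(is_swapping(850.0), needs_corrective_action(850.0, 0.5))
```

<p>Tune the limits to your own environment.  A small, brief swap rate is worth watching but does not necessarily call for moving virtual machines.</p>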
<h2>Designing Your Infrastructure to Simplify Memory Management</h2>
<p>Ultimately I owe you a full white paper on memory management to provide a sufficient answer.  But I want to give you two ideas of the tools and techniques that I will be describing in this future paper.  First, place <a href="http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/">host swap files on solid state disk (SSD) stores</a> to improve their performance.  With the right SSD device it may be possible to eliminate swap penalties.  Second, even if SSDs are unavailable, consider consolidating multiple swap files onto a single store.  This will make swap rate monitoring very easy but may compound the performance penalties of swapping.</p>
<p>Stay tuned and VMware will provide more documentation on memory management in 2010.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/01/06/optimizing-memory-utilization/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>Misunderstanding Memory Management</title>
		<link>http://vpivot.com/2010/01/04/misunderstanding-memory-management/</link>
		<comments>http://vpivot.com/2010/01/04/misunderstanding-memory-management/#comments</comments>
		<pubDate>Mon, 04 Jan 2010 17:26:37 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[swap]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=176</guid>
		<description><![CDATA[Twice in 2009 someone showed me competitive literature from Microsoft or Citrix claiming that VMware recommends against memory over-commitment.  Given the wide variety of literature we have provided in support of this feature, all of our customers recognize the absurdity of our competitors&#8217; claims.  VMware and its customers love memory over-commitment.  Then where is the [...]]]></description>
			<content:encoded><![CDATA[<p>Twice in 2009 someone showed me competitive literature from <a href="http://blogs.technet.com/jamesone/archive/2009/12/21/drilling-into-reasons-for-not-switching-to-hyper-v.aspx">Microsoft</a> or Citrix claiming that VMware recommends <em>against</em> memory over-commitment.  Given the wide variety of literature we have provided in support of this feature, all of our customers recognize the absurdity of our competitors&#8217; claims.  VMware and its customers love memory over-commitment.  So what is the source of this misinformed guidance?</p>
<p><span id="more-176"></span>I believe we know the text that is misrepresented.  It comes from our performance best practices document and could be misunderstood by someone to whom the terms &#8220;working set&#8221; and &#8220;active memory&#8221; are unfamiliar.  Here is the sentence, quoted from the <a href="http://www.vmware.com/pdf/vi_performance_tuning.pdf">oldest available version of our best practices document</a>:</p>
<blockquote><p>Swapping is used to forcibly reclaim memory from a virtual machine when both page sharing and ballooning fail to reclaim sufficient memory from an overcommitted system. If the working set (active memory) of the virtual machine resides in physical memory, using the swapping mechanism and having inactive pages swapped out does not affect performance. However, if the working set is so large that active pages are continuously being swapped in and out (that is, the swap I/O rate is high), then performance may degrade significantly.</p></blockquote>
<p>This excerpt describes the condition under which a host will swap.  That condition is best summarized as &#8220;the sum of the working sets of all virtual machines exceeds the amount of memory on the host&#8221;.  This definition presupposes that the reader understands the definition of a working set, which I think some readers may not.  For this discussion I will simplify the definition of working set as &#8220;recently active memory&#8221; and refer readers to their helpful search engine for a more complete description.</p>
<p>When a system&#8217;s working set exceeds available memory, the system will swap.  This is not unique to virtual, consolidated workloads.  For as long as operating systems have implemented virtual memory, there has been a possibility that a working set could exceed available physical memory.  The only thing that has changed in a virtual environment is that the working set is calculated by summing the working sets from multiple virtual machines as opposed to a single application or operating system instance.</p>
<p>But the key here&#8211;and the reason why memory over-commitment remains so powerful&#8211;is that <em>allocated memory</em> (the virtual machine&#8217;s size) exceeds <em>active memory</em> (the working set) nearly 100% of the time.  Memory management in consolidated environments is about pushing a host&#8217;s active memory to as close to 100% as possible.  This is something that was not possible in physical environments, and cannot be done in virtual environments without over-commitment.</p>
<p>To sum this up, not only does VMware recommend some over-commitment, but we know that it is impossible to fully use your available memory without the flexibility provided by over-commitment.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/01/04/misunderstanding-memory-management/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Solid State Disks and Host Swapping</title>
		<link>http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/</link>
		<comments>http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/#comments</comments>
		<pubDate>Fri, 25 Dec 2009 01:15:45 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[swap]]></category>
		<category><![CDATA[vmkernel]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=183</guid>
		<description><![CDATA[Recently I have been thinking, talking, and writing about ESX host memory swapping a lot.  ESX swaps memory under the same conditions that traditional operating systems do: the applications are using more memory than is available on the physical hardware.  Host swapping is an unavoidable consequence of this condition, whether virtualization is present or not. But [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I have been thinking, talking, and <a href="http://vpivot.com/2009/12/23/your-performance-enemy-host-swapping/">writing</a> about ESX host memory swapping a lot.  ESX swaps memory under the same conditions that traditional operating systems do: the applications are using more memory than is available on the physical hardware.  Host swapping is an unavoidable consequence of this condition, whether virtualization is present or not.</p>
<p><span id="more-183"></span>But <a href="http://communities.vmware.com/blogs/chethank/2009/12/22/using-solidstate-drives-to-improve-performance-of-sql-databases-on-vsphere-hosts-when-memory-is-overcommitted">a recent article</a> by my engineering colleague Chethan Kumar shows an avenue that allows VI admins to aggressively over-commit memory and avoid the catastrophic performance penalty of swapping: use solid state disks to host ESX swap files.</p>
<p>The fundamental problem with host swapping comes from the high latency of traditional disks compared to memory.  Data can be retrieved from memory in nanoseconds but takes milliseconds to fetch from a hard drive.  That means a single 4K memory page takes 100,000 times longer to retrieve if the operating system swapped it out.</p>
<p>The value that solid state disks offer to this problem is exceptionally low latency compared to traditional drives.  The SSD that Chethan used showed microsecond latencies, about 1,000 times lower than rotating disks.  This means that time spent waiting for swap activity* has been decreased to 0.1% of the time spent swapping to rotating disks.</p>
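<p>Here is the latency arithmetic as a short sketch.  The absolute figures are round numbers I have assumed for illustration (100 ns for memory, 10 ms for a rotating disk); only the ratios come from the discussion above.</p>

```python
# Assumed round-number latencies, all expressed in nanoseconds.
MEMORY_NS = 100          # assumed memory access time
DISK_NS = 10_000_000     # assumed rotating disk access time (10 ms)
SSD_NS = DISK_NS // 1000 # "about 1,000 times lower" than the rotating disk

disk_vs_memory = DISK_NS // MEMORY_NS  # how much slower a swapped page is on disk
ssd_vs_memory = SSD_NS // MEMORY_NS    # how much slower a swapped page is on SSD
ssd_wait_pct = 100 * SSD_NS / DISK_NS  # SSD swap wait as a percentage of disk swap wait

print(disk_vs_memory, ssd_vs_memory, ssd_wait_pct)
```

<p>Under these assumptions a page swapped to SSD is still far slower to retrieve than one in memory, which is why SSD reduces the penalty of swapping rather than removing it.</p>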
<p>The importance of fast swap files is that they enable administrators to more aggressively over-commit memory.  Today our admins rightfully fear the VMs&#8217; aggregate active memory exceeding the available physical memory, which results in swapping.  Today SSD technology in shared storage such as EMC&#8217;s new CLARiiONs allows our admins to cleverly place swap files and drive up memory utilization to previously unheard-of levels.  This may enable standard memory over-commitment of 200% or more, with extreme over-commit being much higher than this.</p>
<p>In future versions of ESX we want to automate the usage of SSDs to maximize the use of available memory.  But that&#8217;s a roadmap discussion that I will leave for another day.</p>
<p>(*) This swap wait time has conveniently been added to ESX 4&#8217;s version of esxtop under the counter %SWPWT.  See <a href="http://communities.vmware.com/docs/DOC-9279">Interpreting esxtop Statistics</a> for more information.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

