<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>vPivot &#187; benchmarking</title>
	<atom:link href="http://vpivot.com/tag/benchmarking/feed/" rel="self" type="application/rss+xml" />
	<link>http://vpivot.com</link>
	<description>Scott Drummonds on Virtualization</description>
	<lastBuildDate>Wed, 01 Feb 2012 06:46:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>9,570,000,000,000,000,000,000 Bytes</title>
		<link>http://vpivot.com/2011/09/08/9570000000000000000000-bytes/</link>
		<comments>http://vpivot.com/2011/09/08/9570000000000000000000-bytes/#comments</comments>
		<pubDate>Thu, 08 Sep 2011 02:47:37 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[vmmark]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=982</guid>
		<description><![CDATA[In 2008, the year before I left VMware, I was invited to help measure the amount of information enterprise computers processed in the entire year.  My invitation came from Dr. James Short of the University of California, San Diego, who was on the team leading this project.  The team called their project &#8220;How Much [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://vpivot.com/wp-content/uploads/2011/09/ent_serv_info.png"><img class="alignleft size-medium wp-image-985" title="Enterprise Server Information" src="http://vpivot.com/wp-content/uploads/2011/09/ent_serv_info-300x276.png" alt="" width="300" height="276" /></a>In 2008, the year before I left VMware, I was invited to help measure the amount of information enterprise computers processed in the entire year.  My invitation came from <a href="http://irps.ucsd.edu/faculty/faculty-directory/james-e-short.htm">Dr. James Short of the University of California, San Diego</a>, who was on the team leading this project.  The team called their project &#8220;How Much Information?&#8221; (HMI).  And Dr. Short, or Jim, wanted me to provide comment on a small portion of the systems that process information: enterprise hardware.</p>
<p><span id="more-982"></span></p>
<p>The discussions at the conference I attended in 2008 were incredibly broad, ranging from enterprise hardware to unreleased two-dimensional UPC codes to game consoles to mobile phones.  But the work I contributed to aimed to provide only one piece of the puzzle of worldwide information processing: the amount of data processed by enterprise hardware in 2008.</p>
<p>I joined the HMI team for a conference they hosted in the San Francisco Bay Area in 2008.  Work continued after I left, of course, and my place was taken by VMware&#8217;s Bruce Herndon, who runs the VMmark team.  With years of experience in enterprise benchmarking, Bruce was a much better choice than I to help quantify information processing.  And Jim recently restated to me his thanks to VMware and Bruce for their continued help on this project.</p>
<p>Jim recently shared with me <a href="http://e-scott.net/share/hmi/ESI-Report-Jan%202011.pdf" onClick="javascript: _gaq.push(['_trackPageview', '/downloads/map']);">the public report on enterprise server information from 2008</a>, which was completed earlier this year.  Enterprise processing&#8211;which does not include game consoles, mobile phones, desktop computers, notebooks, tablets, etc.&#8211;obviously represents only a fraction of the world&#8217;s ongoing information processing.  Yet the amount of information that enterprise hardware alone processed in 2008 is staggering.</p>
<p>That number is 9.57 zettabytes.  That&#8217;s almost 10 to the 22nd power.  Or, as this entry&#8217;s title expands, 9,570,000,000,000,000,000,000 bytes.  Unbelievable.</p>
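<p>For anyone who wants to check the zeros, here is a minimal sketch of the conversion.  It assumes the SI definition of a zettabyte (10 to the 21st power bytes); the function name is mine and is not taken from the HMI report.</p>

```python
from decimal import Decimal

# Assumption: SI zettabyte, i.e. 10**21 bytes.
ZETTABYTE = 10**21

def zb_to_bytes(zettabytes):
    """Convert a decimal string such as "9.57" into a whole number of bytes."""
    # Decimal avoids the rounding error a binary float would introduce.
    return int(Decimal(zettabytes) * ZETTABYTE)

total = zb_to_bytes("9.57")
print(f"{total:,}")      # the figure with thousands separators
print(len(str(total)))   # number of digits in the figure
```

<p>The result has 22 digits, one digit short of 10 to the 22nd power, which is why &#8220;almost&#8221; is the right word.</p>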
<p>The executive summary from HMI&#8217;s 2008 &#8220;Report on Enterprise Server Information&#8221;  includes some fun statistics.  And remember, these numbers apply only to the information processed by enterprise hardware:</p>
<ul>
<li>Servers process 12 GB of data for each worker in the world <em>daily</em>.  How much email are you sending?</li>
<li>Two-thirds of the world&#8217;s information was processed by &#8220;low end&#8221; hardware, costing $25,000 or less.  It is an x86 world after all, isn&#8217;t it?</li>
<li>About half of the information processing is attributed to transaction processing (invoicing, paying bills, checking stocks, etc.) and the other half is attributed to web content and office applications.</li>
<li>Web services and business applications doubled their performance-per-cost every 1.5 years.  But enterprise servers only doubled their information processing every two years.  Enterprises are getting more value out of their processed information every year.</li>
<li>High-end computers are doubling performance-per-cost and raw performance every four years, or half the pace of low-end computers.  It is certainly an x86 world.</li>
</ul>
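<p>The doubling periods quoted above are easier to compare as yearly growth rates.  The sketch below is my own arithmetic, not a calculation from the HMI report: a quantity that doubles every T years grows by a factor of 2 to the power 1/T each year.</p>

```python
def annual_growth(doubling_years):
    """Yearly growth factor for a quantity that doubles every `doubling_years`."""
    return 2 ** (1 / doubling_years)

# Doubling periods as quoted in the list above; the labels are shorthand.
doubling_periods = {
    "web/business apps, performance-per-cost": 1.5,
    "enterprise servers, information processed": 2.0,
    "high-end computers, performance-per-cost": 4.0,
}

for label, years in doubling_periods.items():
    factor = annual_growth(years)
    percent = (factor - 1) * 100
    print(f"{label}: x{factor:.3f} per year ({percent:.1f}% growth)")
```

<p>Doubling every 1.5 years works out to roughly 59% growth per year, against about 41% for a two-year doubling and about 19% for a four-year doubling.</p>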
<p>The full HMI paper linked above is rich with information, which I will be processing on my own various systems in the coming weeks.  I encourage you to check it out to understand the scale of data that pulses through our hardware every day.  And if you are really into the HMI project, read about <a href="http://e-scott.net/share/hmi/SDSC%20CLDS%20Events%20Flyer%20Sept%202011.pdf" onClick="javascript: _gaq.push(['_trackPageview', '/downloads/map']);">an upcoming conference continuing this research</a> that might benefit from your participation.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2011/09/clds.png"><img class="alignnone size-medium wp-image-983" title="Center for Large-scale Data Systems Research" src="http://vpivot.com/wp-content/uploads/2011/09/clds-229x300.png" alt="" width="229" height="300" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/09/08/9570000000000000000000-bytes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Designing VMs with Performance SLAs</title>
		<link>http://vpivot.com/2010/08/09/designing-vms-with-performance-slas/</link>
		<comments>http://vpivot.com/2010/08/09/designing-vms-with-performance-slas/#comments</comments>
		<pubDate>Mon, 09 Aug 2010 13:56:50 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[netioc]]></category>
		<category><![CDATA[sioc]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=614</guid>
		<description><![CDATA[Consolidation amplifies the uncertainty of application performance. Still, VI administrators need a means of guaranteeing performance SLAs to their applications&#8217; users. But the best VMware has been able to offer are resource controls, which are at best an indirect mechanism for sustaining application performance. With the acquisition of B-hive, now AppSpeed, VMware moved a step [...]]]></description>
			<content:encoded><![CDATA[<p>Consolidation amplifies the uncertainty of application performance.  Still, VI administrators need a means of guaranteeing performance SLAs to their applications&#8217; users.  But the best VMware has been able to offer are resource controls, which are at best an indirect mechanism for sustaining application performance.  With the acquisition of B-hive, now AppSpeed, VMware moved a step closer to allowing VI administrators to guarantee a performance SLA.  As an application-aware latency measurement tool, AppSpeed may eventually provide feedback to vCenter to guarantee throughput levels.  But it does not today.  So how are VI administrators to guarantee application performance?</p>
<p><span id="more-614"></span>It was during discussions with advanced VMware customers in Melbourne that a solution to this problem occurred to me.  I have reasoned it through and I think it holds water.  I have socialized it with more customers and my colleagues and we think it stands.  So I want to introduce a system for implementing virtual machines with a better assurance of a performance SLA.</p>
<p>The key to this process is that minimum performance can be measured using limits and that performance can be assured using reservations.  You can develop and document virtual machines with performance SLAs using the following procedure:</p>
<ul>
<li>First, as always, define a small number of strictly-sized virtual machines to be used by all applications in your environment.  Often these look something like small VMs of 1 vCPU and 4 GB RAM, medium VMs of 2 vCPUs and 8 GB of RAM, and large VMs of 4 vCPUs and 16 GB of RAM.  Tune these numbers for your environment, as needed.</li>
<li>For any application, benchmark its maximum performance against each of these virtual machine configurations on an unloaded system.  Choose an ISV-supplied benchmark or a well-known third-party tool.  This sets your high water mark for throughput for each application in its virtual machine.</li>
<li>For each configuration, set a CPU limit at 50% of the available CPU and a memory limit of 50% of the available memory.  Retest the application against this smaller, limited configuration.</li>
<li>During the applications&#8217; deployment, change the limits to reservations.  That is, remove limits and set reservations equal to the limits&#8217; previous values, in this case 50%.</li>
<li>Your application now has a maximum performance defined in bullet two, and a &#8220;guaranteed&#8221; performance measured in bullet three.  This is your application&#8217;s performance SLA.</li>
</ul>
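<p>The bookkeeping behind this procedure can be sketched in a few lines.  This is only an illustration under stated assumptions: the 2,000 MHz per-core clock, the function names, and the throughput figures in the usage line are hypothetical, and the values still have to be entered in vCenter by hand or with your own tooling.</p>

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VmSize:
    name: str
    vcpus: int
    ram_gb: int

# The three example sizes from the first bullet.
SIZES = [VmSize("small", 1, 4), VmSize("medium", 2, 8), VmSize("large", 4, 16)]

def half_allocation(size, core_mhz=2000, fraction=0.5):
    """CPU (MHz) and memory (MB) used first as limits, later as reservations.

    core_mhz is a hypothetical per-core clock; substitute your hosts' value.
    """
    cpu_mhz = int(size.vcpus * core_mhz * fraction)
    mem_mb = int(size.ram_gb * 1024 * fraction)
    return cpu_mhz, mem_mb

def sla_record(size, max_throughput, limited_throughput):
    """Document one VM size: bullet two gives the maximum, bullet three the floor."""
    cpu_mhz, mem_mb = half_allocation(size)
    return {
        "vm": size.name,
        "cpu_reservation_mhz": cpu_mhz,
        "mem_reservation_mb": mem_mb,
        "max_throughput": max_throughput,
        "guaranteed_throughput": limited_throughput,
    }

# Hypothetical benchmark results for the medium VM, in transactions per second.
print(sla_record(SIZES[1], max_throughput=1200, limited_throughput=650))
```

<p>The same numbers serve twice: as limits while you measure the floor, and as reservations once the application is deployed.</p>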
<p>The concept is simple: limits can be used to measure the performance of an application in the presence of that degree of contention.  Reservations ensure that those resource amounts are always present.  Here are some notes on this process:</p>
<ul>
<li>This is not a true guarantee since network and storage throughput may drop.  No tool can eliminate this risk entirely but <a href="http://vpivot.com/2010/05/04/storage-io-control/">SIOC</a> and <a href="http://www.vmware.com/resources/techresources/10119">NetIOC</a> can reduce the risk of a network- or storage-induced performance failure.</li>
<li>The memory test is going to be highly dependent on the working set created by your load generation tool.  Your mileage will vary depending on your application owners&#8217; use of the virtual machine.</li>
<li>vCenter will guarantee that the reservations are always available through a process called admission control, which checks the cluster to ensure that enough CPU or memory is available to run the virtual machine immediately and in the event of a server failure.</li>
</ul>
<p>As I said above, this is not a true guarantee of application performance.  But it is as close as we can get until AppSpeed or a replacement evolves into universal application latency measurement that is fed into vCenter.  And this is another in a growing list of reasons  why <a href="http://vpivot.com/2010/03/31/memory-reservations-drive-over-commit/">CPU and memory reservations should be part of all VMware deployments</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/08/09/designing-vms-with-performance-slas/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>SPECvirt Released</title>
		<link>http://vpivot.com/2010/07/26/specvirt-released/</link>
		<comments>http://vpivot.com/2010/07/26/specvirt-released/#comments</comments>
		<pubDate>Mon, 26 Jul 2010 05:27:56 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[specvirt]]></category>
		<category><![CDATA[vmmark]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=607</guid>
		<description><![CDATA[SPEC has been diligently working on an industry standard version of VMmark since something like 2006. The first version of their product is complete and was released during my recent holiday. I have been talking with colleagues and customers about SPECvirt for years and would like to talk about what SPECvirt is and what it is [...]]]></description>
			<content:encoded><![CDATA[<p>SPEC has been diligently working on an industry standard version of VMmark since something like 2006.  The first version of their product is complete and was <a href="http://www.spec.org/virt_sc2010/press/release.html">released</a> during <a href="http://www.e-scott.net/blog/?p=339">my recent holiday</a>.  I have been talking with colleagues and customers about SPECvirt for years and would like to talk about what SPECvirt is and what it is not.</p>
<p><span id="more-607"></span>VMmark is clearly the reigning king of consolidation benchmarks and anything that enters its arena must stand against its standard.  VMmark pioneered a new method of benchmarking that resonates with virtualization experts.  It tests system performance by adding fixed load virtual machines instead of scaling up a single application to system saturation.  Traditional benchmarks tune up their load generation against a single instance but VMmark piles on virtual machines until the system is capable of no more work.</p>
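<p>A toy model makes the &#8220;pile on fixed-load virtual machines&#8221; idea concrete.  This is not how VMmark computes its score; the capacity units, the per-tile load, and the 95% cutoff are all assumptions of mine for illustration.</p>

```python
def tiles_until_saturation(host_capacity, tile_load, min_fraction=0.95):
    """Add fixed-load tiles until each tile would get less than min_fraction
    of the work it asks for.  All quantities are abstract units."""
    tiles = 0
    while True:
        candidate = tiles + 1
        demand = candidate * tile_load
        # Tiles are fully served until the host is oversubscribed,
        # after which capacity is assumed to be split evenly.
        delivered = min(1.0, host_capacity / demand)
        if delivered < min_fraction:
            return tiles
        tiles = candidate

print(tiles_until_saturation(host_capacity=100, tile_load=12))
```

<p>The load of each tile never changes; only the tile count grows, which is the opposite of scaling one instance up to saturation.</p>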
<p>VMmark is one of VMware&#8217;s many industry-leading initiatives and was started when VMware worked closely with server vendors that wanted to benchmark their servers&#8217; ability to run virtual machines.  VMmark was conceived many years ago, well before VMware had competition.  It is because of this fact that I scratch my head at claims that VMmark is biased towards VMware.  There was no commercial implementation of Xen when VMmark was specified and Microsoft was only dreaming of entering the market.</p>
<p>But even in an environment devoid of competition, customers want certainty that their benchmarks are not hiding flaws in a product.  SPEC has for years been developing honest benchmarks that survive the crucible of debate among its large member community.  SPECvirt, or more properly SPECvirt_sc2010, is the result of this vigorous debate.  You can read up on SPECvirt in the <a href="http://www.spec.org/virt_sc2010/docs/SPECvirt_FAQ.html">FAQ</a> released coincident with the product&#8217;s launch.  But I will add a few comments and comparisons here.</p>
<ol>
<li>SPECvirt costs $3000 to purchase.  VMmark is free.  But VMmark requires commercial software and versions of SPEC benchmarks that are not free.  Depending on your licensing model, you may find VMmark or SPECvirt cheaper.  But the prices of each are essentially comparable.</li>
<li>VMmark uses the most common applications in the data center (like Apache and Microsoft Exchange).  SPECvirt does not mandate application choice for the system under test.
<ul>
<li>This is a Good Thing, because you may now choose a configuration that models your environment by running the exact applications you run.</li>
<li>This is a Bad Thing, because five different testers may choose five different application sets in their tests resulting in incomparable results.</li>
</ul>
</li>
<li>SPECvirt cannot be run against a cluster of hosts.  But VMmark cannot, either.  We will have to wait for an update to one of these benchmarks before we can properly test DRS clusters and their competitive equivalents.</li>
<li>There is only <a href="http://www.spec.org/virt_sc2010/results/specvirt_sc2010_perf.html">one published SPECvirt result</a>, courtesy of IBM running KVM.  There are a boatload of <a href="http://www.vmware.com/products/vmmark/results.html">VMmark results</a>, as one would expect of a more mature product.  It will be interesting to watch the rate of submissions of these two benchmarks over the coming year or two.</li>
<li>SPECvirt runs three workloads and an idle virtual machine in its tile.  One of those workloads, tested by SPECweb, is implemented with three virtual machines.  The end product is a six-VM tile that looks very much like VMmark&#8217;s six-VM tile.</li>
</ol>
<p>For years we have seen online and in-person griping about VMware&#8217;s misunderstood benchmark restriction in its EULA.  Both VMmark and SPECvirt can be run on any supported hypervisor.  So now it&#8217;s time for all the hypervisor vendors to put up or shut up.  Run one of these benchmarks on your product and compare the results against existing published results.  Then the world will know where your product stands.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/07/26/specvirt-released/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>First Ever TPC Result on VMware, New Records</title>
		<link>http://vpivot.com/2010/04/12/first-ever-tpc-result/</link>
		<comments>http://vpivot.com/2010/04/12/first-ever-tpc-result/#comments</comments>
		<pubDate>Mon, 12 Apr 2010 22:27:49 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[tpc]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=394</guid>
		<description><![CDATA[A new era has dawned on VMware virtualization: the Transaction Processing Performance Council (TPC) has posted an audited result on a virtual platform.  That platform, VMware vSphere 4, ran ParAccel&#8217;s Analytic Database (PADB) to set new records for the TPC-H benchmark using a 1,000 GB database.  You can read more about ParAccel&#8217;s work in their [...]]]></description>
			<content:encoded><![CDATA[<p>A new era has dawned on VMware virtualization: the Transaction Processing Performance Council (TPC) has posted <a href="http://www.tpc.org/tpch/results/tpch_price_perf_results.asp">an audited result on a virtual platform</a>.  That platform, VMware vSphere 4, ran ParAccel&#8217;s Analytic Database (PADB) to set new records for the TPC-H benchmark using a 1,000 GB database.  You can read more about ParAccel&#8217;s work in <a href="http://www.paraccel.com/news/pressRelease/VMware_ParAccel_tpch_Release_WIRE.pdf">their recent press release</a>.</p>
<p><span id="more-394"></span>Here are a few interesting stats about the published TPC-H score:</p>
<ul>
<li>The score of 1,316,882 QphH is the highest ever reported throughput for the 1,000 GB results.  It is 13% higher than the next greatest performance score.</li>
<li>The $0.70 price per QphH is the lowest price/performance ratio ever submitted on the 1,000 GB results.  It is 40% less than the next closest price/performance score.</li>
<li>ParAccel&#8217;s results are a mere 1/8 the price/performance of the second best performing result.</li>
<li>The vSphere 4 configuration used 40 HP servers with Xeon 5560 processors running 80 4-way virtual machines.</li>
</ul>
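<p>A little arithmetic shows what these bullets imply.  The derived figures below are my own back-of-the-envelope estimates from the rounded numbers above, not values taken from the TPC listing, and they assume a &#8220;4-way&#8221; virtual machine means four vCPUs.</p>

```python
# Figures quoted in the bullets above.
qphh = 1_316_882            # throughput, QphH at the 1,000 GB scale
price_per_qphh = 0.70       # dollars per QphH (rounded in the source)
servers, vms, vcpus_per_vm = 40, 80, 4

# Price/performance is total system price divided by QphH, so the total
# is roughly the product of the two (approximate, since 0.70 is rounded).
implied_system_price = qphh * price_per_qphh

# "13% higher" and "40% less" let us back out the runner-up figures.
runner_up_qphh = qphh / 1.13
runner_up_price_per_qphh = price_per_qphh / (1 - 0.40)

vms_per_server = vms // servers
total_vcpus = vms * vcpus_per_vm

print(f"implied system price: about ${implied_system_price:,.0f}")
print(f"runner-up throughput: about {runner_up_qphh:,.0f} QphH")
print(f"runner-up price/performance: about ${runner_up_price_per_qphh:.2f} per QphH")
print(f"{vms_per_server} VMs per server, {total_vcpus} vCPUs in total")
```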
<p>ParAccel&#8217;s amazing work represents a large number of firsts for VMware: our first audited TPC result, our first proof point with big data applications, our first published partnered work in the rapidly growing data warehouse space.  One more amazing thing about this accomplishment: ParAccel and VMware set this record on commodity x86 hardware, which means the infrastructure can be shared with other enterprise applications.</p>
<p>Congrats to ParAccel.  They have shown a product that can set records not in spite of virtualization, but because of it.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/04/12/first-ever-tpc-result/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Optimal Web Servers: vSphere Required</title>
		<link>http://vpivot.com/2010/03/22/optimal-web-servers-vsphere-required/</link>
		<comments>http://vpivot.com/2010/03/22/optimal-web-servers-vsphere-required/#comments</comments>
		<pubDate>Mon, 22 Mar 2010 16:51:22 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[specweb]]></category>
		<category><![CDATA[vmworld]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=341</guid>
		<description><![CDATA[Just over a year ago VMware set a world record for web server performance on 16-core systems. The reason ESX beat native performance is its excellent scalability when compared to the poor scalability of commercial web servers. By implementing multiple web servers in virtual machines on a single host, VMware can drive more web transactions [...]]]></description>
			<content:encoded><![CDATA[<p>Just over a year ago <a href="http://blogs.vmware.com/performance/2009/02/vmware-sets-performance-record-with-specweb2005-result.html">VMware set a world record for web server performance</a> on 16-core systems. The reason ESX beat native performance is its excellent scalability when compared to the poor scalability of commercial web servers. By implementing multiple web servers in virtual machines on a single host, VMware can drive more web transactions through the host than possible without ESX present.  Today I want to update everyone on our work with virtual web servers and repeat my plea: virtualize your web servers now!</p>
<p><span id="more-341"></span>The scalability limitations in web servers were investigated and documented by Sreekanth Setty, whose <a href="http://www.vmworld.com/docs/DOC-2255">VMworld 2008 presentation</a> is summarized in this graph:</p>
<div id="attachment_342" class="wp-caption alignnone" style="width: 457px"><a href="http://vpivot.com/wp-content/uploads/2010/03/specweb-scaling.png"><img class="size-full wp-image-342" title="Scaling Virtual Web Servers" src="http://vpivot.com/wp-content/uploads/2010/03/specweb-scaling.png" alt="Virtual web server scalability" width="447" height="405" /></a><p class="wp-caption-text">A multiple virtual machine configuration can outperform native with web servers.</p></div>
<p>This graph shows four uniprocessor virtual machines outperforming a single four core native configuration. The advantage of the multiple virtual machine configuration improves as virtual machine count increases.  Sree has since shown that ESX can beat physical when tested by <a href="http://blogs.vmware.com/performance/2009/06/index.html">SPECweb using 4-way virtual machines</a>, which means fewer OS and web server instances to maintain. Most recently Sree <a href="http://communities.vmware.com/docs/DOC-12103">updated his results again using an Intel Xeon 5570 and VMDirectPath</a>, which freed a few precious CPU cycles for more web transactions.</p>
<p>It is now beyond doubt that multiple web servers will outperform a single instance on the same hardware. However, some of our customers still challenge our suggestion that those web servers should be run in virtual machines.  Some resist the idea of increasing the number of web and OS instances at all.  So, let me run down the options here and their ramifications:</p>
<ol>
<li><em>Single web server instance in an OS on a physical server.</em> This is the worst performing configuration, scaling to only 38% of the system&#8217;s maximum possible performance.</li>
<li><em>Multiple web server instances in a single OS on a physical server.</em> This would deliver the theoretical best possible performance but requires the admin to run the notoriously uncooperative web servers together in a single operating system.  I am no web server administrator, but our customers have told me that this is a poor solution for production.</li>
<li><em>Multiple web server instances each in their own virtual machine.</em> Excellent performance with web server instances isolated so they cannot harm each other.  Sreekanth&#8217;s VMworld 2008 work showed this delivering about 80% of the system&#8217;s maximum theoretical throughput.</li>
</ol>
<p>Since the performance of the first configuration is so poor I exclude it from consideration in a data center.  By instantiating one 3-way virtual machine for a web server, we have nearly matched the physical configuration with no additional maintenance cost.  Then we can scale out the virtual web farm using vSphere management tools to deliver a flexible architecture.  And by isolating these instances in their own hardened virtual machines, we can guarantee that runaway utilization, zombie processes, and any other cross contamination caused by bad web servers will not harm the farm.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/03/22/optimal-web-servers-vsphere-required/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>vSphere 4.0, Hyper-Threading, and Terminal Services</title>
		<link>http://vpivot.com/2010/03/17/vsphere-4-0-hyper-threading-and-terminal-services/</link>
		<comments>http://vpivot.com/2010/03/17/vsphere-4-0-hyper-threading-and-terminal-services/#comments</comments>
		<pubDate>Wed, 17 Mar 2010 22:23:28 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[hyper-threading]]></category>
		<category><![CDATA[intel]]></category>
		<category><![CDATA[scheduler]]></category>
		<category><![CDATA[terminal services]]></category>
		<category><![CDATA[vsphere]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=333</guid>
		<description><![CDATA[I recently wrote a blog article detailing Hyper-Threading (HT) and its effect on vSphere.  As an astute reader pointed out, a recent update to Project VRC&#8217;s terminal services analysis suggests disappointment with HT on vSphere.  We spent a lot of time looking at those results to understand why they contradicted the body of performance data, which [...]]]></description>
			<content:encoded><![CDATA[<p>I recently wrote <a href="http://vpivot.com/2010/03/06/hyper-threading-on-vsphere/">a blog article detailing Hyper-Threading (HT) and its effect on vSphere</a>.  As an astute reader pointed out, a recent update to <a href="http://www.virtualrealitycheck.net/">Project VRC</a>&#8217;s terminal services analysis suggests disappointment with HT on vSphere.  We spent a lot of time looking at those results to understand why they contradicted the body of performance data, which shows HT offering a 10-30% gain on vSphere. What we discovered led us to create a vSphere patch that would allow users to improve performance in some benchmarking environments.</p>
<p><span id="more-333"></span>Among the many results presented by VRC, the configurations that most perplexed us were the two and four virtual machine configurations, each with four vCPUs per virtual machine.  The configuration with two virtual machines looked good and matched our internal numbers.  In this configuration there are a total of eight vCPUs on the host which maps each to its own physical core on the Xeon 5500 series processor.  The problem arose when the virtual machine count was increased to four, resulting in 16 total vCPUs.  In this configuration each vCPU is paired with one logical, Hyper-Threaded core.  Project VRC showed this configuration supporting no more desktops than the two-VM configuration, which suggests no value to Hyper-Threading on this configuration.</p>
<p>It took us some time to understand the reason for these results, but we eventually identified a very specific condition where ESX&#8217;s scheduler enforces fairness in scheduling vCPUs at a cost of throughput.  ESX&#8217;s scheduler has long been the subject of intensive scrutiny by a large number of VMware engineers to guarantee fair access to the processor for each virtual machine.  It is because of this fairness that VMware&#8217;s customers can rely on CPU resource controls.  But, when fairness goes too far, throughput may be sub-optimal.</p>
<p>Hyper-Threading presents particular problems to fairness because of the non-linear performance it delivers.  A thread will run at one speed when it has full access to a physical core, at another speed when it is sharing a core, and at a third speed when sharing a core with a different thread.  As a result, ESX&#8217;s scheduler will sometimes pause a thread to enforce fairness.  These pauses are more common when Hyper-Threading is present to account for its lack of uniformity in thread performance.  If the host lacks vCPUs that are ready to run, the result is CPU utilization below saturation, leaving CPU cycles unused.</p>
<p>Three specific conditions must all hold to trigger this behavior:</p>
<ol>
<li>A Xeon 5500 series processor is present with Hyper-Threading enabled,</li>
<li>CPU utilization is near saturation, and</li>
<li>A roughly one-to-one mapping between vCPUs and logical processors.</li>
</ol>
<p>In this scenario, VMware vSphere favors fairness over throughput and sometimes pauses one vCPU to dedicate a whole core to another vCPU, eliminating gains provided by Hyper-Threading.  In cases outside of these three conditions, the performance of VMware vSphere 4 meets the high expectations of VMware&#8217;s R&amp;D team and its customers.  Of course production environments rarely (never?) have a one-to-one ratio of vCPUs to logical processors.  This occurs when there are only four 4-way virtual machines on a Xeon 5500 system, for example.</p>
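<p>If you want to know whether a test bed resembles this scenario, the three conditions can be written as a simple check.  This is a sketch only: the 90% utilization threshold and the 10% tolerance on the vCPU ratio are assumptions of mine, not values used by the ESX scheduler.</p>

```python
def matches_reported_scenario(is_xeon_5500, hyperthreading, physical_cores,
                              total_vcpus, cpu_utilization,
                              near_saturation=0.9, ratio_tolerance=0.1):
    """Return True when all three conditions from the list above hold.

    near_saturation and ratio_tolerance are hypothetical thresholds.
    """
    logical_processors = physical_cores * 2 if hyperthreading else physical_cores
    ratio = total_vcpus / logical_processors
    one_to_one = abs(ratio - 1.0) <= ratio_tolerance
    saturated = cpu_utilization >= near_saturation
    return is_xeon_5500 and hyperthreading and saturated and one_to_one

# Four 4-way VMs on an eight-core Xeon 5500 host: 16 vCPUs, 16 logical processors.
print(matches_reported_scenario(True, True, 8, 4 * 4, 0.95))
# Two 4-way VMs on the same host: 8 vCPUs, so the ratio is only one-half.
print(matches_reported_scenario(True, True, 8, 2 * 4, 0.95))
```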
<p>But environments such as Project VRC&#8217;s are simplifications of production environments meant to understand the capabilities of virtual platforms.  VMware has provided a patch to Project VRC that will allow them to improve throughput in their environment.  We are going to release this patch and its documentation to the general public within a couple of weeks.  I do not expect that any of VMware&#8217;s customers will benefit from the changes it allows, but I will later document the patch and its usage for anyone that cares to experiment.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/03/17/vsphere-4-0-hyper-threading-and-terminal-services/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>vSphere Performance Leadership with Terminal Services</title>
		<link>http://vpivot.com/2009/09/21/vsphere-performance-leadership-with-terminal-services/</link>
		<comments>http://vpivot.com/2009/09/21/vsphere-performance-leadership-with-terminal-services/#comments</comments>
		<pubDate>Mon, 21 Sep 2009 15:51:25 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[xenapp]]></category>
		<category><![CDATA[xenserver]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=66</guid>
		<description><![CDATA[Project VRC’s latest update (document available with registration) to their ongoing analysis of Terminal Services and XenApp performance in virtualized environments supports VMware’s claims of industry-leading performance.  There are two main conclusions of the revised version: (1) VMware outperforms XenServer and (2) previous performance measurements of XenServer were in error, reporting artificially high results on [...]]]></description>
			<content:encoded><![CDATA[<p>Project VRC’s latest update (<a href="http://www.projectvrc.nl/index.php?option=com_docman&amp;task=cat_view&amp;gid=39&amp;Itemid=">document available</a> with registration) to their ongoing analysis of Terminal Services and XenApp performance in virtualized environments supports VMware’s claims of industry-leading performance.  There are two main conclusions of the revised version: (1) VMware outperforms XenServer and (2) previous performance measurements of XenServer were in error, reporting artificially high results on that product.</p>
<p><span id="more-66"></span>The story of Project VRC’s work goes back nearly a year.  This group of Dutch consultants created an ambitious project to quantify the performance of desktop virtualization platforms.  Their workload simulates multiple users running desktop applications in a Terminal Services environment. A few operations—including window appearances, Windows calculator load time, and others—are timed and the number of users is increased until a specified response time limit is reached.</p>
<p>While this may appear to be straightforward, benchmarking is deceptively difficult, as Project VRC will attest. Through engagements with VMware they have identified a performance gain by using ESX’s hardware assist monitor mode and acknowledged a timing bias in XenServer.  Furthermore, Project VRC has engaged other partners to improve the workload, including minimizing “stuck sessions” and reducing the impact of in-guest sleeps, which cause a variety of problems.</p>
<p>With this new paper, we can now make a few observations about the Project VRC workload and the platforms it runs on:</p>
<ul>
<li>The previously reported XenServer numbers were incorrect.  The old results showed a significant performance lead for XenServer but the update reports that vSphere 4.0 outperforms XenServer 5.5 by 3.5%.</li>
<li>Project VRC has recognized the inaccuracy of sleep system calls in virtual environments, so they have removed them from within timed operations.</li>
<li>Some system sleeps remain in the benchmark, which will result in unpredictable operation density from run to run and across platforms.</li>
</ul>
<p>Project VRC plans to continue to develop their workload until it closely matches customer environments.  This includes fixing any remaining in-guest timing issues that may continue to bias their results.  VMware will continue to work with them to help them eliminate issues and produce a benchmark we can all stand behind.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/09/21/vsphere-performance-leadership-with-terminal-services/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>SQL Server Performance Problems Not Due to VMware</title>
		<link>http://vpivot.com/2009/09/16/sql-server-performance-problems-not-due-to-vmware/</link>
		<comments>http://vpivot.com/2009/09/16/sql-server-performance-problems-not-due-to-vmware/#comments</comments>
		<pubDate>Wed, 16 Sep 2009 11:10:27 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[vmworld]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=24</guid>
		<description><![CDATA[[First re-post of an old favorite.  This document is my most popular blog entry from the communities.] Microsoft SQL Server runs at better than 80% of native on VI3 in most benchmarked environments. In production environments, and under loads that model those conditions, SQL Server runs at 90-95% of native on ESX 3.5. I can [...]]]></description>
			<content:encoded><![CDATA[<p><em>[First re-post of an old favorite.  This document is my most popular blog entry from the <a href="http://communities.vmware.com/blogs/drummonds/2009/03/13/sql-server-performance-problems-not-due-to-vmware">communities</a>.]</em></p>
<p>Microsoft SQL Server runs at better than 80% of native on VI3 in most benchmarked environments.  In production environments, and under loads that model those conditions, SQL Server runs at 90-95% of native on ESX 3.5.  I can say this with confidence, despite considerable industry skepticism, because I&#8217;ve spent so much time on SQL Server in the past half year.  I&#8217;d like to share some of my research and observations on the subject with you.</p>
<p><span id="more-24"></span>Two weeks ago my colleague Chethan Kumar and I presented on SQL Server in Cannes, France for VMworld Europe 2009.  This presentation was the culmination of six months of investigation that started at VMworld 2008 in Las Vegas.  At that event I heard so many customer concerns about SQL Server performance that I resolved to identify the problems&#8217; root causes.  I talked with every customer I could find who claimed that SQL ran at anything less than 70% of native.  So many of these contacts claimed that they had measured SQL at 25% of native or worse that I knew that something was going wrong.</p>
<p>First, let me show you a slide that Chethan presented at the show in Cannes:</p>
<p><img class="jive-image-thumbnail jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-2720-5630/sql_tuning.png" alt="sql_tuning.png" width="620" /></p>
<p>Chethan spent three months investigating SQL Server to find out how much he could improve virtual performance from the &#8220;out of the box&#8221; experience.  As this figure details, the sum total of performance improvements was 15%.  Here&#8217;s another breakdown of these results:</p>
<p><img class="jive-image-thumbnail jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-2720-5632/sql_tuning_summary.png" alt="sql_tuning_summary.png" width="620" /></p>
<p>The only option that we found in ESX to improve virtual performance was static transmit coalescing, which is documented on <a class="jive-link-external" href="http://www.vmware.com/files/pdf/specweb_perf_final.pdf">page four of one of our SPECweb papers</a>.  Large pages and SQL&#8217;s priority boost, which are best practices provided by Microsoft for SQL Server configuration, provide the largest gains in performance.</p>
<p>The key messages that we communicated to our audience were these: a properly running SQL Server should run at 80% of native or better.  In most production cases it can run at a performance indistinguishable from native speed.  And if performance is lagging, there are few changes that can be made to ESX that will yield any performance gains at all.</p>
<p>This raises the question: &#8220;If ESX can&#8217;t be tuned to double SQL performance, what is causing these reports of terrible SQL Server throughput?&#8221;  The great majority of the problems come from misconfigured storage.  But a variety of other items, such as poor hardware selection or use of the wrong virtualization software, contribute to the confusion as well.  I&#8217;ve been documenting these issues in <a class="jive-link-wiki" href="http://communities.vmware.com/docs/DOC-8964">Best Practices for SQL Server</a> on this community and will continue to update that document as more problems are discovered.</p>
<p>If you have a SQL Server running un-virtualized in your environment, I&#8217;d like you to try virtualizing it again.  Follow our best practices document and pay close attention to your storage configuration during deployment.  I feel confident that once you&#8217;ve set up your environment properly, you&#8217;re going to like what you see.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/09/16/sql-server-performance-problems-not-due-to-vmware/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

