<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>vPivot &#187; sql</title>
	<atom:link href="http://vpivot.com/tag/sql/feed/" rel="self" type="application/rss+xml" />
	<link>http://vpivot.com</link>
	<description>Scott Drummonds on Virtualization</description>
	<lastBuildDate>Wed, 01 Feb 2012 06:46:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Databases, Storage, and Solid State Disks</title>
		<link>http://vpivot.com/2010/09/20/databases-storage-and-solid-state-disks/</link>
		<comments>http://vpivot.com/2010/09/20/databases-storage-and-solid-state-disks/#comments</comments>
		<pubDate>Mon, 20 Sep 2010 02:53:07 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[emc]]></category>
		<category><![CDATA[fast]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[vmworld]]></category>
		<category><![CDATA[vmworld europe]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=667</guid>
		<description><![CDATA[A colleague of mine dropped by my desk on Friday to talk about storage best practices for virtualized databases (SQL Server in this case).  He observed a VMware deployment where the data and log files for a SQL Server virtual machine were consolidated on a single VMFS volume backed by a RAID 5 LUN.  &#8221;Is [...]]]></description>
			<content:encoded><![CDATA[<p>A colleague of mine dropped by my desk on Friday to talk about storage best practices for virtualized databases (SQL Server in this case).  He observed a VMware deployment where the data and log files for a SQL Server virtual machine were consolidated on a single VMFS volume backed by a RAID 5 LUN.  &#8220;Is this a VMware best practice?&#8221; he asked.  &#8220;Should you not put the redo logs on a RAID 10 LUN?&#8221;  The answers are &#8216;no&#8217; and &#8216;yes&#8217;, respectively.  And with the solid state disk (SSD) auto-tiering from EMC (FAST) the second answer is an emphatic &#8220;YES!&#8221;</p>
<p><span id="more-667"></span></p>
<p>A perfunctory bit of guidance I include in nearly all of my performance talks (such as the enthralling, entertaining, and cancer-curing* presentations from VMworld 2010 that I will repeat in <a href="http://www.vmworld.com/community/conferences/europe2010/">Copenhagen</a> from 12-14 October) is &#8220;follow your application best practices&#8221;.  Audiences usually nod and immediately forget, because this is a recommendation we all know to be correct yet somehow ignore.  In that way it is like &#8220;stay away from fatty foods&#8221;, &#8220;do not drink wine with pain killers&#8221;, or &#8220;pay attention during the flight attendants&#8217; presentation&#8221;.</p>
<p>Part of the reason people forget this nugget is that the advice is general and not crystallized in a technological explanation that embeds deep in the minds of the audience.  In this case the application best practice that should be followed is to separate data from logs, putting the data on something good for random read performance (like RAID 5) and the logs on something good for sequential write performance (RAID 10).  Obviously I want everyone to consolidate their storage to VMFS and enjoy the technology, but if you are putting VMDKs that contain each of these files on the same volume, you are ignoring application best practices.</p>
<p>In this case I recommend building two VMFS volumes, one backed by RAID 5 and the other by RAID 10.  Put the data on RAID 5 and the logs on RAID 10.  While you will change the access profile at the array by putting multiple log files on the same RAID 10 backed LUN, the resultant IO will be much closer to sequential write than if you had mixed data file reads among them.  So, consolidate multiple data files onto the same RAID 5 LUN and consolidate multiple log files on the same RAID 10 LUN.</p>
<p>Furthermore, if you are using solid state auto-tiering to manage your volumes, you do <em>not</em> need to protect your database log file with this technology.  What I am talking about here is EMC&#8217;s Fully Automated Storage Tiering (FAST), which is the most popular thing EMC has created since I have been paying attention.  Despite what some people will tell you, solid state disks are the cheapest way to serve huge amounts of random reads.  But their benefits diminish when the profile is sequential write, where they become unattractive from a cost perspective.</p>
<p>EMC&#8217;s FAST works by creating a volume that is like a vertical stripe of multiple RAID groups.  LUNs, which become VMFS volumes, are then placed in that FAST volume.  Since FAST is a great technology for solid state disks, RAID 5 is the most cost efficient configuration for database data, and solid state is wasted on sequential IO such as redo logs, my best practice for virtual storage configuration for database workloads when FAST may be present can be boiled down to the following rules:</p>
<ul>
<li>Always create RAID 5 volumes for your read-intensive database data.</li>
<li>Always create RAID 10 volumes for your database logs.  If you have write-intensive data, you may consider putting them here, too.</li>
<li>If you have FAST, use it to stripe across multiple RAID 5 volumes of different disk types and put your random, read-intensive data on VMFS on this volume.</li>
</ul>
<p>The last bullet is clearly the most important here. I really love FAST, and it seems that EMC&#8217;s customers are crazy for it.  But it&#8217;s not the technology you need for sequential write workloads like redo logs.  Separate that data onto its own &#8220;normal&#8221; (not FAST-backed) VMFS volumes that use no SSDs.  Then you will have the best of all worlds: optimally deployed disk technologies, application best practice compliance, and righteous virtualized database consolidation.</p>
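<p>The three rules above can be written down as a small decision function.  This is only a sketch in Python: the <code>DiskFile</code> type, the <code>choose_volume</code> name, and the volume labels are hypothetical, made up for illustration, and the function encodes nothing beyond the rules in this article.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskFile:
    """A database file that will live in a VMDK (hypothetical type)."""
    name: str
    access: str          # "random" or "sequential"
    write_heavy: bool    # True for logs and write-intensive data


def choose_volume(f: DiskFile, fast_available: bool) -> str:
    """Pick a VMFS volume type using the rules from the list above."""
    # Logs (sequential writes) and write-intensive data go to RAID 10,
    # on a volume that is not FAST-backed and uses no SSDs.
    if f.access == "sequential" or f.write_heavy:
        return "RAID 10, no FAST"
    # Random, read-intensive data benefits from FAST when it is present.
    if fast_available:
        return "FAST over RAID 5"
    # Otherwise plain RAID 5 is the cost-efficient choice for data.
    return "RAID 5"


if __name__ == "__main__":
    files = [
        DiskFile("data.mdf", access="random", write_heavy=False),
        DiskFile("log.ldf", access="sequential", write_heavy=True),
    ]
    for f in files:
        print(f.name, "goes on", choose_volume(f, fast_available=True))
```

<p>The point of writing it this way is that the log file never lands on a FAST-backed volume, whether or not FAST is available.</p>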
<p>(*) The claims made by the author of this blog do not reflect the views of his employer, the conference organizers, the government of the Kingdom of Denmark, or reality, for that matter.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/09/20/databases-storage-and-solid-state-disks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Performance Tip for ESX 3.0 and ESX 3.5</title>
		<link>http://vpivot.com/2010/04/20/a-performance-tip-for-esx-3-0-and-esx-3-5/</link>
		<comments>http://vpivot.com/2010/04/20/a-performance-tip-for-esx-3-0-and-esx-3-5/#comments</comments>
		<pubDate>Tue, 20 Apr 2010 22:37:28 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[amd]]></category>
		<category><![CDATA[intel]]></category>
		<category><![CDATA[monitor]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=415</guid>
		<description><![CDATA[Do you have any running instances of ESX 3.5 or older?  Are those instances running on processors that are no more than a couple of years old?  If so, I have a tip for you: update your hosts to ESX 4.0. Seriously, upgrade to vSphere already. It&#8217;s been out for a year! All kidding aside, [...]]]></description>
			<content:encoded><![CDATA[<p>Do you have any running instances of ESX 3.5 or older?  Are those instances running on processors that are no more than a couple of years old?  If so, I have a tip for you: update your hosts to ESX 4.0.  Seriously, upgrade to vSphere already.  It&#8217;s been out for a year!</p>
<p><span id="more-415"></span></p>
<p>All kidding aside, yesterday I wrote an article about a sneaky trick to leverage improved hardware assist performance on ESX 3.5 virtual machines that defaulted to binary translation.  I learned very late in the day that the recommended guidance only works for AMD processors.  The text below is from the original post but has since been updated to reflect my newly discovered information.</p>
<h2>Begin Updated Article</h2>
<p>Do you have any running instances of ESX 3.5 or older?  Are those instances running on AMD processors that are no more than a couple of years old?  If so, I have a tip for you: force hardware assist in those virtual machines.  In most situations application performance will improve by 10% or more.  Details follow.</p>
<p>ESX&#8217;s monitor presents virtual hardware to virtual machines&#8217; guest operating systems.  VMware&#8217;s multi-mode monitor uses three technologies to do this: hardware assist, para-virtualization, and binary translation.  Hardware assist has gotten much faster over the years, as this figure demonstrates.</p>
<div id="attachment_418" class="wp-caption alignnone" style="width: 570px"><a href="http://vpivot.com/wp-content/uploads/2010/04/vmexit_latency.png"><img class="size-full wp-image-418" title="VMEXIT Latencies" src="http://vpivot.com/wp-content/uploads/2010/04/vmexit_latency.png" alt="" width="560" height="294" /></a><p class="wp-caption-text">The latency of the VMEXIT instruction is shown on Intel VT systems.  The longer this instruction takes to execute, the worse the virtual machine performs.</p></div>
<p>Johan De Gelas included his take on monitor mode performance when he reported &#8220;Virtualization Round Trip Latency&#8221; in a <a href="http://it.anandtech.com/show/2964/the-intel-xeon-5670-six-improved-cores">recent article on the new Xeon 5600</a>.  His results reiterate the trend I have been sharing with my audiences for over a year now.</p>
<p>Because hardware assist was once so slow, older versions of ESX would utilize our faster-performing binary translation in many situations.  But virtualization assist in today&#8217;s processors&#8211;and here I am talking about Intel and AMD processors manufactured in the past two years&#8211;is generally faster than binary translation. This means your virtual machines running on ESX 3.5 on shiny new processors may not be reaching their full potential performance.</p>
<p>The fix is simple: force hardware assist for your ESX 3.0 and ESX 3.5 virtual machines running on newer AMD processors.  You can do this with the following line in your virtual machines&#8217; VMX files:</p>
<blockquote><p>monitor.virtual_mmu = hardware</p></blockquote>
<p>A reboot is then required.  But this setting only works for AMD processors, where RVI (AMD&#8217;s hardware memory management unit) is available.  With this setting both AMD-V and RVI are forced on.  This setting is ignored on Intel processors, whose hardware MMU is not leveraged on ESX 3.5.</p>
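<p>If you have many virtual machines to change, the edit can be scripted.  What follows is a minimal Python sketch, not a VMware tool.  The helper name <code>set_vmx_option</code> is hypothetical, and it assumes a VMX file is plain text with one <code>key = "value"</code> entry per line and the value in double quotes.  The line quoted above omits the quotes, so compare against one of your own VMX files before relying on this.</p>

```python
def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return VMX text with `key` set to `value`, replacing any existing entry.

    Assumption: each entry is a single line of the form  key = "value".
    """
    kept = [
        line
        for line in vmx_text.splitlines()
        # Drop any earlier setting of the same key (keys compared case-insensitively).
        if line.split("=", 1)[0].strip().lower() != key.lower()
    ]
    kept.append(f'{key} = "{value}"')
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = 'displayName = "sql01"\nmonitor.virtual_mmu = "software"\n'
    print(set_vmx_option(sample, "monitor.virtual_mmu", "hardware"))
```

<p>The function works on text rather than on a file path so that you can inspect the result before writing anything back to the datastore.</p>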
<p>These changes are not needed with vSphere because its default monitor modes favor hardware assist more than ESX 3.5 and earlier.  You can see vSphere&#8217;s default monitor modes in our wonderful <a href="http://www.vmware.com/files/pdf/perf-vsphere-monitor_modes.pdf">monitor modes paper</a>.</p>
<p>As one example of the magnitude of performance increase this small change can produce, look to the <a href="http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf">SQL Server performance paper</a> we released last year.  Here is one graph I lifted from that document.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2010/04/monitor_performance.png"><img class="alignnone size-full wp-image-419" title="SQL Server Performance of Different Monitor Modes" src="http://vpivot.com/wp-content/uploads/2010/04/monitor_performance.png" alt="" width="584" height="333" /></a></p>
<p>This figure shows AMD-V improving performance by 18% over binary translation on virtual machines running SQL Server 2008.  More gain is possible when the hardware memory management unit is utilized.</p>
<p>Because of ESX 3.5&#8217;s continued wide deployment, it may very well be running millions of virtual machines.  Many of those virtual machines are running on newer AMD processors that can benefit from this change.  Go forth and reconfigure those virtual machines to claim the performance to which you are entitled!</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/04/20/a-performance-tip-for-esx-3-0-and-esx-3-5/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Virtual Storage Design: Application Consolidation</title>
		<link>http://vpivot.com/2010/01/15/virtual-storage-design-application-consolidation/</link>
		<comments>http://vpivot.com/2010/01/15/virtual-storage-design-application-consolidation/#comments</comments>
		<pubDate>Fri, 15 Jan 2010 17:27:37 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[emc world]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=246</guid>
		<description><![CDATA[Fixed recommendations for consolidation ratios are cancerous.  Whether we are talking about vCPUs per core, virtual machines per host, or VMDKs per LUN, there is no single number that represents the &#8220;right&#8221; ratio.  Accurate guidance requires workload characterization and fine tuning using vSphere&#8217;s performance counters.  Today I want to highlight one experiment that shows application [...]]]></description>
			<content:encoded><![CDATA[<p>Fixed recommendations for consolidation ratios are cancerous.  Whether we are talking about vCPUs per core, virtual machines per host, or VMDKs per LUN, there is no single number that represents the &#8220;right&#8221; ratio.  Accurate guidance requires workload characterization and fine tuning using vSphere&#8217;s performance counters.  Today I want to highlight one experiment that shows application choice impacting VMDK-to-LUN consolidation.  The inescapable conclusion is that sequential access data must be separated from random access files!</p>
<p><span id="more-246"></span>In a <a href="http://www.vmware.com/files/pdf/partners/academic/vpact-workloads.pdf">2009 VPACT paper</a>, VMware engineers showed how applications perform when their storage is consolidated.  This paper is a bit academic for the average virtualization nut, but does contain insights into choosing which VMDKs to consolidate into a single VMFS volume.  It does this by running applications that generate random or sequential IO and comparing performance with isolated (dedicated) storage to performance using consolidated storage.</p>
<p>The first experiment tested DVD Store against Microsoft SQL Server and Oracle Swingbench OLTP against an Oracle database.  These OLTP workloads result in random IO on the data disks.  In the isolation experiment each virtual machine was run with its VMDK on a three-disk RAID 5 LUN (2+1).  In the consolidation experiment, both virtual machines&#8217; VMDKs were put on a common six-disk RAID 5 LUN (5+1).  Here are the results from table 1 of the paper:</p>
<div id="attachment_247" class="wp-caption alignnone" style="width: 610px"><a href="http://vpivot.com/wp-content/uploads/2010/01/random_consolidation.png"><img class="size-full wp-image-247" title="Consolidating Random Access Virtual Disks" src="http://vpivot.com/wp-content/uploads/2010/01/random_consolidation.png" alt="Consolidating Random Access Virtual Disks" width="600" /></a><p class="wp-caption-text">Consolidating Random Access Virtual Disks</p></div>
<p>The &#8220;application metric&#8221;, transactions per minute, is the most important indicator of the end user&#8217;s observed performance.  You can see from the results that consolidating (sharing) storage of random workloads does not harm performance at all.  In fact, SQL Server performance increased by 25%, reflecting the relative increase in data disks per stripe.</p>
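<p>The 25% figure lines up with simple stripe arithmetic.  Here is a minimal Python sketch, assuming that &#8220;data disks per stripe&#8221; means the fraction of each RAID 5 stripe that holds data rather than parity.  That reading, and the <code>data_fraction</code> helper, are assumptions for illustration and not a calculation taken from the paper.</p>

```python
from fractions import Fraction


def data_fraction(data_disks: int, parity_disks: int = 1) -> Fraction:
    """Fraction of a RAID 5 stripe that carries data rather than parity."""
    return Fraction(data_disks, data_disks + parity_disks)


isolated = data_fraction(2)      # the 2+1 LUN used in the isolation experiment
consolidated = data_fraction(5)  # the 5+1 LUN used in the consolidation experiment

# Relative increase in the data share of the stripe when moving to the larger LUN.
relative_gain = consolidated / isolated - 1

if __name__ == "__main__":
    print(f"isolated={isolated} consolidated={consolidated} gain={float(relative_gain):.0%}")
```

<p>Going from two data disks in three to five data disks in six raises the data share of the stripe by exactly one quarter, which matches the improvement reported for SQL Server.</p>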
<p>The second experiment again used DVD Store against SQL Server for a random IO workload.  But instead of a second random workload, an Oracle database was tested by the Swingbench Decision Support System, which results in highly sequential access.  Here are the results of that experiment, taken from table 2 of the paper:</p>
<div id="attachment_248" class="wp-caption alignnone" style="width: 610px"><a href="http://vpivot.com/wp-content/uploads/2010/01/sequential_consolidation.png"><img class="size-full wp-image-248" title="Sequential Workload Consolidation" src="http://vpivot.com/wp-content/uploads/2010/01/sequential_consolidation.png" alt="Sequential Workload Consolidation" width="600" /></a><p class="wp-caption-text">Sequential Workload Consolidation</p></div>
<p>The random workload, SQL plus DVD Store, again improved as the relative percentage of data disks in the RAID volume increased.  But the Decision Support System workload, so heavily dependent on sequential storage performance, suffered greatly.  DSS performance dropped 30% when measured by IO throughput and 50% when measured by completed transactions.</p>
<p>Applications with a sequential storage access pattern can be heavily dependent on the array&#8217;s ability to coalesce IO requests and complete large numbers of IOs very rapidly.  But when a VI admin includes a random access VMDK on the same LUN, the aggregate, interleaved LUN access is no longer sequential.  This slows down the array&#8217;s sequential throughput, and the effects are profound at the application level.</p>
<p>There are two summary recommendations from this experiment:</p>
<ul>
<li>A smaller number of RAID 5 volumes using many disks will outperform a larger number of RAID 5 volumes that use fewer disks.  This is due to the relative decrease in parity on the configuration.</li>
<li>VMDKs with random access can be consolidated to a single VMFS volume safely but sequential access pattern files should be separated to their own LUNs.</li>
</ul>
<p>The VMware performance team has a lot more to say about storage design to maximize application performance in virtual environments.  I will have more blog articles as the weeks progress and a white paper to share in the second quarter of 2010.  I expect it to be ready no later than <a href="http://www.emcworld.com/">EMC World 2010</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/01/15/virtual-storage-design-application-consolidation/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>KVM Performance</title>
		<link>http://vpivot.com/2009/09/30/kvm-performance/</link>
		<comments>http://vpivot.com/2009/09/30/kvm-performance/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 15:35:26 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[kvm]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=91</guid>
		<description><![CDATA[A few days ago someone forwarded me a blog article with an interesting claim about KVM performance: Testing results from internal and customers showed SAP workloads: 85-95, Oracle OLTP: 80-92% bare metal. LAMP stack showed better than bare metal performance. Whitepapers will be published in how this was achieved. Java achieved up to 94% bare [...]]]></description>
			<content:encoded><![CDATA[<p>A few days ago someone forwarded me <a href="http://www.linux-kvm.com/content/intro-rhev-video-redhat-summit-2009">a blog article</a> with an interesting claim about KVM performance:</p>
<blockquote><p>Testing results from internal and customers showed SAP workloads: 85-95, Oracle OLTP: 80-92% bare metal. LAMP stack showed better than bare metal performance. Whitepapers will be published in how this was achieved. Java achieved up to 94% bare metal.</p></blockquote>
<p>Frankly, I was surprised to hear this.  KVM is a hosted virtualization platform, equivalent to the <a href="http://www.vmware.com/products/server/">free VMware Server</a>, which runs on top of a host operating system.  VMware Server is fine for a virtual machine or two, but you would not want it hosting your critical business applications.  The above KVM claim suggests that KVM possesses hypervisor-like performance.  So we ran a test with a few virtual machines to see what we could learn.  These tests confirmed my suspicions: KVM is a very long way from enterprise-class virtualization performance.</p>
<p><span id="more-91"></span>The thing to remember about virtualization benchmarking is that any vendor can provide virtualization software (hosted or hypervisor) that can virtualize a single application at better than 80% of native performance.  VMware has been doing this for a decade.  But it is extraordinarily difficult to build a hypervisor that can scale with many virtual machines.  Maybe this is one reason why you have never seen Microsoft or Citrix post results from a consolidated workload.  But I digress.</p>
<p>We decided that the easiest way to test this environment with a light/moderate enterprise workload is to use two or three VMs running SQL Server, as tested by <a href="http://www.delltechcenter.com/page/DVD+Store">DVD Store 2 (DS2)</a>.  We tried four configurations of these VMs:</p>
<ul>
<li>Case A: Two 4-way virtual machines.</li>
<li>Case B: Two 3-way VMs and one 2-way.</li>
<li>Case C: Three 3-way VMs.</li>
<li>Case D: Three 4-way VMs.</li>
</ul>
<p>Each virtual machine ran on an HP DL380 G5 and was given 4 GB of memory.</p>
<p>Finding the right number of threads per virtual machine took some time.  Threads on the DS2 client determine the volume of transactions that are generated against the SQL Server.  We wanted to get the highest throughput for a reasonable latency, which we set at 33 ms.  Here are the best numbers I could produce for vSphere and KVM.</p>
<table id="newspaper-a">
<tbody>
<tr>
<th rowspan="2">Case</th>
<th colspan="2">Total OPM</th>
<th colspan="2">Avg. Response Time (ms)</th>
</tr>
<tr>
<th>vSphere</th>
<th>KVM</th>
<th>vSphere</th>
<th>KVM</th>
</tr>
<tr>
<td>A</td>
<td>58095</td>
<td>removed</td>
<td>33</td>
<td>removed</td>
</tr>
<tr>
<td>B</td>
<td>59741</td>
<td>removed</td>
<td>33</td>
<td>removed</td>
</tr>
<tr>
<td>C</td>
<td>52899</td>
<td>removed</td>
<td>33</td>
<td>removed</td>
</tr>
<tr>
<td>D</td>
<td>50996</td>
<td>removed</td>
<td>34</td>
<td>removed</td>
</tr>
</tbody>
</table>
<p>The very best performance that KVM could muster was only <em>removed</em>% of vSphere&#8217;s performance on the same configuration.  Notice that at 50% CPU over-commitment (1.5 vCPUs for each CPU), KVM&#8217;s performance <em>removed</em>.  Its throughput fell to <em>removed</em>% of vSphere and its response time <em>removed</em>.  Increasing threads in this configuration actually made throughput and latency worse.</p>
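<p>To see which case corresponds to 50% over-commitment, add up the vCPUs in each configuration and divide by the host&#8217;s cores.  The Python sketch below assumes the host has 8 physical cores.  That number is not stated in this article; it is inferred from the 1.5 vCPUs-per-CPU figure and the twelve vCPUs in case D, so treat it as an assumption.</p>

```python
# vCPU counts per virtual machine for the four test cases listed earlier.
CASES = {
    "A": [4, 4],
    "B": [3, 3, 2],
    "C": [3, 3, 3],
    "D": [4, 4, 4],
}

HOST_CORES = 8  # assumption: inferred from the 1.5 ratio, not stated in the article


def vcpus_per_core(vcpus: list[int], cores: int = HOST_CORES) -> float:
    """Total vCPUs divided by physical cores; 1.0 means no over-commitment."""
    return sum(vcpus) / cores


if __name__ == "__main__":
    for case, vcpus in CASES.items():
        print(f"Case {case}: {sum(vcpus)} vCPUs, {vcpus_per_core(vcpus):.3f} per core")
```

<p>Under that assumption cases A and B exactly fill the host, case C is slightly over-committed, and case D is the 1.5 vCPUs-per-CPU configuration.</p>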
<p>I had suspected that KVM would show hosted platform performance, as it relies on a host operating system.  It appears my suspicions were correct.  It will be tough for Red Hat to sell this product as part of an enterprise product.  To do so they will likely publish results based on single virtual machines and in environments where the CPUs are under-committed.</p>
<p>Lastly, this is the only workload that we have attempted.  I would expect KVM to do much worse when more virtual machines are part of the test or if network or storage throughput becomes significant.  But we have no plans to spend time on KVM benchmarking.  As I mentioned in <a href="http://www.catalyst.burtongroup.com/Na09/PlayerVideo011.html">my performance debate at Catalyst 2009</a>, I think that each vendor should do its own benchmarking to best represent its products.  I challenge Red Hat to post a KVM number using TPC, SPECweb, VMmark, vConsolidate, or any enterprise-class workload.  Customers should expect nothing less of their virtualization vendor.</p>
<h2>10/2/09 Update</h2>
<p>I decided to remove the KVM results to allow Red Hat or a KVM enthusiast to show their own best results on a consolidated workload.  I recommend VMmark or vConsolidate.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/09/30/kvm-performance/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>vSphere Is Not the Performance Problem, Your Storage Is</title>
		<link>http://vpivot.com/2009/09/18/storage-is-the-problem/</link>
		<comments>http://vpivot.com/2009/09/18/storage-is-the-problem/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 00:00:43 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[vscsistats]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=12</guid>
		<description><![CDATA[[This is an update to one of my favorite articles, which details my on-site investigation of SQL Server performance problems.] Back in July I had the privilege of riding along with VMware&#8217;s Professional Services Organization as they piloted a possible performance offering. We are considering two possible services: one for performance troubleshooting and another for [...]]]></description>
			<content:encoded><![CDATA[<p><em>[This is an update to <a href="http://communities.vmware.com/blogs/drummonds/2009/08/17/first-success-of-vmwares-performance-service-offering">one of my favorite articles</a>, which details my on-site investigation of SQL Server performance problems.]</em></p>
<p>Back in July I had the privilege of riding along with VMware&#8217;s Professional Services Organization as they piloted a possible performance offering.  We are considering two possible services: one for performance troubleshooting and another for infrastructure optimization.  During this trip we piloted the troubleshooting service, focusing on the customer&#8217;s disappointing experience with SQL Server&#8217;s performance on vSphere.</p>
<p><span id="more-12"></span>If you have read my blog entries (<a class="jive-link-blogpost" href="http://communities.vmware.com/blogs/drummonds/2009/03/13/sql-server-performance-problems-not-due-to-vmware">SQL Server Performance Problems Not Due to VMware</a>) or <a class="jive-link-external" href="http://www.vmware.com/a/webcasts/details/265">heard me speak</a>, you know that SQL performance is a major focus of my work.  SQL Server is the most common source of performance discontent among our customers, yet 100% of the problems I have diagnosed were not due to vSphere.  When this customer described the problem, I knew this SQL Server issue was stereotypical of my many engagements:</p>
<blockquote><p>&#8220;We virtualized our environment nearly a year ago and quickly determined that virtualization was not right for our SQL Servers.  Performance dropped by 75% and we know this is VMware&#8217;s fault because we virtualized on much newer hardware on the exact same SAN.  We have since moved the SQL instance back to native.&#8221;</p></blockquote>
<p>Most professionals in the industry stop here, incorrectly file this problem as a deficiency of virtualization, and move on with their deployments.  But I know that <a class="jive-link-external" href="http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf">vSphere&#8217;s abilities with SQL Server</a> are phenomenal, so I expect to make every user happy with their virtual SQL deployment. I start by challenging the assumptions and trust nothing that I have not seen for myself.  Here are my first steps on the hunt for the source of the problem:</p>
<ol>
<li>Instrument the SQL instance that has been moved back to native to profile its resource utilization.  Do this by running Perfmon to collect stats on the database&#8217;s memory, CPU, and disk usage.</li>
<li>Audit the infrastructure and document the SAN configuration.  Primarily I will need RAID group and LUN configuration and an itemized list of VMDKs on each VMFS volume.</li>
<li>Use esxtop and vscsiStats to measure resource utilization of important VMs under peak production load.</li>
</ol>
<p>There are about a dozen other things that I could do here, but my experience in these issues is that I can find 90% of all performance problems with just these three steps.  Let me start by showing you the two RAID groups that were most important to the environment.  I have greatly simplified the process of estimating these groups&#8217; performance, but the rough estimate will serve for this example:</p>
<table id="newspaper-a">
<tbody>
<tr>
<th>RAID Group</th>
<th>Configuration</th>
<th>Performance Estimate</th>
</tr>
<tr>
<td>A</td>
<td>RAID5 using 4 15K disks</td>
<td>4 x 200 = 800 IOPS</td>
</tr>
<tr>
<td>B</td>
<td>RAID5 using 7 10K disks</td>
<td>7 x 150 = 1050 IOPS</td>
</tr>
</tbody>
</table>
<p>We found two SQL instances in their environment that were generating significant IO: one that had been moved back to native and one that remained in a virtual machine.  By using Perfmon for the native instance and vscsiStats for the virtual one, we documented the following demands during a one-hour window:</p>
<table id="newspaper-a">
<tbody>
<tr>
<th>SQL Instance</th>
<th>Peak IOPS</th>
<th>Average IOPS</th>
</tr>
<tr>
<td>X (physical)</td>
<td>1800</td>
<td>850</td>
</tr>
<tr>
<td>Y (virtual)</td>
<td>1000</td>
<td>400</td>
</tr>
</tbody>
</table>
<p>In the customer&#8217;s first implementation of the virtual infrastructure, both SQL Servers, X and Y, were placed on RAID group A.  But in the native configuration SQL Server X was placed on RAID group B.  This meant that the storage bandwidth of the physical configuration was approximately 1850 IOPS.  In the virtual configuration the two databases shared a single 800 IOPS RAID volume.</p>
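<p>The arithmetic behind these numbers is worth spelling out.  The Python sketch below uses only the figures from the two tables above: the rough per-disk estimates (200 IOPS for a 15K disk, 150 for a 10K disk) and the measured demands.  The function and variable names are hypothetical, and the estimate ignores RAID write penalties and cache, just as the table does.</p>

```python
# Rough per-disk estimates used in the RAID group table above.
PER_DISK_IOPS = {"15K": 200, "10K": 150}


def group_iops(disks: int, speed: str) -> int:
    """Simplified RAID group estimate: number of disks times per-disk IOPS."""
    return disks * PER_DISK_IOPS[speed]


group_a = group_iops(4, "15K")   # RAID group A
group_b = group_iops(7, "10K")   # RAID group B

# Measured demand from the second table (IOPS over a one-hour window).
average_demand = {"X": 850, "Y": 400}
peak_demand = {"X": 1800, "Y": 1000}

# Physical layout: X on group B and Y on group A, so both groups serve IO.
physical_capacity = group_a + group_b

# Virtual layout: X and Y both placed on group A.
virtual_capacity = group_a
combined_average = sum(average_demand.values())
combined_peak = sum(peak_demand.values())

# Negative headroom means the group cannot serve even the average demand.
virtual_headroom = virtual_capacity - combined_average

if __name__ == "__main__":
    print(f"physical capacity: {physical_capacity} IOPS")
    print(f"virtual capacity:  {virtual_capacity} IOPS")
    print(f"combined demand:   {combined_average} average, {combined_peak} peak")
    print(f"virtual headroom at average load: {virtual_headroom} IOPS")
```

<p>With both instances on group A, the combined average demand alone exceeds what the group is estimated to deliver, before any peak is considered.</p>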
<p>It does not take a rocket scientist to realize that users are going to complain when a critical SQL Server instance goes from 1050 IOPS to 400.  And this was not news to the VI admin on-site, either.  What we found as we investigated further was that virtual disks requested by the application owners were used in unexpected and undocumented ways and frequently demanded more throughput than originally estimated.  In fact, through <a href="http://communities.vmware.com/docs/DOC-10095">vscsiStats</a> analysis, my contact and I were able to identify an &#8220;unused&#8221; VMDK with moderate sequential IO that we immediately recognized as log traffic.  Inspection of the application&#8217;s configuration confirmed this.</p>
<p>Despite the explosion of VMware into the data center we remain the new kid on the block.  As soon as performance suffers the first reaction is to blame the new kid.   But next time you see a performance problem in your production environment, I urge you to look at the issue as a consolidation challenge, and not a virtualization problem.  Follow the best practices you have been using for years and you can correct this problem without needing to call me and my colleagues to town.</p>
<p>Of course, if you want to fly us out to help you correct a specific problem or optimize your design, I promise we will make it worth your while.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/09/18/storage-is-the-problem/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Newer Processors and Virtualization Performance</title>
		<link>http://vpivot.com/2009/09/16/newer-processors-and-virtualization-performance/</link>
		<comments>http://vpivot.com/2009/09/16/newer-processors-and-virtualization-performance/#comments</comments>
		<pubDate>Wed, 16 Sep 2009 20:08:33 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[amd]]></category>
		<category><![CDATA[cpu]]></category>
		<category><![CDATA[ept]]></category>
		<category><![CDATA[intel]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[monitor]]></category>
		<category><![CDATA[rvi]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[vmkernel]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=18</guid>
		<description><![CDATA[[This is an update, with new content, to an old article from the performance community.] Newer processors are much more important to virtualized environments than to their non-virtualized counterparts. Generational improvements have not just increased the raw compute power; they have also reduced virtualization overheads. This blog entry will describe three key changes [...]]]></description>
			<content:encoded><![CDATA[<p><em>[This is an update, with new content, to an <a href="http://communities.vmware.com/blogs/drummonds/2009/06/02/newer-processors-and-virtualization-performance">old article from the performance community</a>.]</em></p>
<p>Newer processors are much more important to virtualized environments than to their non-virtualized counterparts. Generational improvements have not just increased the raw compute power; they have also reduced virtualization overheads.  This blog entry will describe three key changes that have particularly impacted virtual performance.</p>
<h2><span id="more-18"></span>Hardware Assist Is Faster</h2>
<p>In 2008, with the launch of the Opteron 1300, 2300 and 8300 parts, AMD became the first CPU vendor to produce a hardware memory management unit equipped to support virtualization.  They called this technology Rapid Virtualization Indexing (RVI).  This year Intel did the same with Extended Page Tables (EPT) on its Xeon 5500 line.  Both vendors have been providing the ability to virtualize privileged instructions since 2006, with continually improving results.  Consider the following graph showing the latency of one key instruction from Intel:</p>
<p><img class="jive-image-thumbnail jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-3171-5926/vmexit_latencies.png" alt="vmexit_latencies.png" width="620" /></p>
<p>This instruction, VMEXIT, is called each time the guest exits to the kernel.  The graph shows its latency (delay) in completing this instruction, which represents a wait time incurred by the guest.  Clearly Intel has made great strides in reducing VMEXIT&#8217;s wait time from its Netburst parts (Prescott and Cedar Mill) to its Core architecture (Merom and Penryn) and on to its current generation, Core i7 (Nehalem).  AMD processors have shown commensurate gains with AMD-V.</p>
<p>In a recent <a href="http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf">white paper detailing SQL Server on vSphere</a>, the following graph showed the gains derived by using AMD-V in the Opteron 8324 (Shanghai).</p>
<div id="attachment_33" class="wp-caption alignnone" style="width: 609px"><img class="size-full wp-image-33" title="Monitor Mode and SQL Server Performance" src="http://vpivot.com/wp-content/uploads/2009/06/picture-3.png" alt="Binary translation, AMD-V, and AMD-V plus RVI are measured using SQL Server." width="599" height="343" /><p class="wp-caption-text">Binary translation, AMD-V, and AMD-V plus RVI are measured using SQL Server.</p></div>
<p>This graph shows the practical value of the great gains that CPU manufacturers have made with virtualization assist.  Hardware assist can now be regularly relied upon for great performance.</p>
<h2>Pipelines Are Shorter</h2>
<p>The longest pipelines in the x86 world were in Intel&#8217;s Netburst processors.  These processors&#8217; pipelines had twice as many stages as their counterparts at AMD and twice as many as the generation of Intel CPUs that followed.  The increased pipeline length would have enabled support for 8 GHz silicon, had it arrived.  Instead, silicon switching speeds hit a wall at 4 GHz and Intel (and its customers) were forced to suffer the drawbacks of long pipelines.</p>
<p>Long pipelines are not necessarily a problem for desktop environments, where single-threaded applications used to dominate the market.  But in the enterprise, application thread counts were larger.  Furthermore, consolidation in virtual environments drove thread counts even higher.  With more contexts in the processor, the number of pipeline stalls and flushes increased, and efficiency fell.</p>
<p>Because of the decreased efficiency of consolidated workloads on processors with long pipelines, VMware has often recommended that performance-intensive VMs be run on processors no older than 2-3 years.  This excludes Intel&#8217;s Netburst parts.  VI3 and vSphere will do a fine job of virtualizing your less-demanding applications on any supported processors.  But you should use newer parts for the applications for which you hold the highest performance expectations.</p>
<h2>Caches Are Larger</h2>
<p>A cache is highly effective when it fully contains the software&#8217;s working set.  The addition from the hypervisor of even a small amount of code will change the working set and reduce the cache hit rate.  I&#8217;ve attempted to illustrate this concept with the following simplified view of the relationship between cache hit rates, application working set, and cache sizes:</p>
<div id="attachment_34" class="wp-caption alignnone" style="width: 610px"><img class="size-full wp-image-34" title="Cache Size, Working Set, and Performance" src="http://vpivot.com/wp-content/uploads/2009/06/cache_size_perf.png" alt="Performance drops with small cache systems for even small increases to working set size." width="600" height="400" /><p class="wp-caption-text">Performance drops with small cache systems for even small increases to working set size.</p></div>
<p>This graph is based on a model that greatly simplifies working sets and the hypervisor&#8217;s impact on them.  Assuming that ESX increases the working set by 256 KB, this graph shows the decrease in cache hit rate due to the contributions of the hypervisor.  Notice that with very small caches and very small application working sets, the cache hit rate suffers greatly due to the addition of even 256 KB of virtualization code.  And even up to 2 MB, a 10% decrease in cache hit rate can be seen in some applications.  With a 256 KB contribution by the kernel, cache hit rates do not change significantly with cache sizes of 4 MB and beyond.</p>
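<p>The model behind the graph is not spelled out here, so the following is only a toy sketch of the same idea, not the model used to produce the figure.  It assumes that the hit rate equals the fraction of the working set that fits in the cache, and it looks at the worst case in which the application alone exactly fills the cache.  The function names are hypothetical; only the 256 KB hypervisor contribution comes from the text.</p>

```python
HYPERVISOR_KB = 256  # working-set contribution assumed in the text


def hit_rate(cache_kb, working_set_kb):
    """Toy assumption: hit rate is the fraction of the working set that fits."""
    return min(1.0, cache_kb / working_set_kb)


def hit_rate_drop(cache_kb, app_working_set_kb, extra_kb=HYPERVISOR_KB):
    """Hit rate lost when the hypervisor adds extra_kb to the working set."""
    native = hit_rate(cache_kb, app_working_set_kb)
    virtual = hit_rate(cache_kb, app_working_set_kb + extra_kb)
    return native - virtual


# Worst case for each cache size: the application alone just fits in the cache.
for cache_kb in (512, 1024, 2048, 4096):
    drop = hit_rate_drop(cache_kb, app_working_set_kb=cache_kb)
    print(f"{cache_kb:>5} KB cache: hit rate drops by {drop:.1%}")
```

<p>Even in this crude form, the penalty shrinks quickly as the cache grows, and it disappears entirely when the combined working set still fits in the cache.</p>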
<p>In some cases a 10% improvement in cache hit rate can double application throughput.  This means that a doubling of cache size can profoundly affect the performance of virtual applications as compared to native.  Given ESX&#8217;s small contribution to the working set, you can see why we at VMware recommend that customers run their performance-intensive workloads on CPUs with 4 MB caches or larger.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/09/16/newer-processors-and-virtualization-performance/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>SQL Server Performance Problems Not Due to VMware</title>
		<link>http://vpivot.com/2009/09/16/sql-server-performance-problems-not-due-to-vmware/</link>
		<comments>http://vpivot.com/2009/09/16/sql-server-performance-problems-not-due-to-vmware/#comments</comments>
		<pubDate>Wed, 16 Sep 2009 11:10:27 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[vmworld]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=24</guid>
		<description><![CDATA[[First re-post of an old favorite.  This document is my most popular blog entry from the communities.] Microsoft SQL Server runs at better than 80% of native on VI3 in most benchmarked environments. In production environments, and under loads that model those conditions, SQL Server runs at 90-95% of native on ESX 3.5. I can [...]]]></description>
			<content:encoded><![CDATA[<p><em>[First re-post of an old favorite.  This document is my most popular blog entry from the <a href="http://communities.vmware.com/blogs/drummonds/2009/03/13/sql-server-performance-problems-not-due-to-vmware">communities</a>.]</em></p>
<p>Microsoft SQL Server runs at better than 80% of native on VI3 in most benchmarked environments.  In production environments, and under loads that model those conditions, SQL Server runs at 90-95% of native on ESX 3.5.  I can say this with confidence despite a large amount of the industry&#8217;s skepticism because I&#8217;ve spent so much time on SQL Server in the past half year.  I&#8217;d like to share some of my research on the subject and observations with you.</p>
<p><span id="more-24"></span>Two weeks ago my colleague Chethan Kumar and I presented on SQL Server in Cannes, France for VMworld Europe 2009.  This presentation was the culmination of six months of investigation that was started at VMworld 2008 in Las Vegas.  At that event I heard so many customer concerns about SQL Server performance that I resolved to identify the problems&#8217; root causes.  I talked with every customer I could find that claimed that SQL ran at anything less than 70% of native.  So many of these contacts claimed that they had measured SQL at 25% of native or worse that I knew that something was going wrong.</p>
<p>First, let me show you a slide that Chethan presented at the show in Cannes:</p>
<p><img class="jive-image-thumbnail jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-2720-5630/sql_tuning.png" alt="sql_tuning.png" width="620" /></p>
<p>Chethan spent three months investigating SQL Server to find out how much he could improve virtual performance from the &#8220;out of the box&#8221; experience.  As this figure details, the sum total of performance improvements was 15%.  Here&#8217;s another break-down of these results:</p>
<p><img class="jive-image-thumbnail jive-image" src="http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-2720-5632/sql_tuning_summary.png" alt="sql_tuning_summary.png" width="620" /></p>
<p>The only option that we found in ESX to improve virtual performance was static transmit coalescing, which is documented on <a class="jive-link-external" href="http://www.vmware.com/files/pdf/specweb_perf_final.pdf">page four of one of our SPECweb papers</a>.  Large pages and SQL&#8217;s priority boost, which are best practices provided by Microsoft for SQL Server configuration, provide the largest gains in performance.</p>
<p>The key messages that we communicated to our audience were these: a properly running SQL Server should run at 80% of native or better.  In most production cases its performance is indistinguishable from native.  And if performance is lagging, there are few changes that can be made to ESX that will yield any performance gains at all.</p>
<p>This raises the question: &#8220;If ESX can&#8217;t be tuned to double SQL performance, what is causing these reports of terrible SQL Server throughput?&#8221;  The great majority of the problems are coming from misconfigured storage.  But a variety of other items such as poor hardware selection or use of the wrong virtualization software contribute to the confusion, as well.  I&#8217;ve been documenting these issues in <a class="jive-link-wiki" href="http://communities.vmware.com/docs/DOC-8964">Best Practices for SQL Server</a> on this community and will continue to update that document as more problems are discovered.</p>
<p>If you have a SQL Server running un-virtualized in your environment, I&#8217;d like you to try virtualizing it again.  Follow our best practices document and pay close attention to your storage configuration during deployment.  I feel confident that once you&#8217;ve set up your environment properly, you&#8217;re going to like what you see.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/09/16/sql-server-performance-problems-not-due-to-vmware/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

