<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>vPivot &#187; ssd</title>
	<atom:link href="http://vpivot.com/tag/ssd/feed/" rel="self" type="application/rss+xml" />
	<link>http://vpivot.com</link>
	<description>Scott Drummonds on Virtualization</description>
	<lastBuildDate>Wed, 01 Feb 2012 06:46:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Flash Or SSD? (or: Why Interfaces Matter)</title>
		<link>http://vpivot.com/2011/11/22/flash-or-ssd-or-why-interfaces-matter/</link>
		<comments>http://vpivot.com/2011/11/22/flash-or-ssd-or-why-interfaces-matter/#comments</comments>
		<pubDate>Tue, 22 Nov 2011 05:41:03 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=1064</guid>
		<description><![CDATA[In my three part series on flash I interchangeably used the terms &#8220;flash&#8221; and &#8220;SSD&#8221;.  In a recent article on this subject, Stephen Foskett on IBM&#8217;s Storage Community successfully convinced me that I should stop using these terms interchangeably.  He then suggested that flash would persevere while SSD would not.  I disagree. First, let me [...]]]></description>
			<content:encoded><![CDATA[<p>In my <a href="http://vpivot.com/2011/10/04/the-flash-storage-revolution-part-i">three</a> <a href="http://vpivot.com/2011/10/13/the-flash-storage-revolution-part-ii">part</a> <a href="http://vpivot.com/2011/11/17/the-flash-storage-revolution-part-iii">series</a> on flash I interchangeably used the terms &#8220;flash&#8221; and &#8220;SSD&#8221;.  In <a href="http://storagecommunity.org/blogs/stephenfoskett/archive/2011/11/22/ssd-is-not-the-best-way-to-use-flash-memory-in-storage.aspx">a recent article on this subject</a>, Stephen Foskett on <a href="http://storagecommunity.org/">IBM&#8217;s Storage Community</a> successfully convinced me that I should stop using these terms interchangeably.  He then suggested that flash would persevere while SSD would not.  I disagree.</p>
<p><span id="more-1064"></span>First, let me quote what I know Stephen got right:</p>
<blockquote><p>Flash memory is the dominant underlying chip technology for solid-state storage. But solid-state disk drives are just one packaging option for flash.</p></blockquote>
<p>Stephen then explains that flash can be added to the enterprise in a variety of ways.  Today&#8217;s most common alternative to SSD is the PCI expansion card.  Stephen next extols the benefits of PCI-based flash and the drawbacks of SSD-based flash.  These include:</p>
<ul>
<li>Simplicity of design in PCI flash.  No SCSI or ATA controllers needed for PCI flash.</li>
<li>Improved performance of PCI flash, for lack of bottleneck-inducing SCSI or ATA controllers.</li>
</ul>
<p>Stephen then concludes:</p>
<blockquote><p>In a decade, SSD will seem a quaint throwback while flash memory will roar ahead.</p></blockquote>
<p>I doubt this conclusion.</p>
<p>First, the argument that flash in PCI is faster than flash in SSD because SSD controllers will always be a bottleneck is nonsense.  Those controllers are built from the same silicon as microprocessors.  They can be implemented as fast as the hardware that drives the PCI-e bus.  PCI cards are faster because the PCI-e bus can support up to 16 GB/s of throughput, while no storage array today can drive a single connection beyond 10 Gb/s (roughly 1.25 GB/s).  There is no need to create an SSD that supports 16 GB/s of throughput because no flash can serve it and no array can deliver it.  This is an example of designing to current needs, and this limitation will change.</p>
<div id="attachment_1072" class="wp-caption alignleft" style="width: 310px"><a href="http://vpivot.com/wp-content/uploads/2011/11/power-outlet-us.jpeg"><img class="size-full wp-image-1072" title="A Standard Interface Common In the US" src="http://vpivot.com/wp-content/uploads/2011/11/power-outlet-us.jpeg" alt="" width="300" height="300" /></a><p class="wp-caption-text">A Standard Interface Common In the US.</p></div>
<p>But more importantly, one of the things we have learned in decades of computer science is that <em>interfaces matter</em>. Interfaces endure. Examples abound of good interfaces outliving their initial implementations. Instead of deciding to throw away a design and start from scratch, we improve the implementation and keep the interface, even if it is sub-optimal. One such example is the x86 architecture. It seems that the entire world has nearly agreed that this interface is how we want enterprise operating systems to communicate to processors.</p>
<p>(The funny thing about x86 is that years ago Intel abandoned the basic principle of their early architecture: complex instruction set computing (CISC). They designed their processors so programming the CPU would be easy but implementing it would be tough. Decades later, Intel introduced decoders on their processors that effectively translated CISC instructions to RISC microcode. They simultaneously offered a &#8220;better&#8221; pure RISC/VLIW architecture in the Itanium line. But the industry responded loudly: stay with the x86 interface that an ecosystem has come to depend on.)</p>
<p>I believe hard drives should be thought of as an interface.  Not just the protocol and connection by which data is read and written, but also the form factor that humans handle and that hardware vendors build around.</p>
<p>Why is it that the industry likes the hard drive &#8220;interface&#8221;?  Consider the following:</p>
<ol>
<li>Hard drives have reasonable high-density form factors.  They are uniform size, fully enclosed, and rugged enough that an exposed component will not snag on a sweater and break.</li>
<li>Hard drive interfaces (SCSI, SATA, SAS, etc.) are created to be used frequently.  They are designed to be plugged and unplugged.  Pull and replace a SATA plug a thousand times and the connector will survive.  Try pulling and replacing a PCI-e card 1000 times.</li>
<li>We have existing means of aggregating thousands of hard drives into an array.  Because of the capacity drain of more PCI devices it is tougher to scale PCI cards to the same limits.</li>
</ol>
<div>While my crystal ball is not any clearer than Stephen&#8217;s, I think that anyone who discounts the endurance of a popular interface is not seeing the full picture.  As long as people are touching the interface, consumers are using devices that implement it, competitors are designing products to it, and it is successfully evolving with demands, the interface will be tough to replace.  The hard drive interface that SSDs implement meets these criteria.</div>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/11/22/flash-or-ssd-or-why-interfaces-matter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Flash Storage Revolution: Part III</title>
		<link>http://vpivot.com/2011/11/17/the-flash-storage-revolution-part-iii/</link>
		<comments>http://vpivot.com/2011/11/17/the-flash-storage-revolution-part-iii/#comments</comments>
		<pubDate>Thu, 17 Nov 2011 06:07:44 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[emc]]></category>
		<category><![CDATA[fast]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=1048</guid>
		<description><![CDATA[In this final installment of the series, I will provide some detail behind flash storage sizing.  My previous entry contained an analytical and theoretical approach to sizing flash in today&#8217;s storage.  When I first studied the ideas I introduced in that post, I thought the flash sizing exercise was hopeless.  After all, how are customers [...]]]></description>
			<content:encoded><![CDATA[<p>In this final installment of the series, I will provide some detail behind flash storage sizing.  <a href="http://vpivot.com/2011/10/13/the-flash-storage-revolution-part-ii/">My previous entry</a> contained an analytical and theoretical approach to sizing flash in today&#8217;s storage.  When I first studied the ideas I introduced in that post, I thought the flash sizing exercise was hopeless.  After all, how are customers to measure data cooling?  How could a storage admin quantify skew?</p>
<p>As it turns out, familiarity with these abstract concepts is not needed to size flash in your environment.  The same principles that Intel and AMD apply in sizing microprocessor cache can be applied to storage.  There are generalizations that will suit the majority of deployments.</p>
<p><span id="more-1048"></span>First, a little background.  In building EMC&#8217;s Fully Automated Storage Tiering Virtual Pools (FAST VP), EMC studied the access patterns of over 3,500 arrays.  We measured skew and performance, capacity and footprint.  We experimented with storage layout and FAST VP block sizes.  We tried two-tier and three-tier configurations and sized each to find a best fit for the average case.  The results are summarized in the following figure.</p>
<div id="attachment_1049" class="wp-caption aligncenter" style="width: 510px"><a href="http://vpivot.com/wp-content/uploads/2011/11/fast-skew-tier-size.png"><img class="size-full wp-image-1049" title="Tier Recommendations" src="http://vpivot.com/wp-content/uploads/2011/11/fast-skew-tier-size.png" alt="" width="500" /></a><p class="wp-caption-text">The EMC study that preceded the launch of FAST VP identified three basic tier configurations to improve performance and footprint in 94% of environments.</p></div>
<p>Of the 3,500 arrays we analyzed, 12% of the workloads met criteria that we describe as &#8220;heavy skew&#8221;.  This means 95% of the IO occurred on 5% of the data.  In these configurations nearly all the hot blocks can be stored in flash when it is sized to 3% of the storage footprint.  In &#8220;moderate skew&#8221; environments, the addition of 15% Fibre Channel maintained performance with a footprint only slightly larger than optimal.  &#8220;Low skew&#8221; environments still showed improvement over flash-less configurations in both performance and footprint, at the same cost.</p>
<p>It was this analysis that led us to recommend the low skew configuration for unknown environments.  This has the following benefits:</p>
<ul>
<li>The cost of storage is the same as the flash-less configuration.</li>
<li>The footprint is half the size of the flash-less configuration.</li>
<li>Storage will be at least 20% faster for 94% of workloads.  Because this measurement was taken at low skew, and higher skew environments exercise flash more heavily, performance will exceed non-flash deployments by more than 40% under some workloads.</li>
</ul>
<p>EMC&#8217;s Tier Advisor can help you produce a more precise guide to size your storage tiers.  But it is not strictly necessary.  Deploying a three-tier architecture will improve your existing array by reducing footprint and improving performance.  And if your environment has anything above low skew, adding rotating disks will add capacity <em>and</em> improve efficiency.  This works because you will be approaching the more precise tier mapping for your environment&#8217;s workloads.</p>
<p>This ends my three-part series on flash in the enterprise.  I will conclude the series where it began.  I fell in love with flash when I installed an SSD disk in my MacBook Pro.  The impact to my own user experience was so dramatic as to revolutionize my own thinking about the nature of storage.  If your mind has not yet been similarly transformed, go get SSD for your consumer computers right away.  And know that everything we can experience for our own equipment we can deliver in the enterprise, too.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/11/17/the-flash-storage-revolution-part-iii/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>The Flash Storage Revolution: Part II</title>
		<link>http://vpivot.com/2011/10/13/the-flash-storage-revolution-part-ii/</link>
		<comments>http://vpivot.com/2011/10/13/the-flash-storage-revolution-part-ii/#comments</comments>
		<pubDate>Thu, 13 Oct 2011 02:10:12 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[emc]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=1022</guid>
		<description><![CDATA[In the previous entry on this ongoing series covering the flash storage revolution, I concluded that flash is now an essential part of enterprise storage. But its value proposition is hinged on high utilization. High utilization cannot be sustained without efficient auto-tiering or accurate cache sizing for flash-based cache. This article will describe the theory [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://vpivot.com/2011/10/04/the-flash-storage-revolution-part-i/">the previous entry on this ongoing series covering the flash storage revolution</a>, I concluded that flash is now an essential part of enterprise storage. But its value proposition is hinged on high utilization. High utilization cannot be sustained without efficient auto-tiering or accurate cache sizing for flash-based cache.</p>
<p>This article will describe the theory behind optimal cache sizing.  Practical guidance will follow in part three, the last entry in this series. I will again lean heavily on Denis Vilfort&#8217;s presentation that <a onclick="javascript: _gaq.push(['_trackPageview', '/downloads/map']);" href="http://e-scott.net/share/emc/enterprise_flash_overview_sizing.pdf">I offer for download on my blog</a>.</p>
<p><span id="more-1022"></span>Every performance discussion starts with an &#8220;it depends&#8221;.  I will kick off this discussion on the characteristic on which flash sizing depends: skew (a topic I <a href="http://vpivot.com/2010/12/14/justifying-ssds/">discussed once before</a>). Skew is the degree to which different environments will touch different amounts of data. High skew environments will access 1% of their data 80% of the time. Low skew environments will access up to 10% of their data 80% of the time. This is depicted in the following picture.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2011/10/skew.png"><img class="aligncenter size-full wp-image-1024" title="Skew" src="http://vpivot.com/wp-content/uploads/2011/10/skew.png" alt="" width="600" /></a></p>
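<p>Skew can be made concrete with a small Python sketch.  It assumes you have a per-block access count for some measurement period; the function name and the sample counts are hypothetical and are not taken from any EMC tool.  It reports the smallest fraction of blocks that absorbs a given share of the IO.</p>

```python
def data_fraction_for_io_share(access_counts, io_share_percent=80):
    """Smallest fraction of blocks that receives io_share_percent of all IO."""
    total_io = sum(access_counts)
    if total_io == 0:
        raise ValueError("no IO recorded")
    running = 0
    # Visit the hottest blocks first and stop once the target share is covered.
    for blocks_used, count in enumerate(sorted(access_counts, reverse=True), start=1):
        running += count
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= io_share_percent * total_io:
            return blocks_used / len(access_counts)
    return 1.0


# Hypothetical traces of 100 blocks each (illustrative numbers only).
high_skew = [400] + [1] * 99      # one very hot block
low_skew = [36] * 10 + [1] * 90   # ten warm blocks

print(data_fraction_for_io_share(high_skew))
print(data_fraction_for_io_share(low_skew))
```

<p>With these made-up traces the first call reports 0.01 (1% of the data takes 80% of the IO) and the second reports 0.1, which lines up with the high skew and low skew definitions above.</p>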
<p>Cache, flash or otherwise, can be thought of as a FIFO for data. As your users interact with their applications, they generate new data and touch old data. This places new blocks at the head of the flash FIFO, keeping them in SSD cache for a longer period of time. These blocks remain in that cache until newer data pushes them out of this logical FIFO.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2011/10/flash-fifo.png"><img class="aligncenter size-full wp-image-1026" title="Flash Cache as a FIFO" src="http://vpivot.com/wp-content/uploads/2011/10/flash-fifo.png" alt="" width="300" /></a></p>
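<p>The behaviour described here, where a touched block returns to the head of the queue, is what is usually called least-recently-used (LRU) ordering rather than a strict FIFO.  Here is a minimal sketch of that idea.  The class name and the block-granular capacity are my own assumptions for illustration, not a description of how any array implements its cache.</p>

```python
from collections import OrderedDict


class FlashCacheSketch:
    """Toy model of a flash cache that keeps recently touched blocks."""

    def __init__(self, capacity_blocks):
        self.capacity_blocks = capacity_blocks
        # The last item is the head (most recently touched block).
        self.blocks = OrderedDict()

    def touch(self, block_id):
        """Record a read or write; return the evicted block id, if any."""
        if block_id in self.blocks:
            # Touching old data moves it back to the head.
            self.blocks.move_to_end(block_id)
            return None
        self.blocks[block_id] = True
        if len(self.blocks) > self.capacity_blocks:
            # The coolest block falls off the tail to lower cost storage.
            evicted, _ = self.blocks.popitem(last=False)
            return evicted
        return None


cache = FlashCacheSketch(capacity_blocks=2)
for block in ["a", "b", "a", "c"]:
    evicted = cache.touch(block)
    print(block, "evicts", evicted)
```

<p>In this run block "b" is the one pushed out when "c" arrives, because "a" was touched more recently.</p>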
<p>Skew is a somewhat abstruse concept. But it is more easily understood when you think of it as a data cooling rate. The rate at which applications stop touching data can be described as cooling. High skew environments have high cooling rates, because the applications spend most of their time on a small amount of data.  This means a great deal of data becomes lightly used, or cool. Low skew environments have low cooling rates, because they frequently touch more data, keeping it warm. The following figure shows cooling in action.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2011/10/cooling.png"><img class="aligncenter size-full wp-image-1028" title="Cooling Rate Translates Into Days" src="http://vpivot.com/wp-content/uploads/2011/10/cooling.png" alt="" width="600" /></a></p>
<p>At some rate&#8211;which we are leaving purely theoretical at this point&#8211;applications slough off a certain amount of data into the rarely used, or &#8220;cool&#8221;, category. This information should fall out of the flash FIFO and be relegated to lower cost storage. The above figure shows that data centers with a low cooling rate of 1.4% will take 120 days for 80% of their data to become cool. This means slowly cooling environments need larger flash-backed cache.</p>
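<p>To get a feel for the numbers, here is a rough sketch that assumes the cooling rate is a daily rate that compounds, so that a fixed fraction of the still-warm data cools each day.  That model is my assumption; the presentation does not spell out the formula behind the figure.</p>

```python
import math


def days_until_cool(daily_cooling_rate, cool_fraction):
    """Days until cool_fraction of the data is cool, assuming compound daily cooling."""
    still_warm = 1.0 - cool_fraction
    # Solve (1 - rate) ** days == still_warm for days.
    return math.log(still_warm) / math.log(1.0 - daily_cooling_rate)


print(round(days_until_cool(0.014, 0.80)))  # 1.4% per day, 80% of data cool
print(round(days_until_cool(0.028, 0.80)))  # a faster-cooling environment
```

<p>Under this assumed model a 1.4% daily rate reaches 80% cool data after roughly 114 days, which is in the same range as the 120 days shown in the figure.  Doubling the cooling rate roughly halves the time, which is why slowly cooling environments need more flash.</p>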
<p>Cooling rate dictates the amount of flash needed in enterprise storage. Because flash for cache purchased today will be used in the array until the next storage purchase, environments with rapid growth of data require a higher percentage of flash on day one. But because cooling rate is non-linear, the amount of flash needed does not scale linearly with the data growth. In non-engineering jargon, this means that an environment with 100% year-over-year growth of data does not need twice the flash of an environment with 50% year-over-year growth.</p>
<p>This is better shown in the following figure.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2011/10/flash-portion.png"><img class="aligncenter size-full wp-image-1030" title="Flash Portion of Storage" src="http://vpivot.com/wp-content/uploads/2011/10/flash-portion.png" alt="" width="600" /></a></p>
<p>You can now see that, assuming you have efficient flash usage via some technology like cache, you can predict your flash needs with only three variables:</p>
<ol>
<li>The amount of data in your environment.</li>
<li>Your yearly growth rate of data.</li>
<li>Your cooling rate.</li>
</ol>
<p>Two of these numbers are easy to find but cooling rate is not. In fact, it is so difficult that I am not sure it is even obtainable in most environments. But without it how are we to recommend flash purchases? The answer is simpler than you think and the subject of the final article in this series.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/10/13/the-flash-storage-revolution-part-ii/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The Flash Storage Revolution: Part I</title>
		<link>http://vpivot.com/2011/10/04/the-flash-storage-revolution-part-i/</link>
		<comments>http://vpivot.com/2011/10/04/the-flash-storage-revolution-part-i/#comments</comments>
		<pubDate>Tue, 04 Oct 2011 05:32:30 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[emc]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=1010</guid>
		<description><![CDATA[Six weeks ago I finally upgraded my MacBook to solid state storage.  The change in performance is dramatic, to say the least.  I have been selling flash storage to EMC&#8217;s customers for over a year now and they have been loving it.  But I did not really get how valuable flash is until I [...]]]></description>
			<content:encoded><![CDATA[<p>Six weeks ago I finally upgraded my MacBook to solid state storage.  The change in performance is dramatic, to say the least.  I have been selling flash storage to EMC&#8217;s customers for over a year now and they have been loving it.  But I did not really get how valuable flash is until I saw it on my own laptop.</p>
<p>After this revolution of my own mind, I want to dedicate a few blog entries to the issue of solid state storage in the enterprise.  First I want to frame the problem that flash both solves and causes.  In the second entry I will introduce some of the theory behind flash sizing.  My last article will give you some very simple practical advice on how to use flash in your enterprise.</p>
<p><span id="more-1010"></span>These entries lean heavily on a presentation EMC&#8217;s Denis Vilfort presented within EMC a few weeks ago.  I am providing <a href="http://e-scott.net/share/emc/enterprise_flash_overview_sizing.pdf" onClick="javascript: _gaq.push(['_trackPageview', '/downloads/map']);"> a PDF version of that presentation on this blog</a> for your own usage.</p>
<p>The fundamental problem that flash is solving is summarized in the following figure.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2011/10/Screen-Shot-2011-10-04-at-12.51.48-PM.png"><img class="aligncenter size-full wp-image-1011" title="Performance: Disk, Memory, CPU" src="http://vpivot.com/wp-content/uploads/2011/10/Screen-Shot-2011-10-04-at-12.51.48-PM.png" alt="" width="600" /></a></p>
<p>The speed of rotating disks is dictated by the time it takes the hard drive head to reach data on the platter. These mechanisms&#8211;a spinning disk and a moving arm&#8211;have been basically unchanged for decades. Array performance has evolved through the introduction of DRAM cache, intelligent prefetch, clever file systems, and other techniques. But the metaphorical hands of all enterprise storage vendors were bound by the physics of spinning platters and actuating arms. While latencies in hard drive technologies have stuck in the milliseconds, solid state devices like CPU, memory, and flash have improved to microseconds and nanoseconds.</p>
<p>This huge chasm in performance between disk and memory increasingly devalued high performance servers. When storage becomes the bottleneck for a business application, the performance of a server becomes less important. VMware introduced the importance of capacity in server sizing by showing the world consolidation. But performance of data intensive applications was still dictated by storage.</p>
<p>The initial challenge of solid state disks (SSD) was their perceived high cost when compared to hard disks. I remember a couple years ago hearing EMC&#8217;s confused messaging around its enterprise flash disks (EFDs): &#8220;much more expensive but much, much faster&#8221;. Customers were left with a confusing cost-benefit calculation to make a purchasing decision.</p>
<p>A year and a half ago EMC started using a clearer message for flash: in some cases, it is actually cheaper than rotating disks. The key is to identify these use cases. This is best shown in the following figure.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2011/10/Screen-Shot-2011-10-04-at-1.03.40-PM.png"><img class="aligncenter size-full wp-image-1012" title="Flash Is Cheaper Per IO, Hard Drives Cheaper Per GB" src="http://vpivot.com/wp-content/uploads/2011/10/Screen-Shot-2011-10-04-at-1.03.40-PM.png" alt="" width="600" /></a></p>
<p>Because of the exceptional improvement in latency, flash drives provide dramatically higher throughput on random access patterns than hard drives. When normalized by throughput&#8211;IOPS in the case of the above figure&#8211;flash storage is much cheaper than any hard drive.</p>
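<p>The two ways of normalizing cost can be sketched in a few lines.  Every price, capacity, and IOPS figure below is a hypothetical placeholder chosen for easy arithmetic, not a vendor number; substitute your own quotes and measurements.</p>

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    price_usd: float
    capacity_gb: float
    random_iops: float

    def cost_per_gb(self):
        return self.price_usd / self.capacity_gb

    def cost_per_iops(self):
        return self.price_usd / self.random_iops


# Hypothetical, illustrative figures only.
hard_drive = Drive("rotating disk", price_usd=200, capacity_gb=2000, random_iops=100)
flash_drive = Drive("flash drive", price_usd=2000, capacity_gb=200, random_iops=5000)

for drive in (hard_drive, flash_drive):
    print(f"{drive.name}: {drive.cost_per_gb():.2f} USD/GB, "
          f"{drive.cost_per_iops():.2f} USD/IOPS")
```

<p>With these placeholder inputs the rotating disk wins on cost per GB and the flash drive wins on cost per IOPS, which is the shape of the trade-off the figure describes.</p>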
<p>But there is another angle to storage efficiency. Even &#8220;cheap&#8221; hard drive storage was never as cheap as people thought. When arrays aggregate striped disks to sum their maximum throughputs, utilization is low. Furthermore, in some performance environments hard drives were short stroked, which meant only a small, fast section of the platter was used so performance could be maximized. With low utilization and short stroking, hard drive efficiency is very low. This means the nominal cost of rotating disks is much worse than people realized.</p>
<p><a href="http://vpivot.com/wp-content/uploads/2011/10/Screen-Shot-2011-10-04-at-1.14.20-PM.png"><img class="aligncenter size-full wp-image-1013" title="Nominal Cost of Storage Depends on Utilization" src="http://vpivot.com/wp-content/uploads/2011/10/Screen-Shot-2011-10-04-at-1.14.20-PM.png" alt="" width="600" /></a></p>
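<p>The effect of utilization on cost is simple arithmetic.  This sketch assumes that the cost you actually pay per stored gigabyte is the purchase cost per gigabyte divided by the fraction of capacity you use; the function name and the sample values are hypothetical.</p>

```python
def cost_per_used_gb(purchase_cost_per_gb, capacity_utilization):
    """Cost per gigabyte actually holding data, given utilization in (0, 1]."""
    if not (capacity_utilization > 0 and 1 >= capacity_utilization):
        raise ValueError("capacity_utilization must be in (0, 1]")
    return purchase_cost_per_gb / capacity_utilization


# Hypothetical example: a short-stroked drive using a quarter of its platters.
print(cost_per_used_gb(0.10, 1.00))
print(cost_per_used_gb(0.10, 0.25))
```

<p>In this made-up case a drive bought at 0.10 USD/GB costs four times as much per used gigabyte once only a quarter of its capacity holds data.</p>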
<p>The low cost of your hard drives depends on their high capacity utilization. But how can you drive high capacity utilization in an environment where you have a storage performance bottleneck? Obviously flash is the answer.</p>
<p>SSD storage can help you solve your utilization and efficiency problems with rotating disks. But this assumes that you and your storage can determine how much flash you need and how to use it. And that will be the subject of my next entry in this three part series.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/10/04/the-flash-storage-revolution-part-i/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>MLC Flash Versus SLC Flash</title>
		<link>http://vpivot.com/2011/05/10/mlc-flash-versus-slc-flash/</link>
		<comments>http://vpivot.com/2011/05/10/mlc-flash-versus-slc-flash/#comments</comments>
		<pubDate>Tue, 10 May 2011 07:04:21 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[emc]]></category>
		<category><![CDATA[emc world]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=860</guid>
		<description><![CDATA[EMC&#8217;s recent announcement at EMC World of Project Lightning documents a program to increase the use of flash devices in enterprise storage. The project includes increased use of flash storage in EMC arrays, all-flash storage configurations, and support for Multi-level Cell (MLC) flash. This last subject&#8211;MLC flash and its difference from SLC flash&#8211;piqued my curiosity. [...]]]></description>
			<content:encoded><![CDATA[<p>EMC&#8217;s recent <a href="http://www.emc.com/about/news/press/2011/20110509-05.htm">announcement at EMC World of Project Lightning</a> documents a program to increase the use of flash devices in enterprise storage.  The project includes increased use of flash storage in EMC arrays, all-flash storage configurations, and support for Multi-level Cell (MLC) flash.  This last subject&#8211;MLC flash and its difference from SLC flash&#8211;piqued my curiosity.</p>
<p>Many years ago I studied electrical engineering.  I was awful at it.  Analog was never my thing.  I much prefer ones and zeroes.  But I challenge myself to think about electronics once every blue moon.  So I decided to delve into SLC and MLC flash technologies to understand how they differ and why we should care.  The content below summarizes my online research and the little bit I remember from school.  If you can add, correct, or update this article I would be happy to have your comments.</p>
<p><span id="more-860"></span></p>
<h2>What is the Difference Between MLC and SLC Flash?</h2>
<p>MLC flash uses many discrete voltage levels to store multiple values, or bits, per cell [1].  Single-level Cell (SLC) technology uses just two voltage levels to program a single bit of information to the cell.  MLC technology produces greater density, which means it stores more data at lower cost.  But the higher density comes with a cost: MLC produces storage that is more sensitive to temperature changes, slightly slower, and more likely to fail than SLC flash.</p>
<p>SLC flash has ten times the endurance for write/erase operations [2].  At an average of 10,000 write/erase cycles an MLC flash cell will die.  SLC flash cells can sustain an average of 100,000 write/erase cycles.  But why do MLC flash cells fail more than SLC?</p>
<h2>Why Do MLC Cells Fail More Than SLC Cells?</h2>
<p>I am unable to find an answer to this anywhere on the web.  If you see one, I would love to read it.  But in the dark, dusty corners of my memory I remember enough about electronics to hazard a guess at this.  As I see it, there are two reasons why MLC flash should fail more than SLC: one reason is statistical and the other is electronic.</p>
<p>The statistical argument is that MLC cells are written to 50% more often than SLC cells.  They will simply wear out sooner.  An SLC cell stores a value of zero or one.  When an application writes to the data held by that cell there is a 50% chance that the cell&#8217;s value has changed and it requires reprogramming.  Because MLC flash stores two bits, there is a 75% chance that the new two-bit data differs from the existing value.  This means an MLC cell is written to 0.75 times for each 0.5 times an SLC cell is written.  That&#8217;s a 50% increase.</p>
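<p>The arithmetic behind this guess can be checked with a short sketch.  It assumes, as the argument does, that each new value written to a cell is uniformly random and independent of the value already stored; the function name is hypothetical.</p>

```python
from fractions import Fraction


def rewrite_probability(bits_per_cell):
    """Chance that a random new value differs from the stored value."""
    states = 2 ** bits_per_cell
    # Only one of the possible states matches what the cell already holds.
    return 1 - Fraction(1, states)


slc = rewrite_probability(1)
mlc = rewrite_probability(2)
print(slc, mlc, mlc / slc)
```

<p>This prints 1/2, 3/4, and 3/2: a two-bit cell is reprogrammed one and a half times as often as a one-bit cell under these assumptions.</p>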
<p>The electronic argument is based on MLC flash programming requiring a wider range of voltages [3].  Higher voltages produce greater amperage.  This exacerbates electromigration.  And the higher voltage on the transistor&#8217;s gate will increase erosion of the oxide that separates the gate and the channel.  Both of these will result in circuit failure.</p>
<h2>How Is MLC Being Made More Reliable?</h2>
<p>Because MLC is so much more cost effective than SLC, industry innovation is improving MLC reliability.  Here are a few techniques I found online [4]:</p>
<ul>
<li>Hardware can level writes, which distributes writes throughout the device to balance cell use and avoid hotspots.  This means an entire flash drive will tend to fail at once after a long time, as opposed to a small number of overworked hotspots failing quickly.</li>
<li>Hardware can include DRAM cache which can be used to coalesce writes, which decreases cell write count.</li>
<li>Flash devices can be over-provisioned for error detection, correction, and dynamic bad cell replacement.</li>
<li>There are also a variety of proprietary techniques from flash manufacturers.</li>
</ul>
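<p>Write leveling, the first technique in the list above, can be illustrated with a toy model.  This is only a sketch under my own assumptions: the class name is hypothetical, every write is treated as one erase of one physical block, and real flash controllers use far more elaborate, proprietary schemes.</p>

```python
class WearLevelingSketch:
    """Toy write leveler: each write goes to the least-worn free block."""

    def __init__(self, physical_blocks):
        self.erase_counts = [0] * physical_blocks
        self.mapping = {}  # logical block -> physical block

    def write(self, logical_block):
        # Blocks holding other logical data are off limits; the block that
        # currently holds this logical data may be reused or replaced.
        current = self.mapping.get(logical_block)
        in_use = set(self.mapping.values()) - {current}
        candidates = [p for p in range(len(self.erase_counts)) if p not in in_use]
        # Raises ValueError if no physical block is free.
        target = min(candidates, key=lambda p: self.erase_counts[p])
        self.erase_counts[target] += 1
        self.mapping[logical_block] = target
        return target


device = WearLevelingSketch(physical_blocks=4)
for _ in range(8):
    device.write(logical_block=0)  # a single hot logical block
print(device.erase_counts)
```

<p>Eight writes to one hot logical block end up as two erases on each of the four physical blocks, instead of eight erases on a single cell.</p>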
<p>One challenge with flash today is the lack of consistent and objective endurance measurements.  It is difficult for storage vendors to publish availability guarantees when the reliability of the underlying media is uncertain.  This means to support flash devices in its VMAX arrays&#8211;which are rated at six nines (99.9999%) availability&#8211;EMC has to do a tremendous amount of qualification of the devices.  Because of this qualification process, flash support in enterprise storage should consistently lag its support in consumer devices, where availability requirements are much lower.</p>
<h2>Summary</h2>
<p>No one denies that SSD storage is becoming more common in the enterprise.  EMC&#8217;s support of MLC devices is only one of the items introduced by Project Lightning that will increase flash presence, producing better performing and more efficient storage.  If you are interested in learning more on the subject, follow the links below to the sources for this article.  Also consider Googling &#8220;tlc flash&#8221; to see the higher density, less reliable Triple-level Cell (TLC) that will certainly find its way to the enterprise after more innovation.</p>
<h2>References</h2>
<p>My information came from documents I found as a result of Google searches.  Here are my recommendations for further reading.</p>
<ol>
<li><a href="http://www.smxrtos.com/articles/mlcslc.htm">http://www.smxrtos.com/articles/mlcslc.htm</a></li>
<li><a href="http://www2.electronicproducts.com/Choosing_flash_memory-article-toshiba-apr2004-html.aspx">http://www2.electronicproducts.com/Choosing_flash_memory-article-toshiba-apr2004-html.aspx</a></li>
<li><a href="http://www.supertalent.com/datasheets/SLC_vs_MLC%20whitepaper.pdf">http://www.supertalent.com/datasheets/SLC_vs_MLC%20whitepaper.pdf</a></li>
<li><a href="http://www.infostor.com/index/articles/display/1169849064/articles/infostor/disk-arrays/disk-drives/2010/july-2010/mlc-vs__slc_flash.html">http://www.infostor.com/index/articles/display/1169849064/articles/infostor/disk-arrays/disk-drives/2010/july-2010/mlc-vs__slc_flash.html</a></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2011/05/10/mlc-flash-versus-slc-flash/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Justifying SSDs</title>
		<link>http://vpivot.com/2010/12/14/justifying-ssds/</link>
		<comments>http://vpivot.com/2010/12/14/justifying-ssds/#comments</comments>
		<pubDate>Tue, 14 Dec 2010 02:51:51 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[fast]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=711</guid>
		<description><![CDATA[Ever since I saw the results of VMware&#8217;s first performance work on EMC&#8217;s Enterprise Flash Drives (EFDs) I knew the storage world was about to change.  Even though I love the idea of SSD, I still struggle with the justification of their purchase.  I have had trouble quantifying the value of an EFD and fearlessly [...]]]></description>
			<content:encoded><![CDATA[<p>Ever since I saw the results of VMware&#8217;s first performance work on EMC&#8217;s Enterprise Flash Drives (EFDs) I knew the storage world was about to change.  Even though I love the idea of SSDs, I still struggle to justify their purchase.  I have had trouble quantifying the value of an EFD and fearlessly committing customers&#8217; money to their purchase.  In this article I want to offer a few thoughts on these devices as I formulate my own ideas as to when SSDs are needed and how we can all enjoy their benefits.</p>
<p><span id="more-711"></span>Let us start this discussion with a scientific perspective.  The following graph shows trend lines of performance by disk size.  I have greatly reduced the complexity of this chart by choosing some convenient numbers for throughput (such as 100 IOPS per SATA device) and size (such as 200 GB for an EFD).</p>
<div id="attachment_713" class="wp-caption aligncenter" style="width: 610px"><a href="http://vpivot.com/wp-content/uploads/2010/12/efd_versus_rotating_disks.png"><img class="size-full wp-image-713" title="Capacity Versus Throughput" src="http://vpivot.com/wp-content/uploads/2010/12/efd_versus_rotating_disks.png" alt="A scatter graph showing storage capacity as compared to throughput for SATA, FC, and EFD disks." width="600" /></a><p class="wp-caption-text">The capacity/performance trend of today&#39;s disk technologies.</p></div>
<p>This graph shows the performance and capacity provided by up to 200 disks of three different technologies: SATA, Fibre Channel, and solid state disks.  The first salient point that this figure screams at me is that the spread between the trend lines is extreme.  The capacity/performance trends of SATA and Fibre Channel are relatively similar.  EFDs are a long way from those two trend lines.</p>
<p>This large spread between these two technologies suggests two things to me:</p>
<ol>
<li>The world will not support the existence of two rotating disk technologies for much longer.  Assuming capacity and performance are the only criteria in disk selection, the differentiation between these two is not significant enough to merit the presence of both in the market.</li>
<li>Since real workloads&#8217; demands will never fall exactly on a trend line, some customers are in the unfortunate position of forcing their workloads to one of these trend lines.  This will likely result in extreme inefficiency in performance or capacity.</li>
</ol>
<p>Let me first comment on the efficiency problem.  Any workload&#8211;or in the case of virtualization, a consolidation of multiple workloads&#8211;can be mapped to the above graph as a dot whose coordinates are defined by its capacity and performance requirements.  This dot will almost certainly fall somewhere between the SATA and EFD trend lines.</p>
<p>You can determine the number of devices of a single technology (SSD or rotating disks) needed for that workload by force-fitting the dot onto a trend line.  Do this by tracing a line either up or right until it intersects the device trend line.  At that point, a homogeneous collection of that device type supports both the capacity and the performance requirements, with a surplus of one of these.  Since these trend lines are so far from each other, a workload that falls squarely between the two trend lines is guaranteed to have either a very, very large surplus of performance or a very, very large surplus of capacity.</p>
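<p>The force-fit can be sketched in a few lines of Python.  This is only an illustration: the 100 IOPS per SATA device and 200 GB per EFD come from the convenient numbers above, while the 1,000 GB SATA capacity, the 2,500 IOPS per EFD, the sample workload, and the function name <em>disks_needed</em> are assumptions made up for the example.</p>

```python
import math

def disks_needed(capacity_gb, iops, disk_capacity_gb, disk_iops):
    """Force-fit a workload onto one device type's trend line.

    The disk count is set by whichever requirement (capacity or IOPS)
    needs more devices; the other dimension ends up with a surplus.
    """
    for_capacity = math.ceil(capacity_gb / disk_capacity_gb)
    for_iops = math.ceil(iops / disk_iops)
    count = max(for_capacity, for_iops)
    return {
        "disks": count,
        "surplus_capacity_gb": count * disk_capacity_gb - capacity_gb,
        "surplus_iops": count * disk_iops - iops,
    }

# Hypothetical workload: 10,000 GB of data that needs 5,000 IOPS.
# SATA: 100 IOPS per disk (from the article), 1,000 GB per disk (assumed).
sata = disks_needed(10000, 5000, disk_capacity_gb=1000, disk_iops=100)
# EFD: 200 GB per disk (from the article), 2,500 IOPS per disk (assumed).
efd = disks_needed(10000, 5000, disk_capacity_gb=200, disk_iops=2500)

print("SATA:", sata)
print("EFD: ", efd)
```

<p>With these assumed numbers both homogeneous configurations need 50 devices.  The SATA configuration carries 40,000 GB of capacity it does not need, and the EFD configuration carries 120,000 IOPS it does not need.</p>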
<p>It is because of the extreme separation of these trends that storage vendors must offer dynamic performance capabilities such as auto-tiering and large caches that come with dynamic placement algorithms.  In this world of incredible performance/capacity divergence, big caches and auto-tiering are not just nice to have.  They are essential to making efficient storage decisions.</p>
<p>Now back to the idea of the vast separation between rotating and solid state disks.  Because large caches and auto-tiering are so essential, those technologies will make better decisions if they have a greater spread between the two extremes.  The greater the spread, the more cheaply capacity can be provided to low-use workloads and the more cheaply IOPS can be supplied to IO-intensive workloads.  This means that technologies like EMC FAST and FAST Cache are going to drive greater spread between disk technologies over time.</p>
<p>Lastly, since block autoplacement benefits most from the spread of these trend lines, the future of intermediate trends is uncertain.  I therefore reason that Fibre Channel disks will eventually fall out of the market.  But know that before this happens storage arrays will need to be optimally placing blocks.  Block placement is great today, but is a long way from optimal.</p>
<p>EMC is placing blocks in its unified storage in 1 GB chunks and in the Symmetrix line using blocks smaller than 1 MB.  As long as the storage block placement is larger than the smallest IO size, there will be some inefficiency.  But as long as the block placement is smaller than the virtual disk size, customers are benefiting from a configuration that is much more efficient than the worst case force-fit above.</p>
<p>I recently sat in on a customer presentation in Singapore from EMC&#8217;s president of the Symmetrix and Virtualization Product Group, Brian Gallagher.  Brian talked about a lot of things but a slide he showed covering this same topic offered a more precise view on the mixed use of solid state and rotating disks.</p>
<div id="attachment_714" class="wp-caption aligncenter" style="width: 610px"><a href="http://vpivot.com/wp-content/uploads/2010/12/Screen-shot-2010-12-14-at-11.24.26-AM.png"><img class="size-full wp-image-714" title="Choosing Disk Technology By Skew" src="http://vpivot.com/wp-content/uploads/2010/12/Screen-shot-2010-12-14-at-11.24.26-AM.png" alt="Figure shows different disk skew--amount of read/write to percentage of data--and possible configurations to minimize cost and disks." width="600" /></a><p class="wp-caption-text">Disk skew is another means of choosing disk types and counts to minimize cost and meet performance and capacity requirements.</p></div>
<p>We define skew here as the amount of IO that occurs on a subset of data.  Unlike my simplistic analysis above, which assumes some flat capacity/performance requirements of the entire data set, real workloads access data in non-uniform ways.  Locality of reference exists in the great majority of workloads and auto-tiering systems will produce outstanding performance/capacity fits when they can move a small number of hot blocks to the faster disk types.</p>
<p>The figure above contains some very rough calculations that in no way should be used as a substitute for careful calculation in your environment.  But they do suggest four possible configurations where small amounts of flash, paired with very precise auto-tiering such as that provided by the VMAX, can result in storage layouts that are faster, cheaper, and smaller than configurations without these technologies.</p>
<p>There is a point here that bears repeating, since it has not yet penetrated the entire market.  <em>The addition of &#8220;expensive&#8221; solid state disks can produce a storage configuration that is less expensive than the configuration that lacks them.</em></p>
<p>I think I have chosen a fun time for a visit into the world of storage.  I spent so much time diagnosing storage problems at VMware that I developed a lot of strong ideas about how customers should be better using storage and how storage vendors could better support their customers.  The entire storage industry is developing technologies that leverage the huge spread in performance/capacity between existing disks and those technologies are going to drive that spread even further.  And as these optimizations mature, customers are going to derive even greater benefits.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/12/14/justifying-ssds/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Databases, Storage, and Solid State Disks</title>
		<link>http://vpivot.com/2010/09/20/databases-storage-and-solid-state-disks/</link>
		<comments>http://vpivot.com/2010/09/20/databases-storage-and-solid-state-disks/#comments</comments>
		<pubDate>Mon, 20 Sep 2010 02:53:07 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[emc]]></category>
		<category><![CDATA[fast]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[vmworld]]></category>
		<category><![CDATA[vmworld europe]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=667</guid>
		<description><![CDATA[A colleague of mine dropped by my desk on Friday to talk about storage best practices for virtualized databases (SQL Server in this case).  He observed a VMware deployment where the data and log files for a SQL Server virtual machine were consolidated on a single VMFS volume backed by a RAID 5 LUN.  &#8221;Is [...]]]></description>
			<content:encoded><![CDATA[<p>A colleague of mine dropped by my desk on Friday to talk about storage best practices for virtualized databases (SQL Server in this case).  He observed a VMware deployment where the data and log files for a SQL Server virtual machine were consolidated on a single VMFS volume backed by a RAID 5 LUN.  &#8220;Is this a VMware best practice?&#8221; he asked.  &#8220;Should you not put the redo logs on a RAID 10 LUN?&#8221;  The answers are &#8216;no&#8217; and &#8216;yes&#8217;, respectively.  And with the solid state disk (SSD) auto-tiering from EMC (FAST) the second answer is an emphatic &#8220;YES!&#8221;</p>
<p><span id="more-667"></span></p>
<p>A perfunctory bit of guidance I include in nearly all of my performance talks (such as the enthralling, entertaining, and cancer-curing* presentations from VMworld 2010 that I will repeat in <a href="http://www.vmworld.com/community/conferences/europe2010/">Copenhagen</a> from 12-14 October) is &#8220;follow your application best practices&#8221;.  Audiences usually nod and immediately forget because this is a recommendation we all know to be correct yet somehow ignore.  In that way it is like, &#8220;stay away from fatty foods&#8221;, &#8220;do not drink wine with pain killers&#8221;, or &#8220;pay attention during the flight attendants&#8217; presentation&#8221;.</p>
<p>Part of the reason why people forget this nugget is that the advice is general and not crystallized in a technological explanation that embeds deep in the minds of the audience.  In this case the application best practice that should be followed is to separate data from logs, putting the data on something good for random read performance (like RAID 5) and the logs on something good for sequential write performance (RAID 10).  Obviously I want everyone to consolidate their storage to VMFS and enjoy the technology, but if you are putting VMDKs that contain each of these files on the same volume, you are ignoring application best practices.</p>
<p>In this case I recommend building two VMFS volumes.  One backed by RAID 5 and the other by RAID 10.  Put the data on RAID 5, the logs on RAID 10.  While you will change the access profile at the array by putting multiple log files on the same RAID 10 backed LUN, the resultant IO will be much more sequential write than had you mixed data file reads among them.  So, consolidate multiple data files onto the same RAID 5 LUN and consolidate multiple log files on the same RAID 10 LUN.</p>
<p>Furthermore, if you are using solid state auto-tiering to manage your volumes, you do <em>not</em> need to protect your database log file with this technology.  What I am talking about here is EMC&#8217;s Fully Automated Storage Tiering (FAST), which is the most popular thing EMC has created since I have been paying attention.  Despite what some people will tell you, solid state disks are the cheapest way to serve huge amounts of random reads.  But their benefits diminish when the profile is sequential write, where they become unattractive from a cost perspective.</p>
<p>EMC&#8217;s FAST works by creating a volume that is like a vertical stripe of multiple RAID groups.  LUNs, which become VMFS volumes, are then placed in that FAST volume.  Since FAST is a great technology for solid state disks, RAID 5 is the most cost efficient configuration for database data, and solid state is wasted on sequential IO such as redo logs, my best practice for virtual storage configuration for database workloads when FAST may be present can be boiled down to the following rules:</p>
<ul>
<li>Always create RAID 5 volumes for your read-intensive database data.</li>
<li>Always create RAID 10 volumes for your database logs.  If you have write-intensive data, you may consider putting them here, too.</li>
<li>If you have FAST, use it to stripe across multiple RAID 5 volumes of different disk types and put your random, read-intensive data on VMFS on this volume.</li>
</ul>
<p>The last bullet is clearly the most important here. I really love FAST, and it seems that EMC&#8217;s customers are crazy for it.  But it&#8217;s not the technology you need for sequential write workloads like redo logs.  Separate those data onto their own &#8220;normal&#8221; (not FAST-backed) VMFS volumes that use no SSDs.  Then you will have the best of all worlds: optimally deployed disk technologies, application best practice compliance, and righteous virtualized database consolidation.</p>
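<p>The three rules above can be written down as a small decision function.  This is a sketch of the rules as stated in this article, not an EMC or VMware tool; the function name <em>choose_volume</em>, its parameters, and the returned labels are invented for illustration.</p>

```python
def choose_volume(content, access="random-read", fast_available=False):
    """Suggest a backing volume for a database VMDK.

    content: "data" or "log" (hypothetical labels for this sketch).
    access:  "random-read" or "write-intensive".
    """
    # Logs, and optionally write-intensive data, go to RAID 10 with no SSDs.
    if content == "log" or access == "write-intensive":
        return "RAID 10 VMFS volume (not FAST-backed, no SSD)"
    # Random, read-intensive data benefits from FAST when it is present.
    if fast_available:
        return "VMFS on a FAST volume striped across RAID 5 groups"
    # Otherwise plain RAID 5 is the cost-efficient choice for data.
    return "RAID 5 VMFS volume"

print(choose_volume("data", fast_available=True))
print(choose_volume("data"))
print(choose_volume("log", fast_available=True))
```

<p>Note that a log file lands on the RAID 10 volume whether or not FAST is available, which is the point of the last rule.</p>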
<p>(*) The claims made by the author of this blog do not reflect the views of his employer, the conference organizers, the government of the Kingdom of Denmark, or reality, for that matter.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/09/20/databases-storage-and-solid-state-disks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Optimizing Memory Utilization</title>
		<link>http://vpivot.com/2010/01/06/optimizing-memory-utilization/</link>
		<comments>http://vpivot.com/2010/01/06/optimizing-memory-utilization/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 21:52:45 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[swap]]></category>
		<category><![CDATA[vcenter]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=198</guid>
		<description><![CDATA[My recent series of blog articles has discussed ESX memory management and the performance specter of host swapping. My last article attempts to correct the misconception that VMware recommends against over-committing memory.  In that article I suggested that memory over-commit is a requirement in optimizing memory utilization. Today I want to provide a specific example to [...]]]></description>
			<content:encoded><![CDATA[<p>My recent series of blog articles has discussed ESX memory management and the performance specter of host swapping.  My last article attempts to <a href="http://vpivot.com/2010/01/04/misunderstanding-memory-management/">correct the misconception that VMware recommends against over-committing memory</a>.   In that article I suggested that memory over-commit is a requirement in optimizing memory utilization. Today I want to provide a specific example to show why this is true.   I have also included tips for identifying host swapping in your environments.<br />
<span id="more-198"></span></p>
<h2>Understanding the Bottleneck</h2>
<p>Let me show the value of over-commit and danger of swapping by way of an example.  I will choose the following typical values to demonstrate my point:</p>
<ul>
<li>All virtual machines are on a single host which has <strong>32 GB of RAM</strong> installed.</li>
<li>Each virtual machine is sized to <strong>8 GB of RAM</strong>.</li>
<li>Each virtual machine has <strong>25% active memory</strong> (%ACTV in esxtop and &#8220;Active&#8221; in vCenter).</li>
</ul>
<table id="newspaper-a">
<tbody>
<tr>
<th>VM Count</th>
<th>Active Memory in Host</th>
<th>Comments</th>
</tr>
<tr>
<td>3</td>
<td>3 * 8 GB * 25% = <strong>6 GB</strong></td>
<td>Without memory over-commit, <em>only about 19% of the host&#8217;s memory is actively in use</em>.   What a waste!</td>
</tr>
<tr>
<td>12</td>
<td>12 * 8 GB * 25% = <strong>24 GB</strong></td>
<td>Memory is over-committed by 200% but only 75% is actively being used.  In this aggressive consolidation <em>virtual machines will run at full speed</em> until usage exceeds 100% of host memory.</td>
</tr>
<tr>
<td>18</td>
<td>18 * 8 GB * 25% = <strong>36 GB</strong>, limited to <strong>32 GB</strong> by host</td>
<td>These virtual machines want 36 GB of RAM but are limited to the 32 GB that is installed on the host.  ESX must swap to allow these machines to run and <em>performance will suffer greatly</em>.</td>
</tr>
</tbody>
</table>
<p>A virtual machine&#8217;s active memory is dictated by the application and its usage.  But the VI admin has complete control over the number of virtual machines in the environment which means host active memory can be influenced by adding or removing virtual machines.  Because virtual machine active memory is always equal to or less than 100% the only way to drive the host active memory to 100% is to over-commit memory.   <em>This is why hypervisors that do not support memory over-commit are simply not viable for data centers where memory optimization is a priority.</em></p>
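<p>The arithmetic in the table can be reproduced with a short Python sketch.  The 32 GB host, 8 GB virtual machines, and 25% active memory are the values chosen above; the function name <em>host_memory</em> is made up, and treating &#8220;active memory greater than installed memory&#8221; as the swapping condition is a simplification for illustration.</p>

```python
def host_memory(vm_count, vm_size_gb=8, active_fraction=0.25, host_gb=32):
    """Summarize configured and active memory for identical VMs on one host."""
    configured_gb = vm_count * vm_size_gb
    active_gb = configured_gb * active_fraction
    return {
        # Configured memory as a percentage of installed RAM;
        # anything above 100 means memory is over-committed.
        "configured_pct": 100 * configured_gb / host_gb,
        "active_gb": active_gb,
        "active_pct": 100 * active_gb / host_gb,
        # Simplified model: swapping starts once demand exceeds installed RAM.
        "swapping": active_gb > host_gb,
    }

for vm_count in (3, 12, 18):
    print(vm_count, host_memory(vm_count))
```

<p>For 3 virtual machines this gives 6 GB active (18.75% of the host), for 12 it gives 24 GB (75%) with configured memory at 300% of installed RAM, and for 18 it gives 36 GB of demand against 32 GB installed.</p>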
<h2>Identifying and Correcting the Bottleneck</h2>
<p>The ongoing occurrence of swapping is identified by a non-zero swap rate in either esxtop or vCenter.  In addition to swap rate, esxtop provides a swap wait time in its CPU panel.  When swap rate exceeds hundreds of kilobytes per second or swap wait time exceeds a couple percentage points, it is time for corrective action.</p>
<p>There are three possible solutions to this problem:</p>
<ol>
<li>Balance the virtual machines&#8217; memory usage by moving virtual machines from hosts with higher amounts of memory usage to hosts with lower amounts of memory usage.</li>
<li>Run fewer virtual machines.</li>
<li>Buy more memory.</li>
</ol>
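<p>The rule of thumb for when to act can be expressed as a simple check.  The thresholds below are assumptions: I have read &#8220;hundreds of kilobytes per second&#8221; as 200 KB/s and &#8220;a couple percentage points&#8221; as 2%, and you should tune both for your environment.  The function name <em>needs_corrective_action</em> is hypothetical, and the inputs are values you would read from esxtop or vCenter yourself.</p>

```python
def needs_corrective_action(swap_rate_kbps, swap_wait_pct,
                            rate_limit_kbps=200.0, wait_limit_pct=2.0):
    """Return True when host swapping is heavy enough to warrant action.

    swap_rate_kbps: observed swap rate in kilobytes per second.
    swap_wait_pct:  observed swap wait time as a percentage.
    Both limits are assumed values, not official VMware thresholds.
    """
    return swap_rate_kbps > rate_limit_kbps or swap_wait_pct > wait_limit_pct

print(needs_corrective_action(swap_rate_kbps=0, swap_wait_pct=0))      # False
print(needs_corrective_action(swap_rate_kbps=850, swap_wait_pct=0.5))  # True
print(needs_corrective_action(swap_rate_kbps=50, swap_wait_pct=4.0))   # True
```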
<h2>Designing Your Infrastructure to Simplify Memory Management</h2>
<p>Ultimately I owe you a full white paper on memory management to provide a sufficient answer.  But I want to give you two ideas of the tools and techniques that I will be describing in this future paper.  First, place <a href="http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/">host swap files on solid state disk (SSD) stores</a> to improve their performance.  With the right SSD device it may be possible to eliminate swap penalties.  Second, even if SSDs are unavailable consider consolidating multiple swap files onto a single store.  This will make swap rate monitoring very easy but may compound the performance penalties of swapping.</p>
<p>Stay tuned and VMware will provide more documentation on memory management in 2010.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2010/01/06/optimizing-memory-utilization/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>Solid State Disks and Host Swapping</title>
		<link>http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/</link>
		<comments>http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/#comments</comments>
		<pubDate>Fri, 25 Dec 2009 01:15:45 +0000</pubDate>
		<dc:creator>drummonds</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[esxtop]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[ssd]]></category>
		<category><![CDATA[swap]]></category>
		<category><![CDATA[vmkernel]]></category>

		<guid isPermaLink="false">http://vpivot.com/?p=183</guid>
		<description><![CDATA[Recently I have been thinking, talking, and writing about ESX host memory swapping a lot.  ESX swaps memory under the same conditions that traditional operating systems do; the application(s) is using more memory than available on the physical hardware.  Host swapping is an unavoidable consequence of this condition, whether virtualization is present or not. But [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I have been thinking, talking, and <a href="http://vpivot.com/2009/12/23/your-performance-enemy-host-swapping/">writing</a> about ESX host memory swapping a lot.  ESX swaps memory under the same conditions that traditional operating systems do: the applications are using more memory than is available on the physical hardware.  Host swapping is an unavoidable consequence of this condition, whether virtualization is present or not.</p>
<p><span id="more-183"></span>But <a href="http://communities.vmware.com/blogs/chethank/2009/12/22/using-solidstate-drives-to-improve-performance-of-sql-databases-on-vsphere-hosts-when-memory-is-overcommitted">a recent article</a> by my engineering colleague Chethan Kumar shows an avenue that allows VI admins to aggressively over-commit memory and avoid the catastrophic performance penalty of swapping: use solid state disks to host ESX swap files.</p>
<p>The fundamental problem with host swapping comes from the high latency of traditional disks compared to memory.  Data can be retrieved from memory in nanoseconds but takes milliseconds to fetch from a hard drive.  That means a single 4K memory page takes 100,000 times longer to retrieve if the operating system swapped it out.</p>
<p>The value that solid state disks offer to this problem is exceptionally low latency, as compared to traditional drives.  The SSD that Chethan used showed microsecond latencies, about 1,000 times lower than rotating disks.  This means that time spent waiting for swap activity* has been decreased to 0.1% of the time spent swapping to rotating disks.</p>
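<p>To make these ratios concrete, here is the arithmetic as a small Python sketch.  The specific latencies are round numbers I have assumed for illustration (100 ns for memory, 10 ms for a rotating disk, 10 microseconds for the SSD); they are not measurements from Chethan&#8217;s article.</p>

```python
# Assumed, illustrative latencies expressed in nanoseconds.
MEMORY_NS = 100          # memory access
DISK_NS = 10_000_000     # rotating disk access (10 ms)
SSD_NS = 10_000          # solid state disk access (10 microseconds)

# How much slower a swapped-out page is on a rotating disk than in memory.
disk_vs_memory = DISK_NS // MEMORY_NS

# How much lower SSD latency is than rotating disk latency.
disk_vs_ssd = DISK_NS // SSD_NS

# Swap wait on SSD as a percentage of swap wait on a rotating disk.
ssd_wait_pct = 100 * SSD_NS / DISK_NS

print(disk_vs_memory)  # 100000
print(disk_vs_ssd)     # 1000
print(ssd_wait_pct)    # 0.1
```

<p>Under these assumptions a page swapped to SSD is still about 100 times slower to fetch than one in memory, which is why swap on SSD softens the penalty rather than making it free.</p>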
<p>The importance of fast swap files is that it enables administrators to more aggressively over-commit memory.  Today our admins rightfully fear the VMs&#8217; aggregate active memory exceeding the available physical memory, which results in swapping.  Today SSD technology in shared storage such as EMC&#8217;s new CLARiiONs allows our admins to cleverly place swap files and drive up memory utilization to previously unheard of levels.  This may enable standard memory overcommitment of 200% or more, with extreme over-commit being much higher than this.</p>
<p>In future versions of ESX we want to automate the usage of SSDs to maximize the use of available memory.  But that&#8217;s a roadmap discussion that I will leave for another day.</p>
<p>(*) This swap wait time has conveniently been added to ESX 4&#8217;s version of esxtop under the counter %SWPWT.  See <a href="http://communities.vmware.com/docs/DOC-9279">Interpreting esxtop Statistics</a> for more information.</p>
]]></content:encoded>
			<wfw:commentRss>http://vpivot.com/2009/12/24/solid-state-disks-and-host-swapping/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

