vPivot

Scott Drummonds on Virtualization

Performance of Thin Provisioned Disks


I have received questions about thin provisioned disk performance since we announced full support for them in vSphere.  The questions usually center on scalability.  VMware’s customers fear locking contention as new blocks are created for growing disks, and think that growing thin disks may punish the performance of all virtual disks on a volume.  Well, performance engineering has finally released a paper on the subject, and we are glad to say that thin disk performance differs insignificantly from thick disk performance.

The kernel of the concern about thin disks is the use of SCSI reservations to lock a LUN for growth.  In theory this means a growing VMDK can cause numerous locks on the entire volume and slow IO for all virtual disks on it.  It is true that these reservations could impact performance, but experiments show that the zeroing operation for each block matters significantly more to performance than locking.  And VMFS block zeroing is the same on thin and thick disks.

VMware thick disks, the default for virtual machines, zero a block’s data only on the first write to that block.  This lazy zeroing process means that the first write to each 1 MB block is much slower than every successive write.  Since thin disks allocate and zero the block at the same time, the SCSI reservation is set and released immediately before the zeroing.  Because the zeroing takes much longer, the impact of the reservation is insignificant.
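
To make the argument concrete, here is a toy cost model in Python. The timing constants are invented for illustration and are not taken from the white paper. The only thing carried over from the text is the assumption that zeroing a 1 MB block takes much longer than setting and releasing a SCSI reservation.

```python
# Toy cost model for the first write to a 1 MB VMFS block.
# All timings are hypothetical placeholders, not measured values.
RESERVATION_MS = 0.5   # assumed cost to set and release a SCSI reservation
ZEROING_MS = 10.0      # assumed cost to zero one 1 MB block
WRITE_MS = 1.0         # assumed cost of the guest's write itself


def first_write_ms(thin: bool) -> float:
    """Latency of the first write to a block on a thin or lazy zeroed thick disk."""
    cost = ZEROING_MS + WRITE_MS      # both disk types zero the block on first write
    if thin:
        cost += RESERVATION_MS        # thin disks also allocate the block under a reservation
    return cost


def later_write_ms() -> float:
    """Latency of any later write to the same block: no allocation, no zeroing."""
    return WRITE_MS


thick = first_write_ms(thin=False)
thin = first_write_ms(thin=True)
overhead = (thin - thick) / thick

print(f"thick first write: {thick:.1f} ms")
print(f"thin first write:  {thin:.1f} ms ({overhead:.1%} slower)")
print(f"later writes:      {later_write_ms():.1f} ms")
```

With these made-up numbers the reservation adds only a few percent to a first write that is already an order of magnitude slower than later writes. Different constants will change the percentages, but the shape of the result holds as long as zeroing dominates.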

Here is a figure I lifted from the white paper that shows scalability of thin and thick disks as well as the impact of zeroing:

Thin and thick disks are seen to scale the same, depending only on growth phase.

Thin and thick disk performance is dominated by disk phase, not type.

The paper measured thin and thick disks in both phases of their existence: (1) the growing or “zeroing” phase where new blocks are being written to for the first time and (2) the steady state or “post-zeroing” phase where all blocks have been written to once.  This figure shows that the aggregate throughput of either thin or thick disks depends on the disks’ phase and not their implementation type.

In summary, throughput of thin and thick disks is lower during the zeroing phase, which occurs early in the disks’ lives.  After the thin disk is fully grown or the thick disk’s blocks have each received one write, the storage throughput is bound only by link speed.  But in both phases of the disks’ lives thin and thick disks show comparable performance.

14 Responses

58Gbps (and 180Gbps) is a fair amount of throughput for one host. What sort of shared storage adapters were used for that kind of performance?

    • Yes, 180 Gbps is an insanely large number. So insane, in fact, that it is incorrect. You have caught an error in the paper. We are going to correct the units in the graphs to “Mbps”.

      Thank you for the close attention!

  • I have corrected the units in the graph. I believe the whitepaper’s correction is imminent.

  • [...] Drummonds posted a good article with a performance comparison of thick provisioned disks vs. thin provisioned disks. This is good information and helps to clear [...]

  • We are in the process of converting thick zeroed disks (tbz=0) to thin disks. The LUNs are on an HP EVA.

    How does this affect the disk when it starts to grow dynamically?
    Will it do the zeroing again?

    In other words: Will the already zeroed space on the LUN be lost during conversion? Or will the vSphere host be able to recognize that the space was already zeroed before the conversion?

    • Good question, AQU. You have stumped me. I am going to have to run down an answer for you with engineering. Unfortunately it may take until after the holidays (4 JAN 09) before I can get an answer. But I am on it!

    • VMFS records zeroed blocks for thin provisioned disks in metadata only. This means that when you convert an eager zeroed thick disk to a thin disk the zero blocks will not be represented on disk. That space will therefore be reclaimed.

      Once the guest OS writes non-zero data to the logical block then a new block will be created and the thin disk will grow.
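
The behavior described above can be sketched with a toy model in Python. This is only an illustration of the idea that all-zero blocks are dropped on conversion and read back as zeroes. The function names and the dictionary layout are hypothetical and do not reflect the actual VMFS on-disk format.

```python
BLOCK_SIZE = 1024 * 1024  # 1 MB, the block size discussed in the post


def convert_to_thin(thick_blocks):
    """Return a sparse map {block_index: data} that omits all-zero blocks.

    `thick_blocks` is a list of bytes objects, one per logical block.
    Toy model only: real VMFS metadata is not a Python dict.
    """
    zero = bytes(BLOCK_SIZE)
    return {i: data for i, data in enumerate(thick_blocks) if data != zero}


def read_block(thin_map, index):
    """Unallocated logical blocks read back as zeroes."""
    return thin_map.get(index, bytes(BLOCK_SIZE))


# An eager zeroed thick disk with three blocks, only one holding real data.
thick = [bytes(BLOCK_SIZE), b"\x01" * BLOCK_SIZE, bytes(BLOCK_SIZE)]
thin = convert_to_thin(thick)
print(f"allocated blocks after conversion: {len(thin)} of {len(thick)}")

# A later non-zero write to logical block 2 allocates it, so the disk grows.
thin[2] = b"\x02" * BLOCK_SIZE
print(f"allocated blocks after write: {len(thin)} of {len(thick)}")
```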

  • Makes sense, tnx for the info Scott. Clears a lot of clouds.

    Now taking it a step further, when talking in terms of I/O time,
    I presume that run-time zeroing of a new “previously zeroed” block will take less time than zeroing a new “non zeroed” block, as the latter actually contains data which needs to be cleared/formatted and thus will take more time to write to.

    • AQU,

      No, zeroing a “previously zeroed” (zeroed in metadata only) block will not take less time than zeroing a new, unzeroed block. In both cases a full 1MB of zeroes must be written to disk. In the second case a small amount of metadata must be updated. But that write is very, very small with respect to the zero block.

      Scott

  • Aha okay, I get the point, tnx man.

  • Is there any way to convert an eagerzeroedthick or a thin disk to the lazyzeroedthick format or is this only possible by creating new disks?
    I want to only use the thin capabilities of our array.

    • vmkfstools in the console will allow you to convert disks of any format to any other. But note that the default format is lazy zeroed thick. Unless you have changed that, no conversion should be needed.
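
As a rough sketch of what such a conversion might look like, the Python helper below builds a vmkfstools clone command. It assumes the `-i` (clone virtual disk) and `-d` (disk format) options and the format names `zeroedthick`, `eagerzeroedthick`, and `thin`; check `vmkfstools --help` on your own host before relying on them. The helper name and the datastore paths are hypothetical, and the script only prints the command rather than running it. Note that cloning writes a new disk file instead of converting the source in place.

```python
import shlex  # shlex.join requires Python 3.8 or later

# Format names assumed to be accepted by `vmkfstools -d`; verify on your host.
FORMATS = {"zeroedthick", "eagerzeroedthick", "thin"}


def clone_command(src, dst, disk_format="zeroedthick"):
    """Build the argument list for cloning `src` to `dst` in `disk_format`."""
    if disk_format not in FORMATS:
        raise ValueError(f"unknown disk format: {disk_format}")
    return ["vmkfstools", "-i", src, "-d", disk_format, dst]


# Hypothetical paths, for illustration only.
cmd = clone_command(
    "/vmfs/volumes/datastore1/vm1/vm1.vmdk",
    "/vmfs/volumes/datastore1/vm1/vm1-lazy.vmdk",
)
print(shlex.join(cmd))
```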
