In my last post I ran a little test where I copied a few ISOs over to the test server to see how the deduplication process worked on ZFS. I’ve continued that testing and copied a few virtual machines up to the same lab server. To make things a bit more interesting, I’ve enabled ZFS compression as well.
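For reference, dedup and compression are both switched on with a property change on the dataset. This is only a minimal sketch: it assumes the rpool/data dataset that appears in the output further down and the default compression algorithm, since the exact settings used on the lab server aren't listed in this post.

```shell
# Assumed dataset name, taken from the command output later in this post.
DATASET=rpool/data

# Turn on block-level deduplication and the default compression algorithm.
# (Assumption: the post does not say which compression setting was used.)
zfs set dedup=on "$DATASET"
zfs set compression=on "$DATASET"

# Confirm that both properties are now set on the dataset.
zfs get dedup,compression "$DATASET"
```

Both properties only affect data written after they are set, so anything already on the dataset stays as it was.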
Let’s take a look at the data we’re working with:
VMware VMDKs
- Virtual Machine 1 has a single disk that is about 12GB
- Virtual Machine 2 has a single disk that is about 10GB
VirtualBox Hard Disks
- 10GB of VirtualBox hard disks
So the plan is to create folders 1-4. Folders 1 and 2 will hold duplicate copies of the VMware VMDK files, and folders 3 and 4 will hold duplicate copies of the VirtualBox hard disk files. The raw, uncompressed, un-deduplicated total should be about 64GB (two copies of 22GB of VMDKs plus two copies of 10GB of VirtualBox disks). Let’s look at the before and after upload results:
Before Upload
zpool list rpool
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  696G  7.78G  688G  1%   1.02x  ONLINE  -

zfs get compressratio rpool/data
NAME        PROPERTY       VALUE  SOURCE
rpool/data  compressratio  1.0x   -
After Upload
zpool list rpool
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  696G  13.6G  682G  1%   2.07x  ONLINE  -

zfs get compressratio rpool/data
NAME        PROPERTY       VALUE  SOURCE
rpool/data  compressratio  1.71x  -
If we look a little further, we can see that df reports about 17GB of disk space used:
df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
rpool/data  686G  17G   670G   3%    /rpool/data
If we compare the zpool list results from before and after the upload (ALLOC went from 7.78G to 13.6G), it appears that our 64GB of data has consumed about 6GB of actual disk space. Not bad.
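The arithmetic behind those two numbers can be sketched in a few lines of shell. The variable names are just for illustration, the disk sizes are the approximate figures listed earlier, and the 64GB is a nominal total rather than a measured one, so treat the resulting ratio as a rough estimate.

```shell
# Approximate disk sizes in GB, from the list of test data.
vm1_gb=12
vm2_gb=10
vbox_gb=10

# Two copies of each set (folders 1-2 and folders 3-4).
raw_gb=$(( 2 * (vm1_gb + vm2_gb) + 2 * vbox_gb ))

# ALLOC column of `zpool list rpool` before and after the upload.
alloc_before=7.78
alloc_after=13.6

# The shell only does integer math, so hand the decimals to awk.
consumed_gb=$(awk -v a="$alloc_after" -v b="$alloc_before" 'BEGIN { printf "%.2f", a - b }')
ratio=$(awk -v r="$raw_gb" -v c="$consumed_gb" 'BEGIN { printf "%.1f", r / c }')

echo "Raw data:        ${raw_gb}GB"
echo "Pool space used: ${consumed_gb}GB"
echo "Overall ratio:   about ${ratio}x"
```

That works out to 5.82GB of pool space for a nominal 64GB of files, or roughly 11 to 1 with dedup and compression combined.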
Although this setup warrants further testing (and inclusion in a supported OpenSolaris release), these results should give us a bit of insight into how ZFS dedup and compression will handle our data when we use it as a backup target for our virtualized data.
