[pve-devel] [PATCH pve-docs] Add section for ZFS Special Device

Fabian Ebner f.ebner at proxmox.com
Wed Nov 6 09:34:52 CET 2019


Thanks for the review and suggestions! I'll send a v2 later; two replies
are inline.

On 11/5/19 10:14 AM, Aaron Lauterer wrote:
> Nicely written.
> 
> I have some suggestions inline:
> * splitting long sentences
> * adding more info as to what is valid for the size in 
> special_small_blocks (taken from the zfs man page)
> * rewrote the last paragraph a bit
> 
> On 10/22/19 12:33 PM, Fabian Ebner wrote:
>  > Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
>  > ---
>  >   local-zfs.adoc | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  >   1 file changed, 44 insertions(+)
>  >
>  > diff --git a/local-zfs.adoc b/local-zfs.adoc
>  > index b4fb7db..378cbee 100644
>  > --- a/local-zfs.adoc
>  > +++ b/local-zfs.adoc
>  > @@ -431,3 +431,47 @@ See the `encryptionroot`, `encryption`, 
> `keylocation`, `keyformat` and
>  >   `keystatus` properties, the `zfs load-key`, `zfs unload-key` and `zfs
>  >   change-key` commands and the `Encryption` section from `man zfs` 
> for more
>  >   details and advanced usage.
>  > +
>  > +
>  > +ZFS Special Device
>  > +~~~~~~~~~~~~~~~~~~
>  > +
>  > +Since version 0.8.0 ZFS allows adding a `special` device to a pool, 
> which is
>  > +then used to store metadata, deduplication tables and optionally 
> small file
>  > +blocks.
> 
> Since version 0.8.0, ZFS supports `special` devices. A `special` device
> in a pool is used to store metadata, deduplication tables, and
> optionally small file blocks.
> 
>  > +
>  > +IMPORTANT: The redundancy of the `special` device should match the 
> one of the
>  > +pool, since the `special` device is a point of failure for the whole 
> pool.
>  > +
>  > +WARNING: Adding a `special` device to a pool cannot be undone!
>  > +
>  > +.Create a pool with `special` device and RAID-1:
>  > +
>  > + zpool create -f -o ashift=12 <pool> mirror <device1> <device2> 
> special mirror <device3> <device4>
>  > +
>  > +.Add a `special` device to an existing pool with RAID-1:
>  > +
>  > + zpool add <pool> special mirror <device1> <device2>
>  > +
>  > +For ZFS datasets where the `special_small_blocks` property is set to 
> a non-zero
>  > +value, the `special` device is used to store small file blocks up to 
> that size.
>  > +Setting the `special_small_blocks` property on the pool will change 
> the default
>  > +value of that property for all child ZFS datasets (for example all 
> containers
>  > +in the pool will opt in for small file blocks).
>  > +
>  > +.Opt in for small file blocks pool-wide:
>  > +
>  > + zfs set special_small_blocks=<size> <pool>
>  > +
>  > +.Opt in for small file blocks for a single dataset:
>  > +
>  > + zfs set special_small_blocks=<size> <pool>/<filesystem>
>  > +
>  > +.Opt out from small file blocks for a single dataset:
>  > +
>  > + zfs set special_small_blocks=0 <pool>/<filesystem>
> 
> INFO: The value for <size> can be `0` to disable storing small file
> blocks on the special device, or a power of two in the range from 512B
> to 128K.
> 

Another thing I'll add here is the (non-intuitive) relation to the
recordsize: setting `special_small_blocks` greater than or equal to the
recordsize of the ZFS file system will cause *all* data to be written to
the special device [0].
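
To make that relation concrete, here is a minimal sketch of the check one
could run before setting the property. The function name
`check_small_blocks` is hypothetical (it is not part of ZFS), both
arguments are assumed to be plain byte counts, and the valid range
(0, or a power of two from 512B to 128K) is taken from the INFO note
suggested above.

```shell
# Hypothetical helper, not part of ZFS: classify a candidate
# special_small_blocks value against a dataset's recordsize.
# Usage: check_small_blocks <size-in-bytes> <recordsize-in-bytes>
check_small_blocks() {
    size=$1
    recordsize=$2

    # 0 disables storing small file blocks on the special device.
    if [ "$size" -eq 0 ]; then
        echo "disabled"
        return 0
    fi

    # Assumed valid range: a power of two from 512B to 128K.
    if [ "$size" -lt 512 ] || [ "$size" -gt 131072 ] ||
       [ $((size & (size - 1))) -ne 0 ]; then
        echo "invalid"
        return 1
    fi

    # At or above the recordsize, every data block counts as "small",
    # so all data of the dataset would go to the special device.
    if [ "$size" -ge "$recordsize" ]; then
        echo "all-data"
        return 2
    fi

    echo "small-blocks-only"
}

# Example with the default recordsize of 128K (131072 bytes):
check_small_blocks 4096 131072
```

The recordsize in bytes can be read with
`zfs get -Hp -o value recordsize <pool>/<filesystem>` and passed as the
second argument.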

>  > +
>  > +Using a `special` device makes sense for pools with lots and lots of 
> changing
>  > +metadata respectively small files. If you also have other, larger 
> I/O on the
>  > +same pool then the benefit from using a `special` device might be 
> even more
>  > +noticeable. It is recommended to use SSDs or NVMes for the `special` 
> device.
>  >
> 
> A `special` device can improve the speed of small I/O operations if the
> pool consists of slow spinning hard disks. Enabling
> `special_small_blocks` can further increase the performance if a lot of
> small files are used. Use fast (NVMe) SSDs for the `special` device.
> 

It's really about metadata and not small I/O operations in general. For
example, I/O operations with a block size of 4K on large files will not
benefit from a special device (even with `special_small_blocks`
enabled).
And I think that the benefit does not depend so much on the speed of the
SSD. It should come from the fact that the I/O on the HDDs doesn't get
disturbed as much by the metadata/small file operations.

What about the following?

A `special` device can improve the speed of a pool consisting of slow
spinning hard disks with a lot of changing metadata, for example if the
pool has many short-lived files. Enabling `special_small_blocks` can
further increase the performance when those files are small. Use SSDs
for the `special` device.

> _______________________________________________
> pve-devel mailing list
> pve-devel at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 

[0]: https://github.com/zfsonlinux/zfs/issues/9131#issuecomment-523680936



