Device Mapper Cache

Recent device mapper versions allow using small SSD disks to cache bigger, much slower conventional disks.
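Whether the running kernel provides the cache target can be checked up front - a quick sketch (dm-cache is usually loaded on demand):

# dmsetup targets | grep cache   # lists the cache target and its version once the module is loaded
# modprobe dm-cache              # load it manually if needed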

CentOS 7.4
It seems they found a performance bottleneck. Presumably it is connected to the following message

dmesg output: [device-mapper: cache: You have created a cache device with a lot of individual cache blocks]
but as it wasn't explained anywhere - why should anyone change something....
See LVM Cache: limit number of cache chunks to the amount tested

This explains the following message when configuring a 470 GByte cache volume

 # lvconvert --type cache-pool --poolmetadata /dev/vg00/lvol-meta --cachemode writethrough /dev/vg00/lvol-cache
  Using 512.00 KiB chunk size instead of default 64.00 KiB, so cache pool has less then 1000000 chunks.

display the current chunk size

# lvs -o+chunksize
  LV    VG   Attr       LSize Pool         Origin        Data%  Meta%  Move Log Cpy%Sync Convert Chunk
  lvol0 vg00 Cwi-aoC--- 1.81t [lvol-cache] [lvol0_corig] 15.07  0.55            0.00             512.00k
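The chunk size can also be set explicitly instead of relying on the automatic choice - an untested sketch using the --chunksize option of lvconvert(8); 256K is an arbitrary example value:

# lvconvert --type cache-pool --chunksize 256K --poolmetadata /dev/vg00/lvol-meta --cachemode writethrough /dev/vg00/lvol-cache   # 256K is just an example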

Updates [20170916]
New version of lvmcache-statistics.sh with more details and updated policies
Fixed a typo in the cache device detection.

ATTENTION
Upgrading an LVM cache installation from LVM <2.02.112 to 2.02.112 or even 2.02.114 will break the setup.
The VG cannot be (fully) activated any more.

LV vg00/lvol-cache has uknown feature flags 0
The missing "n" speaks for the quality of the code - sorry
See more details in Arch Linux Bug 42377

The only known fix so far: roll back to the older LVM version and remove the logical cache volume.
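A minimal sketch of that rollback, assuming the stock CentOS package names (adjust to your repository):

# yum downgrade lvm2 lvm2-libs device-mapper device-mapper-libs   # package set is an assumption
# lvremove /dev/vg00/lvol-cache                                   # drops the cache pool, lvol0 itself stays intact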

Prior to CentOS 7.1 there is a nasty bug: whatever cache mode is configured on the CLI, only the writeback mode is used.
See https://bugzilla.redhat.com/show_bug.cgi?id=1135639
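To see which cache mode is actually in effect (not just what was configured), ask the kernel directly - the feature arguments field of the status line names the active mode:

# dmsetup status vg00-lvol0   # look for writethrough/writeback among the feature arguments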

TODO

There are still some blind spots to work on

  • Metadata block count
    better usage of the SSD device instead of just Metadata Usage: 1.3%
  • Caching of LVM mirrored devices
    seems (as of Dec 2015) not to be supported

Configuration

Start - vg00 with the original, slow disk and an already created lvol0

# pvcreate /dev/sdb
# vgextend vg00 /dev/sdb
http://rwmj.wordpress.com/2014/05/22/using-lvms-new-cache-feature/ states that the CacheMetaLV
should be 1/1000th of the size of the CacheDataLV, with a minimum of 8MB - for the 100 GByte
CacheDataLV below that is roughly 100 MByte, so 4 GByte is generous.
# lvcreate -L 100G -n lvol-cache vg00 /dev/sdb
# lvcreate -L 4G -n lvol-meta vg00 /dev/sdb
Default mode seems to be writeback
# lvconvert --type cache-pool --poolmetadata /dev/vg00/lvol-meta /dev/vg00/lvol-cache
  Logical volume "lvol1" created
  Converted vg00/lvol-cache to cache pool.
# lvconvert --type cache --cachepool /dev/vg00/lvol-cache /dev/vg00/lvol0
  vg00/lvol0 is now cached.
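A quick check that lvol0 is now really running on the cache target - the segment type should read cache:

# lvs -o name,segtype vg00/lvol0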

Remove the cache pool and flush all cache blocks to the original disk

# lvremove /dev/vg00/lvol-cache
  Flushing cache for lvol0
  280074 blocks must still be flushed.
  278866 blocks must still be flushed.
  277802 blocks must still be flushed.
  ...
  31 blocks must still be flushed.
  0 blocks must still be flushed.
  Do you really want to remove active logical volume lvol-cache? [y/n]: y
  Logical volume "lvol-cache" successfully removed
Recreate CacheLV
# lvcreate -L 100G -n lvol-cache vg00 /dev/sdb
# lvcreate -L 4G -n lvol-meta vg00 /dev/sdb
remaining PEs = 249

More secure cache pool

With writethrough every write goes to both the cache and the origin device, so a failing SSD cannot lose data:
# lvconvert --type cache-pool --poolmetadata /dev/vg00/lvol-meta --cachemode writethrough /dev/vg00/lvol-cache
  Logical volume "lvol1" created
  Converted vg00/lvol-cache to cache pool.
  remaining PEs = 121
Where did the 128 PEs (4 GByte) go? Presumably into the hidden pmspare volume - see below.
# lvconvert --type cache --cachepool /dev/vg00/lvol-cache /dev/vg00/lvol0
  vg00/lvol0 is now cached.

Statistics

It took some time to understand the figures provided by dmsetup, so I have created a script called lvmcache-statistics.sh for an easier overview

# ./lvmcache-statistics.sh
-------------------------------------------------------------------------
LVM [2.02.130(2)-RHEL7] cache report of found device /dev/vg00/lvol0
-------------------------------------------------------------------------
- Cache Usage: 98.9% - Metadata Usage: 1.3%
- Read Hit Rate: 14.3% - Write Hit Rate: 13.6%
- Demotions/Promotions/Dirty: 0/28236/2
- Feature arguments in use: writeback
- Core arguments in use : migration_threshold 2048 smq 0
  - Cache Policy: stochastic multiqueue (smq)
- Cache Metadata Mode: rw
- MetaData Operation Health: ok
and many more...
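The script merely parses the raw status line of the cache target. Per the kernel's dm-cache documentation the counters are, in order: metadata block size, used/total metadata blocks, cache block size, used/total cache blocks, read hits, read misses, write hits, write misses, demotions, promotions and dirty blocks, followed by the feature, core and policy arguments:

# dmsetup status vg00-lvol0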

how to change the cache policy (the cleaner policy flushes all dirty cache blocks back to the origin device)

# lvchange --cachepolicy cleaner vg00/lvol0
# lvchange --cachepolicy smq vg00/lvol0
# lvchange --cachepolicy mq vg00/lvol0

how to change the cache mode - now that it is fixed
--cachemode writethrough|writeback|passthrough

# lvchange --cachemode writethrough vg00/lvol0
# lvchange --cachemode writeback vg00/lvol0

how to change the cache settings - hm, I haven't used this one yet
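An untested sketch based on the --cachesettings option documented in lvmcache(7); the migration_threshold value is just an example, not a recommendation:

# lvchange --cachesettings 'migration_threshold=8192' vg00/lvol0   # example value only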

And one ring to rule them all....

# lvs -o+cache_policy,cache_settings,cache_mode
  LV    VG   Attr       LSize Pool         Origin        Data%  Meta%  Move Log Cpy%Sync Convert CachePolicy CacheSettings CacheMode
  lvol0 vg00 Cwi-aoC--- 1.81t [lvol-cache] [lvol0_corig] 15.07  0.55            0.00             smq                       writethrough

what the hell is PMSPARE

# lvs -a
  LV                 VG   Attr       LSize   Pool         Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol-cache]       vg00 Cwi---C--- 106.25g                            98.97  1.32            0.01
  [lvol-cache_cdata] vg00 Cwi-ao---- 106.25g
  [lvol-cache_cmeta] vg00 ewi-ao----   1.50g
  lvol0              vg00 Cwi-aoC---   1.81t [lvol-cache] [lvol0_corig] 98.97  1.32            0.01
  [lvol0_corig]      vg00 owi-aoC---   1.81t
  [lvol1_pmspare]    vg00 ewi-------   1.50g
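The hidden [lvol1_pmspare] is the pool metadata spare: LVM keeps one spare LV, as big as the largest metadata LV in the VG, as scratch space for metadata repair - which also answers where the 128 PEs above went. A sketch of the repair it exists for (untested here):

# lvconvert --repair vg00/lvol-cache   # rebuilds broken cache pool metadata with the help of the pmspare volume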

work with dmsetup

# dmsetup table vg00-lvol0
0 3892379648 cache 253:1 253:2 253:4 128 0 default 0
Load a new table with the cleaner policy; it only becomes live after a suspend/resume cycle
# dmsetup reload vg00-lvol0 --table '0 3892379648 cache 253:1 253:2 253:4 128 0 cleaner 0'
# dmsetup suspend vg00-lvol0
# dmsetup resume vg00-lvol0
# dmsetup wait vg00-lvol0

history...

CentOS 7.0-7.2 - LVM 2.02.130
CentOS 7.3 - LVM 2.02.155
CentOS 7.4 - LVM 2.02.171

Links

LVM cache mode, policy and configurable parameters