[pve-devel] Default cache mode for VM hard drives

Cesar Peschiera brain at click.com.py
Thu Nov 20 10:04:45 CET 2014


When you say "enable integrity checking", are you talking about the 
"data-integrity-alg" directive or about the "verify-alg" directive?

Because if you are speaking of "data-integrity-alg", I can tell you that I 
talked with Lars Ellenberg about that option roughly two years ago, and he 
said it is intended for testing purposes only: the upper layers may modify 
the network packets in flight, so DRBD ends up believing that something is 
wrong with the verification (and in my case, accordingly, DRBD cut the 
communication). The "data-integrity-alg" option should therefore never be 
used in production environments.
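
For reference, a minimal sketch of where that directive lives in a DRBD 8.4 
resource configuration; the resource name "r0" and the md5 algorithm are 
examples of mine, not taken from any real setup:

    resource r0 {
        net {
            # Checksums every data packet on the wire. Per Lars, for
            # testing only: an upper layer may modify a buffer while it
            # is in flight, producing false mismatches.
            data-integrity-alg md5;
        }
    }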

Moreover, for a long time now my PVE nodes have had LVM on top of DRBD (with 
the 8.4.2, 8.4.3 and 8.4.4 versions, and next week the 8.4.5 version) backing 
my KVM VMs, and I have had no "oos" (out-of-sync) problems since I removed 
the "data-integrity-alg" directive (this includes some VMs running with QEMU 
cache=none).

For verification of the DRBD storage, I run a script from cron once a week, 
and I have the "verify-alg sha1;" directive enabled in DRBD.
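
In case it is useful, a sketch of that setup; the resource name, file paths 
and schedule are only examples:

    # /etc/drbd.d/r0.res -- enable online verification
    resource r0 {
        net {
            verify-alg sha1;    # checksum used by "drbdadm verify"
        }
    }

    # /etc/cron.d/drbd-verify -- weekly online verify (Sunday 03:00);
    # out-of-sync blocks afterwards show up as "oos:" in /proc/drbd
    0 3 * * 0  root  /sbin/drbdadm verify all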

I always hear Lars tell people that they should move to a more modern DRBD 
version (8.4.x, the latest stable release of the moment).


----- Original Message ----- 
From: Stanislav German-Evtushenko
To: Cesar Peschiera
Cc: Dietmar Maurer; pve-devel at pve.proxmox.com
Sent: Thursday, November 20, 2014 4:57 AM
Subject: Re: [pve-devel] Default cache mode for VM hard drives


On Thu, Nov 20, 2014 at 10:49 AM, Cesar Peschiera <brain at click.com.py> 
wrote:

Cache=none means no host cache, but the backend cache is still in use. In 
the case of DRBD this is a buffer inside DRBD, so O_DIRECT returns OK when 
the data reaches this buffer, not the RAID cache.
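
(As a quick way to see that O_DIRECT path from the shell, dd can issue the 
same kind of direct write; the device path is only an example, and this 
overwrites data, so use a scratch device:)

    # oflag=direct opens the target with O_DIRECT; the write returns once
    # DRBD accepts the buffer, not when it reaches the RAID cache
    dd if=/dev/zero of=/dev/drbd0 bs=4096 count=1 oflag=direct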


Excuse me please if I intervene in this conversation, but as I understand it, 
if the data is in a DRBD buffer, then DRBD must know that there is data to 
replicate, so obviously the problem isn't in the upper layers (KVM, any RAM 
buffer controlled by some software, etc.); the DRBD buffer should therefore 
be tuned as needed.

Moreover, DRBD has several web pages that explain in great detail how to 
optimize many things, including configuring its buffers to avoid data loss, 
with examples both with and without a RAID controller in the middle. So the 
software in the upper layers can do nothing about it, since DRBD takes 
control of the data as well as of its own buffer.
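
A sketch of the kind of tuning those pages describe, assuming DRBD 8.4 
syntax; only disable flushes when the controller cache is battery- or 
flash-backed:

    resource r0 {
        disk {
            # With a BBU-protected RAID controller write cache:
            disk-flushes no;    # skip flushes to the backing device
            md-flushes no;      # skip flushes for the DRBD meta-data
            # Without such a controller, leave both at the default
            # ("yes") to avoid data loss on power failure.
        }
    }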

If we enable integrity checking in DRBD (DRBD will compare checksums for all 
blocks prior to committing them to the backend) while using cache=none, then 
we get this kind of message from time to time:
block drbd0: Digest mismatch, buffer modified by upper layers during write: 
25715616s +4096


Stanislav 



