[pve-devel] Default cache mode for VM hard drives

Stanislav German-Evtushenko ginermail at gmail.com
Wed May 27 19:51:14 CEST 2015


Hi Dietmar, hi Alexandre, hi all,

I had to learn the basics of C and POSIX threads to create a reproducible test
case for out-of-sync blocks, and now I am back with both good and bad
results. The good result is a reliable way to reproduce the problem; the
bad one is that the problem exists.

Below you will find a C program which runs two threads, where:
- the first thread writes data from a buffer (x) to a block device 10 000
times
- the second thread changes the data in the buffer at the same time

Results:
- if the block device is opened without O_DIRECT, the issue never happens
- if the block device is opened with O_DIRECT, out-of-sync (oos) blocks
appear on almost every run

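What does this mean? Without O_DIRECT the data is copied into the page cache
when "write" is called, so later changes to the buffer cannot affect it; with
O_DIRECT the kernel reads straight from the user buffer while the I/O is in
flight. This suggests that the kvm process does the same thing as the test:
it does not wait until the "write" system call has finished and all data from
the buffer has reached the block device. Instead it changes the buffer
somewhere in between. I am not sure if the behaviour of DRBD can be changed so
that it does not produce out-of-sync blocks with O_DIRECT. This behaviour
could be related to the kernel code itself and not just DRBD. At the same
time, changing the buffer in KVM while not all data has reached the block
device does not seem right.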
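
For illustration only, here is a minimal sketch of one way an application
could keep a buffer stable while a write is in flight: copy it into a private,
aligned "bounce" buffer and write the copy. The helper name write_snapshot and
its parameters are my own assumptions; this is not what KVM or DRBD do. The
copy may still capture a half-updated buffer, but whatever it captures is what
every replica receives.

```c
#include <stdlib.h>     // posix_memalign, free
#include <string.h>     // memcpy
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // write

// Hypothetical helper: snapshot src into a private aligned buffer and
// write that snapshot, so later changes to src cannot reach the device.
// The caller is assumed to have opened fd (with O_DIRECT if desired) and
// to pass an alignment that suits the device, e.g. 512.
ssize_t write_snapshot(int fd, const void *src, size_t len, size_t align)
{
    void *bounce;
    if (posix_memalign(&bounce, align, len) != 0) {
        return -1;
    }
    memcpy(bounce, src, len);            // take the snapshot
    ssize_t n = write(fd, bounce, len);  // only the private copy is in flight
    free(bounce);
    return n;
}
```

In the test program further down, this would replace the plain write() call,
for example write_snapshot(fd, x, BUFFER_SIZE, ALIGN), at the cost of one extra
copy per write.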




1. Prerequisites:
- gcc installed
- a drbd block device with data-integrity-alg enabled (strictly speaking this
is not necessary, but it helps to catch the problem right away instead of
wasting time on "drbdadm verify")
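
As a rough user-space analogy of why an integrity check catches this right
away: a checksum taken before the data goes out no longer matches if the
buffer changes in flight. The sketch below is only an illustration under that
assumption. The names checksum and write_and_check are hypothetical, the
checksum is a simple stand-in for a real digest, and none of this is DRBD's
actual implementation.

```c
#include <stddef.h>     // size_t
#include <stdint.h>     // uint32_t
#include <unistd.h>     // write

// Simple Fletcher-style checksum; a stand-in for a real digest such as sha1.
static uint32_t checksum(const unsigned char *p, size_t len)
{
    uint32_t a = 0, b = 0;
    for (size_t i = 0; i < len; i++) {
        a = (a + p[i]) % 65535;
        b = (b + a) % 65535;
    }
    return (b << 16) | a;
}

// Hypothetical helper: returns 1 if buf changed between the checksum taken
// before write() and the one taken after it, 0 if it stayed the same, and
// -1 if the write failed. The window is slightly wider than the write itself.
int write_and_check(int fd, const void *buf, size_t len)
{
    uint32_t before = checksum(buf, len);
    if (write(fd, buf, len) < 0) {
        return -1;
    }
    return checksum(buf, len) != before;
}
```

With a single thread this always returns 0; with a second thread changing the
buffer, as in the test program, it would be expected to return 1 from time to
time.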

2. Here is the code:
-------------------------------------------------------------------
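#define _GNU_SOURCE

#include <fcntl.h>      // open, O_DIRECT
#include <pthread.h>    // pthread_create, pthread_join
#include <stdio.h>      // printf, fprintf, perror
#include <stdlib.h>     // posix_memalign, free
#include <string.h>     // memset
#include <unistd.h>     // write, close, usleep

#define BUFFER_SIZE 4096
#define ALIGN 512
#define DELAY 10000     // microseconds between buffer changes

char *x;                // buffer shared by both threads

// Writes the shared buffer to the block device 10 000 times.
void *write_to_blkdev(void *arg)
{
    int fd = open((char *) arg, O_RDWR | O_CREAT | O_DIRECT, 0660);
    //int fd = open((char *) arg, O_RDWR | O_CREAT, 0660);
    if (fd < 0) {
        perror("open");
        return NULL;
    }
    memset(x, 55, BUFFER_SIZE);
    int i;
    for (i = 0; i < 10000; i++) {
        if (write(fd, x, BUFFER_SIZE) < 0) {
            perror("write");
            break;
        }
    }
    close(fd);
    return NULL;
}

// Changes the shared buffer while the writes are in flight.
void *change_buffer(void *arg)
{
    (void) arg;
    int i;
    for (i = 0; i < 100; i++) {
        memset(x, i, BUFFER_SIZE);
        usleep(DELAY);
    }
    return NULL;
}

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <block device>\n", argv[0]);
        return 1;
    }

    // posix_memalign returns 0 on success, a positive error number on failure
    void *buf;
    if (posix_memalign(&buf, ALIGN, BUFFER_SIZE) != 0) { return 1; }
    x = buf;

    pthread_t pth1, pth2;

    pthread_create(&pth1, NULL, write_to_blkdev, argv[1]);
    pthread_create(&pth2, NULL, change_buffer, NULL);

    printf("Waiting for change_buffer thread\n");
    pthread_join(pth2, NULL);
    printf("Waiting for write_to_blkdev thread\n");
    pthread_join(pth1, NULL);

    free(x);
    return 0;
}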
-------------------------------------------------------------------

3. Building

gcc -pthread drbd_oos_test.c

4. Using

./a.out /dev/drbd0

PS: This test was done on a virtual machine, so on a physical one you
would probably need to play with DELAY.
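
To play with the delay without recompiling, it could be read from the command
line. A small sketch, assuming a hypothetical helper parse_delay_us that is
not part of the program above:

```c
#include <errno.h>      // errno
#include <limits.h>     // UINT_MAX
#include <stdlib.h>     // strtoul

// Hypothetical helper: parse a delay in microseconds from a string,
// falling back to fallback_us when the string is missing or not a plain
// non-negative decimal number.
unsigned int parse_delay_us(const char *s, unsigned int fallback_us)
{
    if (s == NULL || *s < '0' || *s > '9') {
        return fallback_us;
    }
    char *end;
    errno = 0;
    unsigned long v = strtoul(s, &end, 10);
    if (errno != 0 || *end != '\0' || v > UINT_MAX) {
        return fallback_us;
    }
    return (unsigned int) v;
}
```

In change_buffer the call would then become something like
usleep(parse_delay_us(delay_arg, DELAY)), where delay_arg is an optional
second command line argument passed to the thread.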

Best regards,
Stanislav German-Evtushenko