[PVE-User] shared LVM on host-based mirrored iSCSI LUNs

Stefan Sänger stsaenger at googlemail.com
Wed Apr 25 17:52:59 CEST 2012


Hi,

I did some more research and here is what I found out...

On 23.04.2012 18:10, Flavio Stanchina wrote:
>
> Not safe, as far as I know.

That was my first guess as well. My iSCSI setup is just a test 
environment to see what is possible - but unfortunately there is a 
production system that uses FC with two FC SAN boxes, and host-based 
mirroring is supposed to be implemented there.

BTW: That hardware was not my decision, and right now I am basically 
trying to figure out the best way to deal with this problem...

> It would be just like using a
> non-distributed filesystem such as extX on shared storage: md is not
> meant to be used in this way, there is no locking between multiple
> nodes.

Well - it is not really like using a file system on shared storage.
As long as everything is working and the RAID was synced once, it is not 
really a problem to connect the RAID-LUNs to another host. The other 
host will only discover a clean RAID, will find the lvm information and 
go along with that.

The interesting part is that LVM in fact acts as a kind of locking 
mechanism here: every logical volume can only be used by a single VM, 
and that single VM can only run on one host at a time. So there is a 
clear mapping of physical extents to virtual machines, and hence there 
is no data corruption as long as every host system only writes to the 
extents it is allowed to.
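
To make that argument concrete, here is a small toy model in Python. It is 
only a sketch: the volume names, node names and extent ranges are made up, 
and it models nothing but the "one LV, one VM, one host" rule described 
above.

```python
# Toy model: each logical volume owns a disjoint range of physical extents,
# and each logical volume belongs to a VM running on exactly one host.
# All names and numbers here are hypothetical.

lv_extents = {
    "vm-101-disk-1": range(0, 256),
    "vm-102-disk-1": range(256, 512),
}
lv_host = {"vm-101-disk-1": "node1", "vm-102-disk-1": "node2"}


def overlapping_extents(extent_map):
    """Return the extents that are claimed by more than one logical volume."""
    seen, overlaps = set(), set()
    for extents in extent_map.values():
        overlaps |= seen & set(extents)
        seen |= set(extents)
    return overlaps


def writable_extents(host):
    """Extents a host may write to: those of the LVs whose VM runs on it."""
    result = set()
    for lv, owner in lv_host.items():
        if owner == host:
            result |= set(lv_extents[lv])
    return result


# No extent is shared between LVs, so no two hosts write to the same extent.
print(overlapping_extents(lv_extents))
print(writable_extents("node1") & writable_extents("node2"))
```

Both results are empty sets, which is the whole point: the separation holds 
at the LVM level, but it says nothing about the md layer underneath.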

But in case of a failover when one of the hosts goes down, the other 
hosts are not aware of the RAID-state, since every host keeps its own 
RAID-metadata.

And a write command issued by a VM to the logical volume means that 
md has to issue two write commands - one to each LUN. Since there is no 
communication about the RAID state between hosts, there is no way to 
guarantee even a consistent state.

What is more, reading from a clean, synced RAID1 is supposed to be done 
round-robin just like RAID0 - without checking the mirrored block.

So if something has been written to only one RAID member, it is a 
coin-flip whether you will read it back or not. And that means that even 
if the fsck of the VM thinks everything is fine, it is not :(
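
The failure mode can be illustrated with a few lines of Python. This is a 
toy simulation, not md itself: it assumes reads simply alternate between 
the two members as described above (the real md read balancing may choose 
differently), and the class and parameter names are made up.

```python
import itertools


class ToyMirror:
    """Two-member mirror with alternating reads (hypothetical model)."""

    def __init__(self, blocks):
        # Both members start out identical, i.e. the mirror is "clean".
        self.members = [[b"old"] * blocks, [b"old"] * blocks]
        self._next_member = itertools.cycle([0, 1])

    def write(self, block, data, fail_after_first=False):
        # A mirrored write is really two writes, one per member.
        self.members[0][block] = data
        if fail_after_first:
            return  # the host went down before the second write was issued
        self.members[1][block] = data

    def read(self, block):
        # Reads are served from one member only; the copy is never compared.
        return self.members[next(self._next_member)][block]


mirror = ToyMirror(blocks=4)
mirror.write(2, b"new", fail_after_first=True)

# Another host that considers the mirror clean reads the same block 4 times.
reads = [mirror.read(2) for _ in range(4)]
print(reads)  # [b'new', b'old', b'new', b'old']
```

The same block returns different content depending on which member happens 
to serve the read, and nothing on the surviving host knows that a resync 
would be needed.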


> While I can't think of a sure way to break it, I wouldn't feel
> safe to use it in production.

Well, I think I have just described a plausible way in which it can break.
And that leads me to the next question:

Instead of using RAID to do the mirroring, LVM should be able to take 
care of this. I will do some tests, but maybe you guys around here have 
a good idea about it.

So my next test will be:

- deleting the RAID
- disconnecting the iSCSI-Targets from all nodes but one
- creating single physical volumes on each LUN
- creating the volume group using -cy (--clustered=yes) with both LUNs
- probably the tricky point: creating the logical volume manually
   using lvcreate -m 1
- adding that volume to the virtual machine
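
The local part of that plan could be written down roughly as follows. This 
is an untested dry-run sketch in Python: it only builds and prints the 
command lines, it does not execute anything. The device paths, the volume 
group and volume names and the size are placeholders, disconnecting the 
iSCSI targets on the other nodes and attaching the volume to the VM are 
left out, and so is --mirrorlog, since that choice is still open.

```python
import shlex


def mirrored_lv_plan(luns, vg, lv, size, md_device):
    """Return the planned commands as argument lists; nothing is executed."""
    # Remove the old md RAID and its metadata from both LUNs.
    plan = [["mdadm", "--stop", md_device]]
    plan += [["mdadm", "--zero-superblock", lun] for lun in luns]
    # One physical volume per LUN.
    plan += [["pvcreate", lun] for lun in luns]
    # Clustered volume group spanning both LUNs.
    plan.append(["vgcreate", "-cy", vg, *luns])
    # Logical volume with one mirror copy, i.e. two copies of the data.
    plan.append(["lvcreate", "-m", "1", "-L", size, "-n", lv, vg])
    return plan


# All values below are placeholders, not taken from a real system.
plan = mirrored_lv_plan(
    luns=["/dev/sdb", "/dev/sdc"],
    vg="vg_mirror",
    lv="vm-101-disk-1",
    size="32G",
    md_device="/dev/md0",
)
for cmd in plan:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the commands as argument lists makes it easy to review the plan 
before typing anything on a node that holds real data.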

I am not sure about some lvcreate options like --mirrorlog yet, and not 
sure if it will work anyway. But I think I should give it a try...


> Use DRBD between the two NAS boxes -- or whatever kind of realtime
> mirroring OpenFiler has to offer -- to mirror the disks, then use
> multipath to expose both ends to the VM hosts.

As mentioned above, this is basically some research on how to implement 
host-based mirroring. I did not come up with this requirement, but since 
I have been using Proxmox VE for some time now I would prefer to use it 
here as well.



Stefan


