[pve-devel] Quorum problems with Intel 10 Gb/s NICs and VMs turning off

Cesar Peschiera brain at click.com.py
Wed Dec 24 20:56:15 CET 2014


>So you need to disable it to avoid any problems?
Correct, disabling it is necessary, and with that, my problems ended.

I currently have a functional and stable 8-node PVE cluster with:
- 6 servers with kernel 3.10.0-5-pve and pve-manager 3.3-5
- 2 servers with kernel 2.6.32-19-pve and pve-manager 2.3-13
- 2 servers with 10 Gb/s NICs with the IP address on vmbr0
- 4 servers with 10 Gb/s NICs for DRBD in balance-rr bonding
- I/OAT enabled on 2 servers (for testing purposes)

For now, I don't see error messages in the syslog and kern.log files, but I
will wait a week to see if any error messages appear, checking daily.
(Maybe I/OAT is stable in PVE and can be enabled in the hardware BIOS.)
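As an illustration of that daily check (a minimal sketch; the log paths are the
Debian defaults on a PVE 3.x node, eth0/eth2 are just example interface names
from my bond, and the exact driver message wording may differ):

# Look for NIC link flaps reported by the driver in the kernel log
grep -iE "link is (down|up)" /var/log/kern.log

# Or watch live while the test period runs
tail -f /var/log/syslog /var/log/kern.log | grep -i --line-buffered -E "eth0|eth2"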

Questions for Alexandre, or anyone else:
1) If anybody has experience with I/OAT enabled, please report it here, along
with the kernel version in use.
2) Logical Proc. idling: can it be enabled while keeping stability?

Moreover, when I have finished the hardware BIOS configuration of the server,
with the respective tests for each option, I will publish the full
configuration.

Notes about my future report:
1) I know that some options can be enabled, for example "SR-IOV Global
Enable", but since I use HA and we want live migration to keep working, we
prefer to leave that option disabled.
2) As a large database will be run (235 GB of RAM just for the database), a
very important test is still pending; the goal is for the processor to get
more speed on random access to RAM. Hyperthreading, HugePages, and some other
tests are also pending.
3) These tests will be aimed at optimizing the MS-SQL Server database; for
other applications in VMs, the configuration would probably be a little
different.
4) My future report will include comparisons of database latency for each
option that is enabled or disabled in the hardware BIOS, HugePages, etc.

I hope this information can help many people.

As a preliminary recommendation, and not a final one, I suggest this
configuration (which, for the moment, is stable for me):

Processor:
----------
Adjacent cache line prefetch : Enable
Hardware Prefetcher  : Enable
DCU streamer Prefetcher  : Enable
DCU IP Prefetcher  : Enable
Execute Disable   : Enable
Logical Proc. idling  : Disable
Dell Turbo   : Disable

BIOS or Peripherals
----------------------
OS Watchdog timer  : Disable
I/OAT DMA Engine  : Disable
SR-IOV Global Enable  : Disable
Mem. Mapped I/O Above 4 GB : Disable


----- Original Message ----- 
From: "Alexandre DERUMIER" <aderumier at odiso.com>
To: "Cesar Peschiera" <brain at click.com.py>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday, December 24, 2014 8:49 AM
Subject: Re: [pve-devel] Quorum problems with Intel 10 Gb/s NICs and VMs turning off


>> I'm interested to know what this option is ;)
>>Memory Mapped I/O Above 4 GB : Disable

So you need to disable it to avoid any problems?

Maybe this is related:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2050443

>>Yes, I can write to /etc/pve
>>And regarding the red lights:
>>After some hours, the problem mysteriously disappeared.

That means that the pvestatd daemon is hanging.

pvestatd sequentially checks the status of the host, the VMs, and then the
storage.

And sometimes a slow/overloaded/hanging storage can block pvestatd.

You can restart the pvestatd daemon; it should fix the problem.
Also check the logs to see if the daemon is hanging on a specific storage.
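For example (a minimal sketch on a PVE 3.x node; the sysvinit script path and
the syslog location are the standard Debian/Proxmox 3.x ones):

# Restart the status daemon
/etc/init.d/pvestatd restart

# See which storage pvestatd was working on when it hung or timed out
grep pvestatd /var/log/syslog | tail -n 50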


>>Moreover, I have doubts about these 3 options (hardware BIOS):
>>- OS Watchdog timer (option available in all my servers)

You can use it if you don't use fencing from Proxmox. It'll restart the
server in case of a kernel panic, for example.


>>- I/OAT DMA Engine (I am testing with two Dell R320 servers, each server
>>with 2 Intel 1 Gb/s NICs, 4 ports each)
I don't know too much about this one.


>>- Dell turbo (I don't remember the exact text),
>>but the Dell recommendation is to enable it only in the performance profile.
>>This option only appears on Dell R720 servers.

Maybe it's related to turbo core on some Intel processors.
Generally, I recommend turning this off, because it dynamically shuts down
some cores to speed up other cores.
And virtualization doesn't like this too much, because of the changing clock
frequency (BSOD under Windows).


----- Original Message -----
From: "Cesar Peschiera" <brain at click.com.py>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday, December 24, 2014 08:38:28
Subject: Re: [pve-devel] Quorum problems with Intel 10 Gb/s NICs and VMs turning off

Hi Alexandre

Thanks for your reply; here are my answers:

> I'm interested to know what this option is ;)
Memory Mapped I/O Above 4 GB : Disable

>Can you check that you can write to /etc/pve?
Yes, I can write to /etc/pve
And regarding the red lights:
After some hours, the problem mysteriously disappeared.

Moreover, I have doubts about these 3 options (hardware BIOS):
- OS Watchdog timer (option available in all my servers)
- I/OAT DMA Engine (I am testing with two Dell R320 servers, each server
with 2 Intel 1 Gb/s NICs, 4 ports each)
- Dell turbo (I don't remember the exact text),
but the Dell recommendation is to enable it only in the performance profile.
This option only appears on Dell R720 servers.


----- Original Message ----- 
From: "Alexandre DERUMIER" <aderumier at odiso.com>
To: "Cesar Peschiera" <brain at click.com.py>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Monday, December 22, 2014 2:58 PM
Subject: Re: [pve-devel] Quorum problems with Intel 10 Gb/s NICs and VMs turning off


>>After several checks, I found the problem on these two servers: a
>>configuration in the hardware BIOS that isn't compatible with
>>pve-kernel-3.10.0-5, and my NICs were having their link go down and come
>>back up.
>>(I guess that soon I will communicate my BIOS setup for the Dell R720.)
>>... :-)

I'm interested to know what this option is ;)



>>The strange behaviour is that when I run "pvecm status", I get this
>>message:
>>Version: 6.2.0
>>Config Version: 41
>>Cluster Name: ptrading
>>Cluster Id: 28503
>>Cluster Member: Yes
>>Cluster Generation: 8360
>>Membership state: Cluster-Member
>>Nodes: 8
>>Expected votes: 8
>>Total votes: 8
>>Node votes: 1
>>Quorum: 5
>>Active subsystems: 6
>>Flags:
>>Ports Bound: 0 177
>>Node name: pve5
>>Node ID: 5
>>Multicast addresses: 239.192.111.198
>>Node addresses: 192.100.100.50

So, you have quorum here. All nodes are OK. I don't see any problem.
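(For reference, with "Expected votes: 8" the required quorum is a simple
majority, 8/2 + 1 = 5, which matches the "Quorum: 5" shown above; with
"Total votes: 8" the cluster is therefore quorate.)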


>>And in the PVE GUI I see the red light on all the other nodes.

That means that the pvestatd daemon is hanging or has crashed.


Can you check that you can write to /etc/pve?
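For example (an illustrative check; the file name is arbitrary, and /etc/pve is
the pmxcfs fuse mount, so the write fails if the cluster filesystem is
read-only or broken):

touch /etc/pve/writetest && rm /etc/pve/writetest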

If not, try to restart:

/etc/init.d/pve-cluster restart

then

/etc/init.d/pvedaemon restart
/etc/init.d/pvestatd restart



----- Original Message -----
From: "Cesar Peschiera" <brain at click.com.py>
To: "aderumier" <aderumier at odiso.com>, "pve-devel"
<pve-devel at pve.proxmox.com>
Sent: Monday, December 22, 2014 04:01:31
Subject: Re: [pve-devel] Quorum problems with Intel 10 Gb/s NICs and VMs turning off

After several checks, I found the problem on these two servers: a
configuration in the hardware BIOS that isn't compatible with
pve-kernel-3.10.0-5, and my NICs were having their link go down and come back
up.
(I guess that soon I will communicate my BIOS setup for the Dell R720.)
... :-)

But now I have another problem with the mix of pve-manager 3.3-5 and 2.3-13
versions in an 8-node PVE cluster: I am losing quorum on several nodes very
often.

Moreover, for now I cannot upgrade my old PVE nodes, so for the moment I would
like to know whether it is possible to make a quick configuration change so
that all my nodes always have quorum.

The strange behaviour is that when I run "pvecm status", I get this message:
Version: 6.2.0
Config Version: 41
Cluster Name: ptrading
Cluster Id: 28503
Cluster Member: Yes
Cluster Generation: 8360
Membership state: Cluster-Member
Nodes: 8
Expected votes: 8
Total votes: 8
Node votes: 1
Quorum: 5
Active subsystems: 6
Flags:
Ports Bound: 0 177
Node name: pve5
Node ID: 5
Multicast addresses: 239.192.111.198
Node addresses: 192.100.100.50

And in the PVE GUI I see the red light on all the other nodes.

Can I apply some kind of temporary solution, such as "Quorum: 1", so that my
nodes work well and don't show this strange behaviour? (Only until I perform
the updates.)
Or, what would be the simplest and quickest temporary solution to avoid
upgrading my nodes?
(Something like, for example, adding to the rc.local file a line that says
"pvecm expected 1"; see the sketch below.)

Note about the quorum: I don't have any hardware fence device enabled, so I
don't mind whether each node always has quorum (I can always turn the server
off manually and brutally if necessary).

----- Original Message ----- 
From: "Cesar Peschiera" <brain at click.com.py>
To: "Alexandre DERUMIER" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Saturday, December 20, 2014 9:30 AM
Subject: Re: [pve-devel] Quorum problems with Intel 10 Gb/s NICs and VMs turning off


> Hi Alexandre
>
> I put the 192.100.100.51 IP address directly on bond0, and I don't have
> network connectivity (as if the node were totally isolated).
>
> This was my setup:
> ------------------- 
> auto bond0
> iface bond0 inet static
> address 192.100.100.51
> netmask 255.255.255.0
> gateway 192.100.100.4
> slaves eth0 eth2
> bond_miimon 100
> bond_mode 802.3ad
> bond_xmit_hash_policy layer2
>
> auto vmbr0
> iface vmbr0 inet manual
> bridge_ports bond0
> bridge_stp off
> bridge_fd 0
> post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
> post-up echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier
>
> ...... :-(
>
> Any other suggestions?
>
> ----- Original Message ----- 
> From: "Alexandre DERUMIER" <aderumier at odiso.com>
> To: "Cesar Peschiera" <brain at click.com.py>
> Cc: "pve-devel" <pve-devel at pve.proxmox.com>
> Sent: Friday, December 19, 2014 7:59 AM
> Subject: Re: [pve-devel] Quorum problems with Intel 10 Gb/s NICs and VMs turning off
>
>
> Maybe you can try to put the 192.100.100.51 IP address directly on bond0,
>
> to avoid corosync traffic going through vmbr0.
>
> (I remember some old offloading bugs with 10 GbE NICs and the Linux bridge.)
>
>
> ----- Original Message -----
> From: "Cesar Peschiera" <brain at click.com.py>
> To: "aderumier" <aderumier at odiso.com>
> Cc: "pve-devel" <pve-devel at pve.proxmox.com>
> Sent: Friday, December 19, 2014 11:08:33
> Subject: Re: [pve-devel] Quorum problems with Intel 10 Gb/s NICs and VMs turning off
>
>>Can you post your /etc/network/interfaces for these 10 Gb/s nodes?
>
> This is my configuration:
> Note: The LAN uses 192.100.100.0/24
>
> #Network interfaces
> auto lo
> iface lo inet loopback
>
> iface eth0 inet manual
> iface eth1 inet manual
> iface eth2 inet manual
> iface eth3 inet manual
> iface eth4 inet manual
> iface eth5 inet manual
> iface eth6 inet manual
> iface eth7 inet manual
> iface eth8 inet manual
> iface eth9 inet manual
> iface eth10 inet manual
> iface eth11 inet manual
>
> #PVE Cluster and VMs (NICs are of 10 Gb/s):
> auto bond0
> iface bond0 inet manual
> slaves eth0 eth2
> bond_miimon 100
> bond_mode 802.3ad
> bond_xmit_hash_policy layer2
>
> #PVE Cluster and VMs:
> auto vmbr0
> iface vmbr0 inet static
> address 192.100.100.51
> netmask 255.255.255.0
> gateway 192.100.100.4
> bridge_ports bond0
> bridge_stp off
> bridge_fd 0
> post-up echo 0 >
> /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
> post-up echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier
>
> #A link for DRBD (NICs are of 10 Gb/s):
> auto bond401
> iface bond401 inet static
> address 10.1.1.51
> netmask 255.255.255.0
> slaves eth1 eth3
> bond_miimon 100
> bond_mode balance-rr
> mtu 9000
>
> #Other link for DRBD (NICs are of 10 Gb/s):
> auto bond402
> iface bond402 inet static
> address 10.2.2.51
> netmask 255.255.255.0
> slaves eth4 eth6
> bond_miimon 100
> bond_mode balance-rr
> mtu 9000
>
> #Other link for DRBD (NICs are of 10 Gb/s):
> auto bond403
> iface bond403 inet static
> address 10.3.3.51
> netmask 255.255.255.0
> slaves eth5 eth7
> bond_miimon 100
> bond_mode balance-rr
> mtu 9000
>
> #A link for the NFS-Backups (NICs are of 1 Gb/s):
> auto bond10
> iface bond10 inet static
> address 10.100.100.51
> netmask 255.255.255.0
> slaves eth8 eth10
> bond_miimon 100
> bond_mode balance-rr
> #bond_mode active-backup
> mtu 9000
>



