[PVE-User] PVE kernel (sources, build process, etc.)

Uwe Sauter uwe.sauter.de at gmail.com
Mon Sep 10 11:35:23 CEST 2018


Hi,

I rebooted the two "old" hosts (Westmere EP generation) with the cmdline options. But for the "new" hosts I'm not comfortable
applying the options, as those also host the VMs.
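
To verify after a reboot that a host really came up with those options, something like the following could be used. It is only a
sketch: the helper name missing_flags is made up for illustration, and it assumes the running kernel's command line can be read
from /proc/cmdline as on a usual Linux system.

```python
from pathlib import Path

# The four options quoted from the ceph list in this thread.
FLAGS = ("noibrs", "noibpb", "nopti", "nospectre_v2")


def missing_flags(cmdline, wanted=FLAGS):
    """Return the wanted options that do not appear as separate words in cmdline."""
    present = set(cmdline.split())
    return [flag for flag in wanted if flag not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    return Path(path).read_text()
```

Calling missing_flags(read_cmdline()) on a host returns an empty list if all four options were active at boot, otherwise the
names of the ones that are missing.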

Now all hosts are on 4.15.18-4-pve and it seems that it is harder to trigger the issue. But once triggered, it is an OSD running
on one of the newer hosts that doesn't commit the subop.


Regards,

	Uwe

On 22.08.18 at 11:08, Marcus Haarmann wrote:
> Hi,
> did you try this:
> (taken from the ceph list)
> 
> This is PTI, I think.  Try to add "noibrs noibpb nopti nospectre_v2" to 
> kernel cmdline and reboot.
> 
> 
> Did this make a difference?
> We are struggling with Proxmox/Ceph too, and we suspect that it is kernel or network related,
> but could not narrow it down to a specific cause.
> But the effects are different: we encountered stuck I/O on rbd devices,
> and the kernel says it is losing a mon connection and hunting for a new mon all the time (when a backup takes
> place and heavy I/O is done).
> 
> Marcus Haarmann
> 
> ----------------------------------------------------------------------------------------------------------------------------------
> *From: *"uwe sauter de" <uwe.sauter.de at gmail.com>
> *To: *"Thomas Lamprecht" <t.lamprecht at proxmox.com>, "pve-user" <pve-user at pve.proxmox.com>
> *Sent: *Wednesday, 22 August 2018 10:50:19
> *Subject: *Re: [PVE-User] PVE kernel (sources, build process, etc.)
> 
>>>>
>>>>> * pve-kernel 4.13 is based on http://kernel.ubuntu.com/git/ubuntu/ubuntu-artful.git/ ?
>>>>>
>>>>
>>>> Yes. (Note that this may not get many updates anymore)
>>>>
>>>>> * pve-kernel 4.15 is based on http://kernel.ubuntu.com/git/ubuntu/ubuntu-bionic.git/ ?
>>>>>
>>>>
>>>> Yes. We're normally on the latest stable release tagged on the master branch.
>>>>
>>>
>>> I'll check out both and compare the myri10ge drivers…
>>>
>>
>> What's your exact issue, if I may ask?
>>
>> cheers,
>> Thomas
>>
>>
>>
> 
> Short story is that since updating from 4.13 to 4.15 I get slow_requests in Ceph with only low load on Ceph. If I boot back
> into 4.13, those are gone.
> 
> Or at least almost: I was able to produce slow_requests with 4.13 too, but only if deep scrubbing was manually initiated on all PGs.
> 
> 
> Sage Weil (Ceph dev) suggested that this probably is MTU or bonding related and thus I'm currently testing with different
> settings. Another guy on the ceph-devel list suggested different driver versions so I was checking that as well.
> 
> 
> 
> Long story is here:
> 
> https://pve.proxmox.com/pipermail/pve-user/2018-May/169472.html
> 
> https://pve.proxmox.com/pipermail/pve-user/2018-May/169492.html
> 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-May/026627.html
> 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-August/028862.html
> 
> https://marc.info/?l=ceph-devel&m=153449419830984&w=2
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user



