From uwe.sauter.de at gmail.com Tue Sep 3 09:18:09 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Tue, 3 Sep 2019 09:18:09 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
Message-ID: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>

Hi all,

on a freshly installed PVE 6 my /etc/aliases looks like this:

# cat /etc/aliases
postmaster: root
nobody: root
hostmaster: root
webmaster: root
www:root

and I get this output from mailq:

# mailq
-Queue ID-  --Size--  ----Arrival Time----  -Sender/Recipient-------
2F38327892      5452  Fri Aug 30 23:25:46   MAILER-DAEMON
              (alias database unavailable)
                                            root at px-golf.localdomain

30E0F27893      5548  Fri Aug 30 23:25:46   MAILER-DAEMON
              (alias database unavailable)
                                            root at px-golf.localdomain

If I change the last line in the aliases file to "www: root" (with a space, as the format described in the man page requires), recreate the alias database and flush the mail queues, everything looks fine:

# sed -i -e 's,www:root,www: root,g' /etc/aliases
# newaliases
# postqueue -f
# mailq
Mail queue is empty

Looks like the package that adds the www entry gets this wrong.

Regards,

Uwe

From t.lamprecht at proxmox.com Tue Sep 3 11:46:03 2019
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Tue, 3 Sep 2019 11:46:03 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>
Message-ID: <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com>

Hi Uwe,

On 03.09.19 09:18, Uwe Sauter wrote:
> [...]
> If I change the last line in the aliases file to "www: root" (with a space, as the format described in the man page requires), recreate the alias database and flush the mail queues, everything looks fine.
> [...]
> Looks like the package that adds the www entry gets this wrong.

Yes, you're right! Many thanks for the report; this is fixed for the next ISO release.

@Fabian: we should probably add a postinst hook which fixes this up?

Doing

# sed -i -e 's/^www:root$/www: root/' /etc/aliases

at one single package version transition should be enough. I'd say checksum-matching the file to see whether it was modified since shipping is not really required, as entries matched by that pattern are simply not correct.
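As a quick way to spot any such broken entries on existing nodes by hand (untested sketch; the pattern just flags alias lines with no blank after the colon):

# grep -nE '^[^#:[:space:]]+:[^[:space:]]' /etc/aliases

Every line this prints needs a space added after the colon, followed by a run of newaliases.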
cheers,
Thomas

From f.gruenbichler at proxmox.com Tue Sep 3 12:09:32 2019
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Tue, 03 Sep 2019 12:09:32 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To: <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com>
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com> <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com>
Message-ID: <1567505336.1gjyizyjik.astroid@nora.none>

On September 3, 2019 11:46 am, Thomas Lamprecht wrote:
> Hi Uwe,
>
> On 03.09.19 09:18, Uwe Sauter wrote:
>> [...]
>
> Yes, you're right! Many thanks for the report; this is fixed for the next ISO release.
>
> @Fabian: we should probably add a postinst hook which fixes this up?
>
> Doing
>
> # sed -i -e 's/^www:root$/www: root/' /etc/aliases
>
> at one single package version transition should be enough. I'd say checksum-matching the file to see whether it was modified since shipping is not really required, as entries matched by that pattern are simply not correct.
>
> cheers,
> Thomas

sounds good to me.

From uwe.sauter.de at gmail.com Tue Sep 3 12:14:29 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Tue, 3 Sep 2019 12:14:29 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To: <1567505336.1gjyizyjik.astroid@nora.none>
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com> <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com> <1567505336.1gjyizyjik.astroid@nora.none>
Message-ID: <42282602-da0f-91e8-4858-7a8bf3834b08@gmail.com>

On 03.09.19 at 12:09, Fabian Grünbichler wrote:
> On September 3, 2019 11:46 am, Thomas Lamprecht wrote:
>> [...]
>> @Fabian: we should probably add a postinst hook which fixes this up?
>>
>> Doing
>>
>> # sed -i -e 's/^www:root$/www: root/' /etc/aliases
>>
>> at one single package version transition should be enough.
>> [...]
>
> sounds good to me.

I'd suggest to do:

sed -i -e 's/^www:/www: /' /etc/aliases

so that lines that were changed by a user are also caught.

From lae at lae.is Tue Sep 3 12:39:05 2019
From: lae at lae.is (Musee Ullah)
Date: Tue, 3 Sep 2019 03:39:05 -0700
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To: <42282602-da0f-91e8-4858-7a8bf3834b08@gmail.com>
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com> <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com> <1567505336.1gjyizyjik.astroid@nora.none> <42282602-da0f-91e8-4858-7a8bf3834b08@gmail.com>
Message-ID:

On 2019/09/03 3:14, Uwe Sauter wrote:
> I'd suggest to do:
> sed -i -e 's/^www:/www: /' /etc/aliases
>
> so that lines that were changed by a user are also caught.

Just pointing out that consecutive package updates will continuously add more spaces with the above, since it doesn't check whether there's already a space:

sed -E -i -e 's/^www:(\w)/www: \1/' /etc/aliases

From t.lamprecht at proxmox.com Tue Sep 3 12:48:43 2019
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Tue, 3 Sep 2019 12:48:43 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To:
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com> <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com> <1567505336.1gjyizyjik.astroid@nora.none> <42282602-da0f-91e8-4858-7a8bf3834b08@gmail.com>
Message-ID: <6c810fb5-3523-25b1-415f-540c9706d59e@proxmox.com>

On 03.09.19 12:39, Musee Ullah via pve-user wrote:
> Just pointing out that consecutive package updates will continuously add
> more spaces with the above, since it doesn't check whether there's
> already a space:
>
> sed -E -i -e 's/^www:(\w)/www: \1/' /etc/aliases

That's why I said "at one single package version transition": independent of what exactly we finally do, I'd always guard it with a version check inside a postinst debhelper script, e.g. like:

if dpkg --compare-versions "$2" 'lt' '6.0-X'; then
    sed ...
fi

so it happens only if an upgrade transitions from a version older than "6.0-X" (no matter how old) to a version equal to or newer than "6.0-X". There is no point in checking every time; if an admin changed it back to something "bad", then it was probably wanted, or at least not our fault like it is here. :)

But your suggestion itself would work fine, in general.
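Roughly, the complete guarded hook could then look like this (sketch only, untested; "6.0-7" stands in for whatever package version actually ships the fixed file):

case "$1" in
  configure)
    # only on upgrades from before the version that ships the corrected entry;
    # $2 is empty on fresh installs, where the file is already correct
    if test -n "$2" && dpkg --compare-versions "$2" 'lt' '6.0-7'; then
        # idempotent: only inserts the blank if it is missing
        sed -E -i -e 's/^www:(\w)/www: \1/' /etc/aliases
        # rebuild the alias database so postfix picks up the change
        newaliases || true
    fi
    ;;
esac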
cheers,
Thomas

From uwe.sauter.de at gmail.com Fri Sep 6 10:41:18 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 6 Sep 2019 10:41:18 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
Message-ID:

Hi,

I'm having trouble moving a disk image to Ceph. Moving between local disks and NFS share is working.

The error given is:

########
create full clone of drive scsi0 (aurel-cluster1-VMs:112/vm-112-disk-0.qcow2)
rbd: create error: (17) File exists
TASK ERROR: storage migration failed: error with cfs lock 'storage-vdisks_vm': rbd create vm-112-disk-0' error: rbd: create error: (17) File exists
########

but this is not true:

########
root at px-bravo-cluster:~# rbd -p vdisks ls
vm-106-disk-0
vm-113-disk-0
vm-113-disk-1
vm-113-disk-2
vm-118-disk-0
vm-119-disk-0
vm-120-disk-0
vm-125-disk-0
vm-125-disk-1
########

Here is the relevant part of my storage.cfg:

########
nfs: aurel-cluster1-VMs
        export /backup/proxmox-infra/VMs
        path /mnt/pve/aurel-cluster1-VMs
        server X.X.X.X
        content images
        options vers=4.2

rbd: vdisks_vm
        content images
        krbd 0
        pool vdisks
########

Looking in /etc/pve I cannot find any filename that would suggest that a lock exists.

Any thoughts on this?

Thanks,

Uwe

From a.antreich at proxmox.com Fri Sep 6 11:32:55 2019
From: a.antreich at proxmox.com (Alwin Antreich)
Date: Fri, 6 Sep 2019 11:32:55 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To:
References:
Message-ID: <20190906093255.GA2458639@dona.proxmox.com>

Hello Uwe,

On Fri, Sep 06, 2019 at 10:41:18AM +0200, Uwe Sauter wrote:
> Hi,
>
> I'm having trouble moving a disk image to Ceph. Moving between local disks and NFS share is working.
>
> The error given is:
> [...]

Can you see anything in the ceph logs? And on what version (pveversion -v) are you on?

> but this is not true:
> [...]

Can you create the image by hand (rbd -p rbd create vm-112-disk-0 --size 1G)? And (rbd -p rbd rm vm-112-disk-0) for delete, ofc.

> Here is the relevant part of my storage.cfg:
> [...]

Is this the complete storage.cfg?

--
Cheers,
Alwin

From uwe.sauter.de at gmail.com Fri Sep 6 11:44:10 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 6 Sep 2019 11:44:10 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To: <20190906093255.GA2458639@dona.proxmox.com>
References: <20190906093255.GA2458639@dona.proxmox.com>
Message-ID: <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>

Hello Alwin,

On 06.09.19 at 11:32, Alwin Antreich wrote:
> Can you see anything in the ceph logs? And on what version (pveversion
> -v) are you on?

Nothing obvious in the logs. The cluster is healthy:

root at px-bravo-cluster:~# ceph status
  cluster:
    id:     982484e6-69bf-490c-9b3a-942a179e759b
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum px-alpha-cluster,px-bravo-cluster,px-charlie-cluster
    mgr: px-alpha-cluster(active), standbys: px-bravo-cluster, px-charlie-cluster
    osd: 9 osds: 9 up, 9 in

  data:
    pools:   1 pools, 128 pgs
    objects: 14.76k objects, 56.0GiB
    usage:   163GiB used, 3.99TiB / 4.15TiB avail
    pgs:     128 active+clean

  io:
    client:  2.31KiB/s wr, 0op/s rd, 0op/s wr

I'm on a fully up-to-date PVE 5.4 (all three nodes):

root at px-bravo-cluster:~# pveversion -v
proxmox-ve: 5.4-2 (running kernel: 4.15.18-20-pve)
pve-manager: 5.4-13 (running version: 5.4-13/aee6f0ec)
pve-kernel-4.15: 5.4-8
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-19-pve: 4.15.18-45
ceph: 12.2.12-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-12
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-54
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-14
libpve-storage-perl: 5.0-44
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-6
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-38
pve-container: 2.0-40
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-7
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-4
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-54
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2

> Can you create the image by hand (rbd -p rbd create vm-112-disk-0 --size
> 1G)? And (rbd -p rbd rm vm-112-disk-0) for delete, ofc.

root at px-bravo-cluster:~# rbd -p vdisks create vm-112-disk-0 --size 1G
rbd: create error: (17) File exists
2019-09-06 11:35:20.943998 7faf704660c0 -1 librbd: rbd image vm-112-disk-0 already exists

root at px-bravo-cluster:~# rbd -p vdisks create test --size 1G

root at px-bravo-cluster:~# rbd -p vdisks ls
test
vm-106-disk-0
vm-113-disk-0
vm-113-disk-1
vm-113-disk-2
vm-118-disk-0
vm-119-disk-0
vm-120-disk-0
vm-125-disk-0
vm-125-disk-1

root at px-bravo-cluster:~# rbd -p vdisks rm test
Removing image: 100% complete...done.

root at px-bravo-cluster:~# rbd -p vdisks rm vm-112-disk-0
2019-09-06 11:36:07.570749 7eff7cff9700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (2) No such file or directory
Removing image: 0% complete...failed.
rbd: delete error: (2) No such file or directory > >> >> Here is the relevant part of my storage.cfg: >> >> ######## >> nfs: aurel-cluster1-VMs >> export /backup/proxmox-infra/VMs >> path /mnt/pve/aurel-cluster1-VMs >> server X.X.X.X >> content images >> options vers=4.2 >> >> >> rbd: vdisks_vm >> content images >> krbd 0 >> pool vdisks >> ######## > Is this the complete storage.cfg? No, only the parts that are relevant for this particular move. Here's the complete file: ######## rbd: vdisks_vm content images krbd 0 pool vdisks dir: local-hdd path /mnt/local content images,iso nodes px-alpha-cluster,px-bravo-cluster,px-charlie-cluster shared 0 nfs: aurel-cluster1-daily export /backup/proxmox-infra/daily path /mnt/pve/aurel-cluster1-daily server X.X.X.X content backup maxfiles 30 options vers=4.2 nfs: aurel-cluster1-weekly export /backup/proxmox-infra/weekly path /mnt/pve/aurel-cluster1-weekly server X.X.X.X content backup maxfiles 30 options vers=4.2 nfs: aurel-cluster1-VMs export /backup/proxmox-infra/VMs path /mnt/pve/aurel-cluster1-VMs server X.X.X.X content images options vers=4.2 nfs: aurel-cluster2-daily export /backup/proxmox-infra2/daily path /mnt/pve/aurel-cluster2-daily server X.X.X.X content backup maxfiles 30 options vers=4.2 nfs: aurel-cluster2-weekly export /backup/proxmox-infra2/weekly path /mnt/pve/aurel-cluster2-weekly server X.X.X.X content backup maxfiles 30 options vers=4.2 nfs: aurel-cluster2-VMs export /backup/proxmox-infra2/VMs path /mnt/pve/aurel-cluster2-VMs server X.X.X.X content images options vers=4.2 dir: local path /var/lib/vz content snippets,vztmpl,images,rootdir,iso maxfiles 0 rbd: vdisks_cluster2 content images krbd 0 monhost px-golf-cluster, px-hotel-cluster, px-india-cluster pool vdisks username admin ######## Thanks, Uwe > -- > Cheers, > Alwin > From mark at openvs.co.uk Fri Sep 6 12:09:16 2019 From: mark at openvs.co.uk (Mark Adams) Date: Fri, 6 Sep 2019 13:09:16 +0300 Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph In-Reply-To: <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com> References: <20190906093255.GA2458639@dona.proxmox.com> <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com> Message-ID: Is it potentially an issue with having the same pool name on 2 different ceph clusters? is there a vm-112-disk-0 on vdisks_cluster2? On Fri, 6 Sep 2019, 12:45 Uwe Sauter, wrote: > Hello Alwin, > > Am 06.09.19 um 11:32 schrieb Alwin Antreich: > > Hello Uwe, > > > > On Fri, Sep 06, 2019 at 10:41:18AM +0200, Uwe Sauter wrote: > >> Hi, > >> > >> I'm having trouble moving a disk image to Ceph. Moving between local > disks and NFS share is working. > >> > >> The error given is: > >> > >> ######## > >> create full clone of drive scsi0 > (aurel-cluster1-VMs:112/vm-112-disk-0.qcow2) > >> rbd: create error: (17) File exists > >> TASK ERROR: storage migration failed: error with cfs lock > 'storage-vdisks_vm': rbd create vm-112-disk-0' error: rbd: create error: > >> (17) File exists > >> ######## > > Can you see anything in the ceph logs? And on what version (pveversion > > -v) are you on? > > Nothing obvious in the logs. 
The cluster is healthy > > root at px-bravo-cluster:~# ceph status > cluster: > id: 982484e6-69bf-490c-9b3a-942a179e759b > health: HEALTH_OK > > services: > mon: 3 daemons, quorum > px-alpha-cluster,px-bravo-cluster,px-charlie-cluster > mgr: px-alpha-cluster(active), standbys: px-bravo-cluster, > px-charlie-cluster > osd: 9 osds: 9 up, 9 in > > data: > pools: 1 pools, 128 pgs > objects: 14.76k objects, 56.0GiB > usage: 163GiB used, 3.99TiB / 4.15TiB avail > pgs: 128 active+clean > > io: > client: 2.31KiB/s wr, 0op/s rd, 0op/s wr > > I'm on a fully up-to-date PVE 5.4 (all three nodes). > > root at px-bravo-cluster:~# pveversion -v > proxmox-ve: 5.4-2 (running kernel: 4.15.18-20-pve) > pve-manager: 5.4-13 (running version: 5.4-13/aee6f0ec) > pve-kernel-4.15: 5.4-8 > pve-kernel-4.15.18-20-pve: 4.15.18-46 > pve-kernel-4.15.18-19-pve: 4.15.18-45 > ceph: 12.2.12-pve1 > corosync: 2.4.4-pve1 > criu: 2.11.1-1~bpo90 > glusterfs-client: 3.8.8-1 > ksm-control-daemon: 1.2-2 > libjs-extjs: 6.0.1-2 > libpve-access-control: 5.1-12 > libpve-apiclient-perl: 2.0-5 > libpve-common-perl: 5.0-54 > libpve-guest-common-perl: 2.0-20 > libpve-http-server-perl: 2.0-14 > libpve-storage-perl: 5.0-44 > libqb0: 1.0.3-1~bpo9 > lvm2: 2.02.168-pve6 > lxc-pve: 3.1.0-6 > lxcfs: 3.0.3-pve1 > novnc-pve: 1.0.0-3 > proxmox-widget-toolkit: 1.0-28 > pve-cluster: 5.0-38 > pve-container: 2.0-40 > pve-docs: 5.4-2 > pve-edk2-firmware: 1.20190312-1 > pve-firewall: 3.0-22 > pve-firmware: 2.0-7 > pve-ha-manager: 2.0-9 > pve-i18n: 1.1-4 > pve-libspice-server1: 0.14.1-2 > pve-qemu-kvm: 3.0.1-4 > pve-xtermjs: 3.12.0-1 > qemu-server: 5.0-54 > smartmontools: 6.5+svn4324-1 > spiceterm: 3.0-5 > vncterm: 1.5-3 > zfsutils-linux: 0.7.13-pve1~bpo2 > > > > >> > >> but this is not true: > >> > >> ######## > >> root at px-bravo-cluster:~# rbd -p vdisks ls > >> vm-106-disk-0 > >> vm-113-disk-0 > >> vm-113-disk-1 > >> vm-113-disk-2 > >> vm-118-disk-0 > >> vm-119-disk-0 > >> vm-120-disk-0 > >> vm-125-disk-0 > >> vm-125-disk-1 > >> ######## > > Can you create the image by hand (rbd -p rbd create vm-112-disk-0 --size > > 1G)? And (rbd -p rbd rm vm-112-disk-0) for delete, ofc. > > root at px-bravo-cluster:~# rbd -p vdisks create vm-112-disk-0 --size 1G > rbd: create error: (17) File exists > 2019-09-06 11:35:20.943998 7faf704660c0 -1 librbd: rbd image vm-112-disk-0 > already exists > > root at px-bravo-cluster:~# rbd -p vdisks create test --size 1G > > root at px-bravo-cluster:~# rbd -p vdisks ls > test > vm-106-disk-0 > vm-113-disk-0 > vm-113-disk-1 > vm-113-disk-2 > vm-118-disk-0 > vm-119-disk-0 > vm-120-disk-0 > vm-125-disk-0 > vm-125-disk-1 > > root at px-bravo-cluster:~# rbd -p vdisks rm test > Removing image: 100% complete...done. > > root at px-bravo-cluster:~# rbd -p vdisks rm vm-112-disk-0 > 2019-09-06 11:36:07.570749 7eff7cff9700 -1 librbd::image::OpenRequest: > failed to retreive immutable metadata: (2) No such file or > directory > Removing image: 0% complete...failed. > rbd: delete error: (2) No such file or directory > > > > > >> > >> Here is the relevant part of my storage.cfg: > >> > >> ######## > >> nfs: aurel-cluster1-VMs > >> export /backup/proxmox-infra/VMs > >> path /mnt/pve/aurel-cluster1-VMs > >> server X.X.X.X > >> content images > >> options vers=4.2 > >> > >> > >> rbd: vdisks_vm > >> content images > >> krbd 0 > >> pool vdisks > >> ######## > > Is this the complete storage.cfg? > > No, only the parts that are relevant for this particular move. 
Here's the complete file:
> ########
> rbd: vdisks_vm
>         content images
>         krbd 0
>         pool vdisks
>
> dir: local-hdd
>         path /mnt/local
>         content images,iso
>         nodes px-alpha-cluster,px-bravo-cluster,px-charlie-cluster
>         shared 0
>
> nfs: aurel-cluster1-daily
>         export /backup/proxmox-infra/daily
>         path /mnt/pve/aurel-cluster1-daily
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
>
> nfs: aurel-cluster1-weekly
>         export /backup/proxmox-infra/weekly
>         path /mnt/pve/aurel-cluster1-weekly
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
>
> nfs: aurel-cluster1-VMs
>         export /backup/proxmox-infra/VMs
>         path /mnt/pve/aurel-cluster1-VMs
>         server X.X.X.X
>         content images
>         options vers=4.2
>
> nfs: aurel-cluster2-daily
>         export /backup/proxmox-infra2/daily
>         path /mnt/pve/aurel-cluster2-daily
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
>
> nfs: aurel-cluster2-weekly
>         export /backup/proxmox-infra2/weekly
>         path /mnt/pve/aurel-cluster2-weekly
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
>
> nfs: aurel-cluster2-VMs
>         export /backup/proxmox-infra2/VMs
>         path /mnt/pve/aurel-cluster2-VMs
>         server X.X.X.X
>         content images
>         options vers=4.2
>
> dir: local
>         path /var/lib/vz
>         content snippets,vztmpl,images,rootdir,iso
>         maxfiles 0
>
> rbd: vdisks_cluster2
>         content images
>         krbd 0
>         monhost px-golf-cluster, px-hotel-cluster, px-india-cluster
>         pool vdisks
>         username admin
> ########
>
> Thanks,
>
> Uwe

From uwe.sauter.de at gmail.com Fri Sep 6 12:22:28 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 6 Sep 2019 12:22:28 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To:
References: <20190906093255.GA2458639@dona.proxmox.com> <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>
Message-ID: <339bb0bc-303b-5fb3-cf4f-48143a9709d4@gmail.com>

On 06.09.19 at 12:09, Mark Adams wrote:
> Is it potentially an issue with having the same pool name on 2 different ceph clusters?

Good catch.

> is there a vm-112-disk-0 on vdisks_cluster2?

No, but disabling the second Ceph in the storage settings allowed the move to succeed. I'll need to think about naming then.

But this keeps me wondering why it only failed for one VM while the other six I moved today caused no problems.

Thank you.

Regards,

Uwe

> [...]

From a.antreich at proxmox.com Fri Sep 6 12:32:48 2019
From: a.antreich at proxmox.com (Alwin Antreich)
Date: Fri, 6 Sep 2019 12:32:48 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To: <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>
References: <20190906093255.GA2458639@dona.proxmox.com> <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>
Message-ID: <20190906103248.GB2458639@dona.proxmox.com>

On Fri, Sep 06, 2019 at 11:44:10AM +0200, Uwe Sauter wrote:
> root at px-bravo-cluster:~# rbd -p vdisks create vm-112-disk-0 --size 1G
> rbd: create error: (17) File exists
> 2019-09-06 11:35:20.943998 7faf704660c0 -1 librbd: rbd image vm-112-disk-0 already exists
> [...]
> root at px-bravo-cluster:~# rbd -p vdisks rm vm-112-disk-0
> 2019-09-06 11:36:07.570749 7eff7cff9700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (2) No such file or directory
> Removing image: 0% complete...failed.
> rbd: delete error: (2) No such file or directory

Seems Ceph has the vm-112-disk-0 still stored somewhere. At least this error message should be visible in the ceph logs. Hopefully there is more info about it.

Does 'rbd showmapped' show the image being mapped? Are you running filestore or bluestore OSDs?
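You can also check directly whether leftover objects for that image are still in the pool; for example (pool and image names taken from your output, the exact object names can differ):

# rbd info vdisks/vm-112-disk-0
# rados -p vdisks ls | grep vm-112

If rados still lists header or data objects for vm-112-disk-0 while the image is not usable via rbd, that would explain the "File exists" on create.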
If you migrate the VM to a different host, does it occur there too?

--
Cheers,
Alwin

From klaus.mailinglists at pernau.at Fri Sep 6 13:32:22 2019
From: klaus.mailinglists at pernau.at (Klaus Darilion)
Date: Fri, 6 Sep 2019 13:32:22 +0200
Subject: [PVE-User] pvesr: how to achieve continuous replication logs
Message-ID:

Hello all!

As far as I can see, pvesr writes the log of the last replication to /var/log/pve/replicate/VMID, and additionally errors are logged to syslog.

For debugging purposes I would like to have the detailed replication log (as in /var/log/pve/replicate/) kept for every replication run, for example by appending to the replication logs or by also sending the replication log to syslog.

Is there a way to keep the replication logs permanently?

Thanks
Klaus

From uwe.sauter.de at gmail.com Fri Sep 6 13:52:57 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 6 Sep 2019 13:52:57 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To: <20190906103248.GB2458639@dona.proxmox.com>
References: <20190906093255.GA2458639@dona.proxmox.com> <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com> <20190906103248.GB2458639@dona.proxmox.com>
Message-ID: <5f7703d8-e941-8111-b66d-b01c3b0a9e29@gmail.com>

On 06.09.19 at 12:32, Alwin Antreich wrote:
> On Fri, Sep 06, 2019 at 11:44:10AM +0200, Uwe Sauter wrote:
>> [...]
>
> Seems Ceph has the vm-112-disk-0 still stored somewhere. At least this
> error message should be visible in the ceph logs. Hopefully there is
> more info about it.
>
> Does 'rbd showmapped' show the image being mapped? Are you running
> filestore or bluestore OSDs?
>
> If you migrate the VM to a different host, does it occur there too?

Neither Ceph cluster had a vm-112-disk-0. I solved it by disabling the second Ceph instance; PVE was then able to move the disk.

Yes, it occurred on all hosts.

Thanks,

Uwe

From kulzaus at kulzaus.top Fri Sep 6 14:57:00 2019
From: kulzaus at kulzaus.top (Milosz Stocki)
Date: Fri, 6 Sep 2019 14:57:00 +0200
Subject: ZFS live migration with HA
Message-ID: <05f001d564b2$94168dc0$bc43a940$@kulzaus.top>

Hi,

Has anyone tested the new-in-GUI option for local disk migration using ZFS in PVE 6?

For me it works up until I enable HA for the VMs in question; it looks like the Proxmox GUI doesn't know that I'm using local disks and won't let me migrate. When trying failover to the other server it starts well with the replicas, but it cannot seem to restore the VM on the target server.

I even tried running it from the CLI with "qm migrate 10101 main-host --online --with-local-disks --force", but it always (from GUI and CLI) gives me:

"2019-09-06 13:25:33 can't migrate local disk 'ZFS-main-host-SSD:vm-10101-disk-0': can't live migrate attached local disks without with-local-disks option
2019-09-06 13:25:33 ERROR: Failed to sync data - can't migrate VM - check log"

Is it something that the Proxmox team hasn't enabled yet for HA configs, or is it a bug? It seems like the --with-local-disks option isn't even passed to the HA stack, and the error is a bit vague if it's on purpose. I couldn't find it on the bug tracker.

Best Regards
Milosz Stocki

From martin at holub.co.at Tue Sep 10 09:48:57 2019
From: martin at holub.co.at (Martin Holub)
Date: Tue, 10 Sep 2019 09:48:57 +0200
Subject: Nested Virtualization and Live Migration
Message-ID:

Hi,

We activated nested virtualization support and this apparently broke live migration. According to https://www.linux-kvm.org/page/Nested_Guests I think this should work, but I just get "qmp command 'migrate' failed - Nested VMX virtualization does not support live migration yet". Is there something I can do to fix this, or do I have to disable nested virtualization? We are using Proxmox 6 with Debian & Ubuntu guests.

Best
Martin

From d.csapak at proxmox.com Tue Sep 10 10:05:48 2019
From: d.csapak at proxmox.com (Dominik Csapak)
Date: Tue, 10 Sep 2019 10:05:48 +0200
Subject: [PVE-User] Nested Virtualization and Live Migration
In-Reply-To:
References:
Message-ID:

> https://www.linux-kvm.org/page/Nested_Guests

the page is sadly outdated. There are efforts in kernel and qemu to enable real, working live migration of nested machines.

currently qemu decided to disable migration altogether when nesting is enabled and the guest has the vmx/svm flag. [0]

you can try with a cpu model that does not include that flag, but you lose nesting for that machine ofc.

kind regards
Dominik

0: https://github.com/qemu/qemu/commit/d98f26073bebddcd3da0ba1b86c3a34e840c0fb8

From martin at holub.co.at Tue Sep 10 13:56:26 2019
From: martin at holub.co.at (Martin Holub)
Date: Tue, 10 Sep 2019 13:56:26 +0200
Subject: [PVE-User] Nested Virtualization and Live Migration
In-Reply-To:
References:
Message-ID: <41e03e3d016a5620c19b92044377c431aa7ee2e2.camel@holub.co.at>

On Tue, 2019-09-10 at 10:05 +0200, Dominik Csapak wrote:
> the page is sadly outdated. There are efforts in kernel and qemu to
> enable real, working live migration of nested machines.
>
> currently qemu decided to disable migration altogether when nesting
> is enabled and the guest has the vmx/svm flag. [0]
>
> you can try with a cpu model that does not include that flag, but
> you lose nesting for that machine ofc.
>
> 0: https://github.com/qemu/qemu/commit/d98f26073bebddcd3da0ba1b86c3a34e840c0fb8

Hi Dominik,

I see, thanks for the link. I will then disable the nested flag again for now. Maybe someone wants to add that to the Wiki page at [1]?
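For anyone else hitting this, the per-VM alternative Dominik mentions would look roughly like this (the VMID and the choice of kvm64 are just examples; any CPU model without the vmx flag should do):

# qm set 100 --cpu kvm64

followed by a full stop/start of the guest. Live migration then works again, at the cost of nesting inside that guest.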
Best
Martin

[1] https://pve.proxmox.com/wiki/Nested_Virtualization

From f.cuseo at panservice.it Tue Sep 10 20:14:03 2019
From: f.cuseo at panservice.it (Fabrizio Cuseo)
Date: Tue, 10 Sep 2019 20:14:03 +0200 (CEST)
Subject: [PVE-User] Ceph Crush Map
Message-ID: <2103160136.84979.1568139243873.JavaMail.zimbra@zimbra.panservice.it>

Hello.
I want to suggest a new feature for PVE release 6.1 :)

Scenario:
- 3 hosts in a rack in building A
- 3 hosts in a rack in building B
- a dedicated 2 x 10Gbit connection (200 m of fiber)

A single PVE cluster with 6 hosts, and a single Ceph cluster with 6 hosts (each with several OSDs).

I would like to manipulate the crush map so that my pools keep at least 1 copy in each building (so if I have a pool with 3 copies, I need 2 copies on different hosts in building A, and 1 copy on one of the hosts in building B). With this configuration I can obtain full redundancy of my VMs and data (anything wrong with that?).

I can change the crush map, pools and so on manually, but with the GUI this could be much simpler (also when I need to add other servers to my cluster).

Regards, Fabrizio

From fatalerrors at geoffray-levasseur.org Wed Sep 11 17:14:05 2019
From: fatalerrors at geoffray-levasseur.org (Geoffray Levasseur)
Date: Wed, 11 Sep 2019 15:14:05 +0000
Subject: [PVE-User] Ceph and NTP problems on Ryzen
Message-ID: <4cf7609c60787523584b01e878daf845@geoffray-levasseur.org>

Hi,

I have some difficulties on a Ryzen 5 2400G node. I recently switched to the new version 6 with no problems except performance on that machine. The new kernel was supposed to have better support, but it is actually worse. Putting a quad-port Intel 82571EB card in the 16x PCI-Express slot is probably the origin of my troubles.

When IOMMU is activated I get very bad performance on the network card, and the server reboots unexpectedly after a few days.

When I put IOMMU in software mode there are no unexpected reboots anymore, but the bad performance remains.

This makes Ceph extremely slow, and it complains permanently about clock skew. NTP has extreme difficulties doing its job, both on the hosted virtual machines and on the host itself.

I have read a lot about such troubles being fixed for video cards, but I don't think the network card scenario has been addressed by the kernel developers. Is there any workaround for my situation?

Regards,
--
Geoffray Levasseur
Technicien UPS - UMR CNRS 5566 / LEGOS - Service Informatique
http://www.geoffray-levasseur.org/
GNU/PG public key : C89E D6C4 8BFC C9F2 EEFB 908C 89C2 CD4D CD9E 23AA
Quod gratis asseritur gratis negatur.

From lists at merit.unu.edu Wed Sep 11 20:55:50 2019
From: lists at merit.unu.edu (mj)
Date: Wed, 11 Sep 2019 20:55:50 +0200
Subject: [PVE-User] Ceph and NTP problems on Ryzen
In-Reply-To: <4cf7609c60787523584b01e878daf845@geoffray-levasseur.org>
References: <4cf7609c60787523584b01e878daf845@geoffray-levasseur.org>
Message-ID: <222decdb-3cf1-6dd1-4ce1-3a823e277d5b@merit.unu.edu>

Hi,

Not sure if this would solve your problem, but we used to have clock skews all the time. We finally switched to chrony, and ever since they have disappeared. So it seems (with us, anyway) chrony does a much better job than ntp.

But it seems your problems are much bigger, and your ntp issues are probably only a symptom.

Good luck!
MJ

On 9/11/19 5:14 PM, Geoffray Levasseur wrote:
> [...]

From mike at oeg.com.au Thu Sep 12 08:18:34 2019
From: mike at oeg.com.au (Mike O'Connor)
Date: Thu, 12 Sep 2019 15:48:34 +0930
Subject: [PVE-User] LXC not starting after V5 to V6 upgrade using ZFS for storage
Message-ID: <5b6baf95-4335-d725-5aea-9792b5df3e1a@oeg.com.au>

Hi all

I just finished upgrading from V5 to V6 of Proxmox and have an issue with LXCs not starting.

The issue seems to be that the LXC is being started without first mounting the ZFS subvolume. This results in a dev directory being created, which then means ZFS will not mount over it anymore because there are files in the mount point.

Mount does not work:

root at pve:/rbd# zfs mount rbd/subvol-109-disk-0
cannot mount '/rbd/subvol-109-disk-0': directory is not empty

There is a directory created by the attempt to start the LXC:

root at pve:/rbd# find /rbd/subvol-109*/
/rbd/subvol-109-disk-0/
/rbd/subvol-109-disk-0/dev

Remove the directory:

rm -rf /rbd/subvol-109-disk-0/dev/

Mount the subvolume again:

root at pve:/rbd# zfs mount rbd/subvol-109-disk-0

I can then start the LXC from the web page or via the CLI.

Questions:
What mounts the ZFS subvol?
Is this a ZFS issue of not mounting the subvol at boot?
Should Proxmox be mounting the image?
Should Proxmox be checking that it is mounted before starting the LXC?

I've been able to start the LXCs, but after a reboot it seems I have to manually fix the mounts again.

Thanks

From daniel at speichert.pl Thu Sep 12 15:46:16 2019
From: daniel at speichert.pl (Daniel Speichert)
Date: Thu, 12 Sep 2019 09:46:16 -0400
Subject: [PVE-User] LXC not starting after V5 to V6 upgrade using ZFS for storage
In-Reply-To: <5b6baf95-4335-d725-5aea-9792b5df3e1a@oeg.com.au>
References: <5b6baf95-4335-d725-5aea-9792b5df3e1a@oeg.com.au>
Message-ID:

I've had a similar problem. It was worse because I had to unmount everything up to the root.

I think I set the datasets for machines to automount by setting a mountpoint attribute that was missing before. I can't recall if that was it, though. Do you have it set?

zfs get mountpoint /rbd/...

Best,
Daniel

On 9/12/2019 2:18 AM, Mike O'Connor wrote:
> [...]

From mike at oeg.com.au Fri Sep 13 10:56:09 2019
From: mike at oeg.com.au (Mike O'Connor)
Date: Fri, 13 Sep 2019 18:26:09 +0930
Subject: [PVE-User] LXC not starting after V5 to V6 upgrade using ZFS for storage
In-Reply-To:
References: <5b6baf95-4335-d725-5aea-9792b5df3e1a@oeg.com.au>
Message-ID: <857a34e4-2c73-a281-58a4-47a2e987d7a1@oeg.com.au>

On 12/9/19 11:16 pm, Daniel Speichert wrote:
> I've had a similar problem. It was worse because I had to unmount
> everything up to the root.
>
> I think I set the datasets for machines to automount by setting a
> mountpoint attribute that was missing before. I can't recall if that
> was it, though. Do you have it set?
>
> zfs get mountpoint /rbd/...
>
> Best,
> Daniel

Hi Daniel

Thanks for your comments, but this was not the issue; all the subvols have mount points.

BUT I did find the issue: I had an fstab entry which was doing a bind mount from the rbd pool to a normal path location. This was causing the zfs mount service to fail, because it thought the rbd dataset had files in it. I removed the entry by changing the service to use the rbd directory directly instead of a bind mount.

Cheers
Mike

From f.cuseo at panservice.it Fri Sep 13 21:42:06 2019
From: f.cuseo at panservice.it (Fabrizio Cuseo)
Date: Fri, 13 Sep 2019 21:42:06 +0200 (CEST)
Subject: [PVE-User] Ceph MON quorum problem
Message-ID: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>

Hello.
I am planning a 6 host cluster.

3 hosts are located in the CedA room
3 hosts are located in the CedB room

The two rooms are connected with 2 x 10Gbit fiber (200 m), and in each room I have 2 x 10Gbit stacked switches; each host has 2 x 10Gbit links (one to each switch) for Ceph storage.

My need is a fully redundant cluster that can survive a CedA (or CedB) disaster.

I have modified the crush map, so I have an RBD pool that writes 2 copies on CedA hosts and 2 copies on CedB hosts, which is very good redundancy (disk space is not a problem).

But if I lose one of the rooms, I can't establish the needed quorum.

Any suggestions for a quick and not too complicated way to satisfy this need?
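For reference, the crush rule I use for that pool looks roughly like this (typed from memory, so names may differ slightly; it assumes room buckets named CedA and CedB in the crush tree and a pool size of 4):

rule replicated_two_rooms {
        id 1
        type replicated
        min_size 2
        max_size 4
        step take CedA
        step chooseleaf firstn 2 type host
        step emit
        step take CedB
        step chooseleaf firstn 2 type host
        step emit
}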
Regards, Fabrizio

From brians at iptel.co Sat Sep 14 15:41:36 2019
From: brians at iptel.co (Brian :)
Date: Sat, 14 Sep 2019 14:41:36 +0100
Subject: Re: [PVE-User] Ceph MON quorum problem
In-Reply-To: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
Message-ID:

Have a mon that runs somewhere that isn't either of those rooms.

On Friday, September 13, 2019, Fabrizio Cuseo wrote:
> [...]

From f.cuseo at panservice.it Sat Sep 14 16:11:50 2019
From: f.cuseo at panservice.it (f.cuseo at panservice.it)
Date: Sat, 14 Sep 2019 16:11:50 +0200 (CEST)
Subject: [PVE-User] Ris: Ceph MON quorum problem
In-Reply-To:
References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
Message-ID: <585861321.182698.1568470310925.JavaMail.zimbra@zimbra.panservice.it>

This is my last choice :)

Sent from my Huawei device

-------- Original message --------
Subject: Re: [PVE-User] Ceph MON quorum problem
From: "Brian :"
To: Fabrizio Cuseo, PVE User List
CC:

Have a mon that runs somewhere that isn't either of those rooms.

On Friday, September 13, 2019, Fabrizio Cuseo wrote:
> [...]

From jamacdon at hwy97.com Sun Sep 15 22:55:00 2019
From: jamacdon at hwy97.com (Joe Garvey)
Date: Sun, 15 Sep 2019 13:55:00 -0700 (PDT)
Subject: [PVE-User] Empty virtual disk
Message-ID: <1584392794.132399.1568580900736.JavaMail.zimbra@talktel.ca>

Hello all,

I had to reboot a QEMU-based VM yesterday, and after rebooting it reported there was no boot disk. The disk has lost all of its content; there aren't even any partitions.
I booted the VM with Acronis disk recovery and it showed the disk as uninitialized.

I restored a 6-day-old backup of the VM and it also had an empty drive. All backups have no data on the drive and are marked as uninitialized.

I tested restoring other VMs in my environment and they have no issues with their disks. The only difference I see is that those drives are smaller.

The VM in question had been running flawlessly since it was deployed over 30 days ago.

Proxmox version: 5.4-2
SCSI controller: VirtIO SCSI
Disk size: 200G
Caching is disabled
Storage: Dell NAS server via iSCSI, 4x10GB connections (never had a problem with this)

I'm guessing the data is gone, but any ideas what caused this?

Regards,
Joe

From gilberto.nunes32 at gmail.com Mon Sep 16 03:17:11 2019
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Sun, 15 Sep 2019 22:17:11 -0300
Subject: [PVE-User] Kernel 5.3 and Proxmox Ceph nodes
Message-ID:

Hi there

I read this about kernel 5.3 and ceph, and I am curious...
I have a 6-node Proxmox Ceph cluster with Luminous...
Would it be a good idea to use kernel 5.3 from here:

https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/

---
Gilberto Nunes Ferreira
(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram
Skype: gilberto.nunes36

From lcaron at unix-scripts.info Mon Sep 16 09:55:34 2019
From: lcaron at unix-scripts.info (Laurent CARON)
Date: Mon, 16 Sep 2019 09:55:34 +0200
Subject: [PVE-User] Recurring crashes after cluster upgrade from 5 to 6
Message-ID:

Hi,

After upgrading our 4-node cluster from PVE 5 to 6, we experience constant crashes (once every 2 days). Those crashes seem related to corosync.

Since numerous users are reporting such issues (broken cluster after upgrade, instabilities, ...), I wonder if it is possible to downgrade corosync to version 2.4.4 without impacting functionality?

Basic steps would be:

On all nodes:
# systemctl stop pve-ha-lrm

Once done, on all nodes:
# systemctl stop pve-ha-crm

Once done, on all nodes:
# apt-get install corosync=2.4.4-pve1 libcorosync-common4=2.4.4-pve1 libcmap4=2.4.4-pve1 libcpg4=2.4.4-pve1 libqb0=1.0.3-1~bpo9 libquorum5=2.4.4-pve1 libvotequorum8=2.4.4-pve1

Then, once corosync has been downgraded, on all nodes:
# systemctl start pve-ha-lrm
# systemctl start pve-ha-crm

Would that work?

Thanks

From ronny+pve-user at aasen.cx Mon Sep 16 10:50:39 2019
From: ronny+pve-user at aasen.cx (Ronny Aasen)
Date: Mon, 16 Sep 2019 10:50:39 +0200
Subject: [PVE-User] Kernel 5.3 and Proxmox Ceph nodes
In-Reply-To:
References:
Message-ID:

On 16.09.2019 03:17, Gilberto Nunes wrote:
> Hi there
>
> I read this about kernel 5.3 and ceph, and I am curious...
> I have a 6-node Proxmox Ceph cluster with Luminous...
> Would it be a good idea to use kernel 5.3 from here:
>
> https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/

You read "this"? This what, exactly?

Generally, unless you have a problem you need fixed, I would run the kernels from proxmox.
Ronny From ronny+pve-user at aasen.cx Mon Sep 16 12:27:16 2019 From: ronny+pve-user at aasen.cx (Ronny Aasen) Date: Mon, 16 Sep 2019 12:27:16 +0200 Subject: [PVE-User] Empty virtual disk In-Reply-To: <1584392794.132399.1568580900736.JavaMail.zimbra@talktel.ca> References: <1584392794.132399.1568580900736.JavaMail.zimbra@talktel.ca> Message-ID: <1725d4ec-6465-9918-cf93-3b75379f89d2@aasen.cx> On 15.09.2019 22:55, Joe Garvey wrote: > Hello all, > > I had to reboot a QEMU-based VM yesterday and after rebooting it reported there was no boot disk. The disk has lost all of its content. There aren't even any partitions. I booted the VM with Acronis disk recovery and it showed the disk as uninitialized. > > I restored a 6-day-old backup of the VM and it also had an empty drive. All backups have no data in the drive and are marked as uninitialized. > > I tested restoring other VMs in my environment and they have no issues with their disks. > > The only difference I see is that those drives are smaller. > > The VM in question had been running flawlessly since it was deployed over 30 days ago. > > > Proxmox version: 5.4-2 > SCSI controller: VirtIO SCSI > Disk Size: 200G > Caching is Disabled > Storage: Dell NAS server via iSCSI 4x10GB connections (never had a problem with this) > > I'm guessing the data is gone, but any ideas what has caused this? > > Regards, > Joe > > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > That is a tricky one; wild speculation follows. First of all: do not write more data to unallocated blocks on that disk. You may be able to recover the lost qcow2 image using testdisk. A wild guess is that someone, some time ago, did an accidental rm -rf /path/to/qemu-images. This deleted all the qemu images, but they are not really gone until the last process holding an image open closes it. So when the VM was shut down and restarted, qemu closed the file, it was removed for good, and an empty file was created when qemu started again. You can look for deleted but still active files with: # find /proc/*/fd -ls | grep '(deleted)' Basically, all these files will be gone once the process holding them stops. You can try to copy a running VM's image with (1722 being the qemu PID and <N> the fd number from the find output above): # cp /proc/1722/fd/<N> /somewhere-not-on-same-disk This makes inconsistent copies, which you can try to fix with: # qemu-img check /somewhere-not-on-same-disk After that I would try to do a disk move of the running VM. That may or may not fail, and perhaps crash the VM (and lose the disk file), but if it works it would probably make a consistent copy, which would be better than the inconsistent manual copy. Regarding your first lost disk image, I would try testdisk to recover the lost qcow2 image. I would focus on the currently running VMs first. And I would copy the original disk to a different host for running testdisk on it, something like: # dd_rescue /dev/qcow2_disk - | ssh user at some-host "cat - > /large/storage/recovery-file.dump" then run testdisk on that image file, trying to recover the qcow2 file.
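testdisk itself is interactive; roughly (file name from my example above, menu labels may differ a bit between versions):

# testdisk /large/storage/recovery-file.dump

then select the disk, pick the partition table type (probably Intel, or EFI GPT), go to Advanced -> Undelete on the filesystem that held the images, and copy the qcow2 file out with 'c' to yet another disk. If the directory entry has already been reused, photorec from the same package may still be able to carve the image out by its header.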
Good luck, Ronny Aasen From gilberto.nunes32 at gmail.com Mon Sep 16 12:53:56 2019 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Mon, 16 Sep 2019 07:53:56 -0300 Subject: [PVE-User] Kernel 5.3 and Proxmox Ceph nodes In-Reply-To: References: Message-ID: Oh! I'm sorry! I didn't send the link I was referring to: https://www.phoronix.com/scan.php?page=news_item&px=Ceph-Linux-5.3-Changes --- Gilberto Nunes Ferreira (47) 3025-5907 (47) 99676-7530 - Whatsapp / Telegram Skype: gilberto.nunes36 On Mon, 16 Sep 2019 at 05:50, Ronny Aasen wrote: > On 16.09.2019 03:17, Gilberto Nunes wrote: > > Hi there > > > > I read this about kernel 5.3 and Ceph, and I am curious... > > I have a 6-node Proxmox Ceph cluster with Luminous... > > Would it be a good idea to use kernel 5.3 from here: > > > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/ > > --- > > Gilberto Nunes Ferreira > > > > (47) 3025-5907 > > (47) 99676-7530 - Whatsapp / Telegram > > > > Skype: gilberto.nunes36 > > _______________________________________________ > > pve-user mailing list > > pve-user at pve.proxmox.com > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > > You read "this"? > This what, exactly? > > > Generally, unless you have a problem you need fixed, I would run the > kernels from Proxmox. > > Ronny > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From humbertos at ifsc.edu.br Mon Sep 16 12:58:17 2019 From: humbertos at ifsc.edu.br (Humberto Jose De Sousa) Date: Mon, 16 Sep 2019 07:58:17 -0300 (BRT) Subject: [PVE-User] Ceph MON quorum problem In-Reply-To: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it> References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it> Message-ID: <2114286804.38162994.1568631497769.JavaMail.zimbra@ifsc.edu.br> Hi. You could try the qdevice: https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support Humberto From: "Fabrizio Cuseo" To: "pve-user" Sent: Friday, 13 September 2019 16:42:06 Subject: [PVE-User] Ceph MON quorum problem Hello. I am planning a 6-host cluster. 3 hosts are located in the CedA room 3 hosts are located in the CedB room the two rooms are connected with 2 x 10Gbit fiber (200m), and in each room I have 2 x 10Gbit stacked switches, and each host has 2 x 10Gbit links (one for each switch) for Ceph storage. My need is to have a fully redundant cluster that can survive a CedA (or CedB) disaster. I have modified the crush map, so I have an RBD pool that writes 2 copies on CedA hosts and 2 copies on CedB hosts, so very good redundancy (disk space is not a problem). But if I lose one of the rooms, I can't establish the needed quorum. Any suggestion for a quick and not too complicated way to satisfy this need? Regards, Fabrizio _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From r.correa.r at gmail.com Mon Sep 16 13:04:41 2019 From: r.correa.r at gmail.com (Ricardo Correa) Date: Mon, 16 Sep 2019 11:04:41 +0000 Subject: [PVE-User] Kernel 5.3 and Proxmox Ceph nodes In-Reply-To: References: Message-ID: Another 5.3 fix that might be interesting for some is https://github.com/lxc/lxd/issues/5193#issuecomment-502857830 which allows (or takes us one step closer to) running a kubelet in LXC containers. On 16.09.19, 12:55, "pve-user on behalf of Gilberto Nunes" wrote: Oh! I'm sorry!
I didn't send the link I was referring to: https://www.phoronix.com/scan.php?page=news_item&px=Ceph-Linux-5.3-Changes --- Gilberto Nunes Ferreira (47) 3025-5907 (47) 99676-7530 - Whatsapp / Telegram Skype: gilberto.nunes36 On Mon, 16 Sep 2019 at 05:50, Ronny Aasen wrote: > On 16.09.2019 03:17, Gilberto Nunes wrote: > > Hi there > > > > I read this about kernel 5.3 and Ceph, and I am curious... > > I have a 6-node Proxmox Ceph cluster with Luminous... > > Would it be a good idea to use kernel 5.3 from here: > > > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/ > > --- > > Gilberto Nunes Ferreira > > > > (47) 3025-5907 > > (47) 99676-7530 - Whatsapp / Telegram > > > > Skype: gilberto.nunes36 > > _______________________________________________ > > pve-user mailing list > > pve-user at pve.proxmox.com > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > > You read "this"? > This what, exactly? > > > Generally, unless you have a problem you need fixed, I would run the > kernels from Proxmox. > > Ronny > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From f.cuseo at panservice.it Mon Sep 16 14:24:36 2019 From: f.cuseo at panservice.it (Fabrizio Cuseo) Date: Mon, 16 Sep 2019 14:24:36 +0200 (CEST) Subject: [PVE-User] Ceph MON quorum problem In-Reply-To: <2114286804.38162994.1568631497769.JavaMail.zimbra@ifsc.edu.br> References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it> <2114286804.38162994.1568631497769.JavaMail.zimbra@ifsc.edu.br> Message-ID: <176763712.201083.1568636676598.JavaMail.zimbra@zimbra.panservice.it> Thank you Humberto, but my problem is not related to the Proxmox quorum, but the Ceph mon quorum. Regards, Fabrizio ----- On 16 Sep 19, at 12:58, Humberto Jose De Sousa humbertos at ifsc.edu.br wrote: > Hi. > You could try the qdevice: > https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support > Humberto > From: "Fabrizio Cuseo" > To: "pve-user" > Sent: Friday, 13 September 2019 16:42:06 > Subject: [PVE-User] Ceph MON quorum problem > Hello. > I am planning a 6-host cluster. > 3 hosts are located in the CedA room > 3 hosts are located in the CedB room > the two rooms are connected with 2 x 10Gbit fiber (200m), and in each room I > have 2 x 10Gbit stacked switches, and each host has 2 x 10Gbit links (one for each > switch) for Ceph storage. > My need is to have a fully redundant cluster that can survive a CedA (or CedB) > disaster. > I have modified the crush map, so I have an RBD pool that writes 2 copies on CedA > hosts and 2 copies on CedB hosts, so very good redundancy (disk space is not > a problem). > But if I lose one of the rooms, I can't establish the needed quorum. > Any suggestion for a quick and not too complicated way to satisfy this need?
> Regards, Fabrizio > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user -- --- Fabrizio Cuseo - mailto:f.cuseo at panservice.it Direzione Generale - Panservice InterNetWorking Servizi Professionali per Internet ed il Networking Panservice e' associata AIIP - RIPE Local Registry Phone: +39 0773 410020 - Fax: +39 0773 470219 http://www.panservice.it mailto:info at panservice.it Numero verde nazionale: 800 901492 From ronny+pve-user at aasen.cx Mon Sep 16 14:49:06 2019 From: ronny+pve-user at aasen.cx (Ronny Aasen) Date: Mon, 16 Sep 2019 14:49:06 +0200 Subject: [PVE-User] Ris: Ceph MON quorum problem In-Reply-To: <585861321.182698.1568470310925.JavaMail.zimbra@zimbra.panservice.it> References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it> <585861321.182698.1568470310925.JavaMail.zimbra@zimbra.panservice.it> Message-ID: With 2 rooms there is no way to avoid a split-brain situation unless you have a tie breaker outside one of those 2 rooms. Running a mon at a neutral third location is the quick, correct, and simple solution. Or you need a master-slave setup where one room is the master (3 mons) and the other room is the slave (2 mons); the slave cannot operate without the master, but the master can operate alone.
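To illustrate the first option (host names are made up): with one small VM or micro-server at a third site, you end up with 5 mons in total, and the quorum arithmetic works out for any single-site failure:

  CedA:       mon.pve-a1  mon.pve-a2
  CedB:       mon.pve-b1  mon.pve-b2
  third site: mon.tie

  lose CedA       -> 3 of 5 mons left, quorum holds
  lose CedB       -> 3 of 5 mons left, quorum holds
  lose third site -> 4 of 5 mons left, quorum holds

The tie-breaker mon carries no OSD traffic, so a modest link to the third site should be enough.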
Good luck, Ronny On 14.09.2019 16:11, f.cuseo at panservice.it wrote: > This is my last choice :) > Sent from my Huawei device > -------- Original message -------- > Subject: Re: [PVE-User] Ceph MON quorum problem > From: "Brian :" > To: Fabrizio Cuseo ,PVE User List > CC: > > > Have a mon that runs somewhere that isn't either of those rooms. > > On Friday, September 13, 2019, Fabrizio Cuseo wrote: >> Hello. >> I am planning a 6-host cluster. >> >> 3 hosts are located in the CedA room >> 3 hosts are located in the CedB room >> >> the two rooms are connected with 2 x 10Gbit fiber (200m), and in each > room I have 2 x 10Gbit stacked switches, and each host has 2 x 10Gbit links (one > for each switch) for Ceph storage. >> >> My need is to have a fully redundant cluster that can survive a CedA (or > CedB) disaster. >> >> I have modified the crush map, so I have an RBD pool that writes 2 copies > on CedA hosts and 2 copies on CedB hosts, so very good redundancy (disk > space is not a problem). >> >> But if I lose one of the rooms, I can't establish the needed quorum. >> >> Any suggestion for a quick and not too complicated way to satisfy this > need? >> Regards, Fabrizio >> >> >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From f.cuseo at panservice.it Mon Sep 16 16:02:39 2019 From: f.cuseo at panservice.it (Fabrizio Cuseo) Date: Mon, 16 Sep 2019 16:02:39 +0200 (CEST) Subject: [PVE-User] Ris: Ceph MON quorum problem In-Reply-To: References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it> <585861321.182698.1568470310925.JavaMail.zimbra@zimbra.panservice.it> Message-ID: <1891770806.204452.1568642559576.JavaMail.zimbra@zimbra.panservice.it> My answer follows: ----- On 16 Sep 19, at 14:49, Ronny Aasen ronny+pve-user at aasen.cx wrote: > With 2 rooms there is no way to avoid a split-brain situation unless you > have a tie breaker outside one of those 2 rooms. > > Running a mon at a neutral third location is the quick, correct, and simple > solution. > > Or > > you need a master-slave setup where one room is the master > (3 mons) and the other room is the slave (2 mons); the slave cannot > operate without the master, but the master can operate alone. Yes, I need a master-slave situation, but I need to have the slave running in case of a master fault. So if I have a total of 3 mons (2 on the master, 1 on the slave) and I lose the master, I have only 1 mon available, and I need to create another mon (but I can't create it, because I have no quorum). I know that, for now, the only solution is a third room. Thanks, Fabrizio > > On 14.09.2019 16:11, f.cuseo at panservice.it wrote: >> This is my last choice :) >> Sent from my Huawei device >> -------- Original message -------- >> Subject: Re: [PVE-User] Ceph MON quorum problem >> From: "Brian :" >> To: Fabrizio Cuseo ,PVE User List >> CC: >> >> >> Have a mon that runs somewhere that isn't either of those rooms. >> >> On Friday, September 13, 2019, Fabrizio Cuseo wrote: >>> Hello. >>> I am planning a 6-host cluster. >>> >>> 3 hosts are located in the CedA room >>> 3 hosts are located in the CedB room >>> >>> the two rooms are connected with 2 x 10Gbit fiber (200m), and in each >> room I have 2 x 10Gbit stacked switches, and each host has 2 x 10Gbit links (one >> for each switch) for Ceph storage. >>> >>> My need is to have a fully redundant cluster that can survive a CedA (or >> CedB) disaster. >>> >>> I have modified the crush map, so I have an RBD pool that writes 2 copies >> on CedA hosts and 2 copies on CedB hosts, so very good redundancy (disk >> space is not a problem). >>> >>> But if I lose one of the rooms, I can't establish the needed quorum. >>> >>> Any suggestion for a quick and not too complicated way to satisfy this >> need?
>>> Regards, Fabrizio >>> >>> >>> _______________________________________________ >>> pve-user mailing list >>> pve-user at pve.proxmox.com >>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >>> >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user - From daniel at firewall-services.com Tue Sep 17 18:27:33 2019 From: daniel at firewall-services.com (Daniel Berteaud) Date: Tue, 17 Sep 2019 18:27:33 +0200 (CEST) Subject: [PVE-User] Moving disk with ZFS over iSCSI = IO error Message-ID: <266547431.18105.1568737653631.JavaMail.zimbra@fws.fr> Hi there. I'm working on moving my NFS setup to ZFS over iSCSI. I'm using a CentOS 7.6 box with ZoL 0.8.1, with the LIO backend (but this shouldn't be relevant, see further). For the PVE side, I'm running PVE6 with all updates applied. Except for a few minor issues I found in the LIO backend (for which I sent a patch series earlier today), most things do work nicely. Except one which is important to me: I can't move a disk from ZFS over iSCSI to any other storage. The destination storage type doesn't matter, but the problem is 100% reproducible when the source storage is ZFS over iSCSI. A few seconds after I start the disk move, the guest FS will "panic". For example, with an el7 guest using XFS, I get: kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00 kernel: blk_update_request: I/O error, dev sda, sector 7962536 kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00 kernel: blk_update_request: I/O error, dev sda, sector 7962536 kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 bc 0e 28 00 00 08 00 kernel: blk_update_request: I/O error, dev sda, sector 12324392 And the system completely crashes. The data itself is not impacted: I can restart the guest and everything appears OK. It doesn't matter if I let the disk move operation terminate or if I cancel it. Moving the disk offline works as expected. Sparse or non-sparse zvol backend doesn't matter either. I searched a lot about this issue, and found at least two other people having the same, or a very similar, issue: * One using ZoL but with SCST, see https://sourceforge.net/p/scst/mailman/message/35241011/ * Another using OmniOS, so with Comstar, see https://forum.proxmox.com/threads/storage-iscsi-move-results-to-io-error.38848/ Both are likely running PVE5, so it looks like it's not a recently introduced regression. I also was able to reproduce the issue with a FreeNAS storage, so using ctld.
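For the record, the offline path can be exercised without PVE at all, with a plain qemu-img convert straight from the iSCSI URL (qemu needs to be built against libiscsi; the portal and IQN below are just placeholders):

# qemu-img convert -p -f raw -O qcow2 \
    iscsi://192.0.2.10:3260/iqn.2003-01.org.linux-iscsi.storage.x8664:sn.abcdef123456/0 \
    /tmp/vm-disk-copy.qcow2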
As the issue is present with so many different stacks, I think we can eliminate an issue on the storage side. The problem is most likely in QEMU, in its iSCSI block implementation. The SCST-Devel thread is interesting, but unfortunately it's beyond my skills here. Any advice on how to debug this further? I can reproduce it whenever I want, on a test setup. I'm happy to provide any useful information. Regards, Daniel -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La sécurité des réseaux Société de Services en Logiciels Libres Tél : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From daniel at firewall-services.com Thu Sep 19 07:57:20 2019 From: daniel at firewall-services.com (Daniel Berteaud) Date: Thu, 19 Sep 2019 07:57:20 +0200 (CEST) Subject: [PVE-User] Moving disk with ZFS over iSCSI = IO error In-Reply-To: <266547431.18105.1568737653631.JavaMail.zimbra@fws.fr> References: <266547431.18105.1568737653631.JavaMail.zimbra@fws.fr> Message-ID: <1159576403.22676.1568872640495.JavaMail.zimbra@fws.fr> ----- On 17 Sep 19, at 18:27, Daniel Berteaud wrote: > Hi there. > I'm working on moving my NFS setup to ZFS over iSCSI. I'm using a CentOS 7.6 box > with ZoL 0.8.1, with the LIO backend (but this shouldn't be relevant, see > further). For the PVE side, I'm running PVE6 with all updates applied. > Except for a few minor issues I found in the LIO backend (for which I sent a patch > series earlier today), most things do work nicely. Except one which is important > to me: I can't move a disk from ZFS over iSCSI to any other storage. The destination > storage type doesn't matter, but the problem is 100% reproducible when the > source storage is ZFS over iSCSI. > A few seconds after I start the disk move, the guest FS will "panic". For example, > with an el7 guest using XFS, I get: > kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] > kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated > kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00 > kernel: blk_update_request: I/O error, dev sda, sector 7962536 > kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] > kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated > kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00 > kernel: blk_update_request: I/O error, dev sda, sector 7962536 > kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE > kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] > kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated > kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 bc 0e 28 00 00 08 00 > kernel: blk_update_request: I/O error, dev sda, sector 12324392 > And the system completely crashes. The data itself is not impacted: I can restart > the guest and everything appears OK. It doesn't matter if I let the disk move > operation terminate or if I cancel it. > Moving the disk offline works as expected. > Sparse or non-sparse zvol backend doesn't matter either.
> I searched a lot about this issue, and found at least two other people having > the same, or a very similar, issue: > * One using ZoL but with SCST, see > https://sourceforge.net/p/scst/mailman/message/35241011/ > * Another using OmniOS, so with Comstar, see > https://forum.proxmox.com/threads/storage-iscsi-move-results-to-io-error.38848/ > Both are likely running PVE5, so it looks like it's not a recently introduced > regression. > I also was able to reproduce the issue with a FreeNAS storage, so using ctld. As > the issue is present with so many different stacks, I think we can eliminate an > issue on the storage side. The problem is most likely in QEMU, in its iSCSI > block implementation. > The SCST-Devel thread is interesting, but unfortunately it's beyond my skills > here. > Any advice on how to debug this further? I can reproduce it whenever I want, on > a test setup. I'm happy to provide any useful information. > Regards, Daniel Forgot to mention: when moving a disk offline, from ZFS over iSCSI to something else (in my case to an NFS storage), I do get warnings like this: create full clone of drive scsi0 (zfs-test:vm-132-disk-0) Formatting '/mnt/pve/nfs-dumps/images/132/vm-132-disk-0.qcow2', fmt=qcow2 size=53687091200 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16 transferred: 0 bytes remaining: 53687091200 bytes total: 53687091200 bytes progression: 0.00 % qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 12582909: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 16777212: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 20971515: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 25165818: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 29360121: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 33554424: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 37748727: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 41943030: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 46137333: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 50331636: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 54525939: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 58720242: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 62914545: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 67108848: SENSE KEY:ILLEGAL_REQUEST(5)
ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 71303151: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 75497454: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 79691757: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 83886060: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 88080363: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 92274666: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 96468969: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 100663272: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 104857575: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) transferred: 536870912 bytes remaining: 53150220288 bytes total: 53687091200 bytes progression: 1.00 % transferred: 1079110533 bytes remaining: 52607980667 bytes total: 53687091200 bytes progression: 2.01 % transferred: 1615981445 bytes remaining: 52071109755 bytes total: 53687091200 bytes progression: 3.01 % qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) transferred: 2158221066 bytes remaining: 51528870134 bytes total: 53687091200 bytes progression: 4.02 % transferred: 2695091978 bytes remaining: 50991999222 bytes total: 53687091200 bytes progression: 5.02 % transferred: 3231962890 bytes remaining: 50455128310 bytes total: 53687091200 bytes progression: 6.02 % transferred: 3774202511 bytes remaining: 49912888689 bytes total: 53687091200 bytes progression: 7.03 % qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) transferred: 4311073423 bytes remaining: 49376017777 bytes total: 53687091200 bytes progression: 8.03 % transferred: 4853313044 bytes remaining: 48833778156 bytes total: 53687091200 bytes progression: 9.04 % transferred: 5390183956 bytes remaining: 48296907244 bytes total: 53687091200 bytes progression: 10.04 % transferred: 5927054868 bytes remaining: 47760036332 bytes total: 53687091200 bytes progression: 11.04 % This might well be related to the problem (the same errors, when the VM is running, are reported back to the upper layers, up to the guest FS, which then panics?). When running offline, even with these error messages, the transfer is OK. Cheers, Daniel -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La sécurité des réseaux Société de Services en Logiciels Libres Tél : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From mark at tuxis.nl Thu Sep 19 09:15:17 2019 From: mark at tuxis.nl (Mark Schouten) Date: Thu, 19 Sep 2019 09:15:17 +0200 Subject: [PVE-User] Images on CephFS? Message-ID: Hi, We just built our latest cluster with PVE 6.0. We also offer CephFS 'slow but large' storage with our clusters, on which people can create images for backup servers. However, it seems that in PVE 6.0, we can no longer use CephFS for images? Can anybody confirm (and explain?), or am I looking in the wrong direction?
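If it is really gone, I guess the fallback would be a plain directory storage on top of the CephFS mountpoint, something like this in storage.cfg (storage name and path are made up):

dir: cephfs-images
        path /mnt/pve/cephfs-slow
        content images
        shared 1

But I'd rather first understand whether dropping the images content type was intentional.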
-- Mark Schouten Tuxis, Ede, https://www.tuxis.nl T: +31 318 200208 From daniel at firewall-services.com Fri Sep 20 10:45:33 2019 From: daniel at firewall-services.com (Daniel Berteaud) Date: Fri, 20 Sep 2019 10:45:33 +0200 (CEST) Subject: [PVE-User] Moving disk with ZFS over iSCSI = IO error In-Reply-To: <1159576403.22676.1568872640495.JavaMail.zimbra@fws.fr> References: <266547431.18105.1568737653631.JavaMail.zimbra@fws.fr> <1159576403.22676.1568872640495.JavaMail.zimbra@fws.fr> Message-ID: <1895121826.27120.1568969133081.JavaMail.zimbra@fws.fr> ----- On 19 Sep 19, at 7:57, Daniel Berteaud wrote: > Forgot to mention: when moving a disk offline, from ZFS over iSCSI to something > else (in my case to an NFS storage), I do get warnings like this: > create full clone of drive scsi0 (zfs-test:vm-132-disk-0) > Formatting '/mnt/pve/nfs-dumps/images/132/vm-132-disk-0.qcow2', fmt=qcow2 > size=53687091200 cluster_size=65536 preallocation=metadata lazy_refcounts=off > refcount_bits=16 > transferred: 0 bytes remaining: 53687091200 bytes total: 53687091200 bytes > progression: 0.00 % > qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5) > ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 12582909: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 16777212: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 20971515: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > [...]
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 83886060: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 88080363: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 92274666: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 96468969: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 100663272: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 104857575: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5) > ASCQ:INVALID_FIELD_IN_CDB(0x2400) > transferred: 536870912 bytes remaining: 53150220288 bytes total: 53687091200 > bytes progression: 1.00 % > transferred: 1079110533 bytes remaining: 52607980667 bytes total: 53687091200 > bytes progression: 2.01 % > transferred: 1615981445 bytes remaining: 52071109755 bytes total: 53687091200 > bytes progression: 3.01 % > qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > transferred: 2158221066 bytes remaining: 51528870134 bytes total: 53687091200 > bytes progression: 4.02 % > transferred: 2695091978 bytes remaining: 50991999222 bytes total: 53687091200 > bytes progression: 5.02 % > transferred: 3231962890 bytes remaining: 50455128310 bytes total: 53687091200 > bytes progression: 6.02 % > transferred: 3774202511 bytes remaining: 49912888689 bytes total: 53687091200 > bytes progression: 7.03 % > qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE > KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) > transferred: 4311073423 bytes remaining: 49376017777 bytes total: 53687091200 > bytes progression: 8.03 % > transferred: 4853313044 bytes remaining: 48833778156 bytes total: 53687091200 > bytes progression: 9.04 % > transferred: 5390183956 bytes remaining: 48296907244 bytes total: 53687091200 > bytes progression: 10.04 % > transferred: 5927054868 bytes remaining: 47760036332 bytes total: 53687091200 > bytes progression: 11.04 % > This might well be related to the problem (the same errors, when the VM is > running, are reported back to the upper layers, up to the guest FS, which then > panics?). > When running offline, even with these error messages, the transfer is OK. Another case which might be related: https://forum.proxmox.com/threads/move-disk-to-a-different-iscsi-target-errors-warning.27313/ -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La sécurité des réseaux Société de Services en Logiciels Libres Tél : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From chris.hofstaedtler at deduktiva.com Fri Sep 20 14:31:17 2019 From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva) Date: Fri, 20 Sep 2019 14:31:17 +0200 Subject: [PVE-User] Kernel Memory Leak on PVE6? Message-ID: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> Hi, I'm seeing a very interesting problem on PVE6: one of our machines appears to leak kernel memory over time, up to the point where only a reboot helps. Shutting down all KVM VMs does not release this memory.
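The growth is easy to watch even without graphing; a trivial snapshot loop like this is enough (the log path is arbitrary; Slab/SUnreclaim are the interesting counters here, since it's the slab that keeps growing):

# while sleep 300; do date; grep -E '^(MemFree|Slab|SReclaimable|SUnreclaim)' /proc/meminfo; done >> /root/meminfo.log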
I'll attach some information below, because I just couldn't figure out what this memory is used for. Once before shutting down the VMs, and once after. I had to reboot the PVE host now, but I guess in a few days it will be at least noticeable again. This machine has the same (except CPU) hardware as the box next to it; however, this one was freshly installed with PVE6, while the other one is an upgrade from PVE5 and doesn't exhibit this problem. It's quite puzzling, because I haven't seen this symptom at any of the customer installations. Here are some graphs showing the memory consumption over time: http://zeha.at/~ch/T/20190920-pve6_meminfo_0.png http://zeha.at/~ch/T/20190920-pve6_meminfo_1.png Looking forward to any debug help, suggestions, ... Chris ** Almost out of memory, before VM shutdown: ** top - 10:24:19 up 22 days, 22:29, 1 user, load average: 1.85, 1.57, 1.32 Tasks: 530 total, 1 running, 529 sleeping, 0 stopped, 0 zombie %Cpu(s): 1.8 us, 0.4 sy, 0.0 ni, 97.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st MiB Mem : 80413.1 total, 509.9 free, 70879.7 used, 9023.5 buff/cache MiB Swap: 20480.0 total, 6516.6 free, 13963.4 used. 8699.0 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3183 root 20 0 10.6g 6.0g 2960 S 8.7 7.6 5861:52 /usr/bin/kvm -id 103 -name puppet -chardev socket,id=qmp,path=/var/run/qemu-server/103.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+ 3349 root 20 0 9266032 4.3g 2972 S 6.8 5.4 3834:41 /usr/bin/kvm -id 2017 -name go-test-srv01 -chardev socket,id=qmp,path=/var/run/qemu-server/2017.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=+ 3068 root 20 0 5060928 3.7g 2900 S 6.8 4.7 3110:01 /usr/bin/kvm -id 101 -name backup -chardev socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+ 3399 root 20 0 5094772 2.3g 2944 S 50.5 2.9 10780:07 /usr/bin/kvm -id 3002 -name monitor01 -chardev socket,id=qmp,path=/var/run/qemu-server/3002.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-+ 3254 root 20 0 32.8g 1.9g 3040 S 1.0 2.4 490:39.29 /usr/bin/kvm -id 2005 -name debbuild -chardev socket,id=qmp,path=/var/run/qemu-server/2005.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-e+ 2994 root 20 0 2656268 658428 2980 S 9.7 0.8 2895:15 /usr/bin/kvm -id 100 -name pbx -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+ 2927 root 20 0 2664232 479372 2944 S 6.8 0.6 2343:43 /usr/bin/kvm -id 102 -name ns1 -chardev socket,id=qmp,path=/var/run/qemu-server/102.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+ 2417 root rt 0 606912 211336 51444 S 1.9 0.3 613:27.87 /usr/sbin/corosync -f 2023020 root 20 0 246556 98020 97044 S 0.0 0.1 15:47.80 /lib/systemd/systemd-journald 1806 root 20 0 967944 32724 23612 S 0.0 0.0 53:49.62 /usr/bin/pmxcfs 2801 root 20 0 314488 32428 6464 S 0.0 0.0 322:58.23 pvestatd + 3771741 root 20 0 150776 31728 3700 S 0.0 0.0 0:12.81 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize 2799 root 20 0 316056 27452 5656 S 0.0 0.0 95:49.25 pve-firewall + 2909 root 20 0 325248 12684 5268 S 1.0 0.0 7:03.91 pve-ha-lrm + 868033 ch 20 0 21660 9104 7280 S 0.0 0.0 0:00.12 /lib/systemd/systemd --user 868009 root 20 0 16912 7988 6856 S 0.0 0.0 0:00.03 sshd: ch [priv] 1 root 20 0 171820 7640 5032 S 0.0 0.0 19:58.80 /lib/systemd/systemd --system --deserialize 37 2876 root 20 0 325544
7124 4988 S 0.0 0.0 4:18.16 pve-ha-crm + 1654 Debian-+ 20 0 40488 7096 2864 S 0.0 0.0 77:37.18 /usr/sbin/snmpd -Lsd -Lf /dev/null -u Debian-snmp -g Debian-snmp -I -smux mteTrigger mteTriggerConf -f -p /run/snmpd.pid 868045 ch 20 0 10240 5404 3996 S 0.0 0.0 0:00.11 -zsh 868044 ch 20 0 16912 4636 3492 S 0.0 0.0 0:00.02 sshd: ch at pts/0 1644 root 20 0 29608 4520 3496 S 0.0 0.0 4:59.62 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal 868336 root 20 0 7716 4372 3092 S 0.0 0.0 0:00.03 -bash 1761096 root 20 0 351564 4180 3336 S 0.0 0.0 1:12.83 pvedaemon worker + 1776171 root 20 0 351696 4076 3352 S 0.0 0.0 1:18.27 pvedaemon worker + 868370 root 20 0 11680 4016 2964 R 2.9 0.0 0:00.68 top 1780591 root 20 0 351696 4008 3248 S 0.0 0.0 1:11.73 pvedaemon worker + 1086 root 20 0 19540 3984 3720 S 0.0 0.0 3:11.21 /lib/systemd/systemd-logind 868335 root 20 0 10156 3788 3364 S 0.0 0.0 0:00.01 sudo -i 2899 www-data 20 0 121256 3412 3080 S 0.0 0.0 0:33.99 spiceproxy + 2000791 www-data 20 0 344932 3412 2604 S 0.0 0.0 1:16.39 pveproxy worker + 2000792 www-data 20 0 344932 3348 2604 S 0.0 0.0 1:07.07 pveproxy worker + 1251 root 20 0 225816 3296 2424 S 0.0 0.0 9:47.44 /usr/sbin/rsyslogd -n -iNONE 1258 message+ 20 0 9212 3268 2820 S 0.0 0.0 6:41.36 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only root at vn03:~# uname -a Linux vn03 5.0.21-1-pve #1 SMP PVE 5.0.21-1 (Tue, 20 Aug 2019 17:16:32 +0200) x86_64 GNU/Linux root at vn03:~# free -m total used free shared buff/cache available Mem: 80413 70877 515 101 9019 8708 Swap: 20479 13963 6516 root at vn03:~# dpkg -l pve\* Desired=Unknown/Install/Remove/Purge/Hold | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad) ||/ Name Version Architecture Description +++-=======================-============-============-====================================================== ii pve-cluster 6.0-5 amd64 Cluster Infrastructure for Proxmox Virtual Environment ii pve-container 3.0-5 all Proxmox VE Container management tool ii pve-docs 6.0-4 all Proxmox VE Documentation ii pve-edk2-firmware 2.20190614-1 all edk2 based firmware modules for virtual machines ii pve-firewall 4.0-7 amd64 Proxmox VE Firewall ii pve-firmware 3.0-2 all Binary firmware code for the pve-kernel ii pve-ha-manager 3.0-2 amd64 Proxmox VE HA Manager ii pve-i18n 2.0-2 all Internationalization support for Proxmox VE un pve-kernel (no description available) ii pve-kernel-5.0 6.0-7 all Latest Proxmox VE Kernel Image ii pve-kernel-5.0.15-1-pve 5.0.15-1 amd64 The Proxmox PVE Kernel Image ii pve-kernel-5.0.18-1-pve 5.0.18-3 amd64 The Proxmox PVE Kernel Image ii pve-kernel-5.0.21-1-pve 5.0.21-1 amd64 The Proxmox PVE Kernel Image ii pve-kernel-helper 6.0-7 all Function for various kernel maintenance tasks. 
un pve-kvm (no description available) ii pve-manager 6.0-6 amd64 Proxmox Virtual Environment Management Tools ii pve-qemu-kvm 4.0.0-5 amd64 Full virtualization on x86 hardware un pve-qemu-kvm-2.6.18 (no description available) ii pve-xtermjs 3.13.2-1 all HTML/JS Shell client root at vn03:~# slabtop -o | head -50 Active / Total Objects (% used) : 205425461 / 212231433 (96.8%) Active / Total Slabs (% used) : 4949759 / 4949759 (100.0%) Active / Total Caches (% used) : 114 / 161 (70.8%) Active / Total Size (% used) : 60112896.56K / 60714678.54K (99.0%) Minimum / Average / Maximum Object : 0.01K / 0.29K / 16.62K OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME 43583592 43542487 99% 0.20K 1117528 39 8940224K vm_area_struct 26520256 26518592 99% 0.06K 414379 64 1657516K anon_vma_chain 16788000 16434450 97% 0.25K 524625 32 4197000K filp 13079680 13078464 99% 0.03K 102185 128 408740K kmalloc-32 11544320 5261058 45% 0.06K 180380 64 721520K dmaengine-unmap-2 10128740 10127452 99% 0.09K 220190 46 880760K anon_vma 9602484 9602484 100% 0.04K 94142 102 376568K pde_opener 7442736 7442572 99% 0.19K 177208 42 1417664K cred_jar 7213200 7209695 99% 0.13K 240440 30 961760K kernfs_node_cache 6023850 5992341 99% 0.19K 143425 42 1147400K dentry 5704350 5704350 100% 0.08K 111850 51 447400K task_delay_info 5054066 5054066 100% 0.69K 109871 46 3515872K files_cache 4664512 4664481 99% 0.12K 145766 32 583064K pid 4591440 4591440 100% 1.06K 153048 30 4897536K mm_struct 4207445 4203908 99% 0.58K 76499 55 2447968K inode_cache 4104480 4104291 99% 0.62K 80480 51 2575360K sock_inode_cache 3901440 3900588 99% 0.06K 60960 64 243840K kmalloc-64 3856230 3856160 99% 1.06K 128541 30 4113312K signal_cache 3423826 3417982 99% 0.65K 69874 49 2235968K proc_inode_cache 3139584 3138382 99% 0.01K 6132 512 24528K kmalloc-8 2983344 2983255 99% 0.19K 71032 42 568256K kmalloc-192 2426976 2426413 99% 1.00K 75843 32 2426976K kmalloc-1k 1939854 1931355 99% 0.09K 46187 42 184748K kmalloc-96 1649895 1649895 100% 2.06K 109993 15 3519776K sighand_cache 1280544 1280544 100% 1.00K 40017 32 1280544K UNIX 1052928 1050819 99% 0.50K 32904 32 526464K kmalloc-512 1029792 1029312 99% 0.25K 32181 32 257448K skbuff_head_cache 940624 940559 99% 4.00K 117578 8 3762496K kmalloc-4k 799895 787069 98% 5.75K 159979 5 5119328K task_struct 735696 724643 98% 0.10K 18864 39 75456K buffer_head 525504 525378 99% 2.00K 32844 16 1051008K kmalloc-2k 433024 426780 98% 0.06K 6766 64 27064K kmem_cache_node 310710 301758 97% 1.05K 10357 30 331424K ext4_inode_cache 292340 290078 99% 0.68K 6220 47 199040K shmem_inode_cache 215250 214814 99% 0.38K 5125 42 82000K kmem_cache 212296 196761 92% 0.57K 7582 28 121312K radix_tree_node 158464 158464 100% 0.02K 619 256 2476K kmalloc-16 149925 149925 100% 1.25K 5997 25 191904K UDPv6 71424 71140 99% 0.12K 2232 32 8928K kmalloc-128 70020 70020 100% 0.16K 1376 51 11008K kvm_mmu_page_header 40032 40009 99% 0.25K 1251 32 10008K kmalloc-256 34944 33823 96% 0.09K 832 42 3328K kmalloc-rcl-96 34816 32567 93% 0.06K 544 64 2176K kmalloc-rcl-64 root at vn03:~# pct list root at vn03:~# qm list VMID NAME STATUS MEM(MB) BOOTDISK(GB) PID 100 pbx running 2048 16.00 2994 101 backup running 4096 32.00 3068 102 ns1 running 2048 32.00 2927 103 puppet running 10240 16.00 3183 2005 debbuild running 32768 40.00 3254 2017 go-test-srv01 running 8192 20.00 3349 3002 monitor01 running 4096 32.00 3399 5001 salsa-runner-01 stopped 16384 32.00 0 6001 deduktiva-runner-01 stopped 32768 32.00 0 6901 mac stopped 4096 0.25 0 root at vn03:~# sysctl -a | grep hugepages 
vm.nr_hugepages = 0 vm.nr_hugepages_mempolicy = 0 vm.nr_overcommit_hugepages = 0 *** After shutdown of all VMs: *** top - 10:39:56 up 22 days, 22:44, 2 users, load average: 0.83, 1.84, 1.88 Tasks: 491 total, 1 running, 490 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st MiB Mem : 80413.1 total, 18276.4 free, 52704.9 used, 9431.8 buff/cache MiB Swap: 20480.0 total, 19393.6 free, 1086.4 used. 26801.1 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2417 root rt 0 606908 211332 51444 S 1.0 0.3 613:46.50 /usr/sbin/corosync -f 2878 www-data 20 0 344800 133424 21784 S 0.0 0.2 0:36.09 pveproxy + 883317 www-data 20 0 361776 133084 11056 S 0.0 0.2 0:01.04 pveproxy worker + 2836 root 20 0 343228 132060 21764 S 0.0 0.2 0:38.88 pvedaemon + 883319 www-data 20 0 360688 130992 11148 S 1.0 0.2 0:01.26 pveproxy worker + 883318 www-data 20 0 358056 128864 11148 S 0.0 0.2 0:01.75 pveproxy worker + 883166 root 20 0 351912 121884 10220 S 0.0 0.1 0:00.96 pvedaemon worker + 883165 root 20 0 351848 121584 9952 S 0.0 0.1 0:00.40 pvedaemon worker + 883164 root 20 0 351712 121560 10060 S 0.0 0.1 0:00.65 pvedaemon worker + 2801 root 20 0 307252 92952 20996 S 0.0 0.1 323:07.31 pvestatd + 2023020 root 20 0 267408 90508 89344 S 0.0 0.1 15:48.85 /lib/systemd/systemd-journald 2899 www-data 20 0 121260 59804 12212 S 0.0 0.1 0:34.77 spiceproxy + 883544 www-data 20 0 121500 51260 3448 S 0.0 0.1 0:00.05 spiceproxy worker + 876236 root 20 0 524564 50188 37612 S 0.0 0.1 0:01.90 /usr/bin/pmxcfs 3771741 root 20 0 150776 30880 3264 S 0.0 0.0 0:12.86 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize 2799 root 20 0 316112 28352 5840 S 0.0 0.0 95:51.91 pve-firewall + 2909 root 20 0 325212 14196 5404 S 0.0 0.0 7:04.14 pve-ha-lrm + 2876 root 20 0 325564 9600 5224 S 0.0 0.0 4:18.33 pve-ha-crm + 868033 ch 20 0 21660 8844 7020 S 0.0 0.0 0:00.14 /lib/systemd/systemd --user root at vn03:~# free -m total used free shared buff/cache available Mem: 80413 52700 18281 115 9431 26805 Swap: 20479 1086 19393 root at vn03:~# slabtop -o | head -50 Active / Total Objects (% used) : 199865696 / 200976971 (99.4%) Active / Total Slabs (% used) : 4771440 / 4771440 (100.0%) Active / Total Caches (% used) : 114 / 161 (70.8%) Active / Total Size (% used) : 59688763.91K / 59945034.02K (99.6%) Minimum / Average / Maximum Object : 0.01K / 0.30K / 16.62K OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME 43540380 43499279 99% 0.20K 1116420 39 8931360K vm_area_struct 26459776 26457217 99% 0.06K 413434 64 1653736K anon_vma_chain 16782720 16429406 97% 0.25K 524460 32 4195680K filp 13075712 13074728 99% 0.03K 102154 128 408616K kmalloc-32 10104728 10103625 99% 0.09K 219668 46 878672K anon_vma 9599628 9599628 100% 0.04K 94114 102 376456K pde_opener 7442106 7442024 99% 0.19K 177193 42 1417544K cred_jar 7211280 7207550 99% 0.13K 240376 30 961504K kernfs_node_cache 5999322 5970370 99% 0.19K 142841 42 1142728K dentry 5691447 5691447 100% 0.08K 111597 51 446388K task_delay_info 5052594 5052594 100% 0.69K 109839 46 3514848K files_cache 4657408 4657315 99% 0.12K 145544 32 582176K pid 4590750 4590721 99% 1.06K 153025 30 4896800K mm_struct 4206400 4202839 99% 0.58K 76480 55 2447360K inode_cache 4091424 4091235 99% 0.62K 80224 51 2567168K sock_inode_cache 3903104 3901440 99% 0.06K 60986 64 243944K kmalloc-64 3855600 3855530 99% 1.06K 128520 30 4112640K signal_cache 3416133 3410170 99% 0.65K 69717 49 2230944K proc_inode_cache 3124224 3123017 99% 0.01K 6102 512 24408K 
kmalloc-8 2982840 2982826 99% 0.19K 71020 42 568160K kmalloc-192 2425760 2424977 99% 1.00K 75805 32 2425760K kmalloc-1k 1940694 1932266 99% 0.09K 46207 42 184828K kmalloc-96 1649415 1649346 99% 2.06K 109961 15 3518752K sighand_cache 1279520 1279520 100% 1.00K 39985 32 1279520K UNIX 1043392 1040142 99% 0.50K 32606 32 521696K kmalloc-512 1021152 1020672 99% 0.25K 31911 32 255288K skbuff_head_cache 938880 938777 99% 4.00K 117360 8 3755520K kmalloc-4k 797715 784886 98% 5.75K 159543 5 5105376K task_struct 713388 699031 97% 0.10K 18292 39 73168K buffer_head 643008 73139 11% 0.06K 10047 64 40188K dmaengine-unmap-2 525520 525326 99% 2.00K 32845 16 1051040K kmalloc-2k 432768 426806 98% 0.06K 6762 64 27048K kmem_cache_node 308100 298326 96% 1.05K 10270 30 328640K ext4_inode_cache 292387 289915 99% 0.68K 6221 47 199072K shmem_inode_cache 215250 214971 99% 0.38K 5125 42 82000K kmem_cache 212380 180327 84% 0.57K 7585 28 121360K radix_tree_node 157952 157952 100% 0.02K 617 256 2468K kmalloc-16 150150 150150 100% 1.25K 6006 25 192192K UDPv6 71008 70660 99% 0.12K 2219 32 8876K kmalloc-128 40064 40056 99% 0.25K 1252 32 10016K kmalloc-256 34986 34259 97% 0.09K 833 42 3332K kmalloc-rcl-96 34368 32733 95% 0.06K 537 64 2148K kmalloc-rcl-64 33660 33300 98% 0.05K 396 85 1584K ftrace_event_field typical VM config: balloon: 0 bootdisk: virtio0 cores: 2 cpu: Haswell-noTSX ide2: none,media=cdrom memory: 4096 name: backup net0: virtio=52:54:00:b7:e0:ba,bridge=vmbr100 numa: 0 onboot: 1 ostype: l26 scsihw: virtio-scsi-pci serial0: socket smbios1: uuid=39d362a5-6bae-41b7-9803-b76279e2280f sockets: 1 virtio0: datastore:vm-101-disk-1,cache=writeback,size=32G virtio1: datastore:vm-101-disk-2,cache=writeback,size=100G -- Chris Hofstaedtler / Deduktiva GmbH (FN 418592 b, HG Wien) www.deduktiva.com / +43 1 353 1707 From f.cuseo at panservice.it Fri Sep 20 14:34:26 2019 From: f.cuseo at panservice.it (Fabrizio Cuseo) Date: Fri, 20 Sep 2019 14:34:26 +0200 (CEST) Subject: [PVE-User] Kernel Memory Leak on PVE6? In-Reply-To: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> Message-ID: <485538194.321667.1568982866199.JavaMail.zimbra@zimbra.panservice.it> Are you sure that the memory is used by the ZFS cache? Regards, Fabrizio ----- On 20 Sep 19, at 14:31, Chris Hofstaedtler | Deduktiva chris.hofstaedtler at deduktiva.com wrote: > Hi, > > I'm seeing a very interesting problem on PVE6: one of our machines > appears to leak kernel memory over time, up to the point where only > a reboot helps. Shutting down all KVM VMs does not release this > memory. > > I'll attach some information below, because I just couldn't figure > out what this memory is used for. Once before shutting down the VMs, > and once after. I had to reboot the PVE host now, but I guess > in a few days it will be at least noticeable again. > > This machine has the same (except CPU) hardware as the box next to > it; however, this one was freshly installed with PVE6, while the other one > is an upgrade from PVE5 and doesn't exhibit this problem. It's quite > puzzling, because I haven't seen this symptom at any of the > customer installations. > > Here are some graphs showing the memory consumption over time: > http://zeha.at/~ch/T/20190920-pve6_meminfo_0.png > http://zeha.at/~ch/T/20190920-pve6_meminfo_1.png > > Looking forward to any debug help, suggestions, ...
> > Chris -- --- Fabrizio Cuseo - mailto:f.cuseo at panservice.it Direzione Generale - Panservice InterNetWorking Servizi Professionali per Internet ed il Networking Panservice e' associata AIIP - RIPE Local Registry Phone: +39 0773 410020 - Fax: +39 0773 470219 http://www.panservice.it mailto:info at panservice.it Numero verde nazionale: 800 901492 From chris.hofstaedtler at deduktiva.com Fri Sep 20 14:35:16 2019 From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva) Date: Fri, 20 Sep 2019 14:35:16 +0200 Subject: [PVE-User] Kernel Memory Leak on PVE6? In-Reply-To: <485538194.321667.1568982866199.JavaMail.zimbra@zimbra.panservice.it> References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> <485538194.321667.1568982866199.JavaMail.zimbra@zimbra.panservice.it> Message-ID: <20190920123516.yorxcgktt3uviyxd@zeha.at> * Fabrizio Cuseo [190920 14:34]: > Are you sure that the memory is used by ZFS cache ? There are no zfs filesystems configured, and zfs.ko is not loaded. Thanks, Chris From a.lauterer at proxmox.com Fri Sep 20 14:58:38 2019 From: a.lauterer at proxmox.com (Aaron Lauterer) Date: Fri, 20 Sep 2019 14:58:38 +0200 Subject: [PVE-User] Kernel Memory Leak on PVE6? In-Reply-To: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> Message-ID: Curious, I do have a very similar case at the moment with a slab of ~155GB, out of ~190GB RAM installed. I am not sure yet what causes it but things I plan to investigate are: * hanging NFS mount * possible (PVE) service starting too many threads -> restarting each and checking the memory / slab usage. On 9/20/19 2:31 PM, Chris Hofstaedtler | Deduktiva wrote: > Hi, > > I'm seeing a very interesting problem on PVE6: one of our machines > appears to leak kernel memory over time, up to the point where only > a reboot helps. Shutting down all KVM VMs does not release this > memory. > > I'll attach some information below, because I just couldn't figure > out what this memory is used for. Once before shutting down the VMs, > and once after. I had to reboot the PVE host now, but I guess > in a few days it will be at least noticable again. > > This machine has the same (except CPU) hardware as the box next to > it; however this one was freshly installed with PVE6, the other one > is an upgrade from PVE5 and doesn't exhibit this problem. It's quite > puzzling because I haven't seen this symptom at all at all the > customer installations. > > Here are some graphs showing the memory consumption over time: > http://zeha.at/~ch/T/20190920-pve6_meminfo_0.png > http://zeha.at/~ch/T/20190920-pve6_meminfo_1.png > > Looking forward to any debug help, suggestions, ... > > Chris > > > ** Almost out of memory, before VM shutdown: ** > > top - 10:24:19 up 22 days, 22:29, 1 user, load average: 1.85, 1.57, 1.32 > Tasks: 530 total, 1 running, 529 sleeping, 0 stopped, 0 zombie > %Cpu(s): 1.8 us, 0.4 sy, 0.0 ni, 97.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st > MiB Mem : 80413.1 total, 509.9 free, 70879.7 used, 9023.5 buff/cache > MiB Swap: 20480.0 total, 6516.6 free, 13963.4 used. 
8699.0 avail Mem > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 3183 root 20 0 10.6g 6.0g 2960 S 8.7 7.6 5861:52 /usr/bin/kvm -id 103 -name puppet -chardev socket,id=qmp,path=/var/run/qemu-server/103.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+ > 3349 root 20 0 9266032 4.3g 2972 S 6.8 5.4 3834:41 /usr/bin/kvm -id 2017 -name go-test-srv01 -chardev socket,id=qmp,path=/var/run/qemu-server/2017.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=+ > 3068 root 20 0 5060928 3.7g 2900 S 6.8 4.7 3110:01 /usr/bin/kvm -id 101 -name backup -chardev socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+ > 3399 root 20 0 5094772 2.3g 2944 S 50.5 2.9 10780:07 /usr/bin/kvm -id 3002 -name monitor01 -chardev socket,id=qmp,path=/var/run/qemu-server/3002.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-+ > 3254 root 20 0 32.8g 1.9g 3040 S 1.0 2.4 490:39.29 /usr/bin/kvm -id 2005 -name debbuild -chardev socket,id=qmp,path=/var/run/qemu-server/2005.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-e+ > 2994 root 20 0 2656268 658428 2980 S 9.7 0.8 2895:15 /usr/bin/kvm -id 100 -name pbx -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+ > 2927 root 20 0 2664232 479372 2944 S 6.8 0.6 2343:43 /usr/bin/kvm -id 102 -name ns1 -chardev socket,id=qmp,path=/var/run/qemu-server/102.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+ > 2417 root rt 0 606912 211336 51444 S 1.9 0.3 613:27.87 /usr/sbin/corosync -f > 2023020 root 20 0 246556 98020 97044 S 0.0 0.1 15:47.80 /lib/systemd/systemd-journald > 1806 root 20 0 967944 32724 23612 S 0.0 0.0 53:49.62 /usr/bin/pmxcfs > 2801 root 20 0 314488 32428 6464 S 0.0 0.0 322:58.23 pvestatd + > 3771741 root 20 0 150776 31728 3700 S 0.0 0.0 0:12.81 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize > 2799 root 20 0 316056 27452 5656 S 0.0 0.0 95:49.25 pve-firewall + > 2909 root 20 0 325248 12684 5268 S 1.0 0.0 7:03.91 pve-ha-lrm + > 868033 ch 20 0 21660 9104 7280 S 0.0 0.0 0:00.12 /lib/systemd/systemd --user > 868009 root 20 0 16912 7988 6856 S 0.0 0.0 0:00.03 sshd: ch [priv] > 1 root 20 0 171820 7640 5032 S 0.0 0.0 19:58.80 /lib/systemd/systemd --system --deserialize 37 > 2876 root 20 0 325544 7124 4988 S 0.0 0.0 4:18.16 pve-ha-crm + > 1654 Debian-+ 20 0 40488 7096 2864 S 0.0 0.0 77:37.18 /usr/sbin/snmpd -Lsd -Lf /dev/null -u Debian-snmp -g Debian-snmp -I -smux mteTrigger mteTriggerConf -f -p /run/snmpd.pid > 868045 ch 20 0 10240 5404 3996 S 0.0 0.0 0:00.11 -zsh > 868044 ch 20 0 16912 4636 3492 S 0.0 0.0 0:00.02 sshd: ch at pts/0 > 1644 root 20 0 29608 4520 3496 S 0.0 0.0 4:59.62 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal > 868336 root 20 0 7716 4372 3092 S 0.0 0.0 0:00.03 -bash > 1761096 root 20 0 351564 4180 3336 S 0.0 0.0 1:12.83 pvedaemon worker + > 1776171 root 20 0 351696 4076 3352 S 0.0 0.0 1:18.27 pvedaemon worker + > 868370 root 20 0 11680 4016 2964 R 2.9 0.0 0:00.68 top > 1780591 root 20 0 351696 4008 3248 S 0.0 0.0 1:11.73 pvedaemon worker + > 1086 root 20 0 19540 3984 3720 S 0.0 0.0 3:11.21 /lib/systemd/systemd-logind > 868335 root 20 0 10156 3788 3364 S 0.0 0.0 0:00.01 sudo -i > 2899 www-data 20 0 121256 3412 3080 S 0.0 0.0 0:33.99 spiceproxy + > 2000791 www-data 20 0 344932 3412 2604 S 0.0 0.0 1:16.39 
pveproxy worker + > 2000792 www-data 20 0 344932 3348 2604 S 0.0 0.0 1:07.07 pveproxy worker + > 1251 root 20 0 225816 3296 2424 S 0.0 0.0 9:47.44 /usr/sbin/rsyslogd -n -iNONE > 1258 message+ 20 0 9212 3268 2820 S 0.0 0.0 6:41.36 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only > > root at vn03:~# uname -a > Linux vn03 5.0.21-1-pve #1 SMP PVE 5.0.21-1 (Tue, 20 Aug 2019 17:16:32 +0200) x86_64 GNU/Linux > root at vn03:~# free -m > total used free shared buff/cache available > Mem: 80413 70877 515 101 9019 8708 > Swap: 20479 13963 6516 > root at vn03:~# dpkg -l pve\* > Desired=Unknown/Install/Remove/Purge/Hold > | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend > |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad) > ||/ Name Version Architecture Description > +++-=======================-============-============-====================================================== > ii pve-cluster 6.0-5 amd64 Cluster Infrastructure for Proxmox Virtual Environment > ii pve-container 3.0-5 all Proxmox VE Container management tool > ii pve-docs 6.0-4 all Proxmox VE Documentation > ii pve-edk2-firmware 2.20190614-1 all edk2 based firmware modules for virtual machines > ii pve-firewall 4.0-7 amd64 Proxmox VE Firewall > ii pve-firmware 3.0-2 all Binary firmware code for the pve-kernel > ii pve-ha-manager 3.0-2 amd64 Proxmox VE HA Manager > ii pve-i18n 2.0-2 all Internationalization support for Proxmox VE > un pve-kernel (no description available) > ii pve-kernel-5.0 6.0-7 all Latest Proxmox VE Kernel Image > ii pve-kernel-5.0.15-1-pve 5.0.15-1 amd64 The Proxmox PVE Kernel Image > ii pve-kernel-5.0.18-1-pve 5.0.18-3 amd64 The Proxmox PVE Kernel Image > ii pve-kernel-5.0.21-1-pve 5.0.21-1 amd64 The Proxmox PVE Kernel Image > ii pve-kernel-helper 6.0-7 all Function for various kernel maintenance tasks. 
> un pve-kvm (no description available) > ii pve-manager 6.0-6 amd64 Proxmox Virtual Environment Management Tools > ii pve-qemu-kvm 4.0.0-5 amd64 Full virtualization on x86 hardware > un pve-qemu-kvm-2.6.18 (no description available) > ii pve-xtermjs 3.13.2-1 all HTML/JS Shell client > root at vn03:~# slabtop -o | head -50 > Active / Total Objects (% used) : 205425461 / 212231433 (96.8%) > Active / Total Slabs (% used) : 4949759 / 4949759 (100.0%) > Active / Total Caches (% used) : 114 / 161 (70.8%) > Active / Total Size (% used) : 60112896.56K / 60714678.54K (99.0%) > Minimum / Average / Maximum Object : 0.01K / 0.29K / 16.62K > > OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME > 43583592 43542487 99% 0.20K 1117528 39 8940224K vm_area_struct > 26520256 26518592 99% 0.06K 414379 64 1657516K anon_vma_chain > 16788000 16434450 97% 0.25K 524625 32 4197000K filp > 13079680 13078464 99% 0.03K 102185 128 408740K kmalloc-32 > 11544320 5261058 45% 0.06K 180380 64 721520K dmaengine-unmap-2 > 10128740 10127452 99% 0.09K 220190 46 880760K anon_vma > 9602484 9602484 100% 0.04K 94142 102 376568K pde_opener > 7442736 7442572 99% 0.19K 177208 42 1417664K cred_jar > 7213200 7209695 99% 0.13K 240440 30 961760K kernfs_node_cache > 6023850 5992341 99% 0.19K 143425 42 1147400K dentry > 5704350 5704350 100% 0.08K 111850 51 447400K task_delay_info > 5054066 5054066 100% 0.69K 109871 46 3515872K files_cache > 4664512 4664481 99% 0.12K 145766 32 583064K pid > 4591440 4591440 100% 1.06K 153048 30 4897536K mm_struct > 4207445 4203908 99% 0.58K 76499 55 2447968K inode_cache > 4104480 4104291 99% 0.62K 80480 51 2575360K sock_inode_cache > 3901440 3900588 99% 0.06K 60960 64 243840K kmalloc-64 > 3856230 3856160 99% 1.06K 128541 30 4113312K signal_cache > 3423826 3417982 99% 0.65K 69874 49 2235968K proc_inode_cache > 3139584 3138382 99% 0.01K 6132 512 24528K kmalloc-8 > 2983344 2983255 99% 0.19K 71032 42 568256K kmalloc-192 > 2426976 2426413 99% 1.00K 75843 32 2426976K kmalloc-1k > 1939854 1931355 99% 0.09K 46187 42 184748K kmalloc-96 > 1649895 1649895 100% 2.06K 109993 15 3519776K sighand_cache > 1280544 1280544 100% 1.00K 40017 32 1280544K UNIX > 1052928 1050819 99% 0.50K 32904 32 526464K kmalloc-512 > 1029792 1029312 99% 0.25K 32181 32 257448K skbuff_head_cache > 940624 940559 99% 4.00K 117578 8 3762496K kmalloc-4k > 799895 787069 98% 5.75K 159979 5 5119328K task_struct > 735696 724643 98% 0.10K 18864 39 75456K buffer_head > 525504 525378 99% 2.00K 32844 16 1051008K kmalloc-2k > 433024 426780 98% 0.06K 6766 64 27064K kmem_cache_node > 310710 301758 97% 1.05K 10357 30 331424K ext4_inode_cache > 292340 290078 99% 0.68K 6220 47 199040K shmem_inode_cache > 215250 214814 99% 0.38K 5125 42 82000K kmem_cache > 212296 196761 92% 0.57K 7582 28 121312K radix_tree_node > 158464 158464 100% 0.02K 619 256 2476K kmalloc-16 > 149925 149925 100% 1.25K 5997 25 191904K UDPv6 > 71424 71140 99% 0.12K 2232 32 8928K kmalloc-128 > 70020 70020 100% 0.16K 1376 51 11008K kvm_mmu_page_header > 40032 40009 99% 0.25K 1251 32 10008K kmalloc-256 > 34944 33823 96% 0.09K 832 42 3328K kmalloc-rcl-96 > 34816 32567 93% 0.06K 544 64 2176K kmalloc-rcl-64 > root at vn03:~# pct list > root at vn03:~# qm list > VMID NAME STATUS MEM(MB) BOOTDISK(GB) PID > 100 pbx running 2048 16.00 2994 > 101 backup running 4096 32.00 3068 > 102 ns1 running 2048 32.00 2927 > 103 puppet running 10240 16.00 3183 > 2005 debbuild running 32768 40.00 3254 > 2017 go-test-srv01 running 8192 20.00 3349 > 3002 monitor01 running 4096 32.00 3399 > 5001 salsa-runner-01 stopped 
16384 32.00 0 > 6001 deduktiva-runner-01 stopped 32768 32.00 0 > 6901 mac stopped 4096 0.25 0 > root at vn03:~# sysctl -a | grep hugepages > vm.nr_hugepages = 0 > vm.nr_hugepages_mempolicy = 0 > vm.nr_overcommit_hugepages = 0 > > > *** After shutdown of all VMs: *** > > top - 10:39:56 up 22 days, 22:44, 2 users, load average: 0.83, 1.84, 1.88 > Tasks: 491 total, 1 running, 490 sleeping, 0 stopped, 0 zombie > %Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st > MiB Mem : 80413.1 total, 18276.4 free, 52704.9 used, 9431.8 buff/cache > MiB Swap: 20480.0 total, 19393.6 free, 1086.4 used. 26801.1 avail Mem > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 2417 root rt 0 606908 211332 51444 S 1.0 0.3 613:46.50 /usr/sbin/corosync -f > 2878 www-data 20 0 344800 133424 21784 S 0.0 0.2 0:36.09 pveproxy + > 883317 www-data 20 0 361776 133084 11056 S 0.0 0.2 0:01.04 pveproxy worker + > 2836 root 20 0 343228 132060 21764 S 0.0 0.2 0:38.88 pvedaemon + > 883319 www-data 20 0 360688 130992 11148 S 1.0 0.2 0:01.26 pveproxy worker + > 883318 www-data 20 0 358056 128864 11148 S 0.0 0.2 0:01.75 pveproxy worker + > 883166 root 20 0 351912 121884 10220 S 0.0 0.1 0:00.96 pvedaemon worker + > 883165 root 20 0 351848 121584 9952 S 0.0 0.1 0:00.40 pvedaemon worker + > 883164 root 20 0 351712 121560 10060 S 0.0 0.1 0:00.65 pvedaemon worker + > 2801 root 20 0 307252 92952 20996 S 0.0 0.1 323:07.31 pvestatd + > 2023020 root 20 0 267408 90508 89344 S 0.0 0.1 15:48.85 /lib/systemd/systemd-journald > 2899 www-data 20 0 121260 59804 12212 S 0.0 0.1 0:34.77 spiceproxy + > 883544 www-data 20 0 121500 51260 3448 S 0.0 0.1 0:00.05 spiceproxy worker + > 876236 root 20 0 524564 50188 37612 S 0.0 0.1 0:01.90 /usr/bin/pmxcfs > 3771741 root 20 0 150776 30880 3264 S 0.0 0.0 0:12.86 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize > 2799 root 20 0 316112 28352 5840 S 0.0 0.0 95:51.91 pve-firewall + > 2909 root 20 0 325212 14196 5404 S 0.0 0.0 7:04.14 pve-ha-lrm + > 2876 root 20 0 325564 9600 5224 S 0.0 0.0 4:18.33 pve-ha-crm + > 868033 ch 20 0 21660 8844 7020 S 0.0 0.0 0:00.14 /lib/systemd/systemd --user > > root at vn03:~# free -m > total used free shared buff/cache available > Mem: 80413 52700 18281 115 9431 26805 > Swap: 20479 1086 19393 > root at vn03:~# slabtop -o | head -50 > Active / Total Objects (% used) : 199865696 / 200976971 (99.4%) > Active / Total Slabs (% used) : 4771440 / 4771440 (100.0%) > Active / Total Caches (% used) : 114 / 161 (70.8%) > Active / Total Size (% used) : 59688763.91K / 59945034.02K (99.6%) > Minimum / Average / Maximum Object : 0.01K / 0.30K / 16.62K > > OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME > 43540380 43499279 99% 0.20K 1116420 39 8931360K vm_area_struct > 26459776 26457217 99% 0.06K 413434 64 1653736K anon_vma_chain > 16782720 16429406 97% 0.25K 524460 32 4195680K filp > 13075712 13074728 99% 0.03K 102154 128 408616K kmalloc-32 > 10104728 10103625 99% 0.09K 219668 46 878672K anon_vma > 9599628 9599628 100% 0.04K 94114 102 376456K pde_opener > 7442106 7442024 99% 0.19K 177193 42 1417544K cred_jar > 7211280 7207550 99% 0.13K 240376 30 961504K kernfs_node_cache > 5999322 5970370 99% 0.19K 142841 42 1142728K dentry > 5691447 5691447 100% 0.08K 111597 51 446388K task_delay_info > 5052594 5052594 100% 0.69K 109839 46 3514848K files_cache > 4657408 4657315 99% 0.12K 145544 32 582176K pid > 4590750 4590721 99% 1.06K 153025 30 4896800K mm_struct > 4206400 4202839 99% 0.58K 76480 55 2447360K inode_cache > 4091424 4091235 
99% 0.62K 80224 51 2567168K sock_inode_cache > 3903104 3901440 99% 0.06K 60986 64 243944K kmalloc-64 > 3855600 3855530 99% 1.06K 128520 30 4112640K signal_cache > 3416133 3410170 99% 0.65K 69717 49 2230944K proc_inode_cache > 3124224 3123017 99% 0.01K 6102 512 24408K kmalloc-8 > 2982840 2982826 99% 0.19K 71020 42 568160K kmalloc-192 > 2425760 2424977 99% 1.00K 75805 32 2425760K kmalloc-1k > 1940694 1932266 99% 0.09K 46207 42 184828K kmalloc-96 > 1649415 1649346 99% 2.06K 109961 15 3518752K sighand_cache > 1279520 1279520 100% 1.00K 39985 32 1279520K UNIX > 1043392 1040142 99% 0.50K 32606 32 521696K kmalloc-512 > 1021152 1020672 99% 0.25K 31911 32 255288K skbuff_head_cache > 938880 938777 99% 4.00K 117360 8 3755520K kmalloc-4k > 797715 784886 98% 5.75K 159543 5 5105376K task_struct > 713388 699031 97% 0.10K 18292 39 73168K buffer_head > 643008 73139 11% 0.06K 10047 64 40188K dmaengine-unmap-2 > 525520 525326 99% 2.00K 32845 16 1051040K kmalloc-2k > 432768 426806 98% 0.06K 6762 64 27048K kmem_cache_node > 308100 298326 96% 1.05K 10270 30 328640K ext4_inode_cache > 292387 289915 99% 0.68K 6221 47 199072K shmem_inode_cache > 215250 214971 99% 0.38K 5125 42 82000K kmem_cache > 212380 180327 84% 0.57K 7585 28 121360K radix_tree_node > 157952 157952 100% 0.02K 617 256 2468K kmalloc-16 > 150150 150150 100% 1.25K 6006 25 192192K UDPv6 > 71008 70660 99% 0.12K 2219 32 8876K kmalloc-128 > 40064 40056 99% 0.25K 1252 32 10016K kmalloc-256 > 34986 34259 97% 0.09K 833 42 3332K kmalloc-rcl-96 > 34368 32733 95% 0.06K 537 64 2148K kmalloc-rcl-64 > 33660 33300 98% 0.05K 396 85 1584K ftrace_event_field > > > > typical VM config: > > balloon: 0 > bootdisk: virtio0 > cores: 2 > cpu: Haswell-noTSX > ide2: none,media=cdrom > memory: 4096 > name: backup > net0: virtio=52:54:00:b7:e0:ba,bridge=vmbr100 > numa: 0 > onboot: 1 > ostype: l26 > scsihw: virtio-scsi-pci > serial0: socket > smbios1: uuid=39d362a5-6bae-41b7-9803-b76279e2280f > sockets: 1 > virtio0: datastore:vm-101-disk-1,cache=writeback,size=32G > virtio1: datastore:vm-101-disk-2,cache=writeback,size=100G > > From chris.hofstaedtler at deduktiva.com Fri Sep 20 15:04:34 2019 From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva) Date: Fri, 20 Sep 2019 15:04:34 +0200 Subject: [PVE-User] Kernel Memory Leak on PVE6? In-Reply-To: References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> Message-ID: <20190920130434.pjnuimppit2rrpxj@percival.namespace.at> * Aaron Lauterer [190920 14:58]: > Curious, I do have a very similar case at the moment with a slab of ~155GB, > out of ~190GB RAM installed. > > I am not sure yet what causes it but things I plan to investigate are: > > * hanging NFS mount Okay, to rule storage issues out, this setup has: - root filesystem as ext4 on GPT - efi system partition - two LVM PVs and VGs, with all VM storage in the second LVM VG - no NFS, no ZFS, no Ceph, no fancy userland filesystems > * possible (PVE) service starting too many threads -> restarting each and > checking the memory / slab usage. Do you have a particular service in mind? Chris -- Chris Hofstaedtler / Deduktiva GmbH (FN 418592 b, HG Wien) www.deduktiva.com / +43 1 353 1707 From a.lauterer at proxmox.com Fri Sep 20 15:12:04 2019 From: a.lauterer at proxmox.com (Aaron Lauterer) Date: Fri, 20 Sep 2019 15:12:04 +0200 Subject: [PVE-User] Kernel Memory Leak on PVE6? 
In-Reply-To: <20190920130434.pjnuimppit2rrpxj@percival.namespace.at>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> <20190920130434.pjnuimppit2rrpxj@percival.namespace.at>
Message-ID: <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com>

On 9/20/19 3:04 PM, Chris Hofstaedtler | Deduktiva wrote:
> * Aaron Lauterer [190920 14:58]:
>> Curious, I do have a very similar case at the moment with a slab of ~155GB,
>> out of ~190GB RAM installed.
>>
>> I am not sure yet what causes it but things I plan to investigate are:
>>
>> * hanging NFS mount
>
> Okay, to rule storage issues out, this setup has:
> - root filesystem as ext4 on GPT
> - efi system partition
> - two LVM PVs and VGs, with all VM storage in the second LVM VG
> - no NFS, no ZFS, no Ceph, no fancy userland filesystems
>
>> * possible (PVE) service starting too many threads -> restarting each and
>> checking the memory / slab usage.
>
> Do you have a particular service in mind?

Not at this point. I would restart all PVE services (systemctl| grep -e "pve.*service") one by one to see if any of it will result in memory being released by the kernel.

If that is not the case at least they are ruled out.

>
> Chris
>

From aderumier at odiso.com Fri Sep 20 16:58:00 2019
From: aderumier at odiso.com (Alexandre DERUMIER)
Date: Fri, 20 Sep 2019 16:58:00 +0200 (CEST)
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> <20190920130434.pjnuimppit2rrpxj@percival.namespace.at> <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com>
Message-ID: <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>

can send detail of

cat /proc/slabinfo

?

----- Mail original -----
De: "Aaron Lauterer"
À: "proxmoxve"
Envoyé: Vendredi 20 Septembre 2019 15:12:04
Objet: Re: [PVE-User] Kernel Memory Leak on PVE6?

On 9/20/19 3:04 PM, Chris Hofstaedtler | Deduktiva wrote:
> * Aaron Lauterer [190920 14:58]:
>> Curious, I do have a very similar case at the moment with a slab of ~155GB,
>> out of ~190GB RAM installed.
>>
>> I am not sure yet what causes it but things I plan to investigate are:
>>
>> * hanging NFS mount
>
> Okay, to rule storage issues out, this setup has:
> - root filesystem as ext4 on GPT
> - efi system partition
> - two LVM PVs and VGs, with all VM storage in the second LVM VG
> - no NFS, no ZFS, no Ceph, no fancy userland filesystems
>
>> * possible (PVE) service starting too many threads -> restarting each and
>> checking the memory / slab usage.
>
> Do you have a particular service in mind?

Not at this point. I would restart all PVE services (systemctl| grep -e "pve.*service") one by one to see if any of it will result in memory being released by the kernel.

If that is not the case at least they are ruled out.

>
> Chris
>

_______________________________________________
pve-user mailing list
pve-user at pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From aderumier at odiso.com Fri Sep 20 17:00:22 2019
From: aderumier at odiso.com (Alexandre DERUMIER)
Date: Fri, 20 Sep 2019 17:00:22 +0200 (CEST)
Subject: [PVE-User] Recurring crashes after cluster upgrade from 5 to 6
In-Reply-To:
References:
Message-ID: <1121750443.5445084.1568991622944.JavaMail.zimbra@odiso.com>

Hi,

a patch is available in pvetest

http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet1_1.11-pve2_amd64.deb

can you test it?
(you need to restart corosync after installing the deb)

----- Mail original -----
De: "Laurent CARON"
À: "proxmoxve"
Envoyé: Lundi 16 Septembre 2019 09:55:34
Objet: [PVE-User] Recurring crashes after cluster upgrade from 5 to 6

Hi,

After upgrading our 4-node cluster from PVE 5 to 6, we experience constant crashes (once every 2 days).

Those crashes seem related to corosync.

Since numerous users are reporting such issues (broken cluster after upgrade, instabilities, ...) I wonder if it is possible to downgrade corosync to version 2.4.4 without impacting functionality?

Basic steps would be:

On all nodes
# systemctl stop pve-ha-lrm

Once done, on all nodes:
# systemctl stop pve-ha-crm

Once done, on all nodes:
# apt-get install corosync=2.4.4-pve1 libcorosync-common4=2.4.4-pve1 libcmap4=2.4.4-pve1 libcpg4=2.4.4-pve1 libqb0=1.0.3-1~bpo9 libquorum5=2.4.4-pve1 libvotequorum8=2.4.4-pve1

Then, once corosync has been downgraded, on all nodes
# systemctl start pve-ha-lrm
# systemctl start pve-ha-crm

Would that work?

Thanks

_______________________________________________
pve-user mailing list
pve-user at pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From a.lauterer at proxmox.com Mon Sep 23 10:17:06 2019
From: a.lauterer at proxmox.com (Aaron Lauterer)
Date: Mon, 23 Sep 2019 10:17:06 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> <20190920130434.pjnuimppit2rrpxj@percival.namespace.at> <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com> <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>
Message-ID:

On 9/20/19 4:58 PM, Alexandre DERUMIER wrote:
> can send detail of
>
> cat /proc/slabinfo
>
> ?

Sure, here you go:

----------------------------------------------

slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nfs_direct_cache 0 0 360 45 4 : tunables 0 0 0 : slabdata 0 0 0
nfs_commit_data 1196 1196 704 46 8 : tunables 0 0 0 : slabdata 26 26 0
nfs_read_data 6120 6120 896 36 8 : tunables 0 0 0 : slabdata 170 170 0
nfs_inode_cache 2340 2340 1072 30 8 : tunables 0 0 0 : slabdata 78 78 0
SCTPv6 22 22 1472 22 8 : tunables 0 0 0 : slabdata 1 1 0
SCTP 24 24 1344 24 8 : tunables 0 0 0 : slabdata 1 1 0
kvm_async_pf 720 720 136 30 1 : tunables 0 0 0 : slabdata 24 24 0
kvm_vcpu 25 25 17024 1 8 : tunables 0 0 0 : slabdata 25 25 0
kvm_mmu_page_header 48103 48700 160 25 1 : tunables 0 0 0 : slabdata 1948 1948 0
x86_fpu 140 140 4160 7 8 : tunables 0 0 0 : slabdata 20 20 0
zfs_znode_hold_cache 0 0 88 46 1 : tunables 0 0 0 : slabdata 0 0 0
zfs_znode_cache 0 0 1048 31 8 : tunables 0 0 0 : slabdata 0 0 0
sio_cache_2 0 0 168 24 1 : tunables 0 0 0 : slabdata 0 0 0
sio_cache_1 0 0 152 26 1 : tunables 0 0 0 : slabdata 0 0 0
sio_cache_0 0 0 136 30 1 : tunables 0 0 0 : slabdata 0 0 0
zil_zcw_cache 0 0 152 26 1 : tunables 0 0 0 : slabdata 0 0 0
zil_lwb_cache 0 0 376 43 4 : tunables 0 0 0 : slabdata 0 0 0
dmu_buf_impl_t 0 0 312 26 2 : tunables 0 0 0 : slabdata 0 0 0
arc_buf_t 0 0 80 51 1 : tunables 0 0 0 : slabdata 0 0 0
arc_buf_hdr_t_l2only 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
arc_buf_hdr_t_full_crypt 0 0 392 41 4 : tunables 0 0 0 : slabdata 0 0 0
arc_buf_hdr_t_full 0 0 328 24 2 : tunables 0 0 0 : slabdata 0 0 0
dnode_t 0 0 896 36 8 : tunables 0 0 0 : slabdata 0 0 0
sa_cache 0 0 248 33 2 : tunables 0 0 0 : slabdata 0 0 0
abd_t 102 102 40 102 1 : tunables 0 0 0 : slabdata 1 1 0
lz4_cache 0 0 16384 2 8 : tunables 0 0 0 : slabdata 0 0 0
zio_data_buf_16384 4 4
16384 2 8 : tunables 0 0 0 : slabdata 2 2 0 zio_buf_16384 2 2 16384 2 8 : tunables 0 0 0 : slabdata 1 1 0 zio_data_buf_14336 0 0 16384 2 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_14336 0 0 16384 2 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_12288 0 0 12288 2 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_12288 0 0 12288 2 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_10240 0 0 12288 2 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_10240 0 0 12288 2 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_8192 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_8192 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_7168 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_7168 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_6144 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_6144 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_5120 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_5120 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_4096 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_4096 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_3584 0 0 3584 9 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_3584 0 0 3584 9 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_3072 0 0 3072 10 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_3072 0 0 3072 10 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_2560 0 0 2560 12 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_2560 0 0 2560 12 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_2048 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_2048 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_1536 0 0 1536 21 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_1536 0 0 1536 21 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_1024 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_1024 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0 zio_data_buf_512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0 zio_buf_512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0 zio_link_cache 0 0 48 85 1 : tunables 0 0 0 : slabdata 0 0 0 zio_cache 0 0 1240 26 8 : tunables 0 0 0 : slabdata 0 0 0 ddt_entry_cache 0 0 448 36 4 : tunables 0 0 0 : slabdata 0 0 0 range_seg_cache 0 0 72 56 1 : tunables 0 0 0 : slabdata 0 0 0 kcf_context_cache 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0 kcf_areq_cache 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0 kcf_sreq_cache 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0 mod_hash_entries 170 170 24 170 1 : tunables 0 0 0 : slabdata 1 1 0 spl_vn_file_cache 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0 spl_vn_cache 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0 rpc_inode_cache 100 100 640 25 4 : tunables 0 0 0 : slabdata 4 4 0 ext4_groupinfo_4k 280 280 144 28 1 : tunables 0 0 0 : slabdata 10 10 0 btrfs_delayed_ref_head 0 0 160 25 1 : tunables 0 0 0 : slabdata 0 0 0 btrfs_delayed_node 0 0 312 26 2 : tunables 0 0 0 : slabdata 0 0 0 btrfs_ordered_extent 0 0 416 39 4 : tunables 0 0 0 : slabdata 0 0 0 btrfs_extent_map 0 0 144 28 1 : tunables 0 0 0 : slabdata 0 0 0 btrfs_extent_buffer 0 0 280 29 2 : tunables 0 0 0 : slabdata 0 0 0 btrfs_path 0 0 112 36 1 : tunables 0 0 0 : slabdata 0 0 0 btrfs_inode 0 0 1144 28 8 : tunables 0 0 0 : slabdata 0 0 0 scsi_sense_cache 1728 1728 128 32 1 : tunables 0 0 0 : slabdata 54 54 0 PINGv6 28 28 1152 28 8 : tunables 0 0 0 : slabdata 1 1 0 RAWv6 756 756 1152 28 8 : tunables 0 0 0 : slabdata 27 27 0 UDPv6 2050 2050 1280 25 8 : tunables 0 0 0 : slabdata 82 82 0 tw_sock_TCPv6 816 816 240 34 2 : tunables 0 0 0 : slabdata 24 24 0 request_sock_TCPv6 
0 0 304 26 2 : tunables 0 0 0 : slabdata 0 0 0 TCPv6 793 793 2368 13 8 : tunables 0 0 0 : slabdata 61 61 0 kcopyd_job 0 0 3312 9 8 : tunables 0 0 0 : slabdata 0 0 0 dm_uevent 0 0 2632 12 8 : tunables 0 0 0 : slabdata 0 0 0 dm_old_clone_request 0 0 296 27 2 : tunables 0 0 0 : slabdata 0 0 0 dm_rq_target_io 0 0 120 34 1 : tunables 0 0 0 : slabdata 0 0 0 mqueue_inode_cache 36 36 896 36 8 : tunables 0 0 0 : slabdata 1 1 0 fuse_request 984 984 392 41 4 : tunables 0 0 0 : slabdata 24 24 0 fuse_inode 6905 6987 768 42 8 : tunables 0 0 0 : slabdata 169 169 0 ecryptfs_key_record_cache 0 0 576 28 4 : tunables 0 0 0 : slabdata 0 0 0 ecryptfs_headers 504 544 4096 8 8 : tunables 0 0 0 : slabdata 68 68 0 ecryptfs_inode_cache 0 0 960 34 8 : tunables 0 0 0 : slabdata 0 0 0 ecryptfs_dentry_info_cache 128 128 32 128 1 : tunables 0 0 0 : slabdata 1 1 0 ecryptfs_file_cache 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0 ecryptfs_auth_tok_list_item 0 0 832 39 8 : tunables 0 0 0 : slabdata 0 0 0 fat_inode_cache 0 0 728 45 8 : tunables 0 0 0 : slabdata 0 0 0 fat_cache 0 0 40 102 1 : tunables 0 0 0 : slabdata 0 0 0 squashfs_inode_cache 0 0 704 46 8 : tunables 0 0 0 : slabdata 0 0 0 jbd2_journal_head 6120 6120 120 34 1 : tunables 0 0 0 : slabdata 180 180 0 jbd2_revoke_table_s 256 256 16 256 1 : tunables 0 0 0 : slabdata 1 1 0 ext4_inode_cache 77141 81432 1080 30 8 : tunables 0 0 0 : slabdata 2727 2727 0 ext4_allocation_context 768 768 128 32 1 : tunables 0 0 0 : slabdata 24 24 0 ext4_pending_reservation 3072 3072 32 128 1 : tunables 0 0 0 : slabdata 24 24 0 ext4_extent_status 24990 25194 40 102 1 : tunables 0 0 0 : slabdata 247 247 0 mbcache 1752 1752 56 73 1 : tunables 0 0 0 : slabdata 24 24 0 fscrypt_info 1536 1536 64 64 1 : tunables 0 0 0 : slabdata 24 24 0 fscrypt_ctx 2040 2040 48 85 1 : tunables 0 0 0 : slabdata 24 24 0 userfaultfd_ctx_cache 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0 dnotify_struct 9856 9856 32 128 1 : tunables 0 0 0 : slabdata 77 77 0 posix_timers_cache 816 816 240 34 2 : tunables 0 0 0 : slabdata 24 24 0 UNIX 2942592 2942592 1024 32 8 : tunables 0 0 0 : slabdata 91956 91956 0 ip4-frags 0 0 208 39 2 : tunables 0 0 0 : slabdata 0 0 0 xfrm_dst_cache 4702 4725 320 25 2 : tunables 0 0 0 : slabdata 189 189 0 xfrm_state 0 0 768 42 8 : tunables 0 0 0 : slabdata 0 0 0 PING 34 34 960 34 8 : tunables 0 0 0 : slabdata 1 1 0 RAW 986 986 960 34 8 : tunables 0 0 0 : slabdata 29 29 0 tw_sock_TCP 850 850 240 34 2 : tunables 0 0 0 : slabdata 25 25 0 request_sock_TCP 624 624 304 26 2 : tunables 0 0 0 : slabdata 24 24 0 TCP 2100 2100 2176 15 8 : tunables 0 0 0 : slabdata 140 140 0 hugetlbfs_inode_cache 390 390 616 26 4 : tunables 0 0 0 : slabdata 15 15 0 dquot 768 768 256 32 2 : tunables 0 0 0 : slabdata 24 24 0 eventpoll_pwq 16240 16240 72 56 1 : tunables 0 0 0 : slabdata 290 290 0 dax_cache 1028 1218 768 42 8 : tunables 0 0 0 : slabdata 29 29 0 request_queue 330 330 2056 15 8 : tunables 0 0 0 : slabdata 22 22 0 biovec-max 306 320 8192 4 8 : tunables 0 0 0 : slabdata 80 80 0 biovec-128 1552 1584 2048 16 8 : tunables 0 0 0 : slabdata 99 99 0 biovec-64 3264 3392 1024 32 8 : tunables 0 0 0 : slabdata 106 106 0 dmaengine-unmap-256 15 15 2112 15 8 : tunables 0 0 0 : slabdata 1 1 0 dmaengine-unmap-128 30 30 1088 30 8 : tunables 0 0 0 : slabdata 1 1 0 dmaengine-unmap-16 14118 14322 192 42 2 : tunables 0 0 0 : slabdata 341 341 0 dmaengine-unmap-2 9992311 14087744 64 64 1 : tunables 0 0 0 : slabdata 220121 220121 0 sock_inode_cache 3067527 3067550 640 25 4 : tunables 0 0 0 : slabdata 122702 122702 0 skbuff_ext_cache 
33635 33696 128 32 1 : tunables 0 0 0 : slabdata 1053 1053 0 skbuff_fclone_cache 3458 3648 512 32 4 : tunables 0 0 0 : slabdata 114 114 0 skbuff_head_cache 2353888 2353984 256 32 2 : tunables 0 0 0 : slabdata 73562 73562 0 file_lock_cache 888 888 216 37 2 : tunables 0 0 0 : slabdata 24 24 0 net_namespace 0 0 6272 5 8 : tunables 0 0 0 : slabdata 0 0 0 shmem_inode_cache 15658 16544 696 47 8 : tunables 0 0 0 : slabdata 352 352 0 task_delay_info 18748008 18748008 80 51 1 : tunables 0 0 0 : slabdata 367608 367608 0 taskstats 1128 1128 344 47 4 : tunables 0 0 0 : slabdata 24 24 0 proc_dir_entry 2394 2394 192 42 2 : tunables 0 0 0 : slabdata 57 57 0 pde_opener 33415710 33415710 40 102 1 : tunables 0 0 0 : slabdata 327605 327605 0 proc_inode_cache 6028072 6081432 664 24 4 : tunables 0 0 0 : slabdata 253393 253393 0 bdev_cache 1385 1599 832 39 8 : tunables 0 0 0 : slabdata 41 41 0 kernfs_node_cache 22875834 22875900 136 30 1 : tunables 0 0 0 : slabdata 762530 762530 0 mnt_cache 2520 2520 384 42 4 : tunables 0 0 0 : slabdata 60 60 0 filp 43675126 45914432 256 32 2 : tunables 0 0 0 : slabdata 1434826 1434826 0 inode_cache 7973734 8086521 592 27 4 : tunables 0 0 0 : slabdata 299507 299507 0 dentry 22294094 22349733 192 42 2 : tunables 0 0 0 : slabdata 532145 532145 0 names_cache 224 224 4096 8 8 : tunables 0 0 0 : slabdata 28 28 0 iint_cache 0 0 120 34 1 : tunables 0 0 0 : slabdata 0 0 0 lsm_file_cache 20910 20910 24 170 1 : tunables 0 0 0 : slabdata 123 123 0 buffer_head 586426 588549 104 39 1 : tunables 0 0 0 : slabdata 15091 15091 0 uts_namespace 0 0 440 37 4 : tunables 0 0 0 : slabdata 0 0 0 nsproxy 1752 1752 56 73 1 : tunables 0 0 0 : slabdata 24 24 0 vm_area_struct 66387518 66389973 208 39 2 : tunables 0 0 0 : slabdata 1702307 1702307 0 mm_struct 13442790 13442790 1088 30 8 : tunables 0 0 0 : slabdata 448093 448093 0 files_cache 16558942 16558942 704 46 8 : tunables 0 0 0 : slabdata 359977 359977 0 signal_cache 10803886 10803900 1088 30 8 : tunables 0 0 0 : slabdata 360130 360130 0 sighand_cache 5400497 5400630 2112 15 8 : tunables 0 0 0 : slabdata 360042 360042 0 task_struct 2673885 2736425 5888 5 8 : tunables 0 0 0 : slabdata 547285 547285 0 cred_jar 25486314 25486314 192 42 2 : tunables 0 0 0 : slabdata 606817 606817 0 anon_vma_chain 60709704 60712320 64 64 1 : tunables 0 0 0 : slabdata 948630 948630 0 anon_vma 30101801 30102400 88 46 1 : tunables 0 0 0 : slabdata 654400 654400 0 pid 13969696 13969696 128 32 1 : tunables 0 0 0 : slabdata 436553 436553 0 Acpi-Operand 130536 130536 72 56 1 : tunables 0 0 0 : slabdata 2331 2331 0 Acpi-ParseExt 936 936 104 39 1 : tunables 0 0 0 : slabdata 24 24 0 Acpi-State 1428 1428 80 51 1 : tunables 0 0 0 : slabdata 28 28 0 Acpi-Namespace 12648 12648 40 102 1 : tunables 0 0 0 : slabdata 124 124 0 numa_policy 31 31 264 31 2 : tunables 0 0 0 : slabdata 1 1 0 trace_event_file 1932 1932 88 46 1 : tunables 0 0 0 : slabdata 42 42 0 ftrace_event_field 6120 6120 48 85 1 : tunables 0 0 0 : slabdata 72 72 0 pool_workqueue 5347 5568 256 32 2 : tunables 0 0 0 : slabdata 174 174 0 radix_tree_node 497203 568344 584 28 4 : tunables 0 0 0 : slabdata 20298 20298 0 task_group 625 625 640 25 4 : tunables 0 0 0 : slabdata 25 25 0 dma-kmalloc-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-2k 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-1k 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-512 32 32 512 32 4 : tunables 0 0 0 : slabdata 1 1 0 dma-kmalloc-256 0 0 256 32 2 : tunables 0 
0 0 : slabdata 0 0 0 dma-kmalloc-128 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-192 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0 dma-kmalloc-96 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-2k 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-1k 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-256 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-192 1308 1344 192 42 2 : tunables 0 0 0 : slabdata 32 32 0 kmalloc-rcl-128 11776 11840 128 32 1 : tunables 0 0 0 : slabdata 370 370 0 kmalloc-rcl-96 17640 17808 96 42 1 : tunables 0 0 0 : slabdata 424 424 0 kmalloc-rcl-64 284839 285696 64 64 1 : tunables 0 0 0 : slabdata 4464 4464 0 kmalloc-rcl-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-8k 1473 1484 8192 4 8 : tunables 0 0 0 : slabdata 371 371 0 kmalloc-4k 2976604 2976608 4096 8 8 : tunables 0 0 0 : slabdata 372076 372076 0 kmalloc-2k 2747302 2747360 2048 16 8 : tunables 0 0 0 : slabdata 171710 171710 0 kmalloc-1k 8100486 8100832 1024 32 8 : tunables 0 0 0 : slabdata 253151 253151 0 kmalloc-512 2420972 2421248 512 32 4 : tunables 0 0 0 : slabdata 75664 75664 0 kmalloc-256 69472 69472 256 32 2 : tunables 0 0 0 : slabdata 2171 2171 0 kmalloc-192 10754576 10754604 192 42 2 : tunables 0 0 0 : slabdata 256062 256062 0 kmalloc-128 253920 253920 128 32 1 : tunables 0 0 0 : slabdata 7935 7935 0 kmalloc-96 9011196 9014166 96 42 1 : tunables 0 0 0 : slabdata 214623 214623 0 kmalloc-64 16099637 16104640 64 64 1 : tunables 0 0 0 : slabdata 251635 251635 0 kmalloc-32 45461228 45461248 32 128 1 : tunables 0 0 0 : slabdata 355166 355166 0 kmalloc-16 407552 407552 16 256 1 : tunables 0 0 0 : slabdata 1592 1592 0 kmalloc-8 130048 130048 8 512 1 : tunables 0 0 0 : slabdata 254 254 0 kmem_cache_node 1349638 1349696 64 64 1 : tunables 0 0 0 : slabdata 21089 21089 0 kmem_cache 675537 675570 384 42 4 : tunables 0 0 0 : slabdata 16085 16085 0 > > ----- Mail original ----- > De: "Aaron Lauterer" > ?: "proxmoxve" > Envoy?: Vendredi 20 Septembre 2019 15:12:04 > Objet: Re: [PVE-User] Kernel Memory Leak on PVE6? > > On 9/20/19 3:04 PM, Chris Hofstaedtler | Deduktiva wrote: >> * Aaron Lauterer [190920 14:58]: >>> Curious, I do have a very similar case at the moment with a slab of ~155GB, >>> out of ~190GB RAM installed. >>> >>> I am not sure yet what causes it but things I plan to investigate are: >>> >>> * hanging NFS mount >> >> Okay, to rule storage issues out, this setup has: >> - root filesystem as ext4 on GPT >> - efi system partition >> - two LVM PVs and VGs, with all VM storage in the second LVM VG >> - no NFS, no ZFS, no Ceph, no fancy userland filesystems >> >>> * possible (PVE) service starting too many threads -> restarting each and >>> checking the memory / slab usage. >> >> Do you have a particular service in mind? > > Not at this point. I would restart all PVE services (systemctl| grep -e > "pve.*service") one by one to see if any of it will result in memory > being released by the kernel. 
>
> If that is not the case at least they are ruled out.
>>
>> Chris
>>
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From mark at tuxis.nl Wed Sep 25 15:46:37 2019
From: mark at tuxis.nl (Mark Schouten)
Date: Wed, 25 Sep 2019 13:46:37 +0000
Subject: [PVE-User] Images on CephFS?
In-Reply-To:
References:
Message-ID:

Hi,

Just noticed that this is not a PVE 6 change. It's also changed in 5.4-3. We're using this actively, which makes me wonder what will happen if we stop/start a VM using disks on CephFS...

Any way we can enable it again?

--
Mark Schouten
Tuxis B.V.
https://www.tuxis.nl/ | +31 318 200208

------ Original Message ------
From: "Mark Schouten"
To: "PVE User List"
Sent: 9/19/2019 9:15:17 AM
Subject: [PVE-User] Images on CephFS?

>
>Hi,
>
>We just built our latest cluster with PVE 6.0. We also offer CephFS
>'slow but large' storage with our clusters, on which people can create
>images for backup servers. However, it seems that in PVE 6.0, we can no
>longer use CephFS for images?
>
>
>Can anybody confirm (and explain?) or am I looking in the wrong
>direction?
>
>--
>Mark Schouten
>
>Tuxis, Ede, https://www.tuxis.nl
>
>T: +31 318 200208
>
>
>_______________________________________________
>pve-user mailing list
>pve-user at pve.proxmox.com
>https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From t.lamprecht at proxmox.com Wed Sep 25 16:03:44 2019
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Wed, 25 Sep 2019 16:03:44 +0200
Subject: [PVE-User] Images on CephFS?
In-Reply-To:
References:
Message-ID:

Hi,

On 9/25/19 3:46 PM, Mark Schouten wrote:
> Hi,
>
> Just noticed that this is not a PVE 6 change. It's also changed in 5.4-3. We're using this actively, which makes me wonder what will happen if we stop/start a VM using disks on CephFS...

huh, AFAICT we never allowed that, the git-history of the CephFS storage Plugin is quite short[0] so you can confirm yourself..
The initial commit did not allow VM/CT images either[1]..

[0]: https://git.proxmox.com/?p=pve-storage.git;a=history;f=PVE/Storage/CephFSPlugin.pm;h=c18f8c937029d46b68aeafded5ec8d0a9d9c30ad;hb=HEAD
[1]: https://git.proxmox.com/?p=pve-storage.git;a=commitdiff;h=e34ce1444359ee06f50dd6907c0937d10748ce05

>
> Any way we can enable it again?

IIRC, the rationale was that if Ceph is used, RBD will be preferred for CT/VM anyhow - but CephFS seems to be quite performant, and as all functionality should be there (or get added easily) we could enable it just fine..

Just scratching my head how you were able to use it for images if the plugin was never told to allow it..

cheers,
Thomas

>
> --
> Mark Schouten
> Tuxis B.V.
> https://www.tuxis.nl/ | +31 318 200208
>
> ------ Original Message ------
> From: "Mark Schouten"
> To: "PVE User List"
> Sent: 9/19/2019 9:15:17 AM
> Subject: [PVE-User] Images on CephFS?
>
>>
>> Hi,
>>
>> We just built our latest cluster with PVE 6.0. We also offer CephFS 'slow but large' storage with our clusters, on which people can create images for backup servers. However, it seems that in PVE 6.0, we can no longer use CephFS for images?
>>
>>
>> Can anybody confirm (and explain?) or am I looking in the wrong direction?
>>
>> --
>> Mark Schouten
>>
>> Tuxis, Ede, https://www.tuxis.nl
>>
>> T: +31 318 200208
>>
>>
>> _______________________________________________
>> pve-user mailing list
>> pve-user at pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>

From mark at tuxis.nl Wed Sep 25 16:47:20 2019
From: mark at tuxis.nl (Mark Schouten)
Date: Wed, 25 Sep 2019 14:47:20 +0000
Subject: [PVE-User] Images on CephFS?
In-Reply-To:
References:
Message-ID:

Hi,

>huh, AFAICT we never allowed that, the git-history of the CephFS
>storage Plugin is quite short[0] so you can confirm yourself..
>The initial commit did not allow VM/CT images either[1]..

Haha. That's cool. :) I'm pretty sure I never needed to 'hack' anything to allow it. I can't find an un-updated cluster to test it on.

>>Any way we can enable it again?
>
>IIRC, the rationale was that if Ceph is used, RBD will be preferred
>for CT/VM anyhow - but CephFS seems to be quite performant, and as
>all functionality should be there (or get added easily) we could
>enable it just fine..
>
>Just scratching my head how you were able to use it for images if
>the plugin was never told to allow it..

The good news is, I can create the image and configure the VM config file. It works fine, just not for a normal user. It performs fine as well, although I would recommend only raw images. I think I remember issues with qcow2 and snapshots.

--
Mark Schouten
Tuxis B.V.
https://www.tuxis.nl/ | +31 318 200208

From marcomgabriel at gmail.com Wed Sep 25 16:49:51 2019
From: marcomgabriel at gmail.com (Marco M. Gabriel)
Date: Wed, 25 Sep 2019 16:49:51 +0200
Subject: [PVE-User] Images on CephFS?
In-Reply-To:
References:
Message-ID:

Hi Mark,

as a temporary fix, you could just add a "directory" based storage that points to the CephFS mount point.

Marco

On Wed, 25 Sept 2019 at 15:49, Mark Schouten wrote:
>
> Hi,
>
> Just noticed that this is not a PVE 6 change. It's also changed in
> 5.4-3. We're using this actively, which makes me wonder what will happen
> if we stop/start a VM using disks on CephFS...
>
> Any way we can enable it again?
>
> --
> Mark Schouten
> Tuxis B.V.
> https://www.tuxis.nl/ | +31 318 200208
>
> ------ Original Message ------
> From: "Mark Schouten"
> To: "PVE User List"
> Sent: 9/19/2019 9:15:17 AM
> Subject: [PVE-User] Images on CephFS?
>
> >
> >Hi,
> >
> >We just built our latest cluster with PVE 6.0. We also offer CephFS
> >'slow but large' storage with our clusters, on which people can create
> >images for backup servers. However, it seems that in PVE 6.0, we can no
> >longer use CephFS for images?
> >
> >
> >Can anybody confirm (and explain?) or am I looking in the wrong
> >direction?
> >
> >--
> >Mark Schouten
> >
> >Tuxis, Ede, https://www.tuxis.nl
> >
> >T: +31 318 200208
> >
> >
> >_______________________________________________
> >pve-user mailing list
> >pve-user at pve.proxmox.com
> >https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From jmr.richardson at gmail.com Wed Sep 25 22:31:31 2019
From: jmr.richardson at gmail.com (JR Richardson)
Date: Wed, 25 Sep 2019 15:31:31 -0500
Subject: [PVE-User] SR-IOV Network Virtualization Question
Message-ID:

Hey All,

I'm running Proxmox on a Dell R710:
CPU(s) 16 x Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (2 Sockets)
Kernel Version Linux 4.15.18-18-pve #1 SMP PVE 4.15.18-44 (Wed, 03 Jul 2019 11:19:13 +0200)
PVE Manager Version pve-manager/5.4-11/6df3d8d0

I wanted to test the new Velo Cloud Partner Gateway KVM appliance and am curious about the requirements:

Minimum Server Requirements
To run the hypervisor:
10 Intel CPU's at 2.0 GHz or higher. The CPU must support the AES-NI, SSSE3, SSE4 and RDTSC instruction sets.
20+ GB (16 GB is required for VC Gateway VM memory)
100 GB magnetic or SSD based, persistent disk volume
2 x 1 Gbps (or higher) network interface. The physical NIC card should use the Intel 82599/82599ES chipset (for SR-IOV & DPDK support).

The CPU is OK and I think I can get my hands on the correct NIC to put in the chassis. I don't know anything about SR-IOV or DPDK, so I'm doing some research on these now.

My question is: has anyone else deployed VMs with these requirements? Are the CPU instructions exposed to the VMs in the Proxmox kernel I have loaded? Are there any caveats or potholes I should be looking for? Can I get the network cards directly exported to the VM?

Any feedback is appreciated.

Thanks.

JR
--
JR Richardson
Engineering for the Masses
Chasing the Azeotrope

From mark at openvs.co.uk Fri Sep 27 10:30:28 2019
From: mark at openvs.co.uk (Mark Adams)
Date: Fri, 27 Sep 2019 09:30:28 +0100
Subject: [PVE-User] AMD ZEN 2 (EPYC 7002 aka "rome") kernel requirements
Message-ID:

Hi All,

I'm trying out one of these new processors, and it looks like I need at least a 5.2 kernel to get some support, preferably 5.3.

At present the machine will boot into Proxmox, but IOMMU does not work, and I can see ECC memory is not working.

So my question is, what's the recommended way to get a newer kernel than is provided by the pve-kernel package? I understand that pve-kernel uses the newer Ubuntu kernel rather than the Debian Buster one, but are you building anything else into it? Will Proxmox work OK if I install the Ubuntu 5.3 kernel?

Cheers,
Mark

From t.lamprecht at proxmox.com Fri Sep 27 10:37:14 2019
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Fri, 27 Sep 2019 10:37:14 +0200
Subject: [PVE-User] AMD ZEN 2 (EPYC 7002 aka "rome") kernel requirements
In-Reply-To:
References:
Message-ID: <2de48ea2-f032-2e4d-c574-4ed966730c7e@proxmox.com>

Hi,

On 9/27/19 10:30 AM, Mark Adams wrote:
> Hi All,
>
> I'm trying out one of these new processors, and it looks like I need at
> least a 5.2 kernel to get some support, preferably 5.3.
>

We're onto a 5.3-based kernel; it may need a bit until a build gets released for testing though. But the things required for that newer platform to work will also be backported to older kernels.
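In the meantime, a quick way to check whether IOMMU and ECC actually came up after booting a given kernel is to look at the kernel log and the EDAC sysfs tree. A rough sketch - the exact dmesg strings vary by platform and firmware:

# dmesg | grep -i -e 'AMD-Vi' -e DMAR -e IOMMU
# dmesg | grep -i -e EDAC -e ECC
# ls /sys/devices/system/edac/mc/

The first command shows the AMD-Vi (or, on Intel, DMAR) initialization messages; the last should list an mc0 entry once the EDAC driver has registered the memory controller - if the directory is empty, the kernel did not pick up ECC.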
From f.gruenbichler at proxmox.com Fri Sep 27 10:37:32 2019
From: f.gruenbichler at proxmox.com (Fabian =?iso-8859-1?q?Gr=FCnbichler?=)
Date: Fri, 27 Sep 2019 10:37:32 +0200
Subject: [PVE-User] AMD ZEN 2 (EPYC 7002 aka "rome") kernel requirements
In-Reply-To:
References:
Message-ID: <1569573311.wn9j7ruavu.astroid@nora.none>

On September 27, 2019 10:30 am, Mark Adams wrote:
> Hi All,
>
> I'm trying out one of these new processors, and it looks like I need at
> least a 5.2 kernel to get some support, preferably 5.3.
>
> At present the machine will boot into Proxmox, but IOMMU does not work,
> and I can see ECC memory is not working.
>
> So my question is, what's the recommended way to get a newer kernel than is
> provided by the pve-kernel package? I understand that pve-kernel uses the
> newer Ubuntu kernel rather than the Debian Buster one, but are you building
> anything else into it? Will Proxmox work OK if I install the Ubuntu 5.3
> kernel?

these are the patches we currently ship on top of Ubuntu Disco's kernel:

https://git.proxmox.com/?p=pve-kernel.git;a=tree;f=patches/kernel;hb=refs/heads/master

another thing we add are the ZFS modules. not sure which version Ubuntu Eoan ships there.

From mark at openvs.co.uk Fri Sep 27 16:01:56 2019
From: mark at openvs.co.uk (Mark Adams)
Date: Fri, 27 Sep 2019 15:01:56 +0100
Subject: [PVE-User] AMD ZEN 2 (EPYC 7002 aka "rome") kernel requirements
In-Reply-To: <1569573311.wn9j7ruavu.astroid@nora.none>
References: <1569573311.wn9j7ruavu.astroid@nora.none>
Message-ID:

Thanks for your responses Thomas and Fabian.

On Fri, 27 Sep 2019 at 09:37, Fabian Grünbichler wrote:

> On September 27, 2019 10:30 am, Mark Adams wrote:
> > Hi All,
> >
> > I'm trying out one of these new processors, and it looks like I need at
> > least a 5.2 kernel to get some support, preferably 5.3.
> >
> > At present the machine will boot into Proxmox, but IOMMU does not work,
> > and I can see ECC memory is not working.
> >
> > So my question is, what's the recommended way to get a newer kernel than
> is
> > provided by the pve-kernel package? I understand that pve-kernel uses the
> > newer Ubuntu kernel rather than the Debian Buster one, but are you
> building
> > anything else into it? Will Proxmox work OK if I install the Ubuntu 5.3
> > kernel?
>
> these are the patches we currently ship on top of Ubuntu Disco's kernel:
>
>
> https://git.proxmox.com/?p=pve-kernel.git;a=tree;f=patches/kernel;hb=refs/heads/master
>
> another thing we add are the ZFS modules. not sure which version Ubuntu
> Eoan ships there.
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From proxmox at elchaka.de Sat Sep 28 02:15:49 2019
From: proxmox at elchaka.de (proxmox at elchaka.de)
Date: Sat, 28 Sep 2019 02:15:49 +0200
Subject: [PVE-User] ZFS live migration with HA
In-Reply-To:
References:
Message-ID: <18A7C883-7E72-4B22-B6BB-95DBA1BB10BC@elchaka.de>

If I am not wrong, you have to set up ZFS replication between the HA nodes. Then it should hopefully work...

HTH
- Mehmet

On 6 September 2019 14:57:00 CEST, Milosz Stocki via pve-user wrote:
>_______________________________________________
>pve-user mailing list
>pve-user at pve.proxmox.com
>https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From chris.hofstaedtler at deduktiva.com Sat Sep 28 15:34:40 2019
From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva)
Date: Sat, 28 Sep 2019 15:34:40 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at> <20190920130434.pjnuimppit2rrpxj@percival.namespace.at> <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com> <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>
Message-ID: <20190928133439.vgm4nia2imkn63fd@zeha.at>

* Alexandre DERUMIER [190920 16:58]:
> can send detail of
>
> cat /proc/slabinfo

I've attached a dump from today and from yesterday, both at 15:00. It appears this machine is eating about 1GB per day - bit hard to tell from the check_mk graphs.
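A rough way to compare two such snapshots and list the caches that grew the most (the file names below are just placeholders for the two attached dumps; column 2 of /proc/slabinfo is the active object count per cache):

# awk 'NR==FNR { old[$1]=$2; next } ($1 in old) && $2+0 > old[$1]+0 { print $2-old[$1], $1 }' \
    slabinfo.yesterday slabinfo.today | sort -rn | head

This prints the per-cache growth in active objects between the two dumps, largest first.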
Chris -------------- next part -------------- slabinfo - version: 2.1 # name : tunables : slabdata xfs_dqtrx 0 0 528 31 4 : tunables 0 0 0 : slabdata 0 0 0 xfs_dquot 0 0 504 32 4 : tunables 0 0 0 : slabdata 0 0 0 xfs_rui_item 0 0 696 47 8 : tunables 0 0 0 : slabdata 0 0 0 xfs_rud_item 0 0 176 46 2 : tunables 0 0 0 : slabdata 0 0 0 xfs_inode 308 340 960 34 8 : tunables 0 0 0 : slabdata 10 10 0 xfs_efd_item 185 185 440 37 4 : tunables 0 0 0 : slabdata 5 5 0 xfs_buf_item 270 270 272 30 2 : tunables 0 0 0 : slabdata 9 9 0 xfs_trans 1155 1155 232 35 2 : tunables 0 0 0 : slabdata 33 33 0 xfs_da_state 0 0 480 34 4 : tunables 0 0 0 : slabdata 0 0 0 xfs_btree_cur 432 432 224 36 2 : tunables 0 0 0 : slabdata 12 12 0 xfs_log_ticket 1540 1540 184 44 2 : tunables 0 0 0 : slabdata 35 35 0 kvm_async_pf 1200 1200 136 30 1 : tunables 0 0 0 : slabdata 40 40 0 kvm_vcpu 29 29 21376 1 8 : tunables 0 0 0 : slabdata 29 29 0 kvm_mmu_page_header 18003 18003 160 51 2 : tunables 0 0 0 : slabdata 353 353 0 x86_fpu 203 203 4160 7 8 : tunables 0 0 0 : slabdata 29 29 0 sw_flow 0 0 1952 16 8 : tunables 0 0 0 : slabdata 0 0 0 nf_conncount_rb 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0 nf_conntrack 2040 2040 320 51 4 : tunables 0 0 0 : slabdata 40 40 0 rpc_inode_cache 102 102 640 51 8 : tunables 0 0 0 : slabdata 2 2 0 ext4_groupinfo_4k 784 784 144 28 1 : tunables 0 0 0 : slabdata 28 28 0 scsi_sense_cache 1824 1824 128 32 1 : tunables 0 0 0 : slabdata 57 57 0 PINGv6 0 0 1152 28 8 : tunables 0 0 0 : slabdata 0 0 0 RAWv6 1204 1204 1152 28 8 : tunables 0 0 0 : slabdata 43 43 0 UDPv6 48775 48775 1280 25 8 : tunables 0 0 0 : slabdata 1951 1951 0 tw_sock_TCPv6 1360 1360 240 34 2 : tunables 0 0 0 : slabdata 40 40 0 request_sock_TCPv6 0 0 304 53 4 : tunables 0 0 0 : slabdata 0 0 0 TCPv6 533 533 2368 13 8 : tunables 0 0 0 : slabdata 41 41 0 kcopyd_job 0 0 3312 9 8 : tunables 0 0 0 : slabdata 0 0 0 dm_uevent 0 0 2632 12 8 : tunables 0 0 0 : slabdata 0 0 0 dm_old_clone_request 0 0 296 55 4 : tunables 0 0 0 : slabdata 0 0 0 dm_rq_target_io 0 0 120 34 1 : tunables 0 0 0 : slabdata 0 0 0 mqueue_inode_cache 36 36 896 36 8 : tunables 0 0 0 : slabdata 1 1 0 fuse_request 1640 1640 392 41 4 : tunables 0 0 0 : slabdata 40 40 0 fuse_inode 4123 4326 768 42 8 : tunables 0 0 0 : slabdata 103 103 0 ecryptfs_key_record_cache 0 0 576 28 4 : tunables 0 0 0 : slabdata 0 0 0 ecryptfs_headers 8 8 4096 8 8 : tunables 0 0 0 : slabdata 1 1 0 ecryptfs_inode_cache 0 0 960 34 8 : tunables 0 0 0 : slabdata 0 0 0 ecryptfs_dentry_info_cache 128 128 32 128 1 : tunables 0 0 0 : slabdata 1 1 0 ecryptfs_file_cache 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0 ecryptfs_auth_tok_list_item 0 0 832 39 8 : tunables 0 0 0 : slabdata 0 0 0 fat_inode_cache 90 90 728 45 8 : tunables 0 0 0 : slabdata 2 2 0 fat_cache 0 0 40 102 1 : tunables 0 0 0 : slabdata 0 0 0 squashfs_inode_cache 0 0 704 46 8 : tunables 0 0 0 : slabdata 0 0 0 jbd2_journal_head 4998 4998 120 34 1 : tunables 0 0 0 : slabdata 147 147 0 jbd2_revoke_table_s 512 512 16 256 1 : tunables 0 0 0 : slabdata 2 2 0 ext4_inode_cache 93910 102000 1080 30 8 : tunables 0 0 0 : slabdata 3400 3400 0 ext4_allocation_context 1280 1280 128 32 1 : tunables 0 0 0 : slabdata 40 40 0 ext4_pending_reservation 6912 6912 32 128 1 : tunables 0 0 0 : slabdata 54 54 0 ext4_extent_status 14382 14382 40 102 1 : tunables 0 0 0 : slabdata 141 141 0 mbcache 3942 3942 56 73 1 : tunables 0 0 0 : slabdata 54 54 0 fscrypt_info 2560 2560 64 64 1 : tunables 0 0 0 : slabdata 40 40 0 fscrypt_ctx 3400 3400 48 85 1 : tunables 0 0 0 : slabdata 40 40 0 
userfaultfd_ctx_cache 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0 dnotify_struct 5632 5632 32 128 1 : tunables 0 0 0 : slabdata 44 44 0 posix_timers_cache 1360 1360 240 34 2 : tunables 0 0 0 : slabdata 40 40 0 UNIX 409568 409568 1024 32 8 : tunables 0 0 0 : slabdata 12799 12799 0 ip4-frags 0 0 208 39 2 : tunables 0 0 0 : slabdata 0 0 0 xfrm_dst_cache 15657 15912 320 51 4 : tunables 0 0 0 : slabdata 312 312 0 xfrm_state 0 0 768 42 8 : tunables 0 0 0 : slabdata 0 0 0 PING 0 0 960 34 8 : tunables 0 0 0 : slabdata 0 0 0 RAW 1598 1598 960 34 8 : tunables 0 0 0 : slabdata 47 47 0 tw_sock_TCP 952 952 240 34 2 : tunables 0 0 0 : slabdata 28 28 0 request_sock_TCP 2120 2120 304 53 4 : tunables 0 0 0 : slabdata 40 40 0 TCP 1815 1815 2176 15 8 : tunables 0 0 0 : slabdata 121 121 0 hugetlbfs_inode_cache 583 583 616 53 8 : tunables 0 0 0 : slabdata 11 11 0 dquot 1280 1280 256 32 2 : tunables 0 0 0 : slabdata 40 40 0 eventpoll_pwq 13160 13160 72 56 1 : tunables 0 0 0 : slabdata 235 235 0 dax_cache 307 336 768 42 8 : tunables 0 0 0 : slabdata 8 8 0 request_queue 120 120 2056 15 8 : tunables 0 0 0 : slabdata 8 8 0 biovec-max 664 676 8192 4 8 : tunables 0 0 0 : slabdata 169 169 0 biovec-128 2016 2048 2048 16 8 : tunables 0 0 0 : slabdata 128 128 0 biovec-64 1440 1440 1024 32 8 : tunables 0 0 0 : slabdata 45 45 0 dmaengine-unmap-256 15 15 2112 15 8 : tunables 0 0 0 : slabdata 1 1 0 dmaengine-unmap-128 30 30 1088 30 8 : tunables 0 0 0 : slabdata 1 1 0 dmaengine-unmap-16 5124 5124 192 42 2 : tunables 0 0 0 : slabdata 122 122 0 dmaengine-unmap-2 7548143 7548928 64 64 1 : tunables 0 0 0 : slabdata 117952 117952 0 sock_inode_cache 1310649 1310649 640 51 8 : tunables 0 0 0 : slabdata 25699 25699 0 skbuff_ext_cache 1344 1344 128 32 1 : tunables 0 0 0 : slabdata 42 42 0 skbuff_fclone_cache 1696 1696 512 32 4 : tunables 0 0 0 : slabdata 53 53 0 skbuff_head_cache 347360 347520 256 32 2 : tunables 0 0 0 : slabdata 10860 10860 0 file_lock_cache 1480 1480 216 37 2 : tunables 0 0 0 : slabdata 40 40 0 net_namespace 0 0 6272 5 8 : tunables 0 0 0 : slabdata 0 0 0 shmem_inode_cache 102235 102883 696 47 8 : tunables 0 0 0 : slabdata 2189 2189 0 task_delay_info 2066979 2066979 80 51 1 : tunables 0 0 0 : slabdata 40529 40529 0 taskstats 1880 1880 344 47 4 : tunables 0 0 0 : slabdata 40 40 0 proc_dir_entry 2352 2352 192 42 2 : tunables 0 0 0 : slabdata 56 56 0 pde_opener 3306126 3306228 40 102 1 : tunables 0 0 0 : slabdata 32414 32414 0 proc_inode_cache 1175048 1187760 664 49 8 : tunables 0 0 0 : slabdata 24240 24240 0 bdev_cache 715 741 832 39 8 : tunables 0 0 0 : slabdata 19 19 0 kernfs_node_cache 2503800 2503800 136 30 1 : tunables 0 0 0 : slabdata 83460 83460 0 mnt_cache 2100 2100 384 42 4 : tunables 0 0 0 : slabdata 50 50 0 filp 5559835 5726208 256 32 2 : tunables 0 0 0 : slabdata 178944 178944 0 inode_cache 1592023 1594175 592 55 8 : tunables 0 0 0 : slabdata 28985 28985 0 dentry 2285970 2307312 192 42 2 : tunables 0 0 0 : slabdata 54936 54936 0 names_cache 320 320 4096 8 8 : tunables 0 0 0 : slabdata 40 40 0 iint_cache 0 0 120 34 1 : tunables 0 0 0 : slabdata 0 0 0 lsm_file_cache 14280 14280 24 170 1 : tunables 0 0 0 : slabdata 84 84 0 buffer_head 421193 433134 104 39 1 : tunables 0 0 0 : slabdata 11106 11106 0 uts_namespace 0 0 440 37 4 : tunables 0 0 0 : slabdata 0 0 0 nsproxy 2920 2920 56 73 1 : tunables 0 0 0 : slabdata 40 40 0 vm_area_struct 13788959 13805571 208 39 2 : tunables 0 0 0 : slabdata 353989 353989 0 mm_struct 1710780 1710780 1088 30 8 : tunables 0 0 0 : slabdata 57026 57026 0 files_cache 1820542 1820542 
704 46 8 : tunables 0 0 0 : slabdata 39577 39577 0
signal_cache 1378675 1378830 1088 30 8 : tunables 0 0 0 : slabdata 45961 45961 0
sighand_cache 595458 595470 2112 15 8 : tunables 0 0 0 : slabdata 39698 39698 0
task_struct 285237 290020 5888 5 8 : tunables 0 0 0 : slabdata 58004 58004 0
cred_jar 2559692 2559732 192 42 2 : tunables 0 0 0 : slabdata 60946 60946 0
anon_vma_chain 8805274 8806784 64 64 1 : tunables 0 0 0 : slabdata 137606 137606 0
anon_vma 3720344 3721308 88 46 1 : tunables 0 0 0 : slabdata 80898 80898 0
pid 1739680 1739680 128 32 1 : tunables 0 0 0 : slabdata 54365 54365 0
Acpi-Operand 4928 4928 72 56 1 : tunables 0 0 0 : slabdata 88 88 0
Acpi-ParseExt 1560 1560 104 39 1 : tunables 0 0 0 : slabdata 40 40 0
Acpi-State 2244 2244 80 51 1 : tunables 0 0 0 : slabdata 44 44 0
Acpi-Namespace 3366 3366 40 102 1 : tunables 0 0 0 : slabdata 33 33 0
numa_policy 62 62 264 31 2 : tunables 0 0 0 : slabdata 2 2 0
trace_event_file 4232 4232 88 46 1 : tunables 0 0 0 : slabdata 92 92 0
ftrace_event_field 5865 5865 48 85 1 : tunables 0 0 0 : slabdata 69 69 0
pool_workqueue 12753 12864 256 32 2 : tunables 0 0 0 : slabdata 402 402 0
radix_tree_node 404222 408632 584 28 4 : tunables 0 0 0 : slabdata 14600 14600 0
task_group 2040 2040 640 51 8 : tunables 0 0 0 : slabdata 40 40 0
dma-kmalloc-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-2k 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-1k 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-256 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-128 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-192 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-96 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-2k 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-1k 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-256 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-192 420 420 192 42 2 : tunables 0 0 0 : slabdata 10 10 0
kmalloc-rcl-128 6632 6688 128 32 1 : tunables 0 0 0 : slabdata 209 209 0
kmalloc-rcl-96 26339 26544 96 42 1 : tunables 0 0 0 : slabdata 632 632 0
kmalloc-rcl-64 25336 26624 64 64 1 : tunables 0 0 0 : slabdata 416 416 0
kmalloc-rcl-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-8k 1527 1548 8192 4 8 : tunables 0 0 0 : slabdata 387 387 0
kmalloc-4k 327067 327096 4096 8 8 : tunables 0 0 0 : slabdata 40887 40887 0
kmalloc-2k 215533 215552 2048 16 8 : tunables 0 0 0 : slabdata 13472 13472 0
kmalloc-1k 882909 883296 1024 32 8 : tunables 0 0 0 : slabdata 27603 27603 0
kmalloc-512 360331 360608 512 32 4 : tunables 0 0 0 : slabdata 11269 11269 0
kmalloc-256 14943 14944 256 32 2 : tunables 0 0 0 : slabdata 467 467 0
kmalloc-192 1101565 1101702 192 42 2 : tunables 0 0 0 : slabdata 26231 26231 0
kmalloc-128 27641 27680 128 32 1 : tunables 0 0 0 : slabdata 865 865 0
kmalloc-96 752313 755832 96 42 1 : tunables 0 0 0 : slabdata 17996 17996 0
kmalloc-64 1492050 1495424 64 64 1 : tunables 0 0 0 : slabdata 23366 23366 0
kmalloc-32 4731136 4731136 32 128 1 : tunables 0 0 0 : slabdata 36962 36962 0
kmalloc-16 69376 69376 16 256 1 : tunables 0 0 0 : slabdata 271 271 0
kmalloc-8 1037476 1038848 8 512 1 : tunables 0 0 0 : slabdata 2029 2029 0
kmem_cache_node 147069 147072 64 64 1 : tunables 0 0 0 : slabdata 2298 2298 0
kmem_cache 74446 74466 384 42 4 : tunables 0 0 0 : slabdata 1773 1773 0
-------------- next part --------------
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
xfs_dqtrx 0 0 528 31 4 : tunables 0 0 0 : slabdata 0 0 0
xfs_dquot 0 0 504 32 4 : tunables 0 0 0 : slabdata 0 0 0
xfs_rui_item 0 0 696 47 8 : tunables 0 0 0 : slabdata 0 0 0
xfs_rud_item 0 0 176 46 2 : tunables 0 0 0 : slabdata 0 0 0
xfs_inode 308 340 960 34 8 : tunables 0 0 0 : slabdata 10 10 0
xfs_efd_item 185 185 440 37 4 : tunables 0 0 0 : slabdata 5 5 0
xfs_buf_item 270 270 272 30 2 : tunables 0 0 0 : slabdata 9 9 0
xfs_trans 1155 1155 232 35 2 : tunables 0 0 0 : slabdata 33 33 0
xfs_da_state 0 0 480 34 4 : tunables 0 0 0 : slabdata 0 0 0
xfs_btree_cur 432 432 224 36 2 : tunables 0 0 0 : slabdata 12 12 0
xfs_log_ticket 1540 1540 184 44 2 : tunables 0 0 0 : slabdata 35 35 0
kvm_async_pf 1200 1200 136 30 1 : tunables 0 0 0 : slabdata 40 40 0
kvm_vcpu 29 29 21376 1 8 : tunables 0 0 0 : slabdata 29 29 0
kvm_mmu_page_header 18105 18105 160 51 2 : tunables 0 0 0 : slabdata 355 355 0
x86_fpu 203 203 4160 7 8 : tunables 0 0 0 : slabdata 29 29 0
sw_flow 0 0 1952 16 8 : tunables 0 0 0 : slabdata 0 0 0
nf_conncount_rb 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
nf_conntrack 2040 2040 320 51 4 : tunables 0 0 0 : slabdata 40 40 0
rpc_inode_cache 102 102 640 51 8 : tunables 0 0 0 : slabdata 2 2 0
ext4_groupinfo_4k 784 784 144 28 1 : tunables 0 0 0 : slabdata 28 28 0
scsi_sense_cache 1824 1824 128 32 1 : tunables 0 0 0 : slabdata 57 57 0
PINGv6 0 0 1152 28 8 : tunables 0 0 0 : slabdata 0 0 0
RAWv6 1204 1204 1152 28 8 : tunables 0 0 0 : slabdata 43 43 0
UDPv6 55650 55650 1280 25 8 : tunables 0 0 0 : slabdata 2226 2226 0
tw_sock_TCPv6 1360 1360 240 34 2 : tunables 0 0 0 : slabdata 40 40 0
request_sock_TCPv6 0 0 304 53 4 : tunables 0 0 0 : slabdata 0 0 0
TCPv6 533 533 2368 13 8 : tunables 0 0 0 : slabdata 41 41 0
kcopyd_job 0 0 3312 9 8 : tunables 0 0 0 : slabdata 0 0 0
dm_uevent 0 0 2632 12 8 : tunables 0 0 0 : slabdata 0 0 0
dm_old_clone_request 0 0 296 55 4 : tunables 0 0 0 : slabdata 0 0 0
dm_rq_target_io 0 0 120 34 1 : tunables 0 0 0 : slabdata 0 0 0
mqueue_inode_cache 36 36 896 36 8 : tunables 0 0 0 : slabdata 1 1 0
fuse_request 1640 1640 392 41 4 : tunables 0 0 0 : slabdata 40 40 0
fuse_inode 4123 4326 768 42 8 : tunables 0 0 0 : slabdata 103 103 0
ecryptfs_key_record_cache 0 0 576 28 4 : tunables 0 0 0 : slabdata 0 0 0
ecryptfs_headers 8 8 4096 8 8 : tunables 0 0 0 : slabdata 1 1 0
ecryptfs_inode_cache 0 0 960 34 8 : tunables 0 0 0 : slabdata 0 0 0
ecryptfs_dentry_info_cache 128 128 32 128 1 : tunables 0 0 0 : slabdata 1 1 0
ecryptfs_file_cache 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
ecryptfs_auth_tok_list_item 0 0 832 39 8 : tunables 0 0 0 : slabdata 0 0 0
fat_inode_cache 90 90 728 45 8 : tunables 0 0 0 : slabdata 2 2 0
fat_cache 0 0 40 102 1 : tunables 0 0 0 : slabdata 0 0 0
squashfs_inode_cache 0 0 704 46 8 : tunables 0 0 0 : slabdata 0 0 0
jbd2_journal_head 5338 5338 120 34 1 : tunables 0 0 0 : slabdata 157 157 0
jbd2_revoke_table_s 512 512 16 256 1 : tunables 0 0 0 : slabdata 2 2 0
ext4_inode_cache 103557 113790 1080 30 8 : tunables 0 0 0 : slabdata 3793 3793 0
ext4_allocation_context 1280 1280 128 32 1 : tunables 0 0 0 : slabdata 40 40 0
ext4_pending_reservation 7168 7168 32 128 1 : tunables 0 0 0 : slabdata 56 56 0
ext4_extent_status 13848 15606 40 102 1 : tunables 0 0 0 : slabdata 153 153 0
mbcache 4380 4380 56 73 1 : tunables 0 0 0 : slabdata 60 60 0
fscrypt_info 2560 2560 64 64 1 : tunables 0 0 0 : slabdata 40 40 0
fscrypt_ctx 3400 3400 48 85 1 : tunables 0 0 0 : slabdata 40 40 0
userfaultfd_ctx_cache 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0
dnotify_struct 5632 5632 32 128 1 : tunables 0 0 0 : slabdata 44 44 0
posix_timers_cache 1360 1360 240 34 2 : tunables 0 0 0 : slabdata 40 40 0
UNIX 465504 465504 1024 32 8 : tunables 0 0 0 : slabdata 14547 14547 0
ip4-frags 0 0 208 39 2 : tunables 0 0 0 : slabdata 0 0 0
xfrm_dst_cache 16830 17238 320 51 4 : tunables 0 0 0 : slabdata 338 338 0
xfrm_state 0 0 768 42 8 : tunables 0 0 0 : slabdata 0 0 0
PING 0 0 960 34 8 : tunables 0 0 0 : slabdata 0 0 0
RAW 1598 1598 960 34 8 : tunables 0 0 0 : slabdata 47 47 0
tw_sock_TCP 952 952 240 34 2 : tunables 0 0 0 : slabdata 28 28 0
request_sock_TCP 2120 2120 304 53 4 : tunables 0 0 0 : slabdata 40 40 0
TCP 1845 1845 2176 15 8 : tunables 0 0 0 : slabdata 123 123 0
hugetlbfs_inode_cache 583 583 616 53 8 : tunables 0 0 0 : slabdata 11 11 0
dquot 1280 1280 256 32 2 : tunables 0 0 0 : slabdata 40 40 0
eventpoll_pwq 13216 13216 72 56 1 : tunables 0 0 0 : slabdata 236 236 0
dax_cache 307 336 768 42 8 : tunables 0 0 0 : slabdata 8 8 0
request_queue 120 120 2056 15 8 : tunables 0 0 0 : slabdata 8 8 0
biovec-max 640 648 8192 4 8 : tunables 0 0 0 : slabdata 162 162 0
biovec-128 2080 2080 2048 16 8 : tunables 0 0 0 : slabdata 130 130 0
biovec-64 1440 1440 1024 32 8 : tunables 0 0 0 : slabdata 45 45 0
dmaengine-unmap-256 15 15 2112 15 8 : tunables 0 0 0 : slabdata 1 1 0
dmaengine-unmap-128 30 30 1088 30 8 : tunables 0 0 0 : slabdata 1 1 0
dmaengine-unmap-16 5166 5166 192 42 2 : tunables 0 0 0 : slabdata 123 123 0
dmaengine-unmap-2 7536079 7570944 64 64 1 : tunables 0 0 0 : slabdata 118296 118296 0
sock_inode_cache 1494300 1494300 640 51 8 : tunables 0 0 0 : slabdata 29300 29300 0
skbuff_ext_cache 1344 1344 128 32 1 : tunables 0 0 0 : slabdata 42 42 0
skbuff_fclone_cache 1696 1696 512 32 4 : tunables 0 0 0 : slabdata 53 53 0
skbuff_head_cache 391008 391232 256 32 2 : tunables 0 0 0 : slabdata 12226 12226 0
file_lock_cache 1480 1480 216 37 2 : tunables 0 0 0 : slabdata 40 40 0
net_namespace 0 0 6272 5 8 : tunables 0 0 0 : slabdata 0 0 0
shmem_inode_cache 115066 115714 696 47 8 : tunables 0 0 0 : slabdata 2462 2462 0
task_delay_info 2363187 2363187 80 51 1 : tunables 0 0 0 : slabdata 46337 46337 0
taskstats 1880 1880 344 47 4 : tunables 0 0 0 : slabdata 40 40 0
proc_dir_entry 2352 2352 192 42 2 : tunables 0 0 0 : slabdata 56 56 0
pde_opener 3777570 3777672 40 102 1 : tunables 0 0 0 : slabdata 37036 37036 0
proc_inode_cache 1338998 1353625 664 49 8 : tunables 0 0 0 : slabdata 27625 27625 0
bdev_cache 715 741 832 39 8 : tunables 0 0 0 : slabdata 19 19 0
kernfs_node_cache 2821263 2821320 136 30 1 : tunables 0 0 0 : slabdata 94044 94044 0
mnt_cache 2184 2184 384 42 4 : tunables 0 0 0 : slabdata 52 52 0
filp 6359482 6550112 256 32 2 : tunables 0 0 0 : slabdata 204691 204691 0
inode_cache 1818494 1820225 592 55 8 : tunables 0 0 0 : slabdata 33095 33095 0
dentry 2605094 2623740 192 42 2 : tunables 0 0 0 : slabdata 62470 62470 0
names_cache 320 320 4096 8 8 : tunables 0 0 0 : slabdata 40 40 0
iint_cache 0 0 120 34 1 : tunables 0 0 0 : slabdata 0 0 0
lsm_file_cache 14450 14450 24 170 1 : tunables 0 0 0 : slabdata 85 85 0
buffer_head 463822 473967 104 39 1 : tunables 0 0 0 : slabdata 12153 12153 0
uts_namespace 0 0 440 37 4 : tunables 0 0 0 : slabdata 0 0 0
nsproxy 2920 2920 56 73 1 : tunables 0 0 0 : slabdata 40 40 0
vm_area_struct 15765050 15784158 208 39 2 : tunables 0 0 0 : slabdata 404722 404722 0
mm_struct 1958011 1958040 1088 30 8 : tunables 0 0 0 : slabdata 65268 65268 0
files_cache 2082328 2082328 704 46 8 : tunables 0 0 0 : slabdata 45268 45268 0
signal_cache 1576061 1576110 1088 30 8 : tunables 0 0 0 : slabdata 52537 52537 0
sighand_cache 680677 680805 2112 15 8 : tunables 0 0 0 : slabdata 45387 45387 0
task_struct 326007 331250 5888 5 8 : tunables 0 0 0 : slabdata 66250 66250 0
cred_jar 2929292 2929332 192 42 2 : tunables 0 0 0 : slabdata 69746 69746 0
anon_vma_chain 10029976 10031552 64 64 1 : tunables 0 0 0 : slabdata 156743 156743 0
anon_vma 4234576 4235266 88 46 1 : tunables 0 0 0 : slabdata 92071 92071 0
pid 1989408 1989408 128 32 1 : tunables 0 0 0 : slabdata 62169 62169 0
Acpi-Operand 4928 4928 72 56 1 : tunables 0 0 0 : slabdata 88 88 0
Acpi-ParseExt 1560 1560 104 39 1 : tunables 0 0 0 : slabdata 40 40 0
Acpi-State 2244 2244 80 51 1 : tunables 0 0 0 : slabdata 44 44 0
Acpi-Namespace 3366 3366 40 102 1 : tunables 0 0 0 : slabdata 33 33 0
numa_policy 62 62 264 31 2 : tunables 0 0 0 : slabdata 2 2 0
trace_event_file 4232 4232 88 46 1 : tunables 0 0 0 : slabdata 92 92 0
ftrace_event_field 5865 5865 48 85 1 : tunables 0 0 0 : slabdata 69 69 0
pool_workqueue 12361 12512 256 32 2 : tunables 0 0 0 : slabdata 391 391 0
radix_tree_node 431129 437276 584 28 4 : tunables 0 0 0 : slabdata 15623 15623 0
task_group 2040 2040 640 51 8 : tunables 0 0 0 : slabdata 40 40 0
dma-kmalloc-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-2k 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-1k 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-256 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-128 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-192 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-96 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-2k 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-1k 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-256 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-192 420 420 192 42 2 : tunables 0 0 0 : slabdata 10 10 0
kmalloc-rcl-128 6752 6752 128 32 1 : tunables 0 0 0 : slabdata 211 211 0
kmalloc-rcl-96 25417 25788 96 42 1 : tunables 0 0 0 : slabdata 614 614 0
kmalloc-rcl-64 25422 26176 64 64 1 : tunables 0 0 0 : slabdata 409 409 0
kmalloc-rcl-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-rcl-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-8k 1552 1560 8192 4 8 : tunables 0 0 0 : slabdata 390 390 0
kmalloc-4k 373769 373808 4096 8 8 : tunables 0 0 0 : slabdata 46726 46726 0
kmalloc-2k 247249 247264 2048 16 8 : tunables 0 0 0 : slabdata 15454 15454 0
kmalloc-1k 1010825 1011040 1024 32 8 : tunables 0 0 0 : slabdata 31595 31595 0
kmalloc-512 403433 403808 512 32 4 : tunables 0 0 0 : slabdata 12619 12619 0
kmalloc-256 16239 16512 256 32 2 : tunables 0 0 0 : slabdata 516 516 0
kmalloc-192 1260365 1260462 192 42 2 : tunables 0 0 0 : slabdata 30011 30011 0
kmalloc-128 30795 30880 128 32 1 : tunables 0 0 0 : slabdata 965 965 0
kmalloc-96 860672 865662 96 42 1 : tunables 0 0 0 : slabdata 20611 20611 0
kmalloc-64 1699569 1701824 64 64 1 : tunables 0 0 0 : slabdata 26591 26591 0
kmalloc-32 5300345 5300480 32 128 1 : tunables 0 0 0 : slabdata 41410 41410 0
kmalloc-16 75008 75008 16 256 1 : tunables 0 0 0 : slabdata 293 293 0
kmalloc-8 1176740 1178112 8 512 1 : tunables 0 0 0 : slabdata 2301 2301 0
kmem_cache_node 165847 165888 64 64 1 : tunables 0 0 0 : slabdata 2592 2592 0
kmem_cache 84070 84168 384 42 4 : tunables 0 0 0 : slabdata 2004 2004 0
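
A quick way to compare two such /proc/slabinfo dumps is to diff the
active-object counts and rank the caches by growth — a minimal sketch,
assuming the two dumps were saved as slabinfo.old and slabinfo.new
(hypothetical file names; the first two header lines are skipped):

# awk 'NR==FNR {if (FNR>2) old[$1]=$2; next} FNR>2 {print $2-old[$1], $1}' slabinfo.old slabinfo.new | sort -rn | head

In the dumps above, for example, task_struct goes from 285237 to 326007
active objects, signal_cache from 1378675 to 1576061 and cred_jar from
2559692 to 2929292 — roughly 14% higher across the task-related caches
in the second dump, so the growth is not confined to a single driver
cache.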
From chris.hofstaedtler at deduktiva.com  Sat Sep 28 20:01:21 2019
From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva)
Date: Sat, 28 Sep 2019 20:01:21 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
Message-ID: <20190928180121.c44smlrz47echxt6@zeha.at>

* Chris Hofstaedtler | Deduktiva [190920 14:31]:
> This machine has the same (except CPU) hardware as the box next to
> it; however this one was freshly installed with PVE6, the other one
> is an upgrade from PVE5 and doesn't exhibit this problem. It's quite
> puzzling because I haven't seen this symptom at all at any of the
> customer installations.

Turns out the upgraded-from-PVE5 machine also shows these symptoms,
it's just not as noticeable. And I've found one more machine at a
customer site showing the same problem. I can't really make out a
pattern in the varying configurations, though.

Chris

From info at aminvakil.com  Sun Sep 29 10:05:32 2019
From: info at aminvakil.com (Amin Vakil)
Date: Sun, 29 Sep 2019 11:35:32 +0330
Subject: [PVE-User] CentOS 8 Linux Installation error with scsi
Message-ID: <4c06caa9-c6c3-5ff3-6be2-df619674811b@aminvakil.com>

It seems that the CentOS 8 ISO cannot be installed if the bus of the
CD/DVD drive in Proxmox is set to SCSI. I checked both
CentOS-8-x86_64-1905-boot.iso and CentOS-8-x86_64-1905-dvd1.iso (and
verified their checksums), but when I try to install, the installer
fails with the error "/dev/root does not exist". It works fine if I
set the bus of the CD/DVD drive to SATA.
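
The same workaround can also be applied from the Proxmox CLI — a rough
sketch, assuming a VM with ID 100 whose CD/DVD drive sits on slot scsi2
and whose ISO lives on the 'local' storage (VM ID, slot and storage
name are examples only, not taken from the report):

# qm set 100 --delete scsi2
# qm set 100 --sata0 local:iso/CentOS-8-x86_64-1905-dvd1.iso,media=cdrom

This only works around the symptom; it does not explain why the
installer fails to find the medium on a SCSI bus in the first place.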