From uwe.sauter.de at gmail.com  Tue Sep  3 09:18:09 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Tue, 3 Sep 2019 09:18:09 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
Message-ID: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>

Hi all,

on a freshly installed PVE 6 my /etc/aliases looks like:

# cat /etc/aliases
postmaster: root
nobody: root
hostmaster: root
webmaster: root
www:root

and I get this output from mailq

# mailq
-Queue ID-  --Size-- ----Arrival Time---- -Sender/Recipient-------
2F38327892     5452 Fri Aug 30 23:25:46  MAILER-DAEMON
                                                  (alias database unavailable)
                                         root at px-golf.localdomain

30E0F27893     5548 Fri Aug 30 23:25:46  MAILER-DAEMON
                                                  (alias database unavailable)
                                         root at px-golf.localdomain



If I change the last line in the aliases file to "www: root" (with a space after the colon, as the format described in the man page requires), recreate
the alias database and flush the mail queues, everything looks fine.

# sed -i -e 's,www:root,www: root,g' /etc/aliases
# newaliases
# postqueue -f
# mailq
Mail queue is empty


It looks like the package that adds the www entry introduces this syntax error.


Regards,

	Uwe


From t.lamprecht at proxmox.com  Tue Sep  3 11:46:03 2019
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Tue, 3 Sep 2019 11:46:03 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>
Message-ID: <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com>

Hi Uwe,

On 03.09.19 09:18, Uwe Sauter wrote:
> Hi all,
> 
> on a freshly installed PVE 6 my /etc/aliases looks like:
> 
> # cat /etc/aliases
> postmaster: root
> nobody: root
> hostmaster: root
> webmaster: root
> www:root
> 
> and I get this output from mailq
> 
> # mailq
> -Queue ID-  --Size-- ----Arrival Time---- -Sender/Recipient-------
> 2F38327892     5452 Fri Aug 30 23:25:46  MAILER-DAEMON
>                                                   (alias database unavailable)
>                                          root at px-golf.localdomain
> 
> 30E0F27893     5548 Fri Aug 30 23:25:46  MAILER-DAEMON
>                                                   (alias database unavailable)
>                                          root at px-golf.localdomain
> 
> 
> 
> If I change the last line in the aliases file to "www: root" (with a space as the format requires as the man page says), recreate
> the alias database and flush the mail queues, everything looks fine.
> 
> # sed -i -e 's,www:root,www: root,g' /etc/aliases
> # newaliases
> # postqueue -f
> # mailq
> Mail queue is empty
> 
> 
> Looks like the package that adds the www entry makes an error.


Yes, you're right! Many thanks for the report, fixed for the next ISO release.

@Fabian: we should probably do a postinst hook which fixes this up?

Doing
# sed -i -e 's/^www:root$/www: root/' /etc/aliases

at a single package version transition should be enough.
I'd say checksumming the file to check whether it was modified since shipping is
not really required, as any entry matching that pattern is simply not correct.

cheers,
Thomas



From f.gruenbichler at proxmox.com  Tue Sep  3 12:09:32 2019
From: f.gruenbichler at proxmox.com (Fabian =?iso-8859-1?q?Gr=FCnbichler?=)
Date: Tue, 03 Sep 2019 12:09:32 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To: <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com>
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>
 <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com>
Message-ID: <1567505336.1gjyizyjik.astroid@nora.none>

On September 3, 2019 11:46 am, Thomas Lamprecht wrote:
> Hi Uwe,
> 
> On 03.09.19 09:18, Uwe Sauter wrote:
>> Hi all,
>> 
>> on a freshly installed PVE 6 my /etc/aliases looks like:
>> 
>> # cat /etc/aliases
>> postmaster: root
>> nobody: root
>> hostmaster: root
>> webmaster: root
>> www:root
>> 
>> and I get this output from mailq
>> 
>> # mailq
>> -Queue ID-  --Size-- ----Arrival Time---- -Sender/Recipient-------
>> 2F38327892     5452 Fri Aug 30 23:25:46  MAILER-DAEMON
>>                                                   (alias database unavailable)
>>                                          root at px-golf.localdomain
>> 
>> 30E0F27893     5548 Fri Aug 30 23:25:46  MAILER-DAEMON
>>                                                   (alias database unavailable)
>>                                          root at px-golf.localdomain
>> 
>> 
>> 
>> If I change the last line in the aliases file to "www: root" (with a space as the format requires as the man page says), recreate
>> the alias database and flush the mail queues, everything looks fine.
>> 
>> # sed -i -e 's,www:root,www: root,g' /etc/aliases
>> # newaliases
>> # postqueue -f
>> # mailq
>> Mail queue is empty
>> 
>> 
>> Looks like the package that adds the www entry makes an error.
> 
> 
> Yes, you're right! Much thanks for the report, fixed for the next ISO release.
> 
> @Fabian: we should probably do a postinst hook which fixes this up?
> 
> Doing
> # sed -i -e 's/^www:root$/www: root/' /etc/aliases
> 
> at one single package version transition could be enough.
> I'd say checksum matching the file to see if it was modified since shipping is
> not really required, as such matched entries are really not correct.
> 
> cheers,
> Thomas

sounds good to me.



From uwe.sauter.de at gmail.com  Tue Sep  3 12:14:29 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Tue, 3 Sep 2019 12:14:29 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To: <1567505336.1gjyizyjik.astroid@nora.none>
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>
 <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com>
 <1567505336.1gjyizyjik.astroid@nora.none>
Message-ID: <42282602-da0f-91e8-4858-7a8bf3834b08@gmail.com>

On 03.09.19 at 12:09, Fabian Grünbichler wrote:
> On September 3, 2019 11:46 am, Thomas Lamprecht wrote:
>> Hi Uwe,
>>
>> On 03.09.19 09:18, Uwe Sauter wrote:
>>> Hi all,
>>>
>>> on a freshly installed PVE 6 my /etc/aliases looks like:
>>>
>>> # cat /etc/aliases
>>> postmaster: root
>>> nobody: root
>>> hostmaster: root
>>> webmaster: root
>>> www:root
>>>
>>> and I get this output from mailq
>>>
>>> # mailq
>>> -Queue ID-  --Size-- ----Arrival Time---- -Sender/Recipient-------
>>> 2F38327892     5452 Fri Aug 30 23:25:46  MAILER-DAEMON
>>>                                                   (alias database unavailable)
>>>                                          root at px-golf.localdomain
>>>
>>> 30E0F27893     5548 Fri Aug 30 23:25:46  MAILER-DAEMON
>>>                                                   (alias database unavailable)
>>>                                          root at px-golf.localdomain
>>>
>>>
>>>
>>> If I change the last line in the aliases file to "www: root" (with a space as the format requires as the man page says), recreate
>>> the alias database and flush the mail queues, everything looks fine.
>>>
>>> # sed -i -e 's,www:root,www: root,g' /etc/aliases
>>> # newaliases
>>> # postqueue -f
>>> # mailq
>>> Mail queue is empty
>>>
>>>
>>> Looks like the package that adds the www entry makes an error.
>>
>>
>> Yes, you're right! Much thanks for the report, fixed for the next ISO release.
>>
>> @Fabian: we should probably do a postinst hook which fixes this up?
>>
>> Doing
>> # sed -i -e 's/^www:root$/www: root/' /etc/aliases
>>
>> at one single package version transition could be enough.
>> I'd say checksum matching the file to see if it was modified since shipping is
>> not really required, as such matched entries are really not correct.
>>
>> cheers,
>> Thomas
> 
> sounds good to me.
> 

I'd suggest doing:

sed -i -e 's/^www:/www: /' /etc/aliases

so that lines that were changed by a user are also caught.


From lae at lae.is  Tue Sep  3 12:39:05 2019
From: lae at lae.is (Musee Ullah)
Date: Tue, 3 Sep 2019 03:39:05 -0700
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To: <42282602-da0f-91e8-4858-7a8bf3834b08@gmail.com>
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>
 <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com>
 <1567505336.1gjyizyjik.astroid@nora.none>
 <42282602-da0f-91e8-4858-7a8bf3834b08@gmail.com>
Message-ID: <c4e8edb1-73fa-74e5-4758-4adf574d0bd0@lae.is>

On 2019/09/03 3:14, Uwe Sauter wrote:
> I'd suggest to do:
> sed -i -e 's/^www:/www: /' /etc/aliases
>
> so that lines that were changed by a user are also caught.

Just pointing out that with the above, consecutive package updates will keep
adding more spaces, since it doesn't check whether there is already a space
after the colon.

sed -E -i -e 's/^www:(\w)/www: \1/' /etc/aliases
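
A quick sanity check on a scratch copy (assuming /etc/aliases still contains the broken "www:root" line; the /tmp path is just an example) shows that the second run is a no-op, since \w does not match the space:

# cp /etc/aliases /tmp/aliases.test
# sed -E -i -e 's/^www:(\w)/www: \1/' /tmp/aliases.test
# sed -E -i -e 's/^www:(\w)/www: \1/' /tmp/aliases.test
# grep '^www:' /tmp/aliases.test
www: root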



From t.lamprecht at proxmox.com  Tue Sep  3 12:48:43 2019
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Tue, 3 Sep 2019 12:48:43 +0200
Subject: [PVE-User] Bug report: Syntax error in /etc/aliases
In-Reply-To: <mailman.129.1567507153.416.pve-user@pve.proxmox.com>
References: <471b41e8-2ca1-e06f-4b39-741b4ed5909e@gmail.com>
 <91c35e78-2123-45e3-c951-8f582e46ec38@proxmox.com>
 <1567505336.1gjyizyjik.astroid@nora.none>
 <42282602-da0f-91e8-4858-7a8bf3834b08@gmail.com>
 <mailman.129.1567507153.416.pve-user@pve.proxmox.com>
Message-ID: <6c810fb5-3523-25b1-415f-540c9706d59e@proxmox.com>

On 03.09.19 12:39, Musee Ullah via pve-user wrote:
> On 2019/09/03 3:14, Uwe Sauter wrote:
>> I'd suggest to do:
>> sed -i -e 's/^www:/www: /' /etc/aliases
>>
>> so that lines that were changed by a user are also caught.
> 
> just pointing out that consecutive package updates'll continuously add
> more spaces with the above since it doesn't check if there's already a
> space.
> 
> sed -E -i -e 's/^www:(\w)/www: \1/' /etc/aliases
> 
> 

That's why I said "at one single package version transition". Independent
of what exactly we finally do, I'd always guard it with a version check
inside a postinst debhelper script, e.g., like:


if dpkg --compare-versions "$2" 'lt' '6.0-X'; then
    sed ...
fi

Thus it happens only if an upgrade transitions from a version older than "6.0-X"
(no matter how old) to a version equal to or newer than "6.0-X".
There's no point in checking every time; if an admin changed it back to something
"bad" then it was probably wanted, or at least not our fault like it is
here. :)
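
For the record, the full hook I have in mind would look roughly like this (the version number is still a placeholder and the exact guard may change):

case "$1" in
    configure)
        # $2 is the previously installed version; it is empty on fresh installs
        if test -n "$2" && dpkg --compare-versions "$2" 'lt' '6.0-X'; then
            if test -f /etc/aliases && grep -q '^www:root$' /etc/aliases; then
                # fix the entry we shipped without a space and rebuild the alias db
                sed -i -e 's/^www:root$/www: root/' /etc/aliases
                newaliases || true
            fi
        fi
        ;;
esac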

But your suggestion itself would work fine, in general.

cheers,
Thomas



From uwe.sauter.de at gmail.com  Fri Sep  6 10:41:18 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 6 Sep 2019 10:41:18 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
Message-ID: <f575cae5-3f5f-9f08-f5f0-06d5d6426823@gmail.com>

Hi,

I'm having trouble moving a disk image to Ceph. Moving between local disks and NFS share is working.

The error given is:

########
create full clone of drive scsi0 (aurel-cluster1-VMs:112/vm-112-disk-0.qcow2)
rbd: create error: (17) File exists
TASK ERROR: storage migration failed: error with cfs lock 'storage-vdisks_vm': rbd create vm-112-disk-0' error: rbd: create error:
(17) File exists
########

but this is not true:

########
root at px-bravo-cluster:~# rbd -p vdisks ls
vm-106-disk-0
vm-113-disk-0
vm-113-disk-1
vm-113-disk-2
vm-118-disk-0
vm-119-disk-0
vm-120-disk-0
vm-125-disk-0
vm-125-disk-1
########

Here is the relevant part of my storage.cfg:

########
nfs: aurel-cluster1-VMs
	export /backup/proxmox-infra/VMs
	path /mnt/pve/aurel-cluster1-VMs
	server X.X.X.X
	content images
	options vers=4.2


rbd: vdisks_vm
	content images
	krbd 0
	pool vdisks
########

Looking in /etc/pve I cannot find any filename that would suggest that a lock exists. Any thoughts on this?


Thanks,

	Uwe


From a.antreich at proxmox.com  Fri Sep  6 11:32:55 2019
From: a.antreich at proxmox.com (Alwin Antreich)
Date: Fri, 6 Sep 2019 11:32:55 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To: <f575cae5-3f5f-9f08-f5f0-06d5d6426823@gmail.com>
References: <f575cae5-3f5f-9f08-f5f0-06d5d6426823@gmail.com>
Message-ID: <20190906093255.GA2458639@dona.proxmox.com>

Hello Uwe,

On Fri, Sep 06, 2019 at 10:41:18AM +0200, Uwe Sauter wrote:
> Hi,
> 
> I'm having trouble moving a disk image to Ceph. Moving between local disks and NFS share is working.
> 
> The error given is:
> 
> ########
> create full clone of drive scsi0 (aurel-cluster1-VMs:112/vm-112-disk-0.qcow2)
> rbd: create error: (17) File exists
> TASK ERROR: storage migration failed: error with cfs lock 'storage-vdisks_vm': rbd create vm-112-disk-0' error: rbd: create error:
> (17) File exists
> ########
Can you see anything in the Ceph logs? And what version (pveversion -v)
are you running?

> 
> but this is not true:
> 
> ########
> root at px-bravo-cluster:~# rbd -p vdisks ls
> vm-106-disk-0
> vm-113-disk-0
> vm-113-disk-1
> vm-113-disk-2
> vm-118-disk-0
> vm-119-disk-0
> vm-120-disk-0
> vm-125-disk-0
> vm-125-disk-1
> ########
Can you create the image by hand (rbd -p rbd create vm-112-disk-0 --size
1G)? And (rbd -p rbd rm vm-112-disk-0) for delete, ofc.

> 
> Here is the relevant part of my storage.cfg:
> 
> ########
> nfs: aurel-cluster1-VMs
> 	export /backup/proxmox-infra/VMs
> 	path /mnt/pve/aurel-cluster1-VMs
> 	server X.X.X.X
> 	content images
> 	options vers=4.2
> 
> 
> rbd: vdisks_vm
> 	content images
> 	krbd 0
> 	pool vdisks
> ########
Is this the complete storage.cfg?

--
Cheers,
Alwin



From uwe.sauter.de at gmail.com  Fri Sep  6 11:44:10 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 6 Sep 2019 11:44:10 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To: <20190906093255.GA2458639@dona.proxmox.com>
References: <f575cae5-3f5f-9f08-f5f0-06d5d6426823@gmail.com>
 <20190906093255.GA2458639@dona.proxmox.com>
Message-ID: <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>

Hello Alwin,

On 06.09.19 at 11:32, Alwin Antreich wrote:
> Hello Uwe,
> 
> On Fri, Sep 06, 2019 at 10:41:18AM +0200, Uwe Sauter wrote:
>> Hi,
>>
>> I'm having trouble moving a disk image to Ceph. Moving between local disks and NFS share is working.
>>
>> The error given is:
>>
>> ########
>> create full clone of drive scsi0 (aurel-cluster1-VMs:112/vm-112-disk-0.qcow2)
>> rbd: create error: (17) File exists
>> TASK ERROR: storage migration failed: error with cfs lock 'storage-vdisks_vm': rbd create vm-112-disk-0' error: rbd: create error:
>> (17) File exists
>> ########
> Can you see anything in the ceph logs? And on what version (pveversion
> -v) are you on?

Nothing obvious in the logs. The cluster is healthy

root at px-bravo-cluster:~# ceph status
  cluster:
    id:     982484e6-69bf-490c-9b3a-942a179e759b
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum px-alpha-cluster,px-bravo-cluster,px-charlie-cluster
    mgr: px-alpha-cluster(active), standbys: px-bravo-cluster, px-charlie-cluster
    osd: 9 osds: 9 up, 9 in

  data:
    pools:   1 pools, 128 pgs
    objects: 14.76k objects, 56.0GiB
    usage:   163GiB used, 3.99TiB / 4.15TiB avail
    pgs:     128 active+clean

  io:
    client:   2.31KiB/s wr, 0op/s rd, 0op/s wr

I'm on a fully up-to-date PVE 5.4 (all three nodes).

root at px-bravo-cluster:~# pveversion -v
proxmox-ve: 5.4-2 (running kernel: 4.15.18-20-pve)
pve-manager: 5.4-13 (running version: 5.4-13/aee6f0ec)
pve-kernel-4.15: 5.4-8
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-19-pve: 4.15.18-45
ceph: 12.2.12-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-12
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-54
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-14
libpve-storage-perl: 5.0-44
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-6
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-38
pve-container: 2.0-40
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-7
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-4
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-54
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2



>>
>> but this is not true:
>>
>> ########
>> root at px-bravo-cluster:~# rbd -p vdisks ls
>> vm-106-disk-0
>> vm-113-disk-0
>> vm-113-disk-1
>> vm-113-disk-2
>> vm-118-disk-0
>> vm-119-disk-0
>> vm-120-disk-0
>> vm-125-disk-0
>> vm-125-disk-1
>> ########
> Can you create the image by hand (rbd -p rbd create vm-112-disk-0 --size
> 1G)? And (rbd -p rbd rm vm-112-disk-0) for delete, ofc.

root at px-bravo-cluster:~# rbd -p vdisks create vm-112-disk-0 --size 1G
rbd: create error: (17) File exists
2019-09-06 11:35:20.943998 7faf704660c0 -1 librbd: rbd image vm-112-disk-0 already exists

root at px-bravo-cluster:~# rbd -p vdisks create test --size 1G

root at px-bravo-cluster:~# rbd -p vdisks ls
test
vm-106-disk-0
vm-113-disk-0
vm-113-disk-1
vm-113-disk-2
vm-118-disk-0
vm-119-disk-0
vm-120-disk-0
vm-125-disk-0
vm-125-disk-1

root at px-bravo-cluster:~# rbd -p vdisks rm test
Removing image: 100% complete...done.

root at px-bravo-cluster:~# rbd -p vdisks rm vm-112-disk-0
2019-09-06 11:36:07.570749 7eff7cff9700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (2) No such file or
directory
Removing image: 0% complete...failed.
rbd: delete error: (2) No such file or directory


> 
>>
>> Here is the relevant part of my storage.cfg:
>>
>> ########
>> nfs: aurel-cluster1-VMs
>> 	export /backup/proxmox-infra/VMs
>> 	path /mnt/pve/aurel-cluster1-VMs
>> 	server X.X.X.X
>> 	content images
>> 	options vers=4.2
>>
>>
>> rbd: vdisks_vm
>> 	content images
>> 	krbd 0
>> 	pool vdisks
>> ########
> Is this the complete storage.cfg?

No, only the parts that are relevant for this particular move. Here's the complete file:

########
rbd: vdisks_vm
	content images
	krbd 0
	pool vdisks

dir: local-hdd
	path /mnt/local
	content images,iso
	nodes px-alpha-cluster,px-bravo-cluster,px-charlie-cluster
	shared 0

nfs: aurel-cluster1-daily
	export /backup/proxmox-infra/daily
	path /mnt/pve/aurel-cluster1-daily
	server X.X.X.X
	content backup
	maxfiles 30
	options vers=4.2

nfs: aurel-cluster1-weekly
	export /backup/proxmox-infra/weekly
	path /mnt/pve/aurel-cluster1-weekly
	server X.X.X.X
	content backup
	maxfiles 30
	options vers=4.2

nfs: aurel-cluster1-VMs
	export /backup/proxmox-infra/VMs
	path /mnt/pve/aurel-cluster1-VMs
	server X.X.X.X
	content images
	options vers=4.2

nfs: aurel-cluster2-daily
	export /backup/proxmox-infra2/daily
	path /mnt/pve/aurel-cluster2-daily
	server X.X.X.X
	content backup
	maxfiles 30
	options vers=4.2

nfs: aurel-cluster2-weekly
	export /backup/proxmox-infra2/weekly
	path /mnt/pve/aurel-cluster2-weekly
	server X.X.X.X
	content backup
	maxfiles 30
	options vers=4.2

nfs: aurel-cluster2-VMs
	export /backup/proxmox-infra2/VMs
	path /mnt/pve/aurel-cluster2-VMs
	server X.X.X.X
	content images
	options vers=4.2

dir: local
	path /var/lib/vz
	content snippets,vztmpl,images,rootdir,iso
	maxfiles 0

rbd: vdisks_cluster2
	content images
	krbd 0
	monhost px-golf-cluster, px-hotel-cluster, px-india-cluster
	pool vdisks
	username admin
########

Thanks,

	Uwe

> --
> Cheers,
> Alwin
> 



From mark at openvs.co.uk  Fri Sep  6 12:09:16 2019
From: mark at openvs.co.uk (Mark Adams)
Date: Fri, 6 Sep 2019 13:09:16 +0300
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To: <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>
References: <f575cae5-3f5f-9f08-f5f0-06d5d6426823@gmail.com>
 <20190906093255.GA2458639@dona.proxmox.com>
 <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>
Message-ID: <CAHxUxjC7EhZqNNm8SWFbsWA61FF5s_9U7UsnLOQMW8DP=+2K7Q@mail.gmail.com>

Is it potentially an issue with having the same pool name on 2 different
ceph clusters?

Is there a vm-112-disk-0 on vdisks_cluster2?

On Fri, 6 Sep 2019, 12:45 Uwe Sauter, <uwe.sauter.de at gmail.com> wrote:

> Hello Alwin,
>
> Am 06.09.19 um 11:32 schrieb Alwin Antreich:
> > Hello Uwe,
> >
> > On Fri, Sep 06, 2019 at 10:41:18AM +0200, Uwe Sauter wrote:
> >> Hi,
> >>
> >> I'm having trouble moving a disk image to Ceph. Moving between local
> disks and NFS share is working.
> >>
> >> The error given is:
> >>
> >> ########
> >> create full clone of drive scsi0
> (aurel-cluster1-VMs:112/vm-112-disk-0.qcow2)
> >> rbd: create error: (17) File exists
> >> TASK ERROR: storage migration failed: error with cfs lock
> 'storage-vdisks_vm': rbd create vm-112-disk-0' error: rbd: create error:
> >> (17) File exists
> >> ########
> > Can you see anything in the ceph logs? And on what version (pveversion
> > -v) are you on?
>
> Nothing obvious in the logs. The cluster is healthy
>
> root at px-bravo-cluster:~# ceph status
>   cluster:
>     id:     982484e6-69bf-490c-9b3a-942a179e759b
>     health: HEALTH_OK
>
>   services:
>     mon: 3 daemons, quorum
> px-alpha-cluster,px-bravo-cluster,px-charlie-cluster
>     mgr: px-alpha-cluster(active), standbys: px-bravo-cluster,
> px-charlie-cluster
>     osd: 9 osds: 9 up, 9 in
>
>   data:
>     pools:   1 pools, 128 pgs
>     objects: 14.76k objects, 56.0GiB
>     usage:   163GiB used, 3.99TiB / 4.15TiB avail
>     pgs:     128 active+clean
>
>   io:
>     client:   2.31KiB/s wr, 0op/s rd, 0op/s wr
>
> I'm on a fully up-to-date PVE 5.4 (all three nodes).
>
> root at px-bravo-cluster:~# pveversion -v
> proxmox-ve: 5.4-2 (running kernel: 4.15.18-20-pve)
> pve-manager: 5.4-13 (running version: 5.4-13/aee6f0ec)
> pve-kernel-4.15: 5.4-8
> pve-kernel-4.15.18-20-pve: 4.15.18-46
> pve-kernel-4.15.18-19-pve: 4.15.18-45
> ceph: 12.2.12-pve1
> corosync: 2.4.4-pve1
> criu: 2.11.1-1~bpo90
> glusterfs-client: 3.8.8-1
> ksm-control-daemon: 1.2-2
> libjs-extjs: 6.0.1-2
> libpve-access-control: 5.1-12
> libpve-apiclient-perl: 2.0-5
> libpve-common-perl: 5.0-54
> libpve-guest-common-perl: 2.0-20
> libpve-http-server-perl: 2.0-14
> libpve-storage-perl: 5.0-44
> libqb0: 1.0.3-1~bpo9
> lvm2: 2.02.168-pve6
> lxc-pve: 3.1.0-6
> lxcfs: 3.0.3-pve1
> novnc-pve: 1.0.0-3
> proxmox-widget-toolkit: 1.0-28
> pve-cluster: 5.0-38
> pve-container: 2.0-40
> pve-docs: 5.4-2
> pve-edk2-firmware: 1.20190312-1
> pve-firewall: 3.0-22
> pve-firmware: 2.0-7
> pve-ha-manager: 2.0-9
> pve-i18n: 1.1-4
> pve-libspice-server1: 0.14.1-2
> pve-qemu-kvm: 3.0.1-4
> pve-xtermjs: 3.12.0-1
> qemu-server: 5.0-54
> smartmontools: 6.5+svn4324-1
> spiceterm: 3.0-5
> vncterm: 1.5-3
> zfsutils-linux: 0.7.13-pve1~bpo2
>
>
>
> >>
> >> but this is not true:
> >>
> >> ########
> >> root at px-bravo-cluster:~# rbd -p vdisks ls
> >> vm-106-disk-0
> >> vm-113-disk-0
> >> vm-113-disk-1
> >> vm-113-disk-2
> >> vm-118-disk-0
> >> vm-119-disk-0
> >> vm-120-disk-0
> >> vm-125-disk-0
> >> vm-125-disk-1
> >> ########
> > Can you create the image by hand (rbd -p rbd create vm-112-disk-0 --size
> > 1G)? And (rbd -p rbd rm vm-112-disk-0) for delete, ofc.
>
> root at px-bravo-cluster:~# rbd -p vdisks create vm-112-disk-0 --size 1G
> rbd: create error: (17) File exists
> 2019-09-06 11:35:20.943998 7faf704660c0 -1 librbd: rbd image vm-112-disk-0
> already exists
>
> root at px-bravo-cluster:~# rbd -p vdisks create test --size 1G
>
> root at px-bravo-cluster:~# rbd -p vdisks ls
> test
> vm-106-disk-0
> vm-113-disk-0
> vm-113-disk-1
> vm-113-disk-2
> vm-118-disk-0
> vm-119-disk-0
> vm-120-disk-0
> vm-125-disk-0
> vm-125-disk-1
>
> root at px-bravo-cluster:~# rbd -p vdisks rm test
> Removing image: 100% complete...done.
>
> root at px-bravo-cluster:~# rbd -p vdisks rm vm-112-disk-0
> 2019-09-06 11:36:07.570749 7eff7cff9700 -1 librbd::image::OpenRequest:
> failed to retreive immutable metadata: (2) No such file or
> directory
> Removing image: 0% complete...failed.
> rbd: delete error: (2) No such file or directory
>
>
> >
> >>
> >> Here is the relevant part of my storage.cfg:
> >>
> >> ########
> >> nfs: aurel-cluster1-VMs
> >>      export /backup/proxmox-infra/VMs
> >>      path /mnt/pve/aurel-cluster1-VMs
> >>      server X.X.X.X
> >>      content images
> >>      options vers=4.2
> >>
> >>
> >> rbd: vdisks_vm
> >>      content images
> >>      krbd 0
> >>      pool vdisks
> >> ########
> > Is this the complete storage.cfg?
>
> No, only the parts that are relevant for this particular move. Here's the
> complete file:
>
> ########
> rbd: vdisks_vm
>         content images
>         krbd 0
>         pool vdisks
>
> dir: local-hdd
>         path /mnt/local
>         content images,iso
>         nodes px-alpha-cluster,px-bravo-cluster,px-charlie-cluster
>         shared 0
>
> nfs: aurel-cluster1-daily
>         export /backup/proxmox-infra/daily
>         path /mnt/pve/aurel-cluster1-daily
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
>
> nfs: aurel-cluster1-weekly
>         export /backup/proxmox-infra/weekly
>         path /mnt/pve/aurel-cluster1-weekly
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
>
> nfs: aurel-cluster1-VMs
>         export /backup/proxmox-infra/VMs
>         path /mnt/pve/aurel-cluster1-VMs
>         server X.X.X.X
>         content images
>         options vers=4.2
>
> nfs: aurel-cluster2-daily
>         export /backup/proxmox-infra2/daily
>         path /mnt/pve/aurel-cluster2-daily
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
>
> nfs: aurel-cluster2-weekly
>         export /backup/proxmox-infra2/weekly
>         path /mnt/pve/aurel-cluster2-weekly
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
>
> nfs: aurel-cluster2-VMs
>         export /backup/proxmox-infra2/VMs
>         path /mnt/pve/aurel-cluster2-VMs
>         server X.X.X.X
>         content images
>         options vers=4.2
>
> dir: local
>         path /var/lib/vz
>         content snippets,vztmpl,images,rootdir,iso
>         maxfiles 0
>
> rbd: vdisks_cluster2
>         content images
>         krbd 0
>         monhost px-golf-cluster, px-hotel-cluster, px-india-cluster
>         pool vdisks
>         username admin
> ########
>
> Thanks,
>
>         Uwe
>
> > --
> > Cheers,
> > Alwin
> >
>
>


From uwe.sauter.de at gmail.com  Fri Sep  6 12:22:28 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 6 Sep 2019 12:22:28 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To: <CAHxUxjC7EhZqNNm8SWFbsWA61FF5s_9U7UsnLOQMW8DP=+2K7Q@mail.gmail.com>
References: <f575cae5-3f5f-9f08-f5f0-06d5d6426823@gmail.com>
 <20190906093255.GA2458639@dona.proxmox.com>
 <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>
 <CAHxUxjC7EhZqNNm8SWFbsWA61FF5s_9U7UsnLOQMW8DP=+2K7Q@mail.gmail.com>
Message-ID: <339bb0bc-303b-5fb3-cf4f-48143a9709d4@gmail.com>

On 06.09.19 at 12:09, Mark Adams wrote:
> Is it potentially an issue with having the same pool name on 2 different ceph clusters?

Good catch.

> is there a vm-112-disk-0 on vdisks_cluster2?

No, but disabling the second Ceph in the storage settings allowed the move to succeed. I'll need to think about naming then.

But this keeps me wondering why it only failed for this one VM, while the other six I moved today caused no problems.
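
In case someone else runs into this: to rule out a leftover image with the same name on the second cluster, one could query it directly, something like this (the keyring path is where I'd expect PVE to keep it for an external RBD storage):

# rbd ls -p vdisks -m px-golf-cluster --id admin --keyring /etc/pve/priv/ceph/vdisks_cluster2.keyring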


Thank you.

Regards,

	Uwe


> 
> On Fri, 6 Sep 2019, 12:45 Uwe Sauter, <uwe.sauter.de at gmail.com> wrote:
> 
>     Hello Alwin,
> 
>     Am 06.09.19 um 11:32 schrieb Alwin Antreich:
>     > Hello Uwe,
>     >
>     > On Fri, Sep 06, 2019 at 10:41:18AM +0200, Uwe Sauter wrote:
>     >> Hi,
>     >>
>     >> I'm having trouble moving a disk image to Ceph. Moving between local disks and NFS share is working.
>     >>
>     >> The error given is:
>     >>
>     >> ########
>     >> create full clone of drive scsi0 (aurel-cluster1-VMs:112/vm-112-disk-0.qcow2)
>     >> rbd: create error: (17) File exists
>     >> TASK ERROR: storage migration failed: error with cfs lock 'storage-vdisks_vm': rbd create vm-112-disk-0' error: rbd: create
>     error:
>     >> (17) File exists
>     >> ########
>     > Can you see anything in the ceph logs? And on what version (pveversion
>     > -v) are you on?
> 
>     Nothing obvious in the logs. The cluster is healthy
> 
>     root at px-bravo-cluster:~# ceph status
>       cluster:
>         id:     982484e6-69bf-490c-9b3a-942a179e759b
>         health: HEALTH_OK
> 
>       services:
>         mon: 3 daemons, quorum px-alpha-cluster,px-bravo-cluster,px-charlie-cluster
>         mgr: px-alpha-cluster(active), standbys: px-bravo-cluster, px-charlie-cluster
>         osd: 9 osds: 9 up, 9 in
> 
>       data:
>         pools:   1 pools, 128 pgs
>         objects: 14.76k objects, 56.0GiB
>         usage:   163GiB used, 3.99TiB / 4.15TiB avail
>         pgs:     128 active+clean
> 
>       io:
>         client:   2.31KiB/s wr, 0op/s rd, 0op/s wr
> 
>     I'm on a fully up-to-date PVE 5.4 (all three nodes).
> 
>     root at px-bravo-cluster:~# pveversion -v
>     proxmox-ve: 5.4-2 (running kernel: 4.15.18-20-pve)
>     pve-manager: 5.4-13 (running version: 5.4-13/aee6f0ec)
>     pve-kernel-4.15: 5.4-8
>     pve-kernel-4.15.18-20-pve: 4.15.18-46
>     pve-kernel-4.15.18-19-pve: 4.15.18-45
>     ceph: 12.2.12-pve1
>     corosync: 2.4.4-pve1
>     criu: 2.11.1-1~bpo90
>     glusterfs-client: 3.8.8-1
>     ksm-control-daemon: 1.2-2
>     libjs-extjs: 6.0.1-2
>     libpve-access-control: 5.1-12
>     libpve-apiclient-perl: 2.0-5
>     libpve-common-perl: 5.0-54
>     libpve-guest-common-perl: 2.0-20
>     libpve-http-server-perl: 2.0-14
>     libpve-storage-perl: 5.0-44
>     libqb0: 1.0.3-1~bpo9
>     lvm2: 2.02.168-pve6
>     lxc-pve: 3.1.0-6
>     lxcfs: 3.0.3-pve1
>     novnc-pve: 1.0.0-3
>     proxmox-widget-toolkit: 1.0-28
>     pve-cluster: 5.0-38
>     pve-container: 2.0-40
>     pve-docs: 5.4-2
>     pve-edk2-firmware: 1.20190312-1
>     pve-firewall: 3.0-22
>     pve-firmware: 2.0-7
>     pve-ha-manager: 2.0-9
>     pve-i18n: 1.1-4
>     pve-libspice-server1: 0.14.1-2
>     pve-qemu-kvm: 3.0.1-4
>     pve-xtermjs: 3.12.0-1
>     qemu-server: 5.0-54
>     smartmontools: 6.5+svn4324-1
>     spiceterm: 3.0-5
>     vncterm: 1.5-3
>     zfsutils-linux: 0.7.13-pve1~bpo2
> 
> 
> 
>     >>
>     >> but this is not true:
>     >>
>     >> ########
>     >> root at px-bravo-cluster:~# rbd -p vdisks ls
>     >> vm-106-disk-0
>     >> vm-113-disk-0
>     >> vm-113-disk-1
>     >> vm-113-disk-2
>     >> vm-118-disk-0
>     >> vm-119-disk-0
>     >> vm-120-disk-0
>     >> vm-125-disk-0
>     >> vm-125-disk-1
>     >> ########
>     > Can you create the image by hand (rbd -p rbd create vm-112-disk-0 --size
>     > 1G)? And (rbd -p rbd rm vm-112-disk-0) for delete, ofc.
> 
>     root at px-bravo-cluster:~# rbd -p vdisks create vm-112-disk-0 --size 1G
>     rbd: create error: (17) File exists
>     2019-09-06 11:35:20.943998 7faf704660c0 -1 librbd: rbd image vm-112-disk-0 already exists
> 
>     root at px-bravo-cluster:~# rbd -p vdisks create test --size 1G
> 
>     root at px-bravo-cluster:~# rbd -p vdisks ls
>     test
>     vm-106-disk-0
>     vm-113-disk-0
>     vm-113-disk-1
>     vm-113-disk-2
>     vm-118-disk-0
>     vm-119-disk-0
>     vm-120-disk-0
>     vm-125-disk-0
>     vm-125-disk-1
> 
>     root at px-bravo-cluster:~# rbd -p vdisks rm test
>     Removing image: 100% complete...done.
> 
>     root at px-bravo-cluster:~# rbd -p vdisks rm vm-112-disk-0
>     2019-09-06 11:36:07.570749 7eff7cff9700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (2) No such file or
>     directory
>     Removing image: 0% complete...failed.
>     rbd: delete error: (2) No such file or directory
> 
> 
>     >
>     >>
>     >> Here is the relevant part of my storage.cfg:
>     >>
>     >> ########
>     >> nfs: aurel-cluster1-VMs
>     >>      export /backup/proxmox-infra/VMs
>     >>      path /mnt/pve/aurel-cluster1-VMs
>     >>      server X.X.X.X
>     >>      content images
>     >>      options vers=4.2
>     >>
>     >>
>     >> rbd: vdisks_vm
>     >>      content images
>     >>      krbd 0
>     >>      pool vdisks
>     >> ########
>     > Is this the complete storage.cfg?
> 
>     No, only the parts that are relevant for this particular move. Here's the complete file:
> 
>     ########
>     rbd: vdisks_vm
>         content images
>         krbd 0
>         pool vdisks
> 
>     dir: local-hdd
>         path /mnt/local
>         content images,iso
>         nodes px-alpha-cluster,px-bravo-cluster,px-charlie-cluster
>         shared 0
> 
>     nfs: aurel-cluster1-daily
>         export /backup/proxmox-infra/daily
>         path /mnt/pve/aurel-cluster1-daily
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
> 
>     nfs: aurel-cluster1-weekly
>         export /backup/proxmox-infra/weekly
>         path /mnt/pve/aurel-cluster1-weekly
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
> 
>     nfs: aurel-cluster1-VMs
>         export /backup/proxmox-infra/VMs
>         path /mnt/pve/aurel-cluster1-VMs
>         server X.X.X.X
>         content images
>         options vers=4.2
> 
>     nfs: aurel-cluster2-daily
>         export /backup/proxmox-infra2/daily
>         path /mnt/pve/aurel-cluster2-daily
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
> 
>     nfs: aurel-cluster2-weekly
>         export /backup/proxmox-infra2/weekly
>         path /mnt/pve/aurel-cluster2-weekly
>         server X.X.X.X
>         content backup
>         maxfiles 30
>         options vers=4.2
> 
>     nfs: aurel-cluster2-VMs
>         export /backup/proxmox-infra2/VMs
>         path /mnt/pve/aurel-cluster2-VMs
>         server X.X.X.X
>         content images
>         options vers=4.2
> 
>     dir: local
>         path /var/lib/vz
>         content snippets,vztmpl,images,rootdir,iso
>         maxfiles 0
> 
>     rbd: vdisks_cluster2
>         content images
>         krbd 0
>         monhost px-golf-cluster, px-hotel-cluster, px-india-cluster
>         pool vdisks
>         username admin
>     ########
> 
>     Thanks,
> 
>         Uwe
> 
>     > --
>     > Cheers,
>     > Alwin
>     >
> 
> 



From a.antreich at proxmox.com  Fri Sep  6 12:32:48 2019
From: a.antreich at proxmox.com (Alwin Antreich)
Date: Fri, 6 Sep 2019 12:32:48 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To: <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>
References: <f575cae5-3f5f-9f08-f5f0-06d5d6426823@gmail.com>
 <20190906093255.GA2458639@dona.proxmox.com>
 <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>
Message-ID: <20190906103248.GB2458639@dona.proxmox.com>

On Fri, Sep 06, 2019 at 11:44:10AM +0200, Uwe Sauter wrote:
> root at px-bravo-cluster:~# rbd -p vdisks create vm-112-disk-0 --size 1G
> rbd: create error: (17) File exists
> 2019-09-06 11:35:20.943998 7faf704660c0 -1 librbd: rbd image vm-112-disk-0 already exists
> 
> root at px-bravo-cluster:~# rbd -p vdisks create test --size 1G
> 
> root at px-bravo-cluster:~# rbd -p vdisks ls
> test
> vm-106-disk-0
> vm-113-disk-0
> vm-113-disk-1
> vm-113-disk-2
> vm-118-disk-0
> vm-119-disk-0
> vm-120-disk-0
> vm-125-disk-0
> vm-125-disk-1
> 
> root at px-bravo-cluster:~# rbd -p vdisks rm test
> Removing image: 100% complete...done.
> 
> root at px-bravo-cluster:~# rbd -p vdisks rm vm-112-disk-0
> 2019-09-06 11:36:07.570749 7eff7cff9700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (2) No such file or
> directory
> Removing image: 0% complete...failed.
> rbd: delete error: (2) No such file or directory
It seems Ceph still has vm-112-disk-0 stored somewhere. At least this
error message should be visible in the Ceph logs; hopefully there is
more info about it there.

Does 'rbd showmapped' show the image being mapped? Are you running
filestore or bluestore OSDs?

If you migrate the VM to a different host, does it occur there too?
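
You could also check directly in the pool whether there are leftover RADOS objects for that image, e.g. something like this (pool name taken from your config):

# rados -p vdisks ls | grep vm-112-disk-0
# rbd info vdisks/vm-112-disk-0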

--
Cheers,
Alwin



From klaus.mailinglists at pernau.at  Fri Sep  6 13:32:22 2019
From: klaus.mailinglists at pernau.at (Klaus Darilion)
Date: Fri, 6 Sep 2019 13:32:22 +0200
Subject: [PVE-User] pvesr: how to achieve continuous replication logs
Message-ID: <f3cb1f18-33ad-8a47-381e-224912b8a99d@pernau.at>

Hello all!

As far as I can see, pvesr writes the log of the last replication run to
/var/log/pve/replicate/VMID, and additionally errors are logged to syslog.

For debugging purposes I would like to have the detailed replication
log (as in /var/log/pve/replicate/) kept for every replication run. For
example, either append the replication logs to an archive or also send the
replication log to syslog.

Is there a way to have the replication logs permanent?
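
The only crude workaround I can think of in the meantime is to archive the per-VMID log whenever it changes, roughly like this (untested sketch; the archive directory is arbitrary and the script would need to run from cron at least as often as the replication schedule):

#!/bin/sh
# append each replication log to an archive file whenever its content changes
ARCHIVE=/var/log/pve-replicate-archive
mkdir -p "$ARCHIVE"
for f in /var/log/pve/replicate/*; do
    [ -f "$f" ] || continue
    last="$ARCHIVE/$(basename "$f").last"
    if ! cmp -s "$f" "$last" 2>/dev/null; then
        { echo "==== $(date -Is) ===="; cat "$f"; } >> "$ARCHIVE/$(basename "$f").log"
        cp "$f" "$last"
    fi
done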

Thanks
Klaus


From uwe.sauter.de at gmail.com  Fri Sep  6 13:52:57 2019
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 6 Sep 2019 13:52:57 +0200
Subject: [PVE-User] PVE 5.4: cannot move disk image to Ceph
In-Reply-To: <20190906103248.GB2458639@dona.proxmox.com>
References: <f575cae5-3f5f-9f08-f5f0-06d5d6426823@gmail.com>
 <20190906093255.GA2458639@dona.proxmox.com>
 <7de89c5f-de3d-743f-7f97-cb0aa2c30bb6@gmail.com>
 <20190906103248.GB2458639@dona.proxmox.com>
Message-ID: <5f7703d8-e941-8111-b66d-b01c3b0a9e29@gmail.com>

On 06.09.19 at 12:32, Alwin Antreich wrote:
> On Fri, Sep 06, 2019 at 11:44:10AM +0200, Uwe Sauter wrote:
>> root at px-bravo-cluster:~# rbd -p vdisks create vm-112-disk-0 --size 1G
>> rbd: create error: (17) File exists
>> 2019-09-06 11:35:20.943998 7faf704660c0 -1 librbd: rbd image vm-112-disk-0 already exists
>>
>> root at px-bravo-cluster:~# rbd -p vdisks create test --size 1G
>>
>> root at px-bravo-cluster:~# rbd -p vdisks ls
>> test
>> vm-106-disk-0
>> vm-113-disk-0
>> vm-113-disk-1
>> vm-113-disk-2
>> vm-118-disk-0
>> vm-119-disk-0
>> vm-120-disk-0
>> vm-125-disk-0
>> vm-125-disk-1
>>
>> root at px-bravo-cluster:~# rbd -p vdisks rm test
>> Removing image: 100% complete...done.
>>
>> root at px-bravo-cluster:~# rbd -p vdisks rm vm-112-disk-0
>> 2019-09-06 11:36:07.570749 7eff7cff9700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (2) No such file or
>> directory
>> Removing image: 0% complete...failed.
>> rbd: delete error: (2) No such file or directory
> Seems Ceph has the vm-112-disk-0 still stored somewhere. At least this
> error message should be visible in the ceph logs. Hopefully there is
> more info about it.
> 
> Does the 'rbd showmapped' show the image being mapped? Are you running
> filestore or bluestore OSDs?
> 
> If you migrate the VM to a different host, does it occure there too?
> 

Neither Ceph cluster had vm-112-disk-0. I solved it by disabling the second Ceph storage entry; PVE was then able to move the disk.

Yes, it occurred on all hosts.


Thanks,

	Uwe



> --
> Cheers,
> Alwin
> 



From kulzaus at kulzaus.top  Fri Sep  6 14:57:00 2019
From: kulzaus at kulzaus.top (Milosz Stocki)
Date: Fri, 6 Sep 2019 14:57:00 +0200
Subject: ZFS live migration with HA
Message-ID: <05f001d564b2$94168dc0$bc43a940$@kulzaus.top>

Hi,

Has anyone tested the new-in-GUI option for local disk migration using ZFS
in PVE 6?

For me it works up until I enable HA for the VMs in question. It looks
like the Proxmox GUI doesn't know that I'm using local disks and won't let me
migrate, so when trying a failover to the new server it starts fine from the replicas,
but it cannot seem to move the VM back to the restored server.

I even tried running it from cli with "qm migrate 10101 main-host  --online
--with-local-disks --force" but it always (from GUI and CLI) gives me:

"2019-09-06 13:25:33 can't migrate local disk
'ZFS-main-host-SSD:vm-10101-disk-0': can't live migrate attached local disks
without with-local-disks option
2019-09-06 13:25:33 ERROR: Failed to sync data - can't migrate VM - check
log"
Is this something the Proxmox team hasn't enabled yet for HA-managed VMs, or is it
a bug? It seems like the --with-local-disks option isn't even passed on to the
HA stack, and the error is a bit vague if it is intentional.

I couldn't find it on the bug tracker.
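
For now the only manual workaround I can think of (untested) would be to temporarily remove the VM from HA, migrate it with its local disks, and add it back afterwards, roughly:

# ha-manager remove vm:10101
# qm migrate 10101 main-host --online --with-local-disks
# ha-manager add vm:10101 --state started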

 

Best Regards

Milosz Stocki

 



From martin at holub.co.at  Tue Sep 10 09:48:57 2019
From: martin at holub.co.at (Martin Holub)
Date: Tue, 10 Sep 2019 09:48:57 +0200
Subject: Nested Virtualization and Live Migration
Message-ID: <a6b14cb2a62b47500aa4d6d90007bdc6ea7252db.camel@holub.co.at>

Hi,

We activated nested virtualization support and this apparently broke
live migration. According to
https://www.linux-kvm.org/page/Nested_Guests I think this should work,
but I just see "qmp command 'migrate' failed - Nested VMX
virtualization does not support live migration yet". Is there something
I can do to fix this, or do I have to disable nested
virtualization? We are using Proxmox 6 with Debian & Ubuntu guests.

Best
Martin



From d.csapak at proxmox.com  Tue Sep 10 10:05:48 2019
From: d.csapak at proxmox.com (Dominik Csapak)
Date: Tue, 10 Sep 2019 10:05:48 +0200
Subject: [PVE-User] Nested Virtualization and Live Migration
In-Reply-To: <mailman.0.1568101743.462.pve-user@pve.proxmox.com>
References: <mailman.0.1568101743.462.pve-user@pve.proxmox.com>
Message-ID: <ce129139-92f4-1ba7-5a44-c274f35ea27f@proxmox.com>

 > https://www.linux-kvm.org/page/Nested_Guests

the page is sadly outdated.

There are efforts in the kernel and QEMU to enable real working live
migration of nested machines.

Currently, QEMU decided to disable migration altogether when nesting is
enabled and the guest has the vmx/svm flag. [0]

You can try a CPU model that does not include that flag, but
you lose nesting for that machine, of course.
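
For example, something along these lines (the VMID is just an example):

# qm set 100 --cpu kvm64

and inside the guest you can check whether the vmx/svm flag is currently visible with:

# grep -c -E 'vmx|svm' /proc/cpuinfo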

kind regards
Dominik

0: 
https://github.com/qemu/qemu/commit/d98f26073bebddcd3da0ba1b86c3a34e840c0fb8



From martin at holub.co.at  Tue Sep 10 13:56:26 2019
From: martin at holub.co.at (Martin Holub)
Date: Tue, 10 Sep 2019 13:56:26 +0200
Subject: [PVE-User] Nested Virtualization and Live Migration
In-Reply-To: <ce129139-92f4-1ba7-5a44-c274f35ea27f@proxmox.com>
References: <mailman.0.1568101743.462.pve-user@pve.proxmox.com>
 <ce129139-92f4-1ba7-5a44-c274f35ea27f@proxmox.com>
Message-ID: <41e03e3d016a5620c19b92044377c431aa7ee2e2.camel@holub.co.at>

On Tue, 2019-09-10 at 10:05 +0200, Dominik Csapak wrote:
>  > https://www.linux-kvm.org/page/Nested_Guests
> 
> the page is sadly outdated
> 
> there are efforts in kernel and qemu to enable real working live 
> migration of nested machines.
> 
> currently qemu decided to disable migration altogether when nesting
> is 
> enabled and the guest has the vmx/svm flag.[0]
> 
> you can try with a cpu model that does not include that flag, but
> you lose nesting for that machine ofc.
> 
> kind regards
> Dominik
> 
> 0: 
> 
https://github.com/qemu/qemu/commit/d98f26073bebddcd3da0ba1b86c3a34e840c0fb8
> 

Hi Dominik,

I see, thanks for the link. I will disable the nested flag again
for now. Maybe someone wants to add that to the wiki page at [1]?

Best 
Martin

[1] https://pve.proxmox.com/wiki/Nested_Virtualization



From f.cuseo at panservice.it  Tue Sep 10 20:14:03 2019
From: f.cuseo at panservice.it (Fabrizio Cuseo)
Date: Tue, 10 Sep 2019 20:14:03 +0200 (CEST)
Subject: [PVE-User] Ceph Crush Map
Message-ID: <2103160136.84979.1568139243873.JavaMail.zimbra@zimbra.panservice.it>

Hello.

I want to suggest a new feature for PVE release 6.1 :)

Scenario:  
- 3 hosts in a rack in building A
- 3 hosts in a rack in building B
- a dedicated 2 x 10Gbit connection (200 m fiber)

A single PVE cluster with 6 hosts.
A single Ceph cluster with 6 hosts (each with several OSD)

I would like to manipulate the crush map so that my pools have at least 1 copy in each building (so if I have a pool with 3 copies, I need 2 copies on different hosts in building A and 1 copy on one of the hosts in building B).

With this configuration I can obtain full redundancy of my VMs and data (or is something wrong with that reasoning?).

I can change the crush map, pools and so on manually, but with the GUI this could be simpler (also when I need to add other servers to the cluster).
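
For reference, what I do by hand today looks roughly like this (bucket and host names are placeholders; the rule itself may need tuning):

# declare the two buildings as "room" buckets and move the hosts into them
ceph osd crush add-bucket building-a room
ceph osd crush add-bucket building-b room
ceph osd crush move building-a root=default
ceph osd crush move building-b root=default
ceph osd crush move host1 room=building-a
# ... and the same for the other five hosts

# decompile the map, add a rule that picks rooms first and then hosts inside them,
# recompile it and inject it again
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# edit crush.txt: add a replicated rule with
#   step take default
#   step choose firstn 0 type room
#   step chooseleaf firstn 2 type host
#   step emit
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new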

Regards, Fabrizio 


From fatalerrors at geoffray-levasseur.org  Wed Sep 11 17:14:05 2019
From: fatalerrors at geoffray-levasseur.org (Geoffray Levasseur)
Date: Wed, 11 Sep 2019 15:14:05 +0000
Subject: [PVE-User] Ceph and NTP problems on Ryzen
Message-ID: <4cf7609c60787523584b01e878daf845@geoffray-levasseur.org>

Hi,

I have some difficulties on a Ryzen 5 2400G node. I recently switched to the new version 6 with no problems, except for performance on that machine. The new kernel was supposed to have better support, but it is actually worse. Putting a quad-port Intel 82571EB card in the PCI-Express 16x slot is probably the origin of my troubles.

When the IOMMU is activated I get very bad performance on the network card, and the server reboots unexpectedly after a few days.

When I put the IOMMU in software mode, there are no unexpected reboots anymore, but the bad performance remains.

This makes Ceph extremely slow and it constantly complains about clock skew. NTP has extreme difficulties doing its job, both on the hosted virtual machines and on the host itself.

I have read a lot about similar troubles that are now fixed for video cards, but I don't think the network card scenario has been addressed by kernel developers. Is there any workaround for my situation?

Regards,
--
Geoffray Levasseur
Technicien UPS - UMR CNRS 5566 / LEGOS - Service Informatique
        <fatalerrors at geoffray-levasseur.org>
        <geoffray.levasseur at legos.obs-mip.fr>
        http://www.geoffray-levasseur.org/
GNU/PG public key : C89E D6C4 8BFC C9F2 EEFB 908C 89C2 CD4D CD9E 23AA
Quod gratis asseritur gratis negatur.


From lists at merit.unu.edu  Wed Sep 11 20:55:50 2019
From: lists at merit.unu.edu (mj)
Date: Wed, 11 Sep 2019 20:55:50 +0200
Subject: [PVE-User] Ceph and NTP problems on Ryzen
In-Reply-To: <4cf7609c60787523584b01e878daf845@geoffray-levasseur.org>
References: <4cf7609c60787523584b01e878daf845@geoffray-levasseur.org>
Message-ID: <222decdb-3cf1-6dd1-4ce1-3a823e277d5b@merit.unu.edu>

Hi,

Not sure if this would solve your problem, but we used to have clock
skews all the time. We finally switched to chrony, and ever since, they
have disappeared.

So it seems (with us anyway) chrony does a much better job than ntp.
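
If you want to give it a try, on Buster it should basically just be (installing chrony removes ntp automatically, as far as I remember):

# apt install chrony
# chronyc tracking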

But it seems your problems are much bigger, and probably your ntp issues 
are only a symptom.

Good luck!

MJ

On 9/11/19 5:14 PM, Geoffray Levasseur wrote:
> Hi,
> 
> I have some difficulties on a Ryzen 5 2400G node. I switched recently to new version 6 with no problems except performances on that machine. The new kernel was suppose to have a better support but it actually is worse. Putting a quad port Intel 82571EB card on the PCI-Express 16x port is probably the orrigin of my troubles.
> 
> When IOMMU is activated I have very bad performances on the network card, and the server reboot after a few days unexpectingly.
> 
> When I put IOMMU in software mode, no unexpected reboot anymore but bed performances remain.
> 
> It turns Ceph extremely slow and it complain permanently with clock skew. NTP have extreme difficulties to do its job, both on hosted virtual machines and the host itself.
> 
> I read a lot about such troubles now fixed on video cards. But I dont think the network card scenario have been treated by kernel developpers. Is there any workaround for my situation?
> 
> Regards,
> --
> Geoffray Levasseur
> Technicien UPS - UMR CNRS 5566 / LEGOS - Service Informatique
>          <fatalerrors at geoffray-levasseur.org>
>          <geoffray.levasseur at legos.obs-mip.fr>
>          http://www.geoffray-levasseur.org/
> GNU/PG public key : C89E D6C4 8BFC C9F2 EEFB 908C 89C2 CD4D CD9E 23AA
> Quod gratis asseritur gratis negatur.
> 


From mike at oeg.com.au  Thu Sep 12 08:18:34 2019
From: mike at oeg.com.au (Mike O'Connor)
Date: Thu, 12 Sep 2019 15:48:34 +0930
Subject: [PVE-User] LXC not starting after V5 to V6 upgrade using ZFS for
 storage
Message-ID: <5b6baf95-4335-d725-5aea-9792b5df3e1a@oeg.com.au>

Hi All

I just finished upgrading Proxmox from V5 to V6 and have an issue
with LXCs not starting.

The issue seems to be that the LXC is started without first
mounting the ZFS subvolume.
This results in a dev directory being created, which then means ZFS will
not mount over it anymore because there are files in the mount point.

Mount does not work
Code:

root at pve:/rbd# zfs mount rbd/subvol-109-disk-0
cannot mount '/rbd/subvol-109-disk-0': directory is not empty

There is a directory created by the attempt to start the LXC
Code:

root at pve:/rbd# find /rbd/subvol-109*/
/rbd/subvol-109-disk-0/
/rbd/subvol-109-disk-0/dev

Remove the directory
Code:

rm -rf /rbd/subvol-109-disk-0/dev/

Mount the rbd volume
Code:

root at pve:/rbd# zfs mount rbd/subvol-109-disk-0

I can then start the LXC from the web page or via the cli.

Questions:
What mounts the ZFS subvol?
Is this a ZFS issue of not mounting the subvol at boot?
Should Proxmox be mounting the image?
Should Proxmox not be checking that it is mounted before starting the LXC?

I've been able to start the LXC, but after a reboot it seems I have to
manually fix the mounts again.

Thanks



From daniel at speichert.pl  Thu Sep 12 15:46:16 2019
From: daniel at speichert.pl (Daniel Speichert)
Date: Thu, 12 Sep 2019 09:46:16 -0400
Subject: [PVE-User] LXC not starting after V5 to V6 upgrade using ZFS
 for storage
In-Reply-To: <5b6baf95-4335-d725-5aea-9792b5df3e1a@oeg.com.au>
References: <5b6baf95-4335-d725-5aea-9792b5df3e1a@oeg.com.au>
Message-ID: <f016e15c-e730-398c-3db5-b67c4c8c7f9a@speichert.pl>

I've had a similar problem. It was worse because I had to unmount
everything up to the root.

I think I set the datasets for machines to automount by setting a
mountpoint attribute that was missing before.

I can't recall if that was it though. Do you have it set? zfs get
mountpoint /rbd/...

Best,
Daniel

On 9/12/2019 2:18 AM, Mike O'Connor wrote:
> HI All
>
> I just finished upgrading from V5 to V6 of Proxmox and have an issue
> with LXC 's not starting.
>
> The issue seems to be that the LXC is being started with out first
> mounting the ZFS subvolume.
> This results in a dev directory being created which then means ZFS will
> not mount over anymore because there are files in the mount point
>
> Mount does not work
> Code:
>
> root at pve:/rbd# zfs mount rbd/subvol-109-disk-0
> cannot mount '/rbd/subvol-109-disk-0': directory is not empty
>
> There is a directory created by the attempt to start the lXC
> Code:
>
> root at pve:/rbd# find /rbd/subvol-109*/
> /rbd/subvol-109-disk-0/
> /rbd/subvol-109-disk-0/dev
>
> Remove the directory
> Code:
>
> rm -rf /rbd/subvol-109-disk-0/dev/
>
> Mount the rbd volume
> Code:
>
> root at pve:/rbd# zfs mount rbd/subvol-109-disk-0
>
> I can then start the LXC from the web page or via the cli.
>
> Question:
> What mounts the zfs subvol ?
> Is this a ZFS issue of not mounting the subvol at boot ?
> Should Proxmox be mounting the image ?
> Should Proxmox not be checking its mounted before starting the LXC ?
>
> I've been able to start the LXC but after a reboot, it seems I have to
> manual fix the mounts again.
>
> Thanks
>


From mike at oeg.com.au  Fri Sep 13 10:56:09 2019
From: mike at oeg.com.au (Mike O'Connor)
Date: Fri, 13 Sep 2019 18:26:09 +0930
Subject: [PVE-User] LXC not starting after V5 to V6 upgrade using ZFS
 for storage
In-Reply-To: <f016e15c-e730-398c-3db5-b67c4c8c7f9a@speichert.pl>
References: <5b6baf95-4335-d725-5aea-9792b5df3e1a@oeg.com.au>
 <f016e15c-e730-398c-3db5-b67c4c8c7f9a@speichert.pl>
Message-ID: <857a34e4-2c73-a281-58a4-47a2e987d7a1@oeg.com.au>

On 12/9/19 11:16 pm, Daniel Speichert wrote:
> I've had a similar problem. It was worse because i had to unmount
> everything up to the root.
>
> I think I set the datasets for machines to automount by setting a
> mountpoint attribute that was missing before.
>
> I can't recall if that was it though. Do you have it set? zfs get
> mountpoint /rbd/...
>
> Best,
> Daniel

Hi Daniel

Thanks for your comments, but this was not the issue. All the subvols have
mount points.

BUT

I did find the issue: I had an fstab entry that was doing a bind mount from
the rbd pool to a normal path location.

This was causing the zfs-mount service to fail because it thought the
rbd pool had files in it.

I removed this by changing the service to use the rbd directory directly
instead of a bind mount.
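
For anyone else hitting this, something like the following should show the failing mount unit, the unmounted datasets, and any offending fstab entries (pool name from my setup):

# systemctl status zfs-mount.service
# zfs get -r mounted,mountpoint rbd
# grep rbd /etc/fstab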


Cheers
Mike


From f.cuseo at panservice.it  Fri Sep 13 21:42:06 2019
From: f.cuseo at panservice.it (Fabrizio Cuseo)
Date: Fri, 13 Sep 2019 21:42:06 +0200 (CEST)
Subject: [PVE-User] Ceph MON quorum problem
Message-ID: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>

Hello.
I am planning a 6-host cluster.

3 hosts are located in the CedA room
3 hosts are located in the CedB room 

The two rooms are connected with 2 x 10Gbit fiber (200 m); in each room I have 2 x 10Gbit stacked switches, and each host has 2 x 10Gbit links (one to each switch) for Ceph storage.

My need is a fully redundant cluster that can survive a CedA (or CedB) disaster.

I have modified the crush map so that I have an RBD pool that writes 2 copies on CedA hosts and 2 copies on CedB hosts, which gives very good redundancy (disk space is not a problem).

But if I lose one of the rooms, I can't establish the needed quorum.

Any suggestion for a quick and not too complicated way to satisfy this need?

Regards, Fabrizio




From brians at iptel.co  Sat Sep 14 15:41:36 2019
From: brians at iptel.co (Brian :)
Date: Sat, 14 Sep 2019 14:41:36 +0100
Subject: [PVE-User] Ceph MON quorum problem
In-Reply-To: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
Message-ID: <CAGPQfi_NjBv6jG2heet9rBbdYBNS8pgeYs==ujz72BNMZgFmfw@mail.gmail.com>

Have a mon that runs somewhere that isn't either of those rooms.

On Friday, September 13, 2019, Fabrizio Cuseo <f.cuseo at panservice.it> wrote:
> Hello.
> I am planning a 6 hosts cluster.
>
> 3 hosts are located in the CedA room
> 3 hosts are located in the CedB room
>
> the two rooms are connected with a 2 x 10Gbit fiber (200mt) and in each
room i have 2 x 10Gbit stacked switch and each host have a 2 x 10Gbit (one
for each switch) for Ceph storage.
>
> My need is to have a full redundancy cluster that can survive to CedA (or
CedB) disaster.
>
> I have modified the crush map, so I have a RBD Pool that writes 2 copies
in CedA hosts, and 2 copies in CedB hosts, so a very good redundancy (disk
space is not a problem).
>
> But if I loose one of the rooms, i can't establish the needed quorum.
>
> Some suggestion to have a quick and not too complicated way to satisfy my
need ?
>
> Regards, Fabrizio
>
>
>


From f.cuseo at panservice.it  Sat Sep 14 16:11:50 2019
From: f.cuseo at panservice.it (f.cuseo at panservice.it)
Date: Sat, 14 Sep 2019 16:11:50 +0200 (CEST)
Subject: [PVE-User] Ris:  Ceph MON quorum problem
In-Reply-To: <CAGPQfi_NjBv6jG2heet9rBbdYBNS8pgeYs==ujz72BNMZgFmfw@mail.gmail.com>
References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
 <CAGPQfi_NjBv6jG2heet9rBbdYBNS8pgeYs==ujz72BNMZgFmfw@mail.gmail.com>
Message-ID: <585861321.182698.1568470310925.JavaMail.zimbra@zimbra.panservice.it>

This is my last choice :)
Sent from my Huawei device
-------- Original message --------
Subject: Re: [PVE-User] Ceph MON quorum problem
From: "Brian :"
To: Fabrizio Cuseo, PVE User List
CC:


Have a mon that runs somewhere that isn't either of those rooms.

On Friday, September 13, 2019, Fabrizio Cuseo <f.cuseo at panservice.it> wrote:
> Hello.
> I am planning a 6 hosts cluster.
>
> 3 hosts are located in the CedA room
> 3 hosts are located in the CedB room
>
> the two rooms are connected with a 2 x 10Gbit fiber (200mt) and in each
room i have 2 x 10Gbit stacked switch and each host have a 2 x 10Gbit (one
for each switch) for Ceph storage.
>
> My need is to have a full redundancy cluster that can survive to CedA (or
CedB) disaster.
>
> I have modified the crush map, so I have a RBD Pool that writes 2 copies
in CedA hosts, and 2 copies in CedB hosts, so a very good redundancy (disk
space is not a problem).
>
> But if I loose one of the rooms, i can't establish the needed quorum.
>
> Some suggestion to have a quick and not too complicated way to satisfy my
need ?
>
> Regards, Fabrizio
>
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>

From jamacdon at hwy97.com  Sun Sep 15 22:55:00 2019
From: jamacdon at hwy97.com (Joe Garvey)
Date: Sun, 15 Sep 2019 13:55:00 -0700 (PDT)
Subject: [PVE-User] Empty virtual disk
Message-ID: <1584392794.132399.1568580900736.JavaMail.zimbra@talktel.ca>

Hello all, 

I had to reboot a QEMU-based VM yesterday and after rebooting it reported there was no boot disk. The disk has lost all of its content; there aren't even any partitions. I booted the VM with Acronis disk recovery and it showed the disk as uninitialized.

I restored a 6-day-old backup of the VM and it also had an empty drive. All backups have no data on the drive and are marked as uninitialized.

I tested restoring other VM's in my environment and they have no issues with the disks. 

The only difference I see is that the drives are smaller. 

The VM in question has been running flawlessly since it was deployed over 30 days ago. 


Proxmox version: 5.4-2 
SCSI controller: VirtIO SCSI 
Disk Size: 200G 
Caching is Disabled 
Storage: Dell NAS server via iSCSI 4x10GB connections (never had a problem with this) 

I'm guessing the data is gone, but any ideas what caused this?

Regards, 
Joe 




From gilberto.nunes32 at gmail.com  Mon Sep 16 03:17:11 2019
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Sun, 15 Sep 2019 22:17:11 -0300
Subject: [PVE-User] Kernel 5.3 and Proxmox Ceph nodes
Message-ID: <CAOKSTBtG6qhbsf0_wxAVLv0cD5=Abv6JL1V6iHkWhysUwPkPWg@mail.gmail.com>

Hi there

I read this about kernel 5.3 and Ceph, and I am curious...
I have a 6-node Proxmox Ceph cluster with Luminous...
Would it be a good idea to use kernel 5.3 from here:

https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/
---
Gilberto Nunes Ferreira

(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram

Skype: gilberto.nunes36


From lcaron at unix-scripts.info  Mon Sep 16 09:55:34 2019
From: lcaron at unix-scripts.info (Laurent CARON)
Date: Mon, 16 Sep 2019 09:55:34 +0200
Subject: [PVE-User] Recurring crashes after cluster upgrade from 5 to 6
Message-ID: <da0981f9-2972-d4b3-73c2-a20c5d827550@unix-scripts.info>

Hi,


After upgrading our 4-node cluster from PVE 5 to 6, we experience
constant crashes (about once every 2 days).

Those crashes seem related to corosync.

Since numerous users are reporting such issues (broken clusters after
upgrade, instabilities, ...), I wonder if it is possible to downgrade
corosync to version 2.4.4 without impacting functionality?

Basic steps would be:

On all nodes

# systemctl stop pve-ha-lrm

Once done, on all nodes:

# systemctl stop pve-ha-crm

Once done, on all nodes:

# apt-get install corosync=2.4.4-pve1 libcorosync-common4=2.4.4-pve1 \
    libcmap4=2.4.4-pve1 libcpg4=2.4.4-pve1 libqb0=1.0.3-1~bpo9 \
    libquorum5=2.4.4-pve1 libvotequorum8=2.4.4-pve1

Then, once corosync has been downgraded, on all nodes

# systemctl start pve-ha-lrm
# systemctl start pve-ha-crm

Would that work ?
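
If that works, presumably the downgraded packages would also need to be
held so a later upgrade doesn't pull corosync 3 back in; something along
these lines (same package list as above):

# apt-mark hold corosync libcorosync-common4 libcmap4 libcpg4 libqb0 \
    libquorum5 libvotequorum8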

Thanks



From ronny+pve-user at aasen.cx  Mon Sep 16 10:50:39 2019
From: ronny+pve-user at aasen.cx (Ronny Aasen)
Date: Mon, 16 Sep 2019 10:50:39 +0200
Subject: [PVE-User] Kernel 5.3 and Proxmox Ceph nodes
In-Reply-To: <CAOKSTBtG6qhbsf0_wxAVLv0cD5=Abv6JL1V6iHkWhysUwPkPWg@mail.gmail.com>
References: <CAOKSTBtG6qhbsf0_wxAVLv0cD5=Abv6JL1V6iHkWhysUwPkPWg@mail.gmail.com>
Message-ID: <f2ae718c-0ca0-d5f5-637d-5e63b06a4fee@aasen.cx>

On 16.09.2019 03:17, Gilberto Nunes wrote:
> Hi there
> 
> I read this about kernel 5.3 and ceph, and I am curious...
> I have a 6 nodes proxmox ceph cluster with luminous...
> Should be a good idea to user kernel 5.3 from here:
> 
> https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/
> ---
> Gilberto Nunes Ferreira
> 
> (47) 3025-5907
> (47) 99676-7530 - Whatsapp / Telegram
> 
> Skype: gilberto.nunes36
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 

you read "this" ?
This what exactly?


generally unless you have a problem you need fixed i would run the 
kernels from proxmox.

Ronny


From ronny+pve-user at aasen.cx  Mon Sep 16 12:27:16 2019
From: ronny+pve-user at aasen.cx (Ronny Aasen)
Date: Mon, 16 Sep 2019 12:27:16 +0200
Subject: [PVE-User] Empty virtual disk
In-Reply-To: <1584392794.132399.1568580900736.JavaMail.zimbra@talktel.ca>
References: <1584392794.132399.1568580900736.JavaMail.zimbra@talktel.ca>
Message-ID: <1725d4ec-6465-9918-cf93-3b75379f89d2@aasen.cx>

On 15.09.2019 22:55, Joe Garvey wrote:
> Hello all,
> 
> I had to reboot a QEMU based VM yesterday and after rebooting it reported there was no boot disk. The disk has lost all content in the hard drive. There aren't even any partition. I booted the VM with acronis disk recovery and it showed the disk as uninitialized.
> 
> I restored a 6 day old VM and it also had an empty drive. All backups have no data in the drive and are marked as uninitialized.
> 
> I tested restoring other VM's in my environment and they have no issues with the disks.
> 
> The only difference I see is that the drives are smaller.
> 
> The VM in question has been running flawlessly since it was deployed over 30 days ago.
> 
> 
> Proxmox version: 5.4-2
> SCSI controller: VirtIO SCSI
> Disk Size: 200G
> Caching is Disabled
> Storage: Dell NAS server via iSCSI 4x10GB connections (never had a problem with this)
> 
> I'm guessing the data is gone but a ny ideas what has caused this?
> 
> Regards,
> Joe
> 
> 
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 

That is a tricky one; wild speculation follows:

First of all: do not write any more data to unallocated blocks on that disk.
You may be able to recover the lost qcow2 image using testdisk.


A wild guess is that someone, some time ago, did an accidental rm -rf
/path/to/qemu-images


This deleted all the qemu images, but a deleted file is not really gone
until the last process holding it open closes it.
So when the VM was shut down and restarted, qemu closed the file, it was
finally removed, and an empty file was created when qemu started again.


You can look for deleted but still open files with
find /proc/*/fd -ls | grep '(deleted)'

Basically, all these files are gone for good once the process holding them stops.

You can try to copy a running VM's image via its open file descriptor
(replace 1722 and N with the PID and fd number found above), e.g.
cp /proc/1722/fd/N /somewhere-not-on-same-disk/vm-disk.qcow2

This gives an inconsistent copy that you can then check (and try to repair) with
qemu-img check /somewhere-not-on-same-disk/vm-disk.qcow2

After that I would try to do a disk move of the running VM.
That may or may not fail, and might even crash the VM (and lose the disk
file), but if it works it would probably produce a consistent copy, which
would be better than the inconsistent manual copy.



Regarding your first lost disk image, I would try testdisk to recover
the lost qcow2 image.

I would focus on the currently running VMs first, and I would copy the
original disk to a different host for running testdisk on it, something
like:
dd_rescue /dev/qcow2_disk - | ssh user@some-host "cat - > /large/storage/recovery-file.dump"

Then run testdisk on that image file and try to recover the qcow2 file.


good luck
Ronny Aasen


From gilberto.nunes32 at gmail.com  Mon Sep 16 12:53:56 2019
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Mon, 16 Sep 2019 07:53:56 -0300
Subject: [PVE-User] Kernel 5.3 and Proxmox Ceph nodes
In-Reply-To: <f2ae718c-0ca0-d5f5-637d-5e63b06a4fee@aasen.cx>
References: <CAOKSTBtG6qhbsf0_wxAVLv0cD5=Abv6JL1V6iHkWhysUwPkPWg@mail.gmail.com>
 <f2ae718c-0ca0-d5f5-637d-5e63b06a4fee@aasen.cx>
Message-ID: <CAOKSTBsDY_D-AiQ5CPyEbW_VfzAFO6xQ82bNoWnORP0acCf9xA@mail.gmail.com>

Oh, sorry! I didn't send the link I was referring to:

https://www.phoronix.com/scan.php?page=news_item&px=Ceph-Linux-5.3-Changes

---
Gilberto Nunes Ferreira

(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram

Skype: gilberto.nunes36





On Mon, 16 Sep 2019 at 05:50, Ronny Aasen <ronny+pve-user at aasen.cx>
wrote:

> On 16.09.2019 03:17, Gilberto Nunes wrote:
> > Hi there
> >
> > I read this about kernel 5.3 and ceph, and I am curious...
> > I have a 6 nodes proxmox ceph cluster with luminous...
> > Should be a good idea to user kernel 5.3 from here:
> >
> > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/
> > ---
> > Gilberto Nunes Ferreira
> >
> > (47) 3025-5907
> > (47) 99676-7530 - Whatsapp / Telegram
> >
> > Skype: gilberto.nunes36
> > _______________________________________________
> > pve-user mailing list
> > pve-user at pve.proxmox.com
> > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> >
>
> you read "this" ?
> This what exactly?
>
>
> generally unless you have a problem you need fixed i would run the
> kernels from proxmox.
>
> Ronny
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>


From humbertos at ifsc.edu.br  Mon Sep 16 12:58:17 2019
From: humbertos at ifsc.edu.br (Humberto Jose De Sousa)
Date: Mon, 16 Sep 2019 07:58:17 -0300 (BRT)
Subject: [PVE-User] Ceph MON quorum problem
In-Reply-To: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
Message-ID: <2114286804.38162994.1568631497769.JavaMail.zimbra@ifsc.edu.br>

Hi. 

You could try the qdevice: https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support 
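
The setup is roughly this (a sketch; the IP is a placeholder for the
external tie-breaker host, which needs the corosync-qnetd package, while
the cluster nodes need corosync-qdevice):

# on the external host
apt install corosync-qnetd
# on the cluster nodes
apt install corosync-qdevice
# then, on one PVE node
pvecm qdevice setup 192.0.2.50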

Humberto 

De: "Fabrizio Cuseo" <f.cuseo at panservice.it> 
Para: "pve-user" <pve-user at pve.proxmox.com> 
Enviadas: Sexta-feira, 13 de setembro de 2019 16:42:06 
Assunto: [PVE-User] Ceph MON quorum problem 

Hello. 
I am planning a 6 hosts cluster. 

3 hosts are located in the CedA room 
3 hosts are located in the CedB room 

the two rooms are connected with a 2 x 10Gbit fiber (200mt) and in each room i have 2 x 10Gbit stacked switch and each host have a 2 x 10Gbit (one for each switch) for Ceph storage. 

My need is to have a full redundancy cluster that can survive to CedA (or CedB) disaster. 

I have modified the crush map, so I have a RBD Pool that writes 2 copies in CedA hosts, and 2 copies in CedB hosts, so a very good redundancy (disk space is not a problem). 

But if I loose one of the rooms, i can't establish the needed quorum. 

Some suggestion to have a quick and not too complicated way to satisfy my need ? 

Regards, Fabrizio 


_______________________________________________ 
pve-user mailing list 
pve-user at pve.proxmox.com 
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user 


From r.correa.r at gmail.com  Mon Sep 16 13:04:41 2019
From: r.correa.r at gmail.com (Ricardo Correa)
Date: Mon, 16 Sep 2019 11:04:41 +0000
Subject: [PVE-User] Kernel 5.3 and Proxmox Ceph nodes
In-Reply-To: <CAOKSTBsDY_D-AiQ5CPyEbW_VfzAFO6xQ82bNoWnORP0acCf9xA@mail.gmail.com>
References: <CAOKSTBtG6qhbsf0_wxAVLv0cD5=Abv6JL1V6iHkWhysUwPkPWg@mail.gmail.com>
 <f2ae718c-0ca0-d5f5-637d-5e63b06a4fee@aasen.cx>
 <CAOKSTBsDY_D-AiQ5CPyEbW_VfzAFO6xQ82bNoWnORP0acCf9xA@mail.gmail.com>
Message-ID: <HE1PR03MB3148EC8E045C950D06A250FFA38C0@HE1PR03MB3148.eurprd03.prod.outlook.com>

Another 5.3 fix that might be interesting for some is https://github.com/lxc/lxd/issues/5193#issuecomment-502857830, which allows (or at least takes us one step closer to) running a kubelet in LXC containers.

On 16.09.19, 12:55, "pve-user on behalf of Gilberto Nunes" <pve-user-bounces at pve.proxmox.com on behalf of gilberto.nunes32 at gmail.com> wrote:

    Oh! I sorry! I didn't  sent the link which I referred to
    
    https://www.phoronix.com/scan.php?page=news_item&px=Ceph-Linux-5.3-Changes
    
    ---
    Gilberto Nunes Ferreira
    
    (47) 3025-5907
    (47) 99676-7530 - Whatsapp / Telegram
    
    Skype: gilberto.nunes36
    
    
    
    
    
    Em seg, 16 de set de 2019 ?s 05:50, Ronny Aasen <ronny+pve-user at aasen.cx>
    escreveu:
    
    > On 16.09.2019 03:17, Gilberto Nunes wrote:
    > > Hi there
    > >
    > > I read this about kernel 5.3 and ceph, and I am curious...
    > > I have a 6 nodes proxmox ceph cluster with luminous...
    > > Should be a good idea to user kernel 5.3 from here:
    > >
    > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/
    > > ---
    > > Gilberto Nunes Ferreira
    > >
    > > (47) 3025-5907
    > > (47) 99676-7530 - Whatsapp / Telegram
    > >
    > > Skype: gilberto.nunes36
    > > _______________________________________________
    > > pve-user mailing list
    > > pve-user at pve.proxmox.com
    > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
    > >
    >
    > you read "this" ?
    > This what exactly?
    >
    >
    > generally unless you have a problem you need fixed i would run the
    > kernels from proxmox.
    >
    > Ronny
    > _______________________________________________
    > pve-user mailing list
    > pve-user at pve.proxmox.com
    > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
    >
    _______________________________________________
    pve-user mailing list
    pve-user at pve.proxmox.com
    https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
    

From f.cuseo at panservice.it  Mon Sep 16 14:24:36 2019
From: f.cuseo at panservice.it (Fabrizio Cuseo)
Date: Mon, 16 Sep 2019 14:24:36 +0200 (CEST)
Subject: [PVE-User] Ceph MON quorum problem
In-Reply-To: <2114286804.38162994.1568631497769.JavaMail.zimbra@ifsc.edu.br>
References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
 <2114286804.38162994.1568631497769.JavaMail.zimbra@ifsc.edu.br>
Message-ID: <176763712.201083.1568636676598.JavaMail.zimbra@zimbra.panservice.it>

Thank you Humberto, but my problem is not related to the Proxmox quorum, but to the Ceph MON quorum.

Regards, Fabrizio 

----- On 16 Sep 2019, at 12:58, Humberto Jose De Sousa <humbertos at ifsc.edu.br> wrote:

> Hi.

> You could try the qdevice:
> https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support

> Humberto

> De: "Fabrizio Cuseo" <f.cuseo at panservice.it>
> Para: "pve-user" <pve-user at pve.proxmox.com>
> Enviadas: Sexta-feira, 13 de setembro de 2019 16:42:06
> Assunto: [PVE-User] Ceph MON quorum problem

> Hello.
> I am planning a 6 hosts cluster.

> 3 hosts are located in the CedA room
> 3 hosts are located in the CedB room

> the two rooms are connected with a 2 x 10Gbit fiber (200mt) and in each room i
> have 2 x 10Gbit stacked switch and each host have a 2 x 10Gbit (one for each
> switch) for Ceph storage.

> My need is to have a full redundancy cluster that can survive to CedA (or CedB)
> disaster.

> I have modified the crush map, so I have a RBD Pool that writes 2 copies in CedA
> hosts, and 2 copies in CedB hosts, so a very good redundancy (disk space is not
> a problem).

> But if I loose one of the rooms, i can't establish the needed quorum.

> Some suggestion to have a quick and not too complicated way to satisfy my need ?

> Regards, Fabrizio

> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

-- 
--- 
Fabrizio Cuseo - mailto:f.cuseo at panservice.it 
Direzione Generale - Panservice InterNetWorking 
Servizi Professionali per Internet ed il Networking 
Panservice e' associata AIIP - RIPE Local Registry 
Phone: +39 0773 410020 - Fax: +39 0773 470219 
http://www.panservice.it mailto:info at panservice.it 
Numero verde nazionale: 800 901492 


From ronny+pve-user at aasen.cx  Mon Sep 16 14:49:06 2019
From: ronny+pve-user at aasen.cx (Ronny Aasen)
Date: Mon, 16 Sep 2019 14:49:06 +0200
Subject: [PVE-User] Ris: Ceph MON quorum problem
In-Reply-To: <585861321.182698.1568470310925.JavaMail.zimbra@zimbra.panservice.it>
References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
 <CAGPQfi_NjBv6jG2heet9rBbdYBNS8pgeYs==ujz72BNMZgFmfw@mail.gmail.com>
 <585861321.182698.1568470310925.JavaMail.zimbra@zimbra.panservice.it>
Message-ID: <eb464671-43e6-0afe-83c3-6d62f73a9bd6@aasen.cx>

With 2 rooms there is no way to avoid a split-brain situation unless you
have a tiebreaker outside those 2 rooms.

Running a mon at a neutral third location is the quick, correct, and simple
solution.

Or

you need a master-slave setup where one room is the master (3 mons) and
the other room is the slave (2 mons); the slave cannot operate without
the master, but the master can operate alone.

good luck
Ronny


On 14.09.2019 16:11, f.cuseo at panservice.it wrote:
> This is My last choice :)
> Inviato dal mio dispositivo Huawei
> -------- Messaggio originale --------
> Oggetto: Re: [PVE-User] Ceph MON quorum problem
> Da: "Brian :"
> A: Fabrizio Cuseo ,PVE User List
> CC:
> 
> 
> Have a mon that runs somewhere that isn't either of those rooms.
> 
> On Friday, September 13, 2019, Fabrizio Cuseo <f.cuseo at panservice.it> wrote:
>> Hello.
>> I am planning a 6 hosts cluster.
>>
>> 3 hosts are located in the CedA room
>> 3 hosts are located in the CedB room
>>
>> the two rooms are connected with a 2 x 10Gbit fiber (200mt) and in each
> room i have 2 x 10Gbit stacked switch and each host have a 2 x 10Gbit (one
> for each switch) for Ceph storage.
>>
>> My need is to have a full redundancy cluster that can survive to CedA (or
> CedB) disaster.
>>
>> I have modified the crush map, so I have a RBD Pool that writes 2 copies
> in CedA hosts, and 2 copies in CedB hosts, so a very good redundancy (disk
> space is not a problem).
>>
>> But if I loose one of the rooms, i can't establish the needed quorum.
>>
>> Some suggestion to have a quick and not too complicated way to satisfy my
> need ?
>>
>> Regards, Fabrizio
>>
>>
>> _______________________________________________
>> pve-user mailing list
>> pve-user at pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 



From f.cuseo at panservice.it  Mon Sep 16 16:02:39 2019
From: f.cuseo at panservice.it (Fabrizio Cuseo)
Date: Mon, 16 Sep 2019 16:02:39 +0200 (CEST)
Subject: [PVE-User] Ris: Ceph MON quorum problem
In-Reply-To: <eb464671-43e6-0afe-83c3-6d62f73a9bd6@aasen.cx>
References: <354763632.177819.1568403726282.JavaMail.zimbra@zimbra.panservice.it>
 <CAGPQfi_NjBv6jG2heet9rBbdYBNS8pgeYs==ujz72BNMZgFmfw@mail.gmail.com>
 <585861321.182698.1568470310925.JavaMail.zimbra@zimbra.panservice.it>
 <eb464671-43e6-0afe-83c3-6d62f73a9bd6@aasen.cx>
Message-ID: <1891770806.204452.1568642559576.JavaMail.zimbra@zimbra.panservice.it>

Answer follows:

----- On 16 Sep 2019, at 14:49, Ronny Aasen ronny+pve-user at aasen.cx wrote:

> with 2 rooms there is no way to avoid a split brain situation unless you
> have a tie breaker outside one of those 2 rooms.
> 
> Run a Mon on a neutral third location is the quick, correct, and simple
> solution.
> 
> Or
> 
> you need to have a master-slave situation where one room is the master
> (3 mons) and the other room is the slave (2 mons) and the slave can not
> operate without the master, but the master can operate alone.


Yes, I need a master-slave setup, but I need the slave to keep running in case the master fails.
So, if I have a total of 3 mons (2 on the master side, 1 on the slave side) and I lose the master, I have only 1 mon available, and I need to create another mon (but I can't create it because I have no quorum).

I know that for now, the only solution is a third room.

Thanks, Fabrizio 




> 
> On 14.09.2019 16:11, f.cuseo at panservice.it wrote:
>> This is My last choice :)
>> Inviato dal mio dispositivo Huawei
>> -------- Messaggio originale --------
>> Oggetto: Re: [PVE-User] Ceph MON quorum problem
>> Da: "Brian :"
>> A: Fabrizio Cuseo ,PVE User List
>> CC:
>> 
>> 
>> Have a mon that runs somewhere that isn't either of those rooms.
>> 
>> On Friday, September 13, 2019, Fabrizio Cuseo <f.cuseo at panservice.it> wrote:
>>> Hello.
>>> I am planning a 6 hosts cluster.
>>>
>>> 3 hosts are located in the CedA room
>>> 3 hosts are located in the CedB room
>>>
>>> the two rooms are connected with a 2 x 10Gbit fiber (200mt) and in each
>> room i have 2 x 10Gbit stacked switch and each host have a 2 x 10Gbit (one
>> for each switch) for Ceph storage.
>>>
>>> My need is to have a full redundancy cluster that can survive to CedA (or
>> CedB) disaster.
>>>
>>> I have modified the crush map, so I have a RBD Pool that writes 2 copies
>> in CedA hosts, and 2 copies in CedB hosts, so a very good redundancy (disk
>> space is not a problem).
>>>
>>> But if I loose one of the rooms, i can't establish the needed quorum.
>>>
>>> Some suggestion to have a quick and not too complicated way to satisfy my
>> need ?
>>>
>>> Regards, Fabrizio
>>>
>>>
>>> _______________________________________________
>>> pve-user mailing list
>>> pve-user at pve.proxmox.com
>>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>>
>> _______________________________________________
>> pve-user mailing list
>> pve-user at pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>> 
> 
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

-


From daniel at firewall-services.com  Tue Sep 17 18:27:33 2019
From: daniel at firewall-services.com (Daniel Berteaud)
Date: Tue, 17 Sep 2019 18:27:33 +0200 (CEST)
Subject: [PVE-User] Moving disk with ZFS over iSCSI = IO error
Message-ID: <266547431.18105.1568737653631.JavaMail.zimbra@fws.fr>

Hi there. 

I'm working on moving my NFS setup to ZFS over iSCSI. I'm using a CentOS 7.6 box with ZoL 0.8.1, with the LIO backend (but this shouldn't be relevant; see further down). For the PVE side, I'm running PVE6 with all updates applied.

Apart from a few minor issues I found in the LIO backend (for which I sent a patch series earlier today), most things work nicely. Except one which is important to me: I can't move a disk from ZFS over iSCSI to any other storage. The destination storage type doesn't matter, but the problem is 100% reproducible when the source storage is ZFS over iSCSI.

A few seconds after I start the disk move, the guest FS will "panic". For example, with an el7 guest using XFS, I get:

kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE 
kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] 
kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated 
kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00 
kernel: blk_update_request: I/O error, dev sda, sector 7962536 
kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE 
kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] 
kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated 
kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00 
kernel: blk_update_request: I/O error, dev sda, sector 7962536 
kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE 
kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] 
kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated 
kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 bc 0e 28 00 00 08 00 
kernel: blk_update_request: I/O error, dev sda, sector 12324392 


And the system completely crashes. The data itself is not impacted; I can restart the guest and everything appears OK. It doesn't matter whether I let the disk move operation finish or cancel it.
Moving the disk offline works as expected.

Sparse or non sparse zvol backend doesn't matter either. 

I searched a lot about this issue and found at least two other people having the same, or a very similar, issue:



    * One using ZoL but with SCST, see https://sourceforge.net/p/scst/mailman/message/35241011/
    * Another, using OmniOS, so with Comstar, see https://forum.proxmox.com/threads/storage-iscsi-move-results-to-io-error.38848/

Both are likely running PVE5, so it looks like it's not a recently introduced regression. 

I was also able to reproduce the issue with a FreeNAS storage, so using ctld. As the issue is present with so many different stacks, I think we can eliminate an issue on the storage side. The problem is most likely in qemu, in its iSCSI block implementation.
The SCST-Devel thread is interesting, but unfortunately it's beyond my skills.

Any advice on how to debug this further? I can reproduce it whenever I want, on a test setup. I'm happy to provide any useful information.

Regards, Daniel 


-- 


Daniel Berteaud 
FIREWALL-SERVICES SAS, La sécurité des réseaux 
Société de Services en Logiciels Libres 
Tél : +33.5 56 64 15 32 
Matrix: @dani:fws.fr 
https://www.firewall-services.com 


From daniel at firewall-services.com  Thu Sep 19 07:57:20 2019
From: daniel at firewall-services.com (Daniel Berteaud)
Date: Thu, 19 Sep 2019 07:57:20 +0200 (CEST)
Subject: [PVE-User] Moving disk with ZFS over iSCSI = IO error
In-Reply-To: <266547431.18105.1568737653631.JavaMail.zimbra@fws.fr>
References: <266547431.18105.1568737653631.JavaMail.zimbra@fws.fr>
Message-ID: <1159576403.22676.1568872640495.JavaMail.zimbra@fws.fr>

----- On 17 Sep 19, at 18:27, Daniel Berteaud <daniel at firewall-services.com> wrote:

> Hi there.

> I'm working on moving my NFS setup to ZFS over iSCSI. I'm using a CentOS 7.6 box
> with ZoL 0.8.1, with the LIO backend (but this shouldn't be relevent, see
> further). For the PVE side, I'm running PVE6 with all updates applied.

> Except a few minor issues I found in the LIO backend (for which I sent a patch
> serie earlier today), most things do work nicely. Except one which is important
> to me : I can't move disk from ZFS over iSCSI to any other storage. Destination
> storage type doesn't matter, but the porblem is 100% reproducible when the
> source storage is ZFS over iSCSI

> A few seconds after I started disk move, the guest FS will "panic". For example,
> with an el7 guest using XFS, I get :

> kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current]
> kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated
> kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00
> kernel: blk_update_request: I/O error, dev sda, sector 7962536
> kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current]
> kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated
> kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 79 7f a8 00 00 08 00
> kernel: blk_update_request: I/O error, dev sda, sector 7962536
> kernel: sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> kernel: sd 2:0:0:0: [sda] Sense Key : Aborted Command [current]
> kernel: sd 2:0:0:0: [sda] Add. Sense: I/O process terminated
> kernel: sd 2:0:0:0: [sda] CDB: Read(10) 28 00 00 bc 0e 28 00 00 08 00
> kernel: blk_update_request: I/O error, dev sda, sector 12324392

> And the system completely crash. The data itself is not impacted. I can restart
> the guest and everything appears OK. It doesn't matter if I let the disk move
> operation terminates or if I cancel it.
> Moving the disk offline works as expected.

> Sparse or non sparse zvol backend doesn't matter either.

> I searched a lot about this issue, and found at least two other persons having
> the same, or a very similar issue :

>    * One using ZoL but with SCST, see [
>    https://sourceforge.net/p/scst/mailman/message/35241011/ |
>     https://sourceforge.net/p/scst/mailman/message/35241011/ ]
>    * Another, using OmniOS, so with Comstar, see [
>    https://forum.proxmox.com/threads/storage-iscsi-move-results-to-io-error.38848/
>    |
>    https://forum.proxmox.com/threads/storage-iscsi-move-results-to-io-error.38848/
>     ]

> Both are likely running PVE5, so it looks like it's not a recently introduced
> regression.

> I also was able to reproduce the issue with a FreeNAS storage, so using ctld. As
> the issue is present with so many different stack, I think we can eliminate an
> issue on the storage side. The problem is most likely on qemu, in it's iSCSI
> block implementation.
> The SCST-Devel thread is interesting, but infortunately, it's beyond my skills
> here.

> Any advice on how to debug this further ? I can reproduce it whenever I want, on
> a test setup. I'm happy to provide any usefull informations

> Regards, Daniel

Forgot to mention: when moving a disk offline from ZFS over iSCSI to something else (in my case to NFS storage), I do get warnings like this:

create full clone of drive scsi0 (zfs-test:vm-132-disk-0) 
Formatting '/mnt/pve/nfs-dumps/images/132/vm-132-disk-0.qcow2', fmt=qcow2 size=53687091200 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16 
transferred: 0 bytes remaining: 53687091200 bytes total: 53687091200 bytes progression: 0.00 % 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 12582909: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 16777212: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 20971515: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 25165818: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 29360121: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 33554424: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 37748727: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 41943030: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 46137333: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 50331636: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 54525939: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 58720242: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 62914545: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 67108848: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 71303151: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 75497454: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 79691757: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 83886060: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 88080363: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 92274666: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 96468969: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 100663272: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 104857575: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
transferred: 536870912 bytes remaining: 53150220288 bytes total: 53687091200 bytes progression: 1.00 % 
transferred: 1079110533 bytes remaining: 52607980667 bytes total: 53687091200 bytes progression: 2.01 % 
transferred: 1615981445 bytes remaining: 52071109755 bytes total: 53687091200 bytes progression: 3.01 % 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
transferred: 2158221066 bytes remaining: 51528870134 bytes total: 53687091200 bytes progression: 4.02 % 
transferred: 2695091978 bytes remaining: 50991999222 bytes total: 53687091200 bytes progression: 5.02 % 
transferred: 3231962890 bytes remaining: 50455128310 bytes total: 53687091200 bytes progression: 6.02 % 
transferred: 3774202511 bytes remaining: 49912888689 bytes total: 53687091200 bytes progression: 7.03 % 
qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400) 
transferred: 4311073423 bytes remaining: 49376017777 bytes total: 53687091200 bytes progression: 8.03 % 
transferred: 4853313044 bytes remaining: 48833778156 bytes total: 53687091200 bytes progression: 9.04 % 
transferred: 5390183956 bytes remaining: 48296907244 bytes total: 53687091200 bytes progression: 10.04 % 
transferred: 5927054868 bytes remaining: 47760036332 bytes total: 53687091200 bytes progression: 11.04 % 

This might well be related to the problem (the same errors, when the VM is running, being reported back up the stack to the guest FS, which then panics?).
When running offline, even with these error messages, the transfer is OK.
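
One way to poke at this might be to exercise the same block-status path
without doing a full move, e.g. with qemu-img map against the raw LUN
(a sketch; portal, IQN and LUN below are placeholders for my test target):

qemu-img map -f raw iscsi://192.0.2.10:3260/iqn.2003-01.org.example:zfs-test/0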

Cheers, 
Daniel 

-- 

Daniel Berteaud 
FIREWALL-SERVICES SAS, La sécurité des réseaux 
Société de Services en Logiciels Libres 
Tél : +33.5 56 64 15 32 
Matrix: @dani:fws.fr 
https://www.firewall-services.com 


From mark at tuxis.nl  Thu Sep 19 09:15:17 2019
From: mark at tuxis.nl (Mark Schouten)
Date: Thu, 19 Sep 2019 09:15:17 +0200
Subject: [PVE-User] Images on CephFS?
Message-ID: <af6a9a9699e3edc6ecab5ae1372f60cd@tuxis.nl>


Hi,

We just built our latest cluster with PVE 6.0. We also offer CephFS 'slow but large' storage with our clusters, on which people can create images for backup servers. However, it seems that in PVE 6.0 we can no longer use CephFS for images?


Can anybody confirm (and explain?), or am I looking in the wrong direction?
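
For context, the storage definition involved looks roughly like this in
/etc/pve/storage.cfg (names are placeholders); the question is whether
'images' may still appear in its content list on 6.0:

cephfs: cephfs-slow
        path /mnt/pve/cephfs-slow
        content backup,images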

--
Mark Schouten <mark at tuxis.nl>

Tuxis, Ede, https://www.tuxis.nl

T: +31 318 200208



From daniel at firewall-services.com  Fri Sep 20 10:45:33 2019
From: daniel at firewall-services.com (Daniel Berteaud)
Date: Fri, 20 Sep 2019 10:45:33 +0200 (CEST)
Subject: [PVE-User] Moving disk with ZFS over iSCSI = IO error
In-Reply-To: <1159576403.22676.1568872640495.JavaMail.zimbra@fws.fr>
References: <266547431.18105.1568737653631.JavaMail.zimbra@fws.fr>
 <1159576403.22676.1568872640495.JavaMail.zimbra@fws.fr>
Message-ID: <1895121826.27120.1568969133081.JavaMail.zimbra@fws.fr>

----- On 19 Sep 19, at 7:57, Daniel Berteaud <daniel at firewall-services.com> wrote:

> Forgot to mention. When moving a disk offline, from ZFS over iSCSI to something
> else (in my case to an NFS storage), I do have warnings like this :

> create full clone of drive scsi0 (zfs-test:vm-132-disk-0)
> Formatting '/mnt/pve/nfs-dumps/images/132/vm-132-disk-0.qcow2', fmt=qcow2
> size=53687091200 cluster_size=65536 preallocation=metadata lazy_refcounts=off
> refcount_bits=16
> transferred: 0 bytes remaining: 53687091200 bytes total: 53687091200 bytes
> progression: 0.00 %
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5)
> ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 12582909: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 16777212: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 20971515: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> [...]
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 83886060: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 88080363: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 92274666: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 96468969: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 100663272: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 104857575: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 0: SENSE KEY:ILLEGAL_REQUEST(5)
> ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> transferred: 536870912 bytes remaining: 53150220288 bytes total: 53687091200
> bytes progression: 1.00 %
> transferred: 1079110533 bytes remaining: 52607980667 bytes total: 53687091200
> bytes progression: 2.01 %
> transferred: 1615981445 bytes remaining: 52071109755 bytes total: 53687091200
> bytes progression: 3.01 %
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 4194303: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> transferred: 2158221066 bytes remaining: 51528870134 bytes total: 53687091200
> bytes progression: 4.02 %
> transferred: 2695091978 bytes remaining: 50991999222 bytes total: 53687091200
> bytes progression: 5.02 %
> transferred: 3231962890 bytes remaining: 50455128310 bytes total: 53687091200
> bytes progression: 6.02 %
> transferred: 3774202511 bytes remaining: 49912888689 bytes total: 53687091200
> bytes progression: 7.03 %
> qemu-img: iSCSI GET_LBA_STATUS failed at lba 8388606: SENSE
> KEY:ILLEGAL_REQUEST(5) ASCQ:INVALID_FIELD_IN_CDB(0x2400)
> transferred: 4311073423 bytes remaining: 49376017777 bytes total: 53687091200
> bytes progression: 8.03 %
> transferred: 4853313044 bytes remaining: 48833778156 bytes total: 53687091200
> bytes progression: 9.04 %
> transferred: 5390183956 bytes remaining: 48296907244 bytes total: 53687091200
> bytes progression: 10.04 %
> transferred: 5927054868 bytes remaining: 47760036332 bytes total: 53687091200
> bytes progression: 11.04 %
> Which might well be related to the problem (the same errors when the VM is
> running are reported back to the upper stacks, until the guest FS, which panics
> ?)
> When running offline, even with these error messages, the transfert is OK

Another case which might be related: https://forum.proxmox.com/threads/move-disk-to-a-different-iscsi-target-errors-warning.27313/

-- 

Daniel Berteaud 
FIREWALL-SERVICES SAS, La sécurité des réseaux 
Société de Services en Logiciels Libres 
Tél : +33.5 56 64 15 32 
Matrix: @dani:fws.fr 
https://www.firewall-services.com 


From chris.hofstaedtler at deduktiva.com  Fri Sep 20 14:31:17 2019
From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva)
Date: Fri, 20 Sep 2019 14:31:17 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
Message-ID: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>

Hi,

I'm seeing a very interesting problem on PVE6: one of our machines
appears to leak kernel memory over time, up to the point where only
a reboot helps. Shutting down all KVM VMs does not release this
memory.

I'll attach some information below, because I just couldn't figure
out what this memory is used for: once before shutting down the VMs,
and once after. I had to reboot the PVE host now, but I guess
in a few days it will be at least noticeable again.

This machine has the same hardware (except CPU) as the box next to
it; however, this one was freshly installed with PVE6, while the other one
is an upgrade from PVE5 and doesn't exhibit this problem. It's quite
puzzling because I haven't seen this symptom at any of the
customer installations.

Here are some graphs showing the memory consumption over time:
  http://zeha.at/~ch/T/20190920-pve6_meminfo_0.png
  http://zeha.at/~ch/T/20190920-pve6_meminfo_1.png

Looking forward to any debug help, suggestions, ...
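
For what it's worth, a quick way to see where the kernel-side usage sits
(reclaimable vs. unreclaimable slab, page tables, kernel stacks, ...) is
/proc/meminfo, e.g.:

grep -E 'Slab|SReclaimable|SUnreclaim|KernelStack|PageTables' /proc/meminfo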

Chris


** Almost out of memory, before VM shutdown: **

top - 10:24:19 up 22 days, 22:29,  1 user,  load average: 1.85, 1.57, 1.32
Tasks: 530 total,   1 running, 529 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.8 us,  0.4 sy,  0.0 ni, 97.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  80413.1 total,    509.9 free,  70879.7 used,   9023.5 buff/cache
MiB Swap:  20480.0 total,   6516.6 free,  13963.4 used.   8699.0 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                                             
   3183 root      20   0   10.6g   6.0g   2960 S   8.7   7.6   5861:52 /usr/bin/kvm -id 103 -name puppet -chardev socket,id=qmp,path=/var/run/qemu-server/103.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+
   3349 root      20   0 9266032   4.3g   2972 S   6.8   5.4   3834:41 /usr/bin/kvm -id 2017 -name go-test-srv01 -chardev socket,id=qmp,path=/var/run/qemu-server/2017.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=+
   3068 root      20   0 5060928   3.7g   2900 S   6.8   4.7   3110:01 /usr/bin/kvm -id 101 -name backup -chardev socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+
   3399 root      20   0 5094772   2.3g   2944 S  50.5   2.9  10780:07 /usr/bin/kvm -id 3002 -name monitor01 -chardev socket,id=qmp,path=/var/run/qemu-server/3002.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-+
   3254 root      20   0   32.8g   1.9g   3040 S   1.0   2.4 490:39.29 /usr/bin/kvm -id 2005 -name debbuild -chardev socket,id=qmp,path=/var/run/qemu-server/2005.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-e+
   2994 root      20   0 2656268 658428   2980 S   9.7   0.8   2895:15 /usr/bin/kvm -id 100 -name pbx -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+
   2927 root      20   0 2664232 479372   2944 S   6.8   0.6   2343:43 /usr/bin/kvm -id 102 -name ns1 -chardev socket,id=qmp,path=/var/run/qemu-server/102.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+
   2417 root      rt   0  606912 211336  51444 S   1.9   0.3 613:27.87 /usr/sbin/corosync -f                                                                                                                                               
2023020 root      20   0  246556  98020  97044 S   0.0   0.1  15:47.80 /lib/systemd/systemd-journald                                                                                                                                       
   1806 root      20   0  967944  32724  23612 S   0.0   0.0  53:49.62 /usr/bin/pmxcfs                                                                                                                                                     
   2801 root      20   0  314488  32428   6464 S   0.0   0.0 322:58.23 pvestatd                                                                                                                                                           +
3771741 root      20   0  150776  31728   3700 S   0.0   0.0   0:12.81 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize                                                                              
   2799 root      20   0  316056  27452   5656 S   0.0   0.0  95:49.25 pve-firewall                                                                                                                                                       +
   2909 root      20   0  325248  12684   5268 S   1.0   0.0   7:03.91 pve-ha-lrm                                                                                                                                                         +
 868033 ch        20   0   21660   9104   7280 S   0.0   0.0   0:00.12 /lib/systemd/systemd --user                                                                                                                                         
 868009 root      20   0   16912   7988   6856 S   0.0   0.0   0:00.03 sshd: ch [priv]                                                                                                                                                     
      1 root      20   0  171820   7640   5032 S   0.0   0.0  19:58.80 /lib/systemd/systemd --system --deserialize 37                                                                                                                      
   2876 root      20   0  325544   7124   4988 S   0.0   0.0   4:18.16 pve-ha-crm                                                                                                                                                         +
   1654 Debian-+  20   0   40488   7096   2864 S   0.0   0.0  77:37.18 /usr/sbin/snmpd -Lsd -Lf /dev/null -u Debian-snmp -g Debian-snmp -I -smux mteTrigger mteTriggerConf -f -p /run/snmpd.pid                                            
 868045 ch        20   0   10240   5404   3996 S   0.0   0.0   0:00.11 -zsh                                                                                                                                                                
 868044 ch        20   0   16912   4636   3492 S   0.0   0.0   0:00.02 sshd: ch at pts/0                                                                                                                                                      
   1644 root      20   0   29608   4520   3496 S   0.0   0.0   4:59.62 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal                                                                       
 868336 root      20   0    7716   4372   3092 S   0.0   0.0   0:00.03 -bash                                                                                                                                                               
1761096 root      20   0  351564   4180   3336 S   0.0   0.0   1:12.83 pvedaemon worker                                                                                                                                                   +
1776171 root      20   0  351696   4076   3352 S   0.0   0.0   1:18.27 pvedaemon worker                                                                                                                                                   +
 868370 root      20   0   11680   4016   2964 R   2.9   0.0   0:00.68 top                                                                                                                                                                 
1780591 root      20   0  351696   4008   3248 S   0.0   0.0   1:11.73 pvedaemon worker                                                                                                                                                   +
   1086 root      20   0   19540   3984   3720 S   0.0   0.0   3:11.21 /lib/systemd/systemd-logind                                                                                                                                         
 868335 root      20   0   10156   3788   3364 S   0.0   0.0   0:00.01 sudo -i                                                                                                                                                             
   2899 www-data  20   0  121256   3412   3080 S   0.0   0.0   0:33.99 spiceproxy                                                                                                                                                         +
2000791 www-data  20   0  344932   3412   2604 S   0.0   0.0   1:16.39 pveproxy worker                                                                                                                                                    +
2000792 www-data  20   0  344932   3348   2604 S   0.0   0.0   1:07.07 pveproxy worker                                                                                                                                                    +
   1251 root      20   0  225816   3296   2424 S   0.0   0.0   9:47.44 /usr/sbin/rsyslogd -n -iNONE                                                                                                                                        
   1258 message+  20   0    9212   3268   2820 S   0.0   0.0   6:41.36 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only                                                            

root at vn03:~# uname -a
Linux vn03 5.0.21-1-pve #1 SMP PVE 5.0.21-1 (Tue, 20 Aug 2019 17:16:32 +0200) x86_64 GNU/Linux
root at vn03:~# free -m
              total        used        free      shared  buff/cache   available
Mem:          80413       70877         515         101        9019        8708
Swap:         20479       13963        6516
root at vn03:~# dpkg -l pve\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                    Version      Architecture Description
+++-=======================-============-============-======================================================
ii  pve-cluster             6.0-5        amd64        Cluster Infrastructure for Proxmox Virtual Environment
ii  pve-container           3.0-5        all          Proxmox VE Container management tool
ii  pve-docs                6.0-4        all          Proxmox VE Documentation
ii  pve-edk2-firmware       2.20190614-1 all          edk2 based firmware modules for virtual machines
ii  pve-firewall            4.0-7        amd64        Proxmox VE Firewall
ii  pve-firmware            3.0-2        all          Binary firmware code for the pve-kernel
ii  pve-ha-manager          3.0-2        amd64        Proxmox VE HA Manager
ii  pve-i18n                2.0-2        all          Internationalization support for Proxmox VE
un  pve-kernel              <none>       <none>       (no description available)
ii  pve-kernel-5.0          6.0-7        all          Latest Proxmox VE Kernel Image
ii  pve-kernel-5.0.15-1-pve 5.0.15-1     amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-5.0.18-1-pve 5.0.18-3     amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-5.0.21-1-pve 5.0.21-1     amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-helper       6.0-7        all          Function for various kernel maintenance tasks.
un  pve-kvm                 <none>       <none>       (no description available)
ii  pve-manager             6.0-6        amd64        Proxmox Virtual Environment Management Tools
ii  pve-qemu-kvm            4.0.0-5      amd64        Full virtualization on x86 hardware
un  pve-qemu-kvm-2.6.18     <none>       <none>       (no description available)
ii  pve-xtermjs             3.13.2-1     all          HTML/JS Shell client
root at vn03:~# slabtop -o | head -50 
 Active / Total Objects (% used)    : 205425461 / 212231433 (96.8%)
 Active / Total Slabs (% used)      : 4949759 / 4949759 (100.0%)
 Active / Total Caches (% used)     : 114 / 161 (70.8%)
 Active / Total Size (% used)       : 60112896.56K / 60714678.54K (99.0%)
 Minimum / Average / Maximum Object : 0.01K / 0.29K / 16.62K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
43583592 43542487  99%    0.20K 1117528       39   8940224K vm_area_struct         
26520256 26518592  99%    0.06K 414379       64   1657516K anon_vma_chain         
16788000 16434450  97%    0.25K 524625       32   4197000K filp                   
13079680 13078464  99%    0.03K 102185      128    408740K kmalloc-32             
11544320 5261058  45%    0.06K 180380       64    721520K dmaengine-unmap-2      
10128740 10127452  99%    0.09K 220190       46    880760K anon_vma               
9602484 9602484 100%    0.04K  94142      102    376568K pde_opener             
7442736 7442572  99%    0.19K 177208       42   1417664K cred_jar               
7213200 7209695  99%    0.13K 240440       30    961760K kernfs_node_cache      
6023850 5992341  99%    0.19K 143425       42   1147400K dentry                 
5704350 5704350 100%    0.08K 111850       51    447400K task_delay_info        
5054066 5054066 100%    0.69K 109871       46   3515872K files_cache            
4664512 4664481  99%    0.12K 145766       32    583064K pid                    
4591440 4591440 100%    1.06K 153048       30   4897536K mm_struct              
4207445 4203908  99%    0.58K  76499       55   2447968K inode_cache            
4104480 4104291  99%    0.62K  80480       51   2575360K sock_inode_cache       
3901440 3900588  99%    0.06K  60960       64    243840K kmalloc-64             
3856230 3856160  99%    1.06K 128541       30   4113312K signal_cache           
3423826 3417982  99%    0.65K  69874       49   2235968K proc_inode_cache       
3139584 3138382  99%    0.01K   6132      512     24528K kmalloc-8              
2983344 2983255  99%    0.19K  71032       42    568256K kmalloc-192            
2426976 2426413  99%    1.00K  75843       32   2426976K kmalloc-1k             
1939854 1931355  99%    0.09K  46187       42    184748K kmalloc-96             
1649895 1649895 100%    2.06K 109993       15   3519776K sighand_cache          
1280544 1280544 100%    1.00K  40017       32   1280544K UNIX                   
1052928 1050819  99%    0.50K  32904       32    526464K kmalloc-512            
1029792 1029312  99%    0.25K  32181       32    257448K skbuff_head_cache      
940624 940559  99%    4.00K 117578        8   3762496K kmalloc-4k             
799895 787069  98%    5.75K 159979        5   5119328K task_struct            
735696 724643  98%    0.10K  18864       39     75456K buffer_head            
525504 525378  99%    2.00K  32844       16   1051008K kmalloc-2k             
433024 426780  98%    0.06K   6766       64     27064K kmem_cache_node        
310710 301758  97%    1.05K  10357       30    331424K ext4_inode_cache       
292340 290078  99%    0.68K   6220       47    199040K shmem_inode_cache      
215250 214814  99%    0.38K   5125       42     82000K kmem_cache             
212296 196761  92%    0.57K   7582       28    121312K radix_tree_node        
158464 158464 100%    0.02K    619      256      2476K kmalloc-16             
149925 149925 100%    1.25K   5997       25    191904K UDPv6                  
 71424  71140  99%    0.12K   2232       32      8928K kmalloc-128            
 70020  70020 100%    0.16K   1376       51     11008K kvm_mmu_page_header    
 40032  40009  99%    0.25K   1251       32     10008K kmalloc-256            
 34944  33823  96%    0.09K    832       42      3328K kmalloc-rcl-96         
 34816  32567  93%    0.06K    544       64      2176K kmalloc-rcl-64         
root at vn03:~# pct list
root at vn03:~# qm list
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID       
       100 pbx                  running    2048              16.00 2994      
       101 backup               running    4096              32.00 3068      
       102 ns1                  running    2048              32.00 2927      
       103 puppet               running    10240             16.00 3183      
      2005 debbuild             running    32768             40.00 3254      
      2017 go-test-srv01        running    8192              20.00 3349      
      3002 monitor01            running    4096              32.00 3399      
      5001 salsa-runner-01      stopped    16384             32.00 0         
      6001 deduktiva-runner-01  stopped    32768             32.00 0         
      6901 mac                  stopped    4096               0.25 0         
root at vn03:~# sysctl -a | grep hugepages
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0


*** After shutdown of all VMs: ***

top - 10:39:56 up 22 days, 22:44,  2 users,  load average: 0.83, 1.84, 1.88
Tasks: 491 total,   1 running, 490 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.0 sy,  0.0 ni, 99.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  80413.1 total,  18276.4 free,  52704.9 used,   9431.8 buff/cache
MiB Swap:  20480.0 total,  19393.6 free,   1086.4 used.  26801.1 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                                             
   2417 root      rt   0  606908 211332  51444 S   1.0   0.3 613:46.50 /usr/sbin/corosync -f                                                                                                                                               
   2878 www-data  20   0  344800 133424  21784 S   0.0   0.2   0:36.09 pveproxy                                                                                                                                                           +
 883317 www-data  20   0  361776 133084  11056 S   0.0   0.2   0:01.04 pveproxy worker                                                                                                                                                    +
   2836 root      20   0  343228 132060  21764 S   0.0   0.2   0:38.88 pvedaemon                                                                                                                                                          +
 883319 www-data  20   0  360688 130992  11148 S   1.0   0.2   0:01.26 pveproxy worker                                                                                                                                                    +
 883318 www-data  20   0  358056 128864  11148 S   0.0   0.2   0:01.75 pveproxy worker                                                                                                                                                    +
 883166 root      20   0  351912 121884  10220 S   0.0   0.1   0:00.96 pvedaemon worker                                                                                                                                                   +
 883165 root      20   0  351848 121584   9952 S   0.0   0.1   0:00.40 pvedaemon worker                                                                                                                                                   +
 883164 root      20   0  351712 121560  10060 S   0.0   0.1   0:00.65 pvedaemon worker                                                                                                                                                   +
   2801 root      20   0  307252  92952  20996 S   0.0   0.1 323:07.31 pvestatd                                                                                                                                                           +
2023020 root      20   0  267408  90508  89344 S   0.0   0.1  15:48.85 /lib/systemd/systemd-journald                                                                                                                                       
   2899 www-data  20   0  121260  59804  12212 S   0.0   0.1   0:34.77 spiceproxy                                                                                                                                                         +
 883544 www-data  20   0  121500  51260   3448 S   0.0   0.1   0:00.05 spiceproxy worker                                                                                                                                                  +
 876236 root      20   0  524564  50188  37612 S   0.0   0.1   0:01.90 /usr/bin/pmxcfs                                                                                                                                                     
3771741 root      20   0  150776  30880   3264 S   0.0   0.0   0:12.86 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize                                                                              
   2799 root      20   0  316112  28352   5840 S   0.0   0.0  95:51.91 pve-firewall                                                                                                                                                       +
   2909 root      20   0  325212  14196   5404 S   0.0   0.0   7:04.14 pve-ha-lrm                                                                                                                                                         +
   2876 root      20   0  325564   9600   5224 S   0.0   0.0   4:18.33 pve-ha-crm                                                                                                                                                         +
 868033 ch        20   0   21660   8844   7020 S   0.0   0.0   0:00.14 /lib/systemd/systemd --user                                                                                                                                         

root at vn03:~# free -m
              total        used        free      shared  buff/cache   available
Mem:          80413       52700       18281         115        9431       26805
Swap:         20479        1086       19393
root at vn03:~# slabtop -o | head -50 
 Active / Total Objects (% used)    : 199865696 / 200976971 (99.4%)
 Active / Total Slabs (% used)      : 4771440 / 4771440 (100.0%)
 Active / Total Caches (% used)     : 114 / 161 (70.8%)
 Active / Total Size (% used)       : 59688763.91K / 59945034.02K (99.6%)
 Minimum / Average / Maximum Object : 0.01K / 0.30K / 16.62K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
43540380 43499279  99%    0.20K 1116420       39   8931360K vm_area_struct         
26459776 26457217  99%    0.06K 413434       64   1653736K anon_vma_chain         
16782720 16429406  97%    0.25K 524460       32   4195680K filp                   
13075712 13074728  99%    0.03K 102154      128    408616K kmalloc-32             
10104728 10103625  99%    0.09K 219668       46    878672K anon_vma               
9599628 9599628 100%    0.04K  94114      102    376456K pde_opener             
7442106 7442024  99%    0.19K 177193       42   1417544K cred_jar               
7211280 7207550  99%    0.13K 240376       30    961504K kernfs_node_cache      
5999322 5970370  99%    0.19K 142841       42   1142728K dentry                 
5691447 5691447 100%    0.08K 111597       51    446388K task_delay_info        
5052594 5052594 100%    0.69K 109839       46   3514848K files_cache            
4657408 4657315  99%    0.12K 145544       32    582176K pid                    
4590750 4590721  99%    1.06K 153025       30   4896800K mm_struct              
4206400 4202839  99%    0.58K  76480       55   2447360K inode_cache            
4091424 4091235  99%    0.62K  80224       51   2567168K sock_inode_cache       
3903104 3901440  99%    0.06K  60986       64    243944K kmalloc-64             
3855600 3855530  99%    1.06K 128520       30   4112640K signal_cache           
3416133 3410170  99%    0.65K  69717       49   2230944K proc_inode_cache       
3124224 3123017  99%    0.01K   6102      512     24408K kmalloc-8              
2982840 2982826  99%    0.19K  71020       42    568160K kmalloc-192            
2425760 2424977  99%    1.00K  75805       32   2425760K kmalloc-1k             
1940694 1932266  99%    0.09K  46207       42    184828K kmalloc-96             
1649415 1649346  99%    2.06K 109961       15   3518752K sighand_cache          
1279520 1279520 100%    1.00K  39985       32   1279520K UNIX                   
1043392 1040142  99%    0.50K  32606       32    521696K kmalloc-512            
1021152 1020672  99%    0.25K  31911       32    255288K skbuff_head_cache      
938880 938777  99%    4.00K 117360        8   3755520K kmalloc-4k             
797715 784886  98%    5.75K 159543        5   5105376K task_struct            
713388 699031  97%    0.10K  18292       39     73168K buffer_head            
643008  73139  11%    0.06K  10047       64     40188K dmaengine-unmap-2      
525520 525326  99%    2.00K  32845       16   1051040K kmalloc-2k             
432768 426806  98%    0.06K   6762       64     27048K kmem_cache_node        
308100 298326  96%    1.05K  10270       30    328640K ext4_inode_cache       
292387 289915  99%    0.68K   6221       47    199072K shmem_inode_cache      
215250 214971  99%    0.38K   5125       42     82000K kmem_cache             
212380 180327  84%    0.57K   7585       28    121360K radix_tree_node        
157952 157952 100%    0.02K    617      256      2468K kmalloc-16             
150150 150150 100%    1.25K   6006       25    192192K UDPv6                  
 71008  70660  99%    0.12K   2219       32      8876K kmalloc-128            
 40064  40056  99%    0.25K   1252       32     10016K kmalloc-256            
 34986  34259  97%    0.09K    833       42      3332K kmalloc-rcl-96         
 34368  32733  95%    0.06K    537       64      2148K kmalloc-rcl-64         
 33660  33300  98%    0.05K    396       85      1584K ftrace_event_field     



typical VM config:

balloon: 0
bootdisk: virtio0
cores: 2
cpu: Haswell-noTSX
ide2: none,media=cdrom
memory: 4096
name: backup
net0: virtio=52:54:00:b7:e0:ba,bridge=vmbr100
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=39d362a5-6bae-41b7-9803-b76279e2280f
sockets: 1
virtio0: datastore:vm-101-disk-1,cache=writeback,size=32G
virtio1: datastore:vm-101-disk-2,cache=writeback,size=100G
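
For reference, a config like this can be dumped on the node with qm (VMID 101 assumed here, going by the disk names above):

# qm config 101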


-- 
Chris Hofstaedtler / Deduktiva GmbH (FN 418592 b, HG Wien)
www.deduktiva.com / +43 1 353 1707


From f.cuseo at panservice.it  Fri Sep 20 14:34:26 2019
From: f.cuseo at panservice.it (Fabrizio Cuseo)
Date: Fri, 20 Sep 2019 14:34:26 +0200 (CEST)
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
Message-ID: <485538194.321667.1568982866199.JavaMail.zimbra@zimbra.panservice.it>

Are you sure that the memory is used by the ZFS cache?

Regards, Fabrizio

----- On 20 Sep 2019, at 14:31, Chris Hofstaedtler | Deduktiva chris.hofstaedtler at deduktiva.com wrote:

> Hi,
> 
> I'm seeing a very interesting problem on PVE6: one of our machines
> appears to leak kernel memory over time, up to the point where only
> a reboot helps. Shutting down all KVM VMs does not release this
> memory.
> 
> I'll attach some information below, because I just couldn't figure
> out what this memory is used for. Once before shutting down the VMs,
> and once after. I had to reboot the PVE host now, but I guess
> in a few days it will be at least noticeable again.
> 
> This machine has the same (except CPU) hardware as the box next to
> it; however this one was freshly installed with PVE6, the other one
> is an upgrade from PVE5 and doesn't exhibit this problem. It's quite
> puzzling because I haven't seen this symptom at all at any of the
> customer installations.
> 
> Here are some graphs showing the memory consumption over time:
>  http://zeha.at/~ch/T/20190920-pve6_meminfo_0.png
>  http://zeha.at/~ch/T/20190920-pve6_meminfo_1.png
> 
> Looking forward to any debug help, suggestions, ...
> 
> Chris
-- 
---
Fabrizio Cuseo - mailto:f.cuseo at panservice.it
General Management - Panservice InterNetWorking
Professional Services for Internet and Networking
Panservice is a member of AIIP - RIPE Local Registry
Phone: +39 0773 410020 - Fax: +39 0773 470219
http://www.panservice.it  mailto:info at panservice.it
National toll-free number: 800 901492


From chris.hofstaedtler at deduktiva.com  Fri Sep 20 14:35:16 2019
From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva)
Date: Fri, 20 Sep 2019 14:35:16 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <485538194.321667.1568982866199.JavaMail.zimbra@zimbra.panservice.it>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
 <485538194.321667.1568982866199.JavaMail.zimbra@zimbra.panservice.it>
Message-ID: <20190920123516.yorxcgktt3uviyxd@zeha.at>

* Fabrizio Cuseo <f.cuseo at panservice.it> [190920 14:34]:
> Are you sure that the memory is used by the ZFS cache?

There are no zfs filesystems configured, and zfs.ko is not loaded.

Thanks,
Chris



From a.lauterer at proxmox.com  Fri Sep 20 14:58:38 2019
From: a.lauterer at proxmox.com (Aaron Lauterer)
Date: Fri, 20 Sep 2019 14:58:38 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
Message-ID: <e68de579-54be-9fd6-1af6-48553a4c22b8@proxmox.com>

Curious, I do have a very similar case at the moment with a slab of 
~155GB, out of ~190GB RAM installed.

I am not sure yet what causes it, but things I plan to investigate are:

* hanging NFS mount (see the quick check sketched below)
* possible (PVE) service starting too many threads -> restarting each 
and checking the memory / slab usage.
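
For the NFS check, something like this is usually enough (a rough sketch, assuming plain nfs/nfs4 mounts; a stat that hits the timeout typically points at a hung mount):

# findmnt -t nfs,nfs4
# for m in $(findmnt -rn -t nfs,nfs4 -o TARGET); do timeout 5 stat -t "$m" >/dev/null && echo "$m ok" || echo "$m possibly hung"; done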



On 9/20/19 2:31 PM, Chris Hofstaedtler | Deduktiva wrote:
> Hi,
> 
> I'm seeing a very interesting problem on PVE6: one of our machines
> appears to leak kernel memory over time, up to the point where only
> a reboot helps. Shutting down all KVM VMs does not release this
> memory.
> 
> I'll attach some information below, because I just couldn't figure
> out what this memory is used for. Once before shutting down the VMs,
> and once after. I had to reboot the PVE host now, but I guess
> in a few days it will be at least noticeable again.
> 
> This machine has the same (except CPU) hardware as the box next to
> it; however this one was freshly installed with PVE6, the other one
> is an upgrade from PVE5 and doesn't exhibit this problem. It's quite
> puzzling because I haven't seen this symptom at all at any of the
> customer installations.
> 
> Here are some graphs showing the memory consumption over time:
>    http://zeha.at/~ch/T/20190920-pve6_meminfo_0.png
>    http://zeha.at/~ch/T/20190920-pve6_meminfo_1.png
> 
> Looking forward to any debug help, suggestions, ...
> 
> Chris
> 
> 
> ** Almost out of memory, before VM shutdown: **
> 
> top - 10:24:19 up 22 days, 22:29,  1 user,  load average: 1.85, 1.57, 1.32
> Tasks: 530 total,   1 running, 529 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  1.8 us,  0.4 sy,  0.0 ni, 97.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> MiB Mem :  80413.1 total,    509.9 free,  70879.7 used,   9023.5 buff/cache
> MiB Swap:  20480.0 total,   6516.6 free,  13963.4 used.   8699.0 avail Mem
> 
>      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>     3183 root      20   0   10.6g   6.0g   2960 S   8.7   7.6   5861:52 /usr/bin/kvm -id 103 -name puppet -chardev socket,id=qmp,path=/var/run/qemu-server/103.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+
>     3349 root      20   0 9266032   4.3g   2972 S   6.8   5.4   3834:41 /usr/bin/kvm -id 2017 -name go-test-srv01 -chardev socket,id=qmp,path=/var/run/qemu-server/2017.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=+
>     3068 root      20   0 5060928   3.7g   2900 S   6.8   4.7   3110:01 /usr/bin/kvm -id 101 -name backup -chardev socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+
>     3399 root      20   0 5094772   2.3g   2944 S  50.5   2.9  10780:07 /usr/bin/kvm -id 3002 -name monitor01 -chardev socket,id=qmp,path=/var/run/qemu-server/3002.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-+
>     3254 root      20   0   32.8g   1.9g   3040 S   1.0   2.4 490:39.29 /usr/bin/kvm -id 2005 -name debbuild -chardev socket,id=qmp,path=/var/run/qemu-server/2005.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-e+
>     2994 root      20   0 2656268 658428   2980 S   9.7   0.8   2895:15 /usr/bin/kvm -id 100 -name pbx -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+
>     2927 root      20   0 2664232 479372   2944 S   6.8   0.6   2343:43 /usr/bin/kvm -id 102 -name ns1 -chardev socket,id=qmp,path=/var/run/qemu-server/102.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+
>     2417 root      rt   0  606912 211336  51444 S   1.9   0.3 613:27.87 /usr/sbin/corosync -f
> 2023020 root      20   0  246556  98020  97044 S   0.0   0.1  15:47.80 /lib/systemd/systemd-journald
>     1806 root      20   0  967944  32724  23612 S   0.0   0.0  53:49.62 /usr/bin/pmxcfs
>     2801 root      20   0  314488  32428   6464 S   0.0   0.0 322:58.23 pvestatd                                                                                                                                                           +
> 3771741 root      20   0  150776  31728   3700 S   0.0   0.0   0:12.81 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize
>     2799 root      20   0  316056  27452   5656 S   0.0   0.0  95:49.25 pve-firewall                                                                                                                                                       +
>     2909 root      20   0  325248  12684   5268 S   1.0   0.0   7:03.91 pve-ha-lrm                                                                                                                                                         +
>   868033 ch        20   0   21660   9104   7280 S   0.0   0.0   0:00.12 /lib/systemd/systemd --user
>   868009 root      20   0   16912   7988   6856 S   0.0   0.0   0:00.03 sshd: ch [priv]
>        1 root      20   0  171820   7640   5032 S   0.0   0.0  19:58.80 /lib/systemd/systemd --system --deserialize 37
>     2876 root      20   0  325544   7124   4988 S   0.0   0.0   4:18.16 pve-ha-crm                                                                                                                                                         +
>     1654 Debian-+  20   0   40488   7096   2864 S   0.0   0.0  77:37.18 /usr/sbin/snmpd -Lsd -Lf /dev/null -u Debian-snmp -g Debian-snmp -I -smux mteTrigger mteTriggerConf -f -p /run/snmpd.pid
>   868045 ch        20   0   10240   5404   3996 S   0.0   0.0   0:00.11 -zsh
>   868044 ch        20   0   16912   4636   3492 S   0.0   0.0   0:00.02 sshd: ch at pts/0
>     1644 root      20   0   29608   4520   3496 S   0.0   0.0   4:59.62 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
>   868336 root      20   0    7716   4372   3092 S   0.0   0.0   0:00.03 -bash
> 1761096 root      20   0  351564   4180   3336 S   0.0   0.0   1:12.83 pvedaemon worker                                                                                                                                                   +
> 1776171 root      20   0  351696   4076   3352 S   0.0   0.0   1:18.27 pvedaemon worker                                                                                                                                                   +
>   868370 root      20   0   11680   4016   2964 R   2.9   0.0   0:00.68 top
> 1780591 root      20   0  351696   4008   3248 S   0.0   0.0   1:11.73 pvedaemon worker                                                                                                                                                   +
>     1086 root      20   0   19540   3984   3720 S   0.0   0.0   3:11.21 /lib/systemd/systemd-logind
>   868335 root      20   0   10156   3788   3364 S   0.0   0.0   0:00.01 sudo -i
>     2899 www-data  20   0  121256   3412   3080 S   0.0   0.0   0:33.99 spiceproxy                                                                                                                                                         +
> 2000791 www-data  20   0  344932   3412   2604 S   0.0   0.0   1:16.39 pveproxy worker                                                                                                                                                    +
> 2000792 www-data  20   0  344932   3348   2604 S   0.0   0.0   1:07.07 pveproxy worker                                                                                                                                                    +
>     1251 root      20   0  225816   3296   2424 S   0.0   0.0   9:47.44 /usr/sbin/rsyslogd -n -iNONE
>     1258 message+  20   0    9212   3268   2820 S   0.0   0.0   6:41.36 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
> 
> root at vn03:~# uname -a
> Linux vn03 5.0.21-1-pve #1 SMP PVE 5.0.21-1 (Tue, 20 Aug 2019 17:16:32 +0200) x86_64 GNU/Linux
> root at vn03:~# free -m
>                total        used        free      shared  buff/cache   available
> Mem:          80413       70877         515         101        9019        8708
> Swap:         20479       13963        6516
> root at vn03:~# dpkg -l pve\*
> Desired=Unknown/Install/Remove/Purge/Hold
> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
> ||/ Name                    Version      Architecture Description
> +++-=======================-============-============-======================================================
> ii  pve-cluster             6.0-5        amd64        Cluster Infrastructure for Proxmox Virtual Environment
> ii  pve-container           3.0-5        all          Proxmox VE Container management tool
> ii  pve-docs                6.0-4        all          Proxmox VE Documentation
> ii  pve-edk2-firmware       2.20190614-1 all          edk2 based firmware modules for virtual machines
> ii  pve-firewall            4.0-7        amd64        Proxmox VE Firewall
> ii  pve-firmware            3.0-2        all          Binary firmware code for the pve-kernel
> ii  pve-ha-manager          3.0-2        amd64        Proxmox VE HA Manager
> ii  pve-i18n                2.0-2        all          Internationalization support for Proxmox VE
> un  pve-kernel              <none>       <none>       (no description available)
> ii  pve-kernel-5.0          6.0-7        all          Latest Proxmox VE Kernel Image
> ii  pve-kernel-5.0.15-1-pve 5.0.15-1     amd64        The Proxmox PVE Kernel Image
> ii  pve-kernel-5.0.18-1-pve 5.0.18-3     amd64        The Proxmox PVE Kernel Image
> ii  pve-kernel-5.0.21-1-pve 5.0.21-1     amd64        The Proxmox PVE Kernel Image
> ii  pve-kernel-helper       6.0-7        all          Function for various kernel maintenance tasks.
> un  pve-kvm                 <none>       <none>       (no description available)
> ii  pve-manager             6.0-6        amd64        Proxmox Virtual Environment Management Tools
> ii  pve-qemu-kvm            4.0.0-5      amd64        Full virtualization on x86 hardware
> un  pve-qemu-kvm-2.6.18     <none>       <none>       (no description available)
> ii  pve-xtermjs             3.13.2-1     all          HTML/JS Shell client
> root at vn03:~# slabtop -o | head -50
>   Active / Total Objects (% used)    : 205425461 / 212231433 (96.8%)
>   Active / Total Slabs (% used)      : 4949759 / 4949759 (100.0%)
>   Active / Total Caches (% used)     : 114 / 161 (70.8%)
>   Active / Total Size (% used)       : 60112896.56K / 60714678.54K (99.0%)
>   Minimum / Average / Maximum Object : 0.01K / 0.29K / 16.62K
> 
>    OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> 43583592 43542487  99%    0.20K 1117528       39   8940224K vm_area_struct
> 26520256 26518592  99%    0.06K 414379       64   1657516K anon_vma_chain
> 16788000 16434450  97%    0.25K 524625       32   4197000K filp
> 13079680 13078464  99%    0.03K 102185      128    408740K kmalloc-32
> 11544320 5261058  45%    0.06K 180380       64    721520K dmaengine-unmap-2
> 10128740 10127452  99%    0.09K 220190       46    880760K anon_vma
> 9602484 9602484 100%    0.04K  94142      102    376568K pde_opener
> 7442736 7442572  99%    0.19K 177208       42   1417664K cred_jar
> 7213200 7209695  99%    0.13K 240440       30    961760K kernfs_node_cache
> 6023850 5992341  99%    0.19K 143425       42   1147400K dentry
> 5704350 5704350 100%    0.08K 111850       51    447400K task_delay_info
> 5054066 5054066 100%    0.69K 109871       46   3515872K files_cache
> 4664512 4664481  99%    0.12K 145766       32    583064K pid
> 4591440 4591440 100%    1.06K 153048       30   4897536K mm_struct
> 4207445 4203908  99%    0.58K  76499       55   2447968K inode_cache
> 4104480 4104291  99%    0.62K  80480       51   2575360K sock_inode_cache
> 3901440 3900588  99%    0.06K  60960       64    243840K kmalloc-64
> 3856230 3856160  99%    1.06K 128541       30   4113312K signal_cache
> 3423826 3417982  99%    0.65K  69874       49   2235968K proc_inode_cache
> 3139584 3138382  99%    0.01K   6132      512     24528K kmalloc-8
> 2983344 2983255  99%    0.19K  71032       42    568256K kmalloc-192
> 2426976 2426413  99%    1.00K  75843       32   2426976K kmalloc-1k
> 1939854 1931355  99%    0.09K  46187       42    184748K kmalloc-96
> 1649895 1649895 100%    2.06K 109993       15   3519776K sighand_cache
> 1280544 1280544 100%    1.00K  40017       32   1280544K UNIX
> 1052928 1050819  99%    0.50K  32904       32    526464K kmalloc-512
> 1029792 1029312  99%    0.25K  32181       32    257448K skbuff_head_cache
> 940624 940559  99%    4.00K 117578        8   3762496K kmalloc-4k
> 799895 787069  98%    5.75K 159979        5   5119328K task_struct
> 735696 724643  98%    0.10K  18864       39     75456K buffer_head
> 525504 525378  99%    2.00K  32844       16   1051008K kmalloc-2k
> 433024 426780  98%    0.06K   6766       64     27064K kmem_cache_node
> 310710 301758  97%    1.05K  10357       30    331424K ext4_inode_cache
> 292340 290078  99%    0.68K   6220       47    199040K shmem_inode_cache
> 215250 214814  99%    0.38K   5125       42     82000K kmem_cache
> 212296 196761  92%    0.57K   7582       28    121312K radix_tree_node
> 158464 158464 100%    0.02K    619      256      2476K kmalloc-16
> 149925 149925 100%    1.25K   5997       25    191904K UDPv6
>   71424  71140  99%    0.12K   2232       32      8928K kmalloc-128
>   70020  70020 100%    0.16K   1376       51     11008K kvm_mmu_page_header
>   40032  40009  99%    0.25K   1251       32     10008K kmalloc-256
>   34944  33823  96%    0.09K    832       42      3328K kmalloc-rcl-96
>   34816  32567  93%    0.06K    544       64      2176K kmalloc-rcl-64
> root at vn03:~# pct list
> root at vn03:~# qm list
>        VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID
>         100 pbx                  running    2048              16.00 2994
>         101 backup               running    4096              32.00 3068
>         102 ns1                  running    2048              32.00 2927
>         103 puppet               running    10240             16.00 3183
>        2005 debbuild             running    32768             40.00 3254
>        2017 go-test-srv01        running    8192              20.00 3349
>        3002 monitor01            running    4096              32.00 3399
>        5001 salsa-runner-01      stopped    16384             32.00 0
>        6001 deduktiva-runner-01  stopped    32768             32.00 0
>        6901 mac                  stopped    4096               0.25 0
> root at vn03:~# sysctl -a | grep hugepages
> vm.nr_hugepages = 0
> vm.nr_hugepages_mempolicy = 0
> vm.nr_overcommit_hugepages = 0
> 
> 
> *** After shutdown of all VMs: ***
> 
> top - 10:39:56 up 22 days, 22:44,  2 users,  load average: 0.83, 1.84, 1.88
> Tasks: 491 total,   1 running, 490 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  0.1 us,  0.0 sy,  0.0 ni, 99.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> MiB Mem :  80413.1 total,  18276.4 free,  52704.9 used,   9431.8 buff/cache
> MiB Swap:  20480.0 total,  19393.6 free,   1086.4 used.  26801.1 avail Mem
> 
>      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>     2417 root      rt   0  606908 211332  51444 S   1.0   0.3 613:46.50 /usr/sbin/corosync -f
>     2878 www-data  20   0  344800 133424  21784 S   0.0   0.2   0:36.09 pveproxy                                                                                                                                                           +
>   883317 www-data  20   0  361776 133084  11056 S   0.0   0.2   0:01.04 pveproxy worker                                                                                                                                                    +
>     2836 root      20   0  343228 132060  21764 S   0.0   0.2   0:38.88 pvedaemon                                                                                                                                                          +
>   883319 www-data  20   0  360688 130992  11148 S   1.0   0.2   0:01.26 pveproxy worker                                                                                                                                                    +
>   883318 www-data  20   0  358056 128864  11148 S   0.0   0.2   0:01.75 pveproxy worker                                                                                                                                                    +
>   883166 root      20   0  351912 121884  10220 S   0.0   0.1   0:00.96 pvedaemon worker                                                                                                                                                   +
>   883165 root      20   0  351848 121584   9952 S   0.0   0.1   0:00.40 pvedaemon worker                                                                                                                                                   +
>   883164 root      20   0  351712 121560  10060 S   0.0   0.1   0:00.65 pvedaemon worker                                                                                                                                                   +
>     2801 root      20   0  307252  92952  20996 S   0.0   0.1 323:07.31 pvestatd                                                                                                                                                           +
> 2023020 root      20   0  267408  90508  89344 S   0.0   0.1  15:48.85 /lib/systemd/systemd-journald
>     2899 www-data  20   0  121260  59804  12212 S   0.0   0.1   0:34.77 spiceproxy                                                                                                                                                         +
>   883544 www-data  20   0  121500  51260   3448 S   0.0   0.1   0:00.05 spiceproxy worker                                                                                                                                                  +
>   876236 root      20   0  524564  50188  37612 S   0.0   0.1   0:01.90 /usr/bin/pmxcfs
> 3771741 root      20   0  150776  30880   3264 S   0.0   0.0   0:12.86 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize
>     2799 root      20   0  316112  28352   5840 S   0.0   0.0  95:51.91 pve-firewall                                                                                                                                                       +
>     2909 root      20   0  325212  14196   5404 S   0.0   0.0   7:04.14 pve-ha-lrm                                                                                                                                                         +
>     2876 root      20   0  325564   9600   5224 S   0.0   0.0   4:18.33 pve-ha-crm                                                                                                                                                         +
>   868033 ch        20   0   21660   8844   7020 S   0.0   0.0   0:00.14 /lib/systemd/systemd --user
> 
> root at vn03:~# free -m
>                total        used        free      shared  buff/cache   available
> Mem:          80413       52700       18281         115        9431       26805
> Swap:         20479        1086       19393
> root at vn03:~# slabtop -o | head -50
>   Active / Total Objects (% used)    : 199865696 / 200976971 (99.4%)
>   Active / Total Slabs (% used)      : 4771440 / 4771440 (100.0%)
>   Active / Total Caches (% used)     : 114 / 161 (70.8%)
>   Active / Total Size (% used)       : 59688763.91K / 59945034.02K (99.6%)
>   Minimum / Average / Maximum Object : 0.01K / 0.30K / 16.62K
> 
>    OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> 43540380 43499279  99%    0.20K 1116420       39   8931360K vm_area_struct
> 26459776 26457217  99%    0.06K 413434       64   1653736K anon_vma_chain
> 16782720 16429406  97%    0.25K 524460       32   4195680K filp
> 13075712 13074728  99%    0.03K 102154      128    408616K kmalloc-32
> 10104728 10103625  99%    0.09K 219668       46    878672K anon_vma
> 9599628 9599628 100%    0.04K  94114      102    376456K pde_opener
> 7442106 7442024  99%    0.19K 177193       42   1417544K cred_jar
> 7211280 7207550  99%    0.13K 240376       30    961504K kernfs_node_cache
> 5999322 5970370  99%    0.19K 142841       42   1142728K dentry
> 5691447 5691447 100%    0.08K 111597       51    446388K task_delay_info
> 5052594 5052594 100%    0.69K 109839       46   3514848K files_cache
> 4657408 4657315  99%    0.12K 145544       32    582176K pid
> 4590750 4590721  99%    1.06K 153025       30   4896800K mm_struct
> 4206400 4202839  99%    0.58K  76480       55   2447360K inode_cache
> 4091424 4091235  99%    0.62K  80224       51   2567168K sock_inode_cache
> 3903104 3901440  99%    0.06K  60986       64    243944K kmalloc-64
> 3855600 3855530  99%    1.06K 128520       30   4112640K signal_cache
> 3416133 3410170  99%    0.65K  69717       49   2230944K proc_inode_cache
> 3124224 3123017  99%    0.01K   6102      512     24408K kmalloc-8
> 2982840 2982826  99%    0.19K  71020       42    568160K kmalloc-192
> 2425760 2424977  99%    1.00K  75805       32   2425760K kmalloc-1k
> 1940694 1932266  99%    0.09K  46207       42    184828K kmalloc-96
> 1649415 1649346  99%    2.06K 109961       15   3518752K sighand_cache
> 1279520 1279520 100%    1.00K  39985       32   1279520K UNIX
> 1043392 1040142  99%    0.50K  32606       32    521696K kmalloc-512
> 1021152 1020672  99%    0.25K  31911       32    255288K skbuff_head_cache
> 938880 938777  99%    4.00K 117360        8   3755520K kmalloc-4k
> 797715 784886  98%    5.75K 159543        5   5105376K task_struct
> 713388 699031  97%    0.10K  18292       39     73168K buffer_head
> 643008  73139  11%    0.06K  10047       64     40188K dmaengine-unmap-2
> 525520 525326  99%    2.00K  32845       16   1051040K kmalloc-2k
> 432768 426806  98%    0.06K   6762       64     27048K kmem_cache_node
> 308100 298326  96%    1.05K  10270       30    328640K ext4_inode_cache
> 292387 289915  99%    0.68K   6221       47    199072K shmem_inode_cache
> 215250 214971  99%    0.38K   5125       42     82000K kmem_cache
> 212380 180327  84%    0.57K   7585       28    121360K radix_tree_node
> 157952 157952 100%    0.02K    617      256      2468K kmalloc-16
> 150150 150150 100%    1.25K   6006       25    192192K UDPv6
>   71008  70660  99%    0.12K   2219       32      8876K kmalloc-128
>   40064  40056  99%    0.25K   1252       32     10016K kmalloc-256
>   34986  34259  97%    0.09K    833       42      3332K kmalloc-rcl-96
>   34368  32733  95%    0.06K    537       64      2148K kmalloc-rcl-64
>   33660  33300  98%    0.05K    396       85      1584K ftrace_event_field
> 
> 
> 
> typical VM config:
> 
> balloon: 0
> bootdisk: virtio0
> cores: 2
> cpu: Haswell-noTSX
> ide2: none,media=cdrom
> memory: 4096
> name: backup
> net0: virtio=52:54:00:b7:e0:ba,bridge=vmbr100
> numa: 0
> onboot: 1
> ostype: l26
> scsihw: virtio-scsi-pci
> serial0: socket
> smbios1: uuid=39d362a5-6bae-41b7-9803-b76279e2280f
> sockets: 1
> virtio0: datastore:vm-101-disk-1,cache=writeback,size=32G
> virtio1: datastore:vm-101-disk-2,cache=writeback,size=100G
> 
> 



From chris.hofstaedtler at deduktiva.com  Fri Sep 20 15:04:34 2019
From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva)
Date: Fri, 20 Sep 2019 15:04:34 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <e68de579-54be-9fd6-1af6-48553a4c22b8@proxmox.com>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
 <e68de579-54be-9fd6-1af6-48553a4c22b8@proxmox.com>
Message-ID: <20190920130434.pjnuimppit2rrpxj@percival.namespace.at>

* Aaron Lauterer <a.lauterer at proxmox.com> [190920 14:58]:
> Curious, I do have a very similar case at the moment with a slab of ~155GB,
> out of ~190GB RAM installed.
> 
> I am not sure yet what causes it but things I plan to investigate are:
> 
> * hanging NFS mount

Okay, to rule storage issues out, this setup has:
- root filesystem as ext4 on GPT
- efi system partition
- two LVM PVs and VGs, with all VM storage in the second LVM VG
- no NFS, no ZFS, no Ceph, no fancy userland filesystems
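
For reference, a layout like that can be confirmed with standard tools, nothing PVE-specific (a quick sketch):

# findmnt /
# lsblk -f
# pvs && vgs && lvs
# lsmod | grep -E 'zfs|nfs|ceph' || echo "no zfs/nfs/ceph modules loaded"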

> * possible (PVE) service starting too many threads -> restarting each and
> checking the memory / slab usage.

Do you have a particular service in mind?

Chris

-- 
Chris Hofstaedtler / Deduktiva GmbH (FN 418592 b, HG Wien)
www.deduktiva.com / +43 1 353 1707


From a.lauterer at proxmox.com  Fri Sep 20 15:12:04 2019
From: a.lauterer at proxmox.com (Aaron Lauterer)
Date: Fri, 20 Sep 2019 15:12:04 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <20190920130434.pjnuimppit2rrpxj@percival.namespace.at>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
 <e68de579-54be-9fd6-1af6-48553a4c22b8@proxmox.com>
 <20190920130434.pjnuimppit2rrpxj@percival.namespace.at>
Message-ID: <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com>



On 9/20/19 3:04 PM, Chris Hofstaedtler | Deduktiva wrote:
> * Aaron Lauterer <a.lauterer at proxmox.com> [190920 14:58]:
>> Curious, I do have a very similar case at the moment with a slab of ~155GB,
>> out of ~190GB RAM installed.
>>
>> I am not sure yet what causes it but things I plan to investigate are:
>>
>> * hanging NFS mount
> 
> Okay, to rule storage issues out, this setup has:
> - root filesystem as ext4 on GPT
> - efi system partition
> - two LVM PVs and VGs, with all VM storage in the second LVM VG
> - no NFS, no ZFS, no Ceph, no fancy userland filesystems
> 
>> * possible (PVE) service starting too many threads -> restarting each and
>> checking the memory / slab usage.
> 
> Do you have a particular service in mind?

Not at this point. I would restart all PVE services (systemctl | grep -e 
"pve.*service") one by one to see if any of them results in memory 
being released by the kernel.

If that is not the case, at least they are ruled out.
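
Roughly along these lines (only a sketch; it assumes the relevant units all match pve*.service and simply watches Slab in /proc/meminfo before and after each restart):

# grep Slab /proc/meminfo
# for svc in $(systemctl list-units --plain --no-legend 'pve*.service' | awk '{print $1}'); do echo "== $svc"; systemctl restart "$svc"; sleep 10; grep Slab /proc/meminfo; done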
> 
> Chris
> 



From aderumier at odiso.com  Fri Sep 20 16:58:00 2019
From: aderumier at odiso.com (Alexandre DERUMIER)
Date: Fri, 20 Sep 2019 16:58:00 +0200 (CEST)
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
 <e68de579-54be-9fd6-1af6-48553a4c22b8@proxmox.com>
 <20190920130434.pjnuimppit2rrpxj@percival.namespace.at>
 <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com>
Message-ID: <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>

Can you send the output of

cat /proc/slabinfo

?
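
In the meantime, the largest caches can also be ranked straight from /proc/slabinfo (approximate size = num_objs * objsize; just an illustrative one-liner):

# awk 'NR>2 {printf "%12.1f MB  %s\n", $3*$4/1024/1024, $1}' /proc/slabinfo | sort -rn | head -20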

----- Original Message -----
From: "Aaron Lauterer" <a.lauterer at proxmox.com>
To: "proxmoxve" <pve-user at pve.proxmox.com>
Sent: Friday, 20 September 2019 15:12:04
Subject: Re: [PVE-User] Kernel Memory Leak on PVE6?

On 9/20/19 3:04 PM, Chris Hofstaedtler | Deduktiva wrote: 
> * Aaron Lauterer <a.lauterer at proxmox.com> [190920 14:58]: 
>> Curious, I do have a very similar case at the moment with a slab of ~155GB, 
>> out of ~190GB RAM installed. 
>> 
>> I am not sure yet what causes it but things I plan to investigate are: 
>> 
>> * hanging NFS mount 
> 
> Okay, to rule storage issues out, this setup has: 
> - root filesystem as ext4 on GPT 
> - efi system partition 
> - two LVM PVs and VGs, with all VM storage in the second LVM VG 
> - no NFS, no ZFS, no Ceph, no fancy userland filesystems 
> 
>> * possible (PVE) service starting too many threads -> restarting each and 
>> checking the memory / slab usage. 
> 
> Do you have a particular service in mind? 

Not at this point. I would restart all PVE services (systemctl | grep -e 
"pve.*service") one by one to see if any of them results in memory 
being released by the kernel. 

If that is not the case, at least they are ruled out. 
> 
> Chris 
> 

_______________________________________________ 
pve-user mailing list 
pve-user at pve.proxmox.com 
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user 



From aderumier at odiso.com  Fri Sep 20 17:00:22 2019
From: aderumier at odiso.com (Alexandre DERUMIER)
Date: Fri, 20 Sep 2019 17:00:22 +0200 (CEST)
Subject: [PVE-User] Recurring crashes after cluster upgrade from 5 to 6
In-Reply-To: <da0981f9-2972-d4b3-73c2-a20c5d827550@unix-scripts.info>
References: <da0981f9-2972-d4b3-73c2-a20c5d827550@unix-scripts.info>
Message-ID: <1121750443.5445084.1568991622944.JavaMail.zimbra@odiso.com>

Hi,

a patch is available in pvetest:

http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet1_1.11-pve2_amd64.deb

Can you test it?

(You need to restart corosync after installing the deb.)
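
Roughly, just as a sketch (assuming a manual download of the package linked above rather than enabling the pvetest repository):

# wget http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet1_1.11-pve2_amd64.deb
# dpkg -i libknet1_1.11-pve2_amd64.deb
# systemctl restart corosync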


----- Original Message -----
From: "Laurent CARON" <lcaron at unix-scripts.info>
To: "proxmoxve" <pve-user at pve.proxmox.com>
Sent: Monday, 16 September 2019 09:55:34
Subject: [PVE-User] Recurring crashes after cluster upgrade from 5 to 6

Hi, 


After upgrading our 4-node cluster from PVE 5 to 6, we experience 
constant crashes (once every 2 days). 

Those crashes seem related to corosync. 

Since numerous users are reporting such issues (broken cluster after 
upgrade, instabilities, ...), I wonder if it is possible to downgrade 
corosync to version 2.4.4 without impacting functionality? 

Basic steps would be: 

On all nodes 

# systemctl stop pve-ha-lrm 

Once done, on all nodes: 

# systemctl stop pve-ha-crm 

Once done, on all nodes: 

# apt-get install corosync=2.4.4-pve1 libcorosync-common4=2.4.4-pve1 
libcmap4=2.4.4-pve1 libcpg4=2.4.4-pve1 libqb0=1.0.3-1~bpo9 
libquorum5=2.4.4-pve1 libvotequorum8=2.4.4-pve1 

Then, once corosync has been downgraded, on all nodes 

# systemctl start pve-ha-lrm 
# systemctl start pve-ha-crm 

Would that work? 

Thanks 

_______________________________________________ 
pve-user mailing list 
pve-user at pve.proxmox.com 
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user 



From a.lauterer at proxmox.com  Mon Sep 23 10:17:06 2019
From: a.lauterer at proxmox.com (Aaron Lauterer)
Date: Mon, 23 Sep 2019 10:17:06 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
 <e68de579-54be-9fd6-1af6-48553a4c22b8@proxmox.com>
 <20190920130434.pjnuimppit2rrpxj@percival.namespace.at>
 <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com>
 <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>
Message-ID: <e24b30fb-fbf6-a0c5-45c8-0b847872e5ce@proxmox.com>

On 9/20/19 4:58 PM, Alexandre DERUMIER wrote:
> Can you send the output of
>
> cat /proc/slabinfo
>
> ?

Sure, here you go:
----------------------------------------------
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> 
<pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata 
<active_slabs> <num_slabs> <sharedavail>
nfs_direct_cache       0      0    360   45    4 : tunables    0    0 
0 : slabdata      0      0      0
nfs_commit_data     1196   1196    704   46    8 : tunables    0    0 
0 : slabdata     26     26      0
nfs_read_data       6120   6120    896   36    8 : tunables    0    0 
0 : slabdata    170    170      0
nfs_inode_cache     2340   2340   1072   30    8 : tunables    0    0 
0 : slabdata     78     78      0
SCTPv6                22     22   1472   22    8 : tunables    0    0 
0 : slabdata      1      1      0
SCTP                  24     24   1344   24    8 : tunables    0    0 
0 : slabdata      1      1      0
kvm_async_pf         720    720    136   30    1 : tunables    0    0 
0 : slabdata     24     24      0
kvm_vcpu              25     25  17024    1    8 : tunables    0    0 
0 : slabdata     25     25      0
kvm_mmu_page_header  48103  48700    160   25    1 : tunables    0    0 
   0 : slabdata   1948   1948      0
x86_fpu              140    140   4160    7    8 : tunables    0    0 
0 : slabdata     20     20      0
zfs_znode_hold_cache      0      0     88   46    1 : tunables    0    0 
    0 : slabdata      0      0      0
zfs_znode_cache        0      0   1048   31    8 : tunables    0    0 
0 : slabdata      0      0      0
sio_cache_2            0      0    168   24    1 : tunables    0    0 
0 : slabdata      0      0      0
sio_cache_1            0      0    152   26    1 : tunables    0    0 
0 : slabdata      0      0      0
sio_cache_0            0      0    136   30    1 : tunables    0    0 
0 : slabdata      0      0      0
zil_zcw_cache          0      0    152   26    1 : tunables    0    0 
0 : slabdata      0      0      0
zil_lwb_cache          0      0    376   43    4 : tunables    0    0 
0 : slabdata      0      0      0
dmu_buf_impl_t         0      0    312   26    2 : tunables    0    0 
0 : slabdata      0      0      0
arc_buf_t              0      0     80   51    1 : tunables    0    0 
0 : slabdata      0      0      0
arc_buf_hdr_t_l2only      0      0     96   42    1 : tunables    0    0 
    0 : slabdata      0      0      0
arc_buf_hdr_t_full_crypt      0      0    392   41    4 : tunables    0 
   0    0 : slabdata      0      0      0
arc_buf_hdr_t_full      0      0    328   24    2 : tunables    0    0 
  0 : slabdata      0      0      0
dnode_t                0      0    896   36    8 : tunables    0    0 
0 : slabdata      0      0      0
sa_cache               0      0    248   33    2 : tunables    0    0 
0 : slabdata      0      0      0
abd_t                102    102     40  102    1 : tunables    0    0 
0 : slabdata      1      1      0
lz4_cache              0      0  16384    2    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_16384      4      4  16384    2    8 : tunables    0    0 
  0 : slabdata      2      2      0
zio_buf_16384          2      2  16384    2    8 : tunables    0    0 
0 : slabdata      1      1      0
zio_data_buf_14336      0      0  16384    2    8 : tunables    0    0 
  0 : slabdata      0      0      0
zio_buf_14336          0      0  16384    2    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_12288      0      0  12288    2    8 : tunables    0    0 
  0 : slabdata      0      0      0
zio_buf_12288          0      0  12288    2    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_10240      0      0  12288    2    8 : tunables    0    0 
  0 : slabdata      0      0      0
zio_buf_10240          0      0  12288    2    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_8192      0      0   8192    4    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_8192           0      0   8192    4    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_7168      0      0   8192    4    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_7168           0      0   8192    4    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_6144      0      0   8192    4    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_6144           0      0   8192    4    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_5120      0      0   8192    4    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_5120           0      0   8192    4    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_4096      0      0   4096    8    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_4096           0      0   4096    8    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_3584      0      0   3584    9    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_3584           0      0   3584    9    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_3072      0      0   3072   10    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_3072           0      0   3072   10    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_2560      0      0   2560   12    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_2560           0      0   2560   12    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_2048      0      0   2048   16    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_2048           0      0   2048   16    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_1536      0      0   1536   21    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_1536           0      0   1536   21    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_1024      0      0   1024   32    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_1024           0      0   1024   32    8 : tunables    0    0 
0 : slabdata      0      0      0
zio_data_buf_512       0      0    512   32    4 : tunables    0    0 
0 : slabdata      0      0      0
zio_buf_512            0      0    512   32    4 : tunables    0    0 
0 : slabdata      0      0      0
zio_link_cache         0      0     48   85    1 : tunables    0    0 
0 : slabdata      0      0      0
zio_cache              0      0   1240   26    8 : tunables    0    0 
0 : slabdata      0      0      0
ddt_entry_cache        0      0    448   36    4 : tunables    0    0 
0 : slabdata      0      0      0
range_seg_cache        0      0     72   56    1 : tunables    0    0 
0 : slabdata      0      0      0
kcf_context_cache      0      0    192   42    2 : tunables    0    0 
0 : slabdata      0      0      0
kcf_areq_cache         0      0    512   32    4 : tunables    0    0 
0 : slabdata      0      0      0
kcf_sreq_cache         0      0    192   42    2 : tunables    0    0 
0 : slabdata      0      0      0
mod_hash_entries     170    170     24  170    1 : tunables    0    0 
0 : slabdata      1      1      0
spl_vn_file_cache      0      0    128   32    1 : tunables    0    0 
0 : slabdata      0      0      0
spl_vn_cache           0      0    128   32    1 : tunables    0    0 
0 : slabdata      0      0      0
rpc_inode_cache      100    100    640   25    4 : tunables    0    0 
0 : slabdata      4      4      0
ext4_groupinfo_4k    280    280    144   28    1 : tunables    0    0 
0 : slabdata     10     10      0
btrfs_delayed_ref_head      0      0    160   25    1 : tunables    0 
0    0 : slabdata      0      0      0
btrfs_delayed_node      0      0    312   26    2 : tunables    0    0 
  0 : slabdata      0      0      0
btrfs_ordered_extent      0      0    416   39    4 : tunables    0    0 
    0 : slabdata      0      0      0
btrfs_extent_map       0      0    144   28    1 : tunables    0    0 
0 : slabdata      0      0      0
btrfs_extent_buffer      0      0    280   29    2 : tunables    0    0 
   0 : slabdata      0      0      0
btrfs_path             0      0    112   36    1 : tunables    0    0 
0 : slabdata      0      0      0
btrfs_inode            0      0   1144   28    8 : tunables    0    0    0 : slabdata      0      0      0
scsi_sense_cache    1728   1728    128   32    1 : tunables    0    0    0 : slabdata     54     54      0
PINGv6                28     28   1152   28    8 : tunables    0    0    0 : slabdata      1      1      0
RAWv6                756    756   1152   28    8 : tunables    0    0    0 : slabdata     27     27      0
UDPv6               2050   2050   1280   25    8 : tunables    0    0    0 : slabdata     82     82      0
tw_sock_TCPv6        816    816    240   34    2 : tunables    0    0    0 : slabdata     24     24      0
request_sock_TCPv6      0      0    304   26    2 : tunables    0    0    0 : slabdata      0      0      0
TCPv6                793    793   2368   13    8 : tunables    0    0    0 : slabdata     61     61      0
kcopyd_job             0      0   3312    9    8 : tunables    0    0    0 : slabdata      0      0      0
dm_uevent              0      0   2632   12    8 : tunables    0    0    0 : slabdata      0      0      0
dm_old_clone_request      0      0    296   27    2 : tunables    0    0    0 : slabdata      0      0      0
dm_rq_target_io        0      0    120   34    1 : tunables    0    0    0 : slabdata      0      0      0
mqueue_inode_cache     36     36    896   36    8 : tunables    0    0    0 : slabdata      1      1      0
fuse_request         984    984    392   41    4 : tunables    0    0    0 : slabdata     24     24      0
fuse_inode          6905   6987    768   42    8 : tunables    0    0    0 : slabdata    169    169      0
ecryptfs_key_record_cache      0      0    576   28    4 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_headers     504    544   4096    8    8 : tunables    0    0    0 : slabdata     68     68      0
ecryptfs_inode_cache      0      0    960   34    8 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_dentry_info_cache    128    128     32  128    1 : tunables    0    0    0 : slabdata      1      1      0
ecryptfs_file_cache      0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_auth_tok_list_item      0      0    832   39    8 : tunables    0    0    0 : slabdata      0      0      0
fat_inode_cache        0      0    728   45    8 : tunables    0    0    0 : slabdata      0      0      0
fat_cache              0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
squashfs_inode_cache      0      0    704   46    8 : tunables    0    0    0 : slabdata      0      0      0
jbd2_journal_head   6120   6120    120   34    1 : tunables    0    0    0 : slabdata    180    180      0
jbd2_revoke_table_s    256    256     16  256    1 : tunables    0    0    0 : slabdata      1      1      0
ext4_inode_cache   77141  81432   1080   30    8 : tunables    0    0    0 : slabdata   2727   2727      0
ext4_allocation_context    768    768    128   32    1 : tunables    0    0    0 : slabdata     24     24      0
ext4_pending_reservation   3072   3072     32  128    1 : tunables    0    0    0 : slabdata     24     24      0
ext4_extent_status  24990  25194     40  102    1 : tunables    0    0    0 : slabdata    247    247      0
mbcache             1752   1752     56   73    1 : tunables    0    0    0 : slabdata     24     24      0
fscrypt_info        1536   1536     64   64    1 : tunables    0    0    0 : slabdata     24     24      0
fscrypt_ctx         2040   2040     48   85    1 : tunables    0    0    0 : slabdata     24     24      0
userfaultfd_ctx_cache      0      0    192   42    2 : tunables    0    0    0 : slabdata      0      0      0
dnotify_struct      9856   9856     32  128    1 : tunables    0    0    0 : slabdata     77     77      0
posix_timers_cache    816    816    240   34    2 : tunables    0    0    0 : slabdata     24     24      0
UNIX              2942592 2942592   1024   32    8 : tunables    0    0    0 : slabdata  91956  91956      0
ip4-frags              0      0    208   39    2 : tunables    0    0    0 : slabdata      0      0      0
xfrm_dst_cache      4702   4725    320   25    2 : tunables    0    0    0 : slabdata    189    189      0
xfrm_state             0      0    768   42    8 : tunables    0    0    0 : slabdata      0      0      0
PING                  34     34    960   34    8 : tunables    0    0    0 : slabdata      1      1      0
RAW                  986    986    960   34    8 : tunables    0    0    0 : slabdata     29     29      0
tw_sock_TCP          850    850    240   34    2 : tunables    0    0    0 : slabdata     25     25      0
request_sock_TCP     624    624    304   26    2 : tunables    0    0    0 : slabdata     24     24      0
TCP                 2100   2100   2176   15    8 : tunables    0    0    0 : slabdata    140    140      0
hugetlbfs_inode_cache    390    390    616   26    4 : tunables    0    0    0 : slabdata     15     15      0
dquot                768    768    256   32    2 : tunables    0    0    0 : slabdata     24     24      0
eventpoll_pwq      16240  16240     72   56    1 : tunables    0    0    0 : slabdata    290    290      0
dax_cache           1028   1218    768   42    8 : tunables    0    0    0 : slabdata     29     29      0
request_queue        330    330   2056   15    8 : tunables    0    0    0 : slabdata     22     22      0
biovec-max           306    320   8192    4    8 : tunables    0    0    0 : slabdata     80     80      0
biovec-128          1552   1584   2048   16    8 : tunables    0    0    0 : slabdata     99     99      0
biovec-64           3264   3392   1024   32    8 : tunables    0    0    0 : slabdata    106    106      0
dmaengine-unmap-256     15     15   2112   15    8 : tunables    0    0    0 : slabdata      1      1      0
dmaengine-unmap-128     30     30   1088   30    8 : tunables    0    0    0 : slabdata      1      1      0
dmaengine-unmap-16  14118  14322    192   42    2 : tunables    0    0    0 : slabdata    341    341      0
dmaengine-unmap-2 9992311 14087744     64     64    1 : tunables    0    0    0 : slabdata 220121 220121      0
sock_inode_cache  3067527 3067550    640   25    4 : tunables    0    0    0 : slabdata 122702 122702      0
skbuff_ext_cache   33635  33696    128   32    1 : tunables    0    0    0 : slabdata   1053   1053      0
skbuff_fclone_cache   3458   3648    512   32    4 : tunables    0    0    0 : slabdata    114    114      0
skbuff_head_cache 2353888 2353984    256   32    2 : tunables    0    0    0 : slabdata  73562  73562      0
file_lock_cache      888    888    216   37    2 : tunables    0    0    0 : slabdata     24     24      0
net_namespace          0      0   6272    5    8 : tunables    0    0    0 : slabdata      0      0      0
shmem_inode_cache  15658  16544    696   47    8 : tunables    0    0    0 : slabdata    352    352      0
task_delay_info   18748008 18748008     80   51    1 : tunables    0    0    0 : slabdata 367608 367608      0
taskstats           1128   1128    344   47    4 : tunables    0    0    0 : slabdata     24     24      0
proc_dir_entry      2394   2394    192   42    2 : tunables    0    0    0 : slabdata     57     57      0
pde_opener        33415710 33415710     40  102    1 : tunables    0    0    0 : slabdata 327605 327605      0
proc_inode_cache  6028072 6081432    664   24    4 : tunables    0    0    0 : slabdata 253393 253393      0
bdev_cache          1385   1599    832   39    8 : tunables    0    0    0 : slabdata     41     41      0
kernfs_node_cache 22875834 22875900    136   30    1 : tunables    0    0    0 : slabdata 762530 762530      0
mnt_cache           2520   2520    384   42    4 : tunables    0    0    0 : slabdata     60     60      0
filp              43675126 45914432    256   32    2 : tunables    0    0    0 : slabdata 1434826 1434826      0
inode_cache       7973734 8086521    592   27    4 : tunables    0    0    0 : slabdata 299507 299507      0
dentry            22294094 22349733    192   42    2 : tunables    0    0    0 : slabdata 532145 532145      0
names_cache          224    224   4096    8    8 : tunables    0    0    0 : slabdata     28     28      0
iint_cache             0      0    120   34    1 : tunables    0    0    0 : slabdata      0      0      0
lsm_file_cache     20910  20910     24  170    1 : tunables    0    0    0 : slabdata    123    123      0
buffer_head       586426 588549    104   39    1 : tunables    0    0    0 : slabdata  15091  15091      0
uts_namespace          0      0    440   37    4 : tunables    0    0    0 : slabdata      0      0      0
nsproxy             1752   1752     56   73    1 : tunables    0    0    0 : slabdata     24     24      0
vm_area_struct    66387518 66389973    208   39    2 : tunables    0    0    0 : slabdata 1702307 1702307      0
mm_struct         13442790 13442790   1088   30    8 : tunables    0    0    0 : slabdata 448093 448093      0
files_cache       16558942 16558942    704   46    8 : tunables    0    0    0 : slabdata 359977 359977      0
signal_cache      10803886 10803900   1088   30    8 : tunables    0    0    0 : slabdata 360130 360130      0
sighand_cache     5400497 5400630   2112   15    8 : tunables    0    0    0 : slabdata 360042 360042      0
task_struct       2673885 2736425   5888    5    8 : tunables    0    0    0 : slabdata 547285 547285      0
cred_jar          25486314 25486314    192   42    2 : tunables    0    0    0 : slabdata 606817 606817      0
anon_vma_chain    60709704 60712320     64   64    1 : tunables    0    0    0 : slabdata 948630 948630      0
anon_vma          30101801 30102400     88   46    1 : tunables    0    0    0 : slabdata 654400 654400      0
pid               13969696 13969696    128   32    1 : tunables    0    0    0 : slabdata 436553 436553      0
Acpi-Operand      130536 130536     72   56    1 : tunables    0    0    0 : slabdata   2331   2331      0
Acpi-ParseExt        936    936    104   39    1 : tunables    0    0    0 : slabdata     24     24      0
Acpi-State          1428   1428     80   51    1 : tunables    0    0    0 : slabdata     28     28      0
Acpi-Namespace     12648  12648     40  102    1 : tunables    0    0    0 : slabdata    124    124      0
numa_policy           31     31    264   31    2 : tunables    0    0    0 : slabdata      1      1      0
trace_event_file    1932   1932     88   46    1 : tunables    0    0    0 : slabdata     42     42      0
ftrace_event_field   6120   6120     48   85    1 : tunables    0    0    0 : slabdata     72     72      0
pool_workqueue      5347   5568    256   32    2 : tunables    0    0    0 : slabdata    174    174      0
radix_tree_node   497203 568344    584   28    4 : tunables    0    0    0 : slabdata  20298  20298      0
task_group           625    625    640   25    4 : tunables    0    0    0 : slabdata     25     25      0
dma-kmalloc-8k         0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-4k         0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-2k         0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-1k         0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-512       32     32    512   32    4 : tunables    0    0    0 : slabdata      1      1      0
dma-kmalloc-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-128        0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-64         0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-192        0      0    192   42    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-96         0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-8k         0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-4k         0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-2k         0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-1k         0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-512        0      0    512   32    4 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-192     1308   1344    192   42    2 : tunables    0    0    0 : slabdata     32     32      0
kmalloc-rcl-128    11776  11840    128   32    1 : tunables    0    0    0 : slabdata    370    370      0
kmalloc-rcl-96     17640  17808     96   42    1 : tunables    0    0    0 : slabdata    424    424      0
kmalloc-rcl-64    284839 285696     64   64    1 : tunables    0    0    0 : slabdata   4464   4464      0
kmalloc-rcl-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-8k          1473   1484   8192    4    8 : tunables    0    0    0 : slabdata    371    371      0
kmalloc-4k        2976604 2976608   4096    8    8 : tunables    0    0    0 : slabdata 372076 372076      0
kmalloc-2k        2747302 2747360   2048   16    8 : tunables    0    0    0 : slabdata 171710 171710      0
kmalloc-1k        8100486 8100832   1024   32    8 : tunables    0    0    0 : slabdata 253151 253151      0
kmalloc-512       2420972 2421248    512   32    4 : tunables    0    0    0 : slabdata  75664  75664      0
kmalloc-256        69472  69472    256   32    2 : tunables    0    0    0 : slabdata   2171   2171      0
kmalloc-192       10754576 10754604    192   42    2 : tunables    0    0    0 : slabdata 256062 256062      0
kmalloc-128       253920 253920    128   32    1 : tunables    0    0    0 : slabdata   7935   7935      0
kmalloc-96        9011196 9014166     96   42    1 : tunables    0    0    0 : slabdata 214623 214623      0
kmalloc-64        16099637 16104640     64   64    1 : tunables    0    0    0 : slabdata 251635 251635      0
kmalloc-32        45461228 45461248     32  128    1 : tunables    0    0    0 : slabdata 355166 355166      0
kmalloc-16        407552 407552     16  256    1 : tunables    0    0    0 : slabdata   1592   1592      0
kmalloc-8         130048 130048      8  512    1 : tunables    0    0    0 : slabdata    254    254      0
kmem_cache_node   1349638 1349696     64   64    1 : tunables    0    0    0 : slabdata  21089  21089      0
kmem_cache        675537 675570    384   42    4 : tunables    0    0    0 : slabdata  16085  16085      0


> 
> ----- Original Message -----
> From: "Aaron Lauterer" <a.lauterer at proxmox.com>
> To: "proxmoxve" <pve-user at pve.proxmox.com>
> Sent: Friday, 20 September 2019 15:12:04
> Subject: Re: [PVE-User] Kernel Memory Leak on PVE6?
> 
> On 9/20/19 3:04 PM, Chris Hofstaedtler | Deduktiva wrote:
>> * Aaron Lauterer <a.lauterer at proxmox.com> [190920 14:58]:
>>> Curious, I do have a very similar case at the moment with a slab of ~155GB,
>>> out of ~190GB RAM installed.
>>>
>>> I am not sure yet what causes it but things I plan to investigate are:
>>>
>>> * hanging NFS mount
>>
>> Okay, to rule storage issues out, this setup has:
>> - root filesystem as ext4 on GPT
>> - efi system partition
>> - two LVM PVs and VGs, with all VM storage in the second LVM VG
>> - no NFS, no ZFS, no Ceph, no fancy userland filesystems
>>
>>> * possible (PVE) service starting too many threads -> restarting each and
>>> checking the memory / slab usage.
>>
>> Do you have a particular service in mind?
> 
> Not at this point. I would restart all PVE services (systemctl | grep -e
> "pve.*service") one by one to see if any of them results in memory
> being released by the kernel (see the loop sketch below the quoted thread).
> 
> If that is not the case at least they are ruled out.
>>
>> Chris
>>
> 
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 
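
As referenced above, a minimal sketch of that per-service restart check
(purely illustrative; the unit list comes from systemctl, and the Slab
line from /proc/meminfo is only a rough before/after indicator):

for svc in $(systemctl list-units --type=service --plain --no-legend 'pve*' | awk '{print $1}'); do
    echo "== $svc =="
    grep Slab /proc/meminfo        # slab usage before the restart
    systemctl restart "$svc"
    sleep 10
    grep Slab /proc/meminfo        # ... and after, to see if anything was released
done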



From mark at tuxis.nl  Wed Sep 25 15:46:37 2019
From: mark at tuxis.nl (Mark Schouten)
Date: Wed, 25 Sep 2019 13:46:37 +0000
Subject: [PVE-User] Images on CephFS?
In-Reply-To: <af6a9a9699e3edc6ecab5ae1372f60cd@tuxis.nl>
References: <af6a9a9699e3edc6ecab5ae1372f60cd@tuxis.nl>
Message-ID: <em7b48b39e-047b-405a-8fde-9c411a1d8752@windoos>

Hi,

Just noticed that this is not a PVE 6 change; it's also changed in
5.4-3. We're using this actively, which makes me wonder what will happen
if we stop/start a VM using disks on CephFS...

Any way we can enable it again?

--
Mark Schouten
Tuxis B.V.
https://www.tuxis.nl/ | +31 318 200208

------ Original Message ------
From: "Mark Schouten" <mark at tuxis.nl>
To: "PVE User List" <pve-user at pve.proxmox.com>
Sent: 9/19/2019 9:15:17 AM
Subject: [PVE-User] Images on CephFS?

>
>Hi,
>
>We just built our latest cluster with PVE 6.0. We also offer CephFS 
>'slow but large' storage with our clusters, on which people can create 
>images for backup servers. However, it seems that in PVE 6.0, we can no 
>longer use CephFS for images?
>
>
>Can anybody confirm (and explain?) or am I looking in the wrong 
>direction?
>
>--
>Mark Schouten <mark at tuxis.nl>
>
>Tuxis, Ede, https://www.tuxis.nl
>
>T: +31 318 200208
>
>
>_______________________________________________
>pve-user mailing list
>pve-user at pve.proxmox.com
>https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user



From t.lamprecht at proxmox.com  Wed Sep 25 16:03:44 2019
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Wed, 25 Sep 2019 16:03:44 +0200
Subject: [PVE-User] Images on CephFS?
In-Reply-To: <em7b48b39e-047b-405a-8fde-9c411a1d8752@windoos>
References: <af6a9a9699e3edc6ecab5ae1372f60cd@tuxis.nl>
 <em7b48b39e-047b-405a-8fde-9c411a1d8752@windoos>
Message-ID: <d578f401-ef43-37dc-fc94-cbe934165f75@proxmox.com>

Hi,

On 9/25/19 3:46 PM, Mark Schouten wrote:
> Hi,
> 
> Just noticed that this is not a PVE 6-change. It's also changed in 5.4-3. We're using this actively, which makes me wonder what will happen if we stop/start a VM using disks on CephFS...

huh, AFAICT we never allowed that; the git history of the CephFS
storage plugin is quite short[0], so you can confirm it yourself.
The initial commit did not allow VM/CT images either[1].

[0]: https://git.proxmox.com/?p=pve-storage.git;a=history;f=PVE/Storage/CephFSPlugin.pm;h=c18f8c937029d46b68aeafded5ec8d0a9d9c30ad;hb=HEAD
[1]: https://git.proxmox.com/?p=pve-storage.git;a=commitdiff;h=e34ce1444359ee06f50dd6907c0937d10748ce05

> 
> Any way we can enable it again?

IIRC, the rationale was that if Ceph is used, RBD will be preferred
for CT/VM anyhow - but CephFS seems to be quite performant, and as
all the functionality should be there (or could be added easily) we
could enable it just fine..

Just scratching my head how you were able to use it for images if
the plugin was never told to allow it..

cheers,
Thomas

> 
> -- 
> Mark Schouten
> Tuxis B.V.
> https://www.tuxis.nl/ | +31 318 200208
> 
> ------ Original Message ------
> From: "Mark Schouten" <mark at tuxis.nl>
> To: "PVE User List" <pve-user at pve.proxmox.com>
> Sent: 9/19/2019 9:15:17 AM
> Subject: [PVE-User] Images on CephFS?
> 
>>
>> Hi,
>>
>> We just built our latest cluster with PVE 6.0. We also offer CephFS 'slow but large' storage with our clusters, on which people can create images for backup servers. However, it seems that in PVE 6.0, we can no longer use CephFS for images?
>>
>>
>> Can anybody confirm (and explain?) or am I looking in the wrong direction?
>>
>> -- 
>> Mark Schouten <mark at tuxis.nl>
>>
>> Tuxis, Ede, https://www.tuxis.nl
>>
>> T: +31 318 200208
>>
>>
>> _______________________________________________
>> pve-user mailing list
>> pve-user at pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 




From mark at tuxis.nl  Wed Sep 25 16:47:20 2019
From: mark at tuxis.nl (Mark Schouten)
Date: Wed, 25 Sep 2019 14:47:20 +0000
Subject: [PVE-User] Images on CephFS?
In-Reply-To: <d578f401-ef43-37dc-fc94-cbe934165f75@proxmox.com>
References: <af6a9a9699e3edc6ecab5ae1372f60cd@tuxis.nl>
 <em7b48b39e-047b-405a-8fde-9c411a1d8752@windoos>
 <d578f401-ef43-37dc-fc94-cbe934165f75@proxmox.com>
Message-ID: <em7beb9e28-d251-4ef2-a99e-1e3d5710dab4@windoos>

Hi,

>huh, AFAICT we never allowed that; the git history of the CephFS
>storage plugin is quite short[0], so you can confirm it yourself.
>The initial commit did not allow VM/CT images either[1].
Haha. That's cool. :) I'm pretty sure I never needed to 'hack' anything 
to allow it. I can't find an un-updated cluster to test it on.

>>Any way we can enable it again?
>
>IIRC, the rationale was that if Ceph is used, RBD will be preferred
>for CT/VM anyhow - but CephFS seems to be quite performant, and as
>all the functionality should be there (or could be added easily) we
>could enable it just fine..
>
>Just scratching my head how you were able to use it for images if
>the plugin was never told to allow it..

The good news is that I can create the image and configure the VM
config file by hand. That works fine, just not for a normal user. It
performs fine as well, although I would recommend only raw images; I
think I remember issues with qcow2 and snapshots.
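
For reference, a rough sketch of what that manual setup looks like (the
mount point, VMID and storage/volume names here are examples, not taken
from this thread, and the exact volume syntax may differ):

# mkdir -p /mnt/pve/cephfs/images/101
# qemu-img create -f raw /mnt/pve/cephfs/images/101/vm-101-disk-0.raw 100G

plus a matching disk line edited into /etc/pve/qemu-server/101.conf, e.g.:

scsi1: cephfs:101/vm-101-disk-0.raw,size=100G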

--
Mark Schouten
Tuxis B.V.
https://www.tuxis.nl/ | +31 318 200208



From marcomgabriel at gmail.com  Wed Sep 25 16:49:51 2019
From: marcomgabriel at gmail.com (Marco M. Gabriel)
Date: Wed, 25 Sep 2019 16:49:51 +0200
Subject: [PVE-User] Images on CephFS?
In-Reply-To: <em7b48b39e-047b-405a-8fde-9c411a1d8752@windoos>
References: <af6a9a9699e3edc6ecab5ae1372f60cd@tuxis.nl>
 <em7b48b39e-047b-405a-8fde-9c411a1d8752@windoos>
Message-ID: <CAEp19KMf5K_OqKOu2owFpasG-ptDNaKa2q2eADmczEJA5DhiTg@mail.gmail.com>

Hi Mark,

as a temporary fix, you could just add a "directory" based storage
that points to the CephFS mount point.
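
A minimal sketch of such an entry in /etc/pve/storage.cfg, assuming the
CephFS mount lives at /mnt/pve/cephfs (the storage ID "cephfs-images"
is just an example):

dir: cephfs-images
        path /mnt/pve/cephfs
        content images,rootdir
        shared 1

The "shared 1" flag tells PVE the directory holds the same data on every
node, so migrations do not try to copy the disks.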

Marco

On Wed, 25 Sept 2019 at 15:49, Mark Schouten <mark at tuxis.nl> wrote:
>
> Hi,
>
> Just noticed that this is not a PVE 6-change. It's also changed in
> 5.4-3. We're using this actively, which makes me wonder what will happen
> if we stop/start a VM using disks on CephFS...
>
> Any way we can enable it again?
>
> --
> Mark Schouten
> Tuxis B.V.
> https://www.tuxis.nl/ | +31 318 200208
>
> ------ Original Message ------
> From: "Mark Schouten" <mark at tuxis.nl>
> To: "PVE User List" <pve-user at pve.proxmox.com>
> Sent: 9/19/2019 9:15:17 AM
> Subject: [PVE-User] Images on CephFS?
>
> >
> >Hi,
> >
> >We just built our latest cluster with PVE 6.0. We also offer CephFS
> >'slow but large' storage with our clusters, on which people can create
> >images for backup servers. However, it seems that in PVE 6.0, we can no
> >longer use CephFS for images?
> >
> >
> >Can anybody confirm (and explain?) or am I looking in the wrong
> >direction?
> >
> >--
> >Mark Schouten <mark at tuxis.nl>
> >
> >Tuxis, Ede, https://www.tuxis.nl
> >
> >T: +31 318 200208
> >
> >
> >_______________________________________________
> >pve-user mailing list
> >pve-user at pve.proxmox.com
> >https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user


From jmr.richardson at gmail.com  Wed Sep 25 22:31:31 2019
From: jmr.richardson at gmail.com (JR Richardson)
Date: Wed, 25 Sep 2019 15:31:31 -0500
Subject: [PVE-User] SR-IOV Network Virtualization Question
Message-ID: <CA+U74VPS69h115jCrrG-kofD_Tg8rqjLa5tezxu_D-a37EFwyw@mail.gmail.com>

Hey All,

I'm running Poxmox on Dell R710:
CPU(s) 16 x Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (2 Sockets)
Kernel Version Linux 4.15.18-18-pve #1 SMP PVE 4.15.18-44 (Wed, 03 Jul
2019 11:19:13 +0200)
PVE Manager Version pve-manager/5.4-11/6df3d8d0

I wanted to test the new Velo Cloud Partner Gateway KVM appliance and
am curious about the requirements. Minimum server requirements to run
the hypervisor:
- 10 Intel CPUs at 2.0 GHz or higher; the CPU must support the AES-NI,
  SSSE3, SSE4 and RDTSC instruction sets
- 20+ GB RAM (16 GB is required for the VC Gateway VM memory)
- 100 GB magnetic or SSD based, persistent disk volume
- 2 x 1 Gbps (or higher) network interfaces; the physical NIC should
  use the Intel 82599/82599ES chipset (for SR-IOV & DPDK support)

The CPU is OK and I think I can get my hands on the correct NIC to put
in the chassis. I don't know anything about SR-IOV or DPDK, so I'm
doing some research on them now.

My questions are: has anyone else deployed VMs with these requirements?
Are the CPU instructions exposed to the VMs with the Proxmox kernel I
have loaded? Are there any caveats or potholes I should be looking out
for? Can I get the network cards passed through directly to the VM?
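
For anyone with the same question, a rough sketch of the host-side
steps involved (the interface name, PCI address and VM ID below are
placeholders, not from a real setup):

# grep -o -w -m1 -e aes -e ssse3 -e sse4_1 -e sse4_2 /proc/cpuinfo | sort -u   # instruction sets on the host
# echo 4 > /sys/class/net/enp5s0f0/device/sriov_numvfs                         # create 4 VFs on an 82599 (ixgbe) port
# lspci | grep -i "virtual function"                                           # note the VF PCI addresses
# qm set 100 -hostpci0 0000:05:10.0                                            # pass one VF through to VM 100

(PCI passthrough additionally needs the IOMMU enabled, e.g.
intel_iommu=on on the kernel command line.)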

Any feedback is appreciated.

Thanks.

JR
-- 
JR Richardson
Engineering for the Masses
Chasing the Azeotrope


From mark at openvs.co.uk  Fri Sep 27 10:30:28 2019
From: mark at openvs.co.uk (Mark Adams)
Date: Fri, 27 Sep 2019 09:30:28 +0100
Subject: [PVE-User] AMD ZEN 2 (EPYC 7002 aka "rome") kernel requirements
Message-ID: <CAHxUxjCJREKf0H01bNgJoYi8OtO-M3-LAgS7QVzHkrvBwCKgGQ@mail.gmail.com>

Hi All,

I'm trying out one of these new processors, and it looks like I need at
least a 5.2 kernel to get some support, preferably 5.3.

At present the machine will boot into Proxmox, but the IOMMU does not
work, and I can see that ECC memory is not working.
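
For reference, a quick generic way to check whether the IOMMU actually
came up on a given kernel (nothing PVE-specific):

# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
# find /sys/kernel/iommu_groups/ -type l | head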

So my question is: what's the recommended way to get a newer kernel than
is provided by the pve-kernel package? I understand that pve-kernel uses
the newer Ubuntu kernel rather than the Debian Buster one, but are you
building anything else into it? Will Proxmox work OK if I install the
Ubuntu 5.3 kernel?

Cheers,
Mark


From t.lamprecht at proxmox.com  Fri Sep 27 10:37:14 2019
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Fri, 27 Sep 2019 10:37:14 +0200
Subject: [PVE-User] AMD ZEN 2 (EPYC 7002 aka "rome") kernel requirements
In-Reply-To: <CAHxUxjCJREKf0H01bNgJoYi8OtO-M3-LAgS7QVzHkrvBwCKgGQ@mail.gmail.com>
References: <CAHxUxjCJREKf0H01bNgJoYi8OtO-M3-LAgS7QVzHkrvBwCKgGQ@mail.gmail.com>
Message-ID: <2de48ea2-f032-2e4d-c574-4ed966730c7e@proxmox.com>

Hi,

On 9/27/19 10:30 AM, Mark Adams wrote:
> Hi All,
> 
> I'm trying out one of these new processors, and it looks like I need at
> least 5.2 kernel to get some support, preferably 5.3.
> 

We're on to a 5.3-based kernel; it may take a bit until a build gets
released for testing, though.

But the things required for that newer platform to work will also
be backported to older kernels.



From f.gruenbichler at proxmox.com  Fri Sep 27 10:37:32 2019
From: f.gruenbichler at proxmox.com (Fabian =?iso-8859-1?q?Gr=FCnbichler?=)
Date: Fri, 27 Sep 2019 10:37:32 +0200
Subject: [PVE-User] AMD ZEN 2 (EPYC 7002 aka "rome") kernel requirements
In-Reply-To: <CAHxUxjCJREKf0H01bNgJoYi8OtO-M3-LAgS7QVzHkrvBwCKgGQ@mail.gmail.com>
References: <CAHxUxjCJREKf0H01bNgJoYi8OtO-M3-LAgS7QVzHkrvBwCKgGQ@mail.gmail.com>
Message-ID: <1569573311.wn9j7ruavu.astroid@nora.none>

On September 27, 2019 10:30 am, Mark Adams wrote:
> Hi All,
> 
> I'm trying out one of these new processors, and it looks like I need at
> least 5.2 kernel to get some support, preferably 5.3.
> 
> At present the machine will boot in to proxmox, but IOMMU does not work,
> and I can see ECC memory is not working.
> 
> So my question is, whats the recommended way to get a newer kernel than is
> provided by the pve-kernel package? I understand that pve-kernel uses the
> newer ubuntu kernel rather than the debian buster one, but are you building
> anything else in to it? Will proxmox work ok if I install the ubuntu 5.3
> kernel?

these are the patches we currently ship on top of Ubuntu Disco's kernel:

https://git.proxmox.com/?p=pve-kernel.git;a=tree;f=patches/kernel;hb=refs/heads/master

another thing we add is the ZFS modules; not sure which version Ubuntu
Eoan ships there.
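
a quick way to see which ZFS module version a running kernel actually
ships, if you want to compare:

# modinfo -F version zfs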



From mark at openvs.co.uk  Fri Sep 27 16:01:56 2019
From: mark at openvs.co.uk (Mark Adams)
Date: Fri, 27 Sep 2019 15:01:56 +0100
Subject: [PVE-User] AMD ZEN 2 (EPYC 7002 aka "rome") kernel requirements
In-Reply-To: <1569573311.wn9j7ruavu.astroid@nora.none>
References: <CAHxUxjCJREKf0H01bNgJoYi8OtO-M3-LAgS7QVzHkrvBwCKgGQ@mail.gmail.com>
 <1569573311.wn9j7ruavu.astroid@nora.none>
Message-ID: <CAHxUxjDeTrL2=u9K+K=CayGL-V1d-PGUeY7qnS0PMLOEUO3kQw@mail.gmail.com>

Thanks for your responses Thomas and Fabian.

On Fri, 27 Sep 2019 at 09:37, Fabian Grünbichler <f.gruenbichler at proxmox.com>
wrote:

> On September 27, 2019 10:30 am, Mark Adams wrote:
> > Hi All,
> >
> > I'm trying out one of these new processors, and it looks like I need at
> > least 5.2 kernel to get some support, preferably 5.3.
> >
> > At present the machine will boot in to proxmox, but IOMMU does not work,
> > and I can see ECC memory is not working.
> >
> > So my question is, whats the recommended way to get a newer kernel than
> is
> > provided by the pve-kernel package? I understand that pve-kernel uses the
> > newer ubuntu kernel rather than the debian buster one, but are you
> building
> > anything else in to it? Will proxmox work ok if I install the ubuntu 5.3
> > kernel?
>
> these are the patches we currently ship on-top of Ubuntu Disco's kernel:
>
>
> https://git.proxmox.com/?p=pve-kernel.git;a=tree;f=patches/kernel;hb=refs/heads/master
>
> another thing we add are the ZFS modules. not sure which version Ubuntu
> Eoan ships there.
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user


From proxmox at elchaka.de  Sat Sep 28 02:15:49 2019
From: proxmox at elchaka.de (proxmox at elchaka.de)
Date: Sat, 28 Sep 2019 02:15:49 +0200
Subject: [PVE-User] ZFS live migration with HA
In-Reply-To: <mailman.164.1567774626.416.pve-user@pve.proxmox.com>
References: <mailman.164.1567774626.416.pve-user@pve.proxmox.com>
Message-ID: <18A7C883-7E72-4B22-B6BB-95DBA1BB10BC@elchaka.de>

If I am not wrong, you have to set up ZFS replication between the HA nodes. Then it should hopefully work...
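
A minimal sketch of that with the built-in storage replication tooling
(VM ID 100, target node "nodeB" and the 15-minute schedule are just
example values):

# pvesr create-local-job 100-0 nodeB --schedule "*/15"
# pvesr status

HA can then restart the guest on the target node from the last
replicated state; anything written since the last sync is lost.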

HTH
- Mehmet 

On 6 September 2019 14:57:00 CEST, Milosz Stocki via pve-user <pve-user at pve.proxmox.com> wrote:
>_______________________________________________
>pve-user mailing list
>pve-user at pve.proxmox.com
>https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user



From chris.hofstaedtler at deduktiva.com  Sat Sep 28 15:34:40 2019
From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva)
Date: Sat, 28 Sep 2019 15:34:40 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
 <e68de579-54be-9fd6-1af6-48553a4c22b8@proxmox.com>
 <20190920130434.pjnuimppit2rrpxj@percival.namespace.at>
 <71d8d350-ddb7-640d-9b72-c61089b5c124@proxmox.com>
 <52533241.5445047.1568991480294.JavaMail.zimbra@odiso.com>
Message-ID: <20190928133439.vgm4nia2imkn63fd@zeha.at>

* Alexandre DERUMIER <aderumier at odiso.com> [190920 16:58]:
> can you send details of
> 
> cat /proc/slabinfo

I've attached a dump from today and from yesterday, both at 15:00.
It appears this machine is eating about 1 GB per day - a bit hard to
tell from the check_mk graphs.
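
For comparing the two dumps, one quick way to see which caches dominate
is slabtop, or a one-liner over /proc/slabinfo (num_objs * objsize,
ignoring per-slab overhead):

# slabtop -o -s c | head -n 15
# awk 'NR>2 {printf "%-28s %10.1f MiB\n", $1, $3*$4/1048576}' /proc/slabinfo | sort -k2 -rn | head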

Chris
-------------- next part --------------
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
xfs_dqtrx              0      0    528   31    4 : tunables    0    0    0 : slabdata      0      0      0
xfs_dquot              0      0    504   32    4 : tunables    0    0    0 : slabdata      0      0      0
xfs_rui_item           0      0    696   47    8 : tunables    0    0    0 : slabdata      0      0      0
xfs_rud_item           0      0    176   46    2 : tunables    0    0    0 : slabdata      0      0      0
xfs_inode            308    340    960   34    8 : tunables    0    0    0 : slabdata     10     10      0
xfs_efd_item         185    185    440   37    4 : tunables    0    0    0 : slabdata      5      5      0
xfs_buf_item         270    270    272   30    2 : tunables    0    0    0 : slabdata      9      9      0
xfs_trans           1155   1155    232   35    2 : tunables    0    0    0 : slabdata     33     33      0
xfs_da_state           0      0    480   34    4 : tunables    0    0    0 : slabdata      0      0      0
xfs_btree_cur        432    432    224   36    2 : tunables    0    0    0 : slabdata     12     12      0
xfs_log_ticket      1540   1540    184   44    2 : tunables    0    0    0 : slabdata     35     35      0
kvm_async_pf        1200   1200    136   30    1 : tunables    0    0    0 : slabdata     40     40      0
kvm_vcpu              29     29  21376    1    8 : tunables    0    0    0 : slabdata     29     29      0
kvm_mmu_page_header  18003  18003    160   51    2 : tunables    0    0    0 : slabdata    353    353      0
x86_fpu              203    203   4160    7    8 : tunables    0    0    0 : slabdata     29     29      0
sw_flow                0      0   1952   16    8 : tunables    0    0    0 : slabdata      0      0      0
nf_conncount_rb        0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
nf_conntrack        2040   2040    320   51    4 : tunables    0    0    0 : slabdata     40     40      0
rpc_inode_cache      102    102    640   51    8 : tunables    0    0    0 : slabdata      2      2      0
ext4_groupinfo_4k    784    784    144   28    1 : tunables    0    0    0 : slabdata     28     28      0
scsi_sense_cache    1824   1824    128   32    1 : tunables    0    0    0 : slabdata     57     57      0
PINGv6                 0      0   1152   28    8 : tunables    0    0    0 : slabdata      0      0      0
RAWv6               1204   1204   1152   28    8 : tunables    0    0    0 : slabdata     43     43      0
UDPv6              48775  48775   1280   25    8 : tunables    0    0    0 : slabdata   1951   1951      0
tw_sock_TCPv6       1360   1360    240   34    2 : tunables    0    0    0 : slabdata     40     40      0
request_sock_TCPv6      0      0    304   53    4 : tunables    0    0    0 : slabdata      0      0      0
TCPv6                533    533   2368   13    8 : tunables    0    0    0 : slabdata     41     41      0
kcopyd_job             0      0   3312    9    8 : tunables    0    0    0 : slabdata      0      0      0
dm_uevent              0      0   2632   12    8 : tunables    0    0    0 : slabdata      0      0      0
dm_old_clone_request      0      0    296   55    4 : tunables    0    0    0 : slabdata      0      0      0
dm_rq_target_io        0      0    120   34    1 : tunables    0    0    0 : slabdata      0      0      0
mqueue_inode_cache     36     36    896   36    8 : tunables    0    0    0 : slabdata      1      1      0
fuse_request        1640   1640    392   41    4 : tunables    0    0    0 : slabdata     40     40      0
fuse_inode          4123   4326    768   42    8 : tunables    0    0    0 : slabdata    103    103      0
ecryptfs_key_record_cache      0      0    576   28    4 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_headers       8      8   4096    8    8 : tunables    0    0    0 : slabdata      1      1      0
ecryptfs_inode_cache      0      0    960   34    8 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_dentry_info_cache    128    128     32  128    1 : tunables    0    0    0 : slabdata      1      1      0
ecryptfs_file_cache      0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_auth_tok_list_item      0      0    832   39    8 : tunables    0    0    0 : slabdata      0      0      0
fat_inode_cache       90     90    728   45    8 : tunables    0    0    0 : slabdata      2      2      0
fat_cache              0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
squashfs_inode_cache      0      0    704   46    8 : tunables    0    0    0 : slabdata      0      0      0
jbd2_journal_head   4998   4998    120   34    1 : tunables    0    0    0 : slabdata    147    147      0
jbd2_revoke_table_s    512    512     16  256    1 : tunables    0    0    0 : slabdata      2      2      0
ext4_inode_cache   93910 102000   1080   30    8 : tunables    0    0    0 : slabdata   3400   3400      0
ext4_allocation_context   1280   1280    128   32    1 : tunables    0    0    0 : slabdata     40     40      0
ext4_pending_reservation   6912   6912     32  128    1 : tunables    0    0    0 : slabdata     54     54      0
ext4_extent_status  14382  14382     40  102    1 : tunables    0    0    0 : slabdata    141    141      0
mbcache             3942   3942     56   73    1 : tunables    0    0    0 : slabdata     54     54      0
fscrypt_info        2560   2560     64   64    1 : tunables    0    0    0 : slabdata     40     40      0
fscrypt_ctx         3400   3400     48   85    1 : tunables    0    0    0 : slabdata     40     40      0
userfaultfd_ctx_cache      0      0    192   42    2 : tunables    0    0    0 : slabdata      0      0      0
dnotify_struct      5632   5632     32  128    1 : tunables    0    0    0 : slabdata     44     44      0
posix_timers_cache   1360   1360    240   34    2 : tunables    0    0    0 : slabdata     40     40      0
UNIX              409568 409568   1024   32    8 : tunables    0    0    0 : slabdata  12799  12799      0
ip4-frags              0      0    208   39    2 : tunables    0    0    0 : slabdata      0      0      0
xfrm_dst_cache     15657  15912    320   51    4 : tunables    0    0    0 : slabdata    312    312      0
xfrm_state             0      0    768   42    8 : tunables    0    0    0 : slabdata      0      0      0
PING                   0      0    960   34    8 : tunables    0    0    0 : slabdata      0      0      0
RAW                 1598   1598    960   34    8 : tunables    0    0    0 : slabdata     47     47      0
tw_sock_TCP          952    952    240   34    2 : tunables    0    0    0 : slabdata     28     28      0
request_sock_TCP    2120   2120    304   53    4 : tunables    0    0    0 : slabdata     40     40      0
TCP                 1815   1815   2176   15    8 : tunables    0    0    0 : slabdata    121    121      0
hugetlbfs_inode_cache    583    583    616   53    8 : tunables    0    0    0 : slabdata     11     11      0
dquot               1280   1280    256   32    2 : tunables    0    0    0 : slabdata     40     40      0
eventpoll_pwq      13160  13160     72   56    1 : tunables    0    0    0 : slabdata    235    235      0
dax_cache            307    336    768   42    8 : tunables    0    0    0 : slabdata      8      8      0
request_queue        120    120   2056   15    8 : tunables    0    0    0 : slabdata      8      8      0
biovec-max           664    676   8192    4    8 : tunables    0    0    0 : slabdata    169    169      0
biovec-128          2016   2048   2048   16    8 : tunables    0    0    0 : slabdata    128    128      0
biovec-64           1440   1440   1024   32    8 : tunables    0    0    0 : slabdata     45     45      0
dmaengine-unmap-256     15     15   2112   15    8 : tunables    0    0    0 : slabdata      1      1      0
dmaengine-unmap-128     30     30   1088   30    8 : tunables    0    0    0 : slabdata      1      1      0
dmaengine-unmap-16   5124   5124    192   42    2 : tunables    0    0    0 : slabdata    122    122      0
dmaengine-unmap-2 7548143 7548928     64   64    1 : tunables    0    0    0 : slabdata 117952 117952      0
sock_inode_cache  1310649 1310649    640   51    8 : tunables    0    0    0 : slabdata  25699  25699      0
skbuff_ext_cache    1344   1344    128   32    1 : tunables    0    0    0 : slabdata     42     42      0
skbuff_fclone_cache   1696   1696    512   32    4 : tunables    0    0    0 : slabdata     53     53      0
skbuff_head_cache 347360 347520    256   32    2 : tunables    0    0    0 : slabdata  10860  10860      0
file_lock_cache     1480   1480    216   37    2 : tunables    0    0    0 : slabdata     40     40      0
net_namespace          0      0   6272    5    8 : tunables    0    0    0 : slabdata      0      0      0
shmem_inode_cache 102235 102883    696   47    8 : tunables    0    0    0 : slabdata   2189   2189      0
task_delay_info   2066979 2066979     80   51    1 : tunables    0    0    0 : slabdata  40529  40529      0
taskstats           1880   1880    344   47    4 : tunables    0    0    0 : slabdata     40     40      0
proc_dir_entry      2352   2352    192   42    2 : tunables    0    0    0 : slabdata     56     56      0
pde_opener        3306126 3306228     40  102    1 : tunables    0    0    0 : slabdata  32414  32414      0
proc_inode_cache  1175048 1187760    664   49    8 : tunables    0    0    0 : slabdata  24240  24240      0
bdev_cache           715    741    832   39    8 : tunables    0    0    0 : slabdata     19     19      0
kernfs_node_cache 2503800 2503800    136   30    1 : tunables    0    0    0 : slabdata  83460  83460      0
mnt_cache           2100   2100    384   42    4 : tunables    0    0    0 : slabdata     50     50      0
filp              5559835 5726208    256   32    2 : tunables    0    0    0 : slabdata 178944 178944      0
inode_cache       1592023 1594175    592   55    8 : tunables    0    0    0 : slabdata  28985  28985      0
dentry            2285970 2307312    192   42    2 : tunables    0    0    0 : slabdata  54936  54936      0
names_cache          320    320   4096    8    8 : tunables    0    0    0 : slabdata     40     40      0
iint_cache             0      0    120   34    1 : tunables    0    0    0 : slabdata      0      0      0
lsm_file_cache     14280  14280     24  170    1 : tunables    0    0    0 : slabdata     84     84      0
buffer_head       421193 433134    104   39    1 : tunables    0    0    0 : slabdata  11106  11106      0
uts_namespace          0      0    440   37    4 : tunables    0    0    0 : slabdata      0      0      0
nsproxy             2920   2920     56   73    1 : tunables    0    0    0 : slabdata     40     40      0
vm_area_struct    13788959 13805571    208   39    2 : tunables    0    0    0 : slabdata 353989 353989      0
mm_struct         1710780 1710780   1088   30    8 : tunables    0    0    0 : slabdata  57026  57026      0
files_cache       1820542 1820542    704   46    8 : tunables    0    0    0 : slabdata  39577  39577      0
signal_cache      1378675 1378830   1088   30    8 : tunables    0    0    0 : slabdata  45961  45961      0
sighand_cache     595458 595470   2112   15    8 : tunables    0    0    0 : slabdata  39698  39698      0
task_struct       285237 290020   5888    5    8 : tunables    0    0    0 : slabdata  58004  58004      0
cred_jar          2559692 2559732    192   42    2 : tunables    0    0    0 : slabdata  60946  60946      0
anon_vma_chain    8805274 8806784     64   64    1 : tunables    0    0    0 : slabdata 137606 137606      0
anon_vma          3720344 3721308     88   46    1 : tunables    0    0    0 : slabdata  80898  80898      0
pid               1739680 1739680    128   32    1 : tunables    0    0    0 : slabdata  54365  54365      0
Acpi-Operand        4928   4928     72   56    1 : tunables    0    0    0 : slabdata     88     88      0
Acpi-ParseExt       1560   1560    104   39    1 : tunables    0    0    0 : slabdata     40     40      0
Acpi-State          2244   2244     80   51    1 : tunables    0    0    0 : slabdata     44     44      0
Acpi-Namespace      3366   3366     40  102    1 : tunables    0    0    0 : slabdata     33     33      0
numa_policy           62     62    264   31    2 : tunables    0    0    0 : slabdata      2      2      0
trace_event_file    4232   4232     88   46    1 : tunables    0    0    0 : slabdata     92     92      0
ftrace_event_field   5865   5865     48   85    1 : tunables    0    0    0 : slabdata     69     69      0
pool_workqueue     12753  12864    256   32    2 : tunables    0    0    0 : slabdata    402    402      0
radix_tree_node   404222 408632    584   28    4 : tunables    0    0    0 : slabdata  14600  14600      0
task_group          2040   2040    640   51    8 : tunables    0    0    0 : slabdata     40     40      0
dma-kmalloc-8k         0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-4k         0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-2k         0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-1k         0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-512        0      0    512   32    4 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-128        0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-64         0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-192        0      0    192   42    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-96         0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-8k         0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-4k         0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-2k         0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-1k         0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-512        0      0    512   32    4 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-192      420    420    192   42    2 : tunables    0    0    0 : slabdata     10     10      0
kmalloc-rcl-128     6632   6688    128   32    1 : tunables    0    0    0 : slabdata    209    209      0
kmalloc-rcl-96     26339  26544     96   42    1 : tunables    0    0    0 : slabdata    632    632      0
kmalloc-rcl-64     25336  26624     64   64    1 : tunables    0    0    0 : slabdata    416    416      0
kmalloc-rcl-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-8k          1527   1548   8192    4    8 : tunables    0    0    0 : slabdata    387    387      0
kmalloc-4k        327067 327096   4096    8    8 : tunables    0    0    0 : slabdata  40887  40887      0
kmalloc-2k        215533 215552   2048   16    8 : tunables    0    0    0 : slabdata  13472  13472      0
kmalloc-1k        882909 883296   1024   32    8 : tunables    0    0    0 : slabdata  27603  27603      0
kmalloc-512       360331 360608    512   32    4 : tunables    0    0    0 : slabdata  11269  11269      0
kmalloc-256        14943  14944    256   32    2 : tunables    0    0    0 : slabdata    467    467      0
kmalloc-192       1101565 1101702    192   42    2 : tunables    0    0    0 : slabdata  26231  26231      0
kmalloc-128        27641  27680    128   32    1 : tunables    0    0    0 : slabdata    865    865      0
kmalloc-96        752313 755832     96   42    1 : tunables    0    0    0 : slabdata  17996  17996      0
kmalloc-64        1492050 1495424     64   64    1 : tunables    0    0    0 : slabdata  23366  23366      0
kmalloc-32        4731136 4731136     32  128    1 : tunables    0    0    0 : slabdata  36962  36962      0
kmalloc-16         69376  69376     16  256    1 : tunables    0    0    0 : slabdata    271    271      0
kmalloc-8         1037476 1038848      8  512    1 : tunables    0    0    0 : slabdata   2029   2029      0
kmem_cache_node   147069 147072     64   64    1 : tunables    0    0    0 : slabdata   2298   2298      0
kmem_cache         74446  74466    384   42    4 : tunables    0    0    0 : slabdata   1773   1773      0
-------------- next part --------------
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
xfs_dqtrx              0      0    528   31    4 : tunables    0    0    0 : slabdata      0      0      0
xfs_dquot              0      0    504   32    4 : tunables    0    0    0 : slabdata      0      0      0
xfs_rui_item           0      0    696   47    8 : tunables    0    0    0 : slabdata      0      0      0
xfs_rud_item           0      0    176   46    2 : tunables    0    0    0 : slabdata      0      0      0
xfs_inode            308    340    960   34    8 : tunables    0    0    0 : slabdata     10     10      0
xfs_efd_item         185    185    440   37    4 : tunables    0    0    0 : slabdata      5      5      0
xfs_buf_item         270    270    272   30    2 : tunables    0    0    0 : slabdata      9      9      0
xfs_trans           1155   1155    232   35    2 : tunables    0    0    0 : slabdata     33     33      0
xfs_da_state           0      0    480   34    4 : tunables    0    0    0 : slabdata      0      0      0
xfs_btree_cur        432    432    224   36    2 : tunables    0    0    0 : slabdata     12     12      0
xfs_log_ticket      1540   1540    184   44    2 : tunables    0    0    0 : slabdata     35     35      0
kvm_async_pf        1200   1200    136   30    1 : tunables    0    0    0 : slabdata     40     40      0
kvm_vcpu              29     29  21376    1    8 : tunables    0    0    0 : slabdata     29     29      0
kvm_mmu_page_header  18105  18105    160   51    2 : tunables    0    0    0 : slabdata    355    355      0
x86_fpu              203    203   4160    7    8 : tunables    0    0    0 : slabdata     29     29      0
sw_flow                0      0   1952   16    8 : tunables    0    0    0 : slabdata      0      0      0
nf_conncount_rb        0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
nf_conntrack        2040   2040    320   51    4 : tunables    0    0    0 : slabdata     40     40      0
rpc_inode_cache      102    102    640   51    8 : tunables    0    0    0 : slabdata      2      2      0
ext4_groupinfo_4k    784    784    144   28    1 : tunables    0    0    0 : slabdata     28     28      0
scsi_sense_cache    1824   1824    128   32    1 : tunables    0    0    0 : slabdata     57     57      0
PINGv6                 0      0   1152   28    8 : tunables    0    0    0 : slabdata      0      0      0
RAWv6               1204   1204   1152   28    8 : tunables    0    0    0 : slabdata     43     43      0
UDPv6              55650  55650   1280   25    8 : tunables    0    0    0 : slabdata   2226   2226      0
tw_sock_TCPv6       1360   1360    240   34    2 : tunables    0    0    0 : slabdata     40     40      0
request_sock_TCPv6      0      0    304   53    4 : tunables    0    0    0 : slabdata      0      0      0
TCPv6                533    533   2368   13    8 : tunables    0    0    0 : slabdata     41     41      0
kcopyd_job             0      0   3312    9    8 : tunables    0    0    0 : slabdata      0      0      0
dm_uevent              0      0   2632   12    8 : tunables    0    0    0 : slabdata      0      0      0
dm_old_clone_request      0      0    296   55    4 : tunables    0    0    0 : slabdata      0      0      0
dm_rq_target_io        0      0    120   34    1 : tunables    0    0    0 : slabdata      0      0      0
mqueue_inode_cache     36     36    896   36    8 : tunables    0    0    0 : slabdata      1      1      0
fuse_request        1640   1640    392   41    4 : tunables    0    0    0 : slabdata     40     40      0
fuse_inode          4123   4326    768   42    8 : tunables    0    0    0 : slabdata    103    103      0
ecryptfs_key_record_cache      0      0    576   28    4 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_headers       8      8   4096    8    8 : tunables    0    0    0 : slabdata      1      1      0
ecryptfs_inode_cache      0      0    960   34    8 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_dentry_info_cache    128    128     32  128    1 : tunables    0    0    0 : slabdata      1      1      0
ecryptfs_file_cache      0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_auth_tok_list_item      0      0    832   39    8 : tunables    0    0    0 : slabdata      0      0      0
fat_inode_cache       90     90    728   45    8 : tunables    0    0    0 : slabdata      2      2      0
fat_cache              0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
squashfs_inode_cache      0      0    704   46    8 : tunables    0    0    0 : slabdata      0      0      0
jbd2_journal_head   5338   5338    120   34    1 : tunables    0    0    0 : slabdata    157    157      0
jbd2_revoke_table_s    512    512     16  256    1 : tunables    0    0    0 : slabdata      2      2      0
ext4_inode_cache  103557 113790   1080   30    8 : tunables    0    0    0 : slabdata   3793   3793      0
ext4_allocation_context   1280   1280    128   32    1 : tunables    0    0    0 : slabdata     40     40      0
ext4_pending_reservation   7168   7168     32  128    1 : tunables    0    0    0 : slabdata     56     56      0
ext4_extent_status  13848  15606     40  102    1 : tunables    0    0    0 : slabdata    153    153      0
mbcache             4380   4380     56   73    1 : tunables    0    0    0 : slabdata     60     60      0
fscrypt_info        2560   2560     64   64    1 : tunables    0    0    0 : slabdata     40     40      0
fscrypt_ctx         3400   3400     48   85    1 : tunables    0    0    0 : slabdata     40     40      0
userfaultfd_ctx_cache      0      0    192   42    2 : tunables    0    0    0 : slabdata      0      0      0
dnotify_struct      5632   5632     32  128    1 : tunables    0    0    0 : slabdata     44     44      0
posix_timers_cache   1360   1360    240   34    2 : tunables    0    0    0 : slabdata     40     40      0
UNIX              465504 465504   1024   32    8 : tunables    0    0    0 : slabdata  14547  14547      0
ip4-frags              0      0    208   39    2 : tunables    0    0    0 : slabdata      0      0      0
xfrm_dst_cache     16830  17238    320   51    4 : tunables    0    0    0 : slabdata    338    338      0
xfrm_state             0      0    768   42    8 : tunables    0    0    0 : slabdata      0      0      0
PING                   0      0    960   34    8 : tunables    0    0    0 : slabdata      0      0      0
RAW                 1598   1598    960   34    8 : tunables    0    0    0 : slabdata     47     47      0
tw_sock_TCP          952    952    240   34    2 : tunables    0    0    0 : slabdata     28     28      0
request_sock_TCP    2120   2120    304   53    4 : tunables    0    0    0 : slabdata     40     40      0
TCP                 1845   1845   2176   15    8 : tunables    0    0    0 : slabdata    123    123      0
hugetlbfs_inode_cache    583    583    616   53    8 : tunables    0    0    0 : slabdata     11     11      0
dquot               1280   1280    256   32    2 : tunables    0    0    0 : slabdata     40     40      0
eventpoll_pwq      13216  13216     72   56    1 : tunables    0    0    0 : slabdata    236    236      0
dax_cache            307    336    768   42    8 : tunables    0    0    0 : slabdata      8      8      0
request_queue        120    120   2056   15    8 : tunables    0    0    0 : slabdata      8      8      0
biovec-max           640    648   8192    4    8 : tunables    0    0    0 : slabdata    162    162      0
biovec-128          2080   2080   2048   16    8 : tunables    0    0    0 : slabdata    130    130      0
biovec-64           1440   1440   1024   32    8 : tunables    0    0    0 : slabdata     45     45      0
dmaengine-unmap-256     15     15   2112   15    8 : tunables    0    0    0 : slabdata      1      1      0
dmaengine-unmap-128     30     30   1088   30    8 : tunables    0    0    0 : slabdata      1      1      0
dmaengine-unmap-16   5166   5166    192   42    2 : tunables    0    0    0 : slabdata    123    123      0
dmaengine-unmap-2 7536079 7570944     64   64    1 : tunables    0    0    0 : slabdata 118296 118296      0
sock_inode_cache  1494300 1494300    640   51    8 : tunables    0    0    0 : slabdata  29300  29300      0
skbuff_ext_cache    1344   1344    128   32    1 : tunables    0    0    0 : slabdata     42     42      0
skbuff_fclone_cache   1696   1696    512   32    4 : tunables    0    0    0 : slabdata     53     53      0
skbuff_head_cache 391008 391232    256   32    2 : tunables    0    0    0 : slabdata  12226  12226      0
file_lock_cache     1480   1480    216   37    2 : tunables    0    0    0 : slabdata     40     40      0
net_namespace          0      0   6272    5    8 : tunables    0    0    0 : slabdata      0      0      0
shmem_inode_cache 115066 115714    696   47    8 : tunables    0    0    0 : slabdata   2462   2462      0
task_delay_info   2363187 2363187     80   51    1 : tunables    0    0    0 : slabdata  46337  46337      0
taskstats           1880   1880    344   47    4 : tunables    0    0    0 : slabdata     40     40      0
proc_dir_entry      2352   2352    192   42    2 : tunables    0    0    0 : slabdata     56     56      0
pde_opener        3777570 3777672     40  102    1 : tunables    0    0    0 : slabdata  37036  37036      0
proc_inode_cache  1338998 1353625    664   49    8 : tunables    0    0    0 : slabdata  27625  27625      0
bdev_cache           715    741    832   39    8 : tunables    0    0    0 : slabdata     19     19      0
kernfs_node_cache 2821263 2821320    136   30    1 : tunables    0    0    0 : slabdata  94044  94044      0
mnt_cache           2184   2184    384   42    4 : tunables    0    0    0 : slabdata     52     52      0
filp              6359482 6550112    256   32    2 : tunables    0    0    0 : slabdata 204691 204691      0
inode_cache       1818494 1820225    592   55    8 : tunables    0    0    0 : slabdata  33095  33095      0
dentry            2605094 2623740    192   42    2 : tunables    0    0    0 : slabdata  62470  62470      0
names_cache          320    320   4096    8    8 : tunables    0    0    0 : slabdata     40     40      0
iint_cache             0      0    120   34    1 : tunables    0    0    0 : slabdata      0      0      0
lsm_file_cache     14450  14450     24  170    1 : tunables    0    0    0 : slabdata     85     85      0
buffer_head       463822 473967    104   39    1 : tunables    0    0    0 : slabdata  12153  12153      0
uts_namespace          0      0    440   37    4 : tunables    0    0    0 : slabdata      0      0      0
nsproxy             2920   2920     56   73    1 : tunables    0    0    0 : slabdata     40     40      0
vm_area_struct    15765050 15784158    208   39    2 : tunables    0    0    0 : slabdata 404722 404722      0
mm_struct         1958011 1958040   1088   30    8 : tunables    0    0    0 : slabdata  65268  65268      0
files_cache       2082328 2082328    704   46    8 : tunables    0    0    0 : slabdata  45268  45268      0
signal_cache      1576061 1576110   1088   30    8 : tunables    0    0    0 : slabdata  52537  52537      0
sighand_cache     680677 680805   2112   15    8 : tunables    0    0    0 : slabdata  45387  45387      0
task_struct       326007 331250   5888    5    8 : tunables    0    0    0 : slabdata  66250  66250      0
cred_jar          2929292 2929332    192   42    2 : tunables    0    0    0 : slabdata  69746  69746      0
anon_vma_chain    10029976 10031552     64   64    1 : tunables    0    0    0 : slabdata 156743 156743      0
anon_vma          4234576 4235266     88   46    1 : tunables    0    0    0 : slabdata  92071  92071      0
pid               1989408 1989408    128   32    1 : tunables    0    0    0 : slabdata  62169  62169      0
Acpi-Operand        4928   4928     72   56    1 : tunables    0    0    0 : slabdata     88     88      0
Acpi-ParseExt       1560   1560    104   39    1 : tunables    0    0    0 : slabdata     40     40      0
Acpi-State          2244   2244     80   51    1 : tunables    0    0    0 : slabdata     44     44      0
Acpi-Namespace      3366   3366     40  102    1 : tunables    0    0    0 : slabdata     33     33      0
numa_policy           62     62    264   31    2 : tunables    0    0    0 : slabdata      2      2      0
trace_event_file    4232   4232     88   46    1 : tunables    0    0    0 : slabdata     92     92      0
ftrace_event_field   5865   5865     48   85    1 : tunables    0    0    0 : slabdata     69     69      0
pool_workqueue     12361  12512    256   32    2 : tunables    0    0    0 : slabdata    391    391      0
radix_tree_node   431129 437276    584   28    4 : tunables    0    0    0 : slabdata  15623  15623      0
task_group          2040   2040    640   51    8 : tunables    0    0    0 : slabdata     40     40      0
dma-kmalloc-8k         0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-4k         0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-2k         0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-1k         0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-512        0      0    512   32    4 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-128        0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-64         0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-192        0      0    192   42    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-96         0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-8k         0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-4k         0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-2k         0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-1k         0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-512        0      0    512   32    4 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-192      420    420    192   42    2 : tunables    0    0    0 : slabdata     10     10      0
kmalloc-rcl-128     6752   6752    128   32    1 : tunables    0    0    0 : slabdata    211    211      0
kmalloc-rcl-96     25417  25788     96   42    1 : tunables    0    0    0 : slabdata    614    614      0
kmalloc-rcl-64     25422  26176     64   64    1 : tunables    0    0    0 : slabdata    409    409      0
kmalloc-rcl-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-rcl-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-8k          1552   1560   8192    4    8 : tunables    0    0    0 : slabdata    390    390      0
kmalloc-4k        373769 373808   4096    8    8 : tunables    0    0    0 : slabdata  46726  46726      0
kmalloc-2k        247249 247264   2048   16    8 : tunables    0    0    0 : slabdata  15454  15454      0
kmalloc-1k        1010825 1011040   1024   32    8 : tunables    0    0    0 : slabdata  31595  31595      0
kmalloc-512       403433 403808    512   32    4 : tunables    0    0    0 : slabdata  12619  12619      0
kmalloc-256        16239  16512    256   32    2 : tunables    0    0    0 : slabdata    516    516      0
kmalloc-192       1260365 1260462    192   42    2 : tunables    0    0    0 : slabdata  30011  30011      0
kmalloc-128        30795  30880    128   32    1 : tunables    0    0    0 : slabdata    965    965      0
kmalloc-96        860672 865662     96   42    1 : tunables    0    0    0 : slabdata  20611  20611      0
kmalloc-64        1699569 1701824     64   64    1 : tunables    0    0    0 : slabdata  26591  26591      0
kmalloc-32        5300345 5300480     32  128    1 : tunables    0    0    0 : slabdata  41410  41410      0
kmalloc-16         75008  75008     16  256    1 : tunables    0    0    0 : slabdata    293    293      0
kmalloc-8         1176740 1178112      8  512    1 : tunables    0    0    0 : slabdata   2301   2301      0
kmem_cache_node   165847 165888     64   64    1 : tunables    0    0    0 : slabdata   2592   2592      0
kmem_cache         84070  84168    384   42    4 : tunables    0    0    0 : slabdata   2004   2004      0
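
Not part of the original report, but as a rough way to read the listing above: each cache's footprint is approximately num_objs * objsize, so ranking the caches by that product quickly shows where the memory sits. A minimal sketch, assuming the standard /proc/slabinfo 2.1 column layout (reading the file needs root on recent kernels):

# rank slab caches by approximate footprint (num_objs * objsize), top 20
awk 'NR>2 {printf "%-30s %10.1f MiB\n", $1, $3*$4/1048576}' /proc/slabinfo | sort -rn -k2 | head -20

Applied to the dump above, that puts vm_area_struct, mm_struct and task_struct at the top.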

From chris.hofstaedtler at deduktiva.com  Sat Sep 28 20:01:21 2019
From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva)
Date: Sat, 28 Sep 2019 20:01:21 +0200
Subject: [PVE-User] Kernel Memory Leak on PVE6?
In-Reply-To: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
References: <20190920123117.bn5eydbjsmb7tfyl@zeha.at>
Message-ID: <20190928180121.c44smlrz47echxt6@zeha.at>

* Chris Hofstaedtler | Deduktiva <chris.hofstaedtler at deduktiva.com> [190920 14:31]:
> This machine has the same (except CPU) hardware as the box next to
> it; however this one was freshly installed with PVE6, the other one
> is an upgrade from PVE5 and doesn't exhibit this problem. It's quite
> puzzling because I haven't seen this symptom at all at any of the
> customer installations.

Turns out the upgraded-from-PVE5 machine also shows these symptoms,
it's just not as noticeable. And I've found one more machine at a
customer site showing the same problem.

I can't really make out a pattern in the varying configurations,
though.

Chris
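
Not from the original message: for anyone wanting to check whether another node shows the same creeping growth, sampling the slab counters over a day or two is usually enough to see the trend. A minimal sketch using standard procfs/procps tools:

# overall slab usage; record this periodically (e.g. hourly from cron) and compare
grep -E '^(Slab|SReclaimable|SUnreclaim):' /proc/meminfo

# largest caches by size (needs root); re-run later and diff the output
slabtop -o -s c | head -n 15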


From info at aminvakil.com  Sun Sep 29 10:05:32 2019
From: info at aminvakil.com (Amin Vakil)
Date: Sun, 29 Sep 2019 11:35:32 +0330
Subject: [PVE-User] CentOS 8 Linux Installation error with scsi
Message-ID: <4c06caa9-c6c3-5ff3-6be2-df619674811b@aminvakil.com>

It seems that CentOS 8 cannot be installed if the bus of the CD/DVD
drive in Proxmox is set to SCSI. I checked both
CentOS-8-x86_64-1905-boot.iso and CentOS-8-x86_64-1905-dvd1.iso (and
verified their checksums), but when I try to install, I get the error
"/dev/root does not exist".

It works fine if I set the bus of the CD/DVD drive to SATA.
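
Not part of the original report, but as a workaround until the installer issue is understood, the CD/DVD drive can be moved to a SATA slot from the command line as well as from the GUI. A rough sketch; the VMID (100), the device slots (scsi2/sata1) and the storage/ISO path are only examples and will differ per setup:

# detach the SCSI CD-ROM and re-attach the same ISO on a SATA slot instead
qm set 100 --delete scsi2
qm set 100 --sata1 local:iso/CentOS-8-x86_64-1905-dvd1.iso,media=cdrom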
