From me at marcobertorello.it Thu Jan 2 16:11:18 2020
From: me at marcobertorello.it (Bertorello, Marco)
Date: Thu, 2 Jan 2020 16:11:18 +0100
Subject: [PVE-User] Network Device models
Message-ID: <726368a2-8ab5-fb6f-3d62-edc153da01ff@marcobertorello.it>

Dear PVE Users,

A maybe silly question about NIC card models.

I have a running Opnsense installation on a PVE VM, with 3 NICs, all as
VirtIO (paravirtualized) as per [1].

All works fine, but there is a bug [2][3] in Opnsense when using cards
other than Realtek.

But if I try to run the VM using the Realtek 8139 model, the OS doesn't
see any interfaces. Is this expected, since my physical NICs aren't
Realtek?

Thanks a lot and best regards,

[1] https://docs.netgate.com/pfsense/en/latest/virtualization/virtualizing-pfsense-with-proxmox.html

[2] https://forum.opnsense.org/index.php?topic=9754.0

[3] https://forum.opnsense.org/index.php?topic=14315.0

--
Marco Bertorello
https://www.marcobertorello.it

From brians at iptel.co Thu Jan 2 18:38:56 2020
From: brians at iptel.co (Brian :)
Date: Thu, 2 Jan 2020 17:38:56 +0000
Subject: [PVE-User] Network Device models
In-Reply-To: <726368a2-8ab5-fb6f-3d62-edc153da01ff@marcobertorello.it>
References: <726368a2-8ab5-fb6f-3d62-edc153da01ff@marcobertorello.it>
Message-ID:

Hi Marco,

The physical NIC type is irrelevant. If the guest doesn't see the
Realteks, then most likely it's a driver issue in the guest. I'd try to
get the "using cards other than Realtek" issue addressed in Opnsense.

Brian

On Thursday, January 2, 2020, Bertorello, Marco wrote:
> Dear PVE Users,
>
> a, maybe, silly question about NIC cards models.
>
> I have a running Opnsense installation on a PVE VM, with 3 NICs, all as
> VirtIO (paravirtualized) as per [1].
>
> All works fine, but there is a bug[2][3] in Opnsense, using cards other
> than Realtek.
>
> But, if I try to run the VM using Realtek 8139 model, the OS doesn't see
> any interfaces. Is this right, since my physical NICs aren't Realtek?
>
> Thanks a lot and best regards,
>
> [1]
> https://docs.netgate.com/pfsense/en/latest/virtualization/virtualizing-pfsense-with-proxmox.html
>
> [2] https://forum.opnsense.org/index.php?topic=9754.0
>
> [3] https://forum.opnsense.org/index.php?topic=14315.0
>
> --
> Marco Bertorello
> https://www.marcobertorello.it
>
>

From pve-user at poige.ru Fri Jan 3 17:33:48 2020
From: pve-user at poige.ru (Igor Podlesny)
Date: Fri, 3 Jan 2020 23:33:48 +0700
Subject: [PVE-User] Proxmox 5, LXC, Ubuntu 18.04: mess and weirdness
Message-ID:

Hello!

Being sceptical about LXC's low isolation level, I have typically
preferred KVM. There's a circumstance (and some more, of course) where KVM
doesn't fit well though: ZFS-backed storage on the hardware node. It's
either a matter of passing it through "as a filesystem" with some network
file-sharing protocol, or, say, "9p", or of formatting another FS on top
of ZFS used as a block device and having it all work together rather
"so-so". None of the above is optimal, and neither is building up a ZFS
pool inside a VM.

So that's why I've decided to try LXC this time: Debian GNU/Linux 9.11
(stretch), 4.15.18-24-pve, x86_64 -- the latest Proxmox 5, IOW. I've
chosen the 18.04 Ubuntu template that comes with Proxmox 5 and at first it
all ... looked good. :)

Then I realised there are strange quirks (all quirks are strange, of
course) with DNS resolving inside the CT. I ran tcpdump and ...
it just quit w/o printing a line of output (but not immediately as if it waited for packets first). dmesg on hardware node (HN) has this: ... audit: type=1400 audit(1578064533.151:56): apparmor="DENIED" operation="file_inherit" namespace="root//lxc-30100_<-var-lib-lxc>" profile="/usr/sbin/tcpdump" name="/dev/pts/6" pid=43675 comm="tcpdump" requested_mask="wr" denied_mask="wr" fsuid=100000 ouid=101000 ... audit: type=1400 audit(1578064538.331:60): apparmor="DENIED" operation="getattr" info="Failed name lookup - disconnected path" error=-13 namespace="root//lxc-30100_<-var-lib-lxc>" profile="/usr/sbin/tcpdump" name="apparmor/.null" pid=43675 comm="tcpdump" requested_mask="r" denied_mask="r" fsuid=165534 ouid=0 and it's not the only line related to the CT! In fact there are some even about my manuals reading there too: ... audit: type=1400 audit(1578064494.766:49): apparmor="STATUS" operation="profile_load" label="lxc-30100_//&:lxc-30100_<-var-lib-lxc>:unconfined" name="/usr/bin/man" pid=42514 comm="apparmor_parser" and so on. -- Obviously it's kinda *crippled environment* no one would be gladly using. (There's another question why would someone ship the system that by default gives you just this but probably it just wasn't tested.) What in your opinion is the best way to have it fixed? Proxmox 6? Relaxing AppArmor's ruleset? Turning it off completely? (I'm unsure if it's supported mode of operation at all.) I tried turning on "feature" named "nesting" but it fixed neither tcpdump malfunction nor DNS quirks themselves. Shall a privileged CT work just fine instead? Or is bringing KVM back instead the only sane option? Unsure if 9p will handle a few TB storage gracefully. -- End of message. Next message? From f.thommen at dkfz-heidelberg.de Thu Jan 9 16:33:29 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Thu, 9 Jan 2020 16:33:29 +0100 Subject: [PVE-User] Starting number of VMs and containers Message-ID: <92d8e255-27ab-9f0c-c77f-9924fc1f7913@dkfz-heidelberg.de> Dear all, is there a specific reason, why PROXMOX VMs and containers are numbered from 100 and not from - e.g. - 001? Can the starting number be changed? I cannot find an appropriate section in the administration manual. Cheers frank From dietmar at proxmox.com Thu Jan 9 17:11:10 2020 From: dietmar at proxmox.com (Dietmar Maurer) Date: Thu, 9 Jan 2020 17:11:10 +0100 (CET) Subject: [PVE-User] Starting number of VMs and containers In-Reply-To: <92d8e255-27ab-9f0c-c77f-9924fc1f7913@dkfz-heidelberg.de> References: <92d8e255-27ab-9f0c-c77f-9924fc1f7913@dkfz-heidelberg.de> Message-ID: <827808956.55.1578586270412@webmail.proxmox.com> > is there a specific reason, why PROXMOX VMs and containers are numbered > from 100 and not from - e.g. - 001? Can the starting number be changed? This has historical reasons (IDs 0-99 were reserved by OpenVZ for internal use). From f.thommen at dkfz-heidelberg.de Fri Jan 10 13:44:46 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Fri, 10 Jan 2020 13:44:46 +0100 Subject: [PVE-User] LVM/pvesm udev database initialization warnings after import of KVM images Message-ID: Dear all, after having (successfully) imported two KVM disk images from oVirt, LVM and pvesm complain about some udev initialization problem: root at pve01:~# pvesm status WARNING: Device /dev/dm-8 not initialized in udev database even after waiting 10000000 microseconds. WARNING: Device /dev/dm-9 not initialized in udev database even after waiting 10000000 microseconds. 
WARNING: Device /dev/dm-8 not initialized in udev database even after waiting 10000000 microseconds. WARNING: Device /dev/dm-9 not initialized in udev database even after waiting 10000000 microseconds. Name Type Status Total Used Available % local dir active 98559220 45640836 47868836 46.31% local-lvm lvmthin active 832245760 71573135 760672624 8.60% root at pve01:~# root at pve01:~# lvdisplay WARNING: Device /dev/dm-8 not initialized in udev database even after waiting 10000000 microseconds. WARNING: Device /dev/dm-9 not initialized in udev database even after waiting 10000000 microseconds. --- Logical volume --- LV Path /dev/pve/swap LV Name swap VG Name pve [...] root at pve01:~# However this doesn't seem to influence the functionality of the VMs. Any idea what could be the problem and how to fix it? Thank you very much in advance Frank From gianni.milo22 at gmail.com Fri Jan 10 15:32:11 2020 From: gianni.milo22 at gmail.com (Gianni Milo) Date: Fri, 10 Jan 2020 14:32:11 +0000 Subject: [PVE-User] LVM/pvesm udev database initialization warnings after import of KVM images In-Reply-To: References: Message-ID: Does issuing 'udevadm trigger' helps? On Fri, 10 Jan 2020 at 12:44, Frank Thommen wrote: > Dear all, > > after having (successfully) imported two KVM disk images from oVirt, LVM > and pvesm complain about some udev initialization problem: > > root at pve01:~# pvesm status > WARNING: Device /dev/dm-8 not initialized in udev database even after > waiting 10000000 microseconds. > WARNING: Device /dev/dm-9 not initialized in udev database even after > waiting 10000000 microseconds. > WARNING: Device /dev/dm-8 not initialized in udev database even after > waiting 10000000 microseconds. > WARNING: Device /dev/dm-9 not initialized in udev database even after > waiting 10000000 microseconds. > Name Type Status Total Used > Available % > local dir active 98559220 45640836 > 47868836 46.31% > local-lvm lvmthin active 832245760 71573135 > 760672624 8.60% > root at pve01:~# > > root at pve01:~# lvdisplay > WARNING: Device /dev/dm-8 not initialized in udev database even after > waiting 10000000 microseconds. > WARNING: Device /dev/dm-9 not initialized in udev database even after > waiting 10000000 microseconds. > --- Logical volume --- > LV Path /dev/pve/swap > LV Name swap > VG Name pve > [...] > root at pve01:~# > > However this doesn't seem to influence the functionality of the VMs. > > Any idea what could be the problem and how to fix it? > > Thank you very much in advance > Frank > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From gaio at sv.lnf.it Fri Jan 10 15:39:47 2020 From: gaio at sv.lnf.it (Marco Gaiarin) Date: Fri, 10 Jan 2020 15:39:47 +0100 Subject: [PVE-User] Watchdog in containers? In-Reply-To: <20191210135445.GL3544@sv.lnf.it> References: <20191210135445.GL3544@sv.lnf.it> Message-ID: <20200110143947.GD6118@sv.lnf.it> > Clearly, i've to find the guilty process (probably it is Samba) but in > the meantime... there's some sort of 'watchdog' for containers? > I can safely install 'watchdog' on containers, disable /dev/watchdog > and configure to do a reboot if load go too high? eg: > max-load-1 = 24 > max-load-5 = 18 > max-load-15 = 12 I reply to myself. 
Yes, you can install 'watchdog' on containers and use it, but you have to
take into account that the load values are the 'host load' and that
there's no real 'iron' reboot. So a container with:

	max-load-15 = 12

and a load-15 of 13 will reboot, but very probably when it comes back,
load-15 is still 13. ;-)

I've had a container that did two reboots in a row.

Now I'm testing memory allocation in watchdog (the load goes high because
of memory exhaustion), but indeed, with some glitches, using watchdog in a
container is doable. ;)

--
dott. Marco Gaiarin				GNUPG Key ID: 240A3D66
Associazione ``La Nostra Famiglia''		http://www.lanostrafamiglia.it/
Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
marco.gaiarin(at)lanostrafamiglia.it	t +39-0434-842711	f +39-0434-842797

	Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA!
	http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
	(cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA)

From f.thommen at dkfz-heidelberg.de Fri Jan 10 16:47:11 2020
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Fri, 10 Jan 2020 16:47:11 +0100
Subject: [PVE-User] LVM/pvesm udev database initialization warnings after import of KVM images
In-Reply-To:
References:
Message-ID: <880b77ef-3295-3585-69f3-f7934c36d47a@dkfz-heidelberg.de>

Yes it did. Thank you very much.

Is this a step that normally has to be executed after having imported a
disk image? If yes, then this could perhaps be added to
https://pve.proxmox.com/wiki/Migration_of_servers_to_Proxmox_VE#Qemu.2FKVM.

Cheers
frank

On 1/10/20 3:32 PM, Gianni Milo wrote:
> Does issuing 'udevadm trigger' helps?
>
> On Fri, 10 Jan 2020 at 12:44, Frank Thommen
> wrote:
>
>> Dear all,
>>
>> after having (successfully) imported two KVM disk images from oVirt, LVM
>> and pvesm complain about some udev initialization problem:
>>
>> root at pve01:~# pvesm status
>> WARNING: Device /dev/dm-8 not initialized in udev database even after
>> waiting 10000000 microseconds.
>> WARNING: Device /dev/dm-9 not initialized in udev database even after
>> waiting 10000000 microseconds.
>> WARNING: Device /dev/dm-8 not initialized in udev database even after
>> waiting 10000000 microseconds.
>> WARNING: Device /dev/dm-9 not initialized in udev database even after
>> waiting 10000000 microseconds.
>> Name             Type     Status           Total            Used
>> Available        %
>> local             dir     active        98559220        45640836
>> 47868836   46.31%
>> local-lvm     lvmthin     active       832245760        71573135
>> 760672624    8.60%
>> root at pve01:~#
>>
>> root at pve01:~# lvdisplay
>> WARNING: Device /dev/dm-8 not initialized in udev database even after
>> waiting 10000000 microseconds.
>> WARNING: Device /dev/dm-9 not initialized in udev database even after
>> waiting 10000000 microseconds.
>> --- Logical volume ---
>> LV Path                /dev/pve/swap
>> LV Name                swap
>> VG Name                pve
>> [...]
>> root at pve01:~#
>>
>> However this doesn't seem to influence the functionality of the VMs.
>>
>> Any idea what could be the problem and how to fix it?
>> >> Thank you very much in advance >> Frank >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From gianni.milo22 at gmail.com Fri Jan 10 19:00:38 2020 From: gianni.milo22 at gmail.com (Gianni Milo) Date: Fri, 10 Jan 2020 18:00:38 +0000 Subject: [PVE-User] LVM/pvesm udev database initialization warnings after import of KVM images In-Reply-To: <880b77ef-3295-3585-69f3-f7934c36d47a@dkfz-heidelberg.de> References: <880b77ef-3295-3585-69f3-f7934c36d47a@dkfz-heidelberg.de> Message-ID: No, normally it's not required to do that. It appears that some people have had this issue when migrating a VM from raw to thin lvm (?), see the link below for more information... https://forum.proxmox.com/threads/error-clone-failed-command-sbin-lvs-separator-noheadings-got-timeout.52519/#post-261766 It's unclear what is causing this. Perhaps an unfinished backup restore/clone job? Better to do some further tests, for example by creating a blank thin lvm based VM and see if the issue persists. G. On Fri, 10 Jan 2020 at 15:47, Frank Thommen wrote: > yes it did. Thank you very much. > > Is this a step that normally has to be executed after having imported a > disk image? If yes, then this could perhaps be added to > https://pve.proxmox.com/wiki/Migration_of_servers_to_Proxmox_VE#Qemu.2FKVM > . > > Cheers > frank > > > On 1/10/20 3:32 PM, Gianni Milo wrote: > > Does issuing 'udevadm trigger' helps? > > > > On Fri, 10 Jan 2020 at 12:44, Frank Thommen < > f.thommen at dkfz-heidelberg.de> > > wrote: > > > >> Dear all, > >> > >> after having (successfully) imported two KVM disk images from oVirt, LVM > >> and pvesm complain about some udev initialization problem: > >> > >> root at pve01:~# pvesm status > >> WARNING: Device /dev/dm-8 not initialized in udev database even > after > >> waiting 10000000 microseconds. > >> WARNING: Device /dev/dm-9 not initialized in udev database even > after > >> waiting 10000000 microseconds. > >> WARNING: Device /dev/dm-8 not initialized in udev database even > after > >> waiting 10000000 microseconds. > >> WARNING: Device /dev/dm-9 not initialized in udev database even > after > >> waiting 10000000 microseconds. > >> Name Type Status Total Used > >> Available % > >> local dir active 98559220 45640836 > >> 47868836 46.31% > >> local-lvm lvmthin active 832245760 71573135 > >> 760672624 8.60% > >> root at pve01:~# > >> > >> root at pve01:~# lvdisplay > >> WARNING: Device /dev/dm-8 not initialized in udev database even > after > >> waiting 10000000 microseconds. > >> WARNING: Device /dev/dm-9 not initialized in udev database even > after > >> waiting 10000000 microseconds. > >> --- Logical volume --- > >> LV Path /dev/pve/swap > >> LV Name swap > >> VG Name pve > >> [...] > >> root at pve01:~# > >> > >> However this doesn't seem to influence the functionality of the VMs. > >> > >> Any idea what could be the problem and how to fix it? 
> >> > >> Thank you very much in advance > >> Frank > >> _______________________________________________ > >> pve-user mailing list > >> pve-user at pve.proxmox.com > >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > >> > > _______________________________________________ > > pve-user mailing list > > pve-user at pve.proxmox.com > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > From mark at tuxis.nl Tue Jan 14 10:08:18 2020 From: mark at tuxis.nl (Mark Schouten) Date: Tue, 14 Jan 2020 10:08:18 +0100 Subject: [PVE-User] CIFS Mount off- and online Message-ID: Hi, We have a CIFS mount to nl.dadup.eu over IPv6. This mount works fine, but pvestatd continuously loops between 'Offline' and 'Online'. Even though the mount works at that time. What can I do to help debug this issue? root at node03:/mnt/pve/TCC_Marketplace# pveversion pve-manager/6.1-3/37248ce6 (running kernel: 5.3.10-1-pve) root at node03:/mnt/pve/TCC_Marketplace# storage 'TCC_Marketplace' is not online find . ./dump ./dump/vzdump-qemu-160-2020_01_13-02_24_16.vma.lzo ./template ./template/iso ./template/iso/debian-10.1.0-amd64-netinst.iso ./template/cache -- Mark Schouten Tuxis, Ede, https://www.tuxis.nl T: +31 318 200208? ? From nada at verdnatura.es Tue Jan 14 13:47:11 2020 From: nada at verdnatura.es (nada) Date: Tue, 14 Jan 2020 13:47:11 +0100 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI Message-ID: <413ca3933a616bd30233b8fefb305154@verdnatura.es> good day a week ago I upgraded our proxmox 'cluster' to buster (it has just 2 nodes and both nodes are upgraded) MANY thanks for your guidelines at proxmox wiki !!! it works great :-) now i am testing multipath over iSCSI to SAN storage it works good, snapshots, volume resizing and migrations are functional but after reboot the LVMThin VG and relevant LVs are NOT autoactivated what am i missing ? temporally i created simple rc-local service/script to activate VG PLS can anybody write me how it should be done ? following are some details thank you Nada root at mox:~# pveversion pve-manager/6.1-5/9bf06119 (running kernel: 5.3.13-1-pve) root at mox:~# grep pve /etc/lvm/lvm.conf # Also do not scan LVM disks from guests on both VGs named & not named 'pve' global_filter = [ "r|/dev/zd.*|", "r|/dev/mapper/pve-.*|", "r|/dev/mapper/pve-(vm|base)--[0-9]+--disk--[0-9]+|", "a|/dev/mapper/3600.*|", "a|/dev/mapper/san.*|" ] root at mox:~# multipath -ll 3600c0ff000195f8e2172de5d01000000 dm-8 HP,P2000 G3 iSCSI size=23G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw |-+- policy='round-robin 0' prio=50 status=active | |- 9:0:0:5 sdn 8:208 active ready running | |- 8:0:0:5 sdh 8:112 active ready running | |- 11:0:0:5 sdo 8:224 active ready running | `- 7:0:0:5 sdm 8:192 active ready running `-+- policy='round-robin 0' prio=10 status=enabled |- 10:0:0:5 sde 8:64 active ready running |- 5:0:0:5 sdj 8:144 active ready running `- 6:0:0:5 sdk 8:160 active ready running root at mox:~# la /dev/mapper/ total 0 drwxr-xr-x 2 root root 340 Jan 13 19:09 . drwxr-xr-x 20 root root 5820 Jan 13 19:31 .. 
lrwxrwxrwx 1 root root 7 Jan 13 19:09 3600c0ff000195f8e2172de5d01000000 -> ../dm-8 crw------- 1 root root 10, 236 Jan 13 19:08 control lrwxrwxrwx 1 root root 7 Jan 13 19:08 pve-data -> ../dm-5 lrwxrwxrwx 1 root root 7 Jan 13 19:08 pve-data_tdata -> ../dm-3 lrwxrwxrwx 1 root root 7 Jan 13 19:08 pve-data_tmeta -> ../dm-2 lrwxrwxrwx 1 root root 7 Jan 13 19:08 pve-data-tpool -> ../dm-4 lrwxrwxrwx 1 root root 7 Jan 13 19:08 pve-root -> ../dm-1 lrwxrwxrwx 1 root root 7 Jan 13 19:08 pve-swap -> ../dm-0 lrwxrwxrwx 1 root root 7 Jan 13 19:08 pve-vm--104--disk--1 -> ../dm-7 lrwxrwxrwx 1 root root 7 Jan 13 19:08 pve-zfs -> ../dm-6 lrwxrwxrwx 1 root root 8 Jan 13 19:09 santest-santestpool -> ../dm-12 lrwxrwxrwx 1 root root 8 Jan 13 19:09 santest-santestpool_tdata -> ../dm-10 lrwxrwxrwx 1 root root 7 Jan 13 19:09 santest-santestpool_tmeta -> ../dm-9 lrwxrwxrwx 1 root root 8 Jan 13 19:09 santest-santestpool-tpool -> ../dm-11 lrwxrwxrwx 1 root root 8 Jan 13 19:09 santest-vm--901--disk--0 -> ../dm-13 root at mox:~# pvs -a PV VG Fmt Attr PSize PFree /dev/mapper/3600c0ff000195f8e2172de5d01000000 santest lvm2 a-- 23.28g 3.24g /dev/mapper/santest-vm--901--disk--0 --- 0 0 /dev/sda2 --- 0 0 /dev/sda3 pve lvm2 a-- <232.57g <1.86g /dev/sdb1 --- 0 0 /dev/sdb9 --- 0 0 /dev/sdc1 --- 0 0 /dev/sdc9 --- 0 0 /dev/sdd1 --- 0 0 /dev/sdd9 --- 0 0 /dev/sdf1 --- 0 0 /dev/sdf9 --- 0 0 /dev/sdg1 --- 0 0 /dev/sdg9 --- 0 0 /dev/sdi1 --- 0 0 /dev/sdi9 --- 0 0 /dev/sdl1 --- 0 0 /dev/sdl9 --- 0 0 root at mox:~# lvs -a LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert data pve twi-aotz-- 138.57g 0.59 10.69 [data_tdata] pve Twi-ao---- 138.57g [data_tmeta] pve ewi-ao---- 72.00m [lvol0_pmspare] pve ewi------- 72.00m root pve -wi-ao---- 58.00g swap pve -wi-ao---- 4.00g vm-104-disk-1 pve Vwi-a-tz-- 50.00g data 1.62 zfs pve -wi-ao---- 30.00g [lvol0_pmspare] santest ewi------- 20.00m santestpool santest twi-aotz-- 20.00g 3.40 12.21 [santestpool_tdata] santest Twi-ao---- 20.00g [santestpool_tmeta] santest ewi-ao---- 20.00m vm-901-disk-0 santest Vwi-aotz-- 2.50g santestpool 27.23 root at mox:~# grep santest /var/log/syslog.1 |tail Jan 13 19:02:26 mox lvm[2005]: santest: autoactivation failed. Jan 13 19:04:43 mox lvm[441]: Monitoring thin pool santest-santestpool-tpool. Jan 13 19:06:14 mox lvm[441]: No longer monitoring thin pool santest-santestpool-tpool. Jan 13 19:06:14 mox blkdeactivate[12609]: [LVM]: deactivating Volume Group santest... done Jan 13 19:09:02 mox lvm[2003]: Cannot activate LVs in VG santest while PVs appear on duplicate devices. Jan 13 19:09:02 mox lvm[2003]: Cannot activate LVs in VG santest while PVs appear on duplicate devices. Jan 13 19:09:02 mox lvm[2003]: 0 logical volume(s) in volume group "santest" now active Jan 13 19:09:02 mox lvm[2003]: santest: autoactivation failed. Jan 13 19:09:11 mox lvm[442]: Monitoring thin pool santest-santestpool-tpool. 
Jan 13 19:09:12 mox rc.local[1767]: 2 logical volume(s) in volume group "santest" now active root at mox:~# pvesm status Name Type Status Total Used Available % backup dir active 59600812 28874144 27669416 48.45% local dir active 59600812 28874144 27669416 48.45% local-lvm lvmthin active 145301504 857278 144444225 0.59% santestpool lvmthin active 20971520 713031 20258488 3.40% zfs zfspool active 30219964 13394036 16825928 44.32% root at mox:~# cat /lib/systemd/system/rc-local.service [Unit] Description=/etc/rc.local Compatibility Documentation=man:systemd-rc-local-generator(8) ConditionFileIsExecutable=/etc/rc.local After=network.target iscsid.service multipathd.service open-iscsi.service [Service] Type=forking ExecStart=/etc/rc.local TimeoutSec=0 RemainAfterExit=yes GuessMainPID=no [Install] WantedBy=multi-user.target root at mox:~# cat /etc/rc.local #!/bin/bash # just to activate VGs from SAN /bin/sleep 10 /sbin/vgchange -aly santest From gaio at sv.lnf.it Tue Jan 14 14:28:01 2020 From: gaio at sv.lnf.it (Marco Gaiarin) Date: Tue, 14 Jan 2020 14:28:01 +0100 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: <413ca3933a616bd30233b8fefb305154@verdnatura.es> References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> Message-ID: <20200114132801.GC2777@sv.lnf.it> Mandi! nada In chel di` si favelave... > the LVMThin VG and relevant LVs are NOT autoactivated > what am i missing ? Usually you need: node.startup = automatic in /etc/iscsi/iscsid.conf on *every* server of the pool (and do initrd recreation, of course). -- dott. Marco Gaiarin GNUPG Key ID: 240A3D66 Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/ Polo FVG - Via della Bont?, 7 - 33078 - San Vito al Tagliamento (PN) marco.gaiarin(at)lanostrafamiglia.it t +39-0434-842711 f +39-0434-842797 Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA! http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000 (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA) From nada at verdnatura.es Tue Jan 14 15:54:39 2020 From: nada at verdnatura.es (nada) Date: Tue, 14 Jan 2020 15:54:39 +0100 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: <20200114132801.GC2777@sv.lnf.it> References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <20200114132801.GC2777@sv.lnf.it> Message-ID: thank you Marco once more i've checked iscsid.conf recreated initrd and reboot 2nd node WITHOUT rc.local but the same error "autoactivation failed" detail steps follow Nada root at mox11:~# grep node.startup /etc/iscsi/iscsid.conf node.startup = automatic root at mox11:~# grep pve /etc/lvm/lvm.conf # Also do not scan LVM disks from guests on both VGs named & not named 'pve' global_filter = [ "r|/dev/zd.*|", "r|/dev/mapper/pve-.*|", "r|/dev/mapper/pve-(vm|base)--[0-9]+--disk--[0-9]+|", "a|/dev/mapper/3600.*|", "a|/dev/mapper/san.*|" ] root at mox11:~# uname -r 5.3.13-1-pve # cp /boot/initrd.img-5.3.13-1-pve /boot/initrd.img-5.3.13-1-pve.20200106.bak # update-initramfs -c -k 5.3.13-1-pve # update-grub # reboot root at mox11:~# grep santest /var/log/syslog Jan 14 15:36:09 mox11 blkdeactivate[19162]: [LVM]: deactivating Volume Group santest... skipping Jan 14 15:39:36 mox11 lvm[2086]: Cannot activate LVs in VG santest while PVs appear on duplicate devices. Jan 14 15:39:36 mox11 lvm[2086]: Cannot activate LVs in VG santest while PVs appear on duplicate devices. 
Jan 14 15:39:36 mox11 lvm[2086]: 0 logical volume(s) in volume group "santest" now active Jan 14 15:39:36 mox11 lvm[2086]: santest: autoactivation failed. root at mox11:~# lvs -a /dev/sda: open failed: No medium found LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert data pve twi-aotz-- 9.98g 0.00 10.61 [data_tdata] pve Twi-ao---- 9.98g [data_tmeta] pve ewi-ao---- 12.00m [lvol0_pmspare] pve ewi------- 12.00m root pve -wi-ao---- 16.75g swap pve -wi-ao---- 4.00g zfs pve -wi-ao---- 30.00g [lvol0_pmspare] santest ewi------- 12.00m santestpool santest twi---tz-- 9.00g [santestpool_tdata] santest Twi------- 9.00g [santestpool_tmeta] santest ewi------- 12.00m vm-902-disk-0 santest Vwi---tz-- 2.00g santestpool root at mox11:~# /sbin/vgchange -aly santest /dev/sda: open failed: No medium found 2 logical volume(s) in volume group "santest" now active root at mox11:~# lvs -a /dev/sda: open failed: No medium found LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert data pve twi-aotz-- 9.98g 0.00 10.61 [data_tdata] pve Twi-ao---- 9.98g [data_tmeta] pve ewi-ao---- 12.00m [lvol0_pmspare] pve ewi------- 12.00m root pve -wi-ao---- 16.75g swap pve -wi-ao---- 4.00g zfs pve -wi-ao---- 30.00g [lvol0_pmspare] santest ewi------- 12.00m santestpool santest twi-aotz-- 9.00g 8.61 13.83 [santestpool_tdata] santest Twi-ao---- 9.00g [santestpool_tmeta] santest ewi-ao---- 12.00m vm-902-disk-0 santest Vwi-a-tz-- 2.00g santestpool 38.73 On 2020-01-14 14:28, Marco Gaiarin wrote: > Mandi! nada > In chel di` si favelave... > >> the LVMThin VG and relevant LVs are NOT autoactivated >> what am i missing ? > > Usually you need: > > node.startup = automatic > > in /etc/iscsi/iscsid.conf on *every* server of the pool (and do initrd > recreation, of course). From gaio at sv.lnf.it Tue Jan 14 16:12:23 2020 From: gaio at sv.lnf.it (Marco Gaiarin) Date: Tue, 14 Jan 2020 16:12:23 +0100 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <20200114132801.GC2777@sv.lnf.it> Message-ID: <20200114151223.GE2777@sv.lnf.it> Mandi! nada In chel di` si favelave... > root at mox11:~# grep santest /var/log/syslog > Jan 14 15:36:09 mox11 blkdeactivate[19162]: [LVM]: deactivating Volume Group santest... skipping > Jan 14 15:39:36 mox11 lvm[2086]: Cannot activate LVs in VG santest while PVs appear on duplicate devices. > Jan 14 15:39:36 mox11 lvm[2086]: Cannot activate LVs in VG santest while PVs appear on duplicate devices. > Jan 14 15:39:36 mox11 lvm[2086]: 0 logical volume(s) in volume group "santest" now active > Jan 14 15:39:36 mox11 lvm[2086]: santest: autoactivation failed. Mmmhhhh... seems to me that LVM get activated before multipath, and so they see multiple PVs (as effectively is). Never happened before, sorry... -- dott. Marco Gaiarin GNUPG Key ID: 240A3D66 Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/ Polo FVG - Via della Bont?, 7 - 33078 - San Vito al Tagliamento (PN) marco.gaiarin(at)lanostrafamiglia.it t +39-0434-842711 f +39-0434-842797 Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA! 
http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000 (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA) From nada at verdnatura.es Tue Jan 14 16:40:00 2020 From: nada at verdnatura.es (nada) Date: Tue, 14 Jan 2020 16:40:00 +0100 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: <20200114151223.GE2777@sv.lnf.it> References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <20200114132801.GC2777@sv.lnf.it> <20200114151223.GE2777@sv.lnf.it> Message-ID: <389b6180fc79c96727ffb9376b1a1555@verdnatura.es> dont'worry and be happy Marco that rc.local save the situation (temporal solution ;-) and CTs which have resources on that 'santest' LV are auto started after reboot Nada On 2020-01-14 16:12, Marco Gaiarin wrote: > Mandi! nada > In chel di` si favelave... > >> root at mox11:~# grep santest /var/log/syslog >> Jan 14 15:36:09 mox11 blkdeactivate[19162]: [LVM]: deactivating >> Volume Group santest... skipping >> Jan 14 15:39:36 mox11 lvm[2086]: Cannot activate LVs in VG santest >> while PVs appear on duplicate devices. >> Jan 14 15:39:36 mox11 lvm[2086]: Cannot activate LVs in VG santest >> while PVs appear on duplicate devices. >> Jan 14 15:39:36 mox11 lvm[2086]: 0 logical volume(s) in volume group >> "santest" now active >> Jan 14 15:39:36 mox11 lvm[2086]: santest: autoactivation failed. > > Mmmhhhh... seems to me that LVM get activated before multipath, and so > they see multiple PVs (as effectively is). > > Never happened before, sorry... From smr at kmi.com Tue Jan 14 19:46:29 2020 From: smr at kmi.com (Stefan M. Radman) Date: Tue, 14 Jan 2020 18:46:29 +0000 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: <389b6180fc79c96727ffb9376b1a1555@verdnatura.es> References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <20200114132801.GC2777@sv.lnf.it> <20200114151223.GE2777@sv.lnf.it> <389b6180fc79c96727ffb9376b1a1555@verdnatura.es> Message-ID: <321DD43B-75DA-489D-8EB4-D4537DED0B28@kmi.com> Hi Nada What's the output of "systemctl --failed" and "systemctl status lvm2-activation-net.service". Stefan > On Jan 14, 2020, at 16:40, nada wrote: > > dont'worry and be happy Marco > that rc.local save the situation (temporal solution ;-) > and CTs which have resources on that 'santest' LV are auto started after reboot > Nada > > On 2020-01-14 16:12, Marco Gaiarin wrote: >> Mandi! nada >> In chel di` si favelave... >>> root at mox11:~# grep santest /var/log/syslog >>> Jan 14 15:36:09 mox11 blkdeactivate[19162]: [LVM]: deactivating Volume Group santest... skipping >>> Jan 14 15:39:36 mox11 lvm[2086]: Cannot activate LVs in VG santest while PVs appear on duplicate devices. >>> Jan 14 15:39:36 mox11 lvm[2086]: Cannot activate LVs in VG santest while PVs appear on duplicate devices. >>> Jan 14 15:39:36 mox11 lvm[2086]: 0 logical volume(s) in volume group "santest" now active >>> Jan 14 15:39:36 mox11 lvm[2086]: santest: autoactivation failed. >> Mmmhhhh... seems to me that LVM get activated before multipath, and so >> they see multiple PVs (as effectively is). >> Never happened before, sorry... 
> _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpve.proxmox.com%2Fcgi-bin%2Fmailman%2Flistinfo%2Fpve-user&data=02%7C01%7Csmr%40kmi.com%7C80d026394e75434213c808d7990809b1%7Cc2283768b8d34e008f3d85b1b4f03b33%7C0%7C1%7C637146132119844345&sdata=9zp3vch%2FmXON%2FjJoIzCfnAc1w1bL%2BdvAKO7%2FARverHM%3D&reserved=0 From uwe.sauter.de at gmail.com Tue Jan 14 21:37:31 2020 From: uwe.sauter.de at gmail.com (Uwe Sauter) Date: Tue, 14 Jan 2020 21:37:31 +0100 Subject: [PVE-User] Renamed node, now WebUI partly not working Message-ID: <80c96efb-d5f0-a9de-a994-e2e35aa2cf9f@gmail.com> Hi all, today I create some kind of a mess. TL;DR I added one node with the wrong hostname to an existing cluster, tried to rename the node, now some parts of the WebUI don't work anymore. Any thoughts? I have a 3 node cluster with up-to-date software and added a fourth and fifth node. By accident, the fourth node had a wrong hostname. I tried to rename it by following [1] but forgot to increment totem {config_version}. Now on the WebUI -> Server View -> Datacenter -> -> Summary page, the inlined status panel shows "communication failure". If I try to migrate a VM to another node the drop down list for target node selection is empty (also when trying to do bulk migration). Any idea what I did wrong and how to repair the cluster? Thanks, Uwe [1] https://pve.proxmox.com/wiki/Renaming_a_PVE_node From nada at verdnatura.es Wed Jan 15 10:55:06 2020 From: nada at verdnatura.es (nada) Date: Wed, 15 Jan 2020 10:55:06 +0100 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <20200114132801.GC2777@sv.lnf.it> <20200114151223.GE2777@sv.lnf.it> <389b6180fc79c96727ffb9376b1a1555@verdnatura.es> Message-ID: <5b545e139718261531562fec0a49eede@verdnatura.es> On 2020-01-14 19:46, Stefan M. Radman via pve-user wrote: > Hi Nada > What's the output of "systemctl --failed" and "systemctl status > lvm2-activation-net.service". > Stefan Hi Stefan thank you for your response ! the output of "systemctl --failed" was claiming devices from SAN during boot, which were activated by rc-local.service after boot i do NOT have "lvm2-activation-net.service" and find masked status of multipath-tools-boot.service < is this ok ??? but i find some mistake in multipath.conf eventhough it is running i am going to reconfigure it and reboot this afternoon following are details Nada root at mox11:~# systemctl --failed --all UNIT LOAD ACTIVE SUB DESCRIPTION lvm2-pvscan at 253:7.service loaded failed failed LVM event activation on device 253:7 lvm2-pvscan at 253:8.service loaded failed failed LVM event activation on device 253:8 LOAD = Reflects whether the unit definition was properly loaded. ACTIVE = The high-level unit activation state, i.e. generalization of SUB. SUB = The low-level unit activation state, values depend on unit type. 2 loaded units listed. To show all installed unit files use 'systemctl list-unit-files'. 
root at mox11:~# dmsetup ls san2020jan-vm--903--disk--0 (253:18) santest-santestpool (253:12) 3600c0ff000195f8e7d0a855701000000 (253:7) pve-data-tpool (253:4) pve-data_tdata (253:3) pve-zfs (253:6) santest-santestpool-tpool (253:11) santest-santestpool_tdata (253:10) pve-data_tmeta (253:2) san2020jan-san2020janpool (253:17) santest-santestpool_tmeta (253:9) pve-swap (253:0) pve-root (253:1) pve-data (253:5) 3600c0ff000195f8ec3f01d5e01000000 (253:8) san2020jan-san2020janpool-tpool (253:16) san2020jan-san2020janpool_tdata (253:15) san2020jan-san2020janpool_tmeta (253:14) root at mox11:~# pvs -a PV VG Fmt Attr PSize PFree /dev/mapper/3600c0ff000195f8e7d0a855701000000 santest lvm2 a-- <9.31g 292.00m /dev/mapper/3600c0ff000195f8ec3f01d5e01000000 san2020jan lvm2 a-- <93.13g <2.95g /dev/mapper/san2020jan-vm--903--disk--0 --- 0 0 /dev/sdb --- 0 0 /dev/sdc2 --- 0 0 /dev/sdc3 pve lvm2 a-- 67.73g 6.97g root at mox11:~# vgs -a VG #PV #LV #SN Attr VSize VFree pve 1 4 0 wz--n- 67.73g 6.97g san2020jan 1 2 0 wz--n- <93.13g <2.95g santest 1 1 0 wz--n- <9.31g 292.00m root at mox11:~# lvs -a LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert data pve twi-aotz-- 9.98g 0.00 10.61 [data_tdata] pve Twi-ao---- 9.98g [data_tmeta] pve ewi-ao---- 12.00m [lvol0_pmspare] pve ewi------- 12.00m root pve -wi-ao---- 16.75g swap pve -wi-ao---- 4.00g zfs pve -wi-ao---- 30.00g [lvol0_pmspare] san2020jan ewi------- 92.00m san2020janpool san2020jan twi-aotz-- 90.00g 0.86 10.84 [san2020janpool_tdata] san2020jan Twi-ao---- 90.00g [san2020janpool_tmeta] san2020jan ewi-ao---- 92.00m vm-903-disk-0 san2020jan Vwi-aotz-- 2.50g san2020janpool 30.99 [lvol0_pmspare] santest ewi------- 12.00m santestpool santest twi-aotz-- 9.00g 0.00 10.58 [santestpool_tdata] santest Twi-ao---- 9.00g [santestpool_tmeta] santest ewi-ao---- 12.00m root at mox11:~# multipathd -k"show maps" Jan 15 10:50:02 | /etc/multipath.conf line 24, duplicate keyword: wwid name sysfs uuid 3600c0ff000195f8e7d0a855701000000 dm-7 3600c0ff000195f8e7d0a855701000000 3600c0ff000195f8ec3f01d5e01000000 dm-8 3600c0ff000195f8ec3f01d5e01000000 root at mox11:~# multipathd -k"show paths" Jan 15 10:50:07 | /etc/multipath.conf line 24, duplicate keyword: wwid hcil dev dev_t pri dm_st chk_st dev_st next_check 6:0:0:3 sde 8:64 10 active ready running XX........ 2/8 9:0:0:3 sdn 8:208 10 active ready running XX........ 2/8 7:0:0:3 sdh 8:112 50 active ready running XX........ 2/8 5:0:0:3 sdd 8:48 10 active ready running XXXXXX.... 5/8 11:0:0:3 sdp 8:240 50 active ready running X......... 1/8 10:0:0:3 sdl 8:176 50 active ready running XXXXXXXXXX 8/8 8:0:0:6 sdk 8:160 50 active ready running XXXXXX.... 5/8 8:0:0:3 sdj 8:144 50 active ready running XXXXXXXXXX 8/8 9:0:0:6 sdo 8:224 10 active ready running X......... 1/8 6:0:0:6 sdg 8:96 10 active ready running XXXXXX.... 5/8 5:0:0:6 sdf 8:80 10 active ready running XXXXXX.... 5/8 10:0:0:6 sdm 8:192 50 active ready running XXXXXXX... 6/8 11:0:0:6 sdq 65:0 50 active ready running XXXXXXXX.. 7/8 7:0:0:6 sdi 8:128 50 active ready running XXXXXXX... 
6/8 root at mox11:~# cat /etc/multipath.conf defaults { polling_interval 2 uid_attribute ID_SERIAL no_path_retry queue find_multipaths yes } blacklist { wwid .* # BECAREFULL @mox11 blacklit sda,sdb,sdc devnode "^sd[a-c]" } blacklist_exceptions { # 25G v_multitest # wwid "3600c0ff000195f8e2172de5d01000000" # 10G prueba wwid "3600c0ff000195f8e7d0a855701000000" # 100G sanmox11 wwid "3600c0ff000195f8ec3f01d5e01000000" } multipaths { multipath { # wwid "3600c0ff000195f8e2172de5d01000000" wwid "3600c0ff000195f8e7d0a855701000000" wwid "3600c0ff000195f8ec3f01d5e01000000" } } devices { device { #### the following 6 lines do NOT change vendor "HP" product "P2000 G3 FC|P2000G3 FC/iSCSI|P2000 G3 SAS|P2000 G3 iSCSI" # getuid_callout "/lib/udev/scsi_id -g -u -s /block/%n" path_grouping_policy "group_by_prio" prio "alua" failback "immediate" no_path_retry 18 #### hardware_handler "0" path_selector "round-robin 0" rr_weight uniform rr_min_io 100 path_checker tur } } root at mox11:~# systemctl status multipath-tools-boot multipath-tools-boot.service Loaded: masked (Reason: Unit multipath-tools-boot.service is masked.) Active: inactive (dead) root at mox11:~# pveversion -V proxmox-ve: 6.1-2 (running kernel: 5.3.13-1-pve) pve-manager: 6.1-5 (running version: 6.1-5/9bf06119) pve-kernel-5.3: 6.1-1 pve-kernel-helper: 6.1-1 pve-kernel-4.15: 5.4-12 pve-kernel-5.3.13-1-pve: 5.3.13-1 pve-kernel-4.15.18-24-pve: 4.15.18-52 pve-kernel-4.15.18-21-pve: 4.15.18-48 pve-kernel-4.15.18-11-pve: 4.15.18-34 ceph-fuse: 12.2.11+dfsg1-2.1+b1 corosync: 3.0.2-pve4 criu: 3.11-3 glusterfs-client: 5.5-3 ifupdown: 0.8.35+pve1 ksm-control-daemon: 1.3-1 libjs-extjs: 6.0.1-10 libknet1: 1.13-pve1 libpve-access-control: 6.0-5 libpve-apiclient-perl: 3.0-2 libpve-common-perl: 6.0-9 libpve-guest-common-perl: 3.0-3 libpve-http-server-perl: 3.0-3 libpve-storage-perl: 6.1-3 libqb0: 1.0.5-1 libspice-server1: 0.14.2-4~pve6+1 lvm2: 2.03.02-pve3 lxc-pve: 3.2.1-1 lxcfs: 3.0.3-pve60 novnc-pve: 1.1.0-1 proxmox-mini-journalreader: 1.1-1 proxmox-widget-toolkit: 2.1-1 pve-cluster: 6.1-2 pve-container: 3.0-15 pve-docs: 6.1-3 pve-edk2-firmware: 2.20191127-1 pve-firewall: 4.0-9 pve-firmware: 3.0-4 pve-ha-manager: 3.0-8 pve-i18n: 2.0-3 pve-qemu-kvm: 4.1.1-2 pve-xtermjs: 3.13.2-1 qemu-server: 6.1-4 smartmontools: 7.0-pve2 spiceterm: 3.1-1 vncterm: 1.6-1 zfsutils-linux: 0.8.2-pve2 From humbertos at ifsc.edu.br Wed Jan 15 12:02:55 2020 From: humbertos at ifsc.edu.br (Humberto Jose De Sousa) Date: Wed, 15 Jan 2020 08:02:55 -0300 (BRT) Subject: [PVE-User] Renamed node, now WebUI partly not working In-Reply-To: <80c96efb-d5f0-a9de-a994-e2e35aa2cf9f@gmail.com> References: <80c96efb-d5f0-a9de-a994-e2e35aa2cf9f@gmail.com> Message-ID: <1768013333.1835719.1579086175937.JavaMail.zimbra@ifsc.edu.br> Hi Uwe. All documetation that I know say changing the hostname and IP is not possible after cluster creation. Once I tried fix this issue and the solution was remove the node, reinstall and add the node on the cluster. Humberto De: "Uwe Sauter" Para: "PVE User List" Enviadas: Ter?a-feira, 14 de janeiro de 2020 17:37:31 Assunto: [PVE-User] Renamed node, now WebUI partly not working Hi all, today I create some kind of a mess. TL;DR I added one node with the wrong hostname to an existing cluster, tried to rename the node, now some parts of the WebUI don't work anymore. Any thoughts? I have a 3 node cluster with up-to-date software and added a fourth and fifth node. By accident, the fourth node had a wrong hostname. 
I tried to rename it by following [1] but forgot to increment totem {config_version}. Now on the WebUI -> Server View -> Datacenter -> -> Summary page, the inlined status panel shows "communication failure". If I try to migrate a VM to another node the drop down list for target node selection is empty (also when trying to do bulk migration). Any idea what I did wrong and how to repair the cluster? Thanks, Uwe [1] https://pve.proxmox.com/wiki/Renaming_a_PVE_node _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From smr at kmi.com Wed Jan 15 12:06:31 2020 From: smr at kmi.com (Stefan M. Radman) Date: Wed, 15 Jan 2020 11:06:31 +0000 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: <5b545e139718261531562fec0a49eede@verdnatura.es> References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <20200114132801.GC2777@sv.lnf.it> <20200114151223.GE2777@sv.lnf.it> <389b6180fc79c96727ffb9376b1a1555@verdnatura.es> <5b545e139718261531562fec0a49eede@verdnatura.es> Message-ID: <56518B49-6608-4161-8424-5088F92853BE@kmi.com> Hi Nada, Unfortunately I don't have any first hand experience with PVE6 yet. On the PVE5.4 cluster I am working with I had an issue that looked very similar to yours: LVM refused to activate iSCSI multipath volumes on boot, making the lvm2-activation-net.service fail. This only happened during boot of the host. Restarting the lvm2-activation-net.service after boot activated the volume with multipath working. Suspecting a timing/dependency issue specific to my configuration I took a pragmatic approach and added a custom systemd service template to restart the lvm2-activation-net.service after multipath initialization (see below). # cat /etc/systemd/system/lvm2-after-multipath.service [Unit] Description=LVM2 after Multipath After=multipathd.service lvm2-activation-net.service [Service] Type=oneshot ExecStart=/bin/systemctl start lvm2-activation-net.service [Install] WantedBy=sysinit.target Things on PVE6 seem to have changed a bit but your lvm2-pvescan service failures indicate a similar problem ("failed LVM event activation"). Disable your rc.local workaround and try to restart the two failed services after reboot. If that works you might want to take a similar approach instead of activating the volumes manually. The masked status of multipath-tools-boot.service is ok. The package is only needed when booting from multipath devices. Your mistake in multipath.conf seems to be in the multipaths section. Each multipath device can only have one WWID. For two WWIDs you'll need two multiparty subsections. https://help.ubuntu.com/lts/serverguide/multipath-dm-multipath-config-file.html#multipath-config-multipath Stefan On Jan 15, 2020, at 10:55, nada > wrote: On 2020-01-14 19:46, Stefan M. Radman via pve-user wrote: Hi Nada What's the output of "systemctl --failed" and "systemctl status lvm2-activation-net.service". Stefan Hi Stefan thank you for your response ! the output of "systemctl --failed" was claiming devices from SAN during boot, which were activated by rc-local.service after boot i do NOT have "lvm2-activation-net.service" and find masked status of multipath-tools-boot.service < is this ok ??? 
but i find some mistake in multipath.conf eventhough it is running i am going to reconfigure it and reboot this afternoon following are details Nada root at mox11:~# systemctl --failed --all UNIT LOAD ACTIVE SUB DESCRIPTION lvm2-pvscan at 253:7.service loaded failed failed LVM event activation on device 253:7 lvm2-pvscan at 253:8.service loaded failed failed LVM event activation on device 253:8 LOAD = Reflects whether the unit definition was properly loaded. ACTIVE = The high-level unit activation state, i.e. generalization of SUB. SUB = The low-level unit activation state, values depend on unit type. 2 loaded units listed. To show all installed unit files use 'systemctl list-unit-files'. root at mox11:~# dmsetup ls san2020jan-vm--903--disk--0 (253:18) santest-santestpool (253:12) 3600c0ff000195f8e7d0a855701000000 (253:7) pve-data-tpool (253:4) pve-data_tdata (253:3) pve-zfs (253:6) santest-santestpool-tpool (253:11) santest-santestpool_tdata (253:10) pve-data_tmeta (253:2) san2020jan-san2020janpool (253:17) santest-santestpool_tmeta (253:9) pve-swap (253:0) pve-root (253:1) pve-data (253:5) 3600c0ff000195f8ec3f01d5e01000000 (253:8) san2020jan-san2020janpool-tpool (253:16) san2020jan-san2020janpool_tdata (253:15) san2020jan-san2020janpool_tmeta (253:14) root at mox11:~# pvs -a PV VG Fmt Attr PSize PFree /dev/mapper/3600c0ff000195f8e7d0a855701000000 santest lvm2 a-- <9.31g 292.00m /dev/mapper/3600c0ff000195f8ec3f01d5e01000000 san2020jan lvm2 a-- <93.13g <2.95g /dev/mapper/san2020jan-vm--903--disk--0 --- 0 0 /dev/sdb --- 0 0 /dev/sdc2 --- 0 0 /dev/sdc3 pve lvm2 a-- 67.73g 6.97g root at mox11:~# vgs -a VG #PV #LV #SN Attr VSize VFree pve 1 4 0 wz--n- 67.73g 6.97g san2020jan 1 2 0 wz--n- <93.13g <2.95g santest 1 1 0 wz--n- <9.31g 292.00m root at mox11:~# lvs -a LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert data pve twi-aotz-- 9.98g 0.00 10.61 [data_tdata] pve Twi-ao---- 9.98g [data_tmeta] pve ewi-ao---- 12.00m [lvol0_pmspare] pve ewi------- 12.00m root pve -wi-ao---- 16.75g swap pve -wi-ao---- 4.00g zfs pve -wi-ao---- 30.00g [lvol0_pmspare] san2020jan ewi------- 92.00m san2020janpool san2020jan twi-aotz-- 90.00g 0.86 10.84 [san2020janpool_tdata] san2020jan Twi-ao---- 90.00g [san2020janpool_tmeta] san2020jan ewi-ao---- 92.00m vm-903-disk-0 san2020jan Vwi-aotz-- 2.50g san2020janpool 30.99 [lvol0_pmspare] santest ewi------- 12.00m santestpool santest twi-aotz-- 9.00g 0.00 10.58 [santestpool_tdata] santest Twi-ao---- 9.00g [santestpool_tmeta] santest ewi-ao---- 12.00m root at mox11:~# multipathd -k"show maps" Jan 15 10:50:02 | /etc/multipath.conf line 24, duplicate keyword: wwid name sysfs uuid 3600c0ff000195f8e7d0a855701000000 dm-7 3600c0ff000195f8e7d0a855701000000 3600c0ff000195f8ec3f01d5e01000000 dm-8 3600c0ff000195f8ec3f01d5e01000000 root at mox11:~# multipathd -k"show paths" Jan 15 10:50:07 | /etc/multipath.conf line 24, duplicate keyword: wwid hcil dev dev_t pri dm_st chk_st dev_st next_check 6:0:0:3 sde 8:64 10 active ready running XX........ 2/8 9:0:0:3 sdn 8:208 10 active ready running XX........ 2/8 7:0:0:3 sdh 8:112 50 active ready running XX........ 2/8 5:0:0:3 sdd 8:48 10 active ready running XXXXXX.... 5/8 11:0:0:3 sdp 8:240 50 active ready running X......... 1/8 10:0:0:3 sdl 8:176 50 active ready running XXXXXXXXXX 8/8 8:0:0:6 sdk 8:160 50 active ready running XXXXXX.... 5/8 8:0:0:3 sdj 8:144 50 active ready running XXXXXXXXXX 8/8 9:0:0:6 sdo 8:224 10 active ready running X......... 1/8 6:0:0:6 sdg 8:96 10 active ready running XXXXXX.... 
5/8 5:0:0:6 sdf 8:80 10 active ready running XXXXXX.... 5/8 10:0:0:6 sdm 8:192 50 active ready running XXXXXXX... 6/8 11:0:0:6 sdq 65:0 50 active ready running XXXXXXXX.. 7/8 7:0:0:6 sdi 8:128 50 active ready running XXXXXXX... 6/8 root at mox11:~# cat /etc/multipath.conf defaults { polling_interval 2 uid_attribute ID_SERIAL no_path_retry queue find_multipaths yes } blacklist { wwid .* # BECAREFULL @mox11 blacklit sda,sdb,sdc devnode "^sd[a-c]" } blacklist_exceptions { # 25G v_multitest # wwid "3600c0ff000195f8e2172de5d01000000" # 10G prueba wwid "3600c0ff000195f8e7d0a855701000000" # 100G sanmox11 wwid "3600c0ff000195f8ec3f01d5e01000000" } multipaths { multipath { # wwid "3600c0ff000195f8e2172de5d01000000" wwid "3600c0ff000195f8e7d0a855701000000" wwid "3600c0ff000195f8ec3f01d5e01000000" } } devices { device { #### the following 6 lines do NOT change vendor "HP" product "P2000 G3 FC|P2000G3 FC/iSCSI|P2000 G3 SAS|P2000 G3 iSCSI" # getuid_callout "/lib/udev/scsi_id -g -u -s /block/%n" path_grouping_policy "group_by_prio" prio "alua" failback "immediate" no_path_retry 18 #### hardware_handler "0" path_selector "round-robin 0" rr_weight uniform rr_min_io 100 path_checker tur } } root at mox11:~# systemctl status multipath-tools-boot multipath-tools-boot.service Loaded: masked (Reason: Unit multipath-tools-boot.service is masked.) Active: inactive (dead) root at mox11:~# pveversion -V proxmox-ve: 6.1-2 (running kernel: 5.3.13-1-pve) pve-manager: 6.1-5 (running version: 6.1-5/9bf06119) pve-kernel-5.3: 6.1-1 pve-kernel-helper: 6.1-1 pve-kernel-4.15: 5.4-12 pve-kernel-5.3.13-1-pve: 5.3.13-1 pve-kernel-4.15.18-24-pve: 4.15.18-52 pve-kernel-4.15.18-21-pve: 4.15.18-48 pve-kernel-4.15.18-11-pve: 4.15.18-34 ceph-fuse: 12.2.11+dfsg1-2.1+b1 corosync: 3.0.2-pve4 criu: 3.11-3 glusterfs-client: 5.5-3 ifupdown: 0.8.35+pve1 ksm-control-daemon: 1.3-1 libjs-extjs: 6.0.1-10 libknet1: 1.13-pve1 libpve-access-control: 6.0-5 libpve-apiclient-perl: 3.0-2 libpve-common-perl: 6.0-9 libpve-guest-common-perl: 3.0-3 libpve-http-server-perl: 3.0-3 libpve-storage-perl: 6.1-3 libqb0: 1.0.5-1 libspice-server1: 0.14.2-4~pve6+1 lvm2: 2.03.02-pve3 lxc-pve: 3.2.1-1 lxcfs: 3.0.3-pve60 novnc-pve: 1.1.0-1 proxmox-mini-journalreader: 1.1-1 proxmox-widget-toolkit: 2.1-1 pve-cluster: 6.1-2 pve-container: 3.0-15 pve-docs: 6.1-3 pve-edk2-firmware: 2.20191127-1 pve-firewall: 4.0-9 pve-firmware: 3.0-4 pve-ha-manager: 3.0-8 pve-i18n: 2.0-3 pve-qemu-kvm: 4.1.1-2 pve-xtermjs: 3.13.2-1 qemu-server: 6.1-4 smartmontools: 7.0-pve2 spiceterm: 3.1-1 vncterm: 1.6-1 zfsutils-linux: 0.8.2-pve2 _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpve.proxmox.com%2Fcgi-bin%2Fmailman%2Flistinfo%2Fpve-user&data=02%7C01%7Csmr%40kmi.com%7C5197e0e1f4c64fb738cf08d799a1067b%7Cc2283768b8d34e008f3d85b1b4f03b33%7C0%7C1%7C637146789208877811&sdata=JhYOtpzDpjbQ1g4yrV%2FUwB8d3d4vX08Pd9wISQUmGp8%3D&reserved=0 From nada at verdnatura.es Wed Jan 15 13:16:29 2020 From: nada at verdnatura.es (nada) Date: Wed, 15 Jan 2020 13:16:29 +0100 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: <56518B49-6608-4161-8424-5088F92853BE@kmi.com> References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <20200114132801.GC2777@sv.lnf.it> <20200114151223.GE2777@sv.lnf.it> <389b6180fc79c96727ffb9376b1a1555@verdnatura.es> <5b545e139718261531562fec0a49eede@verdnatura.es> <56518B49-6608-4161-8424-5088F92853BE@kmi.com> Message-ID: thank you Stefan 
for your tips !! i do not do activation manualy, i use rc-local.service (details follows) but i will reconfigure multipath in the evening and reboot both nodes will let you know results by tomorrow have a nice day ;-) Nada root at mox11:~# cat /etc/systemd/system/multi-user.target.wants/rc-local.service [Unit] Description=/etc/rc.local Compatibility Documentation=man:systemd-rc-local-generator(8) ConditionFileIsExecutable=/etc/rc.local After=network.target iscsid.service multipathd.service open-iscsi.service [Service] Type=forking ExecStart=/etc/rc.local TimeoutSec=0 RemainAfterExit=yes GuessMainPID=no [Install] WantedBy=multi-user.target root at mox11:~# cat /etc/rc.local #!/bin/bash # just to activate VGs from SAN /bin/sleep 10 /sbin/vgchange -aly santest /sbin/vgchange -aly san2020jan On 2020-01-15 12:06, Stefan M. Radman wrote: > Hi Nada, > > Unfortunately I don't have any first hand experience with PVE6 yet. > > On the PVE5.4 cluster I am working with I had an issue that looked > very similar to yours: > LVM refused to activate iSCSI multipath volumes on boot, making the > lvm2-activation-net.service fail. > This only happened during boot of the host. > Restarting the lvm2-activation-net.service after boot activated the > volume with multipath working. > Suspecting a timing/dependency issue specific to my configuration I > took a pragmatic approach and added a custom systemd service template > to restart the lvm2-activation-net.service after multipath > initialization (see below). > > # cat /etc/systemd/system/lvm2-after-multipath.service > [Unit] > Description=LVM2 after Multipath > After=multipathd.service lvm2-activation-net.service > > [Service] > Type=oneshot > ExecStart=/bin/systemctl start lvm2-activation-net.service > > [Install] > WantedBy=sysinit.target > > Things on PVE6 seem to have changed a bit but your lvm2-pvescan > service failures indicate a similar problem ("failed LVM event > activation"). > Disable your rc.local workaround and try to restart the two failed > services after reboot. > If that works you might want to take a similar approach instead of > activating the volumes manually. > > The masked status of multipath-tools-boot.service is ok. The package > is only needed when booting from multipath devices. > > Your mistake in multipath.conf seems to be in the multipaths section. > Each multipath device can only have one WWID. For two WWIDs you'll > need two multiparty subsections. > https://help.ubuntu.com/lts/serverguide/multipath-dm-multipath-config-file.html#multipath-config-multipath > > Stefan > > On Jan 15, 2020, at 10:55, nada > > wrote: > > On 2020-01-14 19:46, Stefan M. Radman via pve-user wrote: > Hi Nada > What's the output of "systemctl --failed" and "systemctl status > lvm2-activation-net.service". > Stefan > > Hi Stefan > thank you for your response ! > the output of "systemctl --failed" was claiming devices from SAN during > boot, > which were activated by rc-local.service after boot > i do NOT have "lvm2-activation-net.service" > and find masked status of multipath-tools-boot.service < is this ok ??? 
> but i find some mistake in multipath.conf eventhough it is running > i am going to reconfigure it and reboot this afternoon > following are details > Nada > > root at mox11:~# systemctl --failed --all > UNIT LOAD ACTIVE SUB DESCRIPTION > lvm2-pvscan at 253:7.service loaded failed failed LVM event activation > on device 253:7 > lvm2-pvscan at 253:8.service loaded failed failed LVM event activation > on device 253:8 > > LOAD = Reflects whether the unit definition was properly loaded. > ACTIVE = The high-level unit activation state, i.e. generalization of > SUB. > SUB = The low-level unit activation state, values depend on unit > type. > > 2 loaded units listed. > To show all installed unit files use 'systemctl list-unit-files'. > > root at mox11:~# dmsetup ls > san2020jan-vm--903--disk--0 (253:18) > santest-santestpool (253:12) > 3600c0ff000195f8e7d0a855701000000 (253:7) > pve-data-tpool (253:4) > pve-data_tdata (253:3) > pve-zfs (253:6) > santest-santestpool-tpool (253:11) > santest-santestpool_tdata (253:10) > pve-data_tmeta (253:2) > san2020jan-san2020janpool (253:17) > santest-santestpool_tmeta (253:9) > pve-swap (253:0) > pve-root (253:1) > pve-data (253:5) > 3600c0ff000195f8ec3f01d5e01000000 (253:8) > san2020jan-san2020janpool-tpool (253:16) > san2020jan-san2020janpool_tdata (253:15) > san2020jan-san2020janpool_tmeta (253:14) > > root at mox11:~# pvs -a > PV VG Fmt Attr > PSize PFree > /dev/mapper/3600c0ff000195f8e7d0a855701000000 santest lvm2 a-- > <9.31g 292.00m > /dev/mapper/3600c0ff000195f8ec3f01d5e01000000 san2020jan lvm2 a-- > <93.13g <2.95g > /dev/mapper/san2020jan-vm--903--disk--0 --- > 0 0 > /dev/sdb --- > 0 0 > /dev/sdc2 --- > 0 0 > /dev/sdc3 pve lvm2 a-- > 67.73g 6.97g > root at mox11:~# vgs -a > VG #PV #LV #SN Attr VSize VFree > pve 1 4 0 wz--n- 67.73g 6.97g > san2020jan 1 2 0 wz--n- <93.13g <2.95g > santest 1 1 0 wz--n- <9.31g 292.00m > root at mox11:~# lvs -a > LV VG Attr LSize Pool > Origin Data% Meta% Move Log Cpy%Sync Convert > data pve twi-aotz-- 9.98g > 0.00 10.61 > [data_tdata] pve Twi-ao---- 9.98g > [data_tmeta] pve ewi-ao---- 12.00m > [lvol0_pmspare] pve ewi------- 12.00m > root pve -wi-ao---- 16.75g > swap pve -wi-ao---- 4.00g > zfs pve -wi-ao---- 30.00g > [lvol0_pmspare] san2020jan ewi------- 92.00m > san2020janpool san2020jan twi-aotz-- 90.00g > 0.86 10.84 > [san2020janpool_tdata] san2020jan Twi-ao---- 90.00g > [san2020janpool_tmeta] san2020jan ewi-ao---- 92.00m > vm-903-disk-0 san2020jan Vwi-aotz-- 2.50g san2020janpool > 30.99 > [lvol0_pmspare] santest ewi------- 12.00m > santestpool santest twi-aotz-- 9.00g > 0.00 10.58 > [santestpool_tdata] santest Twi-ao---- 9.00g > [santestpool_tmeta] santest ewi-ao---- 12.00m > > root at mox11:~# multipathd -k"show maps" > Jan 15 10:50:02 | /etc/multipath.conf line 24, duplicate keyword: wwid > name sysfs uuid > 3600c0ff000195f8e7d0a855701000000 dm-7 > 3600c0ff000195f8e7d0a855701000000 > 3600c0ff000195f8ec3f01d5e01000000 dm-8 > 3600c0ff000195f8ec3f01d5e01000000 > > root at mox11:~# multipathd -k"show paths" > Jan 15 10:50:07 | /etc/multipath.conf line 24, duplicate keyword: wwid > hcil dev dev_t pri dm_st chk_st dev_st next_check > 6:0:0:3 sde 8:64 10 active ready running XX........ 2/8 > 9:0:0:3 sdn 8:208 10 active ready running XX........ 2/8 > 7:0:0:3 sdh 8:112 50 active ready running XX........ 2/8 > 5:0:0:3 sdd 8:48 10 active ready running XXXXXX.... 5/8 > 11:0:0:3 sdp 8:240 50 active ready running X......... 
1/8 > 10:0:0:3 sdl 8:176 50 active ready running XXXXXXXXXX 8/8 > 8:0:0:6 sdk 8:160 50 active ready running XXXXXX.... 5/8 > 8:0:0:3 sdj 8:144 50 active ready running XXXXXXXXXX 8/8 > 9:0:0:6 sdo 8:224 10 active ready running X......... 1/8 > 6:0:0:6 sdg 8:96 10 active ready running XXXXXX.... 5/8 > 5:0:0:6 sdf 8:80 10 active ready running XXXXXX.... 5/8 > 10:0:0:6 sdm 8:192 50 active ready running XXXXXXX... 6/8 > 11:0:0:6 sdq 65:0 50 active ready running XXXXXXXX.. 7/8 > 7:0:0:6 sdi 8:128 50 active ready running XXXXXXX... 6/8 > > root at mox11:~# cat /etc/multipath.conf > defaults { > polling_interval 2 > uid_attribute ID_SERIAL > no_path_retry queue > find_multipaths yes > } > blacklist { > wwid .* > # BECAREFULL @mox11 blacklit sda,sdb,sdc > devnode "^sd[a-c]" > } > blacklist_exceptions { > # 25G v_multitest > # wwid "3600c0ff000195f8e2172de5d01000000" > # 10G prueba > wwid "3600c0ff000195f8e7d0a855701000000" > # 100G sanmox11 > wwid "3600c0ff000195f8ec3f01d5e01000000" > } > multipaths { > multipath { > # wwid "3600c0ff000195f8e2172de5d01000000" > wwid "3600c0ff000195f8e7d0a855701000000" > wwid "3600c0ff000195f8ec3f01d5e01000000" > } > } > devices { > device { > #### the following 6 lines do NOT change > vendor "HP" > product "P2000 G3 FC|P2000G3 FC/iSCSI|P2000 G3 SAS|P2000 G3 iSCSI" > # getuid_callout "/lib/udev/scsi_id -g -u -s /block/%n" > path_grouping_policy "group_by_prio" > prio "alua" > failback "immediate" > no_path_retry 18 > #### > hardware_handler "0" > path_selector "round-robin 0" > rr_weight uniform > rr_min_io 100 > path_checker tur > } > } > > > root at mox11:~# systemctl status multipath-tools-boot > multipath-tools-boot.service > Loaded: masked (Reason: Unit multipath-tools-boot.service is masked.) > Active: inactive (dead) > > root at mox11:~# pveversion -V > proxmox-ve: 6.1-2 (running kernel: 5.3.13-1-pve) > pve-manager: 6.1-5 (running version: 6.1-5/9bf06119) > pve-kernel-5.3: 6.1-1 > pve-kernel-helper: 6.1-1 > pve-kernel-4.15: 5.4-12 > pve-kernel-5.3.13-1-pve: 5.3.13-1 > pve-kernel-4.15.18-24-pve: 4.15.18-52 > pve-kernel-4.15.18-21-pve: 4.15.18-48 > pve-kernel-4.15.18-11-pve: 4.15.18-34 > ceph-fuse: 12.2.11+dfsg1-2.1+b1 > corosync: 3.0.2-pve4 > criu: 3.11-3 > glusterfs-client: 5.5-3 > ifupdown: 0.8.35+pve1 > ksm-control-daemon: 1.3-1 > libjs-extjs: 6.0.1-10 > libknet1: 1.13-pve1 > libpve-access-control: 6.0-5 > libpve-apiclient-perl: 3.0-2 > libpve-common-perl: 6.0-9 > libpve-guest-common-perl: 3.0-3 > libpve-http-server-perl: 3.0-3 > libpve-storage-perl: 6.1-3 > libqb0: 1.0.5-1 > libspice-server1: 0.14.2-4~pve6+1 > lvm2: 2.03.02-pve3 > lxc-pve: 3.2.1-1 > lxcfs: 3.0.3-pve60 > novnc-pve: 1.1.0-1 > proxmox-mini-journalreader: 1.1-1 > proxmox-widget-toolkit: 2.1-1 > pve-cluster: 6.1-2 > pve-container: 3.0-15 > pve-docs: 6.1-3 > pve-edk2-firmware: 2.20191127-1 > pve-firewall: 4.0-9 > pve-firmware: 3.0-4 > pve-ha-manager: 3.0-8 > pve-i18n: 2.0-3 > pve-qemu-kvm: 4.1.1-2 > pve-xtermjs: 3.13.2-1 > qemu-server: 6.1-4 > smartmontools: 7.0-pve2 > spiceterm: 3.1-1 > vncterm: 1.6-1 > zfsutils-linux: 0.8.2-pve2 > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpve.proxmox.com%2Fcgi-bin%2Fmailman%2Flistinfo%2Fpve-user&data=02%7C01%7Csmr%40kmi.com%7C5197e0e1f4c64fb738cf08d799a1067b%7Cc2283768b8d34e008f3d85b1b4f03b33%7C0%7C1%7C637146789208877811&sdata=JhYOtpzDpjbQ1g4yrV%2FUwB8d3d4vX08Pd9wISQUmGp8%3D&reserved=0 From martin at holub.co.at Wed Jan 15 15:58:42 
2020 From: martin at holub.co.at (Martin Holub) Date: Wed, 15 Jan 2020 15:58:42 +0100 Subject: Host Alias for InfluxDB Data Message-ID: <4d58d9130bc9566b1af89d2bd67048072437d9dc.camel@holub.co.at> Hi, Is there any way to add an Alias instead the reported "hostname" for InfluxDB. My problem is, i have severall Hosts with hostname "s1" but FQDN like s1.$location.domain.tld. As the Influx exported seems to report only the hostname, but not the FQDN. I now have all Metrics on one Host, so this make the export quite useless in my scenario. Best Martin From t.lamprecht at proxmox.com Wed Jan 15 16:18:15 2020 From: t.lamprecht at proxmox.com (Thomas Lamprecht) Date: Wed, 15 Jan 2020 16:18:15 +0100 Subject: [PVE-User] Host Alias for InfluxDB Data In-Reply-To: References: Message-ID: <5c596bd7-1d41-fcd2-9d91-efc1f7b9b7af@proxmox.com> On 1/15/20 3:58 PM, Martin Holub via pve-user wrote: > Is there any way to add an Alias instead the reported "hostname" for > InfluxDB. My problem is, i have severall Hosts with hostname "s1" but > FQDN like s1.$location.domain.tld. As the Influx exported seems to > report only the hostname, but not the FQDN. I now have all Metrics on > one Host, so this make the export quite useless in my scenario. Hi! to either use the FQDN or having a possibility to set an alias for a host could make sense, the problem seems valid. Can you please open an enhancement request at https://bugzilla.proxmox.com ? So we can track this. Cheers, Thomas From f.thommen at dkfz-heidelberg.de Thu Jan 16 19:26:11 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Thu, 16 Jan 2020 19:26:11 +0100 Subject: [PVE-User] How to get rid of default thin-provisioned default LV "data"? Message-ID: <02dc099f-9c0e-9ea7-7b95-c1fa725d3f21@dkfz-heidelberg.de> Dear all, when installing PVE on our servers with 1 TB boot disk, the installer creates regular /root and swap LVs (ca. 100 GB in total) and the rest of the disk (800 GB) is used for a thin-provisioned LV named "data". Personally I don't like thin-provisioning and would like to get rid of it and have a "regular" LV instead. However I couldn't find a way to achieve this through the gui (PVE 6.1-3). Did I miss some gui functionality or is it ok to remove the LV with regular lvm commands? Will I break something if I remove it? No VM/container has been installed so far. The hypervisors are freshly installed and they contain additional several internal disks which we plan to use for local storage/Ceph. Cheers frank From gianni.milo22 at gmail.com Thu Jan 16 21:19:48 2020 From: gianni.milo22 at gmail.com (Gianni Milo) Date: Thu, 16 Jan 2020 20:19:48 +0000 Subject: [PVE-User] How to get rid of default thin-provisioned default LV "data"? In-Reply-To: <02dc099f-9c0e-9ea7-7b95-c1fa725d3f21@dkfz-heidelberg.de> References: <02dc099f-9c0e-9ea7-7b95-c1fa725d3f21@dkfz-heidelberg.de> Message-ID: This should be a case of removing (or comment out) its corresponding entry in /etc/pve/storage.cfg and then removing it by using the usual lvm commands. You can then "convert" it to thick lvm and re-add it in the config file... Gianni On Thu, 16 Jan 2020 at 18:26, Frank Thommen wrote: > Dear all, > > when installing PVE on our servers with 1 TB boot disk, the installer > creates regular /root and swap LVs (ca. 100 GB in total) and the rest of > the disk (800 GB) is used for a thin-provisioned LV named "data". > Personally I don't like thin-provisioning and would like to get rid of > it and have a "regular" LV instead. 
However I couldn't find a way to > achieve this through the gui (PVE 6.1-3). > > Did I miss some gui functionality or is it ok to remove the LV with > regular lvm commands? Will I break something if I remove it? No > VM/container has been installed so far. The hypervisors are freshly > installed and they contain additional several internal disks which we > plan to use for local storage/Ceph. > > Cheers > frank > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From f.thommen at dkfz-heidelberg.de Sat Jan 18 20:54:36 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Sat, 18 Jan 2020 20:54:36 +0100 Subject: [PVE-User] How to get rid of default thin-provisioned default LV "data"? In-Reply-To: References: <02dc099f-9c0e-9ea7-7b95-c1fa725d3f21@dkfz-heidelberg.de> Message-ID: <1a278dc6-6f4e-a22b-f9e8-99fab825575e@dkfz-heidelberg.de> Thanks Gianni, that worked just fine. I've now created a regular LVM LV instead, formatted and mounted it and configured it as "dir" storage. frank On 16.01.20 21:19, Gianni Milo wrote: > This should be a case of removing (or comment out) its corresponding entry > in /etc/pve/storage.cfg and then removing it by using the usual lvm > commands. > You can then "convert" it to thick lvm and re-add it in the config file... > > Gianni > > On Thu, 16 Jan 2020 at 18:26, Frank Thommen > wrote: > >> Dear all, >> >> when installing PVE on our servers with 1 TB boot disk, the installer >> creates regular /root and swap LVs (ca. 100 GB in total) and the rest of >> the disk (800 GB) is used for a thin-provisioned LV named "data". >> Personally I don't like thin-provisioning and would like to get rid of >> it and have a "regular" LV instead. However I couldn't find a way to >> achieve this through the gui (PVE 6.1-3). >> >> Did I miss some gui functionality or is it ok to remove the LV with >> regular lvm commands? Will I break something if I remove it? No >> VM/container has been installed so far. The hypervisors are freshly >> installed and they contain additional several internal disks which we >> plan to use for local storage/Ceph. >> >> Cheers >> frank >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > -- Frank Thommen | HD-HuB / DKFZ Heidelberg | f.thommen at dkfz-heidelberg.de From venefax at gmail.com Sun Jan 19 00:53:46 2020 From: venefax at gmail.com (Saint Michael) Date: Sat, 18 Jan 2020 18:53:46 -0500 Subject: [PVE-User] How to get rid of default thin-provisioned default LV "data"? In-Reply-To: <1a278dc6-6f4e-a22b-f9e8-99fab825575e@dkfz-heidelberg.de> References: <02dc099f-9c0e-9ea7-7b95-c1fa725d3f21@dkfz-heidelberg.de> <1a278dc6-6f4e-a22b-f9e8-99fab825575e@dkfz-heidelberg.de> Message-ID: can somebody post the full chain of commands? That is the only reason why I am not using the software. On Sat, Jan 18, 2020 at 2:54 PM Frank Thommen wrote: > Thanks Gianni, > > that worked just fine. I've now created a regular LVM LV instead, > formatted and mounted it and configured it as "dir" storage. 
> > frank > > > On 16.01.20 21:19, Gianni Milo wrote: > > This should be a case of removing (or comment out) its corresponding > entry > > in /etc/pve/storage.cfg and then removing it by using the usual lvm > > commands. > > You can then "convert" it to thick lvm and re-add it in the config > file... > > > > Gianni > > > > On Thu, 16 Jan 2020 at 18:26, Frank Thommen < > f.thommen at dkfz-heidelberg.de> > > wrote: > > > >> Dear all, > >> > >> when installing PVE on our servers with 1 TB boot disk, the installer > >> creates regular /root and swap LVs (ca. 100 GB in total) and the rest of > >> the disk (800 GB) is used for a thin-provisioned LV named "data". > >> Personally I don't like thin-provisioning and would like to get rid of > >> it and have a "regular" LV instead. However I couldn't find a way to > >> achieve this through the gui (PVE 6.1-3). > >> > >> Did I miss some gui functionality or is it ok to remove the LV with > >> regular lvm commands? Will I break something if I remove it? No > >> VM/container has been installed so far. The hypervisors are freshly > >> installed and they contain additional several internal disks which we > >> plan to use for local storage/Ceph. > >> > >> Cheers > >> frank > >> _______________________________________________ > >> pve-user mailing list > >> pve-user at pve.proxmox.com > >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > >> > > _______________________________________________ > > pve-user mailing list > > pve-user at pve.proxmox.com > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > > > -- > Frank Thommen | HD-HuB / DKFZ Heidelberg > | f.thommen at dkfz-heidelberg.de > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From f.thommen at dkfz-heidelberg.de Sun Jan 19 14:05:30 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Sun, 19 Jan 2020 14:05:30 +0100 Subject: [PVE-User] How to get rid of default thin-provisioned default LV "data"? In-Reply-To: References: <02dc099f-9c0e-9ea7-7b95-c1fa725d3f21@dkfz-heidelberg.de> <1a278dc6-6f4e-a22b-f9e8-99fab825575e@dkfz-heidelberg.de> Message-ID: <1f6a4189-a9cb-0824-401c-48f2c7aca75f@dkfz-heidelberg.de> That's what I have written in our internal documentation: -------------------------------------------------- We delete the current thin-provisioned LV, create a new "thick-provisioned" LV instead, format it, mount it and configure it as "dir" storage for PVE: First manually remove the lvmthin entry from /etc/pve/storage.cfg, then $ lvremove /dev/pve/data Do you really want to remove and DISCARD active logical volume pve/data? [y/n]: yes Logical volume "data" successfully removed $ lvcreate -n data -l 100%FREE pve Logical volume "data" created. $ mkfs -t ext4 /dev/pve/data mke2fs 1.44.5 (15-Dec-2018) Creating filesystem with 216501248 4k blocks and 54132736 inodes Filesystem UUID: 5d457f4c-40a5-4423-9c3a-03db1ae76869 Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 102400000, 214990848 Allocating group tables: done Writing inode tables: done Creating journal (262144 blocks): done Writing superblocks and filesystem accounting information: done $ mkdir /data $ echo "/dev/pve/data /data ext4 defaults 0 2" >> /etc/fstab && mount -a $ pvesm add dir data --path /data --content images,rootdir,vztmpl,iso,backup,snippets $ done... 
-------------------------------------------------- HTH, frank On 19.01.20 00:53, Saint Michael wrote: > can somebody post the full chain of commands? > That is the only reason why I am not using the software. > > > On Sat, Jan 18, 2020 at 2:54 PM Frank Thommen > wrote: > >> Thanks Gianni, >> >> that worked just fine. I've now created a regular LVM LV instead, >> formatted and mounted it and configured it as "dir" storage. >> >> frank >> >> >> On 16.01.20 21:19, Gianni Milo wrote: >>> This should be a case of removing (or comment out) its corresponding >> entry >>> in /etc/pve/storage.cfg and then removing it by using the usual lvm >>> commands. >>> You can then "convert" it to thick lvm and re-add it in the config >> file... >>> >>> Gianni >>> >>> On Thu, 16 Jan 2020 at 18:26, Frank Thommen < >> f.thommen at dkfz-heidelberg.de> >>> wrote: >>> >>>> Dear all, >>>> >>>> when installing PVE on our servers with 1 TB boot disk, the installer >>>> creates regular /root and swap LVs (ca. 100 GB in total) and the rest of >>>> the disk (800 GB) is used for a thin-provisioned LV named "data". >>>> Personally I don't like thin-provisioning and would like to get rid of >>>> it and have a "regular" LV instead. However I couldn't find a way to >>>> achieve this through the gui (PVE 6.1-3). >>>> >>>> Did I miss some gui functionality or is it ok to remove the LV with >>>> regular lvm commands? Will I break something if I remove it? No >>>> VM/container has been installed so far. The hypervisors are freshly >>>> installed and they contain additional several internal disks which we >>>> plan to use for local storage/Ceph. >>>> >>>> Cheers >>>> frank >>>> _______________________________________________ >>>> pve-user mailing list >>>> pve-user at pve.proxmox.com >>>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >>>> >>> _______________________________________________ >>> pve-user mailing list >>> pve-user at pve.proxmox.com >>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >>> >> >> >> -- >> Frank Thommen | HD-HuB / DKFZ Heidelberg >> | f.thommen at dkfz-heidelberg.de >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > -- Frank Thommen | HD-HuB / DKFZ Heidelberg | f.thommen at dkfz-heidelberg.de From smr at kmi.com Tue Jan 21 00:44:55 2020 From: smr at kmi.com (Stefan M. Radman) Date: Mon, 20 Jan 2020 23:44:55 +0000 Subject: PVE 6.1 incorrect MTU Message-ID: Recently I upgraded the first node of a 5.4 cluster to 6.1. Everything went smooth (thanks for pve5to6!) until I restarted the first node after the upgrade and started to get weird problems with the LVM/multipath/iSCSI based storage (hung PVE and LVM processes etc). After a while of digging I found that the issues were due to an incorrect MTU on the storage interface (Jumbo frames got truncated). The network configuration (see /etc/network/interfaces further below) worked very well with 5.4 but it does not work with 6.1. After booting with PVE 6.1, some of the MTUs are not as configured. Here is what "ip link" shows after boot. 
root at seisram04:~# ip link | fgrep mtu 1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 2: eno1: mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000 3: eno2: mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000 4: eno3: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 5: eno4: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 6: bond0: mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000 7: vmbr0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 8: vmbr533: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 9: bond0.533 at bond0: mtu 1500 qdisc noqueue master vmbr533 state UP mode DEFAULT group default qlen 1000 10: vmbr683: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 11: bond0.683 at bond0: mtu 1500 qdisc noqueue master vmbr683 state UP mode DEFAULT group default qlen 1000 12: vmbr686: mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000 13: bond0.686 at bond0: mtu 1500 qdisc noqueue master vmbr686 state UP mode DEFAULT group default qlen 1000 The only interface that is correctly configured for Jumbo frames is vmbr686, the bridge serving the storage VLAN but the underlying bond0 interface and it's slaves eno1 and eno2 have the default MTU of 1500 and Jumbo frames get truncated on the way. Why did this configuration work on 5.4 but not on 6.1? In the interface configuration, all MTU values are explicitly configured. Why are some of them ignored in 6.1? What am I missing here? How can I make this configuration work in PVE 6.1 (and later)? Any suggestions welcome. Thank you Stefan /etc/network/interfaces: auto lo iface lo inet loopback iface eno1 inet manual mtu 9000 #Gb1 - Trunk - Jumbo Frames iface eno2 inet manual mtu 9000 #Gb2 - Trunk - Jumbo Frames auto eno3 iface eno3 inet static address 192.168.84.4 netmask 255.255.255.0 mtu 1500 #Gb3 - COROSYNC1 - VLAN684 auto eno4 iface eno4 inet static address 192.168.85.4 netmask 255.255.255.0 mtu 1500 #Gb4 - COROSYNC2 - VLAN685 auto bond0 iface bond0 inet manual slaves eno1 eno2 bond_miimon 100 bond_mode active-backup mtu 9000 #HA Bundle Gb1/Gb2 - Trunk - Jumbo Frames auto vmbr0 iface vmbr0 inet static address 172.21.54.204 netmask 255.255.255.0 gateway 172.21.54.254 bridge_ports bond0 bridge_stp off bridge_fd 0 mtu 1500 #PRIVATE - VLAN 682 - Native auto vmbr533 iface vmbr533 inet manual bridge_ports bond0.533 bridge_stp off bridge_fd 0 mtu 1500 #PUBLIC - VLAN 533 auto vmbr683 iface vmbr683 inet manual bridge_ports bond0.683 bridge_stp off bridge_fd 0 mtu 1500 #VPN - VLAN 683 auto vmbr686 iface vmbr686 inet static address 192.168.86.204 netmask 255.255.255.0 bridge_ports bond0.686 bridge_stp off bridge_fd 0 mtu 9000 #STORAGE - VLAN 686 - Jumbo Frames CONFIDENTIALITY NOTICE: This communication may contain privileged and confidential information, or may otherwise be protected from disclosure, and is intended solely for use of the intended recipient(s). If you are not the intended recipient of this communication, please notify the sender that you have received this communication in error and delete and destroy all copies in your possession. 
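Until the root cause is clear, one possible workaround (a sketch only, not a confirmed fix) is to re-apply the MTU after the interfaces come up, since "ip link set ... mtu" works regardless of what ifupdown did at boot time. For the configuration above that could look like:

auto bond0
iface bond0 inet manual
        slaves eno1 eno2
        bond_miimon 100
        bond_mode active-backup
        mtu 9000
        post-up ip link set eno1 mtu 9000
        post-up ip link set eno2 mtu 9000
        post-up ip link set bond0 mtu 9000

A quick way to verify jumbo frames end to end on the storage VLAN is a non-fragmenting ping with a full 9000-byte frame (8972 bytes of ICMP payload plus 28 bytes of IP/ICMP headers):

        ping -M do -s 8972 192.168.86.203

where 192.168.86.203 stands for any other node's address on the 192.168.86.0/24 storage VLAN.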
From ronny+pve-user at aasen.cx Tue Jan 21 09:17:45 2020 From: ronny+pve-user at aasen.cx (Ronny Aasen) Date: Tue, 21 Jan 2020 09:17:45 +0100 Subject: [PVE-User] PVE 6.1 incorrect MTU In-Reply-To: References: Message-ID: <51a40f45-52ad-68e8-6e02-d3cf84ac778f@aasen.cx> On 21.01.2020 00:44, Stefan M. Radman via pve-user wrote: > > PVE 6.1 incorrect MTU.eml > > Subject: > PVE 6.1 incorrect MTU > From: > "Stefan M. Radman" > Date: > 21.01.2020, 00:44 > > To: > "pve-user at pve.proxmox.com" > > > Recently I upgraded the first node of a 5.4 cluster to 6.1. > Everything went smooth (thanks for pve5to6!) until I restarted the first node after the upgrade and started to get weird problems with the LVM/multipath/iSCSI based storage (hung PVE and LVM processes etc). > > After a while of digging I found that the issues were due to an incorrect MTU on the storage interface (Jumbo frames got truncated). > > The network configuration (see /etc/network/interfaces further below) worked very well with 5.4 but it does not work with 6.1. > > After booting with PVE 6.1, some of the MTUs are not as configured. > Here is what "ip link" shows after boot. > > root at seisram04:~# ip link | fgrep mtu > 1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 > 2: eno1: mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000 > 3: eno2: mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000 > 4: eno3: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 > 5: eno4: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 > 6: bond0: mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000 > 7: vmbr0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 > 8: vmbr533: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 > 9: bond0.533 at bond0: mtu 1500 qdisc noqueue master vmbr533 state UP mode DEFAULT group default qlen 1000 > 10: vmbr683: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 > 11: bond0.683 at bond0: mtu 1500 qdisc noqueue master vmbr683 state UP mode DEFAULT group default qlen 1000 > 12: vmbr686: mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000 > 13: bond0.686 at bond0: mtu 1500 qdisc noqueue master vmbr686 state UP mode DEFAULT group default qlen 1000 > > The only interface that is correctly configured for Jumbo frames is vmbr686, the bridge serving the storage VLAN but the underlying bond0 interface and it's slaves eno1 and eno2 have the default MTU of 1500 and Jumbo frames get truncated on the way. > > Why did this configuration work on 5.4 but not on 6.1? > In the interface configuration, all MTU values are explicitly configured. > > Why are some of them ignored in 6.1? What am I missing here? > > How can I make this configuration work in PVE 6.1 (and later)? > > Any suggestions welcome. 
> > Thank you > > Stefan > > /etc/network/interfaces: > > auto lo > iface lo inet loopback > > iface eno1 inet manual > mtu 9000 > #Gb1 - Trunk - Jumbo Frames > > iface eno2 inet manual > mtu 9000 > #Gb2 - Trunk - Jumbo Frames > > auto eno3 > iface eno3 inet static > address 192.168.84.4 > netmask 255.255.255.0 > mtu 1500 > #Gb3 - COROSYNC1 - VLAN684 > > auto eno4 > iface eno4 inet static > address 192.168.85.4 > netmask 255.255.255.0 > mtu 1500 > #Gb4 - COROSYNC2 - VLAN685 > > auto bond0 > iface bond0 inet manual > slaves eno1 eno2 > bond_miimon 100 > bond_mode active-backup > mtu 9000 > #HA Bundle Gb1/Gb2 - Trunk - Jumbo Frames > > auto vmbr0 > iface vmbr0 inet static > address 172.21.54.204 > netmask 255.255.255.0 > gateway 172.21.54.254 > bridge_ports bond0 > bridge_stp off > bridge_fd 0 > mtu 1500 > #PRIVATE - VLAN 682 - Native > > auto vmbr533 > iface vmbr533 inet manual > bridge_ports bond0.533 > bridge_stp off > bridge_fd 0 > mtu 1500 > #PUBLIC - VLAN 533 > > auto vmbr683 > iface vmbr683 inet manual > bridge_ports bond0.683 > bridge_stp off > bridge_fd 0 > mtu 1500 > #VPN - VLAN 683 > > auto vmbr686 > iface vmbr686 inet static > address 192.168.86.204 > netmask 255.255.255.0 > bridge_ports bond0.686 > bridge_stp off > bridge_fd 0 > mtu 9000 > #STORAGE - VLAN 686 - Jumbo Frames I do not know... But i am interested in this case since i have a very similar config on some of my own clusters, and am planning upgrades. Hoppe you can post your solution when you find it. I am guesstimating that setting mtu 1500 on vmbr0 may propagate to member interfaces, in more recent kernels. I belive member ports need to have the same mtu as the bridge, but probably not activly enforced in previous kernels. Personaly i never use native unless it is a phone or accesspoint on the end of a cable. So in my config vmbr0 is a vlan that is vlan-aware without any ip addresses on. with mtu 9000 and bond0 as member. and my vlans are vmbr0.686 instead of bond0.686 only defined for vlans where proxmox need an ip mtu 1500 and 9000 depending on needs. my vm's are attached to vmbr0 with a tag in the config. This way I do not have to edit proxmox config to take a new vlan in use. and all mtu 1500 stancas go on vlan interfaces, and not on bridges, and that probably do not propagate the same way. In your shoes i would try to TAG the native, and move vmbr0 to vmbr682 on bond0.682. This mean you need to change the vm's using vmbr0 tho. So if you have lots of vm's attached to vmbr0 perhaps just TAG and make the member port bond0.682 to avoid changing vm configs. this should make a bond0.682 vlan as the member port and hopefully allow the mtu. disclaimer: just wild guesses, i have not tested this on 6.1. good luck Ronny Aasen From florian.koenig at elster.de Tue Jan 21 12:39:02 2020 From: florian.koenig at elster.de (Florian =?ISO-8859-1?Q?K=F6nig?=) Date: Tue, 21 Jan 2020 12:39:02 +0100 Subject: [PVE-User] PVE 6.1 incorrect MTU In-Reply-To: <51a40f45-52ad-68e8-6e02-d3cf84ac778f@aasen.cx> References: <51a40f45-52ad-68e8-6e02-d3cf84ac778f@aasen.cx> Message-ID: <92b635ae2a6fab72a54b87e2567e707213525aef.camel@elster.de> Hi, we also have Jumbo Frames enables on our PVE6.1 Cluster. /etc/network/interfaces: ... iface ens7f0 inet manual mtu 9000 iface ens7f1 inet manual mtu 9000 iface ens1f0 inet manual mtu 9000 iface ens1f1 inet manual mtu 9000 auto bond1 iface bond1 inet ... bond-slaves ens1f0 ens7f1 bond-primary ens1f0 bond-miimon 100 bond-mode active-backup post-up ip link set bond1 mtu 9000 auto bond2 iface bond2 inet6 ... 
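As an illustration of the layout Ronny describes (untested, as per his disclaimer, reusing the addresses and VLAN IDs from Stefan's original configuration, and assuming VLAN 682 is now tagged on the switch instead of native):

auto bond0
iface bond0 inet manual
        slaves eno1 eno2
        bond_miimon 100
        bond_mode active-backup
        mtu 9000

auto vmbr0
iface vmbr0 inet manual
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        mtu 9000

auto vmbr0.682
iface vmbr0.682 inet static
        address 172.21.54.204
        netmask 255.255.255.0
        gateway 172.21.54.254
        mtu 1500

auto vmbr0.686
iface vmbr0.686 inet static
        address 192.168.86.204
        netmask 255.255.255.0
        mtu 9000

Guests would then be attached to vmbr0 with the VLAN tag set on the VM's network device, instead of to one bridge per VLAN.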
bond-slaves ens7f0 ens1f1 bond-primary ens7f0 bond-miimon 100 bond-mode active-backup post-up ip link set bond2 mtu 9000 ... As you can see, the MTU is set both at interface and at bond level. I don't know if this is a hard requirement but at least it's working. Flo Am Dienstag, den 21.01.2020, 09:17 +0100 schrieb Ronny Aasen: > On 21.01.2020 00:44, Stefan M. Radman via pve-user wrote: > > PVE 6.1 incorrect MTU.eml > > > > Subject: > > PVE 6.1 incorrect MTU > > From: > > "Stefan M. Radman" > > Date: > > 21.01.2020, 00:44 > > > > To: > > "pve-user at pve.proxmox.com" > > > > > > Recently I upgraded the first node of a 5.4 cluster to 6.1. > > Everything went smooth (thanks for pve5to6!) until I restarted the > > first node after the upgrade and started to get weird problems with > > the LVM/multipath/iSCSI based storage (hung PVE and LVM processes > > etc). > > > > After a while of digging I found that the issues were due to an > > incorrect MTU on the storage interface (Jumbo frames got > > truncated). > > > > The network configuration (see /etc/network/interfaces further > > below) worked very well with 5.4 but it does not work with 6.1. > > > > After booting with PVE 6.1, some of the MTUs are not as configured. > > Here is what "ip link" shows after boot. > > > > root at seisram04:~# ip link | fgrep mtu > > 1: lo: mtu 65536 qdisc noqueue state UNKNOWN > > mode DEFAULT group default qlen 1000 > > 2: eno1: mtu 1500 qdisc mq > > master bond0 state UP mode DEFAULT group default qlen 1000 > > 3: eno2: mtu 1500 qdisc mq > > master bond0 state UP mode DEFAULT group default qlen 1000 > > 4: eno3: mtu 1500 qdisc mq state > > UP mode DEFAULT group default qlen 1000 > > 5: eno4: mtu 1500 qdisc mq state > > UP mode DEFAULT group default qlen 1000 > > 6: bond0: mtu 1500 qdisc > > noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000 > > 7: vmbr0: mtu 1500 qdisc noqueue > > state UP mode DEFAULT group default qlen 1000 > > 8: vmbr533: mtu 1500 qdisc > > noqueue state UP mode DEFAULT group default qlen 1000 > > 9: bond0.533 at bond0: mtu 1500 > > qdisc noqueue master vmbr533 state UP mode DEFAULT group default > > qlen 1000 > > 10: vmbr683: mtu 1500 qdisc > > noqueue state UP mode DEFAULT group default qlen 1000 > > 11: bond0.683 at bond0: mtu 1500 > > qdisc noqueue master vmbr683 state UP mode DEFAULT group default > > qlen 1000 > > 12: vmbr686: mtu 9000 qdisc > > noqueue state UP mode DEFAULT group default qlen 1000 > > 13: bond0.686 at bond0: mtu 1500 > > qdisc noqueue master vmbr686 state UP mode DEFAULT group default > > qlen 1000 > > > > The only interface that is correctly configured for Jumbo frames > > is vmbr686, the bridge serving the storage VLAN but the underlying > > bond0 interface and it's slaves eno1 and eno2 have the default MTU > > of 1500 and Jumbo frames get truncated on the way. > > > > Why did this configuration work on 5.4 but not on 6.1? > > In the interface configuration, all MTU values are explicitly > > configured. > > > > Why are some of them ignored in 6.1? What am I missing here? > > > > How can I make this configuration work in PVE 6.1 (and later)? > > > > Any suggestions welcome. 
> > > > Thank you > > > > Stefan > > > > /etc/network/interfaces: > > > > auto lo > > iface lo inet loopback > > > > iface eno1 inet manual > > mtu 9000 > > #Gb1 - Trunk - Jumbo Frames > > > > iface eno2 inet manual > > mtu 9000 > > #Gb2 - Trunk - Jumbo Frames > > > > auto eno3 > > iface eno3 inet static > > address 192.168.84.4 > > netmask 255.255.255.0 > > mtu 1500 > > #Gb3 - COROSYNC1 - VLAN684 > > > > auto eno4 > > iface eno4 inet static > > address 192.168.85.4 > > netmask 255.255.255.0 > > mtu 1500 > > #Gb4 - COROSYNC2 - VLAN685 > > > > auto bond0 > > iface bond0 inet manual > > slaves eno1 eno2 > > bond_miimon 100 > > bond_mode active-backup > > mtu 9000 > > #HA Bundle Gb1/Gb2 - Trunk - Jumbo Frames > > > > auto vmbr0 > > iface vmbr0 inet static > > address 172.21.54.204 > > netmask 255.255.255.0 > > gateway 172.21.54.254 > > bridge_ports bond0 > > bridge_stp off > > bridge_fd 0 > > mtu 1500 > > #PRIVATE - VLAN 682 - Native > > > > auto vmbr533 > > iface vmbr533 inet manual > > bridge_ports bond0.533 > > bridge_stp off > > bridge_fd 0 > > mtu 1500 > > #PUBLIC - VLAN 533 > > > > auto vmbr683 > > iface vmbr683 inet manual > > bridge_ports bond0.683 > > bridge_stp off > > bridge_fd 0 > > mtu 1500 > > #VPN - VLAN 683 > > > > auto vmbr686 > > iface vmbr686 inet static > > address 192.168.86.204 > > netmask 255.255.255.0 > > bridge_ports bond0.686 > > bridge_stp off > > bridge_fd 0 > > mtu 9000 > > #STORAGE - VLAN 686 - Jumbo Frames > > I do not know... But i am interested in this case since i have a > very > similar config on some of my own clusters, and am planning upgrades. > Hoppe you can post your solution when you find it. > > I am guesstimating that setting mtu 1500 on vmbr0 may propagate to > member interfaces, in more recent kernels. I belive member ports need > to > have the same mtu as the bridge, but probably not activly enforced > in > previous kernels. > > Personaly i never use native unless it is a phone or accesspoint on > the > end of a cable. > So in my config vmbr0 is a vlan that is vlan-aware without any ip > addresses on. with mtu 9000 and bond0 as member. > > and my vlans are vmbr0.686 instead of bond0.686 only defined for > vlans > where proxmox need an ip mtu 1500 and 9000 depending on needs. > > my vm's are attached to vmbr0 with a tag in the config. This way I > do > not have to edit proxmox config to take a new vlan in use. > and all mtu 1500 stancas go on vlan interfaces, and not on bridges, > and > that probably do not propagate the same way. > > > > In your shoes i would try to TAG the native, and move vmbr0 to > vmbr682 > on bond0.682. This mean you need to change the vm's using vmbr0 tho. > So if you have lots of vm's attached to vmbr0 perhaps just TAG and > make > the member port bond0.682 to avoid changing vm configs. this should > make > a bond0.682 vlan as the member port and hopefully allow the mtu. > > disclaimer: just wild guesses, i have not tested this on 6.1. > > > > good luck > Ronny Aasen > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user -- Florian K?nig Bayerisches Landesamt f?r Steuern IuK 16 Katharina-von-Bora-Str. 
6 80333 M?nchen Telefon: 089 9991-3630 E-Mail: Florian.Koenig at elster.de Internet: http://www.lfst.bayern.de From uwe.sauter.de at gmail.com Tue Jan 21 13:50:29 2020 From: uwe.sauter.de at gmail.com (Uwe Sauter) Date: Tue, 21 Jan 2020 13:50:29 +0100 Subject: [PVE-User] =?utf-8?q?Proxmox_firewall_=E2=80=93_Ceph_macro?= Message-ID: Hi, I suspect that the Ceph macro in the firewall settings on datacenter level does not contain the complete list of necessary ports, As soon as I enable the firewall on datacenter level I get slow ops reported from Ceph. The firewall configuration line is: enabled: true type: in action; ACCEPT macro: Ceph interface: source: ipset "+px_cluster" destination: ipset "+px_cluster" protocol: dest port: source port: log level: nolog IPset "+px_cluster" contains all IP addresses from the clunster interface the cluster members. The IP addresses of the management interfaces are not in that set. Can anybody confirm that this is indeed an incomplete macro or is something wrong with my configuration? Regards, Uwe Sauter From gilberto.nunes32 at gmail.com Tue Jan 21 16:00:13 2020 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Tue, 21 Jan 2020 12:00:13 -0300 Subject: [PVE-User] qm import stuck Message-ID: Hi there I am using pve 6 all update (pve-no-sub)... I have a Dell Storage MD3200i and have set up multipath... But when try to import a disk into lvm created over multipath, I stuck: qm importdisk 108 vm-107-disk-0.raw VMDATA importing disk 'vm-107-disk-0.raw' to VM 108 ... trying to acquire cfs lock 'storage-VMDATA' ... trying to acquire cfs lock 'storage-VMDATA' ... Logical volume "vm-108-disk-0" created. transferred: 0 bytes remaining: 107374182400 bytes total: 107374182400 bytes progression: 0.00 % I get some messages in dmesg: 74394] sd 15:0:0:0: rdac: array Storage_Abatex, ctlr 0, queueing MODE_SELECT command [63033.074899] sd 15:0:0:0: rdac: array Storage_Abatex, ctlr 0, MODE_SELECT returned with sense 06/94/01 [63033.074902] sd 15:0:0:0: rdac: array Storage_Abatex, ctlr 0, retrying MODE_SELECT command [63033.519329] sd 16:0:0:0: rdac: array Storage_Abatex, ctlr 0, queueing MODE_SELECT command [63033.519835] sd 16:0:0:0: rdac: array Storage_Abatex, ctlr 0, MODE_SELECT returned with sense 06/94/01 [63033.519836] sd 16:0:0:0: rdac: array Storage_Abatex, ctlr 0, retrying MODE_SELECT command This is the multpath file: cat /etc/multipath.conf defaults { polling_interval 3 path_selector "round-robin 0" max_fds "max" path_grouping_policy multibus uid_attribute "ID_SERIAL" rr_min_io 100 failback immediate no_path_retry queue } blacklist { wwid .* devnode "^sda" device { vendor "DELL" product "Universal Xport" } } devices { device { vendor "DELL" product "MD32xxi" path_grouping_policy group_by_prio prio rdac #polling_interval 5 path_checker rdac path_selector "round-robin 0" hardware_handler "1 rdac" failback immediate features "2 pg_init_retries 50" no_path_retry 30 rr_min_io 100 #prio_callout "/sbin/mpath_prio_rdac /dev/%n" } } blacklist_exceptions { wwid "361418770003f0d2800000a6c5df0a959" } multipaths { multipath { wwid "361418770003f0d2800000a6c5df0a959" alias mpath0 } } The information regard Dell MD32xxi in the file above I add with the server online... Do I need to reboot the servers or just systemctl restart multipath-whatever??? 
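To answer the last question directly: a reboot should not be needed just to pick up multipath.conf changes. A typical sequence (a sketch; Chris gives the same advice further down the thread) would be:

        systemctl restart multipathd
        multipath -r      # rebuild the multipath maps with the new settings
        multipath -ll     # confirm the MD32xxi device section and the mpath0 alias are in effect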
Thanks for any help --- Gilberto Nunes Ferreira (47) 3025-5907 (47) 99676-7530 - Whatsapp / Telegram Skype: gilberto.nunes36 From gilberto.nunes32 at gmail.com Tue Jan 21 19:38:50 2020 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Tue, 21 Jan 2020 15:38:50 -0300 Subject: [PVE-User] Problem w Storage iSCSI Message-ID: Hi there I get this messagem in dmesg rdac retrying mode select command And the server is slow... --- Gilberto Nunes Ferreira (47) 3025-5907 (47) 99676-7530 - Whatsapp / Telegram Skype: gilberto.nunes36 From gilberto.nunes32 at gmail.com Tue Jan 21 21:24:54 2020 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Tue, 21 Jan 2020 17:24:54 -0300 Subject: [PVE-User] HP RX Backup freezing the server Message-ID: Hi there I have PVE 6 all update and working, however when I plug this device in the server, it's just freeze and I need to reboot the entire server... Is there any incompatibility??? https://cc.cnetcontent.com/vcs/hp-ent/inline-content/7A/9/A/9A1E8A6A3F2FCF7EF7DA5EC974A81ED71CC62190_feature.jpg --- Gilberto Nunes Ferreira (47) 3025-5907 (47) 99676-7530 - Whatsapp / Telegram Skype: gilberto.nunes36 From smr at kmi.com Wed Jan 22 01:12:05 2020 From: smr at kmi.com (Stefan M. Radman) Date: Wed, 22 Jan 2020 00:12:05 +0000 Subject: [PVE-User] PVE 6.1 incorrect MTU In-Reply-To: <51a40f45-52ad-68e8-6e02-d3cf84ac778f@aasen.cx> References: <51a40f45-52ad-68e8-6e02-d3cf84ac778f@aasen.cx> Message-ID: <1F78EE50-FE84-4A13-9E26-A422A163324D@kmi.com> Hi Ronny Thanks for the input. setting mtu 1500 on vmbr0 may propagate to member interfaces, in more recent kernels. I belive member ports need to have the same mtu as the bridge Hmm. That might be the point with the native bond0 interface. Can you refer me to the place where this is documented or at least discussed? Maybe I can find a configuration item to switch this enforcement off (and configure the MTU manually). It seems that I'll have to do some more testing to find out at which point the MTUs change (or don't). Thanks Stefan On Jan 21, 2020, at 09:17, Ronny Aasen > wrote: I do not know... But i am interested in this case since i have a very similar config on some of my own clusters, and am planning upgrades. Hoppe you can post your solution when you find it. I am guesstimating that setting mtu 1500 on vmbr0 may propagate to member interfaces, in more recent kernels. I belive member ports need to have the same mtu as the bridge, but probably not activly enforced in previous kernels. Personaly i never use native unless it is a phone or accesspoint on the end of a cable. So in my config vmbr0 is a vlan that is vlan-aware without any ip addresses on. with mtu 9000 and bond0 as member. and my vlans are vmbr0.686 instead of bond0.686 only defined for vlans where proxmox need an ip mtu 1500 and 9000 depending on needs. my vm's are attached to vmbr0 with a tag in the config. This way I do not have to edit proxmox config to take a new vlan in use. and all mtu 1500 stancas go on vlan interfaces, and not on bridges, and that probably do not propagate the same way. In your shoes i would try to TAG the native, and move vmbr0 to vmbr682 on bond0.682. This mean you need to change the vm's using vmbr0 tho. So if you have lots of vm's attached to vmbr0 perhaps just TAG and make the member port bond0.682 to avoid changing vm configs. this should make a bond0.682 vlan as the member port and hopefully allow the mtu. disclaimer: just wild guesses, i have not tested this on 6.1. 
good luck Ronny Aasen CONFIDENTIALITY NOTICE: This communication may contain privileged and confidential information, or may otherwise be protected from disclosure, and is intended solely for use of the intended recipient(s). If you are not the intended recipient of this communication, please notify the sender that you have received this communication in error and delete and destroy all copies in your possession. From smr at kmi.com Wed Jan 22 01:12:51 2020 From: smr at kmi.com (Stefan M. Radman) Date: Wed, 22 Jan 2020 00:12:51 +0000 Subject: [PVE-User] PVE 6.1 incorrect MTU In-Reply-To: <92b635ae2a6fab72a54b87e2567e707213525aef.camel@elster.de> References: <51a40f45-52ad-68e8-6e02-d3cf84ac778f@aasen.cx> <92b635ae2a6fab72a54b87e2567e707213525aef.camel@elster.de> Message-ID: <8994BBC0-8831-4701-B0FA-297922F8C163@kmi.com> Hi Flo I am setting the MTU on all interface levels (nic,bond,vlan,bridge) but not all of them get set this way. It used to work until 5.4 but no longer under 6.1 Stefan > On Jan 21, 2020, at 12:39, Florian K?nig wrote: > > Hi, > > we also have Jumbo Frames enables on our PVE6.1 Cluster. > /etc/network/interfaces: > ... > iface ens7f0 inet manual > mtu 9000 > iface ens7f1 inet manual > mtu 9000 > iface ens1f0 inet manual > mtu 9000 > iface ens1f1 inet manual > mtu 9000 > auto bond1 > iface bond1 inet ... > bond-slaves ens1f0 ens7f1 > bond-primary ens1f0 > bond-miimon 100 > bond-mode active-backup > post-up ip link set bond1 mtu 9000 > auto bond2 > iface bond2 inet6 ... > bond-slaves ens7f0 ens1f1 > bond-primary ens7f0 > bond-miimon 100 > bond-mode active-backup > post-up ip link set bond2 mtu 9000 > ... > > As you can see, the MTU is set both at interface and at bond level. > I don't know if this is a hard requirement but at least it's working. > > Flo > > Am Dienstag, den 21.01.2020, 09:17 +0100 schrieb Ronny Aasen: >> On 21.01.2020 00:44, Stefan M. Radman via pve-user wrote: >>> PVE 6.1 incorrect MTU.eml >>> >>> Subject: >>> PVE 6.1 incorrect MTU >>> From: >>> "Stefan M. Radman" >>> Date: >>> 21.01.2020, 00:44 >>> >>> To: >>> "pve-user at pve.proxmox.com" >>> >>> >>> Recently I upgraded the first node of a 5.4 cluster to 6.1. >>> Everything went smooth (thanks for pve5to6!) until I restarted the >>> first node after the upgrade and started to get weird problems with >>> the LVM/multipath/iSCSI based storage (hung PVE and LVM processes >>> etc). >>> >>> After a while of digging I found that the issues were due to an >>> incorrect MTU on the storage interface (Jumbo frames got >>> truncated). >>> >>> The network configuration (see /etc/network/interfaces further >>> below) worked very well with 5.4 but it does not work with 6.1. >>> >>> After booting with PVE 6.1, some of the MTUs are not as configured. >>> Here is what "ip link" shows after boot. 
>>> >>> root at seisram04:~# ip link | fgrep mtu >>> 1: lo: mtu 65536 qdisc noqueue state UNKNOWN >>> mode DEFAULT group default qlen 1000 >>> 2: eno1: mtu 1500 qdisc mq >>> master bond0 state UP mode DEFAULT group default qlen 1000 >>> 3: eno2: mtu 1500 qdisc mq >>> master bond0 state UP mode DEFAULT group default qlen 1000 >>> 4: eno3: mtu 1500 qdisc mq state >>> UP mode DEFAULT group default qlen 1000 >>> 5: eno4: mtu 1500 qdisc mq state >>> UP mode DEFAULT group default qlen 1000 >>> 6: bond0: mtu 1500 qdisc >>> noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000 >>> 7: vmbr0: mtu 1500 qdisc noqueue >>> state UP mode DEFAULT group default qlen 1000 >>> 8: vmbr533: mtu 1500 qdisc >>> noqueue state UP mode DEFAULT group default qlen 1000 >>> 9: bond0.533 at bond0: mtu 1500 >>> qdisc noqueue master vmbr533 state UP mode DEFAULT group default >>> qlen 1000 >>> 10: vmbr683: mtu 1500 qdisc >>> noqueue state UP mode DEFAULT group default qlen 1000 >>> 11: bond0.683 at bond0: mtu 1500 >>> qdisc noqueue master vmbr683 state UP mode DEFAULT group default >>> qlen 1000 >>> 12: vmbr686: mtu 9000 qdisc >>> noqueue state UP mode DEFAULT group default qlen 1000 >>> 13: bond0.686 at bond0: mtu 1500 >>> qdisc noqueue master vmbr686 state UP mode DEFAULT group default >>> qlen 1000 >>> >>> The only interface that is correctly configured for Jumbo frames >>> is vmbr686, the bridge serving the storage VLAN but the underlying >>> bond0 interface and it's slaves eno1 and eno2 have the default MTU >>> of 1500 and Jumbo frames get truncated on the way. >>> >>> Why did this configuration work on 5.4 but not on 6.1? >>> In the interface configuration, all MTU values are explicitly >>> configured. >>> >>> Why are some of them ignored in 6.1? What am I missing here? >>> >>> How can I make this configuration work in PVE 6.1 (and later)? >>> >>> Any suggestions welcome. >>> >>> Thank you >>> >>> Stefan >>> >>> /etc/network/interfaces: >>> >>> auto lo >>> iface lo inet loopback >>> >>> iface eno1 inet manual >>> mtu 9000 >>> #Gb1 - Trunk - Jumbo Frames >>> >>> iface eno2 inet manual >>> mtu 9000 >>> #Gb2 - Trunk - Jumbo Frames >>> >>> auto eno3 >>> iface eno3 inet static >>> address 192.168.84.4 >>> netmask 255.255.255.0 >>> mtu 1500 >>> #Gb3 - COROSYNC1 - VLAN684 >>> >>> auto eno4 >>> iface eno4 inet static >>> address 192.168.85.4 >>> netmask 255.255.255.0 >>> mtu 1500 >>> #Gb4 - COROSYNC2 - VLAN685 >>> >>> auto bond0 >>> iface bond0 inet manual >>> slaves eno1 eno2 >>> bond_miimon 100 >>> bond_mode active-backup >>> mtu 9000 >>> #HA Bundle Gb1/Gb2 - Trunk - Jumbo Frames >>> >>> auto vmbr0 >>> iface vmbr0 inet static >>> address 172.21.54.204 >>> netmask 255.255.255.0 >>> gateway 172.21.54.254 >>> bridge_ports bond0 >>> bridge_stp off >>> bridge_fd 0 >>> mtu 1500 >>> #PRIVATE - VLAN 682 - Native >>> >>> auto vmbr533 >>> iface vmbr533 inet manual >>> bridge_ports bond0.533 >>> bridge_stp off >>> bridge_fd 0 >>> mtu 1500 >>> #PUBLIC - VLAN 533 >>> >>> auto vmbr683 >>> iface vmbr683 inet manual >>> bridge_ports bond0.683 >>> bridge_stp off >>> bridge_fd 0 >>> mtu 1500 >>> #VPN - VLAN 683 >>> >>> auto vmbr686 >>> iface vmbr686 inet static >>> address 192.168.86.204 >>> netmask 255.255.255.0 >>> bridge_ports bond0.686 >>> bridge_stp off >>> bridge_fd 0 >>> mtu 9000 >>> #STORAGE - VLAN 686 - Jumbo Frames >> >> I do not know... But i am interested in this case since i have a >> very >> similar config on some of my own clusters, and am planning upgrades. >> Hoppe you can post your solution when you find it. 
>> >> I am guesstimating that setting mtu 1500 on vmbr0 may propagate to >> member interfaces, in more recent kernels. I belive member ports need >> to >> have the same mtu as the bridge, but probably not activly enforced >> in >> previous kernels. >> >> Personaly i never use native unless it is a phone or accesspoint on >> the >> end of a cable. >> So in my config vmbr0 is a vlan that is vlan-aware without any ip >> addresses on. with mtu 9000 and bond0 as member. >> >> and my vlans are vmbr0.686 instead of bond0.686 only defined for >> vlans >> where proxmox need an ip mtu 1500 and 9000 depending on needs. >> >> my vm's are attached to vmbr0 with a tag in the config. This way I >> do >> not have to edit proxmox config to take a new vlan in use. >> and all mtu 1500 stancas go on vlan interfaces, and not on bridges, >> and >> that probably do not propagate the same way. >> >> >> >> In your shoes i would try to TAG the native, and move vmbr0 to >> vmbr682 >> on bond0.682. This mean you need to change the vm's using vmbr0 tho. >> So if you have lots of vm's attached to vmbr0 perhaps just TAG and >> make >> the member port bond0.682 to avoid changing vm configs. this should >> make >> a bond0.682 vlan as the member port and hopefully allow the mtu. >> >> disclaimer: just wild guesses, i have not tested this on 6.1. >> >> >> >> good luck >> Ronny Aasen >> >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > -- > Florian K?nig > > Bayerisches Landesamt f?r Steuern > IuK 16 > Katharina-von-Bora-Str. 6 > 80333 M?nchen > > Telefon: 089 9991-3630 > E-Mail: Florian.Koenig at elster.de > Internet: http://www.lfst.bayern.de CONFIDENTIALITY NOTICE: This communication may contain privileged and confidential information, or may otherwise be protected from disclosure, and is intended solely for use of the intended recipient(s). If you are not the intended recipient of this communication, please notify the sender that you have received this communication in error and delete and destroy all copies in your possession. From daniel at firewall-services.com Wed Jan 22 08:33:33 2020 From: daniel at firewall-services.com (Daniel Berteaud) Date: Wed, 22 Jan 2020 08:33:33 +0100 (CET) Subject: [PVE-User] VxLAN and tagged frames Message-ID: <376592375.41818.1579678413069.JavaMail.zimbra@fws.fr> Hi there At a french hoster (Online.net), we have a private network available on dedicated server, but without QinQ support. So, we can't rely on native VLAN between nodes. Up to now, I created a single OVS bridge on every node, with GRE tunnels with each other. The GRE tunnel transport tagged frames and everything is working. But I see there are some work on SDN plugins, and VxLAN support. I red [ https://git.proxmox.com/?p=pve-docs.git;a=blob_plain;f=vxlan-and-evpn.adoc;hb=HEAD | https://git.proxmox.com/?p=pve-docs.git;a=blob_plain;f=vxlan-and-evpn.adoc;hb=HEAD ] but there are some stuff I'm not sure I understand. Especially with vlan aware bridges. I like to rely on VLAN aware bridges so I don't have to touch network conf of the hypervisors to create a new network zone. I just use a new, unused VLAN ID. 
But the doc about VxLAN support on vlan aware bridges has been removed (see [ https://git.proxmox.com/?p=pve-docs.git;a=commitdiff;h=5dde3d645834b204257e8d5b3ce8b65e6842abe8;hp=d4a9910fec45b1153b1cd954a006d267d42c707a | https://git.proxmox.com/?p=pve-docs.git;a=commitdiff;h=5dde3d645834b204257e8d5b3ce8b65e6842abe8;hp=d4a9910fec45b1153b1cd954a006d267d42c707a ] ) So, what's the recommended setup for this ? Create one (non vlan aware) bridge for each network zone, with 1 VxLAN tunnel per bridge between nodes ? This doesn't look very scalable compared with vlan aware bridges (or OVS bridges) with GRE tunnels, does it ? Are the expirimental SDN plugins available somewhere as deb so I can play a bit with it ? (couldn't find it in pve-test or no-subscription) Cheers, Daniel -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From daniel at firewall-services.com Wed Jan 22 09:50:23 2020 From: daniel at firewall-services.com (Daniel Berteaud) Date: Wed, 22 Jan 2020 09:50:23 +0100 (CET) Subject: [PVE-User] nf_conntrack_proto_gre missing in kernel 5.3 ? Message-ID: <398737229.42519.1579683023160.JavaMail.zimbra@fws.fr> Hi. I used to rely on being able to load nf_conntrack_proto_gre in PVE 5 days. It's still present in kernel 5.0 for PVE 6, but missing in kernel 5.3. Is that expected ? Cheers, Daniel -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From devzero at web.de Wed Jan 22 11:20:52 2020 From: devzero at web.de (Roland @web.de) Date: Wed, 22 Jan 2020 11:20:52 +0100 Subject: [PVE-User] add more than 13 disks to vm ? Message-ID: <2b9b296c-8f42-2e6c-7ae4-cbfb96eda906@web.de> Hi, how do i add more than 13(15) scsi (virtio) virtual disks to a VM in pve webgui ? thanks roland From chris.hofstaedtler at deduktiva.com Wed Jan 22 16:05:56 2020 From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva) Date: Wed, 22 Jan 2020 16:05:56 +0100 Subject: [PVE-User] qm import stuck In-Reply-To: References: Message-ID: <20200122150556.kmrji5l5deg35awo@zeha.at> Hi, * Gilberto Nunes [200121 16:02]: > bytes progression: 0.00 % Can't comment on qm import... > > I get some messages in dmesg: > > 74394] sd 15:0:0:0: rdac: array Storage_Abatex, ctlr 0, queueing > MODE_SELECT command > [63033.074899] sd 15:0:0:0: rdac: array Storage_Abatex, ctlr 0, MODE_SELECT > returned with sense 06/94/01 > [63033.074902] sd 15:0:0:0: rdac: array Storage_Abatex, ctlr 0, retrying > MODE_SELECT command > [63033.519329] sd 16:0:0:0: rdac: array Storage_Abatex, ctlr 0, queueing > MODE_SELECT command > [63033.519835] sd 16:0:0:0: rdac: array Storage_Abatex, ctlr 0, MODE_SELECT > returned with sense 06/94/01 > [63033.519836] sd 16:0:0:0: rdac: array Storage_Abatex, ctlr 0, retrying > MODE_SELECT command Do you see other errors too? How have you configured the storage on the PVE side? > The information regard Dell MD32xxi in the file above I add with the server > online... Do I need to reboot the servers or just systemctl restart > multipath-whatever??? Changes in multipath.conf get picked up once you restart multipathd; some of them also get picked up by just invoking multipath. 
You can check with `multipath -ll` (and similar) which settings are active. Chris -- Chris Hofstaedtler / Deduktiva GmbH (FN 418592 b, HG Wien) www.deduktiva.com / +43 1 353 1707 From chris.hofstaedtler at deduktiva.com Wed Jan 22 16:27:53 2020 From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva) Date: Wed, 22 Jan 2020 16:27:53 +0100 Subject: [PVE-User] nf_conntrack_proto_gre missing in kernel 5.3 ? In-Reply-To: <398737229.42519.1579683023160.JavaMail.zimbra@fws.fr> References: <398737229.42519.1579683023160.JavaMail.zimbra@fws.fr> Message-ID: <20200122152753.avuvfg6zq4uk5o5p@zeha.at> * Daniel Berteaud [200122 09:50]: > I used to rely on being able to load nf_conntrack_proto_gre in PVE 5 days. It's still present in kernel 5.0 for PVE 6, but missing in kernel 5.3. Is that expected ? Appears to be built in now: config-5.0.21-2-pve:CONFIG_NF_CT_PROTO_GRE=m config-5.0.21-5-pve:CONFIG_NF_CT_PROTO_GRE=m config-5.3.10-1-pve:CONFIG_NF_CT_PROTO_GRE=y config-5.3.13-1-pve:CONFIG_NF_CT_PROTO_GRE=y Chris -- Chris Hofstaedtler / Deduktiva GmbH (FN 418592 b, HG Wien) www.deduktiva.com / +43 1 353 1707 From daniel at firewall-services.com Wed Jan 22 16:52:31 2020 From: daniel at firewall-services.com (Daniel Berteaud) Date: Wed, 22 Jan 2020 16:52:31 +0100 (CET) Subject: [PVE-User] nf_conntrack_proto_gre missing in kernel 5.3 ? In-Reply-To: <20200122152753.avuvfg6zq4uk5o5p@zeha.at> References: <398737229.42519.1579683023160.JavaMail.zimbra@fws.fr> <20200122152753.avuvfg6zq4uk5o5p@zeha.at> Message-ID: <566626182.46637.1579708351134.JavaMail.zimbra@fws.fr> ----- Le 22 Jan 20, ? 16:27, Chris Hofstaedtler | Deduktiva chris.hofstaedtler at deduktiva.com a ?crit : > * Daniel Berteaud [200122 09:50]: >> I used to rely on being able to load nf_conntrack_proto_gre in PVE 5 days. It's >> still present in kernel 5.0 for PVE 6, but missing in kernel 5.3. Is that >> expected ? > > Appears to be built in now: > > config-5.0.21-2-pve:CONFIG_NF_CT_PROTO_GRE=m > config-5.0.21-5-pve:CONFIG_NF_CT_PROTO_GRE=m > config-5.3.10-1-pve:CONFIG_NF_CT_PROTO_GRE=y > config-5.3.13-1-pve:CONFIG_NF_CT_PROTO_GRE=y > Thanks, not sure how I missed this ;-) Regards, Daniel -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From devzero at web.de Wed Jan 22 17:39:14 2020 From: devzero at web.de (Roland @web.de) Date: Wed, 22 Jan 2020 17:39:14 +0100 Subject: [PVE-User] add more than 13 disks to vm ? In-Reply-To: <2b9b296c-8f42-2e6c-7ae4-cbfb96eda906@web.de> References: <2b9b296c-8f42-2e6c-7ae4-cbfb96eda906@web.de> Message-ID: <97693b04-5747-1630-caff-f0ac0961858b@web.de> apparently it's not possible - i have created a bugticket/feature request https://bugzilla.proxmox.com/show_bug.cgi?id=2566 Am 22.01.20 um 11:20 schrieb Roland @web.de: > Hi, > > how do i add more than 13(15) scsi (virtio) virtual disks to a VM in pve > webgui ? 
> > thanks > roland > > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From jm at ginernet.com Thu Jan 23 15:34:47 2020 From: jm at ginernet.com (=?UTF-8?Q?Jos=c3=a9_Manuel_Giner?=) Date: Thu, 23 Jan 2020 15:34:47 +0100 Subject: [PVE-User] CPU freq scale in PVE 6.0 Message-ID: <993bdd37-0e80-4718-07d3-bc17d17c45d5@ginernet.com> Hello, Since PVE 6.0, we detect that the CPU frequency is dynamic even if the governor "performance" is selected. How can we set the maximum frequency in all the cores? Thank you. root at ns1031:~# cat /proc/cpuinfo | grep MHz cpu MHz : 1908.704 cpu MHz : 1430.150 cpu MHz : 1436.616 cpu MHz : 1433.659 cpu MHz : 1547.050 cpu MHz : 2299.611 cpu MHz : 2931.782 cpu MHz : 3321.946 cpu MHz : 3397.896 cpu MHz : 3401.725 cpu MHz : 3401.701 cpu MHz : 3170.021 cpu MHz : 2502.847 cpu MHz : 1534.596 cpu MHz : 2969.438 cpu MHz : 3435.646 cpu MHz : 1992.717 cpu MHz : 2922.292 cpu MHz : 2917.282 cpu MHz : 2921.654 cpu MHz : 2484.807 cpu MHz : 1866.995 cpu MHz : 1629.151 cpu MHz : 2841.298 cpu MHz : 3404.503 cpu MHz : 3401.732 cpu MHz : 3406.120 cpu MHz : 2189.093 cpu MHz : 1545.454 cpu MHz : 3348.573 cpu MHz : 1564.159 cpu MHz : 3438.787 root at ns1031:~# root at ns1031:~# root at ns1031:~# root at ns1031:~# cpupower frequency-info analyzing CPU 0: driver: intel_pstate CPUs which run at the same hardware frequency: 0 CPUs which need to have their frequency coordinated by software: 0 maximum transition latency: Cannot determine or is not supported. hardware limits: 1.20 GHz - 3.60 GHz available cpufreq governors: performance powersave current policy: frequency should be within 1.20 GHz and 3.60 GHz. The governor "performance" may decide which speed to use within this range. current CPU frequency: Unable to call hardware current CPU frequency: 1.90 GHz (asserted by call to kernel) boost state support: Supported: yes Active: yes root at ns1031:~# root at ns1031:~# -- Jos? Manuel Giner https://ginernet.com From aderumier at odiso.com Thu Jan 23 20:38:37 2020 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Thu, 23 Jan 2020 20:38:37 +0100 (CET) Subject: [PVE-User] CPU freq scale in PVE 6.0 In-Reply-To: <993bdd37-0e80-4718-07d3-bc17d17c45d5@ginernet.com> References: <993bdd37-0e80-4718-07d3-bc17d17c45d5@ginernet.com> Message-ID: <1992237777.2463296.1579808317031.JavaMail.zimbra@odiso.com> Hi, I'm setup this now for my new intel processors: /etc/default/grub GRUB_CMDLINE_LINUX="intel_idle.max_cstate=0 intel_pstate=disable processor.max_cstate=1" ----- Mail original ----- De: "Jos? Manuel Giner" ?: "proxmoxve" Envoy?: Jeudi 23 Janvier 2020 15:34:47 Objet: [PVE-User] CPU freq scale in PVE 6.0 Hello, Since PVE 6.0, we detect that the CPU frequency is dynamic even if the governor "performance" is selected. How can we set the maximum frequency in all the cores? Thank you. 
root at ns1031:~# cat /proc/cpuinfo | grep MHz cpu MHz : 1908.704 cpu MHz : 1430.150 cpu MHz : 1436.616 cpu MHz : 1433.659 cpu MHz : 1547.050 cpu MHz : 2299.611 cpu MHz : 2931.782 cpu MHz : 3321.946 cpu MHz : 3397.896 cpu MHz : 3401.725 cpu MHz : 3401.701 cpu MHz : 3170.021 cpu MHz : 2502.847 cpu MHz : 1534.596 cpu MHz : 2969.438 cpu MHz : 3435.646 cpu MHz : 1992.717 cpu MHz : 2922.292 cpu MHz : 2917.282 cpu MHz : 2921.654 cpu MHz : 2484.807 cpu MHz : 1866.995 cpu MHz : 1629.151 cpu MHz : 2841.298 cpu MHz : 3404.503 cpu MHz : 3401.732 cpu MHz : 3406.120 cpu MHz : 2189.093 cpu MHz : 1545.454 cpu MHz : 3348.573 cpu MHz : 1564.159 cpu MHz : 3438.787 root at ns1031:~# root at ns1031:~# root at ns1031:~# root at ns1031:~# cpupower frequency-info analyzing CPU 0: driver: intel_pstate CPUs which run at the same hardware frequency: 0 CPUs which need to have their frequency coordinated by software: 0 maximum transition latency: Cannot determine or is not supported. hardware limits: 1.20 GHz - 3.60 GHz available cpufreq governors: performance powersave current policy: frequency should be within 1.20 GHz and 3.60 GHz. The governor "performance" may decide which speed to use within this range. current CPU frequency: Unable to call hardware current CPU frequency: 1.90 GHz (asserted by call to kernel) boost state support: Supported: yes Active: yes root at ns1031:~# root at ns1031:~# -- Jos? Manuel Giner https://ginernet.com _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From aderumier at odiso.com Thu Jan 23 20:53:54 2020 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Thu, 23 Jan 2020 20:53:54 +0100 (CET) Subject: [PVE-User] VxLAN and tagged frames In-Reply-To: <376592375.41818.1579678413069.JavaMail.zimbra@fws.fr> References: <376592375.41818.1579678413069.JavaMail.zimbra@fws.fr> Message-ID: <1894071400.2463602.1579809234648.JavaMail.zimbra@odiso.com> Hi, >>So, what's the recommended setup for this ? Create one (non vlan aware) bridge for each network zone, with 1 VxLAN tunnel per bridge between nodes ? yes, you need 1 non-vlan aware bridge + 1 vxlan tunnel. Technically they are vlan (from aware bridge) to vxlan mapping in kernel, but it's realy new and unstable. I don't known if it's possible to send vlan tagged frame inside a vxlan, never tested it. >>This doesn't look very scalable compared with >>vlan aware bridges (or OVS bridges) with GRE tunnels, does it ? I have tested it with 2000 vxlans + 2000 bridges. Works fine. Is is enough for you ? >>Are the expirimental SDN plugins available somewhere as deb so I can play a bit with it ? 
(couldn't find it in pve-test or no-subscription) #apt-get install libpve-network-perl (try for pvetest repo if possible) The gui is not finished yet, but you can try it at http://odisoweb1.odiso.net/pve-manager_6.1-5_amd64.deb I think if you want to do something like a simple vxlan tunnel, with multiple vlan, something like this should work (need to be tested): auto vxlan2 iface vxlan2 inet manual vxlan-id 2 vxlan_remoteip 192.168.0.2 vxlan_remoteip 192.168.0.3 auto vmbr2 iface vmbr2 inet manual bridge_ports vxlan2 bridge_stp off bridge_fd 0 bridge-vlan-aware yes bridge-vids 2-4096 Note that it's possible to do gre tunnel with ifupdown2, I can send the config if you need it ----- Mail original ----- De: "Daniel Berteaud" ?: "proxmoxve" Envoy?: Mercredi 22 Janvier 2020 08:33:33 Objet: [PVE-User] VxLAN and tagged frames Hi there At a french hoster (Online.net), we have a private network available on dedicated server, but without QinQ support. So, we can't rely on native VLAN between nodes. Up to now, I created a single OVS bridge on every node, with GRE tunnels with each other. The GRE tunnel transport tagged frames and everything is working. But I see there are some work on SDN plugins, and VxLAN support. I red [ https://git.proxmox.com/?p=pve-docs.git;a=blob_plain;f=vxlan-and-evpn.adoc;hb=HEAD | https://git.proxmox.com/?p=pve-docs.git;a=blob_plain;f=vxlan-and-evpn.adoc;hb=HEAD ] but there are some stuff I'm not sure I understand. Especially with vlan aware bridges. I like to rely on VLAN aware bridges so I don't have to touch network conf of the hypervisors to create a new network zone. I just use a new, unused VLAN ID. But the doc about VxLAN support on vlan aware bridges has been removed (see [ https://git.proxmox.com/?p=pve-docs.git;a=commitdiff;h=5dde3d645834b204257e8d5b3ce8b65e6842abe8;hp=d4a9910fec45b1153b1cd954a006d267d42c707a | https://git.proxmox.com/?p=pve-docs.git;a=commitdiff;h=5dde3d645834b204257e8d5b3ce8b65e6842abe8;hp=d4a9910fec45b1153b1cd954a006d267d42c707a ] ) So, what's the recommended setup for this ? Create one (non vlan aware) bridge for each network zone, with 1 VxLAN tunnel per bridge between nodes ? This doesn't look very scalable compared with vlan aware bridges (or OVS bridges) with GRE tunnels, does it ? Are the expirimental SDN plugins available somewhere as deb so I can play a bit with it ? (couldn't find it in pve-test or no-subscription) Cheers, Daniel -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From daniel at firewall-services.com Fri Jan 24 08:20:22 2020 From: daniel at firewall-services.com (Daniel Berteaud) Date: Fri, 24 Jan 2020 08:20:22 +0100 (CET) Subject: [PVE-User] VxLAN and tagged frames In-Reply-To: <1894071400.2463602.1579809234648.JavaMail.zimbra@odiso.com> References: <376592375.41818.1579678413069.JavaMail.zimbra@fws.fr> <1894071400.2463602.1579809234648.JavaMail.zimbra@odiso.com> Message-ID: <2101121917.56814.1579850422421.JavaMail.zimbra@fws.fr> ----- Le 23 Jan 20, ? 20:53, Alexandre DERUMIER aderumier at odiso.com a ?crit : > Hi, > >>>So, what's the recommended setup for this ? 
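Whichever of the layouts discussed in this thread ends up being used (one plain bridge per VXLAN, or a VLAN-aware bridge in front of several VXLAN interfaces), the resulting state can be checked from the host with plain iproute2 once ifreload has run. A small sketch, reusing the interface names from the examples above:

# VXLAN parameters actually applied (VNI, remote/service-node address, dstport)
ip -d link show vxlan2

# is the tunnel enslaved to the bridge?
bridge link show dev vxlan2

# VLAN membership per port when the bridge is vlan-aware
bridge vlan show

# MAC addresses learned through the tunnel (useful to see whether BUM traffic flows)
bridge fdb show dev vxlan2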
Create one (non vlan aware) bridge >>>for each network zone, with 1 VxLAN tunnel per bridge between nodes ? > > yes, you need 1 non-vlan aware bridge + 1 vxlan tunnel. OK > > Technically they are vlan (from aware bridge) to vxlan mapping in kernel, but > it's realy new and unstable. > I don't known if it's possible to send vlan tagged frame inside a vxlan, never > tested it. > >>>This doesn't look very scalable compared with >>vlan aware bridges (or OVS >>>bridges) with GRE tunnels, does it ? > > I have tested it with 2000 vxlans + 2000 bridges. Works fine. Is is enough for > you ? I mean, until the SDN plugin is ready, creating a new network zone requires manual editing of network config on every node (new bridge + new vxlan). Unlike vlan aware bridges where you setup network on the hypervisor once, and then just use a new VLAN id for a VM. But most likely your SDN plugin makes it easier. > > > >>>Are the expirimental SDN plugins available somewhere as deb so I can play a bit >>>with it ? (couldn't find it in pve-test or no-subscription) > > #apt-get install libpve-network-perl (try for pvetest repo if possible) Oh, OK thanks. I was looking for a pve-something package name, that's why I haven't saw it :-) > > > The gui is not finished yet, but you can try it at > http://odisoweb1.odiso.net/pve-manager_6.1-5_amd64.deb > > > > > > I think if you want to do something like a simple vxlan tunnel, with multiple > vlan, something like this should work (need to be tested): > > auto vxlan2 > iface vxlan2 inet manual > vxlan-id 2 > vxlan_remoteip 192.168.0.2 > vxlan_remoteip 192.168.0.3 > > auto vmbr2 > iface vmbr2 inet manual > bridge_ports vxlan2 > bridge_stp off > bridge_fd 0 > bridge-vlan-aware yes > bridge-vids 2-4096 I'll try something like that. Until now, I use this : auto vmbr0 allow-ovs vmbr0 iface vmbr0 inet manual ovs_type OVSBridge ovs_ports none up ovs-vsctl set Bridge ${IFACE} rstp_enable=true Then a script get all the cluster members, and create one gre tunnel with each other node like : ovs-vsctl add-port vmbr0 gre0 -- set interface gre0 type=gre options:remote_ip=10.22.5.2 ovs-vsctl add-port vmbr0 gre1 -- set interface gre1 type=gre options:remote_ip=10.22.5.3 etc. Not perfect, but working. The single GRE tunnel transport all the VLAN ++ -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From daniel at firewall-services.com Fri Jan 24 10:15:34 2020 From: daniel at firewall-services.com (Daniel Berteaud) Date: Fri, 24 Jan 2020 10:15:34 +0100 (CET) Subject: [PVE-User] VxLAN and tagged frames In-Reply-To: <2101121917.56814.1579850422421.JavaMail.zimbra@fws.fr> References: <376592375.41818.1579678413069.JavaMail.zimbra@fws.fr> <1894071400.2463602.1579809234648.JavaMail.zimbra@odiso.com> <2101121917.56814.1579850422421.JavaMail.zimbra@fws.fr> Message-ID: <1856931715.57400.1579857334266.JavaMail.zimbra@fws.fr> ----- Le 24 Jan 20, ? 8:20, Daniel Berteaud daniel at firewall-services.com a ?crit : > ----- Le 23 Jan 20, ? 
20:53, Alexandre DERUMIER aderumier at odiso.com a ?crit : >> >> I think if you want to do something like a simple vxlan tunnel, with multiple >> vlan, something like this should work (need to be tested): >> >> auto vxlan2 >> iface vxlan2 inet manual >> vxlan-id 2 >> vxlan_remoteip 192.168.0.2 >> vxlan_remoteip 192.168.0.3 >> >> auto vmbr2 >> iface vmbr2 inet manual >> bridge_ports vxlan2 >> bridge_stp off >> bridge_fd 0 >> bridge-vlan-aware yes >> bridge-vids 2-4096 > > I'll try something like that. Arf. ifupdown2 seems to be needed for vxlan interfaces to be setup. But it somehow breaks my ARP proxy setup on the WAN interface. Not sure why, everything seems to be correctly setup, but the host doesn't answer to ARP requests anymore. And everything is back to normal as soon as I revert to classic ifupdown. I'll try to look at this a bit later, when I more some spare time. ++ -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From aderumier at odiso.com Fri Jan 24 11:06:58 2020 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Fri, 24 Jan 2020 11:06:58 +0100 (CET) Subject: [PVE-User] VxLAN and tagged frames In-Reply-To: <1856931715.57400.1579857334266.JavaMail.zimbra@fws.fr> References: <376592375.41818.1579678413069.JavaMail.zimbra@fws.fr> <1894071400.2463602.1579809234648.JavaMail.zimbra@odiso.com> <2101121917.56814.1579850422421.JavaMail.zimbra@fws.fr> <1856931715.57400.1579857334266.JavaMail.zimbra@fws.fr> Message-ID: <21489142.2478491.1579860418230.JavaMail.zimbra@odiso.com> >Arf. ifupdown2 seems to be needed for vxlan interfaces to be setup. yes, ifupdown2 is needed. >>But it somehow breaks my ARP proxy setup on the WAN interface. >>Not sure why, everything seems to be correctly setup, but the host doesn't answer to ARP requests anymore. And everything is back to normal as soon as I revert to classic ifupdown. >>I'll try to look at this a bit later, when I more some spare time. I'm not sure, but maybe you can try to add iface WAN ... arp-accept on About vlan brige->vxlan, I have done some tests again with last kernel, it seem than 1 vlanaware bridge + 1 vxlan tunnel (tunnel_mode) is still broken, So the only possible way to 1 vlanawarebridge + multiple vxlan tunnel. This can be done easily with ifupdown2 like this: %for v in range(1010,1021): auto vxlan${v} iface vxlan${v} vxlan-id ${v} bridge-access ${v} vxlan_remoteip 192.168.0.2 vxlan_remoteip 192.168.0.3 %endfor auto vmbr2 iface vmbr2 inet manual bridge_ports glob vxlan1010-1020 bridge_stp off bridge_fd 0 bridge-vlan-aware yes bridge-vids 2-4094 This will map vlan1010-1020 to vxlan1010-1020. the vxlan interfaces are create with a template in a loop I have tested it, it's working fine. ----- Mail original ----- De: "Daniel Berteaud" ?: "proxmoxve" Envoy?: Vendredi 24 Janvier 2020 10:15:34 Objet: Re: [PVE-User] VxLAN and tagged frames ----- Le 24 Jan 20, ? 8:20, Daniel Berteaud daniel at firewall-services.com a ?crit : > ----- Le 23 Jan 20, ? 
20:53, Alexandre DERUMIER aderumier at odiso.com a ?crit : >> >> I think if you want to do something like a simple vxlan tunnel, with multiple >> vlan, something like this should work (need to be tested): >> >> auto vxlan2 >> iface vxlan2 inet manual >> vxlan-id 2 >> vxlan_remoteip 192.168.0.2 >> vxlan_remoteip 192.168.0.3 >> >> auto vmbr2 >> iface vmbr2 inet manual >> bridge_ports vxlan2 >> bridge_stp off >> bridge_fd 0 >> bridge-vlan-aware yes >> bridge-vids 2-4096 > > I'll try something like that. Arf. ifupdown2 seems to be needed for vxlan interfaces to be setup. But it somehow breaks my ARP proxy setup on the WAN interface. Not sure why, everything seems to be correctly setup, but the host doesn't answer to ARP requests anymore. And everything is back to normal as soon as I revert to classic ifupdown. I'll try to look at this a bit later, when I more some spare time. ++ -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From daniel at firewall-services.com Fri Jan 24 11:18:05 2020 From: daniel at firewall-services.com (Daniel Berteaud) Date: Fri, 24 Jan 2020 11:18:05 +0100 (CET) Subject: [PVE-User] VxLAN and tagged frames In-Reply-To: <21489142.2478491.1579860418230.JavaMail.zimbra@odiso.com> References: <376592375.41818.1579678413069.JavaMail.zimbra@fws.fr> <1894071400.2463602.1579809234648.JavaMail.zimbra@odiso.com> <2101121917.56814.1579850422421.JavaMail.zimbra@fws.fr> <1856931715.57400.1579857334266.JavaMail.zimbra@fws.fr> <21489142.2478491.1579860418230.JavaMail.zimbra@odiso.com> Message-ID: <1610578371.58095.1579861085258.JavaMail.zimbra@fws.fr> ----- Le 24 Jan 20, ? 11:06, Alexandre DERUMIER aderumier at odiso.com a ?crit : >>Arf. ifupdown2 seems to be needed for vxlan interfaces to be setup. > yes, ifupdown2 is needed. > >>>But it somehow breaks my ARP proxy setup on the WAN interface. >>>Not sure why, everything seems to be correctly setup, but the host doesn't >>>answer to ARP requests anymore. And everything is back to normal as soon as I >>>revert to classic ifupdown. >>>I'll try to look at this a bit later, when I more some spare time. > > I'm not sure, but maybe you can try to add > > iface WAN > ... > arp-accept on Will give this a try. > > > > About vlan brige->vxlan, I have done some tests again with last kernel, it seem > than 1 vlanaware bridge + 1 vxlan tunnel (tunnel_mode) is still broken, > So the only possible way to 1 vlanawarebridge + multiple vxlan tunnel. > > This can be done easily with ifupdown2 like this: > > > > > %for v in range(1010,1021): > auto vxlan${v} > iface vxlan${v} > vxlan-id ${v} > bridge-access ${v} > vxlan_remoteip 192.168.0.2 > vxlan_remoteip 192.168.0.3 > %endfor > > > auto vmbr2 > iface vmbr2 inet manual > bridge_ports glob vxlan1010-1020 > bridge_stp off > bridge_fd 0 > bridge-vlan-aware yes > bridge-vids 2-4094 Oooohhh, I didn't know we could use loops and glob like this. This changes everything :-) ! I'll give this a try Thanks for the tips Regards, Daniel -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? 
de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From daniel at firewall-services.com Fri Jan 24 17:52:39 2020 From: daniel at firewall-services.com (Daniel Berteaud) Date: Fri, 24 Jan 2020 17:52:39 +0100 (CET) Subject: [PVE-User] VxLAN and tagged frames In-Reply-To: <1610578371.58095.1579861085258.JavaMail.zimbra@fws.fr> References: <376592375.41818.1579678413069.JavaMail.zimbra@fws.fr> <1894071400.2463602.1579809234648.JavaMail.zimbra@odiso.com> <2101121917.56814.1579850422421.JavaMail.zimbra@fws.fr> <1856931715.57400.1579857334266.JavaMail.zimbra@fws.fr> <21489142.2478491.1579860418230.JavaMail.zimbra@odiso.com> <1610578371.58095.1579861085258.JavaMail.zimbra@fws.fr> Message-ID: <1149077723.61057.1579884759348.JavaMail.zimbra@fws.fr> > ----- Le 24 Jan 20, ? 11:06, Alexandre DERUMIER aderumier at odiso.com a ?crit : >> >> This can be done easily with ifupdown2 like this: >> >> >> >> >> %for v in range(1010,1021): >> auto vxlan${v} >> iface vxlan${v} >> vxlan-id ${v} >> bridge-access ${v} >> vxlan_remoteip 192.168.0.2 >> vxlan_remoteip 192.168.0.3 >> %endfor >> >> >> auto vmbr2 >> iface vmbr2 inet manual >> bridge_ports glob vxlan1010-1020 >> bridge_stp off >> bridge_fd 0 >> bridge-vlan-aware yes >> bridge-vids 2-4094 > I'm probably missing something, but can't find it. I've just added this : %for v in range(201,400): auto vxlan${v} iface vxlan${v} inet manual vxlan-id ${v} vxlan-svcnodeip 225.20.1.1 vxlan-physdev enp132s0f0.2022 %endfor And vxlan interfaces aren't created. ifreload -a complains : error: /etc/network/interfaces: failed to render template (Undefined). Continue without template rendering ... warning: /etc/network/interfaces: line37: vxlan${v}: unexpected characters in interface name error: /etc/network/interfaces: line41: iface vxlan${v}: invalid syntax '%endfor' error: vxlan${v}: invalid vxlan-id '${v}' I do have python-mako installed (which wasn't pulled in as a dependency of ifupdown2 BTW, maybe it should) Any idea ? -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From daniel at firewall-services.com Fri Jan 24 19:13:19 2020 From: daniel at firewall-services.com (Daniel Berteaud) Date: Fri, 24 Jan 2020 19:13:19 +0100 (CET) Subject: [PVE-User] VxLAN and tagged frames In-Reply-To: <1149077723.61057.1579884759348.JavaMail.zimbra@fws.fr> References: <376592375.41818.1579678413069.JavaMail.zimbra@fws.fr> <1894071400.2463602.1579809234648.JavaMail.zimbra@odiso.com> <2101121917.56814.1579850422421.JavaMail.zimbra@fws.fr> <1856931715.57400.1579857334266.JavaMail.zimbra@fws.fr> <21489142.2478491.1579860418230.JavaMail.zimbra@odiso.com> <1610578371.58095.1579861085258.JavaMail.zimbra@fws.fr> <1149077723.61057.1579884759348.JavaMail.zimbra@fws.fr> Message-ID: <1266029243.61313.1579889599768.JavaMail.zimbra@fws.fr> ----- Le 24 Jan 20, ? 17:52, Daniel Berteaud daniel at firewall-services.com a ?crit : > And vxlan interfaces aren't created. ifreload -a complains : > > error: /etc/network/interfaces: failed to render template (Undefined). Continue > without template rendering ... 
> warning: /etc/network/interfaces: line37: vxlan${v}: unexpected characters in > interface name > error: /etc/network/interfaces: line41: iface vxlan${v}: invalid syntax > '%endfor' > error: vxlan${v}: invalid vxlan-id '${v}' > > > I do have python-mako installed (which wasn't pulled in as a dependency of > ifupdown2 BTW, maybe it should) > > Any idea ? Stupid me. I was using the ${IFACE} macro earlier in the file, which mako tried to expand. ++ -- [ https://www.firewall-services.com/ ] Daniel Berteaud FIREWALL-SERVICES SAS, La s?curit? des r?seaux Soci?t? de Services en Logiciels Libres T?l : +33.5 56 64 15 32 Matrix: @dani:fws.fr [ https://www.firewall-services.com/ | https://www.firewall-services.com ] From f.thommen at dkfz-heidelberg.de Sat Jan 25 16:44:10 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Sat, 25 Jan 2020 16:44:10 +0100 Subject: [PVE-User] PVE Cluster: New authentication is required to access each node from GUI Message-ID: <2ed8b9fc-c944-3de7-5fda-394909638843@dkfz-heidelberg.de> Dear all, I have installed a 3-node PVE cluster as instructed on https://pve.proxmox.com/pve-docs/chapter-pvecm.html (usung commandline). When I now connect via GUI to one node and select one of the other nodes, I get a "401" error message and then I am asked to authenticate to the other node. So to see all nodes from all other nodes via GUI I would have to authenticate nine times. I don't think that is as it should be ;-). I would assume that once I am logged in on the GUI of one of the cluster nodes, I can look at the other two nodes w/o additional authentication from this GUI. The situation is somehow similar to the one described on https://forum.proxmox.com/threads/3-node-cluster-permission-denied-invalid-pve-ticket-401.56038/, but the suggested "pvecm updatecerts" (run on each node) only helped for a short time. After a reboot of the nodes I am back to the potential nine authentications. My three nodes are connected through a full 10GE mesh (https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server) using broadcast bonds. This mesh will finally also be used for Ceph. I configured this mesh to be the cluster network (--link0). As fallback (--link1) I used the regular LAN. Does anyone have an idea what could be wrong and how this could be fixed? Could the mesh with the broadcast bonds be the problem? If yes, should I use an other type of mesh? Unfortunately a full dedicated PVE-only network with a switch is not an option. I can either use a mesh or the regular LAN in the datacenter. The systems are running PVE 6.1-3. Any help or hint is appreciated. Cheers frank From gianni.milo22 at gmail.com Sat Jan 25 18:32:29 2020 From: gianni.milo22 at gmail.com (Gianni Milo) Date: Sat, 25 Jan 2020 17:32:29 +0000 Subject: [PVE-User] PVE Cluster: New authentication is required to access each node from GUI In-Reply-To: <2ed8b9fc-c944-3de7-5fda-394909638843@dkfz-heidelberg.de> References: <2ed8b9fc-c944-3de7-5fda-394909638843@dkfz-heidelberg.de> Message-ID: Things I would check or modify... - output of 'pvecm s' and 'pvecm n' commands. - syslog on each node for any clues. - ntp. - separate cluster (corosync) network from storage network (i.e In your case, use --link2, LAN). G. On Sat, 25 Jan 2020 at 15:44, Frank Thommen wrote: > Dear all, > > I have installed a 3-node PVE cluster as instructed on > https://pve.proxmox.com/pve-docs/chapter-pvecm.html (usung commandline). 
> When I now connect via GUI to one node and select one of the other > nodes, I get a "401" error message and then I am asked to authenticate > to the other node. So to see all nodes from all other nodes via GUI I > would have to authenticate nine times. I don't think that is as it > should be ;-). I would assume that once I am logged in on the GUI of one > of the cluster nodes, I can look at the other two nodes w/o additional > authentication from this GUI. > > The situation is somehow similar to the one described on > > https://forum.proxmox.com/threads/3-node-cluster-permission-denied-invalid-pve-ticket-401.56038/, > > but the suggested "pvecm updatecerts" (run on each node) only helped for > a short time. After a reboot of the nodes I am back to the potential > nine authentications. > > My three nodes are connected through a full 10GE mesh > (https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server) using > broadcast bonds. This mesh will finally also be used for Ceph. I > configured this mesh to be the cluster network (--link0). As fallback > (--link1) I used the regular LAN. > > Does anyone have an idea what could be wrong and how this could be > fixed? Could the mesh with the broadcast bonds be the problem? If yes, > should I use an other type of mesh? Unfortunately a full dedicated > PVE-only network with a switch is not an option. I can either use a > mesh or the regular LAN in the datacenter. > > The systems are running PVE 6.1-3. > > Any help or hint is appreciated. > > Cheers > frank > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From f.thommen at dkfz-heidelberg.de Sun Jan 26 02:28:25 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Sun, 26 Jan 2020 02:28:25 +0100 Subject: [PVE-User] PVE Cluster: New authentication is required to access each node from GUI In-Reply-To: References: <2ed8b9fc-c944-3de7-5fda-394909638843@dkfz-heidelberg.de> Message-ID: <7b042b79-a9a5-46c9-415d-47c8991ac075@dkfz-heidelberg.de> The time it was. It was off by a few minutes between two of the servers but off by several hours on the third. I don't like ntpd anyway and I will probably replace it by chronyd. Thanks for the hint. Cheers, frank On 25/01/2020 18:32, Gianni Milo wrote: > Things I would check or modify... > > - output of 'pvecm s' and 'pvecm n' commands. > - syslog on each node for any clues. > - ntp. > - separate cluster (corosync) network from storage network (i.e In your > case, use --link2, LAN). > > G. > > > On Sat, 25 Jan 2020 at 15:44, Frank Thommen > wrote: > >> Dear all, >> >> I have installed a 3-node PVE cluster as instructed on >> https://pve.proxmox.com/pve-docs/chapter-pvecm.html (usung commandline). >> When I now connect via GUI to one node and select one of the other >> nodes, I get a "401" error message and then I am asked to authenticate >> to the other node. So to see all nodes from all other nodes via GUI I >> would have to authenticate nine times. I don't think that is as it >> should be ;-). I would assume that once I am logged in on the GUI of one >> of the cluster nodes, I can look at the other two nodes w/o additional >> authentication from this GUI. >> >> The situation is somehow similar to the one described on >> >> https://forum.proxmox.com/threads/3-node-cluster-permission-denied-invalid-pve-ticket-401.56038/, >> >> but the suggested "pvecm updatecerts" (run on each node) only helped for >> a short time. 
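For what it's worth, this 401/re-login behaviour is the classic symptom of clock skew between nodes: the login tickets pvedaemon/pveproxy hand out are time-limited and every node validates them against its own clock, so a node that is minutes or hours off will reject tickets issued elsewhere. A quick cross-node check (chrony and ntpd shown as alternatives, depending on what is installed):

# run on every node and compare
date -u
timedatectl                # look at "System clock synchronized"
chronyc sources -v         # if chrony is in use
ntpq -p                    # if classic ntpd is in use

Once the clocks agree, "pvecm updatecerts" (or simply waiting for a fresh ticket) should stick.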
After a reboot of the nodes I am back to the potential >> nine authentications. >> >> My three nodes are connected through a full 10GE mesh >> (https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server) using >> broadcast bonds. This mesh will finally also be used for Ceph. I >> configured this mesh to be the cluster network (--link0). As fallback >> (--link1) I used the regular LAN. >> >> Does anyone have an idea what could be wrong and how this could be >> fixed? Could the mesh with the broadcast bonds be the problem? If yes, >> should I use an other type of mesh? Unfortunately a full dedicated >> PVE-only network with a switch is not an option. I can either use a >> mesh or the regular LAN in the datacenter. >> >> The systems are running PVE 6.1-3. >> >> Any help or hint is appreciated. >> >> Cheers >> frank >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From jm at ginernet.com Sun Jan 26 12:40:43 2020 From: jm at ginernet.com (=?UTF-8?Q?Jos=c3=a9_Manuel_Giner?=) Date: Sun, 26 Jan 2020 12:40:43 +0100 Subject: [PVE-User] CPU freq scale in PVE 6.0 In-Reply-To: <1992237777.2463296.1579808317031.JavaMail.zimbra@odiso.com> References: <993bdd37-0e80-4718-07d3-bc17d17c45d5@ginernet.com> <1992237777.2463296.1579808317031.JavaMail.zimbra@odiso.com> Message-ID: <8b539fbb-52fc-530d-c070-44b3c98fe991@ginernet.com> I edited /etc/default/grub Updated with update-grub command Checked that /boot/grub/grub.cfg has been updated correctly But after reboot nothing has changed, freq is still btw 1.20 and 3.60 GHz Any idea how to fix? On 23/01/2020 20:38, Alexandre DERUMIER wrote: > Hi, > I'm setup this now for my new intel processors: > > /etc/default/grub > > GRUB_CMDLINE_LINUX="intel_idle.max_cstate=0 intel_pstate=disable processor.max_cstate=1" > > > > ----- Mail original ----- > De: "Jos? Manuel Giner" > ?: "proxmoxve" > Envoy?: Jeudi 23 Janvier 2020 15:34:47 > Objet: [PVE-User] CPU freq scale in PVE 6.0 > > Hello, > > Since PVE 6.0, we detect that the CPU frequency is dynamic even if the > governor "performance" is selected. > > How can we set the maximum frequency in all the cores? > > Thank you. > > > root at ns1031:~# cat /proc/cpuinfo | grep MHz > cpu MHz : 1908.704 > cpu MHz : 1430.150 > cpu MHz : 1436.616 > cpu MHz : 1433.659 > cpu MHz : 1547.050 > cpu MHz : 2299.611 > cpu MHz : 2931.782 > cpu MHz : 3321.946 > cpu MHz : 3397.896 > cpu MHz : 3401.725 > cpu MHz : 3401.701 > cpu MHz : 3170.021 > cpu MHz : 2502.847 > cpu MHz : 1534.596 > cpu MHz : 2969.438 > cpu MHz : 3435.646 > cpu MHz : 1992.717 > cpu MHz : 2922.292 > cpu MHz : 2917.282 > cpu MHz : 2921.654 > cpu MHz : 2484.807 > cpu MHz : 1866.995 > cpu MHz : 1629.151 > cpu MHz : 2841.298 > cpu MHz : 3404.503 > cpu MHz : 3401.732 > cpu MHz : 3406.120 > cpu MHz : 2189.093 > cpu MHz : 1545.454 > cpu MHz : 3348.573 > cpu MHz : 1564.159 > cpu MHz : 3438.787 > root at ns1031:~# > root at ns1031:~# > root at ns1031:~# > root at ns1031:~# cpupower frequency-info > analyzing CPU 0: > driver: intel_pstate > CPUs which run at the same hardware frequency: 0 > CPUs which need to have their frequency coordinated by software: 0 > maximum transition latency: Cannot determine or is not supported. 
> hardware limits: 1.20 GHz - 3.60 GHz > available cpufreq governors: performance powersave > current policy: frequency should be within 1.20 GHz and 3.60 GHz. > The governor "performance" may decide which speed to use > within this range. > current CPU frequency: Unable to call hardware > current CPU frequency: 1.90 GHz (asserted by call to kernel) > boost state support: > Supported: yes > Active: yes > root at ns1031:~# > root at ns1031:~# > > > -- Jos? Manuel Giner https://ginernet.com From f.thommen at dkfz-heidelberg.de Sun Jan 26 14:14:44 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Sun, 26 Jan 2020 14:14:44 +0100 Subject: [PVE-User] Ceph: Monitors not running but cannot be destroyed or recreated Message-ID: <0decd07f-3d4b-e5d9-83e9-e3539de98de4@dkfz-heidelberg.de> Dear all, I am trying to destroy "old" Ceph monitors but they can't be deleted and also cannot be recreated: I am currently configuring Ceph on our PVE cluster (3 nodes running PVE 6.1-3). There have been some "remainders" of a previous Ceph configuration which I had tried to configure while the nodes were not in a cluster configuration yet (and I had used the wrong network). However I had purged these configurations with `pveceph purge`. I have redone the basic Ceph configuration through the GUI on the first node and I have deleted the still existing managers through the GUI (to have a fresh start). A new monitor has been created on the first node automatically, but I am unable to delete the monitors on nodes 2 and 3. They show up as Status=stopped and Address=Unknown in the GUI and they cannot be started (no error message). In the syslog window I see (after rebooting node odcf-pve02): ------------ Jan 26 13:51:53 odcf-pve02 systemd[1]: Started Ceph cluster monitor daemon. Jan 26 13:51:55 odcf-pve02 ceph-mon[1372]: 2020-01-26 13:51:55.450 7faa98ab9280 -1 mon.odcf-pve02 at 0(electing) e1 failed to get devid for : fallback method has serial ''but no model ------------ On the other hand I see the same message on the first node, and there the monitor seems to work fine. Trying to destroy them results in the message, that there is no such monitor, and trying to create a new monitor on these nodes results in the message, that the monitor already exists.... I am stuck in this existence loop. Destroying or creating them also doesn't work on the commandline. Any idea on how to fix this? I'd rather not completely reinstall the nodes :-) Cheers frank From f.thommen at dkfz-heidelberg.de Sun Jan 26 16:46:08 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Sun, 26 Jan 2020 16:46:08 +0100 Subject: [PVE-User] Ceph: Monitors not running but cannot be destroyed or recreated In-Reply-To: <0decd07f-3d4b-e5d9-83e9-e3539de98de4@dkfz-heidelberg.de> References: <0decd07f-3d4b-e5d9-83e9-e3539de98de4@dkfz-heidelberg.de> Message-ID: <31220e9d-62f9-99ae-90b3-efa761ac619e@dkfz-heidelberg.de> On 26/01/2020 14:14, Frank Thommen wrote: > Dear all, > > I am trying to destroy "old" Ceph monitors but they can't be deleted and > also cannot be recreated: > > I am currently configuring Ceph on our PVE cluster (3 nodes running PVE > 6.1-3).? There have been some "remainders" of a previous Ceph > configuration which I had tried to configure while the nodes were not in > a cluster configuration yet (and I had used the wrong network).? However > I had purged these configurations with `pveceph purge`.? 
I have redone > the basic Ceph configuration through the GUI on the first node and I > have deleted the still existing managers through the GUI (to have a > fresh start). > > A new monitor has been created on the first node automatically, but I am > unable to delete the monitors on nodes 2 and 3.? They show up as > Status=stopped and Address=Unknown in the GUI and they cannot be started > (no error message).? In the syslog window I see (after rebooting node > odcf-pve02): > > ------------ > Jan 26 13:51:53 odcf-pve02 systemd[1]: Started Ceph cluster monitor daemon. > Jan 26 13:51:55 odcf-pve02 ceph-mon[1372]: 2020-01-26 13:51:55.450 > 7faa98ab9280 -1 mon.odcf-pve02 at 0(electing) e1 failed to get devid for : > fallback method has serial ''but no model > ------------ > > On the other hand I see the same message on the first node, and there > the monitor seems to work fine. > > Trying to destroy them results in the message, that there is no such > monitor, and trying to create a new monitor on these nodes results in > the message, that the monitor already exists.... I am stuck in this > existence loop.? Destroying or creating them also doesn't work on the > commandline. > > Any idea on how to fix this?? I'd rather not completely reinstall the > nodes :-) > > Cheers > frank In an attempt to clean up the Ceph setup again, I ran pveceph stop ceph.target pveceph purge on the first node. Now I get an rados_connect failed - No such file or directory (500) when I select Ceph in the GUI of any of the three nodes. A reboot of all nodes didn't help. frank From f.thommen at dkfz-heidelberg.de Sun Jan 26 23:51:54 2020 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Sun, 26 Jan 2020 23:51:54 +0100 Subject: [PVE-User] Ceph: Monitors not running but cannot be destroyed or recreated In-Reply-To: <31220e9d-62f9-99ae-90b3-efa761ac619e@dkfz-heidelberg.de> References: <0decd07f-3d4b-e5d9-83e9-e3539de98de4@dkfz-heidelberg.de> <31220e9d-62f9-99ae-90b3-efa761ac619e@dkfz-heidelberg.de> Message-ID: <7182305b-5213-a2fd-ce19-050abb312e5b@dkfz-heidelberg.de> On 26/01/2020 16:46, Frank Thommen wrote: > On 26/01/2020 14:14, Frank Thommen wrote: >> Dear all, >> >> I am trying to destroy "old" Ceph monitors but they can't be deleted >> and also cannot be recreated: >> >> I am currently configuring Ceph on our PVE cluster (3 nodes running >> PVE 6.1-3).? There have been some "remainders" of a previous Ceph >> configuration which I had tried to configure while the nodes were not >> in a cluster configuration yet (and I had used the wrong network). >> However I had purged these configurations with `pveceph purge`.? I >> have redone the basic Ceph configuration through the GUI on the first >> node and I have deleted the still existing managers through the GUI >> (to have a fresh start). >> >> A new monitor has been created on the first node automatically, but I >> am unable to delete the monitors on nodes 2 and 3.? They show up as >> Status=stopped and Address=Unknown in the GUI and they cannot be >> started (no error message).? In the syslog window I see (after >> rebooting node odcf-pve02): >> >> ------------ >> Jan 26 13:51:53 odcf-pve02 systemd[1]: Started Ceph cluster monitor >> daemon. >> Jan 26 13:51:55 odcf-pve02 ceph-mon[1372]: 2020-01-26 13:51:55.450 >> 7faa98ab9280 -1 mon.odcf-pve02 at 0(electing) e1 failed to get devid for >> : fallback method has serial ''but no model >> ------------ >> >> On the other hand I see the same message on the first node, and there >> the monitor seems to work fine. 
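When the GUI shows monitors that can neither be started nor destroyed, it usually means the cluster's monitor map and the per-node leftovers no longer agree, so comparing the two narrows it down. A rough sketch — the node name is taken from the example above, and the paths are the usual defaults:

# what the cluster itself believes -- run where a working monitor exists
ceph -s
ceph mon dump

# leftover local state on a node that shows a "ghost" monitor
systemctl status ceph-mon@odcf-pve02
ls /var/lib/ceph/mon/
grep -n 'mon' /etc/pve/ceph.conf

If an entry exists only as local leftovers (data directory, systemd unit, ceph.conf section) but not in the monmap, removing those leftovers by hand — which is essentially what the forum thread referenced later in this exchange ends up doing — is what clears the create/destroy deadlock.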
>> >> Trying to destroy them results in the message, that there is no such >> monitor, and trying to create a new monitor on these nodes results in >> the message, that the monitor already exists.... I am stuck in this >> existence loop.? Destroying or creating them also doesn't work on the >> commandline. >> >> Any idea on how to fix this?? I'd rather not completely reinstall the >> nodes :-) >> >> Cheers >> frank > > > In an attempt to clean up the Ceph setup again, I ran > > ? pveceph stop ceph.target > ? pveceph purge > > on the first node.? Now I get an > > ?? rados_connect failed - No such file or directory (500) > > when I select Ceph in the GUI of any of the three nodes.? A reboot of > all nodes didn't help. > > frank I was finally able to completely purge the old settings and reconfigure Ceph with the various instructions from this (https://forum.proxmox.com/threads/not-able-to-use-pveceph-purge-to-completely-remove-ceph.59606/) post. Maybe this information could be added to the official documentation (unless there is a nicer way of completely resetting Ceph in a PROXMOX cluster)? frank From aderumier at odiso.com Mon Jan 27 08:29:56 2020 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Mon, 27 Jan 2020 08:29:56 +0100 (CET) Subject: [PVE-User] CPU freq scale in PVE 6.0 In-Reply-To: <8b539fbb-52fc-530d-c070-44b3c98fe991@ginernet.com> References: <993bdd37-0e80-4718-07d3-bc17d17c45d5@ginernet.com> <1992237777.2463296.1579808317031.JavaMail.zimbra@odiso.com> <8b539fbb-52fc-530d-c070-44b3c98fe991@ginernet.com> Message-ID: <1042155989.2522905.1580110196419.JavaMail.zimbra@odiso.com> what is your cpu / motherboard / server model ? ----- Mail original ----- De: "Jos? Manuel Giner" ?: "proxmoxve" Envoy?: Dimanche 26 Janvier 2020 12:40:43 Objet: Re: [PVE-User] CPU freq scale in PVE 6.0 I edited /etc/default/grub Updated with update-grub command Checked that /boot/grub/grub.cfg has been updated correctly But after reboot nothing has changed, freq is still btw 1.20 and 3.60 GHz Any idea how to fix? On 23/01/2020 20:38, Alexandre DERUMIER wrote: > Hi, > I'm setup this now for my new intel processors: > > /etc/default/grub > > GRUB_CMDLINE_LINUX="intel_idle.max_cstate=0 intel_pstate=disable processor.max_cstate=1" > > > > ----- Mail original ----- > De: "Jos? Manuel Giner" > ?: "proxmoxve" > Envoy?: Jeudi 23 Janvier 2020 15:34:47 > Objet: [PVE-User] CPU freq scale in PVE 6.0 > > Hello, > > Since PVE 6.0, we detect that the CPU frequency is dynamic even if the > governor "performance" is selected. > > How can we set the maximum frequency in all the cores? > > Thank you. 
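To answer the hardware question above, the usual way to read the CPU and board model from a running node is lscpu/dmidecode (both should be present on a standard PVE install, as far as I know):

# CPU model, socket count, min/max frequency
lscpu | egrep 'Model name|Socket|MHz'

# motherboard and system/vendor strings from the SMBIOS tables
dmidecode -t baseboard -t system | egrep 'Manufacturer|Product'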
> > > root at ns1031:~# cat /proc/cpuinfo | grep MHz > cpu MHz : 1908.704 > cpu MHz : 1430.150 > cpu MHz : 1436.616 > cpu MHz : 1433.659 > cpu MHz : 1547.050 > cpu MHz : 2299.611 > cpu MHz : 2931.782 > cpu MHz : 3321.946 > cpu MHz : 3397.896 > cpu MHz : 3401.725 > cpu MHz : 3401.701 > cpu MHz : 3170.021 > cpu MHz : 2502.847 > cpu MHz : 1534.596 > cpu MHz : 2969.438 > cpu MHz : 3435.646 > cpu MHz : 1992.717 > cpu MHz : 2922.292 > cpu MHz : 2917.282 > cpu MHz : 2921.654 > cpu MHz : 2484.807 > cpu MHz : 1866.995 > cpu MHz : 1629.151 > cpu MHz : 2841.298 > cpu MHz : 3404.503 > cpu MHz : 3401.732 > cpu MHz : 3406.120 > cpu MHz : 2189.093 > cpu MHz : 1545.454 > cpu MHz : 3348.573 > cpu MHz : 1564.159 > cpu MHz : 3438.787 > root at ns1031:~# > root at ns1031:~# > root at ns1031:~# > root at ns1031:~# cpupower frequency-info > analyzing CPU 0: > driver: intel_pstate > CPUs which run at the same hardware frequency: 0 > CPUs which need to have their frequency coordinated by software: 0 > maximum transition latency: Cannot determine or is not supported. > hardware limits: 1.20 GHz - 3.60 GHz > available cpufreq governors: performance powersave > current policy: frequency should be within 1.20 GHz and 3.60 GHz. > The governor "performance" may decide which speed to use > within this range. > current CPU frequency: Unable to call hardware > current CPU frequency: 1.90 GHz (asserted by call to kernel) > boost state support: > Supported: yes > Active: yes > root at ns1031:~# > root at ns1031:~# > > > -- Jos? Manuel Giner https://ginernet.com _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From elacunza at binovo.es Tue Jan 28 09:26:35 2020 From: elacunza at binovo.es (Eneko Lacunza) Date: Tue, 28 Jan 2020 09:26:35 +0100 Subject: [PVE-User] PVE 5.4 - resize a NFS disk truncated it Message-ID: <6a0fb5ae-4131-64e6-dbf7-590424a09581@binovo.es> Hi all, We have a PVE 5.4 cluster (details below), with a Synology DS1819+ NFS server for storing file backups. The setup is as follows: - Debian 9 VM with 2 disks; system disk con Ceph RBD, file backup data disk on NFS (6,5TB) - NFS storage on Synology NAS. Backup disk was getting full, so we issued a resize disk from web GUI, with a 500GB size increment. GUI reported timeout with 500 error. After that, disk was shown incremented at VM hardware view, but storage showed the disk having 500GB. VM showed the block device having 500GB (partiton was 6,5TB yet). We have tried incrementing the disk back to 7TB, but disk/partition is corrupt (It was a bad idea, but its too late now). Any idea to recover lost data on NFS server? :) We have done this operation hundreds of times without issues, has anyone had such a catastrophic experience? 
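Before anything writes to that disk again, it is worth pinning down what the image file on the NFS export actually looks like now; with a raw file a truncation means everything past the cut is gone from the file, but everything before it is still addressable, so any recovery attempt should happen on a copy. A sketch — the storage name, VMID and file name are placeholders:

# on the PVE node, look at the backing file itself
ls -lh /mnt/pve/<nfs-storage>/images/<vmid>/
qemu-img info /mnt/pve/<nfs-storage>/images/<vmid>/vm-<vmid>-disk-1.raw

# take a copy to a scratch area before any fsck/repair attempt
cp --sparse=always <original-image> <work-copy>

qemu-img info reports the format and the current virtual size, which tells you whether the 500GB figure shown in the GUI is really what is on disk.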
# pveversion -v proxmox-ve: 5.4-1 (running kernel: 4.15.18-12-pve) pve-manager: 5.4-3 (running version: 5.4-3/0a6eaa62) pve-kernel-4.15: 5.3-3 pve-kernel-4.15.18-12-pve: 4.15.18-35 pve-kernel-4.15.18-7-pve: 4.15.18-27 pve-kernel-4.4.134-1-pve: 4.4.134-112 ceph: 12.2.11-pve1 corosync: 2.4.4-pve1 criu: 2.11.1-1~bpo90 glusterfs-client: 3.8.8-1 ksm-control-daemon: 1.2-2 libjs-extjs: 6.0.1-2 libpve-access-control: 5.1-8 libpve-apiclient-perl: 2.0-5 libpve-common-perl: 5.0-50 libpve-guest-common-perl: 2.0-20 libpve-http-server-perl: 2.0-13 libpve-storage-perl: 5.0-41 libqb0: 1.0.3-1~bpo9 lvm2: 2.02.168-pve6 lxc-pve: 3.1.0-3 lxcfs: 3.0.3-pve1 novnc-pve: 1.0.0-3 proxmox-widget-toolkit: 1.0-25 pve-cluster: 5.0-36 pve-container: 2.0-37 pve-docs: 5.4-2 pve-edk2-firmware: 1.20190312-1 pve-firewall: 3.0-19 pve-firmware: 2.0-6 pve-ha-manager: 2.0-9 pve-i18n: 1.1-4 pve-libspice-server1: 0.14.1-2 pve-qemu-kvm: 2.12.1-3 pve-xtermjs: 3.12.0-1 qemu-server: 5.0-50 smartmontools: 6.5+svn4324-1 spiceterm: 3.0-5 vncterm: 1.5-3 Thanks a lot Eneko -- Zuzendari Teknikoa / Director T?cnico Binovo IT Human Project, S.L. Telf. 943569206 Astigarragako bidea 2, 2? izq. oficina 11; 20180 Oiartzun (Gipuzkoa) www.binovo.es From gianni.milo22 at gmail.com Tue Jan 28 09:58:33 2020 From: gianni.milo22 at gmail.com (Gianni Milo) Date: Tue, 28 Jan 2020 08:58:33 +0000 Subject: [PVE-User] PVE 5.4 - resize a NFS disk truncated it In-Reply-To: <6a0fb5ae-4131-64e6-dbf7-590424a09581@binovo.es> References: <6a0fb5ae-4131-64e6-dbf7-590424a09581@binovo.es> Message-ID: Have you tried booting from a recovery media, then first check reported size within fdisk, then after try to fsck the disk (this may lead to some data loss by itself) ? Is the drive/partition mountable at all? If not, then you may have to try repairing its qcow2/raw file. There are some guides on the internet describing that, however there are no guarantees for the results. G. On Tue, 28 Jan 2020 at 08:27, Eneko Lacunza wrote: > Hi all, > > We have a PVE 5.4 cluster (details below), with a Synology DS1819+ NFS > server for storing file backups. > > The setup is as follows: > > - Debian 9 VM with 2 disks; system disk con Ceph RBD, file backup data > disk on NFS (6,5TB) > - NFS storage on Synology NAS. > > Backup disk was getting full, so we issued a resize disk from web GUI, > with a 500GB size increment. > > GUI reported timeout with 500 error. After that, disk was shown > incremented at VM hardware view, but storage showed the disk having > 500GB. VM showed the block device having 500GB (partiton was 6,5TB yet). > > We have tried incrementing the disk back to 7TB, but disk/partition is > corrupt (It was a bad idea, but its too late now). > > Any idea to recover lost data on NFS server? :) > > We have done this operation hundreds of times without issues, has anyone > had such a catastrophic experience? 
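If the image is qcow2 rather than raw, the truncation may also have damaged its internal metadata, and qemu-img can assess that without touching the guest filesystem — again, preferably pointed at a copy:

# read-only consistency check of the image container
qemu-img check <work-copy>

# "-r leaks" or "-r all" attempts repairs and should never be run against the only remaining copy
qemu-img check -r all <work-copy>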
> > # pveversion -v > proxmox-ve: 5.4-1 (running kernel: 4.15.18-12-pve) > pve-manager: 5.4-3 (running version: 5.4-3/0a6eaa62) > pve-kernel-4.15: 5.3-3 > pve-kernel-4.15.18-12-pve: 4.15.18-35 > pve-kernel-4.15.18-7-pve: 4.15.18-27 > pve-kernel-4.4.134-1-pve: 4.4.134-112 > ceph: 12.2.11-pve1 > corosync: 2.4.4-pve1 > criu: 2.11.1-1~bpo90 > glusterfs-client: 3.8.8-1 > ksm-control-daemon: 1.2-2 > libjs-extjs: 6.0.1-2 > libpve-access-control: 5.1-8 > libpve-apiclient-perl: 2.0-5 > libpve-common-perl: 5.0-50 > libpve-guest-common-perl: 2.0-20 > libpve-http-server-perl: 2.0-13 > libpve-storage-perl: 5.0-41 > libqb0: 1.0.3-1~bpo9 > lvm2: 2.02.168-pve6 > lxc-pve: 3.1.0-3 > lxcfs: 3.0.3-pve1 > novnc-pve: 1.0.0-3 > proxmox-widget-toolkit: 1.0-25 > pve-cluster: 5.0-36 > pve-container: 2.0-37 > pve-docs: 5.4-2 > pve-edk2-firmware: 1.20190312-1 > pve-firewall: 3.0-19 > pve-firmware: 2.0-6 > pve-ha-manager: 2.0-9 > pve-i18n: 1.1-4 > pve-libspice-server1: 0.14.1-2 > pve-qemu-kvm: 2.12.1-3 > pve-xtermjs: 3.12.0-1 > qemu-server: 5.0-50 > smartmontools: 6.5+svn4324-1 > spiceterm: 3.0-5 > vncterm: 1.5-3 > > > Thanks a lot > Eneko > > -- > Zuzendari Teknikoa / Director T?cnico > Binovo IT Human Project, S.L. > Telf. 943569206 > Astigarragako bidea 2 > , > 2? izq. oficina 11; 20180 Oiartzun (Gipuzkoa) > www.binovo.es > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From elacunza at binovo.es Tue Jan 28 10:06:45 2020 From: elacunza at binovo.es (Eneko Lacunza) Date: Tue, 28 Jan 2020 10:06:45 +0100 Subject: [PVE-User] PVE 5.4 - resize a NFS disk truncated it In-Reply-To: References: <6a0fb5ae-4131-64e6-dbf7-590424a09581@binovo.es> Message-ID: <3464be17-3f8c-97a9-2866-6205b20688f3@binovo.es> Thanks for the suggestions Gianni, Partition was mounted. We have umounted and fsck'd the filesystem; journal is broken so we haven't tried going further before trying other recovery methods :) El 28/1/20 a las 9:58, Gianni Milo escribi?: > Have you tried booting from a recovery media, then first check reported > size within fdisk, then after try to fsck the disk (this may lead to some > data loss by itself) ? Is the drive/partition mountable at all? If not, > then you may have to try repairing its qcow2/raw file. There are some > guides on the internet describing that, however there are no guarantees for > the results. > > G. > > On Tue, 28 Jan 2020 at 08:27, Eneko Lacunza wrote: > >> Hi all, >> >> We have a PVE 5.4 cluster (details below), with a Synology DS1819+ NFS >> server for storing file backups. >> >> The setup is as follows: >> >> - Debian 9 VM with 2 disks; system disk con Ceph RBD, file backup data >> disk on NFS (6,5TB) >> - NFS storage on Synology NAS. >> >> Backup disk was getting full, so we issued a resize disk from web GUI, >> with a 500GB size increment. >> >> GUI reported timeout with 500 error. After that, disk was shown >> incremented at VM hardware view, but storage showed the disk having >> 500GB. VM showed the block device having 500GB (partiton was 6,5TB yet). >> >> We have tried incrementing the disk back to 7TB, but disk/partition is >> corrupt (It was a bad idea, but its too late now). >> >> Any idea to recover lost data on NFS server? :) >> >> We have done this operation hundreds of times without issues, has anyone >> had such a catastrophic experience? 
>> >> # pveversion -v >> proxmox-ve: 5.4-1 (running kernel: 4.15.18-12-pve) >> pve-manager: 5.4-3 (running version: 5.4-3/0a6eaa62) >> pve-kernel-4.15: 5.3-3 >> pve-kernel-4.15.18-12-pve: 4.15.18-35 >> pve-kernel-4.15.18-7-pve: 4.15.18-27 >> pve-kernel-4.4.134-1-pve: 4.4.134-112 >> ceph: 12.2.11-pve1 >> corosync: 2.4.4-pve1 >> criu: 2.11.1-1~bpo90 >> glusterfs-client: 3.8.8-1 >> ksm-control-daemon: 1.2-2 >> libjs-extjs: 6.0.1-2 >> libpve-access-control: 5.1-8 >> libpve-apiclient-perl: 2.0-5 >> libpve-common-perl: 5.0-50 >> libpve-guest-common-perl: 2.0-20 >> libpve-http-server-perl: 2.0-13 >> libpve-storage-perl: 5.0-41 >> libqb0: 1.0.3-1~bpo9 >> lvm2: 2.02.168-pve6 >> lxc-pve: 3.1.0-3 >> lxcfs: 3.0.3-pve1 >> novnc-pve: 1.0.0-3 >> proxmox-widget-toolkit: 1.0-25 >> pve-cluster: 5.0-36 >> pve-container: 2.0-37 >> pve-docs: 5.4-2 >> pve-edk2-firmware: 1.20190312-1 >> pve-firewall: 3.0-19 >> pve-firmware: 2.0-6 >> pve-ha-manager: 2.0-9 >> pve-i18n: 1.1-4 >> pve-libspice-server1: 0.14.1-2 >> pve-qemu-kvm: 2.12.1-3 >> pve-xtermjs: 3.12.0-1 >> qemu-server: 5.0-50 >> smartmontools: 6.5+svn4324-1 >> spiceterm: 3.0-5 >> vncterm: 1.5-3 >> >> >> Thanks a lot >> Eneko >> >> -- >> Zuzendari Teknikoa / Director T?cnico >> Binovo IT Human Project, S.L. >> Telf. 943569206 >> Astigarragako bidea 2 >> , >> 2? izq. oficina 11; 20180 Oiartzun (Gipuzkoa) >> www.binovo.es >> >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user -- Zuzendari Teknikoa / Director T?cnico Binovo IT Human Project, S.L. Telf. 943569206 Astigarragako bidea 2, 2? izq. oficina 11; 20180 Oiartzun (Gipuzkoa) www.binovo.es From t.marx at proxmox.com Tue Jan 28 10:28:40 2020 From: t.marx at proxmox.com (Tim Marx) Date: Tue, 28 Jan 2020 10:28:40 +0100 (CET) Subject: [PVE-User] PVE 5.4 - resize a NFS disk truncated it In-Reply-To: <6a0fb5ae-4131-64e6-dbf7-590424a09581@binovo.es> References: <6a0fb5ae-4131-64e6-dbf7-590424a09581@binovo.es> Message-ID: <128523840.18.1580203720465@webmail.proxmox.com> Just for reference, this was fixed approximately 5 months ago in recent PVE 6 versions. https://git.proxmox.com/?p=qemu-server.git;a=commit;h=f8b829aabae2fdc8bdd9ace741bbef3598b892f2 > Eneko Lacunza hat am 28. Januar 2020 09:26 geschrieben: > > > Hi all, > > We have a PVE 5.4 cluster (details below), with a Synology DS1819+ NFS > server for storing file backups. > > The setup is as follows: > > - Debian 9 VM with 2 disks; system disk con Ceph RBD, file backup data > disk on NFS (6,5TB) > - NFS storage on Synology NAS. > > Backup disk was getting full, so we issued a resize disk from web GUI, > with a 500GB size increment. > > GUI reported timeout with 500 error. After that, disk was shown > incremented at VM hardware view, but storage showed the disk having > 500GB. VM showed the block device having 500GB (partiton was 6,5TB yet). > > We have tried incrementing the disk back to 7TB, but disk/partition is > corrupt (It was a bad idea, but its too late now). > > Any idea to recover lost data on NFS server? :) > > We have done this operation hundreds of times without issues, has anyone > had such a catastrophic experience? 
> > # pveversion -v > proxmox-ve: 5.4-1 (running kernel: 4.15.18-12-pve) > pve-manager: 5.4-3 (running version: 5.4-3/0a6eaa62) > pve-kernel-4.15: 5.3-3 > pve-kernel-4.15.18-12-pve: 4.15.18-35 > pve-kernel-4.15.18-7-pve: 4.15.18-27 > pve-kernel-4.4.134-1-pve: 4.4.134-112 > ceph: 12.2.11-pve1 > corosync: 2.4.4-pve1 > criu: 2.11.1-1~bpo90 > glusterfs-client: 3.8.8-1 > ksm-control-daemon: 1.2-2 > libjs-extjs: 6.0.1-2 > libpve-access-control: 5.1-8 > libpve-apiclient-perl: 2.0-5 > libpve-common-perl: 5.0-50 > libpve-guest-common-perl: 2.0-20 > libpve-http-server-perl: 2.0-13 > libpve-storage-perl: 5.0-41 > libqb0: 1.0.3-1~bpo9 > lvm2: 2.02.168-pve6 > lxc-pve: 3.1.0-3 > lxcfs: 3.0.3-pve1 > novnc-pve: 1.0.0-3 > proxmox-widget-toolkit: 1.0-25 > pve-cluster: 5.0-36 > pve-container: 2.0-37 > pve-docs: 5.4-2 > pve-edk2-firmware: 1.20190312-1 > pve-firewall: 3.0-19 > pve-firmware: 2.0-6 > pve-ha-manager: 2.0-9 > pve-i18n: 1.1-4 > pve-libspice-server1: 0.14.1-2 > pve-qemu-kvm: 2.12.1-3 > pve-xtermjs: 3.12.0-1 > qemu-server: 5.0-50 > smartmontools: 6.5+svn4324-1 > spiceterm: 3.0-5 > vncterm: 1.5-3 > > > Thanks a lot > Eneko > > -- > Zuzendari Teknikoa / Director T?cnico > Binovo IT Human Project, S.L. > Telf. 943569206 > Astigarragako bidea 2, 2? izq. oficina 11; 20180 Oiartzun (Gipuzkoa) > www.binovo.es > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From dor at volz.ua Tue Jan 28 10:40:04 2020 From: dor at volz.ua (Dmytro O. Redchuk) Date: Tue, 28 Jan 2020 11:40:04 +0200 Subject: Config/Status commands stopped to respond Message-ID: <20200128094004.GA7342@volz.ua> Hi masters, I am running two-nodes cluster (PM v.5.3), and today I've found that one node stopped to respond to config/status commands --- VMs in GUI are gray and marked with question mark, commands like "pvecm status" or "qm list" hung (until ^C). So, I can login with ssh into that node, and all VMs seem to be working fine. Please, is it possible to get it working without any VMs/node restart? What have I do? Could not find (or missed) anything useful in logfiles. Thank you! -- Dmytro O. Redchuk From gianni.milo22 at gmail.com Tue Jan 28 11:26:41 2020 From: gianni.milo22 at gmail.com (Gianni Milo) Date: Tue, 28 Jan 2020 10:26:41 +0000 Subject: [PVE-User] Config/Status commands stopped to respond In-Reply-To: References: Message-ID: First thing that comes to my mind when having only 2 nodes in the cluster is that perhaps the cluster is not quorate ? I would check that first and maybe restart the related services... G. On Tue, 28 Jan 2020 at 09:40, Dmytro O. Redchuk via pve-user < pve-user at pve.proxmox.com> wrote: > > > > ---------- Forwarded message ---------- > From: "Dmytro O. Redchuk" > To: pve-user at pve.proxmox.com > Cc: > Bcc: > Date: Tue, 28 Jan 2020 11:40:04 +0200 > Subject: Config/Status commands stopped to respond > Hi masters, > > I am running two-nodes cluster (PM v.5.3), > and today I've found that one node stopped to respond to config/status > commands --- VMs in GUI are gray and marked with question mark, > commands like "pvecm status" or "qm list" hung (until ^C). > > So, I can login with ssh into that node, > and all VMs seem to be working fine. > > Please, is it possible to get it working without any VMs/node restart? > > What have I do? > > Could not find (or missed) anything useful in logfiles. > > Thank you! > > -- > Dmytro O. Redchuk > > > > ---------- Forwarded message ---------- > From: "Dmytro O. 
Redchuk via pve-user" > To: pve-user at pve.proxmox.com > Cc: "Dmytro O. Redchuk" > Bcc: > Date: Tue, 28 Jan 2020 11:40:04 +0200 > Subject: [PVE-User] Config/Status commands stopped to respond > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From dor at volz.ua Tue Jan 28 11:35:08 2020 From: dor at volz.ua (Dmytro O. Redchuk) Date: Tue, 28 Jan 2020 12:35:08 +0200 Subject: [PVE-User] Config/Status commands stopped to respond In-Reply-To: References: Message-ID: <20200128103508.GB7739@volz.ua> ? ??., 28-?? ???. 2020, ? 10:26 Gianni Milo wrote: > First thing that comes to my mind when having only 2 nodes in the cluster > is that perhaps the cluster is not quorate ? I would check that first and First node's corosync reports ok for quorum: root at nd1:~# pvecm status Quorum information ------------------ Date: Tue Jan 28 12:29:52 2020 Quorum provider: corosync_votequorum Nodes: 2 Node ID: 0x00000001 Ring ID: 1/36 Quorate: Yes Votequorum information ---------------------- Expected votes: 2 Highest expected: 2 Total votes: 2 Quorum: 2 Flags: Quorate Membership information ---------------------- Nodeid Votes Name 0x00000001 1 10.24.0.1 (local) 0x00000002 1 10.24.0.2 So, for the moment, I did the following on that "partially failed" node (their logs has been empty for today): 1. systemctl restart pvedaemon.service -- ok, status is OK 2. systemctl restart pveproxy.service -- ok, status is OK 3. systemctl restart pvestatd.service --FAILED, timeout in the log: Jan 28 12:16:04 nd2 systemd[1]: pvestatd.service: Start operation timed out. Terminating. Is it be because of some hunged node/container/process or dead lock file? What else that could be? Thank you! > maybe restart the related services... > > G. > > > On Tue, 28 Jan 2020 at 09:40, Dmytro O. Redchuk via pve-user < > pve-user at pve.proxmox.com> wrote: > > > > > > > > > ---------- Forwarded message ---------- > > From: "Dmytro O. Redchuk" > > To: pve-user at pve.proxmox.com > > Cc: > > Bcc: > > Date: Tue, 28 Jan 2020 11:40:04 +0200 > > Subject: Config/Status commands stopped to respond > > Hi masters, > > > > I am running two-nodes cluster (PM v.5.3), > > and today I've found that one node stopped to respond to config/status > > commands --- VMs in GUI are gray and marked with question mark, > > commands like "pvecm status" or "qm list" hung (until ^C). > > > > So, I can login with ssh into that node, > > and all VMs seem to be working fine. > > > > Please, is it possible to get it working without any VMs/node restart? > > > > What have I do? > > > > Could not find (or missed) anything useful in logfiles. > > > > Thank you! > > > > -- > > Dmytro O. Redchuk > > > > > > > > ---------- Forwarded message ---------- > > From: "Dmytro O. Redchuk via pve-user" > > To: pve-user at pve.proxmox.com > > Cc: "Dmytro O. Redchuk" > > Bcc: > > Date: Tue, 28 Jan 2020 11:40:04 +0200 > > Subject: [PVE-User] Config/Status commands stopped to respond > > _______________________________________________ > > pve-user mailing list > > pve-user at pve.proxmox.com > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user -- Dmytro O. 
Redchuk (+380) 44 2474832 From gianni.milo22 at gmail.com Tue Jan 28 12:13:14 2020 From: gianni.milo22 at gmail.com (Gianni Milo) Date: Tue, 28 Jan 2020 11:13:14 +0000 Subject: [PVE-User] Config/Status commands stopped to respond In-Reply-To: References: Message-ID: What's the output of 'journalctl -u pvestatd.service" ? How about 'pvesm status' ? On Tue, 28 Jan 2020 at 10:35, Dmytro O. Redchuk via pve-user < pve-user at pve.proxmox.com> wrote: > > > > ---------- Forwarded message ---------- > From: "Dmytro O. Redchuk" > To: PVE User List > Cc: > Bcc: > Date: Tue, 28 Jan 2020 12:35:08 +0200 > Subject: Re: [PVE-User] Config/Status commands stopped to respond > ? ??., 28-?? ???. 2020, ? 10:26 Gianni Milo wrote: > > First thing that comes to my mind when having only 2 nodes in the cluster > > is that perhaps the cluster is not quorate ? I would check that first and > First node's corosync reports ok for quorum: > > root at nd1:~# pvecm status > Quorum information > ------------------ > Date: Tue Jan 28 12:29:52 2020 > Quorum provider: corosync_votequorum > Nodes: 2 > Node ID: 0x00000001 > Ring ID: 1/36 > Quorate: Yes > > Votequorum information > ---------------------- > Expected votes: 2 > Highest expected: 2 > Total votes: 2 > Quorum: 2 > Flags: Quorate > > Membership information > ---------------------- > Nodeid Votes Name > 0x00000001 1 10.24.0.1 (local) > 0x00000002 1 10.24.0.2 > > > So, for the moment, I did the following on that "partially failed" node > (their logs has been empty for today): > > 1. systemctl restart pvedaemon.service -- ok, status is OK > 2. systemctl restart pveproxy.service -- ok, status is OK > 3. systemctl restart pvestatd.service --FAILED, timeout in the log: > Jan 28 12:16:04 nd2 systemd[1]: pvestatd.service: Start operation timed > out. Terminating. > > > Is it be because of some hunged node/container/process or dead lock file? > > What else that could be? > > Thank you! > > > > maybe restart the related services... > > > > G. > > > > > > On Tue, 28 Jan 2020 at 09:40, Dmytro O. Redchuk via pve-user < > > pve-user at pve.proxmox.com> wrote: > > > > > > > > > > > > > > ---------- Forwarded message ---------- > > > From: "Dmytro O. Redchuk" > > > To: pve-user at pve.proxmox.com > > > Cc: > > > Bcc: > > > Date: Tue, 28 Jan 2020 11:40:04 +0200 > > > Subject: Config/Status commands stopped to respond > > > Hi masters, > > > > > > I am running two-nodes cluster (PM v.5.3), > > > and today I've found that one node stopped to respond to config/status > > > commands --- VMs in GUI are gray and marked with question mark, > > > commands like "pvecm status" or "qm list" hung (until ^C). > > > > > > So, I can login with ssh into that node, > > > and all VMs seem to be working fine. > > > > > > Please, is it possible to get it working without any VMs/node restart? > > > > > > What have I do? > > > > > > Could not find (or missed) anything useful in logfiles. > > > > > > Thank you! > > > > > > -- > > > Dmytro O. Redchuk > > > > > > > > > > > > ---------- Forwarded message ---------- > > > From: "Dmytro O. Redchuk via pve-user" > > > To: pve-user at pve.proxmox.com > > > Cc: "Dmytro O. 
Redchuk" > > > Bcc: > > > Date: Tue, 28 Jan 2020 11:40:04 +0200 > > > Subject: [PVE-User] Config/Status commands stopped to respond > > > _______________________________________________ > > > pve-user mailing list > > > pve-user at pve.proxmox.com > > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > > > _______________________________________________ > > pve-user mailing list > > pve-user at pve.proxmox.com > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > -- > Dmytro O. Redchuk > (+380) 44 2474832 > > > > ---------- Forwarded message ---------- > From: "Dmytro O. Redchuk via pve-user" > To: PVE User List > Cc: "Dmytro O. Redchuk" > Bcc: > Date: Tue, 28 Jan 2020 12:35:08 +0200 > Subject: Re: [PVE-User] Config/Status commands stopped to respond > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From dor at volz.ua Tue Jan 28 12:31:35 2020 From: dor at volz.ua (Dmytro O. Redchuk) Date: Tue, 28 Jan 2020 13:31:35 +0200 Subject: [PVE-User] Config/Status commands stopped to respond In-Reply-To: References: Message-ID: <20200128113135.GD7739@volz.ua> ? ??., 28-?? ???. 2020, ? 11:13 Gianni Milo wrote: > What's the output of 'journalctl -u pvestatd.service" ? How about 'pvesm > status' ? Sorry, here is it: root at nd2:~# journalctl -u pvestatd.service -- Logs begin at Sun 2019-11-03 02:00:04 EET, end at Tue 2020-01-28 13:17:01 EET. -- Jan 28 12:09:03 nd2 systemd[1]: Stopping PVE Status Daemon... Jan 28 12:10:33 nd2 systemd[1]: pvestatd.service: Stopping timed out. Terminating. Jan 28 12:12:03 nd2 systemd[1]: pvestatd.service: State 'stop-sigterm' timed out. Killing. Jan 28 12:12:03 nd2 systemd[1]: pvestatd.service: Killing process 2738 (pvestatd) with signal SIGKILL. Jan 28 12:12:03 nd2 systemd[1]: pvestatd.service: Main process exited, code=killed, status=9/KILL Jan 28 12:12:03 nd2 systemd[1]: Stopped PVE Status Daemon. Jan 28 12:12:03 nd2 systemd[1]: pvestatd.service: Unit entered failed state. Jan 28 12:12:03 nd2 systemd[1]: pvestatd.service: Failed with result 'timeout'. Jan 28 12:12:03 nd2 systemd[1]: Starting PVE Status Daemon... Jan 28 12:13:34 nd2 systemd[1]: pvestatd.service: Start operation timed out. Terminating. Jan 28 12:13:34 nd2 systemd[1]: Failed to start PVE Status Daemon. Jan 28 12:13:34 nd2 systemd[1]: pvestatd.service: Unit entered failed state. Jan 28 12:13:34 nd2 systemd[1]: pvestatd.service: Failed with result 'timeout'. Jan 28 12:14:34 nd2 systemd[1]: Starting PVE Status Daemon... Jan 28 12:16:04 nd2 systemd[1]: pvestatd.service: Start operation timed out. Terminating. Jan 28 12:16:04 nd2 systemd[1]: Failed to start PVE Status Daemon. Jan 28 12:16:04 nd2 systemd[1]: pvestatd.service: Unit entered failed state. Jan 28 12:16:04 nd2 systemd[1]: pvestatd.service: Failed with result 'timeout'. I've tried to restart, then restarted pvedaemon and pveproxy, and tried to start pvestatd again. pvesm hungs too, "zfs list" works ok. And all VMs seem to be working, no any issue with their disks. At least it looks like this. > On Tue, 28 Jan 2020 at 10:35, Dmytro O. Redchuk via pve-user < > pve-user at pve.proxmox.com> wrote: > > > > > > > > > ---------- Forwarded message ---------- > > From: "Dmytro O. Redchuk" > > To: PVE User List > > Cc: > > Bcc: > > Date: Tue, 28 Jan 2020 12:35:08 +0200 > > Subject: Re: [PVE-User] Config/Status commands stopped to respond > > ? ??., 28-?? ???. 2020, ? 
10:26 Gianni Milo wrote: > > > First thing that comes to my mind when having only 2 nodes in the cluster > > > is that perhaps the cluster is not quorate ? I would check that first and > > First node's corosync reports ok for quorum: > > > > root at nd1:~# pvecm status > > Quorum information > > ------------------ > > Date: Tue Jan 28 12:29:52 2020 > > Quorum provider: corosync_votequorum > > Nodes: 2 > > Node ID: 0x00000001 > > Ring ID: 1/36 > > Quorate: Yes > > > > Votequorum information > > ---------------------- > > Expected votes: 2 > > Highest expected: 2 > > Total votes: 2 > > Quorum: 2 > > Flags: Quorate > > > > Membership information > > ---------------------- > > Nodeid Votes Name > > 0x00000001 1 10.24.0.1 (local) > > 0x00000002 1 10.24.0.2 > > > > > > So, for the moment, I did the following on that "partially failed" node > > (their logs has been empty for today): > > > > 1. systemctl restart pvedaemon.service -- ok, status is OK > > 2. systemctl restart pveproxy.service -- ok, status is OK > > 3. systemctl restart pvestatd.service --FAILED, timeout in the log: > > Jan 28 12:16:04 nd2 systemd[1]: pvestatd.service: Start operation timed > > out. Terminating. > > > > > > Is it be because of some hunged node/container/process or dead lock file? > > > > What else that could be? > > > > Thank you! > > > > > > > maybe restart the related services... > > > > > > G. > > > > > > > > > On Tue, 28 Jan 2020 at 09:40, Dmytro O. Redchuk via pve-user < > > > pve-user at pve.proxmox.com> wrote: > > > > > > > > > > > > > > > > > > > ---------- Forwarded message ---------- > > > > From: "Dmytro O. Redchuk" > > > > To: pve-user at pve.proxmox.com > > > > Cc: > > > > Bcc: > > > > Date: Tue, 28 Jan 2020 11:40:04 +0200 > > > > Subject: Config/Status commands stopped to respond > > > > Hi masters, > > > > > > > > I am running two-nodes cluster (PM v.5.3), > > > > and today I've found that one node stopped to respond to config/status > > > > commands --- VMs in GUI are gray and marked with question mark, > > > > commands like "pvecm status" or "qm list" hung (until ^C). > > > > > > > > So, I can login with ssh into that node, > > > > and all VMs seem to be working fine. > > > > > > > > Please, is it possible to get it working without any VMs/node restart? > > > > > > > > What have I do? > > > > > > > > Could not find (or missed) anything useful in logfiles. > > > > > > > > Thank you! > > > > > > > > -- > > > > Dmytro O. Redchuk > > > > > > > > > > > > > > > > ---------- Forwarded message ---------- > > > > From: "Dmytro O. Redchuk via pve-user" > > > > To: pve-user at pve.proxmox.com > > > > Cc: "Dmytro O. Redchuk" > > > > Bcc: > > > > Date: Tue, 28 Jan 2020 11:40:04 +0200 > > > > Subject: [PVE-User] Config/Status commands stopped to respond > > > > _______________________________________________ > > > > pve-user mailing list > > > > pve-user at pve.proxmox.com > > > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > > > > > _______________________________________________ > > > pve-user mailing list > > > pve-user at pve.proxmox.com > > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > > -- > > Dmytro O. Redchuk > > (+380) 44 2474832 > > > > > > > > ---------- Forwarded message ---------- > > From: "Dmytro O. Redchuk via pve-user" > > To: PVE User List > > Cc: "Dmytro O. 
Redchuk" > > Bcc: > > Date: Tue, 28 Jan 2020 12:35:08 +0200 > > Subject: Re: [PVE-User] Config/Status commands stopped to respond > > _______________________________________________ > > pve-user mailing list > > pve-user at pve.proxmox.com > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user -- Dmytro O. Redchuk (+380) 44 2474832 From elacunza at binovo.es Tue Jan 28 13:27:41 2020 From: elacunza at binovo.es (Eneko Lacunza) Date: Tue, 28 Jan 2020 13:27:41 +0100 Subject: [PVE-User] PVE 5.4 - resize a NFS disk truncated it In-Reply-To: <128523840.18.1580203720465@webmail.proxmox.com> References: <6a0fb5ae-4131-64e6-dbf7-590424a09581@binovo.es> <128523840.18.1580203720465@webmail.proxmox.com> Message-ID: Thanks a lot Tim, at least we know it was a bug and was fixed, for the next time. :-) El 28/1/20 a las 10:28, Tim Marx escribi?: > Just for reference, this was fixed approximately 5 months ago in recent PVE 6 versions. > https://git.proxmox.com/?p=qemu-server.git;a=commit;h=f8b829aabae2fdc8bdd9ace741bbef3598b892f2 > >> Eneko Lacunza hat am 28. Januar 2020 09:26 geschrieben: >> >> >> Hi all, >> >> We have a PVE 5.4 cluster (details below), with a Synology DS1819+ NFS >> server for storing file backups. >> >> The setup is as follows: >> >> - Debian 9 VM with 2 disks; system disk con Ceph RBD, file backup data >> disk on NFS (6,5TB) >> - NFS storage on Synology NAS. >> >> Backup disk was getting full, so we issued a resize disk from web GUI, >> with a 500GB size increment. >> >> GUI reported timeout with 500 error. After that, disk was shown >> incremented at VM hardware view, but storage showed the disk having >> 500GB. VM showed the block device having 500GB (partiton was 6,5TB yet). >> >> We have tried incrementing the disk back to 7TB, but disk/partition is >> corrupt (It was a bad idea, but its too late now). >> >> Any idea to recover lost data on NFS server? :) >> >> We have done this operation hundreds of times without issues, has anyone >> had such a catastrophic experience? >> >> # pveversion -v >> proxmox-ve: 5.4-1 (running kernel: 4.15.18-12-pve) >> pve-manager: 5.4-3 (running version: 5.4-3/0a6eaa62) >> pve-kernel-4.15: 5.3-3 >> pve-kernel-4.15.18-12-pve: 4.15.18-35 >> pve-kernel-4.15.18-7-pve: 4.15.18-27 >> pve-kernel-4.4.134-1-pve: 4.4.134-112 >> ceph: 12.2.11-pve1 >> corosync: 2.4.4-pve1 >> criu: 2.11.1-1~bpo90 >> glusterfs-client: 3.8.8-1 >> ksm-control-daemon: 1.2-2 >> libjs-extjs: 6.0.1-2 >> libpve-access-control: 5.1-8 >> libpve-apiclient-perl: 2.0-5 >> libpve-common-perl: 5.0-50 >> libpve-guest-common-perl: 2.0-20 >> libpve-http-server-perl: 2.0-13 >> libpve-storage-perl: 5.0-41 >> libqb0: 1.0.3-1~bpo9 >> lvm2: 2.02.168-pve6 >> lxc-pve: 3.1.0-3 >> lxcfs: 3.0.3-pve1 >> novnc-pve: 1.0.0-3 >> proxmox-widget-toolkit: 1.0-25 >> pve-cluster: 5.0-36 >> pve-container: 2.0-37 >> pve-docs: 5.4-2 >> pve-edk2-firmware: 1.20190312-1 >> pve-firewall: 3.0-19 >> pve-firmware: 2.0-6 >> pve-ha-manager: 2.0-9 >> pve-i18n: 1.1-4 >> pve-libspice-server1: 0.14.1-2 >> pve-qemu-kvm: 2.12.1-3 >> pve-xtermjs: 3.12.0-1 >> qemu-server: 5.0-50 >> smartmontools: 6.5+svn4324-1 >> spiceterm: 3.0-5 >> vncterm: 1.5-3 >> >> >> Thanks a lot >> Eneko >> >> -- >> Zuzendari Teknikoa / Director T?cnico >> Binovo IT Human Project, S.L. >> Telf. 943569206 >> Astigarragako bidea 2, 2? izq. 
oficina 11; 20180 Oiartzun (Gipuzkoa)
>> www.binovo.es
>>
>> _______________________________________________
>> pve-user mailing list
>> pve-user at pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarragako bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

From dor at volz.ua Tue Jan 28 22:01:26 2020
From: dor at volz.ua (Dmytro O. Redchuk)
Date: Tue, 28 Jan 2020 23:01:26 +0200
Subject: Re: [PVE-User] Config/Status commands stopped to respond
In-Reply-To: <8568CB4F-3E98-4183-BC27-44E1C556EF03@elchaka.de>
References: <8568CB4F-3E98-4183-BC27-44E1C556EF03@elchaka.de>
Message-ID: <20200128210126.GA14399@volz.ua>

On Tue, 28 Jan 2020, at 20:35, ceph at elchaka.de wrote:
> Hmmm... perhaps some issue with a storage defined in storage.cfg?
>
> You could check each of your configured storage devices.
>
> ... does "df -h" work?
Yes, it works:

root at nd2:~# df -h
Filesystem                   Size  Used Avail Use% Mounted on
udev                          48G     0   48G   0% /dev
tmpfs                        9.5G  927M  8.6G  10% /run
/dev/mapper/pve-root         227G  149G   68G  69% /
tmpfs                         48G   63M   48G   1% /dev/shm
tmpfs                        5.0M     0  5.0M   0% /run/lock
tmpfs                         48G     0   48G   0% /sys/fs/cgroup
local-zfs                    1.2T  128K  1.2T   1% /local-zfs
/dev/fuse                     30M   32K   30M   1% /etc/pve
local-zfs/subvol-106-disk-0  8.0G  446M  7.6G   6% /local-zfs/subvol-106-disk-0
local-zfs/subvol-105-disk-0  8.0G  451M  7.6G   6% /local-zfs/subvol-105-disk-0
local-zfs/subvol-107-disk-0  8.0G  446M  7.6G   6% /local-zfs/subvol-107-disk-0
local-zfs/subvol-108-disk-0  8.0G  446M  7.6G   6% /local-zfs/subvol-108-disk-0
tmpfs                        9.5G     0  9.5G   0% /run/user/0
tmpfs                        9.5G     0  9.5G   0% /run/user/1000

However any ls or cat on /etc/pve does not respond.

What can I do here?

Thanks.

> What about dmesg?
Alomst nothing, the only line for today:
[Tue Jan 28 01:59:29 2020] audit: type=1400 audit(1580169664.199:130): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-105_" name="/" pid=9570 comm="(ogrotate)" flags="rw, rslave"

> Am 28. Januar 2020 12:31:35 MEZ schrieb "Dmytro O. Redchuk via pve-user" :
> >_______________________________________________
> >pve-user mailing list
> >pve-user at pve.proxmox.com
> >https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

--
Dmytro O. Redchuk
(+380) 44 2474832

From proxmox at elchaka.de Tue Jan 28 22:37:02 2020
From: proxmox at elchaka.de (proxmox at elchaka.de)
Date: Tue, 28 Jan 2020 22:37:02 +0100
Subject: Re: [PVE-User] Config/Status commands stopped to respond
In-Reply-To: <20200128210126.GA14399@volz.ua>
References: <8568CB4F-3E98-4183-BC27-44E1C556EF03@elchaka.de> <20200128210126.GA14399@volz.ua>
Message-ID:

Am 28. Januar 2020 22:01:26 MEZ schrieb "Dmytro O. Redchuk" :
>On Tue, 28 Jan 2020, at 20:35, ceph at elchaka.de wrote:
>> Hmmm... perhaps some issue with a storage defined in storage.cfg?
>>
>> You could check each of your configured storage devices.
>>
>> ... does "df -h" work?
>Yes, it works: > >root at nd2:~# df -h >Filesystem Size Used Avail Use% Mounted on >udev 48G 0 48G 0% /dev >tmpfs 9.5G 927M 8.6G 10% /run >/dev/mapper/pve-root 227G 149G 68G 69% / >tmpfs 48G 63M 48G 1% /dev/shm >tmpfs 5.0M 0 5.0M 0% /run/lock >tmpfs 48G 0 48G 0% /sys/fs/cgroup >local-zfs 1.2T 128K 1.2T 1% /local-zfs >/dev/fuse 30M 32K 30M 1% /etc/pve >local-zfs/subvol-106-disk-0 8.0G 446M 7.6G 6% >/local-zfs/subvol-106-disk-0 >local-zfs/subvol-105-disk-0 8.0G 451M 7.6G 6% >/local-zfs/subvol-105-disk-0 >local-zfs/subvol-107-disk-0 8.0G 446M 7.6G 6% >/local-zfs/subvol-107-disk-0 >local-zfs/subvol-108-disk-0 8.0G 446M 7.6G 6% >/local-zfs/subvol-108-disk-0 >tmpfs 9.5G 0 9.5G 0% /run/user/0 >tmpfs 9.5G 0 9.5G 0% /run/user/1000 > >However any ls or cat on /etc/pve does not respond. > >What can I do here? You can put your node in local Mode pmxcfs -l I guess you have a network issue here between your pve nodes and should search for "proxmox cluster troubleshooting" Hth >Thanks. > >> What about dmesg? >Alomst nothing, the only line for today: >[Tue Jan 28 01:59:29 2020] audit: type=1400 audit(1580169664.199:130): >apparmor="DENIED" operation="mount" info="failed flags match" error=-13 >profile="lxc-105_" name="/" pid=9570 comm="(ogrotate)" >flags="rw, rslave" > > >> >> Am 28. Januar 2020 12:31:35 MEZ schrieb "Dmytro O. Redchuk via >pve-user" : >> >_______________________________________________ >> >pve-user mailing list >> >pve-user at pve.proxmox.com >> >https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From mark at tuxis.nl Wed Jan 29 03:32:10 2020 From: mark at tuxis.nl (Mark Schouten) Date: Wed, 29 Jan 2020 03:32:10 +0100 Subject: [PVE-User] External Ceph cluster for PVE6.1-5 Message-ID: Hi, We just upgraded one of our clusters to PVE 6.1-5. It's not hyperconverged, so Ceph is running on an external cluster. That cluster runs Luminous, and we installed the Nautilus client on the proxmox-cluster. I can't find any documentation if this is supported or not. Is it OK to have a Nautilus client on Proxmox for an external Luminous cluster, or will that break stuff? Thanks. -- Mark Schouten Tuxis, Ede, https://www.tuxis.nl T: +31 318 200208? ? From a.antreich at proxmox.com Wed Jan 29 09:23:53 2020 From: a.antreich at proxmox.com (Alwin Antreich) Date: Wed, 29 Jan 2020 09:23:53 +0100 Subject: [PVE-User] External Ceph cluster for PVE6.1-5 In-Reply-To: References: Message-ID: <20200129082353.GG99328@dona.proxmox.com> Hi Mark, On Wed, Jan 29, 2020 at 03:32:10AM +0100, Mark Schouten wrote: > > Hi, > > We just upgraded one of our clusters to PVE 6.1-5. It's not hyperconverged, so Ceph is running on an external cluster. That cluster runs Luminous, and we installed the Nautilus client on the proxmox-cluster. I can't find any documentation if this is supported or not. The stock Ceph version on PVE 6 is Luminous. ;) > > > Is it OK to have a Nautilus client on Proxmox for an external Luminous cluster, or will that break stuff? Yes, the client should work one LTS version apart. Luminous <- Nautilus -> Octopus. -- Cheers, Alwin From dor at volz.ua Wed Jan 29 10:19:31 2020 From: dor at volz.ua (Dmytro O. Redchuk) Date: Wed, 29 Jan 2020 11:19:31 +0200 Subject: [PVE-User] Config/Status commands stopped to respond In-Reply-To: References: <8568CB4F-3E98-4183-BC27-44E1C556EF03@elchaka.de> <20200128210126.GA14399@volz.ua> Message-ID: <20200129091931.GM7739@volz.ua> ? ??., 28-?? ???. 2020, ? 22:37 proxmox at elchaka.de wrote: > >/dev/fuse 30M 32K 30M 1% /etc/pve [...] 
> >However any ls or cat on /etc/pve does not respond. > > > >What can I do here? > > You can put your node in local Mode > > > pmxcfs -l > > I guess you have a network issue here between your pve nodes and should search for "proxmox cluster troubleshooting" No any issue right now (possibly it has been in the past, not sure), corosync reports cluster is online, both nodes. Ok, thank you, I will search for "proxmox cluster troubleshooting". -- Dmytro O. Redchuk From mark at tuxis.nl Wed Jan 29 13:45:51 2020 From: mark at tuxis.nl (Mark Schouten) Date: Wed, 29 Jan 2020 13:45:51 +0100 Subject: [PVE-User] External Ceph cluster for PVE6.1-5 In-Reply-To: <20200129082353.GG99328@dona.proxmox.com> References: <20200129082353.GG99328@dona.proxmox.com> Message-ID: <20200129124551.ibrglxlylqcvf2hr@shell.tuxis.net> On Wed, Jan 29, 2020 at 09:23:53AM +0100, Alwin Antreich wrote: > > We just upgraded one of our clusters to PVE 6.1-5. It's not hyperconverged, so Ceph is running on an external cluster. That cluster runs Luminous, and we installed the Nautilus client on the proxmox-cluster. I can't find any documentation if this is supported or not. > The stock Ceph version on PVE 6 is Luminous. ;) https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_6.1 and the installed packages suggest otherwise: root at proxmox2-1:~# dpkg -l | grep ceph ii ceph-common 14.2.6-pve1 > > Is it OK to have a Nautilus client on Proxmox for an external Luminous cluster, or will that break stuff? > Yes, the client should work one LTS version apart. > Luminous <- Nautilus -> Octopus. I know Ceph does that. But I did have lost data in a PVE4 to PVE5 upgrade where Ceph was not yet upgraded and images got shrunk because Ceph Hammer behaved differently from Ceph Luminous. I am wondering if that is the case here too. -- Mark Schouten | Tuxis B.V. KvK: 74698818 | http://www.tuxis.nl/ T: +31 318 200208 | info at tuxis.nl From a.antreich at proxmox.com Wed Jan 29 14:26:36 2020 From: a.antreich at proxmox.com (Alwin Antreich) Date: Wed, 29 Jan 2020 14:26:36 +0100 Subject: [PVE-User] External Ceph cluster for PVE6.1-5 In-Reply-To: <20200129124551.ibrglxlylqcvf2hr@shell.tuxis.net> References: <20200129082353.GG99328@dona.proxmox.com> <20200129124551.ibrglxlylqcvf2hr@shell.tuxis.net> Message-ID: <20200129132636.GI99328@dona.proxmox.com> On Wed, Jan 29, 2020 at 01:45:51PM +0100, Mark Schouten wrote: > On Wed, Jan 29, 2020 at 09:23:53AM +0100, Alwin Antreich wrote: > > > We just upgraded one of our clusters to PVE 6.1-5. It's not hyperconverged, so Ceph is running on an external cluster. That cluster runs Luminous, and we installed the Nautilus client on the proxmox-cluster. I can't find any documentation if this is supported or not. > > The stock Ceph version on PVE 6 is Luminous. ;) > > https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_6.1 and the installed > packages suggest otherwise: > > root at proxmox2-1:~# dpkg -l | grep ceph > ii ceph-common 14.2.6-pve1 On a fresh Proxmox VE 6 install, the package is Luminous. It's the current version in Debian Buster. After using pveceph install (or GUI), the Nautilus packages are installed. > > > > Is it OK to have a Nautilus client on Proxmox for an external Luminous cluster, or will that break stuff? > > Yes, the client should work one LTS version apart. > > Luminous <- Nautilus -> Octopus. > > I know Ceph does that. But I did have lost data in a PVE4 to PVE5 > upgrade where Ceph was not yet upgraded and images got shrunk because > Ceph Hammer behaved differently from Ceph Luminous. 
I am wondering if > that is the case here too. Not to my knowledge. Maybe someone else has more insight. You might also want to consider posting on the ceph-user list. -- Cheers, Alwin From f.cuseo at panservice.it Thu Jan 30 12:46:16 2020 From: f.cuseo at panservice.it (Fabrizio Cuseo) Date: Thu, 30 Jan 2020 12:46:16 +0100 (CET) Subject: [PVE-User] RBD Storage from 6.1 to 3.4 (or 4.4) Message-ID: <386611146.78076.1580384776302.JavaMail.zimbra@zimbra.panservice.it> I have installed a new cluster with the last release, with a local ceph storage. I also have 2 old and smaller clusters, and I need to migrate all the VMs to the new cluster. The best method i have used in past is to add on the NEW cluster the RBD storage of the old cluster, so I can stop the VM, move the .cfg file, start the vm (all those operations are really quick), and move the disk (online) from the old storage to the new storage. But now, if I add the RBD storage, copying the keyring file of the old cluster to the new cluster, naming as the storage ID, and using the old cluster monitors IP, i can see the storage summary (space total and used), but when I go to "content", i have this error: "rbd error: rbd: listing images failed: (95) Operation not supported (500)". If, from the new cluster CLI, i use the command: rbd -k /etc/pve/priv/ceph/CephOLD.keyring -m 172.16.20.31 ls rbd2 I can see the list of disk images, but also the error: "librbd::api::Trash: list: error listing rbd trash entries: (95) Operation not supported" The new cluster ceph release is Nautilus, and the old one is firefly. Some idea ? Thanks in advance, Fabrizio From uwe.sauter.de at gmail.com Thu Jan 30 12:51:41 2020 From: uwe.sauter.de at gmail.com (Uwe Sauter) Date: Thu, 30 Jan 2020 12:51:41 +0100 Subject: [PVE-User] RBD Storage from 6.1 to 3.4 (or 4.4) In-Reply-To: <386611146.78076.1580384776302.JavaMail.zimbra@zimbra.panservice.it> References: <386611146.78076.1580384776302.JavaMail.zimbra@zimbra.panservice.it> Message-ID: If you can afford the downtime of the VMS you might be able to migrate the disk images using "rbd export | ncat" and "ncat | rbd import". I haven't tried this with such a great difference of versions but from Proxmox 5.4 to 6.1 this worked without a problem. Regards, Uwe Am 30.01.20 um 12:46 schrieb Fabrizio Cuseo: > > I have installed a new cluster with the last release, with a local ceph storage. > I also have 2 old and smaller clusters, and I need to migrate all the VMs to the new cluster. > The best method i have used in past is to add on the NEW cluster the RBD storage of the old cluster, so I can stop the VM, move the .cfg file, start the vm (all those operations are really quick), and move the disk (online) from the old storage to the new storage. > > But now, if I add the RBD storage, copying the keyring file of the old cluster to the new cluster, naming as the storage ID, and using the old cluster monitors IP, i can see the storage summary (space total and used), but when I go to "content", i have this error: "rbd error: rbd: listing images failed: (95) Operation not supported (500)". > > If, from the new cluster CLI, i use the command: > > rbd -k /etc/pve/priv/ceph/CephOLD.keyring -m 172.16.20.31 ls rbd2 > > I can see the list of disk images, but also the error: "librbd::api::Trash: list: error listing rbd trash entries: (95) Operation not supported" > > > The new cluster ceph release is Nautilus, and the old one is firefly. > > Some idea ? 
> > Thanks in advance, Fabrizio > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From f.cuseo at panservice.it Thu Jan 30 12:59:13 2020 From: f.cuseo at panservice.it (Fabrizio Cuseo) Date: Thu, 30 Jan 2020 12:59:13 +0100 (CET) Subject: [PVE-User] RBD Storage from 6.1 to 3.4 (or 4.4) In-Reply-To: References: <386611146.78076.1580384776302.JavaMail.zimbra@zimbra.panservice.it> Message-ID: <1609854579.79912.1580385553347.JavaMail.zimbra@zimbra.panservice.it> I can't afford the long downtime. With my method, the downtime is only to stop the VM on the old cluster and start on the new; the disk image copy in done online. But my last migration was from 3.4 to 4.4 ----- Il 30-gen-20, alle 12:51, Uwe Sauter uwe.sauter.de at gmail.com ha scritto: > If you can afford the downtime of the VMS you might be able to migrate the disk > images using "rbd export | ncat" and "ncat | rbd > import". > > I haven't tried this with such a great difference of versions but from Proxmox > 5.4 to 6.1 this worked without a problem. > > Regards, > > Uwe > > > Am 30.01.20 um 12:46 schrieb Fabrizio Cuseo: >> >> I have installed a new cluster with the last release, with a local ceph storage. >> I also have 2 old and smaller clusters, and I need to migrate all the VMs to the >> new cluster. >> The best method i have used in past is to add on the NEW cluster the RBD storage >> of the old cluster, so I can stop the VM, move the .cfg file, start the vm (all >> those operations are really quick), and move the disk (online) from the old >> storage to the new storage. >> >> But now, if I add the RBD storage, copying the keyring file of the old cluster >> to the new cluster, naming as the storage ID, and using the old cluster >> monitors IP, i can see the storage summary (space total and used), but when I >> go to "content", i have this error: "rbd error: rbd: listing images failed: >> (95) Operation not supported (500)". >> >> If, from the new cluster CLI, i use the command: >> >> rbd -k /etc/pve/priv/ceph/CephOLD.keyring -m 172.16.20.31 ls rbd2 >> >> I can see the list of disk images, but also the error: "librbd::api::Trash: >> list: error listing rbd trash entries: (95) Operation not supported" >> >> >> The new cluster ceph release is Nautilus, and the old one is firefly. >> >> Some idea ? 
>> >> Thanks in advance, Fabrizio >> >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user -- --- Fabrizio Cuseo - mailto:f.cuseo at panservice.it Direzione Generale - Panservice InterNetWorking Servizi Professionali per Internet ed il Networking Panservice e' associata AIIP - RIPE Local Registry Phone: +39 0773 410020 - Fax: +39 0773 470219 http://www.panservice.it mailto:info at panservice.it Numero verde nazionale: 800 901492 From aderumier at odiso.com Thu Jan 30 13:05:42 2020 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Thu, 30 Jan 2020 13:05:42 +0100 (CET) Subject: [PVE-User] RBD Storage from 6.1 to 3.4 (or 4.4) In-Reply-To: <1609854579.79912.1580385553347.JavaMail.zimbra@zimbra.panservice.it> References: <386611146.78076.1580384776302.JavaMail.zimbra@zimbra.panservice.it> <1609854579.79912.1580385553347.JavaMail.zimbra@zimbra.panservice.it> Message-ID: <439460475.2759662.1580385942479.JavaMail.zimbra@odiso.com> ceph client vs server are generally compatible between 2 or 3 releases. They are no way to make working nautilus or luminous clients with firefly. I think minimum is jewel server for nautilus client. So best way could be to upgrade your old proxmox cluster first. (from 4->6, this can be done easily without downtime) ----- Mail original ----- De: "Fabrizio Cuseo" ?: "uwe sauter de" , "proxmoxve" Envoy?: Jeudi 30 Janvier 2020 12:59:13 Objet: Re: [PVE-User] RBD Storage from 6.1 to 3.4 (or 4.4) I can't afford the long downtime. With my method, the downtime is only to stop the VM on the old cluster and start on the new; the disk image copy in done online. But my last migration was from 3.4 to 4.4 ----- Il 30-gen-20, alle 12:51, Uwe Sauter uwe.sauter.de at gmail.com ha scritto: > If you can afford the downtime of the VMS you might be able to migrate the disk > images using "rbd export | ncat" and "ncat | rbd > import". > > I haven't tried this with such a great difference of versions but from Proxmox > 5.4 to 6.1 this worked without a problem. > > Regards, > > Uwe > > > Am 30.01.20 um 12:46 schrieb Fabrizio Cuseo: >> >> I have installed a new cluster with the last release, with a local ceph storage. >> I also have 2 old and smaller clusters, and I need to migrate all the VMs to the >> new cluster. >> The best method i have used in past is to add on the NEW cluster the RBD storage >> of the old cluster, so I can stop the VM, move the .cfg file, start the vm (all >> those operations are really quick), and move the disk (online) from the old >> storage to the new storage. >> >> But now, if I add the RBD storage, copying the keyring file of the old cluster >> to the new cluster, naming as the storage ID, and using the old cluster >> monitors IP, i can see the storage summary (space total and used), but when I >> go to "content", i have this error: "rbd error: rbd: listing images failed: >> (95) Operation not supported (500)". >> >> If, from the new cluster CLI, i use the command: >> >> rbd -k /etc/pve/priv/ceph/CephOLD.keyring -m 172.16.20.31 ls rbd2 >> >> I can see the list of disk images, but also the error: "librbd::api::Trash: >> list: error listing rbd trash entries: (95) Operation not supported" >> >> >> The new cluster ceph release is Nautilus, and the old one is firefly. >> >> Some idea ? 
>> >> Thanks in advance, Fabrizio
>> >>
>> _______________________________________________
>> pve-user mailing list
>> pve-user at pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

--
---
Fabrizio Cuseo - mailto:f.cuseo at panservice.it
Direzione Generale - Panservice InterNetWorking
Servizi Professionali per Internet ed il Networking
Panservice e' associata AIIP - RIPE Local Registry
Phone: +39 0773 410020 - Fax: +39 0773 470219
http://www.panservice.it mailto:info at panservice.it
Numero verde nazionale: 800 901492

_______________________________________________
pve-user mailing list
pve-user at pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From a.antreich at proxmox.com Thu Jan 30 13:42:34 2020
From: a.antreich at proxmox.com (Alwin Antreich)
Date: Thu, 30 Jan 2020 13:42:34 +0100
Subject: [PVE-User] RBD Storage from 6.1 to 3.4 (or 4.4)
In-Reply-To: <386611146.78076.1580384776302.JavaMail.zimbra@zimbra.panservice.it>
References: <386611146.78076.1580384776302.JavaMail.zimbra@zimbra.panservice.it>
Message-ID: <20200130124234.GJ99328@dona.proxmox.com>

Hello Fabrizio,
On Thu, Jan 30, 2020 at 12:46:16PM +0100, Fabrizio Cuseo wrote:
>
> I have installed a new cluster with the last release, with a local ceph storage.
> I also have 2 old and smaller clusters, and I need to migrate all the VMs to the new cluster.
> The best method i have used in past is to add on the NEW cluster the RBD storage of the old cluster, so I can stop the VM, move the .cfg file, start the vm (all those operations are really quick), and move the disk (online) from the old storage to the new storage.
>
> But now, if I add the RBD storage, copying the keyring file of the old cluster to the new cluster, naming as the storage ID, and using the old cluster monitors IP, i can see the storage summary (space total and used), but when I go to "content", i have this error: "rbd error: rbd: listing images failed: (95) Operation not supported (500)".
>
> If, from the new cluster CLI, i use the command:
>
> rbd -k /etc/pve/priv/ceph/CephOLD.keyring -m 172.16.20.31 ls rbd2
>
> I can see the list of disk images, but also the error: "librbd::api::Trash: list: error listing rbd trash entries: (95) Operation not supported"
>
> The new cluster ceph release is Nautilus, and the old one is firefly.
>
> Some idea ?
As said by others already, there is no direct way. Best OFC would be to do a backup + restore.
But in any case, you will need a shared storage that can be reached by both clusters, eg. like NFS. And watch out, as one cluster can potentially destroy disks from the other cluster on the shared storage.

--
Cheers,
Alwin

From gilberto.nunes32 at gmail.com Thu Jan 30 14:10:26 2020
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Thu, 30 Jan 2020 10:10:26 -0300
Subject: [PVE-User] VZdump: No such disk, but the disk is there!
Message-ID:

Hi there,

I got a strange error last night: vzdump complained that a disk (an LVM volume, in this case) does not exist, but the volume does exist, indeed! In the morning I ran the backup manually and it worked fine...
Any advice?

112: 2020-01-29 22:20:02 INFO: Starting Backup of VM 112 (qemu)
112: 2020-01-29 22:20:02 INFO: status = running
112: 2020-01-29 22:20:03 INFO: update VM 112: -lock backup
112: 2020-01-29 22:20:03 INFO: VM Name: cliente-V-112-IP-165
112: 2020-01-29 22:20:03 INFO: include disk 'scsi0' 'local-lvm:vm-112-disk-0' 120G
112: 2020-01-29 22:20:23 ERROR: Backup of VM 112 failed - no such volume 'local-lvm:vm-112-disk-0'
116: 2020-01-29 22:20:23 INFO: Starting Backup of VM 116 (qemu)
116: 2020-01-29 22:20:23 INFO: status = running
116: 2020-01-29 22:20:24 INFO: update VM 116: -lock backup
116: 2020-01-29 22:20:24 INFO: VM Name: cliente-V-IP-162
116: 2020-01-29 22:20:24 INFO: include disk 'scsi0' 'local-lvm:vm-116-disk-0' 100G
116: 2020-01-29 22:20:49 ERROR: Backup of VM 116 failed - no such volume 'local-lvm:vm-116-disk-0'

---
Gilberto Nunes Ferreira
(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram
Skype: gilberto.nunes36
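A quick way to narrow down this kind of intermittent "no such volume" failure is to check, outside of vzdump, whether the storage layer can still resolve the disks it complained about. The following is only a sketch built around the names from the log above (local-lvm, vm-112-disk-0, vm-116-disk-0); it assumes the default "pve" volume group behind local-lvm, so adjust the names for your own setup.

# Hedged example: can the PVE storage layer still resolve the volumes
# that vzdump reported as missing?
for vol in local-lvm:vm-112-disk-0 local-lvm:vm-116-disk-0; do
    # 'pvesm path' performs the same volume lookup vzdump does; if it
    # fails here too, the problem is in the storage layer, not in vzdump.
    pvesm path "$vol" && echo "OK: $vol" || echo "cannot resolve: $vol"
done

# Cross-check directly against LVM, bypassing the PVE storage plugin
# (assumes local-lvm sits on the default 'pve' volume group):
lvs -o vg_name,lv_name,lv_size,lv_attr pve | grep -E 'vm-(112|116)-disk-0'

If these checks pass when run by hand but the scheduled job still fails, a load-related timeout such as the one described in the next message becomes the more plausible explanation.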
From gaio at sv.lnf.it Thu Jan 30 14:33:15 2020
From: gaio at sv.lnf.it (Marco Gaiarin)
Date: Thu, 30 Jan 2020 14:33:15 +0100
Subject: [PVE-User] VZdump: No such disk, but the disk is there!
In-Reply-To:
References:
Message-ID: <20200130133315.GJ2908@sv.lnf.it>

Mandi! Gilberto Nunes
In chel di` si favelave...

> Any advice?

It happens 'spot' (sporadically) here too; I'm convinced that, under some specific circumstances, e.g. high load on the SAN, the backup simply times out, and the error that gets reported is this one, which is a bit misleading indeed.

FYI.

--
dott. Marco Gaiarin                              GNUPG Key ID: 240A3D66
Associazione ``La Nostra Famiglia''    http://www.lanostrafamiglia.it/
Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797

Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA!
http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
(cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA)

From gilberto.nunes32 at gmail.com Thu Jan 30 14:41:27 2020
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Thu, 30 Jan 2020 10:41:27 -0300
Subject: [PVE-User] VZdump: No such disk, but the disk is there!
In-Reply-To: <20200130133315.GJ2908@sv.lnf.it>
References: <20200130133315.GJ2908@sv.lnf.it>
Message-ID:

Hi Marco... I have already seen errors caused by storage timeouts, but they look very different from this one... Very unusual indeed!
Thanks for the reply!

---
Gilberto Nunes Ferreira
(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram
Skype: gilberto.nunes36

On Thu, 30 Jan 2020 at 10:33, Marco Gaiarin wrote:
> Mandi! Gilberto Nunes
> In chel di` si favelave...
>
> > Any advice?
>
> It happens 'spot' (sporadically) here too; I'm convinced that, under some specific
> circumstances, e.g. high load on the SAN, the backup simply times out, and the error
> that gets reported is this one, which is a bit misleading indeed.
>
> FYI.
>
> --
> dott. Marco Gaiarin                              GNUPG Key ID: 240A3D66
> Associazione ``La Nostra Famiglia''    http://www.lanostrafamiglia.it/
> Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
> marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797
>
> Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA!
> http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000 > (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA) > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From elacunza at binovo.es Thu Jan 30 16:27:26 2020 From: elacunza at binovo.es (Eneko Lacunza) Date: Thu, 30 Jan 2020 16:27:26 +0100 Subject: [PVE-User] RBD Storage from 6.1 to 3.4 (or 4.4) In-Reply-To: <1609854579.79912.1580385553347.JavaMail.zimbra@zimbra.panservice.it> References: <386611146.78076.1580384776302.JavaMail.zimbra@zimbra.panservice.it> <1609854579.79912.1580385553347.JavaMail.zimbra@zimbra.panservice.it> Message-ID: I think firefly is too old. Either you create backups and restore in the new cluster, or you'll have to upgrade the old clusters at least to Proxmox 5 and Ceph Mimic. Cheers El 30/1/20 a las 12:59, Fabrizio Cuseo escribi?: > I can't afford the long downtime. With my method, the downtime is only to stop the VM on the old cluster and start on the new; the disk image copy in done online. > > But my last migration was from 3.4 to 4.4 > > > ----- Il 30-gen-20, alle 12:51, Uwe Sauter uwe.sauter.de at gmail.com ha scritto: > >> If you can afford the downtime of the VMS you might be able to migrate the disk >> images using "rbd export | ncat" and "ncat | rbd >> import". >> >> I haven't tried this with such a great difference of versions but from Proxmox >> 5.4 to 6.1 this worked without a problem. >> >> Regards, >> >> Uwe >> >> >> Am 30.01.20 um 12:46 schrieb Fabrizio Cuseo: >>> I have installed a new cluster with the last release, with a local ceph storage. >>> I also have 2 old and smaller clusters, and I need to migrate all the VMs to the >>> new cluster. >>> The best method i have used in past is to add on the NEW cluster the RBD storage >>> of the old cluster, so I can stop the VM, move the .cfg file, start the vm (all >>> those operations are really quick), and move the disk (online) from the old >>> storage to the new storage. >>> >>> But now, if I add the RBD storage, copying the keyring file of the old cluster >>> to the new cluster, naming as the storage ID, and using the old cluster >>> monitors IP, i can see the storage summary (space total and used), but when I >>> go to "content", i have this error: "rbd error: rbd: listing images failed: >>> (95) Operation not supported (500)". >>> >>> If, from the new cluster CLI, i use the command: >>> >>> rbd -k /etc/pve/priv/ceph/CephOLD.keyring -m 172.16.20.31 ls rbd2 >>> >>> I can see the list of disk images, but also the error: "librbd::api::Trash: >>> list: error listing rbd trash entries: (95) Operation not supported" >>> >>> >>> The new cluster ceph release is Nautilus, and the old one is firefly. >>> >>> Some idea ? >>> >>> Thanks in advance, Fabrizio >>> >>> _______________________________________________ >>> pve-user mailing list >>> pve-user at pve.proxmox.com >>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >>> >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user -- Zuzendari Teknikoa / Director T?cnico Binovo IT Human Project, S.L. Telf. 943569206 Astigarragako bidea 2, 2? izq. 
oficina 11; 20180 Oiartzun (Gipuzkoa) www.binovo.es From f.cuseo at panservice.it Thu Jan 30 16:32:12 2020 From: f.cuseo at panservice.it (Fabrizio Cuseo) Date: Thu, 30 Jan 2020 16:32:12 +0100 (CET) Subject: [PVE-User] RBD Storage from 6.1 to 3.4 (or 4.4) In-Reply-To: References: <386611146.78076.1580384776302.JavaMail.zimbra@zimbra.panservice.it> <1609854579.79912.1580385553347.JavaMail.zimbra@zimbra.panservice.it> Message-ID: <1269099270.85357.1580398332179.JavaMail.zimbra@zimbra.panservice.it> Thank you all. I will migrate using an iSCSI storage configured on both the clusters, so the VMs downtime will be short. Fabrizio ----- Il 30-gen-20, alle 16:27, Eneko Lacunza elacunza at binovo.es ha scritto: > I think firefly is too old. > > Either you create backups and restore in the new cluster, or you'll have > to upgrade the old clusters at least to Proxmox 5 and Ceph Mimic. > > Cheers > > El 30/1/20 a las 12:59, Fabrizio Cuseo escribi?: >> I can't afford the long downtime. With my method, the downtime is only to stop >> the VM on the old cluster and start on the new; the disk image copy in done >> online. >> >> But my last migration was from 3.4 to 4.4 >> >> >> ----- Il 30-gen-20, alle 12:51, Uwe Sauter uwe.sauter.de at gmail.com ha scritto: >> >>> If you can afford the downtime of the VMS you might be able to migrate the disk >>> images using "rbd export | ncat" and "ncat | rbd >>> import". >>> >>> I haven't tried this with such a great difference of versions but from Proxmox >>> 5.4 to 6.1 this worked without a problem. >>> >>> Regards, >>> >>> Uwe >>> >>> >>> Am 30.01.20 um 12:46 schrieb Fabrizio Cuseo: >>>> I have installed a new cluster with the last release, with a local ceph storage. >>>> I also have 2 old and smaller clusters, and I need to migrate all the VMs to the >>>> new cluster. >>>> The best method i have used in past is to add on the NEW cluster the RBD storage >>>> of the old cluster, so I can stop the VM, move the .cfg file, start the vm (all >>>> those operations are really quick), and move the disk (online) from the old >>>> storage to the new storage. >>>> >>>> But now, if I add the RBD storage, copying the keyring file of the old cluster >>>> to the new cluster, naming as the storage ID, and using the old cluster >>>> monitors IP, i can see the storage summary (space total and used), but when I >>>> go to "content", i have this error: "rbd error: rbd: listing images failed: >>>> (95) Operation not supported (500)". >>>> >>>> If, from the new cluster CLI, i use the command: >>>> >>>> rbd -k /etc/pve/priv/ceph/CephOLD.keyring -m 172.16.20.31 ls rbd2 >>>> >>>> I can see the list of disk images, but also the error: "librbd::api::Trash: >>>> list: error listing rbd trash entries: (95) Operation not supported" >>>> >>>> >>>> The new cluster ceph release is Nautilus, and the old one is firefly. >>>> >>>> Some idea ? >>>> >>>> Thanks in advance, Fabrizio >>>> >>>> _______________________________________________ >>>> pve-user mailing list >>>> pve-user at pve.proxmox.com >>>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >>>> >>> _______________________________________________ >>> pve-user mailing list >>> pve-user at pve.proxmox.com >>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > -- > Zuzendari Teknikoa / Director T?cnico > Binovo IT Human Project, S.L. > Telf. 943569206 > Astigarragako bidea 2, 2? izq. 
oficina 11; 20180 Oiartzun (Gipuzkoa)
> www.binovo.es
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

--
---
Fabrizio Cuseo - mailto:f.cuseo at panservice.it
Direzione Generale - Panservice InterNetWorking
Servizi Professionali per Internet ed il Networking
Panservice e' associata AIIP - RIPE Local Registry
Phone: +39 0773 410020 - Fax: +39 0773 470219
http://www.panservice.it mailto:info at panservice.it
Numero verde nazionale: 800 901492

From dor at volz.ua Fri Jan 31 12:23:48 2020
From: dor at volz.ua (Dmytro O. Redchuk)
Date: Fri, 31 Jan 2020 13:23:48 +0200
Subject: [PVE-User] Config/Status commands stopped to respond
In-Reply-To:
References: <8568CB4F-3E98-4183-BC27-44E1C556EF03@elchaka.de> <20200128210126.GA14399@volz.ua>
Message-ID: <20200131112348.GA3245@volz.ua>

Hello community,

after having a spare node with some restored backups prepared, I tried "systemctl stop pve-cluster", checked /etc/pve and then ran "systemctl start pve-cluster". Everything seems to be ok now.

Thanks for the help!

On Wed, 29 Jan 2020, at 11:19, Dmytro O. Redchuk via pve-user wrote:
> Date: Wed, 29 Jan 2020 11:19:31 +0200
> From: "Dmytro O. Redchuk"
> To: proxmox at elchaka.de
> Cc: PVE User List
> Subject: Re: [PVE-User] Config/Status commands stopped to respond
> User-Agent: Mutt/1.10.1 (2018-07-13)
>
> On Tue, 28 Jan 2020, at 22:37, proxmox at elchaka.de wrote:
> > >/dev/fuse 30M 32K 30M 1% /etc/pve
> [...]
> > >However any ls or cat on /etc/pve does not respond.
> > >
> > >What can I do here?
> >
> > You can put your node in local Mode
> >
> > pmxcfs -l
> >
> > I guess you have a network issue here between your pve nodes and should search for "proxmox cluster troubleshooting"
> No any issue right now (possibly it has been in the past, not sure),
> corosync reports cluster is online, both nodes.
>
> Ok, thank you, I will search for "proxmox cluster troubleshooting".

--
Dmytro O. Redchuk
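The fix reported above comes down to restarting pmxcfs, the pve-cluster service that provides /etc/pve. A condensed sketch of that sequence, based only on the steps mentioned in this thread plus a couple of sanity checks, might look like the following; it is not an official procedure, so verify quorum first and keep backups at hand, as was done here.

# Is corosync healthy and is the node quorate?
systemctl status corosync pve-cluster
pvecm status
journalctl -u pve-cluster -n 50     # look for pmxcfs errors

# A hanging /etc/pve usually means pmxcfs is stuck:
timeout 5 ls /etc/pve || echo "pmxcfs not responding"

# Restart the cluster filesystem service and check again:
systemctl stop pve-cluster
systemctl start pve-cluster
timeout 5 ls /etc/pve && echo "/etc/pve is responding again"

# If the node has really lost quorum, pmxcfs can instead be started in
# local mode, as suggested earlier in the thread:
# pmxcfs -l

Restarting pve-cluster does not stop running guests, which matches what was observed in this thread, but on a production node it is still worth checking quorum and having recent backups before touching it.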