From mark at tuxis.nl  Mon May  1 23:45:21 2017
From: mark at tuxis.nl (Mark Schouten)
Date: Mon, 1 May 2017 23:45:21 +0200
Subject: [PVE-User] Missing node in ha-manager
Message-ID: <5B4E1C46-D8AC-4961-8D89-03EBBFF16175@tuxis.nl>

Hi,

I recently added a new node to a cluster which is also running with HA. The fourth node seems to be working fine, but one of the other nodes is confused. pvecm nodes shows the full hostname for the new node, and the short one for the existing nodes. So this is probably the result of an imperfect /etc/hosts file. I corrected the hosts file on all nodes, but pvecm nodes still shows the incorrect output.

Also, in HA, the new node does not exist on the misbehaving node.  In the logs I see:
May 01 09:38:36 proxmox03 pve-ha-crm[2777]: node 'proxmox04': state changed from 'unknown' => 'gone'
May 01 09:38:36 proxmox03 pve-ha-crm[2777]: crm command error - node not online: migrate vm:222 proxmox04

Which is a result of http://pve-devel.pve.proxmox.narkive.com/Eafo8CAz/patch-pve-ha-manager-handle-node-deletion-in-the-ha-stack . I understand why this is done, but I would like to fix this without rebooting the misbehaving node. Can I restart pve-ha-crm to make things right again? /etc/pve/.members on the misbehaving node does not mention the new node at all?

Please advise.

-- 
Mark Schouten
Tuxis Internet Engineering <mark at tuxis.nl>

From t.lamprecht at proxmox.com  Tue May  2 09:05:43 2017
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Tue, 2 May 2017 09:05:43 +0200
Subject: [PVE-User] Missing node in ha-manager
In-Reply-To: <5B4E1C46-D8AC-4961-8D89-03EBBFF16175@tuxis.nl>
References: <5B4E1C46-D8AC-4961-8D89-03EBBFF16175@tuxis.nl>
Message-ID: <b8325c17-e4f9-76e4-fc1b-50ade2ca4a1c@proxmox.com>

Hi,

On 05/01/2017 11:45 PM, Mark Schouten wrote:
> Hi,
>
> I recently added a new node to a cluster which is also running with HA. The fourth node seems to be working fine, but one of the other nodes is confused. pvecm nodes shows the full hostname for the new node, and the short one for the existing nodes. So this is probably the result of an imperfect /etc/hosts file. I corrected the hosts file on all nodes, but pvecm nodes still shows the incorrect output.

No, the node names for the cluster are not resolved from the /etc/hosts 
file, but from /etc/pve/corosync.conf (either the `name` property or, if 
that is not set, the ring0_addr property). The hosts file can help to 
resolve the node names to their IPs. Normally corosync can do that over 
the multicast group, but it's considered good practice to have a valid 
node name to IP mapping in /etc/hosts nevertheless.
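
A minimal sketch of the lookup rule described above, assuming a flat
`node { key: value }` layout like the corosync.conf files shown later in
this thread. The function name and the sample host names are hypothetical,
and the parsing is deliberately naive; it is not how corosync itself reads
the file.

```python
import re


def cluster_node_names(corosync_conf):
    """Return, per node block, the `name` value or else the `ring0_addr`."""
    names = []
    # Each node entry is a flat "node { key: value ... }" block.
    for block in re.findall(r"\bnode\s*\{([^{}]*)\}", corosync_conf):
        props = {}
        for line in block.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                props[key.strip()] = value.strip()
        # Prefer the explicit name, fall back to the ring address.
        names.append(props.get("name") or props["ring0_addr"])
    return names


# Hypothetical sample: the second node has no `name` property.
sample = """
nodelist {
  node {
    name: proxmox02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: proxmox02
  }
  node {
    nodeid: 4
    quorum_votes: 1
    ring0_addr: proxmox04.example.com
  }
}
"""
print(cluster_node_names(sample))  # ['proxmox02', 'proxmox04.example.com']
```

Under this rule a node without a `name` property would show up under
whatever is written in ring0_addr, which may be a full hostname or an IP.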

Can you check that the config looks the same on all nodes?
Especially the difference between working and misbehaving nodes would be 
interesting.

>
> Also, in HA, the new node does not exist on the misbehaving node.  In the logs I see:
> May 01 09:38:36 proxmox03 pve-ha-crm[2777]: node 'proxmox04': state changed from 'unknown' => 'gone'
> May 01 09:38:36 proxmox03 pve-ha-crm[2777]: crm command error - node not online: migrate vm:222 proxmox04
>
> Which is a result of http://pve-devel.pve.proxmox.narkive.com/Eafo8CAz/patch-pve-ha-manager-handle-node-deletion-in-the-ha-stack . I understand why this is done, but I would like to fix this without rebooting the misbehaving node. Can I restart pve-ha-crm to make things right again? /etc/pve/.members on the misbehaving node does not mention the new node at all?

In general you could just restart the CRM, but the CRM is capable of 
syncing in new nodes while running, so there shouldn't be any need for 
that; the patches you linked also do not change that, AFAIK.
As /etc/pve/.members doesn't show the new node on the misbehaving one, 
the problem lies elsewhere.
Who is the current master? Can you give me an output of:
# ha-manager status
# pvecm status
# cat /etc/pve/corosync.conf

from the misbehaving node and an "OK" one? Remember to redact public IP 
addresses.

cheers,
Thomas

From carheden at ucar.edu  Tue May  2 17:40:37 2017
From: carheden at ucar.edu (Adam Carheden)
Date: Tue, 2 May 2017 09:40:37 -0600
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA
	cluster
Message-ID: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>

What's supposed to happen if two nodes in a 4-node HA cluster go offline?

I have a 4-node test cluster, two nodes are in one server room and the
other two in another server room. I had HA inadvertently tested for me
this morning due to an unexpected network issue and watchdog rebooted
two of the nodes.

I think this is the expected behavior, and certainly seems like what I
want to happen. However, quorum is 3, not 2, so why didn't all 4 nodes
reboot?

# pvecm status
Quorum information
------------------
Date:             Tue May  2 09:35:23 2017
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000001
Ring ID:          4/524
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000004          1 192.168.0.11
0x00000003          1 192.168.0.203
0x00000001          1 192.168.0.204 (local)
0x00000002          1 192.168.0.206

# ha-manager status
quorum OK
master node3 (active, Tue May  2 09:35:24 2017)
lrm node1 (idle, Tue May  2 09:35:27 2017)
lrm node2 (active, Tue May  2 09:35:26 2017)
lrm node3 (idle, Tue May  2 09:35:23 2017)
lrm node3 (idle, Tue May  2 09:35:23 2017)

Somehow proxmox was smart enough to keep two of the nodes online, but
with a quorum of 3 neither group should have had quorum. How does it
decide which group to keep online?

Thanks
-- 
Adam Carheden

From t.lamprecht at proxmox.com  Wed May  3 09:41:54 2017
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Wed, 3 May 2017 09:41:54 +0200
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA
 cluster
In-Reply-To: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
References: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
Message-ID: <98fdfb17-dd06-2e0b-8d69-58dda81b3317@proxmox.com>

Hi,

On 05/02/2017 05:40 PM, Adam Carheden wrote:
> What's supposed to happen if two nodes in a 4-node HA cluster go offline?

If all of them have HA services configured, a full cluster reset may 
happen.
If two nodes go offline, the whole cluster loses quorum, so all nodes 
with an active watchdog (i.e. all nodes which have, or have had, active 
services) will reset.

For such a situation, where there's a tie, an external voting arbitrator 
would help; this could be a fifth (tiny) node or a corosync QDevice.
QDevices have the advantage that they can run on any newer Linux distro 
which ships corosync (2.4 and newer AFAIK), independent of the PVE stack.
They can provide arbitrator votes to multiple clusters, and have fewer 
constraints regarding network setup and latency, as the communication 
happens over TCP.
This is usable from PVE but we haven't documented it yet; I started to 
do so and need to pick it up again soon.
Just a note for any other reader: while this can boost reliability and 
recovery in clusters with an even vote count (you can only 'win' there),
it can do the reverse in clusters with an uneven node count.
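
The arithmetic behind the tie can be sketched as follows, assuming the
default votequorum rule that a partition needs a strict majority of the
expected votes (which matches the "Expected votes: 4 / Quorum: 3" output
quoted in this thread). The function names are hypothetical.

```python
def quorum_threshold(expected_votes):
    """Votes a partition needs: a strict majority of all expected votes."""
    return expected_votes // 2 + 1


def is_quorate(partition_votes, expected_votes):
    """True if a partition holding partition_votes keeps quorum."""
    return partition_votes >= quorum_threshold(expected_votes)


# Four nodes, one vote each, split two and two: neither half is quorate.
print(quorum_threshold(4), is_quorate(2, 4))  # 3 False

# Same split with one arbitrator vote added: the half that still reaches
# the arbitrator holds 2 + 1 = 3 of 5 votes, the other half only 2.
print(quorum_threshold(5), is_quorate(3, 5), is_quorate(2, 5))  # 3 True False
```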

>
> I have a 4-node test cluster, two nodes are in one server room and the
> other two in another server room. I had HA inadvertently tested for me
> this morning due to an unexpected network issue and watchdog rebooted
> two of the nodes.
>
> I think this is the expected behavior, and certainly seems like what I
> want to happen. However, quorum is 3, not 2, so why didn't all 4 nodes
> reboot?

Because, if the `ha-manager status` still mirrors the same setup (i.e. 
same services on same nodes configured) as when the network failure 
happened, I see that just one node has active services running.
We do not fence nodes which have no configured HA services, or whose 
configured HA services are all disabled, 
as we think that this would just lower reliability for non-HA services 
but bring no increase in reliability for HA services.
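
A simplified model of the rule stated above: only nodes that have lost
quorum and have an active LRM are reset. The node names and states below
are hypothetical, and the model ignores any other condition the HA stack
may apply.

```python
def nodes_to_reset(lrm_state, quorate_nodes):
    """Nodes that would self-fence in this simplified model.

    lrm_state maps node name to "active" or "idle"; quorate_nodes is the
    set of nodes that still sit in a quorate partition.
    """
    return sorted(
        node
        for node, state in lrm_state.items()
        if state == "active" and node not in quorate_nodes
    )


# Hypothetical four-node cluster split two and two: nobody is quorate.
lrm_state = {"nodeA": "active", "nodeB": "idle", "nodeC": "active", "nodeD": "idle"}
print(nodes_to_reset(lrm_state, quorate_nodes=set()))  # ['nodeA', 'nodeC']
```

Nodes with an idle LRM keep running even though they are not quorate,
which is why such a split does not have to reboot the whole cluster.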

>
> # pvecm status
> Quorum information
> ------------------
> Date:             Tue May  2 09:35:23 2017
> Quorum provider:  corosync_votequorum
> Nodes:            4
> Node ID:          0x00000001
> Ring ID:          4/524
> Quorate:          Yes
>
> Votequorum information
> ----------------------
> Expected votes:   4
> Highest expected: 4
> Total votes:      4
> Quorum:           3
> Flags:            Quorate
>
> Membership information
> ----------------------
>      Nodeid      Votes Name
> 0x00000004          1 192.168.0.11
> 0x00000003          1 192.168.0.203
> 0x00000001          1 192.168.0.204 (local)
> 0x00000002          1 192.168.0.206
>
> # ha-manager status
> quorum OK
> master node3 (active, Tue May  2 09:35:24 2017)
> lrm node1 (idle, Tue May  2 09:35:27 2017)
> lrm node2 (active, Tue May  2 09:35:26 2017)
> lrm node3 (idle, Tue May  2 09:35:23 2017)
> lrm node3 (idle, Tue May  2 09:35:23 2017)
>
> Somehow proxmox was smart enough to keep two of the nodes online, but
> with a quorum of 3 neither group should have had quorum. How does it
> decide which group to keep online?

see above

Cheers,
Thomas

From mark at tuxis.nl  Wed May  3 09:45:49 2017
From: mark at tuxis.nl (Mark Schouten)
Date: Wed, 03 May 2017 09:45:49 +0200
Subject: [PVE-User] Missing node in ha-manager
In-Reply-To: <b8325c17-e4f9-76e4-fc1b-50ade2ca4a1c@proxmox.com>
References: <5B4E1C46-D8AC-4961-8D89-03EBBFF16175@tuxis.nl>
 <b8325c17-e4f9-76e4-fc1b-50ade2ca4a1c@proxmox.com>
Message-ID: <1493797549.12575.1.camel@tuxis.nl>

On Tue, 2017-05-02 at 09:05 +0200, Thomas Lamprecht wrote:
> Can you check that the config looks the same on all nodes?
> Especially the difference between working and misbehaving nodes would
> be interesting.

Please see the attachment. That includes /etc/pve/.members and
/etc/pve/corosync.conf from all nodes. Only the .members file of the
misbehaving node is off.
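
A comparison like this can be scripted. The sketch below assumes each
node's /etc/pve/.members content has been collected as a JSON string (the
samples are abbreviated from the attachment); the function name is
hypothetical.

```python
import json


def missing_nodes(members_by_host):
    """Map each host to the cluster nodes absent from its .members nodelist."""
    views = {
        host: set(json.loads(text)["nodelist"])
        for host, text in members_by_host.items()
    }
    # Every node that at least one host knows about.
    all_nodes = set().union(*views.values())
    return {
        host: sorted(all_nodes - seen)
        for host, seen in views.items()
        if all_nodes - seen
    }


# Abbreviated .members content from two of the four nodes.
dumps = {
    "proxmox01": '{"nodelist": {"proxmox01": {"id": 1}, "proxmox02": {"id": 2},'
                 ' "proxmox03": {"id": 3}, "proxmox04": {"id": 4}}}',
    "proxmox03": '{"nodelist": {"proxmox01": {"id": 1}, "proxmox02": {"id": 2},'
                 ' "proxmox03": {"id": 3}}}',
}
print(missing_nodes(dumps))  # {'proxmox03': ['proxmox04']}
```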

> In general you could just restart CRM, but the CRM is capable of
> syncing in new nodes while running, so there shouldn't be any need for
> that, the patches you linked also do not change that, AFAIK.

I would like to do a sync without a restart as well, but what would
trigger this?

> As /etc/pve/.members doesn't show the new node on the misbehaving
> one, the problem lies elsewhere.
> Who is the current master? Can you give me an output of:
> # ha-manager status
> # pvecm status
> # cat /etc/pve/corosync.conf

Output in the attachment. Because the misbehaving node also is the
master, output of ha-manager is identical on all nodes.

-- 
Kerio Operator in de Cloud? https://www.kerioindecloud.nl/
Mark Schouten  | Tuxis Internet Engineering
KvK: 61527076  | http://www.tuxis.nl/
T: 0318 200208 | info at tuxis.nl
-------------- next part --------------
root@proxmox01:~# cat /etc/pve/.members
{
"nodename": "proxmox01",
"version": 23,
"cluster": { "name": "redacted01", "version": 4, "nodes": 4, "quorate": 1 },
"nodelist": {
  "proxmox02": { "id": 2, "online": 1, "ip": "10.1.1.2"},
  "proxmox01": { "id": 1, "online": 1, "ip": "10.1.1.1"},
  "proxmox04": { "id": 4, "online": 1, "ip": "10.1.1.4"},
  "proxmox03": { "id": 3, "online": 1, "ip": "10.1.1.3"}
  }
}

root@proxmox02:~# cat /etc/pve/.members
{
"nodename": "proxmox02",
"version": 21,
"cluster": { "name": "redacted01", "version": 4, "nodes": 4, "quorate": 1 },
"nodelist": {
  "proxmox02": { "id": 2, "online": 1, "ip": "10.1.1.2"},
  "proxmox01": { "id": 1, "online": 1, "ip": "10.1.1.1"},
  "proxmox04": { "id": 4, "online": 1, "ip": "10.1.1.4"},
  "proxmox03": { "id": 3, "online": 1, "ip": "10.1.1.3"}
  }
}

root@proxmox03:~# cat /etc/pve/.members
{
"nodename": "proxmox03",
"version": 24,
"cluster": { "name": "redacted01", "version": 3, "nodes": 3, "quorate": 1 },
"nodelist": {
  "proxmox02": { "id": 2, "online": 1, "ip": "10.1.1.2"},
  "proxmox03": { "id": 3, "online": 1, "ip": "10.1.1.3"},
  "proxmox01": { "id": 1, "online": 1, "ip": "10.1.1.1"}
  }
}

root@proxmox04:~# cat /etc/pve/.members
{
"nodename": "proxmox04",
"version": 6,
"cluster": { "name": "redacted01", "version": 4, "nodes": 4, "quorate": 1 },
"nodelist": {
  "proxmox02": { "id": 2, "online": 1, "ip": "10.1.1.2"},
  "proxmox01": { "id": 1, "online": 1, "ip": "10.1.1.1"},
  "proxmox04": { "id": 4, "online": 1, "ip": "10.1.1.4"},
  "proxmox03": { "id": 3, "online": 1, "ip": "10.1.1.3"}
  }
}

root@proxmox01:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxmox02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: proxmox02
  }

  node {
    name: proxmox04
    nodeid: 4
    quorum_votes: 1
    ring0_addr: proxmox04
  }

  node {
    name: proxmox01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: proxmox01
  }

  node {
    name: proxmox03
    nodeid: 3
    quorum_votes: 1
    ring0_addr: proxmox03
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: redacted01
  config_version: 4
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 10.1.1.1
    ringnumber: 0
  }

}

root@proxmox02:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxmox02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: proxmox02
  }

  node {
    name: proxmox04
    nodeid: 4
    quorum_votes: 1
    ring0_addr: proxmox04
  }

  node {
    name: proxmox01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: proxmox01
  }

  node {
    name: proxmox03
    nodeid: 3
    quorum_votes: 1
    ring0_addr: proxmox03
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: redacted01
  config_version: 4
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 10.1.1.1
    ringnumber: 0
  }

}

root@proxmox03:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxmox02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: proxmox02
  }

  node {
    name: proxmox04
    nodeid: 4
    quorum_votes: 1
    ring0_addr: proxmox04
  }

  node {
    name: proxmox01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: proxmox01
  }

  node {
    name: proxmox03
    nodeid: 3
    quorum_votes: 1
    ring0_addr: proxmox03
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: redacted01
  config_version: 4
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 10.1.1.1
    ringnumber: 0
  }

}

root@proxmox04:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxmox02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: proxmox02
  }

  node {
    name: proxmox04
    nodeid: 4
    quorum_votes: 1
    ring0_addr: proxmox04
  }

  node {
    name: proxmox01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: proxmox01
  }

  node {
    name: proxmox03
    nodeid: 3
    quorum_votes: 1
    ring0_addr: proxmox03
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: redacted01
  config_version: 4
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 10.1.1.1
    ringnumber: 0
  }

}

root@proxmox03:~# pvecm status
Quorum information
------------------
Date:             Wed May  3 09:40:49 2017
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000003
Ring ID:          1/2452
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3  
Flags:            Quorate 

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.1.1.1
0x00000002          1 10.1.1.2
0x00000003          1 10.1.1.3 (local)
0x00000004          1 10.1.1.4

root@proxmox03:~# ha-manager status
quorum OK
master proxmox03 (active, Wed May  3 09:44:26 2017)
lrm proxmox01 (active, Wed May  3 09:44:26 2017)
lrm proxmox02 (active, Wed May  3 09:44:22 2017)
lrm proxmox03 (active, Wed May  3 09:44:21 2017)
<services> <services>

From aderumier at odiso.com  Wed May  3 09:46:26 2017
From: aderumier at odiso.com (Alexandre DERUMIER)
Date: Wed, 3 May 2017 09:46:26 +0200 (CEST)
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA
 cluster
In-Reply-To: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
References: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
Message-ID: <1921419111.11432350.1493797586137.JavaMail.zimbra@oxygem.tv>

Maybe this is because a node only reboots if an HA VM is present on it?

If you had HA VMs on all 4 nodes, I think that all nodes would be rebooted by the watchdog (as you lost quorum on all 4 nodes).

From fk at datenfalke.de  Wed May  3 16:42:53 2017
From: fk at datenfalke.de (Falco Kleinschmidt)
Date: Wed, 3 May 2017 16:42:53 +0200
Subject: [PVE-User] Host error after LXC restart
In-Reply-To: <571889cd-7e3a-5d13-e59d-34fb3940452c@tobiaskropf.de>
References: <571889cd-7e3a-5d13-e59d-34fb3940452c@tobiaskropf.de>
Message-ID: <71c95c6a-fb37-dd29-ca63-d01a04711cc0@datenfalke.de>

I see the following error on some of my LXC hosts. I think it causes no
resulting problems, but I would be happier without it.

I do not know what is producing these errors.

Message from syslogd at xxxxxxx at Apr 28 23:11:01 ...
 kernel:[7261453.445614] unregister_netdevice: waiting for lo to become
free. Usage count = 1

Message from syslogd at xxxxxxx at Apr 29 23:10:11 ...
 kernel:[7347806.677566] unregister_netdevice: waiting for lo to become
free. Usage count = 1

Message from syslogd at xxxxxxx at Apr 30 17:10:15 ...
 kernel:[7412612.306773] unregister_netdevice: waiting for lo to become
free. Usage count = 1

Message from syslogd at xxxxxxx at May  1 04:09:33 ...
 kernel:[7452171.785214] unregister_netdevice: waiting for lo to become
free. Usage count = 1

Message from syslogd at xxxxxxx at May  2 16:09:38 ...
 kernel:[7581781.666309] unregister_netdevice: waiting for lo to become
free. Usage count = 1

Message from syslogd at xxxxxxx at May  2 23:10:11 ...
 kernel:[7607015.010249] unregister_netdevice: waiting for lo to become
free. Usage count = 1

Message from syslogd at xxxxxxx at May  3 03:09:35 ...
 kernel:[7621379.994103] unregister_netdevice: waiting for lo to become
free. Usage count = 1

Am 28.04.2017 um 12:11 schrieb Tobias Kropf:
> Hi @ all
>
> after the LXC container restart on PVE Host (4.4.49-86) we have a lot
> of errors on the host.
>
> We see in dmesg this error: "netdevice: waiting for eth0 to become
> free". Has anyone seen the same error?

-- 

Datenfalke - Dipl. Inf. Falco Kleinschmidt
Adresse: Dinnendahlstr. 8 - 45136 Essen
Steuer-Nr: DE248267798
Telefon: +49-(0)201-6124650
Fax: +49-(0)201-6124651
Email: fk at datenfalke.de
WWW: http://www.datenfalke.de

From carheden at ucar.edu  Wed May  3 17:29:27 2017
From: carheden at ucar.edu (Adam Carheden)
Date: Wed, 3 May 2017 09:29:27 -0600
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA
 cluster
In-Reply-To: <1921419111.11432350.1493797586137.JavaMail.zimbra@oxygem.tv>
References: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
 <1921419111.11432350.1493797586137.JavaMail.zimbra@oxygem.tv>
Message-ID: <0301d662-d15f-feae-a45f-b92898d3a00a@ucar.edu>

On 05/03/2017 01:46 AM, Alexandre DERUMIER wrote:
> Maybe this is because node only reboot if an HA vm is present on the node ?
> 
> 
> if you had HA vm on all 4 nodes, I think that all nodes should be reboot by watchdog. (as you lost quorum on 4 nodes)
That must be it. I have one HA VM and a few non-HA VMs, all just
testing. I think the takeaway is not to run an even number of HA nodes
in production (or to use a corosync QDevice as Thomas suggests).

Am I correct that all PVE nodes contribute to the quorum voting even if
they're not part of an HA group?

My production cluster will have 6 nodes (more redundancy, same
datacenter, less network risk). To prevent cluster shutdown in
production when I'll have lots more HA VMs, I can just add an old
cheap-o box as 7th node for quorum and not put it in any HA groups?

Alternatively, is there a way to exclude one of my 6 nodes from the HA
quorum voting? In a Ceph cluster, quorum is determined by nodes running
the monitor service, and not all nodes have to run the monitor service.
Is there an equivalent "no monitor" configuration in PVE?

Thanks

From dietmar at proxmox.com  Wed May  3 18:03:57 2017
From: dietmar at proxmox.com (Dietmar Maurer)
Date: Wed, 3 May 2017 18:03:57 +0200 (CEST)
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA
 cluster
In-Reply-To: <0301d662-d15f-feae-a45f-b92898d3a00a@ucar.edu>
References: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
 <1921419111.11432350.1493797586137.JavaMail.zimbra@oxygem.tv>
 <0301d662-d15f-feae-a45f-b92898d3a00a@ucar.edu>
Message-ID: <395141197.11.1493827438265@webmail.proxmox.com>

> Alternatively, is there a way exclude one of my 6 nodes from the HA
> quorum voting? 

Hint: The number of votes per node is configurable.

From t.lamprecht at proxmox.com  Thu May  4 08:39:24 2017
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Thu, 4 May 2017 08:39:24 +0200
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA
 cluster
In-Reply-To: <0301d662-d15f-feae-a45f-b92898d3a00a@ucar.edu>
References: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
 <1921419111.11432350.1493797586137.JavaMail.zimbra@oxygem.tv>
 <0301d662-d15f-feae-a45f-b92898d3a00a@ucar.edu>
Message-ID: <5ab89f59-78a2-9df5-b1c5-480413edd46d@proxmox.com>

On 05/03/2017 05:29 PM, Adam Carheden wrote:
> On 05/03/2017 01:46 AM, Alexandre DERUMIER wrote:
>> Maybe this is because node only reboot if an HA vm is present on the node ?
>>
>>
>> if you had HA vm on all 4 nodes, I think that all nodes should be reboot by watchdog. (as you lost quorum on 4 nodes)
> That must be it. I have one HA VM and a few non-HA VMs, all just
> testing. I think the takeaway is not to run an even number of HA nodes
> in production (or to use a corosync QDevice as Thomas suggests).

Yes, as my answer stated. ;-)

>
> Am I correct that all PVE nodes contribute to the quorum voting even if
> they're not part of an HA group?
Yes, because quorum is not just used for HA, quorum is there for all 
cluster activities as they all need to be consistent and reliably 
synchronized.

>
> My production cluster will have 6 nodes (more redundancy, same
> datacenter, less network risk). To prevent cluster shutdown in
> production when I'll have lots more HA VMs, I can just add an old
> cheap-o box as 7th node for quorum and not put it in any HA groups?

Yes, that would be a good option. But you certainly need to put some 
thought into where you place this machine.
If it is, for example, in room A and room B catches fire (just for 
example's sake :) ), then the nodes in room A
are still quorate, but not vice versa. I.e. if room A gets cut off, room 
B will not have quorum.
An option would be to place it in a third room so that it is an 
independent arbitrator.
But that is naturally not an option for everyone.

>
> Alternatively, is there a way to exclude one of my 6 nodes from the HA
> quorum voting? In a CEPH cluster, quorum is determined by nodes running
> the monitor service, and not all nodes have to run the monitor service.
> Is there an equivalent "no monitor" configuration in PVE?

As Dietmar hinted: you can configure how many votes a node provides 
(must be >= 1).
This can be configured either on node addition or by editing the 
corosync configuration file in:
/etc/pve/corosync.conf

So you could just give one node in the 'more reliable' room two votes 
and you achieve the same
as with an additional machine in the same room.
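
The equivalence can be checked with a small calculation. This is a sketch
assuming the default majority rule and two hypothetical rooms "A" and "B";
each list holds the quorum_votes values of the nodes in that room.

```python
def quorate_rooms(votes_by_room):
    """For each room, tell whether it keeps quorum when cut off on its own."""
    total = sum(sum(votes) for votes in votes_by_room.values())
    quorum = total // 2 + 1  # strict majority of all votes
    return {room: sum(votes) >= quorum for room, votes in votes_by_room.items()}


# Six nodes, one vote each, three per room: a split leaves nobody quorate.
print(quorate_rooms({"A": [1, 1, 1], "B": [1, 1, 1]}))
# {'A': False, 'B': False}

# A seventh, quorum-only node placed in room A.
print(quorate_rooms({"A": [1, 1, 1, 1], "B": [1, 1, 1]}))
# {'A': True, 'B': False}

# No extra machine, but one node in room A gets two votes.
print(quorate_rooms({"A": [2, 1, 1], "B": [1, 1, 1]}))
# {'A': True, 'B': False}
```

Both variants give room A four of seven votes, so room A survives a split
and room B does not, which is the placement trade-off described above.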

See:
http://pve.proxmox.com/pve-docs/chapter-pvecm.html#edit-corosync-conf
# man corosync.conf

cheers,
Thomas

From ab1 at metalit.com  Thu May  4 09:15:03 2017
From: ab1 at metalit.com (Alessandro Briosi)
Date: Thu, 4 May 2017 09:15:03 +0200
Subject: [PVE-User] Migration from server down
Message-ID: <7be33d59-14f8-0cd1-abdf-13e77cc3189f@metalit.com>

Hi all,
I have had some trouble with 1 server in a cluster.
As the VMs have all disks on shared storage I thought it would have been
possible to migrate them from the current out-of-order server to the
others in the cluster.
HA is not enabled (I prefer doing it manually).

Though the GUI gave errors because it could not connect to the server.

So I migrated it simply by moving the VM configuration to another node.

Am I missing something in the GUI? Why is it not possible to migrate a
server (with all shared resources) to another server from the GUI when
the VM is on an unreachable server?

IMHO there should be a path to change (and migrate) a VM configuration
even if the server where it runs is down/not reachable.

Regards,
Alessandro

From ab1 at metalit.com  Thu May  4 09:23:42 2017
From: ab1 at metalit.com (Alessandro Briosi)
Date: Thu, 4 May 2017 09:23:42 +0200
Subject: [PVE-User] gluster FS server2
Message-ID: <abfb36be-e38b-0c1a-b596-973ea7c45aa4@metalit.com>

I have had a problem in a cluster with a GlusterFS storage configured.

I manually migrated the VMs to another server, but when I started one
it gave me the following:

kvm: -drive
file=gluster://srvpve2g/datastore2/images/201/vm-201-disk-1.qcow2,if=none,id=drive-virtio0,format=qcow2,cache=none,aio=native,detect-zeroes=on:
Gluster connection for volume datastore2, path
images/201/vm-201-disk-1.qcow2 failed to connect
hint: failed on host srvpve2g and port 24007 Please refer to gluster
logs for more info

Now why is it not using the server2 (srvpve3 in this case) definition in
the storage configuration if the first server fails?
I had to change the order of the servers in the storage file to make it
boot.

Alessandro

From f.gruenbichler at proxmox.com  Thu May  4 09:26:57 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Thu, 4 May 2017 09:26:57 +0200
Subject: [PVE-User] Migration from server down
In-Reply-To: <7be33d59-14f8-0cd1-abdf-13e77cc3189f@metalit.com>
References: <7be33d59-14f8-0cd1-abdf-13e77cc3189f@metalit.com>
Message-ID: <20170504072657.mfkz2fxitrjw324s@nora.maurer-it.com>

On Thu, May 04, 2017 at 09:15:03AM +0200, Alessandro Briosi wrote:
> Hi all,
> I have had some trouble with one server in a cluster.
> As the VMs have all their disks on shared storage, I thought it would be
> possible to migrate them from the currently out-of-order server to the
> others in the cluster.
> HA is not enabled (I prefer doing it manually).
> 
> However, the GUI gave errors because it could not connect to the server.
> 
> So I migrated them simply by moving the VM configuration to another node.
> 
> Am I missing something in the GUI? Why is it not possible to migrate a
> VM (with all resources shared) to another server from the GUI when
> the VM is on an unreachable server?
> 
> IMHO there should be a way to change (and migrate) a VM configuration
> even if the server where it runs is down or not reachable.

because without HA/fencing you have no guarantee that the other node
is actually off, and not just unreachable. the same applies to any
guests potentially running there. "stealing" the guest (configuration)
is therefore potentially dangerous, and needs to be done manually. but
yes, it is just as simple as verifying that the guest is actually not
running on the other node (e.g., because the node is known to be
physically off) and that it has no local dependencies (storage or
otherwise) and then moving the file in /etc/pve.

if there were a button for this on the GUI, people would use it without
understanding the consequences, and then blame the tool instead of the
user when something goes wrong ;)

with HA, the HA stack takes care of this using
- proper locking
- fencing of non-quorate nodes

this ensures that when the HA stack steals a config after the
appropriate timeouts etc. have elapsed, there cannot be a
conflict/inconsistency.
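
For illustration, the manual move described above could look roughly like
the sketch below. It assumes the usual layout
/etc/pve/nodes/<node>/qemu-server/<vmid>.conf for VM configs; the function
name and the PVE_ROOT override are made up for this example. Verifying
that the guest is really not running on the old node still has to be done
by a human before calling it.

```shell
# Hypothetical helper: move a VM config from an unreachable node to a live one.
# PVE_ROOT is an illustrative override so the function can be tried on a copy.
steal_vm_config() {
    vmid=$1
    from=$2
    to=$3
    root=${PVE_ROOT:-/etc/pve}
    src="$root/nodes/$from/qemu-server/$vmid.conf"
    dst_dir="$root/nodes/$to/qemu-server"

    # Refuse to do anything if source or target does not exist.
    [ -f "$src" ] || { echo "no such config: $src" >&2; return 1; }
    [ -d "$dst_dir" ] || { echo "no such target dir: $dst_dir" >&2; return 1; }

    mv "$src" "$dst_dir/"
}

# Example (node names and VMID are placeholders):
#   steal_vm_config 201 deadnode livenode
```

The function deliberately does no liveness check of its own; without
fencing there is no reliable way to do that from the surviving nodes.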

From ab1 at metalit.com  Thu May  4 10:06:15 2017
From: ab1 at metalit.com (Alessandro Briosi)
Date: Thu, 4 May 2017 10:06:15 +0200
Subject: [PVE-User] Migration from server down
In-Reply-To: <20170504072657.mfkz2fxitrjw324s@nora.maurer-it.com>
References: <7be33d59-14f8-0cd1-abdf-13e77cc3189f@metalit.com>
 <20170504072657.mfkz2fxitrjw324s@nora.maurer-it.com>
Message-ID: <ee95a1df-8f66-05fa-3124-e9600374d620@metalit.com>

On 04/05/2017 09:26, Fabian Grünbichler wrote:
> because without HA/fencing you have no guarantee that the other node
> is actually off, and not just not reachable. the same applies for any
> guests potentially running there. "stealing" the guest (configuration)
> is therefore potentially dangerous, and needs to be done manually. but
> yes, it is just as simple as verifying that the guest is actually not
> running on the other node (e.g., because the node is known to be
> physically off) and that it has no local dependencies (storage or
> otherwise) and then moving the file in /etc/pve.
>
> if there were a button for this on the GUI, people would use it without
> understanding the consequences, and then blame the tool instead of the
> user when something goes wrong ;)
>
> with HA, the HA stack takes care of this using
> - proper locking
> - fencing of non-quorate nodes
>
> this ensures that when the HA stack steals a config after the
> appropriate timeouts/.. have happened, there cannot be a
> conflict/inconsistency.

I understand: even if the host has no quorum, it might still be running
the VM.

Still, I'd prefer having a big fat warning before doing it rather than having
the customer call me because the server is down and the VM cannot be migrated
from the GUI.

Thanks,
Alessandro

From carheden at ucar.edu  Thu May  4 16:10:48 2017
From: carheden at ucar.edu (Adam Carheden)
Date: Thu, 4 May 2017 08:10:48 -0600
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA
 cluster
In-Reply-To: <5ab89f59-78a2-9df5-b1c5-480413edd46d@proxmox.com>
References: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
 <1921419111.11432350.1493797586137.JavaMail.zimbra@oxygem.tv>
 <0301d662-d15f-feae-a45f-b92898d3a00a@ucar.edu>
 <5ab89f59-78a2-9df5-b1c5-480413edd46d@proxmox.com>
Message-ID: <bbb3ae45-5b61-d37f-f62c-cea58a45aee8@ucar.edu>

On 05/04/2017 12:39 AM, Thomas Lamprecht wrote:
> 
> As Dietmar hinted: you can configure how many votes a node provides
> (must be >= 1).

Not true:

# pvecm status
...
Membership information
----------------------
    Nodeid      Votes Name
0x00000004          0 192.168.0.11 (local)
0x00000003          1 192.168.0.203
0x00000001          1 192.168.0.204
0x00000002          1 192.168.0.206

...but I assume that you meant "not advisable" rather than "not
possible". Does anyone know the consequences of giving a node 0 votes?

From t.lamprecht at proxmox.com  Thu May  4 16:25:10 2017
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Thu, 4 May 2017 16:25:10 +0200
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA
 cluster
In-Reply-To: <bbb3ae45-5b61-d37f-f62c-cea58a45aee8@ucar.edu>
References: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
 <1921419111.11432350.1493797586137.JavaMail.zimbra@oxygem.tv>
 <0301d662-d15f-feae-a45f-b92898d3a00a@ucar.edu>
 <5ab89f59-78a2-9df5-b1c5-480413edd46d@proxmox.com>
 <bbb3ae45-5b61-d37f-f62c-cea58a45aee8@ucar.edu>
Message-ID: <baf1e3b0-9567-ae74-db5f-b8312810bfd5@proxmox.com>

On 05/04/2017 04:10 PM, Adam Carheden wrote:
> On 05/04/2017 12:39 AM, Thomas Lamprecht wrote:
>> As Dietmar hinted: you can configure how many votes a node provides
>> (must be >= 1).
> Not true:

I actually noticed that myself just after writing, because we then
discussed it here :-)

>
> # pvecm status
> ...
> Membership information
> ----------------------
>      Nodeid      Votes Name
> 0x00000004          0 192.168.0.11 (local)
> 0x00000003          1 192.168.0.203
> 0x00000001          1 192.168.0.204
> 0x00000002          1 192.168.0.206
>
> ...but I assume that you meant "not advisable" rather than "not
> possible". Does anyone know the consequences of giving a node 0 votes?

"Not really tested" would be closer to the truth.
You can do this, and it should work just fine, with the natural consequence
that this node has no influence on quorum.
It can receive cluster traffic without problems, but there may be some
implications which do not come to mind at the moment.
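
As a rough illustration of "no influence on quorum": in its default
configuration, corosync's votequorum requires a strict majority of the
total votes, i.e. floor(total / 2) + 1. The helper below (a made-up name,
not part of any PVE tool) applies that rule to the vote list from the
pvecm output above. It ignores special votequorum options such as
two_node or last_man_standing.

```shell
# Hypothetical helper: votes needed for quorum, given one vote count per node.
quorum_for() {
    total=0
    for v in "$@"; do
        total=$((total + v))
    done
    # Strict majority of all votes.
    echo $((total / 2 + 1))
}

quorum_for 0 1 1 1   # prints 2
```

Under that assumption the three 1-vote nodes behave like a 3-node
cluster: any two of them are quorate, while the 0-vote node on its own
never is.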

From gilberto.nunes32 at gmail.com  Thu May  4 16:31:12 2017
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Thu, 4 May 2017 11:31:12 -0300
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA
	cluster
In-Reply-To: <baf1e3b0-9567-ae74-db5f-b8312810bfd5@proxmox.com>
References: <CAEyRZxcGpM91OpGCb_cQ3g8Qef6gZDYX9RzpfZ4idyW8TL-k1Q@mail.gmail.com>
 <1921419111.11432350.1493797586137.JavaMail.zimbra@oxygem.tv>
 <0301d662-d15f-feae-a45f-b92898d3a00a@ucar.edu>
 <5ab89f59-78a2-9df5-b1c5-480413edd46d@proxmox.com>
 <bbb3ae45-5b61-d37f-f62c-cea58a45aee8@ucar.edu>
 <baf1e3b0-9567-ae74-db5f-b8312810bfd5@proxmox.com>
Message-ID: <CAOKSTBuPj+Eh46=x3NS7dXV+AGzK1Ub3VaXnCV8bZ4rfZDQbcw@mail.gmail.com>

Hi

I think in this case the node with 0 votes will be ignored by the other
nodes when quorum is counted, so the whole cluster will just see 3
nodes contributing to quorum...


-- 
Thank you

Kind regards

Gilberto Ferreira

Linux IT Consultant | IaaS Proxmox, CloudStack, KVM | Zentyal Server |
Zimbra Mail Server

(47) 3025-5907
(47) 99676-7530

Skype: konnectati

www.konnectati.com.br

From mark at tuxis.nl  Fri May  5 09:38:17 2017
From: mark at tuxis.nl (Mark Schouten)
Date: Fri, 05 May 2017 09:38:17 +0200
Subject: [PVE-User] Missing node in ha-manager
In-Reply-To: <1493797549.12575.1.camel@tuxis.nl>
References: <5B4E1C46-D8AC-4961-8D89-03EBBFF16175@tuxis.nl>
 <b8325c17-e4f9-76e4-fc1b-50ade2ca4a1c@proxmox.com>
 <1493797549.12575.1.camel@tuxis.nl>
Message-ID: <1493969897.12575.7.camel@tuxis.nl>

Thomas, pretty please? :)

On Wed, 2017-05-03 at 09:45 +0200, Mark Schouten wrote:
> On Tue, 2017-05-02 at 09:05 +0200, Thomas Lamprecht wrote:
> > Can you check that the config looks the same on all nodes?
> > Especially the difference between working and misbehaving nodes
> > would be interesting.
> 
> Please see the attachment. That includes /etc/pve/.members and
> /etc/pve/corosync.conf from all nodes. Only the .members file of the
> misbehaving node is off.
> 
> > In general you could just restart CRM, but the CRM is capable of
> > syncing in new nodes while running, so there shouldn't be any need
> > for that; the patches you linked also do not change that, AFAIK.
> 
> I would like to do a sync without a restart as well, but what would
> trigger this?
> 
> > As /etc/pve/.members doesn't show the new node on the misbehaving
> > one, the problem is another one.
> > Who is the current master? Can you give me an output of:
> > # ha-manager status
> > # pvecm status
> > # cat /etc/pve/corosync.conf
> 
> Output in the attachment. Because the misbehaving node also is the
> master, output of ha-manager is identical on all nodes.
-- 
Kerio Operator in de Cloud? https://www.kerioindecloud.nl/
Mark Schouten  | Tuxis Internet Engineering
KvK: 61527076  | http://www.tuxis.nl/
T: 0318 200208 | info at tuxis.nl

From t.lamprecht at proxmox.com  Fri May  5 18:14:15 2017
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Fri, 5 May 2017 18:14:15 +0200
Subject: [PVE-User] Missing node in ha-manager
In-Reply-To: <1493969897.12575.7.camel@tuxis.nl>
References: <5B4E1C46-D8AC-4961-8D89-03EBBFF16175@tuxis.nl>
 <b8325c17-e4f9-76e4-fc1b-50ade2ca4a1c@proxmox.com>
 <1493797549.12575.1.camel@tuxis.nl> <1493969897.12575.7.camel@tuxis.nl>
Message-ID: <79953ddf-5fb4-8a5a-d099-b84da4f93661@proxmox.com>

Hi,

It looks like the PVE cluster filesystem is out of sync or has a
problematic connection to corosync.
From corosync's standpoint the node addition worked and is consistent
on all nodes, which is good.

Now, some logs from `proxmox03` (the problematic node) would be nice:

# journalctl -u corosync -u pve-cluster

As I may not get back to you until Tuesday, I'll give you a likely
resolution now:

I'd suggest restarting pve-cluster, *but* as the pve-ha-lrm and its
watchdog are active on node proxmox03,
this could result in a node reset *if* pve-cluster cannot connect back to
corosync or fails in another way.
This is not likely, but if you cannot schedule a maintenance window you
should take precautions against it.
First restart the pve-ha-crm so that another node takes over the master role.
Then either move all HA services away from this node or remove them
temporarily, and stop the pve-ha-lrm and pve-ha-crm services:

# systemctl stop pve-ha-lrm pve-ha-crm

Now restart the pve-cluster and corosync (just to be sure) services:

# systemctl restart pve-cluster corosync

and check

# cat /etc/pve/.members

It should show all members and the same version number as on the other
members. If that's the case, start pve-ha-lrm and pve-ha-crm again;
everything should then be back in order.
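
A small sketch for that final check: the function below (a made-up name)
only tests whether a node name appears as a quoted string in a
.members-style file, assuming the file is JSON with the node names as
quoted keys. It does not compare the version numbers between nodes, which
still needs a look at the file on each member.

```shell
# Hypothetical helper: succeed if the node name is listed in the members file.
# The second argument defaults to the path discussed in this thread.
node_in_members() {
    node=$1
    file=${2:-/etc/pve/.members}
    # -F: treat the name as a fixed string, -q: exit status only.
    grep -qF "\"$node\"" "$file"
}

# Example:
#   node_in_members proxmox04 && echo listed || echo missing
```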

Oh and sorry for getting back a bit late :)

cheers,
Thomas


From uwe.sauter.de at gmail.com  Fri May  5 18:18:50 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 5 May 2017 18:18:50 +0200
Subject: [PVE-User] PVE behind reverse proxy: different webroot possible?
Message-ID: <d4aad664-5f72-bc9b-1d82-7a2fc1a84913@gmail.com>

Hi,

I've seen the wiki page [1] that explains how to operate a PVE host behind a reverse proxy.

I'm currently in the situation that I have several services already behind a rev proxy that are accessible with different
webroots, e.g.

https://example.com/dashboard
https://example.com/owncloud
https://example.com/nagios

What changes would be needed so that I could reach the PVE host with

https://example.com/pve ?

Is it even possible?

Also is it possible to make a whole PVE cluster available behind a rev proxy using [2]?

Regards,

	Uwe

[1] https://pve.proxmox.com/wiki/Web_Interface_Via_Nginx_Proxy
[2] http://nginx.org/en/docs/http/load_balancing.html#nginx_load_balancing_with_ip_hash

From daniel at linux-nerd.de  Fri May  5 22:52:23 2017
From: daniel at linux-nerd.de (Daniel)
Date: Fri, 5 May 2017 20:52:23 +0000
Subject: [PVE-User] Mount Ceph Container
Message-ID: <AD923329-8C57-4431-B5EC-A9BF06FF0086@linux-nerd.de>

Hi all,

I have a container which is on Ceph storage: rootfs: ceph:vm-171-disk-1,size=20G

Is there any way to mount this image on the local Proxmox host? I need this to copy some data ;)

--
Regards

Daniel

From daniel at linux-nerd.de  Fri May  5 23:23:55 2017
From: daniel at linux-nerd.de (Daniel)
Date: Fri, 5 May 2017 21:23:55 +0000
Subject: [PVE-User] Mount Ceph Container
Message-ID: <F6F4EE68-ACF4-4A40-80C9-1C515EA92C52@linux-nerd.de>

Found it myself.
It's located here: /dev/rbd/ceph/

-- 
Regards

Daniel


From f.gruenbichler at proxmox.com  Mon May  8 08:51:59 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Mon, 8 May 2017 08:51:59 +0200
Subject: [PVE-User] Mount Ceph Container
In-Reply-To: <AD923329-8C57-4431-B5EC-A9BF06FF0086@linux-nerd.de>
References: <AD923329-8C57-4431-B5EC-A9BF06FF0086@linux-nerd.de>
Message-ID: <20170508065159.6pkyulrwoh4ldpdj@nora.maurer-it.com>

On Fri, May 05, 2017 at 08:52:23PM +0000, Daniel wrote:
> Hi all,
> 
> I have a container which is on Ceph storage: rootfs: ceph:vm-171-disk-1,size=20G
> 
> Is there any way to mount this image on the local Proxmox host? I need this to copy some data ;)
> 

for future reference: "pct mount CTID" should work as well, independent
of the actual storage used. don't forget to "pct unmount CTID" after
you're done..
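
a minimal sketch of that mount/copy/unmount sequence, assuming "pct mount"
exposes the container's root filesystem under /var/lib/lxc/CTID/rootfs.
the function name and the PCT and LXC_ROOT overrides are made up for this
example (they only exist so the sketch can be tried without a real
container).

```shell
# Hypothetical helper: copy one path out of a container via "pct mount".
copy_from_ct() {
    ctid=$1
    src=$2      # path relative to the container's root
    dest=$3     # destination on the host
    pct_cmd=${PCT:-pct}
    lxc_root=${LXC_ROOT:-/var/lib/lxc}

    "$pct_cmd" mount "$ctid" || return 1
    cp -a "$lxc_root/$ctid/rootfs/$src" "$dest"
    status=$?
    # Always unmount again, even if the copy failed.
    "$pct_cmd" unmount "$ctid"
    return $status
}

# Example (paths are placeholders):
#   copy_from_ct 171 etc/hostname /root/ct171-hostname
```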

From mark at tuxis.nl  Tue May  9 08:48:15 2017
From: mark at tuxis.nl (Mark Schouten)
Date: Tue, 09 May 2017 08:48:15 +0200
Subject: [PVE-User] Missing node in ha-manager
In-Reply-To: <79953ddf-5fb4-8a5a-d099-b84da4f93661@proxmox.com>
References: <5B4E1C46-D8AC-4961-8D89-03EBBFF16175@tuxis.nl>
 <b8325c17-e4f9-76e4-fc1b-50ade2ca4a1c@proxmox.com>
 <1493797549.12575.1.camel@tuxis.nl> <1493969897.12575.7.camel@tuxis.nl>
 <79953ddf-5fb4-8a5a-d099-b84da4f93661@proxmox.com>
Message-ID: <1494312495.7429.11.camel@tuxis.nl>

On Fri, 2017-05-05 at 18:14 +0200, Thomas Lamprecht wrote:
> Hi,
> 
> It looks like the PVE cluster filesystem is out of sync or has a
> problematic connection to corosync.
> From corosync's standpoint the node addition worked, and is
> consistent on all nodes, which is good.
> 
> Now, some log from `proxmox03`- the problematic node - would be nice:
> 
> # journalctl -u corosync -u pve-cluster

See attachment.

BTW: /etc/pve/.members is different on all nodes. Is that file really
on pmxcfs, or is it actually a 'local' file ?

-- 
Kerio Operator in de Cloud? https://www.kerioindecloud.nl/
Mark Schouten  | Tuxis Internet Engineering
KvK: 61527076  | http://www.tuxis.nl/
T: 0318 200208 | info at tuxis.nl

From t.lamprecht at proxmox.com  Tue May  9 11:01:43 2017
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Tue, 9 May 2017 11:01:43 +0200
Subject: [PVE-User] PVE behind reverse proxy: different webroot possible?
In-Reply-To: <d4aad664-5f72-bc9b-1d82-7a2fc1a84913@gmail.com>
References: <d4aad664-5f72-bc9b-1d82-7a2fc1a84913@gmail.com>
Message-ID: <c8ff0148-00c0-9071-17e0-d905bb3d0917@proxmox.com>

Hi,

On 05/05/2017 06:18 PM, Uwe Sauter wrote:
> Hi,
>
> I've seen the wiki page [1] that explains how to operate a PVE host behind a reverse proxy.
>
> I'm currently in the situation that I have several services already behind a rev proxy that are accessible with different
> webroots, e.g.
>
> https://example.com/dashboard
> https://example.com/owncloud
> https://example.com/nagios
>
> What changes would be needed so that I could reach the PVE host with
>
> https://example.com/pve ?
>
> Is it even possible?

Hmm, there are some problems, as we mostly use absolute paths for
resources (images, JS and CSS files),
so the loading fails...
I.e., PVE does not know that it is accessed from
https://example.com/pve-node/ and tries to load the resources from the
absolute path /pve/foo.js,
but then https://example.com/pve/foo.js results in a 404/501 error.
The same happens for API calls, AFAIK.
Some web apps allow setting a "ROOT_URL" config entry, where the
access URL can be set.
As there are many places where this would need to be changed, it is not
just a quick fix, though.

But you could work with sub-domains and achieve the same, e.g. a reverse
proxy entry for:
https://pve-node.example.com
should work.

Tested with a default setup and the following nginx configuration:

----
server {
     listen 443;
     server_name test.localhost; # <- FIXME, change
     ssl on;
     ssl_certificate /etc/pve/local/pve-ssl.pem; # OPTIONAL FIXME, change if you want other certs
     ssl_certificate_key /etc/pve/local/pve-ssl.key; # or proxy and PVE are on separated machines

     location / {
         proxy_pass https://localhost:8006/;

         proxy_set_header Host $host;
         proxy_set_header X-Forwarded-Proto https;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_redirect off;

         # really needed?
         proxy_buffering off;
         client_max_body_size 0;
         proxy_connect_timeout 60s;
         proxy_read_timeout 60s;
         proxy_send_timeout 60s;
         send_timeout 60s;
     }
}
----

With this I can access my cluster at https://test.localhost/ just fine.

Change server_name accordingly, and if you run the proxy on a different
server than the PVE host, also adapt the proxy_pass entry and the SSL
certs.
I only tested the situation where nginx ran on the PVE host,
but it should work the same.

AFAICT, the "upstream" config entry described in the wiki is not really
needed. Also, the redirect from port 80 HTTP to 443 HTTPS is just a
convenience.

I'll update the wiki article a bit :)

> Also is it possible to make a whole PVE cluster available behind a rev proxy using [2]?

What do you mean by that? It should be possible to add multiple redirects
for multiple nodes, so it should work.

cheers,
Thomas

From uwe.sauter.de at gmail.com  Tue May  9 11:32:21 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Tue, 9 May 2017 11:32:21 +0200
Subject: [PVE-User] PVE behind reverse proxy: different webroot possible?
In-Reply-To: <c8ff0148-00c0-9071-17e0-d905bb3d0917@proxmox.com>
References: <d4aad664-5f72-bc9b-1d82-7a2fc1a84913@gmail.com>
 <c8ff0148-00c0-9071-17e0-d905bb3d0917@proxmox.com>
Message-ID: <93a69855-93e2-e561-9259-767c6f773237@gmail.com>

Hi Thomas,

thank you for the effort of explaining.

> 
> Hmm, there are some problems as we mostly set absolute paths on resources (images, JS and CSS files)
> so the loading fails...
> I.e., pve does not knows that it is accessed from https://example.com/pve-node/ and tries to load the resources from the absolute
> path /pve/foo.js
> but then https://example.com/pve/foo.js results in a 404/501 error.
> Same happens for api calls, AFAIK.
> Normally some webapps allow to set a "ROOT_URL" config entry, where the access URL can be set.
> As there are many places where this would need to be changed it is not just a quick fix, though.
> 

I was afraid of exactly this situation (as I was bitten by that on some other web app as well).

> But you could work with sub-domains and achieve the same, e.g. a rever proxy entry for:
> https://pve-node.example.com
> should work.

Sub-domains unfortunately are not an option in my case.

> 
> AFAICT, the "upstream" config entry described in the wiki is not really needed. Also the redirect from port 80 HTTP to 443 HTTPS
> is just convenience.
> 
> I'll update the wiki article a bit :)
>> Also is it possible to make a whole PVE cluster available behind a rev proxy using [2]?
> 
> How do you mean that? It should be possible to add multiple redirects for multiple nodes so it should work.

I was thinking of having only one ROOT_URL; nginx would then distribute the load (*) across the several PVE hosts, as described
in [1]. I assume that session persistence is needed for the web GUI to work correctly, hence the "ip_hash" parameter, which
always directs a client's requests to the same server (as long as this server is available).

So if I accessed https://example.org/pve from client A, I would always reach server X, while client B would always reach
server Y (as long as X or Y can be reached).

Regards,

	Uwe

(*) load here being meant as redundant access, not as high number of requests
[1] http://nginx.org/en/docs/http/load_balancing.html#nginx_load_balancing_with_ip_hash

> 
> cheers,
> Thomas
> 

From uwe.sauter.de at gmail.com  Thu May 11 15:49:09 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Thu, 11 May 2017 15:49:09 +0200
Subject: [PVE-User] PVE behind reverse proxy: different webroot possible?
In-Reply-To: <c8ff0148-00c0-9071-17e0-d905bb3d0917@proxmox.com>
References: <d4aad664-5f72-bc9b-1d82-7a2fc1a84913@gmail.com>
 <c8ff0148-00c0-9071-17e0-d905bb3d0917@proxmox.com>
Message-ID: <65045f07-e5f4-22d2-debe-fde54c31edcb@gmail.com>

Hi Thomas,

I've been working on this and found half of a solution. Using Nginx's sub_filter rules I was able to get all the static stuff to be
displayed under a new webroot "/pve".

These are the relevant parts of my Nginx configuration:

/etc/nginx/conf.d/default
####
upstream proxmox-cluster {
  ip_hash;
  server pve-host1:8006;
  server pve-host2:8006;
  server pve-host3:8006;
}

server {
  server_name                myserver.example.org;
  listen                     443 ssl;

  add_header X-Content-Type-Options nosniff;
  add_header X-Frame-Options "SAMEORIGIN";
  add_header X-XSS-Protection "1; mode=block";
  add_header X-Robots-Tag none;
  add_header X-Download-Options noopen;
  add_header X-Permitted-Cross-Domain-Policies none;

# [...] other locations

  location /pve/ {
    proxy_pass https://proxmox-cluster/;

    gzip off;

# filter to rewrite href anchors in content
    sub_filter               'href="/'       'href="/pve/';
# filter to rewrite src anchors in content
    sub_filter               'src="/'        'src="/pve/';
# filters to add prefix to urls (e.g. in pvemanagerlib.js)
    sub_filter               'url: "/api2'   'url: "/pve/api2';
    sub_filter               "url: '/api2"   "url: '/pve/api2";
    sub_filter               'url = "/api2'  'url = "/pve/api2';
    sub_filter               "url = '/api2"  "url = '/pve/api2";
# needed to load js files used by console window
    sub_filter               'url = "/api2'  'url = "/pve/api2';
    sub_filter               "url = '/api2"  "url = '/pve/api2";
#
    sub_filter_last_modified on;
# replace every occasion not only first
    sub_filter_once          off;
# use filter on all types, not only on text/html
    sub_filter_types         *;

    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto https;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

# necessary to tell pveproxy not to compress responses or else sub_filter will not work (as it does not decompress)
    proxy_set_header Accept-Encoding "";

    proxy_redirect off;

    # really needed?
    proxy_buffering off;
    client_max_body_size 0;
    proxy_connect_timeout 60s;
    proxy_read_timeout 60s;
    proxy_send_timeout 60s;
    send_timeout 60s;
  }
}
####

With this configuration I had partial success. What is not working currently:

* Notes on the summary tab of virtual machines. The error displayed is:
  "Error Method 'GET /pve/api2/extjs/nodes/hlrs-pxmx-02/qemu/101/config' not implemented (501)"

* The separate console window does not load. It seems that the URL needed to access a console is constructed by several function calls,
and I don't know where to search for the string I need to replace. It tries to load
"https://myserver.example.org/api2/json/nodes/pve-host1/qemu/101/vncproxy" while it should load
"https://myserver.example.org/pve/api2/json/nodes/pve-host1/qemu/101/vncproxy".

* Inline console view: the result is that the root of my server is displayed: "https://myserver.example.org/index.html" (though
index.html is added by the Nginx directive "index" for location "/").

* There might be more things I missed. Containers are not tested at all (I don't have any).

Do you have any suggestions on those issues?

Regards,

	Uwe

PS: Please also note that the use of single and double quotes inside pvemanagerlib.js is inconsistent (compare lines 4296 and 33465).
This is the reason why I have 2 sub_filters for basically the same replacement.
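
To illustrate why each quote style needs its own rule, here is the same kind of literal replacement done with sed on two sample
lines. The sample strings and the function name are made up for this example; sub_filter likewise matches a fixed string rather than
a pattern, so a rule written with double quotes never matches the single-quoted variant.

```shell
# Hypothetical stand-in for the two sub_filter rules: prefix /api2 with /pve.
add_prefix() {
    sed -e 's|url: "/api2|url: "/pve/api2|g' \
        -e "s|url: '/api2|url: '/pve/api2|g"
}

# One sample line per quote style.
printf '%s\n' 'url: "/api2/json/version"' "url: '/api2/extjs/nodes'" | add_prefix
```

Both lines come out with the /pve prefix; with only the first expression, the single-quoted line would pass through unchanged.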


From IMMO.WETZEL at adtran.com  Thu May 11 17:12:02 2017
From: IMMO.WETZEL at adtran.com (IMMO WETZEL)
Date: Thu, 11 May 2017 15:12:02 +0000
Subject: [PVE-User] dead jobs... how to getrid of them
Message-ID: <F5452071F098E84B91D98FDE02860FAAF1391AB5@ex-mb1.corp.adtran.com>

We had some tasks started from the Proxmox GUI which never finished.
For a couple of days now I have seen the jobs as still running in the task queue.

They are definitely dead. So how do I get rid of them?

Task type: qmstart

With kind regards

Immo Wetzel

From sln at fcoo.dk  Mon May 15 12:10:34 2017
From: sln at fcoo.dk (Søren Laursen)
Date: Mon, 15 May 2017 10:10:34 +0000
Subject: [PVE-User] Proxmox DRBD9 primary/secondary setup?
Message-ID: <A27372E1AD041A41A5EFF3BC7D5F960C73D3FCAF@mail01.fcoo.dk>

Hi,

We have been evaluating Proxmox as a replacement for our ganeti cluster.

Currently we are running 4.4-13 and have enabled drbd9 to have a primary/secondary node setup like we have in Ganeti.

We have some problems, but are not sure whether we have set up the Proxmox cluster wrongly or have misunderstood something.

The Proxmox cluster consists of 3 nodes (HP servers):
pve-manager/4.4-13/7ea56165 (running kernel: 4.4.59-1-pve)

We have enabled the drbd plugin for proxmox and have the following setup:
drbd: drbdstorage
        content images,rootdir
        redundancy 2

We can see that the VMs get replicated, but we have no clue which node is primary/secondary.

And if we press Edit on drbdstorage under the storage tab in the web GUI, nothing happens. Is this normal?

Have we misunderstood anything during the setup of Proxmox with DRBD9, and should we (if we can) downgrade to DRBD8 to get a setup like Ganeti, where we know which node contains what?

Best regards,

Søren

From yannis.milios at gmail.com  Mon May 15 16:56:08 2017
From: yannis.milios at gmail.com (Yannis Milios)
Date: Mon, 15 May 2017 15:56:08 +0100
Subject: [PVE-User] Proxmox DRBD9 primary/secondary setup?
In-Reply-To: <A27372E1AD041A41A5EFF3BC7D5F960C73D3FCAF@mail01.fcoo.dk>
References: <A27372E1AD041A41A5EFF3BC7D5F960C73D3FCAF@mail01.fcoo.dk>
Message-ID: <CAFiF2OpqNGDzkk+rjDX8H8UCa3-+wqNFC1zERUHgvCX_WXYaqA@mail.gmail.com>

>
>
> We can see that the vm's get replicated but we have no clue on which nodes
> is primary/secondary.
>

The resource is in Primary mode on the node where the VM is actually
running. The other replicas are (or should be) in Secondary mode.
The only time a resource is Primary on more than one node is during
live migration. After a successful live migration, the resource on the
source node switches back to Secondary mode. You can observe this
behaviour on the CLI with the drbdadm or drbd-overview commands.
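
A minimal sketch of such a check, assuming the DRBD userland tools are installed on the node. The resource name vm-100-disk-1 is made up for illustration, and the exact output format depends on the DRBD version:

```shell
#!/bin/sh
# Print the DRBD role of one resource as seen from the local node.
# Assumption: drbdadm is in PATH; "vm-100-disk-1" below is a hypothetical name.
drbd_role() {
    res="$1"
    if ! command -v drbdadm >/dev/null 2>&1; then
        echo "drbdadm not found" >&2
        return 1
    fi
    # Prints the role (Primary or Secondary) of the given resource.
    drbdadm role "$res"
}

# Usage (run on each node and compare the answers):
#   drbd_role vm-100-disk-1
#   drbd-overview        # summary of all resources on this node
```

Running it on every node that holds a replica should show Primary only on the node where the VM runs.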

>
> And if we in the webgui press Edit on drbdstorage under the storage tab
> nothing happens - is this normal?
>

Yes it's normal. PVE does not support DRBD management over the GUI. You
need to use the usual DRBD cli commands to manage the cluster.

>
> Have we misunderstood anything during the setup of Proxmox with DRBD9, and
> should we (if we can) downgrade to drbd8, to get a setup like ganeti where
> we know which node contains what.
>

PVE dropped support for DRBD9 after the licensing model for drbdmanage
changed. I think they are going to revert to DRBD8 in the upcoming
version of PVE (5). More info about these topics here:

https://pve.proxmox.com/wiki/DRBD9

LINBIT has a dedicated DRBD9 repository for PVE users:

https://docs.linbit.com/doc/users-guide-90/s-proxmox-install/

Yannis

From olivier.borowski at 4murs.fr  Mon May 15 17:51:38 2017
From: olivier.borowski at 4murs.fr (Olivier Borowski)
Date: Mon, 15 May 2017 17:51:38 +0200
Subject: [PVE-User] Proxmox DRBD9 primary/secondary setup?
In-Reply-To: <CAFiF2OpqNGDzkk+rjDX8H8UCa3-+wqNFC1zERUHgvCX_WXYaqA@mail.gmail.com>
References: <A27372E1AD041A41A5EFF3BC7D5F960C73D3FCAF@mail01.fcoo.dk>
 <CAFiF2OpqNGDzkk+rjDX8H8UCa3-+wqNFC1zERUHgvCX_WXYaqA@mail.gmail.com>
Message-ID: <1a1dd524-5056-fbfa-4bb6-e1820c29c0c1@4murs.fr>

LINBIT reverted its licence back to the GPL some months ago:
https://www.linbit.com/en/drbd-manage-faq/

I already talked about that on the forum:
https://forum.proxmox.com/threads/drbdmanage-license-change.30404

So please give additional details about future DRBD support in Proxmox.
Proxmox should clarify whether we should give up using DRBD on Proxmox 
and switch to Ceph for example...

Olivier

On 15/05/2017 at 16:56, Yannis Milios wrote:
> PVE  dropped support for DRBD9 after the change of the licensing model for
> drbdmanage. I think they are going to revert back to drbd8 in the upcoming
> version of PVE(5). More info about these topics here:
>
> https://pve.proxmox.com/wiki/DRBD9
>
> LINBIT has a dedicated DRBD9 repository for PVE users:
>
> https://docs.linbit.com/doc/users-guide-90/s-proxmox-install/
>
> Yannis
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From dietmar at proxmox.com  Mon May 15 18:24:54 2017
From: dietmar at proxmox.com (Dietmar Maurer)
Date: Mon, 15 May 2017 18:24:54 +0200 (CEST)
Subject: [PVE-User] Proxmox DRBD9 primary/secondary setup?
In-Reply-To: <1a1dd524-5056-fbfa-4bb6-e1820c29c0c1@4murs.fr>
References: <A27372E1AD041A41A5EFF3BC7D5F960C73D3FCAF@mail01.fcoo.dk>
 <CAFiF2OpqNGDzkk+rjDX8H8UCa3-+wqNFC1zERUHgvCX_WXYaqA@mail.gmail.com>
 <1a1dd524-5056-fbfa-4bb6-e1820c29c0c1@4murs.fr>
Message-ID: <1201562625.114.1494865494553@webmail.proxmox.com>

> So please give additional details about future DRBD support in Proxmox.
> Proxmox should clarify whether we should give up using DRBD on Proxmox 
> and switch to Ceph for example...

DRBD9 is supported by LINBIT directly. Proxmox will ship the default 
upstream kernel module for drbd (whatever version that is).

From t.lamprecht at proxmox.com  Tue May 16 10:31:53 2017
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Tue, 16 May 2017 10:31:53 +0200
Subject: [PVE-User] Missing node in ha-manager
In-Reply-To: <1494312495.7429.11.camel@tuxis.nl>
References: <5B4E1C46-D8AC-4961-8D89-03EBBFF16175@tuxis.nl>
 <b8325c17-e4f9-76e4-fc1b-50ade2ca4a1c@proxmox.com>
 <1493797549.12575.1.camel@tuxis.nl> <1493969897.12575.7.camel@tuxis.nl>
 <79953ddf-5fb4-8a5a-d099-b84da4f93661@proxmox.com>
 <1494312495.7429.11.camel@tuxis.nl>
Message-ID: <f9f385c6-cb6b-dc14-2b70-00999183e652@proxmox.com>

Hi,

On 05/09/2017 08:48 AM, Mark Schouten wrote:
 > On Fri, 2017-05-05 at 18:14 +0200, Thomas Lamprecht wrote:
 >> Hi,
 >>
 >> It looks like the PVE cluster filesystem is out of sync or has a
 >> problematic connection to corosync.
 >>  From corosyncs stand point the node addition worked, and is
 >> consistent
 >> on all nodes, which is good.
 >>
 >> Now, some log from `proxmox03`- the problematic node - would be nice:
 >>
 >> # journalctl -u corosync -u pve-cluster
 > See attachment.

Hmm, you had frequent changes where the 4th node left and then rejoined the
cluster, combined with a few message retransmits.
The frequency of the join/leave cycles is strange; besides that, it's not too
strange.
What is strange is that while the status module of the cfs got all members, the
dcdb (decentral database) did not receive the 4th node's updates...

 >
 > BTW: /etc/pve/.members is different on all nodes. Is that file really
 > on pmxcfs, or is it actually a 'local' file ?
 >

It isn't a local file per se, but it is in fact a virtual one, i.e. its
content is read-only and gets produced by pmxcfs on the fly (locally).
It pulls the information from corosync and its own internal state.
This means that it should be completely the same on each node; when adding
nodes or similar, it can naturally differ for a very short amount of time.

How is it different on the different nodes?
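
To answer that question, the file can be dumped from every node side by side. This is only a sketch: it assumes root SSH access between the nodes, and apart from proxmox03 and proxmox04 the node names are not known from this thread.

```shell
#!/bin/sh
# Print /etc/pve/.members from each given node for manual comparison.
# Assumption: root SSH access works; FETCH can be overridden for testing.
FETCH=${FETCH:-ssh}

show_members() {
    for node in "$@"; do
        echo "== $node =="
        $FETCH "root@$node" cat /etc/pve/.members
    done
}

# Usage:
#   show_members proxmox03 proxmox04
```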

Honestly, I would just stop the HA services (this marks the VMs currently under
HA) and then do a clean restart of the pve-cluster and corosync services.
I'd do this for all nodes, but not all at the same time :) This is as safe as it
can get, no VM should be interrupted, and I expect that even if we knew the
full trigger of this, it would result in the same action.

Before that do the omping test to see if multicast works as expected in your
setup:
https://pve.proxmox.com/pve-docs/chapter-pvecm.html#cluster-network-requirements

If this test goes through, the restart should fix it.
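
The two steps could be sketched as follows. The omping parameters are taken from the linked documentation as I remember it, the unit names assume a standard PVE 4.x install, and the IP addresses in the usage note are placeholders. Setting RUN=echo prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: multicast check, then a per-node restart of the cluster stack.
RUN=${RUN:-}    # set RUN=echo to preview the commands without running them

# Multicast test. Start it on every node at roughly the same time and
# pass the addresses of all cluster nodes as arguments.
multicast_test() {
    $RUN omping -c 10000 -i 0.001 -F -q "$@"
}

# Restart sequence for ONE node. Repeat node by node, never in parallel.
restart_cluster_stack() {
    # Stop the HA services first, then restart the cluster services,
    # then bring the HA services back.
    $RUN systemctl stop pve-ha-lrm pve-ha-crm &&
    $RUN systemctl restart pve-cluster corosync &&
    $RUN systemctl start pve-ha-lrm pve-ha-crm
}

# Usage:
#   multicast_test 192.0.2.11 192.0.2.12 192.0.2.13 192.0.2.14
#   restart_cluster_stack
```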
I know it's not ideal, but I don't have much time available currently, and
such problems are often the result of small details of a setup or of the node
addition process.

cheers,
Thomas

From mark at tuxis.nl  Tue May 16 10:39:53 2017
From: mark at tuxis.nl (Mark Schouten)
Date: Tue, 16 May 2017 10:39:53 +0200
Subject: [PVE-User] Missing node in ha-manager
In-Reply-To: <f9f385c6-cb6b-dc14-2b70-00999183e652@proxmox.com>
References: <5B4E1C46-D8AC-4961-8D89-03EBBFF16175@tuxis.nl>
 <b8325c17-e4f9-76e4-fc1b-50ade2ca4a1c@proxmox.com>
 <1493797549.12575.1.camel@tuxis.nl> <1493969897.12575.7.camel@tuxis.nl>
 <79953ddf-5fb4-8a5a-d099-b84da4f93661@proxmox.com>
 <1494312495.7429.11.camel@tuxis.nl>
 <f9f385c6-cb6b-dc14-2b70-00999183e652@proxmox.com>
Message-ID: <1494923993.17872.3.camel@tuxis.nl>

On Tue, 2017-05-16 at 10:31 +0200, Thomas Lamprecht wrote:
> Honestly, I would just stop the HA services (this marks the VMs
> currently under HA) and then do a clean restart of the pve-cluster
> corosync services, I'd do this for all nodes but not all at the same
> time :) This is as safe as it can get, no VM should be interrupted,
> and I expected that even if we know the full trigger of this it will
> result in the same action.

Sorry, forgot to let everybody know. I dist-upgraded the cluster last
week, which restarted clustering and that 'fixed' everything.

Thanks for your support.

-- 
Kerio Operator in de Cloud? https://www.kerioindecloud.nl/
Mark Schouten  | Tuxis Internet Engineering
KvK: 61527076  | http://www.tuxis.nl/
T: 0318 200208 | info at tuxis.nl

From uwe.sauter.de at gmail.com  Tue May 16 20:56:31 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Tue, 16 May 2017 20:56:31 +0200
Subject: [PVE-User] WebUI - Ceph OSD page - remove vs. destroy
Message-ID: <6ba386f7-08d9-37d9-cdec-636a0d63d708@gmail.com>

Hi,

I just noticed an (intentional?) inconsistency between the WebUI's Ceph OSD page vs. the tasks view on the bottom and
the CLI:

If you go to Datacenter -> node -> Ceph -> OSD and select one of the OSDs you can "remove" it with a button in the upper
right corner. If you do so the task is called "Ceph OSD osd.# - Destroy", which matches the CLI command "pveceph
destroyosd #".

Is there any reason for this inconsistency? If not, I'd suggest to change the WebUI's button to reflect that this is not
only a "remove" action but really "destroys" the data on the disk (in a way that makes recovery not so easy. Yes I know
that the disk isn't really wiped, but the default action of the button is to remove the partition table.)

Regards,

	Uwe

From e.kasper at proxmox.com  Wed May 17 10:16:39 2017
From: e.kasper at proxmox.com (Emmanuel Kasper)
Date: Wed, 17 May 2017 10:16:39 +0200
Subject: [PVE-User] WebUI - Ceph OSD page - remove vs. destroy
In-Reply-To: <6ba386f7-08d9-37d9-cdec-636a0d63d708@gmail.com>
References: <6ba386f7-08d9-37d9-cdec-636a0d63d708@gmail.com>
Message-ID: <84411557-99be-ea30-be3e-254e6a991326@proxmox.com>

On 05/16/2017 08:56 PM, Uwe Sauter wrote:
> Hi,
> 
> I just noticed an (intentional?) inconsistency between the WebUI's Ceph OSD page vs. the tasks view on the bottom and
> the CLI:
> 
> If you go to Datacenter -> node -> Ceph -> OSD and select one of the OSDs you can "remove" it with a button in the upper
> right corner. If you do so the task is called "Ceph OSD osd.# - Destroy", which matches the CLI command "pveceph
> destroyosd #".
> 
> Is there any reason for this inconsistency? If not, I'd suggest to change the WebUI's button to reflect that this is not
> only a "remove" action but really "destroys" the data on the disk (in a way that makes recovery not so easy. Yes I know
> that the disk isn't really wiped, but the default action of the button is to remove the partition table.)
> 
This makes sense, especially since we already have the "Destroy" string
localized.
We'll look at it.

From uwe.sauter.de at gmail.com  Thu May 18 11:40:58 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Thu, 18 May 2017 11:40:58 +0200
Subject: [PVE-User] Backup to Ceph / NFSv4
Message-ID: <8dd85ff3-b5ed-e3f7-a98b-58a2b9bec85f@gmail.com>

Hi,

as my Proxmox hosts don't have enough local storage, I wanted to do backups into the "network". One option that came to mind was
using the existing Ceph installation for backups. What's currently missing for that (as far as I can tell) is Proxmox support
for a Ceph-backed filesystem (CephFS), including a Ceph metadata server.

CephFS would also allow storing ISO images centrally, so that there is no need for local copies (which are required on every host
a VM can be migrated to, if the VM's DVD drive is currently connected to the ISO).

Are there plans to support CephFS in the future?

Lacking CephFS support I wanted to use a remote NFSv4 server for backups (as TCP-based NFSv4 is easier to configure in firewalls).
Is the information on [1] still correct? With PVE 4.4 I don't have the option to select a NFS version (or any other NFS mount
option at all) but the wiki page suggests otherwise.

Will TCP-only NFSv4 work with PVE? My iptables config on the NFS server side is:

[...]
-A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 111 -m comment --comment "NFS: Portmapper" -j ACCEPT
-A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 662 -m comment --comment "NFS: Statd" -j ACCEPT
-A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 875 -m comment --comment "NFS: Quota" -j ACCEPT
-A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 892 -m comment --comment "NFS: Mountd" -j ACCEPT
-A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 2049 -m comment --comment "NFS: NFSv4" -j ACCEPT
-A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 32803 -m comment --comment "NFS: Lockd" -j ACCEPT
[...]

Regards,

	Uwe

[1] https://pve.proxmox.com/wiki/Storage:_NFS

From uwe.sauter.de at gmail.com  Thu May 18 14:56:04 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Thu, 18 May 2017 14:56:04 +0200
Subject: [PVE-User] Backup to Ceph / NFSv4
In-Reply-To: <8dd85ff3-b5ed-e3f7-a98b-58a2b9bec85f@gmail.com>
References: <8dd85ff3-b5ed-e3f7-a98b-58a2b9bec85f@gmail.com>
Message-ID: <e4918dfa-1e7e-a9ba-79d8-1dec60570d14@gmail.com>

Followup to NFSv4:

It seems that TCP-based NFSv4 is currently not possible due to several issues:

* PVE uses "pvesm nfsscan <host>" to get a list of exports from the NFS server. This tool internally uses showmount which uses UDP
as protocol and has no fallback to TCP. Thus, the exports lookup fails and PVE does not mount.

# pvesm nfsscan <host>
clnt_create: RPC: Port mapper failure - Unable to receive: errno 113 (No route to host)

* The WebUI does not allow configuring mount options, which would be needed e.g. to select the NFS protocol version.

* Manually trying to mount NFSv4 fails with:

# mount -t nfs -o vers=4,rw,sync <host>:$SHARE /mnt
mount.nfs: mounting aurel:/proxmox-infra failed, reason given by server: No such file or directory

Tested with $SHARE being several values, e.g. /exports/proxmox-infra, /backup/proxmox-infra, /proxmox-infra

On the server, /etc/exports looks like:

/exports               192.168.253.192/26(rw,fsid=0,no_subtree_check,sync,crossmnt)
/exports/proxmox-infra 192.168.253.192/26(rw,sync)

and /exports/proxmox-infra is a symlink to /backup/proxmox-infra.

On the other hand, manually mounting NFSv3 works:

# mount -t nfs -o vers=3,rw,sync <host>:/backup/proxmox-infra /mnt

but again, because showmount does not use TCP, PVE will not mount it automatically.
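
Mount options that the WebUI does not expose can, as far as I know, be given when the storage is defined on the command line. A minimal sketch, assuming `pvesm add nfs` accepts `--options` on this PVE version; the storage ID backup-nfs is hypothetical. This only sets the mount options and does not remove the showmount-based export check.

```shell
#!/bin/sh
# Define an NFS backup storage with explicit mount options.
# Assumption: "pvesm add nfs" supports --options; PVESM is overridable for testing.
PVESM=${PVESM:-pvesm}

add_nfs_backup_storage() {
    storage="$1"      # storage ID as shown in PVE (hypothetical name in the usage)
    server="$2"       # NFS server host name or IP
    export_path="$3"  # exported directory on the server
    $PVESM add nfs "$storage" --server "$server" --export "$export_path" \
        --path "/mnt/pve/$storage" --content backup --options vers=3
}

# Usage:
#   add_nfs_backup_storage backup-nfs <host> /backup/proxmox-infra
```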

Regards,

	Uwe

On 18.05.2017 at 11:40, Uwe Sauter wrote:
> Hi,
> 
> as my Proxmox hosts don't have enough local storage I wanted to do backups into the "network". One option that came into mind was
> using the existing Ceph installation to do backups. What's currently missing for that (as far as I can tell) is Proxmox support
> for a Ceph-backed filesystem (CephFS) including a Ceph metadata server.
> 
> CephFS would also allow to store ISO images so that there is no need to have local copies (which are required on every host where
> a VM can be migrated to (if the VM's DVD drive is currently connected to the ISO)).
> 
> Are there plans to support CephFS in the future?
> 
> 
> Lacking CephFS support I wanted to use a remote NFSv4 server for backups (as TCP-based NFSv4 is easier to configure in firewalls).
> Is the information on [1] still correct? With PVE 4.4 I don't have the option to select a NFS version (or any other NFS mount
> option at all) but the wiki page suggests otherwise.
> 
> Will TCP-only NFSv4 work with PVE? My iptables config on the NFS server side is:
> 
> [...]
> -A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 111 -m comment --comment "NFS: Portmapper" -j ACCEPT
> -A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 662 -m comment --comment "NFS: Statd" -j ACCEPT
> -A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 875 -m comment --comment "NFS: Quota" -j ACCEPT
> -A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 892 -m comment --comment "NFS: Mountd" -j ACCEPT
> -A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 2049 -m comment --comment "NFS: NFSv4" -j ACCEPT
> -A INPUT -s 192.168.253.192/26 -p tcp -m state --state NEW -m tcp --dport 32803 -m comment --comment "NFS: Lockd" -j ACCEPT
> [...]
> 
> 
> 
> Regards,
> 
> 	Uwe
> 
> 
> 
> [1] https://pve.proxmox.com/wiki/Storage:_NFS
> 

From e.kasper at proxmox.com  Thu May 18 15:04:56 2017
From: e.kasper at proxmox.com (Emmanuel Kasper)
Date: Thu, 18 May 2017 15:04:56 +0200
Subject: [PVE-User] Backup to Ceph / NFSv4
In-Reply-To: <e4918dfa-1e7e-a9ba-79d8-1dec60570d14@gmail.com>
References: <8dd85ff3-b5ed-e3f7-a98b-58a2b9bec85f@gmail.com>
 <e4918dfa-1e7e-a9ba-79d8-1dec60570d14@gmail.com>
Message-ID: <f8e9fb9d-26bf-f4fa-99d5-d365ef73984e@proxmox.com>

On 05/18/2017 02:56 PM, Uwe Sauter wrote:
> # mount -t nfs -o vers=4,rw,sync <host>:$SHARE /mnt
> mount.nfs: mounting aurel:/proxmox-infra failed, reason given by server: No such file or directory

aurel:/proxmox-infra

are you using the right path here?
looking at your exports file you should try to mount
aurel:/exports/proxmox-infra

command line nfsv4 mounts should work in any case

From uwe.sauter.de at gmail.com  Thu May 18 15:31:57 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Thu, 18 May 2017 15:31:57 +0200
Subject: [PVE-User] Backup to Ceph / NFSv4
In-Reply-To: <f8e9fb9d-26bf-f4fa-99d5-d365ef73984e@proxmox.com>
References: <8dd85ff3-b5ed-e3f7-a98b-58a2b9bec85f@gmail.com>
 <e4918dfa-1e7e-a9ba-79d8-1dec60570d14@gmail.com>
 <f8e9fb9d-26bf-f4fa-99d5-d365ef73984e@proxmox.com>
Message-ID: <fcf0ed3b-c173-f70e-6cfc-78e2e0984fdb@gmail.com>

On 18.05.2017 at 15:04, Emmanuel Kasper wrote:
> 
> 
> On 05/18/2017 02:56 PM, Uwe Sauter wrote:
>> # mount -t nfs -o vers=4,rw,sync <host>:$SHARE /mnt
>> mount.nfs: mounting aurel:/proxmox-infra failed, reason given by server: No such file or directory
> 
> aurel:/proxmox-infra
> 
> are you using here the right path ?
> looking at your exports file you should try to mount
> aurel:/exports/proxmox-infra

As I wrote:

"Tested with $SHARE being several values, e.g. /exports/proxmox-infra, /backup/proxmox-infra, /proxmox-infra"

always the same result. My understanding of NFSv4 shares is that you specify the "root-share" (in this case "/exports") on the
server-side and the client will only see shares within that root-share. And the client-side path shouldn't contain the root-share
path.

So it would be like this:

server-side            | client-side
----------------------------------------------
/exports               | <host>:/
/exports/backup        | <host>:/backups
/exports/movies        | <host>:/movies

and every sub-folder on the server-side is either a "bind"-mount or a symlink to the real directory / filesystem.

But please enlighten me if this is wrong.
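
For reference, a sketch of the bind-mount variant of this layout. One possible (unconfirmed) reason for the failure above is that NFS symlinks are resolved on the client, so a symlink pointing outside the exported tree cannot be followed; a bind mount avoids that. The paths are the ones from this thread, and RUN=echo previews the commands.

```shell
#!/bin/sh
# Expose a real directory inside an NFSv4 pseudo-root via a bind mount.
# Assumption: any existing symlink at the target has been removed first.
RUN=${RUN:-}    # set RUN=echo to preview the commands without running them

export_via_bind_mount() {
    real_dir="$1"     # e.g. /backup/proxmox-infra
    pseudo_root="$2"  # the fsid=0 export, e.g. /exports
    name="$3"         # name the client will see below <host>:/
    $RUN mkdir -p "$pseudo_root/$name" &&
    $RUN mount --bind "$real_dir" "$pseudo_root/$name" &&
    $RUN exportfs -ra   # re-read /etc/exports
}

# Usage on the server:
#   export_via_bind_mount /backup/proxmox-infra /exports proxmox-infra
# Then on the client:
#   mount -t nfs -o vers=4,rw,sync <host>:/proxmox-infra /mnt
```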

> 
> command line nfsv4 mounts should work in any case

They didn't, though it might be that NFSv4 would need something else on the server-side that is not configured / running.

My problem is now solved by allowing udp/111 for showmount and using NFSv3. But this solution is a bit unsatisfying.
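
For completeness, the extra rule would look roughly like this, in the same format as the rules in my first mail. The helper below only prints the line; the comment text is my own choice.

```shell
#!/bin/sh
# Print an iptables-save style rule that allows UDP to the portmapper
# (port 111) from the given subnet, matching the TCP rules shown earlier.
nfs_portmapper_udp_rule() {
    subnet="$1"
    printf '%s\n' "-A INPUT -s $subnet -p udp -m state --state NEW -m udp --dport 111 -m comment --comment \"NFS: Portmapper (UDP)\" -j ACCEPT"
}

# Usage:
#   nfs_portmapper_udp_rule 192.168.253.192/26
```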


From steve at easy2boot.com  Thu May 18 19:55:38 2017
From: steve at easy2boot.com (Steve)
Date: Thu, 18 May 2017 18:55:38 +0100
Subject: [PVE-User] sbin/unconfigured.sh
Message-ID: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>

In version 3.2 ISO there was this script to start an install.
This file is not in recent versions.
Is there an equivalent way to start an install with v4.4 or any other
recent version?
Thanks
Steve

From uwe.sauter.de at gmail.com  Thu May 18 20:04:30 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Thu, 18 May 2017 20:04:30 +0200
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
Message-ID: <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>

Don't know what your situation is but there is a wiki page [1] that describes the installation of Proxmox on top of an
existing Debian.

[1] https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Jessie

On 18.05.2017 at 19:55, Steve wrote:
> In version 3.2 ISO there was this script to start an install.
> This file is not in recent versions.
> Is there an equivalent way to start an install with v4.4 or any other
> recent version?
> Thanks
> Steve

From steve at easy2boot.com  Thu May 18 20:08:36 2017
From: steve at easy2boot.com (Steve)
Date: Thu, 18 May 2017 19:08:36 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
Message-ID: <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>

Thanks for the quick reply.
I am booting from the ISO file itself which is on a multiboot USB drive.
In previous versions, you could boot to the shell, mount the ISO as /mnt
and then start the install by running unconfigured.sh.

So basically, if I boot to the shell, how can I start the install from the
contents of the CD/ISO?

On 18 May 2017 at 19:04, Uwe Sauter <uwe.sauter.de at gmail.com> wrote:

> Don't know what your situation is but there is a wiki page [1] that
> describes the installation of Proxmox on top of an
> existing Debian.
>
> [1] https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Jessie
>
> On 18.05.2017 at 19:55, Steve wrote:
> > In version 3.2 ISO there was this script to start an install.
> > This file is not in recent versions.
> > Is there an equivalent way to start an install with v4.4 or any other
> > recent version?
> > Thanks
> > Steve

From uwe.sauter.de at gmail.com  Thu May 18 20:09:46 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Thu, 18 May 2017 20:09:46 +0200
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
Message-ID: <270bd41e-5912-9783-904e-7f9ec18c6ddf@gmail.com>

Sorry, never did that. No idea.

On 18.05.2017 at 20:08, Steve wrote:
> Thanks for the quick reply.
> I am booting from the ISO file itself which is on a multiboot USB drive.
> In previous versions, you could boot to the shell, mount the ISO as /mnt and then start the install by running
> unconfigured.sh.
> 
> So basically, if I boot to the shell, how can I start the install from the contents of the CD/ISO.
> 
> 
> 
> On 18 May 2017 at 19:04, Uwe Sauter <uwe.sauter.de at gmail.com <mailto:uwe.sauter.de at gmail.com>> wrote:
> 
>     Don't know what your situation is but there is a wiki page [1] that describes the installation of Proxmox on top of an
>     existing Debian.
> 
>     [1] https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Jessie
>     <https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Jessie>
> 
>     On 18.05.2017 at 19:55, Steve wrote:
>     > In version 3.2 ISO there was this script to start an install.
>     > This file is not in recent versions.
>     > Is there an equivalent way to start an install with v4.4 or any other
>     > recent version?
>     > Thanks
>     > Steve

From f.gruenbichler at proxmox.com  Fri May 19 08:34:47 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Fri, 19 May 2017 08:34:47 +0200
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
Message-ID: <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>

On Thu, May 18, 2017 at 07:08:36PM +0100, Steve wrote:
> Thanks for the quick reply.
> I am booting from the ISO file itself which is on a multiboot USB drive.
> In previous versions, you could boot to the shell, mount the ISO as /mnt
> and then start the install by running unconfigured.sh.
> 
> So basically, if I boot to the shell, how can I start the install from the
> contents of the CD/ISO.
> 

you need to mount the contained squashfs files in the right order on the
right places, make an overlayfs and then bind mount the iso into that.
then you can chroot and run the unconfigured.sh script. basically do all
the steps that the initrd contained on the iso does ;)
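
a rough sketch of those steps, NOT copied from the real initrd: the squashfs file names are the ones on the ISO, but every mount point (/squash, /overlay, /newroot, /cdrom inside the new root) is an assumption made for illustration, and the installer may need more than this (e.g. /proc, /sys and /dev inside the chroot). set RUN=echo to only print the commands.

```shell
#!/bin/sh
# Sketch: stack the ISO's squashfs images with overlayfs and start the installer.
# All mount points are hypothetical; check the initrd's init script for the real ones.
RUN=${RUN:-}                  # set RUN=echo to preview the commands
ISO=${ISO:-/mnt}              # where the ISO contents are mounted
NEWROOT=${NEWROOT:-/newroot}  # hypothetical root for the chroot

start_installer() {
    $RUN mkdir -p /squash/base /squash/installer /overlay "$NEWROOT" &&
    $RUN mount -t squashfs -o ro,loop "$ISO/pve-base.squashfs" /squash/base &&
    $RUN mount -t squashfs -o ro,loop "$ISO/pve-installer.squashfs" /squash/installer &&
    # Writable layer on tmpfs; the leftmost lowerdir is the topmost layer.
    $RUN mount -t tmpfs tmpfs /overlay &&
    $RUN mkdir -p /overlay/upper /overlay/work &&
    $RUN mount -t overlay -o lowerdir=/squash/installer:/squash/base,upperdir=/overlay/upper,workdir=/overlay/work overlay "$NEWROOT" &&
    # Make the ISO contents visible inside the new root (target path assumed).
    $RUN mkdir -p "$NEWROOT/cdrom" &&
    $RUN mount --bind "$ISO" "$NEWROOT/cdrom" &&
    $RUN chroot "$NEWROOT" /sbin/unconfigured.sh
}

# Usage:
#   RUN=echo; start_installer   # preview first
```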

From steve at easy2boot.com  Fri May 19 09:28:17 2017
From: steve at easy2boot.com (Steve)
Date: Fri, 19 May 2017 08:28:17 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
Message-ID: <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>

Thanks
I tried that.
I made a new .sh from the portion of the initrd that mounts all the
squashfs files and runs unconfigured.sh.
It seems to almost work, but it gets to Detecting network settings... done
and then says
\nInstallation aborted - unable to continue

any ideas why?

On 19 May 2017 at 07:34, Fabian Grünbichler <f.gruenbichler at proxmox.com>
wrote:

> On Thu, May 18, 2017 at 07:08:36PM +0100, Steve wrote:
> > Thanks for the quick reply.
> > I am booting from the ISO file itself which is on a multiboot USB drive.
> > In previous versions, you could boot to the shell, mount the ISO as /mnt
> > and then start the install by running unconfigured.sh.
> >
> > So basically, if I boot to the shell, how can I start the install from
> the
> > contents of the CD/ISO.
> >
>
> you need to mount the contained squashfs files in the right order on the
> right places, make an overlayfs and then bind mount the iso into that.
> then you can chroot and run the unconfigured.sh script. basically do all
> the steps that the initrd contained on the iso does ;)

From steve at easy2boot.com  Fri May 19 09:36:46 2017
From: steve at easy2boot.com (Steve)
Date: Fri, 19 May 2017 08:36:46 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
Message-ID: <CALnr98CQWfKpxoQ4DZO+Zu-Q1skq-wbfH8wXv8FuzfRf6iUvFw@mail.gmail.com>

P.S.
As an experiment, I even inserted a dd'd flash drive containing the proxmox
ISO (which works if I boot to it) and then booted from my multiboot USB
drive to the shell and then mounted /mnt as the flash drive
mount /dev/sdc /mnt

Then I ran the modified script.
It gave the same failure message.
So that seems to indicate that something is wrong with the script or the
environment when I run the script?

The script I run starts with
if [ -f /mnt/pve-installer.squashfs ]; then
    echo this is a Proxmox VE installation CD

    if ! mount -t squashfs -o ro,loop /mnt/pve-base.squashfs /mnt/.pve-base; then
        debugsh_err_reboot "mount pve-base.squashfs failed"
    fi

any ideas what is not set up before this?

Steve

On 19 May 2017 at 07:34, Fabian Grünbichler <f.gruenbichler at proxmox.com>
wrote:

> On Thu, May 18, 2017 at 07:08:36PM +0100, Steve wrote:
> > Thanks for the quick reply.
> > I am booting from the ISO file itself which is on a multiboot USB drive.
> > In previous versions, you could boot to the shell, mount the ISO as /mnt
> > and then start the install by running unconfigured.sh.
> >
> > So basically, if I boot to the shell, how can I start the install from
> the
> > contents of the CD/ISO.
> >
>
> you need to mount the contained squashfs files in the right order on the
> right places, make an overlayfs and then bind mount the iso into that.
> then you can chroot and run the unconfigured.sh script. basically do all
> the steps that the initrd contained on the iso does ;)

From f.gruenbichler at proxmox.com  Fri May 19 10:12:05 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Fri, 19 May 2017 10:12:05 +0200
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
Message-ID: <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>

On Fri, May 19, 2017 at 08:28:17AM +0100, Steve wrote:
> Thanks
> I tried that.
> I made a new .sh from the portion of the initrd that mounts all the
> squashfs files and runs unconfigured.sh.
> It seems to almost work, but it gets to Detecting network settings... done
> and then says
> \nInstallation aborted - unable to continue
> 
> any ideas why?
> 

boot it with "proxdebug" as part of the kernel cmdline and examine the
logs. you can also manually run "xinit" in the debug shell to restart
the installer from the debug shell.

From steve at easy2boot.com  Fri May 19 10:40:54 2017
From: steve at easy2boot.com (Steve)
Date: Fri, 19 May 2017 09:40:54 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
 <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
Message-ID: <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>

I tried proxdebug. No extra messages are generated after the network has
initialised.
There is no log in /tmp folder ? Has the log moved?

How do I run xinit? Where is it?
Do you mean init? If I run this it says must be run as PID 1 - how can I
fix this (sorry, I am not a linux guru!)

On 19 May 2017 at 09:12, Fabian Grünbichler <f.gruenbichler at proxmox.com>
wrote:

> On Fri, May 19, 2017 at 08:28:17AM +0100, Steve wrote:
> > Thanks
> > I tried that.
> > I made a new .sh from the portion of the initrd that mounts all the
> > squashfs files and runs unconfigured.sh.
> > It seems to almost work, but it gets to Detecting network settings...
> done
> > and then says
> > \nInstallation aborted - unable to continue
> >
> > any ideas why?
> >
>
> boot it with "proxdebug" as part of the kernel cmdline and examine the
> logs. you can also manually run "xinit" in the debug shell to restart
> the installer from the debug shell.
>

From uwe.sauter.de at gmail.com  Fri May 19 10:43:26 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 19 May 2017 10:43:26 +0200
Subject: [PVE-User] Problems with backup process and NFS
Message-ID: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>

Hi all,

after having succeeded to have an almost TCP-based NFS share mounted (see yesterday's thread) I'm now struggling with the backup
process itself.

Definition of NFS share in /etc/pve/storage.cfg is:

nfs: aurel
	export /backup/proxmox-infra
	path /mnt/pve/aurel
	server <ip of server>
	content backup
	maxfiles 30
	options vers=3

With this definition, <server>:/backup/proxmox-infra is always mounted on /mnt/pve/aurel on every one of my PVE servers.
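
For reference, the storage definition above should correspond roughly to a manual mount like the one printed by this sketch. The helper name and the 192.0.2.10 placeholder address are made up for illustration, and the exact command line PVE builds internally may differ.

```shell
#!/bin/sh
# Print a mount command roughly equivalent to an "nfs:" storage entry.
# nfs_mount_cmd is a hypothetical helper, not part of PVE.
nfs_mount_cmd() {
    server="$1"; export_path="$2"; mountpoint="$3"; options="$4"
    cmd="mount -t nfs $server:$export_path $mountpoint"
    # only append -o when options were given (e.g. vers=3)
    if [ -n "$options" ]; then
        cmd="$cmd -o $options"
    fi
    echo "$cmd"
}

# 192.0.2.10 stands in for <ip of server>
nfs_mount_cmd 192.0.2.10 /backup/proxmox-infra /mnt/pve/aurel vers=3
```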

Definition of backup job is:
  Nodes: all
  Storage: aurel
  Day of week: Mon-Sun
  Start time: 22:00
  Selection mode: Include selected VMs
  Send email to: <my email address>
  Email notification: always
  compression: lzo
  Mode: Snapshot
  Enable: true
  VMs: <selection of VMs>

Issue 1:
Backups failed tonight with "Error: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted".

Question 1:
Do NFS shares that are used for backups need to be "disabled" so that the backup process itself mounts the share in preparation
of the backup? If so, there is a problem with the WebUI, as I cannot select a disabled storage location when defining or editing a
backup job.

Issue 2:
When the backup process failed no email was sent to the specified address though email notification mode is set to "always".
Emails in general do work, at least apticron sends update notifications and sending mails from CLI also works.

Question 2a:
Why was no email sent after the backup process failed?

Question 2b:
Does the WebUI use a different method for sending emails? If so where can this be configured?

Regards,

	Uwe

From f.gruenbichler at proxmox.com  Fri May 19 11:04:57 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Fri, 19 May 2017 11:04:57 +0200
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
 <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
 <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>
Message-ID: <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>

On Fri, May 19, 2017 at 09:40:54AM +0100, Steve wrote:
> I tried proxdebug. No extra messages are generated after the network has
> initialised.
> There is no log in /tmp folder ? Has the log moved?
> 
> How do I run xinit? Where is it?
> Do you mean init? If I run this it says must be run as PID 1 - how can I
> fix this (sorry, I am not a linux guru!)
> 

just type "xinit" in the debug shell. but if you are not comfortable
with this kind of debugging, you might be better off just (temporarily)
"sacrificing" a thumb drive for the PVE installer instead of trying to
get this non-standard way to boot it to work ;)

I hope to fix the initrd during the 5.x release cycle to allow booting
in some kind of loopback mode, but it's a low priority item on my todo
list..

From f.gruenbichler at proxmox.com  Fri May 19 11:17:49 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Fri, 19 May 2017 11:17:49 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
Message-ID: <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>

On Fri, May 19, 2017 at 10:43:26AM +0200, Uwe Sauter wrote:
> Hi all,
> 
> after having succeeded to have an almost TCP-based NFS share mounted (see yesterday's thread) I'm now struggling with the backup
> process itself.
> 
> Definition of NFS share in /etc/pve/storage.cfg is:
> 
> nfs: aurel
> 	export /backup/proxmox-infra
> 	path /mnt/pve/aurel
> 	server <ip of server>
> 	content backup
> 	maxfiles 30
> 	options vers=3
> 
> 
> With this definition, <server>:/backup/proxmox-infra is always mounted on /mnt/pve/aurel on every one of my PVE servers.
> 
> 
> Definition of backup job is:
>   Nodes: all
>   Storage: aurel
>   Day of week: Mon-Sun
>   Start time: 22:00
>   Selection mode: Include selected VMs
>   Send email to: <my email address>
>   Email notification: always
>   compression: lzo
>   Mode: Snapshot
>   Enable: true
>   VMs: <selection of VMs>
> 
> 
> 
> Issue 1:
> Backups failed tonight with "Error: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted".
> 
> Question 1:
> Do NFS shares that should be used for backups need to be "disabled" so that the backup process just mount the share in preparation
> of the backup? If so, there is a problem with the WebUI as I cannot select a disabled storage location when defining or editing a
> backup job.
> 

no. I think your NFS share is (wrongly) detected as unmounted, since
activate_storage in the NFS plugin will try to mount if nfs_is_mounted
returned undef. in general activate_storage is supposed to detect if the
storage is already activated and turn into a no-op if needed, so that
callers can treat it as idempotent.

what does "grep aurel /proc/mounts" say on this machine?
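
The idea of an idempotent activation (mount only when the mount point is not yet listed in the mounts table) can be illustrated with a small shell sketch. This is not PVE's nfs_is_mounted, which is Perl and also takes server and export as arguments; the function names here are made up, and activate only reports what it would do.

```shell
#!/bin/sh
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT appears as the second field of the mounts table.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# activate MOUNTPOINT [MOUNTS_FILE]
# Idempotent activation: a no-op when the mount point is already present.
activate() {
    if is_mounted "$1" "${2:-/proc/mounts}"; then
        echo "already mounted: $1"
    else
        echo "would mount: $1"
    fi
}

activate /mnt/pve/aurel
```

If the check wrongly reports "not mounted", the second mount attempt fails with exactly the "busy or already mounted" error seen in the backup log.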

> 
> Issue 2:
> When the backup process failed no email was sent to the specified address though email notification mode is set to "always".
> Emails in general do work, at least apticron sends update notifications and sending mails from CLI also works.
> 
> Question 2a:
> Why was no email sent after the backup process failed?
> 

can you post the complete log?

> Question 2b:
> Does the WebUI use a different method for sending emails? If so where can this be configured?

different from what?

From uwe.sauter.de at gmail.com  Fri May 19 11:26:35 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 19 May 2017 11:26:35 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
Message-ID: <ad686413-df95-e3c3-0932-e88761334250@gmail.com>

Hi Fabian,

thanks for looking into this.

As I already mentioned yesterday my NFS setup tries to use TCP as much as possible so the only UDP port used / allowed in the NFS
servers firewall is udp/111 for Portmapper (to allow showmount to work).

>> Issue 1:
>> Backups failed tonight with "Error: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted".
>>
>> Question 1:
>> Do NFS shares that should be used for backups need to be "disabled" so that the backup process just mount the share in preparation
>> of the backup? If so, there is a problem with the WebUI as I cannot select a disabled storage location when defining or editing a
>> backup job.
>>
> 
> no. I think your NFS share is (wrongly) detected as unmounted, since
> activate_storage in the NFS plugin will try to mount if nfs_is_mounted
> returned undef. in general activate_storage is supposed to detect if the
> storage is already activated and turn into a no-op if needed, and
> calling storage can treat it as idempotent.
> 
> what does "grep aurel /proc/mounts" say on this machine?

# grep aurel /proc/mounts
<ip of server>:/backup/proxmox-infra /mnt/pve/aurel nfs
rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<ip of
server>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=141.58.1.15 0 0

> 
>>
>> Issue 2:
>> When the backup process failed no email was sent to the specified address though email notification mode is set to "always".
>> Emails in general do work, at least apticron sends update notifications and sending mails from CLI also works.
>>
>> Question 2a:
>> Why was no email sent after the backup process failed?
>>
> 
> can you post the complete log?

Where can I find that?

> 
>> Question 2b:
>> Does the WebUI use a different method for sending emails? If so where can this be configured?
> 
> different from what?
> 

Different from what apticron and "echo test | mail -s test <recipient address>" use.

Regards,

	Uwe

From steve at easy2boot.com  Fri May 19 11:27:20 2017
From: steve at easy2boot.com (Steve)
Date: Fri, 19 May 2017 10:27:20 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
 <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
 <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>
 <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>
Message-ID: <CALnr98C8k5mksb+6AQunpo8ujqKDbq80KET9F3b89ai7y_G=mg@mail.gmail.com>

If I type xinit it says /bin/sh: xinit: not found

I am the author of easy2boot which is a USB multiboot tool to allow people
to boot from 100's of different ISOs (or images) all from one USB stick.
I have been asked by a user to get proxmox 4 working.
3.2 works because I can run the unconfigured.sh from the command line

http://rmprepusb.blogspot.co.uk/2014/03/add-proxmox-isos-to-easy2boot.html
http://www.easy2boot.com

I tried setting proxdebug and  root=/dev/sdb1  or lvm2root=/dev/sdb1 to get
it to mount the FAT32 partition to /mnt. I do not need to manually use any
mount commands, it just automatically runs init but this causes an early
error message of
mount: mounting /dev/sdb1 on /mnt failed: Invalid argument
but if I use the mount command I can see that /mnt is present as a vfat
/dev/sdb1
If I press CTRL+D to continue, it fails in the exact same place with
Installation aborted - unable to continue.
Note that this is not using my script at all, just the original init script
- I have not broken into the boot process because it picks up the lvm2root
parameter.

I am so *near*, yet I just cannot get the unconfigured.sh script to run in
this way...

thanks for your help.
Steve

On 19 May 2017 at 10:04, Fabian Grünbichler <f.gruenbichler at proxmox.com>
wrote:

> On Fri, May 19, 2017 at 09:40:54AM +0100, Steve wrote:
> > I tried proxdebug. No extra messages are generated after the network has
> > initialised.
> > There is no log in /tmp folder ? Has the log moved?
> >
> > How do I run xinit? Where is it?
> > Do you mean init? If I run this it says must be run as PID 1 - how can I
> > fix this (sorry, I am not a linux guru!)
> >
>
> just type "xinit" in the debug shell. but if you are not comfortable
> with this kind of debugging, you might be better off just (temporarily)
> "sacrificing" a thumb drive for the PVE installer instead of trying to
> get this non-standard way to boot it to work ;)
>
> I hope to fix the initrd during the 5.x release cycle to allow booting
> in some kind of loopback mode, but it's a low priority item on my todo
> list..
>

From f.gruenbichler at proxmox.com  Fri May 19 11:53:57 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Fri, 19 May 2017 11:53:57 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
Message-ID: <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>

On Fri, May 19, 2017 at 11:26:35AM +0200, Uwe Sauter wrote:
> Hi Fabian,
> 
> thanks for looking into this.
> 
> As I already mentioned yesterday my NFS setup tries to use TCP as much as possible so the only UDP port used / allowed in the NFS
> servers firewall is udp/111 for Portmapper (to allow showmount to work).
> 
> >> Issue 1:
> >> Backups failed tonight with "Error: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted".
> >>
> >> Question 1:
> >> Do NFS shares that should be used for backups need to be "disabled" so that the backup process just mount the share in preparation
> >> of the backup? If so, there is a problem with the WebUI as I cannot select a disabled storage location when defining or editing a
> >> backup job.
> >>
> > 
> > no. I think your NFS share is (wrongly) detected as unmounted, since
> > activate_storage in the NFS plugin will try to mount if nfs_is_mounted
> > returned undef. in general activate_storage is supposed to detect if the
> > storage is already activated and turn into a no-op if needed, and
> > calling storage can treat it as idempotent.
> > 
> > what does "grep aurel /proc/mounts" say on this machine?
> 
> # grep aurel /proc/mounts
> <ip of server>:/backup/proxmox-infra /mnt/pve/aurel nfs
> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<ip of
> server>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=141.58.1.15 0 0

that looks okay..

> 
> > 
> >>
> >> Issue 2:
> >> When the backup process failed no email was sent to the specified address though email notification mode is set to "always".
> >> Emails in general do work, at least apticron sends update notifications and sending mails from CLI also works.
> >>
> >> Question 2a:
> >> Why was no email sent after the backup process failed?
> >>
> > 
> > can you post the complete log?
> 
> Where can I find that?

check the node's task history - should have an entry for the backup job
including log.

> 
> > 
> >> Question 2b:
> >> Does the WebUI use a different method for sending emails? If so where can this be configured?
> > 
> > different from what?
> > 
> 
> Different from what apticron and "echo test | mail -s test <recipient address>" use.
> 

there is /root/.forward which points to pvemailforward which uses
"sendmail -bm -N never -f <FROM> <TO>" (using 'root' or the value from
datacenter.cfg as sender, and the address of 'root at pam' as recipient),
and PVE::Tools::sendmail() (which is used by vzdump) which uses
"sendmail -B 8BITMIME -f <FROM> <TO>" (with configurable sender and
recipient). vzdump sets the sender to the one from datacenter.cfg as
well, and the recipient to what is configured for the backup job.

From eugen.mayer at kontextwork.de  Fri May 19 12:04:26 2017
From: eugen.mayer at kontextwork.de (Eugen Mayer)
Date: Fri, 19 May 2017 12:04:26 +0200
Subject: [PVE-User] Problem running any VM with more then 1 Core ( PVE 4.x)
Message-ID: <etPan.591ec32a.22e8d961.441e@kontextwork.de>

Hello,

having an issue with proxmox 4.x running any VM with more than 1 core.

Hardware:
 - Intel(R) Xeon(R) CPU E3-1275 v5 @ 3.60GHz
 - 64GB ram
 - HW raid SSD

PVE 4.x just got installed freshly, just 1 other VM on that box ( 1core opnsense ) and nothing else, so fairly idle and nothing really to worry about.

I tried booting rancheros 1.0.1 and grml 2014.11 ( both 64bit ), both have the same pattern: when i boot them first, it works; when i reboot them using the cli
 - grub shows up ( if present as menu )
 - selecting anything leads to a stalled VM which i can only stop using qm kill <>

Using 1 core only, everything works just fine; 2-4 cores broke, did not try more. It does not matter if i boot an iso or e.g. an installed version of rancheros, reboot stalls every single time.

I have never seen grml not boot up on anything yet, so i am pretty sure that is a general issue.

Logs / Details:
 - dmesg shows nothing while that is happening
 - pve-manager                     4.4-13
 - kernel: 4.9.0-0.bpo.2-amd64 #1 SMP Debian 4.9.18-1~bpo8+1
 - debian 8.8 fully upgraded
 - nothing runs on the host except shorewall

Can i provide anything else? Any clues?

Thanks!

Eugen

From f.gruenbichler at proxmox.com  Fri May 19 12:12:17 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Fri, 19 May 2017 12:12:17 +0200 (CEST)
Subject: [PVE-User] Problem running any VM with more then 1 Core ( PVE
 4.x)
In-Reply-To: <etPan.591ec32a.22e8d961.441e@kontextwork.de>
References: <etPan.591ec32a.22e8d961.441e@kontextwork.de>
Message-ID: <820435367.13.1495188737538@webmail.proxmox.com>

> Eugen Mayer <eugen.mayer at kontextwork.de> wrote on 19 May 2017 at 12:04:
> 
> 
> Hello,
> 
> having an issue with proxmox 4.x running any VM with more then 1 core.
> 
> Hardware:
>  - Intel(R) Xeon(R) CPU E3-1275 v5 @ 3.60GHz
>  - 64GB ram
>  - HW raid SSD
> 
> PVE 4.x just got installed freshly, just 1 other VM on that box ( 1core opnsense ) and nothing else, so fairly idle and nothing really to be worry about.
> 
> I tried boot rancheros 1.0.1 and grml 2014.11 ( both 64bit ), both have the same pattern, when i boot them first, it works, when i reboot them using the cli
>  - grub shows up ( if present as menu )
>  - selecting anything leads to a stalled VM which i only can stop using qm kill <>
> 
> Using 1 core only, everything works just fine 2-4 broke, did not try more. It does not matter if i boot an iso or e.g an installed version of rancheros, reboot stalls every single time.
> 
> I have never seen grml not boot up on anything yet, pretty sure that is a general issue.
> 
> Logs / Details:
>  - dmsg just shows nothing during that happening
>  - pve-manager                     4.4-13
>  - kernel: 4.9.0-0.bpo.2-amd64 #1 SMP Debian 4.9.18-1~bpo8+1
>  - debian 8.8 fully upgraded
>  - nothing runs on the host except shorewall
> 
> Can i provide anything else? Any clues?

you run a non-PVE kernel - which is a completely untested and unsupported configuration. please revert to the PVE 4.4 kernel and test again. if the issue persists, please include "pveversion -v" and "qm config VMID"
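
A quick way to see which kind of kernel is running is to look at the release string. The sketch below assumes that PVE kernels carry a "-pve" suffix in "uname -r" (as in names like 4.4.x-y-pve); treat that naming convention as an assumption and confirm it with "pveversion -v".

```shell
#!/bin/sh
# is_pve_kernel RELEASE
# Succeeds if the release string ends in "-pve" (assumed naming convention).
is_pve_kernel() {
    case "$1" in
        *-pve) return 0 ;;
        *)     return 1 ;;
    esac
}

release=$(uname -r)
if is_pve_kernel "$release"; then
    echo "PVE kernel: $release"
else
    echo "non-PVE kernel: $release"
fi
```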

From uwe.sauter.de at gmail.com  Fri May 19 12:49:21 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 19 May 2017 12:49:21 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
 <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
Message-ID: <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>

On 19.05.2017 at 11:53, Fabian Grünbichler wrote:
> On Fri, May 19, 2017 at 11:26:35AM +0200, Uwe Sauter wrote:
>> Hi Fabian,
>>
>> thanks for looking into this.
>>
>> As I already mentioned yesterday my NFS setup tries to use TCP as much as possible so the only UDP port used / allowed in the NFS
>> servers firewall is udp/111 for Portmapper (to allow showmount to work).
>>
>>>> Issue 1:
>>>> Backups failed tonight with "Error: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted".
>>>>
>>>> Question 1:
>>>> Do NFS shares that should be used for backups need to be "disabled" so that the backup process just mount the share in preparation
>>>> of the backup? If so, there is a problem with the WebUI as I cannot select a disabled storage location when defining or editing a
>>>> backup job.
>>>>
>>>
>>> no. I think your NFS share is (wrongly) detected as unmounted, since
>>> activate_storage in the NFS plugin will try to mount if nfs_is_mounted
>>> returned undef. in general activate_storage is supposed to detect if the
>>> storage is already activated and turn into a no-op if needed, and
>>> calling storage can treat it as idempotent.
>>>
>>> what does "grep aurel /proc/mounts" say on this machine?
>>
>> # grep aurel /proc/mounts
>> <ip of server>:/backup/proxmox-infra /mnt/pve/aurel nfs
>> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<ip of
>> server>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=141.58.1.15 0 0
> 
> that looks okay..
> 
>>
>>>
>>>>
>>>> Issue 2:
>>>> When the backup process failed no email was sent to the specified address though email notification mode is set to "always".
>>>> Emails in general do work, at least apticron sends update notifications and sending mails from CLI also works.
>>>>
>>>> Question 2a:
>>>> Why was no email sent after the backup process failed?
>>>>
>>>
>>> can you post the complete log?
>>
>> Where can I find that?
> 
> check the node's task history - should have an entry for the backup job
> including log.

Opening the task gives:

OUTPUT: TASK ERROR: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted

STATUS: stopped: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted

Opening the log file in /var/log/pve/tasks for that task gives the same as the OUTPUT above.
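
To find failed task logs from the shell rather than through the WebUI, something like the following sketch can be used. It assumes failed tasks contain a "TASK ERROR" line, as in the output above; find_task_errors is a made-up helper name.

```shell
#!/bin/sh
# find_task_errors [DIR]
# List task log files below DIR that contain a "TASK ERROR" line.
find_task_errors() {
    dir="${1:-/var/log/pve/tasks}"
    grep -rl "TASK ERROR" "$dir" 2>/dev/null
}

find_task_errors || echo "no failed task logs found"
```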

> 
>>
>>>
>>>> Question 2b:
>>>> Does the WebUI use a different method for sending emails? If so where can this be configured?
>>>
>>> different from what?
>>>
>>
>> Different from what apticron and "echo test | mail -s test <recipient address>" use.
>>
> 
> there is /root/.forward which points to pvemailforward which uses
> "sendmail -bm -N never -f <FROM> <TO>" (using 'root' or the value from
> datacenter.cfg as sender, and the address of 'root at pam' as recipient),
> and PVE::Tools::sendmail() (which is used by vzdump) which uses
> "sendmail -B 8BITMIME -f <FROM> <TO>" (with configurable sender and
> recipient). vzdump sets the sender to the one from datacenter.cfg as
> well, and the recipient to what is configured for the backup job.
> 

# cat /etc/pve/datacenter.cfg
keyboard: de

# cat /etc/pve/user.cfg
user:root at pam:1:0:::<my email>::

I suspect that something just doesn't send emails in that specific error case?

Is there a way to test the mail configuration using PVE's mechanism?

Regards,

	Uwe

From f.gruenbichler at proxmox.com  Fri May 19 13:31:27 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Fri, 19 May 2017 13:31:27 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
 <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
 <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>
Message-ID: <20170519113127.33y4vob54yin2ejw@nora.maurer-it.com>

On Fri, May 19, 2017 at 12:49:21PM +0200, Uwe Sauter wrote:
> On 19.05.2017 at 11:53, Fabian Grünbichler wrote:
> > On Fri, May 19, 2017 at 11:26:35AM +0200, Uwe Sauter wrote:
> >> Hi Fabian,
> >>
> >> thanks for looking into this.
> >>
> >> As I already mentioned yesterday my NFS setup tries to use TCP as much as possible so the only UDP port used / allowed in the NFS
> >> servers firewall is udp/111 for Portmapper (to allow showmount to work).
> >>
> >>>> Issue 1:
> >>>> Backups failed tonight with "Error: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted".
> >>>>
> >>>> Question 1:
> >>>> Do NFS shares that should be used for backups need to be "disabled" so that the backup process just mount the share in preparation
> >>>> of the backup? If so, there is a problem with the WebUI as I cannot select a disabled storage location when defining or editing a
> >>>> backup job.
> >>>>
> >>>
> >>> no. I think your NFS share is (wrongly) detected as unmounted, since
> >>> activate_storage in the NFS plugin will try to mount if nfs_is_mounted
> >>> returned undef. in general activate_storage is supposed to detect if the
> >>> storage is already activated and turn into a no-op if needed, and
> >>> calling storage can treat it as idempotent.
> >>>
> >>> what does "grep aurel /proc/mounts" say on this machine?
> >>
> >> # grep aurel /proc/mounts
> >> <ip of server>:/backup/proxmox-infra /mnt/pve/aurel nfs
> >> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<ip of
> >> server>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=141.58.1.15 0 0
> > 
> > that looks okay..
> > 
> >>
> >>>
> >>>>
> >>>> Issue 2:
> >>>> When the backup process failed no email was sent to the specified address though email notification mode is set to "always".
> >>>> Emails in general do work, at least apticron sends update notifications and sending mails from CLI also works.
> >>>>
> >>>> Question 2a:
> >>>> Why was no email sent after the backup process failed?
> >>>>
> >>>
> >>> can you post the complete log?
> >>
> >> Where can I find that?
> > 
> > check the node's task history - should have an entry for the backup job
> > including log.
> 
> Opening the task gives:
> 
> OUTPUT: TASK ERROR: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted
> 
> STATUS: stopped: mount error: mount.nfs: /mnt/pve/aurel is busy or already mounted
> 
> 
> Opening the log file in /var/log/pve/tasks for that task gives the same as the OUTPUT above.
> 
> 
> 
> > 
> >>
> >>>
> >>>> Question 2b:
> >>>> Does the WebUI use a different method for sending emails? If so where can this be configured?
> >>>
> >>> different from what?
> >>>
> >>
> >> Different from what apticron and "echo test | mail -s test <recipient address>" use.
> >>
> > 
> > there is /root/.forward which points to pvemailforward which uses
> > "sendmail -bm -N never -f <FROM> <TO>" (using 'root' or the value from
> > datacenter.cfg as sender, and the address of 'root at pam' as recipient),
> > and PVE::Tools::sendmail() (which is used by vzdump) which uses
> > "sendmail -B 8BITMIME -f <FROM> <TO>" (with configurable sender and
> > recipient). vzdump sets the sender to the one from datacenter.cfg as
> > well, and the recipient to what is configured for the backup job.
> > 
> 
> # cat /etc/pve/datacenter.cfg
> keyboard: de
> 
> # cat /etc/pve/user.cfg
> user:root at pam:1:0:::<my email>::
> 
> I suspect that something just doesn't send emails in that specific error case?

yes, seems like activate_storage is called very early on to retrieve
maxfiles and dumpdir via PVE::API2::VZDump (POST) -> PVE::VZDump->new()
-> PVE::VZDump::storage_info(), and that call is not guarded by an
eval, so the error is not handled and no sendmail is triggered. can you
file a bug for that? thanks!
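
As a shell analogy of the missing guard (the real code is Perl and uses eval; this is not the vzdump implementation): if the early setup step is run unguarded, the job dies before any notification code is reached, whereas capturing its failure lets the notification run. All function names below are made up for illustration.

```shell
#!/bin/sh
# notify is a stand-in for sending the report mail.
notify() { echo "NOTIFY: $1"; }

# run_job SETUP_COMMAND
# Runs the setup step guarded, so a failure still reaches notify.
run_job() {
    if ! err=$("$1" 2>&1); then
        notify "setup failed: $err"
        return 1
    fi
    echo "setup ok"
}

# Hypothetical setup step that fails like the NFS activation did.
failing_setup() {
    echo "mount error: busy or already mounted" >&2
    return 1
}

run_job failing_setup || true
```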

the underlying issue is still unclear to me.. can you post the output of
the following snippet (insert your correct IP)

perl -e '
use strict;
use warnings;
use PVE::Storage::NFSPlugin;
my $server = "INSERTNFSSERVERIPHERE";
my $export = "/backup/proxmox-infra";
my $mountpoint = "/mnt/pve/aurel";
print PVE::Storage::NFSPlugin::nfs_is_mounted($server, $export, $mountpoint, undef), "\n";
'

if that does not output anything, the following might also be
interesting (feel free to censor as you see fit):

perl -e 'use strict; use warnings; use PVE::ProcFSTools; use Data::Dumper; print Dumper(PVE::ProcFSTools::parse_proc_mounts());'

> 
> Is there a way to test the mail configuration using PVE's mechanism?

yes, just pipe your mail into the sendmail command I posted ;) to
check past mails, you can also use something like

journalctl -b | grep 'pvemail\|postfix'
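
A minimal sketch of such a test, assuming "root" as sender and admin@example.com as a placeholder recipient: the helper only builds a small message, and the actual sendmail call (with the flags from the earlier mail) is left as a comment so nothing is sent by accident.

```shell
#!/bin/sh
# build_test_mail FROM TO
# Print a minimal RFC 822 style message on stdout.
build_test_mail() {
    from="$1"; to="$2"
    printf 'From: %s\nTo: %s\nSubject: PVE sendmail test\n\nTest message.\n' "$from" "$to"
}

# To actually send it the way vzdump is described to do (not run here):
#   build_test_mail root admin@example.com | sendmail -B 8BITMIME -f root admin@example.com
build_test_mail root admin@example.com
```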

From f.gruenbichler at proxmox.com  Fri May 19 13:46:38 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Fri, 19 May 2017 13:46:38 +0200
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <CALnr98C8k5mksb+6AQunpo8ujqKDbq80KET9F3b89ai7y_G=mg@mail.gmail.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
 <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
 <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>
 <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>
 <CALnr98C8k5mksb+6AQunpo8ujqKDbq80KET9F3b89ai7y_G=mg@mail.gmail.com>
Message-ID: <20170519114638.4aprdrkwwszbv7ws@nora.maurer-it.com>

On Fri, May 19, 2017 at 10:27:20AM +0100, Steve wrote:
> If I type xinit it says /bin/sh: xinit: not found
> 
> I am the author of easy2boot which is a USB multiboot tool to allow people
> to boot from 100's of different ISOs (or images) all from one USB stick.
> I have been asked by a user to get proxmox 4 working.
> 3.2 works because I can run the unconfigured.sh from the command line
> 
> http://rmprepusb.blogspot.co.uk/2014/03/add-proxmox-isos-to-easy2boot.html
> http://www.easy2boot.com
> 
> I tried setting proxdebug and  root=/dev/sdb1  or lvm2root=/dev/sdb1 to get
> it to mount the FAT32 partition to /mnt. I do not need to manually use any
> mount commands, it just automatically runs init but this causes an early
> error message of
> mount: mounting /dev/sdb1 on /mnt failed: Invalid argument
> but if I use the mount command I can see that /mnt is present as a vfat
> /dev/sdb1
> If I press CTRL+D to continue, it fails in the exact same place with
> Installation aborted - unable to continue.
> Note that this is not using my script at all, just the original init script
> - I have not broken into the boot process because it picks up the lvm2root
> parameter.
> 
> I am so *near*, yet I just cannot get the unconfigured.sh script to run in
> this way...
> 
> thanks for your help.
> Steve

I just did a quick test run using Grub:

loopback test PATHTOISO
linux (test)/boot/linux26 proxdebug ramdisk_size=16777216 rw
initrd (test)/boot/initrd.img
boot

drops me into the initrd debug shell

if I do the following (major/minor depend on your disk configuration,
  check /sys/class/block/XXX/dev where XXX is your usb partition where
  the iso is)

mknod /tmp/usbdev b MAJOR MINOR
mkdir /tmp/usbmnt
mount /tmp/usbdev /tmp/usbmnt
mount /tmp/usbmnt/PATHTOISO /mnt

followed by lines 294ff of the init script launch the graphical
installer (although the resolution is messed up - probably because I did
not bother to change any of the graphic stuff in Grub), and I am able to
complete the installation..
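
as a rough sketch of the MAJOR/MINOR lookup mentioned above (the helper name
dev_numbers and its optional second argument are made up for illustration;
it assumes the "dev" file contains a single "MAJOR:MINOR" line and that the
shell supports plain POSIX "read"):

```shell
# Hypothetical helper: print "MAJOR MINOR" for a block device name.
# $1 = device name (e.g. sdb1), $2 = sysfs directory (default /sys/class/block)
dev_numbers() {
    name=$1
    sysdir=${2:-/sys/class/block}
    # the "dev" file holds one line such as "8:17"; split it on the colon
    IFS=: read -r major minor < "$sysdir/$name/dev" || return 1
    echo "$major $minor"
}
```

the output could then be passed to mknod, e.g.
mknod /tmp/usbdev b $(dev_numbers sdb1), where sdb1 is only an example name.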

so I think it must be something in your environment / setup.

(note that there are TWO debug shells, one in the initrd with very
limited commands, and one in the installer environment with access to a
lot more tools - a quick test is to run something like "lsblk", since
that only exists in the latter environment ;). maybe your switch to the
installer environment did not work, and you are still in the initrd
shell? that would explain why you can't execute "xinit")
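
the "lsblk" check could be wrapped like this (just a sketch; the function
name which_env is invented here, and it assumes the shell provides
"command -v", which has not been verified for the initrd shell):

```shell
# Hypothetical helper: guess which debug shell we are in by probing for a tool
# that is expected to exist only in the installer environment.
# $1 = tool to probe for (default lsblk)
which_env() {
    tool=${1:-lsblk}
    if command -v "$tool" >/dev/null 2>&1; then
        echo installer
    else
        echo initrd
    fi
}
```

running which_env without arguments probes for lsblk, as suggested above.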

From steve at easy2boot.com  Fri May 19 13:51:02 2017
From: steve at easy2boot.com (Steve)
Date: Fri, 19 May 2017 12:51:02 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <20170519114638.4aprdrkwwszbv7ws@nora.maurer-it.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
 <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
 <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>
 <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>
 <CALnr98C8k5mksb+6AQunpo8ujqKDbq80KET9F3b89ai7y_G=mg@mail.gmail.com>
 <20170519114638.4aprdrkwwszbv7ws@nora.maurer-it.com>
Message-ID: <CALnr98CxT9wXm0=r3Vz=wdhgKX=tQaqyzD_djhvU_0ge=kbggw@mail.gmail.com>

I just found a way to get it to work by modifying the grub menu and adding
lvm2root=/dev/sdX4

where sdX4 is a partition on the USB drive which points directly to the ISO
file.
So I am booting from the ISO (mapped to the BIOS device (0xff) in grub4dos)
which boots to the grub2 menu and then I add the lvm2root parameter.

The strange thing is that if I boot from a FAT32 flat-file system and have
the ISO mapped to a partition in exactly the same way, the init script
fails.


From steve at easy2boot.com  Fri May 19 13:52:03 2017
From: steve at easy2boot.com (Steve)
Date: Fri, 19 May 2017 12:52:03 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <CALnr98CxT9wXm0=r3Vz=wdhgKX=tQaqyzD_djhvU_0ge=kbggw@mail.gmail.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
 <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
 <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>
 <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>
 <CALnr98C8k5mksb+6AQunpo8ujqKDbq80KET9F3b89ai7y_G=mg@mail.gmail.com>
 <20170519114638.4aprdrkwwszbv7ws@nora.maurer-it.com>
 <CALnr98CxT9wXm0=r3Vz=wdhgKX=tQaqyzD_djhvU_0ge=kbggw@mail.gmail.com>
Message-ID: <CALnr98ABdzsTz=Xa4+PfkNg5nMZ1oTUsqPgoP9M13GMcvs6GYA@mail.gmail.com>

See the blog post (end of page)
http://rmprepusb.blogspot.co.uk/2014/03/add-proxmox-isos-to-easy2boot.html


From eugen.mayer at kontextwork.de  Fri May 19 14:03:16 2017
From: eugen.mayer at kontextwork.de (Eugen Mayer)
Date: Fri, 19 May 2017 14:03:16 +0200
Subject: [PVE-User] Any way to patch / force proxmox to support
 /etc/network/interfaces.d/*?
Message-ID: <etPan.591edf04.264e911d.441e@kontextwork.de>

Hello,

Because I deploy with Chef and configure my network interfaces and bridges there, entries such as /etc/network/interfaces.d/eth0 .. /etc/network/interfaces.d/vmbr0 are created.
The issue now is that /etc/network/interfaces basically just includes

source /etc/network/interfaces.d/*

and is otherwise empty. Proxmox does not support that: it does not list any interfaces in the UI and does not let me assign my KVM VM to any interface. Is there any way I could patch that / force that / give Proxmox a static list of interfaces?

Thanks

-- 
Eugen

From steve at easy2boot.com  Fri May 19 14:51:06 2017
From: steve at easy2boot.com (Steve)
Date: Fri, 19 May 2017 13:51:06 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <20170519114638.4aprdrkwwszbv7ws@nora.maurer-it.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
 <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
 <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>
 <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>
 <CALnr98C8k5mksb+6AQunpo8ujqKDbq80KET9F3b89ai7y_G=mg@mail.gmail.com>
 <20170519114638.4aprdrkwwszbv7ws@nora.maurer-it.com>
Message-ID: <CALnr98BsLG4N6M5yc0b9NsRP6AeiaK2ngJm8Ji3Jk=YhC49Aig@mail.gmail.com>

I think the problem is with grub4dos.

If I use this menu from a FAT32 USB drive

title ProxMox lvm2root=sda1
kernel /boot/linux26 ro ramdisk_size=16777216 lvm2root=/dev/sda1 rw quiet splash=silent
initrd /boot/initrd.img

then it does not work and I get the Installation aborted message

but if I use the same menu in grub2 from the same drive,

menuentry 'Install Proxmox VE sda1' --class debian --class gnu-linux --class gnu --class os {
linux /boot/linux26 ro ramdisk_size=16777216 lvm2root=/dev/sda1 rw quiet splash=silent
initrd /boot/initrd.img
}

then it works and loads the installer GUI.

I also note that it does not change to 1024x768 when booting via grub4dos,
so I added

vga=791

and now it works!!!


From brians at iptel.co  Fri May 19 14:55:55 2017
From: brians at iptel.co (Brian :)
Date: Fri, 19 May 2017 13:55:55 +0100
Subject: [PVE-User] Any way to patch / force proxmox to support
	/etc/network/interfaces.d/*?
In-Reply-To: <etPan.591edf04.264e911d.441e@kontextwork.de>
References: <etPan.591edf04.264e911d.441e@kontextwork.de>
Message-ID: <CAGPQfi-_CMuy9FNzE8FaMy_=2UVV1EejFxSe5RzVo2o-_xJW7Q@mail.gmail.com>

you could probably cat /etc/network/interfaces.d/* >
/etc/network/interfaces as a horrible hack.
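
A slightly more careful sketch of the same hack, writing to a temporary file
first so a failure does not leave a half-written result. The function name
merge_interfaces is made up for illustration, and note that the merged file no
longer contains the "source" line, so Chef may well overwrite it again:

```shell
# Hypothetical helper: concatenate all fragments in a directory into one file.
# $1 = fragment directory, $2 = output file
merge_interfaces() {
    dir=$1
    out=$2
    tmp=$(mktemp) || return 1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip sub-directories and empty globs
        printf '# from %s\n' "$f"        # remember where each stanza came from
        cat "$f"
        printf '\n'
    done > "$tmp"
    mv "$tmp" "$out"
}
```

It would be called as
merge_interfaces /etc/network/interfaces.d /etc/network/interfaces
after taking a backup of the original file.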


From steve at easy2boot.com  Fri May 19 15:04:37 2017
From: steve at easy2boot.com (Steve)
Date: Fri, 19 May 2017 14:04:37 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
 <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
 <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>
 <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>
Message-ID: <CALnr98C7G7AD0PrbqY6EHQanbdMxvzNJde=gQqa8uznAgQ1=gg@mail.gmail.com>

One last hurdle!

instead of /dev/sda1  I want to use /dev/disk/by-uuid/XXXXXXX because I
don't know how many disks there are in the system

but it does not work - I presume that /dev/disk/by-uuid does not yet exist
at the time the init script is running?

Is there any other way to reference the partition? I am guessing that
by-label will not work either?...

Steve

On 19 May 2017 at 10:04, Fabian Grünbichler <f.gruenbichler at proxmox.com>
wrote:

> On Fri, May 19, 2017 at 09:40:54AM +0100, Steve wrote:
> > I tried proxdebug. No extra messages are generated after the network has
> > initialised.
> > There is no log in /tmp folder ? Has the log moved?
> >
> > How do I run xinit? Where is it?
> > Do you mean init? If I run this it says must be run as PID 1 - how can I
> > fix this (sorry, I am not a linux guru!)
> >
>
> just type "xinit" in the debug shell. but if you are not comfortable
> with this kind of debugging, you might be better off just (temporarily)
> "sacrificing" a thumb drive for the PVE installer instead of trying to
> get this non-standard way to boot it to work ;)
>
> I hope to fix the initrd during the 5.x release cycle to allow booting
> in some kind of loopback mode, but it's a low priority item on my todo
> list..
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>

From steve at easy2boot.com  Sat May 20 00:58:26 2017
From: steve at easy2boot.com (Steve)
Date: Fri, 19 May 2017 23:58:26 +0100
Subject: [PVE-User] sbin/unconfigured.sh
In-Reply-To: <20170519114638.4aprdrkwwszbv7ws@nora.maurer-it.com>
References: <CALnr98Dhmd9=xff_G1Np1E3eH-snUfqMV8WB8+aSQoB+c=Fx4w@mail.gmail.com>
 <af4741c9-3a27-d5ee-19ae-7e3ef553807a@gmail.com>
 <CALnr98BeL9OAhZbjyofXGAv7vG+pVMvQwVk=1+DKntNgtPn=4w@mail.gmail.com>
 <20170519063447.2vnikzing32ojyai@nora.maurer-it.com>
 <CALnr98AUeF0Nw8dYRmCPwCZ83pqo=f_dA9AVmmmhqP_LXxENSA@mail.gmail.com>
 <20170519081205.yzapwlrean3qzqvy@nora.maurer-it.com>
 <CALnr98B+Y=KN8PtW0UEJR1jewze9JszyHE4CjeGhUTsrc1=CKA@mail.gmail.com>
 <20170519090457.mpa47g7da34t5nww@nora.maurer-it.com>
 <CALnr98C8k5mksb+6AQunpo8ujqKDbq80KET9F3b89ai7y_G=mg@mail.gmail.com>
 <20170519114638.4aprdrkwwszbv7ws@nora.maurer-it.com>
Message-ID: <CALnr98Dh9eLuEE3pE_PCV0Ogz8n3+b6utYjQiR4kdOWWBE0fNg@mail.gmail.com>

Here is the final grub4dos menu for use with Easy2Boot, which works:

iftitle [if exist $HOME$/proxmox-ve_4.4-eb2d6f1e-2.iso] proxmox 4.4\n You must enter the correct USB name.
set ISO=proxmox-ve_4.4-eb2d6f1e-2.iso

set ldisk=
errorcheck off
if not exist ldisk geometry (hd9) > nul && set ldisk=sdj4
if not exist ldisk geometry (hd8) > nul && set ldisk=sdi4
if not exist ldisk geometry (hd7) > nul && set ldisk=sdh4
if not exist ldisk geometry (hd6) > nul && set ldisk=sdg4
if not exist ldisk geometry (hd5) > nul && set ldisk=sdf4
if not exist ldisk geometry (hd4) > nul && set ldisk=sde4
if not exist ldisk geometry (hd3) > nul && set ldisk=sdd4
if not exist ldisk geometry (hd2) > nul && set ldisk=sdc4
if not exist ldisk geometry (hd1) > nul && set ldisk=sdb4

echo
echo -e $[0104]     I guess partition 4 of the USB drive will be %ldisk%
echo
set /p ldisk=Enter linux device name for USB drive (ptn4), e.g. sdb4 or sdc4 (ESC=%ldisk%) :
echo
pause --wait=3 Will use /dev/%ldisk%
set NOSUG=1
set redir=> nul
/%grub%/QRUN.g4b $HOME$/%ISO%
kernel /boot/linux26 ro ramdisk_size=16777216 lvm2root=/dev/%ldisk% vga=791 rw
initrd /boot/initrd.img
boot

Thanks for your help

From lindsay.mathieson at gmail.com  Mon May 22 00:34:49 2017
From: lindsay.mathieson at gmail.com (Lindsay Mathieson)
Date: Mon, 22 May 2017 08:34:49 +1000
Subject: [PVE-User] Snapshots and "Now" not listing in Snapshots GUI
Message-ID: <2fbb1726-b1ee-9ebf-6b7d-1cb6899c4a64@gmail.com>

I've had this happen several times on the latest pve-no-subscription repo.
I take a snapshot of a Windows VM, which completes successfully, and later I
notice that no snapshots or the "Now" entry are listed in the GUI.

I eventually tracked it down to the snapshot having itself as its own
parent. I often reuse snapshot names, so possibly the "parent: session"
entry was left over in the main section?

Sample conf file here:

agent: 1
boot: cdn
bootdisk: scsi0
cores: 4
ide2: none,media=cdrom
memory: 4096
name: Derek-CRM-Dev
net0: virtio=42:85:0B:97:CD:21,bridge=vmbr0
ostype: win7
parent: session
scsi0: gluster4:204/vm-204-disk-1.qcow2,cache=writeback,size=130559M
scsihw: virtio-scsi-pci
sockets: 1
unused0: LOBVirusBackup:204/vm-204-disk-1.qcow2
vga: qxl

[session]
agent: 1
boot: cdn
bootdisk: scsi0
cores: 4
ide2: none,media=cdrom
machine: pc-i440fx-2.7
memory: 4096
name: Derek-CRM-Dev
net0: virtio=42:85:0B:97:CD:21,bridge=vmbr0
ostype: win7
*parent: session*
scsi0: gluster4:204/vm-204-disk-1.qcow2,cache=writeback,size=130559M
scsihw: virtio-scsi-pci
snaptime: 1495187328
sockets: 1
vga: qxl
vmstate: gluster4:204/vm-204-state-session.raw

-- 
Lindsay Mathieson

From f.gruenbichler at proxmox.com  Mon May 22 07:53:48 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Mon, 22 May 2017 07:53:48 +0200
Subject: [PVE-User] Snapshots and "Now" not listing in Snapshots GUI
In-Reply-To: <2fbb1726-b1ee-9ebf-6b7d-1cb6899c4a64@gmail.com>
References: <2fbb1726-b1ee-9ebf-6b7d-1cb6899c4a64@gmail.com>
Message-ID: <20170522055348.3t2wy2du3uhwdc7r@nora.maurer-it.com>

On Mon, May 22, 2017 at 08:34:49AM +1000, Lindsay Mathieson wrote:
> I've had this happen several times on the latest pve-nosubcription repo.
> Take a snaphot of a windows vm, which completes successfully and later I
> notice that no snapshots or the "Now" enty are listed in the GUI.
> 
> 
> I eventually tracked it down the snapshot having itself as its own parent. I
> often reuse snapshot names, so possibly the "parent:session" entry was left
> over in the main section?

this should only be possible if you either didn't completely clean up a
failed snapshot deletion, or did an incomplete manual snapshot deletion
yourself.

when you create a snapshot, the following happens:
1. lock config, copy most of the current config into the snapshot config
2. do volume snapshot(s)
3. finalize config, unlock config

the parent of the old "current" state becomes the parent of the new
snapshot in step 1, and the new snapshot becomes the parent of the
"current" state in step 3.

when deleting, a similar process runs:

1. lock config
2. mark snapshot as deleting
3. delete volume snapshot(s)
4. rewrite all parent keys referencing the deleted snapshot
5. delete snapshot config
6. unlock config

now if step 3 fails (e.g., because of a storage failure), the snapshot
will be marked as "delete", but still be referenced in the config. if
you manually clean this situation up, you also need to manually do step
4. if you don't, and just delete the snapshot section in the
configuration and the volume snapshots on the storages, the parent key
points to a non-existent snapshot, and if you create another snapshot
with the same name, you get the kind of circular dependency which you
are describing.
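
to spot such a circular reference, something like the following could be
used. this is only a sketch, not part of PVE: the function name
find_self_parent is made up, and it assumes the config layout shown in the
sample (a "[name]" line opens each snapshot section, keys are written as
"key: value"):

```shell
# Hypothetical check: print every snapshot section whose "parent:" key
# names the section itself. $1 = path to a VM config file
find_self_parent() {
    awk '
        # a line like "[session]" starts a snapshot section; remember its name
        /^\[.*\]$/ { section = substr($0, 2, length($0) - 2); next }
        # inside a section, a parent equal to the section name is a loop
        section != "" && $1 == "parent:" && $2 == section { print section }
    ' "$1"
}
```

for the VM in question this would presumably be run as
find_self_parent /etc/pve/qemu-server/204.conf (the VMID is inferred from
the disk names in the sample); any name it prints needs its parent key
corrected by hand.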


From gaio at sv.lnf.it  Mon May 22 10:46:45 2017
From: gaio at sv.lnf.it (Marco Gaiarin)
Date: Mon, 22 May 2017 10:46:45 +0200
Subject: [PVE-User] VirtIO SCSI, trim, strange logs...
Message-ID: <20170522084645.GG3979@sv.lnf.it>

I'm using the latest Proxmox 4.4, updated daily from the repository.

Until now I've used the plain VirtIO disk driver, but I'm now experimenting
with the VirtIO SCSI driver so that I can enable trim on disk images (by
enabling the discard option).

So I've moved some ''non critical'' VMs to VirtIO SCSI, on different
storages (iSCSI/SAN, Ceph, local LVM thin), and I've tested the fstrim
command, which works as expected, following this wiki page:

	https://pve.proxmox.com/wiki/Qemu_trim/discard_and_virtio_scsi

But on one of these storages (LVM thin), during backup of the VMs, I
found the guest's log full of:

	May 21 20:02:49 brucaliffo kernel: [183109.860071] sd 0:0:0:0: [sda] abort

The VMs work as expected now, and all seems normal.

But I prefer to ask here for some clues... thanks.

-- 
dott. Marco Gaiarin				        GNUPG Key ID: 240A3D66
  Associazione ``La Nostra Famiglia''          http://www.lanostrafamiglia.it/
  Polo FVG   -   Via della Bontà, 7 - 33078   -   San Vito al Tagliamento (PN)
  marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797

		Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA!
      http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
	(cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA)

From f.gruenbichler at proxmox.com  Mon May 22 13:25:24 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Mon, 22 May 2017 13:25:24 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <f9a76872-f599-6880-9488-2f9b4058f02a@gmail.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
 <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
 <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>
 <20170519113127.33y4vob54yin2ejw@nora.maurer-it.com>
 <f9a76872-f599-6880-9488-2f9b4058f02a@gmail.com>
Message-ID: <20170522112524.pzmwxrf5gziw5a6u@nora.maurer-it.com>

On Fri, May 19, 2017 at 01:59:27PM +0200, Uwe Sauter wrote:
> 
> >>
> >> I suspect that something just doesn't send emails in that specific error case?
> > 
> > yes, seems like activate_storage is called very early on to retrieve
> > maxfiles and dumpdir via PVE::API2::VZDump (POST) -> PVE::VZDump->new()
> > -> PVE::VZDump::storage_info() , and that call is not guarded by an
> > eval, thus no error handling and sendmail is triggered. can you file a
> > bug for that? thanks!
> > 
> 
> Do I have to create separate users for bugzilla and forum? Don't have a forum user yet but it would probably be benificial?

yes, those two are not connected.

> > the underlying issue is still unclear to me.. can you post the output of
> > the following snippet (insert your correct IP)
> > 
> > perl -e '
> > use strict;
> > use warnings;
> > use PVE::Storage::NFSPlugin;
> > my $server = "INSERTNFSSERVERIPHERE";
> > my $export = "/backup/proxmox-infra";
> > my $mountpoint = "/mnt/pve/aurel";
> > print PVE::Storage::NFSPlugin::nfs_is_mounted($server, $export, $mountpoint, undef), "\n";
> > '
> > 
> 
> Result: "Use of uninitialized value in print at -e line 8."

which means it returned undef, so PVE thinks the storage is not mounted.

> 
> > if that does not output anything, the following might also be
> > interesting (feel free to censor as you see fit):
> > 
> > perl -e 'use strict; use warnings; use PVE::ProcFSTools; use Data::Dumper; print Dumper(PVE::ProcFSTools::parse_proc_mounts());'
> > 
> 
> $VAR1 = [
> ....
>           [
>             '<hostname of NFS server>:/backup/proxmox-infra',
>             '/mnt/pve/aurel',
>             'nfs',
>             'rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<if of
> NFS server>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=<ip of NFS server>',
>             '0',
>             '0'
>           ],
> .....
>         ];
> 

the culprit is likely that your storage.cfg contains the IP, but your
/proc/mounts contains the hostname (with a reverse lookup in between?).

can you test using the hostname in your storage.cfg instead of the IP?
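
a quick way to see the mismatch by hand (a sketch only, not what
nfs_is_mounted does internally; the function name mount_source is made up,
and the second argument exists just so it can be pointed at a copy of the
mounts file):

```shell
# Hypothetical helper: print the source ("server:/export") that is recorded
# for a mountpoint. $1 = mountpoint, $2 = mounts file (default /proc/mounts)
mount_source() {
    awk -v mp="$1" '$2 == mp { print $1 }' "${2:-/proc/mounts}"
}
```

if "mount_source /mnt/pve/aurel" prints "<hostname>:/backup/proxmox-infra"
while storage.cfg has "server <IP>", a plain string comparison of the two
will not match.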

> 
> >>
> >> Is there a way to test the mail configuration using PVE's mechanism?
> > 
> > yes, just pipe your mail into the the sendmail command I posted ;) to
> > check past mails, you can also use something like
> > 
> > journalctl -b | grep 'pvemail\|postfix'
> > 
> 
> I tested the bakcup job with a local storage and then I got emails. So it is definitivly something related to NFS and backups, not
> the mailing mechansim.
> 

yes and no - nothing special about NFS here, would be triggered by any
storage where storage_info (or the sub call to activate_storage) fails.

see my proposed patch for #1389 on pve-devel:
https://pve.proxmox.com/pipermail/pve-devel/2017-May/026511.html

From uwe.sauter.de at gmail.com  Mon May 22 14:52:13 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Mon, 22 May 2017 14:52:13 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <20170522112524.pzmwxrf5gziw5a6u@nora.maurer-it.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
 <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
 <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>
 <20170519113127.33y4vob54yin2ejw@nora.maurer-it.com>
 <f9a76872-f599-6880-9488-2f9b4058f02a@gmail.com>
 <20170522112524.pzmwxrf5gziw5a6u@nora.maurer-it.com>
Message-ID: <5189d740-2086-cae7-274c-6f38ce1adf4c@gmail.com>

>>> perl -e 'use strict; use warnings; use PVE::ProcFSTools; use Data::Dumper; print Dumper(PVE::ProcFSTools::parse_proc_mounts());'
>>>
>>
>> $VAR1 = [
>> ....
>>           [
>>             '<hostname of NFS server>:/backup/proxmox-infra',
>>             '/mnt/pve/aurel',
>>             'nfs',
>>             'rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<if of
>> NFS server>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=<ip of NFS server>',
>>             '0',
>>             '0'
>>           ],
>> .....
>>         ];
>>
> 
> the culprit is likely that your storage.cfg contains the IP, but your
> /proc/mounts contains the hostname (with a reverse lookup inbetween?).
> 

I was following https://pve.proxmox.com/wiki/Storage:_NFS , quote: "To avoid DNS lookup delays, it is usually preferable to use an
IP address instead of a DNS name". But yes, the DNS in our environment is configured to allow reverse lookups.

> can you test using the hostname in your storage.cfg instead of the IP?

I removed the former definition and umounted the NFS share on all nodes. BTW, why is a storage not umounted when it is deleted
from the WebUI?

Now storage definition looks like:

nfs: aurel
	export /backup/proxmox-infra
	path /mnt/pve/aurel
	server aurel.XXXXX.de
	content backup
	maxfiles 30
	options vers=3

With this definition, the backup succeeded (and I got mails back from each host).

So it seems that the recommendation from the wiki prevents PVE's mechanism from working properly (when being used in an
environment where reverse name lookups are correctly configured).

>>
>> I tested the backup job with a local storage and then I got emails. So it is definitely something related to NFS and backups, not
>> the mailing mechanism.
>>
> 
> yes and no - nothing special about NFS here, would be triggered by any
> storage where storage_info (or the sub call to activate_storage) fails.
> 
> see my proposed patch for #1389 on pve-devel:
> https://pve.proxmox.com/pipermail/pve-devel/2017-May/026511.html
> 

I'm not familiar enough with Perl to judge whether this is enough.

From uwe.sauter.de at gmail.com  Mon May 22 15:37:32 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Mon, 22 May 2017 15:37:32 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <5189d740-2086-cae7-274c-6f38ce1adf4c@gmail.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
 <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
 <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>
 <20170519113127.33y4vob54yin2ejw@nora.maurer-it.com>
 <f9a76872-f599-6880-9488-2f9b4058f02a@gmail.com>
 <20170522112524.pzmwxrf5gziw5a6u@nora.maurer-it.com>
 <5189d740-2086-cae7-274c-6f38ce1adf4c@gmail.com>
Message-ID: <d2777d19-cf36-dd55-d9ce-c1a10fcaf43a@gmail.com>

>>
>> the culprit is likely that your storage.cfg contains the IP, but your
>> /proc/mounts contains the hostname (with a reverse lookup in between?).
>>
> 
> I was following https://pve.proxmox.com/wiki/Storage:_NFS , quote: "To avoid DNS lookup delays, it is usually preferable to use an
> IP address instead of a DNS name". But yes, the DNS in our environment is configured to allow reverse lookups.
> 
>> can you test using the hostname in your storage.cfg instead of the IP?
> 
> I removed the former definition and umounted the NFS share on all nodes. BTW, why is a storage not umounted when it is deleted
> from the WebUI?
> 
> Now storage definition looks like:
> 
> nfs: aurel
> 	export /backup/proxmox-infra
> 	path /mnt/pve/aurel
> 	server aurel.XXXXX.de
> 	content backup
> 	maxfiles 30
> 	options vers=3
> 
> 
> With this definition, the backup succeeded (and I got mails back from each host).
> 
> 
> So it seems that the recommendation from the wiki prevents PVE's mechanism from working properly (when being used in an
> environment where reverse name lookups are correctly configured).
> 

I discovered a different issue with this definition: If I go to Datacenter -> node -> storage aurel -> content I only get "mount
error: mount.nfs: /mnt/pve/aurel is busy or already mounted (500)".

The share is mounted with the IP address again, although I didn't change the config after the above.

# cat /proc/mounts
[...]
<IP address>:/backup/proxmox-infra /mnt/pve/aurel nfs
rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<IP
address>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=<IP address> 0 0

root@px-alpha-cluster:~# df
Filesystem                         1K-blocks      Used  Available Use% Mounted on
[...]
<IP address>:/backup/proxmox-infra 4294967296  63842304 4231124992   2% /mnt/pve/aurel

Also Datacenter -> node -> storage aurel -> summary says:

enabled: yes
active: no
content: vzdump backup file
type: NFS
usage: n/a

But to mention it again: backups do work now.

From uwe.sauter.de at gmail.com  Mon May 22 15:40:17 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Mon, 22 May 2017 15:40:17 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <d2777d19-cf36-dd55-d9ce-c1a10fcaf43a@gmail.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
 <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
 <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>
 <20170519113127.33y4vob54yin2ejw@nora.maurer-it.com>
 <f9a76872-f599-6880-9488-2f9b4058f02a@gmail.com>
 <20170522112524.pzmwxrf5gziw5a6u@nora.maurer-it.com>
 <5189d740-2086-cae7-274c-6f38ce1adf4c@gmail.com>
 <d2777d19-cf36-dd55-d9ce-c1a10fcaf43a@gmail.com>
Message-ID: <11f5fc25-3d9d-049c-fd2a-85217edd9475@gmail.com>

> 
> I discovered a different issue with this definition: If I go to Datacenter -> node -> storage aurel -> content I only get "mount
> error: mount.nfs: /mnt/pve/aurel is busy or already mounted (500)".
> 
> The share is mounted again with IP address though I didn't change the config after above.
> 
> # cat /proc/mounts
> [...]
> <IP address>:/backup/proxmox-infra /mnt/pve/aurel nfs
> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<IP
> address>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=<IP address> 0 0
> 
> root@px-alpha-cluster:~# df
> Filesystem                         1K-blocks      Used  Available Use% Mounted on
> [...]
> <IP address>:/backup/proxmox-infra 4294967296  63842304 4231124992   2% /mnt/pve/aurel
> 
> 
> Also Datacenter -> node -> storage aurel -> summary says:
> 
> enabled: yes
> active: no
> content: vzdump backup file
> type: NFS
> usage: n/a
> 
> 
> But to mention it again: backups do work now.
> 

The new issue only occurs on the one node where no VM with backups configured is running. The share is mounted but is not
active, and its content is not shown (though the backups from VMs on other nodes should be visible).

From uwe.sauter.de at gmail.com  Mon May 22 16:07:16 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Mon, 22 May 2017 16:07:16 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <11f5fc25-3d9d-049c-fd2a-85217edd9475@gmail.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
 <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
 <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>
 <20170519113127.33y4vob54yin2ejw@nora.maurer-it.com>
 <f9a76872-f599-6880-9488-2f9b4058f02a@gmail.com>
 <20170522112524.pzmwxrf5gziw5a6u@nora.maurer-it.com>
 <5189d740-2086-cae7-274c-6f38ce1adf4c@gmail.com>
 <d2777d19-cf36-dd55-d9ce-c1a10fcaf43a@gmail.com>
 <11f5fc25-3d9d-049c-fd2a-85217edd9475@gmail.com>
Message-ID: <39e67efa-812a-f679-3a99-ee3629cbe950@gmail.com>

On 22.05.2017 at 15:40, Uwe Sauter wrote:
> 
>>
>> I discovered a different issue with this definition: If I go to Datacenter -> node -> storage aurel -> content I only get "mount
>> error: mount.nfs: /mnt/pve/aurel is busy or already mounted (500)".
>>
>> The share is mounted again with IP address though I didn't change the config after above.
>>
>> # cat /proc/mounts
>> [...]
>> <IP address>:/backup/proxmox-infra /mnt/pve/aurel nfs
>> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<IP
>> address>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=<IP address> 0 0
>>
>> root@px-alpha-cluster:~# df
>> Filesystem                         1K-blocks      Used  Available Use% Mounted on
>> [...]
>> <IP address>:/backup/proxmox-infra 4294967296  63842304 4231124992   2% /mnt/pve/aurel
>>
>>
>> Also Datacenter -> node -> storage aurel -> summary says:
>>
>> enabled: yes
>> active: no
>> content: vzdump backup file
>> type: NFS
>> usage: n/a
>>
>>
>> But to mention it again: backups do work now.
>>
> 
> The new issue is only true for the one node where no VM is running that has backups configured. The share is mounted but is not
> active and content is not shown (though the backups from VMs on other nodes should be visible).
> 

After some more testing it seems that these were leftovers from previous configuration attempts. After a reboot, the issue
no longer exists.

From f.gruenbichler at proxmox.com  Tue May 23 08:48:46 2017
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Tue, 23 May 2017 08:48:46 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <5189d740-2086-cae7-274c-6f38ce1adf4c@gmail.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
 <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
 <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>
 <20170519113127.33y4vob54yin2ejw@nora.maurer-it.com>
 <f9a76872-f599-6880-9488-2f9b4058f02a@gmail.com>
 <20170522112524.pzmwxrf5gziw5a6u@nora.maurer-it.com>
 <5189d740-2086-cae7-274c-6f38ce1adf4c@gmail.com>
Message-ID: <20170523064846.72sii77olnimkfwb@nora.maurer-it.com>

On Mon, May 22, 2017 at 02:52:13PM +0200, Uwe Sauter wrote:
> >>> perl -e 'use strict; use warnings; use PVE::ProcFSTools; use Data::Dumper; print Dumper(PVE::ProcFSTools::parse_proc_mounts());'
> >>>
> >>
> >> $VAR1 = [
> >> ....
> >>           [
> >>             '<hostname of NFS server>:/backup/proxmox-infra',
> >>             '/mnt/pve/aurel',
> >>             'nfs',
> >>             'rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<if of
> >> NFS server>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=<ip of NFS server>',
> >>             '0',
> >>             '0'
> >>           ],
> >> .....
> >>         ];
> >>
> > 
> > the culprit is likely that your storage.cfg contains the IP, but your
> > /proc/mounts contains the hostname (with a reverse lookup in between?).
> > 
> 
> I was following https://pve.proxmox.com/wiki/Storage:_NFS , quote: "To avoid DNS lookup delays, it is usually preferable to use an
> IP address instead of a DNS name". But yes, the DNS in our environment is configured to allow reverse lookups.

which - AFAIK - is still true, especially since failing DNS means
failing NFS storage if you put the host name there. I think for NFSv4
the situation is slightly different, as reverse lookups are part of the
authentication process, but I haven't played around with that yet.

I cannot reproduce the behaviour you report with an NFS server with
working reverse lookup (proto and mountproto set to tcp, so the
resulting options string looks identical to yours modulo the addresses).
/proc/mounts contains the IP address as source if I put the IP address
into storage.cfg, and the hostname if I put the hostname in storage.cfg
(both on 4.4 and 5.0 Beta).
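
For anyone who wants to check this on their own nodes, here is a minimal
Python sketch. It is not part of PVE, and the function names
find_mount_source, source_kind and matches_config are made up for
illustration. It takes text in /proc/mounts format, looks up the source
of a given mount point, reports whether the host part is an IP literal or
a hostname, and compares the source against the server and export values
from storage.cfg. The sample address and hostname are placeholders.

```python
import ipaddress


def find_mount_source(mounts_text, mountpoint):
    """Return the source field (e.g. 'host:/export') for mountpoint, or None."""
    source = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: source, mount point, fstype, options, dump, pass
        if len(fields) >= 3 and fields[1] == mountpoint:
            source = fields[0]  # keep the last match (the most recent mount)
    return source


def source_kind(source):
    """Classify the host part of 'host:/export' as 'ip' or 'hostname'."""
    host = source.rsplit(":/", 1)[0].strip("[]")  # brackets wrap IPv6 literals
    try:
        ipaddress.ip_address(host)
        return "ip"
    except ValueError:
        return "hostname"


def matches_config(source, server, export):
    """True if the mounted source equals '<server>:<export>' from storage.cfg."""
    return source == f"{server}:{export}"


# Placeholder data; on a real node, read the text from /proc/mounts instead.
sample = "192.0.2.10:/backup/proxmox-infra /mnt/pve/aurel nfs rw,vers=3 0 0\n"
src = find_mount_source(sample, "/mnt/pve/aurel")
print(src, source_kind(src),
      matches_config(src, "aurel.example.org", "/backup/proxmox-infra"))
```

If the source kind differs from what is written in storage.cfg, that is
the mismatch suspected earlier in this thread.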

is there anything else in your setup/environment that might cause this
behaviour? what OS is the NFS server on? any entries in /etc/hosts
relating to the NFS server?

> 
> > can you test using the hostname in your storage.cfg instead of the IP?
> 
> I removed the former definition and umounted the NFS share on all nodes. BTW, why is a storage not umounted when it is deleted
> from the WebUI?

because storage deactivation in PVE happens mostly on a volume level,
and only when needed. deactivating something that is (potentially) still
needed is more dangerous than leaving something activated that is not ;)

> Now storage definition looks like:
> 
> nfs: aurel
> 	export /backup/proxmox-infra
> 	path /mnt/pve/aurel
> 	server aurel.XXXXX.de
> 	content backup
> 	maxfiles 30
> 	options vers=3
> 
> With this definition, the backup succeeded (and I got mails back from each host).

I suspected as much.

> So it seems that the recommendation from the wiki prevents PVE's mechanism from working properly (when being used in an
> environment where reverse name lookups are correctly configured).

... on your machine in your specific environment. Your report is the
first showing this behaviour that I know of, so until we get more
information I am inclined to not blame our instructions here :P running
with IP addresses instead of host names with NFSv3 has been shown to be
more robust (as in, we've had multiple cases where people experienced
NFS storage outages because of DNS problems).

> >> I tested the backup job with a local storage and then I got emails. So it is definitely something related to NFS and backups, not
> >> the mailing mechanism.
> > 
> > yes and no - nothing special about NFS here, would be triggered by any
> > storage where storage_info (or the sub call to activate_storage) fails.
> > 
> > see my proposed patch for #1389 on pve-devel:
> > https://pve.proxmox.com/pipermail/pve-devel/2017-May/026511.html
> 
> I'm not familiar enough with Perl to be able to comment whether this is enough?

was just intended as a heads up that this specific part of the problem
should be fixed soon (once the patch has been reviewed, applied, and
updated packages have trickled down through the repositories).

From uwe.sauter.de at gmail.com  Tue May 23 10:05:01 2017
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Tue, 23 May 2017 10:05:01 +0200
Subject: [PVE-User] Problems with backup process and NFS
In-Reply-To: <20170523064846.72sii77olnimkfwb@nora.maurer-it.com>
References: <9cd79bd6-a17a-92a6-2600-56a01e829a70@gmail.com>
 <20170519091749.ngg7w7zfnz2yyoyy@nora.maurer-it.com>
 <ad686413-df95-e3c3-0932-e88761334250@gmail.com>
 <20170519095357.kkecz5k6sst5dfip@nora.maurer-it.com>
 <d30d2fdf-d68f-412a-80f3-a1118df68628@gmail.com>
 <20170519113127.33y4vob54yin2ejw@nora.maurer-it.com>
 <f9a76872-f599-6880-9488-2f9b4058f02a@gmail.com>
 <20170522112524.pzmwxrf5gziw5a6u@nora.maurer-it.com>
 <5189d740-2086-cae7-274c-6f38ce1adf4c@gmail.com>
 <20170523064846.72sii77olnimkfwb@nora.maurer-it.com>
Message-ID: <7fc9f667-56ae-9988-6dba-3e335c9fa8f2@gmail.com>

Hi Fabian,

>> I was following https://pve.proxmox.com/wiki/Storage:_NFS , quote: "To avoid DNS lookup delays, it is usually preferable to use an
>> IP address instead of a DNS name". But yes, the DNS in our environment is configured to allow reverse lookups.
> 
> which - AFAIK - is still true, especially since failing DNS means
> failing NFS storage if you put the host name there. I think for NFSv4
> the situation is slightly different, as reverse lookups are part of the
> authentication process, but I haven't played around with that yet.

My goal was to use NFSv4 but with the Portmapper problem and no way to specify mount options in the WebUI, this thread is actually
based on NFSv3 (as you can see in the configuration and /proc/mounts).

> I cannot reproduce the behaviour you report with an NFS server with
> working reverse lookup (proto and mountproto set to tcp, so the
> resulting options string looks identical to yours modulo the addresses).
> /proc/mounts contains the IP address as source if I put the IP address
> into storage.cfg, and the hostname if I put the hostname in storage.cfg
> (both on 4.4 and 5.0 Beta).
> 
> is there anything else in your setup/environment that might cause this
> behaviour? what OS is the NFS server on? any entries in /etc/hosts
> relating to the NFS server?

* CentOS 7 with manually tweaked NFS options (I can share if needed)
* /etc/hosts only has entries for the PVE cluster hosts

I'd say that the DNS configuration in our network is state-of-the-art (we have capable people looking after our network services).

> 
>>
>>> can you test using the hostname in your storage.cfg instead of the IP?
>>
>> I removed the former definition and umounted the NFS share on all nodes. BTW, why is a storage not umounted when it is deleted
>> from the WebUI?
> 
> because storage deactivation in PVE happens mostly on a volume level,
> and only when needed. deactivating something that is (potentially) still
> needed is more dangerous than leaving something activated that is not ;)

Not true if some other share should be mounted in the same place but can't be because something is still mounted there. I did not
look into how PVE manages storage, but I had the case that when I removed a storage definition and replaced it with
almost the same one (hostname instead of IP), it wouldn't mount because a) something was still mounted there and b) technically it was
the same share.
But that's not the topic of this thread :D

> 
>> Now storage definition looks like:
>>
>> nfs: aurel
>> 	export /backup/proxmox-infra
>> 	path /mnt/pve/aurel
>> 	server aurel.XXXXX.de
>> 	content backup
>> 	maxfiles 30
>> 	options vers=3
>>
>> With this definition, the backup succeeded (and I got mails back from each host).
> 
> I suspected as much.
> 
>> So it seems that the recommendation from the wiki prevents PVE's mechanism from working properly (when being used in an
>> environment where reverse name lookups are correctly configured).
> 
> ... on your machine in your specific environment. Your report is the
> first showing this behaviour that I know of, so until we get more
> information I am inclined to not blame our instructions here :P running
> with IP addresses instead of host names with NFSv3 has been shown to be
> more robust (as in, we've had multiple cases where people experienced
> NFS storage outages because of DNS problems).

Fair point.

Truth be told this issue might also be caused by me playing around on both ends, PVE and NFS server.
I now have 2 shares defined (using IP addresses) and did reboot all cluster nodes. The shares get mounted and backups run without
problem.

So I would consider this closed?

Thanks for your help,

	Uwe

From martin at proxmox.com  Tue May 23 14:26:58 2017
From: martin at proxmox.com (Martin Maurer)
Date: Tue, 23 May 2017 14:26:58 +0200
Subject: [PVE-User] Proxmox VE 5.0 beta2 released!
Message-ID: <160f961a-f0b8-7a60-2f8f-8a9c37bc2418@proxmox.com>

Hi all,

We are proud to announce the release of the second beta of our Proxmox 
VE 5.x family. Based on the feedback from the first beta two months ago 
we improved a lot on the ISO installer and of course, on almost all 
other places.

Get more details from the forum announcement:

https://forum.proxmox.com/threads/proxmox-ve-5-0-beta2-released.34853/

-- 
Best Regards,

Martin Maurer
Proxmox VE project leader

martin at proxmox.com
http://www.proxmox.com

From gilberto.nunes32 at gmail.com  Tue May 23 14:47:02 2017
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Tue, 23 May 2017 09:47:02 -0300
Subject: [PVE-User] Proxmox VE 5.0 beta2 released!
In-Reply-To: <160f961a-f0b8-7a60-2f8f-8a9c37bc2418@proxmox.com>
References: <160f961a-f0b8-7a60-2f8f-8a9c37bc2418@proxmox.com>
Message-ID: <CAOKSTBtjhMTSJHAqi+pFK76H1oA44rhGa=TLmtbUDuTsTEL+Zg@mail.gmail.com>

Hi

I am anxious to test cloudinit... When will you release it?


-- 
Obrigado

Cordialmente

Gilberto Ferreira

Consultor TI Linux | IaaS Proxmox, CloudStack, KVM | Zentyal Server |
Zimbra Mail Server

(47) 3025-5907
(47) 99676-7530

Skype: konnectati

www.konnectati.com.br

From aderumier at odiso.com  Tue May 23 16:03:38 2017
From: aderumier at odiso.com (Alexandre DERUMIER)
Date: Tue, 23 May 2017 16:03:38 +0200 (CEST)
Subject: [PVE-User] [pve-devel]  Proxmox VE 5.0 beta2 released!
In-Reply-To: <CAOKSTBtjhMTSJHAqi+pFK76H1oA44rhGa=TLmtbUDuTsTEL+Zg@mail.gmail.com>
References: <160f961a-f0b8-7a60-2f8f-8a9c37bc2418@proxmox.com>
 <CAOKSTBtjhMTSJHAqi+pFK76H1oA44rhGa=TLmtbUDuTsTEL+Zg@mail.gmail.com>
Message-ID: <420650404.12924802.1495548218032.JavaMail.zimbra@oxygem.tv>

>>I am anxious to test cloudinit... When will you release it?

Hi

Today I sent the cloudinit patches from last year, rebased on the latest master.

Help is welcome if you want to test.

(The patches need to be applied to qemu-server.)


From gilberto.nunes32 at gmail.com  Tue May 23 16:06:18 2017
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Tue, 23 May 2017 11:06:18 -0300
Subject: [PVE-User] [pve-devel]  Proxmox VE 5.0 beta2 released!
In-Reply-To: <420650404.12924802.1495548218032.JavaMail.zimbra@oxygem.tv>
References: <160f961a-f0b8-7a60-2f8f-8a9c37bc2418@proxmox.com>
 <CAOKSTBtjhMTSJHAqi+pFK76H1oA44rhGa=TLmtbUDuTsTEL+Zg@mail.gmail.com>
 <420650404.12924802.1495548218032.JavaMail.zimbra@oxygem.tv>
Message-ID: <CAOKSTBswh+G2fQv-vXj1gv4yHYoyso2uONCd3vwtn1QxrjoT=A@mail.gmail.com>

Hi Alexandre

Just point me the way! I will do my best to test everything.


From aderumier at odiso.com  Tue May 23 16:14:59 2017
From: aderumier at odiso.com (Alexandre DERUMIER)
Date: Tue, 23 May 2017 16:14:59 +0200 (CEST)
Subject: [PVE-User] [pve-devel]  Proxmox VE 5.0 beta2 released!
In-Reply-To: <CAOKSTBswh+G2fQv-vXj1gv4yHYoyso2uONCd3vwtn1QxrjoT=A@mail.gmail.com>
References: <160f961a-f0b8-7a60-2f8f-8a9c37bc2418@proxmox.com>
 <CAOKSTBtjhMTSJHAqi+pFK76H1oA44rhGa=TLmtbUDuTsTEL+Zg@mail.gmail.com>
 <420650404.12924802.1495548218032.JavaMail.zimbra@oxygem.tv>
 <CAOKSTBswh+G2fQv-vXj1gv4yHYoyso2uONCd3vwtn1QxrjoT=A@mail.gmail.com>
Message-ID: <735614091.12925121.1495548899554.JavaMail.zimbra@oxygem.tv>

Here is a built deb with the patches:

http://odisoweb1.odiso.net/qemu-server_5.0-5_amd64.deb

You need to add a special cdrom with

#qm set vmid -(ide|sata)X storeid:cloudinit

(it will be created by Proxmox; it can be a file or any storeid)

Then, for configuration:

ipconfigX: ip=192.168.0.124/24,gw=192.168.0.1
searchdomain: mydomain.com
nameserver: 8.8.8.8
sshkey: ....
hostname: myhostname  (optional, default is the VM name)

(if you have a net0, you can add ipconfig0:....)

Then simply install the cloud-init daemon in your VM.

(I have tested it with Debian jessie backports.)

At start, Proxmox will generate the config on the cdrom, and the daemon in the VM will apply it.
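
Before putting an ipconfigX value into a VM config, it can help to
sanity-check it. The following is a minimal Python sketch, not code from
the patches: the function name parse_ipconfig is hypothetical, and it
only assumes the two keys shown above (ip and gw). Whether the patches
accept other keys is not covered here.

```python
import ipaddress


def parse_ipconfig(value):
    """Parse 'ip=<addr>/<prefix>,gw=<addr>' and check the gateway is on-link."""
    opts = {}
    for part in value.split(","):
        key, sep, val = part.strip().partition("=")
        if not sep or not key or not val:
            raise ValueError(f"malformed option: {part!r}")
        opts[key] = val
    if "ip" not in opts:
        raise ValueError("missing 'ip' option")

    # ip_interface keeps both the host address and its network
    iface = ipaddress.ip_interface(opts["ip"])
    result = {"ip": iface}
    if "gw" in opts:
        gw = ipaddress.ip_address(opts["gw"])
        if gw not in iface.network:
            raise ValueError(f"gateway {gw} is outside {iface.network}")
        result["gw"] = gw
    return result


cfg = parse_ipconfig("ip=192.168.0.124/24,gw=192.168.0.1")
print(cfg["ip"], cfg["gw"])
```

A gateway outside the configured subnet raises ValueError, which catches
the most common copy-and-paste mistake before the VM is started.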


From carheden at ucar.edu  Fri May 26 02:07:19 2017
From: carheden at ucar.edu (Adam Carheden)
Date: Thu, 25 May 2017 18:07:19 -0600
Subject: [PVE-User] How do I remove an old lrm entry?
Message-ID: <c35a58f2-a82f-f32a-8f07-e3439ce37240@ucar.edu>

How do I remove old lrm entries from ha-manager?

We had some unintended munging of /etc/hosts today, causing some
corosync problems. Everything is now recovered except that ha-manager
now lists two lrm entries for every host in the cluster:

lrm pve1 (active, Thu May 25 18:03:08 2017)
lrm pve1.fqdn.com (unable to read lrm status)
...

Does anyone know how I tell it to forget about the pve1.fqdn.com entry?
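
To list which entries are affected, a small Python sketch along these
lines might help. It only detects the duplicates and does not remove
anything. It assumes the status lines have exactly the "lrm <node>
(<state>)" shape shown above; the function name find_fqdn_duplicates is
made up for illustration.

```python
import re

# Matches lines such as: lrm pve1 (active, Thu May 25 18:03:08 2017)
LRM_LINE = re.compile(r"^lrm\s+(\S+)\s+\((.*)\)\s*$")


def find_fqdn_duplicates(status_text):
    """Return (short_name, fqdn, fqdn_state) for every node listed twice."""
    nodes = {}
    for line in status_text.splitlines():
        match = LRM_LINE.match(line.strip())
        if match:
            nodes[match.group(1)] = match.group(2)

    duplicates = []
    for name, state in nodes.items():
        short = name.split(".", 1)[0]
        # An FQDN entry counts as a duplicate if its short name is listed too
        if short != name and short in nodes:
            duplicates.append((short, name, state))
    return sorted(duplicates)


sample = (
    "lrm pve1 (active, Thu May 25 18:03:08 2017)\n"
    "lrm pve1.fqdn.com (unable to read lrm status)\n"
)
for short, fqdn, state in find_fqdn_duplicates(sample):
    print(f"{fqdn} duplicates {short}: {state}")
```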

Thanks
-- 
Adam Carheden

From lindsay.mathieson at gmail.com  Sun May 28 13:32:19 2017
From: lindsay.mathieson at gmail.com (Lindsay Mathieson)
Date: Sun, 28 May 2017 21:32:19 +1000
Subject: [PVE-User] Snapshots and "Now" not listing in Snapshots GUI
In-Reply-To: <20170522055348.3t2wy2du3uhwdc7r@nora.maurer-it.com>
References: <2fbb1726-b1ee-9ebf-6b7d-1cb6899c4a64@gmail.com>
 <20170522055348.3t2wy2du3uhwdc7r@nora.maurer-it.com>
Message-ID: <7eef3dc8-da3b-e7bf-56ed-1b3d9456bf23@gmail.com>

On 22/05/2017 3:53 PM, Fabian Grünbichler wrote:
> this should only be possible if you either didn't completely clean up a
> failed snapshot deletion, or did an incomplete manual snapshot deletion
> yourself.

Sorry for the late reply.

On reflection, I suspect you're quite right. Haven't had the issue since 
I removed the snapshots and the parent entry.

Thanks.

-- 
Lindsay Mathieson

From sce.tech at imereos.fr  Mon May 29 15:02:17 2017
From: sce.tech at imereos.fr (Martin LEUSCH)
Date: Mon, 29 May 2017 15:02:17 +0200
Subject: [PVE-User] Writing plugin or module for Proxmox
Message-ID: <1496062937.2031.2.camel@imereos.fr>

Hi,

I plan to enable HA on our new PVE cluster, but with our hosting
provider we can only map public IPs to physical servers. So if an HA VM
with a public IP is moved to another node, it no longer has internet
access. I can edit the IP mapping through a REST API.

I'm wondering if I can write a plugin or module for Proxmox to trigger
an action on VM migration or on an HA event.

-- 
Martin LEUSCH <sce.tech at imereos.fr>

From florent at coppint.com  Tue May 30 15:21:21 2017
From: florent at coppint.com (Florent B)
Date: Tue, 30 May 2017 15:21:21 +0200
Subject: [PVE-User] HTTPS for download.proxmox.com
Message-ID: <b474de64-e006-c9ed-a89a-24aa54360cf5@coppint.com>

Hi PVE team,

Would it be possible to include "download.proxmox.com" in the SSL
certificate so that downloads can be accessed over HTTPS?

The current certificate is only valid for proxmox.com & enterprise.proxmox.com.

Thank you.

Florent

From gilberto.nunes32 at gmail.com  Tue May 30 15:21:49 2017
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Tue, 30 May 2017 10:21:49 -0300
Subject: [PVE-User] Spice session using local internet link...
Message-ID: <CAOKSTBv-K1G9M4=HswVBujBy74w1O_0gC4N2yx+LUfweVgPXMw@mail.gmail.com>

Hello list

I have a requirement here: I want to route the internet access inside a
remote Spice session through the local link.
Let's say somebody accesses Proxmox remotely through a Spice session but
doesn't want to use the internet link inside the Spice session; instead,
certain traffic should be re-routed to the client's location, so that it
uses the local internet access.

I hope I made myself clear enough.

Thanks for any advice!


From gilberto.nunes32 at gmail.com  Wed May 31 22:06:05 2017
From: gilberto.nunes32 at gmail.com (Gilberto Nunes)
Date: Wed, 31 May 2017 17:06:05 -0300
Subject: [PVE-User] qm agent
Message-ID: <CAOKSTBuKvPeWsXaCwLrnJ4TtryiD==7hjYxWw14QspL=RqVHcQ@mail.gmail.com>

Hi
I have Windows 7 64-bit installed on PVE 5 (latest beta), and after installing
qemu-agent I tried to run the following command:

qm agent 101 network-get-interfaces

But nothing happens!

Is there anything more I can do to make this work?

Thanks
