[PVE-User] (Very) basic question regarding PVE Ceph integration

Ronny Aasen ronny+pve-user at aasen.cx
Mon Dec 17 11:58:13 CET 2018


On 17.12.2018 11:52, Frank Thommen wrote:
> Hi Alwin,
> 
> On 12/16/18 7:47 PM, Alwin Antreich wrote:
>> On Sun, Dec 16, 2018 at 05:16:50PM +0100, Frank Thommen wrote:
>>> Hi Alwin,
>>>
>>> On 16/12/18 15:39, Alwin Antreich wrote:
>>>> Hello Frank,
>>>>
>>>> On Sun, Dec 16, 2018 at 02:28:19PM +0100, Frank Thommen wrote:
>>>>> Hi,
>>>>>
>>>>> I understand that with the new PVE release PVE hosts (hypervisors)
>>>>> can be used as Ceph servers.  But it's not clear to me if (or when)
>>>>> that makes sense.  Do I really want to have Ceph MDS/OSD on the same
>>>>> hardware as my hypervisors?  Doesn't that a) accumulate multiple POFs
>>>>> on the same hardware and b) occupy computing resources (CPU, RAM)
>>>>> that I'd rather use for my VMs and containers?  Wouldn't I rather
>>>>> want to have a separate Ceph cluster?
>>>> The integration of Ceph services in PVE started with Proxmox VE 3.0.
>>>> With PVE 5.3 (current) we added CephFS services to PVE.  So you can
>>>> run hyper-converged Ceph with RBD/CephFS on the same servers as your
>>>> VMs/CTs.
>>>>
>>>> a) Can you please be more specific about what you see as multiple
>>>> points of failure?
>>>
>>> Not only do I run the hypervisor, which controls containers and virtual
>>> machines, on the server, but also the file service that is used to store
>>> the VM and container images.
>> Sorry, I am still not quite sure what your question/concern is.
>> Failure tolerance needs to be planned into the system design, 
>> irrespective
>> of service distribution.
>>
>> Proxmox VE has an HA stack that restarts all services from a failed node
>> (if configured) on another node.
>> https://pve.proxmox.com/pve-docs/chapter-ha-manager.html
>>
>> Ceph does self-healing (if enough nodes are available) or keeps working
>> in a degraded state.
>> http://docs.ceph.com/docs/luminous/start/intro/
> 
> Yes, I am aware of the PVE and Ceph failover/healing capabilities.  But I
> have always liked to separate basic and central services at the hardware
> level.  That way, if one server "explodes", only one service is affected.
> With PVE+Ceph on one node, such an outage would affect two basic services
> at once.  I'm not saying they wouldn't continue to run productively, but
> they would run in a degraded and non-failure-safe mode - assuming we had
> three such nodes in the cluster - until the broken node can be restored.

With Ceph I feel the very minimum OSD node count is 4, with 25% free space,
so you can lose one node and have the cluster rebuild itself back to a
healthy state.
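
As a rough back-of-the-envelope check (a sketch only; the node count, raw
capacity and fill target below are example numbers, and the real headroom
also depends on replication size and failure domains), something like this
in Python illustrates why you want roughly 25% or more free space:

    # Can a cluster of N equal OSD nodes absorb the loss of one node
    # and still rebalance back to a healthy state?
    nodes = 4                 # example: 4 OSD nodes
    raw_per_node = 10.0       # example: 10 TB raw capacity per node
    target_max_fill = 0.85    # stay under Ceph's default nearfull ratio

    raw_total = nodes * raw_per_node
    raw_after_loss = (nodes - 1) * raw_per_node

    # Maximum raw usage (data plus replicas) before the failure so that
    # the surviving nodes still end up below the fill target after recovery:
    max_raw_used = raw_after_loss * target_max_fill
    print(f"safe raw usage: {max_raw_used:.1f} of {raw_total:.1f} TB "
          f"({max_raw_used / raw_total:.0%})")

With 4 nodes that works out to roughly 64% usage, i.e. about a third free.
And with only 3 nodes and 3x replication there is no spare node left to
re-replicate to at all, so the cluster stays degraded until the failed node
returns - which is why I would not go below 4.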

I only consider HCI if the performance of the cluster is way above the
requirements.  And once the needs grow and you add more nodes, you often
end up moving the OSD nodes out into separate machines anyway.


kind regards
Ronny Aasen


