[PVE-User] (Very) basic question regarding PVE Ceph integration

Frank Thommen f.thommen at dkfz-heidelberg.de
Sun Dec 16 17:16:50 CET 2018


Hi Alwin,

On 16/12/18 15:39, Alwin Antreich wrote:
> Hello Frank,
> 
> On Sun, Dec 16, 2018 at 02:28:19PM +0100, Frank Thommen wrote:
>> Hi,
>>
>> I understand that with the new PVE release, PVE hosts (hypervisors) can be
>> used as Ceph servers.  But it's not clear to me if (or when) that makes
>> sense.  Do I really want to have Ceph MDS/OSD on the same hardware as my
>> hypervisors?  Doesn't that a) accumulate multiple points of failure on the
>> same hardware and b) occupy computing resources (CPU, RAM) that I'd rather
>> use for my VMs and containers?  Wouldn't I rather want a separate Ceph
>> cluster?
> The integration of Ceph services in PVE started with Proxmox VE 3.0.
> With PVE 5.3 (current) we added CephFS services to PVE. So you can
> run a hyper-converged Ceph with RBD/CephFS on the same servers as your
> VMs/CTs.
> 
> a) Can you please be more specific about what you see as multiple
> points of failure?

Not only do I run the hypervisor that controls containers and virtual
machines on the server, but also the file service that is used to store
the VM and container images.


> b) That depends on the workload of your nodes. Modern server hardware has
> enough power to run multiple services. It all comes down to having enough
> resources for each domain (e.g. Ceph, KVM, CT, host).
> 
> I recommend using a simple calculation as a start, just to get a
> direction; a few small worked examples are sketched below.
> 
> In principle:
> 
> ==CPU==
> core='CPU with HT on'
> 
> * reserve a core for each Ceph daemon
>    (preferably on the same NUMA node as the network; higher frequency is
>    better)
> * one core for the network card (higher frequency = lower latency)
> * rest of the cores for the OS (incl. monitoring, backup, ...) and KVM/CT usage
> * don't overcommit
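> 
> For illustration only, a tiny Python sketch of that core budget, with
> purely hypothetical numbers (a 16-core node running 4 OSDs and 1 MON):
> 
>    cores_total   = 16              # hypothetical: 16 cores, HT threads counted
>    ceph_daemons  = 4 + 1           # hypothetical: 4 OSDs + 1 MON on this node
>    cores_ceph    = ceph_daemons    # one core reserved per Ceph daemon
>    cores_network = 1               # one core for the NIC
>    cores_os      = 2               # hypothetical: OS, monitoring, backup, ...
>    cores_guests  = cores_total - cores_ceph - cores_network - cores_os
>    print(f"cores left for KVM/CT: {cores_guests}")   # 16 - 5 - 1 - 2 = 8
> 
> If that number gets close to zero, the node is too small to run both
> roles without overcommitting.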
> 
> ==Memory==
> * 1 GB of RAM per TB of used disk space on an OSD (more during recovery)
> * enough memory for KVM/CT
> * free memory for OS, backup, monitoring, live migration
> * don't overcommit
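> 
> The same kind of sketch for memory, again with purely hypothetical
> numbers (4 OSDs with 4 TB used each, 64 GB of guest memory):
> 
>    osd_used_tb    = 4 * 4            # hypothetical: 4 OSDs, 4 TB used each
>    ram_ceph_gb    = osd_used_tb * 1  # rule of thumb: 1 GB per TB used
>    ram_guests_gb  = 64               # hypothetical: sum of all KVM/CT memory
>    ram_reserve_gb = 16               # hypothetical: OS, backup, monitoring, live migration
>    ram_needed_gb  = ram_ceph_gb + ram_guests_gb + ram_reserve_gb
>    print(f"minimum RAM without overcommit: {ram_needed_gb} GB")   # 16 + 64 + 16 = 96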
> 
> ==Disk==
> * one OSD daemon per disk, even disk sizes throughout the cluster
> * more disks, more hosts, better distribution
> 
> ==Network==
> * at least 10 GbE for storage traffic (the more the better),
>    see our benchmark paper
>    https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2018-02.41761/
> * separate networks for cluster, storage, and client traffic;
>    additionally, separate the migration network from any other
> * use two physical networks for corosync (ring0 & ring1)
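> 
> And a quick sanity check that the storage link can keep up with the
> local OSDs, again with made-up throughput numbers only:
> 
>    osd_count     = 4
>    osd_mb_per_s  = 200                # hypothetical per-disk throughput (MB/s)
>    nic_mb_per_s  = 10_000 / 8 * 0.9   # ~10 GbE minus ~10% protocol overhead
>    aggregate_osd = osd_count * osd_mb_per_s
>    print(f"OSDs: {aggregate_osd:.0f} MB/s vs NIC: {nic_mb_per_s:.0f} MB/s")
> 
> Once the aggregate OSD throughput approaches the NIC, a faster or
> bonded storage link is worth considering.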
> 
> This list doesn't cover every aspect (e.g. how much failure is
> tolerable), but I think it is a good start. With the above points for
> sizing your cluster, the question of whether to separate the Ceph
> services or run them hyper-converged should be a little easier to
> answer.

Thanks a lot.  This sure helps in our planning.

frank

> 
> --
> Cheers,
> Alwin



