[PVE-User] Confusing about Bond 802.3ad

Josh Knight josh at noobbox.com
Fri Aug 24 17:57:48 CEST 2018


I don't know your topology; I'm assuming you're going from nodeA ->
switch -> nodeB?  Make sure that entire path is using RR.  You could
verify this with the interface counters on the various hops, for example
as sketched below.  If a single hop is not doing it correctly, it will
limit the throughput.
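
A rough sketch of how to check that on a Linux node (bond0, eno1 and eno2
are placeholder names, adjust to your setup):

    # confirm which bonding mode and which slaves are actually active
    cat /proc/net/bonding/bond0

    # watch per-interface packet counters while iperf runs; with
    # balance-rr all slaves should increase at roughly the same rate
    watch -n1 'ip -s link show eno1; ip -s link show eno2'

On a managed switch you would watch the per-port counters of the LAG
members in the same way.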

On Fri, Aug 24, 2018 at 11:20 AM Gilberto Nunes <gilberto.nunes32 at gmail.com>
wrote:

> So I tried balance-rr with a LAG on the switch and still get only 1 Gbps
>
> pve-ceph02:~# iperf3 -c 10.10.10.100
> Connecting to host 10.10.10.100, port 5201
> [  4] local 10.10.10.110 port 52674 connected to 10.10.10.100 port 5201
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-1.00   sec   116 MBytes   974 Mbits/sec   32    670 KBytes
> [  4]   1.00-2.00   sec   112 MBytes   941 Mbits/sec    3    597 KBytes
> [  4]   2.00-3.00   sec   112 MBytes   941 Mbits/sec    3    509 KBytes
> [  4]   3.00-4.00   sec   112 MBytes   941 Mbits/sec    0    660 KBytes
> [  4]   4.00-5.00   sec   112 MBytes   941 Mbits/sec    6    585 KBytes
> [  4]   5.00-6.00   sec   112 MBytes   941 Mbits/sec    0    720 KBytes
> [  4]   6.00-7.00   sec   112 MBytes   942 Mbits/sec    3    650 KBytes
> [  4]   7.00-8.00   sec   112 MBytes   941 Mbits/sec    4    570 KBytes
> [  4]   8.00-9.00   sec   112 MBytes   941 Mbits/sec    0    708 KBytes
> [  4]   9.00-10.00  sec   112 MBytes   941 Mbits/sec    8    635 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  1.10 GBytes   945 Mbits/sec   59             sender
> [  4]   0.00-10.00  sec  1.10 GBytes   942 Mbits/sec                  receiver
>
> iperf Done.
>
>
>
> ---
> Gilberto Nunes Ferreira
>
> (47) 3025-5907
> (47) 99676-7530 - Whatsapp / Telegram
>
> Skype: gilberto.nunes36
>
>
>
>
> 2018-08-24 12:02 GMT-03:00 Josh Knight <josh at noobbox.com>:
>
> > Depending on your topology/configuration, you could try to use
> > balance-rr (bond-rr) mode in Linux instead of 802.3ad.
> >
> > Balance-rr is the only bonding mode that will spread packets for the
> > same MAC/IP/port tuple across multiple interfaces.  This works well for
> > UDP, but TCP may suffer performance issues because packets can arrive
> > out of order and trigger TCP retransmits.  There are some examples on
> > this page; you may need to do some testing before deploying it to make
> > sure it does what you want.
> >
> > https://wiki.linuxfoundation.org/networking/bonding#bonding-driver-options
> >
> > As others have stated, you can adjust the hashing, but a single flow
> > (MAC/IP/port combination) will still be limited to 1 Gbps without using
> > round-robin mode.
> >
> >
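A minimal sketch of what a balance-rr bond could look like in
/etc/network/interfaces on a Proxmox/Debian node (eno1-eno4, vmbr0 and the
address are placeholders; per the kernel bonding docs the switch ports
typically have to be grouped as a static trunk/etherchannel, not an LACP
LAG, for round-robin to work):

    auto bond0
    iface bond0 inet manual
        bond-slaves eno1 eno2 eno3 eno4
        bond-mode balance-rr
        bond-miimon 100

    auto vmbr0
    iface vmbr0 inet static
        address 10.10.10.110
        netmask 255.255.255.0
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0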
> > On Fri, Aug 24, 2018 at 6:52 AM mj <lists at merit.unu.edu> wrote:
> >
> > > Hi,
> > >
> > > Yes, it is our understanding that if the hardware (switch) supports it,
> > > "bond-xmit-hash-policy layer3+4" gives you the best spread.
> > >
> > > But it will still give you 4 'lanes' of 1 Gbps each. Ceph will connect
> > > using different ports, IPs, etc., and each connection should use a
> > > different lane, so altogether you should see a network throughput that
> > > (theoretically) could be as high as 4 Gbps.
> > >
> > > That is how we understand it.
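Roughly, that bond would look like the following sketch in
/etc/network/interfaces on the Proxmox side (interface names are
placeholders; the switch ports must be configured as an LACP trunk/LAG):

    auto bond0
    iface bond0 inet manual
        bond-slaves eno1 eno2 eno3 eno4
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4

Each single TCP connection still tops out at 1 Gbps, but different
connections (different ports/IPs) can hash onto different member links.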
> > >
> > > You can also try something on the switch, like we did on our ProCurve:
> > >
> > > >  Procurve chassis(config)# show trunk
> > > >
> > > > Load Balancing Method:  L3-based (default)
> > > >
> > > >  Port | Name                             Type      | Group  Type
> > > >  ---- + -------------------------------- --------- + ------ --------
> > > >  D1   | Link to prn004 - 1               10GbE-T   | Trk1   LACP
> > > >  D2   | Link to prn004 - 2               10GbE-T   | Trk1   LACP
> > > >  D3   | Link to prn005 - 1               10GbE-T   | Trk2   LACP
> > > >  D4   | Link to prn005 - 2               10GbE-T   | Trk2   LACP
> > >
> > > Namely: change the load balancing method to:
> > >
> > > > Procurve chassis(config)# trunk-load-balance L4
> > >
> > > So the load balancing is now based on Layer 4 instead of Layer 3.
> > >
> > > Besides these details, I think what you are doing should work nicely.
> > >
> > > MJ
> > >
> > >
> > >
> > > On 08/24/2018 12:45 PM, Uwe Sauter wrote:
> > > > If using standard 802.3ad (LACP) you will always get only the
> > > > performance of a single link between one host and another.
> > > >
> > > > Using "bond-xmit-hash-policy layer3+4" might get you better
> > > > performance, but it is not standard LACP.
> > > >
> > > >
> > > >
> > > >
> > > > Am 24.08.18 um 12:01 schrieb Gilberto Nunes:
> > > >> So what bond mode am I supposed to use in order to get more speed?
> > > >> I mean, how do I join the NICs to get 4 Gbps? I will use Ceph!
> > > >> I know I should use 10 Gb but I don't have it right now.
> > > >>
> > > >> Thanks
> > > >> On 24/08/2018 03:01, "Dietmar Maurer" <dietmar at proxmox.com> wrote:
> > > >>
> > > >>>> Isn't 802.3ad supposed to aggregate the speed of all the available
> > > >>>> NICs?
> > > >>>
> > > >>> No, not really. One connection is limited to 1 Gbps. If you start more
> > > >>> parallel connections you can gain more speed.
> > > >>>
> > > >>>
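To check whether multiple flows actually spread across the links, iperf3
can be run with several parallel streams (a quick sketch; each -P stream
uses a different source port, so with a layer3+4 hash policy the streams
can land on different member links):

    # on the receiving node
    iperf3 -s

    # on the sending node: 4 parallel TCP streams for 30 seconds
    iperf3 -c 10.10.10.100 -P 4 -t 30

With a plain layer2 or layer3 hash, all streams between the same two hosts
still share one link, so the aggregate stays near 1 Gbps.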
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>


