Considering adding support for TCP Zero Copy

Posted by Pavlos Parissis 
Pavlos Parissis
Considering adding support for TCP Zero Copy
May 03, 2018 12:50PM
Hi,

Linux kernel 4.14 adds support for zero-copy transmission from user memory to TCP sockets via the
MSG_ZEROCOPY flag. This covers only the sending side of the socket; for the receiving side
we need to wait for kernel 4.18.

Will you consider enabling this on HAProxy?

More info can be found here, https://www.kernel.org/doc/html/latest/networking/msg_zerocopy.html
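
For readers unfamiliar with the flag, here is a minimal sketch of the send-side flow described in the kernel document linked above: opt in on the socket, pass the flag on send, then read completion notifications from the socket error queue before reusing the buffer. It assumes Linux 4.14 or later. The numeric constants are hard-coded assumptions taken from the generic Linux headers (they differ on a few architectures), and all function names are hypothetical.

```python
import socket
import struct

# Assumed values from the generic Linux headers; used only when this
# Python build does not export the constants itself.
SO_ZEROCOPY = getattr(socket, "SO_ZEROCOPY", 60)
MSG_ZEROCOPY = getattr(socket, "MSG_ZEROCOPY", 0x4000000)
SO_EE_ORIGIN_ZEROCOPY = 5

# struct sock_extended_err: ee_errno, ee_origin, ee_type, ee_code,
# ee_pad, ee_info, ee_data
_SOCK_EXTENDED_ERR = struct.Struct("=IBBBBII")


def enable_zerocopy(sock: socket.socket) -> None:
    """Opt the socket in; the kernel ignores MSG_ZEROCOPY otherwise."""
    sock.setsockopt(socket.SOL_SOCKET, SO_ZEROCOPY, 1)


def send_zerocopy(sock: socket.socket, payload: bytes) -> int:
    """Queue payload without copying it into kernel memory.

    The caller must keep payload alive and unmodified until the matching
    completion has been read from the error queue.
    """
    return sock.send(payload, MSG_ZEROCOPY)


def parse_completion(cmsg_data: bytes):
    """Return (first, last) completed send counters, or None if the
    control message is not a zerocopy completion."""
    if len(cmsg_data) < _SOCK_EXTENDED_ERR.size:
        return None
    ee_errno, ee_origin, _type, _code, _pad, ee_info, ee_data = (
        _SOCK_EXTENDED_ERR.unpack_from(cmsg_data))
    if ee_errno != 0 or ee_origin != SO_EE_ORIGIN_ZEROCOPY:
        return None
    return ee_info, ee_data


def reap_completions(sock: socket.socket) -> list:
    """Drain pending zerocopy notifications from the socket error queue."""
    ranges = []
    while True:
        try:
            _data, ancdata, _flags, _addr = sock.recvmsg(
                1, 512, socket.MSG_ERRQUEUE | socket.MSG_DONTWAIT)
        except BlockingIOError:
            return ranges  # error queue is empty
        for _level, _ctype, cmsg_data in ancdata:
            done = parse_completion(cmsg_data)
            if done is not None:
                ranges.append(done)
```

The extra `recvmsg()` call on the error queue is the per-buffer bookkeeping that a zero-copy sender has to do before it can recycle a buffer.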

Cheers,
Pavlos
Olivier Houchard
Re: Considering adding support for TCP Zero Copy
May 03, 2018 02:50PM
Hi Pavlos,

On Thu, May 03, 2018 at 12:45:42PM +0200, Pavlos Parissis wrote:
> Will you consider enabling this on HAProxy?

After some discussion with Willy, we're not sure it is worth it.
It would force us to release buffers much later than we currently do, it can't
be used with SSL, and we already achieve zero-copy by using splicing.

Is there any specific case where you think it'd be a huge win?

Regards,

Olivier
Pavlos Parissis
Re: Considering adding support for TCP Zero Copy
May 03, 2018 03:00PM
On 03/05/2018 02:45 PM, Olivier Houchard wrote:
> Is there any specific case where you think it'd be a huge win ?
>

The only use case that I can think of is HTTP streaming. But without testing it, we can't say much.

Thanks,
Pavlos
Willy Tarreau
Re: Considering adding support for TCP Zero Copy
May 03, 2018 07:30PM
On Thu, May 03, 2018 at 02:51:12PM +0200, Pavlos Parissis wrote:
> The only use case that I can think of is HTTP streaming. But without testing it, we can't say much.

In fact, for HTTP streaming, splicing already does it all and even
better since it only manipulates a few pointers in the kernel between
the source and destination socket buffers. Userspace is not even
involved.

Also, it's important to remember that while copies are best avoided
whenever possible, they aren't that costly at common traffic
rates. I've already reached 60 Gbps of forwarded traffic both with and
without splicing on a 4-core machine.

One aspect to keep in mind is the following. A typical Xeon system will
achieve around 20 GB/s of in-L3 memcpy() bandwidth. For a typical 16kB
buffer, that's only about 760 ns to copy the whole buffer, which is roughly the
cost of the extra syscall needed to check that the transfer completed.
At 10 Gbps (1.25 GB/s, one sixteenth of that memcpy() bandwidth), this
represents only 6.25% of the total processing time.
And there's something much more important: with the copy operation,
the buffer is released after these 760 ns and immediately recycled for
other connections. This ensures that memory usage remains low and
that most transfer operations happen in the L3 cache instead of RAM. If you
use zero-copy here instead, your memory will be pinned for the time
it takes to cycle through many other connections and get back to processing
this FD. That can very easily become 10-100 microseconds, or 15-150 times
longer, resulting in much more RAM usage for temporary buffers, and thus
a much higher cache footprint.
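
The back-of-the-envelope figures above can be reproduced with a few lines. This is only a sketch: it assumes decimal units for the bandwidths (20e9 bytes/s, 10e9 bits/s) and a 16384-byte buffer, so the copy time comes out a little above the rounded 760 ns quoted above, and the function names are made up for the illustration.

```python
def copy_time_ns(buffer_bytes: int, memcpy_bytes_per_s: float) -> float:
    """Time to memcpy() one buffer, in nanoseconds."""
    return buffer_bytes / memcpy_bytes_per_s * 1e9


def copy_share(link_bits_per_s: float, memcpy_bytes_per_s: float) -> float:
    """Fraction of time spent copying when forwarding at link rate."""
    return (link_bits_per_s / 8) / memcpy_bytes_per_s


def pin_ratio(pinned_us: float, copy_ns: float) -> float:
    """How much longer a buffer stays busy when pinned than when copied."""
    return pinned_us * 1000 / copy_ns


BUFFER_BYTES = 16 * 1024   # assumed: "16kB" taken as 16384 bytes
MEMCPY_RATE = 20e9         # assumed: 20 GB/s in decimal units
LINK_RATE = 10e9           # 10 Gbps

copy_ns = copy_time_ns(BUFFER_BYTES, MEMCPY_RATE)
share = copy_share(LINK_RATE, MEMCPY_RATE)   # 0.0625, i.e. 6.25%

print(f"copy time: {copy_ns:.0f} ns, share of time at 10 Gbps: {share:.2%}")
for pinned_us in (10, 100):
    print(f"pinned for {pinned_us} us: "
          f"{pin_ratio(pinned_us, copy_ns):.0f}x longer than a copy")
```

Under these assumptions the pinned-buffer ratios land in the same order of magnitude as the 15-150 range mentioned above; the exact values depend on the rounding chosen for the copy time.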

In my opinion, MSG_ZEROCOPY was designed for servers, such as those which stream
video, which produce their own data and don't need
to recycle their buffers. That's definitely not our case here:
we're just forwarding ephemeral data, so we can recycle buffers very quickly,
and through splicing we can even avoid seeing this data at all.

Hoping this helps,
Willy
Pavlos Parissis
Re: Considering adding support for TCP Zero Copy
May 04, 2018 10:40AM
Thanks for this very detailed response; once again, I learned a lot.

Cheers,
Pavlos