[RFC][PATCHES] seamless reload

Posted by Olivier Houchard 
Conrad Hoffmann
Re: [RFC][PATCHES] seamless reload
April 13, 2017 03:10PM
On 04/13/2017 02:28 PM, Olivier Houchard wrote:
> On Thu, Apr 13, 2017 at 12:59:38PM +0200, Conrad Hoffmann wrote:
>> On 04/13/2017 11:31 AM, Olivier Houchard wrote:
>>> On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
>>>> Hi Olivier,
>>>>
>>>> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
>>>>> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
>>>>>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
>>>>>>> Hi again,
>>>>>>>
>>>>>>> so I tried to get this to work, but haven't managed yet. I also don't quite
>>>>>>> understand how this is supposed to work. The first haproxy process is
>>>>>>> started _without_ the -x option, is that correct? Where does that instance
>>>>>>> ever create the socket for transfer to later instances?
>>>>>>>
>>>>>>> I have it working now insofar as, on reload, subsequent instances are
>>>>>>> spawned with the -x option, but they'll just complain that they can't get
>>>>>>> anything from the unix socket (because, as far as I can tell, it's not
>>>>>>> there?). I also can't see the relevant code path where this socket gets
>>>>>>> created, but I didn't have time to read all of it yet.
>>>>>>>
>>>>>>> Am I doing something wrong? Did anyone get this to work with the
>>>>>>> systemd-wrapper so far?
>>>>>>>
>>>>>>> Also, though this might be a coincidence, my test setup takes a huge
>>>>>>> performance penalty just from applying your patches (without any reloading
>>>>>>> whatsoever). Did this happen to anybody else? I'll send some numbers and
>>>>>>> more details tomorrow.
>>>>>>>
>>>>>>
>>>>>> OK, I can confirm the performance issues; I'm investigating.
>>>>>>
>>>>>
>>>>> Found it, I was messing with SO_LINGER when I shouldn't have been.
>>>>
>>>> <removed code for brevity>
>>>>
>>>> Thanks a lot, I can confirm that the performance regression seems to be gone!
>>>>
>>>> I am still having the other (conceptual) problem, though. Sorry if this is
>>>> just me holding it wrong or something, it's been a while since I dug
>>>> through the internals of haproxy.
>>>>
>>>> So, as I mentioned before, we use nbproc (12) and the systemd-wrapper,
>>>> which in turn starts haproxy in daemon mode, giving us a process tree like
>>>> this (path and file names shortened for brevity):
>>>>
>>>> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
>>>> \_ /u/s/haproxy-master
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>
>>>> Now, in our config file, we have something like this:
>>>>
>>>> # expose admin socket for each process
>>>> stats socket ${STATS_ADDR} level admin process 1
>>>> stats socket ${STATS_ADDR}-2 level admin process 2
>>>> stats socket ${STATS_ADDR}-3 level admin process 3
>>>> stats socket ${STATS_ADDR}-4 level admin process 4
>>>> stats socket ${STATS_ADDR}-5 level admin process 5
>>>> stats socket ${STATS_ADDR}-6 level admin process 6
>>>> stats socket ${STATS_ADDR}-7 level admin process 7
>>>> stats socket ${STATS_ADDR}-8 level admin process 8
>>>> stats socket ${STATS_ADDR}-9 level admin process 9
>>>> stats socket ${STATS_ADDR}-10 level admin process 10
>>>> stats socket ${STATS_ADDR}-11 level admin process 11
>>>> stats socket ${STATS_ADDR}-12 level admin process 12
>>>>
>>>> Basically, we have a dedicated admin socket for each ("real") process, as we
>>>> need to be able to talk to each process individually. So I was wondering:
>>>> which admin socket should I pass as HAPROXY_STATS_SOCKET? I initially
>>>> thought it would have to be a special stats socket in the haproxy-master
>>>> process (which we currently don't have), but as far as I can tell from the
>>>> output of `lsof` the haproxy-master process doesn't even hold any FDs
>>>> anymore. Will this setup currently work with your patches at all? Do I need
>>>> to add a stats socket to the master process? Or would this require a list
>>>> of stats sockets to be passed, similar to the list of PIDs that gets passed
>>>> to new haproxy instances, so that each process can talk to the one from
>>>> which it is taking over the socket(s)? In case I need a stats socket for
>>>> the master process, what would be the directive to create it?
>>>>
>>>
>>> Hi Conrad,
>>>
>>> Any of those sockets will do. Each process is made to keep all the
>>> listening sockets open, even if the proxy is not bound to that specific
>>> process, precisely so that they can be transferred via the unix socket.
>>>
>>> Regards,
>>>
>>> Olivier
>>
>>
>> Thanks, I am finally starting to understand, but I think there still might
>> be a problem. I didn't see it initially, but when I use one of the
>> processes' existing admin sockets it still fails, with the following messages:
>>
>> 2017-04-13_10:27:46.95005 [WARNING] 102/102746 (14101) : We didn't get the
>> expected number of sockets (expecting 48 got 37)
>> 2017-04-13_10:27:46.95007 [ALERT] 102/102746 (14101) : Failed to get the
>> sockets from the old process!
>>
>> I have a suspicion about the possible reason. We have a two-tier setup, as
>> is often recommended here on the mailing list: 11 processes do (almost)
>> only SSL termination, then pass to a single process that does most of the
>> heavy lifting. These processes use different sockets of course (we use
>> `bind-process 1` and `bind-process 2-X` in frontends). The message above is
>> from the first process, which is the non-SSL one. When using an admin
>> socket from any of the other processes, the message changes to "(expecting
>> 48 got 17)".
>>
>> I assume the patches are incompatible with such a setup at the moment?
>>
>> Thanks once more :)
>> Conrad
>
> Hmm that should not happen, and I can't seem to reproduce it.
> Can you share the haproxy config file you're using? Is the number of sockets
> received always the same? How are you generating your load? Is it happening
> on each reload?
>
> Thanks a lot for going through this, this is really appreciated :)

I am grateful myself you're helping me through this :)

So I removed all the logic and backends from our config file; it's still
quite big, and it still works in our environment, which is unfortunately
quite complex. I can also still reliably reproduce the error with this
config. The numbers seem consistently the same (except for the difference
between the first process and the others).

I am not sure if it makes sense for you to recreate the environment we have
this running in; the variables used in the config file are set to the
following values:

BASE_DIR=/etc/sv/ampelmann
HTTP_PORT=80
HTTPS_PORT=443
HEALTH_PORT=8081
LOCAL_FRONTEND_ADDR=/haproxy-frontend-local.sock
SYSLOG_ADDR=/dev/log
STATS_ADDR=/tmp/haproxy.sock
TEST_ADDR=127.0.0.1:9082
HAPROXY_STATS_SOCKET=/tmp/haproxy.sock

but maybe it is easier to reduce the config file even further, maybe even
get rid of the chroot and such. I am also happy to compile with debug
statements in certain places, or whatever else would make this easier. I'll
try to spend some more time understanding your code, then maybe I can be of
more help, but I'm not sure when I'll have the time for that given the
Easter holidays.

Thanks a lot for looking at this,
Conrad
--
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
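Olivier's explanation above (every process keeps every listening socket open so that any one of them can hand the fds over to a new process) comes down to SCM_RIGHTS fd passing over a unix socket. Here is a minimal, self-contained sketch of that mechanism, not haproxy's actual code: a pipe stands in for a listening socket, a socketpair stands in for the stats-socket connection, and Python's `socket.send_fds`/`recv_fds` helpers (3.9+) do the ancillary-data plumbing.

```python
import os
import socket

def transfer_fd_demo():
    # "Old process" and "new process" ends of the stats socket connection.
    old_proc, new_proc = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()  # pretend 'r' is a listening socket fd held by the old process

    # Old process side: ship the fd as SCM_RIGHTS ancillary data.
    socket.send_fds(old_proc, [b"1 fd follows"], [r])

    # New process side: receive the payload plus a kernel-duplicated fd.
    msg, fds, flags, addr = socket.recv_fds(new_proc, 1024, 1)
    received = fds[0]

    # The duplicate refers to the same underlying object: data written to
    # the pipe's write end is readable through the received fd.
    os.write(w, b"hello")
    data = os.read(received, 5)

    for fd in (r, w, received):
        os.close(fd)
    old_proc.close()
    new_proc.close()
    return msg, data

msg, data = transfer_fd_demo()
print(msg, data)  # b'1 fd follows' b'hello'
```

In the real thing the old process sends all of its listening fds at once in response to the new process's `-x <socket>` request, which is why every process has to keep them all open.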
# vim: et ts=2 sw=2 ft=haproxy

global
nbproc 12
maxconn 100000
log ${SYSLOG_ADDR} local1 info
log-tag ampelmann

spread-checks 5

# expose admin socket for each process
stats socket ${STATS_ADDR} level admin process 1
stats socket ${STATS_ADDR}-2 level admin process 2
stats socket ${STATS_ADDR}-3 level admin process 3
stats socket ${STATS_ADDR}-4 level admin process 4
stats socket ${STATS_ADDR}-5 level admin process 5
stats socket ${STATS_ADDR}-6 level admin process 6
stats socket ${STATS_ADDR}-7 level admin process 7
stats socket ${STATS_ADDR}-8 level admin process 8
stats socket ${STATS_ADDR}-9 level admin process 9
stats socket ${STATS_ADDR}-10 level admin process 10
stats socket ${STATS_ADDR}-11 level admin process 11
stats socket ${STATS_ADDR}-12 level admin process 12


user haproxy
group haproxy
chroot ./chroot

defaults
mode http
maxconn 100000

default-server weight 100 inter 30s agent-inter 10s

option dontlognull
option forwardfor except 127.0.0.1 if-none
option redispatch
option abortonclose

timeout http-request 10s
timeout http-keep-alive 120s
timeout queue 10s
timeout connect 5s
timeout client 120s
timeout server 55s
timeout check 5s

http-reuse safe

unique-id-format %{+X}o\ %Ts:%pid:%rc:%rt

errorfile 403 403.http
errorfile 504 504.http
errorfile 503 503.http
errorfile 502 502.http



# The extern proxy performs SSL termination, content compression and common
# request transformations. All requests are then forwarded to the internal
# frontend for logging and routing.
listen public
bind-process 2-32
bind *:${HTTPS_PORT} process 2 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 2
bind *:${HTTPS_PORT} process 3 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 3
bind *:${HTTPS_PORT} process 4 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 4
bind *:${HTTPS_PORT} process 5 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 5
bind *:${HTTPS_PORT} process 6 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 6
bind *:${HTTPS_PORT} process 7 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 7
bind *:${HTTPS_PORT} process 8 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 8
bind *:${HTTPS_PORT} process 9 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 9
bind *:${HTTPS_PORT} process 10 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 10
bind *:${HTTPS_PORT} process 11 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 11
bind *:${HTTPS_PORT} process 12 ssl crt ./wildcard.soundcloud.com.pem crt ./wildcard.sndcdn.com.pem crt ./wildcard.s-cloud.net.pem crt ./exit.sc.pem ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:!CAMELLIA no-sslv3
bind *:${HTTP_PORT} process 12


compression type text/html text/plain text/css text/javascript application/xml application/json application/x-javascript application/javascript application/ecmascript application/rss+xml application/atomsvc+xml application/atom+xml application/msword application/vnd.ms-excel application/vnd.ms-powerpoint
compression algo gzip

unique-id-header X-Request-Id


server http ${LOCAL_FRONTEND_ADDR} send-proxy


# The internal frontend performs the content based routing and logging.
# The redirects happen here so that they get logged (see TRAF-153).
frontend internal
bind-process 1
bind ${BASE_DIR}/chroot${LOCAL_FRONTEND_ADDR} accept-proxy user haproxy group haproxy

log global
option httplog

default_backend deny


backend tarpit
timeout tarpit 30s
reqtarpit .

backend deny
http-request deny if { always_true }

backend test
server 1 ${TEST_ADDR}

## monitor frontend
listen monitor
no log
bind *:${HEALTH_PORT}
monitor-uri /health

## expose stats with base port..port+nbproc
listen stats-0
no log
bind *:5000
bind-process 1
stats enable
stats uri /

listen stats-1
no log
bind *:5001
bind-process 2
stats enable
stats uri /

listen stats-2
no log
bind *:5002
bind-process 3
stats enable
stats uri /

listen stats-3
no log
bind *:5003
bind-process 4
stats enable
stats uri /

listen stats-4
no log
bind *:5004
bind-process 5
stats enable
stats uri /

listen stats-5
no log
bind *:5005
bind-process 6
stats enable
stats uri /

listen stats-6
no log
bind *:5006
bind-process 7
stats enable
stats uri /

listen stats-7
no log
bind *:5007
bind-process 8
stats enable
stats uri /

listen stats-8
no log
bind *:5008
bind-process 9
stats enable
stats uri /

listen stats-9
no log
bind *:5009
bind-process 10
stats enable
stats uri /

listen stats-10
no log
bind *:5010
bind-process 11
stats enable
stats uri /

listen stats-11
no log
bind *:5011
bind-process 12
stats enable
stats uri /
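For what it's worth, the "expecting 48" figure in the warning above lines up with a straight tally of the bind points in this config, assuming one fd per `bind` line and per `stats socket` line (an assumption; port ranges or other multi-fd binds would change it):

```python
# Rough fd tally for the config above, one fd per bind point.
stats_sockets = 12     # stats socket ${STATS_ADDR}[-N], processes 1..12
public_binds = 11 * 2  # listen public: one HTTPS + one HTTP bind for each of processes 2..12
internal_binds = 1     # frontend internal: one unix socket
monitor_binds = 1      # listen monitor: *:${HEALTH_PORT}
stats_frontends = 12   # listen stats-0 .. stats-11: *:5000 .. *:5011

total = stats_sockets + public_binds + internal_binds + monitor_binds + stats_frontends
print(total)  # 48, matching "expecting 48" in the warning
```

The shortfall ("got 37" / "got 17") would then be the listeners that the old process declined to offer, which is the behavior the per-bind `process` directive discussion below is about.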
Olivier Houchard
Re: [RFC][PATCHES] seamless reload
April 13, 2017 04:00PM
On Thu, Apr 13, 2017 at 03:06:47PM +0200, Conrad Hoffmann wrote:
> <removed quoted text for brevity>


Ah! Thanks to your help, I think I got it (well, really Willy got it, but
let's just pretend it was me).
The attached patch should hopefully fix that, so that you can uncover yet
another issue :).

Thanks again!

Olivier
From ea58ec0a314d8974680a12341e160b6fbceb7e8c Mon Sep 17 00:00:00 2001
From: Olivier Houchard <[email protected]>
Date: Thu, 13 Apr 2017 15:44:48 +0200
Subject: [PATCH 11/11] MINOR: listener: Don't close sockets with the "process"
directive.

Binding to a process can be done either proxy-wide or listener-wide.
Previously, the seamless reload code only handled the per-proxy
bind-process directive. Introduce a new "LI_ZOMBIE" listener state, and
use it for per-bind "process" directives.
---
 include/types/listener.h |  1 +
 src/cli.c                |  9 +++++----
 src/haproxy.c            |  2 +-
 src/listener.c           | 11 ++++++++---
 src/proxy.c              | 12 ++++++------
 5 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/include/types/listener.h b/include/types/listener.h
index 227cc28..2b8f5fe 100644
--- a/include/types/listener.h
+++ b/include/types/listener.h
@@ -47,6 +47,7 @@ enum li_state {
LI_INIT, /* all parameters filled in, but not assigned yet */
LI_ASSIGNED, /* assigned to the protocol, but not listening yet */
LI_PAUSED, /* listener was paused, it's bound but not listening */
+ LI_ZOMBIE, /* The listener doesn't belong to the process, but is kept opened */
LI_LISTEN, /* started, listening but not enabled */
LI_READY, /* started, listening and enabled */
LI_FULL, /* reached its connection limit */
diff --git a/src/cli.c b/src/cli.c
index 533f792..55baee3 100644
--- a/src/cli.c
+++ b/src/cli.c
@@ -1065,10 +1065,11 @@ static int _getsocks(char **args, struct appctx *appctx, void *private)
struct listener *l;

list_for_each_entry(l, &px->conf.listeners, by_fe) {
- /* Only transfer IPv4/IPv6 sockets */
- if (l->proto->sock_family == AF_INET ||
+ /* Only transfer IPv4/IPv6/UNIX sockets */
+ if (l->state >= LI_ZOMBIE &&
+ (l->proto->sock_family == AF_INET ||
l->proto->sock_family == AF_INET6 ||
- l->proto->sock_family == AF_UNIX)
+ l->proto->sock_family == AF_UNIX))
tot_fd_nb++;
}
px = px->next;
@@ -1119,7 +1120,7 @@ static int _getsocks(char **args, struct appctx *appctx, void *private)
list_for_each_entry(l, &px->conf.listeners, by_fe) {
int ret;
/* Only transfer IPv4/IPv6 sockets */
- if (l->state >= LI_LISTEN &&
+ if (l->state >= LI_ZOMBIE &&
(l->proto->sock_family == AF_INET ||
l->proto->sock_family == AF_INET6 ||
l->proto->sock_family == AF_UNIX)) {
diff --git a/src/haproxy.c b/src/haproxy.c
index 54a457f..2b1db00 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -1676,7 +1676,7 @@ void deinit(void)
* because they still hold an opened fd.
* Close it and give the listener its real state.
*/
- if (p->state == PR_STSTOPPED && l->state >= LI_LISTEN) {
+ if (p->state == PR_STSTOPPED && l->state >= LI_ZOMBIE) {
close(l->fd);
l->state = LI_INIT;
}
diff --git a/src/listener.c b/src/listener.c
index a38d05d..a99e4c0 100644
--- a/src/listener.c
+++ b/src/listener.c
@@ -59,7 +59,12 @@ void enable_listener(struct listener *listener)
/* we don't want to enable this listener and don't
* want any fd event to reach it.
*/
- unbind_listener(listener);
+ if (!(global.tune.options & GTUNE_SOCKET_TRANSFER))
+ unbind_listener(listener);
+ else {
+ unbind_listener_no_close(listener);
+ listener->state = LI_LISTEN;
+ }
}
else if (listener->nbconn < listener->maxconn) {
fd_want_recv(listener->fd);
@@ -95,7 +100,7 @@ void disable_listener(struct listener *listener)
*/
int pause_listener(struct listener *l)
{
- if (l->state <= LI_PAUSED)
+ if (l->state <= LI_ZOMBIE)
return 1;

if (l->proto->pause) {
@@ -149,7 +154,7 @@ int resume_listener(struct listener *l)
return 0;
}

- if (l->state < LI_PAUSED)
+ if (l->state < LI_PAUSED || l->state == LI_ZOMBIE)
return 0;

if (l->proto->sock_prot == IPPROTO_TCP &&
diff --git a/src/proxy.c b/src/proxy.c
index 9e3f901..dc70213 100644
--- a/src/proxy.c
+++ b/src/proxy.c
@@ -995,10 +995,10 @@ void soft_stop(void)
if (p->state == PR_STSTOPPED &&
!LIST_ISEMPTY(&p->conf.listeners) &&
LIST_ELEM(p->conf.listeners.n,
- struct listener *, by_fe)->state >= LI_LISTEN) {
+ struct listener *, by_fe)->state >= LI_ZOMBIE) {
struct listener *l;
list_for_each_entry(l, &p->conf.listeners, by_fe) {
- if (l->state >= LI_LISTEN)
+ if (l->state >= LI_ZOMBIE)
close(l->fd);
l->state = LI_INIT;
}
@@ -1082,13 +1082,13 @@ void zombify_proxy(struct proxy *p)
listeners--;
jobs--;
}
- if (!first_to_listen && l->state >= LI_LISTEN)
- first_to_listen = l;
/*
* Pretend we're still up and running so that the fd
* will be sent if asked.
*/
- l->state = oldstate;
+ l->state = LI_ZOMBIE;
+ if (!first_to_listen && oldstate >= LI_LISTEN)
+ first_to_listen = l;
}
/* Quick hack : at stop time, to know we have to close the sockets
* despite the proxy being marked as stopped, make the first listener
@@ -1096,7 +1096,7 @@ void zombify_proxy(struct proxy *p)
* parse the whole list to be sure.
*/
if (first_to_listen && LIST_ELEM(p->conf.listeners.n,
- struct listener *, by_fe)) {
+ struct listener *, by_fe) != first_to_listen) {
LIST_DEL(&l->by_fe);
LIST_ADD(&p->conf.listeners, &l->by_fe);
}
--
2.9.3
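The crux of the patch is where LI_ZOMBIE sits in the state ordering: checks like `l->state >= LI_LISTEN` used to skip listeners that were unbound by a per-bind `process` directive, so their fds were never offered to the new process. A toy model of the patched enum ordering (mirroring the hunk in `include/types/listener.h`, but not haproxy code itself):

```python
from enum import IntEnum

# Toy model of the patched listener state enum: LI_ZOMBIE slots in between
# LI_PAUSED and LI_LISTEN, so "state >= LI_ZOMBIE" covers zombie listeners
# (kept open only so their fd can be handed to the next process), while the
# old "state >= LI_LISTEN" test does not.
class LiState(IntEnum):
    LI_INIT = 0      # all parameters filled in, but not assigned yet
    LI_ASSIGNED = 1  # assigned to the protocol, but not listening yet
    LI_PAUSED = 2    # bound but not listening
    LI_ZOMBIE = 3    # not owned by this process, fd kept open for transfer
    LI_LISTEN = 4    # started, listening but not enabled
    LI_READY = 5     # started, listening and enabled

def transferable(state: LiState) -> bool:
    # The patched _getsocks() test: offer the fd if state >= LI_ZOMBIE.
    return state >= LiState.LI_ZOMBIE

print(transferable(LiState.LI_ZOMBIE))  # True  (the fix: zombies are offered)
print(transferable(LiState.LI_READY))   # True
print(transferable(LiState.LI_PAUSED))  # False
```

With that ordering, the 11 SSL-only listeners held (but not served) by process 1, and vice versa, are counted and transferred, which is what the "expecting 48 got 37" mismatch was missing.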
Conrad Hoffmann
Re: [RFC][PATCHES] seamless reload
April 13, 2017 05:10PM
On 04/13/2017 03:50 PM, Olivier Houchard wrote:
> On Thu, Apr 13, 2017 at 03:06:47PM +0200, Conrad Hoffmann wrote:
>>
>>
>> On 04/13/2017 02:28 PM, Olivier Houchard wrote:
>>> On Thu, Apr 13, 2017 at 12:59:38PM +0200, Conrad Hoffmann wrote:
>>>> On 04/13/2017 11:31 AM, Olivier Houchard wrote:
>>>>> On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
>>>>>> Hi Olivier,
>>>>>>
>>>>>> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
>>>>>>> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
>>>>>>>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
>>>>>>>>> Hi again,
>>>>>>>>>
>>>>>>>>> so I tried to get this to work, but didn't manage yet. I also don't quite
>>>>>>>>> understand how this is supposed to work. The first haproxy process is
>>>>>>>>> started _without_ the -x option, is that correct? Where does that instance
>>>>>>>>> ever create the socket for transfer to later instances?
>>>>>>>>>
>>>>>>>>> I have it working now insofar that on reload, subsequent instances are
>>>>>>>>> spawned with the -x option, but they'll just complain that they can't get
>>>>>>>>> anything from the unix socket (because, for all I can tell, it's not
>>>>>>>>> there?). I also can't see the relevant code path where this socket gets
>>>>>>>>> created, but I didn't have time to read all of it yet.
>>>>>>>>>
>>>>>>>>> Am I doing something wrong? Did anyone get this to work with the
>>>>>>>>> systemd-wrapper so far?
>>>>>>>>>
>>>>>>>>> Also, but this might be a coincidence, my test setup takes a huge
>>>>>>>>> performance penalty just by applying your patches (without any reloading
>>>>>>>>> whatsoever). Did this happen to anybody else? I'll send some numbers and
>>>>>>>>> more details tomorrow.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Ok I can confirm the performance issues, I'm investigating.
>>>>>>>>
>>>>>>>
>>>>>>> Found it, I was messing with SO_LINGER when I shouldn't have been.
>>>>>>
>>>>>> <removed code for brevity>
>>>>>>
>>>>>> thanks a lot, I can confirm that the performance regression seems to be gone!
>>>>>>
>>>>>> I am still having the other (conceptual) problem, though. Sorry if this is
>>>>>> just me holding it wrong or something, it's been a while since I dug
>>>>>> through the internals of haproxy.
>>>>>>
>>>>>> So, as I mentioned before, we use nbproc (12) and the systemd-wrapper,
>>>>>> which in turn starts haproxy in daemon mode, giving us a process tree like
>>>>>> this (path and file names shortened for brevity):
>>>>>>
>>>>>> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
>>>>>> \_ /u/s/haproxy-master
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>>
>>>>>> Now, in our config file, we have something like this:
>>>>>>
>>>>>> # expose admin socket for each process
>>>>>> stats socket ${STATS_ADDR} level admin process 1
>>>>>> stats socket ${STATS_ADDR}-2 level admin process 2
>>>>>> stats socket ${STATS_ADDR}-3 level admin process 3
>>>>>> stats socket ${STATS_ADDR}-4 level admin process 4
>>>>>> stats socket ${STATS_ADDR}-5 level admin process 5
>>>>>> stats socket ${STATS_ADDR}-6 level admin process 6
>>>>>> stats socket ${STATS_ADDR}-7 level admin process 7
>>>>>> stats socket ${STATS_ADDR}-8 level admin process 8
>>>>>> stats socket ${STATS_ADDR}-9 level admin process 9
>>>>>> stats socket ${STATS_ADDR}-10 level admin process 10
>>>>>> stats socket ${STATS_ADDR}-11 level admin process 11
>>>>>> stats socket ${STATS_ADDR}-12 level admin process 12
>>>>>>
>>>>>> Basically, we have a dedicate admin socket for each ("real") process, as we
>>>>>> need to be able to talk to each process individually. So I was wondering:
>>>>>> which admin socket should I pass as HAPROXY_STATS_SOCKET? I initially
>>>>>> thought it would have to be a special stats socket in the haproxy-master
>>>>>> process (which we currently don't have), but as far as I can tell from the
>>>>>> output of `lsof` the haproxy-master process doesn't even hold any FDs
>>>>>> anymore. Will this setup currently work with your patches at all? Do I need
>>>>>> to add a stats socket to the master process? Or would this require a list
>>>>>> of stats sockets to be passed, similar to the list of PIDs that gets passed
>>>>>> to new haproxy instances, so that each process can talk to the one from
>>>>>> which it is taking over the socket(s)? In case I need a stats socket for
>>>>>> the master process, what would be the directive to create it?
>>>>>>
>>>>>
>>>>> Hi Conrad,
>>>>>
>>>>> Any of those sockets will do. Each process is made to keep all the
>>>>> listening sockets open, even if the proxy is not bound to that specific
>>>>> process, precisely so that they can be transferred via the unix socket.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Olivier
>>>>
>>>>
>>>> Thanks, I am finally starting to understand, but I think there still might
>>>> be a problem. I didn't see it initially, but when I use one of the
>>>> processes' existing admin sockets it still fails, with the following messages:
>>>>
>>>> 2017-04-13_10:27:46.95005 [WARNING] 102/102746 (14101) : We didn't get the
>>>> expected number of sockets (expecting 48 got 37)
>>>> 2017-04-13_10:27:46.95007 [ALERT] 102/102746 (14101) : Failed to get the
>>>> sockets from the old process!
>>>>
>>>> I have a suspicion about the possible reason. We have a two-tier setup, as
>>>> is often recommended here on the mailing list: 11 processes do (almost)
>>>> only SSL termination, then pass to a single process that does most of the
>>>> heavy lifting. These processes use different sockets, of course (we use
>>>> `bind-process 1` and `bind-process 2-X` in frontends). The message above is
>>>> from the first process, which is the non-SSL one. When using an admin
>>>> socket from any of the other processes, the message changes to "(expecting
>>>> 48 got 17)".
>>>>
>>>> I assume the patches are incompatible with such a setup at the moment?
>>>>
>>>> Thanks once more :)
>>>> Conrad
>>>
>>> Hmm that should not happen, and I can't seem to reproduce it.
>>> Can you share the haproxy config file you're using? Is the number of sockets
>>> received always the same? How are you generating your load? Is it happening
>>> on each reload?
>>>
>>> Thanks a lot for going through this, this is really appreciated :)
>>
>> I am grateful myself that you're helping me through this :)
>>
>> So I removed all the logic and backends from our config file; it's still
>> quite big and it still works in our environment, which is unfortunately
>> quite complex. I can also still reliably reproduce the error with this
>> config. The numbers seem consistently the same (except for the difference
>> between the first process and the others).
>>
>> I am not sure if it makes sense for you to recreate the environment we have
>> this running in; the variables used in the config file are set to the
>> following values:
>>
>
>
> Ah! Thanks to your help, I think I got it (well, really Willy got it, but
> let's just pretend it's me).
> The attached patch should hopefully fix that, so that you can uncover yet
> another issue :).

Sure, here it is ;P

I now get a segfault (on reload):

*** Error in `/usr/sbin/haproxy': corrupted double-linked list:
0x0000000005b511e0 ***

Here is the backtrace, retrieved from the core file:

(gdb) bt
#0 0x00007f4c92801067 in __GI_raise ([email protected]=6) at
.../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f4c92802448 in __GI_abort () at abort.c:89
#2 0x00007f4c9283f1b4 in __libc_message ([email protected]=1,
[email protected]=0x7f4c92934210 "*** Error in `%s': %s: 0x%s ***\n") at
.../sysdeps/posix/libc_fatal.c:175
#3 0x00007f4c9284498e in malloc_printerr (action=1, str=0x7f4c929302ec
"corrupted double-linked list", ptr=<optimized out>) at malloc.c:4996
#4 0x00007f4c92845923 in _int_free (av=0x7f4c92b71620 <main_arena>,
p=<optimized out>, have_lock=0) at malloc.c:3996
#5 0x0000000000485850 in tcp_find_compatible_fd (l=0xaaed20) at
src/proto_tcp.c:812
#6 tcp_bind_listener (listener=0xaaed20, errmsg=0x7ffccc774e10 "",
errlen=100) at src/proto_tcp.c:878
#7 0x0000000000493ce1 in start_proxies (verbose=0) at src/proxy.c:793
#8 0x00000000004091ec in main (argc=21, argv=0x7ffccc775168) at
src/haproxy.c:1942

I can send you the entire core file if that makes any sense? Should I send
the executable along, so that the symbols match? The source revision is
c28bb55cdc554549a59f92997ebe7abf8d4612fe with all your patches applied
(latest ones where fixups were sent).

In case it's relevant, here is the output of `-vv`:

HA-Proxy version 1.8-dev1-c28bb5-5 2017/04/05
Copyright 2000-2017 Willy Tarreau <[email protected]>

Build options :
TARGET = linux26
CPU = generic
CC = gcc
CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_PCRE=1

Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2k 26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k 26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
IP_FREEBIND
Built with network namespace support.
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported : identity("identity"),
deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Encrypted password support via crypt(3): yes
Built with PCRE version : 8.35 2014-04-04
Running on PCRE version : 8.35 2014-04-04
PCRE library supports JIT : no (USE_PCRE_JIT not set)

Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace


Thanks again and keep it up, I feel we are almost there :)
Conrad
--
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
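[Editor's note: for readers following along, the mechanism discussed above boils down to the new process fetching the listening FDs from the old one over an admin socket before binding. A rough sketch of how that is driven, with illustrative paths (the thread's own truncated paths are kept out of it), might look like this:]

```shell
# With these patches, the systemd wrapper reads HAPROXY_STATS_SOCKET and
# appends "-x <socket>" on reload. Per Olivier's answer above, any of the
# per-process admin sockets will do, since every process keeps all
# listening sockets open.
export HAPROXY_STATS_SOCKET=/var/run/hap.sock

# Manual equivalent of a seamless reload: the new process asks the old
# one for its listening sockets (-x), then signals it to finish its
# current connections and exit (-sf).
/usr/sbin/haproxy -f ./hap.cfg -p /var/run/hap.pid -Ds \
    -x /var/run/hap.sock -sf $(cat /var/run/hap.pid)
```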
Olivier Houchard
Re: [RFC][PATCHES] seamless reload
April 13, 2017 05:20PM
On Thu, Apr 13, 2017 at 04:59:26PM +0200, Conrad Hoffmann wrote:
> Sure, here it is ;P
>
> I now get a segfault (on reload):
>
> *** Error in `/usr/sbin/haproxy': corrupted double-linked list:
> 0x0000000005b511e0 ***
>
> Here is the backtrace, retrieved from the core file:
>
> (gdb) bt
> #0 0x00007f4c92801067 in __GI_raise ([email protected]=6) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1 0x00007f4c92802448 in __GI_abort () at abort.c:89
> #2 0x00007f4c9283f1b4 in __libc_message ([email protected]=1,
> [email protected]=0x7f4c92934210 "*** Error in `%s': %s: 0x%s ***\n") at
> ../sysdeps/posix/libc_fatal.c:175
> #3 0x00007f4c9284498e in malloc_printerr (action=1, str=0x7f4c929302ec
> "corrupted double-linked list", ptr=<optimized out>) at malloc.c:4996
> #4 0x00007f4c92845923 in _int_free (av=0x7f4c92b71620 <main_arena>,
> p=<optimized out>, have_lock=0) at malloc.c:3996
> #5 0x0000000000485850 in tcp_find_compatible_fd (l=0xaaed20) at
> src/proto_tcp.c:812
> #6 tcp_bind_listener (listener=0xaaed20, errmsg=0x7ffccc774e10 "",
> errlen=100) at src/proto_tcp.c:878
> #7 0x0000000000493ce1 in start_proxies (verbose=0) at src/proxy.c:793
> #8 0x00000000004091ec in main (argc=21, argv=0x7ffccc775168) at
> src/haproxy.c:1942

Ok, yet another stupid mistake, hopefully the attached patch fixes this :)

Thanks !

Olivier
From 7c7fe0c00129d60617cba786cbec7bbdd9ce08f8 Mon Sep 17 00:00:00 2001
From: Olivier Houchard <[email protected]>
Date: Thu, 13 Apr 2017 17:06:53 +0200
Subject: [PATCH 12/12] BUG/MINOR: Properly remove the xfer_sock from the
linked list.

Doubly linked lists are hard to get right.
---
src/proto_tcp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/proto_tcp.c b/src/proto_tcp.c
index f558f00..57d6fc1 100644
--- a/src/proto_tcp.c
+++ b/src/proto_tcp.c
@@ -806,7 +806,7 @@ static int tcp_find_compatible_fd(struct listener *l)
if (xfer_sock->prev)
xfer_sock->prev->next = xfer_sock->next;
if (xfer_sock->next)
- xfer_sock->next->prev = xfer_sock->next->prev;
+ xfer_sock->next->prev = xfer_sock->prev;
free(xfer_sock->iface);
free(xfer_sock->namespace);
free(xfer_sock);
--
2.9.3
Conrad Hoffmann
Re: [RFC][PATCHES] seamless reload
April 13, 2017 06:10PM
On 04/13/2017 05:10 PM, Olivier Houchard wrote:
> On Thu, Apr 13, 2017 at 04:59:26PM +0200, Conrad Hoffmann wrote:
>> Sure, here it is ;P
>>
>> I now get a segfault (on reload):
>>
>> <snip>error and backtrace</snip>
>
> Ok, yet another stupid mistake, hopefully the attached patch fixes this :)
>
> Thanks !
>
> Olivier


It does indeed! Not only does it work now, the result is impressive! Not a
single dropped request even when aggressively reloading under substantial load!

You certainly gave me an unexpected early easter present here :)

I will now head out, but I am planning on installing a 1.8 version with
your patches on our canary pool (which receives a small amount of
production traffic to test changes) after the holidays. I will happily test
anything else that might be helpful for you. I will also set up a proper
load test inside our data center then, but I expect no surprises there (my
current tests were done over a VPN link, somewhat limiting the achievable
throughput).

Once more, thank you so much! This will greatly simplify much of our
operations. If there is anything else we can help test, let me know :)

Cheers,
Conrad
--
Conrad Hoffmann
Olivier Houchard
Re: [RFC][PATCHES] seamless reload
April 13, 2017 06:30PM
On Thu, Apr 13, 2017 at 06:00:59PM +0200, Conrad Hoffmann wrote:
> On 04/13/2017 05:10 PM, Olivier Houchard wrote:
> > On Thu, Apr 13, 2017 at 04:59:26PM +0200, Conrad Hoffmann wrote:
> >> Sure, here it is ;P
> >>
> >> I now get a segfault (on reload):
> >>
> >> <snip>error and backtrace</snip>
> >
> > Ok, yet another stupid mistake, hopefully the attached patch fixes this :)
> >
> > Thanks !
> >
> > Olivier
>
>
> It does indeed! Not only does it work now, the result is impressive! Not a
> single dropped request even when aggressively reloading under substantial load!
>
> You certainly gave me an unexpected early easter present here :)
>
> I will now head out, but I am planning on installing a 1.8 version with
> your patches on our canary pool (which receives a small amount of
> production traffic to test changes) after the holidays. I will happily test
> anything else that might be helpful for you. I will also set up a proper
> load test inside our data center then, but I expect no surprises there (my
> current tests were done over a VPN link, somewhat limiting the achievable
> throughput).
>
> Once more, thank you so much! This will greatly simplify much of our
> operations. If there is anything else we can help test, let me know :)

Phew, at last :) Thanks a lot for your patience, and for doing all that testing!

Olivier
Willy Tarreau
Re: [RFC][PATCHES] seamless reload
April 14, 2017 11:30AM
On Thu, Apr 13, 2017 at 06:18:59PM +0200, Olivier Houchard wrote:
> > Once more, thank you so much! This will greatly simplify much of our
> > operations. If there is anything else we can help test, let me know :)
>
> Phew, at last :) Thanks a lot for your patience, and for doing all that testing!

Yep thanks for the testing and feedback, that's much appreciated. I was
also impressed by the efficiency of this change. We've been talking about
it for years, having had a few attempts at it in the past, and this one
turned out to be the right one.

I've merged the patches already.

Great job, thanks guys!
Willy
Pavlos Parissis
Re: [RFC][PATCHES] seamless reload
April 19, 2017 10:10AM
On 13/04/2017 06:18 PM, Olivier Houchard wrote:
> On Thu, Apr 13, 2017 at 06:00:59PM +0200, Conrad Hoffmann wrote:
>> <snip>quoted thread</snip>


Joining this again a bit late, do you still want me to test it?
I would like to know if there is any performance impact, but I see that
Conrad Hoffmann has done all the hard work on this. So we can conclude that
no performance impact is expected.

Once again thanks a lot for your hard work on this.
@Conrad Hoffmann, thanks a lot for testing this and checking if there is any
performance impact.

Cheers,
Pavlos
Olivier Houchard
Re: [RFC][PATCHES] seamless reload
April 19, 2017 11:30AM
On Wed, Apr 19, 2017 at 09:58:27AM +0200, Pavlos Parissis wrote:
> <snip>quoted thread</snip>
>
> Joining this again a bit late, do you still want me to test it?
> I would like to know if there is any performance impact, but I see that
> Conrad Hoffmann has done all the hard work on this. So we can conclude that
> no performance impact is expected.
>
> Once again thanks a lot for your hard work on this.
> @Conrad Hoffmann, thanks a lot for testing this and checking if there is any
> performance impact.
>


Hi Pavlos,

More testing is always appreciated :)
I don't expect any performance impact, but that wouldn't be the first time
I've said that and been proven wrong. Though, as you said, with all the
testing Conrad did, I'm fairly confident it should be OK.

Thanks !

Olivier
Conrad Hoffmann
Re: [RFC][PATCHES] seamless reload
April 19, 2017 02:10PM
On 04/19/2017 11:22 AM, Olivier Houchard wrote:
<snip>very long conversation</snip>
>> Joining this again a bit late, do you still want me to test it?
>> I would like to know if there are any performance impact, but I see that
>> Conrad Hoffmann has done all the hard work on this. So, we can conclude that we
>> don't expect any performance impact.
>>
>> Once again thanks a lot for your hard work on this.
>> @Conrad Hoffmann, thanks a lot for testing this and checking if there is any
>> performance impact.
>>
>
>
> Hi Pavlos,
>
> More testing is always appreciated :)
> I don't expect any performance impact, but that wouldn't be the first time
> I say that and am proven wrong, though as you said with all the testing
> Conrad did, I'm fairly confident it should be OK.
>
> Thanks !
>
> Olivier

I also think more testing is always very welcome, especially as there are
so many different configurations; certain code paths might, for example,
never be executed with the configuration we are running here!

Cheers,
Conrad
--
Conrad Hoffmann
Pierre Cheynier
RE: [RFC][PATCHES] seamless reload
May 04, 2017 12:10PM
Hi Olivier,

Many thanks for that! As you know, we are very interested in this topic.
We'll test your patches soon for sure.

Pierre
Olivier Houchard
Re: [RFC][PATCHES] seamless reload
May 04, 2017 01:20PM
On Thu, May 04, 2017 at 10:03:07AM +0000, Pierre Cheynier wrote:
> Hi Olivier,
>
> Many thanks for that! As you know, we are very interested in this topic.
> We'll test your patches soon for sure.
>
> Pierre

Hi Pierre :)

Thanks ! I'm very interested in knowing how well it works for you.
Maybe we can talk about that around a beer sometime.

Olivier
Pavlos Parissis
Re: [RFC][PATCHES] seamless reload
May 06, 2017 11:20PM
On 04/05/2017 01:16 PM, Olivier Houchard wrote:
> On Thu, May 04, 2017 at 10:03:07AM +0000, Pierre Cheynier wrote:
>> Hi Olivier,
>>
>> Many thanks for that! As you know, we are very interested in this topic.
>> We'll test your patches soon for sure.
>>
>> Pierre
>
> Hi Pierre :)
>
> Thanks ! I'm very interested in knowing how well it works for you.
> Maybe we can talk about that around a beer sometime.
>
> Olivier
>

Hi,

I finally managed to find time to perform some testing.

Firstly, let me explain the environment.

The server and the traffic generator are different machines (bare metal) with
the same spec; network interrupts are pinned to all CPUs and the irqbalance
daemon is disabled.
Both nodes have 10GbE network interfaces.

I compared HAPEE with HAProxy using the following versions:

### HAProxy
The git SHA isn't mentioned in the output because I created the tarball
with:

git archive --format=tar --prefix="haproxy-1.8.0/" HEAD | gzip -9 >
haproxy-1.8.0.tar.gz

as I had to build the RPM from a tarball, but I used the latest haproxy at
commit f494977bc1a361c26f8cc0516366ef2662ac9502.

/usr/sbin/haproxy -vv
HA-Proxy version 1.8-dev1 2017/04/03
Copyright 2000-2017 Willy Tarreau <[email protected]>

Build options :
TARGET = linux2628
CPU = generic
CC = gcc
CFLAGS = -DMAX_HOSTNAME_LEN=42
OPTIONS = USE_LINUX_TPROXY=1 USE_CPU_AFFINITY=1 USE_REGPARM=1 USE_OPENSSL=1
USE_PCRE=1 USE_PCRE_JIT=1

Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
IP_FREEBIND
Built with network namespace support.
Built without compression support (neither USE_ZLIB nor USE_SLZ are set).
Compression algorithms supported : identity("identity")
Encrypted password support via crypt(3): yes
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : yes

Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace

### HAPEE version
/opt/hapee-1.7/sbin/hapee-lb -vv
HA-Proxy version 1.7.0-1.0.0-163.180 2017/04/10
Copyright 2000-2016 Willy Tarreau <[email protected]>

Build options :
TARGET = linux2628
CPU = generic
CC = gcc
CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
-DMAX_SESS_STKCTR=10 -DSTKTABLE_EXTRA_DATA_TYPES=10
OPTIONS = USE_MODULES=1 USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_SLZ=1
USE_CPU_AFFINITY=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE= USE_PCRE_JIT=1
USE_NS=1

Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"),
raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : yes
Built with Lua version : Lua 5.3.3
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
IP_FREEBIND
Built with network namespace support

Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
[COMP] compression
[TRACE] trace
[SPOE] spoe


The configuration is the same for both and is attached. As you can see, I use
nbproc > 1 and each process is pinned to a different CPU. We have 12 real CPUs,
as Intel hyper-threading is disabled, but we only use 10 of them for haproxy;
the remaining two are left for other daemons to use.

I experimented with the wrk2 and httpress stress tools and decided to use wrk2
for these tests. I didn't want to use inject and the other tools provided with
haproxy, as I believe using different clients gives a higher chance of spotting
problems.

In my tests I see that wrk2 reports more read errors with HAProxy (3890) than
with HAPEE (36). I don't know the meaning of the read error; it could be some
stupidity in the code of wrk2. I am saying this because two years ago we spent
four weeks stress testing HAPEE and found out that all open source HTTP stress
tools suck, and some of the errors they report are client errors rather than
server errors. But in this case wrk2 was always reporting more read errors with
HAProxy.

Below is the report; I have run the same tests 3-4 times.
Another thing I would like to test is possible performance degradation, but
that requires building a proper stress environment and I don't have the time
to do it right now.

### HAPEE without reload

wrk2 -c 12000 -d 20s -t 12 -R 80000 http://10.6.213.3/
Running 20s test @ http://10.6.213.3/
12 threads and 12000 connections
Thread calibration: mean lat.: 1.966ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.012ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.096ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.435ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 1.985ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.506ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.047ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.058ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 1.980ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 1.927ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 1.957ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.195ms, rate sampling interval: 10ms
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.28ms 2.94ms 89.47ms 95.98%
Req/Sec 7.06k 2.48k 78.67k 88.73%
1403057 requests in 19.99s, 305.08MB read
Requests/sec: 70187.86
Transfer/sec: 15.26MB

### HAPEE with reload
while (true); do systemctl reload hapee-1.7-lb.service;sleep 1;done


wrk2 -c 12000 -d 20s -t 12 -R 80000 http://10.6.213.3/
Running 20s test @ http://10.6.213.3/
12 threads and 12000 connections
Thread calibration: mean lat.: 2.734ms, rate sampling interval: 11ms
Thread calibration: mean lat.: 2.124ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.034ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.210ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.025ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.165ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.055ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.112ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 3.358ms, rate sampling interval: 16ms
Thread calibration: mean lat.: 2.211ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.157ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.217ms, rate sampling interval: 10ms
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.34ms 1.96ms 31.70ms 93.16%
Req/Sec 7.06k 2.15k 28.10k 85.06%
1402923 requests in 19.98s, 308.61MB read
Socket errors: connect 0, read 36, write 0, timeout 0
Requests/sec: 70204.08
Transfer/sec: 15.44MB

### HAProxy without reload

wrk2 -c 12000 -d 20s -t 12 -R 80000 http://10.6.213.3/
Running 20s test @ http://10.6.213.3/
12 threads and 12000 connections
Thread calibration: mean lat.: 2.050ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 1.958ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.070ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.079ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 3.192ms, rate sampling interval: 15ms
Thread calibration: mean lat.: 2.011ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.103ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 1.974ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.059ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.478ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.032ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 3.027ms, rate sampling interval: 14ms
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.31ms 1.95ms 33.50ms 92.14%
Req/Sec 7.05k 2.51k 31.30k 86.44%
1401915 requests in 19.98s, 304.83MB read
Requests/sec: 70161.32
Transfer/sec: 15.26MB

### HAProxy with reload
while (true); do systemctl reload haproxy.service;sleep 1;done


wrk2 -c 12000 -d 20s -t 12 -R 80000 http://10.6.213.3/
Running 20s test @ http://10.6.213.3/
12 threads and 12000 connections
Thread calibration: mean lat.: 2.135ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 3.418ms, rate sampling interval: 16ms
Thread calibration: mean lat.: 2.166ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.283ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.057ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.164ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 3.200ms, rate sampling interval: 14ms
Thread calibration: mean lat.: 2.232ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.206ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.212ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 2.154ms, rate sampling interval: 10ms
Thread calibration: mean lat.: 3.431ms, rate sampling interval: 16ms
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.69ms 4.09ms 880.64ms 93.43%
Req/Sec 7.06k 2.50k 27.00k 86.45%
1402222 requests in 19.99s, 308.69MB read
Socket errors: connect 0, read 3890, write 1, timeout 0
Requests/sec: 70147.32
Transfer/sec: 15.44MB

Cheers,
Pavlos

global
nbproc 10
stats socket /run/lb_engine/process-1.sock user lbengine group lbengine mode 660 level admin process 1
stats socket /run/lb_engine/process-2.sock user lbengine group lbengine mode 660 level admin process 2
stats socket /run/lb_engine/process-3.sock user lbengine group lbengine mode 660 level admin process 3
stats socket /run/lb_engine/process-4.sock user lbengine group lbengine mode 660 level admin process 4
stats socket /run/lb_engine/process-5.sock user lbengine group lbengine mode 660 level admin process 5
stats socket /run/lb_engine/process-6.sock user lbengine group lbengine mode 660 level admin process 6
stats socket /run/lb_engine/process-7.sock user lbengine group lbengine mode 660 level admin process 7
stats socket /run/lb_engine/process-8.sock user lbengine group lbengine mode 660 level admin process 8
stats socket /run/lb_engine/process-9.sock user lbengine group lbengine mode 660 level admin process 9
stats socket /run/lb_engine/process-10.sock user lbengine group lbengine mode 660 level admin process 10
cpu-map 1 2
cpu-map 2 3
cpu-map 3 4
cpu-map 4 5
cpu-map 5 6
cpu-map 6 7
cpu-map 7 8
cpu-map 8 9
cpu-map 9 10
cpu-map 10 11
user lbengine
group lbengine
chroot /var/empty
daemon
log 127.0.0.1 len 4096 local2
maxconn 500000
ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:ECDHE-RSA-DES-CBC3-SHA:ECDHE-ECDSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA:!EDH
ssl-default-bind-options no-sslv3 no-tls-tickets
ssl-server-verify none
stats maxconn 100
tune.bufsize 49152
tune.ssl.default-dh-param 1024

defaults
option redispatch
option prefer-last-server
log-format {\"lbgroup\":\""${LBGROUP}"\",\"dst_ip\":\"%fi\",\"dst_port\":\"%fp\",\"client_ip\":\"%ci\",\"client_port\":\"%cp\",\"timestamp\":\"%t\",\"frontend_name\":\"%ft\",\"backend_name\":\"%b\",\"server_name\":\"%s\",\"tq\":\"%Tq\",\"ta\":\"%Ta\",\"td\":\"%Td\",\"th\":\"%Th\",\"ti\":\"%Ti\",\"trf\":\"%TR\",\"tw\":\"%Tw\",\"tc\":\"%Tc\",\"tr\":\"%Tr\",\"tt\":\"%Tt\",\"status_code\":\"%ST\",\"bytes_read\":\"%B\",\"termination_state\":\"%tsc\",\"actconn\":\"%ac\",\"feconn\":\"%fc\",\"beconn\":\"%bc\",\"srv_conn\":\"%sc\",\"retries\":\"%rc\",\"srv_queue\":\"%sq\",\"backend_queue\":\"%bq\",\"toptalkers\":\"%[http_first_req]\",\"vhost\":\"%[capture.req.hdr(0),lower]\",\"ssl_ciphers\":\"%sslc\",\"ssl_version\":\"%sslv\",\"http_method\":\"%HM\",\"http_version\":\"%HV\",\"http_uri\":\"%HP\"}

backlog 65535
balance roundrobin
log global
maxconn 500000
mode http
no option dontlognull
option contstats
option http-keep-alive
option tcp-smart-accept
option tcp-smart-connect
retries 2
timeout check 5s
timeout client 30s
timeout connect 4s
timeout http-request 30s
timeout queue 1m
timeout server 30s

frontend test.com
bind 10.6.213.3:80 process 1
bind 10.6.213.3:80 process 2
bind 10.6.213.3:80 process 3
bind 10.6.213.3:80 process 4
bind 10.6.213.3:80 process 5
bind 10.6.213.3:80 process 6
bind 10.6.213.3:80 process 7
bind 10.6.213.3:80 process 8
bind 10.6.213.3:80 process 9
bind 10.6.213.3:80 process 10

default_backend robot

backend robot
server server1 server1:80 weight 1 check

frontend test-ipv4.foo.com_https_lhr4
bind 5.1.1.8:80 process 1
bind 5.1.1.8:80 process 2
bind 5.1.1.8:80 process 3
bind 5.1.1.8:80 process 4
bind 5.1.1.8:80 process 5
bind 5.1.1.8:80 process 6
bind 5.1.1.8:80 process 7
bind 5.1.1.8:80 process 8
bind 5.1.1.8:80 process 9
bind 5.1.1.8:80 process 10
bind 5.1.1.8:443 process 1 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl
bind 5.1.1.8:443 process 2 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl
bind 5.1.1.8:443 process 3 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl
bind 5.1.1.8:443 process 4 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl
bind 5.1.1.8:443 process 5 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl
bind 5.1.1.8:443 process 6 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl
bind 5.1.1.8:443 process 7 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl
bind 5.1.1.8:443 process 8 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl
bind 5.1.1.8:443 process 9 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl
bind 5.1.1.8:443 process 10 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem crt /etc/ssl/certs/www.foo.com-bundle.pem ssl

mode http
capture request header Host len 48
acl site_dead nbsrv(test-ipv4.foo.com_https_all) lt 0
monitor-uri /site_alive
monitor fail if site_dead
http-request add-header X-Header-Order %[req.hdr_names(:)]
http-request add-header F5SourceIP %[src]
http-request add-header F5Nodename %H
http-request add-header F5-Proto https if { ssl_fc }
http-request add-header F5-Proto http unless { ssl_fc }
http-request add-header F5CipherName %sslc if { ssl_fc }
http-request add-header F5CipherVersion %sslv if { ssl_fc }
http-request add-header F5CipherBits %[ssl_fc_use_keysize] if { ssl_fc }
http-request add-header F5TrackerID %{+X}Ts%{+X}[rand()]
http-response set-header X-XSS-Protection "1; mode=block"

http-request set-var(txn.lb_trace) req.hdr(X-Lb-Trace),lower if { req.hdr(X-Lb-Trace) -m found }
acl x_lb_debug_on var(txn.lb_trace) -m str yes

acl x_lb_header res.hdr(X-Lb) -m found
http-response replace-header X-Lb (^.*$) DLB,\1 if x_lb_header x_lb_debug_on
http-response add-header X-Lb DLB if !x_lb_header x_lb_debug_on

acl x_lb_node_header res.hdr(X-Lb-Node) -m found
http-response replace-header X-Lb-Node (^.*$) %H,\1 if x_lb_node_header x_lb_debug_on
http-response add-header X-Lb-Node %H if !x_lb_node_header x_lb_debug_on


default_backend test-ipv4.foo.com_https_all

frontend www-ipv6.foo.com_https_lhr4
bind 2001:5040:0:f::aaaa:80 process 1
bind 2001:5040:0:f::aaaa:80 process 2
bind 2001:5040:0:f::aaaa:80 process 3
bind 2001:5040:0:f::aaaa:80 process 4
bind 2001:5040:0:f::aaaa:80 process 5
bind 2001:5040:0:f::aaaa:80 process 6
bind 2001:5040:0:f::aaaa:80 process 7
bind 2001:5040:0:f::aaaa:80 process 8
bind 2001:5040:0:f::aaaa:80 process 9
bind 2001:5040:0:f::aaaa:80 process 10
bind 2001:5040:0:f::aaaa:443 process 1 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl
bind 2001:5040:0:f::aaaa:443 process 2 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl
bind 2001:5040:0:f::aaaa:443 process 3 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl
bind 2001:5040:0:f::aaaa:443 process 4 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl
bind 2001:5040:0:f::aaaa:443 process 5 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl
bind 2001:5040:0:f::aaaa:443 process 6 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl
bind 2001:5040:0:f::aaaa:443 process 7 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl
bind 2001:5040:0:f::aaaa:443 process 8 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl
bind 2001:5040:0:f::aaaa:443 process 9 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl
bind 2001:5040:0:f::aaaa:443 process 10 crt /etc/ssl/certs/wildcard.foo.com-bundle.pem ssl

mode http
capture request header Host len 48
acl site_dead nbsrv(www-ipv6.foo.com_http_all) lt 0
monitor-uri /site_alive
monitor fail if site_dead
http-request add-header X-Header-Order %[req.hdr_names(:)]
http-request add-header F5SourceIP %[src]
http-request add-header F5Nodename %H
http-request add-header F5-Proto https if { ssl_fc }
http-request add-header F5-Proto http unless { ssl_fc }
http-request add-header F5CipherName %sslc if { ssl_fc }
http-request add-header F5CipherVersion %sslv if { ssl_fc }
http-request add-header F5CipherBits %[ssl_fc_use_keysize] if { ssl_fc }
http-request add-header F5TrackerID %{+X}Ts%{+X}[rand()]
http-response set-header X-XSS-Protection "1; mode=block"

http-request set-var(txn.lb_trace) req.hdr(X-Lb-Trace),lower if { req.hdr(X-Lb-Trace) -m found }
acl x_lb_debug_on var(txn.lb_trace) -m str yes

acl x_lb_header res.hdr(X-Lb) -m found
http-response replace-header X-Lb (^.*$) DLB,\1 if x_lb_header x_lb_debug_on
http-response add-header X-Lb DLB if !x_lb_header x_lb_debug_on

acl x_lb_node_header res.hdr(X-Lb-Node) -m found
http-response replace-header X-Lb-Node (^.*$) %H,\1 if x_lb_node_header x_lb_debug_on
http-response add-header X-Lb-Node %H if !x_lb_node_header x_lb_debug_on

default_backend www-ipv6.foo.com_http_all

frontend bar.foo.com_gui_tcp_lhr4
bind 5.1.1.8:8080 process 1
bind 5.1.1.8:8080 process 2
bind 5.1.1.8:8080 process 3
bind 5.1.1.8:8080 process 4
bind 5.1.1.8:8080 process 5
bind 5.1.1.8:8080 process 6
bind 5.1.1.8:8080 process 7
bind 5.1.1.8:8080 process 8
bind 5.1.1.8:8080 process 9
bind 5.1.1.8:8080 process 10

log-format {\"lbgroup\":\""${LBGROUP}"\",\"dst_ip\":\"%fi\",\"dst_port\":\"%fp\",\"client_ip\":\"%ci\",\"client_port\":\"%cp\",\"timestamp\":\"%t\",\"frontend_name\":\"%ft\",\"backend_name\":\"%b\",\"server_name\":\"%s\",\"tw\":\"%Tw\",\"tc\":\"%Tc\",\"tt\":\"%Tt\",\"bytes_read\":\"%B\",\"termination_state\":\"%tsc\",\"actconn\":\"%ac\",\"feconn\":\"%fc\",\"beconn\":\"%bc\",\"srv_conn\":\"%sc\",\"retries\":\"%rc\",\"srv_queue\":\"%sq\",\"backend_queue\":\"%bq\"}
mode tcp

default_backend bar.foo.com_gui_tcp_all

backend bar.foo.com_gui_tcp_all
mode tcp
default-server inter 2s fall 2 rise 2
no option prefer-last-server
option tcplog
retries 1
timeout check 10s
timeout queue 10m
timeout server 10m

server bar-101foo.com 10.1.2.33:443 weight 1 check
server bar-102foo.com 10.1.181.38:443 weight 1 check
server bar-103foo.com 10.1.207.3:443 weight 1 check
server bar-104foo.com 10.1.213.14:443 weight 1 check
server bar-105foo.com 10.1.181.25:443 weight 1 check
server bar-106foo.com 10.1.206.28:443 weight 1 check
server bar-107foo.com 10.1.210.10:443 weight 1 check
server bar-108foo.com 10.3.147.32:443 weight 1 check
server bar-109foo.com 10.1.29.61:443 weight 1 check
server bar-110foo.com 10.1.29.39:443 weight 1 check
server bar-111foo.com 10.3.147.22:443 weight 1 check
server bar-112foo.com 10.3.162.24:443 weight 1 check
server bar-113foo.com 10.1.29.55:443 weight 1 check
server bar-114foo.com 10.3.162.11:443 weight 1 check
server bar-115foo.com 10.1.33.14:443 weight 1 check
server bar-116foo.com 10.1.145.31:443 weight 1 check
server bar-117foo.com 10.1.70.8:443 weight 1 check
server bar-118foo.com 10.1.69.2:443 weight 1 check
server bar-201.lhr4.prod.foo.com 10.11.11.13:443 weight 1 check
server bar-202.lhr4.prod.foo.com 10.11.3.25:443 weight 1 check
server bar-203.lhr4.prod.foo.com 10.11.2.34:443 weight 1 check
server bar-204.lhr4.prod.foo.com 10.11.193.20:443 weight 1 check
server bar-205.lhr4.prod.foo.com 10.11.194.15:443 weight 1 check
server bar-206.lhr4.prod.foo.com 10.11.178.15:443 weight 1 check
server bar-207.lhr4.prod.foo.com 10.11.11.22:443 weight 1 check
server bar-208.lhr4.prod.foo.com 10.11.2.29:443 disabled weight 1 check
server bar-210.lhr4.prod.foo.com 10.11.217.30:443 weight 1 check
server bar-211.lhr4.prod.foo.com 10.11.14.42:443 weight 1 check
server bar-212.lhr4.prod.foo.com 10.4.100.68:443 weight 1 check
server bar-213.lhr4.prod.foo.com 10.11.28.58:443 weight 1 check
server bar-214.lhr4.prod.foo.com 10.4.94.76:443 weight 1 check
server bar-215.lhr4.prod.foo.com 10.11.24.9:443 weight 1 check
server bar-216.lhr4.prod.foo.com 10.11.24.22:443 weight 1 check

backend test-ipv4.foo.com_http_all
mode http
default-server inter 5s


backend test-ipv4.foo.com_https_all
mode http
default-server inter 5s


backend www-ipv6.foo.com_http_all
mode http
default-server inter 5s

server app1foo.com 10.1.2.1:80 weight 1 check
Pavlos Parissis
Re: [RFC][PATCHES] seamless reload
May 07, 2017 12:10AM
On 06/05/2017 11:15 μμ, Pavlos Parissis wrote:
> On 04/05/2017 01:16 μμ, Olivier Houchard wrote:
>> On Thu, May 04, 2017 at 10:03:07AM +0000, Pierre Cheynier wrote:
>>> Hi Olivier,
>>>
>>> Many thanks for that ! As you know, we are very interested on this topic.
>>> We'll test your patches soon for sure.
>>>
>>> Pierre
>>
>> Hi Pierre :)
>>
>> Thanks ! I'm very interested in knowing how well it works for you.
>> Maybe we can talk about that around a beer sometime.
>>
>> Olivier
>>
>
> Hi,
>
> I finally managed to find time to perform some testing.
>
> Firstly, let me explain the environment.
>
> Server and generator are on different servers (bare metal) with the same spec;
> network interrupts are pinned to all CPUs and the irqbalance daemon is disabled.
> Both nodes have 10GbE network interfaces.
>
> I compared HAPEE with HAProxy using the following versions:
>
> ### HAProxy
> The git SHA isn't mentioned in the output because I created the tarball
> with:
>
> git archive --format=tar --prefix="haproxy-1.8.0/" HEAD | gzip -9 >
> haproxy-1.8.0.tar.gz
>
> as I had to build the rpm using a tar ball, but I used the latest haproxy
> at f494977bc1a361c26f8cc0516366ef2662ac9502 commit.
>
> /usr/sbin/haproxy -vv
> HA-Proxy version 1.8-dev1 2017/04/03
> Copyright 2000-2017 Willy Tarreau <[email protected]>
>
> Build options :
> TARGET = linux2628
> CPU = generic
> CC = gcc
> CFLAGS = -DMAX_HOSTNAME_LEN=42
> OPTIONS = USE_LINUX_TPROXY=1 USE_CPU_AFFINITY=1 USE_REGPARM=1 USE_OPENSSL=1
> USE_PCRE=1 USE_PCRE_JIT=1
>
> Default settings :
> maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>
> Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
> Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
> IP_FREEBIND
> Built with network namespace support.
> Built without compression support (neither USE_ZLIB nor USE_SLZ are set).
> Compression algorithms supported : identity("identity")
> Encrypted password support via crypt(3): yes
> Built with PCRE version : 8.32 2012-11-30
> Running on PCRE version : 8.32 2012-11-30
> PCRE library supports JIT : yes
>
> Available polling systems :
> epoll : pref=300, test result OK
> poll : pref=200, test result OK
> select : pref=150, test result OK
> Total: 3 (3 usable), will use epoll.
>
> Available filters :
> [SPOE] spoe
> [COMP] compression
> [TRACE] trace
>
> ### HAPEE version
> /opt/hapee-1.7/sbin/hapee-lb -vv
> HA-Proxy version 1.7.0-1.0.0-163.180 2017/04/10
> Copyright 2000-2016 Willy Tarreau <[email protected]>
>
> Build options :
> TARGET = linux2628
> CPU = generic
> CC = gcc
> CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
> -DMAX_SESS_STKCTR=10 -DSTKTABLE_EXTRA_DATA_TYPES=10
> OPTIONS = USE_MODULES=1 USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_SLZ=1
> USE_CPU_AFFINITY=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE= USE_PCRE_JIT=1
> USE_NS=1
>
> Default settings :
> maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>
> Encrypted password support via crypt(3): yes
> Built with libslz for stateless compression.
> Compression algorithms supported : identity("identity"), deflate("deflate"),
> raw-deflate("deflate"), gzip("gzip")
> Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
> Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports prefer-server-ciphers : yes
> Built with PCRE version : 8.32 2012-11-30
> Running on PCRE version : 8.32 2012-11-30
> PCRE library supports JIT : yes
> Built with Lua version : Lua 5.3.3
> Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
> IP_FREEBIND
> Built with network namespace support
>
> Available polling systems :
> epoll : pref=300, test result OK
> poll : pref=200, test result OK
> select : pref=150, test result OK
> Total: 3 (3 usable), will use epoll.
>
> Available filters :
> [COMP] compression
> [TRACE] trace
> [SPOE] spoe
>
>
> The configuration is the same and it is attached. As you can see, I use nbproc >1
> and each process is pinned to a different CPU. We have 12 real CPUs, as Intel
> Hyper-Threading is disabled, but we only use 10 CPUs for haproxy; the remaining
> two CPUs are left for other daemons to use.
>
> I experimented with wrk2 and httpress stress tools and decided to use wrk2 for
> these tests. I didn't want to use the inject and other tools provided by haproxy
> as I believe using different clients provides higher chances to spot problems.
>
> In my tests I see that wrk2 reports more read errors with HAProxy (3890) than
> with HAPEE (36). I don't know the meaning of these read errors; it could be some
> stupidity in the code of wrk2. I am saying this because two years ago we spent
> four weeks stress testing HAPEE and found out that all open source HTTP stress
> tools suck, and some of the errors they report are client errors rather than
> server errors. But in this case wrk2 was always reporting higher read errors with HAProxy.
>
> Below is the report and I have run the same tests 3-4 times.
> Another thing I would like to test is any possible performance degradation,
> but this requires building a proper stress environment and I don't have the time
> to do it right now.

Ignore what I wrote, I am an idiot, as I forgot the most important bit of the
test: enabling the seamless reload by supplying the
HAPROXY_STATS_SOCKET environment variable :-(

I added to the systemd override file:
[Service]
Environment=CONFIG="/etc/lb_engine/haproxy.cfg"
"HAPROXY_STATS_SOCKET=/run/lb_engine/process-1.sock"
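For anyone reproducing this setup, here is a sketch of a complete drop-in file; the path and comments are my assumptions, not taken from this thread. With Olivier's patchset, the systemd wrapper reads HAPROXY_STATS_SOCKET and hands the socket to the new processes via -x on reload:

```ini
# Hypothetical drop-in: /etc/systemd/system/haproxy.service.d/override.conf
[Service]
Environment=CONFIG="/etc/lb_engine/haproxy.cfg" "HAPROXY_STATS_SOCKET=/run/lb_engine/process-1.sock"

# With the patched wrapper, a reload then behaves roughly like starting:
#   haproxy -f $CONFIG -x /run/lb_engine/process-1.sock -sf <old pids>
```

Remember to run `systemctl daemon-reload` after editing the drop-in so the new environment takes effect.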

and wrk2 reports ZERO errors, whereas with HAPEE it reports ~49.

I am terribly sorry for this stupid mistake.

But this mistake revealed something interesting: with the latest code we get
more errors during reload.

@Olivier, great work dude. I am waiting for this to be back-ported to HAPEE-1.7r1.

Once again I am sorry for my mistake,
Pavlos
Olivier Houchard
Re: [RFC][PATCHES] seamless reload
May 08, 2017 02:40PM
Hi Pavlos,

On Sun, May 07, 2017 at 12:05:28AM +0200, Pavlos Parissis wrote:
[...]
> Ignore what I wrote, I am an idiot, as I forgot the most important bit of the
> test: enabling the seamless reload by supplying the
> HAPROXY_STATS_SOCKET environment variable :-(
>
> I added to the systemd override file:
> [Service]
> Environment=CONFIG="/etc/lb_engine/haproxy.cfg"
> "HAPROXY_STATS_SOCKET=/run/lb_engine/process-1.sock"
>
> and wrk2 reports ZERO errors, whereas with HAPEE it reports ~49.
>
> I am terribly sorry for this stupid mistake.
>
> But this mistake revealed something interesting: with the latest code we get
> more errors during reload.
>
> @Olivier, great work dude. I am waiting for this to be back-ported to HAPEE-1.7r1.
>
> Once again I am sorry for my mistake,
> Pavlos
>

Thanks a lot for testing !
This is interesting indeed. My patch may make it worse when not passing
fds via the unix socket, as all processes now keep all sockets open, even
the ones they're not using; maybe that makes the window between the last
accept and the close bigger.
If that is so, then the global option "no-unused-socket" should provide
a comparable error rate.
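If I read the patchset right, that comparison would be a one-line config change; this is only a sketch, with the option name as quoted above and its placement in the global section assumed:

```
global
    # opt out of keeping unused inherited sockets open in every process,
    # restoring the pre-patchset close behaviour for the comparison
    no-unused-socket
    stats socket /run/lb_engine/process-1.sock level admin process 1
```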

Regards,

Olivier
Willy Tarreau
Re: [RFC][PATCHES] seamless reload
May 12, 2017 04:30PM
Hi Pavlos, Olivier,

On Mon, May 08, 2017 at 02:34:05PM +0200, Olivier Houchard wrote:
> Hi Pavlos,
>
> On Sun, May 07, 2017 at 12:05:28AM +0200, Pavlos Parissis wrote:
> [...]
> > Ignore what I wrote, I am an idiot, as I forgot the most important bit of the
> > test: enabling the seamless reload by supplying the
> > HAPROXY_STATS_SOCKET environment variable :-(
> >
> > I added to the systemd override file:
> > [Service]
> > Environment=CONFIG="/etc/lb_engine/haproxy.cfg"
> > "HAPROXY_STATS_SOCKET=/run/lb_engine/process-1.sock"
> >
> > and wrk2 reports ZERO errors, whereas with HAPEE it reports ~49.
> >
> > I am terribly sorry for this stupid mistake.
> >
> > But this mistake revealed something interesting: with the latest code we get
> > more errors during reload.
> >
> > @Olivier, great work dude. I am waiting for this to be back-ported to HAPEE-1.7r1.
> >
> > Once again I am sorry for my mistake,
> > Pavlos
> >
>
> Thanks a lot for testing !
> This is interesting indeed. My patch may make it worse when not passing
> fds via the unix socket, as all processes now keep all sockets open, even
> the ones they're not using; maybe that makes the window between the last
> accept and the close bigger.

That's very interesting indeed. In fact it's the window between the last
accept and the *last* close, due to processes holding the socket while
not being willing to accept anything on it.

> If that is so, then the global option "no-unused-socket" should provide
> a comparable error rate.

In fact William is currently working on the master-worker model to get rid
of the systemd-wrapper and found some corner cases between this and your
patchset. Nothing particularly difficult, just the fact that he'll need
to pass the path to the previous socket to the new processes during reloads.

During this investigation it was found that we'd need to be able to say
that a process possibly has no stats socket and that the next one will not
be able to retrieve the FDs. Such information cannot be passed from the
command line since it's a consequence of the config parsing. Thus we thought
it would make sense to have a per-socket option to say whether or not it
would be usable for offering the listening file descriptors, just like we
currently have an administrative level on them (I even seem to remember
that Olivier first asked if we wouldn't need to do this). And suddenly a
few benefits appear when doing this:
- security freaks not willing to expose FDs over the socket would simply
not enable them;

- we could restrict the number of processes likely to expose the
FDs simply by playing with the "process" directive on the socket; that
could also save some system-wide FDs;

- the master process could reliably find the socket's path in the conf
(the first one with this new directive enabled), even if it's changed
between reloads;

- in the default case (no specific option) we wouldn't change the existing
behaviour so it would not make existing reloads worse.
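For readers of this archive: the per-socket directive described above eventually shipped in HAProxy 1.8 as "expose-fd listeners", so the idea ends up looking roughly like this (socket path reused from the config attached earlier in the thread):

```
global
    # only this socket may hand the listening FDs to the next process on reload
    stats socket /run/lb_engine/process-1.sock level admin process 1 expose-fd listeners
```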

Pavlos, regarding the backport to your beloved version, that's planned, but
as you can see, while the main technical issues have already been sorted out,
there will still be a few small integration-specific changes to come, which
is why for now it's still on hold until all these details are sorted out
once and for all.

Best regards,
Willy