Testing master-worker reloads on HAProxy 1.8

Posted by Anthony Via 
Anthony Via
Testing master-worker reloads on HAProxy 1.8
December 07, 2017 12:00AM
Hello,


I am testing seamless reloads on HAProxy 1.8.0 using the master-worker model and am running into the following when sending SIGUSR2 to the master process:


[ALERT] 339/222907 (61399) : Starting frontend internal_http: cannot bind socket [0.0.0.0:80]
[WARNING] 339/222907 (61399) : Reexecuting Master process in waitpid mode
[WARNING] 339/222907 (61399) : Reexecuting Master process

From my understanding, after the master process receives the SIGUSR2 signal it should send the worker process(es) the SIGUSR1 signal, which does not appear to be happening. I have manually sent the worker processes the SIGUSR1 signal, and they do shut down cleanly as expected. I thought maybe the worker wasn't shutting down quickly enough, so I played around with "hard-stop-after", but that did not help.

This is on a Solaris-based operating system. I did have success on Ubuntu 16.04, so I'm wondering if this is an issue with my OS.

Any ideas for narrowing the problem down?

Thanks,

Anthony
William Lallemand
Re: Testing master-worker reloads on HAProxy 1.8
December 08, 2017 10:50PM
Hello Anthony,

On Wed, Dec 06, 2017 at 10:48:23PM +0000, Anthony Via wrote:
> Hello,
>
>
> I am testing seamless reloads on HAProxy 1.8.0 using the master-worker model and am running into the following when sending SIGUSR2 to the master process:
>
>
> [ALERT] 339/222907 (61399) : Starting frontend internal_http: cannot bind socket [0.0.0.0:80]
> [WARNING] 339/222907 (61399) : Reexecuting Master process in waitpid mode
> [WARNING] 339/222907 (61399) : Reexecuting Master process
>
> From my understanding, after the master process receives the SIGUSR2 signal
> it should be sending the worker process(es) the SIGUSR1 signal, which does
> not appear to be happening.

When the master receives the SIGUSR2 signal, it should re-exec itself with the
-sf argument followed by the PIDs of the current workers.

It then follows the execution of a normal haproxy process started with -sf:

- It parses the configuration.
- It tries to bind with SO_REUSEPORT if your system supports it (I don't think yours does).
- If it cannot bind, it sends the SIGTTOU signal to the old processes.
- The old processes receive the SIGTTOU and disable their binds.
- The new process tries to bind again.
- The new process sends the SIGUSR1 signal to the old ones.
- The new process forks the children.
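The bind, fail, release, retry part of that sequence can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: both "processes" live in one script, the port is an ephemeral one on 127.0.0.1, and the old side simply closes its socket where a real haproxy would react to SIGTTOU. The function names are made up for the illustration.

```python
import socket


def bind_listener(port):
    """Try to bind and listen on 127.0.0.1:port; return the socket or None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(16)
        return sock
    except OSError:
        # Equivalent in spirit to the "cannot bind socket" alert.
        sock.close()
        return None


def classic_reload_demo():
    # "Old process": holds the listening socket on an ephemeral port.
    old = bind_listener(0)
    port = old.getsockname()[1]

    # "New process": the first bind attempt fails while the old one is bound.
    first_try = bind_listener(port)

    # Stand-in for the SIGTTOU handling (assumption: a plain close()).
    old.close()

    # "New process": the retry succeeds once the port has been released.
    second_try = bind_listener(port)

    result = (first_try is not None, second_try is not None)
    for sock in (first_try, second_try):
        if sock is not None:
            sock.close()
    return result
```

If the old side never releases the port, the second attempt fails in the same way as the first, which matches the symptom reported at the top of the thread.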

I just tested the master-worker with the -dR option to validate that the
SIGTTOU system is still working, and it seems to work on my Linux system.

> I have manually sent worker processes the SIGUSR1 signal, and they do shut
> down cleanly as expected. I thought maybe the worker wasn't shutting down
> quick enough, so I played around with "hard-stop-after", but that did not
> help.
>

Did you try launching a new haproxy process with the -sf option, without using
the master-worker?

> This is on a Solaris based operating system. I did have success on Ubuntu
> 16.04, so I'm wondering if this is an issue with my OS.
>

What is your operating system and version exactly?

> Any ideas for narrowing the problem down?
>
> Thanks,

I think the old processes did not receive the SIGTTOU for an unknown reason,
or did not unbind once they received the signal.

Maybe you could try to compare what's happening on your solaris-like system and
your ubuntu with the -dR option, using strace on linux and truss on solaris.

Regards,

--
William Lallemand
William Lallemand
Re: Testing master-worker reloads on HAProxy 1.8
December 08, 2017 11:00PM
> On Wed, Dec 06, 2017 at 10:48:23PM +0000, Anthony Via wrote:
> > I am testing seamless reloads on HAProxy 1.8.0 using the master-worker
> > model and am running into the following when sending SIGUSR2 to the master
> > process

Sorry, I misread. My explanation in the previous post applies only to a
"classic" reload, not to the seamless reload (-x + expose-fd).

During a seamless reload, the new process tries to get the FDs of the listeners
using the unix socket.
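For readers unfamiliar with handing a listening socket to another process, here is a minimal Python sketch of the generic Unix mechanism (SCM_RIGHTS ancillary data over a Unix socket). This is an assumption chosen for illustration; it does not reproduce the commands or protocol haproxy uses on its stats socket. A socketpair stands in for that socket and all function names are hypothetical.

```python
import array
import socket


def send_fd(channel, fd):
    """Send one file descriptor over a Unix socket as SCM_RIGHTS data."""
    fds = array.array("i", [fd])
    channel.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(channel):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _, ancdata, _, _ = channel.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[: fds.itemsize])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def handover_demo():
    # "Old process" owns a listening socket on an ephemeral port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    address = listener.getsockname()

    # The socketpair plays the role of the unix socket between old and new.
    old_end, new_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    send_fd(old_end, listener.fileno())
    inherited = socket.socket(fileno=recv_fd(new_end))

    # The old side goes away; the inherited descriptor keeps the port bound,
    # so the "new process" never has to call bind() itself.
    listener.close()
    old_end.close()
    new_end.close()

    client = socket.create_connection(address, timeout=2)
    conn, _ = inherited.accept()
    conn.close()
    client.close()

    same_address = inherited.getsockname() == address
    inherited.close()
    return same_address
```

Because the new side never binds, this path does not depend on the old process unbinding first.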

Did you try the seamless reload using -x without the master-worker?

--
William Lallemand
Anthony Via
Re: Testing master-worker reloads on HAProxy 1.8
December 12, 2017 12:10AM
> Did you try the seamless reload using -x without the master-worker?

I was looking into the "-x" option and it looks like simply adding "expose-fd listeners" to my stats socket has fixed this issue for me. Sending SIGUSR2 to the master process now works as expected. Is that option required for this reload model?

> Did you try launching a new haproxy process with the -sf option, without using
> the master-worker?

Yes, and that also failed with the same "cannot bind" error.

> What is your operating system and version exactly?

We are using SmartOS:
# uname -a
SunOS ops-dev-lb03 5.11 joyent_20170315T185612Z i86pc i386 i86pc Solaris

> I think the old processes did not receive the SIGTTOU for an unknown reason,
> or did not unbind once it received the signal.
>
> Maybe you could try to compare what's happening on your solaris-like system and
> your ubuntu with the -dR option, using strace on linux and truss on solaris.

I verified using dtrace that the worker is indeed receiving the SIGTTOU signal from the master (200 times), so the worker must not have been unbinding.


Anthony


William Lallemand
Re: Testing master-worker reloads on HAProxy 1.8
December 12, 2017 12:40AM
On Mon, Dec 11, 2017 at 11:03:52PM +0000, Anthony Via wrote:
> > Did you try the seamless reload using -x without the master-worker?
>
> I was looking into the "-x" option and it looks like simply adding "expose-fd
> listeners" to my stats socket has fixed this issue for me. Sending SIGUSR2 to
> the master process now works as expected. Is that option required for this
> reload model?

"expose-fd listeners" is the option which give the ability to the admin socket
to pass the listeners FDs, so this option is mandatory if you want to use the
seamless reload.

When you use the 'normal' daemon model, you have to specify -x with the path of
the socket where you want to retrieve the listeners.
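As a rough illustration of the daemon-model case, the following Python sketch scans a configuration for a stats socket that carries "expose-fd listeners" and builds the matching command line with -x and -sf. The sample configuration, the paths, the PID and the helper names are all hypothetical, and the parsing is deliberately naive (one directive per line, no includes).

```python
SAMPLE_CONFIG = """\
global
    stats socket /var/run/haproxy.sock mode 600 level admin expose-fd listeners
"""


def find_fd_socket(config_text):
    """Return the path of the first stats socket with 'expose-fd listeners'."""
    for line in config_text.splitlines():
        words = line.split("#", 1)[0].split()  # drop trailing comments
        if words[:2] != ["stats", "socket"] or len(words) < 3:
            continue
        options = words[3:]
        for i in range(len(options) - 1):
            if options[i] == "expose-fd" and options[i + 1] == "listeners":
                return words[2]
    return None


def seamless_reload_argv(config_path, config_text, old_pids):
    """Build a reload command line for the daemon model (no master-worker)."""
    sock_path = find_fd_socket(config_text)
    if sock_path is None:
        raise ValueError("no stats socket with 'expose-fd listeners' found")
    argv = ["haproxy", "-f", config_path, "-x", sock_path, "-sf"]
    argv += [str(pid) for pid in old_pids]
    return argv
```

Calling `seamless_reload_argv("/etc/haproxy/haproxy.cfg", SAMPLE_CONFIG, [61399])` yields a list suitable for `subprocess.run`; the check fails early when the option is missing instead of letting the reload fall back to rebinding.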

Using the master-worker you only need to specify "expose-fd listeners" in the
config, and the master will use this socket and reexecute itself with the right
"-x" option.

> > Did you try launching a new haproxy process with the -sf option, without
> > using the master-worker?
>
> Yes, and that also failed with the same "cannot bind" error.

Without "expose-fd listeners" I suppose?

> > What is your operating system and version exactly?
>
> We are using SmartOS:
> # uname -a
> SunOS ops-dev-lb03 5.11 joyent_20170315T185612Z i86pc i386 i86pc Solaris
>
> > I think the old processes did not receive the SIGTTOU for an unknown
> > reason, or did not unbind once it received the signal.
>
> > Maybe you could try to compare what's happening on your solaris-like system
> > and your ubuntu with the -dR option, using strace on linux and truss on
> > solaris.
>
> I verified using dtrace that the worker is indeed receiving the SIGTTOU
> signal from the master (200 times), so the worker must not have been
> unbinding.
>

Okay, so it looks like unbinding with SIGTTOU does not work on your
OS, but the seamless reload seems to work...

According to the code comments, that's a known problem on Solaris. Maybe we
should add a note in the documentation about it.

Regards,

--
William Lallemand
Anthony Via
Re: Testing master-worker reloads on HAProxy 1.8
December 12, 2017 12:50AM
> > > Did you try launching a new haproxy process with the -sf option, without
> > > using the master-worker?
> >
> > Yes, and that also failed with the same "cannot bind" error.

> Without "expose-fd listeners" I suppose?

That is correct.

> Okay, so it looks like that the unbinding with SIGTTOU does not work on your
> OS, but the seamless reload seems to work...
>
> According to the code commentary that's a known problem on Solaris, maybe we
> should add a note in the documentation about it.

Good to know. Thanks for looking into this.

Anthony
