Welcome! Log In Create A New Profile

Advanced

haproxy-1.8.8 seamless reloads failing with [email protected] sockets

Posted by Jarno Huuskonen 
Hi,

I'm testing 1.8.8(1.8.8-52ec357 snapshot) and seamless reloads
(expose-fd listeners).

I'm testing with this config (missing some default timeouts):
----------------------8<----------------
global
stats socket /tmp/stats level admin expose-fd listeners

defaults
mode http
log global
option httplog
retries 2
timeout connect 1500ms
timeout client 10s
timeout server 10s

listen testme
bind ipv4@127.0.0.1:8080
server test_abns_server [email protected] send-proxy-v2

frontend test_abns
bind [email protected] accept-proxy
http-request deny deny_status 200
----------------------8<----------------

Reloads (kill -USR2 $(cat /tmp/haproxy.pid)) are failing:
"Starting frontend test_abns: cannot listen to socket []"
(And request to 127.0.0.1:8080 timeout).

I guess the problem is that on reload, haproxy is trying
to bind the abns socket again, because (proto_uxst.c) uxst_bind_listener /
uxst_find_compatible_fd doesn't find existing (the one copied over from
old process) file descriptor for this abns socket.

Is uxst_find_compatible_fd only looking for <sockname>.XXXXX.tmp sockets
and ignoring abns sockets where path starts with \0 ?

Using unix socket instead of abns socket makes the reload work.

-Jarno

--
Jarno Huuskonen
Hi Jarno,

On Sat, May 12, 2018 at 06:04:10PM +0300, Jarno Huuskonen wrote:
> Hi,
>
> I'm testing 1.8.8(1.8.8-52ec357 snapshot) and seamless reloads
> (expose-fd listeners).
>
> I'm testing with this config (missing some default timeouts):
> ----------------------8<----------------
> global
> stats socket /tmp/stats level admin expose-fd listeners
>
> defaults
> mode http
> log global
> option httplog
> retries 2
> timeout connect 1500ms
> timeout client 10s
> timeout server 10s
>
> listen testme
> bind ipv4@127.0.0.1:8080
> server test_abns_server [email protected] send-proxy-v2
>
> frontend test_abns
> bind [email protected] accept-proxy
> http-request deny deny_status 200
> ----------------------8<----------------
>
> Reloads (kill -USR2 $(cat /tmp/haproxy.pid)) are failing:
> "Starting frontend test_abns: cannot listen to socket []"
> (And request to 127.0.0.1:8080 timeout).
>
> I guess the problem is that on reload, haproxy is trying
> to bind the abns socket again, because (proto_uxst.c) uxst_bind_listener /
> uxst_find_compatible_fd doesn't find existing (the one copied over from
> old process) file descriptor for this abns socket.
>
> Is uxst_find_compatible_fd only looking for <sockname>.XXXXX.tmp sockets
> and ignoring abns sockets where path starts with \0 ?
>
> Using unix socket instead of abns socket makes the reload work.
>

Sorry for the late answer.

You're right indeed, that code was not written with abns sockets in mind.
The attached patch should fix it. It was created from master, but should
apply to 1.8 as well.

Thanks !

Olivier
From 3ba0fbb7c9e854aafb8a6b98482ad7d23bbb414d Mon Sep 17 00:00:00 2001
From: Olivier Houchard <[email protected]>
Date: Wed, 6 Jun 2018 18:34:34 +0200
Subject: [PATCH] MINOR: unix: Make sure we can transfer abns sockets as well
on seamless reload.

When checking if a socket we got from the parent is suitable for a listener,
we just checked that the path matched sockname.tmp, however this is
unsuitable for abns sockets, where we don't have to create a temporary
file and rename it later.
To detect that, check that the first character of the sun_path is 0 for
both, and if so, that &sun_path[1] is the same too.

This should be backported to 1.8.
---
src/proto_uxst.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/proto_uxst.c b/src/proto_uxst.c
index 9fc50dff4..a1da337fe 100644
--- a/src/proto_uxst.c
+++ b/src/proto_uxst.c
@@ -146,7 +146,12 @@ static int uxst_find_compatible_fd(struct listener *l)
after_sockname++;
if (!strcmp(after_sockname, ".tmp"))
break;
- }
+ /* abns sockets sun_path starts with a \0 */
+ } else if (un1->sun_path[0] == 0
+ && un2->sun_path[0] == 0
+ && !strncmp(&un1->sun_path[1], &un2->sun_path[1],
+ sizeof(un1->sun_path) - 1))
+ break;
}
xfer_sock = xfer_sock->next;
}
--
2.14.3
Hi Olivier,

On Wed, Jun 06, 2018 at 06:40:05PM +0200, Olivier Houchard wrote:
> You're right indeed, that code was not written with abns sockets in mind.
> The attached patch should fix it. It was created from master, but should
> apply to 1.8 as well.
>
> Thanks !
>
> Olivier

> >From 3ba0fbb7c9e854aafb8a6b98482ad7d23bbb414d Mon Sep 17 00:00:00 2001
> From: Olivier Houchard <[email protected]>
> Date: Wed, 6 Jun 2018 18:34:34 +0200
> Subject: [PATCH] MINOR: unix: Make sure we can transfer abns sockets as well on seamless reload.

Would you be so kind as to tag it "BUG" so that our beloved stable
team catches it for the next 1.8 ? ;-)

> diff --git a/src/proto_uxst.c b/src/proto_uxst.c
> index 9fc50dff4..a1da337fe 100644
> --- a/src/proto_uxst.c
> +++ b/src/proto_uxst.c
> @@ -146,7 +146,12 @@ static int uxst_find_compatible_fd(struct listener *l)
> after_sockname++;
> if (!strcmp(after_sockname, ".tmp"))
> break;
> - }
> + /* abns sockets sun_path starts with a \0 */
> + } else if (un1->sun_path[0] == 0
> + && un2->sun_path[0] == 0
> + && !strncmp(&un1->sun_path[1], &un2->sun_path[1],
> + sizeof(un1->sun_path) - 1))
> + break;

It may still randomly fail here because null bytes are explicitly permitted
in the sun_path. Instead I'd suggest this :

} else if (un1->sun_path[0] == 0 &&
memcmp(un1->sun_path, un2->sun_path, sizeof(un1->sun_path) == 0)

Jarno, if you still notice occasional failures, please try with this.

Thanks
Willy
Hi Willy,

On Thu, Jun 07, 2018 at 11:45:39AM +0200, Willy Tarreau wrote:
> Hi Olivier,
>
> On Wed, Jun 06, 2018 at 06:40:05PM +0200, Olivier Houchard wrote:
> > You're right indeed, that code was not written with abns sockets in mind.
> > The attached patch should fix it. It was created from master, but should
> > apply to 1.8 as well.
> >
> > Thanks !
> >
> > Olivier
>
> > >From 3ba0fbb7c9e854aafb8a6b98482ad7d23bbb414d Mon Sep 17 00:00:00 2001
> > From: Olivier Houchard <[email protected]>
> > Date: Wed, 6 Jun 2018 18:34:34 +0200
> > Subject: [PATCH] MINOR: unix: Make sure we can transfer abns sockets as well on seamless reload.
>
> Would you be so kind as to tag it "BUG" so that our beloved stable
> team catches it for the next 1.8 ? ;-)
>

Sir yes sir.

> > diff --git a/src/proto_uxst.c b/src/proto_uxst.c
> > index 9fc50dff4..a1da337fe 100644
> > --- a/src/proto_uxst.c
> > +++ b/src/proto_uxst.c
> > @@ -146,7 +146,12 @@ static int uxst_find_compatible_fd(struct listener *l)
> > after_sockname++;
> > if (!strcmp(after_sockname, ".tmp"))
> > break;
> > - }
> > + /* abns sockets sun_path starts with a \0 */
> > + } else if (un1->sun_path[0] == 0
> > + && un2->sun_path[0] == 0
> > + && !strncmp(&un1->sun_path[1], &un2->sun_path[1],
> > + sizeof(un1->sun_path) - 1))
> > + break;
>
> It may still randomly fail here because null bytes are explicitly permitted
> in the sun_path. Instead I'd suggest this :
>
> } else if (un1->sun_path[0] == 0 &&
> memcmp(un1->sun_path, un2->sun_path, sizeof(un1->sun_path) == 0)
>
> Jarno, if you still notice occasional failures, please try with this.
>

You're right, as unlikely as it can be in our current scenario, better safe
than sorry.
The attached patch is updated to reflect that.

Regards,

Olivier
From b6c8bd3102abcf1bb1660429b9b737fcd7a60b61 Mon Sep 17 00:00:00 2001
From: Olivier Houchard <[email protected]>
Date: Wed, 6 Jun 2018 18:34:34 +0200
Subject: [PATCH] BUG/MINOR: unix: Make sure we can transfer abns sockets on
seamless reload.

When checking if a socket we got from the parent is suitable for a listener,
we just checked that the path matched sockname.tmp, however this is
unsuitable for abns sockets, where we don't have to create a temporary
file and rename it later.
To detect that, check that the first character of the sun_path is 0 for
both, and if so, that &sun_path[1] is the same too.

This should be backported to 1.8.
---
src/proto_uxst.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/proto_uxst.c b/src/proto_uxst.c
index 9fc50dff4..ab788bde7 100644
--- a/src/proto_uxst.c
+++ b/src/proto_uxst.c
@@ -146,7 +146,12 @@ static int uxst_find_compatible_fd(struct listener *l)
after_sockname++;
if (!strcmp(after_sockname, ".tmp"))
break;
- }
+ /* abns sockets sun_path starts with a \0 */
+ } else if (un1->sun_path[0] == 0
+ && un2->sun_path[0] == 0
+ && !memcmp(&un1->sun_path[1], &un2->sun_path[1],
+ sizeof(un1->sun_path) - 1))
+ break;
}
xfer_sock = xfer_sock->next;
}
--
2.14.3
Hi Olivier / Willy,

On Thu, Jun 07, Olivier Houchard wrote:
> Hi Willy,
>
> On Thu, Jun 07, 2018 at 11:45:39AM +0200, Willy Tarreau wrote:
> > Hi Olivier,
> >
> > On Wed, Jun 06, 2018 at 06:40:05PM +0200, Olivier Houchard wrote:
> > > You're right indeed, that code was not written with abns sockets in mind.
> > > The attached patch should fix it. It was created from master, but should
> > > apply to 1.8 as well.
> > >
> > > Thanks !
> > >
> > > Olivier
> >
> > > >From 3ba0fbb7c9e854aafb8a6b98482ad7d23bbb414d Mon Sep 17 00:00:00 2001
> > > From: Olivier Houchard <[email protected]>
> > > Date: Wed, 6 Jun 2018 18:34:34 +0200
> > > Subject: [PATCH] MINOR: unix: Make sure we can transfer abns sockets as well on seamless reload.
> >
> > Would you be so kind as to tag it "BUG" so that our beloved stable
> > team catches it for the next 1.8 ? ;-)
> >
>
> Sir yes sir.
>
> > > diff --git a/src/proto_uxst.c b/src/proto_uxst.c
> > > index 9fc50dff4..a1da337fe 100644
> > > --- a/src/proto_uxst.c
> > > +++ b/src/proto_uxst.c
> > > @@ -146,7 +146,12 @@ static int uxst_find_compatible_fd(struct listener *l)
> > > after_sockname++;
> > > if (!strcmp(after_sockname, ".tmp"))
> > > break;
> > > - }
> > > + /* abns sockets sun_path starts with a \0 */
> > > + } else if (un1->sun_path[0] == 0
> > > + && un2->sun_path[0] == 0
> > > + && !strncmp(&un1->sun_path[1], &un2->sun_path[1],
> > > + sizeof(un1->sun_path) - 1))
> > > + break;
> >
> > It may still randomly fail here because null bytes are explicitly permitted
> > in the sun_path. Instead I'd suggest this :
> >
> > } else if (un1->sun_path[0] == 0 &&
> > memcmp(un1->sun_path, un2->sun_path, sizeof(un1->sun_path) == 0)
> >
> > Jarno, if you still notice occasional failures, please try with this.
> >
>
> You're right, as unlikely as it can be in our current scenario, better safe
> than sorry.
> The attached patch is updated to reflect that.

Thanks !
My minimal test config with the patch works (on top of
1.8.9): (doing reloads/curl in loop).

I'll test with my normal/production config when I'll have more time
(probably few days).

-Jarno

--
Jarno Huuskonen
On Thu, Jun 07, 2018 at 03:32:31PM +0300, Jarno Huuskonen wrote:
> My minimal test config with the patch works (on top of
> 1.8.9): (doing reloads/curl in loop).

Thanks, not surprising anyway ;-)

Now merged.

Willy
Sorry, only registered users may post in this forum.

Click here to login