Welcome! Log In Create A New Profile

Advanced

[PATCH] BUG/MEDIUM: mworker: Fix re-exec when haproxy is started from PATH

Posted by Tim Düsterhus 
From: Tim Duesterhus <tim@bastelstu.be>

If haproxy is started using the name of the binary only (i.e.
not using a relative or absolute path) the `execv` in
`mworker_reload` fails with `ENOENT`, because it does not
examine the `PATH`:

[WARNING] 315/161139 (7) : Reexecuting Master process
[WARNING] 315/161139 (7) : Cannot allocate memory
[WARNING] 315/161139 (7) : Failed to reexecute the master processs [7]

The error messages are misleading, because the return value of
`execv` is not checked. This should be fixed in a separate commit.

Once this happened the master process ignores any further
signals sent by the administrator.

Replace `execv` with `execvp` to establish the expected
behaviour.

This bug was introduced in commit 73b85e75b3963086be889e1fb40a59e7ef2ad63b.
---
src/haproxy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index a22ed321..b4f6a4f8 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -650,7 +650,7 @@ static void mworker_reload()

deinit(); /* we don't want to leak FD there */
Warning("Reexecuting Master process\n");
- execv(next_argv[0], next_argv);
+ execvp(next_argv[0], next_argv);

alloc_error:
Warning("Cannot allocate memory\n");
--
2.15.0
Hi Tim,

On Sun, Nov 12, 2017 at 05:39:18PM +0100, Tim Düsterhus wrote:
> From: Tim Duesterhus <tim@bastelstu.be>
>
> If haproxy is started using the name of the binary only (i.e.
> not using a relative or absolute path) the `execv` in
> `mworker_reload` fails with `ENOENT`, because it does not
> examine the `PATH`:
>
> [WARNING] 315/161139 (7) : Reexecuting Master process
> [WARNING] 315/161139 (7) : Cannot allocate memory
> [WARNING] 315/161139 (7) : Failed to reexecute the master processs [7]
>
> The error messages are misleading, because the return value of
> `execv` is not checked. This should be fixed in a separate commit.
>
> Once this happened the master process ignores any further
> signals sent by the administrator.
>

Well, in my opinion the return value does not need to be checked because we
should never reach the code after the exec, that means that the code after the
exec is only reached during an exec error. But we could at least display the
error code of exec and return afterwards.

However, in the case of an exec failure, the master-worker should always work,
and so it should be able to receive signals.

I spot two other problems in this code path, I'll fix them shortly.

> Replace `execv` with `execvp` to establish the expected
> behaviour.

Willy, could you please apply this patch? thanks

--
William Lallemand
On Tue, Nov 14, 2017 at 03:03:20PM +0100, William Lallemand wrote:
> Willy, could you please apply this patch? thanks

OK now done, thanks guys.

Willy
William,

Am 14.11.2017 um 15:03 schrieb William Lallemand:
> Well, in my opinion the return value does not need to be checked because we
> should never reach the code after the exec, that means that the code after the
> exec is only reached during an exec error. But we could at least display the
> error code of exec and return afterwards.

Displaying the error code basically would imply "checking the return
value" for me.
I did not mean to say that some recovery should be performed if `exec`
fails. I am only saying that the error message should not be
misleadingly claim that we "Cannot allocate memory" when e.g. the
haproxy binary was deleted instead.

Personally I was confused whether I misconfigured the Docker container
with a broken memory limit, when the actual issue was the bug this patch
fixed. While *this* bug has been fixed `execvp` could possibly fail for
other reasons in the future.

> Willy, could you please apply this patch? thanks
>

Thanks!

Tim Düsterhus
Hi Tim,

On Tue, Nov 14, 2017 at 08:20:25PM +0100, Tim Düsterhus wrote:
> William,
>
> Am 14.11.2017 um 15:03 schrieb William Lallemand:
> > Well, in my opinion the return value does not need to be checked because we
> > should never reach the code after the exec, that means that the code after the
> > exec is only reached during an exec error. But we could at least display the
> > error code of exec and return afterwards.
>
> Displaying the error code basically would imply "checking the return
> value" for me.
> I did not mean to say that some recovery should be performed if `exec`
> fails. I am only saying that the error message should not be
> misleadingly claim that we "Cannot allocate memory" when e.g. the
> haproxy binary was deleted instead.
>
> Personally I was confused whether I misconfigured the Docker container
> with a broken memory limit, when the actual issue was the bug this patch
> fixed. While *this* bug has been fixed `execvp` could possibly fail for
> other reasons in the future.
>

These problems have been fixed in the master with the following commits:

75ea0a06b BUG/MEDIUM: mworker: does not close inherited FD
fade49d8f BUG/MEDIUM: mworker: does not deinit anymore
2f8b31c2c BUG/MEDIUM: mworker: wait again for signals when execvp fail
722d4ca0d MINOR: mworker: display an accurate error when the reexec fail


The master worker should be able to behave correctly on a execvp failure now :-)

Thanks for the report.

Cheers,

--
William Lallemand
William,

Am 15.11.2017 um 21:17 schrieb William Lallemand:
> These problems have been fixed in the master with the following commits:
>
> 75ea0a06b BUG/MEDIUM: mworker: does not close inherited FD
> fade49d8f BUG/MEDIUM: mworker: does not deinit anymore
> 2f8b31c2c BUG/MEDIUM: mworker: wait again for signals when execvp fail
> 722d4ca0d MINOR: mworker: display an accurate error when the reexec fail
>
>
> The master worker should be able to behave correctly on a execvp failure now :-)
>

I took a look at your commits. While I don't know much about haproxy's
internals they look good to me.

Just one thing: At the top of `static void mworker_reload(void)` the
Environment is modified using:

> setenv("HAPROXY_MWORKER_REEXEC", "1", 1);

Is it necessary to reset that value in case of `execvp` failure? You
don't seem to do so.

Best regards
Tim Düsterhus
On Thu, Nov 16, 2017 at 05:30:05PM +0100, Tim Düsterhus wrote:
> William,
>
> Am 15.11.2017 um 21:17 schrieb William Lallemand:
> > These problems have been fixed in the master with the following commits:
> >
> > 75ea0a06b BUG/MEDIUM: mworker: does not close inherited FD
> > fade49d8f BUG/MEDIUM: mworker: does not deinit anymore
> > 2f8b31c2c BUG/MEDIUM: mworker: wait again for signals when execvp fail
> > 722d4ca0d MINOR: mworker: display an accurate error when the reexec fail
> >
> >
> > The master worker should be able to behave correctly on a execvp failure now :-)
> >
>
> I took a look at your commits. While I don't know much about haproxy's
> internals they look good to me.
>
> Just one thing: At the top of `static void mworker_reload(void)` the
> Environment is modified using:
>
> > setenv("HAPROXY_MWORKER_REEXEC", "1", 1);
>
> Is it necessary to reset that value in case of `execvp` failure? You
> don't seem to do so.
>

It's not, this variable is only used at the start of the executable to know if
it's a restart or not, once it's started it should always be 1.

Cheers,

--
William Lallemand
Sorry, only registered users may post in this forum.

Click here to login