Welcome! Log In Create A New Profile

Advanced

[PATCH] BUG/MEDIUM: kqueue/poll: only use EV_SET when actually needing to add or remove event filters

Posted by PiBa-NL 
Hi Willy,

Sending a patch proposal after like 40 hours of looking through what
happens and what event we might be missing, im now changing +-40 lines
of code.. And actually not 'really' changing the events requested.. But
rather kinda bringing the old 'previous state' comparison back..

Lemme know if its okay like this, you would like the new variable to be
renamed, the if/then/else restructured or just don't like it at all ;).
In the last case please do give a hint about what you would like instead :).

Regards,
PiBa-NL (Pieter)
From 21c191c036f740eb75a4fa59c23232b910cd695c Mon Sep 17 00:00:00 2001
From: PiBa-NL <[email protected]>
Date: Sun, 15 Apr 2018 22:20:22 +0200
Subject: [PATCH] BUG/MEDIUM: kqueue/poll: only use EV_SET when actually
needing to add or remove event filters

Avoid event filters being added twice or deleted while not present causing hanging requests.
Known reproduction:
With haproxy on a FreeBSD machine, and a IIS website with NTLM authentication and using http-tunnel haproxy would fail to receive the request with the added credentials to pass to the backend.
---
include/types/fd.h | 1 +
src/ev_kqueue.c | 36 +++++++++++++++++++++++++-----------
src/fd.c | 1 +
3 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/include/types/fd.h b/include/types/fd.h
index 0902e7f..243d7bb 100644
--- a/include/types/fd.h
+++ b/include/types/fd.h
@@ -115,6 +115,7 @@ struct fdtab {
__decl_hathreads(HA_SPINLOCK_T lock);
unsigned long thread_mask; /* mask of thread IDs authorized to process the task */
unsigned long polled_mask; /* mask of thread IDs currently polling this fd */
+ unsigned long polled_mask_write; /* mask of thread IDs currently polling this fd */
unsigned long update_mask; /* mask of thread IDs having an update for fd */
struct fdlist_entry cache; /* Entry in the fdcache */
void (*iocb)(int fd); /* I/O handler */
diff --git a/src/ev_kqueue.c b/src/ev_kqueue.c
index a103ece..ecadec3 100644
--- a/src/ev_kqueue.c
+++ b/src/ev_kqueue.c
@@ -56,29 +56,43 @@ REGPRM2 static void _do_poll(struct poller *p, int exp)
en = fdtab[fd].state;

if (!(fdtab[fd].thread_mask & tid_bit) || !(en & FD_EV_POLLED_RW)) {
- if (!(fdtab[fd].polled_mask & tid_bit)) {
+ if (!(fdtab[fd].polled_mask & tid_bit) && !(fdtab[fd].polled_mask_write & tid_bit)) {
/* fd was not watched, it's still not */
continue;
}
/* fd totally removed from poll list */
- EV_SET(&kev[changes++], fd, EVFILT_READ, EV_DELETE, 0, 0, NULL);
- EV_SET(&kev[changes++], fd, EVFILT_WRITE, EV_DELETE, 0, 0, NULL);
- HA_ATOMIC_AND(&fdtab[fd].polled_mask, ~tid_bit);
+ if (fdtab[fd].polled_mask & tid_bit) {
+ EV_SET(&kev[changes++], fd, EVFILT_READ, EV_DELETE, 0, 0, NULL);
+ HA_ATOMIC_AND(&fdtab[fd].polled_mask, ~tid_bit);
+ }
+ if (fdtab[fd].polled_mask_write & tid_bit) {
+ EV_SET(&kev[changes++], fd, EVFILT_WRITE, EV_DELETE, 0, 0, NULL);
+ HA_ATOMIC_AND(&fdtab[fd].polled_mask_write, ~tid_bit);
+ }
}
else {
/* OK fd has to be monitored, it was either added or changed */

- if (en & FD_EV_POLLED_R)
- EV_SET(&kev[changes++], fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
- else if (fdtab[fd].polled_mask & tid_bit)
+ if (en & FD_EV_POLLED_R) {
+ if (!(fdtab[fd].polled_mask & tid_bit)) {
+ EV_SET(&kev[changes++], fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
+ HA_ATOMIC_OR(&fdtab[fd].polled_mask, tid_bit);
+ }
+ } else if (fdtab[fd].polled_mask & tid_bit) {
EV_SET(&kev[changes++], fd, EVFILT_READ, EV_DELETE, 0, 0, NULL);
+ HA_ATOMIC_AND(&fdtab[fd].polled_mask, ~tid_bit);
+ }

- if (en & FD_EV_POLLED_W)
- EV_SET(&kev[changes++], fd, EVFILT_WRITE, EV_ADD, 0, 0, NULL);
- else if (fdtab[fd].polled_mask & tid_bit)
+ if (en & FD_EV_POLLED_W) {
+ if (!(fdtab[fd].polled_mask_write & tid_bit)) {
+ EV_SET(&kev[changes++], fd, EVFILT_WRITE, EV_ADD, 0, 0, NULL);
+ HA_ATOMIC_OR(&fdtab[fd].polled_mask_write, tid_bit);
+ }
+ } else if (fdtab[fd].polled_mask_write & tid_bit) {
EV_SET(&kev[changes++], fd, EVFILT_WRITE, EV_DELETE, 0, 0, NULL);
+ HA_ATOMIC_AND(&fdtab[fd].polled_mask_write, ~tid_bit);
+ }

- HA_ATOMIC_OR(&fdtab[fd].polled_mask, tid_bit);
}
}
if (changes)
diff --git a/src/fd.c b/src/fd.c
index 1af64e5..57aeb5c 100644
--- a/src/fd.c
+++ b/src/fd.c
@@ -369,6 +369,7 @@ static void fd_dodelete(int fd, int do_close)
fdtab[fd].thread_mask = 0;
if (do_close) {
fdtab[fd].polled_mask = 0;
+ fdtab[fd].polled_mask_write = 0;
close(fd);
}
HA_SPIN_UNLOCK(FD_LOCK, &fdtab[fd].lock);
--
2.10.1.windows.1
Hi Pieter,

On Sun, Apr 15, 2018 at 10:55:34PM +0200, PiBa-NL wrote:
> Hi Willy,
>
> Sending a patch proposal after like 40 hours of looking through what happens
> and what event we might be missing, im now changing +-40 lines of code.. And
> actually not 'really' changing the events requested.. But rather kinda
> bringing the old 'previous state' comparison back..
>
> Lemme know if its okay like this, you would like the new variable to be
> renamed, the if/then/else restructured or just don't like it at all ;).
> In the last case please do give a hint about what you would like instead :).

Well to be clear, I'm pretty sure we're hiding the dust under the carpet
here, even if it fixes the problem in your case. What I need to do is to
actually understand why we end up in this situation. Your patch should
actually let me figure what conditions can lead to the different code
paths before and after the patch. At this point I really feel that having
two distinct bits for polled/read and polled/write is wrong, but I can't
explain why yet, it's just a feeling. However it's very likely that in
some cases we remove the bit in the polled_mask too early while we believe
it's still there, which could explain why your fix gets rid of it. Given
that kqueue uses two different events for reads and writes, it could also
explain why this issue only affects it and not the other ones.

I think that the change your code introduces is a different sequence when
R and W are registered together but one first then the other one. Probably
that the current code incorrectly detects this. I'll check with Olivier,
he has enough FreeBSD machines to test on and knows this area well enough.

Thanks a lot for your investigations!
Willy
Hi Willy,

Op 15-4-2018 om 23:08 schreef Willy Tarreau:
> Well to be clear, I'm pretty sure we're hiding the dust under the carpet
> here, even if it fixes the problem in your case. What I need to do is to
> actually understand why we end up in this situation.

Okay added a little more code/error logging to help understand whats
going on. Like below with the original code from master.
This error pops up with code '2': "[ENOENT] The event could not be found
to be modified or deleted."
This prevents the "EVFILT_READ, EV_ADD (7)" from taking effect and
reading the second browser request that includes the NTLM credentials..

Added logging:

    if (changes) {
        errno = -1;
        int x = kevent(kqueue_fd[tid], kev, changes, NULL, 0, NULL);
        int e = errno;
        fprintf(stdout, "    Events changed:%d result:%d err:%d\n",
changes, x, e);
    }

The result is like this:

Total: 3 (3 usable), will use kqueue.

Available filters :
    [TRACE] trace
    [COMP] compression
    [SPOE] spoe
Using kqueue() as the polling mechanism.
    EVFILT_READ, EV_ADD (4)
    EVFILT_READ, EV_ADD (5)
    Events changed:2 result:0 err:-1
KernelEvent kev for FD:(4) filter:-1
00000000:Syner.accept(0004)=0007 from [192.168.8.116:4096] ALPN=<none>
    EVFILT_READ, EV_ADD (7)
    Events changed:1 result:0 err:-1
KernelEvent kev for FD:(7) filter:-1
00000000:Syner.clireq[0007:ffffffff]: GET
/SynEnt/docs/HRMResourceCard.aspx?ID=7 HTTP/1.1
00000000:Syner.clihdr[0007:ffffffff]: Accept: text/html,
application/xhtml+xml, image/jxr, */*
00000000:Syner.clihdr[0007:ffffffff]: Accept-Language:
nl-NL,nl;q=0.8,en-GB;q=0.5,en;q=0.3
00000000:Syner.clihdr[0007:ffffffff]: User-Agent: Mozilla/5.0 (Windows
NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko
00000000:Syner.clihdr[0007:ffffffff]: Accept-Encoding: gzip, deflate
00000000:Syner.clihdr[0007:ffffffff]: Host: 192.168.8.93
00000000:Syner.clihdr[0007:ffffffff]: Connection: Keep-Alive
    EVFILT_READ, EV_DELETE (7)
    Events changed:1 result:0 err:-1
    EVFILT_WRITE, EV_ADD (8)
    Events changed:1 result:0 err:-1
KernelEvent kev for FD:(8) filter:-2
    EVFILT_READ, EV_ADD (8)
    EVFILT_WRITE, EV_DELETE (8)
    Events changed:2 result:0 err:-1
KernelEvent kev for FD:(8) filter:-1
KernelEvent kev for FD:(8) filter:-1
    EVFILT_READ, EV_DELETE (8)
    EVFILT_WRITE, EV_DELETE (8)
    Events changed:2 result:-1 err:2 <<<<<<<<<<< ERROR while deleting a
non existing event
    EVFILT_READ, EV_ADD (8)
    Events changed:1 result:0 err:-1
KernelEvent kev for FD:(8) filter:-1
00000000:Syner.srvrep[0007:0008]: HTTP/1.1 401 Unauthorized
00000000:Syner.srvhdr[0007:0008]: Content-Type: text/html
00000000:Syner.srvhdr[0007:0008]: Server: Microsoft-IIS/7.5
00000000:Syner.srvhdr[0007:0008]: WWW-Authenticate: NTLM
00000000:Syner.srvhdr[0007:0008]: WWW-Authenticate: Negotiate
00000000:Syner.srvhdr[0007:0008]: X-Powered-By: ASP.NET
00000000:Syner.srvhdr[0007:0008]: Date: Mon, 16 Apr 2018 00:12:47 GMT
00000000:Syner.srvhdr[0007:0008]: Content-Length: 1332
    EVFILT_READ, EV_ADD (8)
    EVFILT_WRITE, EV_DELETE (8)
    EVFILT_READ, EV_ADD (7)
    Events changed:3 result:-1 err:2 <<<<<<<<<<< ERROR while deleting a
non existing event

After this the KernelEvent for FD 7 that should read the second browser
request never happens.
I think we can conclude deleting events that don't exist is a bad thing.?
Ill leave further discussion about why and how to you and Oliver :).

Regards,
PiBa-NL (Pieter)
Hi Pieter,

On Mon, Apr 16, 2018 at 02:22:48AM +0200, PiBa-NL wrote:
>     EVFILT_READ, EV_ADD (8)
>     EVFILT_WRITE, EV_DELETE (8)
>     EVFILT_READ, EV_ADD (7)
>     Events changed:3 result:-1 err:2 <<<<<<<<<<< ERROR while deleting a non
> existing event
>
> After this the KernelEvent for FD 7 that should read the second browser
> request never happens.
> I think we can conclude deleting events that don't exist is a bad thing.?

Ah that's a very interesting discovery! Indeed, the main difference between
kqueue and other pollers is that kqueue batches changes (which is very cool),
but apparently these changes are not idempotent so we have to be extra
careful. Now it explains why not trying to delete an event (as in your
first patch) results in the read to work fine! It totally makes sense. So
I messed up on something there. It's possible that we'll indeed need a poll
mask per direction, but I really hope we won't, as it can be cumbersome.
Possibly that we can rely on the FD's state to know whether we know it or
not.

> Ill leave further discussion about why and how to you and Oliver :).

OK, thanks!

Willy
Sorry, only registered users may post in this forum.

Click here to login