Welcome! Log In Create A New Profile

Advanced

haproxy requests hanging since b0bdae7

Posted by Patrick Hemmer 
Patrick Hemmer
haproxy requests hanging since b0bdae7
June 05, 2018 11:10PM
It seems that commit b0bdae7 has completely broken haproxy for me. When
I send a request to haproxy, it just sits there. The backend server
receives nothing, and the client waits for a response.
Running with debug enabled I see just a single line:
00000000:f1.accept(0004)=0005 from [127.0.0.1:63663] ALPN=<none>

commit b0bdae7b88d53cf8f18af0deab6d4c29ac25b7f9 (refs/bisect/bad)
Author: Olivier Houchard <[email protected]>
Date: Fri May 18 18:45:28 2018 +0200

MAJOR: tasks: Introduce tasklets.

Introduce tasklets, lightweight tasks. They have no notion of priority,
they are just run as soon as possible, and will probably be used for I/O
later.

For the moment they're used to replace the temporary thread-local list
that was used in the scheduler. The first part of the struct is common
with tasks so that tasks can be cast to tasklets and queued in this
list.
Once a task is in the tasklet list, it has its leaf_p set to 0x1 so that
it cannot accidently be confused as not in the queue.

Pure tasklets are identifiable by their nice value of -32768 (which is
normally not possible).

Issue reproducible with a very simple config:

defaults
mode http
frontend f1
bind :8081
default_backend b1
backend b1
server s1 127.0.0.1:8081

Compiled on OS-X with only a single make variable of TARGET=osx
Compiler: clang-900.0.39.2


-Patrick
Olivier Houchard
Re: haproxy requests hanging since b0bdae7
June 06, 2018 02:10PM
Hi Patrick,

On Tue, Jun 05, 2018 at 05:02:41PM -0400, Patrick Hemmer wrote:
> It seems that commit b0bdae7 has completely broken haproxy for me. When
> I send a request to haproxy, it just sits there. The backend server
> receives nothing, and the client waits for a response.
> Running with debug enabled I see just a single line:
> 00000000:f1.accept(0004)=0005 from [127.0.0.1:63663] ALPN=<none>
>
> commit b0bdae7b88d53cf8f18af0deab6d4c29ac25b7f9 (refs/bisect/bad)
> Author: Olivier Houchard <[email protected]>
> Date: Fri May 18 18:45:28 2018 +0200
>
> MAJOR: tasks: Introduce tasklets.
>
> Introduce tasklets, lightweight tasks. They have no notion of priority,
> they are just run as soon as possible, and will probably be used for I/O
> later.
>
> For the moment they're used to replace the temporary thread-local list
> that was used in the scheduler. The first part of the struct is common
> with tasks so that tasks can be cast to tasklets and queued in this
> list.
> Once a task is in the tasklet list, it has its leaf_p set to 0x1 so that
> it cannot accidently be confused as not in the queue.
>
> Pure tasklets are identifiable by their nice value of -32768 (which is
> normally not possible).
>
> Issue reproducible with a very simple config:
>
> defaults
> mode http
> frontend f1
> bind :8081
> default_backend b1
> backend b1
> server s1 127.0.0.1:8081
>
> Compiled on OS-X with only a single make variable of TARGET=osx
> Compiler: clang-900.0.39.2
>
>

Oops, seems I broke haproxy when built without thread support.
The attached patch should fix both the issues you reported, can you confirm
it ?

Thanks a lot !

Olivier
From d3c0abb18b44a942dcc7ead072be84f323184d0f Mon Sep 17 00:00:00 2001
From: Olivier Houchard <[email protected]>
Date: Wed, 6 Jun 2018 14:01:08 +0200
Subject: [PATCH] BUG/MEDIUM: tasks: Use the local runqueue when building
without threads.

When building without threads enabled, instead of just using the global
runqueue, just use the local runqueue associated with the only thread, as
that's what is now expected for a single thread in prcoess_runnable_tasks().
This should fix haproxy when built without threads.
---
include/proto/task.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/proto/task.h b/include/proto/task.h
index 0c2c5f28c..dc8a54481 100644
--- a/include/proto/task.h
+++ b/include/proto/task.h
@@ -133,7 +133,7 @@ static inline void task_wakeup(struct task *t, unsigned int f)
else
root = &rqueue;
#else
- struct eb_root *root = &rqueue;
+ struct eb_root *root = &rqueue_local[tid];
#endif

state = HA_ATOMIC_OR(&t->state, f);
--
2.14.3
Willy Tarreau
Re: haproxy requests hanging since b0bdae7
June 06, 2018 02:20PM
On Wed, Jun 06, 2018 at 02:04:35PM +0200, Olivier Houchard wrote:
> When building without threads enabled, instead of just using the global
> runqueue, just use the local runqueue associated with the only thread, as
> that's what is now expected for a single thread in prcoess_runnable_tasks().

Just out of curiosity, shouldn't we #ifdef out the global runqueue
definition when running without threads in order to catch such cases
in the future ?

Willy
Olivier Houchard
Re: haproxy requests hanging since b0bdae7
June 06, 2018 02:30PM
Hi Willy,

On Wed, Jun 06, 2018 at 02:09:01PM +0200, Willy Tarreau wrote:
> On Wed, Jun 06, 2018 at 02:04:35PM +0200, Olivier Houchard wrote:
> > When building without threads enabled, instead of just using the global
> > runqueue, just use the local runqueue associated with the only thread, as
> > that's what is now expected for a single thread in prcoess_runnable_tasks().
>
> Just out of curiosity, shouldn't we #ifdef out the global runqueue
> definition when running without threads in order to catch such cases
> in the future ?
>

I think this is actually a good idea.
My only concern is it adds quite a bit of #ifdef USE_THREAD, see the attached
patch.

Regards,

Olivier
From baeb750ed13307010bfef39de92ec9bb8af54022 Mon Sep 17 00:00:00 2001
From: Olivier Houchard <[email protected]>
Date: Wed, 6 Jun 2018 14:22:03 +0200
Subject: [PATCH] MINOR: tasks: Don't define rqueue if we're building without
threads.

To make sure we don't inadvertently insert task in the global runqueue,
while only the local runqueue is used without threads, make its definition
and usage conditional on USE_THREAD.
---
include/proto/task.h | 2 ++
src/task.c | 28 +++++++++++++++++++++++++---
2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/include/proto/task.h b/include/proto/task.h
index dc8a54481..266246098 100644
--- a/include/proto/task.h
+++ b/include/proto/task.h
@@ -93,7 +93,9 @@ extern struct pool_head *pool_head_tasklet;
extern struct pool_head *pool_head_notification;
extern THREAD_LOCAL struct task *curr_task; /* task currently running or NULL */
extern THREAD_LOCAL struct eb32sc_node *rq_next; /* Next task to be potentially run */
+#ifdef USE_THREAD
extern struct eb_root rqueue; /* tree constituting the run queue */
+#endif
extern struct eb_root rqueue_local[MAX_THREADS]; /* tree constituting the per-thread run queue */
extern struct list task_list[MAX_THREADS]; /* List of tasks to be run, mixing tasks and tasklets */
extern int task_list_size[MAX_THREADS]; /* Number of task sin the task_list */
diff --git a/src/task.c b/src/task.c
index 16c723230..c961725a1 100644
--- a/src/task.c
+++ b/src/task.c
@@ -49,9 +49,11 @@ __decl_hathreads(HA_SPINLOCK_T __attribute__((aligned(64))) rq_lock); /* spin lo
__decl_hathreads(HA_SPINLOCK_T __attribute__((aligned(64))) wq_lock); /* spin lock related to wait queue */

static struct eb_root timers; /* sorted timers tree */
+#ifdef USE_THREAD
struct eb_root rqueue; /* tree constituting the run queue */
-struct eb_root rqueue_local[MAX_THREADS]; /* tree constituting the per-thread run queue */
static int global_rqueue_size; /* Number of element sin the global runqueue */
+#endif
+struct eb_root rqueue_local[MAX_THREADS]; /* tree constituting the per-thread run queue */
static int rqueue_size[MAX_THREADS]; /* Number of elements in the per-thread run queue */
static unsigned int rqueue_ticks; /* insertion count */

@@ -68,10 +70,13 @@ void __task_wakeup(struct task *t, struct eb_root *root)
void *expected = NULL;
int *rq_size;

+#ifdef USE_THREAD
if (root == &rqueue) {
rq_size = &global_rqueue_size;
HA_SPIN_LOCK(TASK_RQ_LOCK, &rq_lock);
- } else {
+ } else
+#endif
+ {
int nb = root - &rqueue_local[0];
rq_size = &rqueue_size[nb];
}
@@ -80,8 +85,10 @@ void __task_wakeup(struct task *t, struct eb_root *root)
*/
redo:
if (unlikely(!HA_ATOMIC_CAS(&t->rq.node.leaf_p, &expected, (void *)0x1))) {
+#ifdef USE_THREAD
if (root == &rqueue)
HA_SPIN_UNLOCK(TASK_RQ_LOCK, &rq_lock);
+#endif
return;
}
/* There's a small race condition, when running a task, the thread
@@ -104,8 +111,10 @@ redo:
state = (volatile unsigned short)(t->state);
if (unlikely(state != 0 && !(state & TASK_RUNNING)))
goto redo;
+#ifdef USE_THREAD
if (root == &rqueue)
HA_SPIN_UNLOCK(TASK_RQ_LOCK, &rq_lock);
+#endif
return;
}
HA_ATOMIC_ADD(&tasks_run_queue, 1);
@@ -124,10 +133,13 @@ redo:
}

eb32sc_insert(root, &t->rq, t->thread_mask);
+#ifdef USE_THREAD
if (root == &rqueue) {
global_rqueue_size++;
HA_SPIN_UNLOCK(TASK_RQ_LOCK, &rq_lock);
- } else {
+ } else
+#endif
+ {
int nb = root - &rqueue_local[0];

rqueue_size[nb]++;
@@ -239,7 +251,9 @@ void process_runnable_tasks()
{
struct task *t;
int max_processed;
+#ifdef USE_THREAD
uint64_t average = 0;
+#endif

tasks_run_queue_cur = tasks_run_queue; /* keep a copy for reporting */
nb_tasks_cur = nb_tasks;
@@ -253,6 +267,7 @@ void process_runnable_tasks()
return;
}

+#ifdef USE_THREAD
average = tasks_run_queue / global.nbthread;

/* Get some elements from the global run queue and put it in the
@@ -284,6 +299,7 @@ void process_runnable_tasks()
__task_unlink_rq(t);
__task_wakeup(t, &rqueue_local[tid]);
}
+#endif

HA_SPIN_UNLOCK(TASK_RQ_LOCK, &rq_lock);
} else {
@@ -361,8 +377,12 @@ void process_runnable_tasks()
if (t != NULL) {
state = HA_ATOMIC_AND(&t->state, ~TASK_RUNNING);
if (state)
+#ifdef USE_THREAD
__task_wakeup(t, (t->thread_mask == tid_bit) ?
&rqueue_local[tid] : &rqueue);
+#else
+ __task_wakeup(t, &rqueue_local[tid]);
+#endif
else
task_queue(t);
}
@@ -382,7 +402,9 @@ int init_task()
int i;

memset(&timers, 0, sizeof(timers));
+#ifdef USE_THREAD
memset(&rqueue, 0, sizeof(rqueue));
+#endif
HA_SPIN_INIT(&wq_lock);
HA_SPIN_INIT(&rq_lock);
for (i = 0; i < MAX_THREADS; i++) {
--
2.14.3
Willy Tarreau
Re: haproxy requests hanging since b0bdae7
June 06, 2018 02:40PM
On Wed, Jun 06, 2018 at 02:24:29PM +0200, Olivier Houchard wrote:
> > Just out of curiosity, shouldn't we #ifdef out the global runqueue
> > definition when running without threads in order to catch such cases
> > in the future ?
> >
>
> I think this is actually a good idea.
> My only concern is it adds quite a bit of #ifdef USE_THREAD, see the attached
> patch.

Indeed. A few of them may be replaced with the magical __decl_hathread()
macro but that will remain marginal. Well, it's not that bad either, just
let me know what you prefer, I'm fine with both options.

Willy
Patrick Hemmer
Re: haproxy requests hanging since b0bdae7
June 06, 2018 04:10PM
On 2018/6/6 08:24, Olivier Houchard wrote:
> Hi Willy,
>
> On Wed, Jun 06, 2018 at 02:09:01PM +0200, Willy Tarreau wrote:
>> On Wed, Jun 06, 2018 at 02:04:35PM +0200, Olivier Houchard wrote:
>>> When building without threads enabled, instead of just using the global
>>> runqueue, just use the local runqueue associated with the only thread, as
>>> that's what is now expected for a single thread in prcoess_runnable_tasks().
>> Just out of curiosity, shouldn't we #ifdef out the global runqueue
>> definition when running without threads in order to catch such cases
>> in the future ?
>>
> I think this is actually a good idea.
> My only concern is it adds quite a bit of #ifdef USE_THREAD, see the attached
> patch.
>
> Regards,
>
> Olivier
With this patch I'm getting:

include/proto/task.h:138:26: error: use of undeclared identifier 'rqueue'

With the previous patch, both reported issues are resolved.

-Patrick
Olivier Houchard
Re: haproxy requests hanging since b0bdae7
June 06, 2018 04:30PM
On Wed, Jun 06, 2018 at 10:06:30AM -0400, Patrick Hemmer wrote:
>
>
> On 2018/6/6 08:24, Olivier Houchard wrote:
> > Hi Willy,
> >
> > On Wed, Jun 06, 2018 at 02:09:01PM +0200, Willy Tarreau wrote:
> >> On Wed, Jun 06, 2018 at 02:04:35PM +0200, Olivier Houchard wrote:
> >>> When building without threads enabled, instead of just using the global
> >>> runqueue, just use the local runqueue associated with the only thread, as
> >>> that's what is now expected for a single thread in prcoess_runnable_tasks().
> >> Just out of curiosity, shouldn't we #ifdef out the global runqueue
> >> definition when running without threads in order to catch such cases
> >> in the future ?
> >>
> > I think this is actually a good idea.
> > My only concern is it adds quite a bit of #ifdef USE_THREAD, see the attached
> > patch.
> >
> > Regards,
> >
> > Olivier
> With this patch I'm getting:
>
> include/proto/task.h:138:26: error: use of undeclared identifier 'rqueue'
>
> With the previous patch, both reported issues are resolved.
>

Hi Patrick,

The last patch depended on the first one, so without it that failure is
expected.

Thanks a lot for reporting and testing.

Willy, I think you can push both the patches.

Regards,

Olivier
Willy Tarreau
Re: haproxy requests hanging since b0bdae7
June 06, 2018 04:40PM
On Wed, Jun 06, 2018 at 04:22:22PM +0200, Olivier Houchard wrote:
> The last patch depended on the first one, so without it that failure is
> expected.

and confirms the benefit of catching such cases at build time :-)

> Thanks a lot for reporting and testing.
>
> Willy, I think you can push both the patches.

OK now merged, thank you!

Willy
Sorry, only registered users may post in this forum.

Click here to login