1.8.0 stuck in write(threads_sync_pipe[1], "S", 1)

Posted by Максим Куприянов 
Максим Куприянов
1.8.0 stuck in write(threads_sync_pipe[1], "S", 1)
December 02, 2017 08:30AM
Hi!

Overnight, all of my haproxy 1.8.0 instances stopped responding. They didn't
forward traffic and didn't even answer over the socket. They're compiled with
thread support, but threads are not enabled in their configs (no nbthread
option). All of them are stuck in the same place:
# strace -f -p 831919
Process 831919 attached
write(2, "S", 1

Here's some debug output (from a single-threaded instance):
(gdb) bt
#0 0x00007fef9bd2a330 in __write_nocancel () at
../sysdeps/unix/syscall-template.S:81
#1 0x0000558dea62275b in thread_want_sync () at src/hathreads.c:74
#2 0x0000558dea58f548 in srv_register_update ([email protected]=0x558ded691e30)
at src/server.c:2596
#3 0x0000558dea5922f7 in server_recalc_eweight ([email protected]=0x558ded691e30)
at src/server.c:1151
#4 0x0000558dea5c3028 in server_warmup (t=0x558def513120) at
src/checks.c:1448
#5 0x0000558dea619216 in process_runnable_tasks () at src/task.c:229
#6 0x0000558dea5cf237 in run_poll_loop () at src/haproxy.c:2326
#7 run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:2375
#8 0x0000558dea53b6fe in main (argc=<optimized out>, argv=0x7ffe03a880d8)
at src/haproxy.c:2910

(gdb) bt full
#0 0x00007fef9bd2a330 in __write_nocancel () at
../sysdeps/unix/syscall-template.S:81
No locals.
#1 0x0000558dea62275b in thread_want_sync () at src/hathreads.c:74
No locals.
#2 0x0000558dea58f548 in srv_register_update ([email protected]=0x558ded691e30)
at src/server.c:2596
No locals.
#3 0x0000558dea5922f7 in server_recalc_eweight ([email protected]=0x558ded691e30)
at src/server.c:1151
px = <optimized out>
w = <optimized out>
#4 0x0000558dea5c3028 in server_warmup (t=0x558def513120) at
src/checks.c:1448
s = 0x558ded691e30
#5 0x0000558dea619216 in process_runnable_tasks () at src/task.c:229
t = 0x558def513120
i = <optimized out>
max_processed = 173
rq_next = <optimized out>
local_tasks = {0xcbe, 0x558dea61d5d7 <mux_pt_wake+87>,
0x558e1f68b5e0, 0x402770, 0x558dea894350 <fdtab>, 0x558e19e7f4e0, 0x0,
0x558dea539f47 <fd_stop_send+104>, 0x202310, 0x558e19e7f4e0,
0x558dea60e979 <conn_update_xprt_polling+73>, 0x7ffe03a87d80, 0x500000004,
0x558dea61e6e2 <fd_process_cached_events+610>, 0x7ffe03a87d80,
0x7ffe03a87d80}
local_tasks_count = <optimized out>
final_tasks_count = <optimized out>
#6 0x0000558dea5cf237 in run_poll_loop () at src/haproxy.c:2326
next = <optimized out>
#7 run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:2375
ptif = <optimized out>
ptdf = <optimized out>
#8 0x0000558dea53b6fe in main (argc=<optimized out>, argv=0x7ffe03a880d8)
at src/haproxy.c:2910
tids = 0x558def33ad30
threads = 0x558def46d050
i = <optimized out>
err = <optimized out>
retry = <optimized out>
limit = {rlim_cur = 42419, rlim_max = 42419}
errmsg = "\000\000\000\000\000\000\000\000n\000\000\000w", '\000'
<repeats 11 times>,
"\017\177\250\003\376\177\000\000\260\033e\354\215U\000\000`W&\233\357\177\000\000|\000\000\000\000\000\000\000
\000\000\000\000\000\000\000P
e\354\215U\000\000`W&\233\357\177\000\000\260o\001\000\000\000\000\000\200\000\000\000\000\000\000\000P
", <incomplete sequence \354>
pidfd = <optimized out>

(gdb) thread
[Current thread is 1 (Thread 0x7fef9c59d980 (LWP 831919))]

[email protected]:/var/tmp# /usr/sbin/haproxy -vv
HA-Proxy version 1.8.0-4 2017/11/29
Copyright 2000-2017 Willy Tarreau <willy@haproxy.org>

Build options :
TARGET = linux2628
CPU = generic
CC = gcc
CFLAGS = -g -O2 -fPIE -fstack-protector --param=ssp-buffer-size=4
-Wformat -Werror=format-security -D_FORTIFY_SOURCE=2
OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_THREAD=1
USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1 USE_TFO=1 USE_NS=1

Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
Running on OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.31 2012-07-06
Running on PCRE version : 8.31 2012-07-06
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported : identity("identity"),
deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace

--
Best regards,
Maksim Kupriianov
Christopher Faulet
Re: 1.8.0 stuck in write(threads_sync_pipe[1], "S", 1)
December 02, 2017 10:30AM
On 02/12/2017 at 08:23, Максим Куприянов wrote:
> Hi!
>
> Overnight, all of my haproxy 1.8.0 instances stopped responding. They
> didn't forward traffic and didn't even answer over the socket. They're
> compiled with thread support, but threads are not enabled in their configs
> (no nbthread option). All of them are stuck in the same place:
> # strace -f -p 831919
> Process 831919 attached
> write(2, "S", 1
> Here's some debug output (from a single-threaded instance):
> (gdb) bt
> #0  0x00007fef9bd2a330 in __write_nocancel () at
> ../sysdeps/unix/syscall-template.S:81
> #1  0x0000558dea62275b in thread_want_sync () at src/hathreads.c:74
> #2  0x0000558dea58f548 in srv_register_update
> ([email protected]=0x558ded691e30) at src/server.c:2596
> #3  0x0000558dea5922f7 in server_recalc_eweight
> ([email protected]=0x558ded691e30) at src/server.c:1151
> #4  0x0000558dea5c3028 in server_warmup (t=0x558def513120) at
> src/checks.c:1448
> #5  0x0000558dea619216 in process_runnable_tasks () at src/task.c:229
> #6  0x0000558dea5cf237 in run_poll_loop () at src/haproxy.c:2326
> #7  run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:2375
> #8  0x0000558dea53b6fe in main (argc=<optimized out>,
> argv=0x7ffe03a880d8) at src/haproxy.c:2910
>

Hi,

Thanks for your detailed report. There is a bug in the sync-point when
the same thread requests a synchronization several times. It is also
easier to hit this bug with only one thread.

Could you check the attached patch? It should fix the bug.
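
To illustrate the failure mode described here, below is a minimal sketch. It is not the attached patch and not HAProxy's actual code. It assumes that each sync request writes one byte to a pipe. If the same thread keeps requesting a sync before the pipe is drained, the pipe eventually fills up and a blocking write() hangs, which would match the strace output in the report. All names (sync_pipe, sync_pending, want_sync_guarded and the other helpers) are hypothetical. The write end is made non-blocking only so that the demo reports a full pipe instead of hanging.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-ins for a sync pipe and a "request pending" flag. */
static int sync_pipe[2] = { -1, -1 };
static int sync_pending = 0;

/* Create the pipe. The write end is set non-blocking so that a full pipe
 * shows up as a failed write() in this demo rather than as a hang. */
int sync_init(void)
{
	int fl;

	if (pipe(sync_pipe) < 0)
		return -1;
	fl = fcntl(sync_pipe[1], F_GETFL);
	if (fl < 0 || fcntl(sync_pipe[1], F_SETFL, fl | O_NONBLOCK) < 0)
		return -1;
	sync_pending = 0;
	return 0;
}

/* Unguarded request: writes one byte on every call.
 * Returns 0 on success, -1 when the byte could not be written. */
int want_sync_unguarded(void)
{
	assert(sync_pipe[1] >= 0); /* sync_init() must have succeeded */
	return write(sync_pipe[1], "S", 1) == 1 ? 0 : -1;
}

/* Guarded request: writes at most one byte per pending sync, so repeated
 * requests from the same thread cannot fill the pipe. */
int want_sync_guarded(void)
{
	if (sync_pending)
		return 0;
	sync_pending = 1;
	return want_sync_unguarded();
}

/* Sync point reached: consume the wake-up byte and clear the flag. */
int sync_done(void)
{
	char c;

	if (!sync_pending)
		return 0;
	sync_pending = 0;
	return read(sync_pipe[0], &c, 1) == 1 ? 0 : -1;
}

/* Issue unguarded requests until the pipe refuses one; returns how many
 * requests were accepted before that happened. */
long fill_unguarded(void)
{
	long n = 0;

	while (want_sync_unguarded() == 0)
		n++;
	return n;
}
```

Calling want_sync_guarded() any number of times between two sync points leaves a single byte in the pipe. fill_unguarded() keeps writing until the pipe is full (its capacity is system dependent), and with a blocking descriptor the next write() would hang at that point.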

--
Christopher Faulet
Максим Куприянов
Re: 1.8.0 stuck in write(threads_sync_pipe[1], "S", 1)
December 02, 2017 11:30AM
Hi!

Thank you for such a quick response. I'll apply the patch and leave one
instance of 1.8 under load until Monday. Then I'll write back to you.
Willy Tarreau
Re: 1.8.0 stuck in write(threads_sync_pipe[1], "S", 1)
December 02, 2017 02:40PM
Hi Christopher,

On Sat, Dec 02, 2017 at 10:24:11AM +0100, Christopher Faulet wrote:
> Thanks for your detailed report. There is a bug in the sync-point when the
> same thread requests a synchronization several times. It is also easier to
> hit this bug with only one thread.
>
> Could you check the attached patch? It should fix the bug.

Interesting one. I've taken it because even if I don't know yet whether
it fixes Maxim's bug, at least it fixes one :-)

Thanks,
Willy
Максим Куприянов
Re: 1.8.0 stuck in write(threads_sync_pipe[1], "S", 1)
December 04, 2017 11:20AM
Hi!

Everything seems fine. Haproxy is still alive, so your patch solves the
problem.

Thank you!
Maxim


2017-12-02 13:22 GMT+03:00 Максим Куприянов <[email protected]>:

> Hi!
>
> Thank you for such a quick response. I'll apply the patch and leave one
> instance of 1.8 under load until Monday. Then I'll write back to you.
>
Christopher Faulet
Re: 1.8.0 stuck in write(threads_sync_pipe[1], "S", 1)
December 04, 2017 11:20AM
On 04/12/2017 at 11:12, Максим Куприянов wrote:
> Hi!
>
> Everything seems fine. Haproxy is still alive, so your patch solves the
> problem.
>
> Thank you!

Nice, thanks for your report.

--
Christopher Faulet