SSL: double free on reload

Posted by Thierry Fournier 
Thierry Fournier
SSL: double free on reload
July 06, 2018 04:40PM
Hi list,

I caught a double free when reloading haproxy 1.8:

writev(2, [{"*** Error in `", 14}, {"/opt/o3-haproxy/sbin/haproxy", 28}, {"': ", 3}, {"double free or corruption (!prev)", 33}, {": 0x", 4}, {"000000001cec2ab0", 16}, {" ***\n", 5}], 7) = 103

Decoded:

*** Error in `/opt/o3-haproxy/sbin/haproxy': double free or corruption (!prev): 0x000000001cec2ab0 ***

Gdb says:

#0 0x00007f4bac88b067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f4bac88c448 in __GI_abort () at abort.c:89
#2 0x00007f4bac8c91b4 in __libc_message (do_abort=do_abort@entry=1,
fmt=fmt@entry=0x7f4bac9be210 "*** Error in `%s': %s: 0x%s ***\n")
at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007f4bac8ce98e in malloc_printerr (action=1,
str=0x7f4bac9be318 "double free or corruption (!prev)", ptr=<optimized out>) at malloc.c:4996
#4 0x00007f4bac8cf696 in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840
#5 0x000000000042af56 in ssl_sock_destroy_bind_conf (bind_conf=0x1d27e810) at src/ssl_sock.c:4819
#6 0x00000000004b1390 in deinit () at src/haproxy.c:2240
#7 0x000000000041b83c in main (argc=<optimized out>, argv=0x7ffc22f6b4d8) at src/haproxy.c:3094

I am using the latest version, 1.8.12.

Thierry


--
Thierry Fournier
Web Performance & Security Expert
m: +33 6 68 69 21 85 | e: thierry.fournier@ozon.io
w: http://www.ozon.io/ | b: http://blog.ozon.io/
Willy Tarreau
Re: SSL: double free on reload
July 16, 2018 08:10AM
Hi Thierry,

This one looks a bit strange. I looked into it a little and it corresponds
to the line "free(bind_conf->keys_ref->tlskeys);". Unfortunately, no other
line in the code appears to perform a free on this element, and when
passing through this code the keys_ref is destroyed and properly nulled. I
checked whether it was possible for this element not to be allocated and I
don't see how that could happen either. Thus I see only three possibilities:

- this element was duplicated and appears in multiple places (multiple list
elements), leading to a real double free

- there is a memory corruption somewhere, possibly resulting in this element
being corrupted rather than actually being the victim of a double free

- I can't read code and there is another free that I failed to detect.
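
To make the first possibility concrete, here is a minimal sketch of a diagnostic that counts how many list entries share one key reference. The struct and function names (keys_ref_sketch, bind_conf_sketch, count_shared_keys_refs) are hypothetical stand-ins and not HAProxy's real definitions. The point it illustrates: nulling the pointer in one entry does nothing for a second entry holding the same address, so a teardown loop that frees once per entry would call free() twice on the same memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not HAProxy's structures. */
struct keys_ref_sketch {
    unsigned char *tlskeys;            /* key material buffer */
};

struct bind_conf_sketch {
    struct keys_ref_sketch *keys_ref;  /* may be shared between entries */
};

/*
 * Count the entries whose keys_ref is also held by an earlier entry.
 * Each such entry would cause one extra free() in a teardown loop that
 * releases keys_ref once per bind_conf.
 */
size_t count_shared_keys_refs(const struct bind_conf_sketch *confs, size_t n)
{
    size_t shared = 0;

    assert(confs != NULL || n == 0);

    for (size_t i = 1; i < n; i++) {
        if (!confs[i].keys_ref)
            continue;
        for (size_t j = 0; j < i; j++) {
            if (confs[j].keys_ref == confs[i].keys_ref) {
                shared++;
                break;  /* count each aliasing entry only once */
            }
        }
    }
    return shared;
}
```

A non-zero result just before teardown would point to a real double free rather than memory corruption.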

Are you able to trigger this with a trivial config? Maybe it only happens
when certain features in your config are enabled?

willy
Janusz Dziemidowicz
Re: SSL: double free on reload
July 16, 2018 08:40AM
I've reported this some time ago :)
https://www.mail-archive.com/haproxy@formilux.org/msg30093.html

--
Janusz Dziemidowicz
Thierry Fournier
Re: SSL: double free on reload
July 16, 2018 09:20AM
Reproduced! Unfortunately, I can't reproduce it without systemd. Check the
tls-keys path: with a relative path, you must set the working directory in
the systemd unit file, or give the full path.

The bug seems to be linked to having multiple bind lines. The following config
makes no sense, but the bug happens (in my original config I use multiple
processes, and each bind line is associated with one process).

Maybe each bind line is duplicated in each process, the tls-keys reference is
common to all the lines, and a double free occurs when the second bind tries
to release the memory.
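
If the hypothesis above holds, one common way to make a shared object safe to release from several owners is reference counting. The following is only a sketch under that assumption: shared_keys, shared_keys_new, shared_keys_get and shared_keys_put are hypothetical names, and nothing here is taken from the HAProxy source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical shared key object; not HAProxy's real structure. */
struct shared_keys {
    unsigned char *tlskeys;  /* key material buffer */
    unsigned int refcnt;     /* number of bind lines using this object */
};

/* Allocate a new object owned by exactly one user. */
struct shared_keys *shared_keys_new(size_t len)
{
    struct shared_keys *k = calloc(1, sizeof(*k));

    if (!k)
        return NULL;
    k->tlskeys = calloc(1, len ? len : 1);
    if (!k->tlskeys) {
        free(k);
        return NULL;
    }
    k->refcnt = 1;
    return k;
}

/* Register one more user of an existing object. */
struct shared_keys *shared_keys_get(struct shared_keys *k)
{
    assert(k != NULL && k->refcnt > 0);
    k->refcnt++;
    return k;
}

/*
 * Drop one user and clear the caller's pointer.
 * Returns 1 if the memory was actually released, 0 otherwise.
 */
int shared_keys_put(struct shared_keys **kp)
{
    struct shared_keys *k = *kp;

    *kp = NULL;
    if (!k)
        return 0;
    assert(k->refcnt > 0);
    if (--k->refcnt > 0)
        return 0;  /* another bind line still uses it */
    free(k->tlskeys);
    free(k);
    return 1;
}
```

With this pattern the second bind line would call shared_keys_get() at parse time, and only the last shared_keys_put() during shutdown would reach free().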

I guess that systemd is not the cause of the crash, but if I start the process
with -Ws on the command line and send it kill -USR2, the bug is not triggered.

test.cfg:
---------

global

frontend frt
bind *:443 ssl crt default.pem tls-ticket-keys tls-keys
bind *:443 ssl crt default.pem tls-ticket-keys tls-keys

tls-keys
--------

WRGMXEZMeqZzeY7bJTLsfWvrlBKszxDuZ+2WlSP3YFOqUq4dbzBpH+8nvwforYej
b2dwxCxZsV02/8bmEv+q/QjMllu/4bOSCYFWn6CuTtwiQExG8SLYnwBMevOUjVpL
cOGgEy6YK4K3h8rS9jSEiu8xWjHP4iMT+IRhHkwYaKPmgwbmvARzvoPkMDnyw5gq


/lib/systemd/system/test.service:
---------------------------------
[Service]
LimitCORE=infinity
Environment="PIDFILE=/run/test.pid"
WorkingDirectory=/etc/o3-haproxy
ExecStart=/opt/o3-haproxy/sbin/haproxy -Ws -f test.cfg
ExecReload=/bin/kill -USR2 $MAINPID


Thierry
Willy Tarreau
Re: SSL: double free on reload
July 16, 2018 10:50AM
Ah thank you Janusz, and I notice that your report matches Thierry's second
e-mail very closely.

I'm CCing Nenad who added the tls-ticket-keys in case he has any idea
on the subject, based on how the bind line is initialized maybe.

thanks,
Willy
Nenad Merdanovic
Re: SSL: double free on reload
July 17, 2018 03:50AM
Hello,

Ugh, this was a long time ago. [FROM MEMORY] The element should not be
duplicated as far as I can remember. The references are stored in an
ebtree in order to prevent duplication and to provide a consistent view
when updated dynamically.
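
As a rough illustration of that kind of deduplication, the sketch below returns the existing entry when the same file name is registered twice. It is an assumption-laden stand-in: a plain linked list replaces the ebtree, and ref_node and ref_lookup_or_create are hypothetical names, not HAProxy functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical registry entry; a linked list stands in for the ebtree. */
struct ref_node {
    char *filename;
    struct ref_node *next;
};

/*
 * Return the entry registered for filename, creating it on first use.
 * Two callers passing the same name receive the same pointer.
 */
struct ref_node *ref_lookup_or_create(struct ref_node **head, const char *filename)
{
    struct ref_node *node;
    size_t len;

    assert(head != NULL && filename != NULL);

    /* Reuse an existing entry if this file was already registered. */
    for (node = *head; node; node = node->next)
        if (strcmp(node->filename, filename) == 0)
            return node;

    len = strlen(filename) + 1;
    node = malloc(sizeof(*node));
    if (!node)
        return NULL;
    node->filename = malloc(len);
    if (!node->filename) {
        free(node);
        return NULL;
    }
    memcpy(node->filename, filename, len);
    node->next = *head;
    *head = node;
    return node;
}
```

Note that deduplicating by name means two bind lines naming the same file end up holding the same pointer, so the teardown path has to release it exactly once.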

I just pulled HEAD and cannot reproduce this exact crash with either of
these configs. The "good" thing is that I do get a crash every time I
reload, with a different stack trace for each config.

One of them starts like:
#5 0x00007f271cce0847 in _int_free (av=0x7f271d015c40 <main_arena>,
p=0x562e01935460, have_lock=<optimized out>) at malloc.c:4362
size = 195488
fb = <optimized out>
nextchunk = 0x562e01944f30
nextsize = 131280
nextinuse = <optimized out>
prevsize = <optimized out>
bck = <optimized out>
fwd = <optimized out>
__PRETTY_FUNCTION__ = "_int_free"
#6 0x0000562e0011a7bb in deinit_pollers () at src/fd.c:554
bp = <optimized out>
p = <optimized out>
#7 0x0000562e00024c77 in main (argc=<optimized out>,
argv=0x7fffc630ec38) at src/haproxy.c:3095
err = <optimized out>
retry = <optimized out>
limit = {rlim_cur = 4012, rlim_max = 4012}
errmsg =
"\000\000\000\000\000\000\000\000\000\377\330q\356\336\342\345\370\351\060\306\377\177\000\000\000\t\216\001.V\000\000\230\351\060\306\377\177\000\000\261\000\000\000\000\000\000\000\262\000\000\000\000\000\000\000\330\352\060\306\377\177\000\000\370\351\060\306\377\177\000\000\346z\r\000.V\000\000y+\025\000.V\000\000\330\352\060\306\377\177\000\000\000\000\000"
pidfd = <optimized out>

I'll poke around it more tomorrow as it's quite late here.

Regards,
Nenad
