Welcome! Log In Create A New Profile

Advanced

SSL: double free on reload

Posted by Thierry Fournier 
Thierry Fournier
SSL: double free on reload
July 06, 2018 04:40PM
Hi list,

I caught a double-free whien I reload haproxy-1.8:

writev(2, [{"*** Error in `", 14}, {"/opt/o3-haproxy/sbin/haproxy", 28}, {"': ", 3}, {"double free or corruption (!prev)", 33}, {": 0x", 4}, {"000000001cec2ab0", 16}, {" ***\n", 5}], 7) = 103

Decoded:

*** Error in `/opt/o3-haproxy/sbin/haproxy': double free or corruption (!prev): 0x000000001cec2ab0 ***

Gdb says:

#0 0x00007f4bac88b067 in __GI_raise ([email protected]=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f4bac88c448 in __GI_abort () at abort.c:89
#2 0x00007f4bac8c91b4 in __libc_message ([email protected]=1,
[email protected]=0x7f4bac9be210 "*** Error in `%s': %s: 0x%s ***\n")
at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007f4bac8ce98e in malloc_printerr (action=1,
str=0x7f4bac9be318 "double free or corruption (!prev)", ptr=<optimized out>) at malloc.c:4996
#4 0x00007f4bac8cf696 in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840
#5 0x000000000042af56 in ssl_sock_destroy_bind_conf (bind_conf=0x1d27e810) at src/ssl_sock.c:4819
#6 0x00000000004b1390 in deinit () at src/haproxy.c:2240
#7 0x000000000041b83c in main (argc=<optimized out>, argv=0x7ffc22f6b4d8) at src/haproxy.c:3094

I use the last 1.8.12 version.

Thierry


--
Thierry Fournier
Web Performance & Security Expert
m: +33 6 68 69 21 85 | e: thierry.fournier@ozon.io
w: http://www.ozon.io/ | b: http://blog.ozon.io/
Willy Tarreau
Re: SSL: double free on reload
July 16, 2018 08:10AM
Hi Thierry,

On Fri, Jul 06, 2018 at 04:28:22PM +0200, Thierry Fournier wrote:
> Hi list,
>
> I caught a double-free whien I reload haproxy-1.8:
>
> writev(2, [{"*** Error in `", 14}, {"/opt/o3-haproxy/sbin/haproxy", 28}, {"': ", 3}, {"double free or corruption (!prev)", 33}, {": 0x", 4}, {"000000001cec2ab0", 16}, {" ***\n", 5}], 7) = 103
>
> Decoded:
>
> *** Error in `/opt/o3-haproxy/sbin/haproxy': double free or corruption (!prev): 0x000000001cec2ab0 ***
>
> Gdb says:
>
> #0 0x00007f4bac88b067 in __GI_raise ([email protected]=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1 0x00007f4bac88c448 in __GI_abort () at abort.c:89
> #2 0x00007f4bac8c91b4 in __libc_message ([email protected]=1,
> [email protected]=0x7f4bac9be210 "*** Error in `%s': %s: 0x%s ***\n")
> at ../sysdeps/posix/libc_fatal.c:175
> #3 0x00007f4bac8ce98e in malloc_printerr (action=1,
> str=0x7f4bac9be318 "double free or corruption (!prev)", ptr=<optimized out>) at malloc.c:4996
> #4 0x00007f4bac8cf696 in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840
> #5 0x000000000042af56 in ssl_sock_destroy_bind_conf (bind_conf=0x1d27e810) at src/ssl_sock.c:4819
> #6 0x00000000004b1390 in deinit () at src/haproxy.c:2240
> #7 0x000000000041b83c in main (argc=<optimized out>, argv=0x7ffc22f6b4d8) at src/haproxy.c:3094
>
> I use the last 1.8.12 version.

This one looks a bit strange. I looked at it a little bit and it corresponds
to the line "free(bind_conf->keys_ref->tlskeys);". Unfortunately, there is no
other line in the code appearing to perfom a free on this element, and when
passing through this code the key_ref is destroyed and properly nulled. I
checked if it was possible for this element not to be allocated and I don't
see how that could happen either. Thus I'm seeing only three possibilities :

- this element was duplicated and appears at multiple places (multiple list
elements) leading to a real double free

- there is a memory corruption somewhere possibly resulting in this element
being corrupted and not in fact victim of a double free

- I can't read code and there is another free that I failed to detect.

Are you able to trigger this on a trivial config ? Maybe it only happens
when certain features you have in your config are enabled ?

willy
Janusz Dziemidowicz
Re: SSL: double free on reload
July 16, 2018 08:40AM
pon., 16 lip 2018 o 08:02 Willy Tarreau <[email protected]> napisaƂ(a):
> This one looks a bit strange. I looked at it a little bit and it corresponds
> to the line "free(bind_conf->keys_ref->tlskeys);". Unfortunately, there is no
> other line in the code appearing to perfom a free on this element, and when
> passing through this code the key_ref is destroyed and properly nulled. I
> checked if it was possible for this element not to be allocated and I don't
> see how that could happen either. Thus I'm seeing only three possibilities :
>
> - this element was duplicated and appears at multiple places (multiple list
> elements) leading to a real double free
>
> - there is a memory corruption somewhere possibly resulting in this element
> being corrupted and not in fact victim of a double free
>
> - I can't read code and there is another free that I failed to detect.
>
> Are you able to trigger this on a trivial config ? Maybe it only happens
> when certain features you have in your config are enabled ?

I've reported this some time ago :)
https://www.mail-archive.com/[email protected]/msg30093.html

--
Janusz Dziemidowicz
Thierry Fournier
Re: SSL: double free on reload
July 16, 2018 09:20AM
On Mon, 16 Jul 2018 08:00:48 +0200
Willy Tarreau <[email protected]> wrote:

> Hi Thierry,
>
> On Fri, Jul 06, 2018 at 04:28:22PM +0200, Thierry Fournier wrote:
> > Hi list,
> >
> > I caught a double-free whien I reload haproxy-1.8:
> >
> > writev(2, [{"*** Error in `", 14}, {"/opt/o3-haproxy/sbin/haproxy", 28}, {"': ", 3}, {"double free or corruption (!prev)", 33}, {": 0x", 4}, {"000000001cec2ab0", 16}, {" ***\n", 5}], 7) = 103
> >
> > Decoded:
> >
> > *** Error in `/opt/o3-haproxy/sbin/haproxy': double free or corruption (!prev): 0x000000001cec2ab0 ***
> >
> > Gdb says:
> >
> > #0 0x00007f4bac88b067 in __GI_raise ([email protected]=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> > #1 0x00007f4bac88c448 in __GI_abort () at abort.c:89
> > #2 0x00007f4bac8c91b4 in __libc_message ([email protected]=1,
> > [email protected]=0x7f4bac9be210 "*** Error in `%s': %s: 0x%s ***\n")
> > at ../sysdeps/posix/libc_fatal.c:175
> > #3 0x00007f4bac8ce98e in malloc_printerr (action=1,
> > str=0x7f4bac9be318 "double free or corruption (!prev)", ptr=<optimized out>) at malloc.c:4996
> > #4 0x00007f4bac8cf696 in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840
> > #5 0x000000000042af56 in ssl_sock_destroy_bind_conf (bind_conf=0x1d27e810) at src/ssl_sock.c:4819
> > #6 0x00000000004b1390 in deinit () at src/haproxy.c:2240
> > #7 0x000000000041b83c in main (argc=<optimized out>, argv=0x7ffc22f6b4d8) at src/haproxy.c:3094
> >
> > I use the last 1.8.12 version.
>
> This one looks a bit strange. I looked at it a little bit and it corresponds
> to the line "free(bind_conf->keys_ref->tlskeys);". Unfortunately, there is no
> other line in the code appearing to perfom a free on this element, and when
> passing through this code the key_ref is destroyed and properly nulled. I
> checked if it was possible for this element not to be allocated and I don't
> see how that could happen either. Thus I'm seeing only three possibilities :
>
> - this element was duplicated and appears at multiple places (multiple list
> elements) leading to a real double free
>
> - there is a memory corruption somewhere possibly resulting in this element
> being corrupted and not in fact victim of a double free
>
> - I can't read code and there is another free that I failed to detect.
>
> Are you able to trigger this on a trivial config ? Maybe it only happens
> when certain features you have in your config are enabled ?


Reproduced ! unfortunately, I can't reproduce it without systemd. Check the
tls-keys path. With relative path, you must force the start path in the
systemd config file, or give the fullpath.

The bug seems to be linked with multiple bind line. The followng has no sense,
but the bug appens (on by original conf, I use multi process, avec each bind
line is associated with one process).

Maybe each bind line is duplicated on each process, the tls-key is commun for
each lines, and double-free when the second bind try to release memory.

I guess that systemd is not a cause of the crash, but if I start the process
with -Ws on command line, and I sent kill -USER2, the bug is not trigerred.

test.cfg:
---------

global

frontend frt
bind *:443 ssl crt default.pem tls-ticket-keys tls-keys
bind *:443 ssl crt default.pem tls-ticket-keys tls-keys

tls-keys
--------

WRGMXEZMeqZzeY7bJTLsfWvrlBKszxDuZ+2WlSP3YFOqUq4dbzBpH+8nvwforYej
b2dwxCxZsV02/8bmEv+q/QjMllu/4bOSCYFWn6CuTtwiQExG8SLYnwBMevOUjVpL
cOGgEy6YK4K3h8rS9jSEiu8xWjHP4iMT+IRhHkwYaKPmgwbmvARzvoPkMDnyw5gq


/lib/systemd/system/test.service:
---------------------------------
[Service]
LimitCORE=infinity
Environment="PIDFILE=/run/test.pid"
WorkingDirectory=/etc/o3-haproxy
ExecStart=/opt/o3-haproxy/sbin/haproxy -Ws -f test.cfg
ExecReload=/bin/kill -USR2 $MAINPID


Thierry
--
Thierry Fournier
Web Performance & Security Expert
m: +33 6 68 69 21 85 | e: thierry.fournier@ozon.io
w: http://www.ozon.io/ | b: http://blog.ozon.io/
Willy Tarreau
Re: SSL: double free on reload
July 16, 2018 10:50AM
On Mon, Jul 16, 2018 at 08:32:31AM +0200, Janusz Dziemidowicz wrote:
> pon., 16 lip 2018 o 08:02 Willy Tarreau <[email protected]> napisal(a):
> > This one looks a bit strange. I looked at it a little bit and it corresponds
> > to the line "free(bind_conf->keys_ref->tlskeys);". Unfortunately, there is no
> > other line in the code appearing to perfom a free on this element, and when
> > passing through this code the key_ref is destroyed and properly nulled. I
> > checked if it was possible for this element not to be allocated and I don't
> > see how that could happen either. Thus I'm seeing only three possibilities :
> >
> > - this element was duplicated and appears at multiple places (multiple list
> > elements) leading to a real double free
> >
> > - there is a memory corruption somewhere possibly resulting in this element
> > being corrupted and not in fact victim of a double free
> >
> > - I can't read code and there is another free that I failed to detect.
> >
> > Are you able to trigger this on a trivial config ? Maybe it only happens
> > when certain features you have in your config are enabled ?
>
> I've reported this some time ago :)
> https://www.mail-archive.com/[email protected]/msg30093.html

Ah thank you Janusz, and I notice that your report matches Thierry's second
e-mail very closely.

I'm CCing Nenad who added the tls-ticket-keys in case he has any idea
on the subject, based on how the bind line is initialized maybe.

thanks,
Willy
Nenad Merdanovic
Re: SSL: double free on reload
July 17, 2018 03:50AM
Hello,

On 7/16/2018 10:46 AM, Willy Tarreau wrote:
> On Mon, Jul 16, 2018 at 08:32:31AM +0200, Janusz Dziemidowicz wrote:
>> pon., 16 lip 2018 o 08:02 Willy Tarreau <[email protected]> napisal(a):
>>> This one looks a bit strange. I looked at it a little bit and it corresponds
>>> to the line "free(bind_conf->keys_ref->tlskeys);". Unfortunately, there is no
>>> other line in the code appearing to perfom a free on this element, and when
>>> passing through this code the key_ref is destroyed and properly nulled. I
>>> checked if it was possible for this element not to be allocated and I don't
>>> see how that could happen either. Thus I'm seeing only three possibilities :
>>>
>>> - this element was duplicated and appears at multiple places (multiple list
>>> elements) leading to a real double free
>>>
>>> - there is a memory corruption somewhere possibly resulting in this element
>>> being corrupted and not in fact victim of a double free
>>>
>>> - I can't read code and there is another free that I failed to detect.
>>>
>>> Are you able to trigger this on a trivial config ? Maybe it only happens
>>> when certain features you have in your config are enabled ?
>>
>> I've reported this some time ago :)
>> https://www.mail-archive.com/[email protected]/msg30093.html
>
> Ah thank you Janusz, and I notice that your report matches Thierry's second
> e-mail very closely.
>
> I'm CCing Nenad who added the tls-ticket-keys in case he has any idea
> on the subject, based on how the bind line is initialized maybe.

Ugh, this was a long time ago. [FROM MEMORY] The element should not be
duplicated as far as I can remember. The references are stored in an
ebtree in order to prevent duplication and to provide consistent view
when updated dynamically.

I just pulled HEAD and cannot reproduce this with either of these
configs. The "good" thing is that I get a crash every time I reload,
with different stack traces for each config.

One of them starts like:
#5 0x00007f271cce0847 in _int_free (av=0x7f271d015c40 <main_arena>,
p=0x562e01935460, have_lock=<optimized out>) at malloc.c:4362
size = 195488
fb = <optimized out>
nextchunk = 0x562e01944f30
nextsize = 131280
nextinuse = <optimized out>
prevsize = <optimized out>
bck = <optimized out>
fwd = <optimized out>
__PRETTY_FUNCTION__ = "_int_free"
#6 0x0000562e0011a7bb in deinit_pollers () at src/fd.c:554
bp = <optimized out>
p = <optimized out>
#7 0x0000562e00024c77 in main (argc=<optimized out>,
argv=0x7fffc630ec38) at src/haproxy.c:3095
err = <optimized out>
retry = <optimized out>
limit = {rlim_cur = 4012, rlim_max = 4012}
errmsg =
"\000\000\000\000\000\000\000\000\000\377\330q\356\336\342\345\370\351\060\306\377\177\000\000\000\t\216\001.V\000\000\230\351\060\306\377\177\000\000\261\000\000\000\000\000\000\000\262\000\000\000\000\000\000\000\330\352\060\306\377\177\000\000\370\351\060\306\377\177\000\000\346z\r\000.V\000\000y+\025\000.V\000\000\330\352\060\306\377\177\000\000\000\000\000"
pidfd = <optimized out>

I'll poke around it more tomorrow as it's quite late here.

Regards,
Nenad

>
> thanks,
> Willy
>
Willy Tarreau
Re: SSL: double free on reload
July 17, 2018 05:30AM
Hi Nenad,

On Tue, Jul 17, 2018 at 03:37:37AM +0200, Nenad Merdanovic wrote:
> Ugh, this was a long time ago. [FROM MEMORY] The element should not be
> duplicated as far as I can remember. The references are stored in an ebtree
> in order to prevent duplication and to provide consistent view when updated
> dynamically.

OK. Then maybe the elements are freed after scanning the ebtree as well,
and we're meeting the node again after it was freed. I'll run a test
with the memory debugging options to see if it crashes on first dereference.

> I just pulled HEAD and cannot reproduce this with either of these configs.
> The "good" thing is that I get a crash every time I reload, with different
> stack traces for each config.

This could then indeed indicate a corrupted memory somewhere, and maybe
the tls-ticket part is just the first victim.

> I'll poke around it more tomorrow as it's quite late here.

Thank you Nenad. I'll try as well on my side.

Cheers,
Willy
Willy Tarreau
Re: SSL: double free on reload
July 17, 2018 10:20AM
Hi again Nenad,

On Tue, Jul 17, 2018 at 05:18:45AM +0200, Willy Tarreau wrote:
> Hi Nenad,
>
> On Tue, Jul 17, 2018 at 03:37:37AM +0200, Nenad Merdanovic wrote:
> > Ugh, this was a long time ago. [FROM MEMORY] The element should not be
> > duplicated as far as I can remember. The references are stored in an ebtree
> > in order to prevent duplication and to provide consistent view when updated
> > dynamically.
>
> OK. Then maybe the elements are freed after scanning the ebtree as well,
> and we're meeting the node again after it was freed. I'll run a test
> with the memory debugging options to see if it crashes on first dereference.

OK I found it, the same keys_ref is indeed assigned to multiple bind_confs :

static int bind_parse_tls_ticket_keys(char **args, int cur_arg, struct proxy *px, struct bind_conf *conf, char **err)
{
....
keys_ref = tlskeys_ref_lookup(args[cur_arg + 1]);
if(keys_ref) {
conf->keys_ref = keys_ref;
return 0;
}

So we clearly end up with a normal double-free. I'm checking what's the best
solution to fix this now, as we don't have flags nor any such thing in the
bind_conf to indicate that it must or must not be freed. We can't duplicate
the allocation either because it's used for the updates. I think the cleanest
solution will be to add a refcount to the tls_keys_ref entry.

Then I'm proposing the attached patch which works for me and is obvious
enough to make valgrind happy as well.

Could you guys please give it a try to confirm ? I'll then merge it.

Thanks,
Willy
From 50588a4e3ccc92cdb0c8a347193192030f0ea9f0 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Tue, 17 Jul 2018 10:05:32 +0200
Subject: BUG/MINOR: ssl: properly ref-count the tls_keys entries

Commit 200b0fa ("MEDIUM: Add support for updating TLS ticket keys via
socket") introduced support for updating TLS ticket keys from the CLI,
but missed a small corner case : if multiple bind lines reference the
same tls_keys file, the same reference is used (as expected), but during
the clean shutdown, it will lead to a double free when destroying the
bind_conf contexts since none of the lines knows if others still use
it. The impact is very low however, mostly a core and/or a message in
the system's log upon old process termination.

Let's introduce some basic refcounting to prevent this from happening,
so that only the last bind_conf frees it.

Thanks to Janusz Dziemidowicz and Thierry Fournier for both reporting
the same issue with an easy reproducer.

This fix needs to be backported from 1.6 to 1.8.
---
include/types/ssl_sock.h | 1 +
src/ssl_sock.c | 6 ++++--
2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/types/ssl_sock.h b/include/types/ssl_sock.h
index 8a5b3ee..2e02631 100644
--- a/include/types/ssl_sock.h
+++ b/include/types/ssl_sock.h
@@ -59,6 +59,7 @@ struct tls_keys_ref {
struct list list; /* Used to chain refs. */
char *filename;
int unique_id; /* Each pattern reference have unique id. */
+ int refcount; /* number of users of this tls_keys_ref. */
struct tls_sess_key *tlskeys;
int tls_ticket_enc_index;
__decl_hathreads(HA_RWLOCK_T lock); /* lock used to protect the ref */
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index b5547cc..3f70b12 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -4823,7 +4823,7 @@ void ssl_sock_destroy_bind_conf(struct bind_conf *bind_conf)
ssl_sock_free_ssl_conf(&bind_conf->ssl_conf);
free(bind_conf->ca_sign_file);
free(bind_conf->ca_sign_pass);
- if (bind_conf->keys_ref) {
+ if (bind_conf->keys_ref && !--bind_conf->keys_ref->refcount) {
free(bind_conf->keys_ref->filename);
free(bind_conf->keys_ref->tlskeys);
LIST_DEL(&bind_conf->keys_ref->list);
@@ -7672,7 +7672,8 @@ static int bind_parse_tls_ticket_keys(char **args, int cur_arg, struct proxy *px
}

keys_ref = tlskeys_ref_lookup(args[cur_arg + 1]);
- if(keys_ref) {
+ if (keys_ref) {
+ keys_ref->refcount++;
conf->keys_ref = keys_ref;
return 0;
}
@@ -7719,6 +7720,7 @@ static int bind_parse_tls_ticket_keys(char **args, int cur_arg, struct proxy *px
i -= 2;
keys_ref->tls_ticket_enc_index = i < 0 ? 0 : i % TLS_TICKETS_NO;
keys_ref->unique_id = -1;
+ keys_ref->refcount = 1;
HA_RWLOCK_INIT(&keys_ref->lock);
conf->keys_ref = keys_ref;

--
1.7.12.1
Thierry Fournier
Re: SSL: double free on reload
July 17, 2018 02:00PM
On Tue, 17 Jul 2018 10:10:58 +0200
Willy Tarreau <[email protected]> wrote:

> Hi again Nenad,
>
> On Tue, Jul 17, 2018 at 05:18:45AM +0200, Willy Tarreau wrote:
> > Hi Nenad,
> >
> > On Tue, Jul 17, 2018 at 03:37:37AM +0200, Nenad Merdanovic wrote:
> > > Ugh, this was a long time ago. [FROM MEMORY] The element should not be
> > > duplicated as far as I can remember. The references are stored in an ebtree
> > > in order to prevent duplication and to provide consistent view when updated
> > > dynamically.
> >
> > OK. Then maybe the elements are freed after scanning the ebtree as well,
> > and we're meeting the node again after it was freed. I'll run a test
> > with the memory debugging options to see if it crashes on first dereference.
>
> OK I found it, the same keys_ref is indeed assigned to multiple bind_confs :
>
> static int bind_parse_tls_ticket_keys(char **args, int cur_arg, struct proxy *px, struct bind_conf *conf, char **err)
> {
> ...
> keys_ref = tlskeys_ref_lookup(args[cur_arg + 1]);
> if(keys_ref) {
> conf->keys_ref = keys_ref;
> return 0;
> }
>
> So we clearly end up with a normal double-free. I'm checking what's the best
> solution to fix this now, as we don't have flags nor any such thing in the
> bind_conf to indicate that it must or must not be freed. We can't duplicate
> the allocation either because it's used for the updates. I think the cleanest
> solution will be to add a refcount to the tls_keys_ref entry.
>
> Then I'm proposing the attached patch which works for me and is obvious
> enough to make valgrind happy as well.
>
> Could you guys please give it a try to confirm ? I'll then merge it.


i, the patch works for me with backport to the 1.8 version. It is on
productio stage.

Thanks,
Thierry



> Thanks,
> Willy


--
Thierry Fournier
Web Performance & Security Expert
m: +33 6 68 69 21 85 | e: thierry.fournier@ozon.io
w: http://www.ozon.io/ | b: http://blog.ozon.io/
Willy Tarreau
Re: SSL: double free on reload
July 18, 2018 09:10AM
On Tue, Jul 17, 2018 at 01:49:40PM +0200, Thierry Fournier wrote:
> > Could you guys please give it a try to confirm ? I'll then merge it.
>
>
> i, the patch works for me with backport to the 1.8 version. It is on
> productio stage.

OK thank you Thierry, now merged.

Willy
Sorry, only registered users may post in this forum.

Click here to login