Welcome! Log In Create A New Profile

Advanced

Bug when passing variable to mapping function

Posted by Daniel Schneller 
Daniel Schneller
Bug when passing variable to mapping function
June 25, 2018 04:50PM
Hi!

While playing around with map_regm I noticed some strange behavior when using variables
and map_regm. I managed to reduce it so a small test case and believe this is an actual bug.

It tested this on macOS, should it be relevant. haproxy is installed via homebrew:

----- haproxy version -------
$ haproxy -vvv
HA-Proxy version 1.8.10-ec17d7a 2018/06/22
Copyright 2000-2018 Willy Tarreau <[email protected]>

Build options :
TARGET = generic
CPU = generic
CC = clang
CFLAGS =
OPTIONS = USE_ZLIB=1 USE_POLL=1 USE_KQUEUE=1 USE_THREAD=1 USE_OPENSSL=1 USE_PCRE=1

Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2o 27 Mar 2018
Running on OpenSSL version : OpenSSL 1.0.2o 27 Mar 2018
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
Built with transparent proxy support using:
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.42 2018-03-20
Running on PCRE version : 8.42 2018-03-20
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
kqueue : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use kqueue.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace
-----------------------------


This is my haproxy configuration file:
----------- bla.cfg ---------
defaults
mode http

frontend fe_test
bind 127.0.0.1:21111

# Test Setup
# -------------------------------------------
# Remove port from Host header
http-request replace-value Host '(.*):.*' '\1'

# Store host header in variable
http-request set-var(txn.host) req.hdr(Host)
# -------------------------------------------


# Test cases:
# -------------------------------------------
# This works correctly
http-request set-var(txn.manual) str("distri")
http-request set-header X-Distri-Direct-From-Manual-Var %[var(txn.manual)]

# This works correctly
http-request set-header X-Distri-Mapped-From-Header %[req.hdr(Host),map_regm(hostmap.txt,"unknown"),lower]

# This works correctly
http-request set-header X-Distri-Direct-From-Var %[var(txn.host)]

# This breaks
http-request set-header X-Distri-Mapped-From-Var %[var(txn.host),map_regm(hostmap.txt,"unknown"),lower]

# -------------------------------------------

default_backend be_test

backend be_test
server s 127.0.0.1:8111
-----------------------------

The sever is just a Python SimpleHTTPServer, dumping the request headers.


This is the contents of the map file:
-------- hostmap.txt ---------
^(.*)\.(.*)$ \1
------------------------------


This is the sample request I send:
---------- request -----------
$ curl -v http://127.0.0.1:21111/example.txt -H 'Host: distri.com:1234'
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 21111 (#0)
> GET /example.txt HTTP/1.1
> Host: distri.com:1234
> User-Agent: curl/7.54.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: SimpleHTTP/0.6 Python/2.7.10
< Date: Mon, 25 Jun 2018 14:15:41 GMT
< Content-type: text/plain
< Content-Length: 0
< Last-Modified: Mon, 25 Jun 2018 14:13:09 GMT
* HTTP/1.0 connection set to keep alive!
< Connection: keep-alive
<
* Connection #0 to host 127.0.0.1 left intact
-----------------------------


HAproxy is started in the Terminal with debug output:
----- HAProxy Output --------
$ haproxy -d -f bla.cfg
....
Available polling systems :
kqueue : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result FAILED
Total: 3 (2 usable), will use kqueue.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace
Using kqueue() as the polling mechanism.
00000000:fe_test.accept(0004)=0007 from [127.0.0.1:64880] ALPN=<none>
00000000:fe_test.clireq[0007:ffffffff]: GET /example.txt HTTP/1.1
00000000:fe_test.clihdr[0007:ffffffff]: Host: distri.com:1234
00000000:fe_test.clihdr[0007:ffffffff]: User-Agent: curl/7.54.0
00000000:fe_test.clihdr[0007:ffffffff]: Accept: */*
00000000:be_test.srvrep[0007:0008]: HTTP/1.0 200 OK
00000000:be_test.srvhdr[0007:0008]: Server: SimpleHTTP/0.6 Python/2.7.10
00000000:be_test.srvhdr[0007:0008]: Date: Mon, 25 Jun 2018 14:15:41 GMT
00000000:be_test.srvhdr[0007:0008]: Content-type: text/plain
00000000:be_test.srvhdr[0007:0008]: Content-Length: 0
00000000:be_test.srvhdr[0007:0008]: Last-Modified: Mon, 25 Jun 2018 14:13:09 GMT
00000000:be_test.srvcls[0007:adfd]
00000001:fe_test.clicls[0007:ffffffff]
00000001:fe_test.closed[0007:ffffffff]
-----------------------------



Now, the interesting thing is the server's debug output:

------ Server Output --------
Host: distri.com
User-Agent: curl/7.54.0
Accept: */*
X-Distri-Direct-From-Manual-Var: distri
X-Distri-Mapped-From-Header: distri
X-Distri-Direct-From-Var: distri.com
X-Distri-Mapped-From-Var: %00istri

127.0.0.1 - - [25/Jun/2018 16:30:48] "GET /example.txt HTTP/1.1" 200 -
-----------------------------

See the X-Distri-Mapped-From-Var header's value. It has what seems to be a nul-byte
instead of the first character of the domain name. The other X- headers
before it are meant to narrow down where the bug actually happens.

It would appear that it is somehow related to passing a variable's value
into the mapping function or its return from there. Interestingly, the
issue does _not_ show when simply putting the variable value into a header
(X-Distri-Direct-From-Var) or when calling the mapping function with the
header lookup instead of the intermediate variable (X-Distri-Mapped-From-Header).


One more tidbit: If I change the mapping file to this:
------------------
^(.*)\.(.*)$ a\1
------------------

The generated header header changes to:
----------------------
X-Distri-Mapped-From-Var: aaistri
----------------------

Looks like some off-by-one error?


Cheers,
Daniel




--
Daniel Schneller
Principal Cloud Engineer

CenterDevice GmbH
Rheinwerkallee 3
53227 Bonn
www.centerdevice.com

__________________________________________
Geschäftsführung: Dr. Patrick Peschlow, Dr. Lukas Pustina, Michael Rosbach, Handelsregister-Nr.: HRB 18655, HR-Gericht: Bonn, USt-IdNr.: DE-815299431

Diese E-Mail einschließlich evtl. beigefügter Dateien enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und löschen Sie diese E-Mail und evtl. beigefügter Dateien umgehend. Das unerlaubte Kopieren, Nutzen oder Öffnen evtl. beigefügter Dateien sowie die unbefugte Weitergabe dieser E-Mail ist nicht gestattet.
Jarno Huuskonen
Re: Bug when passing variable to mapping function
June 28, 2018 03:50PM
Hi,

On Mon, Jun 25, Daniel Schneller wrote:
> This is the contents of the map file:
> -------- hostmap.txt ---------
> ^(.*)\.(.*)$ \1
> ------------------------------

Setting this to:
^(.*)\.(.*)$ \2

And I get
X-Distri-Mapped-From-Var: com

and with map_regm: ^(.*)\.(.*)\.(.*)$ \2.\3
(Host: www.distri.com)

I get X-Distri-Mapped-From-Var: distri.com

So it looks like only backref \1 has first char set to \000

> See the X-Distri-Mapped-From-Var header's value. It has what seems to be a nul-byte
> instead of the first character of the domain name. The other X- headers
> before it are meant to narrow down where the bug actually happens.
>
> It would appear that it is somehow related to passing a variable's value
> into the mapping function or its return from there. Interestingly, the
> issue does _not_ show when simply putting the variable value into a header
> (X-Distri-Direct-From-Var) or when calling the mapping function with the
> header lookup instead of the intermediate variable (X-Distri-Mapped-From-Header).
>
>
> One more tidbit: If I change the mapping file to this:
> ------------------
> ^(.*)\.(.*)$ a\1
> ------------------
>
> The generated header header changes to:
> ----------------------
> X-Distri-Mapped-From-Var: aaistri
> ----------------------
>
> Looks like some off-by-one error?

AFAIK this works on 1.7.11 but seems to be broken on all 1.8.x.

I think this is the commit that breaks map_regm in this case:
b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
acls/maps thread safe).

If I revert this commit from pattern.c:pattern_exec_match
then the map_regm \1 backref seems to work.

-Jarno

--
Jarno Huuskonen
Jarno Huuskonen
Re: Bug when passing variable to mapping function
June 29, 2018 07:20AM
Hi,

On Thu, Jun 28, Jarno Huuskonen wrote:
> I think this is the commit that breaks map_regm in this case:
> b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> acls/maps thread safe).
>
> If I revert this commit from pattern.c:pattern_exec_match
> then the map_regm \1 backref seems to work.

I think I found what's replacing the \000 as first char:
in (map.c) sample_conv_map:
/* In the regm case, merge the sample with the input. */
if ((long)private == PAT_MATCH_REGM) {
str = get_trash_chunk();
str->len = exp_replace(str->str, str->size, smp->data.u.str.str,
pat->data->u.str.str,
(regmatch_t *)smp->ctx.a[0]);

Before call to get_trash_chunk() smp->data.u.str.str is for example
'distri.com' and after get_trash_chunk() smp->data.u.str.str
is '\000istri.com'.

At the moment I don't have time to dig deeper, but hopefully this
helps a little bit.

-Jarno

--
Jarno Huuskonen
Daniel Schneller
Re: Bug when passing variable to mapping function
July 09, 2018 03:40PM
Hi!

Thanks for your analysis. I was away for a few days, hence the late
response.

> AFAIK this works on 1.7.11 but seems to be broken on all 1.8.x.

Interesting. I tried the new config locally with 1.8 and found the bug.
Production servers are still on 1.7, so it would even have worked :)
I am glad, though, that I found this on 1.8, sparing me the trouble some
time down the road when they get updated.


> > I think this is the commit that breaks map_regm in this case:
> > b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> > acls/maps thread safe).
> >
> > If I revert this commit from pattern.c:pattern_exec_match
> > then the map_regm \1 backref seems to work.
>
> I think I found what's replacing the \000 as first char:
> in (map.c) sample_conv_map:
> /* In the regm case, merge the sample with the input. */
> if ((long)private == PAT_MATCH_REGM) {
> str = get_trash_chunk();
> str->len = exp_replace(str->str, str->size,
smp->data.u.str.str,
> pat->data->u.str.str,
> (regmatch_t *)smp->ctx.a[0]);
>
> Before call to get_trash_chunk() smp->data.u.str.str is for example
> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
> is '\000istri.com'.

I had a look at that code, but I must admit my understanding of the
concepts (trash chunk? some optimization, I assume?) and C as a language is
too limited to make a patch myself.
Is this on any of the developers' radar?

Thanks a lot :)

Daniel

On 29 June 2018 at 07:14, Jarno Huuskonen <[email protected]> wrote:

> Hi,
>
> On Thu, Jun 28, Jarno Huuskonen wrote:
> > I think this is the commit that breaks map_regm in this case:
> > b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> > acls/maps thread safe).
> >
> > If I revert this commit from pattern.c:pattern_exec_match
> > then the map_regm \1 backref seems to work.
>
> I think I found what's replacing the \000 as first char:
> in (map.c) sample_conv_map:
> /* In the regm case, merge the sample with the input. */
> if ((long)private == PAT_MATCH_REGM) {
> str = get_trash_chunk();
> str->len = exp_replace(str->str, str->size,
> smp->data.u.str.str,
> pat->data->u.str.str,
> (regmatch_t *)smp->ctx.a[0]);
>
> Before call to get_trash_chunk() smp->data.u.str.str is for example
> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
> is '\000istri.com'.
>
> At the moment I don't have time to dig deeper, but hopefully this
> helps a little bit.
>
> -Jarno
>
> --
> Jarno Huuskonen
>
>


--

--
Daniel Schneller
Principal Cloud Engineer

CenterDevice GmbH | Hochstraße 11
| 42697 Solingen
tel: +49 1754155711 | Deutschland
daniel.schneller@centerdevice.de | www.centerdevice.de

Geschäftsführung: Dr. Patrick Peschlow, Dr. Lukas Pustina,
Michael Rosbach, Handelsregister-Nr.: HRB 18655,
HR-Gericht: Bonn, USt-IdNr.: DE-815299431
Lukas Tribus
Re: Bug when passing variable to mapping function
July 16, 2018 01:40PM
Hello,



On Fri, 29 Jun 2018 at 07:15, Jarno Huuskonen <[email protected]> wrote:
>
> Hi,
>
> On Thu, Jun 28, Jarno Huuskonen wrote:
> > I think this is the commit that breaks map_regm in this case:
> > b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> > acls/maps thread safe).
> >
> > If I revert this commit from pattern.c:pattern_exec_match
> > then the map_regm \1 backref seems to work.
>
> I think I found what's replacing the \000 as first char:
> in (map.c) sample_conv_map:
> /* In the regm case, merge the sample with the input. */
> if ((long)private == PAT_MATCH_REGM) {
> str = get_trash_chunk();
> str->len = exp_replace(str->str, str->size, smp->data.u.str.str,
> pat->data->u.str.str,
> (regmatch_t *)smp->ctx.a[0]);
>
> Before call to get_trash_chunk() smp->data.u.str.str is for example
> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
> is '\000istri.com'.
>
> At the moment I don't have time to dig deeper, but hopefully this
> helps a little bit.

Thanks for the detailed analysis, relaying to Emeric ...



Lukas
Emeric Brun
Re: Bug when passing variable to mapping function
July 17, 2018 04:00PM
Hi Jarno, and thanks Lukas

On 07/16/2018 07:27 AM, Lukas Tribus wrote:
> Hello,
>
>
>
> On Fri, 29 Jun 2018 at 07:15, Jarno Huuskonen <[email protected]> wrote:
>>
>> Hi,
>>
>> On Thu, Jun 28, Jarno Huuskonen wrote:
>>> I think this is the commit that breaks map_regm in this case:
>>> b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
>>> acls/maps thread safe).
>>>
>>> If I revert this commit from pattern.c:pattern_exec_match
>>> then the map_regm \1 backref seems to work.
>>
>> I think I found what's replacing the \000 as first char:
>> in (map.c) sample_conv_map:
>> /* In the regm case, merge the sample with the input. */
>> if ((long)private == PAT_MATCH_REGM) {
>> str = get_trash_chunk();
>> str->len = exp_replace(str->str, str->size, smp->data.u.str.str,
>> pat->data->u.str.str,
>> (regmatch_t *)smp->ctx.a[0]);
>>
>> Before call to get_trash_chunk() smp->data.u.str.str is for example
>> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
>> is '\000istri.com'.
>>
>> At the moment I don't have time to dig deeper, but hopefully this
>> helps a little bit.
>
> Thanks for the detailed analysis, relaying to Emeric ...
>
>
>
> Lukas
>

Could you try the patch in attachment? i hope it will fix the issue

R,
Emeric
Jarno Huuskonen
Re: Bug when passing variable to mapping function
August 01, 2018 01:20PM
Hi,

On Tue, Jul 17, Emeric Brun wrote:
> > On Fri, 29 Jun 2018 at 07:15, Jarno Huuskonen <[email protected]> wrote:
> >> On Thu, Jun 28, Jarno Huuskonen wrote:
> >>> I think this is the commit that breaks map_regm in this case:
> >>> b5997f740b21ebb197e10a0f2fe9dc13163e1772 (MAJOR: threads/map: Make
> >>> acls/maps thread safe).
> >>>
> >>> If I revert this commit from pattern.c:pattern_exec_match
> >>> then the map_regm \1 backref seems to work.
> >>
> >> I think I found what's replacing the \000 as first char:
> >> in (map.c) sample_conv_map:
> >> /* In the regm case, merge the sample with the input. */
> >> if ((long)private == PAT_MATCH_REGM) {
> >> str = get_trash_chunk();
> >> str->len = exp_replace(str->str, str->size, smp->data.u.str.str,
> >> pat->data->u.str.str,
> >> (regmatch_t *)smp->ctx.a[0]);
> >>
> >> Before call to get_trash_chunk() smp->data.u.str.str is for example
> >> 'distri.com' and after get_trash_chunk() smp->data.u.str.str
> >> is '\000istri.com'.
>
> Could you try the patch in attachment? i hope it will fix the issue

Sorry I've been away from keyboard. Just tested the patch w/1.8.12 and
for me the patch fixes the map_regm issue with Daniel's example config
(https://www.mail-archive.com/[email protected]/msg30523.html).

Thanks,
-Jarno

--
Jarno Huuskonen
Sorry, only registered users may post in this forum.

Click here to login