Welcome! Log In Create A New Profile

Advanced

Propagating agent-check weight change to tracking servers

Posted by Michał 
Hello,

We use track's in haproxy to minimize check traffic in some situations and
after my last patch we are probably going to switch to agent-checks for
live management of weights and statuses. One problem I see now - track
don't propagate weight setting to trackers, so if we set agent-check on
track we can manage status only.

My first PoC solution works good, so I thought about introducing something
like agent-track or track-agent directive set on backends (or maybe it
should be default, non-configurable behaviour) to propagate agent-check
responses from main check to all tracking backends. Both default behaviour
and directive approach are small changes in code, but a little bigger in
functionality.

In my opinion if we set agent-check on backend which is tracked by others -
it should propagate agent-check weight response to those tracking backend
and set weight on them too. Configurable or not - it will be good feature.

Michał
Willy Tarreau
Re: Propagating agent-check weight change to tracking servers
January 20, 2017 11:00AM
Hi Michal,

On Thu, Jan 19, 2017 at 11:45:57PM +0100, Micha?? wrote:
> Hello,
>
> We use track's in haproxy to minimize check traffic in some situations and
> after my last patch we are probably going to switch to agent-checks for
> live management of weights and statuses. One problem I see now - track
> don't propagate weight setting to trackers, so if we set agent-check on
> track we can manage status only.
>
> My first PoC solution works good, so I thought about introducing something
> like agent-track or track-agent directive set on backends (or maybe it
> should be default, non-configurable behaviour) to propagate agent-check
> responses from main check to all tracking backends. Both default behaviour
> and directive approach are small changes in code, but a little bigger in
> functionality.
>
> In my opinion if we set agent-check on backend which is tracked by others -
> it should propagate agent-check weight response to those tracking backend
> and set weight on them too. Configurable or not - it will be good feature.

I think we at least propagate the DRAIN state which is equivalent to
weight == 0. If so I too think we should propagate *relative* weights.
Agent-checks can return a relative weight (eg: 50%, 100%, 150%) or an
absolute weight (eg: 10, 20). If you have two farms configured like this :

backend farm1
server new1 1.1.1.1:8000 weight 10 agent-check
server new2 1.1.1.2:8000 weight 10 agent-check

backend farm2
server new1 1.1.1.1:8000 weight 20 track farm1/new1
server new2 1.1.1.2:8000 weight 20 track farm1/new2
server old1 1.1.1.3:8000 weight 10 check
server old2 1.1.1.4:8000 weight 10 check

Then you want the weight changes on farm1 to be applied proportionally
to farm2 (ie: a ratio of the configured absolute weight, which is iweight
IIRC).

Otherwise that sounds quite reasonable to me given that the agent-check's
purpose is to provide a more accurate vision of the server's health, and
that tracking is made to share the same vision across multiple farms.

Regards,
Willy
Hello,

So here's patch, which includes all functionalities I think about.
It propagates the response for every tracking server without changing it
and without intercepting it. In my opinion we should propagate relative
and absolute weights, because if you use weight=0 server's to offload
checks then you can send relative weight change and 0 will stay where it is..

Regards,
Michał


2017-01-20 10:54 GMT+01:00 Willy Tarreau <[email protected]>:

> Hi Michal,
>
> On Thu, Jan 19, 2017 at 11:45:57PM +0100, Micha?? wrote:
> > Hello,
> >
> > We use track's in haproxy to minimize check traffic in some situations
> and
> > after my last patch we are probably going to switch to agent-checks for
> > live management of weights and statuses. One problem I see now - track
> > don't propagate weight setting to trackers, so if we set agent-check on
> > track we can manage status only.
> >
> > My first PoC solution works good, so I thought about introducing
> something
> > like agent-track or track-agent directive set on backends (or maybe it
> > should be default, non-configurable behaviour) to propagate agent-check
> > responses from main check to all tracking backends. Both default
> behaviour
> > and directive approach are small changes in code, but a little bigger in
> > functionality.
> >
> > In my opinion if we set agent-check on backend which is tracked by
> others -
> > it should propagate agent-check weight response to those tracking backend
> > and set weight on them too. Configurable or not - it will be good
> feature.
>
> I think we at least propagate the DRAIN state which is equivalent to
> weight == 0. If so I too think we should propagate *relative* weights..
> Agent-checks can return a relative weight (eg: 50%, 100%, 150%) or an
> absolute weight (eg: 10, 20). If you have two farms configured like this :
>
> backend farm1
> server new1 1.1.1.1:8000 weight 10 agent-check
> server new2 1.1.1.2:8000 weight 10 agent-check
>
> backend farm2
> server new1 1.1.1.1:8000 weight 20 track farm1/new1
> server new2 1.1.1.2:8000 weight 20 track farm1/new2
> server old1 1.1.1.3:8000 weight 10 check
> server old2 1.1.1.4:8000 weight 10 check
>
> Then you want the weight changes on farm1 to be applied proportionally
> to farm2 (ie: a ratio of the configured absolute weight, which is iweight
> IIRC).
>
> Otherwise that sounds quite reasonable to me given that the agent-check's
> purpose is to provide a more accurate vision of the server's health, and
> that tracking is made to share the same vision across multiple farms.
>
> Regards,
> Willy
>
Attachments:
open | download - 0001-MINOR-checks-propagate-agent-check-weight-to-tracker.patch (1.3 KB)
Hi,
I checked it and during synthetic tests it worked. I use same
mechanism as origin agent-check, so it's ready to merge.

It doesn't need to be backported.

2017-01-27 15:38 GMT+01:00 Michał <[email protected]>:

> Hello,
>
> So here's patch, which includes all functionalities I think about.
> It propagates the response for every tracking server without changing it
> and without intercepting it. In my opinion we should propagate relative
> and absolute weights, because if you use weight=0 server's to offload
> checks then you can send relative weight change and 0 will stay where it
> is.
>
> Regards,
> Michał
>
>
> 2017-01-20 10:54 GMT+01:00 Willy Tarreau <[email protected]>:
>
>> Hi Michal,
>>
>> On Thu, Jan 19, 2017 at 11:45:57PM +0100, Micha?? wrote:
>> > Hello,
>> >
>> > We use track's in haproxy to minimize check traffic in some situations
>> and
>> > after my last patch we are probably going to switch to agent-checks for
>> > live management of weights and statuses. One problem I see now - track
>> > don't propagate weight setting to trackers, so if we set agent-check on
>> > track we can manage status only.
>> >
>> > My first PoC solution works good, so I thought about introducing
>> something
>> > like agent-track or track-agent directive set on backends (or maybe it
>> > should be default, non-configurable behaviour) to propagate agent-check
>> > responses from main check to all tracking backends. Both default
>> behaviour
>> > and directive approach are small changes in code, but a little bigger in
>> > functionality.
>> >
>> > In my opinion if we set agent-check on backend which is tracked by
>> others -
>> > it should propagate agent-check weight response to those tracking
>> backend
>> > and set weight on them too. Configurable or not - it will be good
>> feature.
>>
>> I think we at least propagate the DRAIN state which is equivalent to
>> weight == 0. If so I too think we should propagate *relative* weights.
>> Agent-checks can return a relative weight (eg: 50%, 100%, 150%) or an
>> absolute weight (eg: 10, 20). If you have two farms configured like this :
>>
>> backend farm1
>> server new1 1.1.1.1:8000 weight 10 agent-check
>> server new2 1.1.1.2:8000 weight 10 agent-check
>>
>> backend farm2
>> server new1 1.1.1.1:8000 weight 20 track farm1/new1
>> server new2 1.1.1.2:8000 weight 20 track farm1/new2
>> server old1 1.1.1.3:8000 weight 10 check
>> server old2 1.1.1.4:8000 weight 10 check
>>
>> Then you want the weight changes on farm1 to be applied proportionally
>> to farm2 (ie: a ratio of the configured absolute weight, which is iweight
>> IIRC).
>>
>> Otherwise that sounds quite reasonable to me given that the agent-check's
>> purpose is to provide a more accurate vision of the server's health, and
>> that tracking is made to share the same vision across multiple farms.
>>
>> Regards,
>> Willy
>>
>
>
>
Hello!
Any news in this topic? Is there anything wrong with my patch?

Michał

2017-02-04 9:38 GMT+01:00 Michał <[email protected]>:

> Hi,
> I checked it and during synthetic tests it worked. I use same
> mechanism as origin agent-check, so it's ready to merge.
>
> It doesn't need to be backported.
>
> 2017-01-27 15:38 GMT+01:00 Michał <[email protected]>:
>
>> Hello,
>>
>> So here's patch, which includes all functionalities I think about.
>> It propagates the response for every tracking server without changing it
>> and without intercepting it. In my opinion we should propagate relative
>> and absolute weights, because if you use weight=0 server's to offload
>> checks then you can send relative weight change and 0 will stay where it
>> is.
>>
>> Regards,
>> Michał
>>
>>
>> 2017-01-20 10:54 GMT+01:00 Willy Tarreau <[email protected]>:
>>
>>> Hi Michal,
>>>
>>> On Thu, Jan 19, 2017 at 11:45:57PM +0100, Micha?? wrote:
>>> > Hello,
>>> >
>>> > We use track's in haproxy to minimize check traffic in some situations
>>> and
>>> > after my last patch we are probably going to switch to agent-checks for
>>> > live management of weights and statuses. One problem I see now - track
>>> > don't propagate weight setting to trackers, so if we set agent-check on
>>> > track we can manage status only.
>>> >
>>> > My first PoC solution works good, so I thought about introducing
>>> something
>>> > like agent-track or track-agent directive set on backends (or maybe it
>>> > should be default, non-configurable behaviour) to propagate agent-check
>>> > responses from main check to all tracking backends. Both default
>>> behaviour
>>> > and directive approach are small changes in code, but a little bigger
>>> in
>>> > functionality.
>>> >
>>> > In my opinion if we set agent-check on backend which is tracked by
>>> others -
>>> > it should propagate agent-check weight response to those tracking
>>> backend
>>> > and set weight on them too. Configurable or not - it will be good
>>> feature.
>>>
>>> I think we at least propagate the DRAIN state which is equivalent to
>>> weight == 0. If so I too think we should propagate *relative* weights.
>>> Agent-checks can return a relative weight (eg: 50%, 100%, 150%) or an
>>> absolute weight (eg: 10, 20). If you have two farms configured like this
>>> :
>>>
>>> backend farm1
>>> server new1 1.1.1.1:8000 weight 10 agent-check
>>> server new2 1.1.1.2:8000 weight 10 agent-check
>>>
>>> backend farm2
>>> server new1 1.1.1.1:8000 weight 20 track farm1/new1
>>> server new2 1.1.1.2:8000 weight 20 track farm1/new2
>>> server old1 1.1.1.3:8000 weight 10 check
>>> server old2 1.1.1.4:8000 weight 10 check
>>>
>>> Then you want the weight changes on farm1 to be applied proportionally
>>> to farm2 (ie: a ratio of the configured absolute weight, which is iweight
>>> IIRC).
>>>
>>> Otherwise that sounds quite reasonable to me given that the agent-check's
>>> purpose is to provide a more accurate vision of the server's health, and
>>> that tracking is made to share the same vision across multiple farms.
>>>
>>> Regards,
>>> Willy
>>>
>>
>>
>>
>
Hi Michal,

On Wed, Mar 15, 2017 at 10:13:01PM +0100, Michal wrote:
> Hello!
> Any news in this topic? Is there anything wrong with my patch?

Not yet, we're just totally drowning under complex bugs resulting in
minor features to be delayed :-/

Thanks,
Willy
Hi Michal,

On Wed, Mar 15, 2017 at 10:13:01PM +0100, Michal wrote:
> Hello!
> Any news in this topic? Is there anything wrong with my patch?

So I checked it but it still has the problem of propagating absolute
weights, which, as I explained earlier, will break lots of setups. I
tend to think that doing it only for relative weight changes could be
OK (provided this is properly documented in the "track" and "agent-check"
keyword sections). The principle of the relative weight change is what
most users are seeking : the server wants to say "I'm running my backups
now, please cut my load in half" or "I'm swapping, I estimate that by
shrinking my load by 33% it will be OK". Regardless of the configured
weigths in different farms, we could propagate this relative weight
change.

Also I'm seeing that your patch only propagates to the first layer of
tracking servers, and stops there without updating the next layer, you
need a recursive propagation here.

Last, if you implement this, it's absolutely mandatory that the same is
done for the CLI since the CLI is the only way to fix the bad effects
of wrong agent changes. Thus having a function dedicated to propagating
relative weight changes would help, it would solve the case for the CLI
and for the agent.

I continue to think that such a change will definitely reintroduce problems
that took several years to get rid of, but hopefully with proper documentation
that can be worked around. Ie when a use complains that the weight change
applied to a server seems to regularly be ignored, it probably is because
of the fact that they're tracking another server whose weight changes.

Thanks,
Willy
Hello Willy,

So I'm fighting with dba97077 made by Frédéric Lécaille - it broke many
things.
E.g. you can't use "source", because that patch broke it. I'm curious how
many other stuff got broken with those patches around default-server.

We need some CI (even if they will only build haproxy) and IMHO people with
@haproxy.com mails should test their code before posting and merging :(

In attachment there is patch fixed after your (Willy) review. Sorry for
loop,
I was focused on fixing all those problems with Frédérics patch I just
didn't
think how to replace do..while (which obviously works) with this cleaner
version - thanks for this. The patch got even simpler.

Thanks again,
Michał

2017-03-27 15:34 GMT+02:00 Willy Tarreau <[email protected]>:

> Hi Michal,
>
> On Mon, Mar 27, 2017 at 03:18:10PM +0200, Michal wrote:
> > Hello,
> > Thank you for your response. I agree with part about backward
> compatibility
> > and of course I don't want to break working things
> >
> > I prepared patch with described functionality and with two notes in doc
> to
> > let users know about this behaviour.
>
> Thanks. Just a few points :
>
> 1) cosmetic cleanups below :
>
> > diff --git a/doc/configuration.txt b/doc/configuration.txt
> > index fb3e691..7f22782 100644
> > --- a/doc/configuration.txt
> > +++ b/doc/configuration.txt
> > @@ -11370,6 +11370,10 @@ track [<proxy>/]<server>
> > enabled. If <proxy> is omitted the current one is used. If
> disable-on-404 is
> > used, it has to be enabled on both proxies.
> >
> > + Note:
> > + Relative weight changes are propagated to all tracking servers. Each
> > + tracking server will have it's weight recalculated separately.
>
> s/it's/its/
>
> > +
> > Supported in default-server: No
> >
> > verify [none|required]
> > diff --git a/doc/management.txt b/doc/management.txt
> > index 1d34f84..3f3730a 100644
> > --- a/doc/management.txt
> > +++ b/doc/management.txt
> > @@ -1694,6 +1694,10 @@ set weight <backend>/<server> <weight>[%]
> > "admin". Both the backend and the server may be specified either by
> their
> > name or by their numeric ID, prefixed with a sharp ('#').
> >
> > + Note:
> > + Relative weight changes are propagated to all tracking servers. Each
> > + tracking server will have it's weight recalculated separately.
>
> s/it's/its/
>
> > +
> > show cli sockets
> > List CLI sockets. The output format is composed of 3 fields separated
> by
> > spaces. The first field is the socket address, it can be a unix
> socket, a
> > diff --git a/src/server.c b/src/server.c
> > index 5589723..9462a16 100644
> > --- a/src/server.c
> > +++ b/src/server.c
> > @@ -45,6 +45,9 @@
> > static void srv_update_state(struct server *srv, int version, char
> **params);
> > static int srv_apply_lastaddr(struct server *srv, int *err_code);
> >
> > +const char *server_propagate_weight_change_request(struct server *sv,
> > + const char *weight_str);
> > +
> > /* List head of all known server keywords */
> > static struct srv_kw_list srv_keywords = {
> > .list = LIST_HEAD_INIT(srv_keywords.list)
> > @@ -912,6 +915,8 @@ const char *server_parse_weight_change_request(struct
> server *sv,
> > struct proxy *px;
> > long int w;
> > char *end;
> > + const char *msg;
> > + int relative = 0;
> >
> > px = sv->proxy;
> >
> > @@ -933,6 +938,8 @@ const char *server_parse_weight_change_request(struct
> server *sv,
> > w = sv->iweight * w / 100;
> > if (w > 256)
> > w = 256;
> > +
> > + relative = 1;
> > }
> > else if (w < 0 || w > 256)
> > return "Absolute weight can only be between 0 and 256
> inclusive.\n";
> > @@ -945,6 +952,34 @@ const char *server_parse_weight_change_request(struct
> server *sv,
> > sv->uweight = w;
> > server_recalc_eweight(sv);
> >
> > + if (relative) {
> > + msg = server_propagate_weight_change_request(sv,
> weight_str);
> > + if (msg != NULL) {
> > + return msg;
> > + }
> > + }
> > +
> > + return NULL;
> > +}
> > +
> > +const char *server_propagate_weight_change_request(struct server *sv,
> > + const char *weight_str)
> > +{
> > + if (sv->trackers == NULL)
> > + return NULL;
> > +
> > + struct server *_propagated_server = sv->trackers;
> > + const char *msg;
>
> Please don't mix code and declarations, building this emits a warning.
> Also please avoid starting your variable name with an underscore, it
> makes it look "special".
>
> > +
> > + do {
> > + msg = server_parse_weight_change_req
> uest(_propagated_server,
> > + weight_str);
> > +
> > + if (msg != NULL) {
> > + return msg;
> > + }
> > + } while ((_propagated_server = _propagated_server->tracknext) !=
> NULL);
>
> The test at the beginning of the function and the loop combined into
> this quite simplified form :
>
> struct server *tracker;
> const char *msg;
>
> for (tracker = sv->trackers; tracker; tracker =
> tracker->tracknext) {
> msg = server_parse_weight_change_request(_propagated_server,
> weight_str);
> if (msg)
> return msg;
> }
>
> And 2) please send the resulting patch to the mailing list so that others
> can participate and share their opinion. Some may say "wow cool" and others
> might say "hey please no" (though I suspect that nobody will possibly
> complain
> about the change on relative weights only). And you'll get more testers.
> Just for your information you don't need to be subscribed to the mailing
> list, it's open and you can freely post.
>

I'm sorry. I tend to click "Reply" instead of "Reply to all".


>
> Thanks,
> Willy
>
Attachments:
open | download - 0001-MINOR-server-Inherit-CLI-weight-changes-and-agent-ch.patch (3.6 KB)
Hi Michal,

On Tue, Apr 11, 2017 at 04:41:25PM +0200, Michal wrote:
> Hello Willy,
>
> So I'm fighting with dba97077 made by Frédéric Lécaille - it broke many
> things.

That's the principle of distributed development : better get your changes
in shape first so that you don't have to adapt to others' changes.

> E.g. you can't use "source", because that patch broke it. I'm curious how
> many other stuff got broken with those patches around default-server.

Well, for having reviewed this patch set in depth, I didn't spot anything
obvious. That doesn't mean there's no bug, it's just that I didn't notice
any.

> We need some CI (even if they will only build haproxy) and IMHO people with
> @haproxy.com mails should test their code before posting and merging :(

Well, all people test their code, but there are countless combinations
of possible breakage. There's a reason why this version is still in
development so your comment suggesting that people don't work the way
you'd do it is totally inappropriate.

And in case you're wondering, we're still in development and there will
be more breakage to come. That's the principle of a development branch.

> In attachment there is patch fixed after your (Willy) review. Sorry for
> loop,
> I was focused on fixing all those problems with Frédérics patch I just
> didn't
> think how to replace do..while (which obviously works) with this cleaner
> version - thanks for this. The patch got even simpler.

If you don't mind I'll retag it "MEDIUM" because it will definitely impact
some users since it's a behaviour change. I'll review the patch ASAP.

Thanks,
Willy
Hi Michal,

so I've merged your patch now eventhough I'm still not totally convinced
it's a good idea, I continue to think it will lead to some surprizes.

Regarding your point below :

> E.g. you can't use "source", because that patch broke it. I'm curious how
> many other stuff got broken with those patches around default-server.

For me "source" seems to work fine. Did you at least send Frédéric a
working reproducer for the issue you discovered ? If not, I'm not going
to spend more time trying to guess what problem you may have faced :-/

Regards,
Willy
Frederic Lecaille
Re: Propagating agent-check weight change to tracking servers
April 13, 2017 02:00PM
Hello Michal,

On 04/11/2017 04:41 PM, Michał wrote:
> Hello Willy,
>
> So I'm fighting with dba97077 made by Frédéric Lécaille - it broke many
> things.

This patch broke haproxy non-transparent builds. Thanks to Steven
Davidovitz, Pavlos Parissis and David Carlier for having promptly helped
in fixing this.

Excepted this building issue, AFAIK this patch may break only server or
default server 'source' setting parsing. How much other things did it break?

Do not forgive that as far as your configuration files do not use any
'default-server' line with new supported settings (I agree they are
numerous), they should be parsed as before.

So, if you want to have more chances not to be impacted by my
"default-server patches", please do not use any 'default-server' new
supported setting as a first workaround, or no 'default-server' line at
all as a second workaround. This should help you to continue and develop
your features.

> E.g. you can't use "source", because that patch broke it. I'm curious how
> many other stuff got broken with those patches around default-server.
>

[snipped rude/useless/unproductive published anonymous comments,
especially towards my colleagues]

>
> In attachment there is patch fixed after your (Willy) review. Sorry for
> loop,
> I was focused on fixing all those problems with Frédérics patch I just
> didn't
> think how to replace do..while (which obviously works) with this cleaner
> version - thanks for this. The patch got even simpler.

As Willy wrote before, please feel free to contact me to fix asap any
issue I alone am responsible. I am keeping an eye on HAproxy ML, and I
would still be glad to help you and consequently any other haproxy users.

Regards,
Fred.
Hi again Michal,

So in the end I already had to revert your latest patch, I should have
been more careful before merging it.

> We need some CI (even if they will only build haproxy) and IMHO people with
> @haproxy.com mails should test their code before posting and merging :(

Thus please let me reuse your own words above :

You need some CI, and IMHO people with @allegrogroup.com mails should
test their code before posting.

but I won't generalize and as I previously said it's development so we
are allowed to break a bit for the sake of trying to make something better,
so case closed.

The problem is immediately spotted in configs like this one :

global
stats timeout 1h
stats socket /var/run/haproxy.sock level admin mode 600

defaults
maxconn 100
log 127.0.0.1 local0
option log-health-checks
mode http
timeout client 3s
timeout server 1s
timeout connect 10s
default-server maxconn 50 weight 100

listen main
bind :8000
stats uri /stat
use_backend customer1 if { req.hdr(host) -i customer1 }
use_backend customer2 if { req.hdr(host) -i customer2 }
use_backend customer3 if { req.hdr(host) -i customer3 }

backend servers
server s1 127.0.0.1:8001 check inter 1s

backend customer1
balance uri
hash-type consistent
server s1 127.0.0.1:8001 track servers/s1

backend customer2
balance uri
server s1 127.0.0.1:8001 track servers/s1

backend customer3
balance uri
hash-type consistent
server s1 127.0.0.1:8001 track servers/s1

There are 3 backends, one per customer. Two customers use a consistent
hash on the URI while the other one uses a standard hash (map-based).
But all point to the same server, it's just a matter of choice. In fact
an even more common setup with reverse caches looks like this when
users want to use one farm under moderate load and another one under
high load :

backend leastconn
balance leastconn
server s1 127.0.0.1:8001 track cache/s1
server s2 127.0.0.2:8001 track cache/s2
server s3 127.0.0.3:8001 track cache/s3

backend cache
balance uri
server s1 127.0.0.1:8001 check
server s2 127.0.0.2:8001 check
server s3 127.0.0.3:8001 check

The problem faced here is when a non-integral weight change happens, such
as "50%". Since it can only be applied to backends using a dynamic LB algo,
it will be rejected on others. In the first example above, doing so will
lead to server "s1" in farms "servers" and "customer3" to turn to 50%,
farm "customer2" to generate an error and to abort the processing, and
farm "customer1" not to be changed despite being dynamic thus compatible.

At the very least it's very important that the change remains consistent
across multiple similar farms. Having both customer1 and customer3 identical
and tracking "servers" with different states after a change is not acceptable.

Now what's the best approach here ? I don't really know. We could refuse
to apply the change at all and roll everything back, but that means that
you are supposed to save the respective servers weights in order to restore
them. And it also means that having one new farm tracking a central one and
making use of a non-compatible balance algorithm could prevent all the other
ones from adjusting their weights. Or we can decide that after all, a change
performed to a map-based farm normally only triggers an error and is ignored,
so probably we could simply report the first error, continue, then report the
number of errors at the end of the operation and the rate of success.

Still another problem will remain :

defaults
balance uri

backend servers
server s1 127.0.0.1:8001 check inter 1s

backend customer1
hash-type consistent
server s1 127.0.0.1:8001 track servers/s1

backend customer2
hash-type consistent
server s1 127.0.0.1:8001 track servers/s1

backend customer3
hash-type consistent
server s1 127.0.0.1:8001 track servers/s1

This one will always fail to use non-integral weigths because of
the farm receiving the order which itself doesn't use a dynamic
algorithm, and will not be propagated. We could decide that we can
ignore the problem, but then it will render this one inconsistent :

defaults
balance uri

backend servers
server s1 127.0.0.1:8001 check inter 1s

backend customer1
server s1 127.0.0.1:8001 track servers/s1

backend customer2
hash-type consistent
server s1 127.0.0.1:8001 track customer2/s1

where one could expect that a dynamic backend tracking a static one
would always stay in sync with it, but that would not be true anymore.

So for now I think this requires more thinking, probable a specific server
option like "track-weight" or something like this to let the user decide
whether or not the weights should be inherited from tracked servers. This
would limit the risk of very bad surprizes.

Regards,
Willy
Hi,
Maybe my email wasn't nice enough, but breaking compilation and simplest
config with server using "source" got me very angry. I didn't send any
reproducer, because even simple
"server name 1.1.1.1:80 source 1.2.3.4 track other"
wasn't parsing.

2017-04-13 15:38 GMT+02:00 Willy Tarreau <[email protected]>:

> Hi again Michal,
>
> So in the end I already had to revert your latest patch, I should have
> been more careful before merging it.
>
> > We need some CI (even if they will only build haproxy) and IMHO people
> with
> > @haproxy.com mails should test their code before posting and merging :(
>
> Thus please let me reuse your own words above :
>
> You need some CI, and IMHO people with @allegrogroup.com mails should
> test their code before posting.
>
> but I won't generalize and as I previously said it's development so we
> are allowed to break a bit for the sake of trying to make something better,
> so case closed.
>

I'm posting this patch as open source committer, private person with private
e-mail address and let's be honest - @haproxy.com guy didn't check
simplest and very important thing as compilation on different platforms.
Even
-dev branch shouldn't break such thing.


>
> The problem is immediately spotted in configs like this one :
>
> global
> stats timeout 1h
> stats socket /var/run/haproxy.sock level admin mode 600
>
> defaults
> maxconn 100
> log 127.0.0.1 local0
> option log-health-checks
> mode http
> timeout client 3s
> timeout server 1s
> timeout connect 10s
> default-server maxconn 50 weight 100
>
> listen main
> bind :8000
> stats uri /stat
> use_backend customer1 if { req.hdr(host) -i customer1 }
> use_backend customer2 if { req.hdr(host) -i customer2 }
> use_backend customer3 if { req.hdr(host) -i customer3 }
>
> backend servers
> server s1 127.0.0.1:8001 check inter 1s
>
> backend customer1
> balance uri
> hash-type consistent
> server s1 127.0.0.1:8001 track servers/s1
>
> backend customer2
> balance uri
> server s1 127.0.0.1:8001 track servers/s1
>
> backend customer3
> balance uri
> hash-type consistent
> server s1 127.0.0.1:8001 track servers/s1
>
> There are 3 backends, one per customer. Two customers use a consistent
> hash on the URI while the other one uses a standard hash (map-based).
> But all point to the same server, it's just a matter of choice. In fact
> an even more common setup with reverse caches looks like this when
> users want to use one farm under moderate load and another one under
> high load :
>
> backend leastconn
> balance leastconn
> server s1 127.0.0.1:8001 track cache/s1
> server s2 127.0.0.2:8001 track cache/s2
> server s3 127.0.0.3:8001 track cache/s3
>
> backend cache
> balance uri
> server s1 127.0.0.1:8001 check
> server s2 127.0.0.2:8001 check
> server s3 127.0.0.3:8001 check
>
> The problem faced here is when a non-integral weight change happens, such
> as "50%". Since it can only be applied to backends using a dynamic LB algo,
> it will be rejected on others. In the first example above, doing so will
> lead to server "s1" in farms "servers" and "customer3" to turn to 50%,
> farm "customer2" to generate an error and to abort the processing, and
> farm "customer1" not to be changed despite being dynamic thus compatible.
>
> At the very least it's very important that the change remains consistent
> across multiple similar farms. Having both customer1 and customer3
> identical
> and tracking "servers" with different states after a change is not
> acceptable.
>
> Now what's the best approach here ? I don't really know. We could refuse
> to apply the change at all and roll everything back, but that means that
> you are supposed to save the respective servers weights in order to restore
> them. And it also means that having one new farm tracking a central one and
> making use of a non-compatible balance algorithm could prevent all the
> other
> ones from adjusting their weights. Or we can decide that after all, a
> change
> performed to a map-based farm normally only triggers an error and is
> ignored,
> so probably we could simply report the first error, continue, then report
> the
> number of errors at the end of the operation and the rate of success.
>
> Still another problem will remain :
>
> defaults
> balance uri
>
> backend servers
> server s1 127.0.0.1:8001 check inter 1s
>
> backend customer1
> hash-type consistent
> server s1 127.0.0.1:8001 track servers/s1
>
> backend customer2
> hash-type consistent
> server s1 127.0.0.1:8001 track servers/s1
>
> backend customer3
> hash-type consistent
> server s1 127.0.0.1:8001 track servers/s1
>
> This one will always fail to use non-integral weigths because of
> the farm receiving the order which itself doesn't use a dynamic
> algorithm, and will not be propagated. We could decide that we can
> ignore the problem, but then it will render this one inconsistent :
>
> defaults
> balance uri
>
> backend servers
> server s1 127.0.0.1:8001 check inter 1s
>
> backend customer1
> server s1 127.0.0.1:8001 track servers/s1
>
> backend customer2
> hash-type consistent
> server s1 127.0.0.1:8001 track customer2/s1
>
> where one could expect that a dynamic backend tracking a static one
> would always stay in sync with it, but that would not be true anymore.
>
> So for now I think this requires more thinking, probable a specific server
> option like "track-weight" or something like this to let the user decide
> whether or not the weights should be inherited from tracked servers. This
> would limit the risk of very bad surprizes.
>

Probably yes, we need another option for tracking weights and that will fix
issues with those unpredictable variants.

Regards,
> Willy
>



Please don't mess allegrogroup here, I'm posting those patches from my
private
e-mail.

Michał
Hi,
I didn't want to continue this discussion, but there is one thing that's
totally not true.

2017-04-13 13:50 GMT+02:00 Frederic Lecaille <[email protected]>:

> Hello Michal,
>
> On 04/11/2017 04:41 PM, Michał wrote:
>
>> Hello Willy,
>>
>> So I'm fighting with dba97077 made by Frédéric Lécaille - it broke many
>> things.
>>
>
> This patch broke haproxy non-transparent builds. Thanks to Steven
> Davidovitz, Pavlos Parissis and David Carlier for having promptly helped in
> fixing this.
>
> Excepted this building issue, AFAIK this patch may break only server or
> default server 'source' setting parsing. How much other things did it break?
>
> Do not forgive that as far as your configuration files do not use any
> 'default-server' line with new supported settings (I agree they are
> numerous), they should be parsed as before.
>
>
That's the case. My config don't use this functionality and the parsing
broke.
That's why I got so angry, because breaking existing configs it's not best
practice IMHO.


> So, if you want to have more chances not to be impacted by my
> "default-server patches", please do not use any 'default-server' new
> supported setting as a first workaround, or no 'default-server' line at all
> as a second workaround. This should help you to continue and develop your
> features.
>
> E.g. you can't use "source", because that patch broke it. I'm curious how
>> many other stuff got broken with those patches around default-server.
>>
>>
> [snipped rude/useless/unproductive published anonymous comments,
> especially towards my colleagues]
>
>
>> In attachment there is patch fixed after your (Willy) review. Sorry for
>> loop,
>> I was focused on fixing all those problems with Frédérics patch I just
>> didn't
>> think how to replace do..while (which obviously works) with this cleaner
>> version - thanks for this. The patch got even simpler.
>>
>
> As Willy wrote before, please feel free to contact me to fix asap any
> issue I alone am responsible. I am keeping an eye on HAproxy ML, and I
> would still be glad to help you and consequently any other haproxy users.
>
> Regards,
> Fred.
>
>
On Sat, Apr 15, 2017 at 02:24:41PM +0200, Michal wrote:
> Hi,
> Maybe my email wasn't nice enough, but breaking compilation

You were the first one to experience the build breakage, it worked for
most of us, but you didn't even give the smallest hints about the error(s)
you met, making it much harder to fix the problem. Fortunately others were
much more constructive and reported it the normal way.

> and simplest
> config with server using "source" got me very angry. I didn't send any
> reproducer, because even simple
> "server name 1.1.1.1:80 source 1.2.3.4 track other"
> wasn't parsing.

It's much easier when the proper info is sent, you can imagine as many
test combinations as you want, you'll always find one that does not work.
I tested several configs myself before merging Fred's patches and all of
them worked so I was confident enough.

In this project we try very hard not to break existing setups. Some configs
have never been modified since version 1.1 emitted 16 years ago and still
work fine. We even implemented emulation for some older settings that are
not natively implemented anymore, so we know pretty well that breaking
setups is not nice. But this is a development branch and during development
it's expected that things break from time to time, this is why nobody is
supposed to run this in production, and everyone is welcome to report
issues.

> I'm posting this patch as open source committer, private person with private
> e-mail address and let's be honest - @haproxy.com guy didn't check
> simplest and very important thing as compilation on different platforms.
> Even -dev branch shouldn't break such thing.

Please stop saying that bullshit. Tests were made and I tested myself. It's
just that you had the first config revealing a regression and you would
have been welcome to report it as anyone else usually does here. If you're
too narrow-minded to stand a development cycle, please go pollute another
project, we don't need people who complain against those who do the work.
And feel free to call me a clueless incompetent bastard if that makes you
feel better. Let me tell you something : I couldn't have brought this project
where it is alone without all the contributions we received, so I do have
a lot of respect for contributors and their work and I will not let anyone
insult those who devote time to improve the project. And that's true for
your contributions as well. If there's something you don't like, propose
improvements but do not tell people they're doing a bad work. I hope this
is clear.

> Please don't mess allegrogroup here, I'm posting those patches from my
> private e-mail.

This is the "From" you put in the commit author. As you see it's never nice
to see your work criticized when you thought you did a nice job and it
has an impact on others, and constructive bug reports are generally nicer
for everyone than broad accusations.

Now let's turn constructive again and try to find how to deal with this
weight propagation issue.

Willy
Frederic Lecaille
Re: Propagating agent-check weight change to tracking servers
April 16, 2017 05:40PM
Hello Willy,

On 04/15/2017 04:43 PM, Willy Tarreau wrote:
> On Sat, Apr 15, 2017 at 02:24:41PM +0200, Michal wrote:
>> Hi,
>> Maybe my email wasn't nice enough, but breaking compilation
>
> You were the first one to experience the build breakage, it worked for
> most of us, but you didn't even give the smallest hints about the error(s)
> you met, making it much harder to fix the problem. Fortunately others were
> much more constructive and reported it the normal way.
>
>> and simplest
>> config with server using "source" got me very angry. I didn't send any
>> reproducer, because even simple
>> "server name 1.1.1.1:80 source 1.2.3.4 track other"
>> wasn't parsing.

Indeed, in fact this is any keyword provided *after* 'source' keyword
which could not be parsed anymore. The first keyword ('track' here) was
skipped leading to this error message: 'other' unknown keyword.

I agree that I have not tested such cases.

srv_parse_source() 'source' keyword parser updates 'cur_arg' variable
(the index in the line of the current argument to be parsed). It is its
job because 'source' keyword number of variable arguments. So, in this
case we should not update 'cur_arg' variable anymore outside of
srv_parse_souce() parser.

The patch attached to this mail fixes this major bug.

Note that all such added lines 'if (kw->skip == -1)' may be removed,
but I am not sure it is a good thing.


Regards,
Fred.
Hi Fred!

On Sun, Apr 16, 2017 at 05:33:01PM +0200, Frederic Lecaille wrote:
> The patch attached to this mail fixes this major bug.

Applied, thanks!

> Note that all such added lines 'if (kw->skip == -1)' may be removed,
> but I am not sure it is a good thing.

I think we'll be able to simplify some parts of the parser after we
complete the pending ssl version keyword changes.

I'm even thinking that "source" + "usesrc" is the only couple causing
trouble, and most people use "source 0.0.0.0 usesrc foo", so we could
imagine relaxing this in the future to allow "usesrc" alone and let
"source" take exactly one argument.

Cheers,
Willy
Sorry, only registered users may post in this forum.

Click here to login