Question on Caching.

Posted by Andrew Smalley 
Andrew Smalley
Question on Caching.
April 26, 2018 11:10PM
Hello Haproxy mailing list

I have been looking at caching technology and have found this

https://github.com/jiangwenyuan/nuster/

It claims to be a v1.7 / v1.8 branch fully compatible with haproxy
and indeed based on haproxy with the added capability of having a
really fast cache as described here
https://github.com/jiangwenyuan/nuster/wiki/Web-cache-server-performance-benchmark:-nuster-vs-nginx-vs-varnish-vs-squid

It looks interesting but I would love some feedback please


Andruw Smalley

Loadbalancer.org Ltd.

www.loadbalancer.org
+1 888 867 9504 / +44 (0)330 380 1064
asmalley@loadbalancer.org

Willy Tarreau
Re: Question on Caching.
April 28, 2018 08:00AM
Hi Andrew,

On Thu, Apr 26, 2018 at 10:06:00PM +0100, Andrew Smalley wrote:
> Hello Haproxy mailing list
>
> I have been looking at caching technology and have found this
>
> https://github.com/jiangwenyuan/nuster/
>
> It claims to be a v1.7 / v1.8 branch fully compatible with haproxy
> and indeed based on haproxy with the added capability of having a
> really fast cache as described here
> https://github.com/jiangwenyuan/nuster/wiki/Web-cache-server-performance-benchmark:-nuster-vs-nginx-vs-varnish-vs-squid
>
> It looks interesting but I would love some feedback please

It's indeed interesting. By the way it's only for 1.7 as the 1.8 branch also
contains 1.7. First, he found that nginx's primary job is not to be a cache
(just like haproxy is not), and that in the end, only squid and varnish are
real caches.

Second, he focuses on performance. It's not news to many of us that haproxy
rocks here: being 3 times faster than nginx on a single core and 3 times
faster than varnish using 12 cores is easily expected, since haproxy never
makes a single I/O access. He could even have compared with the small object
cache in 1.8.
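
For readers who have not used it, the small object cache in 1.8 is enabled
with a dedicated "cache" section plus one request rule and one response rule.
The following is only a minimal sketch based on the 1.8 configuration manual,
not a recommended setup: the cache name "smallobjects", the backend name and
address, and the sizes are all assumptions chosen for illustration.

```
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

# Hypothetical cache name; sizes are illustrative only.
cache smallobjects
    total-max-size 4     # shared memory for the cache, in megabytes
    max-age 60           # seconds before a cached object expires

frontend www
    bind :80
    # Try to answer from the cache first, then store eligible responses.
    http-request cache-use smallobjects
    http-response cache-store smallobjects
    default_backend app

backend app
    server app1 192.0.2.10:8080   # placeholder address
```

As far as the 1.8 documentation describes it, only small responses that fit
in a single buffer and carry a Content-Length are stored, which is consistent
with the "favicon cache" nickname used in this thread.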

But there's an important point which is missed there: manageability.
Varnish is a real cache and made for being manageable and flexible. It
probably has its own shortcomings, but it does the job perfectly for those
who need a fully manageable cache. Putting a full-blown cache into haproxy
is not a good idea in my opinion. A load balancer must be mostly stateless
so that it can be killed, rebooted or tweaked. Implementing a full-blown
cache into it seriously affects this capacity. It may even require some
reloads just to flush the cache, while a load balancer should never have
to be touched for no reason, especially when it's shared between multiple
customers.

The reason I was OK with the "favicon cache" in haproxy is that I noticed
that when placing haproxy in front of varnish, we wasted more CPU and time
processing the connection between haproxy and varnish than delivering a
very small object from memory. And others had noticed that before, seeing
certain configs use dummy backends with "errorfile 503" to deliver very
small objects. So I thought that a short-lived, tiny objects cache saving
us from having to connect to varnish would benefit both components without
adding any requirement for cache maintenance. It's really where I draw the
line between what is acceptable in haproxy and what is not. The day someone
asks here if we can implement a cache flush on the CLI will indicate we've
gone too far already, and we purposely refrained from implementing it.
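
The dummy backend trick with "errorfile 503" mentioned above can be sketched
roughly as follows. This is an illustration under stated assumptions rather
than a configuration taken from the thread: the frontend and backend names,
the file path, and the varnish address are hypothetical.

```
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind :80
    acl is_favicon path /favicon.ico
    use_backend static_favicon if is_favicon
    default_backend varnish

# No server is declared here, so every request routed to this backend
# fails with a 503, and haproxy answers with the raw contents of the
# errorfile instead of its built-in error page.
backend static_favicon
    errorfile 503 /etc/haproxy/static/favicon.http   # hypothetical path

backend varnish
    server varnish1 127.0.0.1:6081   # placeholder address
```

The errorfile must contain a complete raw HTTP response (status line such as
"HTTP/1.0 200 OK", headers, an empty line, then the body), because haproxy
sends it as-is. Changing the object means editing the file and reloading,
which is part of why a short-lived in-memory cache is the more convenient
option for this use case.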

With this said, I can understand why some people would like to have more,
especially when seeing the performance numbers on the site above. Perhaps
we should think about how to make it easier for these people to maintain
their code without having to rebase too much (e.g. they may need some extra
register functions or hooks to avoid patching the core).

Regards,
Willy
Andrew Smalley
Re: Question on Caching.
April 30, 2018 11:20AM
Hi Willy

Thank you for your detailed reply explaining why you think only the
favicon cache is sensible and that a full-blown cache within Haproxy
is not the best of ideas although interesting.

I will continue the search for a viable yet small cache.



Andruw Smalley

Loadbalancer.org Ltd.

www.loadbalancer.org
+1 888 867 9504 / +44 (0)330 380 1064
asmalley@loadbalancer.org



Willy Tarreau
Re: Question on Caching.
April 30, 2018 03:50PM
Hi Andrew,

On Mon, Apr 30, 2018 at 10:08:11AM +0100, Andrew Smalley wrote:
> Hi Willy
>
> Thank you for your detailed reply explaining why you think only the
> favicon cache is sensible and that a full-blown cache within Haproxy
> is not the best of ideas although interesting.
>
> I will continue the search for a viable yet small cache.

What are you looking for exactly? What makes you think the small object
cache would not be suited to your use case, or that it would be desirable
to have a more complete cache inside the load balancer? We didn't get
much feedback on the cache, so your opinion on this is obviously interesting.

Thanks,
Willy
Aaron West
Re: Question on Caching.
May 07, 2018 05:10PM
Hi Willy,

I think what we are looking for is some kind of small cache to
accelerate the load times of a single page; this is particularly for
things such as WordPress where page load times can be slow. I imagine
it being set to cache the homepage only, fairly small (just a few K)
and I guess it would need to only cache the HTML body rather than
headers... Does that make any sense at all?

It may be that the small object cache would help? Or the idea itself
may be a waste of time... Currently, I've been looking at the Apache
module mod_cache.
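
A homepage-only setup along these lines might look like the sketch below,
using the small object cache from haproxy 1.8. It is untested and rests on
several assumptions: the cache name "homepage", the variable name
"txn.path", the backend name and address, and the sizes are all
hypothetical.

```
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

cache homepage
    total-max-size 1     # megabytes; a single small page needs very little
    max-age 30           # seconds; keeps stale content short-lived

frontend www
    bind :80
    # Remember the request path so the response rule can test it.
    http-request set-var(txn.path) path
    http-request cache-use homepage if { path / }
    http-response cache-store homepage if { var(txn.path) -m str / }
    default_backend wordpress

backend wordpress
    server wp1 192.0.2.20:80   # placeholder address
```

Two caveats, going by the 1.8 documentation as I understand it. The cache
stores the whole response, headers included, rather than the HTML body alone.
It also only keeps responses that carry a Content-Length and fit within a
single buffer, so a typical WordPress homepage may simply not qualify. The
cache key is built from the Host header and the URI, so pages that vary per
logged-in user would need to be excluded by an extra condition.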

I'd value your opinion either way.

Aaron West

Loadbalancer.org Ltd.

www.loadbalancer.org

+1 888 867 9504 / +44 (0)330 380 1064
aaron@loadbalancer.org