Consistent hashing in PHP and Java

Posted by Vicente Aguilar 
Vicente Aguilar
Consistent hashing in PHP and Java
August 27, 2010 09:30AM
Hi

At work we're setting up a multi-layered cluster with a 1st line of web/proxy/cache servers (nginx+memcached) and several layers of PHP-FPM and Tomcat application servers. At the moment we're using memcached just to cache raw HTML pages, not serialized objects: upon a request nginx checks if that URL (key) is on memcached and then serves it, else it proxies the request to the appropriate app server which serves it and saves it to memcached. Pretty basic memcached usage.
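The request flow described above can be sketched in a few lines. This is only an illustration in Python: the dict `cache` stands in for memcached, and `render_page` and `handle_request` are hypothetical names standing in for the app server and the nginx lookup, not anything from our real setup.

```python
# Hypothetical stand-ins: a dict plays the role of memcached and
# render_page() plays the role of the PHP/Java application server.
cache = {}
backend_calls = []


def render_page(url):
    """Pretend app server: build the HTML and save it under the URL key."""
    backend_calls.append(url)
    html = "<html><body>page for %s</body></html>" % url
    cache[url] = html
    return html


def handle_request(url):
    """Front-end logic: serve from the cache on a hit, else proxy to the app server."""
    html = cache.get(url)
    if html is not None:
        return html
    return render_page(url)


first = handle_request("/index.html")   # miss: goes to the app server
second = handle_request("/index.html")  # hit: served from the cache
```

With several memcached servers, the reader (nginx) and the writer (the app server) must both map the same URL to the same server, which is where the hashing question comes in.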

The problem is that we can't get the PHP, Java and nginx implementations to work together: the PHP and Java consistent hashing algorithms don't seem to match.

We're using:

- nginx: http://wiki.nginx.org/NginxHttpUpstreamConsistentHash
- PHP: PECL:memcache http://pecl.php.net/package/memcache
- Java: spymemcached http://code.google.com/p/spymemcached/

Both the nginx and PHP implementations match: any HTML representation saved by PHP is retrieved by nginx from the same memcached server. spymemcached's implementation, however, seems to be different: with 2 memcached servers it only matches 50% of the time, so it's a completely different algorithm and the matches are random. We're using the nginx patch and PECL:memcache with their default configuration, and as for spymemcached we've tried the NATIVE_HASH, CRC32_HASH and KETAMA_HASH algorithms, but none of them matches nginx's and PHP's hashing.
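To show why a 50% match rate with 2 servers points to unrelated algorithms, here is a minimal Python sketch. The two selection functions are illustrative assumptions (CRC32 modulo the server count and MD5 modulo the server count), not the actual algorithms used by nginx, PECL/memcache or spymemcached, and the server addresses are made up.

```python
import hashlib
import zlib

SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211"]  # hypothetical addresses


def pick_crc32(key):
    """Client A (assumed scheme): CRC32 of the key, modulo the server count."""
    return SERVERS[zlib.crc32(key.encode("utf-8")) % len(SERVERS)]


def pick_md5(key):
    """Client B (assumed scheme): first 4 bytes of the MD5 digest, modulo the server count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:4], "little") % len(SERVERS)]


def agreement(keys):
    """Fraction of keys for which both clients choose the same server."""
    same = sum(1 for k in keys if pick_crc32(k) == pick_md5(k))
    return same / len(keys)


keys = ["/page/%d.html" % i for i in range(10000)]
ratio = agreement(keys)
print("clients agree on %.1f%% of keys" % (100 * ratio))
```

Two unrelated schemes agree on roughly half of the keys purely by chance when there are only two servers, so a 50% hit rate says the clients are effectively independent rather than "half compatible".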

Is anybody else here using memcached in Java and PHP, and have you got them to work together? I've searched the list and this question has been raised several times, but I haven't found any definitive answer, just the obvious "different libraries may use different hashing implementations". We don't mind switching to other libraries.

Thanks in advance.

Regards

--
Vicente Aguilar <[email protected]> | http://www.bisente.com
Matt Ingenthron
Re: Consistent hashing in PHP and Java
August 27, 2010 07:50PM
Hi Vicente,

Vicente Aguilar wrote:
> Hi
>
> At work we're setting up a multi-layered cluster with a 1st line of
> web/proxy/cache servers (nginx+memcached) and several layers of
> PHP-FPM and Tomcat application servers. At the moment we're using
> memcached just to cache raw HTML pages, not serialized objects: upon a
> request nginx checks if that URL (key) is on memcached and then serves
> it, else it proxies the request to the appropriate app server which
> serves it and saves it to memcached. Pretty basic memcached usage.
>
> The problem is that we can't get the PHP, Java and nginx
> implementations to work together: the PHP and Java consistent hashing
> algorithms don't seem to match.

We (Dustin Sallings and Steve Yen mainly) did a lot of testing in
spymemcached to ensure the hashing was the same between it and libketama
in the most recent release. There is a very large test of these in
spymemcached. In the process a bug was found and fixed, but it was a
very minor bug.

It was compared to what libmemcached is doing with its hashing and it's
been well verified. More below...

>
> We're using:
>
> - nginx: http://wiki.nginx.org/NginxHttpUpstreamConsistentHash
> - PHP: PECL:memcache http://pecl.php.net/package/memcache
> - Java: spymemcached http://code.google.com/p/spymemcached/
>
> Both the nginx and PHP implementations match, any HTML representation
> saved by PHP is retrieved by nginx from the same memcached server; but
> spymemcached's implementation seems to be different and with 2
> memcached servers it only matches 50% of the time, so it's a completely
> different algorithm and the matches are random. We're using the nginx
> patch and PECL:memcache with their default configuration, and as for
> spymemcached we've tried the NATIVE_HASH, CRC32_HASH and KETAMA_HASH
> algorithms but none of them matches nginx's and PHP's hashing.
>
> Is anybody else here using memcached in Java and PHP, and have you got
> them to work together? I've searched the list and this question has been
> raised several times but I haven't found any definitive answer, just the
> obvious "different libraries may use different hashing
> implementations". We don't mind switching to other libraries.

Assuming you're using the latest spymemcached (which was released some
time ago) and a recent pecl/memcached, at least those two should work
together. The PHP level wasn't, but the libmemcached ketama and
spymemcached ketama are well tested.

With only two servers, nearly anything could make it be 50% off but it
almost sounds like one client is seeing one of the servers as down. In
a ketama world, that'd mean put all of it on the remaining up servers
(the other one in your case).
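A minimal Python sketch of that effect, assuming a simplified MD5-based hash ring. The point layout here (`"%s-%d"` labels, 160 points per server) is an illustrative assumption and is not byte-compatible with libketama or spymemcached; the server names are hypothetical.

```python
import bisect
import hashlib


def _point(label):
    """Map a string to a 32-bit position on the ring (assumed layout)."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little")


class Ring:
    """Simplified consistent-hash ring with several points per server."""

    def __init__(self, servers, points_per_server=160):
        self._ring = sorted(
            (_point("%s-%d" % (server, i)), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._hashes = [h for h, _ in self._ring]

    def lookup(self, key):
        """Return the server owning the first ring point at or after the key's hash."""
        idx = bisect.bisect_left(self._hashes, _point(key))
        if idx == len(self._ring):
            idx = 0  # wrap around to the start of the ring
        return self._ring[idx][1]


healthy = Ring(["cache1:11211", "cache2:11211"])   # client that sees both servers
degraded = Ring(["cache1:11211"])                  # client that thinks cache2 is down

keys = ["/page/%d.html" % i for i in range(2000)]
matches = sum(1 for k in keys if healthy.lookup(k) == degraded.lookup(k))
ratio = matches / len(keys)
print("clients agree on %.1f%% of keys" % (100 * ratio))
```

The degraded client sends every key to the one server it still sees, so the two clients agree only on the keys the healthy client happened to place there, which with two servers is in the neighbourhood of half.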

The only other thing to be careful with is perhaps the key (which seems
to be a URL in your case?) is getting truncated or treated differently
in different places?
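One way to check this is to log the exact bytes each client hashes and compare the logs. A small Python sketch follows; `describe_key` is a hypothetical helper, and CRC32 is used only as a convenient fingerprint, not because any of the libraries necessarily uses it.

```python
import zlib


def describe_key(raw_key):
    """Return the exact bytes, length and a CRC32 fingerprint of a key."""
    data = raw_key.encode("utf-8")
    return {"bytes": data, "length": len(data), "crc32": zlib.crc32(data)}


# Example: one client keeps the query string, another drops it.
a = describe_key("/news/index.html?page=2")
b = describe_key("/news/index.html")

for name, info in (("client A", a), ("client B", b)):
    print(name, info["bytes"], info["length"], hex(info["crc32"]))
```

If the bytes differ between clients for what should be the same request, the keys are different as far as memcached is concerned, and no choice of hashing algorithm will make them land on the same server reliably.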

Sorry I don't have any clear answers here, but I can say it's been
tested and SHOULD work.

- Matt
Vicente Aguilar
Re: Consistent hashing in PHP and Java
August 30, 2010 08:50AM
Hi

>> Is anybody else here using memcached in Java and PHP, and have you got them to work together? I've searched the list and this question has been raised several times, but I haven't found any definitive answer, just the obvious "different libraries may use different hashing implementations". We don't mind switching to other libraries.
>
> Assuming you're using the latest spymemcached (which was released some time ago) and a recent pecl/memcached, at least those two should work together. The PHP level wasn't, but the libmemcached ketama and spymemcached ketama are well tested.

Right now we're using PECL/memcache, not PECL/memcached. p/memcached is based on libmemcached but AFAIK p/memcache isn't. We're using p/memcache because last time I checked p/memcached didn't have a session handler (we're load-balancing the app servers and want to store sessions in memcached), but it seems more recent releases do have it.

As for spymemcached, we're using a git version from January. It was a 1.5-pre-something, definitely post 1.4.x but not the final 1.5 release. Maybe the version we're using didn't yet have the fix for the bug you mention.

Will try with pecl/memcached and spymemcached-1.5.

> With only two servers, nearly anything could make it be 50% off but it almost sounds like one client is seeing one of the servers as down. In a ketama world, that'd mean put all of it on the remaining up servers (the other one in your case).

Both servers were up during my tests; the two of them received both reads and writes.

> Sorry I don't have any clear answers here, but I can say it's been tested and SHOULD work.

No clear answers but at least now I know that spymemcached and libmemcached have been thoroughly tested and they should match, so if they're not matching for us we either are doing something wrong or are not using the correct releases. :-)

Thanks Matt :-)

--
Vicente Aguilar <[email protected]> | http://www.bisente.com
Paul Reinheimer
Re: Consistent hashing in PHP and Java
August 30, 2010 05:40PM
Hello All,

> Right now we're using PECL/memcache, not PECL/memcached. p/memcached is based
> on libmemcached but AFAIK p/memcache isn't. We're using p/memcache because
> last time I checked p/memcached didn't have a session handler (we're
> load-balancing the app servers and want to store sessions in memcached) but
> it seems more recent releases do have it.

I've used pecl/memcached in very high volume production environments,
including the built-in session handler. No problems with it at all.

I can't shed light on your hash consistency issue, but I can put a
gold star beside the session support.



paul