Comment #5 on issue 127 by dorma...@rydia.net: incr/decr operations are
not thread safe.
http://code.google.com/p/memcached/issues/detail?id=127
I'll have to follow this up with another patch as mine's still incomplete,
but I had a thought...
When items are being sent back to a client, their reference counters are
incremented and decremented. If we move the item_get down into
do_add_delta (and use do_item_get), we can then also test whether
refcount > 0 on the item. If it is, another thread is copying the data to
the client or modifying it in some way, so we allocate a new item and do
the replace.
But if refcount == 0, we can modify the item in place. This *should* be
safe since item_get and add_delta operate under the same lock.
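
A minimal sketch of that check, to make the two paths concrete. Everything
here is a toy stand-in: toy_item, VALUE_CAP and toy_add_delta are
hypothetical names, not memcached's item struct or its do_add_delta, and
refcount is taken to count only *other* users of the item. The caller is
assumed to already hold the lock shared with item lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define VALUE_CAP 24  /* fits a 20-digit uint64_t plus NUL; illustrative size */

/* Toy stand-in for an item; NOT memcached's real struct. */
typedef struct toy_item {
    unsigned refcount;      /* > 0 while another thread is using the item */
    char value[VALUE_CAP];  /* counter stored as a decimal string */
} toy_item;

/*
 * Sketch of the proposed check. Assumes the caller holds the lock that also
 * guards item lookup, so refcount cannot change while we look at it.
 * Returns the item that now carries the new value: the original if it was
 * updated in place, or a newly allocated replacement if the original was in
 * use. Returns NULL if the allocation fails.
 */
toy_item *toy_add_delta(toy_item *it, uint64_t delta)
{
    assert(it != NULL);
    uint64_t v = strtoull(it->value, NULL, 10) + delta;

    if (it->refcount == 0) {
        /* Nobody else is reading these bytes: overwrite in place. */
        snprintf(it->value, sizeof it->value, "%llu", (unsigned long long)v);
        return it;
    }

    /* Another thread holds a reference: leave the old bytes untouched and
       build a replacement (the caller would swap it into the hash table). */
    toy_item *fresh = calloc(1, sizeof *fresh);
    if (fresh == NULL)
        return NULL;
    snprintf(fresh->value, sizeof fresh->value, "%llu", (unsigned long long)v);
    return fresh;
}
```

The point of the split is that the allocation only happens on the contended
path; an idle counter never pays for it.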
Bonus points would be optimizing counters to use thread-local write
buffers on return. I can think of mildly complicated things to shorten the
amount of time a counter is reflocked and decrease the number of expensive
re-allocations:
- Use thread-local write buffers for returning counters, flatly allocated
  to INCR_MAX_STORAGE_LEN.
- In a counter, store a uint64_t in the first 8 bytes of the value, and
  the string form of the value in the free bytes after that.
- In do_add_delta, remove the safe_strtoull call and simply do the math.
- Do the refcount check described above.
- Then flatten the value back out via snprintf into the end of the value.
- When returning a counter, we'd have to use some bit somewhere to
  determine that it's a counter (haven't looked at the item struct for
  ideas yet).
- Then memcpy the non-uint64_t part of the counter into the send buffer
  for the client.
- Only hold the reflock on the counter in the client for the time it takes
  to fetch the item, determine it's a counter, toss a buffer into the
  return chain, and memcpy into it.
That should greatly reduce the amount of time counter items stay locked,
and also the amount of time we hold the global lock while processing
incr/decr.
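
The counter layout from the list above could look roughly like this. It is
only a sketch under assumptions: toy_counter and both helpers are
hypothetical, the value 24 for INCR_MAX_STORAGE_LEN is assumed here, and
the "is this a counter" flag bit and all locking are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define INCR_MAX_STORAGE_LEN 24  /* value assumed for this sketch */

/* Hypothetical counter value: binary number first, its text right after. */
typedef struct toy_counter {
    uint64_t num;                     /* first 8 bytes: the number itself */
    char text[INCR_MAX_STORAGE_LEN];  /* decimal rendering of num */
    int text_len;                     /* length of text, excluding the NUL */
} toy_counter;

/* Do the math on the binary field, then flatten it back out to text.
   No string parsing is needed on this path. */
void toy_counter_add(toy_counter *c, uint64_t delta)
{
    assert(c != NULL);
    c->num += delta;
    c->text_len = snprintf(c->text, sizeof c->text, "%llu",
                           (unsigned long long)c->num);
}

/* Copy only the text part into a caller-owned buffer (the thread-local
   write buffer in the proposal). Returns the number of characters copied,
   or -1 if the buffer is too small. */
int toy_counter_copy_out(const toy_counter *c, char *out, size_t out_cap)
{
    assert(c != NULL && out != NULL);
    if ((size_t)c->text_len + 1 > out_cap)
        return -1;
    memcpy(out, c->text, (size_t)c->text_len + 1);
    return c->text_len;
}
```

Once toy_counter_copy_out returns, the reader no longer needs the item at
all, which is where the shorter hold time would come from.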
Extra bonus points for doing the incr-or-decr detection outside of
do_add_delta and passing in a positive or negative value to be added to
the counter.
That's on the memory-for-cpu tradeoff edge. We could also test storing
just the 8 bytes in counter values, doing most of the above, and calling
snprintf from the value into the client buffer on every return.
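
For comparison, a sketch of that 8-bytes-only variant. The helper names are
hypothetical, and the bytes are stored in host byte order, which is fine
for an in-memory value but is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* The counter value is just the 8 raw bytes of a uint64_t. */
void toy_store_u64(unsigned char value[8], uint64_t n)
{
    memcpy(value, &n, sizeof n);
}

uint64_t toy_load_u64(const unsigned char value[8])
{
    uint64_t n;
    memcpy(&n, value, sizeof n);
    return n;
}

/* Render the counter into the client buffer on each return. Returns the
   number of characters written, or -1 if the buffer is too small. */
int toy_render_u64(const unsigned char value[8], char *out, size_t out_cap)
{
    assert(out != NULL);
    int n = snprintf(out, out_cap, "%llu",
                     (unsigned long long)toy_load_u64(value));
    return (n < 0 || (size_t)n >= out_cap) ? -1 : n;
}
```

This trades the extra stored string for one snprintf per read, so which
variant wins likely depends on the read-to-write ratio of the counters.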
Low priority compared to getting the engine work finished, but worth
testing I think :) In the short term I have a good feeling that the
refcount trick will restore it to roughly the same speed it was before.