The primary motivation was more about not beating up the TLB cache on
the CPU when running with large heaps. There are users with large heaps
already, so this should help if the underlying OS supports large pages.
TLB cache sizes are getting bigger in CPUs, but virtualization is more
common and memory heaps are growing faster.
I'd like to have some empirical data on how big a difference the -L flag
makes, but that assumes a workload profile. I should be able to hack
one up and do this with memcachetest, but I've just not done it yet. :)
> To put it more concretely, here is a proposed change to make -L do a
> contiguous preallocation even on machines without getpagesizes tuning.
> My memcached server doesn't seem to crash, but I'm not sure if that's
> a proper litmus test. What are the pros/cons of doing something like
> this?
>
This feels more related to the -k flag, and it should probably be using
madvise() in there somewhere too. It wouldn't necessarily be a bad idea
to separate these. I don't know that the day after 1.4.0 is the day to
redefine -L, though it's not necessarily a bad change. We should wait
for Trond's response to see what he thinks about this since he
implemented it. :)
Also, I did some testing with this (-L) some time back (admittedly on
OpenSolaris) and the actual behavior will vary based on the memory
allocation library you're using and what it does with the OS
underneath. I didn't try Linux variations, but that may be worthwhile
for you. IIRC, default malloc would wait for page-fault to do the
actual memory allocation, so there'd still be risk of fragmentation.
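To make the page-fault point concrete: a hedged sketch of what
"pre-touching" looks like, so the physical allocation happens at startup
rather than being deferred to first write. The helper name is mine, not
anything in memcached.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Sketch: malloc() typically returns address space that is only
 * backed by physical pages on first write (page fault).  Writing
 * one byte per page "pre-touches" the region so the allocation
 * happens up front instead of being spread over run time. */
static void *alloc_and_touch(size_t bytes)
{
    char *p = malloc(bytes);
    if (p == NULL)
        return NULL;
    long page = sysconf(_SC_PAGESIZE);
    for (size_t off = 0; off < bytes; off += (size_t)page)
        p[off] = 0;            /* fault each page in now */
    return p;
}
```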
- Matt
Mike Lambert wrote:
> Trond, any thoughts?
>
Trond is actually on vacation, but I did steal a few cycles of his time
and asked about this.
> I'd like to double-check that there isn't a reason we can't support
> preallocation without getpagesizes() before attempting to manually
> patch memcache and play with our production system here.
>
There's no reason you can't do that. There may be a slightly cleaner
integration approach Trond and I talked through. I'll try to code that
up here in the next few days... but for now you may try your approach to
see if it helps alleviate the issue you were seeing.
Incidentally, how did the memory fragmentation manifest itself on your
system? I mean, could you see any effect on apps running on the system?
Looking again right now at a machine configured with -m 6000 (so
~6gb), I see "stats maps" showing a 512mb hashtable and 7.5gb heap.
"stats malloc" (which isn't 64-bit aware) gives:
STAT mmapped_space 564604928 # this has the 512mb hashtable
STAT arena_size -1058820096
STAT total_alloc -2040194320
STAT total_free 981374224
where arena_size = total_alloc+total_free.
Knowing that the total size of the heap is 7.5gb, I can derive that
real_arena_size = -1058820096 + 2**32 * 2 = 7531114496. Doing
total_free/real_arena_size gives 13%, which is my estimate for
free-but-unallocated ram. (Whether it's free due to fragmentation or
simply never-yet-allocated is hard to tell, but that number is still
very high.)
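The wrap-around correction above can be checked mechanically: the stats
counters are 32-bit, so the reported value is the true value modulo
2**32, and knowing the heap is roughly 7.5gb pins down how many wraps
occurred. The helper below is my own illustration, not memcached code;
the estimate argument stands in for the "stats maps" heap size.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch: recover a 64-bit counter from its wrapped 32-bit value,
 * given a rough external estimate of the true size (here, the
 * ~7.5gb heap seen in "stats maps").  Try successive multiples of
 * 2^32 and keep the candidate closest to the estimate. */
static long long unwrap32(int32_t raw, long long estimate)
{
    long long base = (long long)(uint32_t)raw;  /* 0 .. 2^32-1 */
    long long best = base;
    for (int k = 1; k < 8; k++) {
        long long cand = base + (long long)k * 4294967296LL;
        if (llabs(cand - estimate) < llabs(best - estimate))
            best = cand;
    }
    return best;
}
```

Feeding in the arena_size above with a ~7.5gb estimate recovers
7531114496, and total_free divided by that gives the 13% figure.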
Alternately, one could ask why we have a 7.5gb heap for a 6gb
memcache...why so much ram? I calculated 100mb-200mb for 7600
connections plus various free lists, but I was running into the
problem that total_free indicates there are still 981mb of unallocated
ram in the heap. So I think at the time I concluded this was due to
fragmentation.
We solved our problem by reducing the amount of ram we gave to
memcache so we didn't swap, but in theory getting an extra 10-13% of
RAM out of our memcaches sounds like a great idea. And so given my
fragmentation conclusion, I was looking for ways to reduce that.
Thoughts? Is there perhaps another explanation for the data above?
Thanks,
Mike
>
> Incidentally, why was "mallinfo" removed from memcache 1.4.0? Even
> without it being 64-bit aware, it still provided some useful data that
> I wasn't able to get via other means in our 1.2.6 binaries.
>
I wanted to remove the mallinfo call from memcached because I don't
think it belongs in the memcached protocol, but this is something you
can get from other tools (like pmap).
Another problem with using mallinfo is that not all memory allocators
implement mallinfo. Checking for mallinfo in configure would (at least
on Solaris) cause memcached to link with libmalloc, and if you look at
the manual page for libmalloc you will find:
DESCRIPTION
Functions in this library provide routines for memory allocation.
These routines are space-efficient but have lower performance.
Their usage can result in serious performance degradation.
You may think that this wouldn't be a problem because we use our own
memory allocator inside memcached, but that's not true. The slab
allocator is _only_ used to store the items, and all other memory
allocations are done through malloc (hash tables, connection structs
and buffers, suffix pool etc).
Cheers,
Trond