Dynamic settings


Scott Mansfield

Jan 23, 2017, 9:18:25 PM
to memcached
There was a single setting my team was looking at today that we wished we could change dynamically: the reqs_per_event setting. Right now, in order to change it we need to shut down the process and start it again with a different -R parameter. I don't see a way to change many of the settings, though a few are ad-hoc changeable through some stats commands. I was going to see if I could patch memcached to make reqs_per_event changeable at runtime, but before doing so I wanted to check whether that's something you'd be amenable to. I also didn't want to do something specific to that one setting if it would be better added as a general feature.

I see some pros and cons:

One easy pro is that you can change things at runtime to tune performance without losing all of your data. If client request patterns change, the process can react.

A con is that the startup parameters won't necessarily match what the process is doing, so they're no longer a reliable way to determine memcached's settings. Instead you would need to connect and issue a stats settings command to read them. It also introduces change in places that may never have seen it before; e.g., the reqs_per_event setting is simply read at the top of the drive_machine loop. It might now need some kind of synchronization around it. I don't think it strictly needs it on x86_64, but it might on other platforms I'm not familiar with.

dormando

Jan 24, 2017, 2:53:52 PM
to 'Scott Mansfield' via memcached
Hey,

Would you mind explaining a bit how you determined the setting was causing
an issue, and what the impact was? The default there is very old and might
be worth a revisit (or some kind of auto-tuning) as well.

I've been trending as much as possible toward online configuration, including
the actual memory limit. You can turn the lru crawler on and off,
automoving on and off, manually move slab pages, etc. I'm hoping to make
the LRU algorithm itself modifiable at runtime.

So yeah, I'd take a patch :)
> --
>
> ---
> You received this message because you are subscribed to the Google Groups "memcached" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to
> memcached+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
>

Scott Mansfield

Jan 25, 2017, 1:57:33 PM
to memcached
The reqs_per_event setting was causing a client that was doing large batch-gets (of a few hundred keys) to see some timeouts. Since memcached will delay responding fully until other connections are serviced, and our client will wait until the batch is done, we see some client-side timeouts for the users of our client library. Our solution has been to raise the setting at startup, but as a thought experiment I was asking whether we could have done it dynamically to avoid losing data. At the moment there's quite a lot of machinery to change the setting (deploy, copy data over with our cache warmer, flip traffic, tear down old boxes) and I would rather have left everything as-is and adjusted the setting on the fly until our client's problem was resolved.

I'm interested in patching this specific setting to be settable, but having it fully dynamic in nature is not something I'd want to tackle. There's a natural tradeoff of latency for other connections / throughput for the one that is currently being serviced. I'm not sure it's a good idea to dynamically change that. It might cause unexpected behavior if one bad client sends huge requests.


Scott Mansfield

Product > Consumer Science Eng > EVCache > Sr. Software Eng
{
  K: {M: mobile, E: email, K: key}
}


dormando

Jan 25, 2017, 2:04:16 PM
to 'Scott Mansfield' via memcached
I guess when I say dynamic I mostly mean runtime-settable. Dynamic is a
little harder, so I tend to do those as a second pass.

You're saying your client had head-of-line blocking for unrelated
requests? I'm not 100% sure I follow.

Big multiget comes in, multiget gets processed slightly slower than normal
due to other clients making requests, so requests *behind* the multiget
time out, or the multiget itself?

How long is your timeout? :P

I'll take a look at it as well and see about raising the limit in `-o
modern` after some performance tests. The default is from 2006.

thanks!

Scott Mansfield

Jan 25, 2017, 2:25:53 PM
to memcached
The client is the EVCache client jar: https://github.com/netflix/evcache

When a user calls the batch get function on the client, it will spread those batch gets out over many servers because it hashes keys to different servers. Imagine many of these batch gets happening at the same time, though, and each server's queue will receive gets from many different user-facing batch gets. It all gets intermixed. These client-side read queues are rather large (10000) and might end up sending a batch of a few hundred keys at a time. These large batch gets are sent off to the servers as "one" getq|getq|getq|getq|getq|getq|getq|getq|getq|getq|noop packet and read back in that order. We are reading the responses fairly efficiently internally, but the batch get call that the user made is waiting on the data from all of these separate servers to come back before it can respond to the user synchronously.

Now on the memcached side, many servers are all seeing this same pattern of many large batch gets. Memcached will stop responding to a connection after 20 requests on the same event and go serve other connections. When that happens, any user-facing batch call that is waiting on a getq command still queued on that connection can be delayed. It doesn't normally end up causing timeouts, but it does at a low rate.

Our timeouts for this app in particular are 5 seconds for a single user-facing batch get call. This client app is fine with higher latency for higher throughput.

At this point we have the reqs_per_event set to a rather high 300 and it seems to have solved our problem. I don't think it's causing any more consternation (for now), but having a dynamic setting would have lowered the operational complexity of the tuning.





dormando

Jan 25, 2017, 2:33:49 PM
to 'Scott Mansfield' via memcached
Okay, so it's the big rollup that gets delayed. Makes sense.

You're using binary protocol for everything? That's a major focus of my
performance annoyance right now, since every response packet is sent
individually. I should have that switched to an option at least pretty
soon, which should also help with the time it takes to service them.

I'll test both ascii and binprot + the req_per_event option to see how bad
this is measurably.

Scott Mansfield

Jan 25, 2017, 4:05:15 PM
to memcached
Yes, our production traffic all uses binary protocol, even behind our on-server proxy that we use. In fact, if you have a way to reduce syscalls by batching responses, that would solve another huge pain we have that's of our own doing.




dormando

Jan 25, 2017, 4:52:24 PM
to 'Scott Mansfield' via memcached
Yeah, gimme a few weeks maybe. Reducing those syscalls is almost all of
the CPU usage: the difference between 1.2m keys/sec and 35m keys/sec on 20
cores in my own tests.

I did this:
https://github.com/memcached/memcached/pull/243
.. which would help batch perf.
and this:
https://github.com/memcached/memcached/pull/241
.. which should make binprot perf better at nearly undetectable cost to
ascii.

so, working my way to it.

Scott Mansfield

Feb 9, 2017, 11:30:59 PM
to memcached
I have opened a pull request with a preliminary implementation for a settings command: https://github.com/memcached/memcached/pull/255

I took a few liberties, so let me know if anything is out of line.