Advice on Simultaneous multiple access requests to Store


Jeremy Cox

Jun 21, 2016, 12:10:36 PM
to project-voldemort
I have been benchmarking, and given the time it takes to get a response from the server, it is clear that I would get a substantial speed increase by making multiple requests in parallel.

Can you advise on best practice for making parallel requests with regard to Java client architecture?

Off the top of my head, I can think of two fundamental approaches:
        I create one StoreClient<T,T> and have all threads talk through it.
        I create a new client for each thread.

It seems ideal to have one client from a coding simplicity standpoint.
However, I don't know how it is implemented. Is StoreClient<T,T> designed to handle multiple requests in a thread-safe manner?

Thanks again for this great freeware.

Brendan Harris (a.k.a. stotch on irc.oftc.net)

Jun 21, 2016, 1:17:20 PM
to project-voldemort
Hi Jeremy,

The client supports parallel requests. You would not want to spin up multiple clients. That would be very costly in terms of JVM use and context switches.

There are two client configs you can use to increase the parallelism:
max_connections (default 50)
selectors (default 8)

The default settings are usually good enough to get you up to hundreds of requests per second out of a single client. If you find yourself needing to get up to thousands of requests a second, you might want to bump selectors up to about 20 and maybe bump max_connections up to around 100. Increasing the selectors increases the number of threads available to the java.nio selector manager, which allows it to handle more requests in parallel. max_connections is the maximum number of open connections the client can have to each voldemort server for sending requests to them. Increasing these numbers too much can have a significant negative impact on the JVM environment that your app lives in.
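As a rough illustration of the tuning above, here is a minimal sketch that collects the two settings in a `java.util.Properties` object. The key names `max_connections` and `selectors` and the values 100 and 20 come from this reply. The class and method names are hypothetical, and how the properties are handed to the Voldemort client (for example through its client configuration class) is an assumption you should check against the version you run.

```java
import java.util.Properties;

public class ClientTuningSketch {

    /**
     * Builds client properties using the two key names mentioned in this thread.
     * Hypothetical helper: passing the result to the Voldemort client is an
     * assumption to verify against your Voldemort version.
     */
    static Properties tunedClientProperties(int maxConnections, int selectors) {
        if (maxConnections < 1 || selectors < 1) {
            throw new IllegalArgumentException("both settings must be positive");
        }
        Properties props = new Properties();
        // Maximum open connections from this client to each server (default 50).
        props.setProperty("max_connections", Integer.toString(maxConnections));
        // Number of NIO selector threads in the client (default 8).
        props.setProperty("selectors", Integer.toString(selectors));
        return props;
    }

    public static void main(String[] args) {
        // Values suggested above for thousands of requests per second.
        Properties props = tunedClientProperties(100, 20);
        System.out.println("max_connections=" + props.getProperty("max_connections"));
        System.out.println("selectors=" + props.getProperty("selectors"));
    }
}
```

Raise the values in small steps and watch the client JVM while you do, since setting them too high has its own cost.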

You'll also want to make sure that your voldemort servers can handle both the connections and the number of requests being sent over them. So, you need to have enough servers, selectors, and I/O capability. You'll probably have to play around a bit with the server configs.

We run clusters that have 22 servers in them, serving 20+k connections apiece and handling hundreds of thousands of requests a second for more than 1,000 upstream application instances. And we have some client instances that have been able to funnel more than 10,000 requests a second back to clusters containing 30 nodes. But we also back our storage with SSD, which helps a lot with the throughput capabilities.

Arunachalam

Jun 21, 2016, 1:24:42 PM
to project-...@googlegroups.com
The StoreClient object is thread-safe, so you can share the same storeClient across multiple threads. The second option of creating one storeClient per thread will work as well.

You are right: do what is simpler and see if it meets your performance needs.
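The "one shared client" option can be sketched as below. To keep the example runnable without a server, `KeyValueClient` and `InMemoryClient` are hypothetical stand-ins and are not the Voldemort API. In a real application the shared object would be the StoreClient you obtain from your client factory. The part that carries over is the pattern: create one thread-safe client and have every worker thread call it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedClientSketch {

    /** Hypothetical stand-in for a thread-safe store client; NOT the Voldemort API. */
    interface KeyValueClient {
        void put(String key, String value);
        String get(String key);
    }

    /** In-memory stand-in so the sketch runs without a server. */
    static final class InMemoryClient implements KeyValueClient {
        private final ConcurrentHashMap<String, String> data = new ConcurrentHashMap<>();
        public void put(String key, String value) { data.put(key, value); }
        public String get(String key) { return data.get(key); }
    }

    /**
     * Runs `threads` workers that all write and read through ONE shared client.
     * Returns how many reads returned the value that was just written.
     */
    static int runWorkers(KeyValueClient client, int threads, int requestsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger verified = new AtomicInteger();
        for (int t = 0; t < threads; t++) {
            final int worker = t;
            pool.submit(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    String key = "worker-" + worker + "-key-" + i;
                    String value = "value-" + i;
                    client.put(key, value);
                    if (value.equals(client.get(key))) {
                        verified.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return verified.get();
    }

    public static void main(String[] args) {
        KeyValueClient shared = new InMemoryClient(); // created once, shared by all workers
        System.out.println("verified requests: " + runWorkers(shared, 8, 1000));
    }
}
```

Sizing the thread pool is a separate decision from the client settings, so it is worth benchmarking a few pool sizes against your own store.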

Thanks,
Arun.




Jeremy Cox

Jun 28, 2016, 3:09:11 PM
to project-voldemort


> You'll also want to make sure that your voldemort servers can handle both the connections and the number of requests being sent over them. So, you need to have enough servers, selectors, and I/O capability. You'll probably have to play around a bit with the server configs.

I am doing prototype development -- I am running a single node on my machine and then running Java code on the same machine to access the store.  (I am thinking of trying a small 5GB database as a proof of concept before scaling up.)

I get the occasional timed-out exception when using put() -- and I am only reading/writing one at a time, so I think I should probably try some server tweaks.  I am using a default set-up (copied from examples, see below).

At first glance it seems I have enough threads.  But I wonder: do I need to explicitly allow multiple processes?  What other parameters are available to tweak?  I am not sure where to find the list.  I have been digging around the source and so far haven't found it.  I probably don't know where to look :D

Thank you,
Jeremy


server.properties

max.threads=100

############### DB options ######################

http.enable=true
socket.enable=true

# BDB
bdb.write.transactions=false
bdb.flush.transactions=false
bdb.cache.size=1G
bdb.one.env.per.store=true

# Mysql
mysql.host=localhost
mysql.port=1521
mysql.user=root
mysql.password=XXXXXXX
mysql.database=test

#NIO connector settings.
enable.nio.connector=true

request.format=vp3
storage.configs=voldemort.store.bdb.BdbStorageConfiguration, voldemort.store.readonly.ReadOnlyStorageConfiguration


Brendan Harris (a.k.a. stotch on irc.oftc.net)

Jun 30, 2016, 11:51:31 AM
to project-voldemort
It depends on how many records you have in the store and how many of those records you are reading/writing. It also depends on what kind of disks you're backing the store with. You could have a number of limitations there. If you're working on a single spinning disk and you're hitting a lot of keys, you're going to be blocked on I/O. The best thing to do is to take a look at some of the system performance metrics, like iowait, and see what is going on there.

Also, a 1GB bdb cache might be too small, depending on how many keys you have and how many stores you have on that node. The default configuration for the storage engine is going to attempt to keep all the keys and values inside that bdb cache, which is inside the java heap. If you run out of bdb cache and start getting a lot of cache churn, you're going to have a lot of heavy GC activity. So, another thing to look at is garbage collection activity.
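To put the cache-size point in numbers, here is a back-of-the-envelope sketch using the figures from this thread: a 5GB proof-of-concept data set and `bdb.cache.size=1G`. The class and method names are hypothetical, and the estimate ignores any per-record overhead in the storage engine, so the real fraction that fits in the cache will be lower than what it prints.

```java
public class CacheFitSketch {

    /**
     * Upper bound on the fraction of the data set that can sit in the cache.
     * Simplifying assumption: no per-record overhead is counted.
     */
    static double cacheFraction(long datasetBytes, long cacheBytes) {
        if (datasetBytes <= 0 || cacheBytes < 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return Math.min(1.0, (double) cacheBytes / datasetBytes);
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        long dataset = 5 * gib; // planned proof-of-concept data set
        long cache = 1 * gib;   // bdb.cache.size=1G from server.properties
        double fraction = cacheFraction(dataset, cache);
        System.out.println(String.format("at most %.0f%% of the data fits in the cache", fraction * 100));
    }
}
```

With these inputs the cache can hold at most a fifth of the data, so a workload that touches keys across the whole data set will keep evicting entries and going back to disk.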

Another thing to note is that voldemort is designed to take advantage of parallelism and having data spread across multiple servers. So, using one client to talk to one server is going to significantly limit the performance capability of voldemort. Just having two servers will be a significant performance increase over one.

If you tell us what kind of hardware you have to work with, how many stores you're going to use and how many keys and QPS you'll be doing, and give us an idea of the key size and the value size on average, we could recommend a voldemort and JVM configuration that will accommodate your testing.

Brendan