Yeah, it kind of sounds like it. You'll need a lot of parallel clients.
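To give a concrete idea of what "a lot of parallel clients" means, here is a minimal sketch of a multi-threaded load generator. The `do_query` function is a hypothetical stand-in for a real store call (e.g. a Voldemort client `get`), not part of any actual client API:

```python
# Minimal parallel load-generator sketch.
# `do_query` is a hypothetical placeholder for a real client call.
import time
from concurrent.futures import ThreadPoolExecutor

def do_query(key):
    # Placeholder: in a real benchmark this would be something like
    # store_client.get(key); here it just simulates a tiny unit of work.
    return key % 7

def run_benchmark(num_clients=50, requests_per_client=1000):
    start = time.perf_counter()

    def worker(client_id):
        # Each simulated client issues its share of requests serially.
        for i in range(requests_per_client):
            do_query(client_id * requests_per_client + i)
        return requests_per_client

    with ThreadPoolExecutor(max_workers=num_clients) as pool:
        total = sum(pool.map(worker, range(num_clients)))

    elapsed = time.perf_counter() - start
    return total, total / elapsed  # total requests, observed QPS

total, qps = run_benchmark(num_clients=8, requests_per_client=100)
print(total)  # prints 800
```

The point is that a single serial client will bottleneck on its own round-trip latency long before the server saturates; throughput only climbs when many clients issue requests concurrently.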
Have you looked at GC activity on both your client and server JVMs to see whether garbage collection is causing the app to pause periodically, and whether those pauses grow as you increase throughput?
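One cheap way to quantify GC-induced hangs is to sum the stopped-time entries that `-XX:+PrintGCApplicationStoppedTime` writes to gc.log (the same flag our JVM configs below enable). A small sketch, with the log line format assumed from typical HotSpot output rather than taken from your actual log:

```python
# Sum application stopped-time from a HotSpot gc.log.
# SAMPLE_LOG mimics the format produced by -XX:+PrintGCApplicationStoppedTime
# with date/time stamps enabled; the lines here are made-up examples.
import re

SAMPLE_LOG = """\
2015-06-01T03:12:01.123+0000: 17.342: Total time for which application threads were stopped: 0.0021500 seconds
2015-06-01T03:12:05.456+0000: 21.675: Total time for which application threads were stopped: 0.1530000 seconds
"""

STOPPED_RE = re.compile(r"stopped: ([0-9.]+) seconds")

def total_stopped_seconds(log_text):
    # Add up every stopped-time entry; dividing by wall-clock run time
    # gives the fraction of time your app spent paused.
    return sum(float(m.group(1)) for m in STOPPED_RE.finditer(log_text))

print(total_stopped_seconds(SAMPLE_LOG))
```

If that total grows disproportionately as you push more throughput, GC pauses are likely what is capping your QPS.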
Our servers steadily handle more than 20K QPS per server, sustained over multiple days. I'd be really surprised if you could not come close to that.
Let me show you what our server config looks like ...
ReadOnlyStorageEngine servers:
admin.streams.buffer.size=1024
bdb.enable=false
data.directory=${voldemort.data.dir}
enable.grandfather=false
enable.nio.connector=true
enable.readonly.engine=true
enable.repair=false
enable.server.routing=false
enable.verbose.logging=false
file.fetcher.class=voldemort.store.readonly.fetcher.HdfsFetcher
hdfs.fetcher.buffer.size=16MB
hdfs.fetcher.tmp.dir=${voldemort.home.dir}/voldemort
fetcher.retry.count=12
http.enable=false
jmx.enable=true
nio.connector.selectors=50
readonly.hadoop.config.path=${voldemort.home.dir}/config/hadoop
slop.enable=false
socket.buffer.size=65000
socket.enable=true
storage.configs=voldemort.store.readonly.ReadOnlyStorageConfiguration
voldemort.home=${voldemort.home.dir}
ReadOnly JVM config:
-server \
-Xms4096m \
-Xmx4096m \
-XX:NewSize=1024m \
-XX:MaxNewSize=1024m \
-XX:+UseConcMarkSweepGC \
-XX:+UseParNewGC \
-XX:CMSInitiatingOccupancyFraction=70 \
-XX:SurvivorRatio=2 \
-XX:+AlwaysPreTouch \
-XX:+UseCompressedOops \
-XX:+PrintTenuringDistribution \
-XX:+PrintGCDetails \
-XX:+PrintGCDateStamps \
-Xloggc:$LOG_DIR/gc.log \
-XX:+PrintGCApplicationStoppedTime \
-XX:+PrintGCTimeStamps \
-XX:+PrintGCApplicationConcurrentTime \
-Djava.net.preferIPv4Stack=true
BdbStorageEngine servers:
admin.max.threads=40
bdb.cache.evictln=true
bdb.cache.size=20GB
bdb.cleaner.interval.bytes=15728640
bdb.cleaner.lazy.migration=false
bdb.cleaner.min.file.utilization=0
bdb.cleaner.threads=1
bdb.enable=true
bdb.evict.by.level=true
bdb.expose.space.utilization=true
bdb.lock.nLockTables=47
bdb.minimize.scan.impact=true
bdb.one.env.per.store=true
bdb.raw.property.string=je.cleaner.adjustUtilization=false
data.directory=${voldemort.data.dir}
enable.server.routing=false
enable.verbose.logging=false
http.enable=false
max.proxy.put.threads=50
nio.connector.selectors=50
num.scan.permits=2
restore.data.timeout.sec=1314000
retention.cleanup.first.start.hour=3
scheduler.threads=24
storage.configs=voldemort.store.bdb.BdbStorageConfiguration
stream.read.byte.per.sec=209715200
stream.write.byte.per.sec=78643200
voldemort.home=${voldemort.home.dir}
BDB JVM config:
-server
-Xms32684m
-Xmx32684m
-XX:NewSize=2048m
-XX:MaxNewSize=2048m
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=70
-XX:SurvivorRatio=2
-XX:+AlwaysPreTouch
-XX:+UseCompressedOops
-XX:+PrintTenuringDistribution
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:$LOG_DIR/gc.log
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCApplicationConcurrentTime
We have separate, dedicated configs for each storage engine we support, since each engine behaves very differently from the others. We also never load more than one storage engine inside a single JVM: mixing engines makes the JVM too unstable and almost impossible to tune.
Each of our servers is hit by hundreds to thousands of clients and maintains thousands to tens of thousands of connections; individual servers often handle tens of thousands of queries per second. In our production infrastructure, clusters often serve more than half a million queries per second.
Brendan