Java client for fast mass insertion

Jan O

Mar 31, 2015, 1:35:33 PM
to redi...@googlegroups.com
Hi all,

I am currently using Jedis as my Redis client. In my scenario, I need to upload a couple of million keys (simple key-value pairs) every few minutes. Even with Jedis's pipelining feature, it takes more than 10 minutes for 2 million keys. Each pipeline holds 10k commands, after which I call .sync().

Are you aware of any other Redis Java client that might be more suitable for my use case?

Thanks,
Jan.

Ross Kristof

Mar 31, 2015, 2:35:20 PM
to redi...@googlegroups.com
I haven't used any other clients, but you didn't mention a couple of big things that may speed things up considerably.

1. Are you using a single Jedis client or a JedisPool?
2. If you are already using a JedisPool, play around with the size of the pool (default is 8) if you haven't already.
3. It's worth double-checking that you aren't bottlenecking on a thread in your app or a core on the machine hosting Redis.

Jan

Mar 31, 2015, 3:40:12 PM
to redi...@googlegroups.com
Thanks for the reply.

At the beginning I was doing it in a single thread. Now I have multiple threads (one per core, 8 in my case), each with its own JedisPool of the default size and a pipeline of 10k commands.

What should Redis's throughput be? I have read about 100k ops/s.
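
For scale, the numbers quoted in this thread can be put side by side. This is only a rough sketch: the class and method names are made up for illustration, and 100k ops/s is just the figure mentioned above, not a measured value for this setup.

```java
// Hypothetical helper for back-of-the-envelope throughput arithmetic.
final class ThroughputCheck {

    /** Observed rate: operations completed divided by elapsed seconds. */
    static double opsPerSecond(long operations, double elapsedSeconds) {
        return operations / elapsedSeconds;
    }

    /** Time needed for the given number of operations at a given rate. */
    static double secondsFor(long operations, double opsPerSecond) {
        return operations / opsPerSecond;
    }

    public static void main(String[] args) {
        // 2 million keys in 10 minutes (600 s) is roughly 3,333 ops/s.
        System.out.println(opsPerSecond(2_000_000L, 600.0));
        // At 100k ops/s the same load would take 20 seconds.
        System.out.println(secondsFor(2_000_000L, 100_000.0));
    }
}
```

Since the upload takes more than 10 minutes, roughly 3,333 ops/s is an upper bound on the current rate, about 30 times below the quoted figure.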

Ross Kristof

Mar 31, 2015, 4:45:13 PM
to redi...@googlegroups.com
Throughput will vary from system to system and network to network. The best way to determine whether Redis is overloaded is to look at the metrics of the system hosting it (CPU usage, and I/O usage if you have AOF/RDB on).

If the redis host isn't being fully utilized I'd look at how long the sync() calls are taking as they will block a thread until they complete. If CPU usage on the machine running the app is low then it's likely a case of resource starvation (thread or network being most likely).
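
One way to see how long each sync() call blocks is to wrap it in a timer. A minimal sketch using only the JDK follows. SyncTimer is a hypothetical helper, and the usage shown in the comment assumes a Jedis Pipeline variable named pipeline.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: measures how long a blocking call takes.
final class SyncTimer {

    /** Runs the given action and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Assumed usage with Jedis (the variable name "pipeline" is an assumption):
    //   long ms = SyncTimer.timeMillis(pipeline::sync);
    //   System.out.println("sync() of 10k commands took " + ms + " ms");
}
```

If the time per sync() is large while the Redis host is mostly idle, that points toward the client side or the network rather than the server.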

Also, JedisPool is thread-safe, so you only need one instance shared between all the threads. The rule of thumb I usually start with for the pool is one connection per thread, tweaking as needed.
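
The layout described above (one shared pool, a fixed set of worker threads, one pipeline per batch) might be sketched roughly as follows. BulkLoadSketch and its methods are hypothetical names; the code uses only the JDK and takes the actual write as a callback, so the Jedis part appears only in a comment and assumes a single shared JedisPool variable named pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch: split the data into batches and write them from a fixed thread pool.
final class BulkLoadSketch {

    /** Splits entries into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> toBatches(List<T> entries, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < entries.size(); from += batchSize) {
            int to = Math.min(from + batchSize, entries.size());
            batches.add(entries.subList(from, to));
        }
        return batches;
    }

    /** Hands each batch to batchWriter on one of `threads` workers and waits for all of them. */
    static <T> void loadInParallel(List<T> entries, int batchSize, int threads,
                                   Consumer<List<T>> batchWriter) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> batch : toBatches(entries, batchSize)) {
                pending.add(executor.submit(() -> batchWriter.accept(batch)));
            }
            for (Future<?> f : pending) {
                f.get(); // blocks until the batch is written; surfaces any failure
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("bulk load failed", e);
        } finally {
            executor.shutdown();
        }
    }

    // Assumed Jedis-based batchWriter (one shared JedisPool named "pool", sized to match
    // the number of threads; entries are Map.Entry<String, String>):
    //   batch -> {
    //       try (Jedis jedis = pool.getResource()) {
    //           Pipeline p = jedis.pipelined();
    //           for (Map.Entry<String, String> e : batch) p.set(e.getKey(), e.getValue());
    //           p.sync();
    //       }
    //   }
}
```

With 8 threads and 10k-entry batches, each worker borrows one connection at a time from the shared pool, so a pool size of 8 is a reasonable starting point.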