"I won't say anything whether you are right or wrong." -- Konstantin
Redis is fast enough for most real-world tasks unless you are Facebook.
I agree. As long as the task is kept simple.
As I mentioned in an earlier post, "Real world (PHP + Rediska) speed vs
Redis Benchmark", I GET around 7,200 values/sec against 121,733 keys.
It's not a benchmark, it's a result. Yet the Redis benchmark utility
says 55,000 values/sec.
That may very well be total throughput using 50 clients etc., but this
is where I feel the hyperbole kicking in.
If all one is doing is pulling the values of keys, then Redis is fine.
But in my world, I need to process the data.
I'm looking to use a NoSQL DB for semantic networking against large
datasets. My queries look like:
"Get all the people in group A (a SET with 100,000 values) that live
in Texas, listen to "The Killers", are under 35 and drive SUVs"
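For what it's worth, this kind of multi-attribute filter is what Redis's SINTER command is meant for: keep each attribute as its own SET of member IDs and intersect them server-side, so only the matches cross the wire. Here is a minimal in-process sketch of the idea, with plain Python sets standing in for Redis SETs (all names and member IDs are hypothetical):

```python
# Each attribute is modeled as its own set of member IDs, mirroring
# Redis SETs such as group:A, state:texas, likes:the-killers, etc.
# (Hypothetical data; in Redis the same query is one SINTER call.)
group_a    = {"u1", "u2", "u3", "u4"}
in_texas   = {"u2", "u3", "u7"}
likes_band = {"u2", "u3", "u4"}
under_35   = {"u1", "u2", "u3"}
drives_suv = {"u3", "u9"}

# Server-side equivalent:
#   SINTER group:A state:texas likes:the-killers age:under-35 drives:suv
matches = group_a & in_texas & likes_band & under_35 & drives_suv
print(sorted(matches))
```

With SINTER, the filtering happens inside the server process, so the client receives only the final matches instead of issuing 100,000 GETs and filtering on the PHP side.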
I said "The Apache -> Predis/Rediska -> Redis I/O roundtrip simply
chokes, and I see no way of speeding it up.",
and you replied "There is no single networked DB without this
limitation, as I explained above."

And therein lies the rub. With the Rebol experiments, I don't have the
network limitation. Blocks (Rebol's name for SETs) are native to the
Cheyenne webserver.
Cheyenne is non-blocking and event-driven as well.
I can loop this 1 million times in 0.389 seconds. So retrieving values
by index has phenomenal performance. Storing values by index position
seems beneficial.
Now, you can say what you want, but these RESULTS over Predis/Rediska
+ Redis make Rebol the clear winner.
Rebol using pick: 2,570,694/s
Predis/Rediska + Redis: 7200/s
On my 2-year-old MacBook Pro laptop, using jredis in pipeline mode, with Redis using the append-only log with no fsync, I get about:
Pushed 10000 items in 202 msec.
or a bit under 50k/sec. On a beefy test machine this morning, I actually got:
Pushed 100000 items in 704 msec.
or well over 100k/sec. The test code is in Scala below. I didn't put any particular effort into this code, so that's strong praise for the jredis library.
Using pipelining is KEY to getting results this good, by the way.
robey
def writeABunchPipelined(redis: JRedisPipeline) {
  // Assumes the surrounding test harness defines: buffer (a
  // java.nio.ByteBuffer) and the constants pushes, totalKeys, and
  // pipelineSize. Needs scala.collection.mutable and the jredis
  // Future/ResponseStatus types imported (package paths vary by version).
  var currentKey = 0
  var statusId = 1L
  // Responses issued but not yet waited for:
  var pipeline = new mutable.ListBuffer[Future[ResponseStatus]]
  buffer.clear()
  buffer.putLong(statusId)
  buffer.putLong(0)
  val startTime = System.currentTimeMillis
  var i = 0
  while (i < pushes) {
    val key = "data:" + currentKey
    pipeline += redis.rpush(key, buffer.array)
    // Cap the number of in-flight requests: once the window is full,
    // block on the oldest response before issuing more.
    if (pipeline.size > pipelineSize) {
      pipeline.remove(0).get()
    }
    i += 1
    currentKey = (currentKey + 1) % totalKeys
  }
  // Drain the remaining in-flight responses.
  while (pipeline.size > 0) {
    pipeline.remove(0).get()
  }
  val endTime = System.currentTimeMillis
  println("Pushed %d items in %d msec.".format(pushes, endTime - startTime))
}
Very nice to see a counter example ;)
Thanks for sharing,
Because you are doing in-process access, it is (as stated previously)
apples and oranges (redis talks across the network). You can use C++
and the STL to do tens of millions of gets/second from an in-process
hash on a modern machine. Heck, I wrote my own hash table 6 years ago
(using C), ran it on a 1.2 GHz mobile processor, and was able to do 5+
million gets/second. That sort of makes your Rebol-based solution
seem slow now, doesn't it?
[De]serialization, network latency (even over a local socket, pipe, or
Unix domain socket), and data copies are going to reduce performance
significantly.
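To make the in-process vs. networked comparison concrete, here is a rough sketch of what an in-process hash sustains when no serialization, syscall, or network hop is involved (plain Python dict; the throughput printed will vary by machine, so treat it as illustrative only):

```python
import time

# An in-process hash table: one million keys, values computed up front.
store = {i: i * 2 for i in range(1_000_000)}

n = 1_000_000
start = time.perf_counter()
total = 0
for i in range(n):
    total += store[i]  # direct memory access: no round trip, no copy
elapsed = time.perf_counter() - start

print(f"{n / elapsed:,.0f} gets/sec in-process")
```

The same loop against any networked store pays at least one [de]serialization and one round trip per GET unless it is pipelined, which is exactly why comparing an in-process Rebol block to a networked Redis client is apples and oranges.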
Good luck,
- Josiah
P.S. Incidentally, common use-cases of Google's AppEngine include
memcache copies of model objects. While this would seem like a good
idea generally (memcaching is obviously faster!), it turns out that
the de-serialization of the models could take longer than fetching the
models directly from the data store. I don't know if that is still
the case, but I remember learning that lesson the hard way about a
year ago (removing memcache gave me a 2-3x reduction in overall
request latency).
Are you sure you're counting the full round-trip time? And are you
sure you are measuring the actual display time?
Regardless, if you have a (javascript) client that needs to make an
http request directly to a cache/datastore, unless you added the
functionality to redis (not necessarily the easiest thing in the
world), that's just one more hop, and it is unlikely that redis
addresses your use-case directly.
- Josiah
As I've mentioned, if you're using Redis simply for pulling the values
of keys, then fine. And everything depends on the usage.
I'm running an inference engine against millions of keys: pulling
properties, filtering, grouping, then pulling more properties, etc.
Some of these more complicated queries using MySQL can take 10 or 20
seconds. But at 7,700 GETs/sec using, say, Predis, it's worse than MySQL.
Now some here have said that this kind of data crunching is not what
Redis is for. From the performance I've seen, that's a correct
statement.
If I/O is the problem, then Redis has a problem:

Client <- io -> (PHP, RUBY, NODE.JS) <- io -> Redis <- io (optional) -> DB
I'm putting together an example that connects clientside JS -> Rocket
(the combined http server and key / value store) via websocket.
Initial results look promising.
Roundtrip from JS -> Rocket (send message, pull 100,000 values from
1,000,000 key/values, display time on page): 0.07499980926513672 s
Time to pull the 100,000 values: 0.069 s
Same thing with Predis and Redis.. 12 seconds.
The other thing mentioned was how data should be structured in Redis
for max performance. Shouldn't this be automatic? NoSQL that requires
a complicated schema of sorts is just SQL in a new dress.
Redis can handle many more commands than a client can issue and
process results for. Correct me if I'm wrong, but I believe Redis
doesn't have to wait around for the client to finish reading before it
can continue to process the next request.
The important thing to understand is that, yes, one client may not be
able to hit a huge throughput, but that doesn't mean Redis can't
handle it. By increasing the number of clients connected, you're able
to utilize the streaming qualities and actually test Redis' capacity
instead of measuring latency.
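The latency-vs-throughput distinction can be put in numbers. A back-of-the-envelope sketch (the per-command round-trip figure is an assumption, chosen to match the ~7,200/sec single-client result reported earlier in the thread):

```python
# Assumed per-command round trip, derived from the observed 7,200/sec
# of one synchronous client: each GET must fully complete before the
# next one is issued.
rtt = 1 / 7200  # seconds, ~0.139 ms

# One synchronous client: throughput is capped by latency.
one_client = 1 / rtt
print(f"1 client:   {one_client:,.0f} ops/sec")

# Fifty concurrent clients (or one pipelined client with 50 requests
# in flight): the server processes the next command while earlier
# replies are still on the wire, until it saturates.
fifty_clients = 50 / rtt
print(f"50 clients: {fifty_clients:,.0f} ops/sec (upper bound)")
```

That is why redis-benchmark's 55,000/sec (run with many concurrent clients) and a single synchronous client's 7,200/sec can both be correct: one measures the server's capacity, the other measures one connection's latency.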
Matt
--
Matt Todd
Highgroove Studios
www.highgroove.com
cell: 404-314-2612
blog: maraby.org
Scout - Web Monitoring and Reporting Software
www.scoutapp.com
Hello Terry,
you made your point, and most of us (for sure me) don't agree.
I personally think that this is starting to sound like trolling /
spam-about-another-product, and since the people subscribed to this
mailing list are interested in Redis, good arguments, and a good
signal/noise ratio, I suggest we stop this discussion, or move it to
some other, more appropriate place.
Kind Regards,