--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
We've used syslog-ng for a few reasons, among them: it raises the log
message size limit well beyond what standard syslog allows (we
sometimes log JSON blobs), it supports transport over UDP, TCP, and
SSL, it is a drop-in replacement for syslog (so all of the logging
tools that your platform offers will still work), and it offers
filtering and redirection of different log messages (these can get a
little ugly to configure, but it's not bad).
While I am generally a fan of hacking Redis to do just about anything,
in the case of logging: pick one of the standard log packages
(syslog-ng, flume, scribe, etc.). They work great, automatically
include time stamps, origin information, etc., and won't blow up your
memory if your log collection process fails to run for one reason or
another.
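For what it's worth, here is a minimal sketch of what the application
side might look like in Python. It assumes a syslog daemon (syslog-ng or
otherwise) listening on UDP port 514 on localhost; the logger name
"myapp" and the event fields are made up for illustration.

```python
import json
import logging
import logging.handlers


def format_event(event):
    """Serialize a dict as compact, key-sorted JSON so each log line is one blob."""
    return json.dumps(event, sort_keys=True, separators=(",", ":"))


def make_syslog_logger(name="myapp", address=("localhost", 514)):
    """Return a logger that ships records to a syslog daemon over UDP.

    The name and address are assumptions for this sketch; point the
    address at wherever your syslog daemon actually listens.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=address)
    # Only a tag and the message are supplied here; time stamps and
    # origin information are left to the receiving daemon.
    handler.setFormatter(logging.Formatter(name + ": %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = make_syslog_logger()
    log.info(format_event({"event": "signup", "user_id": 42}))
```

Large JSON blobs only survive if the receiving daemon's message size
limit has been raised accordingly.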
Regards,
- Josiah
On Tue, Jul 19, 2011 at 11:47 PM, tianyuan <iamti...@gmail.com> wrote:
> what if some messages are published but the processors are too busy to
> handle them?
[1] http://blog.reddit.com/2010/05/reddits-may-2010-state-of-servers.html
> i think, it is better for you to use a simple fifo queue implementation. but
> not with redis, since it's an in-memory database. i would use rabbitmq for
> queue management and cassandra for storing logs in the disk. (your workers
> simply get the messages from rabbitmq in the queue and send it to the
> cassandra.)
>
> --
> web developer
> http://www.emreyilmaz.me
I would wholeheartedly recommend against using RabbitMQ, as heavy
writes (sometimes as few as a few hundred/second) can cause it to
segfault. [1]
Also, putting logs into Cassandra doesn't magically make them scale.
To make any on-disk storage system really scale takes either 1) no
indexes (plain log files) or 2) more disks (Cassandra, MongoDB, etc).
Sticking with plain logs storage also makes them trivial to backup,
analyze, import into another system if he discovers a need for them
later, etc.
We ran into a segfaulting condition at 75/second; I've had friends
locally who have run into it at 20/second.
>> Also, putting logs into Cassandra doesn't magically make them scale.
>> To make any on-disk storage system really scale takes either 1) no
>> indexes (plain log files) or 2) more disks (Cassandra, MongoDB, etc).
>> Sticking with plain logs storage also makes them trivial to backup,
>> analyze, import into another system if he discovers a need for them
>> later, etc.
>>
>>
>
> 18 GB per day is a huge amount of data to store in memory.
Who said anything about storing them in memory? I'm talking about
logging them to plain files on disk (then putting them up in S3 or
similar for longer term storage/analysis).
- Josiah
That's what we thought after manually testing at 1k/second, and yet we
and others have segfaulted at that rate (we were running on a 32-bit
box, and apparently suffered some memory fragmentation). Maybe we were
running a buggy version of RabbitMQ, maybe we were running an improper
version of Erlang, I don't know. All I remember from a year and a few
months ago is: it broke about a week after Reddit had theirs break, and
we had to spend a week replacing our Celery + RabbitMQ production
infrastructure with ActiveMQ.
>> Who said anything about storing them in memory? I'm talking about
>> logging them to plain files on disk (then putting them up in S3 or
>> similar for longer term storage/analysis).
>>
> okay, there is a misunderstanding. i replaced 'plain text files' with
> cassandra. this is a good use case for it: lots of writes and few reads.
> plus, searching and analyzing logs would be so much easier with
> cassandra.
If you are running your setup in Amazon AWS, and you are storing your
data in Cassandra, all it is doing is costing you money; it's running
a cluster of Cassandra instances whose purpose is to be available to
query logs (which is rare, by definition). It's better to log to flat
files, rotate/store them every hour/day/week in S3, then run
mapreduces across the logfiles. The storage is cheaper, the mapreduce
is cheaper, and the 2nd cheapest box in AWS can easily handle 100 gigs
of logs/day. That's just not possible with one Cassandra install at
that price. Even worse, if you decide that your X Cassandra machines
aren't enough, and want to go to 2X, your write speeds drop like a
rock every time you add a new one. Again, see the Reddit "our site
totally went down" blog post from last year:
http://blog.reddit.com/2010/05/reddits-may-2010-state-of-servers.html
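A rough sketch of the flat-file side of this in Python, assuming hourly
rotation and a toy stand-in for the mapreduce step (counting lines per
log level). The file name "app.log", the logger name, and the helper
names are all illustrative, and shipping the rotated files to S3 is left
out since it needs a third-party client.

```python
import collections
import glob
import logging
import logging.handlers


def make_file_logger(path, name="applog"):
    """Log to a flat file that rolls over every hour (old files get a time suffix)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.TimedRotatingFileHandler(path, when="H", interval=1)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def count_levels(pattern):
    """Toy map/reduce over log files: map each line to its level, reduce by counting."""
    counts = collections.Counter()
    for filename in sorted(glob.glob(pattern)):
        with open(filename) as f:
            for line in f:
                # Default asctime contains one space, so the fields are:
                # date, time, level, message.
                fields = line.split(" ", 3)
                if len(fields) == 4:
                    counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    log = make_file_logger("app.log")
    log.info("user signed up")
    log.error("payment failed")
    print(count_levels("app.log*"))
```

Because the rotated files are plain text, the same glob-and-scan
approach works unchanged on files pulled back down from longer-term
storage.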
Regards,
- Josiah
Hello,
it sounds like a good idea. You are not going to store the logs in
Redis, right?
Instead you are simply using Pub/Sub as a way to collect logs centrally.
An alternative is to push into a list with LPUSH instead, and have the
"processor" of stats use BRPOP or similar to get new results. This
way you can stop the collector for some time and logs will accumulate
in Redis memory.
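A rough sketch of that LPUSH/BRPOP pattern in Python, assuming the
third-party redis-py client and a list key named "logs" (both are
illustrative choices, not something from this thread). The helpers only
rely on the client having `lpush` and `brpop` methods.

```python
def push_log(client, key, line):
    """Producer side: LPUSH the log line onto the head of the list."""
    client.lpush(key, line)


def drain(client, key, timeout=1):
    """Consumer side: BRPOP lines off the tail, oldest first.

    Stops once the list has stayed empty for `timeout` seconds.
    """
    while True:
        item = client.brpop(key, timeout=timeout)
        if item is None:  # timed out, nothing queued
            return
        _key, value = item  # BRPOP replies with a (key, value) pair
        yield value


if __name__ == "__main__":
    import redis  # third-party redis-py client; assumed to be installed

    r = redis.Redis()  # assumes a Redis server on localhost:6379
    push_log(r, "logs", "first line")
    push_log(r, "logs", "second line")
    for line in drain(r, "logs"):
        print(line)
```

If the consumer is stopped, pushed lines simply stay in the list until
it comes back, which is the accumulation behaviour described above.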
> When the subscriber gets the log, is the log still in memory or has it
> completely disappeared?
Completely disappeared. This is why you may want to use lists instead.
But it depends on your use case.
However Pub/Sub or queues are a good way to communicate between many
instances without inventing your own networking layer.
Cheers,
Salvatore
--
Salvatore 'antirez' Sanfilippo
open source developer - VMware
http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele