Bug#435884: ITP: rsyslog -- enhanced multi-threaded syslogd

Michael Biebl

Aug 3, 2007, 6:20:05 PM
Package: wnpp
Severity: wishlist
Owner: Michael Biebl <bi...@debian.org>

* Package name : rsyslog
Version : 1.18.0
Upstream Author : Rainer Gerhards <rger...@adiscon.com>
* URL : http://www.rsyslog.com
* License : GPL v2 or later
Programming Lang: C
Description : enhanced multi-threaded syslogd

Rsyslog is an enhanced multi-threaded syslogd supporting, amongst
others:
* MySQL
* syslog/tcp
* RFC 3195
* permitted sender lists
* filtering on any message part
* fine grained output format control
* backup log destinations
.
It is quite compatible with stock sysklogd and can be used
as a drop-in replacement.


-- System Information:
Debian Release: lenny/sid
APT prefers unstable
APT policy: (500, 'unstable'), (300, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.22.1
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash


--
To UNSUBSCRIBE, email to debian-bugs-...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org

Hamish Moffatt

Aug 4, 2007, 12:20:11 AM
On Sat, Aug 04, 2007 at 12:12:50AM +0200, Michael Biebl wrote:
> * Package name : rsyslog
> Version : 1.18.0
> Upstream Author : Rainer Gerhards <rger...@adiscon.com>
> * URL : http://www.rsyslog.com
> * License : GPL v2 or later
> Programming Lang: C
> Description : enhanced multi-threaded syslogd
>
> Rsyslog is an enhanced multi-threaded syslogd supporting, amongst
> others:

Why is rsyslog being multi-threaded interesting to our users?
Isn't that an internal implementation decision?

Hamish
--
Hamish Moffatt VK3SB <ham...@debian.org> <ham...@cloud.net.au>



Roberto C. Sánchez

Aug 4, 2007, 1:50:09 AM
On Sat, Aug 04, 2007 at 02:18:29PM +1000, Hamish Moffatt wrote:
> On Sat, Aug 04, 2007 at 12:12:50AM +0200, Michael Biebl wrote:
> > * Package name : rsyslog
> > Version : 1.18.0
> > Upstream Author : Rainer Gerhards <rger...@adiscon.com>
> > * URL : http://www.rsyslog.com
> > * License : GPL v2 or later
> > Programming Lang: C
> > Description : enhanced multi-threaded syslogd
> >
> > Rsyslog is an enhanced multi-threaded syslogd supporting, amongst
> > others:
>
> Why is rsyslog being multi-threaded interesting to our users?
> Isn't that an internal implementation decision?
>
As the "target" user for this sort of package is a sysadmin type, I
would say it is an important enough detail that it should be in the
short description.

Regards,

-Roberto

--
Roberto C. Sánchez
http://people.connexer.com/~roberto
http://www.connexer.com


Bastian Blank

Aug 4, 2007, 6:30:10 AM
On Sat, Aug 04, 2007 at 01:44:14AM -0400, Roberto C. Sánchez wrote:
> As the "target" user for this sort of package is a sysadmin type, I
> would say it is an important enough detail that it should be in the
> short description.

But only in the relation: multi-threaded == bad. You need much more
knowledge to handle concurrency correctly.

Bastian

--
Worlds are conquered, galaxies destroyed -- but a woman is always a woman.
-- Kirk, "The Conscience of the King", stardate 2818.9

Hamish Moffatt

Aug 4, 2007, 10:30:10 AM
On Sat, Aug 04, 2007 at 12:24:58PM +0200, Bastian Blank wrote:
> On Sat, Aug 04, 2007 at 01:44:14AM -0400, Roberto C. Sánchez wrote:
> > As the "target" user for this sort of package is a sysadmin type, I
> > would say it is an important enough detail that it should be in the
> > short description.
>
> But only in the relation: multi-threaded == bad. You need much more
> knowledge to handle concurrency correctly.

Yes that's my reaction also.

System admins might regard multi-threaded as the key to high
performance. As programmers we consider it the key to increased
complexity and therefore more bugs.

Multi-threaded does allow you to use more than one CPU, but in the case
of syslogd ultimately the log file has to end up on disk anyway.

Hamish
--
Hamish Moffatt VK3SB <ham...@debian.org> <ham...@cloud.net.au>

Andreas Barth

Aug 4, 2007, 10:30:13 AM
* Hamish Moffatt (ham...@debian.org) [070804 16:22]:

> On Sat, Aug 04, 2007 at 12:24:58PM +0200, Bastian Blank wrote:
> > But only in the relation: multi-threaded == bad. You need much more
> > knowledge to handle concurrency correctly.
>
> Yes that's my reaction also.
>
> System admins might regard multi-threaded as the key to high
> performance. As programmers we consider it the key to increased
> complexity and therefore more bugs.

I'm quite optimistic we will soon know about the main bugs of rsyslog,
because Fedora is now using it as their main syslog server - so we can
happily package it now, watch what happens on Fedora and use it. (And,
BTW, I was close to packaging rsyslog myself, but I decided I'm already
involved in far too many tasks, so didn't do it.)


Cheers,
Andi
--
http://home.arcor.de/andreas-barth/

Florian Weimer

Aug 4, 2007, 2:10:05 PM
* Bastian Blank:

> On Sat, Aug 04, 2007 at 01:44:14AM -0400, Roberto C. Sánchez wrote:
>> As the "target" user for this sort of package is a sysadmin type, I
>> would say it is an important enough detail that it should be in the
>> short description.
>
> But only in the relation: multi-threaded == bad. You need much more
> knowledge to handle concurrency correctly.

Your comment is a bit odd because Debian's syslogd (which is not
multi-threaded) did not handle concurrency correctly, resulting in a
hanging system.

Steinar H. Gunderson

Aug 4, 2007, 4:40:09 PM
On Sun, Aug 05, 2007 at 12:21:46AM +1000, Hamish Moffatt wrote:
> System admins might regard multi-threaded as the key to high
> performance.

Your system admins sound rather odd. Lots of software is high performance
without ever using threads at all.

/* Steinar */
--
Homepage: http://www.sesse.net/

Steinar H. Gunderson

Aug 5, 2007, 11:50:06 AM
On Sun, Aug 05, 2007 at 05:39:11PM +0200, SZALAY Attila wrote:
> Yes, but you cannot exploit the power of more than one CPU without
> multithreading.

This is wrong. Note that "multithreading" is a different concept from
spawning many processes (ie. the traditional UNIX fork() model).
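
A minimal Python sketch of the fork() model mentioned here, assuming a POSIX system where os.fork() is available. The function name prefork_sum and the summing workload are hypothetical stand-ins; the point is only that several CPUs can be kept busy without a single thread being created.

```python
import os


def prefork_sum(chunks):
    """Sum each chunk in its own forked child; no threads involved."""
    children = []
    for chunk in chunks:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: separate address space, schedulable on any CPU.
            os.close(read_fd)
            os.write(write_fd, str(sum(chunk)).encode())
            os._exit(0)
        # Parent: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            total += int(pipe.read())  # read until the child closes its end
        os.waitpid(pid, 0)             # reap the child
    return total
```

Each child reports back through a pipe because nothing it computes is visible to the parent otherwise.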

SZALAY Attila

Aug 5, 2007, 12:00:14 PM
Hi All!

On Sat, 2007-08-04 at 22:39 +0200, Steinar H. Gunderson wrote:
>
> Your system admins sound rather odd. Lots of software is high performance
> without ever using threads at all.

Yes, but you cannot exploit the power of more than one CPU without
multithreading. Of course, whether this power is needed to handle
system logging is another question.

SZALAY Attila

Aug 5, 2007, 12:20:12 PM
On Sun, 2007-08-05 at 17:43 +0200, Steinar H. Gunderson wrote:
>
> This is wrong. Note that "multithreading" is a different concept from
> spawning many processes (ie. the traditional UNIX fork() model).

You are right, but (I think) it's no harder to write a program which is
multithreaded than one which is multiprocess. (And I think that there is no
multiprocess logging program either.)

Steinar H. Gunderson

Aug 5, 2007, 2:00:23 PM
On Sun, Aug 05, 2007 at 06:02:11PM +0200, SZALAY Attila wrote:
>> This is wrong. Note that "multithreading" is a different concept from
>> spawning many processes (ie. the traditional UNIX fork() model).
> You are right, but (I think) it's no harder to write a program which is
> multithreaded than one which is multiprocess.

This is also wrong. All threads in a program share address space, which means
that all variables are shared by default, which means that every single
non-local variable access has the potential of a race condition. Multiprocess
is the complete opposite -- the address spaces are separated unless you
explicitly use shared memory. (You'll still have to lock files and such, but
that's comparatively easy.)
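
The difference can be sketched in a few lines of Python, assuming a POSIX system with os.fork(). The names counter, bump, run_in_thread and run_in_child_process are made up for illustration: a thread's write to a global is visible to the rest of the program, while a forked child's write is not.

```python
import os
import threading

counter = {"value": 0}  # a non-local variable, shared by all threads


def bump():
    counter["value"] += 1


def run_in_thread():
    """Run bump() in a second thread; return the counter as the caller sees it."""
    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    return counter["value"]


def run_in_child_process():
    """Run bump() in a forked child; return the change the parent observes."""
    before = counter["value"]
    pid = os.fork()
    if pid == 0:
        bump()        # modifies only the child's copy of the address space
        os._exit(0)
    os.waitpid(pid, 0)
    return counter["value"] - before
```

With two threads calling bump() at once, the read-modify-write on the shared counter is exactly the kind of access that needs a lock; the forked child needs none because nothing is shared.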

/* Steinar */
--
Homepage: http://www.sesse.net/

Ron Johnson

Aug 5, 2007, 2:10:10 PM
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/05/07 10:39, SZALAY Attila wrote:
> Hi All!
>
> On Sat, 2007-08-04 at 22:39 +0200, Steinar H. Gunderson wrote:
>> Your system admins sound rather odd. Lots of software is high performance
>> without ever using threads at all.
>
> Yes, but you cannot exploit the power of more than one CPU without
> multithreading. Of course, whether this power is needed to handle
> system logging is another question.

Are you saying that Apache 1.x only ever used 1 CPU?

- --
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGthIfS9HxQb37XmcRAimtAJ9q/icb0kOqfSnLNAdMioaETLftUACfc/n+
pcYPKVame/87ZgGY6SVbQuk=
=uO1/
-----END PGP SIGNATURE-----

SZALAY Attila

Aug 5, 2007, 4:50:14 PM
Hi All!

On Sun, 2007-08-05 at 19:55 +0200, Steinar H. Gunderson wrote:
>
> This is also wrong. All threads in a program share address space, which means
> that all variables are shared by default, which means that every single
> non-local variable access has the potential of a race condition. Multiprocess
> is the complete opposite -- the address spaces are separated unless you
> explicitly use shared memory. (You'll still have to lock files and such, but
> that's comparatively easy.)

I hold the direct opposite opinion. Yes, there are some rules you
have to keep, but otherwise it's easy to program. (I know, I'm a
developer :)

First, the stack differs in every thread, so it's safe to use. And all
local variables are allocated on the stack.

And yes, I know that the chosen method depends on the function. But I
think that this function is better implemented with multithreading than
multiprocessing, because:

1. Reading a log message and writing it to a file is highly parallelizable.

2. There may be a lot of parallel reading. (the config)

3. We have to balance the load per message, not per connection, because
sometimes a small number of programs (maybe one) generates a big part of
the messages.

4. IIRC one of the most important reasons to choose multiprocess
programming is not that multiprocessing is easier to implement than
multithreading, but memory leaks.

So if I had to choose, I would surely choose multithreading.

And I think that the real question is whether there is a place in Debian for
a multithread/process system logging daemon (as opposed to the single-threaded
ones) or not. And I think that this dispute is only theoretical. :)

SZALAY Attila

Aug 5, 2007, 5:00:14 PM
Hi All!

On Sun, 2007-08-05 at 13:08 -0500, Ron Johnson wrote:
>
> Are you saying that Apache 1.x only ever used 1 CPU?

I am arguing for multithread/process applications as against single-threaded ones.

I don't care which parallelization is used; I just want to say that
it's not too bad to write a multiXXX system logging daemon.

Pierre Habouzit

Aug 5, 2007, 5:10:11 PM
On Sun, Aug 05, 2007 at 10:34:03PM +0200, SZALAY Attila wrote:
> Hi All!
>
> On Sun, 2007-08-05 at 13:08 -0500, Ron Johnson wrote:
> >
> > Are you saying that Apache 1.x only ever used 1 CPU?
>
> I am arguing for multithread/process applications as against single-threaded ones.
>
> I don't care which parallelization is used; I just want to say that
> it's not too bad to write a multiXXX system logging daemon.

Why? Do you need 4 CPUs to soak your hard drive? There is usually one
partition for all the logs on the machine, so you don't gain a lot by writing
many log files at a time. And if you're that concerned with performance,
then it's not multi-foo that you need, but aio's, and a big nice epoll
loop to dispatch /dev/log clients. And absolutely no multiple processes,
or else you will have to deal with locking, especially locking of
output files, which would be a huge mistake.

The syslog daemon shall not eat any more than 0.01% of your CPU. Why
would you need to bloat it, for god's sake? It reminds me of so-called
network monitors that are so huge that they mostly measure their own
fat. A multi-foo syslog daemon is just plain silly.
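
A rough sketch of such a single-threaded dispatch loop, written in Python for brevity. It uses the standard selectors module, which is backed by epoll on Linux; the function name collect and the use of AF_UNIX datagram sockets as a stand-in for /dev/log clients are assumptions made for the example, not a description of any existing syslog daemon.

```python
import selectors
import socket


def collect(sockets, expected, timeout=1.0):
    """Single-threaded loop: wait on all sockets, read whichever is ready."""
    selector = selectors.DefaultSelector()  # epoll-backed on Linux
    for sock in sockets:
        sock.setblocking(False)
        selector.register(sock, selectors.EVENT_READ)

    received = []
    while len(received) < expected:
        events = selector.select(timeout)
        if not events:
            break  # nothing arrived in time; give up rather than spin
        for key, _mask in events:
            # One datagram per recv() on a SOCK_DGRAM socket.
            received.append(key.fileobj.recv(8192))
    selector.close()
    return received
```

A quick way to try it is socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM): send a couple of datagrams on one end and pass the other end to collect(). No locking is needed anywhere because only one thread ever touches the data.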

--
·O· Pierre Habouzit
··O madc...@debian.org
OOO http://www.madism.org

SZALAY Attila

Aug 5, 2007, 6:10:08 PM
Hi All!

On Sun, 2007-08-05 at 23:07 +0200, Pierre Habouzit wrote:
>
> Why? Do you need 4 CPUs to soak your hard drive? There is usually one
> partition for all the logs on the machine, so you don't gain a lot by writing
> many log files at a time. And if you're that concerned with performance,
> then it's not multi-foo that you need, but aio's, and a big nice epoll
> loop to dispatch /dev/log clients. And absolutely no multiple processes,
> or else you will have to deal with locking, especially locking of
> output files, which would be a huge mistake.

You are wrong because (and this is what I have said) aio and epoll
couldn't gain anything from the second processor. You can prove this if
you try. (I have tried.)

> The syslog daemon shall not eat any more than 0.01% of your CPU. Why
> would you need to bloat it, for god's sake? It reminds me of so-called
> network monitors that are so huge that they mostly measure their own
> fat. A multi-foo syslog daemon is just plain silly.

Yes, you are right for a desktop. You are right for a server too. But
collecting log messages from a few hundred machines is another
question. And it's easier to put another CPU into the machine than to
double the clock rate.

Hamish Moffatt

Aug 5, 2007, 6:20:06 PM
On Sun, Aug 05, 2007 at 10:25:34PM +0200, SZALAY Attila wrote:
> And yes, I know that the chosen method depends on the function. But I
> think that this function is better implemented with multithreading than
> multiprocessing, because:
>
> 1. Reading a log message and writing it to a file is highly parallelizable.
>
> 2. There may be a lot of parallel reading. (the config)
>
> 3. We have to balance the load per message, not per connection, because
> sometimes a small number of programs (maybe one) generates a big part of
> the messages.

Since messages arrive on a single socket (usually connection-less)
ultimately the messages enter through one process/thread. And they get
written to a file or database which is ultimately not parallelizable
either. Is there a huge amount of processing in between which justifies
multithreading?

Also does rsyslog guarantee that messages are logged in the order they
are sent? If messages may take different paths due to multi-threading I
guess this would need extra care.

> And I think that the real question is whether there is a place in Debian for
> a multithread/process system logging daemon (as opposed to the single-threaded
> ones) or not. And I think that this dispute is only theoretical. :)

My original question was why you would mention multi-threaded in the
short description of rsyslog?

Hamish
--
Hamish Moffatt VK3SB <ham...@debian.org> <ham...@cloud.net.au>

Pierre Habouzit

Aug 5, 2007, 6:20:08 PM
On Sun, Aug 05, 2007 at 11:49:12PM +0200, SZALAY Attila wrote:
> Hi All!

please don't CC me, I read the list, and my M-F-T specifically asks you
not to do so[0].

> On Sun, 2007-08-05 at 23:07 +0200, Pierre Habouzit wrote:
> > Why? Do you need 4 CPUs to soak your hard drive? There is usually one
> > partition for all the logs on the machine, so you don't gain a lot by writing
> > many log files at a time. And if you're that concerned with performance,
> > then it's not multi-foo that you need, but aio's, and a big nice epoll
> > loop to dispatch /dev/log clients. And absolutely no multiple processes,
> > or else you will have to deal with locking, especially locking of
> > output files, which would be a huge mistake.
>
> You are wrong because (and this is what I have said) aio and epoll
> couldn't gain anything from the second processor.

You didn't answer: why is there any kind of gain in using another CPU
where 1/100 of one is enough?

> You can prove this if you try. (I have tried.)

I code daemons using epoll and such techniques (without threads or
even multiple processes of course) for a living.

> > The syslog daemon shall not eat any more than 0.01% of your CPU. Why
> > would you need to bloat it, for god's sake? It reminds me of so-called
> > network monitors that are so huge that they mostly measure their own
> > fat. A multi-foo syslog daemon is just plain silly.
>
> Yes, you are right for a desktop. You are right for a server too. But
> collecting log messages from a few hundred machines is another
> question. And it's easier to put another CPU into the machine than to
> double the clock rate.

Those are very blunt assertions, backed up with nothing. The bottleneck
in any application that has big flows of data to write somewhere is the
hard drive (or you're a very crappy programmer). So please explain to me
how using more CPU will make you able to write faster to your hard
drive. I'm sure that would be enlightening.


Hint: For a hundred machines, even very busy ones, you don't need
multiple processes to collect the logs. And for a thousand, you'll
merely need a setrlimit to overcome the 1024-file-descriptor limit.
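
For illustration, the setrlimit step might look like this in Python, using the standard resource module (a thin wrapper around getrlimit/setrlimit). The function name raise_nofile_limit and the default of 4096 are arbitrary choices for the sketch; an unprivileged process can only raise its soft limit as far as the hard limit.

```python
import resource


def raise_nofile_limit(wanted=4096):
    """Raise the soft open-file limit towards `wanted`; never lower it."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)  # cannot exceed the hard limit
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft                 # already high enough
    resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return wanted
```

The function returns the soft limit in effect afterwards, so the caller can check whether the requested value was actually reached.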

[0] Thanks, Mister Ray, for not starting a new MFT flame.

Pierre Habouzit

Aug 5, 2007, 6:30:14 PM
On Mon, Aug 06, 2007 at 08:15:58AM +1000, Hamish Moffatt wrote:
> On Sun, Aug 05, 2007 at 10:25:34PM +0200, SZALAY Attila wrote:
> > And I think that the real question is whether there is a place in Debian for
> > a multithread/process system logging daemon (as opposed to the single-threaded
> > ones) or not. And I think that this dispute is only theoretical. :)
>
> My original question was why you would mention multi-threaded in the
> short description of rsyslog?

To scare people away. But maybe the ability to log into mysql was
enough. No kidding, that's a great feature of the Description, it says
to every clever sysadmin: don't use me.

Faidon Liambotis

Aug 5, 2007, 7:10:08 PM
Pierre Habouzit wrote:
> To scare people away. But maybe the ability to log into mysql was
> enough. No kidding, that's a great feature of the Description, it says
> to every clever sysadmin: don't use me.
Well, I don't know if you or I or Debian will use it, but Fedora is
going to switch to it as the default syslog server for F8.

It may be a mistake on their part, of course, but it certainly deserves
at least another look because of that.

Regards,
Faidon

Stig Sandbeck Mathisen

Aug 6, 2007, 4:40:08 AM
Pierre Habouzit <madc...@debian.org> writes:

> The syslog daemon shall not eat any more than 0.01% of your CPU.

That's just silly. :P

For a cluster of syslog servers, the syslog daemon shall use whatever
CPU time it needs. If it needs more than one CPU, and more than one
CPU is available, then it's a good idea for the syslog daemon to use
more than one CPU.

You have multiple ways for logs to enter:

514/udp - the good old standard.

<whatever>/tcp - tcp syslog, queued on the client side, ensured on
the server side, possibly encrypted if data passes external
networks.

local sockets, doors, etc...

Logs may be filtered and classified according to priority, network,
server group, application, or facility.

You have several places where the log data will go:

Disk

Database

Some analysis application

Custom statistics software with realtime graphs.

IDS (Big, horrible, expensive, java-thingy. Prints Pretty Pictures)

Local antispam-daemons.

> Why would you need to bloat it, for god's sake? It reminds me of so-called
> network monitors that are so huge that they mostly measure
> their own fat. A multi-foo syslog daemon is just plain silly.

Not if you run a large network, cluster, server group or if you're an
internet service provider. If you get tens or hundreds of gigabytes
of logs every day, you need a good framework. A mail service for just
1M users alone logs 1GB every few hours. Some of that is interesting,
and everything must be kept for a while.

For your own laptop? Naah, you can keep sysklogd, as it's probably
good enough for your needs.

Remember that Debian is used by more than just you, so calling the
needs of others "silly" may be perceived as short-sighted.

--
Stig Sandbeck Mathisen

Pierre Habouzit

Aug 6, 2007, 5:10:09 AM
On Mon, Aug 06, 2007 at 10:39:36AM +0200, Stig Sandbeck Mathisen wrote:
> You have multiple ways for logs to enter:
>
> 514/udp - the good old standard.
>
> <whatever>/tcp - tcp syslog, queued on the client side, ensured on
> the server side, possibly encrypted if data passes external
> networks.
>
> local sockets, doors, etc...

aaaaand? How does that take more than a fraction of CPU time?

> Logs may be filtered and classified according to priority, network,
> server group, application, or facility.
>
> You have several places where the log data will go:
>
> Disk
>
> Database
>
> Some analysis application
>
> Custom statistics software with realtime graphs.
>
> IDS (Big, horrible, expensive, java-thingy. Prints Pretty Pictures)
>
> Local antispam-daemons.

That takes CPU time, but it isn't charged to the syslog daemon.

> > Why would you need to bloat it, for god's sake? It reminds me of so-called
> > network monitors that are so huge that they mostly measure
> > their own fat. A multi-foo syslog daemon is just plain silly.
>
> Not if you run a large network, cluster, server group or if you're an
> internet service provider. If you get tens or hundreds of gigabytes
> of logs every day, you need a good framework. A mail service for just
> 1M users alone logs 1GB every few hours. Some of that is interesting,
> and everything must be kept for a while.

I totally fail to see why that would need any kind of CPU power to
deal with. 10 GB/hour is merely 3 MB/s, so even with 10 MB/s peaks you're
not even limited by your hard drive, and you don't even use a full
100 Mbit/s link to get your data. If you write applications that need
more than one CPU to deal with such an enoooormous volume of data, then
well, that's very interesting.
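
The arithmetic behind those figures can be checked quickly. The sketch below reads the volume as 10 gigabytes per hour and assumes decimal units (1 GB = 1000 MB); the helper names are made up for the example.

```python
def mb_per_second(gigabytes_per_hour):
    """Convert a log volume in GB/hour to MB/s (decimal units assumed)."""
    return gigabytes_per_hour * 1000 / 3600


def mbit_per_second(gigabytes_per_hour):
    """Same volume expressed as network bandwidth in Mbit/s."""
    return mb_per_second(gigabytes_per_hour) * 8
```

For 10 GB/hour this gives roughly 2.8 MB/s, or about 22 Mbit/s, which is where the "merely 3" and "not even a full 100 Mbit link" estimates come from.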

Josselin Mouette

Aug 6, 2007, 5:20:11 AM
On Sunday, 05 August 2007 at 23:49 +0200, SZALAY Attila wrote:
> Yes, you are right for a desktop. You are right for a server too. But
> collecting log messages from a few hundred machines is another
> question. And it's easier to put another CPU into the machine than to
> double the clock rate.

Some simple experiments we have done show a single CPU is more than
enough to handle 40 Mbit/s of logs (using syslog-ng, of course). There
are some systems where you may need more than one CPU, but you'll find
they are pretty rare.

--
.''`.
: :' : We are debian.org. Lower your prices, surrender your code.
`. `' We will add your hardware and software distinctiveness to
`- our own. Resistance is futile.

SZALAY Attila

Aug 6, 2007, 4:40:07 PM
Hi All!

<disclaimer>
I have no connection with rsyslog. I don't know anything about this
program.
</disclaimer>

On Mon, 2007-08-06 at 08:15 +1000, Hamish Moffatt wrote:
>
> Since messages arrive on a single socket (usually connection-less)
> ultimately the messages enter through one process/thread. And they get
> written to a file or database which is ultimately not parallelizable
> either. Is there a huge amount of processing in between which justifies
> multithreading?

First, if you poll an fd from more than one thread, you can balance
the load. (While one thread handles a message, another thread can read. Just
like in spamassassin :)

Second, there may be more than one destination, or you can connect to
a database with more than one connection. So that can be parallelized too.

Third, there _may_ be some processing between log reading and writing.

I couldn't answer the questions below this point, so I have deleted them.

SZALAY Attila

Aug 6, 2007, 6:20:06 PM
Hi All!

On Mon, 2007-08-06 at 00:19 +0200, Pierre Habouzit wrote:
>
> please don't CC me, I read the list, and my M-F-T specifically asks you
> not to do so[0].

Ok.

> You didn't answer: why is there any kind of gain in using another CPU
> where 1/100 of one is enough?

Maybe that is not enough. Maybe you want to do something with the log
message, not just forward it.

> I code daemons using epoll and such techniques (without threads or
> even multiple processes of course) for a living.

I code daemons with epoll, and I write programs with a lot of threading too.

> Those are very blunt assertions, backed up with nothing. The bottleneck
> in any application that has big flows of data to write somewhere is the
> hard drive (or you're a very crappy programmer). So please explain to me
> how using more CPU will make you able to write faster to your hard
> drive. I'm sure that would be enlightening.

Maybe there is someone out in the world who wants to transform log
messages, not just put them into a file. Maybe even rewrite them with regular
expression matching. And somebody may want this feature from the log
daemon too.

No, I'm not suggesting that the syslog daemon is the only place to
do this, and I'm not saying that everyone needs this feature. But this
feature _may_ be handy for someone.

Hamish Moffatt

Aug 6, 2007, 6:30:18 PM
On Sat, Aug 04, 2007 at 02:18:29PM +1000, Hamish Moffatt wrote:
> On Sat, Aug 04, 2007 at 12:12:50AM +0200, Michael Biebl wrote:
> > * Package name : rsyslog
> > Version : 1.18.0
> > Upstream Author : Rainer Gerhards <rger...@adiscon.com>
> > * URL : http://www.rsyslog.com
> > * License : GPL v2 or later
> > Programming Lang: C
> > Description : enhanced multi-threaded syslogd
> >
> > Rsyslog is an enhanced multi-threaded syslogd supporting, amongst
> > others:
>
> Why is rsyslog being multi-threaded interesting to our users?
> Isn't that an internal implementation decision?

Rainer has blogged about this at:
http://rgerhards.blogspot.com/2007/08/why-is-rsyslog-multi-threaded-and-is-it.html

If I may summarise, rsyslog is currently only actually using two
threads, one to collect messages from input sources and one to write
them out. This is intended to prevent slow output (eg MySQL) potentially
causing messages to be lost.

This seems quite reasonable.
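
The two-thread arrangement summarised above can be sketched as follows. This is a Python illustration of the general reader/writer pattern, not rsyslog's actual code (which is C); the name run_pipeline and the in-memory queue are assumptions for the example.

```python
import queue
import threading


def run_pipeline(messages, write):
    """One thread collects input, another writes output; a queue sits between."""
    pending = queue.Queue()
    done = object()  # sentinel telling the writer that input has ended

    def reader():
        for message in messages:
            pending.put(message)  # never blocked by a slow output
        pending.put(done)

    def writer():
        while True:
            message = pending.get()
            if message is done:
                break
            write(message)        # may be slow, e.g. a database insert

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Because there is exactly one reader and one writer around a FIFO queue, messages leave in the order they arrived; the queue only absorbs the difference in speed between input and output.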

Hamish Moffatt

Aug 6, 2007, 6:30:19 PM
On Mon, Aug 06, 2007 at 10:22:22PM +0200, SZALAY Attila wrote:
> First, if you poll an fd from more than one thread, you can balance
> the load. (While one thread handles a message, another thread can read. Just
> like in spamassassin :)

There's a risk of reordering the messages if subsequent messages may
take different routes within the application (ie a different receiving
thread); even subsequent messages from the same application, if it is
not using TCP.

I wonder if syslog is expected to preserve message order. It could be
quite confusing...

> Second, there may be more than one destination, or you can connect to
> a database with more than one connection. So that can be parallelized too.

Same potential for reordering here.


Hamish
--
Hamish Moffatt VK3SB <ham...@debian.org> <ham...@cloud.net.au>

Florian Weimer

Aug 7, 2007, 4:30:16 PM
* Hamish Moffatt:

> Also does rsyslog guarantee that messages are logged in the order they
> are sent?

The kernel does not guarantee that SOCK_DGRAM sockets preserve order,
even if the packets are sent from a single process/host.

Florian Weimer

Aug 7, 2007, 5:00:20 PM
* Pierre Habouzit:

> On Mon, Aug 06, 2007 at 08:15:58AM +1000, Hamish Moffatt wrote:
>> On Sun, Aug 05, 2007 at 10:25:34PM +0200, SZALAY Attila wrote:
>> > And I think that the real question is whether there is a place in Debian for
>> > a multithread/process system logging daemon (as opposed to the single-threaded
>> > ones) or not. And I think that this dispute is only theoretical. :)
>>
>> My original question was why you would mention multi-threaded in the
>> short description of rsyslog?
>
> To scare people away.

Reality check, please. Many non-multithreaded programs suffer from
concurrency issues as well. In fact, it's easier to handle signals
such as SIGHUP from a multi-threaded program because signal handling
is inherently concurrent.
