
multiple threads select on same socket


Meir Lamed

Dec 20, 2001, 12:19:58 PM
What will happen when multiple threads select on the same socket?
Will all of them exit when socket receives data?
If they all try to read from socket afterwards, will all but one be blocked?


Nithyanandham

Dec 20, 2001, 1:03:02 PM

Meir Lamed wrote:

> What will happen when multiple threads select on the same socket?

"Multiple threads selecting on the same socket" doesn't seem to be a good design
unless there is a specific, unavoidable reason for it.

> Will all of them exit when socket receives data?

"Exit": do you mean stop waiting in select()?
Actually, it is undefined/random which thread will wake up and return from
select().

> If they all try to read from socket afterwards, will all but one be blocked?

Yes. But, which one comes out of blocking from select() is unpredictable.

I suggest that you
lock a mutex
select()
When the socket descriptor is ready, unlock the mutex.

in all the threads calling select() on the socket.


--

Nithyanand.
Siemens, Bangalore, India.
(Opinions expressed are my own and do not reflect the opinions of my employer,
Siemens)

Barry Margolin

Dec 20, 2001, 1:56:41 PM

I'm pretty sure we've had threads on this topic before, so I suggest you do
a Google search. I think the general answer is that it depends on the OS;
some will wake up all the threads, others will pick one. That's why it's
usually a good idea to put a socket in non-blocking mode even if you're
using select().
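
A minimal sketch of this advice, assuming a POSIX system. The helper
names set_nonblocking and read_when_ready are hypothetical and do not come
from any post in this thread. The descriptor is switched to non-blocking
mode, so a read() that loses the race to another thread fails with EAGAIN
instead of blocking, and the caller simply waits in select() again.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: wait until fd is readable, then try to read.
 * Returns the byte count (0 means end of file) or -1 with errno set.
 * If somebody else took the data first, read() fails with EAGAIN or
 * EWOULDBLOCK and we go back to waiting in select().
 */
static ssize_t read_when_ready(int fd, void *buf, size_t len)
{
    for (;;) {
        fd_set rfds;
        ssize_t n;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        if (select(fd + 1, &rfds, NULL, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        n = read(fd, buf, len);
        if (n >= 0)
            return n;
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;
        /* Readiness was stale or we were interrupted: wait again. */
    }
}
```

Call set_nonblocking(fd) once after accept() or pipe(), then use
read_when_ready(fd, buf, len) wherever a plain read() would otherwise sit
behind select().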

--
Barry Margolin, bar...@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

David Schwartz

Dec 20, 2001, 3:35:02 PM
Nithyanandham wrote:

> I suggest you to
> lock a mutex
> select()
> When the socket descriptor is ready,unlock the mutex.
>
> in all the threads select() ing on the socket.

No! You should never ever make a system call that could block for a
long period of time while holding a mutex if there is any other
possibility. In this case there are many other things you could do.

DS

David Schwartz

Dec 20, 2001, 3:34:01 PM
Meir Lamed wrote:

> What will happen when multiple threads select on the same socket?
> Will all of them exit when socket receives data?

It's not guaranteed, but it's what usually happens.

> If they all try to read from socket afterwards, will all but one be blocked?

None will block because you wouldn't do something as silly as calling
'select' on a blocking socket. Choose either blocking I/O or
non-blocking I/O but don't try to combine them.

DS

Heiner Steven

Dec 20, 2001, 3:46:23 PM
Meir Lamed wrote:

I cannot answer your question for select(), but for accept()
(Solaris), exactly one of the threads blocking will be woken
up and have accepted the connection.

You probably know what you are doing, but select() and threads
are usually not used together. select() was invented in the
pre-thread days, when there was no other elegant way for
a (single) process to monitor multiple file descriptors.

With threads, you can monitor multiple file descriptors
very elegantly using threads.

Heiner
--
___ _
/ __| |_ _____ _____ _ _ Heiner STEVEN <heiner...@nexgo.de>
\__ \ _/ -_) V / -_) ' \ Shell Script Programmers: visit
|___/\__\___|\_/\___|_||_| http://www.shelldorado.com/

David Schwartz

Dec 20, 2001, 4:08:59 PM
Heiner Steven wrote:

> You probably know what you are doing, but select() and threads
> are usually not used together. select() was invented in the
> pre-thread days, when there was no other elegant way for
> a (single) process to monitor multiple file descriptors.

Huh?!



> With threads, you can monitor multiple file descriptors
> very elegantly using threads.

Surely you aren't suggesting that an application dealing with 10,000
TCP connections create 20,000 threads, one to block in 'read' and one to
block in 'write' for each one. That would be madness.

DS

Heiner Steven

Dec 20, 2001, 6:25:48 PM
David Schwartz wrote:

Well, that's one extreme. The other is, to have one thread dealing
with 10,000 connections using select(), which is not very attractive,
either.

To return to Meir's original question:

> What will happen when multiple threads select on the same socket?
> Will all of them exit when socket receives data?
> If they all try to read from socket afterwards, will all but one be blocked?

Stevens' book "UNIX Network Programming, Networking APIs: Sockets and XTI"
has some points to make.

Chapter 27.11 describes different approaches to server writing: iterative;
concurrent with one process per client; different versions of preforked
and prethreaded servers.

o chapter 27.6 explains "select collisions" for multiple
processes (not threads), causing a performance degradation:

"[...] when multiple processes are blocking on the same descriptor,
it is better to block in a function such as accept [...]."

This says nothing about threads, though.

The multi-threaded versions center around calls to accept():

o One server, accept(), one thread per connection (27.10)

o Multiple server threads, each calling accept(). A mutex makes
sure only one thread can call accept() at one time.
Stevens notes:

"On a Berkeley-derived kernel [...] we do not need
any locking around the call to accept [...]. Doing so, however,
increases the process control CPU time from 3.5 seconds [...]
to 3.9 seconds."

He attributes this to the "thundering herd" of threads woken
up in the kernel.

o Main thread calling accept(); multiple, prethreaded clients
processing requests

Well, this still does not answer the original question.
Therefore, I wrote an example program (mthreads.c), and
tried it:

Start the server in one Xterm window:

$ mthreads
listening on port 9090

and then, from another window:

$ telnet localhost 9090
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
X<return>

The server output looks like this:

listening on port 9090
thread 1026
thread 2051
thread 3076
1026: 1026: select returned, reading char
1026: read character 'X' (88)
1026: returning
2051: 2051: select returned, reading char
' (13)read character '
2051: returning
3076: 3076: select returned, reading char
3076: read character '
' (10)
3076: returning

Result: with Linux/Pthreads, all threads are woken up at once.

The source code of the test program follows. Compile it with
Linux/gcc:

$ gcc -o mthreads mthreads.c -lpthread

------------------------------------------------------------------------
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

#define LISTENPORT 9090
#define NTHREADS 3

static void *select_input (void *arg);

int main (int argc, char *argv [])
{
    int sfd, cfd;                   /* listen/client file descriptor */
    struct sockaddr_in serv, cli;
    socklen_t clilen;
    pthread_t threads [NTHREADS];
    int i, err;

    (void)argc; (void)argv;

    memset (&serv, 0, sizeof serv);
    serv.sin_family = AF_INET;
    serv.sin_addr.s_addr = htonl (INADDR_ANY);
    serv.sin_port = htons (LISTENPORT);

    clilen = sizeof cli;
    memset (&cli, 0, sizeof cli);

    if ( (sfd = socket (AF_INET, SOCK_STREAM, 0)) < 0 ) {
        perror ("socket"); return 1;
    }

#ifdef SO_REUSEADDR
    i = 1;
    (void)setsockopt (sfd, SOL_SOCKET, SO_REUSEADDR, (const char *)&i,
                      sizeof i);
#endif

    fprintf (stderr, "listening on port %d\n", LISTENPORT);

    if ( bind (sfd, (struct sockaddr *)&serv, sizeof serv) < 0
         || listen (sfd, 4) < 0
         || (cfd = accept (sfd, (struct sockaddr *)&cli, &clilen)) < 0 ) {
        perror ("bind/listen/accept"); return 1;
    }

    for ( i=0; i<NTHREADS; ++i ) {  /* create handler threads */
        /* pthread_create returns an error number; it does not set errno */
        err = pthread_create (&threads [i], NULL, select_input,
                              (void *)(intptr_t)cfd);
        if ( err != 0 ) {
            fprintf (stderr, "pthread_create: %s\n", strerror (err));
            return 1;
        }
    }

    /* wait for all threads to terminate */
    for ( i=0; i<NTHREADS; ++i ) {
        pthread_join (threads [i], NULL);
    }

    return 0;
}

/* Each thread selects on the same connected socket, then reads one char. */
static void *select_input (void *arg)
{
    int fd = (int)(intptr_t)arg;
    /* pthread_t is opaque; printing it as unsigned long works on Linux */
    unsigned long self = (unsigned long)pthread_self ();
    fd_set readfds;
    char c;
    int n;

    fprintf (stderr, "thread %lu\n", self);
    (void) sleep (1);

    FD_ZERO (&readfds);
    FD_SET (fd, &readfds);

    n = select (fd+1, &readfds, NULL, NULL, NULL);
    fprintf (stderr, "%lu: ", self);

    switch ( n ) {
    case -1: perror ("select"); break;
    case 0: fprintf (stderr, "timeout - cannot happen\n"); break;
    default:
        fprintf (stderr, "%lu: select returned, reading char\n", self);
        if ( read (fd, &c, 1) != 1 ) {
            fprintf (stderr, "%lu: ", self);
            perror ("read");
        } else {
            fprintf (stderr, "%lu: read character '%c' (%d)\n",
                     self, c, c & 0xff);
        }
    }

    fprintf (stderr, "%lu: returning\n", self);
    return NULL;
}
------------------------------------------------------------------------

David Schwartz

Dec 20, 2001, 6:26:41 PM
Heiner Steven wrote:

> David Schwartz wrote:

> > Heiner Steven wrote:
> >
> > > You probably know what you are doing, but select() and threads
> > > are usually not used together. select() was invented in the
> > > pre-thread days, when there was no other elegant way for
> > > a (single) process to monitor multiple file descriptors.
> >
> > Huh?!
> >
> > > With threads, you can monitor multiple file descriptors
> > > very elegantly using threads.
> >
> > Surely you aren't suggestion that an application dealing with 10,000
> > TCP connections create 20,000 threads, one to block in 'read' and one to
> > block in 'write' for each one. That would be madness.

> Well, that's one extreme. The other is, to have one thread dealing
> with 10,000 connections using select(), which is not very attractive,
> either.

That's right. What you want is something in the middle of those two.
The reality is that there's still no elegant portable way for a single
process to monitor multiple file descriptors. However, please don't
spread the false idea that threading means one thread per pending
operation. That's rarely the best approach except for cases where load
is very low.

Consider a server with just 10 connections, and consider only the two
extremes. One extreme is one thread blocked in select listening on all
ten connections. The other extreme is ten threads each blocked on
'recv'. If you have ten threads blocked on receive and a little bit of
data is received on each connection, ten context switches are needed. If
you have one thread blocked on 'select', it can wakeup once and receive
on all ten sockets, requiring only one context switch. Worse, the ten
threads are likely to step all over each other as they contend for
shared resources. The single thread is much less likely to have that
problem.

So your statement that "select and threads generally aren't used
together" is utterly false. Select (or poll) is the only efficient
way to do large amounts of network I/O to large numbers of connections with
reasonable portability. (Although it's usually not as simple as one
thread that selects on every network connection.)
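
The single-thread extreme described above can be sketched as follows. This
is only an illustration under stated assumptions: ten pipes stand in for
ten connections, and the names service_ready and demo_ten_connections are
hypothetical. One call to select() covers all ten descriptors, and the same
thread then reads from every descriptor that was reported ready.

```c
#include <sys/select.h>
#include <unistd.h>

#define NCONN 10

/*
 * Wait once in select() for any of the nfds descriptors, then read one
 * byte from each descriptor that select() reported as readable.
 * Returns the number of descriptors serviced, or -1 on error.
 */
static int service_ready(const int *fds, int nfds)
{
    fd_set rfds;
    int maxfd = -1, serviced = 0, i;

    FD_ZERO(&rfds);
    for (i = 0; i < nfds; i++) {
        FD_SET(fds[i], &rfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
        return -1;

    for (i = 0; i < nfds; i++) {
        char c;

        if (FD_ISSET(fds[i], &rfds) && read(fds[i], &c, 1) == 1)
            serviced++;
    }
    return serviced;
}

/* Demo: a little data arrives on all ten "connections" before we wake. */
static int demo_ten_connections(void)
{
    int rd[NCONN], wr[NCONN], i, n;

    for (i = 0; i < NCONN; i++) {
        int p[2];

        if (pipe(p) < 0)
            return -1;
        rd[i] = p[0];
        wr[i] = p[1];
    }
    for (i = 0; i < NCONN; i++)
        if (write(wr[i], "x", 1) != 1)
            return -1;

    n = service_ready(rd, NCONN);   /* one wakeup services every pipe */

    for (i = 0; i < NCONN; i++) {
        close(rd[i]);
        close(wr[i]);
    }
    return n;
}
```

In this demo all ten descriptors are already readable when select() is
called, so a single wakeup handles all of them.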

> o Multiple server threads, each calling accept(). A mutex make
> sure only one thread can call accept() at one time.
> Stevens notes:
>
> "On a Berkeley-derived kernel [...] we do not nead
> any locking around the call to accept [...]. Doing so, however,
> increases the process control CPU time from 3.5 seconds [...]
> to 3.9 seconds."
>
> He attributes this to the "thundering herd" of threads woken
> up in the kernel.

Right. There's generally no reason to have more than one thread
blocking on the same socket except perhaps in accept. For select, it's
hard to imagine how this could be useful. But the way to fix this is to
keep more than one thread from calling select on the same socket by some
reasonable synchronization mechanism. Protecting the 'select' system call
itself with a mutex is not sensible.
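
One possible arrangement, shown only as a sketch and not as the specific
mechanism meant above: a single thread blocks in select() and reads, then
hands the data to worker threads through a small queue. The mutex protects
only the queue and is never held across select() or read(). All names here
(worker, dispatch, demo_dispatch, the queue variables) are hypothetical, and
a pipe stands in for the socket.

```c
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

#define QSIZE    16
#define NWORKERS 3

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_changed = PTHREAD_COND_INITIALIZER;
static char queue[QSIZE];
static int q_head, q_count, q_done, processed;

/* Worker: take bytes off the queue; the mutex is held only briefly. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        char item;

        pthread_mutex_lock(&q_lock);
        while (q_count == 0 && !q_done)
            pthread_cond_wait(&q_changed, &q_lock);
        if (q_count == 0) {                 /* finished and drained */
            pthread_mutex_unlock(&q_lock);
            return NULL;
        }
        item = queue[q_head];
        q_head = (q_head + 1) % QSIZE;
        q_count--;
        processed++;                        /* stand-in for real work */
        pthread_cond_broadcast(&q_changed); /* a slot became free */
        pthread_mutex_unlock(&q_lock);
        (void)item;
    }
}

/* Dispatcher: the only thread that ever calls select() on fd. */
static void dispatch(int fd)
{
    for (;;) {
        fd_set rfds;
        char c;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        if (select(fd + 1, &rfds, NULL, NULL, NULL) < 0)
            break;                          /* no lock is held here */
        if (read(fd, &c, 1) <= 0)
            break;                          /* end of file or error */

        pthread_mutex_lock(&q_lock);
        while (q_count == QSIZE)
            pthread_cond_wait(&q_changed, &q_lock);
        queue[(q_head + q_count) % QSIZE] = c;
        q_count++;
        pthread_cond_broadcast(&q_changed);
        pthread_mutex_unlock(&q_lock);
    }
    pthread_mutex_lock(&q_lock);
    q_done = 1;                             /* tell the workers to stop */
    pthread_cond_broadcast(&q_changed);
    pthread_mutex_unlock(&q_lock);
}

/* Demo: five bytes arrive, three workers share them. */
static int demo_dispatch(void)
{
    pthread_t th[NWORKERS];
    int p[2], i;

    if (pipe(p) < 0)
        return -1;
    for (i = 0; i < NWORKERS; i++)
        if (pthread_create(&th[i], NULL, worker, NULL) != 0)
            return -1;
    if (write(p[1], "hello", 5) != 5)
        return -1;
    close(p[1]);
    dispatch(p[0]);
    for (i = 0; i < NWORKERS; i++)
        pthread_join(th[i], NULL);
    close(p[0]);
    return processed;
}
```

Which worker handles which byte is up to the scheduler, but each byte is
handled exactly once. Link with -lpthread where the platform requires it.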



> Result: with Linux/Pthreads, all threads are woken up at once.

This is typical behavior. It is, however, theoretically possible that
some of the threads will go back to sleep without returning from
'select' when the 'second half' of the system call can't figure out why
they were awoken (since none of the return conditions are currently
satisfied).

DS

Heiner Steven

Dec 20, 2001, 6:39:40 PM
Heiner Steven wrote:

> Result: with Linux/Pthreads, all threads are woken up at once.

Well, this answer was given a little prematurely...

After testing with more than 3 threads I found out, that
the threads in the example above were woken up, because multiple
characters were available to read.

A test with more threads (10) using a "raw" mode for the client
(to send each character immediately it was entered) woke
up the threads one-by-one.

This is probably exactly the behaviour Meir desired, isn't it?

David Schwartz

Dec 20, 2001, 6:35:45 PM
Heiner Steven wrote:

> After testing with more than 3 threads I found out, that
> the threads in the example above were woken up, because multiple
> characters were available to read.

Hard to imagine that the system would do its wakeups differently based
upon how many characters were available.



> A test with more threads (10) using a "raw" mode for the client
> (to send each character immediately it was entered) woke
> up the threads one-by-one.

That's strange. I'm guessing that internally all the threads woke up
but went back to sleep when they couldn't figure out why they woke up.
Have the waking thread sleep a second before it reads and I bet you'll
find that all the threads wake up. (They'd better, the condition they
are waiting for has been satisfied, and the OS has no idea which thread,
if any, will read the data!)



> This is probably exactly the behaviour Meir desired, isn't it?

It's hard to say. Perhaps there were ten different things he needed to
do when data was received and so he wanted all ten threads to wake up
and now he needs some way to wake them up. ;)

DS

Just Another Victim of the Ambient Morality

Dec 21, 2001, 1:18:15 AM

"David Schwartz" <dav...@webmaster.com> wrote in message
news:3C2273B1...@webmaster.com...

>
> Consider a server with just 10 connections, and consider only the two
> extremes. One extreme is one thread blocked in select listening on all
> ten connections. The other extreme is ten threads each blocked on
> 'recv'. If you have ten threads blocked on receive and a little bit of
> data is received on each connection, ten context switches are needed. If
> you have one thread blocked on 'select', it can wakeup once and receive
> on all ten sockets, requiring only one context switch. Worse, the ten
> threads are likely to step all over each other as they contend for
> shared resources. The single thread is much less likely to have that
> problem.

This is one issue I've always wanted clarified. A thread is much
lighter than a process. In particular, a thread does not have its own
address space and, therefore, does not require a context switch (depending
on your definition of "context", I've seen different people use differing
definitions). So, it's not as bad (although you still have to switch
execution and stack and all) as a context switch, or so is my understanding.
If there's a flaw in this reasoning, please explain it!

Vijay Paul

Dec 21, 2001, 1:28:43 AM
Hi,
I arrived a little late on this thread, so there may be some overlap.

The entire idea behind using select() is to ensure that the program will have
information as to whether an I/O operation on the descriptor will block or not.
Now the only thing to be looked at is whether using threads & select() together
will break this functionality. The answer clearly is NO on a solaris. The
reason being that the man page says that select() is MT-Safe. Which means that
exactly one process will come out of select when data arrives for a particular
descriptor.

On a linux too, I think we can assume that this will be the case, as in
Linux, the thread itself is recognized by the "kernel" as the basic schedulable
entity. So when each thread blocks on a select, internally each thread is
waiting for some I/O indication at the kernel level. When the indication comes
each thread is notified and the one that is scheduled first by the scheduler is
the one that unblocks. The rest of the threads will remain blocked. Hence
select() should be MT-safe even on a linux.

Correct me if I am wrong.

regards,
vijay.

David Schwartz

Dec 21, 2001, 4:17:42 PM
Vijay Paul wrote:

> The entire idea behind using select() is to ensure that the program will have
> information as to whether an I/O operation on the descriptor will block or not.
> Now the only thing to be looked at is whether using threads & select() together
> will break this functionality. The answer clearly is NO on a solaris. The
> reason being that the man page says that select() is MT-Safe. Which means that
> exactly one process will come out of select when data arrives for a particular
> descriptor.

What?! This is entirely wrong. If only one thread came out of select
when data arrived, then the function would not be thread safe. Thread 1
calling 'select' will change select's behavior for thread 2. Suppose
thread 1 comes out of select and never reads on the socket, thread 2 is
now stuck forever waiting for something that's already happened!



> On a linux too, I think we can assume that this will be the case, as in
> Linux, the thread itself is recognized by the "kernel" as the basic schedulable
> entity. So when each thread blocks on a select, internally each thread is
> waiting for some I/O indication at the kernel level. When the indication comes
> each thread is notified and the one that is scheduled first by the scheduler is
> the one that unblocks. The rest of the threads will remain blocked. Hence
> select() should be MT-safe even on a linux.

Why should they remain blocked?! The condition they are waiting for has
occurred, they *must* unblock.

DS

David Schwartz

Dec 21, 2001, 4:16:05 PM

It depends upon the implementation and what exactly manages the
switching between threads. If both threads are kernel scheduling
entities (as they are on Linux, for example), then all of the overhead
of a context switch is required except the flushing of TLBs. That's
still about 80% of the work of a context switch. Worse, schedulers are
not tuned to handle large numbers of runnable KSEs, so you encounter
drastically sub-standard scheduler performance.

You are still better off minimizing context switches between threads if
you can. And you are still better off minimizing the number of runnable
threads you have so that they are not much more than the number of CPUs
on which they can run (both for the benefit of the scheduler and to
avoid contention).

DS

Vijay Paul

Dec 22, 2001, 4:04:54 AM
David Schwartz wrote:

> Vijay Paul wrote:
>
> > The entire idea behind using select() is to ensure that the program will have
> > information as to whether an I/O operation on the descriptor will block or not.
> > Now the only thing to be looked at is whether using threads & select() together
> > will break this functionality. The answer clearly is NO on a solaris. The
> > reason being that the man page says that select() is MT-Safe. Which means that
> > exactly one process will come out of select when data arrives for a particular
> > descriptor.
>
> What?! This is entirely wrong. If only one thread came out of select
> when data arrived, then the function would not be thread safe. Thread 1
> calling 'select' will change select's behavior for thread 2. Suppose
> thread 1 comes out of select and never reads on the socket, thread 2 is
> now stuck forever waiting for something that's already happened!
>

Interesting comment, but "wrong". Consider the case where two threads are waiting on
a select and both
of them are woken up. If one of them gobbles up the data, then the other process
would have no way of
knowing if an I/O operation would block. This is exactly the situation that we want
to avoid. Btw, in response
to your point that, "Suppose thread 1 comes out of select and never reads on the
socket, thread 2 is
now stuck forever waiting for something that's already happened!", we have a timeout
argument in select()
so that can be a way out. Besides, why on earth would you want to be notified when
data arrives on a socket
if you have no plans to read from it (there can be exceptional cases, but then
there are elegant work-arounds).

>
> > On a linux too, I think we can assume that this will be the case, as in
> > Linux, the thread itself is recognized by the "kernel" as the basic schedulable
> > entity. So when each thread blocks on a select, internally each thread is
> > waiting for some I/O indication at the kernel level. When the indication comes
> > each thread is notified and the one that is scheduled first by the scheduler is
> > the one that unblocks. The rest of the threads will remain blocked. Hence
> > select() should be MT-safe even on a linux.
>
> Why should they remain blocked?! The condition they are waiting for has
> occured, they *must* unblock.
>

Could you suggest what should be the mask setting when a thread is unblocked ?
Should it
indicate that a read will not block when the effect could be contrary ????

>
> DS

David Schwartz

Dec 22, 2001, 4:57:41 AM
Vijay Paul wrote:

> > What?! This is entirely wrong. If only one thread came out of select
> > when data arrived, then the function would not be thread safe. Thread 1
> > calling 'select' will change select's behavior for thread 2. Suppose
> > thread 1 comes out of select and never reads on the socket, thread 2 is
> > now stuck forever waiting for something that's already happened!

> Interesting comment, but "wrong".

No, you are completely wrong.

> Consider the case where two threads are waiting on
> a select and both
> of them are woken up. If one of them gobbles up the data, then the other process
> would have no way of
> knowing if an I/O operation would block. This is exactly the situation that we want
> to avoid.

No, this is the situation you think you want to avoid, but this makes
no sense. The 'select' function call is supposed to block until a read
can complete without blocking. Now you are suggesting that it shouldn't
stop blocking even though that exact circumstance took place simply
because it might not be true in the future. That's crazy.

> Btw, in response
> to your point that, "Suppose thread 1 comes out of select and never reads on the
> socket, thread 2 is
> now stuck forever waiting for something that's already happened!", we have a timeout
> argument in select()
> so that can be a way out.

I don't think you understood my point. If 'select' only woke one
thread, then it wouldn't be thread safe. Suppose one thread was only
curious if data was available and had no intention of reading any. The
thread that was actually going to read the data would see a behavior
change in the semantics of 'select' just because another thread called
'select'. That would be flat out broken and no system does that.

> Besides, why on earth would you want to be notified when
> data arrives on a socket
> if you have no plans to read from it (there can be exceptional cases, but then
> there are elegant work-arounds).

What the heck does that have to do with anything. It's this simple --
'select' blocks until a read can complete without blocking. It makes no
guarantee that a read at some point in the future won't block, nor can
it.



> > > On a linux too, I think we can assume that this will be the case, as in
> > > Linux, the thread itself is recognized by the "kernel" as the basic schedulable
> > > entity. So when each thread blocks on a select, internally each thread is
> > > waiting for some I/O indication at the kernel level. When the indication comes
> > > each thread is notified and the one that is scheduled first by the scheduler is
> > > the one that unblocks. The rest of the threads will remain blocked. Hence
> > > select() should be MT-safe even on a linux.
> > Why should they remain blocked?! The condition they are waiting for has
> > occured, they *must* unblock.

> Could you suggest what should be the mask setting when a thread is unblocked ?
> Should it
> indicate that a read will not block when the effect could be contrary ????

You grossly misunderstand what 'select' returns. If select says a read
won't block, that means that at some point in between when you called
select and when it returned, a read would not have blocked. It does not
in any way assure that a read at some time in the future won't block --
how could it possibly?!
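
The point can be demonstrated in a few lines. This is a sketch under stated
assumptions: a pipe stands in for the socket, a second read() in the same
thread stands in for another thread or process that gets to the data first,
and the name demo_stale_readiness is hypothetical. The descriptor is
non-blocking so the late read reports EAGAIN instead of hanging.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Returns 1 if select() reported the descriptor readable but a later
 * read() found nothing, 0 if not, and -1 if the setup failed.
 */
static int demo_stale_readiness(void)
{
    int p[2], ready, stale;
    fd_set rfds;
    char c;
    ssize_t n;

    if (pipe(p) < 0)
        return -1;
    if (fcntl(p[0], F_SETFL, fcntl(p[0], F_GETFL, 0) | O_NONBLOCK) < 0)
        return -1;
    if (write(p[1], "x", 1) != 1)
        return -1;

    FD_ZERO(&rfds);
    FD_SET(p[0], &rfds);
    ready = select(p[0] + 1, &rfds, NULL, NULL, NULL); /* readable now */

    /* Stand-in for another thread or process that reads first. */
    if (read(p[0], &c, 1) != 1)
        return -1;

    /* Our own read comes too late: the readiness report is stale. */
    n = read(p[0], &c, 1);
    stale = (ready == 1 && n == -1 &&
             (errno == EAGAIN || errno == EWOULDBLOCK));

    close(p[0]);
    close(p[1]);
    return stale;
}
```

With a blocking descriptor the final read() would simply hang, which is the
failure mode being argued about in this thread.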

DS

Vijay Paul

Dec 22, 2001, 9:14:34 AM
Hi,
I did some testing both on solaris and linux.

Solaris:
What i'd previously suggested does not turn out to be true. Yes, if more than one
thread (i tried with 6) is waiting on a select() then more than one of them unblock. But
not all unblock. Only some of them unblock, which seems a bit awkward.

Linux:
On linux exactly one thread gets woken up, so we can say that select() behaves
exactly like accept() in this respect. So there wont be any problem if our intention is
to read data and not try and snoop to see if data is available for the taking.

vp

Casper H.S. Dik - Network Security Engineer

Dec 22, 2001, 6:43:49 AM
[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]

Vijay Paul <pa...@lucent.com> writes:

-- watch the line length.

>Interesting comment, but "wrong". Consider the case where two threads are waiting on
>a select and both
>of them are woken up. If one of them gobbles up the data, then the other process
>would have no way of
>knowing if an I/O operation would block. This is exactly the situation that we want
>to avoid. Btw, in response
>to your point that, "Suppose thread 1 comes out of select and never reads on the
>socket, thread 2 is
>now stuck forever waiting for something that's already happened!", we have a timeout
>argument in select()
>so that can be a way out. Besides, why on earth would you want to be notified when
>data arrives on a socket
>if you have no plans to read from it (there can be exceptional cases, but then
>there are elegant work-arounds).

If your code doesn't wake up threads that are waiting for an event,
you will have threads blocked in poll/select, possibly forever, and in
any case potentially too long.

If multiple threads call select() on the same fd they *MUST* handle spurious
wakeups and use non blocking reads.
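
A minimal sketch of that rule, with several threads selecting on one
descriptor. The assumptions are labeled here and in the comments: a pipe
stands in for the socket, and the names reader and demo_shared_select are
hypothetical. Every thread treats EAGAIN after select() as a spurious
wakeup and goes back to waiting; the writer closes its end so that all
threads eventually see end of file and return.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

#define NTHREADS 4

static int shared_fd;                   /* descriptor every thread selects on */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int bytes_read;                  /* bytes read by all threads together */

static void *reader(void *arg)
{
    (void)arg;
    for (;;) {
        fd_set rfds;
        char c;
        ssize_t n;

        FD_ZERO(&rfds);
        FD_SET(shared_fd, &rfds);
        if (select(shared_fd + 1, &rfds, NULL, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return NULL;
        }

        n = read(shared_fd, &c, 1);     /* non-blocking: never hangs */
        if (n == 1) {
            pthread_mutex_lock(&count_lock);
            bytes_read++;
            pthread_mutex_unlock(&count_lock);
        } else if (n == 0) {
            return NULL;                /* writer closed: end of file */
        } else if (errno != EAGAIN && errno != EWOULDBLOCK &&
                   errno != EINTR) {
            return NULL;                /* real error */
        }
        /* EAGAIN: another thread won the race; wait in select() again. */
    }
}

static int demo_shared_select(void)
{
    pthread_t th[NTHREADS];
    int p[2], i;

    if (pipe(p) < 0)
        return -1;
    shared_fd = p[0];
    if (fcntl(shared_fd, F_SETFL,
              fcntl(shared_fd, F_GETFL, 0) | O_NONBLOCK) < 0)
        return -1;

    for (i = 0; i < NTHREADS; i++)
        if (pthread_create(&th[i], NULL, reader, NULL) != 0)
            return -1;

    if (write(p[1], "a", 1) != 1)       /* one byte for four threads */
        return -1;
    close(p[1]);

    for (i = 0; i < NTHREADS; i++)
        pthread_join(th[i], NULL);
    close(p[0]);
    return bytes_read;
}
```

However many threads wake up, the single byte is read exactly once and no
thread blocks in read(). Link with -lpthread where the platform requires it.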

The wake-up one scenario is fatally flawed for two reasons:

- it can cause deadlock or unnecessary delays.
- it doesn't protect against the scenario it purports to
protect against.


In the first case, when a thread is awoken for multiple events, it
may decide to do something else before using the other ready descriptor,
this can cause delays.

There are multiple scenarios for the lack of protection, which you
may or may not handle: e.g., threads entering select after a fd has
become ready; they would normally not block in select and continue.
(unless you somehow mark an fd "owned" by a certain thread and
"un-own" them on the first I/O operation). Or a second bit
of data arrives which would cause another wakeup but the thread
that was initially woken up takes care of all input.

I think we had the wake-one discussion before; I think we concluded
then that wake-one is problematic.

>Could you suggest what should be the mask setting when a thread is unblocked ?
>Should it
>indicate that a read will not block when the effect could be contrary ????

Your program has a logic flaw if it calls select in multiple threads on
the same fd and somehow believes that if select() returns with the
descriptor marked as "ready" that read/accept/etc won't block.

The same can happen with multiple processes, and is nothing new
to multi threaded programming.

select() returns a snapshot of what is true at the moment
select() returns; other processes/threads can change that
before the thread calling select() gets around to using the fd.

Hacking "select" to wake up just one blocked thread is misguided;
it makes some buggy programs work some of the time but not all of
the time; and makes those programs non-portable because of that.

But the bugs remain and can be triggered in those applications.


Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

Casper H.S. Dik - Network Security Engineer

Dec 22, 2001, 10:06:49 AM
[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]

Vijay Paul <pa...@lucent.com> writes:

>Solaris:
> What i'd previously suggested does not turn out to be true. Yes, if more than one
>thread (i tried with 6) is waiting on a select() then more than one of them unblock. But
>not all unblock. Only some of them unblock, which seems a bit awkward.

If you followed the poll by an immediate read/accept, then not all threads will
be woken up as the event "goes away".

(The threads are awoken briefly in the kernel and check whether the
polled condition is true; the threads that lose the race with the first thread
to call accept will go back to sleep in the kernel)

>Linux:
> On linux exactly one thread gets woken up, so we can say that select() behaves
>exactly like accept() in this respect. So there wont be any problem if our intention is
>to read data and not try and snoop to see if data is available for the taking.

There will be a problem; if the threads are all sleeping in select they'll be
mostly fine (though some event may go unnoticed for some time) but threads
that are about to enter select() will find a connection pending and
return.

David Schwartz

Dec 22, 2001, 5:33:01 PM

Please post your code. I cannot believe your claimed results on Linux
are accurate. If it were true, it would be a *major* bug in the kernel.

DS

David Schwartz

Dec 22, 2001, 5:53:14 PM

I see the correct behavior, all threads wake as they should. Here's my
test code:

#include <pthread.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

int read_fd;

/* Each thread blocks in select() on the read end of the pipe. */
void *thread(void *a)
{
    fd_set read_set;
    int i;

    (void)a;
    FD_ZERO(&read_set);
    FD_SET(read_fd, &read_set);
    printf("Blocking in select\n");
    i=select(read_fd+1, &read_set, NULL, NULL, NULL);
    printf("Woken, ret=%d\n", i);
    return NULL;
}

int main(void)
{
    int fildes[2];
    pthread_t th;

    pipe(fildes);
    read_fd=fildes[0];
    pthread_create(&th, NULL, thread, NULL);
    pthread_create(&th, NULL, thread, NULL);
    pthread_create(&th, NULL, thread, NULL);
    usleep(100);    /* give the threads time to block in select */
    printf("Writing\n");
    write(fildes[1], "a", 1);
    printf("Written\n");
    usleep(100);    /* give the threads time to report before exiting */
    return 0;
}

And here's its output:

Blocking in select
Blocking in select
Blocking in select
Writing
Written
Woken, ret=1
Woken, ret=1
Woken, ret=1

Note that any other output from this program would be insane,
unreasonable, and a violation of standards.

DS

Vijay Paul

Dec 25, 2001, 7:51:05 AM
I do not understand how you could possibly expect this program to do its job. It
is *plainly* obvious that all the threads will come out of a blocked state if you
use this program. After your select, you are just printing the return value and
exiting. This means that there is data waiting to be read. Hence the remaining
threads will also unblock from select.

What you need to do is put a simple read after select and check for yourself if
all the threads unblock from select. The answer is NO. Only one of them unblocks
even with your example code. So seems like according to *you*, you have found a
*major* bug in the linux kernel. Congrats.

So with a read in your code, the output I got is .....

Blocking in select
Blocking in select
Blocking in select
Writing
Written
Woken, ret=1

read 1 bytes after select


which means that one thread blocking on a select is woken up. However, if the
thread that got woken up does not read(), and it proceeds to get scheduled out,
then another thread blocking on a select() would get woken up. I don't know if
this is a desirable feature, but this is the way it seems to be.

David Schwartz wrote:

> I see the correct behavior, all threads wake as they should. Here's my
> test code:
>
> #include <pthread.h>
> #include <sys/types.h>
> #include <unistd.h>
>
> int read_fd;
>
> void *thread(void *a)
> {
> fd_set read_set;
> int i;
> FD_ZERO(&read_set);
> FD_SET(read_fd, &read_set);
> printf("Blocking in select\n");
> i=select(read_fd+1, &read_set, NULL, NULL, NULL);
> printf("Woken, ret=%d\n", i);

put a simple read here.

}

>
> Note that any other output from this program would be insane,
> unreasonable, and a violation of standards.

yeah. there seems to be some *insane* output.

>
> DS

vp.


Vijay Paul

Dec 25, 2001, 8:04:52 AM

I do not understand what you mean by *claimed* results... It somehow seemed to hint that the
results were not genuine... surely, no one would put forward a false claim when every
programmer with a Linux machine can verify the facts.. anyway, the code that I used is given below...

_START_

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>


void *
start_r(void *args);
int val1, val2, val3, val4, val5, val6;
int sfd, sockfd;

int
main()
{
    pthread_t thr1, thr2, thr3, thr4, thr5, thr6;
    struct sockaddr_in sin;
    socklen_t len;

    val1 = 1;
    val2 = 2;
    val3 = 3;
    val4 = 4;
    val5 = 5;
    val6 = 6;

    sfd = socket(PF_INET, SOCK_STREAM, 0);
    memset(&sin, 0, sizeof(sin));
    sin.sin_port = htons(9000);
    sin.sin_addr.s_addr = INADDR_ANY;
    sin.sin_family = AF_INET;

    if (bind(sfd, (struct sockaddr *)&sin, sizeof(sin)) < 0)
    {
        perror("bind");
        exit(1);
    }

    listen(sfd, 5);

    len = sizeof(sin);
    sockfd = accept(sfd, (struct sockaddr *)&sin, &len);

    printf("main:: connected socket %d\n", sockfd);

    /* all six threads select() on the same connected socket */
    pthread_create(&thr1, 0, start_r, &val1);
    pthread_create(&thr2, 0, start_r, &val2);
    pthread_create(&thr3, 0, start_r, &val3);
    pthread_create(&thr4, 0, start_r, &val4);
    pthread_create(&thr5, 0, start_r, &val5);
    pthread_create(&thr6, 0, start_r, &val6);

    while (1)
    {
        sleep(10);
    }
}

void *
start_r(void *args)
{
    int *val, n, nfds;
    fd_set rfds;
    char buf[100];

    val = (int *)args;

    printf("Inside thread %d\n", *val);

    FD_ZERO(&rfds);

    while (1)
    {
        FD_SET(sockfd, &rfds);

        nfds = select(sockfd + 1, &rfds, 0, 0, 0);

        printf("thread %d, select returned %d\n", *val, nfds);

        if (nfds <= 0)
        {
            continue;
        }

        if (FD_ISSET(sockfd, &rfds))
        {
            bzero(buf, sizeof(buf));
            /* leave room for the terminating NUL */
            n = read(sockfd, buf, sizeof(buf) - 1);
            printf("thread %d read %d\n", *val, n);
            printf("thread %d contents are %s\n", *val, buf);
            sprintf(buf, "thread %d got the data\n", *val);
            write(sockfd, buf, strlen(buf));
        }
    }
    printf("thread %d exiting\n", *val);  /* not reached */
    return NULL;
}

_END_


Run this and telnet to localhost with port number 9000....
keep pressing return and see the order in which threads unblock from select()..

vp

David Schwartz

Dec 25, 2001, 1:49:49 PM

Vijay Paul wrote:

> I do not understand how you could possible expect this program to do its job. It
> is *plainly* obvious that all the threads will come out of a blocked state if you
> use this program. After your select, you are just printing the return value and
> exiting. This means that there is data waiting to be read. Hence the remaining
> threads will also unblock from select.

Isn't that the question? Does a read hit wake one thread or all? The
answer is, it wakes all.



> What you need to do is put a simple read after select and check for yourself if
> all the threads unblock from select. The answer is NO. Only one of them unblocks
> even with your example code. So seems like according to *you*, you have found a
> *major* bug in the linux kernel. Congrats.

So you're saying that while all three threads are blocked in select,
the kernel somehow knows that they're not going to call read?! How? ESP?
What if the line of code after 'select' was:

if(foo()) read(bar(), buf, 1000);

How would the kernel know whether the threads are going to read or not?



> So with a read in your code, the output I got is .....
>
> Blocking in select
> Blocking in select
> Blocking in select
> Writing
> Written
> Woken, ret=1
> read 1 bytes after select

> which means that one thread blocking on a select is woken up. However if the
> thread that got woken up does not read (), and it proceeds to get scheduled out,
> then another thread blocking on a select() would get woken up. I dont know if
> this is a desirable feature, but this is the way it seems to be.

No, that is not the way it seems to be. What the kernel does is wake
*ALL* the threads that are in select. What is, however, happening is
that some of the threads *MAY* go back to sleep if upon waking up they
can't figure out why they woke up.



> > Note that any other output from this program would be insane,
> > unreasonable, and a violation of standards.
>
> yeah. there seems to be some *insane* output.

No, the output is entirely reasonable.

As I've said before:

1) If 'select' returns a read indication on a socket, it means that at
some point between the entry to and the exit from select, a read on the
socket could have completed without blocking.

2) If a read on a socket can complete without blocking, select must
eventually return.

Neither of these two rules is broken in either scenario. However,
assuming that a read hit from select ensures that a read can complete
without blocking later is madness.
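
To make that last point concrete, here is a minimal sketch (not part of the original posts) of a reader that treats the result of select() only as a hint. It assumes the descriptor has already been put into non-blocking mode. The helper names set_nonblocking and read_after_select, and the -2 return value, are made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode. Returns 0 on success. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: wait until fd looks readable, then try to read.
 * Returns the byte count (0 on end-of-file), -1 on a real error, or -2
 * (an invented code) if the data was already gone when we read, for
 * example because another thread got there first.
 */
static ssize_t read_after_select(int fd, void *buf, size_t len)
{
    fd_set rset;
    ssize_t n;

    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    if (select(fd + 1, &rset, NULL, NULL, NULL) < 0)
        return -1;

    /* The read hit is only a hint; this read must not be allowed to block. */
    n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

A caller that gets -2 back would simply go around its loop and call select() again, which is the usual reason for pairing select() with non-blocking descriptors.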

DS

David Schwartz

Dec 25, 2001, 1:51:35 PM

Vijay Paul wrote:

> while(1)
> {
> FD_SET(sockfd, &rfds);
>
> nfds = select(100, &rfds, 0, 0, 0);
>
> printf("thread %d, select returned %d\n", *val, nfds);
>
> if(nfds <= 0)
> {
> continue;
> }
>
> if(FD_ISSET(sockfd, &rfds))
> {
> bzero(buf, 100);
> n = read(sockfd, buf, 100);
> printf("thread %d read %d\n", *val, n);
> printf("thread %d contents are %s\n", *val, buf);
> sprintf(buf, "thread %d got the data\n", *val, buf);
> write(sockfd, buf, strlen(buf));
> }
> }
> printf("thread %d exiting\n", *val);
> }

Your code is buggy. The 'read' in the 'if' clause may cause threads
that woke up in select to go back to sleep. (When they wake up, they try
to figure out why they woke up. If they can't figure out why they woke
up, they may go back to sleep.) Remember, the goal was to tell which
threads woke up, so any code that could allow threads to go back to
sleep ruins the test.

DS

Vijay Paul

Dec 26, 2001, 12:21:25 AM

>
> > if(FD_ISSET(sockfd, &rfds))
> > {
> > bzero(buf, 100);
> > n = read(sockfd, buf, 100);
> > printf("thread %d read %d\n", *val, n);
> > printf("thread %d contents are %s\n", *val, buf);
> > sprintf(buf, "thread %d got the data\n", *val, buf);
> > write(sockfd, buf, strlen(buf));
> > }
> > }
> > printf("thread %d exiting\n", *val);
> > }
>
> Your code is buggy. The 'read' in the 'if' clause may cause threads
> that woke up in select to go back to sleep. (When they wake up, they try
> to figure out why they woke up. If they can't figure out why they woke
> up, they may go back to sleep.) Remember, the goal was to tell which
> threads woke up, so any code that could allow threads to go back to
> sleep ruins the test.
>
> DS

The read in the if clause is *NOT* a bug. The idea here is to check whether every
select() unblocks when data arrives on the socket. We all know that select() will
unblock if data remains in the socket. Obviously, if there is data remaining
to be read, then all threads will be woken up one by one. So you need to read
the data. (Please appreciate that the intention here is not to ensure
perfectly working non-blocking code for some application. The intention is to
verify a particular functionality.) I had clearly stated this in my previous mail,
but for your benefit: *if* the read is not there, then it is *plainly* obvious
that all selects will unblock eventually, not because data came some time
back, but because there is data to be read currently.

vp

David Schwartz

Dec 26, 2001, 1:02:12 AM

Vijay Paul wrote:

> > Your code is buggy. The 'read' in the 'if' clause may cause threads
> > that woke up in select to go back to sleep. (When they wake up, they try
> > to figure out why they woke up. If they can't figure out why they woke
> > up, they may go back to sleep.) Remember, the goal was to tell which
> > threads woke up, so any code that could allow threads to go back to
> > sleep ruins the test.

> The read in the if clause is *NOT* a bug. The idea here is to check if all


> select() unblocks if data comes on the socket.

Every thread unblocks when data comes on the socket. However, they may
not get scheduled immediately.

> We all know that select() will
> unblock if data remains in the socket.

In other words, 'select' has wake all semantics.

> Obviously, if there is data remaining
> to be read then all threads will be woken up one by one.

No, no, no. Threads will be scheduled at some rate, but all threads
will be woken (made ready to run) simultaneously. All threads that are
blocking in select for a socket wake up when that socket is triggered.

> So you need to read
> the data. (please appreciate that the intention here is not to ensure
> non-blocking perfectly working code for some application. The intention is to
> verify a particular functionality. I had clearly stated in my previous mail,
> but for your benifit, *If* the read is not there, then it is *plainly* obvious
> that all selects will unblock eventually, not because data came some time
> back, but because there is data to be read currently.

NO, NO, NO again. It's because of the data that was there some time
back. No matter how many times I explain it, you still don't get it.

Every thread that is blocked in select is scheduled. As those threads
get rescheduled (at an unpredictable rate that depends upon many factors
beyond your control), they may go back to sleep or return depending upon
whether they can figure out why they woke up.

DS

Vijay Paul

Dec 26, 2001, 9:24:14 AM

David Schwartz wrote:

> Vijay Paul wrote:
>
> > > Your code is buggy. The 'read' in the 'if' clause may cause threads
> > > that woke up in select to go back to sleep. (When they wake up, they try
> > > to figure out why they woke up. If they can't figure out why they woke
> > > up, they may go back to sleep.) Remember, the goal was to tell which
> > > threads woke up, so any code that could allow threads to go back to
> > > sleep ruins the test.
>
> > The read in the if clause is *NOT* a bug. The idea here is to check if all
> > select() unblocks if data comes on the socket.
>
> Every thread unblocks when data comes on the socket. However, they may
> not get scheduled immediately.

What are we discussing here? I thought what we were discussing was whether all threads
blocking in the select() system call unblocked (i.e., a thread that called select()
came out and went on to execute the next line). This discussion was always about
the functionality that an *end* user got when using the select() system call. I do
not see any point in someone saying that, "the thread unblocks when data comes on
the socket. However, they may not get scheduled immediately". The plain fact of the
matter is that select() does not unblock as far as the *end* user is concerned
which is what my example code has demonstrated.

>
> > We all know that select() will
> > unblock if data remains in the socket.
>
> In other words, 'select' has wake all semantics.

Huh? What this means is that the thread gets added to the "ready" queue (if I may
use such a term) as a result of data arriving. If later on it finds that there is
no data available, then it goes back to sleep. Which means that, if we call
select() from several threads, not all of them will wake up. But if you still think
that select() has "wake all semantics", well, what can I say.

>
> > Obviously, if there is data remaining
> > to be read then all threads will be woken up one by one.
>
> No, no, no. Threads will be scheduled at some rate, but all threads
> will be woken (made ready to run) simultaneously. All threads that are
> blocking in select for a socket wake up when that socket is triggered.

What is meant by "woken up" here is coming out of select() and going on to the next
instruction in the program's code. Either you did not realise this or you chose to
ignore this. Even if the KSE is woken up internally, it may still go back to sleep if
there is no data. So the call to select() may remain blocked.

>
> > So you need to read
> > the data. (please appreciate that the intention here is not to ensure
> > non-blocking perfectly working code for some application. The intention is to
> > verify a particular functionality. I had clearly stated in my previous mail,
> > but for your benifit, *If* the read is not there, then it is *plainly* obvious
> > that all selects will unblock eventually, not because data came some time
> > back, but because there is data to be read currently.
>
> NO, NO, NO again. It's because of the data that was there some time
> back. No matter how many times I explain it, you still don't get it.

Come on now. If we read immediately after the select in one thread, then there
won't be data when the second thread is scheduled and hence it will go back to
sleep. So *SELECT* will, repeat, will unblock based on whether data is present for
reading. If you had cared to read what I've written, you would have noticed that I
have all along been talking about *select()* being unblocked and not about some KSE
being woken up and then going to sleep. So what if some KSE woke up and then went
back to sleep? The select() call as visible to the programmer did not unblock.

>
> Every thread that is blocked in select is scheduled. As those threads
> get rescheduled (at an unpredictable rate that depends upon many factors
> beyond your control), they may go back to sleep or return depending upon
> whether they can figure out why they woke up.

Exactly... or to be more precise, depending upon whether they find data to be read;
otherwise they go back to sleep. Now, if they go to sleep, does the user know about what
happened in the kernel? NO. So what became of that select() call? It remains
blocked. So again, not all threads in a select() will return from select() when data
arrives. I hope you finally got the point.

vp.

PS: I do not see any further value in continuing this discussion in this
group. If you want any further clarifications, I suggest you
mail me directly and spare others the pain, unless of course someone else is
interested in being a party to this discussion.

Casper H.S. Dik - Network Security Engineer

Dec 26, 2001, 10:53:52 AM

[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]

Vijay Paul <pa...@lucent.com> writes:

>The read in the if clause is *NOT* a bug. The idea here is to check if all
>select() unblocks if data comes on the socket. We all know that select() will
>unblock if data remains in the socket. Obviously, if there is data remaining
>to be read then all threads will be woken up one by one. So you need to read
>the data. (please appreciate that the intention here is not to ensure
>non-blocking perfectly working code for some application. The intention is to
>verify a particular functionality. I had clearly stated in my previous mail,
>but for your benifit, *If* the read is not there, then it is *plainly* obvious
>that all selects will unblock eventually, not because data came some time
>back, but because there is data to be read currently.


It's not clear that the observation "select doesn't return" is the
same as "thread is not woken up".

In Solaris, all threads are woken up, but poll/select will return in
only a few threads (as long as the data is present).

And the wording "eventually" implies that it may take a longish while;
but there's no proof of that: the effect is probably nearly immediate.

Casper H.S. Dik - Network Security Engineer

Dec 26, 2001, 11:18:13 AM

[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]

Vijay Paul <pa...@lucent.com> writes:

>What are we discussing here ? I thought what we were discussing was if all threads
>blocking in the select() system call unblocked (ie a thread that called a select,
>came out and went on to execute the next line). This discussion was always about
>the functionality that an *end* user got when using the select() system call. I do
>not see any point in someone saying that, "the thread unblocks when data comes on
>the socket. However, they may not get scheduled immediately". The plain fact of the
>matter is that select() does not unblock as far as the *end* user is concerned
>which is what my example code has demonstrated.


Typical kernel wakeups are done in this fashion:


while (condition not met) {
cond_wait(&cv, &mtx);
}


When a thread is woken up, it will start running in the kernel and
will retest the condition.

The read system call makes the condition go away and select()
starts to sleep again.

Your conclusion that this points to "wake one" or "wake few" is not
correct; your user code will need to deal with "wake all" (e.g., a thread
is woken up but its code that calls select is paged out; it gets a page
fault before the select returns and the other threads also start
to return from select).

>Huh ? What this means is that the thread gets added to the "ready" queue (if i may
>use such a term) as a result of a data arriving. If later on it finds that there is
>no data available, then it goes back to sleep. Which means that, if we call
>select() from several threads, all of them will not wake up. But if you still think
>that select() has "wake all semantics", well, what can I say.

Yes, all threads are woken up; but some may lose the race with the
first thread to reach read and go back to sleep.

Typical kernel wakeups don't carry exact information about why the
wakeup occurs and the wakeups can be spurious too. That's why such
checks are typically performed in a loop.
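
The same pattern can be sketched in user space with POSIX threads. This is only an analogy for what a kernel does, not actual kernel code; the mailbox type and its functions are hypothetical names invented for the example. The producer wakes every waiter, and each waiter re-tests the condition after waking, so a waiter that loses the race (or wakes spuriously) just goes back to sleep.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical one-slot mailbox standing in for "data is available". */
struct mailbox {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    bool            full;
    int             value;
};

static void mailbox_init(struct mailbox *mb)
{
    pthread_mutex_init(&mb->mtx, NULL);
    pthread_cond_init(&mb->cv, NULL);
    mb->full = false;
    mb->value = 0;
}

/* Producer: store a value and wake *all* waiters ("wake all").
 * Simplification: it overwrites any value that has not been taken yet. */
static void mailbox_put(struct mailbox *mb, int value)
{
    pthread_mutex_lock(&mb->mtx);
    mb->value = value;
    mb->full = true;
    pthread_cond_broadcast(&mb->cv);
    pthread_mutex_unlock(&mb->mtx);
}

/* Consumer: every woken thread re-tests the condition. Only the first one
 * to get the mutex finds the slot full; the rest wait again. */
static int mailbox_take(struct mailbox *mb)
{
    int v;

    pthread_mutex_lock(&mb->mtx);
    while (!mb->full)
        pthread_cond_wait(&mb->cv, &mb->mtx);
    v = mb->value;
    mb->full = false;   /* the condition "goes away", as read() does for select */
    pthread_mutex_unlock(&mb->mtx);
    return v;
}
```

From outside, a consumer that went back to sleep inside mailbox_take looks as if it was never woken, which is the same ambiguity being argued about for select().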

>What is meant by "woken up" here is come out of select() and go on to the
>instruction in the programs code. Either you did not realise this or you chose to
>ignore this. What if the KSE is woken up internally, it still may go to sleep if
>there is no data. SO, the call to select() may remain blocked.

But that behaviour is not easy to model; what it means in practice is that
you need to cater for wake-all in your code; that fewer threads wake up
is just by accident and not by design. This appears to happen both in Solaris
and Linux; from a kernel perspective, both wake up all kernel threads.


>Come on now. If we read immediately after the select in one thread, then there
>won't be data when the second thread is scheduled and hence it will go back to
>sleep. So *SELECT* will, repeat, will unblock based on if data is present for
>reading. If you had cared to read what i've written, you would have noticed that i
>have all along been talking about *select()* being unblocked and not about some KSE
>being woken up and then going to sleep. So, what if some KSE woke up and then went
>back to sleep, select() call as visible to the programmer did not unblock.

Not all, but some. A number of threads can return from select before one
calls read and obtains the data; all other threads that return from select
will now block on read().

>Exactly... or to be more precise, depending upon whether they find data to be read,
>else they go back to sleep. Now, if they go to sleep, does the user know about what
>happened in the kernel ? NO. So what became of that select() call ? It remains
>blocked. So again, all threads in a select() will not return from select() if data
>arrives. I hope you finally got the point.

But likely more than one thread will; so we hope that you finally get the
point that multiple select() calls may return and some will end up blocked in
read.
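
A rough way to observe this race without any thread getting stuck is sketched below. It is not code from this thread: it assumes a pipe instead of a socket, a non-blocking read end, and a one-second select() timeout so that threads which never see the descriptor as readable still finish. NTHREADS, waiter, run_trial and the counters are names invented for the example.

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

#define NTHREADS 4              /* arbitrary choice for the sketch */

static int read_fd;
static int returned_ready;      /* threads whose select() reported readable */
static int got_data;            /* threads whose read() actually got the byte */
static pthread_mutex_t count_mtx = PTHREAD_MUTEX_INITIALIZER;

static void *waiter(void *unused)
{
    fd_set rset;
    struct timeval tv = { 1, 0 };       /* give up after one second */
    char c;
    ssize_t n = -1;
    int ready;

    (void)unused;
    FD_ZERO(&rset);
    FD_SET(read_fd, &rset);
    ready = select(read_fd + 1, &rset, NULL, NULL, &tv);
    if (ready > 0)
        n = read(read_fd, &c, 1);       /* non-blocking: losers fail with EAGAIN */

    pthread_mutex_lock(&count_mtx);
    if (ready > 0)
        returned_ready++;
    if (n == 1)
        got_data++;
    pthread_mutex_unlock(&count_mtx);
    return NULL;
}

/* Runs one trial and fills in the two counters. Returns 0 on success. */
static int run_trial(void)
{
    int fds[2], i, flags;
    pthread_t th[NTHREADS];
    struct timeval pause = { 0, 100000 };   /* 100 ms */

    if (pipe(fds) != 0)
        return -1;
    flags = fcntl(fds[0], F_GETFL, 0);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    read_fd = fds[0];
    returned_ready = got_data = 0;

    for (i = 0; i < NTHREADS; i++)
        if (pthread_create(&th[i], NULL, waiter, NULL) != 0)
            return -1;
    select(0, NULL, NULL, NULL, &pause);    /* crude: let the threads block first */
    if (write(fds[1], "a", 1) != 1)
        return -1;
    for (i = 0; i < NTHREADS; i++)
        pthread_join(th[i], NULL);
    close(fds[0]);
    close(fds[1]);
    return 0;
}
```

After run_trial(), got_data should be 1 because only one byte was written, while returned_ready can be anything from 1 to NTHREADS depending on scheduling. The value of returned_ready is exactly the part that varies between runs and systems, so no particular number should be read as "the" behaviour.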

Vijay Paul

Dec 27, 2001, 12:01:19 AM

>
> It's not clear that the observation "select doesn't return" is the
> same as "thread is not woken up".

select() not returning meant that the next instruction in the program after the
select call was made was not executed because select() was still blocking. Hope it
is clear now.

> In Solaris, all threads are woken up, but poll/select will return in
> only a few threads (as long as the data is present).
>

I agree. Not all threads blocking on a select() will return.

