Searching for the "truth" about multithreading

faturita

Nov 5, 2009, 2:07:34 PM

Hello everyone, Good morning/afternoon/evening !

Win32, Windows XP Professional.
OpenSSL 0.9.8k compiled with multithreading support.

I am using a blocking BIO to read and write to an SSL socket using two threads (because it is in blocking mode).

There is the Reading thread:

while (true)
     BIO_read( m_pBio, pBuffer, i_iLength );

and, concurrently, the writing thread does something like this:

     BIO_write( m_pBio, i_rStream.operator unsigned char *(), i_rStream.GetLength() );

The SSL_CTX and SSL_SESSION are created from within the writing thread, and to stop the connection the writing thread calls:

     BIO_reset()

and the Reading thread stops by getting an invalid return value from the BIO_read function.

This is all very basic, and it is working, but I am aware of the "OpenSSL multithreading (or not so) things" and I would like to know whether this basic setup needs some form of synchronization (using THREAD_setup() and THREAD_cleanup()). At the same time, if I add synchronization around a blocking BIO, how can I remain free to call BIO_write and send data asynchronously while the reading thread is blocked in BIO_read and holding the mutex? I guess that OpenSSL will wisely use the callback functions to access its internal structures in a thread-safe way, but isn't there any possible way to end up in a deadlock?

Thanks a lot for your answers!
R.

Sebastián Treu

Nov 5, 2009, 2:46:44 PM

Hi,

On Thu, Nov 5, 2009 at 4:07 PM, faturita <rra...@gmail.com> wrote:
> This is all very basic, and it is working, but I am aware of the "OpenSSL
> multithreading (or not so) things" and I would like to know whether this basic
> setup needs some form of synchronization (using THREAD_setup() and
> THREAD_cleanup()). At the same time, if I add synchronization around a
> blocking BIO, how can I remain free to call BIO_write and send data
> asynchronously while the reading thread is blocked in BIO_read and holding
> the mutex? I guess that OpenSSL will wisely use the callback functions to
> access its internal structures in a thread-safe way, but isn't there any
> possible way to end up in a deadlock?

This is not a direct answer to your question. I'm writing a
multithreaded server with non-blocking file descriptors, built around a
producer-consumer model. I would be glad if you could share some
information about that multithreading "issue".

How do I implement this "server"? Well, it's still in development, but I
hold a mutex across the SSL_read() call and the SSL_get_error() call,
as both use the same resource.

The "consumer" (sender) waits for a signal from the "producer"
(reader) and locks the shared buffer queue. That is, the
"producer" gains exclusive access to the client's SSL structure
for SSL_read() and SSL_get_error(), checking for SSL_ERROR_WANT_READ or
SSL_ERROR_WANT_WRITE. When it fetches some data, it gains
exclusive access to the buffer queue, puts the data on it, signals the
"consumer" and goes on to another read.

The "consumer", when signaled, takes the data from the buffer and
locks the SSL structure of the client it wants to send to.

This is the main idea that I have implemented and am testing.
Basically, the "consumer" is something like this:

http://pastebin.ca/1658628

The code uses only a single buffer, not a queue. I'm thinking of
using a queue later on.

Please, if you have some information about threading issues, I'll be
glad to read it.

Thanks,
--
If you want freedom, compile the source. Get gentoo.

Sebastián Treu
http://labombiya.com.ar
______________________________________________________________________
OpenSSL Project http://www.openssl.org
User Support Mailing List openss...@openssl.org
Automated List Manager majo...@openssl.org

Sebastián Treu

Nov 5, 2009, 3:00:44 PM

Hi,

I forgot the "producer"

> The "consumer" (sender) waits for a signal from the "producer"
> (reader) and locks the shared buffer queue. That is, the
> "producer" gains exclusive access to the client's SSL structure
> for SSL_read() and SSL_get_error(), checking for SSL_ERROR_WANT_READ or
> SSL_ERROR_WANT_WRITE. When it fetches some data, it gains
> exclusive access to the buffer queue, puts the data on it, signals the
> "consumer" and goes on to another read.

http://pastebin.ca/1658646

Regards,

faturita

Nov 5, 2009, 3:30:45 PM

Hello Sebastian,

Thanks a lot for your reply !!

> The "consumer" (sender) waits for a signal from the "producer"
> (reader) and locks the shared buffer queue. That is, the
> "producer" gains exclusive access to the client's SSL structure
> for SSL_read() and SSL_get_error(), checking for SSL_ERROR_WANT_READ or
> SSL_ERROR_WANT_WRITE. When it fetches some data, it gains
> exclusive access to the buffer queue, puts the data on it, signals the
> "consumer" and goes on to another read.

I guess that if you are checking for SSL_ERROR_WANT_READ, it is because you are dealing with non-blocking I/O, so it is not exactly the situation I described in my email. I guess that with non-blocking I/O you have more control over the code and can do more to avoid concurrency issues.

> Please, if you have some information about threading issues, I'll be
> glad to read it.

Right now I do not have any real problem and the application is working, but I know, because it is documented, that you should not share the SSL connection between threads.

> Thanks,

Thank you!

David Schwartz

Nov 5, 2009, 8:15:43 PM

Faturita wrote:

> I am using a blocking BIO to read and write to an SSL socket
> using two threads (because it is in blocking mode).

This is not permitted. You cannot have two threads call modification
functions on the same object at the same time.

> There is the Reading thread:
> while (true)
> BIO_read( m_pBio, pBuffer, i_iLength );

> and, concurrently, the writing thread does something like this:

> BIO_write( m_pBio, i_rStream.operator unsigned char *(), i_rStream.GetLength() );

Assuming they are the same BIO, this is a disaster waiting to happen. You
have one thread that can modify the BIO while another thread is reading it.
Boom.

> This is all very basic, and it is working but I am aware of the
> "OpenSSL multithreading (or not so) things" and I would like to
> know if this basic stuff needs some form of synchronization
> (using THREAD_setup() and THREAD_cleanup()).

It doesn't need synchronization, it needs to be replaced with code that is
sane. No synchronization can fix this, it is broken at a fundamental level.

> At the same time if I use some form of synchronization with
> blocking BIO how can I be free to use BIO_write and send info
> async if the Reading thread is blocked in BIO_read and holding
> the mutex?

Exactly. Broken design. This is why I tell people not to use blocking
functions, non-blocking is so much easier.

> I guess that SSL wisely will use the functions callbacks
> to access the internal structures thread-safely, but isn't
> there any possible way to end up in a deadlock ?

The problem has nothing to do with deadlock but with simple crashing. Just
as one thread can't modify a pointer while another thread is following that
pointer, one thread can't modify a BIO while another thread is reading from
that BIO.

Blocking operations are only suitable when you know, for sure, what
operation you want to do and are willing to wait forever for it to happen.
If redoing your code is too difficult, you can emulate blocking operations
with non-blocking ones.

DS

faturita

Nov 6, 2009, 6:43:00 AM

David, thanks a lot for your valuable reply !

> It doesn't need synchronization, it needs to be replaced with code that is
> sane. No synchronization can fix this, it is broken at a fundamental level.

> Exactly. Broken design. This is why I tell people not to use blocking
> functions, non-blocking is so much easier.

In this very simple situation, what do you think would be a better design?

> Blocking operations are only suitable when you know, for sure, what
> operation you want to do and are willing to wait forever for it to happen.
> If redoing your code is too difficult, you can emulate blocking operations
> with non-blocking ones.

If I change only the BIO from blocking to non-blocking, I will still have to use some form of synchronization while accessing the BIO, right?

Thanks a lot !!
faturita

David Schwartz

Nov 6, 2009, 3:58:54 PM

Faturita wrote:

> David, thanks a lot for your valuable reply !

>> It doesn't need synchronization, it needs to be replaced with code
>> that is sane. No synchronization can fix this, it is broken at a
>> fundamental level.

>> Exactly. Broken design. This is why I tell people not to use blocking
>> functions, non-blocking is so much easier.

> In this very simple situation, what do you think will be a better design ?

Ideally you wouldn't have two threads in the first place. If your program
had to manage 1,000 SSL connections, would you have 2,000 threads? I hope
the answer is "of course not". So why do you have two threads for one SSL
connection?

>> Blocking operations are only suitable when you know, for sure, what
>> operation you want to do and are willing to wait forever for it to
>> happen. If redoing your code is too difficult, you can emulate
>> blocking operations with non-blocking ones.

> If I change only the BIO from blocking to non-blocking, I will have to
> use some form of synchronization while accessing the BIO, right?

Yes, just like any other object you want to share between two threads, you
need to synchronize access to it and coordinate the threads so they don't
step on each other. Unfortunately, you have about the worst possible design
-- two dedicated threads that expect blocking semantics even though they
don't know which operation the protocol specifies will succeed.

And the simplest obvious solution doesn't work as simply as you might think.
If you try to emulate blocking functions with non-blocking functions the
most obvious way, you will have a fatal race condition. If you make a
'write' and 'read' function like this:

1) Acquire the mutex that protects the SSL connection.

2) Try to send/receive some data with the SSL/BIO read/write function,
non-blocking.

3) If we made any forward progress, go to step 2.

4) Release the mutex.

5) If we sent/received all the data, return success.

6) If we got a fatal error in step 2, return an error.

7) Call 'select' (or equivalent) based on the return code we got in step 2.

8) Go to step 1.

The problem is that there's a race between step 4 and step 7. In that
window, the other thread can make an SSL/BIO operation that makes our call in
step 7 deadlock. For example, if we're writing, and we can't write because
we're waiting for renegotiation data, as soon as we release the mutex, the
read thread might read the renegotiation data. So when we call 'select' (or
whatever) in step 7, we may be waiting for data that has already arrived.

There are a lot of ways to fix this (best, IMO, is not to fake blocking
semantics but to design around non-blocking semantics) but the simplest is
to change the logic, using a pipe for blocking/signalling and a 'thread is
blocking' count like this:

1) Acquire the mutex that protects the SSL connection.

2) Try to send/receive some data with the SSL/BIO read/write function,
non-blocking.

3) If we made any forward progress and this is the read thread or this is
the write thread and we sent all the data: If the 'thread is blocking' count
is non-zero, write a byte to the pipe. Release the mutex. Return the count
of bytes sent or received.

4) If this is the write thread and we sent some, but not all, of the data,
go to step 2.

5) If we got a want_read or want_write indication, increment the 'thread is
blocking' count, release the mutex, and 'select' (or equivalent) on both the
socket (in the direction indicated by the return code from step 2) and the
read end of the pipe. When we wake up, if the pipe was what woke us, read
the data from it to re-arm the pipe. Acquire the mutex, decrement the
'thread is blocking' count, and go to step 2.

The pipe solves the race as it can be armed while holding the mutex and will
unblock the thread whether it's signaled before or after it entered 'select'
(or equivalent).
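
The pipe part of the procedure above could be sketched roughly like this in C, using only POSIX calls. This is a sketch under assumptions, not code from the thread: the names `wake_pipe`, `wake_pipe_init`, `wake_signal` and `wait_socket_or_wake` are hypothetical, and the SSL/BIO calls, the mutex and the 'thread is blocking' count are left out so that only the wake-up mechanism is shown.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: a pipe used only to wake a thread parked in select(). */
struct wake_pipe {
    int rd;  /* read end, watched by the thread that is blocking */
    int wr;  /* write end, used by the thread that made progress */
};

/* Create the pipe and make both ends non-blocking. Returns 0 on success. */
int wake_pipe_init(struct wake_pipe *p)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        int fl = fcntl(fds[i], F_GETFL, 0);
        if (fl == -1 || fcntl(fds[i], F_SETFL, fl | O_NONBLOCK) == -1) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    p->rd = fds[0];
    p->wr = fds[1];
    return 0;
}

/* Step 3: call while still holding the connection mutex, when the
 * 'thread is blocking' count is non-zero. */
void wake_signal(const struct wake_pipe *p)
{
    char b = 1;
    ssize_t n = write(p->wr, &b, 1);
    (void)n;  /* a full pipe already means a wake-up is pending */
}

/* Step 5: call after releasing the mutex. Waits until 'sock' is ready in
 * the requested direction or the pipe has been signalled.
 * Returns 1 if the pipe woke us (it is drained to re-arm it),
 * 0 if only the socket is ready, -1 on error. */
int wait_socket_or_wake(const struct wake_pipe *p, int sock, int want_write)
{
    fd_set rfds, wfds;
    assert(sock >= 0);
    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    FD_SET(p->rd, &rfds);
    FD_SET(sock, want_write ? &wfds : &rfds);
    int maxfd = p->rd > sock ? p->rd : sock;
    if (select(maxfd + 1, &rfds, &wfds, NULL, NULL) < 0)
        return -1;
    if (FD_ISSET(p->rd, &rfds)) {
        char buf[64];
        while (read(p->rd, buf, sizeof buf) > 0)
            ;  /* drain: the read end is non-blocking, so this stops when empty */
        return 1;
    }
    return 0;
}
```

Because the byte stays in the pipe until somebody reads it, a signal sent before the other thread reaches 'select' still makes that 'select' return at once, which is exactly the window the first procedure could not cover.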

But, IMO, the need to do this just shows how truly awful the "two threads,
blocking sockets" approach always was. It's just that all the ugly races
were handled for you in the kernel with TCP.

Sebastián Treu

Nov 6, 2009, 4:00:08 PM

Hi,

> If I do change only the BIO from blocking to non-blocking, Will I have to
> use some form of synchronization while accessing the BIO, right ?

Well, maybe this is useful to you: if you are going to use
non-blocking I/O you will need to check the errors. If you don't
synchronize the access you will get abnormal situations. Imagine
thread A reading from a pointer, then thread B writing to the same
pointer, and then thread A checking the result of its read operation
with SSL_get_error() on the pointer already written to by B... The
error reported will not be reliable. You must therefore synchronize
if that scenario could happen in your program.

I used a select() thread with non-blocking I/O just to avoid keeping
the CPU busy in an infinite loop. Something like:

while( alive && CONTINUE )
{
    /* The main client attendance */

    copy = master;
    if ( select(client->fd+1, &copy, NULL, NULL, NULL) == -1 )
        printf("<thread %d>:[ERR]:\tSelect fail\n", tid);
    else
    {
        if ( FD_ISSET(client->fd, &copy) )
        {
            /* read from the secure connection gaining exclusive access */
            /* to the client ssl structure. The 'sender thread' could   */
            /* access this structure colliding with the 'err' value and */
            /* starting a catastrophe.                                  */
            pthread_mutex_lock(&client->mutex[SSL_MUTEX]);
            nbytes = SSL_read(client->ssl, client->buffer, chunk_size);
            err = SSL_get_error(client->ssl, nbytes);
            pthread_mutex_unlock(&client->mutex[SSL_MUTEX]);
            /* (excerpt ends here; nbytes and err are handled next) */
        }
    }
}

You can check the whole threaded server, at an early stage of development, here:

http://code.google.com/p/tellapic/source/browse/trunk/server.c

I hope this helps in some way, and if I'm missing something, please
don't forget to tell me!!

Regards,

PS: this is the final project of my Computer Analyst degree. I'm a
computer science student and I'm not a pro developer. Not even close.
I'm on my way xD

--
If you want freedom, compile the source. Get gentoo.

Sebastián Treu
http://labombiya.com.ar

David Schwartz

Nov 7, 2009, 10:47:18 AM

Sebastián Treu wrote:

Your logic is backwards here. You are trying to decide whether or not to
read data on the decrypted output link, so why are you 'select'ing on the
encrypted input link?

SSL is a state machine, not a filter. The implementation of SSL_read is
*NOT*:
1) Read some data from the socket.
2) If we got any data, decrypt it.
3) Return the data we read.

It is:
1) Try to make forward progress, doing any reads and writes as necessary.
2) If this resulted in any decrypted data, return it.
3) If not, tell the caller why.

As a result, you can only 'select' *after* calling SSL_read, never before.
And you cannot assume that you will be selecting in the read direction,
because either can be necessary.

DS

Sebastián Treu

Nov 7, 2009, 11:59:18 AM

On Sat, Nov 7, 2009 at 12:47 PM, David Schwartz <dav...@webmaster.com> wrote:
> Your logic is backwards here. You are trying to decide whether or not to
> read data on the decrypted output link, so why are you 'select'ing on the
> encrypted input link?
>
> SSL is a state machine, not a filter. The implementation of SSL_read is
> *NOT*:
> 1) Read some data from the socket.
> 2) If we got any data, decrypt it.
> 3) Return the data we read.
>
> It is:
> 1) Try to make forward progress, doing any reads and writes as necessary.
> 2) If this resulted in any decrypted data, return it.
> 3) If not, tell the caller why.
>
> As a result, you can only 'select' *after* calling SSL_read, never before.
> And you cannot assume that you will be selecting in the read direction,
> because either can be necessary.
>
> DS

Hi David,

The main idea was to avoid polling in an infinite loop that consumes CPU
resources. I wrote that code thinking: "If the particular client
socket is calling for our (thread's) attention, then fetch the data". I
took that approach as I don't know another one for non-blocking I/O
without a polling loop. If I loop forever on the SSL_read() function,
the CPU will be kept busy on that job, so I looked for a way of not
having to do so. Instead, something should "inform" me that there is
data ready to be read on that socket.

Mmmh... I can't see how to do it without select(). The important
thing here is that this thread is attending only 1 client. Maybe it's
confusing because "why use select() then if you are always polling
the same I/O socket?". Answer: I don't know if there is another system
call to block until a file descriptor is ready to be read.

That part of the code is threaded, and although you are right in saying
"why should a server have 1,000 threads when it has 1,000
connections", this particular application will be a
very-connection-limited server. For example, 20 clients would be a
huge number of connections. The number of threads is limited by the
number of connections.

Then, if I read first with SSL_read() on non-blocking I/O, every time
the client isn't writing or sending anything, the server is using and
wasting CPU cycles. Without the select() approach and with a maximum
of 32 clients my CPU usage went to 200% (100% per core). With the
select() approach the CPU usage depends on the clients'
reading/writing activity.

I believe you are a more experienced developer than me (in fact, I'm
not what you could call a developer) and, if it's not too much to ask,
how do you solve this kind of problem (without uprooting the
multithreaded server design)? I mean, how can you block execution
waiting for "noise" on the file descriptor before taking some action
without using select()?

I really appreciate your taking the time to point out my errors, and
sorry if this derails the main topic of the thread.
Regards

--
If you want freedom, compile the source. Get gentoo.

Sebastián Treu
http://labombiya.com.ar

David Schwartz

Nov 7, 2009, 12:35:35 PM

Sebastián Treu wrote:

> The main idea was to avoid polling in an infinite loop that consumes CPU
> resources. I wrote that code thinking: "If the particular client
> socket is calling for our (thread's) attention, then fetch the data". I
> took that approach as I don't know another one for non-blocking I/O
> without a polling loop. If I loop forever on the SSL_read() function,
> the CPU will be kept busy on that job, so I looked for a way of not
> having to do so. Instead, something should "inform" me that there is
> data ready to be read on that socket.

Right, but your code calls 'select' even if it doesn't need to read data
from the socket.

> Mmmh... I can't see how to do it without select(). The important
> thing here is that this thread is attending only 1 client. Maybe it's
> confusing because "why use select() then if you are always polling
> the same I/O socket?". Answer: I don't know if there is another system
> call to block until a file descriptor is ready to be read.

You totally missed my point. You are correct that you need to block
somewhere, you are simply blocking in the wrong place for the wrong reason.
The only reason you should ever block using 'select' on an SSL connection is
because the SSL state machine cannot make forward progress until the socket
is ready. But you call 'select' without knowing this.

> Then, if I read first with SSL_read() on non-blocking I/O, every time
> the client isn't writing or sending anything, the server is using and
> wasting CPU cycles. Without the select() approach and with a maximum
> of 32 clients my CPU usage went to 200% (100% per core). With the
> select() approach the CPU usage depends on the clients'
> reading/writing activity.

What? How does calling SSL_read *first* waste CPU cycles? You *cannot* call
'select' until you *know* that you need to call 'select'. The data the SSL
state machine needs to make forward progress may already have been read.

> I believe you are a more experienced developer than me (in fact, I'm
> not what you could call a developer) and, if it's not too much to ask,
> how do you solve this kind of problem (without uprooting the
> multithreaded server design)? I mean, how can you block execution
> waiting for "noise" on the file descriptor before taking some action
> without using select()?

I guess I wasn't clear. The problem is not that you are calling 'select' at
all, the problem is that you are calling 'select' even when you have
absolutely no reason to do so.

Call SSL_read. If you make forward progress, great. If you make no forward
progress, the SSL state machine will tell you why. If, for example, it
returns a 'WANT_READ' indication, then you know that the SSL state machine
cannot make forward progress unless it reads from the socket. Then, and only
then, does it make sense to call 'select'.

Again, you *MUST* get this idea out of your head:
"Read data from socket, decrypt it, pass it to application."
That is *NOT* what SSL_read does. SSL_read is *NOT* a decryption function.
It is an entry point into a state machine that can do all kinds of things,
including reading from the socket.

Here's where your code blows up horribly:

1) You call SSL_write. A renegotiation is in progress, so it reads data from
the socket to see if it can complete the renegotiation. It gets the data
needed to complete the renegotiation and some encrypted application data. It
sends the encrypted data you asked it to, and returns success.

2) You enter your broken read function and call 'select', but the data has
already arrived and been read (in step 1). You deadlock waiting forever for
data that is already here.

Do you see? You cannot call 'select' unless you know for a fact that the SSL
state machine needs to read from the socket. Otherwise you could be waiting
for something that already happened or is not supposed to happen.

Do not "look through" the SSL state machine. Let it do its job.
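
The order of operations described above (try first, wait only when told to, and only in the direction named) might be sketched like this in C. To keep the fragment self-contained it does not call OpenSSL: `io_status`, `try_read_fn`, `read_some` and `plain_try_read` are hypothetical stand-ins. In real code the callback would call SSL_read and map the SSL_get_error result (SSL_ERROR_WANT_READ, SSL_ERROR_WANT_WRITE) onto these values; the demo callback here is only a plain non-blocking read().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for the outcome of SSL_read() + SSL_get_error(). */
enum io_status { IO_DATA, IO_WANT_READ, IO_WANT_WRITE, IO_FATAL };

/* Hypothetical callback type: one non-blocking attempt to make progress. */
typedef enum io_status (*try_read_fn)(void *conn, char *buf, size_t len,
                                      size_t *got);

/* Put a descriptor into non-blocking mode. Returns 0 on success. */
int set_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL, 0);
    return fl == -1 ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/* Block until fd is ready in the single direction that was asked for. */
int wait_fd(int fd, int for_write)
{
    for (;;) {
        fd_set set;
        FD_ZERO(&set);
        FD_SET(fd, &set);
        int rc = for_write ? select(fd + 1, NULL, &set, NULL, NULL)
                           : select(fd + 1, &set, NULL, NULL, NULL);
        if (rc >= 0)
            return 0;
        if (errno != EINTR)
            return -1;
    }
}

/* Try first; call select() only when the state machine says it is stuck,
 * and in the direction it names. Returns bytes read, or -1 on failure. */
ssize_t read_some(try_read_fn try_read, void *conn, int fd,
                  char *buf, size_t len)
{
    assert(try_read != NULL);
    for (;;) {
        size_t got = 0;
        switch (try_read(conn, buf, len, &got)) {
        case IO_DATA:
            return (ssize_t)got;
        case IO_WANT_READ:
            if (wait_fd(fd, 0) < 0) return -1;
            break;
        case IO_WANT_WRITE:
            if (wait_fd(fd, 1) < 0) return -1;
            break;
        default:
            return -1;
        }
    }
}

/* Demo stand-in only: a plain non-blocking read on the descriptor that
 * 'conn' points to. It can never report IO_WANT_WRITE; an SSL-backed
 * callback can, which is why read_some handles both directions. */
enum io_status plain_try_read(void *conn, char *buf, size_t len, size_t *got)
{
    ssize_t n = read(*(int *)conn, buf, len);
    if (n > 0) {
        *got = (size_t)n;
        return IO_DATA;
    }
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return IO_WANT_READ;
    return IO_FATAL;  /* end of file or a hard error */
}
```

With two threads sharing one connection, the attempt inside the callback would still have to run under the connection's mutex, and the wait would have to happen outside it.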

DS

Sebastián Treu

Nov 7, 2009, 1:23:39 PM

Hi David,

Got it. Excellent explanation. I didn't know that about the
state machine. Thanks,

Regards,

--
If you want freedom, compile the source. Get gentoo.

Sebastián Treu
http://labombiya.com.ar

Scott Gifford

Nov 7, 2009, 12:35:04 AM

faturita <rra...@gmail.com> writes:

[...]

> This is all very basic, and it is working but I am aware of the "OpenSSL
> multithreading (or not so) things" and I would like to know if this basic stuff
> needs some form of synchronization

Not quite an answer to your question, but when faced with a similar
problem about a year ago, we rewrote our code to use boost asio, and
were very pleased with the results. It handled the SSL and helped us
get rid of our very resource-hungry one-thread-per-connection model,
without tons of changes outside of the I/O layer.

Hope this is helpful,

----Scott.

Sebastián Treu

Nov 11, 2009, 3:12:32 PM

Continuing with the "true multithreading" idea, I have some doubts
about what the documentation says about implementing static locks. I
read the old book "Network Security with OpenSSL" from O'Reilly, which
covers the OpenSSL 0.9.7 library. I didn't finish it, but I focused on
the API part, chapters 4 and 5.

My questions are about thread safety and the developer's responsibility.
Say that I implement static locking callbacks in my application with
non-blocking BIOs.

Should I still take care to have my own mutexes for locking access
to a client's SSL structure while another thread could access
it with OpenSSL I/O functions?

Do these callbacks only lock specific OpenSSL structures internally,
so that I should still take care of locking when reading/writing
from/to a client?

If these callbacks lock on a write operation, does this mean that I
can't read until the lock is released, even though I'm reading from a
different client than the one I'm writing to?

These 3 questions are related. If the callbacks only lock internal
structures that I/O operations (or anything else) use, then I know
that I must lock the specific client BIO, and while
I'm reading/writing on this BIO another thread could be
reading/writing on another one. It is just that internally it may
block on access to OpenSSL structures through the callbacks mentioned
above. Is this how it works?

Regards,
--
If you want freedom, compile the source. Get gentoo.

Sebastián Treu
http://labombiya.com.ar

David Schwartz

Nov 11, 2009, 4:02:49 PM

Sebastián Treu wrote:

> My questions are about thread safety and the developer's responsibility.
> Say that I implement static locking callbacks in my application with
> non-blocking BIOs.

Yes, you must implement the locking callbacks. OpenSSL uses them to provide
the thread-safety guarantees it provides.

> Should I still take care to have my own mutexes for locking access
> to a client's SSL structure while another thread could access
> it with OpenSSL I/O functions?

Yes. The locking callbacks are used by OpenSSL to protect its internals.
They don't prevent you from screwing up.

> Do these callbacks only lock specific OpenSSL structures internally,
> so that I should still take care of locking when reading/writing
> from/to a client?

They lock internal structures, that is correct.

> If these callbacks lock on a write operation, does this mean that I
> can't read until the lock is released, even though I'm reading from a
> different client than the one I'm writing to?

No, it doesn't. OpenSSL specifically permits concurrent operations on
different objects. It uses the locks internally to make this work even if
those distinct objects internally refer to the same underlying objects (for
example, two SSL connections using the same context).

> These 3 questions are related. If the callbacks only lock internal
> structures that I/O operations (or anything else) use, then I know
> that I must lock the specific client BIO, and while
> I'm reading/writing on this BIO another thread could be
> reading/writing on another one. It is just that internally it may
> block on access to OpenSSL structures through the callbacks mentioned
> above. Is this how it works?

You can access two different SSL connections at the same time. You do not
need to lock the library as a whole. OpenSSL works just like every other
user-space library, in fact, it works just like strings do. One thread can
access one string while another thread accesses some other string. But one
thread cannot read a string while another thread is or might be modifying
that same string.

And note that all BIO/SSL operations, even those with 'read' in their names
are logically modification operations.
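
One way to picture this rule is a mutex per connection object, roughly as in the C sketch below. It is an illustration under assumptions, not OpenSSL code: `struct conn`, `conn_init`, `conn_io` and `run_pair` are hypothetical names, and the counter merely stands in for the internal state that every SSL/BIO call on a connection modifies.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper: each connection carries its own mutex. */
struct conn {
    pthread_mutex_t lock;
    unsigned long   calls;  /* stand-in for state that every call mutates */
};

int conn_init(struct conn *c)
{
    c->calls = 0;
    return pthread_mutex_init(&c->lock, NULL);
}

/* Stand-in for any SSL_read/SSL_write/BIO_* call on this connection.
 * Even a "read" changes the object's state, so it takes the lock. */
void conn_io(struct conn *c)
{
    pthread_mutex_lock(&c->lock);
    c->calls++;
    pthread_mutex_unlock(&c->lock);
}

struct worker_arg {
    struct conn *c;
    int iterations;
};

static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (int i = 0; i < a->iterations; i++)
        conn_io(a->c);
    return NULL;
}

/* Run two threads, one per argument. 'a' and 'b' may be the same
 * connection (the threads serialize on its lock) or different ones
 * (the threads never wait for each other). Returns 0 on success. */
int run_pair(struct conn *a, struct conn *b, int iterations)
{
    pthread_t t1, t2;
    struct worker_arg wa = { a, iterations };
    struct worker_arg wb = { b, iterations };
    assert(a != NULL && b != NULL);
    if (pthread_create(&t1, NULL, worker, &wa) != 0)
        return -1;
    if (pthread_create(&t2, NULL, worker, &wb) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

Calling `run_pair(&x, &x, n)` has both threads share one connection and take turns on its lock, while `run_pair(&x, &y, n)` touches two distinct objects and needs no coordination between the threads. The library's own locking callbacks are a separate matter and would still have to be installed.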

DS

Sebastián Treu

Nov 11, 2009, 6:32:23 PM

Hi David,

Excellent explanation, as usual. Thank you very much.

Regards,
--
If you want freedom, compile the source. Get gentoo.

Sebastián Treu
http://labombiya.com.ar