
Because of multithreading semantics, this is not reliable.

OlafM...@gmail.com

May 3, 2006, 4:09:09 PM
Because of multithreading semantics, this is not reliable. This
sentence is found in the Python documentation for "7.8.1 Queue
Objects".

This scares me! Why would Queue.qsize(), Queue.empty(), and
Queue.full() not be reliable?

Looking at the source code of Queue.py, all 3 calls use a mutex (based
on thread.allocate_lock()). Does this mean that the
thread.allocate_lock() mechanism is not reliable (scary indeed) or does
this have to do with other implementation details?

Many thanks for explaining this mystery.

Olaf

Grant Edwards

May 3, 2006, 4:41:49 PM
On 2006-05-03, OlafM...@gmail.com <OlafM...@gmail.com> wrote:

> Because of multithreading semantics, this is not reliable.
> This sentence is found in the Python documentation for "7.8.1
> Queue Objects".
>
> This scares me! Why would Queue.qsize(), Queue.empty(), and
> Queue.full() not be reliable?

IIRC from the last time this question came up, what the doc
means by "not reliable" is that the result you get is accurate
at the time of the call (for the period the call is inside
the mutex-protected region), but the saved result may not be
correct at some point in the future because some other thread
may have done an operation on the queue.
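
To make that concrete, here is a minimal sketch, assuming Python 3 (where the module is named `queue` rather than `Queue`). The names `snapshot`, `producer`, and `current` are illustrative only. The saved value was correct when qsize() returned it; it simply stops matching the queue once another thread has run:

```python
import queue
import threading

q = queue.Queue()
snapshot = q.qsize()  # correct at the moment of the call

# Another thread modifies the queue after the snapshot was taken.
producer = threading.Thread(target=q.put, args=("item",))
producer.start()
producer.join()

current = q.qsize()  # the saved snapshot no longer matches this value
```

The join() is there only to make the ordering deterministic for the sketch; in real code the other thread can run at any point after the call returns.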

I've argued that the "not reliable" phrase is simply wrong: IMO
the calls _are_ reliable: they always return the correct value
at the time the call was made (for my previous definition of
"at the time the call was made"). That's "reliable" in my book.

I've no idea why anybody would ever expect Queue.qsize() to
return the size of the queue as it was going to be at some
undetermined point in the future.

If we were to use the "not reliable" semantics that are used in
the Queue docs, pretty much everything is "not reliable" in a
multi-threading environment. For example binding a global name
to an object is "not reliable" in a multi-threaded environment
because another thread can re-bind it later to a different
object. I think describing that problem as "global name
binding is not reliable in a multi-threaded environment" is
very misleading.

> Looking at the source code of Queue.py, all 3 calls use a
> mutex (based on thread.allocate_lock()). Does this mean that
> the thread.allocate_lock() mechanism is not reliable (scary
> indeed) or does this have to do with other implementation
> details?

IMO, it has to do with a poor choice of language.

> Many thanks for explaining this mystery.

No problem.

--
Grant Edwards, grante at visi.com
Yow! Mr and Mrs PED, can I borrow 26.7% of the RAYON TEXTILE
production of the INDONESIAN archipelago?

Tim Peters

May 3, 2006, 5:13:33 PM
to pytho...@python.org
[OlafM...@gmail.com]

> Because of multithreading semantics, this is not reliable. This
> sentence is found in the Python documentation for "7.8.1 Queue
> Objects".
>
> This scares me! Why would Queue.qsize(), Queue.empty(), and
> Queue.full() not be reliable?

Because they may not be telling the truth at the instant the _caller_
tries to use the result. I'm not sure why, but people write code like

    if q.empty():
        return

in a thread, and then complain that "it's a bug" if some other thread
of theirs happens to sneak in and add another item to the queue
_between_ the time q.empty() correctly determined that q was empty,
and the time the code generated for "if q.empty()" tests the result.
There's no mutex to stop other threads from running between those
times. The docs could be clearer about this, and "not reliable" had a
stronger meaning in earlier versions of Python.
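
One common way to avoid this check-then-act race is to skip the emptiness test and attempt the removal directly, handling the failure. A minimal sketch, assuming Python 3 (where the module is named `queue` rather than `Queue`); `drain_one` is a hypothetical helper name, not part of the library:

```python
import queue


def drain_one(q, default=None):
    """Return one item from q, or default if q is empty right now."""
    try:
        # get_nowait() tests and removes under the queue's own lock,
        # so no other thread can run between the test and the removal.
        return q.get_nowait()
    except queue.Empty:
        return default


q = queue.Queue()
q.put("job")
first = drain_one(q)   # an item was available
second = drain_one(q)  # nothing left, so the default comes back
```

The point of the design is that the decision and the action are a single call, so there is no saved result that can go stale in between.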

> Looking at the source code of Queue.py, all 3 calls use a mutex (based
> on thread.allocate_lock()). Does this mean that the
> thread.allocate_lock() mechanism is not reliable (scary indeed)

No.

> or does this have to do with other implementation details?

It just has to do with the way threads work, and with trying to
disabuse newbies of faulty common beliefs. It's good to scare
threading newbies away from these methods, because they _don't_ do
what newbies typically assume they do. That puts them in the
"attractive nuisance" category for many people.

OlafM...@gmail.com

May 3, 2006, 5:26:53 PM
Tim and Grant

>>>
    if q.empty():
        return
>>>

Of course your explanation is understood and ideally should be included
as a note in the Python documentation. And the "not reliable" should
be removed from the documentation!

Anyway, many thanks for your explanations (I feel "safer" now).

Olaf

Edward Elliott

May 3, 2006, 6:20:14 PM
Tim Peters wrote:
> That puts them in the
> "attractive nuisance" category for many people.

Argh. That gives me bad flashbacks to my torts final from Mon, which had a
bona-fide "attractive nuisance" problem on it. Damn you, Tim Peters! ;)

Christophe

May 4, 2006, 4:48:33 AM
OlafM...@gmail.com wrote:

You could go as far as to say that since the function return itself
isn't wrapped in a mutex, the value can be obsolete before even the
function returns.

i.e., that code can sometimes return the wrong value:

    def empty(self):
        self.acquire_mutex()
        result = self.count == 0
        self.release_mutex()
        return result  # before that line, some other thread added a value!

Olaf Meding

May 4, 2006, 8:24:32 AM
> return result  # before that line, some other thread added a value!

Sure, but that is the nature of using threads and a mutex. I hope you are
not saying that every function that uses a mutex should have a comment
saying this is not "reliable"?

Olaf


Sergei Organov

May 4, 2006, 8:50:11 AM
to pytho...@python.org
"Olaf Meding" <OlafM...@noSpam.compuserve.com> writes:

Strictly speaking, the mutex has nothing to do with it: if there were no
mutex there, the problem would have been even worse. This is simply the
nature of accessing shared resources from multiple threads.

A somewhat similar example is accessing a file after checking for its
existence in a multi-tasking OS:

    if fileExists(fileName):
        openFile(fileName)

has the same problem: at openFile time one can't rely on the existence
of the file, so the check for existence is useless.
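
The usual alternative is to attempt the open and handle the failure, rather than checking first. A minimal sketch, assuming Python 3.3 or later (for `FileNotFoundError`); `read_if_present` is a hypothetical helper name used only for illustration:

```python
import os
import tempfile


def read_if_present(path):
    """Return the file's text, or None if the file does not exist."""
    try:
        # No separate existence check: the open itself is the test,
        # so there is no gap in which the file can disappear unnoticed.
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return None


with tempfile.TemporaryDirectory() as d:
    present_path = os.path.join(d, "present.txt")
    with open(present_path, "w") as f:
        f.write("hello")
    found = read_if_present(present_path)
    missing = read_if_present(os.path.join(d, "absent.txt"))
```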

--
Sergei.

Christophe

May 4, 2006, 9:26:08 AM
Olaf Meding wrote:

That function can return a value that was already obsolete even before
it finished executing. So the user must take special care either to
cope with that or to make sure that it doesn't happen.

As was said, it's the same reason the user should not do an
"if fileExists()" followed by an openFile, and the same reason there is
a warning in the "os.access" manual.
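
For the "make sure that it doesn't happen" option, one approach is to hold a single lock across both the check and the action. The sketch below assumes Python 3 and uses a hypothetical `GuardedBuffer` class built on `collections.deque`; it is not how the standard Queue class is written, only an illustration of the idea:

```python
import threading
from collections import deque


class GuardedBuffer:
    """Hypothetical helper: one lock covers both the check and the action."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = deque()

    def put(self, item):
        with self._lock:
            self._items.append(item)

    def pop_if_any(self, default=None):
        # The emptiness test and the removal happen under the same lock,
        # so no other thread can change the buffer between the two.
        with self._lock:
            if self._items:
                return self._items.popleft()
            return default


buf = GuardedBuffer()
nothing = buf.pop_if_any()  # buffer is empty, default is returned
buf.put(1)
one = buf.pop_if_any()      # the item put above
```

The caller never sees the result of the emptiness test on its own, so there is nothing that can become obsolete before it is used.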

Grant Edwards

May 4, 2006, 9:38:55 AM
On 2006-05-04, Olaf Meding <OlafM...@noSpam.compuserve.com> wrote:

>> return result  # before that line, some other thread added a value!
>
> Sure, but that is the nature of using threads and a mutex.

Yes.

> I hope you are not saying that every function that uses a
> mutex should have a comment saying this is not "reliable"?

My point exactly.

--
Grant Edwards, grante at visi.com
Yow! My pants just went to high school in the Carlsbad Caverns!!!

OlafM...@gmail.com

May 4, 2006, 10:04:21 AM
Christophe

> Same reason that there is a warning in the "os.access" manual

I understand the "if the file exists, open it" code.

I looked at the os.access documentation and see no "warning" or "not
reliable" wording there.
6.1.4 Files and Directories
access(path, mode)


Olaf

Christophe

May 4, 2006, 10:14:07 AM
OlafM...@gmail.com wrote:

6.1.4 Files and Directories

access( path, mode)

Use the real uid/gid to test for access to path. Note that most
operations will use the effective uid/gid, therefore this routine can be
used in a suid/sgid environment to test if the invoking user has the
specified access to path. mode should be F_OK to test the existence of
path, or it can be the inclusive OR of one or more of R_OK, W_OK, and
X_OK to test permissions. Return True if access is allowed, False if
not. See the Unix man page access(2) for more information. Availability:
Macintosh, Unix, Windows.
Note: Using access() to check if a user is authorized to e.g. open a
file before actually doing so using open() creates a security hole,
because the user might exploit the short time interval between checking
and opening the file to manipulate it.

I call that note a warning.
