python-speech client library error originating from sync_posix.cc


ersc...@google.com

Oct 1, 2020, 1:52:23 PM
to grpc.io

Hello,

I need to understand under what conditions the pthread_mutex_lock check fails.

Context: I'm trying to diagnose a flaky test. Intermittently, one of the Google Cloud Speech API tests fails with the following message:

```
============================= test session starts ==============================
platform linux -- Python 3.7.7, pytest-6.0.1, py-1.9.0, pluggy-0.13.1 -- /workspace/speech/cloud-client/.nox/py-3-7/bin/python
cachedir: .pytest_cache
rootdir: /workspace, configfile: pytest.ini
collecting ... collected 22 items

beta_snippets_test.py::test_transcribe_file_with_enhanced_model PASSED   [  4%]
beta_snippets_test.py::test_transcribe_file_with_metadata PASSED         [  9%]
beta_snippets_test.py::test_transcribe_file_with_auto_punctuation PASSED [ 13%]
beta_snippets_test.py::test_transcribe_diarization PASSED                [ 18%]
beta_snippets_test.py::test_transcribe_multichannel_file E0828 10:27:18.708634490   13111 sync_posix.cc:67]           assertion failed: pthread_mutex_lock(mu) == 0
Fatal Python error: Aborted

Thread 0x00007f2c36e0e600 (most recent call first):
  File "/usr/local/lib/python3.7/codecs.py", line 322 in decode
  File "/workspace/speech/cloud-client/.nox/py-3-7/lib/python3.7/site-packages/_pytest/capture.py", line 484 in snap
  File "/workspace/speech/cloud-client/.nox/py-3-7/lib/python3.7/site-packages/_pytest/capture.py", line 570 in readouterr
  File "/workspace/speech/cloud-client/.nox/py-3-7/lib/python3.7/site-packages/_pytest/capture.py", line 657 in read_global_capture
  File "/workspace/speech/cloud-client/.nox/py-3-7/lib/python3.7/site-packages/_pytest/capture.py", line 718 in item_capture
nox > Command pytest --junitxml=sponge_log.xml failed with exit code -6
nox > Session py-3.7 failed.
```
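A note on the last lines of the log: the `-6` exit code follows the Python `subprocess` convention, where a negative return code `-N` means the child was terminated by signal `N`. On Linux, signal 6 is `SIGABRT`, which matches the `Fatal Python error: Aborted` line. A minimal sketch of decoding such a code (the helper name `describe_exit_code` is made up for illustration):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a subprocess-style return code.

    A negative value -N means the process was killed by signal N.
    """
    if code < 0:
        try:
            name = signal.Signals(-code).name
        except ValueError:
            # The number does not map to a signal known on this platform.
            name = "unknown signal"
        return f"terminated by signal {-code} ({name})"
    return f"exited with status {code}"


# The value reported by nox in the log above.
print(describe_exit_code(-6))
```

This only confirms that the process aborted. It says nothing about why the `pthread_mutex_lock` assertion failed.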


Details:

Richard Belleville

Oct 1, 2020, 2:10:15 PM
to grpc.io
This looks like a system-level failure of pthread_mutex_lock. Unfortunately, the actual return value doesn't appear to be printed here (it would be helpful if we could get that). So the cause could be any of the following:

EINVAL The mutex was created with the protocol attribute having the value PTHREAD_PRIO_PROTECT and the calling thread's priority is higher than the mutex's current priority ceiling.

EINVAL The value specified by mutex does not refer to an initialized mutex object.

EAGAIN The mutex could not be acquired because the maximum number of recursive locks for mutex has been exceeded.

EDEADLK The current thread already owns the mutex.
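If the numeric return value can be captured in a later run, it could be mapped back to the causes listed above. The following is only a sketch: the helper name `explain_mutex_lock_result` and the table are hypothetical, and it assumes the value is a plain errno number, which is how `pthread_mutex_lock` reports errors.

```python
import errno

# Causes listed above, keyed by errno value.
# The exact numbers are platform-specific, so use the errno module's constants.
MUTEX_LOCK_CAUSES = {
    errno.EINVAL: (
        "EINVAL",
        "mutex is not initialized, or a PTHREAD_PRIO_PROTECT "
        "priority ceiling was violated",
    ),
    errno.EAGAIN: ("EAGAIN", "maximum number of recursive locks exceeded"),
    errno.EDEADLK: ("EDEADLK", "current thread already owns the mutex"),
}


def explain_mutex_lock_result(rc: int) -> str:
    """Map a pthread_mutex_lock return value to a cause from the list above."""
    if rc == 0:
        return "0: lock acquired"
    name, cause = MUTEX_LOCK_CAUSES.get(
        rc, ("unlisted", "not one of the causes listed in this thread")
    )
    return f"{rc} ({name}): {cause}"


print(explain_mutex_lock_result(errno.EINVAL))
```

Note that `EINVAL` covers two distinct causes, so the return value alone cannot tell an uninitialized (or already destroyed) mutex apart from a priority-ceiling violation.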

ersc...@google.com

Oct 1, 2020, 2:37:54 PM
to grpc.io
Could this possibly be the result of multiple test runs occurring concurrently?

Richard Belleville

Oct 6, 2020, 1:03:59 PM
to grpc.io
I think that's unlikely. We regularly run the Core tests with thousands of instances in parallel. Were you able to gather any more information on this failure?

Eric Schmidt

Oct 12, 2020, 12:29:37 PM
to grpc.io
We haven't been able to gather any further information yet. However, we've decided to use a different testing strategy that ameliorates this issue.

Thank you for the detailed help and responses!