Is it possible to avoid OpenSLES deadlock?

Oleg Sh

Nov 2, 2014, 10:02:41 PM
to andro...@googlegroups.com
I'm talking about a deadlock like this one:
01-09 04:32:02.710: W/libOpenSLES(30983): frameworks/wilhelm/src/itf/IBufferQueue.c:57: pthread 0x6266ada8 (tid 31066) sees object 0x636f30d0 was locked by pthread 0x62349fa8 (tid 31201) at frameworks/wilhelm/src/android/AudioPlayer_to_android.cpp:1109
Most of the time I see people reporting it when they try to stop playback, but in this example it happens when I try to enqueue a new buffer.

In the code, you can see that it calls interface_lock_exclusive() before fetching your callback, then interface_unlock_exclusive(), and only then invokes your callback. However, if you decide to add a buffer from your own thread at the moment OpenSL holds the lock, IBufferQueue_Enqueue() naturally calls interface_lock_exclusive() as well, and both your thread and OpenSL's thread deadlock immediately, which typically hangs your whole application.

First question is, what kind of locking mechanism is this? Normally, you expect a mutex to block one thread until another one finishes and unlocks, so why would it just hang the thread instead? Am I not understanding something here?

The second question is, is it possible to work around this with 100% certainty? It appears that the only safe place to enqueue new buffers for playback is your callback for EVENT_MORE_DATA. But suppose you're playing a network stream, sending buffers for playback as they arrive, and you set OpenSL to use 3 buffers. And in the middle of playback there is an interruption in the stream, so all 3 buffers get played out and not refilled. Then you finally get a new buffer. But do you immediately enqueue it in OpenSL? If you do, then what do you do with the next buffer? If you wait for the callback on the first buffer before you enqueue the next one, you get jittery playback. If you enqueue the second buffer from your main thread, then you are risking a deadlock because how can you tell if OpenSL's thread currently has the lock in preparation to execute your callback for the first buffer? Your only safe option is to wait until you have all three buffers to enqueue at the same time, which will also exacerbate your gap in playback.

But when it comes to stopping playback, you can't even do that. Are you supposed to design your application in such a way as to only stop playback from the callback function, which in effect means you can't stop immediately and have to wait until at least the current buffer finishes playing? Because if you try to stop from your own thread, once again, you are risking that same deadlock. Its probability seems low, but it will happen often if the application's user base is large enough.

Maybe I just don't understand something about the design of OpenSL, but for now it seems like a very risky API to use if you don't want your application locking up on a regular basis. I would appreciate it if someone would explain how to use it without such risk.

Glenn Kasten

Nov 3, 2014, 10:41:48 AM
to andro...@googlegroups.com
It is possible you have found a bug in the implementation of OpenSL ES on Android, but I am not sure.
There were problems with the mutexes used in the output path earlier,
but as far as I know the output path problems were fixed in more recent platform releases.
So I would like to ask you a few questions to help investigate.
1. Which Android devices and platform version(s) does the problem happen on?
    Are there any Android devices and platform version(s) where the problem does not happen?
2. Do you have a short code sequence that demonstrates the problem?
(please don't attach a large section of code)
Thanks

Oleg Sh

Nov 6, 2014, 3:18:53 PM
to andro...@googlegroups.com
To me, it doesn't look like there is anything wrong with OpenSL itself, but rather with the locking mechanism it uses, which I don't quite understand (I tried to look at the source, but it would probably require more time to figure out than I had). If it used normal mutexes that block a thread until the lock is released, rather than something that completely hangs the thread in case of conflict, then none of these deadlock issues would occur.

1. I can't say which device it is, but it runs Android 4.4.4. I also encountered the same issue over a year ago on multiple 4.1 and 4.2 devices (e.g. Samsung S3 and S4, Motorola Droid Razr, HTC One).

2. I don't have an executable that demonstrates the issue, but I can post some snippets of code that reproduce it for me quite reliably:

This is the callback function registered with the player queue OpenSL object, called for every buffer whose playback is completed:

static void playerQueueCallback(SLAndroidSimpleBufferQueueItf bq, void *context)
{
    FrameRef frame;

    // Release the frame at the front of the playing queue
    playingQueue.dequeue(frame);

    // OpenSL is ready for more frames, do we have any?
    while(overflowQueue.front(frame))
    {
        // Attempt to enqueue directly to OpenSL
        if(enqueueToPlayQueue(frame->data(), frame->size()))
        {
            // Success, remove from local queue
            overflowQueue.dequeue(frame);

            // Push frame to the playing queue
            playingQueue.enqueue(frame);
        }
        else
        {
            // Failed, OpenSL queue must be full -> leave frame in local queue and break
            break;
        }
    }
}

This is the function used to enqueue new buffers with OpenSL:

bool enqueueToPlayQueue(const void *buffer, unsigned int size)
{
    if(g_playerQueue == 0)
    {
        return false;
    }
    else
    {
        SLresult result;

        result = (*g_playerQueue)->Enqueue(g_playerQueue, buffer, size);

        if(result != SL_RESULT_SUCCESS)
        {
            if(result == SL_RESULT_BUFFER_INSUFFICIENT)
            {
                LOGD("OpenSL queue is full");
            }
            return false;
        }
        else
        {
            return true;
        }
    }
}

Note that this is the key to reproducing the deadlock. This function is called both from the callback, when it is virtually guaranteed there is space in the OpenSL queue, and from the application's own thread as soon as a packet arrives from the network. Even if the OpenSL queue is full, it will attempt to enqueue a new buffer, and if that fails (return of SL_RESULT_BUFFER_INSUFFICIENT), that buffer goes into the overflow queue, to be enqueued later when space becomes available. In effect, the enqueue happens more often than it needs to, and there is a certain probability that an enqueue from the application's own thread will coincide with the lock being held in OpenSL's thread, which leads to a deadlock like the one I pasted in the original post.

Just in case, this is the initialization code for OpenSL playback:

bool createPlayer()
{

    SLresult result;

    // Create and realize engine, and get interface
    result = slCreateEngine(&g_engineObj, 0, 0, 0, 0, 0);
    if(result != SL_RESULT_SUCCESS)
    {
        return false;
    }
    else
    {
        (*g_engineObj)->Realize(g_engineObj, SL_BOOLEAN_FALSE);
        (*g_engineObj)->GetInterface(g_engineObj, SL_IID_ENGINE, &g_engine);
    }

    result = (*g_engine)->CreateOutputMix(g_engine, &g_outputMixObj, 0, 0, 0);
    if(result != SL_RESULT_SUCCESS)
    {
        return false;
    }
    else
    {
        (*g_outputMixObj)->Realize(g_outputMixObj, SL_BOOLEAN_FALSE);
        (*g_outputMixObj)->GetInterface(g_outputMixObj, SL_IID_OUTPUTMIX, &g_outputMix);
    }

    // Define audio source for playback (queue - PCM:Mono/8kHz/16-bit/little endian)
    SLDataLocator_AndroidSimpleBufferQueue playerQueueLocator = { SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE,
                                                                  PLAY_QUEUE_SIZE };
    SLDataFormat_PCM formatPcm = { SL_DATAFORMAT_PCM,
                                   1, /* Mono */
                                   SL_SAMPLINGRATE_8,
                                   SL_PCMSAMPLEFORMAT_FIXED_16,
                                   SL_PCMSAMPLEFORMAT_FIXED_16,
                                   SL_SPEAKER_FRONT_CENTER,
                                   SL_BYTEORDER_LITTLEENDIAN };
    SLDataSource audioSrc = {&playerQueueLocator, &formatPcm};

    // Define audio sink for playback
    SLDataLocator_OutputMix outputMixLocator = { SL_DATALOCATOR_OUTPUTMIX, g_outputMixObj };
    SLDataSink audioSnk = {&outputMixLocator, 0};

    // Create and realize player, and get interface
    const SLInterfaceID ids[2] = { SL_IID_ANDROIDSIMPLEBUFFERQUEUE, SL_IID_ANDROIDCONFIGURATION };
    const SLboolean reqs[2] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE };

    result = (*g_engine)->CreateAudioPlayer(g_engine, &g_playerObj, &audioSrc, &audioSnk, 2, ids, reqs);
    if(result != SL_RESULT_SUCCESS)
    {
        return false;
    }
    else
    {
        SLAndroidConfigurationItf playerConfig;
        result = (*g_playerObj)->GetInterface(g_playerObj, SL_IID_ANDROIDCONFIGURATION, &playerConfig);
        if(SL_RESULT_SUCCESS == result)
        {
            SLint32 streamType = SL_ANDROID_STREAM_VOICE;
            (*playerConfig)->SetConfiguration(playerConfig, SL_ANDROID_KEY_STREAM_TYPE, &streamType, sizeof(SLint32));
        }

        (*g_playerObj)->Realize(g_playerObj, SL_BOOLEAN_FALSE);
        (*g_playerObj)->GetInterface(g_playerObj, SL_IID_PLAY, &g_play);

        // Get interface and register callback for player queue
        (*g_playerObj)->GetInterface(g_playerObj, SL_IID_ANDROIDSIMPLEBUFFERQUEUE, &g_playerQueue);
        (*g_playerQueue)->RegisterCallback(g_playerQueue, playerQueueCallback, 0);

        return true;
    }
}

Oleg Sh

Nov 7, 2014, 5:01:28 PM
to andro...@googlegroups.com
I think I know what the issue is. The lock is fine, but the deadlock is caused by the combination of mutexes used by my app and OpenSL. Typically, you want to protect your app resources by placing mutexes in callbacks and other functions that access OpenSL. There are two callbacks you might use: 1) the callback you register with the player queue interface to be notified when a buffer has finished playing and 2) the callback you register with the play interface to be notified of events such as SL_PLAYEVENT_HEADATEND.

So you might have something like this in your app:

void enqueue_buffer(void *buffer)
{
    lock_mutex();
    // post buffer to OpenSL
    unlock_mutex();
}

void queue_callback()
{
    lock_mutex();
    // finished playing a buffer, try to enqueue another one if available
    unlock_mutex();
}

void play_callback()
{
    lock_mutex();
    // OpenSL is out of buffers to play, stop playback or do whatever else the app needs in this case
    unlock_mutex();
}

This design looks reasonable, but it proves to be fatal in one particular case. If we look at audioTrack_callBack_pullFromBuffQueue (http://androidxref.com/4.4.4_r1/xref/frameworks/wilhelm/src/android/AudioPlayer_to_android.cpp#1089), we see that for case android::AudioTrack::EVENT_MORE_DATA, if the queue isn't empty (i.e. this isn't the last buffer that finished playing), what it does is roughly:
1. lock queue
2. update queue
3. get queue callback
4. unlock queue
5. execute queue callback
Therefore, when the app's play queue callback is executed, OpenSL doesn't hold a lock.

However, in the situation when this was the final buffer in the queue and the queue is now empty, it goes to the other condition:
1. lock queue
2. call audioPlayer_dispatch_headAtEnd_lockPlay http://androidxref.com/4.4.4_r1/xref/frameworks/wilhelm/src/android/AudioPlayer_to_android.cpp#402
    3. get play callback
    4. execute play callback with SL_PLAYEVENT_HEADATEND
5. stop AudioTrack
6. unlock queue

So when the play callback is executed, OpenSL has the lock on the queue. Thus, if you try to enqueue at the same time as OpenSL tries to execute your play callback, you end up in a deadlock like the one in the opening post.
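
To make the lock inversion concrete, here is a minimal sketch of it in standard C++. It does not use OpenSL at all: queue_lock and app_mutex are hypothetical stand-ins for OpenSL's queue lock and the app's own mutex, and timed locks are used only so the demonstration terminates instead of hanging.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-ins: queue_lock models the lock OpenSL takes on the
// buffer queue, app_mutex models the application's own mutex.
static std::timed_mutex queue_lock;
static std::timed_mutex app_mutex;

// Returns true if both threads failed to get their second lock, which means
// the same sequence with plain blocking locks would never finish.
bool would_deadlock()
{
    std::atomic<int> holding{0};
    std::atomic<int> finished{0};
    bool opensl_blocked = false;
    bool app_blocked = false;

    auto attempt = [&](std::timed_mutex &first, std::timed_mutex &second,
                       bool &blocked) {
        std::lock_guard<std::timed_mutex> hold(first);
        ++holding;
        // Wait until both threads hold their first lock.
        while (holding.load() < 2) std::this_thread::yield();
        if (second.try_lock_for(std::chrono::milliseconds(50)))
            second.unlock();
        else
            blocked = true;
        ++finished;
        // Keep the first lock until both threads have made their attempt.
        while (finished.load() < 2) std::this_thread::yield();
    };

    // "OpenSL" thread: holds the queue lock, then the play callback wants
    // the app mutex.
    std::thread opensl([&] { attempt(queue_lock, app_mutex, opensl_blocked); });
    // App thread: holds the app mutex, then Enqueue wants the queue lock.
    std::thread app([&] { attempt(app_mutex, queue_lock, app_blocked); });
    opensl.join();
    app.join();
    return opensl_blocked && app_blocked;
}
```

Each thread takes the two locks in the opposite order, so once both hold their first lock neither can ever get the second one.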

Of course, an app can work around this, but I still think it's a flaw in OpenSL. If your callback is executed while OpenSL holds a lock, the callback becomes inherently unsafe for you. If you try to use any mutex within it, you risk a deadlock, and if you don't use a mutex, you risk some sort of corruption of internal state of your app. Thus, you are forced into a more convoluted design to avoid both these issues or you need to forgo the use of the play callback (you don't really need the head at end event, you can count the buffers yourself).
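
As a rough illustration of counting the buffers yourself instead of relying on the play callback, a sketch could look like the following. InFlightCounter and its method names are hypothetical and not part of OpenSL ES; the sketch assumes the queue callback fires once for every buffer that was successfully enqueued.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper, not part of OpenSL ES: counts buffers handed to the
// player that have not finished playing yet. It uses an atomic instead of a
// mutex, so it is safe to touch from inside a callback.
class InFlightCounter {
public:
    // Call just before Enqueue(), so the count is never behind the player.
    void about_to_enqueue() { count_.fetch_add(1); }

    // Call if Enqueue() failed, to undo about_to_enqueue().
    void enqueue_failed() { count_.fetch_sub(1); }

    // Call from the buffer queue callback. Returns true if this was the
    // last outstanding buffer, i.e. the point where the app would otherwise
    // have waited for SL_PLAYEVENT_HEADATEND.
    bool on_played()
    {
        int previous = count_.fetch_sub(1);
        assert(previous > 0);
        return previous == 1;
    }

    int in_flight() const { return count_.load(); }

private:
    std::atomic<int> count_{0};
};
```

When on_played() returns true the app knows playback has run dry without registering a play callback at all, so nothing of the app's runs while OpenSL holds its queue lock.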

In any case, I would suggest a change in audioPlayer_dispatch_headAtEnd_lockPlay() and/or audioTrack_callBack_pullFromBuffQueue() to make sure that the lock on the queue is released before the play callback is executed.
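
The general pattern I have in mind is to copy the callback while holding the lock and invoke it only after the lock is released. The sketch below shows that pattern in standard C++; CallbackSlot is a hypothetical class for illustration and is not the actual OpenSL code or a proposed patch.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical illustration of "unlock before calling back".
class CallbackSlot {
public:
    void set_callback(std::function<void()> cb)
    {
        std::lock_guard<std::mutex> guard(mutex_);
        callback_ = std::move(cb);
    }

    void dispatch()
    {
        std::function<void()> cb;
        {
            // Hold the lock only long enough to copy the callback.
            std::lock_guard<std::mutex> guard(mutex_);
            cb = callback_;
        }
        // The lock is released here, so the callback may take its own
        // mutexes or call back into this object without deadlocking.
        if (cb) cb();
    }

private:
    std::mutex mutex_;
    std::function<void()> callback_;
};
```

With this shape, a callback that calls set_callback() (or anything else that takes the same lock) completes normally, whereas it would hang if dispatch() invoked it with the lock still held.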

Glenn Kasten

Nov 10, 2014, 9:58:45 PM
to andro...@googlegroups.com
Oleg,
Thanks for your excellent analysis, in particular about the SL_PLAYEVENT_HEADATEND callback. I'm away from my computer and source code at the moment,
but as soon as I get back to it I'll confirm your analysis and then try to post
a source code patch asap. Unfortunately I am unable to post binary patches
or make commitments on binaries, but I'll definitely try to get that addressed.
Again, please look for another post from me later this week,
and feel free to nag if I don't.
Thanks,
Glenn

Glenn Kasten

Nov 21, 2014, 7:43:55 PM
to andro...@googlegroups.com
Please see https://code.google.com/p/android/issues/detail?id=80436
"OpenSL ES calls SL_PLAYEVENT_HEADATEND callback handler with internal mutex locked"

Thanks again Oleg, and please let me know if the suggested workaround applies.