Crash after primo::codecs::MediaBuffer::release()

3 views
Skip to first unread message

sander.borsboom

unread,
Jan 23, 2014, 4:21:09 AM1/23/14
to avblocks...@googlegroups.com
Hi,

when debugging a server crash we found a crash that seems to originate from a release() call on a MediaBuffer instance. I wouldn't be surprised if the source is something stupid done by us, but I can't see anything obvious so I was wondering if you could help us find out what we did wrong.


First some software details:

OS:
~$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 7.2 (wheezy)
Release:        7.2
Codename:       wheezy

AVBlocks:
Version: 1.8.0.
Edition: Debian 6
Bitness: 64-bit
~$ md5sum libAVBlocks64.so 
a27ca78560a53808a1abac7fa0a617b3  libAVBlocks64.so


The gdb stack that seems to indicate the problem occurring somewhere in libAVBlocks64.so:

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/java -cp <SNIP>.
Program terminated with signal 6, Aborted.
#0  0x00007fa827ce5475 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007fa827ce5475 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fa827ce86f0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fa827d2052b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fa827d29d76 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007fa827d2eaac in free () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007fa7fd68b059 in ?? () from /usr/share/cm_server/native/libAVBlocks64.so
#6  0x00007fa7fc10bf69 in Mpeg4Prototype::~Mpeg4Prototype (this=0x40c9be60, __in_chrg=<optimized out>) at /root/avblocks/AVBlocksJNI/AVBlocksJNI/Mpeg4Prototype.cpp:121
#7  0x00007fa7fc102ab2 in Java_camMan_shared_io_video_AvBlockVideoDecoder_destroyDecoder (pointer=1086963296) at /root/avblocks/AVBlocksJNI/AVBlocksJNI/camMan_shared_io_video_AvBlockVideoDecoder.cpp:90
#8  0x00007fa822fa08c4 in ?? ()
#9  0x000000074ae009a0 in ?? ()
#10 0x00007fa8231960a4 in ?? ()
#11 0x000000074b203358 in ?? ()
#12 0x000000074af65a30 in ?? ()
#13 0x0000000200000000 in ?? ()
#14 0x00000007991dbd78 in ?? ()
#15 0x00007fa7d869c6b0 in ?? ()
#16 0x0000000000000000 in ?? ()


The code in question is in the deconstructor of Mpeg4Prototype, with the following call:

_mediaBuffer->release();

Where _mediaBuffer is an instance of primo::codecs::MediaBuffer.

We can't share the dump since it in essence will contain confidential information like passwords, but please tell us if you need anything more from it.

Greetings,

Sander Borsboom

sander.borsboom

unread,
Jan 23, 2014, 5:52:08 AM1/23/14
to avblocks...@googlegroups.com
Ok, it seems that this issue might be related to the Mpeg4 decoding issue:
  • If we try to decode one of the bad streams
  • We get the facility:12 code:1 error
  • In older code we keep trying to decode, resulting in a "buffer is full" or something like that error (working on reproducing it to get more specific info)
  • 5+ frames after the original facility:12 code:1 error, we get a crash when MediaBuffer::release() is called. Before this call, MediaBuffer::retainCount() is 1, so we aren't calling release() too many times it seems.
So I am now creating a little test program to show this crash and will send this later today.

Op donderdag 23 januari 2014 10:21:09 UTC+1 schreef sander.borsboom:

Svilen Stoilov

unread,
Jan 23, 2014, 6:41:05 AM1/23/14
to avblocks...@googlegroups.com
Ok Sander,

A test program will help a lot to diagnose the problem.
In general MediaBuffer::release() would crash only if the object is already deleted or damaged.
As a rule - each call to avb_create_media_buffer (or Library::createMediaBuffer) must be paired with a single call to MediaBuffer::release.
You may be aware of this but note that when you attach a MediaBuffer to MediaSample with MediaSample::setBuffer() the MediaBuffer refcount is increased.

Another thing that comes to my mind is the possibility to overwrite some internals of the MediaBuffer object when it's managed and then when it is finally  released the crash is the best thing that can happen. This kind of crash may happen if you manually move/copy data in the MediaBuffer internal buffer. Or even if you use the helper class primo::codecs::Buffer.

Best Regards,
Svilen

Svilen Stoilov

unread,
Jan 23, 2014, 6:55:49 AM1/23/14
to avblocks...@googlegroups.com
I remember that we discussed a similar problem with MediaBuffer in the past:
Here is a previous issue. You may check if the current one is the same:

"This specific assertion in MediaBuffer was due to improper call of MediaBuffer::setData through the helper Buffer class which is implemented in PrimoAV.h.
The call to Buffer::push() is made in the AVBlocksJNITestUtil  code. Data passed to Buffer::push() must not exceed Buffer::freeLinearSpace(). It may happen so that the free space in the buffer is enough to accommodate new data but it is not linear – in this case a call to Buffer::normalize() is required before Buffer::push(). Ok, I see that the Buffer helper class is not making things much easier but I’m trying to explain the reason for the crash. So I can recommend that you do not use the Buffer helper class in PrimoAV.h but instead work directly with the MediaBuffer interface (you may still want to re-use some parts of the Buffer implementation  in PrimoAV.h)."



On Thursday, January 23, 2014 12:52:08 PM UTC+2, sander.borsboom wrote:

Sander Borsboom

unread,
Jan 23, 2014, 9:11:46 AM1/23/14
to Svilen Stoilov via AVBlocks Support, avblocks...@googlegroups.com
Hi Svilen,

I just send an email to support with a test program demonstrating the crash.

When unpacked:
  • cd to the build dir
  • cmake ../
  • make
  • ./stage/bin/CrashTestRunner
The code should also run on Windows: use cmake ..\ to create Visual Studio project files and you have to possibly copy the testdata to the build dir and the windows library to stage\bin\DEBUG\.

The interesting stuff happens in BaseDecoder.cpp, specifically checkBufferSize() which creates new mediaBuffer instances when the old ones are too small to contain new frames. This should happen twice: with teh first frame and the frame where it crashes.

I have included the output of a run on my machine as an attachment to this message.

I'll send a new message with some analyses and answer to your suggestions.




2014/1/23 Svilen Stoilov via AVBlocks Support <avblocks-support+noreply-APn2wQe...@googlegroups.com>

--
You received this message because you are subscribed to the Google Groups "AVBlocks Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to avblocks-suppo...@googlegroups.com.
To post to this group, send email to avblocks...@googlegroups.com.
Visit this group at http://groups.google.com/group/avblocks-support.
For more options, visit https://groups.google.com/groups/opt_out.



--
Met vriendelijke groet / Kind regards,

Sander Borsboom
Technical manager video analytics

Cameramanager.com     Office:       +31(0)88-006.84.50 / +31(0)88-006.84.58 (direct)
Hogehilweg 19               Tech:        sup...@cameramanager.com
1101 CB Amsterdam      Email:       sander....@cameramanager.com
The Netherlands            Site:          www.cameramanager.com
output.

Sander Borsboom

unread,
Jan 23, 2014, 9:27:24 AM1/23/14
to Svilen Stoilov via AVBlocks Support, avblocks...@googlegroups.com
I checked for too many releases first, but I think I don't see that happen in the example crash, but I might be wrong of course :).
There are two locations where we create the buffer: The constructor and in checkBuffer if we need a bigger buffer.
There are also two locations where we release the buffer: the deconstructor (through the AutoRelease) and the checkBuffer function through the AutoRelease::reset() and MediaSample::setBuffer(). (And a nice memory leak for when this function fails after creation of this buffer, but before the reset()).

I print the retainCount() before both releases in the checkBuffer function and as a result of the MediaSample::setBuffer() function. In both cases that checkBuffer is called, the retainCount will reach 0 during the setBuffer, resulting in the clean-up. The first times this succeeds, the second time, at sample 9, this fails.

Most of our calls are close to their Buffer:: equivalents, except for far less error checking, I couldn't find any obvious differences.

Greetings,

Sander


2014/1/23 Sander Borsboom <sander....@cameramanager.com>

Svilen Stoilov

unread,
Jan 23, 2014, 9:46:11 AM1/23/14
to avblocks...@googlegroups.com
Ok, we'll look at this issue.
However we haven't received your test code (project). Can you verify that you have attached it?
Or may be the gmail filters are stopping the attachment if it contains a binary...

Regards,
Svilen

Svilen Stoilov

unread,
Jan 24, 2014, 8:35:09 AM1/24/14
to avblocks...@googlegroups.com
Hi Sander,

Your current problem is the same thing we've discussed before - it's how you manage MediaBuffer (see my previous post).
To be more precise the problem is in the following snippet in BaseDecoder.cpp, addToBuffer():

if( _mediaBuffer->capacity() >= newDataSize )
{
   memcpy
(_mediaBuffer->data() + _mediaBuffer->dataSize(), frame->data_, frame->dataSize_ );
   _mediaBuffer
->setData( 0, newDataSize );
}


In this code the MediaBuffer is corrupted by the unchecked memcpy and this causes the crash when this object is released later.
The data in the MediaBuffer may not start form the beginning of the allocated internal buffer. It may be like this:
[ free_space_1 | data | free_space_2 ]
Here MediaBuffer capacity is (free_space_1+ data + free_space_2).
Therefore you cannot append new data after the 'data' segment just because the capacity is large enough.
You need first to defragment/normalize the buffer so that the free space becomes linear: [ data | free_space ]

if( _mediaBuffer->capacity() >= newDataSize )
{
   
if (_mediaBuffer->dataOffset() > 0)
   
{
      memmove
(_mediaBuffer->start(), _mediaBuffer->data(), _mediaBuffer->dataSize());
      _mediaBuffer
->setData(0, _mediaBuffer->dataSize());
   
}
   memcpy
(_mediaBuffer->data() + _mediaBuffer->dataSize(), frame->data_, frame->dataSize_ );
   _mediaBuffer
->setData( 0, newDataSize );
}

We recognize that the MediaBiffer operations are very basic and the helper class primo::codecs::Buffer is not good enough.
We intend to improve the MediaBuffer interface by providing richer and safer operations like: extending media buffer size, automatic defragmenting and similar.
However we will not remove the current low level interface which provides direct access to buffer and data.

I also want to make a few other points related to your test code:
1. In cmutil.cpp, printError() function there's a potential crash in the expression:

std::wcout << e->message()

because ErrorInfo::message() can be NULL. You need either to check explicitily for this or use primo::ustring like this:

std::wcout << primo::ustring(e->message())

This will not crash because we provide an overloaded operator<< for primo::ustring which checks for NULL.
In essence the Linux and non-Linux code can be equivalent.

2. After you receive error facility:12, code:1 there's nothing you can do to continue decoding with the current Transcoder. Solving this error is another issue in the mpeg-4 visual decoder. We discussed it briefly and we'll work for the best solution - we'll try to conceal the error in the stream and only if that's not possible we'll go to imperfect decoding without dropping the frame and without triggering an error.

Best Regards,
Svilen

Sander Borsboom

unread,
Feb 3, 2014, 8:24:22 AM2/3/14
to Svilen Stoilov via AVBlocks Support, avblocks...@googlegroups.com
Hi Svilen,

thank you for the explanation, I was on vacation last week, but my colleague Bernard implemented your MediaBuffer check and some code to stop it from decoding after an error and that seems to have solved the problem.

The reason I didn't think the normalization was necessary was because our code never changes the data offset: we always use MediaBuffer->setData( 0, X ). My assumption was that nothing else would change these offsets, so if we also never change them, we can assume the offset to always be 0 and thus that capacity() - dataSize() = space remaining. I can see that this assumption is very brittle, so I'll check the code for any other occurrences of this assumption and remove them. But just so I understood it correctly: this assumption is incorrect as for example the transcoder->push() will change the data offset?

Other points:

e->message() returning null: this seems to have been the general problem causing the crashes, sometimes in combination with us continuing to try to decode after an error, resulting in the end with an error with a null message. We fixed both problems and the first tests have been running a week now wtihtout crashes, which is a very good sign ;)

Kind Regards,

Sander Borsboom




2014-01-24 Svilen Stoilov via AVBlocks Support <avblocks-support+noreply-APn2wQe...@googlegroups.com>:

--
You received this message because you are subscribed to the Google Groups "AVBlocks Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to avblocks-suppo...@googlegroups.com.
To post to this group, send email to avblocks...@googlegroups.com.
Visit this group at http://groups.google.com/group/avblocks-support.
For more options, visit https://groups.google.com/groups/opt_out.

Svilen Stoilov

unread,
Feb 3, 2014, 10:59:14 AM2/3/14
to avblocks...@googlegroups.com
Hi Sander,

About the data in MediaBuffer: it's not guaranteed that data offset is 0 after Transcoder::push().
If the whole data in MediaBuffer is consumed by the Transcoder then data offset is 0. This happens most of the time.
However it's possible that Transcoder consumes only part of the data in MediaBuffer in which case data offset is greater than 0 and data is not automatically 'normalized'.
The documentation on Transcoder::push() is wrong when it states that the whole MediaBuffer is consumed.
Basically when you have some data in the buffer you need to check the data offset if you want to append more data to MediaBuffer.
Then depending on the available free space you may use MediaBuffer without modification, normalize it first or allocate a new larger MediaBuffer.

Best Regards,
Svilen


On Monday, February 3, 2014 3:24:22 PM UTC+2, sander.borsboom wrote:
Hi Svilen,

thank you for the explanation, I was on vacation last week, but my colleague Bernard implemented your MediaBuffer check and some code to stop it from decoding after an error and that seems to have solved the problem.

The reason I didn't think the normalization was necessary was because our code never changes the data offset: we always use MediaBuffer->setData( 0, X ). My assumption was that nothing else would change these offsets, so if we also never change them, we can assume the offset to always be 0 and thus that capacity() - dataSize() = space remaining. I can see that this assumption is very brittle, so I'll check the code for any other occurrences of this assumption and remove them. But just so I understood it correctly: this assumption is incorrect as for example the transcoder->push() will change the data offset?

Other points:

e->message() returning null: this seems to have been the general problem causing the crashes, sometimes in combination with us continuing to try to decode after an error, resulting in the end with an error with a null message. We fixed both problems and the first tests have been running a week now wtihtout crashes, which is a very good sign ;)

Kind Regards,

Sander Borsboom


Reply all
Reply to author
Forward
0 new messages