Re: Decoding with push and pull


VK

Jun 12, 2013, 12:08:17 PM
to avblocks...@googlegroups.com
Are you using the .NET or the C++ API?

On Wednesday, June 12, 2013 1:31:21 AM UTC-7, Bernard Maassen wrote:
Hi,

I'm evaluating this library and was wondering if the following is possible: decoding a stream frame by frame, where I supply the frames.

I've tried to perform it in the following way:
1) create transcoder
2) create videostreaminfo
3) configure stream info, but don't attach a stream/file
4) add this config to the transcoder inputs

5) create videostreaminfo
6) configure it with same framerate and dimensions, but stream type uncompressed

7) open transcoder

8) push frame data
9) if the push is successful, pull decoded frames

However, when I run this code I get the following error:
Assertion failed
File: Transcoder.cpp
Line: 2110
Expression: stream

VK

Jun 12, 2013, 10:01:38 PM
to avblocks...@googlegroups.com
Unfortunately Transcoder::push and Transcoder::pull cannot be used together. When pushing, you need to provide a stream or file on the output socket; when pulling, you have to do the same on the input socket. You can find more details in the API docs for MediaSocket and Transcoder.
 
Basically the way to go is to create your own custom implementation of primo::Stream and use that either for the input or the output socket depending on whether you use pull or push in the transcoder. We can provide some sample code for that if needed.
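The custom-stream idea can be sketched in plain C++. This is only an illustration of the mechanics (data is appended on one side and consumed sequentially on the other, as a file stream would be); the actual primo::Stream interface has its own method signatures, and the class name here is hypothetical:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Minimal in-memory stream sketch: append bytes on one side,
// read them back sequentially on the other.
class MemoryBuffer
{
public:
    // Append incoming bytes (e.g. a compressed frame) to the buffer.
    void append(const uint8_t* data, size_t size)
    {
        buffer_.insert(buffer_.end(), data, data + size);
    }

    // Read up to 'size' bytes into 'dest'; returns the number of bytes read.
    size_t read(uint8_t* dest, size_t size)
    {
        size_t available = buffer_.size() - pos_;
        size_t count = size < available ? size : available;
        std::memcpy(dest, buffer_.data() + pos_, count);
        pos_ += count;
        return count;
    }

    size_t size() const { return buffer_.size(); }

private:
    std::vector<uint8_t> buffer_;
    size_t pos_ = 0;
};
```

A real implementation of primo::Stream would additionally implement the interface's open/close/seek and reference-counting methods as documented in the SDK.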

Bernard Maassen

Jun 13, 2013, 3:34:00 AM
to avblocks...@googlegroups.com
I'm using the C++ API.

If you could send me some sample code where push is used, that would be great.

Bernard Maassen

Jun 13, 2013, 3:44:50 AM
to avblocks...@googlegroups.com

I've just tried adding a dummy stream (no actual code in the class, just empty methods) to my output socket, but I'm still getting the same error.

VK

Jun 13, 2013, 11:08:28 AM
to avblocks...@googlegroups.com
OK. This might be a bug. We need about a day to investigate and to create a sample for you. Can you give some details about the type of your input: format, bitrate, resolution, is it video+audio or video only, etc? That will help the troubleshooting.

Bernard Maassen

Jun 14, 2013, 3:33:41 AM
to avblocks...@googlegroups.com
It was an elementary H.264 stream without any audio; the resolution was 640x480.

Svilen Stoilov

Jun 18, 2013, 10:34:54 AM
to avblocks...@googlegroups.com
Hi Bernard,
 
Can you give us more information about the data flow in your scenario?

Does the frame data that you pass to Transcoder::push() represent
a) whole compressed frames, or
b) arbitrary consecutive chunks of the H.264 stream?
 
Thanks,
Svilen Stoilov

Bernard Maassen

Jun 25, 2013, 3:07:24 AM
to avblocks...@googlegroups.com
Hi Svilen,

I will put a whole frame into each push. Should it include the start code (0x000001)?

Svilen Stoilov

Jun 26, 2013, 6:15:55 AM
to avblocks...@googlegroups.com
Hi Bernard,
 
The input format (VideoStreamInfo::streamType) must be StreamType::H264 (defined in PrimoAV.h).
 
By default (when VideoStreamInfo::streamSubType is not defined) our h264 decoder expects the start code (0x000001 or 0x00000001) so you need to include it in the bitstream if it's not present.
 
The decoder can also handle frames without start code but in this case the stream subtype (VideoStreamInfo::streamSubType) must be set explicitly to StreamSubType::AVC1 (also defined in PrimoAV.h).
This is the case when the H.264 frames come from an MP4 container.
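The start-code handling described above can be sketched with a small helper. This is a plain C++ illustration, not AVBlocks API; the function names are hypothetical:

```cpp
#include <cstdint>
#include <vector>

// Returns true if the frame already begins with an Annex B start code
// (0x000001 or 0x00000001).
bool hasStartCode(const std::vector<uint8_t>& frame)
{
    if (frame.size() >= 4 &&
        frame[0] == 0 && frame[1] == 0 && frame[2] == 0 && frame[3] == 1)
        return true;
    if (frame.size() >= 3 &&
        frame[0] == 0 && frame[1] == 0 && frame[2] == 1)
        return true;
    return false;
}

// Prepends the 3-byte start code when it is missing, so the frame can be
// fed to a decoder that expects Annex B framing.
std::vector<uint8_t> withStartCode(const std::vector<uint8_t>& frame)
{
    if (hasStartCode(frame))
        return frame;
    std::vector<uint8_t> out = {0x00, 0x00, 0x01};
    out.insert(out.end(), frame.begin(), frame.end());
    return out;
}
```

With frames from an MP4 container (AVC1 subtype, length-prefixed NAL units) the subtype should be set explicitly instead, as described above.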
 
I'm investigating your required scenario (pushing H.264 frames to AVBlocks for decoding).
Currently there's a problem in AVBlocks when compressed video data is passed in push mode.
So even if you set the proper types specified above, push mode will not work for H.264 decoding.
I'm working to fix this and I believe we need a few days to do it.
We'll release a new version of AVBlocks and we'll provide you with a small console app that simulates this scenario.
 
I have another question related to your input data:
Do you know the video format in advance: frame size (width x height) and frame rate (fps)?
Or do you need to detect it the moment you start receiving the H.264 frames?
 
Thanks,
Svilen Stoilov

Bernard Maassen

Jun 26, 2013, 10:22:03 AM
to avblocks...@googlegroups.com
Hi Svilen,

It would be really nice if the frame size could be parsed automatically.
The frame rate is something we don't know. This is because we're working with live streams, and the frame rate we request differs most of the time from what we actually receive.

Svilen Stoilov

Jul 5, 2013, 5:43:24 AM
to avblocks...@googlegroups.com
Hi Bernard,
 
We published a new release - AVBlocks 1.4: http://www.avblocks.com/download-try/
 
It includes a couple of new samples and bugfixes: http://blog.avblocks.com/avblocks-1-4-now-available/
 
Specifically, the PullPushDecoder sample app demonstrates how to decode an elementary video stream that comes in packets (whole frames).
It uses two Transcoder objects. The first transcoder operates in pull mode and splits the elementary stream into frames.
It serves to simulate data arriving in frames for the second transcoder, which operates in push mode.
We've tested this with an h264 elementary stream.
 
This sample app assumes that the video parameters are known in advance.
It's possible to extend the sample to detect the video parameters on the fly if the data arrives in whole frames.
In order to achieve this it is necessary to follow these steps:
1. Buffer the first few frames (even the first one may be enough if it contains several NALUs). It is important to remember the frame boundaries/sizes.
2. Pass this concatenated buffer to the MediaInfo object and get the parameters of the video stream.
3. Instantiate a Transcoder object (as transcoder2 in the PullPushDecoder), and set the video stream parameters obtained from the MediaInfo object as input.
4. Push the frames buffered in step 1 to the transcoder, complying with the frame boundaries.
5. Continue to push frames to the transcoder as they are received.
If you actually need this scenario and my explanation is vague, we can help you by providing working code for the above sequence within a day, as it will work with the latest AVBlocks version (1.4).
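Steps 1 and 4 above amount to buffering the first frames while remembering their boundaries. A minimal sketch in plain C++ (the MediaInfo and Transcoder calls are omitted; the class name is hypothetical):

```cpp
#include <cstdint>
#include <vector>

// Keeps the first few frames both individually (so they can later be pushed
// one by one, respecting frame boundaries) and concatenated (so the whole
// sequence can be handed to a format prober such as MediaInfo).
class FrameBuffer
{
public:
    void addFrame(const std::vector<uint8_t>& frame)
    {
        frames_.push_back(frame);
        concatenated_.insert(concatenated_.end(), frame.begin(), frame.end());
    }

    // Contiguous buffer (SPS + PPS + frames) for format detection.
    const std::vector<uint8_t>& probeData() const { return concatenated_; }

    // Individual frames, to be pushed to the transcoder in order.
    const std::vector<std::vector<uint8_t>>& frames() const { return frames_; }

private:
    std::vector<std::vector<uint8_t>> frames_;
    std::vector<uint8_t> concatenated_;
};
```

Once the parameters are detected from probeData(), the stored frames() are pushed to the transcoder in order before switching to live frames.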
 
Apart from that, after releasing 1.4 we are now working on a more general streaming scenario.
It will allow: 1) pushing arbitrary data chunks to the Transcoder (not only whole frames as it is now) and 2) detecting the stream format on the fly as data arrives (currently there's only the non-ideal and cumbersome workaround described above).
 
Please let me know if you need more help to test AVBlocks 1.4 and the proposed solution in the PullPushDecoder sample.
 
Thanks,
Svilen

sander.borsboom

Jul 15, 2013, 5:19:22 AM
to avblocks...@googlegroups.com
Hello Svilen,

Bernard is on vacation and I took over the implementation. I normally program in Java and sometimes in C++ on Linux, so together with the busyness caused by a lot of people being on vacation, it took me a while to get up to speed.

I am trying to implement what you proposed, but I am really not sure how the program should look. So let me explain what we have as input, and what I think I should do to get good video parameters. Then I'll list some of my questions.

We get frames, each nicely separated in its own char array. The first frame is an SPS array, then a PPS, and then "normal" video frames. These frames come in "live": they are being generated by a camera and we want to decode them immediately, so we do not have the full stream available when we want to start, just the SPS, PPS and first keyframe. So the end result is pure H.264 data, with no stream/file format around it.

I have implemented a simple class implementing primo::Stream. It has an internal char array, from which it reads when read() is called; it is filled with another method.

So when I get a SPS frame, the following is done:

// Create the primo::Stream implementation object.
_in = new VideoInputStream();

// Add the SPS data to the stream object (including the 0x000001 prefix).
// 'in' is a char array with the following raw data:
// 00 00 01 67 42 00 29 E3 50 3C 17 FC B8 0B 70 10 10 1A 41 E2 44 54
_in->addData(in, insize);

MediaInfo *info = primo::avblocks::Library::createMediaInfo();
info->setInputStream(_in);
if (!info->load())
{
    printError(L"load MediaInfo", info->error());
    return;
}

The result:
load MediaInfo: Unsupported format (facility:5 code:7)

Questions:
  • Is implementing our own Stream class a good way to get the frame into the MediaInfo object and later the first Transcoder or are there already existing classes?
  • Which frames does MediaInfo need? Currently I am only putting in the SPS, but I could also add the PPS or even the first Keyframe (But the buffer set-up would be easier for us if we didn't have to do the keyframe).
  • Any other obvious problems with the code? Do I have to set MediaInfo.setInputType()?
Greetings,

Sander Borsboom

On Friday, July 5, 2013 11:43:24 AM UTC+2, Svilen Stoilov wrote:

Svilen Stoilov

Jul 15, 2013, 11:17:38 AM
to avblocks...@googlegroups.com
Hi Sander,
 
  • 1. Is implementing our own Stream class a good way to get the frame into the MediaInfo object and later the first Transcoder or are there already existing classes?
 
The primo::Stream interface is the (only) way to make MediaInfo work with memory data without using files.
The attached file "MemoryInputStream.h" shows a simplistic but working implementation of primo::Stream that is suitable for passing data to the MediaInfo class.
When mentioning the "first Transcoder" you probably refer to the PullPushDecoder sample app which is part of the AVBlocks SDK.
This sample uses two transcoder objects: the first one is used to split an existing H.264 file into frames, and the second one is used to decode those frames.
However, the first transcoder is used mostly to simulate the process of getting H.264 frames from a live source.
In your case you already receive the H.264 frames from a web camera, so you don't need the first transcoder, just the second one.
You don't need to implement primo::Stream to pass data to the transcoder object at all. You can do this via Transcoder::push as you receive the H.264 frames.
The attached implementation of primo::Stream is needed only to use the MediaInfo class as an initialization step in order to detect the h264 stream parameters.
 
2. Which frames does MediaInfo need? Currently I am only putting in the SPS, but I could also add the PPS or even the first keyframe (but the buffer set-up would be easier for us if we didn't have to include the keyframe).
 
Currently the MediaInfo class requires SPS+PPS+I_FRAME in order to detect the parameters.
I understand that this is not optimal as the SPS frame should be enough but this is how it works.
It is OK for the above sequence (SPS+PPS+I_FRAME) to be preceded by a delimiter frame or a supplemental enhancement information (SEI).
Consider that you are using the MemoryStream class from "MemoryInputStream.h" and that you are receiving SPS, PPS and the I(Key) frame as separate packets:
 
MemoryStream ms;
ms.append(sps_start, sps_end);
ms.append(pps_start, pps_end);
ms.append(iframe_start, iframe_end);

MediaInfo* info; // already created

info->setInputStream(&ms);
info->setInputType(primo::codecs::StreamType::H264); // the stream type must be specified explicitly
info->load();

// ...

// init transcoder for decoding
// push data from the memory stream (SPS+PPS+I_FRAME) to the transcoder before proceeding with regular frames.
// it's OK to push the whole appended data (ms.data) because frames do have start codes

 
3. Any other obvious problems with the code? Do I have to set MediaInfo.setInputType()?
 
Currently the input type must be specified explicitly when the input is a primo::Stream instance and not a file.
 
MediaInfo* info;
info->setInputType(primo::codecs::StreamType::H264);

I'll modify the sample app PullPushDecoder so that you can look at the complete code if you need to.
I'll notify you when it's done.
 
Thank you,
Svilen
 
 
MemoryInputStream.h

Svilen Stoilov

Jul 16, 2013, 4:49:53 AM
to avblocks...@googlegroups.com
Hi Sander,
 
I've updated the PullPushDecoder sample for Windows:
 
 
This sample basically implements the idea that we've discussed:
 
1. Receive frames from transcoder1 (live source)
2. Accumulate received frames in MemoryStream
3. Try to detect the input parameters using MediaInfo and the MemoryStream.
4. Once the input parameters are detected initialize transcoder2 to decode input stream to YUV.
5. Push the data accumulated in MemoryStream to transcoder2.
6. Continue regular pushing of frames (received from transcoder1) to transcoder2 (MemoryStream is not used anymore).
 
I'll also update the PullPushDecoder for Mac OS X and Linux in the same repo. But the code is conceptually the same, just different projects and char type (encoding).
 
Thanks,
Svilen

On Monday, July 15, 2013 12:19:22 PM UTC+3, sander.borsboom wrote:

sander.borsboom

Jul 24, 2013, 10:09:10 AM
to avblocks...@googlegroups.com
Hello Svilen,

I didn't have much time to work on the decoder last week, but this week I am working on it full-time. I roughly have the code to start decoding, except for two problems:

1: MediaSocket::createFromMediaInfo(info); results in a crash
Like your example, I load the media info from the three frames (SPS, PPS and I-frame) and this seems to work: it detects that it is H.264 and has the correct resolution. But if I try to create a MediaSocket based on this object, the application crashes (Windows pop-up with "... has stopped working"). Any idea what could cause this/how to debug it?

The piece of code where this happens (warning: Java coder trying to write C++ ;)):
// works about the same as example code:
MediaInfo* info = getMediaInfo( keyFrame );
cout << "info->inputFile(): " << (info->inputFile() == 0 ? "0" : (char*)info->inputFile()) << endl;
cout << "info->inputStream() == 0: " << (info->inputStream() == 0 ? "true" : "false") << endl;
printf("info->inputType(): %d\n", info->inputType());
printf("info->streams()->count(): %d\n", info->streams()->count());

for (int i = 0; i < info->streams()->count(); i++)
{
    VideoStreamInfo* vsi = (VideoStreamInfo*) (info->streams()->at(i));
    printf("Stream no %d:\n", i);
    printf("vsi->mediaType(): %d\n", vsi->mediaType());
    printf("vsi->streamType(): %d\n", vsi->streamType());
    printf("vsi->streamSubType(): %d\n", vsi->streamSubType());
    printf("vsi->frameHeight(): %d\n", vsi->frameHeight());
    printf("vsi->frameWidth(): %d\n", vsi->frameWidth());
    printf("vsi->duration(): %d\n", vsi->duration());
    printf("vsi->ID(): %d\n", vsi->ID());
    printf("vsi->bitrate(): %d\n", vsi->bitrate());
    printf("vsi->bitrateMode(): %d\n", vsi->bitrateMode());

    // Possible fix? (duration seems to be negative: did not work)
    vsi->setDuration(0);
}

VideoStreamInfo* vsi = (VideoStreamInfo*) (info->streams()->at(0));

if (vsi->frameWidth() == 0 || vsi->frameHeight() == 0)
{
    fprintf(stderr, "Error: Media info was unsuccessfully parsed.");
    return;
}

printf("init: 1\n");
MediaSocket* inputStream = MediaSocket::createFromMediaInfo(info);
printf("init: 2\n");

Example output:

info->inputFile(): 0
info->inputStream() == 0: false
info->inputType(): 8199
info->streams()->count(): 1
Stream no 0:
vsi->mediaType(): 2
vsi->streamType(): 8199
vsi->streamSubType(): 0
vsi->frameHeight(): 360
vsi->frameWidth(): 480
vsi->duration(): -572662306
vsi->ID(): 0
vsi->bitrate(): 0
vsi->bitrateMode(): 0
init: 1

So as far as I can see, the object seems to be reasonably set up (except for the duration, but setting it to something more reasonable doesn't help).

2: How to get the uncompressed video out?
Our goal is to put compressed video frames into the decoder and get out uncompressed video (e.g. StreamType::UncompressedVideo & ColorFormat::RGB24). The first part we now almost have, but how do we do the rest? The only thing I could think of is using the MemoryInputStream to let the transcoder write to it instead of a file, but what is an easy way then to find frame boundaries?

So my current workflow:

  1. Receive frames from outside the application in the form of uint8_t arrays, one for each frame.
  2. Accumulate SPS, PPS and IDR frame in MemoryStream ms.
  3. use MediaInfo.load() and the MemoryStream to load the media info.
  4. Set-up transcoder input using the loaded media info to set-up a MediaSocket.
  5. Create the output (using an outgoing MemoryStream? Something else?);
  6. For each frame:
    1. Push the data array into a MediaSample
    2. Push the media sample into the transcoder
    3. ... Magic...
If you want I can include more code, but it is currently rather messy, so I was hoping this is enough.

Greetings,

Sander Borsboom

On Tuesday, July 16, 2013 10:49:53 AM UTC+2, Svilen Stoilov wrote:

sander.borsboom

Jul 25, 2013, 3:50:21 AM
to avblocks...@googlegroups.com
Hi Svilen,

just as extra info, our end goal is for the decoder to work like this (maybe partially built by us, but hopefully mostly AVBlocks code):

Receive frames from outside the application in the form of uint8_t arrays, one for each frame/NAL unit (so SPS, PPS and SEI are also separate), plus a value denoting which codec it is (H.264/MPEG-4 Part 2/etc.)

First we configure the decoder for the specific format (e.g. H.264 or MPEG-4 Part 2). We do not set the width, height or color system, since we don't know them, but we could provide an SPS or MPEG-4 Part 2 decoding details.

For each frame:
  1. Feed the frame into the decoder
  2. Check whether the decoder was able to decode an image; if not, return null; if so:
    1. Get the data array representing the data of the image
    2. Get the width and the height of the image
    3. If not pre-defined: ask the color coding of the image
So the end result after "decoding" a data array:
  • Null if the frame did not result in an image (SPS, PPS, SEI or damaged/incorrect frames).
  • A uint8_t array, width, height and color enum if the frame did result in an image.
A possible edge case would be when the encoder which generates the frames changes settings while streaming, which with H.264 results in new SPS and PPS frames. It would be great if the decoder could handle this and change settings on the fly (which is why we ask for the width/height/color for each decoded frame); if not, we can work around it ourselves by creating a new decoder when we detect a new SPS which is not just a copy of the previous one.
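The workaround sketched above (re-create the decoder when a genuinely new SPS arrives) only needs a byte comparison against the last seen SPS. A minimal sketch in plain C++, with a hypothetical class name:

```cpp
#include <cstdint>
#include <vector>

// Tracks the last seen SPS; reports whether a newly received SPS differs,
// in which case the decoder should be torn down and re-created.
class SpsTracker
{
public:
    // Returns true if 'sps' differs from the previously stored one
    // (or if it is the first SPS seen), and remembers it.
    bool changed(const std::vector<uint8_t>& sps)
    {
        if (!hasSps_ || sps != lastSps_)
        {
            lastSps_ = sps;
            hasSps_ = true;
            return true;
        }
        return false;
    }

private:
    std::vector<uint8_t> lastSps_;
    bool hasSps_ = false;
};
```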

I hope this gives a bit more information about what we are trying to accomplish, so you know the background of our questions.

Greetings,

Sander Borsboom

On Wednesday, July 24, 2013 4:09:10 PM UTC+2, sander.borsboom wrote:

Svilen Stoilov

Jul 26, 2013, 4:28:29 AM
to avblocks...@googlegroups.com
Hi Sander,
 
There are a few issues to discuss so I'll answer in parts:
 
(*) MediaSocket::createFromMediaInfo() crashes.
I analyzed the possibilities and I believe the reason is that the MemoryInputStream no longer exists when you call MediaSocket::createFromMediaInfo().
Basically, createFromMediaInfo initializes a new MediaSocket with properties obtained from MediaInfo.
It does not use any internal knowledge; it is just a convenience method.
In your case I believe that the MemoryInputStream is no longer valid after:

MediaInfo* info = getMediaInfo( keyFrame );

and then in the line

MediaSocket* inputStream = MediaSocket::createFromMediaInfo(info);

AVBlocks tries to transfer the MemoryInputStream from MediaInfo to the new MediaSocket, but since the MemoryInputStream is already destroyed there's a crash.
 
There are various ways to solve this.
1. Manually create your input socket like this:

AutoRelease<MediaSocket> socket1 (Library::createMediaSocket());
AutoRelease<MediaPin> pin1 (Library::createMediaPin());
pin1->setConnection(PinConnection::Auto);
pin1->setStreamInfo(info->streams()->at(0));
socket1->setStreamType(info->inputType());

2. Or make sure that you reset the input stream in MediaInfo before calling MediaSocket::createFromMediaInfo():

info->setInputStream(NULL);
MediaSocket* inputStream = MediaSocket::createFromMediaInfo(info);
 

3. Or you can fully implement the primo::Reference interface used by the MemoryInputStream, but I don't recommend this because it's more complex and you don't need it. And you will still have to remove the stream from the input socket (socket->setStream(NULL)) in order to use Transcoder::push().

 
Note that in order to call Transcoder::push() your input socket should have neither a file nor a stream configured, because you will manually provide the input samples. So the above will both fix the crash and let you use Transcoder::push() later.
 
(*) cout << "info->inputFile(): " << (info->inputFile() == 0 ? "0" : (char*)info->inputFile()) << endl;
This is a minor problem, but note that our char interface for Windows is actually wchar_t (UTF-16), not char.
So you should print with wcout (not cout) and you should not cast info->inputFile() to (char*).
 
(*) The negative duration is a bug in MediaInfo. The duration cannot actually be detected from the initial sequence and should be set to 0, but it is left as is (a random value), which turns out to be a strange number.
We'll fix this.
 
(*) How to get the uncompressed video out?
The current approach is to implement your own primo::Stream and set it on the output socket. You basically need to override the primo::Stream::write method, and it will give you the uncompressed frames. Each uncompressed frame will be received in a separate call to Stream::write.
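The write-override idea can be sketched in plain C++. This is not the actual primo::Stream interface (its write signature may differ; check the SDK headers); the class and callback here are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Sketch of an output sink in the spirit of overriding primo::Stream::write:
// each call delivers one complete uncompressed frame, which is forwarded to
// a user-supplied callback instead of being written to a file.
class FrameSink
{
public:
    using FrameCallback = std::function<void(const uint8_t*, size_t)>;

    explicit FrameSink(FrameCallback cb) : callback_(std::move(cb)) {}

    // Called once per decoded frame (as Stream::write would be).
    bool write(const void* data, size_t size)
    {
        callback_(static_cast<const uint8_t*>(data), size);
        ++frameCount_;
        return true;
    }

    int frameCount() const { return frameCount_; }

private:
    FrameCallback callback_;
    int frameCount_ = 0;
};
```

The callback is where the application would interpret the buffer as one YUV/RGB frame of the negotiated width, height and color format.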
 
I'll give you more information and samples later today.
 
Thanks,
Svilen

Sander Borsboom

Jul 26, 2013, 9:55:47 AM
to Svilen Stoilov via AVBlocks Support, avblocks...@googlegroups.com
Hi Svilen,

thank you for the explanation. I used approach 2, which felt most natural, and now the program continues without problems. Option 3 might be interesting if we end up using the stream to get the output frames.

Greetings,

Sander


2013/7/26 Svilen Stoilov via AVBlocks Support <avblocks-support+noreply-APn2wQe...@googlegroups.com>


--
Met vriendelijke groet / Kind regards,

Sander Borsboom
Technical manager video analytics

Cameramanager.com
Hogehilweg 19
1101 CB Amsterdam
The Netherlands
Office: +31(0)88-006.84.50 / +31(0)88-006.84.58 (direct)
Tech: sup...@cameramanager.com
Email: sander....@cameramanager.com
Site: www.cameramanager.com

Sander Borsboom

Jul 26, 2013, 9:57:56 AM
to Svilen Stoilov via AVBlocks Support, avblocks...@googlegroups.com
Sorry, I misread your last note. I will now implement the stream to see if that approach to getting the decoded frames works.

Greetings,

Sander


2013/7/26 Sander Borsboom <sander....@cameramanager.com>