Best way to replace curl to receive endless stream

Flix

Jul 22, 2020, 10:51:30 AM
to emscripten-discuss
Hi,

I'd like to port a (very basic) mp3 internet radio player to emscripten.

The original C code uses libcurl, but the whole libcurl code is just something like:

// curl stream callback
size_t stream_callback(char *ptr, size_t size, size_t nmemb, void *userdata) {
    [...]
}

int main(int argc, char* argv[]) {
    CURL *c;
    CURLcode res;
    const char *webaddr = NULL; /* URL of the radio-station in <ip-addr>:<port> format */
    c = curl_easy_init();
    curl_easy_setopt(c, CURLOPT_WRITEFUNCTION, stream_callback);
    curl_easy_setopt(c, CURLOPT_NOPROGRESS, 1);
    curl_easy_setopt(c, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0);
    curl_easy_setopt(c, CURLOPT_URL, webaddr);
    res = curl_easy_perform(c);
    /* curl_easy_perform(...) never returns unless user forces stream_callback to return 0, that results in res==CURLE_WRITE_ERROR */
    curl_easy_cleanup(c);
    return 0;
}

I've seen that emscripten provides other ways to replace libcurl (the emscripten_async_wget() function family), but, as far as I can understand from the docs, there's no way to handle an "infinite" stream: in the example code, "stream_callback(...)" never stops unless the user wants it to stop.

Is there a way to do the same in emscripten?

Thank you in advance.

Floh

Jul 23, 2020, 1:50:23 PM
to emscripten-discuss
I have implemented HTTP streaming for an MPEG player in this sample:


However, this doesn't use any of the emscripten APIs but instead uses embedded JavaScript to do the streaming through XMLHttpRequest objects. The whole library is a bit more involved since it does a bit more than just streaming, but maybe you can rip out the relevant parts (or even just use it as is, it's a totally self-contained single-file library):


Incremental streaming of large data files (such as the MPEG file) is done through HTTP range-requests to only load small parts of the file at once, but there's some action required by the application code to pause and continue the streaming so that there's always enough data preloaded for the video not to starve, but also not too much so that memory use doesn't explode.
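
A rough sketch of the pause/continue idea described above, in plain C. This is not the code from the library: the struct, the function names (range_pacer, pacer_next_range, pacer_should_fetch) and the watermark thresholds are all made up for illustration. It only shows how the value for an HTTP "Range" request header could be built chunk by chunk, and how a low/high watermark could decide when to pause and resume fetching.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical pacing state; names and thresholds are illustrative only. */
typedef struct {
    size_t next_offset; /* first byte of the file not yet requested */
    size_t chunk_size;  /* bytes asked for per range request */
    size_t low_water;   /* resume fetching at or below this many buffered bytes */
    size_t high_water;  /* pause fetching at or above this many buffered bytes */
    int    paused;      /* 1 while fetching is paused */
} range_pacer;

/* Write the Range header value ("bytes=first-last") for the next chunk
   and advance the offset. Returns the number of characters written, or
   -1 if the output buffer is too small (the offset is then unchanged). */
int pacer_next_range(range_pacer *p, char *out, size_t out_size) {
    assert(p->chunk_size > 0);
    size_t first = p->next_offset;
    size_t last = first + p->chunk_size - 1;
    int n = snprintf(out, out_size, "bytes=%zu-%zu", first, last);
    if (n < 0 || (size_t)n >= out_size) return -1;
    p->next_offset = last + 1;
    return n;
}

/* Decide whether another range request should be issued, given how many
   bytes are buffered but not yet played. The gap between the two
   watermarks keeps the decision from flipping on every call. */
int pacer_should_fetch(range_pacer *p, size_t buffered) {
    if (buffered >= p->high_water) p->paused = 1;
    else if (buffered <= p->low_water) p->paused = 0;
    return !p->paused;
}
```

The application would call pacer_should_fetch() once per frame and only issue the next request when it returns nonzero. Note that this only applies to files of known, finite size served by a server that honours range requests; it would not map directly onto an endless radio stream.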

The actual sample source code is here, this also has some comments to explain how the whole thing works:


Hope this helps,
-Floh.

Flix

Jul 24, 2020, 5:42:16 AM
to emscripten-discuss
Hi Floh and thanks for your help!

I'm happy that I can use sokol_fetch.h (BTW: it contains a big amount of documentation too).

Main problem is that I just wanted to add some minimal code changes to support emscripten.

The program I'm trying to port is all in this gist here: https://gist.github.com/Flix01/157e8dafd9bef766092264ce6c1abbdb (a single .c file in about 450 loc, and I think it's already too long).

[I'm also using OpenAL instead of sokol_audio.h, so following plmpeg-sapp.c is a bit more difficult for me].

Anyway at least now I've got the proof that it can be done. Thank you again.

Floh

Jul 24, 2020, 6:22:45 AM
to emscripten-discuss
> Main problem is that I just wanted to add some minimal code changes to support emscripten.

Yep, that would be preferable. But I think curl's streaming callback isn't easy to map to the emscripten_wget* functions.

But maybe the emscripten_async_wget2_data() function can be used: it has an onprogress callback, which seems to be called before the entire file is downloaded, and which has a pointer to a buffer and a "number of bytes loaded" argument (see: https://emscripten.org/docs/api_reference/emscripten.h.html#c.emscripten_async_wget2_data)

However, note that neither method (curl with CURLOPT_WRITEFUNCTION or emscripten_async_wget2_data()) gives any guarantee about how much data has actually been loaded before the callbacks are called (that's basically why I'm using HTTP range requests). I think this means that a lot of audio data will be queued up in RAM (basically the entire file size), because there's no way to pause the download to let the audio playback catch up. Don't you see RAM usage explode in your example for "infinite streams"?

So basically, you could try replacing the call to curl_easy_perform() with a call to emscripten_async_wget2_data(), and maybe the onprogress callback behaves similarly to the curl WRITEFUNCTION callback. One definite difference is that emscripten_async_wget2_data() will not block but instead returns immediately. You need to use the provided callbacks to check whether the download has finished, and you should also use one of the emscripten_request_animation_frame_* functions to periodically check on the status of the download (which basically requires splitting your code into an initialization part and a per-frame part).

Hope this helps!
-Floh.

Floh

Jul 24, 2020, 6:27:33 AM
to emscripten-discuss
Ah, correction sorry, looks like the onprogress callback in emscripten_async_wget2_data() *doesn't* have a pointer to the data that's downloaded so far, you only get the number of bytes downloaded, but not the bytes themselves. You only get the data when the entire download is finished in the onload callback :(

So I guess you're out of luck and the only way to stream partial data is via HTTP range requests (which as far as I know are not supported by the emscripten_wget* functions).

Flix

Jul 24, 2020, 12:53:21 PM
to emscripten-discuss

> However, note that neither method (curl with CURLOPT_WRITEFUNCTION or emscripten_async_wget2_data()) gives any guarantee about how much data has actually been loaded before the callbacks are called (that's basically why I'm using HTTP range requests). I think this means that a lot of audio data will be queued up in RAM (basically the entire file size), because there's no way to pause the download to let the audio playback catch up. Don't you see RAM usage explode in your example for "infinite streams"?

Mmh, let me see... well it seems that (with the current buffer size settings), the release build (compiled with -O3 -march=native -openmp-simd), consumes about 12.5 MiB (and 0% CPU...strange?!?). Memory consumption seems stable in the interval [12.1-12.7 MiB].
I copy the memory I receive into the encodedBuffer. It should be big enough that it never overflows, but I keep writing to it and reading from it (all in the same thread), so it should stay almost constant in size (actually, for robustness, I discard old encoded samples in case of overflow).
The catch is that OpenAL tells me when one of its queued buffers needs a refill, so I can read from the encodedBuffer and use memmove (bad solution) to remove the used chunk of memory.
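
One common alternative to the memmove approach is a ring buffer, where consuming data only moves a read index. The following is a minimal sketch under assumptions: ring_buffer, ring_write and ring_read are hypothetical names, the capacity is tiny for illustration, and none of this is taken from the gist. On overflow it drops the oldest bytes, which mirrors the "discard old encoded samples" behaviour described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_CAPACITY 8 /* tiny on purpose; a real player would use far more */

/* Hypothetical byte ring buffer; not the encodedBuffer from the gist. */
typedef struct {
    unsigned char data[RING_CAPACITY];
    size_t head; /* index of the oldest stored byte */
    size_t size; /* number of bytes currently stored */
} ring_buffer;

/* Append n bytes. If they do not fit, the oldest bytes are discarded. */
void ring_write(ring_buffer *rb, const unsigned char *src, size_t n) {
    assert(rb->size <= RING_CAPACITY);
    if (n > RING_CAPACITY) { /* keep only the newest RING_CAPACITY bytes */
        src += n - RING_CAPACITY;
        n = RING_CAPACITY;
    }
    size_t overflow = (rb->size + n > RING_CAPACITY)
                          ? rb->size + n - RING_CAPACITY : 0;
    rb->head = (rb->head + overflow) % RING_CAPACITY; /* drop oldest bytes */
    rb->size -= overflow;
    size_t tail = (rb->head + rb->size) % RING_CAPACITY;
    size_t first = RING_CAPACITY - tail; /* room before the wrap point */
    if (first > n) first = n;
    memcpy(rb->data + tail, src, first);
    memcpy(rb->data, src + first, n - first); /* wrapped remainder, may be 0 */
    rb->size += n;
}

/* Remove up to n of the oldest bytes into dst; returns the count copied.
   No stored data is moved, only the head index advances. */
size_t ring_read(ring_buffer *rb, unsigned char *dst, size_t n) {
    assert(rb->size <= RING_CAPACITY);
    if (n > rb->size) n = rb->size;
    size_t first = RING_CAPACITY - rb->head;
    if (first > n) first = n;
    memcpy(dst, rb->data + rb->head, first);
    memcpy(dst + first, rb->data, n - first);
    rb->head = (rb->head + n) % RING_CAPACITY;
    rb->size -= n;
    return n;
}
```

With this layout the download callback would call ring_write() and the OpenAL refill path would call ring_read(), both on the same thread as in the setup described above, so no locking is shown.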

> Ah, correction sorry, looks like the onprogress callback in emscripten_async_wget2_data() *doesn't* have a pointer to the data that's downloaded so far, you only get the number of bytes downloaded, but not the bytes themselves. You only get the data when the entire download is finished in the onload callback :(
>
> So I guess you're out of luck and the only way to stream partial data is via HTTP range requests (which as far as I know are not supported by the emscripten_wget* functions).

Yep!
...what about <emscripten/fetch.h> streaming downloads? Well, the note in the docs says: "This currently only works in Firefox as it uses 'moz-chunked-arraybuffer'", so I'm not sure whether it's worth trying or not...
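
For reference, a sketch of what the <emscripten/fetch.h> streaming route could look like while keeping the curl-style stream_callback signature from the first post. This is untested against a real radio stream, and the browser caveat quoted above still applies. The body of stream_callback here is a placeholder that only counts bytes, total_received, on_progress, on_done and start_stream are hypothetical names, and stopping the stream from inside the callback is not handled. The Emscripten part is guarded so that the callback itself also compiles natively; the Emscripten build is assumed to need the FETCH link setting.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder state: the real player would copy into its encoded buffer. */
static size_t total_received = 0;

/* Same signature as the curl write callback in the original program. */
size_t stream_callback(char *ptr, size_t size, size_t nmemb, void *userdata) {
    (void)userdata;
    assert(ptr != NULL || size * nmemb == 0);
    total_received += size * nmemb; /* placeholder for the real decoding path */
    return size * nmemb;
}

#ifdef __EMSCRIPTEN__
#include <string.h>
#include <emscripten/fetch.h>

/* With EMSCRIPTEN_FETCH_STREAM_DATA, each progress event exposes the newly
   received chunk in fetch->data[0 .. fetch->numBytes - 1]. */
static void on_progress(emscripten_fetch_t *fetch) {
    if (fetch->data && fetch->numBytes > 0) {
        stream_callback((char *)fetch->data, 1, (size_t)fetch->numBytes,
                        fetch->userData);
    }
}

/* Called when the request ends or fails; releases the fetch object. */
static void on_done(emscripten_fetch_t *fetch) {
    emscripten_fetch_close(fetch);
}

/* Returns immediately; data arrives later through on_progress. */
void start_stream(const char *url) {
    emscripten_fetch_attr_t attr;
    emscripten_fetch_attr_init(&attr);
    strcpy(attr.requestMethod, "GET");
    attr.attributes = EMSCRIPTEN_FETCH_STREAM_DATA;
    attr.onprogress = on_progress;
    attr.onsuccess = on_done;
    attr.onerror = on_done;
    emscripten_fetch(&attr, url);
}
#endif
```

Unlike curl_easy_perform(), start_stream() does not block, so main() would have to return to the browser event loop (or install a main loop) instead of waiting for the stream to end.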
