Kurento crash


Mikhail Novikov

Feb 6, 2018, 1:11:50 PM
to kurento
From time to time I am getting a crash with this stack trace:

Segmentation fault (thread 139796485371648, pid 7829)
Stack trace:
[g_socket_send_message]
/usr/lib/x86_64-linux-gnu/libgio-2.0.so.0:0x7B044
[nice_output_stream_new]
/usr/lib/x86_64-linux-gnu/libnice.so.10:0x2769E
[nice_output_stream_new]
/usr/lib/x86_64-linux-gnu/libnice.so.10:0x27813
[nice_agent_recv_nonblocking]
/usr/lib/x86_64-linux-gnu/libnice.so.10:0x11879
[gst_nice_src_get_type]
/usr/lib/x86_64-linux-gnu/gstreamer-1.5/libgstnice15.so:0x3902
[gst_nice_sink_get_type]
/usr/lib/x86_64-linux-gnu/gstreamer-1.5/libgstnice15.so:0x4203
[gst_base_sink_do_preroll]
/usr/lib/x86_64-linux-gnu/libgstbase-1.5.so.0:0x2A1B2
[gst_base_sink_do_preroll]
/usr/lib/x86_64-linux-gnu/libgstbase-1.5.so.0:0x2B620
[gst_flow_get_name]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6E5CF
[gst_pad_push]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x76533
[gst_proxy_pad_chain_default]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x5F5E3
[gst_flow_get_name]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6E5CF
[gst_pad_push]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x76533
0x1B48D at /usr/lib/x86_64-linux-gnu/gstreamer-1.5/libgstcoreelements.so
[gst_flow_get_name]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6E5CF
[gst_pad_push]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x76533

Where to look?

Thanks!
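
A side note for anyone posting traces like this one: KMS writes its crash report with terminal colour codes, so a trace copied out of a log file tends to be cluttered with sequences such as `ESC[31;1m` (shown as `^[[31;1m` by pagers and editors). A minimal sketch for cleaning a trace up before posting, assuming a POSIX shell with `sed`; the function name `strip_ansi` and the file names in the usage line are only illustrative:

```shell
# Remove ANSI colour sequences (ESC [ ... m) from text on stdin.
# Also handles the caret form "^[[31;1m" that pagers and editors display.
strip_ansi() {
  esc=$(printf '\033')   # the literal ESC byte
  sed -e "s/${esc}\[[0-9;]*m//g" \
      -e 's/\^\[\[[0-9;]*m//g'
}
```

Usage would be something like `strip_ansi < kms-crash.log > kms-crash-clean.log`.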

Jorge Maiquez

Mar 12, 2018, 9:56:01 AM
to kurento
Hi all,

Has anyone figured out what is causing the segmentation fault in g_socket_send_message? There are various reports of this in this newsgroup, but not a single positive response from the community.

Is the kurento team aware of this? Is there a workaround?

We didn't see this error a single time in our development and staging environments. But now that we deployed it to production, our client has seen it twice in less than a week.

Any tips are greatly appreciated. The client has a large webinar planned for tomorrow.

Thanks!
Jorge Maiquez, Digital Samba

Jon Ruddell

Mar 12, 2018, 12:19:24 PM
to kurento
Do you have any more information such as your KMS version, OS info, and how to reproduce the crash?

Kapa6ac79

Mar 12, 2018, 4:26:22 PM
to kurento
I can confirm the problem; the same bug appears after switching to the latest version. The error occurs when the server is under load (from about 50+ clients), but there are no clear steps to reproduce it; it arises suddenly. We tested the WebRTC one-to-many broadcast mode. OS: Ubuntu 16.04, KMS 6.7.0.

On Wednesday, February 7, 2018 at 2:11:50 AM UTC+8, Mikhail Novikov wrote:

Alex Kandrashkin

Mar 13, 2018, 3:20:51 AM
to kurento
Also have this issue (not the latest KMS)
KMS version:  6.7.0~1.g6ebaa27
Found modules:
   Module: 'core' version '6.6.3'
   Module: 'elements' version '6.6.3'
   Module: 'filters' version '6.7.0~1.g0314843'

Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-112-generic x86_64)

Do you have it on the latest KMS?

dawn_...@outlook.com

Mar 13, 2018, 3:24:14 AM
to kurento
I also ran into this problem!

This is my stack trace:

Segmentation fault (thread 139918407882496, pid 26118)
Stack trace:
[g_socket_send_message]
/usr/lib/x86_64-linux-gnu/libgio-2.0.so.0:0x7B044
[nice_output_stream_new]
/usr/lib/x86_64-linux-gnu/libnice.so.10:0x2ACBF
[nice_output_stream_new]
/usr/lib/x86_64-linux-gnu/libnice.so.10:0x2AF3B
[nice_agent_recv_nonblocking]
/usr/lib/x86_64-linux-gnu/libnice.so.10:0x12AE9
[gst_nice_src_get_type]
/usr/lib/x86_64-linux-gnu/gstreamer-1.5/libgstnice15.so:0x36B2
[gst_nice_sink_get_type]
/usr/lib/x86_64-linux-gnu/gstreamer-1.5/libgstnice15.so:0x3FB3
[gst_base_sink_do_preroll]
/usr/lib/x86_64-linux-gnu/libgstbase-1.5.so.0:0x2A1B2
[gst_base_sink_do_preroll]
/usr/lib/x86_64-linux-gnu/libgstbase-1.5.so.0:0x2B620
[gst_flow_get_name]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6E5CF
[gst_pad_push]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x76533
[gst_proxy_pad_chain_default]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x5F5E3
[gst_flow_get_name]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6E5CF
[gst_pad_push]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x76533
0x1B48D at /usr/lib/x86_64-linux-gnu/gstreamer-1.5/libgstcoreelements.so
[gst_flow_get_name]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6E5CF
[gst_pad_push]
/usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x76533
 **
libnice:ERROR:agent.c:2342:agent_signal_component_state_change: assertion failed: (TRANSITION (DISCONNECTED, FAILED) || TRANSITION (GATHERING, FAILED) || TRANSITION (CONNECTING, FAILED) || TRANSITION (CONNECTED, FAILED) || TRANSITION (READY, FAILED) || TRANSITION (DISCONNECTED, GATHERING) || TRANSITION (GATHERING, CONNECTING) || TRANSITION (CONNECTING, CONNECTED) || TRANSITION (CONNECTED, READY) || TRANSITION (READY, CONNECTED) || TRANSITION (FAILED, CONNECTING) || TRANSITION (FAILED, GATHERING) || TRANSITION (DISCONNECTED, CONNECTING))
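
The frames in these traces are raw library offsets, and the bracketed names are presumably just the nearest exported symbol (two consecutive `nice_output_stream_new` frames are unlikely to be literal). A sketch that turns colour-stripped trace lines into `addr2line` commands, assuming the offsets are relative to the start of each library and that debug symbols for it are installed; `trace_to_addr2line` is a made-up helper name:

```shell
# Read a colour-stripped KMS stack trace on stdin and print one addr2line
# command per frame. Handles both frame layouts seen in the trace:
#   /path/to/lib.so:0xOFFSET
#   0xOFFSET at /path/to/lib.so
# Lines that are not frames (function names, headers) are skipped.
trace_to_addr2line() {
  sed -n -E \
    -e 's|^(/[^:]+):(0x[0-9A-Fa-f]+)$|addr2line -f -C -e \1 \2|p' \
    -e 's|^(0x[0-9A-Fa-f]+) at (/.+)$|addr2line -f -C -e \2 \1|p'
}
```

The printed commands can be reviewed and then run by hand; `-f` asks for the function name, `-C` demangles C++ names, and `-e` names the library to inspect. Without debug symbols, `addr2line` will most likely print `??` for each frame.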

On Wednesday, February 7, 2018 at 2:11:50 AM UTC+8, Mikhail Novikov wrote:

Jorge Maiquez

Mar 13, 2018, 3:42:03 AM
to kurento

Thanks for the responses.


We don’t know how to reproduce this. If we did, then we could at least manage our client in some way. 


The KMS setup was unchanged in the switch from staging to production (same box, same KMS version, same OS, etc). You can see the respective version information in Alex’s response above.


Kapa6ac79, can you please confirm that you see the same error on the very latest 2018-01-18 version of KMS?


Jon, are you aware if this is something that was specifically worked on in the 2018-01-18 version? I can’t see anything in the change logs that would indicate that is the case.


Regardless, it seems we are not the only ones experiencing this problem, and a bug like this is not something you want kicking around in production.


Any hints/experiences much appreciated!


On Tuesday, February 6, 2018 at 7:11:50 PM UTC+1, Mikhail Novikov wrote:

Jorge Maiquez

Mar 13, 2018, 5:35:57 AM
to kurento
Are there any GStreamer gurus out there who could suggest which direction we should explore to trigger segmentation faults specifically in g_socket_send_message?

We don't have enough experience with gstreamer or that particular method to compile targeted test cases for reproducing this error. It would be good to have an educated-guess starting point. Is this more likely to be load related (didn't seem to be the case for us), or some dodgy camera/mic device driver on the client, or something else entirely?

Thanks,
Jorge

Jorge Maiquez

Mar 14, 2018, 8:45:41 AM
to kurento
Quick update. The client's session went well (2 broadcasters, 140 viewers), so we got lucky this time.

Can someone please suggest what things we can (stress) test to try to trigger the g_socket_send_message segmentation fault?

Micael Gallego

Mar 16, 2018, 6:38:47 PM
to kur...@googlegroups.com
We are working on that right now.

--
You received this message because you are subscribed to the Google Groups "kurento" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kurento+u...@googlegroups.com.
To post to this group, send email to kur...@googlegroups.com.
Visit this group at https://groups.google.com/group/kurento.
To view this discussion on the web visit https://groups.google.com/d/msgid/kurento/61baf5e2-14ed-456f-815a-8537910e4f23%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Message has been deleted

Jorge Maiquez

Mar 17, 2018, 6:35:17 AM
to kurento

Hi Micael,


Does this mean that you know how to reproduce it? If so, can you give us some more info about the conditions that trigger the error?


This client will be ramping up soon, and I'd like to take active steps to minimize the risk of this occurring, even if that means we have to cripple our application slightly in the short term.


Basically, is there a workaround we can use until you guys have fixed this?


If you need help with any specific testing, let me know.


Thanks & all the best,


Jorge Maiquez

Digital Samba


On Friday, March 16, 2018 at 11:38:47 PM UTC+1, Micael Gallego Carrillo wrote:
We are working on that right now.

Jorge Maiquez

Mar 21, 2018, 3:27:59 AM
to kurento
Any more details you can share with us Micael?

Micael Gallego

Mar 21, 2018, 4:18:57 AM
to kur...@googlegroups.com
We are digging into the problem right now... When we have made some progress, we will publish more information about it.

Message has been deleted

Jorge Maiquez

Mar 21, 2018, 9:12:47 AM
to kurento
I'm not sure why my messages keep getting deleted in this thread. Here is my last message again:

Thanks for the update. We are flying blind in production right now, so if there is anything at all that we can do right now to keep the error from occurring, please share, even if it's not the final solution.

And if you need assistance with testing, just let me know.


On Wednesday, March 21, 2018 at 9:18:57 AM UTC+1, Micael Gallego Carrillo wrote:
We are digging into the problem right now... When we have made some progress, we will publish more information about it.

Juan Navarro

Mar 23, 2018, 8:43:30 AM
to kurento
Hi,

We are having some issues with the overzealous spam filter that Google uses in Google Groups. It even happens to us sometimes that messages get deleted! We are currently looking for solutions, but it seems that other communities have had similar problems in the past. Short of totally disabling spam filtering, it seems there is not much we can do, because the messages don't appear in the "awaiting for moderation" list in the administration view... they instead get outright deleted.

Jorge Maiquez

Mar 23, 2018, 10:10:06 AM
to kurento
Hi Juan,

After one of my messages gets deleted, I see additional links in the Reply section of the UI, and one of those is (paraphrased) "click here to post". This then leads to a captcha, and after I verify I'm not a robot, I'm able to post a reply successfully.

It's not ideal, but maybe this helps someone who has the same problem to at least be able to post a reply without ripping their hair out :-)

Have a great weekend,
Jorge

Jorge Maiquez

Mar 30, 2018, 1:55:05 AM
to kurento
Any update on this?


On Wednesday, March 21, 2018 at 2:12:47 PM UTC+1, Jorge Maiquez wrote:

ankit...@gmail.com

Mar 30, 2018, 10:22:29 PM
to kurento
Jorge,

First of all, I am just a user like you. While work on this is ongoing, use monit to monitor KMS and restart it. There must be a way in your client API to get a reconnected event, and you can then use this event to reconnect your app. I do this in my NodeJS app.

Regards
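
The reconnect half of that idea can be sketched in shell as a retry loop with a growing delay. This only illustrates the pattern: `retry_with_backoff` is a made-up helper, and the command passed to it (a health check, a reconnect script) depends on your setup.

```shell
# Run a command until it succeeds, doubling the wait after every failure.
# Usage: retry_with_backoff MAX_ATTEMPTS FIRST_DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 once MAX_ATTEMPTS have failed.
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, `retry_with_backoff 6 1 nc -z localhost 8888` would wait for a restarted KMS to accept connections again, assuming `nc` is installed and KMS listens on its default WebSocket port 8888. As noted further down the thread, this only shortens the outage; it does not prevent the crash.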
Message has been deleted

Jorge Maiquez

Mar 31, 2018, 12:36:26 AM
to kur...@googlegroups.com
Hi Ankit,

Monit looks like a great tool, thanks for sharing. But unfortunately, monitoring isn’t the problem in this case. 

The problem is that KMS/gstreamer will seem to be working fine one moment, and then it will throw this error in the middle of a 400 user session, for example. Even if we can detect that it is about to go down (CPU pegged at 100%, etc), it doesn’t help much if a large customer session is already in progress. The result will be an interruption of service and an unhappy customer.

Really, what we’re looking for is some kind of “best practice” guideline that will allow us to minimize the risk of the error occurring, until the Kurento team has solved the root cause of the error.

In the blog post from March 22nd, the roadmap lists the following item:
- “Update GStreamer and several other underlying support libraries to their latest versions.”

Will that solve this problem? And is that already included in 6.7.1?

Any additional info would be greatly appreciated.

Thanks!
Jorge

-- 
You received this message because you are subscribed to a topic in the Google Groups "kurento" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/kurento/_rf1ANq5Cm8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to kurento+u...@googlegroups.com.

To post to this group, send email to kur...@googlegroups.com.
Visit this group at https://groups.google.com/group/kurento.

Micael Gallego

Mar 31, 2018, 6:06:52 AM
to kur...@googlegroups.com
Hi Jorge, 

We have detected some problems in the libraries of the Trusty version of KMS that lead to KMS crashes. If you are using Trusty, please update to Xenial and report back to us whether your problems are gone.

By the way, KMS 6.7.1 has not yet been updated to recent library versions, as this update requires a lot of work.

Best regards

Micael Gallego
Kurento / OpenVidu Project Lead


Jorge Maiquez

Mar 31, 2018, 6:42:00 AM
to kurento
Hi Micael,

Thanks for getting back. We are on Xenial and the problems we reported were on that version.

Are you able to reproduce the g_socket_send_message error on Trusty? If so, can you share details?

Is there anything else you can suggest for us to try?

Thanks!
Jorge


Juan Navarro

Apr 12, 2018, 11:49:21 AM
to kurento

TheV

May 25, 2018, 7:34:58 AM
to kurento
We can reliably (9/10) reproduce this crash in our performance testing. At this point crashing is the expected behavior rather than the exception. Completely unusable in production.

Juan Navarro

May 25, 2018, 10:38:28 AM
to kur...@googlegroups.com
The 3rd-party libnice library is the weak link here. The crash happens in that library, not in Kurento code. See https://github.com/Kurento/bugtracker/issues/247

Simply updating the library to latest versions (which would probably fix this issue) is not easy and straightforward for us because it makes some of our integration tests fail for some use cases. So this task has been registered in the issue tracker but delayed multiple times due to more pressing issues.

You may have success in updating the library for your use case without extra adaptation work, so I would suggest that you look into that. If you are not able to do so, or would need some help with this, you can also ask for our commercial support (which will also raise this issue's priority). Check https://doc-kurento.readthedocs.io/en/stable/business/

Regards,
Juan


> Sent: Friday, May 25, 2018 at 1:34 PM
> From: TheV <ry...@solutera.lt>
> To: kurento <kur...@googlegroups.com>
> Subject: [kurento-public] Re: Kurento crash

Mikhail Novikov

May 26, 2018, 1:31:00 AM
to kurento
I will quote that to any new clients asking to implement Kurento in
their projects...

Jorge Maiquez

May 28, 2018, 3:54:14 AM
to kurento
Hi Juan,

Can you give examples of which use cases fail for you after updating the library? It doesn't make sense for us to invest time into this if we are talking about common use cases.

Thanks,
Jorge 

Jorge Maiquez

Jun 4, 2018, 8:56:06 AM
to kurento
Any chance of getting an answer to my previous question?

We are now just weeks away from our production deployment deadline, and I need to know what our options are for working around this problem. In the current state, our biggest concern is the stability of KMS, particularly this bug.

Thanks,
Jorge

TheV

Jun 4, 2018, 11:13:08 AM
to kurento
Did some testing with different KMS versions down to 6.6.1 and while they all are affected in some form, 6.7.1 is the only one to crash with every load test (particularly during disconnects).

Paulo R. Lanzarin

Jun 4, 2018, 12:41:00 PM
to kur...@googlegroups.com
Reposting from another thread:

If you aren't afraid of ignoring integration tests and such, and also ignore the ugliness of the workaround 
(desperate times, desperate measures):
https://github.com/prlanzarin/libnice/tree/crash-fix-upstream.

This is merged with libnice upstream. I added checks for NULL gsocket occurrences and commented out an assertion regarding
ICE state transition that was aborting Kurento. I reckon the assert is there for a reason, and there's probably some underlying condition
making it fail; however, I lack the time to go deeper into that. Assertion abortions also shouldn't be used in production hehe.
If anyone digs what's the underlying condition for that btw, I'd appreciate news regarding the problem.

It's been working nice for me with heavy load sessions (~300 streams or more, sometimes). However, use at your own risk :).

s, 

Paulo. 


Jorge Maiquez

Jun 5, 2018, 5:38:24 AM
to kurento
Thanks TheV. Unfortunately, we need the latest version for other reasons.

And thanks for pointing to that thread Paulo. I think we will have to give this a shot, or at least deploy it and have it ready for a hot-swap when things go hairy in production.

Juan, I would still appreciate an answer on the use cases.

Juan Navarro

Jun 5, 2018, 7:05:58 AM
to kur...@googlegroups.com
Hi,

We are in the process of resurrecting our integration tests infrastructure and all browser-level tests will be reimplemented in a new product developed by the kurento team, ElasTest, which expands on the idea of the old ElasticRTC (bought to oblivion by Twilio). Sadly, this means that the tests we conducted while trying to upgrade libnice were manually done and are not reproducible.

The use cases that failed were related to basic browser-to-browser calls via Kurento Media Server; we found out that our own fork version of libnice contained several undocumented changes from the upstream version, and given that the older version was working mostly fine, we decided to lower this task's priority due to some other issues having more importance right then.

But now complaints about libnice crashes have risen, so I've raised this concern with the team and we've decided to dedicate some time again to try upgrading to the latest available libnice version, hoping that it will already contain fixes for all or most of its memory access errors. We're currently in the process of providing commercial support for one client, but in the second to third week of June we should be able to start working on this problem again.

Thanks for your patience; meanwhile, any help pinpointing the actual cause of these crashes could help in the integration of the new version of libnice; the comment about it being much more common in version 6.7.1 than it was in older ones is very interesting.

I'll post any news about this on the list.
Juan

Jorge Maiquez

Jun 7, 2018, 1:27:09 AM
to kurento
Hi Juan,

Thanks for the more detailed response- much appreciated. We have not been able to pinpoint the trigger for the crashes, and this is why I asked about the use cases.

I understand that you give priority to commercial projects. Unfortunately, the time frames you have given for *starting* work on libnice integration will push us past our deadline. And we can't simply continue with the status quo and hope for the best. That will simply result in us losing customers. 

It looks like Paulo's reposted suggestion is currently the only option we have, and as you can imagine, it doesn't exactly fill me with confidence for a production deployment.

These kinds of bugs- especially when they are not taken care of quickly- erode confidence in the production capabilities of any platform. I suppose complaints about libnice crashes have increased because more and more folks out there are considering using KMS in a serious capacity. All the more reason to put this bug to bed quickly- before confidence is lost.

I'm sure none of this is news to you, and I look forward to updates on this problem.

All the best,
Jorge

Paulo R. Lanzarin

Jun 7, 2018, 10:04:16 AM
to kur...@googlegroups.com

Given the commotion in this topic I'll try to clean up the fix in the solution I posted before. I'll do that in my free time though, starting this weekend, so no promises. If anyone wants to join the effort, please contact me or keep the discussion going in this thread. However, I'll just find the underlying cause and turn it into a real fix instead of a workaround. I really can't give a damn about integration tests because I don't really know what those are about, nor do I know where those tests are.

Juan, if you can link me to the test procedures I'd be happy to have a look. Also, feel free to take a look at the branch, because the breaking points inside libnice were pinpointed and that's a hell of a start.

Jorge, if you're desperate I'd recommend to try out that branch. Despite the ugliness and my bad marketing skills, it works perfectly. Haven't seen any leaks nor crashes around libnice, and we've been running it for some time now and with some high stress involved.


Jorge Maiquez

Jun 8, 2018, 8:48:26 AM
to kurento
That's awesome Paulo. I wish my coding skills were up to scratch, but as it stands, I'd just be a total liability on that front. I will however contribute by reporting our experiences back to this thread. We'll get started next week. Thanks for the initiative!

Alexandru Duzsardi

Jun 9, 2018, 6:04:58 AM
to kurento
@Jorge
Here are the build instructions for Ubuntu 16.04, if you want to give it a spin with Paulo's fix:

# as root or user with sudo privileges
export DISTRO="xenial"
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 5AFA7A83
sudo tee "/etc/apt/sources.list.d/kurento.list" >/dev/null <<EOF
# Kurento Media Server - Release packages
deb [arch=amd64] http://ubuntu.openvidu.io/6.7.1 $DISTRO kms6
EOF
sudo apt-get update
sudo apt-get build-dep libnice
sudo apt-get install fakeroot libgstreamer1.5-dev libgstreamer-plugins-base1.5-dev gobject-introspection libgirepository1.0-dev libgnutls28-dev

# as a regular user
mkdir BUILD
cd BUILD
apt-get source libnice
mv libnice-0.1.13/ libnice-0.1.13_ubu
git clone https://github.com/prlanzarin/libnice libnice-0.1.13
cd libnice-0.1.13
git checkout crash-fix-upstream
./autogen.sh
dpkg-buildpackage -rfakeroot -uc -b

You will end up with a few .deb packages in the parent BUILD directory:
gir1.2-nice-0.1_0.1.13-3_amd64.deb
gstreamer1.0-nice_0.1.13-3_amd64.deb
gstreamer1.5-nice_0.1.13-3_amd64.deb
libnice10_0.1.13-3_amd64.deb
libnice-dbg_0.1.13-3_amd64.deb
libnice-dev_0.1.13-3_amd64.deb
libnice-doc_0.1.13-3_all.deb

Of these, I guess you only need to install libnice10_0.1.13-3_amd64.deb (maybe uninstall the version from the apt repositories first).
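
Installing the result could look roughly like the sketch below. It assumes the package file name from the build output above and that the KMS service is called `kurento-media-server` on your box (worth checking first). It defaults to a dry run that only prints the commands; `install_patched_libnice` and `DRY_RUN` are illustrative names, not part of any Kurento tooling.

```shell
# Print (default) or run the steps to install a locally built libnice package.
# Usage: install_patched_libnice PATH_TO_DEB            -> dry run, prints commands
#        DRY_RUN=0 install_patched_libnice PATH_TO_DEB  -> actually runs them
install_patched_libnice() {
  deb=$1
  run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$*"
    else
      "$@"
    fi
  }
  run sudo dpkg -i "$deb"                         # installs over the stock libnice10
  run sudo apt-mark hold libnice10                # keep apt from upgrading over it
  run sudo service kurento-media-server restart   # assumed service name
}
```

The `apt-mark hold` step is there because a later `apt-get upgrade` could otherwise pull the repository version back in; `sudo apt-mark unhold libnice10` undoes it.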

Jorge Maiquez

Jun 11, 2018, 7:11:00 AM
to kurento
Thank you Alexandru- much appreciated. 

I'll ask someone from my team- with capabilities greater than me- to confirm here whether or not this worked for us.

Alexandru Duzsardi

Jun 11, 2018, 10:08:30 AM
to kurento
Do you guys know if these messages are related to this issue?

Jun 11 14:00:24 dev1-kurento2 kurento[11369]: (kurento-media-server:11369): GStreamer-CRITICAL **: gst_mini_object_unlock: assertion '(state & access_mode) == access_mode' failed
[the same line is logged 13 times in total, all at 14:00:24]
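
When the same CRITICAL floods the log like this, a count per distinct message is easier to read and to compare between the stock and the patched build. A small sketch, assuming syslog-style lines as above; `count_repeats` is a made-up name:

```shell
# Collapse log lines from stdin into "COUNT MESSAGE", most frequent first.
# The leading "Mon DD HH:MM:SS host " syslog fields are dropped so that the
# same message logged at different times is counted together.
count_repeats() {
  sed -E 's/^[A-Z][a-z]{2} +[0-9]+ [0-9:]+ [^ ]+ //' | sort | uniq -c | sort -rn
}
```

Something like `grep CRITICAL /var/log/syslog | count_repeats` would then give one line per distinct message (the log path is an assumption). On a test box only, GLib's `G_DEBUG=fatal-criticals` environment setting makes the process abort at the first CRITICAL, which can help to get a backtrace of the caller.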


Also, do you guys have a way to reproduce the crash with some load tests or something?
We tried, but it doesn't seem to react to load or anything else... it just crashes randomly, whenever it wants.
Thanks!

Paulo R. Lanzarin

Jun 11, 2018, 11:11:51 AM
to kur...@googlegroups.com
Hey Alexandru,

Thanks for posting the build instructions. I forgot about that completely.
Those assertions are certainly something to be looked at. I can't actually reproduce them yet,
but I'll skim through the code for the glib lock routine that might be throwing that. You're seeing
that with the stock libnice packages?

Regarding reproducing the crash: yeah, it's not deterministic. But we've been able to make it crash
based on load in a WebRTC-only environment. With the stock packages, every time it gets near
~140 streams it crashes, and we've assembled a test framework to reproduce it. It's a videoconferencing
app though, and the stress scripts are tightly coupled with our app, so I think there's no use in sharing them here
(and I don't think I can hehe).

Have you tried the workaround? Did it work for you in any way?

s, 

Paulo.


Alexandru Duzsardi

Jun 11, 2018, 11:55:18 AM
to kurento

Yes, those messages are with the stock libnice package.

Oh, thanks for the info. We actually tried it the other way around: one streamer and ~100 viewers.
We installed the patched version on some servers, but we didn't really test it, since we couldn't make the other version crash either, or it crashed very rarely and randomly, at least with the way we tested the load :)
We'll try again tomorrow with more streamers and viewers, and maybe then we will have some actual data to compare.
Thank you anyway for the work you've done.

Juan Navarro

Jun 12, 2018, 6:57:02 AM
to kur...@googlegroups.com
Hi,
Thank you very much for the work done here. I'm following this thread closely and, if the results are satisfactory, I agree to apply these changes as a contingency measure in our copy of libnice while we work on upgrading it from upstream.

Jorge Maiquez

unread,
Jun 12, 2018, 7:10:52 AM6/12/18
to kurento
FYI, we managed to build without any issues (with some slight mods to the instructions). 

The next step is to deploy it in the only environment where we have seen the crash to date, hopefully tomorrow.

>> everytime it gets near ~140 streams it crashes

That's awesome info. We'll try to run this simulation on both the stock and the branch. I'll report all news/findings here.

Thanks guys!

Alexandru Duzsardi

unread,
Jun 14, 2018, 4:41:31 AM6/14/18
to kurento
Applied the patched libnice yesterday on a test environment; we haven't had any Kurento crashes so far with the tests we run.
We will do more tests and see what happens.

# lsb_release -a
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:    16.04
Codename:    xenial

# kurento-media-server -v
Kurento Media Server version: 6.7.1
Found modules:
    'core' version 6.7.1
    'elements' version 6.7.1
    'filters' version 6.7.1

# dpkg -L libnice10 | grep so
/usr/lib/x86_64-linux-gnu/libnice.so.10.7.0
/usr/lib/x86_64-linux-gnu/libnice.so.10

Alexandru Duzsardi

unread,
Jun 20, 2018, 2:25:18 AM6/20/18
to kurento
The libnice error seems to have gone away with the patched build, but from time to time I see another crash happening.
Any ideas what's causing it?

kurento[31894]: Segmentation fault (thread 139745719990016, pid 31894)
kurento[31894]: Stack trace:
kurento[31894]: [sigc::internal::signal_emit1<void, kurento::ElementDisconnected, sigc::nil>::emit(sigc::internal::signal_impl*, kurento::ElementDisconnected const&)]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libkmscoreimpl.so.6:0x10AD69
kurento[31894]: [kurento::MediaElementImpl::mediaFlowInStateChange(int, char*, KmsElementPadType)]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libkmscoreimpl.so.6:0x10276F
kurento[31894]: [virtual thunk to kurento::MediaElementImpl::getGstreamerDot()]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libkmscoreimpl.so.6:0xF9989
kurento[31894]: [g_closure_invoke]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0:0xFFA5
kurento[31894]: [g_signal_handler_disconnect]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0:0x21FC1
kurento[31894]: [g_signal_emit_valist]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0:0x2AD5C
kurento[31894]: [g_signal_emit]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0:0x2B08F
kurento[31894]: [check_if_flow_media]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libkmsgstcommons.so.6:0x1F554
kurento[31894]: [gst_mini_object_steal_qdata]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6C29B
kurento[31894]: [g_hook_list_marshal]
kurento[31894]: /lib/x86_64-linux-gnu/libglib-2.0.so.0:0x3A904
kurento[31894]: [gst_mini_object_steal_qdata]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6AAFB
kurento[31894]: [gst_flow_get_name]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6E98B
kurento[31894]: [gst_pad_push]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x76533
kurento[31894]: [gst_proxy_pad_chain_default]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x5F5E3
kurento[31894]: [gst_flow_get_name]
kurento[31894]: /usr/lib/x86_64-linux-gnu/libgstreamer-1.5.so.0:0x6E5CF
systemd[1]: kurento.service: Main process exited, code=killed, status=6/ABRT
systemd[1]: kurento.service: Unit entered failed state.
systemd[1]: kurento.service: Failed with result 'signal'.
systemd[1]: kurento.service: Service hold-off time over, scheduling restart.
systemd[1]: Stopped Kurento Media Server daemon.
systemd[1]: Started Kurento Media Server daemon.

Paulo R. Lanzarin

unread,
Jun 20, 2018, 5:03:47 PM6/20/18
to kur...@googlegroups.com
Hey Alexandru,

Thanks for the report. That's odd, I haven't seen that. I'd like to carry out some tests to identify whether
that's something caused by the fix or a bug that already exists upstream.
Could you clarify a few things: what's the setup that caused that error (nature of endpoints,
number of users)? Are you able to reproduce it reliably? If so, can you share steps so I can
investigate it?
I also see that you're trying to generate a gstreamer dot diagram, is that right? Could you try
disabling that and see if the error persists?

Also, would you care to reproduce it and send me a full media server log besides the error log?

Thanks for trying it out.

s,

Paulo.


Juan Navarro

unread,
Jun 20, 2018, 5:38:37 PM6/20/18
to kur...@googlegroups.com
Hi Alexandru,
please also install these packages in your machine, so next time the segmentation fault happens, we'll have file name and line number to look for in the code:

sudo apt-get install libglib2.0-0-dbg libgstreamer1.5-0-dbg kms-core-dbg

Alexandru Duzsardi

unread,
Jun 25, 2018, 2:38:06 AM6/25/18
to kurento
@Paulo I have no idea what getGstreamerDot is or how to disable it, and we are not able to reproduce the crash; it just happens randomly.
One-to-many video streaming; it doesn't seem to matter how many users are connected.

Another thing: it seems I've missed some step in building the libnice library, because apt complains that it's broken:
The following packages have unmet dependencies:
 kms-elements : Depends: libnice10 (>= 0.1.13.1.xenial~20170725160546.81.eebfdab) but 0.1.13-3 is installed
E: Unmet dependencies. Try using -f.


Any idea how to solve it? I can't install the libraries with the debug symbols left in.

Alexandru Duzsardi

unread,
Jun 25, 2018, 5:35:57 AM6/25/18
to kurento
solved the dependency issue by incrementing the build version to 0.1.13.2
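
The `Depends:` error above comes down to Debian version ordering: `0.1.13-3` sorts below `0.1.13.1.xenial~20170725160546.81.eebfdab`, while `0.1.13.2` sorts above it. A minimal sketch for checking this before installing a locally built package, using `dpkg --compare-versions`; the helper name `satisfies_min` is made up for illustration:

```shell
#!/bin/sh
# satisfies_min is a hypothetical helper: it succeeds when version $1 is
# greater than or equal to version $2 under Debian version ordering.
satisfies_min() {
    dpkg --compare-versions "$1" ge "$2"
}

# Minimum version required by kms-elements, copied from the apt error above.
required='0.1.13.1.xenial~20170725160546.81.eebfdab'

for candidate in '0.1.13-3' '0.1.13.2'; do
    if satisfies_min "$candidate" "$required"; then
        echo "$candidate satisfies the kms-elements dependency"
    else
        echo "$candidate is too old for the kms-elements dependency"
    fi
done
```

Bumping the version of a local build only satisfies the package manager; it says nothing about whether the build contains the same patches as the official package.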

Abhishek Mishra

unread,
Jun 25, 2018, 5:41:23 AM6/25/18
to kur...@googlegroups.com
Installing the dependency manually may resolve your problem:

sudo apt-get install libnice10


Sergiy Kukunin

unread,
Jul 26, 2018, 3:46:58 AM7/26/18
to kurento
Any progress on this? Half a year has passed since this was reported (in February), and no solution is available.

I tried the patched libnice lib, but got another crash: Segmentation fault.

We can't go live because of this bug. It would be very disappointing if we needed to switch away from Kurento.

Juan Navarro

unread,
Aug 13, 2018, 4:10:49 PM8/13/18
to kurento

libnice is a third-party library that implements ICE connectivity, but the project seems to be understaffed. If you benefit in any form from it (directly or indirectly by using Kurento), then consider contributing in any way you can, as they have a good number of issues pending to be solved. One crash that has affected users of Kurento before seems to be well known and tracked here: https://gitlab.freedesktop.org/libnice/libnice/issues/20


We've been working at Kurento on the issue of this thread, and found out that the latest development branch of libnice seems to work better. In our tests, both versions 0.1.13 and 0.1.14 crashed, but the development branch for 0.1.15 (or as libnice creators like to put it, "0.1.14.1") is currently working pretty well.


It would help a lot if we could confirm with more people if this improvement is only due to our specific test environment, or if this latest version of libnice does actually solve problems that are being encountered by KMS users.


I've prepared Debian package files from upstream libnice commit 090d3dba, the latest as of this week. If you have a staging server where you could try out this version, it would help us all to know whether this version is a good candidate to be included in the upcoming release of Kurento 6.8.

Installation steps in a clean Ubuntu 16.04 (Xenial) server:

  1. Download the experimental packages:
    https://www.dropbox.com/sh/525depzmhj2vt47/AADcgc4o_QwjcZpaBlQWqgspa?dl=1

  2. Install the media server:

    sudo apt-get update
    sudo apt-get install kurento-media-server
    
  3. Install the debug symbols. These will provide needed information in case of a crash. Run the apt-get steps to install all -dbg symbols, as explained in https://doc-kurento.readthedocs.io/en/latest/user/troubleshooting.html#media-server-crashed

  4. Install the experimental version of libnice:

    unzip libnice-0.1.15-snapshot.zip
    sudo dpkg -i ./*.*deb
    sudo apt-get install -f
    sudo dpkg -i ./*.*deb
    
  5. Test your use case again and let us know if it still fails with the same issue.


Also remember that this bug has a tracking issue here: https://github.com/Kurento/bugtracker/issues/247

Thanks and regards,
Juan

Jorge Maiquez

unread,
Aug 15, 2018, 1:47:46 AM8/15/18
to kurento
We will install this on an internal server and will let you know if we still encounter issues.

Do you have a rough release date for KMS 6.8?

Jorge Maiquez

unread,
Sep 10, 2018, 6:37:31 AM9/10/18
to kurento
Hi Juan,

We had a crash on one of our production deployments last Friday, and the server was using libnice-0.1.15.

We are currently following your instructions here, in an attempt to have better logs the next time.

But the fact that we are beta testing in a production environment is obviously not good. I don't know how many of these we can experience before we start losing customers.

Are you guys making any progress on your end? We may need to start making fundamental changes soon, but I would prefer to stick with kurento given all the time we have invested already.

How about the last suggestion here:

Please keep us posted on any progress you have made.

Thanks,
Jorge

Ayaan

unread,
Sep 12, 2018, 10:25:08 PM9/12/18
to kurento
In high-load situations, Kurento 6.7.2 usually crashes. We have hundreds of users in active sessions every hour and Kurento cannot keep up with the load. The servers are very powerful, with 40 cores, 128 GB of memory and a 10 Gbps pipe. However, the culprit seems to be Kurento.

Error log as follows:

(kurento-media-server:15068): GLib-GIO-CRITICAL **: g_socket_send_message: assertion 'G_IS_SOCKET (socket)' failed

(kurento-media-server:15068): GLib-CRITICAL **: g_error_free: assertion 'error != NULL' failed
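
When reporting messages like the two above, it helps to say how often each one occurs. A small sketch that tallies GLib CRITICAL lines per distinct message; the helper name `count_criticals` and the log file name in the usage note are assumptions, not part of KMS:

```shell
#!/bin/sh
# count_criticals is a hypothetical helper: for the log file given as $1 it
# prints "<count> <log domain> <message>" for every distinct GLib CRITICAL
# line, most frequent first.
count_criticals() {
    # Keep only the log domain and the message, dropping the "(process:pid): " prefix.
    sed -n 's/^.*): \([A-Za-z-]*-CRITICAL\) \*\*: \(.*\)$/\1 \2/p' "$1" |
        sort | uniq -c | sort -rn |
        awk '{ $1 = $1; print }'   # normalize the padding added by uniq -c
}

# Usage (the file name is an example):
#   count_criticals kms-console.log
```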


When the above error pops up, users cannot hear each other. Kurento is great for 1-on-1, low-volume sessions but not great for high-volume group video conferencing with the current GLib issues. It is often embarrassing when paying customers start putting the blame on the platform and ask us for a refund. Since the entire solution is built on Kurento, it is quite difficult to move away to Janus or Jitsi, but we are praying that the Kurento team will answer our prayers and push a newer version with the libnice fixes soon.

Juan Navarro

unread,
Sep 13, 2018, 5:26:23 AM9/13/18
to kur...@googlegroups.com
Well, just like users wrongly put blame on the platform, platforms wrongly put blame on Kurento. This is a libnice bug: https://gitlab.freedesktop.org/libnice/libnice/issues/33

But let's not pass blame from one place to another; the libnice project is currently used (directly or indirectly) by a lot of projects and companies which benefit from free-as-in-beer software but then don't contribute back a dime (in terms of money or software contributions). We at Kurento are a pretty small team, but libnice itself is a VERY understaffed project and they have already stated that they don't have the resources needed to solve this issue (https://gitlab.freedesktop.org/libnice/libnice/issues/33#note_24942).

So, is your core business on the verge of falling due to a bug in one of your supporting libraries? The least you could do is contribute a full-time developer dedicated to helping on this issue, or incentivize the project with 1/4th of what a big company such as Cisco would ask you for a support contract for WebRTC technologies. Then I'm sure this bug would be history in a couple of weeks, and it would still have cost you less than 0.1% of the value that these projects put on the table for you.

This reminds me (on a much smaller scale of course) of the Heartbleed disaster of OpenSSL a few years back: everyone in the world was using this library for security, and when a problem happened everybody pretended to be surprised that it was essentially maintained by some volunteers and that no players (big or small) actually supported it.

Nobody has even been able to provide a simple test scenario that the folks at libnice (or us helping them) can use to reproduce the issue and fix it... can you make one?

Jorge Maiquez

unread,
Sep 13, 2018, 6:20:45 AM9/13/18
to kurento
Hi Juan, 

I don't want to throw fuel onto this fire, because the direction this conversation is going in is unlikely to be productive. I can't speak for anyone else, but my comment was not meant as a dig. I am sincerely concerned about the future of this project- and consequently our business. I imagine you feel the same way.

We have been trying to reproduce this bug- whenever possible (we are also a small team)- since March, with no luck. It has only appeared in production so far.

Yesterday, we managed to produce a crash with DBG tools installed, on our development box, so that was exciting. We tested the reproducibility of the crash, and it was VERY reproducible. Then we realized that we were using the pre-release version of KMS that you mentioned in an earlier thread, and it turns out that this was the cause of the crash. The last stable version of KMS does not crash under the same test. Incidentally, the production server that crashed last Friday was running that pre-release version of KMS with libnice-0.1.15, so that explains why it crashed. It wasn't the gstreamer_send crash.

So back to square one. We are now in the process of deploying the latest stable version of KMS + DBG tools onto all our production servers. So the next time there is a crash- we will at least have logs to contribute.

In parallel, we are developing a load testing tool that will hopefully enable us to reproduce these crashes in the lab. I hope our efforts will see some results soon. 

If we do manage to get these crash logs, will the kurento team be able to do something with that, or are we then at the mercy of the libnice folks? Also, are you guys running any load tests to try to reproduce this bug? If so, how many users/streams have you managed to run in those tests?

Best regards,
Jorge


Ayaan

unread,
Sep 13, 2018, 8:06:16 AM9/13/18
to kurento
Hi Juan,

Yes, I do agree with you. Now that it is clear the bug is in libnice, we'll try to set up debugging and contribute some logs. As for hiring a dedicated resource from Kurento, your support process is not clear to us. In the past, we have emailed Kurento's Gmail address asking for support and never got a reply.

So, what we can do is set up a common fund using PayPal or any other supported method, for anyone willing to contribute financially and help the Kurento/libnice teams allocate resources.

Juan Navarro

unread,
Sep 14, 2018, 9:28:31 AM9/14/18
to kur...@googlegroups.com
Hi Jorge,

Sorry for the momentary burst of unproductive ranting; my response wasn't directed specifically at you or Ayaan, but was triggered by the accumulating sense of entitlement that lots of users and companies seem to have while leveraging free libraries to build their products or services, and then not contributing back in any form, despite the fact that their product wouldn't have been viable from the start if they had had to build everything from scratch (or buy a commercially licensed version from some bigger company).

We receive lots of messages each one pressing for their own interest or issue, normally without any implicit or explicit intent to collaborate, not only in the forums but also privately in our inboxes. I had read a lot about this happening in open source projects, but now I'm seeing it's totally true. But it's OK, we'll keep on with our priorities.

It would be VERY good news if you find out a way of reproducing this issue; it is after all a problem in libnice library, but you are not at the mercy of that project's maintainers. The way I see it, I own the issues that happen in my supporting libraries, libnice in this case, so I am very inclined to collaborate with them and write code if needed in order to fix any issues, especially if they are critical such as this one. I'm currently in the process of contributing a modification that would greatly improve the library performance, see https://gitlab.freedesktop.org/libnice/libnice/merge_requests/13

Some colleagues at Kurento are developing a new project called ElasTest, which will then be used to get extensive test coverage of Kurento Media Server, that is, to be able to test integration issues, performance, regressions, etc. We'll use this to push the upgrade of GStreamer from the current 1.8 to 1.14, and also to be able to troubleshoot issues such as this libnice crash; but the product is still under development and it's not yet possible to use it for this.

You've been using some pre-release versions of KMS but the issue I see here is that there have been multiple versions of these, all of them distributed informally via the bug report (https://github.com/Kurento/bugtracker/issues/247) or in this email thread. I think we need a page where I can organize all tests and provide separate download links so everybody is able to know exactly what they are installing when using these experimental versions...

Thanks for your support and your help,

Kind regards
Juan

Juan Navarro

unread,
Sep 14, 2018, 9:40:03 AM9/14/18
to kur...@googlegroups.com
Hi Ayaan,

First of all, I'd like to apologize for my rant post. As I just told Jorge, I've been accumulating all kinds of messages where users just expect their needs to be solved, without any disposition to contribute or to get their hands dirty, while this is not a commercial project and so it is provided without guarantees (although there is always the option of contracting explicit support).

The Kurento Google Group is a mailing list that intends to gather a community of users and interested people around this technology, but it's not a sure way of obtaining official support; we pay our bills via paid commercial support (https://doc-kurento.readthedocs.io/en/latest/business/index.html) and if you wrote asking for it and got no answer, that's an error which shouldn't happen again.

Regarding financial contributions to the project, that's an interesting possibility but for now the project lead hasn't considered it as an option; in any case, mind that my previous email commented that this crash is happening at libnice and it's the libnice project which needs help (https://gitlab.freedesktop.org/libnice/libnice/issues/33)

We at Kurento can help libnice by providing reproduction cases, development, or whatever they need; but if you are actually thinking of providing financial help, you should first contact Olivier Crête (https://gitlab.freedesktop.org/ocrete) to see what's his position on this. Or, by hiring Kurento via a support contract, we'll dedicate 100% of our resources to fixing this issue.

Kind regards,
Juan

Ayaan

unread,
Sep 15, 2018, 11:43:24 AM9/15/18
to kurento
Hi Juan,

No worries. I'll reach out from our business account to hire the Kurento team for some of the work we are planning to do.

Thank you!

Jorge Maiquez

unread,
Sep 18, 2018, 5:03:10 AM9/18/18
to kurento
Hi Juan,

All good. Sorry for the late response. We're dealing with a crash situation as I write this.

A production client is now repeatedly crashing KMS. We are trying to get more information, but it is proving tricky because our client is working with external speakers, and we don't control the comms.

We followed your instructions to "debug with DBG", but the error log traces look very similar to what we have already seen:

libnice:ERROR:agent.c:2342:agent_signal_component_state_change: assertion failed: (TRANSITION (DISCONNECTED, FAILED) || TRANSITION (GATHERING, FAILED) || TRANSITION (CONNECTING, FAILED) || TRANSITION (CONNECTED, FAILED) || TRANSITION (READY, FAILED) || TRANSITION (DISCONNECTED, GATHERING) || TRANSITION (GATHERING, CONNECTING) || TRANSITION (CONNECTING, CONNECTED) || TRANSITION (CONNECTED, READY) || TRANSITION (READY, CONNECTED) || TRANSITION (FAILED, CONNECTING) || TRANSITION (FAILED, GATHERING) || TRANSITION (DISCONNECTED, CONNECTING))
Aborted (thread 140084124813056, pid 3160)
Stack trace:
[gsignal]
/lib/x86_64-linux-gnu/libc.so.6:0x35428
[abort]
/lib/x86_64-linux-gnu/libc.so.6:0x3702A
[g_assertion_message]
/build/glib2.0-b4FPyK/glib2.0-2.48.2/./glib/gtestutils.c:2429
[g_assertion_message_expr]
/build/glib2.0-b4FPyK/glib2.0-2.48.2/./glib/gtestutils.c:2453
[agent_signal_component_state_change]
/opt/libnice/agent/agent.c:2353
[priv_map_reply_to_conn_check_request]
/opt/libnice/agent/conncheck.c:3420
[agent_recv_message_unlocked]
/opt/libnice/agent/agent.c:3886
[component_io_cb]
/opt/libnice/agent/agent.c:5181
[socket_source_dispatch]
/build/glib2.0-b4FPyK/glib2.0-2.48.2/./gio/gsocket.c:3543
[g_main_dispatch]
/build/glib2.0-b4FPyK/glib2.0-2.48.2/./glib/gmain.c:3157
[g_main_context_iterate]
/build/glib2.0-b4FPyK/glib2.0-2.48.2/./glib/gmain.c:3840
[g_main_loop_run]
/build/glib2.0-b4FPyK/glib2.0-2.48.2/./glib/gmain.c:4033
[gst_nice_src_create]
/workspace/gst/gstnicesrc.c:292
[gst_base_src_get_range]
/opt/gstreamer/libs/gst/base/gstbasesrc.c:2465
[gst_base_src_loop]
/opt/gstreamer/libs/gst/base/gstbasesrc.c:2737
[gst_task_func]
/opt/gstreamer/gst/gsttask.c:346
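
Crash traces that KMS prints to a terminal are wrapped in ANSI color escape sequences, which then clutter traces pasted into bug reports. A minimal sketch of a filter that removes them before sharing a log; the helper name `strip_ansi` and the file names in the usage note are assumptions:

```shell
#!/bin/sh
# strip_ansi is a hypothetical helper: it reads standard input and removes
# ANSI SGR color sequences (ESC, "[", digits and semicolons, "m").
strip_ansi() {
    esc=$(printf '\033')               # the literal ESC character
    sed "s/${esc}\[[0-9;]*m//g"
}

# Usage (file names are examples):
#   strip_ansi < kms-crash.log > kms-crash.clean.log
```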


Please respond ASAP if there is anything else we should be doing. We will most likely lose this client, and we won't have this opportunity again.

Thanks,
Jorge

Andrew Costa

unread,
Sep 18, 2018, 9:26:48 AM9/18/18
to kur...@googlegroups.com
Looks like an issue with libnice. We had similar issues and they have gone away ever since we upgraded to the latest version of libnice.


Juan Navarro

unread,
Sep 18, 2018, 10:15:27 AM9/18/18
to kur...@googlegroups.com
Hi Jorge,

This assertion message seems exactly the same as reported here:
https://github.com/Kurento/bugtracker/issues/268

That's a different issue, caused by an unexpected state change in libnice, so let's track them separately. Please post there any new info that I ask for in this email.

I'll try to propose a quick solution to help solve the issue with your client; from the line number shown in the error (agent.c:2342), I see this client has installed the latest development version of libnice (from the pre-release repositories).

A good first step would be to bring this client back to the latest stable version of the package (0.1.13.1.xenial~20170725160546.81.eebfdab in Xenial), from the release repos (check the docs if in doubt: https://doc-kurento.readthedocs.io/en/latest/user/installation.html#local-installation). The old, "stable" 0.1.13 version has a custom modification from a previous Kurento developer that removes the assertion (it replaces "g_assert" with the more permissive "g_warn_if_fail"). This modification was lost when updating to the latest libnice upstream, as I'm seeing now. I'll make sure this makes its way back to the experimental versions!

In any case that was only a crutch to quickly work around the issue of unexpected state transitions; the fact is that libnice _is_ doing some unforeseen state transition and that assert is catching it.

If it's possible, please before downgrading the client try to get libnice debug logs (you'll need to launch KMS on console, the "service init" won't log libnice messages yet). A command such as this should do it:

{
    sudo service kurento-media-server stop
    source /etc/default/kurento-media-server
    export GST_DEBUG="${GST_DEBUG:-3},kmsiceniceagent:5,kmswebrtcsession:5,webrtcendpoint:4"
    export G_MESSAGES_DEBUG="libnice,libnice-stun"
    export NICE_DEBUG="$G_MESSAGES_DEBUG"
    /usr/bin/kurento-media-server
}

libnice, when running with debug enabled, outputs a message such as this:

"Agent %p : stream %u component %u STATE-CHANGE %s -> %s."

(with placeholders filled with actual values). We'd benefit from knowing what's the state change that's causing this crash.
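
Once the console output has been saved to a file, the STATE-CHANGE lines can be tallied per transition, which makes an unusual transition easy to spot. A minimal sketch that relies only on the message format quoted above; the helper name `tally_state_changes` and the log file name are assumptions:

```shell
#!/bin/sh
# tally_state_changes is a hypothetical helper: for the log file given as $1
# it prints "<count> <from> -> <to>" for each distinct libnice STATE-CHANGE
# transition, most frequent first.
tally_state_changes() {
    # Extract the two state names; the trailing period of the message is dropped.
    sed -n 's/^.*STATE-CHANGE \([^ ]*\) -> \([^ .]*\).*$/\1 -> \2/p' "$1" |
        sort | uniq -c | sort -rn |
        awk '{ $1 = $1; print }'   # normalize the padding added by uniq -c
}

# Usage (the file name is an example). Capture the console output first:
#   /usr/bin/kurento-media-server 2>&1 | tee kms-console.log
# then, after the crash:
#   tally_state_changes kms-console.log
```

Transitions that appear only once, or only right before the abort, are the ones worth quoting in the bug report.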

Also for completeness, if possible, repeat the same procedure with the older version of libnice; it won't crash, but will still print a warning with the exact same format. We need to know which state changes are not expected by libnice (see here: https://gitlab.freedesktop.org/libnice/libnice/blob/090d3dba7abdb1a997b0126b59846dba69a4b975/agent/agent.c#L2360) to then tell the libnice maintainer about this issue (I'll probably open a bug report on your behalf, unless you prefer to do it yourself).

Thanks,
Juan

Jorge Maiquez

unread,
Sep 18, 2018, 11:01:03 AM9/18/18
to kurento
We are running tests with the customer right now, and we'll run KMS in the console as you suggested. I will report back our findings. I hope we can reproduce this again. 

Jorge Maiquez

unread,
Sep 18, 2018, 12:42:01 PM9/18/18
to kurento
Hi Juan,

I left comments here:

Feel free to open bug reports on my behalf- much appreciated.

FYI, we maintain the servers for all our customers, and this particular customer is an event organizer that generally does not have close contact with the end users- as was the case today with the external speaker. 

We have now configured her servers with:
KMS 6.7.2~19.g181284d
libnice 0.1.13.1.xenial~20170725160546.81.eebfdab

We will be monitoring the rest of the events this week, and wherever possible, we will do so by launching KMS from the console. As I mentioned in that other thread, libnice 0.1.13 still has weird behavior, which requires all broadcasters to restart their streams when the "pseudo-crash" happens.

I'm not sure how much longer our customer will accept this. She's being really great now, but I don't know how much longer this will last- I might not be so patient in her situation. 

I hope today's logs help you troubleshoot the unexpected state change problems. And if we're lucky, we may catch the segmentation fault from this thread, too. Fingers crossed.

All the best,
Jorge

Ayaan

unread,
Sep 20, 2018, 8:37:32 AM9/20/18
to kurento
I am using trusty version:

Kurento Media Server version: 6.7.2~19.g181284d
Found modules:
'core' version 6.7.2
'elements' version 6.7.2
'filters' version 6.7.2~6.g298ea12

In heavy load situations, audio/video stops working. The Kurento service is running and did not crash this time. When we see "handling timeout failed" in the logs, it means there is no audio/video. I do not see SDP happening either. When the Kurento service is restarted, it starts working again.

2018-09-20 00:13:58,300259 22424 [0x00007ef636cb5700] warning kmsutils                  kmsutils.c:1390 kms_utils_depayloader_adjust_pts_out() <rtpopusdepay15>  Fix PTS not strictly increasing, last: 1:36:50.872341157, current: 1:36:50.845341157, fixed = last + 1: 1:36:50.873341157
2018-09-20 00:13:58,309708 22424 [0x00007ef636cb5700] warning kmsutils                  kmsutils.c:1390 kms_utils_depayloader_adjust_pts_out() <rtpopusdepay15>  Fix PTS not strictly increasing, last: 1:36:50.873341157, current: 1:36:50.860341158, fixed = last + 1: 1:36:50.874341157
2018-09-20 00:14:02,440182 22424 [0x00007efd8beb4700]    info KurentoWebSocketTransport WebSocketTransport.cpp:263 keepAliveSessions()  Keep alive 0530b467-e243-46a9-82ad-ede31d3ccfc4
2018-09-20 00:14:13,590222 22424 [0x00007efb777fe700] warning dtlsconnection            gstdtlsconnection.c:312 handle_timeout() <GstDtlsConnection@0x7efbf8002510>  handling timeout failed
2018-09-20 00:14:13,590332 22424 [0x00007ef636cb5700] warning dtlsconnection            gstdtlsconnection.c:312 handle_timeout() <GstDtlsConnection@0x7efd6c145300>  handling timeout failed
2018-09-20 00:14:13,590460 22424 [0x00007ef607c57700] warning dtlsconnection            gstdtlsconnection.c:312 handle_timeout() <GstDtlsConnection@0x7efd6c143370>  handling timeout failed
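
To correlate these warnings with load, it can help to count them per minute. A minimal sketch that assumes the log line layout shown above (date in the first field, time in the second); the helper name `dtls_timeouts_per_minute` and the log file name are assumptions:

```shell
#!/bin/sh
# dtls_timeouts_per_minute is a hypothetical helper: for the KMS log file
# given as $1 it prints "<date> <HH:MM> <count>" for each minute that has
# "handling timeout failed" warnings.
dtls_timeouts_per_minute() {
    awk '/handling timeout failed/ { n[$1 " " substr($2, 1, 5)]++ }
         END { for (k in n) print k, n[k] }' "$1" |
        sort   # awk iterates arrays in no particular order
}

# Usage (the file name is an example):
#   dtls_timeouts_per_minute kms.log
```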

Juan Navarro

unread,
Oct 18, 2018, 11:29:17 AM10/18/18
to kur...@googlegroups.com
Hi all,

I have imported the latest upstream commits in libnice, in order to put our fork up-to-date with the latest code from the original project. New packages are built and ready in the Pre-Release repositories and can be installed with a simple apt-get install.

You can check if the latest version of libnice is installed in your system if you run this command:

apt-cache policy libnice10

and the result looks like this:

libnice10:
  Installed: 0.1.15-1kurento1~20181018[...]


Note the 20181018 part is a timestamp; the first numbers mean 2018-10-18, i.e. the packages have been just built today.
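
The timestamp check can also be scripted. A minimal sketch that pulls the build date out of a version string, assuming the date is the eight digits right after the last "~" as in the output above; the helper name `build_date` is made up for illustration:

```shell
#!/bin/sh
# build_date is a hypothetical helper: it prints the YYYY-MM-DD build date
# encoded after the last "~" of a package version string, or nothing if the
# string has no such timestamp.
build_date() {
    printf '%s\n' "$1" |
        sed -n 's/^.*~\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\).*$/\1-\2-\3/p'
}

# Usage on a live system (dpkg-query prints the installed version):
#   build_date "$(dpkg-query -W -f='${Version}' libnice10)"
build_date '0.1.15-1kurento1~20181018'
```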

This latest version is a mirror of current upstream code. Besides, some old Kurento-specific patches have been cherry-picked on top of the original code (they can be found under debian/patches). This includes removing the infamous assert that has been affecting several users; it has been replaced with a safer warning ("safer" in the sense that crashing is always *the worst thing to do*, rendering assert() totally undesirable for production builds). Still, if you notice that this new warning is being printed in the log files, please file a bug report!

Hopefully, working with the latest upstream code will make it easier to find the cause of the crash, so we can help upstream fix the problem.

Remember that the tracking bug report is #247 here: https://github.com/Kurento/bugtracker/issues/247
and upstream bug report is #33 here: https://gitlab.freedesktop.org/libnice/libnice/issues/33