Hi! I want to implement a real-time visual indication of the active speaker in my
conference application, based on WebRTC and FreeSWITCH.
FreeSWITCH can detect speaker activity and publish events through its API interfaces.
So you can capture those events and push them to clients over some transport,
for example over the same connection that is used for the signaling protocol.
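To make that concrete, here is a minimal client-side sketch. It assumes the signaling server re-packages FreeSWITCH's conference `start-talking`/`stop-talking` member events as JSON over the signaling WebSocket; the exact JSON shape is my assumption about how you'd forward them, not a FreeSWITCH wire format.

```typescript
// Assumed shape of the event as forwarded by a hypothetical signaling
// server (FreeSWITCH itself emits conference::maintenance events with
// Action: start-talking / stop-talking; this JSON is an assumed
// re-packaging of those, not a fixed wire format).
interface TalkingEvent {
  action: "start-talking" | "stop-talking";
  memberId: string;
}

// Pure state update: returns the new set of currently-talking members.
function applyTalkingEvent(
  talking: Set<string>,
  ev: TalkingEvent
): Set<string> {
  const next = new Set(talking);
  if (ev.action === "start-talking") {
    next.add(ev.memberId);
  } else {
    next.delete(ev.memberId);
  }
  return next;
}

// Wiring it to the signaling WebSocket in the browser would look like:
// let talking = new Set<string>();
// signalingSocket.addEventListener("message", (msg) => {
//   talking = applyTalkingEvent(talking, JSON.parse(msg.data));
//   document.querySelectorAll(".tile").forEach((tile) => {
//     tile.classList.toggle("talking", talking.has(tile.id));
//   });
// });
```

The UI part is trivial; the problems described below are all about how the event reaches this handler.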
This approach is simple, but it does not work well in real life because:
* you will experience delays in event delivery (the signaling transport is usually
not suited to real-time data: TCP-based, subject to head-of-line blocking, etc.)
* the indicator will drift out of sync with what you hear (again, because audio and
events travel over different channels with very different latency characteristics).
In my experience, a common working solution for this task is to send the
"who is talking" data together with (or close to) the media. For example, we
could extend RTCP a bit, or use WebRTC data channels as the transport.
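If the server did support data channels, the receiving side could be as small as the sketch below. The one-byte framing (high bit = talking flag, low 7 bits = member index) and the `setIndicator` helper are my inventions for illustration; any encoding that fits in a single message would do.

```typescript
// Hypothetical compact framing: one byte per event.
// High bit = talking flag, low 7 bits = member index.
// (This encoding is invented for the sketch, not a standard.)
function decodeTalkingByte(
  b: number
): { memberIndex: number; talking: boolean } {
  return { memberIndex: b & 0x7f, talking: (b & 0x80) !== 0 };
}

// Browser side, assuming the server negotiated a data channel for this:
// pc.ondatachannel = (e) => {
//   e.channel.binaryType = "arraybuffer";
//   e.channel.onmessage = (msg) => {
//     for (const b of new Uint8Array(msg.data)) {
//       const { memberIndex, talking } = decodeTalkingByte(b);
//       setIndicator(memberIndex, talking); // hypothetical UI update
//     }
//   };
// };
```

Since data channels are multiplexed with the media over the same peer connection, events delivered this way stay much closer to the audio than anything sent over a separate signaling path.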
Unfortunately, after a quick look at this approach I ran into the following issues:
* there is no easy way to work with RTCP on the client side (browsers do not expose
raw RTCP to JavaScript)
* none of the popular conference server solutions (FreeSWITCH, Asterisk, etc.)
implement data channels, and I believe adding data channel support (SCTP over DTLS)
would be a pretty complicated task.
Could somebody share some experience or ideas about implementing a similar feature?
Thanks.