We are developing a WebRTC stack based on Janus as the MCU/SFU and coturn as the TURN server. The media stream is a one-way AV stream, sourced from a mobile device, and the subscriber (viewer) is on a browser. The whole MCU + TURN + HAProxy stack runs on AWS. Given this stack, our average latency from publisher to subscriber is ~400ms since the two ends are half a globe apart, which is okay for now.
Now, the issue we are facing is that on the publish end there are usually two people speaking with quite a bit of volume difference between their voices, making it hard for the listener to comprehend. So we are thinking about how to apply dynamic range compression (i.e. volume leveling) to the stream.
Now, what would be the best approach to solve this? Is it to do it through a server-side filter plugin on the Janus MCU, and if so, what would be the impact on latency? Or is it wiser to do it client-side in the browser by customizing WebRTC playback?
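In case it helps frame the client-side option: the browser already ships a compressor as part of the Web Audio API (DynamicsCompressorNode), so no changes to WebRTC playback internals are needed — the remote MediaStream can be tapped into an audio graph before it reaches the speakers, adding only a few milliseconds of processing latency. A minimal sketch of that idea; the `remoteStream` argument and the parameter values below are illustrative, not tuned recommendations:

```javascript
// Hypothetical helper: route a remote WebRTC audio stream through a
// DynamicsCompressorNode before playback, so loud and quiet speakers
// come out closer in level.
function attachCompressor(remoteStream) {
  const ctx = new AudioContext();

  // Feed the incoming MediaStream into the Web Audio graph.
  const source = ctx.createMediaStreamSource(remoteStream);

  // Standard Web Audio dynamics compressor; its settings are AudioParams.
  const compressor = ctx.createDynamicsCompressor();
  compressor.threshold.value = -40; // dB: compress signal above this level
  compressor.knee.value = 30;       // dB: soft-knee width
  compressor.ratio.value = 8;       // 8:1 gain reduction above threshold
  compressor.attack.value = 0.003;  // seconds
  compressor.release.value = 0.25;  // seconds

  // source -> compressor -> speakers
  source.connect(compressor);
  compressor.connect(ctx.destination);

  // Return the context so the caller can close() it on teardown.
  return ctx;
}
```

One caveat if this route is taken: the `<audio>`/`<video>` element attached to the remote stream should be muted, otherwise the uncompressed signal plays alongside the compressed one.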
Any pointers or hints would be greatly appreciated.
// khaled