There are good reasons to mix the video before sending it over the network, e.g. interoperability, or providing a single stream rather than several, which sometimes allows better adaptive bitrate control down to lower bandwidths. Perhaps that is why they want to mix it.
HTH
Now that you can stream from a canvas, it's possible to draw each video to a canvas and stream from the canvas over the network. I don't have code for this, but I'm sure it could work, though you might find some performance issues on slower hardware.
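For what it's worth, a rough, untested sketch of the canvas idea in TypeScript might look something like this. The names `gridLayout` and `mixToStream`, the grid arrangement, and the 30 fps default are my own assumptions for illustration; the only browser calls it relies on are `getContext("2d")`, `drawImage` and `canvas.captureStream`.

```typescript
// Rectangle, in canvas pixels, where one video tile is drawn.
interface Tile { x: number; y: number; w: number; h: number; }

// Hypothetical helper: lay out `count` tiles in a near-square grid.
function gridLayout(count: number, width: number, height: number): Tile[] {
  if (count <= 0) return [];
  const cols = Math.ceil(Math.sqrt(count));
  const rows = Math.ceil(count / cols);
  const w = width / cols;
  const h = height / rows;
  const tiles: Tile[] = [];
  for (let i = 0; i < count; i++) {
    tiles.push({ x: (i % cols) * w, y: Math.floor(i / cols) * h, w, h });
  }
  return tiles;
}

// Browser-only: repeatedly draw every video onto the canvas and
// return a MediaStream captured from that canvas.
function mixToStream(
  canvas: HTMLCanvasElement,
  videos: HTMLVideoElement[],
  fps = 30, // assumed frame rate, tune for your hardware
): MediaStream {
  const ctx = canvas.getContext("2d");
  if (!ctx) throw new Error("2D canvas context not available");
  const tiles = gridLayout(videos.length, canvas.width, canvas.height);
  // Redraw all tiles at roughly the requested frame rate.
  setInterval(() => {
    videos.forEach((video, i) => {
      const t = tiles[i];
      ctx.drawImage(video, t.x, t.y, t.w, t.h);
    });
  }, 1000 / fps);
  return canvas.captureStream(fps);
}
```

You would then add the tracks from the returned stream to your `RTCPeerConnection` with `addTrack`, as you would for a camera stream. Note that a canvas stream carries video only, so audio would have to be mixed separately. The redraw loop is where I'd expect the performance cost on slower hardware, so lowering `fps` or the canvas size is the first thing to try.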