Hi Jacek,
Thank you for sharing your opinion.
I do not think there is a perfect system when it comes to testing. That's why, instead of giving our opinion on how "complex" or "adequate" a given solution is, which would be subjective in nature and tied to one capacity and one use case, we try to document and explain the (arbitrary) choices made for a given tool, so that anybody can make an informed choice given their own use case, resources, and technical capacity. Matt's decision, in his case, makes a lot of sense, even if only from the programming-language point of view.
Janus published a paper, we published multiple peer-reviewed papers and blog posts and provided some code, so the information is publicly available to all. The Kurento team also contributed a lot to the field. One can find below a list of the most recent papers on WebRTC testing, which do differentiate between testing types; some of them we authored.
I do not think anybody mentioned, or meant, "interoperability" earlier. What was said is that you need to compare things that are comparable. Imagine you send 720p streams during the test but get 1080p streams in production; of course that would change things. This is quite an obvious example, but some more subtle cases are equally impactful. Everything that deals with packet loss or bandwidth adaptation (RED, FEC, RTX, or the resending of full frames) can end up inflating the bandwidth usage and tipping your server over the limit.
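To make that concrete, here is a rough back-of-envelope sketch in Python. All the numbers are illustrative assumptions, not measurements, and the overhead model is deliberately naive; it only shows how loss recovery can invalidate capacity figures obtained with a "clean" simulated stream.

    # Back-of-envelope: how loss recovery inflates per-stream bandwidth.
    # All numbers below are illustrative assumptions, not measurements.
    base_kbps = 2500          # nominal send rate assumed in the test
    loss_rate = 0.05          # 5% packet loss seen in production
    rtx_overhead = loss_rate  # naive RTX model: every lost packet re-sent once
    fec_overhead = 0.10       # assumed FEC/RED redundancy budget

    effective_kbps = base_kbps * (1 + rtx_overhead + fec_overhead)
    print(f"per-stream: {effective_kbps:.0f} kbps instead of {base_kbps} kbps")

    # A server sized for 1000 "test" streams at base_kbps now only fits:
    capacity_kbps = 1000 * base_kbps
    print(f"streams that actually fit: {int(capacity_kbps // effective_kbps)}")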
Given the multiple variations one can see in a "webrtc stream", how can you be sure that the stream you simulate during the test actually represents the real stream in production? To those who think the variations are small, and wonder how different two streams could really be, I would point to the effort Cullen Jennings, Cisco CTO for communication, made in 2018: counting all the specifications in the WebRTC dependency tree. Pop quiz: anybody dare to risk a number? 200+ specifications, which take 30 pages to list. For the curious, I am attaching the corresponding document to this email. Until the IETF decides to have a formal test suite, every one of those specifications adds to the variability surface.
The conclusion is: there is absolutely no way to be sure you are producing the same stream you would get from your client software in production unless you actually use the same client software in your testing. That leads to "instrumentation" of those clients rather than "simulation/emulation" of them.
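As a minimal sketch of what "instrumentation" means in practice: drive a real browser and let its real WebRTC stack produce the stream, then read its stats back. This is not KITE itself; the app URL and the window.pc global are hypothetical placeholders, while the Chrome flags are the standard ones for automated media capture.

    # Minimal "instrumentation" sketch: a real Chrome produces the stream.
    import time
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    opts = Options()
    opts.add_argument("--use-fake-ui-for-media-capture")      # auto-accept getUserMedia
    opts.add_argument("--use-fake-device-for-media-capture")  # synthetic camera/mic

    driver = webdriver.Chrome(options=opts)
    driver.get("https://example.com/my-webrtc-app")  # hypothetical app URL
    time.sleep(5)  # crude wait for the call to establish; real tests wait on app state

    # Assumes the app exposes its RTCPeerConnection as window.pc (hypothetical).
    stat_types = driver.execute_async_script("""
        const done = arguments[arguments.length - 1];
        window.pc.getStats().then(r => done([...r.values()].map(s => s.type)));
    """)
    print(len(stat_types), "stats entries from a real browser WebRTC stack")
    driver.quit()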
Now, instrumenting the real clients is the ideal scenario, and some people may say:
- I don't care about the approximation I'm making, it's good enough,
- I want something quick and dirty,
- I want something in a language I'm proficient in,
- ....
So, practically, there is more than one good answer.
Most open source projects, when it comes to their audience, draw a line between competent people who have more time than money (a), and competent people who have more money than time, or people not capable of or willing to master the tech (b). A-type people use the open-source version and are super happy with it; b-type people reach out to the author and take a commercial package, with an SLA and possibly some professional services.
With respect to the complexity of open source software, it is subjective and relative to your own capacity. Mileage varies, especially depending on the language. I find KITE difficult at times, mainly because I don't know Java :-) As a sanity check, we had the CFO and the entire sales team install it and run it from the README. After a few iterations on the README, they succeeded.
The open source KITE is only the interoperability testing engine. All the load testing modules and extensions are commercial. Trying to use the open source KITE for load testing is setting oneself up for challenges and complexity. CoSMo has a "Grid Manager" offer, which prepackages Selenium nodes that get automatically deployed and started for you; tests get uploaded through a web GUI, and everything is run automatically. Click and play, so easy a caveman could do it. Grid Manager comes in at under a couple of hundred bucks, and you use your own cloud account.
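For a sense of what "a grid of Selenium nodes" buys you, the same kind of test shown earlier can be pointed at a remote grid instead of a local browser; this is only the plain Selenium mechanism underneath, not the Grid Manager product itself, and the hub URL and app URL are placeholders.

    # Same instrumentation idea, but against a grid of Selenium nodes.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    opts = Options()
    opts.add_argument("--use-fake-ui-for-media-capture")
    opts.add_argument("--use-fake-device-for-media-capture")

    driver = webdriver.Remote(
        command_executor="http://my-grid-hub:4444/wd/hub",  # placeholder hub URL
        options=opts,
    )
    driver.get("https://example.com/my-webrtc-app")  # hypothetical app URL
    driver.quit()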
We hope this helps. Do not hesitate to ask for more info on any part where we have not been clear enough; we are always happy to help.
2017: Real-time communication testing evolution with WebRTC 1.0
2017: Jattack: a WebRTC load testing tool
2018: Comparative Study of WebRTC Open Source SFUs for Video Conferencing