In an attempt to be constructive I did some research on the matter. From reading RFC 3550 and debugging the Native Code implementation I found that:
- the RTP timestamp carried by every incoming webrtc::VideoFrame starts from a random initial value (section 5.1, RTP Fixed Header Fields);
- RFC 3550 requires a sender to periodically send a Sender Report (section 6.4.1, SR: Sender Report RTCP Packet) containing an NTP timestamp, which is supposed to be the sender's system wallclock, together with the corresponding RTP timestamp for the current SSRC (source identifier);
- the SR makes it possible to correlate RTP timestamps with the sender's absolute time, which for example allows synchronizing different RTP sources (e.g. audio and video), as sketched below;
- empirically verified: all major browsers I tested (Chrome, Firefox, Edge, Safari) follow the prescriptions of RFC 3550, so the NTP wallclock is expressed as seconds since 1 January 1900 UTC (section 4, Byte Order, Alignment, and Time Format).
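For reference, the correlation itself is a simple linear extrapolation from the (NTP, RTP) pair carried by the most recent SR. A minimal sketch, with hypothetical names rather than the actual WebRTC API, assuming the nominal 90 kHz video clock:

```cpp
#include <cstdint>

// One (NTP, RTP) pair taken from an RTCP Sender Report (RFC 3550, 6.4.1).
struct SenderReportPair {
  uint64_t ntp_time_ms;    // Sender wallclock, ms since 1 January 1900 UTC.
  uint32_t rtp_timestamp;  // RTP timestamp sampled at the same instant.
};

// Extrapolate the sender NTP time of a frame from the most recent SR.
// The unsigned subtraction followed by a signed cast handles RTP timestamp
// wrap-around for deltas within +/- 2^31 ticks. A real implementation
// should also interpolate between two SRs to estimate the sender's actual
// clock frequency instead of trusting the nominal rate.
uint64_t EstimateSenderNtpMs(const SenderReportPair& sr,
                             uint32_t frame_rtp_timestamp,
                             int clock_rate_hz = 90000) {
  const int32_t rtp_diff =
      static_cast<int32_t>(frame_rtp_timestamp - sr.rtp_timestamp);
  const int64_t elapsed_ms = rtp_diff * 1000LL / clock_rate_hz;
  return sr.ntp_time_ms + elapsed_ms;
}
```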
To achieve my intended design, specifically to correlate a sync point in the remote JS peer with the timestamps of incoming video frames in Native Code, I should do the following:
- in the JS peer, take Date.now() at the sync point, convert it to NTP wallclock time (seconds since 1 January 1900 UTC; see the sketch after this list), and send this timestamp to the Native Code peer;
- correlate the RTP timestamps of incoming webrtc::VideoFrame objects with the sender's NTP wallclock, and compare the resulting NTP timestamps with the sync point timestamp above.
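The Date.now() conversion is just an epoch shift, since the NTP epoch (1 January 1900) precedes the Unix epoch (1 January 1970) by exactly 2208988800 seconds. Shown here in C++ for consistency with the rest of the post; the JS peer would apply the same offset to Date.now():

```cpp
#include <cstdint>

// Seconds between the NTP epoch (1 January 1900) and the Unix epoch
// (1 January 1970): 70 years, including 17 leap days.
constexpr uint64_t kNtpUnixOffsetSeconds = 2208988800ULL;

// Date.now() on the JS peer yields milliseconds since the Unix epoch;
// adding the offset turns it into milliseconds since the NTP epoch.
uint64_t UnixMsToNtpMs(uint64_t unix_time_ms) {
  return unix_time_ms + kNtpUnixOffsetSeconds * 1000;
}

uint64_t NtpMsToUnixMs(uint64_t ntp_time_ms) {
  return ntp_time_ms - kNtpUnixOffsetSeconds * 1000;
}
```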
From static analysis and live debugging of the Native Code it seems that all the utility classes needed to perform this task are available, but **not** properly exposed nor fed with useful data. In particular:
- webrtc::RtpToNtpEstimator is supposed to correlate RTP to NTP timestamps;
- in webrtc::vcm::RtpVideoStreamReceiver an instance of RtpToNtpEstimator is wrapped by webrtc::RemoteNtpTimeEstimator;
- RemoteNtpTimeEstimator has a completely different purpose from RtpToNtpEstimator: it is meant to estimate an NTP timestamp based on the **local** wallclock, not the sender's;
- interestingly enough, RemoteNtpTimeEstimator is never fed with parameters, because in order to work it also needs the RTT (round-trip time; refer to RtpVideoStreamReceiver::DeliverRtcp()). The RTT seems to be computed only when Receiver Reference Time Reports (RRTR), described in RFC 3611[2], are received. Even though RRTR support was "exposed" in a recent patch[2] in M67, it is disabled and non-functional by default. The sketch after this list illustrates why the RTT is needed.
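To make the distinction between the two estimators concrete, here is roughly how their outputs relate. This is an illustration with hypothetical names, not the actual WebRTC code (which filters the offset over many samples), but it shows why the local-clock estimate cannot work without an RTT:

```cpp
#include <cstdint>

// RtpToNtpEstimator's job: frame RTP timestamp -> *sender* NTP wallclock,
// using only the (NTP, RTP) pairs carried by Sender Reports. No RTT needed
// (see the extrapolation sketch earlier in this post).

// RemoteNtpTimeEstimator's job: shift that sender wallclock into the
// *local* clock domain. Estimating the offset between the two clocks
// requires removing the network delay, and the only handle on that delay
// is the RTT (assumed symmetric, so one-way delay ~ rtt / 2).
int64_t ClockOffsetSampleMs(int64_t sender_ntp_ms,
                            int64_t local_receive_ntp_ms,
                            int64_t rtt_ms) {
  return local_receive_ntp_ms - sender_ntp_ms - rtt_ms / 2;
}

// Given the (filtered) offset, any frame's sender NTP time can be shifted
// into the local clock domain; this is the kind of value
// RemoteNtpTimeEstimator is designed to produce.
int64_t SenderNtpToLocalNtpMs(int64_t sender_ntp_ms,
                              int64_t filtered_clock_offset_ms) {
  return sender_ntp_ms + filtered_clock_offset_ms;
}
```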
Summarizing, what I'm asking for is actually a missing feature, and it requires quite some knowledge of the internals to put in place. I sketched an attack plan to implement the feature:
- add a publicly accessible "sender_ntp_time_ms" field to webrtc::VideoFrame and to all the intermediate result classes that precede it in the chain, to allow propagation;
- make it possible for webrtc::RemoteNtpTimeEstimator to use an external instance of webrtc::RtpToNtpEstimator;
- add two instances, of webrtc::RtpToNtpEstimator and webrtc::RemoteNtpTimeEstimator respectively, to the base class webrtc::RtpData, or to a class inheriting from it, so they can be used by both webrtc::vcm::RtpVideoStreamReceiver and webrtc::Channel (used for audio?). The RemoteNtpTimeEstimator instance should use the external RtpToNtpEstimator;
- on received RTCP packets, inheritors should update either the RemoteNtpTimeEstimator or the RtpToNtpEstimator, depending on the availability of the RTT;
- RtpVideoStreamReceiver (and the equivalent classes for audio) should always estimate "sender_ntp_time_ms" from the local RtpToNtpEstimator and set it on the intermediate classes in the pipeline (probably in RtpVideoStreamReceiver::OnReceivedPayloadData), so that it is eventually propagated to webrtc::VideoFrame. A sketch of the resulting consumer-side API follows.
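For illustration, this is what the end result could look like on the consumer side. Everything here is hypothetical; sender_ntp_time_ms does not exist on webrtc::VideoFrame today, which is exactly the missing piece:

```cpp
#include <cstdint>

// Hypothetical sketch of the proposed accessor; none of this exists in
// WebRTC today.
class VideoFrameSketch {
 public:
  // Proposed field: sender wallclock at capture time, in ms since
  // 1 January 1900 UTC, estimated via the SR-fed RtpToNtpEstimator.
  int64_t sender_ntp_time_ms() const { return sender_ntp_time_ms_; }
  void set_sender_ntp_time_ms(int64_t ms) { sender_ntp_time_ms_ = ms; }

 private:
  int64_t sender_ntp_time_ms_ = -1;  // -1: no SR received yet.
};

// On the application side, matching a frame against the sync point sent by
// the JS peer (already converted from Date.now() to NTP ms) would then
// reduce to a plain comparison:
bool FrameIsAfterSyncPoint(const VideoFrameSketch& frame,
                           int64_t sync_point_ntp_ms) {
  return frame.sender_ntp_time_ms() >= 0 &&
         frame.sender_ntp_time_ms() >= sync_point_ntp_ms;
}
```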
What do you think? Does it make sense?
Regards,
Francesco