Hi Alessandro and welcome to the forums,
This depends on which timebase you want to use for your synchronization and how precise you need the synchronization to be. If you would like to pair up audio and video buffers in real time, an easy way to do it while avoiding drift is to use the WithLatestFrom operator with your fastest source (here, the audio) as the main input.
This will pair up the latest frame of video every time a new audio buffer arrives. The output will be at 100Hz (~10ms interval). If this is too fast for your video stream, you can then use Slice with the Step property > 1 (e.g. Step=2 for 50Hz, Step=3 for ~33Hz, Step=4 for 25Hz, etc).
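If it helps to see the pairing logic outside of Bonsai, here is a rough Python sketch of what WithLatestFrom followed by Slice does to the two streams. This is not Bonsai code, just a simulation on arrival times: the function names and the timing values (10ms audio buffers, a 25 fps camera offset by 5ms) are made up for illustration.

```python
from bisect import bisect_right

def with_latest_from(main_times, other_times):
    """Pair each main event with the most recent event from the other stream.

    Both arguments are sorted lists of arrival times. The result is a list of
    (main_index, other_index) pairs. Main events that arrive before the first
    event of the other stream are dropped, since there is nothing to pair yet.
    """
    pairs = []
    for i, t in enumerate(main_times):
        j = bisect_right(other_times, t) - 1  # latest other event at or before t
        if j >= 0:
            pairs.append((i, j))
    return pairs

def slice_step(items, step):
    """Keep every step-th element, like Slice with Start=0 and Step=step."""
    return items[::step]

# Hypothetical arrival times in integer milliseconds.
audio_ms = list(range(10, 1001, 10))  # 100 audio buffers, one every 10 ms (100 Hz)
video_ms = list(range(5, 1000, 40))   # 25 video frames, one every 40 ms (25 fps)

pairs = with_latest_from(audio_ms, video_ms)  # one pair per audio buffer
decimated = slice_step(pairs, 4)              # every 4th pair, i.e. 25 Hz
```

Every audio buffer shows up exactly once in `pairs`, so none of the audio is lost, while a given video frame index is repeated until the next frame arrives. After decimating with a step of 4, each remaining pair in this example refers to a different video frame.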
You can extend this to more video sources by using CombineLatest on all the cameras and sending its output to WithLatestFrom.
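As a rough illustration of what CombineLatest does with several cameras, here is a small Python simulation (again not Bonsai code, and the camera names, times and frame labels are invented). Each time any camera produces a frame, the latest frame from every camera is emitted as one tuple, starting once all cameras have produced at least one frame.

```python
def combine_latest(*streams):
    """Simulate CombineLatest on streams given as sorted lists of (time, value).

    Returns a list of (time, tuple_of_latest_values), one entry per incoming
    event once every stream has produced at least one value.
    """
    # Merge all events into a single timeline, remembering the source stream.
    events = sorted(
        (t, idx, value)
        for idx, stream in enumerate(streams)
        for t, value in stream
    )
    latest = [None] * len(streams)
    seen = [False] * len(streams)
    out = []
    for t, idx, value in events:
        latest[idx] = value
        seen[idx] = True
        if all(seen):
            out.append((t, tuple(latest)))
    return out

# Hypothetical frame arrival times in milliseconds for two cameras.
cam_a = [(0, "a0"), (40, "a1"), (80, "a2")]
cam_b = [(15, "b0"), (55, "b1")]

combined = combine_latest(cam_a, cam_b)
```

The combined stream is what you would then feed into WithLatestFrom as the secondary input, so that each audio buffer gets paired with the most recent frame from every camera.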
This will give you a structured format where all the audio samples will be written to file and there is a fixed relationship between audio buffer time and video time which will not drift over time. There will still be a small jitter in the video which may or may not be negligible depending on your video frame rate and timing needs.
Hope this helps.