Essentially, it is encoding the watermark into the video, much like what happens when you edit a video and save it: the video has to be re-encoded. Since this runs without the usual optimization and hardware acceleration (GPU etc.), it becomes a very slow process.
The paid version simply skips this step. "Aggregation", as they call it, is essentially just piecing together all the downloaded video chunks, nothing fancy. It works more or less like moving the fragments into the destination file.
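
To give a rough idea of how little work that aggregation step involves, here is a minimal Python sketch. It is not the tool's actual code; the function name `aggregate_chunks` and the chunk file names are made up for illustration, and it assumes the chunks are already on disk and listed in playback order.

```python
import shutil
from pathlib import Path


def aggregate_chunks(chunk_paths, dest_path):
    """Append each chunk, in the given order, to dest_path.

    Hypothetical helper for illustration only. Returns the total
    number of bytes taken from the chunk files.
    """
    total = 0
    with open(dest_path, "wb") as dest:
        for chunk in chunk_paths:
            # Plain byte copy: no decoding or re-encoding is involved.
            with open(chunk, "rb") as src:
                shutil.copyfileobj(src, dest)
            total += Path(chunk).stat().st_size
    return total
```

Calling something like `aggregate_chunks(["part0.ts", "part1.ts"], "video.ts")` would just copy bytes, which is why this step is fast compared to burning in a watermark. Note that plain concatenation like this only yields a playable file for segment formats that allow it; that is an assumption here, not something confirmed about this particular downloader.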