I may be missing something, but the timing data seems too limited. I can see "tk roll", "tk timecode" and "Tk FPS", but that's not enough to pin down the timing. The timecode could be drop-frame or non-drop-frame, and the footage could use an NTSC timebase, in which case it actually runs at 1000/1001 of the reported rate: a reported FPS of 30 is really about 29.97. Also, there's nothing to indicate the overall duration of the take.
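To put a number on the NTSC point, here is a rough Python sketch. The function names and the `ntsc` flag are my own invention for illustration, not fields from the spec:

```python
from fractions import Fraction

def actual_rate(nominal_fps, ntsc):
    """Real playback rate; an NTSC timebase runs at 1000/1001 of nominal."""
    if ntsc:
        return Fraction(nominal_fps * 1000, 1001)
    return Fraction(nominal_fps)

def wall_clock_seconds(frame_count, nominal_fps, ntsc):
    """How long frame_count frames really take to play, in seconds."""
    return Fraction(frame_count) / actual_rate(nominal_fps, ntsc)

# 108000 frames is exactly one hour at a true 30 fps,
# but 3.6 seconds longer on an NTSC timebase.
true_30 = wall_clock_seconds(108000, 30, ntsc=False)
ntsc_30 = wall_clock_seconds(108000, 30, ntsc=True)
print(float(true_30), float(ntsc_30))
```

So without some kind of NTSC indicator next to "Tk FPS", two tools could disagree by several seconds per hour about where a frame sits in real time.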
Regardless, encoding the timing data only as a timecode seems to me to be a mistake. It would be more accurate to store a start frame number and an end frame number, from which the timecode can be calculated given the fps and whether it is drop-frame. This might avoid a whole host of problems down the line, and it would also allow for capture devices that do not support timecode. A timecode field could still be included for reference's sake.
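As a sketch of what I mean by calculating the timecode from frame numbers, something like the following would do it. The assumptions are mine: frame numbers are 0-based counts from 00:00:00:00, the end frame is inclusive, and drop-frame is only handled for nominal rates of 30 and 60. The function names are made up for illustration.

```python
def frames_to_timecode(frame, nominal_fps, drop_frame=False):
    """Convert a 0-based frame count to an HH:MM:SS:FF timecode string."""
    if drop_frame:
        if nominal_fps not in (30, 60):
            raise ValueError("drop-frame is only handled here for 30 and 60")
        drop = 2 * (nominal_fps // 30)            # labels skipped per minute
        per_min = nominal_fps * 60 - drop         # real frames in a dropped minute
        per_10min = nominal_fps * 600 - drop * 9  # every 10th minute drops nothing
        tens, rem = divmod(frame, per_10min)
        # Add back the skipped labels so plain division gives the right digits.
        frame += drop * 9 * tens
        if rem > drop:
            frame += drop * ((rem - drop) // per_min)
    ff = frame % nominal_fps
    ss = frame // nominal_fps % 60
    mm = frame // (nominal_fps * 60) % 60
    hh = frame // (nominal_fps * 3600)
    sep = ";" if drop_frame else ":"
    return f"{hh:02d}:{mm:02d}:{ss:02d}{sep}{ff:02d}"

def duration_frames(start_frame, end_frame):
    """Length of a take in frames, treating end_frame as inclusive."""
    return end_frame - start_frame + 1

print(frames_to_timecode(1800, 30))                   # non-drop
print(frames_to_timecode(1800, 30, drop_frame=True))  # drop-frame
```

The same frame number gives a different timecode depending on the drop-frame setting, which is why I'd treat the frame numbers as the source of truth and the timecode as something derived. The duration also falls out for free.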
I also think there could be a larger issue here: you seem to be assuming that the base element is a take or "single clip", when you should probably allow data to be recorded per track, per clip, per take. This would allow (amongst other things) left/right stereo-3D data and audio information to be recorded independently:
- take 1, camera A, clip 1, track 1, video, left eye, start, end, etc
- take 1, camera A, clip 1, track 2, video, right eye, start, end, etc
- take 1, camera A, clip 1, track 3, audio, channel 1, start, end, etc
- take 1, camera A, clip 1, track 4, audio, channel 2, start, end, etc
etc
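In code terms, the rows above might look something like this. It's only a sketch: the class, the field names and the frame values are all made up for illustration and are not part of the existing spec.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrackRecord:
    take: int
    camera: str
    clip: int
    track: int
    kind: str         # "video" or "audio"
    role: str         # e.g. "left eye", "channel 1"
    start_frame: int
    end_frame: int    # inclusive (my assumption)

    @property
    def duration_frames(self):
        return self.end_frame - self.start_frame + 1

# Placeholder frame numbers, purely for the example.
records = [
    TrackRecord(1, "A", 1, 1, "video", "left eye", 0, 239),
    TrackRecord(1, "A", 1, 2, "video", "right eye", 0, 239),
    TrackRecord(1, "A", 1, 3, "audio", "channel 1", 0, 239),
    TrackRecord(1, "A", 1, 4, "audio", "channel 2", 0, 239),
]

# Each track can be picked out independently of the clip it belongs to.
video_tracks = [r for r in records if r.kind == "video"]
audio_tracks = [r for r in records if r.kind == "audio"]
print(len(video_tracks), len(audio_tracks), records[0].duration_frames)
```

A plain "one record per take" layout is just the special case where every clip has a single track, so nothing is lost by going down to track level.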
HTH