There are events embedded in the eyetracking data referring to the start and end of the scan, and for most datasets an event for each 1-second TR as well.
Each file contains the eyetracker data for all available subjects for a given movie (between 161 and 163 subjects, depending on the movie), downsampled to 24 fps. It includes saccade, fixation, and blink masks (which were computed by EyeLink). The data have been truncated to the start and end of the scan.
So in Python you might do:
from scipy.io import loadmat
M = loadmat('eyetracker_24fps_MOVIE1_7T_AP_allsubj_data.mat', simplify_cells=True)
subj0_xy = M['gazecoods'][:, :, 0]  # 22104x2 array of (x,y) pixel coords for subject 0 (921*24=22104)
subj0_id = M['subjects'][0]  # subjects[0] for this movie is '100610'
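If you want to drop blink samples before using the gaze coordinates, something like the following could work. This is only a sketch on synthetic arrays: it assumes the blink mask is a (time x subject) array that is nonzero during blinks, and the helper name mask_blinks is made up for illustration. Check the actual field names and shapes in the mat file before relying on it.

```python
import numpy as np

def mask_blinks(xy, blink_mask):
    """Return a copy of xy with blink samples set to NaN.

    xy:         (T, 2, N) array of (x, y) gaze coords (layout as in the example above)
    blink_mask: (T, N) array, nonzero where a blink was flagged (ASSUMED layout)
    """
    out = np.array(xy, dtype=float)  # float copy so NaN can be stored
    blink = np.asarray(blink_mask).astype(bool)
    # Expand the mask across the x/y axis so both coordinates are blanked
    out[np.broadcast_to(blink[:, np.newaxis, :], out.shape)] = np.nan
    return out

# Synthetic stand-in data: 4 samples, 2 subjects, one blink for subject 0
xy = np.ones((4, 2, 2))
blink = np.zeros((4, 2), dtype=bool)
blink[1, 0] = True
cleaned = mask_blinks(xy, blink)
```

The same pattern would apply to the saccade and fixation masks if they share that layout.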
One detail that could be relevant is that some scans were run with the eyetracker configured for a screen width of 1024 and some for 1280. That info is in the 'dispsize' field in the mat file. If you don't take it into account, your gaze model will be inconsistent.
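One way to handle the display-size difference is to rescale each subject's gaze to a fraction of that subject's display. The sketch below runs on synthetic data and makes several assumptions: that 'dispsize' can be arranged as one (width, height) row per subject, that the heights are 768 and 1024 (made up here), and that a helper called normalize_gaze is what you'd want. The real field layout may differ.

```python
import numpy as np

def normalize_gaze(xy, dispsize):
    """Scale pixel gaze coords to fractions of each subject's display.

    xy:       (T, 2, N) array of (x, y) pixel coords
    dispsize: (N, 2) array of (width, height) per subject (ASSUMED layout)
    """
    xy = np.asarray(xy, dtype=float)
    dispsize = np.asarray(dispsize, dtype=float)
    # dispsize.T is (2, N); adding a leading axis lets it broadcast over time
    return xy / dispsize.T[np.newaxis, :, :]

# Synthetic stand-in data: 3 samples, 2 subjects looking at screen center
xy = np.zeros((3, 2, 2))
xy[:, 0, 0], xy[:, 1, 0] = 512, 384    # subject 0, 1024-wide display
xy[:, 0, 1], xy[:, 1, 1] = 640, 512    # subject 1, 1280-wide display
dispsize = np.array([[1024, 768], [1280, 1024]])  # heights are illustrative only
norm = normalize_gaze(xy, dispsize)
```

After this step both subjects' center-of-screen gaze maps to the same normalized position, regardless of which display width the tracker was configured for.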
Let me know if you have any questions.
-Keith