file : name of the GTFS-RT feed file that was processed (there's one every 30 secs, so it's named by the epoch timestamp of its download)
num_vehicles : number of vehicle locations (entities) in the feed
feed_timestamp : timestamp of the feed, taken from its header. If you get the same value as in the last feed, it's a duplicate. (Whoops, I just found out there were some 13k dupes in my data!)
feed_time : human-readable, IST form of that timestamp
incrementality : that's another field in the feed's header
bad_count : number of entities that had missing or flawed lat-longs like 0,0
earliest : timestamp of the earliest (farthest back in time) vehicle entity in the file. (Apart from the feed timestamp, each vehicle location 'entity' carries a timestamp of its own, since we can't assume all 700-odd vehicles sent in their lat-longs at exactly the same moment. We're doing asynchronous business here.)
diff1 : seconds gap between earliest and feed_timestamp
latest : timestamp of the latest (most recent) vehicle entity in the file
diff2 : seconds gap between latest and feed_timestamp
What diff1 and diff2 tell me : how "dated" the information in the feed is, and how "recent" it is. If diff2 were consistently too large, I wouldn't bother downloading a fresh feed every 30 secs, which is the standard minimum refresh interval as per the GTFS-RT spec.
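Catching the duplicates mentioned under feed_timestamp above could be sketched like this (hypothetical helper name; assumes the files are processed in epoch-timestamp filename order, so each feed's header timestamp only needs comparing to the previous one):

```python
def mark_duplicates(header_timestamps):
    """Flag each feed whose header timestamp repeats the previous feed's.

    header_timestamps: header 'timestamp' values, in file (download) order.
    Returns a list of booleans, True where the feed is a duplicate download.
    """
    flags, prev = [], None
    for ts in header_timestamps:
        flags.append(ts == prev)
        prev = ts
    return flags

# e.g. the server served the same feed twice in a row:
mark_duplicates(['1550596818', '1550596818', '1550596848'])
# → [False, True, False]
```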
For reference, a sample JSON output from a feed:
{'header': {'gtfsRealtimeVersion': '2.0',
'incrementality': 'FULL_DATASET',
'timestamp': '1550596818'},
'entity': [{'id': 'vehicle',
'vehicle': {'trip': {'tripId': '6255', 'routeId': '225'},
'position': {'latitude': 28.610946655273438,
'longitude': 76.980224609375,
'speed': 0.0},
'timestamp': '1550596773',
'vehicle': {'id': 'DL1PD0716', 'label': 'DL1PD0716'}}},
...
]}
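As a sketch of how the summary columns are derived (hypothetical helper and variable names; assumes each feed has already been parsed into a dict shaped like the sample above, here with an extra made-up entity so the "bad" case shows up):

```python
from datetime import datetime, timezone, timedelta

IST = timezone(timedelta(hours=5, minutes=30))  # feed is Delhi-based

# Mirrors the sample feed above, plus one invented entity with a 0,0 position.
feed = {
    'header': {'gtfsRealtimeVersion': '2.0',
               'incrementality': 'FULL_DATASET',
               'timestamp': '1550596818'},
    'entity': [
        {'id': 'vehicle',
         'vehicle': {'trip': {'tripId': '6255', 'routeId': '225'},
                     'position': {'latitude': 28.610946655273438,
                                  'longitude': 76.980224609375,
                                  'speed': 0.0},
                     'timestamp': '1550596773',
                     'vehicle': {'id': 'DL1PD0716', 'label': 'DL1PD0716'}}},
        {'id': 'vehicle',
         'vehicle': {'position': {'latitude': 0.0, 'longitude': 0.0},
                     'timestamp': '1550596780'}},
    ],
}

feed_timestamp = int(feed['header']['timestamp'])
feed_time = datetime.fromtimestamp(feed_timestamp, IST).strftime('%Y-%m-%d %H:%M:%S')
num_vehicles = len(feed['entity'])

def is_bad(entity):
    """Missing or flawed lat-longs, like 0,0."""
    pos = entity.get('vehicle', {}).get('position', {})
    lat, lon = pos.get('latitude'), pos.get('longitude')
    return lat in (None, 0.0) or lon in (None, 0.0)

bad_count = sum(is_bad(e) for e in feed['entity'])

# Each entity carries its own timestamp; compare the extremes to the header's.
stamps = [int(e['vehicle']['timestamp'])
          for e in feed['entity'] if 'timestamp' in e.get('vehicle', {})]
earliest, latest = min(stamps), max(stamps)
diff1 = feed_timestamp - earliest  # how dated the oldest position is (secs)
diff2 = feed_timestamp - latest    # how recent the newest position is (secs)
```

For this toy feed that gives num_vehicles=2, bad_count=1, diff1=45 and diff2=38; the same computation run per file, one row per feed, yields the table described above.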