What is the `schema.compatibility` connector configuration property set to? It defaults to `NONE`, which means all records in a given file written to S3 must share the same schema. When the connector observes a schema change in the data, it commits the current set of files for the affected topic partitions and writes the records with the new schema to new files.
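For reference, the property goes in the connector configuration alongside the other S3 sink settings. A minimal sketch (the connector name, topic, and bucket here are placeholders) might look like:

```properties
name=s3-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
topics=my-topic
s3.bucket.name=my-bucket
# NONE is the default: flush the current files whenever the record schema changes
schema.compatibility=NONE
```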
You said you did a rolling upgrade of your producers and that it took over an hour, during which you had a mixture of events with both old and new schemas. It is likely that a single S3 sink connector task saw that mixture as well, and with `schema.compatibility=NONE` the connector would have flushed its current file every time the schema changed. For example, if we use 1 and 2 to signify the schema version and the connector sees events like e1, e2, e1, e2, e2, e1, e1, e2, it would write them into files like this: [e1], [e2], [e1], [e2, e2], [e1, e1], [e2].
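To make that flushing behavior concrete, here is a small Python sketch (an illustration, not the connector's actual code) that groups a stream of events into "files", starting a new file whenever the schema version changes, as `schema.compatibility=NONE` does:

```python
def partition_by_schema_change(events):
    """Group (name, schema_version) events into 'files', starting a new
    file whenever the schema version differs from the previous record's."""
    files, current, current_version = [], [], None
    for name, version in events:
        if current and version != current_version:
            files.append(current)  # schema changed: flush the current file
            current = []
        current.append(name)
        current_version = version
    if current:
        files.append(current)      # flush whatever is left at the end
    return files

events = [("e1", 1), ("e2", 2), ("e1", 1), ("e2", 2),
          ("e2", 2), ("e1", 1), ("e1", 1), ("e2", 2)]
print(partition_by_schema_change(events))
# [['e1'], ['e2'], ['e1'], ['e2', 'e2'], ['e1', 'e1'], ['e2']]
```

With schemas alternating during a rolling upgrade, this produces many small files, which matches what you are seeing.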
Other schema compatibility settings result in different behavior. If your new schema is backward compatible and you set `schema.compatibility=BACKWARD`, then the connector can use the latest schema version it has seen and project records with older schemas to it. It will flush the current file the first time it sees the new schema, but after that it will keep using the new schema. Given the same event sequence e1, e2, e1, e2, e2, e1, e1, e2, the connector would write the files like this: [e1], [e2, e2, e2, e2, e2, e2, e2].
There is also the `FORWARD` schema compatibility setting, which is similar to `BACKWARD` except that the connector projects records to the older schema rather than the newer one. Given the same example, the files would be written like this: [e1, e1, e1, e1, e1, e1, e1, e1].
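Under the same toy model, the `BACKWARD` and `FORWARD` behaviors can be sketched like this (again an illustration, not the connector's implementation; each record is shown by the schema version it is written with):

```python
def partition_backward(versions):
    """BACKWARD: flush when a newer schema version appears and adopt it;
    subsequent records with older schemas are projected to it (no flush)."""
    files, current, cur = [], [], None
    for v in versions:
        if cur is None:
            cur = v
        elif v > cur:                 # newer schema: flush, then switch to it
            files.append(current)
            current, cur = [], v
        current.append(f"e{cur}")     # record written with schema `cur`
    if current:
        files.append(current)
    return files

def partition_forward(versions):
    """FORWARD: project every record to the oldest schema (assumed here to
    arrive first), so schema changes cause no flushes at all."""
    oldest = versions[0]
    return [[f"e{oldest}" for _ in versions]]

versions = [1, 2, 1, 2, 2, 1, 1, 2]
print(partition_backward(versions))
# [['e1'], ['e2', 'e2', 'e2', 'e2', 'e2', 'e2', 'e2']]
print(partition_forward(versions))
# [['e1', 'e1', 'e1', 'e1', 'e1', 'e1', 'e1', 'e1']]
```

Either setting would avoid the many tiny files you are seeing, provided your schemas actually satisfy the corresponding compatibility rule.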
Finally, there is the `FULL` schema compatibility setting, which behaves like `BACKWARD` as long as your schemas are both backward and forward compatible.
Hope this helps explain what you're seeing. Best regards,
Randall