Hey everyone,
For one of my projects, I have a validation dataset of about 1 million entries. Validating on the entire dataset every time would take far too long, so I've been wondering: is there a way to make the DataStream return a different subset of that dataset for each epoch?
For example: For each validation, I want to use a subset of 1000 validation entries. For my first validation, the first epoch that the DataStream returns should contain the entries [0, ..., 999], the second one for the second validation should contain the entries [1000, ..., 1999], and so on. If possible, the batch size should be freely configurable as well.
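To make the desired behaviour concrete, here is a minimal plain-Python sketch of the index logic I have in mind. The class name `RotatingSubsetBatches` and its parameters are hypothetical, made up for illustration only. It is not part of the library's API and does not touch DataStream or IterationScheme; it just yields batches of indices. Wrapping back to the start once the dataset is exhausted is an assumption on my part.

```python
class RotatingSubsetBatches:
    """Hypothetical helper (not a library class): each call to next_epoch()
    returns index batches for the next consecutive slice of the dataset."""

    def __init__(self, num_examples, subset_size, batch_size):
        self.num_examples = num_examples
        self.subset_size = subset_size
        self.batch_size = batch_size
        self.offset = 0  # first index of the next epoch's subset

    def next_epoch(self):
        start = self.offset
        stop = min(start + self.subset_size, self.num_examples)
        # Assumption: wrap around to index 0 once the dataset is used up.
        self.offset = stop if stop < self.num_examples else 0
        # Split [start, stop) into batches; the last batch may be smaller.
        return [
            list(range(i, min(i + self.batch_size, stop)))
            for i in range(start, stop, self.batch_size)
        ]


scheme = RotatingSubsetBatches(num_examples=1000000, subset_size=1000, batch_size=128)
first_epoch = scheme.next_epoch()   # covers indices 0..999
second_epoch = scheme.next_epoch()  # covers indices 1000..1999
```

What I'm looking for is a way to get this behaviour out of the DataStream itself, ideally with the batch size still configurable as in the sketch.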
I've been studying the API and experimenting with different IterationSchemes and DataStreams, but I've not been able to find a solution.
Apologies if it's something really simple that I've overlooked, and thanks in advance!