Sorry about the complexity of the MNIST (and other dataset) internals. The datasets in our swift-models repository have evolved over time to minimize repeated code and to build on a shared underlying base. That base is the Epochs API, which was described in this presentation (slides). We're working on a guide for how the Epochs API works, so I apologize for the currently lacking documentation around it.
The MNIST dataset in swift-models has two key properties: training and validation. Both are lazily mapped sequences (so that image processing is done on demand rather than all up front) that allow iteration over the training and validation splits of the MNIST dataset. The MNIST dataset is further complicated by the fact that some of its shared functionality has been extracted into MNISTDatasetHandler.swift for use by three different MNIST variants.
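As a rough sketch of what that lazy iteration looks like in use (the `MNIST` type and its `batchSize:` initializer follow the swift-models API, but treat the exact signatures as approximate — they have shifted between versions):

```swift
import Datasets
import TensorFlow

// Assumes the swift-models `Datasets` module is available.
let dataset = MNIST(batchSize: 128)

// `training` is a lazily mapped sequence of epochs; each epoch is a
// sequence of batches, and each batch is only materialized into
// tensors when the iteration actually reaches it.
for epoch in dataset.training.prefix(2) {
  for batch in epoch {
    // `batch.data` holds the image tensor, `batch.label` the labels.
    let images = batch.data   // e.g. shape [128, 28, 28, 1]
    let labels = batch.label  // e.g. shape [128]
    // ... train on (images, labels) ...
  }
}
```

Because nothing is decoded until the loop body touches a batch, shuffling and re-epoching stay cheap even for larger datasets.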
The fetchMNISTDataset() function downloads the MNIST dataset files if they aren't already present in a local cache, extracts and caches them, and then reads the binary files to produce an array of (image bytes, integer label) tuples. The makeMNISTBatch() function is used by the lazy mapping in the Epochs API to create batches on demand from a shuffled dataset: it builds image and label Tensors from batches of (image bytes, integer label) tuples.
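A simplified sketch of what a batch-collation function like that does (this is not the exact swift-models implementation; the `RawSample` and `makeBatch` names, shapes, and normalization here are illustrative):

```swift
import TensorFlow

// A raw sample as produced by the dataset loader: the image's pixel
// bytes plus its integer class label.
typealias RawSample = (data: [UInt8], label: Int32)

// Collate a slice of raw samples into one labeled batch of tensors,
// normalizing pixel values into [0, 1].
func makeBatch<Samples: Collection>(
  samples: Samples
) -> (data: Tensor<Float>, label: Tensor<Int32>)
where Samples.Element == RawSample {
  // Concatenate all pixel bytes, then reshape into [N, 28, 28, 1].
  let bytes = samples.reduce(into: [UInt8]()) { $0 += $1.data }
  let images = Tensor<UInt8>(
    shape: [samples.count, 28, 28, 1], scalars: bytes)
  let labels = Tensor<Int32>(samples.map(\.label))
  return (data: Tensor<Float>(images) / 255.0, label: labels)
}
```

The Epochs API calls a function shaped like this lazily, once per batch, which is why the upfront loading step only needs to produce the lightweight (bytes, label) tuples.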
The MNIST dataset also conforms to ImageClassificationData, which allows it to be used interchangeably with other image classification datasets in examples and benchmarks that use image classification models.
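That conformance is what makes dataset-generic code possible. As an illustrative sketch (the protocol's exact requirements live in swift-models; this helper and its name are hypothetical):

```swift
import Datasets
import TensorFlow

// A hypothetical generic helper: because MNIST, FashionMNIST, and
// KuzushijiMNIST all conform to ImageClassificationData, one function
// can iterate over any of them in the same way.
func countTrainingBatches<Data: ImageClassificationData>(
  in dataset: Data, epochs: Int
) -> Int {
  var batches = 0
  for epoch in dataset.training.prefix(epochs) {
    for _ in epoch { batches += 1 }
  }
  return batches
}
```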
The LeNet-MNIST example used to show how to use this directly, but it now has a further abstraction in the form of a generalized TrainingLoop that takes in the dataset and handles the rest for you. If you step back to the example before that conversion, you can see how this works.
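For reference, a manual training loop over the dataset (the style the example used before the TrainingLoop conversion) looked roughly like this; hyperparameters are placeholders and the exact API details may differ by version:

```swift
import Datasets
import ImageClassificationModels
import TensorFlow

let dataset = MNIST(batchSize: 128)
var model = LeNet()
let optimizer = SGD(for: model, learningRate: 0.1)

for (epochIndex, epoch) in dataset.training.prefix(5).enumerated() {
  Context.local.learningPhase = .training
  var lastLoss: Float = 0
  for batch in epoch {
    let (images, labels) = (batch.data, batch.label)
    // Differentiate the loss with respect to the model parameters.
    let (loss, gradient) = valueWithGradient(at: model) { model -> Tensor<Float> in
      softmaxCrossEntropy(logits: model(images), labels: labels)
    }
    optimizer.update(&model, along: gradient)
    lastLoss = loss.scalarized()
  }
  print("Epoch \(epochIndex) complete, last batch loss: \(lastLoss)")
}
```

The generalized TrainingLoop essentially packages this epoch/batch/gradient plumbing so examples only have to supply the dataset, model, optimizer, and loss.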
Again, sorry that this is more complicated than you'd expect and isn't particularly well documented at present. We're hoping to improve on that.