Hi Kaitlin,
I'll try to answer this, but I want to mention right away that we have just started talking about reorganizing/reimplementing that queue. The current implementation has not been touched much in years, and it may be time to build something better, primarily to make the queue easier to monitor and manage: to see what's on it, how long it's expected to take, etc. There's an open issue for this, and I'm hoping we'll get it done for the upcoming Dataverse v6.
The files are ingested one at a time. Ingest can be very expensive, in terms of both memory and CPU cycles, so even with only one file being processed at a time, if you add a large number of large "ingestable" files (Stata, SPSS, ...) at once, your server can end up struggling with that queue for hours or days in ways your users will notice. How much data are we talking about, anyway? If you want to prevent your installation from being overloaded like that, I would set the ingest size limits to something low. These cutoff limits can be set either for all ingestable files or for specific formats; if a file is larger than the limit, Dataverse skips putting it on the ingest queue and adds it to the dataset as is, as a raw Stata, SPSS, etc. file. Later on you can decide which of these files you want to ingest; we have an API for ingesting individual existing datafiles, so this can be done later without committing to putting all these files on the queue at once.
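Just to illustrate (please double-check the setting name and endpoints against the guides for your version; the server URL, API token and file id below are made up, and the admin settings API is normally reachable from localhost only), setting a low limit and then ingesting an individual file later would look roughly like this in Python:

    import requests

    SERVER = "https://dataverse.example.edu"  # placeholder, your installation's URL
    API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"  # placeholder superuser API token

    # Set a low global ingest size limit, in bytes; anything bigger skips the
    # queue and gets added as is. (The :TabularIngestSizeLimit setting name is
    # from memory, so please verify it; format-specific variants exist too.)
    requests.put(f"{SERVER}/api/admin/settings/:TabularIngestSizeLimit", data="50000000")

    # Later, ask Dataverse to ingest one specific datafile that was skipped earlier.
    file_id = 1234  # hypothetical database id of the datafile
    requests.post(f"{SERVER}/api/files/{file_id}/reingest",
                  headers={"X-Dataverse-key": API_TOKEN})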
What is your planned data migration process? Is it going to be a scripted batch job using our native APIs to create datasets, add files, etc.?
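If it is, the core of such a script is usually just a couple of native API calls per dataset, roughly like this (again, only a sketch; the collection alias, metadata file and data file names below are placeholders):

    import requests

    SERVER = "https://dataverse.example.edu"  # placeholder
    API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"  # placeholder
    HEADERS = {"X-Dataverse-key": API_TOKEN}

    # Create a dataset in an existing collection ("mycollection" is a placeholder
    # alias; dataset.json is a native-API dataset metadata file).
    with open("dataset.json") as f:
        r = requests.post(f"{SERVER}/api/dataverses/mycollection/datasets",
                          headers={**HEADERS, "Content-Type": "application/json"},
                          data=f.read())
    pid = r.json()["data"]["persistentId"]

    # Add a file to the new dataset; if the file is an ingestable format and
    # under the size limit, this is the call that puts it on the ingest queue.
    with open("survey.dta", "rb") as f:
        requests.post(f"{SERVER}/api/datasets/:persistentId/add",
                      params={"persistentId": pid},
                      headers=HEADERS,
                      files={"file": ("survey.dta", f)})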
There are some peculiarities of how the ingest queue currently operates (especially when you need to purge something that's already on the queue; we recently realized that this does not work all that well in some situations). But I'll skip that for now.
Best,
-Leo.