G'day Karl,
It depends on how Alfresco is handling renditions these days. If they're stored persistently (something that, IIRC, wasn't always the case) and the way that they're stored in the repo is reasonably "vanilla" (no cyclical relationships, reasonably straightforward types for the rendition nodes and child relationship types, etc.) then the tool may be able to import them directly.
If that doesn't work, there's also the approach of "import then fix". In this mode the import tool is used to efficiently load the content into the repository, then a "fix up" script is run over the corpus to add any necessary finishing touches. This is the recommended approach for setting granular permissions on imported content, for example (since setting permissions "inline" during import drastically impacts performance, at least on Alfresco versions up to v5).
In terms of next steps, I'd suggest:
- Confirming that renditions are persistently stored in the repo and not elsewhere (e.g. in a disk-based cache of some kind)
- Reverse engineering the structure and properties of the rendition nodes and their relationships to the parent document
- Creating a test case in the on-disk format the bulk import tool uses that replicates that structure and those properties for a small test corpus (perhaps just one document)
- Running a test import with that test corpus on a test instance
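For the third suggestion above, here's a rough sketch (in Python, purely for convenience) of generating a one-document test corpus: a content file plus a "shadow" metadata file in Java XML properties format, named `<filename>.metadata.properties.xml`, with `type`, `aspects` and property entries. The helper names, the `corpus` directory and the `cm:titled` / `cm:title` values are placeholders I've made up for illustration. The real type, aspects and properties for rendition nodes would come from the reverse engineering step, and whether the on-disk format can express the rendition relationship at all is exactly what the test import would tell you.

```python
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr


def build_metadata_xml(node_type, aspects, properties):
    """Return a Java XML properties document describing one node."""
    entries = {"type": node_type}
    if aspects:
        entries["aspects"] = ",".join(aspects)
    entries.update(properties)

    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">',
        "<properties>",
    ]
    for key, value in entries.items():
        # Escape keys and values so the shadow file stays well-formed XML.
        lines.append(f"  <entry key={quoteattr(key)}>{escape(value)}</entry>")
    lines.append("</properties>")
    return "\n".join(lines) + "\n"


def write_test_document(root, name, content, node_type="cm:content",
                        aspects=(), properties=None):
    """Write a content file and its metadata shadow file under root."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)

    doc_path = root / name
    doc_path.write_bytes(content)

    meta_path = root / f"{name}.metadata.properties.xml"
    meta_path.write_text(
        build_metadata_xml(node_type, list(aspects), properties or {}),
        encoding="utf-8",
    )
    return doc_path, meta_path


if __name__ == "__main__":
    # Placeholder values: swap in whatever the real rendition nodes use.
    write_test_document(
        "corpus",
        "sample.txt",
        b"Hello, renditions!\n",
        aspects=["cm:titled"],
        properties={"cm:title": "Sample document"},
    )
```

Pointing the import tool at the resulting `corpus` directory on a test instance, then comparing the imported node against a "real" one in the node browser, should show fairly quickly whether the direct approach is viable or whether "import then fix" is needed.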
It would be awesome if you could report back with your findings, since this seems like a pretty common requirement!
Cheers,
Peter