EXIM - importing large counts of images


Brian Street

Jul 26, 2017, 11:15:37 AM
to Hippo Community
I have worked through the tutorial at https://onehippo-forge.github.io/content-export-import/tutorials-import-bins.html

It was easier to understand by exporting images, examining the output JSON, then moving the files, editing them lightly, and importing them back in as new images.

My question is: is there any way to read in binary files directly & build up the other properties along the way instead of pre-building JSON & base-64 encoding the file contents?

Are there other recommended approaches for importing thousands of JPEGs?  It would be preferable to let the system configuration create all the imageSet variants during import rather than pre-generating them outside the system - even if that meant import was much slower, at least it would be internally consistent.

If not, are there any scripts that would help generate the JSON files from a folder of JPEG images that already exist?

Woonsan Ko

Jul 26, 2017, 2:34:15 PM
to hippo-c...@googlegroups.com
On Wed, Jul 26, 2017 at 11:15 AM, Brian Street <brian.c...@gmail.com> wrote:
I have worked through the tutorial at https://onehippo-forge.github.io/content-export-import/tutorials-import-bins.html

It was easier to understand by exporting images, examining the output JSON, then moving the files, editing them lightly, and importing them back in as new images.

Great! :-)
 

My question is: is there any way to read in binary files directly & build up the other properties along the way instead of pre-building JSON & base-64 encoding the file contents?

By default, the binary export task writes a data: URL containing the Base64-encoded binary data, unless the binary property value exceeds 20KB. If it does exceed 20KB, the task simply writes the value to a separate binary file (identical to the original image or PDF file, for instance) and records a file: URL instead.
So I guess you have only experimented with smaller files so far.
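
To make the threshold rule above concrete, here is a minimal sketch in plain Java. It is not the export task's actual code: the class name BinaryUrlSketch and the method toUrl are hypothetical, and it simply points at the existing file rather than copying it into an attachment folder.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical illustration only; not part of the content-export-import API.
final class BinaryUrlSketch {

    /**
     * Returns a data: URL with Base64 content when the file size does not
     * exceed thresholdBytes, otherwise a file: URL pointing at the file.
     */
    static String toUrl(Path file, String mimeType, long thresholdBytes) throws IOException {
        long size = Files.size(file);
        if (size <= thresholdBytes) {
            // Small value: embed the bytes inline.
            byte[] bytes = Files.readAllBytes(file);
            return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(bytes);
        }
        // Large value: reference the binary file as-is.
        return file.toAbsolutePath().toUri().toString();
    }
}
```

With a threshold of 0, any non-empty file ends up as a file: URL, which matches the "export every binary to a separate file" case.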
You can change this size threshold from the default 20KB to something else. For example, setting it to zero exports every binary property value to a separate file (file://...) instead of a data: URL. The setter looks like the following:

exportTask.setDataUrlSizeThreashold(256 * 1024); // 256KB as threshold

I actually copied and pasted the example above from the following page:

The example sets the threshold to 256KB.
If you set it to 0, every binary property value will be serialized to a separate file instead, and you can then read those binary files (referred to by the file: URLs) directly.
Also, the referenced section (https://onehippo-forge.github.io/content-export-import/tutorials-export-bins.html#Initializing_Export_Task) contains code that sets the folder for those separate attachment files.
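
As a rough illustration of reading such a value back, the sketch below resolves either kind of URL to raw bytes using only the JDK. The class BinaryUrlReader and its read method are hypothetical names, not part of the plugin, and the sketch assumes data: URLs are always Base64-encoded.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

// Hypothetical illustration only; not part of the content-export-import API.
final class BinaryUrlReader {

    /** Resolves a Base64 data: URL or an absolute file: URL to its bytes. */
    static byte[] read(String url) throws IOException {
        if (url.startsWith("data:")) {
            int comma = url.indexOf(',');
            if (comma < 0 || !url.substring(0, comma).endsWith(";base64")) {
                throw new IllegalArgumentException("Expected a Base64 data: URL");
            }
            // Everything after the comma is the Base64 payload.
            return Base64.getDecoder().decode(url.substring(comma + 1));
        }
        if (url.startsWith("file:")) {
            // The file is the plain binary (e.g. the JPEG itself), so read it directly.
            return Files.readAllBytes(Paths.get(URI.create(url)));
        }
        throw new IllegalArgumentException("Unsupported URL: " + url);
    }
}
```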
 

Are there other recommended approaches for importing thousands of JPEGs?  It would be preferable to let the system configuration create all the imageSet variants during import rather than pre-generating them outside the system - even if that meant import was much slower, at least it would be internally consistent.

gallery-magick library supports three options to create variants as explained here:

The first one is the default option: a pure-Java solution that works the same way as the Hippo CMS GalleryProcessor component. So if you stick with the default option, variant generation is exactly the same as how the Hippo CMS imageset / gallery processor generates the variants.

Therefore, even though the variants are generated beforehand as part of the import process, the result should be the same as what the CMS itself would produce, as long as you use the default pure-Java option.
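
To give a feel for what "pure Java" variant generation means, here is a generic sketch that scales an image to fit a bounding box with the JDK's own imaging classes. It is not the gallery-magick or GalleryProcessor code; VariantSketch and scaleToFit are hypothetical names, and the interpolation setting is just an assumption for the example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical illustration only; not the gallery-magick or Hippo CMS implementation.
final class VariantSketch {

    /** Scales src to fit inside maxWidth x maxHeight, preserving the aspect ratio. */
    static BufferedImage scaleToFit(BufferedImage src, int maxWidth, int maxHeight) {
        // Use the smaller ratio so both dimensions stay inside the bounding box.
        double ratio = Math.min((double) maxWidth / src.getWidth(),
                                (double) maxHeight / src.getHeight());
        int width = Math.max(1, (int) Math.round(src.getWidth() * ratio));
        int height = Math.max(1, (int) Math.round(src.getHeight() * ratio));

        BufferedImage variant = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = variant.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return variant;
    }
}
```

For example, a 400x200 source scaled into a 100x100 box comes out as 100x50.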

The other two options are not pure-Java solutions. Each has its pros and cons.

Regards,

Woonsan
 

If not, are there any scripts that would help generate the JSON files from a folder of JPEG images that already exist?




