Alternative to imageflow

Isabel Martín

Mar 25, 2019, 6:30:41 AM

to AtoM Users

Good morning, I am creating content in AtoM that has digital documents with multiple pages, 
most of them with no more than 1000 pages. The viewer that comes with AtoM (imageflow) becomes 
complicated to use, since it is difficult to find a specific page of the document. 

Is there an alternative viewer? Or could you give me some idea of how to develop one myself? 

Thank you.

Dan Gillean

Mar 25, 2019, 11:34:17 AM

to ICA-AtoM Users

Hi Isabel,

First, a clarification. It sounds like you are talking about mostly long text documents, like PDFs for example, and you want to improve AtoM's ability to search within the documents - is that correct?

If so, then I don't think it's imageflow you want to focus your efforts on. ImageFlow is a simple JavaScript library for creating a gallery - we use it in AtoM for the digital object carousel that shows thumbnail previews when there are digital objects attached at lower levels of a descriptive hierarchy:

https://www.accesstomemory.org/docs/latest/user-manual/access-content/navigate/#carousel

For text documents, AtoM doesn't actually have a proper viewer - we rely on the built-in PDF viewers included in most modern browsers. Legacy browser users would get a download to view the PDF locally. Some of these include search boxes; if not, but the document has a text layer (such as OCR, etc), then you should still be able to press CTRL+F (or CMD+F on a Mac) to search for content. Remember, the quality of results in this search will depend on the quality of the text layer. In many cases, with bad OCR there are errors and glitches in the text layer that prevent accurate searching - a good way to test this would be to copy part of the document text layer with your cursor, and paste it into another place to examine. There is currently no way to edit the text layer of a document in AtoM.

AtoM does index the text layer of PDFs and other text-based digital objects when they are uploaded, so a search in AtoM will return results that include matches from inside the document. However, in addition to the possibility of a bad text layer, AtoM also has a max character limit on the relevant database field. This field is a TEXT type field in the property_i18n table, whose max length is currently set to 65,535 bytes. This would equal 65,535 characters for single-byte encoded characters (such as Latin-1 encoding), but AtoM uses UTF-8, in which each character takes 1 to 4 bytes - so it's difficult to give an exact word/character count. However, for documents that are 1,000+ pages, it is quite possible that your content is being cut off in the database, which might be adding to the difficulties you are having in finding exactly what you want.
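
To make the byte-versus-character point concrete, here is a minimal Python sketch. It only illustrates the arithmetic and is not AtoM code: the helper names utf8_size and fits_in_text_field are made up for this example, and the sample strings are artificial.

```python
# Limit of a MySQL TEXT column in bytes (the figure quoted above).
TEXT_MAX_BYTES = 65_535


def utf8_size(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_in_text_field(text: str, limit: int = TEXT_MAX_BYTES) -> bool:
    """Report whether the whole text fits within the byte limit."""
    return utf8_size(text) <= limit


# The same number of characters can need very different amounts of space.
ascii_sample = "a" * 65_535          # 1 byte per character in UTF-8
accented_sample = "\u00f1" * 65_535  # n with tilde: 2 bytes per character

print(utf8_size(ascii_sample), fits_in_text_field(ascii_sample))
print(utf8_size(accented_sample), fits_in_text_field(accented_sample))
```

In this sketch the plain ASCII sample fits exactly, while the same number of accented characters needs twice the space and would not fit.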

If you want to find a particular record in the database based on content in the text layer, you could try the following SQL query:

SELECT property_i18n.value FROM property_i18n WHERE property_i18n.value LIKE '%search string here%';

Replace search string here with what you want to find, keeping the % wildcards on either side so that the match can occur anywhere in the field. You can also see AtoM's database Entity Relationship Diagrams on our wiki, here:
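
One thing to keep in mind with LIKE: without % wildcards, the pattern only matches when the whole field equals the search string. The sketch below shows the difference. It makes some loud assumptions: it uses an in-memory SQLite database as a stand-in (AtoM itself runs on MySQL), the table is a simplified two-column version of property_i18n, and the sample rows and the find_values helper are invented for the example.

```python
import sqlite3

# In-memory SQLite stand-in. AtoM uses MySQL, and the real property_i18n
# table has more columns than this simplified version.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE property_i18n (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO property_i18n (id, value) VALUES (?, ?)",
    [
        (1, "Minutes of the council meeting held in 1921"),
        (2, "council"),
        (3, "Parish register of baptisms"),
    ],
)


def find_values(pattern: str) -> list:
    """Return the ids of rows whose value matches the LIKE pattern."""
    rows = conn.execute(
        "SELECT id FROM property_i18n WHERE value LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]


exact_only = find_values("council")   # no wildcards: whole value must match
anywhere = find_values("%council%")   # wildcards: match anywhere in the value
print(exact_only, anywhere)
```

Here the pattern without wildcards finds only the row whose entire value is the search word, while the wildcard version also finds the longer text that contains it.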

https://wiki.accesstomemory.org/Development/ERDs

You could potentially look into increasing the max length of this field (i.e. changing the field type from TEXT to MEDIUMTEXT would increase the max size from 65,535 bytes to 16,777,215 bytes) - but be aware that a) this will greatly increase the overall size of your search index, and may therefore return more noise in the search results, and b) this will only help you find the related description in AtoM - not a specific page within a text document.

We have long wanted to provide AtoM with a modern image viewer and document reader that can better support the kind of use case you are describing. Our ideal solution would likely be to implement IIIF support, and use one of the many open source IIIF viewers - we are partial to Universal Viewer and Diva.js, but they are all strong options. However, this would be a major development project for AtoM, requiring community support.

As you may know, AtoM relies on our community for major upgrades, enhancements, and new feature development - either in the form of development sponsorship, or community code contributions that follow our development recommendations. You can read more about how we maintain and develop AtoM here:

https://wiki.accesstomemory.org/Development/Philosophy

If you intend to work on adding support for a modern document reader in AtoM, check out the options I've listed above, and our Developer resources on the wiki.

Cheers,

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056

@accesstomemory

Isabel Martín

Mar 26, 2019, 4:16:22 AM

to ica-ato...@googlegroups.com

Hi Dan, thanks for your response. I think I did not explain myself well. The documents I want to display are digital. 
For each description I have added a complete book, scanned page by page in JPG format. 
They are large books and some have more than 1000 pages, so imageflow becomes cumbersome for viewing them.

That is the reason for wanting to implement a more agile viewer where it is easy to move around the pages.

The first problem I have found is the paging: AtoM returns a maximum of 100 pages in each query. Could this be modified?

Do you know of any system already implemented for AtoM that could work for me?

Thanks for everything. Regards.

Dan Gillean

Mar 26, 2019, 10:48:00 AM

to ICA-AtoM Users

Hi Isabel,

Unfortunately, I don't currently know of anyone who has improved on AtoM's current digital media solutions - however, as this is an open source project, I am regularly surprised by the fascinating enhancements and customizations our community has developed, so perhaps someone on this list might share something I don't currently know about!

I still think that the IIIF viewer solutions I have suggested would help you most - they are viewers that allow you to browse compound digital objects (i.e. individual pages of a book, where each page is a scan) in a single viewer, with navigation tools, thumbnails, search support, and more. Because the carousel is made up of small thumbnails and JPGs typically don't have an embedded text layer, the current digital object carousel is not an ideal way to explore and find specific content inside a compound object like a book.
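
To give a rough idea of what such a viewer consumes: IIIF viewers are driven by a JSON "manifest" that lists the pages of a compound object in order. The Python sketch below builds a minimal manifest in the shape of the IIIF Presentation API 2.1 for a book scanned as JPGs. Please treat it as an illustration under stated assumptions, not as something AtoM produces today: the build_manifest helper, every example.org URL, the book id, and the pixel dimensions are hypothetical, and whether a given viewer will display plain JPGs without a IIIF image server behind them is something to check in that viewer's documentation.

```python
import json


def build_manifest(base_url, book_id, title, pages):
    """Build a IIIF Presentation 2.1 style manifest for a list of page scans.

    pages is a list of (image_url, width, height) tuples, one per page.
    """
    canvases = []
    for number, (image_url, width, height) in enumerate(pages, start=1):
        canvas_id = f"{base_url}/{book_id}/canvas/{number}"
        canvases.append({
            "@id": canvas_id,
            "@type": "sc:Canvas",
            "label": f"Page {number}",
            "width": width,
            "height": height,
            # One painting annotation attaches the scan to its canvas.
            "images": [{
                "@type": "oa:Annotation",
                "motivation": "sc:painting",
                "on": canvas_id,
                "resource": {
                    "@id": image_url,
                    "@type": "dctypes:Image",
                    "format": "image/jpeg",
                    "width": width,
                    "height": height,
                },
            }],
        })
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@id": f"{base_url}/{book_id}/manifest",
        "@type": "sc:Manifest",
        "label": title,
        "sequences": [{
            "@id": f"{base_url}/{book_id}/sequence/normal",
            "@type": "sc:Sequence",
            "canvases": canvases,
        }],
    }


# Hypothetical example data: three scanned pages of one book.
pages = [
    (f"https://example.org/scans/book-1/page-{n:04d}.jpg", 2000, 3000)
    for n in range(1, 4)
]
manifest = build_manifest("https://example.org/iiif", "book-1", "Example book", pages)
print(json.dumps(manifest, indent=2))
```

Each canvas stands for one page, so a 1,000-page book would simply have 1,000 canvases in the sequence, and the viewer takes care of thumbnails and page navigation.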

In terms of the number of pages returned: this is controlled by the global administrative setting for results per page, found in Admin > Settings > Global here:

https://www.accesstomemory.org/docs/latest/user-manual/administer/settings/#results-per-page

You are correct that currently this setting is constrained in the code to a minimum of 5 and a maximum of 100 results per page. This is set in the code here:

https://github.com/artefactual/atom/blob/qa/2.5.x/lib/form/SettingsGlobalForm.class.php#L28-L31

And then it is invoked below, here:

https://github.com/artefactual/atom/blob/qa/2.5.x/lib/form/SettingsGlobalForm.class.php#L118-L130

I believe these hard limits are set for performance reasons, so if you intend to modify them locally, do so at your own risk, and I recommend making local backups of your data before you proceed, just in case!
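
If you would rather leave the limit alone, a custom viewer could fetch a long book in several batches instead. Here is a minimal Python sketch of the arithmetic involved. The page_ranges helper is hypothetical and not part of AtoM, and the 1,050-page book is just an example figure.

```python
import math


def page_ranges(total_items, per_page=100):
    """Yield (page_number, first_index, last_index) for each page of results.

    Indexes are zero-based and last_index is inclusive.
    """
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    page_count = math.ceil(total_items / per_page)
    for page in range(1, page_count + 1):
        first = (page - 1) * per_page
        last = min(first + per_page, total_items) - 1
        yield page, first, last


# A 1,050-page book at the maximum of 100 results per page.
ranges = list(page_ranges(1050, per_page=100))
print(len(ranges))   # number of requests needed
print(ranges[0])     # first page of results
print(ranges[-1])    # last, partial page of results
```

With these example numbers the book needs 11 requests, the last of which returns only the final 50 items.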

Cheers,

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056

@accesstomemory

Isabel Martín

Mar 27, 2019, 3:32:20 AM

to AtoM Users

Hi Dan, thank you very much for your response.
