Hi Isabel,
First, a clarification. It sounds like you are talking about mostly long text documents, like PDFs for example, and you want to improve AtoM's ability to search within the documents - is that correct?
If so, then I don't think ImageFlow is what you want to focus your efforts on.
ImageFlow is a simple JavaScript library for creating a gallery - we use it in AtoM for the digital object carousel that shows thumbnail previews when there are digital objects attached at lower levels of a descriptive hierarchy.
For text documents, AtoM doesn't actually have a proper viewer - we rely on the built-in PDF viewers included in most modern browsers, while users of legacy browsers get a download so they can view the PDF locally. Some of the built-in viewers include a search box; if yours doesn't, but the document has a text layer (from OCR, for example), then you should still be able to press CTRL+F (or CMD+F on a Mac) to search for content. Remember, the quality of the results will depend on the quality of the text layer. Poor OCR often leaves errors and glitches in the text layer that prevent accurate searching - a good way to test this is to select part of the document text with your cursor, copy it, and paste it somewhere else to examine. There is currently no way to edit the text layer of a document in AtoM.
AtoM does index the text layer of PDFs and other text-based digital objects when they are uploaded, so a search in AtoM will return results that include matches from inside the document. However, in addition to the possibility of a bad text layer, AtoM also has a maximum character limit on the relevant database field. This field is a TEXT type field in the property_i18n table, whose maximum length is currently 65,535 bytes. That would equal 65,535 characters in a single-byte encoding (such as Latin-1), but AtoM uses UTF-8, in which each character takes 1 to 4 bytes, so it's difficult to give an exact word or character count. For documents that are 1,000+ pages, though, it is quite possible that your content is being cut off in the database, which might be adding to the difficulties you are having in finding exactly what you want.
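To get a feel for how the byte limit relates to character counts, here is a minimal Python sketch. It is only an illustration: the function names and the sample text are made up for this example, and it estimates how many characters would fit in 65,535 UTF-8 bytes rather than reproducing exactly how AtoM or MySQL cuts the value off.

```python
# Maximum size of a MySQL TEXT column, as mentioned above.
TEXT_LIMIT_BYTES = 65_535


def utf8_size(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def chars_that_fit(text: str, limit: int = TEXT_LIMIT_BYTES) -> int:
    """Count how many leading characters fit within `limit` UTF-8 bytes."""
    used = 0
    for count, char in enumerate(text):
        used += len(char.encode("utf-8"))
        if used > limit:
            # This character would push us past the limit.
            return count
    return len(text)


if __name__ == "__main__":
    # Hypothetical stand-in for the text layer extracted from a long PDF.
    sample = "Ñandú archive, año 1902. " * 4000
    print(f"characters:           {len(sample)}")
    print(f"UTF-8 bytes:          {utf8_size(sample)}")
    print(f"characters that fit:  {chars_that_fit(sample)}")
```

If the byte count of your extracted text is well above the limit, it is likely that only the first part of the document is searchable in AtoM.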
If you want to find a particular record in the database based on content in the text layer, you could try the following SQL query:
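- SELECT property_i18n.value FROM property_i18n WHERE property_i18n.value LIKE "%search string here%";
Replace search string here with what you want to find; the % wildcards on either side let the match occur anywhere in the field, since LIKE without them only matches when the whole value equals the search string. You can also see AtoM's database Entity Relationship Diagrams on our wiki, here: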
You could potentially look into increasing the max length of this field (i.e. changing the field type from TEXT to MEDIUMTEXT would increase the max size from 65,535 bytes to 16,777,215 bytes) - but be aware that a) this will greatly increase the overall size of your search index, and may therefore return more noise in the search results, and b) this will only help you find the related description in AtoM - not a specific page within a text document.
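As a rough way to judge whether MEDIUMTEXT would be enough for your documents, the sketch below picks the smallest MySQL text type that can hold a given string. The size limits are MySQL's documented maximums; the function name is just something I made up for illustration, and this is not part of AtoM.

```python
# Maximum sizes, in bytes, of MySQL's text column types, smallest first.
MYSQL_TEXT_TYPES = [
    ("TINYTEXT", 255),
    ("TEXT", 65_535),
    ("MEDIUMTEXT", 16_777_215),
    ("LONGTEXT", 4_294_967_295),
]


def smallest_text_type(text: str) -> str:
    """Return the smallest MySQL text type that can store `text` as UTF-8."""
    size = len(text.encode("utf-8"))
    for name, limit in MYSQL_TEXT_TYPES:
        if size <= limit:
            return name
    raise ValueError(f"{size} bytes exceeds the largest MySQL text type")


if __name__ == "__main__":
    # Hypothetical text layer of roughly 100,000 single-byte characters.
    print(smallest_text_type("x" * 100_000))
```

You could run something like this against the text extracted from a few of your largest PDFs before deciding whether changing the field type is worth it.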
We have long wanted to provide AtoM with a modern image viewer and document reader that can better support the kind of use case you are describing. Our ideal solution would likely be to implement IIIF support, and use one of the many open source IIIF viewers - we are partial to Universal Viewer and Diva.js, but they are all strong options. However, this would be a major development project for AtoM, requiring community support.
As you may know, AtoM relies on our community for major upgrades, enhancements, and new feature development - either in the form of development sponsorship, or community code contributions that follow our development recommendations. You can read more about how we maintain and develop AtoM here:
If you intend to work on adding support for a modern document reader in AtoM, check out the options I've listed above, and our Developer resources on the wiki.
Cheers,