Generating on-the-fly media derivatives


luc...@gmail.com

Jun 23, 2017, 10:29:49 AM
to AtoM Users
Hi everyone,
I wonder if it is possible to configure AtoM to stream media formats at a lower quality than the original in the AIP without generating more DIPs?
The subject is very close to "Regenerating Derivatives", although, if I'm not mistaken, that section covers only previews and thumbnails (https://www.accesstomemory.org/en/docs/2.3/admin-manual/maintenance/cli-tools/#regenerating-derivatives).
When saving high-resolution multimedia formats (photos, videos, music, but also historical documents in graphic form), it is useful to allow only low-resolution versions to be displayed and/or downloaded.
It is difficult to establish a priori which lower resolution is most convenient to adopt and, in addition, the parameters that define it may vary over time (for example, depending on the available network bandwidth, which tends to improve over time, or following specific time-dependent copyrights). So the best solution would be to give AtoM the original formats and have it generate lower-quality formats on the fly. I know some media servers that use ffmpeg for this purpose and, for example, convert on the fly from FLAC (a lossless codec) to MP3, even adjusting the compression according to the detected network bandwidth.
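As a rough illustration of the kind of on-the-fly conversion described above, here is a minimal Python sketch that picks an MP3 bitrate from a bandwidth estimate and streams ffmpeg's output in chunks. This is not something AtoM does today. The bitrate ladder, the headroom factor, and the function names are all assumptions made up for the example; only the ffmpeg options themselves (`-i`, `-vn`, `-codec:a libmp3lame`, `-b:a`, `-f mp3`, `pipe:1`) are standard ones.

```python
import subprocess
from typing import Iterator, List

# Hypothetical bitrate ladder in kbit/s; not an AtoM or ffmpeg default.
BITRATE_LADDER = [64, 96, 128, 192, 320]


def pick_bitrate_kbps(bandwidth_kbps: float, headroom: float = 0.5) -> int:
    """Pick the highest ladder bitrate that fits within a share of the bandwidth."""
    budget = bandwidth_kbps * headroom
    candidates = [b for b in BITRATE_LADDER if b <= budget]
    # Fall back to the lowest rung when even that exceeds the budget.
    return max(candidates) if candidates else BITRATE_LADDER[0]


def build_ffmpeg_command(source_path: str, bitrate_kbps: int) -> List[str]:
    """Build an ffmpeg command that transcodes the source to MP3 on stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-i", source_path,
        "-vn",                        # ignore any embedded cover art / video
        "-codec:a", "libmp3lame",     # requires ffmpeg built with libmp3lame
        "-b:a", f"{bitrate_kbps}k",
        "-f", "mp3",
        "pipe:1",                     # write the result to stdout
    ]


def stream_mp3(source_path: str, bandwidth_kbps: float,
               chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield MP3 chunks as ffmpeg produces them, without writing a derivative."""
    cmd = build_ffmpeg_command(source_path, pick_bitrate_kbps(bandwidth_kbps))
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:
                break
            yield chunk
    finally:
        # Stop ffmpeg if the client went away before the stream finished.
        if proc.poll() is None:
            proc.terminate()
        proc.stdout.close()
        proc.wait()
```

A web handler could return `stream_mp3(path, estimated_bandwidth)` as a chunked response body. The cost is one ffmpeg process per request, which is the trade-off against pre-generated derivatives.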
Is it possible to do something like that in AtoM (since it uses ffmpeg) without having to have duplicate DIPs with different formats for the same AIP?
Thanks in advance for your answers.

Luca

Dan Gillean

Jun 23, 2017, 12:05:19 PM
to ICA-AtoM Users
Hi Luca,

This is a very interesting thought! First, some background on how things currently work:

Right now, when you upload a digital object to AtoM it is considered the master digital object. On upload, the application will use its media-related packages (ffmpeg, imagemagick, ghostscript, pdftotext) to generate two copies for display in the user interface - a thumbnail image (for use in search and browse results pages, and the digital object carousel) and a reference display copy of the object, which is shown on the view page of an archival description record. By default, public users cannot click on the reference display copy and access the master digital object (except for PDFs, since the reference display copy is too small to be useful) - however, an administrator can control this via the user and group permissions - see:

Administrators and other users with the correct permissions can also use the PREMIS actionable rights to restrict access to digital objects, if desired. See:

So generally, low-res access copies are generated based on preconfigured conversion rules from ffmpeg, imagemagick, etc. The reference display copy is the one created for streaming in the browser / public access, etc. These derivatives are not currently generated on the fly, but in advance, on upload. It's possible that some AtoM users might not have the bandwidth to effectively generate access copies on the fly without timing out before they are served, so there are some advantages to this approach.


However, as you point out, right now the user has no choice over the formats used for conversion, or dimensions/resolution/etc of the access copies, without digging into ffmpeg and other libraries on their own and making changes (requiring command-line access and a familiarity with the tools, etc). Additionally, there is some redundancy that we would really like to solve:

When Archivematica sends AtoM a DIP (already in access copy format), AtoM still takes these and generates the derivatives from them. The thumbnail makes sense, but for the reference display copy, AtoM is not quite as efficient as Archivematica - so sometimes the AtoM reference copy will end up being a bigger file size than the original DIP object. It is an unnecessary and inefficient conversion, resulting from the two applications being developed independently at points. 


An ideal solution (I think) would be to abstract Archivematica's Format Policy Registry (FPR) so it can be used as a service by many applications (a long-term goal of the FPR), and so that AtoM can make use of it as a tool for managing derivatives. That way users could have an interface to customize exactly how they want derivatives to be created, and we could customize the rules so that when formats that are already in a suitable access version are received (mp3 for example - or, generally, most DIP upload objects), no conversion takes place. Users could also make changes if better formats become available, and trigger the regen-derivatives task to update those held in AtoM.
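To make the rule idea a bit more concrete, here is a small Python sketch of a policy lookup that skips conversion when the incoming file is already in an access format. It is purely illustrative: it is not the FPR's data model or API, and the format lists, the `DerivativeRule` class, and `plan_reference_copy` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class DerivativeRule:
    """Hypothetical rule: which tool produces which access format."""
    target_extension: str
    tool: str


# Assumed set of formats treated as already suitable for access.
ACCESS_READY = {".mp3", ".mp4", ".jpg"}

# Assumed conversion rules keyed by source extension.
RULES = {
    ".flac": DerivativeRule(".mp3", "ffmpeg"),
    ".wav": DerivativeRule(".mp3", "ffmpeg"),
    ".mov": DerivativeRule(".mp4", "ffmpeg"),
    ".tif": DerivativeRule(".jpg", "imagemagick"),
}


def plan_reference_copy(filename: str) -> Optional[DerivativeRule]:
    """Return the rule to apply, or None when no conversion is needed.

    Raises LookupError when the format has no rule, so that unknown
    formats are not silently treated as access-ready.
    """
    ext = Path(filename).suffix.lower()
    if ext in ACCESS_READY:
        return None  # e.g. an mp3 from a DIP upload: reuse it as-is
    try:
        return RULES[ext]
    except KeyError:
        raise LookupError(f"no derivative rule for {ext!r}") from None
```

With rules held as data like this, swapping in a different tool or target format would mean editing a rule and re-running a regeneration task rather than changing code.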

For performance and scalability, it may make sense to continue generating access copies in advance when needed, rather than on-the-fly as they are requested. However, there are certainly ways in which we could reduce redundancy (especially with DIP uploads), and give users more control via the user interface in how and when this occurs. An FPR-based solution would also allow users to swap in new tools for generating derivatives if and when better tools than ffmpeg/ghostscript/imagemagick become available.

Of course, this proposed solution is a long-term one - and it is not the only approach worth considering!

In any case, to change how AtoM handles derivatives and implement something akin to what you are suggesting would require development. However we choose to address this in the future, as we begin considering a next-generation version of AtoM we hope to keep this kind of modular design at the forefront of our considerations.


Regards,

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory


