Fixity checking compressed AIPs

Ima Oduok

Jul 24, 2025, 5:45:27 PM
to archivematica

Hello everyone! 

I have what might be a silly question. My organization stores AIPs as compressed bags and we are looking into ways to run fixity checks on them in cloud storage environments, specifically in AWS S3. Can fixity checks be run on compressed files without unzipping them first or do they have to be uncompressed for fixity tools to run? Does anyone know if fixity checks on compressed AIPs can be run in Archivematica’s Fixity tool or in any other program?  

- Ima Oduok 

Sarah Romkey

Jul 25, 2025, 8:26:32 AM
to archiv...@googlegroups.com
Hi Ima,

This isn't a silly question at all; in fact, it is quite timely!

Archivematica's own Fixity app does check fixity on compressed packages; in fact, I would say that is its most common usage. The trickier part of your question is the cloud storage part. It's less about unzipping the package (Archivematica stores a checksum for the package as a whole, as well as checksum manifests for all of its contents) and more about the egress charges you'll incur in order to run the fixity check.
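To illustrate the whole-package checksum idea, here is a minimal sketch that hashes a compressed package as an opaque stream of bytes, without unpacking it. It is not Archivematica's implementation; the function names, the SHA-256 default, and the chunk size are assumptions made for the example. Note that every byte still has to be read, so running this outside the cloud provider would incur the transfer described above.

```python
import hashlib


def package_checksum(stream, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a compressed package as raw bytes, without decompressing it.

    `stream` is any binary file-like object with a read(size) method.
    """
    digest = hashlib.new(algorithm)
    # Read in fixed-size chunks so large packages never sit fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_package(stream, expected_hex, algorithm="sha256"):
    """Return True if the stream's checksum matches the stored value."""
    return package_checksum(stream, algorithm) == expected_hex.strip().lower()
```

For a local file this could be called as `verify_package(open(path, "rb"), stored_checksum)`, where `path` and `stored_checksum` are placeholders for the package location and the checksum recorded at storage time.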

I'm interested to hear if any Archivematica users here have implemented their own solution to this problem. We at Artefactual read with interest this blog post from APTrust. There was also a paper at iPRES last year on this topic.

Sorry I don't have an easy solution to share but am very interested to see if others from the community have given this some thought!

Cheers,

Sarah

Sarah Romkey, MAS, MLIS
Head of Hosting and SaaS Products




--
You received this message because you are subscribed to the Google Groups "archivematica" group.
To unsubscribe from this group and stop receiving emails from it, send an email to archivematic...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/archivematica/7d34fa46-755b-45b7-9e2e-d67103ec70f5n%40googlegroups.com.

Mariecris Gatlabayan

Jul 28, 2025, 5:29:51 PM
to archivematica
Hi Sarah,

Does Archivematica also document checksums for uncompressed packages? If so, how do we retrieve them?

Thanks,
Mariecris

Mariecris Gatlabayan

Jul 28, 2025, 5:29:58 PM
to archivematica
Hi all,

Are checksums generated for uncompressed packages? If so, how do we get the checksums for the AIPs?

Thank you for any help you all can provide :)

Best wishes,
Mariecris

On Friday, July 25, 2025 at 5:26:32 AM UTC-7 sro...@artefactual.com wrote:

Joseph Anderson

Jul 29, 2025, 9:36:12 AM
to archivematica
I've been interested in trying the serverless approach described by APTrust, but I'm curious whether anyone has used it yet and could give a rough estimate of the cost per terabyte or gigabyte of data. We've used serverless approaches for other functions and the cost savings are phenomenal.

-Joe

Charlie Hosale

Jul 30, 2025, 1:41:16 PM
to archiv...@googlegroups.com

Hi all,

 

MIT Libraries has been experimenting with fixity in the cloud, but we’re using uncompressed bags, so unfortunately our use case doesn’t align with Ima’s. We’ve developed a toolset that takes Archivematica AIP UUIDs as input and uses AWS Lambda to verify AIPs stored in S3. The code is here: https://github.com/MITLibraries/s3-bagit-validator. Large files and AIPs sometimes time out, but we’ve got local workarounds for those. We’re just getting started in production, so I don’t yet have useful data about costs and time, but I hope to share more in the future. If you’d like more information, feel free to contact me!
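For readers unfamiliar with what verifying an uncompressed bag involves, the sketch below checks each payload file against a BagIt payload manifest on a local filesystem. It is not the MIT Libraries tool and does not touch S3 or Lambda; the function name `verify_bag_payload` is hypothetical, and it assumes a `manifest-sha256.txt` file whose lines hold a checksum followed by a path relative to the bag root. It also skips parts of full BagIt validation, such as tag manifests and percent-encoded paths.

```python
import hashlib
from pathlib import Path


def verify_bag_payload(bag_dir, algorithm="sha256"):
    """Compare each file listed in the payload manifest with its checksum.

    Returns a list of (relative_path, problem) tuples; empty means all good.
    """
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <path relative to bag root>".
        expected, rel_path = line.split(maxsplit=1)
        target = bag / rel_path
        if not target.is_file():
            failures.append((rel_path, "missing"))
            continue
        digest = hashlib.new(algorithm)
        with target.open("rb") as fh:
            # Stream the file in 1 MiB chunks to keep memory use flat.
            for chunk in iter(lambda: fh.read(1024 * 1024), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected.lower():
            failures.append((rel_path, "checksum mismatch"))
    return failures
```

Because each file is checked independently, this kind of work splits naturally across many short-lived function invocations, though very large files can still exceed a single invocation's time limit, as noted above.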

 

Take care,

Charlie

 

Mariecris Gatlabayan

Jul 31, 2025, 12:23:43 PM
to archivematica
Thanks for sharing, Charlie! Have you found using AWS Lambda for fixity checks to be relatively affordable?

Ima Oduok

Jul 31, 2025, 4:49:33 PM
to archiv...@googlegroups.com

Hi Sarah,

 

Thank you for that information. I have a follow up question about fixity checking compressed packages.

 

It sounds like the Fixity app unzips the package to run the checks but also creates a checksum for the zipped package itself. Say you have a zipped package and one of the files within it is changed. Would the checksum for the zipped package change, indicating that one of the files is not as it should be? Or does the zipped package checksum remain the same so long as all the files are present, regardless of any changes to the files themselves?
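The general behaviour of checksums here can be shown with a small experiment: a cryptographic checksum covers every byte of the archive, so changing the content of any file inside it changes the archive's bytes and therefore its checksum. The sketch below builds two in-memory ZIP archives that differ by one character in one file. ZIP is used only because Python ships with it (the AIP's actual compression format may differ), and the helper names are made up for the example.

```python
import hashlib
import io
import zipfile


def build_zip(files):
    """Return the bytes of a ZIP archive built in memory from {name: bytes}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            # Fixed timestamp so two archives differ only in file content.
            info = zipfile.ZipInfo(name, date_time=(2025, 1, 1, 0, 0, 0))
            archive.writestr(info, content)
    return buffer.getvalue()


def sha256_hex(data):
    """Return the SHA-256 checksum of a bytes object as hex."""
    return hashlib.sha256(data).hexdigest()


original = build_zip({"data/a.txt": b"hello", "data/b.txt": b"world"})
altered = build_zip({"data/a.txt": b"hellp", "data/b.txt": b"world"})

# Same file names, same file count, one byte changed inside one file.
package_changed = sha256_hex(original) != sha256_hex(altered)
```

Here `package_changed` is `True`: the presence of all the expected files is not enough to keep the package checksum the same.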

 

From: 'Sarah Romkey' via archivematica <archiv...@googlegroups.com>
Date: Friday, July 25, 2025 at 8:26 AM
To: archiv...@googlegroups.com <archiv...@googlegroups.com>
Subject: Re: [archivematica] Fixity checking compressed AIPs


Charlie Hosale

Aug 5, 2025, 10:42:55 AM
to archiv...@googlegroups.com

Hi Mariecris,

 

We’ve only had it in production since May but so far the Lambda costs are lower than $10 for fixity checking 5 TB of content. That said, the download and local workflow costs for edge cases like the big AIPs could be significant enough to impact budget.

 

Have a nice weekend,

 

Charlie Hosale

Digital Preservation Coordinator

MIT Libraries | Scholarly Communications & Collections Strategy

cho...@mit.edu

 
