Is there a way to keep track of memory usage per archival description or similar in AtoM?

th...@arkivsormland.se

Aug 10, 2020, 4:17:24 AM
to AtoM Users
Hi!

I wonder if there is a way to see how much memory is used to store all the data that is uploaded/connected to an individual archival description in AtoM? We store data from various organizations, and we need a way to keep track of how much data each and every organization is storing on our servers. We are currently using AtoM 2.5.

Regards,
Theo

Dan Gillean

Aug 10, 2020, 4:22:28 PM
to ICA-AtoM Users
Hi Theo, 

First, memory and disk space are two different things. In terms of storage, we're talking about disk space - AtoM does allow you to see uploads associated with a particular repository (and set limits per repository, if desired), as well as overall. See: 
Keep in mind that this does not currently track space used by downloads - such as generated or uploaded finding aids, cached XML files, reports, etc. For more information on the structure of both the uploads and downloads directories, see: 
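
If you have shell access to the server, one low-tech way to get per-directory totals is to walk the uploads directory and sum file sizes. Here is a minimal sketch in Python - the path below is an assumption based on a default package install, and it presumes uploads are grouped into per-repository subfolders under uploads/r, so adjust both to match your own layout:

    # Rough per-subfolder disk usage tally for AtoM's uploads directory.
    # UPLOADS is an assumed default install path - adjust to your server.
    import os

    UPLOADS = "/usr/share/nginx/atom/uploads/r"

    for repo in sorted(os.listdir(UPLOADS)):
        total = 0
        for root, _dirs, files in os.walk(os.path.join(UPLOADS, repo)):
            for name in files:
                total += os.path.getsize(os.path.join(root, name))
        print(f"{repo}: {total / 1024 ** 2:.1f} MiB")

Remember that this only covers uploads - the downloads directory would need the same treatment if you want those figures as well.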
Additionally, this is on a per-repository basis. It will be trickier to figure out per descriptive hierarchy. For each individual description, AtoM does show you file sizes for the uploaded master, but it doesn't currently provide additional technical metadata for the derivatives - though this is coming in our 2.7 release. In the meantime, however, I imagine it might be cumbersome to have to go through all your descriptions individually to collect some numbers. It might be possible for someone to create a script that could derive this information from the database, but that would require a deeper look than I can currently provide here. 
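
To give a sense of what such a script might look like, here is a very rough sketch in Python. Please treat every name in it as an assumption rather than a supported interface: it presumes the digital_object table stores a byte_size column and links to descriptions via information_object_id, and that information_object uses nested-set lft/rgt columns to model the hierarchy - check all of this against your own 2.5 schema before trusting any numbers it produces:

    # Hedged sketch: sum digital object sizes for one descriptive hierarchy.
    # Table/column names (digital_object.byte_size, information_object.lft/rgt)
    # and the credentials are assumptions - verify against your AtoM schema.
    import mysql.connector

    conn = mysql.connector.connect(user="atom-user", password="secret",
                                   database="atom")  # hypothetical credentials
    cur = conn.cursor()

    TOP_ID = 123  # hypothetical id of the top-level description

    cur.execute("""
        SELECT COALESCE(SUM(do.byte_size), 0)
        FROM digital_object do
        JOIN information_object io ON do.information_object_id = io.id
        JOIN information_object top ON top.id = %s
        WHERE io.lft BETWEEN top.lft AND top.rgt
    """, (TOP_ID,))

    (total_bytes,) = cur.fetchone()
    print(f"Hierarchy {TOP_ID}: {total_bytes / 1024 ** 2:.1f} MiB")

Note that a query like this only counts what the database records for each digital object row; whether derivative rows carry their own byte_size values is exactly the kind of thing you would need to verify first.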

As for the metadata itself - it would be much harder to calculate the disk space used per repository, as this data is held in a relational database. I can tell you, by way of example, that the entire dataset in our public demo site is about 23 MB when dumped to a SQL file. An individual description is likely only a couple of KB (it's really just a bunch of connected rows in a series of tables), though as part of a SQL dump you can't really separate them out like that. 
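
If you want to reproduce that kind of measurement against your own data, one option is to dump the database and check the size of the resulting file. A sketch, assuming a database named "atom" and a user with dump privileges (both assumptions - substitute your own):

    # Dump the AtoM database and report the size of the resulting SQL file.
    # Database name and user are assumptions; mysqldump will prompt for a password.
    import os
    import subprocess

    subprocess.run(
        ["mysqldump", "-u", "atom-user", "-p", "atom",
         "--result-file=atom-dump.sql"],
        check=True,
    )
    print(f"{os.path.getsize('atom-dump.sql') / 1_000_000:.1f} MB")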

Memory, on the other hand, is not something that would be very easy to determine on a per-institution basis. The memory on your server is a pool that is drawn upon as needed, and then released for other uses - so its usage will go up or down depending on the activities being performed. Some operations (for example, XML imports) require reading a lot of data into memory, to be able to quickly refer to parts of it while parsing the import - but once the import completes, that memory should become available again. The same is true for most processes - in general, I'd suspect that writes require more memory than reads, especially if you are using caching on your public website. I've no idea how you would go about determining memory usage per institution, let alone per description. I will mention that there are tools available that will let you monitor system resource usage - we describe one command-line tool (htop) in our documentation here: 
This will show you memory usage at the server level - I suppose you could do some tests by performing actions in the user interface while monitoring with htop or a similar tool, to establish a rough benchmark, but it would be rough at best, and likely to vary depending on a number of interrelated factors. There may also be other types of tools out there that could help, but that's beyond my personal experience. 

Likely not as straightforward a response as you were hoping for, but I hope it helps! 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him


th...@arkivsormland.se

Aug 11, 2020, 4:49:41 AM
to AtoM Users
Hi Dan,

Thanks for your answer, and for the clarification regarding the difference between memory and disk space (English is not my first language). So, expressed in these terms, what we are interested in is keeping track of how much disk space is used by each descriptive hierarchy. You wrote that AtoM can show file sizes for the uploaded master - but not how. Could you please tell me where I can find this information? And by the way, thanks for telling me about the coming 2.7 release.

Regards,
Theo

Dan Gillean

Aug 11, 2020, 10:16:53 AM
to ICA-AtoM Users
Hi Theo, 

You can see the technical metadata in the Digital object metadata area after uploading a file. Here's a description in our demo site as an example:
These details can potentially be hidden from public users via the Visible elements module, so be sure to log in. Once logged in, you can also click on the area header to go to the digital object edit page, which will also show you the file size for each derivative. 

One final thing to note: please notice that we use kibibytes (indicated by the KiB) and not kilobytes (KB) to measure file sizes. "Kilo" implies units of 1,000, while "kibi" in this case implies multiples of 1,024, and the latter has been accepted since 1998 as a better standard for measuring bytes. The difference is subtle (and in some cases people still use kilobyte to mean 1,024 bytes anyway) - but we've opted to use the available standardized term for clarity. In this case, the file shown in the image above would be about 279.04 kilobytes. 
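
To make the arithmetic concrete (assuming the file in that example is listed at 272.5 KiB - a hypothetical figure chosen to match the 279.04 number above):

    kib = 272.5                    # size as displayed in AtoM, in kibibytes
    size_bytes = kib * 1024        # a kibibyte is 1,024 bytes
    kilobytes = size_bytes / 1000  # a kilobyte is 1,000 bytes
    print(f"{kib} KiB = {size_bytes:.0f} bytes = {kilobytes:.2f} KB")
    # -> 272.5 KiB = 279040 bytes = 279.04 KB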

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him

th...@arkivsormland.se

Aug 12, 2020, 3:33:11 AM
to AtoM Users

Aha, so that is what you meant when you said that AtoM, for now, can only show the size of the uploaded master but not its derivatives... When the forthcoming AtoM 2.7 is released (sometime next year, I presume), will it be possible to get information about the size (in terms of occupied disk space) of the whole bundle of masters and derivatives that belongs to the same descriptive hierarchy, or does the update only make it possible to get information about the size of a single master file and its derivatives?

Btw, I don't know if it makes a difference, but for now, we are uploading everything to AtoM directly from Archivematica.

All the best,

Theo

Dan Gillean

Aug 12, 2020, 2:10:33 PM
to ICA-AtoM Users
Hi again Theo, 

Unfortunately, the focus of the current development is on clarifying the technical metadata between the different versions of an image on a single description, not on viewing aggregate information. This will also bring slightly improved Archivematica integration, showing more information about the preservation copies. You can find more details and initial wireframes on the following issue ticket: 
Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him

th...@arkivsormland.se

Aug 13, 2020, 3:27:08 AM
to AtoM Users
Ok, I understand!

Thank you for answering my questions, Dan.

Regards,
Theo
