Speeding up normalization


Roberto Greiner

Sep 17, 2025, 8:56:18 AM
to archivematica
Hi,

This is about normalization while a transfer is being processed in Archivematica.

I'm currently importing a relatively large number of files into Archivematica and AtoM (all PDFs, about 8,000 files, ~450 GB), and I have noticed a bottleneck. Normalization is done using 'gs', but only one process runs at a time, and gs does not distribute the tasks among multiple CPUs. I have a server with 12/24 CPUs, and only one is used while normalization is running, so the normalization jobs take a long time to complete.

Is there any way to make this step take advantage of multiple CPUs? I searched and found no reference on how to do this. It would speed up the import jobs considerably.

Thank you,

Roberto Greiner

--
  -----------------------------------------------------
                Marcos Roberto Greiner

   Optimists think we live in the best of all worlds
    Pessimists fear that this is true
                                       James Branch Cabell
  -----------------------------------------------------

Santiago Rodríguez Collazo

Sep 18, 2025, 3:14:57 AM
to archiv...@googlegroups.com
Hi Roberto


/santi


--
Santiago Rodríguez
DevOps, Artefactual Systems Inc.

Roberto Greiner

Sep 18, 2025, 6:58:01 AM
to archiv...@googlegroups.com

Great!

I will read the document and implement it as soon as possible.

Thanks!

Roberto Greiner

Nov 12, 2025, 2:37:03 PM
to archiv...@googlegroups.com

Hi,

I only managed to test this now, but starting multiple instances didn't help. I have 8 instances running, and normalization still runs on a single CPU. I also noticed that running multiple instances would not have helped anyway, as each instance already forks into 8 processes, as shown by ps:


root@archive:~# ps -ef|grep -i archivematicaClient
archive+  154514       1  0 15:37 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154515       1  0 15:37 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154516       1  0 15:37 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154573       1  0 15:37 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154586       1  0 15:37 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154587       1  0 15:38 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154598  154586  0 15:38 ?        00:00:02 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154601  154516  0 15:38 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154604  154515  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154605  154514  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154606  154516  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154607  154573  0 15:38 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154608  154586  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154609  154587  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154610  154514  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154611  154516  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154612  154573  0 15:38 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154613  154586  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154614  154587  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154615  154573  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154616  154514  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154617  154514  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154618  154514  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154620  154514  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154622  154515  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154623  154514  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154626  154515  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154628  154516  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154630  154515  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154631  154586  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154632  154515  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154633  154587  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154635  154586  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154638  154587  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154639  154516  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154640  154515  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154641  154573  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154645  154573  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154646  154516  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154647  154573  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154650  154516  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154651  154586  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154652  154573  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154653  154587  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154655  154586  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154656  154587  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154659  154586  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154660  154587  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154662  154515  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154664  154587  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154669  154516  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154719       1  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154721  154719  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154722  154719  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154723  154719  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154724  154719  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154725  154719  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154727  154719  0 15:38 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154729  154719  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154730  154719  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154742       1  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154766  154742  0 15:38 ?        00:00:02 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154767  154742  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154768  154742  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154769  154742  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154770  154742  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154772  154742  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154775  154742  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  154777  154742  0 15:38 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  169515  154515  0 16:18 ?        00:00:00 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  169520  154573  0 16:18 ?        00:00:01 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
archive+  169530  154514  1 16:18 ?        00:00:11 /usr/share/archivematica/virtualenvs/archivematica/bin/python /usr/lib/archivematica/MCPClient/archivematicaClient.py
root      170885  122747  0 16:33 pts/0    00:00:00 grep --color=auto -i archivematicaClient
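
For anyone who wants to tally the workers per instance rather than count the lines by eye, here is a minimal sketch. It assumes the `ps -ef` column layout shown above (UID, PID, PPID, C, STIME, TTY, TIME, CMD) and treats every archivematicaClient.py process whose parent is PID 1 as a top-level instance. The function name `count_children` is hypothetical and not part of Archivematica.

```python
from collections import Counter


def count_children(ps_output: str, needle: str = "archivematicaClient.py") -> dict:
    """Map each top-level instance PID (parent PID 1) to its number of forked children.

    Assumes `ps -ef` style columns: UID PID PPID C STIME TTY TIME CMD.
    """
    rows = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields so the full command stays in one piece.
        fields = line.split(None, 7)
        if len(fields) < 8 or needle not in fields[7]:
            continue
        # Skip the header row or anything without numeric PID/PPID columns.
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        rows.append((int(fields[1]), int(fields[2])))

    # Top-level instances are the matching processes whose parent is PID 1.
    parents = {pid for pid, ppid in rows if ppid == 1}
    # Count only processes that were forked by one of those instances.
    counts = Counter(ppid for _pid, ppid in rows if ppid in parents)
    return {pid: counts.get(pid, 0) for pid in sorted(parents)}
```

The text passed in can be a saved copy of the listing above, or live output obtained with `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout`. Note that this only counts Python worker processes; it says nothing about how many `gs` processes are actually running at a given moment.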

Any idea what else I should do to get normalization running on more than one CPU?

Tks.

On 18/09/2025 04:14, 'Santiago Rodríguez Collazo' via archivematica wrote: