pg_restore slow progress


Hannes Schreiter

May 7, 2024, 10:53:08 AM
to xnat_discussion
Hello there,

we are running a Dockerized XNAT 1.8.6.1 instance.
Somehow our DB got corrupted (data access issues; pg_dump also failed).

This was the reason we tried to restore one of our daily backups via pg_restore.
Unfortunately, this takes ages: the restore has now been running for over 100 hours.

What I tried:
- create a new clean postgres container
- restore the schema using: pg_restore -U postgres --dbname=xnat --clean --if-exists --job=4 --verbose --format=c --schema-only backupfile.dump 
(worked fine)
- restore data only using: pg_restore -U postgres --dbname=xnat --job=4 --verbose --format=c --data-only --disable-triggers  backupfile.dump

But restore hangs for days here:
pg_restore: enabling triggers for xhbm_configuration_data
pg_restore: finished item 17169 TABLE DATA xhbm_configuration_data
pg_restore: skipping item 11409 CONSTRAINT xhbm_configuration_data xhbm_configuration_data_pkey
pg_restore: skipping item 15539 FK CONSTRAINT xhbm_configuration fk_dudtx26q5dhjm96kgmb4qw89b
pg_restore: skipping item 15540 FK CONSTRAINT xhbm_configuration_data_configurations fk_2s21q9v235oouof1srpoeot68

Are there ways to improve restore speed, or should we just wait? I already adjusted some of the PostgreSQL settings, like maintenance_work_mem and autovacuum = off.
Currently the DB size increases by a view Bytes per second, and there are still over 20 GB to go.

What exactly is the purpose of the xhbm_configuration_data table? It is around 12 GB in our case, and I see all the DICOM tags listed there, which explains the size. Could a restore work properly when excluding this specific table?

I am currently relatively new to the XNAT administration side at our institute and also not an SQL expert...
I hope the given information helps in understanding our problem.

Thanks in advance
Hannes


Rick Herrick

May 7, 2024, 11:13:44 AM
to xnat_di...@googlegroups.com

Hi Hannes,

 

To your question about xhbm_configuration_data: I think what is taking up so much space there is data stored by the OHIF viewer plugin. You definitely need other entries in that table (e.g. the site-wide and project anonymization scripts are stored there), but you could try removing all of the rows in xhbm_configuration and xhbm_configuration_data for the viewer. I think the tool ID in xhbm_configuration is simply ohif-viewer, so if you deleted those rows and any corresponding entries in xhbm_configuration_data, that should greatly reduce the size of the table (newer versions of the plugin use a separate table named xhbm_ohif_session_data for the session data).
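A rough sketch of what that cleanup could look like in SQL. Note that the column names (tool, config_data) are assumptions about the table layout, not confirmed, so verify them with \d xhbm_configuration in psql first and practice on a copy:

```sql
-- Sketch only: 'tool' and 'config_data' are assumed column names;
-- check the real layout before running anything against your DB.
BEGIN;
DELETE FROM xhbm_configuration_data d
 USING xhbm_configuration c
 WHERE c.config_data = d.id
   AND c.tool = 'ohif-viewer';
DELETE FROM xhbm_configuration
 WHERE tool = 'ohif-viewer';
COMMIT;
```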

 

That said… I’ve restored large databases in the past and running for over 100 hours seems excessive. This seems telling to me:

 

> Currently the DB size increases by a view Bytes per second

 

Presuming you meant a “few bytes per second", that seems awfully slow to me, but PostgreSQL tuning and analysis is not one of my areas of expertise. The only suggestions I can come up with are:

 

  • Increase the memory available to the container itself and then to PostgreSQL running in the container. Most large PostgreSQL deployments that I know have something on the order of 64GB to 128GB of RAM available.
  • Check the volumes mounted for the PostgreSQL container. Performance impediments in the various layers between the container accessing “its” local storage and the actual media on the container server could cause issues like this, simply because it takes so long to perform writes.

 

Maybe someone with more experience tuning and managing PostgreSQL in a Docker container has some other suggestions…

 


Hannes Schreiter

May 8, 2024, 2:37:39 AM
to xnat_discussion
Hello Rick,

Thank you very much for your help.
You are absolutely right: 17810 of the 17820 entries in xhbm_configuration are related to the ohif-viewer (in the DB restored so far). I will definitely upgrade OHIF to a newer version to have easier control over this.
What is the best way to delete these rows from a backup file in the .dump format before a restore?
Is it sufficient to only handle the entries in the xhbm_configuration table, or do I have to manually delete the related entries in xhbm_configuration_data?

   > Presuming you meant a “few bytes per second"...
Oops, yes, of course I meant "few". Generally, the database size grows relatively quickly (in less than an hour) up to around 40 GB, at which point the xhbm_configuration tables are processed.

   > Increase the memory available to the container itself and then to PostgreSQL
Our machine has 64 GB of RAM and 8 cores, with no resource limitations in the docker-compose file.
I have set the PostgreSQL settings based on this recommendation (knowing it is fairly old): http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html
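For reference, the fast-restore settings from that sort of recommendation map to something like the following in postgresql.conf. The values here are illustrative, not prescriptive, and fsync = off is only acceptable on a disposable restore target; everything should be reverted once the restore finishes:

```
maintenance_work_mem = 2GB
max_wal_size = 8GB            # fewer checkpoints during bulk load
checkpoint_timeout = 30min
autovacuum = off              # re-enable after the restore
synchronous_commit = off
fsync = off                   # DANGEROUS: throwaway restore targets only
```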

Regarding the mounts: both the backup file and the DB are stored on a locally mounted SSD, so no network drives or HDDs are involved. I guess in the short term there might not be much optimization potential there.

Thank you very much for the information!

Best regards,
Hannes

Tim Olsen

May 8, 2024, 12:58:12 PM
to xnat_discussion
It should be sufficient to exclude only the rows in the xhbm_configuration table that are for ohif-viewer.  I'm guessing that change alone will resolve your issues.  But I'll also flag this to one of our devops experts to see if they have any other suggestions (including how to actually exclude those rows).
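One way to exclude whole items at restore time, sketched here with a stand-in list and placeholder file names, is pg_restore's table-of-contents mechanism: dump the TOC with pg_restore -l, comment out the unwanted entries, and restore with -L:

```shell
# Sketch of pg_restore's TOC-editing workflow (file names are
# placeholders; test against a copy first).
# The real list would come from:  pg_restore -l backupfile.dump > toc.list
# Stand-in TOC so the filtering step can be demonstrated:
printf '%s\n' \
  '17168 0 0 TABLE DATA public xnat_subjectdata postgres' \
  '17169 0 0 TABLE DATA public xhbm_configuration_data postgres' > toc.list
# A leading ';' makes pg_restore -L skip that entry:
sed -i '/TABLE DATA .* xhbm_configuration_data/s/^/;/' toc.list
cat toc.list
# Then restore everything except the commented-out entries with:
#   pg_restore -U postgres --dbname=xnat --data-only -L toc.list backupfile.dump
```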

Tim

Rick Herrick

May 8, 2024, 5:07:00 PM
to xnat_di...@googlegroups.com

To delete those entries, just delete all of the lines in xhbm_configuration with the ohif-viewer ID. I’d probably use vim or sed to do that myself (you could also delete the line in xhbm_tool that declares the ohif-viewer tool, so it might be possible to just delete every line that contains that string).

 

The xhbm_configuration_data table is a little trickier because the IDs for the configuration entries will all be different. You could try just deleting every line that contains some distinctive strings within the stored JSON, which should be fairly straightforward, as the JSON for the viewer is quite different from most of the other stuff stored in the configuration service.
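A minimal demonstration of that line-deletion approach, using a stand-in file in place of the real plain-SQL dump:

```shell
# Demonstration on a stand-in file. In practice the input would be the
# plain-SQL dump produced by:  pg_restore -f xnat-plain.sql backupfile.dump
printf '%s\n' \
  'keep: site-wide anonymization script' \
  'drop: ohif-viewer metadata blob' \
  'keep: project anonymization script' > sample.sql
# Delete every line containing the viewer tool ID:
sed '/ohif-viewer/d' sample.sql
```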

 

Hannes Schreiter

May 14, 2024, 7:54:53 AM
to xnat_discussion
Thanks for the input.
Fortunately, our database appears to have been restored successfully after approximately 200 hours.

@Rick, I'm still curious about the deletion approach you suggested. I generated an .sql file from the backup (.dump) using pg_restore -f file.sql backup.dump, but couldn't locate any entries containing 'ohif-viewer'.