Well, that's why I asked; I was a bit worried about the overhead mentioned here and hoped to get some real use-case statistics.
In fact, I've just tried to set up Orthanc with the PostgreSQL Storage and Index plugins on a virtual machine.
Surprisingly, the real *database* size was a bit *less* than the files on the hard disk (as reported by the Linux du command). However, postgres also keeps a WAL (Write-Ahead Logging) file in its data directory, which boosts reliability and can be used to roll the database back. That file took about the same amount of space as the database itself, which is reasonable since the whole database was uploaded in a short amount of time (I uploaded 2 exams, the second right after the first).
Good news: WAL logging is configurable, and you can limit the log:
- set a maximum size
- set the checkpoint timeout (how often the WAL gets recycled)
(and others)
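For reference, these limits live in postgresql.conf; in PostgreSQL 10 the relevant knobs are the following (the values shown are the stock defaults, not recommendations):

```
# postgresql.conf -- WAL sizing knobs (PostgreSQL 10)
max_wal_size = 1GB          # soft limit; checkpoints are forced to keep WAL roughly below this
min_wal_size = 80MB         # recycled WAL files are kept down to this floor
checkpoint_timeout = 5min   # maximum time between automatic checkpoints
```

Note that max_wal_size is a soft limit: WAL can temporarily exceed it under heavy load.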
Bad news: since postgres allocates space for the WAL files, you won't get it back. Even after the log expires, the hard disk space stays allocated.
Note: I used PostgreSQL 10.0.
In postgres 9.4 the WAL files were stored in the pg_xlog directory; it is now called pg_wal.
So, some real data:
Fresh install of Debian testing, PostgreSQL 10.0, Orthanc 1.3.0 and the PostgreSQL plugin from the Debian repository.
Current state: an empty database. 3 exams were uploaded and removed, then 2 of them uploaded and removed again; simply put, I've played around a bit.
After I noticed the WAL size growing, I set checkpoint_timeout = 5min in postgresql.conf and left the other settings untouched.
root@orthanc-postgres:/var/lib/postgresql# du -h
945M ./10/main/pg_wal
56M ./10/main/base
1001M ./10
I then uploaded two studies: 575M in 1129 files and 337M in 662 files.
root@orthanc-postgres:/var/lib/postgresql# du -h
209M ./10/main/pg_wal
820M ./10/main/base
1,1G ./10
820M/912M = 90% (!) — postgres compression?
Surprisingly, the WAL directory was shrinking during the upload!
I uploaded a third study: 2318 instances, 350M (files exported from Ginkgo CADx).
97M ./10/main/pg_wal
1280M ./10/main/base
1378M ./10/main
1280M/1262M = 101%
(Technically I should use ./10/main/base/16385 for the calculation, which took 1258M; still <100%.)
Drawbacks? Database fragmentation. After removal of the largest study, the database hadn't shrunk *at all*. But here comes the VACUUM function.
I left the server alone for a few minutes; the size was still the same.
Uploaded a study: database size increased by the size of the study.
Invoked a manual VACUUM FULL on the table: file size remained constant.
Uploaded another large study: no change in file size (!)
Removed two studies: database size remained constant.
pg_wal stayed at around 100 MB all the time.
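For anyone who wants to repeat these measurements from inside postgres instead of with du, a sketch (the database name 'orthanc' is an assumption here; substitute whatever name your Orthanc plugin is configured with):

```sql
-- Report the on-disk size of the whole database (hypothetical name 'orthanc')
SELECT pg_size_pretty(pg_database_size('orthanc'));

-- Rewrite tables to return free space to the operating system.
-- VACUUM FULL takes an exclusive lock on each table it processes,
-- so run it in a maintenance window, not during uploads.
VACUUM FULL;
```

A plain VACUUM (without FULL) only marks dead rows as reusable, which matches the behavior above: the file size stays constant and new uploads fill the reclaimed space.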
To sum up:
The overhead is not that large. WAL is there for a reason, and it can be limited.
Don't expect postgres to free the hard drive space after removal of instances; postgres keeps it allocated for incoming data. It might eventually be freed.
Let's assume postgres knows what it is doing. ; )
I hope to try it with much larger dataset this week.
--
Regards
Oskar Bożek