Hi to all.
I am exploring the option of using barman for my company's backup operations. Currently we use a simple rsync method: we rsync the archives folder to the backup server every 20 minutes and the PGDATA folder once every night (delta only). I find barman's CLI and general usage much less error-prone than this and also much more convenient for restores, but I am facing two issues and I hope you can help me:
A) The base backups are larger than the actual cluster! This is confirmed both by file system commands and by SQL commands. The "du" command for the base backup folder and the PGDATA folder of the server shows the following differences:
backup size
[barman@mdvmsrv139 base]$ du -sk 20170103T135236
16409412 20170103T135236
PGDATA size
[enterprisedb@mdvmsrv518 ]$ du -sk /pgdata/data
10813016 /pgdata/data
Barman's show-backup command lists the backup as 15.6 GB in size, while an SQL command on the cluster returns an actual size of 8.5 GB.
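To put the two "du -sk" figures in the same unit as the barman and SQL numbers, here is a small Python sketch of the arithmetic. It assumes that "du -sk" reports 1024-byte blocks; the variable names are only for illustration.

```python
# Sizes in KiB, copied from the two "du -sk" outputs above.
KIB_PER_GIB = 1024 * 1024


def kib_to_gib(kib: int) -> float:
    """Convert a size in KiB (as printed by du -sk) to GiB."""
    return kib / KIB_PER_GIB


backup_kib = 16409412   # base backup folder 20170103T135236
pgdata_kib = 10813016   # /pgdata/data on the database server

backup_gib = kib_to_gib(backup_kib)
pgdata_gib = kib_to_gib(pgdata_kib)
extra_gib = kib_to_gib(backup_kib - pgdata_kib)

print(f"backup: {backup_gib:.2f} GiB")
print(f"PGDATA: {pgdata_gib:.2f} GiB")
print(f"extra : {extra_gib:.2f} GiB")
```

So the backup folder on disk is roughly 5.3 GiB larger than PGDATA, and the du figure for the backup agrees with what show-backup reports.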
I suspect this has something to do with our company's setup (created long before I was here). We put our users' tablespaces inside the PGDATA directory, which is not the PostgreSQL default. Then we create a symbolic link named "data" inside the default data directory pointing to our custom data directory, and we do not set the "data_directory" parameter in postgresql.conf at all. I think this setup "confuses" barman, so that it rsyncs the same files twice. If I run "ls" in the base backup folder (20170103T135236), I see a folder named "data" and then one extra folder for each tablespace of the cluster. The folder named "data" is 8.5 GB, exactly the size of the cluster.
The weird part is that the backup is absolutely restorable and the resulting cluster is only 8.5 GB! Am I missing something here? Is this normal? I admit that I am fairly new to PostgreSQL!
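To check my suspicion, a minimal Python sketch like the one below could flag which tablespaces live inside PGDATA. The tablespace names and locations in the example are hypothetical, not our real ones; on a live server the locations could be listed with "SELECT spcname, pg_tablespace_location(oid) FROM pg_tablespace" (available from PostgreSQL 9.2). This only illustrates the idea of the check and says nothing about how barman itself works internally.

```python
import os


def tablespaces_inside_pgdata(pgdata, tablespaces):
    """Return the sorted names of tablespaces located inside pgdata.

    tablespaces maps a tablespace name to its directory location.
    Symbolic links are resolved first, so a linked data directory
    is compared by its real path.
    """
    pgdata_real = os.path.realpath(pgdata)
    nested = []
    for name, location in tablespaces.items():
        location_real = os.path.realpath(location)
        # A location is nested if PGDATA is the common ancestor of both paths.
        if os.path.commonpath([pgdata_real, location_real]) == pgdata_real:
            nested.append(name)
    return sorted(nested)


# Hypothetical tablespace layout, for illustration only.
example = {
    "ts_users": "/pgdata/data/ts_users",
    "ts_elsewhere": "/other_disk/ts_elsewhere",
}
print(tablespaces_inside_pgdata("/pgdata/data", example))
```

Any tablespace reported by this check would presumably be copied once as part of "data" and once more as its own folder, which would match what I see in the backup directory.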
B) On the first base backup for each server, PostgreSQL takes a long time to finalize the backup. I've seen similar problems in this forum that had to do with the archive command not working, but that is not the issue here. Barman finishes the rsync in time, and then it takes 20 minutes for PostgreSQL to create a restore point. The session from which I run the "barman backup" command stays stuck at "Asking PostgreSQL server to finalize the backup", and I do not get the prompt back until the PostgreSQL log shows "SELECT pg_create_restore_point('barman_20170103T135236')". Notice from the following part of the PostgreSQL log file that there is a 20-minute interval between the "pg_stop_backup" and "pg_create_restore_point" commands, during which the only abnormality is the following line from the barman log file:
"barman.server INFO: Another archive-wal process is already running on server DRAW_DB. Skipping to the next server":
2017-01-03 14:24:30 UTC [1619]: [1]: [0]LOG: duration: 1002.900 ms statement: SELECT location, (pg_xlogfile_name_offset(location)).*, now() AS timestamp FROM pg_stop_backup() AS location
2017-01-03 14:27:37 UTC [30356]: [175]: [0]LOG: checkpoint starting: time
2017-01-03 14:27:37 UTC [30356]: [176]: [0]LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.005 s, sync=0.000 s, total=0.009 s; sync files=0, longest=0.000 s, average=0.000 s
2017-01-03 14:42:37 UTC [30356]: [177]: [0]LOG: checkpoint starting: time
2017-01-03 14:42:37 UTC [30356]: [178]: [0]LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.005 s, sync=0.000 s, total=0.008 s; sync files=0, longest=0.000 s, average=0.000 s
2017-01-03 14:44:17 UTC [1619]: [2]: [0]LOG: restore point "barman_20170103T135236" created at A/790000E8
2017-01-03 14:44:17 UTC [1619]: [3]: [0]STATEMENT: SELECT pg_create_restore_point('barman_20170103T135236')
After that, all subsequent incremental backups finish in time and with no problems. Is this 20-minute interval normal behaviour?
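For reference, the exact gap can be computed from the two log timestamps quoted above. The following Python sketch assumes that every log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp, as in our log; the helper name is only illustrative.

```python
from datetime import datetime


def log_time(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' timestamp of a log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


# Beginnings of the two relevant lines, copied from the PostgreSQL log above.
stop_backup_line = "2017-01-03 14:24:30 UTC [1619]: [1]: [0]LOG: duration: 1002.900 ms"
restore_point_line = "2017-01-03 14:44:17 UTC [1619]: [2]: [0]LOG: restore point"

gap = log_time(restore_point_line) - log_time(stop_backup_line)
print(gap)  # 0:19:47
```

So the session waits 19 minutes and 47 seconds between the two statements.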
I am looking forward to your reply. Congratulations on this otherwise very helpful tool!
PS:
database version: EnterpriseDB 9.2 or PostgreSQL 9.3
barman version: 2.0
os version: CentOS 6.8 on both servers