Barman & tablespaces

Tim Verhoeven

Oct 17, 2012, 9:15:40 AM
to pgba...@googlegroups.com
Hi again,

I'm seeing another issue with Barman. It does not seem to cope well with tablespaces. The rsync command it uses turns symlinks into real files and folders, but Postgres itself manages the pg_tblspc folder with symlinks. In our case we created the actual tablespaces inside this directory, and Postgres created symlinks to them.

But then when Barman runs the backup, these symlinks are converted to real files and the disk space needed on the backup server for these tablespaces doubles. Since we run data warehouses in Postgres and don't have fancy deduplication hardware, this really wastes a lot of space.

I then modified the rsync options in the Barman Python code to copy the symlinks as they are. But because the symlinks created by Postgres are absolute, they are now broken on the backup server, and the snippet of Python code that tries to calculate the size of the backup fails.
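To illustrate the failure mode described above, here is a minimal sketch of a size calculation that tolerates dangling symlinks. It is not Barman's actual code; the function name `tree_size` is hypothetical. It relies on `os.lstat()`, which inspects the link itself rather than its target, so a broken absolute link is reported instead of raising an error.

```python
import os
import stat

def tree_size(root):
    """Return (bytes_in_regular_files, symlink_paths) for everything under root.

    Symlinks are never followed, so dangling absolute links are reported
    instead of raising an error.
    """
    total = 0
    symlinks = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat() inspects the link itself
            if stat.S_ISLNK(info.st_mode):
                symlinks.append(path)
            elif stat.S_ISREG(info.st_mode):
                total += info.st_size
    return total, symlinks
```

The returned list of symlinks could then be sized separately or simply logged; which of the two is appropriate depends on how the backup is laid out.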

I'm not really sure what the best way to fix this would be. Ideally Postgres would create relative links and we would modify the Barman rsync options to copy symlinks as they are. But I'm not that good a programmer, and it would be a while before this ended up in regular Postgres releases, I guess.
Another solution would be to add some code to Barman that transforms the symlinks.

What are the reasons for choosing rsync options that translate symlinks to real files and folders?

I'm a bit at a loss.

Regards,
Tim

Gabriele Bartolini

Oct 17, 2012, 9:39:23 AM
to pgba...@googlegroups.com, Tim Verhoeven
Hi Tim,

On 17/10/12 15:15, Tim Verhoeven wrote:
> Hi again,
>
> I'm seeing another issue with barman. It seems to not really cope well
> with tablespaces.
Barman works fine with tablespaces.
> The rsync command used transforms symlinks to real files & folders.
> But postgres itself manages the pg_tblspc folder with symlinks.
Exactly. That is standard Postgres behaviour, and we have adopted the same
strategy in Barman.

> In our case we created the actual tablespaces inside this dir and
> postgres nicely created some symlinks to these.
Hmmm ... Placing a tablespace inside the PostgreSQL data dir does not
make much sense. The real benefit of tablespaces is to separate them
from the actual data directory.
> But then when barman runs the backup these are converted to real files
> and the needed disk space on the backup server for these tablespaces
> doubles. Since we are doing data warehouses inside Postgres and don't
> have fancy deduplication hardware this really wastes a lot of space.
This behaviour is the one we deliberately chose to manage tablespaces.
Unfortunately, having a tablespace inside the data directory causes
tablespace files to be duplicated (once through their real location, the
other time through the symbolic link).
> Another solution would be to create some code inside barman that
> transforms the symlinks.
I am sorry, but, as I said, this behaviour is the one we deliberately
chose. We do not feel safe modifying it in
order to support an unconventional way of creating tablespaces.

The only thing we could do is to add the real directories to the
exclusion list for rsync if they are a subdirectory of PGDATA.
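As a rough sketch of that idea (not Barman's implementation), the following Python function scans `pg_tblspc`, resolves each symlink, and builds an rsync `--exclude` option for every tablespace whose real directory lies below PGDATA. The function name `tablespace_excludes` is hypothetical, and the leading `/` in each pattern assumes that rsync is invoked with the PGDATA directory itself as the transfer root.

```python
import os

def tablespace_excludes(pgdata):
    """Build rsync --exclude options for tablespaces located inside pgdata."""
    pgdata = os.path.realpath(pgdata)
    tblspc_dir = os.path.join(pgdata, "pg_tblspc")
    excludes = []
    for entry in sorted(os.listdir(tblspc_dir)):
        link = os.path.join(tblspc_dir, entry)
        if not os.path.islink(link):
            continue
        target = os.path.realpath(link)
        # Only real directories below PGDATA would be copied twice.
        if target != pgdata and os.path.commonpath([pgdata, target]) == pgdata:
            rel = os.path.relpath(target, pgdata)
            excludes.append("--exclude=/" + rel)
    return excludes
```

Tablespaces that live outside PGDATA produce no exclusion, so they would still be copied once through their symlink.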

Cheers,
Gabriele

--
Gabriele Bartolini - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gabriele....@2ndQuadrant.it | www.2ndQuadrant.it

Jérôme Vanandruel

Dec 24, 2012, 5:33:45 AM
to pgba...@googlegroups.com, Tim Verhoeven
Hi all,

I'm sorry to dig up this topic (I don't know whether that's allowed), but I have run into the same "problem" as Tim with tablespaces. The only difference in my case is that the tablespace directory is outside the "base" directory, on a different disk.

If Barman does not take symlinks into account and "breaks" them during a restore, it will try to restore all the data into pg_tblspc. Recreating the symlinks and moving the data after the restore would not be a big problem if I had enough space in the base directory during the restore process... A workaround would be to have enough space in pg_tblspc, or to mount my tablespace filesystem (when possible) directly in pg_tblspc during a restore, but in my opinion that is a bit "tricky" and not very clean ;-(

Is it not possible (or is it too difficult) to make Barman "symlink aware"?

Regards

Jérôme

Gabriele Bartolini

Dec 24, 2012, 6:46:54 AM
to pgba...@googlegroups.com, Jérôme Vanandruel, Tim Verhoeven
Hi Jérôme,

On 24/12/12 11:33, Jérôme Vanandruel wrote:
> I'm sorry to dig out this topic (I don't know if it's authorized or
> not), but I encounter the same "problem" as Tim with tablespaces, the
> only difference in my case is that tablespace directory is outside
> "base" directory, on a different disk.
Just a question before answering your question: are you talking about
remote recovery?

Cheers,
Gabriele

Jérôme Vanandruel

Dec 24, 2012, 7:59:19 AM
to pgba...@googlegroups.com, Jérôme Vanandruel, Tim Verhoeven
Hi Gabriele,

Sorry for the lack of explanations ;-)
Yes, I mean remote recovery in particular, but I think I would run into the same problem with a "local" recovery (that is, putting the data in the base directory without enough space available).

Regards

Jérôme

Gabriele Bartolini

Dec 28, 2012, 3:42:09 AM
to pgba...@googlegroups.com, Jérôme Vanandruel, Tim Verhoeven
Hi Jérôme,

On 24/12/12 13:59, Jérôme Vanandruel wrote:
> Sorry for the lack of explanations ;-)
No worries.
> Yes I talk especially about remote recovery but I think I would
> encounter the same problems with a "local" recovery (I mean putting data
> on the base directory, with not enough space available).
Unfortunately, the current implementation of remote recovery has some
limitations, which are also outlined in the documentation. This is all
because the current implementation is very simple and is
purely based on ssh commands.

Until we implement a remote agent that talks with Barman and performs
initial checks, we cannot do this. However, even now, you can do a lot
with remote recovery, provided that you perform preliminary
operations on the remote server. It is not perfect, I know, but it
still covers the majority of use cases (in particular, an exact copy of the
original server after a disaster).

As far as local recovery is concerned, I agree with you that we could
add some checks before recovery to ensure that enough disk
space is available. If I understood correctly, this is definitely
something to go on the TODO list.
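A pre-recovery check of this kind could look roughly like the sketch below. It is only an illustration and not part of Barman; the function name `has_enough_space` and the 5% safety margin are assumptions. It uses `shutil.disk_usage()` from the Python standard library (available since Python 3.3).

```python
import shutil

def has_enough_space(destination, required_bytes, margin=0.05):
    """Return True if destination's filesystem can hold required_bytes plus a margin."""
    # disk_usage() reports total, used and free bytes for the filesystem
    # that contains the given path.
    free = shutil.disk_usage(destination).free
    return free >= required_bytes * (1 + margin)
```

For a recovery with tablespaces on separate disks, the same check would have to be repeated for each destination directory with the size of the data going there.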

Cheers,
Gabriele

Jérôme Vanandruel

Jun 19, 2013, 7:14:48 AM
to pgba...@googlegroups.com, Jérôme Vanandruel, Tim Verhoeven
Hi,

I did not see any mention of this point in the Barman 1.2.1 release notes (by the way, thanks for the testimonial ;-)).
Moreover, I discovered another side effect of the symlink / hard link handling: data is copied twice by Barman when tablespaces are located in pg_tblspc ;-(

I suppose that if I'm the only person using Barman with tablespaces, this topic is not at the top of your TODO list?

Regards

Jérôme

pierre....@oxylane.com

Nov 4, 2013, 5:34:20 AM
to pgba...@googlegroups.com, Jérôme Vanandruel, Tim Verhoeven
Hello,

Same problem for me. Will it be fixed in the next version?
It is really problematic for big databases.
Having tablespaces backed up twice costs a lot of space and a lot of time.