repository mirroring


tatyana irzun

Jun 16, 2021, 5:45:58 PM
to us...@subversion.apache.org

Hello,

Can you advise me on the best way to proceed?

We have a Subversion repository (for example, builds) that holds mostly binary data, and a read-only mirror of this repository in a different location. The mirror was created with the svnsync tool. Everything is fine until network problems happen; then svnsync tries to replicate a huge commit and literally gets stuck. As I understand it, svnsync does something like: svnadmin dump --incremental from the source; copy to the destination by some protocol; svnadmin load.

For example, we are stuck at revision r62031. I made a dump by hand and got a 43G file (svnadmin dump -r62031 --incremental /data/svn/builds >>r62031).

But if I look at the revision file on disk, I see a much smaller size. I suppose that is because of enable-rep-sharing:


$ ls -lah /data/svn/builds/db/revs/62/

....

svnuser svnuser 1.3G 62031

...

That is substantially less than what svnadmin dump gives me. Many revisions look like this: the dump of the revision is gigabytes in size, but the revision file on disk is only megabytes. So the question is: how can I improve my synchronization time until the network becomes faster? Can I manually sync (rsync, scp or other) the revs and revprops folders to get consistent data on the mirror server? Or are svnsync and svnadmin the only way to correctly mirror a Subversion repository?

Thank you.
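
The size gap described above can be measured per revision. The following is a minimal sketch, assuming an unpacked FSFS repository with the sharded layout of 1000 revisions per shard (which matches the db/revs/62/62031 path shown above); the function names are made up for illustration. A plain svnadmin dump writes full file contents, while the repository stores deltified data, so the --deltas option is probably the closer comparison.

```shell
#!/bin/sh
# Path of an unpacked FSFS revision file.
# Assumption: sharded layout, 1000 revisions per shard unless a
# different shard size is passed as the third argument.
rev_file_path() {
    repo=$1
    rev=$2
    shard_size=${3:-1000}
    printf '%s/db/revs/%s/%s\n' "$repo" "$((rev / shard_size))" "$rev"
}

# Print "<dump bytes> <on-disk bytes>" for one revision.
# --deltas makes svnadmin dump emit deltas instead of full texts.
compare_rev_sizes() {
    repo=$1
    rev=$2
    dump_bytes=$(svnadmin dump -q --deltas --incremental -r "$rev" "$repo" | wc -c | tr -d ' ')
    disk_bytes=$(wc -c < "$(rev_file_path "$repo" "$rev")" | tr -d ' ')
    printf '%s %s\n' "$dump_bytes" "$disk_bytes"
}

# Example call (not executed here):
#   compare_rev_sizes /data/svn/builds 62031
```

If the shard has been packed, the revision lives inside a pack file and the path helper does not apply.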


Thorsten

Jun 17, 2021, 3:28:40 AM
to tatyana irzun, us...@subversion.apache.org

Hello,

You can manually replicate the repo on a file basis. If you want to use svnsync after that, you have to reset the revision properties that svnsync uses. I don't recall the exact syntax right now (something like svn ps --revprop -r 0 svn:sync-last-merged-rev <rev> <mirror-url>; the svn:sync-* properties live on revision 0). Other than that, it would be helpful if you stated the Subversion versions and repository format versions for source and target.

Best regards,

Thorsten
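
A rough sketch of the file-level approach, under several assumptions: the host names, paths and URLs are placeholders, only db/ is copied so that the mirror keeps its own hooks, and svnsync is 1.7 or newer so that initialize accepts --allow-non-empty to re-create the svn:sync-* properties on revision 0. The run helper only prints the commands when DRY_RUN=1, which makes it easy to review them first.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# On the source host: take a consistent snapshot with hotcopy,
# then copy only db/ of that snapshot to the mirror.
seed_mirror() {
    src_repo=$1
    staging=$2
    mirror_dest=$3      # e.g. mirrorhost:/data/svn/builds (placeholder)
    run svnadmin hotcopy --incremental "$src_repo" "$staging"
    run rsync -a --delete "$staging/db/" "$mirror_dest/db/"
}

# On the mirror host: tell svnsync that the existing content already
# matches the source, so later synchronize runs continue from there.
reinit_sync() {
    mirror_url=$1
    source_url=$2
    run svnsync initialize --allow-non-empty "$mirror_url" "$source_url"
}
```

Running svnadmin verify on the mirror after the copy is a reasonable check before pointing svnsync at it again.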

Thorsten

Jun 17, 2021, 8:44:57 AM
to tatyana irzun, us...@subversion.apache.org

Hello,

Yes, I was thinking of rsyncing the whole repository. That could work well if rsync is able to detect that most files in the repo are identical. Is that actually the case?

Maybe I am wrong and the file for revision x on the target differs from the one on the source even though they have the same content; in that case this will not be particularly fast. Are you using http/https? Have you checked for error messages?

Best regards,

Thorsten


On 17/06/2021 at 14:32, tatyana irzun wrote:

Hello,

I tried mirroring manually: I rsynced a single revision, 62048 (its files from the revs and revprops folders, plus rep-cache.db), and got a verification error:

svnadmin verify -r62048 /data/svn/builds/
 Verifying repository metadata ...
* Error verifying revision 62048.
svnadmin: E160004: Reading one svndiff window read beyond the end of the representation


I get the same error when I try to cat a file from that revision. So I suppose I can't mirror revision by revision manually? Do I need to rsync the whole db/ folder, right?

svn version 1.12.0 on both servers

Thank you

Mark Phippard

Jun 17, 2021, 9:11:22 AM
to tatyana irzun, Subversion
On Wed, Jun 16, 2021 at 5:45 PM tatyana irzun <t_i...@wargaming.net> wrote:

Hello,

Can you advise me on the best way to proceed?

We have a Subversion repository (for example, builds) that holds mostly binary data, and a read-only mirror of this repository in a different location. The mirror was created with the svnsync tool. Everything is fine until network problems happen; then svnsync tries to replicate a huge commit and literally gets stuck. As I understand it, svnsync does something like: svnadmin dump --incremental from the source; copy to the destination by some protocol; svnadmin load.


Just to clarify: svnsync does NOT use dump files to sync revisions. It essentially replays the same set of requests that the client would have made when it did the original commit. It is specifically designed so that it can handle failures and be re-run. Of course, it can continually fail for the same reason. My guess is that, since this is such a large transaction, a timeout is happening somewhere between svnsync (the client) and your server.

I realize that does not help a lot but it does mean you might be able to tune your server or client and get it to work and sync this revision.

Mark
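
Since svnsync is meant to be re-run after a failure, a small retry wrapper is one way to ride out a flaky network. This is only a sketch: the retry function, the attempt count and the delay are illustrative choices, and it will not help if every attempt hits the same timeout. An interrupted run may also leave a stale svn:sync-lock property on revision 0 of the mirror, which would need clearing first.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
# Runs the command until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        "$@" && return 0
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example call (mirror URL is a placeholder, not executed here):
#   retry 5 60 svnsync synchronize file:///data/svn/builds
```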


Thorsten

Jun 17, 2021, 9:20:12 AM
to tatyana irzun, us...@subversion.apache.org

Hello,

To expand a bit on what Mark said and to clarify: I remember having problems because the Apache server on the sync target rejected commits that were too big. We added a few zeros to LimitRequestBody in the Apache conf and it worked again. Or maybe your sync target just kills the connection after 10 minutes or so.

Best regards, Thorsten
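
To see what the sync target is currently configured with, something like the following can list the relevant httpd directives. It is a sketch only: the function name is made up, the configuration directory differs between distributions, and Timeout is included on the assumption that the "killed after 10 minutes" case is governed by it.

```shell
#!/bin/sh
# List active (non-commented) LimitRequestBody and Timeout directives
# found under an httpd configuration directory.
show_httpd_limits() {
    conf_dir=$1
    grep -rniE '^[[:space:]]*(LimitRequestBody|Timeout)[[:space:]]' "$conf_dir"
}

# Example call (path is an assumption, not executed here):
#   show_httpd_limits /etc/httpd
```

The pattern is anchored at the start of the line, so commented-out directives and KeepAliveTimeout are not reported.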

Mark Phippard

Jun 17, 2021, 9:22:23 AM
to tatyana irzun, Subversion
On Thu, Jun 17, 2021 at 9:13 AM tatyana irzun <t_i...@wargaming.net> wrote:
>
> Hi, Mark
>
> How can i enable svnsync logging to debug process?

Please keep replies on list. It is best to keep the audience as wide
as possible.

I am not aware of any great ways to log from a client other than using
Wireshark. Assuming you are using http/https there might be a way to
turn on logging in the Serf library but I do not recall how. It might
require compiling your own version.

Does svnsync report any errors? You can also look at the error logs of
the server.

Mark

Nico Kadel-Garcia

Jun 17, 2021, 10:30:13 PM
to Mark Phippard, tatyana irzun, Subversion
On Thu, Jun 17, 2021 at 9:22 AM Mark Phippard <mark...@gmail.com> wrote:
> I am not aware of any great ways to log from a client other than using
> Wireshark. Assuming you are using http/https there might be a way to
> turn on logging in the Serf library but I do not recall how. It might
> require compiling your own version.

Also, Subversion does not perform well for extremely large binary
commits, especially when many distinct bulky, or mixed bulky and small
text changes, are in the same commit. If you're doing binary release
management as Subversion branches or tags, you may wish to rethink
this approach.

That said, the web servers for Subversion, typically httpd, are often
not configured to handle extremely large transfers well. Consider
enabling svn+ssh: it gets you away from the vagaries of intervening
web proxies and can be considerably more robust.
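
As a sketch of what reading the source over svn+ssh could look like: svnsync synchronize accepts an optional source URL after the destination URL (since 1.7, as far as I know), so the existing mirror can be pointed at an svn+ssh URL for a run. The user, host and path below are placeholders, and the helper function is only there to show how the URL is assembled.

```shell
#!/bin/sh
# Build an svn+ssh URL from user, host and the repository's
# filesystem path on the source server.
svn_ssh_url() {
    user=$1
    host=$2
    repo_path=$3
    printf 'svn+ssh://%s@%s%s\n' "$user" "$host" "$repo_path"
}

# Example call on the mirror host (all values are placeholders,
# not executed here):
#   svnsync synchronize file:///data/svn/builds \
#       "$(svn_ssh_url svnuser source.example.com /data/svn/builds)"
```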