> If I understand correctly, you manually retrieve each version where
> the given path/project has changed in any way to afterwards dump those
> revisions. Why is this better/faster than using svndumpfilter with
> specifying an include path, but without the need to post process the
> dump files?
I personally don't see the advantage to waiting around for svnadmin dump
to process every unrelated revision. For one project, I am only concerned
with about 200 revisions, spread out over 210k unrelated revisions.
# This example took around 8 hours:
svnadmin dump /path/to/master | svndumpfilter --drop-empty-revs \
--re-number-revs include $PROJECT > $PROJECT.dump
# However, when I run this on the same project:
for rev in `svn log -r0:HEAD file:///path/to/master/$PROJECT | egrep \
    "^r[0-9]+ \|" | cut -d " " -f1`; do
  svnadmin dump --incremental -r ${rev:1} /path/to/master | svndumpfilter \
    include $PROJECT >> $PROJECT.dump
done
… I can have a usable dump file in under 30 seconds. I realize this will take
longer for larger projects, but I think it makes my point. In the first example,
‘svnadmin dump’ still creates a full dump stream for every revision before
svndumpfilter sees that revision and decides whether to keep it.
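
For anyone reusing the loop, here is a minimal sketch of the revision-list step as a standalone helper. The function name `revs_from_log` is made up for illustration; it only parses `svn log` header lines from stdin and prints the bare revision numbers, so the `${rev:1}` substring trick is not needed.

```shell
# Print the revision numbers (without the leading "r") found in
# `svn log` output read from stdin.
# Header lines look like: "r123 | user | date | N lines"
revs_from_log() {
    sed -n 's/^r\([0-9][0-9]*\) | .*/\1/p'
}

# Hypothetical usage (needs a real repository; path is a placeholder):
#   svn log -q -r0:HEAD "file:///path/to/master/$PROJECT" | revs_from_log
```

Requiring the literal " | " after the number keeps log-message lines that merely start with something like "r99" out of the list.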
> Are you sure your approach doesn't need other paths
> from the repo, e.g. other source paths from copy operations for
> projects or stuff like that?
I absolutely agree with checking for this. You can’t successfully pull out
a single path using svnadmin dump / svndumpfilter if there are copies from a
location outside of whatever you are filtering for.
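
One way to check for this up front is to scan the verbose log for copy sources that fall outside the project path. The following is only a sketch: the helper name `outside_copy_sources` is hypothetical, and it assumes the usual `svn log -v` layout where copied paths appear as "   A /dest (from /source:REV)".

```shell
# Read `svn log -v` output on stdin and print "source:rev" for every
# copy whose source is NOT at or under the given path prefix.
outside_copy_sources() {
    sed -n 's#^   [AR] .* (from \(/.*\):\([0-9][0-9]*\))$#\1:\2#p' |
    awk -v p="$1" '{
        src = $0
        sub(/:[0-9]+$/, "", src)            # strip the ":rev" suffix
        if (src != p && index(src, p "/") != 1) print
    }' | sort -u
}

# Hypothetical usage (needs a real repository; path is a placeholder):
#   svn log -v "file:///path/to/master/$PROJECT" | outside_copy_sources "/$PROJECT"
```

If this prints nothing, there are no copies from outside the prefix in the log it was given.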
I did notice that using svnrdump pointing to url/project seems to get
around the outside-copy-sources issue, but I think that’s another
discussion altogether.
> > svnadmin dump $repo --quiet -r $rev --incremental >> $project.$rev.bak
> Adding to revision files with >> should be impossible in your
> approach.
Are you saying that appending to an existing dump file in general is a
problem or just with all of his node-path processing? I have had no
trouble appending to existing dump files.
Thanks,
Bryon Winger
Hi,
The ‘svnrdump’ tool that was added in Subversion 1.7 might do exactly what you want to do.
This tool allows creating a dumpfile from a URL (e.g. file:///path/to/repos) and should skip unrelated paths for you during the repository processing.
You probably still want the svndumpfilter processing to drop empty revisions before loading it in a new repository.
Bert
> You probably still want the svndumpfilter processing to drop empty revisions before loading it in a new repository.
I believe that the current version of svndumpfilter only operates on
version 2 dump streams - which svnadmin dump produces. svnrdump
produces a version 3 dump stream and is not compatible with svndumpfilter.
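
A quick way to see which stream version a given dump file carries is to read its first line, which has the form "SVN-fs-dump-format-version: N". A small sketch (the helper name `dump_format_version` is made up here):

```shell
# Print the dump format version declared on the first line of a dump file.
dump_format_version() {
    head -n 1 "$1" |
        sed -n 's/^SVN-fs-dump-format-version: \([0-9][0-9]*\).*$/\1/p'
}

# Hypothetical usage:
#   dump_format_version "$PROJECT.dump"
```

This only reports what the header says; it does not check whether a particular tool will accept the file.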
That being said, I am able to get around dumping empty revisions (from a
previous dump/load) with svnrdump by running something along these lines:
for rev in `svn log -r0:HEAD ${url}/${project} | \
    egrep "^r[0-9]+ \|" | cut -d " " -f1`; do
  svnrdump dump --incremental -r ${rev:1} ${url}/${project} >> ${project}.dump
done
Basically, I am only dumping (incrementally) the revisions which actually
affect the path in question. This obviously is not as fast as doing everything
server-side, but it does appear to work around having files or directories
copied from paths outside of the particular project path. The
outside-copy-paths are dumped in full as opposed to just a simple reference
as to where it was originally copied from.
I would appreciate some feedback if I’m missing something or if the above
statement is inaccurate or unreliable. In my tests, everything appears to be
the same once loaded into a fresh repository, checked out in full and diffed
against the originals.
There is a very brief mention in the svn-book of appending to an existing
dump file, so I expect that to be safe in general. It can be found in the
“Repository Backup” section by searching for ‘appending’.
Thanks,
Bryon Winger