Checksum in --xml for svn info <repo_file>

Dan Ellis

May 12, 2014, 5:49:28 PM
to Subversion Users
Hi,

This is related to my other email about diffing a working copy file against a potentially unrelated repository file. My real intent is to find out whether two files differ in content; I do not care about the actual differences (which is why I want --summarize).

When I run svn info on a file in a local working copy, I can see the file's checksum (SHA-1 specifically), but this doesn't work against a repository URL. It would be really nice, and would save some bandwidth, if I could simply compare checksums instead of performing a diff. Does the SVN server API expose this property? If so, is there any interest from the developers in adding this XML tag?

c:\Project_files\sandbox>svn info AAAA.txt --xml
<?xml version="1.0" encoding="UTF-8"?>
<info><entry>
<non-XML SNIP>
<wc-info><checksum>f0d19ccba8cd50a4f88c31deeb635c564fadcf5f</checksum></wc-info>
</entry></info>
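
For scripting, the checksum in output like the above can be pulled out with a few lines of Python. This is only a minimal sketch, not part of svn: the function name `wc_checksum` is made up for illustration, the sample leaves out the elements snipped above, and the code assumes the `<info>/<entry>/<wc-info>/<checksum>` nesting shown in that output.

```python
import xml.etree.ElementTree as ET


def wc_checksum(info_xml):
    """Return the <checksum> text from `svn info --xml` output, or None.

    Hypothetical helper: assumes the info/entry/wc-info/checksum nesting
    seen in the working-copy output quoted in this thread.
    """
    root = ET.fromstring(info_xml)  # root element is <info>
    node = root.find("./entry/wc-info/checksum")
    if node is None or not node.text:
        return None
    return node.text.strip()


# Trimmed sample based on the output above (snipped elements omitted).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<info><entry>
<wc-info><checksum>f0d19ccba8cd50a4f88c31deeb635c564fadcf5f</checksum></wc-info>
</entry></info>
"""

if __name__ == "__main__":
    print(wc_checksum(SAMPLE))
```

Returning `None` when the element is missing matters here, because output for a repository URL carries no `<wc-info>` checksum.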

Thanks again,
Dan

Dan Ellis

May 13, 2014, 2:12:42 PM
to Ben Reser, Subversion Users
On Tue, May 13, 2014 at 10:16 AM, Ben Reser <b...@reser.org> wrote:
> On 5/12/14, 2:49 PM, Dan Ellis wrote:
>> When performing an svn info on a local working copy, I can see the file's
>> checksum (sha-1 specifically), but this doesn't work on repositories. It would
>> be really nice and save some bandwidth if I could simply compare checksums
>> versus performing a diff. Is there support in the SVN server API that supports
>> this property? If so, is there any interest from the developers to add this
>> xml tag?
>

> Given that 1.8 solves your --summarize issue, unless there's some other use
> case for this I'm not sure it's worth exposing.

My only use case would be one of performance. How does the current
implementation of a diff from a local WC file to a server repository
file work? Since it doesn't sound like the server can expose the
checksum right now, I assume the client must bring over a copy of the
file's content to diff against (even for a summary-only diff).
Simply using a checksum comparison would be much more efficient for
comparing large numbers of files (or files with large content).
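
The checksum comparison described here could look roughly like the sketch below. It is an illustration under assumptions, not an existing svn feature: the helper names `sha1_of_file` and `same_content` are invented, and the repository-side checksum would have to come from somewhere, since the server does not currently hand it to the client. Note also that, as far as I know, the checksum svn info reports describes the pristine (BASE) text, so hashing a working file directly only matches when the file has no local modifications and no keyword or end-of-line translation.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, reading it in chunks.

    Hypothetical helper; streaming keeps memory use flat for large files.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(checksum_a, checksum_b):
    """Compare two hex checksums, ignoring case and surrounding whitespace."""
    return checksum_a.strip().lower() == checksum_b.strip().lower()
```

Comparing two 40-character strings is constant work per file, whereas a content diff scales with file size, which is the performance argument being made.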


> You're of course welcome to send in a patch. You basically just need to expose
> it as a live property (to use the terms from DAV), which the client could
> request specifically or just get when requesting all such properties as we do
> when doing info over DAV. With svnserve the stat command would need to start
> returning it. To retrieve the data you'd use svn_fs_file_checksum().

I'll take a look, though I barely know how to spell DAV.

Thanks for all the help!
Dan

Ben Reser

May 13, 2014, 1:16:14 PM
to Dan Ellis, Subversion Users
On 5/12/14, 2:49 PM, Dan Ellis wrote:
> When performing an svn info on a local working copy, I can see the file's
> checksum (sha-1 specifically), but this doesn't work on repositories. It would
> be really nice and save some bandwidth if I could simply compare checksums
> versus performing a diff. Is there support in the SVN server API that supports
> this property? If so, is there any interest from the developers to add this
> xml tag?

The server doesn't actually expose that information to the client, though we
could do so.

However, the XML output probably wouldn't look the same because the existing
output is extra output from the working copy data. So the output from the
server wouldn't belong in wc-info.

Given that 1.8 solves your --summarize issue, unless there's some other use
case for this I'm not sure it's worth exposing.

Bert Huijben

May 14, 2014, 8:49:06 AM
to Dan Ellis, Ben Reser, Subversion Users
Luckily, Subversion can use much smarter tricks than the system you describe here.

The svn client reports to the repository which URL at which revision you have locally (and, for a directory, any exceptions to that below it), and the repository then reports back whether there would be any changes if you compared the actual trees.

For a normal diff, the binary diff and property changes would be sent to the client, so that it could reconstruct the whole file using the BASE information in the pristine store. But because you pass --summarize, the server just sends the information: 'there are changes'.
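
On the client side, that "there are changes" answer can be consumed without looking at any file content. The sketch below parses the output of `svn diff --summarize --xml`. It assumes the `<diff>/<paths>/<path>` layout with `item` and `props` attributes; the helper names and the sample path are made up for illustration.

```python
import xml.etree.ElementTree as ET


def changed_paths(summary_xml):
    """Return (item, props, path) tuples from `svn diff --summarize --xml`.

    Hypothetical helper: assumes a <diff><paths><path .../></paths></diff>
    layout where each <path> carries `item` and `props` attributes.
    """
    root = ET.fromstring(summary_xml)  # root element is <diff>
    return [
        (p.get("item"), p.get("props"), (p.text or "").strip())
        for p in root.findall("./paths/path")
    ]


def has_changes(summary_xml):
    """True if the summary lists at least one changed path."""
    return bool(changed_paths(summary_xml))


# Illustrative sample only; the path is invented.
SUMMARY_SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<diff>
<paths>
<path props="none" kind="file" item="modified">AAAA.txt</path>
</paths>
</diff>
"""

if __name__ == "__main__":
    print(has_changes(SUMMARY_SAMPLE))
```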

Bert
