SVN corruption /w v1.60


Ilsa Loving

Apr 23, 2020, 4:55:14 PM
to scmmanager
Hi,

We've recently had an uptick of corrupted commits with one of our repos and having to rebuild the repo each time is getting a bit tedious.

I understand that the issue appears to be related to svnkit. I also understand that I can't just drop in a replacement jar from their website because it's been customized.  Are there plans to release a 1.61 revision with an updated version of it?  Or should I update the server to the latest v2.0 (rc7 as of this post)?

Rene Pfeuffer

Apr 24, 2020, 8:18:52 AM
to scmmanager
Hi Ilsa,

We did not plan to update 1.60, but if it would fix a severe bug, we would probably do so nonetheless.

Could you be so kind as to check whether the bug really is fixed in 2.0.0-rc7? If so, let us know and we would release a 1.61 (only, of course, if you would still like to switch back to it from 2.x).

Regards

René

Ilsa Loving

Apr 24, 2020, 9:25:29 AM
to scmma...@googlegroups.com
Unfortunately I have absolutely no idea why this problem is happening or how to force it to occur.  All I know is that randomly (and it’s happened twice in the past 3 weeks), our SVN repo reported a corrupted commit. 

svnadmin: E160004: Filesystem is corrupt
svnadmin: E200014: Checksum mismatch while reading representation:
Blah blah blah

I saw that there was discussion of this problem a couple years ago, and it was suggested that it was a possible bug in svnkit, but then the conversation went cold and it’s not clear if anything further was ever done about it.

If 2.0 is stable enough to use then I can try setting that up and seeing if it makes a difference.  I don’t know how long it will take for me to set that up, so in the meantime I can set logging to trace and see if anything pops up?

Ilsa

--
You received this message because you are subscribed to a topic in the Google Groups "scmmanager" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scmmanager/bhJ3uhP4gHw/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scmmanager+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/scmmanager/13bec8af-77a5-4490-af83-2793ff7d3914%40googlegroups.com.

Rene Pfeuffer

Apr 24, 2020, 11:02:59 AM
to scmmanager
SCM-Manager 2.0.0-rc7 is production ready (we and others are using it on a daily basis). The 'rc' is only because some APIs may still change, so it is not yet ready for others to develop plugins against.

As to setting logs to trace, this can have performance impacts and it can create a lot of data. It should be sufficient to set only `svnkit.fsfs` to TRACE:

```
  <logger name="svnkit.fsfs" level="TRACE" />
```

René

Ilsa Loving

Apr 27, 2020, 12:26:55 PM
to scmma...@googlegroups.com
We’ve been doing some digging and assuming I’m reading the svnkit source code correctly, it doesn’t appear to validate the checksum sent by the client.  So if the file is changed somehow mid-stream, whether by corruption or being modified while sending, then the actual commit won’t match the checksum. And this is from looking at their latest code (1.10).  I also find it very ironic that they’re hosting svnkit in a git repo… So much for eating your own dog food.

I’ve tried reaching out to svnkit but have failed on all attempts.  I cannot create a youtrack account, nor can I join their user mailing list (email bounced).  Heck, even their feedback email address gives me a bounce back, so I don’t know what else I can do.

Perhaps it’s time to abandon svnkit, or SVN entirely. :\

Ilsa


Ilsa Loving

Apr 27, 2020, 6:04:48 PM
to scmma...@googlegroups.com
I took one last stab at it and was finally able to reach the developers of svnkit. Apparently the methods of repo access do in fact verify the checksum... unless you are accessing the repo via file://.  If that’s how scm-manager is accessing the repository on the file system (which is understandable, seeing how it’s local), then this could be the loophole that is allowing corrupt revisions to be saved.  What do you think?

Ilsa


Ilsa Loving

Apr 27, 2020, 6:50:02 PM
to scmma...@googlegroups.com
Sorry for all the responses.  I just got another message back from tmate and they said that from my description it sounds like we’re running afoul of a bug that has been identified in v1.9 but has only been fixed in 1.10.1.

I don’t know if it’s feasible to release an updated svnkit plugin for v1.60 to cover this issue, but in the meantime we’re going to try setting up v2.0 of scm-manager.  I presume you’re using the latest version of SVNKit on that.

Ilsa

Christian Keydel

Apr 28, 2020, 3:22:24 AM
to scmmanager
> v2.0 of scm-manager.  I presume you’re using the latest version of SVNKit on that

You'd think, but no, unfortunately not. See here: https://groups.google.com/d/msg/scmmanager/bqg8_sSAytQ/B47p7mOLHAAJ

I am waiting on a version with upgraded svnkit, too, for other reasons.

Ilsa Loving

Apr 28, 2020, 12:23:54 PM
to scmmanager
I'm exploring the 2.0 rc7 files and I see that they have in fact updated svnkit.  If you check the plugin file work/scm/webapp/WEB-INF/plugins/scm-svn-plugin-2.0.0-rc7.smp it clearly shows svnkit v1.10.1.

(Also, since the other post you reference talked about it, jgit was also updated to 5.6.1)

We're going to try to get this set up today and see how it goes, although I'm not hopeful that the issue will crop up anytime soon.  I still haven't figured out how to replicate the problem.

Christian Keydel

Apr 28, 2020, 12:35:25 PM
to scmmanager
Wonderful news! It would be great if you could keep us posted on how your migration goes. I guess many are eager but anxious to try the same thing, for various reasons.

Sai Raghavendar

Apr 28, 2020, 12:43:21 PM
to scmmanager
Hi Ilsa,

Since you are using the latest version of SCM-Manager, could you please help with its SSL configuration?

Ilsa Loving

Apr 28, 2020, 12:53:40 PM
to scmma...@googlegroups.com
We haven’t updated yet, but what is your question?

Generally speaking, dealing with SSL w/ Java has always been and will always be a nightmare for small installations that don’t have a full automation infrastructure set up.  This is further complicated by the fact that running any server as root (which is normally required if you try to bind a Java server to standard port 443) is a very, very bad idea from a security perspective.

The cleanest way to do it is to run scm-manager as a non-privileged user on port 8080.  Then set up nginx on port 443 to handle the SSL and proxy to :8080.  Significantly more secure, AND significantly easier to maintain.  The only important thing is to make sure that your client body max size is large enough to accommodate the largest file you allow to be committed, assuming you’re using HTTP transfer.
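
As a rough sketch of that nginx setup (the hostname, certificate paths and the 512m limit are placeholders to adapt, not values from a real installation):

```nginx
server {
    listen 443 ssl;
    server_name scm.example.com;                  # placeholder hostname

    ssl_certificate     /etc/nginx/ssl/scm.crt;   # placeholder paths
    ssl_certificate_key /etc/nginx/ssl/scm.key;

    # Must be at least as large as the biggest file you allow to be committed over HTTP.
    client_max_body_size 512m;

    location / {
        # scm-manager running as a non-privileged user on port 8080
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

The nginx default for `client_max_body_size` is 1m, so larger commits are rejected with a 413 error unless it is raised.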

For repository communication, you’re actually better off using SSH connectivity.

Ilsa


Sebastian Sdorra

Apr 28, 2020, 1:26:20 PM
to scmma...@googlegroups.com
Hi Ilsa,
We are using a patched version of svnkit-dav for the svn HTTP protocol.

The svnkit-dav library accesses the repository directly by manipulating the FSFS.
I'm not sure, but I think svnkit-dav should be responsible for verifying the client checksum before the files are written to the repository.

I think we have to check how the client sends the checksum, and then we have to check whether svnkit-dav handles it correctly.

Have you done a filesystem check on your machine? Perhaps you have a hard disk failure.

Sebastian



Sebastian Sdorra

Apr 28, 2020, 1:47:44 PM
to scmma...@googlegroups.com
After a short ngrep session, I was able to find how the subversion client sends the checksum.
Here is a commit of a newly created file:

PUT /scm/repo/scmadmin/checksum/!svn/wrk/0af3218c-f6e7-46d7-a39e-6183e83b4b34/hitchhiker.persons.txt HTTP/1.1.
Host: localhost:8080.
Authorization: Basic c2NtYWRtaW46c2NtYWRtaW4=.
User-Agent: SVN/1.13.0 (x86_64-apple-darwin19.4.0) serf/1.3.9.
Content-Type: application/vnd.svn-svndiff.
Accept-Encoding: gzip.
DAV: http://subversion.tigris.org/xmlns/dav/svn/depth.
DAV: http://subversion.tigris.org/xmlns/dav/svn/mergeinfo.
DAV: http://subversion.tigris.org/xmlns/dav/svn/log-revprops.
X-SVN-Result-Fulltext-MD5: 6d9d19d7ea9428d115bcac244eb3cd1a.
Transfer-Encoding: chunked.

SVN.......Arthur Dent

The header X-SVN-Result-Fulltext-MD5 contains the MD5 checksum of the added file. Now we have to dig into the code of svnkit-dav to see if the checksum is checked before the file is added to the filesystem.

Sebastian

Sebastian Sdorra

Apr 28, 2020, 2:18:07 PM
to scmma...@googlegroups.com
After digging into the code of svnkit-dav I see no validation of the checksum at all.

The checksum is passed to the DAVRepositoryManager, which creates a DAVResource that stores it.

It could then be retrieved via the method getResultChecksum, but this method is never called.

In my opinion the DAVPutHandler should be responsible for verifying the checksum.

This should be easy to implement, but hard to test.
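
Roughly, the check itself would amount to the following. This is only a sketch in Python for brevity, not svnkit-dav code: the function and exception names are made up, and it assumes the handler has both the reconstructed file content and the value of the X-SVN-Result-Fulltext-MD5 header at hand. Note that the checksum covers the resulting full text, not the raw svndiff body of the PUT request.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when the received content does not match the client's MD5."""


def verify_result_md5(fulltext: bytes, header_value: str) -> None:
    """Compare the MD5 of the reconstructed file with the client-sent value.

    `fulltext` is the file content after the svndiff delta has been applied;
    `header_value` is the raw X-SVN-Result-Fulltext-MD5 header (hex digest).
    """
    expected = header_value.strip().lower()
    actual = hashlib.md5(fulltext).hexdigest()
    if actual != expected:
        # Reject the PUT so the client can retry instead of storing bad data.
        raise ChecksumMismatch(f"expected {expected}, got {actual}")
```

A handler would call this before writing anything to the repository and turn a `ChecksumMismatch` into a failed request.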

@Ilsa, is a particular type of file being corrupted? Text or binary?

Sebastian

Ilsa Loving

Apr 28, 2020, 2:44:24 PM
to scmma...@googlegroups.com
The issues we’ve run into have been exclusively with binary files being added to the repos.  Unfortunately it happens randomly and I’ve been unable to reproduce the issue.


Maybe you could chime in?  I thought I found the same thing as you when I looked at the code, but my java-fu is now well beyond rusty and they said that svnkit _is_ supposed to be verifying the checksums, so I don’t know.

Ilsa

Sebastian Sdorra

Apr 29, 2020, 3:52:06 PM
to scmma...@googlegroups.com
Hello again,
I've done some more tests:

Using an HTTP proxy that corrupts the body of an svn PUT request, I can confirm that svnkit-dav does not verify the checksum.
But I don't think that this is your problem, because the repository on the server side does know about the corruption.

For the client checksum verification I've created an scm-manager issue: https://github.com/scm-manager/scm-manager/issues/1113

Sebastian

Ilsa Loving

Apr 29, 2020, 4:50:13 PM
to scmma...@googlegroups.com
I’m not 100% sure that’s the issue either, but it’s the only hypothesis I have right now.  The only commonality I’ve been able to find is that the affected users are storing Office documents in the repo (don’t ask), and so placed their working copy on OneDrive to take advantage of the collaboration features.  If OneDrive/Windows is somehow bypassing whatever client-side locking mechanism there is and modifies the file after the checksum was created, but before the data was actually sent, then that would result in the situation we’re seeing.
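
To illustrate the kind of race I mean, here is a small simulation in Python. It is purely hypothetical: I have not verified that the svn client actually reads the file twice, and the function names are made up. It hashes a file, optionally lets a "sync tool" append to it, then reads it again for sending, and checks whether a server that recomputes the MD5 would still accept the data.

```python
import hashlib
import os
import tempfile
from typing import Tuple


def hash_then_send(path: str, modify_between: bool) -> Tuple[str, bytes]:
    """Mimic a client that hashes a file first and reads it again to send it."""
    with open(path, "rb") as fh:
        announced = hashlib.md5(fh.read()).hexdigest()
    if modify_between:
        # Stand-in for a sync tool rewriting the file mid-commit.
        with open(path, "ab") as fh:
            fh.write(b" (edited by sync tool)")
    with open(path, "rb") as fh:
        sent = fh.read()
    return announced, sent


def server_would_accept(modify_between: bool) -> bool:
    """Return True if the bytes sent still match the announced checksum."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(b"original document")
        announced, sent = hash_then_send(path, modify_between)
        return hashlib.md5(sent).hexdigest() == announced
    finally:
        os.remove(path)
```

With `modify_between=False` the checksums agree; with `modify_between=True` they do not, which is exactly the case a server-side check would reject.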

Right now we have a cron job that performs daily verification of the repo on the server to catch possible problems sooner.  Assuming that this hypothesis is correct, then this isn’t good enough and the server *must* check the checksum prior to accepting the commit.  It’s significantly easier for a user to re-attempt a failed commit, than it is to have to rebuild the entire repo and make all the users have to re-check out their entire working copies.
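
A daily check of this kind could look roughly like the sketch below. It is not the actual cron job; the function names are made up for illustration, and it relies only on `svnadmin verify` exiting with a non-zero status when it finds a problem.

```python
import subprocess
import sys
from typing import Callable, List, Tuple


def verify_repo(repo_path: str, run: Callable = subprocess.run) -> Tuple[bool, str]:
    """Run `svnadmin verify` on one repository and report the outcome."""
    result = run(
        ["svnadmin", "verify", "--quiet", repo_path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr.strip()


def verify_all(repo_paths: List[str], run: Callable = subprocess.run) -> int:
    """Verify every repository; return 1 if any of them is corrupt, else 0."""
    failed = 0
    for path in repo_paths:
        ok, message = verify_repo(path, run)
        if not ok:
            failed += 1
            print(f"CORRUPT {path}: {message}", file=sys.stderr)
    return 1 if failed else 0
```

Called from cron with the repository paths and `sys.exit(verify_all(paths))`, the non-zero exit status is what triggers the alert. It still only detects corruption after the fact.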

Ilsa

Sebastian Sdorra

May 2, 2020, 7:21:56 AM
to scmma...@googlegroups.com
After some debugging I was able to build a patch which adds the checksum verification to svnkit-dav.


I've created an scm-manager snapshot version of 1.60 with version 1.10.1-scm2 of svnkit, which contains the patch above:


Sebastian


Sebastian Sdorra

May 7, 2020, 1:42:55 AM
to scmma...@googlegroups.com
@Ilsa, could you test the patched version and see if it fixes your problem?

Sebastian

Ilsa Loving

May 7, 2020, 10:09:51 AM
to scmma...@googlegroups.com
I will happily apply the patch but unfortunately I can’t give a timeline as to whether it fixes the problem.  The actual occurrences have been completely random and unpredictable, and despite spending a considerable amount of time on the problem, I haven’t been able to find a pattern.

I had a Zoom meeting with Dmitriy from Tmate. We dug very deeply into the broken repository and we couldn’t find any obvious signs of error.  We’ve got a couple theories as to what it might be, but it’s all conjecture.  He has agreed to add checksum verification upstream, but mostly because we don’t have any better ideas and this will at least help reduce the problem space by eliminating whether the error occurs somewhere between the client and the server.

Ilsa
