SVN repository corruption

Chris Jones

06/07/2016, 18:36:33
to scmmanager
We have had a few occurrences of this problem: a Subversion repository becomes corrupted, presumably during a commit. Subsequent read operations return an ArrayIndexOutOfBoundsException. The only recovery we've found is to dump and restore the repository, discarding the failed revision and any that occurred after it.

Our system config:
  • Apache reverse proxy
  • SCM Manager 1.46
    • scm-auth-ldap-plugin
    • scm-jira-plugin
    • scm-mail-plugin
    • scm-scheduler-plugin
    • scm-gravatar-plugin
    • scm-message-regex-plugin
    • scm-notify-plugin
    • A few custom plugins that contain UI tweaks, and one that replaces svn usernames based on LDAP info.
  • Ubuntu 14.04 LTS
  • 300+ svn repos, 100+ git, 40 hg, totaling about 700GB
  • About 100 commits per day

When the corruption happens, our replication system, which uses svnsync, sees this error:


svnsync: E175002: REPORT request on '/path/to/repo' failed: 500 java.lang.ArrayIndexOutOfBoundsException
    at org.tmatesoft.svn.core.io.diff.SVNDiffWindow.apply(SVNDiffWindow.java:360)
    at org.tmatesoft.svn.core.internal.delta.SVNDeltaCombiner.addWindow(SVNDeltaCombiner.java:225)
    at org.tmatesoft.svn.core.internal.io.fs.FSInputStream.getContents(FSInputStream.java:171)
    at org.tmatesoft.svn.core.internal.io.fs.FSInputStream.readContents(FSInputStream.java:114)
    at org.tmatesoft.svn.core.internal.io.fs.FSInputStream.read(FSInputStream.java:98)
    at org.tmatesoft.svn.core.internal.wc.SVNFileUtil.readIntoBuffer(SVNFileUtil.java:309)
    at org.tmatesoft.svn.core.io.diff.SVNDeltaGenerator.readToBuffer(SVNDeltaGenerator.java:279)
    at org.tmatesoft.svn.core.io.diff.SVNDeltaGenerator.sendDelta(SVNDeltaGenerator.java:154)
    at org.tmatesoft.svn.core.internal.io.fs.FSReplayPathHandler.handleCommitPath(FSReplayPathHandler.java:228)
    at org.tmatesoft.svn.core.internal.wc.SVNCommitUtil.driveCommitEditor(SVNCommitUtil.java:134)
    at org.tmatesoft.svn.core.internal.io.fs.FSRepositoryUtil.rep


...and normal client operations see the same symptom. If I run "svnadmin verify" on the repository, I see something like this:


...
* Verified revision 262.
svnadmin: E160004: Filesystem is corrupt
svnadmin: E200014: Checksum mismatch while reading representation:
   expected:  db206cf419e73630c8098878d0908e65
     actual:  4b9c123413a97eacd3c487a14aed50ec
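
To run this check across many repositories, the verify output can be parsed mechanically. The following is a minimal Python sketch, not part of SCM-Manager or Subversion. The names parse_verify_output and VerifyResult are hypothetical, and the code assumes the output has the shape shown above: one "* Verified revision N." line per good revision, followed by "expected:" and "actual:" checksum lines.

```python
import re
from typing import NamedTuple, Optional


class VerifyResult(NamedTuple):
    last_verified: Optional[int]  # highest revision reported as verified
    expected: Optional[str]       # checksum recorded in the repository
    actual: Optional[str]         # checksum computed from the stored data


def parse_verify_output(text: str) -> VerifyResult:
    """Extract the last verified revision and the mismatching checksums.

    Assumes the text looks like the "svnadmin verify" output quoted in
    this thread; other Subversion versions may word it differently.
    """
    revs = [int(n) for n in
            re.findall(r"^\* Verified revision (\d+)\.", text, re.M)]
    expected = re.search(r"expected:\s*([0-9a-f]+)", text)
    actual = re.search(r"actual:\s*([0-9a-f]+)", text)
    return VerifyResult(
        max(revs) if revs else None,
        expected.group(1) if expected else None,
        actual.group(1) if actual else None,
    )
```

If last_verified is 262 and a checksum pair is present, the first bad revision is presumably 263, which matches the revision file examined later in this post.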




"svnadmin recover" won't fix this. All I can do is dump and restore the repository, discarding the bad revision. The users then commit the same files again, and everything is fine.
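
The dump-and-restore recovery described above can be sketched in Python as follows. This is an illustration under assumptions, not a tested procedure: the function names and paths are hypothetical, and it relies only on the standard "svnadmin dump -r", "svnadmin create", and "svnadmin load" subcommands.

```python
import subprocess
from typing import List


def dump_command(repo: str, last_good_rev: int) -> List[str]:
    """Build the svnadmin command that dumps revisions 0..last_good_rev."""
    return ["svnadmin", "dump", repo, "-r", f"0:{last_good_rev}"]


def rebuild(repo: str, new_repo: str, dump_path: str,
            last_good_rev: int) -> None:
    """Dump the good revisions and load them into a fresh repository.

    The bad revision and everything after it are discarded, so those
    commits have to be made again afterwards.
    """
    # Write the dump stream of the good revisions to a file.
    with open(dump_path, "wb") as out:
        subprocess.run(dump_command(repo, last_good_rev),
                       stdout=out, check=True)
    # Create an empty repository and replay the dump into it.
    subprocess.run(["svnadmin", "create", new_repo], check=True)
    with open(dump_path, "rb") as src:
        subprocess.run(["svnadmin", "load", new_repo],
                       stdin=src, check=True)
```

For the example above, rebuild("/path/to/repo", "/path/to/repo.new", "/tmp/repo.dump", 262) would keep revisions 0 through 262. Hook scripts and the conf directory are not part of the dump stream and would need to be copied separately.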


I've saved some of these bad repositories. I can use vi or hexedit to open (in this case) db/revs/0/263 and search for the expected checksum. That tells me which file has the bad checksum. In all cases so far, the file has been a binary file of some type.
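
The manual search with vi or hexedit can be automated roughly like this. It is a sketch under an assumption about the revision file layout: that the checksum appears in a node-revision header block (on a "text:" line), that the same block contains a "cpath: <path>" line, and that blocks are separated by a blank line. The function name find_path_for_checksum is hypothetical.

```python
from typing import Optional


def find_path_for_checksum(rev_data: bytes, md5_hex: str) -> Optional[str]:
    """Return the path recorded next to a checksum in a revision file.

    rev_data is the raw content of a file such as db/revs/0/263.
    Returns None if the checksum or a following "cpath:" line is missing.
    """
    pos = rev_data.find(md5_hex.encode("ascii"))
    if pos == -1:
        return None
    # Assume the header block ends at the next blank line.
    end = rev_data.find(b"\n\n", pos)
    if end == -1:
        end = len(rev_data)
    marker = b"\ncpath: "
    start = rev_data.find(marker, pos, end)
    if start == -1:
        return None
    start += len(marker)
    line_end = rev_data.find(b"\n", start, end)
    if line_end == -1:
        line_end = end
    return rev_data[start:line_end].decode("utf-8", errors="replace")
```

Usage would be along the lines of find_path_for_checksum(open("db/revs/0/263", "rb").read(), "db206cf419e73630c8098878d0908e65"). If the layout assumption does not hold for a given repository format, the function returns None rather than guessing.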


I haven't been able to reproduce this problem myself, so I don't know if the users are doing anything peculiar; regardless, no combination of network inputs should result in a corrupted repository. Does anybody have suggestions on how I can troubleshoot this better?


Chris

Sebastian Sdorra

08/07/2016, 07:13:17
to scmma...@googlegroups.com
Hi Chris,
This sounds like a bug in svnkit (http://svnkit.com/), the svn library used by SCM-Manager. I've tried to find a similar issue in their ticket system and found some locking problems, which could probably cause corruption in combination with high traffic. I think the best way forward is to update the library to the latest version and watch for the issue. If the error also occurs with the latest version, we will open a ticket with svnkit.

Could you give me some more information about your environment and the problematic files? How often did the problem occur? Is it always the same repository that breaks? Are your repositories located on a network filesystem? Which clients were used for the broken commits? What kind of files are broken (size, type)? Were there many requests at that time? Did you find any parallels between the broken commits?

I will try to upgrade the svnkit library, but this could take some time, because the one which is used in scm-manager is heavily patched (https://bitbucket.org/sdorra/svnkit-mq/src).

Sebastian

--
You received this message because you are subscribed to the Google Groups "scmmanager" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scmmanager+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Chris Jones

08/07/2016, 13:45:41
to scmmanager
Thank you, Sebastian. We've had this happen at least 7 times since October of 2015, to 3 different repositories. Initially it was with an old version of SCM Manager (maybe 1.43?), so we tried upgrading to 1.46 in hopes that would fix it. With the latest occurrence of the problem, it was the same repository three times in two weeks. All of our repositories are on a Linux ext4 filesystem, on local disks inside a VM running on VMware. I believe the users of this repository use TortoiseSVN to access it. All of the broken files in this repository are .jdb files, I believe from Java-based BDB. They're in the range of 1-2MB in size. A previous event involved 200kB CAD files of some sort. The server was active when the most recent error occurred: 520 HTTPS requests during the minute the failure happened, with 306 of those associated with the bad repository. (I think that's normal for SVN, since it's very chatty; but I can send you the log excerpt if you want.)

Out of curiosity, how difficult would it be for SCM Manager to use native Subversion instead of svnkit, similar to how it uses native Python for Hg?

Chris

Chris Jones

31/08/2016, 15:14:57
to scmma...@googlegroups.com

Hi Sebastian,

Any progress on this? We haven't seen any recent repository corruption, but I'm hesitant to fully trust SCM Manager with our Subversion repositories right now.

Chris

You received this message because you are subscribed to a topic in the Google Groups "scmmanager" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scmmanager/r1wlKUN8oEQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scmmanager+...@googlegroups.com.

Sebastian Sdorra

02/09/2016, 12:48:52
to scmma...@googlegroups.com
Hi Chris,
I've had some problems with the upgrade of the svnkit library. There seem to be some encoding problems after the upgrade. The problem is hard to debug, and at the moment I'm not sure whether it comes from one of my patches or from the upgrade itself. However, I started a small test suite for easier verification of svnkit upgrades (https://bitbucket.org/sdorra/svn-server-spec). I'm not sure when I will have more time to fix the problems.

Sebastian


Chris Jones

02/09/2016, 12:52:03
to scmma...@googlegroups.com

Okay, thanks for the status update.

Chris
