Weird "Corrupt representation / Malformed representation header" Errors on fsfs repo.

Felix....@t-systems.com

Aug 14, 2014, 1:30:13 AM
to us...@subversion.apache.org
Hi,

one of our large repositories has become corrupt. Maybe someone can help.
On Aug 6th we had a hardware crash of our svn server (mod_dav_svn, svn 1.7.11).

This caused errors like:
[Fri Aug 08 14:00:01 2014] [error] [client ip.ip.ip.ip] (20014)Internal error: Couldn't open rep-cache database
[Fri Aug 08 14:00:01 2014] [error] [client ip.ip.ip.ip] (20014)Internal error: -Couldn't perform atomic initialization
[Fri Aug 08 14:00:01 2014] [error] [client ip.ip.ip.ip] (20014)Internal error: -database disk image is malformed, executing statement 'PRAGMA synchronous=OFF;PRAGMA recursive_triggers=ON;'

This led me to:
https://mail-archives.apache.org/mod_mbox/subversion-users/201401.mbox/%3C52D8370...@reser.org%3E

As restoring a dump of the production repo would take at least 8 hours, I found a workaround:
I took a slightly older backup, restored it to another location, replaced the corrupt rep-cache.db in the production repo with the one from the backup, and restarted Apache httpd.
It seems to have worked. None of the errors above appear anymore and the rep-cache.db file is growing. :)
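
Since rep-cache.db is an SQLite database, one way to double-check a replaced or restored copy is to ask SQLite for its own integrity report. The following is only a minimal sketch: the function name `rep_cache_is_intact` is hypothetical, and it assumes the file is not being written to while the check runs.

```python
import sqlite3
from pathlib import Path


def rep_cache_is_intact(db_path):
    """Return True if SQLite's integrity check reports 'ok' for the file."""
    # Open read-only so the check cannot modify the file being inspected.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error:
        # For example: the file does not exist or cannot be opened.
        return False
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        # Raised for errors such as "database disk image is malformed".
        return False
    finally:
        conn.close()
    # A healthy database yields exactly one row containing the string 'ok'.
    return rows == [("ok",)]
```

For the repository in this thread the call would be `rep_cache_is_intact("/srv/svn/repo1/db/rep-cache.db")`.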

----------------

Sadly there is a new error, and I don't know whether the previous issue is linked to it, which is why I explained it. :)

Now the httpd log shows errors at checkout like:
[Wed Aug 13 17:32:27 2014] [error] [client ip.ip.ip.ip] Unable to deliver content. [500, #0]
[Wed Aug 13 17:32:27 2014] [error] [client ip.ip.ip.ip] could not prepare to read the file [500, #160004]
[Wed Aug 13 17:32:27 2014] [error] [client ip.ip.ip.ip] Corrupt representation '76625 15736 24 1736 98aa6ec0f63f0cf7bc71f84cb15decf5 106c014975275e25ea0a7efab549472a8e32917e 77054-1ogo/_4x' [500, #160004]
[Wed Aug 13 17:32:27 2014] [error] [client ip.ip.ip.ip] Malformed representation header at /srv/svn/repo1/db/revs/76/76625:15751 [500, #160004]

Verifying the most recent revisions one by one gives:
> for i in $(seq 77000 $(svnlook youngest /srv/svn/repo1)); do svnadmin verify /srv/svn/repo1 -r $i; done
[...]
* Verified revision 77014.
svnadmin: E160004: Corrupt representation '75477 1973 1105 7151 a6e0d04828d8d5683c69faf7289c62e0 96f59510929937634de7c7fea27a005db91a31c9 77014-1ofa/_g'
svnadmin: E160004: Malformed representation header at /srv/svn/repo1/db/revs/75/75477:1982
* Verified revision 77016.
...
* Verified revision 77047.
svnadmin: E160004: Corrupt representation '76625 48 29 7011 c2fbbb12f89423213f919c53b926fa82 fc3aba63eaa5672fe664fd47ae5f6db780b9552c 77047-1ogg/_a'
svnadmin: E160004: Malformed representation header at /srv/svn/repo1/db/revs/76/76625:52
* Verified revision 77049.
[...]
* Verified revision 77054.
svnadmin: E160004: Corrupt representation '76625 48 29 7011 c2fbbb12f89423213f919c53b926fa82 fc3aba63eaa5672fe664fd47ae5f6db780b9552c 77054-1ogo/_4w'
svnadmin: E160004: Malformed representation header at /srv/svn/repo1/db/revs/76/76625:52
* Verified revision 77056.
[...]

So revisions 77015, 77048 and 77055 are corrupt, while the others are okay. Committing in general does not seem to be the problem, so I think this is not linked to the first issue. All corrupt commits come from the same user account (authzn_svn_module); I am going to check his environment.
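
The failing revisions above were read off the gaps between the "Verified revision" lines. A small script can do the same for a complete log. This is a sketch under assumptions: the names `summarize_verify_log`, `VERIFIED` and `BAD_HEADER` are made up for illustration, and the regular expressions are based only on the message formats quoted in this post.

```python
import re

# Patterns taken from the svnadmin output quoted above.
VERIFIED = re.compile(r"^\* Verified revision (\d+)\.$")
BAD_HEADER = re.compile(r"Malformed representation header at (\S+):(\d+)")


def summarize_verify_log(lines, first_rev, last_rev):
    """Return (failed_revisions, bad_header_locations) for a verify log.

    A revision in [first_rev, last_rev] counts as failed when the log has
    no "Verified revision N." line for it.
    """
    verified = set()
    bad_headers = set()
    for raw in lines:
        line = raw.strip()
        match = VERIFIED.match(line)
        if match:
            verified.add(int(match.group(1)))
            continue
        match = BAD_HEADER.search(line)
        if match:
            # (rev file path, offset) as printed in the error message.
            bad_headers.add((match.group(1), int(match.group(2))))
    failed = [r for r in range(first_rev, last_rev + 1) if r not in verified]
    return failed, sorted(bad_headers)
```

Fed with the full output of the loop above, `failed` would list the revisions that did not verify, and `bad_headers` the rev files named in the "Malformed representation header" messages.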

Researching I found:
https://mail-archives.apache.org/mod_mbox/subversion-dev/201010.mbox/%3C1286360504.2313.132.camel@edith%3E

Sadly, trying the script fails with:
./fixer/fix-rev.py /srv/svn/repo1 77015
Traceback (most recent call last):
  File "./fixer/fix-rev.py", line 237, in <module>
    fix_rev(repo_dir, rev)
  File "./fixer/fix-rev.py", line 222, in fix_rev
    while fix_one_error(repo_dir, rev):
  File "./fixer/fix-rev.py", line 182, in fix_one_error
    if handle_one_error(repo_dir, rev, svnadmin_err):
  File "./fixer/fix-rev.py", line 161, in handle_one_error
    fix_delta_ref(repo_dir, rev, bad_rev, bad_offset, bad_size)
  File "./fixer/fix-rev.py", line 111, in fix_delta_ref
    good_offset = find_good_rep_header(repo_dir, bad_rev, bad_size)
  File "/root/scripts/fixer/find_good_id.py", line 72, in find_good_rep_header
    _, texts = rev_file_indexes(repo_dir, rev)
  File "/root/scripts/fixer/find_good_id.py", line 41, in rev_file_indexes
    for line in open(rev_file_path(repo_dir, rev)):
IOError: [Errno 2] No such file or directory: '/srv/svn/repo1/db/revs/75477
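
The path in the IOError has no shard directory, while svnadmin reports the same file as db/revs/75/75477, so the script apparently expects an unsharded layout. A shard-aware path lookup could look roughly like the sketch below. It is not the script's actual code: the helper `read_shard_size` is hypothetical, it assumes db/format contains a line such as "layout sharded 1000", and it does not handle packed shards.

```python
import os


def read_shard_size(repo_dir):
    """Return the shard size from db/format, or None for a linear layout."""
    with open(os.path.join(repo_dir, "db", "format")) as fh:
        for line in fh:
            parts = line.split()
            # Assumed line format: "layout sharded <max-files-per-dir>"
            if parts[:2] == ["layout", "sharded"]:
                return int(parts[2])
    return None


def rev_file_path(repo_dir, rev, shard_size=None):
    """Build the path of a revision file, adding the shard directory if needed."""
    revs_dir = os.path.join(repo_dir, "db", "revs")
    if shard_size is None:
        shard_size = read_shard_size(repo_dir)
    if shard_size:
        # Sharded layout: revision N lives in directory N // shard_size.
        return os.path.join(revs_dir, str(rev // shard_size), str(rev))
    # Linear layout: all revision files sit directly in db/revs.
    return os.path.join(revs_dir, str(rev))
```

With a shard size of 1000, revision 75477 maps to db/revs/75/75477, which matches the path in the svnadmin error.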


Creating a symlink leads to the following:
]# ln -s /srv/svn/repo1/db/revs/75/75477 /srv/svn/repo1/db/revs/75477
]# ./fixer/fix-rev.py /srv/svn/repo1 77015
Traceback (most recent call last):
  File "./fixer/fix-rev.py", line 237, in <module>
    fix_rev(repo_dir, rev)
  File "./fixer/fix-rev.py", line 222, in fix_rev
    while fix_one_error(repo_dir, rev):
  File "./fixer/fix-rev.py", line 182, in fix_one_error
    if handle_one_error(repo_dir, rev, svnadmin_err):
  File "./fixer/fix-rev.py", line 161, in handle_one_error
    fix_delta_ref(repo_dir, rev, bad_rev, bad_offset, bad_size)
  File "./fixer/fix-rev.py", line 111, in fix_delta_ref
    good_offset = find_good_rep_header(repo_dir, bad_rev, bad_size)
  File "/root/scripts/fixer/find_good_id.py", line 72, in find_good_rep_header
    _, texts = rev_file_indexes(repo_dir, rev)
  File "/root/scripts/fixer/find_good_id.py", line 44, in rev_file_indexes
    id_noderev, id_rev, _ = parse_id(id)
  File "/root/scripts/fixer/find_good_id.py", line 27, in parse_id
    _, rev = noderev.split('.r')
ValueError: too many values to unpack

So the script seems to be too old :|
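
One plausible cause of the ValueError (an assumption, not verified against this repository) is a node-revision id in which ".r" occurs more than once, for example when a node or copy id itself begins with "r". Splitting only on the last ".r" avoids that. The function below is a hypothetical stand-in, not the script's own `parse_id`, and assumes ids of the form `<node_id>.<copy_id>.r<rev>/<offset>`.

```python
def parse_noderev_id(noderev_id):
    """Split '<node_id>.<copy_id>.r<rev>/<offset>' into its four parts."""
    # Split on the LAST '.r' only, so that ids such as '0.r.r5/10' do not
    # produce more than two pieces the way a plain split('.r') would.
    prefix, location = noderev_id.rsplit(".r", 1)
    node_id, copy_id = prefix.split(".", 1)
    rev, offset = location.split("/", 1)
    return node_id, copy_id, int(rev), int(offset)
```

For the id quoted further down in this post, `parse_noderev_id("b-4028.q-73874.r73874/36007")` returns `("b-4028", "q-73874", 73874, 36007)`.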

I also tried fsfsverify (from http://www.szakmeister.net/fsfsverify/):
python fsfsverify.py /srv/svn/repo1/db/revs/75/75477
but the output contains a lot of "Can't check <cryptic string>" messages, for example for "b-4028.q-73874.r73874/36007", and it does not seem to fix anything either (svnadmin verify reports the same errors as above).

Now I am at the limit of my competence. Hopefully one of you can help. Do you have an idea how to fix this?
As far as I know, the committers use git-svn a lot, in case that is useful information.

Thank you very much.
Felix


---
T-Systems Multimedia Solutions GmbH
CU BT
Felix Herzog
Centralized Applicationmanagement
Address: Riesaer Straße 5, 01129 Dresden, Germany
Postal address: Postfach 10 02 24, 01072 Dresden, Germany
+49 351 2820-2593 (Phone)
+49 351 2820-5111 (Fax)
+49 151 1483-1647 (Mobile)
E-Mail: felix....@t-systems.com
Internet: http://www.t-systems-mms.com
You can find the compulsory statement on: www.t-systems-mms.com/en/compulsory-statement

Markus Schaber

Aug 14, 2014, 4:31:06 AM
to Felix....@t-systems.com, us...@subversion.apache.org
Hi, Felix,

you may have some success by replacing just the broken revision files with the ones from the backup.

You should replace both files: the one where the verification fails, and the one that is reported to have the malformed header.

Best regards

Markus Schaber

CODESYS® a trademark of 3S-Smart Software Solutions GmbH

Inspiring Automation Solutions

3S-Smart Software Solutions GmbH
Dipl.-Inf. Markus Schaber | Product Development Core Technology
Memminger Str. 151 | 87439 Kempten | Germany
Tel. +49-831-54031-979 | Fax +49-831-54031-50

E-Mail: m.sc...@codesys.com | Web: http://www.codesys.com | CODESYS store: http://store.codesys.com
CODESYS forum: http://forum.codesys.com

Managing Directors: Dipl.Inf. Dieter Hess, Dipl.Inf. Manfred Werner | Trade register: Kempten HRB 6186 | Tax ID No.: DE 167014915


> -----Original Message-----
> From: Felix....@t-systems.com [mailto:Felix....@t-systems.com]
> Sent: Thursday, 14 August 2014 07:30
> To: us...@subversion.apache.org
> Subject: Weird "Corrupt representation / Malformed representation header"
> Errors on fsfs repo.

Felix....@t-systems.com

Aug 14, 2014, 11:55:35 AM
to us...@subversion.apache.org
Hi,

thank you for your reply. I really appreciate that.
Sadly I found out that I cannot restore them, as the backup via svnadmin dump stops at the corrupted revision...

It seems I am going to do it as follows:
1 create incremental dump files, leaving out the corrupted revisions
2 load the dump files (in the correct order) into a new repository location such as /srv/svn/repo1_fixed
3 move /srv/svn/repo1 to /srv/svn/repo1_corrupt
4 run svnadmin hotcopy from /srv/svn/repo1_fixed to /srv/svn/repo1 (then do chown etc. and reuse the original authz...)
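
Step 1 can be sketched as a small helper that turns the list of corrupt revisions into the ranges to dump. This is only an illustration under assumptions: the function names are hypothetical, the commands are built but not executed, and it relies only on the `-r LOWER:UPPER` and `--incremental` options of `svnadmin dump`. Note that `svnadmin load` numbers revisions consecutively, so revisions after a skipped one will get new numbers in the rebuilt repository, and later revisions that depend on changes made in a skipped revision may fail to load.

```python
def dump_ranges(youngest, skip):
    """Return (low, high) revision ranges covering 0..youngest except `skip`."""
    skip_set = set(skip)
    ranges = []
    start = None
    for rev in range(0, youngest + 1):
        if rev in skip_set:
            # Close the current range just before the corrupt revision.
            if start is not None:
                ranges.append((start, rev - 1))
                start = None
        elif start is None:
            start = rev
    if start is not None:
        ranges.append((start, youngest))
    return ranges


def dump_commands(repo, youngest, skip):
    """Build one `svnadmin dump` argument list per range (not executed here)."""
    commands = []
    for index, (low, high) in enumerate(dump_ranges(youngest, skip)):
        cmd = ["svnadmin", "dump", repo, "-r", f"{low}:{high}"]
        if index > 0:
            # Later ranges must be deltas against what was already loaded.
            cmd.append("--incremental")
        commands.append(cmd)
    return commands
```

For example, `dump_ranges(77056, [77015, 77048, 77055])` gives `[(0, 77014), (77016, 77047), (77049, 77054), (77056, 77056)]`.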

I've read that using "mv" in step 4 instead of hotcopy might cause trouble... is that right? Otherwise moving would be a lot faster. Today I measured the svnadmin load of these 77014 revisions and it took 18 hours. :/

Regards,
Felix


-----Original Message-----
From: Markus Schaber [mailto:m.sc...@codesys.com]
Sent: Thursday, 14 August 2014 10:31
To: Herzog, Felix; us...@subversion.apache.org
Subject: RE: Weird "Corrupt representation / Malformed representation header" Errors on fsfs repo.