Stack overflow at write_one()
From: Cesar Eduardo Barros <ces...@cesarb.net>
Date: Sat, 19 Nov 2011 18:27:29 -0200
Local: Sat, Nov 19 2011 3:27 pm
Subject: Stack overflow at write_one()
I have found a stack overflow at builtin/pack-objects.c:write_one(),
where it calls itself endlessly. This is caused by the object_entry e
and e->delta->delta being the same. But I have no idea how that happened.

First, the full story:

I used Google's repo tool to mirror AOSP to my machine. This mirrors
several kernel trees (six, last time I counted) without sharing objects
with one another. To save space, I decided to point their
objects/info/alternates to my mirror of the Linus kernel tree (which
should be safe, since Linus makes it always fast-forward), and run "git
gc" on them to create a smaller pack. This worked for all trees except
one, where it core dumped (abrt report at
https://bugzilla.redhat.com/show_bug.cgi?id=755132).

I compiled the latest git (v1.7.8-rc3-17-gf56ef11) to see if it still
happened, and here is what I could get from gdb. I attached to the
pack-objects process before it crashed (full command line "git
pack-objects --keep-true-parents --honor-pack-keep --non-empty --all
--reflog --unpack-unreachable --local --delta-base-offset
/home/cesarb/src/bug755132/omap.git/objects/pack/.tmp-5171-pack"),
continued, and let it crash:

(gdb) cont
Continuing.
[New Thread 0x7f3f2bad3700 (LWP 5205)]
[New Thread 0x7f3f2b2d2700 (LWP 5206)]
[New Thread 0x7f3f2aad1700 (LWP 5207)]
[New Thread 0x7f3f2a2d0700 (LWP 5208)]
[Thread 0x7f3f2b2d2700 (LWP 5206) exited]
[Thread 0x7f3f2bad3700 (LWP 5205) exited]
[Thread 0x7f3f2aad1700 (LWP 5207) exited]
[Thread 0x7f3f2a2d0700 (LWP 5208) exited]

Program received signal SIGSEGV, Segmentation fault.
0x00000000004472b9 in write_one (f=0x6a97db0, e=0x7f3f30233490,
     offset=0x7fff79b53908) at builtin/pack-objects.c:415
415     {

Unlike on Fedora's git binary, where it happened on a call instruction,
this time it happened on a push instruction:

(gdb) disassemble
Dump of assembler code for function write_one:
    0x00000000004472b0 <+0>:      push   %r15
    0x00000000004472b2 <+2>:      push   %r14
    0x00000000004472b4 <+4>:      push   %r13
    0x00000000004472b6 <+6>:      mov    %rdx,%r13
=> 0x00000000004472b9 <+9>:    push   %r12
    0x00000000004472bb <+11>:     mov    %rdi,%r12

The last few frames on the stack show the endless recursion:

(gdb) where
#0  0x00000000004472b9 in write_one (f=0x6a97db0, e=0x7f3f30233490,
     offset=0x7fff79b53908) at builtin/pack-objects.c:415
#1  0x00000000004472ed in write_one (f=0x6a97db0, e=0x7f3f30277390,
     offset=0x7fff79b53908) at builtin/pack-objects.c:423
#2  0x00000000004472ed in write_one (f=0x6a97db0, e=0x7f3f30233490,
     offset=0x7fff79b53908) at builtin/pack-objects.c:423
#3  0x00000000004472ed in write_one (f=0x6a97db0, e=0x7f3f30277390,
     offset=0x7fff79b53908) at builtin/pack-objects.c:423
#4  0x00000000004472ed in write_one (f=0x6a97db0, e=0x7f3f30233490,
     offset=0x7fff79b53908) at builtin/pack-objects.c:423

And here is the loop in the data structures:

(gdb) p e
$1 = (struct object_entry *) 0x7f3f30233490
(gdb) p e->delta
$2 = (struct object_entry *) 0x7f3f30277390
(gdb) p e->delta->delta
$3 = (struct object_entry *) 0x7f3f30233490
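
For illustration only (this is not git's code): a two-entry loop like the one above can be found without recursion by walking the delta pointers. A minimal sketch, assuming a hypothetical `struct entry` that keeps only the `delta` pointer of git's `struct object_entry`, and a hypothetical helper `delta_chain_has_cycle()`:

```c
#include <stddef.h>

/*
 * Hypothetical, stripped-down stand-in for git's struct object_entry;
 * only the delta-base pointer matters for this sketch.
 */
struct entry {
    struct entry *delta;    /* base this entry is a delta against, or NULL */
};

/*
 * Returns 1 if following ->delta from e ever revisits an entry
 * (Floyd's tortoise-and-hare walk), 0 if the chain ends at a NULL base.
 * Uses constant stack space, so it cannot overflow the way an
 * unguarded recursive walk does.
 */
static int delta_chain_has_cycle(const struct entry *e)
{
    const struct entry *slow = e, *fast = e;

    while (fast && fast->delta) {
        slow = slow->delta;           /* one step */
        fast = fast->delta->delta;    /* two steps */
        if (slow == fast)
            return 1;
    }
    return 0;
}
```

With two entries whose `delta` fields point at each other, as in the gdb session above, the helper returns 1; for a chain that ends in a NULL base it returns 0.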

Unfortunately, I do not know enough of git's internals to debug further.
In case it helps, here are the contents of a few of these structures:

(gdb) p *e
$4 = {idx = {
     sha1 = "\257>J\241)\266\023\064\a\342J\320\375ӆ\262M\245",
<incomplete sequence \356>, crc32 = 0, offset = 0}, size = 20, in_pack =
0x259b610,
   in_pack_offset = 231061238, delta = 0x7f3f30277390,
   delta_child = 0x7f3f30277390, delta_sibling = 0x7f3f30413b10,
   delta_data = 0x0, delta_size = 20, z_delta_size = 0, hash = 2099915708,
   type = OBJ_OFS_DELTA, in_pack_type = OBJ_OFS_DELTA,
   in_pack_header_size = 5 '\005', preferred_base = 0 '\000',
   no_try_delta = 0 '\000', tagged = 0 '\000', filled = 1 '\001'}
(gdb) p *(e->delta)
$5 = {idx = {
     sha1 =
"\372\307\035\372\017\350\307\f\310R\t\236\006\034\063N*T\216\253",
     crc32 = 0, offset = 0}, size = 14, in_pack = 0x259b610,
   in_pack_offset = 39990, delta = 0x7f3f30233490,
   delta_child = 0x7f3f30233490, delta_sibling = 0x0, delta_data = 0x0,
   delta_size = 14, z_delta_size = 0, hash = 2099915708, type =
OBJ_REF_DELTA,
   in_pack_type = OBJ_REF_DELTA, in_pack_header_size = 21 '\025',
   preferred_base = 0 '\000', no_try_delta = 0 '\000', tagged = 0 '\000',
   filled = 1 '\001'}
(gdb) p *(e->in_pack)
$6 = {next = 0x25a53c0, windows = 0x259bc40, pack_size = 449155894,
   index_data = 0x7f3f4f0a9000, index_size = 58351420, num_objects =
2083941,
   num_bad_objects = 0, bad_object_sha1 = 0x0, index_version = 2,
   mtime = 1321387261, pack_fd = -1, pack_local = 1, pack_keep = 0,
   do_not_close = 0, sha1 =
"\371Q4\177.ȳv\364\246\332Z\234\025?\352ݠP\210",
   pack_name = 0x259b671
"/home/cesarb/src/bug755132/omap.git/objects/pack/pack-f951347f2ec8b376f4a6da5a9c153feadda05088.pack"}

I tried using "git fsck" to see if it could find anything strange, but
it seems to get stuck (using 100% CPU) after these lines:

[...]
Checking commit fb630b9fc902e24209166b1659a8b375bf38099c
Checking tree fc32c012c750084eb1d82782cee7c80a45a78289
Checking blob fc7bbba585cee2c2b0d5282c42fb986bfb032a0a
Checking commit fdcb23634c9b6649bb02c681033d4973491b0e35
Checking tree fe773cf73ff553249be2f24ddf770f5dc43a41f1
Checking blob fe67b5c79f0ff33d92ebe7469a89c5a5d044fc0a
Checking blob fe73276e026bf263f494a917c84c6a3fcaeaaeda
Checking tree fe30eda9d92d074816f9c3a47fd3ffb9b89ca835
Checking tree fe9c75396e6d433b289d0e40c7e47921b91cad3a
Checking blob ff3ed6086ce1c6b6b4b5111c034d14a208c0d045
Checking blob ff66638ff54d5ad7067e4f246d392059eef1a7bf
Checking tree ff126d2bc67017199049ddba761979f3bda57eb9

Unfortunately, the reproducer I have (a copy of both trees with
objects/info/alternates modified) is 1.8G in size, and I do not know how
to create a smaller reproducer. If you know of a command which would get
more relevant information from them, just ask; I plan on keeping them
around for a while.

--
Cesar Eduardo Barros
ces...@cesarb.net
cesar.bar...@gmail.com
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


 
From: Junio C Hamano <gits...@pobox.com>
Date: Sat, 19 Nov 2011 13:08:08 -0800
Local: Sat, Nov 19 2011 4:08 pm
Subject: Re: Stack overflow at write_one()
Already found the real cause (jGit bug) and workaround posted, I think.

See $gmane/185573


 
From: Cesar Eduardo Barros <ces...@cesarb.net>
Date: Sat, 19 Nov 2011 19:46:08 -0200
Local: Sat, Nov 19 2011 4:46 pm
Subject: Re: Stack overflow at write_one()
On 19-11-2011 19:08, Junio C Hamano wrote:

> Already found the real cause (jGit bug) and workaround posted, I think.

I presume the cause then is what was fixed by
http://egit.eclipse.org/w/?p=jgit.git;a=commit;h=2fbf296fda205446eac1...
?

> See $gmane/185573

That did it, thanks! The patch had an offset, a fuzz, and a reject, but
it was easy to fix by hand.

$ ../git/git gc
Counting objects: 30254, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6614/6614), done.
warning: recursive delta detected for object
fac71dfa0fe8c70cc852099e061c334e2a548eab
warning: recursive delta detected for object
1b730f5b2e0bdb2a2206af8ed30170509e75a2f5
warning: recursive delta detected for object
2f25a87e67fa3a226e367b9e080f11aa90c9f953
warning: recursive delta detected for object
d5e5eefac91788da9a94efe9a15e0b928a77489e
Writing objects: 100% (30254/30254), done.
Total 30254 (delta 24008), reused 28803 (delta 23266)

And after that the repack does not break anymore:

$ ../git/git gc
Counting objects: 30254, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5876/5876), done.
Writing objects: 100% (30254/30254), done.
Total 30254 (delta 24008), reused 30254 (delta 24008)
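
The "recursive delta detected" warnings in the first run suggest that the patched write_one() notices when it re-enters an entry it is still writing and then drops the offending delta. The following is only a sketch of that idea under stated assumptions, not the posted patch: `struct entry`, `enum write_state`, and `write_one_guarded()` are hypothetical names, and the warning text is copied from the output above.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical per-entry state used to spot re-entry during recursion. */
enum write_state { NOT_WRITTEN = 0, IN_PROGRESS, WRITTEN };

/* Hypothetical stand-in for git's struct object_entry. */
struct entry {
    const char *name;         /* printable id, used only in the warning */
    struct entry *delta;      /* delta base, or NULL for a full object */
    enum write_state state;
};

/*
 * Writes e's delta base before e itself (the "writing" is simulated).
 * Returns the number of entries newly written by this call, or -1 if e
 * is already being written further up the call stack, i.e. the delta
 * chain loops back on itself.
 */
static int write_one_guarded(struct entry *e)
{
    int written = 0;

    if (e->state == WRITTEN)
        return 0;
    if (e->state == IN_PROGRESS)
        return -1;                      /* cycle: let the caller handle it */

    e->state = IN_PROGRESS;
    if (e->delta) {
        int r = write_one_guarded(e->delta);

        if (r < 0) {
            fprintf(stderr,
                    "warning: recursive delta detected for object %s\n",
                    e->name);
            e->delta = NULL;            /* break the loop: treat as non-delta */
        } else {
            written += r;
        }
    }
    e->state = WRITTEN;                 /* a real writer would emit e here */
    return written + 1;
}
```

For two entries that name each other as base, one call writes both, prints a single warning, and clears one of the two delta pointers, so a later pass sees no loop. That matches the second run above, which completes without warnings.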

--
Cesar Eduardo Barros
ces...@cesarb.net
cesar.bar...@gmail.com


 
From: Shawn Pearce <spea...@spearce.org>
Date: Sat, 19 Nov 2011 15:30:35 -0800
Local: Sat, Nov 19 2011 6:30 pm
Subject: Re: Stack overflow at write_one()
On Sat, Nov 19, 2011 at 13:46, Cesar Eduardo Barros <ces...@cesarb.net> wrote:

> On 19-11-2011 19:08, Junio C Hamano wrote:

>> Already found the real cause (jGit bug) and workaround posted, I think.

> I presume the cause then is what was fixed by
> http://egit.eclipse.org/w/?p=jgit.git;a=commit;h=2fbf296fda205446eac1...
> ?

Yes. The AOSP servers were all updated with the above JGit patch, so
the servers are no longer sending duplicate objects. But yes, for a
period of time there were duplicates in the kernel repositories,
particularly kernel/omap.

 
From: Cesar Eduardo Barros <ces...@cesarb.net>
Date: Sat, 19 Nov 2011 22:02:04 -0200
Local: Sat, Nov 19 2011 7:02 pm
Subject: Re: Stack overflow at write_one()
On 19-11-2011 21:30, Shawn Pearce wrote:

> On Sat, Nov 19, 2011 at 13:46, Cesar Eduardo Barros <ces...@cesarb.net> wrote:
>> On 19-11-2011 19:08, Junio C Hamano wrote:

>>> Already found the real cause (jGit bug) and workaround posted, I think.

>> I presume the cause then is what was fixed by
>> http://egit.eclipse.org/w/?p=jgit.git;a=commit;h=2fbf296fda205446eac1...
>> ?

> Yes. The AOSP servers were all updated with the above JGit patch, so
> the servers are no longer sending duplicate objects. But yes, for a
> period of time there were duplicates in the kernel repositories,
> particularly kernel/omap.

So, would an alternative workaround in my situation be to delete
kernel/omap.git and let repo sync recreate it? It seems repo does not
have extra metadata anywhere else, so just removing the directory should
be enough for it to clone again from scratch, hopefully getting a
corrected pack from the server.

--
Cesar Eduardo Barros
ces...@cesarb.net
cesar.bar...@gmail.com


 
From: Shawn Pearce <spea...@spearce.org>
Date: Sat, 19 Nov 2011 18:00:31 -0800
Local: Sat, Nov 19 2011 9:00 pm
Subject: Re: Stack overflow at write_one()
On Sat, Nov 19, 2011 at 16:02, Cesar Eduardo Barros <ces...@cesarb.net> wrote:

Yes. repo does not have extra state, so just removing the directory
and running `repo sync` again to clone the repository would correct
the local repository.

 