Heya,
I am just trying to adopt the patch.
When dumping the symbols of some DWARF info, I am getting a lot of these
errors:
2009-10-18 22:07:39: minidump_processor.cc:137: INFO: Looking at thread /home/az/.OpenLieroX/crashreports/4270b97b-6435-03ae-64d622b0-3c0346c7.dmp:0/43 id 0x3e0d
file contains no debugging information (no ".stab" or ".debug_info" sections)
/home/az/Programmierung/openlierox/bin/openlierox: warning: in DWARF compilation unit "/home/az/Programmierung/openlierox/src/main.cpp", at offset 0x0:
/home/az/Programmierung/openlierox/bin/openlierox: warning: skipping unpaired lines/functions:
line: /home/az/Programmierung/openlierox/src/main.cpp:164 at 0x822ea96
line: /home/az/Programmierung/openlierox/src/main.cpp:164 at 0x822ea9f
line: /home/az/Programmierung/openlierox/src/main.cpp:1496 at 0x822eac6
line: /home/az/Programmierung/openlierox/src/main.cpp:1497 at 0x822eacc
line: /home/az/Programmierung/openlierox/src/main.cpp:1498 at 0x822eae2
line: /home/az/Programmierung/openlierox/src/main.cpp:1499 at 0x822eaf8
line: /home/az/Programmierung/openlierox/src/main.cpp:1500 at 0x822eb0e
line: /home/az/Programmierung/openlierox/src/main.cpp:1501 at 0x822eb24
Any idea? I am working my way through dump_dwarf.cc right now, but I am not really familiar with the DWARF format.
That warning means that the line number data for main.cpp mentioned some addresses that, as far as the dumper could see, didn't belong to any particular function. I haven't looked into why this is happening yet.
However, it is just a warning, not an error. You should still be getting appropriate output for the rest of the data.
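To make the warning concrete, here is a minimal sketch of what "unpaired" means, assuming simplified record types. FuncRange, LineEntry and FindUnpairedLines are hypothetical names for illustration only, not the dumper's actual code:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified records; the real dumper's types differ.
struct FuncRange { uint64_t address; uint64_t size; };
struct LineEntry { uint64_t address; int line; };

// Returns the line entries whose address lies inside no function range.
// These are the entries a dumper would report as "unpaired".
std::vector<LineEntry> FindUnpairedLines(const std::vector<FuncRange>& funcs,
                                         const std::vector<LineEntry>& lines) {
  std::vector<LineEntry> unpaired;
  for (const LineEntry& entry : lines) {
    bool covered = false;
    for (const FuncRange& f : funcs) {
      // Half-open range check: [address, address + size).
      if (entry.address >= f.address && entry.address - f.address < f.size) {
        covered = true;
        break;
      }
    }
    if (!covered) unpaired.push_back(entry);
  }
  return unpaired;
}
```

A line entry at 0x822ea96 would show up in the result whenever no function range in the same compilation unit contains that address.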
The notes at the head of linux-dwarf.patch mention this:
----
TODO: verify that these come from a dropped section; if they do, then add a
heuristic to drop them silently.
/home/jimb/breakpad/breakpad/src/tools/linux/dump_syms/dump_syms
../dwarf/libxpcom_core.so
warning: in compilation unit
/home/jimb/mc/in/xpcom/reflect/xptinfo/src/xptiInterfaceInfo.cpp (offset
0x1f424d):
warning: skipping unpaired lines/functions:
line: ../../../../dist/include/xpt_struct.h:332 at 0x0
line: ../../../../dist/include/xptinfo.h:59 at 0x0
line: ../../../../dist/include/xpt_struct.h:332 at 0x3
line: ../../../../dist/include/xptinfo.h:60 at 0x12
I don't get any useful output at all in the final result. It doesn't find any function in the stack trace.
Btw., that is the code:
http://openlierox.git.sourceforge.net/git/gitweb.cgi?p=openlierox/openlierox;a=blob;f=src/breakpad/ExtractInfo.cpp;h=fa2d3440670d9119cfb16373218ab071a368a70f;hb=refs/heads/breakpad-0.58
The code is taken partly from Breakpad's tools/mac/crash_report. It works on Mac OS X, and on Linux if I use STABS debugging information.
It sounds like you've got the processor invoking dump_syms automatically
somehow.
Let's try to separate these two questions:
- Is the symbol dumper correctly converting DWARF to the Breakpad symbol
format?
- Is the processor correctly interpreting your minidump and the symbol data?
Considering just the first question: can you run dump_syms directly on your
executable file and look at the output it's producing? Does it look
reasonable?
Jup, there is the OnDemandSymbolSupplier:
http://openlierox.git.sourceforge.net/git/gitweb.cgi?p=openlierox/openlierox;a=blob;f=src/breakpad/on_demand_symbol_supplier.cpp;h=f54dab2e2ed31ed4e38b2d3b18764b435b2a4258;hb=refs/heads/breakpad-0.58
And DumpSyms:
http://openlierox.git.sourceforge.net/git/gitweb.cgi?p=openlierox/openlierox;a=blob;f=src/breakpad/DumpSyms.cpp;h=f83b13314bff0962357f9aaf99fc626208f76d76;hb=refs/heads/breakpad-0.58
(Similar DumpSyms.mm also for MacOSX.)
It creates the symbol files in /tmp. This is a short part of that output:
FILE 456 /usr/lib/gcc/i686-pc-linux-gnu/4.3.2/include/g++-v4/sstream
FILE 457 /usr/lib/gcc/i686-pc-linux-gnu/4.3.2/include/g++-v4/streambuf
FILE 458 /usr/lib/gcc/i686-pc-linux-gnu/4.3.2/include/g++-v4/tr1_impl/hashtable
FILE 459 /usr/lib/gcc/i686-pc-linux-gnu/4.3.2/include/g++-v4/tr1_impl/hashtable_policy.h
FILE 460 /usr/lib/gcc/i686-pc-linux-gnu/4.3.2/include/g++-v4/tr1_impl/unordered_set
FILE 461 /usr/lib/gcc/i686-pc-linux-gnu/4.3.2/include/g++-v4/typeinfo
FUNC 1e6de0 5 0 DoSystemChecks
1e6de0 3 131 395
1e6de3 2 140 395
FUNC 1e6de5 a 0 GetBinaryFilename
1e6de5 3 145 395
1e6de8 7 145 395
FUNC 1e6def 17 0 QuitEventThreadEvent
1e6def 9 155 395
1e6df8 3 157 395
1e6dfb 7 158 395
1e6e02 4 160 395
FUNC 1e6e25 25 0 GetLogFilename
1e6e25 6 318 395
1e6e2b 1f 318 395
FUNC 1e6e4a 11 0 ResetQuitEngineFlag
1e6e4a 3 1499 395
1e6e4d c 1500 395
This looks reasonable to me.
I am wondering though about a lot of unnamed functions:
FUNC 1f00d6 2b 0
1f00d6 6 71 150
1f00dc 25 71 150
FUNC 1f0102 26 0
1f0102 4 84 150
1f0106 22 84 150
FUNC 1f0128 2a 0
1f0128 6 90 150
1f012e 24 90 150
FUNC 1f0152 2a 0
1f0152 6 91 150
1f0158 24 91 150
FUNC 1f017c a 0 GetFullGameName
1f017c 3 18 147
1f017f 7 18 147
FUNC 1f0186 d 0
1f0186 3 43 42
1f0189 a 43 42
FUNC 1f0194 1c 0
1f0194 6 46 42
1f019a 16 46 42
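To spot such records mechanically, here is a rough sketch of a FUNC record parser. It assumes the field layout visible in the output above (FUNC, then hex address, size and parameter size, then an optional name). FuncRecord and ParseFuncRecord are hypothetical names, not Breakpad API:

```cpp
#include <cstdint>
#include <sstream>
#include <string>

struct FuncRecord {
  uint64_t address = 0;
  uint64_t size = 0;
  uint64_t param_size = 0;
  std::string name;  // empty when the dumper emitted no name
};

// Parses "FUNC <address> <size> <param_size> [name...]" with hex fields.
// Returns false if the line is not a FUNC record or a number is missing.
bool ParseFuncRecord(const std::string& line, FuncRecord* out) {
  std::istringstream in(line);
  std::string tag;
  if (!(in >> tag) || tag != "FUNC") return false;
  if (!(in >> std::hex >> out->address >> out->size >> out->param_size))
    return false;
  out->name.clear();
  // The name may contain spaces (C++ signatures), so take the rest of the
  // line. If nothing is left, the name simply stays empty.
  std::getline(in >> std::ws, out->name);
  return true;
}
```

A record such as "FUNC 1f00d6 2b 0" parses successfully but leaves name empty, which makes the unnamed entries easy to count.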
It seems that not a single C++ function appears in the output. For example, there should be some ThreadPool / ThreadWrapper output here:
az@acompneu ~/Programmierung/openlierox $ grep Thread
/tmp/openlierox.x86.sym
FILE 141 /home/az/Programmierung/openlierox/./include/ThreadPool.h
FILE 142 /home/az/Programmierung/openlierox/./include/ThreadVar.h
FILE 388 /home/az/Programmierung/openlierox/src/common/ThreadPool.cpp
FUNC 1e6def 17 0 QuitEventThreadEvent
FUNC 1ec359 123 0 doActionInMainThread
FUNC 1ec47c 1f 0 doSetVideoModeInMainThread
FUNC 1ec9d0 1f 0 doVideoFrameInMainThread
FUNC 1ecbf9 6a0 0 MainLoopThread
FUNC 25440e 362 0 CPUFillFromThreadInfo
FUNC 25750b 26 0 ResumeThread
FUNC 25780e 9f 0 SuspendThread
FUNC 25dbae 48 0 PrintThread
FUNC 261531 13 0 EvHndl_SysWmEvent_MainThread
FUNC 2df154 5 0 setCurThreadName
FUNC 2df159 5 0 setCurThreadPriority
FUNC 4e341d a9 0 UnInitThreadPool
FUNC 4e38b7 d4 0 InitThreadPool
FUNC 52b490 2b1 0 SdlNetEventThreadMain
FUNC 5fae96 1a7 0 sock_rpThread
FUNC 600e98 a8 0 nlThreadCreate
FUNC 600f40 d 0 nlThreadYield
FUNC 600f4d 3d 0 nlThreadJoin
FUNC 600f8a 76 0 nlThreadSleep
Perhaps the C++ demangler has some problems? Or is the C++ information stored differently somehow?
This is a short part of the final crash report:
Date: 2009-10-19 19:30:18 GMT
Operating system: Linux (0.0.0 Linux 2.6.27-gentoo-r8 #1 SMP Tue Jan 20
21:07:44 CET
2009 i686)
Architecture: x86
Crash reason: SIGSEGV
Crash address: 0x13
Thread 19 (crashed)
0 openlierox 0x085b7070
1 openlierox 0x085c29f0
2 openlierox 0x085aa23e
3 openlierox 0x085aa624
4 openlierox 0x08424f0f
5 openlierox 0x0842530b
6 openlierox 0x0842549b
7 openlierox 0x08234e37
8 openlierox 0x0852a01e
9 openlierox 0x0852acd7
10 libSDL-1.2.so.0.11.2 0xb7eed5ae
11 libSDL-1.2.so.0.11.2 0xb7f313d4
12 libpthread-2.9.so 0xb7d8e15e
13 libc-2.9.so 0xb7a66c0d
eip = 0x085b7070 esp = 0xacf4cf80 ebp = 0xacf4cfc8 ebx = 0x08774ce0
esi = 0x0a0e0f50 edi = 0x0a0e1018 eax = 0x00000013 ecx = 0x00000001
edx = 0x0a8e36d8 efl = 0x00210286
Btw., libSDL and all other libs on my system are compiled with "-ggdb". openlierox itself is compiled with "-g".
I played around a bit with objdump, and this seems interesting to me:
az@acompneu ~/Programmierung/openlierox $ objdump -g bin/openlierox | grep -C 2 threadWrapper
35512 std::_Rb_tree<ThreadPoolItem*, ThreadPoolItem*,
std::_Identity<ThreadPoolItem*>, std::less<ThreadPoolItem*>,
std::allocator<ThreadPoolItem*> >::_M_insert_unique
35622 std::set<ThreadPoolItem*, std::less<ThreadPoolItem*>,
std::allocator<ThreadPoolItem*> >::insert
35687 ThreadPool::threadWrapper
35763 ThreadPool::prepareNewThread
35812 ThreadPool::start
--
<2><11d4e>: Abbrev Number: 74 (DW_TAG_subprogram)
<11d4f> DW_AT_external : 1
<11d50> DW_AT_name : (indirect string, offset: 0x7c1e):
threadWrapper
<11d54> DW_AT_decl_file : 11
<11d55> DW_AT_decl_line : 52
<11d56> DW_AT_MIPS_linkage_name: (indirect string, offset: 0x4fa6b):
_ZN10ThreadPool13threadWrapperEPv
<11d5a> DW_AT_type : <0x47>
<11d5e> DW_AT_accessibility: 3 (private)
--
<2><143d0e>: Abbrev Number: 72 (DW_TAG_subprogram)
<143d0f> DW_AT_external : 1
<143d10> DW_AT_name : (indirect string, offset: 0x7c1e):
threadWrapper
<143d14> DW_AT_decl_file : 59
<143d15> DW_AT_decl_line : 52
<143d16> DW_AT_MIPS_linkage_name: (indirect string, offset: 0x4fa6b):
_ZN10ThreadPool13threadWrapperEPv
<143d1a> DW_AT_type : <0x13d64f>
<143d1e> DW_AT_accessibility: 3 (private)
--
And:
az@acompneu ~/Programmierung/openlierox $ objdump -g bin/openlierox | grep Cmd_crash
344266 Cmd_crash::~Cmd_crash
344298 Cmd_crash::~Cmd_crash
360844 Cmd_crash::Cmd_crash
379357 Cmd_crash::exec
<169cf2d> DW_AT_name : (indirect string, offset: 0x23c1d1):
Cmd_crash
<169cf48> DW_AT_name : (indirect string, offset: 0x23c1d1):
Cmd_crash
<169cf60> DW_AT_name : (indirect string, offset: 0x23c1d1):
Cmd_crash
<169cf7d> DW_AT_MIPS_linkage_name: (indirect string, offset:
0x243693):
_ZN9Cmd_crash4execEP11CmdLineIntfRKSt6vectorISsSaISsEE
<169cfa3> DW_AT_name : (indirect string, offset: 0x23c1d0):
~Cmd_crash
0x0023c1d0 7e436d64 5f637261 7368005f 5a4e5374 ~Cmd_crash._ZNSt
0x00243690 345f005f 5a4e3943 6d645f63 72617368 4_._ZN9Cmd_crash
objdump: Warning: There is a hole [0x13bfff - 0x13c023] in .debug_loc
section.
objdump: Warning: There is a hole [0x14b9bb - 0x14b9df] in .debug_loc
section.
objdump: Warning: There is a hole [0x14e58f - 0x14e5b3] in .debug_loc
section.
objdump: Warning: There is a hole [0x158383 - 0x1583a7] in .debug_loc
section.
objdump: Warning: There is a hole [0x15c10b - 0x15c12f] in .debug_loc
section.
...
Could the problem be related to the fact that there are a lot of indirect
strings for
the DW_AT_name attribute?
Another interesting note: after fixing my OnDemandSymbolSupplier to also find debugging info that has been split from the libs (I have it in /usr/lib/debug/), so that it correctly creates the debugging information for the libs, I saw this output:
Date: 2009-10-20 01:41:52 GMT
Operating system: Linux (0.0.0 Linux 2.6.27-gentoo-r8 #1 SMP Tue Jan 20
21:07:44 CET
2009 i686)
Architecture: x86
Crash reason: SIGSEGV
Crash address: 0x13
Thread 19 (crashed)
0 openlierox 0x085b9650
1 openlierox 0x085c4fd0
2 openlierox 0x085ac81e
3 openlierox 0x085acc04
4 openlierox 0x084274f3
5 openlierox 0x084278ef
6 openlierox 0x08427a7f
7 openlierox 0x08236347
8 openlierox 0x0852c5fe
9 openlierox 0x0852d2b7
10 libSDL-1.2.so.0.11.2 0xb7e3b5ae
11 libSDL-1.2.so.0.11.2 0xb7e7f3d4
12 libpthread-2.9.so 0xb7cdc15e start_thread + 0x10 (pthread_create.c:297)
13 libc-2.9.so 0xb79b4c0d
For all the other threads of the report, the output is very similar. It
always prints
this single function (even same line number) from pthread but nothing else.
Is there perhaps some additional pthread handling needed?
Attached the output of the symbol dumper. Perhaps that helps for debugging.
Attachments:
crashrepout.txt 284 KB
That does help. Could you also post the output from running 'readelf -wi'
on your
executable?
The readelf output (which is quite big: ~700mb, compressed ~60mb) can be
downloaded here:
http://www.4shared.com/file/142157560/1eb1aa81/openlierox-readelf.html
I also attached the binary itself.
Attachments:
bin.7z 9.9 MB
The FUNC lines with no names are the result of dump_syms not following DWARF DW_AT_specification links from debugging information entries (DIEs) with no names, which represent definitions, to DIEs that do have names, which represent the declarations.
This is an oversight in breakpad/src/common/linux/dump_dwarf.cc. I remember Ted dealing with this for the Mac dumper. Let me see if I can whip something up.
(For the record, the other big known issue with the DWARF dumper is that it
doesn't
deal with inlined functions yet.)
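As an illustration of the DW_AT_specification issue only, here is a minimal sketch of the lookup, assuming the DIEs have already been collected into a map keyed by offset. Die and ResolveName are hypothetical names and do not reflect how dump_dwarf.cc is structured:

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified DIE: only the attributes relevant here.
struct Die {
  std::string name;        // DW_AT_name, empty if absent
  uint64_t specification;  // offset from DW_AT_specification, 0 if absent
};

// Returns the DIE's own name, or the name of the declaration it refers to
// through DW_AT_specification. The hop count is bounded so that malformed
// data with a reference cycle cannot loop forever.
std::string ResolveName(const std::map<uint64_t, Die>& dies, uint64_t offset) {
  for (int hops = 0; hops < 8; ++hops) {
    std::map<uint64_t, Die>::const_iterator it = dies.find(offset);
    if (it == dies.end()) return "";
    if (!it->second.name.empty()) return it->second.name;
    if (it->second.specification == 0) return "";
    offset = it->second.specification;
  }
  return "";
}
```

With a nameless definition DIE pointing at a declaration DIE named threadWrapper, ResolveName on the definition's offset returns "threadWrapper" instead of an empty string.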
> Btw., I am using gcc version 4.3.2. I wonder that the code has worked for
> you. So
> probably your compiler has not used such DW_AT_specification references
> (and perhaps
> other things I don't know about yet).
No, I'm using 4.3.3. As I said in comment 10, the code hasn't been
thoroughly tested
yet, so there will be a bunch of stuff like this to work out. I appreciate
your help
wading through the problems.
Jupp, I already added the DW_AT_specification handling (see my previous
comment with
the patch).
I have now also added a further patch which gives correct names when a DIE is a child of another one. See here for more details:
I added module-relative addresses in the crash report:
Crash reason: SIGSEGV
Crash address: 0x13
Thread 19 (crashed)
0 openlierox 0x085b9650 (0x571650)??
1 openlierox 0x085c4fd0 (0x57cfd0)??
2 openlierox 0x085ac81e (0x56481e)??
3 openlierox 0x085acc04 (0x564c04)??
4 openlierox 0x084274f3 (0x3df4f3)??
5 openlierox 0x084278ef (0x3df8ef)??
6 openlierox 0x08427a7f (0x3dfa7f)??
7 openlierox 0x08236347 (0x1ee347)??
8 openlierox 0x0852c5fe (0x4e45fe)??
9 openlierox 0x0852d2b7 (0x4e52b7)??
10 libSDL-1.2.so.0.11.2 0xb7e3b5ae (0xf5ae)??
11 libSDL-1.2.so.0.11.2 0xb7e7f3d4 (0x533d4)??
12 libpthread-2.9.so 0xb7cdc15e (0x615e)start_thread + 0x10 (pthread_create.c:297)
13 libc-2.9.so 0xb79b4c0d (0xcec0d)??
The top stackframe is the function Cmd_crash::exec(). In the symbol file,
this is the
entry:
FUNC 5714ea 1bf 0 Cmd_crash::exec(CmdLineIntf*, std::vector<std::string, std::allocator<std::string> > const&)
5714ea 8 557 361
5714f2 8d 559 361
57157f 23 564 361
5715a2 1c 559 361
5715be 8d 565 361
57164b b 567 361
571656 30 568 361
571686 1c 565 361
5716a2 7 570 361
The address 0x571650 is clearly in that range, though no line record starts at exactly that address.
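For reference, this is the kind of range lookup I would expect, written as a small sketch. LineRecord, FindLine and ToModuleRelative are hypothetical names, not the processor's code. The load base 0x08048000 is not stated anywhere in the report; it is just the difference between the absolute and module-relative addresses shown above:

```cpp
#include <cstdint>
#include <vector>

struct LineRecord { uint64_t address; uint64_t size; int line; };

// Converts an absolute address to a module-relative one.
uint64_t ToModuleRelative(uint64_t absolute, uint64_t module_base) {
  return absolute - module_base;
}

// Returns the record whose half-open range [address, address + size)
// contains addr, or nullptr if no record covers it.
const LineRecord* FindLine(const std::vector<LineRecord>& lines,
                           uint64_t addr) {
  for (const LineRecord& rec : lines) {
    if (addr >= rec.address && addr - rec.address < rec.size) return &rec;
  }
  return nullptr;
}

// Line records copied from the Cmd_crash::exec entry above.
const std::vector<LineRecord> kCmdCrashExecLines = {
    {0x5714ea, 0x8, 557},  {0x5714f2, 0x8d, 559}, {0x57157f, 0x23, 564},
    {0x5715a2, 0x1c, 559}, {0x5715be, 0x8d, 565}, {0x57164b, 0xb, 567},
    {0x571656, 0x30, 568}, {0x571686, 0x1c, 565}, {0x5716a2, 0x7, 570},
};
```

Looking up 0x571650 in kCmdCrashExecLines yields the record that starts at 0x57164b (line 567), at offset 0x5 into that record, so a range-based lookup should be able to resolve this frame.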
Great work. Can you look at the output from readelf -wl to see if the
DWARF line
number info does, in fact, provide a line number entry for 0x571650 that
has been
omitted from the symbol file?
By the way: using DW_AT_MIPS_linkage_name to produce the fully-qualified name is fine, but that attribute may not always be present; it mostly duplicates information available elsewhere, and takes up a lot of space, so there has been discussion about leaving it out. So in the long run we need a solution that follows DW_AT_specification references and uses the DIE nesting structure to compute the name. I'm working on a patch for that.
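Purely as a sketch of the nesting idea, and not the patch itself: assuming each DIE records the offset of its enclosing DIE, the qualified name can be assembled by walking outward. ScopedDie and QualifiedName are hypothetical names, and a parent offset of 0 is assumed to mean "directly under the compilation unit":

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified DIE with a link to its enclosing DIE.
struct ScopedDie {
  std::string name;  // DW_AT_name, empty if absent
  uint64_t parent;   // offset of the enclosing DIE, 0 for top level
};

// Builds "Outer::Inner::name" by walking parent links up to the top level.
// Unnamed scopes are skipped; the depth is bounded against malformed input.
std::string QualifiedName(const std::map<uint64_t, ScopedDie>& dies,
                          uint64_t offset) {
  std::string result;
  for (int depth = 0; offset != 0 && depth < 64; ++depth) {
    std::map<uint64_t, ScopedDie>::const_iterator it = dies.find(offset);
    if (it == dies.end()) break;
    const std::string& name = it->second.name;
    if (!name.empty()) result = result.empty() ? name : name + "::" + result;
    offset = it->second.parent;
  }
  return result;
}
```

For a definition DIE that carries DW_AT_specification, the walk would presumably start from the declaration DIE, since that is the one nested inside the class or namespace.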
We don't use functioninfo.cc because we want to be able to populate a
Module instance
with both STABS and DWARF data.
...
22 3 0 0 Command.cpp
...
Set File Name to entry 22 in the File Name Table
Advance Line by 2211 to 2246
Advance PC by 81 to 0x85a8ea1
Copy
Special opcode 33: advance Address by 2 to 0x85a8ea3 and Line by 0 to 2246
Special opcode 89: advance Address by 6 to 0x85a8ea9 and Line by 0 to 2246
..
Special opcode 165: advance Address by 11 to 0x85b95be and Line by 6 to 565
Advance PC by 141 to 0x85b964b
Special opcode 7: advance Address by 0 to 0x85b964b and Line by 2 to 567
Special opcode 160: advance Address by 11 to 0x85b9656 and Line by 1 to 568
Advance PC by 48 to 0x85b9686
Special opcode 2: advance Address by 0 to 0x85b9686 and Line by -3 to 565
Advance PC by constant 17 to 0x85b9697
Special opcode 164: advance Address by 11 to 0x85b96a2 and Line by 5 to 570
Advance Line by -521 to 49
Special opcode 103: advance Address by 7 to 0x85b96a9 and Line by 0 to 49
Advance Line by 23 to 72
Special opcode 89: advance Address by 6 to 0x85b96af and Line by 0 to 72
Advance PC by 86 to 0x85b9705
So 0x85b9650 doesn't seem to be covered exactly.
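To double-check my reading of the dump, here is a small sketch of how a DWARF special opcode is decoded. The header values are assumptions: line_base = -5 and line_range = 14 are not shown in the excerpt, but they reproduce every "Special opcode" line above, and the opcode numbers are taken as readelf prints them (apparently already reduced by opcode_base). LineAdvance and DecodeSpecialOpcode are hypothetical names:

```cpp
#include <cstdint>

struct LineAdvance { uint64_t address; int line; };

// Decodes a DWARF special opcode given as readelf prints it, i.e. with
// opcode_base already subtracted. The defaults are assumed header values
// (line_base = -5, line_range = 14, minimum instruction length 1).
LineAdvance DecodeSpecialOpcode(int adjusted_opcode, int line_base = -5,
                                int line_range = 14,
                                int min_inst_length = 1) {
  LineAdvance adv;
  adv.address =
      static_cast<uint64_t>(adjusted_opcode / line_range) * min_inst_length;
  adv.line = line_base + adjusted_opcode % line_range;
  return adv;
}
```

Under these assumptions, opcode 160 advances the address by 11 and the line by 1, matching the step from 0x85b964b (line 567) to 0x85b9656 (line 568). Each row stays in effect until the next one, so 0x85b9650 lies within the span of the line 567 row even though no row starts there.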
Yes, that's why I handle both cases: if DW_AT_MIPS_linkage_name is available, it is used; if not, DW_AT_name is used together with the concatenated names of all parents.
As far as I can see from Stackwalker::Walk and
BasicSourceLineResolver::Module::LookupAddress, it should already work if
the address
is just in the correct range (and that is the case as you can see in my
comment). I
also see that Walk hits 0x85b9650, but LookupAddress does not. It seems
that it
doesn't find the module.
I have a lot of these errors in my output, perhaps they are the reason?
2009-10-20 04:51:30: basic_source_line_resolver.cc:315: ERROR: ParseFunction failed at :1308
Yes, it was indeed related. The problem was the unnamed functions in the symbol file. I added an extra check to the symbol file parser that handles those cases:
http://openlierox.git.sourceforge.net/git/gitweb.cgi?p=openlierox/openlierox;a=commitdiff;h=7a61523fb8c413f751e5662e3dc764c6a4d6003e
This finally gives a usable result:
Thread 19 (crashed)
0 openlierox 0x085b9650 (0x00571650) Cmd_crash::exec(CmdLineIntf*, std::vector<std::string, std::allocator<std::string> > const&) + 0x5 (Command.cpp:567)
1 openlierox 0x085c4fd0 (0x0057cfd0) Command::exec(CmdLineIntf*, std::string const&) + 0x1f (Command.cpp:331)
2 openlierox 0x085ac81e (0x0056481e) HandleCommand + 0x25 (Command.cpp:2111)
3 openlierox 0x085acc04 (0x00564c04) HandlePendingCommands() + 0xa (Command.cpp:2164)
4 openlierox 0x084274f3 (0x003df4f3) DeprecatedGUI::Menu_Frame() + 0x4 (MenuSystem.cpp:258)
5 openlierox 0x084278ef (0x003df8ef) DeprecatedGUI::Menu_Loop() + 0x4 (MenuSystem.cpp:352)
6 openlierox 0x08427a7f (0x003dfa7f) DeprecatedGUI::Menu_Start() + 0x4 (MenuSystem.cpp:246)
7 openlierox 0x08236347 (0x001ee347) MainLoopThread + 0x4 (main.cpp:817)
8 openlierox 0x0852c5fe (0x004e45fe) <anonymous function> + 0x19
9 openlierox 0x0852d2b7 (0x004e52b7) ThreadPool::threadWrapper(void*) + 0x14 (ThreadPool.cpp:91)
10 libSDL-1.2.so.0.11.2 0xb7e3b5ae (0x0000f5ae) SDL_RunThread + 0x5 (SDL_thread.c:202)
11 libSDL-1.2.so.0.11.2 0xb7e7f3d4 (0x000533d4) RunThread + 0xa (SDL_systhread.c:47)
12 libpthread-2.9.so 0xb7cdc15e (0x0000615e) start_thread + 0x10 (pthread_create.c:297)
13 libc-2.9.so 0xb79b4c0d (0x000cec0d) *__GI___sysctl + 0x65
A FUNC line should always have a name, though. Even with your patches, is
the symbol
dumper still producing unnamed functions? That's the bug. Changing the
parser in
BasicSourceLineResolver isn't right.
Could it perhaps be because of my small addition of DW_TAG_inlined_subroutine here:
dwarf2reader::DIEHandler *DumpDwarfCURootHandler::FindChildHandler(
    uint64 offset,
    enum DwarfTag tag,
    const AttributeList &attrs) {
  switch (tag) {
    case dwarf2reader::DW_TAG_subprogram:
    case dwarf2reader::DW_TAG_inlined_subroutine:  // my addition
      return new DumpDwarfFuncHandler(cu_offset_, offset, "", &functions_,
                                      &offset_to_funcinfo_);
    default:
      return new DumpDwarfChildIterator(this);
  }
}
> Could it perhaps because of my small addition of
> DW_TAG_inlined_subroutine here:
Yes --- functions with overlapping ranges, like an inlined function within
an
enclosing function, will confuse the processor.
The solution which strictly obeys the current file format would be to have
the dumper
split the enclosing function into several distinct FUNC records, all with
the same
name, but whose addresses are the spaces between the inlined functions'
code. But
that's pretty gross.
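Just to show what that splitting would involve, here is a rough sketch under simplifying assumptions: the inlined ranges lie entirely inside the enclosing function and do not overlap one another. Range and SplitAroundInlined are hypothetical names, not dumper code:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Range { uint64_t address; uint64_t size; };

// Returns the parts of outer not covered by any inlined range. Assumes the
// inlined ranges lie inside outer and do not overlap each other.
std::vector<Range> SplitAroundInlined(const Range& outer,
                                      std::vector<Range> inlined) {
  std::sort(inlined.begin(), inlined.end(),
            [](const Range& a, const Range& b) {
              return a.address < b.address;
            });
  std::vector<Range> pieces;
  uint64_t cursor = outer.address;
  for (const Range& r : inlined) {
    // Emit the gap before this inlined range, if there is one.
    if (r.address > cursor) pieces.push_back({cursor, r.address - cursor});
    cursor = r.address + r.size;
  }
  const uint64_t end = outer.address + outer.size;
  if (end > cursor) pieces.push_back({cursor, end - cursor});
  return pieces;
}
```

Each returned piece would become its own FUNC record carrying the enclosing function's name, which is where the duplication comes from.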
I think the better solution would be to simply change the file format to
allow one
FUNC record's address range to be nested within another's, and require the
processor
to use the innermost matching function. That would probably entail some
interesting
changes to BasicSourceLineResolver::Module.
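A minimal sketch of the "innermost matching function" rule, assuming properly nested ranges and using a plain linear scan for clarity. NestedFunc and FindInnermost are hypothetical names; this is not how BasicSourceLineResolver::Module stores or searches its data:

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct NestedFunc { uint64_t address; uint64_t size; std::string name; };

// Returns the smallest range containing addr, which is the innermost
// function when ranges are properly nested. Returns nullptr if none match.
const NestedFunc* FindInnermost(const std::vector<NestedFunc>& funcs,
                                uint64_t addr) {
  const NestedFunc* best = nullptr;
  for (const NestedFunc& f : funcs) {
    const bool contains = addr >= f.address && addr - f.address < f.size;
    if (contains && (best == nullptr || f.size < best->size)) best = &f;
  }
  return best;
}
```

With an outer function at [0x1000, 0x1100) and an inlined one at [0x1040, 0x1050), address 0x1044 resolves to the inlined function and 0x1004 to the outer one.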