Crash Dump Analysis

Anubhav

Jan 30, 2007, 10:18:02 AM

Hi,

I have a crash dump from the field for our product. It obviously
corresponds to release mode binaries built with "maximize speed"
optimization. I open it in WinDBG and get an address of the form
module!expfunc+xxxxx. I want to take the following steps to
troubleshoot the issue.

Q1. Please let me know if this is correct

a) Generate PDB files for the release mode build.
b) In step a I plan to retain the optimization settings ("maximize speed")
c) Also generate .cod files by setting the /FAcs option in the compiler
(VC++ 6.0)

Q2. I also want to know how reliably I can get to the exact source code
location for such optimized binaries using the .cod files.

Q3. In WinDBG I normally use the option View | Disassembly and then key in
the offset e.g. in this case module!expfunc+xxxxx. I also right click on the
mainframe for this window and select the option "Show source line for each
instruction" and
"Show source file for each instruction".

My question is how reliably WinDBG can point me to the exact source
code/line for release mode optimized binaries.

Please guide me.

Regards,
Anubhav.

Skywing [MVP]

Jan 30, 2007, 10:32:56 AM

In optimized builds you'll typically need a bit of analysis to pin an
instruction down to an exact source line in some instances; it's not too
bad, though. I just disassemble around the target and correlate it with any
local constants, function calls, or anything else easily matched with the
source code.

--
Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net
"Anubhav" <Anu...@discussions.microsoft.com> wrote in message
news:D487613A-D41F-48D0...@microsoft.com...

Alan Adams

Jan 30, 2007, 11:21:21 AM

Anubhav <Anu...@discussions.microsoft.com> wrote:

> a) Generate PDB files for the release mode build.
> b) In step a I plan to retain the optimization settings ("maximize speed")

Yes. In fact, don't wait for something to go wrong before doing this;
simply always generate .PDB files when building code you'll ship.

(Of course be sure that in addition to requesting the linker to
generate the .PDB file to store the symbol information for your
release build, you're also asking the compiler to generate symbol
information for your release build in the first place.)

> Q2. I also want to know how much reliably I can get to the exact source
> code location for such optimized binaries using the .cod files

Like Ken said, it won't be straightforward, but it is very achievable.
I do generally end up turning off WinDBG's source mode in such cases,
since watching WinDBG try to follow the source line information (which
at many points will not seem linear or logical in an optimized build)
can be misleading or distracting.

Alan Adams

Skywing [MVP]

Jan 30, 2007, 12:09:57 PM

As an addendum to this, stay away from the local variables display support
in WinDbg. Due to limitations in the debug information format, it's
generally flaky, and more often than not completely wrong when faced with
even minimally optimized code.

--
Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net

"Alan Adams" <alan...@nospam.nospam> wrote in message
news:gbrur2d1qtqlr11bk...@4ax.com...

Anubhav

Jan 30, 2007, 10:43:00 PM

I would like to know if there is any recommended documentation on tips and
tricks for debugging optimized code, and any patterns I can expect to
encounter in this venture. I know this gets into the depths of compiler
theory, in which I am by no means an expert, hence my question.

Skywing [MVP]

Jan 30, 2007, 11:25:57 PM

The main things to look for are known things that the compiler can't change
(or not easily, anyway). For example, constants or function calls are
typically left in place (although for some constant math, the compiler might
do fancy tricks, or inline some function calls). Especially calls to
external functions can't be inlined though, and external functions always
have to correspond to a standard calling convention (at least when dealing
with the Microsoft C/C++ compiler).

Use these kinds of things as 'landmarks' to correlate disassembly with
source code. It also helps to understand the standard calling conventions
along the way (in fact, you'll find that in practice, some of the same kinds
of knowledge that help with reverse engineering foreign code are great for
helping to debug optimized code. Don't let that scare you away, though, as
you still have symbols and source code to help.).

There is a bit of low-level information on calling convention details here
that you might find useful: http://www.nynaeve.net/?p=66

Patterns you might encounter:

- Stack space for a local variable (or parameter) is reused for another
variable once the first is no longer referenced
- Locals may be moved entirely into registers or moved into and out of
registers
- Optimizations for stack argument cleaning after __cdecl functions (e.g.
changing `add esp, 4' to `pop ecx')
- Optimization of `this' pointer with WPO (typically changes from `ecx' to
`ebx' or another nonvolatile register, for deep __thiscall chains)
- Custom calling conventions with WPO (though the compiler typically
continues to return the return value in `eax' or `edx:eax', and typically
doesn't use `eflags' across function calls)
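
To make the 'landmarks' idea concrete, here is a small sketch. The function
name ComputeChecksum and both constants are hypothetical, chosen only for
illustration. The comments mark what would usually remain recognizable in the
disassembly of an optimized build, although the exact instructions depend on
the compiler and its settings.

```cpp
#include <cstdio>

// Hypothetical function, for illustration only.
int ComputeChecksum(const unsigned char* data, int length) {
    // Landmark 1: a distinctive constant usually shows up as an immediate
    // operand, so searching the disassembly for 5A17C0DE tends to find it.
    unsigned int sum = 0x5A17C0DE;

    for (int i = 0; i < length; ++i) {
        // Landmark 2: another constant. The optimizer may rewrite the
        // multiply, so this one is a weaker landmark than the first.
        sum = (sum ^ data[i]) * 0x01000193;
    }

    // Landmark 3: a call to an external function (here the CRT) stays a
    // call, and its format string is easy to match against the source.
    std::printf("checksum=%08X\n", sum);

    return static_cast<int>(sum & 0x7FFFFFFF);
}
```

When reading the disassembly around a faulting address, locating the nearest
such constant or call on either side narrows the crash down to the source
lines between them.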

--
Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net

"Anubhav" <Anu...@discussions.microsoft.com> wrote in message

news:2880CAE3-1CAA-4C65...@microsoft.com...

Marc Sherman

Jan 31, 2007, 9:51:48 AM

Is it flaky in a debug build with no optimizations?

Is `dv` any better than the locals window?

thanks,
Marc

"Skywing [MVP]" <skywing_...@valhallalegends.com> wrote in message
news:uMll6FJR...@TK2MSFTNGP06.phx.gbl...

Skywing [MVP]

Jan 31, 2007, 10:55:31 AM

`dv' should be the same engine internally as the locals window, I believe,
so I would expect them to both show the same results always.

It should be safe to use in a debug build without optimizations, assuming
you're using source mode. I still tend to avoid it even there, though,
since I tend to prefer assembler mode instead of source mode. When you are
stepping on an instruction basis (and not a source line basis), it will
sometimes show weird things for locals that aren't quite fully initialized
yet. However, it won't give you the same just plain wrong data as in a
fully optimized build; my bias against it in debug builds is more of a
personal preference thing, I suppose.

--
Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net

"Marc Sherman" <masher...@yahoo.com> wrote in message
news:O9rvVdUR...@TK2MSFTNGP02.phx.gbl...

Marc Sherman

Jan 31, 2007, 11:46:50 AM

"Skywing [MVP]" <skywing_...@valhallalegends.com> wrote in message

news:OziE%23AVRH...@TK2MSFTNGP06.phx.gbl...

> When you are stepping on an instruction basis (and not a source line
> basis), it will sometimes show weird things for locals that aren't quite
> fully initialized yet.

Ah yes, I've noticed that too, especially before the stack frame is set up
in the function preamble. Thanks for clarifying.

Marc

Anubhav

Jan 31, 2007, 10:23:00 PM

Hi,

Here is one of my fears come true. WinDBG reports the following

1. FAULTING_IP: as MyDLL!AnExportedFunction+Offset.

2. There is an exception of code 0xC0000009

3. The stack (kb command) also does not show any MyDLL functions.

When I do "dumpbin /DISASM MyDLL.dll > dump.txt" and go to "Offset" from
AnExportedFunction in dump.txt, I see that the error location reported by
WinDBG is actually a floating point instruction within the intel library
function _ftol

There are quite a few places in AnExportedFunction in MyDLL that
perform floating point operations. Here are my questions.

Q1. Why is WinDBG reporting the faulting IP with respect to
AnExportedFunction in MyDLL, even though this points to an instruction in the
intel library? Is this reliable? Can I be sure that the problem is actually
in entry point AnExportedFunction?

Q2. How do I proceed with debugging this? Since I do not get any MyDLL
function on the call stack, is it a good idea to rebuild MyDLL.DLL with
/Oy- (non FPO) and redistribute it to my client in the hope of getting a
better call stack during the crash?

Q3. Am I missing something important in my analysis?

Please advise me further on this issue.

Regards,
Anubhav.

Anubhav

Feb 1, 2007, 12:44:01 AM

Sorry, I meant the exception code is 0xc0000090. I mistyped it as 0xc0000009
in my earlier post.

Avi Cohen Stuart

Feb 1, 2007, 2:47:59 PM

Can you reproduce the error at the customer site with a DLL that has been
compiled in debug mode?
Here at Infor we usually compile our DLLs such that a stack trace is always
possible (unless the stack is trashed somehow), and for debugging purposes we
sometimes place debuggable objects at the customer site.

Second thing you can do:

Disassemble the area near the crash and inspect the register values. I know
it is hard but once you do it a couple of times it will become less of a
challenge.

0xC0000090 is EXCEPTION_FLT_INVALID_OPERATION. Looks like some error in
floating point code.

I see... _ftol converts floating point to long. Check the floating point
value before this call is made. It might be out of range for a long, or maybe
you have an uninitialized variable somewhere. Check your compiler output at
warning level 4 if you are using Visual Studio.
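
As a rough sketch of the range check suggested above: the helper below
refuses to convert a value that is NaN, infinite, or too large for a long,
instead of letting the conversion raise an invalid operation. The name
CheckedFloatToLong is hypothetical and not part of the CRT.

```cpp
#include <climits>
#include <cmath>

// Hypothetical helper: converts only when the value is a number that fits
// in a long. On failure it returns false and leaves `out` untouched.
bool CheckedFloatToLong(double value, long& out) {
    // 2^(bits-1): the first magnitude that no longer fits in a long.
    // It is a power of two, so it is exactly representable as a double.
    const double limit =
        std::ldexp(1.0, static_cast<int>(sizeof(long) * CHAR_BIT) - 1);

    // NaN fails every comparison, so it is rejected here along with
    // infinities and out-of-range values.
    if (!(value >= -limit && value < limit)) {
        return false;
    }

    out = static_cast<long>(value);  // truncates toward zero
    return true;
}
```

Logging the cases where the helper returns false, together with the offending
value, is one way to find which caller feeds the bad input.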

The reason WinDbg doesn't show you the actual function name is that it is
either an inline function or a function local to the object file. In that
case WinDbg picks the symbol name closest to the program counter. That is
the best it can do when the code is heavily optimized or when the pdb or dbg
file doesn't carry enough debugging info to pinpoint the exact symbol.

With _ftol you are on the right track to finding the error.

Avi.

"Anubhav" <Anu...@discussions.microsoft.com> wrote in message

news:1514559F-072E-42E8...@microsoft.com...

Marc Sherman

Feb 1, 2007, 5:01:01 PM

"Avi Cohen Stuart" <aviXcohenXstuart@ssaglobalXcom> wrote in message
news:umZ4hnjR...@TK2MSFTNGP02.phx.gbl...

> Here at Infor we usually have compiled our DLL's as such that a stack
> trace is always possible (unless the stack is trashed somehow)

By using /Oy- (non FPO) and/or something else?

thanks,
Marc

Anubhav

Feb 1, 2007, 10:19:06 PM

Thanks for pouring in your bits and bytes of experience. That will surely
help me.

In the meantime I figured out that the CRT libraries turn off floating point
exceptions by default. Here is the KB article:
http://support.microsoft.com/kb/94998

Hence, my suspicion is that the specific application at customer site is
probably enabling floating point exceptions and some floating point operation
in our code is throwing an exception. Since we have not handled the
exceptions in our DLL entry points, terminate function MIGHT be getting
called and causing the application to exit.
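
Even while floating point exceptions are masked, an invalid operation still
sets a status flag that can be polled. The sketch below shows this with the
standard <cfenv> header, which assumes a C++11 compiler and so is not
available in VC++ 6.0; the MSVC CRT offers _clearfp and _statusfp in
<float.h> for the same purpose. The function name SqrtRaisesInvalid is
hypothetical.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical probe: reports whether sqrt(x) raised the IEEE
// "invalid operation" status flag.
bool SqrtRaisesInvalid(double x) {
    std::feclearexcept(FE_ALL_EXCEPT);

    // volatile keeps the compiler from folding the computation away or
    // moving it across the flag calls.
    volatile double input = x;
    volatile double result = std::sqrt(input);
    (void)result;

    return std::fetestexcept(FE_INVALID) != 0;
}
```

Wrapping suspect calculations in a check like this in a test build can show
which operation would trap once the host application unmasks the exception.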

Hence I am planning to add __try{} __except(){} guards in my DLL entry points
and re-ship MyDLL as a release build with debug symbols and the /Oy- setting.
In my exception handler I can dump the exception pointers into a text file
and do a post analysis later at our site.
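
A minimal sketch of such a guard follows. The names GuardedEntryPoint and
LogException and the file name mydll_crash.txt are hypothetical. The
structured exception handling path compiles only with the Microsoft compiler;
on any other compiler the sketch simply calls the worker directly.

```cpp
#if defined(_MSC_VER)
#include <cstdio>
#include <windows.h>

// Exception filter: append the exception code and faulting address to a
// text file, then tell the system to run the __except block.
static int LogException(EXCEPTION_POINTERS* info) {
    if (std::FILE* f = std::fopen("mydll_crash.txt", "a")) {
        std::fprintf(f, "code=0x%08lX address=%p\n",
                     static_cast<unsigned long>(
                         info->ExceptionRecord->ExceptionCode),
                     info->ExceptionRecord->ExceptionAddress);
        std::fclose(f);
    }
    return EXCEPTION_EXECUTE_HANDLER;
}
#endif

typedef int (*Worker)(int);

// Runs `worker(arg)`; if a structured exception escapes (MSVC only), it is
// logged and `failureValue` is returned instead of crashing the host.
int GuardedEntryPoint(Worker worker, int arg, int failureValue) {
#if defined(_MSC_VER)
    __try {
        return worker(arg);
    } __except (LogException(GetExceptionInformation())) {
        return failureValue;
    }
#else
    (void)failureValue;  // no SEH outside MSVC; call straight through
    return worker(arg);
#endif
}
```

Each exported function would forward its real work through a guard like this.
A function containing __try cannot also hold C++ objects that need unwinding,
which is why the worker is passed in as a plain function pointer.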

Alternatively I think I can use ADPlus in crash mode. But I am not sure if
it will help me in this case.

As per the documentation, ADPlus configures CDB only for the exceptions below:

Invalid handle
Illegal instruction
Integer divide-by-zero
Floating-point divide-by-zero
Integer overflow
Invalid lock sequence
Access violation
Stack overflow
C++ EH exception
Unknown exception

The exception I am getting is EXCEPTION_FLT_INVALID_OPERATION. I am not sure
if this will be caught by ADPlus.

It is surprising that Appverifier has no STOP CODE to detect floating point
exceptions (at least none that I could find).

Am I thinking in the right direction?