I have a crash dump from the field for our product. It corresponds to
release mode binaries built with the "maximize speed" optimization.
I open it in WinDBG and get an address of the form
module!expfunc+xxxxx. I want to take the following steps to
troubleshoot the issue.
Q1. Please let me know if this is correct:
a) Generate PDB files for the release mode build.
b) In step a, retain the optimization settings ("maximize speed").
c) Also generate .cod files by setting the /FAcs option in the compiler
(VC++ 6.0).
Q2. I also want to know how reliably I can get to the exact source code
location for such optimized binaries using the .cod files.
Q3. In WinDBG I normally use the option View | Disassembly and then key in
the offset, e.g. in this case module!expfunc+xxxxx. I also right-click on the
main frame of this window and select the options "Show source line for each
instruction" and "Show source file for each instruction".
My question is how reliably WinDBG can point me to the exact source
code/line for release mode optimized binaries.
Please guide me.
Regards,
Anubhav.
--
Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net
"Anubhav" <Anu...@discussions.microsoft.com> wrote in message
news:D487613A-D41F-48D0...@microsoft.com...
> a) Generate PDB files for the release mode build.
> b) In step a I plan to retain the optimization settings ("maximize speed")
Yes. In fact, don't wait for something to go wrong before doing this;
simply always generate .PDB files when building code you'll ship.
(Of course be sure that in addition to requesting the linker to
generate the .PDB file to store the symbol information for your
release build, you're also asking the compiler to generate symbol
information for your release build in the first place.)
> Q2. I also want to know how much reliably I can get to the exact source
> code location for such optimized binaries using the .cod files
Like Ken said, it won't be straightforward, but it is very achievable.
I generally end up turning off WinDBG's source mode in such cases,
since watching WinDBG try to follow the source line information
(which at many points will not seem linear or logical in an
optimized build) can be misleading or distracting.
Alan Adams
--
Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net
"Alan Adams" <alan...@nospam.nospam> wrote in message
news:gbrur2d1qtqlr11bk...@4ax.com...
Use these kinds of things as 'landmarks' to correlate disassembly with
source code. It also helps to understand the standard calling conventions
along the way. In fact, you'll find that in practice, some of the same kinds
of knowledge that help with reverse engineering foreign code are great for
debugging optimized code. Don't let that scare you away, though, as
you still have symbols and source code to help.
There is a bit of low-level information on calling convention details here
that you might find useful: http://www.nynaeve.net/?p=66
Patterns you might encounter:
- Stack space for a local variable (or parameter) is reused for another
variable once the first is no longer referenced
- Locals may be moved entirely into registers or moved into and out of
registers
- Optimizations for stack argument cleaning after __cdecl functions (e.g.
changing `add esp, 4' to `pop ecx')
- Optimization of `this' pointer with WPO (typically changes from `ecx' to
`ebx' or another nonvolatile register, for deep __thiscall chains)
- Custom calling conventions with WPO (though the compiler typically
continues to return the return value in `eax' or `edx:eax', and typically
doesn't use `eflags' across function calls)
--
Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net
"Anubhav" <Anu...@discussions.microsoft.com> wrote in message
news:2880CAE3-1CAA-4C65...@microsoft.com...
Is `dv` any better than the locals window?
thanks,
Marc
"Skywing [MVP]" <skywing_...@valhallalegends.com> wrote in message
news:uMll6FJR...@TK2MSFTNGP06.phx.gbl...
It should be safe to use in a debug build without optimizations, assuming
you're using source mode. I still tend to avoid it even there, though,
since I tend to prefer assembler mode instead of source mode. When you are
stepping on an instruction basis (and not a source line basis), it will
sometimes show weird things for locals that aren't quite fully initialized
yet. However, it won't give you the same just plain wrong data as in a
fully optimized build; my bias against it in debug builds is more of a
personal preference thing, I suppose.
--
Ken Johnson (Skywing)
Windows SDK MVP
http://www.nynaeve.net
"Marc Sherman" <masher...@yahoo.com> wrote in message
news:O9rvVdUR...@TK2MSFTNGP02.phx.gbl...
Ah yes, I've noticed that too, especially before the stack frame is set up
in the function preamble. Thanks for clarifying.
Marc
Here is one of my fears come true. WinDBG reports the following
1. FAULTING_IP: as MyDLL!AnExportedFunction+Offset.
2. There is an exception with code 0xC0000090
3. The stack (kb command) does not show any MyDLL functions either.
When I do "dumpbin /DISASM MyDLL.dll > dump.txt" and go to "Offset" from
AnExportedFunction in dump.txt, I see that the error location reported by
WinDBG is actually a floating point instruction within the Intel library
function _ftol.
There are many places in AnExportedFunction in MyDLL that perform
floating point operations. Here are my questions.
Q1. Why is WinDBG reporting the faulting IP relative to
AnExportedFunction in MyDLL, even though it points to an instruction in the
Intel library? Is this reliable? Can I be sure that the problem is actually
in the entry point AnExportedFunction?
Q2. How do I proceed with debugging this? Since I do not get any
MyDLL functions on the call stack, is it a good idea to rebuild MyDLL.DLL
with /Oy- (non FPO) and redistribute it to my client in the hope of getting
a better call stack during the crash?
Q3. Am I missing something important in my analysis?
Please advise me further on this issue.
Regards,
Anubhav.
Second thing you can do:
Disassemble the area near the crash and inspect the register values. I know
it is hard but once you do it a couple of times it will become less of a
challenge.
0xC0000090 is EXCEPTION_FLT_INVALID_OPERATION. Looks like some error in
floating point code.
I see... ftol is floating point to long. Check the floating point value
before this call is made. It might be out of range for a long, or maybe you
have an uninitialized variable somewhere. Check your compiler output at
warning level 4 if you are using Visual Studio.
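As a rough illustration of "check the value before the conversion", here is a minimal sketch in standard C++. The helper names `fits_in_long` and `checked_to_long` are made up for this example and are not part of the CRT or of MyDLL. The sketch assumes the conversion truncates toward zero, as a plain cast does.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: true when `value` can be truncated to a long.
// Accepts the range [-2^d, 2^d), where d is the number of value bits in a
// long, so it is slightly conservative just below LONG_MIN.
bool fits_in_long(double value) {
    if (std::isnan(value)) return false;  // NaN never converts safely
    const double limit = std::ldexp(1.0, std::numeric_limits<long>::digits);
    return value >= -limit && value < limit;  // infinities fail here too
}

// Hypothetical helper: converts when safe, otherwise returns `fallback`
// so the caller can log or handle the bad input.
long checked_to_long(double value, long fallback) {
    if (!fits_in_long(value)) return fallback;
    const long result = static_cast<long>(value);
    // Truncation moves toward zero, so the magnitude can only shrink.
    assert(std::fabs(static_cast<double>(result)) <= std::fabs(value));
    return result;
}
```

Calling something like `checked_to_long(x, -1)` at the suspicious conversion sites and logging whenever the fallback is returned may narrow down which input is out of range.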
The reason WinDBG doesn't show you the actual function name is that it
is either an inline function or a function local to the object file.
In that case WinDBG picks the nearest symbol name preceding the
program counter. This is the best it can do under heavy optimization,
or when your pdb or dbg file does not carry enough debugging info to
pinpoint the exact symbol.
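To make the nearest-symbol behaviour concrete, here is a simplified model in standard C++. It is only a sketch of the idea, not how WinDBG is actually implemented; the `SymbolTable` type, the `nearest_symbol` function and the addresses are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical symbol table: start address -> symbol name.
using SymbolTable = std::map<std::uint32_t, std::string>;

// Resolves `address` to "name+0xoffset" using the closest symbol that
// starts at or before the address.
std::string nearest_symbol(const SymbolTable& symbols, std::uint32_t address) {
    auto it = symbols.upper_bound(address);  // first symbol after address
    if (it == symbols.begin()) return "<no symbol>";
    --it;                                    // closest symbol at or below
    assert(address >= it->first);
    std::ostringstream out;
    out << it->second << "+0x" << std::hex << (address - it->first);
    return out.str();
}
```

With only the export in the table, an address inside a statically linked helper resolves to `AnExportedFunction+0x...`. Once the table also contains the helper (as it would with full private symbols), the same address resolves to the helper itself.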
With the ftol lead you are heading in the right direction to find the error.
Avi.
"Anubhav" <Anu...@discussions.microsoft.com> wrote in message
news:1514559F-072E-42E8...@microsoft.com...
By using /Oy- (non FPO) and/or something else?
thanks,
Marc
In the meantime I figured out that by default the CRT libraries turn off
floating point exceptions. Here is the KB article:
http://support.microsoft.com/kb/94998
Hence, my suspicion is that the specific application at the customer site is
probably enabling floating point exceptions, and some floating point operation
in our code is raising one. Since we do not handle exceptions in our DLL
entry points, the terminate function might be getting called and causing
the application to exit.
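The "masked by default" behaviour can be sketched with the portable `<cfenv>` facilities. This is a standard C++ analogue, not the MSVC-specific mechanism (there, a host application would typically unmask exceptions through the CRT's `_controlfp`). The function name `invalid_op_sets_flag_only` is made up for the example.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Performs an invalid floating point operation (0.0 / 0.0) and reports
// whether it merely raised the FE_INVALID status flag. In the default
// floating point environment the operation yields NaN and execution
// continues. No trap is delivered unless exceptions have been unmasked.
bool invalid_op_sets_flag_only() {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double zero = 0.0;            // volatile blocks constant folding
    volatile double result = zero / zero;  // invalid operation
    const double observed = result;
    assert(std::isnan(observed));
    return std::fetestexcept(FE_INVALID) != 0;
}
```

If the host process has unmasked the invalid-operation exception, the same division would fault instead of quietly producing NaN, which would be consistent with the crash only showing up at the customer site.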
Hence I am planning to add __try{} __except(){} guards in my DLL entry points
and re-ship MyDLL as a release build with debug symbols and the /Oy- setting.
In my exception handler I can dump the exception pointers into a text file and
do a post-mortem analysis later at our site.
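A minimal sketch of the formatting half of that plan, in standard C++ so it can be tried outside the DLL. The names `exception_name` and `format_exception_line` are hypothetical. In the real handler the inputs would presumably come from `GetExceptionInformation()` inside the `__except` filter (the `ExceptionCode` and `ExceptionAddress` fields of the exception record), and VC++ 6.0 would need `_snprintf` in place of `std::snprintf`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Maps a few well-known Win32 exception codes to their names.
const char* exception_name(std::uint32_t code) {
    switch (code) {
        case 0xC0000005u: return "EXCEPTION_ACCESS_VIOLATION";
        case 0xC000008Eu: return "EXCEPTION_FLT_DIVIDE_BY_ZERO";
        case 0xC0000090u: return "EXCEPTION_FLT_INVALID_OPERATION";
        case 0xC0000091u: return "EXCEPTION_FLT_OVERFLOW";
        case 0xC0000094u: return "EXCEPTION_INT_DIVIDE_BY_ZERO";
        default:          return "UNKNOWN";
    }
}

// Builds one log line for the text file (hypothetical layout).
std::string format_exception_line(std::uint32_t code, std::uintptr_t address) {
    char buffer[128];
    const int written = std::snprintf(
        buffer, sizeof buffer, "code=0x%08lX (%s) address=0x%08llX",
        static_cast<unsigned long>(code), exception_name(code),
        static_cast<unsigned long long>(address));
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    return std::string(buffer);
}
```

Logging the faulting address alongside the module's load address would let the offset be matched against the .cod listing or the dumpbin output later.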
Alternatively I think I can use ADPlus in crash mode. But I am not sure if
it will help me in this case.
As per the documentation, ADPlus configures CDB only for the exceptions below:
- Invalid handle
- Illegal instruction
- Integer divide-by-zero
- Floating-point divide-by-zero
- Integer overflow
- Invalid lock sequence
- Access violation
- Stack overflow
- C++ EH exception
- Unknown exception
The exception I am getting is EXCEPTION_FLT_INVALID_OPERATION. I am not sure
if this will be caught by ADPlus.
It is surprising that AppVerifier has no stop code to detect
floating point exceptions (at least, I could not find one).
Am I thinking in the right direction?