The StackWalk64 seems to be OK, too, and it is only when it comes to the symbol
and the line that the code fails.
If a particular address points to, say, KERNEL32.dll, I do get the symbol
(not the file or the line, though, but this is obviously fine) but never for
my own executable.
The .pdb file for the executable is placed together with it in the same
directory (it should be noted that the code is compiled on a different
machine, so the path to the .pdb file saved in the executable will be
incorrect). I do not modify the path and SymGetSearchPath() returns
".;F:\WINNT\Symbols" (that is, the current directory is included).
For both SymGetSymFromAddr64() (SymFromAddr()) and SymGetLineFromAddr64(),
GetLastError() returns "Attempt to access invalid address."
Thank you.
Paul
Perhaps you are missing the symbol files for Win2K ? Sometimes I
couldn't evaluate dump files too, when I hadn't configured the symbol
server environment variable, so that the required symbols are downloaded
automatically.
Here is a link that describes how to configure the environment variables:
www.microsoft.com/whdc/devtools/debugging/debugstart.mspx
I don't know which code you are using, but all you normally need is to
write a crash dump file and load it into the VS 2005 or WinDbg IDE. If
you have properly configured the environment variable for the symbol
server, VS 2005 will automatically download the needed symbols and you
should be able to debug the dump file and all code lines should be
displayed correctly.
You may also manually download the symbols, but you must know on which
OS and service pack version the crash occurred. After you have copied
the appropriate symbols to your evaluation directory your code might
give you "better" results.
Hope that helps,
Andre
> Perhaps you are missing the symbol files for Win2K ? Sometimes I couldn't
> evaluate dump files too, when I hadn't configured the symbol server
> environment variable, so that the required symbols are downloaded
> automatically.
>
> Link, which describes how to configure the environment variables.
> www.microsoft.com/whdc/devtools/debugging/debugstart.mspx
For all I see, the Win2K symbols are OK: when addresses begin with 007...
(such as for KERNEL32.dll described), the symbol does get displayed (or
rather written to a log). It is symbols and file names/lines for addresses
like 004... pointing to the executable itself that I have problems with.
> I don't know which code you are using, but all you normally need is to
> write a crash dump file and load it into the VS 2005 or WinDbg IDE. If you
> have properly configured the environment variable for the symbol server,
> VS 2005 will automatically download the needed symbols and you should be
> able to debug the dump file and all code lines should be displayed
> correctly.
My code calls SetUnhandledExceptionFilter() to set up a handler and when the
process (NT service in this case) crashes, calls various APIs (such as
SymGetSymFromAddr64() (SymFromAddr()) and SymGetLineFromAddr64() mentioned)
to get more information. The addresses themselves do make perfect sense to
me: they do get written into a log OK and using map files I could determine
both symbols and lines (when the latter were available with Visual Studio
.NET 2003).
So the problem appears to be the loading of the executable's own symbols
from its .pdb file which, together with the latest dbghelp.dll, is placed in
the same directory as the executable.
I do not use a debugger: it is an NT service executable and the mechanism is
used to record details of a crash when one happens.
Thank you.
Paul
Yes, maybe. But why should there be a difference if the crash occurred
on Win2K and Win2003 ? I only had a similar problem when the appropriate
symbols were missing or didn't match exactly, and assumed you might
have the same problem.
To force symbol matching: search for the tool chkmatch, which should
force the PDB to match an executable. I never tried the tool, so I don't
know if it works properly or just deletes your hard disk ;-)
> I do not use a debugger: it is an NT service executable and the mechanism is
> used to record details of a crash when one happens.
Perhaps you could verify if everything is OK with either the Windows
symbols or the symbol file of your executable by forcing a crash /
writing a memory dump of your executable and loading it into either
WinDBG / VStudio 2003 / VStudio 2005. The "modules" window will show you
if the symbols have been loaded correctly.
In case you don't already have enough information about crash dumps:
An article explaining how to generate a crash dump externally:
http://msdn.microsoft.com/msdnmag/issues/05/07/Debugging/default.aspx
A very good book about debugging and dump files, with sources showing how
to create a crash dump of your own:
J. Robbins - Windows Debugging .... ISBN: 0-7356-1536-5
If the debugger is able to load all symbols then your code should also
be able to load and use them.
> Thank you.
> Paul
No prob. Though it didn't help that much.
Andre
Hypothetically it is possible, but every time I compile a new executable, a
.pdb is created with it at the same time, and I then copy the two together
from the same build directory to a single directory on the server.
> Very good book about debugging, dump files and sources how to create a
> crash dump of your own:
>
> J. Robbins - Windows Debugging .... ISBN: 0-7356-1536-5
It is this very method that I have adapted and tried on its own successfully.
The author talks about differences among various Windows versions from a
debugging perspective, so I thought the Windows version might play a part.
> If the debugger is able to load all symbols then your code should also be
> able to load and use them.
Unfortunately I am not entirely free to do whatever I would like on the
server and my own computer does not have everything necessary for the
service to start.
Try to call SymGetModuleInfo64 to check if symbols (not exports!) have
been loaded for the modules you see on the call stack. Another troubleshooting
option is to set SYMOPT_DEBUG option (see SymSetOptions) and monitor
the debug output (e.g. using DebugView) while your application is loading
symbols (or install callback (SymRegisterCallback64) and print the debug
messages right to the log).
Also ensure that SYMOPT_LOAD_LINES option is set.
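The SymGetModuleInfo64 check suggested above could look roughly like the sketch below. It is only an illustration: hasDebugSymbols and moduleHasSymbols are hypothetical helper names, the SYM_TYPE stand-in exists only so the helper compiles outside Windows, and the real definitions come from dbghelp.h.

```cpp
#include <cassert>
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#else
// Stand-in mirroring dbghelp.h's SYM_TYPE so the helper compiles elsewhere.
enum SYM_TYPE { SymNone = 0, SymCoff, SymCv, SymPdb, SymExport,
                SymDeferred, SymSym, SymDia, SymVirtual };
#endif

// Hypothetical helper: true only when real debug symbols are loaded,
// i.e. not SymNone, not just the export table, and not a deferred load.
inline bool hasDebugSymbols(SYM_TYPE type) {
    switch (type) {
    case SymCoff: case SymCv: case SymPdb: case SymSym: case SymDia:
        return true;
    default:
        return false;
    }
}

#ifdef _WIN32
// Hypothetical helper: ask dbghelp about the module containing `address`
// and report the .pdb it loaded (empty if none).
inline bool moduleHasSymbols(HANDLE process, DWORD64 address,
                             std::string& pdbName) {
    IMAGEHLP_MODULE64 info = {};
    info.SizeOfStruct = sizeof(info);   // must be set before the call
    if (!::SymGetModuleInfo64(process, address, &info))
        return false;
    pdbName = info.LoadedPdbName;
    return hasDebugSymbols(info.SymType);
}
#endif
```

Calling moduleHasSymbols for each return address seen during the stack walk, and logging the result, would show which modules only have export symbols.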
Also note that this approach (stack walk in-process) will work reliably only
if you have symbols for all modules on the call stack (including system dlls).
That is, it is going to work only if you have control over the system
(and have downloaded all the necessary symbols beforehand, or configured
your application to use symbol server). If symbols are not available
for some reason, and the application crashes in a system dll, stack walk will fail
on some systems (e.g. those that use FPO optimizations, like Windows 2000,
XP SP1 and 2003 without SP). If you cannot ensure that symbols will
always be available, consider using minidumps.
--
Oleg
[VC++ MVP http://www.debuginfo.com/]
Oleg, thank you for your suggestion.
SYMOPT_LOAD_LINES was set but SymGetModuleInfo64() for LoadedPdbName
returned an empty string and for LineNumbers, FALSE (PdbUnmatched was
FALSE). Does it mean that the .pdb file for the executable has to be loaded
manually? I thought that even though paths may differ (since the executable
is compiled on a different machine, the path of its .pdb file written into
it will be different from the real one), provided the name of the file
remains the same, the required .pdb will be found. (Once compiled, the
executable and the corresponding .pdb file are copied to the same location
on the server.)
Paul
What is the value of IMAGEHLP_MODULE64.SymType? SymNone?
> Does it mean that the .pdb file for the executable has to be loaded
> manually?
Either manually (SymLoadModule64) or using SymInitialize with "invade" option
set to TRUE.
> I thought that even though paths may differ (since the executable
> is compiled on a different machine, the path of its .pdb file written into
> it will be different from the real one), provided the name of the file
> remains the same, the required .pdb will be found. (Once compiled, the
> executable and the corresponding .pdb file are copied to the same location
> on the server.)
>
The .pdb file in the same directory with the .exe should usually be found.
If it does not happen, SYMOPT_DEBUG option is the most effective
way to find out why.
Oleg
Yes, SymNone.
>> Does it mean that the .pdb file for the executable has to be loaded
>> manually?
>
> Either manually (SymLoadModule64) or using SymInitialize with "invade"
> option
> set to TRUE.
"invade" was set to TRUE.
One thing I did not mention but which can be important: I call
SymInitialize() after the crash, i.e. on shutdown rather than at startup,
the reason being that whilst this piece of code will provide additional
information about a crash if one occurs, it will not cause any problems on
startup should the executable be running on a system where something may be
lacking (be it symbols or proper dbghelp.dll). May this play a part?
Paul
It should be enough. Enable SYMOPT_DEBUG and check the log
to see why symbols cannot be found.
> One thing I did not mention but which can be important: I call
> SymInitialize() after the crash, i.e. on shutdown rather than at startup,
> the reason being that whilst this piece of code will provide additional
> information about a crash if one occurs, it will not cause any problems on
> startup should the executable be running on a system where something may be
> lacking (be it symbols or proper dbghelp.dll). May this play a part?
>
In a real application with real unknown exception - yes, it can.
SymInitialize (and other Sym* functions) can depend on some parts
of the application state (e.g. process heap) that can be corrupted at
the moment when they are called, and therefore they can fail.
That's why it is usually not recommended to run exception reporting
code in-process.
In a proof-of-concept case (when the application is configured to raise
an unhandled exception to test the exception handler), it cannot.
Btw, if you are simply looking for a way to troubleshoot an application
crash on the server, consider using JIT debugger instead (e.g. see the link
below), or ADPlus (part of Debugging Tools for Windows) - in general
these approaches are more reliable than in-process exception reporting.
http://www.debuginfo.com/articles/ntsdwatson.html
Oleg
I tried to set up a callback function but seem to be having problems. The
callback looks like this:
BOOL CALLBACK SymRegisterCallbackProc64(
    HANDLE hProcess,
    ULONG ActionCode,
    ULONG64 CallbackData,
    ULONG64 UserContext
)
{
    // cast UserContext
    // output "SymRegisterCallbackProc64: "
    switch (ActionCode) {
    case CBA_DEBUG_INFO:
        {
            // For CBA_DEBUG_INFO, CallbackData points to the message string.
            const char* str = reinterpret_cast<const char*>(CallbackData);
            // second output - output str
        }
        return TRUE;
    default:
        return FALSE;
    }
}
The first output ("SymRegisterCallbackProc64: ") works but it also turns out
to be the last one: the second output never appears. Am I doing something
wrong? (The code is compiled as non-Unicode.)
(For output, I tried both a buffer and a file (std::ofstream()).)
Thank you.
Paul
The callback itself looks correct.
SymRegisterCallback64 needs the process handle that was previously passed
to SymInitialize, which implies that SymRegisterCallback64 can be called only after
SymInitialize. But if you rely on "invade" in SymInitialize to load symbols,
all the symbol loading messages get reported while SymInitialize is running,
and before you have registered your callback. Thus the callback cannot get them -
that could be the problem in this case.
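One way around the ordering problem described above might be to initialize without "invade", register the callback, and only then load the executable's module by hand, so that the load messages reach the callback. The sketch below assumes this order works for the dbghelp.dll in use; trimDebugText, DebugCallback and initSymbolsWithLogging are hypothetical names, and the Windows-only part is guarded so the small text helper compiles elsewhere.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#endif

// Hypothetical helper: strip any trailing line break from a debug message
// so that each message becomes exactly one log line.
inline std::string trimDebugText(const char* text) {
    std::string s = text ? text : "";
    while (!s.empty() && (s.back() == '\n' || s.back() == '\r'))
        s.pop_back();
    return s;
}

#ifdef _WIN32
// Callback that writes dbghelp's CBA_DEBUG_INFO messages to stderr.
inline BOOL CALLBACK DebugCallback(HANDLE, ULONG action, ULONG64 data, ULONG64) {
    if (action != CBA_DEBUG_INFO)
        return FALSE;                       // not handled here
    std::string line = trimDebugText(reinterpret_cast<const char*>(data));
    std::fprintf(stderr, "DBGHELP: %s\n", line.c_str());
    return TRUE;
}

// Assumed order: options, SymInitialize without "invade", callback, load.
inline bool initSymbolsWithLogging(const char* exePath) {
    HANDLE process = ::GetCurrentProcess();
    ::SymSetOptions(::SymGetOptions() | SYMOPT_LOAD_LINES | SYMOPT_DEBUG);
    if (!::SymInitialize(process, NULL, FALSE))
        return false;
    ::SymRegisterCallback64(process, DebugCallback, 0);
    DWORD64 base = reinterpret_cast<DWORD64>(::GetModuleHandle(NULL));
    return ::SymLoadModule64(process, NULL, exePath, NULL, base, 0) != 0;
}
#endif
```

Only the main executable is loaded in this sketch; other modules on the call stack would need their own SymLoadModule64 calls.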
As an alternative, you can use DebugView to capture the messages:
http://www.sysinternals.com/Utilities/DebugView.html
Oleg
Sorry, perhaps the callback was not even called with this ActionCode. At
what point are symbols loaded? I call functions in this order:
1) SymSetOptions(::SymGetOptions() | SYMOPT_LOAD_LINES | SYMOPT_DEBUG)
2) SymInitialize(::GetCurrentProcess(), NULL, TRUE)
3) ::SymRegisterCallback64()
Paul
I used DebugView and here is the output of interest:
[2272] DBGHELP: _NT_SYMBOL_PATH: F:\WINNT\Symbols
[2272] DBGHELP: Symbol Search Path: .;F:\WINNT\Symbols
[2272] DBGHELP: SymSrv load failure: symsrv.dll
[2272] DBGHELP: .\oms.pdb - file not found
[2272] DBGHELP: .\symbols\exe\oms.pdb - file not found
[2272] DBGHELP: .\exe\oms.pdb - file not found
[2272] DBGHELP: F:\WINNT\Symbols\oms.pdb - file not found
[2272] DBGHELP: F:\WINNT\Symbols\symbols\exe\oms.pdb - file not found
[2272] DBGHELP: F:\WINNT\Symbols\exe\oms.pdb - file not found
[2272] DBGHELP: [location written into executable] - file not found
[2272] DBGHELP: OMS - no symbols loaded
One thing that looks a little surprising is that both the executable and
.pdb file are copied with a single mouse movement to the same location on
the server, yet there is this line in the output:
[2272] DBGHELP: .\oms.pdb - file not found
If we assume that the dot (.) means current directory, the only explanation
that comes to mind is that this may have something to do with the fact the
executable (and .pdb) are located on drive D rather than C - and the
operating system being Windows 2000 (Server, Service Pack 4).
Will this mean that under the circumstances it would be preferable to use
the manual loading and skip the "invade" option?
When I copied the .pdb file to F:\WINNT\Symbols (one of the directories
tried during search), according to the debug output it was found, with lines
and private symbols.
Yet I still had no luck with the ultimate goal: GetLastError() returned 'The
parameter is incorrect' when SymInitialize() returned FALSE. It should be
said this did occur every now and again but I do not really know what this
means. Initialisation now looks like this:
1) SymSetOptions(::SymGetOptions() | SYMOPT_LOAD_LINES)
2) SymInitialize(::GetCurrentProcess(), NULL, TRUE)
Thank you.
Paul
Usually dbghelp should look for the .pdb file in the same directory with
the executable, but in this case it doesn't. What version of dbghelp.dll is used?
> Will this mean that under the circumstances it would be preferable to use
> the manual loading and skip the "invade" option?
>
Probably not, since it is much simpler to use "invade". But of course
you can try manual loading too.
> When I copied the .pdb file to F:\WINNT\Symbols (one of the directories
> tried during search), according to the debug output it was found, with lines
> and private symbols.
>
You can try a workaround - before calling SymInitialize, use SetEnvironmentVariable
to add the path to your .exe file to _NT_SYMBOL_PATH environment variable.
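A minimal sketch of this workaround, assuming the executable's directory should simply be put in front of whatever _NT_SYMBOL_PATH already holds. The helpers directoryOf, buildSearchPath and initSymbolsNextToExe are hypothetical names, and the Windows-only part is guarded so the string helpers compile elsewhere.

```cpp
#include <cassert>
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#endif

// Hypothetical helper: directory part of a full path, or "." if there is none.
inline std::string directoryOf(const std::string& fullPath) {
    std::string::size_type pos = fullPath.find_last_of("\\/");
    return pos == std::string::npos ? std::string(".") : fullPath.substr(0, pos);
}

// Hypothetical helper: put the executable's directory first in a
// ';'-separated symbol search path.
inline std::string buildSearchPath(const std::string& exeDir,
                                   const std::string& existing) {
    return existing.empty() ? exeDir : exeDir + ";" + existing;
}

#ifdef _WIN32
inline bool initSymbolsNextToExe() {
    char exePath[MAX_PATH] = {};
    DWORD len = ::GetModuleFileNameA(NULL, exePath, MAX_PATH);
    if (len == 0 || len >= MAX_PATH)
        return false;                       // failed or truncated
    char existing[1024] = {};
    DWORD n = ::GetEnvironmentVariableA("_NT_SYMBOL_PATH", existing,
                                        sizeof(existing));
    if (n == 0 || n >= sizeof(existing))
        existing[0] = '\0';                 // unset or too long: ignore it
    std::string path = buildSearchPath(directoryOf(exePath), existing);
    if (!::SetEnvironmentVariableA("_NT_SYMBOL_PATH", path.c_str()))
        return false;
    return ::SymInitialize(::GetCurrentProcess(), NULL, TRUE) != FALSE;
}
#endif
```

The change to _NT_SYMBOL_PATH only affects the current process, so it should not disturb other tools on the server.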
> Yet I still had no luck with the ultimate goal: GetLastError() returned 'The
> parameter is incorrect' when SymInitialize() returned FALSE. It should be
> said this did occur every now and again but I do not really know what this
> means.
SymInitialize? Does it also fail?
Oleg
dbghelp.dll that has been put in the same directory as the executable is
version 6.6.7.5 (downloaded the latest a couple of days ago).
I also used ImagehlpApiVersion() to determine the version and this returns 4.0.5 (major
version.minor version.revision). I do not know what this means (there is no
dbghelp.dll - or image...dll - on the machine that will have this version)
and if there is a better way of identifying the .dll actually used, I will
certainly use that. On the other hand, after a particular attempt I could
not delete this .dll file which indicates it was indeed used.
> Probably not, since it is much simpler to use "invade". But of course
> you can try manual loading too.
>
>
> You can try a workaround - before calling SymInitialize, use
> SetEnvironmentVariable
> to add the path to your .exe file to _NT_SYMBOL_PATH environment variable.
What I did eventually was pass the path in the call to SymInitialize(),
having added the actual path of the directory with the executable. This
worked but with the same results (I tried various things including getting
the path out of GetModuleFileNameEx() as recommended but in the simplest
case just passed a string, hard-coded version of the path to use): 'The
parameter is incorrect'.
>> Yet I still had no luck with the ultimate goal: GetLastError() returned
>> 'The
>> parameter is incorrect' when SymInitialize() returned FALSE. It should be
>> said this did occur every now and again but I do not really know what
>> this
>> means.
>
> SymInitialize? Does it also fail?
Yes, it is SymInitialize() that fails and GetLastError() returns 'The
parameter is incorrect'. Before I got to the stage when debug output
indicated symbols were found and loaded, the error message only happened
every now and again; ever since the symbols became discoverable, this is
always the result. If the symbols cannot be found, I get a stack without
symbols; if they are found, I get this error message.
Paul
To trace what is going on, I used std::cerr with output redirected to a
file, and then I set up a counter to provide the information about how many
times the function passed in the call to SetUnhandledExceptionFilter() was
called. The code looked like this:
static int i = 0;
std::cerr << "Thread ID: " << ::GetCurrentThreadId()
          << ": ExceptionFilter entered " << ++i << " times.\n";
With the executable's .pdb made discoverable (and debugging output indicating
symbols were getting loaded), this was 202 times in one case, 157, in
another. It never got to outputting actual debugging information. Then the
sole change I made was rename oms.pdb -> oms1.pdb (to hide the file with the
executable's symbols) and repeat the experiment. This time the filter
function was called only once and got the debugging output but without
symbols (the situation from which I started).
Paul
I guess the application state is already corrupted at the moment when
the exception filter is called, and an exception is raised inside of the filter
(probably in dbghelp's code). It causes recursive calls to the filter.
This corruption could also explain why SymInitialize does not behave
like it should.
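If the filter does get re-entered like this, a simple guard can at least stop the recursion so that the first report is not buried. The sketch below is an assumption about how one might do it, not a fix for the underlying corruption; tryEnterFilter and GuardedFilter are hypothetical names, and std::atomic requires a C++11 compiler.

```cpp
#include <atomic>
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Set once the filter has been entered; never reset, since the process is
// going down anyway.
static std::atomic<bool> g_filterEntered(false);

// Hypothetical guard: true for the first caller only, false on any re-entry.
inline bool tryEnterFilter() {
    return !g_filterEntered.exchange(true);
}

#ifdef _WIN32
// Installed with ::SetUnhandledExceptionFilter(GuardedFilter).
inline LONG WINAPI GuardedFilter(EXCEPTION_POINTERS* info) {
    if (!tryEnterFilter())
        return EXCEPTION_CONTINUE_SEARCH;   // fault inside the filter itself
    // ... stack walk and logging for `info` would go here ...
    (void)info;
    return EXCEPTION_EXECUTE_HANDLER;
}
#endif
```

With the guard in place, a second exception raised while reporting falls through to the default handling instead of calling the reporting code again.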
Process heap corruption is the most probable suspect in this case,
so try to test the application with PageHeap enabled, e.g. as shown here:
http://www.debuginfo.com/tips/userbpntdll.html
Oleg
I tried but could not really get to the problem.
My original test (deleting a pointer several times) was probably not a very
good one, so I reverted to the original problem which I investigated and so
knew the causes of. With this one, with Visual Studio 2005 debugger
attached, the access violation was caught immediately by the debugger and I
could not really take it any further to invoke the filter routine. Without
the debugger, it all appeared as before: the filter routine was called more
than once.
Paul
There are various ways to debug the exception filter:
http://www.debuginfo.com/articles/debugfilters.html
:)
Also, it probably will be useful to configure the debugger to break
on first chance exceptions.
Oleg
As luck would have it, during one of my attempts the information I got
pointed not to what I was expecting but to the output from
SymGetModuleInfo64(), which I had inserted between SymGetLineFromAddr64()
and StackWalk64() (the latter called repeatedly) since we wanted to get the
information about symbols loaded.
It is a little surprising since when dbghelp.dll could not find the symbols
for my executable, the code worked fine and produced sensible output but
with the symbols .pdb file identified, the presence of SymGetModuleInfo64()
sent it all into a spin.
With lines calling SymGetModuleInfo64() commented out, the code worked as
expected.
Thank you.
Paul