On Sun, 08 Feb 2015 23:13:34 -0500, DSF <nota...@address.here>
wrote:
Hello.
Info below.
The compiler treats initialization of a structure the same way it
treats initialization of a character array. As in:
void Bar(void)
{
char foo[] = {"This is foo!"};
...
In this case, the text is not constant, but is expected to contain
"This is foo!" after the above line every time Bar is run. Most
compilers, I imagine, will store the literal text string in a static
area and copy it to local "foo" memory at the start of Bar.
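A minimal sketch of that expectation, for anyone who wants to check it on their own compiler. The function names are made up for illustration; only the declaration of foo comes from the snippet above.

```c
#include <assert.h>
#include <string.h>

/* Reports whether foo held the literal text on entry, then modifies it. */
static int foo_was_fresh(void)
{
    char foo[] = {"This is foo!"};   /* re-initialized on every call */
    int fresh = (strcmp(foo, "This is foo!") == 0);

    foo[0] = 't';                    /* legal: foo is a writable local copy */
    return fresh;
}

/* Calls the function twice; the second call must not see the earlier edit. */
static void check_foo(void)
{
    assert(foo_was_fresh());
    assert(foo_was_fresh());
}
```

Calling check_foo() should pass silently on a conforming compiler, however the compiler chooses to store the literal.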
The code:
UINT array[20] = {1, 2, 3, 4};
Produces:
_DATA segment dword public use32 'DATA'
align 4
$mkbknmaa label dword
dd 1
dd 2
dd 3
dd 4
db 64 dup(?)
_DATA ends
...
; UINT array[20] = {1, 2, 3, 4};
mov esi,offset $mkbknmaa
lea edi,dword ptr [ebp-116]
mov ecx,20
rep movsd
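Note that the copy moves all 20 dwords, not just the four listed ones. The C standard requires the elements without an explicit initializer to end up zero, so the 64 reserved bytes have to be zero in the image. A small sketch to confirm the visible behavior; the typedef of UINT is my assumption and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int UINT;   /* assumption: UINT is a plain unsigned int */

/* Returns element i of a freshly initialized local array (i must be < 20). */
static UINT element_after_init(size_t i)
{
    UINT array[20] = {1, 2, 3, 4};   /* elements 4..19 are set to zero */
    return array[i];
}

/* Checks the listed values and the implicitly zeroed tail. */
static void check_array(void)
{
    size_t i;

    assert(element_after_init(0) == 1);
    assert(element_after_init(3) == 4);
    for (i = 4; i < 20; i++)
        assert(element_after_init(i) == 0);
}
```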
I believe this is what my compiler is automatically doing with any
structure or array initialization. It stores the literal numeric
values in a static area and then copies them into the structure. If I
change the original to:
IOERRORS ret;
ret.ioerror = 0;
ret.syserror = 0;
It becomes:
; IOERRORS ret; // = {0, 0};
; ret.ioerror = 0;
@6:
xor eax,eax
mov dword ptr [ebp-8],eax
;
; ret.syserror = 0;
xor edx,edx
mov dword ptr [ebp-4],edx
Still stupid code. And it still creates/uses a local ret and copies
it to the LHS pointer on the stack at the end, even though there is
only one exit point. But at least I understand now; it's
boilerplate for initialization.
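For reference, the two source forms side by side. As far as the language is concerned they must leave ret in the same state; only the generated code differs. The struct layout is my guess (two unsigned members, matching the 8 bytes being copied), and the function names are hypothetical.

```c
#include <assert.h>

/* Assumed layout: two unsigned members, 8 bytes on this target. */
typedef struct {
    unsigned int ioerror;
    unsigned int syserror;
} IOERRORS;

/* Form 1: initializer list (the compiler may copy from a static image). */
static IOERRORS init_with_initializer(void)
{
    IOERRORS ret = {0, 0};
    return ret;
}

/* Form 2: member-by-member assignment (stores immediates directly). */
static IOERRORS init_by_assignment(void)
{
    IOERRORS ret;
    ret.ioerror = 0;
    ret.syserror = 0;
    return ret;
}

/* Both forms must produce the same member values. */
static void check_equivalent(void)
{
    IOERRORS a = init_with_initializer();
    IOERRORS b = init_by_assignment();
    assert(a.ioerror == b.ioerror && a.syserror == b.syserror);
}
```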
> 2. The memory locations are referenced in entirely different ways.
That, as has been mentioned, is in linker territory.
> 3. [FBaseString<wchar_t>::blank + 0x30] refers to an offset from a
>string template class static data member and doesn't even belong in a
>C source file. (The C file *is* being compiled by the C compiler.)
>The only reason this C file is part of the project and not a library
>call is so that I can debug it properly.
It turns out that FBaseString<wchar_t>::blank + 0x30 resolves to...
...wait for it...
Address 0x42a694! Which explains:
> 4. I have no idea what/where [0x42a698] refers to.
It's the address following FBaseString<wchar_t>::blank + 0x30!
So there we have our static storage of two unsigned integer zeros.
Thanks to Ian Collins for putting me onto setting a breakpoint on
write, which forced me to track down the address of
FBaseString<wchar_t>::blank + 0x30. Provided by going into a function
of said template and placing a watch on &blank. Then setting a write
breakpoint for 8 bytes at address 0x42a694.
Whether it's a compiler bug or an error of mine is yet to be
determined. But address 0x42a694 lies right in the middle of a buffer
of FBaseString and is overwritten near the end of the program loop
that this code is within. That causes all subsequent loop passes to
fail, because GetVolumeInfo no longer returns 0, 0.
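The failure mode can be mimicked in portable C. This is only a simulation under my own assumptions: init_image stands in for the compiler's hidden static copy of {0, 0}, and stray_write stands in for whatever overwrites that buffer. None of these names exist in the real project.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    unsigned int ioerror;
    unsigned int syserror;
} IOERRORS;   /* assumed layout */

/* Stand-in for the hidden static image; zero-filled at program start. */
static unsigned char init_image[sizeof(IOERRORS)];

/* Mimics "IOERRORS ret = {0, 0};" implemented as a block copy. */
static IOERRORS make_ret(void)
{
    IOERRORS ret;
    memcpy(&ret, init_image, sizeof ret);
    return ret;
}

/* Stand-in for the stray write that lands on the image. */
static void stray_write(void)
{
    memset(init_image, 0xFF, sizeof init_image);
}

/* First pass sees zeros; every pass after the stray write does not. */
static void check_corruption(void)
{
    IOERRORS first = make_ret();
    assert(first.ioerror == 0 && first.syserror == 0);

    stray_write();

    IOERRORS later = make_ret();
    assert(later.ioerror != 0 && later.syserror != 0);
}
```

The initializer in the source never changes, yet the "initialized" value does, which is exactly what made this so confusing to track down.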
I've also learned (at least while I'm using this compiler) to avoid
multiple exit points if I'm returning a structure. As a test I added
three return ret; statements to GetVolumeInfo. Each one produced:
mov eax,dword ptr [ebp+8]
mov edx,dword ptr [ebp-8]
mov dword ptr [eax],edx
mov edx,dword ptr [ebp-4]
mov dword ptr [eax+4],edx
mov eax,dword ptr [ebp+8]
jmp @12
Each one copies the *same* local variable through the return pointer
passed on the stack. They may use different registers to do it, but
it's an exit point, negating any later requirements on the register
values.
The last one even has a convenient label. So instead of repeating
the 17-byte sequence each time, they should have dumped every mov
above and changed the jmp @12 to jmp @8!
@8:
mov ecx,dword ptr [ebp+8]
mov eax,dword ptr [ebp-8]
mov dword ptr [ecx],eax
mov eax,dword ptr [ebp-4]
mov dword ptr [ecx+4],eax
mov eax,dword ptr [ebp+8]
@12:
pop edi
pop esi
pop ebx
mov esp,ebp
pop ebp
ret
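One way to get a single exit point at the source level is to funnel every early exit to one label. This is a rough sketch, not the real GetVolumeInfo: the function name, the step parameter and the error values are all invented for illustration, and I can't promise every compiler will then emit only one copy sequence.

```c
#include <assert.h>

typedef struct {
    unsigned int ioerror;
    unsigned int syserror;
} IOERRORS;   /* assumed layout */

/* Hypothetical stand-in: `step` selects which check "fails". */
static IOERRORS get_volume_info_sketch(int step)
{
    IOERRORS ret;
    ret.ioerror = 0;
    ret.syserror = 0;

    if (step == 1) {
        ret.ioerror = 1;     /* made-up error code */
        goto done;
    }
    if (step == 2) {
        ret.syserror = 2;    /* made-up error code */
        goto done;
    }
    /* normal path falls through with ret still {0, 0} */

done:
    return ret;              /* the only return statement */
}

/* Exercises the success path and both failure paths. */
static void check_single_exit(void)
{
    IOERRORS ok = get_volume_info_sketch(0);
    IOERRORS io = get_volume_info_sketch(1);
    IOERRORS sys = get_volume_info_sketch(2);

    assert(ok.ioerror == 0 && ok.syserror == 0);
    assert(io.ioerror == 1 && io.syserror == 0);
    assert(sys.ioerror == 0 && sys.syserror == 2);
}
```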
Enough of this off-topic typing. At least I think I know why some
of the strange code is implemented the way it is. And I know where
the error is. FBaseString<wchar_t>::blank is a const static value, so
it's reasonable that 0x30 farther on (0x42a694) is still in the static
area. The memory allocated for the string starts at 0x42a664. I
don't have an idea off the top of my head how to determine if it's a
bug or corrupt memory data. 0x42a664 could be a segment of a string,
indicating a string has written over memory manager addresses, except
this is a 16-bit Unicode project and I don't believe there are any
ASCII strings. This will be the next challenge, as initializing ret
members individually (thus avoiding the address conflict) is only
putting a band-aid on a problem that's sure to arise when I least
expect it and byte me in the ass!
Thanks for all your advice and patience!