DLL Injection on suspended process fails

aela...@gmail.com

May 2, 2006, 9:27:59 PM

My code fails consistently on some computers, i.e., on my laptop it
fails every time, but on my desktop it never fails.

What it does is spawn the process with CreateProcess, passing in the
CREATE_SUSPENDED flag; then it just injects the DLL and resumes the
process. On some computers the process never resumes, it just sits
there like a zombie. This happens consistently on these computers. I've
tried changing the DLL it's injecting to something not mine,
ssleay32.dll, and I've even tried having it target notepad.exe on these
computers, and the same thing happens.

The ONLY way to stop it from happening seems to be commenting out my
WaitForSingleObject on the remote thread. If I comment that out,
injection works about 75% of the time on the afflicted computers. I've
had this happen on 4 computers, all of them running XP SP2, and nothing
else seems to be common between them.

I've loaded a debugger up and checked the zombie'd processes and they
are frozen nowhere near my
GetProcAddress(GetModuleHandle("kernel32.dll"), "LoadLibraryA") call.
If it matters, the debugger shows two threads in this app even though
it was spawned with CREATE_SUSPENDED.

void CPartyLobby::SpawnApp() {
    STARTUPINFO sStartInfo;
    PROCESS_INFORMATION sProcessInformation;
    ZeroMemory(&sStartInfo, sizeof(STARTUPINFO));
    ZeroMemory(&sProcessInformation, sizeof(PROCESS_INFORMATION));
    sStartInfo.cb = sizeof(STARTUPINFO);

    char cPP[MAX_PATH];

    OptionManager::GetPath(cPP);

    if(!CreateProcess(NULL, cPP, NULL, NULL, TRUE,
                      NORMAL_PRIORITY_CLASS | CREATE_SUSPENDED, NULL,
                      NULL, &sStartInfo, &sProcessInformation)) {
        DEBUG("CreateProcess failed on startup string: %s\n", cPP);
        return; // no process or thread handle to inject into or resume
    }
    DEBUG("CreateProcess succeeded pid = %x string: %s\n",
          sProcessInformation.dwProcessId, cPP);

    m_dPid = sProcessInformation.dwProcessId;

    DEBUG("Created thread %x\n", sProcessInformation.hThread);
    // Inject the dll then resume the process
    DWORD dExit = Inject(sProcessInformation.hProcess);
    DEBUG("Inject returned %x\n", dExit);
    DEBUG("Resuming thread %x\n", sProcessInformation.hThread);
    ResumeThread(sProcessInformation.hThread);
}

DWORD Inject(HANDLE hProcess) {
    char cDLLFile[MAX_PATH];
    GetCurrentDirectory(MAX_PATH, cDLLFile);
    strcat(cDLLFile, "\\MyDLL.dll");
    DEBUG("Trying to inject DLL from %s\n", cDLLFile);

    SIZE_T LenWrite = strlen(cDLLFile) + 1;
    char * AllocMem;
    LPTHREAD_START_ROUTINE Injector;
    HANDLE hThread;
    DWORD Result;

    // Room for the DLL path inside the target process
    AllocMem = (char *) VirtualAllocEx(hProcess, NULL, LenWrite,
                                       MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    if(AllocMem == NULL) {
        DEBUG("VirtualAllocEx failed in Inject with error %d\n",
              GetLastError());
        return 0;
    }
    DEBUG("VirtualAllocEx succeeded in Inject\n");

    if(!WriteProcessMemory(hProcess, AllocMem, cDLLFile, LenWrite, NULL)) {
        DEBUG("WriteProcessMemory failed in Inject with error %d\n",
              GetLastError());
        VirtualFreeEx(hProcess, AllocMem, 0, MEM_RELEASE);
        return 0;
    }
    DEBUG("WriteProcessMemory succeeded in Inject\n");

    Injector = (LPTHREAD_START_ROUTINE) GetProcAddress(
        GetModuleHandle("kernel32.dll"), "LoadLibraryA");
    if(!Injector) {
        DEBUG("[!] Error while getting LoadLibraryA address.\n");
        VirtualFreeEx(hProcess, AllocMem, 0, MEM_RELEASE);
        return 0;
    }

    DWORD dwThreadId;
    hThread = CreateRemoteThread(hProcess, NULL, 0, Injector,
                                 (void *) AllocMem, 0, &dwThreadId);
    if(!hThread) {
        DEBUG("[!] Cannot create thread.\n");
        VirtualFreeEx(hProcess, AllocMem, 0, MEM_RELEASE);
        return 0;
    }
    DEBUG("It appears that the thread was created successfully %x\n",
          hThread);

    DWORD dRes = 0;

    // Block until the remote LoadLibraryA call returns
    Result = WaitForSingleObject(hThread, INFINITE);
    if(Result != WAIT_OBJECT_0)
        DEBUG("[!] Wait on remote thread failed.\n");
    else
        DEBUG("Thread ended!\n");

    GetExitCodeThread(hThread, &dRes);

    VirtualFreeEx(hProcess, (void *) AllocMem, 0, MEM_RELEASE);
    CloseHandle(hThread);

    DEBUG("Returning from Inject with exit code %x\n", dRes);

    return dRes;
}

Lucian Wischik

May 2, 2006, 10:03:49 PM

aela...@gmail.com wrote:
>What it does is Spawn the process with CreateProcess passing in the
>CREATE_SUSPENDED flag, then it just injects the DLL and resumes the
>process. On some computers the process never resumes, it just sits
>there like a zombie. This happens consistently on these computers.

(I don't know the answer but...)

This reminds me of behaviour I see often, when certain applications
block until certain other applications do something. I remember when
new application windows would refuse to appear whenever Word was
frozen, for instance. Or yesterday, Start>MyComputer failed to pop up
any Explorer windows until after Remote Desktop Connection had been
closed, and then it did all of them at once. I've long wondered about
why this should be. My guess had been something to do with injected
DLLs that hooked windows but where the authors failed to consider all
possible blocking scenarios, or maybe a similar bug in csrss.exe (or
whatever was the name of that Windows system process which keeps track
of every window+thread in the system).

--
Lucian

Alf P. Steinbach

May 2, 2006, 10:04:15 PM

* aela...@gmail.com:

> My code fails on some computers, consistently. ie, on my laptop it
> fails every time, but on my desktop never fails.
>
> What it does is Spawn the process with CreateProcess passing in the
> CREATE_SUSPENDED flag, then it just injects the DLL and resumes the
> process. On some computers the process never resumes, it just sits
> there like a zombie. This happens consistently on these computers. I've
> tried changing the DLL it's injecting to something not mine,
> ssleay32.dll and I've even tried having it target notepad.exe on these
> computers and the same thing happens.
>
> The ONLY way to stop it from happening seems to be commenting out my
> WaitForSingleObject on the remote thread. If I comment that out,
> injection works about 75% of the time on the afflicted computers. I've
> had this happen on 4 computers all of them running XP sp2 and nothing
> else seems to be common between them.

The documentation of CreateRemoteThread mentions that

<q>
The ExitProcess, ExitThread, CreateThread, CreateRemoteThread
functions, and a process that is starting (as the result of a
CreateProcess call) are serialized between each other within a
process. Only one of these events can happen in an address space at a
time. This means the following restrictions hold:

* During process startup and DLL initialization routines, new threads
can be created, but they do not begin execution until DLL
initialization is done for the process.

* Only one thread in a process can be in a DLL initialization or
detach routine at a time.

* ExitProcess does not return until no threads are in their DLL
initialization or detach routines.
</q>

Since you're creating the process suspended, then creating a thread in
that process, and waiting for it, you should according to the quote
above /always/ wait for eternity.

That it doesn't always happen may be due to faulty documentation, or
that I've misunderstood the documentation (the phrase "DLL
initialization is done" seems to be crucial), or some bug in Windows, or
a bug in your code. E.g., are you sure that [kernel32.dll] is mapped
to the same address in all processes (for your passing of thread
function to CreateRemoteThread; it needs to exist in that process)? I
guess the thing to do is get hold of Jeffrey Richter's "Advanced
Windows", or whatever the modern version of it is named.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

James Brown [MVP]

May 3, 2006, 4:04:00 AM

"Alf P. Steinbach" <al...@start.no> wrote in message
news:4bqhd3F...@individual.net...

There has been a lot of discussion of this topic over the years. The reason
why injecting into a suspended process (on creation) is bad is that the
Win32 environment (for that process) has not been initialized yet. Not only
environment variables as such, but basic state for APIs, COM, all that kind
of stuff. When the injected thread runs, the primary thread that is supposed
to be doing the initialization of the process still isn't running, and your
injected code runs in an environment that is not set up yet. A lot of API
calls will fail outright - any that require state. It is not a good idea to
be injecting code this way. Remember your 'win32' app is not supposed to run
until the moment the OS calls the process entry point in the executable. So
the preferred method is to patch (with a JMP) the exe entry point and gain
execution that way - then transfer control back to the real WinMainStartup
when you've finished.
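
A minimal sketch of the "patch with a JMP" step, assuming a 32-bit x86 target and the 5-byte near jump (opcode E9 followed by a 32-bit displacement). The helper name encode_jmp_rel32 is made up for illustration. It only builds the bytes; writing them into the other process (for example with WriteProcessMemory) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of an x86 near JMP: opcode E9 plus a 32-bit displacement. */
#define JMP_REL32_SIZE 5

/*
 * Encode "JMP target" as it would sit at address `from` in a 32-bit
 * process. The displacement is measured from the end of the JMP
 * instruction, not from its start. (Hypothetical helper.)
 */
static void encode_jmp_rel32(uint32_t from, uint32_t target,
                             uint8_t out[JMP_REL32_SIZE])
{
    assert(out != NULL);

    /* Unsigned arithmetic wraps mod 2^32, which also covers backward jumps. */
    uint32_t rel = target - (from + JMP_REL32_SIZE);

    out[0] = 0xE9;                          /* JMP rel32 opcode */
    out[1] = (uint8_t)(rel & 0xFF);         /* displacement, little-endian */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
}
```

Here `from` would be the entry point address in the target and `target` the address where the injected code was written.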

James
--
Microsoft MVP - Windows SDK
www.catch22.net
Free Win32 Source and Tutorials

aela...@gmail.com

May 3, 2006, 6:50:40 AM

Well correct me here if I'm wrong, but if you enumerate the loaded
libraries of even a CREATE_SUSPENDED process the important ones are
there. They get mapped in when the exe is mapped in. I doubt their
DllMains have run, but, I can't imagine kernel32.dll has much in the
way of a dllmain. The locale is stored per-thread so there might be
some initialization there, but...even so I can't imagine that this
would render LoadLibrary unfunctional.

I thought maybe that the loader lock was taken and stopping my dll from
being initialized, and then an unsuspend would release the lock, but my
WaitForSingleObject returns immediately so this can't be the case.

I'm going to go ahead and try to patch the entry point, but if anyone
could shed some light on these issues for me I'd appreciate it.

Olof Lagerkvist

May 3, 2006, 7:37:09 AM

aela...@gmail.com wrote:

> Well correct me here if I'm wrong, but if you enumerate the loaded
> libraries of even a CREATE_SUSPENDED process the important ones are
> there. They get mapped in when the exe is mapped in. I doubt their
> DllMains have run, but, I can't imagine kernel32.dll has much in the
> way of a dllmain. The locale is stored per-thread so there might be
> some initialization there, but...even so I can't imagine that this
> would render LoadLibrary unfunctional.

(This applies to NT series of Windows:)
Actually, lots of initialization code for using the Win32 subsystem is
run before the entry point is actually called. Before this, it is really
only safe to call functions in ntdll.dll that do not require any state
etc. kept for the process or thread by the Win32 subsystem. You can see
how this is done by single-stepping in a debugger in assembly mode the
startup of a process. Look for call to ntdll!CsrClientConnectToServer etc.

The native ntdll.dll functions are, however, not documented.

> I'm going to go ahead and try to patch the entry point, but if anyone
> could shed some light on these issues for me I'd appreciate it.

Patching the entry point is probably the only way to go in your case.

--
Olof Lagerkvist
ICQ: 724451
Web: http://here.is/olof

James Brown [MVP]

May 3, 2006, 1:36:39 PM

<aela...@gmail.com> wrote in message
news:1146653440.1...@g10g2000cwb.googlegroups.com...

> Well correct me here if I'm wrong, but if you enumerate the loaded
> libraries of even a CREATE_SUSPENDED process the important ones are
> there. They get mapped in when the exe is mapped in. I doubt their
> DllMains have run,

correct

>I can't imagine kernel32.dll has much in the way of a dllmain.

kernel32 is one of the most important system DLLs; don't assume anything
about its operation.

>The locale is stored per-thread so there might be
> some initialization there, but...even so I can't imagine that this
> would render LoadLibrary unfunctional.

LoadLibrary works perfectly, it is the code you try to execute that will
fail.

>
> I thought maybe that the loader lock was taken and stopping my dll from
> being initialized, and then an unsuspend would release the lock, but my
> WaitForSingleObject returns immediately so this can't be the case.
>
> I'm going to go ahead and try to patch the entry point, but if anyone
> could shed some light on these issues for me I'd appreciate it.
>

I have already explained what is happening to your program; why would I be
making it up? Do a Google search on CreateRemoteThread+CREATE_SUSPENDED to
see what's been said about this over the years.

aela...@gmail.com

May 3, 2006, 8:43:05 PM

James,

Not accusing you of lying, was just a little confused about the
behaviour. I would understand a crash, but I don't understand a
zombie'd process.

In any case the info is much appreciated.

Yu

May 4, 2006, 4:49:02 AM

James Brown [MVP] wrote:

Thanks too.

Stephen Kellett

May 4, 2006, 9:36:07 AM

In message <1146703385....@i40g2000cwc.googlegroups.com>,
aela...@gmail.com writes

>Not accusing you of lying, was just a little confused about the
>behaviour. I would understand a crash, but I don't understand a
>zombie'd process.

A zombie'd process would effectively be caused by a deadlock of some
sort. That may be caused by the Win32 synchronisation mechanisms or some
under the hood weirdness in the core OS dlls that we know nothing about
and that only applies during startup.

Given what James has said about some bits of data being uninitialized at
startup in the scenario you refer to, it's quite conceivable that you may
be running into a deadlock-related problem.

I've written a lot of code that uses CreateRemoteThread and the tests we
ran (on thousands of apps, automatically - we created a test suite for
it) show the following:

OS         NT 4.0   W2K   XP
Failures   2%       3%    5%

That is the percentage of apps for which injecting using CreateRemoteThread
will fail. It seems to be OS dependent as well as application dependent.
I wouldn't be at all surprised to find that different hardware would
also influence the outcome.

Our preferred method is using CreateProcess to do the injection. Jeffrey
Richter's books give you a brief clue (similar to the clue elsewhere in
this thread) on how to do it. Be aware that it can get more complex than
it initially looks. Not for the faint hearted or those shy of assembly
language.

Stephen
--
Stephen Kellett
Object Media Limited http://www.objmedia.demon.co.uk/software.html
Computer Consultancy, Software Development
Windows C++, Java, Assembler, Performance Analysis, Troubleshooting

aela...@gmail.com

May 4, 2006, 1:25:48 PM

Well, I've got a halfway working solution right now.

I basically spawn the process as a debuggee, then in the create-process
event of the debugger loop I overwrite the entry point of the process
with an int3. Then when my breakpoint hits I inject my code, and copy
the old code back over the int3.

This causes a couple of issues: 1) I can't WaitForSingleObject on the
injected thread while I'm in the breakpoint handler, it just waits
forever; 2) I have to have a debugger loop running, and it'll kill the
debuggee when my app exits.

I suppose I'm going to have to bite the bullet and actually patch it
and jump in to my code...the way I envision it going is:
I spawn the process suspended, copy in my code which contains at the
end the first instruction at the entry point of the target app and a
jump back to the instruction immediately after, then overwrite the
entry point with a jmp to my newly copied in code. Then resume the
process.

The only problem being I haven't touched assembly in quite a while so
this is going to be interesting to say the least. Do I have the right
idea here?

Stephen Kellett

May 4, 2006, 2:28:53 PM

In message <1146763548.6...@g10g2000cwb.googlegroups.com>,
"aela...@gmail.com" <aela...@gmail.com> writes

>The only problem being I haven't touched assembly in quite a while so
>this is going to be interesting to say the least. Do I have the right
>idea here?

Yes. Should take you a day or two to get it right, depending upon your
approach and how rusty your i386 assembly is and how well you can turn
that assembly into bytes for you to stuff into your patch (you won't be
able to code this in straight assembly as it'll be in the wrong process,
although you may be able to block copy some code from one process to
another).

Richie Hindle

May 5, 2006, 7:32:21 AM

[aelaguiz]

> I spawn the process suspended, copy in my code which contains at the
> end the first instruction at the entry point of the target app and a
> jump back to the instruction immediately after, then overwrite the
> entry point with a jmp to my newly copied in code. Then resume the
> process.

I don't know whether it matters to your application, but this system
will mean that your code doesn't run until after all the DllMains have
run. The only way I can think of to work around that is to patch all
the DllMains in the same way - tedious but not difficult.

Also, there's no need to muck about with the first instruction of the
app's entry point - you can copy away the first few bytes, overwrite
them with your JMP instruction, then at the end of your code put back
the first few bytes and jump back to the original entry point. That
way you don't have to figure out how many bytes that first instruction
uses.
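
A rough sketch of that save / overwrite / put-back sequence, run against a local byte buffer that stands in for the target's entry point. The EntryPatch type and both function names are invented for illustration, and it assumes a 32-bit target with a 5-byte E9 jump. In a real injector the patching happens in the other process's memory, and the put-back happens inside the injected code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATCH_SIZE 5 /* one x86 JMP rel32: opcode E9 plus 4 displacement bytes */

/* Hypothetical bookkeeping record; not a Win32 type. */
typedef struct {
    uint8_t saved[PATCH_SIZE]; /* original bytes taken from the entry point */
} EntryPatch;

/*
 * Copy away the first PATCH_SIZE bytes at `entry`, then overwrite them with
 * a JMP to `hook_addr`. `entry_addr` is the address those bytes have inside
 * the target process; `entry` is where we can reach them locally.
 */
static void patch_entry(EntryPatch *p, uint8_t *entry,
                        uint32_t entry_addr, uint32_t hook_addr)
{
    assert(p != NULL && entry != NULL);
    memcpy(p->saved, entry, PATCH_SIZE);

    uint32_t rel = hook_addr - (entry_addr + PATCH_SIZE);
    entry[0] = 0xE9;
    entry[1] = (uint8_t)(rel & 0xFF);
    entry[2] = (uint8_t)((rel >> 8) & 0xFF);
    entry[3] = (uint8_t)((rel >> 16) & 0xFF);
    entry[4] = (uint8_t)((rel >> 24) & 0xFF);
}

/* Put the saved bytes back so the original entry point can run unmodified. */
static void restore_entry(const EntryPatch *p, uint8_t *entry)
{
    assert(p != NULL && entry != NULL);
    memcpy(entry, p->saved, PATCH_SIZE);
}
```

Because whole bytes are saved and restored, nothing here needs to know where the first instruction ends, which is the point of the suggestion above.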

--
Richie Hindle
ric...@entrian.com

aela...@gmail.com

May 5, 2006, 10:00:35 PM

Well I decided to write the whole injection process in assembly, mostly
just to see if I could. I haven't written x86 assembly in so long.

Anyhow, I've gotten my code being injected into the process, and I've
overwritten the entry point with a jump to my code, and my code is
actually executing! Then promptly crashing! But I'm getting there.

aela...@gmail.com

May 5, 2006, 11:40:38 PM

Success! Thanks for the help guys.

One question, I had to hardcode the location of LoadLibrary into my
injected code as seen:

PUSH pString
MOV eax, 7C801D77h
CALL eax

Is there any way to get around doing that?

Stephen Kellett

May 6, 2006, 6:50:52 AM

In message <1146886837.9...@i39g2000cwa.googlegroups.com>,
"aela...@gmail.com" <aela...@gmail.com> writes

Yes. Do a GetProcAddress in the process doing the injection to get the
address of LoadLibrary and then use that address in your mov
instruction. The kernel32.dll is always loaded at the same address in
each app so LoadLibrary address won't change. But you can't hard code it
like that as the address may vary from machine to machine.
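
As a sketch of "use that address in your mov instruction": the three-instruction stub can be assembled as bytes at run time, with both addresses passed in rather than hardcoded. The function names are made up, and a 32-bit x86 target is assumed. In the injector, load_library_addr would come from GetProcAddress(GetModuleHandle("kernel32.dll"), "LoadLibraryA") and path_addr from wherever the DLL path was written in the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* push imm32 (5 bytes) + mov eax, imm32 (5 bytes) + call eax (2 bytes) */
#define STUB_SIZE 12

/* Store a 32-bit value in little-endian byte order. */
static void put_u32le(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xFF);
    dst[1] = (uint8_t)((v >> 8) & 0xFF);
    dst[2] = (uint8_t)((v >> 16) & 0xFF);
    dst[3] = (uint8_t)((v >> 24) & 0xFF);
}

/*
 * Emit:  PUSH path_addr / MOV EAX, load_library_addr / CALL EAX
 * Both addresses are values that are valid inside the target process.
 * (Hypothetical helper; the caller still has to copy the bytes across.)
 */
static void build_loadlibrary_stub(uint32_t path_addr,
                                   uint32_t load_library_addr,
                                   uint8_t out[STUB_SIZE])
{
    assert(out != NULL);
    out[0] = 0x68;                       /* PUSH imm32 */
    put_u32le(&out[1], path_addr);
    out[5] = 0xB8;                       /* MOV EAX, imm32 */
    put_u32le(&out[6], load_library_addr);
    out[10] = 0xFF;                      /* CALL EAX */
    out[11] = 0xD0;
}
```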

David Jones

May 6, 2006, 8:06:38 AM

Stephen Kellett wrote:
> In message <1146886837.9...@i39g2000cwa.googlegroups.com>,
> "aela...@gmail.com" <aela...@gmail.com> writes
>
>> Success! Thanks for the help guys.
>>
>> One question, I had to hardcode the location of LoadLibrary into my
>> injected code as seen:
>>
>> PUSH pString
>> MOV eax, 7C801D77h
>> CALL eax
>>
>> Is there any way to get around doing that?
>
> Yes. Do a GetProcAddress in the process doing the injection to get the
> address of LoadLibrary and then use that address in your mov
> instruction. The kernel32.dll is always loaded at the same address in
> each app so LoadLibrary address won't change. But you can't hard code it
> like that as the address may vary from machine to machine.

I'll bite -- how will he get the address of GetProcAddress so that
he can call it? :)

David

aela...@gmail.com

May 6, 2006, 5:36:55 PM

Stephen,

Thanks :) I implemented it and it works great. I just added a spot for
the address of the function in the struct I inject, used
GetProcAddress in MY application (that's the answer to your question,
David) and then stored that in the injected structure. That way the
injected code, once run, can call it by retrieving the address of
LoadLibrary from the injected structure.
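
One possible layout for such a structure, as a sketch only. The field names fnLoadLibrary, cLibPath and cOldBytes are the ones used in the code below; the field types, the sizes and the fill_inj_struct helper are assumptions made for illustration, with a 32-bit target in mind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LIB_PATH_MAX  260 /* assumed size; same value as Win32 MAX_PATH */
#define OLD_BYTES_LEN 12  /* assumed; matches the 12 bytes copied back below */

/* Parameter block written into the target for the injected code to read. */
struct InjStruct {
    uint32_t fnLoadLibrary;            /* address resolved in the injector */
    char     cLibPath[LIB_PATH_MAX];   /* NUL-terminated path of the DLL */
    uint8_t  cOldBytes[OLD_BYTES_LEN]; /* original entry point bytes */
};

/*
 * Fill the block on the injector side. Returns 1 on success, 0 if the path
 * does not fit. (Hypothetical helper.)
 */
static int fill_inj_struct(struct InjStruct *s, uint32_t load_library_addr,
                           const char *lib_path, const uint8_t *old_bytes)
{
    assert(s != NULL && lib_path != NULL && old_bytes != NULL);

    size_t len = strlen(lib_path);
    if (len >= sizeof s->cLibPath)
        return 0; /* no room for the terminating NUL */

    memset(s, 0, sizeof *s);
    s->fnLoadLibrary = load_library_addr;
    memcpy(s->cLibPath, lib_path, len + 1);
    memcpy(s->cOldBytes, old_bytes, OLD_BYTES_LEN);
    return 1;
}
```

The filled block would then be copied into the target in one piece, so the injected code needs only a single pointer to find everything.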

Once I got it working in assembly I actually moved it over to C this
morning... I had a lot of problems getting the injected code working
right in C so I copied it over in asm:

static void InjectedProcess(struct InjStruct *sStruct) {
    DWORD pFunc = (DWORD) sStruct->fnLoadLibrary;
    DWORD pStr = (DWORD) sStruct->cLibPath;
    DWORD pOld = (DWORD) sStruct->cOldBytes;

    __asm {
        pushad              // Call load library

        push pStr
        call pFunc
        popad

        pushad              // Copy the entry point back
        cld

        mov edi,0x49AC20    // entry point of the target (hardcoded)
        mov esi,pOld
        mov ecx, 12         // 12 saved bytes: whole dwords first
        shr ecx, 2
        rep movsd

        mov ecx, 12         // then any leftover bytes (none for 12)
        and ecx,3
        rep movsb

        popad

        mov eax,0x49AC20    // run the restored entry point
        call eax
    }
}
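
For comparison, the copy-back part of the asm above (whole dwords with rep movsd, then leftover bytes with rep movsb) can be written in portable C roughly as follows. This is only an illustration of what those instructions do; the function name is invented, and a plain memcpy would give the same result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy n bytes the way the asm does it: n / 4 four-byte moves first,
 * then the remaining n % 4 bytes one at a time. (Hypothetical helper.)
 */
static void copy_dwords_then_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    size_t dwords = n >> 2; /* shr ecx, 2 */
    size_t tail   = n & 3;  /* and ecx, 3 */

    while (dwords-- > 0) {  /* rep movsd */
        memcpy(dst, src, 4);
        dst += 4;
        src += 4;
    }
    while (tail-- > 0) {    /* rep movsb */
        *dst++ = *src++;
    }
}
```

With n = 12, as in the asm, the first loop runs three times and the second loop not at all.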

Basically, when I was doing it in C, I could look at the disassembly and
see calls to random addresses (actually addresses in my injecting
program). I don't understand why, as in C my code was like 3
instructions: a call to LoadLibrary and a couple of copies. Can anyone
shed light on this for me?

Stephen Kellett

May 7, 2006, 5:54:29 AM

In message <lP07g.17770$ZW3.17381@dukeread04>, David Jones
<nc...@tadmas.com> writes

>> Yes. Do a GetProcAddress in the process doing the injection to get
>>the address of LoadLibrary and then use that address in your mov
>>instruction. The kernel32.dll is always loaded at the same address in
>>each app so LoadLibrary address won't change. But you can't hard code
>>it like that as the address may vary from machine to machine.
>
>I'll bite -- how will he get the address of GetProcAddress so that
>he can call it? :)

Be more observant, less haste, more speed. Read the text I wrote
carefully. I'll quote the important line below and add some asterisks
for emphasis. I didn't try to hide anything, it wasn't written in a way
to obscure the info, but you've obviously missed the context of the
GetProcAddress call.

<QUOTE>
Do a GetProcAddress in the **process doing the injection** to get the
address of LoadLibrary and then use that address in your mov
instruction.
</QUOTE>

Alf P. Steinbach

May 7, 2006, 6:01:13 AM

* Stephen Kellett:

>
> <QUOTE>
> Do a GetProcAddress in the **process doing the injection** to get the
> address of LoadLibrary and then use that address in your mov instruction.
> </QUOTE>

Is it documented somewhere that [kernel32.dll] will be mapped to the same
logical base address in all processes in the same Windows instance?

aela...@gmail.com

May 7, 2006, 6:28:38 AM

I'm not sure if it's documented as such, but it always is in practice.

Stephen Kellett

May 7, 2006, 10:04:46 AM

In message <4c5urbF...@individual.net>, Alf P. Steinbach
<al...@start.no> writes

>Is it documented somewhere that [kernel32.dll] will be mapped to the same
>logical base address in all processes in the same Windows instance?

One or both of Jeffrey Richter's/John Robbins's official Microsoft books
state it. From my experience, it appears to be true.

From what I've read of x64 versions of the OS this may not be true, but
I haven't had a chance to test an x64 version and I never checked the
behaviour of Itanium boxes (which I would expect to be the same) when I
had access to one.

David Jones

May 7, 2006, 5:11:59 PM

Stephen Kellett wrote:

> In message <lP07g.17770$ZW3.17381@dukeread04>, David Jones
> <nc...@tadmas.com> writes
>
>>> Yes. Do a GetProcAddress in the process doing the injection to get
>>> the address of LoadLibrary and then use that address in your mov
>>> instruction. The kernel32.dll is always loaded at the same address in
>>> each app so LoadLibrary address won't change. But you can't hard code
>>> it like that as the address may vary from machine to machine.
>>
>> I'll bite -- how will he get the address of GetProcAddress so that
>> he can call it? :)
>
> Be more observant, less haste, more speed. Read the text I wrote
> carefully.

You're right... sorry about that. I must have completely skipped
over that phrase.

David