I have a large multithreaded application which experiences an access
violation, always at the same place, from time to time.
The AV occurs in "function FastFreeMem(APointer: Pointer): Integer;" at line
4209 (See source extract below, line marked "here"). The AV is either a write
at address 00000000 or a write at address 00000001.
This happens randomly about once per day, which is very infrequent given that
the application is a server program used by something like 700
users 24h/24. That is still too much.
I have no idea what can cause the issue. It looks like a timing-dependent issue
because it always happens when several threads are contending.
The IsMultiThread variable is set to TRUE. AssumeMultiThreaded is not defined.
Any idea about finding the cause of this AV is really welcome!
Extract from FastMM4.pas:
@LockBlockTypeLoop:
mov eax, $100
{Attempt to grab the block type}
lock cmpxchg TSmallBlockType([ebx]).BlockTypeLocked, ah
<===== HERE is the AV
je @GotLockOnSmallBlockType
{$ifndef NeverSleepOnThreadContention}
{Couldn't grab the block type - sleep and try again}
push ecx
push edx
push InitialSleepTime
call Sleep
pop edx
pop ecx
{Try again}
mov eax, $100
{Attempt to grab the block type}
lock cmpxchg TSmallBlockType([ebx]).BlockTypeLocked, ah
je @GotLockOnSmallBlockType
{Couldn't grab the block type - sleep and try again}
push ecx
push edx
push AdditionalSleepTime
call Sleep
--
francoi...@overbyte.be
Author of ICS (Internet Component Suite, freeware)
Author of MidWare (Multi-tier framework, freeware)
http://www.overbyte.be
kind regards
Mike
I will try that because it is very easy, even though I would be surprised if it
changed anything.
Thanks.
Please remember to post your results here, even if it is more just a gut
feeling than based on hard evidence.
-TP-
I will do.
Yesterday I deployed my application recompiled with the MMX option
disabled in FastMM. 19 hours later, there is still no access violation. That
is good but not long enough to be sure that was the cause. Sometimes the
application ran 3 days before experiencing an AV and sometimes only a few
hours.
If deactivating EnableMMX does not solve the issue, I will try to upgrade to
the latest release (V4.84), but since it is very new, I fear it could come
with more bugs (in new features) than fixes (in existing features).
Stay tuned!
Disabling MMX was not the solution. The application experienced the same AV
in FastFreeMem after running correctly for 21 hours.
Now trying with FastMM 4.84. I'm changing my code to periodically call
ScanMemoryPoolForCorruptions which could help detect bugs.
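A periodic scan like the one mentioned above could be sketched roughly as follows. This is only a minimal sketch, not the code actually used in the application: TPoolScanThread and its interval parameter are hypothetical names, and it assumes FastMM4 is built with FullDebugMode enabled, since ScanMemoryPoolForCorruptions is only available in that mode.

```pascal
unit PoolScanThread;

interface

uses
  Classes, SysUtils, FastMM4;

type
  // Hypothetical helper thread (not part of FastMM): calls
  // ScanMemoryPoolForCorruptions at a fixed interval.
  TPoolScanThread = class(TThread)
  private
    FIntervalMs: Cardinal;
  protected
    procedure Execute; override;
  public
    constructor Create(AIntervalMs: Cardinal);
  end;

implementation

constructor TPoolScanThread.Create(AIntervalMs: Cardinal);
begin
  FIntervalMs := AIntervalMs;
  // Start running immediately; the owner frees the thread itself.
  inherited Create(False);
end;

procedure TPoolScanThread.Execute;
var
  Waited: Cardinal;
begin
  while not Terminated do
  begin
    // Sleep in short slices so that Terminate is noticed quickly.
    Waited := 0;
    while (Waited < FIntervalMs) and not Terminated do
    begin
      Sleep(100);
      Inc(Waited, 100);
    end;
    // Assumption: FullDebugMode is defined, otherwise this call
    // will not compile.
    if not Terminated then
      ScanMemoryPoolForCorruptions;
  end;
end;

end.
```

The shorter the interval, the closer in time a reported corruption is to the code that caused it, at the price of more time spent scanning the pool.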
> 4209 (See source extract below, line marked "here"). The AV is either write
> at address 00000000 or write at address 00000001.
That error will occur if the dword before the start of the memory block
that you're trying to free is either zero or one, and the only way for
that to happen is if your application misbehaved and wrote to that
address. (FastMM uses the dword in front of a memory block to store a
pointer to an internal management structure.)
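To make the description above concrete, here is a small illustrative sketch of what such a stray write looks like from the application side. It assumes a 32-bit build with FastMM4 installed as the memory manager and FullDebugMode off; the program name and variables are made up for illustration. The exact fault address depends on FastMM internals and has not been verified here.

```pascal
program HeaderPeek;
{$APPTYPE CONSOLE}

uses
  FastMM4,   // must come first so it installs itself as the memory manager
  SysUtils;

var
  P: Pointer;
  HeaderPtr: PCardinal;
begin
  GetMem(P, 16);

  // 32-bit assumption: the block header is the dword just below the
  // pointer that GetMem returned.
  HeaderPtr := PCardinal(Cardinal(P) - SizeOf(Cardinal));
  Writeln('Header before block: $', IntToHex(HeaderPtr^, 8));

  // Simulate a buffer underrun in application code that zeroes the header.
  HeaderPtr^ := 0;

  try
    // FreeMem now follows a bogus header, so a fault inside the memory
    // manager is the expected outcome.
    FreeMem(P);
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

In a real application the offending write usually happens long before the FreeMem call that faults, which is why the crash site alone says little about the culprit.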
Have you tried using FullDebugMode?
Regards,
Pierre
> I once read that MMX instructions in FastMM sometimes cause trouble.
It's not a problem with FastMM per se, it's just not safe to use MMX in
the core RTL. The reason is that there are "compiler magic"
functions that pass floating point parameters on the FPU stack. If you
use MMX in the RTL those parameters could get trashed. It's rare though:
So far I have only hit the problem when working with custom variants.
The symptom of that is an FPU stack error, so certainly not what
Francois is seeing.
Regards,
Pierre
O ye of little faith! ;-)
The core routines (GetMem, FreeMem, ReallocMem) haven't changed in
years, so it should be safe to use.
The application is now running with V4.84 in full debug mode, with periodic
calls to ScanMemoryPoolForCorruptions. I agree with you that the
application is probably corrupting the memory somehow. The real problem is how
to find where it happens!
The application is a multithreaded application server. Each client
request is executed in a separate thread. There is not really any data
shared between threads. Each thread has its own ADO connection to the DB and
executes independently of the others. The computer is a 4 simple core Intel
Xeon with 4GB of RAM.
I have extensive logging (including MadExcept) in the application and
nothing obvious pops up. The only constant in the AV is that it is always in
FastFreeMem. The AV even occurs during the night when activity is low.
This is really a big issue for me.