ReceiveById fails

Johan Botha

Aug 1, 2008, 9:59:02 AM

We are using MSMQ (3.0) on Windows Server 2003 SP2 64-bit. Our queue listener
design does an async peek and throws the message onto the thread pool queue to
be processed. When the message is done processing, we do a receive by ID
(ReceiveByLookupId) to remove it from the queue.

We have been using this successfully on another product for a while, but we
are now getting an error stating "Message requested was not found in the
queue specified." Lookup id=288230376151723328

Stack:
at System.Messaging.MessageQueue.InternalReceiveByLookupId(Boolean receive,
MessageLookupAction lookupAction, Int64 lookupId, MessageQueueTransaction
internalTransaction, MessageQueueTransactionType transactionType)
at System.Messaging.MessageQueue.ReceiveByLookupId(Int64 lookupId)

This is a serious issue for us. We are not using any transactions and the
queues are non-transactional. The queues are also local private queues.

In one of the services, the service stopped processing the queue (that
could be in our code, though I think I wrapped it in exception handling to
log and continue). It looks like the message that raised the error was
reprocessed successfully (but this meant it was processed twice). This is
pretty urgent for us as it is in production.
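
For the "log and continue" handling mentioned above, a minimal sketch (not the actual service code) of a removal step that tolerates an already-removed message might look like the following. `RemovalSketch` and `TryRemove` are hypothetical names. The sketch relies on `ReceiveByLookupId` throwing `InvalidOperationException` when the lookup ID is no longer in the queue, which matches the error text quoted above.

```csharp
using System;
using System.Messaging;

// Hypothetical helper, for illustration only.
static class RemovalSketch
{
    // Returns true if the message was removed, false if it was already gone
    // (for example because another worker removed it first).
    public static bool TryRemove(MessageQueue queue, long lookupId)
    {
        try
        {
            // Receiving removes the message from a non-transactional queue.
            using (queue.ReceiveByLookupId(lookupId)) { }
            return true;
        }
        catch (InvalidOperationException)
        {
            // "Message requested was not found in the queue specified."
            // The caller can log this and carry on with the next message.
            return false;
        }
    }
}
```

A `false` result only reports that the message was already gone; it does not say whether the message was processed more than once.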

Any help would be greatly appreciated.

--

Johan Botha
Software Architect

John Breakwell (MSFT)

Aug 1, 2008, 2:11:55 PM

Hi Johan,

There is a hotfix for a similar problem that appears under load (and I
assume your problem is an intermittent problem under load too).
I'd recommend giving this newer version a test first (don't worry about the
Triggers reference):

937549 FIX: Random messages may not be processed, and duplicate messages may
be processed by the trigger rule when you use the Message Queuing Triggers
service to process messages in Message Queuing 3.0
http://support.microsoft.com/default.aspx?scid=kb;EN-US;937549

Note that with your design you always need to code for an error in the
receive - you can't rely on MSMQ to always process the message.

With this sort of logic:

1 Peek message
2 Process message data in some way
3 Receive message to delete it
4 Goto 1

you really need to handle something like the machine rebooting half-way
through step 3.
You say you are not using transactions but these are exactly what you should
be using.

For example, you could develop the following:

1 Peek message
2 Start transaction
3 Process message data in some way
4 Receive message to delete it
5 Commit transaction (this part really deletes the message)
6 Goto 1

and

1 Peek message
2 Start transaction
3 Process message data in some way
4 Receive message but encounter error
5 Abort transaction (message is left in queue; may need to execute
compensating code to reverse the message processing in step 3)
6 Goto 1
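
The two sequences above (commit on success, abort on failure) can be sketched in C# roughly as follows. This assumes a transactional queue, which is not what the original poster is using, and the names `TransactionalReceiveSketch`, `ProcessNext`, and the `process` delegate are hypothetical. It is a sketch of the pattern under those assumptions, not a drop-in listener.

```csharp
using System;
using System.Messaging;

// Hypothetical helper, for illustration only. Assumes 'queue' is transactional.
static class TransactionalReceiveSketch
{
    // Returns false if no message arrived within 'timeout'.
    public static bool ProcessNext(MessageQueue queue, Action<Message> process, TimeSpan timeout)
    {
        Message peeked;
        try
        {
            peeked = queue.Peek(timeout);                 // 1 Peek message
        }
        catch (MessageQueueException ex)
        {
            if (ex.MessageQueueErrorCode == MessageQueueErrorCode.IOTimeout)
            {
                return false;                             // queue was empty
            }
            throw;
        }

        using (MessageQueueTransaction tx = new MessageQueueTransaction())
        {
            tx.Begin();                                   // 2 Start transaction
            try
            {
                process(peeked);                          // 3 Process message data
                queue.ReceiveByLookupId(                  // 4 Receive message to delete it
                    MessageLookupAction.Current, peeked.LookupId, tx);
                tx.Commit();                              // 5 Commit (really deletes it)
                return true;
            }
            catch
            {
                // Abort leaves the message in the queue. Work already done by
                // 'process' is NOT undone here; compensating code may be needed.
                if (tx.Status == MessageQueueTransactionStatus.Pending)
                {
                    tx.Abort();
                }
                throw;
            }
        }
    }
}
```

The transaction only covers the MSMQ receive. Side effects of step 3 that happen outside MSMQ are not rolled back by the abort, which is why the second sequence mentions compensating code.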

Cheers
John Breakwell (MSFT)

Johan Botha

Aug 1, 2008, 9:53:00 PM

Hi John,

Thank you very much for your reply. I will try the hotfix. It was very
odd: it looked as if the same message got peeked twice in a row, and this
happened twice. We are still ramping up in production but will soon be doing
between 1 and 2 million transactions a day (integration hub).

At this point, for performance and other reasons, we prefer not to use
transactions, but in future we might use them if a situation arises where
that level of data fidelity is required. The pattern you mentioned is actually
one of two I defined for use with this base class, and I chose the peek
specifically because it allows me to use async methods as well as
transactions when needed. So far in other products (of a different type), it
has worked very well for about 2 years, so I was surprised to run into this,
but the data pattern here is a bit different and these machines are 64-bit
and more powerful.

Have a great weekend.

Thanks
Johan

--

Johan Botha
Software Architect

Johan Botha

Aug 5, 2008, 11:01:01 AM

John,

One more question. I am about to apply the hotfixes in production, but I
was wondering if there are any best practices for configuring antivirus
software so that it does not scan certain MSMQ files.

Thanks
--

Johan Botha
Software Architect

Johan Botha

Aug 7, 2008, 5:41:02 PM

Hi John,

I applied the hotfix and turned off anti-virus and we are still getting the
error.

We are doing an async peek with a cursor, but it looks like the peek
actually peeks the same message twice in a row. I have a single thread
peeking and throwing the message on a worker thread queue for the threadpool.
I have a counting semaphore controlling how many can be in flight at any
point.

It looks like both copies are then processed in parallel; whichever finishes
first receives the message, and the next one can't find it. This causes the
message to be processed twice and sent twice (via HTTP, which is one of the
reasons we can't use transactions).
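
One possible mitigation for the duplicate sends described here, sketched under assumptions: keep a bounded set of recently claimed lookup IDs and have each worker skip a message whose ID has already been claimed. `RecentLookupIds` and `TryClaim` are hypothetical names, not part of System.Messaging, and this does nothing about messages the cursor skips or about duplicates across a service restart.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical guard, for illustration only: remembers the most recently
// claimed lookup IDs so a message peeked twice is only processed once.
public sealed class RecentLookupIds
{
    private readonly object gate = new object();
    private readonly HashSet<long> seen = new HashSet<long>();
    private readonly Queue<long> order = new Queue<long>();
    private readonly int capacity;

    public RecentLookupIds(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException("capacity");
        this.capacity = capacity;
    }

    // Returns true the first time an ID is offered; false while the ID is
    // still among the 'capacity' most recently claimed IDs.
    public bool TryClaim(long lookupId)
    {
        lock (gate)
        {
            if (!seen.Add(lookupId))
            {
                return false;                 // already claimed: duplicate peek
            }
            order.Enqueue(lookupId);
            if (order.Count > capacity)
            {
                seen.Remove(order.Dequeue()); // forget the oldest claimed ID
            }
            return true;
        }
    }
}
```

A worker would call `TryClaim(message.LookupId)` before processing and drop the message when it returns false. The capacity should comfortably exceed the number of messages the semaphore allows in flight.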

Any other ideas as to why we are getting this behaviour on these machines?
We are using the same base class in other projects that have been in
production for a few years now without problems. I am also not seeing this
issue in the Sandbox or Managed Test environments.

Any help would be greatly appreciated.

Thanks
--

Johan Botha
Software Architect


"John Breakwell (MSFT)" wrote:

> Yes, AV software should never touch the Storage directory.
> MSMQ is expecting to be able to write to the disk unobstructed.
> Anything that could have a lock on the storage files could lead to message
> corruption.
>
> Cheers
> John Breakwell (MSFT)

Johan Botha

Aug 8, 2008, 9:34:01 AM

Also, it seems that some messages are getting stuck in the queue, like the
async peeks are missing them. We have multiple servers writing to these
queues, so messages from multiple web servers could be sent to our
application servers. One App server can receive messages for the same queue
from multiple web servers.

This is starting to happen a lot as load is increasing.

--

Johan Botha
Software Architect

Johan Botha

Aug 8, 2008, 9:57:04 AM

Also, we are using a cursor on our async peek.

Here is that code:
if (state == ProcessState.First)
{
    // First call: create the cursor and peek at its current position.
    this.msmqCursor = this.inputQueue.CreateCursor();
    this.inputQueue.BeginPeek(this.waitTimeout, this.msmqCursor,
        PeekAction.Current, null, this.OnPeekCompleted);
}
else
{
    // Subsequent calls: advance the cursor and peek the next message.
    this.inputQueue.BeginPeek(this.waitTimeout, this.msmqCursor,
        PeekAction.Next, null, this.OnPeekCompleted);
}

--

Johan Botha
Software Architect

Johan Botha

Aug 8, 2008, 11:05:01 AM

I have managed to reproduce this in our managed test environment. The trigger
is not so much the amount of load as the same app server queue receiving
messages from 2 web servers at the same time. We have load-balanced web
servers, and when I added the second one to the load balance pool and
stressed it with our tool, I was able to reproduce it quite easily. However,
giving it the same load through just 1 web server did not seem to cause any
issues.

What seems to happen is that on the reading side, the async cursor peek gets
a hiccup in its cursor and peeks the same message a second time. It then also
skips over a message or two (sometimes perhaps none; I have not been able to
establish a precise correlation yet and will work on it).

So I end up with duplicate transactions (HTTP POSTs out to partners) as well
as missed messages that have to be manually prodded along, either by sending
them to the back of the queue with QueueExplorer from Cogin, or by restarting
the service (which of course creates a new cursor from the beginning).

This seems to be a serious issue with MSMQ and I really do urgently need
help on it as we are using it in a production environment.

Thanks
--

Johan Botha
Software Architect

Johan Botha

Aug 8, 2008, 11:42:10 AM

So far the closest thing I have seen is a mention of a critical section:

http://www.webmasterdev.com/showthread.php?t=253664

--

Johan Botha
Software Architect

behrad.s...@gmail.com

Aug 27, 2018, 5:17:34 AM

Hi Johan,

Did you find any fix for this?

I have a legacy application and this has caused serious problems for us.