NServiceBus host hanging and not processing messages


Noam Fein

May 3, 2016, 3:11:34 PM
to Particular Software
Helpful information to include
Product name:
Version: 4.2
Stacktrace:
Description: I have a host endpoint running as a Windows service. The host processes some messages and then stops, and the remaining messages are left in the queue. I was able to pull the messages from the queue with a tool I wrote, to check whether MSMQ itself was stuck.
When we try to stop and start the service, it takes too long and we get an error, so we have to kill the process manually.
There is nothing in the host log and nothing in the Event Viewer.

Thank you


Tim

May 4, 2016, 10:10:13 AM
to Particular Software
Hi Noam,

When messages aren't consumed, it can point to errors during message processing. Those would normally show up in the log file. Can you share your handler code with us?

Noam Fein

May 4, 2016, 12:22:00 PM
to Particular Software
I got nothing in the log, nothing in the Event Viewer, and nothing in the error queue. The messages remain in the queue.
My handler:

    public void Handle(GrandJuryPacketMessage message)
    {
        User sendingUser = _session.QueryOver<User>().Where(u => u.Id == message.SenderId).ReadOnly().SingleOrDefault();

        _session.EnableFilter("org").SetParameter("OrganizationId", message.OrganizationId);

        var cases = _session.QueryOver<Case>().AndRestrictionOn(x => x.Id).IsIn(message.CaseIds).ReadOnly().List();

        HttpClient httpClient = _urlInfoHelper.GetHttpClient(message.IsWindowsAuthentication, message.ServiceAccountUsername, message.ServiceAccountPassword);

        var filesStreamList = new List<GeneratedFiles>();
        var filesWithErrors = new List<String>();

        foreach (var fileNameUrlDictionary in message.FileNameUrlDictionary)
        {
            string url = fileNameUrlDictionary.Value;
            string fileName = fileNameUrlDictionary.Key;
            try
            {
                _urlInfoHelper.GetUrlInfo(httpClient, url, fileName, filesStreamList, filesWithErrors, message.BaseFileName);
                //  await GetUrlInfo(httpClient, fileNameUrlDictionary.Value, fileNameUrlDictionary.Key, fileStreamList, filesWithErrors,message.BaseFileName);
            }
            catch (Exception ex)
            {
                _log.ErrorFormat("error pulling data from url: '{0}' ,exception: '{1}'", url, ex.Message);
                filesWithErrors.Add(fileName);
            }
        }

        CallGrandJuryGeneratorsToGenerate(message, filesStreamList, filesWithErrors, cases, sendingUser);

        var zipFileName = _zipFiles.ZipAllStreams(message.BaseFileName, filesStreamList, _downloadFilePath + @"\" + sendingUser.Id + @"\", "ALL_DOCUMENTS");

        SendNotificationAndEmail(message, sendingUser, zipFileName, filesWithErrors);

        PrintAndAttachIfNeeded(message, filesStreamList, sendingUser);

        foreach (var fileStream in filesStreamList)
        {
            fileStream.Stream.Dispose();
        }
    }

Tim

May 4, 2016, 12:52:13 PM
to Particular Software
Thanks for sharing your handler with us. Have you tried setting the log level to Debug (see the documentation: http://docs.particular.net/nservicebus/logging/#logging-levels ) to see at what point the endpoint stops consuming messages? Adding some extra logging to your handler may help as well.

Does this happen only in production, or can you reproduce the issue on your local machine? Which NServiceBus packages and versions are you using? The description states NSB 4.2; is that correct?
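
On the log level: in the 4.x line the threshold can, as far as I know, be set from the endpoint's app.config. The fragment below is only a sketch under that assumption. The section type and the Threshold attribute should be confirmed against the documentation linked above for your exact version.

```xml
<configuration>
  <configSections>
    <!-- Assumed section registration for NServiceBus 4.x; verify against the linked docs -->
    <section name="Logging" type="NServiceBus.Config.Logging, NServiceBus.Core" />
  </configSections>
  <!-- Lower the threshold from Info to Debug while diagnosing the hang -->
  <Logging Threshold="Debug" />
</configuration>
```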

Noam Fein

May 4, 2016, 1:14:42 PM
to Particular Software
Thank you, Tim, for your quick response.
We are using 4.2.0.
The log is set to Info.
We are also facing an issue, possibly related, where stopping the host takes a long time. Most of the time we get an error and have to kill the process.

Tim

May 6, 2016, 3:51:31 AM
to Particular Software
Hey Noam,

Can you please share the logged exceptions you're seeing during shutdown? Try setting the log level to Debug to look for hints about why your endpoint stops processing. Without access to the log or exception info, there is very little I can help you with.

Also, please note that NServiceBus 4.2 is no longer officially supported and does not receive patches. I'd highly recommend updating to the latest minor version (4.7) and checking whether your issues still show up.

Cheers,
Tim

ramon...@particular.net

May 6, 2016, 8:47:10 AM
to Particular Software


Noam,

NServiceBus does a graceful shutdown, meaning that any messages currently being processed must complete, while no new messages are fetched from the queue in the meantime.


Your handler is doing a few heavy operations:

- It makes lots of HTTP calls in a single handler
- It calls CallGrandJuryGeneratorsToGenerate, which probably also does a lot of work
- It zips everything
- It sends an email


These steps can take a while to complete, which is why the service doesn't stop within a few seconds. You can also just wait until it completes.


My advice would be to redesign this handler. Divide the steps into separate handlers and pass the results on to the next handler via a message. Redesign the foreach loop to send one message per item so that each item is processed separately.


A saga would be applicable here to help you orchestrate these messages.

Besides giving you lots of small, short-running tasks, this has a few other benefits:
  • Today, if an error happens during zipping or emailing, everything is lost and must be done again. With separate handlers, only the failed step is retried.
  • The foreach is currently sequential. With messages, the tasks can be processed concurrently, reducing the overall duration.
  • There is less risk of the same email being sent multiple times.
  • If the email cannot be sent because the mail server is down or temporarily rejects it, you don't need to redo all the other tasks.

Because each handler now does only a small part, endpoint shutdown will be *much* quicker: the endpoint only has to wait for the steps currently executing, not for the whole chain to finish.
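
To illustrate the foreach point, here is a rough sketch rather than a definitive implementation. DownloadFile, PacketSplitter and the send delegate are names made up for this example and are not part of NServiceBus. In a real handler, send would wrap whatever send method your bus instance exposes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message type: one download request per file.
public class DownloadFile
{
    public Guid PacketId { get; set; }
    public string FileName { get; set; }
    public string Url { get; set; }
}

public static class PacketSplitter
{
    // Sends one small message per file instead of downloading everything
    // inside a single long-running handler. Returns the number of messages sent.
    // 'send' is a stand-in for the bus (an assumption for this sketch).
    public static int SendOneMessagePerFile(
        Guid packetId,
        IDictionary<string, string> fileNameUrlDictionary,
        Action<object> send)
    {
        var count = 0;
        foreach (var entry in fileNameUrlDictionary)
        {
            send(new DownloadFile
            {
                PacketId = packetId,
                FileName = entry.Key,
                Url = entry.Value
            });
            count++;
        }
        return count;
    }
}
```

Each DownloadFile message would then be handled on its own, and something like a saga could collect the results by PacketId before the zip step runs.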


Also, your current handler does not properly dispose of the file streams when it fails, because the disposal is not inside a try..finally block.
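
A minimal sketch of the try..finally shape, assuming each item exposes a Stream property as in your handler. GeneratedFile and DisposeSketch are stand-in names for this example, not your real types.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for the GeneratedFiles type used in the handler.
public class GeneratedFile
{
    public string Name { get; set; }
    public Stream Stream { get; set; }
}

public static class DisposeSketch
{
    // Runs the work and disposes every stream afterwards,
    // even when the work throws an exception.
    public static void ProcessAndDispose(
        List<GeneratedFile> files,
        Action<List<GeneratedFile>> work)
    {
        try
        {
            work(files);
        }
        finally
        {
            foreach (var file in files)
            {
                if (file.Stream != null)
                {
                    file.Stream.Dispose();
                }
            }
        }
    }
}
```

The exception still propagates to the caller, so the normal retry and error queue behaviour is unaffected. Only the cleanup is guaranteed.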


Regards,
Ramon

Noam Fein

May 6, 2016, 12:43:07 PM
to Particular Software
Thanks for the response.
We do use different messages and handlers. What you see in the code wraps calls that send to those other handlers: email has its own handler, and so does print.
The only thing we do in that handler is get the streams (from the URLs) and zip them.
Sometimes it gets stuck when the print message is sent.

The problem is that it won't stop. I now understand that it is trying to finish the message it is currently processing, but it looks like it is not processing anything at all. The service is stuck: it processes a few messages and then stops, and the rest are left in the queue.

I will try logging at Debug level and update you on what we see.
About upgrading:
I do want to upgrade, but it requires upgrading other packages that depend on that version (like StructureMap). We will need to do it, but for now I'd rather wait.

Thanks again

Noam Fein

May 6, 2016, 3:11:53 PM
to Particular Software
Another piece of information that might be useful:
When we restart the host, it still does not pick up the messages. Only after we purge the queue does it process new messages (and then it gets stuck again).
At first I thought it might be an MSMQ issue, but I wrote a utility that pulls messages while the host is hanging, so I'm sure it is not an MSMQ issue.

ramon...@particular.net

May 9, 2016, 3:45:05 AM
to Particular Software

Hi Noam,

The 'hanging' is probably due to the handlers waiting for something. You mention that the printing and email tasks are done in separate handlers, but all the work before that, the downloading and zipping, still happens in the same handler.

Purging indicates that you just removed a message that caused the blocking. Maybe that message contained a lot of files or needed to download those from a (temporarily) very slow server?

I would add some Log.Debug statements to get an idea of what the handler is doing.

Log each step, how many files it is going to download, and so on. Also, since your handler downloads the files first, you could check whether you see any file activity in that folder.
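
One way to do this, as a sketch only: wrap each step so that a line is written before it starts and after it ends. StepLogger is a made-up helper for this example, and the logDebug delegate stands in for your logger (for instance a lambda that calls your logger's debug method, assuming it has one that takes a string).

```csharp
using System;
using System.Diagnostics;

public static class StepLogger
{
    // Runs one named step, writing a log line before it starts and
    // another when it is left (whether it succeeded or threw).
    public static void Run(string stepName, Action<string> logDebug, Action step)
    {
        logDebug("Starting " + stepName);
        var timer = Stopwatch.StartNew();
        try
        {
            step();
        }
        finally
        {
            logDebug(string.Format(
                "Leaving {0} after {1} ms", stepName, timer.ElapsedMilliseconds));
        }
    }
}
```

If the log ends with a "Starting ..." line that has no matching "Leaving ..." line, that step is the one the handler is stuck in.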


Did you configure the allowed number of connections? It could be that no connections are available to download the files.
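
For context, the .NET Framework limits the number of concurrent outbound HTTP connections per host, and the default for a non-ASP.NET process is 2. The limit can be raised in the endpoint's app.config. The value 20 below is only an illustrative number, not a recommendation.

```xml
<configuration>
  <system.net>
    <connectionManagement>
      <!-- Illustrative value: allow up to 20 concurrent connections per remote host -->
      <add address="*" maxconnection="20" />
    </connectionManagement>
  </system.net>
</configuration>
```

The same limit can be set in code through ServicePointManager.DefaultConnectionLimit before the first request is made.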


Regards,
Ramon

Noam Fein

May 9, 2016, 12:38:15 PM
to Particular Software
I would have thought it would error out at run time in that case.
But I will add more logging and try it out.
Thanks.

ramon...@particular.net

May 13, 2016, 12:38:57 PM
to Particular Software


Noam,

Were you able to resolve your issue?

-- Ramon

ramon...@particular.net

May 19, 2016, 6:33:53 AM
to Particular Software

Noam,

I assume you have resolved your issue, since you didn't respond. Please provide additional details if this is not the case.

Regards,
Ramon
 