Hi Mike,
With regard to the Health Monitor: in life before NServiceBus, we implemented a lot of batch jobs (it's now my life's work to try and kill them!) whose status somebody had to check manually each day. So we built a single logging store which our batch jobs and our online applications could write to. It has a simple web front-end, which means the Ops guys can see at a glance whether any batch jobs failed and so on. Think of it like a poor man's ServicePulse! I'd love to use ServicePulse in the office but, as you know, it's possible to see the content of messages through the UI, which would mean potentially sensitive client data could be viewed by people who shouldn't be able to see it. I'd also like to use NServiceBus to handle the messaging between the batch jobs, etc. and our Health Monitor - one step at a time though!
In terms of what it actually subscribes to - currently it's just Failed Message and Heartbeat Stopped. These were deemed to be the two things we really wanted visibility of, so as well as logging the events, the Health Monitor emails the Ops team.
A couple of days ago I discovered that the Heartbeat Plug-in can be used with Send-Only endpoints - Mauro Servienti pointed out that it was the documentation which was wrong and promptly fixed it! So adding this to the web application which hosts the endpoint is now on my "to do" list.
The only "hole" in our monitoring that still concerns me is around MSMQ rather than NServiceBus itself. We've had a couple of cases where messages have not been delivered despite everything appearing to be well with the machines concerned. Monitoring the dead-letter queue (DLQ) highlighted this, but to me this feels like a last resort - I want visibility of communication issues earlier. The only way I can think of doing this would be to build a little service that monitors the outgoing queues and flags any messages that have been sat there longer than a specified time.
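As a rough sketch of that last idea, the age check itself could look something like the Python below. It assumes something else has already read the outgoing queues and produced a snapshot of what is waiting in them; QueuedMessage, find_stale_messages and the five-minute threshold are all made-up names and values for illustration, not part of MSMQ or NServiceBus.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class QueuedMessage:
    """Hypothetical snapshot of one message waiting in an outgoing queue."""
    message_id: str
    queue_name: str
    sent_at: datetime  # when the message was placed on the queue (UTC)


def find_stale_messages(messages, max_age, now=None):
    """Return the messages that have waited longer than max_age, oldest first."""
    now = now or datetime.now(timezone.utc)
    stale = [m for m in messages if now - m.sent_at > max_age]
    return sorted(stale, key=lambda m: m.sent_at)


if __name__ == "__main__":
    # Illustrative data only: one fresh message and one that has been stuck.
    now = datetime.now(timezone.utc)
    snapshot = [
        QueuedMessage("msg-1", "outgoing-a", now - timedelta(seconds=30)),
        QueuedMessage("msg-2", "outgoing-b", now - timedelta(minutes=20)),
    ]
    for m in find_stale_messages(snapshot, timedelta(minutes=5), now=now):
        print(f"{m.queue_name}: {m.message_id} waiting since {m.sent_at:%H:%M:%S}")
```

Run on a timer, anything this returns could be logged to the Health Monitor and emailed to the Ops team in the same way as the other alerts. How the snapshot is actually read from the outgoing queues is left open here.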
All the best,
Ian.