Distributed processing

Ben Dornis

Aug 14, 2012, 1:18:02 PM
to distsys...@googlegroups.com
Let's hope this all makes sense...

The basics of the process are:
    1. Request is put into a queue, and we wait for a result
    2. Request is picked up by a processor, broken up into discrete parts, and processed by several different services
    3. Finished request is saved and the client picks it up

Here are the details.
    1. The request is actually written to several tables in a database (broken up into discrete types of requests). After the request is made, the client starts polling the database, waiting for the records to be updated.
    2. Process servers constantly poll the database looking for new requests. When a server finds one, it marks the request as in use and starts processing it. Different servers process different types of requests.
    3. Once the process server marks the request as done, the server picks it up. It continues to wait until all the discrete parts have been fully processed or a specific amount of time has passed, and then displays the results.
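
To make step 2 concrete, here is a minimal sketch of how a process server might mark a request as in use without two servers grabbing the same row. It is only an illustration: it uses Python's built-in sqlite3 instead of Windows Azure SQL, and the `requests` table, its columns, and the `claim_next` and `make_demo_db` names are hypothetical, not the real schema.

```python
import sqlite3

def make_demo_db():
    """Build an in-memory database with a hypothetical requests table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE requests ("
        "id INTEGER PRIMARY KEY, type TEXT, status TEXT, worker TEXT)"
    )
    conn.executemany(
        "INSERT INTO requests (type, status) VALUES (?, 'new')",
        [("pricing",), ("pricing",), ("inventory",)],
    )
    conn.commit()
    return conn

def claim_next(conn, request_type, worker_id):
    """Claim one pending request of the given type; return its id or None."""
    with conn:  # commits on success, rolls back on an exception
        row = conn.execute(
            "SELECT id FROM requests WHERE type = ? AND status = 'new' "
            "ORDER BY id LIMIT 1",
            (request_type,),
        ).fetchone()
        if row is None:
            return None
        # Re-checking status in the WHERE clause means the UPDATE touches
        # zero rows if another worker claimed this request in the meantime.
        cur = conn.execute(
            "UPDATE requests SET status = 'in_use', worker = ? "
            "WHERE id = ? AND status = 'new'",
            (worker_id, row[0]),
        )
        return row[0] if cur.rowcount == 1 else None
```

The conditional UPDATE is the important part: a worker only treats the request as its own if the update actually changed a row.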

Database polling:
    This seems to be a real problem, and I have never liked it. We have 4 servers (scalable) that process 10 types of requests per server (each on its own thread). The current system polls once every second, with a burst when lots of requests are waiting. Every so often we get database timeouts and connectivity issues. According to the Windows Azure SQL docs, it starts dropping connections if certain criteria are met. We think we're hitting one of these criteria.
I still don't like the idea of constantly polling the database, and we're considering switching to Azure Service Bus to send messages back and forth about requests.

My ultimate question is: how should I proceed? I'm just learning.

I am so new to this and it's all very confusing. It hasn't been easy to find information on this type of system, and I'm really trying to learn how distributed computing works in general.

I think I'm on the right path but I'm open to any criticisms that will help me learn more about this.

Clemens Vasters

Aug 14, 2012, 2:06:40 PM
to distsys...@googlegroups.com

We actually have a set of specific features in Service Bus to make even fairly complex scatter/gather scenarios easy.

 

For distribution, depending on your concrete needs, you can either create a Topic if you want to fan out the same data, or use per-processor Queues if you need to break up the data and route portions of it specifically. For the reply flow you can create a reply Queue that has sessions enabled.
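
As a rough illustration of the difference between the two distribution options, here is a toy in-memory model. It is not the Service Bus API; the `Topic` class and `route_parts` function are hypothetical names used only to show the two shapes: every subscription sees every message, versus each part going to exactly one queue.

```python
from collections import deque

class Topic:
    """Toy topic: each subscription receives its own copy of every message."""

    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name):
        """Create a subscription and return the deque that backs it."""
        self.subscriptions[name] = deque()
        return self.subscriptions[name]

    def send(self, message):
        for queue in self.subscriptions.values():
            queue.append(dict(message))  # copy, so subscribers stay independent

def route_parts(parts, queues):
    """Per-processor queues: each part goes only to the queue for its type."""
    for part in parts:
        queues[part["type"]].append(part)
```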

 

As you flow information to the processors, include the name of the reply queue (the ReplyTo [1] property is meant for that; it's not interpreted by the system directly) and some kind of job id, either as a custom property or in the CorrelationId [2] property of the message. When you turn around with the reply, send it to the reply queue indicated in ReplyTo and set the SessionId [3] to the job id you got in the request. Each replying party should set the TTL [4] such that the messages will expire X seconds after the initial request was made, to avoid leftover garbage in the system.
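
The property handling described above can be sketched with a plain data class. This is a model of the idea, not the BrokeredMessage API: the `Message` class, `make_request`, `make_reply`, and the `requested_at` custom property are assumptions for illustration. In particular, the sketch assumes the request carries its original timestamp so that the replier can work out how much of the X seconds is left.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class Message:
    body: str
    reply_to: Optional[str] = None
    correlation_id: Optional[str] = None
    session_id: Optional[str] = None
    time_to_live: Optional[timedelta] = None
    properties: dict = field(default_factory=dict)

def make_request(body, job_id, reply_queue, now):
    """Build a request that names the reply queue and carries the job id."""
    return Message(
        body=body,
        reply_to=reply_queue,
        correlation_id=job_id,
        properties={"requested_at": now.isoformat()},  # assumed custom property
    )

def make_reply(request, result, now, max_age):
    """Return (destination queue, reply), or None if the job already expired."""
    requested_at = datetime.fromisoformat(request.properties["requested_at"])
    remaining = requested_at + max_age - now
    if remaining <= timedelta(0):
        return None  # nobody is waiting any more, so do not send garbage
    reply = Message(
        body=result,
        correlation_id=request.correlation_id,
        session_id=request.correlation_id,  # the job id becomes the session id
        time_to_live=remaining,  # expires max_age after the original request
    )
    return request.reply_to, reply
```

With a 30 second limit, a reply produced 10 seconds after the request would get a TTL of 20 seconds, so all replies for a job expire at the same moment regardless of when each processor finished.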

 

On the gathering end of the reply queue, instead of calling Receive, you use AcceptMessageSession [5]. That operation will give you a receiver to which all messages with the particular SessionId are locked. This gives you de-multiplexing over a queue. Even if multiple receivers (even on different nodes) are pulling on that same queue concurrently, all messages with that particular SessionId will be routed to whatever process owns that receiver object. The receiver should hold on to the MessageSession [6] object while it expects messages on the session and then close it.
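
To show what session-based de-multiplexing means in practice, here is a toy single-process model. It is not the Service Bus client: `SessionQueue`, `SessionReceiver`, and their methods are hypothetical stand-ins that only mimic the behaviour described above (one owner per SessionId, and that owner sees only its own messages).

```python
from collections import deque

class SessionReceiver:
    """Receiver bound to one session id; it only sees that session's messages."""

    def __init__(self, queue, session_id):
        self._queue = queue
        self.session_id = session_id

    def receive(self):
        """Return the next body for this session, or None if none is waiting."""
        for index, (sid, body) in enumerate(self._queue._messages):
            if sid == self.session_id:
                del self._queue._messages[index]
                return body
        return None

    def close(self):
        """Release the session so another receiver may accept it."""
        self._queue._locked.discard(self.session_id)

class SessionQueue:
    """Toy reply queue shared by many jobs, de-multiplexed by session id."""

    def __init__(self):
        self._messages = deque()
        self._locked = set()

    def send(self, session_id, body):
        self._messages.append((session_id, body))

    def accept_session(self, session_id):
        """Lock a session to a single receiver, as the text describes."""
        if session_id in self._locked:
            raise RuntimeError("session already locked: " + session_id)
        self._locked.add(session_id)
        return SessionReceiver(self, session_id)
```

A gatherer for job-1 would call `accept_session("job-1")`, collect replies with `receive()` until it has all the parts or runs out of time, and then call `close()`. Replies for other jobs stay in the queue untouched.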

 

I talk about the general mechanics here: http://channel9.msdn.com/Events/TechEd/Europe/2012/AZR317 in the "Who has data about me?" scenario discussion starting at the 14:10 mark. In the same talk I cover Splitters and Aggregation and have a demo of sessions starting at 57:50. The sample is here: http://code.msdn.microsoft.com/Brokered-Messaging-Session-41c43fb4

 

[1] http://msdn.microsoft.com/en-us/library/windowsazure/microsoft.servicebus.messaging.brokeredmessage.replyto.aspx

[2] http://msdn.microsoft.com/en-us/library/windowsazure/microsoft.servicebus.messaging.brokeredmessage.correlationid

[3] http://msdn.microsoft.com/en-us/library/windowsazure/microsoft.servicebus.messaging.brokeredmessage.sessionid

[4] http://msdn.microsoft.com/en-us/library/windowsazure/microsoft.servicebus.messaging.brokeredmessage.timetolive

[5] http://msdn.microsoft.com/en-us/library/windowsazure/hh293162.aspx

[6] http://msdn.microsoft.com/en-us/library/windowsazure/microsoft.servicebus.messaging.messagesession.aspx
