PROPOSAL: make Message model core and re-org backends, router

David McCann

Aug 17, 2011, 4:54:14 AM
to rapi...@googlegroups.com
Greetings all--

You'll forgive the top post; I'm including a link to the thread about the router that Caktus created a while back and recently announced:

http://groups.google.com/group/rapidsms/browse_thread/thread/b6059e8082fb2a0f

To summarize, we now have two alternative routers, and no one's particularly happy with the core router, so this kicked off a discussion for designing a new, more flexible router.  I'll summarize what seems to be the set of requirements here:

>>   - Bulk message queueing (using a BulkInsertManager or similar) 
>>  - Asynchronous sending of outgoing messages (actually calling the 
>> outgoing URL of a particular backend process like Kannel) 
>>  - Asynchronous polling of low-level backends (GSM modems etc) 
>>  - Asynchronous processing of incoming messages? Threadlessrouter has 
>> this, I think this is up for debate as I've never had an issue just 
>> processing this as part of handling the GET request 
>>  - Synchronous sending and receiving of messages in automated testing and 
>> development environments 

I think most of this work is already sitting around *somewhere,* and it's just a matter of reorganizing it so that it's easy to understand and configure, and doesn't require people to constantly write divergent modules.  My proposals:

1) Make Message a core model, and add the critical portion of message processing code (the "message lifecycle," or passing a message to apps through all the phases) onto the Message model itself.  For backwards compatibility, we can use duck-typing and move all the methods and properties of OutgoingMessage and IncomingMessage onto the Message model accordingly, and pass actual Message instances to the apps for processing.
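
To make proposal 1 a little more tangible, here is a minimal sketch in plain Python rather than an actual Django model. The phase names are the usual RapidSMS app phases and respond() mirrors IncomingMessage; the process() method, the connection string, and the simplified short-circuit rules are assumptions for illustration, not existing code.

```python
from dataclasses import dataclass, field

# The usual RapidSMS incoming app phases, in order.
INCOMING_PHASES = ("filter", "parse", "handle", "default", "cleanup")


@dataclass
class Message:
    """Stand-in for the proposed core model (a Django model in practice)."""
    connection: str        # hypothetical: identifies backend and sender
    text: str
    direction: str = "I"   # "I" incoming, "O" outgoing
    responses: list = field(default_factory=list)

    def respond(self, text):
        # Duck-typed like IncomingMessage.respond(): queue a reply.
        reply = Message(self.connection, text, direction="O")
        self.responses.append(reply)
        return reply

    def process(self, apps):
        """Pass this message to each app through every incoming phase."""
        handled = False
        for phase in INCOMING_PHASES:
            if phase == "default" and handled:
                continue  # some app already handled the message
            for app in apps:
                method = getattr(app, phase, None)
                if method is None:
                    continue
                result = method(self)
                if phase == "filter" and result:
                    return self.responses  # an app filtered the message out
                if phase == "handle" and result:
                    handled = True
                    break
        return self.responses


class EchoApp:
    """Hypothetical app that only implements the handle phase."""

    def handle(self, msg):
        msg.respond("echo: " + msg.text)
        return True


# The lifecycle runs on an in-memory instance; nothing is saved anywhere.
replies = Message("modem:+15550001", "hello").process([EchoApp()])
```

Apps receive the Message instance itself, so existing app code that calls msg.respond() or reads msg.text would keep working under this kind of duck-typing.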

2) Make Backends more like apps.  In fact, *make* them django apps.  Backends that handle channels that speak http can just expose views/urls, as Httprouter and Threadless router both do.  Backends should also have various methods that are similar to RSMS app phases.  I'm thinking to begin with:

 - start() - init code

 - poll()/receive()  - these would be methods where backends can place code that should be run periodically and asynchronously.  The backend would provide a best-effort implementation of polling, and it is *the router implementation's responsibility* to call this method often enough to achieve the required result.  This method could also be called synchronously during testing.  Backends that don't need to poll, like an HttpBackend, would simply implement this method as a no-op.

 - deliverMany() - this would allow a backend to send multiple messages with the same content, if supported, in the case of a mass text, or just send them individually in a loop if unsupported.
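
The backend methods listed above could look roughly like the following. This is only a sketch: the Backend base class, the deliver() helper, and the fake modem inbox are assumptions made for illustration, and deliverMany() is spelled deliver_many() to follow Python naming.

```python
class Backend:
    """Hypothetical base class for the proposed app-like backends."""

    def __init__(self, name):
        self.name = name
        self.sent = []  # stands in for a real modem or HTTP gateway

    def start(self):
        """Init code; called once by the router."""

    def poll(self):
        """Return newly received (sender, text) pairs.

        A no-op by default, which suits push-style backends such as an
        HTTP backend that receives messages through its views.
        """
        return []

    def deliver(self, recipient, text):
        self.sent.append((recipient, text))
        return True

    def deliver_many(self, recipients, text):
        """Fall back to a loop when the channel has no native bulk send."""
        return [self.deliver(recipient, text) for recipient in recipients]


class ModemBackend(Backend):
    """Polling backend; a list stands in for the GSM modem's inbox."""

    def start(self):
        self.inbox = [("+15550001", "hi")]

    def poll(self):
        # Drain the inbox so each message is returned exactly once.
        received, self.inbox = self.inbox, []
        return received
```

A backend with real bulk support would override deliver_many() with a single gateway call, while the router decides how often poll() runs.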

3) Use a factory model to allow for different router implementations.  At this point, the "router" just becomes an engine for providing asynchronous calls, kicking off the backends initially, and handling queueing.  It could be implemented using threads, celery, rabbit, or any combination thereof.  This would also allow for a test router implementation that does everything synchronously.  Also worth noting, the DB doesn't *have* to serve as the message queue persistence, as it does with httprouter; the message lifecycle logic could be invoked on an instance of Message without ever save()ing it.
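
As a rough illustration of the factory idea, the sketch below picks a router class by name and includes the synchronous test router mentioned above. The ROUTER setting, the ROUTERS registry, get_router(), and FakeBackend are all hypothetical names, not part of RapidSMS.

```python
class FakeBackend:
    """Hypothetical polling backend with one canned incoming message."""

    def start(self):
        self.inbox = [("+15550001", "ping")]

    def poll(self):
        received, self.inbox = self.inbox, []
        return received


class SyncRouter:
    """Test router: everything runs inline, in the caller's thread."""

    def __init__(self, backends):
        self.backends = backends

    def start(self):
        for backend in self.backends:
            backend.start()

    def run_once(self, on_message):
        # Poll each backend and process what it returns, synchronously.
        count = 0
        for backend in self.backends:
            for sender, text in backend.poll():
                on_message(backend, sender, text)
                count += 1
        return count


# A threaded or celery-based router would register itself here too.
ROUTERS = {"sync": SyncRouter}


def get_router(settings, backends):
    """Factory: build whichever router implementation the settings name."""
    name = settings.get("ROUTER", "sync")
    if name not in ROUTERS:
        raise ValueError("unknown router: %r" % name)
    return ROUTERS[name](backends)
```

In a test, get_router({"ROUTER": "sync"}, [FakeBackend()]) followed by start() and run_once(callback) would deliver the canned message to the callback without any threads or queues.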

I'm going to proceed in my fork to flesh this out, to give everyone something a little more tangible to respond to.  Just wanted to update the group with my latest thinking (and thanks to Jeff Wishnie and Nic Pottier for all their help solidifying this proposal).

Cheers,
--dm

Tobias McNulty

Aug 17, 2011, 9:32:37 AM
to rapi...@googlegroups.com
On Wed, Aug 17, 2011 at 4:54 AM, David McCann <david.a...@gmail.com> wrote:

> 1) Make Message a core model, and add the critical portion of message processing code (the "message lifecycle," or passing a message to apps through all the phases) onto the Message model itself.  For backwards compatibility, we can use duck-typing and move all the methods and properties of OutgoingMessage and IncomingMessage onto the Message model accordingly, and pass actual Message instances to the apps for processing.

Your timing is eerie. Colin and I were just discussing this yesterday and basically came to the conclusion that the inclusion of a Message model is one of the most significant differences between httprouter and threadless, and as we've seen, it can lead to very different approaches to queuing, etc.  Typically, to date, we've preferred more business-specific models to keep track of outgoing messages, e.g.:


While I'm not 100% convinced yet, I can certainly see the value in having a single base model for all messages stored in the database, so long as (a) you don't have to use it (addressed below) and (b) there's an easy way to extend that model to include business-specific metadata that you want to store in the database (and I sort of hope that easy way doesn't turn out to be a generic foreign key).

> 2) Make Backends more like apps.  In fact, *make* them django apps.  Backends that handle channels that speak http can just expose views/urls, as Httprouter and Threadless router both do.  Backends should also have various methods that are similar to RSMS app phases.  I'm thinking to begin with:
>
>  - start() - init code
>
>  - poll()/receive()  - this would be a method(s) where backends can place code that should be run periodically and asynchronously.  The backend would provide a best-effort implementation of polling, and it is *the router implementation's responsibility* to call this method often enough to achieve the required result.  This method could also be called synchronously during testing.  Backends that don't need to poll, like an HttpBackend, would just pass on this method.
>
>  - deliverMany() - this would allow a backend to send multiple messages with the same content, if supported, in the case of a mass text, or just send them individually in a loop if unsupported.

I look forward to seeing how this would work in practice.
 
> 3) Use a factory model to allow for different router implementations.  At this point, the "router" just becomes an engine for providing asynchronous calls, kicking off the backends initially, and handling queueing....it could be implemented using threads, celery, rabbit, or any combination thereof.  This would also allow for a test router implementation that does everything synchronously.

It's also hard for me to say what this would look like without seeing the code.  I will say that I have a strong preference for the simplest "router" of all -- a la threadless -- one that has a very short lifetime and only manages the lifecycle of a single inbound or outbound message.  The fact that the current router tries to manage a lot of threads and backends all in one cumbersome package accounts for a significant portion of its usability and reliability issues, IMHO.

> Also worth noting, the DB doesn't *have* to serve as the message queue persistence, as it does with httprouter...the message lifecycle logic could be invoked on an instance of Message without ever save()ing it.

This last point is important and I don't want it to get lost.  Database persistence is definitely something we'd want to be able to turn off in certain cases (both in the lightweight case, e.g., unit testing, and in the highly scaled case).  There are plenty of reasons why you might /not/ want to have a record of all messages ever sent or received by your RapidSMS instance in the database.

Cheers,
Tobias
--
Tobias McNulty, Managing Member
Caktus Consulting Group, LLC
http://www.caktusgroup.com
