I need comments on an application that was recently proposed to me. The way it is envisioned at the moment is this:
One Python daemon will listen on the different communication media, such as email, web and SMS (also via web). IMHO, it is necessary to have one daemon per medium. These daemons will only make sure the messages come from a validated source and put them in a DB.
A second(?) Python daemon would wait for those messages to appear in the DB, process them, act according to the objective of the application, and update the DB as expected. This processing might involve complicated and numerous mathematical calculations, which might take seconds or even minutes.
A third(?) Python daemon would be in charge of replying to the original message with the obtained results, but other media channels might be involved, e.g. the message was received from a given email or SMS user, but the results have to be sent to multiple other email/SMS users.
The reason I want to build the application with Django is that all this HAS to have multiple web interfaces and, at the end of the day, most media will come through the web and have to be processed as HTTP requests. Also, Django gives me a framework that keeps this work better organized and clean, and lets me make the application(s) DB agnostic.
Wanting the application to be DB agnostic does not mean that I don't have a choice: I know I have many options for communicating among different Python processes, but I prefer to leave that to the DBMS. Of the open source DBMSs I know of, only Firebird and PostgreSQL have events that can provide the communication between all the processes involved. I was able to create a very similar application in 2012 with Firebird, but this time I am restricted to PostgreSQL, which I don't oppose at all. That application did not involve HTTP requests.
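For what it's worth, the PostgreSQL counterpart of Firebird events is LISTEN/NOTIFY. A minimal sketch of the waiting side, assuming a psycopg2-style connection object (the function name and channel are illustrative, not part of any library):

```python
# Sketch: a daemon blocking on PostgreSQL NOTIFY events.
# Assumes `conn` is a psycopg2-style connection; with psycopg2 you would
# create it with psycopg2.connect(...) and NOTIFY from the writer side with
# cur.execute("NOTIFY messages") after inserting a row.
import select

def wait_for_messages(conn, channel, timeout=5.0):
    """Block until a NOTIFY arrives on `channel`, then return the payloads."""
    with conn.cursor() as cur:
        cur.execute("LISTEN %s;" % channel)  # channel must be a valid identifier
    conn.commit()  # LISTEN takes effect at commit (unless autocommit is on)
    while True:
        # select() wakes us as soon as the server socket has data to read
        if select.select([conn], [], [], timeout) == ([], [], []):
            continue  # timed out with no event; loop and wait again
        conn.poll()  # let the driver process incoming notifications
        if conn.notifies:
            payloads = [n.payload for n in conn.notifies]
            conn.notifies.clear()
            return payloads
```

The daemon then reads the actual rows from the DB; the NOTIFY payload only says "something happened", which matches the division of labor described above.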
My biggest concern at this point is this:
If most (if not all) requests to the application are going to be processed as HTTP requests, what will happen to pending requests when one of them takes too long to reply? Is this something to be solved at the application level or at the server level?
This is as simple as I can put it. Any thoughts, comments, criticism or recommendations are welcome.
Thanks a lot in advance!
I suggest that you use Celery.
If people are making HTTP requests of you, that is reason enough to choose Django.
But do not wait for long calculations to complete before returning an HTTP response. Instead, redirect to a page containing simple JavaScript that polls for a result.
I can assure you that this works well on Linux (you don't mention your platform). I have not used Celery (or Django, for that matter) on Windows or Mac, but I'll bet it runs fine, modulo the usual surprises about file-system differences and the way Windows processes are "special".
Pretty much, you just code in Python. The exceptions are the startup scripts to start/manage the Celery workers at boot time, the Apache/nginx front end for Django, and any additional required communications processes. I guess there is also that small bit of JavaScript to poll for a result.
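The submit-then-poll pattern described above is framework-agnostic. A self-contained sketch of its shape, with a plain thread standing in for the Celery worker and a dict standing in for Celery's result backend (all names here are illustrative, not Celery's API):

```python
# Sketch of "return immediately, let the client poll for the result".
# In the real application, Celery + a broker would run the task and the
# polling view would check AsyncResult; a thread stands in here so the
# example runs anywhere.
import threading
import uuid

results = {}  # task_id -> result; Celery's result backend plays this role

def submit(task, *args):
    """Start the long calculation in the background; return a task id at once."""
    task_id = str(uuid.uuid4())

    def run():
        results[task_id] = task(*args)  # minutes of math could happen here

    threading.Thread(target=run).start()
    return task_id

def poll(task_id):
    """What the page's JavaScript poller would hit: are we done yet?"""
    if task_id in results:
        return {"state": "done", "result": results[task_id]}
    return {"state": "pending"}
```

The HTTP request that accepted the job returns the task id immediately, so the web server never holds a connection open for the duration of the calculation.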
So this is effectively a feed aggregation engine. I would recommend having a separate daemon running per media source, so that issues with one media source do not affect the operations of another.
It would be possible to do everything with one daemon, but it would be much trickier to implement.
> A second(?) Python daemon would wait for those messages to appear in the DB, process them, act according to the objective of the application, and update the DB as expected. This processing might involve complicated and numerous mathematical calculations, which might take seconds or even minutes.
Implementation here is less critical than your workflow design.
This could be implemented as a simple cron script on the host that runs every few minutes. The trick is to determine whether or not a) records have already been processed, b) certain records are currently processing, c) records are available that have yet to be processed/examined. You can use extra DB columns with the data to flag whether or not a process has already started examining that row, so any subsequent calls to look for new data can ignore those rows, even if the data hasn't finished processing.
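The status-column claiming described here can be sketched in a few lines. This uses the stdlib's sqlite3 so it runs anywhere; the table and column names are made up for illustration. With PostgreSQL the same UPDATE works, and `SELECT ... FOR UPDATE SKIP LOCKED` lets several workers claim rows concurrently without blocking each other:

```python
# Sketch: claiming unprocessed rows via a status column, as described above.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        body TEXT,
        status TEXT DEFAULT 'new'  -- 'new' | 'processing' | 'done'
    )""")
conn.executemany("INSERT INTO messages (body) VALUES (?)",
                 [("calc A",), ("calc B",)])

def claim_next(conn):
    """Mark one unprocessed row as 'processing' and return it, or None."""
    with conn:  # select + mark in one transaction, so a row is claimed once
        row = conn.execute(
            "SELECT id, body FROM messages WHERE status = 'new' LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE messages SET status = 'processing' WHERE id = ?",
                     (row[0],))
        return row
```

Each cron run (or daemon wake-up) calls `claim_next()` until it returns None; rows left in 'processing' by a crashed worker are easy to find and retry.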
> The reason I want to build the application with Django is that all this HAS to have multiple web interfaces and, at the end of the day, most media will come through the web and have to be processed as HTTP requests. Also, Django gives me a framework that keeps this work better organized and clean, and lets me make the application(s) DB agnostic.
What do you mean by 'multiple web interfaces'? You mean multiple daemons running on different listening ports? Different sites using the sites framework? End-user browser vs. API?
> Wanting the application to be DB agnostic does not mean that I don't have a choice: I know I have many options for communicating among different Python processes, but I prefer to leave that to the DBMS. Of the open source DBMSs I know of, only Firebird and PostgreSQL have events that can provide the communication between all the processes involved. I was able to create a very similar application in 2012 with Firebird, but this time I am restricted to PostgreSQL, which I don't oppose at all. That application did not involve HTTP requests.
Prefer to leave what to the DBMS? The DBMS is responsible for storing and indexing data, not process management. Some DBMSs may have tricks to perform such tasks, but I wouldn't want to rely on them unless really necessary. If you're going to the trouble of writing separate listening daemons, then they can talk to whatever backend you choose with the right drivers.
> So this is effectively a feed aggregation engine. I would recommend having a separate daemon running per media source, so that issues with one media source do not affect the operations of another.

I never would have thought of this application as a feed aggregation engine, and I'm not really sure it fits the definition, but I will be digging deeper into this.
> It would be possible to do everything with one daemon, but it would be much trickier to implement.

I agree 120%.

> A second(?) Python daemon would wait for those messages to appear in the DB, process them, act according to the objective of the application, and update the DB as expected. This processing might involve complicated and numerous mathematical calculations, which might take seconds or even minutes.
> Implementation here is less critical than your workflow design.

I agree; yet this is the heart of my application. I understand it basically involves only the (web) application and the DBMS, without any other external element, but it is here where the whole shebang happens. That might just be the DB application programmer in me, though.
> This could be implemented as a simple cron script on the host that runs every few minutes. The trick is to determine whether or not a) records have already been processed, b) certain records are currently processing, c) records are available that have yet to be processed/examined. You can use extra DB columns with the data to flag whether or not a process has already started examining that row, so any subsequent calls to look for new data can ignore those rows, even if the data hasn't finished processing.

You gave me half my code there, but I'm not sure I want to trust a cron job for that. I know there are plenty of other options to do the dirty laundry here, such as queues, signals, sub-processes (and others?), but I kinda feel comfortable leaving that communication exchange to the DBMS events as I see it; who would know better that 'something' happened than the DBMS itself?
> The reason I want to build the application with Django is that all this HAS to have multiple web interfaces and, at the end of the day, most media will come through the web and have to be processed as HTTP requests. Also, Django gives me a framework that keeps this work better organized and clean, and lets me make the application(s) DB agnostic.
> What do you mean by 'multiple web interfaces'? You mean multiple daemons running on different listening ports? Different sites using the sites framework? End-user browser vs. API?

A combination of all that and probably a bit more... This is something I left out trying to avoid the TL;DR responses: I'm considering having this app return nothing but JSON or XML for other applications to "feed" from it (there is that feed word again!); there is a myriad of possible ways this application can be used. This, BTW, would leave all the HTML/CSS/JavaScript/etc. "problems" to someone else... It might just be the DB app programmer in me trying to avoid dealing with web issues, or I might just be making things harder for myself; this is something I haven't really thought much about.
> Wanting the application to be DB agnostic does not mean that I don't have a choice: I know I have many options for communicating among different Python processes, but I prefer to leave that to the DBMS. Of the open source DBMSs I know of, only Firebird and PostgreSQL have events that can provide the communication between all the processes involved. I was able to create a very similar application in 2012 with Firebird, but this time I am restricted to PostgreSQL, which I don't oppose at all. That application did not involve HTTP requests.
> Prefer to leave what to the DBMS? The DBMS is responsible for storing and indexing data, not process management. Some DBMSs may have tricks to perform such tasks, but I wouldn't want to rely on them unless really necessary. If you're going to the trouble of writing separate listening daemons, then they can talk to whatever backend you choose with the right drivers.

I understand I'm having the DBMS do some of the process management, but it only goes as far as letting other processes know there is some job to be done, not even what needs to be done. I don't think the overhead on the DBMS is going to be all that big.
This whole application is an idea that's been in my mind for some 7 years now. I even got as far as having a working prototype. I was just starting to learn Python then, and my code is a shameful, non-Pythonic mess. But it worked. I used Firebird as my RDBMS, and all feeds (again?) would come in and out through an ad-hoc Gmail account (with Google Voice for SMS messaging). I would get the input, process it and return the output within 10 to 40 seconds, with the average around 20, which is satisfying if you consider the app is not really controlling the "medium". Of course, I never even considered any heavy testing, as there were many limitations, the 500 outgoing messages per day being just the first one. It just proved my concept, and served as a very good (and long) exercise in Python.
I recently shared my thoughts with some close friends who linger around other branches of (IT-related) knowledge, and they liked the idea; hence the request for your input, for which I feel very much obliged. Thanks a BUNCH!
============ DISCLAIMER! ============
I do not mean to argue against any of the ideas you and the others have shared with me; on the contrary, you have fed my curiosity even more, and curiosity well managed usually turns into knowledge. I can't do anything but thank all of you for that gift.