Is it possible to start the workers of queued tasks in the controller?


Phillip

Aug 23, 2015, 6:42:58 PM
to web...@googlegroups.com

9-9-15: The following code was written expecting the scheduler to automatically start processing the queued tasks:


for dataID in dataIDs:
    scheduler.queue_task(ImportData, [dataID], immediate=True, timeout=100)
    # tried without immediate
    # tried db.commit() after the loop and after each queued task


Is anything missing?


Any hints or redirection are appreciated.



To reiterate some notes below, the task function is in a module and the queued tasks need to operate concurrently.
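For context, the scheduler and the module-based task are wired up roughly like this. This is only a sketch: the module name importer.py and the tasks dict registration reflect my setup as I understand it, not necessarily the only way to do it.

    # in models/scheduler.py (sketch)
    from gluon.scheduler import Scheduler
    from importer import ImportData  # task lives in modules/importer.py

    # registering the function in the tasks dict lets worker processes
    # resolve 'ImportData' by name even though it is not defined in a model
    scheduler = Scheduler(db, tasks=dict(ImportData=ImportData))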



____ previous notes _____

edit: Based on Niphlod's response, it appears my main concern is 'probable issues in concurrently writing data to the database.' Could someone clarify the scheduler's limitations for this purpose?



1. My goal is to use the scheduler to concurrently process as many files as possible (I assumed there was no limit to the number of tasks (edit: that can be posted to the scheduler) and that the scheduler would prevent the server from dropping them).

2. My tasks are defined in a module, not a model.

3. A potentially large number of items are selected in the view, which triggers the callback and the process below.

4. In the function ImportData, the database is accessed and records are added (sketched below).
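For reference, the task function has roughly this shape. A sketch only: it assumes a model does current.db = db so the module code can reach the database, and the table name is hypothetical.

    # in modules/importer.py (sketch)
    from gluon import current

    def ImportData(dataID):
        db = current.db  # assumes a model set: current.db = db
        # ... parse the file identified by dataID, then insert records ...
        db.imported_records.insert(source_id=dataID)  # hypothetical table
        db.commit()  # explicit; the scheduler also commits on task success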



Is there anything I am missing to properly and fully utilize the scheduler for this purpose?

Please post any caveats, corrections, or tips, and let me know if any other information is needed.


Thanks,

PV

Niphlod

Aug 26, 2015, 10:47:45 AM
to web2py-users
This is how you queue tasks, and there's nothing wrong with it.
The missing point is your point 1: there IS a limit to the number of concurrent workers, namely however many you start from the command line. And, BTW, even if you queue 10k tasks, you can't expect to start 10k scheduler processes on a single server and see every task completed in a single loop.
web2py's scheduler with a PostgreSQL backend starts to show its limits at around 30~50 workers on modest hardware, i.e. even if you launch more, concurrent execution won't get any faster.
Given that you apparently need to ingest data, and the ingested data probably goes into the same database, there's another likely concurrency penalty to consider.
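You can watch this in the scheduler's own tables: queued rows sit in scheduler_task with status QUEUED until a free worker picks them up, and at any moment at most as many rows are RUNNING as you have workers. A quick sketch to see the distribution:

    # sketch: count scheduler tasks grouped by status
    count = db.scheduler_task.status.count()
    rows = db(db.scheduler_task).select(db.scheduler_task.status, count,
                                        groupby=db.scheduler_task.status)
    for row in rows:
        print("%s: %s" % (row.scheduler_task.status, row[count]))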

On Monday, August 24, 2015 at 12:42:58 AM UTC+2, Phillip wrote:

1. My goal is to use the scheduler to concurrently process as many files as possible (I assume there is no limit to the number of processes and the scheduler will prevent the server from dropping them).

2. My tasks are defined in a module, not a model.

3. It is in the view that a potentially large number of items are selected, which triggers the callback and the process below.

4. In the function ImportData, the database is accessed and records are added.


for dataID in dataIDs:
    scheduler.queue_task(ImportData, [dataID], timeout=100)


Phillip

Sep 9, 2015, 10:56:53 PM
to web...@googlegroups.com
It appears that no workers are being started by queuing the tasks in this manner. If workers can't be started from the controller, or if the code above was expected to start them, please let me know.

Otherwise, sincere thanks for all of your support with web2py.

Niphlod

Sep 10, 2015, 4:40:47 AM
to web2py-users
Workers are external processes that have nothing in common with the web-serving process; otherwise a simple ajax call would do the trick.
Workers need to be started as ADDITIONAL processes, from the command line:

web2py.py -K appname

If you need multiple worker processes, start them with

web2py.py -K appname,appname

etc.
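Once started, each worker registers itself and keeps a heartbeat row alive in the scheduler_worker table, so a quick sanity check from a controller or the web2py shell looks roughly like this (a sketch):

    # sketch: list the workers currently registered with the scheduler
    for w in db(db.scheduler_worker).select():
        print("%s %s %s" % (w.worker_name, w.status, w.last_heartbeat))

If that select comes back empty, no worker process is running and queued tasks will wait forever.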