Task queue in Elixir


Stian Håklev

Jul 21, 2015, 10:57:10 PM
to elixir-l...@googlegroups.com
Recently I needed a task queue with retries for my Phoenix project, and to my surprise I couldn't find anything that worked the way I imagined. My main concern was making sure that jobs which fail, for example because of API throttling or temporary service unavailability, are retried until they complete. (Initially I needed to send a lot of emails through Amazon SES, which returns an error if you send more than n per second. Now I am also scraping a wiki site regularly.)

I'm not very experienced with OTP, concurrency, etc., but it was fun trying to put together something minimal. I started out with DETS, but it took me too long to figure out the API, so I went back to Postgres (I wish there was an Ecto wrapper for DETS! :)).

My vision would be something like this:
- You can specify different quality-of-service (QoS) groups in config. For each group you can specify the number of retries, the delay between them (fixed or exponential backoff), the process timeout, and possibly logging and retention. (Currently it deletes all jobs that have completed or exceeded their maximum number of tries. I'd like to be able to keep them for inspection, with a flag so they don't get executed again.)
- Initially I focused on retry counts and so on, but it might be neat to have some built-in throttling, such as at most n requests of type X per minute, or a minimum time between requests.
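
To make the per-group retry settings concrete, here is a minimal sketch of what such a policy could look like as plain data plus two pure functions. It is not taken from the linked repository: the module name RetryPolicy, the group names, the field names, and all numbers are made up for illustration, and throttling is left out.

```elixir
defmodule RetryPolicy do
  # Hypothetical QoS groups; names and values are illustrative only.
  @groups %{
    email: %{max_tries: 5, backoff: :exponential, base_ms: 1_000, timeout_ms: 30_000},
    scrape: %{max_tries: 3, backoff: :fixed, base_ms: 60_000, timeout_ms: 120_000}
  }

  # Look up the settings for a group; raises if the group is unknown.
  def group(name), do: Map.fetch!(@groups, name)

  # Delay before the next attempt, given how many attempts have failed so far.
  # Fixed backoff always waits base_ms.
  def delay_ms(%{backoff: :fixed, base_ms: base}, _failed_tries), do: base

  # Exponential backoff doubles the wait after every failure:
  # base_ms, 2 * base_ms, 4 * base_ms, ...
  def delay_ms(%{backoff: :exponential, base_ms: base}, failed_tries)
      when failed_tries >= 1 do
    base * trunc(:math.pow(2, failed_tries - 1))
  end

  # A job is retried only while it has failed fewer than max_tries times.
  def retry?(%{max_tries: max}, failed_tries), do: failed_tries < max
end
```

Keeping the policy as pure functions means the worker only has to store the number of failed tries with each job and ask the policy whether and when to run it again.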

You can see my current implementation here: https://github.com/houshuang/survey. The model is in web/models/job.ex, and the worker in lib/job_worker.ex. I also have a custom supervisor (lib/job_worker_supervisor.ex), to make sure that if 500 email jobs fail one after another because Amazon is down, my whole Phoenix service won't die. Currently I'm just running a single worker, since my main concern is reliability and not speed. I'm not sure which is the better model: a pool of eight workers that regularly pull from the queue, or a supervisor that spawns a new worker for each job, with a timeout.
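
For comparison, the second model (one short-lived process per job, with a timeout) can be sketched with the standard library's Task.Supervisor. This is only a sketch under assumptions, not how lib/job_worker.ex works: the module name OneShotRunner and the shape of the return values are hypothetical.

```elixir
defmodule OneShotRunner do
  # Runs a single {module, function, args} job in its own supervised process.
  # The job is not linked to the caller, so a crash or a timeout comes back
  # as a return value instead of taking the caller down.
  def run(sup, {m, f, a}, timeout_ms) do
    task = Task.Supervisor.async_nolink(sup, m, f, a)

    # Wait up to timeout_ms for a reply; if none arrives, kill the task.
    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, result} -> {:ok, result}
      {:exit, reason} -> {:error, reason}
      nil -> {:error, :timeout}
    end
  end
end

# Usage: start a task supervisor, then run jobs through it.
{:ok, sup} = Task.Supervisor.start_link()
OneShotRunner.run(sup, {Kernel, :+, [1, 2]}, 1_000)
```

A failed or timed-out job would then be rescheduled by whatever code called run/3, which keeps the retry bookkeeping in one place.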

Anyway, this is what I've got so far. It has been running in production for a week, sending a few thousand emails a day and doing a few thousand other tasks (scraping etc.). If something fairly similar to what I described above already exists, I'd love to hear about it, and I'd be happy to scrap my own code for something more robust (if nothing else, it taught me a lot about what to think about when writing OTP/queue code). If not, I'd love to turn this into a useful stand-alone library for others (and for my own future code). Happy to hear ideas about API design and internals, or from anyone who wants to collaborate.

Stian
PS: Currently the way to use it is to start the supervisor in application.ex and then call Survey.Job.add({m, f, a}). All parameters (retries etc.) are specified in config, but there is only one QoS level, not multiple.
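
The {m, f, a} tuple is the usual module/function/arguments convention, which a worker can execute with apply/3. The snippet below only illustrates that convention. MfaDemo and its functions are hypothetical and are not part of Survey.Job.

```elixir
defmodule MfaDemo do
  # Hypothetical job function; stands in for something like sending an email.
  def greet(name), do: "Hello, " <> name

  # What a worker could do with a stored {m, f, a} tuple:
  # call function f of module m with the argument list a.
  def perform({m, f, a}) when is_atom(m) and is_atom(f) and is_list(a) do
    apply(m, f, a)
  end
end

MfaDemo.perform({MfaDemo, :greet, ["Stian"]})
```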

--
http://reganmian.net/blog -- Random Stuff that Matters

Eduardo Gurgel

Jul 22, 2015, 5:20:39 AM
to elixir-l...@googlegroups.com
I would probably take a look at https://github.com/uwiger/jobs

(I never used this project)

--
Eduardo