How do people run background jobs with their Yesod apps?
For example, every 4 hours: read something from the database, do some
IO action, and update the database content.
I can think of:
a)
Create a separate executable and run it with cron every 4 hours?
This will bypass the database pool, open its own connection, and
be a rather large binary,
as it needs the persistent layer and what-not.
It also won't scale well to multiple servers.
b)
Have the action triggered by visiting a special path, and visit this
path from a cron job.
We need some security here, like a never-expiring admin-user session cookie,
and we have to somehow spawn the action in the background and log errors,
so the cron job isn't left hanging.
I'm not sure how to fork a handler action.
This should perhaps also not run the action directly, but queue it for a
worker process...
But maybe this is a bad strategy.
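For the "fork a handler action" part, the plain-IO shape (outside any
Yesod-specific machinery) might look like this. `spawnJob` is a
hypothetical helper, and the putStrLn logging is a stand-in for a real
logger:

```haskell
import Control.Concurrent (forkIO)
import Control.Exception (SomeException, try)

-- Run an action on a background thread, logging failures instead of
-- letting them disappear (or hang the cron request).  Hypothetical
-- sketch; a real app would log via its logging framework.
spawnJob :: String -> IO () -> IO ()
spawnJob name job = do
  _ <- forkIO $ do
    r <- try job
    case r of
      Left e  -> putStrLn (name ++ " failed: " ++ show (e :: SomeException))
      Right _ -> putStrLn (name ++ " finished")
  return ()  -- the handler can respond to cron immediately
```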
How do people achieve this? Is there some Yesod way of doing it?
And what about the people deploying to Heroku?
Thanks!
-Rune
Thanks for all the answers, and thanks for sharing the code, Arash. It
seems I was forgetting I was coding my webapp in a real language!
That is certainly the easiest approach.
Just need to find a way to stick the runDB inside forkIO.
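Assuming a persistent-style connection pool on the foundation type, the
trick is to capture the pool and run the query against it inside the
forked thread, rather than calling the handler's runDB there. A
self-contained sketch, with a one-slot MVar standing in for the real
pool (real code would use persistent's `runSqlPool` in its place):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Exception (bracket)

-- Toy one-connection "pool"; persistent's real pool is similar in
-- spirit: check a connection out, use it, put it back.
type Pool conn = MVar conn

-- bracket returns the connection even if the job throws.
withConn :: Pool conn -> (conn -> IO a) -> IO a
withConn pool = bracket (takeMVar pool) (putMVar pool)

-- Fork a database job: the pool is shared with the web app, but the
-- work happens on its own thread, so the handler isn't blocked.
forkDbJob :: Pool conn -> (conn -> IO ()) -> IO ()
forkDbJob pool job = do
  _ <- forkIO (withConn pool job)
  return ()
```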
But this still only works with one server, as I don't want N
background processes trying to do the same thing if I have N servers.
The same is probably true with Heroku, which means I guess I'll still
have to have another bloated binary.
> The bloated binary issue can be alleviated by creating a separate cabal
> package for the persistent layer (or at least the background job). Note that
> this does not mean that the application must depend on that package. You
> can use hs-source-dirs or symlinks so that your application can still
> directly compile your persistent layer even though it is used by another
> package.
I'm not entirely sure what you mean here. Care to elaborate?
I haven't tried Heroku yet, but it seems like an interesting choice.
I guess I need to statically compile the binary.
Does that really work? I remember not being able to do that before,
and having to switch to a VPS (I think it was a libmysql problem, which
I'm not using now).
> The bloated binary issue can be alleviated by creating a separate cabal
> package for the persistent layer (or at least the background job). Note that
> this does not mean that the application must depend on that package. You
> can use hs-source-dirs or symlinks so that your application can still
> directly compile your persistent layer even though it is used by another
> package.
Can't you just have two executables within your .cabal file? They may
have different dependencies and different modules, but there's no
problem in having them share the same hs-source-dirs. This sounds a lot
easier to me.
Cheers, =)
--
Felipe.
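As a hypothetical .cabal sketch of that suggestion (all names invented):
both executables compile the same sources directly, so there is no
separate package, and the worker binary simply omits the Yesod
dependencies:

```cabal
executable myapp
  main-is:        main.hs
  hs-source-dirs: ., app
  build-depends:  base, yesod, persistent

executable myapp-worker
  main-is:        worker.hs
  hs-source-dirs: ., app
  build-depends:  base, persistent
```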
I've just decided to make a small project[1] out of this so you can
comfortably run background jobs in any Haskell program, without
dealing with any forkIO code yourself. I have yet to release it on
Hackage, but I'll get to it soon.
I'm posting here in case anybody encounters this old thread and is
interested in solutions.
[1]: https://github.com/Tarrasch/timed-repeating
Cheers,
Arash
2011/10/10 Rune Harder Bak <ru...@bak.dk>:
I intend to put the results of the action into the database, but it
should be easy to extend it with a function
IO () -> IO ()
Another great extension could be to provide a function
memoIO :: (a -> IO b) -> (a -> IO b)
where the new function would check if the result is already there and
relatively recent;
if it is, use that; if not, perform the action (or mark it to be
fetched on the next repeated run).
It should also clean out old results, as the input type a might be infinite.
So it should probably be configurable with the number of hours a result
stays valid, and perhaps what to do when it's old.
And the input type should be restricted to Eq, or more likely Ord, to
allow for fast lookup.
Anyway, thanks for making the library!
-Rune
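A minimal sketch of the memoIO Rune describes, assuming an in-memory
Map cache with a configurable maximum age (this is not part of
timed-repeating; all names here are invented):

```haskell
import Data.IORef
import qualified Data.Map as M
import Data.Time.Clock (NominalDiffTime, diffUTCTime, getCurrentTime)

-- Wrap an IO function with a timestamped cache: reuse a result while
-- it is younger than maxAge, recompute otherwise.  Ord on the input
-- gives the fast Map lookup mentioned above.  Stale entries are only
-- overwritten on reuse, never evicted -- a real version would also
-- prune the map to bound memory.
memoIO :: Ord a => NominalDiffTime -> (a -> IO b) -> IO (a -> IO b)
memoIO maxAge f = do
  cacheRef <- newIORef M.empty
  return $ \x -> do
    now   <- getCurrentTime
    cache <- readIORef cacheRef
    case M.lookup x cache of
      Just (t, y) | diffUTCTime now t < maxAge -> return y
      _ -> do
        y <- f x
        modifyIORef cacheRef (M.insert x (now, y))
        return y
```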
First of all, do you have any good name ideas for this package? I
basically took what was on the top of my head. Do you think the name
should have something to do with caching instead?
On Mon 16 Jan 2012 04:38:08 AM CET, Rune Harder Bak wrote:
>
> Great work, I haven't really looked in to this issue too much since
> last, so I'm happy for a solution!
>
> I intend to put the results of the action into the database, but it
> should be easy to extend it with a function
> IO () -> IO ()
Ok, so you want to provide an `IO a`, and every hour that should be run
and the resulting value passed to a user-provided `a -> IO ()`. You
would provide something that writes to the db, but it could also
write to a file; sounds reasonable.
Or what did you have in mind with an `IO () -> IO ()`?
Also, do you have an opinion on whether we should return an `IORef a`
or an `IO a`? That is,
runEveryHour :: IO a -> IO (IO a)
or
runEveryHour :: IO a -> IO (IORef a)
Both the advantage and the disadvantage is that `IO a` is more expressive,
and in the equivalent case you'd have something like `io = readIORef ref`.
The user isn't interested in whether my library stores the value in a
reference or not; on the other hand, you probably only want a value, so
with `IORef a` you don't have to worry that reading it will launch the
nuclear missiles, while an arbitrary `IO a` could. But of course `IO a`
is more flexible. What do you think?
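The IORef variant could be sketched like this (a hypothetical version,
not necessarily what timed-repeating does); a failed refresh logs and
keeps the previous value:

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Exception (SomeException, try)
import Data.IORef

-- Run the action once for an initial value, then keep refreshing it
-- on a background thread.  The interval is in microseconds (the unit
-- threadDelay uses).
runEvery :: Int -> IO a -> IO (IORef a)
runEvery interval action = do
  ref <- newIORef =<< action
  let loop = do
        threadDelay interval
        r <- try action
        case r of
          Right v -> writeIORef ref v
          Left e  -> putStrLn ("refresh failed: " ++ show (e :: SomeException))
        loop
  _ <- forkIO loop
  return ref

runEveryHour :: IO a -> IO (IORef a)
runEveryHour = runEvery (3600 * 1000 * 1000)
```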
>
> Another great extension could be to provide a function: memoIO :: (a
> -> IO b) -> (a -> IO b)
>
> where the new function would check if the result is already there, and
> relatively recent,
> if it is, use that, if not perform the action (or mark it to be
> fetched on next repeated action).
> It should also clean out old results, as the input type a might be
> infinite.
Hmm, that is interesting; it's worth checking whether this already
exists. But this seems good for other use cases, so maybe it deserves
its own package? It feels like a different thing: there you are probably
seriously worried about your IO function crashing and want to handle
that yourself, as opposed to when you just want a scraper running every
hour.
>
> So it should probably be configurable with number of hours the result
> is valid, and perhaps what to do when it's old.
> and the input type should be restricted to Eq or more likely Ord to
> allow for fast lookup.
Yeah, that's interesting too.
>
> Anyway, thanks for making the library!
>
> -Rune
>
Cheers,
Arash
Yeah, something like that would happen internally in the IO () action,
but I don't see the value in forcing the user to split his action in
two.
I mean, the user might want to stream the input to a file, or who knows what.
But perhaps I'm missing something?
> Also, do you have any opinion if we should return a `IORef a` or an `IO a`.
> that is
>
> runEveryHour :: IO a -> IO (IO a)
> or
> runEveryHour :: IO a -> IO (IORef a)
I liked your earlier example where I could just treat it like a value,
but I might be better off being aware that it actually isn't one.
I've never really touched upon the imperative-style part of Haskell, so
I can't be of much help here.
Perhaps it should be an option.
>> Another great extension could be to provide a function: memoIO :: (a
>> -> IO b) -> (a -> IO b)
>>
>> where the new function would check if the result is already there, and
>> relatively recent,
>> if it is, use that, if not perform the action (or mark it to be
>> fetched on next repeated action).
>> It should also clean out old results, as the input type a might be
>> infinite.
>
> Hmm, that is interesting, it's worth checking if this already exists or not.
> But this seems good for other use cases, maybe it deserves it's own package?
> It seems like a different thing here, as you probably are seriously worried
> about your IO function crashing and want to handle it yourself, as opposed
> to when you want a scraper running every hour.
You could also view your use case as a special case of this more
general pattern,
albeit a simple one, where none of the potential problems really occur.
When I first thought of it I assumed it must exist somewhere, but I
couldn't find any mention of
something like it anywhere. Perhaps I'm using the wrong search terms.
Thinking of the implementation, one could keep a map of values and clean
it up every once in a while, but I don't really know how to keep a fixed
bound on memory.
And this all seems very low-level, like there should be some clever
Haskell-laziness trick doing it for me.
But as I said, I haven't really worked with this kind of code before.
Testing out your code right now!
-Rune
Please give feedback or open any issue if you want. :)
Cheers,
Arash
--
You received this message because you are subscribed to the Google Groups "Yesod Web Framework" group.
To unsubscribe from this group and stop receiving emails from it, send an email to yesodweb+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.