re-run a jugfile with different code

Peter

Jul 26, 2013, 5:03:03 AM

to jug-...@googlegroups.com

Hi,

How can I get jug to re-run a task that I've changed? As an example, consider the primes.py example from the documentation. I run it once and all 99 tasks finish. Then I go back and change the function so that it always returns True. The next time I run it, jug says that all the tasks are finished and doesn't run them again. Is there a way to force it to re-run all of the tasks?

Best,

-Peter

Luis Pedro Coelho

Jul 26, 2013, 5:18:34 AM

to jug-...@googlegroups.com

Hi Peter,

Jug does not detect code changes. This is an explicit design decision:
I am afraid that detecting them would discourage people from
redesigning their code, for fear of triggering recomputations.

In this case, you can just remove the cache (the primes.jugdata
directory) completely. This will trigger all recomputations.

You can also use

jug invalidate primes.py --invalid=primes.is_prime

This will selectively invalidate the is_prime() tasks (and any tasks that
depend on them; there are none in this case).
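
The behaviour described above can be illustrated with a toy memoizing cache. This is only a sketch and not jug's implementation: `cache`, `run_task` and `invalidate` are hypothetical names, and the assumption is that stored results are keyed by the function's name and arguments, never by its code.

```python
# Toy illustration only; this is NOT how jug is implemented.
cache = {}  # maps (function name, args) -> stored result

def run_task(func, *args):
    """Return the stored result if present; otherwise compute and store it."""
    key = (func.__name__, args)
    if key not in cache:
        cache[key] = func(*args)
    return cache[key]

def invalidate(name):
    """Drop every stored result produced by the function called `name`."""
    for key in [k for k in cache if k[0] == name]:
        del cache[key]

def is_prime(n):
    return n > 1 and all(n % j for j in range(2, n))

first = run_task(is_prime, 4)    # computed with the original code

def is_prime(n):                 # the "changed" code: always True
    return True

stale = run_task(is_prime, 4)    # same key, so the old result is reused
invalidate("is_prime")
fresh = run_task(is_prime, 4)    # recomputed with the new code
```

Here `stale` still holds the result from the original function, and only after the explicit `invalidate` call does `fresh` reflect the edited code.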

HTH
--
Luis Pedro Coelho | EMBL | http://luispedro.org

Recent stuff:
http://bit.ly/coelho2013-video

Peter

Jul 26, 2013, 6:47:19 AM

to jug-...@googlegroups.com

Hi Luis,

Thanks for the quick and complete answer.

It made me a little curious about your decision not to check for code changes. From your answer, I understand that you are worried that if changing the code leads to re-computations, then people will be afraid of changing their code lest they have to spend the time to redo their computations. Is this right? If so, couldn't it lead to some unpleasant surprises when somebody changes some upstream part of the code and then forgets to explicitly invalidate their results? The way I see it, the results depend as much on the code as on the parameters passed to the code.

I'm just curious what your thoughts about this are. Perhaps you've already addressed this in the documentation somewhere, and I was too lazy to find it?

Thanks again,

-Peter

Luis Pedro Coelho

Jul 26, 2013, 8:43:06 AM

to jug-...@googlegroups.com

Hi,

I don't know if I ever wrote this down explicitly, but here it is (I'll
copy it to the FAQ later):

1) It is very hard to get this right. You can easily check Python code
(with dependencies), but checking into compiled C is harder. If the
system runs any command line programmes you need to check for them
(including libraries) as well as any configuration/datafiles they touch.

You can do this by monitoring the programmes, but it is no longer
portable (I could probably figure out how to do it on Linux, but not
other operating systems) and it is a lot of work.

It would also slow things down. Even if it checked only the Python code:
it would need to check the function code & all dependencies + global
variables at the time of task generation.
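
To make this point concrete, here is a rough sketch of what checking only the Python code might look like, using just the standard library. `code_fingerprint` is a hypothetical helper and not part of jug. It hashes a function's bytecode and constants, and the example shows one thing such a check misses: a change to a global variable that the function reads.

```python
import hashlib

def code_fingerprint(func):
    """Hash the bytecode and constants of `func` (hypothetical helper).

    Simplification: nested functions and imported dependencies are ignored.
    """
    code = func.__code__
    h = hashlib.sha256()
    h.update(code.co_code)
    h.update(repr(code.co_consts).encode())
    return h.hexdigest()

THRESHOLD = 10

def above(n):
    return n > THRESHOLD

before = code_fingerprint(above)
THRESHOLD = 20                   # behaviour changes, the function's code does not
after = code_fingerprint(above)

def above_v2(n):
    return n >= THRESHOLD        # an actual change to the code

same = before == after                           # the global change goes unnoticed
differs = code_fingerprint(above_v2) != before   # the code change is noticed
```

A fuller check would have to follow every function called, every module imported and every global read, which is where the cost at task-generation time comes from.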

2) I was also afraid that this would make people wary of refactoring
their code. If improving your code makes jug recompute 2 hours of
results, then you wouldn't do it.

3) Jug supports explicit invalidation with jug invalidate. This checks
your dependencies.
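
As a sketch of what taking dependencies into account means for invalidation: dropping one task's result also has to drop everything computed from it. The following is a toy model and not jug's code; the `depends_on` graph and the task names are made up for illustration.

```python
# Toy model; the task names and the graph are hypothetical.
depends_on = {
    "load": [],
    "clean": ["load"],
    "stats": ["clean"],
    "plot": ["stats"],
    "other": [],
}

def invalidated_by(target):
    """Return `target` plus every task that depends on it, directly or not."""
    invalid = {target}
    changed = True
    while changed:               # repeat until no new task gets marked
        changed = False
        for task, deps in depends_on.items():
            if task not in invalid and any(d in invalid for d in deps):
                invalid.add(task)
                changed = True
    return invalid

to_rerun = invalidated_by("clean")
```

In this model, invalidating "clean" marks "clean", "stats" and "plot" for recomputation, while "load" and "other" keep their stored results.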

HTH
Luis

Peter

Jul 26, 2013, 9:19:49 AM

to jug-...@googlegroups.com

That makes sense. I'll just invalidate my tasks :)
