> I have run millions of tasks without a problem using the redis backend (in fact, I initially wrote that code because I had a problem where it made sense to use millions of tasks and NFS was too slow). With NFS, each task will generate a file on disk and locking/checking will generate a network call to the NFS server, so it can easily overwhelm the process.

I have now migrated all tasks and data in this experiment to a redis backend. `jug status` is now faster at about 3 minutes, although I had hoped for sub-minute calls. But I'm more interested in the performance of workers finding a free task. Do workers use the same mechanism as `jug status` to find ready tasks? Do they iterate through all tasks to check whether one of them is free?
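To get a feel for where those minutes go, a back-of-envelope calculation (the per-call latencies below are my assumptions, not measurements; adjust them to your setup):

```python
# Rough cost of one full pass over all tasks, one lock/result check each.
# Latencies are illustrative assumptions, not measured values.
n_tasks = 1_000_000
rtt_nfs = 1e-3    # assume ~1 ms per NFS round trip
rtt_redis = 1e-4  # assume ~0.1 ms per redis round trip on a LAN

print(f"one pass over NFS:   ~{n_tasks * rtt_nfs / 60:.0f} min")   # ~17 min
print(f"one pass over redis: ~{n_tasks * rtt_redis / 60:.1f} min") # ~1.7 min
```

A full pass stays linear in the number of tasks either way, which would be consistent with the minutes-long `jug status` calls I see.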
`jug status` tells me that 70 of 100 workers are running. While that is of course better than the 80/400 I saw with the NFS backend, 30% of time spent on overhead still seems high. But is that number even meaningful? I guess "running" counts the number of currently locked tasks, but iterating through all tasks and checking whether each is locked can cause "race conditions" between the status process and the worker processes, so that not all actually-running worker processes are detected.
I also tried looking into jug's source code to identify whether I'm doing something wrong. Maybe you can point me in the right direction in the source code? I think [jug.execution_loop](https://github.com/luispedro/jug/blob/master/jug/jug.py#L163) is the right place for the abstract logic of finding a runnable task, which then calculates the task's hash and checks in redis whether the result is already present.
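For reference, here is my stripped-down reading of that loop (a paraphrase, not the actual jug source; the `can_run`/`can_load`/`lock`/`run`/`unlock` names follow the `jug.task.Task` API as far as I understand it):

```python
# My simplified reading of jug's execution loop -- a sketch, not jug's code.
def execution_loop(tasks):
    while tasks:
        for i, t in enumerate(tasks):
            if not t.can_run():   # dependencies not finished yet
                continue
            if t.can_load():      # result already present in the store
                del tasks[i]
                break
            if t.lock():          # try to acquire the distributed lock
                try:
                    t.run()       # compute and store the result
                finally:
                    t.unlock()
                del tasks[i]
                break
        else:
            break                 # a full pass found nothing runnable
```

If that reading is right, every worker pays a scan that is linear in the number of tasks for each task it picks up.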
So I would guess its performance is similar to that of `jug status`? In that case, I think it's reasonable to expect that the overhead of finding a free task grows large when the number of tasks is large. Do you have any idea how I could measure whether this overhead is what I'm seeing?
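One thing I could try (assuming `jug.jug.main()` is the console entry point, which I haven't verified): run a single worker under cProfile and compare the time spent in store/locking calls against the time spent in my own task functions:

```python
# Profile one worker run; 'jugfile.py' is a placeholder for my own script.
# Assumes jug.jug.main() is the CLI entry point and reads sys.argv.
import cProfile
import pstats
import sys

import jug.jug

sys.argv = ['jug', 'execute', 'jugfile.py']
cProfile.run('jug.jug.main()', 'worker.prof')

pstats.Stats('worker.prof').sort_stats('cumulative').print_stats(20)
```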
@Justin:

> Have you tried the --cache option for jug status (except when you change your task topology)? My largest project is about 50k tasks, and for me it cuts it down from ~20 seconds to ~0.7 seconds. I would expect that you'd see a substantial performance benefit... I don't know if there's a way to ask it to use --cache when you run jug execute, though.

I just tried it out, and it cuts the execution time of `jug status` down to 6 seconds. But when comparing `status` with and without `--cache`, `--cache` seems not to refresh the numbers at all: it only shows the counts of ready/finished tasks from when `status --cache` was first run. Also, since `--cache` seems to rely on an sqlite3 database, I guess there is no good way to integrate it with task execution, as sqlite3 isn't really designed for concurrent writes.
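That limitation is easy to demonstrate in isolation (a minimal sketch; `cache.db` is just a throwaway file here, not jug's actual cache): a second connection that tries to write while another one holds the write lock fails immediately:

```python
import sqlite3

# isolation_level=None -> autocommit mode, so BEGIN is issued explicitly;
# timeout=0 -> fail immediately instead of waiting for the lock.
a = sqlite3.connect('cache.db', timeout=0, isolation_level=None)
b = sqlite3.connect('cache.db', timeout=0, isolation_level=None)

a.execute('CREATE TABLE IF NOT EXISTS t (k INTEGER)')
a.execute('BEGIN IMMEDIATE')        # connection a takes the write lock
a.execute('INSERT INTO t VALUES (1)')
try:
    b.execute('BEGIN IMMEDIATE')    # connection b cannot acquire it
except sqlite3.OperationalError as e:
    print('second writer:', e)      # -> "database is locked"
a.execute('COMMIT')
```

So many workers writing to the cache concurrently would mostly serialize or fail, which I assume is why `--cache` is a status-only feature.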