restarting/reassigning jobs (wu)


Matias Salimbene

Mar 5, 2025, 11:52:02 AM
to boinc_projects
I'm working on a private project where we've set up a BOINC server with our custom apps. For the most part, everything works fine, but there's one scenario we've been unable to achieve.

Say you have an app and you create a job. Five clients are subscribed to the project, and one of them picks up the job. Because of the business logic we're implementing, some clients cannot execute the job, and should simply return a result stating that they cannot process it. So far so good. But then I need that job to become available again for other clients to pick up.

Is there a "reset job" or similar that I can use so that a finished job (wu) starts over? (I would figure out a way for that client not to pick up the job later, but that can wait for the moment).

I think one way to do it would be to write an assimilator that handles that logic and creates a new job with the same input. But there may be a better way.

Thanks.


Vitalii Koshura

Mar 5, 2025, 12:18:54 PM
to Matias Salimbene, boinc_projects
Hello Matias,

Is it possible to change the logic of your application to return an error in such cases?
The WU will then automatically become available to other clients on the server side.
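
A minimal sketch of what "return an error in such cases" could look like on the application side. It assumes the worker is a Python script whose exit status reaches BOINC unchanged (for example, when run under the BOINC wrapper); the capability check, its feature names, and both helper functions are hypothetical placeholders for the project's own business logic.

```python
import sys

def can_process(host_features, required_features):
    """Hypothetical rule: the host must offer every feature the job requires."""
    return set(required_features) <= set(host_features)

def exit_code_for(host_features, required_features):
    """Return 0 when the job can run here, and a non-zero code otherwise."""
    return 0 if can_process(host_features, required_features) else 1

def main(host_features, required_features):
    code = exit_code_for(host_features, required_features)
    if code != 0:
        # Explain the refusal on stderr so it is visible next to the failed task.
        print("host cannot process this job", file=sys.stderr)
    return code

# In the real worker script the last line would be something like:
#     sys.exit(main(detected_features, job_requirements))
```

The only point is that the refusal surfaces as a non-zero exit status instead of a normal result file.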

Best regards,
Vitalii Koshura


Matias Salimbene

Mar 5, 2025, 12:24:47 PM
to boinc_projects, lestat.d...@gmail.com, boinc_projects, Matias Salimbene
Yes, but does it have to be a particular error? During testing, we've had jobs fail all the time, but they don't get picked up by other clients. They are just marked as "failed" and that's it. Isn't that the normal behaviour?

Vitalii Koshura

Mar 5, 2025, 12:40:14 PM
to Matias Salimbene, boinc_projects
No, nothing special, just non-zero.
Have you configured custom WU replication?

Best regards,
Vitalii Koshura


David P. Anderson

Mar 5, 2025, 9:25:52 PM
to Vitalii Koshura, Matias Salimbene, boinc_projects
The way to do this:
- Create the job with max_total_results and max_error_results set
to values like 5 or 10.
Do this in the input template file:
https://github.com/BOINC/boinc/wiki/JobTemplates
- If the app can't run on a host, it should call boinc_finish(1)
(i.e., exit with an error).

That way, if the job fails on a host, the scheduler will try it on other hosts,
and repeat this until it succeeds or a limit is reached.
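
As a rough illustration of the template step, the sketch below writes an input template carrying the two limits. The element names follow the JobTemplates page linked above. The single input file named "in", the replication values of 1, and the function name are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

def build_input_template(max_error_results=5, max_total_results=10):
    """Return an input template (as a string) with the retry limits set."""
    root = ET.Element("input_template")
    ET.SubElement(root, "file_info")  # one input file, default attributes
    wu = ET.SubElement(root, "workunit")
    file_ref = ET.SubElement(wu, "file_ref")
    ET.SubElement(file_ref, "open_name").text = "in"  # assumed logical file name
    # Assumed replication settings: one instance at a time, quorum of one.
    ET.SubElement(wu, "target_nresults").text = "1"
    ET.SubElement(wu, "min_quorum").text = "1"
    # The limits from the reply above: how many failures and how many
    # instances in total are allowed before the job is given up on.
    ET.SubElement(wu, "max_error_results").text = str(max_error_results)
    ET.SubElement(wu, "max_total_results").text = str(max_total_results)
    return ET.tostring(root, encoding="unicode")

template_xml = build_input_template()
```

The resulting string can be saved into the project's templates directory and passed to job creation in the usual way.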

-- D

Rytis Slatkevičius

Mar 7, 2025, 4:24:31 PM
to Matias Salimbene, boinc_projects

Two options: either return a non-zero status code, or make a custom validator.

In the first scenario, the task will be considered an error and the transitioner will create a new copy. In the second one, your validator can mark the task as invalid and again, the transitioner will create a copy.
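
The retry behaviour in the first scenario can be pictured with a toy model. This is not the transitioner's code, only a sketch of the idea that a failed instance leads to a new copy until one succeeds or a limit is hit. The function name, the outcome labels, and the exact comparisons at the limits are simplifications assumed here.

```python
def run_job(host_exit_codes, max_error_results=5, max_total_results=10):
    """Replay one job against a sequence of hosts.

    host_exit_codes holds the exit status each successive host would return.
    Returns (outcome, instances_used).
    """
    errors = 0
    total = 0
    for exit_code in host_exit_codes:
        total += 1
        if exit_code == 0:
            return ("success", total)  # a host processed the job
        errors += 1
        if errors >= max_error_results or total >= max_total_results:
            return ("failed", total)  # a limit was reached, stop retrying
        # Otherwise a new copy is created and goes to the next host.
    return ("pending", total)  # no more hosts have asked for work yet

# Two hosts refuse the job, the third one completes it.
outcome = run_job([1, 1, 0])
```

With limits left at small defaults, a job that every host refuses ends as failed quickly, which is why the limits need to be raised for this use case.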

Best regards,
Rytis Slatkevičius
+370 670 77777
