ResponseTimeMonitor kills the node with CloudRetentionStrategy

Stanislav Baiduzhyi

Sep 8, 2015, 4:53:46 AM
to jenkin...@googlegroups.com
Maybe I'm doing something wrong, but yesterday I ran into an interesting situation:

1. CloudRetentionStrategy is used, and the node has been offline for quite some time.
2. During that time, ResponseTimeMonitor still gathers results and
interprets them as -1.
3. A job is queued for the node, so the node starts.
4. ResponseTimeMonitor kicks in, collects the result before the channel
to the node is established, and terminates the node at whatever stage
it happens to be in. That is mostly harmless while the node is still
offline or not yet connected, but yesterday it actually killed a
working node.
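
To make the sequence above concrete, here is a minimal, self-contained sketch of the race. It does not use any Jenkins classes: `Node`, `State`, `naiveCheck` and `guardedCheck` are hypothetical names invented for illustration, and the guard shown is only one possible idea, not necessarily what the linked commit does.

```java
public class MonitorRaceSketch {
    /** Hypothetical node lifecycle states; these are not Jenkins types. */
    enum State { OFFLINE, CONNECTING, ONLINE, TERMINATED }

    /** Hypothetical stand-in for a cloud-provisioned node. */
    static final class Node {
        State state = State.OFFLINE;
        boolean responsive = true;

        /** Returns -1 when no response time can be measured (no channel, or a hung node). */
        long responseTimeMillis() {
            return (state == State.ONLINE && responsive) ? 5 : -1;
        }
    }

    /** Naive check: any -1 reading terminates the node, whatever stage it is in. */
    static void naiveCheck(Node node) {
        if (node.responseTimeMillis() < 0) {
            node.state = State.TERMINATED;
        }
    }

    /** Guarded check (assumption): only judge nodes whose channel is already established. */
    static void guardedCheck(Node node) {
        if (node.state != State.ONLINE) {
            return; // still offline or connecting, so a -1 reading says nothing yet
        }
        if (node.responseTimeMillis() < 0) {
            node.state = State.TERMINATED;
        }
    }

    public static void main(String[] args) {
        Node starting = new Node();
        starting.state = State.CONNECTING;
        naiveCheck(starting);
        System.out.println("naive:   " + starting.state); // prints "naive:   TERMINATED"

        Node alsoStarting = new Node();
        alsoStarting.state = State.CONNECTING;
        guardedCheck(alsoStarting);
        System.out.println("guarded: " + alsoStarting.state); // prints "guarded: CONNECTING"
    }
}
```

In this toy model the guarded check still terminates an online node that has stopped responding, so the monitor keeps its purpose; it only stops acting on readings taken before the channel exists.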

I've forked Jenkins and made some changes to avoid this situation, but
that is a quick and dirty solution. I would like to hear ideas and
recommendations on how to proceed with this, and on how it is expected
to work in general.

My changes: https://github.com/TheIndifferent/jenkins/commit/d8d93ccedd42a36cfe08548671a8b81470f77a1d

Oleg Nenashev

Sep 27, 2015, 4:59:16 PM
to Jenkins Developers, Stephen Connolly, Kanstantsin Shautsou
Added Stephen and Kanstantsin to Cc. If the issue is somehow related to the Cloud API, they are probably the best people to discuss it with. BTW, it makes sense to send reminders if there is no response for a long time.

Regarding the further steps:
* Create an issue for Jenkins core. Seems
* If you have a fix (e.g. the referenced commit), submit a pull request to the main repo. Pull requests are commonly reviewed by Jenkins core devs (unlike the mailing list, unfortunately)

Best regards,
Oleg

On Tuesday, September 8, 2015 at 11:53:46 UTC+3, Stanislav Baiduzhyi wrote:

Kanstantsin Shautsou

Sep 27, 2015, 5:23:30 PM
to Jenkins Developers, scon...@cloudbees.com, kanstan...@gmail.com
Could you provide the Jenkins log and links to the retention strategy and cloud plugin sources?