Killed Job but it still ran

Roger McCarrick

Jun 22, 2023, 1:15:50 PM
to rundeck-discuss
Rundeck Community 4.1.0
2 Linux frontend servers behind Netscaler
Both point to same SQL database.
Database is in Microsoft SQL 2019 Availability Group.

We have a job that pulls requests from a database, then schedules a job for each request at some time in the future, maybe hours from now, maybe days from now.

Sometimes we get a call to cancel the request. So we go into Rundeck and cancel that scheduled job. We take a screenshot of the cancelled job. We see the little grey circle that indicates "aborted". But three times the job has gone ahead and run anyway.

We tested scheduling a job, killing it, and quickly failing over the database to see whether the job would still show as "scheduled" in the secondary DB. But it showed as killed, so we have had no luck reproducing this.

Anyone have any ideas?

thanks

rac...@rundeck.com

Jun 22, 2023, 2:53:17 PM
to rundeck-discuss
Hi Roger,

Do you see any clues in the service.log file from the time of the issue?

Greetings.

Roger McCarrick

Jun 22, 2023, 4:52:23 PM
to rundeck-discuss
The job was killed on Tuesday, June 13, 2023, at 5:35:58 PM CST, so that's 3:35 EST. And we always take a screenshot:
9323e578-0082-4379-a966-43f5a600a51e.jpg
15c418a3-b175-4e83-835f-47dd6938fc5e.jpg

Then, two days later, in the logs:

rundeck.executions.log.2023-06-15.gz:
[2023-06-15T16:59:00,455] INFO  execution.status srv-rundeck start [63079:aborted] ULM_Tasks srv-rundeck/adm-gwood "Removes/Disable-User  -requestid 34190 -RDRunNow NO -RDSkipNotify NO -RDNotifyAddress NONE"[4ff0da5c-e68a-41f2-a207-7560349ef9bc]
[2023-06-15T17:02:39,913] INFO  execution.status srv-rundeck finish [63079:failed] ULM_Tasks srv-rundeck/adm-gwood "Removes/Disable-User  -requestid 34190 -RDRunNow NO -RDSkipNotify NO -RDNotifyAddress NONE"[4ff0da5c-e68a-41f2-a207-7560349ef9bc]

rundeck.log.2023-06-15.gz:
[2023-06-15T17:02:39,863] ERROR services.ExecutionUtilService [quartzScheduler_Worker-9] - Execution failed: 63079 in project ULM_Tasks: [Workflow result: , step failures: {2=Dispatch failed on 1 nodes: [NYTXNODE: Unknown: java.lang.NullPointerException: Cannot get property 'success' on null object]}, Node failures: {NYTXNODE=[Unknown: java.lang.NullPointerException: Cannot get property 'success' on null object]}, status: failed]

service.log.2023-06-15.gz
[2023-06-15T17:02:39,863] ERROR services.ExecutionUtilService - Execution failed: 63079 in project ULM_Tasks: [Workflow result: , step failures: {2=Dispatch failed on 1 nodes: [NYTXNODE: Unknown: java.lang.NullPointerException: Cannot get property 'success' on null object]}, Node failures: {NYTXNODE=[Unknown: java.lang.NullPointerException: Cannot get property 'success' on null object]}, status: failed]

The job says it failed, but it succeeded enough to do what we didn't want it to do: disable a user's account.

So it's curious that the first log entry above says "execution.status srv-rundeck start [63079:aborted]".
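
To check whether the other two incidents left the same trace, a small script could scan the execution logs for "start" events whose status is already "aborted". This is only a rough sketch: the regular expression is modeled on the two rundeck.executions.log lines quoted above, and the function name find_aborted_starts is made up for illustration. Other Rundeck versions or logging layouts may need a different pattern.

```python
import re

# Pattern modeled on the rundeck.executions.log lines quoted in this thread.
# Assumption: other installations may log a different layout.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+INFO\s+execution\.status\s+\S+\s+"
    r"(?P<event>start|finish)\s+\[(?P<id>\d+):(?P<status>[^\]]+)\]"
)


def find_aborted_starts(lines):
    """Return (timestamp, execution id) for each 'start' event logged as aborted."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("event") == "start" and m.group("status") == "aborted":
            hits.append((m.group("ts"), int(m.group("id"))))
    return hits


# The two log lines from this thread, cut off after the project name.
sample = [
    "[2023-06-15T16:59:00,455] INFO  execution.status srv-rundeck start [63079:aborted] ULM_Tasks",
    "[2023-06-15T17:02:39,863] INFO  execution.status srv-rundeck finish [63079:failed] ULM_Tasks",
]

for ts, exec_id in find_aborted_starts(sample):
    print(f"execution {exec_id} started at {ts} while already marked aborted")
```

For the rotated .gz files, the lines could be read with gzip.open(path, "rt") and passed to the same function.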

thanks
Roger