[JIRA] (JENKINS-61475) Infinite loop in StandardGraphLookupView

campbell.francois@gmail.com (JIRA)

Mar 13, 2020, 11:47:04 AM
to jenkinsc...@googlegroups.com
Francois Campbell created an issue
 
Jenkins / Bug JENKINS-61475
Infinite loop in StandardGraphLookupView
Issue Type: Bug
Assignee: Unassigned
Attachments: jenkins-loop-cpu.png, jenkins-loop-stack-jelly.txt, jenkins-loop-stack.txt, jenkins-loop-threads.png
Components: workflow-api-plugin
Created: 2020-03-13 15:46
Environment: OS: Amazon Linux 2
Jenkins version: 2.223
Java version: 8.212.04-r0
JAVA_OPTS: "-Djava.awt.headless=true -XX:ActiveProcessorCount=48 -Dhudson.slaves.NodeProvisioner.initialDelay=0 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85"
Installed plugins: in description
How Jenkins is accessed: Through an AWS Classic Load Balancer
Jenkins installation method: Docker image jenkins/jenkins:2.223-alpine running on ECS, agents launched on-demand by ECS Plugin and connecting via JNLP
Web browser: Various users affected, all using modern versions of Chrome/Firefox
Labels: pipeline
Priority: Minor
Reporter: Francois Campbell

About once a week, multiple threads handling HTTP requests matching the patterns /blue/rest/organizations/jenkins/pipelines/<folder>/pipelines/<job>/runs/<number>/ or /blue/organizations/jenkins/<folder>%2F<job>/detail/<job>/<number>/pipeline/ get stuck in an infinite loop in StandardGraphLookupView.bruteForceScanForEnd.

Each thread that gets stuck ties up a CPU core, and they tend to get stuck in batches of about 20-30 threads. This makes the Jenkins web UI appear very slow to users. The workaround so far has been to use the JavaMelody monitoring plugin to kill the stuck threads one by one.

I've attached stack traces from two example threads showing that they get stuck in the same place, as well as a snapshot of the CPU usage at the time.
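
The manual step of hunting for stuck threads can be approximated in code. The following is a minimal sketch, not part of Jenkins or JavaMelody: the class `StuckThreadFinder` and its method `findThreadsIn` are hypothetical names, and it only uses the standard `Thread.getAllStackTraces()` call to list threads whose current stack contains a given frame.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Jenkins or any plugin.
class StuckThreadFinder {

    /** Returns the names of live threads whose current stack contains the given class and method. */
    static List<String> findThreadsIn(String simpleClassName, String methodName) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                // Match on the class name suffix so the package prefix does not have to be known.
                if (frame.getClassName().endsWith(simpleClassName)
                        && frame.getMethodName().equals(methodName)) {
                    matches.add(entry.getKey().getName());
                    break; // one matching frame per thread is enough
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Class and method names as reported in this issue.
        for (String name : findThreadsIn("StandardGraphLookupView", "bruteForceScanForEnd")) {
            System.out.println("possibly stuck: " + name);
        }
    }
}
```

A single snapshot cannot tell a busy thread from a stuck one, so comparing two snapshots taken some seconds apart is a safer basis for deciding which threads to kill. The code also has to run inside the JVM being inspected.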

This message was sent by Atlassian Jira (v7.13.12#713012-sha1:6e07c38)

campbell.francois@gmail.com (JIRA)

Mar 13, 2020, 11:50:03 AM
to jenkinsc...@googlegroups.com
Francois Campbell updated an issue
Change By: Francois Campbell
About once a week, multiple threads handling HTTP requests matching the patterns {{/blue/rest/organizations/jenkins/pipelines/<folder>/pipelines/<job>/runs/<number>/}} or {{/blue/organizations/jenkins/<folder>%2F<job>/detail/<job>/<number>/pipeline/}} get stuck in an infinite loop in {{StandardGraphLookupView.bruteForceScanForEnd}}.

Each thread that gets stuck ties up a CPU core, and they tend to get stuck in batches of about 20-30 threads. This makes the Jenkins web UI appear very slow to users. The workaround so far has been to use the JavaMelody monitoring plugin to kill the stuck threads one by one.

I've attached stack traces from two example threads showing that they get stuck in the same place, as well as a snapshot of the CPU usage at the time. There was nothing suspicious going on at the time, and I've been unable to manually reproduce the issue.

I believe something like [https://github.com/francoiscampbell/workflow-api-plugin/compare/master...graph-lookup-concurrent?quick_pull=1] should address the issue, but I'm not sure how to verify that.

campbell.francois@gmail.com (JIRA)

Mar 13, 2020, 11:50:06 AM
to jenkinsc...@googlegroups.com
Francois Campbell updated an issue
About once a week, multiple threads handling HTTP requests matching the patterns {{/blue/rest/organizations/jenkins/pipelines/<folder>/pipelines/<job>/runs/<number>/}} or {{/blue/organizations/jenkins/<folder>%2F<job>/detail/<job>/<number>/pipeline/}} get stuck in an infinite loop in {{StandardGraphLookupView.bruteForceScanForEnd}}.

Each thread that gets stuck ties up a CPU core, and they tend to get stuck in batches of about 20-30 threads. This makes the Jenkins web UI appear very slow to users. The workaround so far has been to use the JavaMelody monitoring plugin to kill the stuck threads one by one.

I've attached stack traces from two example threads showing that they get stuck in the same place, as well as a snapshot of the CPU usage at the time. There was nothing suspicious going on at the time, and I've been unable to manually reproduce the issue.

I believe something like [https://github.com/francoiscampbell/workflow-api-plugin/compare/master...graph-lookup-concurrent?quick_pull=1] should address the issue, but I'm not sure how to verify that.
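
The branch name hints at a concurrency-related change; its contents are not reproduced here, and this thread does not confirm the root cause. As general background, a plain `java.util.HashMap` mutated by several threads without synchronization can have its internal structure corrupted, which is one well-known way for lookups to spin forever. The sketch below shows a thread-safe memoizing cache under that assumption; `ConcurrentLookupCache` and `lookup` are hypothetical names, not the plugin's actual classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache for illustration; not the code of workflow-api-plugin.
class ConcurrentLookupCache<K, V> {

    // ConcurrentHashMap tolerates concurrent readers and writers without external locking.
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    /**
     * Returns the cached value for the key, computing and storing it on a miss.
     * The compute function must not return null, because ConcurrentHashMap rejects null values.
     */
    V lookup(K key, Function<K, V> compute) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        V computed = compute.apply(key);
        // If another thread stored a value first, keep that one so all callers agree.
        V raced = cache.putIfAbsent(key, computed);
        return raced != null ? raced : computed;
    }

    int size() {
        return cache.size();
    }
}
```

Using `get` followed by `putIfAbsent` instead of `computeIfAbsent` means the compute function may occasionally run twice for the same key, but it can safely call back into the cache.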

campbell.francois@gmail.com (JIRA)

Mar 13, 2020, 11:59:02 AM
to jenkinsc...@googlegroups.com
Francois Campbell commented on Bug JENKINS-61475
 
Re: Infinite loop in StandardGraphLookupView

All plugins and their versions:

ace-editor 1.1
amazon-ecs 1.27-SNAPSHOT (private-ea869ac0-francoiscampbell)
analysis-core 1.96
android-lint 2.6
ansicolor 0.6.3
ant 1.11
antisamy-markup-formatter 1.8
apache-httpcomponents-client-4-api 4.5.10-2.0
authentication-tokens 1.3
authorize-project 1.3.0
aws-credentials 1.28
aws-java-sdk 1.11.723
basic-branch-build-strategies 1.3.2
blueocean 1.22.0
blueocean-autofavorite 1.2.4
blueocean-bitbucket-pipeline 1.22.0
blueocean-commons 1.22.0
blueocean-config 1.22.0
blueocean-core-js 1.22.0
blueocean-dashboard 1.22.0
blueocean-display-url 2.3.1
blueocean-events 1.22.0
blueocean-git-pipeline 1.22.0
blueocean-github-pipeline 1.22.0
blueocean-i18n 1.22.0
blueocean-jira 1.22.0
blueocean-jwt 1.22.0
blueocean-personalization 1.22.0
blueocean-pipeline-api-impl 1.22.0
blueocean-pipeline-editor 1.22.0
blueocean-pipeline-scm-api 1.22.0
blueocean-rest 1.22.0
blueocean-rest-impl 1.22.0
blueocean-web 1.22.0
bouncycastle-api 2.18
branch-api 2.5.5
build-timeout 1.19.1
cloudbees-bitbucket-branch-source 2.7.0
cloudbees-folder 6.11.1
cobertura 1.15
code-coverage-api 1.1.3
command-launcher 1.4
conditional-buildstep 1.3.6
credentials 2.3.3
credentials-binding 1.21
datadog 1.0.2
datatheorem-mobile-app-security 2.0.1
display-url-api 2.3.2
docker-commons 1.16
docker-workflow 1.22
durable-task 1.33
embeddable-build-status 2.0.3
external-monitor-job 1.7
fail-the-build-plugin 1.0
favorite 2.3.2
git 4.2.0
git-client 3.2.0
git-server 1.9
github 1.29.5
github-api 1.106
github-branch-source 2.6.0
handlebars 1.1.1
handy-uri-templates-2-api 2.1.8-1.0
htmlpublisher 1.22
http_request 1.8.24
jackson2-api 2.10.2
jacoco 3.0.5
javadoc 1.5
jdk-tool 1.4
jenkins-design-language 1.22.0
jira 3.0.12
job-dsl 1.76
jquery-detached 1.2.1
jsch 0.1.55.2
junit 1.28
ldap 1.21
lockable-resources 2.7
mailer 1.30
mapdb-api 1.0.9.0
matrix-auth 2.5
matrix-project 1.14
maven-plugin 3.4
mercurial 2.8
metrics 4.0.2.6
metrics-datadog 1.0
momentjs 1.1.1
monitoring 1.82.0
naginator 1.18
opsgenie 1.8
pam-auth 1.6
parallel-test-executor 1.12
parameterized-scheduler 0.8
parameterized-trigger 2.36
pipeline-aws 1.39
pipeline-build-step 2.11
pipeline-github-lib 1.0
pipeline-graph-analysis 1.10
pipeline-input-step 2.11
pipeline-milestone-step 1.3.1
pipeline-model-api 1.5.1
pipeline-model-declarative-agent 1.1.1
pipeline-model-definition 1.5.1
pipeline-model-extensions 1.5.1
pipeline-rest-api 2.13
pipeline-stage-step 2.3
pipeline-stage-tags-metadata 1.5.1
pipeline-stage-view 2.13
pipeline-timeline 1.0.3
pipeline-utility-steps 2.5.0
plain-credentials 1.7
postbuildscript 2.9.1
pubsub-light 1.13
rebuild 1.31
resource-disposer 0.14
run-condition 1.2
saml 1.1.5
scm-api 2.6.3
script-security 1.70
simple-theme-plugin 0.6
slack 2.36
sse-gateway 1.22
ssh-credentials 1.18.1
ssh-slaves 1.31.1
strict-crumb-issuer 2.1.0
structs 1.20
throttle-concurrents 2.0.2
timestamper 1.11.1
token-macro 2.11
trilead-api 1.0.5
variant 1.3
windows-slaves 1.6
workflow-aggregator 2.6
workflow-api 2.40
workflow-basic-steps 2.19
workflow-cps 2.80
workflow-cps-global-lib 2.15
workflow-durable-task-step 2.35
workflow-job 2.36
workflow-multibranch 2.21
workflow-scm-step 2.10
workflow-step-api 2.22
workflow-support 3.4
ws-cleanup 0.38

campbell.francois@gmail.com (JIRA)

Mar 13, 2020, 11:59:02 AM
to jenkinsc...@googlegroups.com
Francois Campbell updated an issue
Change By: Francois Campbell
Environment:
OS: Amazon Linux 2
Jenkins version: 2.223
Java version: 8.212.04-r0
JAVA_OPTS: "-Djava.awt.headless=true -XX:ActiveProcessorCount=48 -Dhudson.slaves.NodeProvisioner.initialDelay=0 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85"
Installed plugins: in comment
How Jenkins is accessed: Through an AWS Classic Load Balancer
Jenkins installation method: Docker image jenkins/jenkins:2.223-alpine running on ECS, agents launched on-demand by ECS Plugin and connecting via JNLP
Web browser: Various users affected, all using modern versions of Chrome/Firefox

jjathman@gmail.com (JIRA)

Mar 30, 2020, 10:12:02 AM
to jenkinsc...@googlegroups.com

I believe we are running into the same issue. It began right after we started using version 2.204.6.
