Multibranch pipeline overwrites GIT_COMMIT when it merges the master branch

Ivan Fernandez Calvo

Feb 28, 2019, 3:18:29 PM
to Jenkins Developers
Hi,

We use multibranch pipelines for our project. They work pretty well, and it is nice to have PRs tested merged with the master branch. We also use Blue Ocean (BO) for our UI, and here we have to use a workaround to keep the UI user-friendly.
Our jobs spin up 100-150 tests in parallel. BO is not really user-friendly with that many parallel stages, so we make groups and launch a downstream job for each group. Up to this point all is OK. The problem starts when you try to pass GIT_COMMIT to the downstream job: if the master branch has changed, the checkout performs a merge, and the resulting commit (the GIT_COMMIT env variable) is not in the repo, so the downstream job cannot check out GIT_COMMIT and it fails.
I can pass GIT_BRANCH, but I'd prefer to pass the commit SHA-1. I started to play with `git rev-list HEAD -4` and `git reflog -6` to see whether they always show the same pattern, so I could grab the correct commit SHA-1, but I am starting to think there should be a better way.
Also, I could disable the merge and do it in the downstream job, but I don't like that either. WDYT?

GIT_PREVIOUS_COMMIT also points to a commit that does not exist in the repo, so I cannot use it.

In this example, the PR head commit is `08da92d31512c4a0c24948bc80f911f83c075070`, but GIT_COMMIT points to `7c36975a772dbab59e20169a0ac01e60fb6db739` and GIT_PREVIOUS_COMMIT to `6cd1da59add5d188cad41b9cbdcd33f4a7241c63`, neither of which is in the repo.

```
using credential token
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/org/repo.git # timeout=10
Fetching without tags
Fetching upstream changes from https://github.com/org/repo.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials GitHub user @elasticmachine User + Personal Access Token
 > git fetch --no-tags --progress https://github.com/org/arepo.git +refs/pull/352/head:refs/remotes/origin/PR-352 +refs/heads/master:refs/remotes/origin/master
Merging remotes/origin/master commit e067dce0923be39f9b48e721240cf26ee083ab9e into PR head commit 08da92d31512c4a0c24948bc80f911f83c075070
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 08da92d31512c4a0c24948bc80f911f83c075070
 > git merge e067dce0923be39f9b48e721240cf26ee083ab9e # timeout=10
 > git rev-parse HEAD^{commit} # timeout=10
Merge succeeded, producing a620f8fbd54f038733b1b238d90c59ccf18f3fc1
Checking out Revision a620f8fbd54f038733b1b238d90c59ccf18f3fc1 (PR-352)
Commit message: "Merge commit 'e067dce0923be39f9b48e721240cf26ee083ab9e' into HEAD"
First time build. Skipping changelog.
> git checkout -f 08da92d31512c4a0c24948bc80f911f83c075070
> git merge e067dce0923be39f9b48e721240cf26ee083ab9e # timeout=10
> git rev-parse HEAD^{commit} # timeout=10
> git config core.sparsecheckout # timeout=10
> git checkout -f a620f8fbd54f038733b1b238d90c59ccf18f3fc1
> git rev-list --no-walk 6cd1da59add5d188cad41b9cbdcd33f4a7241c63 # timeout=10
```

```
[2019-02-28T19:36:16.610Z] export GIT_BRANCH='PR-352'
[2019-02-28T19:36:16.838Z] export GIT_COMMIT='7c36975a772dbab59e20169a0ac01e60fb6db739'
[2019-02-28T19:36:16.838Z] export GIT_PREVIOUS_COMMIT='6cd1da59add5d188cad41b9cbdcd33f4a7241c63'
```

I can grab the commit before HEAD (`HEAD^`), but I am not sure that would always be correct, because if there are no changes on the master branch GIT_COMMIT is already correct.

```
[2019-02-28T19:36:16.838Z] + git reflog -6
[2019-02-28T19:36:16.838Z] a620f8f HEAD@{0}: merge e067dce0923be39f9b48e721240cf26ee083ab9e: Merge made by the 'recursive' strategy.
[2019-02-28T19:36:16.838Z] 08da92d HEAD@{1}: checkout: moving from master to 08da92d31512c4a0c24948bc80f911f83c075070
```

```
[2019-02-28T19:36:16.838Z] + git rev-list HEAD -4
[2019-02-28T19:36:16.838Z] a620f8fbd54f038733b1b238d90c59ccf18f3fc1
[2019-02-28T19:36:16.838Z] 08da92d31512c4a0c24948bc80f911f83c075070
[2019-02-28T19:36:16.838Z] 6646f0310ea4e6e176faa178a6a2ecf3be11168b
[2019-02-28T19:36:16.838Z] 6ff5224c0892801c2632be855d91a38426677ba5
```

Liam Newman

Feb 28, 2019, 10:44:47 PM
to Jenkins Developers
I'm doing some work in this area.  I think I can guess the problem: 

> ... so we make groups and launch a downstream job for each group, at this point all OK, the problem starts when you try to pass the GIT_COMMIT
> to the downstream job if the master branch has changed ...

For a merge job on a PR, GIT_COMMIT is going to be either the head commit of the source branch (as you noted) or the _local_ merge commit on that machine, which definitely won't exist in the repository.

Further, GIT_PREVIOUS_COMMIT is the commit that was run for the previous execution of that job on that agent, so if it was a local merge it is also not going to be in the repository.
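
To see the effect described above, here is a minimal shell sketch that checks whether a commit is reachable from any remote-tracking branch. The helper name `on_remote`, the throwaway repositories, and the branch name `pr` are assumptions made for illustration; they are not part of Jenkins or of the original setup. The check only reflects what the clone last fetched or pushed.

```shell
#!/bin/sh
# Sketch: is a commit contained in any remote-tracking branch of a clone?
set -eu

# Succeeds when commit $2 is contained in a remote-tracking branch of repo $1.
on_remote() {
  [ -n "$(git -C "$1" branch -r --contains "$2" 2>/dev/null)" ]
}

# Throwaway demo: a bare "remote", a clone, and a Jenkins-style local merge.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/clone" 2>/dev/null
g() { git -C "$work/clone" -c user.name=demo -c user.email=demo@example.com "$@"; }
g symbolic-ref HEAD refs/heads/master
g commit -q --allow-empty -m 'initial'
g push -q origin master
g checkout -q -b pr
g commit -q --allow-empty -m 'PR change'
g push -q origin pr
pr_head=$(g rev-parse HEAD)
g checkout -q master
g commit -q --allow-empty -m 'master moved'
g push -q origin master
g checkout -q --detach "$pr_head"      # like: git checkout -f <PR head>
g merge -q --no-edit origin/master     # like: git merge <master commit>
local_merge=$(g rev-parse HEAD)

if on_remote "$work/clone" "$pr_head"; then echo "PR head: on remote"; fi
if ! on_remote "$work/clone" "$local_merge"; then echo "local merge: not on remote"; fi
```

The local merge commit exists only in the workspace where it was created, which is why a downstream job that fetches from the remote cannot check it out.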

I understand your desire to pin the commit on the downstream jobs to match the pipeline, but in the case of merge jobs there isn't a simple solution to achieve that reliably at this time.  Please file a Jenkins JIRA issue with this information so we can think about this scenario.  I'm not sure how we'd address this, but that is a longer discussion.

A couple of possible workarounds/mitigations come to mind: 
1. Use git to get the values yourself - as the first stage in your pipeline, run some git commands in a shell on an agent. These commands would get the head commits from the base and PR branches.  Load that information into variables, something like "GIT_BASE_COMMIT" and "GIT_PR_COMMIT".  Then pass those values into your downstream jobs and merge there.  This would be quite a bit of hacking, but it could work.  (I don't know what the scripts would look like off the top of my head, but I'm sure they are doable.)

2. Use head ("current pull request revision" strategy) to build your PRs instead of merge - this is less than ideal, but it would get you stable builds that don't mutate out from under you.

3. Alternately, as the first stage in your build you could create a merge and push it to a branch, something like "jenkins/pr-352/<job_number>",  then you could pass that branch to your downstream jobs and be confident that they wouldn't change.  It would however mean that your Jenkins would need push access to your repo. 
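
As a rough illustration of workaround 1, the two real commits can be read back from the parents of the local merge commit. This is only a sketch under assumptions: the helper name `merge_parents` is hypothetical, and it relies on the merge order shown in the build log earlier in the thread (PR head checked out first, master merged into it), so parent 1 is the PR head and parent 2 is the base commit. It cannot tell a local merge apart from a PR head that is itself a merge commit, so comparing against the known PR head is still advisable.

```shell
#!/bin/sh
# Sketch: recover the PR head and base commits from a local merge commit.
set -eu

# Print "<pr_commit> <base_commit>" for the commit checked out in repo $1.
# When HEAD has no second parent it is not a merge: print HEAD and "-".
merge_parents() {
  if git -C "$1" rev-parse --verify --quiet 'HEAD^2' >/dev/null; then
    echo "$(git -C "$1" rev-parse 'HEAD^1') $(git -C "$1" rev-parse 'HEAD^2')"
  else
    echo "$(git -C "$1" rev-parse HEAD) -"
  fi
}

# Throwaway demo repository that mimics the merge Jenkins performs.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com "$@"; }
g init -q
g symbolic-ref HEAD refs/heads/master
g commit -q --allow-empty -m 'initial'
g checkout -q -b pr
g commit -q --allow-empty -m 'PR change'
pr_head=$(g rev-parse HEAD)
g checkout -q master
g commit -q --allow-empty -m 'master moved'
base_head=$(g rev-parse HEAD)
g checkout -q --detach "$pr_head"      # like: git checkout -f <PR head>
g merge -q --no-edit "$base_head"      # like: git merge <master commit>

# Split the helper's output into the two variables suggested above.
set -- $(merge_parents "$repo")
GIT_PR_COMMIT=$1
GIT_BASE_COMMIT=$2
echo "GIT_PR_COMMIT=$GIT_PR_COMMIT"
echo "GIT_BASE_COMMIT=$GIT_BASE_COMMIT"
```

Both values exist on the remote, so a downstream job could check out `GIT_PR_COMMIT` and merge `GIT_BASE_COMMIT` itself.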

None of these are what I would call "good" workarounds, but they would unblock you within your current design and the current Jenkins behavior.  Definitely file a JIRA and we can delve further there into how this scenario would be better addressed in the future. 
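
Workaround 3 might look roughly like the following sketch. The helper name `publish_merge` and the throwaway repositories are assumptions for illustration; the branch name follows the "jenkins/pr-352/<job_number>" pattern suggested above, with `BUILD_NUMBER` (set by Jenkins) standing in for the job number and a placeholder of 1 when it is unset.

```shell
#!/bin/sh
# Sketch: publish a local merge commit to a uniquely named branch.
set -eu

# Push the commit checked out in repo $1 to branch $2 on "origin"; print the branch.
publish_merge() {
  git -C "$1" push -q origin "HEAD:refs/heads/$2"
  echo "$2"
}

# Throwaway demo: a bare "remote", a clone, and a Jenkins-style local merge.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/clone" 2>/dev/null
g() { git -C "$work/clone" -c user.name=demo -c user.email=demo@example.com "$@"; }
g symbolic-ref HEAD refs/heads/master
g commit -q --allow-empty -m 'initial'
g push -q origin master
g checkout -q -b pr
g commit -q --allow-empty -m 'PR change'
g push -q origin pr
pr_head=$(g rev-parse HEAD)
g checkout -q master
g commit -q --allow-empty -m 'master moved'
g push -q origin master
g checkout -q --detach "$pr_head"      # like: git checkout -f <PR head>
g merge -q --no-edit origin/master     # like: git merge <master commit>
local_merge=$(g rev-parse HEAD)

build=${BUILD_NUMBER:-1}               # placeholder when run outside Jenkins
branch=$(publish_merge "$work/clone" "jenkins/pr-352/$build")
published=$(git -C "$work/remote.git" rev-parse "refs/heads/$branch")
echo "pushed $local_merge to $branch"
```

Downstream jobs could then be given the branch name (or the now-published commit) instead of a commit that exists only on one agent. These branches would need to be cleaned up at some point.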

Liam Newman 
CloudBees Pipeline Team

Ivan Fernandez Calvo

Mar 1, 2019, 6:23:01 AM
to Jenkins Developers
It is a little odd, but this function does the trick, so now I can use a new environment variable, GIT_BASE_COMMIT, to reference the real commit:

```
def getBaseCommit(){
  // Commit currently checked out in the workspace.
  def latestCommit = sh(label: 'Get latest commit', script: 'git rev-parse HEAD', returnStdout: true)?.trim()
  // First parent of HEAD: the PR head when HEAD is a local merge commit.
  def previousCommit = sh(label: 'Get previous commit', script: 'git rev-parse HEAD^', returnStdout: true)?.trim()
  def baseCommit
  if(env.CHANGE_ID == null || env.GIT_COMMIT == latestCommit){
    // Not a PR build, or GIT_COMMIT matches the checked-out HEAD: use it as is.
    baseCommit = env.GIT_COMMIT
  } else {
    // Otherwise assume HEAD is a local merge commit and use its first parent.
    baseCommit = previousCommit
  }
  env.GIT_BASE_COMMIT = baseCommit
  return baseCommit
}
```

Ivan Fernandez Calvo

Mar 1, 2019, 6:52:06 AM
to Jenkins Developers
I have opened an improvement issue https://issues.jenkins-ci.org/browse/JENKINS-56341

Jesse Glick

Mar 1, 2019, 3:17:22 PM
to Jenkins Dev
On Thu, Feb 28, 2019 at 3:18 PM Ivan Fernandez Calvo
<kuisat...@gmail.com> wrote:
> Our jobs spin 100-150 test in parallel, BO is not really user-friendly with that amount of parallel stages so we make groups and launch a downstream job for each group

Uh, what? You do not need `build` just to run tests in parallel. If
B.O. is not displaying build results in the way you want, stop using
it, or patch it.