Git Materials not updated in downstream pipeline

eran....@riskified.com

Jul 5, 2016, 7:59:03 AM
to go-cd, Haim Ashkenazi
Hi everyone,

I'm using GoCD version 16.3.0.
Pipeline 1 uses git material A. Downstream pipeline 2 uses git material B and is scheduled automatically.
All git materials are set not to poll, and all stages are set to fetch materials.

When pipeline 2 is triggered (automatically) by success of pipeline 1 - git materials in pipeline 2 are NOT updated.



Is the only way to get the material to update to set polling to true and blacklist **/* ?
How should I update my configuration?
I already have many pipelines which use this repository and more which use different branches in the same repository. Am I supposed to manually change the configuration file for all of these pipelines? For all branches?

Is this by design? I really want to understand why this is happening. Shouldn't a new build of the pipeline (2) use an updated version of the material?

Finally - I'd like to suggest a feature that will enable setting of polling for all relevant stages/pipelines. This can either be done through a new 'materials' page which will list the materials and associated stages/pipelines OR by adding this as an option in the warning dialog which appears when a user tries to change polling (and more than one stage is referencing that material)

Aravind SV

Jul 5, 2016, 6:25:37 PM
to go...@googlegroups.com, Haim Ashkenazi
Hello!


On Tue, Jul 5, 2016 at 7:59 AM, <eran....@riskified.com> wrote:
Is the only way to get the material to update to set polling to true and blacklist **/* ?
How should I update my configuration?
I already have many pipelines which use this repository and more which use different branches in the same repository. Am I supposed to manually change the configuration file for all of these pipelines? For all branches?
 

Is this by design? I really want to understand why this is happening. Shouldn't a new build of the pipeline (2) use an updated version of the material?

You mentioned that you've turned off polling. So, those materials are not polling (and hence, are not updated). Why (and how) would you expect a pipeline in GoCD to know that a material has a new commit (and needs to trigger) if polling is turned off? Maybe you're thinking of using the notification API to push instead of poll?

Would help to know more about what you're trying to do. Usually, you set polling to true and use the blacklist (as you mention) in case you want the pipelines to pick the latest materials, but not trigger upon commits to the materials.

Cheers,
Aravind

eran....@riskified.com

Jul 6, 2016, 2:10:22 AM
to go-cd, ha...@riskified.com
Thanks for your reply Aravind,

Maybe I'm missing something here....
The Stage definitions screen has an option 'Fetch Materials' (Perform material updates or checkouts) which I thought should update the materials whenever that stage was triggered.
As I mentioned, the pipeline is triggered by an upstream pipeline, I only need its materials to be updated and this is not happening for some reason.
In fact, this seems to work fine (without polling) in all pipelines which are triggered manually (clicking on the 'Trigger' button). Only, in this case, the pipeline is being triggered by an upstream pipeline.
Since most of my pipelines are triggered manually, I thought I'd save the overhead of polling the git repositories and just have them update when the pipeline is triggered.
So, basically, one needs to set all materials in gocd to be polled and blacklist the ones which should not be triggered automatically?

thanks again,

Aravind SV

Jul 7, 2016, 12:45:01 PM
to go...@googlegroups.com, Haim Ashkenazi
Hello,


On Wed, Jul 6, 2016 at 2:10 AM, <eran....@riskified.com> wrote:
Maybe I'm missing something here....
The Stage definitions screen has an option 'Fetch Materials' (Perform material updates or checkouts) which I thought should update the materials whenever that stage was triggered.
As I mentioned, the pipeline is triggered by an upstream pipeline, I only need its materials to be updated and this is not happening for some reason.
In fact, this seems to work fine (without polling) in all pipelines which are triggered manually (clicking on the 'Trigger' button). Only, in this case, the pipeline is being triggered by an upstream pipeline.
Since most of my pipelines are triggered manually, I thought I'd save the overhead of polling the git repositories and just have them update when the pipeline is triggered.
So, basically, one needs to set all materials in gocd to be polled and blacklist the ones which should not be triggered automatically?

Yes, I'd go for turning on polling and use blacklist (and from 16.6.0, whitelist too). To answer your question about why manual trigger works: There are multiple concepts here, which is what makes it hard to understand. Let me try and explain quickly, with an image:


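[Inline image 2, not preserved: one pipeline with four git materials, g1 to g4. From the cases below: g1 is fully blacklisted, g2 blacklists misc/**/*, g3 has polling turned off, and g4 is polled with no blacklist.]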

Given the above, here are some cases:
  • At time 10:00 AM: Commit g1-c2 happens on g1.
Result: Pipeline does not trigger, because the material is fully blacklisted. But, the server sees that commit and will change its latest known commit to g1-c2.

  • At time 10:05 AM: Commit g2-c2 happens on g2. All changes in the commit affect only misc/abc/README.txt and misc/def/hello.txt
Result: Pipeline does not trigger, because path misc/**/* is blacklisted. But, the server sees that commit and will change its latest known commit to g2-c2.

  • At time 10:10 AM: Commit g3-c2 happens on g3.
Result: Pipeline does not trigger, because polling is turned off (and the server does not see that commit at all).

  • At time 10:15 AM: Commit g4-c2 happens on g4.
Result: Pipeline triggers - with g1-c2, g2-c2, g3-c1 and g4-c2. Since polling is turned off, g3-c2 has not been seen.

  • At time 10:20 AM: Commit g2-c3 happens on g2. Changes in the commit affect misc/abc/README.txt and src/test.rb
Result: Pipeline triggers - with g1-c2, g2-c3, g3-c1 and g4-c2. This is because src/test.rb does not match the blacklist misc/**/*.

  • At time 10:25 AM: Pipeline is manually forced (click on "Play" button)
Result: Pipeline triggers - with g1-c2, g2-c3, g3-c2 and g4-c2. A manual trigger will forcibly update every material it knows of to the latest revision and triggers the pipeline.
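
The cases above can be replayed with a small simulation. This is only a sketch of the behaviour described in this thread, not GoCD's actual implementation: the `Material` and `Server` classes, the simplified glob matcher, and the file name used for the g1 commit are all assumptions made for illustration, and a commit plus the next poll are collapsed into one step.

```python
import re


def compile_glob(pattern):
    """Translate a simplified Ant-style glob into a regex (approximation only)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")     # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


class Material:
    def __init__(self, name, polling=True, blacklist=()):
        self.name = name
        self.polling = polling
        self.blacklist = [compile_glob(p) for p in blacklist]
        self.remote_head = name + "-c1"  # what is really in the repository
        self.known = name + "-c1"        # latest revision the server has seen

    def ignores(self, path):
        return any(rx.fullmatch(path) for rx in self.blacklist)


class Server:
    def __init__(self, materials):
        self.materials = {m.name: m for m in materials}

    def revisions(self):
        return [m.known for m in self.materials.values()]

    def commit(self, name, revision, files):
        """Return the revisions of a triggered run, or None if nothing triggers."""
        m = self.materials[name]
        m.remote_head = revision
        if not m.polling:
            return None              # server never sees the commit
        m.known = revision           # server records it even if blacklisted
        if all(m.ignores(f) for f in files):
            return None              # every changed file is blacklisted
        return self.revisions()

    def manual_trigger(self):
        """A manual trigger refreshes every material before running."""
        for m in self.materials.values():
            m.known = m.remote_head
        return self.revisions()


server = Server([
    Material("g1", blacklist=["**/*"]),
    Material("g2", blacklist=["misc/**/*"]),
    Material("g3", polling=False),
    Material("g4"),
])

print(server.commit("g1", "g1-c2", ["docs/notes.txt"]))  # 10:00 (hypothetical file)
print(server.commit("g2", "g2-c2", ["misc/abc/README.txt", "misc/def/hello.txt"]))
print(server.commit("g3", "g3-c2", ["anything.txt"]))    # 10:10
print(server.commit("g4", "g4-c2", ["anything.txt"]))    # 10:15
print(server.commit("g2", "g2-c3", ["misc/abc/README.txt", "src/test.rb"]))
print(server.manual_trigger())                            # 10:25
```

The first three calls print None (no trigger), and g3 only moves to g3-c2 in the final, manual run.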

Remember that all of the decisions about which materials to use for a pipeline trigger are done on the GoCD server side. That's how it can maintain consistency upon reruns, etc. The "Fetch materials" option affects only the agent side. It's an option which allows the agent to not fetch / checkout / clone the materials at all, when run. By this time, the server has already decided which materials to use. The agent cannot control it.

The "Fetch materials" being false should only be used in cases where you don't care about the material at all. In some cases, you just want a timer trigger for a pipeline and then you decided to fetch some artifact from somewhere and do something with it. In those cases, you don't care about the material that caused this pipeline to trigger. That's when you'd turn off "Fetch materials". In all my years of working on GoCD, I haven't really had to turn it off.

Hope that helps.

Cheers,
Aravind

eran....@riskified.com

Jul 17, 2016, 2:58:14 AM
to go-cd, ha...@riskified.com
Thanks Aravind for the detailed explanation. It really helped us understand the process better.

I'd like to post one last clarification and ask if this is correct:
1. Materials are polled/downloaded on the gocd Server (and transferred to an agent when required). The server fetches new material only on a manual trigger of a pipeline or via polling.
2. The stage setting 'Fetch Materials' refers to fetching from the gocd server (and not the actual SCM)?

So, if polling is not enabled, and the pipeline is triggered e.g. by an upstream pipeline, the agent will retrieve the latest version on the GoCD server.

Am I correct in this?

Also, if you have any tips on reconfiguring many pipelines to use polling with blacklists (we have about 30 which do not currently use polling), I'd really appreciate it.

Finally, just so everyone knows: it seems that GoCD recognises a material repository by its unique URL. So, for git repos at least, g...@github.com:myrepo/repo1 and g...@github.com:myrepo/repo1.git will be treated as separate material repos. Each can have separate polling settings, although both will be pulling from the same GitHub repo.
This might cause problems in troubleshooting issues, if some users define the repo differently. But on the other hand, it may help to change the polling setting when several pipelines are already configured with the same material. Otherwise, one would have to change the setting for all of the pipelines via the config XML.
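
For changing many pipelines at once via the config XML, something along these lines might work. It is a rough sketch only: the sample config, the repository URL, and the function name are hypothetical. It relies on the `autoUpdate` attribute and the `<filter><ignore pattern="..."/></filter>` element of a git material, and it matches on URL alone, so every branch of that repository is changed. Back up the real config before trying anything like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a GoCD config file.
SAMPLE_CONFIG = """\
<cruise>
  <pipelines group="demo">
    <pipeline name="pipeline2">
      <materials>
        <git url="git@example.com:team/repo1.git" branch="develop" autoUpdate="false" />
        <pipeline pipelineName="pipeline1" stageName="build" />
      </materials>
    </pipeline>
  </pipelines>
</cruise>
"""


def enable_polling_with_full_blacklist(root, repo_url):
    """Turn polling on and blacklist everything for each matching git material."""
    changed = 0
    for git in root.iter("git"):
        if git.get("url") != repo_url:
            continue
        git.set("autoUpdate", "true")
        flt = git.find("filter")
        if flt is None:
            flt = ET.SubElement(git, "filter")
        # Add the catch-all ignore only once, so the function can be re-run.
        if not any(ig.get("pattern") == "**/*" for ig in flt.findall("ignore")):
            ET.SubElement(flt, "ignore", {"pattern": "**/*"})
        changed += 1
    return changed


root = ET.fromstring(SAMPLE_CONFIG)
count = enable_polling_with_full_blacklist(root, "git@example.com:team/repo1.git")
print(count, "material(s) updated")
print(ET.tostring(root, encoding="unicode"))
```

Because the match is an exact string comparison, a material defined with a differently spelled URL would be skipped, which is the same identity behaviour described above.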

Thanks again for the assistance,

Aravind SV

Jul 27, 2016, 4:19:17 PM
to go...@googlegroups.com, Haim Ashkenazi
Hello,

Sorry, didn't realize you'd asked a question.


On Sun, Jul 17, 2016 at 2:58 AM, <eran....@riskified.com> wrote:
I'd like to post one last clarification and ask if this is correct:
1. Materials are polled/downloaded on the gocd Server (and transferred to an agent when required). The server fetches new material only on a manual trigger of a pipeline or via polling.

You said: "Materials are polled/downloaded on the GoCD server". That is correct.

You said: "and transferred to an agent when required". This is not correct. Agents also use svn/git, etc. to update the repository they have (or will clone a repository) to the correct revision for that pipeline run. The correct revision is not always the latest.

You said: "The server fetches new material only on a manual trigger of a pipeline or via polling". That is correct too, with an addition that the notify material API can also cause it to happen, if polling is turned off.
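
As a rough illustration of the push approach, here is a sketch that only builds the notification request and does not send it. The endpoint path `/go/api/material/notify/git` and the `repository_url` form field are how I recall the 16.x API documentation, so treat both as assumptions and check the API docs for your version; the server and repository URLs are hypothetical, and authentication is left out.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_notify_request(server_url, repository_url):
    """Build (but do not send) a POST telling the server a git material changed."""
    # Assumed endpoint and field name; verify against your GoCD version.
    endpoint = server_url.rstrip("/") + "/go/api/material/notify/git"
    body = urlencode({"repository_url": repository_url}).encode("ascii")
    return Request(endpoint, data=body, method="POST")


req = build_notify_request("https://ci.example.com", "git@example.com:team/repo1.git")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

A git post-receive hook could send such a request (for example with `urllib.request.urlopen`). The URL sent has to match the material URL exactly as configured in GoCD.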


2. The stage setting 'Fetch Materials' refers to fetching from the gocd server (and not the actual SCM)?

Based on what I said earlier, it is not about fetching from the GoCD server, but it is about the actual SCM. However, the setting refers to the agent-side checkout/clone/update. So, you're partially right.


So, if polling is not enabled, and the pipeline is triggered e.g. by an upstream pipeline, the agent will retrieve the latest version on the GoCD server.

It will retrieve the latest version that the GoCD server is aware of, but not from the GoCD server as mentioned earlier. It'll just be updating its local checkout of the repository to the latest version that the server is aware of, directly from the SCM repository.
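
To make the split of responsibilities concrete, here is a tiny sketch of the agent side. The exact git commands a GoCD agent runs are not confirmed here; the function name and the two commands are assumptions chosen only to show that the server supplies the revision, and the agent gets it straight from the SCM, or skips the step entirely when "Fetch materials" is off.

```python
def agent_update_commands(pinned_revision, fetch_materials=True):
    """Return the git commands an agent might run for one material (illustrative)."""
    if not fetch_materials:
        # "Fetch materials" off: the agent does not touch the repository at all.
        return []
    return [
        ["git", "fetch", "origin"],                    # talk to the SCM directly
        ["git", "reset", "--hard", pinned_revision],   # revision chosen by the server
    ]


for cmd in agent_update_commands("g3-c1"):
    print(" ".join(cmd))
```

Note that the revision passed in is whatever the server decided on, which is not necessarily the repository's newest commit.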



Finally - Just for everyone to know - it seems that gocd recognises a material repository according to its unique URL. So, for git repos at least, g...@github.com:myrepo/repo1 and g...@github.com:myrepo/repo1.git will be treated as separate material repos. each can have separate polling settings although they will both be pulling from the same github repo.
This might cause problems in troubleshooting issues, if some users define the repo differently. But on the other hand, it may help to change the polling setting when several pipelines are already configured with the same material. Otherwise, one would have to change the setting for all of the pipelines via the config XML.

Yes, that is right, at least the way it is now. Truly, the material concept is a global concept and not a pipeline-level concept, and that is how it should have been modeled.
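
To spot materials that point at the same repository under different spellings, a small check like the following could be run over the URLs in the config. It is a sketch under assumptions: the `normalize` rule (drop a trailing slash and a trailing `.git`) is a guess at what counts as "the same repo", and the URLs are hypothetical.

```python
from collections import defaultdict


def normalize(url):
    """Reduce a git URL to a comparison key (assumed rule, not GoCD's)."""
    url = url.strip().rstrip("/")
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url


def find_inconsistent_spellings(urls):
    """Group URLs by normalized form and keep groups with more than one spelling."""
    groups = defaultdict(set)
    for url in urls:
        groups[normalize(url)].add(url)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


urls = [
    "git@example.com:team/repo1",
    "git@example.com:team/repo1.git",
    "git@example.com:team/repo2.git",
]
print(find_inconsistent_spellings(urls))
```

Each group it reports is a set of URLs that GoCD would treat as separate materials, each with its own polling setting.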

Cheers,
Aravind