GoCD native git materials plugin issue


Kelly Salrin

Oct 3, 2022, 10:27:12 AM
to go-cd
Hello!

I'm having issues using the native git materials plugin (non-pluggable SCM) for materials backed by a GitHub repo. It is able to clone the repo successfully to the pod, but saving the YAML config causes GoCD to crash (become non-responsive), and we have to recover from a snapshot.
When downloading the YAML, we noticed that a unique identifier appears under materials (git-dfbebc2). We believe this might be causing GoCD to crash, but I am not sure how this identifier is used, whether we can generate it randomly ourselves in the YAML, or whether it has some significance to the GoCD server.
Does anyone know what this identifier is used for and how, or have any suggestions?

Screen Shot 2022-10-03 at 9.21.34 AM.png

Thank you in advance!

Chad Wilson

Oct 4, 2022, 11:30:21 AM
to go...@googlegroups.com
Hi Kelly

I'm not seeing how these identifiers are likely to cause an issue like this. They are just display names/aliases for a material. Is there a reason you think there is some correlation with these identifiers? Nevertheless, I think they can be whatever you want them to be.

Having said all this, I'm not sure I understand what you mean by "clone the repo sucessfully to the pod" or "when saving the yaml config it will cause gocd to crash" or what you're trying to do here. Perhaps you can write down the exact steps you are taking, what you see in the UI, and what you are trying to achieve.

Some general environment details are also likely required to help
  • GoCD Version
  • Installer/platform type etc
  • What specific action was taken when the server went unresponsive (clicked button X? added config repo Y? triggered build Z?)
  • What you see in the server logs when it goes unresponsive
  • What you see on the UI - are you viewing a YAML config repo? Something else?
  • Clarify what you mean by unresponsive. Pages don't load at all on the dashboard? Builds don't trigger? A build gets stuck on an agent? What is the GoCD server process doing at this time?
-Chad


Kelly Salrin

Oct 4, 2022, 4:43:07 PM
to go-cd
Hi Chad!
Thank you for your response! No problem, I'll try to explain the issue better.

- GoCD Version: 22.1.0
- Installer/platform type: Debian
- Steps taken: using the native git material (non-pluggable SCM), we mounted git credentials into the pod template and ran the pipeline. We then downloaded the pipeline configuration YAML for that pipeline, added it to the configuration, and ran the pipeline again. The GoCD server became completely unresponsive/frozen. We cannot do anything and have to restore.
- Server logs: we cannot find anything in the logs about the issue when it freezes
- Are you viewing a YAML config repo: Yes

If you have any other question let me know and I'll get those answers for you. 

Thank you!

Sifu Tian

Oct 5, 2022, 12:45:52 AM
to go-cd
Hi Chad,

To piggyback off Kelly's inquiry, we leverage the config repo and GitHub is where all of our pipeline configs are stored.  We also use the Pluggable SCM plugin to define our materials along with Kubernetes elastic agents.  What we have noticed is that when we revert from using the pluggable SCM plugin to the Native GIT Materials Plugin (Core to GoCD), there is a unique identifier that is generated for that material (e.g. materials: git-9d3fba6:) just as there is a unique identifier with the pluggable SCM (e.g. scm: 59323383-d746-443d-8c36-xxxxxxxxxxx).

Issue:
Since we already have config YAMLs in our repo that define our pipelines, how is that unique identifier generated? When we create a pipeline manually, define the native git material, and download the YAML, a unique identifier is created.
We copy that unique identifier, paste it into a config YAML, and upload it to our repo. As soon as the GoCD config repo plugin polls and finds the updated YAML, GoCD dies (becomes nonfunctional).
We can't get to the login page; GoCD times out. When we log into the server, the service is running but nothing in the logs indicates what the issue is. Even reverting the change in the YAML does not fix the issue. We end up having to restore from a snapshot. We were thinking that perhaps, in the GoCD database, that unique identifier is tied to a pipeline name or some other reference, so that once we copy it to another YAML it causes a problem even though the settings are the same.

We want to use the core git materials, but we don't know how to generate that unique material identifier and use it in the config repo without GoCD dying.
I've attached some screenshots of the yaml files as a reference.

Screen Shot 2022-10-05 at 12.35.18 AM.png
Screen Shot 2022-10-05 at 12.34.01 AM.png
Screen Shot 2022-10-05 at 12.33.20 AM.png

Chad Wilson

Oct 5, 2022, 3:44:45 AM
to go...@googlegroups.com
Thanks both - the additional context helps.

Can you help clarify what you mean by the pluggable SCM plugin? Are you talking about a specific plugin, e.g the 'git path material plugin' and you would prefer not to use this?

What I am hearing here is that something 'bad' is happening when you use some mechanism to generate/export a YAML config from an existing manually/UI defined pipeline, put it into a YAML config repo and then have GoCD try to parse it and merge the config into your existing manually defined pipelines and existing pipelines defined in config repos.

If everything locks up like this when parsing a config repo, that sounds like a bug :) Unfortunately, to investigate and have a chance at solving it (or providing a workaround) without confusion, we'd probably need a set of simple steps (treat the reader as a dummy as much as possible) to replicate it that aren't dependent on your internal environment/repos. Figuring out the 'simple' case might allow you to experiment with ways to work around it.

Ideally, you'd be able to run an off the shelf test GoCD locally in a container or using the "GoCD test drive" and supply steps, an example config.xml and yaml config that can be used to replicate the lock-up via a GitHub issue.

Random guesses of things to possibly try as workarounds
1) rename those 'unique' generated material names in generated YAML to something of your choosing
2) where the materials are identical (same repo, branch etc) across multiple pipelines try making the material name in YAML the same. Does that change anything?

-Chad

Sifu Tian

Oct 6, 2022, 11:27:53 AM
to go-cd
Hi Chad,

Q: Can you help clarify what you mean by the pluggable SCM plugin? Are you talking about a specific plugin, e.g the 'git path material plugin' and you would prefer not to use this?
A: Yes, the git path material plugin. It's not about preference but about a problem our developers have: the latest commit on our main branch does not represent what is built and deployed.

Q: What I am hearing here is that something 'bad' is happening when you use some mechanism to generate/export a YAML config from an existing manually/UI defined pipeline, put it into a YAML config repo and then have GoCD try to parse it and merge the config into your existing manually defined pipelines and existing pipelines defined in config repos.
A: Yes, this is what's happening. We have also tried exporting the YAML, completely deleting the manual pipeline that was used to generate it, and then using that unique identifier; GoCD still stopped functioning as soon as the config repo parsed the change.

Q: Random guesses of things to possibly try as workarounds
1) rename those 'unique' generated material names in generated YAML to something of your choosing.
A: We want to do this, but we wanted some feedback from the community on whether it is the right approach; that is, whether GoCD does indeed create a unique identifier and whether simply changing it to something we choose will avoid GoCD dying. (I guess we will find out.)
2) where the materials are identical (same repo, branch etc) across multiple pipelines try making the material name in YAML the same. Does that change anything?
A: We tried this and got the same result. GoCD stops functioning after the YAML changes are parsed.

  • The problem with GoCD building and deploying “a commit that doesn't match latest master” is that GoCD is only picking up the latest commit with files that match the pipeline’s watchlist.  GoCD doesn’t pick up the latest overall commit at the time it is evaluating changes, which is the behavior people seem to be expecting.
  • the problem mainly arises when a PR is merged via merge commit and some pipelines only pick up some of the commits in the PR due to watchlist, not all of the commits in the PR.
  • choosing the “latest matching commit” is the intended functionality of the git GoCD plugin we’re currently using (https://github.com/TWChennai/gocd-git-path-material-plugin), and there doesn’t appear to be a way to configure it differently.  in fact, that behavior seems to be the sole reason for that plugin’s existence.
  • one option might be to use the standard GoCD materials GIT triggering functionality instead of this plugin.  unclear, but it seems like that might achieve the behavior we want, which is to trigger using the latest commit if anything has changed in the watchlist since the last run.  (apparently, there’s some work that needs to be done to get the standard functionality to work, so it can’t be easily tested atm.)
  • another option would be to ensure each pipeline’s watchlist includes all dependencies for the pipeline.  e.g., the reason that some pipelines failed is that their watchlists didn’t include other package factories which are needed for some of the failed pipeline tests.
Screen Shot 2022-10-05 at 10.57.51 AM.png
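
The difference between the two triggering behaviours described in the bullets above can be illustrated with a small Python sketch. This is a simplified model only, not the plugin's or GoCD's actual code: the commit list, the file paths, the prefix-based watchlist matching, and both function names are made up for illustration.

```python
def touches(files, watchlist):
    """True if any changed file falls under a watched path prefix."""
    return any(f.startswith(prefix) for f in files for prefix in watchlist)


def latest_matching_commit(commits, watchlist):
    """Behaviour described for the git path material plugin:
    build the newest commit that itself touches the watchlist."""
    for sha, files in reversed(commits):
        if touches(files, watchlist):
            return sha
    return None


def head_if_watchlist_changed(commits, watchlist, last_built):
    """Behaviour the post hopes for: if anything watched changed since the
    last build, build the newest commit overall."""
    shas = [sha for sha, _ in commits]
    new = commits[shas.index(last_built) + 1:] if last_built in shas else commits
    if any(touches(files, watchlist) for _, files in new):
        return commits[-1][0]
    return None


# Hypothetical history, oldest first: (sha, changed files).
COMMITS = [
    ("a1", ["service/app.py"]),
    ("b2", ["service/handler.py"]),
    ("c3", ["shared/factory.py"]),
]

print(latest_matching_commit(COMMITS, ["service/"]))
print(head_if_watchlist_changed(COMMITS, ["service/"], "a1"))
```

With the watchlist `service/`, the first function stops at `b2` while the second moves on to `c3`, which is the mismatch with "latest master" described above.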

Chad Wilson

Oct 6, 2022, 12:31:04 PM
to go...@googlegroups.com

> 1) rename those 'unique' generated material names in generated YAML to something of your choosing. We want to do this but we wanted some feedback from the community on if this is the right approach. Meaning that GoCD indeeds creates a unique identifier and just by changing it to something we choose will avoid GoCD dying.  (I guess we will find out)

Each config repo plugin handles how to export its configuration from GoCD native API format to its own syntax. In the YAML config plugin case, random names are generated if you don't define your own friendly name for the material in your pipeline config. This is because names for materials in the pipeline context are mandatory in the YAML design, yet optional in GoCD config itself.

Since they are optional, the names shouldn't be important or cause any issue like this, but if you think it's relevant you can try and experiment, either by setting the names on the UI before export or editing them afterwards. If you pre-configure material names in the UI before exporting/downloading the YAML you should see it using the names you specified.
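
For the "editing them afterwards" route, a rough Python sketch along these lines could find and rename the generated-looking keys in an exported YAML file. It is only a sketch under assumptions: the `git-` plus seven hex characters pattern is inferred from the two examples in this thread (`git-dfbebc2`, `git-9d3fba6`), the exported fragment is abridged, and the pipeline name, repo URL and the `app-repo` name are made up.

```python
import re

# Assumption: generated names look like "git-" followed by 7 hex characters,
# based only on the examples quoted in this thread.
GENERATED_KEY = re.compile(r"^[ \t]+(git-[0-9a-f]{7}):[ \t]*$", re.MULTILINE)


def find_generated_names(yaml_text):
    """Return material keys that look auto-generated, in order of appearance."""
    return GENERATED_KEY.findall(yaml_text)


def rename_material(yaml_text, old, new):
    """Rename one material key, leaving indentation and other lines untouched."""
    key_line = re.compile(
        rf"^([ \t]*){re.escape(old)}:[ \t]*$", re.MULTILINE
    )
    return key_line.sub(lambda m: f"{m.group(1)}{new}:", yaml_text)


# Hypothetical, abridged export; names and URL are illustrative only.
EXPORTED = """\
pipelines:
  example-pipeline:
    group: example
    materials:
      git-dfbebc2:
        git: https://github.com/example/repo.git
        branch: main
"""

renamed = rename_material(EXPORTED, "git-dfbebc2", "app-repo")
print(find_generated_names(EXPORTED))
print(renamed)
```

Renaming one key at a time avoids accidentally giving two materials in the same pipeline the same name.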

image.png


> 2) where the materials are identical (same repo, branch etc) across multiple pipelines try making the material name in YAML the same. Does that change anything? We tried this and the same result happens.  GoCD stops functioning after the yaml changes are parsed.

In any case, for me personally to figure out what is happening here, I'd need either some sort of magical brainwave as to the likely problem area, or a lot of spare time to try random things -- or, most preferably, a way to easily replicate this in some sort of simple configuration as noted earlier.

If you cannot replicate it, the closest thing that might get me to a brainwave I can think of would be
  1. trigger the problem, GoCD server hangs
  2. assuming you are on Linux, find the java process for the GoCD server that is hung.
  3. observe GoCD's CPU usage. Is it using a lot of CPU, or basically idle?
  4. Send kill -QUIT <pid> to the process. This won't actually kill a Java process (unless it's in such a bad state that trying to dump debug data causes it to hang).
  5. You should see a massive set of stack traces be logged to logs/go-server.log starting with "Full thread dump OpenJDK" and ending with a line containing "class space". If you can share the full dump output stack traces somewhere (would need all of them), it might have some clues about what is blocked/hung.
But steps to reproduce are almost certainly better :-)
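
Steps 4 and 5 above could be scripted roughly as follows in Python. This is a sketch under assumptions: it runs on Linux as a user allowed to signal the GoCD server process, the two marker strings are taken from the description above, and both function names are made up.

```python
import os
import signal

START_MARKER = "Full thread dump"  # first line of the dump, per step 5
END_MARKER = "class space"         # last line of the dump, per step 5


def request_thread_dump(pid):
    """Send SIGQUIT, the same signal `kill -QUIT <pid>` sends.

    A JVM normally responds by logging a thread dump and keeps running.
    """
    os.kill(pid, signal.SIGQUIT)


def extract_last_thread_dump(log_text):
    """Return the lines of the most recent thread dump in log_text, or []."""
    lines = log_text.splitlines()
    start = None
    for i, line in enumerate(lines):
        if START_MARKER in line:
            start = i  # keep overwriting so the newest dump wins
    if start is None:
        return []
    for j in range(start, len(lines)):
        if END_MARKER in lines[j]:
            return lines[start:j + 1]
    return lines[start:]  # end marker not written yet; return what exists
```

A typical use would be to call `request_thread_dump(pid)` with the PID found in step 2, wait a few seconds, then read logs/go-server.log and pass its contents to `extract_last_thread_dump` so only the dump itself needs to be shared.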

While it's an interesting problem with some trade-offs on either side, I'm leaving aside the issue of whether the Git Material or Git Path Material is best for your use case here; would prefer to keep this thread focused on the actual bug/problem that both you and Kelly seem to be having with the manual configure >> export-to-yaml >> import config repo flow you both are attempting.

-Chad


Sifu Tian

Oct 7, 2022, 12:38:50 PM
to go-cd
Hi Chad,

Your suggestion to specify the material name did indeed resolve our issue.  We can use that name on all of our pipelines and it works as expected.  We also confirmed that not specifying the material name, and so letting GoCD generate the unique ID for the material, does kill GoCD when that ID is used in a config repo YAML.

Problem solved, and thanks so much for your help!