[go-cd] Agent failed to do git clone after upgrading gocd


Sachin Gupta

Jan 14, 2022, 9:30:01 PM
to Chad Wilson, go...@googlegroups.com
Hi Team ,

I have upgraded the GoCD server and agent to version 21.4, but since then the agent always fails to do a git clone when running pipelines.

I used the docker-dind GoCD image version 21.4 to build my GoCD agent image. The agent works again if I go back to my previous agent image, which was built from the docker-dind GoCD image version 21.3.

So currently my GoCD server is running version 21.4, but the agents are still using docker-dind version 21.3.

Any suggestions on how to resolve this, so that I can use the latest GoCD server and the GoCD agent image version 21.4 together?

Regards
Sachin

Chad Wilson

Jan 15, 2022, 12:14:33 AM
to Sachin Gupta, go...@googlegroups.com
Hi Sachin

Can you share your custom Dockerfile (partially redacted if necessary), and any volume/bind mounts you might be using on your custom agent image?

It's possible that something you have in your setup is not playing nicely with Alpine 3.15 or git 2.34.1, which the docker-dind images are based on. There are no other major changes between the versions that come to mind as possibly causing a problem.

Other than that, perhaps you can share some lower-level logs of what is going wrong, e.g. start a container and try a clone without the agent actually running, to see whether you can replicate the same problem and we can learn why the clones are failing.

e.g. (replace the image name below with your own container image)

docker run -it --entrypoint '' -e GIT_TRACE=1 <my-custom-agent-image> bash
cd /tmp
git clone <somerepo>

If that is not showing the problem, perhaps you can look at and share any other detail in /godata/logs/go-agent.log from running the real GoCD agent.
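
To compare the two images on the points mentioned above (OS release, git version, ssh client version), something like the following could be run inside each container. This is only a sketch: the function name `print_tool_versions` is made up for illustration, and `/etc/alpine-release` is assumed to exist only on Alpine-based images.

```shell
#!/bin/sh
# Print the versions of the tools most likely to differ between agent images.
# print_tool_versions is a hypothetical helper name, not part of GoCD.
print_tool_versions() {
  if command -v git >/dev/null 2>&1; then
    echo "git: $(git --version)"
  else
    echo "git: not installed"
  fi
  if command -v ssh >/dev/null 2>&1; then
    # ssh -V writes its version banner to stderr, so capture that stream too.
    echo "ssh: $(ssh -V 2>&1)"
  else
    echo "ssh: not installed"
  fi
  if [ -r /etc/alpine-release ]; then
    echo "alpine: $(cat /etc/alpine-release)"
  else
    echo "alpine: /etc/alpine-release not found"
  fi
}

print_tool_versions
```

Running this in both the 21.3-based and the 21.4-based container and comparing the three lines side by side would show which of the underlying tools changed between them.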

-Chad

PS: GoCD agents still bootstrap their core code from the server at startup/registration, so it's really just the bootstrap logic and underlying OS image that is running the "old" version. Good to update them to pull in OS-level patches, tool patches/fixes and agent bootstrapper fixes, but shouldn't cause an overall stability issue.

Sachin Gupta

Jan 17, 2022, 12:45:40 AM
to Chad Wilson, go...@googlegroups.com
Hi Chad,

Thanks for your response. I am not able to clone the repository even manually, even when I use only the docker-dind 21.4 image without adding anything from my Dockerfile.
Yes, we are mounting the repository access key into the image, and I can see the files available in the .ssh directory.
But when I try to clone, it always fails with the error “Permission denied (publickey)”. I have checked the permissions in both the working and non-working images, and they are the same.

I don’t see any logs folder being created inside godata.

Regards 
Sachin

Ketan Padegaonkar

Jan 17, 2022, 4:27:27 AM
to go...@googlegroups.com
On Mon, Jan 17, 2022 at 11:15 AM Sachin Gupta <sachink...@gmail.com> wrote:
But when I try to clone, it always fails with the error “Permission denied (publickey)”. I have checked the permissions in both the working and non-working images, and they are the same.

What are the permissions on the keys, i.e. the output of `ls -al ~/.ssh` when executed as the `go` user?
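
A minimal sketch of how that check might be scripted, for running as the `go` user inside the container. The helper name `list_ssh_perms` is hypothetical, and `stat -c` is assumed to be the GNU or BusyBox variant (as found on Alpine), not the BSD one.

```shell
#!/bin/sh
# List octal mode, owner:group and path for a directory and its direct entries.
# list_ssh_perms is a made-up helper name; "stat -c" assumes GNU or BusyBox stat.
list_ssh_perms() {
  dir="${1:-$HOME/.ssh}"
  # -maxdepth 1 covers the directory itself plus its entries (dotfiles included).
  # stat -L follows symlinks, so a symlinked key reports the target's mode and owner.
  find "$dir" -maxdepth 1 -exec stat -L -c '%a %U:%G %n' {} \;
}

# Show which user the check runs as, then list ~/.ssh if it exists.
echo "running as: $(id -un)"
if [ -d "$HOME/.ssh" ]; then
  list_ssh_perms "$HOME/.ssh"
fi
```

The numeric mode and owner make it easier to compare the working and non-working containers line by line than a screenshot of `ls -al`.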

Sachin Gupta

Jan 17, 2022, 5:06:02 AM
to go...@googlegroups.com
Hi Ketan,

See the screenshot below.



I have verified that these permissions match those on the working agent (21.3).

Regards 
Sachin


--
You received this message because you are subscribed to the Google Groups "go-cd" group.
To unsubscribe from this group and stop receiving emails from it, send an email to go-cd+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/go-cd/CAMUPJd5qEuVd6E%3DED%3DD3j_KjQaSXbwy%3DdTv%3DhYK36dmG%2BVFiZQ%40mail.gmail.com.

Sriram Narayanan

Jan 17, 2022, 5:10:59 AM
to go...@googlegroups.com
Hey, if you add an entry for that remote server to the known_hosts file and then mount that file or otherwise make it available, the SSH connection should succeed.

The error message in that screenshot shows that the entry for the remote server is missing and could not be added either.
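
As a rough sketch of the known_hosts idea above, a small check like this could tell whether an entry for the server is already present. The helper name `known_host_present` is hypothetical, and the check only understands plain (unhashed) entries.

```shell
#!/bin/sh
# Return 0 if HOST appears as a plain (unhashed) name in a known_hosts file.
# known_host_present is a hypothetical helper; hashed entries ("|1|...") are
# not matched by this simple check.
known_host_present() {
  host="$1"
  file="${2:-$HOME/.ssh/known_hosts}"
  [ -r "$file" ] || return 1
  # The first field of each line is a comma-separated list of host names.
  awk -v h="$host" '
    { n = split($1, names, ","); for (i = 1; i <= n; i++) if (names[i] == h) found = 1 }
    END { exit !found }
  ' "$file"
}
```

For example, `known_host_present bitbucket.example.com || ssh-keyscan bitbucket.example.com >> ~/.ssh/known_hosts` would append the server's host keys only when no entry exists (the host name here is a placeholder). Keys fetched this way should be compared against fingerprints published by whoever runs the server, and a file generated once can be mounted alongside the key so that it survives container restarts.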

 
Sachin Gupta

Jan 17, 2022, 5:32:22 AM
to go...@googlegroups.com

Hi Sriram,

It’s not because of that; I already tried it. I get the same error when running version 21.3, yet there the clone works without any issue. Just for info, I am running these two images manually without the GoCD server (I mean not through GoCD pipelines); the only difference is the version of docker-dind.
By right, I am not supposed to add anything to known_hosts, as it will be wiped once the pod restarts or spins up again.

Regards 
Sachin

Sriram Narayanan

Jan 17, 2022, 6:26:37 AM
to go...@googlegroups.com
Could you share the command lines you use to run the 21.3 and the 21.4 containers?

The known_hosts file is certainly not being written to in your case with 21.4, and I'm wondering whether any privilege or permission issues are getting in the way. The GoCD agent version itself is the same as the one provided in other deployment scenarios.
 

Ketan Padegaonkar

Jan 17, 2022, 6:53:37 AM
to go...@googlegroups.com
That is the problem. Your .ssh dir is owned by root, not the `go` user. I think GoCD runs as the `go` user.

image2.jpeg

Sachin Gupta

Jan 17, 2022, 10:34:28 AM
to go...@googlegroups.com, Chad Wilson


Hi Sriram,

Please see the two screenshots below. Both images are built from the same Dockerfile; the only difference is the base image version. The image based on version 21.3 is able to do a git clone, while the one based on version 21.4 is not.



In the above you can see that the clone of the repo starts, but in the below it throws an error straight away and the clone fails.


I am not sure whether it’s an issue with the new Alpine image or with the git version.

Regards 
Sachin

Sriram Narayanan

Jan 17, 2022, 10:59:02 AM
to go...@googlegroups.com, Chad Wilson
Thanks, Sachin.

Two things to try:

1. Are the permissions on the file causing an issue? (What Ketan has suspected):
Also vary the above by introducing some -v and -vv
Also consider explicitly specifying the identity file with the -i option (ssh -i full_path_to_key).

I am curious about the /home/go/.ssh/..data that your earlier screenshot had shown.

My take is that this should not be an issue: the last 'r' (read permission for others) should still let the ssh client within the container access the private key. More importantly, we see the same key fingerprint in both cases.

2. Understanding more:

Could you repeat the failing git clone command with this environment variable set? You can review the output and mask out sensitive information before sharing it here.
GIT_SSH_COMMAND="ssh -vvv" git clone ssh://g...@bitbucket.restofdomain.com target
That increases the SSH client's verbosity and should help us understand the error.
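
Since `ssh -vvv` output is long, a filter along these lines might help narrow it down to the lines that usually explain a public-key failure before sharing. This is a sketch only: `ssh_auth_summary` is a hypothetical helper name, and the pattern list is a selection of messages printed by the OpenSSH client, not an exhaustive one.

```shell
#!/bin/sh
# Keep only the lines that usually explain a public-key authentication failure.
# ssh_auth_summary is a hypothetical helper name. It reads ssh -v output on
# stdin; the patterns are fragments of messages printed by the OpenSSH client.
ssh_auth_summary() {
  grep -E 'Offering public key|Will attempt key|Server accepts key|no mutual signature algorithm|Authentications that can continue|Permission denied|Host key verification failed' || true
}
```

It could be used as `GIT_SSH_COMMAND="ssh -vvv" git clone <repo-url> target 2>&1 | ssh_auth_summary`, where `<repo-url>` stands for the real repository URL. The `2>&1` is needed because the ssh debug output goes to stderr.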

-- Sriram

Chad Wilson

Jan 17, 2022, 11:20:02 AM
to go...@googlegroups.com
This is a good suggestion from Sriram to increase verbosity, since there is an implied upgrade from OpenSSH 8.6p1 to 8.8p1 between these two image versions.

In OpenSSH 8.8p1, RSA signatures using the old, insecure SHA-1 hash algorithm (`ssh-rsa`) were disabled by default. If for some reason your server only allows the old insecure algorithms, it might fail to negotiate. Indeed, Bitbucket Cloud had exactly this problem: https://community.atlassian.com/t5/Bitbucket-articles/OpenSSH-8-8-client-incompatibility-and-workaround/ba-p/1826047 However, the error message they mention there is not one you seem to have received above. It does seem possible that Bitbucket Server also has/had the same problem and needs to be upgraded/patched by the team that operates your Bitbucket (or you have to temporarily employ a workaround similar to the one suggested in that article).

You might get more clarity from the verbose response and then could try those workarounds to confirm the source of the issue if it looks related.
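
For illustration, a workaround of this general kind re-enables the legacy algorithm for a single host in the ssh client config. The sketch below assumes a writable config file; `write_rsa_sha1_workaround` is a hypothetical helper name and the host name in the usage note is a placeholder, while `HostKeyAlgorithms` and `PubkeyAcceptedAlgorithms` are real OpenSSH client options (the latter under that name since OpenSSH 8.5).

```shell
#!/bin/sh
# Append a per-host override that re-enables the legacy ssh-rsa (RSA/SHA-1)
# signature algorithm for one server only. write_rsa_sha1_workaround is a
# hypothetical helper name; the config path defaults to ~/.ssh/config.
write_rsa_sha1_workaround() {
  host="$1"
  config="${2:-$HOME/.ssh/config}"
  cat >> "$config" <<EOF
Host $host
    HostKeyAlgorithms +ssh-rsa
    PubkeyAcceptedAlgorithms +ssh-rsa
EOF
  # Keep the config private to its owner.
  chmod 600 "$config"
}
```

Calling `write_rsa_sha1_workaround bitbucket.example.com` would add the block for that one host. Where `~/.ssh` is mounted read-only, the same options can instead be passed per command, e.g. `GIT_SSH_COMMAND="ssh -o HostKeyAlgorithms=+ssh-rsa -o PubkeyAcceptedAlgorithms=+ssh-rsa" git clone <repo-url>`. Either way this is a temporary measure until the server side is upgraded.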

-Chad

Sachin Gupta

Jan 18, 2022, 2:41:18 AM
to go...@googlegroups.com

Thanks Chad,

Looks like it’s the same issue that you highlighted in the Bitbucket link. I was able to clone after adding the workaround mentioned there.

Thanks, Sriram, for your help and suggestions as well. I managed to find the issue by running ssh -i with verbose output.

Regards,
Sachin

Chad Wilson

Jan 18, 2022, 9:09:03 PM
to go...@googlegroups.com
That's great to hear - thanks for letting us know!
