Speed up Jenkins cloning git repositories with Multiple SCM plugin and multi-configuration


Titus Nachbauer

Feb 6, 2015, 5:32:03 AM
to jenkins...@googlegroups.com

We have a Jenkins job that uses Multiple SCM to clone 5 repositories and then builds them using a Gradle script. Two things are slowing us down:

  1. A build is triggered by every change in any of the repositories. That is fine, but it currently means that every time one of the repositories changes, all repositories are cloned again. Is there a way to make sure only the changed repo is cloned?
  2. Since it is a multi-configuration job, the clone is executed twice: once in the parent workspace and once in the configuration workspace. As I understand it, this is the expected behaviour, but is there a way to change this and clone in only one of them, or just copy the cloned workspace?

Also, would there be a way to tell Jenkins to normally pull the repos with a hard reset and only clone when .gitignore changes?

This is a duplicate of my question on Stack Overflow; there does not seem to be much Jenkins activity there. What is the best place to ask professional Jenkins questions?

Mark Waite

Feb 6, 2015, 7:19:17 AM
to jenkins...@googlegroups.com
On Fri, Feb 6, 2015 at 3:32 AM, Titus Nachbauer <tit...@gmail.com> wrote:

We have a Jenkins job that uses Multiple SCM to clone 5 repositories and then builds them using a Gradle script. Two things are slowing us down:

  1. A build is triggered by every change in any of the repositories. That is fine, but it currently means that every time one of the repositories changes, all repositories are cloned again. Is there a way to make sure only the changed repo is cloned?
I don't understand that statement.  When my multi-configuration jobs run (single repository), they only clone a new copy of the repository the first time the build is executed on a particular slave or when the slave workspace is "wiped".  After the first build, subsequent builds fetch only what has changed since the last build.

Can you clarify further?
 
  2. Since it is a multi-configuration job, the clone is executed twice: once in the parent workspace and once in the configuration workspace. As I understand it, this is the expected behaviour, but is there a way to change this and clone in only one of them, or just copy the cloned workspace?
I know of no way to run a multi-configuration job without cloning once in the configuration workspace.

Also, would there be a way to tell Jenkins to normally pull the repos with a hard reset and only clone when .gitignore changes?

The git plugin allows you to ignore commits based on file name patterns or on user name patterns.  You might investigate that option as a way to reduce the number of times a job is started.
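To illustrate the idea of path-based filtering, here is a minimal sketch of the decision such a filter has to make. This is not the git plugin's implementation; the function name `should_trigger` and the example patterns are hypothetical. It assumes a commit should start a build only if at least one changed path falls outside every excluded pattern.

```python
import re
from typing import Iterable


def should_trigger(changed_paths: Iterable[str],
                   excluded_patterns: Iterable[str]) -> bool:
    """Return True if any changed path is NOT covered by an excluded pattern.

    Hypothetical helper for illustration only; the patterns are ordinary
    Python regular expressions matched against the whole path.
    """
    compiled = [re.compile(p) for p in excluded_patterns]
    for path in changed_paths:
        if not any(rx.fullmatch(path) for rx in compiled):
            return True  # this path is relevant, so a build is warranted
    return False  # every changed path was excluded (or nothing changed)
```

With the pattern `docs/.*`, for example, a commit touching only `docs/readme.md` would be ignored, while a commit that also touches `src/Main.java` would start a build.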

If you have a large repository, you may also be able to improve cloning speed significantly by using a "reference" repository.  A reference repository is a bare repository clone of the original repository.  When git is told to use a reference repository, it reuses the content available in the reference repository rather than downloading it.

You may also be able to improve cloning speed by using a "shallow clone".  Both options are available from the git plugin advanced options.  
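Outside Jenkins, the two options correspond roughly to the `--reference` and `--depth` flags of `git clone`. The sketch below only builds the command lines and does not run them; the function names, URLs and paths are hypothetical, and it makes no claim about the exact commands the git plugin issues.

```python
from typing import List, Optional


def bare_clone_command(url: str, reference_dir: str) -> List[str]:
    """One-time setup: create the bare clone that serves as the reference."""
    return ["git", "clone", "--bare", url, reference_dir]


def clone_command(url: str, workspace: str,
                  reference_dir: Optional[str] = None,
                  depth: Optional[int] = None) -> List[str]:
    """Build a `git clone` command using a reference repository or a depth limit."""
    cmd = ["git", "clone"]
    if reference_dir is not None:
        # Reuse objects already present in the local reference repository.
        cmd += ["--reference", reference_dir]
    if depth is not None:
        if depth < 1:
            raise ValueError("depth must be a positive integer")
        # Shallow clone: fetch only the most recent `depth` commits.
        cmd += ["--depth", str(depth)]
    cmd += [url, workspace]
    return cmd
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where git is installed.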

Refer to http://blog.cloudbees.com/2014/09/advanced-git-with-jenkins.html for a blog posting that describes the topic further.
 

This is a duplicate of my question on Stack Overflow; there does not seem to be much Jenkins activity there. What is the best place to ask professional Jenkins questions?

I find this mailing list is the best place to ask Jenkins user questions. There are also several books available which describe Jenkins. There are also companies (like CloudBees) which provide enterprise-licensed and supported versions of Jenkins.

Mark Waite

--
You received this message because you are subscribed to the Google Groups "Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/80d1217d-0b7c-42d2-88c8-40f7138f27b5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Thanks!
Mark Waite

Titus Nachbauer

Feb 6, 2015, 7:50:31 AM
to jenkins...@googlegroups.com
Thank you for the kind answer. I saw many of the git plugin options mentioned here and there, but since I am using the Multiple SCM plugin, there is no place to select these options. Would you know where to find them and if it is possible with this plugin?

About the cloning: I know that in general the git plugin does not actually use clone, but init followed by fetch. In my particular case you are right: it only fetches and checks out the latest changes. However, this is relatively slow. The whole process takes 60 seconds (30 s for the master, 30 s for the workspace), whereas the rest of the build takes 2 s. This is not a huge issue now; I just want to make sure it does not become one later.
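For reference, the init-then-fetch pattern and the "fetch plus hard reset" update can be sketched as plain git command lines. This is an approximation under stated assumptions, not the exact sequence the git plugin runs: the function name `update_commands`, the remote name `origin` and the default branch `master` are all hypothetical choices made for the example.

```python
import os
from typing import List


def update_commands(workspace: str, url: str,
                    branch: str = "master") -> List[List[str]]:
    """Return the git commands needed to bring `workspace` up to date.

    An existing workspace gets an incremental fetch and a hard reset;
    a missing one is populated with init + fetch rather than clone.
    """
    if os.path.isdir(os.path.join(workspace, ".git")):
        return [
            ["git", "-C", workspace, "fetch", "origin"],
            # Discard local modifications and move to the fetched commit.
            ["git", "-C", workspace, "reset", "--hard", f"origin/{branch}"],
        ]
    return [
        ["git", "init", workspace],
        ["git", "-C", workspace, "remote", "add", "origin", url],
        ["git", "-C", workspace, "fetch", "origin"],
        ["git", "-C", workspace, "checkout", "-f", f"origin/{branch}"],
    ]
```

Timing each command separately (for example with `time.perf_counter()` around `subprocess.run`) would show whether the 30 seconds go into the network fetch or into the checkout on disk.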

Mark Waite

Feb 6, 2015, 8:34:42 AM
to jenkins...@googlegroups.com
On Fri, Feb 6, 2015 at 5:50 AM, Titus Nachbauer <tit...@gmail.com> wrote:
Thank you for the kind answer. I saw many of the git plugin options mentioned here and there, but since I am using the Multiple SCM plugin, there is no place to select these options. Would you know where to find them and if it is possible with this plugin?


I don't know how to access all the capabilities of the git plugin from the multiple SCM plugin.
 
About the cloning: I know that in general the git plugin does not actually use clone, but init followed by fetch. In my particular case you are right: it only fetches and checks out the latest changes. However, this is relatively slow. The whole process takes 60 seconds (30 s for the master, 30 s for the workspace), whereas the rest of the build takes 2 s. This is not a huge issue now; I just want to make sure it does not become one later.


Wow, that is impressive.  An incremental update takes 30 seconds per workspace?  That seems like a very long time for an incremental update.

That may indicate that people are pushing large binaries into your git repository. That has been a "bad thing" for me at two different employers. When large binaries are pushed into git repositories, the repositories become bulky and more difficult to manage. Git can support large repositories (thankfully, since I have one that is over 9 GB now), but it works best when it manages source code rather than binaries. The Linux kernel (millions of files, multiple commits per hour, 24 hours a day, 7 days a week) is only a few hundred megabytes, and clones (without checkout) from GitHub to my local Ubuntu machine in about a minute when I use a reference repository.
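One quick way to check for large files is to walk a checked-out workspace and list everything above a size threshold. The sketch below is a hypothetical helper (`large_files` is not part of any Jenkins or git tooling) and it inspects only the current working tree, so binaries that exist only in older commits would not show up.

```python
import os
from typing import List, Tuple


def large_files(root: str, min_bytes: int = 1_000_000) -> List[Tuple[int, str]]:
    """List (size, relative path) for files of at least `min_bytes`, largest first."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own object store; only tracked content is of interest.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not follow symbolic links
            size = os.path.getsize(path)
            if size >= min_bytes:
                found.append((size, os.path.relpath(path, root)))
    return sorted(found, reverse=True)
```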

It might instead indicate that you're on a Windows machine, or on a virtual machine with a slower file system.

Good luck!
Mark Waite
 


Titus Nachbauer

Feb 6, 2015, 9:37:51 AM
to jenkins...@googlegroups.com
On my previous project we indeed had some binaries in the repos, which made cloning a pain. However, we now have very new, very small repos, which contain nothing but the code and some configuration and documentation. So no binaries. I will look into the file system though, because Jenkins is running inside a VM.

About the git settings: too bad. I think I will try to convince the others to consolidate into one repository, because there are many issues related to the multi-repo setup we chose...

Mark Waite

Feb 6, 2015, 11:25:06 AM
to jenkins...@googlegroups.com
On Fri, Feb 6, 2015 at 7:37 AM, Titus Nachbauer <tit...@gmail.com> wrote:
On my previous project we indeed had some binaries in the repos, which made cloning a pain. However, we now have very new, very small repos, which contain nothing but the code and some configuration and documentation. So no binaries. I will look into the file system though, because Jenkins is running inside a VM.

About the git settings: too bad. I think I will try to convince the others to consolidate into one repository, because there are many issues related to the multi-repo setup we chose...


In another thread there is a mention that the new workflow plugin has strong support for multiple repositories.  You might explore workflow (though I gather it is still a little bit on the cutting edge).

Mark Waite
 
