GSoC 2020 Idea: Improve git checkout performance by removing redundant fetch


Mark Waite

Dec 26, 2019, 3:42:47 PM
to jenkinsci-gsoc-all-public
Improve git checkout performance by removing redundant fetch

The Jenkins git plugin clones remote git repositories into Jenkins workspaces on agents. The clone is performed by creating an empty local git repository, then configuring it with 'git config' and populating it with 'git fetch'. Unfortunately, the most commonly used path through the code calls 'git fetch' twice.

The second call to 'git fetch' is useless when it is using the same arguments as the first call. It wastes server time, network bandwidth, and job time. With large repositories, that waste of time may be a minute or more.

The second call to 'git fetch' could be removed in those cases where the initial fetch uses the same arguments as the second fetch.

Implement changes in the plugin so that it skips the second call to 'git fetch' if the second call would use the same arguments as the first call.
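
As a rough illustration of the idea, here is a minimal sketch in Java of remembering the arguments of the previous fetch and skipping a repeat that would use equal arguments. The names FetchRequest, FetchDeduplicator, and fetchIfNeeded are hypothetical and are not part of the git plugin or the git client plugin; the set of arguments compared (remote URL, refspecs, tags flag) is also an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical description of one 'git fetch' call; not a git plugin class. */
final class FetchRequest {
    private final String remoteUrl;
    private final List<String> refspecs;
    private final boolean fetchTags;

    FetchRequest(String remoteUrl, List<String> refspecs, boolean fetchTags) {
        this.remoteUrl = Objects.requireNonNull(remoteUrl);
        // Defensive copy so later changes to the caller's list cannot alter the comparison.
        this.refspecs = Collections.unmodifiableList(new ArrayList<>(refspecs));
        this.fetchTags = fetchTags;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof FetchRequest)) return false;
        FetchRequest that = (FetchRequest) other;
        return fetchTags == that.fetchTags
                && remoteUrl.equals(that.remoteUrl)
                && refspecs.equals(that.refspecs);
    }

    @Override
    public int hashCode() {
        return Objects.hash(remoteUrl, refspecs, fetchTags);
    }
}

/** Hypothetical helper that remembers the last fetch and skips an identical repeat. */
final class FetchDeduplicator {
    private FetchRequest lastFetch; // null until the first fetch has run

    /** Runs fetchAction unless the previous fetch used equal arguments; returns true if it ran. */
    boolean fetchIfNeeded(FetchRequest request, Runnable fetchAction) {
        if (request.equals(lastFetch)) {
            return false; // same arguments as the previous fetch: skip it
        }
        fetchAction.run();
        lastFetch = request;
        return true;
    }
}
```

A sketch like this is only reasonable when the second fetch immediately follows the first within the same checkout. If time passes between the two calls, the remote may have new commits and the second fetch is no longer redundant.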

Oleg Nenashev

Dec 26, 2019, 5:03:12 PM
to Mark Waite, jenkinsci-gsoc-all-public
Hi Mark, thanks for the idea! 

What is your time estimation for this project? GSoC is a 3-month-long coding task, and I am not sure what would be the actual time here.

Thanks in advance
Oleg


Mark Waite

Dec 26, 2019, 6:03:22 PM
to Oleg Nenashev, jenkinsci-gsoc-all-public
My initial guess is this will be less than a 3 month effort.  With good luck and the right existing code, it could be as little as 3 weeks.  If the change requires modifications in both the git client plugin and the git plugin, then it will push more towards 6 weeks.
--
Thanks!
Mark Waite

Marky Jackson

Dec 26, 2019, 6:14:55 PM
to Mark Waite, Oleg Nenashev, jenkinsci-gsoc-all-public
My question regarding this potential proposal is: how does it fit with your existing plugin and the roadmap for that plugin?
I know you have a fairly robust roadmap, and issues raised against the plugin have previously been referred to that roadmap, so I would not want this new proposal to create more work for you.
It may be worth you joining the next GSoC meeting to present your proposal and we can discuss.

{
    "regards" : {
        "name" : "marky",
        "phone" : "+1 (408) 464 2965",
        "email" : "marky.r...@gmail.com",
        "team" : "jackson5",
        "role" : "software engineer"
    }
}


Mark Waite

Dec 26, 2019, 6:27:00 PM
to Marky Jackson, Oleg Nenashev, jenkinsci-gsoc-all-public
I'm happy to join the next GSoC meeting to discuss the proposals if that helps.  This proposal and the automatic caching proposal are part of the "Caching" theme that is the #2 priority, behind bug triage.  The JMH tuning proposal is outside the list of priorities.

In all three cases, the ideas are well-aligned with my attitude that compatibility is critical.  All three ideas can be implemented without breaking compatibility.
--
Thanks!
Mark Waite

Marky Jackson

Dec 26, 2019, 6:30:41 PM
to Mark Waite, Marky Jackson, Oleg Nenashev, jenkinsci-gsoc-all-public
Thank you kindly for that. We will be resuming our meetings right after the new year and look forward to discussing.
Happy holidays!


Martin d'Anjou

Dec 27, 2019, 9:48:37 AM
to Marky Jackson, Mark Waite, Marky Jackson, Oleg Nenashev, jenkinsci-gsoc-all-public
This is a very good idea. If it is not big enough, I am sure there are other git plugin improvements that could also be included.

Jeff Pearce

Dec 27, 2019, 11:34:53 AM
to Martin d'Anjou, Marky Jackson, Mark Waite, Marky Jackson, Oleg Nenashev, jenkinsci-gsoc-all-public
Agreed - this would be a huge benefit to the Jenkins project. I'd love to see this done this year.


Oleg Nenashev

Dec 27, 2019, 11:46:01 AM
to Jeff Pearce, Martin d'Anjou, Marky Jackson, Mark Waite, Marky Jackson, jenkinsci-gsoc-all-public
> My initial guess is this will be less than a 3 month effort.  With good luck and the right existing code, it could be as little as 3 weeks.  If the change requires modifications in both the git client plugin and the git plugin, then it will push more towards 6 weeks.

The thing with GSoC is that we need project ideas that take close to 3 months. It definitely depends on the student, though.
Speaking of which, what is the reason for keeping this project idea and the JMH one separate? IMHO they could be merged into a single "Git plugin performance" project with several sub-items.

Jeff Pearce

Dec 27, 2019, 11:51:23 AM
to Oleg Nenashev, Martin d'Anjou, Marky Jackson, Mark Waite, Marky Jackson, jenkinsci-gsoc-all-public
+1 on merging the git related projects 

Mark Waite

Dec 27, 2019, 12:18:27 PM
to Oleg Nenashev, Jeff Pearce, Martin d'Anjou, Marky Jackson, Marky Jackson, jenkinsci-gsoc-all-public
I like that idea.  I'll merge the two into a single "Git plugin performance" project idea.

I want to keep the automatic repository caching project idea separate because there are many interesting pitfalls hiding in that, as indicated by the prior pull request that was attempting to implement a form of caching.
--
Thanks!
Mark Waite

Martin d'Anjou

Dec 27, 2019, 12:21:52 PM
to Mark Waite, Jeff Pearce, Marky Jackson, Marky Jackson, Oleg Nenashev, jenkinsci-gsoc-all-public
There is probably hidden stuff in caching like handling parallel builds.

Mark Waite

Dec 27, 2019, 12:29:10 PM
to Martin d'Anjou, Jeff Pearce, Marky Jackson, Marky Jackson, Oleg Nenashev, jenkinsci-gsoc-all-public
Exactly! 
--
Thanks!
Mark Waite