Duplicate entries in the manifest causing repo to use extra threads to sync


Luciano Carvalho

Apr 14, 2015, 1:13:51 PM
to Repo and Gerrit Discussion
All,

I'll try to summarize this bug as best as I can:

We have some cases where the same project needs to be checked out in more than one location, each on a different branch. There are plenty of use cases for that, and the current version of repo supports it just fine.

The problem is that repo adds one extra parallel thread at sync time for each of those entries. Consider a manifest with 1000 projects, 30 of which fall into the case above, and more than 200 users syncing from the same mirror. Even if a user does not ask for parallel jobs on the sync command line, repo will use 31 threads to sync. With -j8, repo will sync using 38 parallel threads, and so on. During a busy hour of the day, that amounts to thousands of unneeded and unwanted threads hanging on the same mirror server.
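A rough sketch of that arithmetic, in Python. The formula (the -j value plus one thread per manifest entry that shares a project name with another entry) is only a model of the behavior reported here, not something taken from the repo source, and the function names are hypothetical.

```python
def reported_sync_threads(jobs: int, duplicate_entries: int) -> int:
    """Threads one client is reported to use during 'repo sync'.

    ASSUMPTION: models the behavior described in this thread
    (the -j value plus one thread per duplicated manifest entry).
    It is not derived from repo's implementation.
    """
    return jobs + duplicate_entries


def mirror_connections(users: int, jobs: int, duplicate_entries: int) -> int:
    """Total simultaneous connections the mirror sees if all users sync at once."""
    return users * reported_sync_threads(jobs, duplicate_entries)


# Figures from the example above: 30 duplicated entries, 200 users.
no_parallel_jobs = reported_sync_threads(jobs=1, duplicate_entries=30)
with_j8 = reported_sync_threads(jobs=8, duplicate_entries=30)
worst_case_on_mirror = mirror_connections(users=200, jobs=1, duplicate_entries=30)

print(no_parallel_jobs, with_j8, worst_case_on_mirror)
```

With 200 users syncing at the same moment and no -j option, this model puts 6200 connections on the mirror, where 200 would be expected.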

I believe that is a serious bug and needs to be fixed.

Two questions:
- Is there a way to work around it?
- Can it be fixed easily and quickly?

Thanks,

Luciano.

Luciano Carvalho

Apr 23, 2015, 8:22:15 PM
to Repo and Gerrit Discussion

Bump. Just following up. Has anyone been able to take a look at this?

Thanks.

Bassem Rabil

Apr 24, 2015, 8:06:46 AM
to repo-d...@googlegroups.com
Do you mean that end users can trigger a sync between the master and the mirrors from the command line?
Can you please describe how you synchronize/replicate to the mirrors? If possible, paste your replication plugin configuration here.
The replication plugin [1] has some parameters that can be tuned to limit the number of threads for each replication destination.


Regards
Bassem

Marcelo Bissaro

Apr 25, 2015, 10:40:14 AM
to repo-d...@googlegroups.com

 Hi Bassem,
 
I work with Luciano, and I'm aware of the problem as well. Let me try to describe it.

Consider the following scenario:

- machine1: End user machine, where 'repo sync' runs. 
- machine2: Another end user machine
- mirror1: Linux server that hosts the git repos.
- A manifest file with 1000 projects, all of them pointing to mirror1. Three of the entries have the same project name but different paths and revisions, something like this:

 <project name="project1" path="path1" revision="rev1"/>
 <project name="project1" path="path2" revision="rev2"/>
 <project name="project1" path="path3" revision="rev3"/>

1 - On machine1, user runs:
repo sync -c -j4

2 - On machine2, user runs:
repo sync -c -j4

On machine1, 4 threads should be opened. What actually happens is that 7 threads run, each trying to download data from mirror1 over an ssh connection.
On machine2, 7 threads are running as well.
On mirror1 we have 14 threads!

If the manifest had 30 entries for the same project instead of 3, we would have 34 threads per user! In this scenario, mirror1 would need to handle 68 requests instead of 8!
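
To check how many entries in a manifest are affected, a small Python sketch like the following could list project names that appear more than once. It is only an illustration: the function name and the extra "project2" entry are made up, and it reads a single manifest string without following any included or local manifests.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_duplicate_projects(manifest_xml: str) -> dict:
    """Map each project name that occurs more than once to its (path, revision) pairs.

    ASSUMPTION: hypothetical helper for illustration; it only inspects
    <project> elements in the XML text it is given.
    """
    root = ET.fromstring(manifest_xml)
    by_name = defaultdict(list)
    for project in root.iter("project"):
        by_name[project.get("name")].append(
            (project.get("path"), project.get("revision"))
        )
    # Keep only the names that are checked out in more than one place.
    return {name: entries for name, entries in by_name.items() if len(entries) > 1}


# The three entries from the scenario above, plus one made-up unique project.
EXAMPLE_MANIFEST = """<manifest>
  <project name="project1" path="path1" revision="rev1"/>
  <project name="project1" path="path2" revision="rev2"/>
  <project name="project1" path="path3" revision="rev3"/>
  <project name="project2" path="path4" revision="rev1"/>
</manifest>"""

duplicates = find_duplicate_projects(EXAMPLE_MANIFEST)
for name, entries in duplicates.items():
    print(name, len(entries), entries)
```

For the example manifest, only project1 is reported, with its three path and revision pairs.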
 

Regards,
Marcelo