Git MultiSite for Gerrit applies WANdisco’s unique, patented replication to turn Gerrit master
and slave servers into fully writable Git peer nodes, eliminating the bottleneck.
Every User Works at LAN Speed
Deploy a Git MultiSite server close to each office and experience LAN speed while cloning,
pulling, and pushing with Gerrit.
Zero Down Time. Zero Data Loss. Maximum Performance.
Failover between servers is automatic and there is no single point of failure for Git, making maintenance windows a thing of the past.
Pure Gerrit
Git MultiSite for Gerrit is built on open source Gerrit and WANdisco’s active/active
replication technology. There are no proprietary back ends or black
box data stores to contend with. Stay up to date with the latest Gerrit releases
and use regular Git protocols and client programs with no restrictions.
Unlike commercial solutions that use proprietary versions of Git polluted with other software users don’t want or need, Git MultiSite builds on open source Git. That means Git MultiSite can be used with any other open source or closed source software with no vendor lock-in, or need to retrain users. Git MultiSite now integrates with Gerrit and GitLab, the leading code review and collaboration tools.
--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
WANDisco developed its own patented technology a while ago (http://www.wandisco.com/get.php?f=documentation/whitepapers/WANdisco_DConE_White_Paper.pdf) for SVN. They then applied the same to Git, as it is based on file-system level replication.
Honestly, Gerrit is more than just file-system level replication, and it's not clear to me how they solved the DB problem, unless they require all Gerrit servers to share the same DB, which wouldn't be ideal :-O
It would be useful if anyone from WANDisco watching this forum could clarify how it works :-)
Luca.
On 5 Dec 2014, at 10:13, Gustaf Lundh wrote:
Wandisco seems to provide a backend/framework which would allow Gerrit to run in a multimaster environment [1]. They make many great claims [2], like:
> Git MultiSite for Gerrit applies WANdisco’s unique, patented replication to turn Gerrit master and slave servers into fully writable Git peer nodes, eliminating the bottleneck.
and
> Every User Works at LAN Speed
> Deploy a Git MultiSite server close to each office and experience LAN speed while cloning, pulling, and pushing with Gerrit.
and
> Zero Down Time. Zero Data Loss. Maximum Performance.
> Failover between servers is automatic and there is no single point of failure for Git, making maintenance windows a thing of the past.
Perhaps even more interesting is that they are not relying on a fork:
> Pure Gerrit
> Git MultiSite for Gerrit is built on open source Gerrit and WANdisco’s active/active replication technology. There are no proprietary back ends or black box data stores to contend with. Stay up to date with the latest Gerrit releases and use regular Git protocols and client programs with no restrictions.
and
> Unlike commercial solutions that use proprietary versions of Git polluted with other software users don’t want or need, Git MultiSite builds on open source Git. That means Git MultiSite can be used with any other open source or closed source software with no vendor lock-in, or need to retrain users. Git MultiSite now integrates with Gerrit and GitLab, the leading code review and collaboration tools.
So, does anyone have experience of running this solution in production? How does it scale? Since LAN-speed pushes are promised, how are ref update race conditions handled? The master receiving the push would need to synchronize the ref lock with all the other masters around the world, and that should be quite costly.
Concurrent Agreement
The Paxos algorithm only allows agreement to be reached on one proposal
at a time. This has the obvious effect of slowing down performance in a
high transaction volume environment. DConE allows multiple proposals
from multiple proposers to progress simultaneously, rather than waiting for
agreement to be reached by all of the nodes on a proposal by proposal basis.
Listed on the London Stock Exchange: WAND
THIS MESSAGE AND ANY ATTACHMENTS ARE CONFIDENTIAL, PROPRIETARY, AND MAY BE PRIVILEGED. If this message was misdirected, WANdisco, Inc. and its subsidiaries, ("WANdisco") does not waive any confidentiality or privilege. If you are not the intended recipient, please notify us immediately and destroy the message without disclosing its contents to anyone. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. The views and opinions expressed in this e-mail message are the author's own and may not reflect the views and opinions of WANdisco, unless the author is authorized by WANdisco to express such views or opinions on its behalf. All email sent to or from this address is subject to electronic storage and review by WANdisco. Although WANdisco operates anti-virus programs, it does not accept responsibility for any damage whatsoever caused by viruses being passed.
Hi Luca,
WANDisco models each Git repository as a state machine [1] and uses the Paxos [2] algorithm to keep the state machines in sync. The original paper for this approach is "Time, Clocks and the Ordering of Events in a Distributed System" [3].
A Git push is a bunch of objects (blobs, commits and trees) in a packfile. This is unpacked into the repo and then Git updates the refs, e.g. refs/heads/master.
When a push for a branch comes into a node, it first transfers the contents of the packfile to the other nodes. Once enough nodes have the contents (this is configurable), the node issues a Paxos proposal to update the ref. If there are multiple proposals outstanding for the state machine then DConE creates a global ordering for the proposals.
Each node will execute the proposals in the same order, keeping the state machines synchronised. The node will only execute a proposal when it already has the packfile. If a node does not have the packfile then it will block doing any updates to that repository until it has the packfile and can complete the proposal. This does not prevent it from accepting more pushes to the repository: they will be submitted to Paxos and will queue up waiting for the earlier updates to complete. The later pushes may be rejected as they are out of date.
We don't replicate the RDBMS and indeed we are looking forward to seeing more data move into the Git repositories. We'd actually like to help in that area next year.
However we allow nodes to apply accepted proposals out of order on different refs within the same repository. This allows small updates on a stable branch to fly ahead of a large change applied to master that is blocking on copying a large pack file.
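To make the ordering rule above concrete, here is a minimal sketch in Python. It is not WANdisco's code: the `Proposal` and `Node` names, the `pack_id` field and the in-memory structures are all assumptions made for illustration. It only shows the idea that agreed proposals stay in order per ref, while a proposal waiting for its packfile blocks only its own ref.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    seq: int       # position in the agreed global order
    ref: str       # e.g. "refs/heads/master"
    pack_id: str   # hypothetical identifier of the pushed packfile
    new_sha: str   # value the ref should point at afterwards


class Node:
    """Toy model of one replica; all names here are illustrative."""

    def __init__(self):
        self.refs = {}      # ref name -> current SHA-1
        self.packs = set()  # pack ids already copied to this node
        self.pending = {}   # ref name -> deque of proposals, in seq order

    def accept(self, proposal):
        """Queue an agreed proposal (callers deliver them in seq order)."""
        self.pending.setdefault(proposal.ref, deque()).append(proposal)
        self.drain()

    def receive_pack(self, pack_id):
        """Record that a packfile finished copying, then retry blocked refs."""
        self.packs.add(pack_id)
        self.drain()

    def drain(self):
        """Apply every proposal whose packfile is present, ref by ref."""
        for ref, queue in self.pending.items():
            # The head of a queue blocks only the ref it belongs to.
            while queue and queue[0].pack_id in self.packs:
                proposal = queue.popleft()
                self.refs[ref] = proposal.new_sha


node = Node()
node.accept(Proposal(1, "refs/heads/master", "big-pack", "aaa111"))
node.accept(Proposal(2, "refs/heads/stable", "small-pack", "bbb222"))
node.receive_pack("small-pack")  # stable is updated while master still waits
node.receive_pack("big-pack")    # now master catches up
```

In this sketch the small update to stable is applied as soon as its pack arrives, even though the earlier proposal for master is still waiting for the large pack.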
> A Git push is a bunch of objects (blobs, commits and trees) in a packfile, this is unpacked into the repo and then git updates the refs, eg refs/heads/master
I understand how you can use hooks to catch activity from Git receive-pack, but how do you grab the ref updates initiated by Gerrit and send them through the Paxos queue?
> When a push for a branch comes into a node it first transfers the contents of the packfile to the other nodes, once enough nodes have the contents (this is configurable) the node issues a Paxos proposal to update the ref. If there are multiple proposals outstanding for the state machine then DConE creates a global ordering for the proposals.
I assume that if two nodes "a" and "b" both initiate a proposal for "master" to different SHA-1s, the Paxos system resolves this conflict by accepting one proposal and rejecting the other, since they are conflicting proposals inside the same repository?
Whereas two nodes "a" and "b" initiating proposals for different branches (e.g. "a" updates master, "b" updates stable) will be globally ordered and both be accepted.
Let me ask about two specific cases. In both, node "c" is lagging behind and does not have any recent updates (e.g. slow line).
The first case: a push comes into "c" for "stable", which has not been changed by any other node. Sounds like this would be queued at "c", submitted to Paxos after "c" finishes catching up, and would be accepted. OK. What does the user see while this is happening? Is their "git push" waiting for the catch-up so the user can be notified whether Paxos accepted or rejected their proposal for "stable"?
The second case: a push comes into "c" for "master", which has been changed several times by other node(s). Since "c" is behind, what happens? If you queue it up and submit the proposal to Paxos, it should be rejected, since "c" is behind and the client would be discarding the edits made by other nodes.
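The accept/reject behaviour asked about here can be sketched as a compare-and-swap on the ref. This is a toy model only: treating each proposal as "expected old SHA-1, new SHA-1" is an assumption for illustration, not a description of how GitMS or DConE decides, and `RefUpdate` and `apply_in_order` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefUpdate:
    node: str     # which node proposed the update
    ref: str      # ref being updated
    old_sha: str  # what the proposing node believed the ref pointed at
    new_sha: str  # what the ref should point at afterwards


def apply_in_order(refs, updates):
    """Apply updates in the agreed order; return (accepted, rejected)."""
    accepted, rejected = [], []
    for update in updates:
        if refs.get(update.ref) == update.old_sha:
            refs[update.ref] = update.new_sha
            accepted.append(update)
        else:
            # The ref moved since the proposer last saw it: stale proposal.
            rejected.append(update)
    return accepted, rejected


refs = {"refs/heads/master": "m0", "refs/heads/stable": "s0"}
ordered = [
    RefUpdate("a", "refs/heads/master", "m0", "m1"),  # first for master, wins
    RefUpdate("b", "refs/heads/master", "m0", "m2"),  # conflicts with "a"
    RefUpdate("b", "refs/heads/stable", "s0", "s1"),  # different branch
    RefUpdate("c", "refs/heads/master", "m0", "m3"),  # lagging node, stale
]
accepted, rejected = apply_in_order(refs, ordered)
```

Because every node applies the same ordered list, each one reaches the same accept/reject decision without further coordination.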
I think I would explain this as an "eventually consistent" system then, as reads made immediately after a push may not yet observe the push due to replication lag across the network.
> We don't replicate the RDBMS and indeed we are looking forward to seeing more data move into the Git repositories. We'd actually like to help in that area next year.
Looking forward to it. One of our motivations for removing the RDBMS is to offload the replication work onto our Paxos based Git system, which vastly simplifies the multi-site management.
User preferences are another area not moved to Git, but that should,
to help eliminate user information in the RDBMS. We have started the
migration by placing the user's custom menu items into the All-Users
repository. But we have not yet moved the other preferences.
> Mark's team made a few updates to JGit in order to intercept the additional
> activity. (Naturally we publish any changes we make to open source on
> request.)
OK, so it's not a stock Gerrit server. It's stock Gerrit with an updated JGit. :)
If you are running in a hook, the hook stderr goes to the tty of the
git client. You can output messages to let the user know something is
still happening on the server side. You can use "\r" (without \n) to
reset the tty to the start of the line and rewrite the line. We use
that trick to show a message:
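The message itself is not included in this archive. As a rough illustration of the "\r" trick only, here is a minimal Python sketch; the wording of the progress line and the `report_progress` helper are made up for this example and are not what Gerrit prints.

```python
import sys
import time


def report_progress(steps, out=sys.stderr, delay=0.0):
    """Rewrite one status line in place using carriage returns."""
    for done in range(1, steps + 1):
        # "\r" without "\n" moves the cursor back to the start of the line,
        # so the next write overwrites the previous status text.
        out.write(f"\rWaiting for replication: {done}/{steps}")
        out.flush()
        time.sleep(delay)
    # Finish with a newline so later output starts on a fresh line.
    out.write("\n")
    out.flush()


if __name__ == "__main__":
    report_progress(5, delay=0.2)
```

When run from a server-side hook, anything written to stderr this way is relayed to the user's terminal by the git client, as described above.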
> It's a little more interesting in the Gerrit UI (e.g. merging a
> code review) so we built a configurable timeout into the system. To be
> honest we're still doing some UX research to figure out the best way to let
> the user know what's happening in a useful way.
I would be interested in your thoughts here, as we also can have delay
in the Gerrit UI during submit. We have been trying to keep this below
1 second to keep users happy, but sometimes that just isn't practical.
>
> Same page here. Any suggestions to the best way to get started? Mark's
> team will be looking at this next year.
One open question I posed earlier in this thread was, how does
WANdisco tell Gerrit a ref has replicated onto this node?
Planning for that in your product, and helping us hook onto that, would help. :)
Eventually consistent is accurate [1]. It's also interesting to view this in terms of the CAP [2] trade-offs: we provide Availability, Partition Tolerance and eventual consistency.
I imagine that the solution implemented by WANDisco is thus a sort of "branch" of Gerrit (vanilla + modified JGit) supported as commercial product.
A few questions:
a. Do you support *ALL* Gerrit features and plugins on top of the WANDisco-modified Gerrit?
b. Do you have performance data from a real-life scenario of your Gerrit multi-master?
c. Do you have large scale Enterprise installations? Can you share some metrics? (# of nodes / sites, # of push / day, # of concurrent users, # of errors or conflicts, average Git push response time)
d. Are you up-to-date with the latest and greatest of Gerrit? (e.g. 2.10 and forthcoming 2.11)
Thanks again for providing feedback to this *very interesting* discussion thread :-)
Hi Randy,
Thanks for showing up and answering all the questions from the community.
> That's about right. We're trying to minimize source code changes as much as possible. The type of companies we want to do business with are really sensitive about keeping up to date with Gerrit, so we're committing to staying within a quarter (3 months) of the major releases. That just gives us some time to do a full release cycle and regression testing and such.
At Sony Mobile, we run a modified version of Gerrit. Most often it is an official release plus a few hot fixes from @master and perhaps two or three internal patches (that are not yet contributed or are still under review at gerrit-review.googlesource.com).
Our minor internal fork is quite organic and changes a bit with time, and quite occasionally between official releases. Sometimes we run into showstopper issues that need to be hot fixed _ASAP_, or perhaps we detect a performance issue whose fix we really want to include before it is included in a new official Gerrit release.
If we went with WANdisco’s solution, would we lose the ability to manage our own internal fork? Or do you continuously provide the full source code to your own version, so that we can modify it as we please? If you are committing to stay within 3 months of major releases, what is the time span for minor releases?
Best regards
Gustaf
Hi Gustaf,
> If we would go with WANdisco’s solution, would we lose the ability to manage our own internal fork? Or do you provide continuously the full source code to your own version, so that we can modify it as we please? If you are committing to stay within 3 months of major releases, what is the time span for a minor releases?
Normally we provide an installer that puts our changes into a compatible version of Gerrit. However we would be happy to provide our patches so that you could apply them to your fork. There's a bit of concern that we might run into problems trying to provide technical support in these cases. However, our changes would likely not conflict with changes that you make internally, so at this point we're willing to cross that bridge when we come to it.
By major release I mean 2.8 -> 2.9 -> 2.10 and so on. If tomorrow we saw that there was a 2.9.3 release that had a critical bug fix, we'd probably pull it in and release it after we run a quick automated regression cycle on it.
We don't have anything set in stone yet for minor releases, but our approach will likely be that if we can pull in a minor release and it passes our automated test cycle, we'll ship it with our next patch release.
Cheers,
Randy
> One open question I posed earlier in this thread was, how does
> WANdisco tell Gerrit a ref has replicated onto this node?
Hi Shawn,
this is what we do to tell one Gerrit that a ref has changed on another Gerrit.
Every Gerrit installation gets a simple, standard plugin written by us that catches all the change events (plus a new one we added, because it was missing):
    @Listen
    public class GitMSChangeEventListenerFile implements ChangeListener, LifecycleListener {...}
so that if something happens on one Gerrit node concerning a change-id, it is caught and written to disk on that node. GitMS (Git MultiSite), living on the very same node, will then pick up that file and send a proposal to the other nodes about that change-id.
The GitMS instances on the other nodes will take the proposal and send a REST request to Gerrit (on the same node) in order to index that change-id. If Gerrit is down or does not reply in the right way, the index request will be retried until successful (for a finite amount of time).
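The "retried until successful (for a finite amount of time)" behaviour could look roughly like the Python sketch below. This is only an assumption about the shape of such a loop: `retry_until` and its parameters are hypothetical, and `action` stands in for whatever actually sends the index request to the local Gerrit.

```python
import time


def retry_until(action, timeout_s, interval_s,
                clock=time.monotonic, sleep=time.sleep):
    """Call action() until it returns True or timeout_s has elapsed.

    Returns the number of attempts made; raises TimeoutError on give-up.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        try:
            if action():
                return attempts
        except OSError:
            # Connection failures (e.g. Gerrit is down) count as a failed
            # attempt and are retried like a bad reply.
            pass
        if clock() >= deadline:
            raise TimeoutError(f"gave up after {attempts} attempts")
        sleep(interval_s)


# Stand-in for the real request: fails twice, then succeeds.
replies = iter([False, False, True])
attempts = retry_until(lambda: next(replies), timeout_s=5, interval_s=0)
```

Passing `clock` and `sleep` as parameters is a design choice for this sketch so the loop can be exercised without real waiting.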
You said you have a
> trigger in our accepted Paxos proposal records that allows the Paxos
> system to kick the event "up" the software stack into a Gerrit process
Can you be more specific on that?
In our solution we use two JVMs (one for Gerrit and one for GitMS), but if Gerrit already has something to manage the event, we could think about putting everything in one single JVM, avoiding the REST calls to index the change-ids.
On Tuesday, December 9, 2014 7:17:30 PM UTC, Shawn Pearce wrote:
One open question I posed earlier in this thread was, how does
WANdisco tell Gerrit a ref has replicated onto this node? If Gerrit
at site "a" writes refs/changes/cc/nncc/meta, the Gerrit at site "b"
will need to be told when this ref arrives so it can update the local
Lucene index, or the local Solr cluster, so that user dashboards and
search results will reflect the current state. At Google we have a
trigger in our accepted Paxos proposal records that allows the Paxos
system to kick the event "up" the software stack into a Gerrit process
to do "Gerrit specific stuff" after an accepted proposal is applied at
a node.
Planning for that in your product, and helping us hook onto that, would help. :)
The additional event may be of general interest but I doubt the rest would be. In any case the truth is in the code, so once we have it up on a public site you can judge for yourselves.
Those patches and plugin source should be published within a day or two.
Cheers,
Randy
> > OK, so its not a stock Gerrit server. Its stock Gerrit with an updated JGit. :)
>
> Previously, we did replace the JGit jar bundled with Gerrit with our own, before we determined
> that modifications to Gerrit itself were required, but now we just add our JGit changes to the
> gerrit-patch-jgit directory and that seems to be a lot nicer.
Please be careful with this.
If you are shadowing any classes in JGit ... that could be bad. The
Gerrit runtime ClassLoader structure doesn't provide any ordering
promises about JARs pushed into the classpath. It may be possible for
JGit to be placed ahead of gerrit-patch-jgit, and then your shadow
class doesn't exist.
What's the preferred way to share with the community, particularly if we want to share back some bug fixes or patches?
ping ... is this thread (and the WANdisco solution?) definitely dead?
On 3 Mar 2017, at 17:52, 'Antonio Chirizzi' via Repo and Gerrit Discussion <repo-d...@googlegroups.com> wrote:
Yes, we (WANdisco) have not been updating this thread, but in the meantime we have been working on Gerrit! We use Gerrit MultiSite ourselves and it is being used by our customers - large multi-national companies - who look more for stability than the latest available feature.
That’s why our last Gerrit fork is the 2.11.9 one.
In a few weeks we’ll be releasing the latest changes for Gerrit 2.13.x, because that has been requested by some customers.
Since the last update on this thread we have changed our way of hacking Gerrit. In the beginning we were using a plugin to achieve master-master replication, but since then we have had to change the Gerrit source code from the inside in order to control the master-master updates made on the database (we use the freely available Percona XtraDB Cluster or MariaDB Galera Cluster).
And yes, we already publish our changes on GitHub of course. We would welcome working with the community on how we could generalise the interfaces we have changed in order to make Gerrit more adaptable to the replicated product space!
We use our implementation of the Paxos algorithm to replicate the changes across the Gerrit instances, and we make use of freely available replicated databases. But we are completely open to reviewing the interfaces or creating new APIs and working with the community if this is required!
On Friday, March 3, 2017 at 11:14:59 AM UTC, lucamilanesio wrote:
On 3 Mar 2017, at 11:09, David Ostrovsky <david.o...@gmail.com> wrote:
On Friday, 3 March 2017 at 11:03:23 UTC+1, lucamilanesio wrote:
> ping ... is this thread (and the WANdisco solution?) definitely dead?
Looking at their site [1] reveals that if you use the gerrit-multisite solution, you would have these benefits:
* No downtime No restart required
* Guaranteed consistency and performance
* No change in Gerrit functionality and no learning curve for developers
So sounds like "Set it and forget it" to me, if it really works.
Yes, but the documentation is 2 years old, it requires WANdisco's fork of Gerrit based on 2.11.x [2], and it has had no major updates since then.
That's why I was wondering *IF* there is something new *AND* whether someone has used it in production.
I remember at the Summits that some people said they tried it in a test environment, but I am not aware of anyone using it in production yet.
I believe we should soon get some updates from WANdisco's mates on this thread :-)
Ciao Antonio,
thanks for the update, see my feedback below.
On 3 Mar 2017, at 17:52, 'Antonio Chirizzi' via Repo and Gerrit Discussion <repo-d...@googlegroups.com> wrote:
> Yes we (WANdisco) have not been updating this thread but in the meantime we have been working on Gerrit! We use Gerrit MultiSite ourselves and it is being used by our customers - large multi-national companies - who look more for stability than the latest available feature.
I have to say that Gerrit is an exception on this: the most recent versions typically contain many more fixes and stability improvements than the older ones. A lot of our clients on 2.11.x are urgently looking to upgrade to 2.13.x because of the Lucene indexing thread cancellation problem.
If your "large multi-national companies" are the ones coming to the Gerrit User Summits (btw: there will be one in 2017, you can register your interest at https://goo.gl/JuoIqQ) they are well aware of the problems I am referring to. I would be surprised if they wanted to "stay on 2.11.x" looking for stability.
> That’s why our last Gerrit fork is the 2.11.9 one.
FYI, the latest release is 2.11.10 at http://gerrit-documentation.storage.googleapis.com/ReleaseNotes/ReleaseNotes-2.11.10.html - August 2016.
> In a few weeks we’ll be releasing the latest changes for Gerrit 2.13.x, cause that has been requested by some customers.
Good to know :-)
> Since the last update on this thread we have changed our way of hacking Gerrit. In the beginning we were using a plugin to achieve a master-master replication, but since then, we had to move to change the Gerrit source code from the inside in order to control the master-master updates made on the database (we use freely available Percona XtraDB Cluster or the MariaDB galera cluster).
Ouch ... that would make your fork diverge even more from plain vanilla, I'm afraid.
There were a lot of presentations by customers "stuck in a fork" at the past User Summits; for this reason large clients tend to avoid forks and stay with plain-vanilla Gerrit.
For an archive of all the previous Gerrit User Summits and the associated presentations, see:
> And yes we already publish our changes on GitHUB of course. We would welcome working with the community on how we could generalise the interfaces we have changed in order to make Gerrit more adaptable into the replicated product space!
My 2p (I'm based in the UK): stop pushing to your GitHub repository and start contributing to Gerrit Code Review at https://gerrit.googlesource.com/gerrit/.
Second suggestion: start coming to the User Summits and participating in the design sessions on the future architecture of Gerrit. Everyone is interested in multi-master and we started laying the foundations for it long ago.
Ericsson contributed a series of very useful plugins to solve some of the multi-master problems, Google is working on Ketch (https://www.infoq.com/news/2016/02/google-kick-starts-git-ketch) for the Raft consensus protocol and GerritForge is working on an Apache Cassandra backend (https://www.slideshare.net/HaithemJarraya/infinite-gerrit) for a multi-node / multi-zone distributed high-performance storage.
Hi Matthias,
there is a draft change uploaded to Gerrit, I'll ask Haithem to publish. The JGit DFS implementation is going to be 100% open source and released under Apache 2.0.
How can Cassandra be considered to be a distributed file system (DFS) ?
On Sat, Mar 4, 2017 at 12:15 AM, Luca Milanesio <luca.milanesio@gmail.com> wrote:
> Hi Matthias, there is a draft change uploaded to Gerrit, I'll ask Haithem to publish. The JGit DFS implementation is going to be 100% OpenSource and released under Apache 2.0.
On 3 Mar 2017, at 23:36, Matthias Sohn <matthi...@gmail.com> wrote:
> so you don't want to contribute it to JGit itself but open source it elsewhere ?
On Sat, Mar 4, 2017 at 12:39 AM, Luca Milanesio <luca.milanesio@gmail.com> wrote:
> I wouldn't pollute JGit, because the DFS implementation would add a lot of Cassandra related dependencies which would make the build quite complicated. During the hackathon we installed it as a "LibModule" and deployed it to Gerrit under its /lib directory. However, we are open to suggestions and best practice to follow for JGit contributions ;-)
On 3 Mar 2017, at 23:53, Matthias Sohn <matthi...@gmail.com> wrote:
> JGit consists of many libraries, so this could be added as a new one, e.g. org.eclipse.jgit.dfs.cassandra. Which dependencies does this DFS implementation need ?
A specific shaded version of the Cassandra Java Driver (com.datastax.cassandra:cassandra-driver-core) and then, of course, the Maven Shade plugin.
> How can Cassandra be considered to be a distributed file system (DFS) ?
It is not a filesystem but a key/value store, completely distributed over a network of nodes and dynamically extensible.
It automatically manages:
- data repartitioning
- sharding
- data locality geo-location
- recovery
It typically excels for immutable data, which Git objects and packs are.
On 5 Mar 2017, at 20:52, Matthias Sohn <matthi...@gmail.com> wrote:
On Sat, Mar 4, 2017 at 1:08 AM, Luca Milanesio <luca.mi...@gmail.com> wrote:
> A specific shaded version of the Cassandra Java Driver (com.datastax.cassandra:cassandra-driver-core) and then, of course, the Maven Shade plugin.
this doesn't sound like a lot of additional dependencies
> It is not a filesystem but a key/value store, completely distributed over a network of nodes and dynamically extensible. [...] It typically excels for immutable data, which Git objects and Packs are.
How are you storing objects and packs? Cassandra recommends chunking larger BLOBs
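The chunking mentioned here can be illustrated with a small Python sketch. It says nothing about how the GerritForge backend actually stores packs, which the thread does not describe: a plain dict stands in for the key/value store, and `put_blob`, `get_blob` and the key layout are all assumptions for this example.

```python
def put_blob(store, blob_id, data, chunk_size):
    """Split an immutable blob into fixed-size chunks under (blob_id, index)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    count = 0
    for offset in range(0, len(data), chunk_size):
        store[(blob_id, count)] = data[offset:offset + chunk_size]
        count += 1
    # Keep the chunk count so readers know how many pieces to fetch.
    store[(blob_id, "count")] = count
    return count


def get_blob(store, blob_id):
    """Reassemble a blob from its chunks, in index order."""
    count = store[(blob_id, "count")]
    return b"".join(store[(blob_id, index)] for index in range(count))


store = {}  # stand-in for a key/value table
pack = bytes(range(10))
chunks_written = put_blob(store, "pack-1", pack, chunk_size=4)
restored = get_blob(store, "pack-1")
```

Since packs and objects are never modified after being written, chunks keyed this way never need an in-place update, which is the property the reply above points to.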
> did you ever move to production with it?
No, while we were interested, we never got time to investigate Wandisco's solution much further (and then I moved on to another company, where we had other challenges to attack).
Though I'm interested how the secondary index works here. Is it documented somewhere? Is it handled the same way Ericsson does it, with event forwarding?
/Gustaf
On 6 Mar 2017, at 08:53, Gustaf Lundh <gustaf...@axis.com> wrote:
> > did you ever move to production with it?
> No, while we were interested, we never got time to investigate Wandisco's solution much further (and then I moved on to another company, where we had other challenges to attack).
> Though I'm interested how the secondary index works here. Is it documented somewhere? Is it handled the same way Ericsson does it, with event forwarding?
> /Gustaf
From: repo-d...@googlegroups.com <repo-d...@googlegroups.com> on behalf of lucamilanesio <luca.mi...@gmail.com>
Sent: Saturday, March 4, 2017 1:59 AM
To: Repo and Gerrit Discussion
Cc: antonio....@wandisco.com; david.o...@gmail.com
Subject: Re: Wandisco's Gerrit Multimaster solution. Anyone with experience?
@Gustaf / @Patrick you were the ones who have been in contact with WANdisco to experiment with their multi-master solution ... did you ever move to production with it?
Did you take any benchmarks you can share? Any feedback?
Luca.
On Friday, March 3, 2017 at 9:47:31 PM UTC, lucamilanesio wrote:
Ciao Antonio,
thanks for the update, see my feedback below.
On 3 Mar 2017, at 17:52, 'Antonio Chirizzi' via Repo and Gerrit Discussion <repo-d...@googlegroups.com> wrote:
> Yes we (WANdisco) have not been updating this thread but in the meantime we have been working on Gerrit! We use Gerrit MultiSite ourselves and it is being used by our customers - large multi-national companies - who look more for stability than the latest available feature.
I have to say that Gerrit is an exception on this: the most recent versions typically contain many more fixes and stability improvements than the older ones. A lot of our clients on 2.11.x are urgently looking to upgrade to 2.13.x because of the Lucene indexing thread cancellation problem.
If your "large multi-national companies" are the ones coming to the Gerrit User Summits (btw: there will be one in 2017, you can register your interest at https://goo.gl/JuoIqQ) they are well aware of the problems I am referring to. I would be surprised if they wanted to "stay on 2.11.x" looking for stability.