Darcs Patch Theory

Jacob Groundwater

Jun 6, 2012, 3:16:34 AM
to hacker-d...@googlegroups.com
I had a look at how Darcs patches work. They closely resemble what Google calls Operational Transformation (OT), the technique used by Google Wave and now incorporated into Google Docs.

If I recall, the only way this works is that every update can be decomposed into a sequence of operational transforms. I believe not every update to a file can be decomposed in such a way, but I will have to look into that further. OTs have the benefit of never having merge conflicts.
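
To make the idea concrete, here is a minimal sketch of the transform step in Python. It assumes insert-only operations and uses a site id as a tie-breaker. The names `Insert`, `apply` and `transform` are made up for illustration and are not taken from Darcs or from Google's implementation; real systems also handle deletes and longer histories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text is inserted
    text: str   # text to insert
    site: int   # hypothetical tie-breaker: lower site id goes first


def apply(doc: str, op: Insert) -> str:
    """Apply an insert operation to a document."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` was already applied."""
    goes_first = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if goes_first:
        # `against` pushed our insertion point to the right
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


base = "darcs"
a = Insert(0, "use ", site=1)   # one user prepends
b = Insert(5, " now", site=2)   # another user appends, concurrently

# Each side applies its own edit first, then the transformed remote edit.
left = apply(apply(base, a), transform(b, a))
right = apply(apply(base, b), transform(a, b))
assert left == right == "use darcs now"
```

Both orders end in the same document, which is the property that lets each side keep working without a manual merge, at least for this simple kind of operation.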

onetom

Jun 9, 2012, 1:15:28 PM
to hacker-d...@googlegroups.com
On Wednesday, June 6, 2012 3:16:34 PM UTC+8, Jacob Groundwater wrote:
> If I recall, the only way this works is that every update can be decomposed into a sequence of operational transforms. I believe not every update to a file can be decomposed in such a way, but I will have to look into that further. OTs have the benefit of never having merge conflicts.


==== Conflicts and merges ====

darcs patches are line based, which widens the window of chance for conflicts compared to the character based google docs or etherpad, but in practice i have experienced far fewer conflicts than with git (even when i was carefully rebasing).
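
a toy sketch of why granularity matters, in python. it is only an illustration under simple assumptions: two edits "conflict" when the parts of the base they change overlap or touch. `changed_ranges` and `conflicts` are made-up helper names, and this is not how darcs or google docs actually detect conflicts.

```python
import difflib


def changed_ranges(base, edited):
    """Ranges (start, end) of `base` units that differ in `edited`."""
    sm = difflib.SequenceMatcher(a=base, b=edited, autojunk=False)
    return [(i1, i2) for tag, i1, i2, _, _ in sm.get_opcodes() if tag != "equal"]


def conflicts(base, x, y):
    """True if edits x and y change overlapping (or touching) parts of base."""
    return any(
        a1 <= b2 and b1 <= a2
        for a1, a2 in changed_ranges(base, x)
        for b1, b2 in changed_ranges(base, y)
    )


base = "color = red\nsize = 10\n"
x = "colour = red\nsize = 10\n"    # one person fixes the spelling
y = "color = blue\nsize = 10\n"    # another changes the value on the same line

# compared line by line, both edits replace line 0, so they collide
assert conflicts(base.splitlines(), x.splitlines(), y.splitlines())

# compared character by character, the edits are several characters apart
assert not conflicts(base, x, y)
```

the same two edits collide at line granularity and stay independent at character granularity.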

i hope we can all agree that having to deal with merges is a distraction from the actual development, so avoiding them contributes to productivity.


==== Ease of use ====

however the main benefit of darcs comes from its simple "UI",
which was made possible by the simple feature set:

+ initialize / add / remove / move
+ put / get
+ push / pull / unpull
+ record / unrecord / rollback
+ changes / whatsnew

rarely used:
+ revert / unrevert
+ amend-record
+ show / diff

notice the lack of explicit branching operations.
branching can be done by the operating system's file management tools,
since a branch is just another copy of the repository directory.

there are no different merge algorithms to choose from,
because conflicts only happen in rare cases,
which are really ambiguous even to humans,
so those really need a human to look at them.

for maximum convenience, all these commands can be abbreviated,
so most of the time i just use these commands maybe w 1 single option,
but mostly in interactive mode instead:
  get / pull / add / move / rem / rec -m / unrec --last 1 / push / wha -sl / cha -p


==== Ease of learning and teaching ====

after teaching darcs to 10+ ppl in the past ~5yrs,
some complete beginners, some long time svn / git users,
i can report the following:
- the learning curve takes 1-2 hours for the basics on the 1st day
- next day another 1-2 hrs for coming up with branching conventions
  which is necessary with git too, btw
- next 2-3 days i might get 5-10 minute questions max 10 times
- 2nd week 1-2 more questions maybe
- next month 1-2 questions
- a few months later they are telling me who else they have taught darcs to


==== "Myths" ====

#1 darcs is slow.

in raw processing performance it is indeed slower than most other version control systems,
BUT in *practice* it's actually _faster_!

why?

a, because u will think about version control differently if u use darcs
  and won't add unnecessary files to your repo which would just slow it down...
  (thinking about huge static assets which probably won't ever change,
   or modules, like gems, node packages and similar libs,
   which are version controlled elsewhere already)

b, the interactive patch preparation process promotes sensible
  change grouping which makes patch reviews easy and already doable during
  pull operations most of the time.

c, no funky scenarios with branches.
  it's NOT more complicated than pouring lego pieces from one box to another...
  so no waste of time here.
  no rebase bullshit, since it's happening transparently if possible.
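
the lego picture can be sketched as a tiny model in python, assuming a repo is nothing more than a set of named patches. the `Repo` class and the patch names are hypothetical, and the model deliberately ignores patch dependencies and commutation, which real darcs has to handle.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Repo:
    patches: set = field(default_factory=set)

    def record(self, name: str) -> None:
        """Add a new patch to this repo."""
        self.patches.add(name)

    def pull(self, other: "Repo", wanted=None) -> set:
        """Copy over patches we lack, optionally only the ones in `wanted`."""
        missing = other.patches - self.patches
        if wanted is not None:
            missing &= set(wanted)
        self.patches |= missing
        return missing


main = Repo({"init", "add-readme"})
feature = copy.deepcopy(main)      # "branching" = copying the repo
feature.record("login-form")
feature.record("fix-typo")
main.record("bump-version")

# take just one piece from the other box
pulled = main.pull(feature, wanted={"fix-typo"})
assert pulled == {"fix-typo"}
assert main.patches == {"init", "add-readme", "bump-version", "fix-typo"}
```

in this model there is no branch history to reconcile, only pieces that are either in the box or not.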


#2 darcs has algorithmic issues

well, it's like saying you are just an irresponsible kid, not a grownup man
and stating the same 5 years later still...

yes, it fucking HAD algorithmic problems, which I have NEVER ever encountered,
but only a handful of people managed to reproduce them and it took ~1.5yrs before
it was fixed and not because there are not enough haskell programmers who
would bother, but because it was not that big of an issue actually.

but this is not really the case anymore! it is the PAST!
since then during Google Summer of Code sessions it's been sped up,
algorithmic issues were solved and it is under active development.
i have even donated some money to the project to make this happen...

but these poor rigid fuckers who were "burnt" by the problem had no fucking
common sense to fix it by quickly creating a clean repo from the current
code base and carry on working as if it would be a capital crime to "lose"
the change history...

instead they have polluted every fucking possible forum saying darcs sucks,
while actually they were the ones who suck really...
darcs WAS just a bit buggy...

(wikipedia links the related issue. it's a bit long, but if u take the effort
to read it, u will see how these ppl were not able to reproduce the
situation or couldn't help because their codebase was proprietary, hence
not sharable. also some admitted they were making big patches of
unrelated changes and the problem only surfaced in these cases...)


==== Conclusion ====

so if u plan to use your version control system just as a way of
snapshotting your work directory by sweeping the accumulated
changes under the carpet all at once from time to time, then
u r better off w git or even just svn.

if you want personal version control or just a few MBs of source
written by 5-10 ppl, git, hg or bazaar is overkill, u r better off with darcs,
because it's fast enough, needs no special setup and is brutally
easy to use, because it doesn't get in the way.

git was made for the kernel,
it was made for tons of ugly patches from hundreds of ppl,
it was made to be used by hard core ppl,
it was NOT designed but (d)evolved under the hands of coders
who didn't care about user experience...

u can spend your precious lifetime on more useful things than fighting w git.
for example u could write great software instead...

-- 
  tom

Jacob Groundwater

Jun 9, 2012, 2:12:00 PM
to hacker-d...@googlegroups.com
> i hope we can all agree that having to deal with merges is a distraction from the actual development, so avoiding them contributes to productivity.

Agreed
 
> however the main benefit of darcs comes from its simple "UI",
> which was made possible by the simple feature set:
>
> + initialize / add / remove / move
> + put / get
> + push / pull / unpull
> + record / unrecord / rollback
> + changes / whatsnew


I think 90% of my git commands are:
  • branch / checkout / merge / rebase
  • add / commit
  • push / pull
  • status / diff

> notice the lack of explicit branching operations.
> branching can be done by the operating system's file management tools.

How do you develop features concurrently but in isolation? For example, in git to develop three features you specify three branches. Each feature is developed in isolation from the others by switching between branches as necessary. What is the darcs equivalent workflow, or alternative strategy?

Finally, in this thread, Linus brings up a question that I am curious about:

> Fundamental example: somebody has a problem/bug. Tell me how to tell a
> developer what his exact version is - without creating new tags, and
> without having to synchronize the archives. Just tell the developer what
> version he is at.

How does one arbitrarily identify which version of software they are using?
