Newbie Questions - is it OK to post here?


martin-mpjdesign

Mar 17, 2011, 9:03:09 PM
to MacHg
Hi

I'm a graphic and front end web designer - a real newbie when it comes
to VCS although I want to add it to my workflow. I've spent weeks looking
at different options, one being Mercurial using MacHG - I'm a
designer, I need/like the GUI :) - and I've come up with a bunch of
questions. They're not specifically MacHG related, more general on
suitability of Mercurial with design files, VCS work flow options,
setting up VCS to my needs etc.

So before firing off a load of questions: is it OK to post these types
of questions here? If not can anyone recommend a good VCS/Mercurial
forum or mailing list where they go easy on people asking *really*
basic questions?

Thanks in advance.

Jason Harris

Mar 17, 2011, 9:07:42 PM
to machg_m...@googlegroups.com
Sure post away...

This is probably a friendlier forum than most, but the other forum is:

http://mercurial.selenic.com/wiki/MailingLists

But if you are after using a GUI, (like MacHg) then this is probably the right forum to start in.

Cheers,
Jas

martin-mpjdesign

Mar 18, 2011, 1:34:48 PM
to MacHg
Thanks Jason

I'll spend some time to properly formulate questions and repost. And
yes, I'll definitely be using a GUI.

martin-mpjdesign

Mar 24, 2011, 4:48:54 PM
to MacHg
I've finally worked out a sensible set of questions. I think some
background on my current setup will be useful to begin with...

As a designer I work with web code but also a whole bunch of binary
files - InDesign, Illustrator, psd, png, dng, etc. Some of these can
be big - psd files of 300+ Mb when I'm working on press ads for
example.

All binary files sit on a second HD in my Mac Pro, I have separate
folders for clients and sub folders for each project.
Website code sits in MAMP's server root (Applications/MAMP/htdocs/).
Each site has a separate folder in the server root and I work on the
code directly.

I also use a MacBook Pro with MAMP installed when I'm out of the
office. I'll synchronise the binary files between the Mac Pro HD and a
portable HD attached to the MacBook Pro. I'll also synchronise the
contents of the Mac Pro and MacBook Pro MAMP server roots.

I regularly back up the binary files and content of MAMP server root
to a NAS (currently in the office, soon to be offsite with VPN
access). I use Decimus Synk to run the synchronisations and the
backups.

My current version control practice is pretty poor. I work with a
great web developer on some projects who introduced me to Git - I
understand the benefits but don't like/understand the mix of command
line and GitX. The developer is quite happy to change from Git to
Mercurial. Hence there's version control on some websites but not
others. Version control of binary files is bad - renaming files *-
v1.png, *-v2.png, *-v2-rev-1.png, etc. I can just about cope with this
but I'll be working with another graphic designer soon and I know the
system will fall down quickly.



So I'm looking to set up a proper version control system that can
ideally cover both the binary files and the code and ideally have a
system in place before the new graphic designer starts.

I've played a bit with MacHg and found it really good but I don't know
if Mercurial itself is the best solution for me. So here come the
questions:

1) I'm thinking that for the binary files, each client folder or
project folder should be a separate Mercurial repo - rather than
making the entire HD one enormous repo. Maybe repos should be even
more narrow in focus?

2) I'd like to continue using Synk to synchronise binary files in the
Mercurial repos between the Mac Pro HD and the MacBook Pro external HD
when leaving the office. My thinking is that using Mercurial to push/
pull the repos between the HDs will take forever as I'll have to update
each one individually - unless there's a way to do them all at
once. Likewise for the website in MAMP - one repo for each site is
logical but I'd like to be able to Sync them all at once. Is there a
danger of corruption if I use Sync to synchronise the Mercurial repos
between the 2 machines?

3) When the second graphic designer starts we'll be using the Mac Pro
and the MacBook Pro in the office, talking over a Gb network. Ideally
we'll both be working on the binary files stored on the Mac Pro HD
because a) the MacBook Pro internal HD is not large enough to hold all
the binary file repos, and b) the external HD is slow - OK on the road
but not in the office.

4) I still need to backup all the repos to the NAS, again is using
Sync to do this likely to corrupt the repo? Can it be done easily
(i.e. all repos in one go) using Mercurial? I don't think I can set up
Mercurial on the NAS.

5) I know that Mercurial does not do file locking and I understand
why. Does it offer another method to help prevent myself and the other
designer working on the same binary file at the same time? Or is it
down to us simply talking to each other?

6) How does Mercurial cope with 300+ Mb binary files? Data integrity
is a higher priority than speed but both are important.

7) Are there any other VCSs which you feel could handle my mix of code
and binary files better than Mercurial? I'd ideally like one system to
do the lot rather than one for code and another for binary.

Other than Mercurial I've been looking at:

Bazaar - maybe its workflow flexibility would help with question 3.
above but I'm not sure.
SVN - discounted unless there's no option other than using it for
binary files in parallel with a DVCS for the code.
Plastic SCM - apparently handles binary files well and has a flexible
workflow. But it's commercial; there's a community edition at the
moment and I'm assured by the developers that it's here to stay. The
community license is renewable annually and you never know if there'll
be a change of policy down the line. I'll pay for software no problem
but their commercial licenses are quite a lot.

So finishing up: apologies for the length of the post, the newbie
questions and thanks in advance for any help/advice you can give.


martin-mpjdesign

Mar 25, 2011, 9:10:41 AM
to MacHg
Well, I've partially answered question 6) myself. I found some time
earlier to add a 550Mb photoshop file to a repo. MacHG did handle it
albeit taking several minutes.

Jason Harris

Mar 25, 2011, 10:38:01 AM
to machg_m...@googlegroups.com, Martin Geisler
Wow, lots of questions :)

That sounds like a good idea. Likely each project should be a separate repository.

> 2) I'd like to continue using Synk to synchronise binary files in the
> Mercurial repos between the Mac Pro HD and the MacBook Pro external HD
> when leaving the office. My thinking is that using Mercurial to push/
> pull the repos between the HDs will take forever as I'll have to update
> each one individually - unless there's a way to do them all at
> once. Likewise for the website in MAMP - one repo for each site is
> logical but I'd like to be able to Sync them all at once. Is there a
> danger of corruption if I use Sync to synchronise the Mercurial repos
> between the 2 machines?

Well... Inside each Mercurial repository there is the .hg folder, which contains all the relevant information. As a beginner, never mess with the stuff in there. (Later, as an advanced user, you can sometimes mess with the patches folder and some other files, but the store files etc. one should basically never touch.) If you do mess with the stuff in there you are messing with the repository itself and you can corrupt it.

If sync perfectly syncs the internal files in each repo, i.e. always makes exact duplicates, then it should be fine. However, let's say you make some changes at home and sync home_computer <-> laptop_computer, then you go into the office, do some work on office_computer, and later sync office_computer <-> laptop_computer. Sync might then pull only some of the files in the .hg folder from laptop_computer, since it will see that some of the files on office_computer are newer. It would mix and match files inside the .hg folder, and bingo, you have a corrupted repository...

Thus if you can tell sync that the .hg folders in your repos are *all or nothing*, then it would be fine. If not, then sync is not a good idea.

If you are worried about syncing everything, then that's not so much of a problem if you can handle a tiny bit of shell scripting. Just create a script:

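#!/bin/bash
# Pull new changesets into each repository and update its working copy.
# -R selects the repository to operate on; each pulls from its default path.
hg -R myproject1 pull --update
hg -R myproject2 pull --update
hg -R myproject3 pull --update
hg -R myproject4 pull --update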

etc...

and at the end of the day open your terminal and just run this script. With a little bit more unix magic you can even probably auto generate such a script using find and xargs or something... Is this a possible solution for you?
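A minimal sketch of the find-based variant, assuming every repository lives somewhere under one root folder and already has a default pull path configured (as it does after 'hg clone'). The function names list_repos and pull_all, the HG override variable, and the example path are made up for illustration, and it uses a while loop rather than xargs so that paths with spaces stay intact:

```shell
#!/bin/bash
# Sketch: pull and update every Mercurial repository found under a root folder.
# HG is a hypothetical override (e.g. HG=echo for a dry run); defaults to "hg".

# Print the path of every repository under "$1", identified by its .hg folder.
list_repos() {
    find "$1" -type d -name .hg -prune -print 2>/dev/null | sed 's|/\.hg$||' | sort
}

# Run "hg pull --update" in every repository under "$1".
pull_all() {
    list_repos "$1" | while IFS= read -r repo; do
        echo "== $repo"
        # </dev/null keeps the command from consuming the list of repositories.
        ${HG:-hg} -R "$repo" pull --update </dev/null
    done
}

# Example (hypothetical path): pull_all /Applications/MAMP/htdocs
```

Calling pull_all on the MAMP server root would then update every site in one go, and HG=echo pull_all on the same folder only prints the commands it would run.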

The next thing to look at would be to make all of these repos a subrepo of a main repo and then use the Mercurial extension, onsub to force update all the subrepos. Would this be an option?

Or does this need to be totally GUI'fied ? :)


> 3) When the second graphic designer starts we'll be using the Mac Pro
> and the MacBook Pro in the office, talking over a Gb network. Ideally
> we'll both be working on the binary files stored on the Mac Pro HD
> because a) the MacBook Pro internal HD is not large enough to hold all
> the binary file repos, and b) the external HD is slow - OK on the road
> but not in the office.

So you are not going to have eg a small server? Maybe a mac-mini or something which holds the common files?


> 4) I still need to backup all the repos to the NAS, again is using
> Sync to do this likely to corrupt the repo? Can it be done easily
> (i.e. all repos in one go) using Mercurial? I don't think I can set up
> Mercurial on the NAS.

I actually just use Time Machine to keep backups of everything for me... But again, if sync is just copying off one machine, then that should be fine. It's when sync (I am not sure what the program actually does :) ) tries to synchronize files between two machines that it would have problems. Of course, verify all of this by running 'hg verify' on the repos after they are on the NAS.
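As a rough sketch of that check, assuming the NAS is mounted as an ordinary folder and hg is installed on the Mac running the check (not on the NAS itself). The function name verify_all, the HG override variable, and the mount path are hypothetical:

```shell
#!/bin/bash
# Sketch: run "hg verify" on every repository under a backup folder and
# report the ones that fail. HG is a hypothetical override; defaults to "hg".

verify_all() {
    find "$1" -type d -name .hg -prune -print 2>/dev/null | {
        failed=0
        while IFS= read -r hgdir; do
            repo=${hgdir%/.hg}   # strip the trailing /.hg to get the repo root
            if ${HG:-hg} -R "$repo" verify >/dev/null 2>&1 </dev/null; then
                echo "ok      $repo"
            else
                echo "FAILED  $repo"
                failed=1
            fi
        done
        # The function's exit status is non-zero if any repository failed.
        [ "$failed" -eq 0 ]
    }
}

# Example (hypothetical mount point): verify_all /Volumes/NAS/backups
```

Any repository reported as FAILED should be re-copied from a machine where 'hg verify' still passes.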


> 5) I know that Mercurial does not do file locking and I understand
> why. Does it offer another method to help prevent myself and the other
> designer working on the same binary file at the same time? Or is it
> down to us simply talking to each other?

Actually there is an extension in the works that will do file locking for Mercurial. I am including Martin Geisler on this email since he is writing it...

http://www.selenic.com/pipermail/mercurial-devel/2011-March/029226.html

So although it's not available now, it likely will be in the future.


> 6) How does Mercurial cope with 300+ Mb binary files? Data integrity
> is a higher priority than speed but both are important.

Well as far as I am aware Mercurial does everything "in memory" so sometimes for a 300MB file there might be the file and then a copy of it or two in memory so memory use might swell to say 1200MB while it deals with the file.

http://mercurial.808500.n3.nabble.com/Large-binary-files-td799531.html

However for such files there are the "big-files" extensions.

http://mercurial.selenic.com/wiki/BfilesExtension
http://mercurial.selenic.com/wiki/SnapExtension
http://mercurial.selenic.com/wiki/BigfilesExtension

Note I haven't actually used these extensions, but there are several of them. I would start with either BFiles or Snap since they offer I think a way to do full integration. It would be interesting to see how this plays out.

Please try them on some files and get back to me...


> 7) Are there any other VCSs which you feel could handle my mix of code
> and binary files better than Mercurial? I'd ideally like one system to
> do the lot rather than one for code and another for binary.

Well, with big binary files you are basically going back to a centralized version control system, in that it's not doing much merging / comparing of file contents. Thus using a centralized server like SVN for the binary files in subrepos might be something to consider.

> Other than Mercurial I've been looking at:
>
> Bazaar - maybe its workflow flexibility would help with question 3.
> above but I'm not sure.

I think git and Mercurial are pretty much the same by default, to a degree. Out of the box neither handles big files well. However, both have extensions for handling big files. Actually finding them for git was a little harder (but maybe that's because I am just not so familiar with the normal git resources; e.g. I found http://git-annex.branchable.com/ for git, etc.)

I didn't find any extensions for Bazaar when I did a quick google for them...

(As for which DVCS between git and hg, well, I prefer Mercurial. I have of course looked at git and some of git's commands. However, git just seems to be way too complex and over the top. E.g. just look at 'git help log'. It's 1300 lines long!! In Mercurial 'hg help log' is 55 lines long. Other concepts, like the staging area, are OK, but it's really only there because git is a command line tool and there is no good GUI to go on top of it. Basically the staging area should be done entirely in a good GUI tool. Similarly, though, one could complain that the mq extension and patch handling in Mercurial should all be done by a good GUI, and then you could throw away the mq interface. (Of course at an implementation level you might still have patches, but you would never expose these to the user.))

I could talk about the origin-master garbage you have to do with git and its branching model, but Mercurial again just seems so much clearer in this regard to me.

(Note: I have had it in the back of my mind for a long time that I might actually port MacHg to be MacHgGit or something... I have looked at the details and the one thing holding me back is that it's not so easy to get revision numbers in git. I have thought of ways around this and one day I will probably bite the bullet and do this...)


> SVN - discounted unless there's no option other than using it for
> binary files in parallel with a DVCS for the code.

One could... I would try the mercurial extensions first I think for big file handling...


> Plastic SCM - apparently handles binary files well and has a flexible
> workflow. But it's commercial; there's a community edition at the
> moment and I'm assured by the developers that it's here to stay. The
> community license is renewable annually and you never know if there'll
> be a change of policy down the line. I'll pay for software no problem
> but their commercial licenses are quite a lot.

It looked like their community license was free for life for under 15 users which would satisfy your needs... So you might try plastic scm. Note I have never tried it and had never heard of it until you mentioned it.


> So finishing up: apologies for the length of the post, the newbie
> questions and thanks in advance for any help/advice you can give.

No problems :) I would be interested to know your experiences with the big files extensions.

Cheers!
Jason
