not merging dependent files, but versioning them anyway


da2

Jun 23, 2009, 3:39:38 PM
to merc...@selenic.com

Hello,

I am using latex to create a pdf file. I don't want to merge these files
each time somebody
creates a new pdf, but I don't want to exclude them either, because it is
nice to be able to go to any
version without recreating the pdf.
The behaviour I am interested in is the following:
1) pdfs are ignored for merging, if there is a conflict the pdf can be
deleted, because neither of them is valid anyway. It can be recreated easily
from the .tex after it has been merged.
2) pdfs are versioned, i.e. I can immediately go back to any version.


Any ideas?

Daniel
--
View this message in context: http://www.nabble.com/not-merging-dependent-files%2C-but-versioning-them-anyway-tp24172360p24172360.html
Sent from the Mercurial mailing list archive at Nabble.com.

_______________________________________________
Mercurial mailing list
Merc...@selenic.com
http://selenic.com/mailman/listinfo/mercurial

David Frey

Jun 23, 2009, 8:29:08 PM
to da2, merc...@selenic.com
On 6/23/2009, "da2" <dal...@gmail.com> wrote:

>
>Hello,
>
>I am using latex to create a pdf file. I don't want to merge these files
>each time somebody
>creates a new pdf, but I don't want to exclude them either, because it is
>nice to be able to go to any
>version without recreating the pdf.
>The behaviour I am interested in is the following:
>1) pdfs are ignored for merging, if there is a conflict the pdf can be
>deleted, because neither of them is valid anyway. It can be recreated easily
>from the .tex after it has been merged.
>2) pdfs are versioned, i.e. I can immediately go back to any version.
>
>
>Any ideas?


The question you should be asking is not "how can I do this?", but
"should I do this?" and the answer is no.

Greg Ward

Jun 23, 2009, 9:59:50 PM
to da2, merc...@selenic.com
On Tue, Jun 23, 2009 at 3:39 PM, da2<dal...@gmail.com> wrote:
> I am using latex to create a pdf file. I don't want to merge these files
> each time somebody
> creates a new pdf, but I don't want to exclude them either, because it is
> nice to be able to go to any
> version without recreating the pdf.
> The behaviour I am interested in is the following:
> 1) pdfs are ignored for merging, if there is a conflict the pdf can be
> deleted, because neither of them is valid anyway. It can be recreated easily
> from the .tex after it has been merged.
> 2) pdfs are versioned, i.e. I can immediately go back to any version.

In a C project, would you version .o files or binaries? No way.

In a Java project, would you version .class or .jar files? Certainly not.

Mercurial advertises itself as an "SCM" -- that's "source code
management" system. Trying to version build products is just asking
for trouble. The correct answer is a script or Makefile that makes
recreating the .pdf from its source code trivial.

Either that or a directory full of historical PDFs with a mapping of
hg changeset ID to PDF filename. But that is outside the scope of
Mercurial. (Unless the bigfiles extension is of any use.)
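
A minimal sketch of such a mapping, assuming the PDF has already been
built and that the script is run from the repository root. The directory
name pdf-archive and the function names are hypothetical; the only
Mercurial command used is `hg id -i`, which prints the short changeset ID
of the working directory (with a trailing "+" if there are uncommitted
changes).

```python
import shutil
import subprocess
from pathlib import Path


def archive_name(pdf_name, changeset_id):
    """Build an archive file name such as 'thesis-1a2b3c4d5e6f.pdf'."""
    pdf = Path(pdf_name)
    # `hg id -i` appends '+' when the working directory is modified;
    # spell that out so the file name stays shell-friendly.
    tag = changeset_id.strip().replace("+", "-dirty")
    return f"{pdf.stem}-{tag}{pdf.suffix}"


def current_changeset():
    """Ask Mercurial for the short ID of the working directory's parent."""
    result = subprocess.run(
        ["hg", "id", "-i"], capture_output=True, text=True, check=True
    )
    return result.stdout


def archive_pdf(pdf_name, archive_dir="pdf-archive"):
    """Copy pdf_name into archive_dir (hypothetical name); return the new path."""
    target_dir = Path(archive_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / archive_name(pdf_name, current_changeset())
    shutil.copy2(pdf_name, target)
    return target
```

Calling archive_pdf("thesis.pdf") after each build would leave one PDF per
changeset in the archive directory, which can itself stay outside the
repository.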

Greg

Christian Ebert

Jun 24, 2009, 2:51:44 AM
to merc...@selenic.com
* Greg Ward on Tuesday, June 23, 2009 at 21:59:50 -0400

> On Tue, Jun 23, 2009 at 3:39 PM, da2<dal...@gmail.com> wrote:
>> I am using latex to create a pdf file. I don't want to merge these files
>> each time somebody
>> creates a new pdf, but I don't want to exclude them either, because it is
>> nice to be able to go to any
>> version without recreating the pdf.
>> The behaviour I am interested in is the following:
>> 1) pdfs are ignored for merging, if there is a conflict the pdf can be
>> deleted, because neither of them is valid anyway. It can be recreated easily
>> from the .tex after it has been merged.
>> 2) pdfs are versioned, i.e. I can immediately go back to any version.
>
> In a C project, would you version .o files or binaries? No way.

But a C project is not a LaTeX project ;-)

> In a Java project, would you version .class or .jar files? Certainly not.

There might be a difference in the aim when you do a LaTeX project:

When you checkout an older version, you not only want to keep the
functionality but the result as well, like for printing. In this
case the result can even have higher priority than the source.
(Of course all this from a _practical_ pov, not a puristic scm
pov.)

This might mean that you not only have to keep around or reachable
the "kernel" of the programming language in question (python 2.4
is probably still accessible when python 4.0 is out ;-) ) but
also, and in this case more importantly, *all* the macro packages
you are using in your project at that moment in time. Well, atm
CTAN is not run under any kind of scm, so this is impossible,
unless you keep your own scm of CTAN locally *and* rebuild the
binaries just to compile an old document ... good luck!



> Mercurial advertises itself as an "SCM" -- that's "source code
> management" system. Trying to version build products is just asking
> for trouble. The correct answer is a script or Makefile that makes
> recreating the .pdf from its source code trivial.
>
> Either that or a directory full of historical PDFs with a mapping of
> hg changeset ID to PDF filename. But that is outside the scope of
> Mercurial. (Unless the bigfiles extension is of any use.)

There is a 3rd, albeit dirty, way, which works fine for small
projects imho (on foot, so to speak):

When merging, and you have no binary tool configured, Mercurial
asks you which version of a file you want to keep; just make any
choice. Then either

- hg rm the pdf, hg ci the merge, (pdf)latex source, hg add pdf
(what the OP wants)

or

- (pdf)latex source, hg ci the (dirty) merge

If you have other stuff to do (coffee break etc. ;-) ), you can

hg resolve -u a.pdf
(coffee break)
pdflatex a.tex
hg resolve -m a.pdf
hg ci -m merge

I am well aware that this makes any scm purist cringe in pain,
but I have my books available *exactly like they were* at any
point in history that I deemed important enough to check in the
_result_. Of course, if anything goes wrong with this kind of
workflow, I wouldn't even dream of blaming Mercurial for it ;-)
Still it works well for me. And unless you have thousands of
graphics in your pdf it doesn't slow hg down too much either.

c
--
Was heißt hier Dogma, ich bin Underdogma!
[ What the hell do you mean dogma, I am underdogma. ]
_F R E E_ _V I D E O S_ http://www.blacktrash.org/underdogma/
http://www.blacktrash.org/underdogma/index-en.html

Mathieu Clabaut

Jun 24, 2009, 3:05:25 AM
to da2, merc...@selenic.com


On Tue, Jun 23, 2009 at 21:39, da2 <dal...@gmail.com> wrote:

> Hello,
>
> I am using latex to create a pdf file. I don't want to merge these files
> each time somebody
> creates a new pdf, but I don't want to exclude them either, because it is
> nice to be able to go to any
> version without recreating the pdf.
> The behaviour I am interested in is the following:
> 1) pdfs are ignored for merging, if there is a conflict the pdf can be
> deleted, because neither of them is valid anyway. It can be recreated easily
> from the .tex after it has been merged.
> 2) pdfs are versioned, i.e. I can immediately go back to any version.
>
>
> Any ideas?

Some would say you shouldn't do that. But I do not see why, apart from the fact that a pdf should not be considered a source file, which still does not explain to me why you shouldn't do that!

I had the same sort of problem with generated proof obligation files when working with formal methods, where, for my workflow, I had to keep them under revision control, because they take hours to generate, and because history was needed to automatically decide which proofs were to be kept and which were to be thrown away.
At the time I used a hacked hgmerge script that always kept my version of the files in case of conflict. It worked well for my own purposes.

I guess nowadays you can simply add a merge-patterns configuration to your .hgrc or .hg/hgrc, looking like:
[merge-patterns]
**.pdf = internal:local

-Mathieu

Arne Babenhauserheide

Jun 24, 2009, 3:26:38 AM
to merc...@selenic.com, da2
On Wednesday, 24 June 2009 02:29:08, David Frey wrote:
> >1) pdfs are ignored for merging, if there is a conflict the pdf can be
> >deleted, because neither of them is valid anyway. It can be recreated
> > easily from the .tex after it has been merged.
> >2) pdfs are versioned, i.e. I can immediately go back to any version.

I see two ways: One doing exactly what you requested and one which keeps your
sources cleaner.

1. Not merging pdfs (UNTESTED):

For this you just choose a merge tool for pdfs which simply keeps either your
or the other version.

Edit your .hg/hgrc to include the following section:

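[merge-patterns]
# keep my files
**.pdf = internal:local
# keep their files
**.pdf = internal:other

(you should only use one of the two pattern lines; the comments sit on
their own lines because hgrc only recognizes "#" at the start of a line)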

This way all PDFs will always be either at your revision or the other revision
and you won't have (real) merges.

- http://mercurial.selenic.com/wiki/MergeToolConfiguration
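
If several clones need the same setting, the section could be added by a
small helper instead of by hand. The following is a sketch under
assumptions: the function names are hypothetical, the section is appended
as plain text, and a file that already has a [merge-patterns] section is
left alone so that nothing is overwritten.

```python
from pathlib import Path


def merge_pattern_section(pattern="**.pdf", tool="internal:local"):
    """Return the hgrc text that maps a file pattern to a merge tool."""
    return f"[merge-patterns]\n{pattern} = {tool}\n"


def add_merge_pattern(hgrc_path, pattern="**.pdf", tool="internal:local"):
    """Append the section to hgrc_path; return False if one already exists."""
    path = Path(hgrc_path)
    existing = path.read_text() if path.exists() else ""
    if "[merge-patterns]" in existing:
        # Leave an existing section for manual editing.
        return False
    if existing and not existing.endswith("\n"):
        existing += "\n"
    path.write_text(existing + merge_pattern_section(pattern, tool))
    return True
```

add_merge_pattern(".hg/hgrc") keeps your version of every PDF on conflict;
pass tool="internal:other" to keep the other side instead.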


2. Creating pdfs on the fly

This assumes that you always want to have usable PDFs, but that you
don't need to version-track them - only their contents (and those are defined
in the tex files).

For this you add an update hook which creates the pdf whenever you update to a
revision.

Edit your .hg/hgrc to include the hooks section with an update hook:

[hooks]
update.create_pdfs = pdflatex your_tex_file.tex

To make this a bit easier still, you can use a versioned script which creates
all PDFs. That way you can just call the script and don't need to worry about
editing the .hg/hgrc when you add tex files or change the call.

I use a python script for platform compatibility:

--- parse_latex.py ---
#!/usr/bin/env python
from subprocess import call

# Run pdflatex on each document so the PDFs match the checked-out sources.
for i in ["file1.tex", "file2.tex"]:
    call(["pdflatex", i])
--- --- --- --- --- --- --- ---

.hg/hgrc:
[hooks]
update.create = ./parse_latex.py

- http://hgbook.red-bean.com/read/handling-repository-events-with-hooks.html
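
A variant of the script that does not need a hard-coded file list might
look like this. It is only a sketch: it assumes that every top-level
document contains \documentclass, that included fragments do not, and that
pdflatex is on the PATH. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def main_documents(root="."):
    """Return the .tex files under root that contain \\documentclass."""
    documents = []
    for tex in sorted(Path(root).rglob("*.tex")):
        if ".hg" in tex.parts:
            continue  # never look inside the repository metadata
        text = tex.read_text(encoding="utf-8", errors="replace")
        if "\\documentclass" in text:
            documents.append(tex)
    return documents


def build_all(root="."):
    """Run pdflatex on every top-level document found under root."""
    for tex in main_documents(root):
        # Build in the file's own directory so relative \input paths resolve.
        subprocess.call(
            ["pdflatex", "-interaction=nonstopmode", tex.name],
            cwd=tex.parent,
        )
```

With a call to build_all() at the bottom, the script can be used by the
update hook in place of parse_latex.py, and new tex files are picked up
without touching the script or the hgrc.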

> The question you should be asking is not "how can I do this?", but
> "should I do this?" and the answer is no.

Except if there are good reasons, and I can imagine that there are - even one
member who is not comfortable with not having PDFs in version control (and who
doesn't want to reason about that) is a good reason.

If need be, you can convert the repo to remove the pdfs later.

Best wishes,
Arne

--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
A man is threatened with a knife in the street.
Two policemen are there at once and hold up a banner in front of the scene.

"Illegal scene. Nobody may see this."

The man is robbed, stabbed and bleeds to death,
because the policemen both have their hands full.

Welcome to Germany. Censorship is beautiful.
(http://draketo.de)
--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---



Mads Kiilerich

Jun 24, 2009, 4:54:14 AM
to da2, merc...@selenic.com
On 06/24/2009 09:26 AM, Arne Babenhauserheide wrote:
> For this you just choose a merge tool for pdfs which simply keeps either your
> or the other version.
>
> Edit your .hg/hgrc to include the following section:
>
> [merge-patterns]
> **.pdf = internal:local #keep my files
> **.pdf = internal:other #keep their files
>
> (you should only use one of the lines)
>

Or use internal:prompt, available in the development branch and to be
released in 1.3.

/Mads

da2

Jun 24, 2009, 8:41:10 AM
to merc...@selenic.com

Hi Arne,

these are two nice solutions. In my case recreating the pdfs is what I will
do, but the merge-pattern
solves another of my problems.


Thank you very much!

Daniel



da2

Jun 24, 2009, 8:50:11 AM
to merc...@selenic.com

Greg,

>In a C project, would you version .o files or binaries? No way.

If I am e.g. making heavy use of expression templates and compilation takes
a long time I would certainly
want to version binaries.
My numerical simulations take days to run, and it is nice to have the output
of every version together with the corresponding code.
The additional benefit is that I can use the features of mercurial
(synchronization etc.) on these files.

>Either that or a directory full of historical PDFs with a mapping of
>hg changeset ID to PDF filename. But that is outside the scope of

That's a nice idea, I will see if I can use it for my simulations.


Thanks

Daniel


Arne Babenhauserheide

Jun 24, 2009, 9:10:40 AM
to merc...@selenic.com, da2
Hi Daniel,

On Wednesday, 24 June 2009 14:41:10, da2 wrote:
> these are two nice solutions. In my case recreating the pdfs is what I will
> do, but the merge-pattern
> solves an other of my problems.

I'm glad it helped you!

I now also put it into the wiki, so it can easily be retrieved:

http://mercurial.selenic.com/wiki/TipsAndTricks#Avoid_merging_autogenerated_.28binary.29_files_.28PDF.29

Out of curiosity: What was the other problem?

Best wishes,
Arne

--- --- --- --- --- --- --- --- ---

To be apolitical
means to be political
without noticing it.
- Arne (http://draketo.de)


Martin Geisler

Jun 24, 2009, 10:15:37 AM
to merc...@selenic.com
Christian Ebert <black...@gmx.net> writes:

> When you checkout an older version, you not only want to keep the
> functionality but the result as well, like for printing. In this case
> the result can even have higher priority than the source. (Of course
> all this from a _practical_ pov, not a puristic scm pov.)
>
> This might mean that not only have to keep around or reachable the
> "kernel" of the programming language in question (python 2.4 is
> probably still accessable when python 4.0 is out ;-) ) but also, and
> in this case more importantly, *all* the macro packages you are using
> in your project at that moment in time. Well, atm CTAN is not run
> under any kind of scm, so this is impossible, unless you keep your own
> scm of CTAN locally *and* rebuild the binaries just to compile an old
> document ... good luck!

We are actually in luck -- the LaTeX world has a long tradition of
strict backwards compatibility. Processing an old document with today's TeX
should not only work, but the results should also be very close if not
identical to the results you got 5 years ago.

I just tried recompiling an 8 year old LaTeX document, including some
MetaPost code... and it worked :-)

--
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

Hans Meine

Jun 24, 2009, 10:26:10 AM
to merc...@selenic.com
On Wednesday 24 June 2009 16:15:37 Martin Geisler wrote:
> Christian Ebert <black...@gmx.net> writes:
> > When you checkout an older version, you not only want to keep the
> > functionality but the result as well, like for printing. In this case
> > the result can even have higher priority than the source. (Of course
> > all this from a _practical_ pov, not a puristic scm pov.)
> > [...]

> We are actually in luck -- the LaTeX world has a long tradition of
> strict backwards compatibility. Processing old document with todays TeX
> should not only work, but the results should also be very close if not
> identical to the results you got 5 years ago.
>
> I just tried recompiling an 8 year old LaTeX document, including some
> MetaPost code... and it worked :-)

Then *you* are in luck, not we. ;-}

I had severe problems with this in the past (which might depend on the number
of "unusual" packages included), ranging from small changes I attributed to
changed (default?) fonts (or font metrics) to 3rd party packages that have
changed slightly and/or have become unmaintained or been removed from LaTeX
distributions (e.g. for license reasons).

This is the reason why *I* recently started to take care to keep a compiled
PDF around... ;-/

Christian Ebert

Jun 24, 2009, 12:24:48 PM
to merc...@selenic.com
* Hans Meine on Wednesday, June 24, 2009 at 16:26:10 +0200

> On Wednesday 24 June 2009 16:15:37 Martin Geisler wrote:
>> Christian Ebert <black...@gmx.net> writes:
>>> When you checkout an older version, you not only want to keep the
>>> functionality but the result as well, like for printing. In this case
>>> the result can even have higher priority than the source. (Of course
>>> all this from a _practical_ pov, not a puristic scm pov.)
>>> [...]
>> We are actually in luck -- the LaTeX world has a long tradition of
>> strict backwards compatibility. Processing old document with todays TeX
>> should not only work, but the results should also be very close if not
>> identical to the results you got 5 years ago.
>>
>> I just tried recompiling an 8 year old LaTeX document, including some
>> MetaPost code... and it worked :-)

"worked" is already lucky, but even then the print output may
differ.

> Then *you* are in luck, not we. ;-}
>
> I had severe problems with this in the past (which might depend on the number
> of "unusual" packages included), from small changes I attributed to changed
> (default?) fonts (or -metrics) over 3rd party packages that have changed
> slightly and/or have become unmaintained or removed from LaTeX distributions
> (e.g. due to license reasons) etc..
>
> This is the reason why *I* recently started to take care to keep a compiled
> PDF around... ;-/

Indeed.

With this tool

http://pdftex.sarovar.org/misc/pdfcmp.zip

(written by the author of pdfLaTeX)

you can detect even more differences (microtypographic changes
etc.) in the pdf.

c
--
_B A U S T E L L E N_ lesen! --->> <http://www.blacktrash.org/baustellen/>

Christian Ebert

Jun 24, 2009, 12:28:08 PM
to merc...@selenic.com
* Mads Kiilerich on Wednesday, June 24, 2009 at 10:54:14 +0200

> On 06/24/2009 09:26 AM, Arne Babenhauserheide wrote:
>> For this you just choose a merge tool for pdfs which simply keeps either your
>> or the other version.
>>
>> Edit your .hg/hgrc to include the following section:
>>
>> [merge-patterns]
>> **.pdf = internal:local #keep my files
>> **.pdf = internal:other #keep their files
>>
>> (you should only use one of the lines)
>>
>
> Or use internal:prompt, available in the development branch and to be
> released in 1.3.

I'm using internal:fail. Gives me the best control even in case
of conflicts and in combination with hg resolve.
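
With `**.pdf = internal:fail` in [merge-patterns], conflicting PDFs are left
marked as unresolved after the merge, and `hg resolve -l` lists them with a
leading "U". A sketch of how that list could drive the rebuild, assuming
each PDF has a .tex file of the same base name and the script is run from
the repository root (the function names are hypothetical):

```python
import subprocess


def unresolved_pdfs(resolve_listing):
    """Pick the unresolved .pdf paths out of `hg resolve -l` output."""
    pdfs = []
    for line in resolve_listing.splitlines():
        # Each line looks like "U path" (unresolved) or "R path" (resolved).
        status, _, path = line.partition(" ")
        if status == "U" and path.endswith(".pdf"):
            pdfs.append(path)
    return pdfs


def rebuild_unresolved():
    """Regenerate every unresolved PDF and mark it as resolved."""
    listing = subprocess.run(
        ["hg", "resolve", "-l"], capture_output=True, text=True, check=True
    ).stdout
    for pdf in unresolved_pdfs(listing):
        tex = pdf[: -len(".pdf")] + ".tex"
        subprocess.run(["pdflatex", tex], check=True)
        subprocess.run(["hg", "resolve", "-m", pdf], check=True)
```

Other unresolved files are deliberately left untouched, so the merge still
has to be finished and committed by hand.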

c
--
_B A U S T E L L E N_ lesen! --->> <http://www.blacktrash.org/baustellen/>

da2

Jun 24, 2009, 12:43:11 PM
to merc...@selenic.com

> I'm glad it helped you!

thanks again!

> Out of curiosity: What was the other problem?

I am doing numerical simulations, and it takes a fairly long time to compile
these programs (maybe 30 min) and to run them. It is really a nice feature
to be able to go directly back to any binary version without recompiling.
Sometimes I need intermediate results that are quick to recompute; for these
I will add a hook like the one you mentioned.

Best wishes

Daniel

Greg Lindahl

Jun 24, 2009, 4:54:15 PM
to merc...@selenic.com
On Tue, Jun 23, 2009 at 09:59:50PM -0400, Greg Ward wrote:

> In a C project, would you version .o files or binaries? No way.

Um, yes way. People generally don't store builds in their source
control system, but the various companies I've worked for keep 1-2
builds/day in a directory tree to speed up testing of regressions.

If you wanted to build a full-featured tool to do everything needed
for the entire life-cycle of software, it would handle this, and bug
reports, and code review, and and and...

Now I don't think Mercurial is trying to be such a system, but it's
important to keep in mind that people do version binary files all the
time, even if Mercurial doesn't support it very well.

-- greg

Mike Meyer

Jun 24, 2009, 7:37:56 PM
to merc...@selenic.com
On Wed, 24 Jun 2009 13:54:15 -0700
Greg Lindahl <gr...@blekko.com> wrote:

> On Tue, Jun 23, 2009 at 09:59:50PM -0400, Greg Ward wrote:
>
> > In a C project, would you version .o files or binaries? No way.
>
> Um, yes way. People generally don't store builds in their source
> control system, but the various companies I've worked for keep 1-2
> builds/day in a directory tree to speed up testing of regressions.

I've had at least one client that stored builds - at least the dailies
- in the SCM. They expected developers to update the parts of the
system they weren't working on that often, and figured the cost of
storing those builds in the SCM was more than made up for by not
having all their developers waiting on the first build after an update
- not to mention that by avoiding that wait, they could actually get
their developers to update daily.

<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
