Representing Leo outlines in git

Edward K. Ream

Jul 8, 2014, 8:46:40 AM
to leo-e...@googlegroups.com
I've been studying the pro git book: http://git-scm.com/book and am now closely studying the internals chapter: http://git-scm.com/book/en/Git-Internals

Stimulated by Kent's work with db's, the question arises: is it possible to represent a Leo outline as a git object?

I believe the answer is yes, and not just in the trivial sense that any content is a blob:

- Every node's gnx, headline, body text and uA (and anything else) has a unique (sha-1) hash.
- We could define (git) tree objects that contain the following entries: gnx, headline, body text, uA, parents, children.
- Empty uA's would be represented by the hash for an empty string.
- Parents and children entries would be other git tree objects.

In this way, we could use git plumbing to build a git tree object representing an entire outline, with all the data contained in a .leo file.

In other words, even though git is *content* addressable, the content can contain gnx's, so that node *identities* are preserved.
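
A minimal sketch of the idea, assuming git's standard object encoding (`<type> <size>\0<payload>`, hashed with sha-1). The entry names, the plain-dict node layout and the `node_tree` helper are illustrative assumptions, not an existing Leo or git API. As a simplification, only children are nested: a content hash cannot include a cycle, so parent links are left out of this sketch.

```python
import hashlib


def git_object_id(kind: str, payload: bytes) -> bytes:
    """Return the raw 20-byte sha-1 id git assigns to an object."""
    header = f"{kind} {len(payload)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).digest()


def blob_id(text: str) -> bytes:
    """Hash a string the way git hashes a blob with the same bytes."""
    return git_object_id("blob", text.encode("utf-8"))


def tree_id(entries: dict) -> bytes:
    """Hash a tree whose entries map a name to (mode, raw object id)."""

    def sort_key(item):
        # git orders tree entries by name, treating subtrees as "name/".
        name, (mode, _oid) = item
        return name.encode("utf-8") + (b"/" if mode == "40000" else b"")

    payload = b"".join(
        mode.encode("ascii") + b" " + name.encode("utf-8") + b"\0" + oid
        for name, (mode, oid) in sorted(entries.items(), key=sort_key)
    )
    return git_object_id("tree", payload)


def node_tree(node: dict) -> bytes:
    """Build the tree id for one node; `node` is a hypothetical plain dict."""
    children = {
        f"{i:04d}": ("40000", node_tree(child))
        for i, child in enumerate(node.get("children", []))
    }
    return tree_id({
        "gnx": ("100644", blob_id(node["gnx"])),
        "headline": ("100644", blob_id(node["h"])),
        "body": ("100644", blob_id(node["b"])),
        "uA": ("100644", blob_id(node.get("uA", ""))),  # empty uA: empty blob
        "children": ("40000", tree_id(children)),
    })


# Example outline; the second gnx is made up for illustration.
outline = {"gnx": "ekr.20140707060759.17644", "h": "root", "b": "", "children": [
    {"gnx": "ekr.20140707060759.17645", "h": "child", "b": "text\n"},
]}
print(node_tree(outline).hex())
```

Because the gnx blob is an entry in each node's tree, two nodes with identical headline and body still get different tree ids in this sketch.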

Don't know whether any of this will be helpful in the current creative ferment, but I thought I would point it out.

Edward

Fidel N

Jul 8, 2014, 10:16:50 AM
to leo-e...@googlegroups.com
Could that become some sort of collaborative outline editing for Leo, maybe?


--
You received this message because you are subscribed to the Google Groups "leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email to leo-editor+...@googlegroups.com.
To post to this group, send email to leo-e...@googlegroups.com.
Visit this group at http://groups.google.com/group/leo-editor.
For more options, visit https://groups.google.com/d/optout.

Jacob Peck

Jul 8, 2014, 10:36:47 AM
to leo-e...@googlegroups.com
What it could allow is per-node versioning... but there are better ways of doing that.  Kent's work, for example...
-->Jake

Edward K. Ream

Jul 8, 2014, 11:05:08 AM
to leo-editor
On Tue, Jul 8, 2014 at 9:36 AM, Jacob Peck <gates...@gmail.com> wrote:
> What it could allow is per-node versioning... but there are better ways of
> doing that. Kent's work, for example...

Thanks, Jake, for this comment. I was wondering about that.

I am also interested in preserving gnx's somehow, by tracking changes
to nodes in a db. I think it was Fidel who suggested that, and I'm
wondering whether git might have any part to play in that project.

Edward

Jacob Peck

Jul 8, 2014, 11:11:11 AM
to leo-e...@googlegroups.com
It's my personal opinion that the most lightweight solution to a problem
is the best. I think that using the entirety of the git machinery (i.e.
libgit) would just be overkill for something like that. Sqlite seems
like a much more viable option IMO. While using a 'git-outline' would
allow some serious flexibility, I highly doubt there would be much use
for it in practice. And with the proper wrapper, sqlite calls could be
wrapped in Leo's node API... something like p.getPreviousVersions()
would return a list of SQL rows, containing (say) pickled or JSON'd or
leo-xml'd nodes, complete with uA's, timestamps, and headline/body pairs.
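
A rough sketch of what such a wrapper might look like, using only the stdlib `sqlite3` and `json` modules. The table layout and the `open_db`, `save_version` and `get_previous_versions` names are hypothetical, following the `p.getPreviousVersions()` idea above; the gnx is only an example value.

```python
import json
import sqlite3
import time


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the version store and create its single table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS versions ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " gnx TEXT NOT NULL, saved REAL NOT NULL, node TEXT NOT NULL)"
    )
    return conn


def save_version(conn, gnx, headline, body, uA=None):
    """Append one JSON-encoded snapshot of a node, keyed by its gnx."""
    node = json.dumps({"h": headline, "b": body, "uA": uA or {}})
    conn.execute(
        "INSERT INTO versions (gnx, saved, node) VALUES (?, ?, ?)",
        (gnx, time.time(), node),
    )
    conn.commit()


def get_previous_versions(conn, gnx):
    """Return the snapshots for gnx, oldest first, as (timestamp, dict) pairs."""
    rows = conn.execute(
        "SELECT saved, node FROM versions WHERE gnx = ? ORDER BY id", (gnx,)
    )
    return [(saved, json.loads(node)) for saved, node in rows]


conn = open_db()
save_version(conn, "ekr.20140707060759.17644", "My node", "first draft")
save_version(conn, "ekr.20140707060759.17644", "My node", "second draft")
for saved, node in get_previous_versions(conn, "ekr.20140707060759.17644"):
    print(saved, node["b"])
```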

I'm not well versed in the gnx. I know it's unique per node, but not
much else. Is it updated every time the node is updated? If so, can
this behavior be broken without killing Leo's core?

-->Jake

Terry Brown

Jul 8, 2014, 11:15:08 AM
to leo-e...@googlegroups.com
On Tue, 08 Jul 2014 10:36:43 -0400
Jacob Peck <gates...@gmail.com> wrote:

> What it could allow is per-node versioning... but there are better
> ways of doing that. Kent's work, for example...

"versioning Leo nodes with git" 2013-8-28
https://groups.google.com/forum/#!topic/leo-editor/F4k_zCXjtYc

"Versioning Leo nodes... with Leo!" 2013-8-29
https://groups.google.com/forum/#!msg/leo-editor/Y-daCfU5C5Y/5pV0Ukgcr0cJ

I don't think either of these cover what Kent's trying to do.

Cheers -Terry

Terry Brown

Jul 8, 2014, 11:19:43 AM
to leo-e...@googlegroups.com
On Tue, 08 Jul 2014 11:11:06 -0400
Jacob Peck <gates...@gmail.com> wrote:

>
> On 7/8/2014 11:05 AM, Edward K. Ream wrote:
> > On Tue, Jul 8, 2014 at 9:36 AM, Jacob Peck <gates...@gmail.com>
> > wrote:
> >> What it could allow is per-node versioning... but there are better
> >> ways of doing that. Kent's work, for example...
> > Thanks, Jake, for this comment. I was wondering about that.
> >
> > I am also interested in preserving gnx's somehow, by tracking
> > changes to nodes in a db. I think it was Fidel who suggested that,
> > and I'm wondering whether git might have any part to play in that
> > project.
> It's my personal opinion that the most lightweight solution to a
> problem is the best. I think that using the entirety of the git
> machinery (i.e. libgit) would just be overkill for something like
> that. Sqlite seems like a much more viable option IMO. While using

+1, sqlite's included with Python. My reposting of the git versioning
proof of concept was just to point it out, not to suggest it's the way
to go.

Cheers -Terry

Kent Tenney

Jul 8, 2014, 11:26:20 AM
to leo-editor
and, if the sqlalchemy layer is put between Leo and sqlite,
changing ONE string, the db uri, is all that's required to
move from sqlite to postgres

Edward K. Ream

Jul 8, 2014, 11:40:44 AM
to leo-editor
On Tue, Jul 8, 2014 at 10:11 AM, Jacob Peck <gates...@gmail.com> wrote:


> I'm not well versed in the gnx. I know it's unique per node, but not much else.
The gnx is both unique and immutable. It is the permanent,
*unchanging* identity of a node. You can't change this behaviour
without breaking clones.

Each gnx has the form <user>.<time>.<number> for example:
ekr.20140707060759.17644

The .<number> field exists only if several nodes would otherwise have
the same gnx.

gnx's are, iirc, created if necessary when Leo writes a node for the first time.
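
For scripts, the format above can be split apart as follows. This is a minimal sketch, assuming the time field is laid out as YYYYMMDDHHMMSS (inferred from the example, not stated) and that the user id contains no dots; `parse_gnx` and `Gnx` are hypothetical names, not part of Leo.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class Gnx(NamedTuple):
    user: str
    created: datetime
    number: Optional[int]  # None when the gnx has no .<number> field


def parse_gnx(gnx: str) -> Gnx:
    """Split '<user>.<time>[.<number>]' into its fields."""
    parts = gnx.split(".")
    if len(parts) not in (2, 3):
        raise ValueError(f"not a gnx: {gnx!r}")
    user, stamp = parts[0], parts[1]
    created = datetime.strptime(stamp, "%Y%m%d%H%M%S")  # assumed layout
    number = int(parts[2]) if len(parts) == 3 else None
    return Gnx(user, created, number)


print(parse_gnx("ekr.20140707060759.17644"))
```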

> Is it updated every time the node is updated?

No. A gnx has nothing to do with content. It has everything to do
with identity.

> can this behavior be broken without killing Leo's core?

Nope. If you want a content-related key, try using the sha-1 hash :-)

Edward

Edward K. Ream

Jul 8, 2014, 11:43:36 AM
to leo-editor
On Tue, Jul 8, 2014 at 10:14 AM, 'Terry Brown' via leo-editor
<leo-e...@googlegroups.com> wrote:
> On Tue, 08 Jul 2014 10:36:43 -0400
> Jacob Peck <gates...@gmail.com> wrote:
>
>> What it could allow is per-node versioning... but there are better
>> ways of doing that. Kent's work, for example...
>
> "versioning Leo nodes with git" 2013-8-28
> https://groups.google.com/forum/#!topic/leo-editor/F4k_zCXjtYc
>
> "Versioning Leo nodes... with Leo!" 2013-8-29
> https://groups.google.com/forum/#!msg/leo-editor/Y-daCfU5C5Y/5pV0Ukgcr0cJ
>
> I don't think either of these cover what Kent's trying to do.

We seem to have an embarrassment of riches. I'll try to get my head
around this...

Edward

Ville M. Vainio

Jul 8, 2014, 3:46:29 PM
to leo-editor
Just as a quick stab - I was looking at camlistore over the last few days.

It may be a more natural fit for Leo outline management than git (as it's more directly about content-addressable access than git is). I have had sketchy plans of reinventing something like camlistore from scratch, so it's something I will be looking into anyway.


Fidel N

Jul 8, 2014, 5:48:24 PM
to leo-e...@googlegroups.com
About the gnx, I wanted to mention something Edward didn't. It's not a complaint or anything, but from my point of view it's the only feature they are missing:

When you cut an outline and then paste it elsewhere (same or other file), you lose the gnx of every node in that outline.
That prevents you from using a gnx as a stable reference, since cut and paste is very frequent in any "leonine" workflow.

That's, IMO, the only (and big) weakness of gnx's.

Terry Brown

Jul 8, 2014, 6:24:41 PM
to leo-e...@googlegroups.com
On Tue, 8 Jul 2014 23:47:57 +0200
Fidel N <fidel...@gmail.com> wrote:

> About the gnx, I wanted to say something Edward didnt. Not a
> complaint or anything, but from my point of view, the only feature
> they are missing:
>
> When you cut an outline, then paste it otherwhere (same or other

Just in case anyone else misses the point here (it took me a while to
work out what you meant): with *cut* and paste, when you paste only
once, Leo could ideally avoid losing the gnxs. Copy/paste, or
cut/paste/paste, would have to create new gnxs (at least on the second
paste).

Is tracking the number of pastes after a cut and only creating new gnxs
on the second and subsequent pastes reasonable?
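
To make the question concrete, here is a toy model of that rule. `Clipboard` and its `cut`, `copy` and `paste` methods are hypothetical names, and the new-gnx generator is a stand-in counter; none of this is Leo's actual clipboard code.

```python
import itertools


class Clipboard:
    """Toy clipboard: a cut keeps its gnxs on the first paste only."""

    def __init__(self):
        self._gnxs = []
        self._keep_once = False  # True between a cut and its first paste
        self._fresh = itertools.count(1)

    def cut(self, gnxs):
        self._gnxs, self._keep_once = list(gnxs), True

    def copy(self, gnxs):
        # The originals still exist, so every paste needs new gnxs.
        self._gnxs, self._keep_once = list(gnxs), False

    def paste(self):
        if self._keep_once:
            self._keep_once = False
            return list(self._gnxs)
        # Stand-in for Leo allocating new gnxs.
        return [f"new.{next(self._fresh)}" for _ in self._gnxs]


clip = Clipboard()
clip.cut(["ekr.20140707060759.17644"])
print(clip.paste())  # first paste after a cut: the original gnx
print(clip.paste())  # second paste: a freshly generated gnx
```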

Cheers -Terry

Fidel N

Jul 8, 2014, 6:43:56 PM
to leo-e...@googlegroups.com
Hehe yes, we had that conversation before; I wanted to point it out because I feel it is relevant to the recent discussions.

There are two answers to the cut/paste solution you suggest.

First, I think your solution makes it clear that any node you create will always keep its gnx!
That way you can include gnx references in your scripts (e.g. for refreshing data relative to your script, or knowing which outline to search for values), which seems very convenient.

On the other hand, there is one more thing that could be done.
I'll just write this down, but please note I don't consider it necessary yet (or see how it could be).

More often than not, I find myself coming back to track down information that I thought I would never need to begin with.

So, getting to the point, I think we could also track the "pasted pasted" nodes by adding some kind of tail to gnx's.
This way we would have original gnx's and "descendants". So if a gnx already exists, pasting again would produce that gnx with a suffix 0, 1, 2, etc., each suffix being the number of "previous brothers" the node has.

Again, I don't think this is relevant right now, but it would close the door on gnx's for good, since there is no more information (that I can think of) for a gnx to carry. We would be able to follow a node from its birth until its death, through all its "relatives".

This solution comes full circle if there is a Leo gnx database checking gnx's across files, in case we move part of an outline from one file to another. The gnx checker could consult the DB and track a node's whole life.

Might such a thing make unl's obsolete? We would have an inter-file traceable node ID that remains forever.

Edward K. Ream

Jul 9, 2014, 6:50:02 AM
to leo-editor
On Tue, Jul 8, 2014 at 4:47 PM, Fidel N <fidel...@gmail.com> wrote:

> When you cut an outline, then paste it otherwhere (same or other file), you
> loose the gnx of every node in that outline.

The Paste Node As Clone (paste-retaining-clones) command preserves
gnx's, and hence clone links.

Edward

Fidel N

Jul 9, 2014, 7:06:59 AM
to leo-e...@googlegroups.com
Wow, wasn't aware of that feature, thanks Edward!

Edward K. Ream

Jul 9, 2014, 9:13:54 AM
to leo-e...@googlegroups.com


On Tuesday, July 8, 2014 2:46:29 PM UTC-5, Ville M. Vainio wrote:
> Just as a quick stab - I was looking at camlistore through last few days.
>
> It may be more natural fit for Leo outline management than git (as it's more about direct content addressable content access than git). I have had sketchy plans of reinventing something like camlistore from scratch, so it's something I will be looking into anyway.

Great to hear from you again, Ville.  We've been missing your wisdom :-)

camlistore looks very interesting, for many reasons.  In Leo's terms, the distinction between an object and a blob is the distinction between a node (with gnx) and the various versions of it that may exist through time.  I'll be studying camlistore closely.

Edward

Ville M. Vainio

Jul 9, 2014, 5:13:44 PM
to leo-editor
I'm always here (reading), just not mustering enough time and energy to write or contribute usefully :).



Edward K. Ream

Jul 10, 2014, 6:07:55 AM
to leo-editor
On Wed, Jul 9, 2014 at 4:13 PM, Ville M. Vainio <viva...@gmail.com> wrote:
> I'm always here (reading), just not mustering enough time and energy to
> write or contribute usefully :).

Thanks for reading. I'm honored.

Edward

Zoltan Benedek

Jul 10, 2014, 6:26:55 PM
to leo-e...@googlegroups.com
Hi,

HDF5 is an interesting data format, too (http://www.hdfgroup.org).
"ensure long-term access to HDF data"
"long term, mission critical data management needs"

There is a python interface:

http://www.h5py.org

Regards
Zoltan

Offray Vladimir Luna Cárdenas

Jul 17, 2014, 1:21:01 PM
to leo-e...@googlegroups.com
Hi,

I agree with Jacob. Libgit can be overkill. I suggested some time ago
using fossil-scm, which is sqlite based and kind of a github in a box.
For me, the big disadvantage of versioning Leo outlines is their xml
format. The org-mode format is a de facto standard because it is just
plain text. Having another plain text representation for Leo trees,
hopefully human readable and user extendable (like yaml), would be in my
view a better way to make Leo outlines easily versionable without
depending on a single DVCS or infrastructure (HDF5, Camlistore, even
though they seem really interesting).

Cheers,

Offray

Fidel N

Jul 24, 2014, 9:33:22 AM
to leo-e...@googlegroups.com

>> When you cut an outline, then paste it otherwhere (same or other file), you
>> loose the gnx of every node in that outline.
>
> The Paste Node As Clone (paste-retaining-clones) command preserves
> gnx's, and hence clone links.
>
> Edward

Shouldn't paste-clone do a paste-retaining-clones by default when no other nodes with the same gnx are in the outline?
That would enable the cut/paste feature we talk so much about (coz I keep bringing it up hehe).

Edward K. Ream

Jul 29, 2014, 6:28:04 AM
to leo-editor
On Thu, Jul 24, 2014 at 8:33 AM, Fidel N <fidel...@gmail.com> wrote:

> Shouldn't paste-clone do a paste-retaining-clones by default when no more
> nodes with same gnx are in the outline?

Leo has no paste-clone command. I assume you are asking about the
paste-node command.

Imo, the answer should be "no". paste-node should be distinct at all
times from paste-retaining-clones.

This is an important example of the zen-of-python principle, "explicit
is better than implicit". http://legacy.python.org/dev/peps/pep-0020/

You are, of course, free to create your own paste command, say using
@command, that works as you suggest.
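
The decision such a command would make can be sketched without touching Leo itself. `choose_paste_command` is a hypothetical helper and the gnx values are made up; only the command names paste-node and paste-retaining-clones come from this thread, and wiring the result into an @command node is left out.

```python
def choose_paste_command(clipboard_gnxs, outline_gnxs):
    """Pick a paste command name from the gnxs on each side.

    Following the suggestion in this thread, gnxs are retained only when
    none of the clipboard's gnxs is still present in the outline, i.e.
    the nodes were cut rather than copied.
    """
    if set(clipboard_gnxs).isdisjoint(outline_gnxs):
        return "paste-retaining-clones"
    return "paste-node"


outline = {"ekr.20140707060759.1", "ekr.20140707060759.2"}
print(choose_paste_command(["ekr.20140707060759.3"], outline))
print(choose_paste_command(["ekr.20140707060759.2"], outline))
```

Keeping the choice in a separate, explicitly named command leaves paste-node itself unchanged.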

Edward

Fidel N

Jul 29, 2014, 9:17:56 AM
to leo-e...@googlegroups.com
You were right, I was asking about paste-node.

The thing is, in the described scenario (the gnx's of the nodes we are going to paste have disappeared from the outline, i.e. the nodes have been cut), paste-retaining-clones would behave exactly like paste-node, with the addition that the nodes would keep their gnx (their identity) after a cut/paste.

IMO, this would be consistent with the current results of moving nodes through the outline (when you move a node, its gnx is stable; cut/paste should do the same).

As you know, I have always wanted a permanent gnx reference, to be able to rely on gnx's in my scripts (it's very quick and easy); that way I could forget about unl's.

Well, guess I'll have to put it in my @commands tree :)
Thanks Edward.

