noob with a likely off-the-wall request

HansBKK

Nov 29, 2011, 11:18:02 PM
to leo-editor
Just starting with Leo, not a programmer, but have always been
stimulated by brain-extension software like this, so far Leo looks
like a life-changer. To put my feelings in context (and showing my
age), I haven't been this excited since I saw Ray Ozzie demo Notes
back in 1989. Before that came Ecco and Agenda, and afterwards AskSam
and, though not in the same league, Evernote (v2 only please).

I only wish I'd discovered it earlier; I probably wouldn't have
50,000+ chunks and snippets locked up in a few dozen Evernote files 8-(
This has led me to want full transparency and as much permanence as
possible in my primary content's storage. Leo's ability to let me keep
the "content" data out as plaintext in a designated tree of my
filesystem, while keeping krufty meta data out of the way in other
parts of the filesystem - so kewl.


OK, enough about me, on to my unreasonable wishlist request 8-)

Please let me know if there's a way to accomplish this that I haven't
found - I don't like the fact that the backlinks, tags, bookmarks etc.
are "hidden in the meta". I want to put these relationships, if not in
the content itself (not sentinels, I love @shadow), then perhaps in a
delimited header/footer? Or at least in **my** meta-data text files,
kept separate from but somehow tied to the content files - a
parallel filesystem?

So here's what I'm thinking for the moment - << section headings >>,
while not actually *in* the body written out to the filesystem (can
they be config'd to be automatically included?) are "close enough"
that my workflow could include updating both at the same time.

Based on that, I'd like a plugin (or something) that keeps track and
ensures that if/when I change the << section headers >> text, it
automatically does a search and replace throughout the whole shebang,
keeping the other inline references in sync with the original target.

Ideally this would operate on all @ <file> node types, even if they're
not fully "loaded"? Although I concede a problem with @ <file>
directives that aren't supposed to touch the external source, I think
there should be a configurable setting to make this particular
automated edit process an exception.

I told you it was out there. . .

I'm really hoping a response will start with "what you're asking for
is completely unnecessary, because. . . "


A less significant but related side question - one of my "master
source" markup syntaxes is txt2tags,

- yes, it's too bad either that I hadn't chosen reST instead, or that
Leo doesn't support a pluggable choose-your-own-markup-syntax
architecture, depending on your POV, but I really prefer txt2tags'
unobtrusiveness when working directly in the plaintext

which has an %include operator as well as other pre/post operators for
expansion macros, which I use for repeating text snippets, urls etc. I
plan on tracking these in a distinct location in both my filesystem
and Leo's tree to avoid confusion.

Can anyone see systemic problems with this? I just can't see using
cloning for such ittybitty chunks of text, would rather reserve them
for the big-picture structural org functionality.

Any sort of comments and advice greatly appreciated. . .

Terry Brown

Nov 29, 2011, 11:49:00 PM
to leo-e...@googlegroups.com
On Tue, 29 Nov 2011 20:18:02 -0800 (PST)
HansBKK <han...@gmail.com> wrote:

> Please let me know if there's a way to accomplish this that I haven't
> found - I don't like the fact that the backlinks' tags, bookmarks etc
> are "hidden in the meta", I want to put these relationships if not in
> the content itself (not sentinels, I love @shadow) but perhaps in a
> delimited header/footer? Or at least in **my** meta-data text files I
> mentioned kept separate but somehow tied to the content files -
> parallel filesystem?

Hmm, "the meta" is basically each node's so called "unknown
attributes", aka uA aka p.v.u, a python dict of anything.

I've been running into wanting to search the content of that dict,
which is used for backlinks, not so much for bookmarks, but for a bunch
of other things.

One way to expose it as regular text would be to add a special child
node for every node which has the content of v.u in JSON. So a special
child node for a node with a backlink might look like:

{'_bklnk': {'links': [('S', 'tbrown.20111123112137.16198')]}}

Hmm, not that helpful. A better example might be getting at the way
XML attributes are stored by xml2leo, the special
child node for a node representing an XML element with attributes might
look like:

{u'_XML': {'_edit': {'unique': 'false', 'id': '_fld_fi_zone_photo_comments', 'allow_null': 'true', 'primary_key': 'false'}}}

making it somewhat easier to find an XML element by its @id or find
elements with 'primary_key': 'false'.

But not the simplest thing, and very application specific. I'm
thinking a wrapper for python snippets which process v.u might be more
use for those cases.
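A minimal sketch of that "wrapper for python snippets" idea, written against plain dicts so it runs without Leo. The function names `ua_to_text`, `lookup` and `find_by_ua` and the `nodes` list are hypothetical stand-ins; inside Leo the dicts would come from each node's v.u.

```python
import json

_MISSING = object()  # sentinel so a stored None is not confused with "no such key"

def ua_to_text(ua):
    """Serialize a uA dict to stable, greppable JSON text."""
    # sort_keys keeps the layout stable between saves; tuples become JSON lists.
    return json.dumps(ua, sort_keys=True, indent=2)

def lookup(ua, path):
    """Follow a list of nested keys into the dict, or return _MISSING."""
    current = ua
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current

def find_by_ua(nodes, path, value):
    """Return headlines of nodes whose uA holds `value` at the nested key path."""
    return [headline for headline, ua in nodes if lookup(ua, path) == value]

# Hypothetical stand-in for an outline: (headline, uA dict) pairs.
nodes = [
    ("field", {"_XML": {"_edit": {"id": "_fld_fi_zone_photo_comments",
                                  "primary_key": "false"}}}),
    ("linked", {"_bklnk": {"links": [("S", "tbrown.20111123112137.16198")]}}),
]

print(ua_to_text(nodes[1][1]))
print(find_by_ua(nodes, ["_XML", "_edit", "primary_key"], "false"))
```

The text from `ua_to_text` is what a special child node could hold; `find_by_ua` covers the "find elements with 'primary_key': 'false'" case without creating extra nodes.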

Can you give more specific examples of stuff hidden in the meta which
you don't want hidden in the meta?

Cheers -Terry

HansBKK

Nov 30, 2011, 3:31:33 AM
to leo-editor
On Nov 30, 11:49 am, Terry Brown <terry_n_br...@yahoo.com> wrote:
> On Tue, 29 Nov 2011 20:18:02 -0800 (PST)
>
> HansBKK <hans...@gmail.com> wrote:
> > Please let me know if there's a way to accomplish this that I haven't
> > found - I don't like the fact that the backlinks' tags, bookmarks etc
> > are "hidden in the meta", I want to put these relationships if not in
> > the content itself (not sentinels, I love @shadow) but perhaps in a
> > delimited header/footer? Or at least in **my** meta-data text files I
> > mentioned kept separate but somehow tied to the content files -
> > parallel filesystem?
>
> Hmm, "the meta" is basically each node's so called "unknown
> attributes", aka uA aka p.v.u, a python dict of anything.

<snipping most of the programmer-speak>

> child node for a node with a backlink might look like:
>
> {'_bklnk': {'links': [('S', 'tbrown.20111123112137.16198')]}}

> Can you give more specific examples of stuff hidden in the meta which
> you don't want hidden in the meta?

This will seem completely disconnected from what you're talking about,
but is more in line with what I'm trying to figure out how to do in
Leo.

Dokuwiki doesn't use any database - it stores its data files as
plaintext in the filesystem ("pages" subdir), and stuff like creation
date, author, indexing, revision history etc. in separate subdirs
("index", "meta" and "attic"). Some of the DW plugins used for cross-
reference linking, building up navigation hierarchies etc store their
data in meta, but I found this one:

http://www.dokuwiki.org/plugin:subjectindex

that lets you put strings like this anywhere you like in your text
(e.g. an invisible comment block)

{{entry >books/fiction/writing novels|-}}
{{entry >1/book/binding|}}
{{entry >2/books/fiction/writing technical documentation|Writing
Docs}}

Such an approach means that if the "containing/organizing"
infrastructure (Dokuwiki on the one hand, here Leo) were to just
disappear, the core content data itself would remain intact and just
as useful, not only as independent "chunks" of text, but in this case
with the inter-relationships as well.

You can use whatever text processing tools you like to refactor,
script/generate, translate, whatever on the source data, wipe out the
Dokuwiki/Leo-specific meta stuff and rebuild it with the cross-
references still intact.
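As a rough illustration of the "whatever text processing tools you like" point, here is a sketch that pulls such entry tags back out of plain text with a regular expression. The pattern is inferred only from the three examples above, not from the plugin's documentation, and `extract_entries` is a hypothetical name.

```python
import re

# Matches {{entry >path|label}}; the label may be empty and may wrap across lines.
ENTRY_RE = re.compile(r"\{\{entry\s*>(?P<path>[^|}]+)\|(?P<label>[^}]*)\}\}")

def extract_entries(text):
    """Return (path, label) pairs for every entry tag found in text."""
    entries = []
    for match in ENTRY_RE.finditer(text):
        # Collapse runs of whitespace so a tag wrapped over two lines reads as one.
        path = " ".join(match.group("path").split())
        label = " ".join(match.group("label").split())
        entries.append((path, label))
    return entries

sample = """Some prose.
{{entry >books/fiction/writing novels|-}}
{{entry >1/book/binding|}}
{{entry >2/books/fiction/writing technical documentation|Writing
Docs}}
"""

for path, label in extract_entries(sample):
    print(path, "->", label or "(no label)")
```

Because the relationships live in the text, an index can be rebuilt this way from the content files alone.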

The above plugin was originally designed for automating the creation of
traditional end-matter indexes, or newfangled "tagging" pages -
they're actually the same thing other than presentation style, and the
fact that the latter make backlinks explicit somewhere in the main
content.

But they can also be used for alternative (what I call) "axes of
access" - usually a "book" has a single sequence to navigate from
front to back, chapter 1, 2 etc. However if I have a "source text"
with that as a primary nav, as a teacher I may want to present the
student with a "study guide" sequence + meta-commentary, an
alternative path through the material without actually copying/
repeating the source content. Think of a pile of docs relating to
software - structure an "introduction to concepts", a "getting started
howto", a "how to contribute code" and "complete reference" for
developers, etc, recycling a lot of the same text.

I'm thinking Leo can be a tremendous tool for such an approach to
writing - techdocs are just one context of course, but any sort of
knowledgebase example will do.

Leo's internal ("meta") solution for this is cloning, and from my
current noob POV is perfect for **me** to use in selecting/arranging/
sequencing the source "chunks" of text for a given purpose/audience.
But each target output format (I'm thinking DokuWiki, HTML, LaTeX to
PDF, AsciiDoc to EPUB/mobi) has its own separate way of representing
the various cross-referencing relationships between the chunks
(footnotes, index terms, inline see-also's, glossary entries etc). I'm
trying to work toward ways to represent these structures **within**
the content files themselves, and my current thinking is, if Leo is
the master source container/organizer, to make use of its "node" as
the lowest unit of "chunked" text. In other words not worry about
targeting a specific spot within a node, just target the node's title.
And the << section name >> construct seems like a good way to go,
already used for the sequencing/navigation/ToC idea. One text file can
have a "See also: << book/binding >>", and as long as that link source
got updated at the same time as the target heading text, it should be
usable for transforming into whatever each output format requires.
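A small sketch of how those "See also" references could be checked mechanically: collect every << name >> reference and report the ones whose target heading no longer exists. It assumes each chunk is keyed by its heading text without the angle brackets; `chunks`, `section_refs` and `dangling_refs` are hypothetical names.

```python
import re

# A reference looks like << some name >>; surrounding spaces are optional.
SECTION_REF_RE = re.compile(r"<<\s*(.+?)\s*>>")

def section_refs(text):
    """Return the names of all << name >> references in text."""
    return [m.group(1) for m in SECTION_REF_RE.finditer(text)]

def dangling_refs(chunks):
    """chunks maps heading -> body; return sorted (source, missing target) pairs."""
    headings = set(chunks)
    missing = []
    for heading, body in chunks.items():
        for target in section_refs(body):
            if target not in headings:
                missing.append((heading, target))
    return sorted(missing)

chunks = {
    "book/binding": "How books are bound.",
    "writing novels": "See also: << book/binding >>",
    "study guide": "Start with << book/bindings >>.",  # stale link
}

print(dangling_refs(chunks))
```

Run after each editing session, a check like this would flag link sources that fell out of sync with their target heading.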

My goal is to *not* make use of structures that are specific to Leo,
so if fifteen years from now I'm not using Leo anymore, the next
50,000 chunks of text I've captured/written won't need a programmer to
continue being accessible, relatively easy to transform into whatever
I'll be using next.

In the browsing/googling I've been doing since, maybe the idea of
"UNLs" is a good approach? Whatever I do, I'll need to keep the link
source strings in sync with the target name/location - I would just
rather avoid having to remember to do a full grep search and replace
manually every time I edit a heading string.

Sorry my brain doesn't fit the programmer's thought modes, I realize
my thinking here is very vague and unstructured. . .

Terry Brown

Nov 30, 2011, 10:35:07 AM
to leo-e...@googlegroups.com
On Wed, 30 Nov 2011 00:31:33 -0800 (PST)
HansBKK <han...@gmail.com> wrote:

> My goal is to *not* make use of structures that are specific to Leo,
> so if fifteen years from now I'm not using Leo anymore, the next
> 50,000 chunks of text I've captured/written won't need a programmer to
> continue being accessible, relatively easy to transform into whatever
> I'll be using next.

Fundamentally Leo stores data in XML. It's always going to be possible
to get data back out of a .leo file without Leo, although it might
require tools / skills somewhat different from plain text. But storing
a tree with clones in plain text would make the plain text so sentinel
heavy it might as well be XML...
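To make the "without Leo" point concrete, here is a sketch that reads headlines and body text from .leo-style XML using only the Python standard library. The sample is a hypothetical, heavily trimmed file (real files carry more elements and attributes), and it assumes headlines sit in vh elements under v elements, with bodies in t elements keyed by the same ID.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal outline: one top-level node with one child.
SAMPLE_LEO = """<leo_file>
<vnodes>
<v t="hans.20111130.1"><vh>Journal</vh>
<v t="hans.20111130.2"><vh>book/binding</vh></v>
</v>
</vnodes>
<tnodes>
<t tx="hans.20111130.1">Top-level notes.</t>
<t tx="hans.20111130.2">How books are bound.</t>
</tnodes>
</leo_file>
"""

def read_outline(xml_text):
    """Return a list of (depth, headline, body) tuples in outline order."""
    root = ET.fromstring(xml_text)
    # Body text is stored separately from the tree, keyed by node ID.
    bodies = {t.get("tx"): (t.text or "") for t in root.iter("t")}
    outline = []

    def walk(v, depth):
        headline = v.findtext("vh", default="")
        outline.append((depth, headline, bodies.get(v.get("t"), "")))
        for child in v.findall("v"):
            walk(child, depth + 1)

    for top in root.find("vnodes").findall("v"):
        walk(top, 0)
    return outline

for depth, headline, body in read_outline(SAMPLE_LEO):
    print("  " * depth + headline + ": " + body)
```

Clones and external @file content are not handled here; the sketch only shows that the tree and text are reachable with generic XML tools.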

> In the browsing/googling I've been doing since, maybe the idea of
> "UNL" are a good approach? Whatever I do, I'll need to keep the link
> source strings in sync with the target name/location - I would just
> rather avoid having to remember to do a full grep search and replace
> manually every time I edit a heading string.

ok, I was perhaps heading off in the wrong direction before, I think I
see what you're saying now.

You want some kind of text-based link which doesn't break when you edit
the components (node headlines) which it targets. UNLs are an example
of a text-based link, but they break when you edit the headlines they
reference. I can't see a simple way to make them unbreakable without
using clones or backlinks, which aren't text based.

I guess you could have a routine which checks for impacted UNLs every
time you edit a headline, or re-arrange nodes (which will also break
UNLs).

Perhaps the document could be scanned for UNLs, the internal node IDs of
the nodes they target noted, and then UNLs could be updated using these
node IDs as needed.
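A toy sketch of that scan-and-update idea, using a plain nested list instead of a real outline. It assumes a UNL is simply the headlines on the path joined by "-->", and that every node has a stable ID; `build_paths`, `record_targets` and `refresh_unls` are hypothetical names.

```python
SEP = "-->"  # assumed separator between headlines in a UNL

def build_paths(tree, prefix=()):
    """Map each node ID to its current headline path.

    tree is a list of (node_id, headline, children) tuples.
    """
    paths = {}
    for node_id, headline, children in tree:
        path = prefix + (headline,)
        paths[node_id] = SEP.join(path)
        paths.update(build_paths(children, path))
    return paths

def record_targets(unls, tree):
    """Before an edit: note which node ID each UNL currently resolves to."""
    by_path = {path: node_id for node_id, path in build_paths(tree).items()}
    return {unl: by_path[unl] for unl in unls if unl in by_path}

def refresh_unls(targets, tree):
    """After an edit: rebuild each noted UNL from its node ID."""
    paths = build_paths(tree)
    return {old: paths[nid] for old, nid in targets.items() if nid in paths}

before = [("id1", "books", [("id2", "binding", [])])]
targets = record_targets(["books-->binding"], before)

# Both headlines were renamed; the IDs stayed the same.
after = [("id1", "library", [("id2", "book binding", [])])]
print(refresh_unls(targets, after))
```

The returned mapping of old to new UNL strings is what a search and replace would then apply. Clones, where one ID has several paths, would need a rule for choosing which path to use.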

Interesting thought.

Cheers -Terry

HansBKK

Dec 1, 2011, 1:52:59 AM
to leo-e...@googlegroups.com, terry_...@yahoo.com
On Wednesday, November 30, 2011 10:35:07 PM UTC+7, Terry wrote:

>> My goal is to *not* make use of structures that are specific to Leo, so if fifteen years from now I'm not using Leo anymore, the next 50,000 chunks of text I've captured/written won't need a programmer to continue being accessible, relatively easy to transform into whatever I'll be using next.

> Fundamentally Leo stores data in XML.  It's always going to be possible to get data back out of a .leo file without Leo, although it might require tools / skills somewhat different from plain text.  But storing a tree with clones in plain text would make the plain text so sentinel heavy it might as well be XML...

Yes, but the beauty of the @shadow structure is, "who cares"? The only people that are going to look at the sentinels are programmers who care about Leo. Therefore sentinels in @shadow files can carry all the important data right out in the filesystem.


>> In the browsing/googling I've been doing since, maybe the idea of "UNL" are a good approach? Whatever I do, I'll need to keep the link source strings in sync with the target name/location - I would just rather avoid having to remember to do a full grep search and replace manually every time I edit a heading string.

> ok, I was perhaps heading off in the wrong direction before, I think I see what you're saying now.
> You want some kind of text based link which doesn't break when you edit the components (node headlines) which it targets. 

Exactly. The "anchor" would ideally be located anywhere in the body, either at a specific point or spanning a range of text in the chunk. I just started with << section heading >> as a well-defined example, and since Leo nodes can represent any chunk/snippet of text in the output stream (in effect an "include" macro), they offer a degree of flexibility.

 
> UNLs are an example of a text-based link, but they break when you edit the headlines they reference.

As would anything. I started off by asking if a tool existed, or would be possible:

>> that keeps track and ensures that if/when I change the << section headers >> text, it automatically does a search and replace throughout the whole shebang, keeping the other inline references in sync with the original target.

>> Ideally this would operate on all @ <file> node types, even if they're not fully "loaded"? Although I concede a problem with @ <file> directives that aren't supposed to touch the external source, I think there should be a configurable setting to make this particular automated edit process an exception.

Let's call this hypothetical tool "FixLinks" - here's the workflow:
  - Save the .leo file and confirm the filesystem's in sync
    - (make a backup, commit to your VCS, whatever makes you feel secure)
  - open the FixLinks dialog and copy the text you're about to edit to "search"
  - edit the text and save the .leo file again
  - copy the new string to FixLinks' "replace" field and press the "go" button

FixLinks now greps the whole directory tree **including** the private @shadow folders, but skipping .svn, .bzr etc. folders, replacing oldString with newString throughout.
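A rough sketch of what the core of such a FixLinks tool could look like, with no dialog, just the search and replace. It assumes version-control folders such as .svn and .bzr are meant to be skipped, since rewriting their contents would corrupt them. The function names and the folder name `private_shadow` in the demo are hypothetical.

```python
import os
import tempfile

SKIP_DIRS = {".svn", ".bzr", ".git", ".hg"}  # assumed list of folders to leave alone

def fix_links(root, old, new, encoding="utf-8"):
    """Replace old with new in every text file under root; return changed paths."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into version-control folders.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # newline="" keeps the file's original line endings untouched.
                with open(path, encoding=encoding, newline="") as f:
                    text = f.read()
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            if old in text:
                with open(path, "w", encoding=encoding, newline="") as f:
                    f.write(text.replace(old, new))
                changed.append(path)
    return sorted(changed)

def demo():
    """Build a throwaway tree, run fix_links, return changed paths relative to it."""
    with tempfile.TemporaryDirectory() as root:
        files = {
            "notes.txt": "See also: << book/binding >>\n",
            os.path.join("private_shadow", "notes.txt"): "<< book/binding >>\n",
            os.path.join(".svn", "entries"): "<< book/binding >>\n",
        }
        for rel, text in files.items():
            full = os.path.join(root, rel)
            os.makedirs(os.path.dirname(full), exist_ok=True)
            with open(full, "w", encoding="utf-8", newline="") as f:
                f.write(text)
        changed = fix_links(root, "<< book/binding >>", "<< bookbinding >>")
        return [os.path.relpath(p, root) for p in changed]

print(demo())
```

As in the workflow above, this is best run only after a backup or commit, since the replacement is done in place and is a plain substring match.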


> I can't see a simple way to make them unbreakable without using clones or backlinks, which aren't text based.

Well let me go out on a limb here, and not just to solve my little problem, since I think FixLinks is a generic answer to that regardless of a Leo-specific implementation, but if my @shadow related comments above hold true:

Why can't backlinks (or even clones ??!!) now be stored in the private-file sentinels?


> I guess you could have a routine which checks for impacted UNLs every time you edit a headline, or re-arrange nodes (which will also break UNLs).

An automated background FixLinks tool would of course be much better than the above workflow, but even that would take care of the edit-a-headline issue. Here's my workaround solution (call me Mr Kludge 8-) for the location issue.

First I'll bore you with an account of my usual writing workflow - bear with me, it pertains.

- Anytime I'm going to do any data-entry, creating or editing prose content anywhere, it first gets written in my journal, sort of a "global scratchpad". For that I'm currently using RedNotebook, a python-based journal-cum-editing-applet that stores all its data as plaintext, marked up with txt2tags and structured with YAML.

  - When I'm done with an editing session, I copy the content from there and paste it into wherever it belongs

  - I keep my team's Projects/Action stuff ("to-do's") in Redmine, "hard landscape" appointments and calendar data in iCal/CalDAV, contact data in a CRM backed by an LDAP store, both integrated with email.

  - The rest (reference data, an all-topics "knowledgeBase") is basically plain text in varying levels of structure, stored in a well-organized folder hierarchy (much of which is in the process of being liberated from my before-mentioned EvernoteV2 databases).

  - I always *copy*, never cut, from the journal database, leaving the original text there as an archive record for ever, never go back and edit it, nor refactor/organize it in any way. If existing text needs editing, I first copy it to RedNotebook and do it there. Sure the journal data gets huge, but who cares? Once I've finished parsing it for significant data I'll very rarely ever look at it again unless I really need to, and having the redundant original data has saved my butt on several occasions over the years. BTW everything is also under VC and of course well backed up.

Now to adapt this to Leo - I propose a "canonical tree", which is simply structured by date - YYYY / MM / DD (possibly HHMM, but I wouldn't). All content gets created as nodes there first, and that location never changes. Clone nodes from there to "where they belong", by topic, function, whatever, but always use the master hierarchy location when creating UNLs. I might even go so far as to copy clones needing editing to today and then use FixLinks to update my content - intra-leo version control - but YMMV.
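A tiny sketch of how a link into such a date-based tree could be generated, so the UNL depends only on the creation date and the headline. The root name "journal", the "-->" separator and the function name `canonical_unl` are assumptions for illustration.

```python
from datetime import date

SEP = "-->"  # assumed separator between headlines in a UNL

def canonical_unl(day, headline, root="journal"):
    """Build the YYYY / MM / DD path for a node created on the given day."""
    return SEP.join([root, f"{day:%Y}", f"{day:%m}", f"{day:%d}", headline])

print(canonical_unl(date(2011, 12, 1), "book/binding"))
# journal-->2011-->12-->01-->book/binding
```

Since the date part never changes, only a headline edit can break such a link, which is the case FixLinks is meant to handle.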

To anyone who's followed along this far, I thank you for your patience 8-)


HansBKK

Dec 1, 2011, 3:05:36 AM
to leo-e...@googlegroups.com, terry_...@yahoo.com
Found this syntax regarding reST/Sphinx, so if such links work within Leo, maybe I just have to bite the bullet and find a way to transform my txt2tags to reST syntax, and perhaps go from there to AsciiDoc rather than from txt2tags.

In looking at possible ways to do that, I've put together a pivot table in a Google spreadsheet showing the various paths.

The docutils finite state machine (behind reST/Sphinx) looks promising, and it's part of the python ecosystem, but Pandoc seems to be more tied into the outside world so far. . .






HansBKK

Dec 1, 2011, 9:48:19 PM
to leo-e...@googlegroups.com, terry_...@yahoo.com
Found this snippet of wisdom browsing the archives, couldn't resist:

>> Backlinks are what I thought about immediately when confronted w/ this
>> problem, but then abandoned the idea. The problem w/ backlinks is that
>> they live in the object / opaque gnx / leo world, not the ever elusive
>> flat text world. And flat text world is what will always win :-)

Amen brother!
8-)
