vs code chronicles: code folding and Leo's importers

Edward K. Ream

Aug 11, 2020, 12:45:42 PM
to leo-editor
Yesterday I tried to import some complex c++ code into Leo. Leo's c++ importer didn't do well with templates. A lot of hand tweaking was necessary.

This work did create a head-slapping moment: the hard part of Leo's importers is determining the lines on which classes, methods and functions begin and end. Well, vs code must do something very similar! vs code automatically determines the folding units, the places where code folding begins and ends.

So how does vs code determine folding units? Might vs code use something like LSP? Well, LSP does not appear to offer language parsing/folding features.

I use vs code to study vs code's own sources. So let's take a look. The extensions/cpp folder contains lots of data. Here is extensions/cpp/language-configuration.json:

{
    "comments": {
        "lineComment": "//",
        "blockComment": ["/*", "*/"]
    },
    "brackets": [
        ["{", "}"],
        ["[", "]"],
        ["(", ")"]
    ],
    "autoClosingPairs": [
        { "open": "[", "close": "]" },
        { "open": "{", "close": "}" },
        { "open": "(", "close": ")" },
        { "open": "'", "close": "'", "notIn": ["string", "comment"] },
        { "open": "\"", "close": "\"", "notIn": ["string"] }
    ],
    "surroundingPairs": [
        ["{", "}"],
        ["[", "]"],
        ["(", ")"],
        ["\"", "\""],
        ["'", "'"],
        ["<", ">"]
    ],
    "folding": {
        "markers": {
            "start": "^\\s*#pragma\\s+region\\b",
            "end": "^\\s*#pragma\\s+endregion\\b"
        }
    }
}

This is promising! We have a .json file that provides basic data about the c++ tokens.

And what's this at the end? It looks like vs code natively supports two pragmas that users can use to define custom folds. A quick experiment confirms that these pragmas do create custom folding regions.
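
For anyone who wants to play with the idea outside vs code, here is a minimal Python sketch of marker-based folding. It reuses the two regexes from the "folding" entry above; the function name marker_folds and the stack-based pairing of nested regions are my assumptions, not a description of vs code's actual code.

```python
import re

# Patterns copied from the "folding.markers" entry above (JSON escapes removed).
START = re.compile(r"^\s*#pragma\s+region\b")
END = re.compile(r"^\s*#pragma\s+endregion\b")

def marker_folds(lines):
    """Return sorted (start, end) pairs of 0-based line indices.

    Hypothetical helper: pairs each end marker with the nearest
    unmatched start marker, so nested regions work.
    """
    stack, folds = [], []
    for i, line in enumerate(lines):
        if START.match(line):
            stack.append(i)
        elif END.match(line) and stack:
            folds.append((stack.pop(), i))
    return sorted(folds)

source = """\
#pragma region helpers
int twice(int x) { return 2 * x; }
  #pragma region nested
int thrice(int x) { return 3 * x; }
  #pragma endregion
#pragma endregion
int main() { return twice(thrice(1)); }
""".splitlines()

print(marker_folds(source))  # [(0, 5), (2, 4)]
```

Unmatched end markers are silently ignored in this sketch; a real editor may treat them differently.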

The big question

I have not yet discovered how vs code parses c++ files in order to determine folding regions. Googling hasn't revealed any answers yet. Incremental parsing is always tricky: vs code cannot assume the text is syntactically valid!

I wonder whether the electron editor might be the base on which folding happens. Iirc, electron is part of vs code, so the answer is likely present somewhere in vs code's sources.

Summary

vs code can calculate fold regions for c++ and many other languages. Does anyone know how (and where) vs code calculates these fold regions?

The vs code way probably won't help Leo's importers, but I'd like to know for sure.

Edward

Edward K. Ream

Aug 11, 2020, 12:51:17 PM
to leo-editor
On Tuesday, August 11, 2020 at 11:45:42 AM UTC-5, Edward K. Ream wrote:

> Does anyone know how (and where) vs code calculates these fold regions?

Hah! I forgot vs code's global search command. Searching for "fold" gives too many hits on "folders", but searching for "folding" yields promising results. And that's even before using a word-only or a regex search.

I'll report my results later today or tomorrow.

Edward

Thomas Passin

Aug 11, 2020, 1:27:31 PM
to leo-editor
The editor component is a Microsoft project called Monaco -


Monaco itself can do code folding.  Maybe it would be easier to find the code in its code base than in vscode's.

On Tuesday, August 11, 2020 at 12:45:42 PM UTC-4, Edward K. Ream wrote:
Yesterday I tried to import some complex c++ code into Leo. Leo's c++ importer didn't do well with templates. A lot of hand tweaking was necessary.
[snip]

Edward K. Ream

Aug 11, 2020, 5:03:58 PM
to leo-editor
On Tue, Aug 11, 2020 at 12:27 PM Thomas Passin <tbp1...@gmail.com> wrote:
The editor component is a Microsoft project called Monaco -


Monaco itself can do code folding.  Maybe it would be easier to find the code in its code base than in vscode's.

Thanks for this tip!

Edward

Edward K. Ream

Aug 11, 2020, 6:01:10 PM
to leo-editor
On Tuesday, August 11, 2020 at 11:51:17 AM UTC-5 Edward K. Ream wrote:

Hah! I forgot vs code's global search command. Searching for "fold" gives too many hits on "folders", but searching for "folding" yields promising results. And that's even before using a word-only or a regex search.

It's probably best to search vs code's sources, not monaco's sources. The monaco editor is for the browser. It uses shims to adapt the vs base code, and being in the browser limits what the monaco editor can do.

A regex search for \bfold\b yields many valuable clues. The takeaways:

- The vscode\src\vs folder likely contains all the relevant sources.

- The monaco sources are in monaco.d.ts. However, there are many hits elsewhere, including editor\contrib\folding\folding.ts.

In short, there is a huge code base related to code folding. It all looks like ts to me. I don't see any way to use it in Leo, but I'll look a bit further...

Edward

Edward K. Ream

Aug 11, 2020, 6:28:44 PM
to leo-editor
On Tue, Aug 11, 2020 at 5:01 PM Edward K. Ream <edre...@gmail.com> wrote:

> I don't see any way to use [the ts code] in Leo, but I'll look a bit further...

Searching for foldingProvider yields:
src\vs\workbench\api\common\extHostLanguageFeatures.ts

There are related providers in various languages, which lead to the FoldingRange class in vscode\src\vs\vscode.d.ts.

Bingo! Searching on FoldingRange yields various language-specific "servers" (and other classes), all written in ts.

Searching on languageMode also yields good results.

Hmm. vscode\src\vs\vscode.d.ts does have a startServer method, but it's not clear what the server is, or what it does. The code is a labyrinth.

Leo is good at maneuvering through such labyrinths, but a study outline of vscode\src\vs would be huge.

It's time for a break.  

Edward

Edward K. Ream

Aug 11, 2020, 7:04:43 PM
to leo-editor
On Tuesday, August 11, 2020 at 5:28:44 PM UTC-5 Edward K. Ream wrote:

> The code is a labyrinth...It's time for a break.

I googled "vs code folding" and found this manual page, containing this quote:

QQQ
Folding regions are by default evaluated based on the indentation of lines. A folding region starts when a line has a smaller indent than one or more following lines, and ends when there is a line with the same or smaller indent.

Since the 1.22 release, folding regions can also be computed based on syntax tokens of the editor's configured language. The following languages already provide syntax aware folding: Markdown, HTML, CSS, LESS, SCSS, and JSON.

If you prefer to switch back to indentation-based folding for one (or all) of the languages above, use:

"[html]": { "editor.foldingStrategy": "indentation" },
QQQ

This explains a lot :-) It's a disappointingly simple strategy, one that I never would have considered. Or maybe it's just a brilliantly simple strategy.  Hehe, probably not. It creates too many folding units. They don't matter much in vs code, but they would not be welcome in Leo. Leo's importers can't use this strategy.
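
To make the quoted rule concrete, here is a small Python sketch of indentation-based folding. It follows only the description in the quote above. The name indent_folds, the tab_size parameter and the decision to skip blank lines are my assumptions; vs code's real implementation is in ts and may differ in such details.

```python
def indent_folds(lines, tab_size=4):
    """Return (start, end) 0-based line ranges, one per indentation-based fold."""
    def indent(line):
        expanded = line.expandtabs(tab_size)
        return len(expanded) - len(expanded.lstrip(" "))

    # Assumption: blank lines carry no indentation information, so skip them.
    rows = [(i, indent(s)) for i, s in enumerate(lines) if s.strip()]
    folds = []
    for k, (start, level) in enumerate(rows[:-1]):
        if rows[k + 1][1] <= level:
            continue  # The next line is not deeper, so no fold starts here.
        end = start
        for i, other in rows[k + 1:]:
            if other <= level:
                break  # A line with the same or smaller indent closes the region.
            end = i
        folds.append((start, end))
    return folds

source = """\
class Point {
  public:
    Point(int x, int y)
        : x_(x), y_(y) {}
    int norm1() const {
        return abs(x_) + abs(y_);
    }
  private:
    int x_, y_;
};
""".splitlines()

print(indent_folds(source))  # [(0, 8), (1, 6), (2, 3), (4, 5), (7, 8)]
```

Five folds for a ten-line class, including one for a wrapped initializer list and one for each access label: that is the "too many folding units" problem in miniature.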

Summary

Today's explorations have been useful. They have taught me vs code's searching capabilities.

vs code's documentation explains the clever way that vs code discovers folding units. It is independent of language!

Alas, this clever approach will be of no use to Leo's importers. Imo, the importers must know more about language syntax than indentation.  I plan no further explorations in this area.

Edward

jkn

Aug 12, 2020, 5:43:38 PM
to leo-editor
I've used other editors with simple configurable folding modes, including this 'use the level of indentation' one.

Like you, I find it disappointingly limited, & never really used it in anger.

I can't remember if I've mentioned Origami, the old Transputer Development System's (DOS-based) editor. It had a wonderful implementation and key binding for code folding.

Edward K. Ream

Aug 13, 2020, 7:47:36 AM
to leo-editor
On Wed, Aug 12, 2020 at 4:43 PM jkn <jkn...@nicorp.f9.co.uk> wrote:

I can't remember if I've mentioned Origami, the old Transputer Development System's (DOS-based) editor. It had a wonderful implementation and key binding for code folding.

A little googling took me here. Thanks for the implied link.

I'm not sure how Origami creates external files and remembers the folds. Perhaps "..." and "{{{" are the equivalent of sentinel comments. Or maybe the text file is like a .leo file and there is a way to create external files from the origami file.  Obviously, clones would add more complexity.

Edward

Thomas Passin

Aug 13, 2020, 8:33:15 AM
to leo-e...@googlegroups.com
Here's someone who re-implemented Origami.  Here he talks a bit about how folding was implemented (double-linked n-trees) -


If you go to the parent directory, you can get the entire source (in C, apparently from 1997).

And some more folding editors from long ago -

Edward K. Ream

Aug 13, 2020, 9:35:02 AM
to leo-editor
On Thu, Aug 13, 2020 at 7:33 AM Thomas Passin <tbp1...@gmail.com> wrote:
Here's someone who re-implemented Origami.  Here he talks a bit about how folding was implemented (double-linked n-trees) -


Thanks for this. Imo, emacs org mode is likely an extension of these ideas.

Edward

k-hen

Aug 13, 2020, 2:09:21 PM
to leo-editor
Not sure if it's relevant, but it might be worth looking at Tree-Sitter which I've always found interesting.
It was being developed for Atom prior to the Microsoft GitHub acquisition.
More recently, CodeMirror (an alternative to Monaco) has followed suit and implemented something similar, called Lezer.


Kevin

jkn

Aug 13, 2020, 4:39:00 PM
to leo-editor

My comments here have probably been overtaken by other postings. But yes, IIRC the folding marks just appeared in the resulting output occam file as sentinel-like marks.

Origami had the advantage that it was only really used for occam, so you could be sure about what marks to use. I can't remember if the sentinels were 'comment-ified', or a sort of language addition.

    J^n

Peter GAAL

Aug 14, 2020, 11:21:40 AM
to leo-e...@googlegroups.com
There is no trace of "folders" anywhere in my only surviving "Programming in occam 2" manual (by Geraint Jones and Michael Goldsmith from 1988), so the folding concept must have been definitively a feature of the original Transputer Development System (TDS) of INMOS.

As it was an integrated development environment to produce transputer code anyway, the appearance of sentinel comments in the listings didn’t affect the compiler necessarily. Me too, I do remember vividly having seen them in the output — I was so impressed, I never desisted from using folders in vim since then ;-)

The great thing about computers is that
there is always a way to do something.

jkn

Aug 15, 2020, 5:59:44 PM
to leo-editor
I re-read that link as well; something that amused me is that to this day, when writing out emails or other screeds of text, I tend to delimit 'blocks', for my own benefit, with '{{{' and '}}}'

blah blah blah, here I am, chatting about some code, say

the program looks like this:

{{{  <I might have a description here>
     int c;
    char * p;
    /* etc */
}}}

I had forgotten that this more-or-less comes from Origami!

I guess now is the time to ask about an import or language setting that understands this? Even though I no longer write occam, I like the format markers, and it would be cute to be able to import stuff like this into Leo.
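
As a rough idea of what such an import might involve, here is a minimal Python sketch that turns '{{{' / '}}}' markers into a nested tree. It is not Leo's importer API and not Origami's real file format: the function name parse_folds and the dict layout are made up for illustration, and the marker syntax is taken only from the example above.

```python
def parse_folds(text):
    """Parse '{{{ headline' ... '}}}' markers into a nested tree of dicts.

    Hypothetical helper: each node has a headline, its own body lines,
    and a list of child nodes.
    """
    root = {"headline": "root", "body": [], "children": []}
    stack = [root]
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("{{{"):
            node = {"headline": stripped[3:].strip(), "body": [], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif stripped.startswith("}}}") and len(stack) > 1:
            stack.pop()  # Close the innermost open fold.
        else:
            stack[-1]["body"].append(line)
    return root

text = """\
the program looks like this:
{{{  declarations
     int c;
{{{  pointers
    char * p;
}}}
}}}
"""

tree = parse_folds(text)
decls = tree["children"][0]
print(decls["headline"])                 # declarations
print(decls["children"][0]["headline"])  # pointers
print(decls["children"][0]["body"])      # ['    char * p;']
```

An unclosed '{{{' simply stays open to the end of the text in this sketch, and a stray '}}}' at top level is kept as ordinary body text.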

{{{ dot-sig'ly yours
    J^n