Text stylers

Phil Norman

Feb 7, 2021, 4:31:41 PM

to evergre...@googlegroups.com

Hi.

I have disliked the way the text styling is done for many years: it's always been a case of either making do with the slightly-genericised C-like styler, or writing an entire styler from scratch (which is a big investment). However, it's not an easy problem to solve elegantly.

I had a bit of a brain-wave last night, and finally figured out a reasonably simple idea for simplifying the text styling implementation. So today I had a bit of a hack, and found it was fairly easy to get working.

I've pushed the changes, and will be using the new implementation for work as of tomorrow. However, so far it's only undergone relatively light testing, so update with care.

The changes have made it relatively easy to fix a bunch of long-standing gripes of mine, so with this update we also get:

1: Multiline strings in C++, eg R"V0G0N(my string here)V0G0N".

2: Multiline strings in python, eg """ here """ or ''' this instead '''.

3: Multiline strings in Go, eg ` `. These worked before, but now there's no go-specific code.

4: Single- and double-quoted strings in bash are now (correctly) treated as potentially multi-line.

5: Bash also gains support for <<EOF 'heredoc' strings.
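A minimal sketch of how the multi-line string forms listed above could be described by one table of delimiters. This is not Evergreen's actual implementation; `STRING_RULES` and `find_strings` are hypothetical names, and the sketch ignores comments, escapes and bash variants such as `<<-EOF`.

```python
import re

# Hypothetical table: language -> list of (opener regex, function that builds
# the matching closer from the opener match).
STRING_RULES = {
    # C++ raw string: R"delim( ... )delim"
    "cpp": [(re.compile(r'R"([^()\\\s]{0,16})\('), lambda m: ")" + m.group(1) + '"')],
    # Python triple-quoted strings close with the same quotes that opened them.
    "python": [(re.compile(r"'''|" + r'"""'), lambda m: m.group(0))],
    # Go raw strings are delimited by backticks.
    "go": [(re.compile(r"`"), lambda m: "`")],
    # Bash heredoc: <<WORD ... up to a line containing only WORD.
    "bash": [(re.compile(r"<<(\w+)"), lambda m: "\n" + m.group(1) + "\n")],
}


def find_strings(language, text):
    """Return (start, end) offsets of each multi-line string found in text."""
    spans = []
    pos = 0
    rules = STRING_RULES[language]
    while pos < len(text):
        # Pick the earliest opener among this language's rules.
        best = None
        for opener, closer_for in rules:
            m = opener.search(text, pos)
            if m and (best is None or m.start() < best[0].start()):
                best = (m, closer_for)
        if best is None:
            break
        m, closer_for = best
        closer = closer_for(m)
        end = text.find(closer, m.end())
        # An unterminated string is treated as running to the end of the text.
        end = len(text) if end == -1 else end + len(closer)
        spans.append((m.start(), end))
        pos = end
    return spans
```

With a table like this, adding a language means adding a row rather than writing a new styler, which is the appeal of the approach.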

The languages that are impacted by this change are Bash, C++, Go, Java, Proto and Python. If you get any weird crashes with these, or wacky redraw errors, please let me know.

In the slightly longer-term, I'd quite like to make the language identification, styling and support in general more data-driven. At some places of work there are internal-only languages for which it'd be useful to get highlighting support without having to expose details in a public github repo, and switching to a more data-driven model would allow support for this.
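As a rough illustration of what "data-driven" language identification might look like, here is a sketch in which the rules are plain data. The rule format and the name `identify_language` are assumptions made for this example, not anything Evergreen defines; in practice the list could be read from a private file, so that internal-only languages never need to appear in a public repository.

```python
import re

# Hypothetical rules. Each entry is plain data, so it could equally be loaded
# from a JSON or similar file kept outside the public source tree.
LANGUAGE_RULES = [
    {"language": "bash", "extensions": [".sh", ".bash"], "first_line": r"^#!.*\b(ba)?sh\b"},
    {"language": "python", "extensions": [".py"], "first_line": r"^#!.*\bpython"},
    {"language": "go", "extensions": [".go"], "first_line": None},
]


def identify_language(filename, first_line=""):
    """Return the language name for a file, or None if no rule matches."""
    # File extensions are checked first because they are the cheapest signal.
    for rule in LANGUAGE_RULES:
        if any(filename.endswith(ext) for ext in rule["extensions"]):
            return rule["language"]
    # Fall back to the first line of content, e.g. a #! line.
    for rule in LANGUAGE_RULES:
        pattern = rule["first_line"]
        if pattern and re.search(pattern, first_line):
            return rule["language"]
    return None
```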

Anyway, feedback/comments/whatever welcome. Have a good evening :-)

Phil

Elliott Hughes

Feb 8, 2021, 10:14:47 PM

to evergre...@googlegroups.com

On Sun, Feb 7, 2021, 13:31 Phil Norman <phil...@gmail.com> wrote:

> Hi.

> I have disliked the way the text styling is done for many years: it's always been a case of either making do with the slightly-genericised C-like styler, or writing an entire styler from scratch (which is a big investment). However, it's not an easy problem to solve elegantly.

One thing I don't get is how so many more recent editors are using regexes to configure their stylers. Though to be honest I've not used them, and the one recent editor I have used quite a bit (VS Code) is a bit slow on something like a rpi400. So maybe they're not actually getting away with it? Very easy and convenient for configuration though.

> I had a bit of a brain-wave last night, and finally figured out a reasonably simple idea for simplifying the text styling implementation. So today I had a bit of a hack, and found it was fairly easy to get working.

> I've pushed the changes, and will be using the new implementation for work as of tomorrow. However, so far it's only undergone relatively light testing, so update with care.

> The changes have made it relatively easy to fix a bunch of long-standing gripes of mine, so with this update we also get:

> 1: Multiline strings in C++, eg R"V0G0N(my string here)V0G0N".
> 2: Multiline strings in python, eg """ here """ or ''' this instead '''.
> 3: Multiline strings in Go, eg ` `. These worked before, but now there's no go-specific code.
> 4: Single- and double-quoted strings in bash are now (correctly) treated as potentially multi-line.
> 5: Bash also gains support for <<EOF 'heredoc' strings.

Nice!

> The languages that are impacted by this change are Bash, C++, Go, Java, Proto and Python. If you get any weird crashes with these, or wacky redraw errors, please let me know.

> In the slightly longer-term, I'd quite like to make the language identification, styling and support in general more data-driven. At some places of work there are internal-only languages for which it'd be useful to get highlighting support without having to expose details in a public github repo, and switching to a more data-driven model would allow support for this.

> Anyway, feedback/comments/whatever welcome. Have a good evening :-)

Hope you're all doing okay. Sounds like you survived your brush with covid at least!

> Phil

--
You received this message because you are subscribed to the Google Groups "evergreen-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to evergreen-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/evergreen-users/CAOa8eG4eHBBGargFqJpnUxZSxmJcMg%3DDbcbR1n8XQRSj4yYnQQ%40mail.gmail.com.

Phil Norman

Feb 12, 2021, 11:39:31 AM

to evergre...@googlegroups.com

On Tue, 9 Feb 2021 at 04:14, Elliott Hughes <elliott....@gmail.com> wrote:

> On Sun, Feb 7, 2021, 13:31 Phil Norman <phil...@gmail.com> wrote:
>> Hi.

>> I have disliked the way the text styling is done for many years: it's always been a case of either making do with the slightly-genericised C-like styler, or writing an entire styler from scratch (which is a big investment). However, it's not an easy problem to solve elegantly.

> One thing I don't get is how so many more recent editors are using regexes to configure their stylers. Though to be honest I've not used them, and the one recent editor I have used quite a bit (VS Code) is a bit slow on something like a rpi400. So maybe they're not actually getting away with it? Very easy and convenient for configuration though.

Sure, regexps are great, once you get them to work.

Actually, on that subject, I was wondering about some kind of regexp helper. There are regexp debuggers/editors out there in the wild, so maybe allowing one of those to be started from Evergreen would be good. But maybe some simple built-in thing would be good. Something that basically:

1: Understands how to take a string from $language of a particular type, and de-escape it.

2: Dumps that into an edit widget, along with a box you can copy/paste (or type) some to-be-matched text in.

3: Does something kind of similar to the "Find/Replace" window, highlighting matches and showing what the matching segments would be.

4: Understands how to re-escape the string into $language for writing back.

Of course, if there's a good existing solution, finding it and recommending it would be good enough. Particularly if it's just another open source package, like ctags.
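The four helper steps listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, only a few simple escapes of a Java/C-style double-quoted literal are handled, and Python's `re` dialect stands in for whichever regexp engine the target language actually uses.

```python
import re

# Only the most common single-character escapes are handled in this sketch.
_SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", '"': '"', "'": "'"}


def unescape_literal(literal):
    """Step 1: turn a double-quoted source literal into the text it denotes."""
    body = literal[1:-1]  # drop the surrounding quotes
    # Unknown escapes are left untouched rather than guessed at.
    return re.sub(r"\\(.)", lambda m: _SIMPLE_ESCAPES.get(m.group(1), m.group(0)), body)


def show_matches(pattern, sample):
    """Steps 2 and 3: list each match as (start, end, text, groups)."""
    return [(m.start(), m.end(), m.group(0), m.groups())
            for m in re.finditer(pattern, sample)]


def escape_literal(text):
    """Step 4: re-escape edited text into a double-quoted source literal."""
    out = text.replace("\\", "\\\\").replace('"', '\\"')
    out = out.replace("\n", "\\n").replace("\t", "\\t").replace("\r", "\\r")
    return '"' + out + '"'
```

For example, the source literal `"(\\d+)\\.(\\d+)"` would be shown for editing as `(\d+)\.(\d+)`, tried against sample text, and then written back with the backslashes doubled again.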

> Hope you're all doing okay. Sounds like you survived your brush with covid at least!

Yep, doing fine thanks. All covid-recovered and back to normal. Well, as far as anything's normal these days. I hope you're well too. Take care :-)

Phil

Elliott Hughes

Feb 12, 2021, 8:55:46 PM

to evergre...@googlegroups.com

On Fri, Feb 12, 2021, 08:39 Phil Norman <phil...@gmail.com> wrote:

> On Tue, 9 Feb 2021 at 04:14, Elliott Hughes <elliott....@gmail.com> wrote:
>> On Sun, Feb 7, 2021, 13:31 Phil Norman <phil...@gmail.com> wrote:
>>> Hi.

>>> I have disliked the way the text styling is done for many years: it's always been a case of either making do with the slightly-genericised C-like styler, or writing an entire styler from scratch (which is a big investment). However, it's not an easy problem to solve elegantly.

>> One thing I don't get is how so many more recent editors are using regexes to configure their stylers. Though to be honest I've not used them, and the one recent editor I have used quite a bit (VS Code) is a bit slow on something like a rpi400. So maybe they're not actually getting away with it? Very easy and convenient for configuration though.

> Sure, regexps are great, once you get them to work.

> Actually, on that subject, I was wondering about some kind of regexp helper. There are regexp debuggers/editors out there in the wild, so maybe allowing one of those to be started from Evergreen would be good. But maybe some simple built-in thing would be good. Something that basically:

> 1: Understands how to take a string from $language of a particular type, and de-escape it.
> 2: Dumps that into an edit widget, along with a box you can copy/paste (or type) some to-be-matched text in.
> 3: Does something kind of similar to the "Find/Replace" window, highlighting matches and showing what the matching segments would be.
> 4: Understands how to re-escape the string into $language for writing back.

> Of course, if there's a good existing solution, finding it and recommending it would be good enough. Particularly if it's just another open source package, like ctags.

I literally use the find/replace window for this. It's how that ui came about. Stick your positive and negative examples in a file, bring up the window, and mess around.

Doesn't everyone have raw strings these days so they don't need to escape regexes any more? If not, sounds doable in an ExternalTool?

>> Hope you're all doing okay. Sounds like you survived your brush with covid at least!

> Yep, doing fine thanks. All covid-recovered and back to normal.

Glad to hear it!

> Well, as far as anything's normal these days. I hope you're well too. Take care :-)
> Phil


Phil Norman

Feb 13, 2021, 5:53:47 AM

to evergre...@googlegroups.com

On Sat, 13 Feb 2021 at 02:55, Elliott Hughes <elliott....@gmail.com> wrote:

> On Fri, Feb 12, 2021, 08:39 Phil Norman <phil...@gmail.com> wrote:
>> Actually, on that subject, I was wondering about some kind of regexp helper. There are regexp debuggers/editors out there in the wild, so maybe allowing one of those to be started from Evergreen would be good. But maybe some simple built-in thing would be good. Something that basically:

>> 1: Understands how to take a string from $language of a particular type, and de-escape it.
>> 2: Dumps that into an edit widget, along with a box you can copy/paste (or type) some to-be-matched text in.
>> 3: Does something kind of similar to the "Find/Replace" window, highlighting matches and showing what the matching segments would be.
>> 4: Understands how to re-escape the string into $language for writing back.

>> Of course, if there's a good existing solution, finding it and recommending it would be good enough. Particularly if it's just another open source package, like ctags.

> I literally use the find/replace window for this. It's how that ui came about. Stick your positive and negative examples in a file, bring up the window, and mess around.

Ha, interesting. This approach has a few drawbacks though. The text field is a standard JTextField, so doesn't have quite the same behaviour as the main text widget. It also doesn't remember much - it generally remembers what it was showing last, unless you happen to open it with some text selected that's shorter than a line. I don't think it allows one to debug multi-line-matching ((?m), I think) regexps - and those are, IME, the hardest to get right.
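For reference, a small sketch of what the multi-line flags change, using Python's `re` module as a stand-in (Java's `java.util.regex` treats `(?m)` and `(?s)` the same way, as far as I know). The sample text and variable names are made up for the illustration.

```python
import re

text = "alpha\nbeta\ngamma"

# With no flags, ^ and $ anchor only at the start and end of the whole text,
# so a single-word pattern finds nothing in three-line input.
no_flags = re.findall(r"^\w+$", text)

# (?m), MULTILINE, lets ^ and $ match at the start and end of every line.
per_line = re.findall(r"(?m)^\w+$", text)

# (?s), DOTALL, is the flag that lets "." cross newlines, which is what a
# pattern needs if one match is meant to span several lines.
without_dotall = re.findall(r"alpha.*gamma", text)
with_dotall = re.findall(r"(?s)alpha.*gamma", text)

print(no_flags, per_line, without_dotall, with_dotall)
```

So `(?m)` changes what the anchors mean, while `(?s)` is usually the one needed for a match that covers more than one line; the two are easy to mix up.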

> Doesn't everyone have raw strings these days so they don't need to escape regexes any more? If not, sounds doable in an ExternalTool?

Ha, if only. C++, Go and Python all have literal strings; Java still doesn't. There's some experimental preview feature to treat """...""" as multi-line strings, but it's not clear to me whether they support escaping within them (in which case, regexps are still a pain). I tried to try it out, but when running 'java --enable-preview --release 14', it fails to create a virtual machine. No idea why.
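To make the escaping point concrete, here is a tiny sketch in Python, one of the languages with raw strings. The variable names are illustrative only.

```python
import re

# Ordinary literal: every regexp backslash has to be doubled, which is what
# a language without raw strings forces on you.
escaped = "\\d+\\.\\d+"

# Raw literal: the regexp is written exactly as the engine will see it.
raw = r"\d+\.\d+"

same_pattern = escaped == raw
found = re.findall(raw, "versions 1.2 and 10.25")
```

As far as I know, Java text blocks do process escape sequences, so backslashes in a regexp would still need doubling there. Also, `--release` is a `javac` option rather than a `java` launcher option, which may be why the VM failed to start; compiling with `javac --enable-preview --release 14` and then running with `java --enable-preview` is the usual combination.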

Cheers,

Phil

Elliott Hughes

Feb 14, 2021, 3:56:03 PM

to evergre...@googlegroups.com

On Sat, Feb 13, 2021, 02:53 Phil Norman <phil...@gmail.com> wrote:

> On Sat, 13 Feb 2021 at 02:55, Elliott Hughes <elliott....@gmail.com> wrote:
>> On Fri, Feb 12, 2021, 08:39 Phil Norman <phil...@gmail.com> wrote:
>>> Actually, on that subject, I was wondering about some kind of regexp helper. There are regexp debuggers/editors out there in the wild, so maybe allowing one of those to be started from Evergreen would be good. But maybe some simple built-in thing would be good. Something that basically:

>>> 1: Understands how to take a string from $language of a particular type, and de-escape it.
>>> 2: Dumps that into an edit widget, along with a box you can copy/paste (or type) some to-be-matched text in.
>>> 3: Does something kind of similar to the "Find/Replace" window, highlighting matches and showing what the matching segments would be.
>>> 4: Understands how to re-escape the string into $language for writing back.

>>> Of course, if there's a good existing solution, finding it and recommending it would be good enough. Particularly if it's just another open source package, like ctags.

>> I literally use the find/replace window for this. It's how that ui came about. Stick your positive and negative examples in a file, bring up the window, and mess around.

> Ha, interesting. This approach has a few drawbacks though. The text field is a standard JTextField, so doesn't have quite the same behaviour as the main text widget. It also doesn't remember much - it generally remembers what it was showing last, unless you happen to open it with some text selected that's shorter than a line. I don't think it allows one to debug multi-line-matching ((?m), I think) regexps - and those are, IME, the hardest to get right.

regex101.com or regextester.com or similar?

Though I don't remember the last time I used a multiline regex, so I don't know how they are for that (but I noticed that regex101 defaults to having the m flag set).

>> Doesn't everyone have raw strings these days so they don't need to escape regexes any more? If not, sounds doable in an ExternalTool?

> Ha, if only. C++, Go and Python all have literal strings; Java still doesn't.

Ah, I haven't used Java myself in the best part of a decade now, and have actually reviewed (for rubber-stamp values of "reviewed") more kotlin than Java lately.

> There's some experimental preview feature to treat """...""" as multi-line strings, but it's not clear to me whether they support escaping within them (in which case, regexps are still a pain). I tried to try it out, but when running 'java --enable-preview --release 14', it fails to create a virtual machine. No idea why.

> Cheers,
> Phil

