[ANN] T: an Acme/Sam-like text editor library

burns...@gmail.com

Jan 15, 2016, 9:20:51 AM

to golang-nuts

T (https://github.com/eaburns/T) is a text editor inspired by the Acme and Sam editors from the Plan 9 operating system and the Plan 9 from User Space project.

The current incarnation of T is just a text editing library. It implements a (still slightly incomplete) dialect of the Sam language, which is used for editing buffers of runes. Check out the docs here: https://godoc.org/github.com/eaburns/T/edit.

In the future, if I ever get around to it, T will use this library as the backend for an editor much like Acme. (For a taste of Acme, see Russ Cox's tour here: http://research.swtch.com/acme. It is very good.) Until then, I wanted to share what's currently available in hopes that someone finds it useful or interesting.

Ethan

roger peppe

Jan 15, 2016, 12:52:00 PM

to Ethan Burns, golang-nuts

Nice work and great docs too.

Nigel Tao

Jan 15, 2016, 11:08:57 PM

to Ethan Burns, golang-nuts

On Sat, Jan 16, 2016 at 1:20 AM, <burns...@gmail.com> wrote:
> T (https://github.com/eaburns/T) is a text editor

Interesting indeed.

I see that the underlying data is held as []rune instead of []byte,
which can take 4x memory for large chunks of ASCII text. Also, I
haven't measured it, but I would have guessed that the conversion
costs between them are non-negligible. For example, the regexp package
in the standard library works on runes conceptually (e.g. MatchReader
takes an io.RuneReader and not an io.ByteReader, and see also
http://play.golang.org/p/_-zdYCi2ZD), but it throws around []byte, not
[]rune. I'm curious about your thoughts on picking []rune vs []byte
for T. I'm sure that []rune makes some programming easier, but I don't
have the experience with it that you do.
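To make the 4x figure concrete, here is a small sketch (not code from T; the helper name `sizes` is made up for illustration). It compares the size of some text held as UTF-8 bytes with the size of the same text held as a []rune, where each rune takes 4 bytes in Go.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sizes reports how many bytes the text occupies as UTF-8 ([]byte)
// and as decoded code points ([]rune, 4 bytes per rune in Go).
// It is a hypothetical helper, used only for this illustration.
func sizes(text string) (asBytes, asRunes int) {
	asBytes = len(text)
	asRunes = 4 * utf8.RuneCountInString(text)
	return asBytes, asRunes
}

func main() {
	for _, s := range []string{"hello, world", "héllo, wörld", "日本語"} {
		b, r := sizes(s)
		fmt.Printf("%q: %d bytes as []byte, %d bytes as []rune\n", s, b, r)
	}
}
```

For pure ASCII the ratio is exactly 4x; it shrinks as the text contains more multi-byte characters.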

Ethan Burns

Jan 16, 2016, 9:32:16 AM

to Nigel Tao, golang-nuts

I'm glad that you are interested. In the early days of shiny, I remember talk about using it for an Acme-like editor. At that time, T was already in the works, but it was a bit too early. My hope in advertising the code now is to surface any similar efforts to see if folks want to work together on something.

As for memory and CPU, I'm not worried yet, but I want to start prototyping and benchmarking soon.

Memory:

T's runes buffer implementation is heavily inspired by Sam and Acme. They also hold everything as runes. The address language relies heavily on accessing rune indexes in the buffer. If everything remained as UTF-8, it would get very complicated (requiring lots of scanning and a skip table of some kind).

The buffer itself is split into fixed-size blocks of runes. All the blocks are stored on disk with the exception of a single block. The one held in memory is the current working block, so the amount of memory for each buffer is constant. (The parameter used in the current implementation needs major tweaking, though.)

CPU:

There are two places where conversions need to be done:

1. When loading or saving a file, we need to convert the whole thing from or to UTF-8. This shouldn't be a problem.
2. When moving a block between disk and memory. The in-memory block is a []rune, but io.Reader and io.Writer take a []byte, so we convert. If this starts showing up in profiles, then we can probably do something ugly to get a []byte from a []rune without conversion. I don't expect this to actually happen.
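The block conversion described above could look roughly like the following. This is a sketch only: the fixed 4-byte little-endian layout and the names `writeRunes`, `readRunes`, and `roundTrip` are assumptions made for illustration, not T's actual on-disk format or API.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// writeRunes encodes each rune as 4 little-endian bytes and writes them to w.
// The fixed-width layout is an assumption for this sketch.
func writeRunes(w io.Writer, rs []rune) error {
	buf := make([]byte, 4*len(rs))
	for i, r := range rs {
		binary.LittleEndian.PutUint32(buf[4*i:], uint32(r))
	}
	_, err := w.Write(buf)
	return err
}

// readRunes reads exactly n runes that were written by writeRunes.
func readRunes(r io.Reader, n int) ([]rune, error) {
	buf := make([]byte, 4*n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	rs := make([]rune, n)
	for i := range rs {
		rs[i] = rune(binary.LittleEndian.Uint32(buf[4*i:]))
	}
	return rs, nil
}

// roundTrip writes a block to an in-memory "disk" and reads it back.
func roundTrip(block []rune) ([]rune, error) {
	var disk bytes.Buffer
	if err := writeRunes(&disk, block); err != nil {
		return nil, err
	}
	return readRunes(&disk, len(block))
}

func main() {
	back, err := roundTrip([]rune("héllo, 世界"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(back))
}
```

A fixed-width encoding keeps every block the same size on disk, at the cost of one pass over the block in each direction.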

Regexp:

You mentioned the regexp package, so I also wanted to talk about that briefly. Unfortunately, T doesn't use it. At the time that I was writing it, there was some difficulty with finding the next match with a RuneReader. I'm not sure when FindReaderSubmatchIndex was added, but I don't think it existed at the time. So T uses its own regular expression package. This can be revisited if regexp now better supports RuneReader.
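For reference, this is roughly how the standard library's FindReaderSubmatchIndex can be driven from a rune buffer today. It is a sketch, not T's code: `runeSliceReader` and `findInRunes` are hypothetical names. Note that the returned locations are byte offsets in the stream as reported by ReadRune, not rune indexes, which is one reason it does not map directly onto a rune-addressed buffer.

```go
package main

import (
	"fmt"
	"io"
	"regexp"
	"unicode/utf8"
)

// runeSliceReader is a hypothetical io.RuneReader over a []rune.
type runeSliceReader struct {
	rs  []rune
	pos int
}

// ReadRune returns the next rune and the number of bytes it would
// occupy if it were encoded as UTF-8.
func (r *runeSliceReader) ReadRune() (rune, int, error) {
	if r.pos >= len(r.rs) {
		return 0, 0, io.EOF
	}
	c := r.rs[r.pos]
	r.pos++
	n := utf8.RuneLen(c)
	if n < 0 { // not a valid code point; report it as U+FFFD
		c = utf8.RuneError
		n = utf8.RuneLen(c)
	}
	return c, n, nil
}

// findInRunes returns the leftmost match of pattern in text, with
// submatches, as byte offsets into the stream produced by the reader.
func findInRunes(pattern string, text []rune) []int {
	re := regexp.MustCompile(pattern)
	return re.FindReaderSubmatchIndex(&runeSliceReader{rs: text})
}

func main() {
	fmt.Println(findInRunes(`w(ö)rld`, []rune("héllo wörld")))
}
```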

Ethan

burns...@gmail.com

Jan 16, 2016, 10:29:09 AM

to golang-nuts, nige...@golang.org, burns...@gmail.com

I just remembered another difficulty with regexps. T needs to be able to match in reverse. I don't believe that the regexp package can do this at the moment. There may be more that I'm forgetting too. I spent a lot of time on this point. The choice to make my own regexp package wasn't made lightly.

Ethan

Rob Pike

Jan 16, 2016, 10:58:21 AM

to burns...@gmail.com, golang-nuts, Nigel Tao

Sam was originally an "8-bit clean" editor. When Unicode arrived, it was converted to store runes inside because the byte-level algorithms it used were easy to update to that model. If I were doing it again today, though, I'd keep everything in UTF-8 internally. It would be more efficient in the end because less copying would be required, and of course in the modern era with 32-bit runes, there is a huge memory saving.

Acme's internals borrow largely from Sam, so the same reasoning applies there.

Going backwards in a UTF-8 stream is not too hard. UTF-8 was designed to be navigable like that.
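As a small illustration of walking backwards, the standard library's utf8.DecodeLastRune decodes the final rune of a byte slice, so a reverse scan is a short loop. This is only a sketch; `reverseRunes` is a made-up helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// reverseRunes walks a UTF-8 byte slice from the end to the start and
// returns the runes in the order they were visited.
func reverseRunes(b []byte) []rune {
	var out []rune
	for len(b) > 0 {
		r, size := utf8.DecodeLastRune(b)
		out = append(out, r)
		b = b[:len(b)-size] // step back over the rune just decoded
	}
	return out
}

func main() {
	fmt.Println(string(reverseRunes([]byte("héllo, 世界"))))
}
```

This works because continuation bytes in UTF-8 are distinguishable from lead bytes, so the start of the previous rune can always be found by looking at no more than a few bytes.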

-rob

Ethan Burns

Jan 16, 2016, 11:24:54 AM

to Rob Pike, golang-nuts, Nigel Tao

That's quite discouraging. Like you, I'm disinclined to rewrite the whole thing to store UTF-8 now that I've already written it for []runes.

My worry wasn't really going backwards in UTF-8; it was rune indexing into the buffer. If everything is UTF-8, this would require scanning. This could be made more efficient with a table that allows skipping chunks of the buffer, but maintaining such a table could be a bit complex. Did you have any particular ideas about how you would do rune indexing into UTF-8?

Ethan

Rob Pike

Jan 16, 2016, 11:30:10 AM

to Ethan Burns, golang-nuts, Nigel Tao

If it's UTF-8 you might not need to do rune indexing at all (that's a UI decision), but if you did, it's easy. Since the text is stored in a list of blocks (so that text can be inserted reasonably efficiently), you can store a rune count for each block and then only need to scan within a single block to find a given rune. That scan can be made efficient with a simple cache like the one in https://godoc.org/golang.org/x/exp/utf8string.
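A minimal sketch of the per-block rune count idea, under stated assumptions: the names `block`, `buffer`, `newBuffer`, and `runeAt` are hypothetical, blocks are held in memory rather than on disk, and the position cache from utf8string is left out.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// block holds a chunk of UTF-8 text plus a cached count of its runes.
type block struct {
	data  []byte
	runes int
}

// buffer is a list of blocks.
type buffer struct {
	blocks []block
}

// newBuffer builds a buffer with one block per chunk, counting the
// runes of each block once, up front.
func newBuffer(chunks ...string) *buffer {
	b := &buffer{}
	for _, c := range chunks {
		b.blocks = append(b.blocks, block{data: []byte(c), runes: utf8.RuneCountInString(c)})
	}
	return b
}

// runeAt returns the rune at rune index i. Whole blocks are skipped
// using their cached counts; only one block is scanned byte by byte.
func (b *buffer) runeAt(i int) (rune, bool) {
	if i < 0 {
		return 0, false
	}
	for _, blk := range b.blocks {
		if i >= blk.runes {
			i -= blk.runes // skip this block without decoding it
			continue
		}
		data := blk.data
		for {
			r, size := utf8.DecodeRune(data)
			if i == 0 {
				return r, true
			}
			data = data[size:]
			i--
		}
	}
	return 0, false
}

func main() {
	buf := newBuffer("héllo, ", "世界!")
	r, ok := buf.runeAt(7)
	fmt.Println(string(r), ok)
}
```

In a real editor, an insert or delete would only need to update the count of the block it touches.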

It's really not a big deal.

I'm not saying you should rewrite, just that if I were starting over, that's what I would do.

-rob

Ethan Burns

Jan 16, 2016, 1:45:18 PM

to Rob Pike, golang-nuts, Nigel Tao

Thanks. You've convinced me that it's fine. I've never noticed either a performance or memory issue from this in Acme.

In T, this code is in its own package (edit/runes). If it ever becomes an issue, we should be able to rewrite it without affecting too much other code.

Ethan

Jeremy Jackins

Jan 20, 2016, 4:31:46 AM

to golang-nuts, burns...@gmail.com

Hi Ethan,

I'm doing something a bit similar, although I've focused on the front-end more than the back-end. :)

- https://godoc.org/sigint.ca/graphics/editor - primary user-facing library.

- https://godoc.org/sigint.ca/graphics/editor/internal/text - internal data structures corresponding more closely to your work, although much more simplistic - no disk-backing, and currently only supports addressing by row and column.

- https://godoc.org/sigint.ca/graphics/cmd/edit - a text editor implemented using sigint.ca/graphics/editor and golang.org/x/exp/shiny. This currently works if you're on OS X, but there are still some issues to work out on the Linux version, and it has not been tested at all on Windows. I hope to someday get it working on Plan 9 as well, via a shiny backend for Plan 9.

At the moment, my editor only resembles Acme or Sam in shallow, cosmetic ways such as colour scheme and double-click selection rules, but my end goal is basically a single-column acme with a tag, main editing window and an output area (i.e. "+Errors" window), and little or no window managing features. The goal for the sigint.ca/graphics/editor package is a general purpose, configurable text editing widget for GUI programs.

Looking forward to stealing from^W^Wcollaborating with you. ;)

FWIW, my Buffer type has both a []rune-backed implementation and a UTF-8 encoded []byte-backed implementation, and I set the []byte implementation to +build ignore after benchmarking. I considered the memory impact, but compared to the memory used in image processing, the extra ~3 bytes per character isn't significant at all for any files I'm editing. Not to mention that I store the pixel advance of each character anyway (for drawing the cursor, selection rectangles, etc.), which needs an int16 at minimum, so it's 6 bytes vs. ~3 bytes per character, not 4 bytes vs. ~1 byte.

Cheers,
Jeremy

Jeremy Jackins

Jan 20, 2016, 4:35:20 AM

to golang-nuts, burns...@gmail.com

> - https://godoc.org/sigint.ca/graphics/cmd/edit - a text editor implemented using ...

Oops, that link won't work, but this one will: https://github.com/jnjackins/graphics/blob/master/cmd/edit/main.go

Sebastien Binet

Jan 20, 2016, 8:08:35 AM

to Jeremy Jackins, golang-nuts, Ethan Burns

Jeremy Jackins

Jan 20, 2016, 8:32:49 AM

to Sebastien Binet, golang-nuts, Ethan Burns

Hi Sebastien, thanks.

Sorry about the bad SSL cert. FWIW, it works with -insecure on tip as of two weeks ago: https://github.com/golang/go/issues/13197

As for the other issue, again I didn't notice it because I'm running Go tip. I think it will work on Go 1.5 if you set GO15VENDOREXPERIMENT=1. I'm using a slightly modified, vendored version of golang.org/x/mobile/event/mouse and golang.org/x/exp/shiny/driver/gldriver.

My apologies. Please let me know if that doesn't get it working.

Jeremy

Sebastien Binet

Jan 20, 2016, 8:38:56 AM

to Jeremy Jackins, golang-nuts, Ethan Burns

Jeremy,

Jeremy Jackins

Jan 20, 2016, 8:49:22 AM

to golang-nuts, jeremy...@gmail.com, burns...@gmail.com

> FWIW, I then managed to compile it but ran into:
> $> ./edit
> clipboard: falling back to internal buffer
> panic: interface conversion: screen.Buffer is *editor.Editor, not *x11driver.bufferImpl

Ah, this is what I meant in my first post when I said that it currently works on OS X but that there are still issues to work out on the Linux version. :)
https://github.com/golang/go/issues/14026

Unfortunately, some assumptions I made while programming for the darwin shiny driver didn't hold up when testing on the x11 driver. It shouldn't be too much work to fix for x11, but I only discovered this yesterday and haven't had a chance to look at it much yet.

Nigel Tao

Jan 20, 2016, 9:18:07 PM

to Jeremy Jackins, golang-nuts, Ethan Burns

On Wed, Jan 20, 2016 at 8:31 PM, Jeremy Jackins <jeremy...@gmail.com> wrote:
> - https://godoc.org/sigint.ca/graphics/editor/internal/text - internal data
> structures corresponding more closely to your work,

I skimmed the source code. Just a drive-by comment:

func (b *Buffer) GetSel(sel Selection) string {
    etc
    ret := string(etc) + "\n"
    for i := sel.From.Row + 1; i < sel.To.Row; i++ {
        ret += string(b.Lines[i].s) + "\n"
    }
    etc
    return ret
}

String concatenation has quadratic complexity, which I'm guessing is
unworkable if your document has thousands of lines. Instead, I'd work
with append or a pre-sized slice and with []byte or []rune, and
convert to string only once, instead of once per line.
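A sketch of the suggested approach, assuming for illustration that lines are held as [][]rune (the real Buffer type is not shown here, and `joinLines` is a made-up name): size the output once, append every line into it, and convert to a string a single time at the end.

```go
package main

import "fmt"

// joinLines concatenates lines, each followed by a newline, into a
// single string. The output slice is pre-sized so that append never
// reallocates, and the conversion to string happens only once.
func joinLines(lines [][]rune) string {
	n := 0
	for _, l := range lines {
		n += len(l) + 1 // +1 for the trailing newline
	}
	buf := make([]rune, 0, n)
	for _, l := range lines {
		buf = append(buf, l...)
		buf = append(buf, '\n')
	}
	return string(buf)
}

func main() {
	lines := [][]rune{[]rune("first"), []rune("second"), []rune("third")}
	fmt.Print(joinLines(lines))
}
```

Each rune is copied a constant number of times, so the cost grows linearly with the size of the selection rather than quadratically.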

andrewc...@gmail.com

Jan 20, 2016, 9:52:44 PM

to golang-nuts, nige...@golang.org, burns...@gmail.com

On Sunday, January 17, 2016 at 3:32:16 AM UTC+13, Ethan Burns wrote:

> I'm glad that you are interested. In the early days of shiny, I remember talk about using it for an Acme-like editor. At that time, T was already in the works, but it was a bit too early. My hope in advertising the code now is to surface any similar efforts to see if folks want to work together on something.

I think that was me. I would love to try it out when you get a ui. Keep up the good work.

Jeremy Jackins

Jan 20, 2016, 10:39:25 PM

to Nigel Tao, golang-nuts, Ethan Burns

Thanks for pointing that out. It made a big difference.

For 1000 lines:

BenchmarkGetSel-4                   1    1153495294 ns/op
BenchmarkGetSelAvoidConcat-4     3000        501157 ns/op

Jeremy Jackins

Jan 20, 2016, 10:50:06 PM

to Nigel Tao, golang-nuts, Ethan Burns

Oops, that's an exaggeration. I was unfairly testing the old version with 10,000 lines and the new version with 1000 lines.

When it's a fair fight, the new version still does an order of magnitude better:

BenchmarkGetSel-4                 200       7041926 ns/op
BenchmarkGetSelAvoidConcat-4     2000        637645 ns/op

Daniel Theophanes

Jan 21, 2016, 12:06:46 PM

to golang-nuts, burns...@gmail.com

Hi Jeremy,

It might be useful, when you vendor your deps, to vendor the deps for all platforms. For instance, on X11 you aren't carrying the xgb dep in your vendor folder. I also don't know which revision you are pulling from, which makes it hard to troubleshoot the compile error. If you use "github.com/kardianos/govendor", it will pull the deps for all platforms, as well as record the revision.

Nice work. I also get a mouse.ScrollEvent undefined compile error, but look forward to trying it out.

Thanks, -Daniel
