Personal take-aways from a rust video


Edward K. Ream

Oct 27, 2021, 8:12:30 AM
to leo-editor
For the past several days I have been looking for my next project. Here, I'll briefly discuss the video Rust: A Language for the Next 40 Years. The speaker is a member of Rust's core development team at Mozilla.

I highly recommend the first section, starting at 2:30, about the history of the railroad industry and the Federal Railroad Safety Appliance Act. Incredibly, railroads fought this legislation, though it (predictably) led to thousands fewer deaths among train crews every year.

Ironically, this excellent video convinced me that Rust is not going to be a big part of my future, for several reasons:

1. Rust is essentially a safer, more complex version of C. As discussed in the video, there is a gradual migration pathway from C or C++ to Rust. However, I care nothing for the problems of those stuck with legacy C code.

2. The talk (dubiously!) assumes that garbage collection is a deal-breaker. But computers are much faster today than 40 years ago, so the number of legacy programs that now require native C performance is likely to be a tiny fraction of what it once was!

3. The video discusses the notion of a "person byte," the amount of data someone can comfortably deal with at a time. Alas, Rust imposes a far greater mental load than Python does!

Speed

One of Leo's "grand challenges" is to re-imagine Leo with an outline containing millions of nodes. Whether this challenge even makes sense is an open question. But one thing is clear: a speedup of 2-10 times would be inconsequential. Therefore, Rust cannot possibly be part of the solution.
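A rough timing sketch makes the point concrete (the node data here is invented, not a Leo measurement):

```python
import time

# One million headline strings standing in for a million-node outline.
nodes = [f"node {i}" for i in range(1_000_000)]

t0 = time.perf_counter()
# A full pass over every node, as any whole-outline operation must do.
visited = sum(1 for _ in nodes)
elapsed = time.perf_counter() - t0

print(f"visited {visited} nodes in {elapsed:.3f}s")
# A constant-factor speedup (2-10x) shrinks this time but not its order
# of growth: interactive million-node outlines need lazy loading or a
# database, not just a faster language.
```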

Summary

Rust is never likely to be important for me or for Leo.  Rust's complexity actively impedes high-level speculation.

Rust remains an important language, both theoretically and practically. Rust can compile to WebAssembly, which means that Rust programs can run in browsers at near-native speed. For this reason, there is considerable interest in pythonvm-rust. On the other hand, Pyodide (a spin-off from Mozilla) does not use Rust.

I am interested in Rust's inference algorithm, and I may eventually get around to studying the algorithm in detail. Indeed, I am interested in extending Leo to make it easier to understand other programs. I have just had a breakthrough in this area, which I'll discuss in another post.

Edward

rengel

Oct 29, 2021, 11:25:06 PM
to leo-editor


On Wednesday, October 27, 2021 at 2:12:30 PM UTC+2 Edward K. Ream wrote:
  ...
Speed

One of Leo's "grand challenges" is to re-imagine Leo with an outline containing millions of nodes. Whether this challenge even makes sense is an open question. But one thing is clear: a speedup of 2-10 times would be inconsequential. Therefore, Rust cannot possibly be part of the solution.

With so many nodes, maybe it is time to have a look at the Elixir/Erlang combination...

rengel

Oct 30, 2021, 1:17:50 AM
to leo-editor

Edward K. Ream

Oct 30, 2021, 6:49:33 AM
to leo-editor
On Sat, Oct 30, 2021 at 12:17 AM rengel <reinhard...@gmail.com> wrote:
Thanks for these links. I'll take a look.

To be clear, right now I'm not particularly interested in how fast or robust any particular programming environment might be.

My challenge is to try to understand how one might profitably use very large outlines. I have no clear pictures in mind :-)

Edward

rengel

Oct 31, 2021, 2:50:58 AM
to leo-editor
On Saturday, October 30, 2021 at 12:49:33 PM UTC+2 Edward K. Ream wrote:
 
My challenge is to try to understand how one might profitably use very large outlines. I have no clear pictures in mind :-)

Some things come to my mind:

What questions do you have to ask to get a clearer picture?

What is a 'very large outline'?
- You are surely thinking not of millions of hierarchy levels, but of millions of leaves (endpoints, items without children).

In his book 'Information Anxiety' (at Amazon), Richard Saul Wurman claims that there are only five ways to organize information:
1. Category (concepts, types)
2. Time (historical events, diary; but also processes, step-by-step procedures)
3. Location (country, state, county, city, etc.)
4. Alphabet (dictionary, telephone directory)
5. Continuum (organization by magnitude, small -> large, cheap -> expensive, etc.)
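Most of these principles map directly onto ordinary Python grouping and sorting (a toy sketch; the book data and field names are invented for illustration):

```python
# A toy collection; titles, categories, years, and prices are invented.
books = [
    {"title": "Dune", "category": "fiction", "year": 1965, "price": 9.99},
    {"title": "SICP", "category": "computing", "year": 1985, "price": 49.99},
    {"title": "TAOCP", "category": "computing", "year": 1968, "price": 199.99},
]

# 1. Category: group by type.
by_category = {}
for b in books:
    by_category.setdefault(b["category"], []).append(b["title"])

# 2. Time: sort by year.
by_time = sorted(books, key=lambda b: b["year"])

# 4. Alphabet: sort by title.
by_alphabet = sorted(books, key=lambda b: b["title"])

# 5. Continuum: sort by magnitude (price, cheap -> expensive).
by_price = sorted(books, key=lambda b: b["price"])

print(by_category["computing"])
```

Location (3) works the same way as Category, just keyed on a place field.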

Which large quantities of things are best managed by a relatively flat hierarchy based on these organizational principles?

Which very complex and/or large domains/problems/artefacts/collections are best described by tree-like structures?

Very abstractly, large outlines are useful wherever one uses a hierarchy to classify large quantities of things.
 - Dewey Decimal System to classify media (books, articles, etc.)
 - plant and animal taxonomies used in biology
 - bill of materials in industrial production
 - Yellow Pages
 - Population registries by country, state, county, city, etc.

Reinhard


tbp1...@gmail.com

Oct 31, 2021, 9:22:11 AM
to leo-editor
Very large collections are best thought of as graphs, IMO, because there are usually many types of connections between entries - depending, of course, on the type and intended use of the entries.  However, treelike *views* into the data are very often much better for a human to work with.  With large collections, it can take a long time to create a view from scratch, so it is helpful to create the most important ones in advance.  In the database world, the creation of such views is helped by indexes, temporary tables, and database views.  In Python (and other languages that have native map structures), dictionaries can play that role.
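As a sketch of that last point (the entries and tags are invented): a dictionary built once acts like a database index, after which each tree-like view is just a cheap lookup.

```python
# A flat store of entries with many-to-many tags -- really a graph.
entries = {
    1: {"title": "Rust notes", "tags": ["languages", "rust"]},
    2: {"title": "Leo outlines", "tags": ["leo", "outlines"]},
    3: {"title": "Leo + Rust?", "tags": ["leo", "rust"]},
}

# Build an inverted index once, like a database index, so that
# tree-like views per tag are cheap to construct afterwards.
by_tag = {}
for key, entry in entries.items():
    for tag in entry["tags"]:
        by_tag.setdefault(tag, []).append(key)

def view(tag):
    """A 'view' is just an index lookup plus formatting."""
    return [entries[k]["title"] for k in by_tag.get(tag, [])]

print(view("leo"))   # ['Leo outlines', 'Leo + Rust?']
print(view("rust"))  # ['Rust notes', 'Leo + Rust?']
```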

With increasing size, finding something becomes harder.  It may well be that once Leo can work with very large numbers of nodes, we will need new and faster ways to find items and peruse them.

Another issue of size is the amount of data that a single node can hold.  I recently crashed Leo by trying to read some 80 megabytes of text into the body of a node.  I was curious how fast it could do a search and replace on that much data, but I didn't find out because of the crash.  Of course, we are currently limited by Qt's capabilities, and Leo may never need to do such a thing, so it may not matter.
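For what it's worth, pure-Python string handling at that scale is quick; the limit is more likely the Qt widget. A rough sketch of the search-and-replace half of the experiment:

```python
import time

# Build roughly 80 MB of text, standing in for the huge node body
# (27 bytes per line * 3,000,000 lines = ~81 MB).
text = "lorem ipsum dolor sit amet\n" * 3_000_000

t0 = time.perf_counter()
replaced = text.replace("ipsum", "IPSUM")
elapsed = time.perf_counter() - t0

print(f"replaced in {elapsed:.2f}s over {len(text) / 1e6:.0f} MB")
```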

Edward K. Ream

Nov 1, 2021, 6:57:33 AM
to leo-editor
On Sun, Oct 31, 2021 at 1:51 AM rengel <reinhard...@gmail.com> wrote:

>> My challenge is to try to understand how one might profitably use very large outlines. I have no clear pictures in mind :-)

> What questions do you have to ask to get a clearer picture?

Thanks for this comment.

XML files, Qt outlines, and Leo's generators will all suffer performance problems as the number of outline nodes increases. So some kind of database will be needed to represent the data itself.  This much seems clear.

The questions I have involve:

1. How to create a window into the database?
2. How to create meaningful views into the database?
3. How to re-imagine (or do without) Leo's generators?
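For question 3, one possible sketch (the schema here is an assumption for illustration, not Leo's actual data model): a generator can walk a SQLite-backed tree lazily, so only the nodes a view actually touches are ever read.

```python
import sqlite3

# A minimal parent-pointer node table.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE nodes (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER,
        headline TEXT
    )
""")
con.executemany(
    "INSERT INTO nodes (id, parent_id, headline) VALUES (?, ?, ?)",
    [(1, None, "root"), (2, 1, "child a"), (3, 1, "child b"),
     (4, 2, "grandchild")],
)

def subtree(con, root_id):
    """Yield (id, headline) pairs depth-first -- a window into the
    database rather than an in-memory outline."""
    stack = [root_id]
    while stack:
        nid = stack.pop()
        (headline,) = con.execute(
            "SELECT headline FROM nodes WHERE id = ?", (nid,)).fetchone()
        yield nid, headline
        # Push children in reverse id order so they pop in id order.
        stack.extend(r[0] for r in con.execute(
            "SELECT id FROM nodes WHERE parent_id = ? ORDER BY id DESC",
            (nid,)))

print(list(subtree(con, 1)))
```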

Edward

Edward K. Ream

Nov 1, 2021, 7:00:11 AM
to leo-editor
On Sun, Oct 31, 2021 at 8:22 AM tbp1...@gmail.com <tbp1...@gmail.com> wrote:

Thanks for these comments. They are close to my thinking.

These issues have low priority at present. I have no great confidence that these issues can be solved. More importantly, I have no need to solve them!

Edward

David Szent-Györgyi

Nov 3, 2021, 11:40:50 AM
to leo-editor
On Sunday, October 31, 2021 at 9:22:11 AM UTC-4 tbp1...@gmail.com wrote:

Decades ago, Project Xanadu was founded to create a scalable datastore for hosting published information, linkable in forms developed by end users, with the back-end mechanisms of storage, publication, and micropayment collection separated from the front end of presentation. The project did not come to fruition as its founder, computer-industry gadfly Ted Nelson, desired, but its work was influential. Nelson was the first person to conceive of hypertext - the term is his. The mathematics underlying the back-end storage might be of interest; it is described in Literary Machines, a reprint of which is available from Nelson. More might be found through Xanadu Australia - see link below.

Web site of the original Project
Web site of Xanadu Australia, more recently updated and more detailed than the original Project's Web site

Edward K. Ream

Nov 4, 2021, 9:39:39 AM
to leo-editor
Thanks for this. The book appears to be back in print, but out of stock. I'll get a copy asap.

Edward

David Szent-Györgyi

Nov 14, 2021, 11:06:59 AM
to leo-editor
The earlier edition that I saw doesn't give descriptions of the algorithms. Its descriptions of the data structures might be of interest. 