This Engineering Notebook post restarts an old discussion about using Leo as a database.
Several new ideas have emerged that I'll discuss in later posts. As a result of these ideas, I have reopened #1125 (faster outline redraws) and #1123 (huge outlines). Neither issue has a milestone—the following thoughts are provisional.
Attitude
I dismissed using a (sqlite) database as a backing store because I wanted to complete Leo.
Conversely, I have often overestimated the importance of ideas in the misplaced hope that Leo would take over the world :-) But now I'm happy to explore the following ideas to see where they will lead.
Two questions
Two questions provoke these discussions:
1. How can Leo support large outlines with 10,000 or more nodes?
2. Can Leo support huge outlines with millions of nodes?
Ground rules
I shall permit no changes to Leo's core. Plugins will implement all the ideas I shall propose.
Supporting large outlines
Leo must redraw only visible nodes. See #1125. Nothing else will change:
- Leo's startup code will load all nodes as before.
- Leo's find/replace code will plow through all nodes.
#1125 will become part of Leo's core, so the ground rules do not apply to it.
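To make "redraw only visible nodes" concrete, here is a minimal Python sketch. It is an illustration only: the Node class and the visible_nodes and redraw_window functions are hypothetical stand-ins, not Leo's position or vnode classes and not the code that #1125 would produce. The point is that the traversal never descends into collapsed subtrees, and it stops as soon as the viewport is full.

```python
from dataclasses import dataclass, field
from itertools import islice
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Hypothetical outline node (NOT Leo's Position or VNode class)."""
    headline: str
    expanded: bool = False
    children: List["Node"] = field(default_factory=list)


def visible_nodes(roots: List[Node], level: int = 0) -> Iterator[Tuple[int, Node]]:
    """Yield (indent level, node) for every node the outline pane could show."""
    for node in roots:
        yield level, node
        # Descend only into expanded nodes: collapsed subtrees are never visited.
        if node.expanded:
            yield from visible_nodes(node.children, level + 1)


def redraw_window(roots: List[Node], first_row: int, row_count: int) -> List[Tuple[int, Node]]:
    """Return only the rows that fit in the viewport, stopping the walk early."""
    return list(islice(visible_nodes(roots), first_row, first_row + row_count))


if __name__ == "__main__":
    # A collapsed node with 10,000 children costs one row, not 10,001.
    big = Node("big", children=[Node(f"child {i}") for i in range(10_000)])
    small = Node("small", expanded=True, children=[Node("a"), Node("b")])
    for level, node in redraw_window([big, small], 0, 10):
        print("  " * level + node.headline)
```

In this sketch the cost of a redraw depends on the number of rows drawn, not on the total number of nodes in the outline.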
Supporting huge outlines
Supporting huge outlines requires a new architecture. Incremental speed improvements will not suffice. I envisage the following architecture:
1. A new kind of .leo file, say .bigleo, would contain only enough data to describe a backing store, the initially visible nodes, and a few other startup-related data items.
2. A single backing plugin will load (and unload?) visible nodes from the backing store on demand. Queries form the interface to the backing plugin. Leo's iterators and find commands will probably not be used.
3. Optional: A Database Access Plugin (DAP for short) will handle read access to external databases. Each external DB would require a separate DAP. For example, an access plugin could read data from the HGP (Human Genome Project) databases. Given the proper authorization, the plugin could even write to the HGP DBs.
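The on-demand loading in item 2 can be sketched with Python's standard sqlite3 module. Everything specific here is an assumption made for illustration: the BackingStore class, its method names, and the table layout are hypothetical, nothing in this post fixes a schema, and the sketch models a plain tree, so it ignores clones.

```python
import sqlite3
from typing import List, Optional, Tuple

# Hypothetical schema: one row per node, children ordered by child_index.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (
    id          INTEGER PRIMARY KEY,
    parent_id   INTEGER,            -- NULL for top-level nodes
    child_index INTEGER NOT NULL,
    headline    TEXT NOT NULL,
    body        TEXT NOT NULL DEFAULT ''
);
CREATE INDEX IF NOT EXISTS nodes_by_parent ON nodes (parent_id, child_index);
"""


class BackingStore:
    """Hypothetical backing store: queries are the only way to reach nodes."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.executescript(SCHEMA)

    def add_node(self, parent_id: Optional[int], child_index: int,
                 headline: str, body: str = "") -> int:
        cur = self.conn.execute(
            "INSERT INTO nodes (parent_id, child_index, headline, body) "
            "VALUES (?, ?, ?, ?)",
            (parent_id, child_index, headline, body),
        )
        self.conn.commit()
        return cur.lastrowid

    def children(self, parent_id: Optional[int]) -> List[Tuple[int, str]]:
        """Load (id, headline) for one node's children, e.g. when it is expanded."""
        # SQLite's IS operator also matches NULL, so None selects top-level nodes.
        cur = self.conn.execute(
            "SELECT id, headline FROM nodes WHERE parent_id IS ? "
            "ORDER BY child_index",
            (parent_id,),
        )
        return cur.fetchall()

    def body(self, node_id: int) -> str:
        """Load body text only when a node is actually selected."""
        row = self.conn.execute(
            "SELECT body FROM nodes WHERE id = ?", (node_id,)
        ).fetchone()
        return row[0] if row else ""
```

With a store like this, expanding a node would call children(node_id) and selecting a node would call body(node_id), so memory use would track what is on screen rather than the size of the store.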
The backing plugin will pass queries to each DAP. In general, the results of those queries would be an arbitrary graph, not a DAG (Directed Acyclic Graph). So each DAP must translate an arbitrary graph to a DAG. I'll discuss this topic in another ENB.
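For readers who want something concrete before that ENB, one textbook way to turn an arbitrary directed graph into a DAG is to do a depth-first search and drop every back edge, that is, every edge that points to a node still on the current search path. The Python sketch below shows only that standard technique. It is not the algorithm promised above, and the to_dag function and the adjacency-list representation are assumptions chosen for illustration.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, List[Hashable]]  # node -> list of successor nodes


def to_dag(graph: Graph) -> Graph:
    """Return a copy of graph with DFS back edges removed, which leaves a DAG."""
    dag: Graph = {node: [] for node in graph}
    done: Set[Hashable] = set()     # nodes whose subtrees are fully explored
    on_path: Set[Hashable] = set()  # nodes on the current DFS path

    def visit(node: Hashable) -> None:
        on_path.add(node)
        for child in graph.get(node, []):
            if child in on_path:
                continue  # Back edge: keeping it would close a cycle, so drop it.
            dag.setdefault(node, []).append(child)
            dag.setdefault(child, [])
            if child not in done:
                visit(child)
        on_path.discard(node)
        done.add(node)

    for node in graph:
        if node not in done:
            visit(node)
    return dag


if __name__ == "__main__":
    cyclic = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["d"]}
    print(to_dag(cyclic))
```

Note that node "c" keeps two parents in the result. A DAG, unlike a tree, allows shared children, which is the role clones play in a Leo outline. Which edges get dropped depends on the traversal order, and the recursive form would need to become iterative for very deep graphs.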
Summary and implications
Leo could handle large outlines without over-stressing Qt's drawing code. I'm not sure how well VS Code draws large outlines.
#1125 might extend the lifetime of leoInteg even after leoJS 1.0.
Leo will never be a database, but a Leonine window into an external DB would provide Leo's advantages to those perusing external DBs.
Database access plugins would be significant projects. I have no intention of writing any such plugin. However, the algorithms for converting an arbitrary graph to a Leo-compatible DAG are interesting. I've made some minor discoveries in this area that I'll discuss in a later ENB.
This post has summarized my current thinking. I'll elaborate in other ENBs.
Edward