ENB: converting queries to DAGs

13 views
Skip to first unread message

Edward K. Ream

unread,
Mar 23, 2023, 6:15:42 AM3/23/23
to leo-editor
This Engineering Notebook post continues the previous ENB post. It discusses how a Database Access Plugin (DAP) could convert the results of a query to a DAG (a Leo outline).

This post is a proof of concept. I have no intention of writing any DAP. Still, the ideas are interesting in themselves.

In general, the results of a query will be a graph. Ordered lists are a special kind of graph. Let's look at lists first.

Converting lists to outlines

Suppose a query returns an ordered list of 10,000 results. Let's call this list the results list. Leo could represent the results list as a single node with 10,000 lines, but let's put each result in a separate node.

We could represent the query as a node with 10,000 children, but that would be difficult for humans to use. Instead, let's convert the list into an organizer outline. Most of this outline will consist of organizer nodes. Only the bottom-level nodes (the nodes without children) contain results. Let's call these nodes the result nodes.

In our example, our organizer outline will contain 10 + 100 + 1000 organizer nodes in a 4-level outline:

- Level 0: A query node represents the entire result.
  It will have 10 children, the level 1 organizer nodes, each having 10 children.
- Level 1: 100 level 2 organizer nodes, each having 10 children.
- Level 2: 1000 level 3 organizer nodes, each having 10 children.
- Level 3: 10,000 results nodes.

I'll leave the algorithm that creates this outline as an exercise for the reader :-)

Summary: we can organize lists into an outline whose traversal yields the result nodes in the same order as the results list.

Converting graphs to outlines

In this section, I'll sketch some ideas for converting a general graph to an outline. The details don't matter much, but there will be a surprise at the end.

In general, the nodes of a graph have no natural order, so a Database Access Plugin may create an outline without worrying about its order.

We can assume that the graph is a directed graph. We'll convert undirected links into two directed links in opposite directions.

We'll convert a (directed) graph to an outline much like we converted (directed) lists:

- Choose any 10 nodes of the graph. These root nodes are the top-level nodes of the outline.
- Remove all links to the root nodes.
- For each root node, choose up to 10 level-one children.
- Remove the links from the level-one children to their parents.
- Continue recursively with all level-one children :-)

Yes, there is some hand-waving here. I have deliberately ignored some complications:

1. This approach might not guarantee that there will be links from the roots to all nodes. In that case, we will add the "orphaned" nodes to the top-level root nodes.

2. The algorithm doesn't say whether nodes may have more than one parent. Leo's outlines allow this possibility, but we may want to convert the graph to a tree instead of a DAG.

3. There are many ways of choosing roots, children, and which links to delete. Thus, there are many ways to convert a graph to a DAG. These differences don't matter, provided the resulting DAG has relatively few levels. A post-pass could reassign nodes to improve the DAG's shape, but I am not interested in such details here.

Summary: Converting a graph to a DAG (or tree) is relatively easy.

Intuitive graph traversals

Earlier I promised a surprise. Here it is.

There are many ways to traverse general graphs. None seem intuitive. But consider any DAG created from the outline.

Aha: The DAG's outline order is an easily understood ordering of the corresponding graph!

Summary

Database Access Plugins can (and should!) convert the complex results of queries to an organized outline. Outlines with six levels can represent millions of nodes.

There are many ways of creating results outlines. DAPs might prefer specific organizations, but such choices are beyond the scope of the present discussion.

The outline order of an organizer outline is an easily understood ordering of the corresponding graph!

This post concludes my present thoughts about databases and large outlines. I plan no further Engineering Notebook posts on this topic.

Edward
Reply all
Reply to author
Forward
0 new messages