Example of relational programming

39 views

Skip to first unread message

Giancarlo Niccolai

unread,

Apr 11, 2018, 2:13:01 PM4/11/18

to FalconPL

I realise it is quite hard to picture a concept like "relational programming" just out of my description of a document that is still not available to the public.

I wrote a simple example of how relational programming might look like in the quasi-pseudocode below.

For a moment, ignore the fact that the @ symbol is currently used as string expansion operator, we might find something different. In this example @xxx means "extract the value xxx from the related node", and this is done through message passing. This is related with the concept of "value function" in CR-Algebra.

So, this example program uses relational programming to count the lines, words and repetition of each word on a given file. This is how it looks like:

Relational program in a graph form

------

|FILE|

------ @content

| read("utf-8")

------ @content

|TEXT|

@char------ @word

/ | \

=='\n' / |@word\

' / | \ word

/ | 1 \

@inc/ | \

------- -------@inc ------- @entry

------- ------- -------

So, we'll be reading the @content of FILE into the @content of Text, which is basically a more "intelligent" text container which knows the concepts of characters and words, and we'll the text in relation with a counter of lines through its value "@char", and in relation with a count of words and the word dictionary with the value of "@word". The relation with line count impacts the value of @inc when @char is == '\n'. Similarly, word's count @inc is always impacted by 1 when @word is received from text, while a @entry is presented to the word dictionary after a TEXT @word.

The program might look like:

graph = FILE("xyz.txt") -> read("utf-8") -> txt=TEXT()

txt -> {@char => if @char=='\n': 1 => @inc } -> NUMBER()

txt -> {@word => 1 => @inc } -> NUMBER()

txt -> {@word => @word => @entry } -> ENTRY_LIST()

Now, evaluating graph() should do the trick, computing the relations and then spreading the computation as needed.

A more complete code, let's say a runnable Falcon 2 code, would look like:

// "read" is a relation, asking for "content" and genearting "content"

read= {@content, encoding => decode(@content, encoding) => @content}

// "ENTRY_LIST is a new class -- a concept

class ENTRY_LIST from CONCEPT

entries = [=>]

function entry(word) {

if word in self.entries

++self.entries[word]

else

self.entries[word] = 1

end

// FILE, TEXT and NUMBER are standard library concepts

// NUMBER is also the class of 1, 1000, 3.5 0.99 etc.

// We create a graph by setting the first relation between an anonymous FILE

// entry and a TEXT entry named txt

graph = FILE("xyz.txt") -> read("utf-8") -> txt=TEXT()

// Now txt is a member of Graph, but also a variable visible at this level.

// So, we can create new relations.

txt -> {@char => if @char=='\n': 1 => @inc } -> line_count = NUMBER()

txt -> {@word => 1 => @inc } -> word_count = NUMBER()

// it might be useful to think the relations from bottom up.

rgraph = word_list = ENTRY_LIST() <- {@word => @word => @entry } <- txt

// This will be true -- any reference to a connected graph is the same

> rgraph === graph

// Evaluating graph means pulling the values into all the sinks.

// This can be done with any automatic parallelism as needed.

graph()

// In the end, we have the values where we want.

> "Count of lines: ", graph.line_count

> "Count of words: ", graph.word_count

for word, count in graph.word_list

forfirst: > "Count of words: "

> word, "=", count

end

We have supposed the engine to know that some node values, as @char and @word of TEXT, are possibly varying (have multiple instances during the lifetime of the program), and this shall cause propagation of multiple values in the downstream nodes. Others, as @content from FILE, might one-shot, or constant values.

Now, notice that the declarative style of this program allows the combination of the compiler and the engine to have several information that will be useful during its execution. For instance they know that the variable "txt" is never used outside the graph evaluation, and so, we can avoid computing and/or remembering things about its composition (i.e. we can avoid recording the full text of the file somewhere in memory just to iterate through it).

The engine also knows that the three relations in which txt is involved can be parallelized.

With a bit of effort, it might even be possible to cache the change of @inc values in each parallelization, to consolidate (sum them all) at the end of the computation.

Also, notice that this line:

graph = FILE("xyz.txt") -> read("utf-8") -> txt=TEXT()

doesn't mean we're "calling" read("utf-8") there; we want the thing between the two nodes to be a relation, that is, a closure -- we might go for a different grammar.

I understand everything here is a bit up in the air and there is still much to work on, even at the pure grammar level, to decide how to make this work, but I hope the example explains where I want to go with the concept of "relational programming".