Fwd: [New post] COVID-19 Modelling and Random Social Networks

Linas Vepstas

Apr 22, 2020, 12:04:26 AM
to link-grammar

This post is actually, secretly about grammar and linguistics. But it is also about disease simulation and progression.  It's about both. Let me explain.

So, in grammar learning, one has to infer a grammar from a corpus of text. It turns out that evaluating the accuracy of the learning process is quite hard ... but that is almost entirely because the available corpora of English text with syntax markup are of extremely low quality.  So, if you really want to evaluate accuracy, it seems best to generate random grammars, and random corpora.

On this mailing list, the grammar formalism is Link Grammar. But of course. The software below was developed to generate (random) corpora constrained by Link Grammar syntax rules. Well, but generating a network is generating a network, and one can just as easily generate random social networks, instead of sentences.  In LG terms, there are just two connector types: FRIEND and STRANGER, which can connect to the left or the right, an arbitrary number of times. Voilà, social network!

The demo is a COVID-19 epidemiology simulation. The random networks are created by version 0.1 of some brand new code, meant to solve the language-learning problem above. URL embedded below.

----

tl;dr: The AtomSpace is a graph database that can store and manipulate networks, and values flowing on those networks.  COVID-19 is a disease that can flow, from person to person, along the graph of relationships between people.  Using OpenCog to perform epidemiology simulations should be easy. And it is.

-- Linas



New post on OpenCog Brainwave

COVID-19 Modelling and Random Social Networks

by Linas Vepstas

n20.gml.png
Small random network

Seems like everyone wants to be an epidemiologist these days, so why not OpenCog? After all, diseases spread through networks, propagating from one node to the next. A network is a graph, the AtomSpace is a graph database, and the value flow subsystem provides a convenient infrastructure for propagating data (e.g. communicable diseases) from one node to the next. How hard could it be to model the progression of disease through a social network? The answer? Pretty easy.

The full working demo is here. It highlights the brand-new network generation codebase. The network generator is an AtomSpace module, specialized for generating random networks that are constrained by grammatical rules. This is in the same sense as "grammar" in linguistics, or computer science: a description of syntax. With an appropriate grammar, one can generate linguistic parse trees. But there's no need to constrain oneself to trees: one can generate graphs with undirected edges and loops. The network generator is generic. That said, please note: this is version 0.1 of the network generator! This really is a brand-new module. It is needed for several projects, but really, it's just completely brand-new, and is not a complete, finished final product.

Note also: this is a technology demo! This is not a full-scale medical epidemiology tool. It could be used for scientific research -- but, for that, it would need to be extended and attached to other scientific tools, such as data analysis tools, graphing tools, network visualization tools. Such tools are commonplace and abundant; we won't dwell on them here. The demo here is a demo of network generation, and of running a state transition machine on a network. It's a demo of using Atomese for these sorts of computations, playing to the strengths of Atomese.

The demo consists of four parts. The first part demonstrates how to write a finite state machine in Atomese. The second demonstrates the current network generation API. Third: actually run the danged thing, and fourth: collect up and print some stats.  Before starting, though, a few words about Atomese.

Atomese is a graphical programming language. Unlike almost all other programming languages, it is not optimized for human beings. It's optimized for machines. It's vaguely assembly-language-like, and that's a good thing, because that makes it easy for other algorithms to create and manipulate it. It's meant to be a low-level layer for higher-level algorithms. For this demo, imagine a WYSIWYG graph editor that you (a medical professional?) could use to hand-draw some disease-progression flow-charts. Or bubble diagrams, whatever. The intent is to compile those graphs down to Atomese to actually run what the diagrams describe. That's what Atomese is: a low-level graph language, mixing together a graph database with assorted comp-sci theory tools and features.

State transitions

The state transition machine will be written in Atomese. It's quite verbose, but this can be easily hidden.  Here, for example, is one of the state transitions:

 (And
    (Equal (ValueOf (Variable "$A") seir-state) exposed)
    (GreaterThan
       (ValueOf (Variable "$A") susceptibility)
       (RandomNumber (Number 0) (Number 1))))
 (SetValue (Variable "$A") seir-state infected)

I'm hoping this is self-explanatory. The (Variable "$A") is a person, an individual in the network. The seir-state is a location - a lookup key, holding the "SEIR" state: "susceptible", "exposed", "infected", "recovered". If the individual is exposed, and that individual's susceptibility exceeds a random draw between 0 and 1, they will become infected. All of the state transitions can be written in this form.
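
For readers who don't read Atomese, here is a rough plain-Python paraphrase of that one transition. It is only a sketch: it does not use the AtomSpace at all, a person is assumed to be an ordinary dict, and the function name exposed_to_infected is made up for illustration.

```python
import random

def exposed_to_infected(person, rng=random):
    """Apply the exposed -> infected rule to one person (a plain dict).

    Paraphrases the Atomese rule: it fires when the person is exposed
    and their susceptibility is greater than a uniform draw from [0, 1).
    """
    is_exposed = person["seir-state"] == "exposed"
    if is_exposed and person["susceptibility"] > rng.random():
        person["seir-state"] = "infected"
    return person

# Hypothetical individuals; a susceptibility of 1.0 always beats the draw.
alice = {"seir-state": "exposed", "susceptibility": 1.0}
bob = {"seir-state": "susceptible", "susceptibility": 1.0}
exposed_to_infected(alice)
exposed_to_infected(bob)   # not exposed, so nothing happens
```

The higher the susceptibility, the more likely the rule fires on any given step; someone who is merely susceptible, and not yet exposed, is left alone.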

The above is written in scheme, but you could also use Python -- the AtomSpace has Python bindings. Arguing about Python vs. something else kind-of misses the point: the above Atomese is actually a graph (a tree, in this case) that happens to encode, in its nodes and links, an abstract syntax tree (AST) specifying an executable program. Atomese allows you to work directly with the AST, whether in scheme, in python, or just abstractly, at a higher layer. The Wikipedia article on ASTs explains exactly why this is a useful thing to have.

Network grammar

seed.png
A single seed

The network grammar in this demo is extremely simple. It specifies two types of edges: "friend" and "stranger", used for different types of social encounter. The grammar itself just consists of a set of "seeds" or "burrs" - a single vertex (a prototype individual) together with some half-edges on it - potential edges that could form either "friend" or "stranger" relationships.

It's perhaps most useful to imagine each burr as a jigsaw-puzzle piece: in this case, with only two types of connectors: "friend" or "stranger", varying in quantity from a few, to many. The visuals show a seed with some attached half-edges, and the process of connecting them. It's painfully obvious, isn't it?

puzzle.png
Connecting puzzle pieces

A single seed, expressed in Atomese, looks like this:

(Section
   (Concept "person-2-3")
   (ConnectorSeq 
      (Connector (Concept "friend") (ConnectorDir "*"))
      (Connector (Concept "friend") (ConnectorDir "*"))
      (Connector (Concept "stranger") (ConnectorDir "*"))
      (Connector (Concept "stranger") (ConnectorDir "*"))
      (Connector (Concept "stranger") (ConnectorDir "*"))))

It has two unconnected "friend" connectors, and three unconnected "stranger" connectors. The connectors are unpolarized - they have no directionality, so that the resulting relationships are symmetrical. That is, unlike an actual jigsaw-puzzle piece, the "*" ConnectorDir doesn't have a "direction"; it allows an any-to-any connection to be made.  When a connection is formed, the connector labels must match - thus, in this case, two different kinds of edges are formed in the final graph.
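
To make the jigsaw picture concrete, here is a toy Python sketch of joining half-edges whose labels match. To be clear, this is not the algorithm the network generator uses: it is a naive random pairing of connectors, and the names connect_seeds, alice, bob and carol are invented for the example.

```python
import random

def connect_seeds(seeds, rng):
    """Randomly pair up half-edges that carry the same label.

    seeds maps a person's name to a list of connector labels.
    Returns a list of (label, person_a, person_b) edges. A half-edge
    that finds no partner on a different person stays unconnected.
    """
    # Collect the open half-edges, grouped by connector label.
    stubs_by_label = {}
    for name, connectors in seeds.items():
        for label in connectors:
            stubs_by_label.setdefault(label, []).append(name)

    edges = []
    for label, stubs in stubs_by_label.items():
        rng.shuffle(stubs)
        while len(stubs) >= 2:
            a = stubs.pop()
            # Look for a half-edge that belongs to someone else.
            for i in range(len(stubs) - 1, -1, -1):
                if stubs[i] != a:
                    edges.append((label, a, stubs.pop(i)))
                    break
    return edges

seeds = {
    "alice": ["friend", "friend", "stranger"],
    "bob": ["friend", "stranger", "stranger"],
    "carol": ["friend", "stranger"],
}
edges = connect_seeds(seeds, random.Random(0))
```

Because the connectors are unpolarized, an edge (label, a, b) means the same thing as (label, b, a). This toy version may link the same pair of people more than once; whether the real generator permits that is not something the sketch tries to settle.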

seeds-two.png
A pair of connected seeds

Below follows an example random network, generated by the current system. It's long, and thin - very long and thin - as controlled by the network generation parameters.  Adjustable parameters include what one might expect: a maximum size, number of graphs to generate, a set of starting points that must be bridged, and so on. Again: this is version 0.1; there is more to come.

Graph generation parameters are also specified in Atomese. For example, the maximum network size:

(State (Member max-network-size params) (Number 2000))

The MemberLink denotes set membership. The StateLink is a link that allows one and only one relation at a time. In this case, the MemberLink can be associated with only one Atom at a time: the NumberNode Atom.
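
The "one and only one relation at a time" behaviour is roughly that of a dictionary entry: setting a new value replaces the old one. A minimal Python analogy follows; the StateStore class is hypothetical and has nothing to do with the actual StateLink implementation.

```python
class StateStore:
    """Holds at most one value per key, loosely like a StateLink."""

    def __init__(self):
        self._state = {}

    def set(self, key, value):
        # Any earlier value for this key is replaced, not kept alongside.
        self._state[key] = value

    def get(self, key):
        return self._state[key]

params = StateStore()
key = ("max-network-size", "params")   # stands in for the MemberLink
params.set(key, 2000)
params.set(key, 500)                   # the 2000 is gone now
```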

Initializing the network

After generating the network, one now has a number of interconnected individuals. Some initial values and state must be assigned to each.  This is straightforward, in Atomese:

 (Bind
    (TypedVariable (Variable "$person") (Type "ConceptNode"))
    (Present (Member (Variable "$person") anchor))
    (Delete (List
       (SetValue (Variable "$person") seir-state susceptible)
       (SetValue (Variable "$person") susceptibility
          (RandomNumber (Number 0.2) (Number 0.8)))
       (SetValue (Variable "$person") infirmity
          (RandomNumber (Number 0.01) (Number 0.55)))
       (SetValue (Variable "$person") recovery
          (RandomNumber (Number 0.6) (Number 0.95)))
    )))

The BindLink is a graph-rewriting link. It searches for all graphs of a certain shape, having a variable area in the graph, and then takes that variable, and generates some new graphs. In this particular case, the variable is obvious: the first stanza is just a type declaration for the variable.  The second stanza, the PresentLink, asserts that the indicated graph must be found in the AtomSpace.  The third stanza, the DeleteLink, is unusual. First, it asserts that some values are to be set, then wrapped up in a ListLink, and then the resulting ListLink is to be deleted. Well - it's done this way as a trick: we don't need the ListLink, and rather than cluttering up the AtomSpace with useless links, it's better to just get rid of it.
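
Stripped of the graph-rewriting machinery, the initialization amounts to a loop over everyone attached to the anchor. A plain-Python sketch, assuming the population is just a dict of dicts (the function name initialize and the person names are made up; the numeric ranges are the ones in the Atomese above):

```python
import random

def initialize(people, rng):
    """Give every person a starting SEIR state and random traits."""
    population = {}
    for name in people:
        population[name] = {
            "seir-state": "susceptible",
            "susceptibility": rng.uniform(0.2, 0.8),
            "infirmity": rng.uniform(0.01, 0.55),
            "recovery": rng.uniform(0.6, 0.95),
        }
    return population

population = initialize(["person-1", "person-2", "person-3"],
                        random.Random(42))
```

There is no counterpart to the DeleteLink trick here: Python simply throws away the temporary values on its own.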

social-network.gml.png
A very long, thin social network

Running the simulation

The spread of the disease across the network can be achieved using the same graph-query and graph-rewriting mechanism. Consider the graph query below.

 (Bind
   (VariableList
      (TypedVariable (Variable "$A") (Type "ConceptNode"))
      (TypedVariable (Variable "$B") (Type "ConceptNode")))
   (And
      (Present
         (Evaluation
            (Concept "friend")
            (UnorderedLink (Variable "$A") (Variable "$B"))))
      (Equal (ValueOf (Variable "$A") seir-state) susceptible)
      (Equal (ValueOf (Variable "$B") seir-state) infected))
   (SetValue (Variable "$A") seir-state exposed))

This locates all "friendship" graphs: one must find, present in the AtomSpace, a very simple pairwise relation between A and B, marked with the label "friend". Habitually, EvaluationLinks are used for this purpose. Having found this relationship, and if B is infected, and A is not yet ill, nor has developed immunity (that is, if A is susceptible), then A becomes exposed.  After running such a disease-transmission step, one then runs a state-change step, using the SEIR rules described up top. And then one repeats ... looping, until the population has succumbed, or there has been a break in the transmission that keeps a portion of the population isolated and uninfected.
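
The transmit-then-transition loop can be sketched in ordinary Python as follows. This is a simplification under stated assumptions: only "friend" edges are considered, only the exposed-to-infected transition is modelled (no recovery), and the function names transmit, progress and run are invented for the example.

```python
import random

def transmit(friends, state):
    """A susceptible person with an infected friend becomes exposed."""
    newly_exposed = set()
    for a, b in friends:
        for x, y in ((a, b), (b, a)):      # friendship is unordered
            if state[x] == "susceptible" and state[y] == "infected":
                newly_exposed.add(x)
    for x in newly_exposed:
        state[x] = "exposed"
    return len(newly_exposed)

def progress(state, susceptibility, rng):
    """Exposed people become infected when susceptibility beats a draw."""
    for person, s in state.items():
        if s == "exposed" and susceptibility[person] > rng.random():
            state[person] = "infected"

def run(friends, state, susceptibility, rng, max_steps=100):
    """Alternate the two steps until nothing more can change."""
    for step in range(max_steps):
        exposed_now = transmit(friends, state)
        progress(state, susceptibility, rng)
        if exposed_now == 0 and "exposed" not in state.values():
            return step
    return max_steps

# A chain a - b - c, plus d, who has no friends and so stays isolated.
friends = [("a", "b"), ("b", "c")]
state = {"a": "infected", "b": "susceptible",
         "c": "susceptible", "d": "susceptible"}
susceptibility = {name: 1.0 for name in state}   # always catches it
steps = run(friends, state, susceptibility, random.Random(1))
```

In this toy run the infection walks down the chain one friend per step, and d illustrates the "break in the transmission" case: with no edge leading to an infected individual, d never leaves the susceptible state.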

Printing Results

The graph query language is an all-purpose tool. Network stats are straightforward to query. Of course! Like so:

 (Get
    (TypedVariable (Variable "$indiv") (Type "ConceptNode"))
    (And
       (Present (Member (Variable "$indiv") anchor))
       (Equal (ValueOf (Variable "$indiv") seir-state) infected)))

The above just searches for all individuals, identified by their membership in a common anchor point, and then checks to see if their state is "infected". This returns a list of all of those individuals; one need merely count to see how many of them there are.
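
In plain Python, the same query-then-count idea might look like the sketch below. Again, this assumes the population is a dict of dicts rather than Atoms in the AtomSpace, and the helper names are hypothetical.

```python
from collections import Counter

def find_by_state(population, wanted):
    """Return the names of everyone whose SEIR state equals wanted."""
    return [name for name, person in population.items()
            if person["seir-state"] == wanted]

def tally(population):
    """Count how many individuals are in each SEIR state."""
    return Counter(person["seir-state"] for person in population.values())

population = {
    "p1": {"seir-state": "infected"},
    "p2": {"seir-state": "susceptible"},
    "p3": {"seir-state": "infected"},
    "p4": {"seir-state": "recovered"},
}
infected = find_by_state(population, "infected")
counts = tally(population)
```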

Putting it together

The demo is fully functional. Start a guile shell, load the demo file, and off it will go. Network generation may take anywhere from a small fraction of a second to over ten seconds; that's the only burp. The source is heavily documented. If you get into trouble, the wiki links provide additional documentation. If Atomese is confusing, then perhaps a review of the basic AtomSpace examples is in order.

The adventurous may try translating the demo to Python. The AtomSpace has Python bindings. They work. There's no particular reason that Atomese has to be written in scheme (although, clearly, I find it preferable: it's easier, simpler than Python. For me, at least.)

Conclusion

The AtomSpace is an in-RAM (hyper-)graph database with a powerful associated graph query language, specifically formulated for working with large, complex graphs, for re-writing and editing those graphs, and for managing the flow of state through the graph. Modelling disease networks is one of the things it should be good at ... and I hope, with this blog post, I've convinced you that this is indeed the case.

Linas Vepstas | April 22, 2020 at 3:02 am | Categories: Uncategorized | URL: https://wp.me/p9hhnI-bX

Trouble clicking? Copy and paste this URL into your browser:
https://blog.opencog.org/2020/04/22/covid-19-modelling-and-random-social-networks/




--
cassette tapes - analog TV - film cameras - you