> But how will you calculate P(image|crow,black)?
Well as you know, if you really want to, something like "the RGB value
of the pixel at coordinate (444,555) is within a distance .01 of
(.3,.7,.8)" can be represented as a logical atom ... so there is no
problem using logic to reason about perceptual data in a very raw way if you want to
OTOH I don't really want to do it that way... instead, as you know, I
want to model visual data using deep NNs of the right sort, and then
feed info about the structured latent variables of these NNs and their
interrelationships into the logical reasoning engine.... This is
because it seems like NNs, rather than explicit logic or probabilistic
programming, are more efficient at processing large-scale raw video
data...
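Both routes can be sketched in miniature (plain Python; `Atom`, `pixel_atom`, and the detector scores below are invented for illustration, not OpenCog API). The raw-pixel route turns a pixel-level fact into a logical atom directly; the NN route feeds detector outputs (standing in for structured latent variables) into the logic layer as atoms:

```python
from collections import namedtuple

# A toy "atom": a predicate name plus arguments and a truth value.
Atom = namedtuple("Atom", ["predicate", "args", "tv"])

def pixel_atom(x, y, rgb, target, eps=0.01):
    """Raw-pixel route: 'the RGB value at (x,y) is within eps of target'."""
    dist = max(abs(a - b) for a, b in zip(rgb, target))
    return Atom("pixel-near", ((x, y), target, eps), dist <= eps)

def detection_atoms(detections):
    """NN route: feed structured detector outputs into the logic layer."""
    return [Atom("sees", (label,), confidence)
            for label, confidence in detections]

raw = pixel_atom(444, 555, (0.3, 0.7, 0.8), (0.3, 0.7, 0.8))
latent = detection_atoms([("crow", 0.92), ("black", 0.88)])
```

The point of the sketch is only that both representations land in the same atom format, so the reasoning engine need not care which route produced them.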
Ben Goertzel <b...@goertzel.org>:
if one stays in the world of finite discrete
distributions, one can construct probabilistic logics with
sampling-based semantics... https://arxiv.org/pdf/1602.06420.pdf
Hmm, well when I think about the algorithms involved, I do not see why
the Pattern Miner and Pattern Matcher would be unable to search for
patterns involving Values... I think they could.... It's true the
code doesn't do this now though...
It is true that Values are not indexed globally. But it seems to me
that the search algorithms inside the PMs do not need such indexes...
Now coordinate values of bounding boxes ... If we are talking about
something like the bounding box of Ben's face during a conversation,
which changes frequently, this would be appropriately stored in the
Atomspace using a StateLink,
https://wiki.opencog.org/w/StateLink
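StateLink semantics can be sketched like this (toy Python; `ToyAtomspace` is an invented stand-in, not the actual implementation): asserting a new state for an anchor implicitly removes the previous one, so the space only ever holds the current bounding box rather than a growing history:

```python
class ToyAtomspace:
    """Toy model of StateLink: one current value per anchor."""
    def __init__(self):
        self._state = {}

    def set_state(self, anchor, value):
        # Inserting a new StateLink implicitly removes the previous one.
        self._state[anchor] = value

    def get_state(self, anchor):
        return self._state.get(anchor)

space = ToyAtomspace()
space.set_state("bounding-box-of-Ben-face", (10, 20, 50, 60))
space.set_state("bounding-box-of-Ben-face", (12, 22, 52, 62))  # replaces old
```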
In any case I am confused about how these technical OpenCog plumbing
issues relate to the general issues you raise...
One question is: Is probabilistic logic an appropriate method for the
core of an AGI system, given that this AGI system must proceed largely
on observation-based semantics ...
I think the answer is YES
Another question is: Is the current OpenCog infrastructure fully ready
to support scalable probabilistic logic on real-time observation
data...
I think the answer is NOT QUITE
Similarly, we could ask
One question is: Is probabilistic programming an appropriate method for the
core of an AGI system, given that this AGI system must proceed largely
on observation-based semantics ...
I think the answer is YES
Another question is: Is any currently available probabilistic
programming infrastructure fully ready
to support scalable probabilistic programming on real-time observation
data...
I think the answer is NO... or maybe (??) NOT QUITE
Regarding the comparison btw probabilistic logic and probabilistic
programming, I would note that
-- dealing with quantifiers and their binding functions in
probabilistic logic is a pain in the ass
-- dealing with execution traces in probabilistic programming is a
pain in the ass
[But ofc, to do probabilistic program learning in any AGI-ish sense,
you need to be modeling execution traces
and all the variable state changes and interrelationships in there etc. ]
So there is copious mess about variables, of different sorts, in both
paradigms..
When we extend these methods to 2nd and 3rd order
probability distros, we run into the
issue that doing probabilistic program learning via MC sampling or
anything similar to that becomes
extremely slow.... One then wants to do inference to bypass the need
for sampling. But what kind
of inference? Perhaps PLN type abductive and inductive inference?
In this case one needs the probabilistic
logic in order to actually do learning over probabilistic programs
without incurring unrealistic overhead...
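The sampling-versus-inference tradeoff can be shown in miniature (plain Python; the joint table and its numbers are invented, and real 2nd/3rd-order distributions would be far costlier): rejection sampling needs thousands of draws to approximate what exact enumeration over a finite discrete distribution gives directly.

```python
import random

# Finite discrete world: (crow, black) -> probability, an invented joint.
joint = {
    (1, 1): 0.40, (1, 0): 0.10,
    (0, 1): 0.20, (0, 0): 0.30,
}

def exact_conditional(joint, query, evidence):
    """Exact inference: P(query=1 | evidence=1) by enumeration."""
    num = sum(p for w, p in joint.items() if w[query] and w[evidence])
    den = sum(p for w, p in joint.items() if w[evidence])
    return num / den

def sampled_conditional(joint, query, evidence, n, rng):
    """Sampling-based semantics: rejection sampling, slow to converge."""
    worlds, probs = zip(*joint.items())
    hits = total = 0
    for _ in range(n):
        w = rng.choices(worlds, weights=probs)[0]
        if w[evidence]:
            total += 1
            hits += w[query]
    return hits / total

rng = random.Random(0)
exact = exact_conditional(joint, query=1, evidence=0)        # P(black | crow)
approx = sampled_conditional(joint, 1, 0, 20000, rng)        # needs 20k draws
```

Even on two binary variables the sampler burns 20,000 draws for two decimal places; the enumeration is a couple of sums. Higher-order distributions only widen that gap, which is the motivation for inference that bypasses sampling.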
Overall, my feeling is that probabilistic programming will be better
for procedural knowledge, and probabilistic
logic will be better for declarative knowledge
Currently, as you probably already understand, the (only?) way to match values is to resort to grounded schemata.
For similar reasons PLN formulas are programmed with grounded schemata. A way to address that would be to complement Atomese with links encoding operators to access and modify values, GetValueLink, etc. This wouldn't make the pattern matcher more efficient (initially), but at least it would allow OpenCog to reason about values.
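The suggestion can be sketched as follows (toy Python; `GetValueLink`/`SetValueLink` here are illustrative stand-ins for the proposed operators, not existing Atomese behavior): values live in a per-atom key-to-value table, and dedicated link types give the reasoner explicit operators over that table instead of opaque grounded schemata:

```python
values = {}  # (atom, key) -> value; toy stand-in for per-atom value tables

def set_value_link(atom, key, value):
    values[(atom, key)] = value

def get_value_link(atom, key):
    return values.get((atom, key))

def evaluate(link):
    """Tiny interpreter: executes Get/SetValueLink expressions, so value
    access is a visible operation rather than a black-box schema call."""
    op = link[0]
    if op == "SetValueLink":
        _, atom, key, value = link
        set_value_link(atom, key, value)
        return value
    if op == "GetValueLink":
        _, atom, key = link
        return get_value_link(atom, key)
    raise ValueError(op)

evaluate(("SetValueLink", "Concept:dress", "bounding-box", (4.0, 2.0)))
bb = evaluate(("GetValueLink", "Concept:dress", "bounding-box"))
```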
I will answer this email in several parts. Re: atoms vs values, my thinking is this:

Should I repeat some basics? Atoms are heavy-weight, precisely because they create and update caches of what they are connected to. That makes it easy and fast to find what an atom is connected to, but slow to actually make the atom. Atoms are also held in an index, so that they can be searched by name, by type. Insertion into an index is expensive -- and stupid, if you never use the index. Values avoid this overhead.

I want to say that Values can be used to carry things that "flow around on the network", but this idea has not been explored very much. Right now, Values are only "fast-changing-things attached to an atom". How that Atom might represent the "topology" (the connection) between "things" does not yet have any clear policy. I have been advocating the idea of using "connectors" to connect things. I've tortured Anton Kolonin and the language-learning crew with this idea, but the concept of forming connections is more general than just linguistics.

-- Use Values to hold fast-changing data. For example, you could have a (C++) VideoValue object that, when you attached to it, provided you with a video-stream. (Perhaps you want a VideoProducerValue and a VideoConsumerValue. I have not thought about that very much). The point is that, using today's code base, as it exists right now, you could write the code for a VideoValue object "in an afternoon", and it would work, with no performance bottlenecks, no excess RAM usage, no excess CPU overhead. (The "afternoon" might actually be a few days -- but it would not be a few weeks. You can get started now.)

-- Use Atoms to represent the "topology" of a network: what is connected to what. Atoms express (long-term, slowly-varying) relationships between things.
Both tasks can be considered as part of the Semantic Vision problem, but their solution can be useful in a more general context.

OpenCog + Tensorflow

The depth of OpenCog + Tensorflow integration can be quite different. Shallow integration implies that Tensorflow is used as an external module, and communication between Tensorflow and OpenCog is limited to passing activities of neurons, which are represented both by Tensorflow and Atomspace nodes. The most restricted way is just to run (pre-trained) TF models on input data and to set values of Atomspace nodes in correspondence with the activities of output neurons. What will be missing in this case: feedback connections from the cognitive level to the perception system, and online (and joint) training of neural networks and OpenCog.

Let us consider the Visual Question Answering (VQA) task as a motivating example. How will OpenCog be able to answer such questions as “What is the color of the dress of the girl standing to the left of the man in a blue coat?” If our network is pre-trained to detect and recognize all objects in the image and supplement them with detailed descriptions of colors, shapes, poses, textures, etc., then the Pattern Matcher will be able to answer such questions (converted to corresponding queries). However, this approach is not computationally feasible: there are too many objects in images, and too many grounded predicates which can be applied to them.
Thus, the question should influence the process of how the image is interpreted. For example, even if we detected bounding boxes (BBs) for all objects and inserted them into the Atomspace, the predicate “left of” is not immediately evaluated on all pairs of BBs. Instead, it will be evaluated during query execution by the Pattern Matcher (hopefully) only for relevant BBs labeled as “girl” and “man”. Similarly, the grounded predicate “is blue”, implemented by a neural subnetwork, can be computed only in the course of query execution, meaning that the work of the Pattern Matcher should be extended to neural network levels.

Indeed, purely DNN solutions for VQA usually implement some top-down processes, at least in the form of attention mechanisms. Apparently, cognitive feedback to perception is necessary for AGI in general. It is not a problem to feed Tensorflow models with data generated by OpenCog via placeholders, but OpenCog will also need some interface for executing computational graphs in Tensorflow. This can be done by binding corresponding Session.run calls to Grounded Predicate/Schema nodes.
The question is how to combine OpenCog and neural networks on the algorithmic level. Let us return to the considered request for VQA. We can imagine a grounded schema node which detects all bounding boxes with a given class label and inserts them into the Atomspace, so the Pattern Matcher or Backward Chainer can further evaluate some grounded predicates over them, finally finding an answer to the question. However, the question can be “What is the rightmost object in the scene?” In this case, we don’t expect our system to find all objects, but rather to examine the image starting from its right border. We can imagine queries supposing other strategies of image processing/examination. In general, we would like not to hardcode all possible cases, but to have a general mechanism which can be trained to execute different queries.
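The intended contrast (evaluate predicates only as the query demands, not over all pairs up front) can be sketched like this; everything here, including the `left_of` predicate and the box format, is an illustrative assumption rather than Pattern Matcher code:

```python
# Toy scene: label -> bounding box (x_min, y_min, x_max, y_max)
scene = {
    "girl": (10, 0, 20, 40),
    "man":  (50, 0, 60, 40),
    "tree": (80, 0, 95, 70),
}

CALLS = []  # record which predicate evaluations the query actually forced

def left_of(a, b):
    CALLS.append((a, b))
    return scene[a][2] < scene[b][0]  # a's right edge left of b's left edge

def query_left_of(label_a, label_b):
    """Pattern-matcher style: bind candidates by label first, and only
    then evaluate the grounded predicate on the relevant pairs."""
    candidates_a = [l for l in scene if l == label_a]
    candidates_b = [l for l in scene if l == label_b]
    return [(a, b) for a in candidates_a for b in candidates_b
            if left_of(a, b)]

answers = query_left_of("girl", "man")
```

The `CALLS` log shows that the "tree" box is never examined: the query, not the scene, drives which predicates get computed.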
To make neural networks transparent to the Pattern Matcher, we need to make nodes of Tensorflow also inhabitants of the Atomspace. The same is needed for the general case of unsupervised learning. In particular, architecture search is needed in order to achieve better generalization with neural networks, or simply to choose an appropriate structure of the latent code. Thus, OpenCog should be able to add or delete nodes in Tensorflow graphs. These nodes correspond not just to neural layers, but also to operations over them. One can imagine TensorNode nodes connected by PlusLink, TimesLink, etc.

There can be tricky technical issues with Tensorflow (e.g. compilation of dynamic graphs), but they should be solvable. A conceptual problem consists in the fact that the Pattern Matcher works with Atoms, but not with Values. Apparently, activities of neurons should be Values. However, the evaluation of, e.g., GreaterThanLink requires NumberNode nodes.
Operations over (truth) values are usually implemented in Scheme within rules fed to the URE. This might be enough for dealing with individual neuron activities as truth values and with neural networks as grounded predicates, but patterns in values cannot be matched or mined directly (while the idea of SynerGANs implied the necessity to mine patterns in activities of neurons of the latent code).

I was going to illustrate with concrete examples the same kind of problems with implementing probabilistic programming with OpenCog, but I guess it's already TL;DR. So, briefly speaking, we need the Pattern Matcher and Pattern Miner to work over Values/Valuations, which is not the case now (OpenCog uses only truth and attention values, and Atomese/Pattern Matcher doesn't have built-in semantics even for them). I cite Linas here:

"Atoms are:
* slow to create, hard to destroy
* are indexed and globally unique
* are searchable
* are immutable
Values are:
* fast and easy to create, destroy, change
* values are highly mutable.
* values are not indexed, are not searchable, are not globally unique."
But we need "fast and easy to create, destroy, change, highly mutable, but searchable" entities. So, this is not only a technical but also a conceptual problem...
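One way to picture the missing middle ground is a value store that stays cheap to mutate but builds a search index only on demand. This is a toy sketch of the idea, not anything that exists in the Atomspace today:

```python
class LazyIndexedValues:
    """Values stay cheap to create/change; an index over them is built
    lazily, only when a search is actually requested."""
    def __init__(self):
        self._data = {}          # atom -> value (mutable, unindexed)
        self._index = None       # value -> atoms, built on demand

    def set(self, atom, value):
        self._data[atom] = value
        self._index = None       # mutation invalidates the index

    def find(self, value):
        if self._index is None:  # pay the indexing cost only here
            self._index = {}
            for atom, v in self._data.items():
                self._index.setdefault(v, []).append(atom)
        return self._index.get(value, [])

store = LazyIndexedValues()
store.set("neuron-1", 0.7)
store.set("neuron-2", 0.7)
store.set("neuron-3", 0.1)
hot = store.find(0.7)
```

Whether invalidate-on-write would be fast enough for neuron activities that change every frame is exactly the open question; the sketch only shows that "mutable but searchable" is not a contradiction in terms.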
I would really like to hear your opinion on this. What should we do? Resort to the most shallow integration between OpenCog and DNNs? In this case, SynerGANs will not work, since we will not be able to mine patterns in values, and we will not be able to use the Pattern Matcher to solve VQA. Express the output of DNNs as Atoms? Linas objected even to the idea of expressing coordinates and labels of bounding boxes as Atoms. To do this with activities of neurons would be even worse. Put everything into the Space-Time server? But then the idea of using the power of the Pattern Matcher, URE, etc. will not be achievable. Extend the Pattern Matcher to work with Values? Maybe... /*I like the idea of embedding the TF computational graph into the Atomspace, but tf.mul works over Values (tensors) - not NumberNodes. Thus, in this case, it would be required to make all links (like TimesLink) work not only with NumberNodes, but also with Values... but I foresee objections from Linas here... Also, I believe it should be useful in general, since Values are not first-class objects in Atomese - you have to use Scheme/Python/C to describe how to recalculate truth values; you cannot reason about them directly...

Or should we try to use a sort of PPL as a bridge between Values and Atoms? Maybe... Or we should do something unifying all of these.*/

The question is not just about binding vision and PLN. It is more general. Say, if you are driving a car, you estimate distances and velocities of other cars and take actions on this basis. These are also Values, and you 'reason' over them using both 'number crunching' and 'logic' simultaneously (I don't mean procedural knowledge here in the sense of GroundedSchemaNode). So, I don't think that we should limit ourselves to a shallow integration and use DNNs/PPL/etc. peripherally only...
Ben Goertzel <b...@goertzel.org>:
if one stays in the world of finite discrete
distributions, one can construct probabilistic logics with
sampling-based semantics... https://arxiv.org/pdf/1602.06420.pdf

Sounds quite interesting. I'll study it in detail...

-- Alexey
Yes, this is clear (although I'd like to know more about your ideas regarding connectors), and I agree with this. But this does not answer the question of what the best way to integrate DNNs with Atomese is.
How many are we talking about, here? dozens, hundreds of objects? hundreds of predicates per object? That is 100x100 = 10K and, currently, you can create and add maybe 100K atoms/sec to the atomspace (via C++, less by scheme, python, due to wrapper overhead). So this seems manageable.
> Similarly, the grounded predicate “is blue”, implemented by a neural subnetwork, can be computed only in the course of query execution, meaning that the work of the Pattern Matcher should be extended to neural network levels.

There is a generic mechanism called "GroundedPredicateNode", and it can call arbitrary C++/scheme/python/haskell code, which must return a true/false value. True means "yes, match and continue with the rest of the query".
Unfortunately, GroundedPredicateNodes are "black boxes"; we do not know what is inside. Thus, it is useful to sometimes define "clear boxes": for example: GreaterThanLink. The GreaterThanLink can handle an infinite number of inputs, but it is not a black box: we know exactly what kind of inputs it expects, what it produces, what it does. Thus, it is possible to perform logical reasoning on GreaterThanLinks, and/or perform algebraic simplification (a<b<c implies a<c, etc)
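To make the "clear box" point concrete, here is a toy sketch (plain Python, not OpenCog code) of the kind of algebraic reasoning that GreaterThanLinks permit but an opaque GroundedPredicateNode would not: deriving a<c from a<b and b<c without evaluating anything.

```python
def transitive_closure(greater_than):
    """Given known (a, b) pairs meaning a > b, derive every pair that
    follows by transitivity -- reasoning over the *structure* of the
    links, which a black-box predicate would never permit."""
    known = set(greater_than)
    changed = True
    while changed:
        changed = False
        for a, b in list(known):
            for c, d in list(known):
                if b == c and (a, d) not in known:
                    known.add((a, d))
                    changed = True
    return known

facts = {("c", "b"), ("b", "a")}   # c > b, b > a
derived = transitive_closure(facts)  # now also contains c > a
```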
> The question is how to combine OpenCog and neural networks on the algorithmic level. Let us return to the considered request for VQA. We can imagine a grounded schema node which detects all bounding boxes with a given class label and inserts them into the Atomspace,

For example, one creates a ConceptNode "dress". One also creates a PredicateNode "*-bounding-box-*". Then one writes C++ code to implement the TensorFlowBBValue object. One then associates all three:

    (cog-set-value! (Concept "dress")
        (Predicate "*-bounding-box-*")
        (TensorFlowBBValue "obj-id-42"))

What is the current bounding box for that dress? I don't know, but I can find out:

    (cog-value->list
        (cog-value (Concept "dress") (Predicate "*-bounding-box-*")))

returns 2 or 4 floating point numbers, as a list. Is Susan wearing that dress?

    (cog-set-value! (Concept "Face-of-Susan")
        (Predicate "*-bounding-box-*")
        (TensorFlowBBValue "obj-id-66"))

    (is-near? A B)
        (> 0.1 (distance
            (cog-value A (Predicate "*-bounding-box-*"))
            (cog-value B (Predicate "*-bounding-box-*"))))

returns true if there is less than 0.1 meters distance between the bounding boxes on A and B. The actual location of the bounding boxes is never stored, and never accessed, unless the is-near? predicate runs.
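Rendered as toy Python (the Scheme above is the actual sketch; `BBValue`, the backend table, and the box format here are invented for illustration), the key property is that coordinates are fetched from the vision backend only when the predicate actually fires:

```python
FETCHES = []  # track when the vision backend is actually consulted

class BBValue:
    """Stand-in for TensorFlowBBValue: resolves its object id to a
    bounding-box centre only when someone asks."""
    def __init__(self, obj_id, backend):
        self.obj_id, self.backend = obj_id, backend

    def get(self):
        FETCHES.append(self.obj_id)
        return self.backend[self.obj_id]

values = {}  # (atom, key) -> value

def set_value(atom, key, value): values[(atom, key)] = value
def get_value(atom, key): return values[(atom, key)]

def is_near(a, b, eps=0.1):
    ax, ay = get_value(a, "*-bounding-box-*").get()
    bx, by = get_value(b, "*-bounding-box-*").get()
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 < eps

backend = {"obj-id-42": (1.00, 2.00), "obj-id-66": (1.05, 2.00)}
set_value("dress", "*-bounding-box-*", BBValue("obj-id-42", backend))
set_value("Face-of-Susan", "*-bounding-box-*", BBValue("obj-id-66", backend))

assert FETCHES == []              # nothing fetched until the query runs
near = is_near("dress", "Face-of-Susan")
```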
Yes... Yes...
> These nodes correspond not just to neural layers, but also to operations over them. One can imagine TensorNode nodes connected by PlusLink, TimesLink, etc.

Yes. However, we might also need PlusValue or TimesValue. I do not know why, yet, but these are potentially useful, as well.

> There can be tricky technical issues with Tensorflow (e.g. compilation of dynamic graphs), but they should be solvable. A conceptual problem consists in the fact that the Pattern Matcher works with Atoms, but not with Values. Apparently, activities of neurons should be Values. However, the evaluation of, e.g., GreaterThanLink requires NumberNode nodes.

This is a historical accident. GreaterThanLink and NumberNodes were invented long before the idea of Values became clear. Now that the usefulness of Values is becoming clear, it's time to redesign GreaterThanLink.
Perhaps we need an IsLeftOfLink that knows automatically to obtain the "*-centroid-*" value on two atoms, and then return true/false depending on the result (or throw exception if there is no *-centroid-* value.)
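That idea can be sketched in a few lines (toy Python; the centroid data, atom names, and error behavior are invented for illustration):

```python
centroids = {  # atom -> "*-centroid-*" value (x, y); illustrative data
    "girl": (15.0, 20.0),
    "man":  (55.0, 20.0),
}

def is_left_of(a, b):
    """Toy IsLeftOfLink: automatically fetch the *-centroid-* value on
    both atoms and compare x coordinates; complain if one is missing."""
    try:
        (ax, _), (bx, _) = centroids[a], centroids[b]
    except KeyError as missing:
        raise ValueError(f"no *-centroid-* value on {missing}")
    return ax < bx

girl_left = is_left_of("girl", "man")
```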
I think it's better, if possible, to figure out a way to suitably
modify the core PM rather than using a separate repo ...
However, I guess the PM tweaks would need to be done by someone on your team, as
Linas and Nil probably are too busy and we don't have a lot of others
who can rapidly perform such changes...
I would personally be in favor of overloading stuff like TimesLink in
order to apply to both
NumberNodes and Values, because it seems to me that the Atom/Value
distinction is more of
an efficiency-driven implementation distinction rather than a
fundamental mathematical/conceptual distinction...
Nil and Linas should be consulted on this stuff, but at this point you
are also in the exalted
"inner circle" with foundational input on these OpenCog-architecture issues...
If needed we could also introduce some sort of entity that is between
a Value and an Atom in some sense -- i.e. we could introduce some sort of TensorValue entity that
1) Perhaps, knows what links to it (like an Atom but unlike a Value)
2) has an internal tensor that is mutable
There is nothing prohibiting one from building something like this
into Atomspace,
though obviously not breaking various mechanisms would require some care...
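The in-between entity could look roughly like this (pure sketch under the two stated requirements; no such class exists in the Atomspace, and the names are invented):

```python
class TensorValue:
    """Hybrid sketch: (1) tracks its incoming set, like an Atom;
    (2) holds an internal, mutable tensor, like a Value."""
    def __init__(self, tensor):
        self.tensor = list(tensor)    # mutable payload, no global index
        self.incoming = set()         # Atom-like back-pointers

    def mutate(self, i, x):
        self.tensor[i] = x            # cheap in-place update

    def link_from(self, link_name):
        self.incoming.add(link_name)  # record what links to this entity

tv = TensorValue([0.1, 0.2, 0.3])
tv.link_from("EvaluationLink-7")
tv.mutate(0, 0.9)
```

The care Ben mentions would go into keeping `incoming` consistent when links are created or destroyed elsewhere; the sketch ignores that entirely.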
>
> Exactly. Probabilistic logic is a way to make inference over probabilistic
> programs much more efficient. I have specific examples for this in mind.
It will be good to hear the examples when you have time...
For instance, I like to think about evolutionary programming (e.g.
MOSES) as a tool for learning procedural knowledge, but OTOH our main
use of this tool right now is for learning classification rules. Now
a program embodying a classification rule is, in a sense, a "procedure
for performing the classification" ... but then in this sense, every
logical inference is also a cognitive procedure ;p
> Yes, this is clear (although I'd like to know more about your ideas regarding connectors),

Sure; say when. I can talk about them for days. Unfortunately, the core idea is so simple, so obvious, that it becomes very difficult to talk about the advanced concepts, so this needs to be a distinct conversation.
b) You ask me narrow, focused questions about certain specific tasks, and I answer how they could be accomplished, and how much work that would take.

I find that b) is much easier. Currently, I don't know what more you want, besides what I've already written in the last several emails.
What seems to happen in humans is that high-confidence knowledge tends
to go subconscious, while low-confidence knowledge tends to be the
subject of attention. So for instance, at first you may focus on
edge detection, etc.; once you've built some high-confidence model
about the relationships within this domain, you move that to the
subconscious (the neocortex? I'm not enough into neuroscience to
tell); then you can focus on the next abstraction.
So transposed to OpenCog, I don't really know; it could mean that,
for instance, once you have built, say, an ImplicationLink with sufficient
confidence (with a Dirac-like second-order distribution), you no
longer need to bother updating it, unless perhaps the likelihood of
the incoming data deviates considerably from normal, in which case it
may mean that you need to go back to the basics (I suspect psychedelic
drugs do something like that, though I don't know if that's a side
effect or a fundamental effect; I lack the experience to really tell).
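The proposal might be caricatured like this (toy Python; the "Dirac-like" distribution is reduced to a confidence threshold, and the surprise test and all numbers are invented):

```python
class Belief:
    """Toy ImplicationLink with (strength, confidence). Once confidence
    is high, updates stop -- unless incoming data deviates so much from
    the belief that it forces a return to learning mode."""
    def __init__(self, strength, confidence, freeze_at=0.95, surprise=0.4):
        self.strength, self.confidence = strength, confidence
        self.freeze_at, self.surprise = freeze_at, surprise
        self.updates = 0

    def observe(self, evidence):  # evidence in [0, 1]
        deviation = abs(evidence - self.strength)
        if self.confidence >= self.freeze_at and deviation < self.surprise:
            return  # subconscious: high confidence, nothing surprising
        self.updates += 1
        self.strength += 0.1 * (evidence - self.strength)
        self.confidence = min(1.0, self.confidence + 0.01)
        if deviation >= self.surprise:
            self.confidence = 0.5  # "back to basics"

b = Belief(strength=0.9, confidence=0.96)
b.observe(0.85)   # unsurprising: ignored, no update cost
b.observe(0.1)    # deviates a lot: reopens the belief
```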
For instance

    GetValueLink
       <atom>
       <key>

would return the value corresponding to `key` in `atom`. Note that
`key` is itself an atom. However, the returned value may not
necessarily be an atom; it may be a proto atom, and if so it would need
to be "atomized". This makes me think that ProtoAtoms probably need
some "atomize" method or something.
The atomization might be the problem, but if it is meant to be
volatile (not stored in the atomspace) then maybe the whole thing
can be done efficiently.
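The "atomize" method might look like this (toy sketch; `FloatValue`, `NumberNode`, and `ListLink` are modeled as plain tuples, and the volatile flag is an invented detail):

```python
def atomize(value):
    """Turn a proto-atom (here, a tuple of floats) into a proper atom
    tree: a ListLink of NumberNodes, suitable for the pattern matcher.
    Marked volatile so it need not be stored in the atomspace."""
    atoms = tuple(("NumberNode", str(x)) for x in value)
    return ("ListLink", atoms, {"volatile": True})

link = atomize((5.2, 0.1, 4.5))
```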
It seems it would be good if Atomese offered operators to create,
destroy, etc., atomspaces, because then temporary computations could be
moved into a small atomspace, which I suspect would still be able to
manipulate atoms with less overhead (though I don't know well where
the overhead of inserting atoms into an atomspace comes from).
2018-05-22 0:11 GMT+03:00 Linas Vepstas <linasv...@gmail.com>:

> How many are we talking about, here? dozens, hundreds of objects? hundreds of predicates per object? That is 100x100 = 10K and, currently, you can create and add maybe 100K atoms/sec to the atomspace (via C++, less by scheme, python, due to wrapper overhead). So this seems manageable.

Thousands or even millions of objects. I can ask you a question about a speck of dust sparkling in the sunlight, a hot pixel on your screen, a tiny birthmark on a face, a hole in a button with a thread passing through it, etc. Each pixel belongs to tens of "objects"...
> Also, DNNs are trained on (mini-)batches. It is not too natural from an
> autonomous agent perspective, but efficient.
Yes I see. Again maybe some new TensorValue construct will be needed, we just
need to understand clearly what the requirements are in terms of any special
indexing etc.
The difference between Atoms and Values is just an implementation-efficiency
tactic...
What do you suppose GetValueLink should do?
> If needed we could also introduce some sort of entity that is between a Value and an Atom in some sense -- i.e. we could introduce some sort of TensorValue entity that
> 1) Perhaps, knows what links to it (like an Atom but unlike a Value)
> 2) has an internal tensor that is mutable
> There is nothing prohibiting one from building something like this into Atomspace, though obviously not breaking various mechanisms would require some care...

OK, we will think about this.
Linas,

> Sure; say when. I can talk about them for days. Unfortunately, the core idea is so simple, so obvious, that it becomes very difficult to talk about the advanced concepts, so this needs to be a distinct conversation.

If you had a paper (not necessarily a journal-style paper) systematically describing the idea and its application to OpenCog, it would be really nice.
***
The existing architecture has room for a lot of things, a lot of
freedom for designing things. I'd like to stick to it as much as
possible.
***
It's very understandable; however, "... as possible" is key here, and
it's hard to see how the current system can scalably deal with tensors
from sensory-processing tools without at least some modest
changes/additions...
On 05/24/2018 07:24 AM, Linas Vepstas wrote:
> Could this indexing be made lazy? But the notion of uniqueness is only relevant when you query something about cat, like its incoming set, etc.

No, it's fundamental to what the definition of the atomspace is. It's the only way that you can have a single, unique (Concept "cat") in the system.

Now regarding memory management, yes it's true, although I guess it could still be lazy and only index when memory grows too much.
I want to keep this conversation realistic. Sophia, today, struggles to see human faces.
Realistic compute power -- lets say several laptops worth of compute, and a GPU card that doesn't have some insanely whirry fan. This is what you can get on-site, at the location where the vision is happening.
> We would like to hardcode as less as possible. We can (and likely should) code TensorFlowValue
I think that would be a good experiment to conduct. While Ben and others enjoy designing systems top-down, I like to pursue a bottom-up approach -- build something, see how well it works. If it works poorly, make sure that we understand *why* it failed, and what parts were good, and then try again. So, for me a TensorFlowValue object would highlight what's good and what's bad in the current design. Engineering hill-climbing.
> but we would like to avoid hardcoding (is-near? A B).

I agree, sort-of-ish. English-language prepositions are a "closed class" - it's a finite list, and a fairly small list -- a few dozen that are truly practical. A few hundred, if you start listing archaic, obsolete, rare ones, ones inapplicable to images ... https://en.wikipedia.org/wiki/List_of_English_prepositions So for now, I find it acceptable to hard-code a certain subset.
A discussion about "how can we learn prepositions from nothing?" would have to be a distinct conversation.
PlusLink, TimesLink, etc..
> This is exactly my question whether we need them or not :)
Whether they are needed or not depends a lot on what kind of data is exposed by TensorFlowValue, and how that data is then routed up into the natural-language and reasoning layers. There are multiple possible designs for this; there is no particular historical precedent (in the atomspace) for this.
It avoids some of the complexity of bounding boxes (which might be touching, overlapping or inside-of.)
No, I meant values in general. So for instance if atom (Node "A") holds a FloatValue (5.2, 0.1, 4.5) at key (Schema "*-my-key-*"), the following would return something like
(cog-execute!
(GetValueLink
(Node "A")
(Schema "*-my-key-*")))
(List (Number 5.2) (Number 0.1) (Number 4.5))
Linas, I want to keep this conversation realistic. Sophia, today, struggles to see human faces. We are not talking about applying existing narrow methods. These methods may be needed/realistic/practical now, but they don't bring us much closer to AGI. If we were talking about 'realistic' things in this sense, we would not talk about AGI at all. Our task is to move forward toward the goal of creating a vision system for AGI. It's not about making a better narrow face-recognition algo. We can do this, but it is not our task now.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="relu", input_shape=(4,)),  # input shape required
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(3)
])
appears to be a purely declarative definition of a network topology, which we could map to Atomese.
This would allow us to write tensorflow programs in Atomese. Why is that interesting? Not because
we want humans to write tensorflow models in atomese, but because maybe we can have PLN
perform reasoning about tensorflow models, or because we can use MOSES to create, control
and evaluate tensorflow models, or perhaps you have some probabilistic-programming idea that could
auto-generate different tensorflow models.
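That last point can be illustrated without TensorFlow at all. If the topology lives as plain data (the nested Python tuples below stand in for an Atomese representation; the `("dense", units, activation)` encoding is invented for this sketch), then a program -- PLN, MOSES, whatever -- can inspect and rewrite it:

```python
# Hypothetical encoding of the Keras topology above as plain data,
# standing in for an Atomese representation. A "model" is just a list
# of layer descriptions that a program can inspect and rewrite.
model = [
    ("dense", 10, "relu"),   # Dense(10, activation="relu")
    ("dense", 10, "relu"),
    ("dense", 3, None),      # final linear layer
]

def count_params(model, input_dim):
    """Count trainable parameters of a stack of dense layers."""
    total, prev = 0, input_dim
    for kind, units, _activation in model:
        assert kind == "dense"
        total += prev * units + units  # weights + biases
        prev = units
    return total

def widen_layer(model, index, extra_units):
    """A MOSES-style mutation: make one layer wider."""
    kind, units, act = model[index]
    return model[:index] + [(kind, units + extra_units, act)] + model[index + 1:]

print(count_params(model, 4))  # 4*10+10 + 10*10+10 + 10*3+3 = 193
print(count_params(widen_layer(model, 1, 5), 4))
```

The point is not the arithmetic; it is that once the topology is data rather than opaque Python code, "reasoning about models" and "mutating models" become ordinary operations over that data.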
So far, I am very unclear about exactly what problem we are trying to solve, here (other than the
"problem of AGI").
Will this list then be added to the Atomspace, or will it exist only temporarily while the Pattern Matcher is running?
I mean, the idea was not to introduce any global changes to the Pattern Matcher. So, once the PM encounters a GetValueLink, it calls cog-execute! on it and receives a NumberNode, which is then used to evaluate e.g. a GreaterThanLink connecting the GetValueLink to another GetValueLink or NumberNode. This was my initial understanding. Is it right?
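That control flow is easy to mock up in Python (a toy stand-in for the real Pattern Matcher, with invented names): the GetValueLink analogue resolves a number out of an atom's key-value table, and the GreaterThanLink analogue is then evaluated over the resolved numbers.

```python
# Toy mock of the flow described above: the "atomspace" is a dict of
# per-atom key -> value tables; get_value plays the role of executing a
# GetValueLink, and greater_than that of evaluating a GreaterThanLink.
values = {
    ("Node", "A"): {"*-my-key-*": 5.2},
    ("Node", "B"): {"*-my-key-*": 0.1},
}

def get_value(atom, key):
    """Analogue of executing a GetValueLink: fetch a value by key."""
    return values[atom][key]

def greater_than(lhs, rhs):
    """Analogue of evaluating a GreaterThanLink over resolved values."""
    return lhs > rhs

a = get_value(("Node", "A"), "*-my-key-*")
b = get_value(("Node", "B"), "*-my-key-*")
print(greater_than(a, b))  # True
```

Nothing here lives past the evaluation: the resolved numbers are transient, which is exactly the question being asked above.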
Actually, I want you to not think about this. I strongly believe that pretty much anything you can think of will fit nicely into an Atom, or into a Value. I do not want to see a third kind of "generic object system" being created, that would be a deep mistake.
I think the newcomers need to have another half-year-ish of hands-on experience before we debate such fairly significant architectural changes.
Nothing I've heard so far requires any changes at all, and I can see a reasonable, simple solution, just fine.
And that is not what I was saying, at all. What I was talking about was the principles of software architecture.
Again, it would be great if we could nail down the next level of details.
OpenCog + PPL
One of the ideas/tasks was to make OpenCog a “better Church”.
1. Why is it possible
Universal probabilistic programming languages (PPLs) utilize sampling-based approaches to infer posterior probabilities. They do not “reason” about tasks.
Consider a number of examples.
1.1)
(rejection-query
(define n (random-integer 1000000))
n
(= n 10))
This program says that n is a random integer from 0 to 999999, and it tries to estimate P(n|n=10). It will take quite a long time, although it is obvious that P(n=10|n=10)=1. We can imagine an (extended) Pattern Matcher easily deducing n=10 and checking that it fits [0, 999999].
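The cost gap is easy to demonstrate in plain Python (a toy stand-in for Church's rejection-query; the domain is scaled down from 1,000,000 to 10,000 only to keep the demo fast -- the ratio of sampling cost to deduction cost is the same):

```python
import random

def rejection_query(seed=0, max_tries=1_000_000):
    """Naive rejection sampling for P(n | n = 10); domain scaled down to
    10,000 (from 1,000,000 in the Church example) to keep the demo fast."""
    rng = random.Random(seed)
    for tries in range(1, max_tries + 1):
        n = rng.randrange(10_000)
        if n == 10:          # the evidence: (= n 10)
            return n, tries  # each accepted sample costs ~10,000 proposals
    raise RuntimeError("no sample accepted")

def deduce():
    """What an extended Pattern Matcher could do instead: solve n = 10
    directly and merely check that it lies in the variable's range."""
    n = 10
    assert 0 <= n < 10_000
    return n

sample, tries = rejection_query()
print(sample, tries)  # the sample is always 10; tries is typically in the thousands
print(deduce())       # 10, in constant time
```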
1.2)
(rejection-query
(define n (random-integer 10))
(define m (+ n 5))
n
(= m 0))
Here, rejection-query will hang forever, since the condition cannot be met. Again, the absence of the answer can be easily deduced (given some properties of +).
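One step of interval arithmetic is enough to detect the infeasibility, which is the kind of deduction meant here (a hedged Python sketch of the idea, not actual URE code):

```python
def interval_add(a, b):
    """Interval arithmetic for +: [lo1,hi1] + [lo2,hi2] = [lo1+lo2, hi1+hi2]."""
    return (a[0] + b[0], a[1] + b[1])

n_range = (0, 9)                          # n = (random-integer 10)
m_range = interval_add(n_range, (5, 5))   # m = n + 5  ->  [5, 14]

def condition_satisfiable(target, rng):
    """Can the condition (= m target) ever hold, given m's range?"""
    return rng[0] <= target <= rng[1]

print(m_range)                            # (5, 14)
print(condition_satisfiable(0, m_range))  # False: (= m 0) can never hold
```

A sampler discovers this only by exhausting its patience; the interval computation discovers it in one step.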
1.3) Einstein's (zebra) puzzle can be easily represented in Church
(define where list-index)
(define what list-ref)
(define (is? xs x ys y)
(eq? (what ys (where xs x)) y))
(define (neighbour? x y)
(or (= x (+ y 1))
(= x (- y 1))))
(rejection-query
(define colors (shuffle '(Yellow Blue Red Ivory Green)))
(define nationalities (shuffle '(Norwegian Ukrainian Englishman Spaniard Japanese)))
(define drinks (shuffle '(Water Tea Milk OrangeJuice Coffee)))
(define smokes (shuffle '(Kools Chesterfield OldGold LuckyStrike Parliament)))
(define pets (shuffle '(Fox Horse Snails Dog Zebra)))
(what nationalities (where drinks 'Water))
(and (is? nationalities 'Englishman colors 'Red)
(is? nationalities 'Spaniard pets 'Dog)
(is? drinks 'Coffee colors 'Green)
(is? nationalities 'Ukrainian drinks 'Tea)
(= (where colors 'Green) (+ (where colors 'Ivory) 1))
(is? smokes 'OldGold pets 'Snails)
(is? smokes 'Kools colors 'Yellow)
(= (where drinks 'Milk) 2)
(= (where nationalities 'Norwegian) 0)
(neighbour? (where smokes 'Chesterfield) (where pets 'Fox))
(neighbour? (where smokes 'Kools) (where pets 'Horse))
(is? smokes 'LuckyStrike drinks 'OrangeJuice)
(is? nationalities 'Japanese smokes 'Parliament)
(neighbour? (where nationalities 'Norwegian) (where colors 'Blue))
)
)
Again, blind search performed by PPLs is too inefficient here. OpenCog can solve this problem quite efficiently, and we can imagine that an equivalent probabilistic program is written in Atomese and URE deductively infers the answer starting from constraints without actually sampling random variables.
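For scale: a systematic search that checks constraints as early as possible (plain Python with itertools, not OpenCog code) solves the puzzle in a fraction of a second, while blind sampling over the 5!^5 ~ 2.5e10 joint assignments is hopeless:

```python
from itertools import permutations

def solve():
    """Zebra puzzle: assign each attribute a permutation of houses 0..4,
    pruning with each constraint as early as possible."""
    houses = range(5)
    def neighbour(a, b):
        return abs(a - b) == 1
    for (yellow, blue, red, ivory, green) in permutations(houses):
        if green != ivory + 1: continue                  # Green right of Ivory
        for (norwegian, ukrainian, english, spaniard, japanese) in permutations(houses):
            if english != red: continue                  # Englishman in Red
            if norwegian != 0: continue                  # Norwegian in first house
            if not neighbour(norwegian, blue): continue  # Norwegian next to Blue
            for (water, tea, milk, juice, coffee) in permutations(houses):
                if milk != 2: continue                   # Milk in the middle house
                if coffee != green: continue
                if ukrainian != tea: continue
                for (kools, chesterfield, oldgold, lucky, parliament) in permutations(houses):
                    if kools != yellow: continue
                    if lucky != juice: continue
                    if japanese != parliament: continue
                    for (fox, horse, snails, dog, zebra) in permutations(houses):
                        if spaniard != dog: continue
                        if oldgold != snails: continue
                        if not neighbour(chesterfield, fox): continue
                        if not neighbour(kools, horse): continue
                        nats = {norwegian: 'Norwegian', ukrainian: 'Ukrainian',
                                english: 'Englishman', spaniard: 'Spaniard',
                                japanese: 'Japanese'}
                        return nats[water], nats[zebra]

print(solve())  # ('Norwegian', 'Japanese')
```

This is still brute force, but ordering the constraint checks does the "deductive propagation" cheaply -- precisely the leverage one would want URE/Pattern Matcher to provide in general.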
We can see that from PPLs perspective, OpenCog can be used to make inference over probabilistic programs much more efficient (at least, in some cases). Why not write these programs in Atomese directly? For what reason do we need a PPL metaphor?
2. What is missing
Consider the following example.
2.1)
(rejection-query
(define x (gaussian 0 1))
(define y (gaussian 0 1))
x
(= (+ x y) 1))
It will actually not work in Church, because the strict condition will not be satisfied (but it can be modified with the use of soft equality). URE (given necessary axioms) can easily infer y = 1 - x. However, it will not be able to ground variables. Actually, these are not variables to be grounded using number nodes, but rather values should be assigned to them. Also, we shouldn’t just sample x and then calculate y = 1 - x, because different values of y have different prior probability. Thus, if we want to estimate posterior probabilities, we should add specific mechanisms of taking prior probabilities of inferred values into account. Of course, we will also need some basic random distributions.
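The "mechanism for taking prior probabilities of inferred values into account" is essentially importance weighting. A minimal sketch in Python, assuming the deduction y = 1 - x has already been made (this illustrates the statistics, not URE): sample x from its prior, deduce y, weight by the prior density of the deduced y, and the weighted posterior mean of x lands near the analytic value 0.5.

```python
import math, random

def normal_pdf(v, mu=0.0, sigma=1.0):
    """Density of the Gaussian prior."""
    return math.exp(-0.5 * ((v - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

rng = random.Random(42)
num = den = 0.0
for _ in range(200_000):
    x = rng.gauss(0.0, 1.0)   # sample x from its prior
    y = 1.0 - x               # y is deduced (y = 1 - x), not sampled
    w = normal_pdf(y)         # weight by the prior density of the deduced y
    num += w * x
    den += w

posterior_mean_x = num / den  # analytic answer: E[x | x + y = 1] = 0.5
print(posterior_mean_x)
```

Without the weighting step, every sampled x would count equally and the estimate would collapse to the prior mean 0, which is exactly the failure mode the paragraph above warns about.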
2.2) Fitting a polynomial with an unknown degree.
(define (calc-poly x ws)
(if (null? ws) 0
(+ (car ws) (* x (calc-poly x (cdr ws))))))
(define (calc-poly-noise x ws sigma)
(+ (calc-poly x ws) (gaussian 0 sigma)))
(define (generate xs ws sigma)
(map (lambda (x) (calc-poly-noise x ws sigma)) xs))
(define (sum-dif2 xs ys)
(if (null? xs) 0
(+ (* (- (car xs) (car ys)) (- (car xs) (car ys)))
(sum-dif2 (cdr xs) (cdr ys)))))
(define (samples xs ys)
(mh-query 10 10
(define degree (sample-integer 4))
(define ws (repeat (+ 1 degree) (lambda () (gaussian 0 3))))
(define sigma (gamma 1 2))
degree
(< (sum-dif2 (generate xs ws sigma) ys) 0.5)))
(define xs '(0 1 2 3))
(define ys (generate xs '(0.1 1 2) 0.01))
(hist (samples xs ys) "degree")
This program actually works and finds the correct degree of the polynomial using the Bayesian Occam's razor that is available “for free” in PPLs. Here, Atomese appears to be rather useless (constraints cannot be deductively propagated/pattern-matched, and there is no set of atoms to which variables can be grounded). Small modifications of the Pattern Matcher might be enough (e.g. introducing a RandomVariableNode and sampling its value from a given distribution instead of systematically enumerating all possible groundings), but some problems still remain (e.g. with marginalization, stochastic recursion, the inefficiency of blind guessing in comparison with metaheuristic search, etc.).
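For contrast, the same "which degree?" question can be answered by a penalized fit. Below is a pure-Python sketch using least squares plus BIC as a cheap stand-in for the Bayesian Occam's razor (the data-generating weights (0.1, 1, 2) mirror the example above; everything else, including the rescaled xs, is an assumption of this sketch):

```python
import math, random

def poly_eval(ws, x):
    return sum(w * x**i for i, w in enumerate(ws))

def fit_poly(xs, ys, degree):
    """Least squares via normal equations, Gaussian elimination with pivoting."""
    k = degree + 1
    A = [[sum(x**(i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x**i for x, y in zip(xs, ys)) for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    ws = [0.0] * k
    for r in range(k - 1, -1, -1):
        ws[r] = (b[r] - sum(A[r][c] * ws[c] for c in range(r + 1, k))) / A[r][r]
    return ws

rng = random.Random(0)
xs = [i / 3.0 for i in range(10)]  # rescaled to keep the fits well-conditioned
ys = [poly_eval([0.1, 1.0, 2.0], x) + rng.gauss(0, 0.01) for x in xs]

def bic(degree):
    """n*log(SSE/n) + k*log(n): fit quality plus a complexity penalty."""
    ws = fit_poly(xs, ys, degree)
    sse = sum((y - poly_eval(ws, x)) ** 2 for x, y in zip(xs, ys))
    n, kparams = len(xs), degree + 1
    return n * math.log(sse / n) + kparams * math.log(n)

best = min(range(5), key=bic)
print(best)  # typically 2: extra degrees don't pay their complexity penalty
```

This is of course not Bayesian inference; it is only meant to show what the "razor" is buying in the mh-query version above.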
-- Alexey
Oooof. I wrote a long diatribe about religion, until I realized you must be referring to something that Alonzo Church wrote or invented. I am not that familiar with his work, so I don't know what a "better Church" would be. Is there some specific paper or book?
I'm also interested in replacing the URE by a URE-II that avoids the use of variables, and replaces chaining by constraint satisfaction.
When you get a chance I would really like to understand more what you mean by that.
Alexey,
yes, that matches more or less the way I thought about OpenCog+PPL (though I didn't take the time to understand the Fitting Poly example). BTW, I gave a small related presentation a while ago
https://www.youtube.com/watch?v=CvUDMvMnFVc&t=933s
with some associated code
https://github.com/ngeiswei/opencog/commits/ppl
It doesn't go far at all, but who knows, it may help. In particular, the use of the to-be-implemented GDTV https://github.com/opencog/atomspace/issues/833 (which Roman Treutlien is giving a shot, BTW) could be useful.
Nil
Linas (and Nil),
thanks for the ValueOf link. It works just as we needed in our toy example with bounding boxes. This is the first small, but very important step. Pattern Matcher over values works! I see a lot of cool things that can be built on top of it. However, we need to move step-by-step. The next step is to have variable/unknown/random values in some generic way. Let me explain this with a number of examples.
1) In webPPL we can write
var isAlarm = function(burglary, earthquake) {
  if (burglary && flip(0.95)) return true
  if (earthquake && flip(0.3)) return true
  if (flip(0.001)) return true
  return false
}
var generate = function () {
  var burglary = flip(0.001)
  var earthquake = flip(0.002)
  var alarm = isAlarm(burglary, earthquake)
  return alarm
}
Infer({method: 'enumerate', model: generate})
(The same can be written in Church; I use webPPL here just for diversity.)
This means that burglary and earthquake are variables with attached distributions. In Atomese, they should be ConceptNodes whose truth values are defined as (very simple) distributions. One could just write something like (Concept "burglary" (stv 0.001 1)), or use the (not yet implemented) GDTV. But this is not precisely what we want! Because we might want to infer a posterior truth value of burglary given some observations. This posterior should not replace the prior probability/truth value, because we might want to use it in another inference with different conditions. Thus, flip(0.001) should be a part of the Atomspace. Flip should be a node (or link?), and 0.001 should be a value for this atom.
Similarly, we might want to define
var diceRoll = randomInteger(6)
randomInteger fits neither stv nor GDTV. It is a (prior) distribution over Values (not truth values in this case). Thus, to keep things general, we would like to be able to attach undefined/random values to atoms through some proxy atoms of a specific type. We should not directly implement flip or randomInteger as GroundedSchemaNode!!! This might work in simple cases, but we might want not to sample these values, but to infer them in a different way.
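The "posterior must not overwrite the prior" requirement can be mocked with plain key-value storage, in the spirit of Values. A hypothetical Python sketch (the keys, the ("flip", p) encoding, and the precomputed alarm probabilities are all assumptions of this sketch; the probabilities follow from a noisy-or reading of the alarm model above):

```python
# Toy mock: an "atom" is a dict of key -> value, like Atomese Values.
# The prior distribution is stored as data under one key and never
# overwritten; posteriors for different evidence get their own keys.
burglary = {"prior": ("flip", 0.001)}  # flip(0.001) kept as data, not sampled

# Precomputed from the noisy-or alarm model above:
# P(alarm | burglary) = 1 - 0.05 * (1 - 0.002*0.3) * (1 - 0.001) ~= 0.95008
# P(alarm)            ~= 0.0025479
def posterior_given_alarm(prior_p, p_alarm_given_b=0.95008, p_alarm=0.0025479):
    """Bayes' rule for P(burglary | alarm)."""
    return prior_p * p_alarm_given_b / p_alarm

_, p = burglary["prior"]
burglary["posterior|alarm"] = posterior_given_alarm(p)

print(burglary["prior"])            # the prior is intact: ('flip', 0.001)
print(burglary["posterior|alarm"])  # ~0.373
```

Because the prior survives under its own key, a later inference with different evidence (say, an earthquake report) can start from flip(0.001) again rather than from the alarm-conditioned posterior.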
2) In Tensorflow we can write
3) If/when you rewrite the above example in Atomese, you will find that it is much more verbose than webPPL. Again, that is because webPPL is designed for human programmers to express structure in human-readable, human-understandable ways. Atomese is not meant to be human-readable; it is meant to be machine-readable. So, the question really is: if you had the above function, written in Atomese, what are you going to do with it? What's the point? Why do you want to write it in Atomese? These are not rhetorical questions. I want Atomese, because I want to make it easy for Sophia to learn, viz, if two people are in the room, and the video camera shakes, and one of the people says "it's an earthquake!" and the other says "it's a burglary!", that she can associate video-camera shaking with burglaries and earthquakes. Or at least, alarms. I'm not particularly interested in having human programmers write isAlarm code.
but how to modulate the sampling-based rules to be efficient as well! These rules need to somehow recurse, they need to ask "is taking this sampling path gonna be fruitful?". That is, the AGI-recursion need to take place inside the rule itself. That is what I meant by using SampleLink as an inspiration (or more perhaps).
(define P-face (uniform 0 1))
(define face (flip P-face))
(define P-eyes-if-face (uniform 0 1))
(define P-eyes-if-noface (uniform 0 1))
(define eyes (flip (if face P-eyes-if-face P-eyes-if-noface)))
(define P-both-eyes-if-face (uniform 0 1))
(define P-both-eyes-if-noface (uniform 0 1))
(define P-both-eyes (if face P-both-eyes-if-face P-both-eyes-if-noface))
(define both-eyes (flip P-both-eyes))
(define left-eye (if eyes (if both-eyes #t (flip 0.5)) #f))
(define right-eye (if eyes (if both-eyes #t (not left-eye)) #f))
#…
(define appearance (repeat 20 (lambda () (gaussian 0 1))))
(define z (append (list left-eye right-eye nose … ) appearance))
(define image-generated (DNN z W))
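As a sanity check of the latent structure, the boolean core of this sketch can be mimicked in plain Python (a rough stand-in: the uniform-prior branches are collapsed where they are distribution-equivalent, and the DNN decoder is omitted):

```python
import random

def sample_face_latents(rng):
    """Mimics the boolean core of the Church sketch above (DNN decoder omitted)."""
    face = rng.random() < rng.random()       # (flip P-face), P-face ~ (uniform 0 1)
    p_eyes = rng.random()                    # if/else branches collapsed: both are
    eyes = rng.random() < p_eyes             # uniform priors, so same distribution
    both_eyes = rng.random() < rng.random()  # (flip P-both-eyes)
    left_eye = eyes and (both_eyes or rng.random() < 0.5)
    right_eye = eyes and (both_eyes or not left_eye)
    return dict(face=face, eyes=eyes, both_eyes=both_eyes,
                left_eye=left_eye, right_eye=right_eye)

rng = random.Random(1)
for _ in range(1000):
    z = sample_face_latents(rng)
    if not z["eyes"]:
        assert not z["left_eye"] and not z["right_eye"]
    elif z["both_eyes"]:
        assert z["left_eye"] and z["right_eye"]
    else:
        assert z["left_eye"] != z["right_eye"]  # exactly one eye visible
print("latent structure consistent")
```

These invariants (no eyes without eyes, both eyes iff both-eyes, otherwise exactly one) are precisely the structured relations between latent variables that the logical layer would want to receive from the vision model.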
Assume we have a Node "image", whose value is calculated by some DNN. How can we best represent this?
We would like to invoke this calculation using a ValueOf link applied to "image".
The DNN has a parameter, i.e. the value of some atom, whose calculation is invoked when we want to compute the DNN output.
I don't understand the question very well. I suppose you may want to store both parameters and outputs as values.
As for calculating the DNN output, I don't think Atomese alone, without resorting to grounded schemata, can do it ATM, as I don't see any atom link type value modifier (as opposed to ValueOf).
If your question is about StreamValue, I don't know much about it.
The following code example illustrates a possible representation in Atomese:
(define (test atom)
  (display "My func called with atom arguments") (newline)
  (display atom) (newline))

(cog-set-value! (Concept "argHolder") (Predicate "arg") (ConceptNode "value"))
(cog-set-value!
  (Concept "image")
  (Predicate "dnn")
  (EvaluationLink
    (GroundedPredicateNode "scm: test")
    (ListLink (ValueOf (Concept "argHolder") (Predicate "arg")))))

(cog-evaluate! (ValueOf (Concept "image") (Predicate "dnn")))
This code works, but there are things which don't work or are not natural:
1) argHolder holds an argument, and the argument cannot be a ProtoAtom at the moment. If any value ((RandomStream 1) for instance) is used instead of (ConceptNode "value") in the example above, then the Instantiator dynamically casts it to a Handle, the cast returns a null pointer, and the "test" function is called with an empty list of arguments.
2) There is ValueOfLink to get a value, but there is no atom to set a value. In the example above, (cog-set-value!) is used to set values on atoms.
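The missing setter can at least be prototyped outside the core. A hypothetical Python sketch of the pair of operations the example needs (a SetValue analogue next to a ValueOf analogue; every name here is invented):

```python
# Hypothetical sketch: a value store with both a getter (like ValueOfLink)
# and the missing setter the example above asks for. All names invented.
store = {}

def set_value(atom, key, value):
    """Analogue of a would-be SetValueLink / cog-set-value!."""
    store.setdefault(atom, {})[key] = value
    return value

def value_of(atom, key):
    """Analogue of ValueOfLink: fetch the value attached at a key."""
    return store[atom][key]

set_value("argHolder", "arg", "value")
# The "dnn" value is deferred: evaluating it pulls the current argument,
# mirroring the GroundedPredicateNode + ValueOf combination above.
set_value("image", "dnn", lambda: value_of("argHolder", "arg"))
print(value_of("image", "dnn")())  # value
```

The deferred lambda is the interesting part: re-setting "arg" later changes what evaluating "dnn" returns, which is the behavior the EvaluationLink wrapping is trying to achieve.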