my critiques open for comment is below the section of Dreyfus' paper that I
am quoting. I highlight the particular words at issue in his paragraph(s)
by enclosing them between "***" characters. I also include any citations in
the quoted section of the paper at the end of this post.
I seek (intelligent and informed) technical/theoretical/philosophical
critique or feedback of my comments from anyone on the issue(s)
presented/raised.
See top of page 18:
VIII. Walter Freeman's Merleau-Pontian Neurodynamics
We have seen that our experience of the everyday world (not the universe) is
given as already organized in terms of significance and relevance, and that
significance can't be constructed by giving meaning to brute facts -- both
because we don't normally experience brute facts and, even if we did, no
value predicate could do the job of giving them situational significance.
Yet, all that the organism can receive is mere physical energy. How can
such senseless physical stimulation be experienced directly as significant?
All generally accepted neuro-models fail to help, even when they talk of
dynamic coupling, since they still accept the basic Cartesian model, viz.:
1. The brain receives input from the universe by way of its sense organs
(the picture on the retina, the vibrations in the cochlea, the odorant
particles in the nasal passages, etc.).
2. Out of this stimulus information, the brain abstracts features, which it
uses to construct a representation of the world.
This is supposedly accomplished either (a) by applying rules such as the
frames and scripts of GOFAI, - an approach that is generally acknowledged to
have failed to solve the frame problem. Or (b) by strengthening or
weakening weights on connections between simulated neurons in a simulated
neural network depending on the success or failure of the net's output as
defined by the net designer. ***Significance is thus added from outside
since the net is not seeking anything. This approach does not even try to
capture the animal's way of actively determining the significance of the
stimulus on the basis of its past experience and its current
arousal***[asb1] .
***Both these approaches treat the computer or brain as a passive receiver
of bits of meaningless data, which then have to have significance added to
them***[asb2] . The big problem for the traditional neuro-science approach
is, then, to understand how the brain binds the relevant features to each
other. That is, the problem for normal neuro-science is how to pick out and
relate features relevant to each other from among all the independent,
isolated features picked up by each of the independent isolated receptors.
For example, is the redness that has just been detected relevant to the
square or the circle shape also detected in the current input? This problem
is the neural version of the frame problem in AI: How can the brain keep
track of which facts in its representation of the current world are relevant
to which other facts? ***Like the frame problem, as long as the mind/brain
is thought of as passively receiving meaningless inputs that need to have
significance and relevance added to them, the binding problem has remained
unsolved and is almost certainly unsolvable.***[asb3] Somehow the
phenomenologist's description of how the ***active organism has direct
access to significance ***[asb4] must be built into the neuroscientific
model.
--------------------------------------------------------------------------------
MY CRITIQUES indexed by my initials "ASB" followed by the number of my
comment above:
[asb1]OK, so you are aware of NN. However, this is not necessarily the
case. What you call "significance" is simply an error or fitness function.
Any learning system must have such corrective feedback. The question is simply
a matter of what reference (goal) is used to determine the error/fitness
for training the NN. This could be self-organized based on tagged training
examples to automatically learn categories of observed objects, for example,
or another system could (in principle) present the error function based on
past experience or any other expectation-generating or control-loop method.
I think the real problem with NN in this context is that the failure of
knowing how to implement a rich hierarchical NN system requires the NN to be
controlled and used by standard modular/representational systems that must
provide the error function in the same modular/representational way, which is
disconnected from global context and must be provided by humans, who
design and use the NN (almost strictly) as a module. Thus, you have a
de facto modular, representational system embodied (unnaturally) as a NN in
the black box of your dynamically coupled layer. Yet, this does not have to
be the case as you contend. It is just that they are forcing the NN unnaturally
into this paradigm. If they knew how to build a hierarchical NN system then
the modular, representational constraint (I am sure) would largely disappear
and it would behave much as you say the Heideggerian AI should. Again,
what is lacking is not the philosophy, but how to do it. Keep in mind, I am
not saying that the hierarchical NN architectures that are prevalent are in the
right direction (in my opinion), just that they could, in principle, address
the issue you raise as a prohibiting failure of common AI approaches.
[asb2]For example, under a standard hierarchical NN design, you could have
a self-organizing NN that learns to automatically separate (cluster)
patterns (say faces) into certain groups without a human generated error
function. However, what do the clusters mean? Then higher (abstract) NN
levels would presumably couple those clusters to experiences that are closer
to what we would call meaning. I am not saying they know how to do this or
that this is the right (best) architecture to do it in, just that,
philosophically, significance could be automatically created in a fully
hierarchical NN system. They just cannot make such systems converge with a
finite number of training examples, let alone with very few, as humans can.
Nevertheless, this is an engineering roadblock, not a philosophical one. Can
you argue logically otherwise? If not, then a deeper philosophy is needed to
contribute to what ails modern AI.
[asb3]Is this really the problem? I mean, you can implement a machine
learning system that learns gravity (e.g. a ballistic targeting system)
without a human adding any significance to give the learned projectile
correction any meaning. It would just, in a dynamically coupled sort of way,
be ready-at-hand to correctly position the gun to hit the distant target
after it learned the corrections. We would call such corrections (i.e.,
give them meaning) gravity, friction, wind, etc., but that does not matter
to the task at hand. In this context I do not think the binding problem is
a relational database problem as you put it. That is, I would not phrase
the problem as you do, in terms of which fact relates to which other fact.
Instead, I think it is more about how observations can be abstracted to apply
to different contexts and how to generate behaviors that instantiate the
contextual abstractions depending upon the proper context. In this way,
how one fact relates to another fact is but a small aspect of the problem as I
see it. So, in a trivial example, if you could give a unified, abstract
"definition" that characterizes gravity in the different experienced
situations it manifests itself in, and that definition has coherent,
predictive value in the situations where it is used, then you, in large part,
have meaning. In some cases, even a dumb genetic algorithm (GA) can discover
such contextual meanings in principle (however, in practice they would have
to structure the domain a great deal due to their lack of imagination).
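As a rough illustration of the ballistic targeting point in [asb3], the sketch below learns an aim correction purely from miss-distance feedback. It is a toy under stated assumptions: the function names, the one-dimensional "world", and all numbers are hypothetical and do not come from any real targeting system. Nothing in the loop labels the learned value as "gravity" or "wind"; that reading is applied afterwards by an observer.

```python
def fire(aim_height, hidden_drop):
    """Toy world: the shot lands lower than aimed by an amount the learner never sees."""
    return aim_height - hidden_drop


def learn_correction(target_height, hidden_drop, shots=50, learning_rate=0.5):
    """Adjust an aim correction using only the observed miss distance."""
    correction = 0.0
    for _ in range(shots):
        landing = fire(target_height + correction, hidden_drop)
        miss = target_height - landing       # positive when the shot fell short
        correction += learning_rate * miss   # nudge the aim toward the target
    return correction


# Hypothetical numbers: target at height 10.0, unseen drop of 2.5.
correction = learn_correction(target_height=10.0, hidden_drop=2.5)
print(round(correction, 3))
```

The learned correction ends up compensating for the hidden drop, yet the only "significance" the loop ever sees is the miss distance, which is the error function under discussion.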
[asb4]How do you figure this is the case? I think you are being far too
broad, and misleading, by saying "active organisms". Are you including
bacteria? Insects? That kind of built-in significance is largely genetic,
thus evolved, so a GA discovered it (as I mentioned above). If you meant to
go this broad then it seems you are reducing human cognition to an evolved
multiplicity of predetermined engrams of significance that is not much
different from highly optimized rules in a fuzzy contextual architecture.
Otherwise, it would seem much more intellectually accurate to say something
like "human" or "primate" (or specify the genus) organisms that characterize
this kind of "significance" you mean to say is missing in current models.
Significance is not at all the same at each creature's behavioral level.
Cheers!
Ariel B.-
CITATIONS MADE IN THE ABOVE QUOTED SECTION OF THE PAPER:
Yes, I think by definition significance must always be added from
"outside".
We do not rationally decide that food is significant to us. Not getting
enough food hurts - and we respond to pain by trying to stop it. This fact
was added "from outside" of the learning brain by hardware built into us
that detects the condition of low energy level (however hunger is sensed in a
human - I don't even know) and creates an "outside" motivation signal
for the learning network in our brain. All the physical body events that
create pain and pleasure are hard-wired outside measures of significance
which the learning brain must then work with to assign significance to
different behaviors.
Behaviors that are good at stopping hunger become significant, behaviors
that create hunger become negatively significant.
> ***Both these approaches treat the computer or brain as a passive
> receiver of bits of meaningless data, which then have to have
> significance added to them***[asb2] .
Yes, as I've talked about a few times in these threads, I think AI will be
solved by the creation of a reinforcement trained neural network of some
type. In such an approach, significance must be associated with the action
of all the nodes in the network. The weights of the network can actually
turn out to be a measure of the significance.
> The big problem for the
> traditional neuro-science approach is, then, to understand how the brain
> binds the relevant features to each other. That is, the problem for
> normal neuro-science is how to pick out and relate features relevant to
> each other from among all the independent, isolated features picked up by
> each of the independent isolated receptors. For example, is the redness
> that has just been detected relevant to the square or the circle shape
> also detected in the current input? This problem is the neural version
> of the frame problem in AI: How can the brain keep track of which facts
> in its representation of the current world are relevant to which other
> facts? ***Like the frame problem, as long as the mind/brain is thought
> of as passively receiving meaningless inputs that need to have
> significance and relevance added to them, the binding problem has
> remained unsolved and is almost certainly unsolvable.***[asb3] Somehow
> the phenomenologist's description of how the ***active organism has
> direct access to significance ***[asb4] must be built into the
> neuroscientific model.
The binding problem is solved like this....
The "dog leg" signal must be combined with the "dog face" signal and the
"dog bark" signal to form the "dog" signal. The simple act of creating a
generic "dog" signal from these three (and 1000 other dog feature signals)
is the solution to the binding problem. That is, if you use an AND or OR
function or some type of summation of feature weights - it makes no
difference - as long as you create that internal single signal which
represents "dog" then the system has solved the binding problem.
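A minimal sketch of the kind of unit described above: one thresholded weighted sum that behaves like AND or like OR depending on the threshold. The feature names, weights, and thresholds are illustrative assumptions, and the sketch only shows how several feature signals can be merged into one "dog" signal; it makes no claim about how a brain does it.

```python
def dog_unit(features, weights, threshold):
    """Weighted sum of feature signals, thresholded into a single 'dog' signal."""
    total = sum(weights[name] * value for name, value in features.items())
    return 1 if total >= threshold else 0


# Hypothetical equal weights for three feature detectors.
weights = {"dog_leg": 1.0, "dog_face": 1.0, "dog_bark": 1.0}

all_present = {"dog_leg": 1, "dog_face": 1, "dog_bark": 1}
face_only = {"dog_leg": 0, "dog_face": 1, "dog_bark": 0}

# A threshold equal to the number of features makes the unit AND-like.
and_like = dog_unit(all_present, weights, threshold=3.0)
# A threshold of one active feature makes the same unit OR-like.
or_like = dog_unit(face_only, weights, threshold=1.0)

print(and_like, or_like)
```

With the AND-like threshold the unit stays silent for `face_only`; with the OR-like threshold any single feature is enough to raise the "dog" signal.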
Or more specifically, if the network has the ability to take raw sensory
signals, like pixel data, and drive the audio output and correctly speak
the word "dog" when a dog is in the vision, and speak the word "cat" when a
cat is seen, then the network has solved the binding problem.
So to talk about the example above, if there is a "red" signal internally,
and there's also a "cube" signal, and we want the system to learn to speak
the word "rcube" then it must merge these signals together to drive the
production of the "rcube" behavior. The simple way to understand is to
assume it generates a single internal "rcube" signal, but in fact, it most
likely doesn't need to do that. It can be represented by a cluster of
signals that in the end, are transformed to whatever output signals are
needed to produce the "rcube" behavior.
Now, how to build a network which learns to create the correct internal
signals to allow this to happen is not so easy to understand, but the fact
that it could happen is easy to understand, because with limited toy
examples, we can easily hand-design networks to solve binding problems like
this (and in fact we do it all the time when we design electronic logic
circuits).
But, as I've also talked about in other messages recently, I believe a key
point here is that we need a network learning algorithm that transforms the
raw sensory data into signals with lower correlations between signals -
which also performs an information maximizing function. This type of
translation correctly binds signals together because of their temporal
correlations. The dog-leg signal tends to correlate with the dog-face
signal (they tend to be active at the same time more so than the dog-leg
and tree-leaf signals for example). So the generic "dog" signal gets
created as an extraction of the correlation that exists between all the low
level dog feature signals.
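The correlation argument can be sketched with a few made-up activity traces. Everything here is an assumption for illustration: the traces are hand-written, and plain Pearson correlation stands in for whatever statistic a real learning rule would use.

```python
from itertools import combinations

# Hypothetical activity traces: 1 means the feature detector was active at that time step.
signals = {
    "dog_leg":   [1, 1, 0, 0, 1, 0, 1, 0],
    "dog_face":  [1, 1, 0, 0, 1, 0, 1, 0],
    "tree_leaf": [0, 1, 1, 0, 0, 1, 0, 1],
}


def correlation(x, y):
    """Pearson correlation of two equal-length activity traces."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Score every pair of signals and pick the most strongly correlated one.
pairs = {(a, b): correlation(signals[a], signals[b])
         for a, b in combinations(signals, 2)}
best_pair = max(pairs, key=pairs.get)
print(best_pair, pairs[best_pair])
```

In this toy data the two dog features co-occur at every step, so they are the pair a correlation-driven rule would bind first, while the leaf signal is negatively correlated with both.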
So the primary binding problem will be solved by statistics applied to the
sensory data.
But the second binding problem is the one associated with significance.
That is, two signals will also need to be bound together in order to produce
the same output because of significance. If "stay away" behavior is the
correct behavior for responding to spiders and to snakes, it's not because
spiders and snakes are statistically correlated sensory events. It's
because they are both predictors of pain. If two different sensory events
must trigger a similar reaction, then the signals that represent those
sensory events must be physically bound together in the signal processing
of the network in order to drive the production of the single common
reaction. And that's the same binding issue as before, but this time,
solved by training associated with significance.
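A small sketch of this second, significance-driven binding, assuming a tabular value estimate and a hard-wired reward function. The stimuli, actions, reward values, and the deterministic alternation of actions are all hypothetical choices made to keep the example short; it is not a claim about how such a network would actually be built.

```python
ACTIONS = ["approach", "stay_away"]


def reward(stimulus, action):
    """Hypothetical hard-wired reward: approaching a spider or snake hurts."""
    if action == "approach":
        return -1.0 if stimulus in ("spider", "snake") else 1.0
    return 0.0


def train(stimuli, episodes=20, learning_rate=0.5):
    """Learn a value for each (stimulus, action) pair from reward alone."""
    value = {(s, a): 0.0 for s in stimuli for a in ACTIONS}
    for episode in range(episodes):
        action = ACTIONS[episode % len(ACTIONS)]  # try each action in turn
        for s in stimuli:
            value[(s, action)] += learning_rate * (reward(s, action) - value[(s, action)])
    return value


def policy(value, stimulus):
    """Pick the action with the highest learned value for this stimulus."""
    return max(ACTIONS, key=lambda a: value[(stimulus, a)])


value = train(["spider", "snake", "flower"])
print({s: policy(value, s) for s in ["spider", "snake", "flower"]})
```

Spider and snake end up driving the same "stay_away" output even though nothing in the sketch relates them as sensory events; only the shared reward outcome ties them together.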
How I suspect this will work is that we need to create a network that is
self organizing based on statistical correlation of sensory signals. That
gives us a network that correctly binds signals together to parse the
environment into objects or things. We then apply reinforcement learning
to that network in a way that allows it to re-shape the default binding
created by the statistical process in order to shape behavior in response
to the external definition of significance (the hardware which generates
the reward signal).
The implementation details of such a net are, when found, the solution to
AI in my view.
> --------------------------------------------------------------------------------
>
> MY CRITIQUES indexed by my initials "ASB" followed by the number of my
> comment above:
>
> [asb1]OK, so you are aware of NN. However, this is not necessarily the
> case. What you call "significance" is simply an error or fitness
> function. Any learning system must have such corrective feedback. The
> question is simply a matter of what reference (goal) is used to
> determine the error/fitness for training the NN. This could be
> self-organized based on tagged training examples to automatically learn
> categories of observed object, for example, or another system could (in
> principle) present the error function based on past experience or any
> other expectation generating or control loop method. I think the real
> problem with NN in this context is that the failure of knowing how to
> implement a rich hierarchical NN system requires the NN to be controlled
> and used by standard modular/representational systems that must provide
> the error function in the same modular/representational way which is
> disconnected from global context and must be provided by humans, who
> design and use the NN (almost strictly) as a module.
I can't make heads or tails out of what you are trying to say there.
> Thus, you have a
> de facto modular, representational system embodied (unnaturally) as a NN
> in the black box of your dynamically coupled layer. Yet, this does not
> have to be the case as you contend. It is just they are forcing the NN
> unnaturally into this paradigm. If they knew how to build a hierarchical
> NN system then the modular, representational constraint (I am sure) would
> largely disappear and it would behave much as you say the Heideggerian
> AI should. Again, what is lacking is not the philosophy, but how to do
> it.
Yes, I think I agree with that point completely. The fact that someone has
not yet built a NN that solves this is not proof that it can't be done
with a NN.
> Keep in mind, I am not saying that hierarchical NN architectures
> that are prevalent are in the right direction (in my opinion) just that
> they could, in principle, address the issue you raise as a prohibiting
> failure of common AI approaches.
Yes, if I understand correctly what you are saying, I agree.
> [asb2]For example, under a standard hierarchical NN design, you could
> have a self-organizing NN that learns to automatically separate (cluster)
> patterns (say faces) into certain groups without a human generated error
> function. However, what do the clusters mean?
They mean "face" of course! :)
The cluster would automatically show up in the network without a human
"making it happen" because it means that "face" features statistically are
predictive of each other. That is, the fact an eye pattern is in the
sensory field is predictive of a nose pattern and a chin pattern. It's
this temporal correlation (face-features tend to show up at the same time)
which drives the network to automatically produce the "face" signal.
> Then higher (abstract) NN
> levels would presumably couple those clusters to experiences that are
> closer to what we would call meaning.
The meaning of the signal is the events in the environment which caused the
signal. A face signal means there is a face in the environment.
This "meaning" doesn't exist only in the higher levels, it exists at all
levels.
And the point of all this clustering is to ultimately cluster the signals
to drive the system's outputs. So if there is a "raise arm" output, and
with training the network wires itself to raise its arm in response to any
face, then the face signal takes on two meanings in this network. It means
both "face" (looking backwards in the network to see what caused it), and
it means "raise arm" (looking forward in the network to see what the signal
will cause to happen).
Throughout the entire network, every signal has a complex dual meaning like
this. Though in a large complex network, it would be hard to describe the
meaning of every signal with words, there would still clearly be a meaning
for each signal.
> I am not saying they know how to
> do this or that this is the right (best) architecture to do it in- just
> that philosophically significance could be automatically created in a
> fully hierarchical NN system they just cannot make them converge with a
> finite number of training examples, let alone very few like humans.
Yes, exactly. It's a question of how we make it work. The fact that we
have not made it work is not proof that it cannot work like that. (Actually, I
have networks doing the type of thing I talk about here for simple problems,
so I can say it can work to some extent already - my only question is how
to make it work as well as the brain works - not whether it might be
possible at all.)
> Nevertheless, this is an engineering roadblock, not a philosophical
> one. Can you argue logically otherwise? If not, then a deeper
> philosophy is needed to contribute to what ails modern AI.
>
> [asb3]is this really the problem? I mean, you can implement a machine
> learning system that learns gravity (e.g. a ballistic targeting system)
> without a human adding any significance to give the learned projectile
> correction any meaning. It just, in a dynamically coupled sort of way
> would be ready-at-hand to correctly position the gun to hit the distant
> target after it learned the corrections.
Yes, but you could still look at the signals in such a system and talk
about their meaning.
> We would call such corrections
> (i.e., give them meaning) gravity, friction, wind, etc., but that does
> not matter to the task at hand. In this context I do not think the
> binding problem is a relational database problem as you put it. That
> is, I would not phrase the problem as you say about what fact relates to
> what other fact. Instead, I think it is more about how observations can
> be abstracted to apply to different contexts and how to generate
> behaviors that instantiate the contextual abstractions depending upon the
> proper context. In this way, how one fact relates to another fact is
> but a small aspect of the problem as I see it.
The only way two facts (aka two internal signals) relate to each other is
if they need to be combined in some function to control a single output
behavior. And if they do, then when the network learns to correctly
produce the behavior, it will have correctly learned to combine the facts
as required.
This use of the word "fact" however bothers me. This is because we often
talk about facts such as "1+1=2" is a "fact". But the way such a fact is
implemented in a learning network is as a series of different learned
behaviors. For example, you send to the network the words: "what is 1+1?"
and the network responds: "2". So this "fact" is represented as a learned
behavior.
Two separate learned behaviors generally don't have to be combined. If you
train the system to respond to "what is your name" and it responds "Hal",
and you train it to respond to "what is 1 plus 1" as "2" there is little
need for the system to combine these two behaviors in any sense.
A whole layer of complexity in human behavior comes from the fact that, in
addition to being able to interact with an environment (walk without falling
down, pick up food and eat it, make a sandwich), we also have this large and
complex set of language behaviors. We produce language in complex ways in
response to what happens around us, and we respond to language in complex
ways - and we even respond to our own words - by performing actions or by
producing more language.
Because we have a brain large enough to learn a very large set of different
language behaviors (both production and reception), our behavior ends up
following complex paths all driven by these language events.
Sometimes, when people talk about "conscious" understanding, they seem to
be talking about events which we translate into language concepts in our
mind and events which we respond to by generating language which then
guides our actions.
Such as, I see a dog and produce the language-behaviors in my brain: "that
looks like Bob's dog - I should tell Bob his dog is out". And in response
to that, I call Bob on my cell phone.
I still see this as nothing but a large and complex sequence of behaviors
the brain has produced in response to the sensory experience of seeing the
dog, all driven by low level behavior selection hardware, but some will say
we had conscious awareness that we saw Bob's dog, seemingly because we
produced those language behaviors in us.
When people talk about "combining facts" I sometimes wonder if they are
talking about our language behaviors of combining facts, which is a much
higher level behavior problem than the combining of "red" with "cube" to
produce a single reaction to red cubes instead of to blue cubes.
For example, if I train a dog to fetch the red ball on command there is a
different level of behavior taking place than when I train a human to
understand English and then ask them, "Can you get the ball which is the
same color as the setting sun and bring it to me?".
The dog only has to learn a few fairly simple behaviors to do that task
correctly, whereas the human has to learn a large set of behaviors in order
to respond to a complex sentence like that. But in either case, I believe
it's still the work of a simple underlying network that can be trained by
reinforcement to respond to the current context of the environment with the
best behavior for that context (the action which has the highest
significance in that context).
> So, in a trivial example, if
> you could give a unified, abstract "definition" that characterizes
> gravity in the different experienced situations it manifests itself in
> and that definition has coherent, predictive value to situations when
> used, then you, in large part, have meaning. In some cases, even a dumb
> genetic algorithm (GA) can discover such contextual meanings in principle
> (however, in practice they would have to structure the domain a great
> deal due to their lack of imagination)
>
> [asb4]how do you figure this is the case? I think you are going way too
> broad and misleading by saying "active organisms". Are you including
> bacteria? Insects? That kind of built in significance is largely
> genetic, thus evolved so a GA discovered it (as I mentioned above). If
> you meant to go this broad then it seems you are reducing human cognition
> to an evolved multiplicity of predetermined engrams of significance that
> is not much different than highly optimized rules in a fuzzy contextual
> architecture. Otherwise, it would seem much more intellectually accurate
> to say something like "human" or "primate" (or specify the genus)
> organisms that characterize this kind of "significance" you mean to say
> is missing in current models. Significance is not at all the same at each
> creature's behavioral level.
--
Curt Welch http://CurtWelch.Com/
cu...@kcwc.com http://NewsReader.Com/
Please publish and claim your Nobel!
BWAHAHAHAHAHA!
Your so-called solution is so superficial as to beg so many questions
that it is no solution at all. This is also true of every one of your
so-called solutions to other problems. This pattern of yours, of
ignoring (or being ignorant of) so many details science has uncovered,
and philosophy has explicated, shows that you are not serious about
these issues.
>
> The "dog leg" signal must be combined with the "dog face" signal and the
> "dog bark" signal to form the "dog" signal. The simple act of creating a
> generic "dog" signal from these three (and 1000 other dog feature signals)
> is the solution to the binding problem. That is, if you use an AND or OR
> function or some type of summation of feature weights - it makes no
> difference - as long as you create that internal single signal which
> represents "dog" then the system has solved the binding problem.
So let's see here. If I take a visual signal (whatever that means -
perhaps something coming from V1 or MT or wherever in the parallel-
processed myriad of signals that account for different features of the
"visualization"), and then perform an add of that (and you stupidly say
that doing so on a monolithic signal *or* feature weights - it makes
no difference - how careless) and the signal from the auditory cortex
(which one - do you flip a multifaceted coin or die?), then somehow you
get a representation of the dog (which is a conceptual thing in the mind).
So maybe if I combine my garage with an apple (they are both
interacting particles after all - just like dog signals are all
signals in the brain), then I would get something I can put my Apple
computer into, right - a backpack.
BWHAHAHAHAHA!
>
> Or more specifically, if the network has the ability to take raw sensory
> signals, like pixel data, and drive the audio output and correctly speak
> the word "dog" when a dog is in the vision, and speak the word "cat" when a
> cat is seen, then the network has solved the binding problem.
Nonsense!
>
> So to talk about the example above, if there is a "red" signal internally,
> and there's also a "cube" signal, and we want the system to learn to speak
> the word "rcube" then it must merge these signals together to drive the
> production of the "rcube" behavior. The simple way to understand is to
> assume it generates a single internal "rcube" signal, but in fact, it most
> likely doesn't need to do that. It can be represented by a cluster of
> signals that in the end, are transformed to whatever output signals are
> needed to produce the "rcube" behavior.
>
> Now, how to build a network which learns to create the correct internal
> signals to allow this to happen is not so easy to understand, but the fact
> that it could happen is easy to understand, because with limited toy
> examples, we can easily hand-design networks to solve binding problems like
> this (and in fact we do it all the time when we design electronic logic
> circuits).
That is NOT the same as the binding problem! Come out of your cave
and do some reading, will ya!
How is an object like an apple - represented as parsed signals? What
is doing the parsing?
> We then apply reinforcement learning
> to that network in a way that allows it to re-shape the default binding
> created by the statistical process in order to shape behavior in response
> to the external definition of significance (the hardware which generates
> the reward signal).
>
> The implementation details of such a net are, when found, the solution to
> AI in my view.
Perhaps when you come out of your cave you can find more fodder for
your claims and then publish them and claim your Nobel!
> The binding problem is solved like this....
See http://nn.cs.utexas.edu/computationalmaps/demos/13.10.php for a
demonstration of how the PGLISSOM model solves the binding problem. The
movie takes about 1 minute to load. Set the speed to 40 ms and watch the
two groups. First they fire all together, but around iteration step 110,
the two groups become desynchronized.
> The implementation details of such a net are, when found, the solution
> to AI in my view.
I don't think that these LISSOM models will solve AI. They are for
educational purposes only. As pure feed-forward models they lack
top-down connections. But top-down connections are necessary for
attention, and attention is necessary for handling multiple objects. All
they can do at the moment is provide another solution for
handwritten digit recognition. Yawn.
--
http://home.arcor.de/w.lorenz65/mlbench
No mercy for the cheaters in machine learning!
That is not a solution to the binding problem in the brain!
> The
> movie takes about 1 minute to load. Set the speed to 40 ms and watch the
> two groups. First they fire all together, but around iteration step 110,
> the two groups become desynchronized.
>
> > The implementation details of such a net are, when found, the solution
> > to AI in my view.
>
> I don't think that these LISSOM models will solve AI. They are for
> educational purposes only. As pure feed-forward models they lack
> top-down connections. But top-down connections are necessary for
> attention, and attention is necessary for handling multiple objects. All
> they can do at the moment is provide another solution for
> handwritten digit recognition. Yawn.
Indeed!
>
> --http://home.arcor.de/w.lorenz65/mlbench
I just now got around to looking at that demo.
That's a very neat idea of binding by synchronization. It allows the
network to represent multiple bindings at the same time by their temporal
synchronization patterns. It's a binding idea I've never thought of.
> > The implementation details of such a net are, when found, the solution
> > to AI in my view.
>
> I don't think that these LISSOM models will solve AI. They are for
> educational purposes only. As pure feed-forward models they lack
> top-down connections. But top-down connections are necessary for
> attention, and attention is necessary for handling multiple objects. All
> they can do at the moment is just provide another solution for
> handwritten digit recognition. Yawn.
Well, the demo seems to be a purely spatial pattern recognition model and
AI requires temporal as well as spatial pattern recognition. So at a
minimum, if they have not already done so, the technique has to be extended
to identify patterns in the temporal domain (instead of just coding linear
spatial patterns into the temporal domain, as that one demo does).
I'll have to see if I can find more on those models. They look very
interesting and I hadn't heard about them.
I'm curious. Why do you think top down connections are necessary for
attention?
> Well, the demo seems to be a purely spatial pattern recognition model and
> AI requires temporal as well as spatial pattern recognition.
In their LISSOM model they used delayed neurons in the LGN layer to
recognize motion direction and speed. They said that lagged neurons have
been found in cat LGN.
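
As a rough illustration of why a delay helps, here is a generic
delay-and-correlate sketch. It is not the LISSOM implementation; the
function names, the 1-D "retina" and the one-step delay are assumptions
made only for this example. A delayed copy of the signal at one position is
multiplied with the current signal at the neighbouring position, so the
product is large only when the stimulus moved between them in the right
direction.

```python
def moving_dot(width, start, velocity, steps):
    """Frames of a single active pixel moving along a 1-D row of inputs."""
    frames = []
    for t in range(steps):
        frame = [0.0] * width
        frame[start + velocity * t] = 1.0
        frames.append(frame)
    return frames


def direction_response(frames, delay=1):
    """Rightward minus leftward evidence from delayed neighbour correlation.

    Positive means rightward motion, negative leftward, zero no motion.
    """
    right = left = 0.0
    for t in range(delay, len(frames)):
        past, now = frames[t - delay], frames[t]
        for i in range(len(now) - 1):
            right += past[i] * now[i + 1]   # was at i, is now at i+1
            left += past[i + 1] * now[i]    # was at i+1, is now at i
    return right - left


print(direction_response(moving_dot(8, 0, 1, 5)))   # 4.0  (rightward)
print(direction_response(moving_dot(8, 7, -1, 5)))  # -4.0 (leftward)
print(direction_response(moving_dot(8, 3, 0, 5)))   # 0.0  (stationary)
```

Changing `delay` tunes the detector to a different speed, since the
correlation only peaks when the stimulus covers one position per delay
interval.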
> I'm curious. Why do you think top down connections are necessary for
> attention?
Because it has been measured in monkey cortex. I think it was V4, where
RF sizes were large enough to hold both a horizontal and a vertical bar.
If a measured neuron was tuned 0.8 to H and 0.2 to V, and both signals
were present, and the monkey was not attending, then the net result was
about 0.5. If the task required attending to the H bar and ignoring the
V bar, the activation of the neuron was about 0.7. If it required
attending to V, it was about 0.3.
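
One way to read those numbers (my own back-of-the-envelope reading, not
something taken from the paper) is as a weighted average of the two
single-stimulus responses, with attention shifting the weight. The
weighted-average form and the function names below are assumptions; only
the values 0.8, 0.2, 0.5, 0.7 and 0.3 come from what I remember of the
experiment.

```python
def mixed_response(pref_h, pref_v, weight_h):
    """Response to both bars as a weighted average of the single-bar responses."""
    return weight_h * pref_h + (1.0 - weight_h) * pref_v


def implied_weight(response, pref_h, pref_v):
    """Solve the weighted average for the weight placed on the H bar."""
    return (response - pref_v) / (pref_h - pref_v)


PREF_H, PREF_V = 0.8, 0.2  # tuning of the measured neuron to H and V
weights = {}
for label, observed in [("not attending", 0.5),
                        ("attend H", 0.7),
                        ("attend V", 0.3)]:
    weights[label] = implied_weight(observed, PREF_H, PREF_V)
    print(f"{label}: weight on H = {weights[label]:.2f}")
```

Under this reading the unattended case is an even split (weight 0.50), and
attention moves the weight to about 0.83 toward the attended bar, or 0.17
away from it, without changing the stimulus at all.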
They said somewhere else that in the orientation channel, attention can
amplify the result by a factor of 3. In the luminance channel it's only 20%.
Unfortunately I don't remember the document title or link, because there
are so many documents out there and I can read only 300 of them per year.
> > I'm curious. Why do you think top down connections are necessary for
> > attention?
>
> Because it has been measured in monkey cortex.
Interesting.
> I think it was V4, where
> RF sizes were large enough to hold both a horizontal and a vertical bar.
> If a measured neuron was tuned 0.8 to H and 0.2 to V, and both signals
> were present, and the monkey was not attending, then the net result was
> about 0.5. If the task required attending to the H bar and ignoring the
> V bar, the activation of the neuron was about 0.7. If it required
> attending to V, it was about 0.3.
Can you tell me a bit more about how this was tested (if you remember)? What
sort of task was the monkey doing? Was the "V bar" an image on a screen,
a picture on a card, or some physical bar?
I'm just wondering if there might be some possible explanation of the
activity other than the obvious idea of high level feedback in the network
itself.
> They said somewhere else that in the orientation channel, attention can
> amplify the result by a factor of 3. In the luminance channel it's only
> 20%.
>
> Unfortunately I don't remember the document title or link, because there
> are so many documents out there and I can read only 300 of them per year.
> Can you give me a bit more of how this was tested (if you remember)?
Sorry, I don't remember the details of that experiment, but Christof
Koch has an overview about attention in his 2004 paper "Selective Visual
Attention and Computational Models":
3.2 Single cell studies
Single cell studies in awake, behaving monkeys show that the responses
of individual neurons can be modulated by visual attention. Recent work
has shown that cells in almost all visual areas, including V1, can be
modulated by attention. This applies in particular to cells in
posterior parietal cortex, the final predominantly visual area in the
occipito-parietal stream, and neurons in area V4 and IT (inferotemporal)
cortex along the occipito-temporal pathway, can be modulated by
attentional factors (reviewed in Colby, 1991). Moran and Desimone
(1985) found that if two different objects, say a red and a green bar,
are both located within the receptive field of a V4 neuron selective for
red, the neuron will respond vigorously if the monkey attends to the red
stimulus, but respond much less if the monkey is attending to the
green stimulus. The stimulus is identical in both cases (a red and a
green bar); the difference is only in the internal state of the monkey.
This document is available at
http://www.klab.caltech.edu/cns186/PS/attention-koch.pdf
To find more detailed information about the experiments, you may copy
and paste document titles from its reference section into Google.
Thanks...
> 3.2 Single cell studies
>
> Single cell studies in awake, behaving monkeys show that the responses
> of individual neurons can be modulated by visual attention. Recent work
> has shown that cells in almost all visual areas, including V1, can be
> modulated by attention. This applies in particular to cells in
> posterior parietal cortex, the final predominantly visual area in the
> occipito-parietal stream, and neurons in area V4 and IT (inferotemporal)
> cortex along the occipito-temporal pathway, can be modulated by
> attentional factors (reviewed in Colby, 1991). Moran and Desimone
> (1985) found that if two different objects, say a red and a green bar,
> are both located within the receptive field of a V4 neuron selective for
> red, the neuron will respond vigorously if the monkey attends to the red
> stimulus, but respond much less if the monkey is attending to the
> green stimulus. The stimulus is identical in both cases (a red and a
> green bar); the difference is only in the internal state of the monkey.
What I was wondering is whether the experiment was able to factor out the
effect of eye movements. That is, if the high levels were controlling how
the eye moved, and the monkey was trained to perform a task that required
it to "focus on" a red bar, then the eye movement (controlled by the
high-level outputs of the system) might bias the signals seen in the lower
level. If the high level makes the eye spend longer looking at the red bars
and less time scanning the rest of the scene, then that bias could change
the characteristics of the signal they were monitoring.
If something like that was happening, then it could be feedback from the
high level down to the low level, but it would happen externally through
the environment, and not internally with actual feedback paths in the
network.
I'll have to look at it and understand better what they did...
> This document is available at
> http://www.klab.caltech.edu/cns186/PS/attention-koch.pdf
>
> To find more detailed information about the experiments, you may copy
> and paste document titles from its reference section into Google.