
Gradual Learning, not Reinforcement Learning


Jim Bromer

unread,
Jul 14, 2006, 2:18:18 PM7/14/06
to
I think that the attempt to prove an a priori assumption about the
efficacy of 'reinforcement' in AI is not wise unless someone has
made the decision that he wants to spend his time researching
'reinforcement' in computational learning theories. By shifting to
a more generic concept, like my idea of 'gradual learning,' the AI
researcher can rid himself of the burden of a doxology of concept that
was not designed with his interests in mind.

I feel that people who insist on using reinforcement even when it does
not work for them are just creating a problem where one does not need
to exist. If an intense study of 'reinforcement' is what someone
enjoys or otherwise wishes to pursue, that is his decision and I would
be interested in his results. But if that is not an important sub-goal
then why bother schlepping someone else's dogma even when it does not
work for you? Sure, study the problem if you want, but if your real
interest is in finding solutions, then use the solutions that you can
find.

We need a priori beliefs. But we also need to examine them carefully
and to accept the results of our studies and experiments. When you see
an alternative that could act as a viable solution to a problem that
you are working on I think you should give it some consideration even
if it doesn't fit in with your preconceived theories.

Although I have certain opinions about what I call gradual learning,
the real issue that I am stressing in this message is that gradual
learning does not have to be constrained in the same ways that
reinforcement learning probably should be. With the freer concept of
gradual learning, you can examine possibilities that would be
dogmatically rejected under the epistemology of an artificial
implementation of operant conditioning and reinforcement (what ever
that might be).

It is the nature and modeling of the relations between the references
of the objects of knowledge that is key to the contemporary problem.
None of the paradigms of the past have been shown to be capable of
fully solving this problem. Something new has to be explored.

Jim Bromer

feedbackdroid

unread,
Jul 14, 2006, 4:03:01 PM7/14/06
to

Jim Bromer wrote:
> I think that the attempt to prove an a priori assumption about the
> efficacy of 'reinforcement' in AI is not wise unless someone has
> made the decision that he wants to spend his time researching
> 'reinforcement' in computational learning theories. By shifting to
> a more generic concept, like my idea of 'gradual learning,' the AI
> researcher can rid himself of the burden of a doxology of concept that
> was not designed with his interests in mind.
>


There are at least 50 theories of learning, so there is no need for
anyone interested in AI, per se, to get too imprinted on any specific
one .....

http://tip.psychology.org/

One size fits all only if you're a polyester sock.


Naive learning devices certainly haven't worked.

Glen M. Sizemore

unread,
Jul 14, 2006, 5:39:39 PM7/14/06
to
JB: I think that the attempt to prove an a priori assumption about the
efficacy of 'reinforcement' in AI is not wise unless someone has
made the decision that he wants to spend his time researching
'reinforcement' in computational learning theories.

GS: The argument, which you have not dealt with, is that the behavioral
phenomena that are referred to broadly as habituation (and sensitization,
for that matter), classical conditioning, and (especially) operant
conditioning explain, along with behavior that is largely inherited, all of
animal and human behavior, at least at the behavioral level. Thus, if we
wish to simulate behavior (even if only the parts that this guy or that guy
define as "intelligent") we would do well to understand these phenomena and
to speculate upon, and look towards physiology to explain, how these
processes are "implemented" by physiology. [Part of "speculating upon" is
attempting simple models that can be tested by computer.] In any event, it
seems to me there are only two other possibilities. The first is that the
assertion is wrong on either logical/conceptual grounds or on empirical
grounds, and that we must, therefore, add more principles. The second is the
bird/plane argument. That is, that "artificial intelligence" can be achieved
in ways that don't have anything to do with "how nature does it."

JB: By shifting to
a more generic concept, like my idea of 'gradual learning,' the AI
researcher can rid himself of the burden of a doxology of concept that
was not designed with his interests in mind.

GS: As I recall, the last time you tried to explain "gradual learning" you
offered gibberish containing controversial terms defined in terms of other
controversial terms. My current question is, thus, what is "gradual
learning" and do you see it as some process that operates in real animals?

JB: I feel that people who insist on using reinforcement even when it does
not work for them are just creating a problem where one does not need
to exist.

GS: How do you tell the difference between this view and the view that the
processes in question are simply extremely complex?

JB: If an intense study of 'reinforcement' is what someone
enjoys or otherwise wishes to pursue, that is his decision and I would
be interested in his results. But if that is not an important sub-goal
then why bother schlepping someone else's dogma even when it does not
work for you? Sure, study the problem if you want, but if your real
interest is in finding solutions, then use the solutions that you can
find.

GS: How does this fit in with the issue that I have outlined?

JB: We need a priori beliefs. But we also need to examine them carefully
and to accept the results of our studies and experiments. When you see
an alternative that could act as a viable solution to a problem that
you are working on I think you should give it some consideration even
if it doesn't fit in with your preconceived theories.

GS: How does this fit in with the issue that I have outlined? Is this an
argument about processes that actually exist in animals or is this a
"bird/plane" type argument?

JB: Although I have certain opinions about what I call gradual learning,
the real issue that I am stressing in this message is that gradual
learning does not have to be constrained in the same ways that
reinforcement learning probably should be.

GS: Like "pixies," "poltergeists," and "God" are not constrained by the
findings of empirical science?

JB: With the freer concept of
gradual learning, you can examine possibilities that would be
dogmatically rejected under the epistemology of an artificial
implementation of operant conditioning and reinforcement (what ever
that might be).

GS: Weeeeeee! I'm a little pixie! Weeeeeeeeee!

JB: It is the nature and modeling of the relations between the references
of the objects of knowledge that is key to the contemporary problem.

GS: But the question is "What is reference?" "What is knowledge?" And these
questions may well be the same sort as "What is the life-force?" Who is the
peddler of dogma?

JB: None of the paradigms of the past have been shown to be capable of
fully solving this problem. Something new has to be explored.

GS: Like what? Oh yeah, "knowledge," "reference," and what "gradual
learning"? What about little pixies, and the life force?

"Jim Bromer" <jbr...@isp.com> wrote in message
news:1152901098....@i42g2000cwa.googlegroups.com...

Jim Bromer

unread,
Jul 14, 2006, 5:56:19 PM7/14/06
to

feedbackdroid wrote:

>
> There are at least 50 theories of learning, so there is no need for
> anyone interested in AI, per se, to get too imprinted on any specific
> one .....
>
> http://tip.psychology.org/
>
> One size fits all only if you're a polyester sock.
>

...


>
> Naiive learning devices certainly haven't worked.
>
>

Thanks for your comments. I did not mean to sound as negative as I did
by the way. I am not against reinforcement learning theories, but I
think as you seem to think that there are a lot of other good theories
and a lot of good variations that can be used effectively in various
situations. One of the things that makes reinforcement theory
interesting is that complex configurations can be shaped through simple
reinforcements of different configurations of input. This is
interesting and it deserves some thought, but there are variations upon
variations that can be considered within a broader view of this one
kind of configuration learning (to coin a name). Since a response to
this kind of thing may also be seen in terms of configurations of
responses, the possibilities within this one relatively narrow field
are so mind-boggling that I really have to wonder why anyone would
accept any less. Anyone who is interested in learning
theory should at the very least take a look at what's around.

Jim Bromer

Curt Welch

unread,
Jul 14, 2006, 6:29:25 PM7/14/06
to

Well, the "nothing else worked let's try something different" approach has
been the driving force of AI for 50 years now. It's caused a lot of people
to spend a lot of time walking in circles. Progress is made, but it's slow
when you don't know where you are going.

If you think the direction you are walking (towards this concept of
"gradual learning"), is a viable alternative, then you should keep walking.
That's all any of us can do.

As you probably know, I am a big supporter of reinforcement learning. I
believe in it because it's not an a priori assumption. It's a fact of both
human and animal behavior proven by decades of scientific research.
There's plenty of room to debate what else might be there, and how
important a role reinforcement learning plays in the big picture of full
human behavior, but reinforcement learning isn't a guess, or an assumption,
it's a well documented fact. It's something which all theories of human
intelligence must explain and, in the end, demonstrate.

The question to be answered, is what else is there?

Above you wrote:

> With the freer concept of
> gradual learning, you can examine possibilities that would be
> dogmatically rejected under the epistemology of an artificial
> implementation of operant conditioning and reinforcement (what ever
> that might be).

This implies you have some concepts that you sense are important to
creating intelligence but which are in direct conflict with the idea of
reinforcement learning. Care to share? I'm quite sure you can't name any
aspect of human behavior which I can't explain in terms compatible with
reinforcement. Most people that fight the idea of reinforcement show a
considerable lack of understanding of the subject and what they feel isn't,
or can't be, answered by a framework of reinforcement learning always can
be.

I believe there is always room for a fresh understanding of human behavior
created by a new approach. But any new approach will have to explain all
the same data, that Behaviorism (for one) has already taken great pains to
explain. There is a lot of important data which we don't have about human
behavior, so there is endless room to speculate about what that data might
look like if it were collected and what the cause of that would be. But,
it's unwise to push forward with new theories that make no attempt to, or
worse yet - are unable to, explain the hard facts and data we do have.

--
Curt Welch http://CurtWelch.Com/
cu...@kcwc.com http://NewsReader.Com/

feedbackdroid

unread,
Jul 14, 2006, 8:38:43 PM7/14/06
to

Curt Welch wrote:

............


>
> Well, the "nothing else worked let's try something different" approach has
> been the driving force of AI for 50 years now. It's caused a lot of people
> to spend a lot of time walking in circles. Progress is made, but it's slow
> when you don't know where you are going.
>

.............


>
> I believe there is always room for a fresh understanding of human behavior
> created by a new approach. But any new approach will have to explain all
> the same data, that Behaviorism (for one) has already taken great pains to

> explain. ..................
>


Does the engineer doing AI in the first paragraph really care about
whether his AI can solve or otherwise perform the psychologist's
job, as written in the latter paragraph? Is AI supposed to be the
salvation of the psychologist?

Curt Welch

unread,
Jul 14, 2006, 9:44:27 PM7/14/06
to

I'm not sure what you are asking. But AI as I was talking about is a
reference to the job of trying to make a machine duplicate full human
behavior - not just the job of making machines perform interesting tasks
like playing chess. If you don't duplicate the behavior which we already
know exists in humans, how could you possibly believe you had duplicated
full human behavior? That was the point of the second paragraph. It had
nothing to do with doing the psychologists job - it has everything to do
with duplicating what psychology has shown us humans do.

feedbackdroid

unread,
Jul 15, 2006, 1:55:05 AM7/15/06
to


Who cares, other than you and GS. I realize there's a tendency to mix
everything into the same line of thinking, but let the psychologists
worry about "full" human behavior. Commander Data is scifi, not AI. I'd
just like my robot/AI to get itself across the street in one piece,
plus a few other well-chosen tasks.

Jim Bromer

unread,
Jul 15, 2006, 10:29:18 AM7/15/06
to
Normally, I do not want to spend my time responding to people who make
derogatory comments about my comments as Glenn Sizemore did when he
said, "As I recall, the last time you tried to explain "gradual

learning" you offered gibberish containing controversial terms defined
in terms of other controversial terms."
The part of the statement where he said I used "controversial
terms," is certainly a reasonable criticism, but the part of the
statement where he said that I "offered gibberish," is unwarranted
and unsubstantiated. For an example of gibberish Glenn you might take
a look at your own remarks where you said, "Weeeeeee! I'm a little
pixie! Weeeeeeeeee!" I understood what you were saying there, and I
feel that some self-expression is a good thing, but isn't there some
irony here? I also feel that Glenn used other corruptive forms of
argumentation in his comments and these tactics are not at all unusual
for Glenn. However, I did make some combative remarks in my first
message so Glenn's excessively critical remarks probably were not
completely unwarranted this time.

My main criticism with Glenn is that he has almost never shown that he
is capable of criticizing his own views. I think Glenn represents
himself as unwaveringly right in all his views. Arguing with such a
person is difficult to say the least.

However, let me take a look at one thing Glenn said. "The argument,


which you have not dealt with, is that the behavioral phenomena that
are referred to broadly as habituation (and sensitization, for that
matter), classical conditioning, and (especially) operant conditioning
explain, along with behavior that is largely inherited, all of animal
and human behavior, at least at the behavioral level. Thus, if we wish
to simulate behavior (even if only the parts that this guy or that guy
define as "intelligent") we would do well to understand these phenomena
and to speculate upon, and look towards physiology to explain, how
these processes are "implemented" by physiology. [Part of "speculating
upon" is attempting simple models that can be tested by computer.]"

I do not dispute that Glenn's comments in this paragraph represent an
articulate expression of a reasonable point of view. I don't agree
with everything he said, but if someone wants to speculate on say,
habituation by using simple models that can be tested by computer that
is fine with me. I think that my view on this was implied when I said
in my first message, "If an intense study of 'reinforcement' is what
someone enjoys or otherwise wishes to pursue, that is his decision and
I would be interested in his results." Or when I said to
feedbackdroid in a subsequent message, "I did not mean to sound as
negative as I did by the way. I am not against reinforcement learning
theories, but I think as you seem to think that there are a lot of
other good theories and a lot of good variations that can be used
effectively in various situations." Perhaps Glenn did not read my
second message where I explicitly used a broader generality in support
of the exploration of alternative learning theories. Or, perhaps, Glenn
disapproves of the study of learning theories that he doesn't support
and he is projecting his own kind of intolerance onto me.

Glenn's repetition of the comment, "How does this fit in with the
issue that I have outlined," combined with his "gibberish" remark
and his sarcastic "pixies" comments makes me think that this is the
same old Glenn who is not really interested in the exploration of
learning theories unless they fit in with his view of Behaviorism. It
is difficult to argue on that basis, since every remark I would make
would have to be translated into the terms of Behaviorism. Since I am
not an expert in Behaviorism, I would be at a disadvantage. Finally, I
would have to spend a lot of time on this and with no expectation of
impartiality or even genuine curiosity from Glenn I don't feel that
this is worth the effort. I don't dislike Glenn by the way, I just
don't have the time.

I feel that Behaviorism produced some insights of value, but it was
severely limited by its methodological constraints and it has been made
disreputable by the intolerance and arrogance of many of its
proponents. The idea that Behaviorism explains everything about
psychology is just not proven and not provable. I have no idea why
anyone would act as if it was.

If someone wants to study models of Behaviorism, that is his decision
and I don't care. I actually think it's fine. However, computer
programmers should not be constrained by someone else's historical
dogma when it comes to finding technical solutions to programming
problems.

I would be happy to explore the possibilities of what I call "gradual
learning" if someone is genuinely interested. But Glenn has not
shown evidence that he has understood my attempts to explain this
concept, in fact, he only dismisses it with his label of
"gibberish." If you truly do not understand what I am talking
about then you would have to be willing to temporarily take on the role
of student in order for me to teach you about my ideas. It is obvious
that is what would be required, but I don't honestly think Glenn is
willing to learn from me. On the other hand, if Glenn is only
criticizing my remarks then I have to assume that his point is
something like: Your gibberish about gradual learning is not supported
by Behaviorist Theory. I already know that! Again, this was implied in my
first message. No argument needed. (Actually there may be some
possibilities that a few of my ideas could constitute an expansion of
Behaviorist theories, but I am not interested in pursuing that
possibility.)

Seriously, I would be willing to try to explore my ideas of gradual
learning and configuration learning if anyone was genuinely interested.

Jim Bromer

J.A. Legris

unread,
Jul 15, 2006, 11:06:30 AM7/15/06
to

Jim Bromer wrote:
>
> Seriously, I would be willing to try to explore my ideas of gradual
> learning and configuration learning if anyone was genuinely interested.
>

OK, let's get started. What is gradual learning, and under what
circumstances does it arise?

--
Joe Legris

Curt Welch

unread,
Jul 15, 2006, 4:14:58 PM7/15/06
to
"feedbackdroid" <feedba...@yahoo.com> wrote:
> Curt Welch wrote:

> > I'm not sure what you are asking. But AI as I was talking about is a
> > reference to the job of trying to make a machine duplicate full human
> > behavior - not just the job of making machines perform interesting

> > tasks like playing chess. ....

> Who cares, other than you and GS.

Well, a few billion people on the planet would care if someone figured out
how to do it. :)

But I take care to write out "full human behavior" in debates about
what intelligence is, to make my intended final target clear.

> I realize there's a tendency to mix
> everything into the same line of thinking, but let the psychologists
> worry about "full" human behavior. Commander Data is scifi, not AI. I'd
> just like my robot/AI to get itself across the street in one piece, plus
> a few other well-chosen tasks.

Creating engineering solutions to specific tasks is fun and useful. I've
spent most of my life doing it. But my interest in AI is not simply to push
the state of the art forward by finding new solutions to limited domain
tasks. Nor is my interest to just find new technologies to use in my
projects. I'm only really interested in technology that I can believe is on
a direct path to creating Commander Data. As such, I try to identify
everything that is needed to create a Commander Data, and see which
technologies seem to be on that path, and which seem not to be on the path.
Everything that looks off the path to me, I mostly ignore. Anything on the
path, I try to understand and advance. I try to understand what's missing,
and what needs to be filled in to get us there. As you know from reading
too many of my posts, the prime missing link I see is stronger, real time,
high dimension, reinforcement learning systems.

I don't actually care all that much about full human behavior. I care
about creating something like Commander Data - that is, a machine with the
skills needed to do any job I might want done, which I am now forced to
hire a human to do because we don't know how to build a machine to do it.
This is close, but not the same thing, as full human behavior.

Jim Bromer

unread,
Jul 16, 2006, 11:15:32 AM7/16/06
to

J.A. Legris wrote:
> OK, let's get started. What is gradual learning, and under what
> circumstances does it arise?
>
> --
> Joe Legris

The classical example of logical reasoning is,
All men are mortal.
Socrates is a man.
Therefore, we know -by form- that Socrates is mortal.

This concept of form was also used in the development of algebra where
we know facts like,
2a + 2a = 4a
if a is any real number. So, for example, we know -by form- that if
a=3 then 2*3+2*3=4*3.

One of the GOFAI models used categories and logic in order to create
logical conclusions for new information based on previously stored
information. In a few cases this model produced good results even for
some novel examples. But, it also produced a lot of incorrect results
as well. I wondered why this GOFAI model did not work better more
often. One of the reasons I discovered is that we learn gradually, so
that by the time we are capable of realizing that the philosopher is
mortal just because he is a man and all men are mortal, we also know a
huge amount of other information that is relevant to this problem. The
child learns about mortality in dozens of ways if not hundreds or even
thousands of ways before he is capable of realizing that since all men
are mortal, then Socrates must also be mortal.

I realized that this kind of logical reasoning can be likened to
instant learning. If you learn that Ed is a man, then you also
instantly know that Ed must be mortal as well. This is indeed a valid
process, and I feel that it is an important aspect of intelligence.
But before we get to the stage where we can derive an insight through
previously learned information and have some capability to judge the
value of that derived insight, we have to learn a great many related
pieces of knowledge. So my argument here, is that while instant
derivations are an important part of Artificial Intelligence, we also
need to be able to use more gradual learning methods to produce the
prerequisite background information so that derived insights can be
used more effectively.
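
A minimal sketch of that "instant" step in Python (the facts, rule
format, and function here are just illustrative, not a proposal):

facts = {("man", "Socrates"), ("man", "Ed")}
rules = [("man", "mortal")]   # "all men are mortal" as a category-to-category rule

def derive(facts, rules):
    # Apply each rule once to every matching fact; the conclusion is
    # available "instantly", with no gradual accumulation of evidence.
    new = set()
    for premise, conclusion in rules:
        for category, individual in facts:
            if category == premise:
                new.add((conclusion, individual))
    return new

print(derive(facts, rules))   # e.g. {('mortal', 'Ed'), ('mortal', 'Socrates')}

The point, though, is that a real system needs a large body of
gradually acquired background knowledge before a derivation like this
can be made and its value judged.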

Gradual learning is an important part of this process. We first learn
about things in piecemeal fashion before we can put more complicated
ideas together. I would say that reinforcement learning is a form of
gradual learning but there are a great many other methods of gradual
learning available to the computer programmer.

It's hard for most people to understand me (or for that matter even
to believe me) when I try to describe how adaptive AI learning might
take place without predefined variable-data references. So it is much
easier for me to use some kind of data variable-explicit model to try
to talk about my ideas.

Imagine a complicated production process that had all kinds of sensors
and alarms. You might imagine a refinery or something like that.
However, since I don't know too much about material processes, I
wouldn't try to simulate something like that but I would instead
create a computer model that used algorithms to produce streams of data
to represent the data produced by an array of sensors. Under a number
of different situations, alarms would go off when certain combinations
of sensor threshold values were hit. This computer generated model
would be put through thousands of different runs using different
initial input parameters so that it would produce a wide range of data
streams through the virtual sensors. It would then be the job of the
AI module to try to predict which alarms would be triggered and when
they would be triggered before the event occurred. The algorithms that
produced the alarms could be varied and complicated. For example, if
sensor line 3 and sensor line 4 go beyond some threshold values for at
least 5 units of time, then alarm 23 would be triggered unless line 6
dipped below some threshold value at least two times in the 10 units of
time before. There might be hundreds of such alarm scenarios.
Individual sensor lines might be involved in a number of different
alarm scenarios. An alarm might, for another example, be triggered if
the average value of all the sensor inputs was within some specified
range. The specified triggers for some alarms might change from run to
run, or even during a run. Some of these scenarios would be simple,
and some might be very complex. Some scenarios might even be triggered
by non-sensed events. The range of possibilities, even within this
very constrained data-event model is tremendous if not truly infinite.
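
As a rough sketch of what such a simulated plant might look like in
Python (the sensor count, drift model, and alarm rules below are just
invented examples of the kind of thing described above):

import random

N_SENSORS = 8
STEPS = 200

def run_once(seed):
    rng = random.Random(seed)
    history = []   # one tuple of sensor readings per time step
    alarms = []    # (time, alarm_id) events
    level = [rng.uniform(0, 1) for _ in range(N_SENSORS)]
    for t in range(STEPS):
        # each virtual sensor drifts randomly between 0 and 1
        level = [max(0.0, min(1.0, x + rng.gauss(0, 0.05))) for x in level]
        history.append(tuple(level))
        # example rule: sensors 3 and 4 above 0.8 for the last 5 steps fires
        # alarm 23, unless sensor 6 dipped below 0.2 twice in the last 10 steps
        if t >= 10:
            high = all(s[3] > 0.8 and s[4] > 0.8 for s in history[-5:])
            dips = sum(1 for s in history[-10:] if s[6] < 0.2)
            if high and dips < 2:
                alarms.append((t, 23))
        # example rule: alarm 7 fires when the average of all sensors sits
        # inside a narrow band
        if 0.48 < sum(level) / N_SENSORS < 0.52:
            alarms.append((t, 7))
    return history, alarms

history, alarms = run_once(seed=1)
print(len(alarms), "alarm events in one run")

The AI module would only see the sensor streams and the alarm events,
and its job would be to predict the alarms before they fire.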

The AI module might be exposed to a number of runs that produced very
similar sensor values, or it might be exposed to very few runs that
produced similar data streams.

Superficially this might look a little like a reinforcement scenario
since the alarms could be seen as negative reinforcements, but it
clearly is not a proper model for behaviorist conditioning. The only
innate 'behavior' that the AI module is programmed to produce is to
try to develop conjectures to predict the data events that could
trigger the various alarms.

I argue that since simplistic assessments of the runs would not work
for every kind of alarm scenario, the program should start out with
gradual learning in order to reduce the false positives where it
predicted an alarm event that did not subsequently occur.

This model might have hundreds or thousands of sensors. It might have
hundreds of alarms. It might have a variety of combinations of data
events that could cause or inhibit an alarm. Non-sensible data events
might interact with the sensory data events to trigger or inhibit an
alarm. Furthermore, the AI module might be able to mitigate or operate
the data events that drive the sensors so that it could run interactive
experiments to test its conjectures.

I have described a complex model where an imagined AI module would have
to make conjectures about the data events that triggered an alarm. Off
hand I cannot think of any one learning method that would be best for
this problem. So lacking that wisdom I would suggest that the program
might run hundreds or even thousands of different learning methods in
an effort to discover predictive conjectures that would have a high
correlation with actual alarms. This is a complex model problem which
does not lend itself to a single simplistic AI paradigm. I contend
that the use of hundreds or maybe even thousands of learning mechanisms
is going to be a necessary component of innovative AI paradigms in the near
future. And it seems reasonable to assume that initial learning is
typically going to be a gradual process in such complex scenarios.

I will try to finish this in the next few days so that I can describe
some of the different methods to produce conjectures that might be made
in this setting and to try to show how some of these methods could be
seen as making instant conjectures while others could be seen as
examples of gradual learning.

Jim Bromer

J.A. Legris

unread,
Jul 16, 2006, 1:28:51 PM7/16/06
to

What you've described so far sounds like the Bayesian model that
Michael Olea has been describing, where an estimate of the posterior
probability of an event is updated after each observation of the
evidence. Is this the sort of thing you have in mind? At some point,
perhaps depending on a threshold probability level, a decision would
have to be made about whether the corresponding alarm should be
triggered.
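
For instance, a minimal Beta-Bernoulli version of that update might
look like this in Python (the class, threshold, and toy observations
are just illustrative):

class BetaBernoulli:
    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha   # pseudo-count of "alarm followed the pattern"
        self.beta = beta     # pseudo-count of "no alarm followed"

    def update(self, alarm_followed):
        if alarm_followed:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self):
        return self.alpha / (self.alpha + self.beta)

estimate = BetaBernoulli()
for outcome in [True, False, True, True, False, True]:   # toy observations
    estimate.update(outcome)

THRESHOLD = 0.6
print(estimate.mean())                # 0.625 after these six observations
print(estimate.mean() > THRESHOLD)    # True -> predict the alarm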

It seems like a big jump from predicting outcomes, even thousands of
them, to running interactive experiments to test the predictions. How
might that work?

--
Joe Legris

Michael Olea

unread,
Jul 16, 2006, 1:28:13 PM7/16/06
to
Jim Bromer wrote:

> Imagine a complicated production process that had all kinds of sensors
> and alarms.

Imagine a joint probability distribution over a set of random variables.
Imagine estimating the distribution.

-- Michael

Glen M. Sizemore

unread,
Jul 16, 2006, 2:52:29 PM7/16/06
to

"J.A. Legris" <jale...@sympatico.ca> wrote in message
news:1153070931....@m73g2000cwd.googlegroups.com...

This strikes me as enormously charitable. It seems to me that he has said
little else than:

1.) We may be able to predict some events if we have access to some part
of what has happened.

2.) We should build a machine that does that.

Michael Olea

unread,
Jul 16, 2006, 2:53:24 PM7/16/06
to
J.A. Legris wrote:

> What you've described so far sounds like the Bayesian model that
> Michael Olea has been describing, where an estimate of the posterior
> probability of an event is updated afer each observation of the
> evidence. Is this the sort of thing you have in mind? At some point,
> perhaps depending on a threshold probability level, a decision would
> have to be made about whether the corresponding alarm should be
> triggered.

That would be where the "utility model" comes in (moving from Bayesian
Inference into Bayesian Decision Theory) - the cost and gain functions over
consequences. So you pick the threshold to maximize expected utility. That
is, of course, a normative theory, not a descriptive one - what an agent
should do, not what particular agents do in fact do. Even so it is often a
good model of behavior under experimental conditions. There is a consistent
difference, I've mentioned a few times, between the normative model and a
descriptive model of "matching law" like behavior. Suppose you have two
choices A and B, and that the expected utility is 90 for A and 10 for B.
The optimal choice is pick A every time. The observed behavior is more like
pick A 90% of the time, pick B 10% of the time. The discrepancy arises only
if the probability distribution is known, and stationary. If the
distribution is unknown (i.e. being estimated, or "learned"), and if it
might be changing then the matching law makes more sense, has been shown to
be optimal under some idealized conditions, and is a form of "importance
sampling", very much like particle filtering methods of approximate
Bayesian inference.
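
A toy version of the contrast, using the 90/10 utilities from the
example above (the rest of the code is just for illustration):

import random

values = {"A": 90.0, "B": 10.0}   # expected utilities from the example above

def maximize(values):
    # the normative choice: always take the higher-valued option
    return max(values, key=values.get)

def match(values, rng):
    # "matching law" style choice: pick options in proportion to their value
    total = sum(values.values())
    r = rng.uniform(0, total)
    cumulative = 0.0
    for option, value in values.items():
        cumulative += value
        if r <= cumulative:
            return option

rng = random.Random(0)
choices = [match(values, rng) for _ in range(1000)]
print(maximize(values))                      # always "A"
print(choices.count("A") / len(choices))     # roughly 0.9

The maximizer always returns A; the matcher distributes its choices
roughly in proportion to value, which is the pattern the matching law
describes.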

> It seems like a big jump from predicting outcomes, even thousands of
> them, to running interactive experiments to test the predictions. How
> might that work?

That, "intervention", gets a lot of attention in Judea Pearl's second major
book, the one on "Causality". It also has been studied in terms of "value
of information". Bayesian medical expert systems do a limited form of this
by suggesting tests to perform in order to arrive at a diagnosis. The role
of intervention in learning has also been studied in, for example,
developmental psychology. Discounting evidence ("let me try it, you just
aren't doing it right") is one example. It is a major theme in Allison
Gopnik's work:

http://ihd.berkeley.edu/gopnik.htm

For example:

A.Gopnik, C. Glymour, D. Sobel, L. Schulz, T. Kushnir, & D. Danks (2004). A
theory of causal learning in children: Causal maps and Bayes nets.
Psychological Review, 111, 1, 1-31.

T. Kushnir, A. Gopnik, L Schulz, & D. Danks. (in press). Inferring hidden
causes. Proceedings of the Twenty-Fourth Annual Meeting of the Cognitive
Science Society

-- Michael


Curt Welch

unread,
Jul 16, 2006, 3:13:37 PM7/16/06
to

You seem to be describing the problem that the GOFAI people talk about as a
lack of common sense. They feel their approaches don't have it and need
it. Meaning, that for any base of high level knowledge we seem to have,
it's always supported by a larger base of lower level knowledge. All the
high level knowledge they try to put into a machine ends up lacking support
from the common sense facts about this knowledge that humans always have.

For example, you can teach a logic machine that airplanes can fly and
bees can fly, but that alone doesn't let the machine know the simple fact
that airplanes are normally very large machines (much larger than us), and
bees are very small animals (much smaller than us). This is a common sense
fact easily picked up by seeing a bee or an airplane for yourself, but one
of the many things you might have forgotten to tell a logic machine when
you were trying to hand-program knowledge into it.

This problem however recurses. No matter how much common sense knowledge
you put into the machine, that new knowledge is also lacking the common
sense support from below. And the missing support from below, always seems
to be larger than what you have already put into the machine. The harder
you work to solve the problem, the bigger the problem seems to get.

My take on this problem, is that you can't hand program human level
knowledge into a machine. Humans are not capable of doing it. We don't
understand the knowledge in our own heads well enough to simply copy the
knowledge out of our heads, into a computer, and reach full human
intelligence. We can only do it for limited domain problems like chess, or
all the other millions of programs we have written by simply translating
knowledge from our head, into computer code.

The missing piece of the puzzle is learning. The first approach that
Turing suggested - to build an adult machine by hand-coding human knowledge
into a machine - can never work to reach full human intelligence. The high
level knowledge we understand exists in our brain is not enough to make a
machine intelligent.

Turing's second approach, building a baby machine and letting it learn for
itself is the only approach that can work. These machines build their own
base of knowledge from the bottom up, instead of us trying to fill it in
from the top down.

And what you say about your gradual learning seems to fit with this view of
mine, in that we need a learning system that slowly builds from the
bottom up, all the knowledge it needs to support a high level concept like
"man is mortal".

I don't think however you have added anything by putting the word "gradual"
in front of it. There is no other type of learning. All learning systems
(of any real interest to AI and psychology) are gradual. They add new
knowledge on top of old knowledge. This causes a progressive build-up of
knowledge over time - which makes it gradual. A computer memory cell,
which erases all traces of the old knowledge when it learns something new,
is an example of instant learning. And that type of learning is so well
understood and so uninteresting we don't bother to call it learning; we
call it memory. Everything we use the word "learning" to describe is
gradual learning. So I don't see why you bother to put the word "gradual"
in front of it. It's redundant, in my view.

> I realized that this kind of logical reasoning can be likened to
> instant learning. If you learn that Ed is a man, then you also
> instantly know that Ed must be mortal as well. This is indeed a valid
> process, and I feel that it is an important aspect of intelligence.
> But before we get to the stage where we can derive an insight through
> previously learned information and have some capability to judge the
> value of that derived insight, we have to learn a great many related
> pieces of knowledge. So my argument here, is that while instant
> derivations are an important part of Artificial Intelligence, we also
> need to be able to use more gradual learning methods to produce the
> prerequisite background information so that derived insights can be
> used more effectively.
>
> Gradual learning is an important part of this process. We first learn
> about things in piecemeal fashion before we can put more complicated
> ideas together. I would say that reinforcement learning is a form of
> gradual learning but there are great many other methods of gradual
> learning available to the computer programmer.
>
> It's hard for most people to understand me (or for that matter even
> to believe me) when I try to describe how adaptive AI learning might
> take place without predefined variable-data references.

Not for me.

Yeah, but as you have described, the purpose is to predict the alarms, and
nothing else. This is not reinforcement learning - which would have the
purpose of preventing the alarms (if the alarms were a negative reward).

However, if your thinking (which you have not written) is that the machine
would use its understanding to try to prevent the alarms, then you have
simply described the reinforcement learning problem.

> but it
> clearly is not a proper model for behaviorist conditioning. The only
> innate 'behavior' is that the AI module is programmed to produce is to
> try to develop conjectures to predict the data events that could
> trigger the various alarms.

I believe most people would look at that as a type of unsupervised
learning.

> I argue that since simplistic assessments of the runs would not work
> for every kind of alarm scenario, the program should start out with
> gradual learning in order to reduce the false positives where it
> predicted an alarm event that did not subsequently occur.
>
> This model might have hundreds or thousands of sensors. It might have
> hundreds of alarms.

What exactly is your point in defining some inputs as sensor inputs and
some as alarm inputs? Why the distinction? Why not just call them all
sensors and describe the point of the machine as being one of predicting
all sensor inputs before they happen? Why only predict the inputs you
label as "alarm" inputs?

> It might have a variety of combinations of data
> events that could cause or inhibit an alarm. Non-sensible data events
> might interact with the sensory data events to trigger or inhibit an
> alarm. Furthermore, the AI module might be able to mitigate or operate
> the data events that drive the sensors so that it could run interactive
> experiments to test its conjectures.
>
> I have described a complex model where an imagined AI module would have
> to make conjectures about the data events that triggered an alarm. Off
> hand I cannot think of any one learning method that would be best for
> this problem. So lacking that wisdom I would suggest that the program
> might run hundreds or even thousands of different learning methods in
> an effort to discover predictive conjectures that would have a high
> correlation with actual alarms. This is a complex model problem which
> does not lend itself to a single simplistic AI paradigm.

Except for the fact that by combining multiple techniques into one, and
deciding which to try, and how long to try each, all you have done is
defined yet another single learning system. When you are done, whatever
you have built will be just another single learning system.

This is the point of the no-free-lunch theorem in learning systems:

http://en.wikipedia.org/wiki/No-free-lunch_theorem

Every learning system has an inherent bias and there is no way to get
around that by trying to use all possible methods. Any attempt to do so
will just change the bias to something else.

The "best learning" is the one with the bias that will produce the best
answers over the class of problems it will be expected to have to deal
with. All learning systems in the end must be biased towards a specific
class of problems.

To solve AI, you have to both correctly understand the class of problems it
will be expected to solve, and then, find the algorithm that best fits that
class. You can't cheat by just trying all of them. It's highly unlikely
that an approach based on "I don't know, so try them all" is going to work very
well.

> I contend
> that the use of hundreds or maybe even thousands of learning mechanisms
> is going to be a necessary component of innovative AI paradigms in near
> future. And it seems reasonable to assume that initial learning is
> typically going to be a gradual process in such complex scenarios.

Initial, and final, learning is gradual in all cases. I suspect however
that in humans, it's actually the inverse of what you suggest. We
probably learn more in our first 5 years than we do in the rest of our lives.
The only reason it doesn't feel this way to us is that we take all that
initial learning for granted. The foundation of knowledge it gives us we
take for granted - like the fact that to touch something to our left, we
have to reach to the left. Knowing this, and knowing how to do this, is
not trivial in any sense when you look at the complexity of the problem
from the view of a robot trying to learn to use a manipulator to grasp a coke
can. But it's one of the billions of things we as humans never think twice
about. It's one of the billions of things that forms our huge foundation
of common sense knowledge that our high level of knowledge is built from.

The issue I have with what you have written so far, is that it's focused on
the idea of a machine extracting knowledge from the environment, but it
ignores the most important question of AI - what do we do with the
knowledge?

How does a machine which learns to partially predict the input signals
labeled as being "alarms", determine how to move its arm? Why would it
move its arm?

How does a machine that's good at predicting alarms, get us closer to
building Commander Data from STTNG?

J.A. Legris

unread,
Jul 16, 2006, 3:29:10 PM7/16/06
to

Maybe I should have said "sounds consistent with" instead of "sounds
like", but what grabbed me was the idea that his AI should get
incrementally better at making predictions with repeated exposures to
informative data. Bayesian probability suggests a "machine" for
carrying this out.
--
Joe Legris

Curt Welch

unread,
Jul 16, 2006, 3:58:46 PM7/16/06
to

That is what the entire subject of reinforcement learning seeks to
answer.

It's a problem of trying to produce the correct behavior, to maximize
rewards, while at the same time, using all your behaviors, as experiments,
to collect data about what future behaviors you should be producing.

As Michael points out above, if you know the utility (value) of two
behaviors are 90 and 10, the best answer would be to pick 90. But if your
knowledge is not absolute, but only based on a limited number of past
"experiments", then picking 90 and never picking 10 will give you no
additional data about the relative value of the two behaviors. If you never
pick the 10 behavior again, you will never be able to update your knowledge
about their relative value. If your knowledge is never absolute (which it
won't be for real world problems that deal with the universe, or where
the learning happens through experience and all you will ever know is the
result of a fixed number of experiments), then the optimal solution is never
to stop experimenting - to never stop picking the 10 option at least some of
the time to see if the result this time might be different.
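
A minimal sketch of that compromise, using an epsilon-greedy rule as
one possible mechanism (the payoffs and parameters below are invented
for illustration):

import random

def epsilon_greedy(true_means, steps=10000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)
    counts = [0] * len(true_means)
    for _ in range(steps):
        if rng.random() < epsilon:
            choice = rng.randrange(len(true_means))      # keep experimenting
        else:
            choice = estimates.index(max(estimates))     # exploit current knowledge
        reward = rng.gauss(true_means[choice], 5.0)      # noisy payoff
        counts[choice] += 1
        # incremental running average of observed rewards
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
    return estimates, counts

estimates, counts = epsilon_greedy([90.0, 10.0])
print(estimates)   # estimates end up near [90, 10]
print(counts)      # most picks go to the 90 option, but the 10 option is never abandoned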

The point of a reinforcement learning machine is to learn from its own
interactions with the environment. The only thing such a machine can do to
change its fate is to interact with the environment, so the only purpose
it can have is to produce optimal interactions. But since everything it
knows about how to interact is learned from past interactions, the optimal
rules it uses for picking behaviors have to take into account that it's
got a dual purpose in life - 1) pick the behaviors that lead to the best
results, and 2) improve its understanding about which behaviors are best
so it can make better choices in the future. These two needs create a
natural conflict which requires a compromise to be reached. It must bias
its behavior selection towards the behaviors which are currently known to
be better, but never totally abandon the "bad" behaviors, because what was
once seen as bad might later (in a different environment) turn out
to be good.

The problem of AI is exactly the problem of how a machine both learns to
produce optimal behaviors while, at the same time, using all the results of
past behaviors, as experimental data to guide the selection of future
behaviors. The solution to how this is done, is seen in humans and animals
as operant conditioning.

Curt Welch

unread,
Jul 16, 2006, 4:02:05 PM7/16/06
to
"J.A. Legris" <jale...@sympatico.ca> wrote:
> Glen M. Sizemore wrote:

> Maybe I should have said "sounds consistent with" instead of "sounds
> like", but what grabbed me was the idea that his AI should get
> incrementally better at making predictions with repeated exposures to
> informative data. Bayesian probability suggests a "machine" for
> carrying this out.

Just as all reinforcement learning algorithms are machines for carrying
that out as well.

Glen M. Sizemore

unread,
Jul 16, 2006, 5:32:42 PM7/16/06
to

"J.A. Legris" <jale...@sympatico.ca> wrote in message
news:1153078150.5...@75g2000cwc.googlegroups.com...


Yeah, but without specifying the Bayesian calculations, and what observable
sorts of events enter into the calculations, he has said nothing more than:
"We need a machine that can learn." And the sort of specific conditions
under which "learning" occurs is left totally unspecified. In other words,
his position is utterly vacuous.

> --
> Joe Legris
>


J.A. Legris

unread,
Jul 16, 2006, 5:56:24 PM7/16/06
to
Curt Welch wrote:
> "J.A. Legris" <jale...@sympatico.ca> wrote:
> > Glen M. Sizemore wrote:
>
> > Maybe I should have said "sounds consistent with" instead of "sounds
> > like", but what grabbed me was the idea that his AI should get
> > incrementally better at making predictions with repeated exposures to
> > informative data. Bayesian probability suggests a "machine" for
> > carrying this out.
>
> Just as all reinforcement learning algorithms are machines for carrying
> that out as well.
>

Part of Bromer's proposal is that reinforcement-based learning is not
always necessary. I think he intends to show that his AI generates
specific predictions based on the conjectures it forms, and then
compares those predictions with actual outcomes. Conjectures that are
borne out are retained preferentially over those that fail.

Now this raises an interesting possibility: a learning system that just
sits there, calmly observing events, building up a supply of successful
theories about how the world works. Then suddenly it rises up and
exhibits fully developed overt behaviour, acquired gradually, but
rehearsed and perfected entirely internally. The behaviourist can
insist that reinforcement-based learning occurred, but there's no
evidence of it.

Just a guess.

--
Joe Legris

feedbackdroid

unread,
Jul 16, 2006, 7:58:45 PM7/16/06
to


It may be possible to devise such an "artificial" system, but real
learning in brains probably isn't just "either - or". Rather, every behavioral
situation probably involves several mechanisms, including some
prediction, some intuition/analogical thinking [possibly based upon
past experiences], and some direct reinforced learning.

As Bernd Heinrich describes, even crows are able to solve problems
they've never seen before, and without any practice. Eg, pulling
up a piece of food tied to a string. They somehow reason it out
"internally" before beginning the task behaviorally [externally],
and which can take up to 10 or more individual steps in sequence
to perform. See "the Mind of the Raven".

Also, Horace Barlow and Wm Calvin talk about the idea of "guessing
well". This has to do with attempting totally new tasks one has never
tried before, and internally working out an execution sequence based
upon analogy with previously learned acts. Can something be
reinforced that you've never seen or done before?

Also, there are some neural nets that do "1-pass" learning. Is this
reinforcement?

Also, Edelman would probably say something like: behaviors are
selected for via internal mechanisms. IOW, any given stimulus might
elicit any number of potential behavioral responses, but only one of
these ends up being selected for execution. Certainly this happens when
you search for the proper word to stick into a sentence. Internally,
many words are filtered past before one is finally spoken. And then
of course you have the option to stop saying the word even while it's
being spoken, if it's not the right selection. Plus, there are multiple
options for how the word is spoken: emphasis, inflection, etc.

Reinforcement learning is only part of the system.

feedbackdroid

unread,
Jul 16, 2006, 8:38:59 PM7/16/06
to

Glen M. Sizemore wrote:
> JB: I think that the attempt to prove an a priori assumption about the
> efficacy of 'reinforcement' in AI is not wise unless someone has
> made the decision that he wants to spend his time researching
> 'reinforcement' in computational learning theories.
>
>
>
> GS: The argument, which you have not dealt with, is that the behavioral
> phenomena that are referred to broadly as habituation (and sensitization,
> for that matter), classical conditioning, and (especially) operant
> conditioning explain, along with behavior that is largely inherited, all of
> animal and human behavior, at least at the behavioral level. Thus, if we
> wish to simulate behavior (even if only the parts that this guy or that guy
> define as "intelligent") we would do well to understand these phenomena and
> to speculate upon, and look towards physiology to explain, how these
> processes are "implemented" by physiology. [Part of "speculating upon" is
> attempting simple models that can be tested by computer.]
>


DOH! Inherited behavior? DOH! Look towards physiology to understand
them? DOH! How are these processes implemented by physiology?
DOH! Attempting simple computer models?

Welcome to the middle of the 20th century. People have been doing just
this for the past 50+ years. Where have you been?


>
> In any event, it
> seems to be there are only two other possibilities. The first is that the
> assertion is wrong on either logical/conceptual grounds or on empirical
> grounds, and that we must, therefore, add more principles. The second is the
> bird/plane argument. That is, that "artificial intelligence" can be achieved
> in ways that don't have anything to do with "how nature does it."
>


Well, the 2nd is probably true for AI, and the first is undoubtedly
true for real intelligence/brains. More principles.

Real brains, as well as real living cells for that matter, are
humongously complex, and one-size-fits-all approaches, like operant
conditioning, just don't cover it. Forgetting about the complex
interactions between the
100B neurons and 100T synapses in the brain for a minute, just the
internal complexity of each of the "individual" brain cells alone is
phenomenal. At least HALF of the DNA of the genome factors into
brain operation. 3 billion base pairs in humans, and 1000s of internal
DNA -> RNA -> allosteric protein -> DNA regulatory feedback loops
in each cell alone.

Damn right there are more principles that we just don't understand.


>
> JB: I feel that people who insist on using reinforcement even when it does
> not work for them are just creating a problem where one does not need
> to exist.
>
>
>
> GS: How do you tell the difference between this view and the view that the
> processes in question are simply extremely complex?
>

Wow. Good guess.

...........


>
> JB: None of the paradigms of the past have been shown to be capable of
> fully solving this problem. Something new has to be explored.
>
>
>
> GS: Like what? Oh yeah, "knowledge," "reference," and what "gradual
> learning"? What about little pixies, and the life force?
>


AWWW. Now you're not doing so well in the "GUESSING WELL"
department that I mentioned in the other post today on this thread.
You need to work on that. Analogical thinking, vis-à-vis your past
experiences, should allow you to come up with a better guess than
pixies.

Curt Welch

unread,
Jul 16, 2006, 9:10:31 PM7/16/06
to
"feedbackdroid" <feedba...@yahoo.com> wrote:
> Can something be
> reinforced that you've never seen or done before?

Of course it can. A good reinforcement learning machine is shaping classes
of behaviors as it learns. It's not learning specific reactions. It does
it by shaping the operation of a classifier as it learns. All possible
stimulus inputs then are guaranteed to fall into some class so that the
system will always have an "answer" as to how to respond. The answer will
be based on the reinforcement learning system's evaluation of what class the
current situation falls into, and on the system's past experience with other
events that might have been different, but yet fall into the same
classifications.

You can see one implementation of this in action in TD-Gammon. Each move
which gets reinforced shapes the weights of the neural network which causes
many other similar moves to be reinforced at the same time. It doesn't
have to see every move, to be able to make a good "guess" at how to respond
to a move. It has a good (for Backgammon) system for classifying moves
into response classes so that it can successfully merge its learning from
other moves, to make a good guess at how to play a position it has never
seen before.

This power to correctly make a "good" guess for situations never seen is the
one key missing piece in general reinforcement learning systems. How it
does it is easy to understand in theory - it simply needs a system that
automatically creates a closeness function and produces an answer by
some type of merging and selecting from the situations it has seen. But
how you do this so that a generic system of measuring "closeness" (one not
hand-tuned to the application like it was in TD-Gammon) does a good job
is the hard question that has not been well answered.
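
A stripped-down illustration of the generalization - a TD(0) update over
a linear feature representation, not the actual TD-Gammon network, with
invented feature vectors:

def value(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def td_update(weights, features, reward, next_features, alpha=0.1, gamma=0.9):
    # One TD(0) step: move V(s) toward reward + gamma * V(s').
    target = reward + gamma * value(weights, next_features)
    error = target - value(weights, features)
    return [w + alpha * error * f for w, f in zip(weights, features)]

weights = [0.0, 0.0, 0.0]
seen_state    = [1.0, 0.5, 0.0]   # features of a position that was actually played
similar_state = [0.9, 0.6, 0.0]   # a position never seen, but with similar features
next_state    = [0.0, 0.0, 1.0]

weights = td_update(weights, seen_state, reward=1.0, next_features=next_state)
print(value(weights, seen_state), value(weights, similar_state))
# both estimates move, even though only the first position was experienced

Because the update adjusts shared weights rather than a table entry, the
estimate for the similar, never-seen position moves along with the one
that was actually experienced.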

> Also, there are some neural nets that do "1-pass" learning. Is this
> reinforcement?
>
> Also, Edelman would probably saw something like behaviors are
> selected for via internal mechanisms. IOW, any given stimulus might
> elicit any #of potential behavioral responses, but only one of these
> ends up being selected for execution. Certainly this happens when
> you search for the proper word to stick into a sentence. Internally,
> many words are filtered past before one is finally spoken. And then
> of course you have to option to stop saying the word even while it's
> being spoken, if it's not the right selection. Plus, there are multiple
> options for how the word is spoken, emphasis, inflection, etc.
>
> Reinforcement learning is only part of the system.

And where is your evidence to show that all those "options" are not
selected for by the same low level reinforcement learning system?

JGCASEY

unread,
Jul 16, 2006, 11:44:01 PM7/16/06
to
Curt Welch wrote:
> "feedbackdroid" <feedba...@yahoo.com> wrote:
> > Can something be
> > reinforced that you've never seen or done before?
>
> Of course it can. A good reinforcement learning machine is
> shaping classes of behaviors as it learns. It's not learning
> specific reactions. It does it by shaping the operation of a
> classifier as it learns. All possible stimulus inputs then
> are guaranteed to fall into some class so that the system will
> always have an "answer" as to how to respond. The answer will
> be based on the reinforcement learning systems evaluation of
> what class the current situation falls into, and on the systems
> past experience with other events that might have been different,
> but yet fall into the same classifications.


But the problem is: what kind of system is capable of classifying
inputs in a useful way in the first place, so that there is something
to reinforce?


> You can see one implementation of this in action in TD-Gammon.
> Each move which gets reinforced shapes the weights of the neural
> network which causes many other similar moves to be reinforced
> at the same time. It doesn't have to see every move, to be able
> to make a good "guess" at how to respond to a move. It has a
> good (for Backgammon) system for classifying moves into response
> classes so that it can successfully merge it's learning from
> other moves, to make a good guess at how to play a position it
> has never seen before.


So your problem as you allude to below is how to make a system
that can learn a good scheme for classifying whatever is useful
for it to classify.

Adjusting the values of the parameters of some learning scheme
may work fine for some programmer-invented scheme, but it doesn't
represent open-ended, general-purpose scheme learning.

You talk about reinforcement learning but you cannot reinforce
something that doesn't exist. The learning must take place for
it to be reinforced. If the animal never makes the connections
they can never be reinforced. If the classification is never
made the classifier can never be reinforced.

> This power to correct makes a "good" guess for situations never
> seen is the one key missing piece in general reinforcement
> learning systems. How it does it is easy to understand in theory
> - it simply needs a system that automatically creates a closeness
> function and produces an answer which is some type of merging and
> selecting, from the situations it has seen. But how you do this
> so that a generic system of measuring "closeness" (one not hand
> tuned to the application like it was in TD-Gammon), to do a good
> job is the hard question that has not been well answered.


I think there are a lot of things that have not been well answered.

There may be no generic measure of closeness; it may all be relative
to the mechanisms doing the classifying.

When you say two things are similar they are only similar with
respect to some mechanism. And different mechanisms will classify
things differently. If you shake some dirt through a sieve it will
classify some particles as big and another lot as small but the
threshold value is sieve dependent.

With a visual input you shake your pixels through some "sieve"
and hopefully out will come all the "objects" in the image. But
what constitutes an "object" is also sieve dependent.
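
A trivial sketch of that sieve-dependence - the same particles, two
different mesh sizes, two different big/small classifications (the
numbers are arbitrary):

# Same "dirt", two different sieves: the big/small split depends
# entirely on the mesh size, not on the particles themselves.

particles = [0.5, 1.2, 2.0, 3.7, 5.1, 8.4]   # arbitrary particle sizes

def sieve(sizes, mesh):
    small = [s for s in sizes if s <= mesh]
    big = [s for s in sizes if s > mesh]
    return small, big

print(sieve(particles, mesh=2.0))   # ([0.5, 1.2, 2.0], [3.7, 5.1, 8.4])
print(sieve(particles, mesh=5.0))   # ([0.5, 1.2, 2.0, 3.7], [5.1, 8.4])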

An example is my target program that allows a target pattern to
fall through but excludes most other things. Of course some other
things might fall through; no pattern recognizer is perfect. Even
we make classification mistakes if we are given marginal data.

What I would suggest is evolution-generated random neural sieves
which made random classification rules which in turn were linked
to random classes of behavior generators, all of which either
enhanced or reduced the chances of the animal's reproductive success.

Human intellectual power may not be the result of a generic learning
system but rather of a set of innate learning instincts. Just as
a young migratory bird has the instinct to learn the pattern of stars
relative to the point around which they appear to rotate each night,
we have the language learning instinct and the reasoning instinct.


--
JC

Michael Olea

unread,
Jul 17, 2006, 1:15:21 AM7/17/06
to
J.A. Legris wrote:

> Glen M. Sizemore wrote:
>> "J.A. Legris" <jale...@sympatico.ca> wrote in message
>> news:1153070931....@m73g2000cwd.googlegroups.com...
>> >
>> > Jim Bromer wrote:
>> >>
>> >> Imagine a complicated production process that had all kinds of sensors
>> >> and alarms. ...

>> > What you've described so far sounds like the Bayesian model that
>> > Michael Olea has been describing, where an estimate of the posterior
>> > probability of an event is updated afer each observation of the
>> > evidence.

>> This strikes me as enormously charitable. It seems to me that he has said
>> little else than:

>> 1.) We may be able to predict some events if we have access to some
>> part of what has happened.

>> 2.) We should build a machine that does that.

> Maybe I should have said "sounds consistent with" instead of "sounds
> like", but what grabbed me was the idea that his AI should get
> incrementally better at making predictions with repeated exposures to
> informative data. Bayesian probability suggests a "machine" for
> carrying this out.

A little context:

Bromer on Bayesian Inference:

"Bayesian Networks have been around for at least 15 years. If the use
of a Bayesian Network had been the key to solving the mysteries of
higher intelligence, then 15 years of efforts in a multi-trillion
dollar industry whose products are used as fundamental tools of
science, technology, education, business, recreation and finance should
have been sufficient to prove it."

This may offer incidental insight into Bromer's grasp of the syllogism form.

Later he writes:

"Suppose the most likely probability at some Bayesian node is .6. That
means any likely first selection is going to have at least a .4 probability
of being wrong. Thus, the analysis of the most probable context is
going to lead to an error 4 out of 10 times. The problem with this is
that under this case the decision network is no longer operating under
the principle of automated rationality, since it has come to an invalid
conclusion. So the basis of the strongest reason for using a Bayesian
method is, in many actual cases, going to prove to be invalid."

This may offer incidental insight into Bromer's grasp of decision making
under conditions of uncertainty, not to mention Bromer's tenuous grip on
coherence, let alone any notion he may have of equilibrium distributions
over conditionally independent random variables.

Bromer on Behaviorism:

"Since I am not an expert in Behaviorism, I would be at a disadvantage."

Followed shortly by:

"I feel that Behaviorism produced some insights of value, but it was
severely limited by its methodological constraints..."

This may offer incidental insight into Bromer's willingness to pontificate
on things about which he knows little, if anything.

Bromer on Information Theory:

"I originally was only criticizing the wording of the definition of
"the representational capacity of a string" that someone gave in
another discussion group, but when I began to examine the problem more
closely I was surprised by the subtleties and complications of the concept."

That "someone" was me. Bromer had described Shannon's information theory as
simplistic. He argued that he could pack an arbitrary amount of information
into a single bit. A bit string one bit long could convey more than two
messages, he claimed. His proof was this: suppose I send the bit on a
yellow piece of paper, decode that into two messages. Now suppose I send
the bit on a white piece of paper, decode that into two other messages.
See, that one bit has sent 4 messages!

This may offer incidental insight into Bromer's grasp of information theory.

Bromer on Computational complexity:

NP-complete theory is all wrong, because Bromer has written a linear time
algorithm to solve the traveling salesman problem. Ok, there are some cases
in which it does not work, but really, with a few fixes, it should work.

-- Michael



Curt Welch

unread,
Jul 17, 2006, 4:29:54 AM7/17/06
to
"JGCASEY" <jgkj...@yahoo.com.au> wrote:
> Curt Welch wrote:
> > "feedbackdroid" <feedba...@yahoo.com> wrote:
> > > Can something be
> > > reinforced that you've never seen or done before?
> >
> > Of course it can. A good reinforcement learning machine is
> > shaping classes of behaviors as it learns. It's not learning
> > specific reactions. It does it by shaping the operation of a
> > classifier as it learns. All possible stimulus inputs then
> > are guaranteed to fall into some class so that the system will
> > always have an "answer" as to how to respond. The answer will
> > be based on the reinforcement learning systems evaluation of
> > what class the current situation falls into, and on the systems
> > past experience with other events that might have been different,
> > but yet fall into the same classifications.
>
> But the problem is what kind of system is capable of classifying
> inputs in a useful way in the first place to be reinforced?

True. That's what I'm working on. What is the correct general way to
classify? That's the question.

The type of work that Jeff Hawkins is funding with his invariant
representation ideas is part of that answer.  It's what I was posting about
just recently, asking for algorithms that can transform signals and remove
correlations without removing other information.  This is all key to
answering how you would build a useful general-purpose classifier system.
The invariant signals such a system would generate are the classifications
that are needed to drive the reinforcement learning.
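
One standard way to "remove correlations without removing other
information" is an invertible linear transform such as PCA whitening.  A
minimal sketch, assuming nothing about Hawkins' actual algorithms (the
data here is made up):

# PCA whitening: decorrelate a set of signals with an invertible linear
# transform, so correlations are removed but no information is lost
# (the original data can be recovered by inverting the transform).

import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=1000)
x = np.stack([a, 0.8 * a + 0.2 * rng.normal(size=1000)])  # correlated pair

cov = np.cov(x)
eigvals, eigvecs = np.linalg.eigh(cov)
whiten = np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T      # whitening matrix
y = whiten @ x

print(np.round(np.cov(y), 3))        # ~identity: correlations removed
x_back = np.linalg.inv(whiten) @ y   # invertible: nothing was thrown away
print(np.allclose(x, x_back))        # True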

> > You can see one implementation of this in action in TD-Gammon.
> > Each move which gets reinforced shapes the weights of the neural
> > network which causes many other similar moves to be reinforced
> > at the same time. It doesn't have to see every move, to be able
> > to make a good "guess" at how to respond to a move. It has a
> > good (for Backgammon) system for classifying moves into response
> > classes so that it can successfully merge it's learning from
> > other moves, to make a good guess at how to play a position it
> > has never seen before.
>
> So your problem as you allude to below is how to make a system
> that can learn a good scheme for classifying whatever is useful
> for it to classify.

Right. But there are generic answers, as I talked about above, that are
always useful.

> Adjusting the values of the parameters of some learning scheme
> may work fine for some programmer invented scheme but it doesn't
> represent open ended general purpose scheme learning.

But the type of ideas above do represent open ended learning schemes.

> You talk about reinforcement learning but you cannot reinforce
> something that doesn't exist. The learning must take place for
> it to be reinforced.

I have no clue what that means. The act of reinforcement is the learning.
To me, you just wrote, "the reinforcement must take place for the
reinforcement to take place" ???

> If the animal never makes the connections
> they can never be reinforced. If the classification is never
> made the classifier can never be reinforced.

I'm not sure what you are thinking.  My pulse sorting net already shows
exactly how this type of system can work.  Every node is already a
classifier.  They all function as classifiers from the beginning.  The only
behavior of the network is a classification function.  Each pulse, as it
enters the net, is classified each time it is sorted by a node into one of
the two output paths.  The net as a whole is acting as a classifier as it
sorts an input pulse, through some selected path, to some final output.

What such a system then learns, is how to adjust the boundary conditions of
all these classifiers.

So, when you say "if the animal never learns the connection", that makes no
sense for this type of system.  Every behavior is already a "connection"
from one classification to the next.  The network is nothing but
classification connections.

This type of system never has to learn a new classification.  It's born
with all the classifications it can ever make.  Instead, it simply must
learn to adjust the ones it already has, to shape them into the most useful
form.  This prevents it from ever having to find a new classification, like
finding a needle in a haystack.

The other issue is that all sorting ends up being probabilistic in nature.
It's all "fuzzy" sorting because in no case will a flow be identified and
all its pulses sorted down the exact same path.  Instead, low-level noise in
all the signals will cause pulses from a single source to fan out in a tree
fashion.  This is the default starting behavior.  Learning will tend to
cluster the pulses where they are best used, but some pulses will still go
every which way - just as in the 90/10 behavior selection Michael and I
just wrote about.

This means that if there is a different clustering that works better, the
pulses sorted down those alternate nodes will cause that alternate path to
be reinforced, and in time it will become the 90 path instead of the 10
path.

All of the above already works in my pulse sorting network, so there's no
doubt about what this allows.

But there's a huge "but" lurking here.  The question is whether you can
find a generic classification technique that works like my current pulse
sorting network but has the power to create any type of classification
needed for some domain (with the interesting domain being full human
behavior).

I had high hopes for the pulse sorting you already know about.  But I now
feel it's close to the right concept, but not quite right.  I don't believe
it was correctly creating invariant representations.  I believe a system
that correctly creates signals which extract invariant representations from
the combined sensory data, and which can otherwise operate in a fashion
similar to my current pulse sorting nodes, will produce an extremely strong
and extremely generic reinforcement learning system.

> > This power to correct makes a "good" guess for situations never
> > seen is the one key missing piece in general reinforcement
> > learning systems. How it does it is easy to understand in theory
> > - it simply needs a system that automatically creates a closeness
> > function and produces an answer which is some type of merging and
> > selecting, from the situations it has seen. But how you do this
> > so that a generic system of measuring "closeness" (one not hand
> > tuned to the application like it was in TD-Gammon), to do a good
> > job is the hard question that has not been well answered.
>
> I think there are a lot of things that have not been well answered.
>
> There may be no generic measuring of closeness, it may all be relative
> to the mechanisms doing the classifying.

I think a system that uses the correlations in the signals to extract
invariant representations will create the generic measure of closeness
that is needed.  The measure will effectively be given by the number of
common invariant signals the two states share.
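
A bare-bones sketch of that kind of closeness measure - two states
represented as sets of active "invariant" features, scored by how much
they overlap.  The feature names here are invented:

# Closeness measured by the fraction of invariant features two states share.

def closeness(features_a, features_b):
    shared = features_a & features_b
    return len(shared) / len(features_a | features_b)   # Jaccard overlap

state_1 = {"red", "round", "small", "moving"}
state_2 = {"red", "round", "large", "still"}
state_3 = {"blue", "square", "large", "still"}

print(closeness(state_1, state_2))   # 0.33... : fairly close
print(closeness(state_1, state_3))   # 0.0     : nothing in common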

Think of my current pulse sorting network, but where every signal internal
to the network, was a separate invariant representation of some aspect of
the current sensory signals. They would all be extracted features of the
stimulus signals, and would represent the current state of the environment.

My current pulse sorting system already does the same thing, but I think
it's just using the wrong definition of how it classifies so it's not
creating the correct set of "features".

> When you say two things are similar they are only similar with
> respect to some mechanism. And different mechanisms will classify
> things differently. If you shake some dirt through a sieve it will
> classify some particles as big and another lot as small but the
> threshold value is sieve dependent.

Yes.  That's the important question.  Is one mechanism better than all
others, or do we simply have to hand-tune every mechanism to fit the
problem at hand?  Evolution surely has the power to hand-tune us to our
environment if that is what was needed.

But I think that a system that creates invariant signals by removing
correlations (duplicate information) from the signals is a clear win in
terms of how it will improve the quality of the learning.  Is it good
enough to be a strong generic system?  I don't know.

> With a visual input you shake your pixels through some "sieve"
> and hopefully out will come all the "objects" in the image. But
> what constitutes an "object" is also sieve dependent.

I don't believe it is.  I believe objects are defined by the correlations
they create in the data.  I've always believed this is how our brain works.
We learn to see things as objects because of the fact that they create
correlated effects in our sensory data.  This is something that happens in
our brain without our being consciously aware it's happening.  We just
see things as objects, and have no clue why they seem to be "objects".  We
just take it as fact that they are objects - until you sit down and try to
write image parsing code and find that the definition of what makes an
"object" is not obvious at all.

Networks that extract invariant representations based on correlations
present in the data will be extracting the one and only correct definition
of what an object is.

If you remember my arguments as to why man has always seen mental activity
as separate from physical activity, it was based on this same belief of
mine.  We see the sensory data that we call "mental activity" as a
different "object" from the physical objects because there is NO
CORRELATION in the data between physical sensory data and mental sensory
data.  That is why the brain ended up classifying all our mental activity
as a different type of object from all our physical activity.  They were
NOT CORRELATED.

And why were there no correlations?  Because when we have thoughts, it
makes no sound, no smell, no flash of light, no tactile sensation.  We
can't see, hear, smell, or touch our thoughts.  There is NO CORRELATION in
the sensory data streams.  And, with no correlations, our object
classifier can't create a single invariant relationship by extracting the
correlations.  So, net result, the brain tells us that our mental activity
is a different object from all physical activity.

> An example is my target program that allows a target pattern to
> fall through but excludes most other things. Of course some other
> things might fall through, no pattern recognizer is perfect, even
> we make classification mistakes if we are given marginal data.

Right.  But if you kept exposing the same target to one of these
classification networks, it would form an invariant representation of the
target as a single object.  That is because the characters that make up
the target create correlations in the various sensory input signals.  If
pixel X from the right side of your target is turned on, then pixel Y from
the left side is also turned on.  This is a correlation condition that
indicates the "target" is present when it happens.  It's that correlation a
generic network could notice, and translate into the "target" invariant
representation.

This is a valid generic technique that applies to all types of sensory
data.
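
A crude sketch of the pixel X / pixel Y point - count how often pixel
pairs are on together across many presentations, and the strongly
correlated pair stands out as one "object".  The toy data is invented:

# Toy version of "objects are defined by the correlations they create":
# count how often pairs of pixels are on together across many frames.

import itertools
import random

random.seed(1)

def frame():
    """Half the time the 'target' appears: pixels 0 and 3 turn on together."""
    target_present = random.random() < 0.5
    pixels = [1 if random.random() < 0.2 else 0 for _ in range(6)]
    if target_present:
        pixels[0] = pixels[3] = 1
    return pixels

frames = [frame() for _ in range(5000)]

def co_occurrence(i, j):
    both = sum(1 for f in frames if f[i] and f[j])
    return both / len(frames)

pairs = sorted(((co_occurrence(i, j), i, j)
                for i, j in itertools.combinations(range(6), 2)),
               reverse=True)
print(pairs[0])   # the (0, 3) pair stands out -- the "target" correlation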

> What I would suggest is evolution generated random neural sieves
> which made random classification rules which in turn were linked
> to random classes of behavior generators all of which either
> enhanced of reduced the chances of the animals reproductive success.

If that is the only solution possible, then that would be a good guess.
But we have the correlations in the data to work with, and so did
evolution.

> Human intellectual power may not be the result of a generic learning
> system but rather as a set of innate learning instincts.

Yeah, it could be.  But, as I have pointed out countless times, humans
have GENERIC LEARNING POWERS.  This is a simple proven fact.  And the only
way to explain it is to build a generic learning system.  Most of what we
learn, in our world today, to get by could not have been done with innate
learning instincts shaped by evolution.  Evolution could not have shaped a
generic "learn to play chess" sieve to allow us to correctly see "chess
game patterns".  It could not have also created a "go learning sieve", and
a "bike riding sieve" for learning to recognize the correct response
classes for balancing a bike.

No matter how many specialized systems we might also possess, created for
us by evolution, we know for a fact that humans have generic learning
skills that no machine has ever equaled.  So we know for a fact there must
exist stronger generic solutions that we have not yet found.  Our best
generic learning systems can't come close to touching a human yet.  Humans
use one, and only one, learning system to learn to play chess, and to play
go.  Yet no one has created one learning system on a computer that plays
both games well.  They haven't even created a hand-optimized learning
system that plays go anywhere near the best human players.  So we know
there are generic learning systems stronger than anything we have yet
created - and I know that we have no chance of creating human-level AI
until we first create stronger generic learning systems.

> Just as
> a young migratory bird has the instinct to learn the pattern of stars
> relative to the point around they appear to rotate each night we
> have the language learning instinct and the reasoning instinct.

I think it's more likely that we have strong generic learning systems that
have been optimized to give us all our various specialized learning skills
in areas like vision, sound, language, 3D spatial awareness, hand-eye
coordination, etc.  We have different chunks of our brain, wired into a
fixed topology and tuned for each task, that allow the skill levels we
have in each area of the various sensory combinations.

And, if you were to build a good invariant signal abstraction system, it
would learn to recognize your targets correctly without you having to
hand-code a recognizer for the purpose.  It would learn it after you
exposed your system to a lot of targets of the same type.  It would learn
it by building a hierarchy of features that were common to all your
targets.  And it would do a good job of recognizing patterns that you
thought were close to the same type of target, simply because they shared
a lot of features in common with your target - simply because your brain
is already using that type of definition of "closeness" for everything it
learns.

This definition of "closeness", I believe, is a big thing that Behaviorism
has left unanswered.  The question of how a rat learns to see a light
as a single stimulus.  How does it keep from getting a red light
confused with a red wall of the cage, for example?  Why doesn't it see them
both as "a red object", instead of seeing them as different?

There's probably work done in this area that I am not aware of (simply
because there is so much work done I'm not aware of) - but as far as I
know, Behaviorism and other fields have not told us how to build a strong
generic closeness classifier that correctly mimics the ones used by
animals.  And without a strong "closeness" classifier, reinforcement
learning systems will always be weak as hell.

But it's clear that the way it does work is by recognizing the
correlations that exist between sensory signals and extracting those
correlations as invariant signals.  The fine details of how to do this
correctly, however, I don't yet fully see.  But I'm getting close.

bob the builder

unread,
Jul 17, 2006, 5:52:56 AM7/17/06
to

Michael Olea wrote:

> A little context:
> ...


> Bromer on Information Theory:
>
> "I originally was only criticizing the wording of the definition of
> "the representational capacity of a string" that someone gave in
> another discussion group, but when I began to examine the problem more
> closely I was surprised by the subtleties and complications of the concept."
>
> That "someone" was me. Bromer had described Shannon's information theory as
> simplistic. He argued that he could pack an arbitrary amount of information
> into a single bit. A bit string one bit long could convey more than two
> messages, he claimed. His proof was this: suppose I send the bit on a
> yellow piece of paper, decode that into two messgaes. Now suppose I send
> the bit on a white piece of paper, decode that into two other messgaes.
> See, that one bit has sent 4 messages!
>
> This may offer incidental insight into Bromer's grasp of information theory.

LOL :) Well, he wasn't held back by great amounts of knowledge there...
He should play poker. With his decoding skills he could make a fortune.

> Bromer on Computational complexity:
>
> NP-complete theory is all wrong, because Bromer has written a linear time
> algorithm to solve the traveling salesman problem. Ok, there are some cases
> in which it does not work, but really, with a few fixes, it should work.

Yes, always those damned few cases. Every first-year computer science
student comes up with a solution that turns NP-complete problems into
P problems. If someone actually succeeded he would be more famous than
Turing and von Neumann together. A hundred years from now there will
still be first-years trying to solve it.

J.A. Legris

unread,
Jul 17, 2006, 7:41:31 AM7/17/06
to

Michael Olea wrote:
> J.A. Legris wrote:
>
> > Glen M. Sizemore wrote:
> >> "J.A. Legris" <jale...@sympatico.ca> wrote in message
> >> news:1153070931....@m73g2000cwd.googlegroups.com...
> >> >
> >> > Jim Bromer wrote:
> >> >>
> >> >> Imagine a complicated production process that had all kinds of sensors
> >> >> and alarms. ...
>
> >> > What you've described so far sounds like the Bayesian model that
> >> > Michael Olea has been describing, where an estimate of the posterior
> >> > probability of an event is updated afer each observation of the
> >> > evidence.
>
> >> This strikes me as enormously charitable. It seems to me that he has said
> >> little else than:
>
> >> 1.) We may be able to predict some events if we have access to some
> >> part of what has happened.
>
> >> 2.) We should build a machine that does that.
>
> > Maybe I should have said "sounds consistent with" instead of "sounds
> > like", but what grabbed me was the idea that his AI should get
> > incrementally better at making predictions with repeated exposures to
> > informative data. Bayesian probability suggests a "machine" for
> > carrying this out.
>
> A little context:
>
> Bromer on Bayesian Inference:
[...]
> Bromer on Behaviorism:
[...]
> Bromer on Information Theory:
[...]
> Bromer on Computational complexity:
[...]

Gak! I've been Zicked.

--
Joe Legris

JGCASEY

unread,
Jul 17, 2006, 8:28:47 AM7/17/06
to

Curt Welch wrote:
> "JGCASEY" <jgkj...@yahoo.com.au> wrote:

> > You talk about reinforcement learning but you cannot reinforce
> > something that doesn't exist. The learning must take place for
> > it to be reinforced.
>
> I have no clue what that means. The act of reinforcement is the
> learning. To me, you just wrote, "the reinforcement must take
> place for the reinforcement to take place" ???

Not sure what it means myself now that I have reread it. I think what
I had in mind was that whatever it was that was being reinforced had to
exist in the first place. In other words, there must be something
to be reinforced before it can be reinforced.

You can't reinforce a dam that doesn't exist. Reinforcement may be
required for the dam to persist but it doesn't require reinforcement
to exist.

Thus I would say what you see as existing is behaviors and thus
perhaps what you mean by reinforcement learning is the process of
reinforcing behaviors?

Your idea, I think, is to reinforce the behaviors of a generic
input-dependent behaviour-generating machine whenever you deem those
behaviors to be "intelligent", with the hope it will persist in
producing such behaviors.

Learning would mean some behaviors are becoming more likely than
other behaviors? To learn a behavior is to make it more likely.
But of course that doesn't answer what makes a behavior intelligent,
for you can learn to produce unintelligent behaviours as well.


--
JC

feedbackdroid

unread,
Jul 17, 2006, 11:29:23 AM7/17/06
to


Yes, exactly. This is the sort of "other mechanism" I was alluding to
in my previous post. In the real world, it takes more than just a naive
learning device to crack the nut of intelligence. 50 years of AI and
NN research shows this.

feedbackdroid

unread,
Jul 17, 2006, 11:48:47 AM7/17/06
to

Curt Welch wrote:
> "feedbackdroid" <feedba...@yahoo.com> wrote:
> > Can something be
> > reinforced that you've never seen or done before?
>
> Of course it can. A good reinforcement learning machine is shaping classes
> of behaviors as it learns. It's not learning specific reactions. It does
> it by shaping the operation of a classifier as it learns. All possible
> stimulus inputs then are guaranteed to fall into some class so that the
> system will always have an "answer" as to how to respond. The answer will
> be based on the reinforcement learning systems evaluation of what class the
> current situation falls into, and on the systems past experience with other
> events that might have been different, but yet fall into the same
> classifications.
>


I think you're mixing things here. If your trained machine receives an
input it has "never" seen before, and which is adequately far removed
from the centroid of your training set, it will produce "some" response,
but it's unlikely to produce the "correct" response. OTOH, if the "novel"
input is adequately close to one of your training set prototypes, then it
really isn't novel.
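
A quick sketch of that distinction - measuring how far a new input sits
from the centroid of the training set before trusting the trained
response (the numbers are made up):

# If a new input is far from everything seen in training, the learned
# response is a guess; if it's close to a training prototype, it was
# never really "novel" to begin with.

import math

training_set = [(1.0, 1.2), (0.9, 1.1), (1.1, 0.8), (1.0, 1.0)]

centroid = tuple(sum(c) / len(training_set) for c in zip(*training_set))

def distance_from_centroid(x):
    return math.dist(x, centroid)

print(distance_from_centroid((1.05, 1.0)))   # small: effectively familiar
print(distance_from_centroid((9.0, -4.0)))   # large: genuinely novel input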


>
> You can see one implementation of this in action in TD-Gammon. Each move
> which gets reinforced shapes the weights of the neural network which causes
> many other similar moves to be reinforced at the same time. It doesn't
> have to see every move, to be able to make a good "guess" at how to respond
> to a move. It has a good (for Backgammon) system for classifying moves
> into response classes so that it can successfully merge it's learning from
> other moves, to make a good guess at how to play a position it has never
> seen before.
>


This makes some sense, but playing Backgammon, or learning the rules for
other games, is not general intelligence.


>
> This power to correct make a "good" guess for situations never seen is the
> one key missing piece in general reinforcement learning systems. How it
> does it is easy to understand in theory - it simply needs a system that
> automatically creates a closeness function and produces an answer which is
> some type of merging and selecting, from the situations it has seen.
>


Merging and selecting. Yes. This is one of the "other mechanisms",
besides just the basic learning device, that I alluded to last time.
Additional structure, of yet-unknown variety.

> But
> how you do this so that a generic system of measuring "closeness" (one not
> hand tuned to the application like it was in TD-Gammon), to do a good job
> is the hard question that has not been well answered.
>
> > Also, there are some neural nets that do "1-pass" learning. Is this
> > reinforcement?
> >
> > Also, Edelman would probably say something like behaviors are
> > selected for via internal mechanisms. IOW, any given stimulus might
> > elicit any #of potential behavioral responses, but only one of these
> > ends up being selected for execution. Certainly this happens when
> > you search for the proper word to stick into a sentence. Internally,
> > many words are filtered past before one is finally spoken. And then
> > of course you have the option to stop saying the word even while it's
> > being spoken, if it's not the right selection. Plus, there are multiple
> > options for how the word is spoken, emphasis, inflection, etc.
> >
> > Reinforcement learning is only part of the system.
>
> And where is your evidence to show that all those "options" are not
> selected for by the same low level reinforcement learning system?
>


Basically, in the observation that 50 years of creating naive learning
devices hasn't solved the problem. Many people, such as Grossberg,
have realized this and have tried to put various forms of additional
structure into their systems, but so far a good general solution hasn't
been found.

Lester Zick

unread,
Jul 17, 2006, 12:49:20 PM7/17/06
to
On 17 Jul 2006 04:41:31 -0700, "J.A. Legris" <jale...@sympatico.ca>
wrote:

You should be so lucky.

~v~~

J.A. Legris

unread,
Jul 17, 2006, 1:33:32 PM7/17/06
to

zick, v.t., (zik). provide an apparent explanation or argument using
counterfeit terminology.

Now don't you go zickin' me again in the same thread. Once zicked,
twice shy.

--
Joe Legris

feedbackdroid

unread,
Jul 17, 2006, 1:35:39 PM7/17/06
to

Michael Olea wrote:

...........


>
> Bromer on Computational complexity:
>
> NP-complete theory is all wrong, because Bromer has written a linear time
> algorithm to solve the traveling salesman problem. Ok, there are some cases
> in which it does not work, but really, with a few fixes, it should work.
>


Speaking of which, it's amazing how badly people [i.e., computerists]
tend to get side-tracked into rote behavioral approaches as regards the
TSP. Meaning, doing it like everybody else.

OTOH, if people follow nature's approach of evolving "good-enough"
rather than "optimal" processes/organisms, the solution CAN be made
many times simpler.

To wit, a 20-city TSP involves something like 10^18 different possible
routes.

the computerist approach:
Wow, impressive! Let's spend our lives trying to find the "optimal"
route out of all that. Let's invent NP-mathematics too.

nature's approach:
Ugh. It would take 1000s of millennia to solve this problem using any
brute-force, general-search approach. How can we simplify the problem?
Maybe a non-optimal but easily-computed solution is good-enough for
what's really important ... survival in the real world. So, here it is
...

If you (a) first partition city-space into regions of grouped cities,
and then (b) solve the problem for each group, and then (c) adopt a
strategy for going between the groups, then (d) the problem is solvable
in the next couple of seconds [more or less] rather than the rest of
eternity [more or less]. Note that this partitioning is basically the
issue of "modularization", which enormously reduces the size of the
search space for any problem.

For the 20-city TSP, if we break the cities into 5 groups of 4 nearby
cities each, then the total #paths to evaluate reduces to just 120. Down
from 10^18. Now, that's really a wow.
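
A rough sketch of that divide-and-conquer idea - partition the cities
into small groups of neighbors, brute-force each group, then chain the
groups with a simple rule. The grouping rule and the between-group
strategy here are simplistic stand-ins, not anything claimed above:

# "Good-enough" TSP: split the cities into small groups of neighbors,
# brute-force the route inside each group, then visit the groups in a
# simple left-to-right order.  Not optimal, but tiny search spaces.

import itertools
import math
import random

random.seed(0)
cities = [(random.random() * 100, random.random() * 100) for _ in range(20)]

def route_length(route):
    return sum(math.dist(route[i], route[i + 1]) for i in range(len(route) - 1))

# (a) crude partition: sort by x coordinate and slice into 5 groups of 4
groups = [sorted(cities)[i:i + 4] for i in range(0, 20, 4)]

# (b) brute force inside each group: 4! = 24 orderings (12 ignoring direction)
def best_order(group):
    return min(itertools.permutations(group), key=route_length)

# (c) simple strategy between the groups: visit them left to right,
#     chaining each group's best internal order
tour = []
for g in groups:
    tour.extend(best_order(g))

print(round(route_length(tour), 1))   # good-enough, found almost instantly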

There may be more-optimal solutions, but then this one only takes a
fraction of a second to execute. While the computerist's machine is
endlessly chugging away, nature's salesman has run his route and is
already back home having coffee, and tallying up his sales.
Good-enough.

There must be a moral here.

Curt Welch

unread,
Jul 17, 2006, 3:52:42 PM7/17/06
to
"JGCASEY" <jgkj...@yahoo.com.au> wrote:
> Curt Welch wrote:
> > "JGCASEY" <jgkj...@yahoo.com.au> wrote:
>
> > > You talk about reinforcement learning but you cannot reinforce
> > > something that doesn't exist. The learning must take place for
> > > it to be reinforced.
> >
> > I have no clue what that means. The act of reinforcement is the
> > learning. To me, you just wrote, "the reinforcement must take
> > place for the reinforcement to take place" ???
>
> Not sure what it means myself now I have reread it. I think what
> I had in mind was whatever it was that was being reinforced had to
> exist in the first place. In other words there must be something
> to be reinforced before it can be reinforced.

Yeah, and after reading the rest of your post I think I grasp what you were
getting at.

> You can't reinforce a dam that doesn't exist. Reinforcement may be
> required for the dam to persist but it doesn't require reinforcement
> to exist.
>
> Thus I would say what you see as existing is behaviors and thus
> perhaps what you mean by reinforcement learning is the process of
> reinforcing behaviors?
>
> Your idea I think is to reinforce the behaviors of a generic input
> dependent behaviour generating machine whenever you deem those
> behaviors to be "intelligent" with the hope it will persist in
> producing such behaviors.

Sure. Of course. Normal ideas of reinforcement. The system finds food by
random chance and the actions leading up to the find are reinforced and
more likely to be repeated in the future.

> Learning would mean some behaviors are becoming more likely than
> other behaviors?

Right. That's the bottom line of it all.

> To learn a behavior is to make it more likely.
> But of course that doesn't answer what makes a behavior intelligent
> for you can learn to produce unintelligent behaviours as well.

Well, my belief in reinforcement learning is so basic to all this that I
actually see reinforcement learning as intelligence. Any behavior that is
learned through reinforcement is intelligent behavior. So it's impossible
to learn unintelligent behavior (from this way of looking at intelligence).

And, of course, if you define intelligence as full human behavior, then
that means everything that humans do is intelligent as well - by
definition.

This of course is not how we use the word intelligence in normal
day-to-day talk.  We say people are being stupid and unintelligent when
they do things that a moment of reasoned thought would show us to be a
poor choice of behavior.  However, this would imply that only logical,
reasoned behavior is intelligent behavior.  And though that's a fine way
to define it for casual conversation, it is not a definition that gets us
very near building human-like machines, because humans aren't driven by
logic and reason at the lowest levels.  They are driven by reinforcement
learning.  Intelligent logic and reason only emerge as part of our
behavior set over time because of their great value to us in producing
long-term rewards.

When you look at intelligence as being reinforcement learning skill, then
you can compare the intelligence of different machines by placing them
into the same environment and seeing which ones produce the most rewards
over some extended period of time.  The better a machine does at adapting
its behavior to the environment, the more intelligent it is - for that
environment.
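
A toy sketch of that comparison - drop two machines into the same simple
environment and total up the rewards each collects over the same number
of trials.  The "environment" and both "machines" are invented stand-ins:

# Compare two machines by total reward collected in the same environment.

import random

random.seed(2)
PAYOFF = [0.2, 0.8]                  # two actions; action 1 pays off more often

def environment(action):
    return 1.0 if random.random() < PAYOFF[action] else 0.0

def random_machine(history):
    return random.randrange(2)

def learning_machine(history):
    # epsilon-greedy: mostly pick the action with the best observed average
    if not history or random.random() < 0.1:
        return random.randrange(2)
    totals, counts = [0.0, 0.0], [1e-9, 1e-9]
    for action, reward in history:
        totals[action] += reward
        counts[action] += 1
    return 0 if totals[0] / counts[0] > totals[1] / counts[1] else 1

def run(machine, trials=1000):
    history, total = [], 0.0
    for _ in range(trials):
        action = machine(history)
        reward = environment(action)
        history.append((action, reward))
        total += reward
    return total

print("random machine  :", run(random_machine))    # around 500 expected
print("learning machine:", run(learning_machine))  # noticeably higher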

Michael Olea

unread,
Jul 17, 2006, 4:19:45 PM7/17/06
to
feedbackdroid wrote:

> Speaking of which, it's amazing how badly people [ie, computerists]
> tend to get side-tracked into rote behavioral approaches regards the
> TSP. Meaning, doing it like everybody else.

Dan, the "engineer", once argued that a wavelet transform could not be
lossless because "any time you add two numbers you lose information". To be
fair, I don't know that Dan has ever actually claimed to be an engineer.

> OTOH, if people follow nature's approach of evolving "good-enough"
> rather than "optimal" processes/organisms, the solution CAN be made
> many times simpler.

What???? You mean there are algorithms that find approximate solutions? Do,
tell. So, would that imply that the study of the run-time complexity of
algorithms should not be limited to the worst case performance of exact
solutions, but include also average run-time, and when exact solutions are,
in a given context, too expensive, the worst case, and expected case of
approximate algorithms, and perhaps even bounds on the difference between
optimal solutions and approximate solutions such an algorithm finds? Wow!
Inform "computerists" everywhere of the stunning news. Also, the allies won
the war - Hitler is dead!!!



> To wit, a 20 city TSP involves something like 10^18 different possible
> routes.

It depends on which variant of the problem. In the case where each city is
to be visited exactly once, and cities can be visited in any order, it is:

N!/2 = 20!/2 = 1,216,451,004,088,320,000



> the computerist approach:
> Wow, impressive! Let's spend our lives trying to find the "optimal"
> route out of all that. Let's invent NP-mathematics too.

Any other group you care to smear while you're at it?

Let's open an introductory text on algorithms and data structures. How many
approximate algorithms for the traveling salesman problem do we find?
There's Prim's algorithm, Kruskal's algorithm, a simple version of
"2-opting", generalized to "k-opting", and of course several pointers to
the literature.

> If you (a) first partition city-space into regions of grouped cities...

Having slandered the study of algorithms and their run-time, Dan sketches a
question-begging "algorithm", offered on the grounds of its alleged
superior run-time, though punting on the question of how to partition
cities into groups, the run-time of constructing that partition, and
ignoring any analysis of bounds on the differences between the solutions
found by this "algorithm" and any optimal solutions. And yet the little
cockroach presumes to lecture "computerists" on their silly rote behavior.

> There must be a moral here.

That you suffer from penis envy? Or is it just a lack of rigor?

-- Michael


J.A. Legris

unread,
Jul 17, 2006, 4:23:50 PM7/17/06
to

The experimental results in the first paper (starting on p.64) are
fascinating. Required reading for behaviourists! Thanks for the links.

--
Joe Legris

JPl

unread,
Jul 17, 2006, 4:31:47 PM7/17/06
to

"feedbackdroid" <feedba...@yahoo.com> wrote in message
news:1153150163.1...@p79g2000cwp.googlegroups.com...

But maybe Curt has a point in saying there must be some, or a few simple
basic RL algorithms that carry the complexities that evolve from them at
large. Maybe I found them: 1) the "protonic" algorithm of protons, 2) the
"neutronic" algorithm of neutrons, and 3) the "electronic" algorithm of
electrons. Duh.


feedbackdroid

unread,
Jul 17, 2006, 5:12:28 PM7/17/06
to

Michael Olea wrote:
> feedbackdroid wrote:
>
> > Speaking of which, it's amazing how badly people [ie, computerists]
> > tend to get side-tracked into rote behavioral approaches regards the
> > TSP. Meaning, doing it like everybody else.
>
> Dan, the "engineer", once argued that a wavelet transform could not be
> lossless because "any time you add two numbers you lose information". To be
> fair, I don't know that Dan has ever actually claimed to be an engineer.
>


Umm, don't recall wavelets specifically, but certainly when you add
2 #'s you lose some information.

If you have a 5, was this the result of adding 1+4, or 2+3? Easy case.
Which is it?

Or, more realistically, 1.1+3.9, or any of an infinite # of other possibilities.


>
> > To wit, a 20 city TSP involves something like 10^18 different possible
> > routes.
>
> It depends on which variant of the problem. In the case where each city is
> to be visited exactly once, and cities can be visited in any order, it is:
>
> N!/2 = 1,216,451,004,088,320,000
>

Hmmm, looks like about 10^18 to me.


>
> > the computerist approach:
> > Wow, impressive! Let's spend our lives trying to find the "optimal"
> > route out of all that. Let's invent NP-mathematics too.
>
> Any other group you care to smear while you're at it?
>


Rote and linear thinkers, maybe? Maybe ducks in a row.
Everybody who sits in the first row in lecture, but never
asks any questions.


>
> > If you (a) first partition city-space into regions of grouped cities...
>
> Having slandered the study of algorithms and their run-time, Dan sketches a
> question-begging "algorithm", offered on the grounds of it's alleged
> superior run-time, though punting on the question of how to partition
> cities into groups, the run-time of consrtucting that partition, and
> ignoring any analysis of bounds on the differences between the solutions
> found by this "algorithm" and any optimal solutions. And yet the little
> cockroach presumes to lecture "computerists" on their silly rote behavior.
>


Awww. Exposing a few raw nerves, are we? You might have that
checked out already .... before it's too late.

Actually, I wasn't punting, I was presenting a general approach. And
when I said grouped cities in part (a), I actually meant to say nearby
cities. The partitioning issue is trivial. Selecting which group to visit
first and last is also trivial. Get pencil and paper and explore.

BTW, got another algorithm that reduces search space from 10^18
to 120? You notice I did say optimality is traded for efficiency.
Do you think nature tried out all of the 10^18-class solutions before
deciding on a 120-class solution? Well, do ya? As regards survival, it's
a LOT better to be quick than to be optimal.

In fact, the more we learn about biology, the more we see that
nature/evolution found all kinds of good short-cut solutions that
worked, and then it CONSERVED them. See Hox genes, and Pax6
genes, for instance.


>
> > There must be a moral here.
>
> That you suffer from penis envy? Or is it just a lack of rigor?
>

I should say calm down, but quite obviously it'll do no good here.

Curt Welch

unread,
Jul 17, 2006, 5:48:40 PM7/17/06
to
"feedbackdroid" <feedba...@yahoo.com> wrote:

> Umm, don't recall wavelets specifically, but certainly when you add
> 2 #'s you lose some information.
>
> If you have a 5, was this the result of adding 1+4, or 2+3? Easy case.
> Which is it?
>
> Or more realistic, 1.1+3.9, or any of an infinite #other possibilities.

Yes, but if you store that lost information somewhere else at the same
time, then it's not lost.  If you compute X=A+B and Y=A-B, then you have
lost information in both operations.  And yet, you can still use X and Y to
recompute A, or B, so nothing was in fact lost if you transform A and B
into X and Y in this way.  What was lost in each operation was saved in
the other.  This is true of all linear transforms where the transformation
matrix is invertible.

There are many known transformations that lose information in their
individual operations, but which collectively manage to retain all the
information (FFTs, lossless compression, etc.).
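
A two-line illustration of that point (this is just the 2-point
sum/difference step, not any particular wavelet library):

# Each of X = A + B and Y = A - B throws information away on its own,
# but the pair is an invertible transform: A and B come back exactly.

def forward(a, b):
    return a + b, a - b                   # X = A + B, Y = A - B

def inverse(x, y):
    return (x + y) / 2, (x - y) / 2       # recover A and B

a, b = 1.1, 3.9
x, y = forward(a, b)
print(inverse(x, y))   # (1.1, 3.9), up to floating-point rounding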

JGCASEY

unread,
Jul 17, 2006, 6:07:10 PM7/17/06
to

JC wrote:

>> To learn a behavior is to make it more likely.
>> But of course that doesn't answer what makes a behavior
>> intelligent for you can learn to produce unintelligent
>> behaviours as well.

Curt Welch wrote:

> Well, my belief in reinforcement learning is so basic to
> all this that I actually see reinforcement learning as
> intelligence. Any behavior that is learned through
> reinforcement is intelligent behavior. So it's impossible
> to learn unintelligent behavior (from this way of looking
> at intelligence).
>
> And, of course, if you define intelligence as full human
> behavior, then that means everything that humans do is
> intelligent as well - by definition.
>
> This of course is not how we use the word intelligence in
> normal day to day talk. We say people are being stupid and
> unintelligent when they do things that a moment of reasoned
> thought would show us to be a poor choice of behavior.

Then I would say that it is the "reasoned thought" behavior
that is being looked for. Do we really need the hanger on
destructive behaviors in a machine just because they served
us well as tribal people? Or is the ability to wipe out
the competition intelligent behavior? When we don't have a
war to fight we create one in the form of football to fill
in that need. Or play computer games where you can go to
war "killing" virtual characters. sure they are both games
but why do they stimulate the pleasure centers in most males?
In human societies i sometimes think perhaps our full
intelligent combined behavior is spread between the sexes.

This would forebode that an intelligent machine would, as an
act of intelligence, think about wiping us out or at least
most of us as it might keep some alive the way we do in zoos.
The typical science fiction scenario might be based on a
reality or intuition about what it means to be intelligent.
Look at the kinds of movie fiction that is most popular for
I suspect its content has a lot to say about our need for
conflict and successful resolution.

Glen M. Sizemore

unread,
Jul 17, 2006, 6:12:03 PM7/17/06
to
I looked over the paper (no, I didn't read it), and my first impression is
that this is not "must" reading for behaviorists. Or rather, it is far less
"must" reading than some of the tutorials on Bayesian analyses of coin
tosses and paper-frog jumps. But let's cut to the quick, Joe. Why do you
think it is "must" reading for behaviorists? Pitch me. After all, you can
argue that I can't be persuaded, but you know that you can get a rise out of
me.

"J.A. Legris" <jale...@sympatico.ca> wrote in message

news:1153167830.1...@i42g2000cwa.googlegroups.com...

JGCASEY

unread,
Jul 17, 2006, 6:13:20 PM7/17/06
to

Curt Welch wrote:
> "feedbackdroid" <feedba...@yahoo.com> wrote:
>
> > Umm, don't recall wavelets specifically, but certainly when you add
> > 2 #'s you lose some information.
> >
> > If you have a 5, was this the result of adding 1+4, or 2+3? Easy case.
> > Which is it?
> >
> > Or more realistic, 1.1+3.9, or any of an infinite #other possibilities.
>
> Yes, but if you store that lost information somewhere else at the same
> time, then it's not lost. If you compute X=A+B and Y=A-B, then you have
> lost information in both operations. But yet, you can still use X and Y to
> recompute A, or B, so nothing was in fact lost if you transform A and B,
> into X and Y in this way. What was lost in each operation was saved, in
> the other. This is true of all linear transforms where the transformation
> matrix is invertible.

Something like the brain does when the what and the where of
visual data go to different parts of the brain to be processed.
The absolute data is of no use for recognizing the what (temporal
lobe), but it is required to determine the spatial relationships of
the what (parietal lobe).

--
JC

Glen M. Sizemore

unread,
Jul 17, 2006, 6:20:44 PM7/17/06
to
Oops. I forgot to say that the paper I looked at was the "Inferring Hidden
Causes" mama.

G.


"Glen M. Sizemore" <gmsiz...@yahoo.com> wrote in message
news:44bc0ae6$0$2491$ed36...@nr1.newsreader.com...

Lester Zick

unread,
Jul 17, 2006, 6:49:02 PM7/17/06
to
On 17 Jul 2006 10:33:32 -0700, "J.A. Legris" <jale...@sympatico.ca>
wrote:

Doubt you'd recognize counterfeit terminology if it bit you in the
ass, Simple Simon.

>Now don't you go zickin' me again in the same thread. Sobald gezickenes
>zweimal schüchternes.

~v~~

Michael Olea

unread,
Jul 17, 2006, 7:01:36 PM7/17/06
to
Curt Welch wrote:

====================================================================
"Also, some people like to use the modern Gabor wavelet, mainly I
think, because it is more limited in space than a fourier grating.
To me this doesn't really mean anything, because you've basically
selected a mathematical shape that matches the cell response, which
is mainly a result of the anatomy. IOW, Gabor is a computational
method, but not anything very profound, IMO."

-- Dan Michaels


"Mathematically, the 2D Gabor function achieves the resolution
limit in the conjoint space only in its complex form. Since a
complex valued 2D Gabor function contains in quadrature projection
an even-symetric cosine component and an odd-symmetric sine
component, Pollen and Ronner's finding that simple cells exist
in quadrature-phase pairs therefore showed that the design of
the cells might indeed be optimal. The fact that the visual
cortical cell has evolved to an optimal design for information
encoding has caused a considerable amount of excitement not
only in the neuroscience community but in the computer science
community as well."

-- Tai Sing Lee [2]

"Besides everything I just wrote, I should reiterate that I think
viewing all these happenings as a "lossless" process is really a
misnomer. What is really going on are successive transforms of the
sensory data. If you have something like

Ce <- Si <- Wi

and the sums of gaussians, how can this possibly be lossless? When
you add 2 #'s you lose information, namely the original values of those
2 #'s"

-- Dan Michaels

"In this paper, we have derived, based on physiological constraints
and the wavelet theory, a family of 2D Gabor wavelets which model the
receptive fields of the simple cells in the brain's primary visual
cortex. By generalizing Daubechies's frame criteria to 2D, we
established the conditions under which a discrete class of continous
Gabor wavelets will provide complete representation of any image. ..."

-- Tai Sing Lee [2]

"... Well, I can't resist making one observation: of course the
transformation

(x,y) -> x+y

discards information. But that is not what is involved in projection onto a
wavelets basis. A better analogy would be:

(x,y) -> (x+y, x-y)

which is lossless. As to why such a transform might be advantageous:

[1] Field. Wavelets, vision and the statistics of natural scenes.
[2] Lee. Image Representation Using 2D Gabor Wavelets."
====================================================================

The above are remarks I made 22 Nov 2005.

All the engineers I know (several, including relatives, friends, and
colleagues) had to pass elementary linear algebra to get their degrees.
Of course, with wavelets and Fourier analysis the vectors are functions,
and the vector spaces are infinite-dimensional function spaces, not much
covered in a first course on linear algebra. They become finite-dimensional
again in the discrete domain of the DFT or the various DWT's.
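
For what it's worth, a minimal numeric sketch of the even/odd quadrature
pair under discussion - a Gaussian envelope modulating cosine and sine
gratings.  The parameters are arbitrary, not taken from Lee's paper:

import numpy as np

def gabor_pair(size=32, sigma=6.0, freq=0.2, theta=0.0):
    """Even-symmetric (cosine) and odd-symmetric (sine) 2D Gabor pair:
    a Gaussian envelope modulating a sinusoidal grating."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]
    xr = x * np.cos(theta) + y * np.sin(theta)        # grating orientation
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    even = envelope * np.cos(2 * np.pi * freq * xr)   # cosine component
    odd = envelope * np.sin(2 * np.pi * freq * xr)    # sine component
    return even, odd

even, odd = gabor_pair()
print(even.shape, odd.shape)   # (32, 32) (32, 32)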

-- Michael

Michael Olea

unread,
Jul 17, 2006, 7:03:10 PM7/17/06
to
J.A. Legris wrote:

> Michael Olea wrote:

>> ... It is a major theme in
>> Allison Gopnik's work:
>>
>> http://ihd.berkeley.edu/gopnik.htm
>>
>> For example:
>>
>> A.Gopnik, C. Glymour, D. Sobel, L. Schulz, T. Kushnir, & D. Danks (2004).
>> A theory of causal learning in children: Causal maps and Bayes nets.
>> Psychological Review, 111, 1, 1-31.
>
>> T. Kushnir, A. Gopnik, L Schulz, & D. Danks. (in press). Inferring hidden
>> causes. Proceedings of the Twenty-Fourth Annual Meeting of the Cognitive
>> Science Society

> The experimental results in the first paper (starting on p.64) are
> fascinating. Required reading for behaviourists! Thanks for the links.

You're welcome.

-- Michael

Lester Zick

unread,
Jul 17, 2006, 7:19:09 PM7/17/06
to
On 17 Jul 2006 10:33:32 -0700, "J.A. Legris" <jale...@sympatico.ca>
wrote:

What's curious in all this is that, with very minor exceptions, those
who routinely post here thoroughly disagree with each other and always
have. We all recognize this, and yet we continue to disagree without
ever reaching any conclusion regarding AI or epistemological and
pedagogical approaches to AI - and yet we continue to post. What is it
anyone expects to happen?

~v~~

Michael Olea

unread,
Jul 17, 2006, 7:19:35 PM7/17/06
to
Glen M. Sizemore wrote:

> I looked over the paper (no, I didn't read it), and my first impression is
> that this is not "must" reading for behaviorists. Or rather, it is far
> less "must" reading than some of the tutorials on Bayesian analyses of
> coin
> tosses and paper-frog jumps. But let's cut to the quick, Joe. Why do you
> think it is "must" reading for behaviorists? Pitch me. After all, you can
> argue that I can't be persuaded, but you know that you can get a rise out
> of me.

He's probably referring to the differing predictions between RW models,
various Bayesian models, and the experimental results.

I have mixed reactions to papers with Gopnik's name on them - interesting
work, but a tendency towards straw-man caricature of alternatives.

-- Michael

Curt Welch

unread,
Jul 17, 2006, 7:14:41 PM7/17/06
to
"feedbackdroid" <feedba...@yahoo.com> wrote:
> Curt Welch wrote:
> > "feedbackdroid" <feedba...@yahoo.com> wrote:
> > > Can something be
> > > reinforced that you've never seen or done before?
> >
> > Of course it can. A good reinforcement learning machine is shaping
> > classes of behaviors as it learns. It's not learning specific
> > reactions. It does it by shaping the operation of a classifier as it
> > learns. All possible stimulus inputs then are guaranteed to fall into
> > some class so that the system will always have an "answer" as to how to
> > respond. The answer will be based on the reinforcement learning
> > systems evaluation of what class the current situation falls into, and
> > on the systems past experience with other events that might have been
> > different, but yet fall into the same classifications.
> >
>
> I think you're mixing things here. If your trained machine receives an
> input it has "never" seen before, and which is adequately far removed
> from the centroid of your training set, it will produce "some"
> response, but it's unlikely to produce the "correct" response. OTOH, if
> the "novel" input is adequately close to one of your training set
> prototypes, then it really isn't novel.

Yes, but you didn't include any distance measure in your question.  You
simply said something that hasn't been seen or done before.  The answer is
that it is always advantageous for the learning system to include an
inherent system of measuring closeness and to use that to guide its
selection of behaviors in situations it has never seen before.  An educated
guess is almost always better than a random guess, even if it produces an
answer which is far from optimal.

A big point about all practical reinforcement learning problems is that
there never is a correct answer. All answers tend instead to be graded on a
scale of quality. Jumping off a cliff and killing yourself is not the
"wrong" answer - it simply produces far less reward than choosing to eat
dinner instead. It's not as bad an answer as slowly peeling your skin off
and feeding it to the pigs until you die. :)

This is what makes it very important in reinforcement learning to use
measures of closeness to select behaviors based on past experience when
presented with a novel stimulus. All real reinforcement problems tend to
have the property that similar behaviors tend to produce similar results in
similar situations. So no matter how novel a new situation is, it's always
wise for the system to try what it believes is the best behavior based on
its best current understanding of similarity.

Its understanding of similarity (its system for measuring closeness)
likewise needs to be trained by past experience. If the system has
learned to ignore the blue light because it makes no difference in the
optimal selection of behaviors in the conditions previously experienced,
then when a new stimulus is seen, far from past experience, assuming
the blue light should still be ignored is a good first guess.

The point is, if you have a fixed number of sensory inputs, then all input
values will have been seen in the past. After a short period of training,
the system is not expected to see anything truly "new". It's only expected
to see new combinations it has not yet seen. But those combinations will
always share traits in common with past combinations, so it should use
those shared traits as a guide to selecting behaviors based on past
training. In a correctly designed system, there will never be truly novel
inputs after a short bit of training.
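
As an aside, the generalization-by-closeness idea described here can be
sketched in a few lines of Python. This is purely illustrative - the memory
structure, the similarity function, and the toy inputs are all invented,
not anything specified in the thread:

    # Hypothetical sketch: generalize to novel inputs via a closeness measure.
    # The agent reuses the action that earned the best similarity-weighted
    # reward among the most similar previously seen inputs.
    import math

    class ExperienceMemory:
        def __init__(self):
            self.episodes = []  # list of (input_vector, action, reward)

        def record(self, x, action, reward):
            self.episodes.append((x, action, reward))

        @staticmethod
        def closeness(a, b):
            # Euclidean distance turned into a similarity score in (0, 1].
            return 1.0 / (1.0 + math.dist(a, b))

        def best_guess(self, x, k=5):
            # Rank stored episodes by similarity to the novel input, then pick
            # the action with the highest similarity-weighted reward among the
            # k closest episodes -- an "educated guess", not a random one.
            if not self.episodes:
                return None
            ranked = sorted(self.episodes,
                            key=lambda e: self.closeness(x, e[0]),
                            reverse=True)[:k]
            totals = {}
            for past_x, action, reward in ranked:
                w = self.closeness(x, past_x)
                totals[action] = totals.get(action, 0.0) + w * reward
            return max(totals, key=totals.get)

    memory = ExperienceMemory()
    memory.record((1.0, 0.0), "press_lever", reward=1.0)
    memory.record((0.9, 0.1), "press_lever", reward=0.8)
    memory.record((0.0, 1.0), "peck_key", reward=0.2)
    print(memory.best_guess((0.95, 0.05)))  # novel input -> "press_lever"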

> > You can see one implementation of this in action in TD-Gammon. Each
> > move which gets reinforced shapes the weights of the neural network
> > which causes many other similar moves to be reinforced at the same
> > time. It doesn't have to see every move, to be able to make a good
> > "guess" at how to respond to a move. It has a good (for Backgammon)
> > system for classifying moves into response classes so that it can
> > successfully merge it's learning from other moves, to make a good guess
> > at how to play a position it has never seen before.
> >
>
> This makes some sense, but playing Backgammon, or learning the rules
> for other games, is not general intelligence.

But it's getting much closer, because TD-Gammon was far more generic than
all the same author's past attempts at creating a backgammon game. So it's
a step in the right direction, and it shows how a program can learn complex
things on its own better than a human can hand-code the same type of
knowledge into a program.
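
For readers unfamiliar with TD-Gammon, the core of temporal-difference
learning is a simple update rule. The toy table-based TD(0) sketch below is
only an illustration; TD-Gammon itself used a neural network and TD(lambda),
which this does not reproduce:

    # Toy TD(0) update: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).
    # TD-Gammon used a neural network and TD(lambda); this table-based toy
    # only shows how each observed transition nudges earlier value estimates.
    def td0_update(V, state, next_state, reward, alpha=0.1, gamma=1.0):
        target = reward + gamma * V.get(next_state, 0.0)
        V[state] = V.get(state, 0.0) + alpha * (target - V.get(state, 0.0))

    V = {}
    trajectory = [("start", "mid", 0.0), ("mid", "win", 1.0)]
    for _ in range(100):             # replay the same little game repeatedly
        for s, s_next, r in trajectory:
            td0_update(V, s, s_next, r)
    print(V)  # credit for the win propagates back toward the opening state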

> > This power to correct make a "good" guess for situations never seen is
> > the one key missing piece in general reinforcement learning systems.
> > How it does it is easy to understand in theory - it simply needs a
> > system that automatically creates a closeness function and produces an
> > answer which is some type of merging and selecting, from the situations
> > it has seen.
> >
>
> Merging and selecting. Yes. This is one of the "other mechanisms",
> besides
> just the basic learning device, that I alluded to last time. Additional
> structure,
> of yet-unknown variety.

Well, it's not unknown to me. I have a good general high level grasp on
exactly what it needs to do, and why, and a bit of how. There's important
details missing from my understanding, but that's not the same thing as
having no clue at all about what is needed. I'd say I have an 80%
understanding of exactly what is needed.

Well, it took 120 years to go from balloons to powered, controlled, sustained
flight. The fact that we haven't gotten strong general learning working in
50 years doesn't prove much to me, especially since I already clearly see
exactly what is missing. It's just not a mystery to me anymore. Only the
implementation details are still a mystery. It's like understanding that
all you need is the correct airframe configuration to create stability,
control, and lift, combined with a power source with the correct power-to-
weight ratio. Knowing this is all you need is not the same as knowing what
the correct configuration is, or knowing how to build a more powerful,
lighter engine. But once you understand what's missing, the rest is no
longer a mystery; it's just a job of directed research to fill in the
missing pieces.

Curt Welch

unread,
Jul 17, 2006, 7:46:16 PM7/17/06
to
Lester Zick <DontB...@nowhere.net> wrote:

> What's curious in all this is that, with very minor exceptions, those
> who routinely post here thoroughly disagree with each other, always
> have, and all recognize this, yet we continue to disagree without
> ever reaching any conclusion regarding ai or epistemological and
> pedagogical approaches to ai, and yet continue to post. What is it
> anyone expects to happen?

I think many of the clusters of agreement manage to move forward. I would
have stopped posting years ago if I didn't feel I was learning something
new in the process. I learn new things even when I argue the same old
points I've argued with the same people in the past. If nothing else, I
believe we all learn how to better defend our positions - and which parts
of our beliefs can't be defended.

J.A. Legris

unread,
Jul 17, 2006, 8:39:51 PM7/17/06
to

I was referring to the first paper: A theory of causal learning in
children: Causal maps and Bayes nets. Read pages 64-71 in particular.

--
Joe Legris

feedbackdroid

unread,
Jul 17, 2006, 10:22:02 PM7/17/06
to

Curt Welch wrote:
> "feedbackdroid" <feedba...@yahoo.com> wrote:
>
> > Umm, don't recall wavelets specifically, but certainly when you add
> > 2 #'s you lose some information.
> >
> > If you have a 5, was this the result of adding 1+4, or 2+3? Easy case.
> > Which is it?
> >
> > Or more realistic, 1.1+3.9, or any of an infinite #other possibilities.
>
> Yes, but if you store that lost information somewhere else at the same
> time, then it's not lost.
>


Well, YEAH, but that wasn't the issue, was it. If you otherwise save
the original data somewheres, then you don't lose it, by definition.


>
> If you compute X=A+B and Y=A-B, then you have
> lost information in both operations. But yet, you can still use X and Y to
> recompute A, or B, so nothing was in fact lost if you transform A and B,
> into X and Y in this way. What was lost in each operation was saved, in
> the other.
>


This works because you have "2" output equations. With 2 unknowns
and 2 equations you can work backwards. With 2 unknowns and
only 1 equation, not so. Try ONLY one ... X=A+B
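
The two-equations point is easy to check numerically (a throwaway Python
sketch, not code from anyone in the thread):

    # With both X = A + B and Y = A - B available, A and B are recoverable,
    # so the pair of individually "lossy" outputs is collectively lossless.
    def forward(a, b):
        return a + b, a - b

    def inverse(x, y):
        return (x + y) / 2.0, (x - y) / 2.0

    a, b = 1.1, 3.9
    x, y = forward(a, b)
    print(inverse(x, y))  # ~(1.1, 3.9); from x alone they are unrecoverable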

Also, I don't recall exactly, but it's possible the previous discussion
originally came up in the context of looking at the output of a "black
box", and trying to guess the nature of the circuitry contained
within the box. I do seem to recall JAG asking about this. And I
seem to recall saying something to the effect that you can
theoretically have an infinite # of different circuits in the box which
could all produce the same output, and the only way to tell what it
is is to open the box and look inside. In brain research, this is called
neurophysiology, rather than behaviorism.


>
>This is true of all linear transforms where the transformation
> matrix is invertible.
>
> There are many known transformations that loose information in their
> individual operations, but which collectively, manage to retain all the
> information. (FFTs, lossless compression, etc).
>

Remember, however, with FFT you transform input real+imaginary
data arrays into output magnitude+phase arrays.

You cannot recover the original data correctly by simply inverse-
transforming just the output magnitude array. You also need the
phase array to get the correct inverse transform.
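
Dan's point about needing both arrays is easy to demonstrate, assuming
numpy; the signal values here are arbitrary:

    import numpy as np

    x = np.array([1.0, 4.0, 2.0, 3.0])
    X = np.fft.fft(x)
    mag, phase = np.abs(X), np.angle(X)

    # Magnitude and phase together recover the original signal...
    x_full = np.fft.ifft(mag * np.exp(1j * phase)).real
    # ...but inverse-transforming the magnitude array alone does not.
    x_mag_only = np.fft.ifft(mag).real

    print(np.allclose(x_full, x))      # True
    print(np.allclose(x_mag_only, x))  # False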

J.A. Legris

unread,
Jul 17, 2006, 10:26:17 PM7/17/06
to

Lester Zick wrote:

>
> What's curious in all this is that, with very minor exceptions, those
> who routinely post here thoroughly disagree with each other, always
> have, and all recognize this, yet we continue to disagree without
> ever reaching any conclusion regarding ai or epistemological and
> pedagogical approaches to ai, and yet continue to post. What is it
> anyone expects to happen?
>
> ~v~~

All things considered, I expect more of the same. I've already
explained why I am here: I need the eggs.

--
Joe Legris

feedbackdroid

unread,
Jul 17, 2006, 11:52:14 PM7/17/06
to

Michael Olea wrote:


I'll make this brief, since I wrote it once already in longer form, and
it vaporized - [one day I'll find the goddamn friggin key that
accidentally erases the entire screen of text]

1. First off, Tai Lee's model is just another partial brain model based
upon a limited and simplified data set. Put it on the pile of 100s of
other models, read as "conjecture". You act like it's the final answer.

2. As noted previously, the Gabor wavelet is likely just the particular
stimulus form that happens to optimally and "fortuitously" excite the
cortex, based upon the particular way the cortex happens to be wired.

To wit, the cortex is wired as a large extended mesh where local cells
are inhibited by surrounding cells, which in turn are inhibited by
even more distant cells. It's just the same thing repeated. This is
known as surround inhibition and peripheral dis-inhibition by
multitudes of neuroscientists. It's a basic structure of neural tissue,
not just in cortex, but extending back to the earliest vertebrates, cf.
the amphibian optic tectum.

AFAIAC, Gabor wavelets just so happen to fortuitously reflect this
underlying structure. I told you this last time, but you didn't get it.
Lee's conjectures aren't gonna change this.

3. It's almost certain that information is removed, and invariant forms
abstracted from the raw images, as the visual cortical hierarchy is
ascended. Hawkins' model is based on this idea, of course, as it comes
directly out of decades of neurophysiological recording.

It does no good to keep all of the raw data all the way from input to
the highest levels. What is needed is to abstract the information from the
background garbage, so the fact that "... Gabor wavelets will provide
complete representation of any image. ..." is irrelevant. In actual
fact, information is discarded at each and every level of the visual
hierarchy, and in every one of the 30+ visual areas. That's what
"feature detection" is all about, for chrissake.

4. Look at your own comment below ...

>
> (x,y) -> x+y
>
> discards information. But that is not what is involved in projection onto a
> wavelets basis.
>

The first part agrees with what I said. The 2nd is irrelevant, as
regards "actual" operation of cortex, as I've just indicated above.
Regardless of Lee's model.

Did I miss it somewhere, or did the entire neuroscience community come
out and say that Lee's model is THE FINAL model, and THE FINAL truth?
Or is it just one of the 100s of ideas in the pile?

========================

Curt Welch

unread,
Jul 18, 2006, 1:36:00 AM7/18/06
to
"feedbackdroid" <feedba...@yahoo.com> wrote:
> Curt Welch wrote:
> > "feedbackdroid" <feedba...@yahoo.com> wrote:
> >
> > > Umm, don't recall wavelets specifically, but certainly when you add
> > > 2 #'s you lose some information.
> > >
> > > If you have a 5, was this the result of adding 1+4, or 2+3? Easy
> > > case. Which is it?
> > >
> > > Or more realistic, 1.1+3.9, or any of an infinite #other
> > > possibilities.
> >
> > Yes, but if you store that lost information somewhere else at the same
> > time, then it's not lost.
> >
>
> Well, YEAH, but that wasn't the issue, was it.

I believe it was the issue with wavelets that started this. But I really
have no clue where this discussion came from.

> If you otherwise save
> the original data somewheres, then you don't lose it, by definition.
>
> >
> > If you compute X=A+B and Y=A-B, then you have
> > lost information in both operations. But yet, you can still use X and
> > Y to recompute A, or B, so nothing was in fact lost if you transform A
> > and B, into X and Y in this way. What was lost in each operation was
> > saved, in the other.
> >
>
> This works because you have "2" output equations. With 2 unknowns
> and 2 equations you can work backwards. With 2 unknowns and
> only 1 equation, not so. Try ONLY one ... X=A+B

Though, again, this isn't the issue... But...

You can take two real numbers and combine them into one real number, and
not lose any information. Simply take the digits of one of the numbers,
and make them the even digits of the result, and the digits of the second
number, and make them the odd digits. Oh, and I guess you have to hide the
sign in there somewhere as well, so just steal an extra digit and encode it
in there.
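
For non-negative integers, the digit-interleaving trick described here
looks like the sketch below. Handling signs and infinite decimal expansions
takes the extra care mentioned above; this minimal version ignores both:

    # Interleave the decimal digits of two non-negative integers into one
    # integer, and split them back apart -- a pairing that loses nothing.
    def interleave(a, b):
        da, db = str(a), str(b)
        width = max(len(da), len(db))
        da, db = da.zfill(width), db.zfill(width)
        return int("".join(x + y for x, y in zip(da, db)))

    def deinterleave(n):
        s = str(n)
        if len(s) % 2:
            s = "0" + s        # restore a leading zero dropped by int()
        return int(s[0::2]), int(s[1::2])

    c = interleave(1234, 56)   # digits alternate: 10203546
    print(deinterleave(c))     # (1234, 56)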

> Also, I don't recall exactly, but it's possible the previous discussion
> originally came up in the context of looking at the output of a "black
> box", and trying to guess the nature of the circuitry contained
> within the box. I do seem to recall JAG asking about this. And I
> seem to recall saying something to the effect that, you can
> theoretically have an infinite #different circuits in the box which
> could all produce the same output, and the only way to tell what it
> is to open the box and look inside. In brain research, this is called
> neurophysiology, rather than behaviorism.

Yeah. But many times, the inside is not relevant. You certainly don't
need to know it for AI.

> >This is true of all linear transforms where the transformation
> > matrix is invertible.
> >
> > There are many known transformations that loose information in their
> > individual operations, but which collectively, manage to retain all the
> > information. (FFTs, lossless compression, etc).
> >
>
> Remember, however, with FFT you transform input real+imaginary
> data arrays into output magnitude+phase arrays.
>
> You cannot recover the original data correctly by simply inverse-
> transforming just the output magnitude array. You also need the
> phase array to get the correct inverse transform.

Yeah, I know that.

J.A. Legris

unread,
Jul 18, 2006, 7:53:22 AM7/18/06
to

I was referring to the distinction between operant conditioning, which
encodes relationships between the organism's behaviour and the
environment, and causal maps, which encode relationships between
aspects of the environment, where the organism may just be an observer.
There seems to be a connection to the distinction between learning where,
for example, a rat lacking a hippocampus can quickly return to a
previously discovered object but only if he always starts from the same
position, and learning exhibited by intact animals who can quickly
locate the object from any starting point. The latter are said to
employ a spatial map, and Gopnik claims that causal maps are analogous
functions in a different domain. I wonder if the hippocampus is
involved in both. Maybe causal maps can be seen as a generalization of
spatial maps.

From p.11:

"Causal maps would also allow animals to extend their causal knowledge
and learning to a wide variety of new kinds of causal relations, not
just causal relations that involve rewards or punishments (as in
classical or operant conditioning), not just object movements and
collisions (as in the Michottean effects), and not just events that
immediately result from their own actions (as in operant conditioning
or trial-and-error learning). Finally, animals could combine new
information and prior causal information to create new causal maps,
whether that prior information was hard-wired or previously learned."

And on p.15:

"Just as causal maps are an interesting halfway point between
domain-specific and domain-general representations, these causal
learning mechanisms are an interesting halfway point between
classically nativist and empiricist approaches to learning.
Traditionally, there has been a tension between restricted and
domain-specific learning mechanisms like "triggering" or
"parameter-setting", and very general learning mechanisms like
association or conditioning. In the first kind of mechanism, very
specific kinds of input trigger very highly structured representations.
In the second kind of mechanism, any kind of input can be considered,
and the representations simply match the patterns in the input. Our
proposal is that causal learning mechanisms transform domain-general
information about patterns of events, along with other information,
into constrained and highly structured representations of causal
relations."

--
Joe Legris

feedbackdroid

unread,
Jul 18, 2006, 10:43:59 AM7/18/06
to


No. That was the original issue regarding loss of information when adding
2 #'s, but all of the other stuff was piled onto that.

Glen M. Sizemore

unread,
Jul 18, 2006, 12:46:39 PM7/18/06
to
Michael Olea wrote:
> Glen M. Sizemore wrote:
>
> > I looked over the paper (no, I didn't read it), and my first impression
> > is that this is not "must" reading for behaviorists. Or rather, it is far
> > less "must" reading than some of the tutorials on Bayesian analyses of
> > coin tosses and paper-frog jumps. But let's cut to the quick, Joe. Why do
> > you think it is "must" reading for behaviorists? Pitch me. After all, you
> > can argue that I can't be persuaded, but you know that you can get a rise
> > out of me.
>
> He's probably referring to the differing predictions between RW models,
> various Bayesian models, and the experimental results.
>
> I have mixed reactions to papers with Gopnik's name on them - interesting
> work, but a tendency towards straw-man caricature of alternatives.
>
> -- Michael

JL: I was referring to the distinction between operant conditioning, which


encodes relationships between the organism's behaviour and the
environment, and causal maps, which encode relationships between
aspects of the environment, where the organism may just be an observer.
There seems to be a connection to the distinction between learning where,
for example, a rat lacking a hippocampus can quickly return to a
previously discovered object but only if he always starts from the same
position, and learning exhibited by intact animals who can quickly
locate the object from any starting point. The latter are said to
employ a spatial map, and Gopnik claims that causal maps are analogous
functions in a different domain. I wonder if the hippocampus is
involved in both. Maybe causal maps can be seen as a generalization of
spatial maps.

GS: All of this stems from an inability to understand the notion of a
response class. Say Pigeon A has been trained to peck a key when any one of
three different pictures of, say, trees, are presented, and Pigeon B has
been trained on hundreds of such pictures (responding is never reinforced
when there is anything but the target pictures, and for A the S- stimuli can't
be trees). The pigeons appear to "have the same response class" when the
particular picture of a tree is one of the target pictures for A, but the
difference is quickly revealed when novel pictures are used for each pigeon.
Here, of course, Pigeon B will likely respond appropriately, but Pigeon A
will not. Nevertheless, both involve operant response classes, one "big" one for
Pigeon B, and 3 "small" ones for Pigeon A.

Now, how does this apply to "spatial maps"? The issue is very difficult to
talk about because the
response classes are hard to name. Let's take an animal that has had a lot
of exposure to a particular environment (though keep in mind that, unless
special things have been done since birth, the animal will likely have moved
about in several different environments). Such animals have moved to a
variety of places in the environment from a variety of places. Further,
sometimes they may have moved from one object to another by a, say, L-shaped
route. But if the object is still in sight, the animal's approach DIRECTLY
back to the object may be controlled by visual stimuli - yet its return has
also happened in the context of the L-shaped movement through the space.
With enough of these sorts of occurrences, it is feasible that the animal
acquires a set of responses approaching different areas that are partially
under stimulus control of the preceding movements. Returning to places that
are obscured visually would establish the preceding movement stimuli as
powerful discriminative stimuli. Further such animals may have approached a
particular area from so many different places that "approach to that
particular area" becomes a generalized operant. Now let's talk about an
animal that has experience moving about several different environments. Here
the control by the immediately preceding movements would likely begin to
control behavior since the visual stimuli would not be relevant to all of
the environments. Such animals would be able to move about a novel
environment, and return to certain places without having to retrace their
steps. The delay between some of the movements and the response raises some
issues, but are hardly insurmountable from a conceptual standpoint. I am,
for example, a novice boater, and when I go to a large and unfamiliar lake,
I frequently glance in the direction of the boat dock after traveling a
ways, even when the dock is not really visible. The glances are mediating
responses, and all I have to do is "update" the mediating response
periodically.

Now take a kid that has retrieved objects from a variety of containers. Some
need to be twisted, some unlatched, some pushed and then turned, etc., and the
person thus develops the general response of "going in to things to get
things" and acquires responses that may overlap different situations. Or say
the kid has been involved with a variety of circumstances watching falling
objects, thrown objects, etc. The response classes acquired when retrieving
such objects would produce generalized response classes that would be
effective if, say, the initial trajectory was observed but the landing was
obscured.

From p.11:

JL: "Causal maps would also allow animals to extend their causal knowledge


and learning to a wide variety of new kinds of causal relations, not
just causal relations that involve rewards or punishments (as in
classical or operant conditioning), not just object movements and
collisions (as in the Michottean effects), and not just events that
immediately result from their own actions (as in operant conditioning
or trial-and-error learning). Finally, animals could combine new
information and prior causal information to create new causal maps,
whether that prior information was hard-wired or previously learned."

And on p.15:

"Just as causal maps are an interesting halfway point between
domain-specific and domain-general representations, these causal
learning mechanisms are an interesting halfway point between
classically nativist and empiricist approaches to learning.
Traditionally, there has been a tension between restricted and
domain-specific learning mechanisms like "triggering" or
"parameter-setting", and very general learning mechanisms like
association or conditioning. In the first kind of mechanism, very
specific kinds of input trigger very highly structured representations.
In the second kind of mechanism, any kind of input can be considered,
and the representations simply match the patterns in the input. Our
proposal is that causal learning mechanisms transform domain-general
information about patterns of events, along with other information,
into constrained and highly structured representations of causal
relations."

GS: All of the above nonsense stems from misunderstanding the generalized
nature of response classes, a willingness to seize on metaphor, and a
willingness to simply invent processes to "explain" the behavioral phenomena
from which the processes were inferred in the first place.

"J.A. Legris" <jale...@sympatico.ca> wrote in message

news:1153223602....@i42g2000cwa.googlegroups.com...

Michael Olea

unread,
Jul 18, 2006, 1:06:01 PM7/18/06
to
feedbackdroid wrote:

>
> Curt Welch wrote:

>> > This works because you have "2" output equations. With 2 unknowns
>> > and 2 equations you can work backwards. With 2 unknowns and
>> > only 1 equation, not so. Try ONLY one ... X=A+B
>>
>> Though, again, this isn't the issue... But...

> No. That was the original issue regarding loss of information when
> adding
> 2 #'s, but all of the other stuff was piled onto that.

======


"Also, some people like to use the modern Gabor wavelet, mainly I
think, because it is more limited in space than a fourier grating.
To me this doesn't really mean anything, because you've basically
selected a mathematical shape that matches the cell response, which
is mainly a result of the anatomy. IOW, Gabor is a computational
method, but not anything very profound, IMO."

"Besides everything I just wrote, I should reiterate that I think


viewing all these happenings as a "lossless" process is really a
misnomer. What is really going on are successive transforms of the
sensory data. If you have something like

Ce <- Si <- Wi

and the sums of gaussians, how can this possibly be lossless? When
you add 2 #'s you lose information, namely the original values of those
2 #'s"

-- Dan Michaels
=====

Dan is either lying, stupid, or both.

-- Michael

feedbackdroid

unread,
Jul 18, 2006, 1:35:23 PM7/18/06
to


Congrats. I see you're starting to emulate your new best friend, GS.

I just explained this entire thing to you - ONCE AGAIN - in my post
from yesterday. You still don't get it. But I'll do it a 3rd time.

1. The cortex has a specific connection architecture, involving surround
inhibition coming to cells from local peripheral regions, and dis-inhibition
from further-out peripheral regions. All the Gabor wavelet is is a
BETTER stimulus than the more spatially-extensive sinusoidal gratings
that were in popular use 20 or so years ago. No big surprise. When you
find a better fitting stimulus then you get a greater response from the
cells.

2. 50 years worth of physiological recordings have shown there is an
increasing amount of abstraction computed on the visual image as one
ascends the visual hierarchy. This means that loads of information is
thrown away in order to abstract out salient "features" present in the
image. Look at IT, the so-called face area. Those cells respond to
extreme abstractions in the visual images. The raw pixel data is discarded
so the cells can signal "yes, it's a face". Turn on a pixel here or there in
the incoming image, and the cell response isn't affected.

That's loss of information, redundancy reduction, invariance abstraction,
feature detection, to name a few applicable terms. This has all been
known for 50+ years.

This is what I told you in November, and this is what I am reiterating
today. Stop mixing your [i.e., Lee's] simplified mathematical models with
what actually happens in the real world.

Michael Olea

unread,
Jul 18, 2006, 1:56:21 PM7/18/06
to
feedbackdroid wrote:

Yes, I'll get around to your remarks - the same irrelevant, uncomprehending
remarks you made in November, remarks having nothing to do with the claims
you seem to imagine you are addressing - but the point here is simple and
something else entirely:

>> > No. That was the original issue regarding loss of information when
>> > adding
>> > 2 #'s, but all of the other stuff was piled onto that.

>> ...If you have something like


>>
>> Ce <- Si <- Wi
>>
>> and the sums of gaussians, how can this possibly be lossless? When
>> you add 2 #'s you lose information, namely the original values of those
>> 2 #'s"

This was in response to comments about the responses of so-called simple
cells in V1 acting as Gabor wavelet basis functions. This is the context in
which you made your remark about adding two numbers losing information.
This is the point that Curt explained to you, as did I in November: that
such transformations can be lossless because "adding two numbers" is not
the only thing they do. And this is the point that makes your claim "all of the
other stuff was piled onto that" either lying or stupidity or both.

-- Michael

feedbackdroid

unread,
Jul 18, 2006, 3:41:36 PM7/18/06
to


Ah so, Glen Michael. You're claiming the cortex computes a "complete"
transform, and so retains all of the information for that reason, just like
the complete FFT output contains both magnitude and phase information,
and for that reason is invertible. Magnitude info is coded in some cells
and phase info in others. 2 arrays. The other Glen will love that one.
Dual representations.

The actual evidence for this is even MORE tenuous than all the other
unproven claims. The problem is always in the same place ... namely,
the underlying assumptions and over-simplifications necessary to
build a model.

You know, this stuff is loosely based upon Pribram's 30 YO ideas about
holograms [Fourier transforms] in the brain, and will probably end up on
exactly the same scrapheap. 30 years later, Hubel+Wiesel are going
strong, and hardly anyone mentions Pribram anymore.

Your whole endeavor is just plain silly, for the reasons already given
4 times. Namely that 50 years of neurophysiology shows that info is
steadily discarded as the visual hierarchy is ascended.

Glen M. Sizemore

unread,
Jul 18, 2006, 4:14:53 PM7/18/06
to

"feedbackdroid" <feedba...@yahoo.com> wrote in message
news:1153244123....@h48g2000cwc.googlegroups.com...

I don't think so. When I thought I had a chance to be best friends with
Michael, I sent him the following email:

Dear Michael,

I like you. Do you like me? I would like to be your best friend.
Do you want to be best friends? Mark one:

Yes________ NO_________

He didn't return my email. A reasonable conclusion is, therefore, that he
came to the conclusion that you are intellectually dishonest, and stupid, on
his own. I, however, think you are an order of magnitude more dishonest than
you are stupid. But, then, I read a lot of people's posts here and, thus, I
tend to think of "stupid" as being sort of calibrated on Verhey. You are not
as stupid as Verhey.

You're welcome,

Glen

Glen M. Sizemore

unread,
Jul 18, 2006, 5:30:59 PM7/18/06
to
Let me tell you another story; only this time it isn't allegory. Dan, on at
least 4 occasions, urged list readers to contact my employer and report my
"abuse" (i.e. swearing at Dan, calling him a stupid fuck and so forth).
Indeed, Dan posted the name of the private college, at which I was then
employed, saying something like "I wonder what his boss [and he gave my boss'
name] would think about statements like: [swearing at Dan, commenting on his
intellectual dishonesty, etc. etc.]. Eventually, in fact, someone took it up
with my current "boss" (I think the connection was directly through Dan, if
not Dan himself). A paraphrase of the interchange went like this:

Chair: I've heard that you are being verbally abusive in a medium that
interfaces with the public. For example, [something like my famous, "you do
philosophy as well as my ass chews gum"]- - -. Of course, if your identity
is being stolen - - -

Glen: No, my identity is not being stolen; it's me. But I never mentioned my
affiliation, and the reason is that I wish to maintain my first-amendment
rights, all while not even implying remotely that what I say is the opinion
of the university.

Chair: You never mentioned the University?

Glen: By design. For the very reasons you're talking about.

Chair: So they went the extra mile and tracked your affiliation through the
internet?

Glen: Yes.

Chair: It seems to me that you have balanced your first amendment rights and
the rights of the university.

Glen: Ok -----------.

Chair: Glen, I would urge you not to do what I have seen here. Really - I've
been there myself.

----------------------------------------------------

And so on.

I have said some mean things. I have used, ummm, harsh language. Father
forgive me! But I never, ever, argued that someone should be censored by
some institution. The irony, as I have mentioned before, is that Dan liked
to play the "Skinnerians are fascists" card. Go figure, huh?

"Glen M. Sizemore" <gmsiz...@yahoo.com> wrote in message

news:44bd40e8$0$2513$ed36...@nr1.newsreader.com...

Curt Welch

unread,
Jul 18, 2006, 7:16:57 PM7/18/06
to
"feedbackdroid" <feedba...@yahoo.com> wrote:

> 2. 50 years worth of physiological recordings have shown there is an
> increasing amount of abstraction computed on the visual image as one
> ascends the visual hierarchy. This means that loads of information is
> thrown away in order to abstract out salient "features" present in the
> image. Look at IT, the so-called face area. Those cells respond to
> extreme abstractions in the visual images. The raw pixel data is
> discarded so the cells can signal "yes, it's a face". Turn on a pixel
> here or there in the incoming image, and the cell response isn't
> affected.
>
> That's loss of information,

Once again. It's not a loss of information if the information is stored
elsewhere. If you transform pixel data into _multiple_ high level
abstractions, there is no need for any of the information to be lost. So,
the simple fact that an operation like a+b is happening is no proof that
information is being discarded by the network. You would have to prove
that a corresponding feature like a-b was not being abstracted at the same
time. And we know for a fact that the visual system transforms the lower
level data into multiple higher level abstractions at each step.

You would have to prove that when you changed a pixel, that none of the
high level abstractions changed as a result of the pixel change for
example.

In addition, I've heard the visual system creates a large fan-out in the
signals, on the order of 400 to 1. This makes it even easier
to believe the system is not throwing data away, as it always seems to be
creating more high level abstractions at each new level. So if you turn
(a,b) into (a+b, 2a-b, 2b-a) (a 1.5 to 1 fan-out) you have actually created
redundancy in the data. Any one of the three high level abstractions can
be thrown out and you can still recreate the (a,b) input.

As you said in your other reply to me - if you have more equations than
unknowns, you can always solve for the unknowns (assuming the equations are
not effective duplicates). Each high level abstraction (like a face
detector) represents an equation and if you have a 400 to 1 fan-out, you
will end up with 400 equations for each unknown. The brain would actually
have to work hard at producing redundant high level abstractions in order
to throw information away in this situation.
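
The (a+b, 2a-b, 2b-a) example can be checked directly - any two of the
three outputs recover (a, b). A small numpy sketch (illustrative only;
the input values are arbitrary):

    import numpy as np

    M = np.array([[ 1.0,  1.0],   # a + b
                  [ 2.0, -1.0],   # 2a - b
                  [-1.0,  2.0]])  # 2b - a
    a, b = 3.7, -1.2
    outputs = M @ np.array([a, b])

    # Drop any one of the three outputs; the remaining 2x2 system is still
    # invertible, so (a, b) comes back exactly -- the fan-out built redundancy.
    for drop in range(3):
        keep = [i for i in range(3) if i != drop]
        print(drop, np.linalg.solve(M[keep], outputs[keep]))  # ~[3.7, -1.2]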

Now, on the other hand, I don't claim the visual system is not throwing
information away - I'm just claiming that what you have mentioned is not a
clear indication that it is - which seems to be what you are trying to
argue above.

Has there been an information-theoretic analysis of the total visual
system that clearly shows information is being discarded at each higher
level of abstraction? I would guess there hasn't been simply because we
don't have enough tools to correctly record and map out the function of an
entire visual system to the resolution needed to answer that question.

> redundancy reduction, invariance abstraction,
> feature detection, to name a few applicable terms. This has all been
> known for 50+ years.

--

J.A. Legris

unread,
Jul 19, 2006, 10:41:07 AM7/19/06
to

What do you think of the experiments (p.64)?

--
Joe Legris

Jim Bromer

unread,
Jul 19, 2006, 10:50:30 AM7/19/06
to

Jim Bromer wrote:
> J.A. Legris wrote:
> > OK, let's get started. What is gradual learning, and under what
> > circumstances does it arise?
> >
> > --
> > Joe Legris
>
> The classical example of logical reasoning is,
> All men are mortal.
> Socrates is a man.
> Therefore, we know -by form- that Socrates is mortal.
>
> This concept of form was also used in the development of algebra where
> we know facts like,
> 2a + 2a = 4a
> if a is any real number. So, for example, we know -by form- that if
> a=3 then 2*3+2*3=4*3.
>
> One of the GOFAI models used categories and logic in order to create
> logical conclusions for new information based on previously stored
> information. In a few cases this model produced good results even for
> some novel examples. But, it also produced a lot of incorrect results
> as well. I wondered why this GOFAI model did not work better more
> often. One of the reasons I discovered is that we learn gradually, so
> that by the time we are capable of realizing that the philosopher is
> mortal just because he is a man and all men are mortal, we also know a
> huge amount of other information that is relevant to this problem. The
> child learns about mortality in dozens of ways if not hundreds or even
> thousands of ways before he is capable of realizing that since all men
> are mortal, then Socrates must also be mortal.
>
> I realized that this kind of logical reasoning can be likened to
> instant learning. If you learn that Ed is a man, then you also
> instantly know that Ed must be mortal as well. This is indeed a valid
> process, and I feel that it is an important aspect of intelligence.
> But before we get to the stage where we can derive an insight through
> previously learned information and have some capability to judge the
> value of that derived insight, we have to learn a great many related
> pieces of knowledge. So my argument here, is that while instant
> derivations are an important part of Artificial Intelligence, we also
> need to be able to use more gradual learning methods to produce the
> prerequisite background information so that derived insights can be
> used more effectively.
>
> Gradual learning is an important part of this process. We first learn
> about things in piecemeal fashion before we can put more complicated
> ideas together. I would say that reinforcement learning is a form of
> gradual learning but there are great many other methods of gradual
> learning available to the computer programmer.
>
> It's hard for most people to understand me (or for that matter even
> to believe me) when I try to describe how adaptive AI learning might
> take place without predefined variable-data references. So it is much
> easier for me to use some kind of data variable-explicit model to try
> to talk about my ideas.

>
> Imagine a complicated production process that had all kinds of sensors
> and alarms. You might imagine a refinery or something like that.
> However, since I don't know too much about material processes, I
> wouldn't try to simulate something like that but I would instead
> create a computer model that used algorithms to produce streams of data
> to represent the data produced by an array of sensors. Under a number
> of different situations, alarms would go off when certain combinations
> of sensor threshold values were hit. This computer generated model
> would be put through thousands of different runs using different
> initial input parameters so that it would produce a wide range of data
> streams through the virtual sensors. It would then be the job of the
> AI module to try to predict which alarms would be triggered and when
> they would be triggered before the event occurred. The algorithms that
> produced the alarms could be varied and complicated. For example, if
> sensor line 3 and sensor line 4 go beyond some threshold values for at
> least 5 units of time, then alarm 23 would be triggered unless line 6
> dipped below some threshold value at least two times in the 10 units of
> time before. There might be hundreds of such alarm scenarios.
> Individual sensor lines might be involved in a number of different
> alarm scenarios. An alarm might, for another example, be triggered if
> the average value of all the sensor inputs was within some specified
> range. The specified triggers for some alarms might change from run to
> run, or even during a run. Some of these scenarios would be simple,
> and some might be very complex. Some scenarios might even be triggered
> by non-sensed events. The range of possibilities, even within this
> very constrained data-event model is tremendous if not truly infinite.
>
> The AI module might be exposed to a number of runs that produced very
> similar sensor values, or it might be exposed to very few runs that
> produced similar data streams.
>
> Superficially this might look a little like a reinforcement scenario
> since the alarms could be seen as negative reinforcements, but it
> clearly is not a proper model for behaviorist conditioning. The only
> innate 'behavior' that the AI module is programmed to produce is to
> try to develop conjectures to predict the data events that could
> trigger the various alarms.
>
> I argue that since simplistic assessments of the runs would not work
> for every kind of alarm scenario, the program should start out with
> gradual learning in order to reduce the false positives where it
> predicted an alarm event that did not subsequently occur.
>
> This model might have hundreds or thousands of sensors. It might have
> hundreds of alarms. It might have a variety of combinations of data
> events that could cause or inhibit an alarm. Non-sensible data events
> might interact with the sensory data events to trigger or inhibit an
> alarm. Furthermore, the AI module might be able to mitigate or operate
> the data events that drive the sensors so that it could run interactive
> experiments to test its conjectures.
>
> I have described a complex model where an imagined AI module would have
> to make conjectures about the data events that triggered an alarm. Off
> hand I cannot think of any one learning method that would be best for
> this problem. So lacking that wisdom I would suggest that the program
> might run hundreds or even thousands of different learning methods in
> an effort to discover predictive conjectures that would have a high
> correlation with actual alarms. This is a complex model problem which
> does not lend itself to a single simplistic AI paradigm. I contend
> that the use of hundreds or maybe even thousands of learning mechanisms
> is going to be a necessary component of innovative AI paradigms in the near
> future. And it seems reasonable to assume that initial learning is
> typically going to be a gradual process in such complex scenarios.
>
> I will try to finish this in the next few days so that I can describe
> some of the different methods to produce conjectures that might be made
> in this setting and to try to show how some of these methods could be
> seen as making instant conjectures while others could be seen as
> examples of gradual learning.
>
> Jim Bromer
In my previous message I described a computer model that produced a
stream of data which, under a variety of different conditions, could set
off alarms. An AI program or subprogram would have the task of trying to
predict when and why the alarms would be set off. There could
be two test modes for the AI program. In one, it would only be able to
make observations of the streams of input data, and in the second test
mode it would be able to interact with data environment to some extent
by setting the values of some of the streams of input data in order to
test its conjectures. The AI module would have access to that input
data streams in order to try to make its predictions, but it would not
have access to the algorithms that produced the streams of data, and it
would not have access to the algorithms that established the causal
relations between the data streams or other undetectable events and the
alarms.

Suppose, for example, there were 1000 streams of simulated sensor
readings and 100 alarms. The streams of sensor readings range from a
value of 0 to 100 at each sampling. Each run of data involves some
number of sampling time units. Also suppose that some of the alarms
were set by the conditions like the following.
Alarm 4: Goes off whenever Sensor 5 is between the value of 30 and 40.
Alarm 6: Goes off whenever Sensor 8 is between 2 and 6, or between 20
and 25, or between 32 and 35, or between 43 and 47, or between 57 and
64.
Alarm 12: Goes off whenever the average value of the 1000 Sensor
Streams of data at any point in time is between 50 and 60.
Alarm 23: Goes off if Sensor 3 and Sensor 4 go beyond the threshold
value of 15 for at least 5 units of time, unless line 6 was below the
threshold value of 40 for at least two sampling times in the 10 units
of time before.
Alarm 33: Goes off when Sensor 3 and Sensor 4 go beyond the threshold
value of 15 for at least 5 units of time, or when Sensor 23 is
between the value of 20 and 90, or when Sensor 45 is above the
threshold of 80, or when Sensor 80 is below the threshold of 30. Alarm
33 would therefore go off whenever Alarm 23 would go off, but it would
also be set under other conditions as well.
Alarm 55: An undetected event sets Alarm 55 off.
Alarm 56: An undetected event sets Alarm 56 off if Sensor 32 is above
the threshold value of 80.
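
To make the model concrete, here is a hypothetical Python sketch of a few
of the alarm rules described above. The sensor values, constants, and
function names are invented for illustration; the post does not specify an
implementation:

    # Hypothetical simulation of a few of the alarm rules sketched above.
    # Sensor streams are just per-timestep readings in [0, 100]; the AI module
    # would see only the streams and which alarms fired, never these rules.
    import random

    NUM_SENSORS, NUM_STEPS = 1000, 200
    streams = [[random.uniform(0, 100) for _ in range(NUM_STEPS)]
               for _ in range(NUM_SENSORS)]

    def alarm_4(t):
        # Alarm 4: Sensor 5 between 30 and 40.
        return 30 <= streams[5][t] <= 40

    def alarm_12(t):
        # Alarm 12: average of all sensors between 50 and 60 at time t.
        avg = sum(s[t] for s in streams) / NUM_SENSORS
        return 50 <= avg <= 60

    def alarm_23(t):
        # Alarm 23: Sensors 3 and 4 above 15 for the last 5 steps, unless
        # Sensor 6 dipped below 40 at least twice in the 10 steps before.
        if t < 10:
            return False
        sustained = all(streams[3][u] > 15 and streams[4][u] > 15
                        for u in range(t - 4, t + 1))
        dips = sum(1 for u in range(t - 10, t) if streams[6][u] < 40)
        return sustained and dips < 2

    for t in range(NUM_STEPS):
        fired = [name for name, rule in (("Alarm 4", alarm_4),
                                         ("Alarm 12", alarm_12),
                                         ("Alarm 23", alarm_23)) if rule(t)]
        if fired:
            print(t, fired)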

Suppose the AI module conducted a number of different analyses, and
that one of its analyses was made by examining each of the individual
Sensor data streams to see if any of their values correlated strongly
with an Alarm when it went off. Some Bayesian enthusiasts might think
that Alarm 4, which goes off whenever Sensor 5 is between 30 and 40,
should be catchable by a Bayesian analysis of the individual data
streams. That may be true, but it is not that simple. Remember that
the AI module would not have any way of easily distinguishing between
coincidences and valid causal relationships, and that the kinds of
events that could set an alarm off are varied. So even after a few
incidences where Alarm 4 went off, while the collected data would show
that Sensor 5 was between 30 and 40 each time, the data would also show
a range of specific values for each of the 999 other data lines. True,
a Bayesian method that was programmed to test the relation between
Sensor 5 and the Alarms would detect the relation between Sensor 5 and Alarm 4, but
because there are so many different conditions that could trigger an
alarm, the program would have to still consider other conditions as
well. For example, while Alarm 6 is only triggered by Sensor Line 8 it
has a much more varied set of ranges which can act as triggers. This
means that in order for the Bayesian method to quickly ascertain the
relation between Sensor 5 and Alarm 4, it would have to be explicitly
looking for a single range as a trigger. For the Bayesian method to
quickly ascertain that Alarm 6 is correlated with Sensor 8 on the other
hand, it would have to programmed with the assumption that there could
be quite a few ranges for a single Sensor to trigger an alarm. This
reasoning suggests that the bland or vanilla proposal that a single
(simplistic) analytical technique could solve all AI problems is not
based on an insightful analysis of the varied kinds of problems the
program could be exposed to. Many analytical and learning methods work
well when the problem is kept simple enough, but when the problem is
not simple even a relatively sophisticated method like a Bayesian
method or other statistical methods just are not up to the task unless
they are designed to test for a wide range of possibilities. I believe
that the only way to get around this is to use a variety of analytical
methods in the initial development and testing of conjectures.
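
As one concrete illustration of why a single simplistic technique falls
short, a naive single-sensor screen might look like the sketch below.
Everything here is hypothetical (the data structures `streams` and
`alarm_log` are assumed, not given in the post), and it deliberately handles
only the single-range case - exactly the limitation being argued:

    # Naive screen: for each sensor, estimate the alarm rate conditional on
    # the sensor falling in a value bucket, and flag buckets whose rate is far
    # above the alarm's base rate. This would catch a rule like Alarm 4 (a
    # single range on one sensor) but not multi-range or multi-sensor rules.
    from collections import defaultdict

    def screen(streams, alarm_log, bucket_width=10, lift_threshold=5.0):
        # streams: list of per-sensor reading lists;
        # alarm_log: set of timesteps at which the alarm fired.
        steps = len(streams[0])
        base_rate = max(len(alarm_log) / steps, 1e-9)
        suspects = []
        for sensor, readings in enumerate(streams):
            hits, totals = defaultdict(int), defaultdict(int)
            for t, value in enumerate(readings):
                bucket = int(value // bucket_width)
                totals[bucket] += 1
                if t in alarm_log:
                    hits[bucket] += 1
            for bucket, n in totals.items():
                rate = hits[bucket] / n
                # Only flag buckets with enough samples and a large lift.
                if n >= 20 and rate / base_rate >= lift_threshold:
                    suspects.append((sensor, bucket * bucket_width,
                                     (bucket + 1) * bucket_width, rate))
        return suspects  # candidate (sensor, low, high, alarm rate) conjectures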

Once a statistical method began to suspect that a strong relation
between Alarm 4 and Sensor 5 existed, it could look at the negative
correlations as well. Here again, negative correlations might not work as
well as might be presumed. Look at the causes of Alarm 33. Alarm 33
might be set off by Sensor 23 when it is between 20 and 90, but other
Sensor Lines can also trigger Alarm 33. So a correlation between Sensor
23 when it is less than 20 or greater than 90 and Alarm
33-has-not-been-triggered is not that great. (In other words, Alarm 33
might be set even when Sensor 23 is less than 20 or greater than 90).
So while Alarm 33-is-triggered is positively correlated with Sensor 23
when it is between 20 and 90, the correlation between the negative
cases is not as strong. And, significantly, other Sensors which do not
vary much might have similar signatures. While they have strong
correlations of being within the range when the Alarm is triggered,
they would also be in the same range even if the Alarm is not
triggered. So the negative cases and the background cases have to be
considered in a general analysis as well.

Alarm 12 goes off whenever the average value of the 1000 Sensor Streams
of data at any point in time is between 50 and 60 so this suggests that
an analytical technique that effectively examines the correlation
between the average of the Sensors and the Alarms would have to be used
to find this relation.

Alarm 33 is partially dependent on Alarm 23, so this shows that higher
information processing, a little like using higher symbols to deduce a
logical relationship could be used even to detect the simplest of
relations.

If the AI module had the ability to affect the Sensor input values, it
could do a better job of finding correlations between individual
Sensors and the Alarms. It could for example test Sensor 5 at various
values while holding the other Sensors constant and through this
methodical testing discover the relation between Sensor 5 and Alarm 4.
But without the ability to affect the Sensor input values or in cases
where it only had limited abilities to test the system by setting the
Sensor inputs this way, the task of distinguishing between false
and valid triggers would often be quite difficult. We can imagine a
Bayesian method that might look for the correlations between Alarms and
simpler ranges of individual Sensors first, and then after it rules
them out, it would look for more complicated multiple ranges and more
complicated combinations of Sensor values. But the idea here is that
a very serious complication has already appeared before we have even
left the starting gate so to speak, and that means that more elaborate
methods have to be defined for the problem. These different analytical
methods will also require more time to test the more likely
possibilities and then to rule out the less likely possibilities and so
I contend that they constitute an example where gradual learning is
needed. Other kinds of testing and conjecture could then be used in an
effort to see how accurate and how far ahead of an Alarm it could make
its predictions.

This example wasn't as good as I had originally thought it would be,
but the majority of people sophisticated enough to program a computer
should be able to, at the least, get where I was going with it.
Because a single overly-simplistic analytical technique will not
suffice to detect all cases, a number of different test assumptions
need to be tried. Some of these test models will produce interesting
results that can then be followed up. However, because the typical
case under such complex conditions is that of partial correlation,
there are many cases of false conjectures which would also produce
partial correlations. This situation is similar to the situation in
the sciences where alternative theories both seem to explain the data
of an experiment, but where neither surpasses some minimal threshold of
confidence to be seen by the majority of experts as being convincing.
Further testing is needed. The follow up testing however will require
some sophistication whereby the results of the preliminary tests could
be interpreted, cross-referenced, integrated and used to intelligently
generate a series of additional tests. At that point conjectures
derived from the first tests could be used as the assumptions of the
next wave of testing.

I do not see a simplistic or an elegant method to create a viable
artificial intelligence. Instead, I see the need for a number of
different analytical techniques that will often provide imperfect
information. This imperfect information needs to be used carefully.
Newly acquired insights may be leveraged by being used with previous
knowledge, but as the history of AI has clearly shown, the products
derived though this kind of information leveraging have to be evaluated
in the terms of a background of relevant information in order to
increase their chances of being useful.

Computers are really good at learning. They can instantly remember
anything as long as they have enough memory to store the information.
So the real problem for AI is not how to get the computer to learn, but
how to get it to figure things out to be able to integrate information
and to use it intelligently. In a sense then, the conventional storing
of data is both instant and incremental, but it is not in itself an
insightful process. In order to get the computer to integrate
information intelligently it has to be able to figure out how the data
fits together. This process has to typically be gradual because the
number of different possibilities is so great.

But the important thing here is that there are good reasons for an AI
program to integrate information using a gradual process and this
understanding may be used to help shape more sophisticated learning
strategies. Learning has to consist of more than simply storing
information into the computer, it has to also include the skills
required for the computer to integrate the information intelligently.
The fact that these processes have to overcome such overwhelming odds
against them can be seen as a theory that explains why learning is
often gradual.

Jim Bromer

Glen M. Sizemore

unread,
Jul 19, 2006, 11:20:39 AM7/19/06
to

"J.A. Legris" <jale...@sympatico.ca> wrote in message
news:1153320067.7...@m79g2000cwm.googlegroups.com...


I see nothing very surprising and not much of merit. The authors simply make
a Cybulski-type argument because they have ignored the child's history. It
is the same sort of thing as saying to someone "pulling the plunger
sometimes causes quarters to drop into this cup." Now, we put the person in
the room with the device, and the person pulls the plunger. Is the first
pull a result of operant conditioning? Of course, but the operant
conditioning did not take place in the setting, it took place elsewhere. It
took place when the person's listener repertoire was produced through
operant conditioning. What sorts of histories are necessary to produce a
child who passes the blicket tests? What sorts of histories are necessary
for a person to see distant objects as larger than would be predicted based
on retinal image? What histories are necessary to produce any of the
behavioral phenomena that we observe? That isn't the business of cognitive
"science." The outcome of the blicket experiment depends on children having
behavior under the control of verbal stimuli; "blickets make the machine
go." "Which one is the blicket?" etc., and children's behavior is under
stimulus control of verbal stimuli because of the contingencies to which
they have been exposed. The situation is complicated, yes. What parts of
human behavior aren't? But outside the purview of operant conditioning?
Nonsense.


>
> --
> Joe Legris
>


feedbackdroid

unread,
Jul 19, 2006, 1:47:11 PM7/19/06
to

Remember that what you write into cyberspace is there FOREVER,
and for ALL to see. Forever is a long time.

You write verbal abuse and sign your own name to it at your own
risk. No one is responsible for your actions but you.

Too bad, isn't it, that all of Skinner's BS about responsibility is just
plain stupidity. Applicable to pigeons maybe, but not to humans.
Welcome to real life.

Just to assuage your feelings a bit, I once had a student write on
an evaluation form that I was the worst instructor he had ever had.
This was one out of 1000s of evaluations. Only later did I understand
that this particular group of students had all expected A's, just for
showing up. Silly me. No one told me this ahead of time. That
doesn't make me feel too bad, but my coordinator wasn't too
pleased. She chewed me out for about 2 hours, over the
telephone no less.

OTOH, I once assigned a B to a student, and several years later
discovered it was the only grade below A that she had received
in her entire 4 years of undergraduate study. She had even come
to see me after the course, and ended up accepting her fate without
a whimper. Even today, I live in the knowledge that, on her
deathbed she'll probably not even think about her first husband,
but will remember me for being the reason she didn't have a
perfect 4.0. Oh well. That's life. Live and learn.

Michael Olea

unread,
Jul 19, 2006, 2:36:42 PM7/19/06
to
Curt Welch wrote:

> "feedbackdroid" <feedba...@yahoo.com> wrote:
>
>> 2. 50 years worth of physiological recordings have shown there is an
>> increasing amount of abstraction computed on the visual image as one
>> ascends the visual hierarchy. This means that loads of information is
>> thrown away in order to abstract out salient "features" present in the
>> image. Look at IT, the so-called face area. Those cells respond to
>> extreme abstractions in the visual images. The raw pixel data is
>> discarded so the cells can signal "yes, it's a face". Turn on a pixel
>> here or there in the incoming image, and the cell response isn't
>> affected.

>> That's loss of information,

None of which has the slightest bearing, of course, on whether or not the
response of simple cells in V1 acts as a lossless high resolution buffer.



> Once again. It's not a loss of information if the information is stored
> elsewhere. If you transform pixel data into _multiple_ high level
> abstractions, there is no need for any of the information to be lost. So,
> the simple fact that an operation like a+b is happening, is no proof that
> information is being discarded by the network. You would have to prove
> that a corresponding feature like a-b was not being abstracted at the same
> time. And we know for a fact that the visual system transforms the lower
> level data into multiple higher level abstractions at each step.

In fact so called simple cells in V1 occur in adjacent quadrature phase
pairs. That is, speaking metaphorically, for every A+B cell there is an A-B
cell next to it. More precisely, the receptive fields of the cells in the
pair cover the same portion of the visual field, have the same spatial
frequency, scale, and orientation, but differ by 90 degrees in phase. This
was first established in 1981, and confirmed many times since then. This is
precisely the condition that has to be met for Gabor wavelets to achieve
the minimum bound of simultaneous localization in both the 2D spatial
domain and the frequency domain. Of course, nobody recorded the responses
of every single simple cell in V1 in some animal to determine they all came
in pairs. Various studies sampled small patches of V1 intensively. The
conclusion is statistical. The original paper is:

Pollen DA, Ronner SF (1981) Phase relationships between adjacent cells in
the visual cortex. Science 212:1409-1411

More recently Pollen wrote:

"The conjoined optimal localization of signals in both the two-dimensional
spatial and spatial frequency domains (Daugman, 1985) is best expressed by
sets of phase-specific simple cells in V1 (Pollen and Ronner, 1981; Foster
et al., 1983). The subzones of the receptive fields of these cells are
selectively sensitive to either increments or decrements of light (Hubel
and Wiesel, 1962) and spatial processing across such receptive fields is
largely linear (Jacobson et al., 1993). The two-dimensional joint
optimalization for preferred orientation and spatial frequency in the
frequency domain and for the x and y coordinates in the spatial domain
follows from results that the largely linear receptive field line-weighting
functions of these cells are well-described as Gaussian-attenuated
sinusoids and cosinusoids (Marcelja, 1980). The Gaussian weighting renders
the signal as the most compact to specify jointly spatial frequency and
space (Gabor, 1946). The Fourier transform of these `Gabor functions' in
the space domain yields an equally compact function in the spatial
frequency domain (Gabor, 1946; Marcelja, 1980). The products of
uncertainties within the two domains approaches a theoretical minimum
(Marcelja, 1980). Simple cells with corresponding properties, at least for
analyses of brightness distributions within frontoparallel planes, are
found within both V1 and V2 (Foster et al., 1985), but not within V3A
(Gaska et al., 1988) nor apparently in V4 (Desimone and Schein, 1987)."
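
To make the quadrature-pair point concrete, here is a toy 1-D sketch in
Python (filter and stimulus parameters are made up for illustration, not
taken from any recording): an even-phase and an odd-phase Gabor applied to
the same patch give a response pair from which both local contrast and
local phase can be read back out, which neither response supports alone.

import numpy as np

def gabor_1d(n, freq, sigma, phase):
    # Gaussian-windowed sinusoid (illustrative parameters only).
    x = np.arange(n) - n / 2.0
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

n, freq, sigma = 64, 0.1, 8.0
even = gabor_1d(n, freq, sigma, 0.0)         # cosine-phase "simple cell"
odd = gabor_1d(n, freq, sigma, np.pi / 2)    # quadrature partner, 90 deg apart

# A test stimulus: a grating of unknown contrast and phase, plus a little noise.
rng = np.random.default_rng(0)
x = np.arange(n) - n / 2.0
stimulus = 2.0 * np.cos(2 * np.pi * freq * x + 0.7) + 0.1 * rng.standard_normal(n)

e, o = even @ stimulus, odd @ stimulus       # the pair of responses

# Jointly the pair carries local contrast energy and local phase; either
# response alone pins down neither.
print(np.hypot(e, o))          # proportional to stimulus contrast
print(np.arctan2(o, e))        # approximately 0.7, the stimulus phase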

-- Michael


Glen M. Sizemore

unread,
Jul 19, 2006, 2:58:51 PM7/19/06
to
None of this changes the fact that you are a petty fascist and
intellectually dishonest. You have revealed yourself as the thug you are.

"feedbackdroid" <feedba...@yahoo.com> wrote in message

news:1153331230....@75g2000cwc.googlegroups.com...

mimo...@hotmail.com

unread,
Jul 19, 2006, 6:22:22 PM7/19/06
to

perhaps

Michael Olea

unread,
Jul 19, 2006, 7:14:01 PM7/19/06
to
Jim Bromer wrote:

So we have a stochastic process. In this case a discrete time series, where
the time-dependent variable is a 1100 dimensional vector. In general the
task would be to predict the future value of the vector from its past. In
this case there is a simplification since here the task is to predict the
value of a set of alarms, a 100 dimensional bit vector, from the past
values of a 1000 dimensional sensor vector. This amounts to estimatimating
the conditional distribution P(A | Si, Si-1, Si-2...), where A is the alarm
vector, Si is the value of the sensor vector at the time of the current
observation, and Si-n is the value of the sensor vector n time-steps in the
past. All but 2 components of A, as specified, are simple deterministic
functions of S. They are in fact linear threshold functions. The other two
components each depend on the values of a single hidden variable, which may
or may not be the same variable, call these hidden variables H1 and H2. The
problem description is incomplete since it does not specify whether or not
the values of the hidden variables are correlated with the values of the
observables. Only two components of A, in the problem as specified, depend
at all on the history of S, a dependence that extends at most 10 time-steps
into the past.

This is a simple estimation problem. The probability distribution, as
specified, is stationary, and belongs to the simplest complexity class of
probability distributions: those with finite predictive information. Within
that class it is a particularly simple instance since the distribution is a
composite of 0-1 distributions conditional on low dimensional linear
subspaces of the input space. The only questions of mild interest are how
much state (the history of Si) to retain (since the fact that we only need
the past 10 time steps is not something the estimator would "know" a
priori), and whether or not the existence of one or more hidden variables
can be inferred:

A4 = f1(S5) - 2 thresholds
A6 = f2(S8) - 10 thresholds
A12 = f3(L1Norm(S)) - 2 thresholds
A23 = f4(S3, S4, S6, t) - 6 thresholds
A33 = f5(S3, S4, S23, S45, S80, t) - 8 thresholds
A55 = f6(H1) = P(H1) (which is unspecified)
A56 = f7(S32, H2) = P(H2 | S32 > 80) (which is unspecified)

The optimal Bayesian estimator for this problem comes from the simplest
family of such estimators - the family of finite parametric models. How
quickly (i.e. after how many observations of (S,A)) such an estimator
converges on the optimal prediction hypothesis depends both on the joint
distribution P(S,H1,H2) and the capacity of the hypothesis space of the
estimator. Once the capacity is high enough to include the correct
hypothesis, the greater the capacity, the slower the convergence. In this
case, leaving open for the moment the question of P(H1) and P(H2 | S32 >
80), the optimal hypothesis is a linear function of the vector S and its
recent past. There is, of course, a variety of ways to treat the dependence
on the past. One way is to set a fixed threshold of how far back to look. A
less arbitrary approach is to use a decay function, like an exponential,
that gives progressively less weight to events further in the past. This
amounts to a prior over the correlation horizon of the stochastic process.

That leaves the estimation of P(H1) and P(H2). There is little to say about
this, given the description of the problem, other than that this is a
distribution estimation problem.
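
A rough sketch of the sort of estimator described above (the names,
dimensions, and stand-in data below are illustrative assumptions, not part
of the original problem statement): one logistic threshold unit per alarm,
trained online against an exponentially decayed summary of the sensor
history.

import numpy as np

rng = np.random.default_rng(1)
n_sensors, n_alarms, decay, lr = 1000, 100, 0.7, 0.01

# Exponentially decayed running summary of the sensor history: one way to
# encode "the recent past" without fixing a hard horizon in advance.
history = np.zeros(n_sensors)
W = np.zeros((n_alarms, n_sensors + 1))      # one linear unit per alarm, +1 bias

def predict(features):
    return 1.0 / (1.0 + np.exp(-(W @ features)))   # P(alarm_j = 1 | summary)

for _ in range(5000):
    sensors = rng.integers(0, 100, n_sensors).astype(float)   # stand-in data
    history = decay * history + (1 - decay) * sensors
    features = np.append(history / 100.0, 1.0)                # scaled + bias

    p = predict(features)
    # Stand-in targets; in the real problem these are the observed alarms.
    alarms = (sensors[:n_alarms] > 80).astype(float)

    # Online gradient step on the cross-entropy loss, one unit per alarm.
    W += lr * np.outer(alarms - p, features)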

-- Michael

feedbackdroid

unread,
Jul 20, 2006, 10:53:08 AM7/20/06
to

Glen M. Sizemore wrote:
> None of this changes the fact that you are a petty fascist and
> intellectually dishonest. You have revealed yourself as the thug you are.
>
>


ROTFLOL.

Just as a point of information, how old were you when your mother made
you start wearing long pants?

feedbackdroid

unread,
Jul 20, 2006, 11:38:43 AM7/20/06
to

Curt Welch wrote:
> "feedbackdroid" <feedba...@yahoo.com> wrote:
>
> > 2. 50 years worth of physiological recordings have shown there is an
> > increasing amount of abstraction computed on the visual image as one
> > ascends the visual hierarchy. This means that loads of information is
> > thrown away in order to abstract out salient "features" present in the
> > image. Look at IT, the so-called face area. Those cells respond to
> > extreme abstractions in the visual images. The raw pixel data is
> > discarded so the cells can signal "yes, it's a face". Turn on a pixel
> > here or there in the incoming image, and the cell response isn't
> > affected.
> >
> > That's loss of information,
>
> Once again. It's not a loss of information if the information is stored
> elsewhere. If you transform pixel data into _multiple_ high level
> abstractions, there is no need for any of the information to be lost.
>


IF, IF, IF. You need to analyze the underlying assumptions, and think
about the evidence in the real world rather than just postulating
hypothetical worlds.

Glen-Michael makes exactly the same mistake constantly. He finds one
paper out of many 1000s of papers, or one tentative "model" out of 100s
of such models, and then doesn't analyze the underlying assumptions,
and ignores the other 99.9% of the data, and then acts like it's the
FINAL TRUTH. And then he claims the high-ground for rigor. What a joke.

There are 100s of 1000s of neuro papers, and using any one like a club
to beat your opponent is just plain sophomoric silliness.

When your only tool is a hammer,
then every problem looks like a nail.


>
> So,
> the simple fact that an operation like a+b is happening, is no proof that
> information is being discarded by the network. You would have to prove
> that a corresponding feature like a-b was not being abstracted at the same
> time.
>


It works the other way. The onus is on your back to prove your case, in
light of all the other evidence [see below]..


>
> And we know for a fact that the visual system transforms the lower
> level data into multiple higher level abstractions at each step.
>


You've just contradicted your previous statement. Higher-level
abstractions means that information is discarded.


>
> You would have to prove that when you changed a pixel, that none of the
> high level abstractions changed as a result of the pixel change for
> example.
>


Everyone in visual neuroscience already knows that this is precisely
what happens. Starting from the first levels in the retina.


>
> In addition, I've heard the visual system creates a large fan-out in the
> signals on the order of magnitude of 400 to 1.
>


Yeah, I said that. 1M optic nerve fibers, and 400M neurons in V1.
[see also below].


>
> This makes it even easier
> to believe the system is not throwing data way as it always seems to be
> creating more high level abstractions at each new level. So if you turn
> (a,b) into (a+b, 2a-b, 2b-a) (a 1.5 to 1 fan out) you have actually created
> redundancy in the data. Any one of the three high level abstractions can
> be thrown out and you still can recreate the (a,b) input.
>
> As you said in your other reply to me - if you have more equations than
> unknowns, you can always solve for the unknowns (assuming the equations are
> not effective duplicates). Each high level abstraction (like a face
> detector) represents an equation and if you have a 400 to 1 fan-out, you
> will end up with 400 equations for each unknown. The brain would actually
> have to work hard at producing redundant high level abstractions in order
> to throw information away in this situation.
>


This is not really fan-out. Think instead of many separate processing
modules. Cells in V1 integrate center-surround retinal responses into
orientated-bar detection units, so-called "simple cells", plus the
other complex cells, etc. If you need to cover the entire visual field
with bar-detectors, where the "bar-size" is much longer and wider than
the retinal R-F, and with all bar orientations included, then it might
take a 400:1 ratio of cells.
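
A back-of-the-envelope version of that coverage argument, with purely
illustrative factor counts (none of these numbers are anatomical
measurements):

# Each retinal sample gets covered by detectors at several orientations,
# spatial scales, phases, and overlapping positions (all counts assumed).
orientations = 8
scales = 5
phases = 2        # even/odd pair per location
overlap = 5       # overlapping positions per input sample

print(orientations * scales * phases * overlap)   # 8 * 5 * 2 * 5 = 400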

And what do you do when you integrate photoreceptor responses into
small circular C-S structures in retinal ganglion cells, and further
into elongated bar-detectors in cortex? You throw away loads of raw
pixel information.

Look at this .... http://en.wikipedia.org/wiki/Optic_nerve

===================
The optic nerve contains 1.2 million nerve fibers. This number is low
compared to the roughly 130 million receptors in the retina, and
implies that substantial pre-processing takes place in the retina
before the signals are sent to the brain through the optic nerve.
===================

130M photoreceptors and 1.2M O-N fibers. Substantial pre-processing.

This means that loads of raw pixel information is already removed in
the retina. Everyone who has ever studied retinal physiology already
knows that retinal cells integrate visual input over certain spatial
extents, and one photon more or less isn't gonna matter, except
possibly at the extreme lower limits of perception. In normal
photopic vision [daylight], any one photon is lost in the barrage.


>
> Now, on the other hand, I don't claim the visual system is not throwing
> information away - I'm just claiming that what you have mentioned is not a
> clear indication that it is - which seems to be what you are trying to
> argue above.
>
> Has there been a information theoretical analysis of the total visual
> system that clearly shows information is being discarded at each higher
> level of abstraction? I would guess there hasn't been simply because we
> don't have enough tools to correctly record and map out the function of an
> entire visual system to the resolution needed to answer that question.
>


All we have are the many papers describing recordings in the 30+ visual
areas of cortex. And they all indicate abstraction and redundancy
reduction as the visual hierarchy is ascended.

It is always possible there is some undiscovered pathway which
transmits the visual image unchanged all the way to the top, but it looks
like the major part of the system works in the other manner.
Abstraction and redundancy reduction.
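
One toy way to see what "redundancy reduction" means in practice (a sketch
on synthetic data, not a model of any cortical area): decorrelate two
highly redundant inputs and keep only the dominant direction. Keeping both
directions is just an invertible rotation; dropping the weak one is lossy,
but the reconstruction error is tiny precisely because the inputs were
redundant to begin with.

import numpy as np

rng = np.random.default_rng(0)

# Two highly redundant "pixels": nearly equal, plus a little independent noise.
common = rng.standard_normal(10000)
X = np.column_stack([common + 0.05 * rng.standard_normal(10000),
                     common + 0.05 * rng.standard_normal(10000)])

Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)   # principal directions

# Keep only the dominant direction and reconstruct from it alone.
X1 = (Xc @ Vt[0])[:, None] * Vt[0][None, :]
err = np.mean((Xc - X1) ** 2) / np.mean(Xc ** 2)
print(f"fraction of variance lost keeping 1 of 2 directions: {err:.4f}")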

Curt Welch

unread,
Jul 20, 2006, 1:15:52 PM7/20/06
to
"feedbackdroid" <feedba...@yahoo.com> wrote:
> Curt Welch wrote:

> > So,
> > the simple fact that an operation like a+b is happening, is no proof
> > that information is being discarded by the network. You would have to
> > prove that a corresponding feature like a-b was not being abstracted at
> > the same time.
> >
>
> It works the other way. The onus is on your back to prove your case, in
> light of all the other evidence [see below]..

No it doesn't work the other way around. You are the only one making a
claim here and your claim is that information is lost. My claim is that I
don't know. My other claim is that you haven't presented any viable
evidence to support your claim.

If you want to continue to make the claim that information is lost, then
YOU have to prove it or you have to expect us to call you dishonest.

> > And we know for a fact that the visual system transforms the lower
> > level data into multiple higher level abstractions at each step.
> >
>
> You've just contradicted your previous statement. Higher-level
> abstractions means that information is discarded.

Only to you. And that's the claim you have not yet supported. a+b IS a
higher level abstraction, but if it's paired with the a-b high level
abstraction, then nothing is lost in the system as a whole. Do you not
understand this?

> > You would have to prove that when you changed a pixel, that none of the
> > high level abstractions changed as a result of the pixel change for
> > example.
>
> Everyone in visual neuroscience already knows that this is precisely
> what happens. Starting from the first levels in the retina.

I'm glad that everyone knows this. But if it's true, it seems to me
instead of running in circles with your head cut off in these past 3 or 4
posts, you would simply state the evidence that supports what everyone
knows. But yet, you still haven't. You just keep saying that it's
obvious, or that everyone knows it. Apparently you don't know why everyone
knows this or else you would have evidence to support the claim.

> > As you said in your other reply to me - if you have more equations than
> > unknowns, you can always solve for the unknowns (assuming the equations
> > are not effective duplicates). Each high level abstraction (like a
> > face detector) represents an equation and if you have a 400 to 1
> > fan-out, you will end up with 400 equations for each unknown. The
> > brain would actually have to work hard at producing redundant high
> > level abstractions in order to throw information away in this
> > situation.
> >
>
> This is not really fan-out.

Why not? Or are you thinking of fan-out as a simple process of copying a
signal - as how a logic gate can send its data to 10 gates? The concept
I'm referring to is a bit more complex.

If you transform (x,y) into (x+y,x-y), then both the x and y signals have a
fan out of 1:2 (each input value affects 2 output values). But the net
fan-out of the transform is 1:1 (same number of input values as output
values). This is how I'm using the term.
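
To put numbers on that (the matrices below are just these textbook
examples, nothing measured from any visual system): the 2-to-2 map
(x, y) -> (x+y, x-y) changes the data but loses nothing, and the redundant
2-to-3 map (a, b) -> (a+b, 2a-b, 2b-a) lets the originals be recovered even
after one output is discarded.

import numpy as np

# 1:1 net fan-out: (x, y) -> (x + y, x - y). Changed, but nothing is lost.
T = np.array([[1.0, 1.0],
              [1.0, -1.0]])
coded = T @ np.array([3.0, 5.0])
print(np.linalg.solve(T, coded))          # exactly [3, 5] again

# 1.5:1 fan-out: (a, b) -> (a + b, 2a - b, 2b - a). A redundant code: any two
# of the three outputs still determine (a, b), here via least squares.
F = np.array([[1.0, 1.0],
              [2.0, -1.0],
              [-1.0, 2.0]])
outputs = F @ np.array([3.0, 5.0])
kept = outputs[:2]                         # pretend one output was discarded
ab_back, *_ = np.linalg.lstsq(F[:2], kept, rcond=None)
print(ab_back)                             # [3, 5] recovered from what is left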

> Think instead of many separate processing
> modules. Cells in V1 integrate center-surround retinal responses into
> orientated-bar detection units, so-called "simple cells", plus the
> other complex cells, etc. If you need to cover the entire visual field
> with bar-detectors, where the "bar-size" is much longer and wider than
> the retinal R-F, and with all bar orientations included, then it might
> take a 400:1 ratio of cells.
>
> And what do you do when you integrate photoreceptor responses into
> small circular C-S structures in retinal ganglion cells, and further
> into enlongated bar-detectors in cortex? You throw away loads of raw
> pixel information.
>
> Look at this .... http://en.wikipedia.org/wiki/Optic_nerve

Is there any data on that page that has any relevance to this issue of
whether the transform that takes place in the visual cortex is losing
information as you say it is?

> ===================
> The optic nerve contains 1.2 million nerve fibers. This number is low
> compared to the roughly 130 million receptors in the retina, and
> implies that substantial pre-processing takes place in the retina
> before the signals are sent to the brain through the optic nerve.
> ===================
>
> 130M photoreceptors and 1.2M O-N fibers. Substantial pre-processing.
>
> This means that loads of raw pixel information is already removed in
> the retina.

Well, first off, it's possible to compress data and not remove any
information. You can compress a 1000 byte file into a 300 byte file with
standard compression programs. Does that prove that 700 bytes of data were
thrown away? If so, how is it possible for compression programs to
recreate those 700 bytes if the data was thrown out?
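
For example (standard library only; the message content is arbitrary): a
redundant 1000-byte input compresses to far fewer bytes and still
decompresses to exactly the original, so what was removed was redundancy,
not information.

import zlib

original = b"the quick brown fox " * 50     # 1000 bytes, highly redundant
packed = zlib.compress(original)

print(len(original), len(packed))           # 1000 vs a few dozen bytes
assert zlib.decompress(packed) == original  # every byte comes back exactly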

So, the fact that there is a 130 to 1 reduction in signals is no proof that
information is thrown away. There might simply be a huge amount of
redundancy in the raw signal which was removed as the signal was
compressed on the 130 to 1 fan-in transform. So once again, you have
proved NOTHING about whether information is being lost or not. To prove
that, you have to analyze the characteristics of the signals.

However, I suspect there is information being filtered at that stage -
most likely noise - aka data in the signal that has little to do with the
surrounding environment and more to do with the noise generated by the
mechanics of the eye.

However, most important, we weren't talking about whether information was
lost in the retina, we were talking about whether it happens in the higher
levels of abstraction that occur in the visual cortex. This is what you
were making statements about and what I was responding to. It's a lot
easier to believe there is information loss when there is a 130:1 reduction
than to believe there is information loss when there is a 400:1 expansion.

> Everyone who has ever studied retinal physiology already
> knows that retainl cells integrate visual input over certain spatial
> extents, and one photon more or less isn't gonna matter, except
> possibly at the extreme lower limits of perception. In normal
> phototopic vision [daylight], any one photon is lost in the barrage.

Fine. The "one photon" detection is below the noise floor of the system
most of the time. Once again, you babble about stuff that has nothing to do
with whether information is lost in the processing. You have put forth no
evidence to suggest the information about that one photon was ever in the
signal in the first place so that it could be lost later in processing as
you suggest happens in spades.

> All we have are the many papers describing recordings in the 30+ visual
> areas of cortex. And they all indicate abstraction and redundancy
> reduction as the visual hierarchy is ascended.
>
> It is always possible there is some undiscovered pathway which
> transmits the visual image unchanged all the way to the top, but looks
> like the major part of the system works in the other manner.
> Abstraction and redundancy reduction.

Do you still not get it? No one said the visual image data was transmitted
_unchanged_. The issue is whether information was lost by the
transformation (aka the change).

If you change (x,y) into (x+y, x-y) the data is very much changed - the old
values (x and y) are thrown away. But no _information_ is lost in the
process. The two new values still contains all the information which was
present in the two original values. Do you not understand that this is what
we have been talking about when you keep saying information is lost?

If information is thrown away in the transformation as you suggest, this
means that when we look at a visual scene, the scene could change in ways
that we would be unable to react to (aka see) because that information was
thrown away. Just what change in the visual field are you suggesting we
are blind to? What information is thrown away? If a pinpoint light source
the size of one pixel in the eye flashes on and off, are you saying we
won't see the flashing because we are blind to one-pixel changes? Are you
saying the data about that flashing pixel is sent to the visual cortex, but
it's lost before it can affect our behavior in the motor cortex? That's
what you are implying by saying information is lost in the abstraction
process.

JPl

unread,
Jul 20, 2006, 1:29:47 PM7/20/06
to

"Glen M. Sizemore" <gmsiz...@yahoo.com> wrote in message
news:44bd40e8$0$2513$ed36...@nr1.newsreader.com...
>
> "feedbackdroid" <feedba...@yahoo.com> wrote in message
...

>> Congrats. I see you're starting to emulate your new best friend, GS.
>
> I don't think so. When I thought I had a chance to be best friends with
> Michael, I sent him the following email:
>
>
>
> Dear Michael,
>
> I like you. Do you like me? I would like to be your best
> friend. Do you want to be best friends? Mark one:
>
>
>
>
>
>
>
> Yes________ NO_________
>
>
>
>
>
>
>
> He didn't return my email. A reasonable conclusion is, therefore, that he
> came to the conclusion that you are intellectually dishonest, and stupid,
> on his own. I, however, think you are an order of magnitude more dishonest
> than you are stupid. But, then, I read a lot of people's posts here and,
> thus, I tend to think of "stupid" as being sort of calibrated on Verhey.
> You are not as stupid as Verhey.

Always love to hear from you, Brother Darth. From your mouth all shit is an
honor... how do you do it?


Jim Bromer

unread,
Jul 20, 2006, 1:42:48 PM7/20/06
to

Michael Olea wrote:

>
> This is a simple estimation problem. The probability distribution, as
> specified, is stationary, and belongs to the simplest complexity class of
> probability distributions: those with finite predictive information.

> That leaves the estimation of P(H1) and P(H2). There is little to say about


> this, given the description of the problem, other than that this is a
> distribution estimation problem.
>
> -- Michael

I will try to follow up on the terms that you used in this message, but
I was not talking about the "simple estimation problem" that you
described.

I am not sure why you have not been able (or willing) to understand
what I have been trying to say during the last year, but maybe that is
just the way it's supposed to be. However, I sincerely appreciate your
sharing of your knowledge of probability distributions and Bayesian
methods.
Jim Bromer

Glen M. Sizemore

unread,
Jul 20, 2006, 2:09:50 PM7/20/06
to

"feedbackdroid" <feedba...@yahoo.com> wrote in message
news:1153407188.4...@75g2000cwc.googlegroups.com...

Once again, your silly responses will not change the fact that you
encouraged others to, or directly attempted to, have the University squelch my right
to free speech. I consider virtually nothing to be as egregious as that. I
have come to see, as have a few others, that your intellectual dishonesty,
and hypocrisy, truly know no bounds. What I can't understand is how the few
people around here, that have some shred of decency, have not, at least,
rebuked and censured you (but not censored, Dan, I am not the fascist here).
Imagine, taking advantage of the rising fascist trend in the US to attempt
to have me censored. I'm glad to see that you are doing your part, Dan.


>


Michael Olea

unread,
Jul 20, 2006, 2:12:52 PM7/20/06
to
Jim Bromer wrote:

> Michael Olea wrote:

>> This is a simple estimation problem. The probability distribution, as
>> specified, is stationary, and belongs to the simplest complexity class of
>> probability distributions: those with finite predictive information.

>> That leaves the estimation of P(H1) and P(H2). There is little to say
>> about this, given the description of the problem, other than that this is
>> a distribution estimation problem.

>

> I will try to follow up on the terms that you used in this message, but
> I was not talking about the "simple estimation problem," that you
> described.

I thought you posed two problems:

1) predict when the alarms go off by observing the data stream
2) predict when the alarms go off by observing the data stream and conducting
some experiments (e.g. setting the values of some sensors and observing the
effect).

I was addressing the former. Was that not one of two problems you posed?

-- Michael

feedbackdroid

unread,
Jul 20, 2006, 2:25:04 PM7/20/06
to

Curt Welch wrote:
> "feedbackdroid" <feedba...@yahoo.com> wrote:
> > Curt Welch wrote:
>
> > > So,
> > > the simple fact that an operation like a+b is happening, is no proof
> > > that information is being discarded by the network. You would have to
> > > prove that a corresponding feature like a-b was not being abstracted at
> > > the same time.
> > >
> >
> > It works the other way. The onus is on your back to prove your case, in
> > light of all the other evidence [see below]..
>
> No it doesn't work the other way around. You are the only one making a
> claim here and your claim is that information is lost. My claim is that I
> don't know. My other claim is that you haven't presented any viable
> evidence to support your claim.
>
> If you want to continue to make the claim that information is lost, then
> YOU have to prove it or you have to expect us to call you dishonest.
>
> > > And we know for a fact that the visual system transforms the lower
> > > level data into multiple higher level abstractions at each step.
> > >
> >
> > You've just contradicted your previous statement. Higher-level
> > abstractions means that information is discarded.
>
> Only to you. And that's the claim you have not yet supported. a+b IS a
> higher level abstraction, but if it's paired with the a-b high level
> abstraction, then nothing is lost in the system as a whole. Do you not
> understand this?
>


IF, IF, IF it's paired ..... and so, "I" have to show it's NOT paired, just
because you postulate it "may be" paired. How about if I try showing
there's no god instead, just because you postulate there is one.


>
> > And what do you do when you integrate photoreceptor responses into
> > small circular C-S structures in retinal ganglion cells, and further
> > into enlongated bar-detectors in cortex? You throw away loads of raw
> > pixel information.
> >
> > Look at this .... http://en.wikipedia.org/wiki/Optic_nerve
>
> Is there any data on that page that has any relevance to this issue of
> whether the transform that takes place in the visual cortex is loosing
> information as you say it is?
>
> > ===================
> > The optic nerve contains 1.2 million nerve fibers. This number is low
> > compared to the roughly 130 million receptors in the retina, and
> > implies that substantial pre-processing takes place in the retina
> > before the signals are sent to the brain through the optic nerve.
> > ===================
> >
> > 130M photoreceptors and 1.2M O-N fibers. Substantial pre-processing.
> >
> > This means that loads of raw pixel information is already removed in
> > the retina.
>
> Well, first off, it's possible to compress data and not remove any
> information. You can compress a 1000 byte file into a 300 byte file with
> standard compression programs. Does that prove that 700 bytes of data were
> thrown away? If so, how is it possible for compression programs to
> recreate those 700 bytes if the data was thrown out?
>


I just love it when non-biologists like you and Glen-Michael come around
claiming the brain performs complex mathematical transforms. As I indicated
to him, Pribram's hologram/Fourier transform idea now lies deep-6'ed on
the scrapheap of history. Now you're saying maybe the brain does
something like LZW. As I've indicated, YOU need to provide some proof
of your claims, not me needing to disprove your unproven claims.

Maybe means exactly one thing .... maybe.


>
> So, the fact that there is a 130 to 1 reduction in signals is no proof that
> information is thrown away. There might simply be a huge amount of
> redundancy in the raw signal which was as removed as the signal was
> compressed on the 130 to 1 fan-in transform. So once again, you have
> proved NOTHING about whether information is being lost or not. To prove
> that, you have to analyze and the characteristics of the signals.
>


130-to-1. Approx 130 photoreceptors feed onto 1 bipolar cell. The bipolar
cell will change its output voltage in response to 3 [or 10 or whatever]
photon strikes. So, which 3 photoreceptors of the 130 were hit by photons?

What has happened here is that the retina has traded off knowledge about
the specific location of the photons for the ability to respond to tiny
light levels. It's more important for the survival of the organism to
detect a predator at extremely low light levels than to immediately
recognize who or what the predator is.
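
That trade-off is easy to caricature in code (pool size, threshold, and hit
patterns below are made-up illustrations, not retinal parameters): summing
130 inputs into one output makes a 3-photon event detectable, while which 3
of the 130 inputs were hit is no longer recoverable from the output.

import numpy as np

n_inputs, threshold = 130, 2.5

def bipolar_output(photon_hits):
    # The pooled unit only "sees" the sum over its 130 inputs.
    return photon_hits.sum() > threshold

# Two different patterns of 3 photon strikes among the 130 receptors:
a = np.zeros(n_inputs); a[[5, 40, 99]] = 1
b = np.zeros(n_inputs); b[[7, 41, 100]] = 1

print(bipolar_output(a), bipolar_output(b))   # True, True: detected either way
# The output is identical for both, so *which* receptors fired is gone.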


I meant unchanged in the sense of not losing information, and therefore
being totally recoverable at some later stage.


>
> If you change (x,y) into (x+y, x-y) the data is very much changed - the old
> values (x and y) are thrown away. But no _information_ is lost in the
> process. The two new values still contains all the information which was
> present in the two original values. Do you no understand that this is what
> we have been talking about when you keep saying information is lost?
>
> If information is thrown away in the transformation as you suggest, this
> means that when we look at a visual scene, the scene could change in ways
> that we would be unable to react to (aka see) because that information was
> thrown away. Just what change in the visual field are you suggesting we
> are blind to? What information is thrown away? If a pinpoint light source
> the size of one pixel in the eye flashes on and off, are you saying we
> won't see the flashing because we are blind to one-pixel changes?
>


Actually, as I already said, 1 photon more or less at the "threshold of
detection" will have an important effect, but not so at normal photopic
[daylight] levels. This is called retinal adaptation, sensitivity
adjustment, curve-shifting. Look up Weber's Law, first described back in
the 1800s.

http://www.google.com/custom?q=webers+law
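
The gist of the law, with an illustrative Weber fraction (the value of k
below is assumed, not measured): the just-noticeable change grows in
proportion to the background intensity, so a change that matters near
threshold vanishes into a daylight background.

# Weber's law: delta_I / I is roughly constant over a wide range.
k = 0.02                       # illustrative Weber fraction
for I in (1.0, 10.0, 100.0, 1000.0):
    print(f"background {I:7.1f}   smallest noticeable change ~ {k * I:6.2f}")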


>
> Are you
> saying the data about that flashing pixel is sent to the visual cortex, but
> it's lost before it can effect our behavior in the motor cortex? That's
> what you are implying by saying information is lost in the abstraction
> process.
>


Look out the window, and see the trees. Now imagine that all the light
was blocked from one of 10,000 leaves on the tree. Would it matter
to identifying the tree?

feedbackdroid

unread,
Jul 20, 2006, 3:40:15 PM7/20/06
to

Glen M. Sizemore wrote:
> "feedbackdroid" <feedba...@yahoo.com> wrote in message
> news:1153407188.4...@75g2000cwc.googlegroups.com...
> >
> > Glen M. Sizemore wrote:
> >> None of this changes the fact that you are a petty fascist and
> >> intellectually dishonest. You have revealed yourself as the thug you are.
> >>
> >>
> >
> >
> > ROTFLOL.
> >
> > Just for point of information, how old were you when you mother made
> > you start wearing long pants?
>
> Once again, your silly responses will not change the fact that you
>


BTW, on the other part of the thread, both CW and MO are maintaining
that the cortex performs some sort of complex information-preserving
mathematical transforms. Good one. I'd agree, but certainly don't intend
to make their lives any easier. They're gonna have to work for it.

MO has his wavelet transform complete with phase info, and CW now
has an invertible compression routine. Sounds like the old homunculus
is rearing its evil head again.

Best go straighten them out already.

Michael Olea

unread,
Jul 20, 2006, 3:42:01 PM7/20/06
to
Glen M. Sizemore wrote:


> ... What I can't understand is how the


> few people around here, that have some shred of decency, have not, at
> least, rebuked and censured you (but not censored, Dan, I am not the
> fascist here).

I once told Curt to stop posting here. I meant it rhetorically, and I
apologized the next day. I'm sorry I called Dan a cockroach. That was
certainly unfair and highly insulting.

If he contacted your university in an attempt to muzzle your comments here
then that was a despicable act.

-- Michael

J.A. Legris

unread,
Jul 20, 2006, 5:38:30 PM7/20/06
to

Michael Olea wrote:
>
> If he contacted your university in an attempt to muzzle your comments here
> then that was a despicable act.
>
> -- Michael

Yes. The honourable thing would be to go there and muzzle him
personally. Alas, chivalry is dead.

Seriously though, if I discovered that one of my professors was
behaving so impolitely in a public forum, I might just drop the course.
It's bound to reflect on the university sooner or later. The chair's
advice was reasonable.

--
Joe Legris

Lester Zick

unread,
Jul 20, 2006, 6:14:15 PM7/20/06
to
On 20 Jul 2006 12:40:15 -0700, "feedbackdroid"
<feedba...@yahoo.com> wrote:

And please let's not forget my mechanical pluralism and first, second,
third, and fourth order differences between differences and cerebral
tautological regressions, Dan. The more things change . . .

~v~~

mimo...@hotmail.com

unread,
Jul 21, 2006, 5:15:03 AM7/21/06
to
Glen Sizemore, you are a bit of a bully, and I don't know whether it's
to do with your training, because in some ways you're also helping
people to defend themselves, their opinions, and their preferences
by getting them to question their particular points of view. Obviously
a lot of private opinions are close to the heart, and it's unavoidable
that people will get upset. You might have a problem yourself; perhaps
you were bullied.

To everyone else, we all have to earn a living.

This is an open NG, a public space available for people from all
walks of life. There are other groups.

bye,

N.

feedbackdroid

unread,
Jul 21, 2006, 11:11:45 AM7/21/06
to

>
> Seriously though, if I discovered that one of my professors was
> behaving so impolitely in a public forum, I might just drop the course.
> It's bound to reflect on the university sooner or later. The chair's
> advice was reasonable.
>


Yeah. What you post on the internet is forever, for the entire world
to see. If you're really really stupid, it might come back and bite you

in the ass one day.

Instead of you taking the course, Joe, what if your teenage daughter
was enrolled at a college, and you found out she was taking a
course from a guy who, in between lectures, was logging on to
the internet using university computers and posting filth? .....
and using his own name, no less. What a brilliant boy.

For my part, I refused to let my own daughter go to the local
university, which has been rampant with scandal after scandal
for the past 15 years, football recruiting sex scandals, continual
date-rape drug incidents, 2 freshman [under-age] students literally
having died at fraternity parties from too-much alcohol ingestion
[I kid you not], drunken students rioting in the streets [happened
3 years in a row the 1st week of classes], on and on. The regents
finally got some goddamn cojones and fired the useless president,
and brought in a new one who cleaned house. They also threw
all the fraternities off campus that wouldn't agree to limit using
alcohol to recruit freshmen, and totally closed 1 or 2 down.

A lot of academics just don't have squat when it comes to common-
sense. The entire university community [deans and faculty] sat on
their stupid thumbs for years while all this crap was going on.

Glen M. Sizemore

unread,
Jul 21, 2006, 11:35:10 AM7/21/06
to

"J.A. Legris" <jale...@sympatico.ca> wrote in message
news:1153431510....@i42g2000cwa.googlegroups.com...

Hmmm, maybe we're not all as hypocritically puritanical as you. So what, Joe,
you don't swear. But you don't mind joining in to Dan's continual stream of
abuse. Do you? BTW, Joe, any time you want to come and "muzzle" me, I'm
ready to play.

>
> --
> Joe Legris
>


Glen M. Sizemore

unread,
Jul 21, 2006, 11:44:58 AM7/21/06
to

"feedbackdroid" <feedba...@yahoo.com> wrote in message
news:1153494705....@75g2000cwc.googlegroups.com...

>
>>
>> Seriously though, if I discovered that one of my professors was
>> behaving so impolitely in a public forum, I might just drop the course.
>> It's bound to reflect on the university sooner or later. The chair's
>> advice was reasonable.
>>
>
>
> Yeah. What you post on the internet is forever, for the entire world
> to see. If you're really really stupid, it might come back and bite you
>
> in the ass one day.
>
> Instead of you taking the course, Joe, what if your teenage daughter
> was enrolled at a college, and you found out she was taking a
> course from a guy who, inbetween lectures, was logging on to
> the internet using university computers and posting filth? .....
> and using his own name, no less. What a brilliant boy.

Did you teach your daughter to be a fascist, like you, Dan? Nothing you say
will change the fact that you sought to have me censored, and that's what's
called being a fascist, Dan. Oh, BTW, as I explained to you when you started
your little campaign to have me censored, I don't post using my university
address, and I don't mention what university I work at. And I told you the
reason: I want to be free to call slugs like you exactly what you are. Also,
BTW, the only one slinging insults around here, for the most part, has been
you. And "posting filth"? Tossing a few swear words at an
intellectually-dishonest, abusive fellow like you? Is your Mommy still
around so widdle Danny can go cwy to Mummy when tumone calls him a poo-poo
pants?

J.A. Legris

unread,
Jul 21, 2006, 12:31:04 PM7/21/06
to

That was meant to be humorous. I could have put a smiley after it, but
I figured the word "seriously" would make it clear. Do you prefer
smileys?

Puritanical? Gimme a break. What ever happened to professionalism? If
you cannot help yourself then you should at least inform your students
about the existence of Mr. Hyde and assure them that it is Dr. Jekyll
who will be grading them.

--
Joe Legris

bob the builder

unread,
Jul 21, 2006, 12:41:04 PM7/21/06
to
I don't know what problems are playing out between some of the regular
posters here. But there are better places to offend each other than
this group. And although it's alright to be impolite here, being
OFF-TOPIC isn't!

feedbackdroid

unread,
Jul 21, 2006, 1:25:14 PM7/21/06
to


OT seems to be the name of this forum, in case you hadn't noticed.
You haven't been here very long.

But I think you can see the problem. I have a responsibility to protect
my own daughter, and now GS is calling even her a fascist.

BTW, there is no proof for what he's accused me of here. It's just
his imagination running on double-time. You have NO idea how many
people on this forum he has sullied with his verbal abuse over the
past several years. I just happen to be on top of his enemy list.

BUT, you can bet I'm gonna find out who his chair is NOW, and
apprise him of the msg GS just posted. Hopefully, GS didn't post it
from his university system. I'm sick of his endless BS.

bob the builder

unread,
Jul 21, 2006, 2:52:21 PM7/21/06
to
feedbackdroid wrote:
> bob the builder wrote:
> > I don't know what problems are playing out between some of the regular
> > posters here. But there are better places to offend each other than
> > this group. And although it's alright to be impolite here, being
> > OFF-TOPIC isn't!
> >
>
>
> OT seems to be the name of this forum, in case you hadn't noticed.
> You haven't been here very long.

I am incapable of seeing the relevance even if something is relevant.

> But I think you can see the problem. I have a responsibility to protect
> my own daughter, and now GS is calling even her a fascist.

A hypothetical daughter or a real one? But I agree. On the internet it's
very common to be insulted or to have relatives insulted. But when
people know who you are and you know who they are insulting, then it's
another ballgame. If someone insulted my girlfriend in the street,
I would make him apologize.

> BTW, there is no proof for what he's accused me of here. It's just
> his imagination running on double-time. You have NO idea how many
> people on this forum he has sullied with his verbal abuse over the
> past several years. I just happen to be on top of his enemy list.
>
> BUT, you can bet I'm gonna find out who his chair is NOW, and
> apprise him of the msg GS just posted. Hopefully, GS didn't post it
> from his university system. I'm sick of his endless BS.

Oh, I usually wait around the corner with a baseball bat.

Glen M. Sizemore

unread,
Jul 21, 2006, 3:15:13 PM7/21/06
to

"feedbackdroid" <feedba...@yahoo.com> wrote in message
news:1153502714.3...@s13g2000cwa.googlegroups.com...

>
> bob the builder wrote:
>> I don't know what problems are playing out between some of the regular
>> posters here. But there are better places to offend each other than
>> this group. And although it's alright to be impolite here, being
>> OFF-TOPIC isn't!
>>
>
>
> OT seems to be the name of this forum, in case you hadn't noticed.
> You haven't been here very long.
>
> But I think you can see the problem. I have a responsibility to protect
> my own daughter, and now GS is calling even her a fascist.

You know that this isn't true, Dan. But the truth means nothing to you -

absolutely nothing. This is what I wrote:

"Did you teach your daughter to be a fascist, like you, Dan? Nothing you say
will change the fact that you sought to have me censored, and that's what's
called being a fascist, Dan."

What would you call someone who seeks to control the content of others'
speech?

>
> BTW, there is no proof for what he's accused me of here. It's just
> his imagination running on double-time.

Of course there is proof. There is proof that you have repeatedly suggested
that I be censored, and you know it. I'm not going to dig up the posts, and
maybe you didn't archive them, but you know what you said. Or are you
pathological enough to think that you did not?


>You have NO idea how many
> people on this forum he has sullied with his verbal abuse over the
> past several years. I just happen to be on top of his enemy list.
>
> BUT, you can bet I'm gonna find out who his chair is NOW, and
> apprise him of the msg GS just posted. Hopefully, GS didn't post it
> from his university system. I'm sick of his endless BS.

As I have just pointed out, and as others can see for themselves, the truth
is that you have repeatedly insulted, first me, and now Olea. Right, Dan?

>


Glen M. Sizemore

unread,
Jul 21, 2006, 6:00:34 PM7/21/06
to

"J.A. Legris" <jale...@sympatico.ca> wrote in message
news:1153499463.9...@s13g2000cwa.googlegroups.com...

> Glen M. Sizemore wrote:
>> "J.A. Legris" <jale...@sympatico.ca> wrote in message
>> news:1153431510....@i42g2000cwa.googlegroups.com...
>> >
>> > Michael Olea wrote:
>> >>
>> >> If he contacted your university in an attempt to muzzle your comments
>> >> here
>> >> then that was a despicable act.
>> >>
>> >> -- Michael
>> >
>> > Yes. The honourable thing would be to go there and muzzle him
>> > personally. Alas, chivalry is dead.
>> >
>> > Seriously though, if I discovered that one of my professors was
>> > behaving so impolitely in a public forum, I might just drop the course.
>> > It's bound to reflect on the university sooner or later. The chair's
>> > advice was reasonable.
>>
>> Hmmm, maybe we're not all as hypocriticlly puritanical as you. So what,
>> Joe,
>> you don't swear. But you don't mind joining in to Dan's continual stream
>> of
>> abuse. Do you? BTW, Joe, any time you want to come and "muzzle" me, I'm
>> ready to play.
>>
>
> That was meant to be humourous. I could have put a smiley after it, but
> I figured the word "seriously" would make it clear. Do you prefer
> smileys?

Well, you never know. Everybody thinks they are a cage fighter nowadays.
Stand up? Submission? Ground and pound?

>
> Puritanical? Gimme a break. What ever happened to professionalism?

I save it for the classroom and laboratory. I consider this to be more like
a bar with people listening in and jumping in should they care to. Joe, I
think in your heart of hearts, you know, no matter how much you hate me
(sure you do, Joe, it's ok - and I pretty much hate you too), and no matter
how mean I have been occasionally, that Dan should not be soliciting others
to (or attempting to directly) have my speech censored by the university.
And I think you can see that his latest threat shows that, as I implied, Dan
will go to ANY lengths to attack somebody. He has lied in saying that I
called his daughter a fascist, as you yourself can see. And even if I did,
whom did I call a fascist? Miss feedbackdroids? Why do you think that Dan
stopped posting under his real name? Continue to hate me, Joe, but for God's
sake, show some moral backbone occasionally. Oh, my moral backbone? I have
done many things in my life, but I have never suggested that anyone's legal
speech be curtailed - ever. And do you doubt that my speech is legal, Joe?
If it wasn't, Dan could easily go some sort of legal route. But he doesn't,
does he? Why? Because I have the right to free speech. What else is left in
this country?

>If
> you cannot help yourself[]

Is this some epistemological statement, Joe?

>then you should at least inform your students
> about the existence of Mr. Hyde and assure them that it is Dr. Jekyll
> who will be grading them.

The university monitors the performance of everyone teaching a course. Some
students hate me, some like me, and so it goes. What does my legal behavior
outside of the classroom, completely unconnected from the university, have
to do with the classroom?


>
> --
> Joe Legris
>


Glen M. Sizemore

unread,
Jul 21, 2006, 6:02:37 PM7/21/06
to
DM: "BTW, there is no proof for what he's accused me of here."

"BUT, you can bet I'm gonna find out who his chair is NOW, and
apprise him of the msg GS just posted. "

GS: I rest my case.

"feedbackdroid" <feedba...@yahoo.com> wrote in message

news:1153502714.3...@s13g2000cwa.googlegroups.com...

Curt Welch

unread,
Jul 22, 2006, 12:05:08 AM7/22/06
to
"feedbackdroid" <feedba...@yahoo.com> wrote:
> Curt Welch wrote:
> > "feedbackdroid" <feedba...@yahoo.com> wrote:

> Look out the window, and see the trees. Now imagine that all the light
> was blocked from one of 10,000 leaves on the tree. Would it matter
> to identifying the tree?

Of course not.

Do you understand that I agree completely with the idea that the neurons perform
transforms that toss out huge amounts of data? Just like all AND, OR,
and XOR gates perform transforms that toss out information. Only a simple
NOT gate performs a transform that doesn't lose information.

If you understand that, as I have explained before, I don't understand why
you would once again ask a question like the one above. The neurons that
give us our ability to detect trees will, like all the neurons, toss out
huge amounts of information in their identification of a tree. Just like
the function a+b tosses out roughly half the information in a and b.

But, as I said in the first post, the information is not "lost" if it's
located elsewhere. So if there are other detectors that work in parallel
with the tree detectors (which of course we know there are millions) that
are correctly complementary, then the data is not lost in the network.
It's unlikely that evolution would construct a visual system to send a lot
of data to the visual cortex, only for the visual cortex to toss it out as
you seem to be suggesting. It takes energy to move information like that
and energy is not something evolution tends to allow to be wasted like
that. This is most likely why there is that large reduction in the eye.
Evolution couldn't build perfect sensors to extract only the important data
from the light, so it built the best sensors it could, and then
immediately, filtered out what wasn't cost justified, before sending it on
to the visual cortex (probably as you suggested, sacrificing resolution for
dynamic range in the process).
