
The Sorry State of Statistics


William Chambers

Feb 28, 2000
Dennis Roberts (d...@psu.edu) wrote:

>i have asked the list owner to remove you from the list ...

Over the past few days (15 years) I have tried to describe a procedure that
will allow us to infer causation from correlations. I have consistently
been dismissed with personal attacks and very poorly thought out comments.
Dennis Roberts just wrote to inform me that he is asking the list owner to
remove me from this newslist. Of course, Dennis fails to acknowledge the
role of others in these exchanges. That is not part of the establishment's
"bait and switch" tactic.


The fact is that polarization does allow us to infer causation. Not one
person has been able to disprove this over the past 15 years. But a long
list of pompous little men have questioned my expertise and personality in
an attempt to suppress recognition, or even testing, of corresponding
regressions. This is not science; it is a cult and a trade guild. It is
a travesty, and when one thinks about the degree to which lame statistics and
theory are used to make public policies, it is a crime.


It would really be very easy to please me. Just try the simulations I have
described, and if the facts speak to the point, admit in public that for the
model as I describe it, corresponding regressions allows us to determine
causes from effects. None of the little men who insult and cry to the list
owners and other thugs have dared. Some of you others, however, have tested
CC/CR privately and admitted as much to me. In fact, people in hiding have
been doing this for years. None, except Gottfried Helms, have had the
courage to say in public that yes, the method works for the given model. The
journals Perceptual and Motor Skills, Structural Equation Modeling and
the Journal of Mind and Behavior have published articles that show that CR
works, and they should be applauded for their integrity. But other journals
have returned reviews that were even less objective and more vicious and
stupid than the comments of people like Rich, Frank and Rubin. These
journals include Psychological Bulletin and Psychometrika. And while people
like Stan Mulaik admit privately there may be something to CR, he dismisses
me and the method publicly with diagnoses of my person. Paul Meehl
indicated to me privately that he could see no error with CR, but he said
nothing in public. Hans Eysenck told me in private that he could not see any
reason CR would not be useful in inferring causation, but death has closed
his lips forever. There are several other "famous" men who admit in private
that CR seems to work. And many lesser mortals have written me in private.
But only Gottfried has had the guts to speak out in public. I may disagree
with some of Gottfried's views, but I have strong respect for him as a man
and as a friend.

Another friend on this list wrote to me the other day and explained why
things are as they are. Statistics is really a big conservative club. There are people
who fancy themselves the guard dogs of an establishment that seems to be
handed down from the hands of God. This establishment is really an economy
that has enriched itself by selling software, textbooks and
certifications to people who are incapable of really understanding
statistics. There is a lot of money for those who play ball with the old boy
system. There are several rules to the game, however. Do not get creative
unless you are famous. If you are a nobody, do not do anything too
interesting, because the old boys will resent you for stealing the
limelight. Another rule is that you must take insults without returning them.
My violating this rule motivates Dennis to get me thrown off this list, when
the people who are the cause of the problem go on year after year insulting
people by their ignorance and pathology. People like Dennis tolerate lie after lie and
insult after insult but get offended when someone finally says "Enough
Bullshit." Dennis and Co. want to keep up the superficial appearance that
all is well and professional, but the fact is that the discipline is
incapable of addressing something as simple as corresponding regressions. It
is not that CR is difficult math; I have explained it to undergraduate
psychology majors with ease. The problem is that CR is revolutionary.

So long as we cannot infer causation from correlation, our culture is
free to wallow in ambiguities and uncertainties that give powerful people
the chance to avoid taking responsibility for their crimes. Just as none of
you are willing to speak up on this newslist and say, "You idiots who keep
criticizing the man Bill Chambers have no right to say anything until you
have tested the method and logic yourselves." You do not, because you all
have a place, or want a place, on the gravy train. The train is really not
science but a bag of tricks to be manipulated by administrators who value
money and power more than truth. So long as statistics remains full of
ambiguities with regard to causal inference, anybody with power can argue
his or her case and have her way. They say, "Who can say what is true?" and
look around with a sneer and with egg on their faces. Statistics is very
much like the old wild west of Ronnie Reagan. Truth and goodness are beside
the point. It's who has the power to suppress angry voices that matters...
who lies with the biggest grin and softest voice.

You may throw me off this list, but you have not been able to disprove
corresponding regressions, and you sure as hell have not proved yourselves
superior to me personally. You may keep a low profile and save your
backsides for the men in power and for a place at the table, but you are
eating those innocent people who have to pay the price of your lies. YOU ARE
ALL GUILTY when you turn your back on the pursuit of truth.

Now tell us, if you have the brains and integrity, what do you think about
the polarization that occurs for the model y=x1+x2? Remember this
question when you go flirting with the big boys. Remember this question when
your own children take the medication that some other scumball has lied
about. (And do not pretend I am being melodramatic. My daughter Grace died
from birth defects caused by a drug that was taken off the market soon after
her birth.) Remember this question when you complain that you have been
treated unfairly. And look away in shame.

William Chambers, PhD

Robert Ehrlich

Feb 28, 2000
Bill: Without judging procedure, I can sympathise with your dilemma.
Over the years some stuff that I have done has been greeted by a
groundswell of indifference. I changed tactics and used the new tools
to work on real-world problems that were posed by non statisticians.
Success in that realm helped make our stuff "reasonable"---although some
of this was our fault in that we had not carefully thought out the
implications of our work. Finally, I decided that, because we live in a
capitalistic society, the best way to garner acceptance is to make a lot
of money using "new" approaches. I am in the midst of that right now.
So, for now, forget the stats people and the psych. people. Find out
where the real-world problems are and prove by doing. I am having the
time of my life right now.

William Chambers

Feb 28, 2000
Muriel Strand wrote and said,

"i strongly recommend that YOU consider therapy."

I responded that I had, several times, but that even so I am still disgusted
by people like her. She did not give any comment on corresponding
regressions. I guess she was busy prescribing to other people as well.

Now on to a reply to Robert...

Robert,

I am slowly being driven your way. I am not against wealth, just stealing.
And I continue to wonder why I waste so much time with insincere people.
But I am somewhat a function of my birth and childhood. I saw a lot of bad
things growing up in the Mississippi delta. My father fought in WWII and my
mother worked in a factory in England during the war. I saw a lot of things
through their eyes. I remember one million dead Vietnamese peasants. I
remember my daughter's wonderful face and how real scientists and doctors
worked such marvels to give her some life, at least. I would like to give
back to people who deserve it, for whatever reason.

If you have any suggestions of how I can apply corresponding regressions, or
know anybody who would be interested in giving me the chance, I am ready to go
to work. For that matter, I am ready to go to work doing just about anything
that is ethical. I have been fixing axles on cars this week, but it is not
much fun without the right tools.

I am glad you are having a good time. Maybe I will see you on the beach some
day. I hope so. Isn't it strange that we have to let some people steal from
us what we would freely give?

Best,

Bill

Robert Ehrlich wrote in message <38BAFCA4...@home.com>...

William Chambers

Feb 29, 2000

I received the following by private mail and thought it worthy of inclusion
in the public debate, with the author's name deleted.

>Hi William

>I think you are having a hard time with this because you want to use a word
>(in various forms) which has a special philosophical meaning in statistics
>(and in physics). The word is 'cause' (or 'causal' or 'causation'). Within
>statistics, 'cause' is effectively "copyrighted" as an abstract and
>unreachable ideal (as is 'truth').

>Perhaps the main problem is your marketing strategy.
>How about changing the descriptive label for your method?
>(e.g. "inferential correlation" or "associated correlation"
>or "second order inference" or "latent correlation")

>Kind regards

>A.

Bill responded:

I appreciate the author's advice and I think she has something clever to
say, but I am not comfortable with the strategy, personally. The point of
corresponding regressions is that it really does allow us to infer causation
when we bother to explicitly define it in the way that I have (and most
others implicitly do, i.e. additivity). Without causal inference,
however, corresponding regressions is not much to crow about. In fact, it
needs attention from mathematicians before it can enter a high state of
elegance, even if it were trivial elegance (which it is not). Someone with
a mastery of calculus needs to develop the most efficient approaches to
polarization and the control of type I and type II errors. But these are
cleanup issues that the course of "normal" science tends to correct, if it
gets the chance. The fact, however, is that no one is going to bother with
such refinements of corresponding regressions unless the revolutionary
implications of the method are made known. I am a nobody, and the method
will not be recognized by the sorry state of statistics without a fight. I
have tried for 15 years and have constantly run up against a corrupt system
of old boys being protected by idiot guard dogs who do the dirty work of
misunderstanding truth. I doubt that I will personally ever see the method
get acceptance or even be fairly tested. But I feel a sense of duty to
continue the good fight.

Trying to sneak CR/CC into the establishment via euphemisms will not work,
simply because there is no incentive for anyone to use the method short of
the revolutionary truth. Maybe someone could make a lot of money with it,
but I personally am a peasant and know nothing about money. I do, however,
know about philosophy. I am up against a jaded culture that has given up on
the higher ground of causal inference while forbidding others from
continuing the search with respect and dignity.

If you notice, I have repeatedly asked people about the polarization effect.
I have done this many, many times over the years, including scores of times
on SEMNET. The reaction is predictable. First there are superficial
dismissals and laughs: Chambers is a nut, it cannot be done. Then I present
the simulation. Sudden silence, then more insults and evasions. I try a
patient and constructive response. More evasion and insult. I snap insults
back. They start to pay attention but dig their heels in against me
personally. Then they start free associating, coming up with any bizarre or
vague excuse to dismiss my logic, change the subject and, above all, avoid
agreeing with the mathematical results of the y=x1+x2 simulation. They
instinctively know that if they put aside bullshit and try the simulation
honestly, they will have to agree with my conclusions. They know I am
not bluffing about the simulation and polarization effect. So they refuse
to address it.

It's like the scientists in Galileo's time who refused to look through his
telescope. People refuse to try the simulation because they are afraid it
is the first step on a slippery slope that will lead to them being seduced
and publicly embarrassed. Here is where my problem with the sorry state of
statistics originates. I have written to Joreskog and just about every
other statistician who claims to be an expert on causal inference. They all
ignore me. I am old enough and smart enough to know that all these people
do not hate me personally. They do not know me. It's one of the advantages of
being a nobody. I also know that if they could, they would love to rip my
ideas to shreds. It's a mean vanity thing, but it's not all bad, because
science is serious stuff. If I am suggesting something ridiculous then it
SHOULD be torn up and discredited. But there are rules to this sort of
invalidation, and nobody is playing by them. The reason they do not disprove
CR is that they do not know how. The simple math is too convincing, and even
highly educated and otherwise honorable men will react to the polarization
effect with psychological horror. This is because the implications are
revolutionary. They do not want to know that it is possible to infer
causation from correlations.

If science worked in a systematic and logical manner as a sociological
discipline, then sneaking in CR with a euphemism would work. Like a virus,
the polarization effect would spread and be incorporated into the
technology, perhaps initially via the business world and fortune making.
Eventually it would just be obvious that the method allows us to infer
causation. But science is not really that vulnerable to the virus of truth.
People simply refuse to look through Galileo's telescope, and the method does
not get used by any name. 15 years ago I might have considered the euphemism
tactic, but now I know the problem is much deeper than pretty versus ugly
words. The problem goes to the core of the system. The problem is
corruption at the deepest intellectual and moral levels.

Capitalism and the business paradigm have corrupted science. Just look at
our universities and colleges. Most are run by businessmen, many of whom
have pretty sorry histories of intellectual productivity. At least this is
the case when you get outside the best schools. The upshot is that the
very ideas of truth and integrity are dismissed as adolescent. The
philosophy of relativism is then rammed down the weak minds of the youth and
those paying their mortgages, and we end up with a culture of euphemisms
that are all intended to hide the fact that the Emperor has no clothes. He is a
naked liar and a beast who rapes the truth and the innocent fools who cannot
protect themselves.

Some of you think I must need therapy for saying this, but you are probably
so casual in your values because you long ago sold your souls for the
comforts that silver can buy. To you I say, "What do you think of the
polarization of the correlations of the causes across the range of the
dependent variable in the model y=x1+x2?" You stare back dumbly and in
anger, diagnosing what you are afraid to understand. It all comes back to
responsibility. You do not want to take it. You want to hide behind
committees and behind the myths of a cruel society. That we cannot infer
causation from correlation is one of these myths. It helps keep the light
out. It outlaws Galileo's telescope. It is statistics' nasty little secret.

And so I rant and rave and calculate very simple equations that most of you
dare not inspect. I have made a career of pursuing the truth, and most
of you have made a career of lies. You have money and jobs; I have a soul.
So far I think I am winning, whether I am miserable and need therapy or not.
Beware, because the laugh of a simple child in the crowd could start an
avalanche. But relax, because the mob police will probably beat the laughter
out of you when you sign over the check this month on that fancy car you
drive. Then the cycle of self-deception by euphemisms starts all over again,
until next month and another laughing child threatens the parade.

Bill

Elliot Cramer

Feb 29, 2000
William Chambers <will...@roman.net> wrote:
: Dennis Roberts (d...@psu.edu) wrote:

:>i have asked the list owner to remove you from the list ...

I couldn't care less whether you stay on the list or not, but Fisher had
it right. If you want to infer causation, do a randomized trial.
That's all the FDA will accept as evidence. Otherwise you will have
endless arguments of the sort you are engaging in. If you REALLY can't do
a randomized trial, e.g. smoking and cancer, you'd better have a great deal
of non-statistical evidence.

If you are just too lazy, stupid, or ignorant to do one, don't call
me; I'll call you.


William Chambers

Feb 29, 2000
Elliot,

Of course you are correct in encouraging us to do experiments. The problem
is that there are many things in life that we can measure but cannot
manipulate. It may simply be impossible to manipulate the phenomena, or
it might be unethical. This leaves a lot of folks in trouble if their goal
is to develop a science of these impossible-to-manipulate variables. CR is a
method that could allow those of us interested in such phenomena the
opportunity to further develop our scientific theories.

There is a major historical issue here. So long as we can only infer
causation by manipulating physical objects, the scientific purview is
limited to concrete phenomena. This tends to discount the credibility or
worth of less concrete events and objects, such as psychology, sociology,
etc. So those of us who are interested in the social sciences are easily
dismissed as flakes by the physicists and chemists. This shapes our culture
and economy and leaves the determination of many social science issues to
the sort of dark forces that at one time had us believing that the world was
flat, that witches flew through the night on broomsticks and that whatever
was to be seen in the telescope was not worth the look. So there are major
ramifications to the problem of causal inference without manipulation.

Now as to the FDA's recognition of nonexperimental causal inference: so long
as the FDA can ensure that experiments exist to test the drugs, then I say
let them continue in their narrow-mindedness. I am sure they think they are
very special. Surely you, however, do not delude yourself into thinking that
just because a government agency insists on something, it is true?

Anyway, the problem comes when there is no experimental evidence, for
example back in the 1940s with cigarettes and lung cancer. The correlation
had been noticed, as I recall, in the 1930s, but without the experiment there
was little that could be said. The tobacco company scientists just laughed
because they knew the causation could not be proven with the traditional
correlation. If corresponding correlations had been a recognized method back
then, on the other hand, the government might have been forced to recognize
that tobacco causes cancer, and we might not have lost so many people
(including my father) to lung and other cancers.

You suggest that anyone who does not want to do an experiment is lazy. Well,
would you volunteer to smoke cigarettes for 30 years to see if it gives you
cancer? Would you be lazy if you did not force subjects by random assignment
to undergo such treatment?

Do you think I am too stupid to dream up such a Nazi-like experiment? Do you
think I did not learn how to run an experiment after taking seven graduate
stats courses and teaching a number of times as a professor? Your attitude
is that of a person who is incapable of seeing beyond his own good fortune
of having something simple, or at least concrete, to study. Why is there no
room in your mind for alternative approaches to causal inference? What do
you think about the polarization effect for the model y=x1+x2? Tell me, why
wouldn't you want someone who has apparently solved a great statistical
puzzle (inference of causation) on this list? Are you unfriendly or just
plain stupid?


Bill Chambers

Henry

Feb 29, 2000
On Tue, 29 Feb 2000 09:50:24 -0600, "William Chambers"
<will...@roman.net> wrote: <snipped>

>The point of corresponding regressions is that it really does allow us to infer
>causation when we bother to explicitly define it in the way that I have
>(and most others implicitly do, i.e. additivity).
>Without causal inference, however, corresponding regressions is not much
>to crow about.

The following is my understanding of what you are saying. After that
is my personal problem with your conclusion. I recognise either could
be wrong.

Imagine a dice game. Throw a red die and note the score (call it X),
then throw a blue die (call it Y), then add the two scores (call the
result Z=X+Y). Do this several times, then sort the results by Z,
divide the results into groups of low Zs, medium Zs and high Zs and
look at the pairwise correlations between X, Y and Z in each group. I
simulated the results 3600 times, took the bottom 1000, middle 1600,
and top 1000 and got the following results for correlation coefficients
comparing X with Y, Y with Z, and Z with X:

           r(X,Y)  r(Y,Z)  r(Z,X)
Low Z      -0.487   0.517   0.495
Medium Z   -0.879   0.243   0.249
High Z     -0.506   0.485   0.509

This seems to be what is described as "polarization", and is
interesting. The suggestion is that this implies that Z is caused by
X and Y (and indeed Z is a direct result of X and Y - the only
question is that of the implication).

But what about another dice game? This time throw two green dice and
note the total score (call it A), then throw a yellow die until it
shows a number strictly less than A but greater than or equal to A-6
and note its score (call it B), then look at the difference between
the two scores (C=A-B). Do this several times and then consider the
data. It looks very similar to that in the first game (with A and Z
showing similar patterns and X, Y, B and C showing similar patterns).
Now do the same pairwise correlation tests to see whether there is a
suggestion that B and C could be causing A (something we know not to
be true). Again I simulated the results 3600 times, sorted by A, took
the bottom 1000, middle 1600, and top 1000 and got the following
results for correlation coefficients comparing C with B, B with A, and
A with C in each group:

           r(C,B)  r(B,A)  r(A,C)
Low A      -0.506   0.493   0.505
Medium A   -0.879   0.281   0.260
High A     -0.491   0.512   0.497

It looks like the same polarization effect (or close enough to my
eyes). But from the construction of the second game, the causality is
the reverse of that in the first (A is not caused by B and C; C is
caused by A and B and B is affected by A). Hence my doubts that this
test (or indeed any other that simply looks at the data) can infer
causality.
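
For anyone who wants to reproduce the two games, a minimal sketch in
Python with numpy might look like the following (the seed, the helper
name and the exact group sizes of 1000/1600/1000 are my own illustrative
choices; the games themselves are as described above):

import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n = 3600

def grouped_corrs(u, v, w, sort_key):
    # Sort the triples by sort_key, split into bottom 1000 / middle 1600 /
    # top 1000, and report the pairwise correlations (u-v, v-w, w-u) per group.
    order = np.argsort(sort_key)
    u, v, w = u[order], v[order], w[order]
    groups = [slice(0, 1000), slice(1000, 2600), slice(2600, 3600)]
    return [(np.corrcoef(u[g], v[g])[0, 1],
             np.corrcoef(v[g], w[g])[0, 1],
             np.corrcoef(w[g], u[g])[0, 1]) for g in groups]

# Game 1: Z really is built from X and Y.
x = rng.integers(1, 7, n)
y = rng.integers(1, 7, n)
z = x + y
print("Game 1 (low/mid/high Z):", grouped_corrs(x, y, z, z))

# Game 2: A is rolled first, B is re-rolled until it lands in [A-6, A-1],
# and C = A - B, so A is not caused by B and C.
a = rng.integers(1, 7, n) + rng.integers(1, 7, n)
b = np.empty(n, dtype=int)
for i in range(n):
    roll = rng.integers(1, 7)
    while not (a[i] - 6 <= roll < a[i]):
        roll = rng.integers(1, 7)
    b[i] = roll
c = a - b
print("Game 2 (low/mid/high A):", grouped_corrs(c, b, a, a))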

William Chambers

Feb 29, 2000
Henry,

First, when I say additivity, I do not mean that we must create the model by
y=x1+x2. The term additivity is used in the literature, for example by
Michell, to indicate data that is composed by arithmetic operations. Thus
y=x1-x2 is also a legitimate causal model. The difference is that in the
model y=x1-x2, the correlations between x1 and x2 will be negative in the
extremes of y but positive in the midrange. But there is still polarization.
The method also works with the multiplication and division versions of the
model, but not as well.
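
For those who prefer code to a spreadsheet, a minimal sketch of the
subtraction case (uniform, independent causes; the seed, sample size and
variable names here are just illustrative choices of mine):

import numpy as np

rng = np.random.default_rng(1)          # arbitrary seed
n = 100
x1 = rng.uniform(size=n)
x2 = rng.uniform(size=n)
y = x1 - x2                              # the subtraction model

order = np.argsort(y)
x1, x2 = x1[order], x2[order]
q = n // 4
extremes = np.r_[np.arange(q), np.arange(n - q, n)]   # lower + upper quartiles of y
midrange = np.arange(q, n - q)                        # middle half of y

print("extremes:", np.corrcoef(x1[extremes], x2[extremes])[0, 1])   # expected negative
print("midrange:", np.corrcoef(x1[midrange], x2[midrange])[0, 1])   # expected positive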

You seem to have created three groups instead of two across the range of y.
In fact, we concatenate the upper and lower quartiles and call that simply the
extremes of y. "Extremes" implies both very high and very low values of y, so
you may want to simplify your analysis a little bit in this way. I am
pleased to hear that you got the polarization I have described for the model
y=x1+x2. Thus, for that model, it is clear that the polarization does occur
and that it will be useful for inferring this causal pattern.

But let's get on to the interesting simulation. First, the assumption of
corresponding regressions is that the causes are orthogonal and uniformly
distributed. Your simulation appears to violate the first assumption, since
A will not be uniform. But CR usually handles several layers of causation,
so this is not really an explanation for your simulation. What you say is:

>But what about another dice game? This time throw two green dice and
>note the total score (call it A), then throw a yellow die until it
>shows a number strictly less than A but greater than or equal to A-6
>and note its score (call it B), then look at the difference between
>the two scores (C=A-B). Do this several times and then consider the
>data.

A is like y, i.e. A=x1+x2. You then break the rules by building a yellow
variable that is logically determined, in part, by A. It is not independent
of A. B is not simply another random variable but is tailored to wrap around
the A values in a special way, a way that sets us up for false polarization.
This kind of transformation is of course a logical possibility, but it
probably requires a human intelligence to contrive the trick. A and B are
probably correlated either linearly or curvilinearly; I am not sure if I am
smart enough to figure it out. But it is abundantly clear that A and B are
not orthogonal and logically independent of one another. There are extremely
unlikely interactions at work here. Therefore the assumptions are violated
when C is calculated from what amounts to two related variables. This is a kind
of structural incest, and it will trick CR. At least I take your word for
it. But this is not really much of a problem for CR/CC, which are based on
linear models and independent causes. If nature chooses to be so perverse
as to think up ways to trick CR, then I suppose we are all in trouble. But
this seems extremely unlikely. It is also kind of like violating the
assumption of independence in an experiment. You can yoke all sorts of
patterns together to fool the ANOVA and the ostensible "random" assignment,
but I do not think the ANOVA boys and experimentalists would accept this as
proof against the experimental method.

What correlation did you find between A and B across the whole sample? What
do you think is the mechanism that causes the polarization in this data? I
do not believe that this is something you came up with by chance. It looks
like something you purposefully contrived to fool the method. Tell us how
you did it. If I had some of the data, I would do a scatter plot of A on B,
and I will bet that from that I could figure out how you did it. But I think you
ought to stick to simple linear models if you are going to test CR, or at
least do tests for nonlinear interactions between the variables. After all,
CR is not God. No statistical method is that smart.
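
For the record, anyone can check the A-B dependence directly. A rough
sketch in Python with numpy, re-using Henry's construction (the seed and
sample size are arbitrary choices of mine):

import numpy as np

rng = np.random.default_rng(2)    # arbitrary seed
n = 3600
a = rng.integers(1, 7, n) + rng.integers(1, 7, n)   # two green dice
b = np.empty(n, dtype=int)
for i in range(n):
    roll = rng.integers(1, 7)                       # yellow die, re-rolled
    while not (a[i] - 6 <= roll < a[i]):            # until it lands within 6 below A
        roll = rng.integers(1, 7)
    b[i] = roll

# B is constrained to track A, so a clearly positive whole-sample
# correlation is to be expected here, i.e. A and B are not orthogonal.
print("corr(A, B) over the whole sample:", np.corrcoef(a, b)[0, 1])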

(I am going out of town for a few days tomorrow, and if I do not respond it
is not that I am not interested in your simulation. I will try to get access
to the newslist on a friend's computer but may not be able to. I am attending
a conference. So hang in here with me. Hopefully you will answer tonight.)

Bill


Henry

Mar 1, 2000
On Tue, 29 Feb 2000 19:14:38 -0600, "William Chambers"
<will...@roman.net> wrote:

>You seem to have created three groups instead of two across the range of y.
>In fact, we concatenate the upper and lower quartiles and call it simply the
>extremes of y. Extremes implies both very high and very low values of y, So
>you may want to simplify your analysis a little bit in this way,

It may be a simplification, but I would expect it to inevitably
produce positive correlations, so that is why I split the top and the
bottom. But this is not a crucial point.

I had earlier constructed X, Y and Z equivalent to your x1, x2 and y.
Then I constructed A, B and C and tried to understand whether assuming
that they were equivalent to y, x2 and x1 respectively produced
similar results (they did, even though the causation was reversed).
A and B are no more independent than Y and Z were; however B and C are
pairwise independent (knowledge of B gives no information about C
unless something is known about A), in the same way X and Y were. A
and Z both had the same triangular distribution for the score from
throwing two dice. Similarly the distributions of B, C, X and Y are
all the same (a number from 1 through to 6 with equal probabilities).


The question I am asking myself is what the implication is of constructing
an experiment where the causation is reversed but the underlying
distributions are the same, and then producing the same results.

Frank Joyce

Mar 14, 2000
Apropos of your advice to work on real-world problems, lawyers have already
solved the proof of causation problem. See
http://www.toxictorts.com/relrisk.htm for a formula using linear
proportions. Next problem?

Robert Ehrlich <bobeh...@home.com> wrote in message
news:38BAFCA4...@home.com...


> Bill: Without judging procedure, I can sympathise with your dilemma.
> Over the years some stuff that I have done has been greeted by a
> groundswell of indifference. I changed tactics and used the new tools
> to work on real-world problems that were posed by non statisticians.
> Success in that realm helped make our stuff "reasonable"---although some
> of thsi was our fault in that we had not carefully thought out the
> implications of our work. finaly, I decided that, because we live in a
> capitalistic society, the best way to garner acceptance is to make a lot
> of money using "new" approaches. I am in the midst of that right now.
> So, for now, forget the stats people and the psych. people. Find out
> where the real-world problems are and prove by doing. I am having the
> time of my life right now.
>

[William Chambers quote deleted]

Pete Gieser

Mar 14, 2000
Actually, the text at that site only really addresses what the lawyers
call "general causation", which seems to be a bit mis-named, since they
define it with respect to A being *capable* of causing B, i.e. general
causation really only means that A *could have* caused B. The subject
of "specific causation" (did A *actually cause* B) is left totally
unaddressed.


--
Peter Gieser, PhD
Biostatistics Core, Cancer Control
H. Lee Moffitt Cancer Center & Research Institute
at the University of South Florida
12902 Magnolia Dr., MRC-CANCONT 208
Tampa, FL 33612


"Frank Joyce" <fjo...@ids2.idsonline.com> wrote in message
news:8am8qb$81c$1...@nnrp-corp.news.cais.net...


> Apropos of your advice to work on real-world problems, lawyers have already
> solved the proof of causation problem. See
> http://www.toxictorts.com/relrisk.htm for a formula using linear
> proportions. Next problem?
>
> Robert Ehrlich <bobeh...@home.com> wrote in message
> news:38BAFCA4...@home.com...
>
> > Bill: Without judging procedure, I can sympathise with your dilemma.
> > Over the years some stuff that I have done has been greeted by a
> > groundswell of indifference. I changed tactics and used the new tools
> > to work on real-world problems that were posed by non statisticians.
> > Success in that realm helped make our stuff "reasonable"---although some
> > of this was our fault in that we had not carefully thought out the
> > implications of our work. Finally, I decided that, because we live in a
> > capitalistic society, the best way to garner acceptance is to make a lot
> > of money using "new" approaches. I am in the midst of that right now.
> > So, for now, forget the stats people and the psych. people. Find out
> > where the real-world problems are and prove by doing. I am having the
> > time of my life right now.
>
> [William Chambers quote deleted]

William Chambers

Mar 15, 2000
Frank,

What do you think about the polarization of the correlations between
independent variables across the ranges of the dependent variable
(corresponding regressions/correlations)?

Bill Chambers


Frank Joyce wrote in message <8am8qb$81c$1...@nnrp-corp.news.cais.net>...


>Apropos of your advice to work on real-world problems, lawyers have already
>solved the proof of causation problem. See
>http://www.toxictorts.com/relrisk.htm for a formula using linear
>proportions. Next problem?
>
>Robert Ehrlich <bobeh...@home.com> wrote in message
>news:38BAFCA4...@home.com...

>> Bill: Without judging procedure, I can sympathise with your dilemma.
>> Over the years some stuff that I have done has been greeted by a
>> groundswell of indifference. I changed tactics and used the new tools
>> to work on real-world problems that were posed by non statisticians.
>> Success in that realm helped make our stuff "reasonable"---although some
>> of thsi was our fault in that we had not carefully thought out the
>> implications of our work. finaly, I decided that, because we live in a
>> capitalistic society, the best way to garner acceptance is to make a lot
>> of money using "new" approaches. I am in the midst of that right now.
>> So, for now, forget the stats people and the psych. people. Find out
>> where the real-world problems are and prove by doing. I am having the
>> time of my life right now.
>>

>[William Chambers quote deleted]
>
>

Henry

Mar 15, 2000
On Tue, 14 Mar 2000 15:49:44 -0500, "Frank Joyce"
<fjo...@ids2.idsonline.com> wrote:

>Apropos of your advice to work on real-world problems, lawyers have already
>solved the proof of causation problem. See
>http://www.toxictorts.com/relrisk.htm for a formula using linear
>proportions. Next problem?

The formulae on this page seem capable of producing negative
probabilities, e.g. if 0<RR<1 then (RR-1)/RR<0.

I am also slightly concerned about the leap in logic: it seems
incapable of dealing with the issue, for example, of vaccinations, which
reduce serious disease overall but for a few unfortunate individuals
have been shown to be the direct cause of disease in that individual.


William Chambers

Mar 16, 2000
>In article William Chambers <will...@roman.net> you wrote:

>> What do you think about the polarization of the correlations between
>> independent variables across the ranges of the dependent variable
>> (corresponding regressions/correlations?
>

>Unfortunately I believe I have completely lost track of the demonstration
>that you were working on with this initially. I am really sorry.
>
>Would it be possible to have the smallest and clearest description of
>how I might take a random number generator and produce some of these
>examples and then what I might look at to see this particular feature?
>
>Many thanks
>


Good to hear from you again. The demonstration of the polarization effect
is pretty simple using Excel or other spreadsheets.

1. First generate two columns of uniform random numbers (x1 and x2).
Generate, say, 100 rows. These will be the causes.

2. Create a third column (y) by adding x1 and x2. (It also works with
subtraction, multiplication and division but stick with addition for now). Y
is the effect.

3. Sort the data by column y, the effect.

4. Cut the upper 25% of the sorted data and paste it to a spreadsheet called
extremes of y.

5. Cut the lower 25% of the whole data set and paste/concatenate these 25 cases
below the 25 from step 4. You now have the data corresponding to the
extremes of y, both upper and lower.

6. Find the correlation between x1 and x2 for the data at the extremes of
y and set it aside.

7. Go back to the whole data set sorted by y. Cut and paste the middle 50
cases to a spreadsheet called data at the midrange of y.

8. Find the correlation between x1 and x2 for the data at the midrange of y.

9. Now compare the correlations between x1 and x2 at the extremes of y to
those found at the midrange of y.

x1 and x2 should be positively correlated in the extremes of y. On the other
hand, x1 and x2 will be negatively correlated in the midrange of y. These
opposite correlations between the independent variables across the ranges of
the dependent variable are the polarization effect. We can exploit this
polarization property to discover unknown causes. This simulation just
demonstrates the polarization when we know the causes and effects.

Assumptions:

1. The causes are uncorrelated in the whole data set.
2. The causes are uniformly distributed at the time of their causal
generation.
3. The model is linear.

There are more calculations to consider, but they are just refinements of the
above.
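
If you do not have a spreadsheet handy, a rough sketch of steps 1 through 9
in Python with numpy (the seed, the sample size and the variable names here
are just my illustrative choices) would be:

import numpy as np

rng = np.random.default_rng(42)   # arbitrary seed
n = 100

# Steps 1-2: two uniform, independent causes and their additive effect.
x1 = rng.uniform(size=n)
x2 = rng.uniform(size=n)
y = x1 + x2

# Step 3: sort everything by the effect.
order = np.argsort(y)
x1, x2, y = x1[order], x2[order], y[order]

# Steps 4-6: concatenate the lower and upper quartiles of y (the "extremes")
# and correlate x1 with x2 there.
q = n // 4
extremes = np.r_[np.arange(q), np.arange(n - q, n)]
r_extremes = np.corrcoef(x1[extremes], x2[extremes])[0, 1]

# Steps 7-8: the middle 50% of y (the "midrange") and its x1-x2 correlation.
midrange = np.arange(q, n - q)
r_midrange = np.corrcoef(x1[midrange], x2[midrange])[0, 1]

# Step 9: compare.  A positive correlation at the extremes and a negative
# one in the midrange is the polarization being described.
print("extremes:", r_extremes, " midrange:", r_midrange)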

Let us know what you discover!

Bill


Anon.

Mar 16, 2000

Pete Gieser wrote:

> Actually, the text at that site only really addresses what the lawyers
> call "general causation", which seems to be a bit mis-named, since they
> define it with respect to A being *capable* of causing B, i.e. general
> causation really only means that A *could have* caused B. The subject
> of "specific causation" (did A *actually cause* B) is left totally
> unaddressed.
>

It also defines possible causation _a priori_ (i.e. A causes B), but the
maths would be the same if in reality causation went the other way round.
In other words, if it was the propensity to get the disease that changed
the probability of exposure.

I reckon that there's a fair chance that someone has a tale of a bizarre
case where this happened. Would anyone like to update my prior?

Bob

--
Bob O'Hara
Metapopulation Research Group
Division of Population Biology
Department of Ecology and Systematics
PO Box 17 (Arkadiankatu 7)
FIN-00014 University of Helsinki
Finland

tel: +358 9 191 7382 fax: +358 9 191 7301
email: bob....@helsinki.fi
To induce catatonia, visit:
http://www.helsinki.fi/science/metapop/

And a Happy New 1900 to you all!

William Chambers

Mar 16, 2000
Would someone please try the following simulation with normally distributed
x1 and x2? I just tried it with Excel and Gauss, and to my surprise the
polarization occurred with both programs. Corresponding regressions does not
work with normally distributed x1 and x2, and I had assumed that
corresponding correlations would not either. But when I conducted the
following corresponding correlations analysis (described in the previous
post and attached below), it worked with NORMAL distributions for x1 and x2!
I checked the distributions of x1 and x2 and they were both normal. Does
anybody else get polarization with normal causes? Remember, we are looking
for opposite correlations between x1 and x2 in the extremes of y versus the
midrange of y.
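
A minimal sketch of the same check with normal causes, in Python with numpy
rather than Excel or Gauss (the seed and sample size are arbitrary choices
of mine), would be:

import numpy as np

rng = np.random.default_rng(7)    # arbitrary seed
n = 1000
x1 = rng.normal(size=n)           # normally distributed causes
x2 = rng.normal(size=n)
y = x1 + x2

order = np.argsort(y)
x1, x2 = x1[order], x2[order]
q = n // 4
extremes = np.r_[np.arange(q), np.arange(n - q, n)]
midrange = np.arange(q, n - q)
print("extremes:", np.corrcoef(x1[extremes], x2[extremes])[0, 1],
      " midrange:", np.corrcoef(x1[midrange], x2[midrange])[0, 1])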

Regards,

Bill Chambers

William Chambers wrote in message <2E7A4.1078$vb....@newsfeed.slurp.net>...

Daniel J. Nordlund

Mar 17, 2000
Bill Chambers wrote:

>Would someone please try the following simulation with normally distributed
>x1 and x2? I just tried it with Excel and Gauss and to my surprise the
>polarization occured with both programs. Corresponding regressions does not
>work with normally distributed x1 and x2 and I have assumed that
>corresponding correlations will not either, But when I conducted the
>following corresponding correlations analysis (described in the previous
>post and attached below) it worked with NORMAL distributions for x1 and x2!
>I checked the distributions of x1 and x2 and they were both normal. Does
>anybody else get polarization with normal causes? Remember we are looking
>for opposite correlations between x1 and x2 in the extremes of y versus the
>midrange of y.
>
>Regards,
>
>Bill Chambers
>
>

For what it is worth, I programmed the simulation in SAS, and yes, normally
distributed variables produce the same kind of effect as uniformly distributed
variables.

Dan Nordlund

Dan Bonnick

Mar 17, 2000
Hi Bob

There is quite a bit of recent-ish research into how parasites change the
behaviour of their hosts to improve their own chances of propagation.

I thought that a classical example was syphilis - which is reported to
increase the sex drive of the sufferer, so that the disease can spread...

regards,
Dan


Anon. <bob....@helsinki.fi> wrote in article
<38D1081C...@helsinki.fi>...
>
<snipped Pete's part>

>
> It also defines possible causation _a priori_ (i.e. A causes B), but the
> maths would be the same if in reality causation went the other way round.
> In other words, if it was the propensity to get the disease that changed
> the probability of exposure.
>
> I reckon that there's a fair chance that someone has a tale of a bizarre
> case where this happened. Would anyone like to update my prior?
>
> Bob
>

<snipped address>

William Chambers

Mar 17, 2000
Dan said:

>
>For what it is worth, I programmed the simulation in SAS, and yes, normally
>distributed variables produce the same kind of effect as uniformly
distributed
>variables.
>
>Dan Nordlund
>


Bill responds:

Thanks for checking up on this, Dan. What it is worth remains to be seen, as
far as the sorry state of statistics goes, but it could mean a lot for
science in principle. I have only recently worked out the method of
corresponding correlations, which directly exploits the polarization effect.
That the polarization occurs with normally distributed causes (though I
suspect to an attenuated degree) means that corresponding correlations
should be useful even when the causes are normally distributed. This makes
corresponding correlations a practical alternative to corresponding
regressions. It makes it much easier to infer causation in applied contexts.

We are still left with the restriction that the causes must be uncorrelated.
I think that the use of orthogonal factor scores via factor analysis is the
solution to this problem. Another, of course, is the use of the residual
estimate of latent x, as I do with corresponding regressions, only now using
the residual with corresponding correlations.

Dan, I have been wondering something that perhaps you can tell me. I was
trained as a psychological researcher. Research psychologists often have an
insatiable interest in new phenomena. I am puzzled about statisticians,
however. Are they selected as students for their lack of curiosity? Why do
they not find the polarization effect of inherent interest? They seem to be
more like policemen whose mission in life is to protect the current
assumptions and methods from change, rather than scientists looking for
reasons and pushing the horizon. What is it with statisticians? Are they
just intellectual cowards, or stupid? Why doesn't one of them just take
five minutes to disprove me if I am so wrong? I have been talking about
this stuff now since 1986. It seems like at least one would disprove me if
they could, if for no other reason than love of battle. I recall that as
children we could play baseball and win or lose without quitting when the
catcher chattered ("swing"). The chatter was part of the game, but it was not
vicious. No one quit over it. We just loved the game of baseball enough to
put up with most anything but cheating. Were any statisticians baseball
players as kids?

Sincerely,

Bill

Henry

Mar 17, 2000
On Thu, 16 Mar 2000 18:04:25 -0600, "William Chambers"
<will...@roman.net> wrote:
>Would someone please try the following simulation with normally distributed
>x1 and x2? I just tried it with Excel and Gauss and to my surprise the
>polarization occured with both programs. Corresponding regressions does not
>work with normally distributed x1 and x2 and I have assumed that
>corresponding correlations will not either, But when I conducted the
>following corresponding correlations analysis (described in the previous
>post and attached below) it worked with NORMAL distributions for x1 and x2!
>I checked the distributions of x1 and x2 and they were both normal. Does
>anybody else get polarization with normal causes? Remember we are looking
>for opposite correlations between x1 and x2 in the extremes of y versus the
>midrange of y.

Indeed the effect seems to be there, though whether it has anything to
do with "causes" is another matter.

Another result: trying to simulate Cauchy distributions produced a
(smaller though still clear) negative correlation in the mid range,
but a virtually zero correlation for the extremes.

This starts to suggest an explanation for the effect. For most well
behaved distributions, the positive correlation at the extremes is a
direct result of combining the top quartile with the bottom quartile
(taking each on its own seems to produce a negative correlation): if
the distributions are centred around zero, then the top quartile will
usually be predominantly made up of two positive numbers and the
bottom quartile of two negative numbers, so the correlation will
usually be positive when the two sets of results are mixed. On the
other hand, the interquartile range will predominantly be made up of a
positive and a negative number, or the reverse, or of two numbers
both close to zero, so the correlation will be negative.

I recognise that this is hand waving rather than proof, but I suspect
it is true.
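
A quick way to see this numerically (a sketch only; the seed and sample
size are arbitrary choices of mine) is to correlate the quartiles
separately and then pooled:

import numpy as np

rng = np.random.default_rng(3)    # arbitrary seed
n = 4000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = x1 + x2

order = np.argsort(y)
x1, x2 = x1[order], x2[order]
q = n // 4
bottom, top = slice(0, q), slice(n - q, n)

# Each extreme quartile on its own tends to give a negative correlation...
print("top quartile alone:   ", np.corrcoef(x1[top], x2[top])[0, 1])
print("bottom quartile alone:", np.corrcoef(x1[bottom], x2[bottom])[0, 1])

# ...but pooling them puts one cluster of points in the (high, high) corner
# and one in the (low, low) corner, which drives the pooled correlation positive.
pooled1 = np.r_[x1[bottom], x1[top]]
pooled2 = np.r_[x2[bottom], x2[top]]
print("both quartiles pooled:", np.corrcoef(pooled1, pooled2)[0, 1])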

The two questions in my mind are:
(a) does this have anything to do with demonstrating causality? (my
invented dice games in earlier posts leave me unconvinced)
(b) can we learn anything from this about the relationship between
three sets of data that is not also suggested by classical linear
regression? (I suspect not).


William Chambers

Mar 17, 2000
Henry,

Your dice examples supported my model. The one that "seemed" not to was the
model in which the causes were correlated. I was disappointed that you did
not respond to my comments on this, and now you ignore them altogether, as
though your second model did not violate my assumption of uncorrelated
causes.

When you say this does not tell us anything regression does not tell us, you
are not explaining your reasoning. You are casting doubt by vague assertion.
This is not very scholarly. You should back up your suspicions with sound
reasons. And just putting on the tough guy conservative image is really too
easy to respect. We all know I am probably just crazy. But it's the logic
that keeps me on your screens, not my reputation. It's the logic of
polarization that you should want to destroy, but that requires good logic.
Posturing will not get rid of me, much less the logic of corresponding
correlations. Dig deeper.

Let's assume for the moment that I never mentioned causation. Let's just
talk about logical dependencies between mathematical variables... say, of the
sort that we would expect if we built in artifactual dependencies by
including some of the same items in two questionnaires. Would the
polarization effect be of use in detecting such artificial dependencies?

Bill

Frank Joyce

Mar 17, 2000
Bill,

The more I poke into what is 'reality', the more I wonder whether there is a
generally valid definition of causality. For example, surely the Copenhagen
Interpretation of QM, with the observer collapsing the wave function, can't be
what we mean. Or even in thermodynamics: by controlling the macro parameters
P and V I can 'cause' T, but T is just an average, a measurement, of the
physical reality which, if I'm lucky (and mostly I am), goes as planned
so that I observe the expected T value. But what did I 'cause' in a rigorous
sense? Certainly not the actual molecular activity.

Seems to me that all I'm left with finally is correlations anyway, so if the
technique is robust enough for the application, I should think it would
prove causation as well as anything does.

Frank


William Chambers <will...@roman.net> wrote in message
news:EIPz4.104$vb....@newsfeed.slurp.net...
> Frank,


>
> What do you think about the polarization of the correlations between
> independent variables across the ranges of the dependent variable
> (corresponding regressions/correlations?
>

> Bill Chambers
>
>
> Frank Joyce wrote in message <8am8qb$81c$1...@nnrp-corp.news.cais.net>...
>
> >Apropos of your advice to work on real-world problems, lawyers have already
> >solved the proof of causation problem. See
> >http://www.toxictorts.com/relrisk.htm for a formula using linear
> >proportions. Next problem?
> >
> >Robert Ehrlich <bobeh...@home.com> wrote in message
> >news:38BAFCA4...@home.com...
>
> >> Bill: Without judging procedure, I can sympathise with your dilemma.
> >> Over the years some stuff that I have done has been greeted by a
> >> groundswell of indifference. I changed tactics and used the new tools
> >> to work on real-world problems that were posed by non statisticians.
> >> Success in that realm helped make our stuff "reasonable"---although some
> >> of this was our fault in that we had not carefully thought out the
> >> implications of our work. Finally, I decided that, because we live in a
> >> capitalistic society, the best way to garner acceptance is to make a lot
> >> of money using "new" approaches. I am in the midst of that right now.
> >> So, for now, forget the stats people and the psych. people. Find out
> >> where the real-world problems are and prove by doing. I am having the
> >> time of my life right now.
> >>
>
> >[William Chambers quote deleted]

William Chambers

Mar 18, 2000
Hi Frank,

I do not understand physics well enough to comment on the Copenhagen
Interpretation, but I think we can go further than mere correlation in
causal inference. Correlations are a problem because they are symmetrical:
the correlation of A with B is the same as that of B with A. If we assume
that causation is an asymmetrical relationship, in which the independent
variable(s) determines in some way the values of the dependent variable,
while the dependent does not determine the independent, then we have an
asymmetry. Correlations, regression analysis, trend analysis and so forth
do not reflect this asymmetry. Corresponding correlations/regressions does.
This is why CC really is an advance over traditional correlational analysis.


Bill

Frank Joyce wrote in message <8aummb$18o2$1...@nnrp-corp.news.cais.net>...

Henry

Mar 18, 2000
On Fri, 17 Mar 2000 19:07:39 -0600, "William Chambers"
<will...@roman.net> wrote:
>Your dice examples supported my model, The one that "seemed" not to was the
>model in which the causes were correlated, I was disappointed that you did
>not respond to my comments on this and now you ignore them altogether, as
>though your second model did not violate my assumption of uncorrelated
>causes.

They did not support your model; they challenged it. Your claim, as I
understand it, is that if you have three sets of data (say L, M and N)
where L and M are identically distributed (originally uniform,
though you are now extending it to Gaussian distributions) and L and M
are independent of each other, and you then perform your sorting by N
and look at the correlations between L and M in the interquartile
range of N and in the other half of the data, then a negative
correlation in the interquartile range and a positive correlation in the
pooled extremes indicates that L and M are "causes" of N.

I gave you an example where the preconditions held, the correlations
were as you expected, but where L was in fact caused by M and N. You
ignored this fundamental issue.

>When you say this does not tell us anything regression does not tell us, you
>are not explaining your reasoning, You are casting doubt by vauge assertion,
>This is not very scholarly, You should back up your suspicions with sound
>reasons, And just putting on the tough guy conservative image is really too
>easy to respect, We all know I am probably just crazy, But its the logic
>that keeps me on your screens, not my reputation, Its the logic of
>polarization that you should want to destroy but that requires good logic.
>Posturing will not get rid of me, much less the logic of corresponding
>correlations. Dig deeper.

I have dug deeper. I have tried various data sets which produce your
polarisation effect, and in each case linear regression has also
suggested statistically significant relationships between N and L and
between N and M. I am not a tough guy or a conservative, but I am
often a skeptic. I had not been aware of your polarisation effect
before, and I accept it happens. My post yesterday tried to produce
an explanation. My question is still whether it tells us anything new
about the relationship between random variables - I don't know the
answer to this, but I am dubious about its relationship to causation.

>Let's assume for the moment that I never mentioned causation, Let's just
>talk about logical dependencies between mathematical variables,,, say of the
>sort that we would expect if we built in artifactual dependencies by
>including some of the same items in two questionnaires, Would the

>polarization effect be of use in detecting such artificial dependencies?

If you had not mentioned causation, I would have been much happier, as
I suspect would most of the other critics on this newsgroup. I would
much prefer to talk about statistical dependencies and relationships.
As far as I can tell "artifactual" and "artificial" should mean the
same thing as each other, i.e. human made rather than natural, but
statistical relationships based on underlying distributions and
relationships need not depend on the method of creating the results.

Louis M. Pecora

Mar 19, 2000
In article <s8SA4.1186$L5.2...@newsfeed.slurp.net>, William Chambers
<will...@roman.net> wrote:

> Hi Frank,
>
> I do not understand physics enough to comment on the Copenhagen
> Interpretation, But I think we can go further than mere correlation in
> causal inference, Correlations are a problem because they are symmetrical,
> The correlation of A with B is the same as that of B with A, If we assume
> that causation is an asymmetrical relationship, in which the independent
> variable(s) determines in some way the values of the dependent variable,
> while the dependent does not determine the independent, then we have an
> asymmetry, Correlations, regression analysis, trend analysis and so forth
> do no reflect this asymmetry, Corresponding correlations/regressions does,
> This is why CC really is an advance over traditional correlational analysis.

Pardon me for jumping in mid-stream.

What you probably want is functionality. That is, the strict
mathematical definition of a function: a relation between points in one
space (domain) and another (range) such that for each point in the
domain there is one and only one point in the range. A way to think
about this is that a point in the domain -determines- the value in the
range.

Now, you might ask what this has to do with statistics? You can
formulate functionality statistics, analogous to how you use
correlations. After all, correlations are tests for linear
functionality. Linear functions have inverses (in most cases), hence
the symmetry of the "causal" relationship. But generally, functions are
nonlinear, maybe even non-differentiable, and inverses generally do
-not- exist. Hence, you can establish asymmetric relations.
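
As a toy illustration of that asymmetry (this is not the continuity
statistic from the papers cited below, just a crude nearest-neighbour
check with names and parameters of my own choosing):

import numpy as np

rng = np.random.default_rng(5)    # arbitrary seed
n = 2000
x = rng.uniform(-1, 1, n)
y = x ** 2                        # y IS a function of x; x is NOT a function of y

def mean_local_spread(u, v, k=10):
    # Crude check of "v is a function of u": sort by u, then measure how much
    # v varies over small neighbourhoods in u.  Small spread suggests a
    # functional relation in that direction; large spread suggests none.
    order = np.argsort(u)
    v_sorted = v[order]
    spreads = [v_sorted[i:i + k].std() for i in range(0, n - k, k)]
    return float(np.mean(spreads))

print("spread of y over neighbourhoods in x:", mean_local_spread(x, y))  # small
print("spread of x over neighbourhoods in y:", mean_local_spread(y, x))  # large (two branches)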

We have done some work on defining such function statistics, mostly with
the intention of use in the realm of nonlinear dynamics analysis of
time series, but they can be used in general situations when you want to
search for functional relations between multivariate sets of data
points. There is also work by a group at Cardiff University in Wales,
UK, under Antonia Jones. They have developed a "Gamma" statistic that is
very similar in mentality to the work we have done, but maybe more from
a statistics angle. There may be others in the statistics community
working on such approaches, but I am not aware of them. I recently
found out there is work that is related, but not the same, in attempts
to estimate functions given two multivariate data sets. However, such
estimates are only very indirect functionality tests.

Anyway, I am not saying we solved the problem, but only took some
steps. There are many ways to go about testing for relationships. You
can check out the following references and do look up Jones' papers
(sorry, I don't have those refs. handy):

[1] Louis M. Pecora, Thomas L. Carroll, and James F. Heagy, "Statistics
for Continuity and Differentiability: An Application to Attractor
Reconstruction from Time Series," in Nonlinear Dynamics and Time
Series: Building a Bridge Between the Natural and Statistical Sciences,
Fields Institute Communications, edited by C.D. Cutler and D.T. Kaplan
(American Mathematical Society, Providence, Rhode Island, 1996), Vol.
11, pp. 49-62.

[2] L. Pecora, T. Carroll, and J. Heagy, "Statistics for Mathematical
Properties of Maps between Time-Series Embeddings," Physical Review E
52 (4), 3420-39 (1995).

William Chambers

unread,
Mar 19, 2000, 3:00:00 AM3/19/00
to
Louis,

Thanks for the comments and references. I do not have access to a good
library and would appreciate clarification from you. You are correct to
frame the issue as one of functions. But as I point out in my most recent
paper, Bunge explained that even functions are apparently symmetrical, at
least when linear and measured by traditional means. So unless we can
establish some method by which asymmetries may be demonstrated, we are stuck
with the same theorization that has turned psychology and other social
sciences into a circus of speculations, in which fame and connections decide
what gets considered. In psychology, at least, this amounts to people who
are famous for being famous just getting more famous for nonsense or common
sense.

Your suggestion that the lack of an inverse reveals an asymmetry is
interesting but not obvious. If I give you two columns of numbers, A and B,
each expressed as ordinal ranks, can you tell me whether one was derived from
the other? Let's say that A=B+C, but you are not given C, only A and B,
without information about their determination. The distributions are uniform
ranks, so you cannot guess from the distributions. Can your method tell us
that A is determined by B? Corresponding correlations can. I do not see how
an inverse or absence of one can do this. Please explain: how does your
method reveal asymmetries in linear relations?

I am curious also, has anyone referenced my 1986 and my two 1991 papers or
the hundreds of newslist posts I made in 1997-98 to semnet on corresponding
regressions and corresponding correlations? Why not? Last time I checked,
my papers come up on various search engines found in libraries. Just do a
search for causation, cause or formal cause, covering the years since 1986.

How can I get into the equation and get recognition for my years of work? Or
at least enough respect to be tested by intelligent persons with integrity?
I have mailed my publications to various research establishments for review,
including the Navy, Army, CIA and many others, also including a number of
folks at the L. L. Thurstone Lab at UNC and Joreskog and Sorbom in Sweden. None of
these experts ever acknowledged getting my letters or email. Journals are
not much better, though there are some exceptions (Perceptual and Motor
Skills, Journal of Mind and Behavior, Structural Equations Modeling). I
sent a paper to Psychological Bulletin in the late 1980s. They kept it
nearly a year and finally sent a half-page rejection indicating simply that I
am trying "to iron into gold." No other explanation was given. That paper
included many, many simulations demonstrating my point. The paper was later
published by the Journal of Mind and Behavior and can be downloaded at
http://www.wynja.com/chambers/regression.html . I think if you read it, you
will agree that it deserves more comment than I got. Psychometrika refused
to review the paper at all. I have since signed onto the newslist of the
society of mathematical psychology, looking for people to review the ideas.
The owner of the list told me that if I were a member of their society,
they might have been willing to let my little blurb go out on their
newslist, asking for comments. The ideas themselves apparently mean
nothing unless they are attached to a recognized "player." All of this
attests to the Sorry State of Statistics.

What do you think about corresponding regressions and corresponding
correlations? What do you think about the polarization effect that a number
of people over the past 14 years have finally admitted exists?

May I have copies of your papers?

Thanks,

Bill

Louis M. Pecora wrote in message
<190320001053361001%pec...@anvil.nrl.navy.mil>...


>In article <s8SA4.1186$L5.2...@newsfeed.slurp.net>, William Chambers

>

William Chambers

unread,
Mar 19, 2000, 3:00:00 AM3/19/00
to
Henry,

You said:
>They did not support your model; they challenged it. Your claim, as I
>understand it, is that if you have three sets of data (say L, M and N)
>where L and M are identically distributed (originally uniform,
>though you are now extending it to Gaussian distributions) and L and M
>are independent of each other, and you then perform your sorting by N
>and look at the correlations between L and M in the interquartile
>range of N and in the other half of the data, then a positive
>correlation in the former and a negative correlation in the latter
>indicates that L and M are "causes" of N.
>
>I gave you an example where the preconditions held, the correlations
>were as you expected, but where L was in fact caused by M and N. You
>ignored this fundamental issue.
>

One of the dangers of making comments on other people's works without
bothering to read them is that you risk making some pretty uninformed
statements. I have offered everyone on this list copies of my publications.
You have apparently not downloaded my 1991 paper, nor read the 1986 paper, nor
read the two 2000 papers. In them all I make it very clear that the causes I
am measuring must be uncorrelated. In your first dice model, they were
uncorrelated and you found the same polarization that I found. In your
second dice experiment, you used correlated causes. I have repeatedly argued
on SEMNET and in recent papers (one in press and the other revised and under
review) that when we add correlated causes, we abstract their common origin.
This is the foundation of classical true score theory. Thus, in your second
dice model, by adding together two correlated causes, you do not create a
third effect but abstract their common causal variance. Their common
abstraction is a cause of the second variable, which you represent as a pure
cause but that is really an effect of the first cause.

I explained this to you after you posted the results of your dice
experiments, and you said you were not interested in my explanation. You
apparently are not interested now and wish to make it look as though I am
ignoring your simulation. In fact, I did your simulation, or one that does
the same thing, years ago and have talked about it many times. Get your
facts straight before suggesting that I am trying to cheat. Obviously I am
begging people to try the stuff themselves; just follow the rules when you
do!

Now tell us, what do you think about the distinction I draw between
constructions and abstractions in causal modeling? I posted an article to
this newslist on this recently, before you tried your simulations. And also,
why do you pretend that I have not made it explicit for years that the
causes need to be uncorrelated for corresponding regressions to work?

Henry said:
>
>I have dug deeper. I have tried various data sets which produce your
>polarisation effect, and in each case linear regression has also
>suggested statistically significant relationships between N and L and
>between N and M.

Bill responds:

I am not saying that such relationships do not exist. In fact, we analyzed
them in detail on semnet two years ago. Since x1 and x2 each determine 50%
of y, it is obvious that they are correlated with y.

Henry continued:

> I am not a tough guy or a conservative, but I am
>often a skeptic. I had not been aware of your polarisation effect
>before, and I accept it happens. My post yesterday tried to produce
>an explanation. My question is still whether it tells us anything new
>about the relationship between random variables - I don't know the
>answer to this, but I am dubious about its relationship to causation.
>

Henry, I have been trying to disprove corresponding regressions/correlations
for many years now. I could have quit in 1985 when my colleagues questioned
my sanity for even trying to infer causation. But I am a well trained
psychologist and I do not jump to premature diagnoses. I am a better
trained researcher and I do not dismiss ideas unless I see reason to do so,
especially when the idea makes at least some sort of sense. I have begged
every expert I can find to disprove me and none have been able or willing to
do so. I do not know how I can be more skeptical and keep my intellectual
integrity. Part of being a real intellectual is not running from unpopular
puzzles and facts. I have not run. You are not sure if you will run or
not. But I do admire you for at least admitting that the polarization
exists.


Henry said:

>If you had not mentioned causation, I would have been much happier, as
>I suspect would most of the other critics on this newsgroup. I would
>much prefer to talk about statistical dependencies and relationships.
>As far as I can tell "artifactual" and "artificial" should mean the
>same thing as each other, i.e. human made rather than natural, but
>statistical relationships based on underlying distributions and
>relationships need not depend on the method of creating the results.
>

Bill responded:

How would you model a linear causal relationship? I use the equation
y=x1+x2 because it is so very simple: y is linearly dependent on x1 and x2.
This even captures the meanings of the terms "independent variable" (cause)
and "dependent variable" (effect). When we use math simulations, we simply
make explicit that the DV is logically derived from the IVs.

If such a dependency were created by a sloppy scientist, as in the
questionnaires having the same items, we would call the correlation between
the total scores of the measures an artifact. That correlation would
reflect a type of causation that is man-made and artificial. What makes it
bad is that the poor scientist may not realize that it is not nature that
creates the relationship between the questionnaires but his own bad
mathematics.

BUT, the dependency is still there. The total scores (effects) of both
questionnaires are determined in part by the same items (causes). The total
scores are dependent upon the items that make them up. The total scores do
not cause one another but are merely correlated with one another; one total
score does not cause the other. You seem to be comfortable with this logic
so long as we stick to mere numbers.
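
A minimal simulation of this point in Python with numpy (the item counts
are arbitrary): two questionnaire totals that share five items correlate
even though neither total causes the other.

import numpy as np

rng = np.random.default_rng(0)
n = 1000
shared = rng.normal(size=(n, 5))     # five items that appear on BOTH questionnaires
unique_a = rng.normal(size=(n, 5))   # items unique to questionnaire A
unique_b = rng.normal(size=(n, 5))   # items unique to questionnaire B

total_a = shared.sum(axis=1) + unique_a.sum(axis=1)
total_b = shared.sum(axis=1) + unique_b.sum(axis=1)

# The two totals correlate (about 0.5 here) although neither causes the other;
# the shared items are a common determinant of both.
print(round(np.corrcoef(total_a, total_b)[0, 1], 3))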

But what if nature, and not the idiot scientist, creates the dependencies? Why
not call this causation? The skeptic would rightly say that we should be
careful because we may not be measuring what we think we are. This is true.
This is why we cannot PROVE causation via science. As soon as we get away
from the numbers as mere numbers and assume they reflect some attributes in
phenomena, we make an inferential leap. This is an act of informed
faith, but it absolutely prevents us from claiming any proof of phenomenal
causation. To do so is to commit the ancient logical fallacy of affirming
the consequent. But science is not about proof. Mathematics is about proof.
Science is about making reasonable statements of faith. It is about
inference.

With corresponding correlations/regressions we do not prove causation, we
infer it. I carefully selected this word for my publications in order to
underscore that I am not proving anything, at least not scientifically.
However, if we want to stick with mathematics as mere numbers, I think a
degree of proof is possible for corresponding correlations/regressions.


I am not sure what your final statement means... the part concerning creating
methods. It sounds like you are saying CC is just an artifact. Perhaps you
would care to explain this further. Having written a paper on this
criticism, I feel ready to respond.

Bill

Henry

unread,
Mar 19, 2000, 3:00:00 AM3/19/00
to
On Sun, 19 Mar 2000 14:21:53 -0600, "William Chambers"
<will...@roman.net> wrote:

>
>Henry said:
>>They did not support your model; they challenged it. Your claim, as I
>>understand it, is that if you have three sets of data (say L, M and N)
>>where L and M are identically distributed (originally uniform,
>>though you are now extending it to Gaussian distributions) and L and M
>>are independent of each other, and you then perform your sorting by N
>>and look at the correlations between L and M in the interquartile
>>range of N and in the other half of the data, then a positive
>>correlation in the former and a negative correlation in the latter
>>indicates that L and M are "causes" of N.
>>
>>I gave you an example where the preconditions held, the correlations
>>were as you expected, but where L was in fact caused by M and N. You
>>ignored this fundamental issue.
>>

William said:
>One of the dangers of making comments on other people's works without
>bothering to read them is that you risk making some pretty uninformed
>statements. I have offered everyone on this list copies of my publications.
>You have apparently not downloaded my 1991 paper, nor read the 1986 paper, nor
>read the two 2000 papers. In them all I make it very clear that the causes I
>am measuring must be uncorrelated. In your first dice model, they were
>uncorrelated and you found the same polarization that I found. In your
>second dice experiment, you used correlated causes. I have repeatedly argued
>on SEMNET and in recent papers (one in press and the other revised and under
>review) that when we add correlated causes, we abstract their common origin.
>This is the foundation of classical true score theory. Thus, in your second
>dice model, by adding together two correlated causes, you do not create a
>third effect but abstract their common causal variance. Their common
>abstraction is a cause of the second variable, which you represent as a pure
>cause but that is really an effect of the first cause.

This is my final post on this subject. I now understand why others
have given up on you.

You accuse me of not reading your work. You either have not read my
comments or have consistently failed to understand what I am saying.

Your claim is that from your method you can deduce/infer causality in
a particular direction. I have stated a model which produces your
effect without having causality in that direction. With two
independent (in the probabilistic sense of P(L|M)=P(L) and
P(M|L)=P(M)) random variables I used your test of whether they
"caused" a third (which was equal to their sum N=L+M because the
underlying model is L=N-M). Your test produced a positive result,
though in fact L and M did not "cause" N. Your complaint was that the
real "causes" (N and M) were correlated. Given the nature of your test
this was inevitable in any potential counter-example which has
identical distributions to yours (it has to replicate the correlation
between y and x2 in your y=x1+x2 relationship) but has a different
causal relationship. I told you the relationship. But my point was
that given data which gives a positive result in your test, and where
the random variables being tested as possible "causes" are
independent of each other, you cannot be confident that the positive
result implies causality in the implied direction.

I have no feeling that you accept any of this. The fact that in some
actual causal relationships your method produces a positive result is
insufficient. It also has to avoid producing a positive result in
cases where the data is similar but where the actual causal
relationship is not that being tested. I think your method fails to
do this.

I can hardly say this has been a pleasure. I have learnt something
new (the polarisation effect), but the rest has largely been a waste
of my time. I recognise you may feel the same.


Louis M. Pecora

unread,
Mar 20, 2000, 3:00:00 AM3/20/00
to
>Louis,
>
>Thanks for the comments and references. I do not have access to a good
>library and would appreciate clarification from you. You are correct
>to frame the issue as one of functions. But as I point out in my most
>recent paper, Bunge explained that even functions are apparently
>symmetrical, at least when linear and measured by traditional means.
>So unless we can establish some method by which asymmetries may be
>demonstrated, we are stuck with the same theorization that has turned
>psychology and other social sciences into a circus of speculations, in
>which fame and connections decide what gets considered. In psychology,
>at least, this amounts to people who are famous for being famous just
>getting more famous for nonsense or common sense.


Asymmetries can be demonstrated (if they exist in the data) by testing
for functionality from one data set to another (e.g. X->Y), then
testing the inverse (e.g. Y->X).

Sorry, I don't know Bunge's work. I am not a statistician (I just play
one at conferences sometimes :-) ). I am a physicist.


>Your suggestion that the lack of an inverse reveals an asymmetry is
>interesting but not obvious. If I give you two columns of numbers, A
>and B, each expressed as ordinal ranks, can you tell me whether one was
>derived from the other? Let's say that A=B+C, but you are not given C,
>only A and B, without information about their determination. The
>distributions are uniform ranks, so you cannot guess from the
>distributions. Can your method tell us that A is determined by B?
>Corresponding correlations can. I do not see how an inverse or absence
>of one can do this. Please explain: how does your method reveal
>asymmetries in linear relations?


I guess I don't understand the nature of the problem. I'm not sure how
one can test for functionality when the domain points are not fully
given (B,C). Correlations are linear functions so I would suspect any
function test that subsumes them would also handle this case, but I
can't see how even correlations handle the case where C is unknown.
Maybe just my ignorance of the field.


>I am curious also, has anyone referenced my 1986 and my two 1991
>papers or the hundreds of newslist posts I made in 1997-98 to semnet on
>corresponding regressions and corresponding correlations? Why not?
>Last time I checked, my papers come up on various search engines found
>in libraries. Just do a search for causation, cause or formal cause,
>covering the years since 1986.


Sorry, I am not familiar with your papers, but then I am a novice in
this field, at best.


>What do you think about corresponding regressions and corresponding
>correlations? What do you think about the polarization effect that a
>number of people over the past 14 years have finally admitted exists?


I'm not familiar with polarization or what it means. Sorry.


>May I have copies of your papers?


Sure. Just send me your address.

Jerry Dallal

unread,
Mar 20, 2000, 3:00:00 AM3/20/00
to
William Chambers wrote:
>
> One of the dangers of making comments on other people's works without
> bothering to read them is that you risk making some pretty uninformed
> statements. I have offered everyone on this list copies of my publications.
> You have apparently not downloaded my 1991 paper nor read the 1986 paper nor
> read the two 2000 papers. In them all I make it very clear that the causes I
> am measuring must be uncorrelated.

Bill,

I've read your paper. My sense is that if one puts enough constraints
on the system (such as uncorrelated causes) and restricts the set
of possible models to one causal model and a few competitors, you can
pick the "causal" model out of the set. At the moment, my take is that
the restrictions and constraints are too strict for the method to be
useful in practice. This is reflected in the applications portion of
the '91 paper, where you suggest the method has identified some causal
mechanisms but gives inexplicable results in others.

--Jerry

William Chambers

unread,
Mar 20, 2000, 3:00:00 AM3/20/00
to
Henry,

You remind me of many other people who make half-baked attacks on
corresponding regressions/correlations and, when they fail to win by logic,
attack me personally as they run away. But it has been a pleasure for me
to converse with you. I derive a certain satisfaction from tracking down
logically inconsistent people who make bold claims for which they are not
willing to take responsibility. You are ever so typical of what is wrong in
the Sorry State of Statistics. My impression is that you are still very
young, however, and hopefully have learned something from this. More likely,
however, you are just addicted to intimidating people by using mystifying
words and misleading, pseudological demonstrations. My bet is that although
you have some of the depth of a "calculator mind," you lack the width
of the true mathematician and especially the scope of the good
statistician's intellect. You are rushing to prove your fragmented insights
rather than systematically trying to understand your opponent's logic. This
may get you through schools where pathological teachers are just happy to
have students who can pass exams, but such mere cleverness will do you a
disservice in the long run. This exchange is an example. If you were well
educated and had not been spoiled by your unrefined intellect, you would
take great delight in running circles around any poor logic that I might be
hiding behind. You would explain yourself over and over, in different ways,
until you added up from many directions. It would be like listening to Bach.
Instead, you have chosen a cheap and impulsive conclusion, only to gain a
brief and unsupportable consistency that cannot stand up to sincere public
debate. You find me and this exchange unpleasant because you entered the
game looking to be the winner instead of a fellow learner. This is a sign
of a poor education in a young mind and a sign of cruelty in an older mind.
Either way, it is typical of the Sorry State of Statistics.

I do not know how more clearly or loudly I can put it: the causes must be
uncorrelated. When they are not uncorrelated, we have a measurement
confound and must either remeasure or use factor analysis to abstract a more
crystallized interpretation of the measures. If the causes are correlated,
adding the causes will tend not to construct a new variable but will,
instead, abstract an existing latent variable. This is not causation.
Adding correlated variables merely recovers the latent structure that is
already there. This is the essence of the classical theory of true scores.
Here is the model:

y1=x1+x2, y2=x1+x3, y3=x1+x4

When we add total=y1+y2+y3, the differences in the y variables cancel one
another and their common covariance accumulates. The total scores will be
progressively more correlated with x1 as k (the number of items y1, y2, ..., y(k))
increases and as the correlation between the y variables increases. Thus
adding correlated variables recovers the latent common cause. If we do not
understand this, we end up thinking that the total scores are an effect
rather than a proxy for a common cause. This is what you appear to have done
in your second dice simulation.
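
A small sketch of that claim in Python with numpy, assuming unit-variance
items: as the number of items sharing x1 grows, their sum correlates ever
more strongly with x1.

import numpy as np

rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)                        # the common (latent) cause

for k in (3, 10, 30):
    # item i is y_i = x1 + e_i: every item shares x1 and adds its own unique part
    items = x1[:, None] + rng.normal(size=(n, k))
    total = items.sum(axis=1)
    print(k, round(np.corrcoef(total, x1)[0, 1], 3))
# corr(total, x1) is about sqrt(k/(k+1)): roughly 0.87, 0.95, 0.98 -- the sum recovers x1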

Adding correlated y variables obscures the sum of the independent variances
of y1, y2, ..., y(k). If you wish to find the causal construction from the y
variables independent of their common covariance (x1), then partial out x1
(the total/true scores) from the y variables and then add the y variables. But
if you do not first partial out x1, then the construction of independent y
variances is obscured by the greater variance of the latent trait x1. First
partial out the covariance, and then the combinations of the uncorrelated
aspects of the y items will produce a new construction. The polarization
effect will then work, even with your second simulation.

What you apparently did in your second dice example was create a second
variable that was correlated with the first. This violated the assumptions
of the model and abstracted a common variance that already existed. If I am
wrong in this, I apologize, but your unwillingness to explain further what
you did keeps me in the dark. You have admitted (I believe) the correlation
between your causes, however, and this is enough to bring the game to a
halt. You have chosen the way of mystification for the end, unfortunately,
and it is this that is causing you pain. I think this mystification is not
a mark against my intelligence but that it is indicative of your sophistical
style of reasoning. More than anything, the sophist hates to make things
explicit... to spell out the details for all to see. Your failure to
understand that you violated the assumption of uncorrelated causes may give
you the impression of being a clever boy who defeated the crazy psychologist,
but you are really only doing what they call satisficing (sp?).

If you figure out later that I really am wrong, I and others would
appreciate your telling us exactly why I am wrong. Until then, try to avoid
arguing impressionistically. You may meet a master of ambiguities and end up
with egg on your face, even if your opponent is merely an unemployed
professor.

Bill

William Chambers

unread,
Mar 20, 2000, 3:00:00 AM3/20/00
to
Hi Jerry,

Part of the problem in the 1991 paper with uninterpretable results is simply
that I analyzed data about which I knew nothing. I did this on purpose, so as
to help avoid any temptation on my part to read in explanations. I thus let
the data speak for themselves. It is true, however, that the CC analysis
may demand more investigation than is the custom in the social/behavioral
sciences. When measures are correlated, we lack clarity. It would be like
trying to discover the periodic table of elements from a bucket of dirt,
weighting the most common dirt particles as the most important. This would
and does confuse interpretation. The answer is more refined measurement,
much better than the kind of crude buckets that pass for measures in the
social sciences.

The ideal of measurement is reflected in the experimenter's use of factorial
ANOVA designs. All combinations of the factors are considered. Here there is
no tendency to confound the factors, since we have articulated all
combinations. This removes ambiguities and facilitates scientific progress.
Such clarity is not often sought in applied nonexperimental research, but it
should be. Factor analysis (with orthogonal rotation), in fact, is an
attempt to crystallize orthogonal measures and thus refine interpretations.
It may be that the sort of applied nonexperimental research that would
benefit most from corresponding correlations would be best applied to
orthogonal factor scores. I argue as much in my paper that is in press at
Structural Equations Modeling.

In my most recent paper I develop an algebraic method of unraveling the
confounds that exist in nested variables. You are welcome to a copy if you
wish. It reveals even more clearly what a mess we typically get from our
measures in the social sciences, but I think the method offers a means of
clarifying the relationships.

As to the practicality of focusing on orthogonal causal variables... I guess
it depends on the purpose of science. Very little progress comes from the
current state of the structural equation art because the results are not
cumulative. They are mystifying and seem to be very clever, but ultimately
causal inference is impossible from methods like LISREL, whether such
methods are restricted to uncorrelated or correlated putative causes. We
might follow the experimentalists' taste for clearer measurement and end up
much better off.

For example, the chemist first isolates his variables into orthogonal forms
and then systematically studies interactions between the elements in an
"artificial" environment. From this controlled and purified purview, general
principles of chemistry were developed which were then extended to very
complex arrangements. In the social sciences, we might do well to do a
similar purified analysis before trying to explain the universe from a
"bucket of dirt." Careful control and measurement will add up in the long
run and we will have something to crow about... at least that is what
history suggests. In the meantime, I share your frustration concerning
orthogonal causes; I just do not see any way around the extra work.

Best regards,

Bill

Jerry Dallal wrote in message <38D63CB1...@hnrc.tufts.edu>...


>William Chambers wrote:
>>
>> One of the dangers of making comments on other people's works without
>> bothering to read them is that you risk making some pretty uninformed
>> statements. I have offered everyone on this list copies of my publications.
>> You have apparently not downloaded my 1991 paper nor read the 1986 paper
>> nor read the two 2000 papers. In them all I make it very clear that the
>> causes I am measuring must be uncorrelated.
>

William Chambers

unread,
Mar 21, 2000, 3:00:00 AM3/21/00
to
Hi Louis,

snip

Lou responded:

>Sorry, I don't know Bunge's work. I am not a statistician (I just play
>one at conferences sometimes :-) ). I am a physicist.
>

Bill responds:
Bunge wrote a book on the nature of causation that is considered a classic.
I read it last year and it really is good, if for nothing else than
clarifying some definitions. I refer to it merely as a road mark; it is
not crucial to our discussion.

Bill wrote:
>
>Your suggestion that the lack of an inverse reveals an asymmetry is
>interesting but not obvious. If I give you two columns of numbers, A
>and B, each expressed as ordinal ranks, can you tell me whether one was
>derived from the other? Let's say that A=B+C, but you are not given C,
>only A and B, without information about their determination. The
>distributions are uniform ranks, so you cannot guess from the
>distributions. Can your method tell us that A is determined by B?
>Corresponding correlations can. I do not see how an inverse or absence
>of one can do this. Please explain: how does your method reveal
>asymmetries in linear relations?
>

Lou responded:


>
>I guess I don't understand the nature of the problem. I'm not sure how
>one can test for functionality when the domain points are not fully
>given (B,C). Correlations are linear functions so I would suspect any
>function test that subsumes them would also handle this case, but I
>can't see how even correlations handle the case where C is unknown.
>Maybe just my ignorance of the field.
>


If A=B+C but we have only measured A and B, we can recover C using a
residual from regression analysis. The part of A that is not B is C: regress
A on B, and the residual will be an estimate of C. Of course we can derive the
residual estimate in several ways, predicting C from A or A from C.
Corresponding correlations takes advantage of this and tells us which to
select for the causal model.
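
A minimal sketch of that residual recovery in Python with numpy, assuming
B and C are uncorrelated:

import numpy as np

rng = np.random.default_rng(0)
n = 2000
B = rng.uniform(size=n)
C = rng.uniform(size=n)              # not observed
A = B + C                            # A is determined by B and C

# Regress A on B; the residual is the part of A that is not B, i.e. C up to a constant
slope, intercept = np.polyfit(B, A, 1)
residual = A - (slope * B + intercept)

print(round(np.corrcoef(residual, C)[0, 1], 3))   # close to 1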

snip

Lou responded

>Sorry, I am not familiar with your papers, but then I am a novice in
>this field, at best.
>


Bill responds: There is room for all! I look forward to reading your work.

>
>>What do you think about corresponding regressions and corresponding
>>correlations? What do you think about the polarization effect that a
>>number of people over the past 14 years have finally admitted exists?
>
>

>I'm not familiar with polarization or what it means. Sorry.
>


The polarization is the root issue of these threads. The "sorry state"
business refers to my disgust at the lack of intellectual integrity in the
statistics old-boy system.

I use polarization to infer causation (linear dependencies). When we
simulate y=x1+x2 (y is caused by x1 and x2), we can sort the data by y and
then clip off the upper and lower quartiles, concatenating them into data
called the extremes of y. The remaining inner quartiles of y are the
midrange of y. The correlation between x1 and x2 will be positive in the
extremes of y but negative in the midrange of y. This polarization of the
correlations between the causes (x1 and x2) across the ranges of the effect
(y) allows us to infer causation or linear dependence.
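
A minimal simulation of this procedure in Python with numpy, using uniform
x1 and x2:

import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.uniform(size=n)
x2 = rng.uniform(size=n)
y = x1 + x2                                   # y is "caused" by x1 and x2

order = np.argsort(y)                         # sort the rows by y
q = n // 4
extremes = np.concatenate([order[:q], order[-q:]])   # lower + upper quartiles of y
midrange = order[q:-q]                               # inner two quartiles of y

print("whole sample:", round(np.corrcoef(x1, x2)[0, 1], 3))                      # near 0
print("extremes of y:", round(np.corrcoef(x1[extremes], x2[extremes])[0, 1], 3)) # positive
print("midrange of y:", round(np.corrcoef(x1[midrange], x2[midrange])[0, 1], 3)) # negative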

Bill asked:


>
>>May I have copies of your papers?


Lou responded:

>Sure. Just send me your address.


Bill responds:
My address shifts a lot lately; the fate of a whistle blower. I will send you
the most recent by private e-mail.

By the way, I guess your point about nonsingular matrices was that they
imply linear dependence. This is true; I just do not see how we can
determine the direction of the linear dependence by simply failing to find
an inverse. The direction is the aspect of asymmetry that tells us which
variables should be on which side of the functional equation, as a model of
causation.

Best,

Bill

r.e.s.

unread,
Apr 30, 2000, 3:00:00 AM4/30/00
to
[I've just read some of your more recent postings in another
thread, but thought it best to reply in this one to preserve the
continuity of the discussion re the spreadsheet example.]

The phenomenon you're calling "polarization" certainly does occur
-- but it doesn't distinguish between what you've termed "causes"
and "effects", if I've understood your usage of these terms.

To demonstrate this, here's an example contrary to yours, also
using, say, 100 rows:

1. Generate two columns, y and z, of (simulated) independent
Normal(0,1) random variates. In your terminology, these will be
"causes".

2. Create two more columns, x1 and x2, defined as x1=(y-z)/2 and
x2=(y+z)/2. In your terminology, x1 & x2 are "effects" of y & z.
NOTE: x1, x2 are each N(0,1/2) and are mutually independent
(hence uncorrelated), and x1+x2=y.

3. Sort the rows of (y,z,x1,x2) in order of increasing value of
y, which is one of the "causes".

4. Make one scatterplot showing all the (x1,x2) data points in
three separately labeled series: Series A from the first 25%
of the rows, Series B from the middle 50% of the rows, and
Series C from the last 25% of the rows.

The scatterplot will resemble the following, if you're reading
this with a fixed-width font:

--------
| BCCC |
| BBBCCC |
x2 | ABBBCC |
| AABBBC |
| AAABBB |
| AAAB |
--------
x1

The larger the number of points in the sample, the more you'll
see that they're tending to form a circular "cloud" centered on
the origin, whose density approximates to that of a symmetric
bivariate normal density. The cloud is partitioned into three
diagonal "bands", A,B,C, by the two lines x1+x2=c1, x1+x2=c2,
where the constants are determined by the 1st & 3rd quartiles of
the y-values. (In the large-sample limit, the constants are
c1= -0.674, c2= +0.674, i.e. the 1st & 3rd quartiles of N(0,1).)

This makes it clear that each of the three separate series will
tend to exhibit a negative corr(x1,x2); furthermore, when the
A- & C-series are combined, the resulting "two-lobed cloud"
tends to show positive corr(x1,x2) because the centers of mass
of the two lobes are sufficiently separated along the line x2=x1
of slope=+1. Hence the so-called "polarization" phenomenon.
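
Steps 1-4 can be run directly; here is a minimal version in Python with
numpy (a larger sample than 100 rows, so the signs are unmistakable):

import numpy as np

rng = np.random.default_rng(0)
n = 10000
y = rng.normal(size=n)               # a "cause"
z = rng.normal(size=n)               # a "cause"
x1 = (y - z) / 2                     # an "effect"
x2 = (y + z) / 2                     # an "effect"; note x1 + x2 = y and corr(x1, x2) = 0

order = np.argsort(y)                # sort by the "cause" y
q = n // 4
ext = np.concatenate([order[:q], order[-q:]])
mid = order[q:-q]

print("extremes of y:", round(np.corrcoef(x1[ext], x2[ext])[0, 1], 3))   # positive
print("midrange of y:", round(np.corrcoef(x1[mid], x2[mid])[0, 1], 3))   # negative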


We now have two examples, each exhibiting polarization when the
values of mutually independent (x1,x2) are sorted with respect
to another variate y:

In the present example, (x1,x2) are "effects" and y is a "cause",
while in your example, (x1,x2) are "causes" and y is an "effect".

Such polarization, therefore, fails to distinguish "causes" from
"effects".

--r.e.s.

"William Chambers" <will...@roman.net> wrote in message

news:yWdA4.239$Q%.504@newsfeed.slurp.net...


| Would someone please try the following simulation with normally
| distributed x1 and x2? [...] when I conducted the following corresponding
| correlations analysis (described in the previous post and attached below)
| it worked with NORMAL distributions for x1 and x2!

[...]

| >other hand, x1 and x2 will be negatively correlated in the midrange of y.
| >These opposite correlations between the independent variables across the
| >ranges of the dependent variable are the polarization effect. We can
| >exploit this polarization property to discover unknown causes. This
| >simulation just demonstrates the polarization when we know the causes and
| >effects.
|
| >Assumptions:
| >
| >1. The causes are uncorrelated in the whole data set.
| >2. The causes are uniformly distributed at the time of their causal
| >   generation.
[This must be relaxed to include your use of Normal variates.]

William Chambers

unread,
May 1, 2000, 3:00:00 AM5/1/00
to
R.E.S. has presented a case in which polarization indicates that two effects
are causes. This is a very serious blow to the validity of polarization and
corresponding correlations. It may, indeed, be the end of the line for
these methods. First, I would like to thank R.E.S. (whoever you are) for
bothering to test me and CC. It is an honor to be taken seriously by a
fellow intellectual. Thank you.

The case is this: generate the models y1=x1-x2 and y2=x1+x2, where x1 and x2
are uniform random numbers. The polarization test will indicate that the
correlations between x1 and x2 polarize across the ranges of both y1 and y2.
This is consistent with what I have been saying for years.

Now here is the point made by R.E.S.: the correlations between y1 and y2 are
zero in general, but THEY polarize across the ranges of x1 and x2! I tried
this and it does happen. The polarization of y1 and y2 across x1 or x2 is
not as strong as the polarization between x1 and x2 across the ranges of
y1 and y2, but it is there. This finding is unexpected and contradicts the
assumption that polarization only occurs between causes across the ranges of
the dependent variable. Y1 and y2 are not causes; they are effects.
Something is wrong, but something is interesting.

To translate the math into words, it appears that sums and differences
between random numbers are positively correlated in the extremes of those
random numbers and negatively correlated in the midrange of these same
random numbers.

I do not understand what this means. But I have some questions to ask.

Does the fact that we calculate y1 and y2 from the same x1 and x2
variables make a difference? Ordinarily, y1 and y2 are modeled by y1=x1+x2
and y2=x1+x3; that is, the effects are not mirror images of one another.
Until a better term comes along, let's say that in the creation of the two
dependent variables y1=x1-x2 and y2=x1+x2, y1 and y2 are images of one
another. They are made from the same variables x1 and x2 but by inverse
operations. Clearly, mirror-image effects create false polarization.

My next question is: does the mirror-image relationship represent a special
case that can be detected and controlled in corresponding correlations?

There are several things to consider here.

First, what are these mirror relationships? I do not know.

Second, can they be detected? The correlation between y1 and y2 is roughly
zero. What other relationship exists between them? R.E.S. points out that
the scatter plot between mirror-image y1 and y2 is a ball with two tails
stretching out. Andi found this as well. What do we know about such plots?
Can they help us detect mirror-image variables? I do not know.

Questions arise while I watch my 15-year dream of revolution, respect and
a job pour into the dirt. Do I cut and run in shame to one of the several
beautiful women in my house? Do I just ignore the evidence and be the idiot
that the boys think I am? Or do I choose honor and integrity, and learn
something from the situation?

I think I want to know more. Lead on R.E.S. I will think about it from this
end as well.

Bill Chambers


r.e.s. wrote in message <8ej84q$k0c$1...@slb0.atl.mindspring.net>...

Tony T. Warnock

unread,
May 2, 2000, 3:00:00 AM5/2/00
to
Y1=x1-x2 and y2=x1+x2 have the same distribution except for location.
Both are triangular with variance 1/6, range 2, and means 0 and 1
respectively. Perhaps this has some bearing?
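
A quick numerical check of these facts in Python with numpy, taking x1 and
x2 as Uniform(0,1) as in the simulation under discussion:

import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(size=1_000_000)
x2 = rng.uniform(size=1_000_000)

for name, v in (("y1 = x1 - x2", x1 - x2), ("y2 = x1 + x2", x1 + x2)):
    # both are triangular: variance 1/12 + 1/12 = 1/6, range 2, means 0 and 1
    print(name, round(v.mean(), 3), round(v.var(), 3), round(v.max() - v.min(), 3))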


Gus Gassmann

unread,
May 2, 2000, 3:00:00 AM5/2/00
to

William Chambers wrote:

> R.E.S. has presented a case in which polarization indicates that two effects
> are causes, This is a very serious blow to the validity of polarization and
> corresponding correlations. It may, indeed, be the end of the line for
> these methods. First, I would like to thank R.E.S. (who ever you are) for
> bothering to test me and CC. It is a honor to be taken seriously by a
> fellow intellectual, Thank you.

(I'm snipping the rest)

Bill,

I admire your integrity for admitting your error. It can't have
been easy writing your post. And I congratulate r.e.s. for
his tenacity and his ability to nail down the example so clearly.

-------------------------------------------------------

gus gassmann (NOSPAMg...@mgmt.dal.ca)

School of Business Administration, Dalhousie University
Halifax, Nova Scotia, Canada , B3H 1Z5
ph. (902) 494-1844
fax (902) 494-1107

http://www.mgmt.dal.ca/sba/profs/hgassmann/

Remove NOSPAM in the reply-to address

r.e.s.

unread,
May 4, 2000, 3:00:00 AM5/4/00
to
About terminology:
In the present simulation context, I understand "causes" to
mean just simulated iid samples from selected distributions,
and I understand "effects" to mean just linear combinations
of those "causes".

To find cases where the mutually uncorrelated "causes" cannot
be distinguished from "effects", the "effects" must also be
mutually uncorrelated, which in a linear model leads directly
to a necessary condition on the relationship between "causes"
and "effects".

(We can call this a "no-correlations" condition, and it's
what you referred to as "mirroring"; however, it's actually
a requirement of orthogonality between every pair of the
linear combinations that define the "effects", as explained
below.)


--
A general case:

Suppose that x1,x2,... are mutually uncorrelated random
variates with respective variances v1,v2,..., and let
y1,y2,... be defined by

[y1,y2,...]' = [c1,c2,...]' + M [x1,x2,...]'

(with ' denoting transpose) where c1,c2,... are constants,
and M is a constant matrix of appropriate dimensions.
In other words, each yi is some linear combination of
x1,x2,...

Under these circumstances, the covariance matrix of the
y1,y2,... is

cov[y1,y2,...] = M diag[v1,v2,...] M'

and therefore a necessary & sufficient condition for
y1,y2,... to also be mutually uncorrelated is that

M diag[v1,v2,...] M' must be diagonal.

This is the "no correlations" condition, and can also
be written as

sum( m(i,k)*m(j,k)*vk, k=1,2,...) = 0 (i =/= j)

where m(r,c) is the (row r, col c)-element of M.
(This is a kind of orthogonality condition on the rows
of matrix M, in which the variances {vi} establish a
special metric in the row-space.)

In the special case where all the mutually uncorrelated
x1,x2,... have equal variances, the "no correlations"
condition becomes

sum( m(i,k)*m(j,k), k=1,2,...) = 0 (i =/= j)

which just says that every pair of rows in the matrix M
must have a zero scalar product, i.e. the rows must be
orthogonal to one another.

(For a given number of x- and y-variates, say m and n
respectively, if we look at the nm-dimensional space of
possible m(i,j)-values (i=1..m, j=1..n), the subset that
corresponds to values that "preserve uncorrelatedness"
has Lebesgue measure 0; consequently, "almost all" of
the possible linear combinations will lead to "effects"
that are mutually correlated, and thus distinguishable
from "causes" -- even though we can construct any number
of exceptional cases by applying the "no correlations"
orthogonality condition.)


--
A couple of examples:

In the 2x2 case

y1 = m11*x1 + m12*x2 + c1
y2 = m21*x1 + m22*x2 + c2,

the "no-correlations" condition is

m11*m21*v1 + m12*m22*v2 = 0

which is illustrated by the example y1=x1-x2, y2=x1+x2,
with v1=v2, and whose M-matrix has rows [1 -1] & [1 1],
such that [1 -1] [1 1]' = 0.

Here's a 3x3 "no-correlations" example, supposing that
the mutually uncorrelated x1,x2,x3 have equal variances:

y1 = 2*x1 + 1*x2 + 1*x3
y2 = 1*x1 + 1*x2 - 3*x3
y3 = 4*x1 - 7*x2 - 1*x3

Note that [2 1 1], [1 1 -3], and [4 -7 -1] are such that
the scalar product of any two of them is 0, in order to
satisfy the "no correlations" orthogonality condition.

In this example, if we make a scatterplot of *any* two
of the {yi}, say yi vs. yj, then we should find the
following:

- The scatterplot will tend to show corr(yi,yj)=0
- The scatterplot will tend to show polarization
by *any* one of the x-variates (x1, x2, or x3).

(The "will tend to show" becomes "will show" in the
large-sample limit, but the degree of polarization
can be expected to vary among the yi,yj pairs.)
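
A numerical check of this 3x3 example in Python with numpy, assuming
unit-variance, mutually uncorrelated x-variates:

import numpy as np

rng = np.random.default_rng(0)
M = np.array([[2.0,  1.0,  1.0],
              [1.0,  1.0, -3.0],
              [4.0, -7.0, -1.0]])

print(M @ M.T)                     # off-diagonal entries are 0: the rows are orthogonal

n = 20000
x = rng.normal(size=(3, n))        # mutually uncorrelated "causes" with equal variances
y = M @ x                          # the "effects"

print(np.round(np.corrcoef(y), 2)) # off-diagonal correlations near 0, as predicted

# and any (yi, yj) pair is polarized by any xk, e.g. (y1, y2) sorted by x1:
order = np.argsort(x[0])
q = n // 4
ext = np.concatenate([order[:q], order[-q:]])
mid = order[q:-q]
print(round(np.corrcoef(y[0, ext], y[1, ext])[0, 1], 3),   # positive in the extremes
      round(np.corrcoef(y[0, mid], y[1, mid])[0, 1], 3))   # negative in the midrange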

--r.e.s.


"William Chambers" <will...@roman.net> wrote ...
[...]

[...]

William Chambers

unread,
May 4, 2000, 3:00:00 AM5/4/00
to
Dear r.e.s,

Thank you for a beautiful explanation of some of the properties of our
mystery variables. I have been tinkering with the material today and wonder
if you have found, as I have, that x1=2((x1+x2)+(x1-x2)). Thus the linear
combination of (x1+x2) and (x1-x2) that is indicated as causes of x1 by the
polarization method does in fact equal x1. I am not feeling well today and
this may be just stating the obvious, but I wonder if this is really so
inconsistent with the idea of corresponding correlations? We find that x1
and x2 are indicated as causes of y1=x1+x2 and that x1 and x2 are indicated
as causes of (x1-x2). This much makes sense as usual. When we then curl the
equation back on itself with (x1+x2)+(x1-x2), we do recover x1 as the effect
of combining y1 and y2. In fact, this actually seems to happen. There seems
to be some kind of self-reflexive process going on here. I am too tired to
make much more out of it right now.
More thoughts follow below:

r.e.s. wrote in message <8era2b$rtl$1...@slb6.atl.mindspring.net>...


>About terminology:
>In the present simulation context, I understand "causes" to
>mean just simulated iid samples from selected distributions,
>and I understand "effects" to mean just linear combinations
>of those "causes".


Isn't x1=2(x1+x2)+(x1-x2) a linear combination? If we simplified this
equation wouldn't we just end up with something similar to x1=x1 and would
not this be the final word?


>
>To find cases where the mutually uncorrelated "causes" cannot
>be distinguished from "effects", the "effects" must also be
>mutually uncorrelated, which in a linear model leads directly
>to a necessary condition on the relationship between "causes"
>and "effects".


I am a bit lost here because I have always thought that effects from the
same cause would be correlated. This is because I thought of dependent
variables as best modeled by y1=x1+x2 and y2=x1+x3. In this model, the
causal direction is constructive; that is, it includes new causes. I think
your self-reflexive model (y1=x1+x2, y2=x1-x2) might be called an
abstractive model, but more on this another time. It is the curling in upon
itself that seems to make the notion of self-reflexivity relevant and
perhaps a special class of functions.

(By the way, I use "x" to symbolize causes and "y" effects because this is
the tradition in regression analysis, Its been the convention in CC/CR
exchanges over the years and it this reason that I reformulated your
observations using this convention, )

>
>(We can call this a "no-correlations" condition, and it's
>what you referred to as "mirroring"; however, it's actually
>a requirement of orthogonality between every pair of the
>linear combinations that define the "effects", as explained
>below.)
>


Can you think of any no-correlation condition between effects in the real
world? It could help clarify the issues.


This orthogonality condition... are these just operations, (x1+x2)+(x1-x2),
that make possible analytic statements which actually presume the prior
existence of the causes (x1 and x2)? Should we distinguish between synthetic
and analytic causes (in the sort of Kantian manner)?


Mighty interesting and admirable work. I am not a mathematician and would
appreciate your patience. Does the orthogonal transformation represent a
domain of operations and causal patterns that is qualitatively different
from the raw causal model? Over the past few years I have called
"construction" the combination of uncorrelated variables, and perhaps I
should add, of different variables (as illustrated in my y1=x1+x2, y2=x1+x3).
The combination of correlated variables in psychometrics is an act of
abstraction... it is how we use total scores (sums of correlated variables)
to abstract their common true score variance. I call such combinations of
correlated variables abstraction because the analysis folds in on itself.
Perhaps you have illustrated that the combination of causally related
variables need not be based on correlation but may also include self-
reflexive operations (y1=x1+x2, y2=x1-x2), i.e. inverse relations between the
same variables with themselves... I begin to ramble and will sign off now.
Later.

Bill

r.e.s.

unread,
May 8, 2000, 3:00:00 AM5/8/00
to
"William Chambers" <will...@roman.net> wrote ...
[...]
| I have been tinkering with the material today and wonder
| if you have found, as I have, that x1=2((x1+x2)+(x1-x2)).
| Thus the linear combination of (x1+x2) and (x1-x2) that
| is indicated as causes of x1 by the polarization method
| does in fact equal x1.
[...]

I think you meant to write
x1 = (y1 + y2)/2 = ((x1+x2)+(x1-x2))/2,

and you're probably noticing what I mentioned in
the "NOTE: ..." of my first reply (we've changed
the notation since then, but that's fine).

Since

y1 = x1 - x2
y2 = x1 + x2,

we have the inverse relationship

x1 = (y1 + y2)/2
x2 = (y2 - y1)/2.

I think it's the existence of such an inverse that
you call "curling the equation back on itself", and
you seem to recognize this in the closing paragraph
of your reply.

In the general case sketched below, the matrix form is

y = M x + c,

which, if M is square and det(M)<>0, can be inverted
to give

x = M^(-1) (y - c).

If the x-variables are not to be distinguishable from
y-variables based solely on correlations, then M must
be invertible; otherwise, by exhausting all pairs of
variables, we could separate the variables into two
groups, each with zero within-group correlations, but
nonzero between-group correlations. Then we would know
that one group is x and the other is y; hence, unless
M is invertible, the functional relationship between
groups would exist only "one way", namely y = M x + c,
making x- and y- identifiable.

(If there are equal numbers of x- and y-variables,
then the no-correlations condition, viz., mutual
orthogonality of the rows of M, itself implies
invertibility.)
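
For the 2x2 example, a brief check of this invertibility point in Python
with numpy:

import numpy as np

rng = np.random.default_rng(0)
M = np.array([[1.0, -1.0],          # y1 = x1 - x2
              [1.0,  1.0]])         # y2 = x1 + x2

x = rng.normal(size=(2, 500))
y = M @ x

x_back = np.linalg.inv(M) @ y       # i.e. x1 = (y1 + y2)/2, x2 = (y2 - y1)/2
print(np.allclose(x_back, x))       # True: the relation "curls back on itself"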

| r.e.s. wrote in message <8era2b$rtl$1...@slb6.atl.mindspring.net>...
| >About terminology:
| >In the present simulation context, I understand "causes" to
| >mean just simulated iid samples from selected distributions,
| >and I understand "effects" to mean just linear combinations
| >of those "causes".
|
|
| Isn't x1=2(x1+x2)+(x1-x2) a linear combination? If we simplified
| this equation wouldn't we just end up with something similar to
| x1=x1 and would not this be the final word?

( x1 = (y1 + y2)/2 = ((x1-x2)+(x1+x2))/2 )

| >To find cases where the mutually uncorrelated "causes" cannot
| >be distinguished from "effects", the "effects" must also be
| >mutually uncorrelated, which in a linear model leads directly
| >to a necessary condition on the relationship between "causes"
| >and "effects".
|
|
| I am a bit lost here because I have always thought that effects
| from the same cause would be correlated. This is because I
| thought of dependent variables as best modeled by y1=x1+x2 and
| y2=x1+x3. In this model, the causal direction is constructive,
| that is, it include new causes,

The examples that I made up were all specially constructed
just to satisfy the "no correlations" condition, which was
developed to counter the idea that "effects from the same
cause[s] would be correlated". Playing "devil's advocate",
I wanted also to demonstrate that just because two variates
showed polarization with respect to a third, that in itself
could not lead to a conclusion concerning which of the three
are "causes" and which are "effects" (using the special
definitions given for these terms).

| I think
| your self reflexive model (y1=x1+x2) y2=(x1-x2) might be called
| an abstractive model, but more on this another time, Its the
| curling in upon itself that seems to make the notion of self
| reflexivity relevant and perhaps a special class of functions,

Since "abstraction" is used below to mean the "combination of
correlated variables", I don't think this fits what has been
done here, although "causes" are being combined: The
no-correlations condition ensures that starting from mutually
uncorrelated x-variates ("causes"), the transformed, or y-,
variates ("effects") are likewise mutually uncorrelated.

| (By the way, I use "x" to symbolize causes and "y" effects
| because this is the tradition in regression analysis, Its been
| the convention in CC/CR exchanges over the years and it this
| reason that I reformulated your observations using this
| convention, )
|
| >(We can call this a "no-correlations" condition, and it's
| >what you referred to as "mirroring"; however, it's actually
| >a requirement of orthogonality between every pair of the
| >linear combinations that define the "effects", as explained
| >below.)
|
| Can you think of any no-correlation condition between effects
| in the real world? It could help clarify the issues.

What I've called the no-correlations condition is not
a condition between y-variates, but is a condition on
the given linear *relationship* y = M x + c; that is,
it's a condition on the matrix M, and is a necessary
& sufficient condition for the elements of y to be
uncorrelated (given your assumption that the elements
of x are uncorrelated).

A concise statement of the conclusion is, I think, the best
that I can offer at this point:

Suppose that variables x1,x2,...,xn are uncorrelated, and
that variables y1,y2,...,yn are derived from them according
to y = M x + c, with the rows of M mutually orthogonal(*).
If we are presented with the x- and y-variables, not knowing
which is which, then it's demonstrably impossible to identify,
based solely on correlations, which variables are the x- and
which are the y-variables. Moreover, we have that for any
(yi,yj,xk), (yi,yj) will be polarized by xk, and that for any
(xi,xj,yk), (xi,xj) will be polarized by yk. So polarization
also cannot serve to distinguish x-elements from y-elements.

(*) The "orthogonality" of the rows of M is wrt the metric
defined by cov(x)=diag(v1,v2,...):
sum( m(i,k)*m(j,k)*vk, k=1,2,...) = 0 (i =/= j).

I don't think so.

| Over the past few
| years I have called construction the combination of
| uncorrelated variables and perhaps I should add different
| variables (as illustrated in my y1=x1+x2, y2=x1+x3,
|
| The combination of correlated variables in psychometrics
| is an act of abstraction,,, it is how we use total scores
| (sums of correlated variables) to abstract their common
| true score variance, I call such combinations of correlated
| variables abstraction because the analysis folds in on
| itself.
|
| Perhaps you have illustrated that the combination of causally
| related variables need not be based on correlation but may

| also include self-reflexive operations(y1=x1+x2, y2=x1-x2),


| i.e inverse relations between the same variables with
| themselves,,, I begin to ramble and will sign off now,
| Later.
|
| Bill

While looking at the simulated polarization phenomenon
described in your original posting, I used scatterplots
to "see" clearly what was happening with your "lower- and
upper-quartile" datasets. This kind of plot is interesting
in its own right, and suggests some questions about what
information such plots themselves might provide in various
contexts, over and above numerical measures such as
correlation coefficients. I hope to discuss this in a
later posting.

--r.e.s.

William Chambers

unread,
May 8, 2000, 3:00:00 AM5/8/00
to
Dear r.e.s,

Thanks again for a refreshingly concise and objective post. I played around
with the model this weekend and found that the polarization is, on average,
less for the y1+y2=x1 relation than for the x1+x2=y1 relation. When I used
uniformly distributed x1 and x2 and large sample sizes (n=500), I found this
to be the case in all of the four thousand replications of the model. If we
look at the model you describe as a case of circular causation, the greater
polarization occurs with the root or original cause. This suggests we may
yet be able to distinguish the x and y variables analytically.

I have been trying to come up with some hypotheses for and against
polarization, assuming your model, It is not easy, since, in a sense we
make causes out of our effects, the x1=y1+y2 model is both a cause and an
effect. ( Would you mind working up some hypotheses, pro and con for CC and
your model?) In the end, I decided that the actual data we have makes the
most sense, Variables that are the primarily causes (x1 and x2) produce
greater polarization than do the "Backlash" variables, y1 and y2.
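
For readers who want to try this themselves, here is a minimal sketch
of the kind of replication described above (Python/numpy, not the code
actually used in the thread; it measures polarization as the product of
the midrange and extreme correlations, the index adopted later in the
thread, and the function and variable names are mine):

import numpy as np

rng = np.random.default_rng(1)

def polarization(z1, z2, z3):
    # product of the midrange and extreme correlations of (z1,z2),
    # after sorting on z3 and splitting into the interquartile
    # "midrange" versus the lower+upper quartile "extremes"
    order = np.argsort(z3)
    n = len(z3)
    mid = order[n // 4 : 3 * n // 4]
    ext = np.concatenate([order[: n // 4], order[3 * n // 4 :]])
    r_mid = np.corrcoef(z1[mid], z2[mid])[0, 1]
    r_ext = np.corrcoef(z1[ext], z2[ext])[0, 1]
    return r_mid * r_ext

n, reps = 500, 1000                # sample size, replications
root, reverse = [], []
for _ in range(reps):
    x1, x2 = rng.uniform(size=n), rng.uniform(size=n)
    y1, y2 = x1 + x2, x1 - x2
    root.append(polarization(x1, x2, y1))     # (x1,x2) sorted on y1
    reverse.append(polarization(y1, y2, x1))  # (y1,y2) sorted on x1

print("mean polarization, (x1,x2) wrt y1:", np.mean(root))
print("mean polarization, (y1,y2) wrt x1:", np.mean(reverse))
# The first mean is typically the more negative (more polarized) one.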

I look forward to your next post,

Bill


Gottfried Helms

unread,
May 9, 2000, 3:00:00 AM5/9/00
to
"r.e.s." schrieb:
(...)

>
> While looking at the simulated polarization phenomenon
> described in your original posting, I used scatterplots
> to "see" clearly what was happening with your "lower- and
> upper-quartile" datasets. This kind of plot is interesting
> in its own right, and suggests some questions about what
> information such plots themselves might provide in various
> contexts, over and above numerical measures such as
> correlation coefficients. I hope to discuss this in a
> later posting.
>
> --r.e.s.

Hi ,

Occasionally I follow the articles in these threads
without contributing, as I don't have the space to go into
them deeply - and otherwise I would not like to.
So I don't want to distract the focus of this productive
discussion. But as you write about scatterplots, I think
it could be helpful to point to the scatterplots I made
some months ago in similar threads.
Just send a request to my email.

Sincerely -

Gottfried Helms.

Henry

unread,
May 11, 2000, 3:00:00 AM5/11/00
to

Try to read one from me without resorting to insults.

Your result about the relative strength of polarisation is correct:
using your basic starting point of x1 and x2 being uniformly
distributed, the polarisation effect is greater than when the
potential "causes" are y1 and y2 (which in this case have triangular
distributions).

However, if you had started as r.e.s. stated in the original post with
x1 and x2 being normal distributions then, although as you correctly
said many weeks ago the polarisation effect is still there, it does
not change in significant degree when y1 and y2 are tested as
potential "causes" (in this case also normal distributions).

Gus Gassmann suggested in another thread that this might have
something to do with kurtosis. So I tried x1 and x2 as double
exponential distributions [p(x)=exp(-|x|)/2], and this too showed the
polarisation effect. However, when trying y1 and y2 as potential
"causes" it seemed to me that the effect was greater. Someone else
might like to check, but if true this does suggest that the strength
of the polarisation effect depends on the distribution, quite probably
related in some way to the kurtosis, but possibly not on the "cause".

(If kurtosis is measured so that a normal distribution is zero, then a
uniform distribution is negative, a triangular distribution less
negative, a double exponential distribution positive, and the sum or
difference of two i.i.d. double exponential distributions less
positive.)
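
These kurtosis figures are easy to check by simulation (a sketch in
Python/numpy, not from the original post; "excess" kurtosis, with the
normal at zero):

import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

def excess_kurtosis(z):
    z = z - z.mean()
    return np.mean(z**4) / np.mean(z**2) ** 2 - 3.0

u    = rng.uniform(-1, 1, n)                          # uniform
tri  = rng.uniform(-1, 1, n) + rng.uniform(-1, 1, n)  # triangular
dexp = rng.laplace(0, 1, n)                           # double exponential
dsum = rng.laplace(0, 1, n) + rng.laplace(0, 1, n)    # sum of two

for name, z in [("uniform", u), ("triangular", tri),
                ("double exponential", dexp), ("sum of two d.e.", dsum)]:
    print(f"{name:20s} {excess_kurtosis(z):+.2f}")
# Roughly -1.2, -0.6, +3.0, +1.5 -- the ordering described above.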

There could be a case that if x1 and x2 have to be i.i.d. with a
non-normal distribution, then it may usually be possible to
distinguish between x1 and y1 by which one's distribution is closer to
normal. But without deciding in advance which distribution is being
used, I am not certain that your methodology will do so.

If x1 and x2 need not be independent (i.e. being set free from the
effects of the Central Limit Theorem), then it is probably possible to
construct distributions which produce the same "causal" indication but
with an opposite direction of construction of most other pairs of
distribution. I can certainly produce an x1 and x2 which are related
triangular distributions where y1 and y2 are uniform and independent
of each other.

Henry

unread,
May 12, 2000, 3:00:00 AM5/12/00
to
On Thu, 11 May 2000 22:57:22 GMT, se...@btinternet.com (Henry) wrote:
>If x1 and x2 need not be independent (i.e. being set free from the
>effects of the Central Limit Theorem), then it is probably possible to
>construct distributions which produce the same "causal" indication but
>with an opposite direction of construction of most other pairs of
>distribution. I can certainly produce an x1 and x2 which are related
>triangular distributions where y1 and y2 are uniform and independent
>of each other.

Just for completeness here are two ways of achieving the same result,
with x1 and x2 together being uniformly distributed within a diamond
and individually having triangular distributions between -1 and 1
(where "random" is a computer generated (0,1) uniform random
function):

do
x1=2*random-1
x2=2*random-1
until absolute(x1)+absolute(x2)<1
y1=x1+x2
y2=x1-x2

or:

x1=random-random
x2=random*2*(1-absolute(x1))+absolute(x1)-1
y1=x1+x2
y2=x1-x2

In both cases y1 is uncorrelated with y2 (as indeed is x1 with x2) and
y1 and y2 are uniformly distributed between -1 and 1.

Calculating the polarisation effect will produce a stronger effect for
y1 and y2 when controlled on x1 (or x2) than for x1 and x2 when
controlled on y1 (or y2).
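
For anyone who wants to run these two constructions directly, here is
one possible translation into Python/numpy (a sketch, not Henry's own
code; rng.uniform() plays the role of "random"):

import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Construction 1: rejection-sample (x1,x2) uniformly on the diamond
# |x1| + |x2| < 1.
x1 = rng.uniform(-1, 1, 4 * n)
x2 = rng.uniform(-1, 1, 4 * n)
keep = np.abs(x1) + np.abs(x2) < 1
x1a, x2a = x1[keep][:n], x2[keep][:n]

# Construction 2: triangular x1, then x2 uniform on (|x1|-1, 1-|x1|).
x1b = rng.uniform(0, 1, n) - rng.uniform(0, 1, n)
x2b = rng.uniform(0, 1, n) * 2 * (1 - np.abs(x1b)) + np.abs(x1b) - 1

for x1_, x2_ in [(x1a, x2a), (x1b, x2b)]:
    y1, y2 = x1_ + x2_, x1_ - x2_
    print("corr(x1,x2) =", round(np.corrcoef(x1_, x2_)[0, 1], 3),
          " corr(y1,y2) =", round(np.corrcoef(y1, y2)[0, 1], 3),
          " y1 range =", (round(y1.min(), 2), round(y1.max(), 2)))
# Both give y1, y2 uncorrelated and (approximately) uniform on (-1, 1).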


r.e.s.

unread,
May 14, 2000, 3:00:00 AM5/14/00
to
"Henry" <se...@btinternet.com> wrote ...
[...]

| Just for completeness here are two ways of achieving the same
| result, with x1 and x2 together being uniformly distributed
| within a diamond and individually having triangular distributions
| between -1 and 1 (where "random" is a computer generated
| (0,1) uniform random function):
|
| do
| x1=2*random-1
| x2=2*random-1
| until absolute(x1)+absolute(x2)<1
| y1=x1+x2
| y2=x1-x2
|
| or:
|
| x1=random-random
| x2=random*2*(1-absolute(x1))+absolute(x1)-1
| y1=x1+x2
| y2=x1-x2
|
| In both cases y1 is uncorrelated with y2 (as indeed is x1 with x2)
| and y1 and y2 are uniformly distributed between -1 and 1.
|
| Calculating the polarisation effect will produce a stronger effect
| for y1 and y2 when controlled on x1 (or x2) than for x1 and x2 when
| controlled on y1 (or y2).

That's a nice example. (As you mentioned, though, it
does forego independence of x1,x2.)

It's probably worth mentioning that this essentially
takes the (y1,y2) distribution that results in William's
example (differing only in location and scale), and
assigns it to (x1,x2). This does work to reverse the
direction of the "polarization asymmetry", because the
transformation (y1=x1+x2, y2=x1-x2) has the effect of
rotating by 45 degrees and re-scaling, now producing
for (y1,y2) essentially the distribution that had been
assigned to (x1,x2) -- thus reversing the polarization
behavior of the x- and y-variables.

(To make this clearer, here's an alternative method:
u1 = rand()
u2 = rand()
x1 = (u1 + u2)/2
x2 = (u1 - u2)/2
y1=x1+x2
y2=x1-x2.
Then (x1,x2) is, I believe, uniform on a diamond,
and (y1,y2) is uniform on the unit square.)

Another way to get the reversal is to leave x1,x2
independently uniform, but allow them to have
different variances. E.g., x1 ~ Uniform(0,2) and
x2 ~ Uniform(0,1) with x1,x2 independent, and still
using the transformation y1=x1+x2, y2=x1-x2, produces
stronger polarization for (y1,y2) conditioned by x1,
than for (x1,x2) conditioned by y1.

--r.e.s.

William Chambers

unread,
May 16, 2000, 3:00:00 AM5/16/00
to
Hello res and Henry:

I just got back in town and read the following posts, and I do not really
follow the findings yet. What do you mean by "conditioned on?" Also, are you
saying that y1 and y2 are uniform? It seems that they would be more
triangular, as is the case with y=x1+x2. When you say the polarisation is
stronger when conditioned on x1, are you saying that it is stronger when
sorted on x1? I found that with sufficient sample size the polarization
between x1 and x2 sorted and partitioned by y1 or y2 was always greater
than the polarization of y1 and y2 sorted and partitioned on either x1 or
x2. Is this what you found?

r.e.s. wrote in message <8flt7a$2cs$1...@slb7.atl.mindspring.net>...

r.e.s.

unread,
May 17, 2000, 3:00:00 AM5/17/00
to
"William Chambers" <will...@roman.net> wrote ...
| Hello res and Henry:
|
| I just got back in town and read the following posts,
| I do not really follow the findings yet.
| What do you mean by "conditioned on?"


I use the following expressions interchangeably:

"variables (z1,z2) sorted into lower- and upper-
quartile subsets with respect to variable z3"

"(z1,z2) sorted wrt z3"
"(z1,z2) conditioned by z3"
etc

(where z may be either an x- or a y-variable).
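
In code terms (an illustrative numpy fragment, not from the original
posts; the data here are arbitrary), "(z1,z2) conditioned by z3" just
means splitting on the sorted order of z3:

import numpy as np

rng = np.random.default_rng(4)
z1, z2 = rng.uniform(size=1000), rng.uniform(size=1000)
z3 = z1 + z2

order = np.argsort(z3)
n = len(z3)
midrange = order[n // 4 : 3 * n // 4]          # middle 50% of z3
extreme = np.concatenate([order[: n // 4], order[3 * n // 4 :]])

r_mid = np.corrcoef(z1[midrange], z2[midrange])[0, 1]
r_ext = np.corrcoef(z1[extreme], z2[extreme])[0, 1]
print(r_mid, r_ext)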


| Also, are you saying that y1 and y2 are uniform?
| It seems that they would be more triangular,
| as is the case with y=x1+x2.


The "uniform on a diamond" distribution of the 2-dim
point (y1,y2), is such that the 1-dim distributions of
y1 and y2 are triangular.


| When you say the polarisation is stronger when
| conditioned on x1, are you saying that it
| is stronger when sorted on x1?

Exactly.

| I found that with sufficient sample size the
| polarization between x1 and x2 sorted and partitioned
| by y1 or y2 was always greater than the polarization
| of y1 and y2 sorted and partitioned on either
| x1 or x2. Is this what you found?


I've yet to find a simulation that disagrees with
your result concerning this asymmetry in the degree
of polarization, given that x1 and x2 are iid uniform.
(This case is the same as (x1,x2) being uniform on
a square whose sides are parallel to the x1-, x2-
axes.)

But the same is not so if (x1,x2) is required merely
to be uniform on some arbitrary region in the plane
-- e.g. (x1,x2) "uniform on a diamond", or e.g.
uniform on a rectangle with sides parallel to the
coordinate axes but of unequal length (which is the
same as saying x1 & x2 are independently uniform with
unequal variances).

--r.e.s.

William Chambers

unread,
May 17, 2000, 3:00:00 AM5/17/00
to
Hello r.e.s.,

I guess I am still confused. The problem is that I do not know the
trigonometry (?) that underlies the rotation operation and the notions of
uniformity on a square and triangle... but I can guess, and I do have some
good books to consult.

snip

>
>
>I use the following expressions interchangeably:
>
>"variables (z1,z2) sorted into lower- and upper-
>quartile subsets with respect to variable z3"
>
>"(z1,z2) sorted wrt z3"
>"(z1,z2) conditioned by z3"
>etc

Ok, I see that you use z as a more general term, recognizing that polarity
can exist from either the x or the y direction. This echoes my use of the
terms rde(y) and rde(x) in my 1991 paper. The conditioning variable is the
one we sort and partition on -- the (y) and (x) in rde, with D = rde(y) - rde(x).

>
>(where z may be either an x- or a y-variable).
>
>
>| Also, are you saying that y1 and y2 are uniform?
>| It seems that they would be more triangular,
>| as is the case with y=x1+x2.
>
>
>The "uniform on a diamond" distribution of the 2-dim
>point (y1,y2), is such that the 1-dim distributions of
>y1 and y2 are triangular.


It seems you are using the word "uniform" in a different way when
describing the diamond versus the distribution of the individual variables
(as in uniform versus normal distribution). What do you mean by "uniform on
a diamond?" Is it a square when x1 and x2 are uncorrelated and diamond when
they are correlated?

>
>
>| When you say the polarisation is stronger when
>| conditioned on x1, are you saying that it
>| is stronger when sorted on x1?
>
>Exactly.


What you say immediately above seems to contradict what you say at the end
of the post. Given the assumptions I give for corresponding
regressions/correlations, we sample the causes as uncorrelated and uniformly
distributed. I think it was Henry who found that polarization also occurs
with normal distributions, but I think this effect is greatly attenuated.
This is why I have recommended sampling x1 and x2 as uniform and
uncorrelated from my first publication in 1986 onward. The problem with
normal or triangular distributions is that they prevent us from obtaining a
sufficient number of extreme x values, paired at random, to create the
correlation between x1 and x2 in the extremes of y. In my most recent
papers this has led me to emphasize the importance of sampling causes
uniformly, not only for corresponding correlations but as a general
principle in structural equation modeling. This is a view also held by
Nunnally and others.

>
>| I found that with sufficient sample size the
>| polarization between x1 and x2 sorted and partitioned
>| by y1 or y2 was always greater than the polarization
>| of y1 and y2 sorted and partitioned on either
>| x1 or x2. Is this what you found?
>
>
>I've yet to find a simulation that disagrees with
>your result concerning this asymmetry in the degree
>of polarization, given that x1 and x2 are iid uniform.
>(This case is the same as (x1,x2) being uniform on
>a square whose sides are parallel to the x1-, x2-
>axes.)


I think what you are saying is that, given the assumptions of uniform
uncorrelated causes, polarization works when we take into account the
magnitude of polarization in the case of circular causes. This puts
polarization out of the deep-trouble category, and we are back to Mad Bill
and his Flying Machine.

>
>But the same is not so if (x1,x2) is required merely
>to be uniform on some arbitrary region in the plane
>-- e.g. (x1,x2) "uniform on a diamond",


In other words, if we violate the original assumption of orthogonality of
the causes...?


>or e.g.
>uniform on a rectangle with sides parallel to the
>coordinate axes but of unequal length (which is the
>same as saying x1 & x2 are independently uniform with
>unequal variances).

Well, this seems to be a fancy way of saying that if we violate the
assumptions of corresponding correlations then we can cook up some data
that does not work. I agree. If the causes are correlated then the square
turns into a diamond, and as I have maintained since 1986, polarization
disappears with correlated causes. This makes perfect sense and is
consistent with the logic of the method. I have also argued that causes
that are not conjugated (i.e. measured in all possible combinations) are
confounded. This is equivalent to the chemist being unable to purify his
material, i.e. the bucket-of-dirt syndrome, in which the chemical of
interest is mixed up with other chemicals in arbitrary proportions. The
factor analysts (Thurstone et al.) were well aware of this problem and used
orthogonal rotation to simple structure as a means of purifying data for
subsequent experimental and correlational analysis. This is the topic of my
paper that is in press at Structural Equation Modeling.


As to the unequal variances... I am not sure I follow you here, since I use
z-score transformations on all the variables before conducting the
polarization test. So you must be talking about unequal variances AT THE
TIME of the causal generation. This is an altogether different issue from
merely what scales we measure by.

Perhaps what you are saying is that if we have causes that differ in their
variances before the combination, then one variable will have more weight
in the equation, as if we were using 3x1+x2=y. It seems that x1 would then
correlate more highly with y than would x2. In my 1991 paper I provide a
table that shows the optimal correlations between the two causes and the
effect to be .70. The tables suggest that some expected degree of
polarization, given different degrees of correlation, should be available.
All of this should be mathematically tractable, and any mathematician with
a master's degree and undergraduate calculus should be able to work out the
details; I cannot, not being so clever. Perhaps you would explain what you
are talking about regarding unequal variance, keeping in mind please that I
am a psychologist and not a mathematician.


I kind of got the impression that perhaps Henry was truncating data with his
"Until" operation:

>| (0,1) uniform random function):
>| >|
>| >| do
>| >| x1=2*random-1
>| >| x2=2*random-1
>| >| until absolute(x1)+absolute(x2)<1
>| >| y1=x1+x2
>| >| y2=x1-x2
>| >|
>| >| or:
>| >|
>| >| x1=random-random
>| >| x2=random*2*(1-absolute(x1))+absolute(x1)-1
>| >| y1=x1+x2
>| >| y2=x1-x2
>| >|

I have not had time to simulate these operations. What is the point of the
less-than-1 condition? It seems to remove a lot of data, and may be a way
of attenuating the polarization of x1,x2 by y. It is difficult to know if
the actual values of the polarizations are not given. In any case, I do not
see why the greater polarization you observe should be obtained. Perhaps
you would kindly explain?

At present I am wondering why the polarization of the root causes (x1,x2)
is greater than that of the secondary causes (y1,y2). I wonder if it has
something to do with the fact that the secondary causes are more triangular
than the root causes. Deviating from the uniform lowers polarization
because of truncation effects at the time of causal generation, rather than
superficially as a distribution problem. We know that effects are more
triangular than their causes, by definition. We can sample the causes
uniformly and expect to get samples in which all possible combinations of
x1 and x2 are likely. But such may not be the case when sampling the
effects (y1,y2) uniformly. If we just collect an equal number of cases per
level of y, this will not remove the tendency for fewer instances of
correlated pairs of x1 and x2 in the extremes of y. If we sample an effect
uniformly, we will just have duplicates that do not pair (correlate) x1
and x2 in the extremes of y. So even if we pull uniform samples of y1 and
y2, they will still not polarize x1 and x2 more than x1 and x2 polarize y1
and y2. This makes the comparison of polarizations potentially practical,
since we can choose the greater polarization as pointing to the root causes.

You may note as well that from my first paper in 1986 I have always compared
the polarizations from both the x and y sides. The D statistic that is the
core of the 1991 paper is based on comparing the magnitudes of polarities
from both sides of the causal equation. In the paper I sent you recently
there is a section on unconfounding and correlated dependent variables. An
algebraic analysis is introduced, based on dual polarizations. When we have
correlated dependent variables, the polarizations cancel one another. This
cancellation process was built into the D statistic in the 1991 paper, and
I am pretty sure it was part of the 1986 corresponding variances paper as
well.

Your observations with regard to circular causation are very interesting but
it appears to me that you have opened up a new direction for corresponding
regressions/correlations rather than shutting down the method. But only
time, intelligence and sincere hard work will tell.

Best,

Bill

James S. Adelman

unread,
May 17, 2000, 3:00:00 AM5/17/00
to
William Chambers said:

> Hello r.e.s.,

[snip]



> It seems you are using the word "uniform" in a different way when
> describing the diamond versus the distribution of the individual variables
> (as in uniform versus normal distribution). What do you mean by "uniform on
> a diamond?" Is it a square when x1 and x2 are uncorrelated and diamond when
> they are correlated?

lol! You really don't have a clue, do you? It means that all the
possible values form a diamond, and within that diamond the pdf is
uniform.
--
James S. Adelman
Harrow, Middlesex

William Chambers

unread,
May 17, 2000, 3:00:00 AM5/17/00
to
James,

Good to see you are listening to the conversation. I still have not heard
from you regarding your education and such, so I do not know if I should
respond to you as merely a cheeky uneducated student or as a miseducated
professional. Anyway, perhaps you will explain the diamond to the rest of
us and, while you are at it, comment on the many other things that have
been said about polarization. I have an idea of what the diamond is, but I
prefer to let people explain things to me. It usually brings out the hubris
in intellectual lightweights and I learn about their personalities. I learned
this trick as a child watching old black men play nigger in an evil racist
society, Those old men made fools of the "superior white folk." Some folks
call it the Socratic Method, some call it Playing the Fool. I mostly just
call it fun.

Explaining things helps us all know what we are discussing. It also helps
deflate the arrogant and mystifying jargon of half-educated mathematicians.
Tell me in particular whether the diamond violates the assumption that the
causes are uncorrelated and conjugated (combined in all possible ways).

Your Pal,

Dr. Bill


James S. Adelman wrote in message
<39231A0F.MD-1...@ukonline.co.uk>...


>William Chambers said:
>
>> Hello r.e.s.,
>
>[snip]
>

>> It seems you are using the word "uniform" in a different way when
>> describing the diamond versus the distribution of the individual
>> variables (as in uniform versus normal distribution). What do you mean
>> by "uniform on a diamond?" Is it a square when x1 and x2 are
>> uncorrelated and diamond when they are correlated?
>

William Chambers

unread,
May 22, 2000, 3:00:00 AM5/22/00
to

James said:
>
>It's a distribution. If you want to generate it, you can take uniform
>x1 and x2, and calculate y1 and y2 like you say, and it has a diamond
>distribution. y1 and y2 are uncorrelated, but not independent.


Bill responded:

James, what do you mean by uncorrelated but not independent? The way I
conceptualize things is that the causes are uncorrelated and independent,
since they are generated by random functions and represent unrelated causal
variables. True they can be combined to form a mutual effect, but they are
independent of one another. You are saying that y1 and y2 are uncorrelated
but dependent on one another. How can we determine that such dependency
exists? Would not such a test provide a red flag that tells us we are
dealing with a reverse causal model or some nonlinear and perhaps perverse
trick?


>
>If you want to be taken seriously, you should make the effort to learn
>the basics of the subject you wish to talk about. And I really do
>mean the basics: probability and distribution theory, not methods.

Actually, some people are taking me seriously, including several journals
and members of this and other newslists.

>
>> >What we did say was that they were not independent. For example in
>> >the case of |x1+x2|<=1, if x1=0 then x2 could be anywhere between -1
>> >and 1, but if x1=+1/3 or -1/3 then all we know is that x2 is between
>> >-2/3 and +2/3, and if x1=+1 or -1 then we know precisely that x2=0.


x1 and x2 are generated by random numbers in the root causal model; knowing
about x1 tells you nothing about x2. When you say |x1|+|x2|<=1, you are
stating a condition on x1 and x2 rather than stating a causal model for
generating an effect from independent variables. You are building a
dependency into the values of x1 and x2. This is not what I mean by
independent variables (causes). Independent variables are uncorrelated and
independent in any (non)linear sense. This is the sense in which the words
"independent variables" are applied to causes in experimental design. In
regression analysis, without experiments, people stretch their rights by
calling the predictor the independent variable, but this is generally
understood to be a risky, putative label. The whole problem of
multicollinearity arises because of nonindependent independent variables.
To avoid such nonsense labels, we should refer to independent variables as
just that, independent, and make sure that they are independent. If they
are not independent, we need to improve our formulation and explication.
This may require tests of independence other than a mere correlation. Tell
me how to do this test.
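
One possible check along these lines (purely illustrative -- a sketch
in Python/numpy, not a test proposed anywhere in this thread): for an
uncorrelated-but-dependent pair such as the diamond example, the
squares of the variables are noticeably correlated, which flags the
dependence that an ordinary correlation misses.

import numpy as np

rng = np.random.default_rng(5)
n = 200_000

# Independent pair: correlation of the squares is also near zero.
a, b = rng.uniform(-1, 1, n), rng.uniform(-1, 1, n)

# "Diamond" pair (uncorrelated but dependent): y1 = a + b, y2 = a - b.
y1, y2 = a + b, a - b

print("independent pair: corr =", round(np.corrcoef(a, b)[0, 1], 3),
      " corr of squares =", round(np.corrcoef(a**2, b**2)[0, 1], 3))
print("diamond pair:     corr =", round(np.corrcoef(y1, y2)[0, 1], 3),
      " corr of squares =", round(np.corrcoef(y1**2, y2**2)[0, 1], 3))
# The diamond pair shows corr near 0 but a clearly negative
# correlation between the squares (about -3/7 here).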

You guys do have a point that I should make it explicit that the causes are
independent. I had not realized that uncorrelated variables could be
dependent on one another, except in nonlinear ways. My papers have made it
clear that I am talking about linear relations.

Ok... so we have more clarity. The causes are assumed to be uncorrelated and
independent of one another... hence truly independent variables.

>>
>> When you take the absolute values, what has this got to do with the
>> model y=x1+x2? If I tried to just throw in such manipulations without
>> explaining myself in this conversation, I would be attacked as being
>> mystical and full of bull. Therefore, I explain myself. You should as
>> well. And r.e.s. should follow up on his points and my questions.
>
>He is telling you why the two random variables distributed on the
>diamond are not independent -- it is not a manipulation, it is a
>description.


This is not very sincere of you, James. Some manipulation causes the
dependence of the variables. They do not just pop up from nowhere but
follow a sequence of combinations that directly parallel what would be
experimental manipulations. y1 and y2 are reverse causes.

>
>> >Knowledge of x2 puts similar constraints on the magnitude of possible
>> >values of x1, though again not the sign.
>>
>>
>> If x1 and x2 are independent random numbers, how does the value of one
>> constrain the value of the other?
>
>He's just told you at least twice that the variables he is talking
>about are not independent, and not correlated.

Maybe so, but you are only saying what is convenient for criticism. It has
been very clear that I conceptualize x1 and x2 as random numbers in these
simulations. Such numbers are independent and uncorrelated. Now if what you
are saying is that uncorrelated but dependent variables also produce
polarization, then you may have a point, but only a small point. What you
are missing or refusing to acknowledge is that the polarization is less for
the dependent variables y1 and y2 across x1. Furthermore, if there is some
way of showing that the two variables are uncorrelated but dependent, then
we have a red flag to indicate that this is a special case. You have not
invalidated corresponding correlations; r.e.s. and perhaps you (if you change
your negative attitude) may have pointed out ways to improve it.

For a man who is supposed to be such an expert in mathematics, you seem to
be more interested in proving me personally wrong than in testing the nature
and value of corresponding correlations. I am really beginning to question
your integrity as an intellectual.

Is there anything positive that you can say about me and corresponding
correlations?

Bill

William Chambers

unread,
May 22, 2000, 3:00:00 AM5/22/00
to
r.e.s,

I looked at your Excel program. It appears that you are only looking at one
of the poles of the polarization effect. It is necessary that we look at
both. In the paper I sent you I took the product of the correlations from
the extreme and the midrange. In the data that you describe, the
correlations are high in one range but very low... near zero... in the
other.

zy1 and zy2 correlations, sorting and partitioning on zx1 (with variance of
zx1=2):

reverse-cause model: zx1 = y1 + y2
sorted and partitioned on zx1

 mid    ext    polarity
 .77   -.08    -.06
 .75   -.06    -.04
 .76   -.01    -.00
 .74    .09     .06
 .74    .05     .03

Your spreadsheet and your report of the greater polarization in this
condition appear to be based on only the first column of these data. If we
go by the polarity scores, however, we see that the reverse-cause
polarization is actually lower than that of the root cause.

Unequal variances:

root model: y1 = zx1 + zx2
sorted and partitioned on y1

 mid    ext    polarity
 .39   -.69    -.26
 .57   -.64    -.36
 .40   -.74    -.29
 .48   -.71    -.34
 .38   -.73    -.27

Note that I mislabeled these data in the previous post; I tried to catch the
error and edited it, but my server apparently sent the wrong version to the
newslist. Anyway, I recommend that you check these results with your
program. I believe you will find that the polarization (as defined by the
above products of correlations) is higher for the root cause, in both the
equal- and unequal-variance conditions.

Bill

Henry

unread,
May 22, 2000, 3:00:00 AM5/22/00
to
A response on just one point (remainder snipped):

On Sun, 21 May 2000 18:22:52 -0500, "William Chambers"
<will...@roman.net> wrote:
>>>This does not make sense. The fact is that the polarization is less when
>>>y1 and y2 are sorted and partitioned on x1 or x2 than when we look at the
>>>correlations between x1 and x2 sorting and partitioning on either y1 or
>>>y2. r.e.s. acknowledged this when I put it directly to him. Do you?
>>
>>[Henry] I acknowledged it above in the very special case you created.
>
>Ok. This is major progress because it has been assumed historically that
>this asymmetry is impossible to detect in any condition. So maybe all our
>arguing has been for some purpose.

Except of course that we do not agree as to whether this asymmetry
(relative strength of polarisation) occurs in this case solely
because of the patterns created by the uniform and triangular
distributions, and whether it gives any reliable information about
causal direction even in that special case.

I would reply yes and no, while I suspect you would reply no and yes.


William Chambers

unread,
May 23, 2000, 3:00:00 AM5/23/00
to
>>Ok. This is major progress because it has been assumed historically that
>>this asymmetry is impossible to detect in any condition. So maybe all our
>>arguing has been for some purpose.
>
>Except of course that we do not agree as to whether this asymmetry
>(relative strength of polarisation) occurs in this case solely
>because of the patterns created by the uniform and triangular
>distributions, and whether it gives any reliable information about
>causal direction even in that special case.
>
>I would reply yes and no, while I suspect you would reply no and yes.
>

Bill responded:

Henry, you seem to be throwing up a smoke screen that amounts to a
heads-you-win, tails-I-lose proposition. We have been talking about
polarization and reverse causes for a while now, and I have shown that the
polarization is greater for the root cause than for the reverse cause. Had
I not been able to do this, you would have concluded that polarization
cannot be of use in causal inference (heads you win). Now that the coin
turns in my favor, you say it does not matter (tails I lose). This suggests
that you are not arguing in good faith.

I have shown that if we pull a uniform sample of y values, the polarization
effect still occurs, consistent with the causal asymmetry we are modeling.
Furthermore, variables with no causal relationship to one another can be
uniformly and triangularly distributed, making the distribution issue moot.
You consistently ignore my arguments along these lines, thinking that
simply repeating your claims will substantiate them. This is an ancient
logical fallacy, but one I see often on the internet.

The thing about arguing with people like you and James is that you never
admit your mistakes. You always follow up with a smoke screen or denial,
choosing to avoid any conclusions but the ones you started out with. No
matter what simulations and logic reveal, you will always come back with
some dogmatic statement about not being able to infer causation from
correlations. Alex showed this intellectual deficit in the extreme. James
has the same problem, but it is somewhat disguised by his joy in making
personal insults along with his denials. It remains to be seen if r.e.s.
will play the game fairly, but we will get an idea from how he reacts to
his mistake in calculating polarization.

All of this adds up to the Sorry State of Statistics. While claiming
professional status, statisticians on this newslist refuse to police their
own. There are readers on this newslist who see how insincere you and
others are in your arguments; they should be responsible and criticize your
behavior in public. Otherwise, more naive readers, especially young
students, may come to believe that the ways you argue are legitimate. This
lowers the competence, credibility and future integrity of the discipline
of statistics. It would not concern me much if this problem had no direct
consequences for other disciplines; I cannot solve the problem of evil,
except in my own small way. But you are not the only statisticians who
parade your ignorance and insincerity. Many reviewers of journal articles
are just as sorry and irresponsible. There is no telling how many
significant scientific advances have been quashed because of the
incompetence and lack of integrity of the members of gatekeeper
disciplines. Statistics is a gatekeeper discipline because it dictates much
of the shape of psychological inquiry. This is why psychologists have to
learn so much about statistics. For some reason, however, statisticians are
not expected to learn much about psychology. Statisticians have enjoyed
the financial rewards of their discipline on a take-it-or-leave-it basis.
They feel no responsibility to produce the best methods for their clients'
needs, only to dole out existing solutions to a limited number of
mathematical problems. In the end, this is intellectually dishonest. Worse
still, the effects of this arrogance on society are very harmful in a real
and often physical sense. Real people are affected by statistical
conclusions.

You folks are accustomed to talking about what you think you understand and
feel no compulsion to learn anything from others. This is why so many
people hate mathematicians. It is of course not the nature of mathematics
that facilitates the rise of sophistry. Math is just math. But the ability
of even bad mathematicians to mystify nonmathematicians is very
significant, and such people are usually very conceited. They act
personally offended when asked to explain themselves. Whenever people
refuse to explain themselves, it is almost always because they do not
really know what they are talking about or they are hiding something...
usually a lie.

You, James and perhaps r.e.s. have been holding back on me with all this
square and diamond stuff. If I had not simulated the models myself, it is
quite possible that I would have been forced to accept that polarization is
greater for unequal variances and reverse causation (.70 vs -.03 type
patterns). If I had not requested r.e.s.'s calculations, I would not have
noticed that he mistakenly considers (.70 vs -.03) greater polarization than
(.6 vs -.5). The conversation and the whole corresponding regressions
edifice might then have vanished, simply because you guys thought it beneath
you to explain yourselves to a mere psychologist. Such a disappearance
would have been an intellectual crime, with very real physical and human
consequences... if, indeed, I am correct about polarization. The willingness
of you and other members of this newslist to let such things happen tells me
just what a Sorry State Statistics is in.

Is it possible for you guys to straighten up and start discussing the issues
sincerely? Or do we need to revert to discussions of psychopaths and
little boys lusting after the mothers they do not deserve?

Bill

r.e.s.

unread,
May 23, 2000, 3:00:00 AM5/23/00
to

"William Chambers" <will...@roman.net> wrote ...
[...]

| It remains to be seen if r.e.s will play the game fairly


| but we will get an idea from how he reacts to his mistake
| in calculating polarization.

[...]


Yes, I did err in not sticking to your definition of
"polarization", which requires the extreme- & midrange-
conditional correlation coefficients to differ in sign.
(One might regard a large magnitude of the former alone
as also a kind of "polarization", since it typically
involves two spatially isolated "clouds" of points in
the (x1,x2) plane -- the clouds being separated by the
midrange quartile sets, and hence "polarized". But your
postings have made it clear that both this and the other
correlation being of opposite sign, is what you mean by
polarization. Sorry for the confusion.)


| You, James and perhaps r.e.s. have been holding back on me
| with all this square and diamond stuff. If I had not simulated
| the models myself, it is quite possible that I would have been
| forced to accept that polarization is greater for unequal
| variances and reverse causation (.70 vs -.03 type patterns).


(At times my replies have been slow in coming, just because
of other things demanding time.)

Speaking for myself... I've held nothing back, and offered
the "square & diamond stuff" as an aid to understanding -- it
does have that potential. I may not share your philosophy
about "causation", but that hasn't prevented me from seeking
ways to view your findings about polarization as possible
puzzles in applied probability.

Whether polarization is something of great value for
statistical inference -- which you clearly believe it to be
-- is a question I prefer to avoid for now. Instead, it has
seemed more productive to cast it entirely in "mathematical"
terms, so a larger audience might be willing to at least
look and see that there are some things here, imo, that are
rather curious, if not indeed puzzling.


| If I had not requested r.e.s' calculations, I would not have
| noticed that he mistakenly considers (.70 vs -.03) greater
| polarization than (.6 vs -.5).


In fact, it was my initiative to offer the actual Excel file
that I'd worked up. When for some reason you had trouble
accessing it, you asked me for it by email, and I happily
sent it.

(Since learning that I wasn't following your conventions for
defining polarization -- something inadvertent on my part --
I've removed that file from the server.)


| The conversation and the whole
| corresponding regressions edifice might then have vanished,
| simply because you guys thought it beneath you to explain
| yourselves to a mere psychologist.

[...]


That has never been my attitude, nor will it be.

--r.e.s.

William Chambers

unread,
May 24, 2000, 3:00:00 AM5/24/00
to
r.e.s,

You are correct. You have never been anything but objective and honest in
these exchanges and I apologize for suggesting otherwise. I do sometimes
get the impression that the whole world is against me when in fact only part
of it is. You show every indication of being a man of honor and
intelligence and I should show more respect when I encounter people of your
caliber. You did, indeed, produce your calculations immediately upon
request, and you do have a point that the .72 vs -.05 is a kind of
polarization; it just is not the two-poled polarization that I am focusing
on.

I think the deflation of one of the poles that you observe with rotation is
akin to the deflation of factor loadings on one factor that occurs with a
varimax rotation. In that procedure, variables (such as y1 and y2) that load
on two factors (x1 and x2) are redefined so that they load on only one
factor. The correlation between y and one of the factors increases while
the correlation between y and the other factor drops to zero. I suspect
this is what is going on with the rotation implicit in reverse causation.

As to whether polarization implies causation or not... I appreciate that
causation is a larger issue, and I value the fact that you have been willing
to focus on the mathematics without jumping back and forth from math to
philosophy of science in order to create a smoke screen. Perhaps if we
proceed by measured steps, solving one problem at a time, we can at least
bring as much clarity as possible to the later stages of the inquiry.

For now, let's focus on the issue of asymmetry. Allowing for the possibility
of exceptions in the future, does it appear that we can detect asymmetries
in the sequence y=x1+x2, given that x1 and x2 are uncorrelated and uniform?
If so, are we ready to discuss the role of asymmetry in causation?
Bill

r.e.s.

unread,
May 30, 2000, 3:00:00 AM5/30/00
to
"William Chambers" <will...@roman.net> wrote ...
[...]
| I looked at your excel program, It appears that you are
| only looking at one of the poles of the polarization effect.
| It is necessary that we look at both, In the paper I sent

| you I took the product of the correlations taken from the
| extreme and the midrange. In the data that you describe
| the correlations are high in one range but very low....
| near zero... in the other,

<snip>

| I recommend that you check these results with your
| program, I believe you will find the polarization (as
| defined by the above products of correlations) will be
| higher for the root cause, in both equal and unequal
| conditions.

--
Thanks for the correction, and for offering a specific
measure to quantify the degree of polarization.

Using that measure, I do indeed "in effect" confirm
your results -- meaning that the exceptions to what you
say above may be small enough in magnitude to justify
ignoring them. The exceptional cases are detailed below.

(As always, I take "root cause" to mean just some
x-variables that we start with in a simulation, from
which other variables are calculated.)

I've now derived exact analytical formulas for this
polarization behavior, as well as having confirmed it
(and the formulas) by simulation, using your "extreme-
& midrange- correlations product" to measure the degree
of polarization in the specific model at hand.

Since your assumption in the simulations is that x1,x2 are
independent Uniform random variables, with y1 = x1 + x2,
it follows that the case of possibly-unequal x-variances
is automatically treated by considering x1,x2 iid Uniform,
with y1 = A*x1 + B*x2, and arbitrary A,B. More generally:

Model:
x1,x2 independently Uniform(0,1)
y1 = A*x1 + B*x2
y2 = C*x1 + D*x2
(A,B,C,D arbitrary)
==================================

Analytical Results

Here I want to simply state the polarization formulas
that I've derived, in order to prevent them from getting
lost in the detail that follows.

By calculating the correlations directly as integrations
in the (x1,x2)- and (y1,y2)-planes, I find the following
*exact* general solution for corr_midrange * corr_extreme:

For (x1,x2) wrt y1:
corr_midrange * corr_extreme
= -4*m^2 / sqrt((7-4*m^2)*(1+4*m^2)) , m <= 1/2
= -(2^(1/2)*(m^(1/2)+m^(-1/2))-7/2)^2
/ sqrt( (m -2*2^(1/2)*m^( 1/2)+3)
*(m^(-1)-2*2^(1/2)*m^(-1/2)+3)
*(m -2*2^(1/2)*m^( 1/2)+1)
*(m^(-1)-2*2^(1/2)*m^(-1/2)+1) ) , 1/2 < m < 2
= -(4/m^2) / sqrt((7-4/m^2)*(1+4/m^2)) , m >= 2
where m = abs(A/B).

For (y1,y2) wrt x1:
corr_midrange * corr_extreme
= (4+j*k)*(4+7*j*k)
/ sqrt( (4+j^2)*(4+k^2)*(4+7*j^2)*(4+7*k^2) )
where j = C/D, k = A/B.

Note that the solution depends only on the ratios
A/B and C/D. (Also, I suspect that the middle formula
in variable "m" above, might have a simpler form.)
==================================

Definitions, Problem Statement & Sketch of Solution:

We define the conditional correlations, i.e.
corr_midrange and corr_extreme, as integrals that
correspond to what, in a simulation, is computed by
sorting the samples of (z1,z2) wrt z3 and calculating
the sample corr(z1,z2) for the middle 50% of the sorted
rows, and then doing the same thing separately for the
remaining 50% of the sorted rows, respectively.

Thus, suppose that z1,z2,z3 (e.g. x1,x2,y1, or y1,y2,x1)
are jointly distributed random variables:

For "(z1,z2) wrt z3",

corr_midrange
:= correlation of (z1,z2) conditional on z3 being in its
interquartile "midrange" (i.e. z3(0.25) < z3 <= z3(0.75))
:= ( E'(z1*z2)-E'(z1)*E'(z2) )
/ sqrt( (E'(z1^2)-(E'(z1))^2) * (E'(z2^2)-(E'(z2))^2) )
where each E' is a double integral wrt the conditional
density function for (z1,z2), given that (z1,z2) is in the
midrange set.

corr_extreme
:= correlation of (z1,z2) conditional on z3 *not* being
in its midrange (i.e. z3 <= z3(0.25) or z3 > z3(0.75))
:= ( E''(z1*z2)-E''(z1)*E''(z2) )
/ sqrt((E''(z1^2)-(E''(z1))^2) * (E''(z2^2)-(E''(z2))^2))
where each E'' is a double integral wrt the conditional
density function for (z1,z2), given that (z1,z2) is *not*
in the midrange set.

These integrals are simple enough, but are tedious
because of the various cases of integration limits
introduced by the Uniform distributions, together with
the arbitrary transformation coefficients (A,B,C,D).

Thus, for (x1,x2) wrt y1, there are basically 10 double
integrals to be evaluated, and another 10 for (y1,y2) wrt
x1; however, the value of some of these integrals are
obvious by symmetry, making a total of about 16 integrals
to be set up & evaluated to obtain the formulas given
above.

Finally, define the "degree of polarization" as
P = corr_midrange * corr_extreme

(Thus, "polarized" means "P<0", "more polarized" means
"more negative P", and "not polarized" means "P >= 0".)


Problem Statement:
The problem is to prove or disprove the following
assertion that "polarization asymmetry" holds in
general:

If x1 & x2 are independent & uniformly distributed,
and y1 & y2 are linear combinations of x1 & x2,
then it never happens that (y1,y2) wrt x1 has a
"more negative" value of corr_midrange * corr_extreme
than does (x1,x2) wrt y1;
i.e., under the stated assumptions, (y1,y2) wrt x1
cannot be more polarized than (x1,x2) wrt y1.
==================================

Conclusions:

The resulting analytical formulas for
corr_midrange * corr_extreme
show that polarization asymmetry does *not* hold in
general -- i.e., there are values for (A,B,C,D) for
which (y1,y2) wrt x1 is more polarized than (x1,x2)
wrt y1; however, the formulas also show that the
difference of the correlation-products has, in every
exceptional case, a magnitude less than 0.005.

(It's the fact that these exceptional cases are so
hard to detect in simulations that prompted me to
work the problem analytically.)

An example set of exceptional cases consists of
all values of (A,B,C,D) for which
6 <= A/B <=7 and -.4 <= C/D <= -.3
-- for any such case, (x1,x2) wrt y1 is
"less polarized" than (y1,y2) wrt x1.

E.g., (A/B = 6, C/D = -1/3):

(x1,x2) wrt y1:
corr_mid*corr_ext = -1/(2*sqrt(155)) = -0.0401...
(y1,y2) wrt x1:
corr_mid*corr_ext = -45/(8*sqrt(15910)) = -0.0445...
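
For convenience, here is a small Python transcription of the
closed-form expressions stated above (a sketch that simply evaluates
r.e.s.'s formulas, it does not re-derive them; the function names are
mine). Plugging in the exceptional example reproduces the two values
just quoted:

import math

def P_x_wrt_y1(A, B):
    # corr_midrange * corr_extreme for (x1,x2) wrt y1, with m = |A/B|
    m = abs(A / B)
    if m >= 2:
        m = 1 / m   # the m >= 2 branch equals the m <= 1/2 branch at 1/m
    if m <= 0.5:
        return -4 * m**2 / math.sqrt((7 - 4 * m**2) * (1 + 4 * m**2))
    s = math.sqrt(2)
    num = (s * (math.sqrt(m) + 1 / math.sqrt(m)) - 3.5) ** 2
    den = math.sqrt((m     - 2 * s * math.sqrt(m) + 3)
                  * (1 / m - 2 * s / math.sqrt(m) + 3)
                  * (m     - 2 * s * math.sqrt(m) + 1)
                  * (1 / m - 2 * s / math.sqrt(m) + 1))
    return -num / den

def P_y_wrt_x1(A, B, C, D):
    # corr_midrange * corr_extreme for (y1,y2) wrt x1, with j=C/D, k=A/B
    j, k = C / D, A / B
    return ((4 + j * k) * (4 + 7 * j * k)
            / math.sqrt((4 + j**2) * (4 + k**2)
                        * (4 + 7 * j**2) * (4 + 7 * k**2)))

print(P_x_wrt_y1(6, 1))         # -1/(2*sqrt(155))   = -0.0401...
print(P_y_wrt_x1(6, 1, -1, 3))  # -45/(8*sqrt(15910)) = -0.0445...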

An Excel97 workbook that will evaluate the analytical
formulas and perform a polarization simulation for any
choice of (A,B,C,D), is at

http://rs.1.home.mindspring.com/polariz.xls
(about 600KB)

The workbook also contains a jpeg image showing
plots of corr_mid*corr_ext as a continuous function
of A/B, for a wide range of selected C/D values.

--r.e.s.

Peter Westfall

unread,
May 31, 2000, 3:00:00 AM5/31/00
to

"r.e.s." wrote:

snip...

R.E.S.: Nice work - I hope you and Bill consider writing this up for a
general interest journal - say "Chance" or "American Statistician."


William Chambers

unread,
Jun 2, 2000, 3:00:00 AM6/2/00
to
Hi Peter,

I have encouraged res to publish his results as well. I think he has done
an excellent job and acted with a level of intellectual maturity and
objectivity that we should all admire and emulate.

There is a matter that I should bring up at this point. About a month ago
Don Taylor produced similar work in calculus and sent it to me via e-mail.
I do not know calculus, but as far as I can tell Don's work developed at
least part of the formulas that res presents, though perhaps not as
extensively. Don has been studying the issues under the weight of many
other projects and wanted to delay any report until he felt more certain. I
have discussed this issue with neither Don nor res, and I hope they will
forgive my taking the liberty of explaining all this in public; I did not
realize that res was working on the problem with calculus. If I had, I
would have immediately asked Don's permission to share his work.

None of this is intended to undermine or discredit res. His work extends
Don's independent work, and between the two of them I think they have a good
article. I recommend that they pool their discoveries and together present
a paper. There is no requirement for me to share authorship with them,
though I would be very proud to tag along as third. I know these guys are
busy and probably do not have a lot of time to devote to an article. I do
not think it would take too much writing, however, since they have
crystallized the essence with their calculus.

I have been out of town for a few days and may have missed something on the
newslist. I think res's discovery that the polarization asymmetry is not
"general" but pretty close is highly significant and original. I wish
polarization had worked out to be "general", but from a practical
standpoint the results are still very encouraging. We might yet find a way
to exclude the rare cases res found and then use corresponding
correlations. We still need to fight about correlation and causation, but I
suspect that will be fun and will come later. It would not be necessary for
res and Don to talk about their work as demonstrating causation. Their work
essentially concerns asymmetrical implications in the sequence of variables
in generative equations, and I recommend they present it as such. The word
causation spooks people too much to let it obstruct the development of
mathematics that has its own place, independent of the philosophy of
causation.

I am starting a new job in a new state next week, so I may not be able to
bother you guys as much anymore on this newslist. As much fighting as we
have done, I would like to say that I actually appreciate this group as
something special. I have discussed corresponding regressions on several
newslists and been literally banned as soon as it became clear that things
were adding up. I admire the owner of the group for not letting that
happen, and I appreciate the efforts that Henry, James and others made in
tackling a difficult and rightfully suspicious-looking topic in public. The
bottom line is that whether I am right or wrong about causation, we know
more about mathematics than when our conversation started. Those who have
tolerated and listened to our fighting, miscommunication and errors
(including mine) are to be thanked as well. I am still not convinced that
the state of statistics is not in sorry shape, but I do believe there is
hope and that some talented and honest folks are out there. I wish you all
well.

Thanks.

I have a U-Haul truck to pack and 500 miles to drive... gotta run.

Bill Chambers

R.E.S.: Nice work - I hope you and Bill consider writing this up for a
general interest journal - say "Chance" or "American Statistician."

Peter Westfall wrote in message <39350A8F...@attglobal.net>...
