Fred L. Bookstein: PARTIAL LEAST SQUARES: A DOSE-RESPONSE MODEL
Author's Rationale for Soliciting Commentary:
Multivariate statistical methods are often found to apply in
disciplines strikingly distant from those in which they originated.
The method discussed in this target article has hitherto been
disseminated mainly to the rather small community concerned with the
behavioral consequences of prenatal alcohol exposure. But the method
does not seem to me to be specific to alcohol teratogenesis. It is
couched broadly in a rhetoric of quantitative reasoning shared by all
the fields that study subtle abnormalities of behavioral
development. By presenting this discussion as a target article in
Psycoloquy, I hope to generate a much broader consideration of the new
method than is possible within any circumscribed professional
specialty. I encourage the would-be commentator, then, to assume the
role of a reviewer for a journal in a different discipline, and to
explore the suitability or unsuitability of this method for studies
of other sorts of complex, indirectly measured systems.
--------------------------------------------------------------------------
psycoloquy.94.5.23.least-squares.1.bookstein Wednesday 6 April
ISSN 1055-0143 (27 paragraphs, 1 figure, 8 references, 582 lines)
PSYCOLOQUY is sponsored by the American Psychological Association (APA)
Copyright 1994 FL Bookstein
PARTIAL LEAST SQUARES: A DOSE-RESPONSE MODEL FOR
MEASUREMENT IN THE BEHAVIORAL AND BRAIN SCIENCES
Fred L. Bookstein
Center for Human Growth and Development
The University of Michigan
Ann Arbor, Michigan 48109
(313) 764-2443
fr...@brainmap.med.umich.edu
ABSTRACT: Partial Least Squares (PLS) is a relatively new
multivariate statistical method for the analysis of indirectly
measured cause and effect in complex behavioral systems. The core
of the technique is a rearrangement of the singular-value
decomposition (SVD) of the correlation matrix between two blocks of
variables. In this setting, the SVD can be reinterpreted as dealing
with two latent variable (LV) scores, one for each block, such that
the coefficients of either are proportional to the predictive
salience of the corresponding variable for the other LV. In the
presence of a true causal nexus, subsequent statistical
manipulation of these coefficients and scores can be very
enlightening. The strengths of PLS are demonstrated using the
Seattle study of the effects of prenatal alcohol exposure on
offspring development. This longitudinal study is based on 13
diverse measures of prenatal exposure and hundreds of outcome
scores that assay attentional behavior, neuromotor maturation,
cognitive functioning, and socialization to school in a
population-based sample of 500 children born in 1975. There is an
enduring effect of prenatal exposure on outcomes in all of these
channels. I argue that PLS is the best method for discovering and
reporting the nature of the dose-response relationship and the
characteristics of affected children in studies such as these.
KEYWORDS: behavioral teratology, dose-response analysis, fetal
alcohol effects, latent variables, longitudinal data analysis,
partial least squares, singular-value decomposition.
I. INTRODUCTION
1. This target article calls the attention of the PSYCOLOQUY readership
to a relatively new statistical methodology for the analysis of cause
and effect when neither can be measured directly but both can be
measured redundantly. This familiar measurement design is particularly
important at the nexus of the behavioral and brain sciences, where
studies often combine multiple indirect measurements of important but
brief causes (e.g., prenatal insults) having important consequences of
relatively great duration (e.g., the lifespan). A shorter version of
this article was published in PSYCOLOQUY in February 1991 (Bookstein,
1991), but elicited no commentary (PSYCOLOQUY was still very young then,
and the biobehavioral science community not yet familiar with the
concept of interactive electronic publication). Now, three years later,
the Editor has invited me to broach the topic again, at greater
length. Today it is easier to argue that in its combination of
biometric and psychometric themes the approach is worth considering for
a much broader range of studies than those in which it has been
exploited so far.
2. The technique I am reviewing, "Partial Least Squares" (PLS), is a
variant of a family of least-squares models of correlation matrices
introduced in the 1920's by the biometrician Sewall Wright (1889-1988)
to link path analysis with factor analysis. The technique was
rediscovered, and the present name assigned, by the Swedish
econometrician Herman Wold (1907-1992), in diverse sociological
applications throughout the 1970's; but the explanation that follows is
not Wold's, and does not correspond to the algorithm of a program
package matching Wold's ideas that was distributed by Lohmuller in the
late 1980's. Also, this version of PLS should not be confused with a
quite different algorithm of the same name that applies to prediction
and classification problems in chemometrics, an algorithm having
another Wold for inventor (Svante Wold, Herman's son).
3. The present method was worked out in extensive analyses of
neurobehavioral sequelae of prenatal exposure to alcohol in 500
children exposed at levels milder than those bringing on frank Fetal
Alcohol Syndrome (FAS). Paul D. Sampson has been my principal
statistical collaborator in this work. Ann P. Streissguth of the
University of Washington, Seattle, has been Principal Investigator of
this project since 1974. Our 1993 monograph (see the reading list at
the end of this paper) explains PLS methods in much greater detail than
we have room for here, and also reviews the teratological context of
this study, the design of all its phases of data collection, and its
major findings through the early school years.
4. Rather than turning directly to these teratological matters,
however, I will begin at the core of the technique, with a description
of some unexpectedly fruitful matrix maneuvers. The scientist for whom
PLS is designed is faced with two lists of ordinary statistical
variables; call them the X's and the Y's. Later on, the X's will be
measures of prenatal alcohol exposure, and the Y's a great variety of
measures of child neurobehavioral functioning. Number the X's from X1
to Xm, the Y's from Y1 to Yn. In the application to come, m is 13 and n
is 474. Even though the X's and Y's have been measured on the same
cases, they usually do not share any natural units of measurement, so
it is convenient to normalize each of them to mean zero and variance
one. Write R for the correlation matrix of the X's with the Y's, m rows
by n columns. Its element Rij is the correlation of Xi with Yj. This
matrix does not have 1's down the diagonal; usually it is not even
square.
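The construction of R in paragraph 4 can be sketched in a few lines.
This is a minimal illustration only: Python with NumPy is an assumption
of the sketch, the function names are hypothetical, and the data are
synthetic stand-ins rather than the Seattle measurements.

```python
import numpy as np

def standardize(M):
    """Scale each column to mean zero and variance one."""
    M = np.asarray(M, dtype=float)
    return (M - M.mean(axis=0)) / M.std(axis=0)

def cross_correlation(X, Y):
    """m-by-n matrix R with R[i, j] = correlation of Xi with Yj."""
    Zx, Zy = standardize(X), standardize(Y)
    return Zx.T @ Zy / Zx.shape[0]

# Synthetic stand-in data (hypothetical, not the Seattle measurements):
# 500 cases, m = 3 "dose" columns, n = 5 "outcome" columns.
rng = np.random.default_rng(0)
dose = rng.normal(size=500)
X = dose[:, None] + rng.normal(size=(500, 3))
Y = 0.3 * dose[:, None] + rng.normal(size=(500, 5))

R = cross_correlation(X, Y)
print(R.shape)   # (3, 5): m rows by n columns, not square
```

Because both blocks are standardized first, each entry of R is an
ordinary Pearson correlation, and R has no diagonal of 1's to speak of.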
5. A certain interesting computation can now be phrased in any of
several different ways. All involve the production of a vector of m
coefficients Ai, one for each X, together with a vector of n
coefficients Bj, one for each Y. In the easiest approach to this nexus
of overlapping interpretations, the name "Partial Least Squares" can be
thought of as referring to "Least-Squares analysis of Part of a
correlation matrix."
6. Suppose we want the vectors A and B for which the dyadic product AB
(that is, the m by n matrix whose ij-th entry is Ai*Bj) comes closest
to the matrix R in the least-squares sense. There are algorithms
(specifically, the singular-value decomposition, SVD) for producing A
and B at the same time: they are the first pair of left and right
"singular vectors" of the matrix R. When R is a correlation matrix,
however, they take on an additional meaning that it is useful to
explain from first principles. Assume we already have the vector A
somehow and merely wish to compute the corresponding vector B one entry
Bk at a time. If Ai*Bj matches R in a least-squares sense, then its
k-th column, Ai*Bk, has to do its share in fitting the k-th column of
R: we want to choose Bk so that the difference between Rik and Ai*Bk is
as small as possible in the least-squares sense -- to minimize the sum
of the terms (Rik-Ai*Bk)**2 as i varies from 1 to m. Here Bk is
unknown, and the Rik and Ai are already known (by assumption).
7. But this is an ordinary univariate regression (without constant
term). So we know the formula for Bk: it is equal to the cross product
of the A's by the R's divided by the cross product of the A's by the
A's: Bk=sum(Ai*Rik)/sum(Ai**2), both sums taken over i. Given the A's,
then, the values Bj of the B's that together guarantee a least-squares
fit of Ai*Bj to Rij are proportional to the expression sum(Ai*Rij). The
scientific import of PLS is concentrated in a peculiarly useful
alternate interpretation of part of this same formula. The expression
sum(Ai*Rik), numerator of the least-squares estimate of the value of Bk
for the k-th column of R, is at the same time the covariance of the
original variable Yk with a new linear combination of the X's, the
"latent variable" LV.X=sum(Ai*Xi) that combines the X's using the
coefficients of the vector A for weights. That is,
sum(Ai*Rik)=cov(Yk,sum(Ai*Xi)) -- this is just an algebraic identity.
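The two readings of sum(Ai*Rik) in paragraphs 6 and 7 can be checked
numerically. The following is a hedged sketch in Python with NumPy (an
assumption of the illustration), using synthetic data and an arbitrary
trial vector A. It compares the closed-form Bk with a no-intercept
least-squares fit, and the numerator with the covariance of Yk and
LV.X.

```python
import numpy as np

# Synthetic stand-in data (hypothetical): N cases, m X's, n Y's.
rng = np.random.default_rng(1)
N, m, n = 400, 4, 6
latent = rng.normal(size=N)
X = latent[:, None] + rng.normal(size=(N, m))
Y = 0.4 * latent[:, None] + rng.normal(size=(N, n))
Zx = (X - X.mean(axis=0)) / X.std(axis=0)
Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0)
R = Zx.T @ Zy / N                      # m-by-n cross-correlation matrix

A = rng.normal(size=m)                 # any trial vector A, assumed given

# Closed form: Bk = sum(Ai*Rik) / sum(Ai**2), all columns k at once.
B = (A @ R) / (A @ A)

# The same numbers from a regression without constant term, column by column.
B_lstsq = np.array([
    np.linalg.lstsq(A[:, None], R[:, k], rcond=None)[0][0]
    for k in range(n)
])

# The numerator sum(Ai*Rik) is also cov(Yk, LV.X), with LV.X = sum(Ai*Xi).
lv_x = Zx @ A
cov_y_lvx = Zy.T @ lv_x / N

print(np.allclose(B, B_lstsq), np.allclose(A @ R, cov_y_lvx))
```

The identity holds for any A at all, which is why the text calls it
"just an algebraic identity"; nothing about it depends on A being a
singular vector.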
8. Similarly, given the values of the B's, we find that each Ak is
proportional to the covariance sum(Bj*Rkj) of the original variable Xk
with a new "latent variable" LV.Y=sum(Bj*Yj) combining the Y's using
the coefficients of the vector B for weights.
9. A computation ostensibly referring only to a matrix of cross-block
correlations, then, ends up interpretable in terms of covariances with
"latent variables" summarizing the import of either block for
predicting the other. Each coefficient, an Ak or a Bk, is proportional
to the covariance of its original variable with the LV score of the
OTHER block. These coefficients are called "saliences." The
coefficients Ai are the saliences of the several Xi for predicting (or
being predicted by) the summary score LV.Y combining the Y's, and
similarly the coefficients Bj are the saliences of the several Yj for
predicting (or being predicted by) the summary score LV.X combining the
X's. The A's are the saliences of the X's for the Y's, and the B's are
the saliences of the Y's for the X's, all at the same time. We call
this a "consistency criterion"; in other fields, such as econometrics,
it is called an alternating expression of a fixed-point property.
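One way to see the consistency criterion at work is to alternate the
two regressions of paragraphs 7 and 8 until neither vector changes. The
sketch below does this in Python with NumPy (an assumption of the
illustration). The function name first_saliences and the made-up matrix
standing in for R are hypothetical, and the result is compared with the
first singular vectors returned by a library SVD.

```python
import numpy as np

def first_saliences(R, tol=1e-12, max_iter=10000):
    """Alternate the two no-intercept regressions until A stops changing."""
    R = np.asarray(R, dtype=float)
    A = np.ones(R.shape[0]) / np.sqrt(R.shape[0])   # arbitrary starting guess
    for _ in range(max_iter):
        B = R.T @ A                  # Bj proportional to sum(Ai*Rij)
        B /= np.linalg.norm(B)
        A_new = R @ B                # Ak proportional to sum(Bj*Rkj)
        A_new /= np.linalg.norm(A_new)
        converged = np.linalg.norm(A_new - A) < tol
        A = A_new
        if converged:
            break
    B = R.T @ A                      # final B matched to the final A
    B /= np.linalg.norm(B)
    return A, B

# Made-up 4-by-6 matrix standing in for R: one dominant pattern plus noise.
rng = np.random.default_rng(4)
R = np.outer([0.30, 0.25, 0.20, 0.10], [0.5, 0.4, 0.6, 0.3, 0.5, 0.2])
R += 0.02 * rng.normal(size=R.shape)

A, B = first_saliences(R)
U, s, Vt = np.linalg.svd(R)

# Agreement with the first singular vectors, up to an overall sign.
print(abs(A @ U[:, 0]), abs(B @ Vt[0]))
```

The iteration converges quickly here because the first singular value
dominates the second; when the two are nearly equal it can be slow, and
a library SVD is the safer route.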
10. Other characterizations of this same pairing of singular vectors A
and B follow from other aspects of the singular-value decomposition
reviewed in the papers listed in the bibliography below. There are
several pairs of vectors A and B that satisfy the "consistency
criterion" just reviewed (each element of either proportional to
covariances with the latent variable score for the other). One of these
pairs has the greatest possible covariance of any pair of linear
combinations sum(Ai*Xi), sum(Bj*Yj) when the scales of the A's and B's
are controlled separately (by setting sum(Ai**2)=sum(Bj**2)=1). This
covariance is called a "singular value" of the original matrix R; the
sum of squares of all these covariances is the sum of squares of all
the entries of R. So we can talk about the "goodness of fit" of the
model Ai*Bj for R in terms of the covariance of the two latent
variables involved. The usual statistic from this least-squares fitting
problem is the "fraction of summed squared correlation explained" by
this pair of LV's. ("Explanation" here does not mean the way
correlations explain case values, as in regression or analysis of
variance, but the way that latent variables explain correlations.) The
larger that fraction, the closer the pattern Ai*Bj comes to fitting all
the entries of R -- the more the columns of R all look as if they have
the same pattern, the pattern of the vector A of X-saliences, or,
equivalently, the more the rows of R all look as if they have the same
pattern, the pattern of the vector B of Y-saliences. From the basic
formula, the salience Ak of the k-th X is the "amplitude" according to
which that row of R matches the general vector B across columns, and
vice-versa.
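The quantities named in paragraph 10 can be read directly off an SVD.
The sketch below is in Python with NumPy (an assumption of the
illustration) and uses synthetic data. It checks that the covariance of
the two LV scores equals the first singular value, that the squared
singular values sum to the sum of squared entries of R, and it computes
the "fraction of summed squared correlation explained."

```python
import numpy as np

# Synthetic stand-in data (hypothetical, not the Seattle measurements).
rng = np.random.default_rng(2)
N, m, n = 500, 4, 8
latent = rng.normal(size=N)
X = latent[:, None] + rng.normal(size=(N, m))
Y = 0.5 * latent[:, None] + rng.normal(size=(N, n))
Zx = (X - X.mean(axis=0)) / X.std(axis=0)
Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0)
R = Zx.T @ Zy / N

U, s, Vt = np.linalg.svd(R, full_matrices=False)
A, B = U[:, 0], Vt[0]                # unit-length saliences

lv_x, lv_y = Zx @ A, Zy @ B          # LV scores, one per case
cov_lv = lv_x @ lv_y / N             # covariance of the two LV scores

total = (R ** 2).sum()               # sum of squares of all entries of R
fraction = s[0] ** 2 / total         # fraction of summed squared correlation

print(round(float(fraction), 3))
```

Later singular values play the same role for the further pairs of
vectors that satisfy the consistency criterion.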
11. Another way of thinking about this same pairing is to imagine a
principal components analysis (PCA) of the matrix R as if it were raw
data: "rows" that are "cases" Xi, and columns that are "variables" Rij.
Then the vector B we're looking for is the ordinary first principal
component of the matrix R interpreted in this way (a principal
component computed "around zero," not around the means of the columns
of R), and the vector A we are looking for is made up of the scores of
the "cases" on this first principal component. Or we can imagine the
"cases" to be the columns of R, not the rows, which are now the
"variables." Now that the problem is transposed, the first principal
component of the "rows" is the vector A, and the "scores" on this PC
make up the vector B. But in either version, rows or columns, both
vectors (A's or B's) actually combine with the original data (X's or
Y's) to generate scores over the original sample of actual cases (no
quotes this time). It is this shared, symmetric interpretation of what
is usually an asymmetric computation that underlies the application to
dose-response analysis concerning us here. All these interpretations
are guaranteed consistent by the properties of the singular-value
decomposition.
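The principal-components reading of paragraph 11 can also be verified
numerically. The sketch below, in Python with NumPy (an assumption of
the illustration) and with a made-up matrix standing in for R, runs an
uncentered PCA on the rows of R and then on its columns, and compares
both with the SVD.

```python
import numpy as np

# Made-up 5-by-9 matrix standing in for R (hypothetical values).
rng = np.random.default_rng(3)
R = np.outer([0.30, 0.25, 0.20, 0.30, 0.10], np.linspace(0.2, 0.6, 9))
R += 0.1 * rng.normal(size=R.shape)

# Rows as "cases": PCA "around zero" is the eigenanalysis of R'R,
# with no column means removed.  eigh returns eigenvalues in ascending order.
evals, evecs = np.linalg.eigh(R.T @ R)
B = evecs[:, -1]                     # first principal component
scores = R @ B                       # scores of the m "cases"
A = scores / np.linalg.norm(scores)

# Columns as "cases": the transposed problem, eigenanalysis of RR'.
evals_t, evecs_t = np.linalg.eigh(R @ R.T)
A_t = evecs_t[:, -1]
scores_t = R.T @ A_t
B_t = scores_t / np.linalg.norm(scores_t)

U, s, Vt = np.linalg.svd(R, full_matrices=False)
print(abs(A @ U[:, 0]), abs(B @ Vt[0]), abs(A @ A_t), abs(B @ B_t))
```

Both routes give the same pair of vectors up to sign, and the largest
eigenvalue in either version is the square of the first singular value.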
12. These "latent variables" LV.X, LV.Y should not be thought of as
"factors" or as the sort of LV's computed in other approaches to
"structural equations modeling." For now, consider them simply as naked
formulas: a latent variable of a block of variables X for predicting a
variable Z is the new linear combination sum(cov(Xi,Z)*Xi). In this
definition, Z could be any variate at all. The power of this new
construction of LV's arises from the computation of LV.X and LV.Y in
pairs. LV.Y serves the role of Z for the definition of LV.X -- we have
LV.X=sum(cov(Xi,LV.Y)*Xi) -- and, simultaneously, LV.X serves that role
for the definition of LV.Y: LV.Y=sum(cov(LV.X,Yj)*Yj).
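The "naked formula" of paragraph 12 is short enough to write down as a
function. In this sketch (Python with NumPy assumed; the name
latent_variable and the synthetic data are hypothetical), the paired
definitions are checked against the SVD. With unit-length saliences the
two sides agree up to a scale factor, which here is the first singular
value.

```python
import numpy as np

def latent_variable(Z_block, z):
    """sum over i of cov(Xi, z) * Xi, evaluated case by case.

    Z_block holds standardized columns; z is any variate measured
    on the same cases.
    """
    z = np.asarray(z, dtype=float)
    covs = Z_block.T @ (z - z.mean()) / len(z)
    return Z_block @ covs

# Synthetic stand-in data (hypothetical).
rng = np.random.default_rng(7)
N = 300
latent = rng.normal(size=N)
X = latent[:, None] + rng.normal(size=(N, 3))
Y = 0.5 * latent[:, None] + rng.normal(size=(N, 5))
Zx = (X - X.mean(axis=0)) / X.std(axis=0)
Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0)
R = Zx.T @ Zy / N

U, s, Vt = np.linalg.svd(R, full_matrices=False)
lv_x, lv_y = Zx @ U[:, 0], Zy @ Vt[0]

# LV.Y plays the role of Z for LV.X, and LV.X plays it for LV.Y.
print(np.allclose(latent_variable(Zx, lv_y), s[0] * lv_x),
      np.allclose(latent_variable(Zy, lv_x), s[0] * lv_y))
```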
13. So far I have been describing the "two-block" form of PLS. Our 1993
monograph explains its extension to multiple dimensions within blocks
and to analysis of more than two blocks of variables at the same time.
The basic algorithm is still least-squares fitting of some off-diagonal
parts of a correlation matrix, and it still results in sets of multiple
latent variable scores, one per "block," bearing pairwise covariances
that are optimal in a certain scientifically useful sense. Lists of
variables work well as blocks if they are measured at the same time,
represent the same behavioral channel, or can otherwise be expected to
have "something in common" in respect of their prediction of or
prediction by other blocks in an explanatory scheme.
14. To understand what these formulas can mean in practice, we need to
interpret them in the context of how the X's and the Y's were collected
in the first place. The A's and B's tell us important things about
variables; the latent variable scores LV.X and LV.Y tell us equally
important things about dose and response case by case. The remainder of
this article, then, is a description of one particular scientific
application to which PLS appears exceptionally well matched.
II. SCIENTIFIC CONTEXT
15. PLS seems ideal for studies of dose and response (cause and effect)
in systems under indirect observation. These are not studies of
"normal variation." Instead, the investigator is typically trying to
extend into the range of human observational studies a pure
dose-response nexus known to lead to unipolar syndromes in high-dose
cases. Fetal alcohol syndrome (FAS) is known to exist and to be caused
by prenatal exposure to alcohol in sufficient quantity. The associated
dose-response studies analyze the dependence of effect upon cause -- of
response upon dose -- in the mildly abnormal case ("social drinking").
The fact of this prior knowledge means that the existence of the
underlying statistical tie is not in question. The study is, rather,
one of calibration: to find and rank the saliences of the dose
measures, or the response measures; and to find and rank the subjects
of the study according to their dose or their response scores. For the
present study, a sample of 500 Seattle women pregnant in 1974-75 was
drawn from 1500 interviews of women in prenatal care by the fifth month
in two hospitals. The sample is typical of Seattle pregnancies of the
mid-1970's except that in order to increase the precision of the
dose-response calibration it was intentionally overweighted for high
levels of social drinking. At the time, drinking was not correlated
with other aspects of high-risk pregnancies; most of the statistical
confounds bedeviling recent studies of this type were inoperative in
this one.
III. MEASUREMENTS
16. PLS applies to studies in which cause and effect are each measured
variously and redundantly. Our alcohol study includes multiple measures
of amount and pattern: averaged dose ("drinks per day" -- a drink is a
standardized dose of alcohol, about half an ounce), occasions per
month, average drinks per occasion, maximum drinks per occasion, and
"bingeing" according to two different categorization schemes. These
were assessed at two times during pregnancy ("prior to awareness" and
contemporaneously) in the course of one single interview during the
fifth month, and were augmented by a summary rating of priority for
inclusion in the "follow-up" sample when the child was actually born.
The measures of "effect" include quite an assortment: nearly five
hundred measures of neurobehavioral functions typically found to be
altered in the full clinical expression of Fetal Alcohol Syndrome.
These outcomes are gathered into "blocks" by child's age (from 1 day to
14 years), behavioral channel (attentional, motor, mental, and
school-related), and modality (laboratory task, expert rating, parent
rating, teacher rating, standardized psychometric exam). Analyses
proceed both separately block by block and with the outcomes pooled
into one list of 474 (through age 7 years only, as reviewed in our
monograph). The study is prospective and longitudinal in design, with
82% sample continuity from birth right through 14 years. This
continuity is as salutary for PLS analysis as for any other form of
biometric investigation.
IV. FINDINGS
IV.1 Alcohol Saliences (A's)
17. We find that one single latent measure of dose reliably accounts
for most of the profiles of correlation with outcomes whether pooled
over seven years or separately block by block at up to fourteen years.
(For this report, dose measures have been linearized with respect to
this composite outcome. The transformations are in Figure 4.1 of our
monograph.) A first pair of LVs accounts for 75% of the sum of squares
of the 13 x 474 correlations up through seven years; the incorporation
of a second dimension to account for variations in the general
weighting of binges increases that fraction to 86%. Even without this
refinement, the pattern of saliences of dose confirms the findings from
animal studies. Salience for net deleterious outcome is higher for
doses early in pregnancy ("prior to awareness") than in mid-pregnancy
(when the interview took place), and saliences are higher for patterns
of massed drinking ("binges") than for the comparable net intake of
alcohol in a pattern of steady drinking. The two-dimensional analysis
corresponds to a collection of simpler fits against alcohol, block by
block, in which all the alcohol salience vectors A resemble the one
that accounts for 75% of all 13 x 474 correlations and differ from it
mainly by the weighting of the early/binge components with respect to
the others. For the typical outcome, the most salient measure of
alcohol dose is average number of drinks per occasion prior to
awareness of pregnancy. The most frequently encountered measure of
alcohol intake, net ethanol per day, is among the least useful for
prediction of the ensuing neurobehavioral deficits.
IV.2 Outcome Saliences (B's)
18. The second set of saliences pertains to outcomes. The first PLS
latent variable for the first seven years of outcomes most heavily
weights our modification of a Brazelton neonatal measure, habituation
to light at age 1 day (an age clearly too early for socioeconomic
effects to have intervened); but the next most salient outcome variable
is a rating of adjustment to the second-grade school setting. Other
outcomes having high salience (high correlation with the alcohol LV)
include several other neonatal manifestations of neurological maturity,
standardized scores (especially the standard deviation of reaction
time) on vigilance tests at ages four through fourteen years, teacher
ratings of undesirable classroom behaviors related to learning
disability, a sharply focused profile of IQ deficit at 7 years
concentrating on arithmetic and digit span subtests, a trade-off of
accuracy for speed in a geometric puzzle task at age 14, and a sharp,
specific failure to pronounce unfamiliar English phonemes correctly.
IV.3 Case Scores
19. In spite of the long tail of the measure of latent dose, the joint
distribution of dose and response LV scores for this sample is
well-behaved, as shown in the following scatterplot summarizing the
findings up through seven years (Streissguth, Bookstein et al., 1993,
Figure 6.1). This pair of LV's incorporates 75% of all the squared
correlation between the blocks; the correlation between the scores is
0.29.
-------------------------------------------------------------------------
FIGURE 1
[Scatterplot of First Outcome LV Score (ordinate, about -10 to 15)
against First Alcohol LV Score (abscissa, 0 to 12) for the sample;
see Streissguth, Bookstein et al., 1993, Figure 6.1.]
-------------------------------------------------------------------------
20. In the context of dose-response analysis, the abscissa of this plot
is the best composite measure of dose for these responses, and the
ordinate the best composite measure of response for these dose
measures. The best single statistical realization of the "dose-response
curve" of teratological theory would be the trail of a scatterplot
smoother through this scatter. After parental education, the alcohol LV
is a more significant predictor of this composite outcome than any
other measurable factor, and the slope of the apparent dose-response
curve is hardly altered at all after such statistical adjustments.
Children can show neurobehavioral deficits at all levels of prenatal
dose (notice, for instance, the large vertical scatter of the streak at
the left, the children of abstainers), but the risk of a high deficit
clearly increases with dose. Of the few dozen most heavily exposed
subjects (those at the right of the horizontal axis), two were formally
diagnosed FAS at birth. We have been able to diagnose 11 others as
fetal-alcohol-affected on the basis of consistent deficits over the
components of this summary score disaggregated by channel and wave of
measurement.
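The "trail of a scatterplot smoother" mentioned in paragraph 20 can be
illustrated with a very simple smoother. The text does not say which
smoother was used, so the running mean below is an assumption chosen
for brevity, as are the function name, the window width, and the
synthetic long-tailed dose scores (Python with NumPy).

```python
import numpy as np

def running_mean_smoother(x, y, window=51):
    """Mean of y over a window of neighboring cases in x-rank order."""
    order = np.argsort(x)
    xs = np.asarray(x, dtype=float)[order]
    ys = np.asarray(y, dtype=float)[order]
    half = window // 2
    fitted = np.empty_like(ys)
    for i in range(len(ys)):
        lo, hi = max(0, i - half), min(len(ys), i + half + 1)
        fitted[i] = ys[lo:hi].mean()     # windows shrink at the two ends
    return xs, fitted

# Synthetic stand-ins for the two LV scores (hypothetical values):
# a long-tailed dose score and a noisy response that rises with it.
rng = np.random.default_rng(5)
dose_lv = rng.exponential(scale=2.0, size=500)
response_lv = 0.8 * dose_lv + rng.normal(scale=4.0, size=500)

xs, curve = running_mean_smoother(dose_lv, response_lv)
print(len(curve))   # 500 fitted values, ordered by increasing dose
```

Plotting curve against xs over the scatter of the two scores gives one
rough realization of a dose-response curve; a locally weighted
regression would be a common alternative.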
21. It may be helpful to compare this two-block procedure with the more
familiar approach of canonical correlations analysis (CCA), which is an
optimization of the correlation between a similar pair of scores,
likewise linear combinations of the two blocks. Recall that PLS
optimizes covariance, not correlation; the distinction is much more
important than it seems. Interpreting the coefficients of canonical
variates requires the usual stringent assumptions underlying multiple
regression of either canonical variate upon the variables of the other
block. Such assumptions are unlikely to obtain when predictors or
outcomes are intentionally redundant. (In a typical analysis from the
Seattle study, alcohol versus 11 IQ subscores, the first three pairs of
canonical variates have nearly the same high correlation; but each
involves an uninterpretable contrast among the alcohol variables, and
none bears much predictively usable covariance.) In contrast, the PLS
procedure begins with the assignment of an interpretive meaning to each
coefficient, as being a salience for the cross-block prediction problem
driving the research: being proportional to covariance with the facing
LV. From this follows the optimization of covariance of the normalized
LV's. In this, PLS directly generalizes the meaning of the coefficients
of a principal component (the linear combination LV.X satisfying the
definition of LV with Z = LV.X itself).
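The contrast between maximizing covariance and maximizing correlation
can be made concrete. The sketch below (Python with NumPy assumed;
synthetic, intentionally redundant blocks) computes the first PLS pair
from the SVD of R and the first canonical pair by whitening each block
with a Cholesky factor before the SVD. This is one standard way to
compute CCA, not necessarily the one used in the Seattle analyses.

```python
import numpy as np

# Synthetic, intentionally redundant blocks (hypothetical data).
rng = np.random.default_rng(6)
N = 500
latent = rng.normal(size=N)
X = latent[:, None] + 0.5 * rng.normal(size=(N, 5))
Y = 0.4 * latent[:, None] + rng.normal(size=(N, 6))
Zx = (X - X.mean(axis=0)) / X.std(axis=0)
Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0)
Sxx, Syy, R = Zx.T @ Zx / N, Zy.T @ Zy / N, Zx.T @ Zy / N

# PLS: first singular vectors of R maximize the covariance of
# unit-length linear combinations of the two blocks.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
A, B = U[:, 0], Vt[0]
pls_cov = s[0]
pls_corr = np.corrcoef(Zx @ A, Zy @ B)[0, 1]

# CCA: whiten each block, then take the SVD of the whitened matrix.
Lx, Ly = np.linalg.cholesky(Sxx), np.linalg.cholesky(Syy)
K = np.linalg.solve(Lx, R)              # Lx^-1 R
K = np.linalg.solve(Ly, K.T).T          # Lx^-1 R Ly^-T
Uc, rho, Vct = np.linalg.svd(K, full_matrices=False)
a = np.linalg.solve(Lx.T, Uc[:, 0])     # canonical weights for the X's
b = np.linalg.solve(Ly.T, Vct[0])       # canonical weights for the Y's
cca_corr = np.corrcoef(Zx @ a, Zy @ b)[0, 1]
cca_cov = abs((a / np.linalg.norm(a)) @ R @ (b / np.linalg.norm(b)))

print("PLS: cov", round(float(pls_cov), 3), "corr", round(float(pls_corr), 3))
print("CCA: cov", round(float(cca_cov), 3), "corr", round(float(cca_corr), 3))
```

By construction the canonical pair has the higher correlation and the
PLS pair the higher covariance. Inspecting a and b against A and B on
redundant blocks such as these shows how differently the two sets of
coefficients behave.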
22. Because two-block PLS is effectively a principal-components
analysis of either the rows or the columns of the cross-block
correlation matrix, its pathologies are the milder ones characteristic
of PCA (influential observations, clusters) rather than those of
multiple regression or likelihood-based modeling of covariance
structures. The "data" for the PCA are correlations rather than
individual measurements, further ameliorating these difficulties.
Another way of thinking about all this construes the alcohol LV score
as an averaging of a large list of univariate predictions
cov(Xi,LV.Y)*Xi of the same outcome LV: one prediction for each
predictor in the "predictor block" of alcohol scores. The same
averaging applies to the object of prediction, the outcome LV. The
apparent regularization of all sorts of quirks of data so that a
central tendency can emerge (in PLS, a central tendency of cross-block
prediction) is just what one expects an average to do. (As early as
1892, F. Y. Edgeworth had noticed that covariances themselves were
already weighted averages of simple slope measures.)
23. Computing the SVD of an arbitrary matrix, such as the
cross-correlation matrix R underlying the PLS approach, is emerging as
a standard capability in most statistical computing environments. The
scatters, diagnostics, and further explorations we have described are
not peculiar to the particular linear combinations that are PLS LV's,
but can be carried out effectively in almost any statistics package. In
our studies of outcome blocks showing alcohol teratogenesis, age after
age and channel after channel, after each SVD we routinely produce
scatters of the LV scores against each other (as in the example above)
to check for outliers and nonlinearity; we scatter the outcome LV
against the dose measures and rescale those measures by a gently
nonlinear scatterplot smoother if necessary; we check the covariances
between dose and response LV's for confounding by the obvious
covariates that afflict human teratology studies (other prenatal
exposures, social class, education, nutrition) -- so far these have
never been a problem; and we verify that the children labeled as
"alcohol-affected" at earlier ages remain characteristically in deficit
at later ages. Beyond the basic SVD itself, all of these statistical
maneuvers are elementary. Our monograph (see the bibliography) explains
all of these tactics and lays out the tables and diagrams that lead us
to conclude that the primary findings about the pattern of alcohol
effects and the pattern of saliences of dose are valid.
24. PLS may be contrasted with diverse other approaches to the same
sort of causal explanation. By maximizing covariance between the LV
scores, PLS optimizes the usefulness of the analysis for subsequent
studies of intervention. Unlike the coefficients of a canonical
correlations analysis, the saliences that PLS computes have meaning
individually even when (indeed, especially when) the predictor block or
the outcome block is intentionally multicollinear. Along with the
scores, the saliences can be computed in any statistical package that
gives users access to eigenanalysis, so that PLS can be applied to
much larger problems than can more sophisticated optimizations. PLS
differs from structural equations models in its lack of most
distributional assumptions and in that it invariably ignores the
within-block factor structure of the dose measures and the response
measures separately. In our experience, this structure is quite
irrelevant to the assigned task of cross-block explanation. (For
instance, alcohol does not affect the general factor of IQ as much as it
affects a particular profile of arithmetic deficiency.)
25. As a fit to the cross-correlation matrix rather than the raw data,
PLS avoids the difficulty of all likelihood-based structural equation
modeling (including multiple regression), namely, that to be
interpretable a fitted model must first be "true." PLS is a useful
multivariate tool for those who believe, along with me, David Freedman,
Clark Glymour, and many others, that structural-equations modeling as
applied in the behavioral sciences has never taught us anything we did
not already know -- that it has never arrived at any positive
conclusions not already built into the hypotheses. While PLS is not
designed for the "testing" of "hypotheses," the vectors of saliences A
and B can be tested against a null model by bootstrap computations, and
similar exploratory resampling data analyses can be applied to
substantive aspects of the interpretations that result, such as
covariates of LV scores or the reliable identification of types of dose
or response measures as particularly salient for each other.
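A bootstrap of the saliences might be sketched as follows. This is one
plausible reading of the resampling described in paragraph 25, not a
record of the procedure actually used: cases are resampled with
replacement, the first X-salience vector is recomputed each time, and
its sign is aligned with the full-sample vector before standard errors
are taken. Python with NumPy is assumed, and the function names and
data are hypothetical.

```python
import numpy as np

def first_x_saliences(X, Y):
    """First left singular vector of the cross-correlation matrix."""
    Zx = (X - X.mean(axis=0)) / X.std(axis=0)
    Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    R = Zx.T @ Zy / X.shape[0]
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, 0]

def bootstrap_saliences(X, Y, n_boot=200, seed=0):
    """Full-sample saliences A0 and their bootstrap standard errors."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    A0 = first_x_saliences(X, Y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, N, size=N)    # resample cases with replacement
        A = first_x_saliences(X[idx], Y[idx])
        if A @ A0 < 0:                      # singular vectors have arbitrary sign
            A = -A
        draws[b] = A
    return A0, draws.std(axis=0, ddof=1)

# Synthetic stand-in data (hypothetical, not the Seattle measurements).
rng = np.random.default_rng(8)
N = 300
latent = rng.normal(size=N)
X = latent[:, None] + rng.normal(size=(N, 4))
Y = 0.5 * latent[:, None] + rng.normal(size=(N, 6))

A0, se = bootstrap_saliences(X, Y)
print(np.round(A0, 2), np.round(se, 3))
```

Resampling whole cases keeps each child's dose and outcome measures
together; permuting the rows of one block instead would give a
reference distribution under no cross-block association.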
26. Through 1993, this biometric mode of PLS has been applied in
diverse evolutionary and developmental studies as well as in the
extensive study of alcohol effects to which I've been referring. Many
more studies of behavioral/brain development could be cast into a
framework for which these essentially simple computations, and the
insights they support, might be similarly useful. Our version of PLS
was designed to reward careful, conscientious measurement of multiple
aspects of familiar but only indirectly observable phenomena and to
discourage all modeling, including multiple regression and structural
equations, that drifts farther than necessary from such data. Although
PLS is not part of any common psychometric statistical package today,
its saliences and scores can all be computed in any interactive
statistical environment that includes a singular-value decomposition,
such as S, Matlab, or SAS. I would welcome comments from readers,
whatever their discipline, regarding precursors of this technique,
other potential applications, or pitfalls.
V. ACKNOWLEDGEMENTS
27. This version of PLS has been a collective effort of the Pregnancy
and Health Study, Department of Psychiatry, the University of
Washington. I am grateful for the energies of my colleagues Paul
Sampson, Ann Streissguth, and Helen Barr over the years during which
these techniques and explanations were crafted. Support for this
methodology has been obtained from NIAAA grant AA-01455 to A. P.
Streissguth and NIA grant AG-11037 to Fred L. Bookstein. Paul Sampson
produced the scatterplot in paragraph 19. [Note that this target
article is a revised and updated version of an article that originally
appeared in PSYCOLOQUY 2(3) 1991.]
VI. FOR FURTHER READING
On this particular dose-response form of Partial Least Squares, the
best source is our recently published monograph:
Streissguth, A.P., Bookstein, F.L., Sampson, P.D. and Barr, H.M. The
Enduring Effects of Prenatal Alcohol Exposure on Child Development.
University of Michigan Press, 1993. xxxiv + 301 pp.
An appendix to that monograph lists earlier articles over a range
of journals.
Streissguth, A.P., Barr, H.M., Bookstein, F.L. and Sampson, P.D.
Neurobehavioral Effects of Prenatal Alcohol. Neurotoxicology and
Teratology 11:461-507, 1989.
Bookstein, F.L., Sampson, P.D., Streissguth, A.P. and Barr, H.M.
Measuring "Dose" and "Response" With Multivariate Data Using Partial
Least Squares Techniques. Communications in Statistics: Theory and
Methods 19:765-804, 1990.
Bookstein, F.L. Partial Least Squares: A Dose-response Model
for Measurement in the Behavioral and Brain Sciences. PSYCOLOQUY
2(3), 1991. psyc.arch.2.3.91
Carmichael Olson, H., Sampson, P.D., Barr, H.M., Streissguth, A.P. and
Bookstein, F.L. Prenatal Exposure to Alcohol and School Problems in
Late Childhood: A Longitudinal Prospective Study. Development and
Psychopathology 4:341-359, 1992.
Streissguth, A.P., Sampson, P.D., Carmichael Olson, H., Bookstein,
F.L., Barr, H.M., Scott, M., Feldman, J. and Mirsky, A.F. Maternal
Drinking During Pregnancy: Attention and Short-term Memory Performance
in 14-year-old Offspring: A Longitudinal Prospective Study.
Alcoholism: Clinical and Experimental Research, in press, 1994.
Two readers in earlier styles of PLS analysis:
Jöreskog, K.G. and Wold, H., eds. Systems Under Indirect Observation:
Causality, Structure, Prediction. Contributions to Economic Analysis,
Volume 139, Part II. Amsterdam: North-Holland, 1982.
Wold, H., ed. Theoretical Empiricism: A General Rationale for
Scientific Model Building. New York: Paragon House, 1989.