Aaro,
Welcome to Talking Measurement and thank you for posting.
There are many points in your post that are worthy of discussion, but I think your first list of questions goes to the core of the problem.
Science, in my view, is best defined as the critical investigation of natural systems, with the primary aim being the development of descriptive theories of these systems. Such investigation needs to be critical as human beings are fallible thinkers and our perceptual systems are quite limited. We find it difficult to understand the behaviour of even simple quantities like length, for example, without sophisticated measurement apparatus, such as length measurement devices based on laser technology. Hence our hypotheses and observations made of natural systems, and our deductions from them, could be wrong and the best way to identify and rectify such errors is through being critical of our own perceptions, thoughts and observational apparatus.
Psychological systems are natural systems. Hence if behavioural scientists want to consider themselves to be scientists, then their research activity should focus on the critical investigation of psychological systems. Psychologists, above all else, should be trying to develop descriptive theories of psychological systems.
Psychologists generally believe that psychological attributes and systems are quantitative. Now, it must be stressed that this is a coherent hypothesis. There is nothing logically problematic about it. But like all hypotheses, it must be subject to critical investigation and tested. Without this we have no way of knowing if the hypothesis is true or even plausible. By and large, psychologists have not tested this hypothesis and have largely assumed it to be true. This is the nub of Joel Michell’s (1997; 1999) criticism of psychometrics – the central hypothesis of the field simply has not been subject to critical scrutiny. Hence there is very little compelling evidence in support of the hypothesis; and therefore we cannot reasonably assume that psychological attributes are measurable, continuous quantities.
Why psychologists want to measure has complex historical causes that have been discussed by Michell (1997; 1999). I believe that the first of your five listed reasons is accurate. Psychologists must compete against the “hard” sciences for research funding and they believe, probably correctly, that funding bodies will be more convinced to make grants if it is perceived that psychologists engage in real scientific measurement. Another related problem is that psychological testing is an established, global industry worth billions. Educational testing in the US alone is a multi-billion dollar industry. Much of the business conducted within the testing industry is based on the hypothesis that tests are instruments of scientific measurement. If Testing Company X argues that their tests just order students with respect to their cognitive abilities, but Company Y argues that their tests “measure” such abilities, it is a safe bet that Company Y will win the tender or contract. Hence the testing companies have a vested, non-scientific interest in maintaining that psychological attributes are measurable. Do not underestimate how strong this interest can be. I criticised a testing company on another forum for not being interested in foundational issues in measurement (as testing companies in general are not), and was censured for being “uncivilised”, despite the fact that I did not engage in slander, libel or personal vilification.
The third of your reasons is also pertinent. Practicalism is another reason why the hypothesis of psychological quantities has not been critically investigated. Students' abilities need to be assessed and tests are useful in this regard. They seem to do the job and so, the reasoning goes, psychological tests must be measuring something. However, practical concerns are logically indifferent to scientific ones, so the practical value of tests has no logical bearing upon the truth of the hypothesis of psychological quantities. Moreover, one need not argue that tests measure anything for tests to provide useful information regarding cognitive abilities. In most instances ordering students with respect to their abilities is enough in practical terms and it is fair to say that tests actually achieve this. Cliff & Keats (1996) devoted a whole book to this issue. There is also a whole field of psychometrics now devoted to nonparametric Item Response Theories in which cognitive abilities are assumed to be ordinal.
In my opinion, only a few psychological attributes seem to be plausibly measurable, such as the utility of incremental gains and losses under conditions of risk or uncertainty. But even in this field, where a notable emphasis is placed on theories which describe choice behaviour, there are problems and lapses in critical thinking. Ultimately, there is no “silver bullet” for the issue of scientific measurement in psychology, but I think some progress might be achieved if psychometricians start trying to devise descriptive theories of cognitive systems and of item response processes. This is not an easy task, but one which I feel can be done if thought is put to it.
Cheers,
Andrew
P.S. Talking Measurement subscribers might be interested to know that I have my own website up and running. I am happy to link it to other Talking Measurement members’ websites if I can get a link in return.
Andrew Kyngdon, PhD
MetaMetrics, Inc.
My website: Dr Andrew Kyngdon
--
You received this message because you are subscribed to the Google Groups "Talking Measurement" group.
To post to this group, send email to talking-m...@googlegroups.com.
To unsubscribe from this group, send email to talking-measure...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/talking-measurement?hl=en.
I think Andrew said it quite right. As you stated in your email, and
as Andrew outlined in his response, there is a large set of factors,
both scientific and extra-scientific, that push the measurement boat.
Everybody wants to be associated with measurement because it brings
the credentials of precision, prediction, and control.
I also share Andrew's evaluation that, at least when one adheres to
the strict definition of measurement common in this forum (the
determination of the ratio between two magnitudes of the same
continuous quantitative attribute, one of which functions as a unit),
measurement is largely out of the question in psychology. The primary
reason being that psychological attributes are not continuous and have
complex internal structures so they aren't lines or isomorphic to
lines; and things that aren't (isomorphic to) lines (real numbers)
aren't measurable according to the strict doctrine (i.e., the
Michell/Hölder line of thinking).
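The strict definition quoted above can be written compactly. This is only a notational sketch; the symbols a, u, r and c are introduced here for illustration and do not appear in the thread.

```latex
% a : a magnitude of some continuous quantitative attribute
% u : another magnitude of the same attribute, chosen as the unit
% r : the measure of a in units of u (a positive real number)
\[
  r = \frac{a}{u}, \qquad r \in \mathbb{R}^{+}, \qquad \text{so that } a = r\,u .
\]
% Changing the unit rescales the number, not the magnitude itself:
\[
  u' = c\,u \;\Longrightarrow\; r' = \frac{a}{u'} = \frac{r}{c}, \qquad c > 0 .
\]
```

On this reading, the claim that an attribute is not "isomorphic to a line" amounts to saying that no such real-valued ratio r is defined for every pair of its magnitudes.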
Personally, I have become less and less convinced that it makes sense
to set up tests of the assumption of continuous quantity because it
appears to be so very unlikely that candidate attributes like
intelligence, personality traits, attitudes, and psychopathological
disorders are in fact continuous quantities. In fact, as I am writing
this, the very idea strikes me as almost ridiculous. Given the a
priori implausibility of the continuous quantity assumption, it is by
the way remarkable that some psychometric practices that appear to be
based on it (e.g., certain IRT applications, adaptive testing
routines, etc.) perform so well.
Having said that, I think that psychology and psychometrics cover much
more terrain than the strict definition of measurement. For instance
psychometrics includes categorical structures (latent class and latent
profile models), ordinal structures (nonparametric IRT), complex
models (e.g., multidimensional and network models), and purely
descriptive systems that have no measurement pretension whatsoever
(e.g., many kinds of MDS, components analysis, etc.).
One can probably however put the same questions you asked about
measurement before the use of these techniques. Why do people use
formal/mathematical/statistical methods at all?
Best
Denny
--
Denny Borsboom
Department of Psychology
University of Amsterdam
Roetersstraat 15
1018 WB Amsterdam
The Netherlands
phone: +31 20 525 6882
email: d.bor...@uva.nl
homepage: http://users.fmg.uva.nl/dborsboom
Very nice post. Thanks. I agree it is the clear question that emerges--why measure? I agree with most of your answers, and I agree most of the discussion on this group focuses on those things.
To take each question one at a time.
I think there are at least five reasons why measurement is used/ pretended to be used in psychology:
1. "Real sciences" measure and psychology must look like/ is a real science. This position may be common, but maybe not very meaningful
Steve: I would just qualify this as "real quantitative sciences" involve measurement and so on. I totally agree: the immense success of physics in particular, and of the many natural scientific disciplines that use physical measurements, set up such an expectation of what a successful science should look like that it drew people in. This is quite clear from the history, and formative figures like Thurstone, Weber, Wundt and Fechner had backgrounds in physics (or were physicists). It is a shame because so much of science doesn't involve the application of mathematics that presupposes measurements like r kg, r m, r s, r A (where r is a real number and kg etc. are SI units). However, it is in my view clearly worth pursuing quantitative approaches in psychophysics and I consider it an open question with respect to many other attributes and phenomena considered the focus of psychology.
2. Everybody else is measuring. Here two forces are operative. First, universities usually teach research methodology as if quantitative data analysis is the scientific method. And second, publishing is also easier. Here the reasons are nonscientific, thus.
Steve: Again, I essentially agree.
3. Psychology lacks better methods for organizing massive amounts of information. Until better methods are discovered, statistical data analysis, which requires "measurement," should be used. This position was quite explicitly taken by founders of statistical data analysis in psychology and other sciences--Karl Pearson, Louis Thurstone, among others.
Steve: I agree about the position being taken by these people and others. This is without question. I am not at all convinced that statistical data analysis requiring measurement (most in practice, including most applications of GLMs) should be used. Indeed, I would argue that the almost unbridled use of statistics takes away massive resources and talent from questions, including how we can more realistically use some mathematics (not involving real numbers and real number arithmetic).
4. Measurement and the statistical data manipulation that follows help to predict future states and events beyond chance.
Steve: Yes, beyond chance. That doesn't mean a lot, as you may well agree (?)
5. Measurement and quantitative data analysis help to reveal the mechanisms of the studied phenomena, psyche in psychology.
Steve: Here, I'm a little confused. They would if they had a foundation like that in physics, but for the most part, they don't. I agree this is the hope, and to some degree certainly we learn some things, but in a highly confused way, and for the most part dressed up as so much more than it is.
Great post.
Regards,
Steve
________________________________
From: talking-m...@googlegroups.com [talking-m...@googlegroups.com] On Behalf Of aaro [aaro.t...@ut.ee]
Sent: Wednesday, 16 March 2011 2:57 AM
To: talking-m...@googlegroups.com
Subject: [talking-measurement] Why to measure?
Hi All,
With best regards
Aaro
--
The fifth aim--understanding of the mechanisms--must take the issue of measurement seriously. The result is quite clear--psychic attributes cannot be measured.
Psychology can and should attempt measuring and mathematical methods if the principles and laws of "real quantitative (as Steve correctly specified) science" can be attributed to the mind as well. If psychological regularities can be reduced to physical ones, this approach would be meaningful. But I also think that there is enough evidence to reject reductionism; the world of psyche is not covered by physical laws (I am not opposing mind to the physical world here; I suggest that there is a psychic part of the physical universe which is described by a set of principles that does not apply to all of the physical world). Therefore a science of psyche should reject measurement and mathematical methods; by becoming a "real quantitative science" psychology ceases to be psychology.
Andrew proposed a candidate for a measurable attribute--the utility of incremental gains and losses under conditions of risk and uncertainty. I am not sure this attribute is actually psychic. Here an interesting question of the definition of psyche emerges, but this is another issue.
“Thus, the most pertinent question for me, as is alluded to in the description for this forum that you quoted in your earlier post, is how do we commence such research? This entails other questions like; What exactly do I mean by 'such research'? What will the content of 'such research' be? Is there any existing psychological theory that has already commenced 'such research'? Is there any existing physical theory that is relevant to 'such research', analogously or otherwise? Hopefully, by way of these discussions and others, we can get away from postulation and begin the necessary research.
I don't see a distinction between wanting to measure and wanting to systematically understand, because the only real path to scientific measurement is via systematic understanding. It may well be that attempts at such understanding are best served by a healthy skepticism toward the possibility of psychological measurement, encouraging critical reflection, etc. But, if such an understanding reveals the possibility for psychological measurement, then why wouldn't we? If it turns out improbable, then that is an important addition to our systematic understanding.”
Hi Josh,
I very much welcome your call for less postulation and more focus on research. But has ‘such research’ not already commenced? How about the Rasch model? It is a psychological theory and it conforms to the classical definition of measurement, and Rasch himself already made the connection to the relevant physical theory (e.g. laws of ideal gases). I also assume that most Rasch people believe that they can (do) already measure psychological attributes at least to some extent. I’m a bit surprised that this view has not yet been articulated here. Maybe simply because there are no 'such Rasch people’ present here?
Guenter
It seems that there is much agreement on the forum, amongst the vocal minority at least, regarding your comments on the extra-scientific motivations for and implications of 'measurement' in psychology. Your latest post seems to come to a much stronger conclusion.
Can you please clarify what you take the content of psychology to be?
Further, it may well be the case that the 'psychic part of the physical universe' is not, at least in part, explained by existing physical theory/laws (although I believe this is an open, empirical question), but why does it necessarily follow that the establishment of quantitative psycho-physical (for want of a better term) theory/laws is not possible, therefore precluding the possibility of psychological measurement? If I am misreading you, then please let me know.
So, to digress a little, as far as I can tell from your post and other publications (http://www.frontiersin.org/quantitative_psychology_and_measurement/10.3389/fpsyg.2010.00029/abstract), you emphasise a structural-systemic epistemology. But such an epistemology seems to me consistent with the way that measurement was actually established in the physical sciences, i.e., hypothesising and confirming physical properties and the types of relations between them across a wide range of physical phenomena and building this into a coherent body of substantive theory that is the foundation (not mathematical as it is often presented in psychology) of the international system of measurement. I would argue that the possibility of psychological measurement can only be established (or not) by such research, and certainly not by fiat (which I have been somewhat guilty of in the past).
several interesting points you make; I am sympathetic to many of the
questions you ask, some being identical or nearly so to questions I
have often asked myself.
However I do not understand why the issues you raise should be
specific to psychology. Variables are a source of headaches to all who
think but they are endemic to every field of enquiry that engages in
empirical tests; be it biology, medicine, physics, or economics.
Similarly, the idea of construction may be appealing, but we cannot
"construct de novo" gravitation, quantum probability, or the speed of
light. So what? In other words I don't see why your criticism wouldn't
apply to science at large.
In addition I do not see your point about mathematics. When it
functions within science, mathematics has no intrinsic semantics, that
is, what it's about is *always* fixed by a nonmathematical link. To
call that link "verbal" is far too limited; it is often ostensive or
pragmatic. It usually exceeds the linguistic resources by far. But
outside mathematics it is never the case that the formal system itself
contains its meaning. This is well known. If it is a basis for a
critique of psychology, then it is a basis for a critique of
science at large. In other words it is again not specific to
psychology.
Maybe it would help if you gave an indication of what you do consider
to be a good explanation from your standpoint. What is a good
explanation in your terms?
Best
Denny
--
Aaro,
You said:
“Andrew proposed a candidate for a measurable attribute--the utility of incremental gains and losses under conditions of risk and uncertainty. I am not sure this attribute is actually psychic. Here an interesting question of the definition of psyche emerges, but this is another issue.”
But then you say:
Psyche is a system of processes that, on the basis of individual experiences, organizes behavior with the aim of maintaining equilibrium of the organism as a whole in a changing environment.
I cannot see how decision making under risk or uncertainty is not a system of psychological processes, or how it is inconsistent with the definition of psyche that you have given above. I have never seen a tree, rock or automobile make a decision.
Andrew
You said:
"How about the Rasch model? It is a psychological theory and it conforms to the classical definition of measurement, and Rasch himself already made the connection to the relevant physical theory (e.g. laws of ideal gases). I also assume that most Rasch people believe that they can (do) already measure psychological attributes at least to some extent. I'm a bit surprised that this view has not yet been articulated here. Maybe simply because there are no 'such Rasch people' present here?"
You're on the wrong forum! Might I suggest this one: http://www2.wu-wien.ac.at/marketing/mbc/mbc.html
Be careful though. When it comes to critical debate, you'll find that more often than not you'll be talking past people.
But yes, a critical, dispassionate discussion of the Rasch model, including of its strengths, weaknesses and historical development, is something that would be most welcome on Talking Measurement.
Andrew
By the way: I could understand that, with a little imagination, one
might call the Rasch model a theory of sorts. But to call it a
psychological theory is a bit too much I think.
Best
Denny
--
You say:
"I very much welcome your call for less postulation and more focus on research. But has ‘such research’ not already commenced? How about the Rasch model? It is a psychological theory and it conforms to the classical definition of measurement, and Rasch himself already made the connection to the relevant physical theory (e.g. laws of ideal gases). I also assume that most Rasch people believe that they can (do) already measure psychological attributes at least to some extent. I’m a bit surprised that this view has not yet been articulated here. Maybe simply because there are no 'such Rasch people’ present here?"
Yes, certainly many "Rasch people" believe they can and do measure psychological attributes. Certainly, they use the word "measurement" very freely.
When it comes to the standard (or 'classical') definition of measurement, I do not think it makes sense to say that a Rasch model is a psychological theory. As for the connection to the classical definition, yes, Rasch made it first, using the Poisson (model) to try to measure reading ability based on the numbers of errors. Most people use the dichotomous or polytomous Rasch models, and the idea that there is a "class" of models revolves around algebra in a manner quite clearly removed from substantive theory. So if we are talking substantive theory, there is in my view no class of models. Most saliently, the dichotomous and polytomous models, as they are almost universally used, contain functions of differences between "parameters". To interpret the parameters as measurements in the standard/classical sense, there must be ratios between differences and a unit (e.g. (b-d):u where b and d are magnitudes and u is a magnitude, the unit). There is no in-principle issue here, but nobody has yet taken up the necessary challenges required to demonstrate that it is possible to use these models to measure in well-defined units.
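The point that the dichotomous model contains only a function of the difference between "parameters" can be illustrated with a small sketch. It assumes the usual logistic form of the dichotomous Rasch model; the function name and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def rasch_probability(b: float, d: float) -> float:
    """Probability of a correct response in the dichotomous Rasch model.

    b is the person parameter and d is the item parameter (both in logits).
    The probability depends on b and d only through the difference b - d.
    """
    return 1.0 / (1.0 + math.exp(-(b - d)))


# Hypothetical values, for illustration only.
p = rasch_probability(1.5, 0.5)

# Adding the same constant to both parameters leaves the probability
# unchanged: the origin of the scale is not fixed by the model.
p_shifted = rasch_probability(1.5 + 10.0, 0.5 + 10.0)

print(p, p_shifted)
```

Because only b - d enters, the model by itself says nothing about a unit u in which that difference would be expressed as a ratio (b-d):u; that is the further step described above.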
(By the way, it is easily shown that the dichotomous Rasch model works for psychophysical results in as much as Weber-Fechner "law" holds, although an interesting question arises about what is actually measured--and investigations with Joshua revealed this has received rather more attention than it would seem from looking at the vast bulk of material).
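One way to see the connection, offered as a sketch under stated assumptions rather than as the derivation Steve has in mind: suppose Fechner's logarithmic form holds, and take the two Rasch parameters to be the sensation levels produced by two stimulus intensities. The symbols k, S_0, S_1 and S_2 are introduced here for illustration.

```latex
% Assumption: Fechner's logarithmic law, with constant k > 0 and reference intensity S_0
\[
  \psi(S) = k \,\ln\!\frac{S}{S_0}.
\]
% Take b = \psi(S_1) and d = \psi(S_2) as the two parameters of the dichotomous model:
\[
  b - d = k \,\ln\!\frac{S_1}{S_2}.
\]
% Substituting into the dichotomous Rasch form:
\[
  P = \frac{e^{\,b-d}}{1 + e^{\,b-d}}
    = \frac{(S_1/S_2)^{k}}{1 + (S_1/S_2)^{k}}
    = \frac{S_1^{\,k}}{S_1^{\,k} + S_2^{\,k}} .
\]
```

Under these assumptions the response probability depends only on the ratio of the physical intensities, which is one reason the question of what is actually being measured arises.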
Rasch explicitly referenced Maxwell, which is a lot more than most others have done. However, the way his work is used today is in my view a kind of hybrid of statistics and measurement thinking, which needs a serious rethink to get it back on course (I mean stock-standard statistics of the kind used throughout psychometrics, which has never been justified in the way that the use of quantitative methods has been in physics and the natural sciences to which physical measurements are so necessary). The rethink would, in my view, mean that the idea that "Rasch models" are somehow definable as a potentially substantive and coherent body of theory, definition and law is not really that much more likely than the idea that Stevens' theory of scale types is the same. Having said that, I think it is quite possible that some of Rasch's work will prove valuable if we ultimately succeed in measuring "psychological quantities" in terms of well-defined units within a descriptive body of cogent theory, definition and law.
Regards,
Steve
"Can you construct--create de novo--any psychological phenomenon by relying
only on the Rasch model "psychological theory"? I think science should aim
at understanding; from the structural-systemic epistemology point of view,
Rasch models are not explanatory. If you have another definition for
explanation then please explain why and how your definition is more useful
than structural-systemic?"
First I must confess that I’m not yet well acquainted with the ‘structural-systemic epistemology point of view’. However I believe that the Rasch model (at least as initially conceived by Georg Rasch) in particular and measurement models in general are not so much about any kind of explanation, but about the discovery of quantitative laws. Hence, strictly speaking, if the Rasch model worked it would not explain why certain phenomena go quantitatively together, but only that they do.
Regards
Guenter
Thank you for sending me on a suicide mission! ;-)
Denny: “By the way: I could understand that, with a little imagination, one might call the Rasch model a theory of sorts. But to call it a psychological theory is a bit too much I think.”
Stephen: “The rethink would, in my view, mean that the idea that "Rasch models" are somehow definable as a potentially substantive and coherent body of theory, definition and law is not really that much more likely than the idea that Stevens' theory of scale types is the same.”
What exactly is missing from the Rasch model to convert it into a substantive theory? Or, what do we have to do to transform it into such?
Guenter
On Saturday, March 19, 2011, Trendler, Guenter
--
Successful descriptive quantitative theories, definitions and laws are--how did you put it Andrew?--inferred from the phenomena. The formal statements of these are alluringly simple. They make sense when there is a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs that have been worked out by understanding the phenomena and existing, simpler empirical results, using the know-how available. In the end, key physical quantitative definitions and laws have very simple forms; although more involved ones also form part of successful bodies of theory. For example, measuring voltage, resistance and electric current are all based largely on Ohm's law, but design principles must take into account various other aspects of theory like induction and temperature.
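The Ohm's law example can be made concrete with a rough sketch. It assumes a linear temperature model for resistance, R(T) = R_ref * (1 + alpha * (T - T_ref)), and uses hypothetical component values; the coefficient is only an approximate figure for copper. Nothing here comes from the thread beyond the idea that the simple law has to be embedded in further theory before an instrument can be trusted.

```python
def resistance_at(r_ref: float, alpha: float, temp_c: float,
                  ref_temp_c: float = 20.0) -> float:
    """Resistance at temp_c, assuming a linear temperature coefficient alpha."""
    return r_ref * (1.0 + alpha * (temp_c - ref_temp_c))


def current_from_voltage(voltage: float, resistance: float) -> float:
    """Ohm's law solved for current: I = V / R."""
    return voltage / resistance


# Hypothetical values, for illustration only.
R_REF = 100.0        # ohms at the 20 degree C reference temperature
ALPHA = 0.0039       # per degree C, roughly the value for copper (assumed)
TRUE_CURRENT = 0.05  # amperes actually flowing through the resistor
TEMP_C = 60.0        # operating temperature of the resistor

# Voltage that would appear across the warm resistor.
r_actual = resistance_at(R_REF, ALPHA, TEMP_C)
v_measured = TRUE_CURRENT * r_actual

# Inferring current from Ohm's law alone, ignoring temperature.
naive_current = current_from_voltage(v_measured, R_REF)

# Inferring current with the temperature correction included.
corrected_current = current_from_voltage(v_measured, r_actual)

print(naive_current, corrected_current)
```

The naive estimate overstates the current because the law is applied with the wrong resistance; the simple relation is correct, but using it to measure depends on the surrounding theory of the instrument.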
It is in my view much too simplistic to presuppose a "measurement model" will somehow embody all of the substantive theory required to systematically obtain empirical results of the kind required to measure magnitudes of quantities in well-defined units. Rasch picked up on the basic form of quantity equations that capture the simplest relations that need to be isolated from others in a measurement process as a whole. So I don't think it's so much a matter of transforming--more likely a matter of understanding what the simple relations capture and how they must be understood in a broader substantive theoretical framework.
Regards, Steve
________________________________________
From: talking-m...@googlegroups.com [talking-m...@googlegroups.com] On Behalf Of Trendler, Guenter [guenter....@zi-mannheim.de]
Sent: Saturday, 19 March 2011 5:58 PM
To: talking-m...@googlegroups.com
Subject: AW: [talking-measurement] The Rasch model and Psychological Measurement
Ok, I was a bit over-hasty. Let me take a step back.
Josh was asking:
“Is there any existing psychological theory that has already commenced 'such research'?”
In reply I was pointing out that, and let me be now more precise, Rasch's idea (person ability + item difficulty => probabilistic response) may provide such a psychological theory and that, at least in the view of the Rasch community, the research may have already commenced. (Of course I did not mean just the formal mathematical theory, but, as Denny points out: “To transform a model into a theory its parameters have to be interpreted. For instance "theta" is not interpreted, it's just a placeholder.”)
Previously you wrote:
“When it comes to the standard (or 'classical') definition of measurement, I do not think it makes sense to say that a Rasch model is a psychological theory. As for the connection to the classical definition, yes, Rasch made it first, using the Poisson (model) to try to measure reading ability based on the numbers of errors. Most people use the dichotomous or polytomous Rasch models, and the idea that there is a "class" of models revolves around algebra in a manner quite clearly removed from substantive theory. So if we are talking substantive theory, there is in my view no class of models. Most saliently, the dichotomous and polytomous models, as they are almost universally used, contain functions of differences between "parameters". To interpret the parameters as measurements in the standard/classical sense, there must be ratios between differences and a unit (e.g. (b-d):u where b and d are magnitudes and u is a magnitude, the unit). There is no in-principle issue here, but nobody has yet taken up the necessary challenges required to demonstrate that it is possible to use these models to measure in well-defined units.”
So there is in principle no issue here. Do we agree then that we already have a psychological theory, Josh was asking for? Or is there something substantive missing? The problem only is that the ‘proper’ research (i.e. the search for ratios) has not yet commenced?
Regards,
Guenter
-----Original Message-----
From: talking-m...@googlegroups.com on behalf of Stephen Humphry
Sent: Sat 19.03.2011 12:11
To: talking-m...@googlegroups.com
Subject: RE: [talking-measurement] The Rasch model and Psychological Measurement
Regards, Steve
Guenter
--
Aaro: "It brings us back to my original question ... Why we need such models? I am
not implying with this question that mathematical models are not useful. My
question is exactly--if they are useful, then exactly for what? "To
understand" would not be sufficient because "understanding" is defined in
many ways. Maybe for some understanding can mean 'it is possible to make a
mathematical model.' Then I would say, obviously mathematical models can be
made of everything. So what? So, what is the purpose of such models?"
The purpose of such models is the measurement of the quantities involved. I’m not sure I understand your argument, though; mathematical models can perhaps be made of everything, but in an empirical science reality may resist their practical application. In my view Rasch models are mathematically sound, but reality does not conform to them. That is, regardless of the circumstances, humans do not behave as the model demands.
Regards,
Guenter
You wrote: “Successful descriptive quantitative theories, definitions and laws are--how did you put it Andrew?--inferred from the phenomena. The formal statements of these are alluringly simple. They make sense when there is a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs that have been worked out by understanding the phenomena and existing, simpler empirical results, using the know-how available. In the end, key physical quantitative definitions and laws have very simple forms; although more involved ones also form part of successful bodies of theory. For example, measuring voltage, resistance and electric current are all based largely on Ohm's law, but design principles must take into account various other aspects of theory like induction and temperature.”
Isn’t this rather the case AFTER the law is firmly established? At the outset, prior to Ohm, there is not much of ‘a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs’. Instruments with ‘intricate’ design had to be first invented and developed. In the beginning there is not much instrumental know-how and the body of theory is very skinny.
The origin of Rasch models is the observation that some person A solves an item D with less effort than person B. This observation gave rise to the hypothesis that there may be a quantitative law relating ‘person ability’, ‘item difficulty’ and the probability of a correct response. In physics a similar starting observation is that liquids expand with heat. The phenomenon can be used to measure ‘temperature’; the theory is that volume expands in proportion to temperature. The next step was the construction of the thermoscope (a thermometer without a scale). This involved the construction of an appropriate tube, the selection of an appropriate thermoscopic substance and so on. However, although theory is always involved along the way, I believe it is no coincidence that complex theories like the molecular-kinetic theory stand at the end of such a process. Must we not start somewhere and be content with as much substantive theory as we have? The body of theory will grow naturally in substance as we gain experimental insight and control over the phenomena under investigation. Hence, isn’t the Rasch theory, just as it stands now, a sufficient starting point? What more should we expect at this point?
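To make the hypothesised law concrete, here is a minimal sketch, assuming the dichotomous Rasch model in its usual logistic form. The function name, the variable names and all numbers are purely illustrative; they are not estimates from any data.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Assumes the usual logistic form P = exp(b - d) / (1 + exp(b - d)),
    where b is person ability and d is item difficulty (both in logits).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Illustrative values only: persons A and B attempt the same item D.
item_d = 0.5
person_a, person_b = 1.5, -0.5

print(rasch_probability(person_a, item_d))  # the more able person
print(rasch_probability(person_b, item_d))  # the less able person

# Only the difference b - d enters the law: shifting both by the same
# constant leaves the probability unchanged.
print(math.isclose(rasch_probability(person_a, item_d),
                   rasch_probability(person_a + 2.0, item_d + 2.0)))
```

The last line illustrates why the parameters, as the model is usually written, are fixed only up to a common shift: the model speaks about differences, not about ratios to a unit.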
Regards
Guenter
I said previously: “Successful descriptive quantitative theories, definitions and laws are--how did you put it Andrew?--inferred from the phenomena. The formal statements of these are alluringly simple. They make sense when there is a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs that have been worked out by understanding the phenomena and existing, simpler empirical results, using the know-how available. In the end, key physical quantitative definitions and laws have very simple forms; although more involved ones also form part of successful bodies of theory. For example, measuring voltage, resistance and electric current are all based largely on Ohm's law, but design principles must take into account various other aspects of theory like induction and temperature.”
You replied:
"Isn’t this rather the case AFTER the law is firmly established? At the outset, prior to Ohm, there is not much of ‘a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs’. Instruments with ‘intricate’ design had to be first invented and developed. In the beginning there is not much instrumental know-how and the body of theory is very skinny."
I think there's an important distinction to be made. If we look back on historical events in light of the theory and know-how now at our disposal, it may seem this way. Let's just consider a few details (apologies for anything not entirely accurate, historically--I do not think it should alter the basic message). Ohm's work is entitled (in English) The Galvanic Circuit Investigated Mathematically. Ohm used galvanic or voltaic cells (and later a thermocouple device). He made explicit reference to Fourier and Poisson, saying that the form of the differential equations is similar to that given for the propagation of heat. He referred to generalizations deducible from the phenomena and empirical results. He formulated several quantitative equations in arriving at the one we now call Ohm's law. He referred to the "force of a current" (current) and tensions (potentials I think). Ohm used a number of detailed figures of the circuits. He used a galvanometer to measure current (crude by today's standards, nevertheless an intricate instrument). So it's important not to overestimate the degree to which theory was unified, or the degree to which instruments had intricate designs, viewing them by modern standards. It's equally important, I think, not to underestimate either the degree to which theory and definition were developed or the intricacy of the instruments by the standards of the day.
Guenter, you also say:
"The origin of Rasch models is the observation that some person A solves an item D with less effort than person B. This observation gave rise to the hypothesis that there may be a quantitative law between ‘person ability’, ‘item difficulty’ and the probability of a correct response. In physics a similar starting observation is that liquids expand with heat. The phenomenon can be used to measure ‘temperature’; the theory is that volume expands proportional to temperature ....?"
Sadly, I disagree that it is a similar starting observation. Examined in terms of metrology and systematic measurement (and units), the problem is the lack of something corresponding to linear expansion. The spatial extent (volume) of a material or substance is measurable, and this is a starting point. Once it can be shown in a systematic manner that heat transfer affects spatial extent under specific conditions, there is a basis for measuring heat (and ultimately temperature). The problem is that "relative frequencies" of errors and so forth are not established measurements. Rasch's work to show that frequencies were systematically related and could be used to infer estimates of "parameters" is very useful. However, it doesn't have the same basis in a clearly measurable attribute (spatial extent) that is also related to various other physical dimensions.
Steve
You wrote: “I think there's an important distinction to be made. If we look back on historical events in light of the theory and know-how now at our disposal, it may seem this way. Let's just consider a few details (apologies for anything not entirely accurate, historically--I do not think it should alter the basic message). Ohm's work is entitled (in English) The Galvanic Circuit Investigated Mathematically. Ohm used galvanic or voltaic cells (and later a thermocouple device). He made explicit reference to Fourier and Poisson, saying that the form of the differential equations is similar to that given for the propagation of heat. He referred to generalizations deducible from the phenomena and empirical results. He formulated several quantitative equations in arriving at the one we now call Ohm's law. He referred to the "force of a current" (current) and tensions (potentials I think). Ohm used a number of detailed figures of the circuits. He used a galvanometer to measure current (crude by today's standards, nevertheless an intricate instrument). So it's important not to overestimate the degree to which theory was unified, or the degree to which instruments had intricate designs, viewing them by modern standards. It's equally important, I think, not to underestimate either the degree to which theory and definition were developed or the intricacy of the instruments by the standards of the day.”
There is a certain danger that we start splitting hairs here. The point I want to make is that in order to apply measurement theory we don’t need much ‘sophisticated’ theory to get started with experimenting. Furthermore, theory and experiment have to go hand in hand from the simple to the complex. We should therefore try to avoid burdening established empirical discoveries with too much theory. An extreme case where a huge theory was built upon no empirical evidence at all is Herbart’s ‘Psychologie als Wissenschaft’. I believe that the psychological theory as presented by Rasch in his ‘Probabilistic models for some intelligence and attainment tests’ is enough theory to get started, which does not mean that I believe that such a start would or must be successful.
You also argue: “Sadly, I disagree that it is a similar starting observation. Examined in terms of metrology and systematic measurement (and units), the problem is the lack of something corresponding to linear expansion. The spatial extent (volume) of a material or substance is measurable, and this is a starting point. Once it can be shown in a systematic manner that heat transfer affects spatial extent under specific conditions, there is a basis for measuring heat (and ultimately temperature). The problem is that "relative frequencies" of errors and so forth are not established measurements. Rasch's work to show that frequencies were systematically related and could be used to infer estimates of "parameters" is very useful. However, it doesn't have the same basis in a clearly measurable attribute (spatial extent) that is also related to various other physical dimensions.”
In my view the question is whether it is theoretically possible to determine constants in a law without being able to measure any of the factors involved, i.e. without ‘something corresponding to linear expansion’. As is well known, Rasch’s crucial insight was that theoretically it is indeed possible. The probability of (a correct) response is ‘measurable’ and this is the starting point. True, this ‘measurable’ quantity is not (yet) related in various ways to other measurable quantities, as you point out. However, maybe a more suitable physical analogy to the situation is, instead of Ohm’s law, the level of knowledge about electricity at the time of William Gilbert (1544 – 1603). One of Gilbert’s crucial observations was that a metallic needle (i.e. a primitive electroscope) is deflected by the electricity produced by rubbing amber with cloth. Of course Gilbert had some ‘primitive’ theory about this phenomenon (for details see Bordeau: Volts to Hertz). Note that the measurable quantity ‘angle of deflection’ is not yet connected to other quantities, just as is the case with the Rasch model, but that is basically how it all got started. However, I jumped into this discussion to take a position on Josh’s question of whether there is ‘any existing psychological theory that has already commenced 'such research'’, to which my answer is: ‘Yes, in my view such theory exists and the research has already commenced.’ Of course, the question remains why no progress has been made beyond the ‘Gilbert level’, but that’s another topic. Still no agreement about ‘the starting observation’ and the amount of theory necessary to get started, Steve?
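One way to read the claim about determining constants without measuring the factors is sketched below, assuming the dichotomous logistic form of the Rasch model and independent responses to two items. Given that a person got exactly one of the two items right, the chance that it was the first item should not depend on that person's ability. All names and numbers are hypothetical.

```python
import math

def p_correct(ability: float, difficulty: float) -> float:
    """Dichotomous Rasch probability of a correct response (assumed logistic form)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def p_first_given_one_correct(ability: float, d1: float, d2: float) -> float:
    """P(item 1 correct | exactly one of the two items correct)."""
    p1, p2 = p_correct(ability, d1), p_correct(ability, d2)
    only_first = p1 * (1.0 - p2)       # item 1 right, item 2 wrong
    only_second = (1.0 - p1) * p2      # item 1 wrong, item 2 right
    return only_first / (only_first + only_second)

d1, d2 = -0.3, 0.8  # hypothetical item difficulties

# The conditional probability comes out the same for very different abilities.
for ability in (-2.0, 0.0, 3.0):
    print(ability, p_first_given_one_correct(ability, d1, d2))

# Closed form implied by the model: the ability term cancels.
print(math.exp(-d1) / (math.exp(-d1) + math.exp(-d2)))
```

So, under these assumptions, the comparison of the two items can be made from response frequencies alone, which is the sense in which no prior measurement of the persons is needed.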
Best
Guenter
-----Original Message-----
From: talking-m...@googlegroups.com on behalf of Stephen Humphry
Sent: Sun 20.03.2011 04:14
Hello Aaro,
You said:
Obviously decision-making is a psychological phenomenon. Why I said that maybe "utility of incremental gains and losses" is not is that this "utility" is basically a variable; but psyche is not composed of variables. Maybe my interpretation was too literal, but still the problem is that "utility" as a variable can be based on different psychological structures; if so, the variable is abstracted from psyche and becomes nonpsychical. Also, as a variable that can be based on different psychical structures, it is not a measure until those psychical structures are clearly distinguished, until it is made clear that the same real quality of psyche is quantitatively described.
The argument that you are making here sounds vaguely representationalist – psychological attributes are “qualities” which can be “quantitatively described” or measured once “psychical structures” are clearly distinguished. By psychical structures I would assume that you mean psychological systems. You argue that the utility of gains and losses under conditions of risk and uncertainty “can be based upon different psychological structures”. I interpret this as arguing that theories of utility can be proposed which are descriptively different. If my perception is true, then you are correct.
Since Tversky’s (1969) often cited choice study, it has been argued that the utility of incremental gains and losses under conditions of risk or uncertainty may not be quantitative. Specifically, utility could be a “lexicographic semiorder” (Tversky, 1969). A “lexicographic order” occurs when objects or events are ordered on the basis of a series of ranked attributes, such that if the difference between two objects with respect to the first attribute does not exceed a threshold of some kind, then attention is paid to the second attribute. If a difference can be discerned on the second attribute, then an order between two objects is identified. The most familiar example of a lexicographic order is the alphabetical ordering of words in a dictionary. If two words share the same first letter, then attention is paid to the second. If the second letters are different, then the words are alphabetically ordered irrespective of the letters following the second letter. A “semiorder” is an ordinal relation with intransitive indifference, that is, if A is indifferent to B (A ~ B) and B ~ C it need not follow that A ~ C (Luce, 1956). Of course, an attribute cannot logically be a quantity if its degrees (levels) form a lexicographic semiorder.
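A minimal sketch of intransitive indifference, assuming that two levels count as indifferent whenever they differ by no more than some threshold. The threshold and the three levels are hypothetical values chosen only to show the pattern.

```python
THRESHOLD = 10  # hypothetical just-noticeable difference

def indifferent(x: int, y: int) -> bool:
    """Two levels are indifferent when they differ by no more than the threshold."""
    return abs(x - y) <= THRESHOLD

a, b, c = 0, 6, 12  # illustrative levels of some attribute

print(indifferent(a, b))  # True
print(indifferent(b, c))  # True
print(indifferent(a, c))  # False: A ~ B and B ~ C, yet not A ~ C
```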
A lexicographic semiorder in a choice situation is where people compare two simple lotteries, for example, on the basis of one attribute first, and if the difference between the lotteries with respect to this attribute is less than a threshold of some kind, then attention is paid to the second attribute, and if the relevant difference is less than a threshold, then the third attribute is inspected, and so on (Birnbaum, 2010). One example of a theory of utility which proposes that utility is one kind of lexicographic semiorder is the “priority heuristic” of Brandstatter, Gigerenzer & Hertwig (2006). This work is in the vein of Goldstein & Gigerenzer’s (2002) theory of “fast and frugal” heuristics.
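The rule just described can be sketched as follows, assuming only two attributes and a single threshold on the first one. The options, their attribute values and the threshold are hypothetical; they are chosen to show how such a rule can produce a cycle of pairwise choices.

```python
from itertools import combinations

THRESHOLD = 1  # hypothetical: first-attribute differences up to this size are ignored

# Each option is (first_attribute, second_attribute); higher is better on both.
options = {"A": (1, 30), "B": (2, 20), "C": (3, 10)}

def prefer(x: str, y: str) -> str:
    """Return the option chosen under a two-attribute lexicographic semiorder."""
    (x1, x2), (y1, y2) = options[x], options[y]
    if abs(x1 - y1) > THRESHOLD:   # first attribute decides if the gap is large enough
        return x if x1 > y1 else y
    return x if x2 > y2 else y     # otherwise attention moves to the second attribute

for x, y in combinations(options, 2):
    print(f"{x} vs {y}: choose {prefer(x, y)}")
# A vs B: choose A
# A vs C: choose C
# B vs C: choose B
```

A is chosen over B and B over C, yet C is chosen over A. Intransitive choice of this kind is what a lexicographic semiorder allows and what a quantitative utility would rule out.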
The priority heuristic argues that people assess a simple lottery as follows. The minimum gain of each lottery is assessed first, then the probabilities of the minimum gains and then the maximum gains. If the difference in minimum gains is greater than 1/10th of the magnitude of the maximum gain, then examination of the lotteries ceases and a choice is made. Brandstatter et al. (2006) argued that this threshold was a decision “stopping rule”. For example, consider Lottery A, which has a 50% chance of yielding $200 and a 50% chance of yielding nothing. Lottery B consists of receiving $100 for sure (i.e., Lottery B is a gift). As the minimum is $0 in Lottery A and $100 in B, and the $100 difference exceeds the $20 threshold (1/10th of $200), people will choose Lottery B over A (which most people almost always do). However, if presented with Lottery C, which has a 50% chance of yielding $3000 and a 50% chance of nothing, then not all people choose Lottery B (as the threshold of $300 now exceeds the $100 difference). Again, this is consistent with observed human choice behaviour. So far, the priority heuristic makes predictions consistent with theories in which utility is hypothesised to be a continuous quantity, such as prospect theory (Kahneman & Tversky, 1979), rank dependent utility theory (Luce & Fishburn, 1991) and transfer of attention exchange theory (TAX) (Birnbaum, 1999). But the priority heuristic is much simpler and does not assume that utility is quantitative.
However, strong evidence has been presented against the lexicographic semiorder theories of utility. Birnbaum (2004) found that the priority heuristic predicted the modal choice in 3 out of 13 cases, whilst the TAX theory predicted all 13. Moreover, the priority heuristic did not describe the choice behaviour of the majority of subjects in Birnbaum & Navarrete’s (1998) study, despite the fact that Brandstatter et al. (2006) cited that study. Re-examining Birnbaum & Navarrete’s (1998) data, Birnbaum (2008) found only 31% of subjects made choices in accordance with the 1/10th of the greatest outcome “stopping rule” of Brandstatter et al. (2006).
Birnbaum (2010) recently conducted a series of choice experiments designed to test lexicographic semiorder theories of utility, including the priority heuristic. To summarise his rather intensive study, Birnbaum found that the lexicographic semiorder theories did not describe choice behaviour that well. For example, priority dominance was systematically violated, meaning that choices were made using attributes of “lower” priority than those argued to be of “greater” priority by the priority heuristic. Subjects also did not display the intransitive choice behaviour that lexicographic semiorder theories predict. However, the most frequent patterns of individual choice behaviour were consistent with the quantitative TAX theory.
More research is needed of course, but it would seem that utility may indeed be a psychological quantity. This is perhaps why Tversky (1969) abandoned the lexicographic semiorder idea he proposed in favour of quantitative prospect theory. Too bad he passed away before the Nobel Economics Prize was awarded to his collaborator Danny Kahneman in 2002.
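The three-step examination can be sketched in code using the lotteries from the example. This is a rough illustration, not Brandstatter et al.'s own implementation. Step 1 follows the threshold given above. The 0.1 threshold on the probabilities in step 2, and the direction of choice in steps 2 and 3, are assumptions based on my reading of the published heuristic. The function name and the list-of-pairs representation are hypothetical.

```python
def priority_heuristic(first, second):
    """Choose between two gain lotteries, each a list of (outcome, probability).

    Returns (chosen_lottery, reason). Step 1 uses the threshold from the text;
    the 0.1 probability threshold in step 2 is an assumption.
    """
    def min_gain(lottery):
        # (lowest outcome, probability of that outcome)
        return min(lottery, key=lambda pair: pair[0])

    def max_gain(lottery):
        return max(outcome for outcome, _ in lottery)

    (min_1, p_min_1), (min_2, p_min_2) = min_gain(first), min_gain(second)
    top = max(max_gain(first), max_gain(second))

    # Step 1: stop if the minimum gains differ by more than 1/10 of the maximum gain.
    if abs(min_1 - min_2) > top / 10:
        return (first if min_1 > min_2 else second), "minimum gain"

    # Step 2 (assumed threshold): compare the probabilities of the minimum gains.
    if abs(p_min_1 - p_min_2) >= 0.1:
        return (first if p_min_1 < p_min_2 else second), "probability of minimum gain"

    # Step 3: otherwise the larger maximum gain decides.
    return (first if max_gain(first) >= max_gain(second) else second), "maximum gain"

lottery_a = [(200, 0.5), (0, 0.5)]
lottery_b = [(100, 1.0)]
lottery_c = [(3000, 0.5), (0, 0.5)]

print(priority_heuristic(lottery_a, lottery_b))  # B is chosen at step 1
print(priority_heuristic(lottery_c, lottery_b))  # step 1 does not stop; C is chosen at step 2
```

Note that no utility value is ever computed: the rule only compares outcomes and probabilities one attribute at a time, which is why it does not need utility to be quantitative.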
Andrew
My website: https://sites.google.com/site/drandrewkyngdon/home
Measurement Forum: http://groups.google.com/group/talking-measurement
From: talking-m...@googlegroups.com [mailto:talking-m...@googlegroups.com] On Behalf Of aaro
Sent: Sunday, 20 March 2011 12:07 AM
To: talking-m...@googlegroups.com
--
On Sun, Mar 20, 2011 at 9:37 AM, Trendler, Guenter
<guenter....@zi-mannheim.de> wrote:
>
> Hi Steve,
>
--
OK - I know this came out probably more "directly phrased" than Guenter
might have actually meant to imply but ...
I think a read of Wood, R. (1978) Fitting the Rasch model - a heady tale.
British Journal of Mathematical and Statistical Psychology, 31, 27-32;
quickly dispels any notion that the Rasch model can be a theory of anything
except item response probabilities. Paul Kline and myself back in 1981 fit a
Rasch model to the EPQ - adequately - encompassing items from all 4 scales!
You might as well call an eigenvector-eigenvalue decomposition routine a
"theory of psychology"!
Likewise, Michell, J. (2004) Item Response Models, pathological science, and
the shape of error. Theory and Psychology, 14, 1, 121-129. As you reduce
response error, so the model eventually fails.
I am amazed anyone would imbue any statistical item response model with the
status of a theory about a psychological attribute.
In my opinion, the best theories of ability are found in AI and
computational intelligence. Why? Because they build and explicitly test that
which they theorize might be causal for the production of reasoning and
learning within systems (of which we are but one type).
Whether or not one proposes that connectionist modeling is a reasonable
model for some human cognitive functions, the theory is about the generative
cause of the responses, not about statistically describing aggregate
responses on questionnaire items.
And it is that "theory about what generates the phenomenon" that sets apart
mathematical/statistical descriptions of data from theories of how a
phenomenal observation comes to be observed.
Basically, all Rasch says is "a latent variable - call it ability" causes
aggregate observations to align themselves like this, and so be modeled
using an explicit stochastic mathematical model.
But, the same 'theory' can be expressed as "the ability to solve these kinds
of problems produces an approximately ordered set of item difficulties,
which we can use to rank test-takers" - no mathematical model is required to
produce that ordering.
I think Aaro has it just right when he asks: " It brings us back to my
original question ... Why we need such models? I am not implying with this
question that mathematical models are not useful. My question is exactly--if
they are useful, then exactly for what?"
There is a very fine set of three paragraphs concluding the wonderful
chapter by Schonemann, P. (1994) Measurement: the Reasonable Ineffectiveness
of Mathematics in the Social Sciences. In I. Borg and P.Mohler (Eds.).
Trends and Perspectives in Empirical Social Research. Walter de Gruyter.
"Whatever use axioms may have in mathematics, in an empirical science they
must be either self-evident or empirically founded. However, it is far from
self-evident why the Archimedean axiom should hold in psychology, or in
biology, where most phenomena are bounded by physiological constraints. Nor
is it self-evident why it should always be possible, or even helpful, to
remove interactions as additive conjoint measurement (and the closely
related "functional measurement") try to do. Why should the "crisp"
mathematics of physics apply without change to the fuzzy nature of living
things? Why should subjects always utilize a particular family of distance
functions when they produce dissimilarity ratings, and what prompts them to
always interpolate a monotone transformation so that we always can use the
same canned programs?
None of this is self-evident a priori, nor is any of it empirically founded.
In some instances, as we saw, there is solid empirical evidence to the
contrary, which is simply brushed aside. As Coombs (1983) observed, the line
separating this research strategy from "mathematical game playing in search
of a trivial application ... is an exceedingly difficult line to draw" (p.
93).
What should have been self-evident from the start is that a research
strategy which develops models "independently as a body of abstract formal
theory with empirical interpretations being left to a later stage" was
doomed from the outset. Thus, in the social sciences, the real mystery is
how anyone could have seriously believed the empirical connections would
materialize at a later stage. As the experience of the last 20 years shows,
they didn't. "
I concluded my recent commentary on Stephen Humphry's forthcoming target
article: "The Role of the Unit in Physics and Psychometrics" in the journal
" Measurement: Interdisciplinary Research and Perspectives ", with ..
"As I see it, the problem remaining for any social scientist is, not one of
developing yet more derivations of existing statistical item response models
or even new such models, but one of creating bodies of evidence that
demonstrate that a psychological attribute does indeed vary additively. If
these bodies of evidence are missing, then we must continue to explore and
make careful observations and, where possible, manipulate features of
phenomena and attributes, but without this continuing pretence of an
artificial precision accorded by so-called "measurement models" within
"quantitative psychology." And we continue like this until such time as the
body of observational evidence either invites obvious and unambiguous
quantification or theory-related causal explanations of our observations
show it is simply not possible in principle. "
Two very nice papers on this issue of the status of theory in psychology
have recently been published in the journal Theory and Psychology:
Gigerenzer, G. (2010) Personal reflections on Theory and Psychology. Theory
and Psychology, 20, 6, 733-743.
And
Rosenbaum, P.J., & Valsiner, J. (2011) The un-making of a method: From
rating scales to the study of psychological processes. Theory and
Psychology, 21, 1, 47-65.
For me, this is where the real work begins - with sensible and powerful
theory construction, not with silly aggregate-model "latent variable"
statistical methods of any description.
But, what hope is there when 'methodolatry' is the order of the day? The
word came from Janesick, V. J. (1994). The dance of qualitative research
design: Metaphor, methodolatry, and meaning. In N. K. Denzin & Y. S. Lincoln
(Eds.), Handbook of qualitative research (pp. 209-219). Thousand Oaks, CA:
Sage... where "methodolatry" is defined as: "a combination of method and
idolatry, to describe a preoccupation with selecting and defending methods
to the exclusion of the actual substance of the story being told.
Methodolatry is the slavish attachment and devotion to method that so often
overtakes the discourse in the education and human services fields. (p.
215)"
Just about sums up the entire "latent variable" tosh which now dictates much
of psychometrics and edumetrics these days.
Regards .. Paul
It's just sad that all these bright minds should waste their
time complaining. Sad to see so much energy just die in pure
negativity. As if anyone cares. You people have the power to make
something good happen. Something better. Noblesse oblige!!!
D
Are you actually saying IRT stochastic data models represent theories of
psychological processes?
Are you saying that AI/connectionist research is not seeking to
postulate/build generative causal models of reasoning processes?
As to:
" Where exactly does your neural network dance deviate from, say, the good
old ML statistics? "
In terms of how the network design/technology is deployed to build DYNAMIC
intrinsically non-linear functional models of human psychological processes.
As to
" How precisely is your work different from what I see every day, every
hour, every minute of the day?"
1. I don't consider any psychological attribute as "quantitatively"
measurable. But, I'll use numbers etc., orders etc. to arrive at "good
enough" predictive accuracies (categorized or orders) for applied work -
assessed using actuarial rather than "continuous-valued" functions.
2. I will also use James Grice's Observation Oriented Modeling - a
completely intrinsically non-quantitative-metric binary pattern analysis
methodology for analyzing data and logical path statement (Grice, J. (in
press). Observation oriented modeling: An analysis of cause in the
behavioral sciences. New York: Elsevier.)
3. I disavow ALL test theory psychometrics in my test construction work -
preferring instead to treat each "assessment" problem uniquely,
algorithmically, and actuarially.
4. I don't use questionnaires anymore - I build graphical profilers - in 1
and 2 dimensions. And, I'm developing single-stimulus dynamically evolving
reasoning items - where a single stimulus replaces the myriad of "usual
suspect" ability items. But then, I'm also after "reasoning in gestalt/situ"
rather than "abilities as rulers in our heads".
I would dearly love to work more on theory - and the measurement issue. Find
me a job that would pay me to do so and I would. Until then, I have to eke
out a living doing ad-hoc HR-type consultancy work - as no university
department or test publishing company is remotely interested in someone not
doing/teaching "what everyone else does".
I cannot speak for the others on this list; but what I have said in my
message is not negative - merely factual. You may find it negative, for me
it's just the way things are.
I have also said how things should proceed, and really how positive the
situation has become (for scientists, not psychometricians) - and that there
will be limited and uneven stabs and pushes along the frontiers while we
collectively grapple with coming to an understanding of what exactly we are
trying to propose as a 'theory' of something we might wish to call a
psychological process, let alone claim "measurement" of something.
Ah well, there we go ... by the way, if you want to see how I think about
conceptualising "human psychology" - go read my two presentations (and
supporting notes):
http://www.pbarrett.net/NZ_Psych_2007.htm
#2: Two Big Ideas
#3: Brunswick Symmetry, Complexity, & Non-Quantitative Psychology - Tying it
all Together
It's a bit raw in places - but you can see why I'm no longer interested in
measuring psychological "attributes" using rulers - but looking at "the
system" as a "complex" dynamic system in situ.
But, you are the man who published "The Attack of the Psychometricians" ...
it is no wonder what you see as a negative I see as a positive.
Regards .. Paul
-----Original Message-----
From: talking-m...@googlegroups.com
[mailto:talking-m...@googlegroups.com] On Behalf Of Denny Borsboom
Sent: Monday, 21 March 2011 1:21 p.m.
To: talking-m...@googlegroups.com
Cc: Paul Barrett
Subject: Re: [talking-measurement] The Rasch model and Psychological
Measurement
You said: Obviously decision-making is a psychological phenomenon. Why I said that maybe "utility of incremental gains and losses" is not is that this "utility" is basically a variable; but psyche is not composed of variables. Maybe my interpretation was too literal, but still the problem is that "utility" as a variable can be based on different psychological structures; if so, the variable is abstracted from psyche and becomes nonpsychical. Also, as a variable that can be based on different psychical structures, it is not a measure until those psychical structures are clearly distinguished, until it is made clear that the same real quality of psyche is quantitatively described.
The argument that you are making here sounds vaguely representationalist – psychological attributes are “qualities” which can be “quantitatively described” or measured once “psychical structures” are clearly distinguished. By psychical structures I would assume that you mean psychological systems.
You argue that the utility of gains and losses under conditions of risk and uncertainty “can be based upon different psychological structures”. I interpret this as arguing that theories of utility can be proposed which are descriptively different. If my perception is true, then you are correct.
Birnbaum (2010) recently conducted a series of choice experiments designed to test lexicographic semiorder theories of utility, including the priority heuristic. To summarise his rather intensive study, Birnbaum found that the lexicographic semiorder theories did not describe choice behaviour that well. For example, priority dominance was systematically violated, meaning that choices were made using attributes of “lower” priority than those argued to be of “greater” priority by the priority heuristic. Nor did subjects display the intransitive choice behaviour that lexicographic semiorder theories predict. However, the most frequent patterns of individual choice behaviour were consistent with the quantitative TAX theory.
More research is needed of course, but it would seem that utility may indeed be a psychological quantity.
Aaro,
Your counterpoints are speculative and as such there is nothing in your response which casts doubt upon Birnbaum’s findings. Indeed, your posts thus far seem to consist of speculation and conjecture. You have responded to points addressing your speculation with only more speculation.
Nonetheless, you made two coherent statements - the utility of gains and losses under conditions of risk or uncertainty is non quantitative and that different theories of utility can be proposed. I presented empirical work directly relevant to these points. That utility may not be a quantity is an hypothesis that has received serious attention in the past 40 years. However, lexicographic semi-order theories such as the priority heuristic have never been as descriptively powerful as either prospect theory, transfer of attention exchange or rank dependent utility theory. Indeed, heuristics only seem to describe the choice behaviour from which they have been derived (Birnbaum, 2008). Obtain a fresh set of choice data, and heuristics fail. Even Brandstatter, et al (2008) had to concede that heuristic theories of choice were not descriptively powerful enough to displace the quantitative theories of utility. Hence it would seem that the evidence is against non quantitative theories of decision making under risk, or at least theories based on lexicographic semi-orders. But this isn’t really news. Concerns about the descriptive adequacy of lexicographic semi-order theories of utility have existed for some time (e.g., Grether & Plott, 1979).
Incidentally, human choice behaviour under risk is quite consistent over time and across different samples of people. For example, the preference reversals of the Allais Paradox have been replicated in every choice study in which the Paradox has been tested for almost the past 50 years. Kahneman & Tversky (1979) observed these preference reversals in the choice behaviour of psychology students from Israel and the US. Grether & Plott (1979) observed the behaviour in US students of economics and Oliver (2003) observed it in the choice behaviour of staff from a major London healthcare facility. Moreover, Tversky & Kahneman (1981) found that framing effects in medical applications of utility theory occurred in samples consisting of either university students or highly trained medical physicians. Johnson, Hershey, Meszaros & Kunreuther (1993) also discovered framing effects when business executives were asked to make choices concerning insurance products. Even Birnbaum (2000) conducts choice experiments on the Internet. So your argument of sample bias does not seem to be supported.
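For readers less familiar with why the Allais choices count as a violation of expected utility theory, here is a minimal sketch. The four gambles are the textbook version of the Allais common consequence problem (my assumption; the studies cited above used their own variants), and the function names are hypothetical. It shows that, whatever utilities are assigned to the three outcomes, EU(1A) - EU(1B) is identical to EU(2A) - EU(2B), so the commonly observed pattern of choosing 1A and 2B cannot be reproduced by any expected utility function.

```python
from fractions import Fraction as F

# Textbook Allais gambles (assumed for illustration).
# Keys are outcomes in $ millions, values are exact probabilities.
G1A = {1: F(1)}
G1B = {1: F(89, 100), 5: F(10, 100), 0: F(1, 100)}
G2A = {1: F(11, 100), 0: F(89, 100)}
G2B = {5: F(10, 100), 0: F(90, 100)}

def expected_utility(gamble, u):
    """Probability-weighted sum of the utilities u[outcome]."""
    return sum(p * u[x] for x, p in gamble.items())

def eu_differences(u):
    """Return (EU(1A) - EU(1B), EU(2A) - EU(2B)) for a utility assignment u."""
    d1 = expected_utility(G1A, u) - expected_utility(G1B, u)
    d2 = expected_utility(G2A, u) - expected_utility(G2B, u)
    return d1, d2

def allais_pattern_possible(u):
    """True if u makes 1A strictly better than 1B AND 2B strictly better than 2A."""
    d1, d2 = eu_differences(u)
    return d1 > 0 and d2 < 0

# Search a small grid of integer utilities for (0, 1, 5) million dollars.
found = any(
    allais_pattern_possible({0: a, 1: b, 5: c})
    for a in range(10) for b in range(10) for c in range(10)
)
print("Some utility assignment reproduces the Allais pattern:", found)
```

The grid search is only a sanity check; the equality of the two differences follows directly from the algebra, since both reduce to 0.11 u(1) - 0.10 u(5) - 0.01 u(0).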
Andrew
Andrew Kyngdon, PhD
MetaMetrics, Inc.
My website: https://sites.google.com/site/drandrewkyngdon/home
Measurement Forum: http://groups.google.com/group/talking-measurement
From: talking-m...@googlegroups.com [mailto:talking-m...@googlegroups.com] On Behalf Of aaro
Sent: Tuesday, 22 March 2011 2:18 AM
To: talking-m...@googlegroups.com
--
You say:
"In my view the question is if it is possible theoretically to determine constants in a law without being able to measure any of the factors involved, i.e. without 'something corresponding to linear expansion'. As is well known Rasch's crucial insight was that theoretically it is indeed possible. The probability of (a correct) response is 'measurable' and this is the starting point. True, this 'measurable' quantity is not (yet) related in various ways to other measurable quantities as you point out. However, maybe a more suitable physical analogy to the situation is, instead of Ohm's law, the level of knowledge about electricity at the time of William Gilbert (1544 - 1603). One of Gilbert's crucial observations was that a metallic needle (i.e. primitive electroscope) is deflected by the electricity produced by rubbing amber with cloth. Of course Gilbert had some 'primitive' theory about this phenomenon (for details see Bordeau: Volts to Hertz). Note that the measurable quantity 'angle deviation' is not yet connected to other quantities just as is the case with the Rasch model, but that is basically how it all got started. However, I jumped into this discussion to take position on Josh's question if there is 'any existing psychological theory that has already commenced 'such research'' to which my answer is: 'Yes, in my view such theory exists and the research has already commenced.' Of course, the question remains why no progress has been made beyond the 'Gilbert level' but that's another topic. Still no agreement about 'the starting observation' and amount of theory necessary to get started, Steve?"
Steve:
You say above: "the probability of a correct response is 'measurable' and this is the starting point". Can you explain to me what you mean by "the probability (of anything) is measurable"? How about "odds"? Would you say that if I obtain the ratio of the frequency of occurrence of an event A to the frequency of occurrence of an event B, that is a measurement?
I think we need to be clear on this first.
Best, Steve
-----Original Message-----
From: talking-m...@googlegroups.com [mailto:talking-m...@googlegroups.com] On Behalf Of Trendler, Guenter
Sent: Sunday, 20 March 2011 4:37 PM
To: talking-m...@googlegroups.com
Subject: AW: [talking-measurement] The Rasch model and Psychological Measurement
Hi Steve,
You wrote: "I think there's an important distinction to be made. If we look back on historical events in light of the theory and know-how now at our disposal, it may seem this way. Let's just consider a few details (apologies for anything not entirely accurate, historically--I do not think it should alter the basic message). Ohm's work is entitled (in English) The Galvanic Circuit Investigated Mathematically. Ohm used galvanic or voltaic cells (and later a thermocouple device). He made explicit reference to Fourier and Poisson, saying that the form of the differential equations is similar to those given for the propagation of heat. He referred to generalizations deducible from the phenomena and empirical results. He formulated several quantitative equations in arriving at the one we now call Ohm's law. He referred to the "force of a current" (current) and tensions (potentials I think). Ohm used a number of detailed figures of the circuits. He used a galvanometer to measure current (crude by today's forms, nevertheless an intricate instrument). So it's important not to overestimate the degree to which theory was unified, or the degree to which instruments had intricate designs, viewing them by modern standards. It's equally important, I think, not to underestimate either the degree to which theory and definition were developed or the intricacy of the instruments by the standards of the day."
There is a certain danger that we start splitting hairs here. The point I want to make is that in order to apply measurement theory we don't need much 'sophisticated' theory to get started with experimenting. Furthermore, theory and experiment have to go hand in hand from the simple to the complex. We should therefore try to avoid burdening established empirical discoveries with too much theory. An extreme case where a huge theory was built upon no empirical evidence at all is Herbart's 'Psychologie als Wissenschaft'. I believe that the psychological theory as presented by Rasch in his 'Probabilistic models for some intelligence and attainment tests' is enough theory to get started, which does not mean that I believe that such a start would or must be successful.
You also argue: "Sadly, I disagree that it is a similar starting observation. Examined in terms of metrology and systematic measurement (and units), the problem is the lack of something corresponding to linear expansion. The spatial extent (volume) of a material or substance is measurable, and this is a starting point. Once it can be shown in a systematic manner that heat transference affects spatial extent under specific conditions, there is a basis for measuring heat (and ultimately temperature). The problem is that "relative frequencies" of errors and so forth are not established measurements. Rasch's work to show that frequencies were systematically related and could be used to infer estimates of "parameters" is very useful. However, it doesn't have the same basis in a clearly measurable attribute (spatial extent) that is also related to various other physical dimensions."
In my view the question is if it is possible theoretically to determine constants in a law without being able to measure any of the factors involved, i.e. without 'something corresponding to linear expansion'. As is well known Rasch's crucial insight was that theoretically it is indeed possible. The probability of (a correct) response is 'measurable' and this is the starting point. True, this 'measurable' quantity is not (yet) related in various ways to other measurable quantities as you point out. However, maybe a more suitable physical analogy to the situation is, instead of Ohm's law, the level of knowledge about electricity at the time of William Gilbert (1544 - 1603). One of Gilbert's crucial observations was that a metallic needle (i.e. primitive electroscope) is deflected by the electricity produced by rubbing amber with cloth. Of course Gilbert had some 'primitive' theory about this phenomenon (for details see Bordeau: Volts to Hertz). Note that the measurable quantity 'angle deviation' is not yet connected to other quantities just as is the case with the Rasch model, but that is basically how it all got started. However, I jumped into this discussion to take position on Josh's question if there is 'any existing psychological theory that has already commenced 'such research'' to which my answer is: 'Yes, in my view such theory exists and the research has already commenced.' Of course, the question remains why no progress has been made beyond the 'Gilbert level' but that's another topic. Still no agreement about 'the starting observation' and amount of theory necessary to get started, Steve?
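To make the Rasch point concrete, here is a minimal sketch, assuming the standard dichotomous Rasch model P(correct) = exp(theta - b) / (1 + exp(theta - b)), with theta the person parameter and b the item parameter. The function names and the numbers are hypothetical. It illustrates the property at issue: under the model, the difference in log-odds between two items is the same for every person, so item constants can in principle be compared without first measuring the persons.

```python
import math

def rasch_probability(theta, b):
    """Probability of a correct response for person theta on item b (Rasch model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def item_contrast(theta, b1, b2):
    """Log-odds of success on item 1 minus log-odds of success on item 2."""
    return log_odds(rasch_probability(theta, b1)) - log_odds(rasch_probability(theta, b2))

# Hypothetical item difficulties and three very different persons.
b1, b2 = -0.5, 1.0
contrasts = [item_contrast(theta, b1, b2) for theta in (-2.0, 0.0, 3.0)]
print(contrasts)  # each value is b2 - b1, up to floating-point rounding
```

This is a property of the model only. Whether real response data conform to it, and whether a probability or an odds ratio counts as a measurement in the first place, is the empirical and conceptual question being argued over in this thread.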
Your counterpoints are speculative and as such there is nothing in your response which casts doubt upon Birnbaum’s findings.
Incidentally, human choice behaviour under risk is quite consistent over time and across different samples of people. For example, the preference reversals of the Allais Paradox have been replicated ... psychology students ... students of economics ... ... staff from a major London healthcare facility ... university students or highly trained medical physicians ... business executives ... choice experiments on the Internet.
So your argument of sample bias does not seem to be supported.
2. externally identical behaviors can be based on structurally different minds
3. no variable-based theory is able to distinguish between such differently composed minds that behave in some situations similarly
2. in other words again: there are qualitatively different ways of information processing available to humans.
4. These ways constitute a developmental hierarchy.
5. Perhaps in some levels this utility might be processed as quantity but definitely not at all levels of this hierarchy.
Aaro,
All of your arguments against utility theory have been speculative. Why? Because there exists no established body of research that shows how decision making under conditions of risk or uncertainty is either influenced or caused by any of things you’ve been arguing. For example, you have presented nothing which would convince someone familiar with utility theory that Nuria’s work on neurological rehabilitation has anything to do at all with the St Petersburg Paradox, or violations of stochastic dominance, or the common ratio effect of the Allais Paradox. How would “cultural evolution” have anything to do with the descriptive failures of lexicographic semi-order theories of utility? Or do you expect me to believe you just because you say that these things are relevant? You need evidence to back up your arguments, not speculation or your opinion.
By the way, Lattimore, Baker & Witte (1992) recruited both a sample of college students and a sample of North Carolina prison inmates in their choice study. They found no material differences in choice behaviour between the two groups of subjects, with the exception that the male college student choice behaviour seemed to accord slightly (and I mean slightly) more with the traditional expected utility theory. Female college students and the prison inmates seemed to weight outcome probabilities as predicted by cumulative prospect theory. However, this was only for lotteries consisting of gains. For gambles consisting of losses, the choice behaviour of the prison inmates and the college students was virtually identical.
Given that such things like the Allais, Ellsberg and St Petersburg Paradoxes are some of the most experimentally reproducible phenomena in the behavioural sciences, I doubt anyone in utility theory would seriously entertain the idea that the whole field is compromised by your allegation of sample bias. The St Petersburg Lottery was first proposed in 1738. Today, over 270 years later, people continue to be willing to pay only a small fee to play the St Petersburg Lottery, despite the fact that this lottery has an infinite expected value (meaning that a player could possibly become “infinitely” rich). Hence the term “St Petersburg Paradox”. In 1980 the philosopher Hacking suggested that only a few people would pay even USD$25 to play the game (Hacking, 1980). I doubt the St Petersburg Paradox would have survived for centuries if it only pertained to university graduates, especially given there were hardly any university graduates back in 1738. As the Lattimore, et al (1992) study showed, it’s not the sample which is the greatest influential factor in choice behaviour, but the kind of risky situation that people have to make decisions in.
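A minimal sketch of why the expected value is infinite, assuming the usual statement of the game: a fair coin is tossed until the first head, and a head on toss k pays 2^k dollars (the payoff convention and the function name are my assumptions; some versions pay 2^(k-1)). Each possible toss contributes exactly one dollar to the expected value, so a game capped at n tosses has an expected value of n dollars, which grows without bound as the cap is lifted.

```python
from fractions import Fraction

def truncated_expected_value(max_tosses):
    """Expected payoff when the game is capped at max_tosses tosses.

    A first head on toss k has probability 1/2**k and pays 2**k dollars;
    if no head appears within the cap, nothing is paid.
    """
    return sum(Fraction(1, 2**k) * 2**k for k in range(1, max_tosses + 1))

for n in (1, 10, 25, 100):
    print(n, truncated_expected_value(n))
```

On this convention a fee of USD$25 equals the expected value of a game capped at 25 tosses, where the top prize would already be 2^25 dollars (about 33.5 million). That people decline such a fee is exactly the behaviour the paradox describes.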
Andrew
From: aaro [mailto:aaro.t...@ut.ee]
Sent: Tuesday, 22 March 2011 10:56 PM
To: talking-m...@googlegroups.com
Aaro,
If your point is that there is no evidence related to the paradoxes you mention then here I agree with you. If you say that there have been no studies on decision-making, then this is wrong; and Luria's work on rehabilitation includes work on rehabilitation of lost decision-making abilities. But this is not the point.
The point is that you have singularly failed to explicate how anything of what you say relates to decision making under risk and uncertainty. It is very obvious that you know next to nothing of the field. Arguing from a position of near total ignorance is not going to convince anyone. At least now though you agree that there is no established research which backs your speculation. Now, Luria’s work sounds interesting, but once more you have failed to explain how it relates to decision making under risk in any specific way. Again, you seem to think that I should believe you because you say it does.
I suggested that externally similar behavioral results can be based on psychologically different mechanisms and vice versa. This principle has been supported by so many studies in so different areas of research that I do not see any reason to suspect that it does not apply to the utility behaviors. I also think that no studies are needed to support this principle, observations of everyday behaviors would be sufficient.
So, are you saying that you do not need scientific evidence to support your hypothesis that “behavioral results can be based on psychologically different mechanisms”? If so, then you are no scientist. Indeed, your hypothesis is a woolly and vague generalisation, not an hypothesis concerning a psychological system or component thereof. Hypotheses without rigorous scientific evidence supporting them are nothing more than speculation. So once again, you are merely speculating.
Even more, you pushed me to think more on the subject of our discussion and I realized that you already provided evidence that the same principle applies to the phenomenon you refer to--there, too, different psychological ways of solving the same problems have been observed. You provided two kinds of evidence. First, persons who know the paradox solve the problems differently from those who do not know it. And second, you also mentioned in an earlier post that not all subjects behave in the ways expected by the theory. This is exactly what I would also predict on the basis of "speculations" -- which might also be called generalizations.
I said no such thing. Either this is a mistake on your part or you are deliberately engaging in a straw man fallacy. What I said was this “…theories of utility can be proposed which are descriptively different.” I did not say anything about people solving paradoxes in different ways.
The studies and theories you mentioned, from my (cultural-historical or structural-systemic) perspective are not even trying to reveal the psychological operations that underlie the observed behaviors.
Once more you are completely wrong. You should actually try reading some literature on decision making under risk and uncertainty. Kahneman & Tversky’s (1979) classic paper on prospect theory is quite readable for the non-expert. In this paper, Kahneman & Tversky explain why people weight the outcome probabilities of a lottery in a non-linear way. This they called the “fourfold pattern of attitude towards risk”. They found that people are risk averse in the context of probable gains, but they counterintuitively seek risk in the face of certain losses. But people are also averse to risk in the context of improbable losses (e.g., the purchasing of renter’s insurance) and they seek risk for improbable gains (e.g., poker machine gambling). So the mathematical weighting function in quantitative theories of utility does have a firm, descriptive psychological basis. Also, there is an affective component to utility. People find losses more painful than they find equivalent gains pleasurable, and the utility function in CPT accounts for this. So your argument that theories of decision making under risk are “not even trying” to be descriptive of the psychological processes that underlie choice behaviour is totally false.
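As a rough illustration of the two components just mentioned, here is a sketch that assumes the functional forms and median parameter estimates reported in Tversky & Kahneman's (1992) cumulative prospect theory paper (gamma = 0.61 for the weighting of gains, alpha = 0.88 for curvature, lambda = 2.25 for loss aversion). Neither the forms nor the values are given in this thread, so treat them as my assumptions; the function names are hypothetical.

```python
def weight(p, gamma=0.61):
    """Inverse-S probability weighting function (assumed Tversky-Kahneman 1992 form)."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

def value(x, alpha=0.88, lam=2.25):
    """Value function: concave for gains, steeper for losses (loss aversion)."""
    return x**alpha if x >= 0 else -lam * (-x)**alpha

# Small probabilities are overweighted, large probabilities underweighted.
for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    print(f"p = {p:.2f}  w(p) = {weight(p):.3f}")

# A loss of 100 weighs more heavily than a gain of 100.
print(value(100), value(-100))
```

With these assumed parameters, improbable outcomes receive more decision weight than their stated probability and probable outcomes receive less, which is the non-linearity behind the fourfold pattern; the ratio of value(-100) to value(100) is the loss aversion coefficient.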
Perhaps your “cultural – historical or structural systemic perspective” (whatever that means) is too blinkered, or you believe in it too much?
Well, I think the evidence you provide cannot support your conclusions; because for different reasons, group-level data analyses cannot in principle be translated back to psychological operations that characterize the individual level of analysis.
Ha! Now you are changing your argument in the face of evidence which refutes your earlier position. You said “I wrote that study of exclusively very highly educated persons (everybody who has more education than primary is very highly educated compared to the world population) creates problems.”
That aside, utility theorists are already way ahead of you. Luce (2000) and Birnbaum (1999) argued that tests of decision making phenomena should also be done at the level of the individual. It has been found that choice behaviour such as event splitting effects holds at the level of the individual just as it does for groups (e.g., Birnbaum, 1999c, 2004a, 2007b; Humphrey, 1998, 2000, 2001a, 2001b). Moreover, at the individual level, people who make choices under risk weight the outcome probabilities of lotteries in a non-linear way (e.g., Abdellaoui, 2000; Gonzalez & Wu, 1999; Lattimore, et al, 1992), which is exactly what the current quantitative theories of utility argue. If your allegation of sample bias had any weight, it would have been highly unlikely that Lattimore, et al’s (1992) North Carolina prisoners non-linearly weighted lottery outcome probabilities just as Kahneman & Tversky’s (1979) Israeli psychology students did.
What I am suggesting is that there is strong evidence for different ways of problem solving in many different domains. I generalize this principle to the domain of decision-making in risky situations without empirical support available yet. Maybe even introspection would give some necessary support?
If you do not have empirical support, then what you say cannot be anything more than speculation. Mere speculation cannot cast doubt upon established theories. Introspection? You’ve got to be joking, right?
The point I made is that there are quantitative and non-quantitative, heuristic based theories of utility; and that the non-quantitative theories have been descriptive failures compared to the quantitative ones. You have systematically refused to engage with this point and have thrown at me instead all sorts of red herrings, such as sample bias.
For me the theories you mention are not psychological theories even though behavior is studied in them.
This is a contradiction and is therefore logically false. A theory of human behaviour is a theory of psychology, given that psychology is the study of human (and animal) behaviour. Utility theories are theories of choice behaviour and so therefore are psychological theories. As I said before, I’ve never seen a rock make a decision.
Andrew
From: talking-m...@googlegroups.com [mailto:talking-m...@googlegroups.com] On Behalf Of aaro
Sent: Thursday, 24 March 2011 2:58 AM
To: talking-m...@googlegroups.com
--
Aaro,
You are right. Call me old fashioned, but I think that in order to convincingly critique an established theory, one has to understand that theory well, articulate an argument which explicitly criticises key components or assumptions of that theory and present evidence in support of the argument. I do not think it’s wise to continue debating with someone who makes straw man arguments, such as
I reject behaviorism that logically follows from your definition of psychology.
and
I do not think that giving a name to a process or series of events is an explanation.
I never said that I endorse behaviourism. Neither does my definition of psychology logically rule out the study of cognitive phenomena. Why would I want to do that anyway, given that making decisions under risk obviously involves cognitive processes? And I never argued that reification constitutes a scientific theory. Once more, you’ve made the classic straw man fallacy.
It’s funny you mention constructive experiments, as Steve Humphry and I have been planning a choice experiment this week.
Cheers,
Andrew
From: talking-m...@googlegroups.com [mailto:talking-m...@googlegroups.com] On Behalf Of aaro
Sent: Thursday, 24 March 2011 10:39 PM
To: talking-m...@googlegroups.com
--
For an exemplar of the kind of work agent-based embedded cognition AI
roboticists get up to .. goto
And, take a look at the paper below ..
Berthouze, L., and Metta, G. 2005. Epigenetic robotics: modelling cognitive
development in robotic systems. In Cognitive Systems Research. Volume 6
Issue 3. September.
http://www.robotcub.org/misc/papers/05_Berthouze_Metta.pdf
You will note from the list of publications at this site that some of the
algorithms embodied in the cognitive system are not "fixed", or even
mathematical (akin to non-computational cellular automata and
non-computational "emergent property" systems such as found in Artificial
Life simulations). See also the Science article " Self-Organization,
Embodiment, and Biologically Inspired Robotics"
(http://www.sciencemag.org/content/318/5853/1088.full )
There are developments, robot systems "out there" in emergent system
agent-based robotics which are trying to get to grips with how our brain
adapts, grows, and learns, and how to build a system which "feels" and has
"personality" (e.g. KISMET ...
http://www.ai.mit.edu/projects/humanoid-robotics-group/kismet/kismet.html )
Just run a quick Google search using the terms "Encoding Emotionality in
Robots".
There are huge philosophical and computational/algorithmic problems with
dealing with emotionality .. but at least some are trying to figure these
out, step-by-step rather than keep hand-wringing and saying "it's
impossible".
What is fascinating about this work, and what evades the "tiny-mind
mentality" of many psychologists, especially "quantitative psychologists" is
that this work is dealing with the very essence of being human:
Consciousness, sentience, adaptation, biological self-organization, and
emotionality. It is no surprise to find theoreticians, philosophers, and
applied scientists from physics, computation, and engineering working
alongside the kind of thoughtful psychologists who are prepared to try and
build systems which seem closer to how human cognitive systems might work.
----------------------------------------------------
And, I can see what may be nagging at Aaro ... for me it's a bit like the
effect Gigerenzer and the ABC group in Berlin had in the world of
quantitative decision making models with the introduction of
evolutionary-advantageous non-computational fast and frugal decision-making.
It's not an "either or" situation, but you get the feeling that mathematics
may not be a hugely realistic way of modeling -some- human processes
(inasmuch as the good ol' backpropagation system in neural nets is not a
good way to model aspects of human learning, which is why the LEABRA {local,
error-driven and associative, biologically realistic algorithm} algorithm
was invented) ...
O’Reilly, R. C. (1996). Biologically plausible error-driven learning using
local activation differences: The generalized recirculation algorithm.
Neural Computation, 8, 895–938...
and
O’Reilly, R. C., & Munakata, Y. (2000). Computational explorations in
cognitive neuroscience: Understanding the mind by simulating the brain.
Cambridge, MA: MIT Press.
----------------------------------------------------
So, although I too have some "issues" with Aaro's conceptualisation of
"psychic explanation", I don't poke fun at him.
Andrew has clearly and firmly highlighted the evidence surrounding
choice-behaviors in certain contexts, which produce outcomes which can be
modeled mathematically with some considerable degree of success.
But, how they actually do this, how it could be implemented in wetware,
simulated in digital-analog hybrid algorithms, or any kind of human
neurophysiological adaptive system remains a mystery. And, at what stage
might a fast and frugal process bypass what looks to be a rational
computational process? Does it/could it ever do so - and how might we test
such a proposition?
And what happens when you inhibit neurogenesis in the hippocampus with
extreme stress, with the knock-on effect on working memory? Do the math
models still hold? Unfair question really but it brings home the notion of
an integrated system at work - and what might happen to change the
behavioral outputs of a system which under other conditions can be modeled
mathematically. Maybe things still do function as expected; but this is the
problem of a mathematical model devoid of explanatory content (i.e. how does
the brain actually implement the "math", or do what it does that enables a
mathematical model to be fit to the observable outcomes in the first
place?). No doubt, already being investigated somewhere ...
----------------------------------------------------
The contrast between a "scientific" and a "strictly quantitative" approach is
illustrated by how Denny and his colleagues criticised Daryl Bem's recent
piece of nonsense:
Bem, D.J. (2011) Feeling the future: Experimental evidence for anomalous
retroactive influences on cognition and affect. Journal of Personality and
Social Psychology, 100, 3, 407-425.
Critiqued in:
Wagenmakers, E-J., Wetzels, R., Borsboom, D., van der Maas, H.L.J. (2011)
Why psychologists must change the way they analyze their data: The case of
Psi: Comment on Bem (2011). Journal of Personality and Social Psychology,
100, 3, 426-432.
Frankly, a pointless exercise in "my method is better than your method" from
Wagenmakers et al. They missed the "bleeding obvious" issue entirely - lost
in a world of statistical methodology ...
Let me explain ...
The key issue here (for me) is all about aggregation, and the specific form
of hypotheses that can be tested using sample statistics.
The effect size is a ratio between aggregated variances, or the scaled
difference between two aggregate parameters (the means).
But, what is the form of hypothesis which can be tested?
H1: If the hypothesis is that aggregate statistical effects can be shown to
be larger than zero, then Bem did a good job. But the hypothesis says
nothing about “people show psi ability” – because “people” defined as
constituting many single-unit effect-producing entities were never examined.
All that was examined were aggregates of all the “single-unit”
scores/outcomes.
H2: On the other hand, IF the hypothesis was to be: all humans show evidence
of psi ability, then every individual must show that ability (however tiny)
in order for such a hypothesis to be supported.
A careful reading of Bem’s paper shows that he continually wants to argue
for H2 (as an evolutionary advantageous property of being human, like having
a prefrontal cortex etc., we all have it in varying degrees), but implements
a hypothesis testing procedure which can only address H1.
We know from the pitiful effect sizes that many people did not show psi
(assuming normally distributed data required by the methods he used). Hence,
H2 is already disproven by Bem, although he is so committed to a statistical
view of phenomena that he doesn’t recognize what his own results imply.
Consider the fact that some people in his sample may have shown truly
outstanding evidence of psi, some will have actually produced behaviors less
than chance would have expected, some will be 50/50 ... you average them and
what do you get? Exactly what Bem found, slightly above chance effect sizes.
But what have you got in the real world (not in his statistician’s view of
the world)
1. Some people seem to possess psi.
2. Some people don’t possess any psi; they respond at chance expected
levels.
3. Some people seem to have responded worse than chance.
As a scientist, #1 (and maybe #2) is the really important finding as psi
need not be a property of every human; what’s important is being able to
demonstrate conclusively that some individuals really do show
precognition/psi on many independent testing occasions (i.e. it's
replicable).
Who cares about an effect size of 0.15 for an entire sample when perhaps 5
individuals in that sample show a psi effect that is consistently 90%
accurate above chance-expected levels? Instead of a silly “what exactly is
the point of all this tosh” paper, we’d be reading the work in Science and
Nature – with our collective jaws dropping around the world.
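To make the aggregation point concrete, here is a minimal sketch in Python. The subgroup sizes and hit rates are invented purely for illustration (they are not Bem's data), and `aggregate_hit_rate` is a hypothetical helper. Under those assumptions it shows how a few strongly above-chance responders, a large at-chance majority, and a few below-chance responders average out to a sample hit rate only slightly above 50%.

```python
# Hypothetical subgroups: (number of people, individual hit rate).
# These figures are assumptions for illustration, not Bem's data.
SUBGROUPS = [
    (5, 0.95),   # a few people far above chance
    (90, 0.50),  # most people exactly at chance
    (5, 0.45),   # a few people slightly below chance
]


def aggregate_hit_rate(subgroups):
    """Return the sample-wide mean hit rate across all individuals."""
    total_people = sum(n for n, _ in subgroups)
    total_rate = sum(n * rate for n, rate in subgroups)
    return total_rate / total_people


mean_rate = aggregate_hit_rate(SUBGROUPS)
print(f"aggregate hit rate: {mean_rate:.3f}")
print("individual hit rates:", sorted({rate for _, rate in SUBGROUPS}))
```

The aggregate figure alone cannot distinguish this mixture from a sample in which every individual responds just above chance, which is why a test on the aggregate speaks to H1 but not to H2.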
It is as though some psychologists no longer understand how to even compose
scientific hypotheses that make sense. Everything is "given over" to Lord
Charles Bowen's "average man on the Clapham Omnibus" view of "the person" ..
as though this fine legal proposition is a foundation for an investigative
science.
----------------------------------------------------
And finally, I/we don't just criticize - some of us really try and do the
new business, bit by bit, - against a backdrop of individuals whose
preservation of the status quo results in drip-fed acidic and pernicious
ridicule. This work is really hard, really time-consuming, and really
awkward.
Regards .. Paul
Steve: "You say above: "the probability of a correct response is 'measureable' and this is the starting point. Can you explain to me what you mean by "the probability (of anything) is measurable"? How about "odds". Would you say that if I obtain the ratio of the frequency of occurrences of an event A to frequency of occurrence of event B, that is a measurement?"
Surprisingly, calling the Rasch model(s) a psychological theory created some discomfort among group members. Evidently I mistakenly assumed that some basic ideas about ‘Rasch models’ are commonly shared. Since this is not the case, I have to make clear in what sense the Rasch model is a theory. Hambleton et al. write:
“Item response theory (IRT) rests on two basic postulates: (a) The performance of an examinee on a test item can be predicted (or explained) by a set of factors called traits, latent traits, or abilities; and (b) the relationship between examinees' item performance and the set of traits underlying item performance can be described by a monotonically increasing function called an item characteristic function or item characteristic curve (ICC). This function specifies that as the level of the trait increases, the probability of a correct response to an item increases.” (Fundamentals of Item Response Theory, 1991, p. 8)
Hopefully this does not worsen the situation by adding to the dispute the question of whether the Rasch model is really an IRT model. In the following I rely heavily on Georg Rasch’s ‘Probabilistic models for some intelligence and attainment tests’ and on Andrich’s ‘Rasch Models for Measurement’.
Josh wrote:
“But such an epistemology seems to me consistent with the way that measurement was actually established in the physical sciences, i.e., hypothesising and confirming physical properties and the types of relations between them across a wide range of physical phenomena and building this into a coherent body of substantive theory that is the foundation (not mathematical as it is often presented in psychology) of the international system of measurement. I would argue that the possibility of psychological measurement can only be established (or not) by such research, and certainly not by fiat (which I have been somewhat guilty of in the past). [/] Is there any existing psychological theory that has already commenced 'such research'? Is there any existing physical theory that is relevant to 'such research', analogously or otherwise? Hopefully, by way of these discussions and others, we can get away from postulation and begin the necessary research.”
In my view the Rasch model is such a theory. It relies on the empirical observation that some examinees have higher probabilities of answering an item correctly than do other examinees. The explanatory hypothesis is that the observed differences are induced by differences in traits in such a way that "examinees with higher values on the trait have higher probabilities of answering the item correctly than do examinees with lower values on the trait" (op. cit., p. 8). The Rasch hypothesis is that the relation is quantitative. In order to start testing the hypothesis we need test persons and test items. The probability of a correct response to an item can be described in terms of odds (for details see Andrich, p. 12). Finally, let’s assume with Andrich that the observations are made by means of the Eysenck Personality Inventory.
How would a physicist (sic!) test the quantitative hypothesis? Since the hypothesis is that every person has some level of neuroticism, he will need at least two persons and two items (e.g. 1. “Do you sometimes feel happy, sometimes depressed without any apparent reason?” 2. “Do you have frequent ups and downs in mood, either with or without apparent cause?”). He will invite each test person to answer each question repeatedly with “yes” (=1) or “no” (=0). If the quantitative hypothesis is correct, the experimenter must find that the ratio of the different levels of neuroticism is constant across items (for details see Andrich, p. 24ff). If he does, he has a strong indication that the factors involved are indeed quantitative and therefore measurable. If he does not, he can either drop the quantitative hypothesis or investigate why the search has failed. For example, consider that “a common assumption of IRT models is that only one ability is measured by a set of items in a test. This assumption cannot be strictly met because several cognitive, personality, and test-taking factors always affect test performance, at least to some extent. These factors might include level of motivation, test anxiety, ability to work quickly, tendency to guess when in doubt about answers, and cognitive skills in addition to the dominant one measured by the set of test items. What is required for the unidimensionality assumption to be met adequately by a set of test data is the presence of a "dominant" component or factor that influences test performance.” (op. cit., p. 9) Hence, maybe the failure to find constants is due to some other dominant factors (e.g. learning and memory) which act as systematic disturbances. The next step therefore is to identify these factors, control them, repeat the experiment, and so on. This, then, is how ‘such research’ can commence.
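The constancy being searched for can be sketched as follows. This is only an illustration under stated assumptions: it takes the dichotomous model in the form Odds = B/D, and the person values, item values, and function names are all hypothetical. In an actual experiment the odds would have to be estimated from the observed frequencies of "yes" and "no" responses rather than computed from known parameters.

```python
import math


def odds(ability, difficulty):
    """Odds of a 'yes' (=1) response, assuming the form Odds = B/D."""
    return ability / difficulty


def probability(ability, difficulty):
    """Probability of a 'yes' response implied by those odds."""
    o = odds(ability, difficulty)
    return o / (1.0 + o)


# Hypothetical person and item values (assumptions for illustration only).
PERSONS = {"person_1": 4.0, "person_2": 1.0}
ITEMS = {"item_1": 0.5, "item_2": 2.0}


def odds_ratio_per_item(persons, items):
    """Ratio of the two persons' odds, computed separately for each item."""
    b1, b2 = persons["person_1"], persons["person_2"]
    return {name: odds(b1, d) / odds(b2, d) for name, d in items.items()}


ratios = odds_ratio_per_item(PERSONS, ITEMS)
for item, ratio in ratios.items():
    p1 = probability(PERSONS["person_1"], ITEMS[item])
    p2 = probability(PERSONS["person_2"], ITEMS[item])
    print(f"{item}: P1={p1:.3f}  P2={p2:.3f}  odds ratio={ratio:.3f}")
```

The response probabilities change from item to item, but the ratio of the two persons' odds does not. An observed ratio that drifts across items would be the failure to find a constant described above.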
My point in answer to Josh’s question, “Is there any existing psychological theory that has already commenced 'such research'?”, is this: certainly there is enough psychological theory already in existence; only the proper research (the search for constants) has not yet commenced. What we need is not even more theory but experiments. Experimental results will tell us in which direction we have to go and whether more theory is needed. Experimental science consists of an interplay between theory and experiment. Theory and experiment must match in complexity or, alternatively, theory should not depart too much from experiment in complexity. In physics, of course, nature is ‘simplified’ in experiment in order to match theory in ‘simplicity’.
Regards,
Guenter
"Hence, maybe the failure to find constants is due to some other dominant factors (e.g. learning and memory) which act as systematic disturbances. The next step therefore is to identify these factors, control them and repeat the experiment and so on. This, then, is how 'such research' can commence."
How does one do this precisely without theory? You say that we don't need more theory, but more experiments. If so, what happens when an experiment fails to establish a trade off between two attributes? In such cases, how can one speculate as to the causes of this failure? How can one identify which attributes are potentially confounding and which are not without a theory of the relevant natural system?
I would argue that theory, however crude and incomplete, precedes experiment. The theory of luminiferous ether preceded the Michelson-Morley experiment. Von Neumann & Morgenstern's (1947) independence condition of utility preceded Allais' (1953) choice experiments which tested it. I agree with you when you say that science is the interplay between theory and experiment, but I cannot see how that logically entails that no more theory is needed in the behavioural sciences. Allais' (1953) common ratio and common consequence effects were resolved only by new *theories* of decision making under risk, not by more choice experiments.
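Why the common ratio effect needed new theory can be seen in a short sketch. The payoffs and probabilities below are illustrative assumptions, not Allais' original figures, and the utility functions are arbitrary examples. Lotteries C and D are A and B with the winning probabilities scaled by a common ratio of 1/4, so under expected utility the preference between C and D must match the preference between A and B whatever utility function is used.

```python
import math

# Lotteries as lists of (probability, payoff); figures are illustrative assumptions.
A = [(1.00, 3000)]
B = [(0.80, 4000), (0.20, 0)]
C = [(0.25, 3000), (0.75, 0)]  # A with its winning probability scaled by 1/4
D = [(0.20, 4000), (0.80, 0)]  # B with its winning probability scaled by 1/4


def expected_utility(lottery, u):
    """Probability-weighted sum of the utilities of the outcomes."""
    return sum(p * u(x) for p, x in lottery)


# Arbitrary example utility functions (hypothetical choices).
UTILITIES = {
    "linear": lambda x: x,
    "square root": math.sqrt,
    "logarithmic": lambda x: math.log1p(x),
    "convex": lambda x: x ** 2,
}


def preferences(u):
    """Return (prefers A over B, prefers C over D) for utility function u."""
    return (
        expected_utility(A, u) > expected_utility(B, u),
        expected_utility(C, u) > expected_utility(D, u),
    )


for name, u in UTILITIES.items():
    first, second = preferences(u)
    print(f"{name:12s} A over B: {first!s:5s}  C over D: {second!s:5s}")
```

No choice of utility function in this sketch yields A in the first pair together with D in the second, which is the pattern usually reported as the common ratio effect. Running further choice experiments would only reproduce the violation; accommodating it requires changing the theory.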
Andrew
-----Original Message-----
From: talking-m...@googlegroups.com [mailto:talking-m...@googlegroups.com] On Behalf Of Trendler, Guenter
Sent: Saturday, 26 March 2011 7:47 PM
To: talking-m...@googlegroups.com
Subject: AW: [talking-measurement] The Rasch model and Psychological Measurement
If there is no item response error, then the Rasch model cannot fit the
data; the measurement is ordinal - no ratios are possible. So, you have to
have measurement error for the model to fit, and for ratios to be
computable.
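A minimal sketch of the first point, under the assumption that an error-free response is simply 1 when ability at least matches difficulty and 0 otherwise (a Guttman pattern). The ability and difficulty values and the `stretch` function are hypothetical. Any order-preserving rescaling of the values reproduces exactly the same data, so the data can carry only ordinal information.

```python
def guttman_responses(abilities, difficulties):
    """Error-free responses: 1 if ability at least matches difficulty, else 0."""
    return [[1 if a >= d else 0 for d in difficulties] for a in abilities]


# Hypothetical values (assumptions for illustration only).
ABILITIES = [0.5, 1.5, 2.5, 3.5]
DIFFICULTIES = [1.0, 2.0, 3.0]


def stretch(x):
    """An arbitrary order-preserving (strictly increasing) rescaling."""
    return x ** 3


original = guttman_responses(ABILITIES, DIFFICULTIES)
rescaled = guttman_responses(
    [stretch(a) for a in ABILITIES],
    [stretch(d) for d in DIFFICULTIES],
)

for row in original:
    print(row)
print("identical after rescaling:", original == rescaled)
```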
What kind of causal theory requires measurement error in order for the
theory to be adjudged "correct"?
Maybe we need a deterministic model rather than a probabilistic one?
Regards .. Paul
-----Original Message-----
From: talking-m...@googlegroups.com
[mailto:talking-m...@googlegroups.com] On Behalf Of Trendler, Guenter
Sent: Saturday, 26 March 2011 9:47 p.m.
To: talking-m...@googlegroups.com
Subject: AW: [talking-measurement] The Rasch model and Psychological
Measurement
Interesting post and I will read up on those papers concerning the LEABRA algorithm. Has mainstream connectionism embraced these kinds of error algorithms?
PB:
"...for me it's a bit like the effect Gigerenzer and the ABC group in Berlin had in the world of quantitative decision making models with the introduction of evolutionary-advantageous non-computational fast and frugal decision-making."
You may be interested in the following papers and the debate between the priority heuristic and quantitative theories of risky choice:
Brandstatter, Gigerenzer & Hertwig (2006). The priority heuristic: making choices without tradeoffs. Psychological Review, 113, 409-432.
Birnbaum, M.H. (2008). Evaluation of the priority heuristic as a descriptive model of risky decision making: comment on Brandstatter, et al, (2006). Psychological Review, 115, 253-262.
Brandstatter, Gigerenzer & Hertwig (2008). Risky choice with heuristics: reply to Birnbaum (2008), Johnson, Schulte-Mecklenbeck and Willemsen (2008) and Rieger & Wang (2008). Psychological Review, 115, 281-290.
Birnbaum, M.H. (2008). Postscript: rejoinder to Brandstatter, et al, (2008). Psychological Review, 115, 260-262.
Birnbaum, M.H. (2010). Testing lexicographic semiorders as models of decision making: priority dominance, integration, interaction and transitivity. Journal of Mathematical Psychology, 54, 363-386.
In my view, Michael Birnbaum's series of choice experiments (Birnbaum, 2010) has clearly shown that the priority heuristic theory of risky decision making is a descriptive failure.
PB:
"Andrew has clearly and firmly highlighted the evidence surrounding choice-behaviors in certain contexts, which produce outcomes which can be modeled mathematically with some considerable degree of success."
Thanks Paul, I'm glad someone noticed what I was doing.
I find myself drawn more and more towards decision making under risk as, unlike in psychometrics, its formal theories are motivated by the attempt to describe human behaviour. This and the history of critical, experimental study mean that compelling arguments against quantitative theories of utility are much more difficult to mount and sustain than arguments against quantitative theories of test performance. Indeed, the most plausible non-quantitative alternatives are the lexicographic semiorder class of theories, but these have been shown to be descriptively inferior to the quantitative theories.
Cheers,
Andrew
Wim (and Denny)
Dear measurement enthusiasts,
engine....) But our patient has a problem understanding psychology or
better the psychic apparatus that is present in every human. This cure
is no mean task.
But I hope that in a few years' time CNN and the rest of the world media
will flock to Estonia to report on Aaro's "psychic ability machine" that
has a personality (or rather personality traits) too. I
do not hope that this machine will be only a computer simulation or
computer programme. In that case CNN won't send any camera teams.
It is not from AI that the solution will come, I prophesy. That line
of theorizing about human cognition and abilities has been talked to
death more than a decade ago. If you programme everything you want
"cognition" to do it does not prove very much. For one thing, a
computer programme can't have real feelings and emotions. This fact
alone will preclude a working model that has anything to do with real
psychic processes.
Pure cognition can be implemented in computer chips these days. Even
visual perception that has eluded AI specialists for so long can be
implemented. Already there are cars without a human driver riding on
the roads in the US. But these artificial drivers will never be
startled by a sudden occurrence on the road. Such a driver won't have a
heart and therefore its heart will never skip a beat. Of course an emotion
algorithm could be added to the driving programme per se. (Like Data
had in the SF TV series Star Trek. But that doesn't prove anything....)
How do you define the term "item response error" as you use it below? Is it synonymous with your "measurement error"?
It's possible to look analytically at the dichotomous Rasch model in terms of linearly decomposed errors and model "parameters". Through this lens, I can understand your saying you have to have error for the model to fit. I did this with the balance beam as a prototype to see what is implied. I do not think item response error and measurement error are interchangeable, but it really depends exactly what you mean.
I think there is a larger issue, namely that a theory should translate to substantive quantitative relations, not a purely algebraic model. I take quantities like length and mass to be real, and physical theory, definition and law as referring directly to the quantities and their relations.
No doubt, I largely agree with what you're trying to say, but it seems I differ with you regarding the most profitable place to start looking for a way to better understand what's done now and how to attack the problem of measuring posited psychological quantities.
Best,
Steve
One way to put a spotlight on the basic problem is to ask: where are the units? Rasch drew upon the ideal gas law and Newton's second law. Indeed, he cited Maxwell on the latter, regarding the definition of the unit of force as that which acting on the unit of mass produces the unit of acceleration. To take Newton's second, F = m a refers to (a) actual quantities and (b) an actual causal relation that can be isolated from other physical relations. F is a force, and a force is not a number. Similarly, m is a mass, and a is the acceleration of a body. a = F/m is not merely algebra. {L} m per {T} s, per {T} s = {F} N per {m} kg is the same thing stated in terms of the standard units.
If the standard (classical) definition of measurement is accepted, it's just not possible for the dichotomous model to be a definition or law of the kind that form the most direct and tangible basis for the definitions of SI units and for measuring in those units. Instead, Rasch models are merely algebraic expressions. Again, the Poisson is more promising in this respect, and even the dichotomous model as Odds = B/D has a form parallel with those of physical definitions and laws; but the same can't be said of the logistic function(s) so widely used.
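For reference, the two forms being contrasted are algebraically equivalent. The notation below is an assumption for illustration: beta and delta are the person and item locations on the logit scale, X is the scored response, and B and D are their exponentials, matching the Odds = B/D form mentioned above.

```latex
\begin{align}
P(X = 1) &= \frac{e^{\beta - \delta}}{1 + e^{\beta - \delta}}, \\
\text{Odds} &= \frac{P(X = 1)}{1 - P(X = 1)} = e^{\beta - \delta} = \frac{B}{D},
\qquad B = e^{\beta},\; D = e^{\delta}.
\end{align}
```

Only the second line has the ratio form that parallels a = F/m; the logistic function in the first line hides it.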
So Rasch drew explicit parallels with physics that, if we're thinking about successful measurement, ought in my view to be taken seriously in two respects. First, the parallels potentially connect to the basis of successful measurement in a way that is not elsewhere seen in psychometrics. Second, however, the nature of those connections carries clear implications for the way the "models" would have to be interpreted and applied if they were to be used as a successful basis for measurement. The second implication is substantial where it comes to possible psychological theory (as opposed to merely mathematical theory, some of which is used in essentially a post hoc way to justify the stock-standard use of raw scores on tests).
Regards,
Steve
________________________________________
From: talking-m...@googlegroups.com [talking-m...@googlegroups.com] On Behalf Of Trendler, Guenter [guenter....@zi-mannheim.de]
Sent: Saturday, 26 March 2011 4:47 PM
To: talking-m...@googlegroups.com
Subject: AW: [talking-measurement] The Rasch model and Psychological Measurement
Hi Steve,
Michell suggests the application of conjoint measurement, but since we can measure time we can determine constants and thus apply derived measurement instead, just as usually done in physics. Still something missing?
Regards,
Guenter
-----Original Message-----
From: talking-m...@googlegroups.com on behalf of Stephen Humphry
Sent: Mon 28.03.2011 14:26
To: talking-m...@googlegroups.com
Subject: RE: [talking-measurement] The Rasch model and Psychological Measurement
Hello Steve
> Hi Paul.
>
> How do you define the term "item response error" as you use it
> below? Is it synonymous with your "measurement error"?
> It's possible to look analytically at the dichotomous Rasch model
> in terms of linearly decomposed errors and model "parameters".
> Through this lens, I can understand your saying you have to have
> error for the model to fit. I did this with the balance beam as a
> prototype to see what is implied. I do not think item response
> error and measurement error are interchangeable, but it really
> depends exactly what you mean.
Sorry Steve, I used the phrase rather clumsily in the context of being able
to assess, with no error, a person's ability. In essence, a Guttman scale.
Michell (2004, p. 126) put it nicely (Michell, J. (2004) Item Response
Models, pathological science, and the shape of error. Theory and Psychology,
14, 1, 121-129.) ...
"Now, if a person's correct response to an item depended solely on ability,
with no random 'error' component involved, one would only learn the ordinal
fact that that person's ability at least matches the difficulty level of the
item. Item response modellers derive all quantitative information (as
distinct from merely ordinal) from the distributional properties of the
random 'error' component. If the model is true, the shape of the 'error'
distribution reflects the quantitative structure of the attribute, but if
the attribute is not quantitative, the supposed shape of 'error' only
projects the image of a fictitious quantitivity. Here, as elsewhere,
psychometricians derive what they want most (measures) from what they know
least (the shape of 'error') by presuming to already know it.
If the random 'error' concept is retained, but it is admitted that the shape
of these 'errors' is unknown, then at best only ordinal relationships
between people (or items) follow from test performances (Grayson, 1988)
unless the cancellation conditions alluded to above (namely double
cancellation, triple cancellation, etc.) obtain."
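The double cancellation condition mentioned in the quotation can be checked mechanically on a table of ordered outcomes. What follows is a rough sketch, not a full test of additivity: the matrices are invented, `satisfies_double_cancellation` is a hypothetical helper, and the condition is taken in the form "if (a, y) >= (b, x) and (b, z) >= (c, y), then (a, z) >= (c, x)" for rows a, b, c and columns x, y, z.

```python
from itertools import product


def satisfies_double_cancellation(m):
    """Check double cancellation over every choice of rows a, b, c and columns x, y, z."""
    rows = range(len(m))
    cols = range(len(m[0]))
    for a, b, c in product(rows, repeat=3):
        for x, y, z in product(cols, repeat=3):
            premise = m[a][y] >= m[b][x] and m[b][z] >= m[c][y]
            if premise and not m[a][z] >= m[c][x]:
                return False
    return True


# Hypothetical additive table: each cell is a row effect plus a column effect.
ROW_EFFECTS = [0, 1, 3]
COLUMN_EFFECTS = [0, 2, 7]
ADDITIVE = [[r + c for c in COLUMN_EFFECTS] for r in ROW_EFFECTS]

# Hypothetical table built to violate the condition
# (rows 0, 1, 2 as a, b, c and columns 0, 1, 2 as x, y, z).
VIOLATING = [
    [0, 5, 1],
    [4, 0, 6],
    [2, 5, 0],
]

print("additive table passes:", satisfies_double_cancellation(ADDITIVE))
print("violating table passes:", satisfies_double_cancellation(VIOLATING))
```

Passing the check is necessary for an additive representation but does not establish one by itself, which is why the quotation goes on to mention triple cancellation and higher-order conditions.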
Ben Wright chastised me back in 1999 for creating a test of the Rasch model
which failed. I was trying to recover an underlying quantitative attribute
by measuring a variety of different-length objects with a "bad" (non-linear,
ordinal-unit) ruler; all that happened was that the Rasch model recovered
the ordinal units as linear ones - big surprise! The fault, evidently, was
that I had not incorporated sufficient random error in my observations.
That "you need more error", coupled with Robert Wood's article (Wood, R.
(1978) Fitting the Rasch model - a heady tale. British Journal of
Mathematical and Statistical Psychology, 31, 27-32) and finally Michell's
later expositions, convinced me that IRT was just another way of
statistically modeling item responses.
>
> I think there is a larger issue, namely that a theory should
> translate to substantive quantitative relations, not a purely
> algebraic model. I take quantities like length and mass to be
> real, and physical theory, definition and law as referring
> directly to the quantities and their relations.
>
> No doubt, I largely agree with what you're trying to say, but it
> seems I differ with you regarding the most profitable place to
> start looking for a way to better understand what's done now and
> how to attack the problem of measuring posited psychological
> quantities.
>
Yes, I think we probably do agree on most things ... where we differ perhaps
is how we consider what might be referred to as the substantive issue of
whether any "psychological" attribute can be measurable at all, from a
"first principle" perspective.
I simply cannot conceive of any kind of attribute (personality, values,
temperament, motivation, ability etc.) where a standard unit could ever be
maintained by an adaptive self-organizing biological system. And it is that
phrase "self-organizing adaptive system" which I think separates my view of
things from probably many others on this list, and many quantitative
psychologists. I'm not sure any non-physical feature of such a system can be
isolated entirely from other interconnected parts, in such a way that
controlled manipulations of one particular feature can be undertaken in
order to establish additivity of some unit. The sheer magnitude of that
"self-organizing" function we are dealing with is given in this article from
2007 ..
http://www.newscientist.com/article/dn12301-man-with-tiny-brain-shocks-doctors.html
I know, a one-off, a fluke, but as a scientist it speaks volumes to me about
the nature of the system whose outputs I am intending to understand and
"measure". The Lancet paper is available online at
http://download.thelancet.com/pdfs/journals/lancet/PIIS0140673607611271.pdf
But, we know that we can loosely capture variations in outputs from such a
system, and these work pretty well for many practical purposes in some cases
(performance measures of various kinds).
So, I prefer to explore what might be done more creatively with "good
enough/fuzzy" assessments rather than concentrate on trying to increase
"precision" where none may be found in reality.
I can't claim I'm correct in my thinking; it's based more upon my view of
"how humans function" (fed by complex/adaptive systems theory) than an
adherence to an abstract measurement theory.
Steve Blinkhorn recently offered this construal of how people might be
answering personality questionnaire items (on the Psychometrics Forum
listserv on Linked-In) ... it is not offered here as a "rigorous theory"
from Steve, more of an armchair muse really, but I do find it "interesting"
as again, it meshes with a broader view of an integrated adaptive neural
system at work ..
"Meanwhile, more or less totally ignored was the question, what is going on
when a person generates an answer to an item? This is particularly apposite
when considering non-cognitive tests. Do I have a preference for a lonely
cottage in the woods over a busy seaside town before you require me to
express one? How much are you accessing the dimensions of my mind, and how
much forcing me to respond to the dimensions of yours? Why do you not
provide both the get-out options of "neither" and "in between"?
Take an analogy from quantum mechanics, and suppose for a moment that minds
hover in wave-function like states of quantum superposition until a test
item comes along and causes a collapse. So you can be both introvert and
extravert at the same time, but because we feed back information from our
own behaviour to attempt to create consistency, test items don't act just as
indicators of consistency, they induce it. Just an alternative to Paul's
little rulers in the head."
Me? I'm caught in a world where practical concerns/profits require more
accurate predictions of outcomes from any assessments we can devise, yet am
aware that precision of measurement may not be a realizable feature of any
assessment, not because we lack the brains to utilise measurement models,
but because the system under examination can create its own "internally
generated" cause on-the-fly (not just responding passively to external
stimuli), and is complex (in terms of massively interconnected,
self-organizing, adaptive neural networks). I've always been puzzled by
trying to answer the question "if we could measure an attribute
quantitatively, to a substantive degree of precision (say to within 1
decimal place of a unit), what would be required from a human to be able
to sustain that accuracy?"
My feeling is that those who work in edumetrics tend to forget that working
solely with scholastic and performance-based assessments is hugely different
from trying to assess those features of "being human" which are probably the
most fundamental aspects of the science of psychology.
However, maybe this (educational attainment) is where attempts at
"quantitative measurement" may work best - where attainment of a very
specific outcome is at stake, and not assessment of a feature/attribute of
human psychology like "religiosity" or "propensity to commit a violent act"
for example?
On emergent, self-organising systems, I was a big fan of this work in general, and of Stuart Kauffman's in particular, in the 90s. I think we gain some real insights. As you may well know, Kauffman later worked with a physicist and tried to apply the thinking to physics. The "primitives" of a system, as John Holland calls them, are very simple, and the rules/laws governing their interactions can be of a basic quantitative nature. Indeed, in chemistry with autocatalytic sets, that is precisely the nature of the interactions. Now, I think complex adaptive systems in general are useful for providing broad insights. As yet, for example, nobody has created life using the insights. They may, or may not. I think not, but I won't go into the reasons here.
Suffice to say that I do agree we can gain some insights, and they may concern exactly what is measurable. Is temperature an emergent phenomenon? Arguably it is, albeit not a self-organising one considered in isolation.
You say:
"My feeling is that those who work in edumetrics tend to forget that working
solely with scholastic and performance-based assessments is hugely different
from trying to assess those features of "being human" which are probably the
most fundamental aspects of the science of psychology."
Yes, I think that is generally very true. I'm skeptical about measuring attributes, but mostly not for reasons given by Joel or because of issues with distributions of measurement error.
Steve
________________________________________
From: talking-m...@googlegroups.com [talking-m...@googlegroups.com] On Behalf Of Paul Barrett [pa...@pbarrett.net]
Sent: Tuesday, 29 March 2011 4:50 AM
A: How does one do this precisely without theory? I would argue that theory, however crude and incomplete, precedes experiment. I agree with you when you say that science is the interplay between theory and experiment, but I cannot see how that logically entails that no more theory is needed in the behavioural sciences.
G. My point is NOT that we need no theory at all, but that we already have enough theory to get started. This does not imply that more theory along the road is unnecessary. It also does not imply that we must be successful in our endeavour. Most physical phenomena and laws were discovered through countless rounds of trial and error in the laboratory, and much of this research ended nowhere and has gone unreported. In short, Faraday the experimenter should be as much our model as Maxwell the theorist. Only by forgetting Faraday could Rasch be so optimistic about measurability in psychology.
I’m trying to put myself here in the position of someone who believes that psychological attributes are measurable. Hence, if one does believe this, which I don’t, one should roll up one’s sleeves and withdraw to the lab (or to the dungeon, as Denny put it) in order to return with positive or negative results. Consider, for example, Faraday’s "Experimental Researches in Electricity" (1). Without it no Maxwell would have been possible. Faraday indeed spent all his life in the lab. Science is not always fun and sunshine.
One can only demonstrate that psychological attributes are measurable by actually measuring them. A positive result would put an immediate end to just ‘talking measurement’ and lead on to ‘doing measurement’. Of course, some people believe that psychological attributes are already measured, but they don’t seem to be around here.
G
(1) http://www.archive.org/details/experimentalres01faragoog
-----Original Message-----
From: talking-m...@googlegroups.com on behalf of Andrew Kyngdon
Sent: Sat 26.03.2011 11:45
To: talking-m...@googlegroups.com
Subject: RE: [talking-measurement] The Rasch model and Psychological Measurement
G,
Andrew
Hi Steve,
Josh wrote:
Regards,
Guenter
--