Why to measure?


aaro

Mar 15, 2011, 2:57:04 PM
to talking-m...@googlegroups.com
Hi All,

There is a question that has bothered me for some time. Over the last 4-5 years I have tried to understand what is wrong with the methodology of psychology today, and I have ended up with the understanding that there seems to be nothing right. (Some of the reasons have been provided, among others, by the followers of this discussion group; I have proposed some more and extended the criticism to modern qualitative approaches as well.) The issue of measurement is definitely one that needs to be understood more deeply.

A scientist, I believe, should ask four general questions at the beginning of any study:
1. What do I want to know? What is my research question?
2. Why do I want to answer that question?
3. By what methods can I find the answer?
4. Do the answers to the first three questions make a coherent whole?

This forum is dedicated to the following: "A core focus is the state of measurement in the social sciences. Why are disciplines such as Psychology, Education, Sociology and Economics only considered to have 'soft measurement'. What we can do to change this?"

So, there seem to be some questions we are trying to answer: What is measurement? Are psychological attributes measurable? Can we improve measurement in psychology and other "soft" sciences?

Now the second question should emerge--Why do we want these questions to be answered? WHY TO MEASURE? I suppose there is more than one answer to this question. If so, the answers to the "research questions" of this discussion group may differ accordingly. So far in this discussion group, however, one aim of measurement seems to be implicitly assumed. It may turn out that the methods for answering the measurement questions do not correspond to the aims of measurement. In that case only confusion will arise.

I think there are at least five reasons why measurement is used/ pretended to be used in psychology:
1. "Real sciences" measure and psychology must look like/ is a real science. This position may be common, but maybe not very meaningful.
2. Everybody else is measuring. Here two forces are operative. First, universities usually teach research methodology as if quantitative data analysis is the scientific method. And second, publishing is also easier. Here the reasons are nonscientific, thus.
3. Psychology lacks better methods for organizing massive amounts of information. Until better methods are discovered, statistical data analysis, which requires "measurement," should be used. This position was quite explicitly taken by founders of statistical data analysis in psychology and other sciences--Karl Pearson, Louis Thurstone, among others.
4. Measurement and subsequent statistical data manipulation help to predict future states and events beyond chance.
5. Measurement and quantitative data analysis help to reveal the mechanisms of the studied phenomena, psyche in psychology.

Each of these reasons has, I think, a different relationship to understanding what measurement is and how it can be applied.

Altogether, I have two related questions to the group:
I. Is the list of reasons why psychologists want to measure complete? Are there more reasons? Or maybe some should be excluded?
II. Which of the reasons do you imply when discussing the theory of measurement?

My impression is that the issues discussed so far relate to one or the other of the reasons provided. Depending on the reasons, however, the same questions about measurement and measurability have different answers. It might be interesting to discuss these relationships in more detail.

With best regards

Aaro

(Aaro Toomela
Institute of Psychology
Tallinn University
Tallinn, Estonia)


Andrew Kyngdon

Mar 15, 2011, 8:51:29 PM
to talking-m...@googlegroups.com

Aaro,

Welcome to Talking Measurement and thank you for posting.

There are many points in your post that are worthy of discussion, but I think your first list of questions goes to the core of the problem.

Science, in my view, is best defined as the critical investigation of natural systems, with the primary aim being the development of descriptive theories of these systems. Such investigation needs to be critical as human beings are fallible thinkers and our perceptual systems are quite limited. We find it difficult to understand the behaviour of even simple quantities like length, for example, without sophisticated measurement apparatus, such as length measurement devices based on laser technology. Hence our hypotheses and observations made of natural systems, and our deductions from them, could be wrong and the best way to identify and rectify such errors is through being critical of our own perceptions, thoughts and observational apparatus.

Psychological systems are natural systems. Hence if behavioural scientists want to consider themselves to be scientists, then their research activity should focus on the critical investigation of psychological systems. Psychologists, above all else, should be trying to develop descriptive theories of psychological systems.

Psychologists generally believe that psychological attributes and systems are quantitative. Now, it must be stressed that this is a coherent hypothesis. There is nothing logically problematic about it. But like all hypotheses, it must be subject to critical investigation and tested. Without this we have no way of knowing if the hypothesis is true or even plausible. By and large, psychologists have not tested this hypothesis and have instead assumed it to be true. This is the nub of Joel Michell’s (1997; 1999) criticism of psychometrics – the central hypothesis of the field simply has not been subject to critical scrutiny. Hence there is very little compelling evidence in support of the hypothesis, and therefore we cannot reasonably assume that psychological attributes are measurable, continuous quantities.

Why psychologists want to measure has complex historical causes that have been discussed by Michell (1997; 1999). I believe that the first of your five listed reasons is accurate. Psychologists must compete against the “hard” sciences for research funding and they believe, probably correctly, that funding bodies will be more inclined to make grants if it is perceived that psychologists engage in real scientific measurement. Another related problem is that psychological testing is an established, global industry worth billions. Educational testing in the US alone is a multi-billion dollar industry. Much of the business conducted within the testing industry is based on the hypothesis that tests are instruments of scientific measurement. If Testing Company X argues that their tests just order students with respect to their cognitive abilities, but Company Y argues that their tests “measure” such abilities, it is a safe bet that Company Y will win the tender or contract. Hence the testing companies have a vested, non-scientific interest in maintaining that psychological attributes are measurable. Do not underestimate how strong this can be. I criticised a testing company on another forum for not being interested in foundational issues in measurement (as testing companies in general are not), and was censured for being “uncivilised”, despite the fact that I did not engage in slander, libel or personal vilification.

The third of your reasons is also pertinent. Practicalism is also a reason why the hypothesis of psychological quantities has not been critically investigated. Students' abilities need to be assessed and tests are useful in this regard. They seem to do the job and so, the reasoning goes, psychological tests must be measuring something. However, practical concerns are logically indifferent to scientific ones, so the practical value of tests has no logical bearing upon the truth of the hypothesis of psychological quantities. But one need not argue that tests measure anything for tests to provide useful information regarding cognitive abilities. In most instances ordering students with respect to their abilities is enough in practical terms and it is fair to say that tests actually achieve this. Cliff & Keats (1996) devoted a whole book to this issue. Moreover, there is a whole field of psychometrics now devoted to nonparametric Item Response Theories in which cognitive abilities are assumed to be ordinal.

In my opinion, only a few psychological attributes seem to be plausibly measurable, such as the utility of incremental gains and losses under conditions of risk or uncertainty. But even in this field, where a notable emphasis is placed on theories which describe choice behaviour, there are problems and lapses in critical thinking. Ultimately, there is no “silver bullet” to the issue of scientific measurement in psychology, but I think some progress might be achieved if psychometricians start trying to devise descriptive theories of cognitive systems and of item response processes. This is not an easy task, but one which I feel can be done if thought is put to it.

Cheers,

Andrew

P.S. Talking Measurement subscribers might be interested to know that I have my own website up and running. I am happy to link it to other Talking Measurement members’ websites if I can get a link in return.

Andrew Kyngdon, PhD

MetaMetrics, Inc.

www.lexile.com

My website: Dr Andrew Kyngdon

--
You received this message because you are subscribed to the Google Groups "Talking Measurement" group.
To post to this group, send email to talking-m...@googlegroups.com.
To unsubscribe from this group, send email to talking-measure...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/talking-measurement?hl=en.

Denny Borsboom

Mar 16, 2011, 5:19:13 AM
to talking-m...@googlegroups.com, aaro
Hi Aaro,

I think Andrew said it quite right. As you stated in your email, and
as Andrew outlined in his response, there is a large set of factors,
both scientific and extra-scientific, that push the measurement boat.
Everybody wants to be associated with measurement because it brings
the credentials of precision, prediction, and control.

I also share Andrew's evaluation that, at least when one adheres to
the strict definition of measurement common in this forum (the
determination of the ratio between two magnitudes of the same
continuous quantitative attribute, one of which functions as a unit),
measurement is largely out of the question in psychology. The primary
reason being that psychological attributes are not continuous and have
complex internal structures so they aren't lines or isomorphic to
lines; and things that aren't (isomorphic to) lines (real numbers)
aren't measurable according to the strict doctrine (i.e., the
Michell/Holder line of thinking).
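
The strict definition above can be illustrated with a minimal sketch.
It assumes hypothetical rod lengths and test scores (none of these
numbers come from the thread) and only shows what the definition
requires: the ratio of two magnitudes is the same whatever unit is
chosen, whereas an order-preserving recoding of scores keeps the
ranking but changes the ratios.

```python
import math


def measure(magnitude, unit):
    """Numerical value of a magnitude relative to a unit of the same attribute."""
    return magnitude / unit


def ranks(values):
    """Return the position of each value when sorted in ascending order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order):
        result[index] = position
    return result


# Hypothetical lengths, written in metres purely for illustration.
rod_a, rod_b = 3.0, 1.5

# The same two rods measured with two different units (metre, foot).
in_metres = (measure(rod_a, 1.0), measure(rod_b, 1.0))
in_feet = (measure(rod_a, 0.3048), measure(rod_b, 0.3048))

# The ratio between the two magnitudes does not depend on the unit.
ratio_metres = in_metres[0] / in_metres[1]
ratio_feet = in_feet[0] / in_feet[1]

# Hypothetical test scores and an order-preserving recoding of them.
scores = [10, 20, 40]
recoded = [math.sqrt(s) for s in scores]

# The ranking survives the recoding; the ratios between scores do not.
same_order = ranks(scores) == ranks(recoded)
ratio_scores = scores[1] / scores[0]
ratio_recoded = recoded[1] / recoded[0]

print(ratio_metres, ratio_feet)
print(same_order, ratio_scores, ratio_recoded)
```

If only the order of the scores is defensible, any such recoding is as
good as the original, which is why ratios of raw scores cannot be read
as ratios of magnitudes without further evidence.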

Personally, I have become less and less convinced that it makes sense
to set up tests of the assumption of continuous quantity because it
appears to be so very unlikely that candidate attributes like
intelligence, personality traits, attitudes, and psychopathological
disorders are in fact continuous quantities. Indeed, as I am writing
this, the very idea strikes me as almost ridiculous. Given the a
priori implausibility of the continuous quantity assumption, it is by
the way remarkable that some psychometric practices that appear to be
based on it (e.g., certain IRT applications, adaptive testing
routines, etc.) perform so well.

Having said that, I think that psychology and psychometrics cover much
more terrain than the strict definition of measurement. For instance
psychometrics includes categorical structures (latent class and latent
profile models), ordinal structures (nonparametric IRT), complex
models (e.g., multidimensional and network models), and purely
descriptive systems that have no measurement pretension whatsoever
(e.g., many kinds of MDS, components analysis, etc.).

One can probably however put the same questions you asked about
measurement before the use of these techniques. Why do people use
formal/mathematical/statistical methods at all?

Best
Denny


--
Denny Borsboom
Department of Psychology
University of Amsterdam
Roetersstraat 15
1018 WB Amsterdam
The Netherlands
phone: +31 20 525 6882
email: d.bor...@uva.nl
homepage: http://users.fmg.uva.nl/dborsboom

Stephen Humphry

Mar 16, 2011, 7:06:16 AM
to talking-m...@googlegroups.com
Hi Aaro,

Very nice post. Thanks. I agree it is the clear question that emerges--why measure? I agree with most of your answers, and I agree most of the discussion on this group focuses on those things.

To take each question one at a time.

I think there are at least five reasons why measurement is used/ pretended to be used in psychology:
1. "Real sciences" measure and psychology must look like/ is a real science. This position may be common, but maybe not very meaningful.

Steve: I would just qualify this as "real quantitative sciences" involve measurement and so on. I totally agree: the immense success of physics in particular, and of the many natural scientific disciplines that use physical measurements, set up such an expectation of what a successful science should look like that it drew people in. This is quite clear from the history, and formative figures like Thurstone, Weber, Wundt, Fechner had backgrounds in physics (or were physicists). It is a shame because so much of science doesn't involve the application of mathematics that presupposes measurements like r kg, r m, r s, r A (where r is a real number and kg etc. are the SI units). However, it is in my view clearly worth pursuing quantitative approaches in psychophysics, and I consider it an open question with respect to many other attributes and phenomena considered the focus of psychology.

2. Everybody else is measuring. Here two forces are operative. First, universities usually teach research methodology as if quantitative data analysis is the scientific method. And second, publishing is also easier. Here the reasons are nonscientific, thus.

Steve: Again, I essentially agree.

3. Psychology lacks better methods for organizing massive amounts of information. Until better methods are discovered, statistical data analysis, which requires "measurement," should be used. This position was quite explicitly taken by founders of statistical data analysis in psychology and other sciences--Karl Pearson, Louis Thurstone, among others.

Steve: I agree about the position being taken by these people and others. This is without question. I am not at all convinced that statistical data analysis requiring measurement (most in practice, including most applications of GLMs) should be used. Indeed, I would argue that the almost unbridled use of statistics takes away massive resources and talent from questions, including how we can more realistically use some mathematics (not involving real numbers and real number arithmetic).

4. Measurement and subsequent statistical data manipulation help to predict future states and events beyond chance.

Steve: Yes, beyond chance. That doesn't mean a lot, as you may well agree (?)

5. Measurement and quantitative data analysis help to reveal the mechanisms of the studied phenomena, psyche in psychology.

Steve: Here, I'm a little confused. They would if they had a foundation like that in physics, but for the most part, they don't. I agree this is the hope, and to some degree certainly we learn some things, but in a highly confused way, and for the most part dressed up as so much more than it is.

Great post.

Regards,

Steve



aaro

Mar 17, 2011, 12:36:21 PM
to talking-m...@googlegroups.com
Hi,

Andrew, Denny, and Steve -- Thank you for your comments. It seems you agree that the list of reasons I provided for why measurement is intended in psychology is meaningful. No additional reasons emerged in your replies either. So, I think it is time to move on to discussing the consequences of the reasons why measurement is pretended or aimed at in psychology. Denny actually jumped into the deeper problem I also had in mind--he asks, "One can probably however put the same questions you asked about measurement before the use of these techniques. Why do people use formal/mathematical/statistical methods at all?" The question of measurement and the question of formal/mathematical (definitely not only statistical) methods are closely related indeed.

I propose for further discussion some thoughts that are related to each of the reasons for measurement (and use of mathematical methods).

1. "Real sciences" measure and psychology must look like/ is a real science.

Psychology can and should attempt measuring and mathematical methods if the principles and laws of "real quantitative (as Steve correctly specified) science" can be attributed to the mind as well. If psychological regularities can be reduced to physical ones, this approach would be meaningful. But I also think that there is enough evidence to reject reductionism; the world of psyche is not covered by physical laws (I am not opposing mind to the physical world here; I suggest that there is a psychic part of the physical universe which is described by a set of principles that does not apply to all of the physical world). Therefore the science of psyche should reject measurement and mathematical methods; by becoming a "real quantitative science" psychology ceases to be psychology.

It also follows that the true essence of measurement is not important here--whatever is done to make psychology into quantitative non-psychology--the result is the same; phenomenon that is supposed to be studied is lost in the process anyway.

Another issue is about money, respect and other extra-scientific reasons related to this reason (and agreed by Andrew, Denny, and Steve as well). Here extrascientific issues become essentially antiscientific because understanding of mind is pushed to become a science that cannot understand mind in principle.


2. Everybody else is measuring. Here two forces are operative. First,
universities usually teach research methodology as if quantitative data
analysis is the scientific method. And second, publishing is also easier.
Here the reasons are nonscientific, thus.

Clearly extra-scientific issues that, again, turn out to be antiscientific in essence: There still is a possibility that measurement and mathematics can be applied to the study of psyche without being reductionist--perhaps just other mathematical methods are needed. But in that case there should be clear and justified grounds for deciding what this novel kind of mathematics and measurement is. Such grounds do not exist, so the mathematical game of science is played by those who decide what can be published and what is "nonscientific," i.e., nonquantitative.

Here, also, measurement is not an issue; it is a game of science and it is not important whether something is measured or "measured" in this game, it is still only a pretend-play of science.


3. Psychology lacks better methods for organizing massive amounts of
information. Until better methods are discovered, statistical data
analysis, which requires "measurement," should be used. This position was
quite explicitly taken by founders of statistical data analysis in
psychology and other sciences--Karl Pearson, Louis Thurstone, among others.

This aim seems to be reasonable, but interesting consequences--also implied by Steve--follow. The number of ways in which events and relationships between events can be expressed mathematically is not constrained in principle. There is always a possibility to create another model and yet another. If we really go this way and build more and more models, the aim of organizing information is lost; we do not need, for this purpose, a million models of ten observations. We would perhaps need one. Many mathematical tools become useless this way, as Steve also suggested.

This purpose also does not need any measurement in the strict sense, because it is explicitly understood that mathematics is used just for organizing the amount of information without making any claims as to the mechanisms or causes why such organization emerges. Likert scale or any attribution of numbers to events becomes acceptable under this purpose.

4. Measurement and subsequent statistical data manipulation help to predict
future states and events beyond chance.

Steve suggested that this aim is not very meaningful. In fact, for me it is the only meaning quantitative data analysis can provide--we can make pragmatic decisions without knowing the mechanisms that connect the events in the world. But again, there is no need to "measure", if we can predict numbers by numbers and translate these numbers back to the possibility/ probability of a future state of affairs, we have got what we aimed at.

It might seem that science can learn from such prediction because prediction helps to constrain the areas where to look for possible mechanisms and understanding of the studied phenomena. I have thought about this possibility a lot and my impression is that such prediction is not helpful: there are too many possible ways to predict a probability of some event without involving any information that would be necessary for explanation. But maybe I am wrong here.


5. Measurement and quantitative data analysis help to reveal the mechanisms
of the studied phenomena, psyche in psychology.

Steve suggests that it is the hope. I would also add that under this purpose the question of measurement becomes truly important. If psychic attributes cannot be measured, a large set of mathematical methods is immediately ruled out as well, including almost all the methods used in psychology today. My impression is that everybody who bothers to ask the question--are psychic attributes measurable?--ends up with the answer--No.

Andrew proposed a candidate for a measurable attribute--the utility of incremental gains and losses under conditions of risk and uncertainty. I am not sure this attribute is actually psychic. Here an interesting question of the definition of psyche emerges, but this is another issue.

So, what is left over is the possibility that perhaps some other mathematical methods could be used, methods that do not require measurement. This is again a question that would need a separate discussion, but briefly--I think no mathematical method can help to reveal the mechanisms of psyche in principle, because in mathematics concrete real-world phenomena are fitted to a priori defined axioms; in the mathematical approach, thus, the decision about the possible ways to understand is made before the study of the phenomenon; it is just another version of reductionism (I have discussed this issue in more detail elsewhere).

CONCLUSIONS
Four out of the five reasons why measurement is attempted in psychology do not require measurement in the strict sense. There is not even a need to ask whether the studied phenomena are measured or whether observed events are just coded into numbers. Of course, no understanding of the studied phenomenon--psyche--beyond probabilistic prediction can be achieved either. Quantitative methods can be useful here as tools for formalizing prediction. At the moment we already have too many such tools; most of them are simply unnecessary.

The fifth aim--understanding of the mechanisms--must take the issue of measurement seriously. The result is quite clear--psychic attributes cannot be measured. This leads to the next point: mathematical methods commonly used in psychology today can be rejected, as they would actually require measurement if the aim is to go beyond prediction. As the whole application of mathematics to the study of psyche can be essentially reductionist, perhaps it is time to reject mathematics as well?

best

Aaro



Josh McGrane

Mar 18, 2011, 4:36:58 AM
to talking-m...@googlegroups.com
Hi Aaro,

Thanks for your interesting posts. It seems that there is much agreement on the forum, amongst the vocal minority at least, regarding your comments on the extra-scientific motivations for and implications of 'measurement' in psychology. Your latest post seems to come to a much stronger conclusion. I am wondering if you can elaborate on the premise for your conclusion?

I will cut and paste just so my post/question is clear for everyone. You conclude:

The fifth aim--understanding of the mechanisms--must take the issue of measurement seriously. The result is quite clear--psychic attributes cannot be measured. 

This seems to be based upon the premise:

Psychology can and should attempt measuring and mathematical methods if the principles and laws of "real quantitative (as Steve correctly specified) science" can be attributed to the mind as well. If psychological regularities can be reduced to physical ones, this approach would be meaningful. But I also think that there is enough evidence to reject reductionism; the world of psyche is not covered by physical laws (I am not opposing mind to the physical world here; I suggest that there is a psychic part of the physical universe which is described by a set of principles that does not apply to all of the physical world). Therefore the science of psyche should reject measurement and mathematical methods; by becoming a "real quantitative science" psychology ceases to be psychology.

Which was complemented by this statement:

Andrew proposed a candidate for a measurable attribute--the utility of incremental gains and losses under conditions of risk and uncertainty. I am not sure this attribute is actually psychic. Here an interesting question of the definition of psyche emerges, but this is another issue.

Can you please clarify what you take the content of psychology to be? 

Further, it may well be the case that the 'psychic part of the physical universe' is not, at least in part, explained by existing physical theory/laws (although I believe this is an open, empirical question), but why does it necessarily follow that the establishment of quantitative psycho-physical (for want of a better term) theory/laws is not possible, therefore precluding the possibility of psychological measurement? If I am misreading you, then please let me know.


So, to digress a little, as far as I can tell from your post and other publications (http://www.frontiersin.org/quantitative_psychology_and_measurement/10.3389/fpsyg.2010.00029/abstract), you emphasise a structural-systemic epistemology. But such an epistemology seems to me consistent with the way that measurement was actually established in the physical sciences, i.e., hypothesising and confirming physical properties and the types of relations between them across a wide range of physical phenomena and building this into a coherent body of substantive theory that is the foundation (not mathematical as it is often presented in psychology) of the international system of measurement. I would argue that the possibility of psychological measurement can only be established (or not) by such research, and certainly not by fiat (which I have been somewhat guilty of in the past). 

Thus, the most pertinent question for me, as is alluded to in the description for this forum that you quoted in your earlier post, is how do we commence such research? This entails other questions like; What exactly do I mean by 'such research'? What will the content of 'such research' be? Is there any existing psychological theory that has already commenced 'such research'? Is there any existing physical theory that is relevant to 'such research', analogously or otherwise? Hopefully, by way of these discussions and others, we can get away from postulation and begin the necessary research.

I don't see a distinction between wanting to measure and wanting to systematically understand, because the only real path to scientific measurement is via systematic understanding. It may well be that attempts at such understanding are best served by a healthy skepticism toward the possibility of psychological measurement, encouraging critical reflection, etc. But, if such an understanding reveals the possibility for psychological measurement, then why wouldn't we? If it turns out improbable, then that is an important addition to our systematic understanding.

I look forward to your (and others') replies.

Cheers,
Josh



Trendler, Guenter

Mar 18, 2011, 8:36:59 AM
to talking-m...@googlegroups.com
Josh:

“Thus, the most pertinent question for me, as is alluded to in the description for this forum that you quoted in your earlier post, is how do we commence such research? This entails other questions like; What exactly do I mean by 'such research'? What will the content of 'such research' be? Is there any existing psychological theory that has already commenced 'such research'? Is there any existing physical theory that is relevant to 'such research', analogously or otherwise? Hopefully, by way of these discussions and others, we can get away from postulation and begin the necessary research.

I don't see a distinction between wanting to measure and wanting to systematically understand, because the only real path to scientific measurement is via systematic understanding. It may well be that attempts at such understanding are best served by a healthy skepticism toward the possibility of psychological measurement, encouraging critical reflection, etc. But, if such an understanding reveals the possibility for psychological measurement, then why wouldn't we? If it turns out improbable, then that is an important addition to our systematic understanding.”

Hi Josh,

I very much welcome your call for less postulation and more focus on research. But has ‘such research’ not already commenced? How about the Rasch model? It is a psychological theory, it conforms to the classical definition of measurement, and Rasch himself already made the connection to the relevant physical theory (e.g. laws of ideal gases). I also assume that most Rasch people believe that they can (do) already measure psychological attributes at least to some extent. I’m a bit surprised that this view has not yet been articulated here. Maybe simply because there are no 'such Rasch people’ present here?
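
For readers unfamiliar with it, here is a minimal sketch of the dichotomous Rasch model, assuming its standard logistic form, in which the probability of a correct response depends only on the difference between person ability and item difficulty. The function names and parameter values are hypothetical illustrations, not taken from any data set. The sketch shows the property usually cited in support of the measurement claim: the log-odds gap between two items is the same for every person.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Hypothetical person abilities and item difficulties (in logits).
abilities = [-1.0, 0.0, 2.0]
easy_item, hard_item = -0.5, 1.0

# For each person, compare the two items on the log-odds scale.
gaps = [
    log_odds(rasch_probability(a, easy_item))
    - log_odds(rasch_probability(a, hard_item))
    for a in abilities
]

# Under the model every gap equals hard_item - easy_item,
# regardless of which person is used for the comparison.
print(gaps)
```

Whether real response data satisfy this invariance is an empirical matter; the sketch only states what the model asserts, not that any test conforms to it.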

Guenter



aaro

Mar 18, 2011, 3:06:42 PM
to talking-m...@googlegroups.com
Hi Josh,


On Friday, March 18, 2011 10:36:58 AM UTC+2, Josh McGrane wrote:
It seems that there is much agreement on the forum, amongst the vocal minority at least, regarding your comments on the extra-scientific motivations for and implications of 'measurement' in psychology. Your latest post seems to come to a much stronger conclusion.

Yes, I do believe that the reasons why measurement is applied are directly connected to the way 'measurement' (and the use of mathematical tools in general) is understood. And strong conclusions emerge from this.
 
Can you please clarify what you take the content of psychology to be? 

You are right, sooner or later the discussion should arrive at this question as well. This is because the methods of a science are not separate from theories of the studied phenomenon or thing; method is an essential part of the theory. I have defined psyche--the object of study of psychology--as follows: Psyche is a system of processes that, on the basis of individual experiences, organizes behavior with the aim of maintaining equilibrium of the organism as a whole in a changing environment.

I think one aspect of this definition is especially important in the context of our discussion: the notion of 'system.' There were several systems theories already half a century ago, and these theories defined what a system is in different ways. My impression is that today systems are usually understood as composed of variables. According to the approach I have tried to follow and elaborate (which I have called structural-systemic in order to distinguish it from other systems approaches), what the sciences should aim at is not systems of variables; the real world is not composed of variables but of real physical components that sometimes (but not necessarily always) can in some limited sense be described as levels of variables. In order to understand the system studied, it does not even matter whether the elements of a system can be represented as levels of a variable; what matters is the identification of the components and of the specific relationships between the components of the studied system. In other words, the aim of the structural-systemic approach is to identify the components and the relationships between the components of the studied whole.

All this is related to your next question:

Further, it may well be the case that the 'psychic part of the physical universe' is not, at least in part, explained by existing physical theory/laws (although I believe this is an open, empirical question), but why does it necessarily follow that the establishment of quantitative psycho-physical (for want of a better term) theory/laws is not possible, therefore precluding the possibility of psychological measurement? If I am misreading you, then please let me know.

No, you are not misreading me. The structural-systemic epistemology leads to one of the reasons why I think mathematics cannot be used as a tool for discovering what is searched for: the definition of the components is a qualitative description, and so is the definition of the relationships between the components. So my point is that mathematics cannot be used for discovering a structural description of the psyche, even though many psychologists today seem to believe (contrary to Thurstone, who explicitly denied this possibility) that they are discovering the structure of personality, for instance, with the help of factor analysis. Mathematics is a formal language whose content and properties are defined a priori; there is therefore no way to discover qualitative novelty with mathematical tools.

We can build mathematical models a posteriori, of course. But then a question emerges: what is really represented in these models? Do they represent the structural-systemic explanation? I think not. We can, of course, represent qualitative aspects of the world by symbols and then manipulate these symbols in mathematical language, but in that case some "verbal" theory--a theory that links representations to experiences of the world--is absolutely needed for explaining what is meant by the symbols; no 'back-translation' from mathematics is possible when the verbal link is lost at some stage of building a mathematical theory. Now the question emerges: why is a mathematical description useful if it must stay connected to a 'verbal' theory? And my answer is--it is not useful. I do not say that no quality of mathematical description is useful; I think that mathematics has at least one very important characteristic, that of exactness. A good theory is defined in exact terms. The reasons why mathematics is, in my opinion, not useful are provided in my Frontiers article, which you also mentioned. In short, the structural-systemic approach aims at discovering the components of the system and the specific relationships between the components that were necessary for the emergence of the studied whole. If your definition of mathematics is not different from that of Luce and other mathematicians, structural description is not mathematical.
 

So, to digress a little, as far as I can tell from your post and other publications (http://www.frontiersin.org/quantitative_psychology_and_measurement/10.3389/fpsyg.2010.00029/abstract), you emphasise a structural-systemic epistemology. But such an epistemology seems to me consistent with the way that measurement was actually established in the physical sciences, i.e., hypothesising and confirming physical properties and the types of relations between them across a wide range of physical phenomena and building this into a coherent body of substantive theory that is the foundation (not mathematical as it is often presented in psychology) of the international system of measurement. I would argue that the possibility of psychological measurement can only be established (or not) by such research, and certainly not by fiat (which I have been somewhat guilty of in the past). 

Perhaps one more issue needs to be introduced. The notion of 'explanation' is important here. In structural-systemic epistemology, explanation is the structural description--the description of the components and the relationships between the components. If explanation is defined in that way, mathematical expression or measurement does not provide explanation. Why do I think this structural-systemic definition of explanation is worth following? Because it provides a scientist with a very good criterion of truth--a theory is good when it defines how the studied phenomenon can be made: not elicited to happen, but created, constructed. The "final" test of the theory is what I have called the "constructive experiment," an experiment in the course of which the studied phenomenon is created.

An important consequence follows from this--there is no mathematical formula that is sufficient for describing how to create the thing studied. But there are very many "verbal" theories that contain all the necessary information for conducting constructive experiments in all fields of science. Here is my weak point!!! If you provide a mathematical formula that is sufficient for a constructive experiment, I am wrong. But clearly we need to have the same definition of mathematics here.
 
In sum, mathematics is not necessary for psychology. One qualification is still in order. Mathematical language might be useful as some part of the structural-systemic theory because the world is continuous--there is only the physical world, part of which is further differentiated into the biological world explained by biological principles; part of the biological world, in turn, is further differentiated into the psychological world with psychological principles of explanation. Because of this continuity, quantitative aspects cannot be excluded from biological and psychological theories. For instance, the emergence of different (biological or psychological) wholes requires at least two components; so quantity is important. But not sufficient.

Finally, a note on Guenter's question:

I very much welcome your call for less postulation and more focus on research. But has ‘such research’ not already commenced? What about the Rasch model? It is a psychological theory, it conforms to the classical definition of measurement, and Rasch himself already made the connection to the relevant physical theory (e.g. the laws of ideal gases). I also assume that most Rasch people believe that they can (and do) already measure psychological attributes, at least to some extent. I’m a bit surprised that this view has not yet been articulated here. Maybe simply because there are no 'such Rasch people’ present here?

Can you construct--create de novo--any psychological phenomenon by relying only on the Rasch model "psychological theory"? I think science should aim at understanding; from the structural-systemic epistemology point of view, Rasch models are not explanatory. If you have another definition for explanation then please explain why and how your definition is more useful than structural-systemic?

Thank you all for your comments! They have definitely been thought-provoking for me.

Best

Aaro
 

Denny Borsboom

unread,
Mar 18, 2011, 4:33:23 PM3/18/11
to talking-m...@googlegroups.com, aaro
Hi Aaro,

several interesting points you make; I am sympathetic to many of the
questions you ask, some being identical or nearly so to questions I
have often asked myself.

However I do not understand why the issues you raise should be
specific to psychology. Variables are a source of headaches to all who
think but they are endemic to every field of enquiry that engages in
empirical tests; be it biology, medicine, physics, or economics.
Similarly, the idea of construction may be appealing, but we cannot
"construct de novo" gravitation, quantum probability, or the speed of
light. So what? In other words I don't see why your criticism wouldn't
apply to science at large.

In addition I do not see your point about mathematics. When it
functions within science, mathematics has no intrinsic semantics, that
is, what it's about is *always* fixed by a nonmathematical link. To
call that link "verbal" is far too limited; it is often ostensive or
pragmatic. It usually exceeds the linguistic resources by far. But
outside mathematics it is never the case that the formal system itself
contains its meaning. This is well known. If it is a basis for a
critique of psychology, then it is a basis for a critique of
science at large. In other words it is again not specific to
psychology.

Maybe it would help if you gave an indication of what you do consider
to be a good explanation from your standpoint. What is a good
explanation in your terms?

Best
Denny

> --
> You received this message because you are subscribed to the Google Groups
> "Talking Measurement" group.
> To post to this group, send email to talking-m...@googlegroups.com.
> To unsubscribe from this group, send email to
> talking-measure...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/talking-measurement?hl=en.
>

--

Andrew Kyngdon

unread,
Mar 18, 2011, 5:29:11 PM3/18/11
to talking-m...@googlegroups.com

Aaro,

 

You said:

 

“Andrew proposed a candidate for a measurable attribute--the utility of incremental gains and losses under conditions of risk and uncertainty. I am not sure this attribute is actually psychic. Here an interesting question of the definition of psyche emerges, but this is another issue.”

 

But then you say:

 

Psyche is a system of processes that, on the basis of individual experiences, organizes behavior with the aim of maintaining equilibrium of the organism as a whole in a changing environment.

 

I cannot see how decision making under risk or uncertainty is not a system of psychological processes, or how it is inconsistent with the definition of psyche that you have given above. I have never seen a tree, rock or automobile make a decision.

 

Andrew

Andrew Kyngdon

unread,
Mar 18, 2011, 5:58:05 PM3/18/11
to talking-m...@googlegroups.com
Guenter,

You said:

"What about the Rasch model? It is a psychological theory, it conforms to the classical definition of measurement, and Rasch himself already made the connection to the relevant physical theory (e.g. the laws of ideal gases). I also assume that most Rasch people believe that they can (and do) already measure psychological attributes, at least to some extent. I'm a bit surprised that this view has not yet been articulated here. Maybe simply because there are no 'such Rasch people' present here?"

You're on the wrong forum! Might I suggest this one: http://www2.wu-wien.ac.at/marketing/mbc/mbc.html

Be careful though. When it comes to critical debate, you'll find that more often than not you'll be talking past people.

But yes, a critical, dispassionate discussion of the Rasch model, including of its strengths, weaknesses and historical development, is something that would be most welcome on Talking Measurement.

Andrew

Denny Borsboom

unread,
Mar 18, 2011, 6:40:36 PM3/18/11
to talking-m...@googlegroups.com, Andrew Kyngdon
There's a few of those. I was threatened with expulsion off one of
them when I maintained that the Rasch model is an IRT model. People
have funny ideas.

By the way: I could understand that, with a little imagination, one
might call the Rasch model a theory of sorts. But to call it a
psychological theory is a bit too much I think.

Best
Denny


Stephen Humphry

unread,
Mar 19, 2011, 12:08:35 AM3/19/11
to talking-m...@googlegroups.com
Hi Guenter, hope you are well.

You say:


"I very much welcome your call for less postulation and more focus on research. But has ‘such research’ not already commenced? What about the Rasch model? It is a psychological theory, it conforms to the classical definition of measurement, and Rasch himself already made the connection to the relevant physical theory (e.g. the laws of ideal gases). I also assume that most Rasch people believe that they can (and do) already measure psychological attributes, at least to some extent. I’m a bit surprised that this view has not yet been articulated here. Maybe simply because there are no 'such Rasch people’ present here?"

Yes, certainly many "Rasch people" believe they can and do measure psychological attributes. Certainly, they use the word "measurement" very freely.

When it comes to the standard (or 'classical') definition of measurement, I do not think it makes sense to say that a Rasch model is a psychological theory. When it comes to the classical definition, yes, Rasch made the connection first using the Poisson (model) to try to measure reading ability based on the numbers of errors. Most people use the dichotomous or polytomous Rasch models, and the idea that there is a "class" of models revolves around algebra in a manner quite clearly removed from substantive theory. So if we are talking substantive theory, there is in my view no class of models. Most saliently, the dichotomous and polytomous models, as they are almost universally used, contain functions of differences between "parameters". To interpret the parameters as measurements in the standard/classical sense, there must be ratios between differences and a unit (e.g. (b-d):u where b and d are magnitudes and u is a magnitude, the unit). There is no in-principle issue here, but nobody has yet taken up the necessary challenges required to demonstrate that it is possible to use these models to measure in well-defined units.
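The point about differences and a unit can be illustrated with a minimal Python sketch. It assumes the usual logistic form of the dichotomous Rasch model and writes the unit explicitly as a divisor, following the (b-d):u notation above; the function name `rasch_probability` and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def rasch_probability(b: float, d: float, u: float = 1.0) -> float:
    """Probability of a correct response in a dichotomous Rasch model.

    b: person location, d: item location, u: unit in which both are expressed.
    Writing the unit as an explicit divisor is an assumption made here to
    mirror the ratio (b - d) : u; it is not how the model is usually stated.
    """
    logit = (b - d) / u
    return 1.0 / (1.0 + math.exp(-logit))


# The same response probability arises from different (b, d, u) triples.
p_original = rasch_probability(b=1.5, d=0.5, u=1.0)
# Doubling b, d and u together leaves the ratio (b - d) : u unchanged.
p_rescaled = rasch_probability(b=3.0, d=1.0, u=2.0)
# Shifting b and d by the same constant leaves the difference unchanged.
p_shifted = rasch_probability(b=11.5, d=10.5, u=1.0)

print(p_original, p_rescaled, p_shifted)
```

All three calls give the same probability, so response data of this kind fix neither an origin nor a unit; on this reading, that is why a well-defined unit would have to come from substantive theory rather than from the algebra.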

(By the way, it is easily shown that the dichotomous Rasch model works for psychophysical results inasmuch as the Weber-Fechner "law" holds, although an interesting question arises about what is actually measured--and investigations with Joshua revealed that this has received rather more attention than it would seem from looking at the vast bulk of material).
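One possible reading of this remark, not necessarily the derivation intended here, can be sketched in Python. It assumes Fechner's logarithmic form, sensation = k * ln(intensity / threshold), and assumes that the probability of judging one stimulus greater than another is a logistic function of the difference in sensation. The function names, the constant `k` and the intensities are all hypothetical.

```python
import math


def fechner_sensation(intensity: float, k: float = 1.0, threshold: float = 1.0) -> float:
    """Fechner's logarithmic law (assumed form): k * ln(intensity / threshold)."""
    return k * math.log(intensity / threshold)


def prob_first_judged_greater(i1: float, i2: float, k: float = 1.0) -> float:
    """Rasch-type (logistic) probability based on the difference in sensation."""
    diff = fechner_sensation(i1, k) - fechner_sensation(i2, k)
    return 1.0 / (1.0 + math.exp(-diff))


def ratio_form(i1: float, i2: float, k: float = 1.0) -> float:
    """Algebraically equivalent form: i1**k / (i1**k + i2**k)."""
    return i1**k / (i1**k + i2**k)


p_logistic = prob_first_judged_greater(200.0, 100.0, k=2.0)
p_ratio = ratio_form(200.0, 100.0, k=2.0)
# Only the intensity ratio matters, in the spirit of Weber's law.
p_same_ratio = prob_first_judged_greater(20.0, 10.0, k=2.0)

print(p_logistic, p_ratio, p_same_ratio)
```

Under these assumptions the logistic model in differences of sensation is the same as a model in ratios of physical intensities, which leaves open whether sensation or the physical intensity is what is being measured.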

Rasch explicitly referenced Maxwell, which is a lot more than most others have done. However, the way his work is used today is in my view a kind of hybrid of statistics and measurement thinking, which needs a serious rethink to get it back on course (I mean stock-standard statistics of the kind used throughout psychometrics, which has never been justified in the way that the use of quantitative methods has been in physics and the natural sciences, to which physical measurements are so necessary). The rethink would, in my view, mean that the idea that "Rasch models" are somehow definable as a potentially substantive and coherent body of theory, definition and law is not really that much more likely than the idea that Stevens' theory of scale types is the same. Having said that, I think it is quite possible that some of Rasch's work will prove valuable if we ultimately succeed in measuring "psychological quantities" in terms of well-defined units within a descriptive body of cogent theory, definition and law.

Regards,

Steve

Trendler, Guenter

unread,
Mar 19, 2011, 5:36:06 AM3/19/11
to talking-m...@googlegroups.com
Hi Aaro,

"Can you construct--create de novo--any psychological phenomenon by relying
only on the Rasch model "psychological theory"? I think science should aim
at understanding; from the structural-systemic epistemology point of view,
Rasch models are not explanatory. If you have another definition for
explanation then please explain why and how your definition is more useful
than structural-systemic?"

First I must confess that I’m not yet very much acquainted with the ‘structural-systemic epistemology point of view’. However I believe that the Rasch model (at least as initially conceived by Georg Rasch) in particular and measurement models in general are not so much about any kind of explanation, but about the discovery of quantitative laws. Hence, strictly speaking, if the Rasch model would work it would not explain why certain phenomena go quantitatively together, but only that they do.

Regards
Guenter


Trendler, Guenter

unread,
Mar 19, 2011, 5:58:49 AM3/19/11
to talking-m...@googlegroups.com
Andrew: “You're on the wrong forum! Might I suggest this one: http://www2.wu-wien.ac.at/marketing/mbc/mbc.html”

Thank you for sending me on a suicide mission! ;-)

Denny: “By the way: I could understand that, with a little imagination, one might call the Rasch model a theory of sorts. But to call it a psychological theory is a bit too much I think.”

Stephen: “The rethink would, in my view, mean that the idea that "Rasch models" are somehow definable as a potentially substantive and coherent body of theory, definition and law is not really that much more likely than the idea that Stevens' theory of scale types is the same.”

What exactly is missing from the Rasch model to convert it into a substantive theory? Or, what do we have to do to transform it into such?

Guenter


Denny Borsboom

unread,
Mar 19, 2011, 7:09:23 AM3/19/11
to talking-m...@googlegroups.com
To transform a model into a theory its parameters have to be
interpreted. For instance "theta" is not interpreted, it's just a
placeholder.
Denny

On Saturday, March 19, 2011, Trendler, Guenter

--

Stephen Humphry

unread,
Mar 19, 2011, 7:11:13 AM3/19/11
to talking-m...@googlegroups.com
Hi Guenter. I doubt you mean it to be, but the questions "What exactly is missing from the Rasch model to convert it into a substantive theory? Or, what do we have to do to transform it into such?" to me presuppose there is more to "the Rasch model" than there is. First, there is the question of which model. Second, why should we presuppose we can transform it into such?

Successful descriptive quantitative theories, definitions and laws are--how did you put it Andrew?--inferred from the phenomena. The formal statements of these are alluringly simple. They make sense when there is a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs that have been worked out by understanding the phenomena and existing, simpler empirical results, using the know-how available. In the end, key physical quantitative definitions and laws have very simple forms, although more involved ones also form part of successful bodies of theory. For example, measuring voltage, resistance and electric current are all based largely on Ohm's law, but design principles must take into account various other aspects of theory like induction and temperature.

It is in my view much too simplistic to presuppose a "measurement model" will somehow embody all of the substantive theory required to systematically obtain empirical results of the kind required to measure magnitudes of quantities in well-defined units. Rasch picked up on the basic form of quantity equations that capture the simplest relations that need to be isolated from others in a measurement process as a whole. So I don't think it's so much a matter of transforming--more likely a matter of understanding what the simple relations capture and how they must be understood in a broader substantive theoretical framework.

Regards, Steve


________________________________________
From: talking-m...@googlegroups.com [talking-m...@googlegroups.com] On Behalf Of Trendler, Guenter [guenter....@zi-mannheim.de]
Sent: Saturday, 19 March 2011 5:58 PM
To: talking-m...@googlegroups.com
Subject: AW: [talking-measurement] The Rasch model and Psychological Measurement

aaro

unread,
Mar 19, 2011, 8:54:43 AM3/19/11
to talking-m...@googlegroups.com, aaro
Hi Denny,


On Friday, March 18, 2011 10:33:23 PM UTC+2, Denny Borsboom wrote:

However I do not understand why the issues you raise should be
specific to psychology. Variables are a source of headaches to all who
think but they are endemic to every field of enquiry that engages in
empirical tests; be it biology, medicine, physics, or economics.

I agree, this issue is not specific to psychology. Psychology is just the field where I feel more confident.
 

Similarly, the idea of construction may be appealing, but we cannot
"construct de novo" gravitation, quantum probability, or the speed of
light.

This list is interesting. First, the speed of light is a characteristic of light. We cannot create 'speed' abstracted from matter. But light can be created, as can other fields that travel at the speed of light (electromagnetic ones, for instance). Second, quantum probability. If I am not missing the point, quantum probability is a theory; any theory is a creation of the mind and can be created again--in fact it is created de novo every time a new person understands it. And third, gravitation. Again, it is physics, and I am not sure I know where physics is at the moment in understanding gravity. But some time ago gravity was one phenomenon not actually understood by physicists; so the fact that gravity cannot be created fits with my suggestion that the ability to create a phenomenon is a test of understanding.
 

So what? In other words I don't see why your criticism wouldn't
apply to science at large.

It does apply to science at large ... but to declare it for sure I should know more about these other sciences (especially physics and chemistry; I do have some background in biology).

In addition I do not see your point about mathematics. When it
functions within science, mathematics has no intrinsic semantics, that
is, what it's about is *always* fixed by a nonmathematical link. To
call that link "verbal" is far too limited; it is often ostensive or
pragmatic.

This would bring us to another discussion. I think the only possible form of any theory about the world is semiotic -- any theory is a system of symbols. I called "verbal" a system of symbols with direct links to the external world around us to distinguish it from a subsystem of symbols that refers only to other symbols. Mathematics is one of such subsystems, there are others, philosophy, for example.
 

It usually exceeds the linguistic resources by far. But
outside mathematics it is never the case that the formal system itself
contains its meaning. This is well known. If it is a basis for a
critique of psychology, then it is a basis for a critique of
science at large. In other words it is again not specific to
psychology.

The point I am making is that mathematics is useful in principle. But it should be used for appropriate purposes. And psychology (perhaps among other sciences) abuses mathematics: mathematics cannot be used for discovering the explanation of the studied phenomena, as is too often attempted in psychology; nor is any mathematical model able to formulate an explanatory (constructive) theory of the studied object. The potential of mathematics is grossly overestimated.

Maybe it would help if you gave an indication of what you do consider
to be a good explanation from your standpoint. What is a good
explanation in your terms?

An explanation is good when it contains information about what components in what specific relationships constitute the studied phenomenon and how it is possible to create the phenomenon or thing under study. Physics, chemistry, and biology provide numerous examples of such theories. The "ultimate" test for the theory of genes was the recent construction of a genome by ... ? I forgot the name.

Best

Aaro

aaro

unread,
Mar 19, 2011, 9:06:35 AM3/19/11
to talking-m...@googlegroups.com
Hi Andrew,


On Friday, March 18, 2011 11:29:11 PM UTC+2, Andrew Kyngdon wrote:

You said:

“Andrew proposed a candidate for a measurable attribute--the utility of incremental gains and losses under conditions of risk and uncertainty. I am not sure this attribute is actually psychic. Here an interesting question of the definition of psyche emerges, but this is another issue.”

 But then you say:

 Psyche is a system of processes that, on the basis of individual experiences, organizes behavior with the aim of maintaining equilibrium of the organism as a whole in a changing environment.

 I cannot see how decision making under risk or uncertainty is not a system of psychological processes, or how it is inconsistent with the definition of psyche that you have given above. I have never seen a tree, rock or automobile make a decision.

Obviously decision-making is a psychological phenomenon. The reason I said that the "utility of incremental gains and losses" may not be is that this "utility" is basically a variable, but psyche is not composed of variables. Maybe my interpretation was too literal, but the problem remains that "utility" as a variable can be based on different psychological structures; if so, the variable is abstracted from psyche and becomes nonpsychical. Also, as a variable that can be based on different psychical structures, it is not a measure until those psychical structures are clearly distinguished, that is, until it is made clear that the same real quality of psyche is quantitatively described.

Best

aaro

 


aaro

unread,
Mar 19, 2011, 9:21:57 AM3/19/11
to talking-m...@googlegroups.com
Hi Guenter,


On Saturday, March 19, 2011 11:36:06 AM UTC+2, Trendler, Guenter wrote:

First I must confess that I’m not yet very much acquainted with the ‘structural-systemic epistemology point of view’. However I believe that the Rasch model (at least as initially conceived by Georg Rasch) in particular and measurement models in general are not so much about any kind of explanation, but about the discovery of quantitative laws. Hence, strictly speaking, if the Rasch model would work it would not explain why certain phenomena go quantitatively together, but only that they do.

It brings us back to my original question ... Why do we need such models? I am not implying with this question that mathematical models are not useful. My question is precisely this: if they are useful, then for what exactly? "To understand" would not be a sufficient answer because "understanding" is defined in many ways. Maybe for some, understanding can mean 'it is possible to make a mathematical model.' Then I would say, obviously mathematical models can be made of everything. So what?

So, what is the purpose of such models?

Best

Aaro

Trendler, Guenter

unread,
Mar 19, 2011, 12:09:10 PM3/19/11
to talking-m...@googlegroups.com
Hi Steve,

Ok, I was a bit over-hasty. Let me take a step back.

Josh was asking:
“Is there any existing psychological theory that has already commenced 'such research'?”

In reply I was pointing out that, and let me be now more precise, Rasch's idea (person ability + item difficulty => probabilistic response) may provide such a psychological theory and that, at least in the view of the Rasch community, the research may have already commenced. (Of course I did not mean just the formal mathematical theory, but, as Denny points out: “To transform a model into a theory its parameters have to be interpreted. For instance "theta" is not interpreted, it's just a placeholder.”)

Previously you wrote:
“When it comes to the standard (or 'classical') definition of measurement, I do not think it makes sense to say that a Rasch model is a psychological theory. When it comes to the classical definition, yes, Rasch made the connection first using the Poisson (model) to try to measure reading ability based on the numbers of errors. Most people use the dichotomous or polytomous Rasch models, and the idea that there is a "class" of models revolves around algebra in a manner quite clearly removed from substantive theory. So if we are talking substantive theory, there is in my view no class of models. Most saliently, the dichotomous and polytomous models, as they are almost universally used, contain functions of differences between "parameters". To interpret the parameters as measurements in the standard/classical sense, there must be ratios between differences and a unit (e.g. (b-d):u where b and d are magnitudes and u is a magnitude, the unit). There is no in-principle issue here, but nobody has yet taken up the necessary challenges required to demonstrate that it is possible to use these models to measure in well-defined units.”

So there is in principle no issue here. Do we agree then that we already have the psychological theory Josh was asking for? Or is there something substantive missing? Is the problem only that the ‘proper’ research (i.e. the search for ratios) has not yet commenced?

Regards,
Guenter

-----Original Message-----
From: talking-m...@googlegroups.com on behalf of Stephen Humphry
Sent: Sat 19.03.2011 12:11
To: talking-m...@googlegroups.com
Subject: RE: [talking-measurement] The Rasch model and Psychological Measurement


Trendler, Guenter

unread,
Mar 19, 2011, 12:28:56 PM3/19/11
to talking-m...@googlegroups.com

Hi Aaro,

Aaro: "It brings us back to my original question ... Why do we need such models? I am
not implying with this question that mathematical models are not useful. My
question is exactly--if they are useful, then exactly for what? "To
understand" would not be sufficient because "understanding" is defined in
many ways. Maybe for some understanding can mean 'it is possible to make a
mathematical model.' Then I would say, obviously mathematical models can be
made of everything. So what? So, what is the purpose of such models?"

The purpose of such models is the measurement of the quantities involved. Though I’m not sure I understand your argument: maybe mathematical models can be made of everything, but in an empirical science reality may resist their practical application. In my view Rasch models are mathematically sound, but reality does not conform to them. That is, regardless of the circumstances, humans do not behave as demanded by the model.

Regards,
Guenter


Trendler, Guenter

unread,
Mar 19, 2011, 1:19:02 PM3/19/11
to talking-m...@googlegroups.com
Hi Steve,

You wrote: “Successful descriptive quantitative theories, definitions and laws are--how did you put it Andrew?--inferred from the phenomena. The formal statements of these are alluringly simple. They make sense when there is a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs that have been worked out by understanding the phenomena and existing, simpler empirical results, using the know-how available. In the end, key physical quantitative definitions and laws have very simple forms, although more involved ones also form part of successful bodies of theory. For example, measuring voltage, resistance and electric current are all based largely on Ohm's law, but design principles must take into account various other aspects of theory like induction and temperature.”

Isn’t this rather the case AFTER the law is firmly established? At the outset, prior to Ohm, there is not much of ‘a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs’. Instruments with ‘intricate’ design had to be first invented and developed. In the beginning there is not much instrumental know-how and the body of theory is very skinny.

The origin of Rasch models is the observation that some person A solves an item D with less effort than person B. This observation gave rise to the hypothesis that there may be a quantitative law between ‘person ability’, ‘item difficulty’ and the probability of a correct response. In physics a similar starting observation is that liquids expand with heat. The phenomenon can be used to measure ‘temperature’; the theory is that volume expands in proportion to temperature. The next step was the construction of the thermoscope (a thermometer without a scale). This involved the construction of an appropriate tube, the selection of an appropriate thermoscopic substance and so on. However, although theory is always involved along the way, I believe it is no coincidence that complex theories like the molecular-kinetic theory stand at the end of such a process. Must we not start somewhere and be content with as much substantive theory as we have? The body of theory will grow naturally in substance as we gain experimental insight into, and control over, the phenomena under investigation. Hence, isn’t the Rasch theory, just as it stands now, a sufficient starting point? What more should we expect at this point?
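
As an illustration only (it is not taken from Rasch's book or from this thread), the hypothesised law can be sketched in its usual dichotomous form, in which the probability of a correct response depends only on the difference between ability and difficulty. The function name and all numerical values are assumptions chosen for the example.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the same (logit) scale; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical values: person A is more able than person B, item D is fixed.
person_a = 1.0
person_b = 0.0
item_d = 0.5

print("P(A solves D) =", round(rasch_probability(person_a, item_d), 3))
print("P(B solves D) =", round(rasch_probability(person_b, item_d), 3))
```

Under these assumed values person A has the higher probability of solving item D, and a person whose ability equals the item difficulty has a probability of exactly one half.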

Regards
Guenter

-----Original Message-----
From: talking-m...@googlegroups.com on behalf of Stephen Humphry
Sent: Sat 19.03.2011 12:11
To: talking-m...@googlegroups.com

Subject: RE: [talking-measurement] The Rasch model and Psychological Measurement


Stephen Humphry

Mar 19, 2011, 11:14:31 PM3/19/11
to talking-m...@googlegroups.com

Hi Guenter,

I said previously: “Successful descriptive quantitative theories, definitions and laws are--how did you put it Andrew?--inferred from the phenomena. The formal statements of these are alluringly simple. They make sense when there is a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs that have been worked out by understanding the phenomena and existing, simpler empirical results, using the know-how available. In the end, key physical quantitative definitions and laws have very simple forms; although more involved ones also form part of successful bodies of theory. For example, measuring voltage, resistance and electric current are all based largely on Ohm's law, but design principles must take into account various other aspects of theory like induction and temperature.”

You replied:


"Isn’t this rather the case AFTER the law is firmly established? At the outset, prior to Ohm, there is not much of ‘a body of theory, definition and law backed by empirical results obtained using instruments with intricate designs’. Instruments with ‘intricate’ design had to be first invented and developed. In the beginning there is not much instrumental know-how and the body of theory is very skinny."

I think there's an important distinction to be made. If we look back on historical events in light of the theory and know-how now at our disposal, it may seem this way. Let's just consider a few details (apologies for anything not entirely accurate, historically--I do not think it should alter the basic message). Ohm's work is entitled (in English) the Galvanic circuit investigated mathematically. Ohm used galvanic or voltaic cells (and later a thermocouple device). He made explicit reference to Fourier and Poisson, saying that the form of the differential equations is similar to those given for the propagation of heat. He referred to generalizations deducible from the phenomena and empirical results. He formulated several quantitative equations in arriving at the one we now call Ohm's law. He referred to the "force of a current" (current) and tensions (potentials I think). Ohm used a number of detailed figures of the circuits. He used a galvanometer to measure current (crude by today's standards, nevertheless an intricate instrument). So it's important not to overestimate the degree to which theory was unified, or the degree to which instruments had intricate designs, viewing them by modern standards. It's equally important, I think, not to underestimate either the degree to which theory and definition were developed or the intricacy of the instruments by the standards of the day.

Guenter, you also say:

"The origin of Rasch models is the observation that some person A solves an item D with less effort than person B. This observation gave rise to the hypothesis that there may be a quantitative law between ‘person ability’, ‘item difficulty’ and the probability of a correct response. In physics a similar starting observation is that liquids expand with heat. The phenomenon can be used to measure ‘temperature’; the theory is that volume expands in proportion to temperature ....?"

Sadly, I disagree that it is a similar starting observation. Examined in terms of metrology and systematic measurement (and units), the problem is the lack of something corresponding to linear expansion. The spatial extent (volume) of a material or substance is measurable, and this is a starting point. Once it can be shown in a systematic manner that heat transference affects spatial extent under specific conditions, there is a basis for measuring heat (and ultimately temperature). The problem is that "relative frequencies" of errors and so forth are not established measurements. Rasch's work to show that frequencies were systematically related and could be used to infer estimates of "parameters" is very useful. However, it doesn't have the same basis in a clearly measurable attribute (spatial extent) that is also related to various other physical dimensions.

Steve

Trendler, Guenter

Mar 20, 2011, 4:37:10 AM3/20/11
to talking-m...@googlegroups.com

Hi Steve,

You wrote: “I think there's an important distinction to be made. If we look back on historical events in light of the theory and know-how now at our disposal, it may seem this way. Let's just consider a few details (apologies for anything not entirely accurate, historically--I do not think it should alter the basic message). Ohm's work is entitled (in English) the Galvanic circuit investigated mathematically. Ohm used galvanic or voltaic cells (and later a thermocouple device). He made explicit reference to Fourier and Poisson, saying that the form of the differential equations is similar to those given for the propagation of heat. He referred to generalizations deducible from the phenomena and empirical results. He formulated several quantitative equations in arriving at the one we now call Ohm's law. He referred to the "force of a current" (current) and tensions (potentials I think). Ohm used a number of detailed figures of the circuits. He used a galvanometer to measure current (crude by today's standards, nevertheless an intricate instrument). So it's important not to overestimate the degree to which theory was unified, or the degree to which instruments had intricate designs, viewing them by modern standards. It's equally important, I think, not to underestimate either the degree to which theory and definition were developed or the intricacy of the instruments by the standards of the day.”

There is a certain danger that we start splitting hairs here. The point I want to make is that in order to apply measurement theory we don’t need much ‘sophisticated’ theory to get started with experimenting. Furthermore, theory and experiment have to go hand in hand from the simple to the complex. We should therefore try to avoid burdening established empirical discoveries with too much theory. An extreme case where a huge theory was built upon no empirical evidence at all is Herbart’s ‘Psychologie als Wissenschaft’. I believe that the psychological theory as presented by Rasch in his ‘Probabilistic models for some intelligence and attainment tests’ is enough theory to get started, which does not mean that I believe that such a start would or must be successful.

You also argue: “Sadly, I disagree that it is a similar starting observation. Examined in terms of metrology and systematic measurement (and units), the problem is the lack of something corresponding to linear expansion. The spatial extent (volume) of a material or substance is measurable, and this is a starting point. Once it can be shown in a systematic manner that heat transference affects spatial extent under specific conditions, there is a basis for measuring heat (and ultimately temperature). The problem is that "relative frequencies" of errors and so forth are not established measurements. Rasch's work to show that frequencies were systematically related and could be used to infer estimates of "parameters" is very useful. However, it doesn't have the same basis in a clearly measurable attribute (spatial extent) that is also related to various other physical dimensions.”

In my view the question is whether it is theoretically possible to determine constants in a law without being able to measure any of the factors involved, i.e. without ‘something corresponding to linear expansion’. As is well known, Rasch’s crucial insight was that theoretically it is indeed possible. The probability of (a correct) response is ‘measurable’ and this is the starting point. True, this ‘measurable’ quantity is not (yet) related in various ways to other measurable quantities, as you point out. However, maybe a more suitable physical analogy to the situation is, instead of Ohm’s law, the level of knowledge about electricity at the time of William Gilbert (1544 – 1603). One of Gilbert’s crucial observations was that a metallic needle (i.e. a primitive electroscope) is deflected by the electricity produced by rubbing amber with cloth. Of course Gilbert had some ‘primitive’ theory about this phenomenon (for details see Bordeau: Volts to Hertz). Note that the measurable quantity ‘angle of deflection’ is not yet connected to other quantities, just as is the case with the Rasch model, but that is basically how it all got started. However, I jumped into this discussion to take a position on Josh’s question whether there is ‘any existing psychological theory that has already commenced 'such research'’, to which my answer is: ‘Yes, in my view such theory exists and the research has already commenced.’ Of course, the question remains why no progress has been made beyond the ‘Gilbert level’, but that’s another topic. Still no agreement about ‘the starting observation’ and the amount of theory necessary to get started, Steve?
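
One common reading of the insight mentioned above is parameter separation: if the dichotomous Rasch model holds, the contrast between two items can be recovered from response probabilities without knowing the ability of the person who produced them. The following is only a sketch of that algebraic property under assumed values; the function names and the item difficulties are hypothetical, and it says nothing about whether real data conform to the model.

```python
import math


def log_odds_correct(ability: float, difficulty: float) -> float:
    """Log-odds of a correct response under the dichotomous Rasch model."""
    p = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
    return math.log(p / (1.0 - p))


# Hypothetical item difficulties on the logit scale.
easy_item = -0.5
hard_item = 1.0


def item_contrast(ability: float) -> float:
    """Difference in log-odds between the easy and the hard item for one person."""
    return log_odds_correct(ability, easy_item) - log_odds_correct(ability, hard_item)


# The contrast is the same for every assumed ability: the person parameter cancels.
for ability in (-2.0, 0.0, 2.0):
    print(ability, round(item_contrast(ability), 6))
```

For each assumed ability the contrast equals the difference between the two item difficulties (1.5 here), which is why the person's ability need not be known in order to compare the items.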

Best
Guenter


-----Original Message-----
From: talking-m...@googlegroups.com on behalf of Stephen Humphry

Sent: Sun 20.03.2011 04:14

Andrew Kyngdon

Mar 20, 2011, 4:45:38 AM3/20/11
to talking-m...@googlegroups.com

Hello Aaro,

 

You said:

 

Obviously decision-making is a psychological phenomenon. Why I said that maybe "utility of incremental gains and losses" is not is that this "utility" is basically a variable; but psyche is not composed of variables. Maybe my interpretation was too literal, but still the problem is that "utility" as a variable can be based on different psychological structures; if so, the variable is abstracted from psyche and becomes nonpsychical. Also, as a variable that can be based on different psychical structures, it is not a measure until those psychical structures are clearly distinguished, until it is made clear that the same real quality of psyche is quantitatively described.

 

The argument that you are making here sounds vaguely representationalist – psychological attributes are “qualities” which can be “quantitatively described” or measured once “psychical structures” are clearly distinguished. By psychical structures I would assume that you mean psychological systems. You argue that the utility of gains and losses under conditions of risk and uncertainty “can be based upon different psychological structures”. I interpret this as arguing that theories of utility can be proposed which are descriptively different. If my perception is true, then you are correct.

 

Since Tversky’s (1969) often cited choice study, it has been argued that the utility of incremental gains and losses under conditions of risk or uncertainty may not be quantitative. Specifically, utility could be a “lexicographic semiorder” (Tversky, 1969). A “lexicographic order” occurs when objects or events are ordered on the basis of a series of ranked attributes, such that if the difference between two objects with respect to the first attribute does not exceed a threshold of some kind, then attention is paid to the second attribute. If a difference can be discerned on the second attribute, then an order between two objects is identified. The most familiar example of a lexicographic order is the alphabetical ordering of words in a dictionary. If two words share the same first letter, then attention is paid to the second. If the second letters are different, then the words are alphabetically ordered irrespective of the letters following the second letter. A “semiorder” is an ordinal relation with intransitive indifference, that is, if A is indifferent to B (A ~ B) and B ~ C it need not follow that A ~ C (Luce, 1956). Of course, an attribute cannot logically be a quantity if its degrees (levels) form a lexicographic semiorder.
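
A minimal sketch of how indifference can fail to carry over from A ~ B and B ~ C to A ~ C, assuming indifference is modelled as "the two values differ by no more than a fixed threshold". The threshold and the three values are hypothetical and chosen only to make the pattern visible.

```python
def indifferent(x: int, y: int, threshold: int = 10) -> bool:
    """Two values are treated as indifferent if they differ by at most the threshold."""
    return abs(x - y) <= threshold


# Hypothetical levels of some attribute for three objects.
a, b, c = 0, 8, 16

print("A ~ B:", indifferent(a, b))  # differ by 8, within the threshold
print("B ~ C:", indifferent(b, c))  # differ by 8, within the threshold
print("A ~ C:", indifferent(a, c))  # differ by 16, beyond the threshold
```

With these assumed values A ~ B and B ~ C both hold while A ~ C does not, which is the pattern a semiorder allows and a quantitative attribute with exact equality would not.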

 

A lexicographic semiorder in a choice situation is where people compare two simple lotteries, for example, on the basis of one attribute first, and if the difference between the lotteries with respect to this attribute is less than a threshold of some kind, then attention is paid to the second attribute, and if the relevant difference is less than a threshold, then the third attribute is inspected, and so on (Birnbaum, 2010). One example of a theory of utility which proposes that utility is one kind of lexicographic semiorder is the “priority heuristic” of Brandstatter, Gigerenzer & Hertwig (2006). This work is in the vein of Goldstein & Gigerenzer’s (2002) theory of “fast and frugal” heuristics.

 

The priority heuristic argues that people assess a pair of simple lotteries as follows. The minimum gain of each lottery is assessed first, then the probabilities of the minimum gains and then the maximum gains. If the difference in minimum gains is greater than 1/10th of the magnitude of the maximum gain, then examination of the lotteries ceases and a choice is made. Brandstatter, et al, (2006) argued that this threshold was a decision “stopping rule”. For example, consider Lottery A, which has a 50% chance of yielding $200 and a 50% chance of yielding nothing. Lottery B consists of receiving $100 for sure (i.e., Lottery B is a gift). The minimum gain is $0 in Lottery A and $100 in B, and as this $100 difference exceeds the $20 threshold (1/10th of $200), people will choose Lottery B over A (which most people almost always do). However, if presented with Lottery C, which has a 50% chance of yielding $3000 and a 50% chance of nothing, then not all people choose Lottery B (as the threshold is now $300, which exceeds the $100 difference in minimum gains). Again, this is consistent with observed human choice behaviour. So far, the priority heuristic makes predictions consistent with theories in which utility is hypothesised to be a continuous quantity, such as prospect theory (Kahneman & Tversky, 1979), rank dependent utility theory (Luce & Fishburn, 1991) and transfer of attention exchange theory (TAX) (Birnbaum, 1999). But the priority heuristic is much simpler and does not assume that utility is quantitative.
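
The rule described above can be sketched as follows. This is an illustration under stated assumptions, not Brandstatter et al.'s own code: the function and type names are hypothetical, the 1/10 threshold on probabilities in the second step is an assumption that the text above does not spell out, a sure gain is treated as a minimum gain with probability 1, and "at least 1/10" is used for the stopping test (the lottery examples give the same result with a strict inequality).

```python
from typing import List, Tuple

Lottery = List[Tuple[float, float]]  # (gain in dollars, probability) pairs


def summarize(lottery: Lottery) -> Tuple[float, float, float]:
    """Return (minimum gain, probability of the minimum gain, maximum gain)."""
    min_gain = min(gain for gain, _ in lottery)
    max_gain = max(gain for gain, _ in lottery)
    p_min = sum(p for gain, p in lottery if gain == min_gain)
    return min_gain, p_min, max_gain


def priority_heuristic(first: Lottery, second: Lottery) -> Tuple[str, str]:
    """Return (chosen lottery, deciding reason); the choice is 'first' or 'second'."""
    min1, p_min1, max1 = summarize(first)
    min2, p_min2, max2 = summarize(second)

    # Reason 1: minimum gains. Stop if they differ by at least 1/10 of the
    # largest gain on offer, and take the lottery with the higher minimum.
    if abs(min1 - min2) >= 0.1 * max(max1, max2):
        return ("first" if min1 > min2 else "second", "minimum gain")

    # Reason 2 (threshold assumed): probabilities of the minimum gains.
    # Take the lottery whose minimum gain is less likely.
    if abs(p_min1 - p_min2) >= 0.1:
        return ("first" if p_min1 < p_min2 else "second", "probability of minimum gain")

    # Reason 3: maximum gains decide.
    return ("first" if max1 >= max2 else "second", "maximum gain")


lottery_a = [(200.0, 0.5), (0.0, 0.5)]
lottery_b = [(100.0, 1.0)]
lottery_c = [(3000.0, 0.5), (0.0, 0.5)]

print(priority_heuristic(lottery_a, lottery_b))
print(priority_heuristic(lottery_c, lottery_b))
```

Under these assumptions the sketch selects Lottery B over A on the first reason, and for C versus B it does not stop at the first reason, so the choice is settled by a later one. No step computes a utility value; the rule only compares attributes in a fixed order.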

 

However, strong evidence has been presented against the lexicographic semiorder theories of utility. Birnbaum (2004) found that the priority heuristic predicted the modal choice in 3 out of 13 cases, whilst the TAX theory predicted all 13. Moreover, the priority heuristic did not describe the choice behaviour of the majority of subjects in Birnbaum & Navarrete’s (1998) study, despite the fact that Brandstatter, et al (2006) cited that study. Re-examining Birnbaum & Navarrete’s (1998) data, Birnbaum (2008) found only 31% of subjects made choices in accordance with the 1/10th of the greatest outcome “stopping rule” of Brandstatter, et al (2006).

 

Birnbaum (2010) recently conducted a series of choice experiments designed to test lexicographic semiorder theories of utility, including the priority heuristic. To summarise his rather intensive study, Birnbaum found that the lexicographic semiorder theories did not describe choice behaviour that well. For example, priority dominance was systematically violated, meaning that choices were made using attributes of “lower” priority than those argued to be of “greater” priority by the priority heuristic. Subjects also did not display the intransitive choice behaviour that lexicographic semiorder theories predict. However, the most frequent patterns of individual choice behaviour were consistent with the quantitative TAX theory.

 

More research is needed of course, but it would seem that utility may indeed be a psychological quantity. This is perhaps why Tversky (1969) abandoned the lexicographic semiorder idea he proposed in favour of quantitative prospect theory. Too bad he passed away before the Nobel Economics Prize was awarded to his collaborator Danny Kahneman in 2002.

 

Andrew

 

Andrew Kyngdon, PhD

MetaMetrics, Inc.

www.lexile.com


Sent: Sunday, 20 March 2011 12:07 AM
To: talking-m...@googlegroups.com

--

Denny Borsboom

Mar 20, 2011, 6:05:21 AM3/20/11
to talking-m...@googlegroups.com, Trendler, Guenter
Hi Guenter,
if the Rasch model is good enough, then I don't see exactly which
research is yet to commence. You can throw stones at the model and see
whether it survives your attacks, that's called fitting a model and is
done all the time, sometimes more critical than other times but that's
human nature.
Best
Denny


Paul Barrett

Mar 20, 2011, 7:03:21 PM3/20/11
to talking-m...@googlegroups.com
Going back to Guenter's original statement:
"How about the Rasch model? It is a psychological theory ... ".

OK - I know this came out probably more "directly phrased" than Guenter
might have actually meant to imply but ...

I think a read of Wood, R. (1978) Fitting the Rasch model - a heady tale.
British Journal of Mathematical and Statistical Psychology, 31, , 27-32;
quickly dispels any notion that the Rasch model can be a theory of anything
except item response probabilities. Paul Kline and I back in 1981 fit a
Rasch model to the EPQ - adequately - encompassing items from all 4 scales!
You might as well call an eigenvector-eigenvalue decomposition routine a
"theory of psychology"!

Likewise, Michell, J. (2004) Item Response Models, pathological science, and
the shape of error. Theory and Psychology, 14, 1, 121-129. As you reduce
response error, so the model eventually fails.

I am amazed anyone would imbue any statistical item response model with the
status of a theory about a psychological attribute.

In my opinion, the best theories of ability are found in AI and
computational intelligence. Why? Because they build and explicitly test that
which they theorize might be causal for the production of reasoning and
learning within systems (of which we are but one type).

Whether or not one proposes that connectionist modeling is a reasonable
model for some human cognitive functions, the theory is about the generative
cause of the responses, not about statistically describing aggregate
responses on questionnaire items.

And it is that "theory about what generates the phenomenon" that sets apart
mathematical/statistical descriptions of data from theories of how a
phenomenal observation comes to be observed.

Basically, all Rasch says is "a latent variable - call it ability" causes
aggregate observations to align themselves like this, and so be modeled
using an explicit stochastic mathematical model.

But, the same 'theory' can be expressed as "the ability to solve these kinds
of problems produces an approximately ordered set of item difficulties,
which we can use to rank test-takers" - no mathematical model is required to
produce that ordering.

I think Aaro has it just right when he asks: "It brings us back to my
original question ... Why we need such models? I am not implying with this
question that mathematical models are not useful. My question is exactly--if
they are useful, then exactly for what?"

There is a very fine set of three paragraphs concluding the wonderful
chapter by Schonemann, P. (1994) Measurement: the Reasonable Ineffectiveness
of Mathematics in the Social Sciences. In I. Borg and P.Mohler (Eds.).
Trends and Perspectives in Empirical Social Research. Walter de Gruyter.
ISBN: .

"Whatever use axioms may have in mathematics, in an empirical science they
must be either self-evident or empirically founded. However, it is far from
self-evident why the Archimedean axiom should hold in psychology, or in
biology, where most phenomena are bounded by physiological constraints. Nor
is it self-evident why it should always be possible, or even helpful, to
remove interactions as additive conjoint measurement (and the closely
related "functional measurement") try to do. Why should the "crisp"
mathematics of physics apply without change to the fuzzy nature of living
things? Why should subjects always utilize a particular family of distance
functions when they produce dissimilarity ratings, and what prompts them to
always interpolate a monotone transformation so that we always can use the
same canned programs?

None of this is self-evident a priori, nor is any of it empirically founded.
In some instances, as we saw, there is solid empirical evidence to the
contrary, which is simply brushed aside. As Coombs (1983) observed, the line
separating this research strategy from "mathematical game playing in search
of a trivial application ... is an exceedingly difficult line to draw" (p.
93).

What should have been self-evident from the start is that a research
strategy which develops models "independently as a body of abstract formal
theory with empirical interpretations being left to a later stage" was
doomed from the outset. Thus, in the social sciences, the real mystery is
how anyone could have seriously believed the empirical connections would
materialize at a later stage. As the experience of the last 20 years shows,
they didn't. "


I concluded my recent commentary on Stephen Humphry's forthcoming target
article: "The Role of the Unit in Physics and Psychometrics" in the journal
" Measurement: Interdisciplinary Research and Perspectives ", with ..
"As I see it, the problem remaining for any social scientist is, not one of
developing yet more derivations of existing statistical item response models
or even new such models, but one of creating bodies of evidence that
demonstrate that a psychological attribute does indeed vary additively. If
these bodies of evidence are missing, then we must continue to explore and
make careful observations and, where possible, manipulate features of
phenomena and attributes, but without this continuing pretence of an
artificial precision accorded by so-called "measurement models" within
"quantitative psychology." And we continue like this until such time as the
body of observational evidence either invites obvious and unambiguous
quantification or theory-related causal explanations of our observations
show it is simply not possible in principle. "

Two very nice papers on this issue of the status of theory in psychology
have recently been published in the journal Theory and Psychology:

Gigerenzer, G. (2010) Personal reflections on Theory and Psychology. Theory
and Psychology, 20, 6, 733-743.
And
Rosenbaum, P.J., & Valsiner, J. (2011) The un-making of a method: From
rating scales to the study of psychological processes. Theory and
Psychology, 21, 1, 47-65.

For me, this is where the real work begins - with sensible and powerful
theory construction, not with silly aggregate-model "latent variable"
statistical methods of any description.

But, what hope is there when 'methodolatry' is the order of the day? The
word came from Janesick, V. J. (1994). The dance of qualitative research
design: Metaphor, methodolatry, and meaning. In N. K. Denzin & Y. S. Lincoln
(Eds.), Handbook of qualitative research (pp. 209-219). Thousand Oaks, CA:
Sage... where she defined "methodolatry" as: "a combination of method and
idolatry, to describe a preoccupation with selecting and defending methods
to the exclusion of the actual substance of the story being told.
Methodolatry is the slavish attachment and devotion to method that so often
overtakes the discourse in the education and human services fields. (p.
215)"

Just about sums up the entire "latent variable" tosh which now dictates much
of psychometrics and edumetrics these days.

Regards .. Paul

Denny Borsboom

Mar 20, 2011, 8:20:44 PM3/20/11
to talking-m...@googlegroups.com, Paul Barrett
So after the mist clears up: What do you have to add to what already
exists? Where exactly does your neural network dance deviate from,
say, the good old ML statistics? How precisely is your work different
from what I see every day, every hour, every minute of the day?

It's just sad that all these bright minds should waste their
time complaining. Sad to see so much energy just die in pure
negativity. As if anyone cares. You people have the power to make
something good happen. Something better. Noblesse oblige!!!

D

Paul Barrett

Mar 20, 2011, 10:18:07 PM3/20/11
to talking-m...@googlegroups.com
Denny, what on earth brought that fit of pique on?

Are you actually saying IRT stochastic data models represent theories of
psychological processes?

Are you saying that AI/connectionist research is not seeking to
postulate/build generative causal models of reasoning processes?

As to:


" Where exactly does your neural network dance deviate from, say, the good
old ML statistics? "

In terms of how the network design/technology is deployed to build DYNAMIC
intrinsically non-linear functional models of human psychological processes.


As to


" How precisely is your work different from what I see every day, every
hour, every minute of the day?"

1. I don't consider any psychological attribute as "quantitatively"
measurable. But, I'll use numbers etc., orders etc. to arrive at "good
enough" predictive accuracies (categorized or orders) for applied work -
assessed using actuarial rather than "continuous-valued" functions.

2. I will also use James Grice's Observation Oriented Modeling - a
completely intrinsically non-quantitative-metric binary pattern analysis
methodology for analyzing data and logical path statement (Grice, J. (in
press). Observation oriented modeling: An analysis of cause in the
behavioral sciences. New York: Elsevier.)

3. I disavow ALL test theory psychometrics in my test construction work -
preferring instead to treat each "assessment" problem uniquely,
algorithmically, and actuarially.

4. I don't use questionnaires anymore - I build graphical profilers - in 1
and 2 dimensions. And, I'm developing single-stimulus dynamically evolving
reasoning items - where a single stimulus replaces the myriad of "usual
suspect" ability items. But then, I'm also after "reasoning in gestalt/situ"
rather than "abilities as rulers in our heads".


I would dearly love to work more on theory - and the measurement issue. Find
me a job that would pay me to do so and I would. Until then, I have to eke
out a living doing ad-hoc HR-type consultancy work - as no university
department or test publishing company is remotely interested in someone not
doing/teaching "what everyone else does".

I cannot speak for the others on this list; but what I have said in my
message is not negative - merely factual. You may find it negative, for me
it's just the way things are.

I have also said how things should proceed, and really how positive the
situation has become (for scientists, not psychometricians) - and that there
will be limited and uneven stabs and pushes along the frontiers while we
collectively grapple with coming to an understanding of what exactly we are
trying to propose as a 'theory' of something we might wish to call a
psychological process, let alone claim "measurement" of something.


Ah well, there we go ... by the way, if you want to see how I think about
conceptualising "human psychology" - go read my two presentations (and
supporting notes):

http://www.pbarrett.net/NZ_Psych_2007.htm
#2: Two Big Ideas
#3: Brunswick Symmetry, Complexity, & Non-Quantitative Psychology - Tying it
all Together

It's a bit raw in places - but you can see why I'm no longer interested in
measuring psychological "attributes" using rulers - but looking at "the
system" as a "complex" dynamic system in situ.

But, you are the man who published "The Attack of the Psychometricians" ...
it is no wonder what you see as a negative I see as a positive.

Regards .. Paul


-----Original Message-----
From: talking-m...@googlegroups.com
[mailto:talking-m...@googlegroups.com] On Behalf Of Denny Borsboom
Sent: Monday, 21 March 2011 1:21 p.m.
To: talking-m...@googlegroups.com
Cc: Paul Barrett
Subject: Re: [talking-measurement] The Rasch model and Psychological
Measurement

aaro

Mar 21, 2011, 11:18:04 AM3/21/11
to talking-m...@googlegroups.com
Andrew,


On Sunday, March 20, 2011 10:45:38 AM UTC+2, Andrew Kyngdon wrote:
You said:

Obviously decision-making is a psychological phenomenon. Why I said that maybe "utility of incremental gains and losses" is not is that this "utility" is basically a variable; but psyche is not composed of variables. Maybe my interpretation was too literal, but still the problem is that "utility" as a variable can be based on different psychological structures; if so, the variable is abstracted from psyche and becomes nonpsychical. Also, as a variable that can be based on different psychical structures, it is not a measure until those psychical structures are clearly distinguished, until it is made clear that the same real quality of psyche is quantitatively described.

 

The argument that you are making here sounds vaguely representationalist – psychological attributes are “qualities” which can be “quantitatively described” or measured once “psychical structures” are clearly distinguished. By psychical structures I would assume that you mean psychological systems.

Yes, but keeping in mind that I am talking about one version of systems theories where systems are understood as structures composed of some elements or components that are material. There are other systems theories according to which systems are built from variables. Variables are not material; no real-world system can be constructed from variables, only models.
 
 

You argue that the utility of gains and losses under conditions of risk and uncertainty “can be based upon different psychological structures”. I interpret this as arguing that theories of utility can be proposed which are descriptively different. If my perception is true, then you are correct.

Yes, this is one side of the idea; the other side is that externally identical behaviors can be based on structurally different minds; no variable-based theory is able to distinguish between such differently composed minds that behave in some situations similarly (structural differences of minds that underlie externally similar behaviors can be distinguished by studying such individuals in different contexts).  
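The point about externally identical behaviour can be made concrete with a toy calculation. The sketch below is only an illustration under invented assumptions (two hypothetical populations of ten people, each person summarised by a personal probability of choosing gamble A); it is not taken from any study cited in this thread.

```python
from fractions import Fraction

# Hypothetical populations: each entry is one person's probability of choosing gamble A.
homogeneous = [Fraction(7, 10)] * 10                     # everyone picks A 70% of the time
heterogeneous = [Fraction(1)] * 7 + [Fraction(0)] * 3    # 7 always pick A, 3 never do


def group_choice_rate(population):
    """Proportion of A choices expected at the group level."""
    return sum(population) / len(population)


def within_person_variance(population):
    """Average Bernoulli variance p(1 - p) of each person's own choices."""
    return sum(p * (1 - p) for p in population) / len(population)


for name, population in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    print(name, group_choice_rate(population), within_person_variance(population))
```

Both populations give the same group-level choice rate, yet the individuals are built very differently: in one case every person is variable, in the other nobody is. Only repeated observation of the same persons, or observation in other contexts, separates the two.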


Birnbaum (2010) recently conducted a series of choice experiments designed to test lexicographic semiorder theories of utility, including the priority heuristic. To summarise his rather intensive study, Birnbaum found that the lexicographic semiorder theories did not describe choice behaviour that well. For example, priority dominance was systematically violated, meaning that choices were made using attributes of “lower” priority than those argued to be of “greater” priority by the priority heuristic. Subjects also did not display the intransitive choice behaviour that lexicographic semiorder theories predict. However, the most frequent patterns of individual choice behaviour were consistent with the quantitative TAX theory.

 More research is needed of course, but it would seem that utility may indeed be a psychological quantity.

Here I would have serious doubts; there is quite strong evidence (especially that collected in the framework of cultural-historical psychology) suggesting that there are qualitatively different ways of information processing available to humans. These ways constitute a developmental hierarchy. Perhaps at some levels this utility might be processed as quantity but definitely not at all levels of this hierarchy. At some less developed levels even the question about quantitative relationships between losses and gains would not arise; the reason why some lottery is chosen might be that "my neighbor won on that too" or "my favourite color is blue and the tickets of this lottery are blue too". I suspect the studies you refer to are conducted mostly or exclusively with very highly educated subjects (students?). Studying only highly educated people creates a very distorted view on psyche.

best

aaro

Andrew Kyngdon

Mar 21, 2011, 8:34:29 PM3/21/11
to talking-m...@googlegroups.com

Aaro,

 

Your counterpoints are speculative and as such there is nothing in your response which casts doubt upon Birnbaum’s findings. Indeed, your posts thus far seem to consist of speculation and conjecture. You have responded to points addressing your speculation with only more speculation.

 

Nonetheless, you made two coherent statements - the utility of gains and losses under conditions of risk or uncertainty is non quantitative and that different theories of utility can be proposed. I presented empirical work directly relevant to these points. That utility may not be a quantity is an hypothesis that has received serious attention in the past 40 years. However, lexicographic semi-order theories such as the priority heuristic have never been as descriptively powerful as either prospect theory, transfer of attention exchange or rank dependent utility theory. Indeed, heuristics only seem to describe the choice behaviour from which they have been derived (Birnbaum, 2008). Obtain a fresh set of choice data, and heuristics fail. Even Brandstatter, et al (2008) had to concede that heuristic theories of choice were not descriptively powerful enough to displace the quantitative theories of utility. Hence it would seem that the evidence is against non quantitative theories of decision making under risk, or at least theories based on lexicographic semi-orders. But this isn’t really news. Concerns about the descriptive adequacy of lexicographic semi-order theories of utility have existed for some time (e.g., Grether & Plott, 1979).
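For readers unfamiliar with the term, the intransitivity that lexicographic semiorder theories predict can be sketched in a few lines. The rule, the threshold of 0.1 and the three gambles below are hypothetical choices made for illustration; this is not the priority heuristic itself, nor the design of any experiment mentioned above.

```python
def choose(first, second, threshold=0.1):
    """Pick a gamble with a two-step lexicographic semiorder (illustrative rule).

    Each gamble is (probability_of_winning, payoff). Step 1 compares
    probabilities and decides only if they differ by more than `threshold`.
    Otherwise step 2 decides on payoff alone.
    """
    prob_gap = first[0] - second[0]
    if abs(prob_gap) > threshold:
        return first if prob_gap > 0 else second
    return first if first[1] >= second[1] else second


# Hypothetical gambles: probability rises in small steps while payoff falls.
A, B, C = (0.30, 5.00), (0.38, 4.50), (0.46, 4.00)

print(choose(A, B) == A)  # adjacent pair: probabilities too close, payoff decides
print(choose(B, C) == B)  # adjacent pair: same reasoning
print(choose(A, C) == C)  # distant pair: probability gap exceeds the threshold
```

Under this rule A is chosen over B and B over C, yet C is chosen over A. A single utility scale cannot order the three gambles that way, which is why such cycles are the signature prediction that the choice experiments look for.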

 

Incidentally, human choice behaviour under risk is quite consistent over time and across different samples of people. For example, the preference reversals of the Allais Paradox have been replicated in every choice study in which the Paradox has been tested for almost the past 50 years. Kahneman & Tversky (1979) observed these preference reversals in the choice behaviour of psychology students from Israel and the US. Grether & Plott (1979) observed the behaviour in US students of economics and Oliver (2003) observed it in the choice behaviour of staff from a major London healthcare facility. Moreover, Tversky & Kahneman (1981) found that framing effects in medical applications of utility theory occurred in samples consisting of either university students or highly trained medical physicians. Johnson, Hershey, Meszaros & Kunreuther (1993) also discovered framing effects when business executives were asked to make choices concerning insurance products. Even Birnbaum (2000) conducts choice experiments on the Internet. So your argument of sample bias does not seem to be supported.
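As a reminder of why the Allais pattern is a problem for expected utility theory, here is a minimal sketch using the classic textbook version of the gambles (amounts in millions). The gamble values and the three candidate utility functions are assumptions chosen for illustration; the studies cited above used their own stimuli.

```python
import math

# Classic textbook Allais gambles as lists of (probability, outcome in millions).
GAMBLES = {
    "1A": [(1.00, 1)],
    "1B": [(0.89, 1), (0.10, 5), (0.01, 0)],
    "2A": [(0.11, 1), (0.89, 0)],
    "2B": [(0.10, 5), (0.90, 0)],
}


def expected_utility(gamble, utility):
    """Probability-weighted sum of the utilities of the outcomes."""
    return sum(p * utility(x) for p, x in gamble)


def eu_gap(first, second, utility):
    """Expected-utility advantage of gamble `first` over gamble `second`."""
    return expected_utility(GAMBLES[first], utility) - expected_utility(GAMBLES[second], utility)


# Arbitrary increasing utility functions, used only to show the pattern.
candidates = {"linear": lambda x: x, "sqrt": math.sqrt, "log1p": math.log1p}
for name, u in candidates.items():
    print(name, round(eu_gap("1A", "1B", u), 6), round(eu_gap("2A", "2B", u), 6))
```

Whatever utility function is used, the two gaps are equal, since both reduce to 0.11 u(1) - 0.10 u(5) - 0.01 u(0). Expected utility therefore requires anyone who prefers 1A to 1B to also prefer 2A to 2B. The commonly reported pattern (1A together with 2B) cannot be produced by any single utility function.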

 

Andrew

 

Andrew Kyngdon, PhD

MetaMetrics, Inc.

www.lexile.com

My website: https://sites.google.com/site/drandrewkyngdon/home

Measurement Forum: http://groups.google.com/group/talking-measurement

 


Sent: Tuesday, 22 March 2011 2:18 AM
To: talking-m...@googlegroups.com

--

Stephen Humphry

Mar 21, 2011, 11:58:53 PM3/21/11
to talking-m...@googlegroups.com
Hi again Guenter.

You say:

"In my view the question is if it is possible theoretically to determine constants in a law without being able to measure any of the factors involved, i.e. without 'something corresponding to linear expansion'. As is well known Rasch's crucial insight was that theoretically it is indeed possible. The probability of (a correct) response is 'measurable' and this is the starting point. True, this 'measurable' quantity is not (yet) related in various ways to other measurable quantities as you point out. However, maybe a more suitable physical analogy to the situation is, instead of Ohm's law, the level of knowledge about electricity at the time of William Gilbert (1544 - 1603). One of Gilbert's crucial observations was that a metallic needle (i.e. primitive electroscope) is deviated by the electricity produced by rubbing amber with cloth. Of course Gilbert had some 'primitive' theory about this phenomenon (for details see Bordeau: Volts to Hertz). Note that the measurable quantity 'angle deviation' is not yet connected to other quantities just as is the case with the Rasch model, but that is basically how all got started. However, I jumped into this discussion to take position on Josh question if there is 'any existing psychological theory that has already commenced 'such research'' to which my answer is: 'Yes, in my view such theory exists and the research has already commenced.' Of course, the question remains why no progress has been made beyond the 'Gilbert level' but that's another topic. Still no agreement about 'the starting observation' and amount of theory necessary to get started, Steve?"

Steve:

You say above: "the probability of a correct response is 'measurable' and this is the starting point." Can you explain to me what you mean by "the probability (of anything) is measurable"? How about "odds"? Would you say that if I obtain the ratio of the frequency of occurrences of an event A to the frequency of occurrences of event B, that is a measurement?

I think we need to be clear on this first.
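To keep the terms apart, the arithmetic behind the question can be written out. The counts below are hypothetical, and the sketch takes no position on whether any of these numbers is a measurement; it only shows how a relative frequency, the odds and the log-odds are obtained from the same two counts.

```python
import math


def relative_frequency(count_a: int, count_b: int) -> float:
    """Proportion of all trials on which event A occurred."""
    return count_a / (count_a + count_b)


def odds(count_a: int, count_b: int) -> float:
    """Ratio of the frequency of event A to the frequency of event B."""
    return count_a / count_b


def log_odds(count_a: int, count_b: int) -> float:
    """Natural logarithm of the odds (the logit)."""
    return math.log(odds(count_a, count_b))


# Hypothetical counts: 30 correct responses (A) and 10 incorrect responses (B).
p = relative_frequency(30, 10)
o = odds(30, 10)
print(p, o, log_odds(30, 10))
print(p / (1 - p))  # the odds recovered from the proportion alone
```

All three numbers are functions of the same pair of counts, so if one of them is called a measurement, the status of the other two has to be settled by the same argument.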

Best, Steve

-----Original Message-----
From: talking-m...@googlegroups.com [mailto:talking-m...@googlegroups.com] On Behalf Of Trendler, Guenter
Sent: Sunday, 20 March 2011 4:37 PM
To: talking-m...@googlegroups.com
Subject: AW: [talking-measurement] The Rasch model and Psychological Measurement


Hi Steve,

You wrote: "I think there's an important distinction to be made. If we look back on historical events in light of the theory and know-how now at our disposal, it may seem this way. Let's just consider a few details (apologies for anything not entirely accurate, historically--I do not think it should alter the basic message). Ohm's work is entitled (in English) The Galvanic Circuit Investigated Mathematically. Ohm used galvanic or voltaic cells (and later a thermocouple device). He made explicit reference to Fourier and Poisson, saying that the form of the differential equations is similar to those given for the propagation of heat. He referred to generalizations deducible from the phenomena and empirical results. He formulated several quantitative equations in arriving at the one we now call Ohm's law. He referred to the "force of a current" (current) and tensions (potentials I think). Ohm used a number of detailed figures of the circuits. He used a galvanometer to measure current (crude by today's forms, nevertheless an intricate instrument). So it's important not to overestimate the degree to which theory was unified, or the degree to which instruments had intricate designs, viewing them by modern standards. It's equally important, I think, not to underestimate either the degree to which theory and definition were developed or the intricacy of the instruments by the standards of the day."

There is a certain danger that we start splitting hairs here. The point I want to make is that in order to apply measurement theory we don't need much 'sophisticated' theory to get started with experimenting. Furthermore, theory and experiment have to go hand in hand from the simple to the complex. We should therefore try to avoid burdening established empirical discoveries with too much theory. An extreme case where a huge theory was built upon no empirical evidence at all is Herbart's 'Psychologie als Wissenschaft'. I believe that the psychological theory as presented by Rasch in his 'Probabilistic models for some intelligence and attainment tests' is enough theory to get started, which does not mean that I believe that such a start would or must be successful.

You also argue: "Sadly, I disagree that it is a similar starting observation. Examined in terms of metrology and systematic measurement (and units), the problem is the lack of something corresponding to linear expansion. The spatial extent (volume) of a material or substance is measurable, and this is a starting point. Once it can be shown in a systematic manner that heat transference affects spatial extent under specific conditions, there is a basis for measuring heat (and ultimately temperature). The problem is that "relative frequencies" of errors and so forth are not established measurements. Rasch's work to show that frequencies were systematically related and could be used to infer estimates of "parameters" is very useful. However, it doesn't have the same basis in a clearly measurable attribute (spatial extent) that is also related to various other physical dimensions."

In my view the question is if it is possible theoretically to determine constants in a law without being able to measure any of the factors involved, i.e. without 'something corresponding to linear expansion'. As is well known Rasch's crucial insight was that theoretically it is indeed possible. The probability of (a correct) response is 'measurable' and this is the starting point. True, this 'measurable' quantity is not (yet) related in various ways to other measurable quantities as you point out. However, maybe a more suitable physical analogy to the situation is, instead of Ohm's law, the level of knowledge about electricity at the time of William Gilbert (1544 - 1603). One of Gilbert's crucial observations was that a metallic needle (i.e. primitive electroscope) is deflected by the electricity produced by rubbing amber with cloth. Of course Gilbert had some 'primitive' theory about this phenomenon (for details see Bordeau: Volts to Hertz). Note that the measurable quantity 'angle deviation' is not yet connected to other quantities just as is the case with the Rasch model, but that is basically how it all got started. However, I jumped into this discussion to take a position on Josh's question if there is 'any existing psychological theory that has already commenced 'such research'' to which my answer is: 'Yes, in my view such theory exists and the research has already commenced.' Of course, the question remains why no progress has been made beyond the 'Gilbert level' but that's another topic. Still no agreement about 'the starting observation' and amount of theory necessary to get started, Steve?
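The claim that constants can be determined without measuring the other factors can be illustrated with the dichotomous Rasch model in its usual logistic form. The formula is not written out anywhere in this thread, so the form used here, and the ability and difficulty values, are assumptions made for the sake of a short sketch.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """P(correct response) under the dichotomous Rasch model, logistic form."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def logit(p: float) -> float:
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def item_gap(ability: float, item_one: float, item_two: float) -> float:
    """Difference in log-odds of success on two items for one person."""
    return logit(rasch_probability(ability, item_one)) - logit(rasch_probability(ability, item_two))


# Hypothetical item difficulties and person abilities, in logits.
easy_item, hard_item = -0.5, 1.0
for ability in (-1.0, 0.0, 2.0):
    print(ability, round(item_gap(ability, easy_item, hard_item), 6))
```

Under the model the gap equals the difference between the two item difficulties for every ability value, so the comparison of items does not depend on which persons happen to be tested. Whether the response probabilities entering this calculation are themselves measurable is exactly the point in dispute.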

aaro

Mar 22, 2011, 7:56:19 AM3/22/11
to talking-m...@googlegroups.com, Andrew Kyngdon
Andrew,


On Tuesday, March 22, 2011 2:34:29 AM UTC+2, Andrew Kyngdon wrote:

Your counterpoints are speculative and as such there is nothing in your response which casts doubt upon Birnbaum’s findings.

Which of the arguments is speculative? I will address them one by one
 
1. I suggested that "I suspect the studies you refer to are conducted mostly or exclusively with very highly educated subjects (students?). Studying only highly educated people creates a very distorted view on psyche."

Interestingly, you provide a list of studies that fully supports my doubts and you conclude that my suggestion was not supported? I wrote that study of exclusively very highly educated persons (everybody who has more education than primary is very highly educated compared to the world population) creates problems. You list ONLY very highly educated samples:

Incidentally, human choice behaviour under risk is quite consistent over time and across different samples of people. For example, the preference reversals of the Allais Paradox have been replicated ... psychology students ...  students of economics ... ... staff from a major London healthcare facility ... university students or highly trained medical physicians ...  business executives ... choice experiments on the Internet.

I see only two possible samples that may include some less educated people: the staff from a healthcare facility and the Internet users. Internet users have computers, know how to use them and are interested in participating in the study. I have conducted studies of persons with a low level of education (including illiterates in Brazil). They do not use the Internet and they are often not very motivated to participate. Oliver does not provide the necessary information, but says that the sample ranged "from the general office staff to the directors"; I doubt whether uneducated people would work as office staff.

So, you support my argument and then you say:
 

So your argument of sample bias does not seem to be supported.

As to my other "speculations"  I proposed these:

2. externally identical behaviors can be based on structurally different minds

 I can provide some hundreds of studies supporting this conclusion. Luria's neuropsychological studies, and especially his work in neuropsychological rehabilitation alone, would be sufficient; but also Siegler's studies on arithmetical operations in children, Nunes on the same subject but in cross-cultural perspective ....

3.  no variable-based theory is able to distinguish between such differently composed minds that behave in some situations similarly

My 2007 article in Integrative Psychological and Behavioral Science on variables discusses this issue in detail. But no sophisticated discussion is needed here because there is no mathematical way to show that two different behaviors attributed the same code are essentially different.

2. in other words again: there are qualitatively different ways of information processing available to humans.
 
4. These ways constitute a developmental hierarchy.

Numerous works in the Vygotsky-Luria direction; I have summarized such work in different publications. The evidence is clear both for ontogenesis and cultural evolution (on ontogenesis, the chapter I published in the 2003 book I also edited, Cultural guidance in the development of the human mind; on cultural evolution, the 2003 chapter in Dialogicality in development, edited by Josephs); the same hierarchy can be observed in medical decision making: Toomela (2005) in Science and medicine in dialogue, edited by Bibace et al.
 
5. Perhaps in some levels this utility might be processed as quantity but definitely not at all levels of this hierarchy.

Follows from the analysis of the hierarchical stages of development

Which of them is speculation? 

Best

Aaro

Andrew Kyngdon

Mar 22, 2011, 5:48:20 PM3/22/11
to talking-m...@googlegroups.com

Aaro,

 

All of your arguments against utility theory have been speculative. Why? Because there exists no established body of research that shows how decision making under conditions of risk or uncertainty is either influenced or caused by any of the things you’ve been arguing. For example, you have presented nothing which would convince someone familiar with utility theory that Luria’s work on neurological rehabilitation has anything to do at all with the St Petersburg Paradox, or violations of stochastic dominance, or the common ratio effect of the Allais Paradox. How would “cultural evolution” have anything to do with the descriptive failures of lexicographic semi-order theories of utility? Or do you expect me to believe you just because you say that these things are relevant? You need evidence to back up your arguments, not speculation or your opinion.

 

By the way, Lattimore, Baker & Witte (1992) recruited both a sample of college students and a sample of North Carolina prison inmates in their choice study. They found no material differences in choice behaviour between the two groups of subjects, with the exception that the male college student choice behaviour seemed to accord slightly (and I mean slightly) more with the traditional expected utility theory. Female college students and the prison inmates seemed to weight outcome probabilities as predicted by cumulative prospect theory. However, this was only for lotteries consisting of gains. For gambles consisting of losses, the choice behaviour of the prison inmates and the college students was virtually identical.

 

Given that such things like the Allais, Ellsberg and St Petersburg Paradoxes are some of the most experimentally reproducible phenomena in the behavioural sciences, I doubt anyone in utility theory would seriously entertain the idea that the whole field is compromised by your allegation of sample bias. The St Petersburg Lottery was first proposed in 1738. Today, over 270 years later, people continue to be willing to pay only a small fee to play the St Petersburg Lottery, despite the fact that this lottery has an infinite expected value (meaning that  a player could possibly become “infinitely” rich). Hence the term “St Petersburg Paradox”. In 1980 the philosopher Hacking suggested that only a few people would pay even USD$25 to play the game (Hacking, 1980). I doubt the St Petersburg Paradox would have survived for centuries if it only pertained to university graduates, especially given there were hardly any university graduates back in 1738. As the Lattimore, et al (1992) study showed, it’s not the sample which is the greatest influential factor in choice behaviour, but the kind of risky situation that people have to make decisions in.
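The "infinite expected value" of the St Petersburg Lottery is easy to check numerically. The sketch below assumes the common formulation in which a fair coin is tossed until the first head and a head on toss k pays 2^k; it also ignores the player's existing wealth when applying a logarithmic utility, which is a simplification of Bernoulli's 1738 proposal.

```python
import math


def truncated_expected_value(max_tosses: int) -> float:
    """Expected payoff when the game is capped at `max_tosses` tosses.

    A first head on toss k has probability 2**-k and pays 2**k, so each
    term contributes exactly 1 to the sum.
    """
    return sum((0.5 ** k) * (2 ** k) for k in range(1, max_tosses + 1))


def truncated_expected_log_utility(max_tosses: int) -> float:
    """Expected log-utility of the payoff under the same cap."""
    return sum((0.5 ** k) * math.log(2 ** k) for k in range(1, max_tosses + 1))


for n in (10, 20, 40):
    certainty_equivalent = math.exp(truncated_expected_log_utility(n))
    print(n, truncated_expected_value(n), round(certainty_equivalent, 4))
```

The expected value grows by one unit with every additional toss allowed, so it has no finite limit. The log-utility certainty equivalent, by contrast, settles at about 4 units, which is in line with people offering only a small fee to play.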

 

Andrew

 

From: aaro [mailto:aaro.t...@ut.ee]

Sent: Tuesday, 22 March 2011 10:56 PM
To: talking-m...@googlegroups.com

aaro

Mar 23, 2011, 11:58:24 AM3/23/11
to talking-m...@googlegroups.com
Andrew,


On Tuesday, March 22, 2011 11:48:20 PM UTC+2, Andrew Kyngdon wrote:

All of your arguments against utility theory have been speculative. Why? Because there exists no established body of research that shows how decision making under conditions of risk or uncertainty is either influenced or caused by any of the things you’ve been arguing. For example, you have presented nothing which would convince someone familiar with utility theory that Luria’s work on neurological rehabilitation has anything to do at all with the St Petersburg Paradox, or violations of stochastic dominance, or the common ratio effect of the Allais Paradox. How would “cultural evolution” have anything to do with the descriptive failures of lexicographic semi-order theories of utility? Or do you expect me to believe you just because you say that these things are relevant? You need evidence to back up your arguments, not speculation or your opinion.

If your point is that there is no evidence related to the paradoxes you mention then here I agree with you. If you say that there have been no studies on decision-making, then this is wrong; and Luria's work on rehabilitation includes work on rehabilitation of lost decision-making abilities. But this is not the point.

I suggested that externally similar behavioral results can be based on psychologically different mechanisms and vice versa. This principle has been supported by so many studies in such different areas of research that I do not see any reason to suspect that it does not apply to the utility behaviors. I also think that no studies are needed to support this principle; observations of everyday behaviors would be sufficient. Even more, you pushed me to think more on the subject of our discussion and I realized that you already provided evidence that the same principle applies to the phenomenon you refer to--there, too, different psychological ways of solving the same problems have been observed. You provided two kinds of evidence. First, persons who know the paradox solve the problems differently from those who do not know it. And second, you also mentioned in an earlier post that not all subjects behave in the ways expected by the theory. This is exactly what I would also predict on the basis of "speculations" -- which might also be called generalizations.

So, if there is more than one way to make decisions in the tasks used to study the paradoxes then I think it is the real subject matter of psychology to discover what these mechanisms are. And explanation is not finding a name, such as heuristic, for the phenomenon. The explanation would reveal what units of thought are organized in which specific relationships so that observed behavioral regularities emerge. The studies and theories you mentioned, from my (cultural-historical or structural-systemic) perspective are not even trying to reveal the psychological operations that underlie the observed behaviors. One reason for that is that behavioral results are encoded into variables before revealing whether externally the same results emerge on the basis of the same psychological operations or not. This question emerges naturally when more than one way of solving the studied problems can be observed. As I said, you provided that evidence by yourself.
 

 By the way, Lattimore, Baker & Witte (1992) recruited both a sample of college students and a sample of North Carolina prison inmates in their choice study. They found no material differences in choice behaviour between the two groups of subjects, with the exception that the male college student choice behaviour seemed to accord slightly (and I mean slightly) more with the traditional expected utility theory. Female college students and the prison inmates seemed to weight outcome probabilities as predicted by cumulative prospect theory. However, this was only for lotteries consisting of gains. For gambles consisting of losses, the choice behaviour of the prison inmates and the college students was virtually identical.

Well, I think the evidence you provide cannot support your conclusions; because for different reasons, group-level data analyses cannot in principle be translated back to psychological operations that characterize the individual level of analysis. Peter Molenaar has written on this issue from the mathematical perspective; there are other arguments to support this conclusion that you can find already in Kurt Lewin's works, for instance. Also several other authors talking about person-level approach are relevant here (Lienert, Meehl, von Eye, Magnusson, to name just a few). In addition, the effect of education on many test performances is not linear, the largest differences are observed between illiterates and those with only 1-2 years of education; the differences almost disappear among people with more than about 9 years of education. How educated were the inmates in this study?  And even that is not really important here because ...
 

 Given that such things like the Allais, Ellsberg and St Petersburg Paradoxes are some of the most experimentally reproducible phenomena in the behavioural sciences, I doubt anyone in utility theory would seriously entertain the idea that the whole field is compromised by your allegation of sample bias. The St Petersburg Lottery was first proposed in 1738. Today, over 270 years later, people continue to be willing to pay only a small fee to play the St Petersburg Lottery, despite the fact that this lottery has an infinite expected value (meaning that  a player could possibly become “infinitely” rich). Hence the term “St Petersburg Paradox”. In 1980 the philosopher Hacking suggested that only a few people would pay even USD$25 to play the game (Hacking, 1980). I doubt the St Petersburg Paradox would have survived for centuries if it only pertained to university graduates, especially given there were hardly any university graduates back in 1738. As the Lattimore, et al (1992) study showed, it’s not the sample which is the greatest influential factor in choice behaviour, but the kind of risky situation that people have to make decisions in.

... in 1738 most humans were illiterate, I agree here. But that is also not so important. What I am suggesting is that there is strong evidence for different ways of problem solving in many different domains. I generalize this principle to the domain of decision-making in risky situations without empirical support available yet. Maybe even introspection would give some necessary support? Can you tell that you will solve these lottery problems in the same way as the usual subjects in the studies? I can tell that I would not. Psychological theory, I think, should aim at understanding these differences in problem-solving mechanisms. For me the theories you mention are not psychological theories even though behavior is studied in them. But this is already a question of epistemological differences between us and it seems our disagreement emerges from much deeper sources than particulars related to decision-making in risky situations.

Best

Aaro

Andrew Kyngdon

Mar 23, 2011, 7:39:04 PM3/23/11
to talking-m...@googlegroups.com

Aaro,

 

If your point is that there is no evidence related to the paradoxes you mention then here I agree with you. If you say that there have been no studies on decision-making, then this is wrong; and Luria's work on rehabilitation includes work on rehabilitation of lost decision-making abilities. But this is not the point.

 

The point is that you have singularly failed to explicate how anything of what you say relates to decision making under risk and uncertainty. It is very obvious that you know next to nothing of the field. Arguing from a position of near total ignorance is not going to convince anyone. At least now though you agree that there is no established research which backs your speculation. Now, Luria’s work sounds interesting, but once more you have failed to explain how it relates to decision making under risk in any specific way. Again, you seem to think that I should believe you because you say it does.

 

I suggested that externally similar behavioral results can be based on psychologically different mechanisms and vice versa. This principle has been supported by so many studies in so different areas of research that I do not see any reason to suspect that it does not apply to the utility behaviors. I also think that no studies are needed to support this principle, observations of everyday behaviors would be sufficient.

 

So, are you saying that you do not need scientific evidence to support your hypothesis that “behavioral results can be based on psychologically different mechanisms”? If so, then you are no scientist. Indeed, your hypothesis is a woolly and vague generalisation, not a hypothesis concerning a psychological system or component thereof. Hypotheses without rigorous scientific evidence supporting them are nothing more than speculation. So once again, you are merely speculating.

 

Even more, you pushed me to think more on the subject of our discussion and I realized that you already provided evidence that the same principle applies to the phenomenon you refer to--there also different psychological ways for solving the same problems have been observed. You provided two kinds of evidence. First, persons who know the paradox solve the problems differently from those who do not know. And second, you also referred in an earlier post that not all subjects behave in the expected by the theory ways. This is exactly what I would also predict on the basis of "speculations" -- which might be called also generalizations.

I said no such thing. Either this is a mistake on your part or you are deliberately engaging in a straw man fallacy. What I said was this “…theories of utility can be proposed which are descriptively different.” I did not say anything about people solving paradoxes in different ways.

 

The studies and theories you mentioned, from my (cultural-historical or structural-systemic) perspective are not even trying to reveal the psychological operations that underlie the observed behaviors.

 

Once more you are completely wrong. You should actually try reading some literature on decision making under risk and uncertainty. Kahneman & Tversky’s (1979) classic paper on prospect theory is quite readable for the non expert. In this paper, Kahneman & Tversky explain why people weight the outcome probabilities of a lottery in a non-linear way. This they called the “fourfold pattern of attitude towards risk”. They found that people are risk averse in the context of probable gains, but they counterintuitively seek risk in the face of certain losses. But people are also averse to risk in the context of improbable losses (e.g., the purchasing of renter’s insurance) and they seek risk for improbable gains (e.g., poker machine gambling). So the mathematical weighting function in quantitative theories of utility does have a firm, descriptive psychological basis. Also, there is an affective component to utility. People find losses more painful than they find gains pleasurable, and the utility function in CPT accounts for this. So your argument that theories of decision making under risk are “not even trying” to be descriptive of the psychological processes that underlie choice behaviour is totally false.
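
For readers who have not seen these functions, here is a rough sketch of what a non-linear probability weighting function and a loss-averse value function can look like. The functional forms and the parameter values (gamma = 0.61, alpha = 0.88, lambda = 2.25) are the commonly cited estimates from Tversky & Kahneman's 1992 cumulative prospect theory paper; they are assumptions brought in for illustration and are not taken from this thread. The function names are made up for the example.

```python
# Illustrative sketch only. Functional forms and default parameters are the
# commonly cited Tversky & Kahneman (1992) estimates, assumed here for
# illustration; the function names are hypothetical.

def weight(p, gamma=0.61):
    """Inverse-S-shaped probability weighting function for gains."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)


def value(x, alpha=0.88, lam=2.25):
    """Value function: concave for gains, steeper for losses (loss aversion)."""
    return x**alpha if x >= 0 else -lam * (-x) ** alpha


# Small probabilities are overweighted, large ones underweighted.
for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    print(f"p = {p:.2f}   w(p) = {weight(p):.3f}")

# A loss of 100 weighs more heavily than a gain of 100.
print(f"v(+100) = {value(100):.2f}   v(-100) = {value(-100):.2f}")
```

With these assumed parameters, overweighting of small probabilities goes with risk seeking for improbable gains and risk aversion for improbable losses, while underweighting of large probabilities, together with the curvature of the value function, goes with the other two cells of the pattern.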

 

Perhaps your “cultural – historical or structural systemic perspective” (whatever that means) is too blinkered, or you believe in it too much?

 

Well, I think the evidence you provide cannot support your conclusions; because for different reasons, group-level data analyses cannot in principle be translated back to psychological operations that characterize the individual level of analysis.

 

Ha! Now you are changing your argument in the face of evidence which refutes your earlier position. You said “I wrote that study of exclusively very highly educated persons (everybody who has more education than primary is very highly educated compared to the world population) creates problems.”

 

That aside, utility theorists are already way ahead of you. Luce (2000) and Birnbaum (1999) argued that tests of decision making phenomena should also be done at the level of the individual. It has been found that choice behaviour such as event splitting effects holds at the level of the individual just as it does for groups (e.g., Birnbaum, 1999c, 2004a, 2007b; Humphrey, 1998, 2000, 2001a, 2001b). Moreover, at the individual level, people who make choices under risk weight the outcome probabilities of lotteries in a non-linear way (e.g., Abdellaoui, 2000; Gonzalez & Wu, 1999; Lattimore, et al, 1992), which is exactly what the current quantitative theories of utility argue. If your allegation of sample bias had any weight, it would have been highly unlikely that Lattimore, et al’s (1992) North Carolina prisoners non-linearly weighted lottery outcome probabilities just as Kahneman & Tversky’s (1979) Israeli psychology students did.

 

What I am suggesting is that there is strong evidence for different ways of problem solving in many different domains. I generalize this principle to the domain of decision-making in risky situations without empirical support available yet. Maybe even introspection would give some necessary support?

 

If you do not have empirical support, then what you say cannot be anything more than speculation. Mere speculation cannot cast doubt upon established theories. Introspection? You’ve got to be joking, right?

 

The point I made is that there are quantitative and non-quantitative, heuristic based theories of utility; and that the non-quantitative theories have been descriptive failures compared to the quantitative ones. You have systematically refused to engage with this point and have thrown at me instead all sorts of red herrings, such as sample bias.

 

For me the theories you mention are not psychological theories even though behavior is studied in them.

 

This is a contradiction and is therefore logically false. A theory of human behaviour is a theory of psychology, given that psychology is the study of human (and animal) behaviour. Utility theories are theories of choice behaviour and so therefore are psychological theories. As I said before, I’ve never seen a rock make a decision.

 

Andrew

 


Sent: Thursday, 24 March 2011 2:58 AM
To: talking-m...@googlegroups.com

--

aaro

Mar 24, 2011, 7:38:33 AM3/24/11
to talking-m...@googlegroups.com
Andrew,

I see already too many emotions intervening into our discussion. I do not feel that evaluative statements and expressions are helpful in developing a constructive discussion. I can only blame myself in all this because I have given too many red herrings, made jokes (e.g. I suggested that introspection can be a valuable scientific tool), I know "next-to-nothing" of the field and demonstrate near total ignorance, I am no scientist, etc.

As I see the situation, we have fundamental epistemological disagreements. We understand differently, what is explanation, what is scientific methodology, what methods can provide explanation. We disagree in what scientific questions are worthy to answer and why. We disagree in the definition of psyche. I reject behaviorism that logically follows from your definition of psychology. I do not think that giving a name to a process or series of events is an explanation. There are more differences which are rooted in qualitatively different scientific world-views. There is no point in expressing disagreements in superficial issues when the reasons of the disagreements are elsewhere. Better to end this discussion before it goes too deconstructive.

It might be that you are right in everything you have written and I am totally wrong. ... And yet I wait to see how the approach you took leads to constructive experiments.

With best regards,

Aaro




On Thursday, March 24, 2011 1:39:04 AM UTC+2, Andrew Kyngdon wrote:

Andrew Kyngdon

Mar 24, 2011, 6:06:24 PM3/24/11
to talking-m...@googlegroups.com

Aaro,

 

You are right. Call me old fashioned, but I think that in order to convincingly critique an established theory, one has to understand that theory well, articulate an argument which explicitly criticises key components or assumptions of that theory and present evidence in support of the argument. I do not think it’s wise to continue debating with someone who makes straw man arguments, such as

 

I reject behaviorism that logically follows from your definition of psychology.

 

and

 

I do not think that giving a name to a process or series of events is an explanation.

 

I never said that I endorse behaviourism. Neither does my definition of psychology logically rule out the study of cognitive phenomena. Why would I want to do that anyway, given that making decisions under risk obviously involves cognitive processes? And I never argued that reification constitutes a scientific theory. Once more, you’ve made the classic straw man fallacy.

 

It’s funny you mention constructive experiments, as Steve Humphry and I have been planning a choice experiment this week.

 

Cheers,

 

Andrew

 


Sent: Thursday, 24 March 2011 10:39 PM
To: talking-m...@googlegroups.com

--

WimpieXL

Mar 25, 2011, 7:30:01 AM3/25/11
to Talking Measurement
Dear measurement enthusiasts,

If the subject of this forum is thought of as a patient who seeks
advice from a doctor, the doctor’s diagnosis seems to be that the
patient is working too much with mathematical models. The cure is
another matter altogether. The doctor has to seek advice from
consultants in the hospital. Some of them suggest a quite spectacular
cure in the form of what is called “constructural experiments” i.e.
building a device with the available knowledge that will perform the
process the patient is worried about. (If the patient had a problem
with understanding steam, the advice would be to build a steam
engine….) But our patient has a problem understanding psychology or
better the psychic apparatus that is present in every human. This cure
is no mean task.

But I hope that in a few years' time CNN and the rest of the world-
media will flock to Estonia to report on Aaro’s “psychic ability-
machine” that has a personality (or rather personality traits) too. I
do not hope that this machine will be only a computer simulation or
computer programme. In that case CNN won’t send any camera teams.

It is not from AI that the solution will come, I prophesize. That line
of theorizing about human cognition and abilities has been talked to
death more than a decade ago. If you programme everything you want
“cognition” to do it does not prove very much. For one thing, a
computer programme can’t have real feelings and emotions. This fact
alone will preclude a working model that has anything to do with real
psychic processes.

Pure cognition can be implemented in computer chips these days. Even
visual perception that has eluded AI specialists for so long can be
implemented. Already there are cars without a human driver riding on
the roads in the US. But these artificial drivers will never be
startled by a sudden occurrence on the road. It won’t have a heart and
therefore its heart will never skip a beat. Of course an emotion
algorithm could be added to the driving programme per se. (Like Data
had in the SF TV series Star Trek. But that doesn’t prove anything….)

Without real emotions there will be no human psychic traits. And our
form of emotions will not appear as an emerging new property in the
computer programmes of the robotic driver. (That only happens in SF.)
Emotions rely on hormones and neurotransmitters and phylogenetically old
limbic structures in the brain. I hope Aaro’s construction will have
nuts and bolts or perhaps it will grow in a glass lab container.
Building a brain de novo would be quite spectacular.

But luckily there are other ways to conduct a constructural experiment.
Aaro cites the cognitive rehabilitation of a patient by the Russian
neuropsychologist Luria as an example of a successful constructural
experiment. The success of this rehabilitation “proves” that Luria’s
theory must be sound. Alas, I am sorry to say that in real life
cognitive rehabilitation of neurological patients is never truly
successful and that the inter-individual differences in patient groups
are very large. In some patients the rehabilitation programme just
will not work! Perhaps in those patients their organization of
cognition (or the affected brain region) is different. But the fact
that a large number of patients cannot be rehabilitated disproves the
success of this type of constructural experiment, I think.

If success is measured by the ability to perform the activities of
daily living (ADL) to such a degree that the patient can live at his
or her own home (with some help), cognitive rehabilitation can be
successful. But no psychologist in the world is able to restore a
defective memory in the real sense of restoring the “normal” processes
of encoding, storage and retrieval, or to restore language processing
in an aphasic patient. Yes, the patient can sometimes learn to cope
successfully with his or her disability and perform ADL. This is no mean
feat for the psychologist. I don’t believe that Luria did any better
in the 1940’s. The Russian language precludes a more detailed critique
of Luria’s work. If it was really successful he would have certainly
included the case in one of his English publications like “The working
brain” or “The man with a shattered world”, but he didn’t.

I wonder whether the problems with mathematical models you guys are
talking about, also apply to the case of inferential hypothesis
testing, as in analyzing the results of an experiment. Sometimes very
broad statements are made, like “The point I am making is that
mathematics is useful in principle. But it should be used for
appropriate purposes. And psychology (perhaps among other sciences),
abuses mathematics; mathematics cannot be used for discovering the
explanation of the studied phenomena as it is too often attempted in
psychology; no mathematical model is also able to formulate an
explanatory (constructive) theory of the studied object. The potential
of mathematics is grossly overestimated. The GLM usually used in
hypothesis testing is definitely a mathematical model."

The inferential testing of group means would of course be possible by
using other means (but non-parametric and distribution free tests are
still mathematical.) I would be very unhappy if this use of
mathematical models was suspect. But perhaps I misunderstood.


Regards,
Wim Maring,
former clinical neuropsychologist and former traffic psychologist
from The Netherlands



Paul Barrett

Mar 25, 2011, 5:32:41 PM3/25/11
to talking-m...@googlegroups.com
Wim (and Denny)

For an exemplar of the kind of work agent-based embedded cognition AI
roboticists get up to .. go to

http://www.robotcub.org/

And, take a look at the paper below ..

Berthouze, L., and Metta, G. 2005. Epigenetic robotics: modelling cognitive
development in robotic systems. In Cognitive Systems Research. Volume 6
Issue 3. September.
http://www.robotcub.org/misc/papers/05_Berthouze_Metta.pdf

You will note from the list of publications at this site that some of the
algorithms embodied in the cognitive system are not "fixed", or even
mathematical (akin to non-computational cellular automata and
non-computational "emergent property" systems such as found in Artificial
Life simulations). See also the Science article " Self-Organization,
Embodiment, and Biologically Inspired Robotics"
(http://www.sciencemag.org/content/318/5853/1088.full )

There are developments, robot systems "out there" in emergent system
agent-based robotics which are trying to get to grips with how our brain
adapts, grows, and learns, and how to build a system which "feels" and has
"personality" (e.g. KISMET ...
http://www.ai.mit.edu/projects/humanoid-robotics-group/kismet/kismet.html )

Just run a quick Google search using the terms "Encoding Emotionality in
Robots".

There are huge philosophical and computational/algorithmic problems with
dealing with emotionality .. but at least some are trying to figure these
out, step-by-step rather than keep hand-wringing and saying "it's
impossible".

What is fascinating about this work, and what evades the "tiny-mind
mentality" of many psychologists, especially "quantitative psychologists" is
that this work is dealing with the very essence of being human:
Consciousness, sentience, adaptation, biological self-organization, and
emotionality. It is no surprise to find theoreticians, philosophers, and
applied scientists from physics, computation, and engineering working
alongside the kind of thoughtful psychologists who are prepared to try and
build systems which seem closer to how human cognitive systems might work.

----------------------------------------------------

And, I can see what may be nagging at Aaro ... for me it's a bit like the
effect Gigerenzer and the ABC group in Berlin had in the world of
quantitative decision making models with the introduction of
evolutionary-advantageous non-computational fast and frugal decision-making.


It's not an "either or" situation, but you get the feeling that mathematics
may not be a hugely realistic way of modeling -some- human processes
(inasmuch as the good 'ol backpropagation system in neural nets is not a
good way to model aspects of human learning, which is why the LEABRA {local,
error-driven and associative, biologically realistic algorithm} algorithm
was invented ...
O’Reilly, R. C. (1996). Biologically plausible error-driven learning using
local activation differences: The generalized recirculation algorithm.
Neural Computation, 8, 895–938...
and
O’Reilly, R. C., & Munakata, Y. (2000). Computational explorations in
cognitive neuroscience: Understanding the mind by simulating the brain.
Cambridge, MA: MIT Press.

----------------------------------------------------

So, although I too have some "issues" with Aaro's conceptualisation of
"psychic explanation", I don't poke fun at him.

Andrew has clearly and firmly highlighted the evidence surrounding
choice-behaviors in certain contexts, which produce outcomes which can be
modeled mathematically with some considerable degree of success.

But, how they actually do this, how it could be implemented in wetware,
simulated in digital-analog hybrid algorithms, or any kind of human
neurophysiological adaptive system remains a mystery. And, at what stage
might a fast and frugal process bypass what looks to be a rational
computational process? Does it/could it ever do so - and how might we test
such a proposition?

And what happens when you inhibit neurogenesis in the hippocampus with
extreme stress, with the knock-on effect on working memory? Do the math
models still hold? Unfair question really but it brings home the notion of
an integrated system at work - and what might happen to change the
behavioral outputs of a system which under other conditions can be modeled
mathematically. Maybe things still do function as expected; but this is the
problem of a mathematical model devoid of explanatory content (i.e. how does
the brain actually implement the "math", or do what it does that enables a
mathematical model to be fit to the observable outcomes in the first
place?). No doubt, already being investigated somewhere ...

----------------------------------------------------

The contrast in the "scientific" vs a "strictly quantitative" approach is
how Denny and his colleagues criticised Daryl Bem's recent piece of
nonsense:

Bem, D.J. (2011) Feeling the future: Experimental evidence for anomalous
retroactive influences on cognition and affect. Journal of Personality and
Social Psychology, 100, 3, 407-425.
Critiqued in:
Wagenmakers, E-J., Wetzels, R., Borsboom, D., van der Maas, H.L.J. (2011)
Why psychologists must change the way they analyze their data: The case of
Psi: Comment on Bem (2011). Journal of Personality and Social Psychology,
100, 3, 426-432.

Frankly, a pointless exercise in "my method is better than your method" from
Wagenmakers et al. They missed the "bleeding obvious" issue entirely - lost
in a world of statistical methodology ...

Let me explain ...

The key issue here (for me) is all about aggregation, and the specific form
of hypotheses that can be tested using sample statistics.

The effect size is a ratio between aggregated variances, or the scaled
difference between two aggregate parameters (the means).

But, what is the form of hypothesis which can be tested?

H1: If the hypothesis is that aggregate statistical effects can be shown to
be larger than zero, then Bem did a good job. But the hypothesis says
nothing about “people show psi ability” – because “people” defined as
constituting many single-unit effect-producing entities were never examined.
All that was examined were aggregates of all the “single-unit”
scores/outcomes.

H2: On the other hand, IF the hypothesis was to be: all humans show evidence
of psi ability, then every individual must show that ability (however tiny)
in order for such a hypothesis to be supported.

A careful reading of Bem’s paper shows that he continually wants to argue
for H2 (as an evolutionary advantageous property of being human, like having
a prefrontal cortex etc., we all have it in varying degrees), but implements
a hypothesis testing procedure which can only address H1.

We know from the pitiful effect sizes that many people did not show psi
(assuming normally distributed data required by the methods he used). Hence,
H2 is already disproven by Bem, although he is so committed to a statistical
view of phenomena that he doesn’t recognize what his own results imply.

Consider the fact that some people in his sample may have shown truly
outstanding evidence of psi, some will have actually produced behaviors less
than chance would have expected, some will be 50/50 ... you average them and
what do you get? Exactly what Bem found, slightly above chance effect sizes.


But what have you got in the real world (not in his statistician’s view of
the world)

1. Some people seem to possess psi.
2. Some people don’t possess any psi; they respond at chance expected
levels.
3. Some people seem to have responded worse than chance.

As a scientist, #1 (and maybe #2) is the really important finding as psi
need not be a property of every human; what’s important is being able to
demonstrate conclusively that some individuals really do show
precognition/psi on many independent testing occasions (i.e. it's
replicable).

Who cares about an effect size of 0.15 for an entire sample when perhaps 5
individuals in that sample show a psi effect that is consistently 90%
accurate above chance-expected levels? Instead of a silly “what exactly is
the point of all this tosh” paper, we’d be reading the work in Science and
Nature – with our collective jaws dropping around the world.
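
To make the aggregation point concrete, here is a minimal sketch with
made-up numbers. It is not Bem's data, and it uses fixed per-person hit
rates rather than simulated trials; the group sizes and rates are
assumptions chosen only for illustration.

```python
# Hypothetical per-person hit rates on a binary guessing task (chance = 0.5).
# These numbers are invented for illustration; they are not Bem's data.
hit_rates = (
    [0.95] * 5      # a handful of people far above chance
    + [0.50] * 85   # most people exactly at chance
    + [0.45] * 10   # some people slightly below chance
)

mean_hit_rate = sum(hit_rates) / len(hit_rates)
above_chance = [h for h in hit_rates if h > 0.5]

print(f"aggregate hit rate: {mean_hit_rate:.4f}")
print(f"people above chance: {len(above_chance)} of {len(hit_rates)}")
```

The aggregate sits only slightly above 0.5 even though five of the
hypothetical individuals are nearly perfect, which is why the sample
mean alone cannot separate H1 from H2.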

It is as though some psychologists no longer understand how to even compose
scientific hypotheses that make sense. Everything is "given over" to Lord
Charles Bowen's "average man on the Clapham Omnibus" view of "the person" ..
as though this fine legal proposition is a foundation for an investigative
science.

----------------------------------------------------

And finally, I/we don't just criticize - some of us really try and do the
new business, bit by bit, - against a backdrop of individuals whose
preservation of the status quo results in drip-fed acidic and pernicious
ridicule. This work is really hard, really time-consuming, and really
awkward.

Regards .. Paul


--

Trendler, Guenter

Mar 26, 2011, 4:47:20 AM3/26/11
to talking-m...@googlegroups.com
Hi Steve,

Steve: "You say above: "the probability of a correct response is 'measureable' and this is the starting point. Can you explain to me what you mean by "the probability (of anything) is measurable"? How about "odds". Would you say that if I obtain the ratio of the frequency of occurrences of an event A to frequency of occurrence of event B, that is a measurement?"


Surprisingly, calling the Rasch model(s) a psychological theory created some discomfort among group members. Obviously I mistakenly assumed that some basic ideas about ‘Rasch models’ are commonly shared. Since this is not the case I have to make clear in what sense the Rasch model is a theory. Hambleton et al. write:

“Item response theory (IRT) rests on two basic postulates: (a) The performance of an examinee on a test item can be predicted (or explained) by a set of factors called traits, latent traits, or abilities; and (b) the relationship between examinees' item performance and the set of traits underlying item performance can be described by a monotonically increasing function called an item characteristic function or item characteristic curve (ICC). This function specifies that as the level of the trait increases, the probability of a correct response to an item increases.” (Fundamentals of Item Response Theory, 1991, p. 8)

Hopefully this does not worsen the situation by adding to the dispute the question of whether the Rasch model is really an IRT model. In the following I will heavily rely on Georg Rasch’s ‘Probabilistic models for some intelligence and attainment tests’ and on Andrich’s ‘Rasch Models for Measurement’.

Josh wrote:

“But such an epistemology seems to me consistent with the way that measurement was actually established in the physical sciences, i.e., hypothesising and confirming physical properties and the types of relations between them across a wide range of physical phenomena and building this into a coherent body of substantive theory that is the foundation (not mathematical as it is often presented in psychology) of the international system of measurement. I would argue that the possibility of psychological measurement can only be established (or not) by such research, and certainly not by fiat (which I have been somewhat guilty of in the past). [/] Is there any existing psychological theory that has already commenced 'such research'? Is there any existing physical theory that is relevant to 'such research', analogously or otherwise? Hopefully, by way of these discussions and others, we can get away from postulation and begin the necessary research.”

In my view the Rasch model is such a theory. It relies on the empirical observation that some examinees have higher probabilities of answering an item correctly than do other examinees. The explanatory hypothesis is that the observed differences are induced by differences in traits in such a way "that examinees with higher values on the trait have higher probabilities of answering the item correctly than do examinees with lower values on the trait" (op. cit., p. 8). The Rasch hypothesis is that the relation is quantitative. In order to start testing the hypothesis we need test persons and test items. The probability of a correct response to an item can be described in terms of odds (for details see Andrich, p. 12). Finally, let’s assume with Andrich that the observations are made by means of the Eysenck Personality Inventory.

How would a physicist (sic!) test the quantitative hypothesis? For example, since the hypothesis is that every person has some level of neuroticism, he will need at least two persons and two items (e.g. 1. “Do you sometimes feel happy, sometimes depressed without any apparent reason?” 2. “Do you have frequent ups and downs in mood, either with or without apparent cause?”). He will invite each test person to repeatedly answer each question with “yes” (=1) or “no” (=0). If the quantitative hypothesis is correct, the experimenter must find that the ratio of different levels of neuroticism is constant across items (for details see Andrich p. 24ff). If he does, he has a strong indication that the factors involved are indeed quantitative and therefore measurable. If he does not, he can either drop the quantitative hypothesis or investigate why the search has failed. For example, consider that “a common assumption of IRT models is that only one ability is measured by a set of items in a test. This assumption cannot be strictly met because several cognitive, personality, and test-taking factors always affect test performance, at least to some extent. These factors might include level of motivation, test anxiety, ability to work quickly, tendency to guess when in doubt about answers, and cognitive skills in addition to the dominant one measured by the set of test items. What is required for the unidimensionality assumption to be met adequately by a set of test data is the presence of a "dominant" component or factor that influences test performance.” (op. cit., p. 9) Hence, maybe the failure to find constants is due to some other dominant factors (e.g. learning and memory) which act as systematic disturbances. The next step therefore is to identify these factors, control them and repeat the experiment and so on. This, then, is how ‘such research’ can commence.
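
The "constant ratio across items" prediction can be sketched numerically. What follows is only an illustration under the usual logistic form of the Rasch model; the person and item values are hypothetical logits invented for the example, and in a real experiment the probabilities would have to be estimated from the repeated yes/no answers rather than computed from known parameters.

```python
import math


def p_yes(beta, delta):
    """Rasch probability of a 'yes' for person level beta and item location delta."""
    return math.exp(beta - delta) / (1 + math.exp(beta - delta))


def odds(beta, delta):
    """Odds of a 'yes' response, p / (1 - p)."""
    p = p_yes(beta, delta)
    return p / (1 - p)


# Hypothetical values on the logit scale, chosen only for illustration.
persons = {"A": 1.2, "B": -0.3}
items = {"item 1": -0.5, "item 2": 0.8}

# Ratio of person A's odds to person B's odds, computed separately per item.
ratios = {
    name: odds(persons["A"], delta) / odds(persons["B"], delta)
    for name, delta in items.items()
}

for name, ratio in ratios.items():
    print(f"{name}: odds ratio A/B = {ratio:.6f}")
```

Under the model the item parameter cancels, so both items give the same ratio, exp(1.2 - (-0.3)). An experiment that finds clearly different ratios for different items would count against the quantitative hypothesis or point to a disturbing factor.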

My point in answer to Josh’s questions: “Is there any existing psychological theory that has already commenced 'such research'?” is; certainly there is enough psychological theory already existing; only the proper research (search for constants) has not yet commenced. What we need is not even more theory but experiments. Experimental results will tell us in which direction we have to go and if more theory is needed. Experimental science consists of interplay between theory and experiment. Theory and experiment must match in complexity or alternatively, theory should not depart too much from experiment in complexity. In physics of course nature is ‘simplified’ in experiment in order to match theory in ‘simplicity’.

Regards,
Guenter


Andrew Kyngdon

Mar 26, 2011, 6:45:37 AM3/26/11
to talking-m...@googlegroups.com
G,

"Hence, maybe the failure to find constants is due to some other dominant factors (e.g. learning and memory) which act as systematic disturbances. The next step therefore is to identify these factors, control them and repeat the experiment and so on. This, then, is how 'such research' can commence."

How does one do this precisely without theory? You say that we don't need more theory, but more experiments. If so, what happens when an experiment fails to establish a trade off between two attributes? In such cases, how can one speculate as to the causes of this failure? How can one identify which attributes are potentially confounding and which are not without a theory of the relevant natural system?

I would argue that theory, however crude and incomplete, precedes experiment. The theory of luminiferous ether preceded the Michelson-Morley experiment. Von Neumann & Morgenstern's (1947) independence condition of utility preceded Allais' (1953) choice experiments which tested it. I agree with you when you say that science is the interplay between theory and experiment, but I cannot see how that logically entails that no more theory is needed in the behavioural sciences. Allais' (1953) common ratio and common consequence effects were resolved only by new *theories* of decision making under risk, not by more choice experiments.

Andrew



Paul Barrett

Mar 26, 2011, 6:56:42 AM3/26/11
to talking-m...@googlegroups.com
Guenter

If there is no item response error, then the Rasch model cannot fit the
data; the measurement is ordinal - no ratios are possible. So, you have to
have measurement error for the model to fit, and for ratios to be
computable.
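The point about error-free responses can be sketched as follows. This is an illustration only, assuming a perfectly deterministic (Guttman-type) response rule in which a person answers correctly exactly when ability is at least the item difficulty; the function name and the difficulty values are hypothetical.

```python
def ability_bounds(responses, difficulties):
    """Bounds on ability implied by error-free (Guttman) responses.

    A correct answer (1) only shows ability >= difficulty;
    an incorrect one (0) only shows ability < difficulty.
    """
    passed = [d for r, d in zip(responses, difficulties) if r == 1]
    failed = [d for r, d in zip(responses, difficulties) if r == 0]
    lower = max(passed) if passed else float("-inf")
    upper = min(failed) if failed else float("inf")
    return lower, upper

# Two hypothetical sets of item difficulties with the same order
# but very different spacing.
equal_spacing = [1.0, 2.0, 3.0, 4.0]
stretched_spacing = [1.0, 1.1, 7.0, 50.0]
pattern = [1, 1, 0, 0]  # passes the two easiest items, fails the two hardest

print(ability_bounds(pattern, equal_spacing))      # (2.0, 3.0)
print(ability_bounds(pattern, stretched_spacing))  # (1.1, 7.0)
```

The same response pattern fits both spacings equally well, so without error the data fix only where the person falls in the order of items, not any distances or ratios.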

What kind of causal theory requires measurement error in order for the
theory to be adjudged "correct"?

Maybe we need a deterministic model rather than a probabilistic one?

Regards .. Paul



Andrew Kyngdon

Mar 27, 2011, 8:01:18 PM3/27/11
to talking-m...@googlegroups.com
Paul,

Interesting post and I will read up on those papers concerning the LEABRA algorithm. Has mainstream connectionism embraced these kinds of error algorithms?

PB:

"...for me it's a bit like the effect Gigerenzer and the ABC group in Berlin had in the world of quantitative decision making models with the introduction of evolutionary-advantageous non-computational fast and frugal decision-making."

You may be interested in the following papers and the debate between the priority heuristic and quantitative theories of risky choice:

Brandstatter, Gigerenzer & Hertwig (2006). The priority heuristic: making choices without tradeoffs. Psychological Review, 113, 409-432.

Birnbaum, M.H. (2008). Evaluation of the priority heuristic as a descriptive model of risky decision making: comment on Brandstatter, et al, (2006). Psychological Review, 115, 253-262.

Brandstatter, Gigerenzer & Hertwig (2008). Risky choice with heuristics: reply to Birnbaum (2008), Johnson, Schulte-Mecklenbeck and Willemsen (2008) and Rieger & Wang (2008). Psychological Review, 115, 281-290.

Birnbaum, M.H. (2008). Postscript: rejoinder to Brandstatter, et al, (2008). Psychological Review, 115, 260-262.

Birnbaum, M.H. (2010). Testing lexicographic semiorders as models of decision making: priority dominance, integration, interaction and transitivity. Journal of Mathematical Psychology, 54, 363-386.

In my view, Michael Birnbaum's series of choice experiments (Birnbaum, 2010) has clearly shown that the priority heuristic theory of risky decision making is a descriptive failure.

PB:

"Andrew has clearly and firmly highlighted the evidence surrounding choice-behaviors in certain contexts, which produce outcomes which can be modeled mathematically with some considerable degree of success."

Thanks Paul, I'm glad someone noticed what I was doing.

I find myself drawn more and more towards decision making under risk as, unlike psychometrics, formal theories are motivated by the attempt to describe human behaviour. This and the history of critical, experimental study means that compelling arguments against quantitative theories of utility are much more difficult to mount and sustain than arguments against quantitative theories of test performance. Indeed, the most plausible non-quantitative alternatives are the lexicographic semiorder class of theories, but these have been shown to be descriptively inferior to the quantitative theories.

Cheers,

Andrew


Dear measurement enthusiasts,

engine....) But our patient has a problem understanding psychology or,
better, the psychic apparatus that is present in every human. This cure
is no mean task.

But I hope that in a few years' time CNN and the rest of the world
media will flock to Estonia to report on Aaro's "psychic ability
machine" that has a personality (or rather personality traits) too. I
hope that this machine will not be only a computer simulation or
computer programme. In that case CNN won't send any camera teams.

It is not from AI that the solution will come, I prophesy. That line
of theorizing about human cognition and abilities was talked to
death more than a decade ago. If you programme everything you want
"cognition" to do, it does not prove very much. For one thing, a
computer programme can't have real feelings and emotions. This fact
alone will preclude a working model that has anything to do with real
psychic processes.

Pure cognition can be implemented in computer chips these days. Even
visual perception, which eluded AI specialists for so long, can be
implemented. Already there are cars without a human driver riding on
the roads in the US. But these artificial drivers will never be
startled by a sudden occurrence on the road. Such a driver won't have a
heart, and therefore its heart will never skip a beat. Of course an emotion
algorithm could be added to the driving programme per se. (Like Data
had in the SF TV series Star Trek. But that doesn't prove anything....)

Stephen Humphry

Mar 28, 2011, 2:35:47 AM3/28/11
to talking-m...@googlegroups.com
Hi Paul.

How do you define the term "item response error" as you use it below? Is it synonymous with your "measurement error"?

It's possible to look analytically at the dichotomous Rasch model in terms of linearly decomposed errors and model "parameters". Through this lens, I can understand your saying you have to have error for the model to fit. I did this with the balance beam as a prototype to see what is implied. I do not think item response error and measurement error are interchangeable, but it really depends exactly what you mean.

I think there is a larger issue, namely that a theory should translate to substantive quantitative relations, not a purely algebraic model. I take quantities like length and mass to be real, and physical theory, definition and law as referring directly to the quantities and their relations.

No doubt, I largely agree with what you're trying to say, but it seems I differ with you regarding the most profitable place to start looking for a way to better understand what's done now and how to attack the problem of measuring posited psychological quantities.

Best,

Steve

Stephen Humphry

Mar 28, 2011, 8:26:46 AM3/28/11
to talking-m...@googlegroups.com
Hi Guenter. I think it's generous, to say the least, to call IRT a psychological theory. Rasch's use of the Poisson was a much more promising start, but I'll leave that aside here as you seem to be referring to IRT as typically and widely used.

One way to put a spotlight on the basic problem is to ask: where are the units? Rasch drew upon the ideal gas law and Newton's second law. Indeed, he cited Maxwell on the latter, regarding the definition of the unit of force as that which acting on the unit of mass produces the unit of acceleration. To take Newton's second, F = m a refers to (a) actual quantities and (b) an actual causal relation that can be isolated from other physical relations. F is a force, and a force is not a number. Similarly, m is a mass, and a is the acceleration of a body. a = F/m is not merely algebra. {L} m per {T} s, per {T} s = {F} N per {m} kg is the same thing stated in terms of the standard units.

If the standard (classical) definition of measurement is accepted, it's just not possible for the dichotomous model to be a definition or law of the kind that form the most direct and tangible basis for the definitions of SI units and for measuring in those units. Instead, Rasch models are merely algebraic expressions. Again, the Poisson is more promising in this respect, and even the dichotomous model as Odds = B/D has a form parallel with those of physical definitions and laws; but the same can't be said of the logistic function(s) so widely used.
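The parallel between Odds = B/D and the logistic form can be checked numerically. The sketch below assumes the usual substitution B = exp(b) and D = exp(d), which is not spelled out above; the numeric values are hypothetical.

```python
import math

def p_logistic(b, d):
    """Success probability in the logistic form of the dichotomous model."""
    return math.exp(b - d) / (1.0 + math.exp(b - d))

def p_ratio(B, D):
    """Success probability implied by Odds = B / D, i.e. P = B / (B + D)."""
    return B / (B + D)

b, d = 0.7, -0.3                 # hypothetical logit-scale values
B, D = math.exp(b), math.exp(d)  # assumed multiplicative counterparts

# Both forms describe the same probability; only the ratio form
# has the shape of a physical definition such as a = F/m.
print(p_logistic(b, d), p_ratio(B, D))
```

The two expressions agree numerically, so the difference at issue is one of interpretation (a ratio of quantities versus a curve fitted on an interval scale), not of arithmetic.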

So Rasch drew explicit parallels with physics that, if we're thinking about successful measurement, ought in my view to be taken seriously in two respects. First, the parallels potentially connect to the basis of successful measurement in a way that is not elsewhere seen in psychometrics. Second, however, the nature of those connections carries clear implications for the way the "models" would have to be interpreted and applied if they were to be used as a successful basis for measurement. The second implication is substantial where it comes to possible psychological theory (as opposed to merely mathematical theory, some of which is used in essentially a post hoc way to justify the stock-standard use of raw scores on tests).

Regards,

Steve


Trendler, Guenter

Mar 28, 2011, 2:41:02 PM3/28/11
to talking-m...@googlegroups.com
Ok, how about this “theory”? For example “suppose that P is performance on some task (say, the time it takes to run a maze), A is motivation and X is amount of prior practice. Of course, it would be a simple matter to order the performances and classify subjects according to motivation (e.g., duration of food or water deprivation) and number of previous practice trials.” (Michell, An Introduction to the Logic etc., 1990, p. 70)

Michell suggests the application of conjoint measurement, but since we can measure time we can determine constants and thus apply derived measurement instead, just as is usually done in physics. Is something still missing?

Regards,
Guenter



Paul Barrett

Mar 28, 2011, 4:50:38 PM3/28/11
to talking-m...@googlegroups.com


Hello Steve


> Hi Paul.
>
> How do you define the term "item response error" as you use it
> below? Is it synonymous with your "measurement error"?

> It's possible to look analytically at the dichotomous Rasch model
> in terms of linearly decomposed errors and model "parameters".
> Through this lens, I can understand your saying you have to have
> error for the model to fit. I did this with the balance beam as a
> prototype to see what is implied. I do not think item response
> error and measurement error are interchangeable, but it really
> depends exactly what you mean.

Sorry Steve, I used the phrase rather clumsily in the context of being able
to assess, with no error, a person's ability. In essence, a Guttman scale.
Michell (2004, p. 126) put it nicely (Michell, J. (2004) Item Response
Models, pathological science, and the shape of error. Theory and Psychology,
14, 1, 121-129.) ...
"Now, if a person's correct response to an item depended solely on ability,
with no random 'error' component involved, one would only learn the ordinal
fact that that person's ability at least matches the difficulty level of the
item. Item response modellers derive all quantitative information (as
distinct from merely ordinal) from the distributional properties of the
random 'error' component. If the model is true, the shape of the 'error'
distribution reflects the quantitative structure of the attribute, but if
the attribute is not quantitative, the supposed shape of 'error' only
projects the image of a fictitious quantitivity. Here, as elsewhere,
psychometricians derive what they want most (measures) from what they know
least (the shape of 'error') by presuming to already know it.

If the random 'error' concept is retained, but it is admitted that the shape
of these 'errors' is unknown, then at best only ordinal relationships
between people (or items) follow from test performances (Grayson, 1988)
unless the cancellation conditions alluded to above (namely double
cancellation, triple cancellation, etc.) obtain."
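For readers unfamiliar with the cancellation conditions Michell mentions, here is a minimal sketch of one instance of double cancellation on a 3x3 table of ordered outcomes (rows as levels of one factor, columns as levels of the other). The matrices are hypothetical, and a real test would have to examine every 3x3 sub-table and deal with sampling error, which this sketch does not attempt.

```python
def double_cancellation_holds(m):
    """Check one instance of double cancellation on a 3x3 table m[row][col].

    If m[1][0] >= m[0][1] and m[2][1] >= m[1][2],
    then m[2][0] >= m[0][2] must also hold.
    When the antecedent is false, the condition is trivially satisfied.
    """
    antecedent = m[1][0] >= m[0][1] and m[2][1] >= m[1][2]
    return (not antecedent) or m[2][0] >= m[0][2]

# Hypothetical additive table: each cell is row effect + column effect.
additive = [[r + c for c in (0, 1, 2)] for r in (0, 2, 4)]

# Hypothetical table that is ordered within every row and column
# but still violates double cancellation.
non_additive = [[1, 2, 6],
                [3, 4, 7],
                [5, 8, 9]]

print(double_cancellation_holds(additive))      # True
print(double_cancellation_holds(non_additive))  # False
```

The second table shows that consistent ordering within rows and columns is not enough: the additional ordinal constraint is what separates additive (quantitative) structure from merely ordinal structure.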

Ben Wright chastised me back in 1999 for creating a test of the Rasch model
which failed (trying to recover an underlying quantitative attribute which
was measured using a variety of different length objects using a "bad"
(non-linear, ordinal-unit) ruler - all that happened was that the Rasch model
recovered the ordinal units as linear ones - big surprise!) - the fault
evidently was that I had not incorporated sufficient random error in my
observations.

That "you need more error", coupled with Robert Wood's article (Wood, R.
(1978) Fitting the Rasch model - a heady tale. British Journal of
Mathematical and Statistical Psychology, 31, 27-32), and finally Michell's
later expositions, convinced me that IRT was just another way of
statistically modeling item responses.

>
> I think there is a larger issue, namely that a theory should
> translate to substantive quantitative relations, not a purely
> algebraic model. I take quantities like length and mass to be
> real, and physical theory, definition and law as referring
> directly to the quantities and their relations.
>
> No doubt, I largely agree with what you're trying to say, but it
> seems I differ with you regarding the most profitable place to
> start looking for a way to better understand what's done now and
> how to attack the problem of measuring posited psychological
> quantities.
>

Yes, I think we probably do agree on most things ... where we differ perhaps
is how we consider what might be referred to as the substantive issue of
whether any "psychological" attribute can be measurable at all, from a
"first principle" perspective.

I simply cannot conceive of any kind of attribute (personality, values,
temperament, motivation, ability etc.) where a standard unit could ever be
maintained by an adaptive self-organizing biological system. And it is that
phrase "self-organizing adaptive system" which I think separates my view of
things from probably many others on this list, and many quantitative
psychologists. I'm not sure any non-physical feature of such a system can be
isolated entirely from other interconnected parts, in such a way that
controlled manipulations of one particular feature can be undertaken in
order to establish additivity of some unit. The sheer magnitude of that
"self-organizing" function we are dealing with is given in this article from
2007 ..
http://www.newscientist.com/article/dn12301-man-with-tiny-brain-shocks-doctors.html
I know, a one-off, a fluke, but as a scientist it speaks volumes to me about
the nature of the system whose outputs I am intending to understand and
"measure". The Lancet paper is available online at
http://download.thelancet.com/pdfs/journals/lancet/PIIS0140673607611271.pdf

But, we know that we can loosely capture variations in outputs from such a
system, and these work pretty well for many practical purposes in some cases
(performance measures of various kinds).

So, I prefer to explore what might be done more creatively with "good
enough/fuzzy" assessments rather than concentrate on trying to increase
"precision" where none may be found in reality.

I can't claim I'm correct in my thinking; it's based more upon my view of
"how humans function" (fed by complex/adaptive systems theory) than an
adherence to an abstract measurement theory.

Steve Blinkhorn recently offered this construal of how people might be
answering personality questionnaire items (on the Psychometrics Forum
listserv on Linked-In) ... it is not offered here as a "rigorous theory"
from Steve, more of an armchair muse really, but I do find it "interesting"
as again, it meshes with a broader view of an integrated adaptive neural
system at work ..

"Meanwhile, more or less totally ignored was the question, what is going on
when a person generates an answer to an item? This is particularly apposite
when considering non-cognitive tests. Do I have a preference for a lonely
cottage in the woods over a busy seaside town before you require me to
express one? How much are you accessing the dimensions of my mind, and how
much forcing me to respond to the dimensions of yours? Why do you not
provide both the get-out options of "neither" and "in between"?

Take an analogy from quantum mechanics, and suppose for a moment that minds
hover in wave-function like states of quantum superposition until a test
item comes along and causes a collapse. So you can be both introvert and
extravert at the same time, but because we feed back information from our
own behaviour to attempt to create consistency, test items don't act just as
indicators of consistency, they induce it. Just an alternative to Paul's
little rulers in the head."

Me? I'm caught in a world where practical concerns/profits require more
accurate predictions of outcomes from any assessments we can devise, yet am
aware that precision of measurement may not be a realizable feature of any
assessment, not because we lack the brains to utilise measurement models,
but because the system under examination can create its own "internally
generated" cause on-the-fly (not just responding passively to external
stimuli), and is complex (in terms of massively interconnected,
self-organizing, adaptive neural networks). I've always been puzzled by
trying to answer the question "if we could measure an attribute
quantitatively, to a substantive degree of precision (say to within 1
decimal place of a unit), what would be required from a human to be able
to sustain that accuracy?"

My feeling is that those who work in edumetrics tend to forget that working
solely with scholastic and performance-based assessments is hugely different
from trying to assess those features of "being human" which are probably the
most fundamental aspects of the science of psychology.

However, maybe this (educational attainment) is where attempts at
"quantitative measurement" may work best - where attainment of a very
specific outcome is at stake, and not assessment of a feature/attribute of
human psychology like "religiosity" or "propensity to commit a violent act"
for example?

Stephen Humphry

Mar 29, 2011, 9:14:42 PM3/29/11
to talking-m...@googlegroups.com
Paul, just a couple of points quickly. Firstly, as you'll see in my rejoinder for the Measurement article, I don't think the shape of error distributions is actually the big deal with the dichotomous Rasch model. The whole way of looking at the models analytically and purely algebraically leads to error distributions. The starting point for the dichotomous model, as it is most often used, is Odds = exp(b-d). What would need to be substantively explained, in my view, is the basis of an exponential relationship. If relative frequencies have this relationship with actual magnitudes of differences, there wouldn't be a need to start by assuming error distributions. Unfortunately, given the predominant way of looking at things, I am not so sure this point will be appreciated. If need be, we can go into it further another time.

On emergent, self-organising systems, I was a big fan of this in general, and Stuart Kauffman's work in particular, in the 90s. I think we gain some real insights. As you may well know, Kauffman later worked with a physicist and tried to apply the thinking to physics. The "primitives" of a system, as John Holland calls them, are very simple and the rules/laws governing their interactions can be of a basic quantitative nature. Indeed, in chemistry with autocatalytic sets, that is precisely the nature of interactions. Now, I think complex adaptive systems in general are useful for providing broad insights. As yet, for example, nobody has created life using the insights. They may, or may not. I think not, but I won't go into the reasons here.

Suffice to say that I do agree we can gain some insights, and they may concern exactly what is measurable. Is temperature an emergent phenomenon? Arguably it is, albeit not a self-organising one considered in isolation.

You say:

"My feeling is that those who work in edumetrics tend to forget that working
solely with scholastic and performance-based assessments is hugely different
from trying to assess those features of "being human" which are probably the
most fundamental aspects of the science of psychology."

Yes, I think that is generally very true. I'm skeptical about measuring attributes, but mostly not for reasons given by Joel or because of issues with distributions of measurement error.

Steve


Trendler, Guenter

Apr 2, 2011, 10:04:25 AM4/2/11
to talking-m...@googlegroups.com
Hi Andrew, sorry for being late with my reply...

A: How does one do this precisely without theory? I would argue that theory, however crude and incomplete, precedes experiment. I agree with you when you say that science is the interplay between theory and experiment, but I cannot see how that logically entails that no more theory is needed in the behavioural sciences.

G. My point is NOT that we need no theory at all, but that we already have enough theory to get started. This does not imply that more theory along the road is unnecessary. It also does not imply that we must be successful in our endeavour. Most physical phenomena and laws were discovered by countless trial and error in the laboratory, and much of this research ended nowhere and has gone unreported. In short, Faraday the experimenter should be as much our model as Maxwell the theorist. Only by forgetting Faraday could Rasch be so optimistic about measurability in psychology.

I’m trying to put myself here in the position of someone who believes that psychological attributes are measurable. Hence, if one does believe this, which I don’t, one should roll up one’s sleeves and withdraw to the lab (or to the dungeon, as Denny put it) in order to return with positive or negative results. Consider, for example, Faraday’s "Experimental Researches in Electricity" (1). Without it no Maxwell would have been possible. Faraday indeed spent all his life in the lab. Science is not always fun and sunshine.

One can only demonstrate that psychological attributes are measurable by actually measuring them. A positive result would put an immediate end to just ‘talking measurement’ and lead over to ‘doing measurement’. Of course, some people believe that psychological attributes are already measured, but they don’t seem to be around here.

G

(1) http://www.archive.org/details/experimentalres01faragoog

