Experiments


Itzhak Gilboa

Aug 2, 2009, 11:39:05 AM
to decision_t...@googlegroups.com
Re Marco's comment: I don't want to minimize the importance of axioms in relating theoretical concepts to observables, making sure that theories are refutable but hopefully not refuted, etc.  But I think that some caution needs to be taken with experiments.  Unfortunately, the results of experiments are not free of rhetoric either.  For example, experiments naturally focus on certain domains of problems (say, small probabilities) and may give us a biased view of which theory "works".  Then there are problems of framing, as Gigerenzer has shown regarding some of Kahneman-Tversky's results.  And there are a host of other problems of specificity if we want to take experimental results seriously.
 
Here's a recent example: we know from KT's results that small probabilities seem to have overstated "decision weights".  In a recent talk, Ido Erev argued (if I got it right) that this phenomenon is mostly observed when probabilities are stated explicitly, but that you can get the opposite phenomenon if you consider naturally occurring events.  That is, events that are very rare often are completely ignored.  This suggests that we should have different theories for situations in which probabilities are stated as compared to situations in which one only has a memory, from which empirical frequencies can be computed.
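For stated probabilities, the overweighting that KT documented can be made concrete. A minimal sketch using Tversky and Kahneman's (1992) probability weighting function with their median parameter estimate (γ ≈ 0.61 -- standard published values, not figures from this thread):

```python
# Cumulative prospect theory's probability weighting function
# (Tversky & Kahneman 1992), with their median estimate gamma = 0.61.
# Small stated probabilities are overweighted; large ones underweighted.
def w(p, gamma=0.61):
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

for p in (0.01, 0.10, 0.50, 0.90):
    print(f"p = {p:.2f}  ->  w(p) = {w(p):.3f}")
```

With these parameters a stated probability of 0.01 receives a decision weight of roughly 0.055, about five times its face value -- exactly the pattern Erev argues reverses when the same rare events are experienced rather than described.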
 
I have recently heard also that it is not so clear whether there is risk loving behavior in the domain of losses.  Something apparently happens to the utility function near the reference point, but it's not clear it gets to be convex.
 
These doubts do not mean that we should ignore experiments.  But we should be careful when we interpret these results, especially since many of us, by personality and by training, have a preference for general theories.  And even more so given that these discussions are often supposed to help us reason about more remote problems, such as how to run the economy.
 
In fact, I sometimes wonder whether our field might not split into two different domains: one that refines very specific predictions for the purposes of particular applications (such as selling insurance policies) and another that focuses on general insights.  I'm not suggesting that such a split would be welcome, but I sometimes feel that there is a gap between the two types of motivation.
 
Tzachi

Drew Fudenberg

Aug 3, 2009, 12:40:08 PM
to decision_t...@googlegroups.com
 
> I have recently heard also that it is not so clear whether there is risk loving behavior in the domain of losses.  Something apparently happens to the utility function near the reference point, but it's not clear it gets to be convex.
 
Can you (or anyone else) remember any of the evidence against this convexity? A lot of people seem to treat it as an established fact and want David and me to change our dual-self model to produce it.
Drew

Itzhak Gilboa

Aug 4, 2009, 4:13:08 AM
to decision_theory_forum
Drew: Two people who know the utility function rather intimately are
Mohammed Abdellaoui and Peter Wakker. They may be out of e-touch
right now.

Another doubt that I recently had has to do with "ignoring base
rates", i.e. confusing P(A|B) with P(B|A). In class this
works like magic -- undergrad classes can have a majority failing on
Kahneman-Tversky's examples. (For instance: informed that the
probability of testing positive given the disease is high, people tend
to think that the probability of the disease given testing positive is
also high, even if the disease is a priori very rare.)
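The testing example reduces to one application of Bayes' rule. The numbers below (1% prevalence, 95% sensitivity, 5% false-positive rate) are illustrative assumptions, not figures from the thread:

```python
# Bayes' rule for the rare-disease example: even with an accurate test,
# P(disease | positive) is small when the base rate is small.
prevalence = 0.01          # P(disease) -- assumed, for illustration
sensitivity = 0.95         # P(positive | disease)
false_positive = 0.05      # P(positive | no disease)

p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)
posterior = sensitivity * prevalence / p_positive

print(round(posterior, 3))  # ~0.161: most positives are false alarms
```

A 95%-accurate test on a 1%-prevalence disease leaves the posterior around 16%, which is exactly the answer most undergraduates fail to anticipate.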

However, I recently came to doubt that this happens to practitioners
on tasks that they are familiar with. It is possible that the very
fact that practitioners are asked about probabilities results in
different estimates than what their behavior might reflect. I could
find some studies involving differently-phrased questionnaires
(see http://www.psychologicalscience.org/observer/getArticle.cfm?id=2253)
but is there anything that compares replies to actual decisions in
this context?

Tzachi

Julian Jamison

Aug 7, 2009, 11:08:11 AM
to decision_t...@googlegroups.com
Hi Tzachi,

This is somewhat old, and only 9 subjects, so I wouldn't trust it very far at all, but it does relate the base rate fallacy to actual decisions:
http://www.ncbi.nlm.nih.gov/pubmed/6457103
 

[I found most of the text in Google Books via Google Scholar]

I don't think they asked them to solve formal problems as well, however -- maybe what you wanted was the link between the two.

cheers,
julian

Attila Ambrus

Aug 15, 2009, 5:14:16 PM
to decision_theory_forum
Hi Tzachi,
I completely agree with your reservations concerning lab experiments.
I recently saw a round table discussion at the Stony Brook Game Theory
conference, where Robert Aumann and Sergiu Hart expressed similar
concerns. Also, I remember reading a good methodological paper by John
List on the topic, bringing up related issues.
You might ask at this point why I'm running lab experiments myself...
For practical reasons: they are much cheaper and easier to run than
clean field experiments. I think results from lab experiments
shouldn't be regarded as final evidence, for example because subjects
are asked to make choices they are not used to, in an artificial
setting, where it's not clear what their motivations are. However, if
a regularity comes through strongly from lab experiments, I would
think that's a good indicator that it might be worth checking it
outside the lab setting, for decisions that economists are typically
interested in. At the end of the day it might still be true that in
the latter settings the result from the lab experiment disappears
(there are several examples in the literature) - but I think it
becomes worth checking.
One more comment, given that we are all theorists commenting here: my
impression is that most non-experimental empirical economists regard
lab experiments the same way I do: they basically don't really
trust the conclusions until they are confirmed by field data (and
they laugh at the small sample sizes of lab experiments, and have
other methodological concerns about the econometric analysis).
Attila

marco

Aug 16, 2009, 1:46:44 PM
to decision_theory_forum

Surely 'there are problems' with experiments (as there are problems
with theory) but I personally disagree with the criticisms - and
attitude! - I hear from many theorists.
I am struck by the fact that experimentalists are by and large very
aware of the problems in their field. They seem to me to debate them
in an often more mature, more fruitful, and less complacent manner
than we theorists debate (if ever) our own (I recommend reading their
forums, e.g. http://groups.google.com/group/esa-discuss).
A couple of observations on specific problems mentioned here.
Yes, 'there are problems' with small samples. But there is also a
sound statistical theory of small samples (by the way, the example in
Fisher's original paper on the exact test bearing his name -- the lady
tasting tea with milk -- has 8 observations...) which is good enough for
medicine and other disciplines. The explosion in computational power
now allows one to run exact, nonparametric tests unthinkable just a
few years ago. So I find the small sample criticism weak. And anybody
who is sufficiently intimate with empirical economists working with
large datasets will learn, after a drink or two, of some skeletons in
the cupboard in that area, too :).
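Marco's tea-tasting reference fits in a few lines of arithmetic. A sketch of Fisher's exact test for the classic design (8 cups, 4 milk-first, the taster picks the 4 she thinks are milk-first), computing the one-sided p-value from the hypergeometric distribution:

```python
from math import comb

# Under the null (pure guessing), the number of correctly identified
# milk-first cups is hypergeometric. One-sided p-value = P(correct >= k).
def p_value(correct):
    total = comb(8, 4)                      # 70 equally likely selections
    return sum(comb(4, k) * comb(4, 4 - k)  # k correct, 4-k incorrect
               for k in range(correct, 5)) / total

print(p_value(4))  # all four correct: 1/70, about 0.014
print(p_value(3))  # three or more correct: 17/70, about 0.243
```

Eight observations suffice to reject guessing at the 5% level -- but only if she gets all four right, which is part of why the example is famous.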
Yes, some later experiments reverse or correct the findings of earlier
ones. But so what? Isn't this scientific progress, learning either
about faulty methodology, or about a more nuanced view of human
behaviour (this is how I read the Erev findings)?
I don't see the data coming from experiments as cheap surrogates for
'real' data. They are a fully legitimate part of a large spectrum of
empirical evidence (as Daron observed, including economic, historical,
literary, introspective etc.) accessible to us as theorists and social
scientists. They have comparative disadvantages and advantages. They
are neither superior nor inferior to other types of data. The main
advantage is the control they afford the observer, reducing noise and
the need for econometric hammering. The main disadvantage is that the
conditions in the lab are 'artificial'. Hence the recent trend towards
field experiments. Yet, even studying decisions made in an artificial
context may shed light on the procedures that human decision makers
use in natural settings.

Attila Ambrus

Aug 16, 2009, 2:27:08 PM
to decision_t...@googlegroups.com, marco, decision_theory_forum
Hi Marco,
Hope you didn't mean my attitude - again, I'm running experiments myself.
I think it's great if someone who proposes a theory provides experimental
evidence supporting it, even indirectly. It's definitely at least as convincing
as just giving a motivating story that "rings true" to a lot of people (which is
I guess what most theory papers do). But I still think that it does not
constitute a "proof" of the theory, just supporting evidence.
Similarly, I don't think that it "disproves" a theory, like backward induction,
if in lab settings it is shown that people clearly violate it. It is still an
interesting observation, and it gives us a hint, but does not directly tell us
how people behave in the real world, doing tasks that they are familiar with.
As for economists' attitudes towards experimental economics, it's a
multi-layered question. One component of it is that many experimental
economists do not have as much classical training in economics as people in
other fields (of course there are many counterexamples, that is experimental
economists who are also fully trained and very good economists). I think
theorists are especially sensitive to this issue, as the experiments address
questions that game theorists and decision theorists have been studying for
decades. As you said, experimental economists address these issues in a
"simpler way", and at the same time often criticizing the mainstream theories.
I guess some theorists mind that some of the criticism comes from people who
didn't fully master the theories that they criticize. Another component, which
Robert Aumann mentioned in Stony Brook, is that experimental economics as a
tradition grew out of psychology, and partially incorporated objectives from
that discipline that are traditionally much less present in economics:
psychologists are often interested in pathologies and mistakes, gearing their
experiments towards finding types of decisions in which people make mistakes or
surprising choices (like finding decisions in which group discussion leads to
worse choices than the average of individual choices - that is, people
convincing each other about the wrong conclusion).
Best,
Attila

John Kagel

Aug 16, 2009, 4:22:57 PM
to decision_t...@googlegroups.com, marco, decision_theory_forum
A note in response to the latest comment cited below:

I think this is called stress testing a theory. So when you find it
doesn't work well in these cases, you are motivated to look for
something better. But of course you do need a better model to beat
an old one that does not work in some (special?) circumstances. And
even then you might not abandon the old model (e.g., EU theory), as
it's so easy to employ, versatile, and usually correct.

At 02:27 PM 8/16/2009, Attila Ambrus wrote:
>Robert Aumann mentioned in Stony Brook, is that experimental economics as a
>tradition grew out of psychology, and partially incorporated objectives from
>that discipline that are traditionally much less present in economics:
>psychologists are often interested in pathologies and mistakes, gearing their
>experiments towards finding types of decisions in which people make
>mistakes or
>surprising choices (like finding decisions in which group discussion leads to
>worse choices than the average of individual choices - that is, people
>convincing each other about the wrong conclusion).

John H. Kagel
Department of Economics.
Ohio state University
410 Arps Hall
1945 North High Street
Columbus, OH 43210-1172
kag...@osu.edu
614-292-4812 (phone)
614-292-4192 (fax)
http://www.econ.ohio-state.edu/kagel/

Attila Ambrus

Aug 17, 2009, 11:01:18 PM
to decision_t...@googlegroups.com, John Kagel, decision_t...@googlegroups.com, marco
I might be wrong, but I think there is absolutely no theory in economics that
could withstand "stress testing" in every situation. So the question is what
we learn, for example, from the following fact: although in most tasks groups
outperform individuals (partly because people promoting the correct answer
usually win over people promoting the wrong answer, but also because there are
complementarities in people's information), if you show two line segments
projected on the wall that are of equal length, more individuals get it
right (i.e., tell you that the two lines are of equal length) than groups of
three (after the group members have discussed the answer). It's a strange
task, but maybe we learn something about the confidence, or fake confidence,
of certain people(?)
Attila

Itzhak Gilboa

Aug 18, 2009, 8:22:39 AM
to decision_theory_forum
Hi!

I think that we need to be a bit more explicit about what we mean by
"experiments confirming/refuting theories". This has to do with the
distinction between theories as making specific predictions, and
"theories" that are paradigms, or general metaphors. And the same
distinction can apply to the experiments themselves. There are
therefore two distinct points here:

1. Whether a theory "works" or not depends on the application we have
in mind, as (I think) we all agree. And among the applications there
are specific ones, but there are also general, qualitative
conclusions. Let's consider the example of EUT for decision under
risk with the challenge posed by Allais's paradox and the alternative
of PT (prospect theory), or its cumulative version. If we only think
about specific applications, we may choose to stick to EUT, or to
refine it, depending on the application. For instance, if we think of
insurance problems, where probabilities tend to be close to zero, EUT
may perform badly. And if we think of labor search problems, where
probabilities are not necessarily close to 0 or 1, EUT may be a
reasonable approximation.
But another distinction would be, for example, between the application
to insurance problems, and the development of the argument that
asymmetric information leads to inefficiencies in markets. Here again
one may choose to use EUT in the latter application but not in the
former. But the sense in which EUT may be a good approximation for
the qualitative argument is not a matter of fitting data, but of
following an argument. If, once the argument is understood, we go
over it once more with PT rather than with EUT, and we find that the
main point does not depend on the specifics of EUT, we can conclude
that EUT is a "good enough approximation" for this application. Yet,
it's a different type of "approximation" and its quantification is
trickier.

2. Along similar lines, we should also distinguish between
experiments that show a particular quantitative inaccuracy of a
theory, and experiments that are themselves metaphors that point to
more fundamental difficulties with the theory. Allais's paradox is of
the first type. It shows a failure of EUT with probabilities near 0
or 1, and then one can choose whether to refine the theory for a
particular application, or to assume it's a good enough
approximation. By contrast, Ellsberg's paradox is of the second type,
I think. It’s not about a failure of EUT where certain urns are
concerned; it’s a metaphor for the general problem of having
insufficient information to generate prior probabilities. In fact, I
think that Ellsberg’s two experiments, in their beautiful simplicity,
becloud his main point. In these crisp experiments it's easy to
assign an "uninformative" or uniform prior to Ellsberg's urn(s), and
brush aside uncertainty, at least for normative purposes. But one
cannot deal with real life uncertainty as one can deal with the urn
uncertainty. Many view Ellsberg’s experiments as saying something
about the inadequacy of the theory when it comes to stock market
behavior and political events, the success of a start-up or a marriage
-- in short, about many things beyond the urns.
Thus, while Allais's paradox deals with refinements of the theory,
Ellsberg's paradox is itself a metaphor. Using it in this capacity,
it’s perhaps more important to know how we feel about it as a thought
experiment than in actual labs. After all, if it’s our intuition
that’s the target of the exercise, a thought experiment is as close to
reality as we can get.
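Tzachi's description of Allais's paradox as a quantitative failure can be checked mechanically. The sketch below uses the standard textbook payoffs (in $M), which the thread does not spell out: 1A gives 1 for sure; 1B gives 5 w.p. 0.10, 1 w.p. 0.89, 0 w.p. 0.01; 2A gives 1 w.p. 0.11, 0 w.p. 0.89; 2B gives 5 w.p. 0.10, 0 w.p. 0.90. The commonly observed pattern is 1A over 1B together with 2B over 2A, and no expected-utility maximizer can produce it:

```python
import random

# Under EU, preferring 1A over 1B forces preferring 2A over 2B,
# whatever the utility function: both comparisons reduce to the sign
# of 0.11*u(1) - 0.10*u(5) - 0.01*u(0).
def eu_prefs(u0, u1, u5):
    first = u1 > 0.10 * u5 + 0.89 * u1 + 0.01 * u0           # 1A over 1B
    second = 0.11 * u1 + 0.89 * u0 > 0.10 * u5 + 0.90 * u0   # 2A over 2B
    return first, second

random.seed(0)
for _ in range(10_000):
    u0, u1, u5 = sorted(random.random() for _ in range(3))
    a, b = eu_prefs(u0, u1, u5)
    assert a == b  # no EU maximizer can choose 1A and 2B
```

This is what makes Allais a refinement problem rather than a metaphor: the violation is confined to a specific algebraic condition near extreme probabilities, and a theory like PT can be patched to accommodate it.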

The bottom line, I think, is the platitude that some experiments will
be more relevant to some applications, and less to others. However,
among these experiments and applications we should not minimize the
role of the qualitative insights and reasoning that often affect the
way we think about social problems.

Tzachi

marco

Aug 18, 2009, 3:37:16 PM
to decision_theory_forum
I'd like to highlight an interesting article from 'really empirical
science' which is relevant to our discussions about both experiments
and rhetoric, and about the general nature of 'facts':

'How citation distortions create unfounded authority:
analysis of a citation network' , by Steven A Greenberg, associate
professor of neurology
British Medical Journal 2009;339:b2680.

http://www.bmj.com/cgi/reprint/339/jul20_3/b2680

Snippets:

'Unfounded authority was established
by citation bias against papers that refuted or weakened
the belief; amplification, the marked expansion of the
belief system by papers presenting no data addressing it;
and forms of invention such as the conversion of
hypothesis into fact through citation alone'

[why do the words 'hyperbolic' and 'discounting' come to my mind?]
...

'an inherent property of negative results,
which failed to spread through the network. These
were not repeatedly cited by their authors in subsequent
papers (only one instance was present) as perhaps
there was simply nothing further to say about
them. Unlike “positive results” there is nothing exciting
to be repeatedly written about how something was
not found in an experiment.'

marco