The Stanford Prison Experiment did NOT show that strong situations overpower personality.
The Stanford Prison Experiment (SPE) is one of the most famous, or indeed infamous, studies in the history of psychology. The dramatic and horrifying results of the SPE have been used to draw rather sweeping conclusions about human nature and the psychology of evil. For example, the SPE supposedly illustrates the power of an abusive situation to induce good people to do evil things. In particular, Phil Zimbardo has argued that the study shows that strong situational forces can override individual differences in personality and moral values so that the latter count for very little. Indeed, he has even claimed that virtually anybody at all who was put into a situation where they had power over others, such as guards have over prisoners, would act in a tyrannical and abusive way. Furthermore, the results of the SPE have been applied to prisoner abuse in Abu Ghraib. The influence of the SPE on psychology is all the more remarkable considering the obvious limitations of the study, such as its small sample size and the ad hoc way in which the experiment was conducted. Closer examination shows that the design of the SPE did not provide an adequate test of the role of individual differences in a simulated prison, and that no satisfactory account of the individual differences in behaviour shown by participants has been offered. Therefore, the popularly accepted conclusion that the SPE shows that “situational power triumphs over individual power in certain contexts” (Zimbardo, 2007) is quite unfounded.
Prisoner abuse at Abu Ghraib has been compared to events at Stanford. What are the real lessons?
Dispositions vs. situations - opposed or complementary?
The details of the
SPE are fairly well-known and are explained in detail on Zimbardo’s website. Susan Krauss Whitbourne also
provides a nice accessible summary of the study on her blog.
When the study was first published, the stated rationale was to critique the
“dispositional hypothesis” of why prison life is so deplorable (Haney, Banks, & Zimbardo, 1973). Briefly,
the “dispositional hypothesis” supposedly blames the “nature” of the people who
administer the prison system (e.g. the guards) and the “nature” of the people
who populate it (the prisoners). That is, when guards act in a brutal manner it
is because they are brutal people. Alternatively, prisoners are seen as
naturally aggressive people unable to control their impulses, and therefore
repressive measures are needed to control them. According to Haney et al., this
dispositional hypothesis has been invoked both by those who defend the status
quo (poor conditions in prisons are due to evil prisoners) and by critics of
the system (poor conditions in prisons are due to sadistic guards). Supposedly,
such simplistic explanations draw attention away from the complex social,
economic, and political causes that really underlie this deplorable situation,
and which are too difficult to change without radical social upheaval. A few
years ago, a research paper proposed that self-selection might have influenced
the outcomes of the SPE, because the sort of people who would willingly
volunteer for a study on prison life might have distinctive personality traits
that might predispose them to abusive behaviour (Carnahan
& McFarland, 2007).[1]
Haney and Zimbardo (2009) responded
to this by attacking the influence of what they call “persistent
dispositionalism” in psychology – “explaining
context-driven socially problematic behavior in largely individualistic,
trait-based terms, no matter how much evidence has been amassed to the
contrary”.
The
alternative hypothesis that Haney et al. present is a “situationist” one, which
is the claim that powerful and oppressive social situational forces, such as
occur in a prison, over-ride individual differences in personality and
moral values, and induce ordinary decent people to act in abhorrent ways. Haney
et al. (1973) attacked the idea that prisoner abuse is due to “bad seeds” and
alternatively suggested that the prison system consists instead of “bad soil”
that can corrupt anyone. In a more recent talk,
Zimbardo has explained his belief about prisoner abuse at Abu Ghraib: “I believed our soldiers were good apples that someone had put into a very bad barrel in that prison dungeon.”
Prisoner abuse at Abu Ghraib: good apples in a bad barrel? Did they not choose to behave the way they did?
Personality is revealed rather than suppressed by situations
Note the apparent dichotomy here. A person’s behaviour in a
situation such as a mock prison or even a real one is supposed to be due either
to their internal dispositions or the external features of the situation, but
not both. I think this is a false dichotomy that has led to extreme and unfounded
conclusions. Furthermore, it appears to be a straw man argument. When
Haney et al. (1973) originally discussed the “dispositional hypothesis” they
did not cite any references to show that this is a real hypothesis taken
seriously by any genuine scholars. Perhaps certain naïve laypeople believe in
it, but whether actual social scientists and psychologists do is not clear.
Similarly, when Haney and Zimbardo (2009) attack “persistent dispositionalism”
they seem to invoke a decades-old misconception that personality psychologists
believe that behaviour can be understood primarily as a function of a person’s
traits without serious consideration of the context of the person’s behaviour.
On the contrary, personality psychologists have long maintained that a person’s
behaviour is a function of both the
features of the person and the features of the situation, not just one or the
other. That is, personality psychologists argue that people generally
make choices about how to behave in order to meet their needs within the
constraints and opportunities inherent in particular situations.[2]
Regarding the SPE in particular, the authors who argued that self-selection
could have influenced the SPE’s outcome responded to the criticism of Haney and
Zimbardo that they supposedly preferred “dispositionalist” explanations over
situational ones, by acknowledging that features of
the situation had a powerful influence on the behaviour of the participants (McFarland & Carnahan, 2009). What they
were arguing was that traits might influence a person’s decision to participate
in such a situation in the first place. Furthermore, they also argued that
being in such a situation with people with similar personality traits would
tend to amplify whatever tendencies one already had to be abusive. However,
Zimbardo (2007) has argued for a more
extreme situationist view, claiming that “a large body of evidence in social
psychology supports the concept that situational power triumphs over individual
power in certain contexts” and that bad situations can cause “good” people to
do “evil” things. An alternative view of the power of situations, however, is
that they provide opportunities that can reveal rather than suppress individual
differences (Krueger, 2008). That is, put
two different people with different desires in the same situation, and they will respond in accordance with their personal preferences, within whatever
constraints are imposed by the demands of the situation. Let’s examine the
actual findings of the SPE and see which view of situational power finds more
support.
What really happened at Stanford
The SPE study sample consisted of 21[3]
men who had been selected from a large pool of 75 volunteers based on
psychological assessments to ensure their mental stability and lack of criminal
history. One day prior to the study these 21 were assessed on ten different
personality trait tests and then randomly assigned to the role of guard or
prisoner – 11 to the former, 10 to the latter. On the whole it seems, the
guards were pretty mean, and the prisoners became demoralised by their
situation, and five of the latter had such adverse psychological reactions that
they had to be released early. So far, this sounds like a big win for the
situationist account, right? Participants acted the way they did based on their
situationally defined roles, so the situation had a strong influence on their
behaviour. However, I don’t think anyone is actually denying that situations
influence behaviour. Zimbardo’s claim is that “situational power triumphs over
individual power.” If this were the case, then we would expect
little or no variation in the way participants behaved in their respective
roles as prisoners or guards. Did this really happen though?
“Some guards were tough but fair (“played by the rules”), some went far
beyond their roles to engage in creative cruelty and harassment, while a few
were passive and rarely instigated any coercive control over the prisoners” (p.
81).
Apparently about a third of the guards (so about 3 or 4) were actively
cruel, while those described as “passive” by Haney et al. have been described elsewhere as “good guards from the prisoner’s
point of view since they did them small favors and were friendly”. Furthermore,
although five prisoners broke down under the stress of being abused, the other
five were more resilient.
Clothes make the man?
The role of personality traits - at first acknowledged, then later dismissed
The original report by Haney et al. does acknowledge that personality traits could moderate the effect of social situational variables, allaying or intensifying the latter’s effects. That is, individual differences in participants could influence how they respond to the perceived demands of their assigned role. When discussing the limitations of their study they even go so far as to admit that they could not adequately test whether a dispositional or a situational account provides a better explanation of their results and state that “We cannot say that personality differences do not have an important effect on behavior in situations such as the one reported here.” They acknowledge that a stronger test would involve comparing two conditions where participants were pre-selected for having more extreme personality traits. I suppose one way to do this would be to set up two mock prisons for comparison, one featuring people selected for above-average kindness and compassion, the other one populated only with narcissists and psychopaths. If there were no differences in the behaviour shown in the two conditions (!) this would provide strong evidence that personality traits are not an important influence on behaviour in such a situation. However, they lacked the resources to perform such an experiment, which (hardly surprisingly) has not been done to this day.
In their more recent article though, Haney and Zimbardo (2009) summarily dismissed the role of individual differences, arguing that the precautions and controls they used in their original study were sufficient to lay to rest “any trait-based explanations of our findings” (emphasis added). Specifically, participants were assessed on a number of personality traits and found to score within the normal range for the general population. Additionally, guards and prisoners did not differ on any of these traits. And finally, these personality measures did not predict variations in behaviour within either the prisoner group or the guard group. Supposedly, these precautions should be enough to settle the matter for good.
On its face, the assertion that the results from a single study of 21 people can permanently lay to rest “any” trait-based explanations seems to me a breathtakingly bold dismissal that flies in the face of usual scientific practice. Most scientists would consider a single small study like this just the beginning of enquiry into the matter, not the end of it. Haney and Zimbardo offer no explanation of why individual differences occurred in people who were exposed to the same situation, yet claim that they have enough evidence to dismiss “any” trait-based explanation at all based on their statistical analysis of 21 people. Let’s examine the merits of their “precautions and controls.”
Weak arguments about strong situations
The first argument is that participants did not differ from the general population on their personality traits, and were therefore a fair sample of “normal” individuals. Eight of these measures comprised the Comrey Personality scales. According to a critique by McFarland and Carnahan (2009), none of these traits has ever been linked to abusive and aggressive behaviour. If this is correct, they would have been of no use in assessing whether the participants were “normal” with respect to their propensity to be abusive in a situation where they held power over others. The other two traits measured were authoritarianism and Machiavellianism (the propensity to manipulate others for one’s own gain), which would appear to be theoretically relevant to abusive behaviour. The original report by Haney et al. is actually silent on how their participants compared to the normal population on these measures. For some reason that is not made clear, the researchers used a non-standard scoring method for Machiavellianism that makes comparisons with the general population impossible. Carnahan and McFarland (2007) pointed out that participants actually did score higher on authoritarianism than the general population and their scores were actually comparable to those found in a study of actual prisoners in San Quentin. Haney and Zimbardo argued that the actual difference from the norm was fairly small, so whether it was enough to contribute to the actual behaviour of participants in their study was a moot point. Still, the matter has hardly been “laid to rest.”
The second argument is that participants assigned to the prisoner and guard roles did not differ significantly on their personality traits. Apart from the minuscule sample size involved, which I will address shortly, I am tempted to respond “So, what?” Prisoners and guards were effectively in two different situations with differing opportunities and faced different challenges. For example, some of the guards disturbed the prisoners’ sleep by banging on their cell doors. The prisoners obviously did not have the opportunity to reciprocate this treatment, because the guards went home at the end of their shifts. So the prisoners could not engage in such abusive behaviour even if they had felt inclined to do so, because the opportunity was simply not there. As I have argued earlier, personality theorists propose that individual differences are relevant to how people respond to their circumstances, not that individual differences somehow allow people to transcend these circumstances and behave however they feel like.
Haney and Zimbardo’s third argument is that the behaviour of individuals within their respective roles of prisoner or guard could not be predicted from their personality scores. They do not deny that there were individual differences in behaviour, just that they could not predict them. I think this is their weakest argument of all. Remember that there were 11 guards and 10 prisoners. The guards’ behaviour in particular sorted them into three distinct types – good guards, tough but fair, and mean guards. So this means that in order to perform a statistical analysis we would have to compare three subgroups consisting of 3 to 4 individuals to determine if there were significant differences in their personality traits. Statistically this is laughable. A basic principle of statistics is that significant differences between groups can only be detected if the sample sizes are adequately large, and the sample sizes in the SPE are so small as to be completely inadequate for the purpose. Now, let’s say that I were a researcher who wanted to test the hypothesis that individual differences in personality traits could predict behaviour in an experimental situation such as a mock prison. (Let’s also assume that I knew in advance what personality traits were relevant to the outcomes concerned.) I could actually estimate in advance what sort of sample size I would need in order to have a reasonable chance of finding a significant result, if a real effect existed. Using a procedure known as power analysis, I can calculate that if personality traits had a medium-sized effect on behaviour (i.e. about average compared to most effects in psychology) resulting in three different behavioural subgroups I would need about 50 or so participants per subgroup (so 150 in total) to have an 80% chance of detecting a statistically significant effect if one actually existed.
Even if the effects of personality were actually much larger than average, I would still need about 22 participants per subgroup, so 66 in total. Remember that these numbers refer only to the number of guards. Presumably we would need an equivalent number of prisoners as well. This means that I could anticipate in advance that I would need a sample of between 132 and 300 participants to have a reasonable chance of getting a significant result. If for some reason I then decided to settle for a grand sample of 21 people - which would give me less than a 9% chance of finding a statistically significant result assuming a medium sized effect, and about a 15% chance assuming a large one - I would look rather foolish as such a tiny sample would not allow me to test my hypothesis in anything like a conclusive way. Haney et al. quite obviously did not have anywhere near enough statistical power to predict individual behavior from measured personality traits, so the fact that they could not do so reflects a defect in their methodology rather than some deep truth about the power of situations to overwhelm individual differences.
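For readers who would like to try this kind of power calculation themselves, here is a minimal sketch in Python. It rests on assumptions that are not stated in the article: that the three behavioural subgroups are compared with a one-way ANOVA, that "medium" and "large" effects follow Cohen's conventions (f = 0.25 and f = 0.40), and that the significance level is .05. The function names are made up for this illustration, and the exact percentages it prints may differ somewhat from the figures quoted above, depending on which test and which sample (all 21 participants or only the 11 guards) is assumed.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=500, eps=1e-13):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-30
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction.
        num = m * (b - m) * x / ((a + m2 - 1.0) * (a + m2))
        d = 1.0 + num * d
        c = 1.0 + num / c
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd-numbered term of the continued fraction.
        num = -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0))
        d = 1.0 + num * d
        c = 1.0 + num / c
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (lgamma(a + b) - lgamma(a) - lgamma(b)
                 + a * log(x) + b * log(1.0 - x))
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return exp(log_front) * _beta_cf(a, b, x) / a
    return 1.0 - exp(log_front) * _beta_cf(b, a, 1.0 - x) / b


def f_critical(df1, df2, alpha=0.05):
    """Upper-alpha critical value of the central F distribution (bisection)."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        x = df1 * mid / (df1 * mid + df2)
        if reg_inc_beta(df1 / 2.0, df2 / 2.0, x) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def anova_power(effect_f, groups, n_total, alpha=0.05):
    """Approximate power of a one-way ANOVA F test for Cohen's effect size f."""
    df1, df2 = groups - 1, n_total - groups
    lam = effect_f ** 2 * n_total  # noncentrality parameter
    crit = f_critical(df1, df2, alpha)
    x = df1 * crit / (df1 * crit + df2)
    # Noncentral F CDF written as a Poisson-weighted sum of beta terms.
    cdf, weight = 0.0, exp(-lam / 2.0)
    for j in range(200):
        cdf += weight * reg_inc_beta(df1 / 2.0 + j, df2 / 2.0, x)
        weight *= (lam / 2.0) / (j + 1)
    return 1.0 - cdf


if __name__ == "__main__":
    # Assumed effect sizes (Cohen's f) and a few illustrative total sample sizes.
    for label, f in (("medium", 0.25), ("large", 0.40)):
        for n_total in (11, 21, 66, 159):
            power = anova_power(f, groups=3, n_total=n_total)
            print(f"{label} effect, N = {n_total}: power = {power:.2f}")
```

Whatever the exact assumptions, the pattern is the one described above: with a few dozen participants or fewer, the chance of detecting even a real effect is low, and it only approaches 80% once the total sample reaches the dozens (large effect) or well over a hundred (medium effect).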
Conclusions: the importance of choice
In summary, the purpose of the SPE was supposed to be to demonstrate that
powerful situational forces could over-ride individual dispositions and
choices, leading good people to do bad things simply because of the role they
found themselves in. If this were true, then participants in the study should
have acted in a uniform way depending on their role. However, this was not the
case: participants acted like individuals, showing that they still had the
capacity to make choices within the constraints of their situations.
Furthermore, the study was not even designed to provide a fair assessment of
the influence of personality traits in such a situation because the sample size
was nowhere near large enough to justify any definite conclusions. Far from
demonstrating that individual differences do not matter in how people behave in
a strong situation, the study’s results illustrate that even in undeniably
tough situations people still have the capacity to make choices and that these
choices matter.
Footnotes
[2] To be fair, Zimbardo has stated, for example on his blog, that he believes that behaviour is a function of both individual differences and situational factors. However, many of his published remarks indicate that he sees dispositional and situational factors as competing with each other to explain behaviour. Personality psychologists see this “competition” hypothesis as being based on a false dichotomy. See this blog post by David Funder for example for an explanation of why this dichotomy is not valid.
[3] The sample was originally 24. Two were asked to remain on standby and one withdrew before the study began.
This article also appears on Psychology Today on my blog Unique - Like Everybody Else.
© Scott McGreal. Please do not
reproduce without permission. Brief excerpts may be quoted as long as a link to
the original article is provided. Any version of this
article appearing on sites other than Eye on Psych or my blog at Psychology Today has been ripped off without my consent.
Follow up articles critiquing situationism that discuss the SPE
Challenging the "Banality" of Evil and of Heroism, Part 1 and Part 2. This pair of articles refutes Zimbardo's claim that heroic and evil acts are equally "banal" outcomes of situational factors and that qualities within a person are of no real importance.
Further interesting reading
Don’t
blame Milgram by David Funder – debunks the popular claim that Milgram’s
obedience studies show that the “power of the situation” overwhelms the “power of the person”.
References
Carnahan, T., & McFarland, S. (2007). Revisiting the Stanford Prison Experiment: Could participant self-selection have led to the cruelty? Personality and Social Psychology Bulletin, 33(5), 603-614. PMID: 17440210
Haney, C., Banks, C., & Zimbardo, P. G. (1973). Interpersonal dynamics
in a simulated prison. International
Journal of Criminology and Penology, 1, 69-97.
Haney, C., & Zimbardo, P. G. (2009). Persistent Dispositionalism in
Interactionist Clothing: Fundamental Attribution Error in Explaining Prison
Abuse. Personality and Social Psychology
Bulletin, 35(6), 807-814. doi: 10.1177/0146167208322864
Krueger, J. I. (2008). Lucifer's last laugh. The American Journal of Psychology, 121, 335-341.
McFarland, S., & Carnahan, T. (2009). A Situation's First Powers Are
Attracting Volunteers and Selecting Participants: A Reply to Haney and Zimbardo
(2009). Personality and Social Psychology
Bulletin, 35(6), 815-818. doi: 10.1177/0146167209334781
Zimbardo, P. G. (2007). The
Lucifer Effect: Understanding How Good People Turn Evil (1st ed.). New
York: Random House.