From the site argmin gravitas, characterized as “a simulacrum standing in for Gavin Leech,” a consultant, we have a three-year-old piece that gives many examples of once widely accepted psychological claims that didn’t stand up to (or were severely weakened by) attempts at replication. There are many more than the few I give below, but I’ve chosen a couple that I’ve written about or that readers may be familiar with. On the webpage (click to access), each weakened or refuted claim comes with a link to the original paper or book making the claim, followed by a list of the studies that failed to replicate it.
One such claim is the Dunning-Kruger effect, defined as “. . . a cognitive bias whereby people with low ability, expertise, or experience regarding a certain type of task or area of knowledge tend to overestimate their ability or knowledge. Some researchers also include in their definition the opposite effect for high performers: their tendency to underestimate their skills.”
The fields and claims:
- No good evidence for many forms of priming: automatic behaviour change from ‘related’ (often only metaphorically related) stimuli.
- No good evidence of anything from the Stanford prison ‘experiment’. It was not an experiment: there were ‘demand characteristics’ and scripting of the abuse, constant experimenter intervention, and faked reactions from participants; as Zimbardo concedes, they began with a complete “absence of specific hypotheses”.
- No good evidence from the famous Milgram experiments that 65% of people will inflict pain if ordered to. The experiment was riddled with researcher degrees of freedom and going off-script; there was implausible agreement between very different treatments; and “only half of the people who undertook the experiment fully believed it was real and of those, 66% disobeyed the experimenter.”
- At most weak evidence for the usefulness of implicit bias testing for racism. Implicit bias scores only poorly predict actual bias (r = 0.15). The operationalisations used to measure that predictive power are often unrelated to actual discrimination (e.g. ambiguous brain activations). Its test-retest reliability for race is 0.44, which is usually classed as “unacceptable”. This isn’t news; the original study also found very low test-criterion correlations.
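The low test-retest reliability matters beyond mere noise: by the classical attenuation formula, a measure with reliability 0.44 can correlate with any criterion at most at sqrt(0.44) ≈ 0.66 of the true correlation. Here is a minimal simulation of that cap (my illustration, not from the post; the 0.30 “true” trait-behaviour correlation is an assumed number for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Latent "true" implicit attitude, standardized.
trait = rng.standard_normal(n)

# A test with test-retest reliability ~0.44 means only 44% of score
# variance is true-score variance; the rest is measurement noise.
reliability = 0.44
score = np.sqrt(reliability) * trait + np.sqrt(1 - reliability) * rng.standard_normal(n)

# Hypothetical criterion (discriminatory behaviour) correlated 0.30
# with the latent trait -- an assumed number for illustration only.
true_r = 0.30
behaviour = true_r * trait + np.sqrt(1 - true_r**2) * rng.standard_normal(n)

observed_r = np.corrcoef(score, behaviour)[0, 1]
# Spearman attenuation: observed r is about true_r * sqrt(reliability),
# i.e. 0.30 * 0.66, so roughly 0.20, even though the trait itself matters.
print(round(observed_r, 2))
```

So even a genuinely predictive latent attitude, measured this unreliably, yields observed correlations in the neighborhood of the reported r = 0.15.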
- No good evidence that taking a “power pose” lowers cortisol, raises testosterone, or increases risk tolerance.
The original paper claimed: “That a person can, by assuming two simple 1-min poses, embody power and instantly become more powerful has real-world, actionable implications.”
After the initial backlash, the power-pose literature focussed on a subjective effect, a claim about “increased feelings of power”. Even then, there is only weak evidence, for decreased “feelings of power” from contractive posture only. My reanalysis is here.
- Mixed evidence for the Dunning-Kruger effect. No evidence for the “Mount Stupid” misinterpretation.
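One reason the evidence is “mixed” is that the classic quartile plot can arise largely from regression to the mean. This toy simulation (my sketch, not from the post; the 0.3 correlation between actual and self-assessed skill is an assumed value) reproduces the familiar pattern, with the bottom quartile “overestimating” and the top quartile “underestimating”, without any skill-dependent bias in self-assessment:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Actual skill and self-assessed skill, only weakly correlated (r = 0.3,
# a made-up value). Crucially, the self-assessment error does NOT depend
# on skill level: the unskilled are no more biased than anyone else.
actual = rng.standard_normal(n)
perceived = 0.3 * actual + np.sqrt(1 - 0.3**2) * rng.standard_normal(n)

def to_pct(x):
    # Convert raw scores to percentile ranks on a 0-100 scale.
    return 100 * x.argsort().argsort() / (n - 1)

actual_pct, perceived_pct = to_pct(actual), to_pct(perceived)

# Classic Dunning-Kruger plot: mean self-estimate by actual-skill quartile.
# Because self-estimates regress toward the 50th percentile, the bottom
# quartile sits far above its actual percentile and the top far below.
for q in range(4):
    mask = (actual_pct >= 25 * q) & (actual_pct < 25 * (q + 1))
    print(q + 1, round(actual_pct[mask].mean()), round(perceived_pct[mask].mean()))
```

Any imperfectly correlated pair of percentile measures produces this shape, which is why the canonical graph alone is weak evidence for the psychological claim.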
- In general, be highly suspicious of anything that claims a positive permanent effect on adult IQ. Even in children the absolute maximum is 4–15 points for a powerful single intervention (iodine supplementation during pregnancy in deficient populations).
- “Expertise attained after 10,000 hours of practice” (Gladwell). Disowned by the supposed proponents.
- No good evidence that tailoring teaching to students’ preferred learning styles has any effect on objective measures of attainment. There are dozens of these inventories, and really you’d have to look at each. (I won’t.)
- The effect of “nudges” (clever design of defaults) may be exaggerated in general. One big review found average effects were six times smaller than billed. (Not saying there are no big effects.)
- No good evidence that brains contain one mind per hemisphere. The corpus callosotomy studies which purported to show “two consciousnesses” inhabiting the same brain were badly overinterpreted.
- At most extremely weak evidence that psychiatric hospitals (of the 1970s) could not detect sane patients in the absence of deception.
- No good evidence for precognition, e.g. undergraduates improving memory-test performance by studying after the test. This one is fun because Bem’s statistical methods were “impeccable” in the sense that they were what everyone else was using. He is Patient Zero in the replication crisis, and has done us all a great service. (Heavily reliant on a flat / frequentist prior; evidence of optional stopping; forking-paths analysis.)
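On the optional-stopping point: repeatedly testing as data accumulate and stopping at the first “significant” result inflates the false-positive rate well above the nominal 5%. A generic simulation (my illustration, not a model of Bem’s actual procedure), using a two-sided z-test on standard normal data where the null is true:

```python
import numpy as np

rng = np.random.default_rng(2)
n_sims, max_n, crit_z = 10_000, 100, 1.96  # two-sided 5% z-test

false_pos_fixed = 0   # test once, at n = 100
false_pos_peek = 0    # peek every 10 observations, stop at first "hit"

for _ in range(n_sims):
    data = rng.standard_normal(max_n)  # null is true: mean really is 0
    # Fixed-sample test at the planned n:
    if abs(data.mean() * np.sqrt(max_n)) > crit_z:
        false_pos_fixed += 1
    # Optional stopping: test at n = 10, 20, ..., 100, stop when "significant".
    for n in range(10, max_n + 1, 10):
        if abs(data[:n].mean() * np.sqrt(n)) > crit_z:
            false_pos_peek += 1
            break

print(false_pos_fixed / n_sims)  # close to the nominal 0.05
print(false_pos_peek / n_sims)   # far above 0.05 from peeking alone
```

With ten looks at the data, the realized Type I error rate lands well into the teens, several times the advertised 5%, which is one reason “impeccable by the standards of the time” was not enough.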