Reproducibility in Psychology
Context
This summary concerns the accuracy and reliability of reproducibility metrics in psychology: the field has been grappling with a well-documented replication crisis for over a decade. Here is a curated set of key findings and references that highlight the scope and causes of the problem:
📉 1. Reproducibility Project: Psychology (Open Science Collaboration, 2015)
Only 36% of 100 replication attempts of studies from top psychology journals produced statistically significant results, compared to 97% of the original studies.
The average effect size in replications was about half that of the originals.
Conclusion: Even high-quality studies often fail to replicate, suggesting systemic issues in research design, analysis, and publication incentives.
📖 Read the full study in Science
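To see why significance-filtered originals run hot while replications shrink, here is a minimal simulation sketch in Python. This is not code or data from the project itself; the true effect size, sample size, and publication filter are all illustrative assumptions.

```python
# Sketch: why replication effect sizes shrink when originals are filtered
# for significance. Assumes a small true effect (d = 0.2) and n = 30 per
# group; all numbers are illustrative, not from the 2015 paper.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_d, n, sims = 0.2, 30, 20_000

def cohens_d(a, b):
    pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled

published, replications = [], []
for _ in range(sims):
    a = rng.normal(true_d, 1, n)   # treatment group
    b = rng.normal(0.0, 1, n)      # control group
    t, p = stats.ttest_ind(a, b)
    if p < 0.05 and t > 0:         # only significant results get "published"
        published.append(cohens_d(a, b))
        # an unfiltered replication of the same true effect
        ra, rb = rng.normal(true_d, 1, n), rng.normal(0.0, 1, n)
        replications.append(cohens_d(ra, rb))

print(f"mean published d:   {np.mean(published):.2f}")    # inflated, ~0.6
print(f"mean replication d: {np.mean(replications):.2f}") # near the true 0.2
```

Under these assumptions, the published effects are inflated by the significance filter while the unfiltered replications cluster around the true effect, mirroring the roughly halved effect sizes the project reported.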
🧠 2. Statistical Assumptions and Logical Fallacies
A 2022 study in Frontiers in Psychology found that many psychological studies violate key statistical assumptions (e.g., normality, homoscedasticity).
It also critiques the misuse of p-values, noting logical fallacies like the transposed conditional and affirming the consequent.
These flaws undermine the reliability of statistical inferences and inflate false positives.
📖 Frontiers in Psychology – Statistical Assumptions and Reproducibility
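As a concrete illustration of the kinds of assumption checks at issue, here is a short Python sketch using SciPy's standard diagnostics. The simulated data and the specific choice of tests are assumptions for illustration, not the paper's own analysis.

```python
# Sketch: checking the assumptions behind an independent-samples t-test
# before trusting its p-value. Group data is simulated for illustration;
# group B is deliberately non-normal.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 1.0, 40)
group_b = rng.exponential(1.0, 40)

# Normality: Shapiro-Wilk test, run per group
for name, g in [("A", group_a), ("B", group_b)]:
    w, p = stats.shapiro(g)
    print(f"group {name}: Shapiro-Wilk p = {p:.3f}")

# Homoscedasticity: Levene's test across groups
_, p_var = stats.levene(group_a, group_b)
print(f"Levene p = {p_var:.3f}")

# If assumptions fail, prefer Welch's t-test or a rank-based alternative
t_welch = stats.ttest_ind(group_a, group_b, equal_var=False)
u_test = stats.mannwhitneyu(group_a, group_b)
print(f"Welch p = {t_welch.pvalue:.3f}, Mann-Whitney p = {u_test.pvalue:.3f}")
```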
🧪 3. Meta-Analyses and Replication Rates
The Many Labs projects found that only 50–54% of classic psychology effects replicated with p < .05.
Effect sizes were consistently smaller in replications than in original studies.
Subfields like social psychology had particularly low replication rates.
📖 Reflections on the Reproducibility Project – Springer
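Low statistical power by itself can produce replication rates in this range even when the underlying effect is real. Here is a minimal Python sketch; the effect size and sample size are illustrative assumptions chosen to give roughly 50% power.

```python
# Sketch: low power alone caps replication rates. With a true effect of
# d = 0.4 and n = 50 per group (roughly 50% power), about half of faithful
# replications "fail" at p < .05. Illustrative numbers only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
true_d, n, sims = 0.4, 50, 10_000
hits = sum(
    stats.ttest_ind(rng.normal(true_d, 1, n), rng.normal(0, 1, n)).pvalue < 0.05
    for _ in range(sims)
)
print(f"replication rate: {hits / sims:.0%}")  # close to 50%, not 95%
```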
🧰 4. Measurement Reliability and Validity
Many psychological instruments lack test-retest reliability, inter-rater reliability, or internal consistency.
Even when reliable, they may not be valid—i.e., they don’t measure what they claim to measure.
📖 Reliability in Psychology Research – SimplyPsychology
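For readers who want to compute these quantities, here is a Python sketch of two standard checks: Cronbach's alpha (internal consistency) from its textbook formula, and test-retest reliability as a Pearson correlation. The simulated scale data is purely illustrative.

```python
# Sketch: two common reliability checks on simulated questionnaire data.
import numpy as np

rng = np.random.default_rng(3)

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_respondents, n_items) matrix of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Simulate a 5-item scale driven by one latent trait plus item noise
trait = rng.normal(0, 1, (200, 1))
items = trait + rng.normal(0, 0.8, (200, 5))
print(f"Cronbach's alpha: {cronbach_alpha(items):.2f}")

# Test-retest: the same trait measured twice, with occasion-specific noise
time1 = trait[:, 0] + rng.normal(0, 0.5, 200)
time2 = trait[:, 0] + rng.normal(0, 0.5, 200)
r = np.corrcoef(time1, time2)[0, 1]
print(f"test-retest r: {r:.2f}")
```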
🧩 5. Structural Incentives and Publication Bias
Researchers are incentivized to publish novel, significant results, not replications.
Practices like p-hacking, HARKing (hypothesizing after results are known), and selective reporting contribute to irreproducibility.
📖 Quantity Over Quality? – UC Press
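A small simulation makes the cost of p-hacking concrete. The sketch below assumes one common variant, testing several outcome measures and reporting whichever comes out significant; the sample size and number of outcomes are illustrative.

```python
# Sketch: how p-hacking inflates false positives. With no true effect at
# all, testing 5 outcome measures and reporting the "best" one pushes the
# nominal 5% error rate far higher. Purely illustrative simulation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, sims, n_outcomes = 30, 10_000, 5
false_pos = 0
for _ in range(sims):
    # the null hypothesis is true for every outcome measure
    ps = [stats.ttest_ind(rng.normal(0, 1, n), rng.normal(0, 1, n)).pvalue
          for _ in range(n_outcomes)]
    if min(ps) < 0.05:  # report only the most favorable outcome
        false_pos += 1
print(f"false-positive rate: {false_pos / sims:.0%}")  # ~23%, not 5%
```

With five independent tests, the chance of at least one p < .05 under the null is 1 − 0.95⁵ ≈ 23%, which is why selective reporting alone can flood the literature with unreplicable findings.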
🧠 Summary Table

| Issue | Impact on Reproducibility |
|---|---|
| Low replication rates | Only ~36–50% of studies replicate |
| Inflated effect sizes | Replications show smaller effects |
| Statistical misuse | Misinterpretation of p-values, flawed assumptions |
| Measurement problems | Poor reliability and questionable validity |
| Incentive structures | Favor novelty over verification |