YouTube link: The Statistical Crisis in Science and How to Move Forward by Professor Andrew Gelman (57 min; if you want to skip the introductions, the lecture starts at 5:30 and the questions at 46:10).

I’ve been reading Gelman’s blog for some time now, so it was quite exciting to see him give a talk (via a YouTube recording).

Description:

Andrew Gelman, Higgins Professor of Statistics, Professor of Political Science, and Director of the Applied Statistics Center at Columbia University, delivers a University Lecture on the statistical crisis in science and how to move forward. Using examples ranging from elections to birthdays to policy analysis, Professor Gelman discusses ways in which statistical methods have failed, leading to a replication crisis in much of science, as well as directions for improvements through statistical methods that make use of more information.

Inaugurated in 1971 and sponsored by the Offices of the President and Provost, the University Lecture is a semiannual address given by an outstanding member of the Columbia University faculty, celebrating his or her work and academic achievements. The Lecture provides a forum for in-depth exploration on a topic of the speaker’s choice.

Some notes

Some points (not all) as far as I understood them + some references I found:

  • Replication crisis: There seems to be a problem in how statistical research is conducted and reported; this affects the quality of statistical science.

  • The scientific community and the rest of society want simple answers. Unfortunately, sometimes there is simply too much noise in your data to make a reject / do-not-reject decision.

  • Politicians and the public have learned the wrong lesson, because it’s a lesson they want to hear and the statisticians want to sell: you can use statistics to find patterns in noise and use them for success!

  • The low-hanging fruit (from the time when finding simple yes/no results was easy) has been picked.

  • Consider a story. It appears that particular days of the year, such as Valentine’s Day, have an effect on the number of babies born.
    • Lesson. Instead of blindly looking at the plot of births per day (which might be illustrative sometimes, but not always), it’s better to properly analyze the data (with a model) and see what the underlying trends and effects are. (Day of year vs. day of week vs. seasonal effect…)
    • Another lesson. When they actually did exactly that (in particular, Aki Vehtari did) and analyzed Valentine’s Day with proper methods, they looked at the results of the first version of their model and noticed (using prior information) that the model must have been wrong, and they were able to fix it.
    • More detailed exposition given as Example 21.2 in the BDA3 book. (It’s an additive GP model!)
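  • Consider the statistical significance filter: in a noisy study, an estimate has to be large to reach statistical significance, so the significant estimates that get reported tend to overestimate the effect. Article. It’s made worse by a feedback loop: because published results report effects that are too large, people expect to see large effects and think the studies they design have high power (when in reality their studies are underpowered).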

  • Big data. Classical methods were designed for a data-poor age and for random sampling. Big data is usually not a random sample, so it requires different thinking.
    • A fun example of this uses polling data from Xbox users. Link to paper.
  • See also this recent post on piranhas.
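
To make the significance filter point above concrete for myself, here is a minimal simulation sketch in Python. It is my own illustration, not something shown in the lecture. The true effect of 2 and the standard error of 8 are hypothetical numbers chosen to mimic a badly underpowered study, and the function name `significance_filter` is made up for this sketch. It draws many replications of the same study, keeps only the estimates that pass the usual two-sided 5% threshold, and compares them with the true effect.

```python
import random
import statistics


def significance_filter(true_effect, std_error, n_studies=100_000, seed=1):
    """Simulate many replications of one study and keep only the
    'statistically significant' estimates (|estimate| > 1.96 * std_error).

    Returns (power, exaggeration, wrong_sign), where
      power        = share of replications that reach significance,
      exaggeration = mean |estimate| among significant results / |true effect|,
      wrong_sign   = share of significant results with the wrong sign.
    """
    rng = random.Random(seed)
    threshold = 1.96 * std_error  # two-sided 5% cutoff for a normal estimate

    significant = []
    for _ in range(n_studies):
        # Each replication yields a noisy, unbiased estimate of the true effect.
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > threshold:
            significant.append(estimate)

    power = len(significant) / n_studies
    mean_abs = statistics.mean([abs(e) for e in significant])
    exaggeration = mean_abs / abs(true_effect)
    wrong_sign = sum(1 for e in significant if e * true_effect < 0) / len(significant)
    return power, exaggeration, wrong_sign


if __name__ == "__main__":
    # Hypothetical numbers: a small true effect measured with a lot of noise.
    power, exaggeration, wrong_sign = significance_filter(true_effect=2.0, std_error=8.0)
    print(f"power: {power:.3f}")
    print(f"exaggeration ratio among significant results: {exaggeration:.1f}")
    print(f"share of significant results with the wrong sign: {wrong_sign:.2f}")
```

With these assumed numbers, every significant estimate is by construction larger in magnitude than 1.96 × 8 = 15.68, so the exaggeration ratio has to exceed 7.84 even though each individual estimate is unbiased. Calling the function with a much smaller standard error (say 0.5) brings the ratio close to 1, which is why the filter matters mainly for noisy, underpowered studies.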