Overview
Richard McElreath’s “Statistical Rethinking” is hands-down the best statistics book I’ve read. It teaches Bayesian statistics from the ground up with a focus on understanding rather than rote application.
Why This Book Stands Out
1. Causal Thinking First
Unlike traditional statistics books, McElreath emphasizes causal reasoning before jumping into models. This reframes statistics as a tool for understanding causation, not just correlation.
2. Bayesian from the Start
There is no frequentist detour: the book teaches Bayesian inference as the primary framework, which works better both pedagogically and in practice.
3. Practical Implementation
Every concept is implemented in code (R with the rethinking package). I followed along in Python using PyMC3, which deepened my understanding.
Key Concepts
Small Worlds and Large Worlds
- Small world: The model we build
- Large world: The real world
Models are always wrong, but some are useful. Understanding this distinction is crucial.
The Golem of Prague
Statistical models are like golems - powerful but mindless. They do exactly what you tell them, not what you mean. This metaphor runs throughout the book.
Multilevel Models
The chapters on multilevel (hierarchical) models are exceptional; a minimal partial-pooling sketch follows the list below. They show how to:
- Pool information across groups
- Model varying effects
- Handle imbalanced data
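To make partial pooling concrete, here is a minimal varying-intercepts sketch in PyMC3. It is my own illustration rather than a model from the book, and n_groups, group_idx, trials, and successes are made-up placeholders for grouped binomial data.

# Varying intercepts with partial pooling (toy data, my own sketch)
import numpy as np
import pymc3 as pm

n_groups = 5
group_idx = np.repeat(np.arange(n_groups), 20)   # 20 observations per group
trials = np.full(group_idx.shape, 10)            # 10 trials per observation
successes = np.random.binomial(trials, 0.3)      # fake outcomes

with pm.Model() as varying_intercepts:
    # Hyperpriors: the population that group intercepts are drawn from
    a_bar = pm.Normal('a_bar', mu=0, sigma=1.5)
    sigma_a = pm.Exponential('sigma_a', 1.0)
    # One intercept per group, partially pooled toward a_bar
    a = pm.Normal('a', mu=a_bar, sigma=sigma_a, shape=n_groups)
    # Binomial likelihood on the logit scale
    p = pm.math.invlogit(a[group_idx])
    pm.Binomial('obs', n=trials, p=p, observed=successes)
    trace = pm.sample(1000)

Groups with little data get pulled toward the population mean a_bar, which is exactly the pooling behavior the book builds intuition for.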
Practical Applications
I applied these concepts to:
- A/B testing: the Bayesian approach yields a posterior for each variant's conversion rate and for their difference, rather than a single p-value (see the sketch after this list)
- Time series: Hierarchical models for multiple related series
- Causal inference: DAGs for thinking about confounding
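As an illustration of the A/B-testing point, here is a minimal Beta-Binomial comparison in PyMC3. The conversion counts are invented and the model is my own sketch, not something taken from the book.

# Bayesian A/B test sketch (invented conversion counts)
import pymc3 as pm

trials_a, conversions_a = 1000, 52
trials_b, conversions_b = 1000, 68

with pm.Model() as ab_test:
    # Flat Beta(1, 1) priors on each conversion rate
    p_a = pm.Beta('p_a', alpha=1, beta=1)
    p_b = pm.Beta('p_b', alpha=1, beta=1)
    pm.Binomial('obs_a', n=trials_a, p=p_a, observed=conversions_a)
    pm.Binomial('obs_b', n=trials_b, p=p_b, observed=conversions_b)
    # Quantity of interest: the difference in conversion rates
    diff = pm.Deterministic('diff', p_b - p_a)
    trace = pm.sample(1000)

# Posterior probability that variant B converts better than A
print((trace['diff'] > 0).mean())

Rather than a binary reject/accept decision, the posterior for diff tells you how large the difference plausibly is.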
Favorite Sections
Chapter 5: The Many Variables & The Spurious Waffles
A brilliant example using state-level divorce rates and Waffle House density to teach spurious correlation, confounding, and DAGs. Funny and educational. A toy simulation of the same idea follows.
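To show the flavor of that argument, here is a toy simulation of my own (it does not use the book's Waffle House data): two variables driven by a common cause look correlated until you condition on that cause.

# Spurious correlation from a common cause (toy simulation)
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
cause = rng.normal(size=n)                  # the shared driver (confounder)
x = cause + rng.normal(scale=0.5, size=n)   # stand-in for "waffle houses"
y = cause + rng.normal(scale=0.5, size=n)   # stand-in for "divorce rate"

# Marginally, x and y look strongly correlated...
print(np.corrcoef(x, y)[0, 1])

# ...but within a narrow slice of the common cause, the association vanishes
mask = np.abs(cause) < 0.1
print(np.corrcoef(x[mask], y[mask])[0, 1])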
Chapter 13: Models With Memory
An introduction to multilevel models, which “remember” the groups they have seen and pool information between them. The examples with chimpanzees and oceanic societies are memorable.
Code Examples
I recreated all of the examples in Python/PyMC3. The translation process was itself educational: it forced me to understand exactly what each model was doing.
# Example: Simple linear regression in PyMC3
# (assumes x and y are the predictor and outcome arrays, loaded elsewhere)
import pymc3 as pm

with pm.Model() as model:
    # Priors
    alpha = pm.Normal('alpha', mu=0, sigma=10)
    beta = pm.Normal('beta', mu=0, sigma=10)
    sigma = pm.HalfNormal('sigma', sigma=1)
    # Likelihood: linear model for the mean, Gaussian observation noise
    mu = alpha + beta * x
    y_obs = pm.Normal('y_obs', mu=mu, sigma=sigma, observed=y)
    # Inference: draw posterior samples with NUTS
    trace = pm.sample(1000)
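After sampling, my first check is usually a quick posterior summary (PyMC3 delegates this to ArviZ), for example:

# Posterior means, standard deviations, and credible intervals
print(pm.summary(trace))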
Reflections
This book fundamentally changed how I approach statistical modeling:
- Think causally, not just correlationally
- Use priors to encode domain knowledge (a prior predictive sketch follows this list)
- Embrace uncertainty quantification
- Validate models against reality, not just other models
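One habit related to the priors point above is prior predictive simulation: drawing from the priors before seeing data to check whether they imply plausible outcomes. Here is a minimal sketch of my own, using the Normal(0, 10) priors from the regression example earlier; it is not code from the book.

# Prior predictive check for the regression priors above (my own sketch)
import numpy as np

rng = np.random.default_rng(0)
n_sims = 100
alpha = rng.normal(0, 10, size=n_sims)   # same Normal(0, 10) priors as the model
beta = rng.normal(0, 10, size=n_sims)

# Regression lines implied by the priors over a plausible predictor range
x_grid = np.linspace(-2, 2, 50)
lines = alpha[:, None] + beta[:, None] * x_grid

# If these lines span absurd outcome values, the priors are too vague
print(lines.min(), lines.max())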
Final Verdict
The best statistics book I’ve read. Challenging but rewarding. I’ll be returning to this book for years to come.
My Rating: 10/10
Note: This is my personal assessment based on how much the book influenced my thinking or provided practical value.