James–Stein
James–Stein changed statistical inference forever, but ML didn't get the memo; at least, many practitioners seem oddly unaware of it. In 1955, Charles Stein found that the maximum likelihood estimator for the mean of a multivariate Gaussian is inadmissible (bad) under squared-error loss once the dimension is three or more. A suitably shrunk version, pulled towards zero, has lower risk for every value of the true mean (good). In this note, I review the estimation theory and connect James–Stein to two popular methods: L2 regularization and weight decay.
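To make the claim concrete before the theory, here is a minimal simulation sketch. It assumes a single observation x ~ N(theta, I) with known unit variance and uses the standard positive-part-free James–Stein form (1 - (d - 2) / ||x||^2) x. The function names, the dimension d = 10, the choice of theta, and the trial count are all illustrative assumptions, not anything fixed by this note.

```python
import numpy as np


def james_stein(x):
    """Shrink each row of x towards zero, assuming unit noise variance."""
    d = x.shape[-1]
    norm_sq = np.sum(x**2, axis=-1, keepdims=True)
    return (1.0 - (d - 2) / norm_sq) * x


def simulate_risks(theta, n_trials=20000, seed=0):
    """Monte Carlo estimate of the squared-error risk of the MLE and of James-Stein."""
    rng = np.random.default_rng(seed)
    # One d-dimensional observation per trial; the MLE of theta is x itself.
    x = theta + rng.standard_normal((n_trials, theta.size))
    mle_risk = np.mean(np.sum((x - theta) ** 2, axis=1))
    js_risk = np.mean(np.sum((james_stein(x) - theta) ** 2, axis=1))
    return mle_risk, js_risk


# Hypothetical setting: d = 10, true mean is the all-ones vector.
mle_risk, js_risk = simulate_risks(np.ones(10))
print(f"MLE risk: {mle_risk:.2f}, James-Stein risk: {js_risk:.2f}")
```

The MLE's risk should come out close to d, while the shrunk estimator's should be visibly smaller. Trying other values of theta, including ones far from zero, is a quick way to see that the gap shrinks but does not flip sign.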