Applied Data Science & ML

Hi, my name is Billy. This is where I try to archive topics related to Applied Data Science & ML.

 

Changes to my workflow

What changed in how I work after handing real tasks to a terminal coding agent, and what it replaced.

Apr 12, 2026
 

Going down a random rabbit hole: From XML Tags to $100M Weight Updates

Why Anthropic leans on XML tags. Following the thread from prompt formatting down to how self-attention was trained.

Mar 4, 2026
 

Estimating the Distribution of Omitted Variable Bias in Causal Inference

CausalInference

How large would an unobserved confounder need to be to overturn your causal estimate? A survey of methods for bounding omitted variable bias.

Apr 6, 2025

When Linear Regression Gets Massively Confused

DataScienceBasics
Know-how
regression

A mass point (a big cluster of identical values) can wreck linear regression after a log transform. Why it happens and what it breaks.

Mar 15, 2025

Causal Inference: Assessing Overlap in Covariate Distributions

CausalInference
DataScience
DataScienceBasics
howto

Working through Chapter 14 of Imbens & Rubin: the diagnostics for checking covariate overlap between treatment and control, with the formulas spelled out.

Sep 1, 2024

Recommendations as treatments

CausalInference
Recommendations

Joachims et al. reframe recommender systems as policies you can study with causal tools like inverse propensity weighting. My summary.

Jun 17, 2024
 

Causal Inference cheatsheet

CausalInference
Cheatsheet

A single-page reference for causal inference methods and the load-bearing assumption behind each one, following Facure’s Causal Inference for the Brave and True.

Jan 15, 2024
 

Relationship of covariance and dot product

DataScienceBasics

Covariance is a dot product of centered variables. The short derivation that connects the statistical and geometric views.

Feb 1, 2022

Deep dive into MLOps.

mlops

What happens after the ML proof-of-concept: the deployment patterns, tradeoffs, and failure modes of putting a model into production.

May 14, 2021

Data engineering: simple and complex data pipelines

data-engineering

Notes on data pipeline patterns from a Chris Riccomini talk, which I picked up while moving from ML work into data engineering.

Apr 13, 2021

Takeaways from Kaggle’s “Jane Street Market Prediction” competition

kaggle
tips and tricks

What I took away from Kaggle’s Jane Street Market Prediction competition: cross-validation, Keras tuning, and fast inference tricks.

Mar 13, 2021

Not so simple classification.

classification

Binary classification gets called easy. A POC predicting campaign visits showed me where that assumption falls apart.

Feb 14, 2021