Missing at Random Data

In this post, I will discuss the different ways in which data can be missing from a (tabular) dataset and, in particular, how the Missing at Random (MAR) assumption is ambiguous.

What Are Neural Networks Anyway? A Statistician's Perspective

In this post, I’m going to introduce the concept of a neural network at a level appropriate for someone with the mathematical background typical of a Statistics student. I’ll start by reviewing linear regression and then show how a multi-layer perceptron (MLP) is a natural extension.

Deep Learning for Tabular Data

In this post, I will discuss why deep learning is less effective for tabular data than for other data modalities, followed by some reasons why deep learning for tabular data is still worth considering.

Large Deviations

In this post, I am going to give an introduction to Large Deviations, a theory for analyzing the probability of very rare events. I will focus on why Large Deviations theory is useful in some situations where more standard probability tools are not.
