Occasionally, I realize there's an area of statistics or machine learning that I don't understand quite as well as I'd like. When that happens, I read as much in the area as I can, and when I feel I've reached clarity, I write about it. In these posts, I strive to find language, framings, and ways of understanding that build intuition, rather than relying solely on high-context mathematical formalism. I believe it's important for technical writers to be generous: to assume that their readers are smart and capable of understanding a technical field, even if they lack familiarity with it initially. I typically write for people with some exposure to machine learning, but no deep knowledge of a particular niche subfield.
Published on January 20, 2019
NVIDIA's StyleGAN outperforms prior GAN architectures, and does so through thoughtful engineering of pathways for local and global structure. Here, I explain the idea of locality of information, and walk through how the concept is used in this model.
Published on December 26, 2018
Enthusiasts of graph-structured data want to find a convolution analogue for graphs, and the literature went down two paths: one based on a mathematically rigorous definition of convolution, the other more heuristic.
Published on October 1, 2018
Convolution has a history prior to its resurgence as a machine learning operation; this post walks through the mathematical convolution operator, and connects the conceptual dots between historical convolution and the intuitions we've developed for its more applied incarnation.
Published on July 8, 2018
A core problem of reinforcement learning is that, for on-policy algorithms, an update in your policy has the potential to be a catastrophic mistake that leads to a less informative distribution of future actions. This post explores TRPO and PPO, two approaches for making updates in a "safe" way.
Published on May 6, 2018
This second part in a series on VAE representation learning explains why the objective function of a Variational Autoencoder, intended to push toward disentanglement, ends up disincentivizing the model from actually learning informative representations. It additionally explores modifications to the objective function designed to solve this problem.
Published on April 15, 2018
This is the first of a two-part series exploring the problem of representation learning, focusing on what it means to have a representation be disentangled, and why this is a valuable property for learned representations to have.
Published on March 24, 2018
A known problem of modern machine learning is that it can specialize well at specific tasks, but can't necessarily transfer to new tasks, which is, obviously, a prerequisite of anything approaching general intelligence. Meta Learning as a field tries to design ways to train models that can quickly learn to perform well on different tasks. This post explores the nuances in what's considered a new "task", and summarizes the current state of the art in the field.
Published on February 16, 2018
An important and subtle fact about trained models is that they don't just make binary predictions, they produce scores, and those scores contain valuable information about the strength of a model's beliefs about a particular observation. This post describes how highly-performant small models can be trained by using the predicted distributions of much larger, more resource-intensive models as targets, leveraging this additional learned information.
Published on February 3, 2018
This is the second post in an adversarial examples series; where the first focused on the construction of adversarial attacks, this one examines why it is so difficult to devise reliable defenses against these attacks, walking through some of the approaches that have been tried, with mostly middling success.
Published on January 23, 2018
This is the first part in a series on adversarial examples, and explains how adversarial examples are generated, and some common theories for what aspects of models might make them vulnerable to these particular kinds of modifications.
Published on January 12, 2018
One popular way to explain model predictions is by designing an attribution strategy: a way to determine how much each feature contributed to a prediction. This is interestingly complicated in cases where features are highly correlated with one another. This post explores an approach for this derived from game theory for sharing credit among players in a game.
Published on January 12, 2018
What do people actually want when they call for more interpretable models? How much of the challenge of interpretability comes from models themselves being complex, and how much is an inherent problem caused by the difficulty of building a human narrative on top of very raw information?
Published on December 26, 2017
What do we want when we call for "fairer" machine learning systems? This post argues that "unfairness" is often a property of data collected from a world that we normatively believe is not as it should be, and that building fairer systems will require a clear conversation about what kinds of fairness we want our models to embody, since our normative preferences are not an inherent part of the world the model learns from.