Research Deep Dives

Occasionally, I realize there's an area of statistics or machine learning that I don't understand quite as well as I'd like. When that happens, I read as much in the area as I can, and when I feel like I've reached clarity, I write about it. In these posts, I strive to find language, framings, and ways of understanding that build intuition, rather than relying solely on high-context mathematical formalism. I believe it's important for technical writers to be generous: to assume that their readers are smart and capable of understanding a technical field, even if they aren't familiar with it initially. I typically write for people with some exposure to machine learning, but no deep knowledge of a particular niche subfield.

Generating, With Style: The Mechanics Behind NVIDIA’s Highly Realistic GAN Images

Published on January 20, 2019

NVIDIA's StyleGAN outperforms prior GAN architectures, and does so through thoughtful engineering of pathways for local and global structure. Here, I explain the idea of locality of information and walk through how that concept is used in this model.

A Tale of Two Convolutions: Differing Design Paradigms for Graph Neural Networks

Published on December 26, 2018

Enthusiasts of graph-structured data want to find a convolution analogue for graphs, and the literature has gone down two paths: one based on a mathematically rigorous definition of convolution, the other more heuristic.

Convolution: An Exploration of a Familiar Operator’s Deeper Roots

Published on October 1, 2018

Convolution has a history prior to its resurgence as a machine learning operation; this post walks through the mathematical convolution operator and connects the conceptual dots between historical convolution and the intuitions we've developed for its more applied incarnation.

The Pursuit of (Robotic) Happiness: How TRPO and PPO Stabilize Policy Gradient Methods

Published on July 8, 2018

A core problem of reinforcement learning is that, for on-policy algorithms, an update in your policy has the potential to be a catastrophic mistake that leads to a less informative distribution of future actions. This post explores TRPO and PPO, two approaches for making updates in a "safe" way.
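
To give a rough sense of what a "safe" update means here, below is a minimal NumPy sketch of a PPO-style clipped surrogate objective. The function name and the 0.2 clipping value are illustrative assumptions, not anything prescribed by the post.

```python
import numpy as np

def ppo_clipped_objective(new_logprob, old_logprob, advantage, clip_eps=0.2):
    """PPO-style clipped surrogate: the probability ratio between the new and
    old policy is clipped so a single update can't move the policy too far."""
    ratio = np.exp(new_logprob - old_logprob)          # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    # Taking the elementwise minimum makes the objective pessimistic: large,
    # potentially destructive policy jumps receive no extra credit.
    return np.minimum(unclipped, clipped).mean()
```

Taking the minimum of the clipped and unclipped terms is what removes the incentive to make very large policy jumps in a single update.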

With Great Power Comes Poor Latent Codes: Representation Learning in VAEs Pt 2

Published on May 6, 2018

This second part in a series on VAE representation learning explains why the objective function of a Variational Autoencoder, intended to push towards disentanglement, ends up disincentivizing the model from actually learning informative representations. It additionally explores modifications to the objective function designed to solve this problem.

What a Disentangled Net We Weave: Representation Learning in VAEs Pt 1

Published on April 15, 2018

This is the first of a two-part series exploring the problem of representation learning, focusing on what it means to have a representation be disentangled, and why this is a valuable property for learned representations to have.

Learning About Algorithms That Learn to Learn

Published on March 24, 2018

A known problem of modern machine learning is that models can specialize well at specific tasks, but can't necessarily transfer to new ones, and that kind of transfer is a prerequisite of anything approaching general intelligence. Meta-learning, as a field, tries to design ways to train models that can quickly learn to perform well on different tasks. This post explores the nuances in what counts as a new "task", and summarizes the current state of the art in the field.

Turning Up the Heat: The Mechanics of Model Distillation

Published on February 16, 2018

An important and subtle fact about trained models is that they don't just make binary predictions; they produce scores, and those scores contain valuable information about the strength of a model's beliefs about a particular observation. This post describes how highly performant small models can be trained by using the predicted distributions of much larger, more resource-intensive models as targets, leveraging this additional learned information.
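
To make the mechanism concrete, here is a minimal NumPy sketch of temperature-softened teacher targets and a distillation loss. The temperature of 4.0 and the helper names are assumptions for illustration, not the post's exact formulation.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the student's softened predictions and the
    teacher's softened distribution: the student learns from the teacher's
    full belief over classes, not just its hard argmax label."""
    soft_targets = softmax(teacher_logits, temperature)        # teacher's softened beliefs
    student_log_probs = np.log(softmax(student_logits, temperature) + 1e-12)
    return -(soft_targets * student_log_probs).sum(axis=-1).mean()
```

Raising the temperature above 1 spreads probability mass onto the non-argmax classes, which is where the teacher's "dark knowledge" about class similarity lives.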

The Modeler Strikes Back: Defense Strategies Against Adversarial Attacks

Published on February 3, 2018

This is the second post in an adversarial examples series; where the first focused on the construction of adversarial attacks, this one examines why it is so difficult to devise reliable defenses against them, walking through some of the approaches that have been tried, mostly to middling success.

Know Your Adversary: Understanding Adversarial Examples

Published on January 23, 2018

This is the first part in a series on adversarial examples, and explains how adversarial examples are generated, and some common theories for what aspects of models might make them vulnerable to these particular kinds of modifications.
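
As one concrete example of how generation can work, the fast gradient sign method perturbs an input in the direction that most increases the model's loss. The toy logistic-regression setup below is a hypothetical sketch for illustration, not code from the post.

```python
import numpy as np

def fgsm_example(x, y, w, b, epsilon=0.05):
    """Toy fast-gradient-sign perturbation against a logistic-regression model.
    w, b: model weights; y: true label in {0, 1}; epsilon: perturbation size."""
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))   # model's predicted probability
    grad_x = (p - y) * w                            # gradient of the loss w.r.t. the input
    # Step each input dimension slightly in the direction that increases the loss.
    return x + epsilon * np.sign(grad_x)
```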

One Feature Attribution Method to (Supposedly) Rule Them All: Shapley Values

Published on January 12, 2018

One popular way to explain model predictions is by designing an attribution strategy: a way to determine how much each feature contributed to a prediction. This becomes interestingly complicated when features are highly correlated with one another. This post explores an approach derived from game theory, originally devised for sharing credit among players in a cooperative game.
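
For a sense of how that credit-sharing works, here is a brute-force sketch of exact Shapley values. The `model_value` function, a stand-in for a model's prediction when restricted to a subset of features, is a hypothetical helper; practical toolkits approximate this exponential sum rather than enumerating it.

```python
from itertools import combinations
from math import factorial

def shapley_values(model_value, features):
    """Exact Shapley attribution by enumerating all coalitions of features.
    model_value(subset) -> prediction using only the features in `subset`.
    Exponential in the number of features, so illustrative only."""
    n = len(features)
    values = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for size in range(n):
            for coalition in combinations(others, size):
                # Weight of this coalition in the Shapley formula.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                # Marginal contribution of feature f when added to this coalition.
                marginal = model_value(set(coalition) | {f}) - model_value(set(coalition))
                values[f] += weight * marginal
    return values
```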

Tell Me a Story: Thoughts on Model Interpretability

Published on January 12, 2018

What do people actually want when they call for more interpretable models? How much of the challenge of interpretability comes from models themselves being complex, and how much is an inherent problem caused by the difficulty of building a human narrative on top of very raw information?

Fair and Balanced? Thoughts on Bias in Probabilistic Modeling

Published on December 26, 2017

What do we want when we call for "fairer" machine learning systems? This post argues that "unfairness" is often a property of data collected from a world that we normatively believe is not as it should be, and that building fairer systems will require a clear conversation about what kinds of fairness we want our models to embody, since our normative preferences are not an inherent part of the world the model learns from.
