Was a core technical member of a team researching solutions to a low-data, multi-class classification problem. In support of the team's research questions, implemented active learning methods, an attention-based URL encoder mechanism, and a novel semi-supervised method in Python/PyTorch. Authored and gave an accepted talk at CAMLIS on a novel clustered loss function used to integrate asymmetric costs of misclassification in a multi-class setting.
Was part of a team working to improve a core production model for identifying malicious binary files. Proposed adding a non-binary auxiliary loss, which generated one of the highest performance boosts of the research period. Built a Python framework for easy calculation of metrics across core structures (multi-class, multi-label). Author on a paper, currently being prepared for submission, on the value gained through creative use of auxiliary loss functions.
Author on a paper, accepted into the S&P Deep Learning in Security workshop, on a neural network design using shared weights over document aggregations at multiple resolutions for HTML detection. Helped design and write Keras implementations of baseline network structures to test the value of specific architectural choices. Presented the accepted paper at the S&P conference in May 2018.
Surveyed the current state of the art in model explanation techniques to support both data science and compliance teams. Tested LIME and feature perturbation analysis as explanation methods, comparing both the quality of their results and their efficiency. Built a NumPy-optimized Python implementation of a feature perturbation explanation system.
Researched methods to detect high-density anomalies in multivariate, categorical time-series data. Designed, implemented, and launched a time-series variant of CLICKS, a subspace clustering technique.