It is difficult to stop the impulse to reveal secrets in conversation, as if information had the desire to live and the power to multiply. – Nassim Taleb, The Bed of Procrustes
I am an Applied Scientist at AWS AI Labs working on deep learning and computer vision. I earned my Ph.D. in applied mathematics at UCLA under the supervision of Guido Montúfar. Previously, I primarily studied the training/optimization process of neural networks, seeking to understand how the parameterization and the algorithm influence the network's properties during training and at convergence. Currently, I focus on developing scalable machine learning methods that respect individuals' data-usage and privacy rights.
Some of my research interests:
- Machine unlearning
- Data privacy
- Compute- and parameter-efficient adaptation of large models
- Implicit bias/regularization of gradient descent
- Neural Tangent Kernel (NTK)
Classically, machine learning has focused on training a model on a single monolithic dataset. For deployed models, however, this paradigm is problematic. As new data become available (continual learning), fine-tuning can be prohibitively expensive or lead to poor performance (catastrophic forgetting). Furthermore, users can change their sharing preferences at any time, leading to datasets that shrink over time (machine unlearning) or to different subsets of the data being usable by different users (compartmentalization). To address these challenges, our CVPR 2023 paper introduces the À-la-carte Learning problem: construct bespoke machine learning models specific to each user's data access rights and preferences. With our method, À-la-carte Prompt Tuning (APT), we show that models can be assembled by composing prompts learned on compartmentalized data sources, achieving performance competitive with monolithic training while offering further benefits for privacy, model securitization, and model customization. On the continual learning benchmarks Split-CIFAR100 and CORe50, APT achieves state-of-the-art performance. For more information, see the video below.
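The composition idea can be illustrated with a toy sketch. This is not the paper's implementation: the frozen "backbone", the random stand-in prompts, and the logit-averaging rule are all illustrative assumptions (APT composes learned prompts inside a transformer), but the sketch captures the core mechanic of assembling a bespoke model from exactly the prompts a user's data rights allow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: embedding dim, number of classes, prompt length.
D, C, P = 8, 3, 2
W = rng.normal(size=(D, C))  # frozen classifier head (stands in for a backbone)

def backbone(tokens):
    """Frozen model: mean-pool token embeddings, then project to class logits."""
    return tokens.mean(axis=0) @ W

# Prompts "learned" separately on two disjoint data sources
# (random stand-ins here; in practice these are trained via prompt tuning).
prompt_a = rng.normal(size=(P, D))
prompt_b = rng.normal(size=(P, D))

def a_la_carte_predict(x_tokens, prompts):
    """Compose a bespoke model from only the prompts the user may access:
    prepend each allowed prompt, run the frozen backbone once per prompt,
    and average the resulting logits."""
    logits = [backbone(np.vstack([p, x_tokens])) for p in prompts]
    return np.mean(logits, axis=0)

x = rng.normal(size=(4, D))                          # an input of 4 token embeddings
full = a_la_carte_predict(x, [prompt_a, prompt_b])   # both sources permitted
restricted = a_la_carte_predict(x, [prompt_a])       # source B revoked (unlearned)
```

Because each prompt is trained in isolation, revoking a data source only requires dropping its prompt at inference time; no retraining of the shared backbone is needed.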
09/2023 Our paper Your representations are in the network: composable and parallel adaptation for large scale models was accepted to NeurIPS 2023
07/2023 I have started work as an Applied Scientist at AWS AI Labs
06/2023 My thesis was accepted and I have earned my Ph.D. in mathematics
02/2023 Our paper À-la-carte Prompt Tuning (APT): Combining Distinct Data Via Composable Prompting was accepted to CVPR 2023
01/2023 Our paper Characterizing the Spectrum of the NTK via a Power Series Expansion was accepted to ICLR 2023
06/2022 Our paper Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime was accepted to NeurIPS 2022
06/2022 This summer I will be an Applied Scientist Intern at Amazon (AWS)
01/2022 Our paper Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks was accepted to ICLR 2022. (If you are interested in this topic, our more recent work offers significant improvements.)
06/2021 I will be spending the summer in Leipzig, Germany as a summer researcher at the Max Planck Institute for Mathematics in the Sciences working on training dynamics of neural networks