Udari Madhushani

Research Scientist, Sony AI

I am a research scientist at Sony AI, where I lead efforts to enhance the safety and utility of large-scale generative models.

I received my PhD from Princeton University, where I was advised by Prof. Prateek Mittal and Prof. Mung Chiang. I previously interned at Meta AI (AI Red Team) and Microsoft Research. I have been fortunate to receive the Qualcomm Innovation Fellowship and the Rising Star Award in adversarial machine learning. I also organized the first seminar series on Security & Privacy in Machine Learning (SPML) at Princeton University.

Research Interests. I believe AI will have a transformative impact on society; however, more effort is needed to make AI systems safe and trustworthy. My work focuses on uncovering and mitigating safety risks and on building the next generation of trustworthy AI systems.
  • Safer generative AI. We have demonstrated privacy risks in real-world diffusion models and developed privacy-preserving sampling and training methods [1, 2, 3, 4]. We have also developed and benchmarked techniques for automated generation of adversarial and unsafe content from generative models [5, 6].
  • Responsible data synthesis. Concerned by the vast amount of generated content online, we have recently developed techniques to identify synthetic samples [7], even in the absence of artificial watermarks, and to trace them back to their source generative models [8].
  • Robust machine learning. We have conducted an in-depth exploration of adversarially robust learning, including circumventing its higher sample complexity using synthetic data [9], finding fundamental limits on robustness [10], demonstrating higher robustness with transformers [11], studying robustness across threat models [12, 13, 14] and the effect of model scaling and compression [15, 16], and analyzing adversarial risks in transitioning from closed-domain to open-world systems [17, 18].
  • Benchmarking progress in AI safety. We developed the widely adopted RobustBench benchmark [19], followed by MultiRobustBench to account for multiple attacks [20], and most recently JailbreakBench [6] to benchmark progress on jailbreaks against LLMs. We have also released a detailed discussion of the nuanced similarities and distinctions between security and safety approaches to trustworthy AI [21].

News

02/2023
New paper on extracting training data from diffusion models (pdf)
12/2022
Presented our paper on understanding robust representations at NeurIPS'22.
09/2022
Awarded the graduate student award for excellence in service (ECE, Princeton University).
04/2022
Awarded the Charlotte Elizabeth Procter Honorific Fellowship, one of the highest honors at Princeton University.
03/2022
Paper on low-density sampling from diffusion models accepted at CVPR'22.
01/2022
Paper on using synthetic data in robust learning accepted at ICLR'22.
08/2021
Finished an amazing research internship at Facebook AI.
05/2021
Paper on lower bounds on adversarial robustness accepted at ICML 2021 (pdf).
04/2021
Paper on improving robustness using proxy distributions is now out (pdf)!
03/2021
RobustBench won a best paper honorable mention award at the ICLR AiSecure workshop.
01/2021
Self-supervised outlier detection (SSD) paper accepted at ICLR 2021 (pdf, slides).
01/2021
Paper on PatchGuard accepted at USENIX Security 2021 (pdf).
10/2020
Releasing RobustBench, a standardized benchmark for adversarial robustness.
10/2020
Work on fast-convergent federated learning to appear in IEEE JSAC (arxiv).
09/2020
Paper on pruning robust networks (HYDRA) accepted at NeurIPS 2020 (webpage).
07/2020
Paper on background check of deep learning accepted at the ICML OOL workshop (pdf, video).
07/2020
Work on separability of self-supervised representations, and another on critical evaluation of open-world machine learning, accepted at the ICML UDL workshop.
06/2020
Volunteered as junior mentor at Princeton-OLCF-NVIDIA GPU Hackathon.
05/2020
Releasing PatchGuard, a provable defense against adversarial patches (pdf, code).
04/2020
Work on pruning robust networks accepted at ICLR TTML workshop (slides, video).
01/2020
Taught a mini-course on adversarial attacks & defenses in Wintersession 2020 (slides, Colab notebook).
09/2019
Finished an amazing summer research internship at Microsoft Research, Redmond.
08/2019
Paper on robust open-world machine learning accepted at AISec 2019 (Slides).

Selected Publications

Extracting Training Data from Diffusion Models

Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace

arXiv 2023

We show that modern diffusion models, such as Stable Diffusion and Imagen, memorize certain training images, which can be extracted by an adversary during sampling.

A Light Recipe to Train Robust Vision Transformers

Edoardo Debenedetti, Vikash Sehwag, Prateek Mittal

SaTML 2023

Contrary to the conventional wisdom of using heavy data augmentation for ViTs, we show that lighter data augmentation (along with a bag of other tricks) achieves state-of-the-art performance in adversarial training of ViTs.

Generating High Fidelity Data from Low-density Regions using Diffusion Models

Vikash Sehwag, Caner Hazirbas, Albert Gordo, Firat Ozgenel, Cristian Canton Ferrer

CVPR 2022

We improve the sampling process of diffusion models to generate high-fidelity synthetic images from low-density regions of the data distribution, i.e., hard examples.

Understanding Robust Learning through the Lens of Representation Similarities

Christian Cianfarani, Arjun Nitin Bhagoji, Vikash Sehwag, Ben Y. Zhao, Prateek Mittal, Haitao Zheng

NeurIPS 2022

Using representation similarity metrics, such as CKA, we uncover several interesting characteristics of adversarially robust networks compared to their non-robust counterparts.
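For readers unfamiliar with the metric, here is a minimal NumPy sketch of linear CKA between two sets of layer activations; it is a generic implementation of the standard formula, not the analysis code used in the paper.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment (CKA) between two representations.

    X: (n_samples, d1) activations from one layer/network
    Y: (n_samples, d2) activations from another layer/network
    Returns a similarity score in [0, 1].
    """
    # Center each feature dimension across samples
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    numerator = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    denominator = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return numerator / denominator
```

Comparing scores like this layer by layer, between a robustly trained and a standardly trained network, is the kind of analysis the paper builds on.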

Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness?

Vikash Sehwag, Saeed Mahloujifar, Tinashe Handina, Sihui Dai, Chong Xiang, Mung Chiang, Prateek Mittal

ICLR 2022

We show that synthetic data from diffusion models provides a tremendous boost in the generalization performance of robust training.

Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries

Arjun Nitin Bhagoji, Daniel Cullina, Vikash Sehwag, Prateek Mittal

ICML 2021

We provide lower bounds on the cross-entropy loss in the presence of adversarial attacks on basic vision datasets.

SSD: A Unified Framework for Self-Supervised Outlier Detection

Vikash Sehwag, Mung Chiang, Prateek Mittal

ICLR 2021, Short version accepted at NeurIPS SSL workshop, 2020

Using only unlabeled data, we develop a highly successful framework to detect outliers, i.e., out-of-distribution samples.


RobustBench: A Standardized Adversarial Robustness Benchmark

Francesco Croce, Maksym Andriushchenko, Vikash Sehwag,
Nicolas Flammarion, Mung Chiang, Prateek Mittal, Matthias Hein

NeurIPS 2021

We provide a leaderboard to track progress as well as a library for unified access to state-of-the-art defenses against adversarial examples.
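As an illustration, a minimal usage sketch of the accompanying robustbench library (the install command, model name, and function signatures are as I recall from the project README at robustbench.github.io; please check the repository for the current interface):

```python
# pip install git+https://github.com/RobustBench/robustbench.git
from robustbench.data import load_cifar10
from robustbench.utils import load_model

# Load a small clean test batch and one leaderboard entry, then check clean accuracy.
x_test, y_test = load_cifar10(n_examples=100)
model = load_model(model_name="Carmon2019Unlabeled",
                   dataset="cifar10",
                   threat_model="Linf")
acc = (model(x_test).argmax(dim=1) == y_test).float().mean().item()
print(f"Clean accuracy on 100 examples: {acc:.2%}")
```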


Time for a Background Check! Uncovering the impact of Background Features on Deep Neural Networks

Vikash Sehwag, Rajvardhan Oak, Mung Chiang, Prateek Mittal

ICML workshop on Object-Oriented Learning, 2020

We investigate background invariance and influence across 32 deep neural networks on the ImageNet dataset.


On Separability of Self-Supervised Representations

Vikash Sehwag, Mung Chiang, Prateek Mittal

ICML workshop on Uncertainty & Robustness in Deep Learning, 2020

We compare the representations learned by several self-supervised methods with supervised networks.

HYDRA: Pruning Adversarially Robust Neural Networks

Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana

NeurIPS 2020, Short paper in ICLR workshop on Trustworthy Machine Learning, 2020

We achieve state-of-the-art accuracy and robustness for pruned networks (pruning up to 100x).


PatchGuard: Provable Defense against Adversarial Patches Using Masks on Small Receptive Fields

Chong Xiang, Arjun Nitin Bhagoji, Vikash Sehwag, Prateek Mittal

arXiv 2020

A general defense framework to achieve provable robustness against adversarial patches.


Fast-Convergent Federated Learning

Hung T. Nguyen, Vikash Sehwag, Seyyedali Hosseinalipour, Christopher G. Brinton, Mung Chiang, H. Vincent Poor

To appear in IEEE Journal on Selected Areas in Communications (J-SAC) - Series on Machine Learning for Communications and Networks

We propose a fast-convergent federated learning algorithm, called FOLB, which improves convergence speed by intelligently sampling devices in each round.

A Critical Evaluation of Open-World Machine Learning

Liwei Song, Vikash Sehwag, Arjun Nitin Bhagoji, Prateek Mittal

ICML Workshop on Uncertainty & Robustness in Deep Learning, 2020

We discover a conflict between the objective of open-world machine learning and adversarial robustness.

Analyzing the Robustness of Open-World Machine Learning

Vikash Sehwag, Arjun Nitin Bhagoji, Liwei Song, Chawin Sitawarin, Daniel Cullina, Mung Chiang, Prateek Mittal

ACM Workshop on Artificial Intelligence and Security (AISec), 2019

We demonstrate the vulnerability of open-world ML to adversarial examples and propose a defense.


Research Work in Undergraduate

A Parallel Stochastic Number Generator With Bit Permutation Networks with N. Prasad and Indrajit Chakrabarti

IEEE Transactions on Circuits and Systems II: Express Briefs, 2017 (Pdf)

Variation Aware Performance Analysis of TFETs for Low-Voltage Computing with Saurav Maji and Mrigank Sharad

IEEE International Symposium on Nanoelectronic and Information Systems (iNIS), 2016 (Pdf)

TV-PUF: a fast lightweight analog physical unclonable function with Tanujay Saha

IEEE International Symposium on Nanoelectronic and Information Systems (iNIS), 2016 (Pdf)

A Study of Stochastic SIS Disease Spreading on Random Graphs with Wasiur R. KhudaBukhsh and Heinz Koeppl, 2016 (Pdf)

Academic Services

Teaching and Mentoring

Taught a mini-course on adversarial attacks & defenses (Wintersession 2020)

Teaching assistant for ELE 535: Machine Learning and Pattern Recognition (Fall 2019)

Mentoring Princeton undergraduates for their senior independent research work
Tinashe Handina (B.S.E., Electrical Engineering 2021); Matteo Russo (B.S.E., Computer Science 2020)

Other Services

One of the core maintainers of the Adversarial Robustness Benchmark (robustbench.github.io)

Volunteered as junior mentor at Princeton-OLCF-NVIDIA GPU Hackathon (June 2020)

Reviewer for ACM Transactions on Privacy and Security (TOPS), PLOS One

Sub-reviewer for USENIX Security 2018, 2019