Udari Madhushani

Research Scientist, Sony AI

I am a research scientist at Sony AI, where I lead efforts to enhance the safety and utility of large-scale generative models.

I received my PhD from Princeton University, where I was advised by Prof. Prateek Mittal and Prof. Mung Chiang. I previously interned at Meta AI (AI Red Team) and Microsoft Research. I have been fortunate to receive the Qualcomm Innovation Fellowship and the Rising Star Award in adversarial machine learning. I also organized the first seminar series on Security & Privacy in Machine Learning (SPML) at Princeton University.

Research Interests. I believe AI will have a transformative impact on society; however, more effort is needed to make AI systems safe and trustworthy. My work focuses on uncovering and mitigating safety risks and on building the next generation of trustworthy AI systems.
  • Safer generative AI. We have demonstrated privacy risks in real-world diffusion models and developed privacy-preserving sampling and training methods [1, 2, 3, 4]. We have also developed and benchmarked techniques for automated generation of adversarial and unsafe content from generative models [5, 6].
  • Responsible data synthesis. Concerned by the vast amount of generated content online, we have recently developed techniques to identify synthetic samples [7], even in the absence of artificial watermarks, and to trace them back to their source generative models [8].
  • Robust machine learning. We have conducted an in-depth exploration of adversarially robust learning, including circumventing its higher sample complexity using synthetic data [9], finding fundamental limits on robustness [10], demonstrating higher robustness with transformers [11], studying robustness across threat models [12, 13, 14] and the effect of model scaling and compression [15, 16], and analyzing adversarial risks in transitioning from closed-domain to open-world systems [17, 18].
  • Benchmarking progress in AI safety. We developed the widely adopted RobustBench benchmark [19], followed by MultiRobustBench to account for multiple attacks [20], and most recently JailbreakBench [6] to benchmark progress on jailbreaks against LLMs. We have also released a detailed discussion of the nuanced similarities and distinctions between security and safety approaches to trustworthy AI [21].

News

02/2023
New paper on extracting training data from diffusion models (pdf)
12/2022
Presented our paper on understanding robust representations at NeurIPS'22.
09/2022
Awarded the graduate student award for excellence in service (ECE, Princeton University).
04/2022
Awarded the Charlotte Elizabeth Procter Honorific Fellowship, one of the highest honors at Princeton University.
03/2022
Paper on low-density sampling from diffusion models accepted at CVPR'22.
01/2022
Paper on using synthetic data in robust learning accepted at ICLR'22.
08/2021
Finished an amazing research internship at Facebook AI.
05/2021
Paper on lower bounds on adversarial robustness accepted at ICML 2021 (pdf).
04/2021
Paper on improving robustness using proxy distributions is now out (pdf)!
03/2021
RobustBench won a best paper honorable mention award at the ICLR AiSecure workshop.
01/2021
Self-supervised outlier detection (SSD) paper accepted at ICLR 2021 (pdf, slides).
01/2021
Paper on PatchGuard accepted at USENIX Security 2021 (pdf).
10/2020
Releasing RobustBench, a standardized benchmark for adversarial robustness.
10/2020
Work on fast-convergent federated learning to appear in IEEE JSAC (arxiv).
09/2020
Paper on pruning robust networks (HYDRA) accepted at NeurIPS 2020 (webpage).
07/2020
Paper on background check of deep learning accepted at the ICML OOL workshop (pdf, video).
07/2020
Work on separability of self-supervised representations, and another on critical evaluation of open-world machine learning, accepted at the ICML UDL workshop.
06/2020
Volunteered as junior mentor at Princeton-OLCF-NVIDIA GPU Hackathon.
05/2020
Releasing PatchGuard, a provable defense against adversarial patches (pdf, code).
04/2020
Work on pruning robust networks accepted at ICLR TTML workshop (slides, video).
01/2020
Taught a mini-course on adversarial attacks & defenses in Wintersession 2020 (slides, Colab notebook).
09/2019
Finished an amazing summer research internship at Microsoft Research, Redmond.
08/2019
Paper on robust open-world machine learning accepted at AISec 2019 (Slides).

Selected Publications

Extracting Training Data from Diffusion Models

Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace

arXiv 2023

We show that modern diffusion models, such as Stable Diffusion and Imagen, memorize certain training images, which can be extracted by an adversary during sampling.

A Light Recipe to Train Robust Vision Transformers

Edoardo Debenedetti, Vikash Sehwag, Prateek Mittal

SaTML 2023

Contrary to the conventional wisdom of using heavy data augmentation for ViTs, we show that lighter data augmentation (along with a bag of other tricks) achieves state-of-the-art performance in adversarial training of ViTs.

Generating High Fidelity Data from Low-density Regions using Diffusion Models

Vikash Sehwag, Caner Hazirbas, Albert Gordo, Firat Ozgenel, Cristian Canton Ferrer

CVPR 2022

We improve the sampling process of diffusion models to generate high-fidelity synthetic images from low-density regions of the data distribution, i.e., hard examples.

Understanding Robust Learning through the Lens of Representation Similarities

Christian Cianfarani, Arjun Nitin Bhagoji, Vikash Sehwag, Ben Y. Zhao, Prateek Mittal, Haitao Zheng

NeurIPS 2022

Using representation similarity metrics, such as CKA, we uncover several interesting characteristics of adversarially robust networks compared to their non-robust counterparts.
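For readers unfamiliar with the metric, here is a minimal NumPy sketch of linear CKA between two sets of layer activations; it is a generic implementation of the standard formula, not the analysis code used in the paper.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment (CKA) between two representations.

    X: (n_samples, d1) activations from one layer/network
    Y: (n_samples, d2) activations from another layer/network
    Returns a similarity score in [0, 1].
    """
    # Center each feature dimension across samples
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    numerator = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    denominator = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return numerator / denominator
```

Comparing scores like this layer by layer, between a robustly trained and a standardly trained network, is the kind of analysis the paper builds on.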

Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness?

Vikash Sehwag, Saeed Mahloujifar, Tinashe Handina, Sihui Dai, Chong Xiang, Mung Chiang, Prateek Mittal

ICLR 2022

We show that synthetic data from diffusion models provides a tremendous boost in the generalization performance of robust training.

Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries

Arjun Nitin Bhagoji, Daniel Cullina, Vikash Sehwag, Prateek Mittal

ICML 2021

We provide lower bounds on the cross-entropy loss in the presence of adversarial attacks on basic vision datasets.

SSD: A Unified Framework for Self-Supervised Outlier Detection

Vikash Sehwag, Mung Chiang, Prateek Mittal

ICLR 2021, Short version accepted at NeurIPS SSL workshop, 2020

Using only unlabeled data, we develop a highly successful framework to detect outliers, i.e., out-of-distribution samples.


RobustBench: A Standardized Adversarial Robustness Benchmark

Francesco Croce, Maksym Andriushchenko, Vikash Sehwag,
Nicolas Flammarion, Mung Chiang, Prateek Mittal, Matthias Hein

NeurIPS 2021

We provide a leaderboard to track progress as well as a library for unified access to state-of-the-art defenses against adversarial examples.
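As an illustration, a minimal usage sketch of the accompanying robustbench library (the install command, model name, and function signatures are as I recall from the project README at robustbench.github.io; please check the repository for the current interface):

```python
# pip install git+https://github.com/RobustBench/robustbench.git
from robustbench.data import load_cifar10
from robustbench.utils import load_model

# Load a small clean test batch and one leaderboard entry, then check clean accuracy.
x_test, y_test = load_cifar10(n_examples=100)
model = load_model(model_name="Carmon2019Unlabeled",
                   dataset="cifar10",
                   threat_model="Linf")
acc = (model(x_test).argmax(dim=1) == y_test).float().mean().item()
print(f"Clean accuracy on 100 examples: {acc:.2%}")
```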


Time for a Background Check! Uncovering the impact of Background Features on Deep Neural Networks

Vikash Sehwag, Rajvardhan Oak, Mung Chiang, Prateek Mittal

ICML workshop on Object-Oriented Learning, 2020

We investigate background invariance and influence across 32 deep neural networks on the ImageNet dataset.


On Separability of Self-Supervised Representations

Vikash Sehwag, Mung Chiang, Prateek Mittal

ICML workshop on Uncertainty & Robustness in Deep Learning, 2020

We compare the representations learned by several self-supervised methods with supervised networks.

HYDRA: Pruning Adversarially Robust Neural Networks

Vikash Sehwag, Shiqi Wang, Prateek Mittal, Suman Jana

NeurIPS 2020, Short paper in ICLR workshop on Trustworthy Machine Learning, 2020

We achieve state-of-the-art accuracy and robustness for pruned networks (pruning up to 100x).


PatchGuard: Provable Defense against Adversarial Patches Using Masks on Small Receptive Fields

Chong Xiang, Arjun Nitin Bhagoji, Vikash Sehwag, Prateek Mittal

arXiv 2020

A general defense framework to achieve provable robustness against adversarial patches.


Fast-Convergent Federated Learning

Hung T. Nguyen, Vikash Sehwag, Seyyedali Hosseinalipour, Christopher G. Brinton, Mung Chiang, H. Vincent Poor

To appear in IEEE Journal on Selected Areas in Communications (J-SAC) - Series on Machine Learning for Communications and Networks

We propose a fast-convergent federated learning algorithm, called FOLB, which improves convergence speed by intelligently sampling devices in each round.

A Critical Evaluation of Open-World Machine Learning

Liwei Song, Vikash Sehwag, Arjun Nitin Bhagoji, Prateek Mittal

ICML Workshop on Uncertainty & Robustness in Deep Learning, 2020

We discover a conflict between the objective of open-world machine learning and adversarial robustness.

Analyzing the Robustness of Open-World Machine Learning

Vikash Sehwag, Arjun Nitin Bhagoji, Liwei Song, Chawin Sitawarin, Daniel Cullina, Mung Chiang, Prateek Mittal

ACM Workshop on Artificial Intelligence and Security (AISec), 2019

We demonstrate the vulnerability of open-world ML to adversarial examples and propose a defense.


Research Work in Undergraduate

A Parallel Stochastic Number Generator With Bit Permutation Networks with N. Prasad and Indrajit Chakrabarti

IEEE Transactions on Circuits and Systems II: Express Briefs, 2017 (Pdf)

Variation Aware Performance Analysis of TFETs for Low-Voltage Computing with Saurav Maji and Mrigank Sharad

IEEE International Symposium on Nanoelectronic and Information Systems (iNIS), 2016 (Pdf)

TV-PUF: a fast lightweight analog physical unclonable function with Tanujay Saha

IEEE International Symposium on Nanoelectronic and Information Systems (iNIS), 2016 (Pdf)

A Study of Stochastic SIS Disease Spreading on Random Graphs with Wasiur R. KhudaBukhsh and Heinz Koeppl, 2016 (Pdf)

Academic Services

Teaching and Mentoring

Taught a mini-course on adversarial attacks & defenses (Wintersession 2020)

Teaching assistant for ELE 535: Machine Learning and Pattern Recognition (Fall 2019)

Mentoring Princeton undergraduates for their senior independent research work
Tinashe Handina (B.S.E., Electrical Engineering 2021); Matteo Russo (B.S.E., Computer Science 2020)

Other Services

One of the core maintainers of the Adversarial Robustness Benchmark (robustbench.github.io)

Volunteered as junior mentor at Princeton-OLCF-NVIDIA GPU Hackathon (June 2020)

Reviewer for ACM Transactions on Privacy and Security (TOPS), PLOS One

Sub-reviewer for USENIX Security 2018, 2019