Ranjay Krishna

I teach machines to see like people and interact with people. As modern machines struggle to fully conceptualize the visual world, my research bootstraps machine learning using frameworks from behavioral and social sciences.

Ranjay Krishna is an Assistant Professor at the Paul G. Allen School of Computer Science & Engineering. He co-directs the RAIVN lab at UW and leads the computer vision team at Ai2. His research lies at the intersection of computer vision, natural language processing, robotics, and human-computer interaction. This research has received best paper, outstanding paper, and oral distinctions at CVPR, ACL, CSCW, NeurIPS, UIST, and ECCV, and has been covered by Science, Forbes, the Wall Street Journal, and PBS NOVA. His research has been supported by Google, Apple, Ai2, Amazon, Cisco, Toyota Motor Inc, Toyota Research Institute, NSF, ONR, and Yahoo. He holds a bachelor's degree in Electrical & Computer Engineering and in Computer Science from Cornell University, and a master's degree and a Ph.D. in Computer Science from Stanford University.

RECENT PAPER HIGHLIGHTS

[Jun 2025] Our Molmo paper received Best Paper Honorable Mention at CVPR 2025.
[Jun 2025] Our Molmo paper will appear as an Oral at CVPR 2025, awarded to top 0.7% of submissions.
[Apr 2025] Our interleaved scene graph paper will appear as a Spotlight at ICLR 2025, awarded to top 5% of submissions.
[Dec 2024] Our Multilingual diversity for LLMs paper will appear as a Spotlight at NeurIPS 2024, awarded to top 5% of submissions.
[Jun 2024] Our Visual Program Distillation paper will appear as an Oral at CVPR 2024, awarded to top 5% of submissions.
[May 2024] Our Selective Visual Representations paper will appear as a Spotlight at ICLR 2024, awarded to top 5% of submissions.
[Dec 2023] Our DataComp paper will appear as an Oral at NeurIPS 2023, awarded to top 0.6% of submissions.
[Dec 2023] Our Quilt-1M paper will appear as an Oral at NeurIPS 2023, awarded to top 0.6% of submissions.
[Oct 2023] Our paper on Explanations and human-AI decision making received a Best Paper Honorable Mention at CSCW 2023.
[Mar 2023] Our CREPE paper was recognized as a Highlight at CVPR 2023, awarded to top 2.5% of submissions.

RECENT TALKS @ CONFERENCES

[Jul 2025] Invited talk at CogSci 2025 workshop on Minds in the Making
[Jun 2025] Keynote at CVPR 2025 workshop on Harnessing Generative Models for Synthetic Visual Datasets
[Jun 2025] Keynote at CVPR 2025 workshop on Generalization in Robotics Manipulation
[Jun 2025] Keynote at CVPR 2025 workshop on 3D Vision Language Models for Robotic Manipulation
[Jun 2025] Keynote at CVPR 2025 workshop on Demographic Diversity in Computer Vision
[Mar 2025] Invited talk at RAISE 2025 seminar series at the University of Washington
[Dec 2024] Invited talk at IndoML 2024 Symposium
[Dec 2024] Keynote at NeurIPS 2024 workshop on Multimodal Algorithmic Reasoning
[Oct 2024] Keynote at ECCV 2024 workshop on Efficient Deep Learning for Foundation Models
[Oct 2024] Keynote at ECCV 2024 workshop on Green Foundation Models
[Jun 2024] Invited talk at DUB 2024 speaker series at the University of Washington
[Jun 2024] Keynote at CVPR 2024 workshop on Evaluation of Generative Foundation Models
[Jun 2024] Keynote at CVPR 2024 workshop on Computer Vision with Humans in the Loop
[Apr 2023] Invited DUB seminar talk at the University of Washington
[Oct 2023] Keynote at ICCV 2023 workshop on Scene Graphs and Graph Representation Learning
[Oct 2023] Keynote at ICCV 2023 workshop on Closing the Loop Between Vision and Language
[Aug 2023] Distinguished researcher talk on Compositionality at Salesforce AI
[Jul 2023] Talk on Embodied Intelligence at the AAAI 2023 Inaugural Summer Symposium on Embodied Intelligence
[Jun 2023] Keynote at CVPR 2023 workshop on New Frontiers in Vision and Language Reasoning.

RECENT WORKSHOPS

[Jun 2025] Synthetic Data for Computer Vision at CVPR 2025
[Jun 2024] Synthetic Data for Computer Vision at CVPR 2024
[Oct 2023] International Challenge on Compositional and Multimodal Perception at ICCV 2023
[Jul 2023] Artificial Intelligence and Human-Computer Interaction at ICML 2023

ACADEMIC PUBLICATIONS

Visual Representations inside the Language Model
Benlin Liu, Amita Kamath, Madeleine Grunde-McLaughlin, Winson Han, Ranjay Krishna
CoLM 2025
[pdf coming soon]

The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains
Scott Geng, Hamish Ivison, Chun-Liang Li, Maarten Sap, Jerry Li, Ranjay Krishna, Pang Wei Koh
CoLM 2025
[pdf coming soon]

SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models
Arijit Ray, Jiafei Duan, Ellis L Brown II, Reuben Tan, Dina Bashkirova, Rose Hendrix, Kiana Ehsani, Aniruddha Kembhavi, Bryan A. Plummer, Ranjay Krishna, Kuo-Hao Zeng, Kate Saenko
CoLM 2025
[pdf]

One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory
Chenhao Zheng, Jieyu Zhang, Mohammadreza Salehi, Ziqi Gao, Vishnu Iyengar, Norimasa Kobori, Quan Kong, Ranjay Krishna
ICCV 2025
[pdf]

Contrastive Flow Matching
George Stoica, Vivek Ramanujan*, Xiang Fan*, Ranjay Krishna, Judy Hoffman
ICCV 2025
[pdf]

PathFinder: A Multi-Modal Multi-Agent System for Medical Diagnostic Decision-Making Applied to Histopathology
Mehmet Saygin Seyfioglu*, Fatemeh Ghezloo*, Rustin Soraki*, Wisdom O. Ikezogwo*, Beibin Li*, Tejoram Vivekanandan, Joann G. Elmore, Ranjay Krishna, Linda Shapiro
ICCV 2025
[pdf] [website]

CoSyn: Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
Yue Yang*, Ajay Patel*, Matt Deitke, Tanmay Gupta, Luca Weihs, Andrew Head, Mark Yatskar, Chris Callison-Burch, Ranjay Krishna, Aniruddha Kembhavi, Christopher Clark
ACL 2025
[pdf] [data] [code] [website]

SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation
Haoquan Fang, Markus Grotz, Wilbert Pumacay, Yi Ru Wang, Dieter Fox, Ranjay Krishna, Jiafei Duan
ICML 2025
[pdf] [benchmark] [code] [website] [mentioned in AI Index]

Unsettling the Hegemony of Intention: Agonistic Image Generation
Andre Ye, Andrew Shaw, Ranjay Krishna, Amy Zhang
FAccT 2025
[pdf coming soon]

Improving Interpersonal Communication by Simulating Audiences with Language Models
Ryan Liu, Howard Yen, Raja Marjieh, Thomas L. Griffiths, Ranjay Krishna
CogSci 2025
[pdf] [code]

Perception Tokens Enhance Visual Reasoning in Multimodal Language Models
Mahtab Bigverdi, Zelun Luo, Cheng-Yu Hsieh, Ethan Shen, Dongping Chen, Linda G. Shapiro, Ranjay Krishna
CVPR 2025
[pdf] [website]

Synthetic Visual Genome
Jae Sung Park, Zixian Ma, Linjie Li, Chenhao Zheng, Cheng-Yu Hsieh, Ximing Lu, Khyathi Chandu, Quan Kong, Norimasa Kobori, Ali Farhadi, Yejin Choi, Ranjay Krishna
CVPR 2025
[pdf coming soon]

Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
Shivam Duggal, Yushi Hu, Oscar Michel, Aniruddha Kembhavi, William T. Freeman, Noah A. Smith, Ranjay Krishna, Antonio Torralba, Ali Farhadi, Wei-Chiu Ma
CVPR 2025
[pdf]

RealEdit: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations
Petr Sushko, Ayana Bharadwaj, Zhi Yang Lim, Vasily Ilin, Ben Caffee, Dongping Chen, Mohammadreza Salehi, Cheng-Yu Hsieh, Ranjay Krishna
CVPR 2025
[pdf]

NVILA: Efficient Frontier Visual Language Models
Zhijian Liu, Ligeng Zhu, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Haotian Tang, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Jinyi Hu, Sifei Liu, Ranjay Krishna, Pavlo Molchanov, Jan Kautz, Hongxu Yin, Song Han, Yao Lu
CVPR 2025
[pdf] [website] [code] [demo]

One Diffusion to Generate Them All
Duong H. Le, Tuan Pham, Sangho Lee, Christopher Clark, Aniruddha Kembhavi, Stephan Mandt, Ranjay Krishna, Jiasen Lu
CVPR 2025
[pdf] [code]

Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model
Benlin Liu, Yiqin Wang, Yuhao Dong, Yongming Rao, Yansong Tang, Wei-Chiu Ma, Ranjay Krishna
CVPR 2025
[pdf] [website]

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models
Ai2 + UW
CVPR 2025 [CVPR Oral awarded to top 0.7% of submissions]
[pdf] [live demo]

Semantic and Expressive Variations in Image Captions Across Languages
Andre Ye, Sebastin Santy, Jena D. Hwang, Amy X. Zhang, Ranjay Krishna
CVPR 2025
[pdf]

Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment
Dongping Chen, Ruoxi Chen, Shu Pu, Zhaoyi Liu, Yanru Wu, Caixi Chen, Benlin Liu, Yue Huang, Yao Wan, Pan Zhou, Ranjay Krishna
ICLR 2025 [ICLR Spotlight awarded to top 5% of submissions]
[pdf] [website] [code]

AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
Jiafei Duan, Wilbert Pumacay, Nishanth Kumar, Yi Ru Wang, Shulin Tian, Wentao Yuan, Ranjay Krishna, Dieter Fox, Ajay Mandlekar, Yijie Guo
ICLR 2025
[pdf] [website]

Self-Enhancing Video Data Management System for Compositional Events with Large Language Models
Enhao Zhang, Nicole Sullivan, Brandon Haynes, Ranjay Krishna, Magdalena Balazinska
SIGMOD 2025
[pdf]

DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback
Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian
NAACL 2025
[pdf]

Designing LLM Chains by Adapting Techniques from Crowdsourcing Workflows
Madeleine Grunde-McLaughlin, Michelle S. Lam, Ranjay Krishna, Daniel S. Weld, Jeffrey Heer
TOCHI 2025
[pdf]

2024

Task Me Anything
Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna
NeurIPS 2024
[pdf] [website] [UI] [code]

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples
Baiqi Li, Zhiqiu Lin, Wenxuan Peng, Jean de Dieu Nyandwi, Daniel Jiang, Zixian Ma, Simran Khanuja, Ranjay Krishna, Graham Neubig, Deva Ramanan
NeurIPS 2024
[pdf] [website]

ActionAtlas: A VideoQA Benchmark for Fine-grained Action Recognition
Mohammadreza Salehi, Jae Sung Park, Aditya Kusupati, Ranjay Krishna, Yejin Choi, Hannaneh Hajishirzi, Ali Farhadi
NeurIPS 2024
[pdf]

Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
Yushi Hu*, Weijia Shi*, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A Smith*, Ranjay Krishna*
NeurIPS 2024
[pdf] [website] [code]

Multilingual Diversity Improves Vision-Language Representations
Thao Nguyen, Matthew Wallingford, Sebastin Santy, Wei-Chiu Ma, Sewoong Oh, Ludwig Schmidt, Pang Wei Koh, Ranjay Krishna
NeurIPS 2024 [NeurIPS Spotlight awarded to top 5% of submissions]
[pdf]

The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better
Scott Geng, Cheng-Yu Hsieh, Vivek Ramanujan, Matthew Wallingford, Chun-Liang Li, Pang Wei Koh*, Ranjay Krishna*
NeurIPS 2024
[pdf]

Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass
Ethan Shen, Alan Fan, Sarah M Pratt, Jae Sung Park, Matthew Wallingford, Sham M. Kakade, Ari Holtzman, Ranjay Krishna, Ali Farhadi, Aditya Kusupati
NeurIPS 2024
[pdf]

Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps
Yung-Sung Chuang, Linlu Qiu, Cheng-Yu Hsieh, Ranjay Krishna, Yoon Kim, James R. Glass
EMNLP 2024
[pdf] [blog 1] [blog 2] [video]

Is C4 Dataset Enough for Pruning? An Investigation of Calibration Data for LLM Pruning
Abhinav Bandari, Lu Yin, Cheng-Yu Hsieh, Ajay Kumar Jaiswal, Tianlong Chen, Li Shen, Ranjay Krishna, Shiwei Liu
EMNLP 2024
[pdf]

ImageInWords: Unlocking Hyper-Detailed Image Descriptions
Roopal Garg, Andrea Burns, Burcu Karagol Ayan, Yonatan Bitton, Ceslee Montgomery, Yasumasa Onoe, Andrew Bunner, Ranjay Krishna, Jason Baldridge, Radu Soricut
EMNLP 2024
[pdf] [website] [code]

Manipulate-Anything: Automating Real-World Robots using Vision-Language Models
Jiafei Duan, Wentao Yuan, Wilbert Pumacay, Yi Ru Wang, Kiana Ehsani, Dieter Fox, Ranjay Krishna
CoRL 2024
[pdf] [website]

RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics
Wentao Yuan, Jiafei Duan, Valts Blukis, Wilbert Pumacay, Ranjay Krishna, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox
CoRL 2024
[pdf]

I Can Tell What I am Doing: Toward Real-World Natural Language Grounding of Robot Experiences
Zihan Wang, Brian Liang, Varad Dhat, Nick Walker, Zander Brumbaugh, Ranjay Krishna, Maya Cakmak
CoRL 2024
[pdf]

EVE: Enabling Anyone to Train Robots using Augmented Reality
Jun Wang, Chun-Cheng Chang, Jiafei Duan, Dieter Fox, Ranjay Krishna
UIST 2024
[pdf]

BLINK: Multimodal Large Language Models Can See but Not Perceive
Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna
ECCV 2024
[pdf] [website] [code] [dataset] [eval]

Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion
Xiang Fan, Anand Bhattad, Ranjay Krishna
ECCV 2024
[pdf] [website] [code]

m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna
ECCV 2024
[pdf] [huggingface] [code]

SPARO: Selective Attention for Robust and Compositional Transformer Encodings for Vision
Ankit Vani, Bac Nguyen, Samuel Lavoie, Ranjay Krishna, Aaron Courville
ECCV 2024
[pdf] [code]

The Hard Positive Truth about Vision-Language Compositionality
Amita Kamath, Cheng-Yu Hsieh, Kai-Wei Chang, Ranjay Krishna
ECCV 2024
[pdf]

Efficient Inference of Vision Instruction-Following Models with Elastic Cache
Zuyan Liu, Benlin Liu, Jiahui Wang, Yuhao Dong, Guangyi Chen, Jiwen Lu, Ranjay Krishna, Yongming Rao
ECCV 2024
[pdf] [code]

Found in the middle: Calibrating Positional Attention Bias Improves Long Context Utilization
Cheng-Yu Hsieh, Yung-Sung Chuang, Chun-Liang Li, Zifeng Wang, Long Le, Abhishek Kumar, James R. Glass, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna*, Tomas Pfister*
ACL Findings 2024
[pdf]

The Colosseum: A Benchmark for Evaluating Generalization for Robotic Manipulation
Wilbert Pumacay*, Ishika Singh*, Jiafei Duan*, Ranjay Krishna, Jesse Thomason, Dieter Fox
RSS 2024
[pdf] [project] [code] [website]

Training Language Model Agents without Modifying Language Models
Shaokun Zhang, Jieyu Zhang, Jiale Liu, Linxin Song, Chi Wang, Ranjay Krishna, Qingyun Wu
ICML 2024
[pdf] [code] [blog]

Iterated Learning Improves Compositionality in Large Vision-Language Models
Chenhao Zheng, Jieyu Zhang, Aniruddha Kembhavi, Ranjay Krishna
CVPR 2024
[pdf] [code] [website] [video]

Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use
Imad Eddine Toubal, Aditya Avinash, Neil Gordon Alldrin, Jan Dlabal, Wenlei Zhou, Enming Luo, Otilia Stretcu, Hao Xiong, Chun-Ta Lu, Howard Zhou, Ranjay Krishna, Ariel Fuxman, Tom Duerig
CVPR 2024
[pdf]

Holodeck: Language Guided Generation of 3D Embodied AI Environments
Yue Yang, Fan-Yun Sun, Luca Weihs, Eli VanderBilt, Alvaro Herrasti, Winson Han, Jiajun Wu, Nick Haber, Ranjay Krishna, Lingjie Liu, Chris Callison-Burch, Mark Yatskar, Aniruddha Kembhavi, Christopher Clark
CVPR 2024
[pdf] [code] [website]

Quilt-LLaVA: Visual Instruction Tuning by Extracting Localized Narratives from Open-Source Histopathology Videos
Mehmet Saygin Seyfioglu, Wisdom O. Ikezogwo, Fatemeh Ghezloo, Ranjay Krishna, Linda Shapiro
CVPR 2024
[pdf] [code] [website] [data]

Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World
Kiana Ehsani, Tanmay Gupta, Rose Hendrix, Jordi Salvador, Luca Weihs, Kuo-Hao Zeng, Kunal Pratap Singh, Yejin Kim, Winson Han, Alvaro Herrasti, Ranjay Krishna, Dustin Schwenk, Eli VanderBilt, Aniruddha Kembhavi
CVPR 2024
[pdf] [code and data] [website]

Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models
Yushi Hu, Otilia Stretcu, Chun-Ta Lu, Krishnamurthy Viswanathan, Kenji Hata, Enming Luo, Ranjay Krishna, Ariel Fuxman
CVPR 2024 [CVPR Oral awarded to top 0.7% of submissions]
[pdf] [website]

Selective Visual Representations Improve Convergence and Generalization for Embodied AI
Ainaz Eftekhar, Kuo-Hao Zeng, Jiafei Duan, Ali Farhadi, Ani Kembhavi, Ranjay Krishna
ICLR 2024 [ICLR Spotlight awarded to top 5% of submissions]
[pdf] [code] [website] [slides]

Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-Image Generation
Jaemin Cho, Yushi Hu, Roopal Garg, Peter Anderson, Ranjay Krishna, Jason Baldridge, Mohit Bansal, Jordi Pont-Tuset, Su Wang
ICLR 2024
[pdf] [website] [code]

VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building
Maureen Daum, Enhao Zhang, Dong He, Brandon Haynes, Ranjay Krishna, Magdalena Balazinska
VLDB 2024
[pdf] [code]

2023

OBJECT 3DIT: Language-guided 3D-aware Image Editing
Oscar Michel, Anand Bhattad, Eli VanderBilt, Ranjay Krishna, Aniruddha Kembhavi, Tanmay Gupta
NeurIPS 2023
[pdf] [website] [code] [dataset]

SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality
Cheng-Yu Hsieh, Jieyu Zhang, Zixian Ma, Aniruddha Kembhavi, Ranjay Krishna
NeurIPS 2023
[pdf] [code]

Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias
Yue Yu, Yuchen Zhuang, Jieyu Zhang, Yu Meng, Alexander Ratner, Ranjay Krishna, Jiaming Shen, Chao Zhang
NeurIPS 2023
[pdf] [code]

Quilt-1M: One Million Image-Text Pairs for Histopathology
Wisdom Oluchi Ikezogwo, Mehmet Saygin Seyfioglu, Fatemeh Ghezloo, Dylan Stefan Chan Geva, Fatwir Sheikh Mohammed, Pavan Kumar Anand, Ranjay Krishna, Linda Shapiro
NeurIPS 2023 [NeurIPS Oral awarded to 0.6% of submissions]
[pdf] [code]

Cola: How to adapt vision-language models to Compose Objects Localized with Attributes?
Arijit Ray, Filip Radenovic, Abhimanyu Dubey, Bryan A. Plummer, Ranjay Krishna, and Kate Saenko
NeurIPS 2023
[pdf] [project] [data]

DataComp: In search of the next generation of multimodal datasets
Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, Ludwig Schmidt
NeurIPS 2023 [NeurIPS Oral awarded to 0.6% of submissions]
[pdf] [website] [code]

AR2-D2: Training a Robot Without a Robot
Jiafei Duan, Yi Ru Wang, Mohit Shridhar, Dieter Fox, Ranjay Krishna
CoRL 2023
[pdf]

Agile Modeling: From Concept to Classifier in Minutes
Otilia Stretcu, Edward Vendrow, Kenji Hata, Krishnamurthy Viswanathan, Vittorio Ferrari, Sasan Tavakkol, Wenlei Zhou, Aditya Avinash, Enming Luo, Neil Gordon Alldrin, MohammadHossein Bateni, Gabriel Berger, Andrew Bunner, Chun-Ta Lu, Javier A Rey, Giulia DeSalvo, Ranjay Krishna, Ariel Fuxman
ICCV 2023
Also published at NeurIPS 2023 ReALML workshop [Best paper nominee]
[pdf]

TIFA: Text-to-Image Faithfulness Evaluation with Question Answering
Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, Noah Smith
ICCV 2023
[pdf] [website] [code]

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alex Jason Ratner, Ranjay Krishna, Chen-Yu Lee and Tomas Pfister
ACL 2023 Findings
[pdf] [code] [video] [blog]

EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions
Enhao Zhang, Maureen Daum, Dong He, Brandon Haynes, Ranjay Krishna, Magdalena Balazinska
VLDB 2023
[pdf] [code]

CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Zixian Ma*, Jerry Hong*, Mustafa Omer Gul*, Mona Gandhi, Irena Gao, Ranjay Krishna
CVPR 2023 [CVPR Highlight awarded to 2.5% of submissions]
[pdf] [code]

Explanations can Reduce Overreliance on AI Systems during Decision-Making
Helena Vasconcelos, Matthew Jorke, Madeleine Grunde-McLaughlin, Tobias Gerstenberg, Michael Bernstein, Ranjay Krishna
CSCW 2023 [Best paper honorable mention awarded to the top 23 papers]
[pdf]

2022

Alignment as a Multi-Agent Intrinsic Reward
Zixian Ma, Rose Wang, Li Fei-Fei, Michael Bernstein, Ranjay Krishna
NeurIPS 2022
[pdf] [code]

Socially situated artificial intelligence enables learning from human interaction
Ranjay Krishna, Donsuk Lee, Li Fei-Fei*, Michael Bernstein*
* = equal last authors
PNAS 2022
[main paper] [appendix] [science article] [techxplore article]

Searching for Computer Vision North Stars
Li Fei-Fei, Ranjay Krishna
Book: Daedalus Special issue on "AI & Society"
Daedalus Spring 2022
[book] [pdf] [website]

Measuring Compositional Consistency for Video Question Answering
Mona Gandhi*, Mustafa Omer Gul*, Eva Prakash, Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala
CVPR 2022
[pdf] [website] [dataset] [code]

VOCAL: Video Organization and Interactive AnaLytics
Maureen Daum*, Enhao Zhang*, Dong He, Magdalena Balazinska, Brandon Haynes, Ranjay Krishna, Apryle Craig, Aaron Wirsing
CIDR 2022
[pdf] [video]

EARLIER PUBLICATIONS

Visual Intelligence through Human Interaction
Ranjay Krishna, Mitchell Gordon, Li Fei-Fei, Michael Bernstein
Book: Artificial Intelligence for Human Computer Interaction: A Modern Approach
Springer 2021
[book] [chapter] [preprint]

On the Opportunities and Risks of Foundation Models
Center for Foundation Models @ Stanford
Report 2021
[pdf] [website] [workshop]

Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning through the Lens of Visual Question Answering
Siddharth Karamcheti, Ranjay Krishna, Li Fei-Fei, Christopher Manning
ACL 2021 [Outstanding paper awarded to top 6 papers]
[pdf] [code]

AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning
Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala
CVPR 2021
[pdf] [website] [dataset] [blog] [video]

AGQA 2.0: An Updated Benchmark for Compositional Spatio-Temporal Reasoning [pdf]

Conceptual Metaphors Impact Perceptions of Human-AI Collaboration
Pranav Khadpe, Ranjay Krishna, Li Fei-Fei, Jeffrey Hancock, Michael Bernstein
CSCW 2020 [Best paper honorable mention award]
[pdf] [blog] [press] [video]

Action Genome: Actions as Composition of Spatio-temporal Scene Graphs
Jingwei Ji, Ranjay Krishna, Li Fei-Fei, Juan Carlos Niebles
CVPR 2020
[pdf] [website]

AI-based Request Augmentation to Increase Crowdsourcing Participation
Junwon Park, Ranjay Krishna, Pranav Khadpe, Li Fei-Fei, Michael Bernstein
HCOMP 2019
[pdf]

Visual Relationships as Functions: Enabling Few-Shot Scene Graph Prediction
Apoorva Dornadula, Austin Narcomey, Ranjay Krishna, Michael Bernstein, Li Fei-Fei
ICCV 2019 - Scene Graph Representation and Learning workshop
[website] [pdf]

Scene Graph Prediction with Limited Labels
Vincent Chen, Paroma Varma, Ranjay Krishna, Michael Bernstein, Christopher Re, Li Fei-Fei
ICCV 2019
[website] [pdf] [code]

HYPE: Human eYe Perceptual Evaluation of Generative Models
Sharon Zhou*, Mitchell Gordon*, Ranjay Krishna, Austin Narcomey, Li Fei-Fei, Michael Bernstein
NeurIPS 2019 [Oral awarded to top 0.53% of submissions]
[website] [pdf]

Information Maximizing Visual Question Generation
Ranjay Krishna, Michael Bernstein, Li Fei-Fei
CVPR 2019
[website] [pdf] [code]

Referring Relationships
Ranjay Krishna*, Ines Chami*, Michael Bernstein, Li Fei-Fei
* = indicates equal contribution
CVPR 2018
[website] [pdf] [code]

Dense-Captioning Events in Videos
Ranjay Krishna, Kenji Hata, Frederic Ren, Li Fei-Fei, Juan Carlos Niebles
ICCV 2017
[website] [pdf] [dataset] [eval code] [challenge] [poster]

Crowd Research: Open and Scalable University Laboratories
Rajan Vaish, Snehalkumar Gaikwad, Geza Kovacs, Andreas Veit, Ranjay Krishna, Imanol Arrieta Ibarra, Camelia Simoiu, Michael Wilber, Serge Belongie, Sharad C. Goel, James Davis, Michael Bernstein
UIST 2017 [Awarded best paper honorable mention]
[website] [pdf]

A Hierarchical Approach for Generating Descriptive Image Paragraphs
Jonathan Krause, Justin Johnson, Ranjay Krishna, Li Fei-Fei
CVPR 2017 [Spotlight award to top 6% of papers]
[website] [pdf] [dataset]

A Glimpse Far into the Future: Understanding Long-term Crowd Worker Accuracy
Kenji Hata, Ranjay Krishna, Li Fei-Fei, Michael Bernstein
CSCW 2017
[website] [pdf]

Visual Genome: Crowdsourced Visual Knowledge Representations
Ranjay Krishna
Masters Thesis - Stanford University 2016
[pdf] [Christofer Stephenson Memorial award for best Stanford CS Thesis]

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David Ayman Shamma, Michael Bernstein, Li Fei-Fei
IJCV 2016
[website] [article] [pdf] [download] [api] [twitter] [press]

Visual Relationship Detection with Language Priors
Cewu Lu*, Ranjay Krishna*, Michael Bernstein, Li Fei-Fei
* = indicates equal contribution
ECCV 2016 [Oral awarded to top 1% of papers]
[pdf] [dataset] [images (2GB)] [code] [project] [slides] [poster] [video]

Embracing Error to Enable Rapid Crowdsourcing
Ranjay Krishna, Kenji Hata, Stephanie Chen, Joshua Kravitz, David Ayman Shamma, Li Fei-Fei, Michael Bernstein
CHI 2016
[pdf] [talk] [slides] [demo] [code]

DAEMO: A Self-Governed Crowdsourcing Marketplace
S. Gaikwad, D. Morina, R. Nistala, M. Agarwal, A. Cossette, R. Bhanu, S. Savage, V. Narwal, K. Rajpal, J. Regino, A. Mithal, A. Ginzberg, A. Nath, K. R. Ziulkoski, T. Cossette, D. Gamage, A. Richmond-Fuller, R. Suzuki, J. Herrejon, K. V. Le, C. Flores-Saviaga, H. Thilakarathne, K. Gupta, W. Dai, A. Sastry, S. Goyal, T. Rajapakshe, N. Abolhassani, A. Xie, A. Reyes, S. Ingle, V. Jaramillo, M.D. Godinez, W. Angel, M. Godinez, C. Toxtli, J. Flores, A. Gupta, V. Sethia, D. Padilla, K. Milland, K. Setyadi, N. Wajirasena, M. Batagoda, R. Cruz, J. Damon, D. Nekkanti, T. Sarma, M.H. Saleh, G. Gongora-Svartzman, S. Bateni, G. Toledo-Barrera, A. Pena, R. Compton, D. Aariff, L. Palacios, M. P. Ritter, Nisha K.K., A. Kay, J. Uhrmeister, S. Nistala, M. Esfahani, E. Bakiu, C. Diemert, L. Matsumoto, M. Singh, V. Jaramillo-Lopez, K. Patel, R. Krishna, G. Kovacs, R. Vaish, M. Bernstein
UIST 2015
[pdf]

Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval
Sebastian Schuster, Ranjay Krishna, Angel Chang, Li Fei-Fei and Christopher D. Manning
EMNLP 2015 - Vision and Language Workshop
[oral] [pdf]

Image Retrieval using Scene Graphs
Justin Johnson, Ranjay Krishna, Michael Stark, Li-Jia Li, David Ayman Shamma, Michael Bernstein, Li Fei-Fei
CVPR 2015
[pdf] [bib] [dataset (2GB)]

NON-ARCHIVAL PAPERS

Lasagna: Layered Score Distillation for Disentangled Object Relighting
Dina Bashkirova, Arijit Ray, Rupayan Mallick, Sarah Adel Bargal, Jianming Zhang, Ranjay Krishna, Kate Saenko
ArXiv 2023
[pdf] [code]

EcoAssistant: Using LLM Assistant More Affordably and Accurately
Jieyu Zhang, Ranjay Krishna, Ahmed H. Awadallah, Chi Wang
ArXiv 2023
[pdf] [code] [blog]

Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
Cheng-Yu Hsieh, Si-An Chen, Chun-Liang Li, Yasuhisa Fujii, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister
ArXiv 2023
[pdf]

MIMIC: Masked Image Modeling with Image Correspondences
Kalyani Marathe*, Mahtab Bigverdi*, Nishat Khan, Tuhin Kundu, Aniruddha Kembhavi, Linda G. Shapiro, Ranjay Krishna
CVPR 2024 Workshop for Learning 3D with Multi-View Supervision
[pdf] [code]

Determining Question-Answer Plausibility in Crowdsourced Datasets Using Multi-Task Learning
Rachel Gardner, Maya Varma, Clare Zhu, Ranjay Krishna
EMNLP 2020 - Workshop on Noisy User-Generated Text [Oral awarded to top 10% of submissions]
[pdf] [code]

Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui, Ranjay Krishna, Michael Bernstein, Li Fei-Fei
ArXiv 2019
[pdf]

The ActivityNet Large-Scale Activity Recognition Challenge 2018 Summary
Bernard Ghanem, Juan Carlos Niebles, Cees Snoek, Fabian Caba Heilbron, Humam Alwassel, Victor Escorcia, Ranjay Krishna, Shyamal Buch, Cuong Duc Dao
CVPR 2018 - The ActivityNet Large-scale Activity Recognition Challenge Workshop
[website] [pdf] [leaderboard]

Engagement Learning: Expanding Visual Knowledge by Engaging Online Participants
Ranjay Krishna, Donsuk Lee, Fei-Fei Li, Michael Bernstein
UIST 2018 [Poster]
[pdf]

ActivityNet Challenge 2017 Summary
Bernard Ghanem, Juan Carlos Niebles, Cees Snoek, Fabian Caba Heilbron, Humam Alwassel, Victor Escorcia, Ranjay Krishna, Shyamal Buch, Cuong Duc Dao
CVPR 2017 - The ActivityNet Large-scale Activity Recognition Challenge Workshop
[website] [pdf] [leaderboard]

Assistant Professor
Computer Science & Engineering
University of Washington

Ph.D. @ Stanford University, 2021
Co-advised by Fei-Fei Li
and Michael Bernstein.

Curriculum Vitae [2024]
Google scholar

Research statement [2021]
Teaching statement [2021]
Diversity statement [2021]

CONTACT

ranjay [at] cs [dot] washington [dot] edu

Bill & Melinda Gates Center
Room 304
3800 E Stevens Way NE,
Seattle, WA 98195

Follow @RanjayKrishna

TEACHING

University of Washington:
CSE 599H: AI vs IA [2023]
CSE 493G1: Deep Learning [2025] [2024] [2023]
CSE 455: Computer Vision [2025] [2024]

Stanford University:
CS231N: Convolutional Neural Networks for Visual Recognition [2021] [2020]
CS131 Computer Vision: Foundations and Applications [2019] [2018] [2017]
[crowdsourced class notes]

RESEARCH GROUP

Prospective students read this.

PhD students

Cheng-Yu Hsieh
(2020-)

Jieyu Zhang
(2020-)

Jae Sung Park with Yejin Choi
(2020-)

Benlin Liu
(2021-)

George Stoica with Judy Hoffman
(2021-)

Jiafei Duan with Dieter Fox
(2022-)

Ainaz Eftekhar with Ali Farhadi
(2022-)

Amita Kamath with Kai-Wei Chang
(2022-)

Mahtab Bigverdi with Linda Shapiro

Zixian (Sunnie) Ma
(2023-)

Scott K. Geng with Pang Wei Koh
(2023-)

Xiang Fan
(2023-)

Linjie Li with Yejin Choi
(2023-)

Chenhao Zheng
(2024-)

Masters and undergraduate students

Ayana Bharadwaj

Peter Sushko

Long-term collaborating PhD students

Madeleine Grunde-McLaughlin with Dan Weld and Jeff Heer

Yushi Hu with Noah Smith

Arijit Ray with Kate Saenko

Enhao Zhang with Magdalena Balazinska

Former PostDocs
- Wei-Chiu Ma (Faculty @ Cornell)

Former masters students
- Sho Arora
- Ines Chami
- Apoorva Dornadula
- Oliver Groth
- Mayank Kumar
- Mona Gandhi
- Donsuk Lee

Former undergraduate students
- Andre Ye
- Helena Vasconcelos
- Vincent Chen
- Shubhang Desai
- Omer Gul
- Jerry Hong
- Khaled Jedoui
- Pranav Khadpe
- Michelle Lam
- Austin Narcomey
- Junwon Park
- Jihyeon Janel Lee
- Joshua Kravitz
- Stephanie Chen
- Kenji Hata

Former PhD mentees
- Ankit Vani
- Kalyani Marathe
- Sebastin Santy
- Dong He
- Jingwei Ji
- Siddharth Karamcheti

SELECTED TALKS

Venue: CVPR 2024 - Computer Vision and Pattern Recognition
Panel: CVPR: past, present, and future

Venue: CVPR 2020 - Computer Vision and Pattern Recognition
Title: Compositionality in Computer Vision
[slides][video][workshop]

Venue: CVPR 2020 - Computer Vision and Pattern Recognition
Title: Dense Captioning Events in Videos
[slides][video][workshop]

Venue: ECCV 2016 - European Conference on Computer Vision
Title: Visual Relationship Detection with Language Priors
[pdf][project][slides][poster][video]

Venue: CHI 2016 - Conference on Human Factors in Computing Systems
Title: Embracing Error to Enable Rapid Crowdsourcing
[pdf][slides]

MISCELLANEOUS

Trailer for a documentary
Venue: PBS NOVA
Title: Can we build a brain?
Year: 2018

Complete documentary
Venue: PBS NOVA
Title: Can we build a brain?
Year: 2018