About
Most existing social robot navigation techniques either leverage hand-crafted rules or human demonstrations to connect robot perception to socially compliant actions. However, there remains a significant gap in effectively translating perception into socially compliant actions, much like how human reasoning naturally occurs in dynamic environments. Considering the recent success of Vision-Language Models (VLMs), we propose using language to bridge the gap in human-like reasoning between perception and socially aware robot actions. We create a vision-language dataset, Social robot Navigation via Explainable Interactions (SNEI), featuring 40K human-annotated Visual Question Answers (VQAs) based on 2K human-robot social interactions in unstructured, crowded public spaces, spanning perception, prediction, chain-of-thought reasoning, action, and explanation. We fine-tune a VLM, Social-LLaVA, using SNEI to demonstrate the practical application of our dataset. Social-LLaVA outperforms state-of-the-art models like GPT-4V and Gemini, based on the average of fifteen different human-judge scores across 50 VQAs. Deployed onboard a mobile robot, Social-LLaVA enables human-like reasoning, marking a promising step toward socially compliant robot navigation in dynamic public spaces through language reasoning.
Data Collection
We use the SCAND dataset, which is collected from various human-crowded public environments and features intricate human-robot interaction scenarios. We manually choose and label 2K scenarios where the robot interacts with people.
One row of SNEI dataset
| Category | Detail |
|---|
Social-LLaVA
Learning Social Robot Navigation
The primary purpose of SNEI is to provide a large corpus of training data for Vision Language Models. Using our SNEI dataset, we develop a VLM, Social-LLaVA, which learns to perform human-like reasoning when facing social navigation interactions.
Download
Contact
For questions, please contact:
Amirreza Payandeh
apayande@gmu.edu
Dr. Xuesu Xiao
xiao@gmu.edu
Department of Computer Science
George Mason University
3401 Fairfax Dr, Arlington, VA 22201 USA