The intersection of 3D Vision-and-Language Models (3D VLMs) and robotics presents a new frontier, blending spatial understanding with contextual reasoning. The Robo-3DVLM workshop seeks to explore the opportunities and challenges posed by integrating these technologies to enhance robot perception, decision-making, and interaction with the real world. As robots evolve to operate in increasingly complex environments, bridging the gap between 3D spatial reasoning and language understanding becomes critical. Key questions at the heart of this workshop include:
We are excited to announce the Call for Papers for the Robo-3DVLM workshop. We invite original contributions presenting novel ideas, research, and applications relevant to the workshop’s theme.
Event | Date |
---|---|
Call for Papers | January 30th, 2025 |
Submission Deadline | May 16th, 2025, 23:59 PST |
Notification | May 20th, 2025 |
Camera-Ready | May 25th, 2025 |
Accepted papers will be presented in the form of posters at the workshop. In addition, selected papers may be invited to deliver spotlight talks.
A non-exhaustive list of relevant topics:
Start Time (CDT) | End Time (CDT) | Event |
---|---|---|
9:00 AM | 9:10 AM | Opening remarks |
9:10 AM | 9:45 AM | Hao Su: Exploring World Model for Robotic Manipulation |
9:45 AM | 10:20 AM | Chelsea Finn: Pretraining and Posttraining Robotic Foundation Models |
10:20 AM | 10:55 AM | Ranjay Krishna: Preparing perception for robotics |
10:55 AM | 11:10 AM | Coffee Break |
11:10 AM | 11:45 AM | Yunzhu Li: Foundation Models for Structured Scene Modeling in Robotic Manipulation |
11:45 AM | 12:20 PM | Katerina Fragkiadaki: 3D Generative Manipulation Policies: Bridging 2D Pre-training with 3D Scene Reasoning |
12:20 PM | 1:30 PM | Lunch |
1:30 PM | 2:00 PM | Poster Session (ExHall D, #357-#371) |
2:00 PM | 2:35 PM | Angel Chang: Building vision-language maps for embodied AI |
2:35 PM | 3:10 PM | Dieter Fox: Hierarchical Action Models for Open-World 3D Policies |
3:10 PM | 3:25 PM | Coffee Break |
3:25 PM | 4:00 PM | Chuang Gan: Genesis: A Unified and Generative Physics Simulation for Robotics |
4:00 PM | 4:45 PM | Spotlight Paper Talks (5 min talk / 2 min Q&A) • The One RING: A Robotic Indoor Navigation Generalist • Manual2Skill: Learning to Read Manuals and Acquire Robotic Skills for Furniture Assembly Using Vision-Language Models • Agentic Language-Grounded Adaptive Robotic Assembly • ZeroMimic: Distilling Robotic Manipulation Skills from Web Videos |
4:45 PM | 5:00 PM | Ending Remarks and Paper Awards |
listed alphabetically