Embodied Intelligence for Autonomous Systems on the Horizon
Event: CVPR 2025 Workshop on Embodied Intelligence for Autonomous Systems
Abstract
This workshop session explores two key areas in embodied intelligence for autonomous systems: achieving efficiency in self-driving with large models and learning robust reward functions. The first part introduces ETA, a dual-model approach that leverages ‘thinking ahead’ with a large model and reactive decisions with a small model, demonstrating near real-time performance. The second part presents RELACS, a novel reward learning framework that uses counterfactual data to train a dedicated reward model, enabling better generalization and addressing the complexities of reward engineering in autonomous driving.
Speakers
- Fatma Güney — Koç University, Istanbul, Turkey
Talks (2)
- 00:26 — Fatma Güney: ETA: Efficiency through Thinking Ahead - A Dual Approach to Self-Driving with Large Models
  - This talk introduces ETA, a dual-model approach that combines a slow, high-performing large model with a fast, small model and a forecasting module to achieve real-time, efficient self-driving.
- 01:03:00 — Fatma Güney: RELACS: Reward Learning for Autonomous Driving using Counterfactuals
  - This talk presents RELACS, a dedicated reward model that learns driving-related rewards directly from video observations and counterfactual data, addressing the challenges of hand-designed rewards and improving generalization to real-world scenarios.
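The dual-model idea described for ETA above can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the method presented in the talk: `large_model`, `small_model`, `forecast`, `fuse`, and `LARGE_MODEL_DELAY` are all hypothetical stand-ins, and the "features" are plain lists of floats. It only shows the control flow: the slow model works on an older frame, a forecaster carries its output forward to the current time step, and the fast model supplies features from the current frame.

```python
from typing import List, Optional

Vector = List[float]

# Assumption: the large model's output is available two frames late.
LARGE_MODEL_DELAY = 2


def large_model(frame: Vector) -> Vector:
    # Hypothetical stand-in for a slow, high-capacity model.
    return [2.0 * x for x in frame]


def small_model(frame: Vector) -> Vector:
    # Hypothetical stand-in for a fast model that runs on every frame.
    return [0.5 * x for x in frame]


def forecast(features: Vector, older: Optional[Vector], steps: int) -> Vector:
    # Toy forecaster: linear extrapolation of the features over `steps` frames.
    # `older` holds the large-model features from one frame earlier.
    if older is None:
        return list(features)
    return [f + steps * (f - o) for f, o in zip(features, older)]


def fuse(ahead: Vector, reactive: Vector) -> float:
    # Toy planning head: average all features into one control value.
    values = ahead + reactive
    return sum(values) / len(values)


def drive(frames: List[Vector]) -> List[float]:
    controls: List[float] = []
    large_history: List[Vector] = []  # large-model features of past frames
    for t, frame in enumerate(frames):
        reactive = small_model(frame)  # always uses the current frame
        past_t = t - LARGE_MODEL_DELAY
        if past_t >= 0:
            # The large model only ever sees a frame from the past.
            large_history.append(large_model(frames[past_t]))
            older = large_history[-2] if len(large_history) > 1 else None
            ahead = forecast(large_history[-1], older, LARGE_MODEL_DELAY)
        else:
            ahead = reactive  # warm-up: no large-model output yet
        controls.append(fuse(ahead, reactive))
    return controls
```

In this toy setting, with linearly changing input, the forecast of the delayed large-model features equals what the large model would have produced on the current frame, which is the property the forecasting step is meant to approximate.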
Key Takeaways
- Large models can be efficiently integrated into self-driving systems by employing a dual-model architecture that separates proactive ‘thinking ahead’ from reactive decision-making.
- Forecasting future states and maintaining real-time reactivity through a smaller model are critical for achieving both high performance and low latency in autonomous driving.
- Learning reward functions directly from observations, especially using counterfactual data, can overcome the challenges of manual reward engineering and improve generalization to diverse real-world scenarios.
- Dedicated reward models, independent of future prediction, can accurately assess driving quality, even in uncertain situations, by learning to distinguish between expert, near-crash, and crash behaviors.
- The choice of tokenizer and the diversity of its training data significantly affect how well reward models generalize to out-of-distribution real-world driving videos.
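The takeaway about distinguishing expert, near-crash, and crash behaviors can be made concrete with a pairwise ranking objective. The sketch below is one common way to learn a reward from ordered examples and is not necessarily the loss used in RELACS. Everything in it is an assumption for illustration: the reward is a linear function, and the two-dimensional feature vectors are invented placeholders for what a real system would extract from tokenized video.

```python
import math
from typing import List, Tuple

Vector = List[float]


def reward(w: Vector, x: Vector) -> float:
    # Hypothetical linear reward model: dot product of weights and features.
    return sum(wi * xi for wi, xi in zip(w, x))


def train(pairs: List[Tuple[Vector, Vector]], dim: int,
          lr: float = 0.1, epochs: int = 200) -> Vector:
    # Each pair is (better, worse). Minimizes -log(sigmoid(r_better - r_worse))
    # with plain stochastic gradient descent.
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            margin = reward(w, better) - reward(w, worse)
            # Negative gradient of the loss w.r.t. w is
            # (1 - sigmoid(margin)) * (better - worse).
            scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                w[i] += lr * scale * (better[i] - worse[i])
    return w


# Invented features: [clearance to nearest obstacle, motion smoothness].
expert = [1.0, 0.9]
near_crash = [0.2, 0.5]   # counterfactual: almost collides
crash = [0.0, 0.1]        # counterfactual: collides

pairs = [(expert, near_crash), (near_crash, crash), (expert, crash)]
weights = train(pairs, dim=2)
scores = {
    "expert": reward(weights, expert),
    "near_crash": reward(weights, near_crash),
    "crash": reward(weights, crash),
}
```

After training, the learned scores should order the three behaviors as expert above near-crash above crash. Counterfactual examples matter here because expert-only data would give the model nothing to rank against.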
Methods / Models / Datasets Mentioned
DriveMLM · LLM4AD · CarLLaVA · AD-MLP · UniAD-Base · VAD · TCP-traj · ThinkTwice · DriveTransformer · DriveAdapter · Daniel Kahneman's 'Thinking, Fast and Slow' · RoboDual · Hi Robot · LeapAD · Gigaflow · NAvSim · Implicit Affordances · ROACH · VIPER · Vista · Diffusion Models · Cosmos-Tokenizer · DINOv2
Topics
Large Language Models (LLMs) · Multimodal Large Language Models (MLLMs) · Self-driving · Real-time performance · Efficiency · Dual-model approach · Forecasting · Reward learning · Counterfactuals · Generalization · Out-of-distribution · Latency · Ablation studies