Novel Hardware for Spatial AI
Event: CVPR 2019 Workshop · Duration: 28 min
Abstract
This talk explores the evolution of SLAM (Simultaneous Localization and Mapping) into Spatial AI, emphasizing the need for robust real-time systems that can understand and interact with their environment. The speaker highlights the current maturity of sparse/semi-dense reconstruction and the rapid advancements in dense and semantic mapping. Achieving breakthrough Spatial AI products, such as general-purpose home robots or lightweight augmented reality glasses, necessitates significant innovations in novel hardware and efficient scene representations. The presentation delves into the potential of event cameras and graph-based processing architectures as key enablers for these future intelligent systems.
Speakers
- Andrew Davison — Dyson Robotics Laboratory, Department of Computing, Imperial College London
Talks (1)
- 00:00:00 — Andrew Davison: Novel Hardware for Spatial AI
- A discussion on the evolution of SLAM into Spatial AI, highlighting the need for novel hardware and representations to enable future AI products like home robots and AR glasses, focusing on event cameras and graph-based processing.
Key Takeaways
- Spatial AI represents the next frontier beyond traditional SLAM, aiming for embodied devices that can intelligently interact with their environment by building persistent and understandable 3D scene representations.
- Future Spatial AI products such as mass-market home robots or augmented reality glasses require closing a significant gap between current capabilities and desired performance, particularly in precision, latency, dense and semantic mapping, and long-term scene understanding on low-cost hardware.
- Novel hardware, such as event cameras, offers advantages in efficiency and latency for certain tasks, providing rich data streams that can be processed event-by-event to reconstruct scene intensity and 3D structure.
- Efficient data representations, like learned low-dimensional codes for depth and semantics (e.g., SceneCode), are crucial for managing the vast amounts of information in complex scenes and enabling coherent multi-view fusion.
- Graph-based processing architectures (e.g., Graphcore’s IPU) are emerging as a promising paradigm for Spatial AI, allowing algorithms to bring processing and local memory closer together, reducing data movement, and enabling efficient distributed computation for complex tasks like belief propagation and graph neural networks.
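As a rough illustration of the last takeaway, here is a minimal sketch of Gaussian belief propagation on a small chain graph. It is not the speaker's implementation and it runs on an ordinary CPU rather than an IPU. The function name `gbp_chain`, the parameters `meas_precision` and `smooth_precision`, and the 1D smoothing problem itself are assumptions chosen for illustration. The point it tries to show is that each node updates its outgoing messages using only its own state and the messages from its direct neighbours, which is the kind of local computation that suits processors with memory placed next to compute.

```python
def gbp_chain(measurements, meas_precision=1.0, smooth_precision=4.0):
    """Estimate a smoothed 1D signal with scalar Gaussian belief propagation.

    Hypothetical model: each node i has a noisy measurement z_i and is tied
    to its chain neighbours by a smoothness factor. In information form the
    joint is exp(-0.5 * x^T A x + b^T x).
    """
    n = len(measurements)

    def neighbours(i):
        return [j for j in (i - 1, i + 1) if 0 <= j < n]

    # Local information held at each node: diagonal of A and entry of b.
    diag = [meas_precision + smooth_precision * len(neighbours(i)) for i in range(n)]
    b = [meas_precision * z for z in measurements]
    off = -smooth_precision  # off-diagonal coupling A_ij between neighbours

    # Messages are (precision, information) pairs keyed by (source, target).
    msgs = {(i, j): (0.0, 0.0) for i in range(n) for j in neighbours(i)}

    # On a chain (a tree), n synchronous sweeps are enough to reach the
    # exact marginal means.
    for _ in range(n):
        new_msgs = {}
        for (i, j) in msgs:
            # Combine local information with messages from every neighbour
            # except the one we are about to send to.
            p = diag[i] + sum(msgs[(k, i)][0] for k in neighbours(i) if k != j)
            h = b[i] + sum(msgs[(k, i)][1] for k in neighbours(i) if k != j)
            # Marginalise out x_i to get the message about x_j.
            new_msgs[(i, j)] = (-off * off / p, -off * h / p)
        msgs = new_msgs

    # Belief at each node: local information plus all incoming messages.
    means = []
    for i in range(n):
        p = diag[i] + sum(msgs[(k, i)][0] for k in neighbours(i))
        h = b[i] + sum(msgs[(k, i)][1] for k in neighbours(i))
        means.append(h / p)
    return means


if __name__ == "__main__":
    # A single spike in the measurements gets spread across its neighbours.
    print(gbp_chain([0.0, 0.0, 10.0, 0.0, 0.0]))
```

Because no step needs a global matrix or a central solver, the per-node loops could in principle be distributed across many cores, each holding only its node's state and its incoming messages.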
Methods / Models / Datasets Mentioned
Dyson/iRobot · ARKit/ARCore · Oculus/HoloLens · DJI/Skydio · SemanticFusion · ElasticFusion · CNN · DVS128 · Simultaneous Mosaicing and Tracking with an Event Camera · EKF · 3D Motion, Structure and Intensity from Event Data · SceneCode · Fusion++: Volumetric Object-Level SLAM · Mask-RCNN · KinectFusion · SLAMBench · PAMELA Project · SpiNNaker · Graphcore · IPU · IBM TrueNorth · Brainchip · Gaussian Belief Propagation · AlexNet
Topics
Spatial AI · SLAM · Novel Hardware · Event Cameras · Graph Processing · Semantic Mapping · Augmented Reality · Robotics · Real-time Systems · Scene Representation
Notes
Open for commentary — connections to other work, critiques, follow-up reading.