X-WORLD: Accessibility, Vision, and Autonomy Meet

Event: CVPR 2025 · Duration: 5 min

Abstract

X-World is an open-source, photo-realistic, and behaviorally realistic simulation module integrated into CARLA, designed to address the critical lack of diverse pedestrian data in computer vision and robotics. It enables the generation of multimodal data for training perception models to interact seamlessly with all people, including those with disabilities. The platform features an extensive set of 28 wheelchair and cane models with realistic kinematics and dynamics, and supports various environmental configurations. This work introduces a fine-grained instance segmentation task and a diverse real-world benchmark to evaluate model generalization, highlighting the significant challenges in accurately detecting and segmenting mobility aid users.

Speakers

  • Jimuyang Zhang — Boston University
  • Minglan Zheng — Boston University
  • Matthew Boyd — Boston University
  • Eshed Ohn-Bar — Boston University

Talks (1)

  • 00:00:00 — Minglan Zheng: X-WORLD: Accessibility, Vision, and Autonomy Meet
    • Introduces X-World, an interactive, accessibility-aware simulation environment for training perception models that stay robust across diverse pedestrians, including mobility aid users.

Key Takeaways

  • Existing large-scale datasets in computer vision and robotics lack diverse pedestrian data, especially for people with disabilities, leading to accessibility gaps in autonomous systems.
  • X-World is an open-source, interactive simulation environment that generates diverse, multimodal data for training perception models to be accessibility-aware.
  • The Person-Aid Simulation Module within X-World provides 28 wheelchair and cane models with realistic kinematics and dynamics, which can be attached to any pedestrian.
  • Fine-grained instance segmentation of mobility aid users is challenging for current models, but training with X-World data significantly improves performance.
  • Combining simulated and real-world datasets is crucial for developing robust perception models for diverse mobility aid classes and ensuring generalization.
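
The last takeaway, on combining simulated and real-world data, can be illustrated with a minimal sketch. This is not the authors' training pipeline: the function name `mix_sim_and_real`, the sample identifiers, and the 75/25 split are all hypothetical, chosen only to show one simple way to build a mixed training list at a fixed simulated-to-real ratio. Sampling is done with replacement, on the assumption that the real-world set of mobility aid users is small and may need oversampling.

```python
import random
from typing import List, Sequence, Tuple

# (source, sample_id); both fields are illustrative placeholders.
Sample = Tuple[str, str]


def mix_sim_and_real(
    sim_ids: Sequence[str],
    real_ids: Sequence[str],
    sim_fraction: float,
    total: int,
    seed: int = 0,
) -> List[Sample]:
    """Build a shuffled training list with a fixed share of simulated samples.

    Hypothetical helper: draws with replacement so that a small real-world
    set can be oversampled to reach its share of the mix.
    """
    if not 0.0 <= sim_fraction <= 1.0:
        raise ValueError("sim_fraction must be in [0, 1]")
    if total < 0:
        raise ValueError("total must be non-negative")

    n_sim = round(total * sim_fraction)
    n_real = total - n_sim
    if n_sim > 0 and not sim_ids:
        raise ValueError("simulated samples requested but none provided")
    if n_real > 0 and not real_ids:
        raise ValueError("real samples requested but none provided")

    rng = random.Random(seed)  # seeded for reproducible mixes
    sim_part = rng.choices(list(sim_ids), k=n_sim) if n_sim else []
    real_part = rng.choices(list(real_ids), k=n_real) if n_real else []

    mixed = [("sim", s) for s in sim_part] + [("real", r) for r in real_part]
    rng.shuffle(mixed)  # interleave the two sources
    return mixed


if __name__ == "__main__":
    # Assumed identifiers; a real setup would list image/annotation paths.
    batch = mix_sim_and_real(["s1", "s2", "s3"], ["r1"], sim_fraction=0.75, total=8)
    for source, sample_id in batch:
        print(source, sample_id)
```

The right ratio is an empirical question that the summary does not settle; the sketch only makes the ratio an explicit, reproducible parameter so that different mixes can be compared.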

Methods / Models / Datasets Mentioned

  • CARLA
  • KITTI
  • COCO
  • Cityscapes
  • Mask R-CNN

Topics

Accessibility · Autonomous Systems · Computer Vision · Robotics · Simulation Environment · Instance Segmentation · Pedestrian Detection · Mobility Aids · Dataset Generation · Domain Adaptation

