Second Egocentric Vision (EgoVis) Workshop

Event: CVPR 2025 (Highlight) · Duration: 25 min

Speakers

  • Ashutosh
  • Zhenyi Liu
  • Brent Yi — UC Berkeley
  • Rosario Forte — University of Catania

Talks (4)

  • 00:04 Ashutosh: FIction: 4D Future Interaction Prediction from Video
    • This work introduces a novel multimodal architecture for predicting future interaction locations and body poses in 4D space (3D + temporal dimension) over significantly longer durations than prior work, outperforming baselines on a curated Ego-Exo4D dataset.
  • 06:15 Zhenyi Liu: EgoPressure: A Dataset for Hand Pressure and Pose Estimation in Egocentric Vision
    • This project introduces EgoPressure, the first high-quality egocentric touch contact and pressure dataset with 3D hand poses, along with an optimization method for hand pose annotation and a novel model called PressureFormer for 3D hand pressure estimation using transformers.
  • 14:06 Brent Yi: Estimating Body and Hand Motion in an Ego-sensed World
    • This work presents EgoAllo, a system that uses egocentric sensor data (device SLAM pose and visual hand observations) to estimate complete allocentric human body pose, height, and hand parameters by combining a head-conditioned diffusion model with guidance from an off-the-shelf hand estimator.
  • 19:29 Rosario Forte: Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory
    • This work introduces an online episodic memory system (ESOM) for visual query localization in egocentric video that operates without storing the entire video stream, is memory-efficient and scalable, and outperforms baselines on the Ego4D dataset.

Notes

Open for commentary — connections to other work, critiques, follow-up reading.