Human Motion Generation (HuMoGen) Workshop
Event: CVPR Workshop 2024 · Duration: 207 min
Abstract
This workshop explores the latest advancements and open challenges in Human Motion Generation (HuMoGen). Speakers from academia and industry present diverse approaches to synthesizing realistic, controllable, and expressive human motion for applications ranging from video games and robotics to healthcare and virtual reality. Key themes include overcoming computational and memory constraints, achieving high-quality and diverse motion, developing robust data collection methods for embodied intelligence, and leveraging novel techniques like diffusion models and egocentric synthetic data generation to push the boundaries of human motion synthesis. The discussions highlight the importance of understanding indirect control, addressing the idiosyncratic nature of gestures, and developing unified frameworks for motion optimization and learning.
Speakers
- Guy Tevet — Tel-Aviv University
- Daniel Holden — Epic Games
- Michael Neff — University of California, Davis
- C. Karen Liu — Stanford University
- Siyu Tang — ETH Zurich
Talks (5)
- 00:00:00 — Guy Tevet: Welcome to the first workshop on Human Motion Generation (HuMoGen)
- Introduction to the Human Motion Generation (HuMoGen) workshop, outlining the organizing team, scope, and agenda for the day, including speakers and accepted papers.
- 00:05:18 — Daniel Holden: Human Motion Generation for Video Games
- Discusses the constraints and motivations for human motion generation in video games, focusing on real-time, online neural network inference for character animation, and highlighting challenges in computational budgets, quality, and responsiveness.
- 00:15:15 — Michael Neff: The Challenge of Gesture Synthesis
- Explores the complexities of gesture synthesis, emphasizing the idiosyncratic and multimodal nature of human gestures and the many-to-many mapping they involve, and presents a data-driven approach that uses an adversarial loss based on gesture phases.
- 00:23:45 — C. Karen Liu: New Challenges in 3D Human Motion Synthesis
- Discusses new challenges in 3D human motion synthesis, focusing on collecting embodied data, predicting external forces with diffusion models, and generating egocentric synthetic data for tasks like human mesh recovery and AR mapping.
- 00:38:30 — Siyu Tang: Controllable Human Motion Synthesis
- Presents work on controllable human motion synthesis, introducing DART for real-time text-driven control and EgoGen for egocentric synthetic data generation, emphasizing generative, efficient, diverse, and controllable models for realistic virtual humans in simulators.
Key Takeaways
- Human motion generation for video games faces significant challenges including strict computational/memory budgets, high quality demands, and the need for highly responsive and controllable character animation.
- Gesture synthesis is a complex problem due to the idiosyncratic nature of human gestures, requiring robust multimodal data collection and methods to capture subtle nuances and semantic content.
- Systems such as DexCap, for collecting embodied data, and EgoGen, for generating egocentric synthetic data, are crucial for acquiring data in diverse, contextualized, and egocentric environments, enabling the training of more realistic and generalizable motion models.
- Diffusion models offer a promising unified framework for various motion tasks (editing, in-betweening, denoising) and can be optimized using loss guidance or noise optimization to achieve controllable and high-quality motion synthesis.
- Future work in human motion generation will focus on leveraging synthetic and real-world egocentric data, grounding models in conversational context, and integrating large language models (LLMs) to create more intelligent and interactive virtual humans.
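As a rough illustration of the noise-optimization idea mentioned in the takeaways above, the sketch below keeps a generator frozen and adjusts only its input noise by gradient descent until the generated motion meets a constraint. Everything here is an assumption made for illustration: `toy_generator` is a hypothetical stand-in (a running sum of per-frame velocities), not a diffusion sampler, and the names `task_loss`, `numerical_gradient`, and `optimize_noise` do not come from the workshop or from any published DNO code.

```python
import random


def toy_generator(noise):
    """Hypothetical frozen generator mapping latent noise to a 1D trajectory.

    Stand-in for a deterministic sampler: each noise value acts as a
    per-frame velocity and positions are their scaled running sum.
    """
    trajectory, position = [], 0.0
    for velocity in noise:
        position += 0.5 * velocity
        trajectory.append(position)
    return trajectory


def task_loss(trajectory, target):
    """Squared distance between the final frame and a target position."""
    return (trajectory[-1] - target) ** 2


def numerical_gradient(noise, target, eps=1e-5):
    """Central finite-difference gradient of the loss w.r.t. the noise."""
    grad = []
    for i in range(len(noise)):
        up = noise[:i] + [noise[i] + eps] + noise[i + 1:]
        down = noise[:i] + [noise[i] - eps] + noise[i + 1:]
        diff = task_loss(toy_generator(up), target) - task_loss(
            toy_generator(down), target
        )
        grad.append(diff / (2 * eps))
    return grad


def optimize_noise(noise, target, steps=200, lr=0.05):
    """Gradient descent on the noise only; the generator is never changed."""
    noise = list(noise)
    for _ in range(steps):
        grad = numerical_gradient(noise, target)
        noise = [z - lr * g for z, g in zip(noise, grad)]
    return noise


random.seed(0)
initial_noise = [random.gauss(0.0, 1.0) for _ in range(8)]
optimized_noise = optimize_noise(initial_noise, target=2.0)
initial_loss = task_loss(toy_generator(initial_noise), 2.0)
final_loss = task_loss(toy_generator(optimized_noise), 2.0)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.2e}")
```

In a real system the generator would be a pretrained diffusion sampler and the gradient would come from automatic differentiation rather than finite differences; the toy version only shows the control flow of optimizing the noise while the model stays fixed.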
Methods / Models / Datasets Mentioned
Local Motion Phases [Starke et al. 2020] · Motorica [Alexanderson et al. 2022] · Motion Matching [Holden et al. 2020] · Phase-Functioned Neural Networks [Holden et al. 2017] · Diffusion Noise Optimization (DNO) · DART architecture · Motion Primitive VAE · CLIP text encoder · CLIP visual encoder · Aesthetic Scoring Model · EDICT Generative Process [Wallace et al. 2022] · PhysicsVAE [Won et al. SIGGRAPH 2022] · FlowMDM [Barquero et al. CVPR 2024] · DexCap · SLAM+IMU pose tracking · DiffIP motion reconstruction · EgoGen · Inverse Kinematics · Torque Inverse Dynamics · Muscle Reconstruction · DART (Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control) · EgoGen (An Egocentric Synthetic Data Generator) · HOOD [Grigorev et al. CVPR 2023] · PPO algorithm · GAMMA [Zhang et al. CVPR 2022]
Topics
Human Motion Generation · Motion Synthesis · Character Animation · Video Games · Robotics · Healthcare · Virtual Reality · Embodied Intelligence · Data Collection · Diffusion Models · Generative Models · Controllable Motion · Real-time Inference · Motion Matching · Gesture Synthesis · Egocentric Perception · Dynamic Simulation · Data Sharing · Optimization