Virtual Try-On Workshop

Event: CVPR 2024 Workshop · Duration: 192 min

Abstract

This workshop presents recent advancements in virtual try-on (VTO) technology, covering diverse approaches from 3D human modeling to diffusion-based image synthesis. Speakers discuss methods for accurate garment recovery, material estimation, and realistic cloth simulation, emphasizing the integration of physics-inspired models and learning-based frameworks. Key challenges addressed include achieving high-fidelity rendering, preserving garment details, handling complex poses and body shapes, and ensuring real-time performance on various devices. The presentations showcase innovative techniques for generating animatable layered assets, performing virtual try-on using latent diffusion models, and even estimating 3D facial makeup, pushing the boundaries of photorealistic and user-friendly VTO applications.

Speakers

  • Javier Romero — Amazon
  • Gerard Pons-Moll — University of Tübingen and MPII
  • Sunil — Amazon
  • Ming C. Lin — UMD
  • Hanbyul Joo — Seoul National University
  • Jeongho Kim — KAIST
  • Mehmet Saygin Seyfioglu — University of Washington
  • Xingchao Yang — CyberAgent AI Lab and University of Tsukuba
  • Katie Lewis — ModiFace

Talks (9)

  • 00:00:00 — Javier Romero: Virtual Try-On CVPR 2024 workshop
    • Introduction to the CVPR 2024 Virtual Try-On workshop, acknowledging co-organizers and providing a historical context of virtual try-on technology from 2010 to recent advancements.
  • 00:52:00 — Gerard Pons-Moll: What do foundation models know about 3D humans in clothing?
    • Discusses the evolution of 3D human modeling in clothing, from mesh-based SMPL to neural implicits and Gaussian Splats, and introduces Human 3-Diffusion as a new approach to extract 3D information from foundation models.
  • 01:09:00 — Sunil: Garment Layering and Material Estimation for Virtual Try-On
    • Presents a two-stage pipeline for disentangling garment style and texture, using parsing-based style editing and texture inpainting modules, and leveraging CLIP features for photorealistic texture transfer.
  • 01:16:01 — Ming C. Lin: Physics-Inspired Fit-Aware Virtual Try-On
    • Introduces a physics-inspired, fit-aware virtual try-on system that accurately reconstructs human bodies, faithfully estimates garment materials, and performs real-time cloth simulation, addressing challenges in scalability and realism.
  • 01:36:56 — Hanbyul Joo: GALA: Generating Animatable Layered Assets from a Single Scan
    • Presents GALA, a method for generating animatable and layered 3D assets from a single scan by decomposing objects and humans, canonicalizing poses, and refining misaligned assets through penetration handling.
  • 01:47:51 — Jeongho Kim: Virtual Try-on with Latent Diffusion Model
    • Explores virtual try-on using latent diffusion models, introducing StableVITON to address limitations of clothing tokenization and warped clothing input, and demonstrating improved performance in garment reconstruction.
  • 02:00:01 — Mehmet Saygin Seyfioglu: Diffuse2Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All
    • Proposes Diffuse2Choose, a novel approach for virtual try-all that enriches image-conditioned inpainting in latent diffusion models, focusing on fast inference, preserving product details, and achieving high-quality results.
  • 02:10:01 — Xingchao Yang: Makeup Prior Models for 3D Facial Makeup Estimation and Applications
    • Develops two makeup prior models (PCA-based and StyleGAN2-based) for efficient and accurate 3D facial makeup estimation and transfer, demonstrating robustness in handling self-occluded faces and improving 3D face reconstruction.
  • 02:20:01 — Katie Lewis: Integrating Learning-based VTO: Challenges and Advances
    • Discusses challenges and advances in integrating learning-based virtual try-on, highlighting the need for hyper-fidelity, image aesthetics, and style diversity, and introducing a mobile fitting room application.
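
As a rough illustration of the PCA-based prior idea mentioned in the makeup estimation talk, the sketch below fits a linear prior to synthetic vectors and then recovers a full vector from partially observed entries. It is not the speaker's implementation: the function names (fit_pca_prior, fit_coeffs_masked), the random stand-in data, and the masked least-squares fit are all assumptions made for illustration.

```python
import numpy as np


def fit_pca_prior(samples, n_components):
    """Fit a linear (PCA) prior to an (N, D) array of flattened samples."""
    mean = samples.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:n_components]  # mean: (D,), basis: (K, D)


def fit_coeffs_masked(x, visible, mean, basis):
    """Least-squares fit of prior coefficients using only visible entries."""
    a = basis[:, visible].T          # (num_visible, K)
    b = (x - mean)[visible]          # (num_visible,)
    coeffs, *_ = np.linalg.lstsq(a, b, rcond=None)
    return coeffs


def reconstruct(coeffs, mean, basis):
    """Map prior coefficients back to a full D-dimensional vector."""
    return mean + coeffs @ basis


# Synthetic stand-in data (assumption): 200 samples that lie exactly in a
# 3-dimensional affine subspace of a 48-dimensional space.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 48))
samples = latent @ mixing + 0.5

mean, basis = fit_pca_prior(samples, n_components=3)

# Pretend the second half of the vector is hidden (e.g. self-occluded).
visible = np.zeros(48, dtype=bool)
visible[:24] = True

target = samples[0]
coeffs = fit_coeffs_masked(target, visible, mean, basis)
recovered = reconstruct(coeffs, mean, basis)
max_error = float(np.abs(recovered - target).max())
print(f"max reconstruction error on the full vector: {max_error:.2e}")
```

Because the synthetic data lies exactly in the prior's subspace, the hidden half is recovered almost perfectly here. Real textures would only be approximated, and the quality would depend on how well the prior spans them.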

Key Takeaways

  • Virtual try-on technology has evolved significantly, moving from early 2D image manipulation to sophisticated 3D modeling and diffusion-based synthesis, offering increasingly realistic and personalized experiences.
  • Integrating physics-inspired simulations with deep learning models is crucial for achieving high-fidelity garment draping, material estimation, and realistic motion in virtual try-on applications.
  • The development of large-scale synthetic datasets and multi-view learning frameworks is essential to overcome limitations of small-scale real-world data and improve generalization capabilities of VTO models across diverse body shapes, poses, and garment types.
  • Future advancements in VTO will focus on enhancing control over garment reconstruction, enabling seamless animation, and improving the efficiency and scalability of models to support real-time applications on various devices while addressing privacy concerns.
  • Novel approaches leveraging latent diffusion models and explicit 3D representations are demonstrating promising results in generating high-quality, consistent, and customizable virtual try-on images, paving the way for more interactive and personalized online shopping experiences.

Methods / Models / Datasets Mentioned

  • SMPL
  • ClothCap
  • Video-Avatars
  • PIFu
  • NeRF
  • SMPLicit
  • Gaussian Splats
  • Human 3-Diffusion
  • FX Mirror
  • Wannakicks
  • Amazon Virtual Try-On
  • Google Virtual Try-On
  • Codec Avatars
  • Neural-GIF
  • Multi-Garment Net (MGN)
  • ImageDream
  • Neural Surface Fields (NSF)
  • ControlNet
  • StableVITON
  • DreamPaint
  • PBE (Paint by Example)
  • Diffuse2Choose
  • PCA
  • StyleGAN2
  • FLAME model
  • DECA
  • HMR
  • ARCsim
  • Tiny-CNN
  • S2GAN
  • KD GAN
  • Sparsifiner
  • Controllable GAN
  • VITON-HD
  • HR-VITON
  • GP-VITON
  • LADI-VITON
  • DCI-VITON
  • PSGAN
  • SCGAN
  • SpMT
  • LADN
  • SSAT
  • EleGANt
  • CSD-MT
  • PPO (Proximal Policy Optimization)
  • PointNet
  • AtlasNet
  • HumanSGD
  • MAE (Masked Autoencoder)
  • DiT (Diffusion Transformer)
  • U-Net
  • LSTM
  • CNN
  • AlexNet
  • VPoser
  • CLoSE-D
  • DreamBooth
  • TADA
  • GALA

Topics

Virtual Try-On (VTO) · 3D Human Modeling · Garment Recovery · Cloth Simulation · Diffusion Models · Material Estimation · Pose Estimation · Image Synthesis · Facial Makeup Estimation · Scalability


Notes

Open for commentary — connections to other work, critiques, follow-up reading.