CVPR MetaFood Workshop

Event: CVPR MetaFood Workshop 2024 · Duration: 195 min

Abstract

The CVPR MetaFood Workshop 2024 brings together researchers working on food computing, a field of growing importance across several research areas. The workshop focuses on the challenges and opportunities in food recognition and analysis, emphasizing the need for data-centric approaches and for robust, scalable systems for portion size estimation and nutrition tracking. Key topics include physically informed 3D food reconstruction, one- and few-shot food volume estimation, and vision-based systems for food portion estimation on utensils. The workshop also explores the potential of multi-modal chatbots and predictive healthcare applications in the food domain.

Speakers

  • Yuhao Chen — SMU School of Computing and Information Systems
  • Jiangpeng He — SMU School of Computing and Information Systems
  • Petia Radeva — Universitat de Barcelona
  • Chris Czarnecki — University of Waterloo
  • Aaryam Sharma — University of Waterloo
  • Alexander Wong — University of Waterloo
  • Ahmad AlMughrabi — Universitat de Barcelona
  • Umair Haroon — Universitat de Barcelona
  • Ricardo Marques — Universitat de Barcelona
  • Yawei Jueluo — Beijing Institute of Technology
  • Chengyu Shi — Beijing Institute of Technology
  • Pengyu Wang — Beijing Institute of Technology
  • Jiadong Tang — Beijing Institute of Technology
  • Dianyi Yang — Beijing Institute of Technology
  • Yu Gao — Beijing Institute of Technology
  • Zhaoxiang Liang — Beijing Institute of Technology
  • Mingui Sun — SMU School of Computing and Information Systems

Talks (7)

  • 00:00:00 — Yuhao Chen: Welcome and Introduction
    • The speaker welcomes attendees to the CVPR MetaFood Workshop, highlighting the growing importance of food computing in various research areas and inviting the community to engage with the challenges in the food domain.
  • 00:56:00 — Petia Radeva: Data-centric Food Computing
    • The speaker discusses the increasing popularity and challenges of food recognition and analysis, emphasizing the need for data-centric approaches and highlighting various problems like high intra-class variability, ambiguity, scalability, and lack of annotated data.
  • 02:44:52 — Yawei Jueluo: Physically Informed 3D Food Reconstruction
    • The speaker presents a method for physically informed 3D food reconstruction that aims to accurately estimate food volume and portion size, addressing challenges like scale factor estimation, geometric distortions, and unobserved areas in single-view and multi-view images.
  • 02:54:51 — Jiadong Tang: A Workflow for Physically Informed 3D Food Reconstruction
    • The speaker presents a workflow for physically informed 3D food reconstruction, focusing on accurately estimating food volume and portion size by combining multi-view and single-view reconstruction techniques with mesh refinement and post-processing steps.
  • 03:01:01 — Ahmad AlMughrabi: VolETA: One- and Few-shot Food Volume Estimation
    • The speaker introduces VolETA, a novel approach for one- and few-shot food volume estimation that combines image processing, 3D reconstruction, and deep learning techniques to measure food portions from images, addressing challenges related to scale factor, geometry, and unobserved areas.
  • 03:12:31 — Chris Czarnecki: How Much You Ate? Food Portion Estimation on Spoons
    • The speaker presents a vision-based system for food portion estimation on spoons, designed for elderly individuals aging at home, addressing challenges of tracking partially consumed portions and hard-to-estimate meals like soups and stews by monitoring food on the utensil.
  • 03:31:52 — Mingui Sun: Food Recognition in the Wild: Challenges and Opportunities
    • The speaker discusses the challenges and opportunities in food recognition in real-world scenarios, highlighting the need for robust, scalable, and accurate systems for portion size estimation and nutrition tracking, while also emphasizing the potential for multi-modal chatbots and predictive healthcare applications.

Key Takeaways

  • Food computing is a rapidly growing field with significant applications in health, economy, and sustainability, requiring robust solutions for food recognition and analysis.
  • Challenges in food recognition include high intra-class variability, ambiguity, scalability, and the scarcity of high-quality annotated data, necessitating data-centric approaches and advanced deep learning techniques.
  • Physically informed 3D food reconstruction and volume estimation are crucial for accurate nutritional tracking, with novel methods leveraging multi-view and single-view images, mesh refinement, and scale factor estimation.
  • Self-supervised learning and multi-modal models, including large language models, offer promising avenues for addressing data limitations and integrating diverse information sources (images, text, recipes) to enhance food analysis capabilities.
  • Real-world deployment of food recognition systems faces practical difficulties such as handling imbalanced datasets, diverse food appearances, and user-generated image quality, highlighting the need for robust algorithms and continuous model improvement.
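The volume estimation step mentioned in the takeaways can be illustrated with a small sketch. It is not any speaker's pipeline; it only assumes that a reconstruction stage has already produced a closed, consistently oriented triangle mesh in arbitrary units, and that a scale factor (for example, derived from a reference object of known size) is available. The function names `mesh_volume` and `metric_volume`, the unit cube, and the scale of 5 cm per unit are all hypothetical. The point it shows is that volume grows with the cube of the scale factor, so a small scale error produces a much larger volume error.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Tri = Tuple[int, int, int]


def mesh_volume(vertices: List[Vec3], faces: List[Tri]) -> float:
    """Volume of a closed, consistently oriented triangle mesh.

    Sums the signed volumes of tetrahedra spanned by the origin and
    each triangle (an application of the divergence theorem).
    """
    total = 0.0
    for i, j, k in faces:
        ax, ay, az = vertices[i]
        bx, by, bz = vertices[j]
        cx, cy, cz = vertices[k]
        # Scalar triple product a . (b x c) = 6 * signed tetrahedron volume
        total += (ax * (by * cz - bz * cy)
                  - ay * (bx * cz - bz * cx)
                  + az * (bx * cy - by * cx))
    return abs(total) / 6.0


def metric_volume(vertices: List[Vec3], faces: List[Tri], scale: float) -> float:
    """Convert mesh-unit volume to real-world volume.

    `scale` is real-world length per mesh unit, so volume scales by scale**3.
    """
    return mesh_volume(vertices, faces) * scale ** 3


# Hypothetical stand-in for a reconstructed food mesh: a unit cube
# with outward-facing triangles.
CUBE_VERTICES: List[Vec3] = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),
]
CUBE_FACES: List[Tri] = [
    (0, 2, 1), (0, 3, 2),  # bottom
    (4, 5, 6), (4, 6, 7),  # top
    (0, 1, 5), (0, 5, 4),  # front
    (3, 7, 6), (3, 6, 2),  # back
    (0, 4, 7), (0, 7, 3),  # left
    (1, 2, 6), (1, 6, 5),  # right
]

if __name__ == "__main__":
    # Assumed scale: one mesh unit corresponds to 5 cm.
    print(metric_volume(CUBE_VERTICES, CUBE_FACES, scale=5.0))  # volume in cm^3
```

Because the formula relies on a watertight mesh, holes from unobserved areas would need to be closed first, which is one reason the talks pair reconstruction with mesh refinement and post-processing.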

Methods / Models / Datasets Mentioned

  • COLMAP
  • SuperPoint
  • SuperGlue
  • Gaussian Splatting
  • PixSFM
  • SAM
  • XMem
  • CLIP Image Encoder
  • FoodLearner
  • Image-Informed Text Encoder
  • CLIP
  • ResNet50
  • GMM
  • Co-Divide
  • Bayesian DivideMix
  • Transformer
  • UMAP
  • Faster/Mask R-CNN
  • ForestDet
  • QueryInst
  • IOF
  • DINO
  • SparseInst
  • LOFI
  • EQLv2
  • Mask Scoring Head
  • GIoU Loss
  • Graph Neural Network
  • GATv2
  • FoodAI V1 Model
  • FoodAI-756 dataset
  • SE-ResNeXt-101
  • FoodAI V4
  • FoodLMM
  • LoRA
  • VolETA

Topics

Food Computing · Food Recognition · Food Analysis · Data-centric AI · 3D Food Reconstruction · Food Volume Estimation · Portion Size Estimation · Nutrition Tracking · Multi-modal Chatbots · Predictive Healthcare


Notes

Open for commentary — connections to other work, critiques, follow-up reading.