CVPR MetaFood Workshop

Event: CVPR MetaFood Workshop 2024 · Duration: 195 min

Abstract

The CVPR MetaFood Workshop 2024 brings together researchers working on food computing, a field of growing importance across several research areas. The workshop focuses on the challenges and opportunities in food recognition and analysis, emphasizing the need for data-centric approaches and for robust, scalable systems for portion size estimation and nutrition tracking. Key topics include physically informed 3D food reconstruction, one- and few-shot food volume estimation, and vision-based estimation of food portions on utensils. The workshop also explores the potential of multi-modal chatbots and predictive healthcare applications in the food domain.

Speakers

Yuhao Chen — SMU School of Computing and Information Systems
Jiangpeng He — SMU School of Computing and Information Systems
Petia Radeva — Universitat de Barcelona
Chris Czarnecki — University of Waterloo
Aaryam Sharma — University of Waterloo
Alexander Wong — University of Waterloo
Ahmad AlMughrabi — Universitat de Barcelona
Umair Haroon — Universitat de Barcelona
Ricardo Marques — Universitat de Barcelona
Yawei Jueluo — Beijing Institute of Technology
Chengyu Shi — Beijing Institute of Technology
Pengyu Wang — Beijing Institute of Technology
Jiadong Tang — Beijing Institute of Technology
Dianyi Yang — Beijing Institute of Technology
Yu Gao — Beijing Institute of Technology
Zhaoxiang Liang — Beijing Institute of Technology
Mingui Sun — SMU School of Computing and Information Systems

Talks (7)

00:00:00 — Yuhao Chen: Welcome and Introduction
- The speaker welcomes attendees to the CVPR MetaFood Workshop, highlighting the growing importance of food computing in various research areas and inviting the community to engage with the challenges in the food domain.
00:56:00 — Petia Radeva: Data-centric Food Computing
- The speaker discusses the growing popularity of food recognition and analysis and the challenges involved, arguing for data-centric approaches to problems such as high intra-class variability, ambiguity, scalability, and lack of annotated data.
02:44:52 — Yawei Jueluo: Physically Informed 3D Food Reconstruction
- The speaker presents a method for physically informed 3D food reconstruction that aims to accurately estimate food volume and portion size, addressing challenges like scale factor estimation, geometric distortions, and unobserved areas in single-view and multi-view images.
02:54:51 — Jiadong Tang: A Workflow for Physically Informed 3D Food Reconstruction
- The speaker presents a workflow for physically informed 3D food reconstruction, focusing on accurately estimating food volume and portion size by combining multi-view and single-view reconstruction techniques with mesh refinement and post-processing steps.
03:01:01 — Ahmad AlMughrabi: VolETA: One- and Few-shot Food Volume Estimation
- The speaker introduces VolETA, an approach to one- and few-shot food volume estimation that combines image processing, 3D reconstruction, and deep learning to measure food portions from images, addressing challenges related to scale factor, geometry, and unobserved areas.
03:12:31 — Chris Czarnecki: How Much You Ate? Food Portion Estimation on Spoons
- The speaker presents a vision-based system for food portion estimation on spoons, designed for elderly individuals aging at home, addressing challenges of tracking partially consumed portions and hard-to-estimate meals like soups and stews by monitoring food on the utensil.
03:31:52 — Mingui Sun: Food Recognition in the Wild: Challenges and Opportunities
- The speaker discusses the challenges and opportunities in food recognition in real-world scenarios, highlighting the need for robust, scalable, and accurate systems for portion size estimation and nutrition tracking, while also emphasizing the potential for multi-modal chatbots and predictive healthcare applications.
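Several of the talks above come down to the same final step: once a closed 3D mesh of the food has been reconstructed, its volume is computed in arbitrary mesh units and then converted to a physical volume using an estimated scale factor. The following is a minimal sketch of that step only, not any speaker's implementation. The function names, the unit-cube test mesh, and the value of 0.05 metres per mesh unit are all assumptions made for illustration. Note that a length scale factor s multiplies volume by s cubed.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Face = Tuple[int, int, int]


def mesh_volume(vertices: Sequence[Vec3], faces: Sequence[Face]) -> float:
    """Volume enclosed by a closed, consistently oriented triangle mesh.

    The result is in mesh units cubed. Each triangle (a, b, c) forms a
    tetrahedron with the origin whose signed volume is a . (b x c) / 6;
    summing over all faces gives the enclosed volume.
    """
    total = 0.0
    for i, j, k in faces:
        ax, ay, az = vertices[i]
        bx, by, bz = vertices[j]
        cx, cy, cz = vertices[k]
        # Scalar triple product a . (b x c)
        total += (ax * (by * cz - bz * cy)
                  + ay * (bz * cx - bx * cz)
                  + az * (bx * cy - by * cx))
    # abs() makes the result independent of inward/outward orientation
    return abs(total) / 6.0


def metric_volume_ml(volume_mesh_units: float, metres_per_unit: float) -> float:
    """Convert a volume in mesh units cubed to millilitres.

    metres_per_unit is the (assumed, externally estimated) length scale
    factor; volume scales with its cube. 1 cubic metre = 1e6 mL.
    """
    return volume_mesh_units * metres_per_unit ** 3 * 1e6


# Hypothetical test mesh: a unit cube split into 12 outward-facing triangles.
CUBE_VERTICES = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (0.0, 1.0, 1.0),
]
CUBE_FACES = [
    (0, 2, 1), (0, 3, 2),  # bottom
    (4, 5, 6), (4, 6, 7),  # top
    (0, 1, 5), (0, 5, 4),  # front
    (3, 7, 6), (3, 6, 2),  # back
    (0, 4, 7), (0, 7, 3),  # left
    (1, 2, 6), (1, 6, 5),  # right
]

if __name__ == "__main__":
    v = mesh_volume(CUBE_VERTICES, CUBE_FACES)
    # Assumed scale: one mesh unit corresponds to 5 cm
    print(f"mesh volume: {v:.3f} units^3, about {metric_volume_ml(v, 0.05):.1f} mL")
```

Because the scale factor enters cubed, a 10% error in the estimated length scale produces roughly a 33% error in volume, which is one reason scale estimation gets so much attention in these methods.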

Key Takeaways

Food computing is a rapidly growing field with significant applications in health, the economy, and sustainability, all of which depend on robust food recognition and analysis.
Challenges in food recognition include high intra-class variability, ambiguity, scalability, and the scarcity of high-quality annotated data, necessitating data-centric approaches and advanced deep learning techniques.
Physically informed 3D food reconstruction and volume estimation are crucial for accurate nutritional tracking, with novel methods leveraging multi-view and single-view images, mesh refinement, and scale factor estimation.
Self-supervised learning and multi-modal models, including large language models, offer promising avenues for addressing data limitations and integrating diverse information sources (images, text, recipes) to enhance food analysis capabilities.
Real-world deployment of food recognition systems faces practical difficulties such as handling imbalanced datasets, diverse food appearances, and user-generated image quality, highlighting the need for robust algorithms and continuous model improvement.
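As one illustration of the imbalanced-dataset point, a common baseline is to weight each class inversely to its frequency when computing the training loss. The sketch below is a generic heuristic, not a method attributed to any talk in this workshop. The function name and the example labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def balanced_class_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Inverse-frequency class weights: n_samples / (n_classes * class_count).

    Rare classes receive weights above 1 and frequent classes below 1, so
    that every class contributes equally to a weighted loss in total.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical, deliberately skewed label set
    labels = ["rice"] * 6 + ["soup"] * 3 + ["stew"]
    for label, weight in balanced_class_weights(labels).items():
        print(f"{label}: {weight:.3f}")
```

Weights like these are typically passed to a weighted cross-entropy loss; whether they help in practice depends on how severe the imbalance is and on label quality.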

Methods / Models / Datasets Mentioned

COLMAP
SuperPoint
SuperGlue
Gaussian Splatting
PixSFM
SAM
XMem
CLIP Image Encoder
FoodLearner
Image-Informed Text Encoder
CLIP
ResNet50
GMM
Co-Divide
Bayesian DivideMix
Transformer
UMAP
Faster/Mask R-CNN
ForestDet
QueryInst
IOF
DINO
SparseInst
LOFI
EQULv2
Mask Scoring Head
GIoU Loss
Graph Neural Network
GATv2
FoodAI V1 Model
FoodAI-756 dataset
SE-ResNeXt-101
FoodAI V4
FoodLMM
LoRA
VolETA

Topics

Food Computing · Food Recognition · Food Analysis · Data-centric AI · 3D Food Reconstruction · Food Volume Estimation · Portion Size Estimation · Nutrition Tracking · Multi-modal Chatbots · Predictive Healthcare
