The 20th Embedded Vision Workshop (EVW2024)

Event: CVPR Embedded Vision Workshop 2024 · Duration: 189 min

Abstract

The 20th Embedded Vision Workshop (EVW2024) at CVPR 2024 featured a series of invited talks and oral presentations covering recent advances and challenges in embedded vision. Topics ranged from neuromorphic computing and efficient neural network design for lightweight instance segmentation to real-time urban streetscape video analysis and hardware-aware scaling for on-device continual learning. Discussions also included novel approaches for video stabilization, object detection on ultra-low-power systems, and the application of AI in companion robots, emphasizing the importance of latency, energy efficiency, and robust design methodologies for real-world deployment.

Speakers

Branislav Kisačanin — Nvidia and Institute for AI R&D of Serbia
Yusuke Sakemi — Chiba Institute of Technology, Japan
Tse-Wei Chen — Canon Inc., Japan
Carlos Victorino Padeiro — ETH Zurich
Sam Leroux — Qualcomm
Zoran Kostic — Columbia University, US
Lukas Frickenstein — BMW Group
Manon Damphoffer — UGA (Université Grenoble Alpes)
Cevahir Cigla — Aselsan Inc.
Mario E. Munich — Embodied, Inc., US
Marilyn Wolf — University of Nebraska - Lincoln, US
Jamie Menjay Lin — ETH Zurich
Francesco Paissan — ETH Zurich
Laurie Nicholas Bose — Visionchip Ltd.
Luca Bompani — ETH Zurich
Elishai Ezra Tsur — Technion - Israel Institute of Technology
Anamika Jha — Texas Instruments
Parakh Agarwal — Purdue University
Omkar Prabhune — Purdue University
Jan Ernst — Latent AI, USA

Talks (20)

00:00:00 — Branislav Kisačanin: Welcome and Workshop Overview
- Opening remarks for the 20th Embedded Vision Workshop (EVW2024) at CVPR 2024, outlining the agenda, invited speakers, and logistical details for presentations and poster sessions.
01:04:00 — Yusuke Sakemi: Physical Modeling Approach for Simple and Energy-Efficient Analog Neuromorphic Computers
- This talk presents a physical modeling approach for in-memory computing circuits, interpreting them as spiking neural networks to address non-idealities and improve energy efficiency for neuromorphic applications.
01:07:20 — Tse-Wei Chen: Dedicated Inference Engine and Binary-Weight Neural Networks for Lightweight Instance Segmentation (Paper #01)
- This presentation introduces a dedicated inference engine for binary-weight neural networks designed to reduce hardware costs and improve accuracy for lightweight instance segmentation tasks on embedded platforms.
01:10:15 — Carlos Victorino Padeiro: Lightweight Maize Disease Detection Through Post-Training Quantization with Similarity Preservation (Paper #03)
- This talk explores a post-training quantization method that preserves similarity for lightweight maize disease detection models, enabling efficient deployment on embedded systems without significant accuracy loss.
01:12:55 — Sam Leroux: Multi-bit, Black-box Watermarking of Deep Neural Networks in Embedded Applications (Paper #06)
- This presentation introduces a multi-bit, black-box watermarking technique for deep neural networks, designed to protect intellectual property in embedded applications by embedding a unique fingerprint into the model.
01:15:55 — Zoran Kostic: Real Time for Urban Streetscape Video-Based Applications
- This invited talk discusses the challenges and solutions for achieving real-time performance in urban streetscape video-based applications, focusing on low-latency processing and high-bandwidth communication for embedded vision systems.
01:22:05 — Lukas Frickenstein: Pruning as a Binarization Technique (Paper #09)
- This presentation treats pruning as a means of binarizing deep neural networks, aiming for high accuracy with reduced computational complexity in embedded applications.
01:25:15 — Manon Damphoffer: Neuromorphic Lip-Reading with Signed Spiking Gated Recurrent Units (Paper #10)
- This talk presents a neuromorphic lip-reading pipeline using event cameras and signed spiking gated recurrent units, demonstrating high accuracy and energy efficiency for on-device processing.
01:28:15 — Cevahir Cigla: Efficient Video Stabilization via Partial Block Phase Correlation on Edge GPUs (Paper #11)
- This presentation introduces an efficient video stabilization method utilizing partial block phase correlation on edge GPUs, designed to provide low-latency and high-performance stabilization for embedded vision applications.
01:31:15 — Mario E. Munich: Building Moxie - an embedded AI companion
- This invited talk discusses the development of Moxie, an embedded AI companion robot designed for children, focusing on the challenges of building a consumer robot with advanced AI capabilities while maintaining cost-effectiveness and emotional intelligence.
01:38:00 — Marilyn Wolf: Perception/Control Co-Design for Autonomous Vehicles
- This invited talk explores the critical relationship between perception latency and control accuracy in autonomous vehicles, proposing a Markovian error model to improve tracking performance and enable predictive design space exploration for robust embedded vision systems.
01:46:15 — Jamie Menjay Lin: SciFlow: Empowering Lightweight Optical Flow Models with Self-Cleaning Iterations (Paper #12)
- This presentation introduces SciFlow, a lightweight optical flow model empowered by self-cleaning iterations and a regression focal loss, designed to mitigate ambiguities and errors in optical flow estimation for embedded applications.
01:49:15 — Francesco Paissan: Structured Sparse Back-propagation for Lightweight On-Device Continual Learning on Microcontroller Units (Paper #13)
- This talk presents a structured sparse back-propagation method for lightweight on-device continual learning on microcontroller units, enabling efficient model updates and significant memory savings for embedded vision tasks.
01:52:15 — Laurie Nicholas Bose: Demonstration of SCAMP-7 Pixel Processor Array (Demo #04)
- This demonstration showcases the SCAMP-7 Pixel Processor Array, an image sensor where every pixel is a processor, enabling high-speed, low-power in-pixel computation for various embedded vision applications like object tracking, gesture detection, and CNN inference.
01:55:15 — Luca Bompani: Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems (Paper #15)
- This presentation introduces a multi-resolution rescored ByteTrack method for video object detection on ultra-low-power embedded systems, improving accuracy and throughput by leveraging temporal consistency and selectively processing frames at different resolutions.
01:58:15 — Elishai Ezra Tsur: ED-DCFNet: an unsupervised encoder-decoder neural model for event-driven feature extraction and object tracking (Paper #17)
- This talk introduces ED-DCFNet, an unsupervised encoder-decoder neural model designed for event-driven feature extraction and object tracking, leveraging event camera data for efficient and robust performance in embedded vision systems.
02:01:15 — Anamika Jha: RAVN: Reinforcement Aided Adaptive Vector Quantization of Deep Neural Networks (Paper #18)
- This presentation introduces RAVN, a reinforcement-aided adaptive vector quantization method for deep neural networks, designed to optimize quantization for efficient deployment on embedded systems while maintaining accuracy.
02:04:15 — Parakh Agarwal: Prune Efficiently by Soft Pruning (Paper #21)
- This talk presents a soft pruning technique for neural networks that efficiently reduces model complexity by adjusting weights across epochs, leading to smaller and more efficient networks suitable for embedded applications.
02:07:15 — Omkar Prabhune: Content-aware Input Scaling & Deep Learning Computation Offloading for Low-Latency Embedded Vision (Paper #22)
- This presentation introduces a content-aware input scaling and computation offloading method for low-latency embedded vision, optimizing the trade-off between latency and accuracy by selectively processing high-resolution regions of interest on edge devices and offloading background processing to a server.
02:10:15 — Jan Ernst: Design Space Exploration for ML System Design with Hardware-in-the-Loop
- This invited talk introduces a holistic and evidence-driven approach to ML system design using hardware-in-the-loop, leveraging semantic abstraction and automation to navigate complex design tradeoffs and predict performance across various hardware targets and applications.
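
As a rough illustration of the phase-correlation idea behind the video stabilization talk (Paper #11), the sketch below estimates a circular shift between two 1D signals from the phase of their cross-power spectrum. It is a minimal example under stated assumptions: it processes a whole 1D signal with a naive DFT rather than partial 2D blocks on an edge GPU, and the names `dft` and `phase_correlation_shift` are hypothetical, not taken from the paper.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def phase_correlation_shift(ref, cur):
    """Estimate the circular shift s such that cur[t] == ref[(t - s) % n].

    Hypothetical helper for illustration only (1D, whole-signal variant).
    """
    spec_ref = dft(ref)
    spec_cur = dft(cur)

    # Normalized cross-power spectrum: discard magnitude, keep phase only.
    cross = []
    for f, g in zip(spec_ref, spec_cur):
        c = g * f.conjugate()
        mag = abs(c)
        cross.append(c / mag if mag > 1e-12 else 0j)

    # The inverse transform peaks at the shift between the two signals.
    corr = [v.real for v in dft(cross, inverse=True)]
    n = len(corr)
    peak = max(range(n), key=corr.__getitem__)

    # Map peaks in the upper half of the range to negative shifts.
    return peak if peak <= n // 2 else peak - n
```

Given a signal `ref` and a copy `cur` delayed by three samples, `phase_correlation_shift(ref, cur)` should recover a shift of 3. A stabilizer would then apply the opposite shift to compensate. Production code would more likely rely on an FFT library and 2D frames.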

Key Takeaways

Embedded vision systems require careful co-design of perception and control, considering factors like latency, energy efficiency, and hardware constraints for optimal real-world performance.
Neuromorphic computing and structured sparse back-propagation both reduce computational demands on ultra-low-power embedded systems, and the latter makes on-device continual learning feasible on microcontroller units.
Addressing non-idealities in analog computing and leveraging techniques like self-cleaning iterations and content-aware scaling are crucial for developing robust and accurate embedded AI models.
The development of AI companion robots like Moxie highlights the need for advanced multimodal input processing, emotional intelligence, and robust hardware design within strict cost and privacy considerations.
Design space exploration with hardware-in-the-loop and Markovian error modeling are essential for predictive performance analysis and navigating complex tradeoffs in ML system design for safety-critical applications like autonomous vehicles.
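
To give a feel for what a Markovian error model can predict, the sketch below uses a generic two-state Markov chain (perception output is either correct or in error on each frame). This is an assumption made for illustration, not the model presented in the talk; the function names and the parameters `p_fail` and `p_recover` are hypothetical.

```python
def stationary_error_rate(p_fail, p_recover):
    """Long-run fraction of frames spent in the error state.

    p_fail:    P(error at frame t+1 | correct at frame t)
    p_recover: P(correct at frame t+1 | error at frame t)
    Both are assumed to lie in (0, 1].
    """
    if not (0 < p_fail <= 1 and 0 < p_recover <= 1):
        raise ValueError("probabilities must lie in (0, 1]")
    # Balance condition of a two-state chain:
    # pi_error * p_recover == (1 - pi_error) * p_fail
    return p_fail / (p_fail + p_recover)


def mean_error_burst_length(p_recover):
    """Expected number of consecutive error frames once an error begins."""
    if not (0 < p_recover <= 1):
        raise ValueError("p_recover must lie in (0, 1]")
    # Burst length is geometric with per-frame exit probability p_recover.
    return 1.0 / p_recover
```

Unlike a single average error rate, a model of this kind separates how often errors start from how long they persist. That distinction matters when a controller can tolerate isolated bad frames but not long bursts.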

Methods / Models / Datasets Mentioned

YOLOv5
YOLOv8
SSD-MobileNetV2
MobileNetV2-FPNLite
EfficientDet-D1
FasterRCNN-ResNet50
ByteTrack
SCAMP-7
ED-DCFNet
RAVN
PhiNet
SciFlow
YOLOv8n
YOLOv8s
YOLOv8m
YOLOv8l
YOLOv8x
YOLOv1
YOLOv8x6
TensorRT
DeepStream
GStreamer
Tiny YOLOv3
ResNetV1
MobileNetV2
PixelRNN
IMX334
NileCAM01
QDrop
BRECQ
SPQ
CORe50
PULP platform
RISC-V
COSMOS
DataCity
Mobility Intelligence
HMM (Hidden Markov Model)
NanoDet-Plus
YOLOX-Nano

Topics

Embedded Vision · Neuromorphic Computing · Efficient AI · Optical Flow · Continual Learning · Microcontroller Units · Video Stabilization · Object Detection · AI Companions · Hardware-in-the-Loop · Low-Latency Systems · Energy Efficiency · Deep Learning Optimization
