PBVS 2024 Workshop: Challenges and Results
Event: Perception Beyond the Visible Spectrum (PBVS) workshop 2024 · Duration: 44 min
Abstract
This video presents the results of the PBVS 2024 workshop challenges, covering Thermal Image Super-Resolution and Multi-modal Aerial View Image tasks (Classification and Translation). Winning teams for each challenge track present their methodologies, including network architectures, training strategies, and loss functions. The overall results are summarized, highlighting the performance metrics and the innovative approaches employed by participants. The presentations also discuss the datasets used, evaluation criteria, and the future implications of the challenge results for advancing research in these fields.
Speakers
- Chenyang Wang — Harbin Institute of Technology, China
- Zhiwei Zhong — City University of Hong Kong, China
- Rafael E. Rivadeneira — Escuela Superior Politécnica del Litoral, CIDIS-ESPOL, Guayaquil, Ecuador
- Spencer Low — Computer Vision Center, Edifici O, Campus UAB, Barcelona, Spain
Talks (5)
- 00:09:59 — Chenyang Wang: Thermal Image Super-Resolution Challenge Track 1 - PBVS 2024
- The AC-TSR team presents its method for Track 1 of the Thermal Image Super-Resolution Challenge, which placed first in the PSNR/SSIM evaluation. The network has three stages (feature extraction, NAFBlock-based feature enhancement, and image reconstruction) and is trained with L1 loss and data augmentation.
- 01:54:55 — Zhiwei Zhong: Thermal Image Super-Resolution Challenge Track 2 - PBVS 2024
- The GUIDEDSR team’s hybrid framework for Track 2 of the Thermal Image Super-Resolution Challenge uses a mixture-of-experts strategy: it processes different rotations of the input image and integrates their outputs to improve the super-resolved result.
- 18:41:00 — Rafael E. Rivadeneira: Thermal Image Super-Resolution Challenge Results PBVS 2024
- This presentation gives an overview of the Thermal Image Super-Resolution Challenge results. It details the two tracks, the CIDIS dataset, and the quantitative results, then announces the winning teams (AC-TSR for Track 1 and GUIDEDSR for Track 2) and recaps their architectures.
- 30:00:00 — Spencer Low: Multi-modal Aerial View Image Challenge Results – Classification PBVS 2024
- This talk presents the results of the Multi-modal Aerial View Image Classification Challenge, which uses the UNICORN V3 dataset for SAR classification and Out-of-Distribution scoring, and highlights the top-ranked methods and winners.
- 40:00:00 — Spencer Low: Multi-modal Aerial View Image Challenge: Translation from Synthetic Aperture Radar to Electro-Optical Domain Results — PBVS 2024
- This presentation covers the results of the Multi-modal Aerial View Image Translation Challenge, which focuses on translation between SAR, RGB, and IR modalities using the new MAGIC-Stacks Dataset, and details the top-ranked methods and winners.
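The rotation-based processing described in the Track 2 summary can be illustrated with a minimal sketch. This is not the GUIDEDSR implementation: the hypothetical `model` callable stands in for any super-resolution network, plain averaging stands in for whatever integration step the team actually used, and images are nested lists so the example needs no dependencies.

```python
from typing import Callable, List

Image = List[List[float]]


def rot90(img: Image, k: int = 1) -> Image:
    """Rotate a 2-D image counter-clockwise by k * 90 degrees."""
    out = [row[:] for row in img]
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise turn.
        out = [list(col) for col in zip(*out)][::-1]
    return out


def rotation_ensemble(model: Callable[[Image], Image], img: Image) -> Image:
    """Run `model` on four rotations of `img`, undo each rotation, and average.

    Assumption: `model` is a placeholder for a trained network, and simple
    averaging is an illustrative way to integrate the four outputs.
    """
    outputs = []
    for k in range(4):
        pred = model(rot90(img, k))
        outputs.append(rot90(pred, -k))  # rotate back to the input orientation

    height, width = len(outputs[0]), len(outputs[0][0])
    return [
        [sum(o[r][c] for o in outputs) / len(outputs) for c in range(width)]
        for r in range(height)
    ]


if __name__ == "__main__":
    # With an identity "model", the ensemble returns the input unchanged.
    print(rotation_ensemble(lambda x: x, [[1.0, 2.0], [3.0, 4.0]]))
```

Averaging over rotations only helps when the model's errors differ between orientations; a learned weighting of the rotated outputs would be a closer analogue of a mixture of experts.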
Key Takeaways
- The AC-TSR and GUIDEDSR teams won the two tracks of the Thermal Image Super-Resolution Challenge, using NAFBlocks and a mixture of experts, respectively.
- Transformer-based models, attention mechanisms, and hybrid architectures were key to success across the challenge tracks.
- The Multi-modal Aerial View Image Challenge highlighted the importance of combining different sensor data (SAR, EO) for robust classification and translation, with innovative methods leveraging data augmentation and specialized network components.
- Out-of-Distribution detection and rigorous data splitting were emphasized in the classification challenge, pushing models to generalize better beyond known data distributions.
- The introduction of new benchmark datasets like CIDIS and MAGIC-Stacks provides valuable resources for future research and development in thermal and multi-modal image analysis.
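Out-of-Distribution scoring of the kind emphasized above is commonly summarized with AUROC, which this page lists among the metrics. The following is a minimal sketch, assuming a higher score means "more likely out-of-distribution" and label 1 marks an OOD sample. The function name and toy data are illustrative and are not the challenge's evaluation code.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Returns the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy OOD scores: two OOD samples (label 1), two in-distribution (label 0).
    print(auroc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))  # 3 of 4 pairs ranked correctly
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based computation scales better on large test sets.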
Methods / Models / Datasets Mentioned
NAFBlock · L1 loss · self-ensemble · CIDIS dataset · UNICORN V3 dataset · AUROC · Scattering Prompt Tuning (SPT) · Textual Prompts · Residual AdapterMLP (RAMLP) · Dynamic Distributional Contrast Loss · ViT-based model · AdapterFormer · Perceptual hash algorithm · MAGIC-Stacks Dataset · Mean Squared Error · Perceptual Loss · Fréchet Inception Distance · Pix2PixHD · L2 loss · LPIPS · Binary Cross Entropy (BCE)
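PSNR, the metric cited for the Track 1 evaluation, is derived from the Mean Squared Error listed above: PSNR = 10 · log10(peak² / MSE). The sketch below assumes 8-bit images (peak value 255) passed as flat pixel sequences; the challenge's exact evaluation protocol is not described on this page, so treat the defaults as assumptions.

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference: Sequence[float], estimate: Sequence[float],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; `peak` = 255 assumes 8-bit images."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    # Every pixel is off by 10 grey levels, so MSE = 100.
    print(round(psnr([100, 100], [110, 110]), 2))
```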
Topics
Thermal Image Super-Resolution · Multi-modal Aerial View Image · SAR Classification · Sensor Domain Translation · Deep Learning Architectures · Feature Enhancement · Image Reconstruction · Loss Functions · Data Augmentation · Mixture of Experts · Transformer-based Models · Out-of-Distribution Detection