🔥[08/07/2024]🔥 The detection code of PODA was released in the detection branch.
Mohammad Fahes1,
Tuan-Hung Vu1,2,
Andrei Bursuc1,2,
Patrick Pérez1,2,
Raoul de Charette1
1 Inria, Paris, France.
2 valeo.ai, Paris, France.
TL;DR: PØDA (or PODA) is a simple feature augmentation method for zero-shot domain adaptation guided by a single textual description of the target domain.
Project page: https://astra-vision.github.io/PODA/
Paper: https://arxiv.org/abs/2212.03241
@InProceedings{fahes2023poda,
title={P{\O}DA: Prompt-driven Zero-shot Domain Adaptation},
author={Fahes, Mohammad and Vu, Tuan-Hung and Bursuc, Andrei and P{\'e}rez, Patrick and de Charette, Raoul},
booktitle={ICCV},
year={2023}
}
Overview of PØDA
We propose Prompt-driven Instance Normalization (PIN) to augment feature styles based on the similarity between the features and the target-domain description (a minimal sketch follows the example below).
Test on an unseen YouTube video of night driving:
Training dataset: Cityscapes
Prompt: "driving at night"
- News
- Installation
- Running PODA
- Inference & Visualization
- Qualitative Results
- PODA for Object Detection
- License
- Acknowledgement
- 28/10/2024: The extension of PØDA, Domain Adaptation with a Single Vision-Language Embedding, is on arXiv.
- 29/11/2023: Check out our recent CVPR 2024 work, A Simple Recipe for Language-guided Domain Generalized Segmentation, which builds upon PIN to achieve state-of-the-art results in domain-generalized segmentation.
- 19/08/2023: The camera-ready version is on arXiv.
- 14/07/2023: PODA is accepted at ICCV 2023.
First create a new conda environment with the required packages:
conda env create --file environment.yml
Then activate the environment using:
conda activate poda_env
- CITYSCAPES: Follow the instructions in Cityscapes to download the images and semantic segmentation ground truths. Please follow the dataset directory structure:

<CITYSCAPES_DIR>/    % Cityscapes dataset root
├── leftImg8bit/     % input images (leftImg8bit_trainvaltest.zip)
└── gtFine/          % semantic segmentation labels (gtFine_trainvaltest.zip)
- ACDC: Download ACDC images and ground truths from ACDC. Please follow the dataset directory structure:

<ACDC_DIR>/    % ACDC dataset root
├── rgb_anon/  % input images (rgb_anon_trainvaltest.zip)
└── gt/        % semantic segmentation labels (gt_trainval.zip)
- GTA5: Download GTA5 images and ground truths from GTA5. Please follow the dataset directory structure:

<GTA5_DIR>/   % GTA5 dataset root
├── images/   % input images
└── labels/   % semantic segmentation labels
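Before training, it can help to sanity-check the dataset roots against the layouts above. The helper below is purely illustrative (the `check_layout` function and example path are not part of the repository):

```python
from pathlib import Path

# Expected top-level folders per dataset, from the structures listed above.
EXPECTED = {
    "cityscapes": ("leftImg8bit", "gtFine"),
    "acdc": ("rgb_anon", "gt"),
    "gta5": ("images", "labels"),
}

def check_layout(root: str, dataset: str) -> None:
    """Warn about any expected sub-directory missing under the dataset root."""
    for sub in EXPECTED[dataset]:
        if not (Path(root) / sub).is_dir():
            print(f"[warning] {Path(root) / sub} not found; check --data_root")

check_layout("/data/cityscapes", "cityscapes")  # example call with a placeholder path
```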
The source models are available here.
1) Source training of the segmentation model:
python3 main.py \
--dataset <source_dataset> \
--data_root <path_to_source_dataset> \
--data_aug \
--lr 0.1 \
--crop_size 768 \
--batch_size 2 \
--freeze_BB \
--ckpts_path saved_ckpts
2) Feature augmentation: optimize target-style statistics with PIN, guided by the target-domain prompt:
python3 PIN_aug.py \
--dataset <source_dataset> \
--data_root <path_to_source_dataset> \
--total_it 100 \
--resize_feat \
--domain_desc <target_domain_description> \
--save_dir <directory_for_saved_statistics>
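Conceptually, this step optimizes a (mu, sigma) pair so that the stylized source features, embedded by the frozen CLIP visual encoder, move toward the text embedding of the target-domain prompt. The sketch below is heavily simplified and makes loud assumptions: `tail` is a stand-in for the frozen remainder of CLIP's visual encoder (later stages plus attention pooling in the real pipeline), the feature shapes are toy values, and `pin()` is the helper sketched in the overview.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("RN50", device=device)

# Text anchor for the target domain (what --domain_desc provides).
tokens = clip.tokenize(["driving at night"]).to(device)
with torch.no_grad():
    text_emb = model.encode_text(tokens).float()
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

# Toy stand-in for low-level source features (e.g. an early CLIP-RN50 stage).
f = torch.randn(1, 256, 56, 56, device=device)

# Initialize target statistics from the source ones, then optimize them.
mu_t = f.mean(dim=(2, 3)).detach().clone().requires_grad_(True)
sigma_t = f.std(dim=(2, 3)).detach().clone().requires_grad_(True)
optimizer = torch.optim.SGD([mu_t, sigma_t], lr=1.0)

# Placeholder for the frozen tail of CLIP's visual encoder.
tail = torch.nn.Sequential(
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
    torch.nn.Linear(256, text_emb.shape[-1]),
).to(device)
for p in tail.parameters():
    p.requires_grad_(False)

for _ in range(100):  # mirrors --total_it 100
    f_aug = pin(f, mu_t, sigma_t)  # pin() as sketched in the overview
    img_emb = tail(f_aug)
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    loss = (1 - torch.nn.functional.cosine_similarity(img_emb, text_emb)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The optimized (mu_t, sigma_t) pair is what --save_dir collects.
```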
3) Adaptation using the saved augmented statistics:
python3 main.py \
--dataset <source_dataset> \
--data_root <path_to_source_dataset> \
--ckpt <path_to_source_checkpoint> \
--batch_size 8 \
--lr 0.01 \
--ckpts_path adapted \
--freeze_BB \
--train_aug \
--total_itrs 2000 \
--path_mu_sig <path_to_augmented_statistics>
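During adaptation, the saved statistics can be consumed by re-stylizing each source feature batch with one randomly drawn (mu, sigma) pair before it reaches the segmentation head. A hypothetical sketch (the file name, storage format, and `stylize` helper are assumptions; `pin()` is the helper from the overview):

```python
import random
import torch

# Assumed file name and format; the files under --path_mu_sig may differ.
stats = torch.load("augmented_stats.pth")
mus, sigmas = stats["mu"], stats["sigma"]  # assumed shapes: (N, C) each

def stylize(f):
    """Apply one randomly drawn target-style pair to a source feature batch."""
    i = random.randrange(len(mus))
    return pin(f, mus[i], sigmas[i])
```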
To evaluate a trained model on a target dataset:
python3 main.py \
--dataset <dataset_name> \
--data_root <dataset_path> \
--ckpt <path_to_tested_model> \
--test_only \
--val_batch_size 1 \
--ACDC_sub <ACDC_subset_if_tested_on_ACDC>
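Segmentation results are reported in mIoU. For reference, a generic confusion-matrix implementation of the metric (not the repository's exact evaluation code):

```python
import numpy as np

def update_conf(conf, pred, gt, num_classes, ignore_index=255):
    """Accumulate one prediction/ground-truth pair into the confusion matrix."""
    mask = gt != ignore_index
    idx = num_classes * gt[mask].astype(int) + pred[mask].astype(int)
    conf += np.bincount(idx, minlength=num_classes**2).reshape(num_classes, num_classes)
    return conf

def mean_iou(conf):
    """Per-class IoU = TP / (TP + FP + FN), averaged over all classes."""
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    return (tp / np.maximum(union, 1)).mean()
```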
To test any model on any image and visualize the output, add the images to the predict_test directory and run:
python3 predict.py \
--ckpt <ckpt_path> \
--save_val_results_to <directory_for_saved_output_images>
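predict.py saves the output images itself; if you need to colorize raw label maps in your own scripts, a generic helper using the standard 19-class Cityscapes train-ID palette (illustrative, not part of the repository) could look like:

```python
import numpy as np
from PIL import Image

# Standard Cityscapes train-ID palette (road, sidewalk, ..., bicycle).
PALETTE = np.array([
    [128, 64, 128], [244, 35, 232], [70, 70, 70], [102, 102, 156],
    [190, 153, 153], [153, 153, 153], [250, 170, 30], [220, 220, 0],
    [107, 142, 35], [152, 251, 152], [70, 130, 180], [220, 20, 60],
    [255, 0, 0], [0, 0, 142], [0, 0, 70], [0, 60, 100],
    [0, 80, 100], [0, 0, 230], [119, 11, 32]], dtype=np.uint8)

def colorize(label_map: np.ndarray) -> Image.Image:
    """Map an (H, W) array of Cityscapes train IDs (0..18) to an RGB image."""
    return Image.fromarray(PALETTE[label_map])
```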
PØDA for uncommon driving situations
Our feature augmentation is task-agnostic, as it operates at the feature-extractor level. We show some results of PØDA for object detection. The metric is mAP (%).
[08/07/2024] For the Night-Clear and Day-Foggy results, we corrected the evaluation bug from the original paper, where the test split was mistakenly used instead of the train split for testing.
PØDA is released under the Apache 2.0 license.
The code heavily borrows from this implementation of DeepLabv3+ and uses code from CLIP.