This is the official implementation of our CVPR'22 paper, End-to-End Semi-Supervised Learning for Video Action Detection.
Use the following commands to train the model with variance maps and gradient maps, respectively:
python main.py --epochs 100 --bs 8 --loc_loss dice --lr 1e-4\
 --pkl_file_label train_annots_20_labeled.pkl\
 --pkl_file_unlabel train_annots_80_unlabeled.pkl\
 --wt_loc 1 --wt_cls 1 --wt_cons 0.1\
 --const_loss l2\
 --bv --n_frames 5 --thresh_epoch 11\
 --exp_id cyclic_variance_maps
python main.py --epochs 100 --bs 8 --loc_loss dice --lr 1e-4\
 --pkl_file_label train_annots_20_labeled.pkl\
 --pkl_file_unlabel train_annots_80_unlabeled.pkl\
 --wt_loc 1 --wt_cls 1 --wt_cons 0.1\
 --const_loss l2\
 --gv\
 --exp_id gradient_maps
Parameter explanations:
 - bv - Temporal Variance Attentive Mask
 - gv - Gradient Smoothness Attentive Mask
 - wt_loc - Weight for localization loss
 - wt_cls - Weight for classification loss
 - wt_cons - Weight for consistency loss
 - exp_id - Experiment id to set the folder name for saving checkpoints
 - pkl_file_label - Labeled subset
 - pkl_file_unlabel - Unlabeled subset
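
To illustrate how the three loss weights might interact, here is a minimal sketch that assumes the total training loss is a plain weighted sum of the localization, classification, and consistency terms. That linear form and the function name combine_losses are assumptions made for illustration; they are not taken from main.py. The default weights are the values passed in the training commands above.

```python
def combine_losses(loc_loss, cls_loss, cons_loss,
                   wt_loc=1.0, wt_cls=1.0, wt_cons=0.1):
    """Weighted sum of three loss terms.

    Assumed form for illustration only; the actual combination
    used by the training script may differ.
    """
    return wt_loc * loc_loss + wt_cls * cls_loss + wt_cons * cons_loss


# Made-up loss values, combined with the weights from the training commands.
total = combine_losses(loc_loss=0.5, cls_loss=1.0, cons_loss=2.0)
print(f"total loss: {total:.2f}")
```

Under this assumption, a wt_cons of 0.1 means the consistency term contributes one tenth as strongly as the two supervised terms for the same raw loss value.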
 
To evaluate a trained model, pass the experiment folder that holds its checkpoints:
python evaluate.py --ckpt exp_id_folder
Link to download I3D pre-trained weights:
https://github.com/piergiaj/pytorch-i3d/tree/master/models
We used rgb_charades.pt in our experiments.
UCF101-24 splits: Pickle files
JHMDB-21 splits: Text files
Set the data path for the UCF101 videos in ucf_dataloader.py, inside the datasets directory.
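
Since the UCF101-24 splits are distributed as pickle files, it can help to check that a split loads correctly before starting a training run. The sketch below only reports the type and size of the outermost object, because the internal layout of the split files is not described here. The helper name describe_split is hypothetical; the file name is the one used in the training commands.

```python
import pickle
from pathlib import Path


def describe_split(pkl_path):
    """Load a pickle file and return (top-level type name, length or None).

    Only load pickle files from sources you trust, since unpickling
    can execute arbitrary code.
    """
    with Path(pkl_path).open("rb") as f:
        data = pickle.load(f)
    size = len(data) if hasattr(data, "__len__") else None
    return type(data).__name__, size


if __name__ == "__main__":
    # File name taken from the training command; adjust the path as needed.
    split_file = Path("train_annots_20_labeled.pkl")
    if split_file.exists():
        kind, size = describe_split(split_file)
        print(f"{split_file.name}: {kind} with {size} entries")
    else:
        print(f"{split_file} not found; download the split files first")
```

If the file loads and reports a non-zero size, the path passed to --pkl_file_label or --pkl_file_unlabel is at least readable.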
If you find this work useful, please consider citing the following paper:
@InProceedings{Kumar_2022_CVPR,
    author    = {Kumar, Akash and Rawat, Yogesh Singh},
    title     = {End-to-End Semi-Supervised Learning for Video Action Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {14700-14710}
}

