Just ask: Learning to answer questions from millions of narrated videos A Yang, A Miech, J Sivic, I Laptev, C Schmid Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 189 | 2021 |
NAS evaluation is frustratingly hard A Yang, PM Esperança, FM Carlucci International Conference on Learning Representations, 2020 | 179 | 2020 |
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models A Yang, A Miech, J Sivic, I Laptev, C Schmid Advances in Neural Information Processing Systems 35, 124-141, 2022 | 71 | 2022 |
TubeDETR: Spatio-Temporal Video Grounding with Transformers A Yang, A Miech, J Sivic, I Laptev, C Schmid Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 55 | 2022 |
Vid2Seq: Large-scale pretraining of a visual language model for dense video captioning A Yang, A Nagrani, PH Seo, A Miech, J Pont-Tuset, I Laptev, J Sivic, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 43 | 2023 |
MANAS: multi-agent neural architecture search V Lopes, FM Carlucci, P Esperanca, M Singh, A Yang, V Gabillon, H Xu, ... Machine Learning, 1-24, 2023 | 23 | 2023 |
Learning to Answer Visual Questions from Web Videos A Yang, A Miech, J Sivic, I Laptev, C Schmid IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022 | 20 | 2022 |
Just ask: Learning to answer questions from millions of narrated videos A Yang, A Miech, J Sivic, I Laptev, C Schmid 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 1666-1677, 2021 | 5 | 2021 |
VidChapters-7M: Video Chapters at Scale A Yang, A Nagrani, I Laptev, J Sivic, C Schmid arXiv preprint arXiv:2309.13952, 2023 | 3 | 2023 |
CoVR: Learning composed video retrieval from web video captions L Ventura, A Yang, C Schmid, G Varol arXiv preprint arXiv:2308.14746, 2023 | 2 | 2023 |
Learning Visual Language Models for Video Understanding A Yang École Normale Supérieure de Paris-ENS Paris, 2023 | | 2023 |
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models Supplementary Material A Yang, A Miech, J Sivic, I Laptev, C Schmid | | |
TubeDETR: Spatio-Temporal Video Grounding with Transformers Supplementary Material A Yang, A Miech, J Sivic, I Laptev, C Schmid | | |
Just Ask: Learning to Answer Questions from Millions of Narrated Videos Supplementary Material A Yang, A Miech, J Sivic, I Laptev, C Schmid | | |