Open-vocabulary multi-label classification via multi-modal knowledge transfer | S He, T Guo, T Dai, R Qiao, X Shu, B Ren, ST Xia | Proceedings of the AAAI Conference on Artificial Intelligence 37 (1), 808-816, 2023 | 23 | 2023 |
VLMAE: Vision-language masked autoencoder | S He, T Guo, T Dai, R Qiao, C Wu, X Shu, B Ren | arXiv preprint arXiv:2208.09374, 2022 | 11 | 2022 |
Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Understanding | T Zhang, S He, T Dai, Z Wang, B Chen, ST Xia | Proceedings of the AAAI Conference on Artificial Intelligence 38 (7), 7296-7304, 2024 | 2 | 2024 |
Examining user-friendly and open-sourced large GPT models: A survey on language, multimodal, and scientific GPT models | K Gao, S He, Z He, J Lin, QZ Pei, J Shao, W Zhang | arXiv preprint arXiv:2308.14149, 2023 | 2 | 2023 |
D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation | H Li, X Shu, S He, R Qiao, W Wen, T Guo, B Gan, X Sun | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 2 | 2023 |
Exploiting Feature Diversity for Make-up Temporal Video Grounding | X Shu, W Wen, T Guo, S He, C Wu, R Qiao | Proceedings of the 4th on Person in Context Workshop, 29-32, 2022 | 1 | 2022 |
MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning | S He, Y Nie, Z Chen, Z Cai, H Wang, S Yang, H Chen | arXiv preprint arXiv:2404.15127, 2024 | | 2024 |