RIFormer: Keep your vision backbone effective but removing token mixer J Wang, S Zhang, Y Liu, T Wu, Y Yang, X Liu, K Chen, P Luo, D Lin Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 28 | 2023 |
TencentPretrain: A scalable and flexible toolkit for pre-training models of different modalities Z Zhao, Y Li, C Hou, J Zhao, R Tian, W Liu, Y Chen, N Sun, H Liu, W Mao, ... arXiv preprint arXiv:2212.06385, 2022 | 19 | 2022 |
SynGen: A syntactic plug-and-play module for generative aspect-based sentiment analysis C Yu, T Wu, J Li, X Bai, Y Yang ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 16 | 2023 |
Rethinking Kullback-Leibler divergence in knowledge distillation for large language models T Wu, C Tao, J Wang, R Yang, Z Zhao, N Wong arXiv preprint arXiv:2404.02657, 2024 | 11 | 2024 |
Modeling fine-grained information via knowledge-aware hierarchical graph for zero-shot entity retrieval T Wu, X Bai, W Guo, W Liu, S Li, Y Yang Proceedings of the Sixteenth ACM International Conference on Web Search and …, 2023 | 11 | 2023 |
Weight-inherited distillation for task-agnostic BERT compression T Wu, C Hou, S Lao, J Li, N Wong, Z Zhao, Y Yang arXiv preprint arXiv:2305.09098, 2023 | 7 | 2023 |
Edge-free but structure-aware: Prototype-guided knowledge distillation from GNNs to MLPs T Wu, Z Zhao, J Wang, X Bai, L Wang, N Wong, Y Yang arXiv preprint arXiv:2303.13763, 2023 | 7 | 2023 |
Mixture-of-Subspaces in Low-Rank Adaptation T Wu, J Wang, Z Zhao, N Wong arXiv preprint arXiv:2406.11909, 2024 | 6 | 2024 |
Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast C Shi, C Yang, X Zhu, J Wang, T Wu, S Li, D Cai, Y Yang, Y Meng arXiv preprint arXiv:2405.14507, 2024 | 4 | 2024 |
Adapting LLaMA Decoder to Vision Transformer J Wang, W Shao, M Chen, C Wu, Y Liu, T Wu, K Zhang, S Zhang, K Chen, ... arXiv preprint arXiv:2404.06773, 2024 | 3 | 2024 |
Prompt-based Model for Acronym Disambiguation via Negative Sampling T Wu, X Bai, Y Yang AAAI 2022 workshop SDU@2022, 2022 | 3 | 2022 |
RIFormer: Keep your vision backbone effective while removing token mixer J Wang, S Zhang, Y Liu, T Wu, Y Yang, X Liu, K Chen, P Luo, D Lin arXiv preprint arXiv:2304.05659, 2023 | 2 | 2023 |
A survey on the honesty of large language models S Li, C Yang, T Wu, C Shi, Y Zhang, X Zhu, Z Cheng, D Cai, M Yu, L Liu, ... arXiv preprint arXiv:2409.18786, 2024 | 1 | 2024 |
Multi-stage Distillation Framework for Cross-Lingual Semantic Similarity Matching K Ding, W Liu, Y Fang, Z Zhao, Q Ju, X Yang, R Tian, Z Tao, H Liu, H Guo, ... NAACL 2022 Findings, 2022 | 1 | 2022 |
LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models R Yang, T Wu, J Wang, P Hu, N Wong, Y Yang arXiv preprint arXiv:2411.06839, 2024 | | 2024 |
Autoregressive Models in Vision: A Survey J Xiong, G Liu, L Huang, C Wu, T Wu, Y Mu, Y Yao, H Shen, Z Wan, ... arXiv preprint arXiv:2411.05902, 2024 | | 2024 |
MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers Z Yang, R Chen, T Wu, N Wong, Y Liang, R Wang, R Huang, M Li arXiv preprint arXiv:2410.17957, 2024 | | 2024 |
LoCa: Logit Calibration for Knowledge Distillation R Yang, T Wu, Y Yang arXiv preprint arXiv:2409.04778, 2024 | | 2024 |
Recouple Event Field via Probabilistic Bias for Event Extraction X Bai, T Wu, H Guo, Z Zhao, X Yang, J Li, W Liu, Q Ju, W Guo, Y Yang ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | | 2023 |
Overview of the NLPCC 2021 Shared Task: AutoIE2 W Guo, X Yang, X Bai, T Wu, W Liu, Z Zhao, Q Ju, Y Yang Natural Language Processing and Chinese Computing: 10th CCF International …, 2021 | | 2021 |