RoBERTa: A robustly optimized BERT pretraining approach Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 20062* | 2019 |
BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension M Lewis, Y Liu, N Goyal, M Ghazvininejad, A Mohamed, O Levy, ... arXiv preprint arXiv:1910.13461, 2019 | 7532 | 2019 |
Multilingual denoising pre-training for neural machine translation Y Liu, J Gu, N Goyal, X Li, S Edunov, M Ghazvininejad, M Lewis, ... Transactions of the Association for Computational Linguistics 8, 726-742, 2020 | 1286 | 2020 |
Hierarchical neural story generation A Fan, M Lewis, Y Dauphin arXiv preprint arXiv:1805.04833, 2018 | 1253 | 2018 |
Retrieval-augmented generation for knowledge-intensive NLP tasks P Lewis, E Perez, A Piktus, F Petroni, V Karpukhin, N Goyal, H Küttler, ... Advances in Neural Information Processing Systems 33, 9459-9474, 2020 | 1116 | 2020 |
End-to-end neural coreference resolution K Lee, L He, M Lewis, L Zettlemoyer arXiv preprint arXiv:1707.07045, 2017 | 1031 | 2017 |
Deep semantic role labeling: What works and what’s next L He, K Lee, M Lewis, L Zettlemoyer Proceedings of the 55th Annual Meeting of the Association for Computational …, 2017 | 531 | 2017 |
Rethinking the role of demonstrations: What makes in-context learning work? S Min, X Lyu, A Holtzman, M Artetxe, M Lewis, H Hajishirzi, L Zettlemoyer arXiv preprint arXiv:2202.12837, 2022 | 494 | 2022 |
Generalization through memorization: Nearest neighbor language models U Khandelwal, O Levy, D Jurafsky, L Zettlemoyer, M Lewis arXiv preprint arXiv:1911.00172, 2019 | 461 | 2019 |
Deal or no deal? End-to-end learning for negotiation dialogues M Lewis, D Yarats, YN Dauphin, D Parikh, D Batra arXiv preprint arXiv:1706.05125, 2017 | 460 | 2017 |
Asking and answering questions to evaluate the factual consistency of summaries A Wang, K Cho, M Lewis arXiv preprint arXiv:2004.04228, 2020 | 320 | 2020 |
Cross-lingual transfer learning for multilingual task oriented dialog S Schuster, S Gupta, R Shah, M Lewis arXiv preprint arXiv:1810.13327, 2018 | 262 | 2018 |
InCoder: A generative model for code infilling and synthesis D Fried, A Aghajanyan, J Lin, S Wang, E Wallace, F Shi, R Zhong, W Yih, ... arXiv preprint arXiv:2204.05999, 2022 | 235 | 2022 |
A corpus of natural language for visual reasoning A Suhr, M Lewis, J Yeh, Y Artzi Proceedings of the 55th Annual Meeting of the Association for Computational …, 2017 | 225 | 2017 |
Question-answer driven semantic role labeling: Using natural language to annotate natural language L He, M Lewis, L Zettlemoyer Proceedings of the 2015 conference on empirical methods in natural language …, 2015 | 222 | 2015 |
LLM.int8(): 8-bit matrix multiplication for transformers at scale T Dettmers, M Lewis, Y Belkada, L Zettlemoyer arXiv preprint arXiv:2208.07339, 2022 | 209* | 2022 |
Strategies for structuring story generation A Fan, M Lewis, Y Dauphin arXiv preprint arXiv:1902.01109, 2019 | 209 | 2019 |
Train short, test long: Attention with linear biases enables input length extrapolation O Press, NA Smith, M Lewis arXiv preprint arXiv:2108.12409, 2021 | 201 | 2021 |
Nearest neighbor machine translation U Khandelwal, A Fan, D Jurafsky, L Zettlemoyer, M Lewis arXiv preprint arXiv:2010.00710, 2020 | 200 | 2020 |
Combined distributional and logical semantics M Lewis, M Steedman Transactions of the Association for Computational Linguistics 1, 179-192, 2013 | 191 | 2013 |