Unsupervised learning of sentence embeddings using compositional n-gram features M Pagliardini, P Gupta, M Jaggi NAACL-HLT, 2018, 2017 | 861 | 2017 |
Agree to disagree: Diversity through disagreement for better transferability M Pagliardini, M Jaggi, F Fleuret, SP Karimireddy ICLR 2023, 2022 | 50 | 2022 |
Better word embeddings by disentangling contextual n-gram information P Gupta, M Pagliardini, M Jaggi NAACL-HLT, 2019, 2019 | 45 | 2019 |
MEDITRON-70B: Scaling medical pretraining for large language models Z Chen, AH Cano, A Romanou, A Bonnet, K Matoba, F Salvi, ... arXiv preprint arXiv:2311.16079, 2023 | 38 | 2023 |
Taming GANs with lookahead T Chavdarova, M Pagliardini, SU Stich, M Jaggi, F Fleuret ICLR 2021, 2020 | 34* | 2020 |
The peril of popular deep learning uncertainty estimation methods Y Liu, M Pagliardini, T Chavdarova, SU Stich Bayesian Deep Learning workshop, at NeurIPS 2021, 2021 | 15 | 2021 |
Unsupervised learning of sentence embeddings using compositional n-gram features (2017) M Pagliardini, P Gupta, M Jaggi arXiv preprint arXiv:1703.02507, 2017 | 10 | 2017 |
Fast Attention Over Long Sequences With Dynamic Sparse Flash Attention M Pagliardini, D Paliotta, M Jaggi, F Fleuret Advances in Neural Information Processing Systems 36, 2024 | 7* | 2024 |
Improving generalization via uncertainty driven perturbations M Pagliardini, G Manunza, M Jaggi, MI Jordan, T Chavdarova arXiv preprint arXiv:2202.05737, 2022 | 3 | 2022 |
DoGE: Domain reweighting with generalization estimation S Fan, M Pagliardini, M Jaggi arXiv preprint arXiv:2310.15393, 2023 | 2 | 2023 |
A Primal-Dual Approach to Solving Variational Inequalities with General Constraints T Chavdarova, T Yang, M Pagliardini, M Jordan The Twelfth International Conference on Learning Representations, 2023 | 2* | 2023 |
CoTFormer: More Tokens With Attention Make Up For Less Depth A Mohtashami, M Pagliardini, M Jaggi arXiv preprint arXiv:2310.10845, 2023 | 1 | 2023 |
Fast causal attention with dynamic sparsity D Paliotta, M Pagliardini, M Jaggi, F Fleuret Workshop on Efficient Systems for Foundation Models @ ICML 2023, 2023 | 1 | 2023 |
Diversity through Disagreement for Better Transferability M Pagliardini, M Jaggi, F Fleuret, SP Karimireddy NeurIPS 2022 Workshop on Distribution Shifts: Connecting Methods and …, 2022 | 1 | 2022 |
Improved generalization-robustness trade-off via uncertainty targeted attacks M Pagliardini, G Manunza, M Jaggi, T Chavdarova | 1 | 2021 |
DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging M Pagliardini, A Mohtashami, F Fleuret, M Jaggi arXiv preprint arXiv:2402.02622, 2024 | | 2024 |
MLO J Bachmann Ona, SA Bahreinian, LF Barba Flores, WA Ben Naceur, ... | | |