Juan Ciro
Software engineer, MLCommons
Verified email at unal.edu.co
Title
Cited by
Year
Dataperf: Benchmarks for data-centric ai development
M Mazumder, C Banbury, X Yao, B Karlaš, W Gaviria Rojas, S Diamos, ...
Advances in Neural Information Processing Systems 36, 2024
73
2024
The people's speech: A large-scale diverse english speech recognition dataset for commercial usage
D Galvez, G Diamos, J Ciro, JF Cerón, K Achorn, A Gopi, D Kanter, M Lam, ...
arXiv preprint arXiv:2111.09344, 2021
55
2021
Multilingual spoken words corpus
M Mazumder, S Chitlangia, C Banbury, Y Kang, JM Ciro, K Achorn, ...
Thirty-fifth Conference on Neural Information Processing Systems Datasets …, 2021
41
2021
Findings of the BabyLM Challenge: Sample-efficient pretraining on developmentally plausible corpora
A Warstadt, A Mueller, L Choshen, E Wilcox, C Zhuang, J Ciro, ...
Proceedings of the BabyLM Challenge at the 27th Conference on Computational …, 2023
34
2023
Dataperf: Benchmarks for data-centric ai development, 2022
M Mazumder, C Banbury, X Yao, B Karlaš, WG Rojas, S Diamos, ...
URL https://arxiv.org/abs/2207.10062, 0
6
Adversarial nibbler: A data-centric challenge for improving the safety of text-to-image models
A Parrish, HR Kirk, J Quaye, C Rastogi, M Bartolo, O Inel, J Ciro, ...
arXiv preprint arXiv:2305.14384, 2023
3
2023
LSH methods for data deduplication in a Wikipedia artificial dataset
J Ciro, D Galvez, T Schlippe, D Kanter
arXiv preprint arXiv:2112.11478, 2021
1
2021
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
HR Kirk, A Whitefield, P Röttger, A Bean, K Margatina, J Ciro, R Mosquera, ...
arXiv preprint arXiv:2404.16019, 2024
2024
Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation
J Quaye, A Parrish, O Inel, C Rastogi, HR Kirk, M Kahng, E van Liemt, ...
arXiv preprint arXiv:2403.12075, 2024
2024
Speech Wikimedia: A 77 Language Multilingual Speech Dataset
RM Gómez, J Eusse, J Ciro, D Galvez, R Hileman, K Bollacker, D Kanter
arXiv preprint arXiv:2308.15710, 2023
2023
Speech Wikimedia: A 77 Language Multilingual Speech Dataset
R Mosquera Gómez, J Eusse, J Ciro, D Galvez, R Hileman, K Bollacker, ...
arXiv e-prints, arXiv: 2308.15710, 2023
2023
Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning
A Warstadt, A Mueller, L Choshen, E Wilcox, C Zhuang, J Ciro, R Mosquera, ...
BabyLM Challenge at the 27th Conference on Computational Natural Language …, 2023
2023
Articles 1–12