Geoffrey Cideron

Cited by

	All	Since 2019
Citations	1260	1260
h-index	6	6
i10-index	5	5

Citations per year: 2020: 10, 2021: 19, 2022: 37, 2023: 80, 2024: 1110

Public access (based on funding mandates)

2 articles available
0 articles not available

Co-authors

Olivier Pietquin, Cohere | ex Google DeepMind (On leave - Professor at University of Lille), Verified email at univ-lille.fr
Mathieu Seurin, De Vinci Research Center (DVRC), Verified email at devinci.fr
Florian STRUB, Cohere, Verified email at cohere.com

Geoffrey Cideron

Google DeepMind

Verified email at google.com

Reinforcement Learning, Natural Language Processing


Title	Cited by	Year
Gemini: a family of highly capable multimodal models. G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023	1042	2023
QD-RL: Efficient mixing of quality and diversity in reinforcement learning. G Cideron, T Pierrot, N Perrin, K Beguir, O Sigaud. arXiv preprint arXiv:2006.08505, 28-73, 2020	83*	2020
HIGhER: Improving instruction following with hindsight generation for experience replay. G Cideron, M Seurin, F Strub, O Pietquin. 2020 IEEE Symposium Series on Computational Intelligence (SSCI), 225-232, 2020	53*	2020
Factually consistent summarization via reinforcement learning with textual entailment feedback. P Roit, J Ferret, L Shani, R Aharoni, G Cideron, R Dadashi, M Geist, ... arXiv preprint arXiv:2306.00186, 2023	44	2023
WARM: On the benefits of weight averaged reward models. A Ramé, N Vieillard, L Hussenot, R Dadashi, G Cideron, O Bachem, ... arXiv preprint arXiv:2401.12187, 2024	26	2024
MusicRL: Aligning music generation to human preferences. G Cideron, S Girgin, M Verzetti, D Vincent, M Kastelic, Z Borsos, ... arXiv preprint arXiv:2402.04229, 2024	6	2024
Get back here: Robust imitation by return-to-distribution planning. G Cideron, B Tabanpour, S Curi, S Girgin, L Hussenot, G Dulac-Arnold, ... arXiv preprint arXiv:2305.01400, 2023	4	2023
vec2text with round-trip translations. G Cideron, S Girgin, A Raichuk, O Pietquin, O Bachem, L Hussenot. arXiv preprint arXiv:2209.06792, 2022	2	2022
Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning. K Wang, R Kidambi, R Sullivan, A Agarwal, C Dann, A Michi, M Gelmi, ... arXiv preprint arXiv:2407.15762, 2024		2024
BOND: Aligning LLMs with Best-of-N Distillation. PG Sessa, R Dadashi, L Hussenot, J Ferret, N Vieillard, A Ramé, ... arXiv preprint arXiv:2407.14622, 2024		2024
