Yunhao Tang

Cited by

	All	Since 2019
Citations	2473	2464
h-index	19	19
i10-index	28	28

Citations per year

Year	Citations
2018	7
2019	40
2020	126
2021	185
2022	241
2023	354
2024	1511

Public access

5 articles available
0 articles not available

Based on funding mandates

Co-authors

Name	Affiliation	Email
Rémi Munos	Google DeepMind	Verified email at inria.fr
Michal Valko	Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind	Verified email at meta.com
Krzysztof Choromanski	Google Brain Robotics New York & Columbia University	Verified email at columbia.edu
Mark Rowland	Research Scientist, Google DeepMind	Verified email at google.com
Aldo Pacchiano	Broad Institute of MIT and Harvard	Verified email at broadinstitute.org
Will Dabney	DeepMind	Verified email at google.com
Zhaohan Daniel Guo	DeepMind	Verified email at google.com
Daniele Calandriello	Research Scientist, DeepMind	Verified email at google.com
Bilal Piot	Google Deepmind	Verified email at google.com
Mohammad Gheshlaghi Azar	Cohere	Verified email at google.com
Shipra Agrawal	Columbia university	Verified email at columbia.edu
Tadashi Kozuno	OMRON SINIC X	Verified email at alumni.oist.jp
Tamás Sarlós	Google	Verified email at google.com
Vikas Sindhwani	Google DeepMind Robotics	Verified email at google.com
Wenbo Gao	Columbia University	Verified email at columbia.edu
Florent Altché	Research Engineer, DeepMind	Verified email at google.com
Marc G. Bellemare	Reliant AI, prev. Google Brain, DeepMind	Verified email at reliant.ai
Yuri Faenza	Associate Professor, IEOR, Columbia University	Verified email at columbia.edu
Alp Kucukelbir	Adjunct Professor of Computer Science, Columbia University	Verified email at cs.columbia.edu
Adrian Weller	Director of Research, Machine Learning, University of Cambridge	Verified email at eng.cam.ac.uk

Yunhao Tang

Research Scientist, DeepMind

Verified email at columbia.edu

Reinforcement Learning


Title	Authors	Venue	Cited by	Year
Gemini: a family of highly capable multimodal models	G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...	arXiv preprint arXiv:2312.11805, 2023	1042	2023
Reinforcement learning for integer programming: Learning to cut	Y Tang, S Agrawal, Y Faenza	International Conference on Machine Learning, 9367-9376, 2020	203	2020
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ...	arXiv preprint arXiv:2403.05530, 2024	196	2024
ES-MAML: Simple Hessian-free meta learning	X Song, W Gao, Y Yang, K Choromanski, A Pacchiano, Y Tang	arXiv preprint arXiv:1910.01215, 2019	132	2019
Discretizing continuous action space for on-policy optimization	Y Tang, S Agrawal	Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 5981-5988, 2020	121	2020
Monte-Carlo tree search as regularized policy optimization	JB Grill, F Altché, Y Tang, T Hubert, M Valko, I Antonoglou, R Munos	International Conference on Machine Learning, 3769-3778, 2020	73	2020
BYOL-Explore: Exploration by bootstrapped prediction	Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ...	Advances in Neural Information Processing Systems 35, 31855-31870, 2022	58	2022
From complexity to simplicity: Adaptive ES-active subspaces for blackbox optimization	KM Choromanski, A Pacchiano, J Parker-Holder, Y Tang, V Sindhwani	Advances in Neural Information Processing Systems 32, 2019	50	2019
Orthogonal estimation of Wasserstein distances	M Rowland, J Hron, Y Tang, K Choromanski, T Sarlos, A Weller	The 22nd International Conference on Artificial Intelligence and Statistics …, 2019	47	2019
Nash learning from human feedback	R Munos, M Valko, D Calandriello, MG Azar, M Rowland, ZD Guo, Y Tang, ...	arXiv preprint arXiv:2312.00886, 2023	46	2023
Provably robust blackbox optimization for reinforcement learning	K Choromanski, A Pacchiano, J Parker-Holder, Y Tang, D Jain, Y Yang, ...	CoRR, abs/1903.02993, 2019	42	2019
Learning to Score Behaviors for Guided Policy Optimization	A Pacchiano, J Parker-Holder, Y Tang, A Choromanska, K Choromanski, ...	arXiv preprint arXiv:1906.04349, 2019	41	2019
Exploration by distributional reinforcement learning	Y Tang, S Agrawal	arXiv preprint arXiv:1805.01907, 2018	40	2018
Boosting trust region policy optimization by normalizing flows policy	Y Tang, S Agrawal	arXiv preprint arXiv:1809.10326, 2018	33	2018
Understanding self-predictive learning for reinforcement learning	Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ...	International Conference on Machine Learning, 33632-33656, 2023	27	2023
Generalized preference optimization: A unified approach to offline alignment	Y Tang, ZD Guo, Z Zheng, D Calandriello, R Munos, M Rowland, ...	arXiv preprint arXiv:2402.05749, 2024	23	2024
Self-imitation learning via generalized lower bound Q-learning	Y Tang	Advances in Neural Information Processing Systems 33, 13964-13975, 2020	23	2020
Revisiting Peng’s Q(λ) for Modern Reinforcement Learning	T Kozuno, Y Tang, M Rowland, R Munos, S Kapturowski, W Dabney, ...	International Conference on Machine Learning, 5794-5804, 2021	22	2021
Taylor expansion policy optimization	Y Tang, M Valko, R Munos	International Conference on Machine Learning, 9397-9406, 2020	20	2020
Hindsight expectation maximization for goal-conditioned reinforcement learning	Y Tang, A Kucukelbir	International Conference on Artificial Intelligence and Statistics, 2863-2871, 2021	19	2021
