16-qubit IBM universal quantum computer can be fully entangled
Y Wang, Y Li, Z Yin, B Zeng
npj Quantum Information 4 (1), 1-6, 2018
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
K Dong, Y Wang, X Chen, L Wang
International Conference on Learning Representations 2020, 2019
On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach
Y Wang, G Zhang, J Ba
International Conference on Learning Representations 2020, 2019
V-Learning—A Simple, Efficient, Decentralized Algorithm for Multiagent Reinforcement Learning
C Jin, Q Liu, Y Wang, T Yu
Mathematics of Operations Research, 2023
Distributed bandit learning: Near-optimal regret with efficient communication
Y Wang, J Hu, X Chen, L Wang
arXiv preprint arXiv:1904.06309, 2019
Online learning in unknown Markov games
Y Tian, Y Wang, T Yu, S Sra
International Conference on Machine Learning, 10279-10288, 2021
Improved Algorithms for Convex-Concave Minimax Optimization
Y Wang, J Li
Neural Information Processing Systems 2020, 2020
Near-optimal local convergence of alternating gradient descent-ascent for minimax optimization
G Zhang, Y Wang, L Lessard, RB Grosse
International Conference on Artificial Intelligence and Statistics, 7659-7679, 2022
An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap
Y Wang, R Wang, S Kakade
Advances in Neural Information Processing Systems 34, 2021
Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation
Y Wang, Q Liu, Y Bai, C Jin
Conference on Learning Theory 2023, 2023
Is RLHF More Difficult than Standard RL? A Theoretical Perspective
Y Wang, Q Liu, C Jin
Thirty-seventh Conference on Neural Information Processing Systems, 2023
On the suboptimality of negative momentum for minimax optimization
G Zhang, Y Wang
International Conference on Artificial Intelligence and Statistics, 2020
Learning Markov games with adversarial opponents: Efficient algorithms and fundamental limits
Q Liu, Y Wang, C Jin
Proceedings of the 39th International Conference on Machine Learning, PMLR …, 2022
Directional Smoothness and Gradient Methods: Convergence and Adaptivity
A Mishkin, A Khaled, Y Wang, A Defazio, RM Gower
arXiv preprint arXiv:2403.04081, 2024
Learning Rationalizable Equilibria in Multiplayer Games
Y Wang, D Kong, Y Bai, C Jin
arXiv preprint arXiv:2210.11402, 2022
