Tian Xu
Verified email at lamda.nju.edu.cn - Homepage
Title · Cited by · Year
Error bounds of imitating policies and environments
T Xu, Z Li, Y Yu
Advances in Neural Information Processing Systems 33, 15737-15749, 2020
Cited by 88* · 2020
A survey on model-based reinforcement learning
FM Luo, T Xu, H Lai, XH Chen, W Zhang, Y Yu
Science China Information Sciences 67 (2), 121101, 2024
Cited by 62 · 2024
Error bounds of imitating policies and environments for reinforcement learning
T Xu, Z Li, Y Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (10), 6968 …, 2021
Cited by 24 · 2021
Rethinking ValueDice: Does it really improve performance?
Z Li, T Xu, Y Yu, ZQ Luo
arXiv preprint arXiv:2202.02468, 2022
Cited by 14 · 2022
A Survey on Model-based Reinforcement Learning
FM Luo, T Xu, H Lai, XH Chen, W Zhang, Y Yu
2022
Cited by 5 · 2022
Understanding adversarial imitation learning in small sample regime: A stage-coupled analysis
T Xu, Z Li, Y Yu, ZQ Luo
arXiv preprint arXiv:2208.01899, 2022
Cited by 4 · 2022
On generalization of adversarial imitation learning and beyond
T Xu, Z Li, Y Yu, ZQ Luo
arXiv preprint arXiv:2106.10424, 2021
Cited by 4 · 2021
ReMax: A simple, effective, and efficient method for aligning large language models
Z Li, T Xu, Y Zhang, Y Yu, R Sun, ZQ Luo
arXiv preprint arXiv:2310.10505, 2023
Cited by 3 · 2023
Policy Optimization in RLHF: The Impact of Out-of-preference Data
Z Li, T Xu, Y Yu
arXiv preprint arXiv:2312.10584, 2023
Cited by 2 · 2023
Provably efficient adversarial imitation learning with unknown transitions
T Xu, Z Li, Y Yu, ZQ Luo
Uncertainty in Artificial Intelligence, 2367-2378, 2023
Cited by 2 · 2023
Theoretical analysis of offline imitation with supplementary dataset
Z Li, T Xu, Y Yu, ZQ Luo
arXiv preprint arXiv:2301.11687, 2023
Cited by 2 · 2023
A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle
Z Li, T Xu, Y Yu
arXiv preprint arXiv:2203.11489, 2022
Cited by 1 · 2022
Sparsity prior regularized Q-learning for sparse action tasks
JC Pang, T Xu, SY Jiang, YR Liu, Y Yu
arXiv preprint arXiv:2105.08666, 2021
Cited by 1 · 2021
Nearly Minimax Optimal Adversarial Imitation Learning with Known and Unknown Transitions
T Xu, Z Li, Y Yu
CoRR abs/2106.10424, 2021
Cited by 1 · 2021
Model gradient: unified model and policy learning in model-based reinforcement learning
C Jia, F Zhang, T Xu, JC Pang, Z Zhang, Y Yu
Frontiers of Computer Science 18 (4), 184339, 2024
2024
Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
Z Li, T Xu, Z Qin, Y Yu, ZQ Luo
Advances in Neural Information Processing Systems 36, 2024
2024
Offline Imitation Learning without Auxiliary High-quality Behavior Data
JJ Shao, HS Shi, T Xu, LZ Guo, Y Yu, YF Li
2023
Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning
FM Luo, T Xu, X Cao, Y Yu
arXiv preprint arXiv:2310.05422, 2023
2023
Model Generation with Provable Coverability for Offline Reinforcement Learning
C Jia, H Yin, C Gao, T Xu, L Yuan, Z Zhang, Y Yu
arXiv preprint arXiv:2206.00316, 2022
2022
Reinforcement Learning With Sparse-Executing Actions via Sparsity Regularization
JC Pang, T Xu, S Jiang, YR Liu, Y Yu
arXiv preprint arXiv:2105.08666, 2021
2021
Articles 1–20