Lihong Li (李力鸿)
Verified email at amazon.com - Homepage
Title
Cited by
Year
A contextual-bandit approach to personalized news article recommendation
L Li, W Chu, J Langford, RE Schapire
Proceedings of the 19th international conference on World wide web, 661-670, 2010
Cited by: 2397; Year: 2010
Parallelized stochastic gradient descent
M Zinkevich, M Weimer, L Li, A Smola
Advances in neural information processing systems 23, 2010
Cited by: 1368; Year: 2010
An empirical evaluation of Thompson sampling
O Chapelle, L Li
Advances in neural information processing systems 24, 2011
Cited by: 1326; Year: 2011
Contextual bandits with linear payoff functions
W Chu, L Li, L Reyzin, R Schapire
Proceedings of the Fourteenth International Conference on Artificial …, 2011
Cited by: 788; Year: 2011
Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms
L Li, W Chu, J Langford, X Wang
Proceedings of the fourth ACM international conference on Web search and …, 2011
Cited by: 556; Year: 2011
Neural approaches to conversational AI
J Gao, M Galley, L Li
Foundations and Trends® in Information Retrieval 13 (2-3), 127-298, 2019
Cited by: 551; Year: 2019
Sparse Online Learning via Truncated Gradient.
J Langford, L Li, T Zhang
Journal of Machine Learning Research 10 (3), 2009
Cited by: 532; Year: 2009
Doubly robust policy evaluation and learning
M Dudík, J Langford, L Li
arXiv preprint arXiv:1103.4601, 2011
Cited by: 506; Year: 2011
Doubly Robust Policy Evaluation and Learning
M Dudík, J Langford, L Li
Cited by: 506*
PAC model-free reinforcement learning
AL Strehl, L Li, E Wiewiora, J Langford, ML Littman
Proceedings of the 23rd international conference on Machine learning, 881-888, 2006
Cited by: 500; Year: 2006
Doubly robust off-policy value evaluation for reinforcement learning
N Jiang, L Li
International Conference on Machine Learning, 652-661, 2016
Cited by: 408; Year: 2016
Towards a Unified Theory of State Abstraction for MDPs.
L Li, TJ Walsh, ML Littman
ISAIM 4 (5), 9, 2006
Cited by: 403; Year: 2006
Taming the monster: A fast and simple algorithm for contextual bandits
A Agarwal, D Hsu, S Kale, J Langford, L Li, R Schapire
International Conference on Machine Learning, 1638-1646, 2014
Cited by: 382; Year: 2014
Towards end-to-end reinforcement learning of dialogue agents for information access
B Dhingra, L Li, X Li, J Gao, YN Chen, F Ahmed, L Deng
arXiv preprint arXiv:1609.00777, 2016
Cited by: 358*; Year: 2016
End-to-end task-completion neural dialogue systems
X Li, YN Chen, L Li, J Gao, A Celikyilmaz
arXiv preprint arXiv:1703.01008, 2017
Cited by: 325; Year: 2017
Reinforcement Learning in Finite MDPs: PAC Analysis.
AL Strehl, L Li, ML Littman
Journal of Machine Learning Research 10 (11), 2009
Cited by: 307; Year: 2009
Neuro-symbolic program synthesis
E Parisotto, A Mohamed, R Singh, L Li, D Zhou, P Kohli
arXiv preprint arXiv:1611.01855, 2016
Cited by: 286; Year: 2016
Knows what it knows: a framework for self-aware learning
L Li, ML Littman, TJ Walsh
Proceedings of the 25th international conference on Machine learning, 568-575, 2008
Cited by: 275; Year: 2008
Contextual bandit algorithms with supervised learning guarantees
A Beygelzimer, J Langford, L Li, L Reyzin, RE Schapire
arXiv preprint arXiv:1002.4058, 2010
Cited by: 247; Year: 2010
An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning
R Parr, L Li, G Taylor, C Painter-Wakefield, ML Littman
Proceedings of the 25th international conference on Machine learning, 752-759, 2008
Cited by: 213; Year: 2008