Jaehoon Lee
Anthropic
Verified email at anthropic.com
Title · Cited by · Year
Deep Neural Networks as Gaussian Processes
J Lee*, Y Bahri*, R Novak, SS Schoenholz, J Pennington, ...
International Conference on Learning Representations (ICLR), 2018
Cited by 1335 · 2018
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...
Transactions on Machine Learning Research (TMLR) 2023, 2022
Cited by 1175 · 2022
Wide neural networks of any depth evolve as linear models under gradient descent
J Lee*, L Xiao*, SS Schoenholz, Y Bahri, J Sohl-Dickstein, J Pennington
Neural Information Processing Systems (NeurIPS), 2019
Cited by 1164 · 2019
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Gemini Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by 684 · 2024
Measuring the effects of data parallelism on neural network training
CJ Shallue*, J Lee*, J Antognini, J Sohl-Dickstein, R Frostig, GE Dahl
Journal of Machine Learning Research 20, 1-49, 2019
Cited by 457 · 2019
On empirical comparisons of optimizers for deep learning
D Choi, CJ Shallue, Z Nado, J Lee, CJ Maddison, GE Dahl
arXiv preprint arXiv:1910.05446, 2019
Cited by 393 · 2019
Bayesian Deep Convolutional Neural Networks with Many Channels are Gaussian Processes
R Novak*, L Xiao*, J Lee, Y Bahri, G Yang, D Abolafia, J Pennington, ...
International Conference on Learning Representations (ICLR), 2019
Cited by 392* · 2019
Neural Tangents: Fast and easy infinite neural networks in Python
R Novak*, L Xiao*, J Hron, J Lee, AA Alemi, J Sohl-Dickstein, ...
International Conference on Learning Representations (ICLR), Spotlight, 2020
Cited by 263 · 2020
Dataset Distillation with Infinitely Wide Convolutional Networks
T Nguyen, R Novak, L Xiao, J Lee
Neural Information Processing Systems (NeurIPS), 2021
Cited by 251 · 2021
Dataset Meta-Learning from Kernel Ridge-Regression
T Nguyen, Z Chen, J Lee
International Conference on Learning Representations (ICLR), 2021
Cited by 243 · 2021
Finite versus infinite neural networks: an empirical study
J Lee, SS Schoenholz, J Pennington, B Adlam, L Xiao, R Novak, ...
Neural Information Processing Systems (NeurIPS), Spotlight, 2020
Cited by 227 · 2020
Explaining neural scaling laws
Y Bahri*, E Dyer*, J Kaplan*, J Lee*, U Sharma*
arXiv preprint arXiv:2102.06701, 2021
Cited by 215 · 2021
The superconformal bootstrap in three dimensions
SM Chester, J Lee, SS Pufu, R Yacoby
Journal of High Energy Physics 2014 (9), 1-59, 2014
Cited by 167 · 2014
Exact correlators of BPS operators from the 3d superconformal bootstrap
SM Chester, J Lee, SS Pufu, R Yacoby
Journal of High Energy Physics 2015 (3), 1-55, 2015
Cited by 151 · 2015
Scaling LLM test-time compute optimally can be more effective than scaling model parameters
C Snell, J Lee, K Xu, A Kumar
arXiv preprint arXiv:2408.03314, 2024
Cited by 65* · 2024
Beyond human data: Scaling self-training for problem-solving with language models
A Singh, JD Co-Reyes, R Agarwal, A Anand, P Patil, X Garcia, PJ Liu, ...
arXiv preprint arXiv:2312.06585, 2023
Cited by 63 · 2023
On the infinite width limit of neural networks with a standard parameterization
J Sohl-Dickstein, R Novak, SS Schoenholz, J Lee
arXiv preprint arXiv:2001.07301, 2020
Cited by 58 · 2020
Small-scale proxies for large-scale transformer training instabilities
M Wortsman, PJ Liu, L Xiao, K Everett, A Alemi, B Adlam, JD Co-Reyes, ...
International Conference on Learning Representations (ICLR), Oral, 2024
Cited by 41 · 2024
Algebra of Majorana doubling
J Lee, F Wilczek
Physical Review Letters 111 (22), 226402, 2013
Cited by 37 · 2013
Towards NNGP-guided Neural Architecture Search
DS Park*, J Lee*, D Peng, Y Cao, J Sohl-Dickstein
arXiv preprint arXiv:2011.06006, 2020
Cited by 34 · 2020
Articles 1–20