Title | Authors | Venue | Cited by | Year |
Parallel restarted SGD with faster convergence and less communication: Demystifying why model averaging works for deep learning | H Yu, S Yang, S Zhu | Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 5693-5700, 2019 | 319* | 2019 |
On the linear speedup analysis of communication efficient momentum SGD for distributed non-convex optimization | H Yu, R Jin, S Yang | International Conference on Machine Learning, 7184-7193, 2019 | 194 | 2019 |
Sparse temporally dynamic resting-state functional connectivity networks for early MCI identification | CY Wee, S Yang, PT Yap, D Shen | Brain imaging and behavior 10 (2), 342-356, 2016 | 143 | 2016 |
Feature grouping and selection over an undirected graph | S Yang, L Yuan, YC Lai, X Shen, P Wonka, J Ye | Proceedings of the 18th ACM SIGKDD international conference on Knowledge …, 2012 | 131 | 2012 |
Fused multiple graphical lasso | S Yang, Z Lu, X Shen, P Wonka, J Ye | SIAM Journal on Optimization 25 (2), 916-943, 2015 | 91 | 2015 |
An efficient ADMM algorithm for multidimensional anisotropic total variation regularization problems | S Yang, J Wang, W Fan, X Zhang, P Wonka, J Ye | Proceedings of the 19th ACM SIGKDD international conference on Knowledge …, 2013 | 51 | 2013 |
Multifeature, sparse-based approach for defects detection and classification in semiconductor units | BM Haddad, S Yang, LJ Karam, J Ye, NS Patel, MW Braun | IEEE Transactions on Automation Science and Engineering 15 (1), 145-159, 2016 | 38 | 2016 |
Structural Graphical Lasso for Learning Mouse Brain Connectivity | S Yang, Q Sun, S Ji, P Wonka, I Davidson, J Ye | Proceedings of the 21th ACM SIGKDD International Conference on Knowledge …, 2015 | 27 | 2015 |
Learning with non-convex truncated losses by SGD | Y Xu, S Zhu, S Yang, C Zhang, R Jin, T Yang | Uncertainty in Artificial Intelligence, 701-711, 2020 | 23 | 2020 |
A highly scalable parallel algorithm for isotropic total variation models | J Wang, Q Li, S Yang, W Fan, P Wonka, J Ye | International Conference on Machine Learning, 235-243, 2014 | 15 | 2014 |
Multi-task vector field learning | B Lin, S Yang, C Zhang, J Ye, X He | Advances in neural information processing systems 2012, 296, 2012 | 10 | 2012 |
GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization | Y Guo, H Yuan, J Tan, Z Wang, S Yang, J Liu | ICCV 2021, 2021 | 9 | 2021 |
Shifted Chunk Transformer for Spatio-Temporal Representational Learning | X Zha, W Zhu, T Lv, S Yang, J Liu | NeurIPS 2021, 2021 | 7 | 2021 |
Weighted linear methods for the camera pose estimation | S Yang, FC Wu | Ruanjian Xuebao/Journal of Software 22 (10), 2476-2487, 2011 | 7 | 2011 |
On the Convergence of (Stochastic) Gradient Descent with Extrapolation for Non-Convex Minimization | Y Xu, Z Yuan, S Yang, R Jin, T Yang | IJCAI, 4003-4009, 2019 | 6 | 2019 |
BAGUA: Scaling up Distributed Learning with System Relaxations | S Gan, X Lian, R Wang, J Chang, C Liu, H Shi, S Zhang, X Li, T Sun, ... | VLDB 2022, 2022 | 5 | 2022 |
Shrinking the upper confidence bound: A dynamic product selection problem for urban warehouses | R Jin, D Simchi-Levi, L Wang, X Wang, S Yang | Management Science, 2021 | 5 | 2021 |
Unified Visual Transformer Compression | S Yu, T Chen, J Shen, H Yuan, J Tan, S Yang, J Liu, Z Wang | ICLR 2022, 2022 | 4 | 2022 |
Persia: A Hybrid System Scaling Deep Learning Based Recommenders up to 100 Trillion Parameters | X Lian, B Yuan, X Zhu, Y Wang, Y He, H Wu, L Sun, H Lyu, C Liu, X Dong, ... | arXiv preprint arXiv:2111.05897, 2021 | 2 | 2021 |
ProxyBO: Accelerating Neural Architecture Search via Bayesian Optimization with Zero-cost Proxies | Y Shen, Y Li, J Zheng, W Zhang, P Yao, J Li, S Yang, J Liu, C Bin | arXiv preprint arXiv:2110.10423, 2021 | 2 | 2021 |