Guohao Dai(戴国浩)
Associate Professor, Shanghai Jiao Tong University
Verified email at sjtu.edu.cn
Title
Cited by
Year
GraphH: A Processing-in-Memory Architecture for Large-scale Graph Processing
G Dai, T Huang, Y Chi, J Zhao, G Sun, Y Liu, Y Wang, Y Xie, H Yang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and …, 2019
Cited by: 194*, Year: 2019
ForeGraph: Exploring large-scale graph processing on multi-FPGA architecture
G Dai, T Huang, Y Chi, N Xu, Y Wang, H Yang
International Symposium on Field-Programmable Gate Arrays (FPGA), 217-226, 2017
Cited by: 183, Year: 2017
FPGP: Graph Processing Framework on FPGA: A Case Study of Breadth-First Search
G Dai, Y Chi, Y Wang, H Yang
International Symposium on Field-Programmable Gate Arrays (FPGA), 105-110, 2016
Cited by: 150, Year: 2016
GE-SpMM: General-purpose Sparse Matrix-Matrix Multiplication on GPUs for Graph Neural Networks
G Huang, G Dai, Y Wang, H Yang
International Conference for High Performance Computing, Networking, Storage …, 2020
Cited by: 141, Year: 2020
A Configurable Multi-precision CNN Computing Framework based on Single Bit RRAM
Z Zhu, H Sun, Y Lin, G Dai, L Xia, S Han, Y Wang, H Yang
ACM/IEEE Design Automation Conference (DAC), 1-6, 2019
Cited by: 106, Year: 2019
NXgraph: An Efficient Graph Processing System on a Single Machine
Y Chi, G Dai, Y Wang, G Sun, G Li, H Yang
International Conference on Data Engineering (ICDE), 409-420, 2016
Cited by: 105, Year: 2016
A survey on efficient inference for large language models
Z Zhou, X Ning, K Hong, T Fu, J Xu, S Li, Y Lou, L Wang, Z Yuan, X Li, ...
arXiv preprint arXiv:2404.14294, 2024
Cited by: 86, Year: 2024
MNSIM 2.0: A Behavior-Level Modeling Tool for Memristor-based Neuromorphic Computing Systems
Z Zhu, H Sun, K Qiu, L Xia, G Krishnan, G Dai, D Niu, X Chen, XS Hu, ...
Great Lakes Symposium on VLSI (GLSVLSI), 83-88, 2020
Cited by: 84, Year: 2020
FlashDecoding++: Faster large language model inference with asynchronization, flat GEMM optimization, and heuristics
K Hong, G Dai, J Xu, Q Mao, X Li, J Liu, Y Dong, Y Wang
Proceedings of Machine Learning and Systems 6, 148-161, 2024
Cited by: 64*, Year: 2024
Understanding GNN computational graph: A coordinated computation, IO, and memory perspective
H Zhang, Z Yu, G Dai, G Huang, Y Ding, Y Xie, Y Wang
Proceedings of Machine Learning and Systems 4, 467-484, 2022
Cited by: 58, Year: 2022
GraphSAR: A Sparsity-aware Processing-in-memory Architecture for Large-scale Graph Processing on ReRAMs
G Dai, T Huang, Y Wang, H Yang, J Wawrzynek
Asia and South Pacific Design Automation Conference (ASP-DAC), 120-126, 2019
Cited by: 57, Year: 2019
Evaluating quantized large language models
S Li, X Ning, L Wang, T Liu, X Shi, S Yan, G Dai, H Yang, Y Wang
arXiv preprint arXiv:2402.18158, 2024
Cited by: 55, Year: 2024
FlightLLM: Efficient large language model inference with a complete mapping flow on FPGAs
S Zeng, J Liu, G Dai, X Yang, T Fu, H Wang, W Ma, H Sun, S Li, Z Huang, ...
Proceedings of the 2024 ACM/SIGDA International Symposium on Field …, 2024
Cited by: 52, Year: 2024
DIMMining: pruning-efficient and parallel graph mining on near-memory-computing
G Dai, Z Zhu, T Fu, C Wei, B Wang, X Li, Y Xie, H Yang, Y Wang
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by: 49, Year: 2022
CogDL: An extensive toolkit for deep learning on graphs
Y Cen, Z Hou, Y Wang, Q Chen, Y Luo, X Yao, A Zeng, S Guo, P Zhang, ...
arXiv preprint arXiv:2103.00959, 2021
Cited by: 39, Year: 2021
MNSIM 2.0: A behavior-level modeling tool for processing-in-memory architectures
Z Zhu, H Sun, T Xie, Y Zhu, G Dai, L Xia, D Niu, X Chen, XS Hu, Y Cao, ...
IEEE Transactions on Computer-Aided Design of Integrated Circuits and …, 2023
Cited by: 30, Year: 2023
Online Scheduling for FPGA Computation in the Cloud
G Dai, Y Shan, F Chen, Y Wang, K Wang, H Yang
International Conference on Field-Programmable Technology (FPT), 330-333, 2014
Cited by: 29, Year: 2014
DiM: Diffusion Mamba for efficient high-resolution image synthesis
Y Teng, Y Wu, H Shi, X Ning, G Dai, Y Wang, Z Li, X Liu
arXiv preprint arXiv:2405.14224, 2024
Cited by: 27, Year: 2024
Enabling Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud
S Zeng, G Dai, H Sun, K Zhong, G Ge, K Guo, Y Wang, H Yang
International Symposium on Field-Programmable Custom Computing Machines …, 2020
Cited by: 27*, Year: 2020
TorchSparse++: Efficient training and inference framework for sparse convolution on GPUs
H Tang, S Yang, Z Liu, K Hong, Z Yu, X Li, G Dai, Y Wang, S Han
Proceedings of the 56th Annual IEEE/ACM International Symposium on …, 2023
Cited by: 26, Year: 2023
Articles 1–20