Active memory cube: A processing-in-memory architecture for exascale systems R Nair, SF Antao, C Bertolli, P Bose, JR Brunheroto, T Chen, CY Cher, ... IBM Journal of Research and Development 59 (2/3), 17:1-17:14, 2015 | 233 | 2015 |
Data access optimization in a processing-in-memory system Z Sura, A Jacob, T Chen, B Rosenburg, O Sallenave, C Bertolli, S Antao, ... Proceedings of the 12th ACM International Conference on Computing Frontiers, 1-8, 2015 | 89 | 2015 |
Offloading support for OpenMP in Clang and LLVM SF Antao, A Bataev, AC Jacob, GT Bercea, AE Eichenberger, G Rokos, ... 2016 Third Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC), 1-11, 2016 | 88 | 2016 |
RNS-based elliptic curve point multiplication for massive parallel architectures S Antão, JC Bajard, L Sousa The Computer Journal 55 (5), 629-647, 2012 | 87 | 2012 |
Combining residue arithmetic to design efficient cryptographic circuits and systems L Sousa, S Antao, P Martins IEEE Circuits and Systems Magazine 16 (4), 6-32, 2016 | 83 | 2016 |
Coordinating GPU threads for OpenMP 4.0 in LLVM C Bertolli, SF Antao, AE Eichenberger, K O’Brien, Z Sura, AC Jacob, T Chen, ... 2014 LLVM Compiler Infrastructure in HPC, 12-21, 2014 | 81 | 2014 |
Integrating GPU support for OpenMP offloading directives into Clang C Bertolli, SF Antao, GT Bercea, AC Jacob, AE Eichenberger, T Chen, ... Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in …, 2015 | 72 | 2015 |
MRC-Based RNS Reverse Converters for the Four-Moduli Sets and L Sousa, S Antao IEEE Transactions on Circuits and Systems II: Express Briefs 59 (4), 244-248, 2012 | 72 | 2012 |
Elliptic curve point multiplication on GPUs S Antao, JC Bajard, L Sousa ASAP 2010-21st IEEE International Conference on Application-specific Systems …, 2010 | 51 | 2010 |
Reverse converter design via parallel-prefix adders: Novel components, methodology, and implementations AAE Zarandi, AS Molahosseini, M Hosseinzadeh, S Sorouri, S Antao, ... IEEE Transactions on Very Large Scale Integration (VLSI) Systems 23 (2), 374-378, 2014 | 41 | 2014 |
The CRNS framework and its application to programmable and reconfigurable cryptography S Antao, L Sousa ACM Transactions on Architecture and Code Optimization (TACO) 9 (4), 1-25, 2013 | 41 | 2013 |
Performance analysis of OpenMP on a GPU using a CORAL proxy application GT Bercea, C Bertolli, SF Antao, AC Jacob, AE Eichenberger, T Chen, ... Proceedings of the 6th International Workshop on Performance Modeling …, 2015 | 37 | 2015 |
On the Design of RNS Reverse Converters for the Four-Moduli Set $\{2^{n}+1,\ 2^{n}-1,\ 2^{n},\ 2^{n+1}+1\}$ L Sousa, S Antão, R Chaves IEEE Transactions on Very Large Scale Integration (VLSI) Systems 21 (10 …, 2012 | 34 | 2012 |
Performance analysis and optimization of Clang's OpenMP 4.5 GPU support M Martineau, S McIntosh-Smith, C Bertolli, AC Jacob, SF Antao, ... 2016 7th International Workshop on Performance Modeling, Benchmarking and …, 2016 | 28 | 2016 |
Early experiences porting three applications to OpenMP 4.5 I Karlin, T Scogland, AC Jacob, SF Antao, GT Bercea, C Bertolli, ... OpenMP: Memory, Devices, and Tasks: 12th International Workshop on OpenMP …, 2016 | 27 | 2016 |
OP2-Clang: A source-to-source translator using Clang/LLVM LibTooling GD Balogh, GR Mudalige, IZ Reguly, SF Antao, C Bertolli 2018 IEEE/ACM 5th Workshop on the LLVM Compiler Infrastructure in HPC (LLVM …, 2018 | 22 | 2018 |
Efficient fork-join on GPUs through warp specialization AC Jacob, AE Eichenberger, H Sung, SF Antao, GT Bercea, C Bertolli, ... 2017 IEEE 24th International Conference on High Performance Computing (HiPC …, 2017 | 21 | 2017 |
A lab project on the design and implementation of programmable and configurable embedded systems L Sousa, S Antao, J Germano IEEE Transactions on Education 56 (3), 322-328, 2012 | 21 | 2012 |
Compiling for the active memory cube AC Jacob, Z Sura, T Chen, C Bertolli, S Antao, O Sallenave, K O’Brien, ... Tech. rep. RC25644 (WAT1612–008). IBM Research Division, Tech. Rep., 2016 | 20 | 2016 |
Compact and flexible microcoded elliptic curve processor for reconfigurable devices S Antao, R Chaves, L Sousa 2009 17th IEEE Symposium on Field Programmable Custom Computing Machines …, 2009 | 14 | 2009 |