I am a postdoctoral researcher at UPC - BarcelonaTech, working in the Computer Architecture Department as a member of the ARCO (ARchitectures and COmpilers) research group. I received a BSc in Computer Engineering from the Universitat Jaume I, and an MSc and a Ph.D. in Computer Architecture from the UPC. My thesis was supervised by Dr. Polychronis Xekalakis and Prof. Joan-Manuel Parcerisa.
My current research focuses on developing energy-efficient hardware platforms for cognitive computing. The main goal is to propose hardware architectures for ultra low-power devices that efficiently support applications such as automatic speech recognition (ASR), machine translation (MT) or text-to-speech (TTS). These applications are mainly based on deep neural networks (DNNs), but graph-based algorithms like the Viterbi search are also widely employed. My research explores four different platforms for cognitive computing:
- GPUs: Graphics processors are already available in virtually any mobile device. Although they are designed for graphics, they offer high performance per watt for DNN processing. However, the graph algorithms used in voice-based applications are especially challenging for GPUs. Further improvements in GPU microarchitecture are required to efficiently support the traversal of large unstructured graphs.
A characterization of ASR software running on modern GPUs is provided in Albert's Master's thesis.
- CPUs: Low-power CPUs support SIMD instructions, such as the ARM NEON extension. We explore the use of the Vector Processing Unit (VPU) to improve the performance and energy efficiency of mobile CPUs when running cognitive applications.
- Accelerators: Hardware accelerators provide high performance and low energy consumption for a specific task. Hence, they are the most promising way of supporting computationally expensive cognitive computing applications on low-power devices. See our latest MICRO paper for an example of an accelerator for the Viterbi beam search used in ASR.
- FPGAs: Reconfigurable hardware is another interesting platform for supporting multiple algorithms for cognitive computing. We explore the implementation of such algorithms in the Intel-Altera Heterogeneous Architecture Research
Current Ph.D. Students
I am co-advising five Ph.D. students along with Prof. Antonio Gonzalez:
- Hamid Tabani
- Reza Yazdani
- Marc Riera
- Albert Segura
- Franyell Silfa
My past research focused on energy-efficient architectures for mobile GPUs.
Publications
- "UNFOLD: A Memory-Efficient Speech Recognizer Using On-The-Fly WFST Composition". Reza Yazdani Aminabadi, Jose-Maria Arnau, Antonio Gonzalez. In Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2017.
- "An Ultra Low-Power Hardware Accelerator for Acoustic Scoring in Speech Recognition". Hamid Tabani, Jose-Maria Arnau, Jordi Tubella, Antonio Gonzalez. In Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2017.
- "An Ultra Low-Power Hardware Accelerator for Automatic Speech Recognition". Reza Yazdani Aminabadi, Albert Segura, Jose-Maria Arnau, Antonio Gonzalez. In Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2016.
- "Eliminating Redundant Fragment Shader Executions on a Mobile GPU via Hardware Memoization". Jose-Maria Arnau, Joan-Manuel Parcerisa and Polychronis Xekalakis. In Proceedings of the 41st IEEE/ACM International Symposium on Computer Architecture (ISCA), June 2014.
- "Parallel Frame Rendering: Trading Responsiveness for Energy on a Mobile GPU". Jose-Maria Arnau, Joan-Manuel Parcerisa and Polychronis Xekalakis. In Proceedings of the 22nd IEEE/ACM International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2013.
- "TEAPOT: A Toolset for Evaluating Performance, Power and Image Quality on Mobile Graphics Systems". Jose-Maria Arnau, Joan-Manuel Parcerisa and Polychronis Xekalakis. In Proceedings of the 27th ACM International Conference on Supercomputing (ICS), June 2013.
- "Boosting Mobile GPU Performance with a Decoupled Access/Execute Fragment Processor". Jose-Maria Arnau, Joan-Manuel Parcerisa and Polychronis Xekalakis. In Proceedings of the 39th IEEE/ACM International Symposium on Computer Architecture (ISCA), June 2012.
- "Study of the Pressing Operation of Large-sized Tiles Using X-ray Absorption". J.L. Amoros, G. Mallol, D. Llorens, J. Boix, J.M. Arnau, C. Feliu, J.A. Cerisuelo and J.J. Gargallo. In Proceedings of Qualicer, February 2010.
- "Rapid, Harmless, and Non-destructive Measurement of Ceramic Tile Bulk Density". G. Mallol, M. Llorens, J. Boix, J.M. Arnau and L. Foucard. In Proceedings of Qualicer, February 2010.
- "Performance Analysis and Optimization of Automatic Speech Recognition". Hamid Tabani, Jose-Maria Arnau, Jordi Tubella and Antonio González. IEEE Transactions on Multi-Scale Computing Systems (TMSCS). doi: 10.1109/TMSCS.2017.2739158
- "Low-Power Automatic Speech Recognition Through a Mobile GPU and a Viterbi Accelerator". Reza Yazdani, Albert Segura, Jose-Maria Arnau and Antonio Gonzalez. IEEE Micro, vol. 37, no. 1, 2017, pp. 22-29.
Awards and Honors
- HiPEAC Paper Award for the publication of "An Ultra Low-Power Hardware Accelerator for Automatic Speech Recognition" in MICRO-49 (2016)
- Intel Doctoral Student Programme Honoree (2012)
- FI Research Grant: funding from the Catalan Government for a three-year Ph.D. (2011)
- Best Student Graduating in Master's Degree in Computer Architecture, Networks and Systems at UPC (2011)
- Second Best Student Graduating in Computer Engineering in Spain (2010)
- Best Student Graduating in Computer Engineering in the Valencian Community (2009)
- Best Student Graduating in Computer Engineering at the Universitat Jaume I (2008)
- Best Student Graduating in Computer Engineering at the School of Technology and Experimental Sciences of Castellon (2008)
- Research Collaboration Grant from the Spanish Ministry of Education (2007)
Talks
- "Optimizing Mobile GPU Memory Bandwidth via Parallel Frame Execution". CArD talk, University of Edinburgh, July 2013.
- "Dealing with Full HD Graphics in Smartphones and Tablets". ARCO seminar, UPC - BarcelonaTech, May 2013.
- "Ultra Low-power Rasterization Architectures". ARCO seminar, UPC - BarcelonaTech, December 2011.