Processor Design
This course offers a more advanced treatment of digital design in the context of microprocessors, with special emphasis on:
- Design methodology and practice
- Pipelined microprocessor architecture implementation and evaluation
Contents
Course presentation: [pdf]
- Moore's Law & Dennard Scaling [pdf]
    - Transistor basic physics
    - Power wall
    - Dark silicon
- Hardware Design Cycle [pdf] [dotprod_example.zip]
    - System specification
    - Architectural design
    - Logic design
    - Circuit design
    - Physical design
- Functional Verification [pdf] [verification_examples.zip]
    - Directed testing
    - Constrained Random Verification
    - Functional Coverage
- Circuit Design [pdf] [yosys_examples.zip]
    - RTL synthesis
    - Gate-level netlist optimization
    - Delay and area estimation
- Physical Design [Slides VLSI Physical Design book]
    - Partitioning
    - Chip planning
    - Placement & Routing
    - Timing closure
- Field Programmable Gate Arrays [pdf]
- Example of Pecha Kucha presentation [pdf]
Lab Sessions
These labs are designed as a continuation of the PA-MIRI labs. If you have not taken PA-MIRI, please contact your lab professor.
	
| Name | Date | Docs |
| 1. Infrastructure setup and test. Microprocessor selection. | 10/9, 17/9 | [pdf] |
| 2. Module definition and specification. Workplan. | 1/10, 8/10 | [pdf] |
| 3. Module implementation | 15/10, 22/10, 29/10, 5/11 (NO LAB) | [pdf] |
| 4. Module review | Week of 11-15/11 | Interview |
| 5. Insertion in pipelined CPU | 12/11, 19/11 (NO LAB), 26/11, 3/12 | |
| 6. CPU review | 17/12 | Interview |
| Each group will choose one of the following modules to extend their baseline processor: 1) Associative cache; 2) Branch predictor; 3) Error detection and correction codes in the memory path; 4) High-performance functional units (e.g. adder, subtractor); 5) Accelerators (e.g. cryptography, neural networks) | 10/12, 17/12, 7 or 14/1. Due on 21/01 | |
Lab session X is due on the date of lab session X+1.
List of Papers for Presentation
Students will select one of the following papers for their presentations:
- Processor Design and Computer Architecture:
    - NOT AVAILABLE Dark Silicon and the End of Multicore Scaling. Esmaeilzadeh, H., Blem, E., Amant, R. S., Sankaralingam, K., & Burger, D. ISCA 2011.
    - NOT AVAILABLE The accelerator wall: Limits of chip specialization. Fuchs, A., & Wentzlaff, D. (2019, February). In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA) (pp. 1-14). IEEE.
    - Bit fusion: Bit-level dynamically composable architecture for accelerating deep neural networks. Sharma, H., Park, J., Suda, N., Lai, L., Chau, B., Chandra, V., & Esmaeilzadeh, H. (2018, June). In Proceedings of the 45th Annual International Symposium on Computer Architecture (pp. 764-775). IEEE Press.
    - NOT AVAILABLE Reducing data transfer energy by exploiting similarity within a data transaction. Lee, D., O'Connor, M., & Chatterjee, N. (2018, February). In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA) (pp. 40-51). IEEE.
    - NOT AVAILABLE Aladdin: a Pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures. Yakun Sophia Shao, Brandon Reagen, Gu-Yeon Wei, and David Brooks. In Proceedings of the 41st Annual International Symposium on Computer Architecture (ISCA '14). IEEE Press, Piscataway, NJ, USA, 97-108.
    - NOT AVAILABLE Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators. B. Reagen et al. 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Seoul, 2016, pp. 267-278.
    - NOT AVAILABLE Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks. Yu-Hsin Chen, Joel Emer, and Vivienne Sze. In Proceedings of the 43rd International Symposium on Computer Architecture (ISCA '16). IEEE Press, Piscataway, NJ, USA, 367-379. DOI: https://doi.org/10.1109/ISCA.2016.40
    - Darwin: A Genomics Co-processor Provides up to 15,000X Acceleration on Long Read Assembly. Yatish Turakhia, Gill Bejerano, and William J. Dally. In Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '18). ACM, New York, NY, USA, 199-213. February 2018.
    - In-datacenter performance analysis of a tensor processing unit. N. P. Jouppi et al. 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), Toronto, ON, 2017, pp. 1-12.
- Hardware Verification:
    - Trends in functional verification: a 2014 industry study. Foster, H. D. (2015, June). In Proceedings of the 52nd Annual Design Automation Conference (p. 48). ACM.
    - Instruction-Level Abstraction (ILA): A Uniform Specification for System-on-Chip (SoC) Verification. ACM Transactions on Design Automation of Electronic Systems (TODAES), 24(1), 10. January 2019.
    - NOT AVAILABLE Automated activation of multiple targets in RTL models using concolic testing. Lyu, Y., Ahmed, A., & Mishra, P. (2019, March). In 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE) (pp. 354-359). IEEE.
    - Assertion-Based Functional Consistency Checking between TLM and RTL Models. M. Chen and P. Mishra. 2013 26th International Conference on VLSI Design and 2013 12th International Conference on Embedded Systems, Pune, 2013, pp. 320-325.
    - Automatic generation of hardware checkers from formal micro-architectural specifications. A. Fedotov and J. Schmaltz. 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, 2018, pp. 1568-1573.
- Logic Synthesis and Circuit Design:
    - NOT AVAILABLE Yosys - a free Verilog synthesis suite. Wolf, C., Glaser, J., & Kepler, J. (2013, October). In Proceedings of the 21st Austrian Workshop on Microelectronics (Austrochip).
    - Simmani: Runtime Power Modeling for Arbitrary RTL with Automatic Signal Selection. Kim, D., Zhao, J., Bachrach, J., & Asanović, K. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture. October 2019.
    - NOT AVAILABLE OpenRAM: an open-source memory compiler. In Proceedings of the 35th International Conference on Computer-Aided Design (p. 93). ACM. November 2016.
    - NOT AVAILABLE LegUp: An open-source high-level synthesis tool for FPGA-based processor/accelerator systems. ACM Transactions on Embedded Computing Systems (TECS), 13(2), 24. 2013.
    - OpenTimer: A High-Performance Timing Analysis Tool. Tsung-Wei Huang and Martin D. F. Wong. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '15). IEEE Press, Piscataway, NJ, USA, 895-902.
    - ASAP7: A 7-nm finFET predictive process design kit. Clark, Lawrence T., et al. Microelectronics Journal 53 (2016): 105-115.
- Physical Design:
    - RTL-Aware Dataflow-Driven Macro Placement. A. Vidal-Obiols, J. Cortadella, J. Petit, M. Galceran-Oms and F. Martorell. 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 2019, pp. 186-191.
    - Machine learning applications in physical design: recent results and directions. Kahng, A. B. (2018, March). In Proceedings of the 2018 International Symposium on Physical Design (pp. 68-73). ACM.
    - Prim-Dijkstra Revisited: Achieving Superior Timing-driven Routing Trees. Alpert, C. J., Chow, W. K., Han, K., Kahng, A. B., Li, Z., Liu, D., & Venkatesh, S. (2018, March). In Proceedings of the 2018 International Symposium on Physical Design (pp. 10-17). ACM.
    - NOT AVAILABLE Routability-Driven Macro Placement with Embedded CNN-Based Prediction Model. Huang, Y. H., Xie, Z., Fang, G. Q., Yu, T. C., Ren, H., Fang, S. Y., ... & Hu, J. (2019, March). In 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE) (pp. 180-185). IEEE.
    - Accurate Wirelength Prediction for Placement-Aware Synthesis through Machine Learning. Hyun, D., Fan, Y., & Shin, Y. (2019, March). In 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE) (pp. 324-327). IEEE.
Alternatively, students may choose a different paper as long as it is related to the contents of PD. Any paper published in recent years in one of the following conferences should be acceptable:
- International Symposium on Computer Architecture (ISCA)
- International Symposium on Microarchitecture (MICRO)
- International Symposium on High-Performance Computer Architecture (HPCA)
- Design Automation Conference (DAC)
- Design, Automation and Test in Europe Conference (DATE)
- International Conference on Computer-Aided Design (ICCAD)
Bibliography and Useful Links
- Weste, N. H., & Harris, D. “CMOS VLSI Design: A Circuits and Systems Perspective”. 4th Edition, 2010.
- Kahng, A. B., Lienig, J., Markov, I. L., & Hu, J. “VLSI Physical Design: From Graph Partitioning to Timing Closure”. Springer Science & Business Media, 2011.
- Mead, C., & Conway, L. “Introduction to VLSI Systems”. Addison-Wesley, Reading, MA, 1980.
- Yosys Open SYnthesis Suite: http://www.clifford.at/yosys/
- Qflow 1.3: An Open-Source Digital Synthesis Flow: http://opencircuitdesign.com/qflow/