Jan. 31-Feb. 4, 2016 | Yu-Hsin Chen, Tushar Krishna, Joel Emer, and Vivienne Sze
The paper "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks" by Yu-Hsin Chen, Tushar Krishna, Joel Emer, and Vivienne Sze presents a novel accelerator designed to efficiently process deep convolutional neural networks (CNNs) with minimal energy consumption. The accelerator addresses the challenges of large CNNs with varying shapes and filter sizes, which require significant data movement and energy. Key features include:
1. **Efficient Dataflow and Hardware Support**: The accelerator minimizes data movement by exploiting data reuse through a spatial array, memory hierarchy, and on-chip network. This design supports different shapes and filter sizes, allowing for high computational efficiency even at lower clock frequencies.
2. **Energy Minimization Techniques**: The accelerator uses data statistics to skip unnecessary reads and computations (zero skipping/gating) and compresses data to reduce off-chip memory bandwidth, the most expensive form of data movement.
3. **Network-on-Chip (NoC)**: A NoC is implemented to support configurable data patterns and energy-efficient multicast to a variable number of processing elements (PEs). This ensures efficient data delivery and reduces power consumption.
4. **Performance and Power Efficiency**: The test chip, implemented in 65 nm CMOS, operates at a 200 MHz core clock and a 60 MHz link clock, achieving a frame rate of 34.7 fps on AlexNet with a measured power of 276 mW at 1 V. The chip can scale up to a 250 MHz core clock and a 90 MHz link clock, enabling a throughput of 44.8 fps at 1.17 V.
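The data reuse that the spatial array in item 1 exploits can be made concrete with a toy 1-D convolution: every filter weight is applied at every output position, so holding a weight in a PE-local register amortizes one expensive memory fetch over many MACs. The sketch below (with a hypothetical `conv1d` helper) simply counts that reuse; it illustrates the principle, not the paper's actual row-stationary dataflow.

```python
def conv1d(inputs, weights):
    """Direct 1-D convolution that counts how often each filter weight
    is touched. In an accelerator, keeping w[j] stationary in a PE
    register turns all of those touches into a single fetch."""
    R = len(weights)
    reuse_count = {j: 0 for j in range(R)}
    outputs = []
    for i in range(len(inputs) - R + 1):   # one output per window position
        acc = 0
        for j in range(R):
            acc += inputs[i + j] * weights[j]
            reuse_count[j] += 1            # each weight reused per output
        outputs.append(acc)
    return outputs, reuse_count
```

For a row of length W and a filter of length R, each weight is reused W - R + 1 times; the same effect, compounded across 2-D rows, channels, and batches, is what the memory hierarchy and on-chip network are organized to capture.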
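Item 2 combines two ideas that can each be sketched in a few lines: gating the MAC when an activation is zero, and run-length coding sparse data before it crosses the expensive off-chip boundary. Both functions below are illustrative Python sketches; the `(zero_run, value)` pair format is an assumed encoding, not the chip's exact compression bitstream.

```python
def mac_with_zero_gating(activations, weights):
    """Zero skipping/gating sketch: when an activation is zero, skip the
    multiply (and, in hardware, gate the weight-register read), saving
    the energy of a MAC whose result is known to be zero."""
    acc, skipped = 0, 0
    for a, w in zip(activations, weights):
        if a == 0:
            skipped += 1          # gated: no multiply, no weight read
            continue
        acc += a * w
    return acc, skipped

def rle_encode(data):
    """Run-length coding of zero runs as (zero_run, value) pairs:
    each pair means `zero_run` zeros followed by one value."""
    pairs, run = [], 0
    for v in data:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:                       # trailing zeros: final pair carries a zero value
        pairs.append((run - 1, 0))
    return pairs

def rle_decode(pairs):
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        out.append(v)
    return out
```

Because trained CNN activations are frequently sparse (ReLU outputs in particular), both tricks pay off on real data: zero gating saves compute energy, and run-length coding shrinks the off-chip traffic that dominates the energy budget.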
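One common way to realize the configurable multicast in item 3 is tag matching: the controller drives a `(tag, value)` pair onto a shared link, and each PE accepts it only if the tag matches its configured ID, so a single transaction reaches a variable number of PEs. The sketch below uses hypothetical names and is a rough illustration of the idea, not the Eyeriss NoC protocol itself.

```python
class PE:
    """A processing element with a per-layer configurable multicast ID."""
    def __init__(self, pe_id):
        self.pe_id = pe_id        # reconfigured to match the layer's shape
        self.received = []

    def offer(self, tag, value):
        # Tag-match filter: accept only data multicast to this ID.
        if tag == self.pe_id:
            self.received.append(value)

def multicast(pes, tag, value):
    """One bus transaction delivers `value` to every PE whose ID matches
    `tag` -- cheaper than per-PE unicasts of the same shared data."""
    for pe in pes:
        pe.offer(tag, value)
```

Reconfiguring the IDs per layer is what lets the same physical network deliver the right input rows and filter rows to different groups of PEs as the layer shape changes.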
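The measurements in item 4 imply an energy cost of roughly 8 mJ per processed AlexNet frame, since energy per frame is simply power divided by frame rate:

```python
# Reported measurements: 276 mW at 34.7 fps (1 V, 200 MHz core clock).
power_w = 0.276                              # measured power in watts
fps = 34.7                                   # AlexNet frame rate
energy_per_frame_mj = power_w / fps * 1e3    # energy per frame in mJ
# 0.276 W / 34.7 fps is about 7.95 mJ per frame
```

At the scaled 1.17 V / 250 MHz operating point, the higher 44.8 fps throughput comes with higher power, so the energy per frame does not improve proportionally with the frame rate.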
The paper also includes detailed architectural diagrams and performance metrics, demonstrating the effectiveness of the Eyeriss accelerator in reducing power consumption and improving efficiency for deep learning applications.