24 Jan 2024 | ASIF ALI KHAN, TU Dresden, Germany; JOÃO PAULO C. DE LIMA, TU Dresden and ScaDS.AI, Germany; HAMID FARZANEH, TU Dresden, Germany; JERONIMO CASTRILLON, TU Dresden and ScaDS.AI, Germany
The landscape of Compute-near-memory (CNM) and Compute-in-memory (CIM) systems is evolving rapidly to address the challenges of handling large volumes of data efficiently in data-centric applications, particularly in machine learning. CNM and CIM architectures aim to reduce data movement and energy consumption by performing computations either near or within memory chips. This survey provides an overview of the fundamental concepts, technologies, and evolving landscape of CNM and CIM systems in academia and industry.
**Key Technologies and Concepts:**
- **CNM:** Specialized CMOS logic integrated into memory chips to perform computations close to the data.
- **CIM:** Computations performed directly within the memory array itself, in devices such as DRAM, exploiting the inherent physical properties of the memory cells.
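The distinction between the two paradigms can be illustrated with a toy model: in CNM, a small logic unit beside the array reduces locally stored operands so only results cross the chip boundary; in CIM, the array itself performs the reduction (e.g., bitline current summation in a resistive crossbar). All class names and values below are illustrative, not drawn from any real device.

```python
# Toy model contrasting CNM and CIM on a dot product.
# Everything here is a hypothetical sketch for illustration only.

class CNMBank:
    """Compute-near-memory: a logic unit beside the memory array
    reads operands locally and returns only the scalar result."""
    def __init__(self, weights):
        self.weights = weights          # data stays in the bank

    def mac(self, inputs):
        # near-bank MAC: full vectors never cross the chip boundary
        return sum(w * x for w, x in zip(self.weights, inputs))

class CIMCrossbar:
    """Compute-in-memory: cell conductances G encode weights; applying
    voltages V yields a bitline current I = sum(G*V) (Ohm/Kirchhoff)."""
    def __init__(self, conductances):
        self.g = conductances

    def mvm(self, voltages):
        # one column: the summation happens inside the array itself
        return sum(gi * vi for gi, vi in zip(self.g, voltages))

bank = CNMBank([1.0, 2.0, 3.0])
xbar = CIMCrossbar([1.0, 2.0, 3.0])
x = [4.0, 5.0, 6.0]
print(bank.mac(x))   # 32.0
print(xbar.mvm(x))   # 32.0
```

Both paths compute the same dot product; the architectural difference lies in where the reduction happens and how much data must move to get there.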
**Commercial Trends:**
- The market for CNM/CIM systems is growing rapidly, with a CAGR of 17.5% expected over the next decade. Companies like Samsung, SK Hynix, and Intel are developing solutions based on SRAM, resistive NVM, and flash technologies.
- Solutions are primarily based on SRAM, but NVMs like PCM, RRAM, MRAM, and FeFETs are also gaining attention due to their potential for higher bandwidth and lower power consumption.
**Challenges:**
- The main challenge is the lack of a robust software ecosystem, which hinders programmability and optimization.
- Other challenges include reliability issues with emerging NVMs, developing performance models, and creating profiling and analysis tools for these systems.
**Selected Architectures:**
- **CNM Systems:**
- **UPMEM:** A commercial near-bank CNM system with general-purpose RISC processors integrated into DRAM.
- **McDRAM and MViD:** Domain-specific CNM systems for machine learning, embedding MAC units within DRAM banks.
- **Samsung's PIM-HBM:** Incorporates SIMD engines within DRAM banks for bank-level parallelism.
- **SK hynix’s AiM:** A GDDR6-based CNM system targeting machine learning applications.
- **AxRAM:** Integrates approximate MAC units in DRAM to reduce off-chip memory communication.
- **3D-stacked DRAM-based CNM systems:** Such as Tesseract, TOP-PIM, AMC, and HRL, leveraging HMC and HBM technologies.
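Several of the near-bank designs above (e.g., McDRAM and PIM-HBM) share one pattern: operands are striped across banks, each bank's MAC unit reduces its local slice, and only per-bank partial sums leave the memory. A minimal sketch of that pattern, with threads standing in for banks and all structure hypothetical:

```python
# Illustrative sketch of bank-level parallelism in near-bank CNM.
# Banks are modeled as threads; all names and sizes are hypothetical.
from concurrent.futures import ThreadPoolExecutor

NUM_BANKS = 4

def bank_mac(bank_weights, bank_inputs):
    """Per-bank MAC unit: reduces its local slice to one partial sum,
    so only NUM_BANKS scalars ever leave the memory."""
    return sum(w * x for w, x in zip(bank_weights, bank_inputs))

def pim_dot(weights, inputs):
    # stripe the operands across banks, run the per-bank MACs in
    # parallel, then combine the partial sums on the host side
    w_slices = [weights[b::NUM_BANKS] for b in range(NUM_BANKS)]
    x_slices = [inputs[b::NUM_BANKS] for b in range(NUM_BANKS)]
    with ThreadPoolExecutor(max_workers=NUM_BANKS) as pool:
        partials = pool.map(bank_mac, w_slices, x_slices)
    return sum(partials)

w = list(range(8))          # [0, 1, ..., 7]
x = [1.0] * 8
print(pim_dot(w, x))        # 28.0
```

The payoff is the ratio of data moved: eight operand pairs stay in memory while four scalars (one per bank) reach the host.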
- **CIM Systems:**
- **ISAAC (by Hewlett Packard Enterprise):** A CIM accelerator for convolutional neural networks using RRAM, featuring interconnected tiles with in-situ multiply-accumulate (IMA) units and embedded DRAM (eDRAM) buffers.
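ISAAC's core trick is bit-sliced analog matrix-vector multiplication: each weight is split into low-precision slices stored on separate crossbars, inputs are applied bit-serially, and a digital shift-and-add recombines the analog partial products. The sketch below follows that scheme in spirit; the bit widths and structure are illustrative, not ISAAC's exact parameters.

```python
# Hedged sketch of ISAAC-style bit-sliced analog MVM. Bit widths
# and structure are illustrative, not ISAAC's exact design.

CELL_BITS = 2      # bits stored per RRAM cell slice
W_BITS = 4         # total weight precision -> 2 slices per weight
X_BITS = 4         # input precision, applied one bit per cycle

def weight_slices(w):
    """Split each weight into CELL_BITS-wide slices (LSB slice first),
    one slice list per crossbar."""
    mask = (1 << CELL_BITS) - 1
    return [[(wi >> s) & mask for wi in w]
            for s in range(0, W_BITS, CELL_BITS)]

def crossbar_column(g, v):
    # analog dot product of one crossbar column: I = sum(G * V)
    return sum(gi * vi for gi, vi in zip(g, v))

def bit_sliced_mvm(w, x):
    slices = weight_slices(w)
    acc = 0
    for bit in range(X_BITS):                 # bit-serial inputs
        v = [(xi >> bit) & 1 for xi in x]     # one input bit per cycle
        for s, g in enumerate(slices):        # one crossbar per slice
            # shift-and-add recombines the partial products digitally
            acc += crossbar_column(g, v) << (bit + s * CELL_BITS)
    return acc

w = [3, 10, 7]
x = [2, 5, 1]
print(bit_sliced_mvm(w, x))   # 63, matching the exact dot product
```

Because each weight is the shifted sum of its slices and each input is the shifted sum of its bits, accumulating every (slice, bit) partial product with the matching shift reconstructs the exact integer dot product from low-precision analog operations.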
The survey highlights the potential benefits of CNM and CIM systems in terms of performance, energy, and cost, while also addressing the challenges and future directions for these emerging computing paradigms.