Memory-centric Computing
- Memory-centric Chip Technologies
New memory-centric chip technologies are emerging that promise to relieve the bandwidth bottlenecks in today's systems.
The idea behind these technologies is to bring memory closer to the processing tasks in order to speed up the system. The concept is not new, and earlier versions of the technology fell short. It also remains unclear whether the new approaches will have the desired effect.
Memory-centric is a broad term with different definitions, although recent discussions have revolved around two technologies: in-memory computing and near-memory computing. Near-memory computing integrates memory and logic in a single advanced IC package, while in-memory computing performs processing tasks inside the memory itself. Both approaches are valid, and each suits different applications.
Both in-memory and near-memory computing aim to enhance data-processing capability in today's systems or to drive new architectures such as neural networks. In a conventional system, the processor handles all processing functions, while memory and storage simply hold data, so data constantly shuttles between memory and the processor. This shuttling adds latency and burns power, a bottleneck commonly called the memory wall.
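A quick way to see the memory wall is to compare an operation that streams its data once with one that reuses it. The NumPy sketch below (timings are machine-dependent; the exact figures are not the point) pits a memory-bound elementwise add, roughly one arithmetic operation per 24 bytes moved, against a compute-bound matrix multiply over the same arrays:

```python
# Memory wall illustration: an elementwise add touches each byte once
# (memory-bound), while a matrix multiply reuses each element ~n times
# (compute-bound). Absolute timings depend on the machine.
import time
import numpy as np

n = 2048
a = np.random.rand(n, n)
b = np.random.rand(n, n)

# Memory-bound: ~1 flop per 24 bytes moved (read a, read b, write c).
t0 = time.perf_counter()
c = a + b
t_add = time.perf_counter() - t0

# Compute-bound: 2*n flops per output element, so caches amortize DRAM traffic.
t0 = time.perf_counter()
d = a @ b
t_mm = time.perf_counter() - t0

print(f"elementwise add: {n * n / t_add / 1e9:.2f} GFLOP/s")
print(f"matrix multiply: {2 * n**3 / t_mm / 1e9:.2f} GFLOP/s")
```

On typical hardware the multiply sustains far more FLOP/s than the add, which spends most of its time waiting on DRAM rather than computing.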
- Memory-centric Computing
Memory-centric computing aims to place computing capability in and near wherever data is generated and stored. It can therefore largely avoid data movement and reduce data-access latency and energy, sharply cutting the performance and energy costs that data access otherwise imposes.
Modern computing systems are processor-centric: data processing (i.e., computation) occurs only in the processor (e.g., CPU, GPU, FPGA, ASIC). Data therefore has to be moved from where it is generated or captured (e.g., sensors) or where it is stored (e.g., memory and storage devices) to the processor before it can be processed.
This processor-centric design paradigm significantly limits the performance and energy efficiency, as well as the scalability and sustainability, of modern computing systems. Many studies have shown that even the most powerful processors and accelerators waste a large fraction (e.g., more than 60%) of their time and energy waiting for data and moving it between memory/storage units and the processor.
This is true even though most of the hardware area of such systems is dedicated to storing and moving data (e.g., multi-level caches, DRAM chips, storage systems, and interconnects). By bringing computation to where data lives, memory-centric computing sidesteps much of this movement, and many recent studies have shown that it can greatly improve system performance and energy efficiency.
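Rough, frequently cited circuit-level figures make the imbalance concrete. The sketch below uses ballpark per-operation energies in the range reported in Mark Horowitz's ISSCC 2014 keynote; the exact values depend on the process node and should be read as assumptions:

```python
# Back-of-the-envelope energy comparison (ballpark ~45 nm-era figures in the
# range reported by Horowitz, ISSCC 2014; actual values vary by technology).
E_ADD_32B_PJ = 0.1     # 32-bit integer add
E_SRAM_32B_PJ = 5.0    # 32-bit read from a small on-chip SRAM
E_DRAM_32B_PJ = 640.0  # 32-bit read from off-chip DRAM

print(f"DRAM read vs. add:       {E_DRAM_32B_PJ / E_ADD_32B_PJ:,.0f}x the energy")
print(f"DRAM read vs. SRAM read: {E_DRAM_32B_PJ / E_SRAM_32B_PJ:,.0f}x the energy")
# Fetching a word from off-chip DRAM costs thousands of times more energy
# than the arithmetic performed on it - hence the push to compute near data.
```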
- In-Memory Computing
In computer science, in-memory processing is an emerging technique for manipulating data stored in in-memory databases. In-memory processing is one way to address the performance and power bottlenecks caused by the movement of data between the processor and main memory.
Older systems have been based on disk storage and relational databases using the SQL query language, but these systems are increasingly considered inadequate for the needs of business intelligence (BI).
Because data is accessed much faster when it resides in random-access memory (RAM) or flash memory than on disk, in-memory processing allows data to be analyzed in real time, speeding up business reporting and decision-making.
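As a minimal illustration of the idea, SQLite can host an entire database in RAM through its special ":memory:" path; the table and query below are invented for the example:

```python
# Minimal in-memory database sketch: SQLite's ":memory:" path keeps the whole
# database in RAM, so the query never touches disk. The sales table and the
# aggregation are purely illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 75.5), ("north", 42.3), ("west", 210.0)],
)

# The aggregation runs against RAM-resident pages with no disk I/O.
for region, total in conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
):
    print(region, total)

conn.close()
```

Production in-memory databases add column storage, compression, and durability mechanisms, but the access-path difference is the same: queries operate on RAM-resident data.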
- Near-Memory Computing
The traditional method of moving data to the CPU for computation has become a significant performance bottleneck for emerging scale-out, data-intensive applications, which exhibit limited data reuse.
At the same time, advances in 3D integration technology have made the decades-old concept of coupling computing units close to memory, known as near-memory computing (NMC), more feasible. Processing at the "home" of the data can significantly reduce data movement for data-intensive applications.
Today's memory hierarchy typically consists of multiple levels of cache, main memory, and storage. The traditional approach is to move data up from storage into cache and then process it.
In contrast, NMC is designed to perform processing close to where the data resides. This data-centric approach couples computing units to the data and seeks to minimize expensive data movement.
Notably, 3D stacking is touted as a key enabler for near-memory processing. It allows logic and memory to be stacked using through-silicon vias (TSVs), which reduce memory-access latency and power consumption while providing higher bandwidth.
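A back-of-the-envelope traffic model shows why this helps. In the sketch below, every parameter value is an assumption chosen for illustration: a large reduction either streams all of its input across the off-chip link to the CPU, or runs on logic stacked beneath the memory and ships back only the final 8-byte result:

```python
# Toy data-movement model for near-memory computing. All energy-per-byte
# figures are illustrative assumptions, not measurements.
N = 1_000_000_000            # elements to reduce (e.g., a sum)
ELEM_BYTES = 8               # 64-bit values
PJ_PER_BYTE_OFFCHIP = 20.0   # assumed cost to move one byte over the off-chip link
PJ_PER_BYTE_TSV = 1.0        # assumed cost per byte through the TSV stack

# (a) Host-side reduction: every byte crosses the off-chip link.
host_pj = N * ELEM_BYTES * PJ_PER_BYTE_OFFCHIP

# (b) Near-memory reduction: data moves only through TSVs; one result returns.
nmc_pj = N * ELEM_BYTES * PJ_PER_BYTE_TSV + ELEM_BYTES * PJ_PER_BYTE_OFFCHIP

print(f"host reduction: {host_pj / 1e9:.0f} mJ of movement energy")
print(f"NMC reduction:  {nmc_pj / 1e9:.0f} mJ ({host_pj / nmc_pj:.0f}x less)")
```

Under these assumed costs the near-memory version spends about 20x less energy on data movement, and the advantage grows with the ratio of input size to result size.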