MATI: Programmable Deep In-Memory Instruction Set Architecture for Machine Learning
The pervasive use of machine learning (ML) and its high computational complexity have created a need for energy-efficient ML accelerators. The energy cost in such systems is dominated by the need to move and process massive volumes of data. While current work has focused on digital architectures, the recently proposed deep in-memory architecture (DIMA) has demonstrated significant energy and throughput benefits (up to 32× in energy-delay product) over its digital counterparts. However, as DIMA relies on array pitch-matched analog computations, its benefits have been demonstrated only for fixed-function scenarios. This raises the question: can DIMA be made programmable without losing its energy and throughput benefits over its digital counterparts?
We propose MATI, a programmable deep in-memory instruction set architecture, designed via a synergistic combination of instruction set, architecture, and circuit design. Employing silicon-validated energy, delay, and behavioral models of deep in-memory components, we demonstrate that MATI is able to realize nine ML benchmarks while incurring negligible overheads in energy (< 0.1%), area (4.5%), and throughput over fixed-function DIMA. In the process, MATI simultaneously achieves enhancements in both energy (2.5× to 5.5×) and throughput (1.4× to 3.4×), for an overall energy-delay product improvement of up to 12.6× over fixed-function digital architectures.