Composable Accelerator Advancements
With the explosive rise of artificial intelligence, deep neural networks (DNNs) and other machine learning (ML) methods are used so extensively that the computation they require is now considered a significant workload category in its own right. DNNs are artificial neural networks with multiple layers between the input and output layers; in essence, they are a class of ML model that excels at handling complex data and tasks.
One approach that researchers have taken is to develop specialized hardware for DNNs that takes advantage of data sparsity. Data sparsity means that many elements in the data are zero or otherwise irrelevant, so the hardware can use techniques such as compressed storage and computation on only the non-zero elements. This reduces memory and bandwidth usage, making the system more energy-efficient and improving performance.
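To make the idea concrete, the sketch below contrasts a dense vector with a compressed representation that stores only (index, value) pairs for non-zero entries, then computes a dot product over just those entries. This is a minimal illustration of the general sparsity principle, not the project's actual hardware scheme; the function names and data are hypothetical.

```python
def to_sparse(vector):
    """Compress a vector to (index, value) pairs for its non-zero entries."""
    return [(i, v) for i, v in enumerate(vector) if v != 0]

def sparse_dot(sparse_a, dense_b):
    """Multiply and accumulate only where a is non-zero, skipping zeros."""
    return sum(v * dense_b[i] for i, v in sparse_a)

a = [0, 0, 3, 0, 5, 0, 0, 2]   # mostly zeros: 3 of 8 entries are non-zero
b = [1, 1, 2, 1, 1, 1, 1, 4]
sa = to_sparse(a)              # stores 3 pairs instead of 8 values
print(sparse_dot(sa, b))       # 3*2 + 5*1 + 2*4 = 19
```

The storage savings and the skipped multiplications grow with the fraction of zeros, which is the same effect sparsity-aware accelerators exploit in silicon.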
To keep up with the rapid development of new, larger, and more complex DNN models and ML methods beyond DNN, Professor Zhengya Zhang (University of Michigan) and his team suggest using modular chiplet integration. This means creating smaller, modular chips (chiplets) that can be combined in various ways to meet the growing computational demands.
The team used two strategies to scale up and scale out the hardware to support larger models and more complex functions. Homogeneous tiling builds up the hardware from identical tiles (units), while heterogeneous integration combines different types of units to handle various tasks, allowing the system to support larger and more complex models. With Liaisons from JUMP 2.0 member companies Intel, RTX, Samsung, and TSMC, this composable accelerator effort has been well received by DARPA.
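The two composition strategies can be sketched schematically as below. The tile types, throughput figures, and the simple aggregate model are all illustrative assumptions, not details of the team's actual chiplet designs.

```python
class Tile:
    """A chiplet building block with a nominal peak throughput."""
    def __init__(self, kind, ops_per_cycle):
        self.kind = kind
        self.ops_per_cycle = ops_per_cycle

def compose(tiles):
    """Model a composed accelerator as the sum of its tiles' throughput."""
    return sum(t.ops_per_cycle for t in tiles)

# Scale up via homogeneous tiling: replicate identical compute tiles.
homogeneous = [Tile("mac-array", 256) for _ in range(4)]

# Scale out via heterogeneous integration: mix different tile types
# so each task runs on a unit suited to it.
heterogeneous = [Tile("mac-array", 256),
                 Tile("sparse-engine", 128),
                 Tile("vector-unit", 64)]

print(compose(homogeneous))     # 1024 ops/cycle from identical tiles
print(compose(heterogeneous))   # 448 ops/cycle from mixed tiles
```

In practice, composing real chiplets also involves inter-die interconnect, packaging, and scheduling concerns that this toy model deliberately omits.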
While advancements in DNNs and ML methods have created significant computational workload challenges, we are excited to report this innovative approach, which paves the way for more efficient and robust systems. These developments not only enhance the performance and energy efficiency of DNN and ML computations but also open new avenues for future research and applications. The collaboration between academia and industry will play a crucial role in driving further innovations in artificial intelligence. This knowledge is further transferred into SRC member companies as graduate students are hired into the industry, as evidenced by two notable alumni from Prof. Zhang's group. Sung-Gun Cho was hired by Intel after several successful internships, while Teyuh Chou is now with AMD. Student hires into member companies are the most effective form of technology transfer for SRC member companies.
Visit Dr. Zhengya Zhang's SRC/DARPA JUMP 2.0 ACE Center project, “Composable Distributed Acceleration” (3134.002), at https://app.pillar.science/projects/3209/overview. You can view a recent presentation here and a recent publication here.