Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, and Joel S. Emer
A significant amount of specialized hardware has been developed in both academia and industry for processing deep neural networks (DNNs). This article highlights the key concepts required to evaluate and compare these DNN processors. We discuss existing challenges, such as the flexibility and scalability needed to support a wide range of neural networks, as well as design considerations for both the DNN processors and the DNN models themselves. We also describe specific metrics that can be used to evaluate and compare existing solutions beyond the commonly used tera-operations per second per watt (TOPS/W). This article is based on the tutorial “How to Understand and Evaluate Deep Learning Processors,” given at the 2020 International Solid-State Circuits Conference, as well as excerpts from the book Efficient Processing of Deep Neural Networks.