Deep Neural Networks (DNNs) have led to significant breakthroughs that expand the possibilities of applying artificial intelligence (AI) to many domains. Although DNNs have been driving mainstream AI applications, deploying them efficiently on modern hardware is becoming challenging due to their increasingly compute- and data-intensive nature.
In this talk, I will first review recent hardware-centric advances in domain-specific architectures for DNNs. I will then discuss representative software-centric techniques, such as quantization and pruning, that improve execution efficiency. Finally, I will present my research plan for integrating software methods with hardware specialization to build efficient deep learning systems in the post-Moore's-law era.