Event Details

Automatic Tensorized Program Optimization

Date: 12/16/2021 2:52 pm
Track: Lightning Talks

Organization: Carnegie Mellon University
Speakers: Bohan Hou

Deploying high-performance machine learning models has become an increasingly important challenge across many domains. Tensorized programs are programs built from hierarchical loop nests, multi-dimensional loads and stores, and hardware acceleration primitives; optimized implementations of machine-learning operators are usually tensorized programs. We therefore aim to apply machine-learning-based automation to the optimization of tensorized programs, improving the inference time of common machine learning operators. To achieve this goal, we implement TensorIR, a compiler intermediate representation designed for tensorized programs, and further design and implement MetaSchedule, a new automatic optimization framework. When Tensor Core acceleration primitives are used on GPU, the inference speed of the GEMM (General Matrix Multiplication) operator across different shapes is improved and becomes comparable to CUTLASS and cuBLAS, both of which are high-performance kernel libraries.
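To give a flavor of what a tensorized program looks like in TensorIR, below is a minimal sketch of a GEMM written in TVM's TVMScript frontend. The shapes, dtypes, and block name are illustrative assumptions, not taken from the talk; schedules targeting Tensor Cores would be applied on top of a kernel like this by MetaSchedule.

```python
# Minimal TensorIR GEMM sketch (assumes a recent TVM build with TVMScript).
# Shapes (1024x1024), mixed fp16/fp32 precision, and the block name "matmul"
# are illustrative choices, not from the original abstract.
import tvm
from tvm.script import tir as T


@T.prim_func
def gemm(a: T.handle, b: T.handle, c: T.handle) -> None:
    # Bind the opaque handles to typed multi-dimensional buffers.
    A = T.match_buffer(a, (1024, 1024), "float16")
    B = T.match_buffer(b, (1024, 1024), "float16")
    C = T.match_buffer(c, (1024, 1024), "float32")
    # Hierarchical loop nest over the output tile and reduction axis.
    for i, j, k in T.grid(1024, 1024, 1024):
        with T.block("matmul"):
            # S = spatial axis, R = reduction axis.
            vi, vj, vk = T.axis.remap("SSR", [i, j, k])
            with T.init():
                C[vi, vj] = T.float32(0)
            C[vi, vj] = C[vi, vj] + (
                T.cast(A[vi, vk], "float32") * T.cast(B[vk, vj], "float32")
            )
```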

Register for TVMCon 2021