In this session we discuss how to use automatic FP16 quantization in TVM. We cover some of the benefits of this feature, as well as when and how to use it in TVM. At the end, we present a demo showing the resulting model speedup.
This session is split into two parts: a 20-minute talk and a 10-minute community breakout session.