Quantization is the process of reducing the numerical precision of a model’s weights and activations (e.g., converting from 32-bit floats to 8-bit integers) to decrease memory footprint and improve inference speed with minimal impact on model accuracy.
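The mapping from floats to low-precision integers can be sketched as follows. This is a minimal illustration of symmetric per-tensor post-training quantization using NumPy; the function names are illustrative, not from any particular library:

```python
import numpy as np

def quantize_int8(weights):
    # Symmetric per-tensor quantization: map the float32 range
    # [-max|w|, +max|w|] linearly onto the int8 range [-127, 127].
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float32 values (used at inference time or
    # to measure the accuracy impact of quantization).
    return q.astype(np.float32) * scale

w = np.array([0.52, -1.30, 0.07, 0.91], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Each int8 weight uses 4x less memory than its float32 original,
# and the round-trip error is at most half the step size (scale / 2).
```

Real toolchains (such as those in the sources below) add refinements like per-channel scales, asymmetric zero-points, and calibration over representative data, but the core idea is this linear rescaling.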
Sources:
- TensorFlow Model Optimization: Post-Training Quantization
- NVIDIA Developer: Quantization Techniques for DNNs