Model Distillation (also called knowledge distillation) is a technique in which a smaller “student” model is trained to reproduce the behavior of a larger “teacher” model by learning from its temperature-softened output distributions or intermediate representations, enabling cheaper, faster inference while retaining much of the teacher’s performance.
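Below is a minimal sketch of the soft-target variant in PyTorch. The tiny teacher/student networks, the temperature `T=4.0`, and the mixing weight `alpha=0.5` are illustrative assumptions, not values taken from any particular source; the loss combines a temperature-scaled KL divergence against the teacher’s softened outputs with ordinary cross-entropy against the ground-truth labels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical tiny teacher and student classifiers, for illustration only.
teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft targets: both distributions are softened by temperature T.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # T^2 keeps soft-target gradients comparable across temperatures
    # Hard targets: standard cross-entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
teacher.eval()  # the teacher is frozen; only the student is updated

# One illustrative training step on random data.
x = torch.randn(16, 32)
labels = torch.randint(0, 10, (16,))
with torch.no_grad():
    teacher_logits = teacher(x)
student_logits = student(x)
loss = distillation_loss(student_logits, teacher_logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The same loop applies to real models: run the teacher in inference mode, compute its logits per batch, and train the student on the blended soft/hard loss.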
Sources:
- Hinton, Vinyals & Dean (2015): Distilling the Knowledge in a Neural Network (arXiv:1503.02531)
- Distiller.ai: Model Distillation Overview