Optimizations
Quantization
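Quantization reduces model size and speeds up inference by storing weights (and optionally activations) in lower-precision formats such as int8. As a minimal sketch, assuming PyTorch is available, post-training dynamic quantization of a toy model (the model itself is hypothetical, for illustration only) might look like this:

```python
import torch
import torch.nn as nn

# Hypothetical toy model used only for illustration.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

# Post-training dynamic quantization: weights of nn.Linear layers are
# converted to int8 ahead of time; activations are quantized on the fly
# at inference, so no calibration data is needed.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
out = quantized(x)
print(out.shape)  # torch.Size([1, 10])
```

Quantization-aware training, by contrast, simulates quantization during training so the model can adapt to the reduced precision.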
Knowledge Distillation
Transfer the knowledge of a larger (teacher) model to a smaller (student) one.
Soft targets: train the student on the teacher's logits rather than only on hard labels, so the student learns the teacher's full output distribution. Synthetic data: generate additional training data by labeling inputs with the teacher's predictions.
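The soft-targets idea can be sketched as a distillation loss: the KL divergence between the teacher's and student's temperature-softened distributions. The function name and temperature `T` are assumptions for illustration, not part of any specific library API:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Higher temperature T softens both distributions, exposing the
    # teacher's relative preferences among non-top classes.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # The T*T factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)

student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
loss = distillation_loss(student_logits, teacher_logits)

# Identical logits give zero divergence.
zero_loss = distillation_loss(student_logits, student_logits.clone())
```

In practice this term is usually combined with the ordinary cross-entropy loss on the true labels, weighted by a mixing coefficient.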
Sparsity
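Sparsity removes (zeroes out) weights that contribute little to the output, shrinking the model and enabling sparse-kernel speedups. A minimal sketch using PyTorch's pruning utilities, with magnitude-based unstructured pruning at 50% (the layer and amount are illustrative choices):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(32, 32)

# Zero out the 50% of weights with the smallest absolute value (L1 magnitude,
# unstructured: individual weights are pruned, not whole rows or channels).
prune.l1_unstructured(layer, name="weight", amount=0.5)

# Make the pruning permanent: drop the mask reparameterization, keep the zeros.
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.0%}")
```

Structured pruning (removing whole rows, channels, or blocks) trades some accuracy flexibility for patterns that hardware can actually exploit.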
Torch compiler
References
GPU MODE IRL 2024 Keynotes
Quantization explained with PyTorch - Post-Training Quantization, Quantization-Aware Training
Knowledge Distillation
This post is licensed under CC BY 4.0 by the author.