Optimize ML Models to Run Them on Tiny Hardware using Quantization
Typically, the parameters of a neural network (its layer weights) are represented as 32-bit floating-point numbers. The rationale is that, since a model's parameters are not constrained to any specific range of values, it is wise to use a data type that can cover a wide range.
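To make this concrete, here is a minimal NumPy sketch (the shape `256 × 128` is just an illustrative toy layer) showing that a 32-bit float costs 4 bytes per parameter, which is the baseline quantization tries to shrink:

```python
import numpy as np

# A toy weight matrix, like one layer of a neural network,
# stored in the common 32-bit floating-point format.
weights = np.random.randn(256, 128).astype(np.float32)

print(weights.dtype)     # float32
print(weights.itemsize)  # 4 bytes per parameter
print(weights.nbytes)    # 256 * 128 * 4 = 131072 bytes for this layer
```

For a model with hundreds of millions of parameters, those 4 bytes per weight add up quickly, which is why smaller integer representations are attractive on tiny hardware.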