Optimize ML Models to Run Them on Tiny Hardware using Quantization
Typically, the parameters of a neural network (its layer weights) are represented as 32-bit floating-point numbers. The rationale is that, since a model's parameters are not constrained to any specific range of values, it is wise to use a data type that can cover a wide range.
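To make this concrete, here is a minimal NumPy sketch (the shape `256 × 128` is just an illustrative toy layer) showing that a 32-bit float costs 4 bytes per parameter, which is the baseline quantization tries to shrink:

```python
import numpy as np

# A toy weight matrix, like one layer of a neural network,
# stored in the common 32-bit floating-point format.
weights = np.random.randn(256, 128).astype(np.float32)

print(weights.dtype)     # float32
print(weights.itemsize)  # 4 bytes per parameter
print(weights.nbytes)    # 256 * 128 * 4 = 131072 bytes for this layer
```

For a model with hundreds of millions of parameters, those 4 bytes per weight add up quickly, which is why smaller integer representations are attractive on tiny hardware.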