Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

https://arxiv.org/abs/1510.00149

Step 1: Pruning

Step 2 Quantization

Initialization:

  • Forgy (random) initialization randomly chooses k observations from the data set and uses these as
    the initial centroids.

  • Density-based initialization linearly spaces the CDF of the weights in the y-axis, then finds the
    horizontal intersection with the CDF, and finally finds the vertical intersection on the x-axis, which
    becomes a centroid,

  • Linear initialization linearly spaces the centroids between the [min, max] of the original weights.

Step 3:Hoffman Encoding

results matching ""

    No results matching ""