Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression

The key idea

The paper introduces grouped lattice vector quantisation (GLVQ), a weight quantisation technique tha...
#efficient-inference #quantisation
Origin | Interest | Match