Quantization plays a crucial role in deploying Large Language Models (LLMs) in resource-constrained environments. However, the presence of outlier features significantly hinders low-bit quantization.
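As a rough illustration of why outliers hurt low-bit quantization, the sketch below applies plain symmetric absmax quantization at 4 bits to synthetic Gaussian values, with and without a single large outlier. The absmax scheme, the outlier value of 60.0, and the helper names `quantize_absmax` and `mean_abs_error` are assumptions chosen for illustration; they are not taken from the work described above.

```python
import random

def quantize_absmax(values, bits=4):
    """Symmetric absmax quantization: scale onto a signed integer grid, round, map back."""
    qmax = 2 ** (bits - 1) - 1                    # 7 for 4-bit signed integers
    scale = max(abs(v) for v in values) / qmax    # one scale shared by all values
    return [round(v / scale) * scale for v in values]

def mean_abs_error(original, restored, count):
    """Mean absolute reconstruction error over the first `count` values."""
    pairs = zip(original[:count], restored[:count])
    return sum(abs(a - b) for a, b in pairs) / count

random.seed(0)
normal = [random.gauss(0.0, 1.0) for _ in range(1024)]
with_outlier = normal + [60.0]                    # hypothetical outlier feature

# Error is measured on the same 1024 ordinary values in both cases.
err_clean = mean_abs_error(normal, quantize_absmax(normal), len(normal))
err_outlier = mean_abs_error(normal, quantize_absmax(with_outlier), len(normal))

print(f"mean abs error without outlier: {err_clean:.4f}")
print(f"mean abs error with outlier:    {err_outlier:.4f}")
```

Because the single outlier stretches the shared scale, most ordinary values land on only a few of the available integer levels, so their reconstruction error grows even though the values themselves are unchanged.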
[2025.07.07] Our PodGPT manuscript has been published in npj Biomedical Innovations.
[2025.04.01] We share our code to evaluate PodGPT and the baseline using the perplexity metric on ...