Welcome to the official Hugging Face organization for LLMQ. This organization hosts LLM checkpoints quantized with cutting-edge quantization methods. To get started, browse the collection and pick the model that best fits your use case.
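Once you have chosen a model, it can be loaded with the standard `transformers` API. The snippet below is a minimal sketch; the repository id `LLMQ/<model-name>` is a placeholder for the model you selected, and some quantization formats may additionally require their own runtime packages (e.g. GPTQ or AWQ kernels) plus `accelerate` for `device_map="auto"`.

```python
# Minimal sketch: download and run a quantized checkpoint from this organization.
# "LLMQ/<model-name>" is a hypothetical placeholder -- replace it with a real repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "LLMQ/<model-name>"  # placeholder, choose a model from the organization page

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# Simple generation check to confirm the quantized model loads and runs.
inputs = tokenizer("Hello, quantized world!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```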
We are dedicated to advancing the field of Artificial Intelligence with a focus on efficiency. Our primary research interests include quantization, binarization, and efficient learning. We are committed to developing cutting-edge techniques that make large language models (LLMs) more accessible and sustainable, minimizing computational cost while maximizing performance. Our interdisciplinary approach leverages global expertise to push the boundaries of efficient AI technologies.
Recent Works:
[22.04.2024] How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study. arXiv, 2024. [arXiv] [GitHub]