At least one NVIDIA 20-series or higher GPU with more than 12GB of VRAM is required.
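Whether a machine meets this requirement can be checked quickly with PyTorch. A minimal sketch, assuming `torch` is installed with CUDA support:

```python
import torch

# List every visible CUDA device with its total VRAM; the quantization
# step below needs a single GPU with more than 12 GB.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GiB")
```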
If you only need the quantized model, a pre-quantized Int4 build can be cloned directly:

```bash
git clone https://huggingface.co/openbmb/MiniCPM3-4B-GPTQ-Int4
```

To perform the quantization yourself, follow the steps below.
Acquire MiniCPM Model Weights

Hugging Face repositories store model weights via Git LFS, so make sure `git lfs install` has been run before cloning.

```bash
git clone https://huggingface.co/openbmb/MiniCPM3-4B
```
Acquire Quantization Script

```bash
git clone https://github.com/OpenBMB/MiniCPM
```
Install the AutoGPTQ Branch

For now, install from my forked branch (a PR has been submitted upstream):

```bash
git clone https://github.com/LDLINGLINGLING/AutoGPTQ.git
cd AutoGPTQ
git checkout minicpm3
pip install -e .
```
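The quantization script in the next step presumably wraps AutoGPTQ's standard quantization API. As a rough sketch of what that API looks like, with hypothetical paths and a toy one-sentence calibration set (the repository's `gptq_quantize.py` is the authoritative version):

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

pretrained_dir = "MiniCPM3-4B"           # hypothetical: path to the fp16 weights
quantized_dir = "MiniCPM3-4B-gptq-4bit"  # hypothetical: output directory

tokenizer = AutoTokenizer.from_pretrained(pretrained_dir, trust_remote_code=True)
# GPTQ needs calibration examples; a real run should use a few hundred samples.
examples = [tokenizer("MiniCPM3 is a 4B-parameter language model from OpenBMB.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(
    pretrained_dir, quantize_config, trust_remote_code=True
)
model.quantize(examples)          # run GPTQ calibration and weight quantization
model.save_quantized(quantized_dir)
```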
Start Quantization

```bash
cd MiniCPM/quantize
# Replace no_quant_model_path with the directory where the MiniCPM3 weights
# are saved, and quant_save_path with the directory where the quantized
# model should be written.
python gptq_quantize.py --pretrained_model_dir no_quant_model_path --quantized_model_dir quant_save_path --bits 4
```
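Once quantization finishes, the output can be loaded for inference. A minimal sketch, assuming the AutoGPTQ branch installed above and a hypothetical output path; the same code works for the pre-quantized download at the top of this section:

```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

quant_save_path = "MiniCPM3-4B-gptq-4bit"  # hypothetical: the --quantized_model_dir from above

# MiniCPM3 ships custom modeling code, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(quant_save_path, trust_remote_code=True)
model = AutoGPTQForCausalLM.from_quantized(
    quant_save_path, device="cuda:0", trust_remote_code=True
)

inputs = tokenizer("Hello, MiniCPM3!", return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```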