chatglm.cpp简明教程 – Neohope的网络笔记

1、下载并编译chatglm.cpp

git clone --recursive https://github.com/li-plus/chatglm.cpp.git
cd chatglm.cpp
git submodule update --init --recursive
#cmake -B build
cmake -B build -DGGML_OPENBLAS=ON
#cmake -B build -DGGML_CUBLAS=ON
cmake --build build -j --config Release

2、下载模型，转化为ggml格式

#从hf下载模型，下载完成后，本地地址为 ~/.cache/huggingface/hub/模型名称
#部分代码文件会有缺失，可以到hf上对比下载
from transformers import AutoModel
model = AutoModel.from_pretrained("THUDM/chatglm-6b",trust_remote_code=True)

#模型转化为ggml格式
#同时进行量化，降低资源需求
pip install torch tabulate tqdm transformers accelerate sentencepiece
python3 chatglm_cpp/convert.py -i PATH_TO_MODEL -t q4_0 -o chatglm-6b-q40-ggml.bin

3、运行模型

./build/bin/main -m chatglm-6b-q40-ggml.bin -i

4、常见问题

#下面的错误，是transformers版本太高导致
AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'. Did you mean: '_tokenize'?
#需要降低transformers版本
pip uninstall transformers
pip install transformers==4.33.2

Leave a Reply Cancel reply