Loading LLM Models

FallingAngel, 2025/06/22

In most cases, you can load a model directly with HuggingFace's pipeline, like this:

import torch
from transformers import pipeline

# Load a text-generation pipeline in half precision
generator = pipeline("text-generation", model="JetBrains/Mellum-4b-sft-kotlin", torch_dtype=torch.float16)

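As long as the model is not too old, this usually loads without problems. Sometimes, however, an older model has its weights loaded through torch.load, and if the PyTorch version in the environment is below 2.6, the following error is raised: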

ValueError: Due to a serious vulnerability issue in `torch.load`, even with `weights_only=True`, we now require users to upgrade torch to at least v2.6 in order to use the function. This version restriction does not apply when loading files with safetensors.
See the vulnerability report here https://nvd.nist.gov/vuln/detail/CVE-2025-32434

(nlp-journey) fallingangel@FallingAngel:~/nlp-journey$ pip list | grep torch
torch                    2.5.1+cu121
torchaudio               2.5.1+cu121
torchvision              0.20.1+cu121
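The same version check can be done from Python rather than by reading the pip output. This is a minimal sketch, not part of transformers: the helper name `torch_older_than_2_6` is hypothetical, and it only compares the major and minor numbers of a version string such as the one shown above.

```python
import re


def torch_older_than_2_6(version: str) -> bool:
    """Return True if a torch version string such as '2.5.1+cu121' is below 2.6."""
    # Drop the local build tag ('+cu121'), then read the major and minor numbers.
    match = re.match(r"(\d+)\.(\d+)", version.split("+")[0])
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) < (2, 6)


print(torch_older_than_2_6("2.5.1+cu121"))  # True
```

In a real environment you would pass `torch.__version__` to the helper instead of a literal string.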

In this case, the model has to be loaded a different way:

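from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# Load the weights from safetensors instead of going through torch.load
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-zh-en", use_safetensors=True)
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-zh-en")

translator = pipeline("translation_zh_to_en", model=model, tokenizer=tokenizer)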

Two Loading Methods

The examples above use two ways of loading a model: calling pipeline directly, and calling from_pretrained with use_safetensors=True.

That is only the surface. Underneath, pipeline supports the following two ways of loading model weights:

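Loading with PyTorch's torch.load. This requires PyTorch 2.6 or later (the check is enforced by transformers, as the error above shows); on an older version, the weights must be loaded via safetensors instead.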

safetensors

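A newer format led by HuggingFace. It is safer, because a safetensors file stores only raw tensor data and loading it does not execute code, and it is faster to load.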
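To see why the pickle-based path is treated as risky, it helps to recall that torch.load is built on Python's pickle, and unpickling can call arbitrary functions. The sketch below uses only the standard library and is a generic illustration of that property; it is not the exploit described in CVE-2025-32434, and the class name `Payload` is hypothetical.

```python
import pickle


class Payload:
    """An object whose pickle stream asks the loader to call a function."""

    def __reduce__(self):
        # On load, pickle calls eval("1 + 1") and uses the result as the object.
        # A malicious file could name any importable callable here.
        return (eval, ("1 + 1",))


blob = pickle.dumps(Payload())

# Loading the bytes runs eval, so we get back 2 rather than a Payload instance.
result = pickle.loads(blob)
print(result)  # 2
```

A safetensors file has no equivalent hook: it holds a header plus raw tensor bytes, so there is nothing for the loader to execute.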