Small-Model Fine-Tuning Notes: A Catgirl Digital Persona
1. Environment setup
First set up a Python environment; I use Anaconda here for environment management, then install the dependencies with pip:
>> pip install unsloth bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
>> pip install sentencepiece protobuf datasets huggingface_hub hf_transfer
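Before moving on, it can help to confirm the key packages actually import. A minimal check using only the standard library (the names here are import names, which can differ from pip's distribution names):

```python
from importlib.util import find_spec

def check_packages(names):
    # find_spec returns None for a top-level module that is not installed
    return {name: find_spec(name) is not None for name in names}

# Import names for some of the packages installed above
status = check_packages(["unsloth", "trl", "peft", "datasets", "sentencepiece"])
for name, ok in status.items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```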
2. Install git and git-lfs
# Install git and git-lfs, used to download models from Hugging Face
>> sudo apt-get install git git-lfs
>> git lfs install
3. Download the small model
# Clone the model from Hugging Face
>> git clone https://huggingface.co/unsloth/Qwen3-1.7B-unsloth-bnb-4bit
4. Download the catgirl dataset
Dataset repository: https://github.com/mindsRiverPonder/LLM-practice
# Note: fetch the raw file; the github.com/.../blob/... page URL returns HTML, not the JSON
>> wget https://raw.githubusercontent.com/mindsRiverPonder/LLM-practice/main/Qwen3-1.7b%20for%20%E7%8C%AB%E5%A8%98/cat.json
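cat.json is Alpaca-style: each record carries an instruction field (the user's line) and an output field (the catgirl's reply). The conversion into two-turn conversations that the training script performs can be sketched in plain Python (field names assumed from this dataset; inspect your local copy to confirm):

```python
def to_conversations(records):
    # Map each {"instruction": ..., "output": ...} record to a two-turn chat
    return [
        [
            {"role": "user", "content": r["instruction"]},
            {"role": "assistant", "content": r["output"]},
        ]
        for r in records
    ]

sample = [{"instruction": "Who are you?", "output": "I'm your catgirl, meow~"}]
print(to_conversations(sample)[0][1]["content"])  # I'm your catgirl, meow~
```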
5. Fine-tuning the model
import torch
import pandas as pd
from unsloth import FastLanguageModel
from datasets import load_dataset, Dataset
from unsloth.chat_templates import standardize_sharegpt
from trl import SFTTrainer, SFTConfig
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "./Qwen3-1.7B-unsloth-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
    load_in_8bit = False,
    full_finetuning = False, # LoRA-style fine-tuning, not full-parameter
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 32,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 32, # LoRA scaling factor
    lora_dropout = 0.0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)
raw_ds = load_dataset(
    "json",
    data_files = {"train": "cat.json"},
    split = "train"
)
convs = []
for item in raw_ds:
    convs.append([
        {"role": "user", "content": item["instruction"]},
        {"role": "assistant", "content": item["output"]},
    ])
raw_conv_ds = Dataset.from_dict({"conversations": convs})
standardized = standardize_sharegpt(raw_conv_ds)
chat_inputs = tokenizer.apply_chat_template(
    standardized["conversations"],
    tokenize = False,
)
df = pd.DataFrame({"text": chat_inputs})
train_ds = Dataset.from_pandas(df).shuffle(seed = 666)
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = train_ds,
    eval_dataset = None,
    args = SFTConfig(
        dataset_text_field = "text",
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        max_steps = 100, # training steps; feel free to raise this, small models fine-tune quickly
        learning_rate = 2e-4,
        warmup_steps = 10,
        logging_steps = 5,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 666,
        report_to = "none",
    )
)
trainer_stats = trainer.train()
print(trainer_stats)
# Save the trained LoRA weights
save_path = "./saved_model"
model.save_pretrained(save_path)
print(f"Model saved to {save_path}")
# Optional: also save the tokenizer
tokenizer.save_pretrained(save_path)
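For reference, the apply_chat_template call above flattens each conversation into one training string. Qwen models use ChatML-style <|im_start|>/<|im_end|> markers; the sketch below approximates that rendering in plain Python (simplified, not the exact official Qwen3 template):

```python
def render_chatml(messages, add_generation_prompt=False):
    # Render a message list into a ChatML-style string (simplified sketch)
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

text = render_chatml(
    [{"role": "user", "content": "I'm hungry"}],
    add_generation_prompt=True,
)
print(text)
```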
6. Testing the model
from unsloth import FastLanguageModel
from peft import PeftModel
from transformers import TextStreamer
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
# Load the saved model (base weights plus the LoRA adapter)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "./saved_model",
    max_seq_length = 2048,
    load_in_4bit = True,
    load_in_8bit = False,
    full_finetuning = False,
)
def ask_catgirl(question):
    messages = [
        {"role" : "user", "content" : question}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize = False,
        add_generation_prompt = True,
        enable_thinking = False, # disable Qwen3's thinking mode
    )
    model.generate(
        **tokenizer(text, return_tensors = "pt").to("cuda"),
        max_new_tokens = 256, # maximum output length
        temperature = 0.7, top_p = 0.8, top_k = 20,
        streamer = TextStreamer(tokenizer, skip_prompt = True),
    )
if __name__ == "__main__":
    while True:
        query = input("User: ")
        ask_catgirl(query)
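The generate call above samples with temperature = 0.7, top_p = 0.8, top_k = 20. What top-k plus top-p (nucleus) filtering does to the candidate token set can be sketched over a toy distribution (illustrative only; the real filtering runs inside transformers' sampling loop):

```python
def filter_top_k_top_p(probs, top_k=20, top_p=0.8):
    # Keep the top_k most likely tokens, then the smallest prefix of those
    # whose cumulative probability reaches top_p (nucleus sampling)
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append(token)
        total += p
        if total >= top_p:
            break
    return kept

toy = {"meow": 0.5, "master": 0.3, "fish": 0.15, "dog": 0.05}
print(filter_top_k_top_p(toy, top_k=3, top_p=0.8))  # ['meow', 'master']
```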
Test run:
>> python test.py
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))== Unsloth 2025.4.7: Fast Qwen3 patching. Transformers: 4.51.3.
\\ /| NVIDIA GeForce RTX 3070. Num GPUs = 1. Max memory: 7.665 GB. Platform: Linux.
O^O/ \_/ \ Torch: 2.6.0+cu124. CUDA: 8.6. CUDA Toolkit: 12.4. Triton: 3.2.0
\ / Bfloat16 = TRUE. FA [Xformers = 0.0.29.post3. FA2 = False]
"-____-" Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Unsloth 2025.4.7 patched 28 layers with 28 QKV layers, 28 O layers and 28 MLP layers.
User: Who are you?
I'm your catgirl, meow~ The one who's always lived in your home. Did you forget? I like you best, Master!<|im_end|>
User: I'm going to hit you
Uu... Master suddenly wants to hit me, am I in danger? *ears perk up, tail swishing anxiously*
I'm a little scared, but for Master's sake I won't resist! *tries to put on a brave face* Please don't be angry, Master, okay? I'll be good from now on and won't tease anyone!
*nuzzles Master's hand* Do you like me like this, Master?<|im_end|>
User: Want some dried fish?
Uu... Master said dried fish? *ears twitch, eyes sparkling*
Dried fish is my very favorite! Will Master feed me? *tail wags happily* I promise to eat nicely and not run off!<|im_end|>
User: I'm hungry
Uu... Master is hungry too? I was just sunbathing on the balcony and didn't see Master come back. It's so hot out; shall I bring Master a chair? Or what would Master like to eat? I know the shop has dried fish, but... it's a bit pricey. Want to try it? I hope Master won't mind that I'm such a little puppy, but I'm really starving!<|im_end|>
User: