Fine-tuning ChatGLM with LoRA

Author: alg789 | 2024-02-03 19:08:28

Project: kingglory/ChatGLM-Tuning

1. Introduction

Original reference: https://articles.zsxq.com/id_e2389qm0w0sx.html
This post fine-tunes the ChatGLM-6B model with LoRA,
using the alpaca dataset as the example.

Hardware requirements

GPU: VRAM >= 16 GB (24 GB or more recommended)
Environment:
- python>=3.8
- cuda>=11.6, plus cupti, cuDNN, TensorRT, and the rest of the deep learning stack
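
To see roughly where the 16 GB figure comes from, here is a back-of-the-envelope estimate of the memory needed just to hold the model weights. It assumes ChatGLM-6B has about 6.2 billion parameters (a figure not stated in this post) and ignores activations, gradients, and optimizer state, so the results are only a lower bound. The function name is illustrative.

```python
# Rough lower bound on GPU memory for the model weights alone.
# Assumption: ChatGLM-6B has about 6.2 billion parameters.
PARAM_COUNT = 6.2e9

# Storage cost of one parameter in each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(param_count: float, dtype: str) -> float:
    """Return the size of the weights in GiB for the given dtype."""
    return param_count * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(PARAM_COUNT, dtype):.1f} GiB")
```

Under these assumptions the fp16 weights alone take about 11.5 GiB, which leaves little headroom on a 16 GB card and is one reason the int8 loading used later in this post helps.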

2. Environment setup

2.1 Creating the Python environment

conda create -n py310_chat python=3.10       # create a new environment
conda activate py310_chat                    # activate it

2.2 Downloading the fine-tuning code

git clone https://github.com/mymusise/ChatGLM-Tuning.git
cd ChatGLM-Tuning

2.3 Installing the Python dependencies

Fine-tuning requires transformers version 4.27.1.

Contents of requirements.txt:

# int8
bitsandbytes==0.37.1
accelerate==0.17.1

# chatglm
protobuf>=3.19.5,<3.20.1
transformers==4.27.1
icetk
cpm_kernels==1.0.11
torch>=1.13.1
tensorboard

#
datasets==2.10.1
git+https://github.com/huggingface/peft.git  # latest version, >=0.3.0.dev0
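
Because the pins above are strict, it can help to confirm what is actually installed before starting a run. The following is a minimal sketch using only the standard library; the `check_pins` helper and the choice to check only the exactly pinned packages are assumptions made for illustration.

```python
from importlib import metadata

# Exact pins copied from requirements.txt above; range pins are not checked here.
PINNED = {
    "transformers": "4.27.1",
    "bitsandbytes": "0.37.1",
    "accelerate": "0.17.1",
    "datasets": "2.10.1",
    "cpm_kernels": "1.0.11",
}


def check_pins(pins: dict) -> dict:
    """Map each package to 'ok', 'missing', or the mismatched installed version."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == wanted else f"found {installed}"
    return report


for package, status in check_pins(PINNED).items():
    print(f"{package}: {status}")
```

Running this inside the activated conda environment prints one line per package, so a mismatched transformers version shows up before training starts.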

The last entry, peft, kept failing to install, so in the end I cloned the repository manually:

git clone https://github.com/huggingface/peft.git

Once it is downloaded, enter the peft folder and install it locally:

pip install .

If you want to modify the peft source, use the following command instead; changes then take effect immediately:

pip install -e .

-e means editable; don't forget the trailing dot.

Then comment out the final peft line in requirements.txt and install the remaining dependencies:

pip install -r requirements.txt

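3. Usage

3.1 Downloading the training data

3.1.1 Data source

alpaca

3.1.2 Data description

This section uses alpaca as the fine-tuning data for this task-specific experiment.
Sample: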

[
    {
        "instruction": "Give three tips for staying healthy.",
        "input": "",
        "output": "1.Eat a balanced diet and make sure to include plenty of fruits and vegetables. \n2. Exercise regularly to keep your body active and strong. \n3. Get enough sleep and maintain a consistent sleep schedule."
    },...
]

Fields
- instruction: the instruction
- input: the input (empty in the sample above)
- output: the output
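
Before converting the data, a quick check that every record carries these three fields can catch a malformed file early. This is a small sketch with an inline sample (the output string is shortened); `validate_records` is a hypothetical helper, not part of ChatGLM-Tuning.

```python
import json

# One record in the alpaca format, modeled on the sample above (output shortened).
SAMPLE = """
[
    {
        "instruction": "Give three tips for staying healthy.",
        "input": "",
        "output": "1.Eat a balanced diet and make sure to include plenty of fruits and vegetables."
    }
]
"""

REQUIRED_FIELDS = ("instruction", "input", "output")


def validate_records(records: list) -> list:
    """Return the indices of records that lack a field or hold a non-string value."""
    bad = []
    for index, record in enumerate(records):
        if not all(isinstance(record.get(field), str) for field in REQUIRED_FIELDS):
            bad.append(index)
    return bad


records = json.loads(SAMPLE)
print("malformed records:", validate_records(records))
```

For the real file, the same function could be applied to the result of `json.load` on data/alpaca_data.json.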

3.1.3 Converting the alpaca dataset to jsonl

Run:

python cover_alpaca2jsonl.py  --data_path data/alpaca_data.json  --save_path data/alpaca_data.jsonl 

This generates data/alpaca_data.jsonl:

{"text": "### Instruction:\nIdentify the odd one out.\n\n### Input:\nTwitter, Instagram, Telegram\n\n### Response:\nTelegram\nEND\n"}
{"text": "### Instruction:\nExplain why the following fraction is equivalent to 1/4\n\n### Input:\n4/16\n\n### Response:\nThe fraction 4/16 is equivalent to 1/4 because both numerators and denominators are divisible by 4. Dividing both the top and bottom numbers by 4 yields the fraction 1/4.\nEND\n"}

Note: text contains three parts (Instruction, Input, and Response), concatenated in this format:

### Instruction:\n[Instruction content]\n\n### Input:\n[Input content]\n\n### Response:\n[Response content]\nEND\n
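
The template above can be sketched as a small function. `build_text` is a hypothetical name, and it follows the template literally; the real cover_alpaca2jsonl.py may differ, for example in how it treats records whose input is empty.

```python
import json


def build_text(instruction: str, input_text: str, output: str) -> str:
    """Join the three fields using the template from the note above."""
    return (
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{input_text}\n\n"
        f"### Response:\n{output}\nEND\n"
    )


# Rebuild the first generated line shown above from its three fields.
line = json.dumps(
    {"text": build_text("Identify the odd one out.", "Twitter, Instagram, Telegram", "Telegram")}
)
print(line)
```

Each record becomes one JSON object per line, which is what makes the file jsonl rather than a single JSON array.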

3.2 Tokenizing the dataset

python tokenize_dataset_rows.py  --jsonl_path data/alpaca_data.jsonl  --save_path data/alpaca     --max_seq_length 128

jsonl_path: path to the fine-tuning data in jsonl format; the ['context'] and ['target'] fields of each row are encoded
save_path: output path
max_seq_length: maximum length of a sample
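
To illustrate what these three parameters imply, here is a toy sketch of encoding one row and capping its length. It uses a whitespace tokenizer as a stand-in for ChatGLM's tokenizer, and both the plain truncation and the `seq_len` field are assumptions; this post does not show how tokenize_dataset_rows.py actually handles overlong rows.

```python
from typing import Dict, List


def toy_encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Stand-in tokenizer: one id per whitespace-separated word (NOT ChatGLM's tokenizer)."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]


def encode_row(row: Dict[str, str], vocab: Dict[str, int], max_seq_length: int) -> Dict[str, object]:
    """Encode context and target, then cap the combined sequence at max_seq_length."""
    context_ids = toy_encode(row["context"], vocab)
    target_ids = toy_encode(row["target"], vocab)
    input_ids = (context_ids + target_ids)[:max_seq_length]
    # seq_len marks where the context ends, so a loss mask could skip the prompt tokens.
    return {"input_ids": input_ids, "seq_len": min(len(context_ids), max_seq_length)}


vocab: Dict[str, int] = {}
row = {"context": "Identify the odd one out.", "target": "Telegram"}
print(encode_row(row, vocab, max_seq_length=128))
```

With --max_seq_length 128, as in the command above, anything beyond 128 tokens would be cut off under this assumption, so long alpaca answers may lose their tail.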

3.3 Model fine-tuning

Run finetune.sh to fine-tune with LoRA:

nohup sh finetune.sh

Contents of finetune.sh:

python finetune.py \
    --dataset_path data/alpaca \
    --lora_rank 8 \
    --per_device_train_batch_size 6 \
    --gradient_accumulation_steps 1 \
    --max_steps 52000 \
    --save_steps 1000 \
    --save_total_limit 2 \
    --learning_rate 1e-4 \
    --fp16 \
    --remove_unused_columns false \
    --logging_steps 50 \
    --output_dir output
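
As a rough sanity check on these settings, the arithmetic below works out how much data the run sees. The single GPU and the dataset size of roughly 52,000 examples are assumptions, not values given in finetune.sh.

```python
# Values taken from finetune.sh above.
per_device_train_batch_size = 6
gradient_accumulation_steps = 1
max_steps = 52000

# Assumptions: a single GPU, and roughly 52,000 examples in alpaca_data.json.
num_gpus = 1
dataset_size = 52_000

# Samples consumed per optimizer step, then over the whole run.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
samples_seen = effective_batch_size * max_steps
epochs = samples_seen / dataset_size

print(f"effective batch size: {effective_batch_size}")
print(f"samples seen: {samples_seen}")
print(f"approximate epochs: {epochs:.1f}")
```

Under these assumptions the run covers the dataset about six times. With --save_steps 1000 and --save_total_limit 2, only the two most recent checkpoints are kept, the last being output/checkpoint-52000.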

finetune.py, slightly modified:

    # setup peft
    peft_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        inference_mode=False,
        r=finetune_args.lora_rank,
        lora_alpha=32,
        lora_dropout=0.1,
        target_modules=["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"],
    )

Changing target_modules=["query_key_value"] to target_modules=["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"] is more effective. Raising lora_rank from 8 to 32 also helps, but inference becomes very slow because there are more parameters.
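
The growth in parameters can be estimated directly. The sketch below counts the weights LoRA adds for each configuration; the ChatGLM-6B layer shapes it uses (28 layers, hidden size 4096, feed-forward size 16384) are assumptions not stated in this post, and `lora_param_count` is an illustrative helper.

```python
# Assumed ChatGLM-6B shapes (not stated in this post): 28 transformer layers,
# hidden size 4096, feed-forward size 16384, fused QKV projection of width 3 * 4096.
NUM_LAYERS = 28
HIDDEN = 4096
FFN = 16384

# (in_features, out_features) of each linear layer that LoRA can wrap.
MODULE_SHAPES = {
    "query_key_value": (HIDDEN, 3 * HIDDEN),
    "dense": (HIDDEN, HIDDEN),
    "dense_h_to_4h": (HIDDEN, FFN),
    "dense_4h_to_h": (FFN, HIDDEN),
}


def lora_param_count(target_modules, rank):
    """LoRA adds A (rank x in) and B (out x rank) per wrapped layer: rank * (in + out) weights."""
    per_layer = sum(rank * sum(MODULE_SHAPES[name]) for name in target_modules)
    return per_layer * NUM_LAYERS


all_modules = list(MODULE_SHAPES)
print("qkv only, r=8: ", lora_param_count(["query_key_value"], 8))
print("all four, r=8: ", lora_param_count(all_modules, 8))
print("all four, r=32:", lora_param_count(all_modules, 32))
```

Under these assumptions, going from the query_key_value-only setup at rank 8 to all four modules at rank 32 multiplies the number of LoRA weights by 16.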

3.4 Model inference

Run infer.py for inference:

python infer.py

import json

import torch
from peft import PeftModel
from transformers import AutoModel, AutoTokenizer

from cover_alpaca2jsonl import format_example

model_path = '/home/trainer/.cache/huggingface/hub/models--THUDM--chatglm-6b/snapshots/55cced37950bc26aa9f2209859c026f59ff7adb8'

# Load the base model in int8; device_map='auto' places it on the available GPU
model = AutoModel.from_pretrained(model_path, trust_remote_code=True, load_in_8bit=True, device_map='auto')
print('load model done !')

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
print('load tokenizer done !')

# Attach the LoRA weights from the final training checkpoint
model = PeftModel.from_pretrained(model, "./output/checkpoint-52000/")
print('load lora checkpoint done !')

with open("data/alpaca_data.json") as f:
    instructions = json.load(f)
answers = []
print('load data done !')

# Interactive mode: read a prompt from stdin, generate, and print the answer
with torch.no_grad():
    while True:
        input_text = input()
        print('!@#$\n')
        print(input_text, '\n')
        ids = tokenizer.encode(input_text)
        # Put the ids on the GPU; leaving them on the CPU triggers a transformers warning
        input_ids = torch.LongTensor([ids]).to('cuda')
        out = model.generate(
            input_ids=input_ids,
            max_length=150,
            do_sample=False,  # greedy decoding, so no temperature is needed
        )
        out_text = tokenizer.decode(out[0])
        # Strip the echoed prompt and the END marker
        answer = out_text.replace(input_text, "").replace("\nEND", "").strip()
        print('out_text:\n')
        print(out_text, '\n')
        print("###Answer:\n", answer, '\n\n')

# Batch mode over the first three alpaca examples. The loop above never returns,
# so remove it to run this part instead.
with torch.no_grad():
    for idx, item in enumerate(instructions[:3]):
        feature = format_example(item)
        input_text = feature['context']
        print('!@#$\n')
        print(input_text, '\n')
        ids = tokenizer.encode(input_text)
        input_ids = torch.LongTensor([ids]).to('cuda')
        out = model.generate(
            input_ids=input_ids,
            max_length=150,
            do_sample=False,
        )
        out_text = tokenizer.decode(out[0])
        answer = out_text.replace(input_text, "").replace("\nEND", "").strip()
        item['infer_answer'] = answer
        print('out_text:\n')
        print(out_text, '\n')
        # Print the reference answer from the dataset for comparison
        print(f"### {idx+1}.Answer:\n", item.get('output'), '\n\n')
        answers.append({'index': idx, **item})
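
The post-processing step in infer.py can be isolated and tried without loading the model. The helper name `extract_answer` is illustrative, and the decoded string is a made-up example modeled on the run log, not real model output.

```python
def extract_answer(out_text: str, input_text: str) -> str:
    """Remove the echoed prompt and the END marker from a decoded generation."""
    return out_text.replace(input_text, "").replace("\nEND", "").strip()


prompt = "What are the three primary colors?"
# Hypothetical decoded text: the prompt is echoed back, followed by the answer and END.
decoded = "What are the three primary colors? The three primary colors are red, blue, and yellow.\nEND"
print(extract_answer(decoded, prompt))
```

Because str.replace removes every occurrence, an answer that repeats the prompt verbatim would lose that repetition as well; slicing off the prompt prefix is one alternative if that matters.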


Run log:

Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
/home/trainer/wws/miniconda3/envs/glm_lora/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/trainer/wws/miniconda3/envs/glm_lora did not contain libcudart.so as expected! Searching further paths...
  warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 111
CUDA SETUP: Loading binary /home/trainer/wws/miniconda3/envs/glm_lora/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda111.so...
Killed
(glm_lora) trainer@vito:~/wws/ChatGLM-Tuning$ python infer.py
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
/home/trainer/wws/miniconda3/envs/glm_lora/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/trainer/wws/miniconda3/envs/glm_lora did not contain libcudart.so as expected! Searching further paths...
  warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 111
CUDA SETUP: Loading binary /home/trainer/wws/miniconda3/envs/glm_lora/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda111.so...
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:15<00:00,  1.93s/it]
load model done !
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
load tokenizer done !
load lora checkpoint done !
load data done !
What are the three primary colors?
!@#$

What are the three primary colors?

/home/trainer/wws/miniconda3/envs/glm_lora/lib/python3.10/site-packages/transformers/generation/utils.py:1374: UserWarning: You are calling .generate() with the `input_ids` being on a device type different than your model's device. `input_ids` is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put `input_ids` to the correct device by calling for example input_ids = input_ids.to('cuda') before running `.generate()`.
  warnings.warn(
The dtype of attention mask (torch.int64) is not bool
out_text:

What are the three primary colors? The three primary colors are red, blue, and yellow.

###Answer:
 The three primary colors are red, blue, and yellow.


How can we reduce air pollution?
!@#$

How can we reduce air pollution?

out_text:

How can we reduce air pollution? Reducing air pollution can be achieved through a variety of measures. These include reducing the use of fossil fuels, encouraging the use of renewable energy sources, improving energy efficiency in buildings and transportation, and reducing emissions from industrial processes. Additionally, governments and businesses can implement policies that incentivize the use of electric vehicles, reduce emissions from power plants, and incentivize the use of green energy sources.

###Answer:
 Reducing air pollution can be achieved through a variety of measures. These include reducing the use of fossil fuels, encouraging the use of renewable energy sources, improving energy efficiency in buildings and transportation, and reducing emissions from industrial processes. Additionally, governments and businesses can implement policies that incentivize the use of electric vehicles, reduce emissions from power plants, and incentivize the use of green energy sources.


Describe a time when you had to make a difficult decision.
!@#$

Describe a time when you had to make a difficult decision.

out_text:

Describe a time when you had to make a difficult decision. I had to make a difficult decision when I was in college. I was between two universities that both offered the same degree program but one had a better reputation. I was worried about the financial burden that it would put me through, but I was also excited about the opportunity to gain a better job prospects. Ultimately, I decided to go to the university with a better reputation.

###Answer:
 I had to make a difficult decision when I was in college. I was between two universities that both offered the same degree program but one had a better reputation. I was worried about the financial burden that it would put me through, but I was also excited about the opportunity to gain a better job prospects. Ultimately, I decided to go to the university with a better reputation.


How did Julius Caesar die?
!@#$

How did Julius Caesar die?

out_text:

How did Julius Caesar die? Julius Caesar was assassinated by a group of Roman soldiers under the leadership of Gaius Julius Caesar in the year 44 BC. The soldiers were motivated by Caesar's political ambition and the soldiers' belief that Caesar's plan to take over the Roman Empire could lead to their own reward. Caesar had been forced to commit himself to a military campaign, and the soldiers used a sharp rock to kill Caesar. Caesar's body was taken to the Roman Forum and the body was then buried in the Roman tomb at the Cisneus.

###Answer:
 Julius Caesar was assassinated by a group of Roman soldiers under the leadership of Gaius Julius Caesar in the year 44 BC. The soldiers were motivated by Caesar's political ambition and the soldiers' belief that Caesar's plan to take over the Roman Empire could lead to their own reward. Caesar had been forced to commit himself to a military campaign, and the soldiers used a sharp rock to kill Caesar. Caesar's body was taken to the Roman Forum and the body was then buried in the Roman tomb at the Cisneus.



References / Acknowledgements

THUDM/ChatGLM-6B
ChatGLM-Tuning
simple | From: 关于AiGC那些你不知道的事
ChatGLM-6B 小编填坑记
