Steps of transfer learning:
Install the dependency packages required by the pretrained models:
pip install tqdm boto3 requests regex sentencepiece sacremoses dataclasses tokenizers filelock
Load the pretrained model's tokenizer:
>>> import torch
The source of the pretrained models:
>>> source = 'huggingface/pytorch-transformers'
The loader entry point ('part') to use, here the tokenizer:
>>> part = 'tokenizer'
The name of the pretrained model to load:
>>> model_name = 'bert-base-chinese'
>>> tokenizer = torch.hub.load(source, part, model_name)
Using cache found in /root/.cache/torch/hub/huggingface_pytorch-transformers_master
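The tokenizer maps each character to a vocabulary id and wraps the sequence with BERT's special tokens [CLS] (id 101) and [SEP] (id 102). A minimal pure-Python sketch of that behavior, using a made-up toy vocabulary (the real bert-base-chinese vocabulary has 21128 entries; the ids below for 我/喜/欢 match the real ones seen later in this tutorial):

```python
# Toy sketch of what tokenizer.encode does for single-character Chinese text.
# TOY_VOCAB is hypothetical; the real tokenizer loads the full BERT vocab file.
TOY_VOCAB = {'[CLS]': 101, '[SEP]': 102, '我': 2769, '喜': 1599, '欢': 3614}

def toy_encode(text):
    """Map each character to its id, then wrap with [CLS] ... [SEP]."""
    ids = [TOY_VOCAB[ch] for ch in text]
    return [TOY_VOCAB['[CLS]']] + ids + [TOY_VOCAB['[SEP]']]

print(toy_encode('我喜欢'))  # [101, 2769, 1599, 3614, 102]
```

This is why an n-character input always produces n + 2 ids.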
Loading the pretrained model with or without a head:
The 'head' refers to the model's task-specific output layer. Loading the model without a head amounts to using the model to produce feature representations of the input text.
When loading a model with a head, three types of head are available:
modelWithLMHead (language-model head),
modelForSequenceClassification (sequence-classification head),
modelForQuestionAnswering (question-answering head)
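The choice of head is just a different `part` string passed to torch.hub.load. A small helper (the task labels here are our own, for illustration; the part strings are the real hub entry points used in this tutorial) makes the mapping explicit:

```python
# Map a task label (our own naming) to the torch.hub entry point ('part')
# exposed by the huggingface/pytorch-transformers hub repo.
HEAD_FOR_TASK = {
    'features': 'model',  # no head: raw encoder output
    'language-modeling': 'modelWithLMHead',
    'classification': 'modelForSequenceClassification',
    'question-answering': 'modelForQuestionAnswering',
}

def hub_part(task):
    """Return the hub entry point for a task, or raise on an unknown task."""
    if task not in HEAD_FOR_TASK:
        raise ValueError(f'unknown task: {task}')
    return HEAD_FOR_TASK[task]

print(hub_part('classification'))  # modelForSequenceClassification
```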
Load the pretrained model without a head:
>>> part = 'model'
>>> model = torch.hub.load(source, part, model_name)
Using cache found in /root/.cache/torch/hub/huggingface_pytorch-transformers_master
Loading the pretrained models with heads:
Load the pretrained model with the language-model head:
>>> part = 'modelWithLMHead'
>>> lm_model = torch.hub.load(source, part, model_name)
Using cache found in /root/.cache/torch/hub/huggingface_pytorch-transformers_master
Load the pretrained model with the sequence-classification head:
>>> part = 'modelForSequenceClassification'
>>> classification_model = torch.hub.load(source, part, model_name)
Load the pretrained model with the question-answering head:
>>> part = 'modelForQuestionAnswering'
>>> qa_model = torch.hub.load(source, part, model_name)
Input text:
>>> input_text = "我喜欢户外旅游"
Map the text to numeric token ids with the tokenizer:
>>> index_token = tokenizer.encode(input_text)
>>> index_token
[101, 2769, 1599, 3614, 2787, 1912, 3180, 3952, 102]
Convert the mapped ids into a tensor:
>>> token_tensor = torch.tensor([index_token])
>>> token_tensor
tensor([[ 101, 2769, 1599, 3614, 2787, 1912, 3180, 3952,  102]])
Feed the tensor into the pretrained model without a head:
>>> with torch.no_grad():
...     encoder_layers, _ = model(token_tensor)
...
Inspect the output:
>>> encoder_layers
tensor([[[ 0.8063,  0.6158, -0.1213,  ..., -0.0863,  0.0681, -0.2473],
         [ 0.7400, -0.0640,  0.8920,  ..., -1.4946,  0.2172, -0.0372],
         [ 0.9403,  0.3824, -0.4127,  ..., -0.3384,  1.2332, -0.6460],
         ...,
         [ 1.0523,  0.4518,  0.5788,  ...,  0.3973,  0.5235,  0.0141],
         [ 1.5142,  0.1522, -0.1348,  ..., -0.1085,  0.4870, -0.3403],
         [ 0.3736,  0.5509, -0.0080,  ..., -0.3194,  0.0956, -0.2440]]])
The output size is 1x9x768: each of the 9 tokens is now represented by a 768-dimensional vector.
>>> encoder_layers.size()
torch.Size([1, 9, 768])
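One common way to use these per-token feature vectors is to average them over the token axis into a single sentence vector. A pure-Python sketch with tiny mock dimensions (2 tokens x 3 features standing in for the real 9 x 768):

```python
def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one sentence vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / n for d in range(dim)]

# Mock 2x3 'encoder output'; the real one here would be 9x768.
tokens = [[0.8, 0.6, -0.1],
          [0.7, 0.0, 0.9]]
sentence_vector = mean_pool(tokens)
print([round(x, 6) for x in sentence_vector])  # [0.75, 0.3, 0.4]
```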
Feed the tensor into the model with the language-model head:
>>> with torch.no_grad():
...     lm_model_output = lm_model(token_tensor)
...
>>> lm_model_output
(tensor([[[ -8.1496,  -8.0437,  -8.0349,  ...,  -6.8098,  -6.9320,  -7.1604],
         [ -8.2627,  -8.0974,  -8.0490,  ...,  -6.8207,  -7.0922,  -6.4908],
         [-14.5500, -13.5348, -13.1422,  ...,  -8.0304,  -7.7855,  -8.6745],
         ...,
         [-15.5826, -15.5957, -14.2849,  ...,  -7.4225, -10.9780, -13.6987],
         [-15.5152, -16.1954, -15.1410,  ..., -10.0985,  -9.3024, -17.0246],
         [ -8.8590,  -8.6877,  -8.8669,  ...,  -6.3729,  -6.7244,  -7.1840]]]),)
>>> lm_model_output[0].size()
torch.Size([1, 9, 21128])
The output size is 1x9x21128: 21128 is the vocabulary size of bert-base-chinese, so for each of the 9 positions the head outputs a score over every vocabulary token.
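Each position's row of vocabulary scores can be turned into a predicted vocabulary id by taking the argmax. A pure-Python sketch over a mock 2-position x 4-id score matrix (the real matrix here is 9 x 21128):

```python
def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Mock LM-head output: 2 positions x 4 vocabulary ids (real: 9 x 21128).
position_scores = [[-8.1, -8.0, -7.9, -6.8],
                   [-14.5, -13.5, -13.1, -8.0]]
predicted_ids = [argmax(row) for row in position_scores]
print(predicted_ids)  # [3, 3]
```

In practice each predicted id would then be mapped back to a token string through the tokenizer's vocabulary.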
Feed the tensor into the model with the sequence-classification head:
>>> with torch.no_grad():
...     classification_model_output = classification_model(token_tensor)
...
>>> classification_model_output
(tensor([[0.0741, 0.2867]]),)
>>> classification_model_output[0].size()
torch.Size([1, 2])
The output size is 1x2, which can be used directly for binary text classification.
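The two logits can be converted to class probabilities with a softmax, and the larger one gives the predicted class. A pure-Python sketch using the logits shown above (note the classification head here has not been fine-tuned, so the prediction itself is not yet meaningful):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.0741, 0.2867]  # the classification head's output above
probs = softmax(logits)
predicted_class = max(range(len(probs)), key=lambda i: probs[i])
print([round(p, 3) for p in probs], predicted_class)
```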
The first sentence is a statement about an objective fact; the second sentence is a question about the first. The question-answering model outputs two tensors; the index of the maximum value in each tensor marks, respectively, the start and end position of the answer within the text.
Input text:
>>> input_text1 = '我在北京工作'
>>> input_text2 = '你在哪里工作?'
Map both sentences with the tokenizer:
>>> indexed_token = tokenizer.encode(input_text1, input_text2)
>>> indexed_token
[101, 2769, 1762, 1266, 776, 2339, 868, 102, 872, 1762, 1525, 7027, 2339, 868, 136, 102]
>>> len(input_text1)
6
>>> len(input_text2)
7
Use 0 and 1 to distinguish the first and second sentence (each segment covers 8 ids because encode adds the [CLS]/[SEP] special tokens):
>>> segments_ind = [0] * 8 + [1] * 8
>>> segments_ind
[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
Convert the mapped ids into tensors:
>>> token_tensor = torch.tensor([indexed_token])
>>> token_tensor
tensor([[ 101, 2769, 1762, 1266,  776, 2339,  868,  102,  872, 1762, 1525, 7027,
         2339,  868,  136,  102]])
>>> segments_tensor = torch.tensor([segments_ind])
>>> segments_tensor
tensor([[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]])
Run the pretrained model with the question-answering head:
>>> with torch.no_grad():
...     qa_model_output1, qa_model_output2 = qa_model(token_tensor, token_type_ids=segments_tensor)
...
>>> qa_model_output1
tensor([[ 0.8832,  0.8586,  0.2342,  0.1330, -0.1880,  0.6211,  0.0404,  0.5124,
         -0.5570,  0.2587, -0.1349, -0.2702,  0.4284, -0.5097,  0.0942,  0.5124]])
>>> qa_model_output2
tensor([[-0.2523,  0.4561, -0.3396,  0.0350, -0.3772, -0.4343,  0.0519, -0.0140,
         -0.4970, -0.2794, -0.4312, -0.8907, -0.6724, -0.5484, -0.0472, -0.0140]])
>>> qa_model_output1.size()
torch.Size([1, 16])
>>> qa_model_output2.size()
torch.Size([1, 16])
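To read an answer span out of the two tensors, take the argmax of the start scores and of the end scores. A pure-Python sketch using the start/end scores printed above (the QA head here is untrained, so the extracted span is arbitrary; after fine-tuning on a QA dataset, the span would cover the actual answer tokens, e.g. '北京'):

```python
def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Start/end scores copied from the qa_model outputs above.
start_scores = [0.8832, 0.8586, 0.2342, 0.1330, -0.1880, 0.6211, 0.0404, 0.5124,
                -0.5570, 0.2587, -0.1349, -0.2702, 0.4284, -0.5097, 0.0942, 0.5124]
end_scores = [-0.2523, 0.4561, -0.3396, 0.0350, -0.3772, -0.4343, 0.0519, -0.0140,
              -0.4970, -0.2794, -0.4312, -0.8907, -0.6724, -0.5484, -0.0472, -0.0140]

start, end = argmax(start_scores), argmax(end_scores)
# The answer ids would then be indexed_token[start:end + 1].
print(start, end)  # 0 1
```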