August 31, 2021

"A recommended PyTorch Reformer library: building a Chinese model is easy too (reformer-pytorch-chinese)"


Transformers are powerful but consume a lot of resources. Fortunately, Google's Reformer greatly optimizes resource consumption, which lets us experiment at a much smaller cost; GPUs really aren't cheap.

reformer-pytorch is worth a try:

https://github.com/lucidrains/reformer-pytorch
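Both the Reformer implementation and the tokenizer come from PyPI; assuming the standard package names (the snippet below targets the 2021-era APIs, so newer versions may need small adjustments), installation is:

pip install reformer-pytorch transformers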

With the BertTokenizer from transformers converting text into ids, you can hand them straight to the Reformer and easily get a model that behaves like GPT-2.

Example that runs on Colab:

import torch
from reformer_pytorch import ReformerLM
from reformer_pytorch.generative_tools import TrainingWrapper
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
max_seq_len = 128  # shared maximum sequence length for tokenizer and model

model = ReformerLM(
    num_tokens = tokenizer.vocab_size,
    dim = 128,
    depth = 12,
    max_seq_len = max_seq_len,
    lsh_dropout = 0.1,
    causal = True,          # autoregressive (GPT-2-style) language modeling
    full_attn_thres = 128   # fall back to full attention below this length
)

# 0 is used for padding and no loss is calculated on it
model = TrainingWrapper(model, ignore_index = 0, pad_value = 0)

# training corpus
text_list = ['你好吗', '哈哈', '我还好', '我还好', '我还好', '我还好', '我还好']

x_train = []
for item in text_list:
    tok = tokenizer.encode(item, max_length = max_seq_len, truncation = True, add_special_tokens = True)
    x_train.append(torch.tensor(tok, dtype = torch.long))
print(x_train)

# when training, set return_loss to True
model.train()
loss = model(x_train, return_loss = True)
loss.backward()

# when evaluating, just use the generate function, which defaults to top-k sampling with a temperature of 1
initial = torch.tensor([[0]]).long()  # assume 0 is the start token
sample = model.generate(initial, 100, temperature = 1., filter_thres = 0.9, eos_token = 1)  # assume the end token is 1, or omit it and sampling runs up to 100 tokens
print(sample.shape)  # (1, <=100) token ids
print(sample)

text = tokenizer.convert_ids_to_tokens(sample.tolist()[0])
print(text)
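The snippet above only runs a single forward/backward pass to show the API. Real training needs an optimizer loop; here is a minimal sketch, assuming torch.optim.Adam with a placeholder learning rate and epoch count, reusing the model and x_train defined above:

from torch.optim import Adam

optimizer = Adam(model.parameters(), lr = 1e-4)  # assumed learning rate

model.train()
for epoch in range(10):  # placeholder epoch count
    optimizer.zero_grad()
    # the TrainingWrapper pads the batch and returns the next-token cross-entropy loss
    loss = model(x_train, return_loss = True)
    loss.backward()
    optimizer.step()
    print(f'epoch {epoch}: loss = {loss.item():.4f}')

To get a readable string instead of a token list, tokenizer.decode(sample.tolist()[0]) also works.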

