Brief Introduction


A Chinese Longformer-large model (330M parameters) that uses rotary position embedding (RoPE) and is well suited to long texts.

Model Taxonomy

| Demand | Task | Series | Model | Parameter | Extra |
| --- | --- | --- | --- | --- | --- |
| General | NLU | Erlangshen | Longformer | 330M | Chinese |

Model Information

Following the design of Longformer-large, we performed continual pre-training on the WuDao corpus (180 GB version), starting from chinese_roformer_L-12_H-768_A-12. In particular, we use rotary position embedding (RoPE) to cope with the uneven sequence lengths in the pre-training corpus.
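RoPE rotates each pair of feature dimensions by an angle proportional to the token position, so the dot product between a query and a key depends only on their relative offset. The minimal NumPy sketch below illustrates that property. It is not the Fengshenbang-LM implementation: the function name `rope`, the base of 10000, and the interleaved pairing of dimensions are assumptions chosen for illustration.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Illustrative rotary position embedding (hypothetical helper).

    x: array of shape (seq_len, dim) with an even dim.
    positions: integer array of shape (seq_len,).
    """
    x = np.asarray(x, dtype=np.float64)
    seq_len, dim = x.shape
    assert dim % 2 == 0, "dim must be even"
    # One rotation frequency per pair of feature dimensions.
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)      # (dim / 2,)
    angles = np.asarray(positions)[:, None] * inv_freq    # (seq_len, dim / 2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    # Rotate each (even, odd) pair by its position-dependent angle.
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=(1, 8))
k = rng.normal(size=(1, 8))

# Same relative offset (2) at two very different absolute positions.
score_near = rope(q, np.array([3])) @ rope(k, np.array([1])).T
score_far = rope(q, np.array([103])) @ rope(k, np.array([101])).T
same_score = np.allclose(score_near, score_far)
```

In this sketch `same_score` compares the attention score for one query/key pair at two different absolute positions with the same offset; position information enters only through the relative distance.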

Usage

Download Address


Loading Models


Because the transformers library does not provide the Longformer-large structure used here, you can find the model structure and run the code in Fengshenbang-LM.

git clone
from fengshen import LongformerModel, LongformerConfig
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/Erlangshen-Longformer-330M")
config = LongformerConfig.from_pretrained("IDEA-CCNL/Erlangshen-Longformer-330M")
model = LongformerModel.from_pretrained("IDEA-CCNL/Erlangshen-Longformer-330M")

