下载
加载中…
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
2020年6月
DeBERTa (Decoding-enhanced BERT with disentangled attention) improves the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented using two vectors that encode its content and position, respectively, and the…