Télécharger
Chargement…
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
juin 2020
DeBERTa (Decoding-enhanced BERT with disentangled attention) improves the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented using two vectors that encode its content and position, respectively, and the…