Retrieval Augmentation for Commonsense Reasoning: A Unified Approach

  • Wenhao Yu ,
  • Chenguang Zhu ,
  • Zhihan Zhang ,
  • Zhuosheng Zhang ,
  • Yuwei Fang ,
  • Meng Jiang

Empirical Methods in Natural Language Processing (EMNLP) 2022

A common thread of retrieval-augmented methods in the existing literature focuses on retrieving encyclopedic knowledge, such as Wikipedia, which offers a well-defined space of entities and relations that is straightforward to model. However, applying such methods to commonsense reasoning tasks faces two unique challenges: the lack of a general large-scale corpus for retrieval, and the lack of a corresponding effective commonsense retriever. In this paper, we systematically investigate how to leverage commonsense knowledge retrieval to improve commonsense reasoning tasks. We propose a unified framework of Retrieval-Augmented Commonsense reasoning (called RACo), including a newly constructed commonsense corpus with over 20 million documents and novel strategies for training a commonsense retriever. We conduct experiments on four different commonsense reasoning tasks. Extensive evaluation results show that RACo significantly outperforms other knowledge-enhanced counterparts, achieving new state-of-the-art (SoTA) performance on the CommonGen and CREAK leaderboards.
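To make the retrieve-then-read pattern behind such frameworks concrete, here is a minimal, self-contained sketch of a retrieval-augmented pipeline. This is an illustration, not the paper's implementation: the bag-of-words scorer stands in for a trained dense commonsense retriever, the toy `corpus` list stands in for the 20-million-document corpus, and the `augment` helper simply concatenates retrieved documents with the query before it would be fed to a downstream reader model.

```python
import math
import re
from collections import Counter

def bow(text):
    """Bag-of-words term frequencies (a stand-in for a dense encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    """Score every document against the query and return the top-k."""
    q = bow(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, bow(d)), reverse=True)
    return ranked[:k]

def augment(query, corpus, k=2):
    """Prepend retrieved documents to the reader's input sequence."""
    docs = retrieve(query, corpus, k)
    return " [SEP] ".join(docs + [query])

# Toy commonsense "corpus" (illustrative only).
corpus = [
    "An umbrella keeps a person dry in the rain.",
    "Dogs are common household pets that need daily walks.",
    "Rain falls from clouds and makes surfaces wet.",
]

query = "Why do people carry an umbrella when it rains"
print(augment(query, corpus, k=1))
```

A real system would replace `bow`/`cosine` with learned query and document encoders plus approximate nearest-neighbor search, and the concatenated string would be consumed by a generative or classification reader.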