Predicting Citation Counts Using Text and Graph Mining
- Avishay Livne
- Eytan Adar
- Jaime Teevan
- Susan Dumais
iConference 2013, Workshop on Computational Scientometrics: Theory and Application
As the volume of scientific literature grows ever faster, it becomes more difficult for researchers to identify promising papers that are likely to become influential in their field. We study the problem of predicting the future citation counts of papers (five years forward in our pilot study) given information available at the time of publication. We apply machine learning techniques to a dataset of millions of academic papers from several research domains to identify predictive features, including venue reputation, authors and institutions, citation networks, and content measures. We identify how these features are differentially predictive across domains and discuss possible reasons why citation behaviors might lead to these differences.