Re-evaluating Retrosynthesis Algorithms with Syntheseus
- Krzysztof Maziarz
- Austin Tripp
- Guoqing Liu
- Megan Stanley
- Shufang Xie
- Piotr Gaiński
- Philipp Seidl
- Marwin Segler
Faraday Discussions
Automated synthesis planning has recently re-emerged as a research area at the intersection of chemistry and machine learning. Despite the appearance of steady progress, we argue that imperfect benchmarks and inconsistent comparisons mask systematic shortcomings of existing techniques and unnecessarily hamper progress. To remedy this, we present syntheseus, a synthesis planning library with an extensive benchmarking framework that promotes best practice by default, enabling consistent and meaningful evaluation of single-step models and multi-step planning algorithms. We demonstrate the capabilities of syntheseus by re-evaluating several previous retrosynthesis algorithms, and find that the ranking of state-of-the-art models changes in controlled evaluation experiments. We end with guidance for future work in this area, and call on the community to engage in the discussion on how to improve benchmarks for synthesis planning.