Correction of Erroneous Characters in Chinese Sentence Analysis

  • Andi Wu
  • George Heidorn
  • Zixin Jiang
  • Terence Peng

Chinese and Oriental Language Information Processing Society

Publication

This paper presents a method for automatically detecting and correcting erroneous characters in electronic Chinese texts. The characters to be corrected are those that are easily confused with other characters because of their phonological or graphical resemblance. Correction takes place as an integral part of syntactic analysis: both the original character in the text and one or more replacement characters are considered, and the character that ends up in the final parse is identified as the correct one. In the category of substitution errors, the accuracy of this method substantially surpasses that of existing proofreaders in both recall and precision. A demo of the system is available.
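
The idea in the abstract, keeping whichever of the original or replacement characters survives in a successful analysis, can be illustrated with a small sketch. This is not the authors' system: the confusion sets and lexicon below are hypothetical, and a toy check (can the sentence be split entirely into known words?) stands in for real syntactic analysis. The names `CONFUSION`, `LEXICON`, `segmentable`, and `correct` are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical confusion sets: characters easily mistaken for one another
# (here, a graphically similar pair).
CONFUSION = {
    "己": ["已"],
    "已": ["己"],
}

# Hypothetical toy lexicon standing in for a real dictionary and grammar.
LEXICON = {"我", "已经", "到", "了", "自己"}


def segmentable(text, lexicon=LEXICON, max_len=4):
    """Return True if text splits entirely into lexicon words (toy 'parse')."""
    ok = [True] + [False] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            if ok[start] and text[start:end] in lexicon:
                ok[end] = True
                break
    return ok[len(text)]


def correct(sentence):
    """Try the original and its confusable characters; keep the reading that
    passes the toy parse with the fewest substitutions."""
    options = [[ch] + CONFUSION.get(ch, []) for ch in sentence]
    best = None
    for combo in product(*options):
        candidate = "".join(combo)
        if not segmentable(candidate):
            continue
        changes = sum(a != b for a, b in zip(sentence, candidate))
        if best is None or changes < best[0]:
            best = (changes, candidate)
    # If no reading passes, leave the sentence unchanged.
    return best[1] if best else sentence


print(correct("我己经到了"))
```

In this sketch, the erroneous 己 in 己经 is replaced by 已 because only 已经 is a known word, while 己 in 自己 is left alone because the original already passes. Enumerating whole candidate sentences grows exponentially with the number of confusable characters, so it is suitable only as a demonstration.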