An Iterative Unsupervised Learning Method for Information Distillation
- Dilek Hakkani-Tür
- Gokhan Tur
- Michael Levit
Proc. of ICASSP
Published by IEEE - Institute of Electrical and Electronics Engineers
Information distillation techniques analyze and interpret large speech and text archives in multiple languages and produce structured information of interest to the user. In this work, we propose an iterative unsupervised sentence extraction method for answering open-ended natural language queries about an event. The approach first finds, among the sentences of the candidate documents, those that are very likely to be relevant to the query and those that are very likely to be irrelevant, and then iteratively trains a classification model on these examples. Our results indicate that the proposed method may improve system performance by around 30% relative in terms of F-measure.
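The iterative procedure outlined in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the word-overlap seed score, the add-one-smoothed naive Bayes classifier, the function names, and the `seed_size`, `hi`, and `lo` parameters are all hypothetical. The abstract does not say which features, classifier, or confidence thresholds the actual system uses.

```python
import math
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [w.strip(PUNCT).lower() for w in text.split() if w.strip(PUNCT)]

def overlap_score(query_tokens, sentence_tokens):
    """Seed relevance guess (assumption): fraction of query terms in the sentence."""
    q = set(query_tokens)
    return len(q & set(sentence_tokens)) / len(q) if q else 0.0

def train_naive_bayes(pos_docs, neg_docs):
    """Return a function estimating P(relevant | tokens) with add-one smoothing."""
    pos_counts = Counter(t for d in pos_docs for t in d)
    neg_counts = Counter(t for d in neg_docs for t in d)
    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values()) + len(vocab)
    neg_total = sum(neg_counts.values()) + len(vocab)
    log_prior = math.log(len(pos_docs) / len(neg_docs))

    def prob(tokens):
        log_odds = log_prior
        for t in tokens:
            if t in vocab:  # tokens never seen in training are ignored
                log_odds += math.log((pos_counts[t] + 1) / pos_total)
                log_odds -= math.log((neg_counts[t] + 1) / neg_total)
        log_odds = max(min(log_odds, 50.0), -50.0)  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-log_odds))

    return prob

def iterative_extract(query, sentences, seed_size=2, iterations=3, hi=0.8, lo=0.2):
    """Score each sentence for relevance without any labeled data."""
    if seed_size < 1 or len(sentences) < 2 * seed_size:
        raise ValueError("need seed_size >= 1 and at least 2 * seed_size sentences")
    tokens = [tokenize(s) for s in sentences]
    query_tokens = tokenize(query)
    scores = [overlap_score(query_tokens, t) for t in tokens]

    # Seed sets: the most and least query-like sentences.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    pos_idx, neg_idx = ranked[:seed_size], ranked[-seed_size:]

    for _ in range(iterations):
        model = train_naive_bayes([tokens[i] for i in pos_idx],
                                  [tokens[i] for i in neg_idx])
        scores = [model(t) for t in tokens]
        # Keep only confident predictions as training examples for the next round.
        new_pos = [i for i, s in enumerate(scores) if s >= hi]
        new_neg = [i for i, s in enumerate(scores) if s <= lo]
        if not new_pos or not new_neg:
            break
        pos_idx, neg_idx = new_pos, new_neg
    return scores
```

Calling `iterative_extract(query, sentences)` returns one score per sentence, and a caller would keep the sentences whose score exceeds a chosen cutoff. The point of the loop is that a sentence sharing few words with the query can still be pulled in once the classifier has learned vocabulary from the confidently relevant seed sentences.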
© IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.