Cross-document Event Coreference Resolution based on Cross-media Features

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Published by Association for Computational Linguistics

In this paper, we focus on a new problem: event coreference resolution across television news videos. Based on the observation that the contents of multiple data modalities are complementary, we develop a novel approach to jointly encode effective features from both closed captions and video key frames. Experimental results demonstrate that visual features provide a 7.2% absolute F-score gain over state-of-the-art text-based event extraction and coreference resolution.