Access to multilingual textual resources
- Ossama Emam
- Ahmed Awadallah
- Hany Hassan Awadalla
US Patent US8204736B2
A mechanism is provided for determining a second document of a set of documents in a second language having the same textual content as a first document in a first language. A first histogram that is indicative of the textual content of the first document is generated. A second histogram is generated for each document of the set of documents. Each second histogram is indicative of the textual content of a document of the set of documents. Each second histogram is compared with the first histogram to determine at least one histogram from the plurality of second histograms which matches the first histogram. The second document is then identified as the document having the at least one histogram.
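The matching procedure in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the abstract does not say which features the histograms count or how they are compared. The sketch assumes the histograms count tokens that tend to survive translation (digit runs and punctuation marks) and that a match means cosine similarity at or above a threshold. The names `build_histogram`, `cosine_similarity`, `find_matching_document`, and the `threshold` parameter are all hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: digit runs and punctuation marks serve as language-neutral
# features. The abstract does not specify what the histograms count.
TOKEN_RE = re.compile(r"\d+|[.,;:!?()%]")


def build_histogram(text):
    """Count each language-neutral token found in the text."""
    return Counter(TOKEN_RE.findall(text))


def cosine_similarity(h1, h2):
    """Cosine similarity between two histograms (0.0 if either is empty)."""
    dot = sum(h1[k] * h2[k] for k in h1.keys() & h2.keys())
    norm1 = math.sqrt(sum(v * v for v in h1.values()))
    norm2 = math.sqrt(sum(v * v for v in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


def find_matching_document(first_doc, candidates, threshold=0.9):
    """Return the id of the best-matching candidate, or None.

    `candidates` maps a document id to its text in the second language.
    The threshold value is an illustrative assumption.
    """
    first_hist = build_histogram(first_doc)
    best_id, best_score = None, threshold
    for doc_id, text in candidates.items():
        # One "second histogram" per candidate, compared with the first.
        score = cosine_similarity(first_hist, build_histogram(text))
        if score >= best_score:
            best_id, best_score = doc_id, score
    return best_id
```

Calling `find_matching_document` with an English sentence and a dictionary of French candidates returns the id of the candidate whose histogram is closest, or `None` when no candidate reaches the threshold. A real system would likely need richer features than this sketch uses, since short documents share many digit and punctuation patterns by chance.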