Efficient Query Refinement in Multimedia Databases
Large repositories of multimedia objects containing digital images, video, audio and text documents are becoming common. There is an increasing application need to search these repositories based on their content. Examples include online shopping (e.g., “Find me all shirts that look like this one”), medical applications (e.g., “Find all tumors similar to a given pattern”), face recognition, trademark searches, audio retrieval and content-based video browsing/searching. To address this need, we are building the Multimedia Analysis and Retrieval System (MARS), a system for effective and efficient content-based searching and browsing of large-scale multimedia repositories [2]. MARS represents the content of multimedia objects by a collection of features (e.g., color histograms, co-occurrence texture, color layout and shape for images; keywords for text). The user poses a query by submitting an example and requesting a few objects that are “most similar” to the submitted example. The similarity between any two objects is computed by first computing their similarities based on the individual features and then combining them to obtain the overall similarity [2]. MARS uses specialized indices and advanced query processing algorithms to efficiently return the top matches to the user [1, 2].
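The two-step similarity computation described above can be illustrated with a minimal sketch. It is not the MARS implementation: the feature names, the per-feature measures (histogram intersection for color, Jaccard overlap for keywords), the weighted-sum combination rule and the linear scan used in place of specialized indices are all assumptions chosen only to make the idea concrete.

```python
import heapq

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms (illustrative choice)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def jaccard(k1, k2):
    """Similarity in [0, 1] between two keyword sets (illustrative choice)."""
    k1, k2 = set(k1), set(k2)
    if not k1 and not k2:
        return 1.0
    return len(k1 & k2) / len(k1 | k2)

# Hypothetical feature names mapped to their per-feature similarity functions.
FEATURE_SIMILARITY = {"color": histogram_intersection, "keywords": jaccard}

def overall_similarity(query, obj, weights):
    """Combine per-feature similarities with a normalized weighted sum (assumed rule)."""
    total = sum(weights.values())
    combined = sum(
        w * FEATURE_SIMILARITY[name](query[name], obj[name])
        for name, w in weights.items()
    )
    return combined / total

def top_k(query, objects, weights, k):
    """Return the k best (score, object_id) pairs by scanning every object."""
    scored = (
        (overall_similarity(query, obj, weights), oid)
        for oid, obj in objects.items()
    )
    return heapq.nlargest(k, scored)

# Toy repository with made-up feature values.
objects = {
    "a": {"color": [0.5, 0.5, 0.0], "keywords": {"shirt", "blue"}},
    "b": {"color": [0.0, 0.5, 0.5], "keywords": {"shirt", "red"}},
    "c": {"color": [0.0, 0.0, 1.0], "keywords": {"hat"}},
}
query = {"color": [0.5, 0.5, 0.0], "keywords": {"shirt", "blue"}}
weights = {"color": 0.5, "keywords": 0.5}

for score, oid in top_k(query, objects, weights, k=2):
    print(f"{oid}: {score:.3f}")
```

In this sketch the weights are fixed by hand and every object is scored; a real system would avoid the exhaustive scan, which is the role the text assigns to specialized indices and query processing algorithms.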