In this section we give a broad overview of application areas for content-based retrieval of various media in multimedia databases. Later sections discuss how such retrieval may be facilitated.
For a satisfactory diagnosis, however, it is not sufficient to store a patient's CT images and access them by the patient's record-id. Rather, suitable querying mechanisms are needed if the images are to be useful in medical diagnosis. A surgeon might ask a medical image archive questions such as: How does my patient's tumor look compared to similar cases of brain tumors? What is the normal growth rate of a particular type of brain tumor? Does the spatial growth of a brain tumor decrease with a certain drug therapy?
The images themselves give no hint as to whether they show a brain tumor or where in the body it is located. Therefore, knowledge of the spatial content of a medical image, and of how that spatial content evolves over time (e.g., for a brain tumor), must be used or made available when processing a surgeon's queries. The result of a query should then be either a collection of images whose spatial characteristics are similar to those of a given image, or a sequence of images showing the growth of a brain tumor over a year's time.
Searching an image collection can yield a starting point in a web of images, from which the user may begin navigating through images and related information.
Once a particular image of interest has been selected, the user can navigate through the image collection, e.g., by choosing a particular part of the image that is currently displayed. This selection can lead to associated data such as a set of related images or related textual information. One may also navigate within the (hyper-)textual information and return to the image collection via special links or hooks in the text, arriving at an image associated with the respective text. For example, an image of a person comprises various regions with semantic content, e.g., the subregions that correspond to the eyes, lips, and nose. When viewing a media object, the related information can be explored to learn, e.g., which person is shown in the image. Additionally, location information associated with a person's image can lead to the associated building and room, and finally to an image of the person's office.
Navigating image collections might also involve navigating three-dimensional (3-D) representations, e.g., of the body's interior. A sequence of CT images can be the basis of a computed 3-D graphics representation of the brain, through which a surgeon may navigate. She/he may select a particular volume of interest, e.g., the thalamus, and enter it, viewing it at a higher resolution to see whether there is a growth inside. The surgeon may also select a part of the 3-D representation inside the thalamus that allows him/her to view photographs of patients with a similar growth, etc. This kind of support for image retrieval, navigation, and browsing requires a great deal of semantic knowledge to be built into the underlying algorithms. Such matters are still very much research issues.
For example, today we have to watch a provider's news as broadcast and cannot skip the news items we are not interested in. Personalized news, tailored to individual interests, will make viewers independent of the provider's selection and of the time the news is actually on air. According to a user profile, videos are searched, and those parts of the current news items are selected that fit the viewer's needs. With semantic knowledge about the structure of news, the selected items can be reassembled and temporally arranged to form a personalized news extract. The interesting issues are how to define such a user profile, and how the thousands of news items of a news provider can be annotated with metadata that, in combination with a user profile, allows a satisfactory mapping between the two and a successful reassembly of the personalized news.
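The mapping between a user profile and news-item metadata can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the text does not prescribe a profile format, so the profile is modelled here as a dictionary of topic weights, the metadata as a set of topic keywords plus a start time and duration, and the hypothetical helpers `score` and `personalized_extract` use a simple greedy selection under a viewing-time budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewsItem:
    """One news item with hypothetical metadata attached by the provider."""
    item_id: str
    topics: frozenset  # topic keywords (assumed metadata format)
    start: int         # position in the original broadcast, in seconds
    duration: int      # length of the item, in seconds


def score(item, profile):
    """Sum the profile weights of all topics the item is annotated with."""
    return sum(w for topic, w in profile.items() if topic in item.topics)


def personalized_extract(items, profile, max_seconds):
    """Greedily pick the best-matching items that fit into max_seconds,
    then return them in their original temporal order."""
    ranked = sorted(items, key=lambda it: score(it, profile), reverse=True)
    chosen, used = [], 0
    for item in ranked:
        if score(item, profile) <= 0:
            break  # remaining items do not match the profile at all
        if used + item.duration <= max_seconds:
            chosen.append(item)
            used += item.duration
    return sorted(chosen, key=lambda it: it.start)


# Illustrative data only; the topics and weights are invented.
broadcast = [
    NewsItem("a", frozenset({"sports"}), 0, 60),
    NewsItem("b", frozenset({"politics"}), 60, 120),
    NewsItem("c", frozenset({"sports", "football"}), 180, 90),
    NewsItem("d", frozenset({"weather"}), 270, 30),
]
viewer_profile = {"football": 2.0, "sports": 1.0, "weather": 0.5}
extract = personalized_extract(broadcast, viewer_profile, max_seconds=130)
```

A greedy selection is only one possible design; it keeps the sketch short but does not guarantee the best use of the time budget. The quality of the result depends entirely on how well the metadata vocabulary and the profile vocabulary agree, which is the open issue named above.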
Similar application scenarios arise elsewhere: a critic who wants to watch only those parts of a film that suffice for writing a quick review; a sports enthusiast who only has time to see all the goals scored by a certain team in a certain football game, in order to be able to talk about the game the next day; or the post-game analysis of football teams, which supports strategy planning and performance analysis.
Knowledge about document structure can be used by the author of a news article to retrieve interesting parts of documents from a huge document archive. For a well-targeted query, the document structure available via metadata can be exploited. The author can retrieve not only all documents that contain the name "Clint Eastwood", but also, more selectively, only those documents that contain the name in their heading, since the heading is a known structural element of the documents. Exploiting document structure also makes retrieval efficient: the metadata can be used for indexing, which is essential for short query response times. A typesetter of a newspaper's title page can likewise use the metadata to lay out the article properly, that is, to carry out an instruction such as "Place the title in 18pt Helvetica at the top of the page, align the first two paragraphs beneath the headline, and let the remaining paragraphs follow on the next page."
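The "Clint Eastwood" query can be sketched with a small structure-aware inverted index. This is a minimal illustration under stated assumptions, not a description of any particular system: the class `StructuredIndex`, the helper `tokenize`, and the element names `heading` and `body` are hypothetical, and documents are assumed to arrive already split into named structural elements.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    words = (w.strip('.,;:!?"').lower() for w in text.split())
    return [w for w in words if w]


class StructuredIndex:
    """Inverted index keyed by (structural element, term)."""

    def __init__(self):
        # (element, term) -> {doc_id: [word positions within that element]}
        self._postings = defaultdict(lambda: defaultdict(list))
        self._elements = set()

    def add(self, doc_id, elements):
        """Index a document given as {element name: text}."""
        for element, text in elements.items():
            self._elements.add(element)
            for pos, term in enumerate(tokenize(text)):
                self._postings[(element, term)][doc_id].append(pos)

    def search_phrase(self, phrase, element=None):
        """Return ids of documents containing the phrase, optionally
        restricted to one structural element such as 'heading'."""
        terms = tokenize(phrase)
        if not terms:
            return set()
        elements = [element] if element is not None else sorted(self._elements)
        hits = set()
        for el in elements:
            first = self._postings.get((el, terms[0]), {})
            for doc_id, positions in first.items():
                for p in positions:
                    # every term must appear at the consecutive position
                    if all(p + i in self._postings.get((el, t), {}).get(doc_id, ())
                           for i, t in enumerate(terms)):
                        hits.add(doc_id)
                        break
        return hits


# Illustrative documents only; the texts are invented.
index = StructuredIndex()
index.add("d1", {"heading": "Clint Eastwood wins award", "body": "The director spoke."})
index.add("d2", {"heading": "Film festival opens", "body": "Guests included Clint Eastwood."})
anywhere = index.search_phrase("Clint Eastwood")
in_heading = index.search_phrase("Clint Eastwood", element="heading")
```

Because the structural element is part of the index key, the heading-restricted query only touches the postings for headings instead of scanning whole documents, which is the efficiency argument made above.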