Intrinsic limits and possible results for research in multimedia semantics.

In the study of language, semantics is the discipline concerned with the meaning of linguistic expressions. Alongside it, syntax studies the rules for forming correct expressions, and pragmatics studies how expressions are used in communication. For a long time, it was thought that the meaning of linguistic expressions was given by abstract entities, such as ideas or concepts. This was a naive theory of meaning, deeply rooted in common sense, which could be formulated as follows:
 We have in our mind ideas and concepts. To be able
 to express them, we need something audible or
 visible. For this purpose, we invent a language
 composed of words, that are vehicles to convey the
 meaning, that is, the ideas and the concepts that we
 want to communicate. We can associate to each
 word or complex of words an idea, a concept, a thought.
 Ideas, concepts and thoughts are stable, the same
 for all people, wherever they come from. People
 express them with various oral/written expressions
 depending on the "tribe" they belong to.

These convictions changed over the course of the last century, thanks to the contributions of philosophers from various backgrounds. Today it is widely held that the meaning of a statement is the set of worlds (or situations, or states) that make the statement true [1]. For example, the meaning of the statement "it rains" is the set of worlds where it actually rains.
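The possible-worlds view can be made concrete with a small sketch. The two atomic facts and the helper names below (ATOMS, WORLDS, meaning) are assumptions chosen for illustration and do not come from the cited literature; each world is simply the set of atomic facts that hold in it, and the meaning of a statement is the set of worlds in which it is true.

```python
from itertools import product

# Hypothetical atomic facts that distinguish one world from another.
ATOMS = ("rains", "windy")

# A world is the set of atoms that are true in it; with two atoms
# there are 2**2 = 4 possible worlds.
WORLDS = frozenset(
    frozenset(atom for atom, value in zip(ATOMS, values) if value)
    for values in product((False, True), repeat=len(ATOMS))
)


def meaning(statement):
    """Return the set of worlds in which `statement` (a predicate on worlds) is true."""
    return frozenset(world for world in WORLDS if statement(world))


# The meaning of "it rains" is the set of worlds where it rains.
it_rains = meaning(lambda world: "rains" in world)
it_is_windy = meaning(lambda world: "windy" in world)

# Under this view, conjunction of statements is intersection of their meanings.
rain_and_wind = it_rains & it_is_windy
```

In this toy setting, "it rains" denotes two of the four worlds, and the conjunction with "it is windy" denotes the single world where both facts hold.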

A further step forward in the theory of meaning came with the introduction of holism, which holds that meaning cannot be apportioned to parts of a theory, but can only be obtained by considering the whole language, that is, the whole set (or rather, network) of occurrences of certain expressions. This means that the theoretical content is distributed and interwoven in such a way that it cannot be divided statement by statement. This approach was strengthened by Davidson [2], who gave it further foundation. Davidson argues that holism in fact has two aspects: semantic holism (that is, the holism of meaning) and epistemological holism (that is, the holism of knowledge). These two types of holism are two sides of the same coin.

Moving from theory to practice, that is, to research in multimedia semantics, we can say that the meaning of a multimedia object, for example a text, is the set of worlds where what the text says is true. Similarly, we can say that the meaning of an image or a video is the set of worlds where the image or the video occurs and is relevant. The meaning of a complex document (possibly composed of objects in mixed media) can be composed in a non-trivial way from the meanings of its component objects. However, when we want to represent the meaning of (complex) multimedia objects, that is, when we want to define linguistic expressions that denote the worlds that constitute the meaning of an object, we have to deal with several hard problems:

1. Holism implies that representations are always partial, tied to a context, a scope and a community of people. For this reason, the purely ontological approach, according to which it is possible to detail all the characteristics of a certain sector of the world, is quite hopeless. This approach seems to forget that the failure of Artificial Intelligence, some twenty years ago, lies precisely here, in the impossibility of constructing knowledge representations that capture all the aspects of a sufficiently complex domain. Such representations work for micro-worlds. When one attempts to go beyond these micro-worlds, the approach fails miserably. It is also likely that many general-purpose research approaches in the currently fashionable field of the Semantic Web will fail as well, given this theoretical limitation posed by the holistic nature of meaning, since only partial representations with limited scope are possible.

2. There is also a linguistic problem: a representation, even one with partial scope, is always tied to a specific language, understood as a specific syntax, semantics and pragmatics, which carries a specific vision of the world that cannot easily be transported to a different language.

Besides these theoretical problems, there are also more practical problems, which are nonetheless relevant for us as computer scientists. For example:

1. how to automatically extract a description of the content of a multimedia object, however simple, in a suitable formal description language;

2. how to overcome the computational problems posed by the formal languages used for knowledge representation (based on description logic) [3]. Even the simplest forms of logic have inference algorithms that are exponential in time complexity, and no way out is in sight;

3. how to take into account and manage uncertainty arising from different sources.
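The computational problem in point 2 can be illustrated with a minimal sketch. It uses plain propositional logic and naive enumeration rather than a description logic reasoner, which is an assumption made to keep the example short; the function name brute_force_sat is hypothetical. The sketch only shows how the search space doubles with each added variable and proves nothing about the complexity of any particular logic.

```python
from itertools import product


def brute_force_sat(num_vars, formula):
    """Naive satisfiability check: try truth assignments one by one.

    `formula` is a predicate on a tuple of `num_vars` booleans.
    Returns (satisfiable, number_of_assignments_tried).
    """
    tried = 0
    for assignment in product((False, True), repeat=num_vars):
        tried += 1
        if formula(assignment):
            return True, tried
    return False, tried


def contradiction(assignment):
    """An unsatisfiable formula (x0 and not x0), forcing a full search."""
    return assignment[0] and not assignment[0]


# Each extra variable doubles the work needed to establish unsatisfiability.
for n in (4, 8, 16):
    satisfiable, tried = brute_force_sat(n, contradiction)
    print(n, satisfiable, tried)
```

For an unsatisfiable formula the loop visits all 2**n assignments, so the work grows from 16 to 65536 checks as n goes from 4 to 16.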

What more can we say about the future? We can envisage success stories, sometimes important ones, in restricted fields. Most importantly, we should pursue the interoperability of different representation schemes in order to find the results we are searching for, by combining all the available information carried in different and inter-related media objects (i.e., combined similarity-based multimedia search).
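One simple way to combine similarity evidence from several media is a weighted sum of per-source scores. The sketch below is an illustration under stated assumptions, not a description of any existing system: the function name combine_scores, the source names, the object identifiers, the weights and the scores are all hypothetical, and scores are assumed to be already normalised to the range 0 to 1.

```python
def combine_scores(per_source_scores, weights):
    """Weighted-sum fusion of similarity scores from several sources.

    per_source_scores: {source: {object_id: similarity in [0, 1]}}
    weights:           {source: non-negative weight}
    An object missing from a source contributes 0 for that source.
    Returns (object_id, combined_score) pairs, best match first.
    """
    total_weight = sum(weights[source] for source in per_source_scores)
    combined = {}
    for source, scores in per_source_scores.items():
        share = weights[source] / total_weight
        for object_id, score in scores.items():
            combined[object_id] = combined.get(object_id, 0.0) + share * score
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical similarity scores for one query against text and image features.
scores = {
    "text": {"doc1": 0.9, "doc2": 0.4},
    "image": {"doc1": 0.2, "doc2": 0.8, "doc3": 0.6},
}
weights = {"text": 0.5, "image": 0.5}

ranking = combine_scores(scores, weights)
```

With equal weights, doc2 ranks above doc1 although doc1 has the best text score, because the image evidence outweighs it; changing the weights changes the ranking, which is where the choice of fusion scheme matters.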

At the Networked Multimedia Information System Lab of ISTI-CNR in Pisa, we have applied this approach in the research prototype MILOS [4], where we have tried to demonstrate the technologies that we consider important for multimedia search, i.e., combining different similarity-based information sources in order to deliver meaningful information. MILOS is a multimedia content management system specialised to support multimedia digital library applications. MILOS provides applications with functionalities for the storage of arbitrary multimedia documents and their content-based retrieval using arbitrary metadata models represented in XML. MILOS is flexible in the management of documents containing different types of data and content descriptions; it is efficient and scalable in the storage and content-based retrieval of these documents.


[1] Quine, W. V. (1986). Philosophy of Logic. Second Edition. Harvard University Press. Cambridge, MA.

[2] Davidson, D. (1991). Inquiries into Truth and Interpretation. Clarendon Press. Oxford.

[3] Baader, F., Nutt, W. (2003). Basic Description Logics. In: The Description Logic Handbook. Cambridge University Press.


Fausto Rabitti, Carlo Meghini

ISTI-CNR, CNR Research Area

Via Moruzzi 1, 56124 Pisa, Italy

(Fausto.Rabitti, Carlo.Meghini)
COPYRIGHT 2005 Digital Information Research Foundation
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2005 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:Position Paper
Author:Rabitti, Fausto; Meghini, Carlo
Publication:Journal of Digital Information Management
Date:Dec 1, 2005