This chapter addresses the incorporation of a natural language generation module into an artificial cognitive vision system, with the aim of describing the most relevant events and behaviors observed in video sequences of a given domain. It also introduces the stages required to convert conceptual predicates into natural language texts. Natural language (NL) interfaces to vision systems have become popular across a broad range of applications. In particular, they have become an important part of surveillance systems, in which human behavior is represented by predefined sequences of events in given contexts. The analysis of human behaviors in image sequences is currently restricted to the generation of quantitative parameters describing where and when motion is observed. However, recent trends in cognitive vision show that it is becoming necessary to exploit linguistic knowledge to incorporate abstraction and uncertainty into the analysis and thus enhance the semantic richness of the reported descriptions. Toward this end, natural language generation will in the near future constitute a mandatory step toward intelligent user interfacing. Moreover, the characteristics of ambient intelligence demand that this process adapt to users and their context, for example through multilingual capabilities. This chapter presents experimental results that illustrate multilingual generation in various application domains. © 2010 Elsevier Inc. All rights reserved.
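To make the predicate-to-text step concrete, the following is a minimal template-based sketch of how a conceptual predicate extracted from video might be realized as a sentence in more than one language. The predicate names, argument slots, and templates are illustrative assumptions, not the chapter's actual formalism:

```python
# Hypothetical sketch of surface realization from conceptual predicates.
# Each (predicate, language) pair maps to a surface template; the
# predicate "walking" and both templates are invented for illustration.
TEMPLATES = {
    ("walking", "en"): "{agent} is walking along {location}",
    ("walking", "es"): "{agent} camina por {location}",
}

def realize(predicate, lang, **args):
    """Fill the surface template for a conceptual predicate in a target language."""
    template = TEMPLATES.get((predicate, lang))
    if template is None:
        raise KeyError(f"no template for ({predicate!r}, {lang!r})")
    # Capitalize the first letter and close the sentence.
    sentence = template.format(**args)
    return sentence[0].upper() + sentence[1:] + "."

print(realize("walking", "en", agent="a pedestrian", location="the sidewalk"))
# A pedestrian is walking along the sidewalk.
print(realize("walking", "es", agent="un peatón", location="la acera"))
# Un peatón camina por la acera.
```

A real system would sit several stages deeper, handling uncertainty, aggregation, and referring expressions; this sketch only shows the final lexicalization step and how multilinguality can be factored into the template lookup.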
Title of host publication: Human-Centric Interfaces for Ambient Intelligence
Number of pages: 22
Publication status: Published - 1 Dec 2010