Exploiting Natural Language Generation in Scene Interpretation

Carles Fernández, Pau Baiget, Xavier Roca

Research output: Chapter in Book › Chapter › Research › peer-review


This chapter addresses the incorporation of a module for natural language generation into an artificial cognitive vision system, to describe the most relevant events and behaviors observed in video sequences of a given domain. It also introduces the required stages to convert conceptual predicates into natural language texts. The introduction of natural language (NL) interfaces into vision systems has become popular for a broad range of applications. In particular, they have become an important part of surveillance systems, in which human behavior is represented by predefined sequences of events in given contexts. The analysis of human behaviors in image sequences is currently restricted to the generation of quantitative parameters describing where and when motion is being observed. However, recent trends in cognitive vision demonstrate that it is becoming necessary to exploit linguistic knowledge to incorporate abstraction and uncertainty in the analysis and thus enhance the semantic richness of reported descriptions. Toward this end, natural language generation will in the near future constitute a mandatory step toward intelligent user interfacing. Moreover, the characteristics of ambient intelligence demand that this process adapt to users and their context, for example through multilingual capabilities. This chapter illustrates some experimental results to elucidate multilingual generation in various application domains. © 2010 Elsevier Inc. All rights reserved.
Original language: English
Title of host publication: Human-Centric Interfaces for Ambient Intelligence
Number of pages: 22
Publication status: Published - 1 Dec 2010

