Understanding dynamic scenes based on human sequence evaluation

Jordi Gonzàlez, Daniel Rowe, Javier Varona, F. Xavier Roca

Research output: Contribution to journal › Article › Research › peer-review

25 Citations (Scopus)


This paper presents a Cognitive Vision System (CVS) that explains human behaviour in monitored scenes using natural-language texts. This cognitive analysis of human movements recorded in image sequences is referred to here as Human Sequence Evaluation (HSE), which defines a set of transformation modules involved in the automatic generation of semantic descriptions from pixel values. In essence, the trajectories of human agents are obtained in order to generate textual interpretations of their motion and to infer the conceptual relationships of each agent with respect to its environment. For this purpose, a human behaviour model based on Situation Graph Trees (SGTs) is considered, which permits both bottom-up (hypothesis generation) and top-down (hypothesis refinement) analysis of dynamic scenes. The resulting system prototype interprets different kinds of behaviour and reports textual descriptions in multiple languages. © 2008 Elsevier B.V. All rights reserved.
Original language: English
Pages (from-to): 1433-1444
Journal: Image and Vision Computing
Publication status: Published - 2 Sep 2009


  • Event recognition in dynamic scenes
  • High-level processing of monitored scenes
  • Human behaviour interpretation
  • Human motion understanding
  • Image Sequence Evaluation
  • Natural-language text generation
  • Realistic demonstrators
  • Segmentation and tracking in complex scenes

