Modulating shape features by color attention for object recognition

Fahad Shahbaz Khan, Joost Van De Weijer, Maria Vanrell

Research output: Contribution to journal, Article, Research, peer-review

94 Citations (Scopus)

Abstract

Bag-of-words based image representation is a successful approach for object recognition. Generally, the successive stages of the process (feature detection, feature description, vocabulary construction and image representation) are performed independently of the object classes to be detected. In such a framework, it was found that the combination of different image cues, such as shape and color, often yields below-expected results. This paper presents a novel method for recognizing object categories when using multiple cues by separately processing the shape and color cues and combining them by modulating the shape features by category-specific color attention. Color is used to compute bottom-up and top-down attention maps. Subsequently, these color attention maps are used to modulate the weights of the shape features. In regions with higher attention, shape features are given more weight than in regions with low attention. We compare our approach with existing methods that combine color and shape cues on five data sets in which the relative importance of the two cues varies, namely, Soccer (color predominance), Flower (color and shape parity), PASCAL VOC 2007 and 2009 (shape predominance) and Caltech-101 (color co-interference). The experiments clearly demonstrate that in all five data sets our proposed framework significantly outperforms existing methods for combining color and shape information. © 2011 Springer Science+Business Media, LLC.
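
The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes each local feature has already been quantized to a shape word and a color word, and that a per-class attention value is available for every color word. The function names, the `class_given_color` table, and the normalization are illustrative assumptions, not the paper's actual formulation, which the abstract does not spell out.

```python
import numpy as np


def attention_weighted_histogram(shape_words, attention, vocab_size):
    """Bag-of-words histogram in which each local shape feature adds its
    attention weight to its bin instead of a plain count of 1."""
    shape_words = np.asarray(shape_words, dtype=np.intp)
    attention = np.asarray(attention, dtype=float)
    if shape_words.shape != attention.shape:
        raise ValueError("need one attention weight per local feature")
    hist = np.bincount(shape_words, weights=attention, minlength=vocab_size)
    total = hist.sum()
    # L1-normalize so images with different feature counts stay comparable.
    return hist / total if total > 0 else hist


def class_specific_histograms(shape_words, color_words, class_given_color, vocab_size):
    """One shape histogram per class. `class_given_color[c, k]` is a
    hypothetical attention value of class c for color word k (an assumption
    made for this sketch)."""
    color_words = np.asarray(color_words, dtype=np.intp)
    table = np.asarray(class_given_color, dtype=float)
    return np.stack([
        attention_weighted_histogram(shape_words, row[color_words], vocab_size)
        for row in table
    ])


# Toy example: four local features, three shape words, two color words.
shape_words = [0, 1, 1, 2]
color_words = [0, 0, 1, 1]
class_given_color = [[0.9, 0.1],   # class 0 attends to color word 0
                     [0.1, 0.9]]   # class 1 attends to color word 1
hists = class_specific_histograms(shape_words, color_words, class_given_color, 3)
print(hists)
```

In this toy setup the same shape features produce a different histogram for each class, because features lying on colors a class attends to contribute more weight to that class's representation.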
Original language: English
Pages (from-to): 49-64
Journal: International Journal of Computer Vision
Volume: 98
Issue number: 1
Publication status: Published - 1 May 2012

Keywords

  • Color features
  • Image representation
  • Object recognition
