TY - CHAP
T1 - Location Sensitive Image Retrieval and Tagging
AU - Gomez, Raul
AU - Gibert, Jaume
AU - Gomez, Lluis
AU - Karatzas, Dimosthenis
N1 - Funding Information:
Work supported by project TIN2017-89779-P, the CERCA Programme/Generalitat de Catalunya and the PhD scholarship AGAUR 2016-DI-84.
Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
N2 - People from different parts of the globe describe objects and concepts in distinct manners. Visual appearance can thus vary across different geographic locations, which makes location a relevant contextual information when analysing visual data. In this work, we address the task of image retrieval related to a given tag conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies to balance the location influence in the final ranking. LocSens learns to fuse textual and location information of multimodal queries to retrieve related images at different levels of location granularity, and successfully utilizes location information to improve image tagging.
AB - People from different parts of the globe describe objects and concepts in distinct manners. Visual appearance can thus vary across different geographic locations, which makes location a relevant contextual information when analysing visual data. In this work, we address the task of image retrieval related to a given tag conditioned on a certain location on Earth. We present LocSens, a model that learns to rank triplets of images, tags and coordinates by plausibility, and two training strategies to balance the location influence in the final ranking. LocSens learns to fuse textual and location information of multimodal queries to retrieve related images at different levels of location granularity, and successfully utilizes location information to improve image tagging.
UR - http://www.scopus.com/inward/record.url?scp=85092920175&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-58517-4_38
DO - 10.1007/978-3-030-58517-4_38
M3 - Chapter
AN - SCOPUS:85092920175
SN - 9783030585167
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 649
EP - 665
BT - Computer Vision – ECCV 2020 - 16th European Conference, Proceedings
A2 - Vedaldi, Andrea
A2 - Bischof, Horst
A2 - Brox, Thomas
A2 - Frahm, Jan-Michael
PB - Springer Science and Business Media Deutschland GmbH
ER -