Recurrent Image Annotation with Explicit Inter-label Dependencies

A. Dutta; Yashaswi Verma; C.V. Jawahar

doi:10.1007/978-3-030-58526-6_12

Published in Springer Science and Business Media Deutschland GmbH

2020

Volume: 12374 LNCS

Pages: 191 - 207

Abstract

Inspired by the success of the CNN-RNN framework in the image captioning task, several works have explored it for multi-label image annotation, with the hope that an RNN following a CNN would encode inter-label dependencies better than a CNN alone. To do so, for each training sample, the earlier methods converted the ground-truth label set into a sequence of labels based on their frequencies (e.g., rare-to-frequent) for training the RNN. However, since the ground truth is an unordered set of labels, imposing a fixed and predefined sequence on it does not naturally align with this task. To address this, some recent papers have proposed techniques that can train the RNN without feeding the ground-truth labels in a particular sequence or order. However, most of these techniques leave it to the RNN to implicitly choose one sequence for the ground-truth labels of each sample at training time, thus making it inherently biased. In this paper, we address this limitation and propose a novel approach in which the RNN is explicitly forced to learn multiple relevant inter-label dependencies, without the need to feed the ground truth in any particular order. Using thorough empirical comparisons, we demonstrate that our approach outperforms several state-of-the-art techniques on two popular datasets (MS-COCO and NUS-WIDE). Additionally, it provides a new perspective of looking at an unordered set of labels as equivalent to a collection of different permutations (sequences) of those labels, thus naturally aligning with the image annotation task. Our code is available at: https://github.com/ayushidutta/multi-order-rnn. © 2020, Springer Nature Switzerland AG.
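
The two views of a label set contrasted in the abstract can be illustrated with a small Python sketch. It is not the authors' training procedure (the repository linked above is the reference for that); it only shows, on made-up toy data, how a fixed rare-to-frequent sequence differs from treating the same label set as a collection of permutations. The function names `rare_to_frequent` and `all_orderings` and the toy annotations are assumptions introduced here for illustration.

```python
from collections import Counter
from itertools import permutations


def rare_to_frequent(label_set, label_counts):
    """Order a label set from the rarest to the most frequent label."""
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(label_set, key=lambda lab: (label_counts[lab], lab))


def all_orderings(label_set):
    """Enumerate every permutation (sequence) of an unordered label set."""
    return [list(p) for p in permutations(sorted(label_set))]


# Toy training annotations (hypothetical data, not taken from the paper).
train_labels = [
    {"person", "dog"},
    {"person", "car"},
    {"person", "dog", "frisbee"},
]

# Count how often each label occurs across the toy training set.
label_counts = Counter(lab for labels in train_labels for lab in labels)

sample = {"person", "dog", "frisbee"}
fixed_sequence = rare_to_frequent(sample, label_counts)  # one imposed order
orderings = all_orderings(sample)                        # the set as permutations

print(fixed_sequence)
print(len(orderings))
```

The first view commits each sample to a single sequence chosen by label frequency, while the second keeps every sequence of the same labels available, which is the equivalence the abstract appeals to. Enumerating permutations explicitly grows factorially with the number of labels, so this sketch is only meant for very small label sets.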

Topics: Automatic image annotation (57%)

About the journal

Journal	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher	Springer Science and Business Media Deutschland GmbH
ISSN	03029743

Authors (1)

Yashaswi Verma
- Department of Computer Science & Engineering
