-
Active Learning for Finely-Categorized Image-Text Retrieval by Selecting Hard Negative Unpaired Samples
Authors:
Dae Ung Jo,
Kyuewang Lee,
JaeHo Chung,
** Young Choi
Abstract:
Securing a sufficient amount of paired data is important for training an image-text retrieval (ITR) model, but collecting paired data is very expensive. To address this issue, we propose an active learning (AL) algorithm for ITR that collects paired data cost-efficiently. Previous studies assume that image-text pairs are given and that the annotator is asked for their category labels. However, in recent ITR studies, category labels have become less important because a retrieval model can be trained with only image-text pairs. For this reason, we set up an active learning scenario in which unpaired images (or texts) are given and the annotator provides the corresponding texts (or images) to form paired data. The key idea of the proposed AL algorithm is to select unpaired images (or texts) that can serve as hard negative samples for existing texts (or images). To this end, we introduce a novel scoring function for choosing hard negative samples. We validate the effectiveness of the proposed method on the Flickr30K and MS-COCO datasets.
Submitted 25 May, 2024;
originally announced May 2024.
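The abstract above does not specify the scoring function, so the following is only a minimal sketch of the selection idea under a stated assumption: an unpaired image is scored by its highest cosine similarity to any already-paired text embedding, and the top-scoring images are sent to the annotator. The names `cosine`, `hard_negative_score`, and `select_for_annotation` are hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def hard_negative_score(candidate, paired_texts):
    """Assumed score: similarity to the closest existing text embedding."""
    return max(cosine(candidate, t) for t in paired_texts)

def select_for_annotation(unpaired_images, paired_texts, budget):
    """Return the names of the `budget` highest-scoring unpaired images."""
    ranked = sorted(
        unpaired_images.items(),
        key=lambda item: hard_negative_score(item[1], paired_texts),
        reverse=True,
    )
    return [name for name, _ in ranked[:budget]]

# Toy embeddings in a shared 2-D space (illustrative values only).
unpaired = {"img_a": [1.0, 0.0], "img_b": [0.0, 1.0], "img_c": [0.9, 0.1]}
texts = [[1.0, 0.0]]
chosen = select_for_annotation(unpaired, texts, budget=2)
```

In this toy setting the two images closest to the existing text are selected, since they are the ones most easily confused with its true pair.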
-
Influential Rank: A New Perspective of Post-training for Robust Model against Noisy Labels
Authors:
Seulki Park,
Hwanjun Song,
Daeho Um,
Dae Ung Jo,
Sangdoo Yun,
** Young Choi
Abstract:
Deep neural networks can easily overfit even to noisy labels because of their high capacity, which degrades generalization performance. To overcome this issue, we propose a new approach to learning from noisy labels (LNL) via post-training, which can significantly improve the generalization performance of any model pre-trained on noisy label data. To this end, we exploit the overfitting property of a trained model to identify mislabeled samples. Specifically, our post-training approach gradually removes samples with high influence on the decision boundary and refines the decision boundary to improve generalization performance. Our post-training approach also combines well with existing LNL methods. Experimental results on various real-world and synthetic benchmark datasets demonstrate the validity of our approach in diverse realistic scenarios.
Submitted 19 April, 2023; v1 submitted 14 June, 2021;
originally announced June 2021.
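The gradual-removal loop described in this abstract can be illustrated with a toy sketch. It is not the paper's method: as an assumption, influence is approximated by a leave-one-out shift of the decision threshold of a 1-D nearest-class-mean classifier with labels 0 and 1. The names `boundary`, `influence`, and `post_train` are hypothetical.

```python
def boundary(samples):
    """Decision threshold of a 1-D nearest-class-mean classifier."""
    xs0 = [x for x, y in samples if y == 0]
    xs1 = [x for x, y in samples if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def influence(samples, i):
    """Assumed proxy: how far the threshold moves if sample i is removed."""
    rest = samples[:i] + samples[i + 1:]
    if len({y for _, y in rest}) < 2:
        return 0.0  # never score a removal that would empty a class
    return abs(boundary(samples) - boundary(rest))

def post_train(samples, rounds):
    """Remove the single most influential sample in each round."""
    kept = list(samples)
    for _ in range(rounds):
        scores = [influence(kept, i) for i in range(len(kept))]
        if not scores or max(scores) == 0.0:
            break  # nothing left that can be removed safely
        kept.pop(max(range(len(kept)), key=scores.__getitem__))
    return kept

# (5.0, 0) sits among class-1 points, so it plays the mislabeled sample.
data = [(0.0, 0), (0.2, 0), (0.1, 0), (5.0, 0), (4.8, 1), (5.2, 1), (5.1, 1)]
cleaned = post_train(data, rounds=1)
```

Recomputing the scores after every removal mirrors the "gradually removes ... and refines" wording: each removal changes the boundary, and therefore the influence of the remaining samples.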
-
Class-Attentive Diffusion Network for Semi-Supervised Classification
Authors:
Jongin Lim,
Daeho Um,
Hyung ** Chang,
Dae Ung Jo,
** Young Choi
Abstract:
Recently, graph neural networks for semi-supervised classification have been widely studied. However, existing methods use information from only a limited set of neighbors and do not deal with the inter-class connections in graphs. In this paper, we propose Adaptive aggregation with Class-Attentive Diffusion (AdaCAD), a new aggregation scheme that adaptively aggregates nodes that are likely to belong to the same class among K-hop neighbors. To this end, we first propose a novel stochastic process, called Class-Attentive Diffusion (CAD), that strengthens attention to intra-class nodes and attenuates attention to inter-class nodes. In contrast to existing diffusion methods, whose transition matrix is determined solely by the graph structure, CAD considers both the node features and the graph structure through a class-attentive transition matrix that utilizes a classifier. We then propose an adaptive update scheme that applies a different reflection ratio of the diffusion result to each node depending on the local class-context. As its main advantage, AdaCAD alleviates the undesired mixing of inter-class features caused by discrepancies between node labels and the graph topology. Built on AdaCAD, we construct a simple model called Class-Attentive Diffusion Network (CAD-Net). Extensive experiments on seven benchmark datasets consistently demonstrate the efficacy of the proposed method, and CAD-Net significantly outperforms the state-of-the-art methods. Code is available at https://github.com/l**0429/CAD-Net.
Submitted 29 December, 2020; v1 submitted 17 June, 2020;
originally announced June 2020.
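The abstract states that the CAD transition matrix depends on both the graph structure and a classifier, but gives no formula. The sketch below shows one way such a matrix could be built, under the assumption that the weight of an edge is the inner product of the two endpoints' predicted class distributions, normalized per row. The function name `class_attentive_transition` and this weighting are illustrative assumptions, not the definition used in CAD-Net.

```python
def class_attentive_transition(adj, probs):
    """Row-stochastic matrix that favours neighbours with similar predictions.

    adj   : n x n 0/1 adjacency matrix (list of lists)
    probs : n class-probability vectors from some classifier
    """
    n = len(adj)
    transition = []
    for i in range(n):
        weights = [
            sum(a * b for a, b in zip(probs[i], probs[j])) if adj[i][j] else 0.0
            for j in range(n)
        ]
        total = sum(weights)
        if total:
            transition.append([w / total for w in weights])
        else:
            # Isolated node: the walk stays where it is.
            transition.append([1.0 if j == i else 0.0 for j in range(n)])
    return transition

# Node 0 is linked to node 1 (similar prediction) and node 2 (different one).
adj = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
T = class_attentive_transition(adj, probs)
```

With these toy values the walk from node 0 moves to node 1 far more often than to node 2, which is the "strengthen intra-class, attenuate inter-class" behaviour the abstract describes.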
-
Cross-modal Variational Auto-encoder with Distributed Latent Spaces and Associators
Authors:
Dae Ung Jo,
ByeongJu Lee,
Jongwon Choi,
Haanju Yoo,
** Young Choi
Abstract:
In this paper, we propose a novel structure for cross-modal data association, inspired by recent research on the associative learning structure of the brain. We formulate cross-modal association in a Bayesian inference framework realized by a deep neural network with multiple variational auto-encoders and variational associators. The variational associators transfer the latent spaces between auto-encoders that represent different modalities. The proposed structure successfully associates even heterogeneous modal data and easily incorporates an additional modality into the network via the proposed cross-modal associator. Furthermore, the proposed structure can be trained with only a small amount of paired data, since the auto-encoders can be trained in an unsupervised manner. Experiments on various datasets, including visual and auditory data, validate the effectiveness of the proposed structure.
Submitted 30 May, 2019;
originally announced May 2019.
-
Backbone Can Not be Trained at Once: Rolling Back to Pre-trained Network for Person Re-Identification
Authors:
Youngmin Ro,
Jongwon Choi,
Dae Ung Jo,
Byeongho Heo,
Jongin Lim,
** Young Choi
Abstract:
In the person re-identification (ReID) task, training data is scarce, so it is common to fine-tune a classification network pre-trained on a large dataset. However, it is relatively difficult to sufficiently fine-tune the low-level layers of the network because of the vanishing gradient problem. In this work, we propose a novel fine-tuning strategy that allows low-level layers to be sufficiently trained by rolling back the weights of high-level layers to their initial pre-trained weights. Our strategy alleviates the vanishing gradient problem in low-level layers and robustly trains them to fit the ReID dataset, thereby increasing ReID performance. The improved performance of the proposed strategy is validated in several experiments. Furthermore, without any add-ons such as pose estimation or segmentation, our strategy achieves state-of-the-art performance using only a vanilla deep convolutional neural network architecture.
Submitted 18 January, 2019;
originally announced January 2019.
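The roll-back step in this abstract can be sketched on a plain dictionary of per-layer weights. This is an illustration only: the layer names, the choice of which layers count as high-level, and the helper name `roll_back` are assumptions, and the abstract does not specify how often or in what order layers are rolled back.

```python
import copy

def roll_back(current, pretrained, high_level_layers):
    """Reset the named high-level layers to their pre-trained weights.

    Low-level layers keep their fine-tuned weights, so the next
    fine-tuning period starts from them.
    """
    rolled = copy.deepcopy(current)
    for name in high_level_layers:
        rolled[name] = copy.deepcopy(pretrained[name])
    return rolled

# Hypothetical three-layer network; values stand in for weight tensors.
pretrained = {"conv1": [0.1, 0.2], "conv2": [0.3, 0.4], "fc": [0.5, 0.6]}
fine_tuned = {"conv1": [0.15, 0.25], "conv2": [0.9, 0.8], "fc": [1.5, 1.6]}

restarted = roll_back(fine_tuned, pretrained, high_level_layers=["conv2", "fc"])
```

After the call, `conv1` retains what it learned during fine-tuning while `conv2` and `fc` restart from the pre-trained values, so a further round of fine-tuning must again push gradient signal through the low-level layer.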