-
Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches
Authors:
Ayan Kumar Bhunia,
Viswanatha Reddy Gajjala,
Subhadeep Koley,
Rohit Kundu,
Aneeshan Sain,
Tao Xiang,
Yi-Zhe Song
Abstract:
The human visual system is remarkable in learning new visual concepts from just a few examples. This is precisely the goal behind few-shot class incremental learning (FSCIL), where the emphasis is additionally placed on ensuring the model does not suffer from "forgetting". In this paper, we push the boundary further for FSCIL by addressing two key questions that bottleneck its ubiquitous applicati…
▽ More
The human visual system is remarkable in learning new visual concepts from just a few examples. This is precisely the goal behind few-shot class incremental learning (FSCIL), where the emphasis is additionally placed on ensuring the model does not suffer from "forgetting". In this paper, we push the boundary further for FSCIL by addressing two key questions that bottleneck its ubiquitous application (i) can the model learn from diverse modalities other than just photo (as humans do), and (ii) what if photos are not readily accessible (due to ethical and privacy constraints). Our key innovation lies in advocating the use of sketches as a new modality for class support. The product is a "Doodle It Yourself" (DIY) FSCIL framework where the users can freely sketch a few examples of a novel class for the model to learn to recognize photos of that class. For that, we present a framework that infuses (i) gradient consensus for domain invariant learning, (ii) knowledge distillation for preserving old class information, and (iii) graph attention networks for message passing between old and novel classes. We experimentally show that sketches are better class support than text in the context of FSCIL, echoing findings elsewhere in the sketching literature.
△ Less
Submitted 28 March, 2022;
originally announced March 2022.
-
Partially Does It: Towards Scene-Level FG-SBIR with Partial Input
Authors:
Pinaki Nath Chowdhury,
Ayan Kumar Bhunia,
Viswanatha Reddy Gajjala,
Aneeshan Sain,
Tao Xiang,
Yi-Zhe Song
Abstract:
We scrutinise an important observation plaguing scene-level sketch research -- that a significant portion of scene sketches are "partial". A quick pilot study reveals: (i) a scene sketch does not necessarily contain all objects in the corresponding photo, due to the subjective holistic interpretation of scenes, (ii) there exists significant empty (white) regions as a result of object-level abstrac…
▽ More
We scrutinise an important observation plaguing scene-level sketch research -- that a significant portion of scene sketches are "partial". A quick pilot study reveals: (i) a scene sketch does not necessarily contain all objects in the corresponding photo, due to the subjective holistic interpretation of scenes, (ii) there exists significant empty (white) regions as a result of object-level abstraction, and as a result, (iii) existing scene-level fine-grained sketch-based image retrieval methods collapse as scene sketches become more partial. To solve this "partial" problem, we advocate for a simple set-based approach using optimal transport (OT) to model cross-modal region associativity in a partially-aware fashion. Importantly, we improve upon OT to further account for holistic partialness by comparing intra-modal adjacency matrices. Our proposed method is not only robust to partial scene-sketches but also yields state-of-the-art performance on existing datasets.
△ Less
Submitted 28 March, 2022;
originally announced March 2022.
-
MERANet: Facial Micro-Expression Recognition using 3D Residual Attention Network
Authors:
Viswanatha Reddy Gajjala,
Sai Prasanna Teja Reddy,
Snehasis Mukherjee,
Shiv Ram Dubey
Abstract:
Micro-expression has emerged as a promising modality in affective computing due to its high objectivity in emotion detection. Despite the higher recognition accuracy provided by the deep learning models, there are still significant scope for improvements in micro-expression recognition techniques. The presence of micro-expressions in small-local regions of the face, as well as the limited size of…
▽ More
Micro-expression has emerged as a promising modality in affective computing due to its high objectivity in emotion detection. Despite the higher recognition accuracy provided by the deep learning models, there are still significant scope for improvements in micro-expression recognition techniques. The presence of micro-expressions in small-local regions of the face, as well as the limited size of available databases, continue to limit the accuracy in recognizing micro-expressions. In this work, we propose a facial micro-expression recognition model using 3D residual attention network named MERANet to tackle such challenges. The proposed model takes advantage of spatial-temporal attention and channel attention together, to learn deeper fine-grained subtle features for classification of emotions. Further, the proposed model encompasses both spatial and temporal information simultaneously using the 3D kernels and residual connections. Moreover, the channel features and spatio-temporal features are re-calibrated using the channel and spatio-temporal attentions, respectively in each residual module. Our attention mechanism enables the model to learn to focus on different facial areas of interest. The experiments are conducted on benchmark facial micro-expression datasets. A superior performance is observed as compared to the state-of-the-art for facial micro-expression recognition on benchmark data.
△ Less
Submitted 23 January, 2022; v1 submitted 7 December, 2020;
originally announced December 2020.