doi 10.1007/s10462-021-10092-2

Neural Attention for Image Captioning: Review of Outstanding Methods

Authors: Zanyar Zohourianshahzadi, Jugal K. Kalita

Abstract: Image captioning is the task of automatically generating sentences that describe an input image in the best way possible. The most successful techniques for automatically generating image captions have recently used attentive deep learning models. There are variations in the way deep learning models with attention are designed. In this survey, we provide a review of literature related to attentive deep learning models for image captioning. Instead of offering a comprehensive review of all prior work on deep image captioning models, we explain various types of attention mechanisms used for the task of image captioning in deep learning models. The most successful deep learning models used for image captioning follow the encoder-decoder architecture, although there are differences in the way these models employ attention mechanisms. Via analysis on performance results from different attentive deep models for image captioning, we aim at finding the most successful types of attention mechanisms in deep models for image captioning. Soft attention, bottom-up attention, and multi-head attention are the types of attention mechanism widely used in state-of-the-art attentive deep learning models for image captioning. At the current time, the best results are achieved from variants of multi-head attention with bottom-up attention.

Submitted 29 November, 2021; originally announced November 2021.

Comments: This is the accepted version, which we are allowed to publish on arxiv based on Springer Nature policies. For the published version please refer to Springer Nature Artificial Intelligence Review Journal. DOI number is attached. For Citation refer to AIRE journal using DOI link

arXiv:2108.02807 [pdf, other]

doi 10.1142/S1793351X21500045

Neural Twins Talk & Alternative Calculations

Authors: Zanyar Zohourianshahzadi, Jugal K. Kalita

Abstract: Inspired by how the human brain employs a higher number of neural pathways when describing a highly focused subject, we show that deep attentive models used for the main vision-language task of image captioning, could be extended to achieve better performance. Image captioning bridges a gap between computer vision and natural language processing. Automated image captioning is used as a tool to eliminate the need for human agent for creating descriptive captions for unseen images. Automated image captioning is challenging and yet interesting. One reason is that AI based systems capable of generating sentences that describe an input image could be used in a wide variety of tasks beyond generating captions for unseen images found on web or uploaded to social media. For example, in biology and medical sciences, these systems could provide researchers and physicians with a brief linguistic description of relevant images, potentially expediting their work.

Submitted 5 August, 2021; originally announced August 2021.

Comments: This paper was published at World Scientific Journal, International Journal of Semantic Computing. This is a preprint version that was submitted to the journal before final publication. arXiv admin note: substantial text overlap with arXiv:2009.12524

Journal ref: International Journal of Semantic Computing, 2021, 93-116

arXiv:2009.12524 [pdf, other]

doi 10.1109/HCCAI49649.2020.00009

Neural Twins Talk

Authors: Zanyar Zohourianshahzadi, Jugal Kumar Kalita

Abstract: Inspired by how the human brain employs more neural pathways when increasing the focus on a subject, we introduce a novel twin cascaded attention model that outperforms a state-of-the-art image captioning model that was originally implemented using one channel of attention for the visual grounding task. Visual grounding ensures the existence of words in the caption sentence that are grounded into a particular region in the input image. After a deep learning model is trained on visual grounding task, the model employs the learned patterns regarding the visual grounding and the order of objects in the caption sentences, when generating captions. We report the results of our experiments in three image captioning tasks on the COCO dataset. The results are reported using standard image captioning metrics to show the improvements achieved by our model over the previous image captioning model. The results gathered from our experiments suggest that employing more parallel attention pathways in a deep neural network leads to higher performance. Our implementation of NTT is publicly available at: https://github.com/zanyarz/NeuralTwinsTalk.

Submitted 26 September, 2020; originally announced September 2020.

Journal ref: Proceeding of 2020 IEEE International Conference on Humanized Computing and Communication with Artificial Intelligence (HCCAI)

arXiv:1807.10854 [pdf, other]

A Survey of the Usages of Deep Learning in Natural Language Processing

Authors: Daniel W. Otter, Julian R. Medina, Jugal K. Kalita

Abstract: Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to a number of applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.

Submitted 21 December, 2019; v1 submitted 27 July, 2018; originally announced July 2018.

arXiv:1710.08628 [pdf, other]

DDoS Attacks: Tools, Mitigation Approaches, and Probable Impact on Private Cloud Environment

Authors: Rup Kumar Deka, Dhruba Kumar Bhattacharyya, Jugal Kumar Kalita

Abstract: The future of the Internet is predicted to be on the cloud, resulting in more complex and more intensive computing, but possibly also a more insecure digital world. The presence of a large amount of resources organized densely is a key factor in attracting DDoS attacks. Such attacks are arguably more dangerous in private individual clouds with limited resources. This paper discusses several prominent approaches introduced to counter DDoS attacks in private clouds. We also discuss issues and challenges to mitigate DDoS attacks in private clouds.

Submitted 24 October, 2017; originally announced October 2017.

Journal ref: Big Data Analytics for Internet of Things, 2020

arXiv:1211.4493 [pdf]

Survey on Incremental Approaches for Network Anomaly Detection

Authors: Monowar H. Bhuyan, D. K. Bhattacharyya, J. K. Kalita

Abstract: As the communication industry has connected distant corners of the globe using advances in network technology, intruders or attackers have also increased attacks on networking infrastructure commensurately. System administrators can attempt to prevent such attacks using intrusion detection tools and systems. There are many commercially available signature-based Intrusion Detection Systems (IDSs). However, most IDSs lack the capability to detect novel or previously unknown attacks. A special type of IDSs, called Anomaly Detection Systems, develop models based on normal system or network behavior, with the goal of detecting both known and unknown attacks. Anomaly detection systems face many problems including high rate of false alarm, ability to work in online mode, and scalability. This paper presents a selective survey of incremental approaches for detecting anomaly in normal system or network traffic. The technological trends, open problems, and challenges over anomaly detection using incremental approach are also discussed.

Submitted 19 November, 2012; v1 submitted 19 November, 2012; originally announced November 2012.

Comments: 14 pages, 1 figure, 11 tables referred journal publication

MSC Class: 68T10; ACM Class: K.6.5

Journal ref: International Journal of Communication Networks and Information Security (KUST), vol. 3, no. 3, pp. 226-239, 2011

Showing 1–6 of 6 results for author: Kalita, J K