-
Neural Attention for Image Captioning: Review of Outstanding Methods
Authors:
Zanyar Zohourianshahzadi,
Jugal K. Kalita
Abstract:
Image captioning is the task of automatically generating sentences that describe an input image in the best way possible. The most successful techniques for automatically generating image captions have recently used attentive deep learning models. There are variations in the way deep learning models with attention are designed. In this survey, we provide a review of literature related to attentive…
▽ More
Image captioning is the task of automatically generating sentences that describe an input image in the best way possible. The most successful techniques for automatically generating image captions have recently used attentive deep learning models. There are variations in the way deep learning models with attention are designed. In this survey, we provide a review of literature related to attentive deep learning models for image captioning. Instead of offering a comprehensive review of all prior work on deep image captioning models, we explain various types of attention mechanisms used for the task of image captioning in deep learning models. The most successful deep learning models used for image captioning follow the encoder-decoder architecture, although there are differences in the way these models employ attention mechanisms. Via analysis on performance results from different attentive deep models for image captioning, we aim at finding the most successful types of attention mechanisms in deep models for image captioning. Soft attention, bottom-up attention, and multi-head attention are the types of attention mechanism widely used in state-of-the-art attentive deep learning models for image captioning. At the current time, the best results are achieved from variants of multi-head attention with bottom-up attention.
△ Less
Submitted 29 November, 2021;
originally announced November 2021.
-
Neural Twins Talk & Alternative Calculations
Authors:
Zanyar Zohourianshahzadi,
Jugal K. Kalita
Abstract:
Inspired by how the human brain employs a higher number of neural pathways when describing a highly focused subject, we show that deep attentive models used for the main vision-language task of image captioning, could be extended to achieve better performance. Image captioning bridges a gap between computer vision and natural language processing. Automated image captioning is used as a tool to eli…
▽ More
Inspired by how the human brain employs a higher number of neural pathways when describing a highly focused subject, we show that deep attentive models used for the main vision-language task of image captioning, could be extended to achieve better performance. Image captioning bridges a gap between computer vision and natural language processing. Automated image captioning is used as a tool to eliminate the need for human agent for creating descriptive captions for unseen images.Automated image captioning is challenging and yet interesting. One reason is that AI based systems capable of generating sentences that describe an input image could be used in a wide variety of tasks beyond generating captions for unseen images found on web or uploaded to social media. For example, in biology and medical sciences, these systems could provide researchers and physicians with a brief linguistic description of relevant images, potentially expediting their work.
△ Less
Submitted 5 August, 2021;
originally announced August 2021.
-
Neural Twins Talk
Authors:
Zanyar Zohourianshahzadi,
Jugal Kumar Kalita
Abstract:
Inspired by how the human brain employs more neural pathways when increasing the focus on a subject, we introduce a novel twin cascaded attention model that outperforms a state-of-the-art image captioning model that was originally implemented using one channel of attention for the visual grounding task. Visual grounding ensures the existence of words in the caption sentence that are grounded into…
▽ More
Inspired by how the human brain employs more neural pathways when increasing the focus on a subject, we introduce a novel twin cascaded attention model that outperforms a state-of-the-art image captioning model that was originally implemented using one channel of attention for the visual grounding task. Visual grounding ensures the existence of words in the caption sentence that are grounded into a particular region in the input image. After a deep learning model is trained on visual grounding task, the model employs the learned patterns regarding the visual grounding and the order of objects in the caption sentences, when generating captions. We report the results of our experiments in three image captioning tasks on the COCO dataset. The results are reported using standard image captioning metrics to show the improvements achieved by our model over the previous image captioning model. The results gathered from our experiments suggest that employing more parallel attention pathways in a deep neural network leads to higher performance. Our implementation of NTT is publicly available at: https://github.com/zanyarz/NeuralTwinsTalk.
△ Less
Submitted 26 September, 2020;
originally announced September 2020.
-
A Survey of the Usages of Deep Learning in Natural Language Processing
Authors:
Daniel W. Otter,
Julian R. Medina,
Jugal K. Kalita
Abstract:
Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research…
▽ More
Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to a number of applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.
△ Less
Submitted 21 December, 2019; v1 submitted 27 July, 2018;
originally announced July 2018.
-
DDoS Attacks: Tools, Mitigation Approaches, and Probable Impact on Private Cloud Environment
Authors:
Rup Kumar Deka,
Dhruba Kumar Bhattacharyya,
Jugal Kumar Kalita
Abstract:
The future of the Internet is predicted to be on the cloud, resulting in more complex and more intensive computing, but possibly also a more insecure digital world. The presence of a large amount of resources organized densely is a key factor in attracting DDoS attacks. Such attacks are arguably more dangerous in private individual clouds with limited resources. This paper discusses several promin…
▽ More
The future of the Internet is predicted to be on the cloud, resulting in more complex and more intensive computing, but possibly also a more insecure digital world. The presence of a large amount of resources organized densely is a key factor in attracting DDoS attacks. Such attacks are arguably more dangerous in private individual clouds with limited resources. This paper discusses several prominent approaches introduced to counter DDoS attacks in private clouds. We also discuss issues and challenges to mitigate DDoS attacks in private clouds.
△ Less
Submitted 24 October, 2017;
originally announced October 2017.
-
Survey on Incremental Approaches for Network Anomaly Detection
Authors:
Monowar H. Bhuyan,
D. K. Bhattacharyya,
J. K. Kalita
Abstract:
As the communication industry has connected distant corners of the globe using advances in network technology, intruders or attackers have also increased attacks on networking infrastructure commensurately. System administrators can attempt to prevent such attacks using intrusion detection tools and systems. There are many commercially available signature-based Intrusion Detection Systems (IDSs).…
▽ More
As the communication industry has connected distant corners of the globe using advances in network technology, intruders or attackers have also increased attacks on networking infrastructure commensurately. System administrators can attempt to prevent such attacks using intrusion detection tools and systems. There are many commercially available signature-based Intrusion Detection Systems (IDSs). However, most IDSs lack the capability to detect novel or previously unknown attacks. A special type of IDSs, called Anomaly Detection Systems, develop models based on normal system or network behavior, with the goal of detecting both known and unknown attacks. Anomaly detection systems face many problems including high rate of false alarm, ability to work in online mode, and scalability. This paper presents a selective survey of incremental approaches for detecting anomaly in normal system or network traffic. The technological trends, open problems, and challenges over anomaly detection using incremental approach are also discussed.
△ Less
Submitted 19 November, 2012; v1 submitted 19 November, 2012;
originally announced November 2012.