-
Privacy Aware Memory Forensics
Authors:
Janardhan Kalikiri,
Gaurav Varshney,
Jaswinder Kour,
Tarandeep Singh
Abstract:
In recent years, insider threats and attacks have been increasing in terms of frequency and cost to the corporate business. The utilization of end-to-end encrypted instant messaging applications (WhatsApp, Telegram, VPN) by malicious insiders has raised data breach incidents exponentially. The Securities and Exchange Board of India (SEBI) investigated reports on such data leak incidents and reported about twelve companies where earnings data and financial information were leaked using WhatsApp messages. Recent surveys indicate that 60% of data breaches are primarily caused by malicious insider threats. In the defense environment especially, information leaks by insiders will jeopardize the country's national security. Sniffing of network and host-based activities will not work in an insider threat detection environment due to end-to-end encryption. Memory forensics allows access to the messages sent or received over an end-to-end encrypted environment, but with a total compromise of the user's privacy. In this research, we present a novel solution to detect data leakages by insiders in an organization. Our approach captures the RAM of the insider's device and analyses it for sensitive information leaks from a host system while maintaining the user's privacy. Sensitive data leaks are identified with context using a deep learning model. The feasibility and effectiveness of the proposed idea have been demonstrated with the help of a military use case. The proposed architecture can, however, be used across various use cases with minor modifications.
Submitted 13 June, 2024;
originally announced June 2024.
-
Cross-temporal Detection of Novel Ransomware Campaigns: A Multi-Modal Alert Approach
Authors:
Sathvik Murli,
Dhruv Nandakumar,
Prabhat Kumar Kushwaha,
Cheng Wang,
Christopher Redino,
Abdul Rahman,
Shalini Israni,
Tarun Singh,
Edward Bowen
Abstract:
We present a novel approach to identify ransomware campaigns derived from attack timeline representations within victim networks. Malicious activity profiles developed from multiple alert sources support the construction of alert graphs. This approach enables an effective and scalable representation of the attack timelines, where individual nodes represent malicious activity detections with connections describing the potential attack paths. This work demonstrates adaptability to different attack patterns through implementing a novel method for parsing and classifying alert graphs while maintaining efficacy despite potentially low-dimension node features.
Submitted 1 September, 2023;
originally announced September 2023.
-
Project Aria: A New Tool for Egocentric Multi-Modal AI Research
Authors:
Jakob Engel,
Kiran Somasundaram,
Michael Goesele,
Albert Sun,
Alexander Gamino,
Andrew Turner,
Arjang Talattof,
Arnie Yuan,
Bilal Souti,
Brighid Meredith,
Cheng Peng,
Chris Sweeney,
Cole Wilson,
Dan Barnes,
Daniel DeTone,
David Caruso,
Derek Valleroy,
Dinesh Ginjupalli,
Duncan Frost,
Edward Miller,
Elias Mueggler,
Evgeniy Oleinik,
Fan Zhang,
Guruprasad Somasundaram,
Gustavo Solaira
, et al. (49 additional authors not shown)
Abstract:
Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, multi-modal data recording and streaming device with the goal to foster and accelerate research in this area. In this paper, we describe the Aria device hardware including its sensor configuration and the corresponding software tools that enable recording and processing of such data.
Submitted 1 October, 2023; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Object counting from aerial remote sensing images: application to wildlife and marine mammals
Authors:
Tanya Singh,
Hugo Gangloff,
Minh-Tan Pham
Abstract:
Anthropogenic activities pose threats to wildlife and marine fauna, prompting the need for efficient animal counting methods. This research study utilizes deep learning techniques to automate counting tasks. Inspired by previous studies on crowd and animal counting, a UNet model with various backbones is implemented, which uses Gaussian density maps for training, bypassing the need of training a detector. The new model is applied to the task of counting dolphins and elephants in aerial images. Quantitative evaluation shows promising results, with the EfficientNet-B5 backbone achieving the best performance for African elephants and the ResNet18 backbone for dolphins. The model accurately locates animals despite complex image background conditions. By leveraging artificial intelligence, this research contributes to wildlife conservation efforts and enhances coexistence between humans and wildlife through efficient object counting without detection from aerial remote sensing.
Submitted 17 June, 2023;
originally announced June 2023.
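The abstract above trains on Gaussian density maps instead of a detector. The following is a minimal sketch of how such a training target is commonly built from point annotations, not the authors' implementation: the function names, the `sigma` value, and the per-point normalization are all assumptions made for illustration. Each annotated animal contributes one Gaussian blob whose pixel values sum to 1, so summing the whole map recovers the count.

```python
import math


def gaussian_density_map(points, height, width, sigma=2.0):
    """Build a density map from (row, col) point annotations.

    Assumption: every point lies inside the image, so each blob has
    non-zero mass and can be normalized to sum to exactly 1.
    """
    density = [[0.0] * width for _ in range(height)]
    for (py, px) in points:
        # Unnormalized Gaussian centred on the annotated point.
        weights = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(sum(row) for row in weights)
        # Normalize so this single object contributes a mass of 1.
        for y in range(height):
            for x in range(width):
                density[y][x] += weights[y][x] / total
    return density


def estimated_count(density):
    """The object count is the integral (sum) of the density map."""
    return sum(sum(row) for row in density)


if __name__ == "__main__":
    # Three hypothetical animal positions in a 16x16 image.
    annotations = [(2, 3), (10, 10), (5, 12)]
    target = gaussian_density_map(annotations, height=16, width=16)
    print(round(estimated_count(target), 6))
```

A network trained to regress such a map can then be evaluated by comparing the sum of its predicted map with the annotated count, which is what makes counting possible without detecting individual objects.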
-
A deep-learning approach to early identification of suggested sexual harassment from videos
Authors:
Shreya Shetye,
Anwita Maiti,
Tannistha Maiti,
Tarry Singh
Abstract:
Sexual harassment, sexual abuse, and sexual violence are prevalent problems in this day and age. Women's safety is an important issue that needs to be highlighted and addressed. Given this issue, we have studied each of these concerns and the factors that affect it based on images generated from movies. We have classified the three terms (harassment, abuse, and violence) based on the visual attributes present in images depicting these situations. We identified that factors such as facial expression of the victim and perpetrator and unwanted touching had a direct link to identifying the scenes containing sexual harassment, abuse and violence. We also studied and outlined how state-of-the-art explicit content detectors such as Google Cloud Vision API and Clarifai API fail to identify and categorise these images. Based on these definitions and characteristics, we have developed a first-of-its-kind dataset from various Indian movie scenes. These scenes are classified as sexual harassment, sexual abuse, or sexual violence and exported in the PASCAL VOC 1.1 format. Our dataset is annotated on the identified relevant features and can be used to develop and train a deep-learning computer vision model to identify these issues. The dataset is publicly available for research and development.
Submitted 1 June, 2023;
originally announced June 2023.
-
The Impact of Large Language Multi-Modal Models on the Future of Job Market
Authors:
Tarry Singh
Abstract:
The rapid advancements in artificial intelligence, particularly in large language multi-modal models like GPT-4, have raised concerns about the potential displacement of human workers in various industries. This position paper aims to analyze the current state of job replacement by AI models and explores potential implications and strategies for a balanced coexistence between AI and human workers.
Submitted 22 March, 2023;
originally announced April 2023.
-
On Structural and Spectral Properties of Distance Magic Graphs
Authors:
Himadri Mukherjee,
Ravindra Pawar,
Tarkeshwar Singh
Abstract:
A graph $G=(V,E)$ is said to be distance magic if there is a bijection $f$ from a vertex set of $G$ to the first $|V(G)|$ natural numbers such that for each vertex $v$, its weight given by $\sum_{u \in N(v)}f(u)$ is constant, where $N(v)$ is an open neighborhood of a vertex $v$. In this paper, we introduce the concept of $p$-distance magic labeling and establish the necessary and sufficient condition for a graph to be distance magic. Additionally, we introduce necessary and sufficient conditions for a connected regular graph to exhibit distance magic properties in terms of the eigenvalues of its adjacency and Laplacian matrices. Furthermore, we study the spectra of distance magic graphs, focusing on singular distance magic graphs. Also, we show that the number of distance magic labelings of a graph is, at most, the size of its automorphism group.
Submitted 8 February, 2024; v1 submitted 11 February, 2023;
originally announced February 2023.
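The distance magic condition defined in the abstract above can be checked by brute force on small graphs. The sketch below is illustrative only: the helper names and the adjacency-dictionary representation are assumptions, and the exhaustive search is practical only for a handful of vertices. It uses the 4-cycle as an example, where labeling opposite vertices with the pairs {1, 4} and {2, 3} gives every vertex the weight 5.

```python
from itertools import permutations


def vertex_weights(adj, f):
    """Weight of v is the sum of f(u) over the open neighborhood N(v)."""
    return {v: sum(f[u] for u in adj[v]) for v in adj}


def is_distance_magic(adj, f):
    """True if f is a bijection onto 1..|V| with constant vertex weight."""
    n = len(adj)
    if sorted(f.values()) != list(range(1, n + 1)):
        return False
    return len(set(vertex_weights(adj, f).values())) == 1


def count_distance_magic_labelings(adj):
    """Count labelings by trying every bijection (small graphs only)."""
    vertices = sorted(adj)
    labels = range(1, len(vertices) + 1)
    return sum(
        1
        for perm in permutations(labels)
        if is_distance_magic(adj, dict(zip(vertices, perm)))
    )


if __name__ == "__main__":
    # The cycle C4 with vertices 0-1-2-3-0.
    c4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
    labeling = {0: 1, 1: 2, 2: 4, 3: 3}
    print(vertex_weights(c4, labeling))
    print(is_distance_magic(c4, labeling))
    print(count_distance_magic_labelings(c4))
```

For the 4-cycle the search finds 8 labelings, and its automorphism group (the dihedral group of the square) also has order 8, which agrees with the upper bound stated in the abstract.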
-
Dynamic Speech Endpoint Detection with Regression Targets
Authors:
Dawei Liang,
Hang Su,
Tarun Singh,
Jay Mahadeokar,
Shanil Puri,
Jiedan Zhu,
Edison Thomaz,
Mike Seltzer
Abstract:
Interactive voice assistants have been widely used as input interfaces in various scenarios, e.g. on smart home devices, wearables, and AR devices. Detecting the end of a speech query, i.e. speech end-pointing, is an important task for voice assistants to interact with users. Traditionally, speech end-pointing is based on pure classification methods along with arbitrary binary targets. In this paper, we propose a novel regression-based speech end-pointing model, which enables an end-pointer to adjust its detection behavior based on the context of user queries. Specifically, we present a pause modeling method and show its effectiveness for dynamic end-pointing. Based on our experiments with vendor-collected smartphone and wearables speech queries, our strategy shows a better trade-off between end-pointing latency and accuracy, compared to the traditional classification-based method. We further discuss the benefits of this model and generalization of the framework in the paper.
Submitted 25 October, 2022;
originally announced October 2022.
-
Digitization of Raster Logs: A Deep Learning Approach
Authors:
M Quamer Nasim,
Narendra Patwardhan,
Tannistha Maiti,
Tarry Singh
Abstract:
Raster well-log images are digital representations of well-log data generated over the years. Raster digital well logs represent bitmaps of the log image in a rectangular array of black (zeros) and white (ones) dots called pixels. Experts study the raster logs manually or with software applications that still require a tremendous amount of manual input. Besides the loss of thousands of person-hours, this process is error-prone and tedious. To digitize these raster logs, one must buy a costly digitizer that is not only manual and time-consuming but also a hidden technical debt, since enterprises stand to lose more money in additional servicing and consulting charges. We propose a deep neural network architecture called VeerNet to semantically segment the raster images from the background grid and classify and digitize the well-log curves. Raster logs have a substantially greater resolution than images traditionally consumed by image segmentation pipelines. Since the input has a low signal-to-resolution ratio, we require rapid downsampling to alleviate unnecessary computation. We thus employ a modified UNet-inspired architecture that balances retaining key signals and reducing result dimensionality. We use an attention-augmented read-process-write architecture. This architecture efficiently classifies and digitizes the curves with an overall F1 score of 35% and IoU of 30%. When the actual LAS values for Gamma-ray were compared with the Gamma-ray values derived from VeerNet, a high Pearson coefficient score of 0.62 was achieved.
Submitted 11 October, 2022;
originally announced October 2022.
-
Programmable and Customized Intelligence for Traffic Steering in 5G Networks Using Open RAN Architectures
Authors:
Andrea Lacava,
Michele Polese,
Rajarajan Sivaraj,
Rahul Soundrarajan,
Bhawani Shanker Bhati,
Tarunjeet Singh,
Tommaso Zugno,
Francesca Cuomo,
Tommaso Melodia
Abstract:
5G and beyond mobile networks will support heterogeneous use cases at an unprecedented scale, thus demanding automated control and optimization of network functionalities customized to the needs of individual users. Such fine-grained control of the Radio Access Network (RAN) is not possible with the current cellular architecture. To fill this gap, the Open RAN paradigm and its specification introduce an open architecture with abstractions that enable closed-loop control and provide data-driven, and intelligent optimization of the RAN at the user level. This is obtained through custom RAN control applications (i.e., xApps) deployed on near-real-time RAN Intelligent Controller (near-RT RIC) at the edge of the network. Despite these premises, as of today the research community lacks a sandbox to build data-driven xApps, and create large-scale datasets for effective AI training. In this paper, we address this by introducing ns-O-RAN, a software framework that integrates a real-world, production-grade near-RT RIC with a 3GPP-based simulated environment on ns-3, enabling the development of xApps and automated large-scale data collection and testing of Deep Reinforcement Learning-driven control policies for the optimization at the user-level. In addition, we propose the first user-specific O-RAN Traffic Steering (TS) intelligent handover framework. It uses Random Ensemble Mixture, combined with a state-of-the-art Convolutional Neural Network architecture, to optimally assign a serving base station to each user in the network. Our TS xApp, trained with more than 40 million data points collected by ns-O-RAN, runs on the near-RT RIC and controls its base stations. We evaluate the performance on a large-scale deployment, showing that the xApp-based handover improves throughput and spectral efficiency by an average of 50% over traditional handover heuristics, with less mobility overhead.
Submitted 14 October, 2022; v1 submitted 28 September, 2022;
originally announced September 2022.
-
Energy-Time Optimal Control of Wheeled Mobile Robots
Authors:
Youngjin Kim,
Tarunraj Singh
Abstract:
This paper focuses on the energy-time optimal control of wheeled mobile robots undergoing point-to-point transitions in an obstacle-free space. Two interchangeable models are used to arrive at the necessary conditions for optimality. The first formulation exploits the Hamiltonian, while the second formulation considers the first variation of the augmented cost to derive the necessary conditions for optimality. Jacobi elliptic functions are shown to parameterize the closed-form solutions for the states, control and costates. Analysis of the optimal controls reveals that they are constrained to lie on a cylinder whose circular cross-section is a function of the weight penalizing the relative costs of time and energy. The evolving optimal costates for the second formulation are shown to lie on the intersection of two cylinders. The optimal control for the wheeled mobile robot undergoing point-to-point motion is also developed where the linear velocity is constrained to be time-invariant. It is shown that the costates are constrained to lie on the intersection of a cylinder and an extruded parabola. Numerical results for various point-to-point maneuvers are presented to illustrate the change in the structure of the optimal trajectories as a function of the relative location of the terminal and initial states.
Submitted 6 June, 2022;
originally announced June 2022.
-
Classification of Skin Cancer Images using Convolutional Neural Networks
Authors:
Kartikeya Agarwal,
Tismeet Singh
Abstract:
Skin cancer is the most common human malignancy (American Cancer Society), which is primarily diagnosed visually, starting with an initial clinical screening and followed potentially by dermoscopic (related to skin) analysis, a biopsy and histopathological examination. Skin cancer occurs when errors (mutations) occur in the DNA of skin cells. The mutations cause the cells to grow out of control and form a mass of cancer cells. The aim of this study was to classify images of skin lesions with the help of convolutional neural networks. Deep neural networks show considerable potential for image classification while taking into account the large variability exhibited by the environment. Here we trained the networks on image pixel values and classified the images on the basis of disease labels. The dataset was acquired from an open-source Kaggle repository (Kaggle Dataset), which itself was acquired from the ISIC (International Skin Imaging Collaboration) Archive. The training was performed on multiple models accompanied with Transfer Learning. The highest model accuracy achieved was over 86.65%. The dataset used is publicly available to ensure credibility and reproducibility of the aforementioned result.
Submitted 1 February, 2022;
originally announced February 2022.
-
Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric
Authors:
Suyoun Kim,
Duc Le,
Weiyi Zheng,
Tarun Singh,
Abhinav Arora,
Xiaoyu Zhai,
Christian Fuegen,
Ozlem Kalinli,
Michael L. Seltzer
Abstract:
Measuring automatic speech recognition (ASR) system quality is critical for creating user-satisfying voice-driven applications. Word Error Rate (WER) has been traditionally used to evaluate ASR system quality; however, it sometimes correlates poorly with user perception/judgement of transcription quality. This is because WER weighs every word equally and does not consider semantic correctness which has a higher impact on user perception. In this work, we propose evaluating ASR output hypotheses quality with SemDist that can measure semantic correctness by using the distance between the semantic vectors of the reference and hypothesis extracted from a pre-trained language model. Our experimental results of 71K and 36K user annotated ASR output quality show that SemDist achieves higher correlation with user perception than WER. We also show that SemDist has higher correlation with downstream Natural Language Understanding (NLU) tasks than WER.
Submitted 5 July, 2022; v1 submitted 11 October, 2021;
originally announced October 2021.
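The contrast the abstract above draws between WER and a semantic distance can be illustrated with a small sketch. This is not the authors' SemDist implementation: the function names are assumptions, and the three embedding vectors are hypothetical toy values standing in for sentence vectors that would come from a pre-trained language model. WER is computed as word-level edit distance divided by the reference length, and the semantic score as cosine distance between two vectors.

```python
import math


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: the reference contains at least one word.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    reference = "play the song"
    # Both hypotheses contain exactly one substituted word.
    print(word_error_rate(reference, "play a song"))
    print(word_error_rate(reference, "pay the song"))

    # Hypothetical sentence embeddings (toy values, not model output).
    ref_vec = [0.9, 0.1, 0.3]
    close_vec = [0.88, 0.12, 0.31]   # meaning preserved
    far_vec = [0.1, 0.9, 0.2]        # meaning changed
    print(cosine_distance(ref_vec, close_vec))
    print(cosine_distance(ref_vec, far_vec))
```

Both hypotheses receive the same WER because each has one substitution, while a vector-based distance can separate them if the embedding model places the meaning-preserving hypothesis closer to the reference.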
-
A Comprehensive Review on Recent Methods and Challenges of Video Description
Authors:
Alok Singh,
Thoudam Doren Singh,
Sivaji Bandyopadhyay
Abstract:
Video description involves the generation of a natural language description of the actions, events, and objects in a video. Video description has various applications: bridging the gap between language and vision for visually impaired people, generating automatic title suggestions based on content, content-based video browsing, video-guided machine translation [86], etc. In the past decade, several works have been done in this field in terms of approaches/methods for video description, evaluation metrics, and datasets. For analyzing the progress in the video description task, a comprehensive survey is needed that covers all the phases of video description approaches with a special focus on recent deep learning approaches. In this work, we report a comprehensive survey on the phases of video description approaches, the datasets for video description, evaluation metrics, open competitions for motivating research on video description, open challenges in this field, and future research directions. In this survey, we cover the state-of-the-art approaches proposed for each dataset with their pros and cons. For the growth of this research domain, the availability of numerous benchmark datasets is a basic need. Further, we categorize all the datasets into two classes: open-domain datasets and domain-specific datasets. From our survey, we observe that work in this field is developing at a fast pace, since the task of video description falls at the intersection of computer vision and natural language processing. Still, the work in video description is far from the saturation stage due to various challenges, such as the redundancy due to similar frames, which affects the quality of visual features, the availability of datasets containing more diverse content, and the availability of an effective evaluation metric.
Submitted 30 November, 2020;
originally announced November 2020.
-
Seismic Facies Analysis: A Deep Domain Adaptation Approach
Authors:
M Quamer Nasim,
Tannistha Maiti,
Ayush Srivastava,
Tarry Singh,
Jie Mei
Abstract:
Deep neural networks (DNNs) can learn accurately from large quantities of labeled input data, but often fail to do so when labelled data are scarce. DNNs sometimes fail to generalize on test data sampled from different input distributions. Unsupervised Deep Domain Adaptation (DDA) techniques have been proven useful when no labels are available, and when distribution shifts are observed in the target domain (TD). In the present study, experiments are performed on seismic images of the F3 block 3D dataset from offshore Netherlands (source domain; SD) and Penobscot 3D survey data from Canada (target domain; TD). Three geological classes from SD and TD that have similar reflection patterns are considered. A deep neural network architecture named EarthAdaptNet (EAN) is proposed to semantically segment the seismic images when few classes have data scarcity, and we use a transposed residual unit to replace the traditional dilated convolution in the decoder block. The EAN achieved a pixel-level accuracy >84% and an accuracy of ~70% for the minority classes, showing improved performance compared to existing architectures. In addition, we introduce the CORAL (Correlation Alignment) method to the EAN to create an unsupervised deep domain adaptation network (EAN-DDA) for the classification of seismic reflections from F3 and Penobscot, to demonstrate possible approaches when labelled data are unavailable. Maximum class accuracy achieved was ~99% for class 2 of Penobscot, with an overall accuracy >50%. Taken together, the EAN-DDA has the potential to classify target domain seismic facies classes with high accuracy.
Submitted 27 October, 2021; v1 submitted 20 November, 2020;
originally announced November 2020.
-
NITS-Hinglish-SentiMix at SemEval-2020 Task 9: Sentiment Analysis For Code-Mixed Social Media Text Using an Ensemble Model
Authors:
Subhra Jyoti Baroi,
Nivedita Singh,
Ringki Das,
Thoudam Doren Singh
Abstract:
Sentiment Analysis is the process of deciphering what a sentence emotes and classifying it as either positive, negative, or neutral. In recent times, India has seen a huge influx in the number of active social media users, and this has led to a plethora of unstructured text data. Since the Indian population is generally fluent in both Hindi and English, they end up generating code-mixed Hinglish social media text, i.e. expressions of the Hindi language written in the Roman script alongside other English words. The ability to adequately comprehend the notions in these texts is truly necessary. Our team, rns2020, participated in Task 9 at SemEval-2020, intending to design a system to carry out the sentiment analysis of code-mixed social media text. This work proposes a system named NITS-Hinglish-SentiMix to viably complete the sentiment analysis of such code-mixed Hinglish text. The proposed framework has recorded an F-Score of 0.617 on the test data.
Submitted 4 September, 2020; v1 submitted 23 July, 2020;
originally announced July 2020.
-
Towards the Study of Morphological Processing of the Tangkhul Language
Authors:
Mirinso Shadang,
Navanath Saharia,
Thoudam Doren Singh
Abstract:
There is little or no work on natural language processing of the Tangkhul language. The current work is a humble beginning of morphological processing of this language using an unsupervised approach. We use a small corpus collected from different sources: textbooks, short stories and articles on other topics. Based on the experiments carried out, the morpheme identification task using Morfessor gives reasonable and interesting output despite using a small corpus.
Submitted 29 June, 2020;
originally announced June 2020.
-
Seq2Seq and Joint Learning Based Unix Command Line Prediction System
Authors:
Thoudam Doren Singh,
Abdullah Faiz Ur Rahman Khilji,
Divyansha,
Apoorva Vikram Singh,
Surmila Thokchom,
Sivaji Bandyopadhyay
Abstract:
Despite being an open-source operating system pioneered in the early 90s, UNIX based platforms have not been able to garner an overwhelming reception from amateur end users. One of the rationales for under popularity of UNIX based systems is the steep learning curve corresponding to them due to extensive use of command line interface instead of usual interactive graphical user interface. In past years, the majority of insights used to explore the concern are eminently centered around the notion of utilizing chronic log history of the user to make the prediction of successive command. The approaches directed at anatomization of this notion are predominantly in accordance with Probabilistic inference models. The techniques employed in past, however, have not been competent enough to address the predicament as legitimately as anticipated. Instead of deploying usual mechanism of recommendation systems, we have employed a simple yet novel approach of Seq2seq model by leveraging continuous representations of self-curated exhaustive Knowledge Base (KB) to enhance the embedding employed in the model. This work describes an assistive, adaptive and dynamic way of enhancing UNIX command line prediction systems. Experimental methods state that our model has achieved accuracy surpassing mixture of other techniques and adaptive command line interface mechanism as acclaimed in the past.
Submitted 20 June, 2020;
originally announced June 2020.
-
NITS-VC System for VATEX Video Captioning Challenge 2020
Authors:
Alok Singh,
Thoudam Doren Singh,
Sivaji Bandyopadhyay
Abstract:
Video captioning is the process of summarising the content, events and actions of a video in a short textual form, which can be helpful in many research areas such as video-guided machine translation and video sentiment analysis, and in providing aid to individuals in need. In this paper, a system description of the framework used for the VATEX-2020 video captioning challenge is presented. We employ an encoder-decoder based approach in which the visual features of the video are encoded using a 3D convolutional neural network (C3D). In the decoding phase, two Long Short Term Memory (LSTM) recurrent networks are used, in which visual features and input captions are fused separately, and the final output is generated by taking the element-wise product of the outputs of both LSTMs. Our model achieves BLEU scores of 0.20 and 0.22 on the public and private test data sets respectively.
Submitted 25 September, 2020; v1 submitted 7 June, 2020;
originally announced June 2020.
-
Devising Malware Characterstics using Transformers
Authors:
Simra Shahid,
Tanmay Singh,
Yash Sharma,
Kapil Sharma
Abstract:
With the increasing number of cybersecurity threats, it becomes more difficult for researchers to skim through security reports for malware analysis. There is a need to extract highly relevant sentences without having to read through entire malware reports. In this paper, we find relevant malware behavior mentions in Advanced Persistent Threat reports. The main contribution is an initial attempt to apply a Transformer-based approach to malware behavior analysis.
Submitted 23 May, 2020;
originally announced May 2020.
-
Big Computing: Where are we heading?
Authors:
Sabuzima Nayak,
Ripon Patgiri,
Thoudam Doren Singh
Abstract:
This paper presents an overview of current trends in Big data against the computing scenario from different aspects. Some of the important aspects include Exascale, computing power, and the kinds of applications that offer Big data. It starts with the current computing hardware constraints against the needs of rising Big data applications. We highlight the issues and challenges of energy requirements, software complexity, hardware failure, fault-tolerant computing, and communication, as the complexity of computation is going to rise in the future. The paper also highlights the future direction of Big computing systems for Bioinformatics, social media, hardware and software requirements, data-intensive computation, and the move towards the GPU era.
Submitted 9 April, 2020;
originally announced May 2020.
-
Investigating Transferability in Pretrained Language Models
Authors:
Alex Tamkin,
Trisha Singh,
Davide Giovanardi,
Noah Goodman
Abstract:
How does language model pretraining help transfer learning? We consider a simple ablation technique for determining the impact of each pretrained layer on transfer task performance. This method, partial reinitialization, involves replacing different layers of a pretrained model with random weights, then finetuning the entire model on the transfer task and observing the change in performance. This technique reveals that in BERT, layers with high probing performance on downstream GLUE tasks are neither necessary nor sufficient for high accuracy on those tasks. Furthermore, the benefit of using pretrained parameters for a layer varies dramatically with finetuning dataset size: parameters that provide tremendous performance improvement when data is plentiful may provide negligible benefits in data-scarce settings. These results reveal the complexity of the transfer learning process, highlighting the limitations of methods that operate on frozen models or single data samples.
Submitted 9 November, 2020; v1 submitted 30 April, 2020;
originally announced April 2020.
-
Identifying Weakly Connected Subsystems in Building Energy Model for Effective Load Estimation in Presence of Parametric Uncertainty
Authors:
Arpan Mukherjee,
Anna Kuechle Szweda,
Andrew Alegria,
Rahul Rai,
Tarunraj Singh
Abstract:
It is necessary to estimate the expected energy usage of a building to determine how to reduce energy usage. The expected energy usage of a building can be reliably simulated using a Building Energy Model (BEM). Many of the numerous input parameters in a BEM are uncertain. To ensure that the building simulation is sufficiently accurate, and to better understand the impact of imprecisions in the input parameters and calculation methods, it is desirable to quantify uncertainty in the BEM throughout the modeling process. Uncertainty quantification (UQ) typically requires a large number of simulations to produce meaningful data, which, due to the vast number of input parameters and the dynamic nature of building simulation, is computationally expensive. UQ in the BEM domain is thus intractable due to the size of the problem and the parameters involved, and hence needs an advanced methodology for analysis. The current paper outlines a novel Weakly-Connected-Systems (WCSs) identification-based UQ framework developed to propagate the quantifiable uncertainty in the BEM. The overall approach is demonstrated on the physics-based thermal model of an actual building in Central New York.
Submitted 17 April, 2020;
originally announced April 2020.
-
Systematically designing better instance counting models on cell images with Neural Arithmetic Logic Units
Authors:
Ashish Rana,
Taranveer Singh,
Harpreet Singh,
Neeraj Kumar,
Prashant Singh Rana
Abstract:
The main problem for neural network models trained to count instances is that generalization error increases whenever the test range exceeds the training range, i.e. they do not generalize well outside the training range. Consider the case of automating the cell counting process, where denser images with higher cell counts are commonly encountered compared to the images used in training data. By making better predictions for higher ranges of cell count, we aim to create better generalization systems for cell counting. With the proposal of neural arithmetic logic units (NALU) for arithmetic operations, counting has become feasible, with better accuracy, for higher numeric ranges that were not included in the training data. As part of our study, we used these units and various other activation functions to learn the cell counting task with two different architectures, namely Fully Convolutional Regression Network and U-Net. These numerically biased units are added to the original architectures in the form of residual concatenated layers, and a comparative experimental study is done with these newly proposed changes. This comparative study is described in terms of optimizing the regression loss for these models, trained with extensive data augmentation techniques. We achieved better results in our cell counting experiments by introducing these numerically biased units into existing architectures in the form of residual layer concatenation connections. Our results confirm that the above-stated numerically biased units do help models learn numeric quantities for better generalization results.
Submitted 15 June, 2020; v1 submitted 14 April, 2020;
originally announced April 2020.
-
Fair Generative Modeling via Weak Supervision
Authors:
Kristy Choi,
Aditya Grover,
Trisha Singh,
Rui Shu,
Stefano Ermon
Abstract:
Real-world datasets are often biased with respect to key demographic factors such as race and gender. Due to the latent nature of the underlying factors, detecting and mitigating bias is especially challenging for unsupervised machine learning. We present a weakly supervised algorithm for overcoming dataset bias for deep generative models. Our approach requires access to an additional small, unlabeled reference dataset as the supervision signal, thus sidestepping the need for explicit labels on the underlying bias factors. Using this supplementary dataset, we detect the bias in existing datasets via a density ratio technique and learn generative models which efficiently achieve the twin goals of: 1) data efficiency by using training examples from both biased and reference datasets for learning; and 2) data generation close in distribution to the reference dataset at test time. Empirically, we demonstrate the efficacy of our approach which reduces bias w.r.t. latent factors by an average of up to 34.6% over baselines for comparable image generation using generative adversarial networks.
Submitted 30 June, 2020; v1 submitted 26 October, 2019;
originally announced October 2019.
-
Reduced-order modeling using Dynamic Mode Decomposition and Least Angle Regression
Authors:
John Graff,
Xianzhang Xu,
Francis D. Lagor,
Tarunraj Singh
Abstract:
Dynamic Mode Decomposition (DMD) yields a linear, approximate model of a system's dynamics that is built from data. We seek to reduce the order of this model by identifying a reduced set of modes that best fit the output. We adopt a model selection algorithm from statistics and machine learning known as Least Angle Regression (LARS). We modify LARS to be complex-valued and utilize LARS to select DMD modes. We refer to the resulting algorithm as Least Angle Regression for Dynamic Mode Decomposition (LARS4DMD). Sparsity-Promoting Dynamic Mode Decomposition (DMDSP), a popular mode-selection algorithm, serves as a benchmark for comparison. Numerical results from a Poiseuille flow test problem show that LARS4DMD yields reduced-order models that have comparable performance to DMDSP. LARS4DMD has the added benefit that the regularization weighting parameter required for DMDSP is not needed.
Submitted 16 January, 2020; v1 submitted 16 May, 2019;
originally announced May 2019.
-
A Hybrid Framework for Action Recognition in Low-Quality Video Sequences
Authors:
Tej Singh,
Dinesh Kumar Vishwakarma
Abstract:
Vision-based activity recognition is essential for security, monitoring and surveillance applications. However, real-time analysis often involves low-quality video that contains less information about the surroundings due to poor illumination and occlusions. Therefore, a more robust and integrated model is needed for low-quality and night security operations. In this context, we propose a hybrid model for illumination-invariant human activity recognition based on sub-image histogram equalization enhancement and k-key pose human silhouettes. This feature vector gives good average recognition accuracy on three low-exposure video sequence subsets of the original action video datasets. Finally, the performance of the proposed approach is tested on three manually downgraded low-quality datasets: Weizmann action, KTH, and Ballet Movement. The model outperforms existing techniques on low-exposure videos and achieves classification accuracy comparable to similar state-of-the-art methods.
Submitted 10 March, 2019;
originally announced March 2019.
-
EvalAI: Towards Better Evaluation Systems for AI Agents
Authors:
Deshraj Yadav,
Rishabh Jain,
Harsh Agrawal,
Prithvijit Chattopadhyay,
Taranjeet Singh,
Akash Jain,
Shiv Baran Singh,
Stefan Lee,
Dhruv Batra
Abstract:
We introduce EvalAI, an open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale. EvalAI is built to provide a scalable solution to the research community to fulfill the critical need of evaluating machine learning models and agents acting in an environment against annotations or with a human-in-the-loop. This will help researchers, students, and data scientists to create, collaborate, and participate in AI challenges organized around the globe. By simplifying and standardizing the process of benchmarking these models, EvalAI seeks to lower the barrier to entry for participating in the global scientific effort to push the frontiers of machine learning and artificial intelligence, thereby increasing the rate of measurable progress in this domain.
Submitted 10 February, 2019;
originally announced February 2019.
-
Learning concise representations for regression by evolving networks of trees
Authors:
William La Cava,
Tilak Raj Singh,
James Taggart,
Srinivas Suri,
Jason H. Moore
Abstract:
We propose and study a method for learning interpretable representations for the task of regression. Features are represented as networks of multi-type expression trees comprised of activation functions common in neural networks in addition to other elementary functions. Differentiable features are trained via gradient descent, and the performance of features in a linear model is used to weight the rate of change among subcomponents of each representation. The search process maintains an archive of representations with accuracy-complexity trade-offs to assist in generalization and interpretation. We compare several stochastic optimization approaches within this framework. We benchmark these variants on 100 open-source regression problems in comparison to state-of-the-art machine learning approaches. Our main finding is that this approach produces the highest average test scores across problems while producing representations that are orders of magnitude smaller than the next best performing method (gradient boosting). We also report a negative result in which attempts to directly optimize the disentanglement of the representation result in more highly correlated features.
Submitted 25 March, 2019; v1 submitted 3 July, 2018;
originally announced July 2018.
-
Hybrid Memristor-CMOS (MeMOS) based Logic Gates and Adder Circuits
Authors:
Tejinder Singh
Abstract:
Practical memristors came into the picture just a few years back and instantly became a topic of interest for researchers and scientists. The memristor is the fourth basic two-terminal passive circuit element, apart from the well-known resistor, capacitor and inductor. Recently, memristor-based architectures have been proposed by many researchers. In this paper, we design a hybrid Memristor-CMOS (MeMOS) logic based adder circuit that can be used in numerous logic computational architectures. We also analyze the transient response of logic gates designed using MeMOS logic circuits. MeMOS uses a CMOS 180 nm process with memristors to compute boolean logic operations. Various parameters, including speed, area, delay and power dissipation, are computed and compared with a standard CMOS 180 nm logic design. The proposed logic shows better area utilization and excellent results compared with existing CMOS logic circuits at the standard 1.8 V operating voltage.
Submitted 19 June, 2015;
originally announced June 2015.
-
Software-Defined Networking: State of the Art and Research Challenges
Authors:
Manar Jammal,
Taranpreet Singh,
Abdallah Shami,
Rasool Asal,
Yiming Li
Abstract:
Plug-and-play information technology (IT) infrastructure has been expanding very rapidly in recent years. With the advent of cloud computing, many ecosystem and business paradigms are encountering potential changes and may be able to eliminate their IT infrastructure maintenance processes. Real-time performance and high availability requirements have induced telecom networks to adopt the new concepts of the cloud model: software-defined networking (SDN) and network function virtualization (NFV). NFV introduces and deploys new network functions in an open and standardized IT environment, while SDN aims to transform the way networks function. SDN and NFV are complementary technologies; they do not depend on each other. However, both concepts can be merged and have the potential to mitigate the challenges of legacy networks. In this paper, our aim is to describe the benefits of using SDN in a multitude of environments such as in data centers, data center networks, and Network as Service offerings. We also present the various challenges facing SDN, from scalability to reliability and security concerns, and discuss existing solutions to these challenges.
Submitted 31 May, 2014;
originally announced June 2014.
-
Study and Capacity Evaluation of SISO, MISO and MIMO RF Wireless Communication Systems
Authors:
Kritika Sengar,
Nishu Rani,
Ankita Singhal,
Dolly Sharma,
Seema Verma,
Tanya Singh
Abstract:
Wireless communication systems have passed through different generations, from SISO systems to MIMO systems. Bandwidth is one important constraint in wireless communication. High data transmission rates are essential for services like triple play, i.e. data, voice and video. At the user end, capacity determines the quality of the communication system. This paper compares different RF wireless communication systems, namely SISO, MISO, SIMO and MIMO, on the basis of capacity and explains the concept, since wireless communication has evolved from 2G and 3G to 4G and companies are competing to create networks with ever more capacity so that data rates can be increased and customers can benefit more. The ultimate goal of wireless communication systems is to create global personal and multimedia communication without any capacity issues.
Submitted 30 March, 2014;
originally announced March 2014.
-
Towards the Framework of the File Systems Performance Evaluation Techniques and the Taxonomy of Replay Traces
Authors:
Brijender Kahanwal,
Tejinder Pal Singh
Abstract:
This is the era of High Performance Computing (HPC). There is great demand for the best performance evaluation techniques for file and storage systems. The task of evaluation is both necessary and hard. It gives an in-depth analysis of the target system, which becomes the basis of decisions for users. It also helps inventors and developers find the bottlenecks in their systems. In this paper, many performance evaluation techniques for file and storage systems are described, with the main emphasis on an important one, replay traces. We survey the performance evaluation techniques used by researchers and the replay traces, and describe a taxonomy of replay traces. Some of the popular replay traces are Tracefs [1], //Trace [2], Replayfs [3] and VFS Interceptor [12]. Finally, we summarize all the features that must be considered when developing a new tool for replay traces. The work in this paper shows that storage system developers must consider all the techniques that can be used for the performance evaluation of file systems, so that they can develop highly efficient future file and storage systems.
Submitted 6 December, 2013;
originally announced December 2013.
-
Java File Security System (JFSS) Evaluation Using Software Engineering Approaches
Authors:
Brijender Kahanwal,
Tejinder Pal Singh
Abstract:
A Java File Security System (JFSS) [1] has been developed by us. It is an encrypted file system. We developed it because there have been many file data breaches in the past and present, and they are increasing day by day according to reports by DataLossDB (Open Security Foundation), a non-profit organization in the US. The JFSS is evaluated using two software engineering approaches. One is a size metric, Lines of Code (LOC), in software product development. The other is customer oriented, namely the User Satisfaction Testing methodology. Satisfying our customers is essential to staying in business in the modern world of global competition. We must satisfy and even delight our customers with the value of our software products and services to gain their loyalty and repeat business. Customer satisfaction is therefore a primary goal of process improvement programs as well as of quality predictions for our software. It is measured with the help of a User Satisfaction Index that is calculated over many parameters regarding customer satisfaction. Customer Satisfaction Surveys are the best way to find the satisfaction level with our product quality.
Submitted 6 December, 2013;
originally announced December 2013.
-
File System - A Component of Operating System
Authors:
Brijender Kahanwal,
Tejinder Pal Singh,
Ruchira Bhargava,
Girish Pal Singh
Abstract:
The file system provides the mechanism for online storage of and access to file contents, including data and programs. This paper covers the high-level details of file systems, as well as related topics such as the disk cache, the file system interface to the kernel, and the user-level APIs that use the features of the file system. It will give you a thorough understanding of how a file system works in general. The file system is the main component of the operating system. It is used to create, manipulate, store, and retrieve data. At the highest level, a file system is a way to manage information on a secondary storage medium. There are many layers below and above the file system, and all of them are fully described here. This paper provides explanatory knowledge for file system designers and researchers in the area. The complete path from the user process to the secondary storage device is covered. The file system is an area where researchers are doing a lot of work, and there is always a need to do more. Work is ongoing on efficient, secure, energy-saving techniques for file systems. Hardware is becoming faster and cheaper day by day, but software is not built to keep pace with hardware technology. So there is a need for research in this area to bridge the technology gap.
Submitted 6 December, 2013;
originally announced December 2013.
-
Towards the Framework of Information Security
Authors:
Dr. Brijender Kahanwal,
Dr. Tejinder Pal Singh
Abstract:
Today's modern society is extremely dependent on computer-based information systems. Many organizations, for example financing organizations, would simply not be able to function properly without the services provided by these systems. Although interruption might decrease the efficiency of an organization, theft or unintentional disclosure of entrusted private data could have more serious consequences, such as legal action as well as loss of business due to lack of trust from potential users. This dependence on information systems has led to a need for securing these systems, and this in turn has created a need for knowing how secure they are. The introduction of the information society has changed how people interact with government agencies. Government agencies are now encouraged to maintain a 24-hour electronic service for citizens. The introduction of government services on the Internet is meant to facilitate communication with agencies, decrease service times and lessen the amount of paper that needs to be processed. The increased connectivity to the Internet results in a rising demand for information security in these systems. In this article, we discuss the many file data breaches in the past and present, which are increasing day by day according to reports by DataLossDB (Open Security Foundation), a non-profit organization in the US.
Submitted 5 December, 2013;
originally announced December 2013.
-
Function Overloading Implementation in C++
Authors:
Dr. Brijender Kahanwal,
Dr. T. P. Singh
Abstract:
In this article, function overloading in object-oriented programming is elaborated, along with how it is implemented in C++. The language supports a variety of programming styles. Here we describe polymorphism and its types in brief. The main emphasis is on the function overloading implementation styles in the language. The polymorphic nature of languages has advantages: we can add new code without requiring changes to other classes, and interfaces (in the Java language) are easily implemented. Dynamic binding, however, introduces run-time overhead, which increases the execution time of the software. There is no such overhead in the static type of polymorphism, because everything is resolved at compile time. Keywords: Polymorphism; Function Overloading; Static Polymorphism; Overloading; Compile Time Polymorphism.
Submitted 27 November, 2013;
originally announced November 2013.
-
Performance Evaluation of Java File Security System (JFSS)
Authors:
Brijender Kahanwal,
Dr. Tejinder Pal Singh,
Dr. R. K. Tuteja
Abstract:
Security is a critical issue in modern file and storage systems; it is imperative to protect stored data from unauthorized access. We have developed a file security system named Java File Security System (JFSS) [1] that guarantees the security of files on demand for all users. It has been developed on the Java platform. Java has been used as the programming language in order to provide portability, but it imposes some performance limitations. It is developed in FUSE (File System in User space) [3]. Many efforts have been made over the years to develop file systems in user space (FUSE), each with its own merits and demerits. In this paper, we evaluate the performance of JFSS. Increased security often comes at the expense of user convenience, performance, or compatibility with other systems. The performance evaluation of JFSS shows that the encryption overheads are modest relative to the security gained.
Submitted 13 November, 2013;
originally announced November 2013.
-
The Distributed Computing Paradigms: P2P, Grid, Cluster, Cloud, and Jungle
Authors:
Dr. Brijender Kahanwal,
Dr. T. P. Singh
Abstract:
Distributed computing uses many systems to solve a large-scale problem. The growth of high-speed broadband networks in developed and developing countries, the continual increase in computing power, and the rapid growth of the Internet have changed the way in which society manages information and information services. Historically, the state of computing has gone through a series of platform and environmental changes. Distributed computing holds great promise for using computer systems effectively. As a result, supercomputer sites and data centers have changed from providing high-performance floating-point computing capabilities to concurrently servicing huge numbers of requests from billions of users. A distributed computing system uses multiple computers to solve large-scale problems over the Internet, and so becomes data-intensive and network-centric. The applications of distributed computing have become increasingly widespread. In distributed computing, the main emphasis is on large-scale resource sharing and on achieving the best performance. In this article, we review the work done in the area of distributed computing paradigms. The main emphasis is on the evolving area of cloud computing.
Submitted 13 November, 2013;
originally announced November 2013.
-
Analysis of Different Privacy Preserving Cloud Storage Frameworks
Authors:
Rajeev Bedi,
Mohit Marwaha,
Tajinder Singh,
Harwinder Singh,
Amritpal Singh
Abstract:
Privacy and security of data in cloud storage is one of the main issues. Many frameworks and technologies are used to preserve data security in cloud storage. [1] proposes a framework that includes the design of the data organization structure, the generation and management of keys, the handling of changes to users' access rights and of dynamic data operations, and the interaction between participants. It also designs an interactive protocol and an extirpation-based key derivation algorithm, which are combined with lazy revocation, and it uses a multi-tree structure and symmetric encryption to form a privacy-preserving, efficient framework for cloud storage. [2] proposes a privacy-preserving cloud storage framework that defines an interaction protocol among participants, uses a key derivation algorithm to generate and manage keys, uses both symmetric and asymmetric encryption to hide users' sensitive data, and applies a Bloom filter for ciphertext retrieval. A system based on this framework has been realized. This paper analyzes both frameworks in terms of their feasibility, the running overhead of the system, and their privacy security.
Submitted 14 January, 2012;
originally announced May 2012.
-
TSET: Token based Secure Electronic Transaction
Authors:
Rajdeep Borgohain,
Moirangthem Tiken Singh,
Chandrakant Sakharwade,
Sugata Sanyal
Abstract:
Security and trust are the most important factors in online transactions. This paper introduces TSET, a Token based Secure Electronic Transaction, which is an improvement over the existing Secure Electronic Transaction (SET) protocol. We use the concept of tokens in the TSET protocol to provide end-to-end security. The protocol also provides a trust evaluation mechanism so that customers can learn the trustworthiness of merchants before becoming involved in a transaction. Moreover, we propose a grading mechanism so that the quality of service in transactions improves.
Submitted 7 April, 2012; v1 submitted 27 March, 2012;
originally announced March 2012.
-
A New Local Adaptive Thresholding Technique in Binarization
Authors:
T. Romen Singh,
Sudipta Roy,
O. Imocha Singh,
Tejmani Sinam,
Kh. Manglem Singh
Abstract:
Image binarization is the process of separating pixel values into two groups, white as background and black as foreground. Thresholding plays a major role in the binarization of images. Thresholding can be categorized into global thresholding and local thresholding. In images with a uniform contrast distribution of background and foreground, such as document images, global thresholding is more appropriate. In degraded document images, where considerable background noise or variation in contrast and illumination exists, there are many pixels that cannot easily be classified as foreground or background. In such cases, binarization with local thresholding is more appropriate. This paper describes a locally adaptive thresholding technique that removes the background by using the local mean and mean deviation. Normally, the computation time of the local mean depends on the window size. Our technique uses an integral sum image as a preprocessing step to calculate the local mean. It does not involve the calculation of standard deviations, as other local adaptive techniques do. This, along with the fact that the calculation of the mean is independent of the window size, speeds up the process compared to other local thresholding techniques.
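The window-size independence mentioned in this abstract can be sketched as follows. This is a minimal illustration of a generic integral (summed-area) image, not the authors' implementation; the names `Image`, `integral_image` and `local_mean` are hypothetical, and the paper's thresholding rule based on mean deviation is not reproduced here because the abstract does not state it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<double>>;

// Integral image padded with a leading row and column of zeros:
// sat[y][x] holds the sum of img over rows [0, y) and columns [0, x).
Image integral_image(const Image& img) {
    const std::size_t h = img.size();
    const std::size_t w = h ? img[0].size() : 0;
    Image sat(h + 1, std::vector<double>(w + 1, 0.0));
    for (std::size_t y = 0; y < h; ++y)
        for (std::size_t x = 0; x < w; ++x)
            sat[y + 1][x + 1] =
                img[y][x] + sat[y][x + 1] + sat[y + 1][x] - sat[y][x];
    return sat;
}

// Mean of the (2r+1) x (2r+1) window centred at (y, x), clipped at the
// image borders. Only four table lookups are needed, whatever r is.
double local_mean(const Image& sat, std::size_t y, std::size_t x,
                  std::size_t r) {
    const std::size_t h = sat.size() - 1;
    const std::size_t w = sat[0].size() - 1;
    assert(y < h && x < w);
    const std::size_t y0 = y > r ? y - r : 0;
    const std::size_t x0 = x > r ? x - r : 0;
    const std::size_t y1 = std::min(y + r + 1, h);
    const std::size_t x1 = std::min(x + r + 1, w);
    const double sum = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0];
    return sum / static_cast<double>((y1 - y0) * (x1 - x0));
}
```

The integral image is built once in a single pass over the pixels; after that, each local mean costs the same four lookups for any window radius, which is the property the abstract credits for the speed-up.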
Submitted 25 January, 2012;
originally announced January 2012.
-
Comparison of SCIPUFF Plume Prediction with Particle Filter Assimilated Prediction for Dipole Pride 26 Data
Authors:
Gabriel Terejanu,
Yang Cheng,
Tarunraj Singh,
Peter D. Scott
Abstract:
This paper presents the application of a particle filter for data assimilation in the context of puff-based dispersion models. Particle filters provide estimates of the higher moments and are well suited for strongly nonlinear and/or non-Gaussian models. The Gaussian puff model SCIPUFF is used to predict the chemical concentration field after a chemical incident. This model is highly nonlinear and evolves with variable state dimension and, after sufficient time, high dimensionality. While the particle filter formalism naturally supports variable state dimensionality, high dimensionality represents a challenge in selecting an adequate number of particles, especially for the Bootstrap version. We present an implementation of the Bootstrap particle filter and compare its performance with the SCIPUFF predictions. Both the model and the particle filter are evaluated on the Dipole Pride 26 experimental data. Since there is no available ground truth, the data has been divided into two sets: training and testing. We show that even with a modest number of particles, the Bootstrap particle filter provides better estimates of the concentration field than the process model, without an excessive increase in computational complexity.
Submitted 7 July, 2011;
originally announced July 2011.
-
New ID Based Multi-Proxy Multi-Signcryption Scheme from Pairings
Authors:
Sunder Lal,
Tej Singh
Abstract:
This paper presents an identity-based multi-proxy multi-signcryption scheme from pairings. In this scheme, a proxy signcrypter group can be authorized as a proxy agent through the cooperation of all members of the original signcrypter group. The proxy signcryption can then be generated through the cooperation of all the signcrypters in the authorized proxy signcrypter group on behalf of the original signcrypter group. Compared to the scheme of Liu and Xiao, the proposed scheme provides public verifiability of the signature along with simplified key management.
Submitted 8 January, 2007;
originally announced January 2007.