-
Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT Threats
Authors:
Ryan Pavlich,
Nima Ebadi,
Richard Tarbell,
Billy Linares,
Adrian Tan,
Rachael Humphreys,
Jayanta Kumar Das,
Rambod Ghandiparsi,
Hannah Haley,
Jerris George,
Rocky Slavin,
Kim-Kwang Raymond Choo,
Glenn Dietrich,
Anthony Rios
Abstract:
Recognizing the promise of natural language interfaces to databases, prior studies have emphasized the development of text-to-SQL systems. While substantial progress has been made in this field, existing research has concentrated on generating SQL statements from text queries. The broader challenge, however, lies in inferring new information about the returned data. Our research makes two major contributions to address this gap. First, we introduce a novel Internet-of-Things (IoT) text-to-SQL dataset comprising 10,985 text-SQL pairs and 239,398 rows of network traffic activity. The dataset contains additional query types limited in prior text-to-SQL datasets, notably temporal-related queries. Our dataset is sourced from a smart building's IoT ecosystem exploring sensor read and network traffic data. Second, our dataset allows two-stage processing, where the returned data (network traffic) from a generated SQL can be categorized as malicious or not. Our results show that joint training to query and infer information about the data can improve overall text-to-SQL performance, nearly matching substantially larger models. We also show that current large language models (e.g., GPT3.5) struggle to infer new information about returned data, thus our dataset provides a novel test bed for integrating complex domain-specific reasoning into LLMs.
Submitted 25 June, 2024;
originally announced June 2024.
-
Investigating the Utility of ChatGPT in the Issue Tracking System: An Exploratory Study
Authors:
Joy Krishan Das,
Saikat Mondal,
Chanchal K. Roy
Abstract:
Issue tracking systems serve as the primary tool for incorporating external users and customizing a software project to meet the users' requirements. However, the limited number of contributors and the challenge of identifying the best approach for each issue often impede effective resolution. Recently, an increasing number of developers are turning to AI tools like ChatGPT to enhance problem-solving efficiency. While previous studies have demonstrated the potential of ChatGPT in areas such as automatic program repair, debugging, and code generation, there is a lack of study on how developers explicitly utilize ChatGPT to resolve issues in their tracking system. Hence, this study aims to examine the interaction between ChatGPT and developers to analyze their prevalent activities and provide a resolution. In addition, we assess the code reliability by confirming if the code produced by ChatGPT was integrated into the project's codebase using the clone detection tool NiCad. Our investigation reveals that developers mainly use ChatGPT for brainstorming solutions but often opt to write their code instead of using ChatGPT-generated code, possibly due to concerns over the generation of "hallucinated code", as highlighted in the literature.
Submitted 6 February, 2024;
originally announced February 2024.
-
Analyzing Host-Viral Interactome of SARS-CoV-2 for Identifying Vulnerable Host Proteins during COVID-19 Pathogenesis
Authors:
Jayanta Kumar Das,
Swarup Roy,
Pietro Hiram Guzzi
Abstract:
The development of therapeutic targets for COVID-19 treatment is based on the understanding of the molecular mechanism of pathogenesis. Identifying the genes and proteins involved in the infection mechanism is key to shedding light on these complex molecular mechanisms. The combined effort of many laboratories distributed throughout the world has produced an accumulation of both protein and genetic interactions. In this work, we integrate these available results and obtain a host protein-protein interaction network composed of 1432 human proteins. We calculate network centrality measures to identify key proteins. Then we perform functional enrichment of central proteins. We observe that the identified proteins are mostly associated with several crucial pathways, including cellular processes, signal transduction, and neurodegenerative disease. Finally, we focus on proteins involved in causing disease in the human respiratory tract. We conclude that COVID-19 is a complex disease, and we highlight many potential therapeutic targets, including RBX1, HSPA5, ITCH, RAB7A, RAB5A, RAB8A, PSMC5, CAPZB, CANX, IGF2R, and HSPA1A, which are central and also associated with multiple diseases.
Submitted 5 February, 2021;
originally announced February 2021.
-
Relationship of Two Discrete Dynamical Models: One-dimensional Cellular Automata and Integral Value Transformations
Authors:
Sreeya Ghosh,
Sudhakar Sahoo,
Sk. Sarif Hassan,
Jayanta Kumar Das,
Pabitra Pal Choudhury
Abstract:
Cellular Automata (CA) and Integral Value Transformations (IVTs) are two well-established mathematical models which evolve in discrete time steps. Theoretically, studies on CA suggest that CA are capable of producing a great variety of evolution patterns. However, computation of non-linear CA or higher-dimensional CA may be complex, whereas IVTs can be manipulated easily. The main purpose of this paper is to study the link between a transition function of a one-dimensional CA and IVTs. Mathematically, we have also established the algebraic structures of a set of transition functions of a one-dimensional CA as well as that of a set of IVTs using binary operations. DNA sequence evolution has also been modelled using IVTs.
Submitted 30 June, 2020; v1 submitted 24 June, 2020;
originally announced June 2020.
-
Two Dimensional Discrete Dynamics of Integral Value Transformations
Authors:
Jayanta Kumar Das,
Sudhakar Sahoo,
Sk. Sarif Hassan,
Pabitra Pal Choudhury
Abstract:
A notion of a dimension-preserving map, \textit{Integral Value Transformations} (IVTs), is defined over $\mathbb{N}^k$ using the set of $p$-adic functions. Thereafter, two-dimensional IVTs are systematically analyzed over $\mathbb{N} \times \mathbb{N}$ using a pair of two-variable Boolean functions. The dynamics of IVTs over $\mathbb{N} \times \mathbb{N}=\mathbb{N}^2$ are studied from an algebraic perspective. It is seen that the dynamics of the IVTs depend solely on the dynamics (state transition diagram) of the pair of two-variable Boolean functions. A set of sixteen \textit{Collatz-like} IVTs is identified in two dimensions. Also, dynamical systems of IVTs having attractors with one, two, three and four cycles are studied. Additionally, some quantitative information about IVTs in different bases and dimensions is discussed.
Submitted 28 January, 2020; v1 submitted 25 August, 2017;
originally announced September 2017.
-
Implementation of the open source virtualization technologies in cloud computing
Authors:
Mohammad Mamun Or Rashid,
M. Masud Rana,
Jugal Krishna Das
Abstract:
Virtualization and cloud computing are recent buzzwords in the digital world. Cloud computing provides IT as a service to users on an on-demand basis. This service offers greater flexibility, availability, reliability and scalability with the utility computing model. This new concept of computing has immense potential for use in the field of e-governance and in overall IT development in developing countries like Bangladesh.
Submitted 11 May, 2016;
originally announced May 2016.
-
Multi-Number CVT-XOR Arithmetic Operations in any Base System and its Significant Properties
Authors:
Jayanta Kumar Das,
Pabitra Pal Choudhury,
Sudhakar Sahoo
Abstract:
Carry Value Transformation (CVT) is a model of a discrete dynamical system and a special case of Integral Value Transformations (IVTs). Earlier, in [5], it was proved that the sum of two non-negative integers is equal to the sum of their CVT and XOR values in any base system. In the present study, this phenomenon is extended to perform CVT and XOR operations for many non-negative integers in any base system. To achieve that, the definitions of both CVT and XOR are modified to operate over a set of multiple integers instead of two. Some important properties of these operations are also studied. With the help of cellular automata, the adder circuit designed in [14] using the CVT-XOR recurrence formula is used to design a parallel adder circuit for multiple numbers in the binary number system.
Submitted 30 November, 2015;
originally announced January 2016.
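The sum identity stated in the abstract above can be illustrated with a minimal Python sketch. It assumes that XOR means digit-wise addition modulo the base and that CVT collects the per-column carries shifted one position to the left; the paper's exact multi-number definitions may differ, and the function names `to_digits` and `cvt_xor` are hypothetical.

```python
def to_digits(n, base):
    """Return the little-endian digits of a non-negative integer."""
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(r)
    return digits or [0]


def cvt_xor(numbers, base=2):
    """Return (cvt_value, xor_value) for a list of non-negative integers.

    Assumed definitions (not taken verbatim from the paper):
      XOR = per-column digit sum modulo the base,
      CVT = per-column carry, moved one position to the left.
    """
    digit_lists = [to_digits(n, base) for n in numbers]
    width = max(len(d) for d in digit_lists)
    column_sums = [0] * width
    for digits in digit_lists:
        for i, d in enumerate(digits):
            column_sums[i] += d
    xor_value = sum((s % base) * base**i for i, s in enumerate(column_sums))
    cvt_value = sum((s // base) * base**(i + 1) for i, s in enumerate(column_sums))
    return cvt_value, xor_value


numbers = [13, 7, 9]
cvt_value, xor_value = cvt_xor(numbers, base=2)
print(cvt_value, xor_value, cvt_value + xor_value == sum(numbers))
```

For two numbers in base 2, this sketch reduces to the familiar bitwise form, where CVT is `(a & b) << 1` and XOR is `a ^ b`.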
-
Understanding of Genetic Code Degeneracy and New Way of Classifying of Protein Family: A Mathematical Approach
Authors:
Jayanta Kumar Das,
Atrayee Majumder,
Pabitra Pal Choudhury
Abstract:
The genetic code is the set of rules by which information encoded in genetic material (DNA or RNA sequences) is translated into proteins (amino acid sequences) by living cells. The code defines a mapping between tri-nucleotide sequences, called codons, and amino acids. Since there are 20 amino acids and 64 possible tri-nucleotide sequences, more than one of these 64 triplets can code for a single amino acid, which gives rise to the problem of degeneracy. This manuscript explains the underlying logic of the degeneracy of the genetic code from a mathematical point of view using a parameter named Impression. Classification of protein families is also a long-standing problem in the fields of biochemistry and genomics. Proteins belonging to a particular class have some similar biochemical properties which are of utmost importance for new drug design. Using the same parameter Impression together with graph-theoretic properties, we have also devised a new way of classifying a protein family.
Submitted 30 November, 2015;
originally announced December 2015.
-
Carry Value Transformation (CVT) - Exclusive OR (XOR) Tree and Its Significant Properties
Authors:
Jayanta Kumar Das,
Pabitra Pal Choudhury,
Sudhakar Sahoo
Abstract:
CVT and XOR are two binary operations that are used together to calculate the sum of two non-negative integers through a recursive mechanism. In the present study, the convergence behavior of this recursive mechanism is captured through a tree-like structure named the CVT-XOR Tree. We have analyzed how to identify the parent nodes, leaf nodes and internal nodes in the CVT-XOR Tree. We also provide the parent information, depth information and the number of children of a node in different CVT-XOR Trees by defining three different matrices. Lastly, one observation is made regarding the very old mathematical problem of the Goldbach Conjecture.
Submitted 4 June, 2015;
originally announced June 2015.
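The recursive mechanism mentioned in the abstract above can be sketched in Python for the binary case. This is a hedged illustration, assuming the common binary definition of CVT as the bitwise AND shifted left by one bit; it traces a single convergence chain and does not reproduce the paper's tree construction or its matrices. The names `cvt` and `cvt_xor_chain` are hypothetical.

```python
def cvt(a, b):
    """Binary carry value transformation (assumed definition): carries shifted left."""
    return (a & b) << 1


def cvt_xor_chain(a, b):
    """Apply (a, b) -> (CVT(a, b), XOR(a, b)) until the CVT component is zero.

    Each pair in the returned chain has the same sum as the starting pair;
    the final pair is (0, a + b).
    """
    chain = [(a, b)]
    while a != 0:
        a, b = cvt(a, b), a ^ b
        chain.append((a, b))
    return chain


for pair in cvt_xor_chain(5, 3):
    print(pair)
```

Every step preserves the sum of the pair, so the XOR component of the last pair is the sum of the two starting integers.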
-
Analysis of Boolean Functions based on Interaction Graphs and their influence in System Biology
Authors:
Jayanta Kumar Das,
Ranjeet Kumar Rout,
Pabitra Pal Choudhury
Abstract:
Interaction graphs provide an important qualitative modeling approach for System Biology. This paper presents a novel approach for constructing interaction graphs with the help of Boolean function decomposition. Each decomposition part (consisting of 2 bits) of a Boolean function has its own significance. In the dynamics of a biological system, each variable or node is simply a gene or protein. Their regulation has been explored in terms of interaction graphs which are generated by Boolean functions. In this paper, different classes of Boolean functions whose interaction graphs have biologically significant properties are outlined.
Submitted 24 September, 2014;
originally announced September 2014.
-
On Analysis And Generation Of Biologically Important Boolean Functions
Authors:
Camellia Ray,
Jayanta Kumar Das,
Pabitra Pal Choudhury
Abstract:
Boolean networks are used to model biological networks such as gene regulatory networks. Often Boolean networks show very chaotic behavior which is sensitive to any small perturbations. In order to reduce the chaotic behavior and to attain stability in the gene regulatory network, nested canalizing functions (NCFs) are best suited. NCFs and their variants have a wide range of applications in systems biology. Previously, much work was done on the application of canalizing functions, but there were fewer methods to check whether an arbitrary Boolean function is canalizing or not. In this paper, this problem has been solved using the Karnaugh Map, and it has also been shown that when the canalizing functions of n variables are given, all the canalizing functions of n+1 variables can be generated by the method of concatenation. In this paper we have uniquely identified the number of NCFs having a particular Hamming distance (H.D.) generated by each variable x as the starting canalizing input. Partially nested canalizing functions of 4 variables have also been studied in this paper. Keywords: Karnaugh Map, Canalizing function, Nested canalizing function, Partially nested canalizing function, Concatenation
Submitted 12 September, 2014;
originally announced September 2014.
-
On Analysis and Generation of some Biologically Important Boolean Functions
Authors:
Camellia Ray,
Jayanta Kumar Das,
Pabitra Pal Choudhury
Abstract:
Boolean networks are used to model biological networks such as gene regulatory networks. Often Boolean networks show very chaotic behaviour which is sensitive to any small perturbations. In order to reduce the chaotic behaviour and to attain stability in the gene regulatory network, Nested Canalizing Functions (NCFs) are best suited. NCFs and their variants have a wide range of applications in systems biology. Previously, much work was done on the application of canalizing functions, but there were fewer methods to check whether an arbitrary Boolean function is canalizing or not. In this paper, this problem is solved using the Karnaugh Map, and it is also shown that when the canalizing functions of n variables are given, all the canalizing functions of n+1 variables can be generated by the method of concatenation. In this paper we have uniquely identified the number of NCFs having a particular Hamming Distance (H.D.) generated by each variable x as the starting canalizing input. Partially nested canalizing functions of 4 variables have also been studied in this paper.
Submitted 12 September, 2014; v1 submitted 9 May, 2014;
originally announced May 2014.
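The canalizing property discussed in the abstract above can be checked on small functions with a short Python sketch. It uses brute-force truth-table enumeration, not the Karnaugh Map method the abstract describes, and assumes the standard definition: a function is canalizing if fixing some input variable to some value forces the output regardless of the other inputs. The names `canalizing_inputs` and `is_canalizing` are hypothetical.

```python
from itertools import product


def canalizing_inputs(f, n):
    """Return (variable index, canalizing input value, forced output) triples.

    f takes n arguments in {0, 1}. Under this simple definition, constant
    functions are reported as canalizing in every variable; some authors
    exclude them.
    """
    found = []
    for i in range(n):
        for a in (0, 1):
            # Collect the outputs over all inputs whose i-th variable equals a.
            outputs = {f(*x) for x in product((0, 1), repeat=n) if x[i] == a}
            if len(outputs) == 1:
                found.append((i, a, outputs.pop()))
    return found


def is_canalizing(f, n):
    """True if at least one variable has a canalizing input value."""
    return bool(canalizing_inputs(f, n))


print(canalizing_inputs(lambda x, y: x & y, 2))
print(is_canalizing(lambda x, y: x ^ y, 2))
```

The enumeration visits all 2^n input rows for each variable and value, so it is only practical for small n, such as the 4-variable functions mentioned in the abstract.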