-
Fair and Efficient Completion of Indivisible Goods
Authors:
Vishwa Prakash H. V.,
Ayumi Igarashi,
Rohit Vaish
Abstract:
We formulate the problem of fair and efficient completion of indivisible goods, defined as follows: Given a partial allocation of indivisible goods among agents, does there exist an allocation of the remaining goods (i.e., a completion) that satisfies fairness and economic efficiency guarantees of interest? We study the computational complexity of the completion problem for prominent fairness and…
▽ More
We formulate the problem of fair and efficient completion of indivisible goods, defined as follows: Given a partial allocation of indivisible goods among agents, does there exist an allocation of the remaining goods (i.e., a completion) that satisfies fairness and economic efficiency guarantees of interest? We study the computational complexity of the completion problem for prominent fairness and efficiency notions such as envy-freeness up one good (EF1), proportionality up to one good (Prop1), maximin share (MMS), and Pareto optimality (PO), and focus on the class of additive valuations as well as its subclasses such as binary additive and lexicographic valuations. We find that while the completion problem is significantly harder than the standard fair division problem (wherein the initial partial allocation is empty), the consideration of restricted preferences facilitates positive algorithmic results for threshold-based fairness notions (Prop1 and MMS). On the other hand, the completion problem remains computationally intractable for envy-based notions such as EF1 and EF1+PO even under restricted preferences.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Image Processing Based Forest Fire Detection
Authors:
Vipin V
Abstract:
A novel approach for forest fire detection using image processing technique is proposed. A rule-based color model for fire pixel classification is used. The proposed algorithm uses RGB and YCbCr color space. The advantage of using YCbCr color space is that it can separate the luminance from the chrominance more effectively than RGB color space. The performance of the proposed algorithm is tested o…
▽ More
A novel approach for forest fire detection using image processing technique is proposed. A rule-based color model for fire pixel classification is used. The proposed algorithm uses RGB and YCbCr color space. The advantage of using YCbCr color space is that it can separate the luminance from the chrominance more effectively than RGB color space. The performance of the proposed algorithm is tested on two sets of images, one of which contains fire; the other contains fire-like regions. Standard methods are used for calculating the performance of the algorithm. The proposed method has both higher detection rate and lower false alarm rate. Since the algorithm is cheap in computation, it can be used for real-time forest fire detection.
△ Less
Submitted 7 June, 2024;
originally announced June 2024.
-
QMedShield: A Novel Quantum Chaos-based Image Encryption Scheme for Secure Medical Image Storage in the Cloud
Authors:
Arun Amaithi Rajan,
Vetriselvi V
Abstract:
In the age of digital technology, medical images play a crucial role in the healthcare industry which aids surgeons in making precise decisions and reducing the diagnosis time. However, the storage of large amounts of these images in third-party cloud services raises privacy and security concerns. There are a lot of classical security mechanisms to protect them. Although, the advent of quantum com…
▽ More
In the age of digital technology, medical images play a crucial role in the healthcare industry which aids surgeons in making precise decisions and reducing the diagnosis time. However, the storage of large amounts of these images in third-party cloud services raises privacy and security concerns. There are a lot of classical security mechanisms to protect them. Although, the advent of quantum computing entails the development of quantum-based encryption models for healthcare. Hence, we introduce a novel quantum chaos-based encryption scheme for medical images in this article. The model comprises bit-plane scrambling, quantum logistic map, quantum operations in the diffusion phase and hybrid chaotic map, DNA encoding, and computations in the confusion phase to transform the plain medical image into a cipher medical image. The proposed scheme has been evaluated using multiple statistical measures and validated against more attacks such as differential attacks with three different medical datasets. Hence the introduced encryption model has proved to be attack-resistant and robust than other existing image encryption schemes, ensuring the secure storage of medical images in cloud environments.
△ Less
Submitted 15 May, 2024;
originally announced May 2024.
-
A Deep Look Into -- Automated Lung X-Ray Abnormality Detection System
Authors:
Nagullas KS,
Vivekanand. V,
Narayana Darapaneni,
Anwesh R P
Abstract:
Introduction: Automated Lung X-Ray Abnormality Detection System is the application which distinguish the normal x-ray images from infected x-ray images and highlight area considered for prediction, with the recent pandemic a need to have a non-conventional method and faster detecting diseases, for which X ray serves the purpose. Obectives: As of current situation any viral disease that is infectio…
▽ More
Introduction: Automated Lung X-Ray Abnormality Detection System is the application which distinguish the normal x-ray images from infected x-ray images and highlight area considered for prediction, with the recent pandemic a need to have a non-conventional method and faster detecting diseases, for which X ray serves the purpose. Obectives: As of current situation any viral disease that is infectious is potential pandemic, so there is need for cheap and early detection system. Methods: This research will help to eases the work of expert to do further analysis. Accuracy of three different preexisting models such as DenseNet, MobileNet and VGG16 were high but models over-fitted primarily due to black and white images. Results: This led to building up new method such as as V-BreathNet which gave more than 96% percent accuracy. Conclusion: Thus, it can be stated that not all state-of art CNN models can be used on B/W images. In conclusion not all state-of-art CNN models can be used on B/W images.
△ Less
Submitted 6 April, 2024;
originally announced April 2024.
-
The Surprising Effectiveness of Rankers Trained on Expanded Queries
Authors:
Abhijit Anand,
Venktesh V,
Vinay Setty,
Avishek Anand
Abstract:
An important problem in text-ranking systems is handling the hard queries that form the tail end of the query distribution. The difficulty may arise due to the presence of uncommon, underspecified, or incomplete queries. In this work, we improve the ranking performance of hard or difficult queries without compromising the performance of other queries. Firstly, we do LLM based query enrichment for…
▽ More
An important problem in text-ranking systems is handling the hard queries that form the tail end of the query distribution. The difficulty may arise due to the presence of uncommon, underspecified, or incomplete queries. In this work, we improve the ranking performance of hard or difficult queries without compromising the performance of other queries. Firstly, we do LLM based query enrichment for training queries using relevant documents. Next, a specialized ranker is fine-tuned only on the enriched hard queries instead of the original queries. We combine the relevance scores from the specialized ranker and the base ranker, along with a query performance score estimated for each query. Our approach departs from existing methods that usually employ a single ranker for all queries, which is biased towards easy queries, which form the majority of the query distribution. In our extensive experiments on the DL-Hard dataset, we find that a principled query performance based scoring method using base and specialized ranker offers a significant improvement of up to 25% on the passage ranking task and up to 48.4% on the document ranking task when compared to the baseline performance of using original queries, even outperforming SOTA model.
△ Less
Submitted 12 June, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
QuanTemp: A real-world open-domain benchmark for fact-checking numerical claims
Authors:
Venktesh V,
Abhijit Anand,
Avishek Anand,
Vinay Setty
Abstract:
Automated fact checking has gained immense interest to tackle the growing misinformation in the digital era. Existing systems primarily focus on synthetic claims on Wikipedia, and noteworthy progress has also been made on real-world claims. In this work, we release QuanTemp, a diverse, multi-domain dataset focused exclusively on numerical claims, encompassing temporal, statistical and diverse aspe…
▽ More
Automated fact checking has gained immense interest to tackle the growing misinformation in the digital era. Existing systems primarily focus on synthetic claims on Wikipedia, and noteworthy progress has also been made on real-world claims. In this work, we release QuanTemp, a diverse, multi-domain dataset focused exclusively on numerical claims, encompassing temporal, statistical and diverse aspects with fine-grained metadata and an evidence collection without leakage. This addresses the challenge of verifying real-world numerical claims, which are complex and often lack precise information, not addressed by existing works that mainly focus on synthetic claims. We evaluate and quantify the limitations of existing solutions for the task of verifying numerical claims. We also evaluate claim decomposition based methods, numerical understanding based models and our best baselines achieves a macro-F1 of 58.32. This demonstrates that QuanTemp serves as a challenging evaluation set for numerical claim verification.
△ Less
Submitted 1 May, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Weighted Proportional Allocations of Indivisible Goods and Chores: Insights via Matchings
Authors:
Vishwa Prakash H. V.,
Prajakta Nimbhorkar
Abstract:
We study the fair allocation of indivisible goods and chores under ordinal valuations for agents with unequal entitlements. We show the existence and polynomial time computation of weighted necessarily proportional up to one item (WSD-PROP1) allocations for both goods and chores, by reducing it to a problem of finding perfect matchings in a bipartite graph. We give a complete characterization of t…
▽ More
We study the fair allocation of indivisible goods and chores under ordinal valuations for agents with unequal entitlements. We show the existence and polynomial time computation of weighted necessarily proportional up to one item (WSD-PROP1) allocations for both goods and chores, by reducing it to a problem of finding perfect matchings in a bipartite graph. We give a complete characterization of these allocations as corner points of a perfect matching polytope. Using this polytope, we can optimize over all allocations to find a min-cost WSD-PROP1 allocation of goods or most efficient WSD-PROP1 allocation of chores. Additionally, we show the existence and computation of sequencible (SEQ) WSD-PROP1 allocations by using rank-maximal perfect matching algorithms and show incompatibility of Pareto optimality under all valuations and WSD-PROP1.
We also consider the Best-of-Both-Worlds (BoBW) fairness notion. By using our characterization, we show the existence and polynomial time computation of Ex-ante envy free (WSD-EF) and Ex-post WSD-PROP1 allocations under ordinal valuations for both chores and goods.
△ Less
Submitted 31 December, 2023; v1 submitted 24 December, 2023;
originally announced December 2023.
-
Minimizing Factual Inconsistency and Hallucination in Large Language Models
Authors:
Muneeswaran I,
Shreya Saxena,
Siva Prasad,
M V Sai Prakash,
Advaith Shankar,
Varun V,
Vishal Vaddina,
Saisubramaniam Gopalakrishnan
Abstract:
Large Language Models (LLMs) are widely used in critical fields such as healthcare, education, and finance due to their remarkable proficiency in various language-related tasks. However, LLMs are prone to generating factually incorrect responses or "hallucinations," which can lead to a loss of credibility and trust among users. To address this issue, we propose a multi-stage framework that generat…
▽ More
Large Language Models (LLMs) are widely used in critical fields such as healthcare, education, and finance due to their remarkable proficiency in various language-related tasks. However, LLMs are prone to generating factually incorrect responses or "hallucinations," which can lead to a loss of credibility and trust among users. To address this issue, we propose a multi-stage framework that generates the rationale first, verifies and refines incorrect ones, and uses them as supporting references to generate the answer. The generated rationale enhances the transparency of the answer and our framework provides insights into how the model arrived at this answer, by using this rationale and the references to the context. In this paper, we demonstrate its effectiveness in improving the quality of responses to drug-related inquiries in the life sciences industry. Our framework improves traditional Retrieval Augmented Generation (RAG) by enabling OpenAI GPT-3.5-turbo to be 14-25% more faithful and 16-22% more accurate on two datasets. Furthermore, fine-tuning samples based on our framework improves the accuracy of smaller open-access LLMs by 33-42% and competes with RAG on commercial models.
△ Less
Submitted 23 November, 2023;
originally announced November 2023.
-
In-Context Ability Transfer for Question Decomposition in Complex QA
Authors:
Venktesh V,
Sourangshu Bhattacharya,
Avishek Anand
Abstract:
Answering complex questions is a challenging task that requires question decomposition and multistep reasoning for arriving at the solution. While existing supervised and unsupervised approaches are specialized to a certain task and involve training, recently proposed prompt-based approaches offer generalizable solutions to tackle a wide variety of complex question-answering (QA) tasks. However, e…
▽ More
Answering complex questions is a challenging task that requires question decomposition and multistep reasoning for arriving at the solution. While existing supervised and unsupervised approaches are specialized to a certain task and involve training, recently proposed prompt-based approaches offer generalizable solutions to tackle a wide variety of complex question-answering (QA) tasks. However, existing prompt-based approaches that are effective for complex QA tasks involve expensive hand annotations from experts in the form of rationales and are not generalizable to newer complex QA scenarios and tasks. We propose, icat (In-Context Ability Transfer) which induces reasoning capabilities in LLMs without any LLM fine-tuning or manual annotation of in-context samples. We transfer the ability to decompose complex questions to simpler questions or generate step-by-step rationales to LLMs, by careful selection from available data sources of related tasks. We also propose an automated uncertainty-aware exemplar selection approach for selecting examples from transfer data sources. Finally, we conduct large-scale experiments on a variety of complex QA tasks involving numerical reasoning, compositional complex QA, and heterogeneous complex QA which require decomposed reasoning. We show that ICAT convincingly outperforms existing prompt-based solutions without involving any model training, showcasing the benefits of re-using existing abilities.
△ Less
Submitted 26 October, 2023;
originally announced October 2023.
-
Synergistic Fusion of Graph and Transformer Features for Enhanced Molecular Property Prediction
Authors:
M V Sai Prakash,
Siddartha Reddy N,
Ganesh Parab,
Varun V,
Vishal Vaddina,
Saisubramaniam Gopalakrishnan
Abstract:
Molecular property prediction is a critical task in computational drug discovery. While recent advances in Graph Neural Networks (GNNs) and Transformers have shown to be effective and promising, they face the following limitations: Transformer self-attention does not explicitly consider the underlying molecule structure while GNN feature representation alone is not sufficient to capture granular a…
▽ More
Molecular property prediction is a critical task in computational drug discovery. While recent advances in Graph Neural Networks (GNNs) and Transformers have shown to be effective and promising, they face the following limitations: Transformer self-attention does not explicitly consider the underlying molecule structure while GNN feature representation alone is not sufficient to capture granular and hidden interactions and characteristics that distinguish similar molecules. To address these limitations, we propose SYN- FUSION, a novel approach that synergistically combines pre-trained features from GNNs and Transformers. This approach provides a comprehensive molecular representation, capturing both the global molecule structure and the individual atom characteristics. Experimental results on MoleculeNet benchmarks demonstrate superior performance, surpassing previous models in 5 out of 7 classification datasets and 4 out of 6 regression datasets. The performance of SYN-FUSION has been compared with other Graph-Transformer models that have been jointly trained using a combination of transformer and graph features, and it is found that our approach is on par with those models in terms of performance. Extensive analysis of the learned fusion model across aspects such as loss, latent space, and weight distribution further validates the effectiveness of SYN-FUSION. Finally, an ablation study unequivocally demonstrates that the synergy achieved by SYN-FUSION surpasses the performance of its individual model components and their ensemble, offering a substantial improvement in predicting molecular properties.
△ Less
Submitted 25 August, 2023;
originally announced October 2023.
-
Context Aware Query Rewriting for Text Rankers using LLM
Authors:
Abhijit Anand,
Venktesh V,
Vinay Setty,
Avishek Anand
Abstract:
Query rewriting refers to an established family of approaches that are applied to underspecified and ambiguous queries to overcome the vocabulary mismatch problem in document ranking. Queries are typically rewritten during query processing time for better query modelling for the downstream ranker. With the advent of large-language models (LLMs), there have been initial investigations into using ge…
▽ More
Query rewriting refers to an established family of approaches that are applied to underspecified and ambiguous queries to overcome the vocabulary mismatch problem in document ranking. Queries are typically rewritten during query processing time for better query modelling for the downstream ranker. With the advent of large-language models (LLMs), there have been initial investigations into using generative approaches to generate pseudo documents to tackle this inherent vocabulary gap. In this work, we analyze the utility of LLMs for improved query rewriting for text ranking tasks. We find that there are two inherent limitations of using LLMs as query re-writers -- concept drift when using only queries as prompts and large inference costs during query processing. We adopt a simple, yet surprisingly effective, approach called context aware query rewriting (CAR) to leverage the benefits of LLMs for query understanding. Firstly, we rewrite ambiguous training queries by context-aware prompting of LLMs, where we use only relevant documents as context.Unlike existing approaches, we use LLM-based query rewriting only during the training phase. Eventually, a ranker is fine-tuned on the rewritten queries instead of the original queries during training. In our extensive experiments, we find that fine-tuning a ranker using re-written queries offers a significant improvement of up to 33% on the passage ranking task and up to 28% on the document ranking task when compared to the baseline performance of using original queries.
△ Less
Submitted 31 August, 2023;
originally announced August 2023.
-
MWPRanker: An Expression Similarity Based Math Word Problem Retriever
Authors:
Mayank Goel,
Venktesh V,
Vikram Goyal
Abstract:
Math Word Problems (MWPs) in online assessments help test the ability of the learner to make critical inferences by interpreting the linguistic information in them. To test the mathematical reasoning capabilities of the learners, sometimes the problem is rephrased or the thematic setting of the original MWP is changed. Since manual identification of MWPs with similar problem models is cumbersome,…
▽ More
Math Word Problems (MWPs) in online assessments help test the ability of the learner to make critical inferences by interpreting the linguistic information in them. To test the mathematical reasoning capabilities of the learners, sometimes the problem is rephrased or the thematic setting of the original MWP is changed. Since manual identification of MWPs with similar problem models is cumbersome, we propose a tool in this work for MWP retrieval. We propose a hybrid approach to retrieve similar MWPs with the same problem model. In our work, the problem model refers to the sequence of operations to be performed to arrive at the solution. We demonstrate that our tool is useful for the mentioned tasks and better than semantic similarity-based approaches, which fail to capture the arithmetic and logical sequence of the MWPs. A demo of the tool can be found at https://www.youtube.com/watch?v=gSQWP3chFIs
△ Less
Submitted 3 July, 2023;
originally announced July 2023.
-
Query Understanding in the Age of Large Language Models
Authors:
Avishek Anand,
Venktesh V,
Abhijit Anand,
Vinay Setty
Abstract:
Querying, conversing, and controlling search and information-seeking interfaces using natural language are fast becoming ubiquitous with the rise and adoption of large-language models (LLM). In this position paper, we describe a generic framework for interactive query-rewriting using LLMs. Our proposal aims to unfold new opportunities for improved and transparent intent understanding while buildin…
▽ More
Querying, conversing, and controlling search and information-seeking interfaces using natural language are fast becoming ubiquitous with the rise and adoption of large-language models (LLM). In this position paper, we describe a generic framework for interactive query-rewriting using LLMs. Our proposal aims to unfold new opportunities for improved and transparent intent understanding while building high-performance retrieval systems using LLMs. A key aspect of our framework is the ability of the rewriter to fully specify the machine intent by the search engine in natural language that can be further refined, controlled, and edited before the final retrieval phase. The ability to present, interact, and reason over the underlying machine intent in natural language has profound implications on transparency, ranking performance, and a departure from the traditional way in which supervised signals were collected for understanding intents. We detail the concept, backed by initial experiments, along with open questions for this interactive query understanding framework.
△ Less
Submitted 28 June, 2023;
originally announced June 2023.
-
Enhancing Programming eTextbooks with ChatGPT Generated Counterfactual-Thinking-Inspired Questions
Authors:
Arun Balajiee Lekshmi Narayanan,
Rully Agus Hendrawan,
Venktesh V
Abstract:
Digital textbooks have become an integral part of everyday learning tasks. In this work, we consider the use of digital textbooks for programming classes. Generally, students struggle with utilizing textbooks on programming to the maximum, with a possible reason being that the example programs provided as illustration of concepts in these textbooks don't offer sufficient interactivity for students…
▽ More
Digital textbooks have become an integral part of everyday learning tasks. In this work, we consider the use of digital textbooks for programming classes. Generally, students struggle with utilizing textbooks on programming to the maximum, with a possible reason being that the example programs provided as illustration of concepts in these textbooks don't offer sufficient interactivity for students, and thereby not sufficiently motivating to explore or understand these programming examples better. In our work, we explore the idea of enhancing the navigability of intelligent textbooks with the use of ``counterfactual'' questions, to make students think critically about these programs and enhance possible program comprehension. Inspired from previous works on nudging students on counter factual thinking, we present the possibility to enhance digital textbooks with questions generated using GPT.
△ Less
Submitted 6 June, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Towards Learning and Explaining Indirect Causal Effects in Neural Networks
Authors:
Abbavaram Gowtham Reddy,
Saketh Bachu,
Harsharaj Pathak,
Benin L Godfrey,
Vineeth N. Balasubramanian,
Varshaneya V,
Satya Narayanan Kar
Abstract:
Recently, there has been a growing interest in learning and explaining causal effects within Neural Network (NN) models. By virtue of NN architectures, previous approaches consider only direct and total causal effects assuming independence among input variables. We view an NN as a structural causal model (SCM) and extend our focus to include indirect causal effects by introducing feedforward conne…
▽ More
Recently, there has been a growing interest in learning and explaining causal effects within Neural Network (NN) models. By virtue of NN architectures, previous approaches consider only direct and total causal effects assuming independence among input variables. We view an NN as a structural causal model (SCM) and extend our focus to include indirect causal effects by introducing feedforward connections among input neurons. We propose an ante-hoc method that captures and maintains direct, indirect, and total causal effects during NN model training. We also propose an algorithm for quantifying learned causal effects in an NN model and efficient approximation strategies for quantifying causal effects in high-dimensional data. Extensive experiments conducted on synthetic and real-world datasets demonstrate that the causal effects learned by our ante-hoc method better approximate the ground truth effects compared to existing methods.
△ Less
Submitted 8 January, 2024; v1 submitted 24 March, 2023;
originally announced March 2023.
-
Coherence and Diversity through Noise: Self-Supervised Paraphrase Generation via Structure-Aware Denoising
Authors:
Rishabh Gupta,
Venktesh V.,
Mukesh Mohania,
Vikram Goyal
Abstract:
In this paper, we propose SCANING, an unsupervised framework for paraphrasing via controlled noise injection. We focus on the novel task of paraphrasing algebraic word problems having practical applications in online pedagogy as a means to reduce plagiarism as well as ensure understanding on the part of the student instead of rote memorization. This task is more complex than paraphrasing general-d…
▽ More
In this paper, we propose SCANING, an unsupervised framework for paraphrasing via controlled noise injection. We focus on the novel task of paraphrasing algebraic word problems having practical applications in online pedagogy as a means to reduce plagiarism as well as ensure understanding on the part of the student instead of rote memorization. This task is more complex than paraphrasing general-domain corpora due to the difficulty in preserving critical information for solution consistency of the paraphrased word problem, managing the increased length of the text and ensuring diversity in the generated paraphrase. Existing approaches fail to demonstrate adequate performance on at least one, if not all, of these facets, necessitating the need for a more comprehensive solution. To this end, we model the noising search space as a composition of contextual and syntactic aspects and sample noising functions consisting of either one or both aspects. This allows for learning a denoising function that operates over both aspects and produces semantically equivalent and syntactically diverse outputs through grounded noise injection. The denoising function serves as a foundation for learning a paraphrasing function which operates solely in the input-paraphrase space without carrying any direct dependency on noise. We demonstrate SCANING considerably improves performance in terms of both semantic preservation and producing diverse paraphrases through extensive automated and manual evaluation across 4 datasets.
△ Less
Submitted 6 February, 2023;
originally announced February 2023.
-
EFX Exists for Four Agents with Three Types of Valuations
Authors:
Pratik Ghosal,
Vishwa Prakash H. V.,
Prajakta Nimbhorkar,
Nithin Varma
Abstract:
In this paper, we address the problem of determining an envy-free allocation of indivisible goods among multiple agents. EFX, which stands for envy-free up to any good, is a well-studied problem that has been shown to exist for specific scenarios, such as when there are only three agents with MMS valuations, as demonstrated by Chaudhury et al(2020), and for any number of agents when there are only…
▽ More
In this paper, we address the problem of determining an envy-free allocation of indivisible goods among multiple agents. EFX, which stands for envy-free up to any good, is a well-studied problem that has been shown to exist for specific scenarios, such as when there are only three agents with MMS valuations, as demonstrated by Chaudhury et al(2020), and for any number of agents when there are only two types of valuations as shown by Mahara(2020). Our contribution is to extend these results by showing that EFX exists for four agents with three distinct valuations. We further generalize this to show the existance of EFX allocations for n agents when n-2 of them have identical valuations.
△ Less
Submitted 25 January, 2023;
originally announced January 2023.
-
Unsupervised Question Duplicate and Related Questions Detection in e-learning platforms
Authors:
Maksimjeet Chowdhary,
Sanyam Goyal,
Venktesh V,
Mukesh Mohania,
Vikram Goyal
Abstract:
Online learning platforms provide diverse questions to gauge the learners' understanding of different concepts. The repository of questions has to be constantly updated to ensure a diverse pool of questions to conduct assessments for learners. However, it is impossible for the academician to manually skim through the large repository of questions to check for duplicates when onboarding new questio…
▽ More
Online learning platforms provide diverse questions to gauge the learners' understanding of different concepts. The repository of questions has to be constantly updated to ensure a diverse pool of questions to conduct assessments for learners. However, it is impossible for the academician to manually skim through the large repository of questions to check for duplicates when onboarding new questions from external sources. Hence, we propose a tool QDup in this paper that can surface near-duplicate and semantically related questions without any supervised data. The proposed tool follows an unsupervised hybrid pipeline of statistical and neural approaches for incorporating different nuances in similarity for the task of question duplicate detection. We demonstrate that QDup can detect near-duplicate questions and also suggest related questions for practice with remarkable accuracy and speed from a large repository of questions. The demo video of the tool can be found at https://www.youtube.com/watch?v=loh0_-7XLW4.
△ Less
Submitted 20 December, 2022;
originally announced January 2023.
-
Real Time Object Detection System with YOLO and CNN Models: A Review
Authors:
Viswanatha V,
Chandana R K,
Ramachandra A. C.
Abstract:
The field of artificial intelligence is built on object detection techniques. YOU ONLY LOOK ONCE (YOLO) algorithm and it's more evolved versions are briefly described in this research survey. This survey is all about YOLO and convolution neural networks (CNN)in the direction of real time object detection.YOLO does generalized object representation more effectively without precision losses than oth…
▽ More
The field of artificial intelligence is built on object detection techniques. YOU ONLY LOOK ONCE (YOLO) algorithm and it's more evolved versions are briefly described in this research survey. This survey is all about YOLO and convolution neural networks (CNN)in the direction of real time object detection.YOLO does generalized object representation more effectively without precision losses than other object detection models.CNN architecture models have the ability to eliminate highlights and identify objects in any given image. When implemented appropriately, CNN models can address issues like deformity diagnosis, creating educational or instructive application, etc. This article reached atnumber of observations and perspective findings through the analysis.Also it provides support for the focused visual information and feature extraction in the financial and other industries, highlights the method of target detection and feature selection, and briefly describe the development process of YOLO algorithm.
△ Less
Submitted 23 July, 2022;
originally announced August 2022.
-
Implementation Of Tiny Machine Learning Models On Arduino 33 BLE For Gesture And Speech Recognition
Authors:
Viswanatha V,
Ramachandra A. C,
Raghavendra Prasanna,
Prem Chowdary Kakarla,
Viveka Simha PJ,
Nishant Mohan
Abstract:
In this article gesture recognition and speech recognition applications are implemented on embedded systems with Tiny Machine Learning (TinyML). It features 3-axis accelerometer, 3-axis gyroscope and 3-axis magnetometer. The gesture recognition,provides an innovative approach nonverbal communication. It has wide applications in human-computer interaction and sign language. Here in the implementati…
▽ More
In this article gesture recognition and speech recognition applications are implemented on embedded systems with Tiny Machine Learning (TinyML). It features 3-axis accelerometer, 3-axis gyroscope and 3-axis magnetometer. The gesture recognition,provides an innovative approach nonverbal communication. It has wide applications in human-computer interaction and sign language. Here in the implementation of hand gesture recognition, TinyML model is trained and deployed from EdgeImpulse framework for hand gesture recognition and based on the hand movements, Arduino Nano 33 BLE device having 6-axis IMU can find out the direction of movement of hand. The Speech is a mode of communication. Speech recognition is a way by which the statements or commands of human speech is understood by the computer which reacts accordingly. The main aim of speech recognition is to achieve communication between man and machine. Here in the implementation of speech recognition, TinyML model is trained and deployed from EdgeImpulse framework for speech recognition and based on the keywords pronounced by human, Arduino Nano 33 BLE device having built-in microphone can make an RGB LED glow like red, green or blue based on keyword pronounced. The results of each application are obtained and listed in the results section and given the analysis upon the results.
△ Less
Submitted 23 July, 2022;
originally announced July 2022.
-
Auxiliary Task Guided Interactive Attention Model for Question Difficulty Prediction
Authors:
Venktesh V,
Md. Shad Akhtar,
Mukesh Mohania,
Vikram Goyal
Abstract:
Online learning platforms conduct exams to evaluate the learners in a monotonous way, where the questions in the database may be classified into Bloom's Taxonomy as varying levels in complexity from basic knowledge to advanced evaluation. The questions asked in these exams to all learners are very much static. It becomes important to ask new questions with different difficulty levels to each learn…
▽ More
Online learning platforms conduct exams to evaluate the learners in a monotonous way, where the questions in the database may be classified into Bloom's Taxonomy as varying levels in complexity from basic knowledge to advanced evaluation. The questions asked in these exams to all learners are very much static. It becomes important to ask new questions with different difficulty levels to each learner to provide a personalized learning experience. In this paper, we propose a multi-task method with an interactive attention mechanism, Qdiff, for jointly predicting Bloom's Taxonomy and difficulty levels of academic questions. We model the interaction between the predicted bloom taxonomy representations and the input representations using an attention mechanism to aid in difficulty prediction. The proposed learning method would help learn representations that capture the relationship between Bloom's taxonomy and difficulty labels. The proposed multi-task method learns a good input representation by leveraging the relationship between the related tasks and can be used in similar settings where the tasks are related. The results demonstrate that the proposed method performs better than training only on difficulty prediction. However, Bloom's labels may not always be given for some datasets. Hence we soft label another dataset with a model fine-tuned to predict Bloom's labels to demonstrate the applicability of our method to datasets with only difficulty labels.
△ Less
Submitted 24 May, 2022;
originally announced July 2022.
-
Obj2Sub: Unsupervised Conversion of Objective to Subjective Questions
Authors:
Aarish Chhabra,
Nandini Bansal,
Venktesh V,
Mukesh Mohania,
Deep Dwivedi
Abstract:
Exams are conducted to test the learner's understanding of the subject. To prevent the learners from guessing or exchanging solutions, the mode of tests administered must have sufficient subjective questions that can gauge whether the learner has understood the concept by mandating a detailed answer. Hence, in this paper, we propose a novel hybrid unsupervised approach leveraging rule-based method…
▽ More
Exams are conducted to test the learner's understanding of the subject. To prevent the learners from guessing or exchanging solutions, the mode of tests administered must have sufficient subjective questions that can gauge whether the learner has understood the concept by mandating a detailed answer. Hence, in this paper, we propose a novel hybrid unsupervised approach leveraging rule-based methods and pre-trained dense retrievers for the novel task of automatically converting the objective questions to subjective questions. We observe that our approach outperforms the existing data-driven approaches by 36.45% as measured by Recall@k and Precision@k.
△ Less
Submitted 25 May, 2022;
originally announced June 2022.
-
'John ate 5 apples' != 'John ate some apples': Self-Supervised Paraphrase Quality Detection for Algebraic Word Problems
Authors:
Rishabh Gupta,
Venktesh V,
Mukesh Mohania,
Vikram Goyal
Abstract:
This paper introduces the novel task of scoring paraphrases for Algebraic Word Problems (AWP) and presents a self-supervised method for doing so. In the current online pedagogical setting, paraphrasing these problems is helpful for academicians to generate multiple syntactically diverse questions for assessments. It also helps induce variation to ensure that the student has understood the problem…
▽ More
This paper introduces the novel task of scoring paraphrases for Algebraic Word Problems (AWP) and presents a self-supervised method for doing so. In the current online pedagogical setting, paraphrasing these problems is helpful for academicians to generate multiple syntactically diverse questions for assessments. It also helps induce variation to ensure that the student has understood the problem instead of just memorizing it or using unfair means to solve it. The current state-of-the-art paraphrase generation models often cannot effectively paraphrase word problems, losing a critical piece of information (such as numbers or units) which renders the question unsolvable. There is a need for paraphrase scoring methods in the context of AWP to enable the training of good paraphrasers. Thus, we propose ParaQD, a self-supervised paraphrase quality detection method using novel data augmentations that can learn latent representations to separate a high-quality paraphrase of an algebraic question from a poor one by a wide margin. Through extensive experimentation, we demonstrate that our method outperforms existing state-of-the-art self-supervised methods by up to 32% while also demonstrating impressive zero-shot performance.
△ Less
Submitted 16 June, 2022;
originally announced June 2022.
-
K-12BERT: BERT for K-12 education
Authors:
Vasu Goel,
Dhruv Sahnan,
Venktesh V,
Gaurav Sharma,
Deep Dwivedi,
Mukesh Mohania
Abstract:
Online education platforms are powered by various NLP pipelines, which utilize models like BERT to aid in content curation. Since the inception of the pre-trained language models like BERT, there have also been many efforts toward adapting these pre-trained models to specific domains. However, there has not been a model specifically adapted for the education domain (particularly K-12) across subje…
▽ More
Online education platforms are powered by various NLP pipelines, which utilize models like BERT to aid in content curation. Since the inception of the pre-trained language models like BERT, there have also been many efforts toward adapting these pre-trained models to specific domains. However, there has not been a model specifically adapted for the education domain (particularly K-12) across subjects to the best of our knowledge. In this work, we propose to train a language model on a corpus of data curated by us across multiple subjects from various sources for K-12 education. We also evaluate our model, K12-BERT, on downstream tasks like hierarchical taxonomy tagging.
△ Less
Submitted 24 May, 2022;
originally announced May 2022.
-
Topic Aware Contextualized Embeddings for High Quality Phrase Extraction
Authors:
Venktesh V,
Mukesh Mohania,
Vikram Goyal
Abstract:
Keyphrase extraction from a given document is the task of automatically extracting salient phrases that best describe the document. This paper proposes a novel unsupervised graph-based ranking method to extract high-quality phrases from a given document. We obtain the contextualized embeddings from pre-trained language models enriched with topic vectors from Latent Dirichlet Allocation (LDA) to re…
▽ More
Keyphrase extraction from a given document is the task of automatically extracting salient phrases that best describe the document. This paper proposes a novel unsupervised graph-based ranking method to extract high-quality phrases from a given document. We obtain the contextualized embeddings from pre-trained language models enriched with topic vectors from Latent Dirichlet Allocation (LDA) to represent the candidate phrases and the document. We introduce a scoring mechanism for the phrases using the information obtained from contextualized embeddings and the topic vectors. The salient phrases are extracted using a ranking algorithm on an undirected graph constructed for the given document. In the undirected graph, the nodes represent the phrases, and the edges between the phrases represent the semantic relatedness between them, weighted by a score obtained from the scoring mechanism. To demonstrate the efficacy of our proposed method, we perform several experiments on open source datasets in the science domain and observe that our novel method outperforms existing unsupervised embedding based keyphrase extraction methods. For instance, on the SemEval2017 dataset, our method advances the F1 score from 0.2195 (EmbedRank) to 0.2819 at the top 10 extracted keyphrases. Several variants of the proposed algorithm are investigated to determine their effect on the quality of keyphrases. We further demonstrate the ability of our proposed method to collect additional high-quality keyphrases that are not present in the document from external knowledge bases like Wikipedia for enriching the document with newly discovered keyphrases. We evaluate this step on a collection of annotated documents. The F1-score at the top 10 expanded keyphrases is 0.60, indicating that our algorithm can also be used for 'concept' expansion using external knowledge.
△ Less
Submitted 17 January, 2022;
originally announced January 2022.
-
Course Difficulty Estimation Based on Map** of Bloom's Taxonomy and ABET Criteria
Authors:
Premalatha M,
Suganya G,
Viswanathan V,
G Jignesh Chowdary
Abstract:
Current Educational system uses grades or marks to assess the performance of the student. The marks or grades a students scores depends on different parameters, the main parameter being the difficulty level of a course. Computation of this difficulty level may serve as a support for both the students and teachers to fix the level of training needed for successful completion of course. In this pape…
▽ More
Current Educational system uses grades or marks to assess the performance of the student. The marks or grades a students scores depends on different parameters, the main parameter being the difficulty level of a course. Computation of this difficulty level may serve as a support for both the students and teachers to fix the level of training needed for successful completion of course. In this paper, we proposed a methodology that estimates the difficulty level of a course by map** the Bloom's Taxonomy action words along with Accreditation Board for Engineering and Technology (ABET) criteria and learning outcomes. The estimated difficulty level is validated based on the history of grades secured by the students.
△ Less
Submitted 16 November, 2021;
originally announced December 2021.
-
Autonomous Cooperative Multi-Vehicle System for Interception of Aerial and Stationary Targets in Unknown Environments
Authors:
Lima Agnel Tony,
Shuvrangshu Jana,
Varun V. P.,
Aashay Anil Bhise,
Aruul Mozhi Varman S.,
Vidyadhara B. V.,
Mohitvishnu S. Gadde,
Raghu Krishnapuram,
Debasish Ghose
Abstract:
This paper presents the design, development, and testing of hardware-software systems by the IISc-TCS team for Challenge 1 of the Mohammed Bin Zayed International Robotics Challenge 2020. The goal of Challenge 1 was to grab a ball suspended from a moving and maneuvering UAV and pop balloons anchored to the ground, using suitable manipulators. The important tasks carried out to address this challen…
▽ More
This paper presents the design, development, and testing of hardware-software systems by the IISc-TCS team for Challenge 1 of the Mohammed Bin Zayed International Robotics Challenge 2020. The goal of Challenge 1 was to grab a ball suspended from a moving and maneuvering UAV and pop balloons anchored to the ground, using suitable manipulators. The important tasks carried out to address this challenge include the design and development of a hardware system with efficient grabbing and pop** mechanisms, considering the restrictions in volume and payload, design of accurate target interception algorithms using visual information suitable for outdoor environments, and development of a software architecture for dynamic multi-agent aerial systems performing complex dynamic missions. In this paper, a single degree of freedom manipulator attached with an end-effector is designed for grabbing and pop**, and robust algorithms are developed for the interception of targets in an uncertain environment. Vision-based guidance and tracking laws are proposed based on the concept of pursuit engagement and artificial potential function. The software architecture presented in this work proposes an Operation Management System (OMS) architecture that allocates static and dynamic tasks collaboratively among multiple UAVs to perform any given mission. An important aspect of this work is that all the systems developed were designed to operate in completely autonomous mode. A detailed description of the architecture along with simulations of complete challenge in the Gazebo environment and field experiment results are also included in this work. The proposed hardware-software system is particularly useful for counter-UAV systems and can also be modified in order to cater to several other applications.
△ Less
Submitted 1 September, 2021;
originally announced September 2021.
-
TagRec: Automated Tagging of Questions with Hierarchical Learning Taxonomy
Authors:
Venktesh V,
Mukesh Mohania,
Vikram Goyal
Abstract:
Online educational platforms organize academic questions based on a hierarchical learning taxonomy (subject-chapter-topic). Automatically tagging new questions with existing taxonomy will help organize these questions into different classes of hierarchical taxonomy so that they can be searched based on the facets like chapter. This task can be formulated as a flat multi-class classification proble…
▽ More
Online educational platforms organize academic questions based on a hierarchical learning taxonomy (subject-chapter-topic). Automatically tagging new questions with existing taxonomy will help organize these questions into different classes of hierarchical taxonomy so that they can be searched based on the facets like chapter. This task can be formulated as a flat multi-class classification problem. Usually, flat classification based methods ignore the semantic relatedness between the terms in the hierarchical taxonomy and the questions. Some traditional methods also suffer from the class imbalance issues as they consider only the leaf nodes ignoring the hierarchy. Hence, we formulate the problem as a similarity-based retrieval task where we optimize the semantic relatedness between the taxonomy and the questions. We demonstrate that our method helps to handle the unseen labels and hence can be used for taxonomy tagging in the wild. In this method, we augment the question with its corresponding answer to capture more semantic information and then align the question-answer pair's contextualized embedding with the corresponding label (taxonomy) vector representations. The representations are aligned by fine-tuning a transformer based model with a loss function that is a combination of the cosine similarity and hinge rank loss. The loss function maximizes the similarity between the question-answer pair and the correct label representations and minimizes the similarity to unrelated labels. Finally, we perform experiments on two real-world datasets. We show that the proposed learning method outperforms representations learned using the multi-class classification method and other state of the art methods by 6% as measured by Recall@k. We also demonstrate the performance of the proposed method on unseen but related learning content like the learning objectives without re-training the network.
△ Less
Submitted 3 July, 2021;
originally announced July 2021.
-
An Innovative Security Strategy using Reactive Web Application Honeypot
Authors:
Rajat Gupta,
Madhu Viswanatham V.,
Manikandan K
Abstract:
Nowadays, web applications have become most prevalent in the industry, and the critical data of most organizations stored using web apps. Hence, web applications a much bigger target for diverse cyber-attacks, which varies from database injections-SQL injection, PHP object injection, template injection, XML external entity injection, unsanitized input attacks- Cross-Site Scripting(XSS), and many m…
▽ More
Nowadays, web applications have become most prevalent in the industry, and the critical data of most organizations stored using web apps. Hence, web applications a much bigger target for diverse cyber-attacks, which varies from database injections-SQL injection, PHP object injection, template injection, XML external entity injection, unsanitized input attacks- Cross-Site Scripting(XSS), and many more. As mitigation for them, among many proposed solutions, web application honeypots are a much sophisticated and powerful protection mechanism.
In this paper, we propose a low interaction, adaptive, and dynamic web application honeypot that imitates the vulnerabilities through HTTP events. The honeypot is built with SNARE and TANNER; SNARE creates the attack surface and sends the requests to TANNER, which evaluates them and decides how SNARE should respond to the requests. TANNER is an analysis and classification tool, which analyzes and evaluates HTTP requests served by SNARE and to compose the response, it is powered by emulators, which are engines used for the emulation of vulnerabilities.
△ Less
Submitted 10 May, 2021;
originally announced May 2021.
-
Fake News Detection System using XLNet model with Topic Distributions: CONSTRAINT@AAAI2021 Shared Task
Authors:
Akansha Gautam,
Venktesh V,
Sarah Masud
Abstract:
With the ease of access to information, and its rapid dissemination over the internet (both velocity and volume), it has become challenging to filter out truthful information from fake ones. The research community is now faced with the task of automatic detection of fake news, which carries real-world socio-political impact. One such research contribution came in the form of the Constraint@AAA1202…
▽ More
With the ease of access to information, and its rapid dissemination over the internet (both velocity and volume), it has become challenging to filter out truthful information from fake ones. The research community is now faced with the task of automatic detection of fake news, which carries real-world socio-political impact. One such research contribution came in the form of the Constraint@AAA12021 Shared Task on COVID19 Fake News Detection in English. In this paper, we shed light on a novel method we proposed as a part of this shared task. Our team introduced an approach to combine topical distributions from Latent Dirichlet Allocation (LDA) with contextualized representations from XLNet. We also compared our method with existing baselines to show that XLNet + Topic Distributions outperforms other approaches by attaining an F1-score of 0.967.
△ Less
Submitted 12 January, 2021;
originally announced January 2021.
-
OneStopTuner: An End to End Architecture for JVM Tuning of Spark Applications
Authors:
Venktesh V,
Pooja B Bindal,
Devesh Singhal,
A V Subramanyam,
Vivek Kumar
Abstract:
Java is the backbone of widely used big data frameworks, such as Apache Spark, due to its productivity, portability from JVM-based execution, and support for a rich set of libraries. However, the performance of these applications can widely vary depending on the runtime flags chosen out of all existing JVM flags. Manually tuning these flags is both cumbersome and error-prone. Automated tuning appr…
▽ More
Java is the backbone of widely used big data frameworks, such as Apache Spark, due to its productivity, portability from JVM-based execution, and support for a rich set of libraries. However, the performance of these applications can widely vary depending on the runtime flags chosen out of all existing JVM flags. Manually tuning these flags is both cumbersome and error-prone. Automated tuning approaches can ease the task, but current solutions either require considerable processing time or target a subset of flags to avoid time and space requirements. In this paper, we present OneStopTuner, a Machine Learning based novel framework for autotuning JVM flags. OneStopTuner controls the amount of data generation by leveraging batch mode active learning to characterize the user application. Based on the user-selected optimization metric, OneStopTuner then discards the irrelevant JVM flags by applying feature selection algorithms on the generated data. Finally, it employs sample efficient methods such as Bayesian optimization and regression guided Bayesian optimization on the shortlisted JVM flags to find the optimal values for the chosen set of flags. We evaluated OneStopTuner on widely used Spark benchmarks and compare its performance with the traditional simulated annealing based autotuning approach. We demonstrate that for optimizing execution time, the flags chosen by OneStopTuner provides a speedup of up to 1.35x over default Spark execution, as compared to 1.15x speedup by using the flag configurations proposed by simulated annealing. OneStopTuner was able to reduce the number of executions for data-generation by 70% and was able to suggest the optimal flag configuration 2.4x faster than the standard simulated annealing based approach, excluding the time for data-generation.
△ Less
Submitted 7 September, 2020;
originally announced September 2020.
-
Internet of Things(IoT) Based Multilevel Drunken Driving Detection and Prevention System Using Raspberry Pi 3
Authors:
Viswanatha V,
Venkata Siva Reddy R,
Ashwini Kumari P,
Pradeep Kumar S
Abstract:
In this paper, the proposed system has demonstrated three ways of detecting alcohol level in the body of the car driver and prevent car driver from driving the vehicle by turning off the ignition system. It also sends messages to concerned people. In order to detect breath alcohol level MQ-3 sensor is included in this module along with a heartbeat sensor which can detect the heart beat rate of dri…
▽ More
In this paper, the proposed system has demonstrated three ways of detecting alcohol level in the body of the car driver and prevent car driver from driving the vehicle by turning off the ignition system. It also sends messages to concerned people. In order to detect breath alcohol level MQ-3 sensor is included in this module along with a heartbeat sensor which can detect the heart beat rate of driver, facial recognition using webcam & MATLAB and a Wi-Fi module to send a message through the TCP/IP App, a Raspberry pi module to turn off the ignition and an alarm as prevention module. If a driver alcohol intake is more than the prescribed range, set by government the ignition will be made off provided either his heart beat abnormal or the driver is drowsy. In both the cases there will be a message sent to the App and from the App you can send it to family, friend, and well-wisher or nearest cop for the help. The system is developed considering the fact if driver is drunk and he needs a help, his friend can drive the car if he is not drunk. The safety of both the driver and the surroundings are aimed by this system and this aids in minimizing death cases by drunken driving and also burden on the cops.
△ Less
Submitted 21 April, 2020;
originally announced April 2020.
-
Analysis of Big Data Technology for Health Care Services
Authors:
Dinesh Samuel Sathia Raj,
Vijayakumar V,
Bharat Rawal,
Longzhi Yang
Abstract:
Deep learning and other big data technologies have over time become very powerful and accurate. There are algorithms and models developed that have near human accuracy in their task. In health care, the amount of data available is massive and hence, these technologies have a great scope in health care. This paper reviews a few interesting contributions to the field specifically to medical imaging,…
▽ More
Deep learning and other big data technologies have over time become very powerful and accurate. There are algorithms and models developed that have near human accuracy in their task. In health care, the amount of data available is massive and hence, these technologies have a great scope in health care. This paper reviews a few interesting contributions to the field specifically to medical imaging, genomics and patient health records.
△ Less
Submitted 1 September, 2019;
originally announced September 2019.
-
Teaching GANs to Sketch in Vector Format
Authors:
Varshaneya V,
S Balasubramanian,
Vineeth N Balasubramanian
Abstract:
Sketching is more fundamental to human cognition than speech. Deep Neural Networks (DNNs) have achieved the state-of-the-art in speech-related tasks but have not made significant development in generating stroke-based sketches a.k.a sketches in vector format. Though there are Variational Auto Encoders (VAEs) for generating sketches in vector format, there is no Generative Adversarial Network (GAN)…
▽ More
Sketching is more fundamental to human cognition than speech. Deep Neural Networks (DNNs) have achieved the state-of-the-art in speech-related tasks but have not made significant development in generating stroke-based sketches a.k.a sketches in vector format. Though there are Variational Auto Encoders (VAEs) for generating sketches in vector format, there is no Generative Adversarial Network (GAN) architecture for the same. In this paper, we propose a standalone GAN architecture SkeGAN and a VAE-GAN architecture VASkeGAN, for sketch generation in vector format. SkeGAN is a stochastic policy in Reinforcement Learning (RL), capable of generating both multidimensional continuous and discrete outputs. VASkeGAN hybridizes a VAE and a GAN, in order to couple the efficient representation of data by VAE with the powerful generating capabilities of a GAN, to produce visually appealing sketches. We also propose a new metric called the Ske-score which quantifies the quality of vector sketches. We have validated that SkeGAN and VASkeGAN generate visually appealing sketches by using Human Turing Test and Ske-score.
△ Less
Submitted 7 April, 2019;
originally announced April 2019.
-
RES-SE-NET: Boosting Performance of Resnets by Enhancing Bridge-connections
Authors:
Varshaneya V,
Balasubramanian S,
Darshan Gera
Abstract:
One of the ways to train deep neural networks effectively is to use residual connections. Residual connections can be classified as being either identity connections or bridge-connections with a resha** convolution. Empirical observations on CIFAR-10 and CIFAR-100 datasets using a baseline Resnet model, with bridge-connections removed, have shown a significant reduction in accuracy. This reducti…
▽ More
One of the ways to train deep neural networks effectively is to use residual connections. Residual connections can be classified as being either identity connections or bridge-connections with a resha** convolution. Empirical observations on CIFAR-10 and CIFAR-100 datasets using a baseline Resnet model, with bridge-connections removed, have shown a significant reduction in accuracy. This reduction is due to lack of contribution, in the form of feature maps, by the bridge-connections. Hence bridge-connections are vital for Resnet. However, all feature maps in the bridge-connections are considered to be equally important. In this work, an upgraded architecture "Res-SE-Net" is proposed to further strengthen the contribution from the bridge-connections by quantifying the importance of each feature map and weighting them accordingly using Squeeze-and-Excitation (SE) block. It is demonstrated that Res-SE-Net generalizes much better than Resnet and SE-Resnet on the benchmark CIFAR-10 and CIFAR-100 datasets.
△ Less
Submitted 16 February, 2019;
originally announced February 2019.
-
Ontology Matching Techniques: A Gold Standard Model
Authors:
Alok Chauhan,
Vijayakumar V,
Layth Sliman
Abstract:
Typically an ontology matching technique is a combination of much different type of matchers operating at various abstraction levels such as structure, semantic, syntax, instance etc. An ontology matching technique which employs matchers at all possible abstraction levels is expected to give, in general, best results in terms of precision, recall and F-measure due to improvement in matching opport…
▽ More
Typically an ontology matching technique is a combination of much different type of matchers operating at various abstraction levels such as structure, semantic, syntax, instance etc. An ontology matching technique which employs matchers at all possible abstraction levels is expected to give, in general, best results in terms of precision, recall and F-measure due to improvement in matching opportunities and if we discount efficiency issues which may improve with better computing resources such as parallel processing. A gold standard ontology matching model is derived from a model classification of ontology matching techniques. A suitable metric is also defined based on gold standard ontology matching model. A review of various ontology matching techniques specified in recent research papers in the area was undertaken to categorize an ontology matching technique as per newly proposed gold standard model and a metric value for the whole group was computed. The results of the above study support proposed gold standard ontology matching model.
△ Less
Submitted 26 November, 2018;
originally announced November 2018.
-
Top Down Approach to find Maximal Frequent Item Sets using Subset Creation
Authors:
Jnanamurthy H. K.,
Vishesh H. V.,
Vishruth Jain,
Preetham Kumar,
Radhika M. Pai
Abstract:
Association rule has been an area of active research in the field of knowledge discovery. Data mining researchers had improved upon the quality of association rule mining for business development by incorporating influential factors like value (utility), quantity of items sold (weight) and more for the mining of association patterns. In this paper, we propose an efficient approach to find maximal…
▽ More
Association rule has been an area of active research in the field of knowledge discovery. Data mining researchers had improved upon the quality of association rule mining for business development by incorporating influential factors like value (utility), quantity of items sold (weight) and more for the mining of association patterns. In this paper, we propose an efficient approach to find maximal frequent itemset first. Most of the algorithms in literature used to find minimal frequent item first, then with the help of minimal frequent itemsets derive the maximal frequent itemsets. These methods consume more time to find maximal frequent itemsets. To overcome this problem, we propose a navel approach to find maximal frequent itemset directly using the concepts of subsets. The proposed method is found to be efficient in finding maximal frequent itemsets.
△ Less
Submitted 31 October, 2012;
originally announced October 2012.
-
Recent Trends and Research Issues in Video Association Mining
Authors:
Vijayakumar V,
Nedunchezhian R
Abstract:
With the ever-growing digital libraries and video databases, it is increasingly important to understand and mine the knowledge from video database automatically. Discovering association rules between items in a large video database plays a considerable role in the video data mining research areas. Based on the research and development in the past years, application of association rule mining is gr…
▽ More
With the ever-growing digital libraries and video databases, it is increasingly important to understand and mine the knowledge from video database automatically. Discovering association rules between items in a large video database plays a considerable role in the video data mining research areas. Based on the research and development in the past years, application of association rule mining is growing in different domains such as surveillance, meetings, broadcast news, sports, archives, movies, medical data, as well as personal and online media collections. The purpose of this paper is to provide general framework of mining the association rules from video database. This article is also represents the research issues in video association mining followed by the recent trends.
△ Less
Submitted 9 December, 2011;
originally announced December 2011.
-
Performance of Cache Memory Subsystems for Multicore Architectures
Authors:
N. Ramasubramanian,
Srinivas V. V.,
N. Ammasai Gounden
Abstract:
Advancements in multi-core have created interest among many research groups in finding out ways to harness the true power of processor cores. Recent research suggests that on-board component such as cache memory plays a crucial role in deciding the performance of multi-core systems. In this paper, performance of cache memory is evaluated through the parameters such as cache access time, miss rate…
▽ More
Advancements in multi-core have created interest among many research groups in finding out ways to harness the true power of processor cores. Recent research suggests that on-board component such as cache memory plays a crucial role in deciding the performance of multi-core systems. In this paper, performance of cache memory is evaluated through the parameters such as cache access time, miss rate and miss penalty. The influence of cache parameters over execution time is also discussed. Results obtained from simulated studies of multi-core environments with different instruction set architectures (ISA) like ALPHA and X86 are produced.
△ Less
Submitted 13 November, 2011;
originally announced November 2011.