Search | arXiv e-print repository

arXiv:2403.05764 [pdf, ps, other]

Investigation into the Potential of Parallel Quantum Annealing for Simultaneous Optimization of Multiple Problems: A Comprehensive Study

Authors: Arit Kumar Bishwas, Anuraj Som, Saurabh Choudhary

Abstract: Parallel Quantum Annealing is a technique to solve multiple optimization problems simultaneously. Parallel quantum annealing aims to optimize the utilization of available qubits on a quantum topology by addressing multiple independent problems in a single annealing cycle. This study provides insights into the potential and the limitations of this parallelization method. The experiments consisting… ▽ More Parallel Quantum Annealing is a technique to solve multiple optimization problems simultaneously. Parallel quantum annealing aims to optimize the utilization of available qubits on a quantum topology by addressing multiple independent problems in a single annealing cycle. This study provides insights into the potential and the limitations of this parallelization method. The experiments consisting of two different problems are integrated, and various problem dimensions are explored including normalization techniques using specific methods such as DWaveSampler with Default Embedding, DWaveSampler with Custom Embedding and LeapHybridSampler. This method minimizes idle qubits and holds promise for substantial speed-up, as indicated by the Time-to-Solution (TTS) metric, compared to traditional quantum annealing, which solves problems sequentially and may leave qubits unutilized. △ Less

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.00507 [pdf, other]

Molecular unfolding formulation with enhanced quantum annealing approach

Authors: Arit Kumar Bishwas, Arish Pitchai, Anuraj Som

Abstract: Molecular docking is a crucial phase in drug discovery, involving the precise determination of the optimal spatial arrangement between two molecules when they bind. The such analysis, the 3D structure of molecules is a fundamental consideration, involving the manipulation of molecular representations based on their degrees of freedom, including rigid roto-translation and fragment rotations along r… ▽ More Molecular docking is a crucial phase in drug discovery, involving the precise determination of the optimal spatial arrangement between two molecules when they bind. The such analysis, the 3D structure of molecules is a fundamental consideration, involving the manipulation of molecular representations based on their degrees of freedom, including rigid roto-translation and fragment rotations along rotatable bonds, to determine the preferred spatial arrangement when molecules bind to each other. In this paper, quantum annealing based solution to solve Molecular unfolding (MU) problem, a specific phase within molecular docking, is explored and compared with a state-of-the-art classical algorithm named "GeoDock". Molecular unfolding focuses on expanding a molecule to an unfolded state to simplify manipulation within the target cavity and optimize its configuration, typically by maximizing molecular area or internal atom distances. Molecular unfolding problem aims to find the torsional configuration that increases the inter-atomic distance within a molecule, which also increases the molecular area. Quantum annealing approach first encodes the problem into a Higher-order Unconstrained Binary Optimization (HUBO) equation which is pruned to an arbitrary percentage to improve the time efficiency and to be able to solve the equation using any quantum annealer. The resultant HUBO is then converted to a Quadratic Unconstrained Binary Optimization equation (QUBO), which is easily embedded on a D-wave annealing Quantum processor. △ Less

Submitted 1 March, 2024; originally announced March 2024.

Comments: 11 pages, 8 figures

arXiv:2310.10707 [pdf, other]

Demonstrations Are All You Need: Advancing Offensive Content Paraphrasing using In-Context Learning

Authors: Anirudh Som, Karan Sikka, Helen Gent, Ajay Divakaran, Andreas Kathol, Dimitra Vergyri

Abstract: Paraphrasing of offensive content is a better alternative to content removal and helps improve civility in a communication environment. Supervised paraphrasers; however, rely heavily on large quantities of labelled data to help preserve meaning and intent. They also often retain a large portion of the offensiveness of the original content, which raises questions on their overall usability. In this… ▽ More Paraphrasing of offensive content is a better alternative to content removal and helps improve civility in a communication environment. Supervised paraphrasers; however, rely heavily on large quantities of labelled data to help preserve meaning and intent. They also often retain a large portion of the offensiveness of the original content, which raises questions on their overall usability. In this paper we aim to assist practitioners in develo** usable paraphrasers by exploring In-Context Learning (ICL) with large language models (LLMs), i.e., using a limited number of input-label demonstration pairs to guide the model in generating desired outputs for specific queries. Our study focuses on key factors such as - number and order of demonstrations, exclusion of prompt instruction, and reduction in measured toxicity. We perform principled evaluation on three datasets, including our proposed Context-Aware Polite Paraphrase (CAPP) dataset, comprising of dialogue-style rude utterances, polite paraphrases, and additional dialogue context. We evaluate our approach using four closed source and one open source LLM. Our results reveal that ICL is comparable to supervised methods in generation quality, while being qualitatively better by 25% on human evaluation and attaining lower toxicity by 76%. Also, ICL-based paraphrasers only show a slight reduction in performance even with just 10% training data. △ Less

Submitted 9 June, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted in Association for Computational Linguistics (ACL) 2024 Findings

arXiv:2201.00111 [pdf, other]

doi 10.1109/JIOT.2021.3139038

Role of Data Augmentation Strategies in Knowledge Distillation for Wearable Sensor Data

Authors: Eun Som Jeon, Anirudh Som, Ankita Shukla, Kristina Hasanaj, Matthew P. Buman, Pavan Turaga

Abstract: Deep neural networks are parametrized by several thousands or millions of parameters, and have shown tremendous success in many classification problems. However, the large number of parameters makes it difficult to integrate these models into edge devices such as smartphones and wearable devices. To address this problem, knowledge distillation (KD) has been widely employed, that uses a pre-trained… ▽ More Deep neural networks are parametrized by several thousands or millions of parameters, and have shown tremendous success in many classification problems. However, the large number of parameters makes it difficult to integrate these models into edge devices such as smartphones and wearable devices. To address this problem, knowledge distillation (KD) has been widely employed, that uses a pre-trained high capacity network to train a much smaller network, suitable for edge devices. In this paper, for the first time, we study the applicability and challenges of using KD for time-series data for wearable devices. Successful application of KD requires specific choices of data augmentation methods during training. However, it is not yet known if there exists a coherent strategy for choosing an augmentation approach during KD. In this paper, we report the results of a detailed study that compares and contrasts various common choices and some hybrid data augmentation strategies in KD based human activity analysis. Research in this area is often limited as there are not many comprehensive databases available in the public domain from wearable devices. Our study considers databases from small scale publicly available to one derived from a large scale interventional study into human activity and sedentary behavior. We find that the choice of data augmentation techniques during KD have a variable level of impact on end performance, and find that the optimal network choice as well as data augmentation strategies are specific to a dataset at hand. However, we also conclude with a general set of recommendations that can provide a strong baseline performance across databases. △ Less

Submitted 31 December, 2021; originally announced January 2022.

arXiv:2106.09623 [pdf, other]

Towards Explainable Student Group Collaboration Assessment Models Using Temporal Representations of Individual Student Roles

Authors: Anirudh Som, Sujeong Kim, Bladimir Lopez-Prado, Svati Dhamija, Nonye Alozie, Amir Tamrakar

Abstract: Collaboration is identified as a required and necessary skill for students to be successful in the fields of Science, Technology, Engineering and Mathematics (STEM). However, due to growing student population and limited teaching staff it is difficult for teachers to provide constructive feedback and instill collaborative skills using instructional methods. Development of simple and easily explain… ▽ More Collaboration is identified as a required and necessary skill for students to be successful in the fields of Science, Technology, Engineering and Mathematics (STEM). However, due to growing student population and limited teaching staff it is difficult for teachers to provide constructive feedback and instill collaborative skills using instructional methods. Development of simple and easily explainable machine-learning-based automated systems can help address this problem. Improving upon our previous work, in this paper we propose using simple temporal-CNN deep-learning models to assess student group collaboration that take in temporal representations of individual student roles as input. We check the applicability of dynamically changing feature representations for student group collaboration assessment and how they impact the overall performance. We also use Grad-CAM visualizations to better understand and interpret the important temporal indices that led to the deep-learning model's decision. △ Less

Submitted 17 June, 2021; originally announced June 2021.

Comments: Accepted in the poster session at the 14th International Conference on Educational Data Mining

arXiv:2102.08360 [pdf, other]

Interpretable COVID-19 Chest X-Ray Classification via Orthogonality Constraint

Authors: Ella Y. Wang, Anirudh Som, Ankita Shukla, Hongjun Choi, Pavan Turaga

Abstract: Deep neural networks have increasingly been used as an auxiliary tool in healthcare applications, due to their ability to improve performance of several diagnosis tasks. However, these methods are not widely adopted in clinical settings due to the practical limitations in the reliability, generalizability, and interpretability of deep learning based systems. As a result, methods have been develope… ▽ More Deep neural networks have increasingly been used as an auxiliary tool in healthcare applications, due to their ability to improve performance of several diagnosis tasks. However, these methods are not widely adopted in clinical settings due to the practical limitations in the reliability, generalizability, and interpretability of deep learning based systems. As a result, methods have been developed that impose additional constraints during network training to gain more control as well as improve interpretabilty, facilitating their acceptance in healthcare community. In this work, we investigate the benefit of using Orthogonal Spheres (OS) constraint for classification of COVID-19 cases from chest X-ray images. The OS constraint can be written as a simple orthonormality term which is used in conjunction with the standard cross-entropy loss during classification network training. Previous studies have demonstrated significant benefits in applying such constraints to deep learning models. Our findings corroborate these observations, indicating that the orthonormality loss function effectively produces improved semantic localization via GradCAM visualizations, enhanced classification performance, and reduced model calibration error. Our approach achieves an improvement in accuracy of 1.6% and 4.8% for two- and three-class classification, respectively; similar results are found for models with data augmentation applied. In addition to these findings, our work also presents a new application of the OS regularizer in healthcare, increasing the post-hoc interpretability and performance of deep learning models for COVID-19 classification to facilitate adoption of these methods in clinical settings. We also identify the limitations of our strategy that can be explored for further research in future. △ Less

Submitted 21 December, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

Comments: Accepted in the 2021 ACM CHIL Workshop track. An extended version of this work is under consideration at Pattern Recognition Letters

arXiv:2009.10762 [pdf, other]

Role of Orthogonality Constraints in Improving Properties of Deep Networks for Image Classification

Authors: Hongjun Choi, Anirudh Som, Pavan Turaga

Abstract: Standard deep learning models that employ the categorical cross-entropy loss are known to perform well at image classification tasks. However, many standard models thus obtained often exhibit issues like feature redundancy, low interpretability, and poor calibration. A body of recent work has emerged that has tried addressing some of these challenges by proposing the use of new regularization func… ▽ More Standard deep learning models that employ the categorical cross-entropy loss are known to perform well at image classification tasks. However, many standard models thus obtained often exhibit issues like feature redundancy, low interpretability, and poor calibration. A body of recent work has emerged that has tried addressing some of these challenges by proposing the use of new regularization functions in addition to the cross-entropy loss. In this paper, we present some surprising findings that emerge from exploring the role of simple orthogonality constraints as a means of imposing physics-motivated constraints common in imaging. We propose an Orthogonal Sphere (OS) regularizer that emerges from physics-based latent-representations under simplifying assumptions. Under further simplifying assumptions, the OS constraint can be written in closed-form as a simple orthonormality term and be used along with the cross-entropy loss function. The findings indicate that orthonormality loss function results in a) rich and diverse feature representations, b) robustness to feature sub-selection, c) better semantic localization in the class activation maps, and d) reduction in model calibration error. We demonstrate the effectiveness of the proposed OS regularization by providing quantitative and qualitative results on four benchmark datasets - CIFAR10, CIFAR100, SVHN and tiny ImageNet. △ Less

Submitted 22 September, 2020; originally announced September 2020.

Comments: 8 figures, 4 tables, 1 pseudo-code

arXiv:2007.06667 [pdf, other]

A Machine Learning Approach to Assess Student Group Collaboration Using Individual Level Behavioral Cues

Authors: Anirudh Som, Sujeong Kim, Bladimir Lopez-Prado, Svati Dhamija, Nonye Alozie, Amir Tamrakar

Abstract: K-12 classrooms consistently integrate collaboration as part of their learning experiences. However, owing to large classroom sizes, teachers do not have the time to properly assess each student and give them feedback. In this paper we propose using simple deep-learning-based machine learning models to automatically determine the overall collaboration quality of a group based on annotations of ind… ▽ More K-12 classrooms consistently integrate collaboration as part of their learning experiences. However, owing to large classroom sizes, teachers do not have the time to properly assess each student and give them feedback. In this paper we propose using simple deep-learning-based machine learning models to automatically determine the overall collaboration quality of a group based on annotations of individual roles and individual level behavior of all the students in the group. We come across the following challenges when building these models: 1) Limited training data, 2) Severe class label imbalance. We address these challenges by using a controlled variant of Mixup data augmentation, a method for generating additional data samples by linearly combining different pairs of data samples and their corresponding class labels. Additionally, the label space for our problem exhibits an ordered structure. We take advantage of this fact and also explore using an ordinal-cross-entropy loss function and study its effects with and without Mixup. △ Less

Submitted 2 September, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

Comments: Accepted in the ECCV 2020 workshop on Imbalance Problems in Computer Vision

arXiv:2005.02589 [pdf, ps, other]

Unsupervised Pre-trained Models from Healthy ADLs Improve Parkinson's Disease Classification of Gait Patterns

Authors: Anirudh Som, Narayanan Krishnamurthi, Matthew Buman, Pavan Turaga

Abstract: Application and use of deep learning algorithms for different healthcare applications is gaining interest at a steady pace. However, use of such algorithms can prove to be challenging as they require large amounts of training data that capture different possible variations. This makes it difficult to use them in a clinical setting since in most health applications researchers often have to work wi… ▽ More Application and use of deep learning algorithms for different healthcare applications is gaining interest at a steady pace. However, use of such algorithms can prove to be challenging as they require large amounts of training data that capture different possible variations. This makes it difficult to use them in a clinical setting since in most health applications researchers often have to work with limited data. Less data can cause the deep learning model to over-fit. In this paper, we ask how can we use data from a different environment, different use-case, with widely differing data distributions. We exemplify this use case by using single-sensor accelerometer data from healthy subjects performing activities of daily living - ADLs (source dataset), to extract features relevant to multi-sensor accelerometer gait data (target dataset) for Parkinson's disease classification. We train the pre-trained model using the source dataset and use it as a feature extractor. We show that the features extracted for the target dataset can be used to train an effective classification model. Our pre-trained source model consists of a convolutional autoencoder, and the target classification model is a simple multi-layer perceptron model. We explore two different pre-trained source models, trained using different activity groups, and analyze the influence the choice of pre-trained model has over the task of Parkinson's disease classification. △ Less

Submitted 6 May, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

Comments: Accepted in the 42nd Annual International Conferences of the IEEE Engineering in Medicine and Biology Society (EMBC 2020)

arXiv:2004.09805 [pdf, other]

AMC-Loss: Angular Margin Contrastive Loss for Improved Explainability in Image Classification

Authors: Hongjun Choi, Anirudh Som, Pavan Turaga

Abstract: Deep-learning architectures for classification problems involve the cross-entropy loss sometimes assisted with auxiliary loss functions like center loss, contrastive loss and triplet loss. These auxiliary loss functions facilitate better discrimination between the different classes of interest. However, recent studies hint at the fact that these loss functions do not take into account the intrinsi… ▽ More Deep-learning architectures for classification problems involve the cross-entropy loss sometimes assisted with auxiliary loss functions like center loss, contrastive loss and triplet loss. These auxiliary loss functions facilitate better discrimination between the different classes of interest. However, recent studies hint at the fact that these loss functions do not take into account the intrinsic angular distribution exhibited by the low-level and high-level feature representations. This results in less compactness between samples from the same class and unclear boundary separations between data clusters of different classes. In this paper, we address this issue by proposing the use of geometric constraints, rooted in Riemannian geometry. Specifically, we propose Angular Margin Contrastive Loss (AMC-Loss), a new loss function to be used along with the traditional cross-entropy loss. The AMC-Loss employs the discriminative angular distance metric that is equivalent to geodesic distance on a hypersphere manifold such that it can serve a clear geometric interpretation. We demonstrate the effectiveness of AMC-Loss by providing quantitative and qualitative results. We find that although the proposed geometrically constrained loss-function improves quantitative results modestly, it has a qualitatively surprisingly beneficial effect on increasing the interpretability of deep-net decisions as seen by the visual explanations generated by techniques such as the Grad-CAM. Our code is available at https://github.com/hchoi71/AMC-Loss. △ Less

Submitted 21 April, 2020; originally announced April 2020.

arXiv:2004.07384 [pdf, other]

Topological Descriptors for Parkinson's Disease Classification and Regression Analysis

Authors: Afra Nawar, Farhan Rahman, Narayanan Krishnamurthi, Anirudh Som, Pavan Turaga

Abstract: At present, the vast majority of human subjects with neurological disease are still diagnosed through in-person assessments and qualitative analysis of patient data. In this paper, we propose to use Topological Data Analysis (TDA) together with machine learning tools to automate the process of Parkinson's disease classification and severity assessment. An automated, stable, and accurate method to… ▽ More At present, the vast majority of human subjects with neurological disease are still diagnosed through in-person assessments and qualitative analysis of patient data. In this paper, we propose to use Topological Data Analysis (TDA) together with machine learning tools to automate the process of Parkinson's disease classification and severity assessment. An automated, stable, and accurate method to evaluate Parkinson's would be significant in streamlining diagnoses of patients and providing families more time for corrective measures. We propose a methodology which incorporates TDA into analyzing Parkinson's disease postural shifts data through the representation of persistence images. Studying the topology of a system has proven to be invariant to small changes in data and has been shown to perform well in discrimination tasks. The contributions of the paper are twofold. We propose a method to 1) classify healthy patients from those afflicted by disease and 2) diagnose the severity of disease. We explore the use of the proposed method in an application involving a Parkinson's disease dataset comprised of healthy-elderly, healthy-young and Parkinson's disease patients. Our code is available at https://github.com/itsmeafra/Sublevel-Set-TDA. △ Less

Submitted 6 May, 2020; v1 submitted 15 April, 2020; originally announced April 2020.

Comments: Accepted in the 42nd Annual International Conferences of the IEEE Engineering in Medicine and Biology Society (EMBC 2020)

arXiv:1906.01769 [pdf, other]

PI-Net: A Deep Learning Approach to Extract Topological Persistence Images

Authors: Anirudh Som, Hongjun Choi, Karthikeyan Natesan Ramamurthy, Matthew Buman, Pavan Turaga

Abstract: Topological features such as persistence diagrams and their functional approximations like persistence images (PIs) have been showing substantial promise for machine learning and computer vision applications. This is greatly attributed to the robustness topological representations provide against different types of physical nuisance variables seen in real-world data, such as view-point, illuminati… ▽ More Topological features such as persistence diagrams and their functional approximations like persistence images (PIs) have been showing substantial promise for machine learning and computer vision applications. This is greatly attributed to the robustness topological representations provide against different types of physical nuisance variables seen in real-world data, such as view-point, illumination, and more. However, key bottlenecks to their large scale adoption are computational expenditure and difficulty incorporating them in a differentiable architecture. We take an important step in this paper to mitigate these bottlenecks by proposing a novel one-step approach to generate PIs directly from the input data. We design two separate convolutional neural network architectures, one designed to take in multi-variate time series signals as input and another that accepts multi-channel images as input. We call these networks Signal PI-Net and Image PI-Net respectively. To the best of our knowledge, we are the first to propose the use of deep learning for computing topological features directly from data. We explore the use of the proposed PI-Net architectures on two applications: human activity recognition using tri-axial accelerometer sensor data and image classification. We demonstrate the ease of fusion of PIs in supervised deep learning architectures and speed up of several orders of magnitude for extracting PIs from data. Our code is available at https://github.com/anirudhsom/PI-Net. △ Less

Submitted 23 May, 2020; v1 submitted 4 June, 2019; originally announced June 2019.

Comments: 10 pages, 8 figures, 4 tables

arXiv:1807.10400 [pdf, other]

Perturbation Robust Representations of Topological Persistence Diagrams

Authors: Anirudh Som, Kowshik Thopalli, Karthikeyan Natesan Ramamurthy, Vinay Venkataraman, Ankita Shukla, Pavan Turaga

Abstract: Topological methods for data analysis present opportunities for enforcing certain invariances of broad interest in computer vision, including view-point in activity analysis, articulation in shape analysis, and measurement invariance in non-linear dynamical modeling. The increasing success of these methods is attributed to the complementary information that topology provides, as well as availabili… ▽ More Topological methods for data analysis present opportunities for enforcing certain invariances of broad interest in computer vision, including view-point in activity analysis, articulation in shape analysis, and measurement invariance in non-linear dynamical modeling. The increasing success of these methods is attributed to the complementary information that topology provides, as well as availability of tools for computing topological summaries such as persistence diagrams. However, persistence diagrams are multi-sets of points and hence it is not straightforward to fuse them with features used for contemporary machine learning tools like deep-nets. In this paper we present theoretically well-grounded approaches to develop novel perturbation robust topological representations, with the long-term view of making them amenable to fusion with contemporary learning architectures. We term the proposed representation as Perturbed Topological Signatures, which live on a Grassmann manifold and hence can be efficiently used in machine learning pipelines. We explore the use of the proposed descriptor on three applications: 3D shape analysis, view-invariant activity analysis, and non-linear dynamical modeling. We show favorable results in both high-level recognition performance and time-complexity when compared to other baseline methods. △ Less

Submitted 26 July, 2018; originally announced July 2018.

Comments: 19 pages, 4 figures, 6 tables

Showing 1–13 of 13 results for author: Som, A