-
Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness
Authors:
Lars Hillebrand,
Prabhupad Pradhan,
Christian Bauckhage,
Rafet Sifa
Abstract:
We introduce "pointer-guided segment ordering" (SO), a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations in large language models. Our methodology leverages a self-attention-driven pointer network to restore the original sequence of shuffled text segments, addressing the challenge of capturing the structural coherence and contextual dependencies within documents. This pre-training approach is complemented by a fine-tuning methodology that incorporates dynamic sampling, augmenting the diversity of training instances and improving sample efficiency for various downstream applications. We evaluate our method on a diverse set of datasets, demonstrating its efficacy in tasks requiring sequential text classification across scientific literature and financial reporting domains. Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures, leading to state-of-the-art performance in downstream classification tasks.
Submitted 6 June, 2024;
originally announced June 2024.
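The segment-ordering objective described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the greedy decoding step, and the hand-made score matrix standing in for a trained self-attention pointer network are all assumptions made for illustration.

```python
import math
import random


def shuffle_segments(segments, rng):
    """Shuffle segments and return (shuffled, targets).

    targets[k] is the slot in `shuffled` that holds the segment which
    originally stood at position k, i.e. where pointer step k should point.
    """
    order = list(range(len(segments)))
    rng.shuffle(order)  # order[j] = original position of shuffled[j]
    shuffled = [segments[i] for i in order]
    targets = [order.index(k) for k in range(len(segments))]
    return shuffled, targets


def pointer_loss(scores, targets):
    """Mean cross-entropy of a row-wise softmax over pointer scores.

    scores[k][j] is a hypothetical score that original position k points
    at shuffled slot j (in a real model it would come from attention).
    """
    total = 0.0
    for row, target in zip(scores, targets):
        peak = max(row)
        log_z = peak + math.log(sum(math.exp(s - peak) for s in row))
        total += log_z - row[target]
    return total / len(targets)


def restore(shuffled, scores):
    """Greedy decoding: every position picks its highest-scoring slot.

    Greedy picks may repeat a slot when scores are poor; this sketch does
    not enforce a valid permutation.
    """
    return [shuffled[max(range(len(row)), key=row.__getitem__)] for row in scores]


segments = ["Intro.", "Method.", "Results.", "Conclusion."]
shuffled, targets = shuffle_segments(segments, random.Random(0))
# Stand-in for a trained model: a large score on the correct slot.
scores = [[5.0 if j == t else 0.0 for j in range(len(shuffled))] for t in targets]
restored = restore(shuffled, scores)
loss = pointer_loss(scores, targets)
```

With near-perfect scores the original order is recovered and the loss is close to zero; with uniform scores the loss equals the logarithm of the number of segments.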
-
Higher weight spectra of ternary codes associated to the quadratic Veronese $3$-fold
Authors:
Krishna Kaipa,
Puspendu Pradhan
Abstract:
The problem studied in this work is to determine the higher weight spectra of the Projective Reed-Muller codes associated to the Veronese $3$-fold $\mathcal V$ in $PG(9,q)$, which is the image of the quadratic Veronese embedding of $PG(3,q)$ in $PG(9,q)$. We reduce the problem to the following combinatorial problem in finite geometry: For each subset $S$ of $\mathcal V$, determine the dimension of the linear subspace of $PG(9,q)$ generated by $S$. We develop a systematic method to solve the latter problem. We implement the method for $q=3$, and use it to obtain the higher weight spectra of the associated code. The case of a general finite field $\mathbb F_q$ will be treated in a future work.
Submitted 20 May, 2024;
originally announced May 2024.
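The combinatorial problem stated in this abstract (the dimension of the subspace of $PG(9,q)$ generated by a subset $S$ of $\mathcal V$) can be tried out by brute force for $q=3$. The sketch below is only an illustration under stated assumptions: it is not the systematic method of the paper, it works only for prime $q$, and all function names are hypothetical.

```python
from itertools import combinations_with_replacement, product


def projective_points(n, q):
    """Normalized representatives of PG(n-1, q) for prime q.

    A representative is a nonzero vector whose first nonzero coordinate is 1.
    """
    points = []
    for v in product(range(q), repeat=n):
        nonzero = [c for c in v if c]
        if nonzero and nonzero[0] == 1:
            points.append(v)
    return points


def veronese(point, q):
    """Quadratic Veronese map: all monomials x_i * x_j with i <= j, mod q."""
    pairs = combinations_with_replacement(range(len(point)), 2)
    return tuple((point[i] * point[j]) % q for i, j in pairs)


def rank_mod_p(rows, p):
    """Rank of a matrix over F_p (p prime) by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)  # inverse via Fermat's little theorem
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


q = 3
points = projective_points(4, q)            # points of PG(3, 3)
V = [veronese(pt, q) for pt in points]      # their images in PG(9, 3)
line = [pt for pt in points if pt[2] == 0 and pt[3] == 0]
line_rank = rank_mod_p([veronese(pt, q) for pt in line], q)
full_rank = rank_mod_p(V, q)
```

The projective dimension of the span is the rank minus one. In this sketch the four points of a line of $PG(3,3)$ map to a set spanning a plane (rank 3), while all 40 points of $\mathcal V$ together span the whole of $PG(9,3)$ (rank 10).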
-
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Authors:
Mallika Garg,
Debashis Ghosh,
Pyari Mohan Pradhan
Abstract:
Transformer models have achieved state-of-the-art results in many applications such as NLP and classification, but their exploration in gesture recognition is still limited. We therefore propose a novel GestFormer architecture for dynamic hand gesture recognition. The motivation behind this design is a resource-efficient transformer model, since transformers are computationally expensive and very complex. To this end, we use the pooling-based token mixer PoolFormer, which relies only on a non-parametric pooling layer instead of quadratic attention. The proposed model also leverages the space-invariant features of the wavelet transform, and multiscale features are selected using multi-scale pooling. Further, a gated mechanism helps to focus on fine details of the gesture together with the contextual information. This improves on the performance of the traditional transformer while using fewer parameters, when evaluated on the NVidia Dynamic Hand Gesture and Briareo dynamic hand gesture datasets. To demonstrate the efficacy of the proposed model, we experiment on single as well as multimodal inputs such as infrared, normals, depth, optical flow and color images. We also compare the proposed GestFormer in terms of resource efficiency and number of operations. The source code is available at https://github.com/mallikagarg/GestFormer.
Submitted 18 May, 2024;
originally announced May 2024.
-
Differentially Private Communication of Measurement Anomalies in the Smart Grid
Authors:
Nikhil Ravi,
Anna Scaglione,
Sean Peisert,
Parth Pradhan
Abstract:
In this paper, we present a framework based on differential privacy (DP) for querying electric power measurements to detect system anomalies or bad data. Our DP approach conceals consumption and system matrix data, while simultaneously enabling an untrusted third party to test hypotheses of anomalies, such as the presence of bad data, by releasing a randomized sufficient statistic for hypothesis-testing. We consider a measurement model corrupted by Gaussian noise and a sparse noise vector representing the attack, and we observe that the optimal test statistic is a chi-square random variable. To detect possible attacks, we propose a novel DP chi-square noise mechanism that ensures the test does not reveal private information about power injections or the system matrix. The proposed framework provides a robust solution for detecting bad data while preserving the privacy of sensitive power system data.
Submitted 22 March, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
Out of Sight, Still in Mind: Reasoning and Planning about Unobserved Objects with Video Tracking Enabled Memory Models
Authors:
Yixuan Huang,
Jialin Yuan,
Chanho Kim,
Pupul Pradhan,
Bryan Chen,
Li Fuxin,
Tucker Hermans
Abstract:
Robots need to have a memory of previously observed, but currently occluded objects to work reliably in realistic environments. We investigate the problem of encoding object-oriented memory into a multi-object manipulation reasoning and planning framework. We propose DOOM and LOOM, which leverage transformer relational dynamics to encode the history of trajectories given partial-view point clouds and an object discovery and tracking engine. Our approaches can perform multiple challenging tasks including reasoning with occluded objects, novel object appearance, and object reappearance. Through extensive simulation and real-world experiments, we find that our approaches perform well across different numbers of objects and different numbers of distractor actions. Furthermore, we show our approaches outperform an implicit memory baseline.
Submitted 24 May, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding
Authors:
Shaik Basheeruddin Shah,
Pradyumna Pradhan,
Wei Pu,
Ramunaidu Randhi,
Miguel R. D. Rodrigues,
Yonina C. Eldar
Abstract:
Solving linear inverse problems plays a crucial role in numerous applications. Model-aware, data-driven approaches based on algorithm unfolding have gained significant attention for effectively addressing these problems. The learned iterative soft-thresholding algorithm (LISTA) and the alternating direction method of multipliers compressive sensing network (ADMM-CSNet) are two widely used approaches of this kind, based on the ISTA and ADMM algorithms, respectively. In this work, we study optimization guarantees, i.e., achieving near-zero training loss as the number of learning epochs increases, for finite-layer unfolded networks such as LISTA and ADMM-CSNet with smooth soft-thresholding in an over-parameterized (OP) regime. We achieve this by leveraging a modified version of the Polyak-Łojasiewicz condition, denoted PL$^*$. Satisfying the PL$^*$ condition within a specific region of the loss landscape ensures the existence of a global minimum and exponential convergence from initialization using gradient descent based methods. Hence, we provide conditions, in terms of the network width and the number of training samples, on these unfolded networks for the PL$^*$ condition to hold. We obtain these conditions by deriving the Hessian spectral norm of these networks. Additionally, we show that the threshold on the number of training samples increases with the network width. Furthermore, we compare the threshold on training samples of unfolded networks with that of a standard fully-connected feed-forward network (FFNN) with smooth soft-thresholding non-linearity. We prove that unfolded networks have a higher threshold value than FFNN. Consequently, unfolded networks can be expected to achieve a better expected error than FFNN.
Submitted 12 September, 2023;
originally announced September 2023.
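For readers unfamiliar with the two ingredients named in this abstract, the standard textbook forms are recalled below. This is a sketch, not the paper's notation: the symbols $A$, $y$, $\lambda$, $L$, $\mu$, $w$ and the set $\mathcal{S}$ are assumptions, the smooth surrogate of the soft-thresholding operator used by the authors is not specified in the abstract, and the constant in the PL$^*$ inequality follows one common convention.

```latex
% ISTA for  min_x  (1/2) ||y - A x||_2^2 + \lambda ||x||_1 ,
% with step constant L >= ||A||_2^2 :
x^{k+1} = S_{\lambda/L}\!\left( x^{k} + \tfrac{1}{L} A^{\top}\bigl(y - A x^{k}\bigr) \right),
\qquad
S_{\theta}(u) = \operatorname{sign}(u)\,\max\bigl(|u| - \theta,\, 0\bigr).

% PL^* condition for a non-negative training loss \mathcal{L}(w)
% on a region \mathcal{S} of parameter space, with \mu > 0 :
\tfrac{1}{2}\,\bigl\lVert \nabla \mathcal{L}(w) \bigr\rVert^{2} \;\ge\; \mu\, \mathcal{L}(w)
\qquad \text{for all } w \in \mathcal{S}.
```

Unfolding turns a fixed number of ISTA iterations into network layers with learnable weights in place of $A^{\top}$ and $\lambda/L$; the PL$^*$ inequality is what ties a small gradient to a small loss inside $\mathcal{S}$.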
-
Estimation of Correlation Matrices from Limited time series Data using Machine Learning
Authors:
Nikhil Easaw,
Woo Seok Lee,
Prashant Singh Lohiya,
Sarika Jalan,
Priodyuti Pradhan
Abstract:
Correlation matrices contain a wide variety of spatio-temporal information about a dynamical system. Predicting correlation matrices from partial time series information of a few nodes characterizes the spatio-temporal dynamics of the entire underlying system. This information can help to predict the underlying network structure, e.g., inferring neuronal connections from spiking data, deducing causal dependencies between genes from expression data, and discovering long spatial range influences in climate variations. Traditional methods of predicting correlation matrices utilize time series data of all the nodes of the underlying networks. Here, we use a supervised machine learning technique to predict the correlation matrix of entire systems from finite time series information of a few randomly selected nodes. The accuracy of the prediction validates that only a limited time series of a subset of the entire system is enough to make good correlation matrix predictions. Furthermore, using an unsupervised learning algorithm, we furnish insights into the success of the predictions from our model. Finally, we apply the machine learning model developed here to real-world data sets.
Submitted 13 March, 2023; v1 submitted 2 September, 2022;
originally announced September 2022.
-
SCNet: A Generalized Attention-based Model for Crack Fault Segmentation
Authors:
Hrishikesh Sharma,
Prakhar Pradhan,
Balamuralidhar P
Abstract:
Anomaly detection and localization is an important vision problem with multiple applications. Effective and generic semantic segmentation of anomalous regions on various surfaces, where most anomalous regions inherently do not have any obvious pattern, is still under active research. Periodic health monitoring and fault (anomaly) detection in vast infrastructures, which is an important safety-related task, is one such application area of vision-based anomaly segmentation. However, the task is quite challenging due to large variations in surface faults, texture-less construction material/background, lighting conditions, etc. Cracks are critical and frequent surface faults that manifest as extreme zigzag-shaped thin, elongated regions. They are among the hardest faults to detect, even with deep learning. In this work, we address an open aspect of the automatic crack segmentation problem, that of generalizing and improving the performance of segmentation across a variety of scenarios, by modeling the problem differently. We carefully study and abstract the sub-problems involved and solve them in a broader context, making our solution generic. On a variety of datasets related to surveillance of different infrastructures, under varying conditions, our model consistently outperforms the state-of-the-art algorithms by a significant margin, without any bells-and-whistles. This performance advantage carried over easily to two deployments of our model, tested against industry-provided datasets. Furthermore, we could establish our model's performance for two manufacturing quality inspection scenarios as well, where the defect types are not just crack equivalents but far more numerous and varied. Hence we hope that our model is indeed a truly generic defect segmentation model.
Submitted 2 December, 2021;
originally announced December 2021.
-
Multilevel Digital Contact Tracing
Authors:
Gautam Mahapatra,
Priodyuti Pradhan,
Abhinandan Khan,
Sanjit Kumar Setua,
Rajat Kumar Pal,
Ayush Rathor
Abstract:
Digital contact tracing plays a crucial role in alleviating an outbreak, and designing multilevel digital contact tracing for a country is an open problem because it requires analyzing large volumes of temporal contact data. We develop a multilevel digital contact tracing framework that constructs dynamic contact graphs from the proximity contact data. Notably, we introduce the edge label of the contact graph as a binary circular contact queue, which holds the temporal social interactions during the incubation period. After that, our algorithm prepares the direct and indirect (multilevel) contact list for a given set of infected persons from the contact graph. Finally, the algorithm constructs the infection pathways for the trace list. We implement the framework and validate the contact tracing process with synthetic and real-world data sets. In addition, analysis reveals that for COVID-19 close contact parameters, the framework takes reasonable space and time to create the infection pathways. Our framework can be applied to any spreading epidemic by changing the algorithm's parameters.
Submitted 18 May, 2024; v1 submitted 10 July, 2020;
originally announced July 2020.
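The two data-structure ideas named in this abstract, a binary circular contact queue as edge label and a multilevel contact list, can be sketched as follows. This is a simplified illustration, not the authors' framework: the class and method names are hypothetical, each contact pair is assumed to involve two distinct persons, and the traversal ignores the temporal order of contacts that real infection pathways would have to respect.

```python
from collections import deque


class ContactGraph:
    """Contact graph whose edges carry a fixed-length queue of 0/1 contact bits."""

    def __init__(self, window):
        self.window = window  # number of time slots kept, e.g. incubation days
        self.edges = {}       # frozenset({a, b}) -> deque of 0/1 bits

    def record_slot(self, contacts):
        """Advance one time slot; `contacts` holds the (a, b) pairs seen in it."""
        seen = {frozenset(pair) for pair in contacts}
        for key in seen:
            self.edges.setdefault(key, deque([0] * self.window, maxlen=self.window))
        for key, bits in self.edges.items():
            bits.append(1 if key in seen else 0)  # the oldest bit drops out

    def neighbours(self, person):
        """Persons with at least one recorded contact inside the window."""
        for key, bits in self.edges.items():
            if person in key and any(bits):
                (other,) = key - {person}
                yield other

    def trace(self, infected, max_level):
        """Return {person: level}: 1 = direct contact, 2 = contact of a contact, ..."""
        level = {p: 0 for p in infected}
        frontier = deque(infected)
        while frontier:
            person = frontier.popleft()
            if level[person] == max_level:
                continue
            for nb in self.neighbours(person):
                if nb not in level:
                    level[nb] = level[person] + 1
                    frontier.append(nb)
        return {p: lvl for p, lvl in level.items() if lvl > 0}


g = ContactGraph(window=3)
g.record_slot([("A", "B")])
g.record_slot([("B", "C")])
g.record_slot([])
g.record_slot([("C", "D")])
contacts_of_b = g.trace(["B"], max_level=2)
```

Because the queue has a fixed length, the A-B contact from the first slot has already expired after four slots, so tracing from B reaches only C (level 1) and D (level 2).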
-
Climate Adaptation: Reliably Predicting from Imbalanced Satellite Data
Authors:
Ruchit Rawal,
Prabhu Pradhan
Abstract:
Aerial imagery (satellite, drone) has become an invaluable information source for cross-disciplinary applications, especially for crisis management. Most mapping and tracking efforts are manual, which is resource-intensive and often leads to delivery delays. Deep learning methods have boosted the capacity of relief efforts via recognition and detection, and are now being used for non-trivial applications. However, the data commonly available is highly imbalanced (similar to other real-life applications), which severely hampers the neural network's capabilities and reduces robustness and trust. We give an overview of the different kinds of techniques used for handling such extreme settings and present solutions aimed at maximizing performance on minority classes using a diverse set of methods (ranging from architectural tuning to augmentation), which in combination generalize across all minority classes. We hope to amplify cross-disciplinary efforts by enhancing model reliability.
Submitted 26 April, 2020;
originally announced April 2020.
-
Distributed Data Verification Protocols in Cloud Computing
Authors:
Priodyuti Pradhan
Abstract:
Recently, storing huge volumes of data in the Cloud has become an effective trend in modern-day computing due to its dynamic nature. After storing, users delete their original copy of the data files and therefore cannot directly control that data. This lack of control introduces security issues in Cloud data storage; one of the most important is the integrity of the remotely stored data. Here, we propose a distributed algorithmic approach to address this problem with a publicly verifiable probabilistic scheme. Because of the heavy workload at the Third Party Auditor (TPA) side, we distribute the verification task among various SUBTPAs. We use Sobol random sequences to generate the random block numbers, which maintains the uniformity property. In addition, our method provides uniformity for each subtask, which we achieve with an analytical approach. Owing to this uniformity, our protocols verify the integrity of the data efficiently and quickly. We also take special care of critical data by using Overlap Task Distribution Keys.
Submitted 15 April, 2020;
originally announced April 2020.
-
Co-VeGAN: Complex-Valued Generative Adversarial Network for Compressive Sensing MR Image Reconstruction
Authors:
Bhavya Vasudeva,
Puneesh Deora,
Saumik Bhattacharya,
Pyari Mohan Pradhan
Abstract:
Compressive sensing (CS) is widely used to reduce the acquisition time of magnetic resonance imaging (MRI). Although state-of-the-art deep learning based methods have been able to obtain fast, high-quality reconstruction of CS-MR images, their main drawback is that they treat complex-valued MRI data as real-valued entities. Most methods either extract the magnitude from the complex-valued entities or concatenate them as two real-valued channels. In both cases, the phase content, which links the real and imaginary parts of the complex-valued entities, is discarded. To address this fundamental problem of real-valued deep networks, i.e., their inability to process complex-valued data, we propose a novel framework based on a complex-valued generative adversarial network (Co-VeGAN). Our model can process complex-valued input, which enables it to perform high-quality reconstruction of the CS-MR images. Further, considering that phase is a crucial component of complex-valued entities, we propose a novel complex-valued activation function, which is sensitive to the phase of the input. Extensive evaluation of the proposed approach on different datasets using various sampling masks demonstrates that the proposed model significantly outperforms the existing CS-MRI reconstruction techniques in terms of peak signal-to-noise ratio as well as structural similarity index. Further, it uses significantly fewer trainable parameters to do so than the real-valued deep learning based methods.
Submitted 24 September, 2020; v1 submitted 24 February, 2020;
originally announced February 2020.
-
Structure Preserving Compressive Sensing MRI Reconstruction using Generative Adversarial Networks
Authors:
Puneesh Deora,
Bhavya Vasudeva,
Saumik Bhattacharya,
Pyari Mohan Pradhan
Abstract:
Compressive sensing magnetic resonance imaging (CS-MRI) accelerates the acquisition of MR images by breaking the Nyquist sampling limit. In this work, a novel generative adversarial network (GAN) based framework for CS-MRI reconstruction is proposed. Leveraging a combination of a patch-based discriminator and a structural similarity index based loss, our model focuses on preserving high frequency content as well as fine textural details in the reconstructed image. Dense and residual connections have been incorporated in a U-net based generator architecture to allow easier transfer of information as well as variable network length. We show that our algorithm outperforms state-of-the-art methods in terms of quality of reconstruction and robustness to noise. Also, the reconstruction time, which is of the order of milliseconds, makes it highly suitable for real-time clinical use.
Submitted 26 April, 2020; v1 submitted 14 October, 2019;
originally announced October 2019.
-
Planning Robot Motion using Deep Visual Prediction
Authors:
Meenakshi Sarkar,
Prabhu Pradhan,
Debasish Ghose
Abstract:
In this paper, we introduce a novel framework that can learn to make visual predictions about the motion of a robotic agent from raw video frames. Our proposed motion prediction network (PROM-Net) can learn in a completely unsupervised manner and efficiently predict up to 10 frames in the future. Moreover, unlike other motion prediction models, it is lightweight, and once trained it can be easily implemented on mobile platforms that have very limited computing capabilities. We have created a new robotic data set comprising LEGO Mindstorms moving along various trajectories in three different environments under different lighting conditions for testing and training the network. Finally, we introduce a framework that would use the predicted frames from the network as an input to a model predictive controller for motion planning in unknown dynamic environments with moving obstacles.
Submitted 24 June, 2019;
originally announced June 2019.
-
Network construction: A learning framework through localizing principal eigenvector
Authors:
Priodyuti Pradhan,
Sarika Jalan
Abstract:
Information about the localization properties of eigenvectors of complex networks is applicable in many areas, including network centrality measures, spectral partitioning, development of approximation algorithms, and disease spreading phenomena. For a linear dynamical process, localization of the principal eigenvector (PEV) of the adjacency matrix implies condensation of information in a small section of the network. For a network, an eigenvector is said to be localized when most of its components are near zero, with a few taking very high values. Here, we provide three different random-sampling-based algorithms which, by using the edge rewiring method, can evolve a random network having a delocalized PEV into a network having a highly localized PEV. In other words, we develop a learning framework to explore the localization of the PEV through a random sampling-based optimization method. We discuss the drawbacks and advantages of these algorithms. Additionally, we show that the construction of such networks corresponding to the highly localized PEV is a non-convex optimization problem when the objective function is the inverse participation ratio. This framework is also relevant for constructing a network structure for other lower-order eigenvectors.
Submitted 28 September, 2021; v1 submitted 1 February, 2018;
originally announced February 2018.
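The objective function mentioned in this abstract, the inverse participation ratio (IPR) of the principal eigenvector, can be computed with a short sketch. The code below only measures localization on two hand-built graphs; it does not implement the edge-rewiring algorithms of the paper. The function names, the adjacency-list representation, and the use of power iteration on $A + I$ are assumptions chosen for illustration.

```python
import math


def inverse_participation_ratio(vec):
    """IPR = sum(x_i^4) for the vector rescaled to unit Euclidean length."""
    norm_sq = sum(x * x for x in vec)
    if norm_sq == 0:
        raise ValueError("the zero vector has no IPR")
    return sum(x ** 4 for x in vec) / norm_sq ** 2


def principal_eigenvector(adj, iters=500):
    """Power iteration on A + I for an undirected graph given as adjacency lists.

    Adding the identity keeps the eigenvectors of A but avoids the
    oscillation that plain power iteration shows on bipartite graphs.
    """
    n = len(adj)
    x = [1.0] * n
    for _ in range(iters):
        y = [x[i] + sum(x[j] for j in adj[i]) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


n = 101
ring = [[(i - 1) % n, (i + 1) % n] for i in range(n)]     # every node alike
star = [list(range(1, n))] + [[0] for _ in range(1, n)]   # node 0 is the hub
ipr_ring = inverse_participation_ratio(principal_eigenvector(ring))
ipr_star = inverse_participation_ratio(principal_eigenvector(star))
```

A fully delocalized eigenvector on $N$ nodes has IPR $1/N$, and a vector concentrated on a single node has IPR 1. Here the ring gives $1/101$, while the star, whose principal eigenvector puts half of its squared weight on the hub, gives a much larger value of about 0.25.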
-
Land-cover Classification and Mapping for Eastern Himalayan State Sikkim
Authors:
Ratika Pradhan,
Mohan P. Pradhan,
Ashish Bhusan,
Ronak K. Pradhan,
M. K. Ghose
Abstract:
Classifying satellite imagery has become a challenging task in the current era of tremendous growth in settlement, i.e., construction of buildings, roads, bridges, dams, etc. This paper suggests an improvised k-means and an Artificial Neural Network (ANN) classifier for land-cover mapping of the Eastern Himalayan state of Sikkim. The improvised k-means algorithm shows satisfactory results compared to existing methods, including the k-Nearest Neighbor and maximum likelihood classifiers. The strength of the ANN classifier lies in its speed and good recognition rate, and its capability of self-learning compared to other classification algorithms has made it widely accepted. The ANN-based classifier shows satisfactory and accurate results in comparison with the classical method.
Submitted 22 March, 2010;
originally announced March 2010.