-
Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation
Authors:
Amartya Sanyal,
Yaxi Hu,
Yaodong Yu,
Yian Ma,
Yixin Wang,
Bernhard Schölkopf
Abstract:
"Accuracy-on-the-line" is a widely observed phenomenon in machine learning, where a model's accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisance features can be sufficient to shatter the Accuracy-on-the-line phenomenon. In these cases, ID and OOD accuracy can become negatively correlated, leading to "Accuracy-on-the-wrong-line". This phenomenon can also occur in the presence of spurious (shortcut) features, which tend to overshadow the more complex signal (core, non-spurious) features, resulting in a large nuisance feature space. Moreover, scaling to larger datasets does not mitigate this undesirable behavior and may even exacerbate it. We formally prove a lower bound on Out-of-distribution (OOD) error in a linear classification model, characterizing the conditions on the noise and nuisance features for a large OOD error. We finally demonstrate this phenomenon across both synthetic and real datasets with noisy data and nuisance features.
Submitted 27 June, 2024;
originally announced June 2024.
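
The ID/OOD relationship described in the abstract above can be checked numerically by correlating the two accuracies across a set of models. The following is a minimal sketch, assuming made-up accuracy values; the lists `id_acc`, `ood_on_line`, and `ood_wrong_line` are hypothetical and are not results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical accuracies for four models (illustrative values only).
id_acc = [0.80, 0.85, 0.90, 0.95]
ood_on_line = [0.60, 0.66, 0.71, 0.78]     # OOD accuracy rises with ID accuracy
ood_wrong_line = [0.70, 0.64, 0.57, 0.50]  # OOD accuracy falls as ID accuracy rises

r_on_line = pearson(id_acc, ood_on_line)
r_wrong_line = pearson(id_acc, ood_wrong_line)

print("accuracy-on-the-line correlation:", round(r_on_line, 3))
print("accuracy-on-the-wrong-line correlation:", round(r_wrong_line, 3))
```

A positive coefficient corresponds to the usual "accuracy-on-the-line" behaviour, while a negative one corresponds to the "wrong line" regime the abstract describes.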
-
Holographic description of unified early and late universe in viscous mimetic gravity
Authors:
G. S. Khadekar,
Saibal Ray,
Aritra Sanyal
Abstract:
In this study, we explore the mimetic matter model proposed by Chamseddine and Mukhanov (J. High Energy Phys. 11, 135, 2013), utilizing the holographic principle to coherently describe both the early and late universe when bulk viscosity is present in the inhomogeneous equation of state. Our examination of the universe's evolution is based on the generalized infrared-cutoff holographic dark energy model detailed by Nojiri and Odintsov (Eur. Phys. J. C 77, 528, 2017) within the context of the flat FRW model. From a holographic perspective, we derive the energy conservation equation incorporating mimetic matter through a viscous holographic fluid model. Furthermore, we analyze various scenarios of bulk viscosity by assuming a constant equation of state parameter and derive the infrared cut-off expression in terms of the particle horizon. We demonstrate that within the framework of mimetic gravity, there is a class of solutions comparable to those in General Relativity, with an additional contribution from a non-relativistic mimetic matter component. These solutions can effectively describe dark matte
Submitted 11 June, 2024;
originally announced June 2024.
-
Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost
Authors:
Masha Belyi,
Robert Friel,
Shuai Shao,
Atindriyo Sanyal
Abstract:
Retriever Augmented Generation (RAG) systems have become pivotal in enhancing the capabilities of language models by incorporating external knowledge retrieval mechanisms. However, a significant challenge in deploying these systems in industry applications is the detection and mitigation of hallucinations: instances where the model generates information that is not grounded in the retrieved context. Addressing this issue is crucial for ensuring the reliability and accuracy of responses generated by large language models (LLMs) in diverse industry settings. Current hallucination detection techniques fail to deliver accuracy, low latency, and low cost simultaneously. We introduce Luna: a DeBERTA-large (440M) encoder, finetuned for hallucination detection in RAG settings. We demonstrate that Luna outperforms GPT-3.5 and commercial evaluation frameworks on the hallucination detection task, with 97% and 91% reduction in cost and latency, respectively. Luna is lightweight and generalizes across multiple industry verticals and out-of-domain data, making it an ideal candidate for industry LLM applications.
Submitted 5 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
The Role of Learning Algorithms in Collective Action
Authors:
Omri Ben-Dov,
Jake Fawkes,
Samira Samadi,
Amartya Sanyal
Abstract:
Collective action in machine learning is the study of the control that a coordinated group can have over machine learning algorithms. While previous research has concentrated on assessing the impact of collectives against Bayes (sub-)optimal classifiers, this perspective is limited in that it does not account for the choice of learning algorithm. Since classifiers seldom behave like Bayes classifiers and are influenced by the choice of learning algorithms along with their inherent biases, in this work we initiate the study of how the choice of the learning algorithm plays a role in the success of a collective in practical settings. Specifically, we focus on distributionally robust optimization (DRO), popular for improving a worst group error, and on the ubiquitous stochastic gradient descent (SGD), due to its inductive bias for "simpler" functions. Our empirical results, supported by a theoretical foundation, show that the effective size and success of the collective are highly dependent on properties of the learning algorithm. This highlights the necessity of taking the learning algorithm into account when studying the impact of collective action in machine learning.
Submitted 4 June, 2024; v1 submitted 10 May, 2024;
originally announced May 2024.
-
Provable Privacy with Non-Private Pre-Processing
Authors:
Yaxi Hu,
Amartya Sanyal,
Bernhard Schölkopf
Abstract:
When analysing Differentially Private (DP) machine learning pipelines, the potential privacy cost of data-dependent pre-processing is frequently overlooked in privacy accounting. In this work, we propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms. Our framework establishes upper bounds on the overall privacy guarantees by utilising two new technical notions: a variant of DP termed Smooth DP and the bounded sensitivity of the pre-processing algorithms. In addition to the generic framework, we provide explicit overall privacy guarantees for multiple data-dependent pre-processing algorithms, such as data imputation, quantization, deduplication and PCA, when used in combination with several DP algorithms. Notably, this framework is also simple to implement, allowing direct integration into existing DP pipelines.
Submitted 21 June, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
On the Growth of Mistakes in Differentially Private Online Learning: A Lower Bound Perspective
Authors:
Daniil Dmitriev,
Kristóf Szabó,
Amartya Sanyal
Abstract:
In this paper, we provide lower bounds for Differentially Private (DP) Online Learning algorithms. Our result shows that, for a broad class of $(\varepsilon,δ)$-DP online algorithms, for $T$ such that $\log T\leq O(1 / δ)$, the expected number of mistakes incurred by the algorithm grows as $Ω(\log \frac{T}δ)$. This matches the upper bound obtained by Golowich and Livni (2021) and is in contrast to non-private online learning where the number of mistakes is independent of $T$. To the best of our knowledge, our work is the first result towards settling lower bounds for DP-Online learning and partially addresses the open question in Sanyal and Ramponi (2022).
Submitted 26 February, 2024;
originally announced February 2024.
-
Corrective Machine Unlearning
Authors:
Shashwat Goel,
Ameya Prabhu,
Philip Torr,
Ponnurangam Kumaraguru,
Amartya Sanyal
Abstract:
Machine Learning models increasingly face data integrity challenges due to the use of large-scale training datasets drawn from the internet. We study what model developers can do if they detect that some data was manipulated or incorrect. Such manipulated data can cause adverse effects like vulnerability to backdoored samples, systematic biases, and in general, reduced accuracy on certain input domains. Often, all manipulated training samples are not known, and only a small, representative subset of the affected data is flagged.
We formalize "Corrective Machine Unlearning" as the problem of mitigating the impact of data affected by unknown manipulations on a trained model, possibly knowing only a subset of impacted samples. We demonstrate that the problem of corrective unlearning has significantly different requirements from traditional privacy-oriented unlearning. We find most existing unlearning methods, including the gold-standard retraining-from-scratch, require most of the manipulated data to be identified for effective corrective unlearning. However, one approach, SSD, achieves limited success in unlearning adverse effects with just a small portion of the manipulated samples, showing the tractability of this setting. We hope our work spurs research towards developing better methods for corrective unlearning and offers practitioners a new strategy to handle data integrity challenges arising from web-scale training.
Submitted 21 February, 2024;
originally announced February 2024.
-
Reconstructing modified and alternative theories of gravity
Authors:
Dalia Saha,
Manas Chakrabortty,
Abhik Kumar Sanyal
Abstract:
A viable radiation dominated era in the early universe is best described by the standard (FLRW) model of cosmology. In this short review, we demonstrate reconstruction of the forms of F(R) in the modified theory of gravity and the metric compatible F(T) together with the symmetric F(Q) in alternative teleparallel theories of gravity, from different perspectives, primarily rendering emphasis on a viable FLRW radiation era. Inflation has also been studied for a particular choice of the scalar potential. The inflationary parameters are found to agree appreciably with the recently released observational data.
Submitted 20 January, 2024;
originally announced January 2024.
-
Can semi-supervised learning use all the data effectively? A lower bound perspective
Authors:
Alexandru Ţifrea,
Gizem Yüce,
Amartya Sanyal,
Fanny Yang
Abstract:
Prior works have shown that semi-supervised learning algorithms can leverage unlabeled data to improve over the labeled sample complexity of supervised learning (SL) algorithms. However, existing theoretical analyses focus on regimes where the unlabeled data is sufficient to learn a good decision boundary using unsupervised learning (UL) alone. This begs the question: Can SSL algorithms simultaneously improve upon both UL and SL? To this end, we derive a tight lower bound for 2-Gaussian mixture models that explicitly depends on the labeled and the unlabeled dataset size as well as the signal-to-noise ratio of the mixture distribution. Surprisingly, our result implies that no SSL algorithm can improve upon the minimax-optimal statistical error rates of SL or UL algorithms for these distributions. Nevertheless, we show empirically on real-world data that SSL algorithms can still outperform UL and SL methods. Therefore, our work suggests that, while proving performance gains for SSL algorithms is possible, it requires careful tracking of constants.
Submitted 30 November, 2023;
originally announced November 2023.
-
Chainpoll: A high efficacy method for LLM hallucination detection
Authors:
Robert Friel,
Atindriyo Sanyal
Abstract:
Large language models (LLMs) have experienced notable advancements in generating coherent and contextually relevant responses. However, hallucinations - incorrect or unfounded claims - are still prevalent, prompting the creation of automated metrics to detect these in LLM outputs. Our contributions include: introducing ChainPoll, an innovative hallucination detection method that excels compared to its counterparts, and unveiling RealHall, a refined collection of benchmark datasets to assess hallucination detection metrics from recent studies. While creating RealHall, we assessed tasks and datasets from previous hallucination detection studies and observed that many are not suitable for the potent LLMs currently in use. Overcoming this, we opted for four datasets challenging for modern LLMs and pertinent to real-world scenarios. Using RealHall, we conducted a comprehensive comparison of ChainPoll with numerous hallucination metrics from recent studies. Our findings indicate that ChainPoll outperforms in all RealHall benchmarks, achieving an overall AUROC of 0.781. This surpasses the next best theoretical method by 11% and exceeds industry standards by over 23%. Additionally, ChainPoll is cost-effective and offers greater transparency than other metrics. We introduce two novel metrics to assess LLM hallucinations: Adherence and Correctness. Adherence is relevant to Retrieval Augmented Generation workflows, evaluating an LLM's analytical capabilities within given documents and contexts. In contrast, Correctness identifies logical and reasoning errors.
Submitted 22 October, 2023;
originally announced October 2023.
-
Inflation and cosmological evolution with $F(R,G)$ gravity theory
Authors:
Dalia Saha,
Jyoti Prasad Saha,
Abhik Kumar Sanyal
Abstract:
In the last decade, Planck PR4 data, together with ground-based experimental data such as BK18, BAO and CMB lensing, tightened the constraint on the tensor-to-scalar ratio from $r < 0.14$ to $r < 0.032$, while the spectral index lies within the range $0.9631 < n_s < 0.9705$. The viability of modified gravity theories, proposed as alternatives to the dark-energy issue, should therefore be tested in the light of these new results. Here, we explore $F(R,G)$ gravity theory in the context of the early universe and show that it is not compatible with the newly released constraints on $r$ and $n_s$ simultaneously. Further, it also fails to produce a feasible radiation-dominated era. This calls into question the justification for using the model to resolve the cosmic puzzle.
Submitted 18 August, 2023;
originally announced August 2023.
-
Geometric Extended State Observer on SE(3) with Fast Finite-Time Stability: Theory and Validation on a Rotorcraft Aerial Vehicle
Authors:
Ningshan Wang,
Reza Hamrah,
Amit K. Sanyal,
Mark N. Glauser
Abstract:
This article presents an extended state observer for a vehicle modeled as a rigid body in three-dimensional translational and rotational motions. The extended state observer is applicable to a rotorcraft aerial vehicle with a fixed plane of rotors, modeled as an under-actuated system on the tangent bundle of the six-dimensional Lie group of rigid body motions, SE(3). The extended state observer is designed to estimate the resultant external disturbance force and disturbance torque acting on the vehicle. It guarantees stable convergence of disturbance estimation errors in finite time when the disturbances are constant, and finite-time convergence to a bounded neighborhood of zero errors for time-varying disturbances. To obtain fast convergence, this extended state observer design is based on a Hölder-continuous fast finite-time stable differentiator that is similar to the super-twisting algorithm. Numerical simulations are conducted to validate the proposed extended state observer. The proposed extended state observer is compared with other existing research to show its advantages. A set of experimental results implementing disturbance rejection control using feedback of disturbance estimates from the extended state observer is also presented.
Submitted 17 July, 2023;
originally announced July 2023.
-
Geometric Active Disturbance Rejection Control of Rotorcraft on $SE(3)$ with Fast Finite-Time Stability
Authors:
Ningshan Wang,
Reza Hamrah,
Amit K. Sanyal,
Mark N. Glauser
Abstract:
This article presents a tracking control framework enhanced by an extended state observer for a rotorcraft aerial vehicle modeled as a rigid body in three-dimensional translational and rotational motions. The system is considered as an underactuated system on the tangent bundle of the six-dimensional Lie group of rigid body motions, $SE(3)$. The extended state observer is designed to estimate the resultant external disturbance force and disturbance torque acting on the vehicle. It guarantees stable convergence of disturbance estimation errors in finite time when the disturbances are constant, and finite-time convergence to a bounded neighborhood of zero errors for time-varying disturbances. To obtain fast convergence, this extended state observer design is based on a Hölder-continuous fast finite-time stable differentiator that is similar to the super-twisting algorithm. A tracking control scheme that uses the estimated disturbances from the extended state observer for disturbance rejection is designed to achieve fast finite-time stable tracking control. Numerical simulations are conducted to validate the proposed extended state observer and tracking control scheme with disturbance rejection. The proposed extended state observer is compared with other existing research to show its superiority.
Submitted 13 June, 2023;
originally announced June 2023.
-
How robust accuracy suffers from certified training with convex relaxations
Authors:
Piersilvio De Bartolomeis,
Jacob Clarysse,
Amartya Sanyal,
Fanny Yang
Abstract:
Adversarial attacks pose significant threats to deploying state-of-the-art classifiers in safety-critical applications. Two classes of methods have emerged to address this issue: empirical defences and certified defences. Although certified defences come with robustness guarantees, empirical defences such as adversarial training enjoy much higher popularity among practitioners. In this paper, we systematically compare the standard and robust error of these two robust training paradigms across multiple computer vision tasks. We show that in most tasks and for both $\ell_\infty$-ball and $\ell_2$-ball threat models, certified training with convex relaxations suffers from worse standard and robust error than adversarial training. We further explore how the error gap between certified and adversarial training depends on the threat model and the data distribution. In particular, besides the perturbation budget, we identify as important factors the shape of the perturbation set and the implicit margin of the data distribution. We support our arguments with extensive ablations on both synthetic and image datasets.
Submitted 12 June, 2023;
originally announced June 2023.
-
PILLAR: How to make semi-private learning more effective
Authors:
Francesco Pinto,
Yaxi Hu,
Fanny Yang,
Amartya Sanyal
Abstract:
In Semi-Supervised Semi-Private (SP) learning, the learner has access to both public unlabelled and private labelled data. We propose a computationally efficient algorithm that, under mild assumptions on the data, provably achieves significantly lower private labelled sample complexity and can be efficiently run on real-world datasets. For this purpose, we leverage the features extracted by networks pre-trained on public (labelled or unlabelled) data, whose distribution can significantly differ from the one on which SP learning is performed. To validate its empirical effectiveness, we propose a wide variety of experiments under tight privacy constraints ($ε= 0.1$) and with a focus on low-data regimes. In all of these settings, our algorithm exhibits significantly improved performance over available baselines that use similar amounts of public data.
Submitted 6 June, 2023;
originally announced June 2023.
-
Certifying Ensembles: A General Certification Theory with S-Lipschitzness
Authors:
Aleksandar Petrov,
Francisco Eiras,
Amartya Sanyal,
Philip H. S. Torr,
Adel Bibi
Abstract:
Improving and guaranteeing the robustness of deep learning models has been a topic of intense research. Ensembling, which combines several classifiers to provide a better model, has been shown to be beneficial for generalisation, uncertainty estimation, calibration, and mitigating the effects of concept drift. However, the impact of ensembling on certified robustness is less well understood. In this work, we generalise Lipschitz continuity by introducing S-Lipschitz classifiers, which we use to analyse the theoretical robustness of ensembles. Our results are precise conditions under which ensembles of robust classifiers are more robust than any constituent classifier, as well as conditions under which they are less robust.
Submitted 25 April, 2023;
originally announced April 2023.
-
A viable form of the metric Teleparallel F(T) theory of gravity
Authors:
Manas Chakrabortty,
Nayem Sk,
Abhik Kumar Sanyal
Abstract:
Unlike F(R) gravity, pure metric F(T) gravity in the vacuum-dominated era ends up with an imaginary action and is therefore not feasible. This eerie situation may only be circumvented by associating a scalar field, which can also drive inflation in the very early universe. We show that, despite diverse claims, F(T) theory admits Noether symmetry only in the pressure-less dust era, in the form F(T) proportional to the nth power of T, with n an odd integer. A suitable form of F(T), admitting a viable Friedmann-like radiation-dominated era together with early deceleration and late-time accelerated expansion in the pressure-less dust era, has been proposed.
Submitted 13 April, 2023; v1 submitted 9 April, 2023;
originally announced April 2023.
-
Certified private data release for sparse Lipschitz functions
Authors:
Konstantin Donhauser,
Johan Lokna,
Amartya Sanyal,
March Boedihardjo,
Robert Hönig,
Fanny Yang
Abstract:
As machine learning has become more relevant for everyday applications, a natural requirement is the protection of the privacy of the training data. When the relevant learning questions are unknown in advance, or hyper-parameter tuning plays a central role, one solution is to release a differentially private synthetic data set that leads to similar conclusions as the original training data. In this work, we introduce an algorithm that enjoys fast rates for the utility loss for sparse Lipschitz queries. Furthermore, we show how to obtain a certificate for the utility loss for a large class of algorithms.
Submitted 28 August, 2023; v1 submitted 19 February, 2023;
originally announced February 2023.
-
Perusing Buchbinder--Lyakhovich canonical formalism for Higher-Order Theories of Gravity
Authors:
Dalia Saha,
Abhik Kumar Sanyal
Abstract:
Ostrogradsky's, Dirac's and Horowitz's techniques of higher order theories of gravity produce identical phase-space structures. The problem is manifested in the case of Gauss-Bonnet-dilatonic coupled action in the presence of a higher-order term, in which case classical correspondence can't be established. Here, we explore yet another technique, developed long ago by Buchbinder and his collaborators (BL), and show that it also suffers from the same disease. However, if the action is expressed in terms of the three-space curvature and "the total derivative terms" are removed, all pathologies disappear when Horowitz's formalism or even Dirac's constraint analysis is pursued. Here we show that the same is true for the BL formalism, which appears to be the simplest of all the techniques to handle.
Submitted 12 January, 2023;
originally announced January 2023.
-
Synergetic Effect of Wall-Slip and Compressibility During Startup Flow of Complex Fluids
Authors:
Aniruddha Sanyal,
Sachin Balasaheb Shinde,
Lalit Kumar
Abstract:
The present letter explains the synergetic effect of wall-slip, compressibility, and thixotropy in a pressurized flow startup operation of various structured fluids. Contrary to intuition, experiments and numerical simulations suggest that wall-slip (adhesive failure) facilitates gel degradation (cohesive failure), revealing a new flow-startup mechanism. The thixotropic rheological model includes structural degradation kinetics in the bulk, whereas a static slip-based model addresses the near-wall phenomenon. The near-wall transient variations in axial velocity or strain evolution, and the initial pressure propagation mechanism along the axis of the circular pipe, explain the essence of the aforementioned synergy.
△ Less
Submitted 10 January, 2023;
originally announced January 2023.
-
Practice Makes Perfect: an iterative approach to achieve precise tracking for legged robots
Authors:
Jing Cheng,
Yasser G. Alqaham,
Amit K. Sanyal,
Zhenyu Gan
Abstract:
Precise trajectory tracking for legged robots can be challenging due to their high degrees of freedom, unmodeled nonlinear dynamics, or random disturbances from the environment. A commonly adopted solution to overcome these challenges is to use optimization-based algorithms and approximate the system with a simplified, reduced-order model. Additionally, deep neural networks are becoming a more promising option for achieving agile and robust legged locomotion. These approaches, however, either require large amounts of onboard calculations or the collection of millions of data points from a single robot. To address these problems and improve tracking performance, this paper proposes a method based on iterative learning control. This method lets a robot learn from its own mistakes by exploiting the repetitive nature of legged locomotion within only a few trials. Then, a torque library is created as a lookup table so that the robot does not need to repeat calculations or learn the same skill over and over again. This process resembles how animals learn their muscle memories in nature. The proposed method is tested on the A1 robot in a simulated environment, and it allows the robot to pronk at different speeds while precisely following the reference trajectories without heavy calculations.
Submitted 21 November, 2022;
originally announced November 2022.
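The iterative learning idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' controller: it assumes a hypothetical static plant (`run_plant`, output = 0.5 × input), a simple P-type update in which each trial adds a scaled copy of the previous trial's tracking error to the feedforward input, and a hypothetical `torque_library` dictionary keyed by speed that stands in for the lookup table.

```python
def ilc_trial_update(u, error, gain):
    """P-type iterative learning update: add scaled tracking error to the input."""
    return [ui + gain * ei for ui, ei in zip(u, error)]


def run_plant(u, plant_gain=0.5):
    """Hypothetical stand-in for one robot trial: a static gain from input to output."""
    return [plant_gain * ui for ui in u]


def learn_feedforward(reference, trials=20, gain=1.0):
    """Repeat the same task, correcting the input after every trial."""
    u = [0.0] * len(reference)
    for _ in range(trials):
        y = run_plant(u)
        error = [r - yi for r, yi in zip(reference, y)]
        u = ilc_trial_update(u, error, gain)
    return u


# Store the learned inputs per speed so they can be looked up instead of relearned.
torque_library = {}
for speed in (0.5, 1.0):
    reference = [speed * t for t in range(5)]
    torque_library[speed] = learn_feedforward(reference)
```

For this toy plant the tracking error shrinks by the factor |1 − 0.5 × gain| per trial, so the update converges only when that factor is below one; a real legged robot has dynamics that this sketch does not attempt to model.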
-
Robust Quantum Circuit for Clique Problem with Intermediate Qudits
Authors:
Arpita Sanyal,
Amit Saha,
Banani Saha,
Amlan Chakrabarti
Abstract:
The clique problem has a wide range of applications due to its pattern matching ability. There are various formulations of the clique problem, such as the $k$-clique problem, the maximum clique problem, etc. The $k$-clique problem determines whether an arbitrary network has a clique or not, whereas the maximum clique problem finds the largest clique in a graph. It is already exhibited in the literature that the $k$-clique or maximum clique problem (NP-problem) can be solved in an asymptotically faster manner by using quantum algorithms compared to conventional computing. Quantum computing with higher dimensions is gaining popularity due to its large storage capacity and computation power. In this article, we have shown an improved quantum circuit implementation for the $k$-clique problem and maximum clique problem (MCP) with the help of higher-dimensional intermediate temporary qudits for the first time to the best of our knowledge. The cost of the state-of-the-art quantum circuit for the $k$-clique problem is colossal due to a huge number of $n$-qubit Toffoli gates. We have exhibited an improved cost and depth over the circuit by applying a generalized $n$-qubit Toffoli gate decomposition with intermediate ququarts (4-dimensional qudits).
Submitted 15 November, 2022;
originally announced November 2022.
-
Current Landscape of Mesenchymal Stem Cell Therapy in COVID Induced Acute Respiratory Distress Syndrome
Authors:
Adrita Chanda,
Adrija Aich,
Arka Sanyal,
Anantika Chandra,
Saumyadeep Goswami
Abstract:
The severe acute respiratory syndrome coronavirus 2 outbreak in China's Hubei area in late 2019 has now created a global pandemic that has spread to over 150 countries. In most people, COVID 19 is a respiratory infection that produces fever, cough, and shortness of breath. Patients with severe COVID 19 may develop ARDS. MSCs can come from a number of places, such as bone marrow, umbilical cord, and adipose tissue. Because of their easy accessibility and low immunogenicity, MSCs were often used in animal and clinical research. In recent studies, MSCs have been shown to decrease inflammation, enhance lung permeability, improve microbial and alveolar fluid clearance, and accelerate lung epithelial and endothelial repair. Furthermore, MSC-based therapy has shown promising outcomes in preclinical studies and phase 1 clinical trials in sepsis and ARDS. In this paper, we posit the therapeutic strategies using MSC and dissect how and why MSC therapy is a potential treatment option for COVID 19 induced ARDS. We cite numerous promising clinical trials, elucidate the potential advantages of MSC therapy for COVID 19 ARDS patients, examine the detriments of this therapeutic strategy and suggest possibilities of subsequent research.
Submitted 5 November, 2022;
originally announced November 2022.
-
Identification and Molecular Dynamic Simulation of Flavonoids from Mediterranean species of Oregano against the Zika NS2B-NS3 Protease
Authors:
Anushikha Ghosh,
Arka Sanyal,
Sameer Sharma
Abstract:
The Zika virus is an emerging infectious disease causing severe complications such as microcephaly in infants and Guillain Barre syndrome in adults. There is no licensed vaccination or approved medicine to treat ZIKV infection. Therefore, extensive research is being carried out to find compounds that can be used effectively as therapeutic molecules to treat ZIKV infection. Oregano, a commonly found herb in the Mediterranean region, has been used predominantly for culinary purposes. The fact that the members of the Origanum species are a storehouse of various bioactive compounds gives us a solid reason to study compounds extracted from it for therapeutic purposes. In this study, we retrieved 20 flavonoids found in various Origanum species belonging to the Mediterranean region from the PubChem database, and pharmacological analysis using SwissADME and molecular docking using AutoDock Vina 4.0 were carried out against the NS2B NS3 protease, since it serves as an effective drug target owing to its role in viral replication and immune evasion within the host. The best hit compounds were subjected to MD simulation at 100 ns using Desmond Schrodinger to analyze the molecule's stability. We observed Cirsiliol as the best hit compound against the NS2B NS3 complex with a binding affinity of -8.5 kcal per mol. It also showed good stability during MD simulation. We recommend the use of Cirsiliol for in vitro and in vivo studies for further investigation concerning the ZIKA virus.
Submitted 5 November, 2022;
originally announced November 2022.
-
Do you pay for Privacy in Online learning?
Authors:
Amartya Sanyal,
Giorgia Ramponi
Abstract:
Online learning, in the mistake bound model, is one of the most fundamental concepts in learning theory. Differential privacy, instead, is the most widely used statistical concept of privacy in the machine learning community. It is thus clear that defining learning problems that are online differentially privately learnable is of great interest. In this paper, we pose the question of whether the two problems are equivalent from a learning perspective, i.e., is privacy for free in the online learning framework?
Submitted 10 October, 2022;
originally announced October 2022.
-
Probing symmetric teleparallel gravity in the early universe
Authors:
Avik De,
Dalia Saha,
Ganesh Subramaniam,
Abhik Kumar Sanyal
Abstract:
The general theory of relativity can be equivalently formulated on a flat spacetime that associates with a torsion-free affine connection of non-vanishing non-metricity scalar $Q$. In this work, we present an extension of this, the $f(Q)$ theory of gravity, and explore the early evolution of the universe in the background of the anisotropic Bianchi-I model. The $f(Q)$ theory in the current setting, through its geometric modification, is quite successful in explaining the late time accelerated expansion. Here we notice that it also fits inflationary parameters with excellent precision, but fails to produce a decelerated expansion in the radiation dominated era.
Submitted 24 September, 2022;
originally announced September 2022.
-
A law of adversarial risk, interpolation, and label noise
Authors:
Daniel Paleka,
Amartya Sanyal
Abstract:
In supervised learning, it has been shown that label noise in the data can be interpolated without penalties on test accuracy. We show that interpolating label noise induces adversarial vulnerability, and prove the first theorem showing the relationship between label noise and adversarial risk for any data distribution. Our results are almost tight if we do not make any assumptions on the inductive bias of the learning algorithm. We then investigate how different components of this problem affect this result, including properties of the distribution. We also discuss non-uniform label noise distributions; and prove a new theorem showing uniform label noise induces nearly as large an adversarial risk as the worst poisoning with the same noise rate. Then, we provide theoretical and empirical evidence that uniform label noise is more harmful than typical real-world label noise. Finally, we show how inductive biases amplify the effect of label noise and argue the need for future work in this direction.
Submitted 13 March, 2023; v1 submitted 8 July, 2022;
originally announced July 2022.
-
Discrete-time Rigid Body Pose Estimation based on Lagrange-d'Alembert principle
Authors:
Maulik Bhatt,
Srikant Sukumar,
Amit K Sanyal
Abstract:
The problem of rigid body pose estimation is treated in discrete-time via discrete Lagrange-d'Alembert principle and discrete Lyapunov methods. The position and attitude of the rigid body are to be estimated simultaneously with the help of vision and inertial sensors. For the discrete-time estimation of pose, the continuous-time rigid body kinematics equations are discretized appropriately. We approach the pose estimation problem as minimising the energies stored in the errors of estimated quantities. With the help of measurements obtained through optical sensors, artificial rotational and translation potential energy-like terms have been designed. Similarly, artificial rotational and translation kinetic energy-like terms have been devised using inertial sensor measurements. This allows us to construct a discrete-time Lagrangian as the difference of the kinetic and potential energy like terms, to which a Lagrange-d'Alembert principle is applied to obtain an optimal pose estimation filter. The dissipation terms in the optimal filter are designed through discrete-Lyapunov analysis on a suitably constructed Morse-Lyapunov function and the overall scheme is proven to be almost globally asymptotically stable. The filtering scheme is simulated using noisy sensor data to verify the theoretical properties.
Submitted 17 June, 2022;
originally announced June 2022.
-
How Robust is Unsupervised Representation Learning to Distribution Shift?
Authors:
Yuge Shi,
Imant Daunhawer,
Julia E. Vogt,
Philip H. S. Torr,
Amartya Sanyal
Abstract:
The robustness of machine learning algorithms to distribution shift is primarily discussed in the context of supervised learning (SL). As such, there is a lack of insight into the robustness of the representations learned from unsupervised methods, such as self-supervised learning (SSL) and auto-encoder based algorithms (AE), to distribution shift. We posit that the input-driven objectives of unsupervised algorithms lead to representations that are more robust to distribution shift than the target-driven objective of SL. We verify this by extensively evaluating the performance of SSL and AE on both synthetic and realistic distribution shift datasets. Following observations that the linear layer used for classification itself can be susceptible to spurious correlations, we evaluate the representations using a linear head trained on a small amount of out-of-distribution (OOD) data, to isolate the robustness of the learned representations from that of the linear head. We also develop "controllable" versions of existing realistic domain generalisation datasets with adjustable degrees of distribution shifts. This allows us to study the robustness of different learning algorithms under versatile yet realistic distribution shift conditions. Our experiments show that representations learned from unsupervised learning algorithms generalise better than SL under a wide variety of extreme as well as realistic distribution shifts.
Submitted 16 December, 2022; v1 submitted 17 June, 2022;
originally announced June 2022.
-
Catastrophic overfitting can be induced with discriminative non-robust features
Authors:
Guillermo Ortiz-Jiménez,
Pau de Jorge,
Amartya Sanyal,
Adel Bibi,
Puneet K. Dokania,
Pascal Frossard,
Grégory Rogez,
Philip H. S. Torr
Abstract:
Adversarial training (AT) is the de facto method for building robust neural networks, but it can be computationally expensive. To mitigate this, fast single-step attacks can be used, but this may lead to catastrophic overfitting (CO). This phenomenon appears when networks gain non-trivial robustness during the first stages of AT, but then reach a breaking point where they become vulnerable in just a few iterations. The mechanisms that lead to this failure mode are still poorly understood. In this work, we study the onset of CO in single-step AT methods through controlled modifications of typical datasets of natural images. In particular, we show that CO can be induced at much smaller $ε$ values than it was observed before just by injecting images with seemingly innocuous features. These features aid non-robust classification but are not enough to achieve robustness on their own. Through extensive experiments we analyze this novel phenomenon and discover that the presence of these easy features induces a learning shortcut that leads to CO. Our findings provide new insights into the mechanisms of CO and improve our understanding of the dynamics of AT. The code to reproduce our experiments can be found at https://github.com/gortizji/co_features.
Submitted 15 August, 2023; v1 submitted 16 June, 2022;
originally announced June 2022.
-
How unfair is private learning ?
Authors:
Amartya Sanyal,
Yaxi Hu,
Fanny Yang
Abstract:
As machine learning algorithms are deployed on sensitive data in critical decision making processes, it is becoming increasingly important that they are also private and fair. In this paper, we show that, when the data has a long-tailed structure, it is not possible to build accurate learning algorithms that are both private and result in higher accuracy on minority subpopulations. We further show that relaxing overall accuracy can lead to good fairness even with strict privacy requirements. To corroborate our theoretical results in practice, we provide an extensive set of experimental results using a variety of synthetic, vision (CIFAR10 and CelebA), and tabular (Law School) datasets and learning algorithms.
Submitted 24 December, 2022; v1 submitted 8 June, 2022;
originally announced June 2022.
-
Input Influence Matrix Design for MIMO Discrete-Time Ultra-Local Model
Authors:
Sangli Teng,
Amit K. Sanyal,
Ram Vasudevan,
Anthony Bloch,
Maani Ghaffari
Abstract:
Ultra-Local Models (ULM) have been applied to perform model-free control of nonlinear systems with unknown or partially known dynamics. Unfortunately, extending these methods to MIMO systems requires designing a dense input influence matrix, which is challenging. This paper presents guidelines for designing an input influence matrix for discrete-time, control-affine MIMO systems using a ULM-based controller. This paper analyzes the case that uses ULM and a model-free control scheme: the Hölder-continuous Finite-Time Stable (FTS) control. By comparing the ULM with the actual system dynamics, the paper describes how to extract the input-dependent part from the lumped ULM dynamics and finds that the tracking and state estimation errors are coupled. The stability of the ULM-FTS error dynamics is affected by the eigenvalues of the difference (defined by matrix multiplication) between the actual and designed input influence matrix. Finally, the paper shows that a wide range of input influence matrix designs can keep the ULM-FTS error dynamics (at least locally) asymptotically stable. A numerical simulation is included to verify the result. The analysis can also be extended to other ULM-based controllers.
Submitted 16 March, 2022;
originally announced March 2022.
-
Make Some Noise: Reliable and Efficient Single-Step Adversarial Training
Authors:
Pau de Jorge,
Adel Bibi,
Riccardo Volpi,
Amartya Sanyal,
Philip H. S. Torr,
Grégory Rogez,
Puneet K. Dokania
Abstract:
Recently, Wong et al. showed that adversarial training with single-step FGSM leads to a characteristic failure mode named Catastrophic Overfitting (CO), in which a model becomes suddenly vulnerable to multi-step attacks. Experimentally they showed that simply adding a random perturbation prior to FGSM (RS-FGSM) could prevent CO. However, Andriushchenko and Flammarion observed that RS-FGSM still leads to CO for larger perturbations, and proposed a computationally expensive regularizer (GradAlign) to avoid it. In this work, we methodically revisit the role of noise and clipping in single-step adversarial training. Contrary to previous intuitions, we find that using a stronger noise around the clean sample combined with \textit{not clipping} is highly effective in avoiding CO for large perturbation radii. We then propose Noise-FGSM (N-FGSM) that, while providing the benefits of single-step adversarial training, does not suffer from CO. Empirical analyses on a large suite of experiments show that N-FGSM is able to match or surpass the performance of previous state-of-the-art GradAlign, while achieving 3x speed-up. Code can be found in https://github.com/pdejorge/N-FGSM
Submitted 17 October, 2022; v1 submitted 2 February, 2022;
originally announced February 2022.
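The two ingredients named in the abstract above, noise around the clean sample and no clipping after the gradient-sign step, can be sketched for a toy linear logistic model. This is an illustration under stated assumptions, not the authors' implementation (which is at the repository linked in the abstract): the function names, the uniform noise range `noise_scale * eps`, and the default step size equal to `eps` are all choices made here for the example.

```python
import math
import random


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def logistic_loss_grad_x(w, b, x, y):
    """Gradient of log(1 + exp(-y * (w.x + b))) with respect to x, for y in {-1, +1}."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]


def noisy_single_step_example(w, b, x, y, eps, noise_scale=2.0, step=None, rng=random):
    """Build one training-time adversarial example with noise and no clipping."""
    step = eps if step is None else step
    # 1. Perturb the clean sample with uniform noise (assumed range: noise_scale * eps).
    x_noisy = [xi + rng.uniform(-noise_scale * eps, noise_scale * eps) for xi in x]
    # 2. Take a single gradient-sign (FGSM-style) step from the noisy point.
    grad = logistic_loss_grad_x(w, b, x_noisy, y)
    # 3. Deliberately skip the projection back onto the eps-ball around x.
    return [xn + step * sign(gi) for xn, gi in zip(x_noisy, grad)]
```

Because step 3 omits the projection, the total perturbation per coordinate can reach `noise_scale * eps + step`, which is larger than `eps`; an RS-FGSM-style variant would clip it back to the `eps`-ball at that point.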
-
The issue of Branched Hamiltonian in F(T) Teleparallel Gravity
Authors:
Manas Chakrabortty,
Kaushik Sarkar,
Abhik Kumar Sanyal
Abstract:
As in the case of Lanczos-Lovelock gravity, the main advantage of F(T) gravity is said to be that it leads to second order field equations, while F(R) gravity theory leads to fourth order equations. We show that it is rather a disadvantage, since it leads to the unresolved issue of `Branched Hamiltonian'. The problem is bypassed in F(R,T) gravity theory.
Submitted 24 June, 2022; v1 submitted 20 January, 2022;
originally announced January 2022.
-
Towards Adversarial Evaluations for Inexact Machine Unlearning
Authors:
Shashwat Goel,
Ameya Prabhu,
Amartya Sanyal,
Ser-Nam Lim,
Philip Torr,
Ponnurangam Kumaraguru
Abstract:
Machine Learning models face increased concerns regarding the storage of personal user data and adverse impacts of corrupted data like backdoors or systematic bias. Machine Unlearning can address these by allowing post-hoc deletion of affected training data from a learned model. Achieving this task exactly is computationally expensive; consequently, recent works have proposed inexact unlearning algorithms to solve this approximately as well as evaluation methods to test the effectiveness of these algorithms.
In this work, we first outline some necessary criteria for evaluation methods and show no existing evaluation satisfies them all. Then, we design a stronger black-box evaluation method called the Interclass Confusion (IC) test which adversarially manipulates data during training to detect the insufficiency of unlearning procedures. We also propose two analytically motivated baseline methods (EU-k and CF-k) which outperform several popular inexact unlearning methods. Overall, we demonstrate how adversarial evaluation strategies can help in analyzing various unlearning phenomena which can guide the development of stronger unlearning algorithms.
Submitted 22 February, 2023; v1 submitted 17 January, 2022;
originally announced January 2022.
-
Inflation -- a Comparative Study Amongst Different Modified Gravity Theories
Authors:
Dalia Saha,
Abhik Kumar Sanyal
Abstract:
In recent years, a host of modified gravity models have been proposed as alternatives to dark energy. A quantum theory of gravity also requires modifying the `General Theory of Relativity'. In the present article, we consider five different modified theories of gravity, and compare inflationary parameters with recent data sets released by two Planck collaboration teams. Our analysis reveals that the scalar-tensor theory of gravity is the best alternative.
Submitted 7 January, 2022;
originally announced January 2022.
-
Inflation with F(T) Teleparallel Gravity
Authors:
Manas Chakrabortty,
Nayem Sk,
Susmita Sanyal,
Abhik Kumar Sanyal
Abstract:
We study the early universe with a particular form of F(T) Teleparallel gravity theory, in which inflation is driven by a scalar field. To ensure slow rollover, two different potentials are chosen in such a manner that they remain almost flat for large initial values of the scalar field. Inflationary parameters show wonderful fit with the presently available Planck's data set. The energy scale of inflation is sub-Planckian and graceful exit from inflation is also administered. The chosen form of F(T) administers late-time cosmic acceleration too. In the process, unification of the early inflation with late-time acceleration is ensured. Unfortunately, a decelerated radiation dominated era is only possible with a different form of (quartic) potential, which being devoid of a flat section does not admit slow rollover.
Submitted 17 December, 2021;
originally announced December 2021.
-
Fix your Models by Fixing your Datasets
Authors:
Atindriyo Sanyal,
Vikram Chatterji,
Nidhi Vyas,
Ben Epstein,
Nikita Demir,
Anthony Corletti
Abstract:
The quality of underlying training data is crucial for building performant machine learning models with wider generalizability. However, current machine learning (ML) tools lack streamlined processes for improving the data quality. So, getting data quality insights and iteratively pruning the errors to obtain a dataset which is most representative of downstream use cases is still an ad-hoc manual process. Our work addresses this data tooling gap, required to build improved ML workflows purely through data-centric techniques. More specifically, we introduce a systematic framework for (1) finding noisy or mislabelled samples in the dataset and, (2) identifying the most informative samples, which when included in training would provide maximal model performance lift. We demonstrate the efficacy of our framework on public as well as private enterprise datasets of two Fortune 500 companies, and are confident this work will form the basis for ML teams to perform more intelligent data discovery and pruning.
Submitted 14 December, 2021;
originally announced December 2021.
-
Multitude of Topological Phase Transitions in Bipartite Dice and Lieb Lattices with Interacting Electrons and Rashba Coupling
Authors:
Rahul Soni,
Amit Bikram Sanyal,
Nitin Kaushal,
Satoshi Okamoto,
Adriana Moreo,
Elbio Dagotto
Abstract:
We report the results of a Hartree-Fock study applied to interacting electrons moving in two different bipartite lattices: the dice and the Lieb lattices, at half-filling. Both lattices develop ferrimagnetic order in the phase diagram $U$-$λ$, where $U$ is the Hubbard onsite repulsion and $λ$ the Rashba spin-orbit coupling strength. Our main result is the observation of an unexpected multitude of topological phases for both lattices. All these phases are ferrimagnetic, but they differ among themselves in their set of six Chern numbers (six numbers because the unit cells have three atoms). The Chern numbers $|C|$ observed in our study range from 0 to 3, showing that large Chern numbers can be obtained by the effect of electronic correlations, adding to the recently discussed methodologies to increase $|C|$ based on extending the hopping range in tight-binding models, using sudden quenches, or photonic crystals, all without including electronic interactions.
Submitted 21 December, 2021; v1 submitted 7 October, 2021;
originally announced October 2021.
-
Identifying and Exploiting Structures for Reliable Deep Learning
Authors:
Amartya Sanyal
Abstract:
Deep learning research has recently witnessed an impressively fast-paced progress in a wide range of tasks including computer vision, natural language processing, and reinforcement learning. The extraordinary performance of these systems often gives the impression that they can be used to revolutionise our lives for the better. However, as recent works point out, these systems suffer from several issues that make them unreliable for use in the real world, including vulnerability to adversarial attacks (Szegedy et al. [248]), tendency to memorise noise (Zhang et al. [292]), being over-confident on incorrect predictions (miscalibration) (Guo et al. [99]), and unsuitability for handling private data (Gilad-Bachrach et al. [88]). In this thesis, we look at each of these issues in detail, investigate their causes, and propose computationally cheap algorithms for mitigating them in practice. To do this, we identify structures in deep neural networks that can be exploited to mitigate the above causes of unreliability of deep learning algorithms.
Submitted 16 August, 2021;
originally announced August 2021.
-
Managing ML Pipelines: Feature Stores and the Coming Wave of Embedding Ecosystems
Authors:
Laurel Orr,
Atindriyo Sanyal,
Xiao Ling,
Karan Goel,
Megan Leszczynski
Abstract:
The industrial machine learning pipeline requires iterating on model features, training and deploying models, and monitoring deployed models at scale. Feature stores were developed to manage and standardize the engineer's workflow in this end-to-end pipeline, focusing on traditional tabular feature data. In recent years, however, model development has shifted towards using self-supervised pretrained embeddings as model features. Managing these embeddings and the downstream systems that use them introduces new challenges with respect to managing embedding training data, measuring embedding quality, and monitoring downstream models that use embeddings. These challenges are largely unaddressed in standard feature stores. Our goal in this tutorial is to introduce the feature store system and discuss the challenges and current solutions to managing these new embedding-centric pipelines.
Submitted 11 August, 2021;
originally announced August 2021.
-
Conflict between some higher-order curvature invariant terms
Authors:
Dalia Saha,
Mohosin Alam,
Ranajit Mandal,
Abhik Kumar Sanyal
Abstract:
A viable quantum theory does not allow curvature invariant terms of different higher orders to be accommodated in the gravitational action. We show that there is indeed a conflict between the curvature squared and Gauss-Bonnet squared terms from the point of view of hermiticity. This means one should choose either, in addition to the Einstein-Hilbert term, but never the two together. We explore early cosmic evolution with Gauss-Bonnet squared term.
Submitted 26 October, 2021; v1 submitted 12 June, 2021;
originally announced June 2021.
-
String cosmology in Bianchi I space-time
Authors:
A. Banerjee,
Abhik Kumar Sanyal,
S. Chakraborty
Abstract:
Some cosmological solutions of massive strings are obtained in Bianchi I space-time following the techniques used by Letelier and Stachel. A class of solutions corresponds to string cosmology associated with/without a magnetic field and the other class consists of pure massive strings, obeying the Takabayashi equation of state.
Submitted 8 June, 2021;
originally announced June 2021.
-
Viscoelastic flow past an infinite plate with suction and constant heat flux
Authors:
Abhik Kumar Sanyal,
D. Ray
Abstract:
While studying the viscoelastic flow past an infinite plate with suction and constant heat flux between fluid and plate, Raptis and Tziyanidis gave the solution of a pair of equations for velocity and temperature as functions of distance. They then gave some approximate solutions. This letter shows that the approximations are not justified and presents an exact analytical study.
Submitted 1 June, 2021;
originally announced June 2021.
-
New conserved tensors and Brans-Dicke type field equation, using integrability condition
Authors:
Abhik Kumar Sanyal,
Bijan Modak,
Manas Chakrabortty
Abstract:
We explore some new off-shell and on-shell conserved quantities for a scalar field in Minkowski space, using integrability condition. The off-shell conserved tensors are related to the kinematics of the field, while a linear combination of the off-shell and the on-shell conserved tensors ends up with the energy-momentum tensor for the scalar field. In the curved background, using Ricci and Bianchi identities, Brans-Dicke type field equations emerge, without requiring the principle of equivalence. Further, starting from the curvature scalar and using these identities, the field equations for modified gravity (Einstein-Hilbert action in the presence of higher-order terms) follow.
Submitted 20 January, 2022; v1 submitted 27 May, 2021;
originally announced May 2021.
-
Bianchi II, VIII, and IX viscous fluid cosmology
Authors:
A. Banerjee,
Abhik Kumar Sanyal,
S. Chakraborty
Abstract:
In this paper we study the exact solutions for a viscous fluid distribution in Bianchi II, VIII, and IX models. The metric is simplified by assuming a relationship between the coefficients and the metric tensor. Solutions are obtained in two special cases: in one, an additional assumption is made where the matter density and the expansion scalar have a definite relation and in the other a barotropic equation of state between the matter density and the thermodynamic pressure is assumed. While the Bianchi II solutions are already found in the literature the other two classes of solutions are apparently new.
Submitted 22 May, 2021;
originally announced May 2021.
-
Canonical equivalence, quantization and anisotropic inflation in higher order theory of gravity
Authors:
Subhra Debnath,
Abhik Kumar Sanyal
Abstract:
We construct phase-space structure of a typical higher-order theory of gravity, in the background of anisotropic Bianchi-1 mini-superspace, following `Modified Horowitz Formalism' as well as applying `Dirac Algorithm' (after taking care of the divergent terms), and establish equivalence. Canonical quantization, and semiclassical approximation are performed to expatiate the fact that such a quantum theory transits successfully to a classical de-Sitter universe. Inflation has thereafter been studied. The numerical values of the inflationary parameters show excellent agreement with the latest released Planck's data.
Submitted 20 May, 2021;
originally announced May 2021.
-
Bianchi VI0 viscous fluid cosmology with magnetic field
Authors:
Marcelo Byrro Ribeiro,
Abhik Kumar Sanyal
Abstract:
A spatially homogeneous Bianchi type VI0 model containing a viscous fluid in the presence of an axial magnetic field has been studied. A barotropic equation of state together with a pair of linear relations among the square root of matter density, shear scalar, and expansion scalar have been assumed. Solutions are obtained in the presence of a magnetic field, only in two special cases, which are comparatively simpler. The complete solutions for this model in the absence of a magnetic field are also obtained. The presence of a magnetic field in the former case however, does not in effect cause any major modification in the fundamental nature of the initial singularity of the expanding model.
Submitted 17 May, 2021;
originally announced May 2021.
-
Homogeneous Anisotropic Cosmological Models with Viscous Fluid and Magnetic Field
Authors:
A. Banerjee,
Abhik Kumar Sanyal
Abstract:
The paper presents some exact solutions of Bianchi types I, III and Kantowski-Sachs cosmological models consisting of a dissipative fluid along with an axial magnetic field. A barotropic equation of state between the thermodynamic pressure and the matter density, together with a pair of linear relations between the matter density, the shear scalar, and the expansion scalar have been assumed for simplicity. The solutions are basically of two different types, one for the Bianchi-I and the other for Bianchi-III and Kantowski-Sachs type. The presence of the magnetic field, however, does not change the fundamental nature of the initial singularity.
Submitted 15 May, 2021;
originally announced May 2021.
-
Bianchi Type-II Cosmological Model with Viscous Fluid
Authors:
A. Banerjee,
S. B. Duttachoudhury,
Abhik Kumar Sanyal
Abstract:
A spatially homogeneous and locally rotationally symmetric Bianchi type-II cosmological model under the influence of both shear and bulk viscosity has been studied. Exact solutions are obtained with a barotropic equation of state between thermodynamic pressure and the energy density of the fluid, and considering the linear relationships amongst the energy density, the expansion scalar and the shear scalar. Special cases with vanishing bulk viscosity coefficients and with the perfect fluid in the absence of viscosity have also been studied. The formal appearance of the solutions is the same for both the viscous as well as the perfect fluids. The difference is only in choosing a constant parameter which appears in the solutions. In the cases of either a fluid with bulk viscosity alone or a perfect fluid, the barotropic equation of state is no longer an additional assumption to be imposed; rather it follows directly from the field equations.
△ Less
Submitted 8 May, 2021;
originally announced May 2021.