Showing 1–20 of 20 results for author: Patel, G

Searching in archive cs.
  1. arXiv:2402.11179  [pdf, other]

    cs.LG math.ST physics.comp-ph

    Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes

    Authors: Jeremiah Hauth, Cosmin Safta, Xun Huan, Ravi G. Patel, Reese E. Jones

    Abstract: The application of neural network models to scientific machine learning tasks has proliferated in recent years. In particular, neural network models have proved to be adept at modeling processes with spatial-temporal complexity. Nevertheless, these highly parameterized models have garnered skepticism in their ability to produce outputs with quantified error bounds over the regimes of interest. Hen…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 27 pages, 20 figures

  2. arXiv:2305.14991  [pdf, other]

    cs.CL cs.AI

    MuLER: Detailed and Scalable Reference-based Evaluation

    Authors: Taelin Karidi, Leshem Choshen, Gal Patel, Omri Abend

    Abstract: We propose a novel methodology (namely, MuLER) that transforms any reference-based evaluation metric for text generation, such as machine translation (MT) into a fine-grained analysis tool. Given a system and a metric, MuLER quantifies how much the chosen metric penalizes specific error types (e.g., errors in translating names of locations). MuLER thus enables a detailed error analysis which can l…

    Submitted 29 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  3. arXiv:2304.09750  [pdf, other]

    q-fin.CP cs.CE cs.LG quant-ph

    Application of Tensor Neural Networks to Pricing Bermudan Swaptions

    Authors: Raj G. Patel, Tomas Dominguez, Mohammad Dib, Samuel Palmer, Andrea Cadarso, Fernando De Lope Contreras, Abdelkader Ratnani, Francisco Gomez Casanova, Senaida Hernández-Santana, Álvaro Díaz-Fernández, Eva Andrés, Jorge Luis-Hita, Escolástico Sánchez-Martínez, Samuel Mugel, Roman Orus

    Abstract: The Cheyette model is a quasi-Gaussian volatility interest rate model widely used to price interest rate derivatives such as European and Bermudan Swaptions for which Monte Carlo simulation has become the industry standard. In low dimensions, these approaches provide accurate and robust prices for European Swaptions but, even in this computationally simple setting, they are known to underestimate…

    Submitted 10 March, 2024; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: 16 pages, 9 figures, 2 tables, minor changes

  4. arXiv:2302.14290  [pdf, other]

    cs.LG cs.CV

    Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation

    Authors: Gaurav Patel, Konda Reddy Mopuri, Qiang Qiu

    Abstract: Data-free Knowledge Distillation (DFKD) has gained popularity recently, with the fundamental idea of carrying out knowledge transfer from a Teacher neural network to a Student neural network in the absence of training data. However, in the Adversarial DFKD framework, the student network's accuracy suffers due to the non-stationary distribution of the pseudo-samples under multiple generator update…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: Accepted at CVPR 2023

  5. arXiv:2212.14076  [pdf, other]

    q-fin.PR cs.CE cs.LG quant-ph

    Quantum-Inspired Tensor Neural Networks for Option Pricing

    Authors: Raj G. Patel, Chia-Wei Hsing, Serkan Sahin, Samuel Palmer, Saeed S. Jahromi, Shivam Sharma, Tomas Dominguez, Kris Tziritas, Christophe Michel, Vincent Porte, Mustafa Abid, Stephane Aubert, Pierre Castellani, Samuel Mugel, Roman Orus

    Abstract: Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of such approaches of addressing the COD has led us to solving high-dimensional PDEs. This has resulted in opening doors to solving a variety of real-world problems ranging from mathematical finance to stochastic control for industrial applications. Altho…

    Submitted 10 March, 2024; v1 submitted 28 December, 2022; originally announced December 2022.

    Comments: 11 pages, 8 figures, minor changes. arXiv admin note: substantial text overlap with arXiv:2208.02235

  6. arXiv:2209.00641  [pdf, other]

    cs.CV

    Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition

    Authors: Gaurav Patel, Jan Allebach, Qiang Qiu

    Abstract: This paper looks at semi-supervised learning (SSL) for image-based text recognition. One of the most popular SSL approaches is pseudo-labeling (PL). PL approaches assign labels to unlabeled data before re-training the model with a combination of labeled and pseudo-labeled data. However, PL methods are severely degraded by noise and are prone to over-fitting to noisy labels, due to the inclusion of…

    Submitted 6 October, 2022; v1 submitted 30 August, 2022; originally announced September 2022.

    Comments: Accepted at WACV 2023

  7. arXiv:2207.02891  [pdf, other]

    cs.LG cs.AI

    Don't overfit the history -- Recursive time series data augmentation

    Authors: Amine Mohamed Aboussalah, Min-Jae Kwon, Raj G Patel, Cheng Chi, Chi-Guhn Lee

    Abstract: Time series observations can be seen as realizations of an underlying dynamical system governed by rules that we typically do not know. For time series learning tasks, we need to understand that we fit our model on available data, which is a unique realized history. Training on a single realization often induces severe overfitting lacking generalization. To address this issue, we introduce a gener…

    Submitted 28 January, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: Accepted to ICLR 2023. Resubmitted here due to major change in proofs following conference submission

  8. arXiv:2204.10909  [pdf, other]

    cs.LG stat.ML

    Error-in-variables modelling for operator learning

    Authors: Ravi G. Patel, Indu Manickam, Myoungkyu Lee, Mamikon Gulian

    Abstract: Deep operator learning has emerged as a promising tool for reduced-order modelling and PDE model discovery. Leveraging the expressive power of deep neural networks, especially in high dimensions, such methods learn the mapping between functional state variables. While proposed methods have assumed noise only in the dependent variables, experimental and numerical data for operator learning typicall…

    Submitted 19 July, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: 23 pages, 10 figures

  9. arXiv:2110.03067  [pdf, other]

    cs.CL

    On Neurons Invariant to Sentence Structural Changes in Neural Machine Translation

    Authors: Gal Patel, Leshem Choshen, Omri Abend

    Abstract: We present a methodology that explores how sentence structure is reflected in neural representations of machine translation systems. We demonstrate our model-agnostic approach with the Transformer English-German translation model. We analyze neuron-level correlation of activations between paraphrases while discussing the methodology challenges and the need for confound analysis to isolate the effe…

    Submitted 2 November, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

  10. arXiv:2104.02488  [pdf, other]

    cs.CV

    Weakly supervised segmentation with cross-modality equivariant constraints

    Authors: Gaurav Patel, Jose Dolz

    Abstract: Weakly supervised learning has emerged as an appealing alternative to alleviate the need for large labeled datasets in semantic segmentation. Most current approaches exploit class activation maps (CAMs), which can be generated from image-level annotations. Nevertheless, resulting maps have been demonstrated to be highly discriminant, failing to serve as optimal proxy pixel-level labels. We present…

    Submitted 13 January, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: Under Review at MedIA. Code available

  11. arXiv:2101.11256  [pdf, other]

    cs.LG math.NA stat.ML

    Partition of unity networks: deep hp-approximation

    Authors: Kookjin Lee, Nathaniel A. Trask, Ravi G. Patel, Mamikon A. Gulian, Eric C. Cyr

    Abstract: Approximation theorists have established best-in-class optimal approximation rates of deep neural networks by utilizing their ability to simultaneously emulate partitions of unity and monomials. Motivated by this, we propose partition of unity networks (POUnets) which incorporate these elements directly into the architecture. Classification architectures of the type used to learn probability measu…

    Submitted 27 January, 2021; originally announced January 2021.

    Comments: 8 pages, 5 figures

  12. arXiv:2009.11992  [pdf, other]

    physics.comp-ph cs.LG math.NA stat.ML

    A physics-informed operator regression framework for extracting data-driven continuum models

    Authors: Ravi G. Patel, Nathaniel A. Trask, Mitchell A. Wood, Eric C. Cyr

    Abstract: The application of deep learning toward discovery of data-driven models requires careful application of inductive biases to obtain a description of physics which is both accurate and robust. We present here a framework for discovering continuum models from high fidelity molecular simulation data. Our approach applies a neural network parameterization of governing physics in modal space, allowing a…

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: 37 pages, 15 figures

  13. arXiv:2008.03750  [pdf, other]

    eess.IV cs.CV

    Switching Loss for Generalized Nucleus Detection in Histopathology

    Authors: Deepak Anand, Gaurav Patel, Yaman Dang, Amit Sethi

    Abstract: The accuracy of deep learning methods for two foundational tasks in medical image analysis -- detection and segmentation -- can suffer from class imbalance. We propose a `switching loss' function that adaptively shifts the emphasis between foreground and background classes. While the existing loss functions to address this problem were motivated by the classification task, the switching loss is ba…

    Submitted 9 August, 2020; originally announced August 2020.

  14. arXiv:2006.10123  [pdf, other]

    cs.LG stat.ML

    A block coordinate descent optimizer for classification problems exploiting convexity

    Authors: Ravi G. Patel, Nathaniel A. Trask, Mamikon A. Gulian, Eric C. Cyr

    Abstract: Second-order optimizers hold intriguing potential for deep learning, but suffer from increased cost and sensitivity to the non-convexity of the loss surface as compared to gradient-based approaches. We introduce a coordinate descent method to train deep neural networks for classification tasks that exploits global convexity of the cross-entropy loss in the weights of the linear layer. Our hybrid N…

    Submitted 17 June, 2020; originally announced June 2020.

    Comments: 10 pages, 4 figures

  15. arXiv:1912.04862  [pdf, other]

    cs.LG math.NA stat.ML

    Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint

    Authors: Eric C. Cyr, Mamikon A. Gulian, Ravi G. Patel, Mauro Perego, Nathaniel A. Trask

    Abstract: Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dram…

    Submitted 10 December, 2019; originally announced December 2019.

    Comments: 26 pages

  16. arXiv:1909.05371  [pdf, other]

    cs.LG math.DS physics.data-an stat.ML

    GMLS-Nets: A framework for learning from unstructured data

    Authors: Nathaniel Trask, Ravi G. Patel, Ben J. Gross, Paul J. Atzberger

    Abstract: Data fields sampled on irregularly spaced points arise in many applications in the sciences and engineering. For regular grids, Convolutional Neural Networks (CNNs) have been successfully used to gain benefits from weight sharing and invariances. We generalize CNNs by introducing methods for data on unstructured point clouds based on Generalized Moving Least Squares (GMLS). GMLS is a non-parame…

    Submitted 13 September, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

    Journal ref: AAAI-MLPS Proceedings, (2020)

  17. arXiv:1810.08552  [pdf, ps, other]

    cs.LG physics.comp-ph physics.data-an stat.ML

    Nonlinear integro-differential operator regression with neural networks

    Authors: Ravi G. Patel, Olivier Desjardins

    Abstract: This note introduces a regression technique for finding a class of nonlinear integro-differential operators from data. The method parametrizes the spatial operator with neural networks and Fourier transforms such that it can fit a class of nonlinear operators without needing a library of a priori selected operators. We verify that this method can recover the spatial operators in the fractional hea…

    Submitted 19 October, 2018; originally announced October 2018.

    Comments: 5 pages, 3 figures, preprint submitted to the Journal of Computational Physics

  18. arXiv:1507.01818  [pdf, other]

    math.CO cs.DM

    Improved Upper Bounds on $a'(G\Box H)$

    Authors: Punit Mehta, Rahul Muthu, Gaurav Patel, Om Thakkar, Devanshi Vyas

    Abstract: The acyclic edge colouring problem is extensively studied in graph theory. The cornerstone of this field is a conjecture of Alon et al. \cite{alonacyclic} that $a'(G)\le Δ(G)+2$. In that and subsequent work, $a'(G)$ is typically bounded in terms of $Δ(G)$. Motivated by this we introduce a term $gap(G)$ defined as $gap(G)=a'(G)-Δ(G)$. Alon's conjecture can be rephrased as $gap(G)\le2$ for all grap…

    Submitted 7 July, 2015; originally announced July 2015.

    Comments: 10 pages, 5 figures

  19. On the Impact of Phase Noise on Active Cancellation in Wireless Full-Duplex

    Authors: Achaleshwar Sahai, Gaurav Patel, Chris Dick, Ashutosh Sabharwal

    Abstract: Recent experimental results have shown that full-duplex communication is possible for short-range communications. However, extending full-duplex to long-range communication remains a challenge, primarily due to residual self-interference even with a combination of passive suppression and active cancellation methods. In this paper, we investigate the root cause of performance bottlenecks in current…

    Submitted 21 December, 2012; originally announced December 2012.

    Comments: 35 pages, Submitted to IEEE Transactions on Vehicular Technology, Dec 2012

  20. arXiv:1107.0607  [pdf, other]

    cs.NI

    Pushing the limits of Full-duplex: Design and Real-time Implementation

    Authors: Achaleshwar Sahai, Gaurav Patel, Ashutosh Sabharwal

    Abstract: Recent work has shown the feasibility of single-channel full-duplex wireless physical layer, allowing nodes to send and receive in the same frequency band at the same time. In this report, we first design and implement a real-time 64-subcarrier 10 MHz full-duplex OFDM physical layer, FD-PHY. The proposed FD-PHY not only allows synchronous full-duplex transmissions but also selective asynchronous f…

    Submitted 4 July, 2011; originally announced July 2011.

    Comments: 12-page Rice University technical report

    Report number: TREE1104