Showing 1–25 of 25 results for author: Berisha, V

  1. arXiv:2405.14422  [pdf, other]

    cs.LG cs.AI cs.CY

    Unraveling overoptimism and publication bias in ML-driven science

    Authors: Pouria Saidi, Gautam Dasarathy, Visar Berisha

    Abstract: Machine Learning (ML) is increasingly used across many disciplines with impressive reported results across many domain areas. However, recent studies suggest that the published performance of ML models is often overoptimistic. Validity concerns are underscored by findings of an inverse relationship between sample size and reported accuracy in published ML models, contrasting with the theory of le…

    Submitted 10 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 31 pages, 7 figures, 6 tables

  2. arXiv:2310.17049  [pdf, other]

    cs.SD cs.AI eess.AS

    Learning Repeatable Speech Embeddings Using An Intra-class Correlation Regularizer

    Authors: Jianwei Zhang, Suren Jayasuriya, Visar Berisha

    Abstract: A good supervised embedding for a specific machine learning task is only sensitive to changes in the label of interest and is invariant to other confounding factors. We leverage the concept of repeatability from measurement theory to describe this property and propose to use the intra-class correlation coefficient (ICC) to evaluate the repeatability of embeddings. We then propose a novel regulariz…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  3. arXiv:2303.02523  [pdf, ps, other]

    eess.AS cs.SD

    Requirements for Mass Adoption of Assistive Listening Technology by the General Public

    Authors: Thomas B. Kaufmann, Mehdi Foroogozar, Julie Liss, Visar Berisha

    Abstract: Assistive listening systems (ALSs) dramatically increase speech intelligibility and reduce listening effort. It is very likely that essentially everyone, not only individuals with hearing loss, would benefit from the increased signal-to-noise ratio an ALS provides in almost any listening scenario. However, ALSs are rarely used by anyone other than people with severe to profound hearing losses. To…

    Submitted 3 May, 2023; v1 submitted 4 March, 2023; originally announced March 2023.

    Comments: Accepted to ICASSP 2023

  4. arXiv:2302.09114  [pdf, other]

    cs.LG cs.IT

    Smoothly Giving up: Robustness for Simple Models

    Authors: Tyler Sypherd, Nathan Stromberg, Richard Nock, Visar Berisha, Lalitha Sankar

    Abstract: There is a growing need for models that are interpretable and have reduced energy and computational cost (e.g., in health care analytics and federated learning). Examples of algorithms to train such models include logistic regression and boosting. However, one challenge facing these algorithms is that they provably suffer from label noise; this has been attributed to the joint interaction between…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: To appear in AISTATS 2023

  5. arXiv:2301.12616  [pdf, other]

    cs.LG stat.ME

    Active Sequential Two-Sample Testing

    Authors: Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha

    Abstract: A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample features) are inexpensive to access, but their group memberships (or labels) are costly. To address the problem, we devise the first \emph{active sequential two…

    Submitted 27 June, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

  6. arXiv:2211.09858  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection

    Authors: Jianwei Zhang, Julie Liss, Suren Jayasuriya, Visar Berisha

    Abstract: Approximately 1.2% of the world's population has impaired voice production. As a result, automatic dysphonic voice detection has attracted considerable academic and clinical interest. However, existing methods for automated voice assessment often fail to generalize outside the training conditions or to other related applications. In this paper, we propose a deep learning framework for generating a…

    Submitted 26 January, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: This manuscript was submitted on July 06, 2022 to IEEE/ACM Transactions on Audio, Speech, and Language Processing for peer review

  7. arXiv:2210.09334  [pdf]

    eess.AS cs.LG cs.SD

    TorchDIVA: An Extensible Computational Model of Speech Production built on an Open-Source Machine Learning Library

    Authors: Sean Kinahan, Julie Liss, Visar Berisha

    Abstract: The DIVA model is a computational model of speech motor control that combines a simulation of the brain regions responsible for speech production with a model of the human vocal tract. The model is currently implemented in Matlab Simulink; however, this is less than ideal as most of the development in speech technology research is done in Python. This means there is a wealth of machine learning to…

    Submitted 17 October, 2022; originally announced October 2022.

  8. arXiv:2203.13352  [pdf, other]

    cs.CL cs.LG

    Does human speech follow Benford's Law?

    Authors: Leo Hsu, Visar Berisha

    Abstract: Researchers have observed that the frequencies of leading digits in many man-made and naturally occurring datasets follow a logarithmic curve, with digits that start with the number 1 accounting for $\sim 30\%$ of all numbers in the dataset and digits that start with the number 9 accounting for $\sim 5\%$ of all numbers in the dataset. This phenomenon, known as Benford's Law, is highly repeatable…

    Submitted 21 December, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

  9. arXiv:2111.08861  [pdf, other]

    cs.LG stat.ML

    A label-efficient two-sample test

    Authors: Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

    Abstract: Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are easily measured whereas sample labels are unknown and costly to obtain. Accordingly, we devise a three-stage framework in service of performing an effective two…

    Submitted 19 July, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: Accepted to the 38th conference on Uncertainty in Artificial Intelligence (UAI2022)

  10. Restoring degraded speech via a modified diffusion model

    Authors: Jianwei Zhang, Suren Jayasuriya, Visar Berisha

    Abstract: There are many deterministic mathematical operations (e.g. compression, clipping, downsampling) that degrade speech quality considerably. In this paper we introduce a neural network architecture, based on a modification of the DiffWave model, that aims to restore the original speech signal. DiffWave, a recently published diffusion-based vocoder, has shown state-of-the-art synthesized speech qualit…

    Submitted 2 September, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

    Journal ref: Proc. Interspeech 2021, 221-225, 2021

  11. arXiv:2011.09645  [pdf, other]

    cs.LG

    Finding the Homology of Decision Boundaries with Active Learning

    Authors: Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

    Abstract: Accurately and efficiently characterizing the decision boundary of classifiers is important for problems related to model selection and meta-learning. Inspired by topological data analysis, the characterization of decision boundaries using their homology has recently emerged as a general and powerful tool. In this paper, we propose an active learning algorithm to recover the homology of decision b…

    Submitted 18 November, 2020; originally announced November 2020.

    Journal ref: Advances in Neural Information Processing Systems 33 (2020)

  12. arXiv:2001.01900  [pdf, other]

    cs.LG stat.ML

    Regularization via Structural Label Smoothing

    Authors: Weizhi Li, Gautam Dasarathy, Visar Berisha

    Abstract: Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network by softening the ground-truth labels in the training data in an attempt to penalize overconfident outputs. Existing approaches typically use cross-validation to…

    Submitted 4 July, 2020; v1 submitted 7 January, 2020; originally announced January 2020.

  13. arXiv:1911.11360  [pdf, other]

    eess.AS cs.SD eess.SP

    Robust Estimation of Hypernasality in Dysarthria with Acoustic Model Likelihood Features

    Authors: Michael Saxon, Ayush Tripathi, Yishan Jiao, Julie Liss, Visar Berisha

    Abstract: Hypernasality is a common characteristic symptom across many motor-speech disorders. For voiced sounds, hypernasality introduces an additional resonance in the lower frequencies and, for unvoiced sounds, there is reduced articulatory precision due to air escaping through the nasal cavity. However, the acoustic manifestation of these symptoms is highly variable, making hypernasality estimation very…

    Submitted 5 August, 2020; v1 submitted 26 November, 2019; originally announced November 2019.

    Comments: 12 pages, 9 figures, 2 tables

    Journal ref: IEEE/ACM Trans. on Audio, Speech, and Language Proc. 28 (2020) 2511-2522

  14. arXiv:1906.01157  [pdf, other]

    cs.CL cs.SD eess.AS eess.SP

    A Review of Automated Speech and Language Features for Assessment of Cognitive and Thought Disorders

    Authors: Rohit Voleti, Julie M. Liss, Visar Berisha

    Abstract: It is widely accepted that information derived from analyzing speech (the acoustic signal) and language production (words and sentences) serves as a useful window into the health of an individual's cognitive ability. In fact, most neuropsychological testing batteries have a component related to speech and language where clinicians elicit speech from patients for subjective evaluation across a broa…

    Submitted 4 November, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

    Report number: J-STSP-AAHD-00183-2019

  15. arXiv:1904.10622  [pdf, other]

    cs.CL

    Objective Assessment of Social Skills Using Automated Language Analysis for Identification of Schizophrenia and Bipolar Disorder

    Authors: Rohit Voleti, Stephanie Woolridge, Julie M. Liss, Melissa Milanovic, Christopher R. Bowie, Visar Berisha

    Abstract: Several studies have shown that speech and language features, automatically extracted from clinical interviews or spontaneous discourse, have diagnostic value for mental disorders such as schizophrenia and bipolar disorder. They typically make use of a large feature set to train a classifier for distinguishing between two groups of interest, i.e. a clinical and control group. However, a purely dat…

    Submitted 28 July, 2019; v1 submitted 23 April, 2019; originally announced April 2019.

    Comments: Accepted to be presented at INTERSPEECH 2019 conference in Graz, Austria. 4 pages + 1 page references. Two figures

  16. arXiv:1811.07021  [pdf, other]

    cs.CL cs.SD eess.AS

    Investigating the Effects of Word Substitution Errors on Sentence Embeddings

    Authors: Rohit Voleti, Julie M. Liss, Visar Berisha

    Abstract: A key initial step in several natural language processing (NLP) tasks involves embedding phrases of text to vectors of real numbers that preserve semantic meaning. To that end, several methods have been recently proposed with impressive results on semantic similarity tasks. However, all of these approaches assume that perfect transcripts are available when generating the embeddings. While this is…

    Submitted 24 April, 2019; v1 submitted 16 November, 2018; originally announced November 2018.

    Comments: 4 Pages, 2 figures. Copyright IEEE 2019. Accepted and to appear in the Proceedings of the 44th International Conference on Acoustics, Speech, and Signal Processing 2019 (IEEE-ICASSP-2019), May 12-17 in Brighton, U.K. Personal use of this material is permitted. However, permission to reprint/republish this material must be obtained from the IEEE

  17. arXiv:1808.01535  [pdf, other]

    eess.AS cs.CL cs.LG stat.ML

    Triplet Network with Attention for Speaker Diarization

    Authors: Huan Song, Megan Willi, Jayaraman J. Thiagarajan, Visar Berisha, Andreas Spanias

    Abstract: In automatic speech processing systems, speaker diarization is a crucial front-end component to separate segments from different speakers. Inspired by the recent success of deep neural networks (DNNs) in semantic inferencing, triplet loss-based architectures have been successfully used for this problem. However, existing work utilizes conventional i-vectors as the input representation and builds s…

    Submitted 4 August, 2018; originally announced August 2018.

    Comments: Interspeech 2018

  18. arXiv:1807.01738  [pdf, other]

    eess.AS cs.SD

    Investigating the role of L1 in automatic pronunciation evaluation of L2 speech

    Authors: Ming Tu, Anna Grabek, Julie Liss, Visar Berisha

    Abstract: Automatic pronunciation evaluation plays an important role in pronunciation training and second language education. This field draws heavily on concepts from automatic speech recognition (ASR) to quantify how close the pronunciation of non-native speech is to native-like pronunciation. However, it is known that the formation of accent is related to pronunciation patterns of both the target languag…

    Submitted 4 July, 2018; originally announced July 2018.

    Comments: To appear in Interspeech 2018

  19. arXiv:1804.08663  [pdf, other]

    eess.AS cs.SD

    A Discriminative Acoustic-Prosodic Approach for Measuring Local Entrainment

    Authors: Megan M. Willi, Stephanie A. Borrie, Tyson S. Barrett, Ming Tu, Visar Berisha

    Abstract: Acoustic-prosodic entrainment describes the tendency of humans to align or adapt their speech acoustics to each other in conversation. This alignment of spoken behavior has important implications for conversational success. However, modeling the subtle nature of entrainment in spoken dialogue continues to pose a challenge. In this paper, we propose a straightforward definition for local entrainmen…

    Submitted 12 July, 2018; v1 submitted 23 April, 2018; originally announced April 2018.

  20. arXiv:1804.07370  [pdf]

    cs.NE

    Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

    Authors: Shihui Yin, Gaurav Srivastava, Shreyas K. Venkataramanaiah, Chaitali Chakrabarti, Visar Berisha, Jae-sun Seo

    Abstract: Deep learning algorithms have shown tremendous success in many recognition tasks; however, these algorithms typically include a deep neural network (DNN) structure and a large number of parameters, which makes it challenging to implement them on power/area-constrained embedded platforms. To reduce the network size, several studies investigated compression by introducing element-wise or row-/column…

    Submitted 19 April, 2018; originally announced April 2018.

    Comments: 2017 Asilomar Conference on Signals, Systems and Computers

  21. arXiv:1705.06315  [pdf, ps, other]

    cs.IT

    Direct Ensemble Estimation of Density Functionals

    Authors: Alan Wisler, Kevin Moon, Visar Berisha

    Abstract: Estimating density functionals of analog sources is an important problem in statistical signal processing and information theory. Traditionally, estimating these quantities requires either making parametric assumptions about the underlying distributions or using non-parametric density estimation followed by integration. In this paper we introduce a direct nonparametric approach which bypasses the…

    Submitted 17 May, 2017; originally announced May 2017.

    Comments: 5 pages

  22. Direct estimation of density functionals using a polynomial basis

    Authors: Alan Wisler, Visar Berisha, Andreas Spanias, Alfred O. Hero

    Abstract: A number of fundamental quantities in statistical signal processing and information theory can be expressed as integral functions of two probability density functions. Such quantities are called density functionals as they map density functions onto the real line. For example, information divergence functions measure the dissimilarity between two probability density functions and are useful in a n…

    Submitted 20 November, 2017; v1 submitted 21 February, 2017; originally announced February 2017.

    Comments: Under review for IEEE Transactions on Signal Processing

  23. arXiv:1605.04859  [pdf, other]

    cs.LG cs.NE

    Reducing the Model Order of Deep Neural Networks Using Information Theory

    Authors: Ming Tu, Visar Berisha, Yu Cao, Jae-sun Seo

    Abstract: Deep neural networks are typically represented by a much larger number of parameters than shallow models, making them prohibitive for small footprint devices. Recent research shows that there is considerable redundancy in the parameter space of deep neural networks. In this paper, we propose a method to compress deep neural networks by using the Fisher Information metric, which we estimate through…

    Submitted 16 May, 2016; originally announced May 2016.

    Comments: To appear in ISVLSI 2016 special session

  24. arXiv:1412.6534  [pdf, other]

    cs.IT stat.ML

    Empirically Estimable Classification Bounds Based on a New Divergence Measure

    Authors: Visar Berisha, Alan Wisler, Alfred O. Hero, Andreas Spanias

    Abstract: Information divergence functions play a critical role in statistics and information theory. In this paper we show that a non-parametric f-divergence measure can be used to provide improved bounds on the minimum binary classification probability of error for the case when the training and test data are drawn from the same distribution and for the case where there exists some mismatch between traini…

    Submitted 10 February, 2015; v1 submitted 19 December, 2014; originally announced December 2014.

    Comments: 12 pages, 5 figures

  25. arXiv:1408.1182  [pdf, other]

    stat.CO cs.IT stat.ML

    Empirical non-parametric estimation of the Fisher Information

    Authors: Visar Berisha, Alfred O. Hero

    Abstract: The Fisher information matrix (FIM) is a foundational concept in statistical signal processing. The FIM depends on the probability distribution, assumed to belong to a smooth parametric family. Traditional approaches to estimating the FIM require estimating the probability distribution function (PDF), or its parameters, along with its gradient or Hessian. However, in many practical situations the…

    Submitted 16 November, 2014; v1 submitted 6 August, 2014; originally announced August 2014.

    Comments: 12 pages