Showing 1–24 of 24 results for author: Rousseau, J

Searching in archive cs.
  1. arXiv:2401.08396  [pdf]

    cs.CV cs.AI cs.CL

    Hidden Flaws Behind Expert-Level Accuracy of GPT-4 Vision in Medicine

    Authors: Qiao Jin, Fangyuan Chen, Yiliang Zhou, Ziyang Xu, Justin M. Cheung, Robert Chen, Ronald M. Summers, Justin F. Rousseau, Peiyun Ni, Marc J Landsman, Sally L. Baxter, Subhi J. Al'Aref, Yijia Li, Alex Chen, Josef A. Brejt, Michael F. Chiang, Yifan Peng, Zhiyong Lu

    Abstract: Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multi-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales of image comprehension, recall of medical knowledge, and step-by…

    Submitted 22 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Under review

  2. arXiv:2309.10918  [pdf, other]

    stat.ML cs.LG math.ST

    Posterior Contraction Rates for Matérn Gaussian Processes on Riemannian Manifolds

    Authors: Paul Rosa, Viacheslav Borovitskiy, Alexander Terenin, Judith Rousseau

    Abstract: Gaussian processes are used in many machine learning applications that rely on uncertainty quantification. Recently, computational tools for working with these models in geometric settings, such as when inputs lie on a Riemannian manifold, have been developed. This raises the question: can these intrinsic models be shown theoretically to lead to better performance, compared to simply embedding all…

    Submitted 29 October, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

    Journal ref: Advances in Neural Information Processing Systems, 2023

  3. arXiv:2307.14066  [pdf, other]

    cs.CV cs.LG

    Pre-Training with Diffusion models for Dental Radiography segmentation

    Authors: Jérémy Rousseau, Christian Alaka, Emma Covili, Hippolyte Mayard, Laura Misrachi, Willy Au

    Abstract: Medical radiography segmentation, and specifically dental radiography, is highly limited by the cost of labeling which requires specific expertise and labor-intensive annotations. In this work, we propose a straightforward pre-training method for semantic segmentation leveraging Denoising Diffusion Probabilistic Models (DDPM), which have shown impressive results for generative modeling. Our straig…

    Submitted 27 July, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: 13 pages, 6 figures

  4. arXiv:2305.19339  [pdf, other]

    cs.CL cs.AI

    Less Likely Brainstorming: Using Language Models to Generate Alternative Hypotheses

    Authors: Liyan Tang, Yifan Peng, Yanshan Wang, Ying Ding, Greg Durrett, Justin F. Rousseau

    Abstract: A human decision-maker benefits the most from an AI assistant that corrects for their biases. For problems such as generating interpretation of a radiology report given findings, a system predicting only highly likely outcomes may be less useful, where such outcomes are already obvious to the user. To alleviate biases in human decision-making, it is worth considering a broad differential diagnosis…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL (Findings) 2023

  5. arXiv:2303.00517  [pdf]

    cs.LG cs.AI cs.CY

    Analyzing Impact of Socio-Economic Factors on COVID-19 Mortality Prediction Using SHAP Value

    Authors: Redoan Rahman, Jooyeong Kang, Justin F Rousseau, Ying Ding

    Abstract: This paper applies multiple machine learning (ML) algorithms to a dataset of de-identified COVID-19 patients provided by the COVID-19 Research Database. The dataset consists of 20,878 COVID-positive patients, among which 9,177 patients died in the year 2020. This paper aims to understand and interpret the association of socio-economic characteristics of patients with their mortality instead of max…

    Submitted 27 February, 2023; originally announced March 2023.

    Comments: 10 pages, 10 figures, American Medical Informatics Association (AMIA) Annual Conference 2022, Washington DC, USA, Nov 5-9, 2022

    Journal ref: AMIA 2022 Annual Symposium

  6. arXiv:2302.08605  [pdf]

    cs.LG cs.AI cs.CY

    Using Explainable AI to Cross-Validate Socio-economic Disparities Among Covid-19 Patient Mortality

    Authors: Li Shi, Redoan Rahman, Esther Melamed, Jacek Gwizdka, Justin F. Rousseau, Ying Ding

    Abstract: This paper applies eXplainable Artificial Intelligence (XAI) methods to investigate the socioeconomic disparities in COVID patient mortality. An Extreme Gradient Boosting (XGBoost) prediction model is built based on a de-identified Austin area hospital dataset to predict the mortality of COVID-19 patients. We apply two XAI methods, Shapley Additive exPlanations (SHAP) and Locally Interpretable Mod…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: AMIA 2023 Informatics Summit, March 13-16, Seattle, WA, USA. 10 pages

    Journal ref: AMIA 2023 Informatics Summit

  7. arXiv:2212.02675  [pdf, other]

    cs.CV

    Attend Who is Weak: Pruning-assisted Medical Image Localization under Sophisticated and Implicit Imbalances

    Authors: Ajay Jaiswal, Tianlong Chen, Justin F. Rousseau, Yifan Peng, Ying Ding, Zhangyang Wang

    Abstract: Deep neural networks (DNNs) have rapidly become a de facto choice for medical image understanding tasks. However, DNNs are notoriously fragile to the class imbalance in image classification. We further point out that such imbalance fragility can be amplified when it comes to more sophisticated tasks such as pathology localization, as imbalances in such problems can have highly complex and…

    Submitted 5 December, 2022; originally announced December 2022.

    Comments: Accepted in WACV 2023

  8. arXiv:2210.08388  [pdf, other]

    cs.CV cs.LG

    RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging

    Authors: Ajay Jaiswal, Kumar Ashutosh, Justin F Rousseau, Yifan Peng, Zhangyang Wang, Ying Ding

    Abstract: AI-powered Medical Imaging has recently achieved enormous attention due to its ability to provide fast-paced healthcare diagnoses. However, it usually suffers from a lack of high-quality datasets due to high annotation cost, inter-observer variability, human annotator error, and errors in computer-generated labels. Deep learning models trained on noisy labelled datasets are sensitive to the noise…

    Submitted 2 December, 2022; v1 submitted 15 October, 2022; originally announced October 2022.

    Comments: Accepted in ICDM 2022

  9. arXiv:2210.08122  [pdf, other]

    cs.LG cs.SI

    Old can be Gold: Better Gradient Flow can Make Vanilla-GCNs Great Again

    Authors: Ajay Jaiswal, Peihao Wang, Tianlong Chen, Justin F. Rousseau, Ying Ding, Zhangyang Wang

    Abstract: Despite the enormous success of Graph Convolutional Networks (GCNs) in modeling graph-structured data, most of the current GCNs are shallow due to the notoriously challenging problems of over-smoothening and information squashing along with conventional difficulty caused by vanishing gradients and over-fitting. Previous works have been primarily focused on the study of over-smoothening and over-sq…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Advances in Neural Information Processing Systems (NeurIPS), 2022

  10. arXiv:2205.12854  [pdf, other]

    cs.CL cs.AI

    Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors

    Authors: Liyan Tang, Tanya Goyal, Alexander R. Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryściński, Justin F. Rousseau, Greg Durrett

    Abstract: The propensity of abstractive summarization models to make factual errors has been studied extensively, including design of metrics to detect factual errors and annotation of errors in current systems' outputs. However, the ever-evolving nature of summarization systems, metrics, and annotated benchmarks makes factuality evaluation a moving target, and drawing clear comparisons among metrics has be…

    Submitted 25 May, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to ACL 2023

  11. arXiv:2203.09675  [pdf, other]

    stat.ML cs.LG stat.CO

    Fast Bayesian Coresets via Subsampling and Quasi-Newton Refinement

    Authors: Cian Naik, Judith Rousseau, Trevor Campbell

    Abstract: Bayesian coresets approximate a posterior distribution by building a small weighted subset of the data points. Any inference procedure that is too computationally expensive to be run on the full posterior can instead be run inexpensively on the coreset, with results that approximate those on the full data. However, current approaches are limited by either a significant run-time or the need for the…

    Submitted 15 January, 2023; v1 submitted 17 March, 2022; originally announced March 2022.

  12. arXiv:2110.15426  [pdf, other]

    cs.LG cs.CL

    RadBERT-CL: Factually-Aware Contrastive Learning For Radiology Report Classification

    Authors: Ajay Jaiswal, Liyan Tang, Meheli Ghosh, Justin Rousseau, Yifan Peng, Ying Ding

    Abstract: Radiology reports are unstructured and contain the imaging findings and corresponding diagnoses transcribed by radiologists which include clinical facts and negated and/or uncertain statements. Extracting pathologic findings and diagnoses from radiology reports is important for quality control, population health, and monitoring of disease progress. Existing works, primarily rely either on rule-bas…

    Submitted 19 November, 2021; v1 submitted 28 October, 2021; originally announced October 2021.

  13. arXiv:2110.14787  [pdf, other]

    eess.IV cs.CV

    SCALP -- Supervised Contrastive Learning for Cardiopulmonary Disease Classification and Localization in Chest X-rays using Patient Metadata

    Authors: Ajay Jaiswal, Tianhao Li, Cyprian Zander, Yan Han, Justin F. Rousseau, Yifan Peng, Ying Ding

    Abstract: Computer-aided diagnosis plays a salient role in more accessible and accurate cardiopulmonary diseases classification and localization on chest radiography. Millions of people get affected and die due to these diseases without an accurate and timely diagnosis. Recently proposed contrastive learning heavily relies on data augmentation, especially positive data augmentation. However, generating clin…

    Submitted 27 October, 2021; originally announced October 2021.

  14. arXiv:2010.12859  [pdf, other]

    cs.LG stat.ML

    Stable ResNet

    Authors: Soufiane Hayou, Eugenio Clerico, Bobby He, George Deligiannidis, Arnaud Doucet, Judith Rousseau

    Abstract: Deep ResNet architectures have achieved state of the art performance on many tasks. While they solve the problem of gradient vanishing, they might suffer from gradient exploding as the depth becomes large (Yang et al. 2017). Moreover, recent results have shown that ResNet might lose expressivity as the depth goes to infinity (Yang et al. 2017, Hayou et al. 2019). To resolve these issues, we introd…

    Submitted 18 March, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: 43 pages, 4 figures

  15. arXiv:2010.12611  [pdf, other]

    cs.SI

    Information access representations and social capital in networks

    Authors: Ashkan Bashardoust, Hannah C. Beilinson, Sorelle A. Friedler, Jiajie Ma, Jade Rousseau, Carlos E. Scheidegger, Blair D. Sullivan, Nasanbayar Ulzii-Orshikh, Suresh Venkatasubramanian

    Abstract: Social network position confers power and social capital. In the setting of online social networks that have massive reach, creating mathematical representations of social capital is an important step towards understanding how network position can differentially confer advantage to different groups and how network position can itself be a source of advantage. In this paper, we use well established…

    Submitted 16 October, 2023; v1 submitted 23 October, 2020; originally announced October 2020.

  16. arXiv:2005.08502  [pdf, other]

    cs.CR cs.AI cs.CY

    COVI White Paper

    Authors: Hannah Alsdurf, Edmond Belliveau, Yoshua Bengio, Tristan Deleu, Prateek Gupta, Daphne Ippolito, Richard Janda, Max Jarvie, Tyler Kolody, Sekoul Krastev, Tegan Maharaj, Robert Obryk, Dan Pilat, Valerie Pisano, Benjamin Prud'homme, Meng Qu, Nasim Rahaman, Irina Rish, Jean-Francois Rousseau, Abhinav Sharma, Brooke Struck, Jian Tang, Martin Weiss, Yun William Yu

    Abstract: The SARS-CoV-2 (Covid-19) pandemic has caused significant strain on public health institutions around the world. Contact tracing is an essential tool to change the course of the Covid-19 pandemic. Manual contact tracing of Covid-19 cases has significant challenges that limit the ability of public health authorities to minimize community infections. Personalized peer-to-peer contact tracing through…

    Submitted 27 July, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: 64 pages, 1 figure

  17. arXiv:2005.04308  [pdf]

    cs.DL

    Building a PubMed knowledge graph

    Authors: Jian Xu, Sunkyu Kim, Min Song, Minbyul Jeong, Donghyeon Kim, Jaewoo Kang, Justin F. Rousseau, Xin Li, Weijia Xu, Vetle I. Torvik, Yi Bu, Chongyan Chen, Islam Akef Ebeid, Daifeng Li, Ying Ding

    Abstract: PubMed is an essential resource for the medical domain, but useful concepts are either difficult to extract or are ambiguated, which has significantly hindered knowledge discovery. To address this issue, we constructed a PubMed knowledge graph (PKG) by extracting bio-entities from 29 million PubMed abstracts, disambiguating author names, integrating funding data through the National Institutes of…

    Submitted 15 May, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

    Comments: 19 pages, 5 figures, 14 tables

  18. arXiv:2003.05500  [pdf, ps, other]

    math.PR cs.IT math.DS

    Rényi entropy and pattern matching for run-length encoded sequences

    Authors: Jerome Rousseau

    Abstract: In this note, we studied the asymptotic behaviour of the length of the longest common substring for run-length encoded sequences. When the original sequences are generated by an $α$-mixing process with exponential decay (or $ψ$-mixing with polynomial decay), we proved that this length grows logarithmically with a coefficient depending on the Rényi entropy of the pushforward measure. For Bernoulli…

    Submitted 11 December, 2020; v1 submitted 11 March, 2020; originally announced March 2020.

  19. arXiv:1912.07516  [pdf, ps, other]

    math.DS cs.IT math.PR

    Shortest distance between multiple orbits and generalized fractal dimensions

    Authors: Vanessa Barros, Jerome Rousseau

    Abstract: We consider rapidly mixing dynamical systems and link the decay of the shortest distance between multiple orbits with the generalized fractal dimension. We apply this result to multidimensional expanding maps and extend it to the realm of random dynamical systems. For random sequences, we obtain a relation between the longest common substring between multiple sequences and the generalized Rényi en…

    Submitted 13 December, 2019; originally announced December 2019.

    Comments: arXiv admin note: text overlap with arXiv:1808.00078

  20. arXiv:1910.09679  [pdf, other]

    stat.ME cs.SI physics.soc-ph

    Sparse Networks with Core-Periphery Structure

    Authors: Cian Naik, François Caron, Judith Rousseau

    Abstract: We propose a statistical model for graphs with a core-periphery structure. To do this we define a precise notion of what it means for a graph to have this structure, based on the sparsity properties of the subgraphs of core and periphery nodes. We present a class of sparse graphs with such properties, and provide methods to simulate from this class, and to perform posterior inference. We demonstra…

    Submitted 21 October, 2019; originally announced October 2019.

    MSC Class: Primary: 62F15; 05C80. Secondary: 60G55

  21. arXiv:1905.13654  [pdf, other]

    stat.ML cs.LG

    Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit

    Authors: Soufiane Hayou, Arnaud Doucet, Judith Rousseau

    Abstract: Recent work by Jacot et al. (2018) has shown that training a neural network using gradient descent in parameter space is related to kernel gradient descent in function space with respect to the Neural Tangent Kernel (NTK). Lee et al. (2019) built on this result by establishing that the output of a neural network trained using gradient descent can be approximated by a linear model when the network…

    Submitted 25 May, 2022; v1 submitted 31 May, 2019; originally announced May 2019.

    Comments: 59 pages, 8 figures

  22. arXiv:1903.09625  [pdf, ps, other]

    math.PR cs.DS cs.IT math.DS

    Matching strings in encoded sequences

    Authors: Adriana Coutinho, Rodrigo Lambert, Jérôme Rousseau

    Abstract: We investigate the longest common substring problem for encoded sequences and its asymptotic behaviour. The main result is a strong law of large numbers for a re-scaled version of this quantity, which presents an explicit relation with the Rényi entropy of the source. We apply this result to the zero-inflated contamination model and the stochastic scrabble. In the case of dynamical systems, this p…

    Submitted 10 December, 2019; v1 submitted 22 March, 2019; originally announced March 2019.

  23. arXiv:1902.06853  [pdf, other]

    stat.ML cs.AI cs.LG

    On the Impact of the Activation Function on Deep Neural Networks Training

    Authors: Soufiane Hayou, Arnaud Doucet, Judith Rousseau

    Abstract: The weight initialization and the activation function of deep neural networks have a crucial impact on the performance of the training procedure. An inappropriate selection can lead to the loss of information of the input during forward propagation and the exponential vanishing/exploding of gradients during back-propagation. Understanding the theoretical properties of untrained random networks is…

    Submitted 26 May, 2019; v1 submitted 18 February, 2019; originally announced February 2019.

    Comments: 22 pages

  24. arXiv:1805.08266  [pdf, other]

    stat.ML cs.LG

    On the Selection of Initialization and Activation Function for Deep Neural Networks

    Authors: Soufiane Hayou, Arnaud Doucet, Judith Rousseau

    Abstract: The weight initialization and the activation function of deep neural networks have a crucial impact on the performance of the training procedure. An inappropriate selection can lead to the loss of information of the input during forward propagation and the exponential vanishing/exploding of gradients during back-propagation. Understanding the theoretical properties of untrained random networks is…

    Submitted 7 October, 2018; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: 8 pages, 15 figures