Towards Trustworthy Automatic Diagnosis Systems by Emulating Doctors' Reasoning with Deep Reinforcement Learning

Authors: Arsene Fansi Tchango, Rishab Goel, Julien Martel, Zhi Wen, Gaetan Marceau Caron, Joumana Ghosn

Abstract: The automation of the medical evidence acquisition and diagnosis process has recently attracted increasing attention in order to reduce the workload of doctors and democratize access to medical care. However, most works proposed in the machine learning literature focus solely on improving the prediction accuracy of a patient's pathology. We argue that this objective is insufficient to ensure doctors' acceptability of such systems. In their initial interaction with patients, doctors do not only focus on identifying the pathology a patient is suffering from; they instead generate a differential diagnosis (in the form of a short list of plausible diseases) because the medical evidence collected from patients is often insufficient to establish a final diagnosis. Moreover, doctors explicitly explore severe pathologies before potentially ruling them out from the differential, especially in acute care settings. Finally, for doctors to trust a system's recommendations, they need to understand how the gathered evidences led to the predicted diseases. In particular, interactions between a system and a patient need to emulate the reasoning of doctors. We therefore propose to model the evidence acquisition and automatic diagnosis tasks using a deep reinforcement learning framework that considers three essential aspects of a doctor's reasoning, namely generating a differential diagnosis using an exploration-confirmation approach while prioritizing severe pathologies. We propose metrics for evaluating interaction quality based on these three aspects. We show that our approach performs better than existing models while maintaining competitive pathology prediction accuracy.

Submitted 13 October, 2022; originally announced October 2022.

Comments: Camera ready. NeurIPS 2022

arXiv:2103.14636 [pdf, other]

doi 10.1145/3586074

A Practical Survey on Faster and Lighter Transformers

Authors: Quentin Fournier, Gaétan Marceau Caron, Daniel Aloise

Abstract: Recurrent neural networks are effective models to process sequences. However, they are unable to learn long-term dependencies because of their inherent sequential nature. As a solution, Vaswani et al. introduced the Transformer, a model solely based on the attention mechanism that is able to relate any two positions of the input sequence, hence modelling arbitrary long dependencies. The Transformer has improved the state-of-the-art across numerous sequence modelling tasks. However, its effectiveness comes at the expense of a quadratic computational and memory complexity with respect to the sequence length, hindering its adoption. Fortunately, the deep learning community has always been interested in improving the models' efficiency, leading to a plethora of solutions such as parameter sharing, pruning, mixed-precision, and knowledge distillation. Recently, researchers have directly addressed the Transformer's limitation by designing lower-complexity alternatives such as the Longformer, Reformer, Linformer, and Performer. However, due to the wide range of solutions, it has become challenging for researchers and practitioners to determine which methods to apply in practice in order to meet the desired trade-off between capacity, computation, and memory. This survey addresses this issue by investigating popular approaches to make Transformers faster and lighter and by providing a comprehensive explanation of the methods' strengths, limitations, and underlying assumptions.

Submitted 27 March, 2023; v1 submitted 26 March, 2021; originally announced March 2021.

Comments: ACM Computing Surveys; 40 pages, 18 figures, 4 tables

arXiv:2010.16004 [pdf, other]

COVI-AgentSim: an Agent-based Model for Evaluating Methods of Digital Contact Tracing

Authors: Prateek Gupta, Tegan Maharaj, Martin Weiss, Nasim Rahaman, Hannah Alsdurf, Abhinav Sharma, Nanor Minoyan, Soren Harnois-Leblanc, Victor Schmidt, Pierre-Luc St. Charles, Tristan Deleu, Andrew Williams, Akshay Patel, Meng Qu, Olexa Bilaniuk, Gaétan Marceau Caron, Pierre Luc Carrier, Satya Ortiz-Gagné, Marc-Andre Rousseau, David Buckeridge, Joumana Ghosn, Yang Zhang, Bernhard Schölkopf, Jian Tang, Irina Rish, et al. (4 additional authors not shown)

Abstract: The rapid global spread of COVID-19 has led to an unprecedented demand for effective methods to mitigate the spread of the disease, and various digital contact tracing (DCT) methods have emerged as a component of the solution. In order to make informed public health choices, there is a need for tools which allow evaluation and comparison of DCT methods. We introduce an agent-based compartmental simulator we call COVI-AgentSim, integrating detailed consideration of virology, disease progression, social contact networks, and mobility patterns, based on parameters derived from empirical research. We verify by comparing to real data that COVI-AgentSim is able to reproduce realistic COVID-19 spread dynamics, and perform a sensitivity analysis to verify that the relative performance of contact tracing methods are consistent across a range of settings. We use COVI-AgentSim to perform cost-benefit analyses comparing no DCT to: 1) standard binary contact tracing (BCT) that assigns binary recommendations based on binary test results; and 2) a rule-based method for feature-based contact tracing (FCT) that assigns a graded level of recommendation based on diverse individual features. We find all DCT methods consistently reduce the spread of the disease, and that the advantage of FCT over BCT is maintained over a wide range of adoption rates. Feature-based methods of contact tracing avert more disability-adjusted life years (DALYs) per socioeconomic cost (measured by productive hours lost). Our results suggest any DCT method can help save lives, support re-opening of economies, and prevent second-wave outbreaks, and that FCT methods are a promising direction for enriching BCT using self-reported symptoms, yielding earlier warning signals and a significantly reduced spread of the virus per socioeconomic cost.

Submitted 29 October, 2020; originally announced October 2020.

arXiv:2010.12536 [pdf, other]

Predicting Infectiousness for Proactive Contact Tracing

Authors: Yoshua Bengio, Prateek Gupta, Tegan Maharaj, Nasim Rahaman, Martin Weiss, Tristan Deleu, Eilif Muller, Meng Qu, Victor Schmidt, Pierre-Luc St-Charles, Hannah Alsdurf, Olexa Bilaniuk, David Buckeridge, Gaétan Marceau Caron, Pierre-Luc Carrier, Joumana Ghosn, Satya Ortiz-Gagne, Chris Pal, Irina Rish, Bernhard Schölkopf, Abhinav Sharma, Jian Tang, Andrew Williams

Abstract: The COVID-19 pandemic has spread rapidly worldwide, overwhelming manual contact tracing in many countries and resulting in widespread lockdowns for emergency containment. Large-scale digital contact tracing (DCT) has emerged as a potential solution to resume economic and social activity while minimizing spread of the virus. Various DCT methods have been proposed, each making trade-offs between privacy, mobility restrictions, and public health. The most common approach, binary contact tracing (BCT), models infection as a binary event, informed only by an individual's test results, with corresponding binary recommendations that either all or none of the individual's contacts quarantine. BCT ignores the inherent uncertainty in contacts and the infection process, which could be used to tailor messaging to high-risk individuals, and prompt proactive testing or earlier warnings. It also does not make use of observations such as symptoms or pre-existing medical conditions, which could be used to make more accurate infectiousness predictions. In this paper, we use a recently-proposed COVID-19 epidemiological simulator to develop and test methods that can be deployed to a smartphone to locally and proactively predict an individual's infectiousness (risk of infecting others) based on their contact history and other information, while respecting strong privacy constraints. Predictions are used to provide personalized recommendations to the individual via an app, as well as to send anonymized messages to the individual's contacts, who use this information to better predict their own infectiousness, an approach we call proactive contact tracing (PCT). We find a deep-learning based PCT method which improves over BCT for equivalent average mobility, suggesting PCT could help in safe re-opening and second-wave prevention.

Submitted 23 October, 2020; originally announced October 2020.

arXiv:1807.06939 [pdf, other]

Combining neural networks and signed particles to simulate quantum systems more efficiently, Part III

Authors: Jean Michel Sellier, Gaetan Marceau Caron, Jacob Leygonie

Abstract: This work belongs to a series of articles which have been dedicated to the combination of signed particles and neural networks to speed up the time-dependent simulation of quantum systems. More specifically, the suggested networks are utilized to compute the function known as the Wigner kernel. In the first paper, we suggested a network which is completely defined analytically and which does not necessitate any training process. Although very useful, this approach keeps the same complexity as the more standard finite difference methods. In the second work, we presented a different architecture which has generalization capabilities. Although more convenient in terms of computational time, it still uses a similar structure compared to the previous approach, with less neurons in the hidden layer but with the same expensive activation functions (sine functions). In this work, we focus on neural networks without any prior physics based knowledge. In more details, the network now consists of different hidden layers which are based on more common, and less computationally expensive, activation functions such as rectified linear units, and which focus on predicting only one column of the discretized Wigner kernel at a time. This new suggested architecture proves to accurately learn the transform from the space of potential functions to the space of Wigner kernels and performs very well during the simulation of quantum systems (20 times faster). Finally, we simulate a one-dimensional quantum system consisting of a wave packet impinging on a potential barrier for validation purposes. This represents one further step ahead to achieve fast and reliable simulations of time-dependent quantum systems.

Submitted 28 June, 2018; originally announced July 2018.

arXiv:1806.00082 [pdf, ps, other]

Combining neural networks and signed particles to simulate quantum systems more efficiently, Part II

Authors: Jean Michel Sellier, Jacob Leygonie, Gaetan Marceau Caron

Abstract: Recently the use of neural networks has been introduced in the context of the signed particle formulation of quantum mechanics to rapidly and reliably compute the Wigner kernel of any provided potential. This new technique has introduced two important advantages over the more standard finite difference/element methods: 1) it reduces the amount of memory required for the simulation of a quantum system. As a matter of fact, it does not require storing the kernel in a (expensive) multi-dimensional array, and 2) a consistent speedup is obtained since now one can compute the kernel on the cells of interest only, i.e. the cells occupied by signed particles. Although this certainly represents a step forward into the direction of rapid simulations of quantum systems, it comes at a price: the number of hidden neurons is constrained by design to be equal to the number of cells of the discretized real space. It is easy to see how this limitation can quickly become an issue when very fine meshes are necessary. In this work, we continue to ameliorate this previous approach by reducing the complexity of the neural network and, consequently, by introducing an additional speedup. More specifically, we propose a new network architecture which requires less neurons than the previous approach in its hidden layer. For validation purposes, we apply this novel technique to a well known simple, but very indicative, one-dimensional quantum system consisting of a Gaussian wave packet interacting with a potential barrier. In order to clearly show the validity of our suggested approach, time-dependent comparisons with the previous technique are presented. In spite of its simpler architecture, a good agreement is observed, thus representing one step further towards fast and reliable simulations of time-dependent quantum systems.

Submitted 29 May, 2018; originally announced June 2018.

Showing 1–6 of 6 results for author: Caron, G M