Search | arXiv e-print repository

arXiv:2407.02737 [pdf, other]

Development of Machine Learning Classifiers for Blood-based Diagnosis and Prognosis of Suspected Acute Infections and Sepsis

Authors: Ljubomir Buturovic, Michael Mayhew, Roland Luethy, Kirindi Choi, Uros Midic, Nandita Damaraju, Yehudit Hasin-Brumshtein, Amitesh Pratap, Rhys M. Adams, Joao Fonseca, Ambika Srinath, Paul Fleming, Claudia Pereira, Oliver Liesenfeld, Purvesh Khatri, Timothy Sweeney

Abstract: We applied machine learning to the unmet medical need of rapid and accurate diagnosis and prognosis of acute infections and sepsis in emergency departments. Our solution consists of a Myrna (TM) Instrument and embedded TriVerity (TM) classifiers. The instrument measures abundances of 29 messenger RNAs in patient's blood, subsequently used as features for machine learning. The classifiers convert the input features to an intuitive test report comprising the separate likelihoods of (1) a bacterial infection (2) a viral infection, and (3) severity (need for Intensive Care Unit-level care). In internal validation, the system achieved AUROC = 0.83 on the three-class disease diagnosis (bacterial, viral, or non-infected) and AUROC = 0.77 on binary prognosis of disease severity. The Myrna, TriVerity system was granted breakthrough device designation by the United States Food and Drug Administration (FDA). This engineering manuscript teaches the standard and novel machine learning methods used to translate an academic research concept to a clinical product aimed at improving patient care, and discusses lessons learned.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 16 pages, 6 figures

arXiv:2305.12371 [pdf]

Machine Translation by Projecting Text into the Same Phonetic-Orthographic Space Using a Common Encoding

Authors: Amit Kumar, Shantipriya Parida, Ajay Pratap, Anil Kumar Singh

Abstract: The use of subword embedding has proved to be a major innovation in Neural Machine Translation (NMT). It helps NMT to learn better context vectors for Low Resource Languages (LRLs) so as to predict the target words by better modelling the morphologies of the two languages and also the morphosyntax transfer. Even so, their performance for translation in Indian language to Indian language scenario is still not as good as for resource-rich languages. One reason for this is the relative morphological richness of Indian languages, while another is that most of them fall into the extremely low resource or zero-shot categories. Since most major Indian languages use Indic or Brahmi origin scripts, the text written in them is highly phonetic in nature and phonetically similar in terms of abstract letters and their arrangements. We use these characteristics of Indian languages and their scripts to propose an approach based on common multilingual Latin-based encodings (WX notation) that take advantage of language similarity while addressing the morphological complexity issue in NMT. These multilingual Latin-based encodings in NMT, together with Byte Pair Embedding (BPE) allow us to better exploit their phonetic and orthographic as well as lexical similarities to improve the translation quality by projecting different but similar languages on the same orthographic-phonetic character space.
We verify the proposed approach by demonstrating experiments on similar language pairs (Gujarati-Hindi, Marathi-Hindi, Nepali-Hindi, Maithili-Hindi, Punjabi-Hindi, and Urdu-Hindi) under low resource conditions. The proposed approach shows an improvement in a majority of cases, in one case as much as ~10 BLEU points compared to baseline techniques for similar language pairs. We also get up to ~1 BLEU points improvement on distant and zero-shot language pairs.

Submitted 21 May, 2023; originally announced May 2023.

Comments: 17 pages, 2 figures

arXiv:2303.18011 [pdf]

Exploiting Multilingualism in Low-resource Neural Machine Translation via Adversarial Learning

Authors: Amit Kumar, Ajay Pratap, Anil Kumar Singh

Abstract: Generative Adversarial Networks (GAN) offer a promising approach for Neural Machine Translation (NMT). However, feeding multiple morphologically languages into a single model during training reduces the NMT's performance. In GAN, similar to bilingual models, multilingual NMT only considers one reference translation for each sentence during model training. This single reference translation limits the GAN model from learning sufficient information about the source sentence representation. Thus, in this article, we propose Denoising Adversarial Auto-encoder-based Sentence Interpolation (DAASI) approach to perform sentence interpolation by learning the intermediate latent representation of the source and target sentences of multilingual language pairs. Apart from latent representation, we also use the Wasserstein-GAN approach for the multilingual NMT model by incorporating the model generated sentences of multiple languages for reward computation. This computed reward optimizes the performance of the GAN-based multilingual model in an effective manner. We demonstrate the experiments on low-resource language pairs and find that our approach outperforms the existing state-of-the-art approaches for multilingual NMT with a performance gain of up to 4 BLEU points. Moreover, we use our trained model on zero-shot language pairs under an unsupervised scenario and show the robustness of the proposed approach.

Submitted 31 March, 2023; originally announced March 2023.

Comments: 10 pages, 4 figures

arXiv:2303.01793 [pdf]

Exploiting Language Relatedness in Machine Translation Through Domain Adaptation Techniques

Authors: Amit Kumar, Rupjyoti Baruah, Ajay Pratap, Mayank Swarnkar, Anil Kumar Singh

Abstract: One of the significant challenges of Machine Translation (MT) is the scarcity of large amounts of data, mainly parallel sentence aligned corpora. If the evaluation is as rigorous as resource-rich languages, both Neural Machine Translation (NMT) and Statistical Machine Translation (SMT) can produce good results with such large amounts of data. However, it is challenging to improve the quality of MT output for low resource languages, especially in NMT and SMT. In order to tackle the challenges faced by MT, we present a novel approach of using a scaled similarity score of sentences, especially for related languages based on a 5-gram KenLM language model with Kneser-ney smoothing technique for filtering in-domain data from out-of-domain corpora that boost the translation quality of MT. Furthermore, we employ other domain adaptation techniques such as multi-domain, fine-tuning and iterative back-translation approach to compare our novel approach on the Hindi-Nepali language pair for NMT and SMT. Our approach succeeds in increasing ~2 BLEU point on multi-domain approach, ~3 BLEU point on fine-tuning for NMT and ~2 BLEU point on iterative back-translation approach.

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2207.11654 [pdf, other]

BPFISH: Blockchain and Privacy-preserving FL Inspired Smart Healthcare

Authors: Moirangthem Biken Singh, Ajay Pratap

Abstract: This paper proposes Federated Learning (FL) based smart healthcare system where Medical Centers (MCs) train the local model using the data collected from patients and send the model weights to the miners in a blockchain-based robust framework without sharing raw data, keeping privacy preservation into deliberation. We formulate an optimization problem by maximizing the utility and minimizing the loss function considering energy consumption and FL process delay of MCs for learning effective models on distributed healthcare data underlying a blockchain-based framework. We propose a solution in two stages: first, offer a stable matching-based association algorithm to maximize the utility of both miners and MCs and then solve loss minimization using Stochastic Gradient Descent (SGD) algorithm employing FL under Differential Privacy (DP) and blockchain technology. Moreover, we incorporate blockchain technology to provide tempered resistant and decentralized model weight sharing in the proposed FL-based framework. The effectiveness of the proposed model is shown through simulation on real-world healthcare data comparing other state-of-the-art techniques.

Submitted 27 July, 2022; v1 submitted 24 July, 2022; originally announced July 2022.

arXiv:2106.00759 [pdf, other]

Covid-19 Spread Detection and Controlling with Fog-based Infection Probability Evaluation Model

Authors: Suraj Mahawar, Ajay Pratap

Abstract: COVID-19 has created a pandemic around the world, paused the path of building the future, and still ongoing without having any long-term solution shortly. The time taken in vaccine distribution is too slow compared to the spread of COVID-19. Hence, it is important to aware and takes precautions on time without delaying and waiting for long-duration after getting infected with the virus. Currently used technology is more advanced than ever before. Almost everyone has access to at least one mobile device with an Internet connection. Therefore, we propose a Fog Server (FS) based system that can be used to create awareness about the spread of COVID-19 within the surroundings of individuals utilizing the concept of Hidden Markov Models (HMM) and Bluetooth contact tracing, in polynomial computational time complexity. Moreover, we evaluate the effectiveness of the proposed model through real-world data analysis on different simulation parameter settings.

Submitted 20 May, 2021; originally announced June 2021.

arXiv:2105.11097 [pdf, other]

Criticality and Utility-aware Fog Computing System for Remote Health Monitoring

Authors: Moirangthem Biken Singh, Navneet Taunk, Naveen Kumar Mall, Ajay Pratap

Abstract: Growing remote health monitoring system allows constant monitoring of the patient's condition and performance of preventive and control check-ups outside medical facilities. However, the real-time smart-healthcare application poses a delay constraint that has to be solved efficiently. Fog computing is emerging as an efficient solution for such real-time applications. Moreover, different medical centers are getting attracted to the growing IoT-based remote healthcare system in order to make a profit by hiring Fog computing resources. However, there is a need for an efficient algorithmic model for allocation of limited fog computing resources in the criticality-aware smart-healthcare system considering the profit of medical centers. Thus, the objective of this work is to maximize the system utility calculated as a linear combination of the profit of the medical center and the loss of patients. To measure profit, we propose a flat-pricing-based model. Further, we propose a swapping-based heuristic to maximize the system utility. The proposed heuristic is tested on various parameters and shown to perform close to the optimal with criticality-awareness in its core. Through extensive simulations, we show that the proposed heuristic achieves an average utility of $96\%$ of the optimal, in polynomial time complexity.

Submitted 2 April, 2022; v1 submitted 24 May, 2021; originally announced May 2021.

arXiv:1910.01165 [pdf]

Indicators of retention in remote digital health studies: A cross-study evaluation of 100,000 participants

Authors: Abhishek Pratap, Elias Chaibub Neto, Phil Snyder, Carl Stepnowsky, Noémie Elhadad, Daniel Grant, Matthew H. Mohebbi, Sean Mooney, Christine Suver, John Wilbanks, Lara Mangravite, Patrick Heagerty, Pat Arean, Larsson Omberg

Abstract: Digital technologies such as smartphones are transforming the way scientists conduct biomedical research using real-world data. Several remotely-conducted studies have recruited thousands of participants over a span of a few months. Unfortunately, these studies are hampered by substantial participant attrition, calling into question the representativeness of the collected data including generalizability of findings from these studies. We report the challenges in retention and recruitment in eight remote digital health studies comprising over 100,000 participants who participated for more than 850,000 days, completing close to 3.5 million remote health evaluations. Survival modeling surfaced several factors significantly associated (P < 1e-16) with increase in median retention time: i) Clinician referral (increase of 40 days), ii) Effect of compensation (22 days), iii) Clinical conditions of interest to the study (7 days) and iv) Older adults (4 days). Additionally, four distinct patterns of daily app usage behavior that were also associated (P < 1e-10) with participant demographics were identified. Most studies were not able to recruit a representative sample, either demographically or regionally. Combined together these findings can help inform recruitment and retention strategies to enable equitable participation of populations in future digital health research.

Submitted 2 October, 2019; originally announced October 2019.

arXiv:1905.01143 [pdf, other]

On Maximizing Task Throughput in IoT-enabled 5G Networks under Latency and Bandwidth Constraints

Authors: Ajay Pratap, Ragini Gupta, Venkata Sriram Siddhardh Nadendla, Sajal K. Das

Abstract: Fog computing in 5G networks has played a significant role in increasing the number of users in a given network. However, Internet-of-Things (IoT) has driven system designers towards designing heterogeneous networks to support diverse demands (tasks with different priority values) with different latency and data rate constraints. In this paper, our goal is to maximize the total number of tasks served by a heterogeneous network, labeled task throughput, in the presence of data rate and latency constraints and device preferences regarding computational needs. Since our original problem is intractable, we propose an efficient solution based on graph-coloring techniques. We demonstrate the effectiveness of our proposed algorithm using numerical results, real-world experiments on a laboratory testbed and comparing with the state-of-the-art algorithm.

Submitted 10 April, 2019; originally announced May 2019.

Showing 1–9 of 9 results for author: Pratap, A