Showing 1–13 of 13 results for author: Ge, L

  1. arXiv:2407.02671  [pdf, other]

    stat.ME stat.AP

    When Do Natural Mediation Effects Differ from Their Randomized Interventional Analogues: Test and Theory

    Authors: Ang Yu, Li Ge, Felix Elwert

    Abstract: In causal mediation analysis, the natural direct and indirect effects (natural effects) are nonparametrically unidentifiable in the presence of treatment-induced confounding, which motivated the development of randomized interventional analogues (RIAs) of the natural effects. The RIAs are easier to identify and widely used in practice. Applied researchers often interpret RIA estimates as if they w…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2405.02612  [pdf, other]

    cs.LG cs.AI cs.CY stat.ML

    Learning Linear Utility Functions From Pairwise Comparison Queries

    Authors: Luise Ge, Brendan Juba, Yevgeniy Vorobeychik

    Abstract: We study learnability of linear utility functions from pairwise comparison queries. In particular, we consider two learning objectives. The first objective is to predict out-of-sample responses to pairwise comparisons, whereas the second is to approximately recover the true parameters of the utility function. We show that in the passive learning setting, linear utilities are efficiently learnable…

    Submitted 19 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: Submitted to ECAI for review

  3. arXiv:2403.14044  [pdf]

    stat.ME stat.AP

    Statistical tests for comparing the associations of multiple exposures with a common outcome in Cox proportional hazard models

    Authors: Rikuta Hamaya, Peilu Wang, Lin Ge, Edward L. Giovannucci, Molin Wang

    Abstract: With advancement of medicine, alternative exposures or interventions are emerging with respect to a common outcome, and there are needs to formally test the difference in the associations of multiple exposures. We propose a duplication method-based multivariate Wald test in the Cox proportional hazard regression analyses to test the difference in the associations of multiple exposures with a same…

    Submitted 20 March, 2024; originally announced March 2024.

  4. arXiv:2312.17122  [pdf, other]

    cs.CL cs.AI stat.ML

    Large Language Model for Causal Decision Making

    Authors: Haitao Jiang, Lin Ge, Yuhe Gao, Jianian Wang, Rui Song

    Abstract: Large Language Models (LLMs) have shown their success in language understanding and reasoning on general topics. However, their capability to perform inference based on user-specified structured data and knowledge in corpus-rare concepts, such as causal decision-making is still limited. In this work, we explore the possibility of fine-tuning an open-sourced LLM into LLM4Causal, which can identify…

    Submitted 11 April, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  5. arXiv:2307.00214  [pdf, ps, other]

    stat.ME

    Utilizing a Capture-Recapture Strategy to Accelerate Infectious Disease Surveillance

    Authors: Lin Ge, Yuzi Zhang, Lance A. Waller, Robert H. Lyles

    Abstract: Monitoring key elements of disease dynamics (e.g., prevalence, case counts) is of great importance in infectious disease prevention and control, as emphasized during the COVID-19 pandemic. To facilitate this effort, we propose a new capture-recapture (CRC) analysis strategy that takes misclassification into account from easily-administered, imperfect diagnostic test kits, such as the Rapid Antigen…

    Submitted 30 June, 2023; originally announced July 2023.

  6. arXiv:2306.10666  [pdf]

    stat.AP

    On some pitfalls of the log-linear modeling framework for capture-recapture studies in disease surveillance

    Authors: Yuzi Zhang, Lin Ge, Lance A. Waller, Robert H. Lyles

    Abstract: In epidemiological studies, the capture-recapture (CRC) method is a powerful tool that can be used to estimate the number of diseased cases or potentially disease prevalence based on data from overlapping surveillance systems. Estimators derived from log-linear models are widely applied by epidemiologists when analyzing CRC data. The popularity of the log-linear model framework is largely associat…

    Submitted 18 June, 2023; originally announced June 2023.

  7. Enhanced Inference for Finite Population Sampling-Based Prevalence Estimation with Misclassification Errors

    Authors: Lin Ge, Yuzi Zhang, Lance A. Waller, Robert H. Lyles

    Abstract: Epidemiologic screening programs often make use of tests with small, but non-zero probabilities of misdiagnosis. In this article, we assume the target population is finite with a fixed number of true cases, and that we apply an imperfect test with known sensitivity and specificity to a sample of individuals from the population. In this setting, we propose an enhanced inferential approach for use i…

    Submitted 13 August, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

  8. arXiv:2301.13348  [pdf, other]

    stat.ML cs.LG stat.ME

    A Reinforcement Learning Framework for Dynamic Mediation Analysis

    Authors: Lin Ge, Jitao Wang, Chengchun Shi, Zhenke Wu, Rui Song

    Abstract: Mediation analysis learns the causal effect transmitted via mediator variables between treatments and outcomes and receives increasing attention in various scientific domains to elucidate causal relations. Most existing works focus on point-exposure studies where each subject only receives one treatment at a single time point. However, there are a number of applications (e.g., mobile health) where…

    Submitted 2 September, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  9. A Design and Analytic Strategy for Monitoring Disease Positivity and Case Characteristics in Accessible Closed Populations

    Authors: Robert H. Lyles, Yuzi Zhang, Lin Ge, Lance A. Waller

    Abstract: We propose a monitoring strategy for efficient and robust estimation of disease prevalence and case numbers within closed and enumerated populations such as schools, workplaces, or retirement communities. The proposed design relies largely on voluntary testing, notoriously biased (e.g., in the case of COVID-19) due to non-representative sampling. The approach yields unbiased and comparatively prec…

    Submitted 9 December, 2022; originally announced December 2022.

  10. Tailoring Capture-Recapture Methods to Estimate Registry-Based Case Counts Based on Error-Prone Diagnostic Signals

    Authors: Lin Ge, Yuzi Zhang, Kevin C. Ward, Timothy L. Lash, Lance A. Waller, Robert H. Lyles

    Abstract: Surveillance research is of great importance for effective and efficient epidemiological monitoring of case counts and disease prevalence. Taking specific motivation from ongoing efforts to identify recurrent cases based on the Georgia Cancer Registry, we extend recently proposed "anchor stream" sampling design and estimation methodology. Our approach offers a more efficient and defensible alterna…

    Submitted 24 November, 2022; originally announced November 2022.

  11. arXiv:2202.12819  [pdf, other]

    stat.AP stat.ML

    Exploratory Hidden Markov Factor Models for Longitudinal Mobile Health Data: Application to Adverse Posttraumatic Neuropsychiatric Sequelae

    Authors: Lin Ge, Xinming An, Donglin Zeng, Samuel McLean, Ronald Kessler, Rui Song

    Abstract: Adverse posttraumatic neuropsychiatric sequelae (APNS) are common among veterans and millions of Americans after traumatic exposures, resulting in substantial burdens for trauma survivors and society. Despite numerous studies conducted on APNS over the past decades, there has been limited progress in understanding the underlying neurobiological mechanisms due to several unique challenges. One of t…

    Submitted 4 June, 2023; v1 submitted 25 February, 2022; originally announced February 2022.

  12. arXiv:2101.04783  [pdf, other]

    math.ST stat.ME

    Variable bandwidth kernel regression estimation

    Authors: Janet Nakarmi, Hailin Sang, Lin Ge

    Abstract: In this paper we propose a variable bandwidth kernel regression estimator for $i.i.d.$ observations in $\mathbb{R}^2$ to improve the classical Nadaraya-Watson estimator. The bias is improved to the order of $O(h_n^4)$ under the condition that the fifth order derivative of the density function and the sixth order derivative of the regression function are bounded and continuous. We also establish th…

    Submitted 12 January, 2021; originally announced January 2021.

    Comments: accepted by ESAIM: PS. 36 pages, 3 figures

    MSC Class: 62G07; 62E20; 62H12

  13. arXiv:2009.09161  [pdf, other]

    cs.LG cs.AI physics.app-ph stat.ML

    Label-Based Diversity Measure Among Hidden Units of Deep Neural Networks: A Regularization Method

    Authors: Chenguang Zhang, Yuexian Hou, Dawei Song, Liangzhu Ge, Yaoshuai Yao

    Abstract: Although the deep structure guarantees the powerful expressivity of deep networks (DNNs), it also triggers a serious overfitting problem. To improve the generalization capacity of DNNs, many strategies were developed to improve the diversity among hidden units. However, most of these strategies are empirical and heuristic in absence of either a theoretical derivation of the diversity measure or a cl…

    Submitted 3 April, 2021; v1 submitted 19 September, 2020; originally announced September 2020.