Showing 1–50 of 61 results for author: Simon, N

  1. arXiv:2406.18487  [pdf]

    astro-ph.EP astro-ph.SR

    14 New Light Curves and an Updated Ephemeris for the Hot Jupiter HAT-P-54 b

    Authors: Heather B. Hewitt, Bradley Hutson, Michael Brockman, Elizabeth Catogni, Rosemary Ferreira, Gary Fussell, Atea Johnson, Chris Kight, Ryan A. Kilinski, Khatu Nguyen, Ty Perry, Elizabeth Quinlan, Eva Randazzo, Kellan Reagan, Kinley Subers, Federico R. Noguer, Molly N. Simon, Robert T. Zellem

    Abstract: Here we present an analysis of 14 transit light curves of the hot Jupiter HAT-P-54 b. Thirteen of our datasets were obtained with the 6-inch MicroObservatory telescope, Cecilia, and one was measured with the 61-inch Kuiper Telescope. We used the EXOplanet Transit Interpretation Code (EXOTIC) to reduce 49 datasets in order to update the planet's ephemeris to a mid-transit time of 2460216.95257 +/-…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 9 pages, 9 figures

    Journal ref: Journal of the American Association of Variable Star Observers, 52, https://app.aavso.org/jaavso/article/3923/ (2024)

  2. arXiv:2405.20074  [pdf, ps, other]

    math.OC

    Control in the Coefficients of an Obstacle Problem

    Authors: Nicolai Simon, Winnifried Wollner

    Abstract: In this work, we consider optimality conditions of an optimal control problem governed by an obstacle problem. Here, we focus on introducing a, matrix valued, control variable as the coefficients of the obstacle problem. As it is well known, obstacle problems can be formulated as a complementarity system and consequently the associated solution operator is not Gateaux differentiable. As a conseque…

    Submitted 30 May, 2024; originally announced May 2024.

    MSC Class: 49K21; 49J40

  3. arXiv:2405.19615  [pdf, other]

    astro-ph.EP astro-ph.IM

    Enhancing Exoplanet Ephemerides by Leveraging Professional and Citizen Science Data: A Test Case with WASP-77A b

    Authors: Federico R. Noguer, Suber Corley, Kyle A. Pearson, Robert T. Zellem, Molly N. Simon, Jennifer A. Burt, Isabela Huckabee, Prune C. August, Megan Weiner Mansfield, Paul A. Dalba, Peter C. B. Smith, Timothy Banks, Ira Bell, Dominique Daniel, Lindsay Dawson, Jesús De Mula, Marc Deldem, Dimitrios Deligeorgopoulos, Romina P. Di Sisto, Roger Dymock, Phil Evans, Giulio Follero, Martin J. F. Fowler, Eduardo Fernández-Lajús, Alex Hamrick, et al. (20 additional authors not shown)

    Abstract: We present an updated ephemeris and physical parameters for the exoplanet WASP-77 A b. In this effort, we combine 64 ground- and space-based transit observations, 6 space-based eclipse observations, and 32 radial velocity observations to produce the most precise orbital solution to date for this target, aiding in the planning of James Webb Space Telescope (JWST) and Ariel observations and atmosphe…

    Submitted 4 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: Updated a co-author name. Added a co-author. Added an acknowledgement

  4. arXiv:2405.19435  [pdf, other]

    physics.ed-ph

    Bringing Lecture-Tutorials Online: An Analysis of A New Strategy to Teach Planet Formation in the Undergraduate Classroom

    Authors: Haylee N. Archer, Molly N. Simon, Chris Mead, Edward E. Prather, Mia Brunkhorst, Diana Hunsley

    Abstract: Previous studies conclusively show that pencil-and-paper lecture-tutorials (LTs) are incredibly effective at increasing student engagement and learning gains on a variety of topics when compared to traditional lecture. LTs in astronomy are post-lecture activities developed with the intention of helping students engage with conceptual and reasoning difficulties around a specific topic with the end…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: In press in the Astronomy Education Journal

  5. arXiv:2405.15117  [pdf, other]

    physics.ed-ph astro-ph.IM

    A Pilot Study from the First Course-Based Undergraduate Research Experience for Online Degree-Seeking Astronomy Students

    Authors: Justin Hom, Jennifer Patience, Karen Knierman, Molly N. Simon, Ara Austin

    Abstract: Research-based active learning approaches are critical for the teaching and learning of undergraduate STEM majors. Course-based undergraduate research experiences (CUREs) are becoming more commonplace in traditional, in-person academic environments, but have only just started to be utilized in online education. Online education has been shown to create accessible pathways to knowledge for individu…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 21 pages, 12 figures, 19 tables, Accepted for Publication in the Astronomy Education Journal

  6. arXiv:2311.14100  [pdf, other]

    cs.RO

    MonoNav: MAV Navigation via Monocular Depth Estimation and Reconstruction

    Authors: Nathaniel Simon, Anirudha Majumdar

    Abstract: A major challenge in deploying the smallest of Micro Aerial Vehicle (MAV) platforms (< 100 g) is their inability to carry sensors that provide high-resolution metric depth information (e.g., LiDAR or stereo cameras). Current systems rely on end-to-end learning or heuristic approaches that directly map images to control inputs, and struggle to fly fast in unknown environments. In this work, we ask…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: International Symposium on Experimental Robotics (ISER) 2023

  7. arXiv:2311.12726  [pdf, other]

    stat.ME stat.AP

    Nonparametric variable importance for time-to-event outcomes with application to prediction of HIV infection

    Authors: Charles J. Wolock, Peter B. Gilbert, Noah Simon, Marco Carone

    Abstract: In survival analysis, complex machine learning algorithms have been increasingly used for predictive modeling. Given a collection of features available for inclusion in a predictive model, it may be of interest to quantify the relative importance of a subset of features for the prediction task at hand. In particular, in HIV vaccine trials, participant baseline characteristics are used to predict t…

    Submitted 11 December, 2023; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: 91 total pages (31 main text, 60 supplementary); 14 total figures (4 main text, 10 supplementary)

  8. arXiv:2308.01470  [pdf, other]

    math.ST stat.ME

    Improved convergence rates of nonparametric penalized regression under misspecified total variation

    Authors: Marlena S. Bannick, Noah Simon

    Abstract: Penalties that induce smoothness are common in nonparametric regression. In many settings, the amount of smoothness in the data generating function will not be known. Simon and Shojaie (2021) derived convergence rates for nonparametric estimators under misspecified smoothness. We show that their theoretical convergence rates can be improved by working with convenient approximating functions. Prope…

    Submitted 2 August, 2023; originally announced August 2023.

  9. arXiv:2307.00869  [pdf, other]

    math.OC

    Coefficient Control of Variational Inequalities

    Authors: Andreas Hehl, Denis Khimin, Ira Neitzel, Nicolai Simon, Thomas Wick, Winnifried Wollner

    Abstract: Within this chapter, we discuss control in the coefficients of an obstacle problem. Utilizing tools from H-convergence, we show existence of optimal solutions. First order necessary optimality conditions are obtained after deriving directional differentiability of the coefficient to solution mapping for the obstacle problem. Further, considering a regularized obstacle problem as a constraint yield…

    Submitted 3 July, 2023; originally announced July 2023.

  10. arXiv:2306.17251  [pdf]

    astro-ph.EP astro-ph.SR

    13 New Light Curves and Updated Mid-Transit Time and Period for Hot Jupiter WASP-104 b with EXOTIC

    Authors: Heather B. Hewitt, Federico Noguer, Suber Corley, James Ball, Claudia Chastain, Richard Cochran-White, Kendall Collins, Kris Ganzel, Kimberly Merriam Gray, Mike Logan, Steve Marquez-Perez, Chyna Merchant, Matthew Pedone, Gina Plumey, Matthew Rice, Zachary Ruybal, Molly N. Simon, Isabela Huckabee, Robert T. Zellem, Kyle A. Pearson

    Abstract: Using the EXOplanet Transit Interpretation Code (EXOTIC), we reduced 52 sets of images of WASP-104 b, a Hot Jupiter-class exoplanet orbiting WASP-104, in order to obtain an updated mid-transit time (ephemeris) and orbital period for the planet. We performed this reduction on images taken with a 6-inch telescope of the Center for Astrophysics | Harvard & Smithsonian MicroObservatory. Of the reduced…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: 6 pages, 5 figures, published in JAAVSO

    Journal ref: Journal of the American Association of Variable Star Observers, 51(1) (2023)

  11. arXiv:2306.08776  [pdf, other]

    cs.RO

    Online Learning for Obstacle Avoidance

    Authors: David Snyder, Meghan Booker, Nathaniel Simon, Wenhan Xia, Daniel Suo, Elad Hazan, Anirudha Majumdar

    Abstract: We approach the fundamental problem of obstacle avoidance for robotic systems via the lens of online learning. In contrast to prior work that either assumes worst-case realizations of uncertainty in the environment or a stationary stochastic model of uncertainty, we propose a method that is efficient to implement and provably grants instance-optimality with respect to perturbations of trajectories…

    Submitted 5 November, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 8 + 21 pages, 2 + 11 figures, Accepted to CoRL 2023 [Poster]

  12. arXiv:2211.03031  [pdf, other]

    stat.ME

    A framework for leveraging machine learning tools to estimate personalized survival curves

    Authors: Charles J. Wolock, Peter B. Gilbert, Noah Simon, Marco Carone

    Abstract: The conditional survival function of a time-to-event outcome subject to censoring and truncation is a common target of estimation in survival analysis. This parameter may be of scientific interest and also often appears as a nuisance in nonparametric and semiparametric problems. In addition to classical parametric and semiparametric methods (e.g., based on the Cox proportional hazards model), flex…

    Submitted 31 October, 2023; v1 submitted 6 November, 2022; originally announced November 2022.

    Comments: 52 pages, 13 figures

  13. arXiv:2210.05857  [pdf, other]

    cs.RO

    FlowDrone: Wind Estimation and Gust Rejection on UAVs Using Fast-Response Hot-Wire Flow Sensors

    Authors: Nathaniel Simon, Allen Z. Ren, Alexander Piqué, David Snyder, Daphne Barretto, Marcus Hultmark, Anirudha Majumdar

    Abstract: Unmanned aerial vehicles (UAVs) are finding use in applications that place increasing emphasis on robustness to external disturbances including extreme wind. However, traditional multirotor UAV platforms do not directly sense wind; conventional flow sensors are too slow, insensitive, or bulky for widespread integration on UAVs. Instead, drones typically observe the effects of wind indirectly throu…

    Submitted 24 October, 2022; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Submitted to ICRA 2023. See supplementary video at https://youtu.be/KWqkH9Z-338

  14. Fast-response hot-wire flow sensors for wind and gust estimation on UAVs

    Authors: Nathaniel Simon, Alexander Piqué, David Snyder, Kyle Ikuma, Anirudha Majumdar, Marcus Hultmark

    Abstract: Due to limitations in available sensor technology, unmanned aerial vehicles (UAVs) lack an active sensing capability to measure turbulence, gusts, or other unsteady aerodynamic phenomena. Conventional in situ anemometry techniques fail to deliver in the harsh and dynamic multirotor environment due to form factor, resolution, or robustness requirements. To address this capability gap, a novel, fast…

    Submitted 26 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: 23 pages, 11+2 figures, under review

  15. arXiv:2206.12393  [pdf, other]

    stat.ME

    Accounting for Inconsistent Use of Covariate Adjustment in Group Sequential Trials

    Authors: Marlena S. Bannick, Sonya L. Heltshe, Noah Simon

    Abstract: Group sequential designs in clinical trials allow for interim efficacy and futility monitoring. Adjustment for baseline covariates can increase power and precision of estimated effects. However, inconsistently applying covariate adjustment throughout the stages of a group sequential trial can result in inflation of type I error, biased point estimates, and anti-conservative confidence intervals. W…

    Submitted 9 August, 2023; v1 submitted 24 June, 2022; originally announced June 2022.

  16. arXiv:2206.02994  [pdf, other]

    stat.ME math.ST

    Regression in Tensor Product Spaces by the Method of Sieves

    Authors: Tianyu Zhang, Noah Simon

    Abstract: Estimation of a conditional mean (linking a set of features to an outcome of interest) is a fundamental statistical task. While there is an appeal to flexible nonparametric procedures, effective estimation in many classical nonparametric function spaces (e.g., multivariate Sobolev spaces) can be prohibitively difficult -- both statistically and computationally -- especially when the number of feat…

    Submitted 7 June, 2022; originally announced June 2022.

  17. arXiv:2201.10410  [pdf, other]

    cs.CV

    Comparison of Evaluation Metrics for Landmark Detection in CMR Images

    Authors: Sven Koehler, Lalith Sharan, Julian Kuhm, Arman Ghanaat, Jelizaveta Gordejeva, Nike K. Simon, Niko M. Grell, Florian André, Sandy Engelhardt

    Abstract: Cardiac Magnetic Resonance (CMR) images are widely used for cardiac diagnosis and ventricular assessment. Extracting specific landmarks like the right ventricular insertion points is of importance for spatial alignment and 3D modeling. The automatic detection of such landmarks has been tackled by multiple groups using Deep Learning, but relatively little attention has been paid to the failure case…

    Submitted 28 January, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

    Comments: Accepted at Bildverarbeitung für die Medizin (BVM), Informatik aktuell. Springer Vieweg, Wiesbaden 2022

  18. arXiv:2112.03428  [pdf, other]

    stat.ME stat.CO stat.ML

    Mesh-Based Solutions for Nonparametric Penalized Regression

    Authors: Brayan Ortiz, Noah Simon

    Abstract: It is often of interest to estimate regression functions non-parametrically. Penalized regression (PR) is one statistically-effective, well-studied solution to this problem. Unfortunately, in many cases, finding exact solutions to PR problems is computationally intractable. In this manuscript, we propose a mesh-based approximate solution (MBS) for those scenarios. MBS transforms the complicated fu…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: 29 pages, 4 figures

    MSC Class: 62G08; 62J07 (Primary); 62G20 (Secondary) ACM Class: G.3

  19. arXiv:2107.08787  [pdf]

    stat.AP cs.LG

    The Future will be Different than Today: Model Evaluation Considerations when Developing Translational Clinical Biomarker

    Authors: Yichen Lu, Jane Fridlyand, Tiffany Tang, Ting Qi, Noah Simon, Ning Leng

    Abstract: Finding translational biomarkers stands center stage of the future of personalized medicine in healthcare. We observed notable challenges in identifying robust biomarkers as some with great performance in one scenario often fail to perform well in new trials (e.g. different population, indications). With rapid development in the clinical trial world (e.g. assay, disease definition), new trials ver…

    Submitted 13 July, 2021; originally announced July 2021.

    Comments: Paper has 4 pages, 2 figures. Appendix are supplementary at the end

  20. arXiv:2105.01874  [pdf, other]

    math.ST stat.ME stat.ML

    On the Optimality of Nuclear-norm-based Matrix Completion for Problems with Smooth Non-linear Structure

    Authors: Yunhua Xiang, Tianyu Zhang, Xu Wang, Ali Shojaie, Noah Simon

    Abstract: Originally developed for imputing missing entries in low rank, or approximately low rank matrices, matrix completion has proven widely effective in many problems where there is no reason to assume low-dimensional linear structure in the underlying matrix, as would be imposed by rank constraints. In this manuscript, we build some theoretical intuition for this behavior. We consider matrices which a…

    Submitted 5 May, 2021; originally announced May 2021.

    Comments: 47 pages, 1 figure

  21. arXiv:2104.00846  [pdf, other]

    math.ST stat.ME

    A Sieve Stochastic Gradient Descent Estimator for Online Nonparametric Regression in Sobolev ellipsoids

    Authors: Tianyu Zhang, Noah Simon

    Abstract: The goal of regression is to recover an unknown underlying function that best links a set of predictors to an outcome from noisy observations. In nonparametric regression, one assumes that the regression function belongs to a pre-specified infinite-dimensional function space (the hypothesis space). In the online setting, when the observations come in a stream, it is computationally-preferable to i…

    Submitted 6 January, 2022; v1 submitted 1 April, 2021; originally announced April 2021.

  22. arXiv:2104.00780  [pdf, other]

    stat.ME

    An Online Projection Estimator for Nonparametric Regression in Reproducing Kernel Hilbert Spaces

    Authors: Tianyu Zhang, Noah Simon

    Abstract: The goal of nonparametric regression is to recover an underlying regression function from noisy observations, under the assumption that the regression function belongs to a pre-specified infinite dimensional function space. In the online setting, when the observations come in a stream, it is generally computationally infeasible to refit the whole model repeatedly. There are as of yet no methods th…

    Submitted 1 April, 2021; originally announced April 2021.

  23. arXiv:2011.13944  [pdf, other]

    astro-ph.EP astro-ph.IM astro-ph.SR

    Planet Hunters TESS II: Findings from the first two years of TESS

    Authors: Nora L. Eisner, Oscar Barragán, Chris Lintott, Suzanne Aigrain, Belinda Nicholson, Tabetha S. Boyajian, Steve B. Howell, Cole Johnston, Ben Lakeland, Grant Miller, Adam McMaster, Hannu Parviainen, Emily J. Safron, Megan E. Schwamb, Laura Trouille, Sophia Vaughan, Norbert Zicher, Campbell Allen, Sarah Allen, Mark Bouslog, Cliff Johnson, Molly N. Simon, Zach Wolfenbarger, Elisabeth M. L. Baeten, David M. Bundy, et al. (1 additional author not shown)

    Abstract: We present the results from the first two years of the Planet Hunters TESS citizen science project, which identifies planet candidates in the TESS data by engaging members of the general public. Over 22,000 citizen scientists from around the world visually inspected the first 26 Sectors of TESS data in order to help identify transit-like signals. We use a clustering algorithm to combine these clas…

    Submitted 27 November, 2020; originally announced November 2020.

    Comments: Accepted for publication in MNRAS (22 pages, 12 figures, 3 tables)

  24. arXiv:2010.00718  [pdf, other]

    stat.ML cs.LG stat.CO

    When to Impute? Imputation before and during cross-validation

    Authors: Byron C. Jaeger, Nicholas J. Tierney, Noah R. Simon

    Abstract: Cross-validation (CV) is a technique used to estimate generalization error for prediction models. For pipeline modeling algorithms (i.e. modeling procedures with multiple steps), it has been recommended the entire sequence of steps be carried out during each replicate of CV to mimic the application of the entire pipeline to an external testing set. While theoretically sound, following this recomme…

    Submitted 1 October, 2020; originally announced October 2020.

    Comments: 11 pages (main text, not including references), 6 tables, and 4 figures. Code to replicate manuscript available at https://github.com/bcjaeger/Imputation-and-CV

  25. arXiv:2005.04834  [pdf, other]

    stat.ML cs.LG stat.ME

    Ensembled sparse-input hierarchical networks for high-dimensional datasets

    Authors: Jean Feng, Noah Simon

    Abstract: Neural networks have seen limited use in prediction for high-dimensional data with small sample sizes, because they tend to overfit and require tuning many more hyperparameters than existing off-the-shelf machine learning methods. With small modifications to the network architecture and training procedure, we show that dense neural networks can be a practical data analysis tool in these settings.…

    Submitted 10 May, 2020; originally announced May 2020.

  26. Spatial Matrix Completion for Spatially-Misaligned and High-Dimensional Air Pollution Data

    Authors: Phuong T. Vu, Adam A. Szpiro, Noah Simon

    Abstract: In health-pollution cohort studies, accurate predictions of pollutant concentrations at new locations are needed, since the locations of fixed monitoring sites and study participants are often spatially misaligned. For multi-pollution data, principal component analysis (PCA) is often incorporated to obtain low-rank (LR) structure of the data prior to spatial prediction. Recently developed predicti…

    Submitted 21 January, 2022; v1 submitted 11 April, 2020; originally announced April 2020.

    Comments: 26 pages, 5 figures, 5 tables, 1 supplemental file (available upon request). This v2 is a pre peer-reviewed version that was submitted to Environmetrics. A final version with minor revisions was accepted for publication by Environmetrics on Dec 13, 2021, and will be linked to this version once published

  27. arXiv:2004.03683  [pdf, other]

    stat.ME math.ST stat.ML

    A general framework for inference on algorithm-agnostic variable importance

    Authors: Brian D. Williamson, Peter B. Gilbert, Noah R. Simon, Marco Carone

    Abstract: In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response -- in other words, to gauge the variable importance of features. Most recent work on variable importance assessment has focused on describing the importance of features within the confines of a given prediction algorithm. However, such assessment…

    Submitted 13 September, 2021; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: 69 total pages (35 in the main document, 34 supplementary), 23 figures (4 in the main document, 19 supplementary)

  28. arXiv:2003.00401  [pdf, other]

    stat.AP

    A flexible Bayesian framework to estimate age- and cause-specific child mortality over time from sample registration data

    Authors: Austin E Schumacher, Tyler H McCormick, Jon Wakefield, Yue Chu, Jamie Perin, Francisco Villavicencio, Noah Simon, Li Liu

    Abstract: In order to implement disease-specific interventions in young age groups, policy makers in low- and middle-income countries require timely and accurate estimates of age- and cause-specific child mortality. High quality data is not available in settings where these interventions are most needed, but there is a push to create sample registration systems that collect detailed mortality information. C…

    Submitted 18 May, 2021; v1 submitted 29 February, 2020; originally announced March 2020.

    Comments: 16 pages, 4 figures, submitted to The Annals of Applied Statistics

    MSC Class: 62P99

  29. arXiv:2003.00116  [pdf, other]

    math.ST

    BigSurvSGD: Big Survival Data Analysis via Stochastic Gradient Descent

    Authors: Aliasghar Tarkhan, Noah Simon

    Abstract: In many biomedical applications, outcome is measured as a ``time-to-event'' (eg. disease progression or death). To assess the connection between features of a patient and this outcome, it is common to assume a proportional hazards model, and fit a proportional hazards regression (or Cox regression). To fit this model, a log-concave objective function known as the ``partial likelihood'' is maximize…

    Submitted 9 August, 2020; v1 submitted 28 February, 2020; originally announced March 2020.

    Comments: 37 pages, 11 figures

  30. arXiv:1912.12413  [pdf, other]

    stat.ML cs.LG

    Approval policies for modifications to Machine Learning-Based Software as a Medical Device: A study of bio-creep

    Authors: Jean Feng, Scott Emerson, Noah Simon

    Abstract: Successful deployment of machine learning algorithms in healthcare requires careful assessments of their performance and safety. To date, the FDA approves locked algorithms prior to marketing and requires future updates to undergo separate premarket reviews. However, this negates a key feature of machine learning--the ability to learn from a growing dataset and improve over time. This paper frames…

    Submitted 28 December, 2019; originally announced December 2019.

  31. arXiv:1906.05473  [pdf, other]

    stat.ML cs.LG

    Selective prediction-set models with coverage guarantees

    Authors: Jean Feng, Arjun Sondhi, Jessica Perry, Noah Simon

    Abstract: Though black-box predictors are state-of-the-art for many complex tasks, they often fail to properly quantify predictive uncertainty and may provide inappropriate predictions for unfamiliar data. Instead, we can learn more reliable models by letting them either output a prediction set or abstain when the uncertainty is high. We propose training these selective prediction-set models using an uncert…

    Submitted 10 December, 2021; v1 submitted 13 June, 2019; originally announced June 2019.

    Comments: Published at Biometrics

  32. arXiv:1905.12768  [pdf, other]

    stat.ME

    Using Propensity Scores to Develop and Evaluate Treatment Rules with Observational Data

    Authors: Jeremy Roth, Noah Simon

    Abstract: In this paper, we outline a principled approach to estimate an individualized treatment rule that is appropriate for data from observational studies where, in addition to treatment assignment not being independent of individual characteristics, some characteristics may affect treatment assignment in the current study but not be available in future clinical settings where the estimated rule would b…

    Submitted 3 June, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

  33. arXiv:1904.00117  [pdf, other]

    q-bio.QM stat.AP

    Estimation of cell lineage trees by maximum-likelihood phylogenetics

    Authors: Jean Feng, William S DeWitt III, Aaron McKenna, Noah Simon, Amy Willis, Frederick A Matsen IV

    Abstract: CRISPR technology has enabled large-scale cell lineage tracing for complex multicellular organisms by mutating synthetic genomic barcodes during organismal development. However, these sophisticated biological tools currently use ad-hoc and outmoded computational methods to reconstruct the cell lineage tree from the mutated barcodes. Because these methods are agnostic to the biological mechanism, t…

    Submitted 29 March, 2019; originally announced April 2019.

  34. An analysis of the cost of hyper-parameter selection via split-sample validation, with applications to penalized regression

    Authors: Jean Feng, Noah Simon

    Abstract: In the regression setting, given a set of hyper-parameters, a model-estimation procedure constructs a model from training data. The optimal hyper-parameters that minimize generalization error of the model are usually unknown. In practice they are often estimated using split-sample validation. Up to now, there is an open question regarding how the generalization error of the selected model grows wi…

    Submitted 28 March, 2019; originally announced March 2019.

  35. arXiv:1903.04641  [pdf, other]

    stat.ME math.ST stat.ML

    Generalized Sparse Additive Models

    Authors: Asad Haris, Noah Simon, Ali Shojaie

    Abstract: We present a unified framework for estimation and analysis of generalized additive models in high dimensions. The framework defines a large class of penalized regression estimators, encompassing many existing methods. An efficient computational algorithm for this class is presented that easily scales to thousands of observations and features. We prove minimax optimal convergence bounds for this cl…

    Submitted 11 March, 2019; originally announced March 2019.

  36. arXiv:1903.04631  [pdf, other]

    stat.ML cs.LG

    Wavelet regression and additive models for irregularly spaced data

    Authors: Asad Haris, Noah Simon, Ali Shojaie

    Abstract: We present a novel approach for nonparametric regression using wavelet basis functions. Our proposal, $\texttt{waveMesh}$, can be applied to non-equispaced data with sample size not necessarily a power of 2. We develop an efficient proximal gradient descent algorithm for computing the estimator and establish adaptive minimax convergence rates. The main appeal of our approach is that it naturally e…

    Submitted 11 March, 2019; originally announced March 2019.

    Journal ref: Advances in Neural Information Processing Systems 2018, 8987-8997

  37. arXiv:1711.07592  [pdf, other]

    stat.ME stat.ML

    Sparse-Input Neural Networks for High-dimensional Nonparametric Regression and Classification

    Authors: Jean Feng, Noah Simon

    Abstract: Neural networks are usually not the tool of choice for nonparametric high-dimensional problems where the number of input features is much larger than the number of observations. Though neural networks can approximate complex multivariate functions, they generally require a large number of training observations to obtain reasonable fits, unless one can learn the appropriate network structure. In th…

    Submitted 21 June, 2019; v1 submitted 20 November, 2017; originally announced November 2017.

  38. arXiv:1711.04057  [pdf, other]

    q-bio.PE stat.AP

    Survival analysis of DNA mutation motifs with penalized proportional hazards

    Authors: Jean Feng, David A. Shaw, Vladimir N. Minin, Noah Simon, Frederick A. Matsen IV

    Abstract: Antibodies, an essential part of our immune system, develop through an intricate process to bind a wide array of pathogens. This process involves randomly mutating DNA sequences encoding these antibodies to find variants with improved binding, though mutations are not distributed uniformly across sequence sites. Immunologists observe this nonuniformity to be consistent with "mutation motifs", whic…

    Submitted 21 September, 2018; v1 submitted 10 November, 2017; originally announced November 2017.

  39. arXiv:1703.09813  [pdf, other]

    stat.ML

    Gradient-based Regularization Parameter Selection for Problems with Non-smooth Penalty Functions

    Authors: Jean Feng, Noah Simon

    Abstract: In high-dimensional and/or non-parametric regression problems, regularization (or penalization) is used to control model complexity and induce desired structure. Each penalty has a weight parameter that indicates how strongly the structure corresponding to that penalty should be enforced. Typically the parameters are chosen to minimize the error on a separate validation set using a simple grid sea…

    Submitted 28 March, 2017; originally announced March 2017.

  40. arXiv:1703.06946  [pdf, other]

    stat.AP q-bio.NC

    SCALPEL: Extracting Neurons from Calcium Imaging Data

    Authors: Ashley Petersen, Noah Simon, Daniela Witten

    Abstract: In the past few years, new technologies in the field of neuroscience have made it possible to simultaneously image activity in large populations of neurons at cellular resolution in behaving animals. In mid-2016, a huge repository of this so-called "calcium imaging" data was made publicly-available. The availability of this large-scale data resource opens the door to a host of scientific questions…

    Submitted 20 March, 2017; originally announced March 2017.

  41. arXiv:1702.06986  [pdf, other

    stat.ME

    Rank conditional coverage and confidence intervals in high dimensional problems

    Authors: Jean Morrison, Noah Simon

    Abstract: Confidence interval procedures used in low dimensional settings are often inappropriate for high dimensional applications. When a large number of parameters are estimated, marginal confidence intervals associated with the most significant estimates have very low coverage rates: They are too small and centered at biased estimates. The problem of forming confidence intervals in high dimensional sett…

    Submitted 22 February, 2017; originally announced February 2017.

  42. arXiv:1611.09972  [pdf, ps, other

    stat.ME math.ST stat.ML

    Nonparametric Regression with Adaptive Truncation via a Convex Hierarchical Penalty

    Authors: Asad Haris, Ali Shojaie, Noah Simon

    Abstract: We consider the problem of non-parametric regression with a potentially large number of covariates. We propose a convex, penalized estimation framework that is particularly well-suited for high-dimensional sparse additive models. The proposed approach combines appealing features of finite basis representation and smoothing penalties for non-parametric estimation. In particular, in the case of addi…

    Submitted 18 June, 2019; v1 submitted 29 November, 2016; originally announced November 2016.

    Journal ref: Biometrika 2018, Vol. 106, No. 1, 87-107

  43. Simultaneous detection and estimation of trait associations with genomic phenotypes

    Authors: Jean Morrison, Noah Simon, Daniela Witten

    Abstract: Genomic phenotypes, such as DNA methylation and chromatin accessibility, can be used to characterize the transcriptional and regulatory activity of DNA within a cell. Recent technological advances have made it possible to measure such phenotypes very densely. This density often results in spatial structure, in the sense that measurements at nearby sites are very similar. In this paper, we consid…

    Submitted 14 November, 2016; originally announced November 2016.

    Comments: In press in Biostatistics (2016)

  44. Graphical Models for Zero-Inflated Single Cell Gene Expression

    Authors: Andrew McDavid, Raphael Gottardo, Noah Simon, Mathias Drton

    Abstract: Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant on a population level, giving rise to zero-in…

    Submitted 14 March, 2018; v1 submitted 18 October, 2016; originally announced October 2016.

    Comments: Fixed error in software URL

    Journal ref: Ann. Appl. Stat., Volume 13, Number 2 (2019), 848-873

  45. arXiv:1609.05551  [pdf, other

    math.ST stat.OT

    Graphical Models for Discrete and Continuous Data

    Authors: Rui Zhuang, Noah Simon, Johannes Lederer

    Abstract: We introduce a general framework for undirected graphical models. It generalizes Gaussian graphical models to a wide range of continuous, discrete, and combinations of different types of data. The models in the framework, called exponential trace models, are amenable to estimation based on maximum likelihood. We introduce a sampling-based approximation algorithm for computing the maximum likelihoo…

    Submitted 15 June, 2019; v1 submitted 18 September, 2016; originally announced September 2016.

  46. arXiv:1608.06992  [pdf, other

    astro-ph.SR astro-ph.EP astro-ph.GA

    Tracing Slow Winds from T Tauri Stars via Low Velocity Forbidden Line Emission

    Authors: M. N. Simon, I. Pascucci, S. Edwards, W. Feng, U. Gorti, D. Hollenbach, E. Rigliaco, J. T. Keane

    Abstract: Using Keck/HIRES spectra (Δv ~ 7 km/s), we analyze forbidden lines of [O I] 6300 Å, [O I] 5577 Å and [S II] 6731 Å from 33 T Tauri stars covering a range of disk evolutionary stages. After removing a high velocity component (HVC) associated with microjets, we study the properties of the low velocity component (LVC). The LVC can be attributed to slow disk winds that could be magnetically (MHD) or the…

    Submitted 24 August, 2016; originally announced August 2016.

  47. Narrow Na and K Absorption Lines Toward T Tauri Stars - Tracing the Atomic Envelope of Molecular Clouds

    Authors: I. Pascucci, S. Edwards, M. Heyer, E. Rigliaco, L. Hillenbrand, U. Gorti, D. Hollenbach, M. N. Simon

    Abstract: We present a detailed analysis of narrow NaI and KI absorption resonance lines toward nearly 40 T Tauri stars in Taurus with the goal of clarifying their origin. The NaI 5889.95 angstrom line is detected toward all but one source, while the weaker KI 7698.96 angstrom line is detected in about two thirds of the sample. The similarity in their peak centroids and the significant positive correlation between t…

    Submitted 7 October, 2015; originally announced October 2015.

    Comments: Accepted to ApJ

  48. Convex Modeling of Interactions with Strong Heredity

    Authors: Asad Haris, Daniela Witten, Noah Simon

    Abstract: We consider the task of fitting a regression model involving interactions among a potentially large set of covariates, in which we wish to enforce strong heredity. We propose FAMILY, a very general framework for this task. Our proposal is a generalization of several existing methods, such as VANISH [Radchenko and James, 2010], hierNet [Bien et al., 2013], the all-pairs lasso, and the lasso using o…

    Submitted 3 October, 2015; v1 submitted 13 October, 2014; originally announced October 2014.

    Comments: Final version accepted for publication in JCGS

    Journal ref: Journal of Computational and Graphical Statistics 2016, Vol. 25, No. 4, 981-1004

  49. arXiv:1409.5391  [pdf, other

    stat.ME stat.ML

    Fused Lasso Additive Model

    Authors: Ashley Petersen, Daniela Witten, Noah Simon

    Abstract: We consider the problem of predicting an outcome variable using $p$ covariates that are measured on $n$ independent observations, in the setting in which flexible and interpretable fits are desirable. We propose the fused lasso additive model (FLAM), in which each additive function is estimated to be piecewise constant with a small number of adaptively-chosen knots. FLAM is the solution to a conve…

    Submitted 18 September, 2014; originally announced September 2014.

  50. arXiv:1405.4251  [pdf, other

    stat.ME stat.AP stat.ML

    Selection Bias Correction and Effect Size Estimation under Dependence

    Authors: Kean Ming Tan, Noah Simon, Daniela Witten

    Abstract: We consider large-scale studies in which it is of interest to test a very large number of hypotheses, and then to estimate the effect sizes corresponding to the rejected hypotheses. For instance, this setting arises in the analysis of gene expression or DNA sequencing data. However, naive estimates of the effect sizes suffer from selection bias, i.e., some of the largest naive estimates are large…

    Submitted 28 March, 2015; v1 submitted 16 May, 2014; originally announced May 2014.

    Comments: 21 pages, 2 figures