A Review of EMA Public Assessment Reports where Non-Proportional Hazards were Identified

Authors: Florian Klinglmueller, Norbert Benda, Tim Friede, Tobias Fellinger, Harald Heinzl, Andrew Hooker, Franz Koenig, Tim Mathes, Martin Posch, Florian Stampfer, Susanne Urach

Abstract: While well-established methods for time-to-event data are available when the proportional hazards assumption holds, there is no consensus on the best approach under non-proportional hazards (NPH). A wide range of parametric and non-parametric methods for testing and estimation in this scenario have been proposed. In this review we identified EMA marketing authorization procedures where non-proportional hazards were raised as a potential issue in the risk-benefit assessment and extracted relevant information on trial design and results reported in the corresponding European Public Assessment Reports (EPARs) available in the database at paediatricdata.eu. We identified 16 marketing authorization procedures, reporting results on a total of 18 trials. Most procedures covered the authorization of treatments from the oncology domain. For the majority of trials, NPH issues were related to a suspected delayed treatment effect or to different treatment effects in known subgroups. Issues related to censoring or treatment switching were also identified. For most of the trials the primary analysis was performed using conventional methods assuming proportional hazards, even if NPH was anticipated. Differential treatment effects were addressed using stratification, and delayed treatment effects were considered in sample size planning. Even though NPH was not considered in the primary analysis, some procedures reported extensive sensitivity analyses and model diagnostics evaluating the proportional hazards assumption. For a few procedures, methods addressing NPH (e.g. weighted log-rank tests) were used in the primary analysis. We extracted estimates of the median survival, hazard ratios, and time of survival curve separation. In addition, we digitized the KM curves to reconstruct approximate individual patient-level data. Extracted outcomes served as the basis for a simulation study of methods for time-to-event analysis under NPH.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 16 pages, 5 figures

arXiv:2403.14348 [pdf, other]

Statistical modeling to adjust for time trends in adaptive platform trials utilizing non-concurrent controls

Authors: Pavla Krotka, Martin Posch, Mohamed Gewily, Günter Höglinger, Marta Bofill Roig

Abstract: Utilizing non-concurrent controls in the analysis of late-entering experimental arms in platform trials has recently received considerable attention at both academic and regulatory levels. While incorporating these data can lead to increased power and lower required sample sizes, it might also introduce bias to the effect estimators if temporal drifts are present in the trial. Aiming to mitigate the potential calendar time bias, we propose various frequentist model-based approaches that leverage the non-concurrent control data while adjusting for time trends. One of the currently available frequentist models incorporates time as a categorical fixed effect, separating the duration of the trial into periods, defined as time intervals bounded by any treatment arm entering or leaving the platform. In this work, we propose two extensions of this model. First, we consider an alternative definition of the time covariate by dividing the trial into fixed-length calendar time intervals. Second, we propose alternative methods to adjust for time trends. In particular, we investigate adjusting for autocorrelated random effects to account for dependency between closer time intervals and employing spline regression to model time with a smooth polynomial function. We evaluate the performance of the proposed approaches in a simulation study and illustrate their use by means of a case study.

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2402.08336 [pdf, ps, other]

A two-step approach for analyzing time to event data under non-proportional hazards

Authors: Jonas Brugger, Tim Friede, Florian Klinglmüller, Martin Posch, Robin Ristl, Franz König

Abstract: The log-rank test and the Cox proportional hazards model are commonly used to compare time-to-event data in clinical trials, as they are most powerful under proportional hazards. But there is a loss of power if this assumption is violated, which is the case for some new oncology drugs like immunotherapies. We consider a two-stage test procedure, in which the weighting of the log-rank test statistic depends on a pre-test of the proportional hazards assumption. That is, depending on the pre-test, either the log-rank test or an alternative test is used to compare the survival probabilities. We show that, if naively implemented, this can lead to a substantial inflation of the type I error rate. To address this, we embed the two-stage test in a permutation test framework to keep the nominal level alpha. We compare the operating characteristics of the two-stage test with the log-rank test and other tests by clinical trial simulations.

Submitted 13 February, 2024; originally announced February 2024.

arXiv:2312.08169 [pdf, other]

Efficiency of Multivariate Tests in Trials in Progressive Supranuclear Palsy

Authors: Elham Yousefi, Mohamed Gewily, Franz König, Günter Höglinger, Franziska Hopfner, Mats O. Karlsson, Robin Ristl, Sonja Zehetmayer, Martin Posch

Abstract: Measuring disease progression in clinical trials that test novel treatments for multifaceted diseases such as Progressive Supranuclear Palsy (PSP) remains challenging. In this study we assess a range of statistical approaches to compare outcomes measured by the items of the Progressive Supranuclear Palsy Rating Scale (PSPRS). We consider several statistical approaches, including sum scores such as an FDA-recommended version of the PSPRS, multivariate tests, and analysis approaches based on multiple comparisons of the individual items. We propose two novel approaches which measure disease status based on Item Response Theory (IRT) models. We assess the performance of these tests in an extensive simulation study and illustrate their use with a re-analysis of the ABBV-8E12 clinical trial. Furthermore, we discuss the impact of the FDA-recommended scoring of item scores on the power of the statistical tests. We find that classical approaches such as the PSPRS sum score demonstrate moderate to high power when treatment effects are consistent across the individual items. The tests based on IRT models yield the highest power when the simulated data are generated from an IRT model. The multiple-testing-based approaches have higher power in settings where the treatment effect is limited to certain domains or items. The FDA-recommended item rescoring tends to decrease the simulated power. The study shows that there is no one-size-fits-all testing procedure for evaluating treatment effects using PSPRS items; the optimal method varies with the specific effect size pattern. The efficiency of the PSPRS sum score, while generally robust and straightforward to apply, depends on the effect size pattern encountered, and more powerful alternatives are available in specific settings. These findings can have important implications for the design of future clinical trials in PSP.

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2310.05622 [pdf, other]

A neutral comparison of statistical methods for time-to-event analyses under non-proportional hazards

Authors: Florian Klinglmüller, Tobias Fellinger, Franz König, Tim Friede, Andrew C. Hooker, Harald Heinzl, Martina Mittlböck, Jonas Brugger, Maximilian Bardo, Cynthia Huber, Norbert Benda, Martin Posch, Robin Ristl

Abstract: While well-established methods for time-to-event data are available when the proportional hazards assumption holds, there is no consensus on the best inferential approach under non-proportional hazards (NPH). However, a wide range of parametric and non-parametric methods for testing and estimation in this scenario have been proposed. To provide recommendations on the statistical analysis of clinical trials where non-proportional hazards are expected, we conducted a comprehensive simulation study under different scenarios of non-proportional hazards, including delayed onset of treatment effect, crossing hazard curves, subgroups with different treatment effects, and changing hazards after disease progression. We assessed type I error rate control, power, and confidence interval coverage, where applicable, for a wide range of methods including weighted log-rank tests, the MaxCombo test, summary measures such as the restricted mean survival time (RMST), average hazard ratios, and milestone survival probabilities, as well as accelerated failure time regression models. We found a trade-off between interpretability and power when choosing an analysis strategy under NPH scenarios. While analysis methods based on weighted log-rank tests were typically favorable in terms of power, they do not provide an easily interpretable treatment effect estimate. Also, depending on the weight function, they test a narrow null hypothesis of equal hazard functions, and rejection of this null hypothesis may not allow for a direct conclusion of treatment benefit in terms of the survival function. In contrast, non-parametric procedures based on readily interpretable measures such as the RMST difference had lower power in most scenarios. Model-based methods relying on specific survival distributions had larger power, but often gave biased estimates and lower than nominal confidence interval coverage.

Submitted 9 October, 2023; originally announced October 2023.
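
The restricted mean survival time (RMST) named in the abstract above is the area under the survival curve up to a truncation time. The following is a minimal sketch of that idea, not the authors' simulation code: it computes a Kaplan-Meier estimate and integrates the resulting step function. The function names `kaplan_meier` and `rmst` and the toy data in the usage note are assumptions made for illustration.

```python
from itertools import groupby


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (Kaplan-Meier estimate).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    for t, group in groupby(data, key=lambda pair: pair[0]):
        group = list(group)
        deaths = sum(e for _, e in group)
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Events and censorings at time t both leave the risk set afterwards.
        at_risk -= len(group)
    return curve


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle under the previous step
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)
```

For example, `rmst([1, 2, 3, 4], [1, 0, 1, 0], tau=4)` integrates a curve that drops to 0.75 at time 1 and to 0.375 at time 3. An RMST difference between two arms would be the difference of two such values computed with the same `tau`.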

arXiv:2310.02080 [pdf, other]

Design Considerations for a Phase II platform trial in Major Depressive Disorder

Authors: Michaela Maria Freitag, Dario Zocholl, Elias Laurin Meyer, Stefan M. Gold, Marta Bofill Roig, Heidi De Smedt, Martin Posch, Franz König

Abstract: Major Depressive Disorder (MDD) is one of the most common causes of disability worldwide. Unfortunately, about one-third of patients do not benefit sufficiently from available treatments, and not many new drugs have been developed in this area in recent years. We thus need better and faster ways to evaluate many different treatment options. Platform trials are a possible remedy: they facilitate the evaluation of more investigational treatments in a shorter period of time by sharing controls, as well as by reducing clinical trial activation and recruitment times. We discuss design considerations for a platform trial in MDD, taking into account the unique disease characteristics, and present the results of extensive simulations to investigate the operating characteristics under various realistic scenarios. To allow the testing of more treatments, interim futility analyses should be performed to eliminate treatments that have either no or negligible treatment effect. Furthermore, we investigate different randomisation and allocation strategies as well as the impact of the per-treatment arm sample size. We compare the operating characteristics of such platform trials to those of traditional randomised controlled trials and highlight the potential advantages of platform trials.

Submitted 3 October, 2023; originally announced October 2023.

MSC Class: 62K99

arXiv:2310.01990 [pdf, other]

Simultaneous inference procedures for the comparison of multiple characteristics of two survival functions

Authors: Robin Ristl, Heiko Götte, Armin Schüler, Martin Posch, Franz König

Abstract: Survival time is the primary endpoint of many randomized controlled trials, and a treatment effect is typically quantified by the hazard ratio under the assumption of proportional hazards. Awareness is increasing that in many settings this assumption is a priori violated, e.g. due to delayed onset of drug effect. In these cases, interpretation of the hazard ratio estimate is ambiguous and statistical inference for alternative parameters to quantify a treatment effect is warranted. We consider differences or ratios of milestone survival probabilities or quantiles, differences in restricted mean survival times, and an average hazard ratio to be of interest. Typically, more than one such parameter needs to be reported to assess possible treatment benefits, and in confirmatory trials the corresponding inferential procedures need to be adjusted for multiplicity. By using the counting process representation of the mentioned parameters, we show that their estimates are asymptotically multivariate normal, and we propose corresponding parametric multiple testing procedures and simultaneous confidence intervals. Also, the log-rank test may be included in the framework. Finite sample type I error rate and power are studied by simulation. The methods are illustrated with an example from oncology. A software implementation is provided in the R package nph.

Submitted 3 October, 2023; originally announced October 2023.

arXiv:2306.16858 [pdf, other]

Methods for non-proportional hazards in clinical trials: A systematic review

Authors: Maximilian Bardo, Cynthia Huber, Norbert Benda, Jonas Brugger, Tobias Fellinger, Vaidotas Galaune, Judith Heinz, Harald Heinzl, Andrew C. Hooker, Florian Klinglmüller, Franz König, Tim Mathes, Martina Mittlböck, Martin Posch, Robin Ristl, Tim Friede

Abstract: For the analysis of time-to-event data, frequently used methods such as the log-rank test or the Cox proportional hazards model are based on the proportional hazards assumption, which is often debatable. Although a wide range of parametric and non-parametric methods for non-proportional hazards (NPH) has been proposed, there is no consensus on the best approaches. To close this gap, we conducted a systematic literature search to identify statistical methods and software appropriate under NPH. Our literature search identified 907 abstracts, out of which we included 211 articles, mostly methodological ones. Review articles and applications were less frequently identified. The articles discuss effect measures, effect estimation and regression approaches, hypothesis tests, and sample size calculation approaches, which are often tailored to specific NPH situations. Using a unified notation, we provide an overview of the methods available. Furthermore, we derive some guidance from the identified articles. We summarize the contents of the literature review concisely in the main text and provide more detailed explanations in the supplement.

Submitted 29 January, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

MSC Class: 62Nxx; 62Pxx

arXiv:2304.03035 [pdf, other]

doi 10.1177/09622802241239008

Optimal allocation strategies in platform trials

Authors: Marta Bofill Roig, Ekkehard Glimm, Tobias Mielke, Martin Posch

Abstract: Platform trials are randomized clinical trials that allow simultaneous comparison of multiple interventions, usually against a common control. Arms to test experimental interventions may enter and leave the platform over time. This implies that the number of experimental intervention arms in the trial may change over time. Determining optimal rates for allocating patients to the treatment and control arms in platform trials is challenging because the optimal allocation rates change whenever treatments enter or leave the platform. In addition, the optimal allocation depends on the analysis strategy used. In this paper, we derive optimal treatment allocation rates for platform trials with shared controls, assuming that a stratified estimation and testing procedure based on a regression model is used to adjust for time trends. We consider both analyses using concurrent controls only and analyses that also use non-concurrent controls, and we assume that the total sample size is fixed. The objective function to be minimized is the maximum of the variances of the effect estimators. We show that the optimal solution depends on the entry time of the arms in the trial and, in general, does not correspond to the square root of $k$ allocation rule used in classical multi-arm trials. We illustrate the optimal allocation and evaluate the power and type 1 error rate compared to trials using one-to-one and square root of $k$ allocations by means of a case study.

Submitted 6 April, 2023; originally announced April 2023.

Journal ref: Statistical Methods in Medical Research, 2024
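
For reference, the classical square root of $k$ rule mentioned in the abstract above allocates patients to the shared control and to each of the $k$ experimental arms in the ratio $\sqrt{k} : 1$. The sketch below only illustrates this classical fixed multi-arm rule under the assumption of equal outcome variances; it is not the optimal platform-trial allocation derived in the paper, and the function names are hypothetical.

```python
import math


def sqrt_k_allocation(k):
    """Allocation fractions under the classical square-root-of-k rule.

    Returns (control_fraction, per_treatment_fraction) for k experimental arms
    sharing one control, i.e. a control : treatment ratio of sqrt(k) : 1.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    total = math.sqrt(k) + k  # sqrt(k) parts for control, 1 part per treatment
    return math.sqrt(k) / total, 1.0 / total


def comparison_variance(control_fraction, treatment_fraction, n_total, sigma=1.0):
    """Variance of one treatment-vs-control mean difference (equal variances assumed)."""
    return sigma**2 * (1.0 / (control_fraction * n_total)
                       + 1.0 / (treatment_fraction * n_total))
```

With `k = 4`, the rule assigns one third of the patients to control and one sixth to each experimental arm. Under these assumptions, `comparison_variance` is smaller for that split than for an equal split of one fifth per arm.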

arXiv:2302.12634 [pdf, other]

doi 10.1016/j.softx.2023.101437

NCC: An R-package for analysis and simulation of platform trials with non-concurrent controls

Authors: Pavla Krotka, Katharina Hees, Peter Jacko, Dominic Magirr, Martin Posch, Marta Bofill Roig

Abstract: Platform trials evaluate the efficacy of multiple treatments, allowing for late entry of the experimental arms and enabling efficiency gains by sharing controls. The power of individual treatment-control comparisons in such trials can be improved by utilizing non-concurrent controls (NCC) in the analysis. We present the R-package NCC for the design and analysis of platform trials using non-concurrent controls. NCC allows for simulating platform trials and evaluating the properties of analysis methods that make use of non-concurrent controls in a variety of settings. We describe the main NCC functions and show how to use the package to simulate and analyse platform trials by means of specific examples.

Submitted 5 June, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

arXiv:2211.09547 [pdf, other]

doi 10.1186/s13063-023-07398-7

On the use of non-concurrent controls in platform trials: A scoping review

Authors: Marta Bofill Roig, Cora Burgwinkel, Ursula Garczarek, Franz Koenig, Martin Posch, Quynh Nguyen, Katharina Hees

Abstract: Platform trials have gained popularity during the last few years as they increase flexibility compared to multi-arm trials by allowing new experimental arms to enter after the trial has started. Using a shared control group in platform trials increases the trial efficiency compared to separate trials. Because of the later entry of some of the experimental treatment arms, the shared control group includes concurrent and non-concurrent control data. For a given experimental arm, non-concurrent controls refer to patients allocated to the control arm before the arm enters the trial, while concurrent controls refer to control patients who are randomised concurrently to the experimental arm. Using non-concurrent controls can result in biased estimates in the presence of time trends if the appropriate methodology is not used and the assumptions are not met. In this paper, we pursue two main objectives. First, we aimed to identify the methods currently available for incorporating non-concurrent controls, clarify the key concepts and assumptions, and name the main characteristics of each method. For this purpose, we systematically searched research articles on methods to include non-concurrent controls. Second, we aimed to summarise the current regulatory view on non-concurrent controls to clarify the key concepts and current guidance. To this end, we conducted a systematic search of regulatory guidelines on the use of non-concurrent controls and summarised the most relevant arguments and recommended methods. Finally, we discuss the advantages and potential caveats of using non-concurrent controls.

Submitted 17 November, 2022; originally announced November 2022.

Journal ref: Trials 2023

arXiv:2209.09173 [pdf]

doi 10.1080/19466315.2021.1938204

Use of Non-concurrent Common Control in Master Protocols in Oncology Trials: Report of an American Statistical Association Biopharmaceutical Section Open Forum Discussion

Authors: Rajeshwari Sridhara, Olga Marchenko, Qi Jiang, Richard Pazdur, Martin Posch, Scott Berry, Marc Theoret, Yuan Li Shen, Thomas Gwise, Lorenzo Hess, Andrew Raven, Khadija Rantell, Kit Roes, Richard Simon, Mary Redman, Yuan Ji, Cindy Lu

Abstract: This article summarizes the discussions from the American Statistical Association (ASA) Biopharmaceutical (BIOP) Section Open Forum that took place on December 10, 2020 and was organized by the ASA BIOP Statistical Methods in Oncology Scientific Working Group, in coordination with the US FDA Oncology Center of Excellence. Diverse stakeholders including experts from international regulatory agencies, academicians, and representatives of the pharmaceutical industry engaged in a discussion on the use of non-concurrent control in Master Protocols for oncology trials. While the use of non-concurrent control with the concurrent control may increase the power of detecting the therapeutic difference between a treatment and the control, the panelists had diverse opinions on the statistical approaches for modeling non-concurrent and concurrent controls. Some were more concerned about the temporality of the non-concurrent control and bias introduced by different confounders related to time, e.g., changes in standard of care, changes in patient population, changes in recruiting strategies, and changes in assessment of endpoints. Nevertheless, in some situations, such as when recruitment is extremely challenging for a rare disease, the panelists concluded that the use of a non-concurrent control can be justified.

Submitted 19 September, 2022; originally announced September 2022.

MSC Class: 62P10

Journal ref: Statistics in Biopharmaceutical Research 14.3 (2022): 353-357

arXiv:2209.07776 [pdf]

doi 10.1002/pst.2120

The use of external controls: To what extent can it currently be recommended?

Authors: Hans Ulrich Burger, Christoph Gerlinger, Chris Harbron, Armin Koch, Martin Posch, Justine Rochon, Anja Schiel

Abstract: With more and better clinical data being captured outside of clinical studies and greater data sharing of clinical studies, external controls may become a more attractive alternative to randomized clinical trials. Both industry and regulators recognize that in situations where a randomized study cannot be performed, external controls can provide the needed contextualization to allow a better interpretation of studies without a randomized control. It is also agreed that external controls will not fully replace randomized clinical trials as the gold standard for formal proof of efficacy in drug development and the yardstick of clinical research. However, it remains unclear in which situations conclusions about efficacy and a positive benefit/risk can reliably be based on the use of an external control. This paper will provide an overview on types of external control, their applications and the different sources of bias their use may incur, and discuss potential mitigation steps. It will also give recommendations on how the use of external controls can be justified.

Submitted 16 September, 2022; originally announced September 2022.

MSC Class: 62P10

Journal ref: Pharmaceutical Statistics, 20(6), 1002-1016 (2021)

arXiv:2206.09639 [pdf, other]

doi 10.1093/biostatistics/kxac040

Adaptive clinical trial designs with blinded selection of binary composite endpoints and sample size reassessment

Authors: Marta Bofill Roig, Guadalupe Gómez Melis, Martin Posch, Franz Koenig

Abstract: For randomized clinical trials where a single, primary, binary endpoint would require unfeasibly large sample sizes, composite endpoints are widely chosen as the primary endpoint. Despite being commonly used, composite endpoints entail challenges in designing and interpreting results. Given that the components may be of different relevance and have different effect sizes, the choice of components must be made carefully. In particular, sample size calculations for composite binary endpoints depend not only on the anticipated effect sizes and event probabilities of the composite components, but also on the correlation between them. However, information on the correlation between endpoints is usually not reported in the literature, which can be an obstacle to planning sound future trial designs. We consider two-arm randomized controlled trials with a primary composite binary endpoint and an endpoint that consists only of the clinically more important component of the composite endpoint. We propose a trial design that allows an adaptive modification of the primary endpoint based on blinded information obtained at an interim analysis. We consider a decision rule to select between a composite endpoint and its most relevant component as primary endpoint. The decision rule chooses the endpoint with the lower estimated required sample size. Additionally, the sample size is reassessed using the estimated event probabilities and correlation, and the expected effect sizes of the composite components. We investigate the statistical power and significance level under the proposed design through simulations. We show that the adaptive design is equally or more powerful than designs without adaptive modification of the primary endpoint. The targeted power is achieved even if the correlation is misspecified, while the type 1 error rate is maintained. We illustrate the proposal by means of two case studies.

Submitted 20 June, 2022; originally announced June 2022.

Journal ref: Biostatistics (2022)

arXiv:2202.03838 [pdf, other]

Online error control for platform trials

Authors: David S. Robertson, James M. S. Wason, Franz König, Martin Posch, Thomas Jaki

Abstract: Platform trials evaluate multiple experimental treatments under a single master protocol, where new treatment arms are added to the trial over time. Given the multiple treatment comparisons, there is the potential for inflation of the overall type I error rate, which is complicated by the fact that the hypotheses are tested at different times and are not all necessarily pre-specified. Online error control methodology provides a possible solution to the problem of multiplicity for platform trials where a relatively large number of hypotheses are expected to be tested over time. In the online testing framework, hypotheses are tested in a sequential manner, where at each time-step an analyst decides whether to reject the current null hypothesis without knowledge of future tests but based solely on past decisions. Methodology has recently been developed for online control of the false discovery rate as well as the familywise error rate (FWER). In this paper, we describe how to apply online error control to the platform trial setting, present extensive simulation results, and give some recommendations for the use of this new methodology in practice. We show that the algorithms for online error rate control can have a substantially lower FWER than uncorrected testing, while still achieving noticeable gains in power when compared with the use of a Bonferroni procedure. We also illustrate how online error control would have impacted a currently ongoing platform trial.

Submitted 8 February, 2022; originally announced February 2022.

Comments: 26 pages, 13 figures

MSC Class: 62L10

arXiv:2112.10619 [pdf, other]

Online control of the False Discovery Rate in group-sequential platform trials

Authors: Sonja Zehetmayer, Martin Posch, Franz Koenig

Abstract: When testing multiple hypotheses, a suitable error rate should be controlled even in exploratory trials. Conventional methods to control the False Discovery Rate (FDR) assume that all p-values are available at the time point of test decision. In platform trials, however, treatment arms enter and leave the trial at any time during its conduct. Therefore, the number of treatments and hypothesis tests is not fixed in advance and hypotheses are not tested at once, but sequentially. Recently, for such a setting the concept of online control of the FDR was introduced. We investigate the LOND procedure to control the online FDR in platform trials and propose an extension to allow for interim analyses with the option of early stopping for efficacy or futility for individual hypotheses. The power depends sensitively on the prior distribution of effect sizes, e.g., whether true alternatives are uniformly distributed over time or not. We consider the choice of design parameters for the LOND procedure to maximize the overall power and compare the O'Brien-Fleming group-sequential design with the Pocock approach. Finally, we investigate the impact on error rates by including both concurrent and non-concurrent control data.

Submitted 20 December, 2021; originally announced December 2021.

Comments: 17 pages, 7 figures, 3 tables

arXiv:2112.06574 [pdf, other]

doi 10.1186/s12874-022-01683-w

On model-based time trend adjustments in platform trials with non-concurrent controls

Authors: Marta Bofill Roig, Pavla Krotka, Carl-Fredrik Burman, Ekkehard Glimm, Stefan M. Gold, Katharina Hees, Peter Jacko, Franz Koenig, Dominic Magirr, Peter Mesenbrink, Kert Viele, Martin Posch

Abstract: Platform trials can evaluate the efficacy of several treatments compared to a control. The number of treatments is not fixed, as arms may be added or removed as the trial progresses. Platform trials are more efficient than independent parallel-group trials because of using shared control groups. For arms entering the trial later, not all patients in the control group are randomised concurrently. The control group is then divided into concurrent and non-concurrent controls. Using non-concurrent controls (NCC) can improve the trial's efficiency, but can introduce bias due to time trends. We focus on a platform trial with two treatment arms and a common control arm. Assuming that the second treatment arm is added later, we assess the robustness of model-based approaches to adjust for time trends when using NCC. We consider approaches where time trends are modeled as linear or as a step function, with steps at times where arms enter or leave the trial. For trials with continuous or binary outcomes, we investigate the type 1 error (t1e) rate and power of testing the efficacy of the newly added arm under a range of scenarios. In addition to scenarios where time trends are equal across arms, we investigate settings with trends that are different or not additive in the model scale. A step function model fitted on data from all arms gives increased power while controlling the t1e, as long as the time trends are equal for the different arms and additive on the model scale.
This holds even if the trend's shape deviates from a step function if block randomisation is used. But if trends differ between arms or are not additive on the model scale, t1e control may be lost. The efficiency gained by using step function models to incorporate NCC can outweigh potential biases. However, the specifics of the trial, plausibility of different time trends, and robustness of results should be considered.

Submitted 26 July, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

Journal ref: BMC Medical Research Methodology (2022)

arXiv:2006.01474 [pdf, other]

Conformal prediction intervals for the individual treatment effect

Authors: Danijel Kivaranovic, Robin Ristl, Martin Posch, Hannes Leeb

Abstract: We propose several prediction interval procedures for the individual treatment effect with either finite-sample or asymptotic coverage guarantee in a non-parametric regression setting, where non-linear regression functions, heteroskedasticity and non-Gaussianity are allowed. To construct the prediction intervals we use the conformal method of Vovk et al. (2005). In extensive simulations, we compare the coverage probability and interval length of our prediction interval procedures. We demonstrate that complex learning algorithms, such as neural networks, can lead to narrower prediction intervals than simple algorithms, such as linear regression, if the sample size is large enough.

Submitted 2 June, 2020; originally announced June 2020.

Comments: 32 pages, 2 figures

arXiv:1811.09824 [pdf, other]

A multiple comparison procedure for dose-finding trials with subpopulations

Authors: Marius Thomas, Björn Bornkamp, Martin Posch, Franz König

Abstract: Identifying subgroups of patients with an enhanced response to a new treatment has become an area of increased interest in the last few years. When there is knowledge about possible subpopulations with an enhanced treatment effect before the start of a trial it might be beneficial to set up a testing strategy, which tests for a significant treatment effect not only in the full population, but also in these prespecified subpopulations. In this paper we present a parametric multiple testing approach for tests in multiple populations for dose-finding trials. Our approach is based on the MCP-Mod methodology, which uses multiple comparison procedures to test for a dose-response signal, while considering multiple possible candidate dose-response shapes. Our proposed methods allow for heteroscedasticity between populations and control the FWER over tests in multiple populations and for multiple candidate models. We show in simulations, that the proposed multi-population testing approaches can increase the power to detect a significant dose-response signal over the standard single-population MCP-Mod, when the considered subpopulation has an enhanced treatment effect.

Submitted 24 November, 2018; originally announced November 2018.

arXiv:1811.02504 [pdf]

doi 10.1186/s13023-018-0919-y

Recent advances in methodology for clinical trials in small populations: the InSPiRe project

Authors: T. Friede, M. Posch, S. Zohar, C. Alberti, N. Benda, E. Comets, S. Day, A. Dmitrenko, A. Graf, B. K. Günhan, S. W. Hee, F. Lentz, J. Madan, F. Miller, T. Ondra, M. Pearce, C. Röver, A. Tournazi, S. Unkel, M. Ursino, G. Wassmer, N. Stallard

Abstract: Where there are a limited number of patients, such as in a rare disease, clinical trials in these small populations present several challenges, including statistical issues. This led to an EU FP7 call for proposals in 2013. One of the three projects funded was the Innovative Methodology for Small Populations Research (InSPiRe) project. This paper summarizes the main results of the project, which was completed in 2017. The InSPiRe project has led to development of novel statistical methodology for clinical trials in small populations in four areas. We have explored new decision-making methods for small population clinical trials using a Bayesian decision-theoretic framework to compare costs with potential benefits, developed approaches for targeted treatment trials, enabling simultaneous identification of subgroups and confirmation of treatment effect for these patients, worked on early phase clinical trial design and on extrapolation from adult to pediatric studies, developing methods to enable use of pharmacokinetics and pharmacodynamics data, and also developed improved robust meta-analysis methods for a small number of trials to support the planning, analysis and interpretation of a trial as well as enabling extrapolation between patient groups. In addition to scientific publications, we have contributed to regulatory guidance and produced free software in order to facilitate implementation of the novel methods.

Submitted 30 October, 2018; originally announced November 2018.

Comments: 9 pages, 3 figures

Journal ref: Orphanet Journal of Rare Diseases, 13:136, 2018

arXiv:1612.07561 [pdf, other]

doi 10.1016/j.csda.2018.01.001

Optimal exact tests for multiple binary endpoints

Authors: Robin Ristl, Dong Xi, Ekkehard Glimm, Martin Posch

Abstract: In confirmatory clinical trials with small sample sizes, hypothesis tests based on asymptotic distributions are often not valid and exact non-parametric procedures are applied instead. However, the latter are based on discrete test statistics and can become very conservative, even more so, if adjustments for multiple testing as the Bonferroni correction are applied. We propose improved exact multiple testing procedures for the setting where two parallel groups are compared in multiple binary endpoints. Based on the joint conditional distribution of test statistics of Fisher's exact tests, optimal rejection regions for intersection hypotheses tests are constructed. To efficiently search the large space of possible rejection regions, we propose an optimization algorithm based on constrained optimization and integer linear programming. Depending on the optimization objective, the optimal test yields maximal power under a specific alternative, maximal exhaustion of the nominal type I error rate, or the largest possible rejection region controlling the type I error rate. Applying the closed testing principle, we construct optimized multiple testing procedures with strong familywise error rate control. Furthermore, we propose a greedy algorithm for nearly optimal tests, which is computationally more efficient. We numerically compare the unconditional power of the optimized procedure with alternative approaches and illustrate the optimal tests with a clinical trial example in a rare disease.

Submitted 22 December, 2016; originally announced December 2016.

Journal ref: Computational Statistics and Data Analysis 122 (2018) 1-17

arXiv:1606.03987 [pdf, other]

doi 10.1371/journal.pone.0163726

Optimizing Trial Designs for Targeted Therapies

Authors: Thomas Ondra, Sebastian Jobjörnsson, Robert A. Beckman, Carl-Fredrik Burman, Franz König, Nigel Stallard, Martin Posch

Abstract: An important objective in the development of targeted therapies is to identify the populations where the treatment under consideration has positive benefit risk balance. We consider pivotal clinical trials, where the efficacy of a treatment is tested in an overall population and/or in a pre-specified subpopulation. Based on a decision theoretic framework we derive optimized trial designs by maximizing utility functions. Features to be optimized include the sample size and the population in which the trial is performed (the full population or the targeted subgroup only) as well as the underlying multiple test procedure. The approach accounts for prior knowledge of the efficacy of the drug in the considered populations using a two dimensional prior distribution. The considered utility functions account for the costs of the clinical trial as well as the expected benefit when demonstrating efficacy in the different subpopulations. We model utility functions from a sponsor's as well as from a public health perspective, reflecting actual civil interests. Examples of optimized trial designs obtained by numerical optimization are presented for both perspectives.

Submitted 13 June, 2016; originally announced June 2016.

arXiv:1602.07207 [pdf]

doi 10.1186/s13023-016-0402-6

Systematic reviews in paediatric multiple sclerosis and Creutzfeldt-Jakob disease exemplify shortcomings in methods used to evaluate therapies in rare conditions

Authors: Steffen Unkel, Christian Röver, Nigel Stallard, Norbert Benda, Martin Posch, Sarah Zohar, Tim Friede

Abstract: BACKGROUND: Randomized controlled trials (RCTs) are the gold standard design of clinical research to assess interventions. However, RCTs cannot always be applied for practical or ethical reasons. To investigate the current practices in rare diseases, we review evaluations of therapeutic interventions in paediatric multiple sclerosis (MS) and Creutzfeldt-Jakob disease (CJD). In particular, we shed light on the endpoints used, the study designs implemented and the statistical methodologies applied. METHODS: We conducted literature searches to identify relevant primary studies. Data on study design, objectives, endpoints, patient characteristics, randomization and masking, type of intervention, control, withdrawals and statistical methodology were extracted from the selected studies. The risk of bias and the quality of the studies were assessed. RESULTS: Twelve (seven) primary studies on paediatric MS (CJD) were included in the qualitative synthesis. No double-blind, randomized placebo-controlled trial for evaluating interventions in paediatric MS has been published yet. Evidence from one open-label RCT is available. The observational studies are before-after studies or controlled studies. Three of the seven selected studies on CJD are RCTs, of which two received the maximum mark on the Oxford Quality Scale. Four trials are controlled observational studies. CONCLUSIONS: Evidence from double-blind RCTs on the efficacy of treatments appears to be variable between rare diseases.
With regard to paediatric conditions it remains to be seen what impact regulators will have through e.g., paediatric investigation plans. Overall, there is space for improvement by using innovative trial designs and data analysis techniques.

Submitted 21 February, 2016; originally announced February 2016.

Comments: 11 pages, 2 figures, 3 tables

Journal ref: Orphanet Journal of Rare Diseases, 11:16, 2016

arXiv:1405.1569 [pdf, other]

Adaptive Survival Trials

Authors: Dominic Magirr, Thomas Jaki, Franz Koenig, Martin Posch

Abstract: Mid-study design modifications are becoming increasingly accepted in confirmatory clinical trials, so long as appropriate methods are applied such that error rates are controlled. It is therefore unfortunate that the important case of time-to-event endpoints is not easily handled by the standard theory. We analyze current methods that allow design modifications to be based on the full interim data, i.e., not only the observed event times but also secondary endpoint and safety data from patients who are yet to have an event. We show that the final test statistic may ignore a substantial subset of the observed event times. Since it is the data corresponding to the earliest recruited patients that is ignored, this neglect becomes egregious when there is specific interest in learning about long-term survival. An alternative test incorporating all event times is proposed, where a conservative assumption is made in order to guarantee type I error control. We examine the properties of our proposed approach using the example of a clinical trial comparing two cancer therapies.

Submitted 7 May, 2014; originally announced May 2014.

Comments: 22 pages, 7 figures

MSC Class: 62N03

Showing 1–24 of 24 results for author: Posch, M