mSigSDK -- private, at scale, computation of mutation signatures
Authors:
Aaron Ge,
Yasmmin Côrtes Martins,
Tongwu Zhang,
Kailing Chen,
Maria Teresa Landi,
Brian Park,
Jeya Balasubramanian,
Jonas S Almeida
Abstract:
In our previous work, we demonstrated that it is feasible to perform analysis on mutation signature data without the need for downloads or installations and analyze individual patient data at scale without compromising privacy. Building on this foundation, we developed a Software Development Kit (SDK) called mSigSDK to facilitate the orchestration of distributed data processing workflows and graphic visualization of mutational signature analysis results. We strictly adhered to modern web computing standards, particularly the modularization standards set by the ECMAScript ES6 framework (JavaScript modules). Our approach allows for computation to be entirely performed by secure delegation to the computational resources of the user's own machine (in-browser), without any downloads or installations. The mSigSDK was developed primarily as a companion library to the mSig Portal resource of the National Cancer Institute Division of Cancer Epidemiology and Genetics (NIH/NCI/DCEG), with a focus on its FAIR extensibility as components of other researchers' computational constructs. Anticipated extensions include the programmatic operation of other mutation signature API ecosystems such as SIGNAL and COSMIC, advancing towards a data commons for mutational signature research (Grossman et al., 2016).
Submitted 19 January, 2024; v1 submitted 5 August, 2023;
originally announced August 2023.
A FAIR platform for reproducing mutational signature detection on tumor sequencing data
Authors:
Aaron Ge,
Tongwu Zhang,
Clara Bodelon,
Montserrat Garcia-Closas,
Jonas Almeida,
Jeya Balasubramanian
Abstract:
This paper presents a portable, privacy-preserving, in-browser platform for the reproducible assessment of mutational signature detection methods from sparse sequencing data generated by targeted gene panels. The platform aims to address the reproducibility challenges in mutational signature research by adhering to the FAIR principles, making it findable, accessible, interoperable, and reusable. Our approach focuses on the detection of specific mutational signatures, such as SBS3, which have been linked to specific mutagenic processes. The platform relies on publicly available data, simulation, downsampling techniques, and machine learning algorithms to generate training data and labels and to train and evaluate models. The key achievement of our platform is its transparency, reusability, and privacy preservation, enabling researchers and clinicians to analyze mutational signatures with the guarantee that no data circulates outside the client machine.
Submitted 2 June, 2023;
originally announced June 2023.