-
More Software Analytics Patterns: Broad-Spectrum Diagnostic and Embedded Improvements
Authors:
Duarte Oliveira,
João Fidalgo,
Joelma Choma,
Eduardo Guerra,
Filipe Correia
Abstract:
Software analytics is a data-driven approach to decision making, which allows software practitioners to leverage valuable insights from data about software to achieve higher development process productivity and improve different aspects of software quality. In previous work, a set of patterns for adopting a lean software analytics process was identified through a literature review. This paper presents two patterns to add to the original set, forming a pattern language for adopting software analytics practices that aims to inform the decision-making activities of software practitioners. The writing of these two patterns was informed by the solutions employed in two case studies on software analytics practices, and the patterns were further validated by searching for their occurrence in the literature. The pattern Broad-Spectrum Diagnostic proposes conducting a broader analysis based on common metrics when the team lacks the expertise to understand the kinds of problems that software analytics can help solve; the pattern Embedded Improvements suggests adding improvement tasks as part of other routine activities.
Submitted 10 January, 2022;
originally announced January 2022.
-
An investigation of over-training within semi-supervised machine learning models in the search for heavy resonances at the LHC
Authors:
Benjamin Lieberman,
Joshua Choma,
Salah-Eddine Dahbi,
Bruce Mellado,
Xifeng Ruan
Abstract:
In particle physics, semi-supervised machine learning is an attractive option to reduce model dependencies in searches beyond the Standard Model. When utilizing semi-supervised techniques in training machine learning models in the search for bosons at the Large Hadron Collider, the over-training of the model must be investigated. Internal fluctuations of the phase space and bias in training can cause semi-supervised models to label false signals within the phase space due to over-fitting. The issue of false signal generation in semi-supervised models has not been fully analyzed; therefore, the probability of such situations occurring must be quantified using a toy Monte Carlo model. This investigation of $Z\gamma$ resonances is performed using a pure background Monte Carlo sample. Unique pure background samples are extracted to mimic ATLAS data in a background-plus-signal region, and multiple runs over these samples enable the probability of fake signals occurring due to over-training to be thoroughly investigated.
Submitted 15 September, 2021;
originally announced September 2021.
-
Machine learning approach for the search of resonances with topological features at the Large Hadron Collider
Authors:
Salah-eddine Dahbi,
Joshua Choma,
Bruce Mellado,
Gaogalalwe Mokgatitswane,
Xifeng Ruan,
Benjamin Lieberman,
Turgay Celik
Abstract:
The observation of resonances is unequivocal evidence of new physics beyond the Standard Model at the Large Hadron Collider (LHC). So far, inclusive and model-dependent searches have not provided evidence of new resonances, indicating that these could be driven by subtle topologies. Here, we use machine learning techniques based on weak supervision to perform searches. Weak supervision based on mixed samples can be used to search for resonances with little or no prior knowledge of the production mechanism. It also offers the advantage that sidebands or control regions can be used to effectively model backgrounds with minimal reliance on simulations. However, weak supervision alone is found to be highly inefficient in identifying corners of the multi-dimensional space of interest. Instead, we propose an approach to search for new resonances that involves a classification procedure that is signature and topology based. A combination of weak supervision with Deep Neural Network algorithms is applied following this classification. The performance of this strategy is evaluated on the production of the SM Higgs boson decaying to a pair of photons, inclusively and in exclusive regions of phase space tailored for specific production modes at the LHC. After verifying the ability of the methodology to extract different SM Higgs boson signal mechanisms, a search for new phenomena in high-mass final states is set up for the LHC.
Submitted 27 October, 2021; v1 submitted 19 November, 2020;
originally announced November 2020.