-
Smart Pixels: In-pixel AI for on-sensor data filtering
Authors:
Benjamin Parpillon,
Chinar Syal,
Jieun Yoo,
Jennet Dickinson,
Morris Swartz,
Giuseppe Di Guglielmo,
Alice Bean,
Douglas Berry,
Manuel Blanco Valentin,
Karri DiPetrillo,
Anthony Badea,
Lindsey Gray,
Petar Maksimovic,
Corrinne Mills,
Mark S. Neubauer,
Gauri Pradhan,
Nhan Tran,
Dahai Wen,
Farah Fahim
Abstract:
We present a smart pixel prototype readout integrated circuit (ROIC) designed in CMOS 28 nm bulk process, with in-pixel implementation of an artificial intelligence (AI) / machine learning (ML) based data filtering algorithm designed as proof-of-principle for a Phase III upgrade at the Large Hadron Collider (LHC) pixel detector. The first version of the ROIC consists of two matrices of 256 smart pixels, each 25$\times$25 $μ$m$^2$ in size. Each pixel consists of a charge-sensitive preamplifier with leakage current compensation and three auto-zero comparators for a 2-bit flash-type ADC. The frontend is capable of synchronously digitizing the sensor charge within 25 ns. Measurement results show an equivalent noise charge (ENC) of $\sim$30e$^-$ and a total dispersion of $\sim$100e$^-$. The second version of the ROIC uses a fully connected two-layer neural network (NN) to process information from a cluster of 256 pixels to determine if the pattern corresponds to highly desirable high-momentum particle tracks for selection and readout. The digital NN is embedded in-between analog signal processing regions of the 256 pixels without increasing the pixel size and is implemented as fully combinatorial digital logic to minimize power consumption and eliminate clock distribution, and is active only in the presence of an input signal. The total power consumption of the neural network is $\sim$ 300 $μ$W. The NN performs momentum classification based on the generated cluster patterns and, even with a modest momentum threshold, is capable of 54.4\% - 75.4\% total data rejection, opening the possibility of using the pixel information at 40 MHz for the trigger. The total power consumption of analog and digital functions is $\sim$ 6 $μ$W per pixel, which corresponds to $\sim$ 1 W/cm$^2$, staying within the experimental constraints.
Submitted 21 June, 2024;
originally announced June 2024.
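The power density quoted in the abstract above follows from the per-pixel power and the pixel area. A minimal sketch of that arithmetic, using only the two figures given in the abstract (about 6 μW per pixel and a 25 μm pixel pitch); the function name is hypothetical and chosen only for illustration:

```python
def power_density_w_per_cm2(power_per_pixel_w: float, pitch_um: float) -> float:
    """Convert per-pixel power (W) and square pixel pitch (um) to W/cm^2."""
    pitch_cm = pitch_um * 1e-4          # 1 um = 1e-4 cm
    pixel_area_cm2 = pitch_cm ** 2      # square pixel, as stated in the abstract
    return power_per_pixel_w / pixel_area_cm2


# Figures taken from the abstract: ~6 uW per pixel, 25 um x 25 um pixels.
density = power_density_w_per_cm2(6e-6, 25.0)
print(f"{density:.2f} W/cm^2")
```

With these inputs the result is just under 1 W/cm^2, consistent with the approximate value stated in the abstract.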
-
A primer on synthetic health data
Authors:
Jennifer A Bartell,
Sander Boisen Valentin,
Anders Krogh,
Henning Langberg,
Martin Bøgsted
Abstract:
Recent advances in deep generative models have greatly expanded the potential to create realistic synthetic health datasets. These synthetic datasets aim to preserve the characteristics, patterns, and overall scientific conclusions derived from sensitive health datasets without disclosing patient identity or sensitive information. Thus, synthetic data can facilitate safe data sharing that supports a range of initiatives including the development of new predictive models, advanced health IT platforms, and general project ideation and hypothesis development. However, many questions and challenges remain, including how to consistently evaluate a synthetic dataset's similarity and predictive utility in comparison to the original real dataset and risk to privacy when shared. Additional regulatory and governance issues have not been widely addressed. In this primer, we map the state of synthetic health data, including generation and evaluation methods and tools, existing examples of deployment, the regulatory and ethical landscape, access and governance options, and opportunities for further development.
Submitted 3 July, 2024; v1 submitted 31 January, 2024;
originally announced January 2024.
-
Smartpixels: Towards on-sensor inference of charged particle track parameters and uncertainties
Authors:
Jennet Dickinson,
Rachel Kovach-Fuentes,
Lindsey Gray,
Morris Swartz,
Giuseppe Di Guglielmo,
Alice Bean,
Doug Berry,
Manuel Blanco Valentin,
Karri DiPetrillo,
Farah Fahim,
James Hirschauer,
Shruti R. Kulkarni,
Ron Lipton,
Petar Maksimovic,
Corrinne Mills,
Mark S. Neubauer,
Benjamin Parpillon,
Gauri Pradhan,
Chinar Syal,
Nhan Tran,
Dahai Wen,
Jieun Yoo,
Aaron Young
Abstract:
The combinatorics of track seeding has long been a computational bottleneck for triggering and offline computing in High Energy Physics (HEP), and remains so for the HL-LHC. Next-generation pixel sensors will be sufficiently fine-grained to determine angular information of the charged particle passing through from pixel-cluster properties. This detector technology immediately improves the situation for offline tracking, but any major improvements in physics reach are unrealized since they are dominated by lowest-level hardware trigger acceptance. We will demonstrate track angle and hit position prediction, including errors, using a mixture density network within a single layer of silicon as well as the progress towards and status of implementing the neural network in hardware on both FPGAs and ASICs.
Submitted 18 December, 2023;
originally announced December 2023.
-
Smart pixel sensors: towards on-sensor filtering of pixel clusters with deep learning
Authors:
Jieun Yoo,
Jennet Dickinson,
Morris Swartz,
Giuseppe Di Guglielmo,
Alice Bean,
Douglas Berry,
Manuel Blanco Valentin,
Karri DiPetrillo,
Farah Fahim,
Lindsey Gray,
James Hirschauer,
Shruti R. Kulkarni,
Ron Lipton,
Petar Maksimovic,
Corrinne Mills,
Mark S. Neubauer,
Benjamin Parpillon,
Gauri Pradhan,
Chinar Syal,
Nhan Tran,
Dahai Wen,
Aaron Young
Abstract:
Highly granular pixel detectors allow for increasingly precise measurements of charged particle tracks. Next-generation detectors require that pixel sizes will be further reduced, leading to unprecedented data rates exceeding those foreseen at the High Luminosity Large Hadron Collider. Signal processing that handles data incoming at a rate of O(40MHz) and intelligently reduces the data within the pixelated region of the detector at rate will enhance physics performance at high luminosity and enable physics analyses that are not currently possible. Using the shape of charge clusters deposited in an array of small pixels, the physical properties of the traversing particle can be extracted with locally customized neural networks. In this first demonstration, we present a neural network that can be embedded into the on-sensor readout and filter out hits from low momentum tracks, reducing the detector's data volume by 54.4-75.4%. The network is designed and simulated as a custom readout integrated circuit with 28 nm CMOS technology and is expected to operate at less than 300 $μW$ with an area of less than 0.2 mm$^2$. The temporal development of charge clusters is investigated to demonstrate possible future performance gains, and there is also a discussion of future algorithmic and technological improvements that could enhance efficiency, data reduction, and power per area.
Submitted 3 October, 2023;
originally announced October 2023.
-
A Cryogenic Readout IC with 100 KSPS in-Pixel ADC for Skipper CCD-in-CMOS Sensors
Authors:
Adam Quinn,
Manuel B. Valentin,
Thomas Zimmerman,
Davide Braga,
Seda Memik,
Farah Fahim
Abstract:
The Skipper CCD-in-CMOS Parallel Read-Out Circuit (SPROCKET) is a mixed-signal front-end design for the readout of Skipper CCD-in-CMOS image sensors. SPROCKET is fabricated in a 65 nm CMOS process and each pixel occupies a 50$μ$m $\times$ 50$μ$m footprint. SPROCKET is intended to be heterogeneously integrated with a Skipper-in-CMOS sensor array, such that one readout pixel is connected to a multiplexed array of nine Skipper-in-CMOS pixels to enable massively parallel readout. The front-end includes a variable gain preamplifier, a correlated double sampling circuit, and a 10-bit serial successive approximation register (SAR) ADC. The circuit achieves a sample rate of 100 ksps with 0.48 $\mathrm{e^-_{rms}}$ equivalent noise at the input to the ADC. SPROCKET achieves a maximum dynamic range of 9,000 $e^-$ at the lowest gain setting (or 900 $e^-$ at the lowest noise setting). The circuit operates at 100 Kelvin with a power consumption of 40 $μW$ per pixel. A SPROCKET test chip was submitted in September 2022, and test results will be presented at the conference.
Submitted 7 February, 2023;
originally announced February 2023.
-
The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments
Authors:
Nailia Mirzakhmedova,
Johannes Kiesel,
Milad Alshomary,
Maximilian Heinrich,
Nicolas Handke,
Xiaoni Cai,
Barriere Valentin,
Doratossadat Dastgheib,
Omid Ghahroodi,
Mohammad Ali Sadraei,
Ehsaneddin Asgari,
Lea Kawaletz,
Henning Wachsmuth,
Benno Stein
Abstract:
We present the Touché23-ValueEval Dataset for Identifying Human Values behind Arguments. To investigate approaches for the automated detection of human values behind arguments, we collected 9324 arguments from 6 diverse sources, covering religious texts, political discussions, free-text arguments, newspaper editorials, and online democracy platforms. Each argument was annotated by 3 crowdworkers for 54 values. The Touché23-ValueEval dataset extends the Webis-ArgValues-22. In comparison to the previous dataset, the effectiveness of a 1-Baseline decreases, but that of an out-of-the-box BERT model increases. Therefore, though the classification difficulty increased as per the label distribution, the larger dataset allows for training better models.
Submitted 31 January, 2023;
originally announced January 2023.
-
Emergence of dynamic contractile patterns in slime mold confined in a ring geometry
Authors:
Busson Valentin,
Saiseau Raphaël,
Durand Marc
Abstract:
Coordination of cytoplasmic flows on large scales in space and time is at the root of many cellular processes, including growth, migration or division. These flows are driven by organized contractions of the actomyosin cortex. In order to elucidate the basic mechanisms at work in the self-organization of contractile activity, we investigate the dynamic patterns of cortex contraction in true slime mold \textit{Physarum polycephalum} confined in ring-shaped chambers of controlled geometrical dimensions. We make an exhaustive inventory of the different stable contractile patterns in the absence of migration and growth. We show that the primary frequency of the oscillations is independent of the ring perimeter, while the wavelength scales linearly with it. We discuss the consistency of these results with the existing models, shedding light on the possible feedback mechanisms leading to coordinated contractile activity.
Submitted 27 July, 2022;
originally announced July 2022.
-
Developing a Victorious Strategy to the Second Strong Gravitational Lensing Data Challenge
Authors:
C. R. Bom,
B. M. O. Fraga,
L. O. Dias,
P. Schubert,
M. Blanco Valentin,
C. Furlanetto,
M. Makler,
K. Teles,
M. Portes de Albuquerque,
R. Benton Metcalf
Abstract:
Strong Lensing is a powerful probe of the matter distribution in galaxies and clusters and a relevant tool for cosmography. Analyses of strong gravitational lenses with Deep Learning have become a popular approach due to these astronomical objects' rarity and image complexity. Next-generation surveys will provide more opportunities to derive science from these objects and an increasing data volume to be analyzed. However, finding strong lenses is challenging, as their number densities are orders of magnitude below those of galaxies. Therefore, specific Strong Lensing search algorithms are required to discover the highest number of systems possible with high purity and low false alarm rate. The need for better algorithms has prompted the development of an open community data science competition named Strong Gravitational Lensing Challenge (SGLC). This work presents the Deep Learning strategies and methodology used to design the highest-scoring algorithm in the II SGLC. We discuss the approach used for this dataset, the choice for a suitable architecture, particularly the use of a network with two branches to work with images in different resolutions, and its optimization. We also discuss the detectability limit, the lessons learned, and prospects for defining a tailor-made architecture in a survey in contrast to a general one. Finally, we release the models and discuss the best choice to easily adapt the model to a dataset representing a survey with a different instrument. This work helps to take a step towards efficient, adaptable and accurate analyses of strong lenses with deep learning frameworks.
Submitted 17 March, 2022;
originally announced March 2022.
-
How does a Pre-Trained Transformer Integrate Contextual Keywords? Application to Humanitarian Computing
Authors:
Barriere Valentin,
Jacquet Guillaume
Abstract:
In a classification task, dealing with text snippets and metadata usually requires dealing with multimodal approaches. When those metadata are textual, it is tempting to use them intrinsically with a pre-trained transformer, in order to leverage the semantic information encoded inside the model. This paper describes how to improve a humanitarian classification task by adding the crisis event type to each tweet to be classified. Based on additional experiments of the model weights and behavior, it identifies how the proposed neural network approach is partially over-fitting the particularities of the Crisis Benchmark, to better highlight how the model is still undoubtedly learning to use and take advantage of the metadata's textual semantics.
Submitted 7 November, 2021;
originally announced November 2021.
-
A reconfigurable neural network ASIC for detector front-end data compression at the HL-LHC
Authors:
Giuseppe Di Guglielmo,
Farah Fahim,
Christian Herwig,
Manuel Blanco Valentin,
Javier Duarte,
Cristian Gingu,
Philip Harris,
James Hirschauer,
Martin Kwok,
Vladimir Loncar,
Yingyi Luo,
Llovizna Miranda,
Jennifer Ngadiuba,
Daniel Noonan,
Seda Ogrenci-Memik,
Maurizio Pierini,
Sioni Summers,
Nhan Tran
Abstract:
Despite advances in the programmable logic capabilities of modern trigger systems, a significant bottleneck remains in the amount of data to be transported from the detector to off-detector logic where trigger decisions are made. We demonstrate that a neural network autoencoder model can be implemented in a radiation tolerant ASIC to perform lossy data compression alleviating the data transmission problem while preserving critical information of the detector energy profile. For our application, we consider the high-granularity calorimeter from the CMS experiment at the CERN Large Hadron Collider. The advantage of the machine learning approach is in the flexibility and configurability of the algorithm. By changing the neural network weights, a unique data compression algorithm can be deployed for each sensor in different detector regions, and changing detector or collider conditions. To meet area, performance, and power constraints, we perform a quantization-aware training to create an optimized neural network hardware implementation. The design is achieved through the use of high-level synthesis tools and the hls4ml framework, and was processed through synthesis and physical layout flows based on a LP CMOS 65 nm technology node. The flow anticipates 200 Mrad of ionizing radiation to select gates, and reports a total area of 3.6 mm^2 and consumes 95 mW of power. The simulated energy consumption per inference is 2.4 nJ. This is the first radiation tolerant on-detector ASIC implementation of a neural network that has been designed for particle physics applications.
Submitted 4 May, 2021;
originally announced May 2021.
-
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices
Authors:
Farah Fahim,
Benjamin Hawks,
Christian Herwig,
James Hirschauer,
Sergo Jindariani,
Nhan Tran,
Luca P. Carloni,
Giuseppe Di Guglielmo,
Philip Harris,
Jeffrey Krupa,
Dylan Rankin,
Manuel Blanco Valentin,
Josiah Hester,
Yingyi Luo,
John Mamish,
Seda Ogrenci-Memik,
Thea Aarrestad,
Hamza Javed,
Vladimir Loncar,
Maurizio Pierini,
Adrian Alan Pol,
Sioni Summers,
Javier Duarte,
Scott Hauck,
Shih-Chieh Hsu
, et al. (5 additional authors not shown)
Abstract:
Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends, including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.
Submitted 23 March, 2021; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Estimation of Basic Reproduction Number of the COVID-19 Epidemic in Denmark using a Two-Step Model
Authors:
Jan Brink Valentin
Abstract:
Objective: To conduct an early estimation of the Basic Reproduction Number (BRN) induced by government interference, and to project resulting day to day number of in-patients, ICU-patients and cumulative number of deaths in a Danish setting.
Method: We used the Kermack and McKendrick model with varying basic reproduction number to estimate the number infected, and age-stratified percentages to estimate the number of in-patients, ICU-patients and cumulative number of deaths. Changes in the basic reproduction number were estimated based on current in-patient numbers.
Results: The basic reproduction number in the time period of February 27th to March 18th was found to be 2.65; however, this number was reduced to 1.99 after March 18th.
Keywords: COVID-19, basic reproduction number, Danish population
Submitted 25 March, 2020; v1 submitted 21 March, 2020;
originally announced March 2020.
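The two-step reproduction number described in the abstract above can be illustrated with a minimal Kermack-McKendrick (SIR) sketch. Only the values 2.65 and 1.99 and the switch after March 18th come from the abstract; the population size, initial number infected, infectious period, and forward-Euler integration are assumptions made purely for illustration and are not the paper's fitted model.

```python
def two_step_r0(day: int, switch_day: int = 20,
                r0_before: float = 2.65, r0_after: float = 1.99) -> float:
    """R0 as a step function of the day index (day 0 = February 27th).

    switch_day = 20 places the change around March 18th, 2020 (assumed indexing).
    """
    return r0_before if day < switch_day else r0_after


def simulate_sir(r0_of_day, days, population, initial_infected,
                 infectious_days, steps_per_day=10):
    """Integrate the SIR equations with a time-varying R0 (forward Euler)."""
    gamma = 1.0 / infectious_days        # recovery rate (assumed value)
    dt = 1.0 / steps_per_day
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for day in range(days):
        beta = r0_of_day(day) * gamma    # transmission rate implied by R0
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical inputs: 5.8 million people, 10 initial cases, 5-day infectious period.
history = simulate_sir(two_step_r0, days=60, population=5_800_000,
                       initial_infected=10, infectious_days=5.0)
print(f"infected on day 20: {history[20][1]:.0f}")
```

In this sketch, lowering R0 at the switch day slows the growth of the infected compartment but does not reverse it, since 1.99 is still above 1.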
-
Patch redundancy in images: a statistical testing framework and some applications
Authors:
De Bortoli Valentin,
Desolneux Agnès,
Galerne Bruno,
Leclaire Arthur
Abstract:
In this work we introduce a statistical framework in order to analyze the spatial redundancy in natural images. This notion of spatial redundancy must be defined locally and thus we give some examples of functions (auto-similarity and template similarity) which, given one or two images, compute a similarity measurement between patches. Two patches are said to be similar if the similarity measurement is small enough. To derive a criterion for deciding on the similarity between two patches we present an a contrario model. Namely, two patches are said to be similar if the associated similarity measurement is unlikely to happen in a background model. Choosing Gaussian random fields as background models we derive non-asymptotic expressions for the probability distribution function of similarity measurements. We introduce a fast algorithm in order to assess redundancy in natural images and present applications in denoising, periodicity analysis and texture ranking.
Submitted 12 April, 2019;
originally announced April 2019.
-
Macrocanonical Models for Texture Synthesis
Authors:
De Bortoli Valentin,
Desolneux Agnès,
Galerne Bruno,
Leclaire Arthur
Abstract:
In this article we consider macrocanonical models for texture synthesis. In these models samples are generated given an input texture image and a set of features which should be matched in expectation. It is known that if the images are quantized, macrocanonical models are given by Gibbs measures, using the maximum entropy principle. We study conditions under which this result extends to real-valued images. If these conditions hold, finding a macrocanonical model amounts to minimizing a convex function and sampling from an associated Gibbs measure. We analyze an algorithm which alternates between sampling and minimizing. We present experiments with neural network features and study the drawbacks and advantages of using this sampling scheme.
Submitted 12 April, 2019;
originally announced April 2019.
-
The Strong Gravitational Lens Finding Challenge
Authors:
R. Benton Metcalf,
M. Meneghetti,
Camille Avestruz,
Fabio Bellagamba,
Clécio R. Bom,
Emmanuel Bertin,
Rémi Cabanac,
F. Courbin,
Andrew Davies,
Etienne Decencière,
Rémi Flamary,
Raphael Gavazzi,
Mario Geiger,
Philippa Hartley,
Marc Huertas-Company,
Neal Jackson,
Eric Jullo,
Jean-Paul Kneib,
Léon V. E. Koopmans,
François Lanusse,
Chun-Liang Li,
Quanbin Ma,
Martin Makler,
Nan Li,
Matthew Lightman
, et al. (11 additional authors not shown)
Abstract:
Large scale imaging surveys will increase the number of galaxy-scale strong lensing candidates by maybe three orders of magnitude beyond the number known today. Finding these rare objects will require picking them out of at least tens of millions of images, and deriving scientific results from them will require quantifying the efficiency and bias of any search method. To achieve these objectives automated methods must be developed. Because gravitational lenses are rare objects, reducing false positives will be particularly important. We present a description and results of an open gravitational lens finding challenge. Participants were asked to classify 100,000 candidate objects as to whether they were gravitational lenses or not with the goal of developing better automated methods for finding lenses in large data sets. A variety of methods were used including visual inspection, arc and ring finders, support vector machines (SVM) and convolutional neural networks (CNN). We find that many of the methods will be easily fast enough to analyse the anticipated data flow. In test data, several methods are able to identify upwards of half the lenses after applying some thresholds on the lens characteristics such as lensed image brightness, size or contrast with the lens galaxy without making a single false-positive identification. This is significantly better than direct inspection by humans was able to do. (abridged)
Submitted 20 March, 2019; v1 submitted 10 February, 2018;
originally announced February 2018.
-
On a method for Rock Classification using Textural Features and Genetic Optimization
Authors:
Manuel Blanco Valentin,
Clecio Roque De Bom,
Marcio Portes de Albuquerque,
Marcelo Portes de Albuquerque,
Elisangela Faria,
Maury Duarte Correia,
Rodrigo Surmas
Abstract:
In this work we present a method to classify a set of rock textures based on a Spectral Analysis and the extraction of the texture Features of the resulting images. Up to 520 features were tested using 4 different filters and all 31 different combinations were verified. The classification process relies on a Naive Bayes classifier. We performed two kinds of optimizations: statistical optimization with covariance-based Principal Component Analysis (PCA) and a genetic optimization, for 10,000 randomly defined samples, achieving a final maximum classification success of 91% against the original 70% success ratio (without any optimization or filters used). After the optimization 9 types of features emerged as most relevant.
Submitted 17 August, 2017; v1 submitted 6 July, 2016;
originally announced July 2016.