-
Integration Of Evolutionary Automated Machine Learning With Structural Sensitivity Analysis For Composite Pipelines
Authors:
Nikolay O. Nikitin,
Maiia Pinchuk,
Valerii Pokrovskii,
Peter Shevchenko,
Andrey Getmanov,
Yaroslav Aksenkin,
Ilia Revin,
Andrey Stebenkov,
Ekaterina Poslavskaya,
Anna V. Kalyuzhnaya
Abstract:
Automated machine learning (AutoML) systems propose an end-to-end solution to a given machine learning problem, creating either fixed or flexible pipelines. Fixed pipelines are task-independent constructs: their general composition remains the same, regardless of the data. In contrast, the structure of flexible pipelines varies depending on the input, making them finely tailored to individual tasks. However, flexible pipelines can be structurally overcomplicated and have poor explainability. We propose the EVOSA approach, which compensates for the negative points of flexible pipelines by incorporating a sensitivity analysis that increases the robustness and interpretability of the flexible solutions. EVOSA quantitatively estimates the positive and negative impact of an edge or a node on a pipeline graph, and feeds this information to the evolutionary AutoML optimizer. The correctness and efficiency of EVOSA were validated in tabular, multimodal and computer vision tasks, suggesting generalizability of the proposed approach across domains.
Submitted 22 December, 2023;
originally announced December 2023.
-
Generative Design of Physical Objects using Modular Framework
Authors:
Nikita O. Starodubcev,
Nikolay O. Nikitin,
Konstantin G. Gavaza,
Elizaveta A. Andronova,
Denis O. Sidorenko,
Anna V. Kalyuzhnaya
Abstract:
In recent years, generative design techniques have become firmly established in numerous applied fields, especially in engineering. These methods are growing rapidly owing to their promising outlook. However, existing approaches are limited by the specificity of the problem under consideration. In addition, they do not provide the desired flexibility. In this paper, we formulate a general approach to an arbitrary generative design problem and propose a novel framework called GEFEST (Generative Evolution For Encoded STructure) on its basis. The developed approach is based on three general principles: sampling, estimation and optimization. This ensures the freedom to adjust the method for a particular generative design problem and therefore makes it possible to construct the most suitable one. A series of experimental studies was conducted to confirm the effectiveness of the GEFEST framework. It involved synthetic and real-world cases (coastal engineering, microfluidics, thermodynamics and oil field planning). The flexible structure of GEFEST makes it possible to obtain results that surpass baseline solutions.
Submitted 29 July, 2022;
originally announced July 2022.
-
Surrogate-Assisted Evolutionary Generative Design Of Breakwaters Using Deep Convolutional Networks
Authors:
Nikita O. Starodubcev,
Nikolay O. Nikitin,
Anna V. Kalyuzhnaya
Abstract:
In the paper, a multi-objective evolutionary surrogate-assisted approach for the fast and effective generative design of coastal breakwaters is proposed. To approximate the computationally expensive objective functions, the deep convolutional neural network is used as a surrogate model. This model allows optimizing a configuration of breakwaters with a different number of structures and segments. In addition to the surrogate, an assistant model was developed to estimate the confidence of predictions. The proposed approach was tested on the synthetic water area, the SWAN model was used to calculate the wave heights. The experimental results confirm that the proposed approach allows obtaining more effective (less expensive with better protective properties) solutions than non-surrogate approaches for the same time.
Submitted 7 April, 2022;
originally announced April 2022.
-
Automated Evolutionary Approach for the Design of Composite Machine Learning Pipelines
Authors:
Nikolay O. Nikitin,
Pavel Vychuzhanin,
Mikhail Sarafanov,
Iana S. Polonskaia,
Ilia Revin,
Irina V. Barabanova,
Gleb Maximov,
Anna V. Kalyuzhnaya,
Alexander Boukhanovsky
Abstract:
The effectiveness of machine learning methods for real-world tasks depends on the proper structure of the modeling pipeline. The proposed approach aims to automate the design of composite machine learning pipelines, which are equivalent to computation workflows that consist of models and data operations. The approach combines key ideas of both automated machine learning and workflow management systems. It designs the pipelines with a customizable graph-based structure, analyzes the obtained results, and reproduces them. The evolutionary approach is used for the flexible identification of pipeline structure. Additional algorithms for sensitivity analysis, atomization, and hyperparameter tuning are implemented to improve the effectiveness of the approach. Also, the software implementation of this approach is presented as an open-source framework. A set of experiments is conducted for different datasets and tasks (classification, regression, time series forecasting). The obtained results confirm the correctness and effectiveness of the proposed approach in comparison with state-of-the-art competitors and baseline solutions.
Submitted 26 June, 2021;
originally announced June 2021.
-
MIxBN: library for learning Bayesian networks from mixed data
Authors:
Anna V. Bubnova,
Irina Deeva,
Anna V. Kalyuzhnaya
Abstract:
This paper describes a new library for learning Bayesian networks from data containing discrete and continuous variables (mixed data). In addition to the classical learning methods on discretized data, this library proposes its own algorithm that allows structural learning and parameter learning from mixed data without discretization, since data discretization leads to information loss. This algorithm is based on a mixed MI score function for structural learning, and on linear regression and Gaussian distribution approximation for parameter learning. The library also offers two algorithms for enumerating graph structures: the greedy Hill-Climbing algorithm and the evolutionary algorithm. Thus the key capabilities of the proposed library are as follows: (1) structural and parameter learning of a Bayesian network on discretized data, (2) structural and parameter learning of a Bayesian network on mixed data using the mixed MI score function and Gaussian approximation, (3) running the learning algorithms with one of two algorithms for enumerating graph structures, Hill-Climbing or the evolutionary algorithm. Since the need for mixed data representation comes from practical necessity, the advantages of our implementations are evaluated in the context of solving approximation and gap recovery problems on synthetic data and real datasets.
Submitted 24 June, 2021;
originally announced June 2021.
-
Hybrid and Automated Machine Learning Approaches for Oil Fields Development: the Case Study of Volve Field, North Sea
Authors:
Nikolay O. Nikitin,
Ilia Revin,
Alexander Hvatov,
Pavel Vychuzhanin,
Anna V. Kalyuzhnaya
Abstract:
The paper describes the usage of intelligent approaches for field development tasks that may assist a decision-making process. We focused on the problem of wells location optimization and two tasks within it: improving the quality of oil production estimation and estimation of reservoir characteristics for appropriate wells allocation and parametrization, using machine learning methods. For oil production estimation, we implemented and investigated the quality of forecasting models: physics-based, pure data-driven, and hybrid one. The CRMIP model was chosen as a physics-based approach. We compare it with the machine learning and hybrid methods in a frame of oil production forecasting task. In the investigation of reservoir characteristics for wells location choice, we automated the seismic analysis using evolutionary identification of convolutional neural network for the reservoir detection. The Volve oil field dataset was used as a case study to conduct the experiments. The implemented approaches can be used to analyze different oil fields or adapted to similar physics-related problems.
Submitted 24 February, 2022; v1 submitted 3 March, 2021;
originally announced March 2021.
-
Oil and Gas Reservoirs Parameters Analysis Using Mixed Learning of Bayesian Networks
Authors:
Irina Deeva,
Anna Bubnova,
Petr Andriushchenko,
Anton Voskresenskiy,
Nikita Bukhanov,
Nikolay O. Nikitin,
Anna V. Kalyuzhnaya
Abstract:
In this paper, a multipurpose Bayesian-based method for data analysis, causal inference and prediction in the sphere of oil and gas reservoir development is considered. This allows analysing parameters of a reservoir, discovering dependencies among parameters (including cause-and-effect relations), checking for anomalies, predicting expected values of missing parameters, looking for the closest analogues, and much more. The method is based on the extended algorithm MixLearn@BN for structural learning of Bayesian networks. The key ideas of MixLearn@BN are as follows: (1) learning the network structure on homogeneous data subsets, (2) assigning a part of the structure by an expert, and (3) learning the distribution parameters on mixed data (discrete and continuous). Homogeneous data subsets are identified as various groups of reservoirs with similar features (analogues), where the similarity measure may be based on several types of distances. The aim of the described technique of Bayesian network learning is to improve the quality of predictions and causal inference on such networks. Experimental studies prove that the suggested method gives a significant advantage in missing-value prediction and anomaly detection accuracy. Moreover, the method was applied to a database of more than a thousand petroleum reservoirs across the globe and made it possible to discover novel insights into relationships between geological parameters.
Submitted 2 March, 2021;
originally announced March 2021.
-
Multi-Objective Evolutionary Design of Composite Data-Driven Models
Authors:
Iana S. Polonskaia,
Nikolay O. Nikitin,
Ilia Revin,
Pavel Vychuzhanin,
Anna V. Kalyuzhnaya
Abstract:
In this paper, a multi-objective approach for the design of composite data-driven mathematical models is proposed. It allows automating the identification of graph-based heterogeneous pipelines that consist of different blocks: machine learning models, data preprocessing blocks, etc. The implemented approach is based on a parameter-free genetic algorithm (GA) for model design called GPComp@Free. It is developed to be part of automated machine learning solutions and to increase the efficiency of the modeling pipeline automation. A set of experiments was conducted to verify the correctness and efficiency of the proposed approach and substantiate the selected solutions. The experimental results confirm that a multi-objective approach to the model design allows achieving better diversity and quality of obtained models. The implemented approach is available as a part of the open-source AutoML framework FEDOT.
Submitted 17 May, 2021; v1 submitted 1 March, 2021;
originally announced March 2021.
-
Automated data-driven approach for gap filling in the time series using evolutionary learning
Authors:
Mikhail Sarafanov,
Nikolay O. Nikitin,
Anna V. Kalyuzhnaya
Abstract:
In the paper, we propose an adaptive data-driven model-based approach for filling the gaps in time series. The approach is based on the automated evolutionary identification of the optimal structure for a composite data-driven model. It allows adapting the model for effective gap-filling in a specific dataset without the involvement of a data scientist. As a case study, both synthetic and real datasets from different fields (environmental, economic, etc.) are used. The experiments confirm that the proposed approach achieves a higher quality of gap restoration and improves the effectiveness of forecasting models.
Submitted 20 July, 2021; v1 submitted 1 March, 2021;
originally announced March 2021.
-
The multi-objective optimisation of breakwaters using evolutionary approach
Authors:
Nikolay O. Nikitin,
Iana S. Polonskaia,
Anna V. Kalyuzhnaya,
Alexander V. Boukhanovsky
Abstract:
In engineering practice, it is often necessary to increase the effectiveness of existing protective constructions for ports and coasts (i.e. breakwaters) by extending their configuration, because existing configurations don't provide the appropriate environmental conditions. That extension task can be considered as an optimisation problem. In the paper, a multi-objective evolutionary approach for breakwater optimisation is proposed. Also, a greedy heuristic is implemented and included in the algorithm, which allows achieving an appropriate solution faster. The task of identifying the optimal variant of attached breakwaters that provides safe ship parking and manoeuvring in the large Black Sea Port of Sochi has been used as a case study. The results of the experiments demonstrated the possibility of applying the proposed multi-objective evolutionary approach to real-world engineering problems. It allows identifying the Pareto-optimal set of possible configurations, which can be analysed by decision makers and used for final construction.
Submitted 8 September, 2021; v1 submitted 6 April, 2020;
originally announced April 2020.
-
REBEC: Robust Evolutionary-based Calibration Approach for the Numerical Wind Wave Model
Authors:
Pavel Vychuzhanin,
Nikolay O. Nikitin,
Anna V. Kalyuzhnaya
Abstract:
The adaptation of numerical wind wave models to the local time-spatial conditions is a problem that can be solved by using various calibration techniques. However, the obtained sets of physical parameters become over-tuned to specific events if there is a lack of observations. In this paper, we propose a robust evolutionary calibration approach that allows to build the stochastic ensemble of perturbed models and use it to achieve the trade-off between quality and robustness of the target model. The implemented robust ensemble-based evolutionary calibration (REBEC) approach was compared to the baseline SPEA2 algorithm in a set of experiments with the SWAN wind wave model configuration for the Kara Sea domain. Provided metrics for the set of scenarios confirm the effectiveness of the REBEC approach for the majority of calibration scenarios.
Submitted 19 March, 2019;
originally announced June 2019.
-
Adaptation of NEMO-LIM3 model for multigrid high resolution Arctic simulation
Authors:
Alexander Hvatov,
Nikolay O. Nikitin,
Anna V. Kalyuzhnaya,
Sergey S. Kosukhin
Abstract:
High-resolution regional hindcasting of ocean and sea ice plays an important role in the assessment of shipping and operational risks in the Arctic Ocean. The ice-ocean model NEMO-LIM3 was modified to improve its simulation quality for appropriate spatio-temporal resolutions. A multigrid model setup with connected coarse- (14 km) and fine-resolution (5 km) model configurations was devised. These two configurations were implemented and run separately. The resulting computational cost was lower when compared to that of the built-in AGRIF nesting system. Ice and tracer boundary-condition schemes were modified to achieve the correct interaction between coarse and fine grids through a long ice-covered open boundary. An ice-restoring scheme was implemented to reduce spin-up time. The NEMO-LIM3 configuration described in this article provides more flexible and customisable tools for high-resolution regional Arctic simulations.
Submitted 5 February, 2019; v1 submitted 8 October, 2018;
originally announced October 2018.
-
A Conceptual Approach to Complex Model Management with Generalized Modelling Patterns and Evolutionary Identification
Authors:
Sergey V. Kovalchuk,
Oleg G. Metsker,
Anastasia A. Funkner,
Ilia O. Kisliakovskii,
Nikolay O. Nikitin,
Anna V. Kalyuzhnaya,
Danila A. Vaganov,
Klavdiya O. Bochenina
Abstract:
Complex systems' modeling and simulation are powerful ways to investigate a multitude of natural phenomena, providing extended knowledge on their structure and behavior. However, enhanced modeling and simulation require integration of various data and knowledge sources, models of various kinds (data-driven models, numerical models, simulation models, etc.), and intelligent components in one composite solution. The growing complexity of such a composite model leads to the need for specific approaches to its management. This need extends to cases where the model itself becomes a complex system. One of the important aspects of complex model management is dealing with uncertainty of various kinds (context, parametric, structural, input/output) to control the model. In situations where the system being modeled or the modeling requirements change over time, specific methods and tools are needed to perform modeling and application procedures (meta-modeling operations) in an automatic manner. To support automatic building and management of complex models, we propose a general evolutionary computation approach which enables managing complexity and uncertainty of various kinds. The approach is based on an evolutionary investigation of the model phase space to identify the best model structure and parameters. Examples from different areas (healthcare, hydrometeorology, social network analysis) were elaborated with the proposed approach and solutions.
Submitted 12 September, 2018;
originally announced September 2018.