Search | arXiv e-print repository

A One class Classifier based Framework using SVDD : Application to an Imbalanced Geological Dataset

Authors: Soumi Chaki, Akhilesh Kumar Verma, Aurobinda Routray, William K. Mohanty, Mamata Jenamani

Abstract: Evaluation of hydrocarbon reservoir requires classification of petrophysical properties from available dataset. However, characterization of reservoir attributes is difficult due to the nonlinear and heterogeneous nature of the subsurface physical properties. In this context, present study proposes a generalized one class classification framework based on Support Vector Data Description (SVDD) to classify a reservoir characteristic water saturation into two classes (Class high and Class low) from four logs namely gamma ray, neutron porosity, bulk density, and P sonic using an imbalanced dataset. A comparison is carried out among proposed framework and different supervised classification algorithms in terms of g metric means and execution time. Experimental results show that proposed framework has outperformed other classifiers in terms of these performance evaluators. It is envisaged that the classification analysis performed in this study will be useful in further reservoir modeling.

Submitted 2 December, 2016; originally announced December 2016.

Comments: Presented at IEEE Students Technology Symposium (TechSym), 28 February to 2 March 2014, IIT Kharagpur, India. 6 pages, 7 figures, 2 tables

arXiv:1612.00841 [pdf]

A Novel Framework based on SVDD to Classify Water Saturation from Seismic Attributes

Authors: Soumi Chaki, Akhilesh Kumar Verma, Aurobinda Routray, William K. Mohanty, Mamata Jenamani

Abstract: Water saturation is an important property in reservoir engineering domain. Thus, satisfactory classification of water saturation from seismic attributes is beneficial for reservoir characterization. However, diverse and non-linear nature of subsurface attributes makes the classification task difficult. In this context, this paper proposes a generalized Support Vector Data Description (SVDD) based novel classification framework to classify water saturation into two classes (Class high and Class low) from three seismic attributes seismic impedance, amplitude envelop, and seismic sweetness. G-metric means and program execution time are used to quantify the performance of the proposed framework along with established supervised classifiers. The documented results imply that the proposed framework is superior to existing classifiers. The present study is envisioned to contribute in further reservoir modeling.

Submitted 2 December, 2016; originally announced December 2016.

Comments: 6 pages, 8 figures, 2 tables. Presented at Fourth International Conference on Emerging Applications of Information Technology (EAIT 2014), ISI Kolkata, India

arXiv:1612.00840 [pdf]

A novel multiclassSVM based framework to classify lithology from well logs: a real-world application

Authors: Soumi Chaki, Aurobinda Routray, William K. Mohanty, Mamata Jenamani

Abstract: Support vector machines (SVMs) have been recognized as a potential tool for supervised classification analyses in different domains of research. In essence, SVM is a binary classifier. Therefore, in case of a multiclass problem, the problem is divided into a series of binary problems which are solved by binary classifiers, and finally the classification results are combined following either the one-against-one or one-against-all strategies. In this paper, an attempt has been made to classify lithology using a multiclass SVM based framework using well logs as predictor variables. Here, the lithology is classified into four classes such as sand, shaly sand, sandy shale and shale based on the relative values of sand and shale fractions as suggested by an expert geologist. The available dataset consisting well logs (gamma ray, neutron porosity, density, and P-sonic) and class information from four closely spaced wells from an onshore hydrocarbon field is divided into training and testing sets. We have used one-against-all strategy to combine the results of multiple binary classifiers. The reported results established the superiority of multiclass SVM compared to other classifiers in terms of classification accuracy. The selection of kernel function and associated parameters has also been investigated here. It can be envisaged from the results achieved in this study that the proposed framework based on multiclass SVM can further be used to solve classification problems. In future research endeavor, seismic attributes can be introduced in the framework to classify the lithology throughout a study area from seismic inputs.

Submitted 2 December, 2016; originally announced December 2016.

Comments: 5 pages, 5 figures, 4 tables. Presented at INDICON 2015 at New Delhi, India

arXiv:1612.00585 [pdf]

Development of a hybrid learning system based on SVM, ANFIS and domain knowledge: DKFIS

Authors: Soumi Chaki, Aurobinda Routray, William K. Mohanty, Mamata Jenamani

Abstract: This paper presents the development of a hybrid learning system based on Support Vector Machines (SVM), Adaptive Neuro-Fuzzy Inference System (ANFIS) and domain knowledge to solve prediction problem. The proposed two-stage Domain Knowledge based Fuzzy Information System (DKFIS) improves the prediction accuracy attained by ANFIS alone. The proposed framework has been implemented on a noisy and incomplete dataset acquired from a hydrocarbon field located at western part of India. Here, oil saturation has been predicted from four different well logs i.e. gamma ray, resistivity, density, and clay volume. In the first stage, depending on zero or near zero and non-zero oil saturation levels the input vector is classified into two classes (Class 0 and Class 1) using SVM. The classification results have been further fine-tuned applying expert knowledge based on the relationship among predictor variables i.e. well logs and target variable - oil saturation. Second, an ANFIS is designed to predict non-zero (Class 1) oil saturation values from predictor logs. The predicted output has been further refined based on expert knowledge. It is apparent from the experimental results that the expert intervention with qualitative judgment at each stage has rendered the prediction into the feasible and realistic ranges. The performance analysis of the prediction in terms of four performance metrics such as correlation coefficient (CC), root mean square error (RMSE), and absolute error mean (AEM), scatter index (SI) has established DKFIS as a useful tool for reservoir characterization.

Submitted 2 December, 2016; originally announced December 2016.

Comments: 6 pages, 5 figures, 3 tables. Presented at INDICON 2015

arXiv:1605.00693 [pdf, other]

The Generalized Degrees of Freedom Region of the MIMO Z-Interference Channel with Delayed CSIT

Authors: Kaniska Mohanty, Mahesh K. Varanasi

Abstract: The generalized degrees of freedom (GDoF) region of the multiple-input multiple-output (MIMO) Gaussian Z-interference channel with an arbitrary number of antennas at each node is established under the assumption of delayed channel state information at transmitters (CSIT). The GDoF region is parameterized by $α$, which links the interference-to-noise ratio (INR) to the signal-to-noise ratio (SNR) via $INR=SNR^α$. A new outer bound for the GDoF region is established by maximizing a bound on the weighted sum-rate of the two users, which in turn is obtained by using a combination of genie-aided side-information and an extremal inequality. The maximum weighted sum-rate in the high SNR regime is shown to occur when the transmission covariance matrix of the interfering transmitter has full rank. An achievability scheme based on block-Markov encoding and backward decoding is developed which uses interference quantization and digital multicasting to take advantage of the channel statistics of the cross-link, and the scheme is separately shown to be GDoF-optimal in both the weak ($α\leq1$) and strong ($α>1$) interference regimes. This is the first complete characterization of the GDoF region of any interference network with delayed CSIT, as well as the first such GDoF characterization of a MIMO network with delayed CSIT and arbitrary number of antennas at each node. For all antenna tuples, the GDoF region is shown to be equal to or larger than the degrees of freedom (DoF) region over the entire range of $α$, which leads to a V-shaped maximum sum-GDoF as a function of $α$, with the minimum occurring at $α=1$. The delayed CSIT GDoF region and the sum-DoF are compared with their counterparts under perfect CSIT, thereby characterizing all antenna tuples and ranges of $α$ for which delayed CSIT is sufficient to achieve the perfect CSIT GDoF region or sum-DoF.

Submitted 2 May, 2016; originally announced May 2016.

Comments: submitted, IEEE Transactions on Information Theory

arXiv:1509.07079 [pdf]

doi 10.1016/j.petrol.2014.06.019

Well Tops Guided Prediction of Reservoir Properties using Modular Neural Network Concept A Case Study from Western Onshore, India

Authors: Soumi Chaki, Akhilesh K Verma, Aurobinda Routray, William K Mohanty, Mamata Jenamani

Abstract: This paper proposes a complete framework consisting pre-processing, modeling, and post-processing stages to carry out well tops guided prediction of a reservoir property (sand fraction) from three seismic attributes (seismic impedance, instantaneous amplitude, and instantaneous frequency) using the concept of modular artificial neural network (MANN). The data set used in this study comprising three seismic attributes and well log data from eight wells, is acquired from a western onshore hydrocarbon field of India. Firstly, the acquired data set is integrated and normalized. Then, well log analysis and segmentation of the total depth range into three different units (zones) separated by well tops are carried out. Secondly, three different networks are trained corresponding to three different zones using combined data set of seven wells and then trained networks are validated using the remaining test well. The target property of the test well is predicted using three different tuned networks corresponding to three zones; and then the estimated values obtained from three different networks are concatenated to represent the predicted log along the complete depth range of the testing well. The application of multiple simpler networks instead of a single one improves the prediction accuracy in terms of performance metrics such as correlation coefficient, root mean square error, absolute error mean and program execution time.

Submitted 23 September, 2015; originally announced September 2015.

Comments: in Journal of Petroleum Science and Engineering, 2014

arXiv:1509.07074 [pdf]

doi 10.1016/j.jappgeo.2014.10.005

Quantification of sand fraction from seismic attributes using Neuro-Fuzzy approach

Authors: Akhilesh K Verma, Soumi Chaki, Aurobinda Routray, William K Mohanty, Mamata Jenamani

Abstract: In this paper, we illustrate the modeling of a reservoir property (sand fraction) from seismic attributes namely seismic impedance, seismic amplitude, and instantaneous frequency using Neuro-Fuzzy (NF) approach. Input dataset includes 3D post-stacked seismic attributes and six well logs acquired from a hydrocarbon field located in the western coast of India. Presence of thin sand and shale layers in the basin area makes the modeling of reservoir characteristic a challenging task. Though seismic data is helpful in extrapolation of reservoir properties away from boreholes; yet, it could be challenging to delineate thin sand and shale reservoirs using seismic data due to its limited resolvability. Therefore, it is important to develop state-of-art intelligent methods for calibrating a nonlinear mapping between seismic data and target reservoir variables. Neural networks have shown its potential to model such nonlinear mappings; however, uncertainties associated with the model and datasets are still a concern. Hence, introduction of Fuzzy Logic (FL) is beneficial for handling these uncertainties. More specifically, hybrid variants of Artificial Neural Network (ANN) and fuzzy logic, i.e., NF methods, are capable for the modeling reservoir characteristics by integrating the explicit knowledge representation power of FL with the learning ability of neural networks. The documented results in this study demonstrate acceptable resemblance between target and predicted variables, and hence, encourage the application of integrated machine learning approaches such as Neuro-Fuzzy in reservoir characterization domain. Furthermore, visualization of the variation of sand probability in the study area would assist in identifying placement of potential wells for future drilling operations.

Submitted 23 September, 2015; originally announced September 2015.

Comments: Journal of Applied Geophysics, volume 111, page 141-155

arXiv:1509.07065 [pdf]

doi 10.1109/JSTARS.2015.2404808

A Novel Pre-processing Scheme to Improve the Prediction of Sand Fraction from Seismic Attributes using Neural Networks

Authors: Soumi Chaki, Aurobinda Routray, William K. Mohanty

Abstract: This paper presents a novel pre-processing scheme to improve the prediction of sand fraction from multiple seismic attributes such as seismic impedance, amplitude and frequency using machine learning and information filtering. The available well logs along with the 3-D seismic data have been used to benchmark the proposed pre-processing stage using a methodology which primarily consists of three steps: pre-processing, training and post-processing. An Artificial Neural Network (ANN) with conjugate-gradient learning algorithm has been used to model the sand fraction. The available sand fraction data from the high resolution well logs has far more information content than the low resolution seismic attributes. Therefore, regularization schemes based on Fourier Transform (FT), Wavelet Decomposition (WD) and Empirical Mode Decomposition (EMD) have been proposed to shape the high resolution sand fraction data for effective machine learning. The input data sets have been segregated into training, testing and validation sets. The test results are primarily used to check different network structures and activation function performances. Once the network passes the testing phase with an acceptable performance in terms of the selected evaluators, the validation phase follows. In the validation stage, the prediction model is tested against unseen data. The network yielding satisfactory performance in the validation stage is used to predict lithological properties from seismic attributes throughout a given volume. Finally, a post-processing scheme using 3-D spatial filtering is implemented for smoothing the sand fraction in the volume. Prediction of lithological properties using this framework is helpful for Reservoir Characterization.

Submitted 23 September, 2015; originally announced September 2015.

Comments: 13 pages. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, volume 8, no. 4, pp. 1808-1820, April 2015

arXiv:1505.05962 [pdf, other]

Nearest Neighbor based Clustering Algorithm for Large Data Sets

Authors: Pankaj Kumar Yadav, Sriniwas Pandey, Sraban Kumar Mohanty

Abstract: Clustering is an unsupervised learning technique in which data or objects are grouped into sets based on some similarity measure. Most of the clustering algorithms assume that the main memory is infinite and can accommodate the set of patterns. In reality many applications give rise to a large set of patterns which does not fit in the main memory. When the data set is too large, much of the data is stored in the secondary memory. Input/Outputs (I/O) from the disk are the major bottleneck in designing efficient clustering algorithms for large data sets. Different designing techniques have been used to design clustering algorithms for large data sets. External memory algorithms are one class of algorithms which can be used for large data sets. These algorithms exploit the hierarchical memory structure of the computers by incorporating locality of reference directly in the algorithm. This paper makes some contribution towards designing clustering algorithms in the external memory model (Proposed by Aggarwal and Vitter 1988) to make the algorithms scalable. In this paper, it is shown that the Shared near neighbors algorithm is not very I/O efficient since the computational complexity is same as the I/O complexity. The algorithm is designed in the external memory model and I/O complexity is reduced. The computational complexity remains same. We substantiate the theoretical analysis by showing the performance of the algorithms with their traditional counterpart by implementing in STXXL library.

Submitted 22 May, 2015; originally announced May 2015.

Comments: 10 pages

arXiv:1312.1309 [pdf, other]

On the DoF Region of the K-user MISO Broadcast Channel with Hybrid CSIT

Authors: Kaniska Mohanty, Mahesh K. Varanasi

Abstract: An outer bound for the degrees of freedom (DoF) region of the K-user multiple-input single-output (MISO) broadcast channel (BC) is developed under the hybrid channel state information at transmitter (CSIT) model, in which the transmitter has instantaneous CSIT of channels to a subset of the receivers and delayed CSIT of channels to the rest of the receivers. For the 3-user MISO BC, when the transmitter has instantaneous CSIT of the channel to one receiver and delayed CSIT of channels to the other two, two new communication schemes are designed, which are able to achieve the DoF tuple of $\left(1,\frac{1}{3},\frac{1}{3}\right)$, with a sum DoF of $\frac{5}{3}$, that is greater than the sum DoF achievable only with delayed CSIT. Another communication scheme showing the benefit of the alternating CSIT model is also developed, to obtain the DoF tuple of $\left(1,\frac{4}{9},\frac{4}{9}\right)$ for the 3-user MISO BC.

Submitted 4 December, 2013; originally announced December 2013.

arXiv:1209.0047 [pdf, other]

The Degrees of Freedom Region of the MIMO Interference Channel with Hybrid CSIT

Authors: Kaniska Mohanty, Chinmay S. Vaze, Mahesh K. Varanasi

Abstract: The degrees of freedom (DoF) region of the two-user MIMO (multiple-input multiple-output) interference channel is established under a new model termed as hybrid CSIT. In this model, one transmitter has delayed channel state information (CSI) and the other transmitter has instantaneous CSIT, of incoming channel matrices at the respective unpaired receivers, and neither transmitter has any knowledge of the incoming channel matrices of its respective paired receiver. The DoF region for hybrid CSIT, and consequently that of $2\times2\times3^{5}$ CSIT models, is completely characterized, and a new achievable scheme based on a combination of transmit beamforming and retrospective interference alignment is developed. Conditions are obtained on the numbers of antennas at each of the four terminals such that the DoF region under hybrid CSIT is equal to that under (a) global and instantaneous CSIT and (b) global and delayed CSIT, with the remaining cases resulting in a DoF region with hybrid CSIT that lies somewhere in between the DoF regions under the instantaneous and delayed CSIT settings. Further synergistic benefits accruing from switching between the two hybrid CSIT models are also explored.

Submitted 2 December, 2013; v1 submitted 31 August, 2012; originally announced September 2012.

arXiv:1006.1307 [pdf, ps, other]

I/O Efficient Algorithms for Matrix Computations

Authors: Sraban Kumar Mohanty

Abstract: We analyse some QR decomposition algorithms, and show that the I/O complexity of the tile based algorithm is asymptotically the same as that of matrix multiplication. This algorithm, we show, performs the best when the tile size is chosen so that exactly one tile fits in the main memory. We propose a constant factor improvement, as well as a new recursive cache oblivious algorithm with the same asymptotic I/O complexity. We design Hessenberg, tridiagonal, and bidiagonal reductions that use banded intermediate forms, and perform only asymptotically optimal numbers of I/Os; these are the first I/O optimal algorithms for these problems. In particular, we show that known slab based algorithms for two sided reductions all have suboptimal asymptotic I/O performances, even though they have been reported to do better than the traditional algorithms on the basis of empirical evidence. We propose new tile based variants of multishift QR and QZ algorithms that under certain conditions on the number of shifts, have better seek and I/O complexities than all known variants. We show that techniques like rescheduling of computational steps, appropriate choosing of the blocking parameters and incorporating of more matrix-matrix operations, can be used to improve the I/O and seek complexities of matrix computations.

Submitted 7 June, 2010; originally announced June 2010.

MSC Class: 15A23; 11Y16; 65Y20; 68Q25; 68W40; 68W05
ACM Class: F.2.1; G.1.0; I.1.2

Showing 1–12 of 12 results for author: Mohanty, K