-
Super-Resolution of SOHO/MDI Magnetograms of Solar Active Regions Using SDO/HMI Data and an Attention-Aided Convolutional Neural Network
Authors:
Chunhui Xu,
Jason T. L. Wang,
Haimin Wang,
Haodi Jiang,
Qin Li,
Yasser Abduallah,
Yan Xu
Abstract:
Image super-resolution has been an important subject in image processing and recognition. Here, we present an attention-aided convolutional neural network (CNN) for solar image super-resolution. Our method, named SolarCNN, aims to enhance the quality of line-of-sight (LOS) magnetograms of solar active regions (ARs) collected by the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric…
▽ More
Image super-resolution has been an important subject in image processing and recognition. Here, we present an attention-aided convolutional neural network (CNN) for solar image super-resolution. Our method, named SolarCNN, aims to enhance the quality of line-of-sight (LOS) magnetograms of solar active regions (ARs) collected by the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric Observatory (SOHO). The ground-truth labels used for training SolarCNN are the LOS magnetograms collected by the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO). Solar ARs consist of strong magnetic fields in which magnetic energy can suddenly be released to produce extreme space weather events, such as solar flares, coronal mass ejections, and solar energetic particles. SOHO/MDI covers Solar Cycle 23, which is stronger with more eruptive events than Cycle 24. Enhanced SOHO/MDI magnetograms allow for better understanding and forecasting of violent events of space weather. Experimental results show that SolarCNN improves the quality of SOHO/MDI magnetograms in terms of the structural similarity index measure (SSIM), Pearson's correlation coefficient (PCC), and the peak signal-to-noise ratio (PSNR).
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
Prediction of the SYM-H Index Using a Bayesian Deep Learning Method with Uncertainty Quantification
Authors:
Yasser Abduallah,
Khalid A. Alobaid,
Jason T. L. Wang,
Haimin Wang,
Vania K. Jordanova,
Vasyl Yurchyshyn,
Huseyin Cavus,
Ju **g
Abstract:
We propose a novel deep learning framework, named SYMHnet, which employs a graph neural network and a bidirectional long short-term memory network to cooperatively learn patterns from solar wind and interplanetary magnetic field parameters for short-term forecasts of the SYM-H index based on 1-minute and 5-minute resolution data. SYMHnet takes, as input, the time series of the parameters' values p…
▽ More
We propose a novel deep learning framework, named SYMHnet, which employs a graph neural network and a bidirectional long short-term memory network to cooperatively learn patterns from solar wind and interplanetary magnetic field parameters for short-term forecasts of the SYM-H index based on 1-minute and 5-minute resolution data. SYMHnet takes, as input, the time series of the parameters' values provided by NASA's Space Science Data Coordinated Archive and predicts, as output, the SYM-H index value at time point t + w hours for a given time point t where w is 1 or 2. By incorporating Bayesian inference into the learning framework, SYMHnet can quantify both aleatoric (data) uncertainty and epistemic (model) uncertainty when predicting future SYM-H indices. Experimental results show that SYMHnet works well at quiet time and storm time, for both 1-minute and 5-minute resolution data. The results also show that SYMHnet generally performs better than related machine learning methods. For example, SYMHnet achieves a forecast skill score (FSS) of 0.343 compared to the FSS of 0.074 of a recent gradient boosting machine (GBM) method when predicting SYM-H indices (1 hour in advance) in a large storm (SYM-H = -393 nT) using 5-minute resolution data. When predicting the SYM-H indices (2 hours in advance) in the large storm, SYMHnet achieves an FSS of 0.553 compared to the FSS of 0.087 of the GBM method. In addition, SYMHnet can provide results for both data and model uncertainty quantification, whereas the related methods cannot.
△ Less
Submitted 26 February, 2024;
originally announced February 2024.
-
Estimating Coronal Mass Ejection Mass and Kinetic Energy by Fusion of Multiple Deep-learning Models
Authors:
Khalid A. Alobaid,
Yasser Abduallah,
Jason T. L. Wang,
Haimin Wang,
Shen Fan,
Jialiang Li,
Huseyin Cavus,
Vasyl Yurchyshyn
Abstract:
Coronal mass ejections (CMEs) are massive solar eruptions, which have a significant impact on Earth. In this paper, we propose a new method, called DeepCME, to estimate two properties of CMEs, namely, CME mass and kinetic energy. Being able to estimate these properties helps better understand CME dynamics. Our study is based on the CME catalog maintained at the Coordinated Data Analysis Workshops…
▽ More
Coronal mass ejections (CMEs) are massive solar eruptions, which have a significant impact on Earth. In this paper, we propose a new method, called DeepCME, to estimate two properties of CMEs, namely, CME mass and kinetic energy. Being able to estimate these properties helps better understand CME dynamics. Our study is based on the CME catalog maintained at the Coordinated Data Analysis Workshops (CDAW) Data Center, which contains all CMEs manually identified since 1996 using the Large Angle and Spectrometric Coronagraph (LASCO) on board the Solar and Heliospheric Observatory (SOHO). We use LASCO C2 data in the period between January 1996 and December 2020 to train, validate and test DeepCME through 10-fold cross validation. The DeepCME method is a fusion of three deep learning models, including ResNet, InceptionNet, and InceptionResNet. Our fusion model extracts features from LASCO C2 images, effectively combining the learning capabilities of the three component models to jointly estimate the mass and kinetic energy of CMEs. Experimental results show that the fusion model yields a mean relative error (MRE) of 0.013 (0.009, respectively) compared to the MRE of 0.019 (0.017, respectively) of the best component model InceptionResNet (InceptionNet, respectively) in estimating the CME mass (kinetic energy, respectively). To our knowledge, this is the first time that deep learning has been used for CME mass and kinetic energy estimations.
△ Less
Submitted 4 December, 2023;
originally announced December 2023.
-
Ensemble Learning for CME Arrival Time Prediction
Authors:
Khalid A. Alobaid,
Jason T. L. Wang
Abstract:
The Sun constantly releases radiation and plasma into the heliosphere. Sporadically, the Sun launches solar eruptions such as flares and coronal mass ejections (CMEs). CMEs carry away a huge amount of mass and magnetic flux with them. An Earth-directed CME can cause serious consequences to the human system. It can destroy power grids/pipelines, satellites, and communications. Therefore, accurately…
▽ More
The Sun constantly releases radiation and plasma into the heliosphere. Sporadically, the Sun launches solar eruptions such as flares and coronal mass ejections (CMEs). CMEs carry away a huge amount of mass and magnetic flux with them. An Earth-directed CME can cause serious consequences to the human system. It can destroy power grids/pipelines, satellites, and communications. Therefore, accurately monitoring and predicting CMEs is important to minimize damages to the human system. In this study we propose an ensemble learning approach, named CMETNet, for predicting the arrival time of CMEs from the Sun to the Earth. We collect and integrate eruptive events from two solar cycles, #23 and #24, from 1996 to 2021 with a total of 363 geoeffective CMEs. The data used for making predictions include CME features, solar wind parameters and CME images obtained from the SOHO/LASCO C2 coronagraph. Our ensemble learning framework comprises regression algorithms for numerical data analysis and a convolutional neural network for image processing. Experimental results show that CMETNet performs better than existing machine learning methods reported in the literature, with a Pearson product-moment correlation coefficient of 0.83 and a mean absolute error of 9.75 hours.
△ Less
Submitted 29 April, 2023;
originally announced May 2023.
-
A Deep Learning Approach to Generating Photospheric Vector Magnetograms of Solar Active Regions for SOHO/MDI Using SDO/HMI and BBSO Data
Authors:
Haodi Jiang,
Qin Li,
Zhihang Hu,
Nian Liu,
Yasser Abduallah,
Ju **g,
Genwei Zhang,
Yan Xu,
Wynne Hsu,
Jason T. L. Wang,
Haimin Wang
Abstract:
Solar activity is usually caused by the evolution of solar magnetic fields. Magnetic field parameters derived from photospheric vector magnetograms of solar active regions have been used to analyze and forecast eruptive events such as solar flares and coronal mass ejections. Unfortunately, the most recent solar cycle 24 was relatively weak with few large flares, though it is the only solar cycle i…
▽ More
Solar activity is usually caused by the evolution of solar magnetic fields. Magnetic field parameters derived from photospheric vector magnetograms of solar active regions have been used to analyze and forecast eruptive events such as solar flares and coronal mass ejections. Unfortunately, the most recent solar cycle 24 was relatively weak with few large flares, though it is the only solar cycle in which consistent time-sequence vector magnetograms have been available through the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO) since its launch in 2010. In this paper, we look into another major instrument, namely the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric Observatory (SOHO) from 1996 to 2010. The data archive of SOHO/MDI covers more active solar cycle 23 with many large flares. However, SOHO/MDI data only has line-of-sight (LOS) magnetograms. We propose a new deep learning method, named MagNet, to learn from combined LOS magnetograms, Bx and By taken by SDO/HMI along with H-alpha observations collected by the Big Bear Solar Observatory (BBSO), and to generate vector components Bx' and By', which would form vector magnetograms with observed LOS data. In this way, we can expand the availability of vector magnetograms to the period from 1996 to present. Experimental results demonstrate the good performance of the proposed method. To our knowledge, this is the first time that deep learning has been used to generate photospheric vector magnetograms of solar active regions for SOHO/MDI using SDO/HMI and H-alpha data.
△ Less
Submitted 4 November, 2022;
originally announced November 2022.
-
Inferring Line-of-Sight Velocities and Doppler Widths from Stokes Profiles of GST/NIRIS Using Stacked Deep Neural Networks
Authors:
Haodi Jiang,
Qin Li,
Yan Xu,
Wynne Hsu,
Kwangsu Ahn,
Wenda Cao,
Jason T. L. Wang,
Haimin Wang
Abstract:
Obtaining high-quality magnetic and velocity fields through Stokes inversion is crucial in solar physics. In this paper, we present a new deep learning method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight (LOS) velocities and Doppler widths from Stokes profiles collected by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at th…
▽ More
Obtaining high-quality magnetic and velocity fields through Stokes inversion is crucial in solar physics. In this paper, we present a new deep learning method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight (LOS) velocities and Doppler widths from Stokes profiles collected by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory (BBSO). The training data of SDNN is prepared by a Milne-Eddington (ME) inversion code used by BBSO. We quantitatively assess SDNN, comparing its inversion results with those obtained by the ME inversion code and related machine learning (ML) algorithms such as multiple support vector regression, multilayer perceptrons and a pixel-level convolutional neural network. Major findings from our experimental study are summarized as follows. First, the SDNN-inferred LOS velocities are highly correlated to the ME-calculated ones with the Pearson product-moment correlation coefficient being close to 0.9 on average. Second, SDNN is faster, while producing smoother and cleaner LOS velocity and Doppler width maps, than the ME inversion code. Third, the maps produced by SDNN are closer to ME's maps than those from the related ML algorithms, demonstrating the better learning capability of SDNN than the ML algorithms. Finally, comparison between the inversion results of ME and SDNN based on GST/NIRIS and those from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory in flare-prolific active region NOAA 12673 is presented. We also discuss extensions of SDNN for inferring vector magnetic fields with empirical evaluation.
△ Less
Submitted 8 October, 2022;
originally announced October 2022.
-
A Deep Learning Approach to Dst Index Prediction
Authors:
Yasser Abduallah,
Jason T. L. Wang,
Prianka Bose,
Genwei Zhang,
Firas Gerges,
Haimin Wang
Abstract:
The disturbance storm time (Dst) index is an important and useful measurement in space weather research. It has been used to characterize the size and intensity of a geomagnetic storm. A negative Dst value means that the Earth's magnetic field is weakened, which happens during storms. In this paper, we present a novel deep learning method, called the Dst Transformer, to perform short-term, 1-6 hou…
▽ More
The disturbance storm time (Dst) index is an important and useful measurement in space weather research. It has been used to characterize the size and intensity of a geomagnetic storm. A negative Dst value means that the Earth's magnetic field is weakened, which happens during storms. In this paper, we present a novel deep learning method, called the Dst Transformer, to perform short-term, 1-6 hour ahead, forecasting of the Dst index based on the solar wind parameters provided by the NASA Space Science Data Coordinated Archive. The Dst Transformer combines a multi-head attention layer with Bayesian inference, which is capable of quantifying both aleatoric uncertainty and epistemic uncertainty when making Dst predictions. Experimental results show that the proposed Dst Transformer outperforms related machine learning methods in terms of the root mean square error and R-squared. Furthermore, the Dst Transformer can produce both data and model uncertainty quantification results, which can not be done by the existing methods. To our knowledge, this is the first time that Bayesian deep learning has been used for Dst index forecasting.
△ Less
Submitted 5 May, 2022;
originally announced May 2022.
-
Predicting Solar Energetic Particles Using SDO/HMI Vector Magnetic Data Products and a Bidirectional LSTM Network
Authors:
Yasser Abduallah,
Vania K. Jordanova,
Hao Liu,
Qin Li,
Jason T. L. Wang,
Haimin Wang
Abstract:
Solar energetic particles (SEPs) are an essential source of space radiation, which are hazards for humans in space, spacecraft, and technology in general. In this paper we propose a deep learning method, specifically a bidirectional long short-term memory (biLSTM) network, to predict if an active region (AR) would produce an SEP event given that (i) the AR will produce an M- or X-class flare and a…
▽ More
Solar energetic particles (SEPs) are an essential source of space radiation, which are hazards for humans in space, spacecraft, and technology in general. In this paper we propose a deep learning method, specifically a bidirectional long short-term memory (biLSTM) network, to predict if an active region (AR) would produce an SEP event given that (i) the AR will produce an M- or X-class flare and a coronal mass ejection (CME) associated with the flare, or (ii) the AR will produce an M- or X-class flare regardless of whether or not the flare is associated with a CME. The data samples used in this study are collected from the Geostationary Operational Environmental Satellite's X-ray flare catalogs provided by the National Centers for Environmental Information. We select M- and X-class flares with identified ARs in the catalogs for the period between 2010 and 2021, and find the associations of flares, CMEs and SEPs in the Space Weather Database of Notifications, Knowledge, Information during the same period. Each data sample contains physical parameters collected from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory. Experimental results based on different performance metrics demonstrate that the proposed biLSTM network is better than related machine learning algorithms for the two SEP prediction tasks studied here. We also discuss extensions of our approach for probabilistic forecasting and calibration with empirical evaluation.
△ Less
Submitted 27 March, 2022;
originally announced March 2022.
-
Deep Learning Based Reconstruction of Total Solar Irradiance
Authors:
Yasser Abduallah,
Jason T. L. Wang,
Yucong Shen,
Khalid A. Alobaid,
Serena Criscuoli,
Haimin Wang
Abstract:
The Earth's primary source of energy is the radiant energy generated by the Sun, which is referred to as solar irradiance, or total solar irradiance (TSI) when all of the radiation is measured. A minor change in the solar irradiance can have a significant impact on the Earth's climate and atmosphere. As a result, studying and measuring solar irradiance is crucial in understanding climate changes a…
▽ More
The Earth's primary source of energy is the radiant energy generated by the Sun, which is referred to as solar irradiance, or total solar irradiance (TSI) when all of the radiation is measured. A minor change in the solar irradiance can have a significant impact on the Earth's climate and atmosphere. As a result, studying and measuring solar irradiance is crucial in understanding climate changes and solar variability. Several methods have been developed to reconstruct total solar irradiance for long and short periods of time; however, they are physics-based and rely on the availability of data, which does not go beyond 9,000 years. In this paper we propose a new method, called TSInet, to reconstruct total solar irradiance by deep learning for short and long periods of time that span beyond the physical models' data availability. On the data that are available, our method agrees well with the state-of-the-art physics-based reconstruction models. To our knowledge, this is the first time that deep learning has been used to reconstruct total solar irradiance for more than 9,000 years.
△ Less
Submitted 23 July, 2021;
originally announced July 2021.
-
Tracing Halpha Fibrils through Bayesian Deep Learning
Authors:
Haodi Jiang,
Ju **g,
Jiasheng Wang,
Chang Liu,
Qin Li,
Yan Xu,
Jason T. L. Wang,
Haimin Wang
Abstract:
We present a new deep learning method, dubbed FibrilNet, for tracing chromospheric fibrils in Halpha images of solar observations. Our method consists of a data pre-processing component that prepares training data from a threshold-based tool, a deep learning model implemented as a Bayesian convolutional neural network for probabilistic image segmentation with uncertainty quantification to predict…
▽ More
We present a new deep learning method, dubbed FibrilNet, for tracing chromospheric fibrils in Halpha images of solar observations. Our method consists of a data pre-processing component that prepares training data from a threshold-based tool, a deep learning model implemented as a Bayesian convolutional neural network for probabilistic image segmentation with uncertainty quantification to predict fibrils, and a post-processing component containing a fibril-fitting algorithm to determine fibril orientations. The FibrilNet tool is applied to high-resolution Halpha images from an active region (AR 12665) collected by the 1.6 m Goode Solar Telescope (GST) equipped with high-order adaptive optics at the Big Bear Solar Observatory (BBSO). We quantitatively assess the FibrilNet tool, comparing its image segmentation algorithm and fibril-fitting algorithm with those employed by the threshold-based tool. Our experimental results and major findings are summarized as follows. First, the image segmentation results (i.e., detected fibrils) of the two tools are quite similar, demonstrating the good learning capability of FibrilNet. Second, FibrilNet finds more accurate and smoother fibril orientation angles than the threshold-based tool. Third, FibrilNet is faster than the threshold-based tool and the uncertainty maps produced by FibrilNet not only provide a quantitative way to measure the confidence on each detected fibril, but also help identify fibril structures that are not detected by the threshold-based tool but are inferred through machine learning. Finally, we apply FibrilNet to full-disk Halpha images from other solar observatories and additional high-resolution Halpha images collected by BBSO/GST, demonstrating the tool's usability in diverse datasets.
△ Less
Submitted 16 July, 2021;
originally announced July 2021.
-
DeepSun: Machine-Learning-as-a-Service for Solar Flare Prediction
Authors:
Yasser Abduallah,
Jason T. L. Wang,
Yang Nie,
Chang Liu,
Haimin Wang
Abstract:
Solar flare prediction plays an important role in understanding and forecasting space weather. The main goal of the Helioseismic and Magnetic Imager (HMI), one of the instruments on NASA's Solar Dynamics Observatory, is to study the origin of solar variability and characterize the Sun's magnetic activity. HMI provides continuous full-disk observations of the solar vector magnetic field with high c…
▽ More
Solar flare prediction plays an important role in understanding and forecasting space weather. The main goal of the Helioseismic and Magnetic Imager (HMI), one of the instruments on NASA's Solar Dynamics Observatory, is to study the origin of solar variability and characterize the Sun's magnetic activity. HMI provides continuous full-disk observations of the solar vector magnetic field with high cadence data that lead to reliable predictive capability; yet, solar flare prediction effort utilizing these data is still limited. In this paper, we present a machine-learning-as-a-service (MLaaS) framework, called DeepSun, for predicting solar flares on the Web based on HMI's data products. Specifically, we construct training data by utilizing the physical parameters provided by the Space-weather HMI Active Region Patches (SHARP) and categorize solar flares into four classes, namely B, C, M, X, according to the X-ray flare catalogs available at the National Centers for Environmental Information (NCEI). Thus, the solar flare prediction problem at hand is essentially a multi-class (i.e., four-class) classification problem. The DeepSun system employs several machine learning algorithms to tackle this multi-class prediction problem and provides an application programming interface (API) for remote programming users. To our knowledge, DeepSun is the first MLaaS tool capable of predicting solar flares through the Internet.
△ Less
Submitted 3 September, 2020;
originally announced September 2020.
-
Identifying and Tracking Solar Magnetic Flux Elements with Deep Learning
Authors:
Haodi Jiang,
Jiasheng Wang,
Chang Liu,
Ju **g,
Hao Liu,
Jason T. L. Wang,
Haimin Wang
Abstract:
Deep learning has drawn a lot of interest in recent years due to its effectiveness in processing big and complex observational data gathered from diverse instruments. Here we propose a new deep learning method, called SolarUnet, to identify and track solar magnetic flux elements or features in observed vector magnetograms based on the Southwest Automatic Magnetic Identification Suite (SWAMIS). Our…
▽ More
Deep learning has drawn a lot of interest in recent years due to its effectiveness in processing big and complex observational data gathered from diverse instruments. Here we propose a new deep learning method, called SolarUnet, to identify and track solar magnetic flux elements or features in observed vector magnetograms based on the Southwest Automatic Magnetic Identification Suite (SWAMIS). Our method consists of a data pre-processing component that prepares training data from the SWAMIS tool, a deep learning model implemented as a U-shaped convolutional neural network for fast and accurate image segmentation, and a post-processing component that prepares tracking results. SolarUnet is applied to data from the 1.6 meter Goode Solar Telescope at the Big Bear Solar Observatory. When compared to the widely used SWAMIS tool, SolarUnet is faster while agreeing mostly with SWAMIS on feature size and flux distributions, and complementing SWAMIS in tracking long-lifetime features. Thus, the proposed physics-guided deep learning-based tool can be considered as an alternative method for solar magnetic tracking.
△ Less
Submitted 27 August, 2020;
originally announced August 2020.
-
Inferring Vector Magnetic Fields from Stokes Profiles of GST/NIRIS Using a Convolutional Neural Network
Authors:
Hao Liu,
Yan Xu,
Jiasheng Wang,
Ju **g,
Chang Liu,
Jason T. L. Wang,
Haimin Wang
Abstract:
We propose a new machine learning approach to Stokes inversion based on a convolutional neural network (CNN) and the Milne-Eddington (ME) method. The Stokes measurements used in this study were taken by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory. By learning the latent patterns in the training data prepared by the…
▽ More
We propose a new machine learning approach to Stokes inversion based on a convolutional neural network (CNN) and the Milne-Eddington (ME) method. The Stokes measurements used in this study were taken by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory. By learning the latent patterns in the training data prepared by the physics-based ME tool, the proposed CNN method is able to infer vector magnetic fields from the Stokes profiles of GST/NIRIS. Experimental results show that our CNN method produces smoother and cleaner magnetic maps than the widely used ME method. Furthermore, the CNN method is 4~6 times faster than the ME method, and is able to produce vector magnetic fields in near real-time, which is essential to space weather forecasting. Specifically, it takes ~50 seconds for the CNN method to process an image of 720 x 720 pixels comprising Stokes profiles of GST/NIRIS. Finally, the CNN-inferred results are highly correlated to the ME-calculated results and are closer to the ME's results with the Pearson product-moment correlation coefficient (PPMCC) being closer to 1 on average than those from other machine learning algorithms such as multiple support vector regression and multilayer perceptrons (MLP). In particular, the CNN method outperforms the current best machine learning method (MLP) by 2.6% on average in PPMCC according to our experimental study. Thus, the proposed physics-assisted deep learning-based CNN tool can be considered as an alternative, efficient method for Stokes inversion for high resolution polarimetric observations obtained by GST/NIRIS.
△ Less
Submitted 8 May, 2020;
originally announced May 2020.
-
Predicting Coronal Mass Ejections Using SDO/HMI Vector Magnetic Data Products and Recurrent Neural Networks
Authors:
Hao Liu,
Chang Liu,
Jason T. L. Wang,
Haimin Wang
Abstract:
We present two recurrent neural networks (RNNs), one based on gated recurrent units and the other based on long short-term memory, for predicting whether an active region (AR) that produces an M- or X-class flare will also produce a coronal mass ejection (CME). We model data samples in an AR as time series and use the RNNs to capture temporal information of the data samples. Each data sample has 1…
▽ More
We present two recurrent neural networks (RNNs), one based on gated recurrent units and the other based on long short-term memory, for predicting whether an active region (AR) that produces an M- or X-class flare will also produce a coronal mass ejection (CME). We model data samples in an AR as time series and use the RNNs to capture temporal information of the data samples. Each data sample has 18 physical parameters, or features, derived from photospheric vector magnetic field data taken by the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO). We survey M- and X-class flares that occurred from 2010 May to 2019 May using the Geostationary Operational Environmental Satellite's X-ray flare catalogs provided by the National Centers for Environmental Information (NCEI), and select those flares with identified ARs in the NCEI catalogs. In addition, we extract the associations of flares and CMEs from the Space Weather Database Of Notifications, Knowledge, Information (DONKI). We use the information gathered above to build the labels (positive versus negative) of the data samples at hand. Experimental results demonstrate the superiority of our RNNs over closely related machine learning methods in predicting the labels of the data samples. We also discuss an extension of our approach to predict a probabilistic estimate of how likely an M- or X-class flare will initiate a CME, with good performance results. To our knowledge this is the first time that RNNs have been used for CME prediction.
△ Less
Submitted 22 February, 2020;
originally announced February 2020.
-
Predicting Solar Flares Using a Long Short-Term Memory Network
Authors:
Hao Liu,
Chang Liu,
Jason T. L. Wang,
Haimin Wang
Abstract:
We present a long short-term memory (LSTM) network for predicting whether an active region (AR) would produce a gamma-class flare within the next 24 hours. We consider three gamma classes, namely >=M5.0 class, >=M class, and >=C class, and build three LSTM models separately, each corresponding to a gamma class. Each LSTM model is used to make predictions of its corresponding gamma-class flares. Th…
▽ More
We present a long short-term memory (LSTM) network for predicting whether an active region (AR) would produce a gamma-class flare within the next 24 hours. We consider three gamma classes, namely >=M5.0 class, >=M class, and >=C class, and build three LSTM models separately, each corresponding to a gamma class. Each LSTM model is used to make predictions of its corresponding gamma-class flares. The essence of our approach is to model data samples in an AR as time series and use LSTMs to capture temporal information of the data samples. Each data sample has 40 features including 25 magnetic parameters obtained from the Space-weather HMI Active Region Patches (SHARP) and related data products as well as 15 flare history parameters. We survey the flare events that occurred from 2010 May to 2018 May, using the GOES X-ray flare catalogs provided by the National Centers for Environmental Information (NCEI), and select flares with identified ARs in the NCEI flare catalogs. These flare events are used to build the labels (positive vs. negative) of the data samples. Experimental results show that (i) using only 14-22 most important features including both flare history and magnetic parameters can achieve better performance than using all the 40 features together; (ii) our LSTM network outperforms related machine learning methods in predicting the labels of the data samples. To our knowledge, this is the first time that LSTMs have been used for solar flare prediction.
△ Less
Submitted 16 May, 2019;
originally announced May 2019.
-
MapReduce Algorithms for Inferring Gene Regulatory Networks from Time-Series Microarray Data Using an Information-Theoretic Approach
Authors:
Yasser Abduallah,
Turki Turki,
Kevin Byron,
Zongxuan Du,
Miguel Cervantes-Cervantes,
Jason T. L. Wang
Abstract:
Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections, are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous…
▽ More
Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections, are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, prodding through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising as reported in the literature. Here we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool while achieving slightly better prediction accuracy than the existing tool.
△ Less
Submitted 15 March, 2017;
originally announced April 2017.
-
A Computational Approach to Finding RNA Tertiary Motifs in Genomic Sequences
Authors:
Kevin Byron,
Jason T. L. Wang
Abstract:
Motif finding in DNA, RNA and proteins plays an important role in life science research. Recent patents concerning motif finding in the biomolecular data are recorded in the DNA Patent Database which serves as a resource for policy makers and members of the general public interested in fields like genomics, genetics and biotechnology. In this paper we present a computational approach to mining for…
▽ More
Motif finding in DNA, RNA and proteins plays an important role in life science research. Recent patents concerning motif finding in the biomolecular data are recorded in the DNA Patent Database which serves as a resource for policy makers and members of the general public interested in fields like genomics, genetics and biotechnology. In this paper we present a computational approach to mining for RNA tertiary motifs in genomic sequences. Specifically we describe a method, named CSminer, for finding RNA coaxial helical stackings in genomes. A coaxial helical stacking occurs in an RNA tertiary structure where two separate helical elements form a pseudocontiguous helix and provides thermodynamic stability to the molecule as a whole. Experimental results demonstrate the effectiveness of our approach.
△ Less
Submitted 2 January, 2017;
originally announced January 2017.
-
Effective Classification of MicroRNA Precursors Using Combinatorial Feature Mining and AdaBoost Algorithms
Authors:
Ling Zhong,
Jason T. L. Wang
Abstract:
MicroRNAs (miRNAs) are non-coding RNAs with approximately 22 nucleotides (nt) that are derived from precursor molecules. These precursor molecules or pre-miRNAs often fold into stem-loop hairpin structures. However, a large number of sequences with pre-miRNA-like hairpins can be found in genomes. It is a challenge to distinguish the real pre-miRNAs from other hairpin sequences with similar stem-lo…
▽ More
MicroRNAs (miRNAs) are non-coding RNAs with approximately 22 nucleotides (nt) that are derived from precursor molecules. These precursor molecules or pre-miRNAs often fold into stem-loop hairpin structures. However, a large number of sequences with pre-miRNA-like hairpins can be found in genomes. It is a challenge to distinguish the real pre-miRNAs from other hairpin sequences with similar stem-loops (referred to as pseudo pre-miRNAs). Several computational methods have been developed to tackle this challenge. In this paper we propose a new method, called MirID, for identifying and classifying microRNA precursors. We collect 74 features from the sequences and secondary structures of pre-miRNAs; some of these features are taken from our previous studies on non-coding RNA prediction while others were suggested in the literature. We develop a combinatorial feature mining algorithm to identify suitable feature sets. These feature sets are then used to train support vector machines to obtain classification models, based on which classifier ensemble is constructed. Finally we use an AdaBoost algorithm to further enhance the accuracy of the classifier ensemble. Experimental results on a variety of species demonstrate the good performance of the proposed method, and its superiority over existing tools.
△ Less
Submitted 6 October, 2016;
originally announced October 2016.
-
CHSalign: A Web Server That Builds upon Junction-Explorer and RNAJAG for Pairwise Alignment of RNA Secondary Structures with Coaxial Helical Stacking
Authors:
Lei Hua,
Yang Song,
Namhee Kim,
Christian Laing,
Jason T. L. Wang,
Tamar Schlick
Abstract:
RNA junctions are important structural elements of RNA molecules. They are formed when three or more helices come together in three-dimensional space. Recent studies have focused on the annotation and prediction of coaxial helical stacking (CHS) motifs within junctions. Here we exploit such predictions to develop an efficient alignment tool to handle RNA secondary structures with CHS motifs. Speci…
▽ More
RNA junctions are important structural elements of RNA molecules. They are formed when three or more helices come together in three-dimensional space. Recent studies have focused on the annotation and prediction of coaxial helical stacking (CHS) motifs within junctions. Here we exploit such predictions to develop an efficient alignment tool to handle RNA secondary structures with CHS motifs. Specifically, we build upon our Junction-Explorer software for predicting coaxial stacking and RNAJAG for modelling junction topologies as tree graphs to incorporate constrained tree matching and dynamic programming algorithms into a new method, called CHSalign, for aligning the secondary structures of RNA molecules containing CHS motifs. Thus, CHSalign is intended to be an efficient alignment tool for RNAs containing similar junctions. Experimental results based on thousands of alignments demonstrate that CHSalign can align two RNA secondary structures containing CHS motifs more accurately than other RNA secondary structure alignment tools. CHSalign yields a high score when aligning two RNA secondary structures with similar CHS motifs or helical arrangement patterns, and a low score otherwise. This new method has been implemented in a web server, and the program is also made freely available, at http://bioinformatics.njit.edu/CHSalign/.
△ Less
Submitted 5 September, 2016;
originally announced September 2016.
-
Semi-Supervised Prediction of Gene Regulatory Networks Using Machine Learning Algorithms
Authors:
Nihir Patel,
Jason T. L. Wang
Abstract:
Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learni…
▽ More
Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabeled data for training. We investigate inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabeled data. We then apply our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluate the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.
△ Less
Submitted 11 August, 2016;
originally announced August 2016.