License: CC BY-NC-SA 4.0
arXiv:2312.15989v1 [astro-ph.SR] 26 Dec 2023

Parameter Estimation of LAMOST Medium-resolution Stellar Spectra

Xiangru Li$^{1}$, Xiaoyu Zhang$^{1}$, Shengchun Xiong$^{1}$, Yulong Zheng$^{1}$, and Hui Li$^{1}$
$^{1}$School of Computer Science, South China Normal University, No. 55 West of Yat-sen Avenue, Guangzhou 510631, China
E-mail: [email protected]
(Accepted XXX. Received YYY; in original form ZZZ)
Abstract

This paper investigates the problem of estimating three stellar atmospheric physical parameters and thirteen elemental abundances for medium-resolution spectra from the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST). Typical characteristics of these spectra are their huge scale, wide range of spectral signal-to-noise ratios, and uneven distribution in parameter space. These characteristics lead to unsatisfactory results on the spectra with low temperature, high temperature or low metallicity. To this end, this paper proposes a Stellar Parameter Estimation method based on Multiple Regions (SPEMR) that effectively improves parameter estimation accuracy. On the spectra with S/N $\geq 10$, the precisions are 47 K, 0.08 dex and 0.03 dex for the estimations of $T_{\rm eff}$, $\log\,g$ and [Fe/H], respectively; 0.03 dex to 0.06 dex for the elements C, Mg, Al, Si, Ca, Mn and Ni; 0.07 dex to 0.13 dex for N, O, S, K and Ti; and 0.16 dex for Cr. For the reference of astronomical science researchers and algorithm researchers, we released a catalog for 4.19 million medium-resolution spectra from LAMOST DR8, together with the experimental code, trained model, training data, and test data.

keywords:
methods: data analysis – methods: statistical – stars: abundances – stars: fundamental parameters.

1 Introduction

In this paper, we study the problem of estimating stellar atmospheric parameters and elemental abundances from medium-resolution stellar spectra of the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST; Cui et al., 2012; Liu et al., 2015). LAMOST, also known as the Guo Shoujing Telescope, is a large optical telescope. It has the highest spectral acquisition rate in the world and has provided a wealth of valuable spectral data for astronomical researchers. In October 2018, LAMOST started its second-stage survey program (LAMOST II), which conducts both low- and medium-resolution spectroscopic surveys (Wang et al., 2019). The wavelength coverages of the LAMOST medium-resolution spectra are [4950, 5350] Å and [6300, 6800] Å (Rui et al., 2019). LAMOST DR8 released 5.53 million medium-resolution spectra, of which 4.19 million have a signal-to-noise ratio (S/N) greater than 10 (Wang et al., 2019).

From the large amount of spectroscopic data obtained during the LAMOST survey, stellar parameters and elemental abundances can be estimated for a huge number of stars (Liu et al., 2014). These parameters and elemental abundances can be used to infer the stars' properties and their evolutionary history (Recio-Blanco et al., 2022). So far, researchers have proposed many methods to estimate stellar parameters from LAMOST spectra. In addition, the LAMOST survey has its own parameter estimation pipeline, the LAMOST Stellar Parameter Pipeline (LASP; Luo et al., 2015; Wu et al., 2011). LASP works by minimizing the chi-square ($\chi^2$) distance between the observed spectrum and theoretical spectra to find the best-matching template, and gives the parameter estimate for the observed spectrum accordingly (Prugniel & Soubiran, 2001). One limitation of this traditional method is that its computational complexity depends more on the grid that generates the theoretical spectra than on the complexity of the problem, which results in relatively low computational efficiency. Another limitation is its requirement for high-quality observed data. However, the LAMOST observational spectral library is characterized by a large amount of data and a wide range of signal-to-noise ratios. There is therefore considerable room for improvement in the parameter estimation of LAMOST spectra.
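The template-matching idea described above can be illustrated with a minimal sketch. It assumes a chi-square distance as the mismatch measure and uses a made-up one-line spectral grid; the function name `best_template`, the toy grid, and the noise level are illustrative assumptions, and the real LASP pipeline is considerably more elaborate.

```python
import numpy as np

def best_template(flux, sigma, templates, labels):
    """Return (label, chi2) of the template with the smallest chi-square distance.

    flux      : (n_pix,) observed flux
    sigma     : (n_pix,) per-pixel flux uncertainty
    templates : (n_templates, n_pix) theoretical spectra on the same wavelength grid
    labels    : (n_templates, n_labels) parameters that generated each template
    """
    chi2 = np.sum(((templates - flux) / sigma) ** 2, axis=1)
    k = int(np.argmin(chi2))
    return labels[k], float(chi2[k])

# Toy grid (hypothetical): one absorption line whose depth is the only label.
wave = np.linspace(4950.0, 5350.0, 200)
depths = np.linspace(0.1, 0.9, 9)
templates = 1.0 - depths[:, None] * np.exp(-0.5 * ((wave - 5150.0) / 5.0) ** 2)
labels = depths[:, None]

# Simulated observation: the fifth template plus Gaussian noise.
rng = np.random.default_rng(0)
sigma = np.full(wave.size, 0.01)
observed = templates[4] + rng.normal(0.0, 0.01, wave.size)
label, chi2 = best_template(observed, sigma, templates, labels)
```

Every observed spectrum is compared against every template, so the cost of this brute-force search grows with the size of the grid rather than with the difficulty of the individual spectrum.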

Figure 1: The parameter estimation performance of RRNet: the performance obviously decreased on spectra with low temperature, high temperature or low metallicity. Since RRNet is proposed for the problem of the parameter estimation for the medium-resolution spectra from LAMOST DR7, this experiment is carried out on 110,500 LAMOST DR7 spectra which have observations in the APOGEE DR17/ASPCAP catalog from common sources. The performance of the parameter estimation is measured by the difference (Mean of Absolute Error, MAE) between the RRNet estimation and the reference result in the APOGEE DR17/ASPCAP catalog. As the trained model and the complete model prediction results are not published by SPCANet, only the performance characteristics of RRNet are analyzed in this experiment.

With the arrival of the era of artificial intelligence and big data, researchers have tried to adopt deep learning methods to estimate stellar parameters from LAMOST medium-resolution spectra. Wang et al. (2020) proposed a residual-like network model (SPCANet). This model consists of three convolutional layers and three fully connected layers, and can accurately predict the stellar parameters and elemental abundances from LAMOST DR7 medium-resolution spectra. Xiong et al. (2022) developed a neural network model (RRNet) by combining several residual modules and some recurrent modules. RRNet further improved the parameter estimation accuracy on the basis of SPCANet. However, these two methods only work effectively on spectra within a restricted parameter range. For example, the parameter estimation accuracy of SPCANet (Wang et al., 2020) on the spectra with $T_{\rm eff} > 6500$ K is significantly lower than that on the spectra with $T_{\rm eff} \in [4000, 6500]$ K. Therefore, SPCANet rejected the estimations for the spectra with $T_{\rm eff} > 6500$ K. RRNet (Xiong et al., 2022) does not perform parameter estimation for spectra with $T_{\rm eff} < 4000$ K or $T_{\rm eff} > 6500$ K. The parameter estimation performance of RRNet is shown in Figure 1: its performance decreases markedly on the spectra with high temperature, low temperature, or low metallicity. This performance variation is closely related to the distribution characteristics of the observed spectra in the parameter space (more discussion can be found in Section 2).

To deal with the above-mentioned problems, this paper proposes a Stellar Parameter Estimation method based on Multiple Regions (SPEMR), designed around the distribution characteristics of LAMOST data in the parameter space. This scheme significantly improves the parameter estimation for spectra with high temperature, low temperature, or low metallicity, in addition to improving the performance on common types of spectra. This paper is organized as follows: Section 2 introduces the medium-resolution stellar spectra in LAMOST DR8, the reference set of this paper, and the scheme for dividing the reference set into different subsets according to the distribution characteristics. Section 3 describes the principle of SPEMR. The results of SPEMR on LAMOST DR8 are investigated in Section 4. Section 5 offers concluding remarks.

2 Data

The SPEMR model proposed in this paper needs a reference set to learn the model parameters and to test model performance. The reference set is established by cross-matching the LAMOST DR8 medium-resolution spectra with the APOGEE DR17/ASPCAP catalog. The reference set consists of a series of samples, and each sample consists of an observed spectrum of an object and its reference label. The reference spectra are obtained from the LAMOST DR8 medium-resolution spectral library, and the reference labels are the stellar physical parameters ($T_{\rm eff}$, $\log\,g$, [Fe/H]) and chemical abundances of 13 elements ([C/H], [N/H], [O/H], [Mg/H], [Al/H], [Si/H], [S/H], [K/H], [Ca/H], [Ti/H], [Cr/H], [Mn/H], [Ni/H]) from the APOGEE DR17/ASPCAP catalog. It is worth noting that the reference sets provided by Wang et al. (2020) and Xiong et al. (2022) were obtained by cross-matching the LAMOST DR7 medium-resolution spectral data with the APOGEE-Payne catalog, whereas the reference set provided in this paper is based on the LAMOST DR8 medium-resolution spectra and the APOGEE DR17/ASPCAP catalog. The APOGEE DR17 catalog provides more reference labels, and these labels have higher accuracy. Therefore, we used the APOGEE DR17 catalog as the source of reference labels. The final reference set is more than twice as large as those of Wang et al. (2020) and Xiong et al. (2022). This bigger reference set helps to build models with better parameter estimation accuracy. The typical characteristic of the reference set is that the data are exceedingly imbalanced in the parameter space (Figure 3). For example, where the effective temperature ($T_{\rm eff}$) is higher than 6500 K or lower than 4000 K, or the metal abundance ([Fe/H]) is lower than -1.0 dex, the reference data are very sparse (Figure 1). This imbalance leads to a significant decrease in the accuracy of the parameter estimation models (e.g. Wang et al., 2020; Xiong et al., 2022). To this end, this paper proposes a novel parameter estimation method based on multiple regions by dividing the parameter space into several sub-regions with different distribution characteristics and accordingly dividing the reference set into three subsets. The establishment of the reference set and its pre-processing procedures are described in the next two subsections.

2.1 Reference dataset based on common observational targets of APOGEE and LAMOST

Figure 2: The distribution histograms of the common sources between APOGEE DR17 catalog and LAMOST DR8 medium-resolution spectral library.
Figure 3: The distribution of the overall reference dataset in the $T_{\rm eff}$ - [Fe/H] space. In this paper, the samples are divided into reference set 1 ($S_1$), reference set 2 ($S_2$) and reference set 3 ($S_3$) according to their distribution characteristics in the parameter space: $S_1 = A_2$, $S_2 = A_1 \cup A_3 \cup A_4 \cup A_6$, $S_3 = A_4 \cup A_5 \cup A_6$.
Figure 4: The distribution of the overall reference dataset in the $T_{\rm eff}$ - $\log\,g$ space.

APOGEE (Majewski et al., 2017) is a medium-high resolution ($R \sim 22{,}500$) spectroscopic survey in the near-infrared band ([15000, 17000] Å). The APOGEE spectra were obtained using the Sloan telescope at Apache Point Observatory in New Mexico, USA. The APOGEE Stellar Parameters and Chemical Abundances Pipeline (ASPCAP; Pérez et al., 2016) obtained stellar parameters and elemental abundances for most of the spectra by comparing the observed spectra with a theoretical spectral library using the chi-square ($\chi^2$) distance. The APOGEE DR17 catalog publishes the stellar parameters ($T_{\rm eff}$, $\log\,g$, [Fe/H]) and 20 elemental abundances for 475,144 stars. The ranges of the stellar atmospheric parameters in the APOGEE DR17 catalog are [3500, 7000] K for $T_{\rm eff}$, [-0.5, 5] dex for $\log\,g$, and [-2.0, 0.5] dex for [Fe/H]. The accuracies of the three parameters are 17 K, 0.03 dex and 0.009 dex, respectively.

In this paper, we used the same method as Wang et al. (2020) and Xiong et al. (2022) to obtain the reference dataset. We cross-matched the LAMOST DR8 medium-resolution spectra with the APOGEE DR17 catalog and obtained 75,316 common observational targets. There are 358,416 observed spectra of these common targets in the LAMOST medium-resolution spectral library. It is worth noting that some LAMOST spectra are affected by cosmic rays and other influences, which result in a large number of outliers (bad pixels) in them. Therefore, the spectra with more than 100 outliers or more than 30 consecutive outliers are rejected. In addition, to ensure the reliability of the dataset, we only kept the spectral data with S/N $\geq 10$ and quality_flag = good. Finally, we obtained 73,773 common observational targets and 310,086 LAMOST DR8 medium-resolution spectra from these targets. This dataset is more than twice as large as the reference sets obtained by Wang et al. (2020) and Xiong et al. (2022). Figure 2 shows the distribution histograms of the common sources between the APOGEE DR17 catalog and the LAMOST DR8 medium-resolution spectral library. The data samples are sparse in the regions where $T_{\rm eff} > 6500$ K, $T_{\rm eff} < 4000$ K, $\log\,g < 2.0$ dex, or [Fe/H] $< -0.5$ dex.
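The rejection rules quoted above can be written as a short filter. The sketch below is only an illustration: how bad pixels are flagged is not described in this section, so the boolean `bad_pixel_mask` is taken as given, and the names `longest_run`, `keep_spectrum` and `quality_good` are hypothetical.

```python
import numpy as np

def longest_run(mask):
    """Length of the longest run of consecutive True values in a boolean array."""
    best = run = 0
    for flag in mask:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best

def keep_spectrum(bad_pixel_mask, snr, quality_good=True,
                  max_bad=100, max_consecutive=30, min_snr=10.0):
    """Return True if a spectrum passes the cuts described in the text."""
    bad = np.asarray(bad_pixel_mask, dtype=bool)
    if int(bad.sum()) > max_bad:                 # more than 100 outliers
        return False
    if longest_run(bad) > max_consecutive:       # more than 30 consecutive outliers
        return False
    return bool(quality_good and snr >= min_snr)  # S/N >= 10 and a good quality flag

# Example: a clean 4000-pixel spectrum with S/N = 25 is kept.
clean = np.zeros(4000, dtype=bool)
accepted = keep_spectrum(clean, snr=25.0)
```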

To accurately predict the parameters for the spectra with high temperature, low temperature, or low metallicity, the obtained reference dataset is divided into three subsets according to the distribution characteristics of the samples in the $T_{\rm eff}$ - [Fe/H] parameter space: reference set 1 ($S_1$), reference set 2 ($S_2$), and reference set 3 ($S_3$). The three reference subsets are defined as shown in Figure 3. Reference set 1 ($S_1$) is used to further improve the parameter estimation accuracy on the types of spectra that are observed most frequently. Reference set 2 ($S_2$) is used to improve the parameter estimation accuracy on the spectra with high temperature or low temperature. Reference set 3 ($S_3$) is used to improve the parameter estimation accuracy on the spectra with low metallicity ([Fe/H]). However, the model established from these three subsets performs unsatisfactorily on the spectra of cool dwarf stars ($T_{\rm eff} < 4500$ K and $\log\,g > 4.0$ dex). Therefore, a fourth reference set, $S_4$, is established (Figure 4). The thresholds $T_{\rm eff} < 5000$ K and $\log\,g > 2.5$ dex were chosen based on experimental performance.

2.2 Data pre-processing

To facilitate the optimization of the machine learning model, the reference spectra should be pre-processed before training the parameter estimation model, for example by wavelength correction, spectral resampling, and spectral normalization. The details of the pre-processing procedure can be found in Xiong et al. (2022) and Wang et al. (2020). After these pre-processing procedures, the spectral data can be directly input into the SPEMR model for estimating the spectral parameters.

3 Stellar Parameter Estimation based on Multiple Regions

In this paper, a novel method, Stellar Parameter Estimation based on Multiple Regions (SPEMR), is proposed based on the distribution characteristics of LAMOST medium-resolution survey spectra in the $T_{\rm eff}$ - [Fe/H] and $T_{\rm eff}$ - $\log\,g$ parameter spaces. Since the parameter estimation for the spectra in each sub-region is implemented separately with the RRNet model, the proposed scheme is referred to more specifically as SPEMR (RRNet) in this paper. The following sections introduce, in turn, the RRNet method, the motivation and principle of the SPEMR model, and the method for obtaining the final parameter estimation result for a spectrum based on SPEMR (RRNet).

3.1 RRNet model

Residual Recurrent Neural Network (RRNet) is a convolutional neural network whose main components are a recurrent learning module and a residual learning module (Xiong et al., 2022). RRNet was proposed for the problem of estimating parameters from LAMOST medium-resolution spectra. Furthermore, compared with StarNet (Fabbro et al., 2018; Bialek et al., 2020) and SPCANet (Wang et al., 2020), RRNet has advantages in accuracy and robustness. Therefore, RRNet is chosen as the backbone network in the SPEMR model.

Compared to high-resolution spectroscopy, it is more challenging to discern some typical spectral line features in medium-resolution and low-resolution spectra. In these cases, it is necessary to design a parameter estimation algorithm with stronger sensitivity and detection capability for weak spectral features. To this end, the RRNet model was proposed. In RRNet, the residual learning module enhances the sensitivity to spectral feature based on the driving power from parameter labels.

The very high spectral acquisition rate is a characteristic of the LAMOST survey, which helps to acquire a large-scale stellar spectral data set in a short period of time. However, an accompanying problem is the large amount of noise in the observed spectra. The recurrent learning module in RRNet achieves cross-band information propagation and belief enhancement by mining the correlation between spectral features in different bands, and it can thereby suppress the negative effects of noise in the spectra. More information about RRNet can be found in Xiong et al. (2022).

3.2 Division of sub-regions and overall learning architecture

Figure 5: The learning principle and prediction principle of the SPEMR model. Part A shows the learning process of the SPEMR model, which consists of two phases: overall pre-training and personalized fine-tuning. Part B shows the flow chart of SPEMR for parameter estimation on the spectra. $S_1$, $S_2$, $S_3$ and $S_4$ denote the reference data used for training (Figure 3 and Figure 4).

A two-stage learning scheme is used in SPEMR to improve the accuracy of parameter estimation both on frequently observed types of spectra and on spectra with low temperature, high temperature, or low metallicity. The two learning stages are an overall pre-training and a personalized fine-tuning (Part A in Figure 5). In the first stage, RRNet is trained on the reference spectra over the entire parameter space to obtain common knowledge of the parameter estimation problem. In the second stage, the four reference subsets $S_1$, $S_2$, $S_3$ and $S_4$ (Figure 3 and Figure 4) are used independently to further optimize the pre-trained model, yielding four models $RRNet_1$, $RRNet_2$, $RRNet_3$ and $RRNet_4$. This fine-tuning allows each model to better handle a specific type of spectral parameter estimation problem. The four fine-tuned models $RRNet_1$, $RRNet_2$, $RRNet_3$ and $RRNet_4$ are fused into the final SPEMR parameter estimation model. More information about the two learning stages of SPEMR is presented below.

3.2.1 Overall pre-training

Table 1: Hyperparameter evaluation. $N_r$ and $N_s$ are two hyperparameters in RRNet. Experimental results are calculated on the validation set using the mean absolute error (MAE; in K for $T_{\rm eff}$ and in dex for the other labels). To reduce the computational cost, the evaluations of $N_s$ = 5, 20, 40 and 60 are conducted with $N_r$ = 3, and the evaluations of $N_r$ = 1, 2, 3 and 4 are conducted with $N_s$ = 5.

| Labels | $N_r$=1 | $N_r$=2 | $N_r$=3 | $N_r$=4 | $N_s$=5 | $N_s$=20 | $N_s$=40 | $N_s$=60 |
|---|---|---|---|---|---|---|---|---|
| $T_{\rm eff}$ | 42.6130 | 40.0473 | 41.7769 | 40.1840 | 41.7769 | 40.3235 | 39.8680 | 40.3373 |
| $\log\,g$ | 0.0807 | 0.0737 | 0.0744 | 0.0725 | 0.0744 | 0.0719 | 0.0718 | 0.0716 |
| [Fe/H] | 0.0286 | 0.0276 | 0.0282 | 0.0281 | 0.0282 | 0.0279 | 0.0278 | 0.0280 |
| [C/H] | 0.0427 | 0.0417 | 0.0413 | 0.0418 | 0.0413 | 0.0417 | 0.0419 | 0.0423 |
| [N/H] | 0.1030 | 0.1023 | 0.1015 | 0.1024 | 0.1015 | 0.1028 | 0.1034 | 0.1040 |
| [O/H] | 0.0554 | 0.0552 | 0.0551 | 0.0552 | 0.0551 | 0.0552 | 0.0554 | 0.0555 |
| [Mg/H] | 0.0327 | 0.0320 | 0.0318 | 0.0321 | 0.0318 | 0.0322 | 0.0323 | 0.0327 |
| [Al/H] | 0.0472 | 0.0462 | 0.0458 | 0.0463 | 0.0458 | 0.0466 | 0.0469 | 0.0473 |
| [Si/H] | 0.0339 | 0.0335 | 0.0331 | 0.0334 | 0.0331 | 0.0337 | 0.0338 | 0.0340 |
| [S/H] | 0.0624 | 0.0622 | 0.0621 | 0.0623 | 0.0621 | 0.0624 | 0.0623 | 0.0624 |
| [K/H] | 0.0638 | 0.0636 | 0.0634 | 0.0638 | 0.0634 | 0.0639 | 0.0640 | 0.0642 |
| [Ca/H] | 0.0376 | 0.0370 | 0.0367 | 0.0369 | 0.0367 | 0.0372 | 0.0372 | 0.0374 |
| [Ti/H] | 0.0919 | 0.0912 | 0.0915 | 0.0916 | 0.0915 | 0.0918 | 0.0921 | 0.0924 |
| [Cr/H] | 0.1086 | 0.1081 | 0.1079 | 0.1076 | 0.1079 | 0.1080 | 0.1082 | 0.1083 |
| [Mn/H] | 0.0440 | 0.0431 | 0.0425 | 0.0427 | 0.0425 | 0.0429 | 0.0428 | 0.0431 |
| [Ni/H] | 0.0357 | 0.0351 | 0.0348 | 0.0349 | 0.0348 | 0.0348 | 0.0349 | 0.0351 |
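For readers who want to reproduce numbers of the kind shown in Table 1, the mean absolute error is computed separately for each label. The helper below is a minimal sketch; the name `mae_per_label` is hypothetical, and skipping missing (NaN) reference labels is an assumption of this example rather than something stated in the text.

```python
import numpy as np

def mae_per_label(pred, ref):
    """Mean absolute error of each label column.

    pred, ref : arrays of shape (n_spectra, n_labels).
    Entries whose reference value is NaN are ignored (an assumption of this sketch).
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return np.nanmean(np.abs(pred - ref), axis=0)

# Two made-up spectra with (T_eff [K], log g [dex]) predictions and references.
pred = [[5000.0, 4.5], [6000.0, 4.0]]
ref = [[5050.0, 4.4], [5900.0, 4.2]]
mae = mae_per_label(pred, ref)   # approximately [75.0, 0.15]
```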

In the pre-training process, we randomly divide the overall reference set (see Section 2.1) into a training set, a validation set and a test set at a ratio of 7:1:2. The three data sets consist of 217,379 spectra from 51,641 stars, 30,821 spectra from 7,377 stars, and 61,886 spectra from 14,755 stars, respectively. The training set is used for learning the pre-trained model parameters, the validation set is used to determine the pre-trained model hyperparameters, and the test set is used to evaluate the performance of the parameter estimation results.
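The star counts quoted above correspond to a 7:1:2 partition of the 73,773 common targets, which suggests that the split is made per star, so that all spectra of one star fall into the same subset. The sketch below illustrates such a grouped split; the function name `split_by_star`, the seed, and the rounding rule are assumptions, not the authors' code.

```python
import numpy as np

def split_by_star(star_ids, ratios=(0.7, 0.1, 0.2), seed=0):
    """Label each spectrum 'train', 'val' or 'test' so that all spectra
    of the same star end up in the same subset."""
    star_ids = np.asarray(star_ids)
    stars = np.unique(star_ids)
    rng = np.random.default_rng(seed)
    rng.shuffle(stars)                       # random order of the stars
    n_train = int(ratios[0] * len(stars))    # truncate towards zero
    n_val = int(ratios[1] * len(stars))
    subset_of = {s: "train" for s in stars[:n_train]}
    subset_of.update({s: "val" for s in stars[n_train:n_train + n_val]})
    subset_of.update({s: "test" for s in stars[n_train + n_val:]})
    return np.array([subset_of[s] for s in star_ids])

# Six spectra of four made-up stars; repeated IDs always share a subset.
subsets_demo = split_by_star([11, 11, 12, 13, 13, 14])
```

With 73,773 stars, this truncation gives 51,641, 7,377 and 14,755 stars, matching the counts quoted above. Splitting per star also prevents repeated observations of one object from appearing in both the training and test sets.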

To accurately estimate the probability density function (PDF; Bialek et al., 2020) of the estimated stellar parameters, six instances of the model are trained with different random initializations. The mean $\hat{\mu}(\mathbf{X})$ of the ensemble is the average of the predicted means of these six models. The variance $\hat{\sigma}^{2}_{pred}(\mathbf{X})$ of the ensemble is determined by the following equation:

$$\hat{\sigma}^{2}_{pred}(\mathbf{X}) = \frac{1}{6}\sum_{i=1}^{6}\left(\sigma^{2}_{\theta_{i}}(\mathbf{X}) + \mu^{2}_{\theta_{i}}(\mathbf{X})\right) - \hat{\mu}^{2}(\mathbf{X}), \qquad (1)$$

where $\theta_{i}$ denotes the parameters to be optimized for the $i$-th model, and $\mu_{\theta_{i}}(\mathbf{X})$ and $\sigma^{2}_{\theta_{i}}(\mathbf{X})$ are the mean and variance of the prediction of the $i$-th model, respectively.
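Equation (1) can be evaluated directly from the per-model outputs. The following sketch assumes that each ensemble member returns a mean and a variance for every label; the function name `ensemble_moments` and the numbers in the example are made up for illustration.

```python
import numpy as np

def ensemble_moments(means, variances):
    """Combine per-model predictions following Eq. (1).

    means, variances : arrays of shape (n_models, n_labels)
    Returns the ensemble mean and predictive variance, each of shape (n_labels,).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    mu_hat = means.mean(axis=0)
    var_hat = (variances + means ** 2).mean(axis=0) - mu_hat ** 2
    return mu_hat, var_hat

# Six hypothetical T_eff predictions (K) for one spectrum, each with variance 400 K^2.
means = np.array([[5000.0], [5010.0], [4990.0], [5005.0], [4995.0], [5000.0]])
variances = np.full((6, 1), 400.0)
mu_hat, var_hat = ensemble_moments(means, variances)
```

The result equals the average of the per-model variances plus the spread of the per-model means around $\hat{\mu}$, so disagreement between ensemble members increases the reported uncertainty.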

In the RRNet model, a spectrum is divided into $N_s$ sub-bands. The correlation and complementarity of spectral information between the various bands are learned through the recurrent module. There is another hyperparameter, $N_r$, in the RRNet model, which indicates the number of residual blocks. These two hyperparameters have an impact on the performance of the RRNet model. Therefore, we optimized them using the validation set. In this paper, some experimental explorations are conducted on the configurations $N_r$ = 1, 2, 3, 4 and $N_s$ = 5, 20, 40, 60 (Table 1). The pre-trained model has the smallest error on the whole for $N_r$ = 3 and $N_s$ = 5. Therefore, $N_r$ is set to 3 and $N_s$ is set to 5 in the subsequent experiments. In addition, the number of training iterations and the learning rate are 30 and $10^{-4}$, which are consistent with RRNet (Xiong et al., 2022).
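The text does not define "smallest error on the whole" more precisely. One simple reading, assumed here purely for illustration, is to count for how many of the 16 labels each setting attains the lowest validation MAE. The sketch applies this count to the $N_r$ columns of Table 1.

```python
import numpy as np

n_r = [1, 2, 3, 4]
# Validation MAE from the N_r columns of Table 1 (one row per label, table order).
mae = np.array([
    [42.6130, 40.0473, 41.7769, 40.1840],  # T_eff
    [0.0807, 0.0737, 0.0744, 0.0725],      # log g
    [0.0286, 0.0276, 0.0282, 0.0281],      # [Fe/H]
    [0.0427, 0.0417, 0.0413, 0.0418],      # [C/H]
    [0.1030, 0.1023, 0.1015, 0.1024],      # [N/H]
    [0.0554, 0.0552, 0.0551, 0.0552],      # [O/H]
    [0.0327, 0.0320, 0.0318, 0.0321],      # [Mg/H]
    [0.0472, 0.0462, 0.0458, 0.0463],      # [Al/H]
    [0.0339, 0.0335, 0.0331, 0.0334],      # [Si/H]
    [0.0624, 0.0622, 0.0621, 0.0623],      # [S/H]
    [0.0638, 0.0636, 0.0634, 0.0638],      # [K/H]
    [0.0376, 0.0370, 0.0367, 0.0369],      # [Ca/H]
    [0.0919, 0.0912, 0.0915, 0.0916],      # [Ti/H]
    [0.1086, 0.1081, 0.1079, 0.1076],      # [Cr/H]
    [0.0440, 0.0431, 0.0425, 0.0427],      # [Mn/H]
    [0.0357, 0.0351, 0.0348, 0.0349],      # [Ni/H]
])

# Number of labels for which each N_r gives the lowest MAE.
wins = np.bincount(mae.argmin(axis=1), minlength=len(n_r))
best_n_r = n_r[int(wins.argmax())]
```

Under this counting rule, $N_r$ = 3 is best for 11 of the 16 labels, although $T_{\rm eff}$, $\log\,g$ and [Fe/H] individually have slightly lower errors at other settings.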

3.2.2 Personalized fine-tuning

Although the Base RRNet obtained in the pre-training stage already has some parameter estimation capability over the whole parameter space, the non-uniformity of the sample distribution (Fig. 3 and Fig. 4) leaves significant room for improvement in each sub-region (Fig. 1). Therefore, we fine-tuned the model separately for each sub-region. Specifically, the spectra with $T_{\rm eff}\in[4000, 6500]$ K and $\rm[Fe/H]\geq-1.0$ dex in the training set are used as training set 1 ($S_1$ in Fig. 3) to fine-tune the Base RRNet, and the corresponding parameter estimation model $RRNet_1$ is obtained. Furthermore, the spectra with $T_{\rm eff}<4000$ K or $T_{\rm eff}>6500$ K in the training set are used as training set 2 ($S_2$ in Fig. 3) to fine-tune the Base RRNet, and the corresponding parameter estimation model $RRNet_2$ is obtained. The spectra with $\rm[Fe/H]<-1.0$ dex in the training set are used as training set 3 ($S_3$ in Fig. 3) to fine-tune the Base RRNet, and the corresponding parameter estimation model $RRNet_3$ is obtained. Finally, the spectra with $T_{\rm eff}<5000$ K and $\log\,g>2.5$ dex in the training set are used as training set 4 ($S_4$ in Fig. 4) to fine-tune the Base RRNet, and the corresponding parameter estimation model $RRNet_4$ is obtained.
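The four selection rules can be written down as a small membership function. The following is only a sketch under stated assumptions: the function name `training_regions` and the example labels are hypothetical, and the interval bounds are taken as inclusive as written in the text. It also makes explicit that the sets are not disjoint, since a spectrum that satisfies several conditions belongs to several training sets.

```python
def training_regions(teff, logg, feh):
    """Return the fine-tuning sets (S1..S4) that a labelled training spectrum falls in.

    teff is in K, logg and feh are in dex. The thresholds follow the text;
    the function name and signature are illustrative only.
    """
    regions = []
    if 4000.0 <= teff <= 6500.0 and feh >= -1.0:
        regions.append("S1")  # frequently observed region
    if teff < 4000.0 or teff > 6500.0:
        regions.append("S2")  # low- or high-temperature spectra
    if feh < -1.0:
        regions.append("S3")  # low-metallicity spectra
    if teff < 5000.0 and logg > 2.5:
        regions.append("S4")  # region marked S4 in Fig. 4
    return regions


# Hypothetical (Teff [K], log g [dex], [Fe/H] [dex]) reference labels.
examples = [(5800.0, 4.4, 0.0), (4500.0, 4.6, -0.2), (3900.0, 4.8, -1.5)]
memberships = [training_regions(*labels) for labels in examples]
```

With these definitions every labelled spectrum falls in at least one set: a spectrum with [Fe/H] below -1.0 dex is in $S_3$, and any other spectrum is in either $S_1$ or $S_2$ depending on its temperature.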

When fine-tuning each sub-model, we kept the parameters of the convolutional layers unchanged and re-optimized only the parameters of the fully connected layers. In this way, the sub-model has a relatively strong spectral feature extraction ability from the beginning and can converge earlier. In addition, the number of training iterations and the learning rate for each sub-model are set to 10 and $10^{-5}$, respectively, which further accelerates the convergence of the sub-models. After fine-tuning, we obtained four sub-models: $RRNet_1$, $RRNet_2$, $RRNet_3$ and $RRNet_4$. $RRNet_1$ is used to predict the stellar parameters for the types of stellar spectra that are observed most frequently. $RRNet_2$ is used to predict the stellar parameters for the spectra with high temperature or low temperature. $RRNet_3$ is used to predict the stellar parameters of spectra with low metallicity. $RRNet_4$ is used to improve the parameter estimates for the cool-end spectra of dwarf stars.
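The idea of keeping the feature extractor fixed and re-optimizing only the final layers can be illustrated with a toy model. The sketch below is not the RRNet architecture: `frozen_features` is a hypothetical stand-in for the frozen convolutional layers, the "head" is a single linear layer, and the data are synthetic. Only the iteration count (10) and the learning rate ($10^{-5}$) are taken from the text.

```python
def frozen_features(x):
    """Stand-in for the frozen convolutional layers: a fixed map that is never updated."""
    return [x, x * x]


def head_loss(weights, data):
    """Mean squared error of the linear head on top of the frozen features."""
    total = 0.0
    for x, y in data:
        pred = sum(w * f for w, f in zip(weights, frozen_features(x)))
        total += (pred - y) ** 2
    return total / len(data)


def fine_tune_head(weights, data, learning_rate=1e-5, iterations=10):
    """Run plain gradient descent on the head weights only."""
    weights = list(weights)  # copy, so the pre-trained weights stay intact
    for _ in range(iterations):
        grads = [0.0] * len(weights)
        for x, y in data:
            feats = frozen_features(x)
            err = sum(w * f for w, f in zip(weights, feats)) - y
            for k, f in enumerate(feats):
                grads[k] += 2.0 * err * f / len(data)
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
    return weights


# Hypothetical sub-region data and hypothetical pre-trained head weights.
region_data = [(k / 10.0, 3.0 * (k / 10.0) + 0.5 * (k / 10.0) ** 2) for k in range(-10, 11)]
base_head = [2.0, 0.0]
tuned_head = fine_tune_head(base_head, region_data)
```

Because the small learning rate and short schedule move the head only slightly away from its pre-trained values, each sub-model stays close to the Base RRNet while adapting to its own sub-region.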

3.3 Integration of the estimated results from four sub-models

In practical applications, the sub-region of the parameter space to which a spectrum belongs is unknown before its parameters are estimated. It is therefore impossible to decide beforehand which model should be used to predict the spectral parameters. To solve this problem, we propose a multi-label fusion strategy (Part B in Figure 5). With this strategy, we input a spectrum $\mathbf{X}\in R^{1\times 7200}$ into each $RRNet_i$, $i\in\{1,2,3,4\}$. The outputs of $RRNet_i$ are $\mu_i(\mathbf{X})\in R^{1\times 16}$ and $\sigma_i^2(\mathbf{X})\in R^{1\times 16}$. Here $\mu_{ij}(\mathbf{X})$ is the estimate of the $j$-th spectral parameter from $RRNet_i$, and $\sigma_{ij}^2(\mathbf{X})$ is the uncertainty estimate of $\mu_{ij}(\mathbf{X})$. The final prediction of the SPEMR model, $\hat{\mu}(\mathbf{X})=(\hat{\mu}_1(\mathbf{X}),\cdots,\hat{\mu}_{16}(\mathbf{X}))$, and its uncertainty estimate, $\hat{\sigma}^2(\mathbf{X})=(\hat{\sigma}^2_1(\mathbf{X}),\cdots,\hat{\sigma}^2_{16}(\mathbf{X}))$, are obtained by fusing $\{\mu_{ij}(\mathbf{X}), i=1,2,3,4\}$ and $\{\sigma_{ij}^2(\mathbf{X}), i=1,2,3,4\}$. The fusion formulas are as follows:

$$\hat{\mu}_j(\mathbf{X}) = \mu_{i(j)_0,\,j}(\mathbf{X}), \tag{2}$$

$$\hat{\sigma}^2_j(\mathbf{X}) = \sigma^2_{i(j)_0,\,j}(\mathbf{X}), \tag{3}$$

where $i(j)_0=\arg\min\limits_{i=1,2,3,4}\sigma_{ij}^2(\mathbf{X})$ and $j=1,\cdots,16$. That is to say, $i(j)_0$ denotes the index of the model with the smallest prediction uncertainty $\sigma_{ij}^2(\mathbf{X})$ for the $j$-th parameter. Therefore, the model fusion schemes (2) and (3) adopt, for each parameter, the prediction with the smallest uncertainty as the final fusion result.
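Equations (2) and (3) amount to a per-parameter argmin over the four sub-models. A minimal sketch follows, assuming the sub-model outputs are available as nested lists; the function name `fuse_predictions` and the toy numbers are hypothetical, and only three parameters are shown instead of sixteen.

```python
def fuse_predictions(mus, variances):
    """Per-parameter selection of the sub-model with the smallest predicted variance.

    mus[i][j] and variances[i][j] hold the estimate and the predicted variance of
    parameter j from sub-model i (0-based here, 1-based in equations (2) and (3)).
    Ties are resolved in favour of the lowest model index, which is an assumption.
    """
    n_models = len(mus)
    n_params = len(mus[0])
    fused_mu, fused_var, chosen = [], [], []
    for j in range(n_params):
        i0 = min(range(n_models), key=lambda i: variances[i][j])
        chosen.append(i0)
        fused_mu.append(mus[i0][j])
        fused_var.append(variances[i0][j])
    return fused_mu, fused_var, chosen


# Toy outputs of four sub-models for (Teff, log g, [Fe/H]); the values are made up.
toy_mus = [
    [5000.0, 4.40, -0.10],
    [5100.0, 4.30, -0.20],
    [4950.0, 4.50, -0.05],
    [5020.0, 4.45, -0.12],
]
toy_vars = [
    [900.0, 0.010, 0.004],
    [2500.0, 0.020, 0.001],
    [1600.0, 0.005, 0.009],
    [400.0, 0.030, 0.002],
]
fused_mu, fused_var, chosen = fuse_predictions(toy_mus, toy_vars)
```

Note that the selection is made independently for each parameter, so the fused vector may combine estimates from different sub-models for the same spectrum.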

3.4 Testing of the SPEMR model

Figure 6: Performance evaluation of stellar atmospheric parameters ($T_{\rm eff}$, $\log\,g$, [Fe/H]) estimated by SPEMR on spectra with different S/N. The S/N intervals are $\rm 10\leq S/N<20$, $\rm 20\leq S/N<40$, $\rm 40\leq S/N<60$ and $\rm 60\leq S/N<100$, respectively. The vertical axis presents the distribution of differences between SPEMR predictions and ASPCAP results on the test set.
Figure 7: Distribution of the residuals between the abundance of 13 elements estimated by SPEMR and the ASPCAP results on the test set. The color indicates the distribution density of the samples.

After an overall pre-training and four subsequent, independent personalized fine-tunings of the RRNet (Section 2.1), four sub-models are obtained: $RRNet_1$, $RRNet_2$, $RRNet_3$ and $RRNet_4$. Based on them, we obtain the proposed SPEMR model (Section 3.2) using the multi-label fusion strategy (Section 3.3). In this section, we evaluate the performance of the SPEMR model by comparing the SPEMR estimations with the ASPCAP labels on the test set. Thus, any comparison here is not affected by biases in the ASPCAP results themselves. More comprehensive evaluations are conducted in Section 4.

Figure 6 shows the distribution of the differences between the stellar atmospheric parameters predicted by SPEMR and the ASPCAP results. The deviations of the SPEMR predictions from the ASPCAP labels are small on spectra at both low and high S/N levels. This indicates that the SPEMR model can effectively suppress the effects of noise on spectra with a low signal-to-noise ratio. For effective temperature, the residual is smallest on the spectra with $T_{\rm eff}\in[4500,5000]$ K, mainly because of the large number of training samples in this region and their good quality. In the second row of Figure 6, we can see a slight underestimation of $\log\,g$ by SPEMR for $\log\,g>4$ dex. This is consistent with the estimations of Wang et al. (2020) and Xiong et al. (2022). It is mainly due to the scarcity of training examples in this region of the parameter space (as shown in Fig. 4), which increases the prediction error for the dwarfs ($\log\,g>4$). For metallicity, the best prediction results are obtained for the spectra with [Fe/H] $\in[-0.5, 0.5]$ dex, which is also due to the larger number of training samples in this region. The above phenomena suggest that the ASPCAP labels provide excellent learning benchmarks for the stellar atmospheric parameters estimated by the SPEMR model.

To further evaluate the other parameters estimated by the SPEMR model, we investigated the differences between the SPEMR estimations and the ASPCAP results for the elemental abundances on the test set. Figure 7 shows the distribution of differences between the abundances of 13 elements predicted by the SPEMR model and the ASPCAP labels on the test set. For most elements, the mean residual and the dispersion of the SPEMR estimates are around 0.005 dex and 0.07 dex, respectively. These low values indicate good accuracy and precision of the SPEMR model. However, for the elements N, Ti and Cr, the residual and dispersion are slightly higher. Therefore, the accuracy of the SPEMR model on the elements N, Ti and Cr should be further improved.

3.5 Best Fitting Template

To further explore the performance of the SPEMR model, we investigated several representative LAMOST spectra in the test set and their corresponding best-fit templates. These test spectra are selected based on their representativeness in parameter space and their spectral quality. This study can increase the physical interpretability of the model and allows the reader to observe the fit of the model more intuitively. In this paper, the best-fit template of a test spectrum is the training spectrum at the smallest Euclidean distance from it in stellar parameter space, using the parameters estimated by the SPEMR model for the test spectrum and the reference parameters for the training spectra. The corresponding results are shown in Figure 8. The eight representative LAMOST spectra from top to bottom in Fig. 8 are a low-temperature spectrum ($T_{\rm eff}$ = 3951.45 K), a high-temperature spectrum ($T_{\rm eff}$ = 6577.52 K), a dwarf spectrum ($\log\,g$ = 4.42 dex), a giant spectrum ($\log\,g$ = 2.71 dex), a metal-poor spectrum ([Fe/H] = -1.20 dex), a metal-rich spectrum ([Fe/H] = 0.00 dex), a low signal-to-noise ratio spectrum (S/N = 17.32), and a high signal-to-noise ratio spectrum (S/N = 117.32). The SPEMR model fits well on the low-temperature, dwarf, giant, metal-rich, and high signal-to-noise ratio spectra; that is, the residuals between the test spectrum and the best-fit template are almost zero over the whole wavelength range. However, for the high-temperature spectrum and the low-S/N spectrum, the fit is not as good as for the above spectra, especially at the blue end of the wavelength range. This may be related to the sparse distribution of training samples of this kind (Fig. 3).

Figure 8: Several representative LAMOST spectra with various configurations on parameters and their corresponding best fitting templates. The black line indicates a test spectrum, the red line indicates its best-fit template, and the yellow line indicates the residual between them. The best fit template is determined based on the Euclidean distance between the test spectrum and each training spectrum in stellar parameter space (the parameters of the test spectrum are estimated using the SPEMR model). This figure only shows three stellar atmospheric parameters for each spectrum due to space limitations.

4 APPLICATION ON LAMOST DR8

In this section, we applied the SPEMR proposed in section 3 to the LAMOST DR8 medium-resolution spectra to obtain a LAMOST-SPEMR catalog. To assess the reliability of the LAMOST-SPEMR catalog, we compared it with other typical catalogs, performed an uncertainty analysis, and tested it on open clusters.

4.1 Parameter estimation for the medium resolution spectra from LAMOST DR8

Figure 9: Distribution of stellar parameters from LAMOST-SPEMR catalog. The color in this figure represents $\rm[Fe/H]$, and the three isochrones represent the evolutionary tracks of MIST stars with stellar ages of 7 Gyr ($\rm[Fe/H]$ = -0.5 (cyan), 0.0 (yellow), and 0.5 (red), respectively). The S/N intervals are $\rm 10\leq S/N<20$, $\rm 20\leq S/N<40$, $\rm 40\leq S/N<60$ and $\rm 60\leq S/N<100$, respectively.
Figure 10: The parameter estimation results of SPEMR: the performance on the spectra with low temperature, high temperature, low metallicity, and high-frequency-observed-type spectra are improved to various degrees over the RRNet model (Fig. 1).

The four sub-models learned from reference datasets 1, 2, 3 and 4 ($S_1$, $S_2$, $S_3$ in Fig. 3 and $S_4$ in Fig. 4) can perform parameter estimation for spectra in different regions of the parameter space (Section 3.2). They estimate stellar parameters for four types of spectra, respectively: the spectra with $T_{\rm eff}\in[4000,6500]$ K and [Fe/H] $\geq-1.0$ dex, the spectra with $T_{\rm eff}$ > 6500 K or $T_{\rm eff}$ < 4000 K, the spectra with [Fe/H] < -1.0 dex, and the spectra with $T_{\rm eff}$ < 5000 K and $\log\,g$ > 2.5 dex. The results of the four RRNet sub-models are fused using the SPEMR scheme (Section 3.3) to obtain the SPEMR (RRNet) parameter estimation, from which the LAMOST-SPEMR catalog is obtained. This catalog contains the stellar atmospheric parameters, chemical abundances, and corresponding $1\sigma$ uncertainties estimated by SPEMR for 4,197,960 medium-resolution spectra in LAMOST DR8.

Figure 9 shows the $T_{\rm eff}-\log\,g$ distribution of the LAMOST-SPEMR catalog in different S/N intervals. The three isochrones in the figure are the MIST stellar evolutionary tracks with a stellar age of 7 Gyr (Dotter, 2016; Choi et al., 2016). Compared with Figure 9 in Wang et al. (2020), the stellar parameters estimated by SPEMR in the high-temperature region fit the three MIST stellar evolutionary tracks better. In the low-temperature region, the SPEMR and SPCANet estimates show a similar pattern, with an underestimation of $\log\,g$ at the cool end of the main sequence ($T_{\rm eff}\in[4000,4500]$ K). For the spectra with low metallicity, SPEMR can also effectively estimate the stellar parameters. Compared with Fig. 5 in Xiong et al. (2022), RRNet lacks estimation results for the high-temperature spectra, while the proposed SPEMR effectively estimates the stellar parameters from this kind of spectra, and the estimation results generally agree with the MIST stellar evolutionary tracks.

To evaluate the validity of SPEMR, we estimated the parameters for the spectra in the reference set, and the results are shown in Fig. 10. Compared with the estimation results of RRNet (Fig. 1), the performance of our model is improved to different degrees for the spectra with low temperature, high temperature, and low metallicity, and for the high-frequency-observed-type spectra. Specifically, obvious improvements are found on the spectra with low temperature and the spectra with low metallicity. However, no evident improvement is found on the spectra with high temperature, which is caused by the small number of high-temperature spectra in the training data. In addition, SPEMR improves the estimation results on high-frequency-observed-type spectra for parameters such as $T_{\rm eff}$, $\log\,g$, Fe, Si, etc.

4.2 Some comparisons with other typical catalogs

Table 2: Some comparisons between the LAMOST-SPEMR catalog and several typical catalogs. The '⋯' indicates that the estimate for a stellar parameter is not given in the corresponding catalog.

Labels          SPEMR-ASPCAP                        SPCANet-ASPCAP                      SPEMR-GALAH
                $\mu_r$     $\sigma_r$  MAE         $\mu_r$     $\sigma_r$  MAE         $\mu_r$     $\sigma_r$
$T_{\rm eff}$   -5.28       57.02       33.51       -30.65      118.05      94.96       15.03       243.65
$\log\,g$       -0.020      0.084       0.060       0.024       0.167       0.118       0.097       0.255
[Fe/H]          -0.005      0.038       0.026       -0.029      0.075       0.059       0.027       0.117
[C/H]           -0.009      0.063       0.045       -0.115      0.119       0.138       -0.037      0.154
[N/H]           -0.001      0.172       0.099       -0.029      0.213       0.139       ⋯           ⋯
[O/H]           -0.005      0.083       0.054       -0.035      0.129       0.103       -0.020      0.214
[Mg/H]          -0.005      0.046       0.033       -0.027      0.095       0.075       0.019       0.133
[Al/H]          -0.009      0.069       0.049       -0.051      0.157       0.124       0.025       0.150
[Si/H]          -0.004      0.048       0.034       -0.018      0.110       0.088       0.032       0.110
[S/H]           -0.005      0.089       0.060       -0.019      0.116       0.081       ⋯           ⋯
[K/H]           -0.006      0.099       0.064       ⋯           ⋯           ⋯           -0.014      0.258
[Ca/H]          -0.005      0.052       0.035       -0.026      0.085       0.070       -0.042      0.146
[Ti/H]          -0.009      0.148       0.090       0.026       0.190       0.136       -0.158      0.211
[Cr/H]          -0.005      0.187       0.106       0.077       0.210       0.114       -0.098      0.168
[Mn/H]          -0.005      0.063       0.042       ⋯           ⋯           ⋯           0.017       0.149
[Ni/H]          -0.005      0.050       0.036       -0.034      0.084       0.064       0.025       0.141

NOTE: $\mu_r$ and $\sigma_r$ are the mean and standard deviation of the difference between two catalogs, respectively. Values are in K for $T_{\rm eff}$ and in dex for all other labels.
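The statistics reported in Table 2 can be computed from paired estimates as sketched below. This is an illustration under assumptions: the function name `residual_stats` is hypothetical, MAE is taken to be the mean absolute value of the differences (the note above does not define it), and the population form of the standard deviation is used because the text does not say which form was adopted.

```python
import statistics


def residual_stats(estimates, references):
    """Return (mu_r, sigma_r, MAE) of the differences estimates - references."""
    residuals = [e - r for e, r in zip(estimates, references)]
    mu_r = statistics.fmean(residuals)        # mean difference (bias)
    sigma_r = statistics.pstdev(residuals)    # dispersion (population form, an assumption)
    mae = statistics.fmean(abs(d) for d in residuals)  # assumed definition of MAE
    return mu_r, sigma_r, mae


# Toy paired values, not taken from any catalog.
toy_mu_r, toy_sigma_r, toy_mae = residual_stats([1.0, 2.0, 4.0], [1.0, 1.0, 2.0])
```

For large cross-matched samples such as those used here, the population and sample forms of the standard deviation differ only negligibly.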

Figure 11: Distribution of stellar parameters from GALAH catalog (the first row) and LAMOST-SPEMR catalog (the second row). The color in this figure represents $\rm[Fe/H]$, and the three isochrones represent the evolutionary tracks of MIST stars with stellar ages of 7 Gyr ($\rm[Fe/H]$ = -0.5 (cyan), 0.0 (yellow), and 0.5 (red), respectively). The samples in the figure are the 110,042 spectra obtained by cross-matching the LAMOST-SPEMR catalog with the GALAH catalog. The S/N intervals are $\rm 10\leq S/N<20$, $\rm 20\leq S/N<40$, $\rm 40\leq S/N<60$ and $\rm 60\leq S/N<100$, respectively.
Figure 12: Comparison of the chemical abundances in the LAMOST-SPEMR catalog with the GALAH survey results. The dotted lines are theoretical reference lines.
Figure 13: Comparison of the ASPCAP estimations with the GALAH survey results on the common sources between APOGEE DR17, the LAMOST DR8 medium-resolution spectral library and the GALAH catalog. The dotted lines are theoretical reference lines.
Figure 14: Comparison of the distribution density of dwarf ($\log\,g$ > 4) elemental abundances [X/Fe]-[Fe/H] for the LAMOST-SPEMR catalog and GALAH catalog. The two left columns are the estimations for the LAMOST-SPEMR catalog, and the two right columns are the results for the GALAH catalog. The color indicates the density of the sample distribution. Actually, the corresponding samples in the two left columns are from the training set, validation set and test set.
Figure 15: Comparison of the distribution density of giant ($\log\,g$ < 4) elemental abundances [X/Fe]-[Fe/H] for the LAMOST-SPEMR catalog and GALAH catalog. The two left columns are the estimations for the LAMOST-SPEMR catalog, and the two right columns are the results for the GALAH catalog. The color indicates the density of the sample distribution. Actually, the corresponding samples in the two left columns are from the training set, validation set and test set.

To further verify the accuracy of the LAMOST-SPEMR catalog, we investigated its consistency with the SPCANet catalog and the GALAH DR3 catalog.

Wang et al. (2020) used the SPCANet model to estimate the stellar atmospheric parameters and chemical abundances for 1,472,211 medium-resolution spectra from LAMOST DR7; these results are referred to as the SPCANet catalog. To compare the LAMOST-SPEMR catalog with the SPCANet catalog, we cross-matched the reference set in this paper with the SPCANet catalog and obtained 241,033 spectra from 53,775 common stars. It should be noted that, for each of the 241,033 spectra, stellar parameter and chemical abundance estimations are available simultaneously from the SPCANet model, the SPEMR model, and the APOGEE ASPCAP pipeline.

GALAH (De Silva et al., 2015) is a large-scale high-resolution (R$\sim$28000) spectroscopic survey, which uses the Anglo-Australian Telescope and the HERMES spectrograph at the Australian Astronomical Observatory to observe stellar spectra. The GALAH spectral coverage is [4713, 4903] Å, [5648, 5873] Å, [6478, 6737] Å and [7585, 7887] Å. GALAH DR3 (Buder et al., 2021) published stellar atmospheric parameters and elemental abundances for 588,571 stars. Among them, there are 383,088 dwarf stars, 200,927 giant stars, and 4,556 unclassified stars. We cross-matched the LAMOST-SPEMR catalog with the GALAH DR3 catalog and obtained 110,042 LAMOST DR8 spectra from 25,519 common stars.

Compared with the SPCANet catalog (Table 2 (SPEMR-ASPCAP, SPCANet-ASPCAP)), LAMOST-SPEMR estimates two more parameters, [K/H] and [Mn/H], and reduces the overall bias, dispersion, and MAE by 30%, 50%, and 52%, respectively, on most of the stellar parameters. This indicates that SPEMR has better estimation performance than SPCANet. However, for the elements S, Ti and Cr, the precision improvement of the SPEMR model is not significant. This may be caused by the lack of stronger metal lines in the blue part of the LAMOST spectra.
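As a worked illustration of how such reductions can be read off Table 2, the sketch below computes the per-label relative reduction of each statistic for three rows. The numbers are copied from Table 2; the function name `relative_reduction`, the use of absolute values for the bias, and the choice of rows are assumptions. The text does not state how the overall figures of 30%, 50% and 52% were aggregated over the labels, so this sketch does not attempt to reproduce them.

```python
def relative_reduction(spemr_value, spcanet_value):
    """Fractional reduction of |statistic| when moving from SPCANet to SPEMR."""
    return 1.0 - abs(spemr_value) / abs(spcanet_value)


# label: ((mu_r, sigma_r, MAE) for SPEMR-ASPCAP, (mu_r, sigma_r, MAE) for SPCANet-ASPCAP)
TABLE2_ROWS = {
    "Teff": ((-5.28, 57.02, 33.51), (-30.65, 118.05, 94.96)),
    "logg": ((-0.020, 0.084, 0.060), (0.024, 0.167, 0.118)),
    "[Fe/H]": ((-0.005, 0.038, 0.026), (-0.029, 0.075, 0.059)),
}

# For each label: (reduction in |mu_r|, reduction in sigma_r, reduction in MAE).
reductions = {
    label: tuple(relative_reduction(a, b) for a, b in zip(spemr, spcanet))
    for label, (spemr, spcanet) in TABLE2_ROWS.items()
}
```

For example, the dispersion of $T_{\rm eff}$ drops from 118.05 K to 57.02 K, a reduction of roughly one half.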

To further investigate the consistency between the LAMOST-SPEMR catalog and the GALAH DR3 catalog, we evaluated the differences between the SPEMR and GALAH results. Figure 11 shows the $T_{\rm eff}-\log\,g$ distributions of the GALAH catalog and the LAMOST-SPEMR catalog, colored by [Fe/H], in different S/N intervals. Compared with the GALAH catalog, the LAMOST-SPEMR catalog shows a larger dispersion for giant stars with effective temperatures around 5000 K, especially in the region of low metallicity ([Fe/H] < -0.5 dex). This is mainly due to the scarcity of reference samples with ($T_{\rm eff}$, $\log\,g$) $\sim$ (5000 K, 2.5 dex) (Figure 4), as well as the relatively sparse spectral samples with [Fe/H] < -0.5 dex (Figure 2 and Figure 3). In addition, how well the LAMOST-SPEMR catalog follows the MIST stellar evolution tracks correlates closely with the signal-to-noise ratio (SNR). At higher SNR, the LAMOST-SPEMR catalog is more consistent with the MIST stellar evolution tracks, and the corresponding dispersion and bias are smaller; on the low-SNR spectra, the dispersion and bias are larger. This is mainly due to the poor quality of the low-SNR spectra, and it indicates that it is necessary to investigate a more robust estimation method and to establish an expanded reference set with better coverage of the stellar parameter space. It is worth mentioning that Li et al. (2022) have specifically studied the parameter estimation problem for LAMOST stellar spectra with low SNR and low resolution. Therefore, future research can also pay special attention to the low-SNR LAMOST medium-resolution spectra to improve the overall parameter estimation accuracy of the model.

Figure 12 compares the chemical abundances in the LAMOST-SPEMR catalog with those in the GALAH catalog. The detailed biases and standard deviations are listed in Table 2 (SPEMR-GALAH). The standard deviations of the differences between the LAMOST-SPEMR catalog and the GALAH catalog range from 0.13 dex to 0.15 dex for the abundances of the elements C, Mg, Al, Si, Ca, Mn, and Ni, and the overall differences are distributed around the theoretical line with little dispersion. For the elements O, K, Ti, and Cr, the differences have a relatively larger dispersion, around 0.20 dex, and the estimations of Ti and Cr show a relatively evident deviation from the theoretical line. To further explore the source of the discrepancy between the LAMOST-SPEMR catalog and the GALAH catalog, we compared the ASPCAP reference values with those of the GALAH catalog for the above elemental abundances (Figure 13). The trend of the difference between the ASPCAP catalog and the GALAH catalog is basically consistent with that of the LAMOST-SPEMR catalog. This indicates that the deviation and dispersion of the difference between the LAMOST-SPEMR results and the GALAH survey largely originate from the difference between the ASPCAP catalog and the GALAH catalog.

To further evaluate the performance of the SPEMR model, we show the [X/Fe] vs. [Fe/H] distributions of all elements estimated by SPEMR for giants and dwarfs and compare them with the GALAH results for the same stars. Figure 14 shows the [X/Fe]-[Fe/H] distributions of the dwarfs ($\log\,g>4$ dex) for all elements in the LAMOST-SPEMR and GALAH DR3 catalogs. Figure 15 shows the corresponding distributions of the giants ($\log\,g<4$ dex). Comparing the two left columns with the two right columns in Fig. 14 and Fig. 15, we can clearly see that the elemental abundance patterns of the LAMOST-SPEMR catalog are tighter than those of the GALAH DR3 catalog for both the dwarfs and the giants. For the dwarfs, the elemental abundances of the LAMOST-SPEMR catalog are concentrated at intermediate metallicities ([Fe/H] $\in$ [-0.2, 0.3] dex). For the giants, the distribution of elemental abundances is much wider, and most of them are concentrated in [Fe/H] $\in$ [-0.6, 0.4] dex. This is largely consistent with the distribution of the GALAH DR3 catalog. Comparing the two left columns of Fig. 14 and Fig. 15, we find that the distributions of giants and dwarfs are inconsistent for most elements estimated by SPEMR. For example, the elements Mg, Al, Ti, and Cr of the giants show a denser distribution for [Fe/H] from -0.8 dex to 0.3 dex, whereas this dense pattern is not present in the dwarfs. This is mainly due to the scarcity of main-sequence dwarfs in our training samples. This sample imbalance leads to the fact that most of the labels predicted by SPEMR are around red giants. Only the elements Cr and Ti of the dwarfs show distinct bimodal structures, while most elements of the giants show distinct bimodal structures. In addition, for the giants, the elements O, S and K show clear negative correlations with [Fe/H], while the elements N, Cr and Mn show obvious positive correlations with [Fe/H]; the other elements are closely distributed along a horizontal line. For the dwarfs, the elements C, O, and S show obvious negative correlations with [Fe/H], while the elements N, Al, Cr and Mn show obvious positive correlations with [Fe/H]; the other elements are closely distributed along a horizontal line. For most elemental abundances, the position and slope of the dwarf distributions are not consistent with those of the giant distributions. This may be caused by the difference between the sampling spaces of the dwarf and giant stars.
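The quantities plotted in Fig. 14 and Fig. 15 can be derived from catalog columns as sketched below. The sketch assumes that the catalog stores abundances relative to hydrogen, as the labels in Table 2 suggest, so that [X/Fe] = [X/H] - [Fe/H]. The function names are hypothetical, and the treatment of the boundary value $\log\,g$ = 4 is an assumption because the captions only give strict inequalities.

```python
def abundance_ratio(x_h, fe_h):
    """Convert [X/H] to [X/Fe] via [X/Fe] = [X/H] - [Fe/H] (all in dex)."""
    return x_h - fe_h


def luminosity_class(logg, threshold=4.0):
    """Split stars at log g = 4 dex, as in Fig. 14 (dwarfs) and Fig. 15 (giants).

    A star exactly at the threshold is placed with the giants here; the text
    does not specify how this boundary case is handled.
    """
    return "dwarf" if logg > threshold else "giant"


# Toy star: [Mg/H] = 0.10 dex, [Fe/H] = -0.20 dex, log g = 4.42 dex.
toy_mg_fe = abundance_ratio(0.10, -0.20)
toy_class = luminosity_class(4.42)
```

Splitting the sample this way before plotting makes it possible to compare the slope and position of each [X/Fe]-[Fe/H] trend between the two groups.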

4.3 Uncertainty Analysis

Figure 16: Dependencies of parameter estimation uncertainties on S/N. The dots in this figure indicate the uncertainties predicted by SPEMR, and the lengths of the line segments centered on the dots indicate the uncertainties obtained from repeated observations (> 5 times).

The SPEMR model is able to predict the PDF of stellar parameters and gives the uncertainty $\sigma_{pred}$ of the predicted parameters using a deep ensembling approach and equation (1). In addition, in the LAMOST sky survey, some stars are observed multiple times, at different times and under various observing conditions. These repeated observations can be used to analyze the uncertainty $\sigma_{obs}$ caused by observation configurations. Suppose we have $n_s$ repeated observations $\{\mathbf{x}_1, \cdots, \mathbf{x}_{n_s}\}$ of a source and $n_s$ estimations $\{y_1, \cdots, y_{n_s}\}$ obtained from these observations using SPEMR. The standard deviation of these $n_s$ parameter estimations is the corresponding uncertainty $\sigma_{obs}$.
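The definition of the repeat-observation uncertainty can be sketched in a few lines of Python. This is a minimal illustration rather than the released SPEMR code: the function name `sigma_obs`, the `(source_id, estimate)` input layout and the `min_repeats` cutoff are assumptions made here, and the sample standard deviation (denominator $n_s - 1$) is used because the text does not state the normalization.

```python
from collections import defaultdict
from statistics import stdev


def sigma_obs(estimates, min_repeats=2):
    """Standard deviation of repeated parameter estimates, per source.

    estimates   : iterable of (source_id, y) pairs, one pair per observation
                  (hypothetical layout, chosen only for illustration).
    min_repeats : sources with fewer estimates are skipped; must be >= 2
                  because a standard deviation needs at least two values.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    by_source = defaultdict(list)
    for source_id, y in estimates:
        by_source[source_id].append(y)
    # Sample standard deviation (n - 1 denominator): an assumption.
    return {
        sid: stdev(ys)
        for sid, ys in by_source.items()
        if len(ys) >= min_repeats
    }


# Synthetic Teff estimates (K) for two made-up sources.
teff = [("star_a", t) for t in (5000.0, 5010.0, 4990.0, 5000.0, 5020.0, 4980.0)]
teff.append(("star_b", 6100.0))  # observed once, so it is skipped
result = sigma_obs(teff, min_repeats=6)
```

Setting `min_repeats=6` mirrors the "> 5 times" selection mentioned in the caption of Figure 16; a source observed only once, such as `star_b` here, yields no $\sigma_{obs}$.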

Figure 16 shows the dependence of the uncertainties of the LAMOST-SPEMR catalog on S/N. The dots in this figure indicate the uncertainties $\sigma_{pred}$ predicted by SPEMR, and the lengths of the line segments centered on the dots indicate the uncertainties $\sigma_{obs}$ estimated from the repeated observations. On the whole, the low uncertainties indicate that the RRNet model has strong robustness and generalization. Specifically, in the case of S/N $\geq 10$, the $\sigma_{pred}$ of the parameters $T_{\rm eff}$, $\log g$ and [Fe/H] are 134 K, 0.17 dex and 0.07 dex, respectively, and those of the remaining elements are 0.07 dex to 0.19 dex. In addition, $\sigma_{pred}$ and $\sigma_{obs}$ decrease as S/N increases. This is because LAMOST spectra with high S/N are of higher quality and suffer from less noise. The uncertainties in this paper differ numerically from those in Figure 7 of Xiong et al. (2022). This difference is caused by changes in the regions of parameter space covered by the stellar spectra being processed. At the same time, RRNet and SPEMR show a similar pattern. Therefore, the SPEMR model is stable in estimating stellar parameters from LAMOST spectra.
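A trend of this kind can be summarized by binning the spectra in S/N and taking a representative uncertainty in each bin. The sketch below assumes a median per bin and hypothetical bin edges; the paper does not state how the points in Figure 16 were aggregated, and the numbers used here are synthetic.

```python
from statistics import median


def median_sigma_by_snr(rows, edges):
    """Median uncertainty in half-open S/N bins [lo, hi).

    rows  : list of (snr, sigma) pairs (hypothetical layout).
    edges : ascending bin edges, e.g. [10, 20, 50, 100].
    Returns a list of (lo, hi, median_sigma); the median is None for empty bins.
    """
    summary = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = [sigma for snr, sigma in rows if lo <= snr < hi]
        summary.append((lo, hi, median(in_bin) if in_bin else None))
    return summary


# Synthetic (S/N, sigma) pairs, for illustration only.
rows = [(12, 0.09), (15, 0.08), (18, 0.10), (35, 0.05), (42, 0.04), (90, 0.02)]
trend = median_sigma_by_snr(rows, edges=[10, 20, 50, 100])
```

The same function can be applied separately to $\sigma_{pred}$ and $\sigma_{obs}$ to compare how the two uncertainties fall off with S/N.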

4.4 Tests on open clusters

Figure 17: The variation of chemical abundances from the LAMOST-SPEMR catalog with $T_{\rm eff}$ in the three open clusters Melotte 22, NGC 2682, and NGC 2632. The three colors in the figure correspond to the three open clusters, and the mean $\mu$ and standard deviation $\sigma$ of the chemical abundances are given in each panel.
Figure 18: The variation of chemical abundances from the LAMOST-SPEMR catalog with [Fe/H] for the three open clusters Melotte 22, NGC 2682, and NGC 2632. The three colors in the figure correspond to the three open clusters, and the mean $\mu$ and standard deviation $\sigma$ of the chemical abundances are given in each panel.
Figure 19: Some comparisons of LAMOST-SPEMR with the SPCANet and RRNet catalogs in terms of chemical abundances in three open clusters (Melotte 22, NGC 2682, NGC 2632).

Open clusters have good chemical homogeneity (Bovy, 2016; Ness et al., 2018). Therefore, they can be used as chemical indicators to assess the quality of stellar parameter estimation. To further investigate the accuracy of the elemental abundances in the LAMOST-SPEMR catalog, we performed additional tests on open clusters. Zhong et al. (2020) analyzed the properties of many open clusters based on Gaia DR2 and LAMOST data, and provided a spectroscopic parameter catalog consisting of the stellar physical parameters of 8,811 member stars. We cross-matched these cluster member stars with the LAMOST-SPEMR catalog, and obtained matches in a variety of open clusters, such as Melotte 22, NGC 2682, NGC 2632, NGC 2168, Melotte 20, NGC 2281, Stock 2, NGC 1750, and NGC 1545. Finally, we selected the three open clusters with the largest number of matches to LAMOST-SPEMR (Melotte 22, NGC 2682 and NGC 2632) and removed the parameter estimates with large uncertainties $\sigma_{pred}$ (section 4.3).
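The selection procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the matching key `star_id`, the row layout, and the uncertainty cutoff `max_sigma` are hypothetical, since the text gives neither the cross-match key nor the threshold used for "large" $\sigma_{pred}$.

```python
from collections import Counter


def select_cluster_members(members, catalog, n_clusters=3, max_sigma=0.1):
    """Cross-match, keep the best-populated clusters, drop uncertain rows.

    members    : dict mapping star_id -> cluster name (hypothetical key).
    catalog    : list of dicts with keys 'star_id' and 'sigma_pred'.
    n_clusters : number of clusters with the most matches to keep.
    max_sigma  : hypothetical cutoff on sigma_pred.
    """
    # Step 1: cross-match the catalog rows with the cluster member list.
    matched = [
        (members[row["star_id"]], row)
        for row in catalog
        if row["star_id"] in members
    ]
    # Step 2: rank clusters by their number of matches.
    counts = Counter(cluster for cluster, _ in matched)
    top = {name for name, _ in counts.most_common(n_clusters)}
    # Step 3: keep rows in the top clusters with small predicted uncertainty.
    return [
        (cluster, row)
        for cluster, row in matched
        if cluster in top and row["sigma_pred"] <= max_sigma
    ]


# Synthetic member list and catalog rows, for illustration only.
members = {
    "s1": "Melotte 22", "s2": "Melotte 22", "s3": "Melotte 22",
    "s4": "NGC 2682", "s5": "NGC 2682", "s6": "NGC 1545",
}
catalog = [
    {"star_id": "s1", "sigma_pred": 0.02},
    {"star_id": "s2", "sigma_pred": 0.50},
    {"star_id": "s3", "sigma_pred": 0.03},
    {"star_id": "s4", "sigma_pred": 0.04},
    {"star_id": "s5", "sigma_pred": 0.05},
    {"star_id": "s6", "sigma_pred": 0.01},
    {"star_id": "s7", "sigma_pred": 0.01},
]
selected = select_cluster_members(members, catalog, n_clusters=2, max_sigma=0.1)
selected_ids = [row["star_id"] for _, row in selected]
```

In this sketch the clusters are ranked before the uncertainty cut is applied, following the order in which the two steps are described in the text.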

To investigate the effects of effective temperature and metal abundance on the elemental abundances of open clusters, we show the variation of the LAMOST-SPEMR elemental abundances with effective temperature and metal abundance in the three open clusters. Figure 17 shows the dependence of the elemental abundances from the LAMOST-SPEMR catalog on effective temperature ($T_{\rm eff}$) in the above-mentioned open clusters. In agreement with the results of Ting et al. (2019), the chemical abundances from SPEMR do not show a significant trend with $T_{\rm eff}$ in any of the three clusters, and they exhibit low deviation and dispersion. Figure 18 shows the dependence of the elemental abundances from the LAMOST-SPEMR catalog on metal abundance ([Fe/H]) for the above-mentioned open clusters. The [Fe/H] values in the LAMOST-SPEMR catalog range approximately between -0.2 dex and 0.2 dex. The [Fe/H] spread also differs depending on [X/H] for these open clusters. This phenomenon is mainly due to the differences in age and distance between these open clusters (Tautvaišienė et al., 2015).

In addition, we compared the chemical abundances of the LAMOST-SPEMR catalog with those of the SPCANet and RRNet catalogs for the three above-mentioned clusters (Fig. 19). Figure 19 shows that the standard deviations of the elemental abundances from the LAMOST-SPEMR and RRNet catalogs are, on the whole, lower than those of the SPCANet catalog. For the Melotte 22 and NGC 2632 open clusters, LAMOST-SPEMR shows an overall lower standard deviation than RRNet. For the NGC 2682 open cluster, LAMOST-SPEMR shows a lower standard deviation than RRNet for the elements S, Ca, Ti, and Cr. The overall chemical homogeneity from LAMOST-SPEMR for the three clusters is $0.054\pm0.022$ dex, $0.055\pm0.016$ dex and $0.067\pm0.024$ dex, respectively. These results indicate that the LAMOST-SPEMR catalog has higher accuracy than SPCANet and RRNet.
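One plausible way to obtain a per-cluster summary of this form is to compute the standard deviation of each elemental abundance among the member stars, then report the mean and scatter of those standard deviations across elements. The sketch below follows that reading, which is an assumption: the text does not define how the overall homogeneity values were computed, and the function name, input layout and abundance values are hypothetical.

```python
from statistics import mean, stdev


def cluster_homogeneity(abundances):
    """Mean and scatter of per-element abundance dispersions in one cluster.

    abundances : dict mapping element name -> list of [X/H] values (dex)
                 for the member stars of a single cluster (hypothetical layout).
    Returns (mean_dispersion, scatter_of_dispersions), both in dex.
    """
    # Dispersion of each element among the cluster members.
    per_element = [stdev(values) for values in abundances.values()]
    # Summary across elements: mean +/- sample standard deviation.
    return mean(per_element), stdev(per_element)


# Synthetic abundances for two elements, for illustration only.
example = {"Mg": [0.00, 0.10], "Si": [0.00, 0.20]}
mean_disp, scatter = cluster_homogeneity(example)
```

A smaller mean dispersion indicates that the estimated abundances of member stars agree more closely, which is the sense in which the clusters serve as a homogeneity test.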

4.5 LAMOST-SPEMR catalog

Finally, we published the LAMOST-SPEMR catalog of the estimated stellar atmospheric parameters and elemental abundances for 4,197,960 medium-resolution spectra from LAMOST DR8. This catalog contains the following information: the identifier of the observed spectrum (obsid), the FITS file name corresponding to the spectrum (filename), coordinate information (ra, dec), the extension names of the spectrum (extname_blue, extname_red), the signal-to-noise ratios of the spectrum (snr_blue, snr_red), effective temperature (Teff[K]), surface gravity (Logg), metallicity (Fe/H), 13 elemental abundances (X/H), and the $1\sigma$ uncertainties of the corresponding stellar parameters (X_err).

5 Summary and outlook

This paper proposed a novel method, Stellar Parameter Estimation based on Multiple Regions (SPEMR), designed around the distribution characteristics of LAMOST medium-resolution data in parameter space. We estimated the stellar atmospheric parameters, elemental abundances, and corresponding uncertainties for 4,197,960 medium-resolution spectra in LAMOST DR8 using SPEMR. In the case of S/N $\geq 10$, the precisions of the parameters $T_{\rm eff}$, $\log g$, [Fe/H], and [Cr/H] are 47 K, 0.08 dex, 0.03 dex, and 0.16 dex, respectively, while the precisions of the other elemental abundances are 0.03 dex to 0.13 dex. To verify the performance of SPEMR, we conducted a series of comparison experiments with other typical medium-resolution spectral parameter estimation models and other surveys. The experimental results demonstrate that the SPEMR model not only improves the parameter accuracy on frequently observed types of spectra but also provides good parameter estimates for spectra with high temperature, low temperature, or low metallicity. In addition, the SPEMR parameter estimates are in excellent agreement with those of other high-resolution sky surveys. In the future, we will explore the characteristics of high-temperature and low-signal-to-noise spectra, and build extended reference sets with better coverage of the stellar parameter space to further improve the parameter estimation capability of the model.

Acknowledgements

We are very grateful to the referee for helpful suggestions, as well as the correction for some issues, which have improved the paper significantly. This work is supported by the National Natural Science Foundation of China (Grant No. 11973022), the Natural Science Foundation of Guangdong Province (No. 2020A1515010710), the Major projects of the joint fund of Guangdong, and the National Natural Science Foundation (Grant No. U1811464).

LAMOST, a multi-target optical fiber spectroscopic telescope in the large sky area, is a major national engineering project built by the Chinese Academy of Sciences. Funding for the project is provided by the National Development and Reform Commission. LAMOST is operated and managed by the National Astronomical Observatory of the Chinese Academy of Sciences.

Data Availability

The LAMOST data employed in this article are available after September 2022 for download by users outside China from LAMOST DR8, at http://www.lamost.org/dr8/. The computed catalog for 4.19 million medium-resolution spectra from LAMOST DR8, the source code, the trained model and the experimental data have been made publicly available at: https://github.com/yulongzh/SPEMR.

Footnotes

software: NumPy (Harris et al., 2020), SciPy (Virtanen et al., 2020), Astropy (Price-Whelan et al., 2018), Matplotlib (Hunter, 2007), Scikit-learn (Pedregosa et al., 2011), PyTorch (Paszke et al., 2019).

References

  • Bialek et al. (2020) Bialek S., Fabbro S., Venn K. A., et al., 2020, Monthly Notices of the Royal Astronomical Society, 498, 3817
  • Bovy (2016) Bovy J., 2016, The Astrophysical Journal, 817, 49
  • Buder et al. (2021) Buder S., Sharma S., Kos J., et al., 2021, Monthly Notices of the Royal Astronomical Society, 506, 150
  • Choi et al. (2016) Choi J., Dotter A., Conroy C., et al., 2016, The Astrophysical Journal, 823, 102
  • Cui et al. (2012) Cui X.-Q., Zhao Y.-H., Chu Y.-Q., et al., 2012, Research in Astronomy and Astrophysics, 12, 1197
  • De Silva et al. (2015) De Silva G. M., Freeman K. C., Bland-Hawthorn J., et al., 2015, Monthly Notices of the Royal Astronomical Society, 449, 2604
  • Dotter (2016) Dotter A., 2016, The Astrophysical Journal Supplement Series, 222, 8
  • Fabbro et al. (2018) Fabbro S., Venn K., O’Briain T., et al., 2018, Monthly Notices of the Royal Astronomical Society, 475, 2978
  • Harris et al. (2020) Harris C. R., Millman K. J., Van Der Walt S. J., et al., 2020, Nature, 585, 357
  • Hunter (2007) Hunter J. D., 2007, Computing in Science & Engineering, 9, 90
  • Li et al. (2022) Li X., Zeng S., Wang Z., et al., 2022, Monthly Notices of the Royal Astronomical Society, 514, 4588
  • Liu et al. (2014) Liu C., Deng L.-C., Carlin J. L., et al., 2014, The Astrophysical Journal, 790, 110
  • Liu et al. (2015) Liu X.-W., Zhao G., Hou J.-L., 2015, Research in Astronomy and Astrophysics, 15, 1089
  • Luo et al. (2015) Luo A.-L., Zhao Y.-H., Zhao G., et al., 2015, Research in Astronomy and Astrophysics, 15, 1095
  • Majewski et al. (2017) Majewski S. R., Schiavon R. P., Frinchaboy P. M., et al., 2017, The Astronomical Journal, 154, 94
  • Ness et al. (2018) Ness M., Rix H. W., Hogg D. W., et al., 2018, The Astrophysical Journal, 853, 198
  • Paszke et al. (2019) Paszke A., Gross S., Massa F., et al., 2019, in Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019. Vancouver, BC, Canada, pp 8024–8035
  • Pedregosa et al. (2011) Pedregosa F., Varoquaux G., Gramfort A., et al., 2011, The Journal of Machine Learning Research, 12, 2825
  • Pérez et al. (2016) Pérez A. E. G., Prieto C. A., Holtzman J. A., et al., 2016, The Astronomical Journal, 151, 144
  • Price-Whelan et al. (2018) Price-Whelan A. M., Sipőcz B., Günther H., et al., 2018, The Astronomical Journal, 156, 123
  • Prugniel & Soubiran (2001) Prugniel P., Soubiran C., 2001, Astronomy and Astrophysics, 369, 1048
  • Recio-Blanco et al. (2022) Recio-Blanco A., de Laverny P., Palicio P., et al., 2022, preprint (arXiv:2206.05541)
  • Rui et al. (2019) Rui W., A-li L., Shuo Z., et al., 2019, Publications of the Astronomical Society of the Pacific, 131, 024505
  • Tautvaišienė et al. (2015) Tautvaišienė G., Drazdauskas A., Mikolaitis Š., et al., 2015, Astronomy and Astrophysics, 573, A55
  • Ting et al. (2019) Ting Y.-S., Conroy C., Rix H.-W., et al., 2019, The Astrophysical Journal, 879, 69
  • Virtanen et al. (2020) Virtanen P., Gommers R., Oliphant T. E., et al., 2020, Nature Methods, 17, 261
  • Wang et al. (2019) Wang R., Luo A.-L., Chen J.-J., et al., 2019, The Astrophysical Journal Supplement Series, 244, 27
  • Wang et al. (2020) Wang R., Luo A.-L., Chen J.-J., et al., 2020, The Astrophysical Journal, 891, 23
  • Wu et al. (2011) Wu Y., Luo A.-L., Li H.-N., et al., 2011, Research in Astronomy and Astrophysics, 11, 924
  • Xiong et al. (2022) Xiong S., Li X., Liao C., 2022, The Astrophysical Journal Supplement Series, 261, 36
  • Zhong et al. (2020) Zhong J., Chen L., Wu D., et al., 2020, Astronomy and Astrophysics, 640, A127