-
RiskQ: Risk-sensitive Multi-Agent Reinforcement Learning Value Factorization
Authors:
Siqi Shen,
Chennan Ma,
Chao Li,
Weiquan Liu,
Yongquan Fu,
Songzhu Mei,
Xinwang Liu,
Cheng Wang
Abstract:
Multi-agent systems are characterized by environmental uncertainty, varying policies of agents, and partial observability, which result in significant risks. In the context of Multi-Agent Reinforcement Learning (MARL), learning coordinated and decentralized policies that are sensitive to risk is challenging. To formulate the coordination requirements in risk-sensitive MARL, we introduce the Risk-sensitive Individual-Global-Max (RIGM) principle as a generalization of the Individual-Global-Max (IGM) and Distributional IGM (DIGM) principles. This principle requires that the collection of risk-sensitive action selections of each agent should be equivalent to the risk-sensitive action selection of the central policy. Current MARL value factorization methods do not satisfy the RIGM principle for common risk metrics such as the Value at Risk (VaR) metric or distorted risk measurements. Therefore, we propose RiskQ to address this limitation, which models the joint return distribution by modeling quantiles of it as weighted quantile mixtures of per-agent return distribution utilities. RiskQ satisfies the RIGM principle for the VaR and distorted risk metrics. We show that RiskQ can obtain promising performance through extensive experiments. The source code of RiskQ is available in https://github.com/xmu-rl-3dv/RiskQ.
Submitted 21 March, 2024; v1 submitted 3 November, 2023;
originally announced November 2023.
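The RiskQ abstract above describes joint return quantiles built as weighted mixtures of per-agent quantiles. The following is a minimal sketch of why such a mixture lets each agent pick its own VaR-greedy action, assuming non-negative weights and a shared quantile level. The tables, weights, and function names are hypothetical illustrations and are not taken from the RiskQ repository.

```python
from itertools import product

# Hypothetical per-agent quantile tables: QUANTILES[agent][action] lists the
# return quantiles at evenly spaced levels, in non-decreasing order.
QUANTILES = [
    [[0.0, 1.0, 2.0, 5.0], [0.5, 0.8, 1.0, 1.2]],   # agent 0
    [[-1.0, 0.0, 3.0, 4.0], [0.2, 0.4, 0.6, 0.8]],  # agent 1
]
WEIGHTS = [0.7, 0.3]  # assumed non-negative mixture weights


def level_index(alpha, n_levels):
    """Map a risk level alpha in [0, 1] to an index into a quantile list."""
    return min(int(alpha * n_levels), n_levels - 1)


def joint_var(actions, alpha):
    """VaR of the joint return, modeled as a weighted sum of agent quantiles."""
    k = level_index(alpha, len(QUANTILES[0][0]))
    return sum(w * QUANTILES[i][a][k]
               for i, (w, a) in enumerate(zip(WEIGHTS, actions)))


def decentralized_actions(alpha):
    """Each agent maximizes its own VaR at level alpha."""
    k = level_index(alpha, len(QUANTILES[0][0]))
    return tuple(max(range(len(table)), key=lambda a: table[a][k])
                 for table in QUANTILES)


def centralized_actions(alpha):
    """Brute-force search for the joint action with the highest joint VaR."""
    ranges = (range(len(table)) for table in QUANTILES)
    return max(product(*ranges), key=lambda acts: joint_var(acts, alpha))
```

In this toy setting the risk-averse level (alpha = 0.1) and the risk-seeking level (alpha = 0.9) select different actions, and in both cases the per-agent choices coincide with the brute-force joint choice, which is the kind of consistency the RIGM principle asks for.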
-
Euclid preparation. TBD. Forecast impact of super-sample covariance on 3x2pt analysis with Euclid
Authors:
Euclid Collaboration,
D. Sciotti,
S. Gouyou Beauchamps,
V. F. Cardone,
S. Camera,
I. Tutusaus,
F. Lacasa,
A. Barreira,
A. Gorce,
M. Aubert,
P. Baratta,
R. E. Upham,
M. Bonici,
C. Carbone,
S. Casas,
S. Ilić,
M. Martinelli,
Z. Sakr,
A. Schneider,
R. Maoli,
R. Scaramella,
S. Escoffier,
W. Gillard,
N. Aghanim,
A. Amara
, et al. (199 additional authors not shown)
Abstract:
Deviations from Gaussianity in the distribution of the fields probed by large-scale structure surveys generate additional terms in the data covariance matrix, increasing the uncertainties in the measurement of the cosmological parameters. Super-sample covariance (SSC) is among the largest of these non-Gaussian contributions, with the potential to significantly degrade constraints on some of the parameters of the cosmological model under study -- especially for weak lensing cosmic shear. We compute and validate the impact of SSC on the forecast uncertainties on the cosmological parameters for the Euclid photometric survey, obtained with a Fisher matrix analysis, both considering the Gaussian covariance alone and adding the SSC term -- computed through the public code PySSC. The photometric probes are considered in isolation and combined in the `3$\times$2pt' analysis. We find the SSC impact to be non-negligible -- halving the Figure of Merit of the dark energy parameters ($w_0$, $w_a$) in the 3$\times$2pt case and substantially increasing the uncertainties on $Ω_{{\rm m},0}, w_0$, and $σ_8$ for cosmic shear; photometric galaxy clustering, on the other hand, is less affected due to the lower probe response. The relative impact of SSC does not show significant changes under variations of the redshift binning scheme, while it is smaller for weak lensing when marginalising over the multiplicative shear bias nuisance parameters, which also leads to poorer constraints on the cosmological parameters. Finally, we explore how the use of prior information on the shear and galaxy bias changes the SSC impact. Improving shear bias priors does not have a significant impact, while galaxy bias must be calibrated to sub-percent level to increase the Figure of Merit by the large amount needed to achieve the value when SSC is not included.
Submitted 24 October, 2023;
originally announced October 2023.
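As a rough illustration of the Fisher-matrix comparison described in the abstract above, the sketch below computes a two-parameter Figure of Merit with a Gaussian covariance alone and with an extra rank-one term standing in for SSC. All numbers are toy values chosen for illustration, the rank-one form is an assumption, the FoM normalization follows one common convention (square root of the Fisher determinant), and nothing here uses PySSC or the paper's pipeline.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose2(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def fisher(jacobian, covariance):
    """Fisher matrix F = J^T C^{-1} J for a Gaussian likelihood."""
    return matmul2(matmul2(transpose2(jacobian), inv2(covariance)), jacobian)


def figure_of_merit(f):
    """FoM as sqrt(det F) for a two-parameter Fisher matrix."""
    return math.sqrt(f[0][0] * f[1][1] - f[0][1] * f[1][0])


# Toy inputs (hypothetical): derivatives of two data points with respect to
# two parameters, a diagonal Gaussian covariance, and a rank-one extra term.
jacobian = [[1.0, 0.3], [0.2, 1.0]]
cov_gauss = [[0.04, 0.0], [0.0, 0.04]]
response = [0.15, 0.10]
cov_extra = [[ri * rj for rj in response] for ri in response]
cov_total = [[cov_gauss[i][j] + cov_extra[i][j] for j in range(2)]
             for i in range(2)]

fom_gauss = figure_of_merit(fisher(jacobian, cov_gauss))
fom_total = figure_of_merit(fisher(jacobian, cov_total))
```

Because the extra term is positive semi-definite, it can only lower the Fisher information, so `fom_total` is smaller than `fom_gauss`; the size of the drop in a real forecast depends on the probe responses and survey setup.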
-
Euclid preparation. XXXI. The effect of the variations in photometric passbands on photometric-redshift accuracy
Authors:
Euclid Collaboration,
Stéphane Paltani,
J. Coupon,
W. G. Hartley,
A. Alvarez-Ayllon,
F. Dubath,
J. J. Mohr,
M. Schirmer,
J. -C. Cuillandre,
G. Desprez,
O. Ilbert,
K. Kuijken,
N. Aghanim,
B. Altieri,
A. Amara,
N. Auricchio,
M. Baldi,
R. Bender,
C. Bodendorf,
D. Bonino,
E. Branchini,
M. Brescia,
J. Brinchmann,
S. Camera,
V. Capobianco
, et al. (192 additional authors not shown)
Abstract:
The technique of photometric redshifts has become essential for the exploitation of multi-band extragalactic surveys. While the requirements on photo-zs for the study of galaxy evolution mostly pertain to the precision and to the fraction of outliers, the most stringent requirement in their use in cosmology is on the accuracy, with a level of bias at the sub-percent level for the Euclid cosmology mission. A separate, and challenging, calibration process is needed to control the bias at this level of accuracy. The bias in photo-zs has several distinct origins that may not always be easily overcome. We identify here one source of bias linked to the spatial or time variability of the passbands used to determine the photometric colours of galaxies. We first quantified the effect as observed on several well-known photometric cameras, and found in particular that, due to the properties of optical filters, the redshifts of off-axis sources are usually overestimated. We show using simple simulations that the detailed and complex changes in the shape can be mostly ignored and that it is sufficient to know the mean wavelength of the passbands of each photometric observation to correct almost exactly for this bias; the key point is that this mean wavelength is independent of the spectral energy distribution of the source. We use this property to propose a correction that can be implemented in a computationally efficient way in some photo-z algorithms, in particular template-fitting. We verified that our algorithm, implemented in the new photo-z code Phosphoros, can effectively reduce the bias in photo-zs on real data using the CFHTLS T007 survey, with an average measured bias Delta z over the redshift range 0.4<z<0.7 decreasing by about 0.02, specifically from Delta z~0.04 to Delta z~0.02 around z=0.5. Our algorithm is also able to produce corrected photometry for other applications.
Submitted 23 October, 2023;
originally announced October 2023.
-
MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin
Authors:
Tianshuo Zhou,
Sen Mei,
Xinze Li,
Zhenghao Liu,
Chenyan Xiong,
Zhiyuan Liu,
Yu Gu,
Ge Yu
Abstract:
This paper proposes Multi-modAl Retrieval model via Visual modulE pLugin (MARVEL), which learns an embedding space for queries and multi-modal documents to conduct retrieval. MARVEL encodes queries and multi-modal documents with a unified encoder model, which helps to alleviate the modality gap between images and texts. Specifically, we enable the image understanding ability of the well-trained dense retriever, T5-ANCE, by incorporating the visual module's encoded image features as its inputs. To facilitate the multi-modal retrieval tasks, we build the ClueWeb22-MM dataset based on the ClueWeb22 dataset, which regards anchor texts as queries, and extracts the related text and image documents from anchor-linked web pages. Our experiments show that MARVEL significantly outperforms the state-of-the-art methods on the multi-modal retrieval datasets WebQA and ClueWeb22-MM. MARVEL provides an opportunity to broaden the advantages of text retrieval to the multi-modal scenario. We also illustrate that the language model has the ability to extract image semantics and partly map the image features to the input word embedding space. All code is available at https://github.com/OpenMatch/MARVEL.
Submitted 15 June, 2024; v1 submitted 21 October, 2023;
originally announced October 2023.
-
An experimental investigation of the heat and flow features in street canyons: Impacts of the approaching turbulent boundary layer flow
Authors:
Yunpeng Xue,
Yongling Zhao,
Shuo-Jun Mei,
Yuan Chao,
Jan Carmeliet
Abstract:
The study of turbulent boundary layer flow holds significant importance in urban climate research, particularly concerning numerical simulation studies where it serves as a crucial inflow boundary condition. However, understanding the turbulent boundary layer's influence on flow and heat features within canyon and canopy flow remains incomplete. To address this knowledge gap, our current work employs simultaneous Particle Image Velocimetry and Laser-Induced Fluorescence (PIV-LIF) measurements within a large closed-circuit water tunnel. Through this approach, we obtain valuable flow information under various flow and thermal conditions, allowing us to explore the impacts of three distinct turbulent boundary layer flows. The three chosen turbulent boundary layer flows display distinct influences on flow characteristics and heat removal capacity. The ventilation rate exhibits a maximum difference of 80% among the tested boundary layer flows. Additionally, the most significant variation in heat removal capacity is approximately 45%. Moreover, the different turbulence inlet profiles result in diverse fluctuating features at the canyon opening, while the deeper region of the canyon remains less affected.
Submitted 19 October, 2023;
originally announced October 2023.
-
How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations
Authors:
Tianyu Guo,
Wei Hu,
Song Mei,
Huan Wang,
Caiming Xiong,
Silvio Savarese,
Yu Bai
Abstract:
While large language models based on the transformer architecture have demonstrated remarkable in-context learning (ICL) capabilities, our understanding of such capabilities is still at an early stage: existing theory and mechanistic understanding focus mostly on simple scenarios such as learning simple function classes. This paper takes initial steps toward understanding ICL in more complex scenarios, by studying learning with representations. Concretely, we construct synthetic in-context learning problems with a compositional structure, where the label depends on the input through a possibly complex but fixed representation function, composed with a linear function that differs in each instance. By construction, the optimal ICL algorithm first transforms the inputs by the representation function, and then performs linear ICL on top of the transformed dataset. We show theoretically the existence of transformers that approximately implement such algorithms with mild depth and size. Empirically, we find trained transformers consistently achieve near-optimal ICL performance in this setting, and exhibit the desired dissection where lower layers transform the dataset and upper layers perform linear ICL. Through extensive probing and a new pasting experiment, we further reveal several mechanisms within the trained transformers, such as concrete copying behaviors on both the inputs and the representations, linear ICL capability of the upper layers alone, and a post-ICL representation selection mechanism in a harder mixture setting. These observed mechanisms align well with our theory and may shed light on how transformers perform ICL in more realistic scenarios.
Submitted 16 October, 2023;
originally announced October 2023.
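The two-stage procedure described in the abstract above (apply a fixed representation, then fit a per-instance linear function on the transformed examples) can be sketched as follows. This is only an illustration under stated assumptions: the representation `phi`, the two-feature dimension, and the ridge parameter are hypothetical choices, and the code is a plain least-squares solver, not a transformer and not the paper's experimental setup.

```python
import math


def phi(x):
    """Hypothetical fixed representation shared across all instances."""
    return (math.tanh(x), x * x)


def fit_linear_on_representation(xs, ys, ridge=0.0):
    """Solve the 2x2 (ridge) normal equations on the transformed inputs."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for x, y in zip(xs, ys):
        f1, f2 = phi(x)
        a11 += f1 * f1
        a12 += f1 * f2
        a22 += f2 * f2
        b1 += f1 * y
        b2 += f2 * y
    a11 += ridge
    a22 += ridge
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def predict(w, x):
    """Apply the fitted linear head to the representation of a query."""
    f1, f2 = phi(x)
    return w[0] * f1 + w[1] * f2


# One "instance": the linear head w_true changes per instance, phi does not.
w_true = (2.0, -0.5)
context_xs = [-1.0, 0.5, 2.0]
context_ys = [predict(w_true, x) for x in context_xs]
w_hat = fit_linear_on_representation(context_xs, context_ys)
query_prediction = predict(w_hat, 1.3)
```

With noiseless labels and no ridge penalty, the fitted head recovers `w_true` up to floating-point error, so the query prediction matches the true label.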
-
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
Authors:
Licong Lin,
Yu Bai,
Song Mei
Abstract:
Large transformer models pretrained on offline reinforcement learning datasets have demonstrated remarkable in-context reinforcement learning (ICRL) capabilities, where they can make good decisions when prompted with interaction trajectories from unseen environments. However, when and how transformers can be trained to perform ICRL have not been theoretically well-understood. In particular, it is unclear which reinforcement-learning algorithms transformers can perform in context, and how distribution mismatch in offline training data affects the learned algorithms. This paper provides a theoretical framework that analyzes supervised pretraining for ICRL. This includes two recently proposed training methods -- algorithm distillation and decision-pretrained transformers. First, assuming model realizability, we prove the supervised-pretrained transformer will imitate the conditional expectation of the expert algorithm given the observed trajectory. The generalization error will scale with model capacity and a distribution divergence factor between the expert and offline algorithms. Second, we show transformers with ReLU attention can efficiently approximate near-optimal online reinforcement learning algorithms like LinUCB and Thompson sampling for stochastic linear bandits, and UCB-VI for tabular Markov decision processes. This provides the first quantitative analysis of the ICRL capabilities of transformers pretrained from offline trajectories.
Submitted 26 May, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Euclid: Identification of asteroid streaks in simulated images using deep learning
Authors:
M. Pöntinen,
M. Granvik,
A. A. Nucita,
L. Conversi,
B. Altieri,
B. Carry,
C. M. O'Riordan,
D. Scott,
N. Aghanim,
A. Amara,
L. Amendola,
N. Auricchio,
M. Baldi,
D. Bonino,
E. Branchini,
M. Brescia,
S. Camera,
V. Capobianco,
C. Carbone,
J. Carretero,
M. Castellano,
S. Cavuoti,
A. Cimatti,
R. Cledassou,
G. Congedo
, et al. (92 additional authors not shown)
Abstract:
Up to 150000 asteroids will be visible in the images of the ESA Euclid space telescope, and the instruments of Euclid offer multiband visual to near-infrared photometry and slitless spectra of these objects. Most asteroids will appear as streaks in the images. Due to the large number of images and asteroids, automated detection methods are needed. A non-machine-learning approach based on the StreakDet software was previously tested, but the results were not optimal for short and/or faint streaks. We set out to improve the capability to detect asteroid streaks in Euclid images by using deep learning.
We built, trained, and tested a three-step machine-learning pipeline with simulated Euclid images. First, a convolutional neural network (CNN) detected streaks and their coordinates in full images, aiming to maximize the completeness (recall) of detections. Then, a recurrent neural network (RNN) merged snippets of long streaks detected in several parts by the CNN. Lastly, gradient-boosted trees (XGBoost) linked detected streaks between different Euclid exposures to reduce the number of false positives and improve the purity (precision) of the sample.
The deep-learning pipeline surpasses the completeness and reaches a similar level of purity of a non-machine-learning pipeline based on the StreakDet software. Additionally, the deep-learning pipeline can detect asteroids 0.25-0.5 magnitudes fainter than StreakDet. The deep-learning pipeline could result in a 50% increase in the number of detected asteroids compared to the StreakDet software. There is still scope for further refinement, particularly in improving the accuracy of streak coordinates and enhancing the completeness of the final stage of the pipeline, which involves linking detections across multiple exposures.
Submitted 5 October, 2023;
originally announced October 2023.
-
Euclid: The search for primordial features
Authors:
M. Ballardini,
Y. Akrami,
F. Finelli,
D. Karagiannis,
B. Li,
Y. Li,
Z. Sakr,
D. Sapone,
A. Achúcarro,
M. Baldi,
N. Bartolo,
G. Cañas-Herrera,
S. Casas,
R. Murgia,
H. A. Winther,
M. Viel,
A. Andrews,
J. Jasche,
G. Lavaux,
D. K. Hazra,
D. Paoletti,
J. Valiviita,
A. Amara,
S. Andreon,
N. Auricchio
, et al. (104 additional authors not shown)
Abstract:
Primordial features, in particular oscillatory signals, imprinted in the primordial power spectrum of density perturbations represent a clear window of opportunity for detecting new physics at high-energy scales. Future spectroscopic and photometric measurements from the $Euclid$ space mission will provide unique constraints on the primordial power spectrum, thanks to the redshift coverage and high-accuracy measurement of nonlinear scales, thus allowing us to investigate deviations from the standard power-law primordial power spectrum. We consider two models with primordial undamped oscillations superimposed on the matter power spectrum, one linearly spaced in $k$-space, the other logarithmically spaced in $k$-space. We forecast uncertainties applying a Fisher matrix method to spectroscopic galaxy clustering, weak lensing, photometric galaxy clustering, cross correlation between photometric probes, spectroscopic galaxy clustering bispectrum, CMB temperature and $E$-mode polarization, temperature-polarization cross correlation, and CMB weak lensing. We also study a nonlinear density reconstruction method to retrieve the oscillatory signals in the primordial power spectrum. We find the following percentage relative errors in the feature amplitude with $Euclid$ primary probes for the linear (logarithmic) feature model: 21% (22%) in the pessimistic settings and 18% (18%) in the optimistic settings at 68.3% confidence level (CL) using GC$_{\rm sp}$+WL+GC$_{\rm ph}$+XC. Combining all the explored sources of information expected from $Euclid$ with a future SO-like CMB experiment, we forecast ${\cal A}_{\rm lin} \simeq 0.010 \pm 0.001$ at 68.3% CL and ${\cal A}_{\rm log} \simeq 0.010 \pm 0.001$ for GC$_{\rm sp}$(PS rec + BS)+WL+GC$_{\rm ph}$+XC+SO-like both for the optimistic and pessimistic settings over the frequency range $(1,\,10^{2.1})$.
Submitted 29 March, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Informative Data Mining for One-Shot Cross-Domain Semantic Segmentation
Authors:
Yuxi Wang,
Jian Liang,
Jun Xiao,
Shuqi Mei,
Yuran Yang,
Zhaoxiang Zhang
Abstract:
Contemporary domain adaptation offers a practical solution for achieving cross-domain transfer of semantic segmentation between labeled source data and unlabeled target data. These solutions have gained significant popularity; however, they require the model to be retrained when the test environment changes. This can result in unbearable costs in certain applications due to the time-consuming training process and concerns regarding data privacy. One-shot domain adaptation methods attempt to overcome these challenges by transferring the pre-trained source model to the target domain using only one target sample. Despite this, the style transfer module they rely on still suffers from high computation cost and over-fitting. To address this problem, we propose a novel framework called Informative Data Mining (IDM) that enables efficient one-shot domain adaptation for semantic segmentation. Specifically, IDM provides an uncertainty-based selection criterion to identify the most informative samples, which facilitates quick adaptation and reduces redundant training. We then perform a model adaptation method using these selected samples, which includes patch-wise mixing and prototype-based information maximization to update the model. This approach effectively enhances adaptation and mitigates the overfitting problem. In general, we provide empirical evidence of the effectiveness and efficiency of IDM. Our approach outperforms existing methods and achieves a new state-of-the-art one-shot performance of 56.7%/55.4% on the GTA5/SYNTHIA to Cityscapes adaptation tasks, respectively. The code will be released at https://github.com/yxiwang/IDM.
Submitted 25 September, 2023;
originally announced September 2023.
-
Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models
Authors:
Song Mei,
Yuchen Wu
Abstract:
We investigate the approximation efficiency of score functions by deep neural networks in diffusion-based generative modeling. While existing approximation theories utilize the smoothness of score functions, they suffer from the curse of dimensionality for intrinsically high-dimensional data. This limitation is pronounced in graphical models such as Markov random fields, common for image distributions, where the approximation efficiency of score functions remains unestablished.
To address this, we observe score functions can often be well-approximated in graphical models through variational inference denoising algorithms. Furthermore, these algorithms are amenable to efficient neural network representation. We demonstrate this in examples of graphical models, including Ising models, conditional Ising models, restricted Boltzmann machines, and sparse encoding models. Combined with off-the-shelf discretization error bounds for diffusion-based sampling, we provide an efficient sample complexity bound for diffusion-based generative modeling when the score function is learned by deep neural networks.
Submitted 20 September, 2023;
originally announced September 2023.
-
Uncertainty Intervals for Prediction Errors in Time Series Forecasting
Authors:
Hui Xu,
Song Mei,
Stephen Bates,
Jonathan Taylor,
Robert Tibshirani
Abstract:
Inference for prediction errors is critical in time series forecasting pipelines. However, providing statistically meaningful uncertainty intervals for prediction errors remains relatively under-explored. Practitioners often resort to forward cross-validation (FCV) for obtaining point estimators and constructing confidence intervals based on the Central Limit Theorem (CLT). The naive version assumes independence, a condition that is usually invalid due to time correlation. These approaches lack statistical interpretations and theoretical justifications even under stationarity.
This paper systematically investigates uncertainty intervals for prediction errors in time series forecasting. We first distinguish two key inferential targets: the stochastic test error over near future data points, and the expected test error as the expectation of the former. The stochastic test error is often more relevant in applications needing to quantify uncertainty over individual time series instances. To construct prediction intervals for the stochastic test error, we propose the quantile-based forward cross-validation (QFCV) method. Under an ergodicity assumption, QFCV intervals have asymptotically valid coverage and are shorter than marginal empirical quantiles. In addition, we also illustrate why naive CLT-based FCV intervals fail to provide valid uncertainty intervals, even with certain corrections. For non-stationary time series, we further provide rolling intervals by combining QFCV with adaptive conformal prediction to give time-average coverage guarantees. Overall, we advocate the use of QFCV procedures and demonstrate their coverage and efficiency through simulations and real data examples.
Submitted 14 September, 2023;
originally announced September 2023.
-
Euclid preparation. XXXIV. The effect of linear redshift-space distortions in photometric galaxy clustering and its cross-correlation with cosmic shear
Authors:
Euclid Collaboration,
K. Tanidis,
V. F. Cardone,
M. Martinelli,
I. Tutusaus,
S. Camera,
N. Aghanim,
A. Amara,
S. Andreon,
N. Auricchio,
M. Baldi,
S. Bardelli,
E. Branchini,
M. Brescia,
J. Brinchmann,
V. Capobianco,
C. Carbone,
J. Carretero,
S. Casas,
M. Castellano,
S. Cavuoti,
A. Cimatti,
R. Cledassou,
G. Congedo,
L. Conversi
, et al. (185 additional authors not shown)
Abstract:
The cosmological surveys that are planned for the current decade will provide us with unparalleled observations of the distribution of galaxies on cosmic scales, by means of which we can probe the underlying large-scale structure (LSS) of the Universe. This will allow us to test the concordance cosmological model and its extensions. However, precision pushes us to high levels of accuracy in the theoretical modelling of the LSS observables, so that no biases are introduced into the estimation of the cosmological parameters. In particular, effects such as redshift-space distortions (RSD) can become relevant in the computation of harmonic-space power spectra even for the clustering of the photometrically selected galaxies, as has previously been shown in literature. In this work, we investigate the contribution of linear RSD, as formulated in the Limber approximation by a previous work, in forecast cosmological analyses with the photometric galaxy sample of the Euclid survey. We aim to assess their impact and to quantify the bias on the measurement of cosmological parameters that would be caused if this effect were neglected. We performed this task by producing mock power spectra for photometric galaxy clustering and weak lensing, as is expected to be obtained from the Euclid survey. We then used a Markov chain Monte Carlo approach to obtain the posterior distributions of cosmological parameters from these simulated observations. When the linear RSD is neglected, significant biases are caused when galaxy correlations are used alone and when they are combined with cosmic shear in the so-called 3$\times$2pt approach. These biases can be equivalent to as much as $5\,σ$ when an underlying $Λ$CDM cosmology is assumed. When the cosmological model is extended to include the equation-of-state parameters of dark energy, the extension parameters can be shifted by more than $1\,σ$.
Submitted 22 April, 2024; v1 submitted 31 August, 2023;
originally announced September 2023.
-
Text Matching Improves Sequential Recommendation by Reducing Popularity Biases
Authors:
Zhenghao Liu,
Sen Mei,
Chenyan Xiong,
Xiaohua Li,
Shi Yu,
Zhiyuan Liu,
Yu Gu,
Ge Yu
Abstract:
This paper proposes Text mAtching based SequenTial rEcommendation model (TASTE), which maps items and users in an embedding space and recommends items by matching their text representations. TASTE verbalizes items and user-item interactions using identifiers and attributes of items. To better characterize user behaviors, TASTE additionally proposes an attention sparsity method, which enables TASTE to model longer user-item interactions by reducing the self-attention computations during encoding. Our experiments show that TASTE outperforms the state-of-the-art methods on widely used sequential recommendation datasets. TASTE alleviates the cold start problem by representing long-tail items using full-text modeling and bringing the benefits of pretrained language models to recommendation systems. Our further analyses illustrate that TASTE significantly improves the recommendation accuracy by reducing the popularity bias of previous item-ID-based recommendation models and returning more appropriate and text-relevant items to satisfy users. All code is available at https://github.com/OpenMatch/TASTE.
Submitted 27 August, 2023;
originally announced August 2023.
-
YOLOrtho -- A Unified Framework for Teeth Enumeration and Dental Disease Detection
Authors:
Shenxiao Mei,
Chenglong Ma,
Feihong Shen,
Huikai Wu
Abstract:
Detecting dental diseases through panoramic X-ray images is a standard procedure for dentists. Normally, a dentist needs to identify diseases and find the infected teeth. While numerous machine learning models adopting this two-step procedure have been developed, there has not been an end-to-end model that can identify teeth and their associated diseases at the same time. To fill the gap, we develop YOLOrtho, a unified framework for teeth enumeration and dental disease detection. We develop our model on Dentex Challenge 2023 data, which consists of three distinct types of annotated data. The first part is labeled with quadrant, the second part is labeled with quadrant and enumeration, and the third part is labeled with quadrant, enumeration, and disease. To further improve detection, we make use of the Tufts Dental public dataset. To fully utilize the data and learn both teeth detection and disease identification simultaneously, we formulate diseases as attributes attached to their corresponding teeth. Due to the positional relations inherent in teeth enumeration, we replace the convolution layer with CoordConv in our model to provide more position information. We also adjust the model architecture and insert one more upsampling layer in the FPN in favor of large object detection. Finally, we propose a post-processing strategy for teeth layout that corrects teeth enumeration based on linear sum assignment. Experimental results show that our model outperforms a large diffusion-based model.
Submitted 4 September, 2023; v1 submitted 11 August, 2023;
originally announced August 2023.
-
Deep Directly-Trained Spiking Neural Networks for Object Detection
Authors:
Qiaoyi Su,
Yuhong Chou,
Yifan Hu,
Jianing Li,
Shijie Mei,
Ziyang Zhang,
Guoqi Li
Abstract:
Spiking neural networks (SNNs) are brain-inspired energy-efficient models that encode information in spatiotemporal dynamics. Recently, deep SNNs trained directly have shown great success in achieving high performance on classification tasks with very few time steps. However, how to design a directly-trained SNN for the regression task of object detection remains a challenging problem. To address this problem, we propose EMS-YOLO, a novel directly-trained SNN framework for object detection, which is the first attempt to train a deep SNN with surrogate gradients for object detection rather than using ANN-SNN conversion strategies. Specifically, we design a full-spike residual block, EMS-ResNet, which can effectively extend the depth of the directly-trained SNN with low power consumption. Furthermore, we theoretically analyze and prove that EMS-ResNet can avoid gradient vanishing or exploding. The results demonstrate that our approach outperforms the state-of-the-art ANN-SNN conversion methods (which need at least 500 time steps) while using far fewer time steps (only 4). It is shown that our model can achieve performance comparable to the ANN with the same architecture while consuming 5.83 times less energy on the frame-based COCO Dataset and the event-based Gen1 Dataset.
Submitted 26 July, 2023; v1 submitted 21 July, 2023;
originally announced July 2023.
-
What can a Single Attention Layer Learn? A Study Through the Random Features Lens
Authors:
Hengyu Fu,
Tianyu Guo,
Yu Bai,
Song Mei
Abstract:
Attention layers -- which map a sequence of inputs to a sequence of outputs -- are core building blocks of the Transformer architecture which has achieved significant breakthroughs in modern artificial intelligence. This paper presents a rigorous theoretical study on the learning and generalization of a single multi-head attention layer, with a sequence of key vectors and a separate query vector as input. We consider the random feature setting where the attention layer has a large number of heads, with randomly sampled frozen query and key matrices, and trainable value matrices. We show that such a random-feature attention layer can express a broad class of target functions that are permutation invariant to the key vectors. We further provide quantitative excess risk bounds for learning these target functions from finite samples, using random feature attention with finitely many heads.
Our results feature several implications unique to the attention structure compared with existing random features theory for neural networks, such as (1) Advantages in the sample complexity over standard two-layer random-feature networks; (2) Concrete and natural classes of functions that can be learned efficiently by a random-feature attention layer; and (3) The effect of the sampling distribution of the query-key weight matrix (the product of the query and key matrix), where Gaussian random weights with a non-zero mean result in better sample complexities over the zero-mean counterpart for learning certain natural target functions. Experiments on simulated data corroborate our theoretical findings and further illustrate the interplay between the sample size and the complexity of the target function.
Submitted 21 July, 2023;
originally announced July 2023.
-
Euclid Preparation XXXIII. Characterization of convolutional neural networks for the identification of galaxy-galaxy strong lensing events
Authors:
Euclid Collaboration,
L. Leuzzi,
M. Meneghetti,
G. Angora,
R. B. Metcalf,
L. Moscardini,
P. Rosati,
P. Bergamini,
F. Calura,
B. Clément,
R. Gavazzi,
F. Gentile,
M. Lochner,
C. Grillo,
G. Vernardos,
N. Aghanim,
A. Amara,
L. Amendola,
S. Andreon,
N. Auricchio,
S. Bardelli,
C. Bodendorf,
D. Bonino,
E. Branchini,
M. Brescia
, et al. (194 additional authors not shown)
Abstract:
Forthcoming imaging surveys will potentially increase the number of known galaxy-scale strong lenses by several orders of magnitude. For this to happen, images of tens of millions of galaxies will have to be inspected to identify potential candidates. In this context, deep learning techniques are particularly suitable for finding patterns in large data sets, and convolutional neural networks (CNNs) in particular can efficiently process large volumes of images. We assess and compare the performance of three network architectures in the classification of strong lensing systems on the basis of their morphological characteristics. We train and test our models on different subsamples of a data set of forty thousand mock images, having characteristics similar to those expected in the wide survey planned with the ESA mission Euclid, gradually including larger fractions of faint lenses. We also evaluate the importance of adding information about the colour difference between the lens and source galaxies by repeating the same training on single-band and multi-band images. Our models find samples of clear lenses with $\gtrsim 90\%$ precision and completeness, without significant differences in the performance of the three architectures. Nevertheless, when including lenses with fainter arcs in the training set, the three models' performance deteriorates with accuracy values of $\sim 0.87$ to $\sim 0.75$ depending on the model. Our analysis confirms the potential of the application of CNNs to the identification of galaxy-scale strong lenses. We suggest that specific training with separate classes of lenses might be needed for detecting the faint lenses since the addition of the colour information does not yield a significant improvement in the current analysis, with the accuracy ranging from $\sim 0.89$ to $\sim 0.78$ for the different models.
Submitted 26 January, 2024; v1 submitted 17 July, 2023;
originally announced July 2023.
-
A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking
Authors:
Shaohui Mei,
Jiawei Lian,
Xiaofei Wang,
Yuru Su,
Mingyang Ma,
Lap-Pui Chau
Abstract:
Deep neural networks (DNNs) have found widespread applications in interpreting remote sensing (RS) imagery. However, it has been demonstrated in previous works that DNNs are vulnerable to different types of noise, particularly adversarial noise. Surprisingly, there has been a lack of comprehensive studies on the robustness of RS tasks, prompting us to undertake a thorough survey and benchmark on the robustness of image classification and object detection in RS. To the best of our knowledge, this study represents the first comprehensive examination of both natural robustness and adversarial robustness in RS tasks. Specifically, we have curated and made publicly available datasets that contain natural and adversarial noise. These datasets serve as valuable resources for evaluating the robustness of DNN-based models. To provide a comprehensive assessment of model robustness, we conducted meticulous experiments with numerous different classifiers and detectors, encompassing a wide range of mainstream methods. Through rigorous evaluation, we have uncovered insightful and intriguing findings, which shed light on the relationship between adversarial noise crafting and model training, yielding a deeper understanding of the susceptibility and limitations of various models, and providing guidance for the development of more resilient and robust models.
Submitted 15 September, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Thermo-osmotic slip flows around a thermophoretic microparticle characterized by optical trapping of tracers
Authors:
Tetsuro Tsuji,
Satoshi Mei,
Satoshi Taguchi
Abstract:
Thermo-osmotic flow around a microparticle in a liquid is characterized by observing and analyzing the distribution of tiny particles, i.e., tracers, near the microparticle's surface. First, an optical trapping laser is used to localize the tracer motion along a circular path near the circumference of the microparticle. Then, upon creating an overall temperature gradient in the liquid, the tracers on the circular path, originally uniformly distributed, gather towards the hotter side of the microparticle, indicating a flow along the particle toward the hot side. Analyzing the tracer distribution further, it is found that (i) the flow magnitude decreases with the distance from the surface, and (ii) changing the surface property of the microparticle results in a change in the flow magnitude. These findings show that the observed flow is a thermally induced slip flow along the microparticle's surface. Then, assuming a simple slip boundary condition for a fluid equation, we evaluate the magnitude of the slip coefficient based on two sets of experimental data: (i) the thermophoretic velocity of the microparticle and (ii) the thermo-osmotic flow around the microparticle. The results of the two approaches are in quantitative agreement. They are also compared with those of theoretical models for a slip flow in existing studies.
Submitted 13 June, 2023;
originally announced June 2023.
-
Impact of street canyon morphology on heat and fluid flow-an experimental water tunnel study using simultaneous PIV-LIF technique
Authors:
Yunpeng Xue,
Yongling Zhao,
Shuo-Jun Mei,
Yuan Chao,
Jan Carmeliet
Abstract:
Urban areas are known for their complex atmospheric environments, with the building morphology having a significant impact on local climate patterns, air quality, and overall urban microclimate. Understanding the heat transport and fluid flow in complex urban environments is crucial for improving urban climate resilience, which remains an open frontier in the field of urban studies. To gain a more profound insight into the physical processes occurring in urban areas, particularly within street canyons, we conducted an experimental investigation in a large-scale water tunnel. This study involved the simultaneous examination of heat and flow fields, carried out at high spatial and temporal resolutions, utilizing Laser-induced Fluorescence (LIF) for heat analysis and Particle Image Velocimetry (PIV) for flow analysis. Our results of heat and flow in different street canyons indicate that the flow is significantly influenced by a combination of factors, including canyon configuration, the presence of buoyant force, and the magnitude of the approaching flow. The ventilation rate and heat flux from the street canyon, which are key factors shaping the urban microclimate, are found to be dominated by the street canyon morphology. For instance, changing the aspect ratio of a street canyon results in a significant change of air ventilation rate, ranging from as low as 0.02 to as high as 1.5 under the same flow conditions. Additionally, canyons with high air ventilation rates exhibit significant heat flux removal at the canyon roof level, which is accurately described by the local Richardson number.
Submitted 19 October, 2023; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
Authors:
Yu Bai,
Fan Chen,
Huan Wang,
Caiming Xiong,
Song Mei
Abstract:
Neural sequence models based on the transformer architecture have demonstrated remarkable \emph{in-context learning} (ICL) abilities, where they can perform new tasks when prompted with training and test examples, without any parameter update to the model. This work first provides a comprehensive statistical theory for transformers to perform ICL. Concretely, we show that transformers can implement a broad class of standard machine learning algorithms in context, such as least squares, ridge regression, Lasso, learning generalized linear models, and gradient descent on two-layer neural networks, with near-optimal predictive power on various in-context data distributions. Using an efficient implementation of in-context gradient descent as the underlying mechanism, our transformer constructions admit mild size bounds, and can be learned with polynomially many pretraining sequences.
Building on these ``base'' ICL algorithms, intriguingly, we show that transformers can implement more complex ICL procedures involving \emph{in-context algorithm selection}, akin to what a statistician can do in real life -- A \emph{single} transformer can adaptively select different base ICL algorithms -- or even perform qualitatively different tasks -- on different input sequences, without any explicit prompting of the right algorithm or task. We both establish this in theory by explicit constructions, and also observe this phenomenon experimentally. In theory, we construct two general mechanisms for algorithm selection with concrete examples: pre-ICL testing, and post-ICL validation. As an example, we use the post-ICL validation mechanism to construct a transformer that can perform nearly Bayes-optimal ICL on a challenging task -- noisy linear models with mixed noise levels. Experimentally, we demonstrate the strong in-context algorithm selection capabilities of standard transformer architectures.
Submitted 6 July, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Euclid preparation. XXIX. Water ice in spacecraft part I: The physics of ice formation and contamination
Authors:
Euclid Collaboration,
M. Schirmer,
K. Thürmer,
B. Bras,
M. Cropper,
J. Martin-Fleitas,
Y. Goueffon,
R. Kohley,
A. Mora,
M. Portaluppi,
G. D. Racca,
A. D. Short,
S. Szmolka,
L. M. Gaspar Venancio,
M. Altmann,
Z. Balog,
U. Bastian,
M. Biermann,
D. Busonero,
C. Fabricius,
F. Grupp,
C. Jordi,
W. Löffler,
A. Sagristà Sellés,
N. Aghanim
, et al. (196 additional authors not shown)
Abstract:
Molecular contamination is a well-known problem in space flight. Water is the most common contaminant and alters numerous properties of a cryogenic optical system. Too much ice means that Euclid's calibration requirements and science goals cannot be met. Euclid must then be thermally decontaminated, a long and risky process. We need to understand how iced optics affect the data and when a decontamination is required. This is essential to build adequate calibration and survey plans, yet a comprehensive analysis in the context of an astrophysical space survey has not been done before.
In this paper we look at other spacecraft with well-documented outgassing records, and we review the formation of thin ice films. A mix of amorphous and crystalline ices is expected for Euclid. Their surface topography depends on the competing energetic needs of the substrate-water and the water-water interfaces, and is hard to predict with current theories. We illustrate that with scanning-tunnelling and atomic-force microscope images.
Industrial tools exist to estimate contamination, and we must understand their uncertainties. We find considerable knowledge errors on the diffusion and sublimation coefficients, limiting the accuracy of these tools. We developed a water transport model to compute contamination rates in Euclid, and find general agreement with industry estimates. Tests of the Euclid flight hardware in space simulators did not pick up contamination signals; our in-flight calibrations observations will be much more sensitive.
We must understand the link between the amount of ice on the optics and its effect on Euclid's data. Little research is available about this link, possibly because other spacecraft can decontaminate easily, quenching the need for a deeper understanding. In our second paper we quantify the various effects of iced optics on spectrophotometric data.
Submitted 23 May, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Intracluster light in the core of z~2 galaxy proto-clusters
Authors:
S. V. Werner,
N. A. Hatch,
J. Matharu,
A. H. Gonzalez,
Y. M. Bahé,
S. Mei,
G. Noirot,
D. Wylezalek
Abstract:
Intracluster light is thought to originate from stars that were ripped away from their parent galaxies by gravitational tides and galaxy interactions during the build up of the cluster. The stars from such interactions will accumulate over time, so semi-analytic models suggest that the abundance of intracluster stars is negligible in young proto-clusters at z~2 and grows to around a quarter of the stellar mass in the oldest, most mature clusters. In contrast to these theoretical expectations, we report on the detection of intracluster light within two proto-clusters at z=2 using deep HST images. We use the colour of the intracluster light to estimate its mass-to-light ratio in annuli around the brightest cluster galaxies (BCG), up to a radius of 100 kpc. We find that $54\pm5$% and $71\pm3$% of the stellar mass in these regions is located more than 10 kpc away from the BCGs in the two proto-clusters. This low concentration is similar to BCGs in lower redshift clusters, and distinct from other massive proto-cluster galaxies. This suggests that intracluster stars are already present within the core 100 kpc of proto-clusters. We compare these observations to the Hydrangea hydrodynamical galaxy cluster simulations and find that intracluster stars are predicted to be a generic feature of group-sized halos at z=2. These intracluster stars will gradually move further away from the BCG as the proto-cluster assembles into a cluster.
Submitted 12 May, 2023; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Euclid preparation. XXVII. A UV-NIR spectral atlas of compact planetary nebulae for wavelength calibration
Authors:
Euclid Collaboration,
K. Paterson,
M. Schirmer,
Y. Copin,
J. -C. Cuillandre,
W. Gillard,
L. A. Gutiérrez Soto,
L. Guzzo,
H. Hoekstra,
T. Kitching,
S. Paltani,
W. J. Percival,
M. Scodeggio,
L. Stanghellini,
P. N. Appleton,
R. Laureijs,
Y. Mellier,
N. Aghanim,
B. Altieri,
A. Amara,
N. Auricchio,
M. Baldi,
R. Bender,
C. Bodendorf,
D. Bonino
, et al. (179 additional authors not shown)
Abstract:
The Euclid mission will conduct an extragalactic survey over 15000 deg$^2$ of the extragalactic sky. The spectroscopic channel of the Near-Infrared Spectrometer and Photometer (NISP) has a resolution of $R\sim450$ for its blue and red grisms that collectively cover the $0.93$--$1.89$ $\mu$m range. NISP will obtain spectroscopic redshifts for $3\times10^7$ galaxies for the experiments on galaxy clustering, baryonic acoustic oscillations, and redshift space distortion. The wavelength calibration must be accurate within $5$ Å to avoid systematics in the redshifts and downstream cosmological parameters. The NISP pre-flight dispersion laws for the grisms were obtained on the ground using a Fabry-Perot etalon. Launch vibrations, zero gravity conditions, and thermal stabilisation may alter these dispersion laws, requiring an in-flight recalibration. To this end, we use the emission lines in the spectra of compact planetary nebulae (PNe), which were selected from a PN data base. To ensure completeness of the PN sample, we developed a novel technique to identify compact and strong line emitters in Gaia spectroscopic data using the Gaia spectra shape coefficients. We obtained VLT/X-SHOOTER spectra from $0.3$ to $2.5$ $\mu$m for 19 PNe in excellent seeing conditions and a wide slit, mimicking Euclid's slitless spectroscopy mode but with 10 times higher spectral resolution. Additional observations of one northern PN were obtained in the $0.80$--$1.90$ $\mu$m range with the GMOS and GNIRS instruments at the Gemini North observatory. The collected spectra were combined into an atlas of heliocentric vacuum wavelengths with a joint statistical and systematic accuracy of 0.1 Å in the optical and 0.3 Å in the near-infrared. The wavelength atlas and the related 1D and 2D spectra are made publicly available.
Submitted 25 April, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
Gender and Precarity in Astronomy
Authors:
N. A. Webb,
C. Bot,
S. Charpinet,
T. Contini,
L. Jouve,
H. Meheut,
S. Mei,
B. Mosser,
G. Soucail
Abstract:
Following the survey Well-being in astrophysics that was sent out in March 2021, to establish how astrophysics researchers, primarily in France, experience their career, some of the results were published in Webb et al. (2021). Here we further analyse the data to determine if gender can cause different experiences in astrophysics. We also study the impact on the well-being of temporary staff (primarily PhD students and postdocs), compared to permanent staff. Whilst more temporary staff stated that they felt permanently overwhelmed than permanent staff, the experiences in astrophysics for the different genders were in general very similar, except in one area. More than three times more females than males experienced harassment or discrimination, rising sharply for gender discrimination and sexual harassment, where all of those having experienced sexual harassment and who had provided their gender in the survey, were female. Further, as previously reported (Webb et al. 2021), 20% of the respondents had suffered mental health issues before starting their career in astrophysics. We found that whilst this group was split approximately equally with regards to males and females, the number rose sharply to almost 45% of astronomers experiencing mental health issues since starting in astrophysics. Of this population, there were 50% more females than males. This excess of females was almost entirely made up of the population of women that had been harassed or discriminated against.
Submitted 17 March, 2023;
originally announced March 2023.
-
Euclid: Validation of the MontePython forecasting tools
Authors:
S. Casas,
J. Lesgourgues,
N. Schöneberg,
Sabarish V. M.,
L. Rathmann,
M. Doerenkamp,
M. Archidiacono,
E. Bellini,
S. Clesse,
N. Frusciante,
M. Martinelli,
F. Pace,
D. Sapone,
Z. Sakr,
A. Blanchard,
T. Brinckmann,
S. Camera,
C. Carbone,
S. Ilić,
K. Markovic,
V. Pettorino,
I. Tutusaus,
N. Aghanim,
A. Amara,
L. Amendola
, et al. (102 additional authors not shown)
Abstract:
The Euclid mission of the European Space Agency will perform a survey of weak lensing cosmic shear and galaxy clustering in order to constrain cosmological models and fundamental physics. We expand and adjust the mock Euclid likelihoods of the MontePython software in order to match the exact recipes used in previous Euclid Fisher matrix forecasts for several probes: weak lensing cosmic shear, photometric galaxy clustering, the cross-correlation between the latter observables, and spectroscopic galaxy clustering. We also establish which precision settings are required when running the Einstein-Boltzmann solvers CLASS and CAMB in the context of Euclid. For the minimal cosmological model, extended to include dynamical dark energy, we perform Fisher matrix forecasts based directly on a numerical evaluation of second derivatives of the likelihood with respect to model parameters. We compare our results with those of other forecasting methods and tools. We show that such MontePython forecasts agree very well with previous Fisher forecasts published by the Euclid Collaboration, and also, with new forecasts produced by the CosmicFish code, now interfaced directly with the two Einstein-Boltzmann solvers CAMB and CLASS. Moreover, to establish the validity of the Gaussian approximation, we show that the Fisher matrix marginal error contours coincide with the credible regions obtained when running Monte Carlo Markov Chains with MontePython while using the exact same mock likelihoods. The new Euclid forecast pipelines presented here are ready for use with additional cosmological parameters, in order to explore extended cosmological models.
Submitted 16 March, 2023;
originally announced March 2023.
-
Exploring Epipolar Consistency Conditions for Rigid Motion Compensation in In-vivo X-ray Microscopy
Authors:
Mareike Thies,
Fabian Wagner,
Mingxuan Gu,
Siyuan Mei,
Yixing Huang,
Sabrina Pechmann,
Oliver Aust,
Daniela Weidner,
Georgiana Neag,
Stefan Uderhardt,
Georg Schett,
Silke Christiansen,
Andreas Maier
Abstract:
Intravital X-ray microscopy (XRM) in preclinical mouse models is of vital importance for the identification of microscopic structural pathological changes in the bone which are characteristic of osteoporosis. The complexity of this method stems from the requirement for high-quality 3D reconstructions of the murine bones. However, respiratory motion and muscle relaxation lead to inconsistencies in the projection data which result in artifacts in uncompensated reconstructions. Motion compensation using epipolar consistency conditions (ECC) has previously shown good performance in clinical CT settings. Here, we explore whether such algorithms are suitable for correcting motion-corrupted XRM data. Different rigid motion patterns are simulated and the quality of the motion-compensated reconstructions is assessed. The method is able to restore microscopic features for out-of-plane motion, but artifacts remain for more realistic motion patterns including all six degrees of freedom of rigid motion. Therefore, ECC is valuable for the initial alignment of the projection data followed by further fine-tuning of motion parameters using a reconstruction-based method.
Submitted 28 February, 2024; v1 submitted 1 March, 2023;
originally announced March 2023.
-
CBA: Contextual Background Attack against Optical Aerial Detection in the Physical World
Authors:
Jiawei Lian,
Xiaofei Wang,
Yuru Su,
Mingyang Ma,
Shaohui Mei
Abstract:
Patch-based physical attacks have increasingly aroused concerns.
However, most existing methods focus on obscuring targets captured on the ground, and some of these methods are simply extended to deceive aerial detectors.
They smear the targeted objects in the physical world with elaborate adversarial patches, which only slightly sway the aerial detectors' predictions and exhibit weak attack transferability.
To address the above issues, we propose the Contextual Background Attack (CBA), a novel physical attack framework against aerial detection, which can achieve strong attack efficacy and transferability in the physical world even without smudging the objects of interest at all.
Specifically, the targets of interest, i.e. the aircraft in aerial images, are adopted to mask adversarial patches.
The pixels outside the mask area are optimized to make the generated adversarial patches closely cover the critical contextual background area for detection, which gives the adversarial patches more robust and transferable attack potency in the real world.
To further strengthen the attack performance, the adversarial patches are forced to be outside targets during training, by which the detected objects of interest, both on and outside patches, benefit the accumulation of attack efficacy.
Consequently, the carefully designed patches achieve solid fooling efficacy against objects both on and outside the adversarial patches simultaneously.
Extensive proportionally scaled experiments are performed in physical scenarios, demonstrating the superiority and potential of the proposed framework for physical attacks.
We expect that the proposed physical attack method will serve as a benchmark for assessing the adversarial robustness of diverse aerial detectors and defense methods.
Submitted 23 March, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
-
Contextual adversarial attack against aerial detection in the physical world
Authors:
Jiawei Lian,
Xiaofei Wang,
Yuru Su,
Mingyang Ma,
Shaohui Mei
Abstract:
Deep Neural Networks (DNNs) have been extensively utilized in aerial detection. However, DNNs' sensitivity and vulnerability to maliciously crafted adversarial examples have progressively garnered attention. Recently, physical attacks have gradually become a hot issue because they are more practical in the real world, which poses great threats to some security-critical applications. In this paper, we make the first attempt to perform physical attacks in contextual form against aerial detection in the physical world. We propose an innovative contextual attack method against aerial detection in real scenarios, which achieves powerful attack performance and transfers well between various aerial object detectors without smearing or blocking the objects of interest to hide them. Based on the finding, from observing the detectors' attention maps, that the targets' contextual information plays an important role in aerial detection, we propose to make full use of the contextual area of the targets of interest to craft contextual perturbations for the uncovered attacks in real scenarios. Extensive proportionally scaled experiments are conducted to evaluate the effectiveness of the proposed contextual attack method, and they demonstrate the proposed method's superiority in both attack efficacy and physical practicality.
Submitted 26 February, 2023;
originally announced February 2023.
-
Euclid preparation. XXX. Performance assessment of the NISP Red-Grism through spectroscopic simulations for the Wide and Deep surveys
Authors:
Euclid Collaboration,
L. Gabarra,
C. Mancini,
L. Rodriguez Munoz,
G. Rodighiero,
C. Sirignano,
M. Scodeggio,
M. Talia,
S. Dusini,
W. Gillard,
B. R. Granett,
E. Maiorano,
M. Moresco,
L. Paganin,
E. Palazzi,
L. Pozzetti,
A. Renzi,
E. Rossetti,
D. Vergani,
V. Allevato,
L. Bisigello,
G. Castignani,
B. De Caro,
M. Fumana,
K. Ganga
, et al. (210 additional authors not shown)
Abstract:
This work focuses on the pilot run of a simulation campaign aimed at investigating the spectroscopic capabilities of the Euclid Near-Infrared Spectrometer and Photometer (NISP), in terms of continuum and emission line detection in the context of galaxy evolutionary studies. To this purpose we constructed, emulated, and analysed the spectra of 4992 star-forming galaxies at $0.3 \leq z \leq 2.5$ using the NISP pixel-level simulator. We built the spectral library starting from public multi-wavelength galaxy catalogues, with value-added information on spectral energy distribution (SED) fitting results, and from Bruzual and Charlot (2003) stellar population templates. Rest-frame optical and near-IR nebular emission lines were included using empirical and theoretical relations. We inferred the 3.5$σ$ NISP red grism spectroscopic detection limit of the continuum measured in the $H$ band for star-forming galaxies with a median disk half-light radius of $0.4''$ at magnitude $H= 19.5\pm0.2\,$AB$\,$mag for the Euclid Wide Survey and at $H = 20.8\pm0.6\,$AB$\,$mag for the Euclid Deep Survey. We found a very good agreement with the red grism emission line detection limit requirement for the Wide and Deep surveys. We characterised the effect of the galaxy shape on the detection capability of the red grism and highlighted the degradation of the quality of the extracted spectra as the disk size increases. In particular, we found that the extracted emission line signal to noise ratio (SNR) drops by $\sim\,$45$\%$ when the disk size ranges from $0.25''$ to $1''$. These trends lead to a correlation between the emission line SNR and the stellar mass of the galaxy and we demonstrate the effect in a stacking analysis unveiling emission lines otherwise too faint to detect.
Submitted 25 August, 2023; v1 submitted 18 February, 2023;
originally announced February 2023.
-
Euclid: Cosmology forecasts from the void-galaxy cross-correlation function with reconstruction
Authors:
S. Radinović,
S. Nadathur,
H. -A. Winther,
W. J. Percival,
A. Woodfinden,
E. Massara,
E. Paillas,
S. Contarini,
N. Hamaus,
A. Kovacs,
A. Pisani,
G. Verza,
M. Aubert,
A. Amara,
N. Auricchio,
M. Baldi,
D. Bonino,
E. Branchini,
M. Brescia,
S. Camera,
V. Capobianco,
C. Carbone,
V. F. Cardone,
J. Carretero,
M. Castellano
, et al. (96 additional authors not shown)
Abstract:
We investigate the cosmological constraints that can be expected from measurement of the cross-correlation of galaxies with cosmic voids identified in the Euclid spectroscopic survey, which will include spectroscopic information for tens of millions of galaxies over $15\,000$ deg$^2$ of the sky in the redshift range $0.9\leq z<1.8$. We do this using simulated measurements obtained from the Flagship mock catalogue, the official Euclid mock that closely matches the expected properties of the spectroscopic data set. To mitigate anisotropic selection-bias effects, we use a velocity field reconstruction method to remove large-scale redshift-space distortions from the galaxy field before void-finding. This allows us to accurately model contributions to the observed anisotropy of the cross-correlation function arising from galaxy velocities around voids as well as from the Alcock-Paczynski effect, and we study the dependence of constraints on the efficiency of reconstruction. We find that Euclid voids will be able to constrain the ratio of the transverse comoving distance $D_{\rm M}$ and Hubble distance $D_{\rm H}$ to a relative precision of about $0.3\%$, and the growth rate $fσ_8$ to a precision of between $5\%$ and $8\%$ in each of four redshift bins covering the full redshift range. In the standard cosmological model, this translates to a statistical uncertainty $ΔΩ_\mathrm{m}=\pm0.0028$ on the matter density parameter from voids, better than can be achieved from either Euclid galaxy clustering or weak lensing individually. We also find that voids alone can measure the dark energy equation of state to $6\%$ precision.
Submitted 9 October, 2023; v1 submitted 10 February, 2023;
originally announced February 2023.
-
Euclid preparation: XXVIII. Modelling of the weak lensing angular power spectrum
Authors:
Euclid Collaboration,
A. C. Deshpande,
T. Kitching,
A. Hall,
M. L. Brown,
N. Aghanim,
L. Amendola,
N. Auricchio,
M. Baldi,
R. Bender,
D. Bonino,
E. Branchini,
M. Brescia,
J. Brinchmann,
S. Camera,
G. P. Candini,
V. Capobianco,
C. Carbone,
V. F. Cardone,
J. Carretero,
F. J. Castander,
M. Castellano,
S. Cavuoti,
A. Cimatti,
R. Cledassou
, et al. (178 additional authors not shown)
Abstract:
This work considers which higher-order effects in modelling the cosmic shear angular power spectra must be taken into account for Euclid. We identify which terms are of concern, and quantify their individual and cumulative impact on cosmological parameter inference from Euclid. We compute the values of these higher-order effects using analytic expressions, and calculate the impact on cosmological parameter estimation using the Fisher matrix formalism. We review 24 effects and find the following potentially need to be accounted for: the reduced shear approximation, magnification bias, source-lens clustering, source obscuration, local Universe effects, and the flat Universe assumption. Upon computing these explicitly, and calculating their cosmological parameter biases, using a maximum multipole of $\ell=5000$, we find that the magnification bias, source-lens clustering, source obscuration, and local Universe terms individually produce significant ($\,>0.25σ$) cosmological biases in one or more parameters, and accordingly must be accounted for. In total, over all effects, we find biases in $Ω_{\rm m}$, $Ω_{\rm b}$, $h$, and $σ_{8}$ of $0.73σ$, $0.28σ$, $0.25σ$, and $-0.79σ$, respectively, for flat $Λ$CDM. For the $w_0w_a$CDM case, we find biases in $Ω_{\rm m}$, $Ω_{\rm b}$, $h$, $n_{\rm s}$, $σ_{8}$, and $w_a$ of $1.49σ$, $0.35σ$, $-1.36σ$, $1.31σ$, $-0.84σ$, and $-0.35σ$, respectively; which are increased relative to the $Λ$CDM due to additional degeneracies as a function of redshift and scale.
Submitted 9 February, 2023;
originally announced February 2023.
-
Lower Bounds for Learning in Revealing POMDPs
Authors:
Fan Chen,
Huan Wang,
Caiming Xiong,
Song Mei,
Yu Bai
Abstract:
This paper studies the fundamental limits of reinforcement learning (RL) in the challenging \emph{partially observable} setting. While it is well-established that learning in Partially Observable Markov Decision Processes (POMDPs) requires exponentially many samples in the worst case, a surge of recent work shows that polynomial sample complexities are achievable under the \emph{revealing condition}, a natural condition that requires the observables to reveal some information about the unobserved latent states. However, the fundamental limits for learning in revealing POMDPs are much less understood, with existing lower bounds being rather preliminary and having substantial gaps from the current best upper bounds.
We establish strong PAC and regret lower bounds for learning in revealing POMDPs. Our lower bounds scale polynomially in all relevant problem parameters in a multiplicative fashion, and achieve significantly smaller gaps against the current best upper bounds, providing a solid starting point for future studies. In particular, for \emph{multi-step} revealing POMDPs, we show that (1) the latent state-space dependence is at least $Ω(S^{1.5})$ in the PAC sample complexity, which is notably harder than the $\widetilde{Θ}(S)$ scaling for fully-observable MDPs; (2) any polynomial sublinear regret is at least $Ω(T^{2/3})$, suggesting its fundamental difference from the \emph{single-step} case where $\widetilde{O}(\sqrt{T})$ regret is achievable. Technically, our hard instance construction adapts techniques in \emph{distribution testing}, which is new to the RL literature and may be of independent interest.
Submitted 2 February, 2023;
originally announced February 2023.
-
Euclid preparation. XXXII. Evaluating the weak lensing cluster mass biases using the Three Hundred Project hydrodynamical simulations
Authors:
Euclid Collaboration,
C. Giocoli,
M. Meneghetti,
E. Rasia,
S. Borgani,
G. Despali,
G. F. Lesci,
F. Marulli,
L. Moscardini,
M. Sereno,
W. Cui,
A. Knebe,
G. Yepes,
T. Castro,
P. -S. Corasaniti,
S. Pires,
G. Castignani,
L. Ingoglia,
T. Schrabback,
G. W. Pratt,
A. M. C. Le Brun,
N. Aghanim,
L. Amendola,
N. Auricchio,
M. Baldi
, et al. (191 additional authors not shown)
Abstract:
The photometric catalogue of galaxy clusters extracted from ESA Euclid data is expected to be very competitive for cosmological studies. Using state-of-the-art hydrodynamical simulations, we present systematic analyses simulating the expected weak lensing profiles from clusters in a variety of dynamic states and at a wide range of redshifts. In order to derive cluster masses, we use a model consistent with the implementation within the Euclid Consortium of the dedicated processing function and find that, when jointly modelling mass and the concentration parameter of the Navarro-Frenk-White halo profile, the weak lensing masses tend to be, on average, biased low by 5-10% with respect to the true mass, up to z=0.5. Using a fixed value for the concentration $c_{200} = 3$, the mass bias is diminished below 5%, up to z=0.7, along with its relative uncertainty. Simulating the weak lensing signal by projecting along the directions of the axes of the moment of inertia tensor ellipsoid, we find that orientation matters: when clusters are oriented along the major axis, the lensing signal is boosted, and the recovered weak lensing mass is correspondingly overestimated. Typically, the weak lensing mass bias of individual clusters is modulated by the weak lensing signal-to-noise ratio, related to the redshift evolution of the number of galaxies used for weak lensing measurements: the negative mass bias tends to be larger toward higher redshifts. However, when we use a fixed value of the concentration parameter, the redshift evolution trend is reduced. These results provide a solid basis for the weak-lensing mass calibration required by the cosmological application of future cluster surveys from Euclid and Rubin.
Submitted 18 October, 2023; v1 submitted 1 February, 2023;
originally announced February 2023.
-
Euclid Preparation. XXVIII. Forecasts for ten different higher-order weak lensing statistics
Authors:
Euclid Collaboration,
V. Ajani,
M. Baldi,
A. Barthelemy,
A. Boyle,
P. Burger,
V. F. Cardone,
S. Cheng,
S. Codis,
C. Giocoli,
J. Harnois-Déraps,
S. Heydenreich,
V. Kansal,
M. Kilbinger,
L. Linke,
C. Llinares,
N. Martinet,
C. Parroni,
A. Peel,
S. Pires,
L. Porth,
I. Tereno,
C. Uhlemann,
M. Vicinanza,
S. Vinciguerra
, et al. (189 additional authors not shown)
Abstract:
Recent cosmic shear studies have shown that higher-order statistics (HOS) developed by independent teams now outperform standard two-point estimators in terms of statistical precision thanks to their sensitivity to the non-Gaussian features of large-scale structure. The aim of the Higher-Order Weak Lensing Statistics (HOWLS) project is to assess, compare, and combine the constraining power of ten different HOS on a common set of $Euclid$-like mocks, derived from N-body simulations. In this first paper of the HOWLS series, we computed the nontomographic ($Ω_{\rm m}$, $σ_8$) Fisher information for the one-point probability distribution function, peak counts, Minkowski functionals, Betti numbers, persistent homology Betti numbers and heatmap, and scattering transform coefficients, and we compare them to the shear and convergence two-point correlation functions in the absence of any systematic bias. We also include forecasts for three implementations of higher-order moments, but these cannot be robustly interpreted as the Gaussian likelihood assumption breaks down for these statistics. Taken individually, we find that each HOS outperforms the two-point statistics by a factor of around two in the precision of the forecasts with some variations across statistics and cosmological parameters. When combining all the HOS, this increases to a $4.5$ times improvement, highlighting the immense potential of HOS for cosmic shear cosmological analyses with $Euclid$. The data used in this analysis are publicly released with the paper.
Submitted 10 July, 2023; v1 submitted 30 January, 2023;
originally announced January 2023.
-
YOLO-CL: Galaxy cluster detection in the SDSS with deep machine learning
Authors:
Kirill Grishin,
Simona Mei,
Stéphane Ilic
Abstract:
(Abridged) Galaxy clusters are a powerful probe of cosmological models. Next generation large-scale optical and infrared surveys will reach unprecedented depths over large areas and require highly complete and pure cluster catalogs, with a well defined selection function. We have developed a new cluster detection algorithm YOLO-CL, which is a modified version of the state-of-the-art object detection deep convolutional network YOLO, optimized for the detection of galaxy clusters. We trained YOLO-CL on color images of the redMaPPer cluster detections in the SDSS. We find that YOLO-CL detects $95-98\%$ of the redMaPPer clusters, with a purity of $95-98\%$ calculated by applying the network to SDSS blank fields. When compared to the MCXC2021 X-ray catalog in the SDSS footprint, YOLO-CL is more complete than redMaPPer, which means that the neural network improved the cluster detection efficiency of its training sample. The YOLO-CL selection function is approximately constant with redshift, with respect to the MCXC2021 cluster mean X-ray surface brightness. YOLO-CL shows high performance when compared to traditional detection algorithms applied to SDSS. Deep learning networks benefit from a strong advantage over traditional galaxy cluster detection techniques because they do not need galaxy photometric and photometric redshift catalogs. This eliminates systematic uncertainties that can be introduced during source detection, and photometry and photometric redshift measurements. Our results show that YOLO-CL is an efficient alternative to traditional cluster detection methods. In general, this work shows that it is worth exploring the performance of deep convolution networks for future cosmological cluster surveys, such as the Rubin/LSST, Euclid or the Roman Space Telescope surveys.
Submitted 23 January, 2023;
originally announced January 2023.
-
A first-order augmented Lagrangian method for constrained minimax optimization
Authors:
Zhaosong Lu,
Sanyou Mei
Abstract:
In this paper we study a class of constrained minimax problems. In particular, we propose a first-order augmented Lagrangian method for solving them, whose subproblems turn out to be a much simpler structured minimax problem and are suitably solved by a first-order method recently developed in [26] by the authors. Under some suitable assumptions, an \emph{operation complexity} of ${\cal O}(\varepsilon^{-4}\log\varepsilon^{-1})$, measured by its fundamental operations, is established for the first-order augmented Lagrangian method for finding an $\varepsilon$-KKT solution of the constrained minimax problems.
Submitted 17 April, 2023; v1 submitted 5 January, 2023;
originally announced January 2023.
-
First-order penalty methods for bilevel optimization
Authors:
Zhaosong Lu,
Sanyou Mei
Abstract:
In this paper we study a class of unconstrained and constrained bilevel optimization problems in which the lower level is a possibly nonsmooth convex optimization problem, while the upper level is a possibly nonconvex optimization problem. We introduce a notion of $\varepsilon$-KKT solution for them and show that an $\varepsilon$-KKT solution leads to an $O(\sqrt{\varepsilon})$- or $O(\varepsilon)$-hypergradient based stationary point under suitable assumptions. We also propose first-order penalty methods for finding an $\varepsilon$-KKT solution of them, whose subproblems turn out to be a structured minimax problem and can be suitably solved by a first-order method recently developed by the authors. Under suitable assumptions, an \emph{operation complexity} of $O(\varepsilon^{-4}\log\varepsilon^{-1})$ and $O(\varepsilon^{-7}\log\varepsilon^{-1})$, measured by their fundamental operations, is established for the proposed penalty methods for finding an $\varepsilon$-KKT solution of the unconstrained and constrained bilevel optimization problems, respectively. Preliminary numerical results are presented to illustrate the performance of our proposed methods. To the best of our knowledge, this paper is the first work to demonstrate that bilevel optimization can be approximately solved as minimax optimization, and moreover, it provides the first implementable method with complexity guarantees for such sophisticated bilevel optimization.
Submitted 7 March, 2024; v1 submitted 4 January, 2023;
originally announced January 2023.
-
THMA: Tencent HD Map AI System for Creating HD Map Annotations
Authors:
Kun Tang,
Xu Cao,
Zhipeng Cao,
Tong Zhou,
Erlong Li,
Ao Liu,
Shengtao Zou,
Chang Liu,
Shuqi Mei,
Elena Sizikova,
Chao Zheng
Abstract:
Nowadays, autonomous vehicle technology is becoming more and more mature. Critical to progress and safety, high-definition (HD) maps, a type of centimeter-level map collected using a laser sensor, provide accurate descriptions of the surrounding environment. The key challenge of HD map production is efficient, high-quality collection and annotation of large-volume datasets. Due to the demand for high quality, HD map production requires significant manual human effort to create annotations, a very time-consuming and costly process for the map industry. In order to reduce manual annotation burdens, many artificial intelligence (AI) algorithms have been developed to pre-label the HD maps. However, there still exists a large gap between AI algorithms and the traditional manual HD map production pipelines in accuracy and robustness. Furthermore, it is also very resource-costly to build large-scale annotated datasets and advanced machine learning algorithms for AI-based HD map automatic labeling systems. In this paper, we introduce the Tencent HD Map AI (THMA) system, an innovative end-to-end, AI-based, active learning HD map labeling system capable of producing and labeling HD maps with a scale of hundreds of thousands of kilometers. In THMA, we train AI models directly from massive HD map datasets via supervised, self-supervised, and weakly supervised learning to achieve high accuracy and efficiency required by downstream users. THMA has been deployed by the Tencent Map team to provide services to downstream companies and users, serving over 1,000 labeling workers and producing more than 30,000 kilometers of HD map data per day at most. More than 90 percent of the HD map data in Tencent Map is labeled automatically by THMA, accelerating the traditional HD map labeling process by more than ten times.
Submitted 14 December, 2022;
originally announced December 2022.
-
Plausible deniability for privacy-preserving data synthesis
Authors:
Song Mei,
Zhiqiang Ye
Abstract:
In the field of privacy protection, publishing complete data (especially high-dimensional data sets) is one of the most challenging problems. Common encryption technology cannot withstand an attacker who uses a differential attack to obtain sensitive information, while existing differential privacy algorithms take a long time on high-dimensional computations and need to add noise, which reduces data accuracy, so they are not suitable for large high-dimensional data sets. In view of this situation, this paper designs a complete data synthesis scheme to protect data privacy around the concept of "plausible denial". Firstly, the paper provides the theoretical support for the difference between "plausible data" and "plausible data". In the process of scheme design, this paper decomposes the scheme into a data synthesis module and a privacy test module, then designs algorithm models for each of them and realizes the function of privacy protection. When evaluating the feasibility of the scheme, the paper selects the results of the 2013 community census in the United States as the high-dimensional data set and uses a Python-based simulation program to test and analyze the efficiency and reliability of the data synthesis scheme. This portion focuses on the evaluation of the privacy protection effectiveness of the scheme.
Submitted 13 December, 2022;
originally announced December 2022.
-
A Novel Location Free Link Prediction in Multiplex Social Networks
Authors:
Song Mei,
Cong Zhen
Abstract:
In recent decades, the emergence of social networks has enabled internet service providers (e.g., Facebook, Twitter and Uber) to achieve great commercial success. Link prediction is recognized as a common practice to build the topology of social networks and keep them evolving. Conventionally, link prediction methods depend on the location information of users, which is leaked from time to time. To deal with this problem, smart device companies (e.g., Apple Inc.) keep tightening their privacy policies, impeding internet service providers from acquiring location information. Therefore, it is of great importance to design location free link prediction methods that still preserve accuracy. In this study, a novel location free link prediction method is proposed for complex social networks. Experiments on real datasets show that the precision of our location free link prediction method increases by 10 percent.
Submitted 13 December, 2022;
originally announced December 2022.
-
The galaxy mass-size relation in CARLA clusters and proto-clusters at 1.4 < z < 2.8: larger cluster galaxy sizes
Authors:
Anton V. Afanasiev,
Simona Mei,
Hao Fu,
Francesco Shankar,
Stefania Amodeo,
Daniel Stern,
Elizabeth A. Cooke,
Anthony H. Gonzalez,
Gaël Noirot,
Alessandro Rettura,
Dominika Wylezalek,
Carlos De Breuck,
Nina A. Hatch,
Spencer A. Stanford,
Joël Vernet
Abstract:
(Abridged) We study the galaxy mass-size relation in CARLA spectroscopically confirmed clusters at $1.4<z<2.8$, which span a total stellar mass $11.3<\mathrm{log}(M^c_*/M_{\odot})<12.6$ (halo mass $13.5 \lesssim \mathrm{log}(M^c_h/M_{\odot}) \lesssim 14.5$). Our main finding is that cluster passive ETG at $z \gtrsim 1.5$ with ${\rm log}(M/M_{\odot})>10.5$ are systematically $\gtrsim 0.2-0.3~{\rm dex}$ larger than field ETGs. The passive ETG average size evolution is slower at $1<z<2$ when compared to the field. This could be explained by differences in the formation and early evolution of galaxies in haloes of a different mass. Strong compaction and gas dissipation in field galaxies, followed by a sequence of mergers may have also played a significant role in the field ETG evolution, but not in the evolution of cluster galaxies. Our passive ETG mass-size relation shows a tendency to flatten at $9.6<{\rm log}(M/M_{\odot})<10.5$, where the average size is $\mathrm{log}(R_e/\mathrm{kpc}) = 0.05 \pm 0.22$. This implies that galaxies in the low end of the mass-size relation do not evolve much from $z\sim 2$ to the present, and that their sizes evolve in a similar way in clusters and in the field. BCGs lie on the same mass-size relation as satellites, suggesting that their size evolution is not different at redshift z $\gtrsim$ 2. Half of the active ETGs ($\sim 30\%$ of the ETGs) follow the field passive galaxy mass-size relation, and the other half follow the field active galaxy mass-size relation. These galaxies likely went through a recent merger or neighbor galaxy interaction, and would most probably quench at a later epoch and increase the fraction of passive ETGs in clusters. We do not observe a large population of compact galaxies, as is observed in the field at these redshifts, implying that the galaxies in our clusters are not observed in an epoch close to their compaction.
Submitted 2 February, 2023; v1 submitted 30 November, 2022;
originally announced December 2022.
-
Online Optimization in Power Systems with High Penetration of Renewable Generation: Advances and Prospects
Authors:
Zhaojian Wang,
Wei Wei,
John Zhen Fu Pang,
Feng Liu,
Bo Yang,
** Guan,
Shengwei Mei
Abstract:
Traditionally, offline optimization of power systems is acceptable due to the largely predictable loads and reliable generation. The increasing penetration of fluctuating renewable generation and Internet-of-Things devices allowing for fine-grained controllability of loads have led to the diminishing applicability of offline optimization in the power systems domain, and have redirected attention to online optimization methods. However, online optimization is a broad topic that can be applied in and motivated by different settings, operated on different time scales, and built on different theoretical foundations. This paper reviews the various types of online optimization techniques used in the power systems domain and aims to make clear the distinction between the most common techniques used. In particular, we introduce and compare four distinct techniques used covering the breadth of online optimization techniques used in the power systems domain, i.e., optimization-guided dynamic control, feedback optimization for single-period problems, Lyapunov-based optimization, and online convex optimization techniques for multi-period problems. Lastly, we recommend some potential future directions for online optimization in the power systems domain.
Submitted 26 November, 2022;
originally announced November 2022.
-
Euclid preparation. XXVII. Covariance model validation for the 2-point correlation function of galaxy clusters
Authors:
Euclid Collaboration,
A. Fumagalli,
A. Saro,
S. Borgani,
T. Castro,
M. Costanzi,
P. Monaco,
E. Munari,
E. Sefusatti,
N. Aghanim,
N. Auricchio,
M. Baldi,
C. Bodendorf,
D. Bonino,
E. Branchini,
M. Brescia,
J. Brinchmann,
S. Camera,
V. Capobianco,
C. Carbone,
J. Carretero,
F. J. Castander,
M. Castellano,
S. Cavuoti,
R. Cledassou
, et al. (169 additional authors not shown)
Abstract:
Aims. We validate a semi-analytical model for the covariance of the real-space 2-point correlation function of galaxy clusters. Methods. Using 1000 PINOCCHIO light cones mimicking the expected Euclid sample of galaxy clusters, we calibrate a simple model to accurately describe the clustering covariance. Then, we use such a model to quantify the likelihood analysis response to variations of the covariance, and investigate the impact of a cosmology-dependent matrix at the level of statistics expected for the Euclid survey of galaxy clusters. Results. We find that a Gaussian model with Poissonian shot-noise does not correctly predict the covariance of the 2-point correlation function of galaxy clusters. By introducing a few additional parameters fitted from simulations, the proposed model reproduces the numerical covariance with 10 per cent accuracy, with differences of about 5 per cent on the figure of merit of the cosmological parameters $\Omega_{\rm m}$ and $\sigma_8$. Also, we find that the cosmology-dependence of the covariance adds valuable information that is not contained in the mean value, significantly improving the constraining power of cluster clustering. Finally, we find that the cosmological figure of merit can be further improved by taking mass binning into account. Our results have significant implications for the derivation of cosmological constraints from the 2-point clustering statistics of the Euclid survey of galaxy clusters.
Submitted 23 November, 2022;
originally announced November 2022.
-
Near-optimal multiple testing in Bayesian linear models with finite-sample FDR control
Authors:
Taejoo Ahn,
Licong Lin,
Song Mei
Abstract:
In high dimensional variable selection problems, statisticians often seek to design multiple testing procedures that control the False Discovery Rate (FDR), while concurrently identifying a greater number of relevant variables. Model-X methods, such as Knockoffs and conditional randomization tests, achieve the primary goal of finite-sample FDR control, assuming a known distribution of covariates. However, whether these methods can also achieve the secondary goal of maximizing discoveries remains uncertain. In fact, designing procedures to discover more relevant variables with finite-sample FDR control is a largely open question, even within the arguably simplest linear models.
In this paper, we develop near-optimal multiple testing procedures for high dimensional Bayesian linear models with isotropic covariates. We introduce Model-X procedures that provably control the frequentist FDR from finite samples, even when the model is misspecified, and conjecturally achieve near-optimal power when the data follow the Bayesian linear model. Our proposed procedure, PoEdCe, incorporates three key ingredients: Posterior Expectation, distilled Conditional randomization test (dCRT), and the Benjamini-Hochberg procedure with e-values (eBH). The optimality conjecture of PoEdCe is based on a heuristic calculation of its asymptotic true positive proportion (TPP) and false discovery proportion (FDP), which is supported by methods from statistical physics as well as extensive numerical simulations. Our result establishes the Bayesian linear model as a benchmark for comparing the power of various multiple testing procedures.
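The eBH ingredient named above is the Benjamini-Hochberg procedure applied to e-values. A minimal Python sketch of that step alone follows, assuming the e-values have already been computed. The function name `ebh` and the example numbers are illustrative and not taken from the paper, and the Posterior Expectation and dCRT steps are not shown.

```python
from typing import List


def ebh(e_values: List[float], alpha: float) -> List[int]:
    """Return the indices rejected by the e-BH procedure at level alpha.

    Rule: let e_(k) be the k-th largest e-value among n hypotheses.
    Find the largest k with e_(k) >= n / (alpha * k) and reject the
    hypotheses holding the k largest e-values.
    """
    n = len(e_values)
    # Indices sorted by e-value, largest first.
    order = sorted(range(n), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # Keep the largest k whose k-th largest e-value clears the threshold.
        if e_values[idx] >= n / (alpha * k):
            k_star = k
    return sorted(order[:k_star])


if __name__ == "__main__":
    # Hypothetical e-values for four hypotheses, target FDR level 0.1.
    print(ebh([50.0, 1.0, 30.0, 0.5], alpha=0.1))
```

With these hypothetical inputs the thresholds are 40 for k = 1 and 20 for k = 2, so the hypotheses with e-values 50 and 30 are rejected.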
Submitted 21 July, 2023; v1 submitted 4 November, 2022;
originally announced November 2022.
-
Benchmarking Adversarial Patch Against Aerial Detection
Authors:
Jiawei Lian,
Shaohui Mei,
Shun Zhang,
Mingyang Ma
Abstract:
DNNs are vulnerable to adversarial examples, which poses great security concerns for security-critical systems. In this paper, a novel adaptive-patch-based physical attack (AP-PA) framework is proposed, which aims to generate adversarial patches that are adaptive in both physical dynamics and varying scales, and by which the particular targets can be hidden from being detected. Furthermore, the adversarial patch is effective against all targets of the same class when placed outside the target (no need to smear targeted objects) and is robust enough in the physical world. In addition, a new loss is devised to consider more available information of detected objects to optimize the adversarial patch, which can significantly improve the patch's attack efficacy (average precision drop up to 87.86% and 85.48% in white-box and black-box settings, respectively) and optimization efficiency. We also establish one of the first comprehensive, coherent, and rigorous benchmarks to evaluate the attack efficacy of adversarial patches on aerial detection tasks. Finally, several proportionally scaled experiments are performed physically to demonstrate that the elaborated adversarial patches can successfully deceive aerial detection algorithms in dynamic physical circumstances. The code is available at https://github.com/JiaweiLian/AP-PA.
Submitted 30 October, 2022;
originally announced October 2022.
-
Enhanced Hydrogen Evolution Catalysis of Pentlandite due to the Increases in Coordination Number and Sulfur Vacancy during Cubic-Hexagonal Phase Transition
Authors:
Yuegao Liu,
Chao Cai,
Shengcai Zhu,
Zhi Zheng,
Guowu Li,
Haiyan Chen,
Chao Li,
Haiyan Sun,
I-Ming Chou,
Yanan Yu,
Shenghua Mei,
Li** Wang
Abstract:
The search for new phases is an important direction in materials science. Phase transitions of sulfides such as MoS2 and WS2 result in significant changes in catalytic performance. Cubic pentlandite [cPn, (Fe, Ni)9S8] can be a functional material in batteries, solar cells, and catalytic fields. However, no report about the material properties of other phases of pentlandite exists. In this study, the unit-cell parameters of a new phase of pentlandite, sulfur-vacancy enriched hexagonal pentlandite (hPn), and the phase boundary between cPn and hPn were determined for the first time. Compared to cPn, hPn shows a higher coordination number, more sulfur vacancies, and higher conductivity, which result in significantly higher hydrogen evolution performance than that of cPn and make the non-nano rock catalyst hPn superior to most other known nanosulfide catalysts. The increase of sulfur vacancies during phase transition provides a new approach to designing functional materials.
Submitted 14 March, 2024; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Accelerating the assembly of defect-free atomic arrays with maximum parallelisms
Authors:
Shuai Wang,
Wenjun Zhang,
Tao Zhang,
Shuyao Mei,
Yuqing Wang,
Jiazhong Hu,
Wenlan Chen
Abstract:
Defect-free atomic arrays have been demonstrated as a scalable and fully-controllable platform for quantum simulations and quantum computations. To push the qubit size limit of this platform further, we design an integrated measurement and feedback system, based on a field programmable gate array (FPGA), to quickly assemble two-dimensional defect-free atomic arrays using maximum parallelisms. The total time cost of the rearrangement is first reduced by processing atom detection, atomic occupation analysis, rearrangement strategy formulation, and acousto-optic deflector (AOD) driving signal generation in parallel in time. Then, by simultaneously moving multiple atoms in the same row (column), we save rearrangement time by parallelism in space. To best utilize these parallelisms, we propose a new algorithm, named the Tetris algorithm, to reassemble atoms into arbitrary target array geometries from two-dimensional stochastically loaded atomic arrays. For an $L \times L$ target array geometry, the number of moves scales as $L$, and the total rearrangement time scales at most as $L^2$. We present the overall performance for different target geometries, and demonstrate a significant reduction in rearrangement time and the potential to scale up defect-free atomic array systems to thousands of qubits.
Submitted 12 May, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
-
Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms
Authors:
Fan Chen,
Yu Bai,
Song Mei
Abstract:
Partial Observability -- where agents can only observe partial information about the true underlying state of the system -- is ubiquitous in real-world applications of Reinforcement Learning (RL). Theoretically, learning a near-optimal policy under partial observability is known to be hard in the worst case due to an exponential sample complexity lower bound. Recent work has identified several tractable subclasses that are learnable with polynomial samples, such as Partially Observable Markov Decision Processes (POMDPs) with certain revealing or decodability conditions. However, this line of research is still in its infancy, where (1) unified structural conditions enabling sample-efficient learning are lacking; (2) existing sample complexities for known tractable subclasses are far from sharp; and (3) fewer sample-efficient algorithms are available than in fully observable RL.
This paper advances all three aspects above for Partially Observable RL in the general setting of Predictive State Representations (PSRs). First, we propose a natural and unified structural condition for PSRs called \emph{B-stability}. B-stable PSRs encompass the vast majority of known tractable subclasses such as weakly revealing POMDPs, low-rank future-sufficient POMDPs, decodable POMDPs, and regular PSRs. Next, we show that any B-stable PSR can be learned with polynomial samples in relevant problem parameters. When instantiated in the aforementioned subclasses, our sample complexities improve substantially over the current best ones. Finally, our results are achieved by three algorithms simultaneously: Optimistic Maximum Likelihood Estimation, Estimation-to-Decisions, and Model-Based Optimistic Posterior Sampling. The latter two algorithms are new for sample-efficient learning of POMDPs/PSRs.
Submitted 16 December, 2022; v1 submitted 29 September, 2022;
originally announced September 2022.