-
Robust field-level inference with dark matter halos
Authors:
Helen Shao,
Francisco Villaescusa-Navarro,
Pablo Villanueva-Domingo,
Romain Teyssier,
Lehman H. Garrison,
Marco Gatti,
Derek Inman,
Yueying Ni,
Ulrich P. Steinwandel,
Mihir Kulkarni,
Eli Visbal,
Greg L. Bryan,
Daniel Angles-Alcazar,
Tiago Castro,
Elena Hernandez-Martinez,
Klaus Dolag
Abstract:
We train graph neural networks on halo catalogues from Gadget N-body simulations to perform field-level likelihood-free inference of cosmological parameters. The catalogues contain $\lesssim$5,000 halos with masses $\gtrsim 10^{10}~h^{-1}M_\odot$ in a periodic volume of $(25~h^{-1}{\rm Mpc})^3$; every halo in the catalogue is characterized by several properties such as position, mass, velocity, concentration, and maximum circular velocity. Our models, built to be permutationally, translationally, and rotationally invariant, do not impose a minimum scale on which to extract information and are able to infer the values of $Ω_{\rm m}$ and $σ_8$ with a mean relative error of $\sim6\%$, when using positions plus velocities and positions plus masses, respectively. More importantly, we find that our models are very robust: they can infer the value of $Ω_{\rm m}$ and $σ_8$ when tested using halo catalogues from thousands of N-body simulations run with five different N-body codes: Abacus, CUBEP$^3$M, Enzo, PKDGrav3, and Ramses. Surprisingly, the model trained to infer $Ω_{\rm m}$ also works when tested on thousands of state-of-the-art CAMELS hydrodynamic simulations run with four different codes and subgrid physics implementations. Using halo properties such as concentration and maximum circular velocity allows our models to extract more information, at the expense of breaking the robustness of the models. This may happen because the different N-body codes are not converged on the relevant scales corresponding to these parameters.
Submitted 14 September, 2022;
originally announced September 2022.
-
N$^3$LO+N$^3$LL QCD improved Higgs pair cross sections
Authors:
A. H. Ajjath,
Hua-Sheng Shao
Abstract:
We report a new calculation of the soft-gluon threshold resummation for Higgs boson pair production in the dominant production mode -- gluon-gluon fusion -- up to next-to-next-to-next-to-leading logarithmic (N$^3$LL) accuracy. After matching N$^3$LL to the next-to-next-to-next-to-leading order (N$^3$LO) QCD calculation in the infinite top quark mass approximation, we show that the central values of the inclusive cross sections are quite stable with respect to N$^3$LO, while the conventional renormalisation and factorisation scale uncertainties are reduced by a factor of two, reaching the subpercent level. Our study further consolidates the good asymptotic perturbative convergence. After combining with the full top-quark mass dependent next-to-leading order QCD results, our most advanced predictions are presented for both the inclusive total cross sections and the differential invariant mass distributions of the Higgs pair.
Submitted 10 February, 2023; v1 submitted 8 September, 2022;
originally announced September 2022.
-
Structural Adaptivity of Directed Networks
Authors:
Lulu Pan,
Haibin Shao,
Mehran Mesbahi,
Dewei Li,
Yugeng Xi
Abstract:
Network structure plays a critical role in functionality and performance of network systems. This paper examines structural adaptivity of diffusively coupled, directed multi-agent networks that are subject to diffusion performance. Inspired by the observation that the link redundancy in a network may degrade its diffusion performance, a distributed data-driven neighbor selection framework is proposed to adaptively adjust the network structure for improving the diffusion performance of exogenous influence over the network. Specifically, each agent is allowed to interact with only a specific subset of neighbors while global reachability from exogenous influence to all agents of the network is maintained. Both continuous-time and discrete-time directed networks are examined. For each of the two cases, we first examine the reachability properties encoded in the eigenvectors of perturbed variants of graph Laplacian or SIA matrix associated with directed networks, respectively. Then, an eigenvector-based rule for neighbor selection is proposed to derive a reduced network, on which the diffusion performance is enhanced. Finally, motivated by the necessity of distributed and data-driven implementation of the neighbor selection rule, quantitative connections between eigenvectors of the perturbed graph Laplacian and SIA matrix and relative rate of change in agent state are established, respectively. These connections immediately enable a data-driven inference of the reduced neighbor set for each agent using only locally accessible data. As an immediate extension, we further discuss the distributed data-driven construction of directed spanning trees of directed networks using the proposed neighbor selection framework. Numerical simulations are provided to demonstrate the theoretical results.
Submitted 28 August, 2022;
originally announced August 2022.
-
ASTRO: An AST-Assisted Approach for Generalizable Neural Clone Detection
Authors:
Yifan Zhang,
Junwen Yang,
Haoyu Dong,
Qingchen Wang,
Huajie Shao,
Kevin Leach,
Yu Huang
Abstract:
Neural clone detection has attracted the attention of software engineering researchers and practitioners. However, most neural clone detection methods do not generalize beyond the scope of clones that appear in the training dataset. This results in poor model performance, especially in terms of model recall. In this paper, we present an Abstract Syntax Tree (AST) assisted approach for generalizable neural clone detection, or ASTRO, a framework for finding clones in codebases reflecting industry practices. We present three main components: (1) an AST-inspired representation for source code that leverages program structure and semantics, (2) a global graph representation that captures the context of an AST among a corpus of programs, and (3) a graph embedding for programs that, in combination with extant large-scale language models, improves state-of-the-art code clone detection. Our experimental results show that ASTRO improves state-of-the-art neural clone detection approaches in both recall and F-1 scores.
Submitted 17 August, 2022;
originally announced August 2022.
-
Safety-Enhanced Autonomous Driving Using Interpretable Sensor Fusion Transformer
Authors:
Hao Shao,
Letian Wang,
RuoBing Chen,
Hongsheng Li,
Yu Liu
Abstract:
Large-scale deployment of autonomous vehicles has been continually delayed due to safety concerns. On the one hand, comprehensive scene understanding is indispensable, a lack of which would result in vulnerability to rare but complex traffic situations, such as the sudden emergence of unknown objects. However, reasoning from a global context requires access to sensors of multiple types and adequate fusion of multi-modal sensor signals, which is difficult to achieve. On the other hand, the lack of interpretability in learning models also hampers safety, since failure causes cannot be verified. In this paper, we propose a safety-enhanced autonomous driving framework, named Interpretable Sensor Fusion Transformer (InterFuser), to fully process and fuse information from multi-modal multi-view sensors for achieving comprehensive scene understanding and adversarial event detection. In addition, our framework generates intermediate interpretable features, which provide more semantics and are exploited to better constrain actions to be within the safe sets. We conducted extensive experiments on CARLA benchmarks, where our model outperforms prior methods, ranking first on the public CARLA Leaderboard. Our code will be made available at https://github.com/opendilab/InterFuser
Submitted 7 December, 2022; v1 submitted 28 July, 2022;
originally announced July 2022.
-
gamma-UPC: Automated generation of exclusive photon-photon processes in ultraperipheral proton and nuclear collisions with varying form factors
Authors:
Hua-Sheng Shao,
David d'Enterria
Abstract:
The automated generation of arbitrary exclusive final states produced via photon fusion in ultraperipheral high-energy collisions of protons and/or nuclei is implemented in the MadGraph5_aMC@NLO and HelacOnia Monte Carlo codes. Cross sections are calculated in the equivalent photon approximation using $γ$ fluxes derived from electric dipole and charge form factors, and incorporating hadronic survival probabilities. Multiple examples of $γγ$ cross sections computed with this setup, named gamma-UPC, are presented for proton-proton, proton-nucleus, and nucleus-nucleus ultraperipheral collisions (UPCs) at the Large Hadron Collider and Future Circular Collider. Total photon-fusion cross sections for the exclusive production of spin-0,2 resonances (quarkonia, ditauonium, and Higgs boson; as well as axions and gravitons), and for pairs of particles ($J/ψJ/ψ$, WW, ZZ, Z$γ$, $t\bar{t}$, HH) are presented. Differential cross sections for exclusive dileptons and light-by-light scattering are compared to LHC data. This development paves the way for the upcoming automatic event generation of any UPC final state with electroweak corrections at next-to-leading-order accuracy and beyond.
Submitted 27 September, 2022; v1 submitted 6 July, 2022;
originally announced July 2022.
-
High thermoelectric performances in PbP monolayers considering full electron-phonon coupling and four-phonon scattering processes
Authors:
Ao Wu,
Yiming Zhang,
Yujie Xia,
Lei Peng,
Heyuan Zhu,
Hezhu Shao,
Hao Zhang
Abstract:
The band convergence strategy, which improves the Seebeck coefficient by inducing multiple valleys in the band structure, has been widely used to enhance thermoelectric (TE) performance. However, the phonon-assisted intervalley scattering effect is neglected and the mode-selection rules remain unclear. In this work, TE properties for $α$-, $β$- and $γ$-PbP are investigated under the consideration of full mode-, energy- and momentum-resolved electron-phonon interactions (EPI). Group theory is used to analyze the selection rules for EPI matrix elements. Our calculations reveal that the intervalley scattering contributes non-trivially to the total carrier relaxation time, and that the intervalley scattering can be modulated through crystal symmetry. In addition, the investigation of the thermal properties reveals that the four-phonon scattering effect dominates the phonon relaxation processes, since the three-phonon scattering is suppressed due to the significantly large acoustic-optical phonon bandgap in $α$-, $β$- and $γ$-PbP. By considering the full EPI effect and high-order phonon scattering processes, the calculated ZT values reach 0.90, 0.24 and 1.25 for $α$-, $β$- and $γ$-PbP, respectively, indicating their promising applications in thermoelectric devices.
Submitted 5 June, 2022;
originally announced June 2022.
-
Are Large Pre-Trained Language Models Leaking Your Personal Information?
Authors:
Jie Huang,
Hanyin Shao,
Kevin Chen-Chuan Chang
Abstract:
Are Large Pre-Trained Language Models Leaking Your Personal Information? In this paper, we analyze whether Pre-Trained Language Models (PLMs) are prone to leaking personal information. Specifically, we query PLMs for email addresses with contexts of the email address or prompts containing the owner's name. We find that PLMs do leak personal information due to memorization. However, since the models are weak at association, the risk of specific personal information being extracted by attackers is low. We hope this work could help the community to better understand the privacy risk of PLMs and bring new insights to make PLMs safe.
Submitted 20 October, 2022; v1 submitted 25 May, 2022;
originally announced May 2022.
-
Ditauonium spectroscopy
Authors:
David d'Enterria,
Redamy Perez-Ramos,
Hua-Sheng Shao
Abstract:
We examine the properties of ditauonium, an exotic atom consisting of a pair of opposite-sign $τ$ leptons bound together by the quantum electrodynamics (QED) interaction in a hydrogen-like state. The energy levels, decay modes and associated partial widths, as well as total widths and lifetimes of the ortho- and para-ditauonium states are calculated. Higher-order QED effects -- including Lamb shifts, hyperfine splitting structure, and partial decay width corrections -- are incorporated up to approximately next-to-next-to-leading-order (NNLO) accuracy. Beyond the dominant diphoton and difermion decays, the rates of rare decay channels -- including Dalitz, radiative, triple-photon, double-Dalitz, four-fermion, and neutrino final states -- are determined.
Submitted 2 November, 2022; v1 submitted 14 April, 2022;
originally announced April 2022.
-
Revenue Management Under the Markov Chain Choice Model with Joint Price and Assortment Decisions
Authors:
Anton J. Kleywegt,
Hongzhang Shao
Abstract:
Finding the optimal product prices and product assortment are two fundamental problems in revenue management. Usually, a seller needs to jointly determine the prices and assortment while managing a network of resources with limited capacity. However, there is not yet a tractable method to efficiently solve such a problem. Existing papers studying static joint optimization of price and assortment cannot incorporate resource constraints. We therefore study the revenue management problem with resource constraints and price bounds, where the prices and the product assortments need to be jointly determined over time. We show that under the Markov chain (MC) choice model (which subsumes the multinomial logit (MNL) model), the choice-based joint optimization problem can be reformulated as a tractable convex conic optimization problem. We also prove that an optimal solution with a constant price vector exists even with constraints on resources. In addition, a solution with both constant assortment and price vector can be optimal when there is no resource constraint.
Submitted 10 April, 2022;
originally announced April 2022.
-
Lensless coherent diffraction imaging based on spatial light modulator with unknown modulation curve
Authors:
Hao Sha,
Chao He,
Shaowei Jiang,
Pengming Song,
Shuai Liu,
Wenzhen Zou,
Peiwu Qin,
Haoqian Wang,
Yongbing Zhang
Abstract:
Lensless imaging has become a popular research field in recent years owing to its small size, wide field-of-view and low aberration. However, some traditional lensless imaging methods suffer from slow convergence, mechanical errors and conjugate solution interference, which limit their further application and development. In this work, we propose a lensless imaging method based on a spatial light modulator (SLM) with unknown modulation curve. In our imaging system, we use the SLM to modulate the wavefront of the object, and introduce a ptychographic scanning algorithm that is able to recover the complex amplitude information even when the SLM modulation curve is inaccurate or unknown. In addition, we design a split-beam interference experiment to calibrate the modulation curve of the SLM; using the calibrated modulation function as the initial value of the extended ptychography iterative engine (ePIE) algorithm can improve the convergence speed. We further analyze the effect of the modulation function, algorithm parameters and the characteristics of the coherent light source on the quality of the reconstructed image. The simulated and real experiments show that the proposed method is superior to traditional mechanical scanning methods in terms of recovery speed and accuracy, with a recovered resolution of up to 14 μm.
Submitted 8 April, 2022;
originally announced April 2022.
-
Induced dynamics of non-autonomous dynamical systems
Authors:
Hua Shao
Abstract:
Let $f_{0,\infty}=\{f_n\}_{n=0}^{\infty}$ be a sequence of continuous self-maps on a compact metric space $X$. The non-autonomous dynamical system $(X,f_{0,\infty})$ induces the set-valued system $(\mathcal{K}(X), \bar{f}_{0,\infty})$ and the fuzzified system $(\mathcal{F}(X),\tilde{f}_{0,\infty})$. We prove that under some natural conditions, positive topological entropy of $(X,f_{0,\infty})$ implies infinite entropy of $(\mathcal{K}(X),\bar{f}_{0,\infty})$ and $(\mathcal{F}(X),\tilde{f}_{0,\infty})$, respectively; and zero entropy of $(S^1,f_{0,\infty})$ implies zero entropy of some invariant subsystems of $(\mathcal{K}(S^1),\bar{f}_{0,\infty})$ and $(\mathcal{F}(S^1),\tilde{f}_{0,\infty})$, respectively. We confirm that $(\mathcal{K}(I), \bar{f})$ and $(\mathcal{F}(I), \tilde{f})$ have infinite entropy for any transitive interval map $f$. In contrast, we construct a transitive non-autonomous system $(I, f_{0,\infty})$ such that both $(\mathcal{K}(I), \bar{f}_{0,\infty})$ and $(\mathcal{F}(I), \tilde{f}_{0,\infty})$ have zero entropy. We obtain that $(\mathcal{K}(X),\bar{f}_{0,\infty})$ is chain weakly mixing of all orders if and only if $(\mathcal{F}^1(X),\tilde{f}_{0,\infty})$ is so, and chain mixing (resp. $h$-shadowing and multi-$\mathscr{F}$-sensitivity) among $(X,f_{0,\infty})$, $(\mathcal{K}(X),\bar{f}_{0,\infty})$ and $(\mathcal{F}^1(X),\tilde{f}_{0,\infty})$ are equivalent, where $(\mathcal{F}^1(X),\tilde{f}_{0,\infty})$ is the induced normal fuzzification.
Submitted 27 February, 2022;
originally announced February 2022.
-
Progress on stochastic analytic continuation of quantum Monte Carlo data
Authors:
Hui Shao,
Anders W. Sandvik
Abstract:
We report multipronged progress on the stochastic averaging approach to numerical analytic continuation of quantum Monte Carlo data. With the sampled spectrum parametrized with delta-functions in continuous frequency space, a calculation of the configurational entropy lends support to a simple goodness-of-fit criterion for the optimal sampling temperature. To further investigate entropic effects, we compare spectra sampled in continuous frequency with results of amplitudes sampled on a fixed frequency grid. We demonstrate equivalences between sampling and optimizing spectral functions with the maximum-entropy approach with different forms of the entropy. These insights revise prevailing notions of the maximum-entropy method and its relationship to stochastic analytic continuation. We further explore various adjustable (optimized) constraints that allow sharp spectral features to be resolved, in particular at the lower frequency edge. The constraints, e.g., the location of the edge or the spectral weight of a quasi-particle peak, are optimized using a statistical criterion. We show that this method can correctly reproduce both narrow and broad quasi-particle peaks. We next introduce a parametrization for more intricate spectral functions with sharp edges, e.g., power-law singularities. Tests with synthetic data as well as with real simulation data for the spin-1/2 Heisenberg chain demonstrate that constrained sampling methods can reproduce spectral functions with sharp edge features at unprecedented fidelity. We present new results for S=1/2 Heisenberg 2-leg and 3-leg ladders to illustrate the ability of the methods to resolve spectral features arising from both elementary and composite excitations. Finally, we also propose how the methods developed here could be used as "pre processors" for analytic continuation by machine learning.
Submitted 9 January, 2023; v1 submitted 20 February, 2022;
originally announced February 2022.
-
A Theory of PAC Learnability under Transformation Invariances
Authors:
Han Shao,
Omar Montasser,
Avrim Blum
Abstract:
Transformation invariances are present in many real-world problems. For example, image classification is usually invariant to rotation and color transformation: a rotated car in a different color is still identified as a car. Data augmentation, which adds the transformed data into the training set and trains a model on the augmented data, is one commonly used technique to build these invariances into the learning process. However, it is unclear how data augmentation performs theoretically and what the optimal algorithm is in the presence of transformation invariances. In this paper, we study PAC learnability under transformation invariances in three settings according to different levels of realizability: (i) A hypothesis fits the augmented data; (ii) A hypothesis fits only the original data and the transformed data lying in the support of the data distribution; (iii) Agnostic case. One interesting observation is that distinguishing between the original data and the transformed data is necessary to achieve optimal accuracy in settings (ii) and (iii), which implies that any algorithm not differentiating between the original and transformed data (including data augmentation) is not optimal. Furthermore, this type of algorithm can even "harm" the accuracy. In setting (i), although it is unnecessary to distinguish between the two data sets, data augmentation still does not perform optimally. Due to such a difference, we propose two combinatorial measures characterizing the optimal sample complexity in setting (i) and in settings (ii) and (iii), and provide the optimal algorithms.
Submitted 2 November, 2022; v1 submitted 15 February, 2022;
originally announced February 2022.
-
Observing true tauonium via two-photon fusion at $e^+e^-$ and hadron colliders
Authors:
David d'Enterria,
Hua-Sheng Shao
Abstract:
The feasibility of observing true tauonium, the bound state of two tau leptons, $\mathcal{T}_0\equiv(τ^+τ^-)_0$, via photon-photon collisions at $e^+e^-$ colliders and at the LHC, is studied. The production cross sections of the process $γγ\to\mathcal{T}_0\toγγ$ -- as well as those of all relevant backgrounds: spin-0 and 2 charmonium resonances decaying to diphotons, and light-by-light scattering -- are computed in the equivalent photon approximation for $e^+e^-$ collisions at BES III ($\sqrt{s} = 3.8$ GeV), Belle II ($\sqrt{s} = 10.6$ GeV), and FCC-ee ($\sqrt{s} = 91.2$ GeV), as well as for ultraperipheral p-p, p-Pb, and Pb-Pb collisions at the LHC. Despite small $\mathcal{T}_0$ production cross sections and a final state swamped by decays from overlapping pseudoscalar and tensor charmonium states -- the $\mathrm{χ_{c2}}$, $\mathrm{η_{c}(2S)}$, and $\mathrm{χ_{c0}}$ states have masses only 2.5, 84, and 139 MeV away, respectively, from the $\mathcal{T}_0$ peak -- evidence and observation of the ground state of the heaviest leptonium appears feasible at Belle II and FCC-ee, respectively, with in-situ high-precision measurements of the irreducible backgrounds.
Submitted 24 May, 2022; v1 submitted 4 February, 2022;
originally announced February 2022.
-
Keeping Deep Lithography Simulators Updated: Global-Local Shape-Based Novelty Detection and Active Learning
Authors:
Hao-Chiang Shao,
Hsing-Lei Ping,
Kuo-shiuan Chen,
Weng-Tai Su,
Chia-Wen Lin,
Shao-Yun Fang,
Pin-Yian Tsai,
Yan-Hsiu Liu
Abstract:
Learning-based pre-simulation (i.e., layout-to-fabrication) models have been proposed to predict the fabrication-induced shape deformation from an IC layout to its fabricated circuit. Such models are usually driven by pairwise learning, involving a training set of layout patterns and their reference shape images after fabrication. However, it is expensive and time-consuming to collect the reference shape images of all layout clips for model training and updating. To address the problem, we propose a deep learning-based layout novelty detection scheme to identify novel (unseen) layout patterns, which cannot be well predicted by a pre-trained pre-simulation model. We devise a global-local novelty scoring mechanism to assess the potential novelty of a layout by exploiting two subnetworks: an autoencoder and a pretrained pre-simulation model. The former characterizes the global structural dissimilarity between a given layout and training samples, whereas the latter extracts a latent code representing the fabrication-induced local deformation. By integrating the global dissimilarity with the local deformation boosted by a self-attention mechanism, our model can accurately detect novelties without the ground-truth circuit shapes of test samples. Based on the detected novelties, we further propose two active-learning strategies to sample a reduced amount of representative layouts most worthy to be fabricated for acquiring their ground-truth circuit shapes. Experimental results demonstrate i) our method's effectiveness in layout novelty detection, and ii) our active-learning strategies' ability in selecting representative novel layouts for keeping a learning-based pre-simulation model updated.
Submitted 24 January, 2022;
originally announced January 2022.
-
The CAMELS project: public data release
Authors:
Francisco Villaescusa-Navarro,
Shy Genel,
Daniel Anglés-Alcázar,
Lucia A. Perez,
Pablo Villanueva-Domingo,
Digvijay Wadekar,
Helen Shao,
Faizan G. Mohammad,
Sultan Hassan,
Emily Moser,
Erwin T. Lau,
Luis Fernando Machado Poletti Valle,
Andrina Nicola,
Leander Thiele,
Yongseok Jo,
Oliver H. E. Philcox,
Benjamin D. Oppenheimer,
Megan Tillman,
ChangHoon Hahn,
Neerav Kaushal,
Alice Pisani,
Matthew Gebhardt,
Ana Maria Delgado,
Joyce Caliendo,
Christina Kreisch
, et al. (22 additional authors not shown)
Abstract:
The Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project was developed to combine cosmology with astrophysics through thousands of cosmological hydrodynamic simulations and machine learning. CAMELS contains 4,233 cosmological simulations, 2,049 N-body and 2,184 state-of-the-art hydrodynamic simulations that sample a vast volume in parameter space. In this paper we present the CAMELS public data release, describing the characteristics of the CAMELS simulations and a variety of data products generated from them, including halo, subhalo, galaxy, and void catalogues, power spectra, bispectra, Lyman-$α$ spectra, probability distribution functions, halo radial profiles, and X-ray photon lists. We also release over one thousand catalogues that contain billions of galaxies from CAMELS-SAM: a large collection of N-body simulations that have been combined with the Santa Cruz Semi-Analytic Model. We release all the data, comprising more than 350 terabytes and containing 143,922 snapshots, millions of halos, galaxies and summary statistics. We provide further technical details on how to access, download, read, and process the data at https://camels.readthedocs.io.
Submitted 4 January, 2022;
originally announced January 2022.
-
Revisiting NLO QCD corrections to total inclusive J/psi and Upsilon photoproduction cross sections in lepton-proton collisions
Authors:
Alice Colpani Serri,
Yu Feng,
Carlo Flore,
Jean-Philippe Lansberg,
Melih A. Ozcelik,
Hua-Sheng Shao,
Yelyzaveta Yedelkina
Abstract:
We revisit inclusive J/psi and Upsilon photoproduction at lepton-hadron colliders, namely in the limit when the exchanged photon is quasi-real. Our computation includes the next-to-leading-order (NLO) alpha_s corrections to the leading-order contributions in v. Similarly to the case of NLO charmonium-hadroproduction processes, the resulting cross sections obtained in the MS-bar factorisation scheme are sometimes found to be negative. We show that the scale-fixing criterion that we derived in a previous study of eta(c) production successfully solves this problem from the EicC all the way up to the FCC-eh energies. We then elaborate on how to study a scale uncertainty akin to that derived by scale variations when one fixes a scale. In turn, we investigate where both J/psi and Upsilon photoproduction could be used to improve our knowledge of the gluon content of the proton at scales as low as a couple of GeV.
Submitted 2 February, 2023; v1 submitted 9 December, 2021;
originally announced December 2021.
-
Optimizing Pricing, Repositioning, En-Route Time, and Idle Time in Ride-Hailing Systems
Authors:
Anton J. Kleywegt,
Hongzhang Shao
Abstract:
In ride-hailing systems, en-route time refers to the time that elapses from the moment a car is dispatched to pick up a rider until the rider is picked up. A fundamental phenomenon in ride-hailing systems is that there is a trade-off between en-route time and the time that a car waits for a dispatch. In short, if cars spend little time idle waiting for a dispatch, then few cars are available when a rider makes a request, and thus the mean distance between a rider and the closest available car is long, which means that en-route time is long. This phenomenon is of great importance in ride-hailing, because en-route time increases rapidly as the number of idle cars decreases, and every minute that a car spends en-route is one minute less that the car can transport riders. In spite of this, the existing literature on price optimization for ride-hailing, and on repositioning optimization for ride-hailing, ignores en-route time. Initial attempts to take this trade-off for the mean en-route time into account when considering price optimization or repositioning optimization all resulted in intractable optimization problems. Then we found a way to reformulate a simultaneous price and repositioning optimization problem, that takes this trade-off for the distribution of en-route time into account, as a tractable convex optimization problem. We show how the optimal solution can be used to construct policies that perform much better in simulations than the policies proposed in previous papers.
Submitted 22 November, 2021;
originally announced November 2021.
-
Understanding Jargon: Combining Extraction and Generation for Definition Modeling
Authors:
Jie Huang,
Hanyin Shao,
Kevin Chen-Chuan Chang,
Jinjun Xiong,
Wen-mei Hwu
Abstract:
Can machines know what twin prime is? From the composition of this phrase, machines may guess twin prime is a certain kind of prime, but it is still difficult to deduce exactly what twin stands for without additional knowledge. Here, twin prime is jargon, a specialized term used by experts in a particular field. Explaining jargon is challenging since it usually requires domain knowledge to understand. Recently, there has been increasing interest in extracting and generating definitions of words automatically. However, existing approaches, whether based on extraction or generation, perform poorly on jargon. In this paper, we propose to combine extraction and generation for jargon definition modeling: first extract self- and correlative definitional information of target jargon from the Web and then generate the final definitions by incorporating the extracted definitional information. Our framework is remarkably simple but effective: experiments demonstrate our method can generate high-quality definitions for jargon and outperform state-of-the-art models significantly, e.g., BLEU score from 8.76 to 22.66 and human-annotated score from 2.34 to 4.04.
Submitted 20 October, 2022; v1 submitted 14 November, 2021;
originally announced November 2021.
-
BSC: Block-based Stochastic Computing to Enable Accurate and Efficient TinyML
Authors:
Yuhong Song,
Edwin Hsing-Mean Sha,
Qingfeng Zhuge,
Rui Xu,
Yongzhuo Zhang,
Bingzhe Li,
Lei Yang
Abstract:
Along with the progress of AI democratization, machine learning (ML) has been successfully applied to edge applications, such as smart phones and automated driving. Nowadays, more applications require ML on tiny devices with extremely limited resources, such as an implantable cardioverter defibrillator (ICD); this setting is known as TinyML. Unlike ML on the edge, TinyML with a limited energy supply has higher demands on low-power execution. Stochastic computing (SC), which uses bitstreams for data representation, is promising for TinyML since it can perform the fundamental ML operations using simple logic gates instead of complicated binary adders and multipliers. However, SC commonly suffers from low accuracy on ML tasks due to low data precision and the inaccuracy of its arithmetic units. Increasing the length of the bitstream, as done in existing works, can mitigate the precision issue but incurs higher latency. In this work, we propose a novel SC architecture, namely Block-based Stochastic Computing (BSC). BSC divides inputs into blocks, such that the latency can be reduced by exploiting high data parallelism. Moreover, optimized arithmetic units and an output revision (OUR) scheme are proposed to improve accuracy. On top of this, a global optimization approach is devised to determine the number of blocks, which can achieve a better latency-power trade-off. Experimental results show that BSC can outperform the existing designs, achieving over 10% higher accuracy on ML tasks and over 6 times power reduction.
Submitted 12 November, 2021;
originally announced November 2021.
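The abstract's remark that stochastic computing replaces multipliers with simple logic gates can be illustrated with a minimal sketch. It assumes the standard unipolar encoding, in which a value p in [0, 1] is represented by a bitstream whose bits are 1 with probability p; it is not the BSC architecture itself, and the function names, the stream length, and the input values are all illustrative.

```python
import random

def to_bitstream(p, length, rng):
    """Unipolar encoding: each bit is 1 with probability p (assumed encoding)."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def from_bitstream(bits):
    """Decode a unipolar bitstream as the fraction of 1 bits."""
    return sum(bits) / len(bits)

def sc_multiply(bits_a, bits_b):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [a & b for a, b in zip(bits_a, bits_b)]

rng = random.Random(0)   # fixed seed so the run is reproducible
length = 4096            # longer streams give higher precision but more latency
a = to_bitstream(0.5, length, rng)
b = to_bitstream(0.6, length, rng)
estimate = from_bitstream(sc_multiply(a, b))
print(f"exact = {0.5 * 0.6:.3f}, stochastic estimate = {estimate:.3f}")
```

The estimate is only approximately 0.3, and its statistical error shrinks roughly as the inverse square root of the stream length. This is the precision-versus-latency tension the abstract describes.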
-
Blending Anti-Aliasing into Vision Transformer
Authors:
Shengju Qian,
Hao Shao,
Yi Zhu,
Mu Li,
Jiaya Jia
Abstract:
Transformer architectures, based on a self-attention mechanism and a convolution-free design, have recently shown superior performance and found booming applications in computer vision. However, the discontinuous patch-wise tokenization process implicitly introduces jagged artifacts into attention maps, giving rise to the traditional problem of aliasing for vision transformers. The aliasing effect occurs when discrete patterns are used to produce high-frequency or continuous information, resulting in indistinguishable distortions. Recent research has found that modern convolutional networks still suffer from this phenomenon. In this work, we analyze the uncharted problem of aliasing in vision transformers and explore how to incorporate anti-aliasing properties. Specifically, we propose a plug-and-play Aliasing-Reduction Module (ARM) to alleviate the aforementioned issue. We investigate the effectiveness and generalization of the proposed method across multiple tasks and various vision transformer families. This lightweight design consistently attains a clear boost over several famous structures. Furthermore, our module also improves the data efficiency and robustness of vision transformers.
Submitted 28 October, 2021;
originally announced October 2021.
-
Event-triggered Consensus of Matrix-weighted Networks Subject to Actuator Saturation
Authors:
Lulu Pan,
Haibin Shao,
Yuanlong Li,
Dewei Li,
Yugeng Xi
Abstract:
The ubiquitous interdependencies among higher-dimensional states of neighboring agents can be characterized by matrix-weighted networks. This paper examines event-triggered global consensus of matrix-weighted networks subject to actuator saturation. Specifically, a distributed dynamic event-triggered coordination strategy, whose design involves sampled state of agents, saturation constraint and auxiliary systems, is proposed for this category of generalized network to guarantee its global consensus. Under the proposed event-triggered coordination strategy, sufficient conditions are derived to guarantee the leaderless and leader-follower global consensus of the multi-agent systems on matrix-weighted networks, respectively. The Zeno phenomenon can be excluded for both cases under the proposed coordination strategy. It turns out that the spectral properties of matrix-valued weights are crucial in event-triggered mechanism design for matrix-weighted networks with actuator saturation constraint. Finally, simulations are provided to demonstrate the effectiveness of proposed event-triggered coordination strategy. This work provides a more general design framework compared with existing results that are only applicable to scalar-weighted networks.
Submitted 25 October, 2021;
originally announced October 2021.
-
Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization
Authors:
Panjie Qi,
Edwin Hsing-Mean Sha,
Qingfeng Zhuge,
Hongwu Peng,
Shaoyi Huang,
Zhenglun Kong,
Yuhong Song,
Bingbing Li
Abstract:
State-of-the-art Transformer-based models, with their huge numbers of parameters, are difficult to accommodate on resource-constrained embedded devices. Moreover, with the development of technology, more and more embedded devices are available to run a Transformer model. A Transformer model with different constraints (tight or loose) can be deployed onto devices with different computing power. However, in previous work, designers did not choose the best device among multiple devices. Instead, they just used an existing device to deploy the model, which was not necessarily the best fit and may lead to underutilization of resources. To address the deployment challenge of Transformers and the problem of selecting the best device, we propose an algorithm & hardware closed-loop acceleration framework. Given a dataset, a model, a latency constraint LC and an accuracy constraint AC, our framework can provide the best device satisfying both constraints. In order to generate a compressed model with a high sparsity ratio, we propose a novel pruning technique, hierarchical pruning (HP). We optimize the sparse matrix storage format for HP matrices to further reduce memory usage for FPGA implementation. We design an accelerator that takes advantage of HP to solve the problem of concurrent random access. Experiments on Transformer and TinyBert models show that our framework can find different devices for various LC and AC, ranging from low-end to high-end devices. Our HP can achieve a higher sparsity ratio and is more flexible than other sparsity patterns. Our framework can achieve 37x, 1.9x, and 1.7x speedups compared to CPU, GPU and FPGA, respectively.
Submitted 19 October, 2021;
originally announced October 2021.
-
New insights into price drivers of crude oil futures markets: Evidence from quantile ARDL approach
Authors:
Hao-Lin Shao,
Ying-Hui Shao,
Yan-Hong Yang
Abstract:
This paper investigates the cointegration between possible determinants of crude oil futures prices during the COVID-19 pandemic period. We perform a comparative analysis of WTI and newly-launched Shanghai crude oil futures (SC) via the Autoregressive Distributed Lag (ARDL) model and the Quantile Autoregressive Distributed Lag (QARDL) model. The empirical results confirm that economic policy uncertainty, stock markets, interest rates and coronavirus panic are important drivers of WTI futures prices. Our findings also suggest that the US and China's stock markets play vital roles in movements of SC futures prices. Meanwhile, the CSI300 stock index has a significant positive short-run impact on SC futures prices, while S&P500 prices possess a positive nexus with SC futures prices in both the long run and the short run. Overall, this empirical evidence provides practical implications for investors and policymakers.
Submitted 6 October, 2021;
originally announced October 2021.
-
Unsupervised Belief Representation Learning with Information-Theoretic Variational Graph Auto-Encoders
Authors:
Jinning Li,
Huajie Shao,
Dachun Sun,
Ruijie Wang,
Yuchen Yan,
Jinyang Li,
Shengzhong Liu,
Hanghang Tong,
Tarek Abdelzaher
Abstract:
This paper develops a novel unsupervised algorithm for belief representation learning in polarized networks that (i) uncovers the latent dimensions of the underlying belief space and (ii) jointly embeds users and content items (that they interact with) into that space in a manner that facilitates a number of downstream tasks, such as stance detection, stance prediction, and ideology mapping. Inspired by total correlation in information theory, we propose the Information-Theoretic Variational Graph Auto-Encoder (InfoVGAE) that learns to project both users and content items (e.g., posts that represent user views) into an appropriate disentangled latent space. To better disentangle latent variables in that space, we develop a total correlation regularization module and a Proportional-Integral (PI) control module, and adopt a rectified Gaussian distribution to ensure orthogonality. The latent representation of users and content can then be used to quantify their ideological leaning and detect/predict their stances on issues. We evaluate the performance of the proposed InfoVGAE on three real-world datasets, of which two are collected from Twitter and one from U.S. Congress voting records. The evaluation results show that our model outperforms state-of-the-art unsupervised models, reducing user clustering errors by 10.5% and achieving 12.1% higher F1 scores for stance separation of content items. In addition, InfoVGAE produces results comparable with those of supervised models. We also discuss its performance on stance prediction and user ranking within ideological groups.
Submitted 9 May, 2022; v1 submitted 1 October, 2021;
originally announced October 2021.
-
Distributed Stabilization of Signed Networks via Self-loop Compensation
Authors:
Haibin Shao,
Lulu Pan
Abstract:
This paper examines the stability and distributed stabilization of signed multi-agent networks. Here, positive semidefiniteness is not inherent for signed Laplacians, which renders the stability and consensus of this category of networks intricate. First, we examine the stability of signed networks by introducing a novel graph-theoretic object, the negative cut set, which implies that manipulating negative edge weights cannot change an unstable network into a stable one. Then, inspired by the diagonal dominance and stability of matrices, a local state damping mechanism is introduced using self-loop compensation. The self-loop compensation is only active for those agents that are incident to negative edges and can stabilize signed networks in a fully distributed manner. Quantitative connections between self-loop compensation and the stability of the compensated signed network are established for a tradeoff between compensation efforts and network stability. Necessary and/or sufficient conditions for predictable cluster consensus of compensated signed networks are provided. The optimality of self-loop compensation is discussed. Furthermore, we extend our results to directed signed networks where the symmetry of the signed Laplacian is not free. The correlation between the stability of the compensated dynamics obtained by self-loop compensation and eventual positivity is further discussed. Novel insights into the stability of multi-agent systems on signed networks in terms of self-loop compensation are offered. Simulation examples are provided to demonstrate the theoretical results.
Submitted 22 June, 2022; v1 submitted 26 September, 2021;
originally announced September 2021.
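The link the abstract draws between diagonal dominance and self-loop compensation can be sketched in a few lines. The sketch below assumes one simple compensation rule, chosen here for illustration and not taken from the paper: each agent adds a self-loop equal to twice the total magnitude of its incident negative edge weights, which is the smallest value that makes its row of the signed Laplacian diagonally dominant. The graph, weights, and function names are hypothetical.

```python
def signed_laplacian(weights, n):
    """Build L = D - A for an undirected signed graph; weights maps (i, j) -> a_ij."""
    L = [[0.0] * n for _ in range(n)]
    for (i, j), w in weights.items():
        L[i][j] -= w
        L[j][i] -= w
        L[i][i] += w
        L[j][j] += w
    return L

def self_loop_compensation(weights, n):
    """Assumed rule: k_i = 2 * sum of |a_ij| over negative edges incident to agent i."""
    k = [0.0] * n
    for (i, j), w in weights.items():
        if w < 0:
            k[i] += 2 * abs(w)
            k[j] += 2 * abs(w)
    return k

def is_diagonally_dominant(M):
    """Check M[i][i] >= sum of |off-diagonal entries| for every row i."""
    n = len(M)
    return all(
        M[i][i] >= sum(abs(M[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )

# Hypothetical 4-agent path graph with one negative edge between agents 1 and 2.
weights = {(0, 1): 1.0, (1, 2): -2.0, (2, 3): 1.0}
n = 4
L = signed_laplacian(weights, n)
k = self_loop_compensation(weights, n)
L_comp = [[L[i][j] + (k[i] if i == j else 0.0) for j in range(n)] for i in range(n)]

print(is_diagonally_dominant(L))       # uncompensated signed Laplacian
print(k)                               # nonzero only at agents 1 and 2
print(is_diagonally_dominant(L_comp))  # compensated matrix
```

Under this rule only the agents incident to the negative edge receive a nonzero self-loop. Because the compensated matrix is symmetric and diagonally dominant with a nonnegative diagonal, Gershgorin's circle theorem places all its eigenvalues at or above zero.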
-
The CAMELS Multifield Dataset: Learning the Universe's Fundamental Parameters with Artificial Intelligence
Authors:
Francisco Villaescusa-Navarro,
Shy Genel,
Daniel Angles-Alcazar,
Leander Thiele,
Romeel Dave,
Desika Narayanan,
Andrina Nicola,
Yin Li,
Pablo Villanueva-Domingo,
Benjamin Wandelt,
David N. Spergel,
Rachel S. Somerville,
Jose Manuel Zorrilla Matilla,
Faizan G. Mohammad,
Sultan Hassan,
Helen Shao,
Digvijay Wadekar,
Michael Eickenberg,
Kaze W. K. Wong,
Gabriella Contardo,
Yongseok Jo,
Emily Moser,
Erwin T. Lau,
Luis Fernando Machado Poletti Valle,
Lucia A. Perez
, et al. (3 additional authors not shown)
Abstract:
We present the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) Multifield Dataset, CMD, a collection of hundreds of thousands of 2D maps and 3D grids containing many different properties of cosmic gas, dark matter, and stars from 2,000 distinct simulated universes at several cosmic times. The 2D maps and 3D grids represent cosmic regions that span $\sim$100 million light years and have been generated from thousands of state-of-the-art hydrodynamic and gravity-only N-body simulations from the CAMELS project. Designed to train machine learning models, CMD is the largest dataset of its kind containing more than 70 Terabytes of data. In this paper we describe CMD in detail and outline a few of its applications. We focus our attention on one such task, parameter inference, formulating the problems we face as a challenge to the community. We release all data and provide further technical details at https://camels-multifield-dataset.readthedocs.io.
Submitted 22 September, 2021;
originally announced September 2021.
-
Robust marginalization of baryonic effects for cosmological inference at the field level
Authors:
Francisco Villaescusa-Navarro,
Shy Genel,
Daniel Angles-Alcazar,
David N. Spergel,
Yin Li,
Benjamin Wandelt,
Leander Thiele,
Andrina Nicola,
Jose Manuel Zorrilla Matilla,
Helen Shao,
Sultan Hassan,
Desika Narayanan,
Romeel Dave,
Mark Vogelsberger
Abstract:
We train neural networks to perform likelihood-free inference from $(25\,h^{-1}{\rm Mpc})^2$ 2D maps containing the total mass surface density from thousands of hydrodynamic simulations of the CAMELS project. We show that the networks can extract information beyond one-point functions and power spectra from all resolved scales ($\gtrsim 100\,h^{-1}{\rm kpc}$) while performing a robust marginalization over baryonic physics at the field level: the model can infer the value of $Ω_{\rm m} (\pm 4\%)$ and $σ_8 (\pm 2.5\%)$ from simulations completely different to the ones used to train it.
Submitted 21 September, 2021;
originally announced September 2021.
-
Finding universal relations in subhalo properties with artificial intelligence
Authors:
Helen Shao,
Francisco Villaescusa-Navarro,
Shy Genel,
David N. Spergel,
Daniel Angles-Alcazar,
Lars Hernquist,
Romeel Dave,
Desika Narayanan,
Gabriella Contardo,
Mark Vogelsberger
Abstract:
We use a generic formalism designed to search for relations in high-dimensional spaces to determine if the total mass of a subhalo can be predicted from other internal properties such as velocity dispersion, radius, or star-formation rate. We train neural networks using data from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project and show that the model can predict the total mass of a subhalo with high accuracy: more than 99% of the subhalos have a predicted mass within 0.2 dex of their true value. The networks exhibit surprising extrapolation properties, being able to accurately predict the total mass of any type of subhalo containing any kind of galaxy at any redshift from simulations with different cosmologies, astrophysics models, subgrid physics, volumes, and resolutions, indicating that the network may have found a universal relation. We then use different methods to find equations that approximate the relation found by the networks and derive new analytic expressions that predict the total mass of a subhalo from its radius, velocity dispersion, and maximum circular velocity. We show that in some regimes, the analytic expressions are more accurate than the neural networks. We interpret the relation found by the neural network and approximated by the analytic equation as being connected to the virial theorem.
Submitted 9 September, 2021;
originally announced September 2021.
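The accuracy criterion quoted in the abstract, a predicted mass within 0.2 dex of the true value, means that the base-10 logarithm of the ratio of predicted to true mass has absolute value at most 0.2, i.e. the two agree to within a factor of about 1.58. A minimal sketch of how such a fraction could be computed follows; the function name and the four mass values are made up for illustration and are not CAMELS data.

```python
import math

def fraction_within_dex(predicted, true, tol_dex=0.2):
    """Fraction of objects with |log10(predicted / true)| <= tol_dex."""
    errors = [abs(math.log10(p / t)) for p, t in zip(predicted, true)]
    return sum(e <= tol_dex for e in errors) / len(errors)

# Illustrative masses in arbitrary units (hypothetical values).
true_mass = [1e10, 5e10, 2e11, 1e12]
pred_mass = [1.2e10, 4.0e10, 2.1e11, 2.0e12]

# The last prediction is off by a factor of 2 (about 0.30 dex), so it fails the cut.
print(fraction_within_dex(pred_mass, true_mass))  # 0.75
```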
-
Independent dimensional phase transition on a two-dimensional Kuramoto model with matrix coupling
Authors:
Chongzhi Wang,
Haibin Shao,
Dewei Li
Abstract:
The high-dimensional generalization of the one-dimensional Kuramoto paradigm has been an essential step in bringing about a more faithful depiction of the dynamics of real-world systems. Despite the multi-dimensional nature of the oscillators in these generalized models, the interaction schemes so far have been dominated by a scalar factor applied uniformly between any pair of oscillators, which eventually leads to synchronization on all dimensions. As a natural extension of the scalar coupling befitting the one-dimensional case, we take a tentative step in studying numerically and theoretically the coupling mechanism of $2\times2$ real matrices on two-dimensional Kuramoto oscillators. One of the features stemming from this new mechanism is that the matrix coupling enables the two dimensions of the oscillators to separate their transitions to either synchronization or desynchronization, which has not been seen in other high-dimensional generalizations. Under various matrix configurations, the synchronization and desynchronization of the two dimensions combine into four qualitatively distinct modes of position and motion of the system. We demonstrate that as one matrix is morphed into another in a specific manner, the system mode also switches correspondingly through either continuous or explosive transitions of the order parameters, thus mimicking a range of behaviors in information science and biology.
Submitted 26 August, 2021;
originally announced August 2021.
-
Revisiting the Reduction of Thermal Conductivity in Nano- to Micro-Grained Bismuth Telluride: The Importance of Grain-Boundary Thermal Resistance
Authors:
Sien Wang,
Xiaowei Lu,
Ankit Negi,
Jixiong He,
Kyunghoon Kim,
Hezhu Shao,
Peng Jiang,
Jun Liu,
Qing Hao
Abstract:
Nanograined bulk alloys based on bismuth telluride (Bi2Te3) are the dominant materials for room-temperature thermoelectric applications. In numerous studies, existing bulk phonon mean free path (MFP) spectra predicted by atomistic simulations suggest sub-100 nm grain sizes are necessary to reduce the lattice thermal conductivity by decreasing phonon MFPs. This is in contrast with available experimental data, where a remarkable thermal conductivity reduction is observed even for micro-grained Bi2Te3 samples. In this work, first-principles phonon MFPs along both the in-plane and cross-plane directions are re-computed for bulk Bi2Te3. These phonon MFPs can explain new and existing experimental data on flake-like Bi2Te3 nanostructures with various thicknesses. For polycrystalline Bi2Te3-based materials, a better explanation of the experimental data requires further consideration of the grain-boundary thermal resistance that can largely suppress the transport of high-frequency optical phonons.
Submitted 12 August, 2021;
originally announced August 2021.
-
NLO inclusive $J/ψ$ photoproduction at large $P_T$ at HERA and the EIC
Authors:
Carlo Flore,
Jean-Philippe Lansberg,
Hua-Sheng Shao,
Yelyzaveta Yedelkina
Abstract:
We study inclusive $J/ψ$ photoproduction at NLO at large $P_T$ at HERA and the EIC. Our computation includes NLO QCD leading-$P_T$ corrections, QED contributions via an off-shell photon as well as those from $J/ψ$+charm channels. For the latter, we employ the variable-flavour-number scheme. Our results are found to agree with the latest HERA data by H1 and provide, for the first time, a reliable estimate of the EIC reach for such a measurement. Finally, we demonstrate the observability of $J/ψ$+charm production and the sensitivity to probe the non-perturbative charm content of the proton at high $x$, also known as intrinsic charm, at the EIC.
Submitted 29 July, 2021; v1 submitted 28 July, 2021;
originally announced July 2021.
-
Distributed Neighbor Selection in Multi-agent Networks
Authors:
Haibin Shao,
Lulu Pan,
Mehran Mesbahi,
Yugeng Xi,
Dewei Li
Abstract:
Achieving consensus via nearest neighbor rules is an important prerequisite for multi-agent networks to accomplish collective tasks. A common assumption in consensus setup is that each agent interacts with all its neighbors. This paper examines whether network functionality and performance can be maintained, and even enhanced, when agents interact only with a subset of their respective (available) neighbors. As shown in the paper, the answer to this inquiry is affirmative. In this direction, we show that by exploring the monotonicity property of the Laplacian eigenvectors, a neighbor selection rule with guaranteed performance enhancements can be realized for consensus-type networks. For distributed implementation, a quantitative connection between entries of Laplacian eigenvectors and the "relative rate of change" in the state between neighboring agents is further established; this connection facilitates a distributed algorithm for each agent to identify "favorable" neighbors to interact with. Multi-agent networks with and without external influence are examined, as well as extensions to signed networks. This paper underscores the utility of Laplacian eigenvectors in the context of distributed neighbor selection, providing novel insights into distributed data-driven control of multi-agent systems.
Submitted 22 June, 2022; v1 submitted 26 July, 2021;
originally announced July 2021.
-
Experimental Analysis of PandaX-4T Cryogenic Distillation System for Removing Krypton from Xenon
Authors:
Rui Yan,
Zhou Wang,
Xiangyi Cui,
Yonglin Ju,
Haidong Sha,
Shuaijie Li,
Peiyao Huang,
Xiuli Wang,
Wenbo Ma,
Yingjie Fan,
Xiangdong Ji,
Jifang Zhou,
Changsong Shang,
Liqiang Liu
Abstract:
An efficient cryogenic distillation system was designed and constructed for the PandaX-4T dark matter detector based on the McCabe-Thiele (M-T) method and the conservation of mass and energy. This distillation system is designed to reduce the concentration of krypton in commercial xenon from $5\times10^{-7}$ mol/mol to $10^{-14}$ mol/mol with 99% xenon collection efficiency at a maximum flow rate of 10 kg/h. The offline distillation operation has been completed and 5.75 tons of ultra-high purity xenon was produced, which is used as the detection medium in the PandaX-4T detector. The krypton concentration of the product xenon is measured with an upper limit of 8.0 ppt. The stability and purification performance of the cryogenic distillation system are studied by analyzing the experimental data, which is important for theoretical research and distillation operation optimization.
Submitted 21 October, 2021; v1 submitted 20 July, 2021;
originally announced July 2021.
-
Cluster Consensus on Matrix-weighted Switching Networks
Authors:
Lulu Pan,
Haibin Shao,
Mehran Mesbahi,
Dewei Li,
Yugeng Xi
Abstract:
This paper examines the cluster consensus problem of multi-agent systems on matrix-weighted switching networks. Necessary and/or sufficient conditions under which cluster consensus can be achieved are obtained, and a quantitative characterization of the steady state of the cluster consensus is provided as well. Specifically, if the underlying network switches amongst a finite number of networks, a necessary condition for cluster consensus of the multi-agent system on switching matrix-weighted networks is first presented; it is shown that the steady state of the system lies in the intersection of the null spaces of the matrix-valued Laplacians corresponding to all switching networks. Second, if the underlying network switches amongst an infinite number of networks, the matrix-weighted integral network is employed to provide sufficient conditions for cluster consensus and a quantitative characterization of the corresponding steady state of the multi-agent system, using null space analysis of the matrix-valued Laplacian of the integral network associated with the switching networks. In particular, conditions for bipartite consensus under matrix-weighted switching networks are examined. Simulation results are finally provided to demonstrate the theoretical analysis.
Submitted 20 July, 2021; v1 submitted 20 July, 2021;
originally announced July 2021.
-
Resonantly pumped bright-triplet exciton lasing in caesium lead bromide perovskites
Authors:
Guanhua Ying,
Tristan Farrow,
Atanu Jana,
Hanbo Shao,
Hyunsik Im,
Vitaly Osokin,
Seung Bin Baek,
Mutibah Alanazi,
Sanjit Karmakar,
Manas Mukherjee,
Youngsin Park,
Robert A. Taylor
Abstract:
The surprising recent observation of highly emissive triplet-states in lead halide perovskites accounts for their orders-of-magnitude brighter optical signals and high quantum efficiencies compared to other semiconductors. This makes them attractive for future optoelectronic applications, especially in bright low-threshold nano-lasers. Whilst non-resonantly pumped lasing from all-inorganic lead-halide perovskites is now well-established as an attractive pathway to scalable low-power laser sources for nano-optoelectronics, here we showcase a resonant optical pumping scheme on a fast triplet-state in CsPbBr3 nanocrystals. The scheme allows us to realize a polarized triplet-laser source that dramatically enhances the coherent signal by one order of magnitude whilst suppressing non-coherent contributions. The result is a source with highly attractive technological characteristics including a bright and polarized signal, and a high stimulated-to-spontaneous emission signal contrast that can be filtered to enhance spectral purity. The emission is generated by pumping selectively on a weakly-confined excitonic state with a Bohr radius ~10 nm in the nanocrystals. The exciton fine-structure is revealed by the energy-splitting resulting from confinement in nanocrystals with tetragonal symmetry. We use a linear polarizer to resolve two-fold non-degenerate sub-levels in the triplet exciton and use photoluminescence excitation spectroscopy to determine the energy of the state before pumping it resonantly.
Submitted 14 July, 2021;
originally announced July 2021.
-
Electrochemical control of ferroelectricity in hafnia-based ferroelectric devices using reversible oxygen migration
Authors:
M. H. Shao,
H. F. Liu,
R. He,
X. M. Li,
L. Wu,
J. Ma,
X. C. Hu,
R. T. Zhao,
Z. C. Zhong,
Y. Yu,
C. H. Wan,
Y. Yang,
C. -W. Nan,
X. D. Bai,
T. -L. Ren,
X. Renshaw Wang
Abstract:
Ferroelectricity, especially in hafnia-based thin films at nanosizes, has been rejuvenated in the fields of low-power, nonvolatile and Si-compatible modern memory and logic applications. Despite tremendous efforts to explore the formation of the metastable ferroelectric phase and the polarization degradation during field cycling, the ability of oxygen vacancies to precisely engineer and switch polarization remains to be elucidated. Here we report reversible electrochemical control of ferroelectricity in Hf$_{0.5}$Zr$_{0.5}$O$_2$ (HZO) heterostructures with a mixed ionic-electronic LaSrMnO$_3$ electrode, achieving a hard breakdown field of more than 18 MV/cm, over four times as high as that of typical HZO. The electrical extraction and insertion of oxygen into HZO is macroscopically characterized and atomically imaged in situ. Utilizing this reversible process, we achieved multiple polarization states and even repeatedly repaired the damaged ferroelectricity by reversed negative electric fields. Our study demonstrates the robust and switchable ferroelectricity in hafnia oxide distinctly associated with oxygen vacancies and opens up opportunities to recover, manipulate, and utilize rich ferroelectric functionalities to empower existing Si-based electronics, such as multi-bit storage.
Submitted 20 June, 2021;
originally announced June 2021.
-
Dynamic Event-Triggered Consensus of Multi-agent Systems on Matrix-weighted Networks
Authors:
Lulu Pan,
Haibin Shao,
Dewei Li,
Lin Liu
Abstract:
This paper examines the event-triggered consensus of the multi-agent system on matrix-weighted networks, where the interdependencies among higher-dimensional states of neighboring agents are characterized by matrix-weighted edges in the network. Specifically, a novel distributed dynamic event-triggered coordination strategy is proposed for this category of generalized networks, in which an auxiliary system is employed for each agent to dynamically adjust the triggering threshold, which plays an essential role in guaranteeing that the triggering time sequence does not exhibit Zeno behavior. Distributed event-triggered control protocols are proposed to guarantee leaderless and leader-follower consensus for multi-agent systems on matrix-weighted networks, respectively. Remarkably, the spectrum of matrix-valued weights is crucial in event-triggered mechanism design for matrix-weighted networks, generalizing those results only applicable for scalar-weighted networks. The proposed approach allows each agent to broadcast and receive information only at its triggering instants. Finally, simulation examples are provided to demonstrate the theoretical results.
Submitted 4 September, 2022; v1 submitted 11 June, 2021;
originally announced June 2021.
-
DyDiff-VAE: A Dynamic Variational Framework for Information Diffusion Prediction
Authors:
Ruijie Wang,
Zijie Huang,
Shengzhong Liu,
Huajie Shao,
Dongxin Liu,
Jinyang Li,
Tianshi Wang,
Dachun Sun,
Shuochao Yao,
Tarek Abdelzaher
Abstract:
This paper describes a novel diffusion model, DyDiff-VAE, for information diffusion prediction on social media. Given the initial content and a sequence of forwarding users, DyDiff-VAE aims to estimate the propagation likelihood for other potential users and predict the corresponding user rankings. Inferring user interests from diffusion data lays the foundation of diffusion prediction, because users often forward the information in which they are interested or the information from those who share similar interests. Their interests also evolve over time as the result of the dynamic social influence from neighbors and the time-sensitive information gained inside/outside the social media. Existing works fail to model users' intrinsic interests from the diffusion data and assume user interests remain static over time. DyDiff-VAE advances the state of the art in two directions: (i) We propose a dynamic encoder to infer the evolution of user interests from observed diffusion data. (ii) We propose a dual attentive decoder to estimate the propagation likelihood by integrating information from both the initial cascade content and the forwarding user sequence. Extensive experiments on four real-world datasets from Twitter and Youtube demonstrate the advantages of the proposed model; we show that it achieves 43.3% relative gains over the best baseline on average. Moreover, it has the lowest run-time compared with recurrent neural network based models.
Submitted 6 June, 2021;
originally announced June 2021.
-
Automated EW corrections with isolated photons: $t \bar t γ$, $t \bar t γγ$ and $t γj$ as case studies
Authors:
Davide Pagani,
Hua-Sheng Shao,
Ioannis Tsinikos,
Marco Zaro
Abstract:
In this work we compute for the first time the so-called Complete-NLO predictions for top-quark pair hadroproduction in association with at least one isolated photon ($ t \bar t γ$). We also compute NLO QCD+EW predictions for the similar case with at least two isolated photons ($ t \bar t γγ$) and for single-top hadroproduction in association with at least one isolated photon. In addition, we complement our results with NLO QCD+EW predictions of the hadronic and leptonic decays of top-quark including an isolated photon. All these results have been obtained in a completely automated approach, by extending the capabilities of the MadGraph5_aMC@NLO framework and enabling the Complete-NLO predictions for processes with isolated photons in the final state. We discuss the technical details of the implementation, which involves a mixed EW renormalisation scheme for such processes.
Submitted 1 October, 2021; v1 submitted 3 June, 2021;
originally announced June 2021.
-
Single production of vector-like quarks: the effects of large width, interference and NLO corrections
Authors:
Aldo Deandrea,
Thomas Flacke,
Benjamin Fuks,
Luca Panizzi,
Hua-Sheng Shao
Abstract:
We provide a comprehensive discussion, together with a complete setup for simulations, relevant for the production of a single vector-like quark at hadron colliders. Our predictions include finite width effects, signal-background interference effects and next-to-leading order QCD corrections. We explicitly apply the framework to study the single production of a vector-like quark $T$ with charge 2/3, but the same procedure can be used to analyse the single production of vector-like quarks with charge $-4/3$, $-1/3$, $2/3$ and $5/3$, when the vector-like quark interacts with the Standard Model quarks and electroweak bosons. Moreover, this procedure can be straightforwardly extended to include additional interactions with exotic particles. We provide quantitative results for representative benchmark scenarios characterised by the $T$ mass and width, and we determine the role of the interference terms for a range of masses and widths of phenomenological significance. We additionally describe in detail, both analytically and numerically, a striking feature in the invariant mass distribution appearing only in the $T \to th$ channel.
Submitted 20 September, 2022; v1 submitted 18 May, 2021;
originally announced May 2021.
-
Comprehensive Review On Twin Support Vector Machines
Authors:
M. Tanveer,
T. Rajani,
R. Rastogi,
Y. H. Shao,
M. A. Ganaie
Abstract:
Twin support vector machine (TWSVM) and twin support vector regression (TSVR) are newly emerging efficient machine learning techniques which offer promising solutions for classification and regression challenges, respectively. TWSVM is based upon the idea of identifying two nonparallel hyperplanes which classify the data points to their respective classes. It requires solving two small quadratic programming problems (QPPs) in lieu of the single large QPP solved in the support vector machine (SVM), while TSVR is formulated along the lines of TWSVM and requires solving two SVM-type problems. Although there has been good research progress on these techniques, there is limited literature on the comparison of different variants of TSVR. Thus, this review presents a rigorous analysis of recent research in TWSVM and TSVR, simultaneously mentioning their limitations and advantages. To begin, we introduce the basic theory of the support vector machine and TWSVM, then focus on the various improvements and applications of TWSVM, and then introduce TSVR and its various enhancements. Finally, we suggest future research and development prospects.
Submitted 18 March, 2022; v1 submitted 1 May, 2021;
originally announced May 2021.
-
Heavy-flavour studies with a high-luminosity fixed-target experiment at the LHC
Authors:
B. Trzeciak,
S. J. Brodsky,
G. Cavoto,
C. Da Silva,
M. G. Echevarria,
E. G. Ferreiro,
C. Hadjidakis,
R. Haque,
I. Hrivnacova,
D. Kikola,
A. Klein,
A. Kurepin,
A. Kusina,
J. P. Lansberg,
C. Lorce,
F. Lyonnet,
Y. Makdisi,
L. Massacrier,
S. Porteboeuf,
C. Quintans,
A. Rakotozafindrabe,
P. Robbe,
W. Scandale,
I. Schienbein,
J. Seixas
, et al. (9 additional authors not shown)
Abstract:
Extraction of the multi-TeV proton and lead LHC beams with a bent crystal or by using an internal gas target allows one to perform the most energetic fixed-target experiment ever. pp, pd and pA collisions at $\sqrt{s}$ = 115 GeV and Pbp and PbA collisions at $\sqrt{s_{\rm{NN}}}$ = 72 GeV can be studied with high precision and modern detection techniques over a broad rapidity range. Using the LHCb or the ALICE detector in a fixed-target mode offers unprecedented possibilities to access heavy-flavour production in a new energy domain, half way between the SPS and the nominal RHIC energy. In this contribution, a review of projection studies for quarkonium and open charm and beauty production with both detector set-ups used with various nuclear targets and the LHC lead beams is presented.
Submitted 22 April, 2021;
originally announced April 2021.
-
GnetDet: Object Detection Optimized on a 224mW CNN Accelerator Chip at the Speed of 106FPS
Authors:
Baohua Sun,
Tao Zhang,
Jiapeng Su,
Hao Sha
Abstract:
Object detection is widely used on embedded devices. With the wide availability of CNN (Convolutional Neural Networks) accelerator chips, the object detection applications are expected to run with low power consumption, and high inference speed. In addition, the CPU load is expected to be as low as possible for a CNN accelerator chip working as a co-processor with a host CPU. In this paper, we optimize the object detection model on the CNN accelerator chip by minimizing the CPU load. The resulting model is called GnetDet. The experimental result shows that the GnetDet model running on a 224mW chip achieves the speed of 106FPS with excellent accuracy.
Submitted 19 February, 2021;
originally announced March 2021.
-
Leaning Compact and Representative Features for Cross-Modality Person Re-Identification
Authors:
Guangwei Gao,
Hao Shao,
Fei Wu,
Meng Yang,
Yi Yu
Abstract:
This paper pays close attention to the cross-modality visible-infrared person re-identification (VI Re-ID) task, which aims to match pedestrian samples between visible and infrared modes. In order to reduce the modality discrepancy between samples from different cameras, most existing works use constraints based on the Euclidean metric. Because the Euclidean-based distance metric strategy cannot effectively measure the internal angles between the embedded vectors, the existing solutions cannot learn an angularly discriminative feature embedding. Since the most important factor affecting a classification task based on embedding vectors is whether there is an angularly discriminative feature space, in this paper we present a new loss function called Enumerate Angular Triplet (EAT) loss. Also, motivated by knowledge distillation, to narrow down the features between different modalities before feature embedding, we further present a novel Cross-Modality Knowledge Distillation (CMKD) loss. Benefiting from these two considerations, the embedded features are discriminative enough to tackle the modality-discrepancy problem. The experimental results on the RegDB and SYSU-MM01 datasets demonstrate that the proposed method is superior to other state-of-the-art methods. Code is available at https://github.com/IVIPLab/LCCRF.
Submitted 9 February, 2022; v1 submitted 25 March, 2021;
originally announced March 2021.
-
Ensemble Learning with Manifold-Based Data Splitting for Noisy Label Correction
Authors:
Hao-Chiang Shao,
Hsin-Chieh Wang,
Weng-Tai Su,
Chia-Wen Lin
Abstract:
Label noise in training data can significantly degrade a model's generalization performance for supervised learning tasks. Here we focus on the problem that noisy labels are primarily mislabeled samples, which tend to be concentrated near decision boundaries, rather than uniformly distributed, and whose features should be equivocal. To address the problem, we propose an ensemble learning method to correct noisy labels by exploiting the local structures of feature manifolds. Different from typical ensemble strategies that increase the prediction diversity among sub-models via certain loss terms, our method trains sub-models on disjoint subsets, each being a union of the nearest-neighbors of randomly selected seed samples on the data manifold. As a result, each sub-model can learn a coarse representation of the data manifold along with a corresponding graph. Moreover, only a limited number of sub-models will be affected by locally-concentrated noisy labels. The constructed graphs are used to suggest a series of label correction candidates, and accordingly, our method derives label correction results by voting down inconsistent suggestions. Our experiments on real-world noisy label datasets demonstrate the superiority of the proposed method over existing state-of-the-art methods.
Submitted 13 March, 2021;
originally announced March 2021.
-
One for One, or All for All: Equilibria and Optimality of Collaboration in Federated Learning
Authors:
Avrim Blum,
Nika Haghtalab,
Richard Lanas Phillips,
Han Shao
Abstract:
In recent years, federated learning has been embraced as an approach for bringing about collaboration across large populations of learning agents. However, little is known about how collaboration protocols should take agents' incentives into account when allocating individual resources for communal learning in order to maintain such collaborations. Inspired by game theoretic notions, this paper introduces a framework for incentive-aware learning and data sharing in federated learning. Our stable and envy-free equilibria capture notions of collaboration in the presence of agents interested in meeting their learning objectives while keeping their own sample collection burden low. For example, in an envy-free equilibrium, no agent would wish to swap their sampling burden with any other agent and in a stable equilibrium, no agent would wish to unilaterally reduce their sampling burden.
In addition to formalizing this framework, our contributions include characterizing the structural properties of such equilibria, proving when they exist, and showing how they can be computed. Furthermore, we compare the sample complexity of incentive-aware collaboration with that of optimal collaboration when one ignores agents' incentives.
Submitted 4 March, 2021;
originally announced March 2021.
-
Robust learning under clean-label attack
Authors:
Avrim Blum,
Steve Hanneke,
Jian Qian,
Han Shao
Abstract:
We study the problem of robust learning under clean-label data-poisoning attacks, where the attacker injects (an arbitrary set of) correctly-labeled examples into the training set to fool the algorithm into making mistakes on specific test instances at test time. The learning goal is to minimize the attackable rate (the probability mass of attackable test instances), which is more difficult than optimal PAC learning. As we show, any robust algorithm with diminishing attackable rate can achieve the optimal dependence on $ε$ in its PAC sample complexity, i.e., $O(1/ε)$. On the other hand, the attackable rate might be large even for some optimal PAC learners, e.g., SVM for linear classifiers. Furthermore, we show that the class of linear hypotheses is not robustly learnable when the data distribution has zero margin and is robustly learnable in the case of positive margin but requires sample complexity exponential in the dimension. For a general hypothesis class with bounded VC dimension, if the attacker is limited to adding at most $t>0$ poison examples, the optimal robust learning sample complexity grows almost linearly with $t$.
Submitted 6 July, 2021; v1 submitted 28 February, 2021;
originally announced March 2021.
-
Controllable and Diverse Text Generation in E-commerce
Authors:
Huajie Shao,
Jun Wang,
Haohong Lin,
Xuezhou Zhang,
Aston Zhang,
Heng Ji,
Tarek Abdelzaher
Abstract:
In E-commerce, a key challenge in text generation is to find a good trade-off between word diversity and accuracy (relevance) in order to make generated text appear more natural and human-like. In order to improve the relevance of generated results, conditional text generators were developed that use input keywords or attributes to produce the corresponding text. Prior work, however, does not finely control the diversity of automatically generated sentences. For example, it does not control the order of keywords to put more relevant ones first. Moreover, it does not explicitly control the balance between diversity and accuracy. To remedy these problems, we propose a fine-grained controllable generative model, called Apex, that uses an algorithm borrowed from automatic control (namely, a variant of the proportional, integral, and derivative (PID) controller) to precisely manipulate the diversity/accuracy trade-off of generated text. The algorithm is injected into a Conditional Variational Autoencoder (CVAE), allowing Apex to control both (i) the order of keywords in the generated sentences (conditioned on the input keywords and their order), and (ii) the trade-off between diversity and accuracy. Evaluation results on real-world datasets show that the proposed method outperforms existing generative models in terms of diversity and relevance. Apex is currently deployed to generate production descriptions and item recommendation reasons in Taobao owned by Alibaba, the largest E-commerce platform in China. The A/B production test results show that our method improves click-through rate (CTR) by 13.17% compared to the existing method for production descriptions. For item recommendation reasons, it is able to increase CTR by 6.89% and 1.42% compared to user reviews and top-K item recommendation without reviews, respectively.
Submitted 23 February, 2021;
originally announced February 2021.