Search | arXiv e-print repository

Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices

Authors: Yuhong Song, Weiwen Jiang, Bingbing Li, Panjie Qi, Qingfeng Zhuge, Edwin Hsing-Mean Sha, Sakyasingha Dasgupta, Yiyu Shi, Caiwen Ding

Abstract: A pruning-based AutoML framework for run-time reconfigurability, namely RT3, is proposed in this work. This enables Transformer-based large Natural Language Processing (NLP) models to be efficiently executed on resource-constrained mobile devices and reconfigured (i.e., switching models for dynamic hardware conditions) at run-time. Such reconfigurability is the key to save energy for battery-powered mobile devices, which widely use dynamic voltage and frequency scaling (DVFS) technique for hardware reconfiguration to prolong battery life. In this work, we creatively explore a hybrid block-structured pruning (BP) and pattern pruning (PP) for Transformer-based models and first attempt to combine hardware and software reconfiguration to maximally save energy for battery-powered mobile devices. Specifically, RT3 integrates two-level optimizations: First, it utilizes an efficient BP as the first-step compression for resource-constrained mobile devices; then, RT3 heuristically generates a shrunken search space based on the first level optimization and searches multiple pattern sets with diverse sparsity for PP via reinforcement learning to support lightweight software reconfiguration, which corresponds to available frequency levels of DVFS (i.e., hardware reconfiguration). At run-time, RT3 can switch the lightweight pattern sets within 45ms to guarantee the required real-time constraint at different frequency levels. Results further show that RT3 can prolong battery life over 4x improvement with less than 1% accuracy loss for Transformer and 1.5% score decrease for DistilBERT.

Submitted 11 February, 2021; originally announced February 2021.

Comments: 7 pages, 5 figures

arXiv:2101.10444 [pdf, ps, other]

GnetSeg: Semantic Segmentation Model Optimized on a 224mW CNN Accelerator Chip at the Speed of 318FPS

Authors: Baohua Sun, Weixiong Lin, Hao Sha, Jiapeng Su

Abstract: Semantic segmentation is the task to cluster pixels on an image belonging to the same class. It is widely used in the real-world applications including autonomous driving, medical imaging analysis, industrial inspection, smartphone camera for person segmentation and so on. Accelerating the semantic segmentation models on the mobile and edge devices are practical needs for the industry. Recent years have witnessed the wide availability of CNN (Convolutional Neural Networks) accelerators. They have the advantages on power efficiency, inference speed, which are ideal for accelerating the semantic segmentation models on the edge devices. However, the CNN accelerator chips also have the limitations on flexibility and memory. In addition, the CPU load is very critical because the CNN accelerator chip works as a co-processor with a host CPU. In this paper, we optimize the semantic segmentation model in order to fully utilize the limited memory and the supported operators on the CNN accelerator chips, and at the same time reduce the CPU load of the CNN model to zero. The resulting model is called GnetSeg. Furthermore, we propose the integer encoding for the mask of the GnetSeg model, which minimizes the latency of data transfer between the CNN accelerator and the host CPU. The experimental result shows that the model running on the 224mW chip achieves the speed of 318FPS with excellent accuracy for applications such as person segmentation.

Submitted 9 January, 2021; originally announced January 2021.

Comments: 7 pages, 3 figures, and 2 tables

arXiv:2012.14161 [pdf, other]

doi 10.1016/j.ppnp.2021.103906

Prospects for quarkonium studies at the high-luminosity LHC

Authors: Emilien Chapon, David d'Enterria, Bertrand Ducloue, Miguel G. Echevarria, Pol-Bernard Gossiaux, Vato Kartvelishvili, Tomas Kasemets, Jean-Philippe Lansberg, Ronan McNulty, Darren D. Price, Hua-Sheng Shao, Charlotte Van Hulse, Michael Winn, Jaroslav Adam, Liupan An, Denys Yen Arrebato Villar, Shohini Bhattacharya, Francesco G. Celiberto, Cvetan Cheshkov, Umberto D'Alesio, Cesar da Silva, Elena G. Ferreiro, Chris A. Flett, Carlo Flore, Maria Vittoria Garzelli, et al. (26 additional authors not shown)

Abstract: Prospects for quarkonium-production studies accessible during the upcoming high-luminosity phases of the CERN Large Hadron Collider operation after 2021 are reviewed. Current experimental and theoretical open issues in the field are assessed together with the potential for future studies in quarkonium-related physics. This will be possible through the exploitation of the huge data samples to be collected in proton-proton, proton-nucleus and nucleus-nucleus collisions, both in the collider and fixed-target modes. Such investigations include, among others, those of: (i) J/psi and Upsilon produced in association with other hard particles; (ii) chi(c,b) and eta(c,b) down to small transverse momenta; (iii) the constraints brought in by quarkonia on gluon PDFs, nuclear PDFs, TMDs, GPDs and GTMDs, as well as on the low-x parton dynamics; (iv) the gluon Sivers effect in polarised-nucleon collisions; (v) the properties of the quark-gluon plasma produced in ultra-relativistic heavy-ion collisions and of collective partonic effects in general; and (vi) double and triple parton scatterings.

Submitted 30 November, 2021; v1 submitted 28 December, 2020; originally announced December 2020.

Comments: Latex, 115 pages, 55 figures, 4 tables. v2: Review published in Progress in Particle and Nuclear Physics

Report number: MIT-CTP/5231

Journal ref: Progress in Particle and Nuclear Physics 122 (2022) 103906

arXiv:2012.11462 [pdf, other]

doi 10.1103/PhysRevD.104.014010

Reweighted nuclear PDFs using Heavy-Flavor Production Data at the LHC: nCTEQ15_rwHF & EPPS16_rwHF

Authors: Aleksander Kusina, Jean-Philippe Lansberg, Ingo Schienbein, Hua-Sheng Shao

Abstract: We present the reweighting of two sets of nuclear PDFs, nCTEQ15 and EPPS16, using a selection of experimental data on heavy-flavor meson [D0, J/psi, J/psi from B and Upsilon(1S)] production in proton-lead collisions at the LHC which were not used in the original determination of these nuclear PDFs. The reweighted PDFs exhibit significantly smaller uncertainties thanks to these new heavy-flavor constraints. We present a comparison with another selection of data from the LHC and RHIC which were not included in our reweighting procedure. The comparison is overall very good and serves as a validation of these reweighted nuclear PDF sets, which we dub nCTEQ15_rwHF & EPPS16_rwHF. This indicates that the LHC and forward RHIC heavy-flavor data can be described within the standard collinear factorization framework with the same (universal) small-x gluon distribution. We discuss how we believe such reweighted PDFs should be used as well as the limitations of our procedure.

Submitted 8 January, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

Comments: Latex, 20 pages, 20 figures, 1 table; v2: a few references added

Report number: IFJPAN-IV-2020-11

Journal ref: Phys. Rev. D 104, 014010 (2021)

arXiv:2012.02436 [pdf, other]

doi 10.1088/1748-0221/16/07/P07046

Design and Commissioning of the PandaX-4T Cryogenic Distillation System for Krypton and Radon Removal

Authors: Xiangyi Cui, Zhou Wang, Yonglin Ju, Xiuli Wang, Huaxuan Liu, Wenbo Ma, Jianglai Liu, Li Zhao, Xiangdong Ji, Shuaijie Li, Rui Yan, Haidong Sha, Peiyao Huang

Abstract: An online cryogenic distillation system for the removal of krypton and radon from xenon was designed and constructed for PandaX-4T, a highly sensitive dark matter detection experiment. The krypton content in a commercial xenon product is expected to be reduced by 7 orders of magnitude with 99% xenon collection efficiency at a flow rate of 10 kg/h by design. The same system can reduce radon content in xenon by reversed operation, with an expected radon reduction factor of about 1.8 in PandaX-4T under a flow rate of 56.5 kg/h. The commissioning of this system was completed, with krypton and radon operations tested under respective working conditions. The krypton concentration of the product xenon was measured with an upper limit of 8.0 ppt.

Submitted 18 May, 2021; v1 submitted 4 December, 2020; originally announced December 2020.

Comments: 18 pages, 12 figures

arXiv:2012.02033 [pdf, ps, other]

SuperOCR: A Conversion from Optical Character Recognition to Image Captioning

Authors: Baohua Sun, Michael Lin, Hao Sha, Lin Yang

Abstract: Optical Character Recognition (OCR) has many real world applications. The existing methods normally detect where the characters are, and then recognize the character for each detected location. Thus the accuracy of characters recognition is impacted by the performance of characters detection. In this paper, we propose a method for recognizing characters without detecting the location of each character. This is done by converting the OCR task into an image captioning task. One advantage of the proposed method is that the labeled bounding boxes for the characters are not needed during training. The experimental results show the proposed method outperforms the existing methods on both the license plate recognition and the watermeter character recognition tasks. The proposed method is also deployed into a low-power (300mW) CNN accelerator chip connected to a Raspberry Pi 3 for on-device applications.

Submitted 21 November, 2020; originally announced December 2020.

Comments: 8 pages, 2 figures, 2 tables

arXiv:2011.14105 [pdf, ps, other]

Characterizing Bipartite Consensus on Signed Matrix-Weighted Networks via Balancing Set

Authors: Chongzhi Wang, Lulu Pan, Haibin Shao, Dewei Li, Yugeng Xi

Abstract: In contrast with the scalar-weighted networks, where bipartite consensus can be achieved if and only if the underlying signed network is structurally balanced, the structural balance property is no longer a graph-theoretic equivalence to the bipartite consensus in the case of signed matrix-weighted networks. To re-establish the relationship between the network structure and the bipartite consensus solution, the non-trivial balancing set is introduced which is a set of edges whose sign negation can transform a structurally imbalanced network into a structurally balanced one and the weight matrices associated with edges in this set have a non-trivial intersection of null spaces. We show that necessary and/or sufficient conditions for bipartite consensus on matrix-weighted networks can be characterized by the uniqueness of the non-trivial balancing set, while the contribution of the associated non-trivial intersection of null spaces to the steady-state of the matrix-weighted network is examined. Moreover, for matrix-weighted networks with a positive-negative spanning tree, necessary and sufficient condition for bipartite consensus using the non-trivial balancing set is obtained. Simulation examples are provided to demonstrate the theoretical results.

Submitted 24 June, 2021; v1 submitted 28 November, 2020; originally announced November 2020.

arXiv:2011.04265 [pdf, other]

Higgs boson pair production at N$^3$LO QCD

Authors: Long-Bin Chen, Hai Tao Li, Hua-Sheng Shao, Jian Wang

Abstract: Understanding the Higgs potential by measuring its self-interactions is fundamental in answering several big questions, such as electroweak symmetry breaking, electroweak baryogenesis, electroweak phase transition, and electroweak vacuum stability. The most promising way to probe the Higgs potential is to detect Higgs boson pair final state at high-energy colliders. In this talk, we report a recent perturbative calculation for the di-Higgs gluon-fusion process by taking into account N$^3$LO QCD radiative corrections in the approximation of infinite top quark mass limit. Finite top quark mass effects are also incorporated with several approximate schemes, which are known to be crucial in phenomenological applications. We show a very good asymptotic perturbative convergence at $\mathcal{O}(α_s^5)$, and demonstrate that the remaining scale uncertainty is only at percent level.

Submitted 13 November, 2020; v1 submitted 9 November, 2020; originally announced November 2020.

Comments: 7 pages, 2 figures, accepted contribution to proceedings of 40th International Conference on High Energy physics (ICHEP2020), July 28 - August 6, 2020, Prague, Czech Republic (virtual meeting); v2: update references

Journal ref: PoS(ICHEP2020)084

arXiv:2011.01754 [pdf, other]

ControlVAE: Tuning, Analytical Properties, and Performance Analysis

Authors: Huajie Shao, Zhisheng Xiao, Shuochao Yao, Aston Zhang, Shengzhong Liu, Tarek Abdelzaher

Abstract: This paper reviews the novel concept of controllable variational autoencoder (ControlVAE), discusses its parameter tuning to meet application needs, derives its key analytic properties, and offers useful extensions and applications. ControlVAE is a new variational autoencoder (VAE) framework that combines the automatic control theory with the basic VAE to stabilize the KL-divergence of VAE models to a specified value. It leverages a non-linear PI controller, a variant of the proportional-integral-derivative (PID) control, to dynamically tune the weight of the KL-divergence term in the evidence lower bound (ELBO) using the output KL-divergence as feedback. This allows us to precisely control the KL-divergence to a desired value (set point), which is effective in avoiding posterior collapse and learning disentangled representations. In order to improve the ELBO over the regular VAE, we provide simplified theoretical analysis to inform setting the set point of KL-divergence for ControlVAE. We observe that compared to other methods that seek to balance the two terms in VAE's objective, ControlVAE leads to better learning dynamics. In particular, it can achieve a good trade-off between reconstruction quality and KL-divergence. We evaluate the proposed method on three tasks: image generation, language modeling and disentangled representation learning. The results show that ControlVAE can achieve much better reconstruction quality than the other methods for comparable disentanglement. On the language modeling task, ControlVAE can avoid posterior collapse (KL vanishing) and improve the diversity of generated text. Moreover, our method can change the optimization trajectory, improving the ELBO and the reconstruction quality for image generation.

Submitted 31 October, 2020; originally announced November 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2004.05988

arXiv:2011.01112 [pdf, other]

Scheduling Real-time Deep Learning Services as Imprecise Computations

Authors: Shuochao Yao, Yifan Hao, Yiran Zhao, Huajie Shao, Dongxin Liu, Shengzhong Liu, Tianshi Wang, Jinyang Li, Tarek Abdelzaher

Abstract: The paper presents an efficient real-time scheduling algorithm for intelligent real-time edge services, defined as those that perform machine intelligence tasks, such as voice recognition, LIDAR processing, or machine vision, on behalf of local embedded devices that are themselves unable to support extensive computations. The work contributes to a recent direction in real-time computing that develops scheduling algorithms for machine intelligence tasks with anytime prediction. We show that deep neural network workflows can be cast as imprecise computations, each with a mandatory part and (several) optional parts whose execution utility depends on input data. The goal of the real-time scheduler is to maximize the average accuracy of deep neural network outputs while meeting task deadlines, thanks to opportunistic shedding of the least necessary optional parts. The work is motivated by the proliferation of increasingly ubiquitous but resource-constrained embedded devices (for applications ranging from autonomous cars to the Internet of Things) and the desire to develop services that endow them with intelligence. Experiments on recent GPU hardware and a state of the art deep neural network for machine vision illustrate that our scheme can increase the overall accuracy by 10%-20% while incurring (nearly) no deadline misses.

Submitted 2 November, 2020; originally announced November 2020.

arXiv:2010.15086 [pdf, other]

doi 10.1109/GlobalSIP.2018.8646594

Forgery Blind Inspection for Detecting Manipulations of Gel Electrophoresis Images

Authors: Hao-Chiang Shao, Ya-Jen Cheng, Meng-Yun Duh, Chia-Wen Lin

Abstract: Recently, falsified images have been found in papers involved in research misconducts. However, although there have been many image forgery detection methods, none of them was designed for molecular-biological experiment images. In this paper, we proposed a fast blind inquiry method, named FBI$_{GEL}$, for integrity of images obtained from two common sorts of molecular experiments, i.e., western blot (WB) and polymerase chain reaction (PCR). Based on an optimized pseudo-background capable of highlighting local residues, FBI$_{GEL}$ can reveal traceable vestiges suggesting inappropriate local modifications on WB/PCR images. Additionally, because the optimized pseudo-background is derived according to a closed-form solution, FBI$_{GEL}$ is computationally efficient and thus suitable for large scale inquiry tasks for WB/PCR image integrity. We applied FBI$_{GEL}$ on several papers questioned by the public on PUBPEER, and our results show that figures of those papers indeed contain doubtful unnatural patterns.

Submitted 28 October, 2020; originally announced October 2020.

Comments: This version is an extension of Prof. Shao's previous conference paper (IEEE GlobalSIP 2018): "Unveiling Vestiges of Man-Made Modifications on Molecular-Biological Experiment Images." (https://doi.org/10.1109/GlobalSIP.2018.8646594)

arXiv:2010.14670 [pdf, ps, other]

Online Learning with Primary and Secondary Losses

Authors: Avrim Blum, Han Shao

Abstract: We study the problem of online learning with primary and secondary losses. For example, a recruiter making decisions of which job applicants to hire might weigh false positives and false negatives equally (the primary loss) but the applicants might weigh false negatives much higher (the secondary loss). We consider the following question: Can we combine "expert advice" to achieve low regret with respect to the primary loss, while at the same time performing not much worse than the worst expert with respect to the secondary loss? Unfortunately, we show that this goal is unachievable without any bounded variance assumption on the secondary loss. More generally, we consider the goal of minimizing the regret with respect to the primary loss and bounding the secondary loss by a linear threshold. On the positive side, we show that running any switching-limited algorithm can achieve this goal if all experts satisfy the assumption that the secondary loss does not exceed the linear threshold by $o(T)$ for any time interval. If not all experts satisfy this assumption, our algorithms can achieve this goal given access to some external oracles which determine when to deactivate and reactivate experts.

Submitted 27 October, 2020; originally announced October 2020.

arXiv:2010.08061 [pdf, ps, other]

Stochastic Bandits with Vector Losses: Minimizing $\ell^\infty$-Norm of Relative Losses

Authors: Xuedong Shang, Han Shao, Jian Qian

Abstract: Multi-armed bandits are widely applied in scenarios like recommender systems, for which the goal is to maximize the click rate. However, more factors should be considered, e.g., user stickiness, user growth rate, user experience assessment, etc. In this paper, we model this situation as a problem of $K$-armed bandit with multiple losses. We define relative loss vector of an arm where the $i$-th entry compares the arm and the optimal arm with respect to the $i$-th loss. We study two goals: (a) finding the arm with the minimum $\ell^\infty$-norm of relative losses with a given confidence level (which refers to fixed-confidence best-arm identification); (b) minimizing the $\ell^\infty$-norm of cumulative relative losses (which refers to regret minimization). For goal (a), we derive a problem-dependent sample complexity lower bound and discuss how to achieve matching algorithms. For goal (b), we provide a regret lower bound of $Ω(T^{2/3})$ and provide a matching algorithm.

Submitted 15 October, 2020; originally announced October 2020.

Comments: 14 pages

arXiv:2010.06620 [pdf, other]

doi 10.1103/PhysRevLett.126.011102

Improved limits for violations of local position invariance from atomic clock comparisons

Authors: R. Lange, N. Huntemann, J. M. Rahm, C. Sanner, H. Shao, B. Lipphardt, Chr. Tamm, S. Weyers, E. Peik

Abstract: We compare two optical clocks based on the $^2$S$_{1/2}(F=0)\to {}^2$D$_{3/2}(F=2)$ electric quadrupole (E2) and the $^2$S$_{1/2}(F=0)\to {}^2$F$_{7/2}(F=3)$ electric octupole (E3) transition of $^{171}$Yb$^{+}$ and measure the frequency ratio $ν_{\mathrm{E3}}/ν_{\mathrm{E2}}=0.932\,829\,404\,530\,965\,376(32)$. We determine the transition frequency $ν_{\mathrm{E3}}=642\,121\,496\,772\,645.10(8)$ Hz using two caesium fountain clocks. Repeated measurements of both quantities over several years are analyzed for potential violations of local position invariance. We improve by factors of about 20 and 2 the limits for fractional temporal variations of the fine structure constant $α$ to $1.0(1.1)\times10^{-18}/\mathrm{yr}$ and of the proton-to-electron mass ratio $μ$ to $-8(36)\times10^{-18}/\mathrm{yr}$. Using the annual variation of the Sun's gravitational potential at Earth $Φ$, we improve limits for a potential coupling of both constants to gravity, $(c^2/α) (dα/dΦ)=14(11)\times 10^{-9}$ and $(c^2/μ) (dμ/dΦ)=7(45)\times 10^{-8}$.

Submitted 7 January, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: 6 pages, 3 figures

Journal ref: Phys. Rev. Lett. 126, 011102 (2021)

arXiv:2010.05219 [pdf, other]

Towards Accurate Predictions of Carrier Mobilities and Thermoelectric Performances in 2D Materials

Authors: Yu Wu, Bowen Hou, Ying Chen, Jiang Cao, Congcong Ma, Hezhu Shao, Yiming Zhang, Zixuan Lu, Heyuan Zhu, Zhilai Fang, Rongjun Zhang, Hao Zhang

Abstract: The interactions between electrons and lattice vibrational modes play the key role in determining the carrier transport properties, thermoelectric performance and other physical quantities related to phonons in semiconductors. However, for two-dimensional (2D) materials, the widely-used models for carrier transport only consider the interactions between electrons and some specific phonon modes, which usually leads to inaccurate predictions of electrons/phonons transport properties. In this work, comprehensive investigations on full electron-phonon couplings and their influences on carrier mobility and thermoelectric performances of 2D group-IV and V elemental monolayers were performed, and we also analyzed in details the selection rules on electron-phonon couplings using group-theory arguments. Our calculations revealed that, for the cases of shallow dopings where only intravalley scatterings are allowed, the contributions from optical phonon modes are significantly larger than those from acoustic phonon modes in group-IV elemental monolayers, and LA and some specific optical phonon modes contribute significantly to the total intravalley scatterings. When the doping increases and intervalley scatterings are allowed, the intervalley scatterings are much stronger than intravalley scatterings, and ZA/TA/LO phonon modes dominate the intervalley scatterings in monolayer Si, Ge and Sn. The dominant contributions to the total intervalley scatterings are ZA/TO in monolayer P, ZA/TO in monolayer As and TO/LO in monolayer Sb. Based on the thorough investigations on the full electron-phonon couplings, we predict accurately the carrier mobilities and thermoelectric figure of merits in these two elemental crystals, and reveal significant reductions when compared with the calculations based on the widely-used simplified model.

Submitted 11 October, 2020; originally announced October 2020.

Comments: 7 pages

arXiv:2009.12555 [pdf, other]

Associated production in pp and heavy ion collisions

Authors: Hua-Sheng Shao

Abstract: Associated particle production processes in pp and heavy ion collisions at the LHC are in particular interesting in the sense that they provide unique tools to study double parton scattering (DPS) mechanism. In this talk, I will first review the recent theoretical, phenomenological and experimental developments of DPS in pp collisions. Then, I will focus on the DPS studies in heavy ion collisions, and stress their roles in understanding the cold nuclear matter effects, such as the (poorly known) impact-parameter dependent nuclear parton densities.

Submitted 26 September, 2020; originally announced September 2020.

Comments: 4 pages, 1 figure, accepted contribution to proceedings of the 8th Annual Conference on Large Hadron Collider Physics (LHCP2020), 25-30 May, 2020, online

Journal ref: PoS(LHCP2020)172

arXiv:2009.08264 [pdf, other]

doi 10.1016/j.physletb.2020.135926

Large-P_T inclusive photoproduction of J/psi in electron-proton collisions at HERA and the EIC

Authors: Carlo Flore, Jean-Philippe Lansberg, Hua-Sheng Shao, Yelyzaveta Yedelkina

Abstract: We study the inclusive J/psi production at large transverse momenta at lepton-hadron colliders in the limit when the exchange photon is quasi real, also referred to as photoproduction. Our computation includes the leading-P_T leading-v next-to-leading alpha_s corrections. In particular, we consider the contribution from J/psi plus another charm quark, by employing for the first time in quarkonium photoproduction the variable-flavour-number scheme. We also include a QED-induced contribution via an off-shell photon which remained ignored in the literature and which we show to be the leading contribution at high P_T within the reach of the EIC. In turn, we use our computation of J/psi+charm to demonstrate its observability at the future EIC and the EIC sensitivity to probe the non-perturbative charm content of the proton at high x.

Submitted 17 September, 2020; originally announced September 2020.

Comments: LaTeX, 11 pages; 15 figures

arXiv:2009.06795 [pdf, other]

DynamicVAE: Decoupling Reconstruction Error and Disentangled Representation Learning

Authors: Huajie Shao, Haohong Lin, Qinmin Yang, Shuochao Yao, Han Zhao, Tarek Abdelzaher

Abstract: This paper challenges the common assumption that the weight $β$, in $β$-VAE, should be larger than $1$ in order to effectively disentangle latent factors. We demonstrate that $β$-VAE, with $β< 1$, can not only attain good disentanglement but also significantly improve reconstruction accuracy via dynamic control. The paper removes the inherent trade-off between reconstruction accuracy and disentanglement for $β$-VAE. Existing methods, such as $β$-VAE and FactorVAE, assign a large weight to the KL-divergence term in the objective function, leading to high reconstruction errors for the sake of better disentanglement. To mitigate this problem, a ControlVAE has recently been developed that dynamically tunes the KL-divergence weight in an attempt to control the trade-off to a more favorable point. However, ControlVAE fails to eliminate the conflict between the need for a large $β$ (for disentanglement) and the need for a small $β$. Instead, we propose DynamicVAE that maintains a different $β$ at different stages of training, thereby decoupling disentanglement and reconstruction accuracy. In order to evolve the weight, $β$, along a trajectory that enables such decoupling, DynamicVAE leverages a modified incremental PI (proportional-integral) controller, and employs a moving average as well as a hybrid annealing method to evolve the value of KL-divergence smoothly in a tightly controlled fashion. We theoretically prove the stability of the proposed approach. Evaluation results on three benchmark datasets demonstrate that DynamicVAE significantly improves the reconstruction accuracy while achieving disentanglement comparable to the best of existing methods. The results verify that our method can separate disentangled representation learning and reconstruction, removing the inherent tension between the two.

Submitted 30 September, 2020; v1 submitted 14 September, 2020; originally announced September 2020.

arXiv:2009.03249 [pdf, other]

doi 10.1103/PhysRevB.103.054418

Unconventional U(1) to $\mathbf{Z_q}$ cross-over in quantum and classical ${\bf q}$-state clock models

Authors: Pranay Patil, Hui Shao, Anders W. Sandvik

Abstract: We consider two-dimensional $q$-state quantum clock models with quantum fluctuations connecting states with clock transitions with different choices for matrix elements. We study the quantum phase transitions in these models using quantum Monte Carlo simulations, with the aim of characterizing the cross-over from emergent U(1) symmetry at the transition (for $q \ge 4$) to $Z_q$ symmetry of the ordered state. We also study classical three-dimensional clock models with spatial anisotropy corresponding to the space-time anisotropy of the quantum systems. The U(1) to ${Z_q}$ symmetry cross-over in all these systems is governed by a dangerously irrelevant operator. We specifically study $q=5$ and $q=6$ models with different forms of the quantum fluctuations and different anisotropies in the classical models. We find the expected classical XY critical exponents and scaling dimensions $y_q$ of the clock fields. However, the initial weak violation of the U(1) symmetry in the ordered phase, characterized by a $Z_q$ symmetric order parameter $φ_q$, scales in an unexpected way. As a function of the system size $L$, close to the critical temperature $φ_q \propto L^p$, where the known value of the exponent is $p=2$ in the classical isotropic clock model. In contrast, for strongly anisotropic classical models and the quantum models we find $p=3$. For weakly anisotropic classical models we observe a cross-over from $p=2$ to $p=3$ scaling. The exponent $p$ directly impacts the exponent $ν'$ governing the divergence of the U(1) to $Z_q$ cross-over length scale $ξ'$ in the thermodynamic limit, according to the relationship $ν'=ν(1+|y_q|/p)$, where $ν$ is the conventional correlation length exponent. We present a phenomenological argument based on an anomalous renormalization of the clock field in the presence of anisotropy, possibly as a consequence of topological (vortex) line defects.

Submitted 20 June, 2024; v1 submitted 7 September, 2020; originally announced September 2020.

Comments: Due to small technical error pointed out by readers, we have replaced page 20, left column, last sentence "These operators instead ... is not necessary)."

Journal ref: Phys. Rev. B 103, 054418 (2021)

arXiv:2008.11889 [pdf, other]

Self-cleaning and self-cooling paper

Authors: Yanpei Tian, Hong Shao, Xiaojie Liu, Fangqi Chen, Yongsheng Li, Changyu Tang, Yi Zheng

Abstract: The technique of passive daytime radiative cooling (PDRC) is used to cool an object down by simultaneously reflecting sunlight and thermally radiating heat to the cold outer space through the Earth's atmospheric window. However, for practical applications, current PDRC materials are facing unprecedented challenges such as complicated and expensive fabrication approaches and performance degradation arising from surface contamination. Here, we develop a scalable paper-based material with excellent self-cleaning and self-cooling capabilities, through air-spraying ethanolic polytetrafluoroethylene (PTFE) microparticle suspensions embedded within the micropores of the paper. The formed superhydrophobic PTFE coating not only protects the paper from water wetting and dust contamination for real-life applications but also reinforces its solar reflectance by sunlight backscattering. The paper fibers, when enhanced with PTFE particles, efficiently reflect sunlight and strongly radiate heat through the atmospheric window, resulting in a sub-ambient cooling performance of 5$^{\circ}$C and radiative cooling power of 104 W/m$^2$ under direct solar irradiance of 834 W/m$^2$ and 671 W/m$^2$, respectively. The self-cleaning surface of the cooling paper extends its lifespan and keeps its good cooling performance for outdoor applications. Additionally, dyed papers are experimentally studied for broad engineering applications. They can absorb appropriate visible wavelengths to display specific colors and effectively reflect near-infrared light to reduce solar heating, which simultaneously achieves effective radiative cooling and aesthetic variety in a cost-effective, scalable, and energy-efficient way.

Submitted 30 August, 2020; v1 submitted 26 August, 2020; originally announced August 2020.

arXiv:2007.15189 [pdf, other]

Deep Multi-View Spatiotemporal Virtual Graph Neural Network for Significant Citywide Ride-hailing Demand Prediction

Authors: Guangyin Jin, Zhexu Xi, Hengyu Sha, Yanghe Feng, Jincai Huang

Abstract: Urban ride-hailing demand prediction is a crucial but challenging task for intelligent transportation system construction. Predictable ride-hailing demand can facilitate more reasonable vehicle scheduling and online car-hailing platform dispatch. Conventional deep learning methods with no external structured data can be accomplished via hybrid models of CNNs and RNNs by meshing plentiful pixel-level labeled data, but spatial data sparsity and limited learning capabilities on temporal long-term dependencies are still two striking bottlenecks. To address these limitations, we propose a new virtual graph modeling method to focus on significant demand regions and a novel Deep Multi-View Spatiotemporal Virtual Graph Neural Network (DMVST-VGNN) to strengthen learning capabilities of spatial dynamics and temporal long-term dependencies. Specifically, DMVST-VGNN integrates the structures of 1D Convolutional Neural Network, Multi Graph Attention Neural Network and Transformer layer, which correspond to the short-term temporal dynamics view, spatial dynamics view and long-term temporal dynamics view, respectively. In this paper, experiments are conducted on two large-scale New York City datasets in fine-grained prediction scenes. The experimental results demonstrate the effectiveness and superiority of the DMVST-VGNN framework in significant citywide ride-hailing demand prediction.

Submitted 9 September, 2020; v1 submitted 29 July, 2020; originally announced July 2020.

arXiv:2007.09883 [pdf, other]

Complementary Boundary Generator with Scale-Invariant Relation Modeling for Temporal Action Localization: Submission to ActivityNet Challenge 2020

Authors: Haisheng Su, Jinyuan Feng, Hao Shao, Zhenyu Jiang, Manyuan Zhang, Wei Wu, Yu Liu, Hongsheng Li, Junjie Yan

Abstract: This technical report presents an overview of our solution used in the submission to ActivityNet Challenge 2020 Task 1 (\textbf{temporal action localization/detection}). Temporal action localization requires not only precisely locating the temporal boundaries of action instances, but also accurately classifying the untrimmed videos into specific categories. In this paper, we decouple the temporal action localization task into two stages (i.e. proposal generation and classification) and enrich the proposal diversity through exhaustively exploring the influences of multiple components from different but complementary perspectives. Specifically, in order to generate high-quality proposals, we consider several factors including the video feature encoder, the proposal generator, the proposal-proposal relations, the scale imbalance, and ensemble strategy. Finally, in order to obtain accurate detections, we need to further train an optimal video classifier to recognize the generated proposals. Our proposed scheme achieves state-of-the-art performance on the temporal action localization task with \textbf{42.26} average mAP on the challenge testing set.

Submitted 25 August, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

Comments: Submitted to CVPR workshop of ActivityNet Challenge 2020

arXiv:2007.09193 [pdf, ps, other]

Tractable Profit Maximization over Multiple Attributes under Discrete Choice Models

Authors: Hongzhang Shao, Anton J. Kleywegt

Abstract: A fundamental problem in revenue management is to optimally choose the attributes of products, such that the total profit or revenue or market share is maximized. Usually, these attributes can affect both a product's market share (probability to be chosen) and its profit margin. For example, if a smart phone has a better battery, then it is more costly to produce, but is more likely to be purchased by a customer. The decision maker then needs to choose an optimal vector of attributes for each product that balances this trade-off. In spite of the importance of such problems, there is not yet a method to solve them efficiently in general. Past literature in revenue management and discrete choice models focuses on pricing problems, where price is the only attribute to be chosen for each product. Existing approaches to solve pricing problems tractably cannot be generalized to the optimization problem with multiple product attributes as decision variables. On the other hand, papers studying product line design with multiple attributes all result in intractable optimization problems. We found a way to reformulate the static multi-attribute optimization problem, as well as the multi-stage fluid optimization problem with both resource constraints and upper and lower bounds of attributes, as a tractable convex conic optimization problem. Our result applies to optimization problems under the multinomial logit (MNL) model, the Markov chain (MC) choice model, and with certain conditions, the nested logit (NL) model.

Submitted 22 December, 2021; v1 submitted 17 July, 2020; originally announced July 2020.

arXiv:2007.02494 [pdf]

doi 10.1109/tpwrs.2021.3062359

Data Based Linearization: Least-Squares Based Approximation

Authors: Zhentong Shao, Qiaozhu Zhai, Jiang Wu, Xiaohong Guan

Abstract: Linearization of power flow is an important topic in power system analysis. The computational burden can be greatly reduced under the linear power flow model while the model error is the main concern. Therefore, various linear power flow models have been proposed in the literature and dedicated to seeking the optimal approximation. Most linear power flow models are based on some kind of transformation/simplification/Taylor expansion of AC power flow equations and fail to be accurate under cold-start mode. It is surprising that data-based linearization methods have not yet been fully investigated. In this paper, the performance of a data-based least-squares approximation method is investigated. The resulting cold-start sensitivity factors are named least-squares distribution factors (LSDF). Compared with the traditional power transfer distribution factors (PTDF), it is found that the LSDF can work very well for systems with large load variation, and the average error of LSDF is only about 1% of the average error of PTDF. Comprehensive numerical testing is performed and the results show that LSDF has attractive performance in all studied cases and has great application potential in occasions requiring only cold-start linear power flow models.

Submitted 5 July, 2020; originally announced July 2020.

arXiv:2007.00969 [pdf, other]

Structure Adaptive Algorithms for Stochastic Bandits

Authors: Rémy Degenne, Han Shao, Wouter M. Koolen

Abstract: We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc. Our aim is to develop methods that are flexible (in that they easily adapt to different structures), powerful (in that they perform well empirically and/or provably match instance-dependent lower bounds) and efficient in that the per-round computational burden is small. We develop asymptotically optimal algorithms from instance-dependent lower-bounds using iterative saddle-point solvers. Our approach generalises recent iterative methods for pure exploration to reward maximisation, where a major challenge arises from the estimation of the sub-optimality gaps and their reciprocals. Still we manage to achieve all the above desiderata. Notably, our technique avoids the computational cost of the full-blown saddle point oracle employed by previous work, while at the same time enabling finite-time regret bounds. Our experiments reveal that our method successfully leverages the structural assumptions, while its regret is at worst comparable to that of vanilla UCB.

Submitted 2 July, 2020; originally announced July 2020.

Comments: 10+18 pages. To be published in the proceedings of ICML 2020

arXiv:2006.09116 [pdf, ps, other]

1st place solution for AVA-Kinetics Crossover in AcitivityNet Challenge 2020

Authors: Siyu Chen, Junting Pan, Guanglu Song, Manyuan Zhang, Hao Shao, Ziyi Lin, Jing Shao, Hongsheng Li, Yu Liu

Abstract: This technical report introduces our winning solution to the spatio-temporal action localization track, AVA-Kinetics Crossover, in ActivityNet Challenge 2020. Our entry is mainly based on Actor-Context-Actor Relation Network. We describe technical details for the new AVA-Kinetics dataset, together with some experimental results. Without any bells and whistles, we achieved 39.62 mAP on the test set of AVA-Kinetics, which outperforms other entries by a large margin. Code will be available at: https://github.com/Siyu-C/ACAR-Net.

Submitted 16 June, 2020; originally announced June 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2006.07976

arXiv:2006.03897 [pdf, other]

Accurately Solving Physical Systems with Graph Learning

Authors: Han Shao, Tassilo Kugelstadt, Torsten Hädrich, Wojciech Pałubicki, Jan Bender, Sören Pirk, Dominik L. Michels

Abstract: Iterative solvers are widely used to accurately simulate physical systems. These solvers require initial guesses to generate a sequence of improving approximate solutions. In this contribution, we introduce a novel method to accelerate iterative solvers for physical systems with graph networks (GNs) by predicting the initial guesses to reduce the number of iterations. Unlike existing methods that aim to learn physical systems in an end-to-end manner, our approach guarantees long-term stability and therefore leads to more accurate solutions. Furthermore, our method improves the run time performance of traditional iterative solvers. To explore our method we make use of position-based dynamics (PBD) as a common solver for physical systems and evaluate it by simulating the dynamics of elastic rods. Our approach is able to generalize across different initial conditions, discretizations, and realistic material properties. Finally, we demonstrate that our method also performs well when taking discontinuous effects into account such as collisions between individual rods. Finally, to illustrate the scalability of our approach, we simulate complex 3D tree models composed of over a thousand individual branch segments swaying in wind fields. A video showing dynamic results of our graph learning assisted simulations of elastic rods can be found on the project website available at http://computationalsciences.org/publications/shao-2021-physical-systems-graph-learning.html .

Submitted 13 January, 2021; v1 submitted 6 June, 2020; originally announced June 2020.

Comments: This work has been supported by KAUST under individual baseline and center partnership funding

MSC Class: Machine Learning (cs.LG); Machine Learning (stat.ML)

arXiv:2005.14687 [pdf, other]

doi 10.1103/PhysRevLett.125.143201

Coherent suppression of tensor frequency shifts through magnetic field rotation

Authors: R. Lange, N. Huntemann, C. Sanner, H. Shao, B. Lipphardt, Chr. Tamm, E. Peik

Abstract: We introduce a scheme to coherently suppress second-rank tensor frequency shifts in atomic clocks, relying on the continuous rotation of an external magnetic field during the free atomic state evolution in a Ramsey sequence. The method retrieves the unperturbed frequency within a single interrogation cycle and is readily applicable to various atomic clock systems. For the frequency shift due to the electric quadrupole interaction, we experimentally demonstrate suppression by more than two orders of magnitude for the ${}^2S_{1/2} \to {}^2D_{3/2}$ transition of a single trapped ${}^{171}\text{Yb}^+$ ion. The scheme provides particular advantages in the case of the ${}^{171}\text{Yb}^+$ ${}^2S_{1/2} \to {}^2F_{7/2}$ electric octupole (E3) transition. For an improved estimate of the residual quadrupole shift for this transition, we measure the excited state electric quadrupole moments $Θ({}^2D_{3/2}) = 1.95(1)~ea_0^2$ and $Θ({}^2F_{7/2}) = -0.0297(5)~ea_0^2$ with $e$ the elementary charge and $a_0$ the Bohr radius, improving the measurement uncertainties by one order of magnitude.

Submitted 29 May, 2020; originally announced May 2020.

Comments: 6 pages, 3 figures

Journal ref: Phys. Rev. Lett. 125, 143201 (2020)

arXiv:2005.12967 [pdf, other]

doi 10.1103/PhysRevD.102.034023

$J/ψ$ meson production in association with an open charm hadron at the LHC: A reappraisal

Authors: Hua-Sheng Shao

Abstract: We critically (re)examine the associated production process of a $J/ψ$ meson plus an open charm hadron at the LHC in the proton-proton ($pp$) and proton-lead ($p{\rm Pb}$) collisions. Such a process is very intriguing in the sense of tailoring to explore the double parton structure of nucleons and to determine the geometry of partons in nuclei. In order to interpret the existing $pp$ data with the LHCb detector at the center-of-mass energy $\sqrt{s}=7$ TeV, we introduce two overlooked mechanisms for the double parton scattering (DPS) and single parton scattering (SPS) processes. Besides the conventional DPS mode, where the two mesons are produced almost independently in the two separate scattering subprocesses, we propose a novel DPS mechanism in which the two constituent (heavy) quarks stemming from two hard scatterings can form a composite particle, like the $J/ψ$ meson, during the hadronization phase. However, it turns out the corresponding contribution is small in $J/ψ+c\bar{c}$ hadroproduction. On the contrary, we point out that the resummation of the initial state logarithms due to gluon splitting into a charm quark pair is crucial to understand the LHCb measurement, which was overlooked in the literature. We perform a proper matching between the perturbative calculations in different initial-quark flavor number schemes, generically referring to the variable flavor number scheme. The new variable flavor number scheme calculation for the process strongly enhances the SPS cross sections, almost closing the discrepancies between theory and experiment. Finally, we present our predictions for the forthcoming LHCb measurement in $p{\rm Pb}$ collisions at $\sqrt{s_{NN}}=8.16$ TeV. Some interesting observables are exploited to set up the control regions of the DPS signal and to probe the impact-parameter dependent parton densities in lead.

Submitted 24 August, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

Comments: 32 pages, 20 figures, 7 tables; v2: match to journal version (add a new figure and a few references, fix a few typos)

Journal ref: Phys. Rev. D 102, 034023 (2020)

arXiv:2005.12044 [pdf]

A Joint Pixel and Feature Alignment Framework for Cross-dataset Palmprint Recognition

Authors: Huikai Shao, Dexing Zhong

Abstract: Deep learning-based palmprint recognition algorithms have shown great potential. Most of them are mainly focused on identifying samples from the same dataset. However, they may not be suitable for the more convenient case in which the images for training and testing are from different datasets, such as those collected by embedded terminals and smartphones. Therefore, we propose a novel Joint Pixel and Feature Alignment (JPFA) framework for such cross-dataset palmprint recognition scenarios. Two-stage alignment is applied to obtain adaptive features in source and target datasets. 1) A deep style transfer model is adopted to convert source images into fake images to reduce the dataset gaps and perform data augmentation on pixel level. 2) A new deep domain adaptation model is proposed to extract adaptive features by aligning the dataset-specific distributions of target-source and target-fake pairs on feature level. Adequate experiments are conducted on several benchmarks including constrained and unconstrained palmprint databases. The results demonstrate that our JPFA outperforms other models and achieves state-of-the-art performance. Compared with baseline, the accuracy of cross-dataset identification is improved by up to 28.10% and the Equal Error Rate (EER) of cross-dataset verification is reduced by up to 4.69%. To make our results reproducible, the codes are publicly available at http://gr.xjtu.edu.cn/web/bell/resource.

Submitted 25 May, 2020; originally announced May 2020.

Comments: 12 pages, 7 figures

arXiv:2005.10277 [pdf, other]

doi 10.1007/JHEP11(2020)036

RIP $H b \bar b$: How other Higgs production modes conspire to kill a rare signal at the LHC

Authors: Davide Pagani, Hua-Sheng Shao, Marco Zaro

Abstract: The hadroproduction of a Higgs boson in association with a bottom-quark pair ($H b \bar b$) is commonly considered as the key process for directly probing the Yukawa interaction between the Higgs boson and the bottom quark ($y_b$). However, in the Standard-Model (SM) this process is also known to suffer from very large irreducible backgrounds from other Higgs production channels, notably gluon-fusion ($gg$F). In this paper we calculate for the first time the so-called QCD and electroweak complete-NLO predictions for $H b \bar b$ production, using the four-flavour scheme. Our calculation shows that not only the $gg$F but also the $ZH$ and even the vector-boson fusion channels are sizeable irreducible backgrounds. Moreover, we demonstrate that, at the LHC, the rates of these backgrounds are very large with respect to the "genuine" and $y_b$-dependent $H b \bar b$ production mode. In particular, no suppression occurs at the differential level and therefore backgrounds survive typical analysis cuts. This fact further jeopardises the chances of measuring at the LHC the $y_b$-dependent component of $H b \bar b$ production in the SM. In particular, unless $y_b$ is significantly enlarged by new physics, even for beyond-the-SM scenarios the direct determination of $y_b$ via this process seems to be hopeless at the LHC.

Submitted 18 November, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

Comments: 33 pages, 8 figures, 4 tables. Version accepted for publication in JHEP

Report number: DESY 20-089, TIF-UNIMI-2020-16

Journal ref: JHEP 11 (2020) 036

arXiv:2005.08102 [pdf, other]

doi 10.1007/JHEP07(2020)127

Rare two-body decays of the top quark into a bottom meson plus an up or charm quark

Authors: David d'Enterria, Hua-Sheng Shao

Abstract: Rare two-body decays of the top quark into a neutral bottom-quark meson plus an up- or charm-quark: $t\to {\overline B}^0+ u, c$; $t\to {\overline B}^0_{s}+ c,u$; and $t \to Υ(nS)+ c,u$, are studied for the first time. The corresponding partial widths are computed at leading order in the non-relativistic QCD framework. The sums of all two-body branching ratios amount to $\mathcal{B}(t \to {\overline B}^0+ {\rm jet}) \approx \mathcal{B}(t \to {\overline B}^0_{s}+ {\rm jet}) \approx 4.2\cdot 10^{-5}$ and $\mathcal{B}(t \to Υ(nS)+ {\rm jet}) \approx 2\cdot 10^{-9}$, respectively. The feasibility to observe the $t\to {\overline B}^0_{(s)}+{\rm jet}$ decay is estimated in top-pair events produced in proton-proton collisions at $\sqrt{s} = 14, 100$ TeV at the LHC and FCC, respectively. Combining many exclusive hadronic ${\overline B}^0_{(s)}$ decays, with $J/ψ$ or $D^{0,\pm}$ final states, about 50 (16000) events are expected in 3 (20) ab$^{-1}$ of integrated luminosity at the LHC (FCC), after typical selection criteria, acceptance, and efficiency losses. An observation of the two-body top-quark decay can also be achieved in the interesting $t\to b(\rm{jet})+c(\rm{jet})$ dijet final state, where the ${\overline B}^0_{(s)}$ decay products are reconstructed as a jet, with 5300 and 1.4 million signal events above backgrounds expected after selection criteria at the LHC and FCC, respectively. Such unique final states provide a new direct method to precisely measure the top-quark mass via simple 2-body invariant mass analyses.

Submitted 16 May, 2020; originally announced May 2020.

Comments: 26 pages, 5 figures

arXiv:2004.14345 [pdf, other]

doi 10.1016/j.physletb.2020.135559

Complete NLO QCD study of single- and double-quarkonium hadroproduction in the colour-evaporation model at the Tevatron and the LHC

Authors: Jean-Philippe Lansberg, Hua-Sheng Shao, Nodoka Yamanaka, Yu-Jie Zhang, Camille Noûs

Abstract: We study the Single-Parton-Scattering (SPS) production of double quarkonia (J/psi+J/psi, J/psi+Upsilon, and Upsilon+Upsilon) in pp and pp(bar) collisions at the LHC and the Tevatron as measured by the CMS, ATLAS, LHCb, and D0 experiments in the Colour-Evaporation Model (CEM), based on the quark-hadron-duality, including Next-to-Leading Order (NLO) QCD corrections up to alpha_s^5. To do so, we also perform the first true NLO --up to alpha_s^4-- study of the p_T-differential cross section for single-quarkonium production. This allows us to fix the non-perturbative CEM parameters at NLO accuracy in the region where quarkonium-pair data are measured. Our results show that the CEM at NLO in general significantly undershoots these experimental data and, in view of the other existing SPS studies, confirm the need for Double Parton Scattering (DPS) to account for the data. Our NLO study of single-quarkonium production at mid and large p_T also confirms the difficulty of the approach to account for the measured p_T spectra; this is reminiscent of the impossibility to fit single-quarkonium data with the sole 3S18 NRQCD contribution from gluon fragmentation. We stress that the discrepancy occurs in a kinematical region where the new features of the improved CEM are not relevant.

Submitted 29 April, 2020; originally announced April 2020.

Comments: LaTeX, 9 pages; 17 figures, 2 tables

arXiv:2004.11692 [pdf, other]

Dynamic topic modeling of the COVID-19 Twitter narrative among U.S. governors and cabinet executives

Authors: Hao Sha, Mohammad Al Hasan, George Mohler, P. Jeffrey Brantingham

Abstract: A combination of federal and state-level decision making has shaped the response to COVID-19 in the United States. In this paper we analyze the Twitter narratives around this decision making by applying a dynamic topic model to COVID-19 related tweets by U.S. Governors and Presidential cabinet members. We use a network Hawkes binomial topic model to track evolving sub-topics around risk, testing and treatment. We also construct influence networks amongst government officials using Granger causality inferred from the network Hawkes process.

Submitted 19 April, 2020; originally announced April 2020.

arXiv:2004.06059 [pdf, other]

doi 10.1145/3366423.3380145

paper2repo: GitHub Repository Recommendation for Academic Papers

Authors: Huajie Shao, Dachun Sun, Jiahao Wu, Zecheng Zhang, Aston Zhang, Shuochao Yao, Shengzhong Liu, Tianshi Wang, Chao Zhang, Tarek Abdelzaher

Abstract: GitHub has become a popular social application platform, where a large number of users post their open source projects. In particular, an increasing number of researchers release repositories of source code related to their research papers in order to attract more people to follow their work. Motivated by this trend, we describe a novel item-item cross-platform recommender system, $\textit{paper2repo}$, that recommends relevant repositories on GitHub that match a given paper in an academic search system such as Microsoft Academic. The key challenge is to identify the similarity between an input paper and its related repositories across the two platforms, $\textit{without the benefit of human labeling}$. Towards that end, paper2repo integrates text encoding and constrained graph convolutional networks (GCN) to automatically learn and map the embeddings of papers and repositories into the same space, where proximity offers the basis for recommendation. To make our method more practical in real life systems, labels used for model training are computed automatically from features of user actions on GitHub. In machine learning, such automatic labeling is often called $\textit{distant supervision}$. To the authors' knowledge, this is the first distant-supervised cross-platform (paper to repository) matching system. We evaluate the performance of paper2repo on real-world data sets collected from GitHub and Microsoft Academic. Results demonstrate that it outperforms other state of the art recommendation methods.

Submitted 13 April, 2020; originally announced April 2020.

Journal ref: The Web Conference 2020 (WWW)

arXiv:2004.05988 [pdf, other]

ControlVAE: Controllable Variational Autoencoder

Authors: Huajie Shao, Shuochao Yao, Dachun Sun, Aston Zhang, Shengzhong Liu, Dongxin Liu, Jun Wang, Tarek Abdelzaher

Abstract: Variational Autoencoders (VAE) and their variants have been widely used in a variety of applications, such as dialog generation, image generation and disentangled representation learning. However, the existing VAE models have some limitations in different applications. For example, a VAE easily suffers from KL vanishing in language modeling and low reconstruction quality for disentangling. To address these issues, we propose a novel controllable variational autoencoder framework, ControlVAE, that combines a controller, inspired by automatic control theory, with the basic VAE to improve the performance of resulting generative models. Specifically, we design a new non-linear PI controller, a variant of the proportional-integral-derivative (PID) control, to automatically tune the hyperparameter (weight) added in the VAE objective using the output KL-divergence as feedback during model training. The framework is evaluated using three applications; namely, language modeling, disentangled representation learning, and image generation. The results show that ControlVAE can achieve better disentangling and reconstruction quality than the existing methods. For language modelling, it not only averts the KL-vanishing, but also improves the diversity of generated text. Finally, we also demonstrate that ControlVAE improves the reconstruction quality of generated images compared to the original VAE.

Submitted 20 June, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

Comments: accepted by ICML2020

Journal ref: 37th proceedings of ICML, 2020

arXiv:2004.03303 [pdf]

Towards Efficient Unconstrained Palmprint Recognition via Deep Distillation Hashing

Authors: Huikai Shao, Dexing Zhong, Xuefeng Du

Abstract: Deep palmprint recognition has become an emerging issue with great potential for personal authentication on handheld and wearable consumer devices. Previous studies of palmprint recognition are mainly based on constrained datasets collected by dedicated devices in controlled environments, which reduces flexibility and convenience. In addition, general deep palmprint recognition algorithms are often too heavy to meet the real-time requirements of embedded systems. In this paper, a new palmprint benchmark is established, which consists of more than 20,000 images collected by 5 brands of smart phones in an unconstrained manner. Each image has been manually labeled with 14 key points for region of interest (ROI) extraction. Further, an approach called Deep Distillation Hashing (DDH) is proposed as a benchmark for efficient deep palmprint recognition. Palmprint images are converted to binary codes to improve the efficiency of feature matching. Derived from knowledge distillation, novel distillation loss functions are constructed to compress the deep model and further improve the efficiency of feature extraction on a light network. Comprehensive experiments are conducted on both constrained and unconstrained palmprint databases. Using DDH, the accuracy of palmprint identification can be increased by up to 11.37%, and the Equal Error Rate (EER) of palmprint verification can be reduced by up to 3.11%. The results indicate the feasibility of our database, and DDH can outperform other baselines to achieve state-of-the-art performance. The collected dataset and related source codes are publicly available at http://gr.xjtu.edu.cn/web/bell/resource.

Submitted 7 April, 2020; originally announced April 2020.

Comments: 13 pages, 8 figures, to access database, see http://gr.xjtu.edu.cn/web/bell/resource

arXiv:2003.07868 [pdf, ps, other]

doi 10.21468/SciPostPhys.9.2.022

Reinterpretation of LHC Results for New Physics: Status and Recommendations after Run 2

Authors: Waleed Abdallah, Shehu AbdusSalam, Azar Ahmadov, Amine Ahriche, Gaël Alguero, Benjamin C. Allanach, Jack Y. Araz, Alexandre Arbey, Chiara Arina, Peter Athron, Emanuele Bagnaschi, Yang Bai, Michael J. Baker, Csaba Balazs, Daniele Barducci, Philip Bechtle, Aoife Bharucha, Andy Buckley, Jonathan Butterworth, Haiying Cai, Claudio Campagnari, Cari Cesarotti, Marcin Chrzaszcz, Andrea Coccaro, Eric Conte , et al. (117 additional authors not shown)

Abstract: We report on the status of efforts to improve the reinterpretation of searches and measurements at the LHC in terms of models for new physics, in the context of the LHC Reinterpretation Forum. We detail current experimental offerings in direct searches for new particles, measurements, technical implementations and Open Data, and provide a set of recommendations for further improving the presentation of LHC results in order to better enable reinterpretation in the future. We also provide a brief description of existing software reinterpretation frameworks and recent global analyses of new physics that make use of the current data.

Submitted 21 July, 2020; v1 submitted 17 March, 2020; originally announced March 2020.

Comments: 58 pages, minor revision following comments from SciPost referees

Report number: CERN-LPCC-2020-001, FERMILAB-FN-1098-CMS-T, Imperial/HEP/2020/RIF/01

Journal ref: SciPost Phys. 9, 022 (2020)

arXiv:2003.05837 [pdf, other]

Top-1 Solution of Multi-Moments in Time Challenge 2019

Authors: Manyuan Zhang, Hao Shao, Guanglu Song, Yu Liu, Junjie Yan

Abstract: In this technical report, we briefly introduce the solutions of our team 'Efficient' for the Multi-Moments in Time challenge in ICCV 2019. We first conduct several experiments with popular Image-Based action recognition methods TRN, TSN, and TSM. Then a novel temporal interlacing network is proposed towards fast and accurate recognition. Besides, the SlowFast network and its variants are explored. Finally, we ensemble all the above models and achieve 67.22% on the validation set and 60.77% on the test set, which ranks 1st on the final leaderboard. In addition, we release a new code repository for video understanding which unifies state-of-the-art 2D and 3D methods based on PyTorch. The solution of the challenge is also included in the repository, which is available at https://github.com/Sense-X/X-Temporal.

Submitted 13 March, 2020; v1 submitted 12 March, 2020; originally announced March 2020.

arXiv:2003.03875 [pdf]

Overview of the CCKS 2019 Knowledge Graph Evaluation Track: Entity, Relation, Event and QA

Authors: Xianpei Han, Zhichun Wang, Jiangtao Zhang, Qinghua Wen, Wenqi Li, Buzhou Tang, Qi Wang, Zhifan Feng, Yang Zhang, Yajuan Lu, Haitao Wang, Wenliang Chen, Hao Shao, Yubo Chen, Kang Liu, Jun Zhao, Taifeng Wang, Kezun Zhang, Meng Wang, Yinlin Jiang, Guilin Qi, Lei Zou, Sen Hu, Minhao Zhang, Yinnian Lin

Abstract: A knowledge graph models world knowledge as concepts, entities, and the relationships between them, and has been widely used in many real-world tasks. CCKS 2019 held an evaluation track with 6 tasks and attracted more than 1,600 teams. In this paper, we give an overview of the knowledge graph evaluation track at CCKS 2019. By reviewing the task definition, successful methods, useful resources, good strategies and research challenges associated with each task in CCKS 2019, this paper can provide a helpful reference for developing knowledge graph applications and conducting future knowledge graph research.

Submitted 8 March, 2020; originally announced March 2020.

Comments: 21 pages, in Chinese, 9 figures and 17 tables, CCKS 2019 held an evaluation track about knowledge graph with 6 tasks and attracted more than 1,600 teams

arXiv:2003.02313 [pdf, other]

Joint Estimation of Discrete Choice Model and Arrival Rate with Unobserved Stock-out Events

Authors: Hongzhang Shao, Anton J. Kleywegt

Abstract: This paper studies the joint estimation problem of a discrete choice model and the arrival rate of potential customers when unobserved stock-out events occur. In this paper, we generalize [Anupindi et al., 1998] and [Conlon and Mortimer, 2013] in the sense that (1) we work with generic choice models, (2) we allow arbitrary numbers of products and stock-out events, and (3) we consider the existence of the null alternative, and estimate the overall arrival rate of potential customers. In addition, we point out that the modeling in [Conlon and Mortimer, 2013] is problematic, and present the correct formulation.

Submitted 4 March, 2020; originally announced March 2020.

arXiv:2002.09859 [pdf, other]

DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition

Authors: Hao-Chiang Shao, Kang-Yu Liu, Chia-Wen Lin, Jiwen Lu

Abstract: The performance of a convolutional neural network (CNN) based face recognition model largely relies on the richness of labelled training data. Collecting a training set with large variations of a face identity under different poses and illumination changes, however, is very expensive, making the diversity of within-class face images a critical issue in practice. In this paper, we propose a 3D model-assisted domain-transferred face augmentation network (DotFAN) that can generate a series of variants of an input face based on the knowledge distilled from existing rich face datasets collected from other domains. DotFAN is structurally a conditional CycleGAN but has two additional subnetworks, namely face expert network (FEM) and face shape regressor (FSR), for latent code control. While FSR aims to extract face attributes, FEM is designed to capture a face identity. With their aid, DotFAN can learn a disentangled face representation and effectively generate face images of various facial attributes while preserving the identity of augmented faces. Experiments show that DotFAN is beneficial for augmenting small face datasets to improve their within-class diversity so that a better face recognition model can be learned from the augmented dataset.

Submitted 23 February, 2020; originally announced February 2020.

Comments: 12 pages, 10 figures

arXiv:2002.05797 [pdf, other]

Hierarchical Overlapping Belief Estimation by Structured Matrix Factorization

Authors: Chaoqi Yang, Jinyang Li, Ruijie Wang, Shuochao Yao, Huajie Shao, Dongxin Liu, Shengzhong Liu, Tianshi Wang, Tarek F. Abdelzaher

Abstract: Much work on social media opinion polarization focuses on a flat categorization of stances (or orthogonal beliefs) of different communities from media traces. We extend this work in two important respects. First, we detect not only points of disagreement between communities, but also points of agreement. In other words, we estimate community beliefs in the presence of overlap. Second, in lieu of flat categorization, we consider hierarchical belief estimation, where communities might be hierarchically divided. For example, two opposing parties might disagree on core issues, but within a party, despite agreement on fundamentals, disagreement might occur on further details. We call the resulting combined problem a hierarchical overlapping belief estimation problem. To solve it, this paper develops a new class of unsupervised Non-negative Matrix Factorization (NMF) algorithms, which we call Belief Structured Matrix Factorization (BSMF). Our proposed unsupervised algorithm captures both the latent belief intersections and dissimilarities, as well as a hierarchical structure. We discuss the properties of the algorithm and evaluate it on both synthetic and real-world datasets. In the synthetic dataset, our model reduces error by 40%. In real Twitter traces, it improves accuracy by around 10%. The model also achieves 96.08% self-consistency in a sanity check.

Submitted 19 September, 2022; v1 submitted 13 February, 2020; originally announced February 2020.

Comments: accepted in ASONAM 2020

arXiv:2002.04967 [pdf, other]

doi 10.1109/TCAD.2020.3015469

From IC Layout to Die Photo: A CNN-Based Data-Driven Approach

Authors: Hao-Chiang Shao, Chao-Yi Peng, Jun-Rei Wu, Chia-Wen Lin, Shao-Yun Fang, Pin-Yen Tsai, Yan-Hsiu Liu

Abstract: We propose a deep learning-based data-driven framework consisting of two convolutional neural networks: i) LithoNet that predicts the shape deformations on a circuit due to IC fabrication, and ii) OPCNet that suggests IC layout corrections to compensate for such shape deformations. By learning the shape correspondences between pairs of layout design patterns and their scanning electron microscope (SEM) images of the product wafer thereof, given an IC layout pattern, LithoNet can mimic the fabrication process to predict its fabricated circuit shape. Furthermore, LithoNet can take the wafer fabrication parameters as a latent vector to model the parametric product variations that can be inspected on SEM images. Besides, traditional optical proximity correction (OPC) methods used to suggest a correction on a lithographic photomask are computationally expensive. Our proposed OPCNet mimics the OPC procedure and efficiently generates a corrected photomask by collaborating with LithoNet to examine if the shape of a fabricated circuit optimally matches its original layout design. As a result, the proposed LithoNet-OPCNet framework can not only predict the shape of a fabricated IC from its layout pattern, but also suggest a layout correction according to the consistency between the predicted shape and the given layout. Experimental results with several benchmark layout patterns demonstrate the effectiveness of the proposed method.

Submitted 6 August, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

Comments: 14 pages, 16 figures

arXiv:2001.11179 [pdf, ps, other]

Consensus on Matrix-weighted Time-varying Networks

Authors: Lulu Pan, Haibin Shao, Mehran Mesbahi, Yugeng Xi, Dewei Li

Abstract: This paper examines the consensus problem on time-varying matrix-weighted undirected networks. First, we introduce the matrix-weighted integral network for the analysis of such networks. Under mild assumptions on the switching pattern of the time-varying network, necessary and/or sufficient conditions for which average consensus can be achieved are then provided in terms of the null space of the matrix-valued Laplacian of the corresponding integral network. In particular, for periodic matrix-weighted time-varying networks, necessary and sufficient conditions for reaching average consensus are obtained from an algebraic perspective. Moreover, we show that if the integral network with period $T>0$ has a positive spanning tree over the time span $[0,T)$, average consensus for the node states is achieved. Simulation results are provided to demonstrate the theoretical analysis.

Submitted 30 January, 2020; originally announced January 2020.

arXiv:2001.10179 [pdf, ps, other]

Multi-modal Sentiment Analysis using Super Characters Method on Low-power CNN Accelerator Device

Authors: Baohua Sun, Lin Yang, Hao Sha, Michael Lin

Abstract: In recent years, NLP research has witnessed record-breaking accuracy improvements from DNN models. However, power consumption is one of the practical concerns for deploying NLP systems. Most of the current state-of-the-art algorithms are implemented on GPUs, which are not power-efficient and whose deployment cost is also very high. On the other hand, the CNN Domain Specific Accelerator (CNN-DSA) has been in mass production, providing low-power and low-cost computation. In this paper, we implement the Super Characters method on the CNN-DSA. In addition, we modify the Super Characters method to utilize multi-modal data, i.e. text plus tabular data in the CL-Aff shared task.

Submitted 28 January, 2020; originally announced January 2020.

Comments: 9 pages, 2 figures, 6 tables. Accepted by AAAI 2020 Affective Content Analysis Workshop

arXiv:2001.07499 [pdf, other]

Effects of intervalley scatterings in thermoelectric performance of band-convergent antimonene

Authors: Yu Wu, Bowen Hou, Congcong Ma, Jiang Cao, Ying Chen, Zixuan Lu, Haodong Mei, Hezhu Shao, Yuanfeng Xu, Heyuan Zhu, Zhilai Fang, Rongjun Zhang, Hao Zhang

Abstract: The strategy of band convergence of multi-valley conduction bands or multi-peak valence bands has been widely used to search or improve thermoelectric materials. However, the phonon-assisted intervalley scatterings due to multiple band degeneracy are usually neglected in the thermoelectric community. In this work, we investigate the (thermo)electric properties of non-polar monolayer $β$- and $α$-antimonene considering full mode- and momentum-resolved electron-phonon interactions. We also analyze thoroughly the selection rules on electron-phonon matrix-elements using group-theory arguments. Our calculations reveal strong intervalley scattering between the nearly degenerate valley states in both $β$- and $α$-antimonene, and the commonly-used deformation potential approximation neglecting the dominant intervalley scattering gives inaccurate estimations of the electron-phonon scattering and thermoelectric transport properties. By considering full electron-phonon interactions based on the rigid-band approximation, we find that the maximum value of the thermoelectric figure of merit $zT$ at room temperature reduces to 0.37 in $β$-antimonene, by a factor of 5.7 compared to the value predicted based on the constant relaxation-time approximation method. Our work not only provides an accurate prediction of the thermoelectric performances of antimonenes that reveals the key role of intervalley scatterings in determining the electronic part of $zT$, but also showcases a computational framework for thermoelectric materials.

Submitted 21 September, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

Comments: 7 figures

arXiv:2001.06499 [pdf, other]

Temporal Interlacing Network

Authors: Hao Shao, Shengju Qian, Yu Liu

Abstract: For a long time, the vision community has tried to learn the spatio-temporal representation by combining convolutional neural networks with various temporal models, such as the families of Markov chain, optical flow, RNN and temporal convolution. However, these pipelines consume enormous computing resources due to the alternating learning process for spatial and temporal information. One natural question is whether we can embed the temporal information into the spatial one so the information in the two domains can be jointly learned once-only. In this work, we answer this question by presenting a simple yet powerful operator -- temporal interlacing network (TIN). Instead of learning the temporal features, TIN fuses the two kinds of information by interlacing spatial representations from the past to the future, and vice versa. A differentiable interlacing target can be learned to control the interlacing process. In this way, a heavy temporal model is replaced by a simple interlacing operator. We theoretically prove that with a learnable interlacing target, TIN performs equivalently to the regularized temporal convolution network (r-TCN), but gains 4% more accuracy with 6x less latency on 6 challenging benchmarks. These results push the state-of-the-art performances of video understanding by a considerable margin. Not surprisingly, the ensemble model of the proposed TIN won the $1^{st}$ place in the ICCV19 - Multi Moments in Time challenge. Code is made available to facilitate further research at https://github.com/deepcs233/TIN

Submitted 17 January, 2020; originally announced January 2020.

Comments: Accepted to AAAI 2020. Winning entry of ICCV Multi-Moments in Time Challenge 2019. Code is available at https://github.com/deepcs233/TIN

arXiv:2001.04256 [pdf, other]

doi 10.1103/PhysRevD.101.054036

Probing impact-parameter dependent nuclear parton densities from double parton scatterings in heavy-ion collisions

Authors: Hua-Sheng Shao

Abstract: We propose a new method to determine the spatially or impact-parameter dependent nuclear parton distribution functions (nPDFs) using the double parton scattering (DPS) processes in high-energy heavy-ion (proton-nucleus and nucleus-nucleus) collisions. We derive a simple generic DPS formula in nuclear collisions by accommodating both the nuclear collision geometry and the spatially dependent nuclear modification effect, under the assumption that the impact-parameter dependence of nPDFs is only related to the nuclear thickness function. While the geometric effect is widely adopted, the impact of the spatially dependent nuclear modification on DPS cross sections has been overlooked so far, which can, however, be significant when the initial nuclear modification is large. In turn, the DPS cross sections in heavy-ion collisions can provide useful information on the spatial dependence of nPDFs. They can be, in general, obtained in minimum-bias nuclear collisions, featuring the virtue of independence of Glauber modeling.

Submitted 26 March, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

Comments: 12 pages, 3 figures, 4 tables; v2: journal version (add a new appendix and references)

Journal ref: Phys. Rev. D 101, 054036 (2020)

arXiv:2001.04035 [pdf, ps, other]

On the Controllability of Matrix-weighted Networks

Authors: Lulu Pan, Haibin Shao, Mehran Mesbahi, Yugeng Xi, Dewei Li

Abstract: This letter examines the controllability of consensus dynamics on matrix-weighted networks from a graph-theoretic perspective. Unlike in scalar-weighted networks, the rank of the weight matrices introduces additional intricacies into characterizing the dimension of the controllable subspace for such networks. Specifically, we investigate how the definiteness of weight matrices influences the dimension of the controllable subspace. In this direction, graph-theoretic characterizations of the lower and upper bounds on the dimension of the controllable subspace are provided by employing, respectively, distance partition and almost equitable partition of matrix-weighted networks. Furthermore, the structure of an uncontrollable input for such networks is examined. Examples are then provided to demonstrate the theoretical results.

Submitted 12 January, 2020; originally announced January 2020.
