-
Robust Educational Dialogue Act Classifiers with Low-Resource and Imbalanced Datasets
Authors:
Jionghao Lin,
Wei Tan,
Ngoc Dang Nguyen,
David Lang,
Lan Du,
Wray Buntine,
Richard Beare,
Guanliang Chen,
Dragan Gasevic
Abstract:
Dialogue acts (DAs) can represent conversational actions of tutors or students that take place during tutoring dialogues. Automating the identification of DAs in tutoring dialogues is significant to the design of dialogue-based intelligent tutoring systems. Many prior studies employ machine learning models to classify DAs in tutoring dialogues and invest much effort to optimize the classification accuracy by using limited amounts of training data (i.e., low-resource data scenario). However, beyond the classification accuracy, the robustness of the classifier is also important, which can reflect the capability of the classifier on learning the patterns from different class distributions. We note that many prior studies on classifying educational DAs employ cross entropy (CE) loss to optimize DA classifiers on low-resource data with imbalanced DA distribution. The DA classifiers in these studies tend to prioritize accuracy on the majority class at the expense of the minority class which might not be robust to the data with imbalanced ratios of different DA classes. To optimize the robustness of classifiers on imbalanced class distributions, we propose to optimize the performance of the DA classifier by maximizing the area under the ROC curve (AUC) score (i.e., AUC maximization). Through extensive experiments, our study provides evidence that (i) by maximizing AUC in the training process, the DA classifier achieves significant performance improvement compared to the CE approach under low-resource data, and (ii) AUC maximization approaches can improve the robustness of the DA classifier under different class imbalance ratios.
Submitted 15 April, 2023;
originally announced April 2023.
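The contrast the abstract draws between accuracy and AUC under class imbalance can be illustrated with a small sketch. The abstract does not say which AUC maximization objective the authors use, so the pairwise squared-hinge surrogate below is an assumption chosen for illustration, and the toy scores and labels are invented.

```python
def split_by_class(scores, labels):
    """Separate scores of positive (1) and negative (0) samples."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg


def auc_score(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos, neg = split_by_class(scores, labels)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pairwise_squared_hinge(scores, labels, margin=1.0):
    """Differentiable surrogate that penalises positives not scored above
    negatives by at least `margin` (an illustrative choice, not the paper's)."""
    pos, neg = split_by_class(scores, labels)
    total = sum(max(0.0, margin - (p - n)) ** 2 for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Plain accuracy after thresholding the scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical imbalanced data: nine majority samples, one minority sample.
labels = [0] * 9 + [1]
constant = [0.1] * 10  # always predicts the majority class
ranked = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.6, 0.2, 0.1, 0.55]

for name, scores in (("constant", constant), ("ranked", ranked)):
    print(name, accuracy(scores, labels), auc_score(scores, labels),
          pairwise_squared_hinge(scores, labels))
```

Both score vectors reach the same accuracy, yet only the second ranks the minority sample above most majority samples, which is what AUC and its surrogate reward.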
-
Hygroscopic phase field fracture modelling of composite materials
Authors:
K. Au-Yeung,
A. Quintanas-Corominas,
E. Martínez-Pañeda,
W. Tan
Abstract:
This paper investigates the effect of moisture content upon the degradation behaviour of composite materials. A coupled phase field framework considering moisture diffusion, hygroscopic expansion, and fracture behaviour is developed. This multi-physics framework is used to explore the damage evolution of composite materials, spanning the micro-, meso- and macro-scales. The micro-scale unit-cell model shows how the mismatch between the hygroscopic expansion of fibre and matrix leads to interface debonding. From the meso-scale ply-level model, we learn that the distribution of fibres has a minor influence on the material properties, while increasing moisture content facilitates interface debonding. The macro-scale laminate-level model shows that moisture induces a higher degree of damage on the longitudinal ply relative to the transverse ply. This work opens a new avenue to understand and predict environmentally-assisted degradation in composite materials.
Submitted 6 April, 2023;
originally announced April 2023.
-
Proximity Forest 2.0: A new effective and scalable similarity-based classifier for time series
Authors:
Matthieu Herrmann,
Chang Wei Tan,
Mahsa Salehi,
Geoffrey I. Webb
Abstract:
Time series classification (TSC) is a challenging task due to the diversity of types of feature that may be relevant for different classification tasks, including trends, variance, frequency, magnitude, and various patterns. To address this challenge, several alternative classes of approach have been developed, including similarity-based, features and intervals, shapelets, dictionary, kernel, neural network, and hybrid approaches. While kernel, neural network, and hybrid approaches perform well overall, some specialized approaches are better suited for specific tasks. In this paper, we propose a new similarity-based classifier, Proximity Forest version 2.0 (PF 2.0), which outperforms previous state-of-the-art similarity-based classifiers across the UCR benchmark and outperforms state-of-the-art kernel, neural network, and hybrid methods on specific datasets in the benchmark that are best addressed by similarity-based methods. PF 2.0 incorporates three recent advances in time series similarity measures -- (1) computationally efficient early abandoning and pruning to speed up elastic similarity computations; (2) a new elastic similarity measure, Amerced Dynamic Time Warping (ADTW); and (3) cost function tuning. It rationalizes the set of similarity measures employed, reducing the eight base measures of the original PF to three and using the first derivative transform with all similarity measures, rather than a limited subset. We have implemented both PF 1.0 and PF 2.0 in a single C++ framework, making the PF framework more efficient.
Submitted 13 April, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
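As a rough illustration of the ADTW measure named in the abstract, the sketch below implements the usual dynamic-programming recurrence in which every non-diagonal (warping) step pays a fixed additive penalty. The function name `adtw`, the parameter name `omega`, and the squared-difference cost are assumptions for this example; it omits the early abandoning, pruning, and cost function tuning that PF 2.0 adds, and it is not the authors' C++ implementation.

```python
import math


def adtw(a, b, omega):
    """Amerced DTW distance between sequences a and b.

    omega is the penalty added to each off-diagonal step:
    omega = 0 gives plain DTW, and a very large omega forces the
    diagonal path (squared Euclidean distance for equal lengths).
    """
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(
                prev[j - 1],          # diagonal step, no penalty
                prev[j] + omega,      # vertical step, penalised
                curr[j - 1] + omega,  # horizontal step, penalised
            )
        prev = curr
    return prev[m]


x = [0, 1, 2, 1, 0]
y = [0, 0, 1, 2, 1]  # x shifted by one step
for omega in (0.0, 0.5, 1e9):
    print(omega, adtw(x, y, omega))
```

Sweeping `omega` moves the measure between unconstrained DTW and a rigid lock-step comparison, which is why a single tunable penalty can stand in for several fixed similarity measures.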
-
Does Informativeness Matter? Active Learning for Educational Dialogue Act Classification
Authors:
Wei Tan,
Jionghao Lin,
David Lang,
Guanliang Chen,
Dragan Gasevic,
Lan Du,
Wray Buntine
Abstract:
Dialogue Acts (DAs) can be used to explain what expert tutors do and what students know during the tutoring process. Most empirical studies adopt the random sampling method to obtain sentence samples for manual annotation of DAs, which are then used to train DA classifiers. However, these studies have paid little attention to sample informativeness, which can reflect the information quantity of the selected samples and inform the extent to which a classifier can learn patterns. Notably, the informativeness level may vary among the samples and the classifier might only need a small amount of low informative samples to learn the patterns. Random sampling may overlook sample informativeness, which consumes human labelling costs and contributes less to training the classifiers. As an alternative, researchers suggest employing statistical sampling methods of Active Learning (AL) to identify the informative samples for training the classifiers. However, the use of AL methods in educational DA classification tasks is under-explored. In this paper, we examine the informativeness of annotated sentence samples. Then, the study investigates how the AL methods can select informative samples to support DA classifiers in the AL sampling process. The results reveal that most annotated sentences present low informativeness in the training dataset and the patterns of these sentences can be easily captured by the DA classifier. We also demonstrate how AL methods can reduce the cost of manual annotation in the AL sampling process.
Submitted 11 April, 2023;
originally announced April 2023.
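The idea of selecting informative samples for annotation can be sketched with entropy-based uncertainty sampling, one common active learning strategy. The abstract does not name the specific AL methods studied, so this choice, the function names, and the toy probabilities are all assumptions for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k unlabelled samples with the highest entropy."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]


# Hypothetical classifier outputs for four unlabelled sentences, three DA classes.
pool = [
    [0.98, 0.01, 0.01],  # confident prediction: low informativeness
    [0.40, 0.35, 0.25],
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],  # nearly uniform: high informativeness
]
print(select_most_uncertain(pool, k=2))
```

Sentences the classifier already predicts confidently are skipped, so the annotation budget goes to the samples expected to teach the classifier the most.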
-
Light double-gluon hybrid states with the exotic quantum numbers $J^{PC} = 1^{-+}$ and $3^{-+}$
Authors:
Niu Su,
Wei-Han Tan,
Hua-Xing Chen,
Wei Chen,
Shi-Lin Zhu
Abstract:
We apply the QCD sum rule method to study the double-gluon hybrid states with the quark-gluon contents $\bar q q gg$ ($q=u/d$) and $\bar s s gg$. We construct twenty-eight double-gluon hybrid currents, eleven of which are found to be zero due to some internal symmetries between the two gluon fields. We concentrate on the non-vanishing currents with the exotic quantum numbers $J^{PC} = 1^{-+}$ and $3^{-+}$. Their masses are calculated to be $M_{|\bar q q gg;1^{-+}\rangle} = 4.35^{+0.26}_{-0.30}$ GeV, $M_{|\bar s s gg;1^{-+}\rangle} = 4.49^{+0.25}_{-0.30}$ GeV, $M_{|\bar q q gg;3^{-+}\rangle} = 3.02^{+0.24}_{-0.31}$ GeV, and $M_{|\bar s s gg;3^{-+}\rangle} = 3.16^{+0.22}_{-0.28}$ GeV. The decay behaviors of the $J^{PC} = 3^{-+}$ states are studied, and we propose to search for them in the $πa_1(1260)/ρω/φφ$ channels in future particle experiments.
Submitted 31 May, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
GRB 221009A: An ordinary nearby GRB with extraordinary observational properties
Authors:
Lin Lan,
He Gao,
An Li,
Shuo Xiao,
Shunke Ai,
Zong-Kai Peng,
Long Li,
Chen-Yu Wang,
Nan Xu,
Shijie Lin,
Wei-Hua Lei,
Bing Zhang,
Yan-Qiu Zhang,
Chao Zheng,
Jia-Cong Liu,
Wang-Chen Xue,
Chen-Wei Wang,
Wen-Jun Tan,
Shao-Lin Xiong
Abstract:
The gamma-ray burst GRB 221009A, known as the ``brightest-of-all-time" (BOAT), is the closest energetic burst detected so far, with an energy of $E_{γ,\rm iso} \sim 10^{55}$ ergs. This study aims to assess its compatibility with known GRB energy and luminosity distributions. Our analysis indicates that the energy/luminosity function of GRBs is consistent across various redshift intervals, and that the inclusion of GRB 221009A does not significantly impact the function at low redshifts. Additionally, our evaluation of the best-fitting result of the entire GRB sample suggests that the expected number of GRBs with energy greater than $10^{55}$ ergs at a low redshift is 0.2, so that the emergence of GRB 221009A is consistent with expected energy/luminosity functions within $\sim 2σ$ Poisson fluctuation error, still adhering to the principles of small number statistics. Furthermore, we find that GRB 221009A and other energetic bursts, defined as $E_{γ,\rm iso} \gtrsim10^{54} {\rm ergs}$, exhibit no significant differences in terms of distributions of $T_{90}$, minimum timescale, Amati relation, $E_{\rm γ,iso}$-$E_{\rm X,iso}$ relation, $L_{γ,\rm iso}-Γ_0$ relation, $E_{γ,\rm iso}-Γ_0$ relation, $L_{γ,\rm iso}-E_{\rm p,i}-Γ_0$ relation, and host galaxy properties, compared to normal long GRBs. This suggests that energetic GRBs (including GRB 221009A) and other long GRBs likely have similar progenitor systems and undergo similar energy dissipation and radiation processes. The generation of energetic GRBs may be due to more extreme central engine properties or, more likely, a rarer viewing configuration of a quasi-universal structured jet.
Submitted 19 March, 2023;
originally announced March 2023.
-
NoiseCAM: Explainable AI for the Boundary Between Noise and Adversarial Attacks
Authors:
Wenkai Tan,
Justus Renkhoff,
Alvaro Velasquez,
Ziyu Wang,
Lusi Li,
Jian Wang,
Shuteng Niu,
Fan Yang,
Yongxin Liu,
Houbing Song
Abstract:
Deep Learning (DL) and Deep Neural Networks (DNNs) are widely used in various domains. However, adversarial attacks can easily mislead a neural network and lead to wrong decisions. Defense mechanisms are highly preferred in safety-critical applications. In this paper, first, we use the gradient class activation map (GradCAM) to analyze the behavior deviation of the VGG-16 network when its inputs are mixed with adversarial perturbation or Gaussian noise. In particular, our method can locate vulnerable layers that are sensitive to adversarial perturbation and Gaussian noise. We also show that the behavior deviation of vulnerable layers can be used to detect adversarial examples. Second, we propose a novel NoiseCAM algorithm that integrates information from globally and pixel-level weighted class activation maps. Our algorithm is sensitive to adversarial perturbations and will not respond to Gaussian random noise mixed in the inputs. Third, we compare detecting adversarial examples using both behavior deviation and NoiseCAM, and we show that NoiseCAM outperforms behavior deviation modeling in its overall performance. Our work could provide a useful tool to defend against certain adversarial attacks on deep neural networks.
Submitted 9 March, 2023;
originally announced March 2023.
-
Exploring Adversarial Attacks on Neural Networks: An Explainable Approach
Authors:
Justus Renkhoff,
Wenkai Tan,
Alvaro Velasquez,
William Yichen Wang,
Yongxin Liu,
Jian Wang,
Shuteng Niu,
Lejla Begic Fazlic,
Guido Dartmann,
Houbing Song
Abstract:
Deep Learning (DL) is being applied in various domains, especially in safety-critical applications such as autonomous driving. Consequently, it is of great significance to ensure the robustness of these methods and thus counteract uncertain behaviors caused by adversarial attacks. In this paper, we use gradient heatmaps to analyze the response characteristics of the VGG-16 model when the input images are mixed with adversarial noise and statistically similar Gaussian random noise. In particular, we compare the network response layer by layer to determine where errors occurred. Several interesting findings are derived. First, compared to Gaussian random noise, intentionally generated adversarial noise causes severe behavior deviation by distracting the area of concentration in the networks. Second, in many cases, adversarial examples only need to compromise a few intermediate blocks to mislead the final decision. Third, our experiments revealed that specific blocks are more vulnerable and easier to exploit by adversarial examples. Finally, we demonstrate that the layers $Block4\_conv1$ and $Block5\_conv1$ of the VGG-16 model are more susceptible to adversarial attacks. Our work could provide valuable insights into developing more reliable Deep Neural Network (DNN) models.
Submitted 8 March, 2023;
originally announced March 2023.
-
Cloudless-Training: A Framework to Improve Efficiency of Geo-Distributed ML Training
Authors:
Wenting Tan,
Xiao Shi,
Cunchi Lv,
Xiaofang Zhao
Abstract:
Geo-distributed ML training can benefit many emerging ML scenarios (e.g., large model training, federated learning) with multi-regional cloud resources and wide area network. However, its efficiency is limited due to 2 challenges. First, efficient elastic scheduling of multi-regional cloud resources is usually missing, affecting resource utilization and performance of training. Second, training communication on WAN is still the main overhead, easily subjected to low bandwidth and high fluctuations of WAN. In this paper, we propose a framework, Cloudless-Training, to realize efficient PS-based geo-distributed ML training in 3 aspects. First, it uses a two-layer architecture with control and physical training planes to support elastic scheduling and communication for multi-regional clouds in a serverless manner. Second, it provides an elastic scheduling strategy that can deploy training workflows adaptively according to the heterogeneity of available cloud resources and distribution of pre-existing training datasets. Third, it provides 2 new synchronization strategies for training partitions among clouds, including asynchronous SGD with gradient accumulation (ASGD-GA) and inter-PS model averaging (MA). It is implemented with OpenFaaS and evaluated on Tencent Cloud. Experiments show that Cloudless-Training can support general ML training in a geo-distributed way, greatly improve resource utilization (e.g., 9.2%-24.0% training cost reduction) and synchronization efficiency (e.g., 1.7x training speedup over baseline at most) with model correctness guarantees.
Submitted 9 March, 2023;
originally announced March 2023.
-
PaReNTT: Low-Latency Parallel Residue Number System and NTT-Based Long Polynomial Modular Multiplication for Homomorphic Encryption
Authors:
Weihang Tan,
Sin-Wei Chiu,
Antian Wang,
Yingjie Lao,
Keshab K. Parhi
Abstract:
High-speed long polynomial multiplication is important for applications in homomorphic encryption (HE) and lattice-based cryptosystems. This paper addresses low-latency hardware architectures for long polynomial modular multiplication using the number-theoretic transform (NTT) and inverse NTT (iNTT). Chinese remainder theorem (CRT) is used to decompose the modulus into multiple smaller moduli. Our proposed architecture, namely PaReNTT, makes four novel contributions. First, parallel NTT and iNTT architectures are proposed to reduce the number of clock cycles to process the polynomials. This can enable real-time processing for HE applications, as the number of clock cycles to process the polynomial is inversely proportional to the level of parallelism. Second, the proposed architecture eliminates the need for permuting the NTT outputs before their product is input to the iNTT. This reduces latency by n/4 clock cycles, where n is the length of the polynomial, and reduces buffer requirement by one delay-switch-delay circuit of size n. Third, an approach to select special moduli is presented where the moduli can be expressed in terms of a few signed power-of-two terms. Fourth, novel architectures for pre-processing for computing residual polynomials using the CRT and post-processing for combining the residual polynomials are proposed. These architectures significantly reduce the area consumption of the pre-processing and post-processing steps. The proposed long modular polynomial multiplications are ideal for applications that require low latency and high sample rate as these feed-forward architectures can be pipelined at arbitrary levels.
Submitted 6 July, 2023; v1 submitted 3 March, 2023;
originally announced March 2023.
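The combination of CRT decomposition and NTT-based multiplication described in the abstract can be sketched in software as follows. This is a minimal illustration, not the PaReNTT hardware architecture. The small primes 17 and 97, the recursive radix-2 transform, and all function names are assumptions for the example, and zero-padding is used to obtain an ordinary (linear) product rather than the modular reduction used in HE schemes.

```python
def find_root(n, q):
    """Primitive n-th root of unity mod prime q (n a power of two dividing q - 1)."""
    assert (q - 1) % n == 0
    for g in range(2, q):
        w = pow(g, (q - 1) // n, q)
        if pow(w, n // 2, q) == q - 1:  # w^(n/2) = -1 implies order exactly n
            return w
    raise ValueError("no primitive root found")


def ntt(a, w, q):
    """Recursive radix-2 number-theoretic transform of a (length a power of two)."""
    n = len(a)
    if n == 1:
        return a[:]
    w2 = w * w % q
    even, odd = ntt(a[0::2], w2, q), ntt(a[1::2], w2, q)
    out, t = [0] * n, 1
    for k in range(n // 2):
        x = t * odd[k] % q
        out[k] = (even[k] + x) % q
        out[k + n // 2] = (even[k] - x) % q
        t = t * w % q
    return out


def polymul_mod(a, b, q):
    """Linear product of polynomials a and b with coefficients reduced mod q."""
    size = len(a) + len(b) - 1
    n = 1
    while n < size:
        n *= 2
    w = find_root(n, q)
    fa = ntt(a + [0] * (n - len(a)), w, q)
    fb = ntt(b + [0] * (n - len(b)), w, q)
    prod = [x * y % q for x, y in zip(fa, fb)]
    inv_w, inv_n = pow(w, q - 2, q), pow(n, q - 2, q)  # Fermat inverses
    return [c * inv_n % q for c in ntt(prod, inv_w, q)][:size]


def crt_pair(r1, q1, r2, q2):
    """Combine residues r1 mod q1 and r2 mod q2 into a value mod q1 * q2."""
    inv = pow(q1, q2 - 2, q2)  # q2 is prime
    return r1 + q1 * (((r2 - r1) * inv) % q2)


def schoolbook(a, b):
    """Reference O(n^2) product, used only to check the result."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


Q1, Q2 = 17, 97  # illustrative NTT-friendly primes; 8 divides Q1 - 1 and Q2 - 1
a, b = [1, 2, 3, 4], [5, 6, 7, 8]
res1, res2 = polymul_mod(a, b, Q1), polymul_mod(a, b, Q2)
combined = [crt_pair(x, Q1, y, Q2) for x, y in zip(res1, res2)]
print(combined)
```

The two residue products are independent of each other, which is the property that lets a hardware design process the smaller moduli in parallel before the CRT post-processing step recombines them.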
-
Insight-HXMT and GECAM-C observations of the brightest-of-all-time GRB 221009A
Authors:
Zheng-Hua An,
S. Antier,
Xing-Zi Bi,
Qing-Cui Bu,
Ce Cai,
Xue-Lei Cao,
Anna-Elisa Camisasca,
Zhi Chang,
Gang Chen,
Li Chen,
Tian-Xiang Chen,
Wen Chen,
Yi-Bao Chen,
Yong Chen,
Yu-Peng Chen,
Michael W. Coughlin,
Wei-Wei Cui,
Zi-Gao Dai,
T. Hussenot-Desenonges,
Yan-Qi Du,
Yuan-Yuan Du,
Yun-Fei Du,
Cheng-Cheng Fan,
Filippo Frontera,
He Gao
, et al. (153 additional authors not shown)
Abstract:
GRB 221009A is the brightest gamma-ray burst ever detected since the discovery of this kind of energetic explosion. However, an accurate measurement of the prompt emission properties of this burst is very challenging due to its exceptional brightness. With joint observations of \textit{Insight}-HXMT and GECAM-C, we made an unprecedentedly accurate measurement of the emission during the first $\sim$1800 s of GRB 221009A, including its precursor, main emission (ME, which dominates the burst in flux), flaring emission and early afterglow, in the hard X-ray to soft gamma-ray band from $\sim$ 10 keV to $\sim$ 6 MeV. Based on the GECAM-C unsaturated data of the ME, we measure a record-breaking isotropic equivalent energy ($E_{\rm iso}$) of $\bf \sim 1.5 \times 10^{55}$ erg, which is about eight times the total rest-mass energy of the Sun. The early afterglow data require a significant jet break between 650 s and 1100 s, most likely at $\sim950$ s from the afterglow starting time $T_{AG}$, which corresponds to a jet opening angle of $\sim {0.7^\circ} \ (η_γn)^{1/8}$, where $n$ is the ambient medium density in units of $\rm cm^{-3}$ and $η_γ$ is the ratio between $γ$-ray energy and afterglow kinetic energy. The beaming-corrected total $γ$-ray energy $E_γ$ is $\sim 1.15 \times10^{51} \ (η_γn)^{1/4}$ erg, which is typical for long GRBs. These results suggest that this GRB may have a special central engine, which could launch and collimate a very narrowly beamed jet with an ordinary energy budget, leading to exceptionally luminous gamma-ray radiation per unit solid angle. Alternatively, more GRBs might have such a narrow and bright beam, which are missed by an unfavorable viewing angle or have been detected without distance measurement.
Submitted 3 March, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
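Two of the numbers quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch assumes the standard beaming correction $E_γ = E_{\rm iso}(1-\cos θ_j)$, sets the factor $(η_γ n)$ to 1, and uses textbook values for the solar mass and the speed of light in CGS units; none of these choices is spelled out in the abstract itself.

```python
import math

M_SUN_G = 1.98847e33      # solar mass in grams (textbook value)
C_CM_S = 2.99792458e10    # speed of light in cm/s
E_ISO_ERG = 1.5e55        # isotropic equivalent energy from the abstract
THETA_JET_DEG = 0.7       # jet opening angle from the abstract, with (eta*n) = 1

# Rest-mass energy of the Sun, E = M c^2, in erg.
solar_rest_energy = M_SUN_G * C_CM_S ** 2
ratio = E_ISO_ERG / solar_rest_energy

# Beaming-corrected energy: fraction of the sphere covered by the jet cone.
theta = math.radians(THETA_JET_DEG)
beaming_fraction = 1.0 - math.cos(theta)
e_gamma = E_ISO_ERG * beaming_fraction

print(f"E_iso / (M_sun c^2) = {ratio:.2f}")
print(f"beaming fraction    = {beaming_fraction:.3e}")
print(f"E_gamma             = {e_gamma:.3e} erg")
```

Under these assumptions the ratio comes out a little above eight and the corrected energy close to $1.1 \times 10^{51}$ erg, in line with the quoted values given that the opening angle is only stated to one significant figure.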
-
Synchrotron Radiation Dominates the Extremely Bright GRB 221009A
Authors:
Jun Yang,
Xiao-Hong Zhao,
Zhenyu Yan,
Xiangyu I. Wang,
Yan-Qiu Zhang,
Zheng-Hua An,
Ce Cai,
Xin-Qiao Li,
Zihan Li,
Jia-Cong Liu,
Zi-Ke Liu,
Xiang Ma,
Yan-Zhi Meng,
Wen-Xi Peng,
Rui Qiao,
Lang Shao,
Li-Ming Song,
Wen-Jun Tan,
** Wang,
Chen-Wei Wang,
Xiang-Yang Wen,
Shuo Xiao,
Wang-Chen Xue,
Yu-han Yang,
Yihan Yin
, et al. (8 additional authors not shown)
Abstract:
The brightest Gamma-ray burst, GRB 221009A, has spurred numerous theoretical investigations, with particular attention paid to the origins of ultra-high energy TeV photons during the prompt phase. However, analyzing the mechanism of radiation of photons in the $\sim$MeV range has been difficult because the high flux causes pile-up and saturation effects in most GRB detectors. In this letter, we present systematic modeling of the time-resolved spectra of the GRB using unsaturated data obtained from Fermi/GBM (precursor) and SATech-01/GECAM-C (main emission and flare). Our approach incorporates the synchrotron radiation model, which assumes an expanding emission region with relativistic speed and a global magnetic field that decays with radius, and successfully fits such a model to the observational data. Our results indicate that the spectra of the burst are fully in accordance with a synchrotron origin from relativistic electrons accelerated at a large emission radius. The lack of thermal emission in the prompt emission spectra supports a Poynting-flux-dominated jet composition.
Submitted 28 March, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Cross calibration of gamma-ray detectors (GRD) of GECAM-C
Authors:
Yan-Qiu Zhang,
Shao-Lin Xiong,
Rui Qiao,
Dong-Ya Guo,
Wen-Xi Peng,
Xin-Qiao Li,
Wang-Chen Xue,
Chao Zheng,
Jia-Cong Liu,
Wen-Jun Tan,
Chen-Wei Wang,
Peng Zhang,
** Wang,
Ce Cai,
Shuo Xiao,
Yue Huang,
Pei-Yi Feng,
Xiao-Bo Li,
Li-Ming Song,
Qi-Bin Yi,
Yi Zhao,
Zhi-Wei Guo,
Jian-Jian He,
Chao-Yang Li,
Ya-Qing Liu
, et al. (20 additional authors not shown)
Abstract:
The gamma-ray detectors (GRDs) of GECAM-C onboard the SATech-01 satellite are designed to monitor gamma-ray transients all over the sky from 6 keV to 6 MeV. The energy response matrix is the key to spectral measurements of bursts; it is usually generated from GEANT4 simulation and partially verified by ground calibration. In this work, the energy response matrix of the GECAM-C GRDs is cross-calibrated with Fermi/GBM and Swift/BAT using a sample of Gamma-Ray Bursts (GRBs) and Soft Gamma-Ray Repeaters (SGRs). The calibration results show good agreement between GECAM-C and other reasonably well calibrated instruments (i.e. Fermi/GBM and Swift/BAT). We also find that the different GRDs of GECAM-C are consistent with each other. All these results indicate that the GECAM-C GRDs can provide reliable spectral measurements.
Submitted 1 March, 2023;
originally announced March 2023.
-
Ground calibration of Gamma-Ray Detectors of GECAM-C
Authors:
Chao Zheng,
Zheng-Hua An,
Wen-Xi Peng,
Da-Li Zhang,
Shao-Lin Xiong,
Rui Qiao,
Yan-Qiu Zhang,
Wang-Chen Xue,
Jia-Cong Liu,
Pei-Yi Feng,
Ce Cai,
Min Gao,
Ke Gong,
Dong-Ya Guo,
Dong-Jie Hou,
Gang Li,
Xin-Qiao Li,
Yan-Guo Li,
Mao-Shun Li,
Xiao-Hua Liang,
Ya-Qing Liu,
Xiao-**g Liu,
Li-Ming Song,
Xi-Lei Sun,
Wen-Jun Tan
, et al. (13 additional authors not shown)
Abstract:
As a new member of the GECAM mission, GECAM-C (also named High Energy Burst Searcher, HEBS) was launched onboard the SATech-01 satellite on July 27th, 2022, and is capable of monitoring gamma-ray transients from $\sim$ 6 keV to 6 MeV. GECAM-C is equipped with 12 gamma-ray detectors (GRDs) as its main detectors. In order to verify the GECAM-C GRD performance and to validate the Monte Carlo simulations of detector response, comprehensive on-ground calibration experiments have been performed using X-ray beams and radioactive sources, covering the energy-channel relation, energy resolution, detection efficiency, SiPM voltage-gain relation and the non-uniformity of positional response. In this paper, the detailed calibration campaigns and data analysis results for the GECAM-C GRDs are presented, demonstrating the excellent performance of the GECAM-C GRDs.
Submitted 30 May, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
The performance of SiPM-based gamma-ray detector (GRD) of GECAM-C
Authors:
Dali Zhang,
Chao Zheng,
Jiacong Liu,
Zhenghua An,
Chenwei Wang,
Xiangyang Wen,
Xinqiao Li,
Xilei Sun,
Ke Gong,
Yaqing Liu,
Xiao**g Liu,
Sheng Yang,
Wenxi Peng,
Rui Qiao,
Dongya Guo,
Peiyi Feng,
Yanqiu Zhang,
Wangchen Xue,
Wenjun Tan,
Ce Cai,
Shuo Xiao,
Qibin Yi,
Yanbing Xu,
Min Gao,
**zhou Wang
, et al. (20 additional authors not shown)
Abstract:
As a new member of the GECAM mission, GECAM-C (also called High Energy Burst Searcher, HEBS) is a gamma-ray all-sky monitor onboard the SATech-01 satellite, which was launched on July 27th, 2022 to detect gamma-ray transients from 6 keV to 6 MeV, such as Gamma-Ray Bursts (GRBs), high energy counterparts of Gravitational Waves (GWs) and Fast Radio Bursts (FRBs), and Soft Gamma-ray Repeaters (SGRs). Together with GECAM-A and GECAM-B launched in December 2020, GECAM-C will greatly improve the monitoring coverage, localization, as well as temporal and spectral measurements of gamma-ray transients. GECAM-C employs 12 SiPM-based Gamma-Ray Detectors (GRDs) to detect gamma-ray transients. In this paper, we first give a brief description of the design of the GECAM-C GRDs, and then focus on the on-ground tests and in-flight performance of the GRDs. We also compare the SiPM in-flight performance of GECAM-C and GECAM-B. The results show that the GECAM-C GRDs work as expected and are ready to make scientific observations.
Submitted 7 March, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Electrostatic effect due to patch potentials between closely spaced surfaces
Authors:
Jun Ke,
Wen-Can Dong,
Sheng-Hua Huang,
Yu-Jie Tan,
Wen-Hai Tan,
Shan-Qing Yang,
Cheng-Gang Shao,
Jie Luo
Abstract:
The spatial and temporal variations in surface potential are important error sources in various precision experiments and deserve careful consideration. In the former case, the theoretical analysis shows that this effect depends on the surface potentials through their spatial autocorrelation functions. By modifying the quasi-local correlation model, we obtain a rigorous formula for the patch force, whose magnitude is proportional to ${\frac{1}{{{a}^{2}}}{{(\frac{a}{w})}^{β(a/w)+2}}}$ with ${a}$ the distance between two parallel plates, ${w}$ the mean patch size, and $β$ the scaling coefficient from ${-2}$ to ${-4}$. A torsion balance experiment is then conducted, yielding a 0.4 mm effective patch size and 20 mV potential variance. In the latter case, we apply an adatom diffusion model to describe this mechanism, which predicts a ${f^{-3/4}}$ frequency dependence above 0.01 ${\rm mHz}$. This prediction agrees well with typical experimental results. Finally, we apply these models to analyze the patch effect for two typical experiments. Our analysis will help to investigate the properties of surface potentials.
Submitted 22 February, 2023;
originally announced February 2023.
-
Anomalous Nernst effect induced terahertz emission in a single ferromagnetic film
Authors:
Zheng Feng,
Wei Tan,
Zuanming Jin,
Yi-Jia Chen,
Zhangfeng Zhong,
Liang Zhang,
Song Sun,
** Tang,
Yexing Jiang,
Po-Hsun Wu,
Jun Cheng,
Bingfeng Miao,
Haifeng Ding,
Dacheng Wang,
Yiming Zhu,
Liang Guo,
Sunmi Shin,
Guohong Ma,
Dazhi Hou,
Ssu-Yen Huang
Abstract:
By developing a bidirectional-pump terahertz (THz) emission spectroscopy, we reveal an anomalous Nernst effect (ANE) induced THz emission in a single ferromagnetic film. Based on the distinctive symmetry of the THz signals, ANE is unequivocally distinguished from the previously attributed ultrafast demagnetization and anomalous Hall effect mechanisms. A quantitative method is established to separate the different contributions, demonstrating a significant ANE contribution that even overwhelms other competing mechanisms. Our work not only clarifies the origin of the ferromagnetic-based THz emission, but also offers a fertile platform for investigating the ultrafast magnetism and THz spintronics.
Submitted 16 June, 2023; v1 submitted 21 February, 2023;
originally announced February 2023.
-
Modelling Fatigue Behaviours and Lifetimes of Novel GLARE Laminates under Random Loading Spectrum
Authors:
Zheng-Qiang Cheng,
Wei Tan,
Jun-Jiang Xiong,
Er-Ming Hed,
Tao-Huan Xiong,
Ying-Peng Wang
Abstract:
This paper aims to experimentally and numerically probe fatigue behaviours and lifetimes of novel GLARE (glass laminate aluminium reinforced epoxy) laminates under random loading spectrum. A mixed algorithm based on fatigue damage concepts of three-phase materials was proposed for modelling progressive fatigue damage mechanisms and fatigue life of fibre metal laminates (FML) under random loading spectrum. To validate the proposed modelling algorithm, fatigue tests were conducted on the GLARE 2/1 and GLARE 3/2 laminates subjected to random loading spectrum, and fatigue mechanisms were discussed by using scanning electron microscope (SEM) analysis. It is shown that predominant fatigue failure of the GLARE laminate depends on the reference load level of random loading spectrum. Specifically, dominant fatigue failure of the GLARE laminate is dependent on fatigue strength of fibre layer at a high reference load level, but metal layer at a low reference load level. Numerical predictions agree well with experimental results, demonstrating that the proposed mixed modelling algorithm can effectively simulate fatigue behaviours and lives of the GLARE laminate under random loading spectrum.
Submitted 21 February, 2023;
originally announced February 2023.
-
Spatio-Temporal Momentum: Jointly Learning Time-Series and Cross-Sectional Strategies
Authors:
Wee Ling Tan,
Stephen Roberts,
Stefan Zohren
Abstract:
We introduce Spatio-Temporal Momentum strategies, a class of models that unify both time-series and cross-sectional momentum strategies by trading assets based on their cross-sectional momentum features over time. While both time-series and cross-sectional momentum strategies are designed to systematically capture momentum risk premia, these strategies are regarded as distinct implementations and do not consider the concurrent relationship and predictability between temporal and cross-sectional momentum features of different assets. We model spatio-temporal momentum with neural networks of varying complexities and demonstrate that a simple neural network with only a single fully connected layer learns to simultaneously generate trading signals for all assets in a portfolio by incorporating both their time-series and cross-sectional momentum features. Backtesting on portfolios of 46 actively-traded US equities and 12 equity index futures contracts, we demonstrate that the model is able to retain its performance over benchmarks in the presence of high transaction costs of up to 5-10 basis points. In particular, we find that the model when coupled with least absolute shrinkage and turnover regularization results in the best performance over various transaction cost scenarios.
Submitted 20 February, 2023;
originally announced February 2023.
-
Neutron lifetime anomaly and mirror matter theory
Authors:
Wanpeng Tan
Abstract:
This paper reviews the puzzles in modern neutron lifetime measurements and related unitarity issues in the CKM matrix. It is not a comprehensive and unbiased compilation of all historic data and studies, but rather a focus on compelling evidence leading to new physics. In particular, the largely overlooked nuances of different techniques applied in material and magnetic trap experiments are clarified. Further detailed analysis shows that the ``beam'' approach of neutron lifetime measurements is likely to give the ``true'' $β$-decay lifetime, while discrepancies in ``bottle'' measurements indicate new physics at play. The most feasible solution to these puzzles is a newly proposed ordinary-mirror neutron ($n-n'$) oscillation model under the framework of mirror matter theory. This phenomenological model is reviewed and introduced, and its explanations of the neutron lifetime anomaly and possible non-unitarity of the CKM matrix are presented. Most importantly, various new experimental proposals, especially lifetime measurements with small/narrow magnetic traps or under super-strong magnetic fields, are discussed in order to test the surprisingly large anomalous signals that are uniquely predicted by this new $n-n'$ oscillation model.
Submitted 20 February, 2023; v1 submitted 20 January, 2023;
originally announced February 2023.
-
Deep Learning for Time Series Classification and Extrinsic Regression: A Current Survey
Authors:
Navid Mohammadi Foumani,
Lynn Miller,
Chang Wei Tan,
Geoffrey I. Webb,
Germain Forestier,
Mahsa Salehi
Abstract:
Time Series Classification and Extrinsic Regression are important and challenging machine learning tasks. Deep learning has revolutionized natural language processing and computer vision and holds great promise in other fields such as time series analysis where the relevant features must often be abstracted from the raw data but are not known a priori. This paper surveys the current state of the art in the fast-moving field of deep learning for time series classification and extrinsic regression. We review different network architectures and training methods used for these tasks and discuss the challenges and opportunities when applying deep learning to time series data. We also summarize two critical applications of time series classification and extrinsic regression, human activity recognition and satellite earth observation.
Submitted 19 December, 2023; v1 submitted 5 February, 2023;
originally announced February 2023.
-
EuclidNet: Deep Visual Reasoning for Constructible Problems in Geometry
Authors:
Man Fai Wong,
Xintong Qi,
Chee Wei Tan
Abstract:
In this paper, we present a deep learning-based framework for solving geometric construction problems through visual reasoning, which is useful for automated geometry theorem proving. Constructible problems in geometry often ask for the sequence of straightedge-and-compass constructions to construct a given goal given some initial setup. Our EuclidNet framework leverages the neural network architecture Mask R-CNN to extract the visual features from the initial setup and goal configuration with extra points of intersection, and then generate possible construction steps as intermediary data models that are used as feedback in the training process for further refinement of the construction step sequence. This process is repeated recursively until either a solution is found, in which case we backtrack the path for a step-by-step construction guide, or the problem is identified as unsolvable. Our EuclidNet framework is validated on complex Japanese Sangaku geometry problems, demonstrating its capacity to leverage backtracking for deep visual reasoning of challenging problems.
Submitted 27 December, 2022;
originally announced January 2023.
-
Parameterizing the cost function of Dynamic Time Warping with application to time series classification
Authors:
Matthieu Herrmann,
Chang Wei Tan,
Geoffrey I. Webb
Abstract:
Dynamic Time Warping (DTW) is a popular time series distance measure that aligns the points in two series with one another. These alignments support warping of the time dimension to allow for processes that unfold at differing rates. The distance is the minimum sum of costs of the resulting alignments over any allowable warping of the time dimension. The cost of an alignment of two points is a function of the difference in the values of those points. The original cost function was the absolute value of this difference. Other cost functions have been proposed. A popular alternative is the square of the difference. However, to our knowledge, this is the first investigation of both the relative impacts of using different cost functions and the potential to tune cost functions to different tasks. We do so in this paper by using a tunable cost function $λ_γ$ with parameter $γ$. We show that higher values of $γ$ place greater weight on larger pairwise differences, while lower values place greater weight on smaller pairwise differences. We demonstrate that training $γ$ significantly improves the accuracy of both the DTW nearest neighbor and Proximity Forest classifiers.
Submitted 28 March, 2023; v1 submitted 24 January, 2023;
originally announced January 2023.
-
Mirror symmetry for new physics beyond the Standard Model in $4D$ spacetime
Authors:
Wanpeng Tan
Abstract:
The two discrete generators of the full Lorentz group $O(1,3)$ in $4D$ spacetime are typically chosen to be parity inversion symmetry $P$ and time reversal symmetry $T$, which are responsible for the four topologically separate components of $O(1,3)$. Under general considerations of quantum field theory (QFT) with internal degrees of freedom, mirror symmetry is a natural extension of $P$, while $CP$ symmetry resembles $T$ in spacetime. In particular, mirror symmetry is critical as it doubles the full Dirac fermion representation in QFT and essentially introduces a new sector of mirror particles. Its close connection to T-duality and Calabi-Yau mirror symmetry in string theory is clarified. Extension beyond the Standard model can then be constructed using both left- and right-handed heterotic strings guided by mirror symmetry. Many important implications such as supersymmetry, chiral anomalies, topological transitions, Higgs, neutrinos, and dark energy, are discussed.
Submitted 14 July, 2023; v1 submitted 21 December, 2022;
originally announced December 2022.
-
HiTSKT: A Hierarchical Transformer Model for Session-Aware Knowledge Tracing
Authors:
Fucai Ke,
Weiqing Wang,
Weicong Tan,
Lan Du,
Yuan Jin,
Yujin Huang,
Hongzhi Yin
Abstract:
Knowledge tracing (KT) aims to leverage students' learning histories to estimate their mastery levels on a set of pre-defined skills, based on which the corresponding future performance can be accurately predicted. As an important way of providing personalized experience for online education, KT has gained increased attention in recent years. In practice, a student's learning history comprises answers to sets of massed questions, each known as a session, rather than merely being a sequence of independent answers. Theoretically, within and across these sessions, students' learning dynamics can be very different. Therefore, how to effectively model the dynamics of students' knowledge states within and across the sessions is crucial for handling the KT problem. Most existing KT models treat a student's learning records as a single continuing sequence, without capturing the sessional shift of the student's knowledge state. To address this issue, we propose a novel hierarchical transformer model, named HiTSKT, which comprises an interaction(-level) encoder to capture the knowledge a student acquires within a session, and a session(-level) encoder to summarise acquired knowledge across the past sessions. To predict an interaction in the current session, a knowledge retriever integrates the summarised past-session knowledge with the previous interactions' information into proper knowledge representations. These representations are then used to compute the student's current knowledge state. Additionally, to model the student's long-term forgetting behaviour across the sessions, a power-law-decay attention mechanism is designed and deployed in the session encoder, allowing it to place more emphasis on recent sessions. Extensive experiments on three public datasets demonstrate that HiTSKT achieves new state-of-the-art performance on all the datasets compared with six state-of-the-art KT models.
Submitted 6 June, 2023; v1 submitted 22 December, 2022;
originally announced December 2022.
-
AUC Maximization for Low-Resource Named Entity Recognition
Authors:
Ngoc Dang Nguyen,
Wei Tan,
Wray Buntine,
Richard Beare,
Changyou Chen,
Lan Du
Abstract:
Current work in named entity recognition (NER) uses either cross entropy (CE) or conditional random fields (CRF) as the objective/loss functions to optimize the underlying NER model. Both of these traditional objective functions for the NER problem generally produce adequate performance when the data distribution is balanced and there are sufficient annotated training examples. But since NER is inherently an imbalanced tagging problem, the model performance under the low-resource settings could suffer using these standard objective functions. Based on recent advances in area under the ROC curve (AUC) maximization, we propose to optimize the NER model by maximizing the AUC score. We give evidence that by simply combining two binary-classifiers that maximize the AUC score, significant performance improvement over traditional loss functions is achieved under low-resource NER settings. We also conduct extensive experiments to demonstrate the advantages of our method under the low-resource and highly-imbalanced data distribution settings. To the best of our knowledge, this is the first work that brings AUC maximization to the NER setting. Furthermore, we show that our method is agnostic to different types of NER embeddings, models and domains. The code to replicate this work will be provided upon request.
Submitted 13 April, 2023; v1 submitted 9 December, 2022;
originally announced December 2022.
-
Automatic Differentiation for Orbital-Free Density Functional Theory
Authors:
Chuin Wei Tan,
Chris J. Pickard,
William C. Witt
Abstract:
Differentiable programming has facilitated numerous methodological advances in scientific computing. Physics engines supporting automatic differentiation have simpler code, accelerating the development process and reducing the maintenance burden. Furthermore, fully-differentiable simulation tools enable direct evaluation of challenging derivatives - including those directly related to properties measurable by experiment - that are conventionally computed with finite difference methods. Here, we investigate automatic differentiation in the context of orbital-free density functional theory (OFDFT) simulations of materials, introducing PROFESS-AD. Its automatic evaluation of properties derived from first derivatives, including functional potentials, forces, and stresses, facilitates the development and testing of new density functionals, while its direct evaluation of properties requiring higher-order derivatives, such as bulk moduli, elastic constants, and force constants, offers more concise implementations compared to conventional finite difference methods. For these reasons, PROFESS-AD serves as an excellent prototyping tool and provides new opportunities for OFDFT.
Submitted 2 April, 2023; v1 submitted 6 December, 2022;
originally announced December 2022.
-
Detecting Outdated Code Element References in Software Repository Documentation
Authors:
Wen Siang Tan,
Markus Wagner,
Christoph Treude
Abstract:
Outdated documentation is a pervasive problem in software development, preventing effective use of software, and misleading users and developers alike. We posit that one possible reason why documentation becomes out of sync so easily is that developers are unaware of when their source code modifications render the documentation obsolete. Ensuring that the documentation is always in sync with the source code takes considerable effort, especially for large codebases. To address this situation, we propose an approach that can automatically detect code element references that survive in the documentation after all source code instances have been deleted. In this work, we analysed over 3,000 GitHub projects and found that most projects contain at least one outdated code element reference at some point in their history. We submitted GitHub issues to real-world projects containing outdated references detected by our approach, some of which have already led to documentation fixes. As an initiative toward keeping documentation in software repositories up-to-date, we have made our implementation available for developers to scan their GitHub projects for outdated code element references.
Submitted 2 December, 2022;
originally announced December 2022.
-
Path Planning Considering Time-Varying and Uncertain Movement Speed in Multi-Robot Automatic Warehouses: Problem Formulation and Algorithm
Authors:
Jingchuan Chen,
Wei Chen,
Jing Li,
Xiguang Wei,
Wenzhe Tan,
Zuo-Jun Max Shen,
Hongbo Li
Abstract:
Path planning in a multi-robot system refers to calculating a set of actions for each robot that moves it to its goal without conflicting with other robots. Lately, this research topic has received significant attention for its extensive applications, such as airport ground, drone swarms, and automatic warehouses. Despite the available research results, most existing investigations are concerned with robots that have a fixed movement speed and do not consider uncertainty. Therefore, in this work, we study the problem of path planning in the multi-robot automatic warehouse context, taking into account the robots' time-varying and uncertain movement speed. Specifically, the path-planning module searches for a path with as few conflicts as possible for a single agent by calculating traffic cost based on customarily distributed conflict probability and combining it with the classic A* algorithm. However, this probability-based method cannot eliminate all conflicts, and the uncertainty in speed will constantly cause new conflicts. As a supplement, we propose two further modules. The conflict detection and re-planning module periodically chooses, according to our designed rules, which of the agents involved in different types of conflicts require re-planned paths. Also, at each step, the scheduling module fills up the agent's preserved queue and decides which agent has the higher priority when the same element is assigned to two agents simultaneously. Finally, we compare the proposed algorithm with other algorithms from academia and industry, and the results show that the proposed method achieves the best performance.
Submitted 1 December, 2022;
originally announced December 2022.
-
GECAM Localization of High Energy Transients and the Systematic Error
Authors:
Yi Zhao,
Wang-Chen Xue,
Shao-Lin Xiong,
Yuan-Hao Wang,
Jia-Cong Liu,
Qi Liuo,
Yan-Qiu Zhang,
Jian-Chao Sun,
Xiao-Yun Zhao,
Ce Cai,
Shuo Xiao,
Yue Huang,
Xiao-Bo Li,
Zhen Zhang,
Jin-Yuan Liao,
Sheng Yang,
Rui Qiao,
Dong-Ya Guo,
Chao Zheng,
Qi-Bin Yi,
Sheng-Lun Xie,
Zhi-Wei Guo,
Chao-Yang Li,
Chen-Wei Wang,
Wen-Jun Tan
, et al. (41 additional authors not shown)
Abstract:
Gravitational wave high-energy Electromagnetic Counterpart All-sky Monitor (GECAM) is a pair of microsatellites (i.e. GECAM-A and GECAM-B) dedicated to monitoring gamma-ray transients including gravitational waves high-energy electromagnetic counterparts, Gamma-ray Bursts, Soft Gamma-ray Repeaters, Solar Flares and Terrestrial Gamma-ray Flashes. Since launch in December 2020, GECAM-B has detected hundreds of astronomical and terrestrial events. For these bursts, localization is the key for burst identification and classification as well as follow-up observations in multi-wavelength. Here, we propose a Bayesian localization method with Poisson data with Gaussian background profile likelihood to localize GECAM bursts based on the burst counts distribution in detectors with different orientations. We demonstrate that this method can work well for all kinds of bursts, especially for extremely short ones. In addition, we propose a new method to estimate the systematic error of localization based on a confidence level test, which can overcome some problems of the existing method in literature. We validate this method by Monte Carlo simulations, and then apply it to a burst sample with accurate location and find that the mean value of the systematic error of GECAM-B localization is $\sim 2.5^{\circ}$. By considering this systematic error, we can obtain a reliable localization probability map for GECAM bursts. Our methods can be applied to other gamma-ray monitors.
Submitted 23 December, 2022; v1 submitted 28 November, 2022;
originally announced November 2022.
-
Molecular Strong Coupling and Cavity Finesse
Authors:
Kishan S. Menghrajani,
Adarsh B. Vasista,
Wai Jue Tan,
Philip A. Thomas,
Felipe Herrera,
William L. Barnes
Abstract:
Molecular strong coupling offers exciting prospects in physics, chemistry and materials science. Whilst attention has been focused on including realistic models for the molecular systems involved, the important role played by the entire mode structure of the optical cavities employed so far has been largely overlooked. We show that the extent and effectiveness of molecular strong coupling is critically dependent on cavity finesse. Low finesse cavities can show strong coupling as judged from the presence of Rabi splitting in reflectivity measurements, but give photoluminescence signals equivalent to that of bare molecules outside the cavity. The emission of light is shown to involve hybridized light-matter polaritonic states only for cavities with high finesse. By developing an analytical model of cavity photoluminescence in multimode structures, we clarify the role of finite finesse in polariton formation. The detailed nature of the modes supported by a cavity and how these modes interact with the molecular system - whether on resonance or not - will be as important in developing a coherent framework for molecular strong coupling as the inclusion of realistic molecular models.
Submitted 18 March, 2024; v1 submitted 15 November, 2022;
originally announced November 2022.
-
DeepTrace: Learning to Optimize Contact Tracing in Epidemic Networks with Graph Neural Networks
Authors:
Chee Wei Tan,
Pei-Duo Yu,
Siya Chen,
H. Vincent Poor
Abstract:
Digital contact tracing aims to curb epidemics by identifying and mitigating public health emergencies through technology. Backward contact tracing, which tracks the sources of infection, proved crucial in places like Japan for identifying COVID-19 infections from superspreading events. This paper presents a novel perspective of digital contact tracing as online graph exploration and addresses the forward and backward contact tracing problem as a maximum-likelihood (ML) estimation problem using iterative epidemic network data sampling. The challenge lies in the combinatorial complexity and rapid spread of infections. We introduce DeepTrace, an algorithm based on a Graph Neural Network (GNN) that iteratively updates its estimations as new contact tracing data is collected, learning to optimize the maximum likelihood estimation by utilizing topological features to accelerate learning and improve convergence. The contact tracing process combines either BFS or DFS to expand the network and trace the infection source, ensuring comprehensive and efficient exploration. Additionally, the GNN model is fine-tuned through a two-phase approach: pre-training with synthetic networks to approximate likelihood probabilities and fine-tuning with high-quality data to refine the model. Using COVID-19 variant data, we illustrate that DeepTrace surpasses current methods in identifying superspreaders, providing a robust basis for a scalable digital contact tracing strategy.
Submitted 24 June, 2024; v1 submitted 2 November, 2022;
originally announced November 2022.
-
DHR: Distributed Hybrid Rendering for Metaverse Experiences
Authors:
Yu Wei Tan,
Alden Tan,
Nicholas Nge,
Anand Bhojan
Abstract:
Classically, rasterization techniques are performed for real-time rendering to meet the constraint of interactive frame rates. However, such techniques do not produce realistic results as compared to ray tracing approaches. Hence, hybrid rendering has emerged to improve the graphics fidelity of rasterization with ray tracing in real-time. We explore the approach of distributed rendering in incorporating real-time hybrid rendering into metaverse experiences for immersive graphics. In standalone extended reality (XR) devices, such ray tracing-enabled graphics is only feasible through pure cloud-based remote rendering systems that rely on low-latency networks to transmit real-time ray-traced data in response to interactive user input. Under high network latency conditions, remote rendering might not be able to maintain interactive frame rates for the client, adversely affecting the user experience. We adopt hybrid rendering via a distributed rendering approach by integrating ray tracing on powerful remote hardware with raster-based rendering on user access devices. With this hybrid approach, our technique can help standalone XR devices achieve ray tracing-incorporated graphics and maintain interactive frame rates even under high-latency conditions.
Submitted 27 October, 2022;
originally announced October 2022.
-
Visually Improved Erosion Algorithm for the Procedural Generation of Tile-based Terrain
Authors:
Fong Yuan Lim,
Yu Wei Tan,
Anand Bhojan
Abstract:
Procedural terrain generation is the process of generating a digital representation of terrain using a computer program or procedure, with little to no human guidance. This paper proposes a procedural terrain generation algorithm based on a graph representation of fluvial erosion that offers several novel improvements over existing algorithms. Namely, the use of a height constraint map with two types of locally defined constraint strengths; the ability to specify a realistic erosion strength via level of rainfall; and the ability to carve realistic gorges. These novelties allow it to generate more varied and realistic terrain by integrating additional parameters and simulation processes, while being faster and offering more flexibility and ease of use to terrain designers due to the nature and intuitiveness of these new parameters and processes. This paper additionally reviews some common metrics used to evaluate terrain generators, and suggests a completely new one that contributes to a more holistic evaluation.
Submitted 26 October, 2022;
originally announced October 2022.
-
Overlooked Video Classification in Weakly Supervised Video Anomaly Detection
Authors:
Weijun Tan,
Qi Yao,
Jingfeng Liu
Abstract:
Current weakly supervised video anomaly detection algorithms mostly use multiple instance learning (MIL) or its variants. Almost all recent approaches focus on how to select the correct snippets for training to improve performance. They overlook or do not realize the power of video classification in boosting the performance of anomaly detection. In this paper, we explicitly study the power of video classification supervision using a BERT or LSTM. With this BERT or LSTM, CNN features of all snippets of a video can be aggregated into a single feature which can be used for video classification. This simple yet powerful video classification supervision, combined into the MIL framework, brings extraordinary performance improvement on all three major video anomaly detection datasets. In particular, it improves the mean average precision (mAP) on XD-Violence from the previous state of the art of 78.84% to 82.10%. The source code is available at https://github.com/wjtan99/BERT_Anomaly_Video_Classification.
Submitted 19 April, 2023; v1 submitted 12 October, 2022;
originally announced October 2022.
-
RTSDF: Real-time Signed Distance Fields for Soft Shadow Approximation in Games
Authors:
Yu Wei Tan,
Nicholas Chua,
Clarence Koh,
Anand Bhojan
Abstract:
Signed distance fields (SDFs) are a form of surface representation widely used in computer graphics, having applications in rendering, collision detection and modelling. In interactive media such as games, high-resolution SDFs are commonly produced offline and subsequently loaded into the application, representing rigid meshes only. This work develops a novel technique that combines jump flooding and ray tracing to generate approximate SDFs in real-time. Our approach can produce relatively accurate scene representation for rendering soft shadows while maintaining interactive frame rates. We extend our previous work with details on the design and implementation as well as visual quality and performance evaluation of the technique.
Submitted 11 October, 2022;
originally announced October 2022.
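The RTSDF abstract above names jump flooding as one ingredient of its SDF generation. As a rough illustration of that ingredient only, the sketch below runs a generic CPU jump-flooding pass on a 2D grid and returns an unsigned distance field. It is not the authors' GPU technique: the ray tracing stage and the sign of the distance are omitted, and the function names `jump_flood` and `distance_field` are hypothetical. It assumes at least one seed cell.

```python
import math


def jump_flood(width, height, seeds):
    """Approximate, for every grid cell, the nearest seed via jump flooding."""
    # nearest[y][x] is the (sx, sy) of the closest seed seen so far, or None.
    nearest = [[None] * width for _ in range(height)]
    for sx, sy in seeds:
        nearest[sy][sx] = (sx, sy)

    # Start with the largest power-of-two step below the grid size.
    step = 1
    while step * 2 < max(width, height):
        step *= 2

    while step >= 1:
        # Double-buffer so each pass reads only the previous pass's result.
        updated = [row[:] for row in nearest]
        for y in range(height):
            for x in range(width):
                best = nearest[y][x]
                best_d = math.inf
                if best is not None:
                    best_d = math.hypot(x - best[0], y - best[1])
                # Inspect the 3x3 neighbourhood at the current jump distance.
                for dy in (-step, 0, step):
                    for dx in (-step, 0, step):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < width and 0 <= ny < height:
                            cand = nearest[ny][nx]
                            if cand is not None:
                                d = math.hypot(x - cand[0], y - cand[1])
                                if d < best_d:
                                    best, best_d = cand, d
                updated[y][x] = best
        nearest = updated
        step //= 2
    return nearest


def distance_field(width, height, seeds):
    """Unsigned distance from each cell to its (approximately) nearest seed."""
    nearest = jump_flood(width, height, seeds)
    return [
        [math.hypot(x - nearest[y][x][0], y - nearest[y][x][1])
         for x in range(width)]
        for y in range(height)
    ]
```

The number of passes grows with the logarithm of the grid size, which is why the method suits per-frame regeneration; with several seeds the result can contain small errors, consistent with the abstract's description of the SDFs as approximate.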
-
Hybrid MBlur: A Systematic Approach to Augment Rasterization with Ray Tracing for Rendering Motion Blur in Games
Authors:
Yu Wei Tan,
Xiaohan Cui,
Anand Bhojan
Abstract:
Motion blur is commonly used in game cinematics to achieve photorealism by modelling the behaviour of the camera shutter and simulating its effect associated with the relative motion of scene objects. A common real-time post-process approach is spatial sampling, where the directional blur of a moving object is rendered by integrating its colour based on velocity information within a single frame. However, such screen space approaches typically cannot produce accurate partial occlusion semi-transparencies. Our real-time hybrid rendering technique leverages hardware-accelerated ray tracing to correct post-process partial occlusion artifacts by advancing rays recursively into the scene to retrieve background information for motion-blurred regions, with reasonable additional performance cost for rendering game contents. We extend our previous work with details on the design, implementation, and future work of the technique as well as performance comparisons with post-processing.
Submitted 11 October, 2022;
originally announced October 2022.
-
A Hybrid System for Real-time Rendering of Depth of Field Effect in Games
Authors:
Yu Wei Tan,
Nicholas Chua,
Nathan Biette,
Anand Bhojan
Abstract:
Real-time depth of field in game cinematics tends to approximate the semi-transparent silhouettes of out-of-focus objects through post-processing techniques. We leverage ray tracing hardware acceleration and spatio-temporal reconstruction to improve the realism of such semi-transparent regions through hybrid rendering, while maintaining interactive frame rates for immersive gaming. This paper extends our previous work with a complete presentation of our technique and details on its design, implementation, and future work.
Submitted 11 October, 2022;
originally announced October 2022.
-
Cloud-Assisted Hybrid Rendering for Thin-Client Games and VR Applications
Authors:
Yu Wei Tan,
Louiz Kim-Chan,
Anthony Halim,
Anand Bhojan
Abstract:
We introduce a novel distributed rendering approach to generate high-quality graphics in thin-client games and VR applications. Many mobile devices have limited computational power to achieve ray tracing in real-time. Hence, hardware-accelerated cloud servers can perform ray tracing instead and have their output streamed to clients in remote rendering. Applying the approach of distributed hybrid rendering, we leverage the computational capabilities of both the thin client and powerful server by performing rasterization locally while offloading ray tracing to the server. With advancements in 5G technology, the server and client can communicate effectively over the network and work together to produce a high-quality output while maintaining interactive frame rates. Our approach can achieve better visuals as compared to local rendering but faster performance as compared to remote rendering.
Submitted 11 October, 2022;
originally announced October 2022.
-
Hybrid MBlur: Using Ray Tracing to Solve the Partial Occlusion Artifacts in Real-Time Rendering of Motion Blur Effect
Authors:
Yu Wei Tan,
Xiaohan Cui,
Anand Bhojan
Abstract:
For a foreground object in motion, details of its background which would otherwise be hidden are uncovered through its inner blur. This paper presents a novel hybrid motion blur rendering technique combining post-process image filtering and hardware-accelerated ray tracing. In each frame, we advance rays recursively into the scene to retrieve background information for inner blur regions and apply a post-process filtering pass on the ray-traced background and rasterized colour before compositing them together. Our approach achieves more accurate partial occlusion semi-transparencies for moving objects while maintaining interactive frame rates.
Submitted 11 October, 2022;
originally announced October 2022.
-
Multilingual Representation Distillation with Contrastive Learning
Authors:
Weiting Tan,
Kevin Heffernan,
Holger Schwenk,
Philipp Koehn
Abstract:
Multilingual sentence representations from large models encode semantic information from two or more languages and can be used for different cross-lingual information retrieval and matching tasks. In this paper, we integrate contrastive learning into multilingual representation distillation and use it for quality estimation of parallel sentences (i.e., find semantically similar sentences that can be used as translations of each other). We validate our approach with multilingual similarity search and corpus filtering tasks. Experiments across different low-resource languages show that our method greatly outperforms previous sentence encoders such as LASER, LASER3, and LaBSE.
Submitted 30 April, 2023; v1 submitted 10 October, 2022;
originally announced October 2022.
-
Hybrid DoF: Ray-Traced and Post-Processed Hybrid Depth of Field Effect for Real-Time Rendering
Authors:
Yu Wei Tan,
Nicholas Chua,
Nathan Biette,
Anand Bhojan
Abstract:
Depth of Field (DoF) in games is usually achieved as a post-process effect by blurring pixels in the sharp rasterized image based on the defined focus plane. This paper describes a novel real-time DoF technique that uses ray tracing with image filtering to achieve more accurate partial occlusion semi-transparencies on edges of blurry foreground geometry. This hybrid rendering technique leverages ray tracing hardware acceleration as well as spatio-temporal reconstruction techniques to achieve interactive frame rates.
Submitted 10 October, 2022;
originally announced October 2022.
-
RTSDF: Generating Signed Distance Fields in Real Time for Soft Shadow Rendering
Authors:
Yu Wei Tan,
Nicholas Chua,
Clarence Koh,
Anand Bhojan
Abstract:
Signed Distance Fields (SDFs) for surface representation are commonly generated offline and subsequently loaded into interactive applications like games. Since they are not updated every frame, they only provide a rigid surface representation. While there are methods to generate them quickly on GPU, the efficiency of these approaches is limited at high resolutions. This paper showcases a novel technique that combines jump flooding and ray tracing to generate approximate SDFs in real-time for soft shadow approximation, achieving prominent shadow penumbras while maintaining interactive frame rates.
Submitted 10 October, 2022;
originally announced October 2022.
-
Automated Sex Classification of Children's Voices and Changes in Differentiating Factors with Age
Authors:
Fuling Chen,
Roberto Togneri,
Murray Maybery,
Diana Weiting Tan
Abstract:
Sex classification of children's voices allows for an investigation of the development of secondary sex characteristics which has been a key interest in the field of speech analysis. This research investigated a broad range of acoustic features from scripted and spontaneous speech and applied a hierarchical clustering-based machine learning model to distinguish the sex of children aged between 5 and 15 years. We proposed an optimal feature set and our modelling achieved an average F1 score (the harmonic mean of the precision and recall) of 0.84 across all ages. Our results suggest that the sex classification is generally more accurate when a model is developed for each year group rather than for children in 4-year age bands, with classification accuracy being better for older age groups. We found that spontaneous speech could provide more helpful cues in sex classification than scripted speech, especially for children younger than 7 years. For younger age groups, a broad range of acoustic factors contributed evenly to sex classification, while for older age groups, F0-related acoustic factors were found to be the most critical predictors generally. Other important acoustic factors for older age groups include vocal tract length estimators, spectral flux, loudness and unvoiced features.
Submitted 26 September, 2022;
originally announced September 2022.
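The abstract above reports an average F1 score and defines it as the harmonic mean of precision and recall. A minimal sketch of that definition follows, computed from confusion-matrix counts; the function name and the example counts are illustrative and are not taken from the paper.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 as the harmonic mean of precision and recall."""
    if true_pos == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

For example, 8 true positives with 2 false positives and 2 false negatives give precision and recall of 0.8 each, so the F1 score is also 0.8.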
-
A Localization Method of High Energy Transients for All-Sky Gamma-Ray Monitor
Authors:
Yi Zhao,
Wangchen Xue,
Shaolin Xiong,
Qi Luo,
Yuanhao Wang,
Jiacong Liu,
Heng Yu,
Xiaoyun Zhao,
Yue Huang,
Jinyuan Liao,
Jianchao Sun,
Xiaobo Li,
Qibin Yi,
Ce Cai,
Shuo Xiao,
Shenglun Xie,
Chao Zheng,
Yanqiu Zhang,
Chenwei Wang,
Wenjun Tan,
Zhiwei Guo,
Chaoyang Li,
Zhenghua An,
Gang Chen,
Yanqi Du
, et al. (40 additional authors not shown)
Abstract:
Fast and reliable localization of high-energy transients is crucial for characterizing the burst properties and guiding the follow-up observations. Localization based on the relative counts of different detectors has been widely used for all-sky gamma-ray monitors. There are two major methods for this counts distribution localization: $χ^{2}$ minimization method and the Bayesian method. Here we propose a modified Bayesian method that could take advantage of both the accuracy of the Bayesian method and the simplicity of the $χ^{2}$ method. With comprehensive simulations, we find that our Bayesian method with Poisson likelihood is generally more applicable for various bursts than $χ^{2}$ method, especially for weak bursts. We further proposed a location-spectrum iteration approach based on the Bayesian inference, which could alleviate the problems caused by the spectral difference between the burst and location templates. Our method is very suitable for scenarios with limited computation resources or time-sensitive applications, such as in-flight localization software, and low-latency localization for rapid follow-up observations.
Submitted 22 February, 2023; v1 submitted 26 September, 2022;
originally announced September 2022.
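The abstract above contrasts $χ^{2}$ minimization with a Bayesian method using a Poisson likelihood for localization from the relative counts in different detectors. The sketch below shows the two fit statistics in their simplest textbook form and a grid search over candidate directions. It is not the authors' pipeline: the templates are hypothetical expected counts per detector, assumed to be already scaled to the burst and to include background, and the $χ^{2}$ uses the Pearson form with the model variance, which is also an assumption.

```python
import math


def chi_square(observed, expected):
    """Pearson chi-square between observed and expected detector counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def poisson_neg_log_likelihood(observed, expected):
    """Negative log of the Poisson likelihood: sum of e - o*ln(e) + ln(o!)."""
    return sum(
        e - o * math.log(e) + math.lgamma(o + 1)
        for o, e in zip(observed, expected)
    )


def localize(observed, templates, statistic):
    """Return the candidate direction whose template minimises the statistic."""
    return min(templates, key=lambda name: statistic(observed, templates[name]))


# Hypothetical counts in three detectors and two candidate sky directions.
observed_counts = [12, 30, 7]
location_templates = {"A": [10.0, 28.0, 9.0], "B": [25.0, 12.0, 8.0]}
best_chi2 = localize(observed_counts, location_templates, chi_square)
best_poisson = localize(observed_counts, location_templates,
                        poisson_neg_log_likelihood)
```

With a flat prior over directions, minimising the negative log-likelihood is equivalent to picking the posterior maximum. The two statistics diverge mainly at low counts, where the Gaussian approximation behind $χ^{2}$ is poor, which matches the abstract's remark about weak bursts.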
-
Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning
Authors:
Yuchen Xiao,
Weihao Tan,
Christopher Amato
Abstract:
Synchronizing decisions across multiple agents in realistic settings is problematic since it requires agents to wait for other agents to terminate and communicate about termination reliably. Ideally, agents should learn and execute asynchronously instead. Such asynchronous methods also allow temporally extended actions that can take different amounts of time based on the situation and action executed. Unfortunately, current policy gradient methods are not applicable in asynchronous settings, as they assume that agents synchronously reason about action selection at every time step. To allow asynchronous learning and decision-making, we formulate a set of asynchronous multi-agent actor-critic methods that allow agents to directly optimize asynchronous policies in three standard training paradigms: decentralized learning, centralized learning, and centralized training for decentralized execution. Empirical results (in simulation and hardware) in a variety of realistic domains demonstrate the superiority of our approaches in large multi-agent problems and validate the effectiveness of our algorithms for learning high-quality and asynchronous solutions.
Submitted 10 October, 2022; v1 submitted 20 September, 2022;
originally announced September 2022.
-
FRANS: Automatic Feature Extraction for Time Series Forecasting
Authors:
Alexey Chernikov,
Chang Wei Tan,
Pablo Montero-Manso,
Christoph Bergmeir
Abstract:
Feature extraction methods help in dimensionality reduction and capture relevant information. In time series forecasting (TSF), features can be used as auxiliary information to achieve better accuracy. Traditionally, features used in TSF are handcrafted, which requires domain knowledge and significant data-engineering work. In this research, we first introduce a notion of static and dynamic features, which then enables us to develop our autonomous Feature Retrieving Autoregressive Network for Static features (FRANS) that does not require domain knowledge. The method is based on a CNN classifier that is trained to create for each series a collective and unique class representation either from parts of the series or, if class labels are available, from a set of series of the same class. This makes it possible to discriminate between series with similar behaviour but from different classes, and makes the features extracted from the classifier maximally discriminatory. We explore the interpretability of our features, and evaluate the prediction capabilities of the method within the forecasting meta-learning environment FFORMA. Our results show that our features lead to improvement in accuracy in most situations. Once trained, our approach creates features orders of magnitude faster than statistical methods.
Submitted 14 September, 2022;
originally announced September 2022.
-
Integral Sampler and Polynomial Multiplication Architecture for Lattice-based Cryptography
Authors:
Antian Wang,
Weihang Tan,
Keshab K. Parhi,
Yingjie Lao
Abstract:
With the rise of powerful quantum computers, lattice-based cryptography has proliferated in recent cryptographic hardware implementations due to its resistance against quantum computers. Among the computational blocks of lattice-based cryptography, the random errors produced by the sampler play a key role in ensuring the security of these schemes. This paper proposes an integral architecture for the sampler, which can reduce the overall resource consumption by reusing the multipliers and adders within the modular polynomial computation. For instance, our experimental results show that the proposed design can effectively reduce the DSP usage of the discrete Ziggurat sampling method.
Submitted 30 August, 2022;
originally announced August 2022.
-
Bitext Mining for Low-Resource Languages via Contrastive Learning
Authors:
Weiting Tan,
Philipp Koehn
Abstract:
Mining high-quality bitexts for low-resource languages is challenging. This paper shows that sentence representation of language models fine-tuned with multiple negatives ranking loss, a contrastive objective, helps retrieve clean bitexts. Experiments show that parallel data mined from our approach substantially outperform the previous state-of-the-art method on low resource languages Khmer and Pashto.
△ Less
Submitted 23 August, 2022;
originally announced August 2022.
-
Unified Normalization for Accelerating and Stabilizing Transformers
Authors:
Qiming Yang,
Kai Zhang,
Chaoxiang Lan,
Zhi Yang,
Zheyang Li,
Wenming Tan,
Jun Xiao,
Shiliang Pu
Abstract:
Solid results from Transformers have made them prevailing architectures in various natural language and vision tasks. As a default component in Transformers, Layer Normalization (LN) normalizes activations within each token to boost robustness. However, LN requires on-the-fly statistics calculation in inference as well as division and square root operations, leading to inefficiency on hardware. What is more, replacing LN with other hardware-efficient normalization schemes (e.g., Batch Normalization) results in inferior performance, or even collapse in training. We find that this dilemma is caused by abnormal behaviors of activation statistics, including large fluctuations over iterations and extreme outliers across layers. To tackle these issues, we propose Unified Normalization (UN), which can speed up inference by being fused with other linear operations and achieve performance on par with LN. UN strives to boost performance by calibrating the activation and gradient statistics with a tailored fluctuation smoothing strategy. Meanwhile, an adaptive outlier filtration strategy is applied to avoid collapse in training; its effectiveness is theoretically proved and experimentally verified in this paper. We demonstrate that UN can be an efficient drop-in alternative to LN by conducting extensive experiments on language and vision tasks. Besides, we evaluate the efficiency of our method on GPU. Transformers equipped with UN enjoy about 31% inference speedup and nearly 18% memory reduction. Code will be released at https://github.com/hikvision-research/Unified-Normalization.
Submitted 2 August, 2022;
originally announced August 2022.
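The abstract above argues that Layer Normalization is costly at inference because its statistics, division, and square root must be computed per token, whereas a normalization with fixed statistics can be fused into neighbouring linear operations. The sketch below illustrates only that contrast. It is not Unified Normalization itself: `fold_fixed_stats` is a hypothetical helper showing the standard BatchNorm-style folding of precomputed statistics into a per-feature scale and shift.

```python
import math


def layer_norm(token, gamma, beta, eps=1e-5):
    """Normalise one token using statistics computed on the fly."""
    mean = sum(token) / len(token)
    var = sum((x - mean) ** 2 for x in token) / len(token)
    inv_std = 1.0 / math.sqrt(var + eps)  # per-token sqrt and division
    return [g * (x - mean) * inv_std + b
            for x, g, b in zip(token, gamma, beta)]


def fold_fixed_stats(mean, var, gamma, beta, eps=1e-5):
    """Fold fixed (offline) statistics into a per-feature scale and shift.

    The result is a plain affine map y = scale * x + shift, which can be
    merged into an adjacent linear layer ahead of time.
    """
    inv_std = 1.0 / math.sqrt(var + eps)  # computed once, not per token
    scale = [g * inv_std for g in gamma]
    shift = [b - mean * s for b, s in zip(beta, scale)]
    return scale, shift


# Toy token with four features and identity affine parameters.
token = [1.0, 2.0, 3.0, 4.0]
ones, zeros = [1.0] * 4, [0.0] * 4
ln_out = layer_norm(token, ones, zeros)
scale, shift = fold_fixed_stats(2.5, 1.25, ones, zeros)
fused_out = [s * x + t for x, s, t in zip(token, scale, shift)]
```

In this toy case the fixed statistics happen to equal the token's own mean and variance, so both paths agree; in practice the fixed statistics are estimated during training and differ from any single token's statistics, which is where the training-stability issues described in the abstract arise.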