Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay

Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, J. Cheng, Y.-C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, et al. (177 additional authors not shown)

Abstract: This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive region, the relative $\overline{ν}_{e}$ rates and energy spectra variation among the near and far detectors gives $\sin^2 2θ_{13} = 0.0759_{-0.0049}^{+0.0050}$ and $Δm^2_{32} = (2.72^{+0.14}_{-0.15})\times10^{-3}$ eV$^2$ assuming the normal neutrino mass ordering, and $Δm^2_{32} = (-2.83^{+0.15}_{-0.14})\times10^{-3}$ eV$^2$ for the inverted neutrino mass ordering. This estimate of $\sin^2 2θ_{13}$ is consistent with and essentially independent from the one obtained using the capture-on-gadolinium sample at Daya Bay. The combination of these two results yields $\sin^2 2θ_{13} = 0.0833\pm0.0022$, which represents an 8% relative improvement in precision regarding the Daya Bay full 3158-day capture-on-gadolinium result.

Submitted 3 June, 2024; originally announced June 2024.
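
The uncertainties quoted in the abstract above can be cross-checked with a short calculation. This is only a sketch under stated assumptions: the asymmetric capture-on-hydrogen (nH) uncertainty is symmetrized, the two samples are treated as fully independent, and the capture-on-gadolinium (nGd) uncertainty is not given in this listing, so it is inferred from the quoted 8% improvement. The variable names are illustrative and not taken from the paper.

```python
import math

# Numbers quoted in the abstract for sin^2(2*theta_13).
sigma_nh = (0.0049 + 0.0050) / 2   # nH uncertainty, symmetrized (an assumption)
sigma_comb_quoted = 0.0022         # uncertainty of the nH + nGd combination
improvement = 0.08                 # relative improvement over nGd alone

# Implied nGd-only uncertainty (an inference, not a quoted value):
# sigma_comb = (1 - improvement) * sigma_ngd
sigma_ngd = sigma_comb_quoted / (1.0 - improvement)

# Inverse-variance combination of two independent measurements.
sigma_comb_est = 1.0 / math.sqrt(1.0 / sigma_nh**2 + 1.0 / sigma_ngd**2)

print(f"implied nGd uncertainty:        {sigma_ngd:.4f}")
print(f"estimated combined uncertainty: {sigma_comb_est:.4f}")
```

Under these assumptions the estimated combined uncertainty rounds to the quoted 0.0022, which is what one would expect if the two samples are close to independent, as the abstract states.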

arXiv:2405.06906 [pdf, other]

Finding structure in logographic writing with library learning

Authors: Guangyuan Jiang, Matthias Hofer, Jiayuan Mao, Lionel Wong, Joshua B. Tenenbaum, Roger P. Levy

Abstract: One hallmark of human language is its combinatoriality -- reusing a relatively small inventory of building blocks to create a far larger inventory of increasingly complex structures. In this paper, we explore the idea that combinatoriality in language reflects a human inductive bias toward representational efficiency in symbol systems. We develop a computational framework for discovering structure in a writing system. Built on top of state-of-the-art library learning and program synthesis techniques, our computational framework discovers known linguistic structures in the Chinese writing system and reveals how the system evolves towards simplification under pressures for representational efficiency. We demonstrate how a library learning approach, utilizing learned abstractions and compression, may help reveal the fundamental computational principles that underlie the creation of combinatorial structures in human cognition, and offer broader insights into the evolution of efficient communication systems.

Submitted 11 May, 2024; originally announced May 2024.

Comments: Accepted at CogSci 2024 (Talk)

arXiv:2404.16877 [pdf, other]

Rapid Deployment of DNNs for Edge Computing via Structured Pruning at Initialization

Authors: Bailey J. Eccles, Leon Wong, Blesson Varghese

Abstract: Edge machine learning (ML) enables localized processing of data on devices and is underpinned by deep neural networks (DNNs). However, DNNs cannot be easily run on devices due to their substantial computing, memory and energy requirements for delivering performance that is comparable to cloud-based ML. Therefore, model compression techniques, such as pruning, have been considered. Existing pruning methods are problematic for edge ML since they: (1) Create compressed models that have limited runtime performance benefits (using unstructured pruning) or compromise the final model accuracy (using structured pruning), and (2) Require substantial compute resources and time for identifying a suitable compressed DNN model (using neural architecture search). In this paper, we explore a new avenue, referred to as Pruning-at-Initialization (PaI), using structured pruning to mitigate the above problems. We develop Reconvene, a system for rapidly generating pruned models suited for edge deployments using structured PaI. Reconvene systematically identifies and prunes DNN convolution layers that are least sensitive to structured pruning. Reconvene rapidly creates pruned DNNs within seconds that are up to 16.21x smaller and 2x faster while maintaining the same accuracy as an unstructured PaI counterpart.

Submitted 22 April, 2024; originally announced April 2024.

Comments: The 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing

arXiv:2404.16182 [pdf]

Optomagnetic forces on YIG/YFeO3 microspheres levitated in chiral hollow-core photonic crystal fibre

Authors: Soumya Chakraborty, Gordon K. L. Wong, Ferdi Oda, Vanessa Wachter, Silvia Viola Kusminskiy, Tadahiro Yokosawa, Sabine Hübner, Benjamin Apeleo Zubiri, Erdmann Spiecker, Monica Distaso, Philip St. J. Russell, Nicolas Y. Joly

Abstract: We explore a magnetooptomechanical system consisting of a single magnetic microparticle optically levitated within the core of a helically twisted single-ring hollow-core photonic crystal fibre. We use newly-developed magnetic particles that have a core of antiferromagnetic yttrium-ortho-ferrite (YFeO3) and a shell of ferrimagnetic YIG (Y3Fe5O12) approximately 50 nm thick. Using a 632.8 nm probe beam, we observe optical-torque-induced rotation of the particle and rotation of the magnetization vector in presence of an external static magnetic field. This one-of-a-kind platform opens a path to novel investigations of optomagnetic physics with levitated magnetic particles.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.14957 [pdf, other]

doi 10.1126/sciadv.adm9563

Strongly correlated multi-electron bunches from interaction with quantum light

Authors: Suraj Kumar, Jeremy Lim, Nicholas Rivera, Wesley Wong, Yee Sin Ang, Lay Kee Ang, Liang Jie Wong

Abstract: Strongly correlated electron systems are a cornerstone of modern physics, being responsible for groundbreaking phenomena from superconducting magnets to quantum computing. In most cases, correlations in electrons arise exclusively due to Coulomb interactions. In this work, we reveal that free electrons interacting simultaneously with a light field can become highly correlated via mechanisms beyond Coulomb interactions. In the case of two electrons, the resulting Pearson correlation coefficient (PCC) for the joint probability distribution of the output electron energies is enhanced over 13 orders of magnitude compared to that of electrons interacting with the light field in succession (one after another). These highly correlated electrons are the result of momentum and energy exchange between the participating electrons via the external quantum light field. Our findings pave the way to the creation and control of highly correlated free electrons for applications including quantum information and ultra-fast imaging.

Submitted 13 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: 3 figures for Main Text, 4 figures for Supplementary Materials, Supplementary is available at end of Main Text figures
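
The abstract above uses the Pearson correlation coefficient (PCC) of a joint probability distribution as its figure of merit. As a generic illustration of that quantity only (not the paper's calculation), the sketch below evaluates the PCC of a small discrete joint distribution. The function name, the toy probability tables, and the energy labels are all hypothetical.

```python
import math

def pearson_from_joint(p, xs, ys):
    """PCC of a discrete joint distribution with p[i][j] = P(X = xs[i], Y = ys[j])."""
    pairs = [(i, j) for i in range(len(xs)) for j in range(len(ys))]
    ex = sum(p[i][j] * xs[i] for i, j in pairs)            # E[X]
    ey = sum(p[i][j] * ys[j] for i, j in pairs)            # E[Y]
    exx = sum(p[i][j] * xs[i] ** 2 for i, j in pairs)      # E[X^2]
    eyy = sum(p[i][j] * ys[j] ** 2 for i, j in pairs)      # E[Y^2]
    exy = sum(p[i][j] * xs[i] * ys[j] for i, j in pairs)   # E[XY]
    cov = exy - ex * ey
    return cov / math.sqrt((exx - ex ** 2) * (eyy - ey ** 2))

energies = [0.0, 1.0]  # toy output-energy values in arbitrary units

# Outcomes always coincide: perfectly correlated.
pcc_correlated = pearson_from_joint([[0.5, 0.0], [0.0, 0.5]], energies, energies)

# Joint table factorizes into its marginals: uncorrelated.
pcc_independent = pearson_from_joint([[0.25, 0.25], [0.25, 0.25]], energies, energies)

print(pcc_correlated, pcc_independent)
```

A PCC near zero means the joint distribution is close to the product of its marginals, while values near ±1 indicate that one electron's output energy largely determines the other's.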

arXiv:2404.12169 [pdf, other]

Shotit: compute-efficient image-to-video search engine for the cloud

Authors: Leslie Wong

Abstract: With the rapid growth of information technology, users are exposed to a massive amount of data online, including image, music, and video. This has led to strong needs to provide effective corresponsive search services such as image, music, and video search services. Most of them are operated based on keywords, namely using keywords to find related image, music, and video. Additionally, there are image-to-image search services that enable users to find similar images using one input image. Given that videos are essentially composed of image frames, then similar videos can be searched by one input image or screenshot. We want to target this scenario and provide an efficient method and implementation in this paper. We present Shotit, a cloud-native image-to-video search engine that tailors this search scenario in a compute-efficient approach. One main limitation faced in this scenario is the scale of its dataset. A typical image-to-image search engine only handles one-to-one relationships, colloquially, one image corresponds to another single image. But image-to-video proliferates. Take a 24-min length video as an example, it will generate roughly 20,000 image frames. As the number of videos grows, the scale of the dataset explodes exponentially. In this case, a compute-efficient approach ought to be considered, and the system design should cater to the cloud-native trend. Choosing an emerging technology - vector database as its backbone, Shotit fits these two metrics performantly.
Experiments for two different datasets, a 50 thousand-scale Blender Open Movie dataset, and a 50 million-scale proprietary TV genre dataset at a 4 Core 32GB RAM Intel Xeon Gold 6271C cloud machine with object storage reveal the effectiveness of Shotit. A demo regarding the Blender Open Movie dataset is illustrated within this paper.

Submitted 18 April, 2024; originally announced April 2024.

Comments: Submitted to ACM ICMR 2024

arXiv:2404.07236 [pdf, other]

Lightweight Deep Learning for Resource-Constrained Environments: A Survey

Authors: Hou-I Liu, Marco Galindo, Hongxia Xie, Lai-Kuan Wong, Hong-Han Shuai, Yung-Hui Li, Wen-Huang Cheng

Abstract: Over the past decade, the dominance of deep learning has prevailed across various domains of artificial intelligence, including natural language processing, computer vision, and biomedical signal processing. While there have been remarkable improvements in model accuracy, deploying these models on lightweight devices, such as mobile phones and microcontrollers, is constrained by limited resources. In this survey, we provide comprehensive design guidance tailored for these devices, detailing the meticulous design of lightweight models, compression methods, and hardware acceleration strategies. The principal goal of this work is to explore methods and concepts for getting around hardware constraints without compromising the model's accuracy. Additionally, we explore two notable paths for lightweight deep learning in the future: deployment techniques for TinyML and Large Language Models. Although these paths undoubtedly have potential, they also present significant challenges, encouraging research into unexplored areas.

Submitted 12 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

Comments: 40 pages

arXiv:2404.06625 [pdf, other]

Adapted optimal transport between Gaussian processes in discrete time

Authors: Madhu Gunasingam, Ting-Kam Leonard Wong

Abstract: We derive explicitly the adapted $2$-Wasserstein distance between non-degenerate Gaussian distributions on $\mathbb{R}^N$ and characterize the optimal bicausal coupling(s). This leads to an adapted version of the Bures-Wasserstein distance on the space of positive definite matrices.

Submitted 30 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: 14 pages, 2 figures. Revised

MSC Class: 49Q22; 60G15 (Primary); 60B10 (Secondary)
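
For orientation, the classical (non-adapted) counterpart of the distance named in the abstract above has a well-known closed form, given below. The symbols $m_i$ and $\Sigma_i$ for the means and covariance matrices are notation introduced here as an assumption. The adapted formula derived in the paper is not reproduced and may differ.

```latex
% Classical 2-Wasserstein distance between Gaussians on R^N
W_2^2\bigl(\mathcal{N}(m_1,\Sigma_1),\,\mathcal{N}(m_2,\Sigma_2)\bigr)
  = \lVert m_1 - m_2 \rVert^2
  + \operatorname{tr}\!\Bigl(\Sigma_1 + \Sigma_2
      - 2\bigl(\Sigma_1^{1/2}\,\Sigma_2\,\Sigma_1^{1/2}\bigr)^{1/2}\Bigr)
```

The trace term alone is the squared Bures-Wasserstein distance between the positive definite matrices $\Sigma_1$ and $\Sigma_2$.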

arXiv:2404.01687 [pdf, other]

Search for a sub-eV sterile neutrino using Daya Bay's full dataset

Authors: F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, Y. Y. Ding, et al. (176 additional authors not shown)

Abstract: This Letter presents results of a search for the mixing of a sub-eV sterile neutrino with three active neutrinos based on the full data sample of the Daya Bay Reactor Neutrino Experiment, collected during 3158 days of detector operation, which contains $5.55 \times 10^{6}$ reactor $\overline{ν}_{e}$ candidates identified as inverse beta-decay interactions followed by neutron-capture on gadolinium. The analysis benefits from a doubling of the statistics of our previous result and from improvements of several important systematic uncertainties. No significant oscillation due to mixing of a sub-eV sterile neutrino with active neutrinos was found. Exclusion limits are set by both Feldman-Cousins and CLs methods. Light sterile neutrino mixing with $\sin^2 2θ_{14} \gtrsim 0.01$ can be excluded at 95% confidence level in the region of $0.01$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.1$ eV$^2$. This result represents the world-leading constraints in the region of $2 \times 10^{-4}$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.2$ eV$^2$.

Submitted 15 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 7 pages, 4 figures, 1 table

arXiv:2404.01611 [pdf]

Audio Simulation for Sound Source Localization in Virtual Evironment

Authors: Yi Di Yuan, Swee Liang Wong, Jonathan Pan

Abstract: Non-line-of-sight localization in signal-deprived environments is a challenging yet pertinent problem. Acoustic methods in such predominantly indoor scenarios encounter difficulty due to the reverberant nature. In this study, we aim to locate sound sources to specific locations within a virtual environment by leveraging physically grounded sound propagation simulations and machine learning methods. This process attempts to overcome the issue of data insufficiency to localize sound sources to their location of occurrence especially in post-event localization. We achieve 0.786 +/- 0.0136 F1-score using an audio transformer spectrogram approach.

Submitted 1 April, 2024; originally announced April 2024.

Comments: 2024 IEEE World Forum on Public Safety Technology

arXiv:2404.01135 [pdf]

Enhancing Reasoning Capacity of SLM using Cognitive Enhancement

Authors: Jonathan Pan, Swee Liang Wong, Xin Wei Chia, Yidi Yuan

Abstract: Large Language Models (LLMs) have been applied to automate cyber security activities and processes including cyber investigation and digital forensics. However, the use of such models for cyber investigation and digital forensics should address accountability and security considerations. Accountability ensures models have the means to provide explainable reasonings and outcomes. This information can be extracted through explicit prompt requests. For security considerations, it is crucial to address privacy and confidentiality of the involved data during data processing as well. One approach to deal with this consideration is to have the data processed locally using a local instance of the model. Due to limitations of locally available resources, namely memory and GPU capacities, a Smaller Large Language Model (SLM) will typically be used. These SLMs have significantly fewer parameters compared to the LLMs. However, such size reductions have notable performance reduction, especially when tasked to provide reasoning explanations. In this paper, we aim to mitigate performance reduction through the integration of cognitive strategies that humans use for problem-solving. We term this as cognitive enhancement through prompts. Our experiments showed significant improvement gains of the SLMs' performances when such enhancements were applied. We believe that our exploration study paves the way for further investigation into the use of cognitive enhancement to optimize SLM for cyber security applications.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2402.19001 [pdf, other]

Analysis of the Two-Step Heterogeneous Transfer Learning for Laryngeal Blood Vessel Classification: Issue and Improvement

Authors: Xinyi Fang, Xu Yang, Chak Fong Chong, Kei Long Wong, Yapeng Wang, Tiankui Zhang, Sio-Kei Im

Abstract: Accurate classification of laryngeal vascular as benign or malignant is crucial for early detection of laryngeal cancer. However, organizations with limited access to laryngeal vascular images face challenges due to the lack of large and homogeneous public datasets for effective learning. Distinguished from the most familiar works, which directly transfer the ImageNet pre-trained models to the target domain for fine-tuning, this work pioneers exploring two-step heterogeneous transfer learning (THTL) for laryngeal lesion classification with nine deep-learning models, utilizing the diabetic retinopathy color fundus images, semantically non-identical yet vascular images, as the intermediate domain. Attention visualization technique, Layer Class Activate Map (LayerCAM), reveals a novel finding that yet the intermediate and the target domain both reflect vascular structure to a certain extent, the prevalent radial vascular pattern in the intermediate domain prevents learning the features of twisted and tangled vessels that distinguish the malignant class in the target domain, summarizes a vital rule for laryngeal lesion classification using THTL. To address this, we introduce an enhanced fine-tuning strategy in THTL called Step-Wise Fine-Tuning (SWFT) and apply it to the ResNet models. SWFT progressively refines model performance by accumulating fine-tuning layers from back to front, guided by the visualization results of LayerCAM. Comparison with the original THTL approach shows significant improvements.
For ResNet18, the accuracy and malignant recall increases by 26.1% and 79.8%, respectively, while for ResNet50, these indicators improve by 20.4% and 62.2%, respectively.

Submitted 14 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.18675 [pdf, other]

Robot Body Schema Learning from Full-body Extero/Proprioception Sensors

Authors: Shuo Jiang, **kun Zhang, Lawson Wong

Abstract: For a robot, its body structure is an a-prior knowledge when it is designed. However, when such information is not available, can a robot recognize it by itself? In this paper, we aim to grant a robot such ability to learn its body structure from exteroception and proprioception data collected from on-body sensors. By a novel machine learning method, the robot can learn a binary Heterogeneous Dependency Matrix from its sensor readings. We showed such matrix is equivalent to a Heterogeneous out-tree structure which can uniquely represent the robot body topology. We explored the properties of such matrix and the out-tree, and proposed a remedy to fix them when they are contaminated by partial observability or data noise. We ran our algorithm on 6 different robots with different body structures in simulation and 1 real robot. Our algorithm correctly recognized their body structures with only on-body sensor readings but no topology prior knowledge.

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.17681 [pdf, other]

JKO schemes with general transport costs

Authors: Cale Rankin, Ting-Kam Leonard Wong

Abstract: We modify the JKO scheme, which is a time discretization of Wasserstein gradient flows, by replacing the Wasserstein distance with more general transport costs on manifolds. We show when the cost function has a mixed Hessian which defines a Riemannian metric, our modified JKO scheme converges under suitable conditions to the corresponding Riemannian Fokker--Planck equation. Thus on a Riemannian manifold one may replace the (squared) Riemannian distance with any cost function which induces the metric. Of interest is when the Riemannian distance is computationally intractable, but a suitable cost has a simple analytic expression. We consider the Fokker--Planck equation on compact submanifolds with the Neumann boundary condition and on complete Riemannian manifolds with a finite drift condition. As an application we consider Hessian manifolds, taking as a cost the Bregman divergence.

Submitted 27 February, 2024; originally announced February 2024.

Comments: 27 pages

MSC Class: 35K57 (Primary) 58J35; 82C31 (Secondary)
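
As background for the abstract above, one step of the classical JKO scheme can be written as the minimization below. The notation is an assumption introduced here: $F$ is the free-energy functional, $\tau > 0$ the time step, and $\rho_k$ the density at step $k$. The modified scheme described in the abstract replaces the squared Wasserstein term with a more general optimal transport cost. Its exact form and scaling are not given in this listing and are not reproduced.

```latex
% One step of the classical JKO scheme (time step \tau)
\rho_{k+1} \in \operatorname*{arg\,min}_{\rho}
  \left\{ F(\rho) + \frac{1}{2\tau}\, W_2^2(\rho, \rho_k) \right\}
```

As $\tau \to 0$, the interpolated iterates of the classical scheme converge to the solution of the Fokker-Planck equation when $F$ is the sum of a potential energy and the entropy.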

arXiv:2402.10416 [pdf, other]

Grounding Language about Belief in a Bayesian Theory-of-Mind

Authors: Lance Ying, Tan Zhi-Xuan, Lionel Wong, Vikash Mansinghka, Joshua Tenenbaum

Abstract: Despite the fact that beliefs are mental states that cannot be directly observed, humans talk about each others' beliefs on a regular basis, often using rich compositional language to describe what others think and know. What explains this capacity to interpret the hidden epistemic content of other minds? In this paper, we take a step towards an answer by grounding the semantics of belief statements in a Bayesian theory-of-mind: By modeling how humans jointly infer coherent sets of goals, beliefs, and plans that explain an agent's actions, then evaluating statements about the agent's beliefs against these inferences via epistemic logic, our framework provides a conceptual role semantics for belief, explaining the gradedness and compositionality of human belief attributions, as well as their intimate connection with goals and plans. We evaluate this framework by studying how humans attribute goals and beliefs while watching an agent solve a doors-and-keys gridworld puzzle that requires instrumental reasoning about hidden objects. In contrast to pure logical deduction, non-mentalizing baselines, and mentalizing that ignores the role of instrumental plans, our model provides a much better fit to human goal and belief attributions, demonstrating the importance of theory-of-mind for a semantics of belief.

Submitted 15 February, 2024; originally announced February 2024.

Comments: Under Review, 7 pages

arXiv:2402.05383 [pdf, other]

First measurement of the yield of $^8$He isotopes produced in liquid scintillator by cosmic-ray muons at Daya Bay

Authors: Daya Bay Collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, et al. (177 additional authors not shown)

Abstract: Daya Bay presents the first measurement of cosmogenic $^8$He isotope production in liquid scintillator, using an innovative method for identifying cascade decays of $^8$He and its child isotope, $^8$Li. We also measure the production yield of $^9$Li isotopes using well-established methodology. The results, in units of $10^{-8}\,μ^{-1}\,\mathrm{g}^{-1}\,\mathrm{cm}^{2}$, are $0.307\pm0.042$, $0.341\pm0.040$, and $0.546\pm0.076$ for $^8$He, and $6.73\pm0.73$, $6.75\pm0.70$, and $13.74\pm0.82$ for $^9$Li at average muon energies of 63.9 GeV, 64.7 GeV, and 143.0 GeV, respectively. The measured production rate of $^8$He isotopes is more than an order of magnitude lower than any other measurement of cosmogenic isotope production. It replaces the results of previous attempts to determine the ratio of $^8$He to $^9$Li production that yielded a wide range of limits from 0 to 30%. The results provide future liquid-scintillator-based experiments with improved ability to predict cosmogenic backgrounds.

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2401.03676 [pdf, other]

Assessing AI Detectors in Identifying AI-Generated Code: Implications for Education

Authors: Wei Hung Pan, Ming Jie Chok, Jonathan Leong Shan Wong, Yung Xin Shin, Yeong Shian Poon, Zhou Yang, Chun Yong Chong, David Lo, Mei Kuan Lim

Abstract: Educators are increasingly concerned about the usage of Large Language Models (LLMs) such as ChatGPT in programming education, particularly regarding the potential exploitation of imperfections in Artificial Intelligence Generated Content (AIGC) Detectors for academic misconduct. In this paper, we present an empirical study where the LLM is examined for its attempts to bypass detection by AIGC Detectors. This is achieved by generating code in response to a given question using different variants. We collected a dataset comprising 5,069 samples, with each sample consisting of a textual description of a coding problem and its corresponding human-written Python solution codes. These samples were obtained from various sources, including 80 from Quescol, 3,264 from Kaggle, and 1,725 from LeetCode. From the dataset, we created 13 sets of code problem variant prompts, which were used to instruct ChatGPT to generate the outputs. Subsequently, we assessed the performance of five AIGC detectors. Our results demonstrate that existing AIGC Detectors perform poorly in distinguishing between human-written code and AI-generated code.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 11 pages, paper accepted at 46th International Conference on Software Engineering, Software Engineering Education and Training Track (ICSE-SEET 2024)

arXiv:2401.02901 [pdf, other]

Charged-current non-standard neutrino interactions at Daya Bay

Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, et al. (177 additional authors not shown)

Abstract: The full data set of the Daya Bay reactor neutrino experiment is used to probe the effect of the charged current non-standard interactions (CC-NSI) on neutrino oscillation experiments. Two different approaches are applied and constraints on the corresponding CC-NSI parameters are obtained with the neutrino flux taken from the Huber-Mueller model with a $5\%$ uncertainty. For the quantum mechanics-based approach (QM-NSI), the constraints on the CC-NSI parameters $ε_{eα}$ and $ε_{eα}^{s}$ are extracted with and without the assumption that the effects of the new physics are the same in the production and detection processes, respectively. The approach based on the weak effective field theory (WEFT-NSI) deals with four types of CC-NSI represented by the parameters $[\varepsilon_{X}]_{eα}$. For both approaches, the results for the CC-NSI parameters are shown for cases with various fixed values of the CC-NSI and the Dirac CP-violating phases, and when they are allowed to vary freely. We find that constraints on the QM-NSI parameters $ε_{eα}$ and $ε_{eα}^{s}$ from the Daya Bay experiment alone can reach the order $\mathcal{O}(0.01)$ for the former and $\mathcal{O}(0.1)$ for the latter, while for WEFT-NSI parameters $[\varepsilon_{X}]_{eα}$, we obtain $\mathcal{O}(0.1)$ for both cases.

Submitted 19 March, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

Comments: 25 pages, 16 figures, 6 tables; 36 pages, format changed, references added

arXiv:2401.01853 [pdf, other]

An Angular Diameter Measurement of $β$ UMa via Stellar Intensity Interferometry with the VERITAS Observatory

Authors: A. Acharyya, J. P. Aufdenberg, P. Bangale, J. T. Bartkoske, P. Batista, W. Benbow, A. J. Chromey, J. D. Davis, Q. Feng, G. M. Foote, A. Furniss, W. Hanlon, C. E. Hinrichs, J. Holder, W. Jin, P. Kaaret, M. Kertzman, D. Kieda, T. K. Kleiner, N. Korzoun, T. LeBohec, M. A. Lisa, M. Lundy, N. Matthews, C. E McGrath , et al. (22 additional authors not shown)

Abstract: We use the VERITAS imaging air Cherenkov Telescope (IACT) array to obtain the first measured angular diameter of $β$ UMa at visual wavelengths using stellar intensity interferometry (SII) and independently constrain the limb-darkened angular diameter. The age of the Ursa Major moving group has been assessed from the ages of its members, including nuclear member Merak ($β$ UMa), an A1-type subgiant, by comparing effective temperature and luminosity constraints to model stellar evolution tracks. Previous interferometric limb-darkened angular-diameter measurements of $β$ UMa in the near-infrared (CHARA Array, $1.149 \pm 0.014$ mas) and mid-infrared (Keck Nuller, $1.08 \pm 0.07$ mas), together with the measured parallax and bolometric flux, have constrained the effective temperature. This paper presents current VERITAS-SII observation and analysis procedures to derive squared visibilities from correlation functions. We fit the resulting squared visibilities to find a limb-darkened angular diameter of $1.07 \pm 0.04 {\rm (stat)} \pm 0.05$ (sys) mas, using synthetic visibilities from a stellar atmosphere model that provides a good match to the spectrum of $β$ UMa in the optical wave band. The VERITAS-SII limb-darkened angular diameter yields an effective temperature of $9700\pm200\pm 200$ K, consistent with ultraviolet spectrophotometry, and an age of $390\pm 29 \pm 32 $ Myr, using MESA Isochrones and Stellar Tracks (MIST). This age is consistent with $408 \pm 6$ Myr from the CHARA Array angular diameter.

Submitted 3 January, 2024; originally announced January 2024.

arXiv:2312.14845 [pdf, other]

On the Use of Metaphor Translation in Psychiatry

Authors: Lois Wong

Abstract: Providing mental healthcare to individuals with limited English proficiency (LEP) remains a pressing problem within psychiatry. Because the majority of individuals trained in providing psychiatric care are English speakers, the quality of mental healthcare given to LEP patients is significantly lower than that provided for English speakers. The provision of mental healthcare is contingent on communication and understanding between the patient and healthcare provider, much more so than in the realm of physical healthcare, and English speakers are often unable to comprehend figurative language such as metaphors used by LEPs. Hence, Figurative Language Translation is invaluable to providing equitable psychiatric care. Now, metaphor has been shown to be paramount in both identifying individuals struggling with mental problems and helping those individuals understand and communicate their experiences. Therefore, this paper aims to survey the potential of Machine Translation for providing equitable psychiatric healthcare and highlights the need for further research on the transferability of existing machine and metaphor translation research in the domain of psychiatry.

Submitted 22 December, 2023; originally announced December 2023.

arXiv:2312.08566 [pdf, other]

Learning adaptive planning representations with natural language guidance

Authors: Lionel Wong, Jiayuan Mao, Pratyusha Sharma, Zachary S. Siegel, Jiahai Feng, Noa Korneev, Joshua B. Tenenbaum, Jacob Andreas

Abstract: Effective planning in the real world requires not only world knowledge, but the ability to leverage that knowledge to build the right representation of the task at hand. Decades of hierarchical planning techniques have used domain-specific temporal action abstractions to support efficient and accurate planning, almost always relying on human priors and domain knowledge to decompose hard tasks into smaller subproblems appropriate for a goal or set of goals. This paper describes Ada (Action Domain Acquisition), a framework for automatically constructing task-specific planning representations using task-general background knowledge from language models (LMs). Starting with a general-purpose hierarchical planner and a low-level goal-conditioned policy, Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks. On two language-guided interactive planning benchmarks (Mini Minecraft and ALFRED Household Tasks), Ada strongly outperforms other approaches that use LMs for sequential decision-making, offering more accurate plans and better generalization to complex tasks.

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2312.07774

VERITAS contributions to the 38th International Cosmic Ray Conference

Authors: A. Acharyya, C. B. Adams, A. Archer, P. Bangale, J. T. Bartkoske, P. Batista, W. Benbow, J. L. Christiansen, A. J. Chromey, A. Duerr, M. Errando, Q. Feng, G. M. Foote, L. Fortson, A. Furniss, W. Hanlon, O. Hervet, C. E. Hinrichs, J. Hoang, J. Holder, Z. Hughes, T. B. Humensky, W. Jin, M. N. Johnson, M. Kertzman , et al. (39 additional authors not shown)

Abstract: Compilation of papers presented by the VERITAS Collaboration at the 38th International Cosmic Ray Conference (ICRC), held July 26 through August 3, 2023 in Nagoya, Japan.

Submitted 12 December, 2023; originally announced December 2023.

Comments: html page. ICRC 2023, Nagoya, Japan

arXiv:2312.05098 [pdf, other]

Comparison of readout systems for high-rate silicon photo-multiplier applications

Authors: M. L. Wong, M. Kołodziej, K. Briggl, R. Hetzel, G. Korcyl, R. Lalik, A. Malige, A. Magiera, G. Ostrzołek, K. Rusiecka, A. Stahl, V. Urbanevych, M. Wiebusch, A. Wrońska

Abstract: Recent years have shown an increased use of silicon photo-multipliers (SiPM) in experiments as they are of reasonable cost, have relatively low power consumption and are easily available in a variety of form factors allowing for a large number of readout channels. At the same time, experiments are generating data at increasingly high rates requiring the use of more efficient readout systems. In this work, the dead time, efficiency, dynamic range, coincidence time resolution and energy resolution of five different readout systems at various stages of maturity are evaluated to determine the best system for acquiring data from a detector in a high rate experiment. Additional functionalities of the systems are also discussed.

Submitted 8 December, 2023; originally announced December 2023.

Comments: 12 pages, 5 figures

arXiv:2312.04383 [pdf]

Elastic Recoil Imprinted on Free-electron Radiation

Authors: Xihang Shi, Lee Wei Wesley Wong, Sunchao Huang, LiangJie Wong, Ido Kaminer

Abstract: Free-electron radiation phenomena are treated almost exclusively with classical electrodynamics, despite the intrinsic interaction being that of quantum electrodynamics. The lack of quantumness arises from the vast disparity between the electron energy and the much smaller photon energy, creating a small cross-section that makes quantum effects negligible. Here we identify a fundamentally distinct phenomenon of electron radiation that bypasses this energy disparity, and thus displays extremely strong quantum features. This phenomenon arises from free-electron elastic recoil, which can influence fundamental radiation processes in ways thought so far to necessitate inelastic scattering. The underlying reason for the quantum radiation features, which have no counterparts in classical theory, is the entanglement between each elastically recoiled electron and the photons it emitted. We show that this phenomenon is more accessible than all other types of quantum features in free-electron radiation and can be detected in current experimental setups such as electron microscopes. These quantum radiation features could guide the development of compact coherent X-ray sources facilitated by nanophotonics and quantum optics.

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2312.03225 [pdf, other]

Snake Robot with Tactile Perception Navigates on Large-scale Challenging Terrain

Authors: Shuo Jiang, Adarsh Salagame, Alireza Ramezani, Lawson Wong

Abstract: Along with the advancement of robot skin technology, there has been notable progress in the development of snake robots featuring body-surface tactile perception. In this study, we proposed a locomotion control framework for snake robots that integrates tactile perception to augment their adaptability to various terrains. Our approach embraces a hierarchical reinforcement learning (HRL) architecture, wherein the high-level orchestrates global navigation strategies while the low-level uses curriculum learning for local navigation maneuvers. Due to the significant computational demands of collision detection in whole-body tactile sensing, the efficiency of the simulator is severely compromised. Thus a distributed training pattern to mitigate the efficiency reduction was adopted. We evaluated the navigation performance of the snake robot in complex large-scale cave exploration with challenging terrains to exhibit improvements in motion efficiency, evidencing the efficacy of tactile perception in terrain-adaptive locomotion of snake robots.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2312.03223 [pdf, other]

Hierarchical RL-Guided Large-scale Navigation of a Snake Robot

Authors: Shuo Jiang, Adarsh Salagame, Alireza Ramezani, Lawson Wong

Abstract: Classical snake robot control leverages mimicking snake-like gaits tuned for specific environments. However, to operate adaptively in unstructured environments, gait generation must be dynamically scheduled. In this work, we present a four-layer hierarchical control scheme to enable the snake robot to navigate freely in large-scale environments. The proposed model decomposes navigation into global planning, local planning, gait generation, and gait tracking. Using reinforcement learning (RL) and a central pattern generator (CPG), our method learns to navigate in complex mazes within hours and can be directly deployed to arbitrary new environments in a zero-shot fashion. We use the high-fidelity model of Northeastern's slithering robot COBRA to test the effectiveness of the proposed hierarchical control approach.

Submitted 5 December, 2023; originally announced December 2023.

Comments: arXiv admin note: text overlap with arXiv:2311.14878

arXiv:2311.13016 [pdf, other]

doi 10.1109/IGARSS52108.2023.10281551

Image-Based Soil Organic Carbon Remote Sensing from Satellite Images with Fourier Neural Operator and Structural Similarity

Authors: Ken C. L. Wong, Levente Klein, Ademir Ferreira da Silva, Hongzhi Wang, Jitendra Singh, Tanveer Syeda-Mahmood

Abstract: Soil organic carbon (SOC) sequestration is the transfer and storage of atmospheric carbon dioxide in soils, which plays an important role in climate change mitigation. SOC concentration can be improved by proper land use, thus it is beneficial if SOC can be estimated at a regional or global scale. As multispectral satellite data can provide SOC-related information such as vegetation and soil properties at a global scale, estimation of SOC through satellite data has been explored as an alternative to manual soil sampling. Although existing studies show promising results, they are mainly based on pixel-based approaches with traditional machine learning methods, and convolutional neural networks (CNNs) are uncommon. To study the use of CNNs on SOC remote sensing, here we propose the FNO-DenseNet based on the Fourier neural operator (FNO). By combining the advantages of the FNO and DenseNet, the FNO-DenseNet outperformed the FNO in our experiments with hundreds of times fewer parameters. The FNO-DenseNet also outperformed a pixel-based random forest by 18% in the mean absolute percentage error.

Submitted 21 November, 2023; originally announced November 2023.

Comments: This paper was accepted by the 2023 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2023)

arXiv:2311.10158 [pdf, other]

The Asteroseismological Richness of RCB and dLHdC Stars

Authors: Tin Long Sunny Wong, Lars Bildsten

Abstract: RCB stars are $L\approx10^4\,L_{\odot}$ solar-mass objects that can exhibit large periods of extinction from dust ejection episodes. Many exhibit semiregular pulsations in the range of $30-50$ days with semi-amplitudes of $0.05-0.3$ magnitude. Space-based photometry has discovered that solar-like oscillations are ubiquitous in hydrogen-dominated stars that have substantial outer convective envelopes, so we explore the hypothesis that the pulsations in RCB stars and the closely related dustless hydrogen-deficient carbon (dLHdC) stars, which have large convective outer envelopes of nearly pure helium, have a similar origin. Through stellar modeling and pulsation calculations, we find that the observed periods and amplitudes of these pulsations follow the well-measured phenomenology of their H-rich brethren. In particular, we show that the observed modes are likely of angular orders $l=0,1$ and $2$ and predominantly of an acoustic nature (i.e. $p$-modes with low radial order). The modes with largest amplitude are near the acoustic cut-off frequency appropriately rescaled to the helium-dominated envelope, and the observed amplitudes are consistent with that seen in high luminosity ($L>10^3\,L_{\odot}$) H-rich giants. We also find that for $T_{\mathrm{eff}}\gtrsim5400\,\mathrm{K}$, an HdC stellar model exhibits a radiative layer between two outer convective zones, creating a $g$-mode cavity that supports much longer period ($\approx 100$ days) oscillations. Our initial work was focused primarily on the adiabatic modes, but we expect that subsequent space-based observations of these targets (e.g. with TESS or Plato) are likely to lead to a larger set of detected frequencies that would allow for a deeper study of the interiors of these rare stars.

Submitted 16 November, 2023; originally announced November 2023.

Comments: 12 pages, 4 figures; Accepted to ApJ

arXiv:2311.05261 [pdf]

RAGLog: Log Anomaly Detection using Retrieval Augmented Generation

Authors: Jonathan Pan, Swee Liang Wong, Yidi Yuan

Abstract: The ability to detect log anomalies from system logs is a vital activity needed to ensure cyber resiliency of systems. It is applied for fault identification or to facilitate cyber investigation and digital forensics. However, as logs belonging to different systems and components differ significantly, performing such analysis manually is challenging given the volume, variety and velocity of logs. This is further complicated by the lack or unavailability of anomalous log entries to develop trained machine learning or artificial intelligence models for such purposes. In this research work, we explore the use of a Retrieval Augmented Large Language Model that leverages a vector database to detect anomalies from logs. We used a Question and Answer configuration pipeline. To the best of our knowledge, our experiment, which we called RAGLog, is a novel one and the experimental results show much promise.

Submitted 9 November, 2023; originally announced November 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2203.10960

arXiv:2310.19791 [pdf, other]

LILO: Learning Interpretable Libraries by Compressing and Documenting Code

Authors: Gabriel Grand, Lionel Wong, Maddy Bowers, Theo X. Olausson, Muxin Liu, Joshua B. Tenenbaum, Jacob Andreas

Abstract: While large language models (LLMs) now excel at code generation, a key aspect of software development is the art of refactoring: consolidating code into libraries of reusable and readable programs. In this paper, we introduce LILO, a neurosymbolic framework that iteratively synthesizes, compresses, and documents code to build libraries tailored to particular problem domains. LILO combines LLM-guided program synthesis with recent algorithmic advances in automated refactoring from Stitch: a symbolic compression system that efficiently identifies optimal lambda abstractions across large code corpora. To make these abstractions interpretable, we introduce an auto-documentation (AutoDoc) procedure that infers natural language names and docstrings based on contextual examples of usage. In addition to improving human readability, we find that AutoDoc boosts performance by helping LILO's synthesizer to interpret and deploy learned abstractions. We evaluate LILO on three inductive program synthesis benchmarks for string editing, scene reasoning, and graphics composition. Compared to existing neural and symbolic methods - including the state-of-the-art library learning algorithm DreamCoder - LILO solves more complex tasks and learns richer libraries that are grounded in linguistic knowledge.

Submitted 15 March, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: ICLR 2024 camera-ready

arXiv:2310.19589 [pdf, other]

Modeling Dynamics over Meshes with Gauge Equivariant Nonlinear Message Passing

Authors: Jung Yeon Park, Lawson L. S. Wong, Robin Walters

Abstract: Data over non-Euclidean manifolds, often discretized as surface meshes, naturally arise in computer graphics and biological and physical systems. In particular, solutions to partial differential equations (PDEs) over manifolds depend critically on the underlying geometry. While graph neural networks have been successfully applied to PDEs, they do not incorporate surface geometry and do not consider local gauge symmetries of the manifold. Alternatively, recent works on gauge equivariant convolutional and attentional architectures on meshes leverage the underlying geometry but underperform in modeling surface PDEs with complex nonlinear dynamics. To address these issues, we introduce a new gauge equivariant architecture using nonlinear message passing. Our novel architecture achieves higher performance than either convolutional or attentional networks on domains with highly complex and nonlinear dynamics. However, similar to the non-mesh case, design trade-offs favor convolutional, attentional, or message passing networks for different tasks; we investigate in which circumstances our message passing method provides the most benefit.

Submitted 2 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: Accepted to NeurIPS 2023

arXiv:2310.10822 [pdf, other]

Vision and Language Navigation in the Real World via Online Visual Language Mapping

Authors: Chengguang Xu, Hieu T. Nguyen, Christopher Amato, Lawson L. S. Wong

Abstract: Navigating in unseen environments is crucial for mobile robots. Enhancing them with the ability to follow instructions in natural language will further improve navigation efficiency in unseen cases. However, state-of-the-art (SOTA) vision-and-language navigation (VLN) methods are mainly evaluated in simulation, neglecting the complex and noisy real world. Directly transferring SOTA navigation policies trained in simulation to the real world is challenging due to the visual domain gap and the absence of prior knowledge about unseen environments. In this work, we propose a novel navigation framework to address the VLN task in the real world. Utilizing the powerful foundation models, the proposed framework includes four key components: (1) an LLMs-based instruction parser that converts the language instruction into a sequence of pre-defined macro-action descriptions, (2) an online visual-language mapper that builds a real-time visual-language map to maintain a spatial and semantic understanding of the unseen environment, (3) a language indexing-based localizer that grounds each macro-action description into a waypoint location on the map, and (4) a DD-PPO-based local controller that predicts the action. We evaluate the proposed pipeline on an Interbotix LoCoBot WX250 in an unseen lab environment. Without any fine-tuning, our pipeline significantly outperforms the SOTA VLN baseline in the real world.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.09848 [pdf]

Enhancing Stance Classification with Quantified Moral Foundations

Authors: Hong Zhang, Prasanta Bhattacharya, Wei Gao, Liang Ze Wong, Brandon Siyuan Loh, Joseph J. P. Simons, Jisun An

Abstract: This study enhances stance detection on social media by incorporating deeper psychological attributes, specifically individuals' moral foundations. These theoretically-derived dimensions aim to provide a comprehensive profile of an individual's moral concerns which, in recent work, has been linked to behaviour in a range of domains, including society, politics, health, and the environment. In this paper, we investigate how moral foundation dimensions can contribute to predicting an individual's stance on a given target. Specifically we incorporate moral foundation features extracted from text, along with message semantic features, to classify stances at both message- and user-levels across a range of targets and models. Our preliminary results suggest that encoding moral foundations can enhance the performance of stance detection tasks and help illuminate the associations between specific moral foundations and online stances on target topics. The results highlight the importance of considering deeper psychological attributes in stance analysis and underscore the role of moral foundations in guiding online social behavior.

Submitted 15 October, 2023; originally announced October 2023.

Comments: 11 pages, 5 figures

arXiv:2310.04466 [pdf, other]

doi 10.1007/978-3-031-43901-8_35

HartleyMHA: Self-Attention in Frequency Domain for Resolution-Robust and Parameter-Efficient 3D Image Segmentation

Authors: Ken C. L. Wong, Hongzhi Wang, Tanveer Syeda-Mahmood

Abstract: With the introduction of Transformers, different attention-based models have been proposed for image segmentation with promising results. Although self-attention allows capturing of long-range dependencies, it suffers from a quadratic complexity in the image size especially in 3D. To avoid the out-of-memory error during training, input size reduction is usually required for 3D segmentation, but the accuracy can be suboptimal when the trained models are applied on the original image size. To address this limitation, inspired by the Fourier neural operator (FNO), we introduce the HartleyMHA model which is robust to training image resolution with efficient self-attention. FNO is a deep learning framework for learning mappings between functions in partial differential equations, which has the appealing properties of zero-shot super-resolution and global receptive field. We modify the FNO by using the Hartley transform with shared parameters to reduce the model size by orders of magnitude, and this allows us to further apply self-attention in the frequency domain for more expressive high-order feature combination with improved efficiency. When tested on the BraTS'19 dataset, it achieved superior robustness to training image resolution than other tested models with less than 1% of their model parameters.

Submitted 5 October, 2023; originally announced October 2023.

Comments: This paper was accepted by the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2023). arXiv admin note: text overlap with arXiv:2310.03872

arXiv:2310.03884 [pdf, other]

Information Geometry for the Working Information Theorist

Authors: Kumar Vijay Mishra, M. Ashok Kumar, Ting-Kam Leonard Wong

Abstract: Information geometry is a study of statistical manifolds, that is, spaces of probability distributions from a geometric perspective. Its classical information-theoretic applications relate to statistical concepts such as Fisher information, sufficient statistics, and efficient estimators. Today, information geometry has emerged as an interdisciplinary field that finds applications in diverse areas such as radar sensing, array signal processing, quantum physics, deep learning, and optimal transport. This article presents an overview of essential information geometry to initiate an information theorist, who may be unfamiliar with this exciting area of research. We explain the concepts of divergences on statistical manifolds, generalized notions of distances, orthogonality, and geodesics, thereby paving the way for concrete applications and novel theoretical investigations. We also highlight some recent information-geometric developments, which are of interest to the broader information theory community.

Submitted 5 October, 2023; originally announced October 2023.

Comments: 12 pages, 3 figures, 1 table

arXiv:2310.03872 [pdf, other]

doi 10.1109/ISBI53787.2023.10230586

FNOSeg3D: Resolution-Robust 3D Image Segmentation with Fourier Neural Operator

Authors: Ken C. L. Wong, Hongzhi Wang, Tanveer Syeda-Mahmood

Abstract: Due to the computational complexity of 3D medical image segmentation, training with downsampled images is a common remedy for out-of-memory errors in deep learning. Nevertheless, as standard spatial convolution is sensitive to variations in image resolution, the accuracy of a convolutional neural network trained with downsampled images can be suboptimal when applied on the original resolution. To address this limitation, we introduce FNOSeg3D, a 3D segmentation model robust to training image resolution based on the Fourier neural operator (FNO). The FNO is a deep learning framework for learning mappings between functions in partial differential equations, which has the appealing properties of zero-shot super-resolution and global receptive field. We improve the FNO by reducing its parameter requirement and enhancing its learning capability through residual connections and deep supervision, and these result in our FNOSeg3D model which is parameter efficient and resolution robust. When tested on the BraTS'19 dataset, it achieved superior robustness to training image resolution than other tested models with less than 1% of their model parameters.

Submitted 5 October, 2023; originally announced October 2023.

Comments: This paper was accepted by the IEEE International Symposium on Biomedical Imaging (ISBI) 2023

arXiv:2309.13043 [pdf, other]

doi 10.1109/LRA.2024.3360011

E(2)-Equivariant Graph Planning for Navigation

Authors: Linfeng Zhao, Hongyu Li, Taskin Padir, Huaizu Jiang, Lawson L. S. Wong

Abstract: Learning for robot navigation presents a critical and challenging task. The scarcity and costliness of real-world datasets necessitate efficient learning approaches. In this letter, we exploit Euclidean symmetry in planning for 2D navigation, which originates from Euclidean transformations between reference frames and enables parameter sharing. To address the challenges of unstructured environments, we formulate the navigation problem as planning on a geometric graph and develop an equivariant message passing network to perform value iteration. Furthermore, to handle multi-camera input, we propose a learnable equivariant layer to lift features to a desired space. We conduct comprehensive evaluations across five diverse tasks encompassing structured and unstructured environments, along with maps of known and unknown, given point goals or semantic goals. Our experiments confirm the substantial benefits on training efficiency, stability, and generalization. More details can be found at the project website: https://lhy.xyz/e2-planning/.

Submitted 27 January, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

Comments: Accepted by RA-L

arXiv:2309.09126 [pdf, other]

How much can ChatGPT really help Computational Biologists in Programming?

Authors: Chowdhury Rafeed Rahman, Limsoon Wong

Abstract: ChatGPT, a recently developed product by OpenAI, is successfully leaving its mark as a multi-purpose natural language based chatbot. In this paper, we are more interested in analyzing its potential in the field of computational biology. A major share of work done by computational biologists these days involves coding up bioinformatics algorithms, analyzing data, creating pipelining scripts and even machine learning modeling and feature extraction. This paper focuses on the potential influence (both positive and negative) of ChatGPT in the mentioned aspects with illustrative examples from different perspectives. Compared to other fields of computer science, computational biology has (1) fewer coding resources, (2) more sensitivity and bias issues (it deals with medical data) and (3) more need for coding assistance (people from diverse backgrounds come to this field). Keeping such issues in mind, we cover use cases such as code writing, reviewing, debugging, converting, refactoring and pipelining using ChatGPT from the perspective of computational biologists in this paper.

Submitted 4 December, 2023; v1 submitted 16 September, 2023; originally announced September 2023.

arXiv:2309.06320 [pdf, other]

doi 10.1002/adma.202309410

The Nanoplasmonic Purcell Effect in Ultrafast and High-Light-Yield Perovskite Scintillators

Authors: Wenzheng Ye, Zhihua Yong, Michael Go, Dominik Kowal, Francesco Maddalena, Liliana Tjahjana, Wang Hong, Arramel Arramel, Christophe Dujardin, Muhammad Danang Birowosuto, Liang Jie Wong

Abstract: The development of X-ray scintillators with ultrahigh light yields and ultrafast response times is a long sought-after goal. In this work, we theoretically predict and experimentally demonstrate a fundamental mechanism that pushes the frontiers of ultrafast X-ray scintillator performance: the use of nanoscale-confined surface plasmon polariton modes to tailor the scintillator response time via the Purcell effect. By incorporating nanoplasmonic materials in scintillator devices, this work predicts over 10-fold enhancement in decay rate and 38% reduction in time resolution even with only a simple planar design. We experimentally demonstrate the nanoplasmonic Purcell effect using perovskite scintillators, enhancing the light yield by over 120% to 88 $\pm$ 11 ph/keV, and the decay rate by over 60% to 2.0 $\pm$ 0.2 ns for the average decay time, and 0.7 $\pm$ 0.1 ns for the ultrafast decay component, in good agreement with the predictions of our theoretical framework. We perform proof-of-concept X-ray imaging experiments using nanoplasmonic scintillators, demonstrating 182% enhancement in the modulation transfer function at 4 line pairs per millimeter spatial frequency. This work highlights the enormous potential of nanoplasmonics in optimizing ultrafast scintillator devices for applications including time-of-flight X-ray imaging and photon-counting computed tomography.

Submitted 12 September, 2023; originally announced September 2023.

Comments: 34 pages, 3 figures

arXiv:2309.03097 [pdf, other]

An Algorithm for Modelling Escalator Fixed Loss Energy for PHM and sustainable energy usage

Authors: Xuwen Hu, Jiaqi Qiu, Yu Lin, Inez Maria Zwetsloot, William Ka Fai Lee, Edmond Yin San Yeung, Colman Yiu Wah Yeung, Chris Chun Long Wong

Abstract: Prognostic Health Management (PHM) is designed to assess and monitor the health status of systems, anticipate the onset of potential failure, and prevent unplanned downtime. In recent decades, collecting massive amounts of real-time sensor data enabled condition monitoring (CM) and consequently, detection of abnormalities to support maintenance decision-making. Additionally, the utilization of PHM techniques can support energy sustainability efforts by optimizing energy usage and identifying opportunities for energy-saving measures. Escalators are efficient machines for transporting people and goods, and measuring energy consumption in time can facilitate PHM of escalators. Fixed loss energy, or no-load energy, of escalators denotes the energy consumption by an unloaded escalator. Fixed loss energy varies over time indicating varying operating conditions. In this paper, we propose to use escalators' fixed loss energy for PHM. We propose an approach to compute daily fixed loss energy based on energy consumption sensor data. The proposed approach is validated using a set of experimental data. The advantages and disadvantages of each approach are also presented, and recommendations are given. Finally, to illustrate PHM, we set up an EWMA chart for monitoring the fixed loss over time and demonstrate the potential in reducing energy costs associated with escalator operation.

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2308.12011 [pdf, other]

doi 10.1103/PhysRevB.108.085128

Fe substitution in URu$_2$Si$_2$: singlet magnetism in an extended Doniach phase diagram

Authors: Andrea Marino, Denise S. Christovam, Chun-Fu Chang, Johannes Falke, Chang-Yang Kuo, Chi-Nan Wu, Martin Sundermann, Andrea Amorese, Hlynur Gretarsson, Eric Lee Wong, Camilla M. Moir, Yuang Deng, M. Brian Maple, Peter Thalmeier, Liu Hao Tjeng, Andrea Severing

Abstract: The application of pressure as well as the successive substitution of Ru with Fe in the hidden order (HO) compound URu$_2$Si$_2$ leads to the formation of the large moment antiferromagnetic phase (LMAFM). Here we have investigated the substitution series URu$_{2-x}$Fe$_x$Si$_2$ from $x$\,=\,0.0 to 2.0 by U\,4$f$ core-level photoelectron spectroscopy and have observed non-monotonic changes in the spectra. The initial increase and subsequent decrease of the spectral weight of the 4$f$ core level satellite with increasing $x$ stands for a non-monotonic 5$f$ filling across the substitution series. The competition of chemical pressure and increase of the density of states at the Fermi energy, both due to substitution of Ru with Fe, can explain such a behavior. An extended Doniach phase diagram including the $x$ dependence of the density of states is proposed. Also in URu$_{2-x}$Fe$_x$Si$_2$ the ground state is a singlet or quasi-doublet state consisting of two singlets. Hence, the formation of magnetic order in the URu$_{2-x}$Fe$_x$Si$_2$ substitution series must be explained within a singlet magnetism model.

Submitted 23 August, 2023; originally announced August 2023.

Comments: 10 pages, 7 figures

Journal ref: Phys. Rev. B 108, 085128 (2023)

arXiv:2307.08226 [pdf, other]

Can Euclidean Symmetry be Leveraged in Reinforcement Learning and Planning?

Authors: Linfeng Zhao, Owen Howell, Jung Yeon Park, Xupeng Zhu, Robin Walters, Lawson L. S. Wong

Abstract: In robotic tasks, changes in reference frames typically do not influence the underlying physical properties of the system, which has been known as invariance of physical laws. These changes, which preserve distance, encompass isometric transformations such as translations, rotations, and reflections, collectively known as the Euclidean group. In this work, we delve into the design of improved learning algorithms for reinforcement learning and planning tasks that possess Euclidean group symmetry. We put forth a theory that unifies prior work on discrete and continuous symmetry in reinforcement learning, planning, and optimal control. On the algorithm side, we further extend the 2D path planning with value-based planning to continuous MDPs and propose a pipeline for constructing equivariant sampling-based planning algorithms. Our work is substantiated with empirical evidence and illustrated through examples that explain the benefits of equivariance to Euclidean symmetry in tackling natural control problems.

Submitted 17 July, 2023; originally announced July 2023.

Comments: Preprint. Website: http://lfzhao.com/SymCtrl

arXiv:2306.17819 [pdf, other]

Multiwavelength Observations of the Blazar PKS 0735+178 in Spatial and Temporal Coincidence with an Astrophysical Neutrino Candidate IceCube-211208A

Authors: A. Acharyya, C. B. Adams, A. Archer, P. Bangale, J. T. Bartkoske, P. Batista, W. Benbow, A. Brill, J. H. Buckley, J. L. Christiansen, A. J. Chromey, M. Errando, A. Falcone, Q. Feng, G. M. Foote, L. Fortson, A. Furniss, G. Gallagher, W. Hanlon, D. Hanna, O. Hervet, C. E. Hinrichs, J. Hoang, J. Holder, T. B. Humensky , et al. (185 additional authors not shown)

Abstract: We report on multiwavelength target-of-opportunity observations of the blazar PKS 0735+178, located 2.2$^\circ$ away from the best-fit position of the IceCube neutrino event IceCube-211208A detected on December 8, 2021. The source was in a high-flux state in the optical, ultraviolet, X-ray, and GeV gamma-ray bands around the time of the neutrino event, exhibiting daily variability in the soft X-ray flux. The X-ray data from Swift-XRT and NuSTAR characterize the transition between the low-energy and high-energy components of the broadband spectral energy distribution (SED), and the gamma-ray data from Fermi-LAT, VERITAS, and H.E.S.S. require a spectral cut-off near 100 GeV. Both X-ray and gamma-ray measurements provide strong constraints on the leptonic and hadronic models. We analytically explore a synchrotron self-Compton model, an external Compton model, and a lepto-hadronic model. Models that are entirely based on internal photon fields face serious difficulties in matching the observed SED. The existence of an external photon field in the source would instead explain the observed gamma-ray spectral cut-off in both leptonic and lepto-hadronic models and allow a proton jet power that marginally agrees with the Eddington limit in the lepto-hadronic model. We show a numerical lepto-hadronic model with external target photons that reproduces the observed SED and is reasonably consistent with the neutrino event despite requiring a high jet power.

Submitted 30 June, 2023; originally announced June 2023.

Comments: 21 pages, 3 figures, accepted by ApJ

arXiv:2306.14325 [pdf, other]

The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling Probabilistic Social Inferences from Linguistic Inputs

Authors: Lance Ying, Katherine M. Collins, Megan Wei, Cedegao E. Zhang, Tan Zhi-Xuan, Adrian Weller, Joshua B. Tenenbaum, Lionel Wong

Abstract: Human beings are social creatures. We routinely reason about other agents, and a crucial component of this social reasoning is inferring people's goals as we learn about their actions. In many settings, we can perform intuitive but reliable goal inference from language descriptions of agents, actions, and the background environments. In this paper, we study this process of language driving and influencing social reasoning in a probabilistic goal inference domain. We propose a neuro-symbolic model that carries out goal inference from linguistic inputs of agent scenarios. The "neuro" part is a large language model (LLM) that translates language descriptions to code representations, and the "symbolic" part is a Bayesian inverse planning engine. To test our model, we design and run a human experiment on a linguistic goal inference task. Our model closely matches human response patterns and better predicts human judgements than using an LLM alone.

Submitted 27 June, 2023; v1 submitted 25 June, 2023; originally announced June 2023.

Comments: To appear at ICML Workshop on Theory of Mind in Communicating Agents

arXiv:2306.12672 [pdf, other]

From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought

Authors: Lionel Wong, Gabriel Grand, Alexander K. Lew, Noah D. Goodman, Vikash K. Mansinghka, Jacob Andreas, Joshua B. Tenenbaum

Abstract: How does language inform our downstream thinking? In particular, how do humans make meaning from language--and how can we leverage a theory of linguistic meaning to build machines that think in more human-like ways? In this paper, we propose rational meaning construction, a computational framework for language-informed thinking that combines neural language models with probabilistic models for rational inference. We frame linguistic meaning as a context-sensitive mapping from natural language into a probabilistic language of thought (PLoT)--a general-purpose symbolic substrate for generative world modeling. Our architecture integrates two computational tools that have not previously come together: we model thinking with probabilistic programs, an expressive representation for commonsense reasoning; and we model meaning construction with large language models (LLMs), which support broad-coverage translation from natural language utterances to code expressions in a probabilistic programming language. We illustrate our framework through examples covering four core domains from cognitive science: probabilistic reasoning, logical and relational reasoning, visual and physical reasoning, and social reasoning. In each, we show that LLMs can generate context-sensitive translations that capture pragmatically-appropriate linguistic meanings, while Bayesian inference with the generated programs supports coherent and robust commonsense reasoning.
We extend our framework to integrate cognitively-motivated symbolic modules (physics simulators, graphics engines, and planning algorithms) to provide a unified commonsense thinking interface from language. Finally, we explore how language can drive the construction of world models themselves. We hope this work will provide a roadmap towards cognitive models and AI systems that synthesize the insights of both modern and classical computational perspectives.

Submitted 23 June, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

arXiv:2306.12392 [pdf, other]

One-shot Imitation Learning via Interaction Warping

Authors: Ondrej Biza, Skye Thompson, Kishore Reddy Pagidi, Abhinav Kumar, Elise van der Pol, Robin Walters, Thomas Kipf, Jan-Willem van de Meent, Lawson L. S. Wong, Robert Platt

Abstract: Imitation learning of robot policies from few demonstrations is crucial in open-ended applications. We propose a new method, Interaction Warping, for learning SE(3) robotic manipulation policies from a single demonstration. We infer the 3D mesh of each object in the environment using shape warping, a technique for aligning point clouds across object instances. Then, we represent manipulation actions as keypoints on objects, which can be warped with the shape of the object. We show successful one-shot imitation learning on three simulated and real-world object re-arrangement tasks. We also demonstrate the ability of our method to predict object meshes and robot grasps in the wild.

Submitted 4 November, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

Comments: CoRL 2023

arXiv:2306.05436 [pdf, other]

Remaining Useful Life Modelling with an Escalator Health Condition Analytic System

Authors: Inez M. Zwetsloot, Yu Lin, Jiaqi Qiu, Lishuai Li, William Ka Fai Lee, Edmond Yin San Yeung, Colman Yiu Wah Yeung, Chris Chun Long Wong

Abstract: The refurbishment of an escalator is usually linked with its design life as recommended by the manufacturer. However, the actual useful life of an escalator should be determined by its operating condition which is affected by the runtime, workload, maintenance quality, vibration, etc., rather than age only. The objective of this project is to develop a comprehensive health condition analytic system for escalators to support refurbishment decisions. The analytic system consists of four parts: 1) online data gathering and processing; 2) a dashboard for condition monitoring; 3) a health index model; and 4) remaining useful life prediction. The results can be used for a) predicting the remaining useful life of the escalators, in order to support asset replacement planning and b) monitoring the real-time condition of escalators; including alerts when vibration exceeds the threshold and signal diagnosis, giving an indication of possible root cause (components) of the alert signal.

Submitted 7 June, 2023; originally announced June 2023.

Comments: 14 pages, 12 figures, 7 tables

arXiv:2306.02217 [pdf, ps, other]

Diagonal Lemma for Presheaves on Eilenberg-Zilber Categories

Authors: Daniel Carranza, Chris Kapulkin, Liang Ze Wong

Abstract: The diagonal lemma asserts that if a map of bisimplicial sets is a levelwise weak equivalence in the Kan-Quillen model structure, then it induces a weak equivalence of the diagonal simplicial sets. In this short note, we observe that the standard proof of this fact works for an arbitrary Eilenberg-Zilber category in place of the simplex category.

Submitted 3 June, 2023; originally announced June 2023.

Comments: short note; 9 pages; comments welcome

MSC Class: 18N40; 55U35 (primary); 18N45; 18N50 (secondary)

arXiv:2306.01694 [pdf, other]

Evaluating Language Models for Mathematics through Interactions

Authors: Katherine M. Collins, Albert Q. Jiang, Simon Frieder, Lionel Wong, Miri Zilka, Umang Bhatt, Thomas Lukasiewicz, Yuhuai Wu, Joshua B. Tenenbaum, William Hart, Timothy Gowers, Wenda Li, Adrian Weller, Mateja Jamnik

Abstract: There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs, and is insufficient for making an informed decision about which LLMs and under which assistive settings can they be sensibly used. Static assessment fails to account for the essential interactive element in LLM deployment, and therefore limits how we understand language model capabilities. We introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics, with a mixed cohort of participants from undergraduate students to professors of mathematics. We release the resulting interaction and rating dataset, MathConverse. By analysing MathConverse, we derive a taxonomy of human behaviours and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness in LLM generations, amongst other findings. Further, we garner a more granular understanding of GPT-4 mathematical problem-solving through a series of case studies, contributed by expert mathematicians.
We conclude with actionable takeaways for ML practitioners and mathematicians: models that communicate uncertainty, respond well to user corrections, and are more interpretable and concise may constitute better assistants. Interactive evaluation is a promising way to navigate the capability of these models; humans should be aware of language models' algebraic fallibility and discern where they are appropriate to use.

Submitted 5 November, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

arXiv:2305.18540 [pdf, other]

Gravitational waves from binary black holes in a self-interacting scalar dark matter cloud

Authors: Alexis Boudon, Philippe Brax, Patrick Valageas, Leong Khim Wong

Abstract: We investigate the imprints of accretion and dynamical friction on the gravitational-wave signals emitted by binary black holes embedded in a scalar dark matter cloud. As a key feature in this work, we focus on scalar fields with a repulsive self-interaction that balances against the self-gravity of the cloud. To a first approximation, the phase of the gravitational-wave signal receives extra correction terms at $-3$PN, $-4$PN and $-5.5$PN orders, relative to the prediction of vacuum general relativity, due to cloud gravity, accretion and dynamical friction. Future observations by LISA and B-DECIGO have the potential to detect these effects for a large range of scalar masses~$m_\mathrm{DM}$ and self-interaction couplings~$λ_4$. This would correspond to scenarios with dark matter clouds smaller than $0.1$ pc, which would be difficult to detect by other probes.

Submitted 18 February, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

Comments: 20 pages, 6 figures, 5 tables

Showing 1–50 of 367 results for author: Wong, L