Finding dusty AGNs from the JWST CEERS survey with mid-infrared photometry

Authors: Tom C. -C. Chien, Chih-Teng Ling, Tomotsugu Goto, Cossas K. -W. Wu, Seong ** Kim, Tetsuya Hashimoto, Yu-Wei Lin, Ece Kilerci, Simon C. -C. Ho, Po-Ya Wang, Bjorn Jasper R. Raquel

Abstract: The nature of the interaction between active galactic nuclei (AGNs) and their host galaxies remains an unsolved question. Therefore, conducting an AGN census is valuable to AGN research. Nevertheless, a significant fraction of AGNs are obscured by their environment, which blocks UV and optical emissions due to the dusty torus surrounding the central supermassive black hole (SMBH). To overcome this challenge, mid-infrared (IR) surveys have emerged as a valuable tool for identifying obscured AGNs, as the obscured light is re-emitted in this range. With its high sensitivity, the James Webb Space Telescope (JWST) uncovered more fainter objects than previous telescopes. By applying the SED fitting, this work investigates AGN candidates in JWST Cosmic Evolution Early Release Science (CEERS) fields. We identified 42 candidates, 30 of them are classified as composites ($0.2\leq f_{\rm AGN, IR}< 0.5$), and 12 of them are AGNs ($f_{\rm AGN, IR}\geq 0.5$). We report the AGN luminosity contributions and AGN number fractions as a function of redshift and total infrared luminosity, showing that previously reported increasing relations are not apparent in our sample due to the sample size. We also extend the previous results on ultra-luminous infrared galaxies (ULIRGs, $L_{\rm TIR}\geq 10^{12} L_{\odot}$) to less luminous AGNs, highlighting the power of JWST.

Submitted 21 June, 2024; originally announced June 2024.

Comments: 15 pages, 20 figures, 4 tables. Accepted for publication in MNRAS. The 3 min summary: https://www.youtube.com/watch?v=mWUebbgUOh8

arXiv:2406.10310 [pdf, other]

TEG-DB: A Comprehensive Dataset and Benchmark of Textual-Edge Graphs

Authors: Zhuofeng Li, Zixing Gou, Xiangnan Zhang, Zhongyuan Liu, Sirui Li, Yuntong Hu, Chen Ling, Zheng Zhang, Liang Zhao

Abstract: Text-Attributed Graphs (TAGs) augment graph structures with natural language descriptions, facilitating detailed depictions of data and their interconnections across various real-world settings. However, existing TAG datasets predominantly feature textual information only at the nodes, with edges typically represented by mere binary or categorical attributes. This lack of rich textual edge annotations significantly limits the exploration of contextual relationships between entities, hindering deeper insights into graph-structured data. To address this gap, we introduce Textual-Edge Graphs Datasets and Benchmark (TEG-DB), a comprehensive and diverse collection of benchmark textual-edge datasets featuring rich textual descriptions on nodes and edges. The TEG-DB datasets are large-scale and encompass a wide range of domains, from citation networks to social networks. In addition, we conduct extensive benchmark experiments on TEG-DB to assess the extent to which current techniques, including pre-trained language models, graph neural networks, and their combinations, can utilize textual node and edge information. Our goal is to elicit advancements in textual-edge graph research, specifically in developing methodologies that exploit rich textual node and edge descriptions to enhance graph analysis and provide deeper insights into complex real-world networks. The entire TEG-DB project is publicly accessible as an open-source repository on Github, accessible at https://github.com/Zhuofeng-Li/TEG-Benchmark.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.08231 [pdf, other]

Using Deep Convolutional Neural Networks to Detect Rendered Glitches in Video Games

Authors: Carlos Garcia Ling, Konrad Tollmar, Linus Gisslen

Abstract: In this paper, we present a method using Deep Convolutional Neural Networks (DCNNs) to detect common glitches in video games. The problem setting consists of an image (800x800 RGB) as input to be classified into one of five defined classes, normal image, or one of four different kinds of glitches (stretched, low resolution, missing and placeholder textures). Using a supervised approach, we train a ShuffleNetV2 using generated data. This work focuses on detecting texture graphical anomalies achieving arguably good performance with an accuracy of 86.8%, detecting 88% of the glitches with a false positive rate of 8.7%, and with the models being able to generalize and detect glitches even in unseen objects. We apply a confidence measure as well to tackle the issue with false positives as well as an effective way of aggregating images to achieve better detection in production. The main use of this work is the partial automatization of graphical testing in the final stages of video game development.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 8 pages, 6 figures, AAIDE conference

arXiv:2405.20790 [pdf, other]

Intersectional Unfairness Discovery

Authors: Gezheng Xu, Qi Chen, Charles Ling, Boyu Wang, Changjian Shui

Abstract: AI systems have been shown to produce unfair results for certain subgroups of population, highlighting the need to understand bias on certain sensitive attributes. Current research often falls short, primarily focusing on the subgroups characterized by a single sensitive attribute, while neglecting the nature of intersectional fairness of multiple sensitive attributes. This paper focuses on its one fundamental aspect by discovering diverse high-bias subgroups under intersectional sensitive attributes. Specifically, we propose a Bias-Guided Generative Network (BGGN). By treating each bias value as a reward, BGGN efficiently generates high-bias intersectional sensitive attributes. Experiments on real-world text and image datasets demonstrate a diverse and efficient discovery of BGGN. To further evaluate the generated unseen but possible unfair intersectional sensitive attributes, we formulate them as prompts and use modern generative AI to produce new texts and images. The results of frequently generating biased data provides new insights of discovering potential unfairness in popular modern generative AI systems. Warning: This paper contains generative examples that are offensive in nature.

Submitted 6 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

Comments: ICML-2024 camera-ready

arXiv:2405.18891 [pdf]

Inverse Design of Promising Alloys for Electrocatalytic CO$_2$ Reduction via Generative Graph Neural Networks Combined with Bird Swarm Algorithm

Authors: Zhilong Song, Linfeng Fan, Shuaihua Lu, Qionghua Zhou, Chongyi Ling, **lan Wang

Abstract: Directly generating material structures with optimal properties is a long-standing goal in material design. One of the fundamental challenges lies in how to overcome the limitation of traditional generative models to efficiently explore the global chemical space rather than a small localized space. Herein, we develop a framework named MAGECS to address this dilemma, by integrating the bird swarm algorithm and supervised graph neural network to effectively navigate the generative model in the immense chemical space towards materials with target properties. As a demonstration, MAGECS is applied to design compelling alloy electrocatalysts for CO$_2$ reduction reaction (CO$_2$RR) and works extremely well. Specifically, the chemical space of CO$_2$RR is effectively explored, where over 250,000 promising structures with high activity have been generated and notably, the proportion of desired structures is 2.5-fold increased. Moreover, five predicted alloys, i.e., CuAl, AlPd, Sn$_2$Pd$_5$, Sn$_9$Pd$_7$, and CuAlSe$_2$ are successfully synthesized and characterized experimentally, two of which exhibit about 90% Faraday efficiency of CO$_2$RR, and CuAl achieved 76% efficiency for C$_2$ products. This pioneering application of inverse design in CO$_2$RR catalysis showcases the potential of MAGECS to dramatically accelerate the development of functional materials, paving the way for fully automated, artificial intelligence-driven material design.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.16800 [pdf, other]

TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations

Authors: Zheng Zhang, Yuntong Hu, Bo Pan, Chen Ling, Liang Zhao

Abstract: Text-Attributed Graphs (TAGs) enhance graph structures with natural language descriptions, enabling detailed representation of data and their relationships across a broad spectrum of real-world scenarios. Despite the potential for deeper insights, existing TAG representation learning primarily relies on supervised methods, necessitating extensive labeled data and limiting applicability across diverse contexts. This paper introduces a new self-supervised learning framework, Text-And-Graph Multi-View Alignment (TAGA), which overcomes these constraints by integrating TAGs' structural and semantic dimensions. TAGA constructs two complementary views: Text-of-Graph view, which organizes node texts into structured documents based on graph topology, and the Graph-of-Text view, which converts textual nodes and connections into graph data. By aligning representations from both views, TAGA captures joint textual and structural information. In addition, a novel structure-preserving random walk algorithm is proposed for efficient training on large-sized TAGs. Our framework demonstrates strong performance in zero-shot and few-shot scenarios across eight real-world datasets.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16606 [pdf, other]

Link Prediction on Textual Edge Graphs

Authors: Chen Ling, Zhuofeng Li, Yuntong Hu, Zheng Zhang, Zhongyuan Liu, Shuang Zheng, Liang Zhao

Abstract: Textual-edge Graphs (TEGs), characterized by rich text annotations on edges, are increasingly significant in network science due to their ability to capture rich contextual information among entities. Existing works have proposed various edge-aware graph neural networks (GNNs) or let language models directly make predictions. However, they often fall short of fully capturing the contextualized semantics on edges and graph topology, respectively. This inadequacy is particularly evident in link prediction tasks that require a comprehensive understanding of graph topology and semantics between nodes. In this paper, we present a novel framework - Link2Doc, designed especially for link prediction on textual-edge graphs. Specifically, we propose to summarize neighborhood information between node pairs as a human-written document to preserve both semantic and topology information. A self-supervised learning model is then utilized to enhance GNN's text-understanding ability from language models. Empirical evaluations, including link prediction, edge classification, parameter analysis, runtime comparison, and ablation studies, on four real-world datasets demonstrate that Link2Doc achieves generally better performance against existing edge-aware GNNs and pre-trained language models in predicting links on TEGs.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16506 [pdf, other]

GRAG: Graph Retrieval-Augmented Generation

Authors: Yuntong Hu, Zhihan Lei, Zheng Zhang, Bo Pan, Chen Ling, Liang Zhao

Abstract: While Retrieval-Augmented Generation (RAG) enhances the accuracy and relevance of responses by generative language models, it falls short in graph-based contexts where both textual and topological information are important. Naive RAG approaches inherently neglect the structural intricacies of textual graphs, resulting in a critical gap in the generation process. To address this challenge, we introduce Graph Retrieval-Augmented Generation (GRAG), which significantly enhances both the retrieval and generation processes by emphasizing the importance of subgraph structures. Unlike RAG approaches that focus solely on text-based entity retrieval, GRAG maintains an acute awareness of graph topology, which is crucial for generating contextually and factually coherent responses. Our GRAG approach consists of four main stages: indexing of $k$-hop ego-graphs, graph retrieval, soft pruning to mitigate the impact of irrelevant entities, and generation with pruned textual subgraphs. GRAG's core workflow, retrieving textual subgraphs followed by soft pruning, efficiently identifies relevant subgraph structures while avoiding the computational infeasibility typical of exhaustive subgraph searches, which are NP-hard. Moreover, we propose a novel prompting strategy that achieves lossless conversion from textual subgraphs to hierarchical text descriptions. Extensive experiments on graph multi-hop reasoning benchmarks demonstrate that in scenarios requiring multi-hop reasoning on textual graphs, our GRAG approach significantly outperforms current state-of-the-art RAG methods while effectively mitigating hallucinations.

Submitted 26 May, 2024; originally announced May 2024.

Comments: 14 pages, 4 figures

arXiv:2405.13115 [pdf, other]

Simulating optically-active spin defects with a quantum computer

Authors: Jack S. Baker, Pablo A. M. Casares, Modjtaba Shokrian Zini, Jaydeep Thik, Debasish Banerjee, Chen Ling, Alain Delgado, Juan Miguel Arrazola

Abstract: There is a pressing need for more accurate computational simulations of the opto-electronic properties of defects in materials to aid in the development of quantum sensing platforms. In this work, we explore how quantum computers could be effectively utilized for this purpose. Specifically, we develop fault-tolerant quantum algorithms to simulate optically active defect states and their radiative emission rates. We employ quantum defect embedding theory to translate the Hamiltonian of a defect-containing supercell into a smaller, effective Hamiltonian that accounts for dielectric screening effects. Our approach integrates block-encoding of the dipole operator with quantum phase estimation to selectively sample the optically active excited states that exhibit the largest dipole transition amplitudes. We also provide estimates of the quantum resources required to simulate a negatively-charged boron vacancy in a hexagonal boron nitride cluster. We conclude by offering a forward-looking perspective on the potential of quantum computers to enhance quantum sensor capabilities and identify specific scenarios where quantum computing can resolve problems traditionally challenging for classical computers.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 18 pages, 4 figures, 2 tables

arXiv:2405.11764 [pdf, other]

doi 10.1145/3626772.3657802

Modeling User Fatigue for Sequential Recommendation

Authors: Nian Li, Xin Ban, Cheng Ling, Chen Gao, Lantao Hu, Peng Jiang, Kun Gai, Yong Li, Qingmin Liao

Abstract: Recommender systems filter out information that meets user interests. However, users may be tired of the recommendations that are too similar to the content they have been exposed to in a short historical period, which is the so-called user fatigue. Despite the significance for a better user experience, user fatigue is seldom explored by existing recommenders. In fact, there are three main challenges to be addressed for modeling user fatigue, including what features support it, how it influences user interests, and how its explicit signals are obtained. In this paper, we propose to model user Fatigue in interest learning for sequential Recommendations (FRec). To address the first challenge, based on a multi-interest framework, we connect the target item with historical items and construct an interest-aware similarity matrix as features to support fatigue modeling. Regarding the second challenge, built upon feature cross, we propose a fatigue-enhanced multi-interest fusion to capture long-term interest. In addition, we develop a fatigue-gated recurrent unit for short-term interest learning, with temporal fatigue representations as important inputs for constructing update and reset gates. For the last challenge, we propose a novel sequence augmentation to obtain explicit fatigue signals for contrastive learning. We conduct extensive experiments on real-world datasets, including two public datasets and one large-scale industrial dataset. Experimental results show that FRec can improve AUC and GAUC up to 0.026 and 0.019 compared with state-of-the-art models, respectively. Moreover, large-scale online experiments demonstrate the effectiveness of FRec for fatigue reduction. Our codes are released at https://github.com/tsinghua-fib-lab/SIGIR24-FRec.

Submitted 22 May, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

Comments: SIGIR 2024

arXiv:2405.10124 [pdf, ps, other]

Smoothing Linear Codes by Rényi Divergence and Applications to Security Reduction

Authors: Hao Yan, Cong Ling

Abstract: The concept of the smoothing parameter plays a crucial role in both lattice-based and code-based cryptography, primarily due to its effectiveness in achieving nearly uniform distributions through the addition of noise. Recent research by Pathegama and Barg has determined the optimal smoothing bound for random codes under Rényi Divergence for any order $α\in (1, \infty)$ \cite{pathegama2024r}. Considering the inherent complexity of encoding/decoding algorithms in random codes, our research introduces enhanced structural elements into these coding schemes. Specifically, this paper presents a novel derivation of the smoothing bound for random linear codes, maintaining the same order of Rényi Divergence and achieving optimality for any $α\in (1,\infty)$. We extend this framework under KL Divergence by transitioning from random linear codes to random self-dual codes, and subsequently to random quasi-cyclic codes, incorporating progressively more structures. As an application, we derive an average-case to average-case reduction from the Learning Parity with Noise (LPN) problem to the average-case decoding problem. This reduction aligns with the parameter regime in \cite{debris2022worst}, but uniquely employs Rényi divergence and directly considers Bernoulli noise, instead of combining ball noise and Bernoulli noise.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.09784 [pdf, other]

Online bipartite matching with imperfect advice

Authors: Davin Choo, Themis Gouleakis, Chun Kai Ling, Arnab Bhattacharyya

Abstract: We study the problem of online unweighted bipartite matching with $n$ offline vertices and $n$ online vertices where one wishes to be competitive against the optimal offline algorithm. While the classic RANKING algorithm of Karp et al. [1990] provably attains competitive ratio of $1-1/e > 1/2$, we show that no learning-augmented method can be both 1-consistent and strictly better than $1/2$-robust under the adversarial arrival model. Meanwhile, under the random arrival model, we show how one can utilize methods from distribution testing to design an algorithm that takes in external advice about the online vertices and provably achieves competitive ratio interpolating between any ratio attainable by advice-free methods and the optimal ratio of 1, depending on the advice quality.

Submitted 23 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

Comments: Accepted into ICML 2024

arXiv:2405.04051 [pdf, ps, other]

On the quantization goodness of polar lattices

Authors: Ling Liu, Shanxiang Lyu, Cong Ling, Baoming Bai

Abstract: In this work, we prove that polar lattices, when tailored for lossy compression, are quantization-good in the sense that their normalized second moments approach $\frac{1}{2πe}$ as the dimension of lattices increases. It has been predicted by Zamir et al. \cite{ZamirQZ96} that the Entropy Coded Dithered Quantization (ECDQ) system using quantization-good lattices can achieve the rate-distortion bound of i.i.d. Gaussian sources. In our previous work \cite{LingQZ}, we established that polar lattices are indeed capable of attaining the same objective. It is reasonable to conjecture that polar lattices also demonstrate quantization goodness in the context of lossy compression. This study confirms this hypothesis.

Submitted 13 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: 12 pages, 5 figures, submitted to IEEE for possible publication

arXiv:2405.03070 [pdf, other]

Layered Graph Security Games

Authors: Jakub Černý, Chun Kai Ling, Christian Kroer, Garud Iyengar

Abstract: Security games model strategic interactions in adversarial real-world applications. Such applications often involve extremely large but highly structured strategy sets (e.g., selecting a distribution over all patrol routes in a given graph). In this paper, we represent each player's strategy space using a layered graph whose paths represent an exponentially large strategy space. Our formulation entails not only classic pursuit-evasion games, but also other security games, such as those modeling anti-terrorism and logistical interdiction. We study two-player zero-sum games under two distinct utility models: linear and binary utilities. We show that under linear utilities, Nash equilibrium can be computed in polynomial time, while binary utilities may lead to situations where even computing a best-response is computationally intractable. To this end, we propose a practical algorithm based on incremental strategy generation and mixed integer linear programs. We show through extensive experiments that our algorithm efficiently computes $ε$-equilibrium for many games of interest. We find that target values and graph structure often have a larger influence on running times as compared to the size of the graph per se.

Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

Comments: In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence. IJCAI Press, 2024

arXiv:2405.01680 [pdf, other]

Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations

Authors: Nima Hosseini Dashtbayaz, Ghazal Farhani, Boyu Wang, Charles X. Ling

Abstract: The residual loss in Physics-Informed Neural Networks (PINNs) alters the simple recursive relation of layers in a feed-forward neural network by applying a differential operator, resulting in a loss landscape that is inherently different from those of common supervised problems. Therefore, relying on the existing theory leads to unjustified design choices and suboptimal performance. In this work, we analyze the residual loss by studying its characteristics at critical points to find the conditions that result in effective training of PINNs. Specifically, we first show that under certain conditions, the residual loss of PINNs can be globally minimized by a wide neural network. Furthermore, our analysis also reveals that an activation function with well-behaved high-order derivatives plays a crucial role in minimizing the residual loss. In particular, to solve a $k$-th order PDE, the $k$-th derivative of the activation function should be bijective. The established theory paves the way for designing and choosing effective activation functions for PINNs and explains why periodic activations have shown promising performance in certain cases. Finally, we verify our findings by conducting a set of experiments on several PDEs. Our code is publicly available at https://github.com/nimahsn/pinns_tf2.

Submitted 12 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: Accepted at IJCAI 2024. V2: Corrected typos

arXiv:2404.19348 [pdf, ps, other]

Quasi-determinant and right eigenvalues of dual quaternion matrices

Authors: Chen Ling, Liqun Qi

Abstract: Dual quaternion/complex matrices have important applications in brain science and multi-agent formation control. In this paper, we first study some basic properties of determinants of dual complex matrices, including Sturm theorem and Bloomfield-Watson inequality for dual complex matrices. Then, we show that every eigenvalue of a dual complex matrix must be the root of the characteristic polynomial of this matrix. With the help of the determinants of dual complex matrices, we introduce the concept of quasi-determinants of dual quaternion matrices, and show that every right eigenvalue of a dual quaternion matrix must be the root of the quasi-characteristic polynomial of this matrix, as well as the quasi-determinant of a dual quaternion Hermitian matrix is equivalent to the product of the square of the magnitudes of all eigenvalues. Our results are helpful for the further study of dual quaternion matrix theory, and their applications.

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.19196 [pdf, other]

Tail Asymptotic of Heavy-Tail Risks with Elliptical Copula

Authors: Kai Wang, Chengxiu Ling

Abstract: We consider a family of multivariate distributions with heavy-tailed margins and the type I elliptical dependence structure. This class of risks is common in finance, insurance, environmental and biostatistic applications. We obtain the asymptotic tail risk probabilities and characterize the multivariate regular variation property. The results demonstrate how the rate of decay of probabilities on tail sets varies in tail sets and the covariance matrix of the elliptical copula. The theoretical results are well illustrated by typical examples and numerical simulations. A real data application shows its advantages in a more flexible dependence structure to characterize joint insurance losses.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.14668 [pdf, other]

Source Localization for Cross Network Information Diffusion

Authors: Chen Ling, Tanmoy Chowdhury, Jie Ji, Sirui Li, Andreas Züfle, Liang Zhao

Abstract: Source localization aims to locate information diffusion sources only given the diffusion observation, which has attracted extensive attention in the past few years. Existing methods are mostly tailored for single networks and may not be generalized to handle more complex networks like cross-networks. Cross-network is defined as two interconnected networks, where one network's functionality depends on the other. Source localization on cross-networks entails locating diffusion sources on the source network by only giving the diffused observation in the target network. The task is difficult because of three challenges: 1) diffusion sources distribution modeling; 2) jointly considering both static and dynamic node features; and 3) heterogeneous diffusion patterns learning. In this work, we propose a novel method, namely CNSL, to handle the three primary challenges. Specifically, we propose to learn the distribution of diffusion sources through Bayesian inference and leverage disentangled encoders to separately learn static and dynamic node features. The learning objective is coupled with the cross-network information propagation estimation model to make the inference of diffusion sources considering the overall diffusion process. Additionally, we also provide two novel cross-network datasets collected by ourselves. Extensive experiments are conducted on both datasets to demonstrate the effectiveness of CNSL in handling the source localization on cross-networks.

Submitted 22 April, 2024; originally announced April 2024.

Comments: Code and data are available at: https://github.com/tanmoysr/CNSL/

arXiv:2404.14446 [pdf, other]

Spatio-temporal Joint Analysis of PM2.5 and Ozone in California with INLA

Authors: Jianan Pan, Kunyang He, Kai Wang, Qing Mu, Chengxiu Ling

Abstract: The substantial threat of concurrent air pollutants to public health is increasingly severe under climate change. To identify the common drivers and extent of spatio-temporal similarity of PM2.5 and ozone, this paper proposed a log Gaussian-Gumbel Bayesian hierarchical model allowing for sharing a SPDE-AR(1) spatio-temporal interaction structure. The proposed model outperforms in terms of estimation accuracy and prediction capacity for its increased parsimony and reduced uncertainty, especially for the shared ozone sub-model. Besides the consistently significant influence of temperature (positive), extreme drought (positive), fire burnt area (positive), and wind speed (negative) on both PM2.5 and ozone, surface pressure and GDP per capita (precipitation) demonstrate only positive associations with PM2.5 (ozone), while population density relates to neither. In addition, our results show the distinct spatio-temporal interactions and different seasonal patterns of PM2.5 and ozone, with peaks of PM2.5 and ozone in cold and hot seasons, respectively. Finally, with the aid of the excursion function, we see that the areas around the intersection of San Luis Obispo and Santa Barbara counties are likely to exceed the unhealthy ozone level for sensitive groups throughout the year. Our findings provide new insights for regional and seasonal strategies in the co-control of PM2.5 and ozone. Our methodology is expected to be utilized when interest lies in multiple interrelated processes in the fields of environment and epidemiology.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2403.13574 [pdf, other]

A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation

Authors: Bowen Zheng, Zihan Lin, Enze Liu, Chen Yang, Enyang Bai, Cheng Ling, Wayne Xin Zhao, Ji-Rong Wen

Abstract: In online video platforms, reading or writing comments on interesting videos has become an essential part of the video watching experience. However, existing video recommender systems mainly model users' interaction behaviors with videos, lacking consideration of comments in user behavior modeling. In this paper, we propose a novel recommendation approach called LSVCR by leveraging user interaction histories with both videos and comments, so as to jointly conduct personalized video and comment recommendation. Specifically, our approach consists of two key components, namely sequential recommendation (SR) model and supplemental large language model (LLM) recommender. The SR model serves as the primary recommendation backbone (retained in deployment) of our approach, allowing for efficient user preference modeling. Meanwhile, we leverage the LLM recommender as a supplemental component (discarded in deployment) to better capture underlying user preferences from heterogeneous interaction behaviors. In order to integrate the merits of the SR model and the supplemental LLM recommender, we design a two-stage training paradigm. The first stage is personalized preference alignment, which aims to align the preference representations from both components, thereby enhancing the semantics of the SR model. The second stage is recommendation-oriented fine-tuning, in which the alignment-enhanced SR model is fine-tuned according to specific objectives. Extensive experiments in both video and comment recommendation tasks demonstrate the effectiveness of LSVCR. Additionally, online A/B testing on the KuaiShou platform verifies the actual benefits brought by our approach. In particular, we achieve a significant overall gain of 4.13% in comment watch time.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.11440 [pdf, ps, other]

Boosting Continuous Emotion Recognition with Self-Pretraining using Masked Autoencoders, Temporal Convolutional Networks, and Transformers

Authors: Weiwei Zhou, Jiada Lu, Chenkun Ling, Weifeng Wang, Shaowei Liu

Abstract: Human emotion recognition holds a pivotal role in facilitating seamless human-computer interaction. This paper delineates our methodology in tackling the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge within the ambit of the 6th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). Our study advocates a novel approach aimed at refining continuous emotion recognition. We achieve this by initially harnessing pre-training with Masked Autoencoders (MAE) on facial datasets, followed by fine-tuning on the aff-wild2 dataset annotated with expression (Expr) labels. The pre-trained model serves as an adept visual feature extractor, thereby enhancing the model's robustness. Furthermore, we bolster the performance of continuous emotion recognition by integrating Temporal Convolutional Network (TCN) modules and Transformer Encoder modules into our framework.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2402.17946 [pdf, other]

SparseLLM: Towards Global Pruning for Pre-trained Language Models

Authors: Guangji Bai, Yijiang Li, Chen Ling, Kibaek Kim, Liang Zhao

Abstract: The transformative impact of large language models (LLMs) like LLaMA and GPT on natural language processing is countered by their prohibitive computational demands. Pruning has emerged as a pivotal compression strategy, introducing sparsity to enhance both memory and computational efficiency. Yet, traditional global pruning is impractical for LLMs due to scalability issues, while local pruning, despite its efficiency, leads to suboptimal solutions. Addressing these challenges, we propose SparseLLM, a novel framework that redefines the global pruning process into manageable, coordinated subproblems, allowing for resource-efficient optimization with global optimality. SparseLLM's approach, which conceptualizes LLMs as a chain of modular functions and leverages auxiliary variables for problem decomposition, not only facilitates a pragmatic application on LLMs but also demonstrates significant performance improvements, particularly in high-sparsity regimes where it surpasses current state-of-the-art methods.

Submitted 23 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

Comments: Preprint. Under review

arXiv:2402.16898 [pdf, other]

MIM-Reasoner: Learning with Theoretical Guarantees for Multiplex Influence Maximization

Authors: Nguyen Do, Tanmoy Chowdhury, Chen Ling, Liang Zhao, My T. Thai

Abstract: Multiplex influence maximization (MIM) asks us to identify a set of seed users so as to maximize the expected number of influenced users in a multiplex network. MIM has been one of the central research topics, especially in today's social networking landscape, where users participate in multiple online social networks (OSNs) and their influences can propagate among several OSNs simultaneously. Although there exist a couple of combinatorial algorithms for MIM, learning-based solutions have been desired due to their generalization ability to heterogeneous networks and their diversified propagation characteristics. In this paper, we introduce MIM-Reasoner, coupling reinforcement learning with a probabilistic graphical model, which effectively captures the complex propagation process within and between layers of a given multiplex network, thereby tackling the most challenging problem in MIM. We establish a theoretical guarantee for MIM-Reasoner as well as conduct extensive analyses on both synthetic and real-world datasets to validate our MIM-Reasoner's performance.

Submitted 10 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Journal ref: International Conference on Artificial Intelligence and Statistics (AISTATS) 2024

arXiv:2402.16649 [pdf, other]

A Strongly Lensed Dusty Starburst of an Intrinsic Disk Morphology at Photometric Redshift of 7.7

Authors: Chenxiaoji Ling, Bangzheng Sun, Cheng Cheng, Nan Li, Zhiyuan Ma, Haojing Yan

Abstract: We present COSBO-7, a strong millimeter (mm) source that has been known for more than sixteen years but whose near-to-mid-IR counterpart was only just revealed by the James Webb Space Telescope (JWST). The precise pin-pointing by the Atacama Large Millimeter Array (ALMA) on the exquisite NIRCam and MIRI images shows that it is a background source gravitationally lensed by a single foreground galaxy, and the analysis of its spectral energy distribution by different tools consistently derives its photometric redshift at $\sim$7.7. Strikingly, our lens modeling based on the JWST data shows that it has a regular, disk morphology in the source plane. The dusty region giving rise to the far-IR-to-mm emission seems to be confined to a limited region to one side of the disk and has a high dust temperature of $>90$~K. The galaxy is experiencing starburst both within and outside of this dusty region. After taking the lensing magnification of $μ\approx 2.5\mbox{-}3.6$ into account, the intrinsic star formation rate is several hundred $M_\odot$ yr$^{-1}$ both within the dusty region and across the more extended stellar disk, and the latter already has $>10^{10}M_\odot$ of stars in place. If all this is true, COSBO-7 presents an extraordinary case that is against the common wisdom about galaxy formation in the early universe; simply put, its existence poses a critical question to be answered: how could a massive disk galaxy come into being so early in the universe and sustain its regular morphology in the middle of an enormous starburst?

Submitted 19 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: 27 pages, 12 figures. ApJL accepted

arXiv:2402.13098 [pdf, other]

ELAD: Explanation-Guided Large Language Models Active Distillation

Authors: Yifei Zhang, Bo Pan, Chen Ling, Yuntong Hu, Liang Zhao

Abstract: The deployment and application of Large Language Models (LLMs) is hindered by their memory inefficiency, computational demands, and the high costs of API inferences. Traditional distillation methods, which transfer the capabilities of LLMs to smaller models, often fail to determine whether the knowledge has been sufficiently transferred, potentially resulting in high costs or incomplete distillation. In this paper, we propose an Explanation-Guided LLMs Active Distillation (ELAD) framework that employs an active learning strategy to optimize the balance between annotation costs and model performance. To improve efficient sample selection, we introduce an explanation-guided sample selection method that identifies samples challenging its reasoning by exploiting uncertainties in explanation steps. Additionally, we present a customized LLM-annotated explanation revision technique where the teacher model detects and corrects flaws in the student model's reasoning. Our experiments across various reasoning datasets demonstrate that our framework significantly enhances the efficiency of LLM knowledge distillation.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.10779 [pdf, other]

A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models

Authors: Mingchen Li, Chen Ling, Rui Zhang, Liang Zhao

Abstract: Zero-shot link prediction (ZSLP) on knowledge graphs aims at automatically identifying relations between given entities. Existing methods primarily employ auxiliary information to predict tail entity given head entity and its relation, yet face challenges due to the occasional unavailability of such detailed information and the inherent simplicity of predicting tail entities based on semantic similarities. Even though Large Language Models (LLMs) offer a promising solution to predict unobserved relations between the head and tail entity in a zero-shot manner, their performance is still restricted due to the inability to leverage all the (exponentially many) paths' information between two entities, which are critical in collectively indicating their relation types. To address this, in this work, we introduce a Condensed Transition Graph Framework for Zero-Shot Link Prediction (CTLP), which encodes all the paths' information in linear time complexity to predict unseen relations between entities, attaining both efficiency and information preservation. Specifically, we design a condensed transition graph encoder with theoretical guarantees on its coverage, expressiveness, and efficiency. It is learned by a transition graph contrastive learning strategy. Subsequently, we design a soft instruction tuning to learn and map the all-path embedding to the input of LLMs. Experimental results show that our proposed CTLP method achieves state-of-the-art performance on three standard ZSLP datasets.

Submitted 16 February, 2024; originally announced February 2024.

arXiv:2402.10189 [pdf, other]

Uncertainty Quantification for In-Context Learning of Large Language Models

Authors: Chen Ling, Xujiang Zhao, Xuchao Zhang, Wei Cheng, Yanchi Liu, Yiyou Sun, Mika Oishi, Takao Osaki, Katsushi Matsuda, Jie Ji, Guangji Bai, Liang Zhao, Haifeng Chen

Abstract: In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs) and revolutionized various fields by providing a few task-relevant demonstrations in the prompt. However, trustworthiness issues with LLMs' responses, such as hallucination, have also been actively discussed. Existing works have been devoted to quantifying the uncertainty in LLMs' responses, but they often overlook the complex nature of LLMs and the uniqueness of in-context learning. In this work, we delve into the predictive uncertainty of LLMs associated with in-context learning, highlighting that such uncertainties may stem from both the provided demonstrations (aleatoric uncertainty) and ambiguities tied to the model's configurations (epistemic uncertainty). We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties. The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion. Extensive experiments are conducted to demonstrate the effectiveness of the decomposition. The code and data are available at: https://github.com/lingchen0331/UQ_ICL.

Submitted 28 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: Accepted to the main conference of NAACL 2024

arXiv:2402.07834 [pdf, other]

Generalizing across Temporal Domains with Koopman Operators

Authors: Qiuhao Zeng, Wei Wang, Fan Zhou, Gezheng Xu, Ruizhi Pu, Changjian Shui, Christian Gagne, Shichun Yang, Boyu Wang, Charles X. Ling

Abstract: In the field of domain generalization, the task of constructing a predictive model capable of generalizing to a target domain without access to target data remains challenging. This problem becomes further complicated when considering evolving dynamics between domains. While various approaches have been proposed to address this issue, a comprehensive understanding of the underlying generalization theory is still lacking. In this study, we contribute novel theoretic results that aligning conditional distribution leads to the reduction of generalization bounds. Our analysis serves as a key motivation for solving the Temporal Domain Generalization (TDG) problem through the application of Koopman Neural Operators, resulting in Temporal Koopman Networks (TKNets). By employing Koopman Operators, we effectively address the time-evolving distributions encountered in TDG using the principles of Koopman theory, where measurement functions are sought to establish linear transition relations between evolving domains. Through empirical evaluations conducted on synthetic and real-world datasets, we validate the effectiveness of our proposed approach.

Submitted 15 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

Comments: 15 pages, 7 figures, Accepted by AAAI 2024. arXiv admin note: text overlap with arXiv:2206.00047

arXiv:2402.05386 [pdf, other]

Exploring the faintest end of mid-infrared luminosity functions up to $z\simeq 5$ with the JWST CEERS survey

Authors: Chih-Teng Ling, Tomotsugu Goto, Seong Jin Kim, Cossas K. -W. Wu, Tetsuya Hashimoto, Tom C. -C. Chien, Yu-Wei Lin, Simon C. -C. Ho, Ece Kilerci

Abstract: Mid-infrared (MIR) light from galaxies is sensitive to dust-obscured star-formation activities because it traces the characteristic emission of dust heated by young, massive stars. By constructing the MIR luminosity functions (LFs), we are able to quantify the overall dusty star formation history and the evolution of galaxies over cosmic time. In this work, we report the first rest-frame MIR LFs at 7.7, 10, 12.8, 15, 18, and 21 $μ$m as well as the total IR LF from the James Webb Space Telescope (JWST) Cosmic Evolution Early Release Science (CEERS) survey. We identify 506 galaxies at $z=0-5.1$ in the CEERS survey that also have optical photometry from the Hubble Space Telescope. With the unprecedented sensitivity of the JWST, we probe the faintest end of the LFs at $z=0-1$ down to $L^* \sim 10^7 L_\odot$, $\sim 2$ orders of magnitude fainter than those from the previous generation of IR space telescopes. Our findings connect well with and continue the faint end of the MIR LFs from the deepest observations in past works. As a proxy of star formation history, we present the MIR-based luminosity density up to $z\simeq4.0$, marking the first probe of the early Universe by JWST MIRI.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 22 pages, 22 figures, 7 tables. Accepted for publication in MNRAS. A summary video can be found at https://youtu.be/TRb6bjmGfOU

arXiv:2402.03030 [pdf, other]

Rejection-Sampled Universal Quantization for Smaller Quantization Errors

Authors: Chih Wei Ling, Cheuk Ting Li

Abstract: We construct a randomized vector quantizer which has a smaller maximum error compared to all known lattice quantizers with the same entropy for dimensions 5, 6, ..., 48, and also has a smaller mean squared error compared to known lattice quantizers with the same entropy for dimensions 35, ..., 48, in the high resolution limit. Moreover, our randomized quantizer has a desirable property that the quantization error is always uniform over the ball and independent of the input. Our construction is based on applying rejection sampling on universal quantization, which allows us to shape the error distribution to be any continuous distribution, not only uniform distributions over basic cells of a lattice as in conventional dithered quantization. We also characterize the high SNR limit of one-shot channel simulation for any additive noise channel under a mild assumption (e.g., the AWGN channel), up to an additive constant of 1.45 bits.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 15 pages, 2 figures

arXiv:2401.10773 [pdf, other]

Multilevel lattice codes from Hurwitz quaternion integers

Authors: Juliana G. F. Souza, Sueli I. R. Costa, Cong Ling

Abstract: This work presents an extension of the Construction $π_A$ lattices proposed in \cite{huang2017construction}, to Hurwitz quaternion integers. This construction is provided by using an isomorphism from a version of the Chinese remainder theorem applied to maximal orders in contrast to natural orders in prior works. Exploiting this map, we analyze the performance of the resulting multilevel lattice codes and highlight, via computer simulations, their notably reduced computational complexity provided by the multistage decoding. Moreover, it is shown that this construction effectively attains the Poltyrev limit.

Submitted 27 February, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

arXiv:2401.09490 [pdf, other]

Gene-associated Disease Discovery Powered by Large Language Models

Authors: Jiayu Chang, Shiyu Wang, Chen Ling, Zhaohui Qin, Liang Zhao

Abstract: The intricate relationship between genetic variation and human diseases has been a focal point of medical research, evidenced by the identification of risk genes regarding specific diseases. The advent of advanced genome sequencing techniques has significantly improved the efficiency and cost-effectiveness of detecting these genetic markers, playing a crucial role in disease diagnosis and forming the basis for clinical decision-making and early risk assessment. To overcome the limitations of existing databases that record disease-gene associations from existing literature, which often lack real-time updates, we propose a novel framework employing Large Language Models (LLMs) for the discovery of diseases associated with specific genes. This framework aims to automate the labor-intensive process of sifting through medical literature for evidence linking genetic variations to diseases, thereby enhancing the efficiency of disease identification. Our approach involves using LLMs to conduct literature searches, summarize relevant findings, and pinpoint diseases related to specific genes. This paper details the development and application of our LLM-powered framework, demonstrating its potential in streamlining the complex process of literature retrieval and summarization to identify diseases associated with specific genetic variations.

Submitted 16 January, 2024; originally announced January 2024.

Comments: This is the official paper accepted by AAAI 2024 Workshop on Large Language Models for Biological Discoveries

arXiv:2401.01043 [pdf, other]

Polycyclic aromatic hydrocarbon (PAH) luminous galaxies in JWST CEERS data

Authors: Yu-Wei Lin, Cossas K. -W. Wu, Chih-Teng Ling, Tomotsugu Goto, Seong Jin Kim, Ece Kilerci, Tetsuya Hashimoto, Po-Ya Wang, Simon C. -C. Ho, Tiger Yu-Yang Hsiao, Bjorn Jasper R. Raquel, Yuri Uno

Abstract: It has been an unanswered question how many dusty galaxies have been undetected from the state-of-the-art observational surveys. JWST enables us to detect faint IR galaxies that have prominent polycyclic aromatic hydrocarbon (PAH) features in the mid-IR wavelengths. PAH is a valuable tracer of star formation and dust properties in the mid-infrared wavelength. The JWST Cosmic Evolution Early Release Science (CEERS) fields provide us with wavelength coverage from 7.7 to 21 $μ$m using six photometric bands of the mid-infrared instrument (MIRI). We have identified galaxies dominated by mid-IR emission from PAHs, termed PAH galaxies. From our multi-band photometry catalogue, we selected ten PAH galaxies displaying high flux ratios of $\log(S_{15}/S_{10}) > 0.8$. The SED fitting analysis indicates that these galaxies are star-forming galaxies with total IR luminosities of $10^{10}$ $\sim$ $10^{11.5}$ $L_{\odot}$ at z $\sim 1$. The morphology of PAH galaxies does not show any clear signatures of major merging or interaction within the MIRI resolution. The majority of them are on the star-formation main sequence at $z \sim 1$. Our result demonstrates that JWST can detect PAH emissions from normal star-forming galaxies at $z \sim 1$, in addition to ultra-luminous infrared galaxies (ULIRGs) or luminous infrared galaxies (LIRGs).

Submitted 2 January, 2024; originally announced January 2024.

Comments: 12 pages, 20 figures, 4 tables. Accepted by MNRAS. A summary video is at https://www.youtube.com/watch?v=UtPaVTFM4f8&ab_channel=NTHUCosmology

arXiv:2401.00625 [pdf, ps, other]

Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models

Authors: Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao

Abstract: The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated models like OpenAI's ChatGPT, represents a significant advancement in artificial intelligence. These models, however, bring forth substantial challenges in the high consumption of computational, memory, energy, and financial resources, especially in environments with limited resource capabilities. This survey aims to systematically address these challenges by reviewing a broad spectrum of techniques designed to enhance the resource efficiency of LLMs. We categorize methods based on their optimization focus (computational, memory, energy, financial, and network resources) and their applicability across various stages of an LLM's lifecycle, including architecture design, pretraining, finetuning, and system design. Additionally, the survey introduces a nuanced categorization of resource efficiency techniques by their specific resource types, which uncovers the intricate relationships and mappings between various resources and corresponding optimization techniques. A standardized set of evaluation metrics and datasets is also presented to facilitate consistent and fair comparisons across different models and techniques. By offering a comprehensive overview of the current state of the art and identifying open research avenues, this survey serves as a foundational reference for researchers and practitioners, aiding them in developing more sustainable and efficient LLMs in a rapidly evolving landscape.

Submitted 3 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

Comments: Preprint. GitHub repo: https://github.com/tiingweii-shii/Awesome-Resource-Efficient-LLM-Papers

arXiv:2312.15566 [pdf, other]

Deep Copula-Based Survival Analysis for Dependent Censoring with Identifiability Guarantees

Authors: Weijia Zhang, Chun Kai Ling, Xuanhui Zhang

Abstract: Censoring is the central problem in survival analysis where either the time-to-event (for instance, death), or the time-to-censoring (such as loss of follow-up) is observed for each sample. The majority of existing machine learning-based survival analysis methods assume that survival is conditionally independent of censoring given a set of covariates; an assumption that cannot be verified since only marginal distributions are available from the data. The existence of dependent censoring, along with the inherent bias in current estimators, has been demonstrated in a variety of applications, accentuating the need for a more nuanced approach. However, existing methods that adjust for dependent censoring require practitioners to specify the ground truth copula. This requirement poses a significant challenge for practical applications, as model misspecification can lead to substantial bias. In this work, we propose a flexible deep learning-based survival analysis method that simultaneously accommodates dependent censoring and eliminates the requirement for specifying the ground truth copula. We theoretically prove the identifiability of our model under a broad family of copulas and survival distributions. Experimental results from a wide range of datasets demonstrate that our approach successfully discerns the underlying dependency structure and significantly reduces survival estimation bias when compared to existing methods.

Submitted 27 December, 2023; v1 submitted 24 December, 2023; originally announced December 2023.

Comments: To appear in AAAI 2024

arXiv:2312.09058 [pdf, other]

Learning Coalition Structures with Games

Authors: Yixuan Even Xu, Chun Kai Ling, Fei Fang

Abstract: Coalitions naturally exist in many real-world systems involving multiple decision makers such as ridesharing, security, and online ad auctions, but the coalition structure among the agents is often unknown. We propose and study an important yet previously overlooked problem -- Coalition Structure Learning (CSL), where we aim to carefully design a series of games for the agents and infer the underlying coalition structure by observing their interactions in those games. We establish a lower bound on the sample complexity -- defined as the number of games needed to learn the structure -- of any algorithm for CSL and propose the Iterative Grouping (IG) algorithm for designing normal-form games to achieve the lower bound. We show that IG can be extended to other succinct games such as congestion games and graphical games. Moreover, we solve CSL in a more restrictive and practical setting: auctions. We show a variant of IG to solve CSL in the auction setting even if we cannot design the bidder valuations. Finally, we conduct experiments to evaluate IG in the auction setting and the results align with our theoretical analysis.

Submitted 18 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 13 pages, 4 figures, 3 tables, AAAI 2024

arXiv:2312.05822 [pdf, other]

Toward Open-ended Embodied Tasks Solving

Authors: William Wei Wang, Dongqi Han, Xufang Luo, Yifei Shen, Charles Ling, Boyu Wang, Dongsheng Li

Abstract: Empowering embodied agents, such as robots, with Artificial Intelligence (AI) has become increasingly important in recent years. A major challenge is task open-endedness. In practice, robots often need to perform tasks with novel goals that are multifaceted, dynamic, lack a definitive "end-state", and were not encountered during training. To tackle this problem, this paper introduces \textit{Diffusion for Open-ended Goals} (DOG), a novel framework designed to enable embodied AI to plan and act flexibly and dynamically for open-ended task goals. DOG synergizes the generative prowess of diffusion models with state-of-the-art, training-free guidance techniques to adaptively perform online planning and control. Our evaluations demonstrate that DOG can handle various kinds of novel task goals not seen during training, in both maze navigation and robot control problems. Our work sheds light on enhancing embodied AI's adaptability and competency in tackling open-ended goals.

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2312.02090 [pdf, other]

Cosmic star-formation history and black hole accretion history inferred from the JWST mid-infrared source counts

Authors: Seong Jin Kim, Tomotsugu Goto, Chih-Teng Ling, Cossas K. -W. Wu, Tetsuya Hashimoto, Ece Kilerci, Simon C. -C. Ho, Yuri Uno, Po-Ya Wang, Yu-Wei Lin

Abstract: With the advent of the James Webb Space Telescope (JWST), extra-galactic source count studies were conducted down to sub-microJy in the mid-infrared (MIR), which is several tens of times fainter than what the previous-generation infrared (IR) telescopes achieved in the MIR. In this work, we aim to interpret the JWST source counts and constrain cosmic star-formation history (CSFH) and black hole accretion history (BHAH). We employ the backward evolution of local luminosity functions (LLFs) of galaxies to reproduce the observed source counts from sub-microJy to a few tens of mJy in the MIR bands of the JWST. The shapes of the LLFs at the MIR bands are determined using the model templates of the spectral energy distributions (SEDs) for five representative galaxy types (star-forming galaxies, starbursts, composite, AGN type 2 and 1). By simultaneously fitting our model to all the source counts in the six MIR bands, along with the previous results, we determine the best-fit evolutions of MIR LFs for each of the five galaxy types, and subsequently estimate the CSFH and BHAH. Thanks to the JWST, our estimates are based on several tens of times fainter MIR sources, the existence of which was merely an extrapolation in previous studies.

Submitted 4 December, 2023; originally announced December 2023.

Comments: 15 pages, 12 figures, published in MNRAS, https://doi.org/10.1093/mnras/stad3499. A summary video is at https://youtu.be/Md6wragrYyM

arXiv:2311.16870 [pdf, ps, other]

Unit Reducible Cyclotomic Fields

Authors: Christian Porter, Piero Sarti, Cong Ling, Alar Leibak

Abstract: In this paper, we continue the study of unit reducible fields as introduced in \cite{LPL23} for the special case of cyclotomic fields. Specifically, we deduce that the cyclotomic fields of conductors $2,3,5,7,8,9,12,15$ are all unit reducible, and show that any cyclotomic field of conductor $N$ is not unit reducible if $2^4, 3^3, 5^2, 7^2, 11^2$ or any prime $p \geq 13$ divide $N$, meaning the unit reducible cyclotomic fields are finite in number. Finally, if $a$ is a totally positive element of a cyclotomic field, we show that for all equivalent $a^\prime$, the discrepancy between $\trace_{K/\mathbb{Q}}(a^\prime)$ and the shortest nonzero element of the quadratic form $\trace_{K/\mathbb{Q}}(axx^*)$ where $x$ is taken from the ring of integers tends to infinity as the conductor $N$ goes to infinity.

Submitted 28 November, 2023; originally announced November 2023.

Comments: 12 pages including bibliography

arXiv:2311.16392 [pdf, other]

Multi-defender Security Games with Schedules

Authors: Zimeng Song, Chun Kai Ling, Fei Fang

Abstract: Stackelberg Security Games are often used to model strategic interactions in high-stakes security settings. The majority of existing models focus on single-defender settings where a single entity assumes command of all security assets. However, many realistic scenarios feature multiple heterogeneous defenders with their own interests and priorities embedded in a more complex system. Furthermore, defenders rarely choose targets to protect. Instead, they have a multitude of defensive resources or schedules at their disposal, each with different protective capabilities. In this paper, we study security games featuring multiple defenders and schedules simultaneously. We show that unlike prior work on multi-defender security games, the introduction of schedules can cause non-existence of equilibrium even under rather restricted environments. We prove that under the mild restriction that any subset of a schedule is also a schedule, not only is non-existence of equilibrium avoided, but an equilibrium can be computed in polynomial time in games with two defenders. Under additional assumptions, our algorithm can be extended to games with more than two defenders and its computation scaled up in special classes of games with compactly represented schedules such as those used in patrolling applications. Experimental results suggest that our methods scale gracefully with game size, making our algorithms amongst the few that can tackle multiple heterogeneous defenders.

Submitted 27 November, 2023; originally announced November 2023.

Comments: Extended version of the paper accepted to GameSec 2023

arXiv:2311.15161 [pdf, other]

doi 10.1109/TKDE.2024.3419449

Hessian Aware Low-Rank Perturbation for Order-Robust Continual Learning

Authors: Jiaqi Li, Yuanhao Lai, Rui Wang, Changjian Shui, Sabyasachi Sahoo, Charles X. Ling, Shichun Yang, Boyu Wang, Christian Gagné, Fan Zhou

Abstract: Continual learning aims to learn a series of tasks sequentially without forgetting the knowledge acquired from the previous ones. In this work, we propose the Hessian Aware Low-Rank Perturbation algorithm for continual learning. By modeling the parameter transitions along the sequential tasks with the weight matrix transformation, we propose to apply the low-rank approximation on the task-adaptive parameters in each layer of the neural networks. Specifically, we theoretically demonstrate the quantitative relationship between the Hessian and the proposed low-rank approximation. The approximation ranks are then globally determined according to the marginal increment of the empirical loss estimated by the layer-specific gradient and low-rank approximation error. Furthermore, we control the model capacity by pruning less important parameters to diminish the parameter growth. We conduct extensive experiments on various benchmarks, including a dataset with large-scale tasks, and compare our method against some recent state-of-the-art methods to demonstrate the effectiveness and scalability of our proposed method. Empirical results show that our method performs better on different benchmarks, especially in achieving task order robustness and handling the forgetting issue. The source code is at https://github.com/lijiaqi/HALRP.

Submitted 24 June, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

Comments: Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE)

arXiv:2311.15121 [pdf, other]

Candidate Galaxies at z ~ 11.3--21.8 and beyond: results from JWST's public data taken in its first year

Authors: Haojing Yan, Bangzheng Sun, Zhiyuan Ma, Chenxiaoji Ling

Abstract: We present a systematic search of candidate galaxies at z > 11.3 using the public Near Infrared Camera data taken by the James Webb Space Telescope (JWST) in its Cycle 1, which include six blank fields totalling 386 sq.arcmin and two lensing cluster fields totalling 48 sq.arcmin. The candidates are selected as F150W, F200W and F277W dropouts, which correspond to z ~ 12.7 (11.3 < z < 15.4), 17.3 (15.4 < z < 21.8) and 24.7 (21.8 < z < 28.3), respectively. Our sample consists of 123 F150W dropouts, 52 F200W dropouts and 32 F277W dropouts, which is the largest candidate galaxy sample probing the highest redshift range to date. The F150W and F200W dropouts have sufficient photometric information that allows contaminant rejection, which we do by fitting to their spectral energy distributions. Based on the purified samples of F150W and F200W dropouts, we derive galaxy luminosity functions at z ~ 12.7 and 17.3, respectively. We find that both are better described by a power law than a Schechter function and that there is only a marginal evolution (a factor of < 2) between the two epochs. The emergence of a galaxy population at z ~ 17.3 or earlier is consistent with the suggestion of an early cosmic hydrogen reionization and is not necessarily a crisis of the LCDM paradigm. To establish a new picture of galaxy formation in the early universe, we will need both JWST spectroscopic confirmation of bright candidates such as those in our sample and deeper surveys to further constrain the faint end of the luminosity function at M > -18 mag.

Submitted 25 November, 2023; originally announced November 2023.

Comments: Submitted to ApJ

arXiv:2310.11672 [pdf, other]

Open-ended Commonsense Reasoning with Unrestricted Answer Scope

Authors: Chen Ling, Xuchao Zhang, Xujiang Zhao, Yanchi Liu, Wei Cheng, Mika Oishi, Takao Osaki, Katsushi Matsuda, Haifeng Chen, Liang Zhao

Abstract: Open-ended Commonsense Reasoning is defined as solving a commonsense question without providing 1) a short list of answer candidates and 2) a pre-defined answer scope. Conventional ways of formulating the commonsense question into a question-answering form or utilizing external knowledge to learn retrieval-based methods are less applicable in the open-ended setting due to an inherent challenge. Without pre-defining an answer scope or a few candidates, open-ended commonsense reasoning entails predicting answers by searching over an extremely large search space. Moreover, most questions require implicit multi-hop reasoning, which presents even more challenges to our problem. In this work, we leverage pre-trained language models to iteratively retrieve reasoning paths on the external knowledge base, which does not require task-specific supervision. The reasoning paths can help to identify the most precise answer to the commonsense question. We conduct experiments on two commonsense benchmark datasets. Compared to other approaches, our proposed method achieves better performance both quantitatively and qualitatively.

Submitted 27 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: Findings of EMNLP 2023

arXiv:2310.10408 [pdf, other]

A cross Transformer for image denoising

Authors: Chunwei Tian, Menghua Zheng, Wangmeng Zuo, Shichao Zhang, Yanning Zhang, Chia-Wen Ling

Abstract: Deep convolutional neural networks (CNNs) depend on feedforward and feedback ways to obtain good performance in image denoising. However, how to obtain effective structural information via CNNs to efficiently represent given noisy images is key for complex scenes. In this paper, we propose a cross Transformer denoising CNN (CTNet) with a serial block (SB), a parallel block (PB), and a residual block (RB) to obtain clean images for complex scenes. An SB uses an enhanced residual architecture to deeply search structural information for image denoising. To avoid loss of key information, a PB uses three heterogeneous networks to implement multiple interactions of multi-level features to broadly search for extra information for improving the adaptability of an obtained denoiser for complex scenes. Also, to improve denoising performance, Transformer mechanisms are embedded into the SB and PB to extract complementary salient features for effectively removing noise in terms of pixel relations. Finally, an RB is applied to acquire clean images. Experiments illustrate that our CTNet is superior to some popular denoising methods in terms of real and synthetic image denoising. It is suitable for mobile digital devices, i.e., phones. Codes can be obtained at https://github.com/hellloxiaotian/CTNet.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.09062 [pdf]

doi 10.1103/PhysRevB.108.184203

Mechanisms of temperature-dependent thermal transport in amorphous silica from machine-learning molecular dynamics

Authors: Ting Liang, Penghua Ying, Ke Xu, Zhenqiang Ye, Chao Ling, Zheyong Fan, Jianbin Xu

Abstract: Amorphous silica (a-SiO$_2$) is a foundational disordered material for which the thermal transport properties are important for various applications. To accurately model the interatomic interactions in classical molecular dynamics (MD) simulations of thermal transport in a-SiO$_2$, we herein develop an accurate yet highly efficient machine-learned potential model that allowed us to generate a-SiO$_2$ samples closely resembling experimentally produced ones. Using the homogeneous nonequilibrium MD method and a proper quantum-statistical correction to the classical MD results, quantitative agreement with experiments is achieved for the thermal conductivities of bulk and 190 nm-thick a-SiO$_2$ films over a wide range of temperatures. To interrogate the thermal vibrations at different temperatures, we calculated the current correlation functions corresponding to the transverse acoustic (TA) and longitudinal acoustic (LA) collective vibrations. The results reveal that below the Ioffe-Regel crossover frequency, phonons, as well-defined excitations, remain applicable in a-SiO$_2$ and play a predominant role at low temperatures, resulting in a temperature-dependent increase in thermal conductivity. In the high-temperature region, more phonons are excited, accompanied by a more intense liquid-like diffusion event. We attribute the temperature-independent thermal conductivity in the high-temperature range of a-SiO$_2$ to the collaborative involvement of excited phonon scattering and liquid-like diffusion in heat conduction. These findings provide physical insights into the thermal transport of a-SiO$_2$ and are expected to be applied to a vast range of amorphous materials.

Submitted 1 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: 12 pages, 7 figures in main text; 15 pages, 12 figures in Supplemental Material

Journal ref: Physical Review B 108, 184203 (2023)

arXiv:2309.14337 [pdf, other]

The true fraction of repeating fast radio bursts revealed through CHIME source count evolution

Authors: Shotaro Yamasaki, Tomotsugu Goto, Chih-Teng Ling, Tetsuya Hashimoto

Abstract: Fast Radio Bursts (FRBs) are classified into repeaters and non-repeaters, with only a few percent of the observed FRB population from the Canadian Hydrogen Intensity Mapping Experiment (CHIME) confirmed as repeaters. However, this figure represents only a lower limit due to the observational biases, and the true fraction of repeaters remains unknown. Correcting for these biases uncovers a notable decline in apparently non-repeating FRB detection rate as the CHIME operational time increases. This finding suggests that a significant portion of apparently non-repeating FRBs could in fact exhibit repetition when observed over more extended periods. A simple population model infers that the true repeater fraction likely exceeds 50% with 99% confidence, a figure substantially larger than the observed face value, even consistent with 100%. This greater prevalence of repeaters had previously gone unnoticed due to their very low repetition rates ($\sim$10$^{-3.5}$ hr$^{-1}$ on average). Hence, theoretical FRB models must incorporate these low-rate repeaters. Furthermore, our results indicate a significantly higher repeater volume number density, potentially exceeding observed values by up to 10$^4$ times, which in turn impacts comparisons with potential FRB progenitors.

Submitted 12 December, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: 10 pages, 10 figures, MNRAS in press, updated to match the accepted version

arXiv:2309.08301 [pdf, other]

RaSpectLoc: RAman SPECTroscopy-dependent robot LOCalisation

Authors: Christopher Thomas Thirgood, Oscar Alejandro Mendez Maldonado, Chao Ling, Jonathan Storey, Simon J Hadfield

Abstract: This paper presents a new information source for supporting robot localisation: material composition. The proposed method complements the existing visual, structural, and semantic cues utilized in the literature. However, it has a distinct advantage in its ability to differentiate structurally, visually or categorically similar objects such as different doors, by using Raman spectrometers. Such devices can identify the material of the objects they probe through the bonds between the material's molecules. Unlike similar sensing techniques, such as mass spectroscopy, they do so without damaging the material or environment. In addition to introducing the first material-based localisation algorithm, this paper supports the future growth of the field by presenting a Gazebo plugin for Raman spectrometers, material sensing demonstrations, as well as the first-ever localisation data-set with benchmarks for material-based localisation. This benchmarking shows that the proposed technique results in a significant improvement over current state-of-the-art localisation techniques, achieving 16\% more accurate localisation than the leading baseline.

Submitted 21 September, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: 8 pages, 5 figures. This work will be presented at IROS 2023

arXiv:2309.06982 [pdf, other]

Communication-Efficient Laplace Mechanism for Differential Privacy via Random Quantization

Authors: Ali Moradi Shahmiri, Chih Wei Ling, Cheuk Ting Li

Abstract: We propose the first method that realizes the Laplace mechanism exactly (i.e., a Laplace noise is added to the data) that requires only a finite amount of communication (whereas the original Laplace mechanism requires the transmission of a real number) while guaranteeing privacy against the server and database. Our mechanism can serve as a drop-in replacement for local or centralized differential privacy applications where the Laplace mechanism is used. Our mechanism is constructed using a random quantization technique. Unlike the simple and prevalent Laplace-mechanism-then-quantize approach, the quantization in our mechanism does not result in any distortion or degradation of utility. Unlike existing dithered quantization and channel simulation schemes for simulating additive Laplacian noise, our mechanism guarantees privacy not only against the database and downstream, but also against the honest but curious server which attempts to decode the data using the dither signals.

Submitted 13 September, 2023; originally announced September 2023.

Comments: 11 pages, 3 figures, short version to be submitted at 2024 IEEE International Conference on Acoustics, Speech and Signal Processing

arXiv:2309.03433 [pdf, other]

Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty

Authors: Chen Ling, Xujiang Zhao, Xuchao Zhang, Yanchi Liu, Wei Cheng, Haoyu Wang, Zhengzhang Chen, Takao Osaki, Katsushi Matsuda, Haifeng Chen, Liang Zhao

Abstract: The Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text, typically in the form of (subject, relation, object) triples. Despite the potential of large language models (LLMs) like ChatGPT as general task solvers, they lag behind state-of-the-art (supervised) methods in OIE tasks due to two key issues. First, LLMs struggle to distinguish irrelevant context from relevant relations and generate structured output due to the restrictions on fine-tuning the model. Second, LLMs generate responses autoregressively based on probability, which makes the predicted relations lack confidence. In this paper, we assess the capabilities of LLMs in improving the OIE task. Particularly, we propose various in-context learning strategies to enhance LLMs' instruction-following ability and a demonstration uncertainty quantification module to enhance the confidence of the generated relations. Our experiments on three OIE benchmark datasets show that our approach holds its own against established supervised methods, both quantitatively and qualitatively.

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2309.02978 [pdf, other]

Helper Recommendation with seniority control in Online Health Community

Authors: Junruo Gao, Chen Ling, Carl Yang, Liang Zhao

Abstract: Online health communities (OHCs) are forums where patients with similar conditions communicate their experiences and provide moral support. Social support in OHCs plays a crucial role in easing and rehabilitating patients. However, many time-sensitive questions from patients often remain unanswered due to the multitude of threads and the random nature of patient visits in OHCs. To address this issue, it is imperative to propose a recommender system that assists solution seekers in finding appropriate problem helpers. Nevertheless, developing a recommendation algorithm to enhance social support in OHCs remains an under-explored area. Traditional recommender systems cannot be directly adapted due to the following obstacles. First, unlike user-item links in traditional recommender systems, it is hard to model the social support behind helper-seeker links in OHCs since they are formed for various heterogeneous reasons. Second, it is difficult to distinguish the impact of historical activities in characterizing patients. Third, it is significantly challenging to ensure that the recommended helpers possess sufficient expertise to assist the seekers. To tackle the aforementioned challenges, we develop a Monotonically regularIzed diseNTangled Variational Autoencoders (MINT) model to strengthen social support in OHCs.

Submitted 6 September, 2023; originally announced September 2023.

Showing 1–50 of 391 results for author: Ling, C