Search | arXiv e-print repository

CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning

Authors: Weiqi Wang, Tianqing Fang, Chunyang Li, Haochen Shi, Wenxuan Ding, Baixuan Xu, Zhaowei Wang, Jiaxin Bai, Xin Liu, Jiayang Cheng, Chunkit Chan, Yangqiu Song

Abstract: The sequential process of conceptualization and instantiation is essential to generalizable commonsense reasoning as it allows the application of existing knowledge to unfamiliar scenarios. However, existing works tend to undervalue the step of instantiation and heavily rely on pre-built concept taxonomies and human annotations to collect both types of knowledge, resulting in a lack of instantiated knowledge to complete reasoning, high cost, and limited scalability. To tackle these challenges, we introduce CANDLE, a distillation framework that iteratively performs contextualized conceptualization and instantiation over commonsense knowledge bases by instructing large language models to generate both types of knowledge with critic filtering. By applying CANDLE to ATOMIC, we construct a comprehensive knowledge base comprising six million conceptualizations and instantiated commonsense knowledge triples. Both types of knowledge are firmly rooted in the original ATOMIC dataset, and intrinsic evaluations demonstrate their exceptional quality and diversity. Empirical results indicate that distilling CANDLE on student models provides benefits across four downstream tasks. Our code, data, and models are publicly available at https://github.com/HKUST-KnowComp/CANDLE.

Submitted 21 May, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

Comments: ACL2024

arXiv:2401.07189 [pdf, ps, other]

Generic character sheaves on parahoric subgroups

Authors: Roman Bezrukavnikov, Charlotte Chan

Abstract: We prove that on a "generic locus" of the equivariant derived category of constructible sheaves, positive-depth parabolic induction is a $t$-exact equivalence of categories. Iterating this with respect to sequences of generic data allows us to take as input an arbitrary character sheaf on a connected algebraic group and output a family of positive-depth character sheaves on parahoric group schemes. In the simplest interesting setting, our construction produces a simple perverse sheaf associated to a sufficiently nontrivial multiplicative local system on a torus, resolving a conjecture of Lusztig. We prove, under a mild condition on $q$, that this realizes the character of the representation arising from the associated parahoric Deligne--Lusztig induction.

Submitted 13 January, 2024; originally announced January 2024.

Comments: 46 pages

arXiv:2401.01952 [pdf, other]

Instruct-Imagen: Image Generation with Multi-modal Instruction

Authors: Hexiang Hu, Kelvin C. K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, Kihyuk Sohn, Yang Zhao, Xue Ben, Boqing Gong, William Cohen, Ming-Wei Chang, Xuhui Jia

Abstract: This paper presents instruct-imagen, a model that tackles heterogeneous image generation tasks and generalizes across unseen tasks. We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision. It uses natural language to amalgamate disparate modalities (e.g., text, edge, style, subject, etc.), such that abundant generation intents can be standardized in a uniform format. We then build instruct-imagen by fine-tuning a pre-trained text-to-image diffusion model with a two-stage framework. First, we adapt the model using retrieval-augmented training to enhance the model's capabilities to ground its generation on external multimodal context. Subsequently, we fine-tune the adapted model on diverse image generation tasks that require vision-language understanding (e.g., subject-driven generation, etc.), each paired with a multi-modal instruction encapsulating the task's essence. Human evaluation on various image generation datasets reveals that instruct-imagen matches or surpasses prior task-specific models in-domain and demonstrates promising generalization to unseen and more complex tasks.

Submitted 3 January, 2024; originally announced January 2024.

Comments: 20 pages, 18 figures

arXiv:2401.00994 [pdf]

Detection and Defense Against Prominent Attacks on Preconditioned LLM-Integrated Virtual Assistants

Authors: Chun Fai Chan, Daniel Wankit Yip, Aysan Esmradi

Abstract: The emergence of LLM (Large Language Model) integrated virtual assistants has brought about a rapid transformation in communication dynamics. During virtual assistant development, some developers prefer to leverage the system message, also known as an initial prompt or custom prompt, for preconditioning purposes. However, it is important to recognize that an excessive reliance on this functionality raises the risk of manipulation by malicious actors who can exploit it with carefully crafted prompts. Such malicious manipulation poses a significant threat, potentially compromising the accuracy and reliability of the virtual assistant's responses. Consequently, safeguarding the virtual assistants with detection and defense mechanisms becomes of paramount importance to ensure their safety and integrity. In this study, we explored three detection and defense mechanisms aimed at countering attacks that target the system message. These mechanisms include inserting a reference key, utilizing an LLM evaluator, and implementing a Self-Reminder. To showcase the efficacy of these mechanisms, they were tested against prominent attack techniques. Our findings demonstrate that the investigated mechanisms are capable of accurately identifying and counteracting the attacks. The effectiveness of these mechanisms underscores their potential in safeguarding the integrity and reliability of virtual assistants, reinforcing the importance of their implementation in real-world scenarios. By prioritizing the security of virtual assistants, organizations can maintain user trust, preserve the integrity of the application, and uphold the high standards expected in this era of transformative technologies.

Submitted 1 January, 2024; originally announced January 2024.

Comments: Accepted to be published in the Proceedings of the 10th IEEE CSDE 2023, the Asia-Pacific Conference on Computer Science and Data Engineering 2023

arXiv:2401.00991 [pdf]

A Novel Evaluation Framework for Assessing Resilience Against Prompt Injection Attacks in Large Language Models

Authors: Daniel Wankit Yip, Aysan Esmradi, Chun Fai Chan

Abstract: Prompt injection attacks exploit vulnerabilities in large language models (LLMs) to manipulate the model into unintended actions or generate malicious content. As LLM integrated applications gain wider adoption, they face growing susceptibility to such attacks. This study introduces a novel evaluation framework for quantifying the resilience of applications. The framework incorporates innovative techniques designed to ensure representativeness, interpretability, and robustness. To ensure the representativeness of simulated attacks on the application, a meticulous selection process was employed, resulting in 115 carefully chosen attacks based on coverage and relevance. For enhanced interpretability, a second LLM was utilized to evaluate the responses generated from these simulated attacks. Unlike conventional malicious content classifiers that provide only a confidence score, the LLM-based evaluation produces a score accompanied by an explanation, thereby enhancing interpretability. Subsequently, a resilience score is computed by assigning higher weights to attacks with greater impact, thus providing a robust measurement of the application resilience. To assess the framework's efficacy, it was applied to two LLMs, namely Llama2 and ChatGLM. Results revealed that Llama2, the newer model, exhibited higher resilience compared to ChatGLM. This finding substantiates the effectiveness of the framework, aligning with the prevailing notion that newer models tend to possess greater resilience. Moreover, the framework exhibited exceptional versatility, requiring only minimal adjustments to accommodate emerging attack techniques and classifications, thereby establishing itself as an effective and practical solution. Overall, the framework offers valuable insights that empower organizations to make well-informed decisions to fortify their applications against potential threats from prompt injection.

Submitted 1 January, 2024; originally announced January 2024.

Comments: Accepted to be published in the Proceedings of The 10th IEEE CSDE 2023, the Asia-Pacific Conference on Computer Science and Data Engineering 2023

arXiv:2312.14373 [pdf, other]

Learning Socio-Temporal Graphs for Multi-Agent Trajectory Prediction

Authors: Yuke Li, Lixiong Chen, Guangyi Chen, Ching-Yao Chan, Kun Zhang, Stefano Anzellotti, Donglai Wei

Abstract: In order to predict a pedestrian's trajectory in a crowd accurately, one has to take into account her/his underlying socio-temporal interactions with other pedestrians consistently. Unlike existing work that represents the relevant information separately, partially, or implicitly, we propose a complete representation for it to be fully and explicitly captured and analyzed. In particular, we introduce a Directed Acyclic Graph-based structure, which we term Socio-Temporal Graph (STG), to explicitly capture pair-wise socio-temporal interactions among a group of people across both space and time. Our model is built on a time-varying generative process, whose latent variables determine the structure of the STGs. We design an attention-based model named STGformer that affords an end-to-end pipeline to learn the structure of the STGs for trajectory prediction. Our solution achieves overall state-of-the-art prediction accuracy in two large-scale benchmark datasets. Our analysis shows that a person's past trajectory is critical for predicting another person's future path. Our model learns this relationship with a strong notion of socio-temporal localities. Statistics show that utilizing this information explicitly for prediction yields a noticeable performance gain with respect to the trajectory-only approaches.

Submitted 21 December, 2023; originally announced December 2023.

arXiv:2312.11771 [pdf, other]

Near-field Spin Chern Number Quantized by Real-space Topology of Optical Structures

Authors: Tong Fu, Ruo-Yang Zhang, Shiqi Jia, C. T. Chan, Shubo Wang

Abstract: The Chern number has been widely used to describe the topological properties of periodic structures in the momentum space. Here, we introduce a real-space spin Chern number for the optical near fields of finite-sized structures. This new spin Chern number is intrinsically quantized and equal to the structure's Euler characteristic. The relationship is robust against continuous deformation of the structure's geometry and is irrelevant to the specific material constituents or external excitation. Our work enriches topological physics by extending the concept of Chern number to the real space, opening exciting possibilities for exploring the real-space topological properties of light.

Submitted 1 May, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

Comments: 9 pages, 6 figures

arXiv:2312.10982 [pdf]

A Comprehensive Survey of Attack Techniques, Implementation, and Mitigation Strategies in Large Language Models

Authors: Aysan Esmradi, Daniel Wankit Yip, Chun Fai Chan

Abstract: Ensuring the security of large language models (LLMs) is an ongoing challenge despite their widespread popularity. Developers work to enhance LLM security, but vulnerabilities persist, even in advanced versions like GPT-4. Attackers exploit these weaknesses, highlighting the need for proactive cybersecurity measures in AI model development. This article explores two attack categories: attacks on models themselves and attacks on model applications. The former requires expertise, access to model data, and significant implementation time, while the latter is more accessible to attackers and has seen increased attention. Our study reviews over 100 recent research works, providing an in-depth analysis of each attack type. We identify the latest attack methods and explore various approaches to carry them out. We thoroughly investigate mitigation techniques, assessing their effectiveness and limitations. Furthermore, we summarize future defenses against these attacks. We also examine real-world techniques, including reported and our implemented attacks on LLMs, to consolidate our findings. Our research highlights the urgency of addressing security concerns and aims to enhance the understanding of LLM attacks, contributing to robust defense development in this evolving domain.

Submitted 18 December, 2023; originally announced December 2023.

Comments: Accepted to be published in the Proceedings of the 3rd International Conference on Ubiquitous Security 2023 (UbiSec-2023)

arXiv:2312.10919 [pdf, other]

IoT Ground Sensing Systems for Early Wildfire Detection: Technologies, Challenges and Opportunities

Authors: Chiu Chun Chan, Sheeraz A. Alvi, Xiangyun Zhou, Salman Durrani, Nicholas Wilson, Marta Yebra

Abstract: The threat posed by wildfires or bushfires has become a severe global issue due to the increase in human activities in forested areas and the impact of climate change. Consequently, there is a surge in the development of automatic wildfire detection methods. Approaches based on long-distance imagery from satellites or watchtowers encounter limitations, such as restricted visibility, which results in delayed response times. To address and overcome these challenges, research interest has grown in the implementation of ground-based Internet of Things (IoT) sensing systems for early wildfire detection. However, research on energy consumption, detection latency, and detection accuracy of IoT sensing systems, as well as the performance of various anomaly detection algorithms when evaluated using these metrics, is lacking. Therefore, in this article, we present an overview of current IoT ground sensing systems for early wildfire detection. Camera and environmental sensing technologies suitable for early wildfire detection are discussed, as well as vision-based detection algorithms and detection algorithms for environmental sensing. Challenges related to the development and implementation of IoT ground sensing systems for early wildfire detection and the future research directions important for creating a robust detection system to combat the growing threat of wildfires worldwide are discussed.

Submitted 17 December, 2023; originally announced December 2023.

arXiv:2312.10072 [pdf, other]

Assessing the Usability of GutGPT: A Simulation Study of an AI Clinical Decision Support System for Gastrointestinal Bleeding Risk

Authors: Colleen Chan, Kisung You, Sunny Chung, Mauro Giuffrè, Theo Saarinen, Niroop Rajashekar, Yuan Pu, Yeo Eun Shin, Loren Laine, Ambrose Wong, René Kizilcec, Jasjeet Sekhon, Dennis Shung

Abstract: Applications of large language models (LLMs) like ChatGPT have potential to enhance clinical decision support through conversational interfaces. However, challenges of human-algorithmic interaction and clinician trust are poorly understood. GutGPT, an LLM for gastrointestinal (GI) bleeding risk prediction and management guidance, was deployed in clinical simulation scenarios alongside the electronic health record (EHR) with emergency medicine physicians, internal medicine physicians, and medical students to evaluate its effect on physician acceptance and trust in AI clinical decision support systems (AI-CDSS). GutGPT provides risk predictions from a validated machine learning model and evidence-based answers by querying extracted clinical guidelines. Participants were randomized to GutGPT and an interactive dashboard, or the interactive dashboard and a search engine. Surveys and educational assessments taken before and after measured technology acceptance and content mastery. Preliminary results showed mixed effects on acceptance after using GutGPT compared to the dashboard or search engine but appeared to improve content mastery based on simulation performance. Overall, this study demonstrates LLMs like GutGPT could enhance effective AI-CDSS if implemented optimally and paired with interactive interfaces.

Submitted 6 December, 2023; originally announced December 2023.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10, 2023, New Orleans, United States, 11 pages

arXiv:2312.06775 [pdf, other]

Magnetorotational instability in eccentric disks under vertical gravity

Authors: Chi-Ho Chan, Tsvi Piran, Julian H. Krolik

Abstract: Previously we demonstrated that the magnetorotational instability (MRI) grows vigorously in eccentric disks, much as it does in circular disks, and we investigated the nonlinear development of the eccentric MRI without vertical gravity. Here we explore how vertical gravity influences the magnetohydrodynamic (MHD) turbulence stirred by the eccentric MRI. Similar to eccentric disks without vertical gravity, the ratio of Maxwell stress to pressure, or the Shakura--Sunyaev alpha parameter, remains ~0.01, and the local sign flip in the Maxwell stress persists. Vertical gravity also introduces two new effects. Strong vertical compression near pericenter amplifies reconnection and dissipation, weakening the magnetic field. Angular momentum transport by MHD stresses broadens the mass distribution over eccentricity at much faster rates than without vertical gravity; as a result, spatial distributions of mass and eccentricity can be substantially modified in just ~5 to 10 orbits. MHD stresses in the eccentric debris of tidal disruption events may power emission $\gtrsim$1 yr after disruption.

Submitted 11 December, 2023; originally announced December 2023.

Comments: 14 pages, 10 figures, 4 appendices, submitted to ApJ

arXiv:2312.05849 [pdf, other]

InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

Authors: Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan, Yap-Peng Tan, Weipeng Hu

Abstract: Large-scale text-to-image (T2I) diffusion models have showcased incredible capabilities in generating coherent images based on textual descriptions, enabling vast applications in content generation. While recent advancements have introduced control over factors such as object localization, posture, and image contours, a crucial gap remains in our ability to control the interactions between objects in the generated content. Well-controlling interactions in generated images could yield meaningful applications, such as creating realistic scenes with interacting characters. In this work, we study the problems of conditioning T2I diffusion models with Human-Object Interaction (HOI) information, consisting of a triplet label (person, action, object) and corresponding bounding boxes. We propose a pluggable interaction control model, called InteractDiffusion, that extends existing pre-trained T2I diffusion models to enable them to be better conditioned on interactions. Specifically, we tokenize the HOI information and learn their relationships via interaction embeddings. A conditioning self-attention layer is trained to map HOI tokens to visual tokens, thereby conditioning the visual tokens better in existing T2I diffusion models. Our model attains the ability to control the interaction and location on existing T2I diffusion models, which outperforms existing baselines by a large margin in HOI detection score, as well as fidelity in FID and KID. Project page: https://jiuntian.github.io/interactdiffusion.

Submitted 26 February, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

Comments: Website: https://jiuntian.github.io/interactdiffusion. Accepted at CVPR2024

arXiv:2312.05811 [pdf]

Liposomic lubricants suppress shear-stress induced inflammatory gene regulation in the joint in vivo

Authors: Linyi Zhu, Weifeng Lin, Monika Kluzek, Jadwiga Miotla-Zarebska, Vicky Batchelor, Matthew Gardiner, Chris Chan, Peter Culmer, Anastasios Chanalaris, Ronit Goldberg, Jacob Klein, Tonia L. Vincent

Abstract: Osteoarthritis (OA) is a widespread, debilitating joint disease associated with articular cartilage degradation. It is driven via mechano-inflammatory catabolic pathways, presumed up-regulated due to increased shear stress on the cartilage-embedded chondrocytes, that lead to tissue degeneration. Here we demonstrate that the up-regulation of the matrix metalloproteinase 3 (Mmp3) and interleukin-1beta (Il1b) genes upon surgical joint destabilization in a model of murine OA is completely suppressed when lipid-based lubricants are injected into the joints. At the same time, Timp1, a compression but not shear-stress sensitive gene, is unaffected by lubricant. Our results provide direct evidence that biolubrication couples to catabolic gene regulation in OA, shed strong light on the nature of the chondrocytes' response to shear stress, and have clear implications for novel OA treatments.

Submitted 12 December, 2023; v1 submitted 10 December, 2023; originally announced December 2023.

arXiv:2312.04142 [pdf, other]

TimeDRL: Disentangled Representation Learning for Multivariate Time-Series

Authors: Ching Chang, Chiao-Tung Chan, Wei-Yao Wang, Wen-Chih Peng, Tien-Fu Chen

Abstract: Multivariate time-series data in numerous real-world applications (e.g., healthcare and industry) are informative but challenging due to the lack of labels and high dimensionality. Recent studies in self-supervised learning have shown their potential in learning rich representations without relying on labels, yet they fall short in learning disentangled embeddings and addressing issues of inductive bias (e.g., transformation-invariance). To tackle these challenges, we propose TimeDRL, a generic multivariate time-series representation learning framework with disentangled dual-level embeddings. TimeDRL is characterized by three novel features: (i) disentangled derivation of timestamp-level and instance-level embeddings from patched time-series data using a [CLS] token strategy; (ii) utilization of timestamp-predictive and instance-contrastive tasks for disentangled representation learning, with the former optimizing timestamp-level embeddings with predictive loss, and the latter optimizing instance-level embeddings with contrastive loss; and (iii) avoidance of augmentation methods to eliminate inductive biases, such as transformation-invariance from cropping and masking. Comprehensive experiments on 6 time-series forecasting datasets and 5 time-series classification datasets have shown that TimeDRL consistently surpasses existing representation learning approaches, achieving an average improvement of forecasting by 58.02% in MSE and classification by 1.48% in accuracy. Furthermore, extensive ablation studies confirmed the relative contribution of each component in TimeDRL's architecture, and semi-supervised learning evaluations demonstrated its effectiveness in real-world scenarios, even with limited labeled data. The code is available at https://github.com/blacksnail789521/TimeDRL.

Submitted 13 March, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

Comments: This paper has been accepted by the International Conference on Data Engineering (ICDE) 2024

arXiv:2312.03771 [pdf, other]

DreamInpainter: Text-Guided Subject-Driven Image Inpainting with Diffusion Models

Authors: Shaoan Xie, Yang Zhao, Zhisheng Xiao, Kelvin C. K. Chan, Yandong Li, Yanwu Xu, Kun Zhang, Tingbo Hou

Abstract: This study introduces Text-Guided Subject-Driven Image Inpainting, a novel task that combines text and exemplar images for image inpainting. While both text and exemplar images have been used independently in previous efforts, their combined utilization remains unexplored. Simultaneously accommodating both conditions poses a significant challenge due to the inherent balance required between editability and subject fidelity. To tackle this challenge, we propose a two-step approach, DreamInpainter. First, we compute dense subject features to ensure accurate subject replication. Then, we employ a discriminative token selection module to eliminate redundant subject details, preserving the subject's identity while allowing changes according to other conditions such as mask shape and text prompts. Additionally, we introduce a decoupling regularization technique to enhance text control in the presence of exemplar images. Our extensive experiments demonstrate the superior performance of our method in terms of visual quality, identity preservation, and text control, showcasing its effectiveness in the context of text-guided subject-driven image inpainting.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2312.03505 [pdf, other]

Metadata for the Flux Density Calibration of the April 2018 Event Horizon Telescope Data

Authors: J. Y. Koay, C. Romero-Cañizales, L. D. Matthews, M. Janssen, L. Blackburn, R. P. J. Tilanus, J. Park, K. Asada, S. Matsushita, A. -K. Baczko, N. La Bella, C. -K. Chan, G. B. Crew, V. Fish, N. Patel, V. Ramakrishnan, H. Rottmann, J. Wagner, K. Wiik, P. Friberg, C. Goddi, S. Issaoun, G. Keating, J. Kim, T. P. Krichbaum, et al. (7 additional authors not shown)

Abstract: The Event Horizon Telescope (EHT) observations carried out in 2018 April at 1.3 mm wavelengths included 9 stations in the array, comprising 7 single-dish telescopes and 2 phased arrays. The metadata package for the 2018 EHT observing campaign contains calibration tables required for the a-priori amplitude calibration of the 2018 April visibility data. This memo is the official documentation accompanying the release of the 2018 EHT metadata package, providing an overview of the contents of the package. We describe how telescope sensitivities, gain curves and other relevant parameters for each station in the EHT array were collected, processed, and validated to produce the calibration tables.

Submitted 6 December, 2023; originally announced December 2023.

Comments: 26 pages, 7 figures, EHT Memo Series 2023-L1-01

arXiv:2312.03419 [pdf, other]

Synthesizing Physical Backdoor Datasets: An Automated Framework Leveraging Deep Generative Models

Authors: Sze Jue Yang, Chinh D. La, Quang H. Nguyen, Kok-Seng Wong, Anh Tuan Tran, Chee Seng Chan, Khoa D. Doan

Abstract: Backdoor attacks, representing an emerging threat to the integrity of deep neural networks, have garnered significant attention due to their ability to compromise deep learning systems clandestinely. While numerous backdoor attacks occur within the digital realm, their practical implementation in real-world prediction systems remains limited and vulnerable to disturbances in the physical world. Consequently, this limitation has given rise to the development of physical backdoor attacks, where trigger objects manifest as physical entities within the real world. However, creating the requisite dataset to train or evaluate a physical backdoor model is a daunting task, limiting the backdoor researchers and practitioners from studying such physical attack scenarios. This paper unleashes a recipe that empowers backdoor researchers to effortlessly create a malicious, physical backdoor dataset based on advances in generative modeling. Particularly, this recipe involves 3 automatic modules: suggesting the suitable physical triggers, generating the poisoned candidate samples (either by synthesizing new samples or editing existing clean samples), and finally refining for the most plausible ones. As such, it effectively mitigates the perceived complexity associated with creating a physical backdoor dataset, transforming it from a daunting task into an attainable objective. Extensive experiment results show that datasets created by our "recipe" enable adversaries to achieve an impressive attack success rate on real physical world data and exhibit similar properties compared to previous physical backdoor attack studies. This paper offers researchers a valuable toolkit for studies of physical backdoors, all within the confines of their laboratories.

Submitted 15 March, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

arXiv:2312.03284 [pdf]

Adaptive Multi-band Modulation for Robust and Low-complexity Faster-than-Nyquist Non-Orthogonal FDM IM-DD System

Authors: Peiji Song, Zhouyi Hu, Yizhan Dai, Yuan Liu, Chao Gao, Chun-Kit Chan

Abstract: Faster-than-Nyquist non-orthogonal frequency-division multiplexing (FTN-NOFDM) is robust against the steep frequency roll-off by saving signal bandwidth. Among the FTN-NOFDM techniques, the non-orthogonal matrix precoding (NOM-p) based FTN has high compatibility with the conventional orthogonal frequency division multiplexing (OFDM), in terms of the advanced digital signal processing already used in OFDM. In this work, by dividing the single band into multiple sub-bands in the NOM-p-based FTN-NOFDM system, we propose a novel FTN-NOFDM scheme with adaptive multi-band modulation. The proposed scheme assigns different quadrature amplitude modulation (QAM) levels to different sub-bands, effectively utilizing the low-pass-like channel and reducing the complexity. The impacts of sub-band number and bandwidth compression factor on the bit-error-rate (BER) performance and implementation complexity are experimentally analyzed with a 32.23-Gb/s and 20-km intensity modulation-direct detection (IM-DD) optical transmission system. Results show that the proposed scheme with proper sub-band numbers can lower BER and greatly reduce the complexity compared to the conventional single-band way.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2312.01734 [pdf, other]

Effective Adapter for Face Recognition in the Wild

Authors: Yunhao Liu, Yu-Ju Tsai, Kelvin C. K. Chan, Xiangtai Li, Lu Qi, Ming-Hsuan Yang

Abstract: In this paper, we tackle the challenge of face recognition in the wild, where images often suffer from low quality and real-world distortions. Traditional heuristic approaches, either training models directly on these degraded images or on their enhanced counterparts using face restoration techniques, have proven ineffective, primarily due to the degradation of facial features and the discrepancy in image domains. To overcome these issues, we propose an effective adapter for augmenting existing face recognition models trained on high-quality facial datasets. The key of our adapter is to process both the unrefined and enhanced images using two similar structures, one fixed and the other trainable. Such a design confers two benefits. First, the dual-input system minimizes the domain gap while providing varied perspectives for the face recognition model, where the enhanced image can be regarded as a complex non-linear transformation of the original one by the restoration model. Second, both similar structures can be initialized by the pre-trained models without dropping the past knowledge. The extensive experiments in zero-shot settings show the effectiveness of our method by surpassing baselines by about 3%, 4%, and 7% on three datasets. Our code will be publicly available.

Submitted 3 April, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

arXiv:2312.01677

Multi-task Image Restoration Guided By Robust DINO Features

Authors: Xin Lin, Chao Ren, Kelvin C. K. Chan, Lu Qi, Jinshan Pan, Ming-Hsuan Yang

Abstract: Multi-task image restoration has gained significant interest due to its inherent versatility and efficiency compared to its single-task counterpart. Despite its potential, performance degradation is observed with an increase in the number of tasks, primarily attributed to the distinct nature of each restoration task. Addressing this challenge, we introduce DINO-IR, a novel multi-task image restoration approach leveraging robust features extracted from DINOv2. Our empirical analysis shows that while shallow features of DINOv2 capture rich low-level image characteristics, the deep features ensure a robust semantic representation insensitive to degradations while preserving high-frequency contour details. Building on these features, we devise specialized components, including multi-layer semantic fusion module, DINO-Restore adaption and fusion module, and DINO perception contrastive loss, to integrate DINOv2 features into the restoration paradigm. Equipped with the aforementioned components, our DINO-IR performs favorably against existing multi-task image restoration approaches in various tasks by a large margin, indicating the superiority and necessity of reinforcing the robust features for multi-task image restoration.

Submitted 5 December, 2023; v1 submitted 4 December, 2023; originally announced December 2023.

Comments: Some important information need to add

arXiv:2311.12611 [pdf, other]

Reconstructing the Baryonic Acoustic Oscillations in the presence of photo-$z$ uncertainties

Authors: Kwan Chuen Chan, Guoyuan Lu, Xin Wang

Abstract: The reconstruction method has been widely employed to improve the Baryon Acoustic Oscillations (BAO) measurement in spectroscopic survey data analysis. In this study, we explore the reconstruction of the BAO signals in the realm of photometric data. By adapting the Zel'dovich reconstruction technique, we develop a formalism to reconstruct the transverse BAO in the presence of photo-$z$ uncertainties under the plane-parallel approximation. We assess the performance of the BAO reconstruction through comoving $N$-body simulations. The transverse reconstruction potential can be derived by solving a 2D potential equation, with the surface density and the radial potential contribution acting as the source terms. The solution is predominantly determined by the surface density. As is evident in dense samples, such as the matter field, the transverse BAO reconstruction can enhance both the strength of the BAO signals and their cross correlation with the initial conditions. At $z=0$, the cross correlation is increased by a factor of 1.2 at $ k_\perp = 0.2 \, \mathrm{Mpc}^{-1}h $ and 1.4 at $ k_\perp = 0.3 \, \mathrm{Mpc}^{-1}h $, respectively. We contrast the 2D potential results with the 3D Poisson equation solution, wherein we directly solve the potential equation using the position in photo-$z$ space, and find good agreement. Additionally, we examine the impact of various conditions, such as the smoothing scales and the level of photo-$z$ uncertainties, on the reconstruction results. We envision the straightforward application of this method to survey data.

Submitted 13 March, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

Comments: 15 pages, 13 figures, matched to the published version

arXiv:2311.08360 [pdf, other]

The Transient Nature of Emergent In-Context Learning in Transformers

Authors: Aaditya K. Singh, Stephanie C. Y. Chan, Ted Moskovitz, Erin Grant, Andrew M. Saxe, Felix Hill

Abstract: Transformer neural networks can exhibit a surprising capacity for in-context learning (ICL) despite not being explicitly trained for it. Prior work has provided a deeper understanding of how ICL emerges in transformers, e.g. through the lens of mechanistic interpretability, Bayesian inference, or by examining the distributional properties of training data. However, in each of these cases, ICL is treated largely as a persistent phenomenon; namely, once ICL emerges, it is assumed to persist asymptotically. Here, we show that the emergence of ICL during transformer training is, in fact, often transient. We train transformers on synthetic data designed so that both ICL and in-weights learning (IWL) strategies can lead to correct predictions. We find that ICL first emerges, then disappears and gives way to IWL, all while the training loss decreases, indicating an asymptotic preference for IWL. The transient nature of ICL is observed in transformers across a range of model sizes and datasets, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models. We find that L2 regularization may offer a path to more persistent ICL that removes the need for early stopping based on ICL-style validation tasks. Finally, we present initial evidence that ICL transience may be caused by competition between ICL and IWL circuits.

Submitted 11 December, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

Comments: 19 pages, 16 figures

arXiv:2311.06416 [pdf, other]

TESLA-X: An effective method to search for sub-threshold lensed gravitational waves with a targeted population model

Authors: Alvin K. Y. Li, Juno C. L. Chan, Heather Fong, Aidan H. Y. Chong, Alan J. Weinstein, Jose M. Ezquiaga

Abstract: Strong gravitational lensing can produce copies of gravitational-wave signals from the same source with the same waveform morphologies but different amplitudes and arrival times. Some of these strongly-lensed gravitational-wave signals can be demagnified and become sub-threshold. We present TESLA-X, an enhanced approach to the original GstLAL-based TargetEd Subthreshold Lensing seArch (TESLA) method, for improving the detection efficiency of these potential sub-threshold lensed signals. TESLA-X utilizes lensed injections to generate a targeted population model and a targeted template bank. We compare the performance of a full template bank search, TESLA, and TESLA-X methods via a simulation campaign, and demonstrate the performance of TESLA-X in recovering lensed injections, particularly targeting a mock event. Our results show that the TESLA-X method achieves a maximum of $\sim 10\%$ higher search sensitivity compared to the TESLA method within the sub-threshold regime, presenting a step towards detecting the first lensed gravitational wave. TESLA-X will be employed for the LIGO-Virgo-KAGRA's collaboration-wide analysis to search for lensing signatures in the fourth observing run.

Submitted 4 June, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

arXiv:2311.04871 [pdf, other]

Integration of Summary Information from External Studies for Semiparametric Models

Authors: Jianxuan Zang, K. C. G. Chan, Fei Gao

Abstract: With the development of biomedical science, researchers have increasing access to an abundance of studies focusing on similar research questions. There is a growing interest in the integration of summary information from those studies to enhance the efficiency of estimation in their own internal studies. In this work, we present a comprehensive framework on integration of summary information from external studies when the data are modeled by semiparametric models. Our novel framework offers straightforward estimators that update conventional estimations with auxiliary information. It addresses computational challenges by capitalizing on the intricate mathematical structure inherent to the problem. We demonstrate the conditions when the proposed estimators are theoretically more efficient than the initial estimate based solely on internal data. Several special cases such as proportional hazards model in survival analysis are provided with numerical examples.

Submitted 9 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

arXiv:2311.04388 [pdf, other]

The $230$ GHz Variability of Numerical Models of Sagittarius A* I. Parameter Surveys on Varying Ion-electron Temperature Ratios Under Strongly Magnetized Conditions

Authors: Ho-Sang Chan, Chi-kwan Chan, Ben S. Prather, George N. Wong, Charles Gammie

Abstract: The $230$ GHz lightcurves of Sagittarius A* (Sgr A*) predicted by general relativistic magnetohydrodynamics (GRMHD) and ray-tracing (GRRT) models in Event Horizon Telescope Collaboration et al. (2022) have higher variability $M_{\Delta T}$ compared to observations. In this series of papers, we explore the origin of such large brightness variability. In this first paper, we performed large GRRT parameter surveys that span from the optically thin to the optically thick regimes, covering the ion-electron temperature ratio under strongly magnetized conditions, $R_{\rm Low}$, from $1$ to $60$. We find that increasing $R_{\rm Low}$ can lead to either an increase or a reduction in $M_{\Delta T}$ depending on other model parameters, making it consistent with the observed variability of Sgr A* in some cases. Our analysis of GRRT image snapshots finds that the major contribution to the large $M_{\Delta T}$ for the $R_{\rm Low} = 1$ models comes from the photon rings. However, secondary contributions from the accretion flow are also visible depending on the spin parameter. Our work demonstrates the importance of the electron temperature used for modelling radiatively inefficient accretion flows and places new constraints on the ion-electron temperature ratio. A more in-depth analysis for understanding the dependencies of $M_{\Delta T}$ on $R_{\rm Low}$ will be performed in subsequent papers.

Submitted 4 February, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

Comments: 15 Pages, 9 Figures

arXiv:2311.04044 [pdf, other]

PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models

Authors: Haoran Li, Dadi Guo, Donghao Li, Wei Fan, Qi Hu, Xin Liu, Chunkit Chan, Duanyi Yao, Yuan Yao, Yangqiu Song

Abstract: The rapid development of language models (LMs) brings unprecedented accessibility and usage for both models and users. On the one hand, powerful LMs achieve state-of-the-art performance over numerous downstream NLP tasks. On the other hand, more and more attention is paid to unrestricted model accesses that may bring malicious privacy risks of data leakage. To address these issues, many recent works propose privacy-preserving language models (PPLMs) with differential privacy (DP). Unfortunately, different DP implementations make it challenging for a fair comparison among existing PPLMs. In this paper, we present PrivLM-Bench, a multi-perspective privacy evaluation benchmark to empirically and intuitively quantify the privacy leakage of LMs. Instead of only reporting DP parameters, PrivLM-Bench sheds light on the neglected inference data privacy during actual usage. PrivLM-Bench first clearly defines multi-faceted privacy objectives. Then, PrivLM-Bench constructs a unified pipeline to perform private fine-tuning. Lastly, PrivLM-Bench performs existing privacy attacks on LMs with pre-defined privacy objectives as the empirical evaluation results. The empirical attack results are used to fairly and intuitively evaluate the privacy leakage of various PPLMs. We conduct extensive experiments on three datasets of GLUE for mainstream LMs.

Submitted 1 June, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

Comments: To appear at ACL 2024

arXiv:2311.00899 [pdf, other]

RoboVQA: Multimodal Long-Horizon Reasoning for Robotics

Authors: Pierre Sermanet, Tianli Ding, Jeffrey Zhao, Fei Xia, Debidatta Dwibedi, Keerthana Gopalakrishnan, Christine Chan, Gabriel Dulac-Arnold, Sharath Maddineni, Nikhil J Joshi, Pete Florence, Wei Han, Robert Baruch, Yao Lu, Suvir Mirchandani, Peng Xu, Pannag Sanketi, Karol Hausman, Izhak Shafran, Brian Ichter, Yuan Cao

Abstract: We present a scalable, bottom-up and intrinsically diverse data collection scheme that can be used for high-level reasoning with long and medium horizons and that has 2.2x higher throughput compared to traditional narrow top-down step-by-step collection. We collect realistic data by performing any user requests within the entirety of 3 office buildings and using multiple robot and human embodiments. With this data, we show that models trained on all embodiments perform better than ones trained on the robot data only, even when evaluated solely on robot episodes. We find that for a fixed collection budget it is beneficial to take advantage of cheaper human collection along with robot collection. We release a large and highly diverse (29,520 unique instructions) dataset dubbed RoboVQA containing 829,502 (video, text) pairs for robotics-focused visual question answering. We also demonstrate how evaluating real robot experiments with an intervention mechanism enables performing tasks to completion, making it deployable with human oversight even if imperfect while also providing a single performance metric. We demonstrate a single video-conditioned model named RoboVQA-VideoCoCa trained on our dataset that is capable of performing a variety of grounded high-level reasoning tasks in broad realistic settings with a cognitive intervention rate 46% lower than the zero-shot state of the art visual language model (VLM) baseline and is able to guide real robots through long-horizon tasks. The performance gap with zero-shot state-of-the-art models indicates that a lot of grounded data remains to be collected for real-world deployment, emphasizing the critical need for scalable data collection approaches. Finally, we show that video VLMs significantly outperform single-image VLMs with an average error rate reduction of 19% across all VQA tasks. Data and videos available at https://robovqa.github.io

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2311.00210 [pdf, other]

Broken Adaptive Ridge Method for Variable Selection in Generalized Partly Linear Models with Application to the Coronary Artery Disease Data

Authors: Christian Chan, Xiaotian Dai, Thierry Chekouo, Quan Long, Xuewen Lu

Abstract: Motivated by the CATHGEN data, we develop a new statistical learning method for simultaneous variable selection and parameter estimation under the context of generalized partly linear models for data with high-dimensional covariates. The method is referred to as the broken adaptive ridge (BAR) estimator, which is an approximation of the $L_0$-penalized regression by iteratively performing reweighted squared $L_2$-penalized regression. The generalized partly linear model extends the generalized linear model by including a non-parametric component to construct a flexible model for modeling various types of covariate effects. We employ the Bernstein polynomials as the sieve space to approximate the non-parametric functions so that our method can be implemented easily using the existing R packages. Extensive simulation studies suggest that the proposed method performs better than other commonly used penalty-based variable selection methods. We apply the method to the CATHGEN data with a binary response from a coronary artery disease study, which motivated our research, and obtained new findings in both high-dimensional genetic and low-dimensional non-genetic covariates.

Submitted 31 October, 2023; originally announced November 2023.
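
Note: the iteratively reweighted ridge step that the abstract above describes can be sketched as follows. This is a generic illustration of a BAR-type iteration, not the paper's exact formulation: the notation ($\ell_n$ for the log-likelihood, $\xi_n$ and $\lambda_n$ for tuning parameters, $p$ for the number of penalized coefficients) and the scaling of the likelihood term are assumptions.

```latex
% Assumed notation: \ell_n is the log-likelihood, \xi_n and \lambda_n are tuning parameters.
% Step 0: an ordinary ridge fit provides a starting value with no exact zeros.
\hat{\beta}^{(0)} = \arg\min_{\beta} \Big\{ -2\,\ell_n(\beta) + \xi_n \sum_{j=1}^{p} \beta_j^{2} \Big\}

% Step k: a ridge fit whose weights come from the previous iterate.
\hat{\beta}^{(k)} = \arg\min_{\beta} \Big\{ -2\,\ell_n(\beta)
    + \lambda_n \sum_{j=1}^{p} \frac{\beta_j^{2}}{\big(\hat{\beta}_j^{(k-1)}\big)^{2}} \Big\},
    \qquad k = 1, 2, \dots

% At a fixed point, each nonzero coefficient contributes \lambda_n \cdot 1 to the penalty,
% so the penalty behaves like \lambda_n \|\beta\|_0, the L_0 penalty being approximated.
```

Coefficients whose previous estimate is close to zero receive a very large weight and are driven further toward zero, which is how a sequence of $L_2$ fits can mimic $L_0$ selection.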

arXiv:2310.18204 [pdf, other]

doi 10.1038/s41586-020-03058-x

Competing magnetic orders in a bilayer Hubbard model with ultracold atoms

Authors: Marcell Gall, Nicola Wurz, Jens Samland, Chun Fai Chan, Michael Köhl

Abstract: Fermionic atoms in optical lattices have served as a compelling model system to study and emulate the physics of strongly-correlated matter. Driven by the advances of high-resolution microscopy, the recent focus of research has been on two-dimensional systems in which several quantum phases, such as anti-ferromagnetic Mott insulators for repulsive interactions and charge-density waves for attractive interactions, have been observed. However, the aspired emulations of real materials, such as bilayer graphene, have to take into account that their lattice structure is composed of coupled layers and therefore is not strictly two-dimensional. In this work, we realize a bilayer Fermi-Hubbard model using ultracold atoms in an optical lattice and demonstrate that the interlayer coupling controls a crossover between a planar anti-ferromagnetically ordered Mott insulator and a band insulator of spin-singlets along the bonds between the layers. Our work will enable the exploration of further fascinating properties of coupled-layer Hubbard models, such as theoretically predicted superconducting pairing mechanisms.

Submitted 27 October, 2023; originally announced October 2023.

Journal ref: Nature 589, 40 (2021)

arXiv:2310.14146 [pdf, other]

Cocaine Use Prediction with Tensor-based Machine Learning on Multimodal MRI Connectome Data

Authors: Anru R. Zhang, Ryan P. Bell, Chen An, Runshi Tang, Shana A. Hall, Cliburn Chan, Kareem Al-Khalil, Christina S. Meade

Abstract: This paper considers the use of machine learning algorithms for predicting cocaine use based on magnetic resonance imaging (MRI) connectomic data. The study utilized functional MRI (fMRI) and diffusion MRI (dMRI) data collected from 275 individuals, which was then parcellated into 246 regions of interest (ROIs) using the Brainnetome atlas. After data preprocessing, the datasets were transformed into tensor form. We developed a tensor-based unsupervised machine learning algorithm to reduce the size of the data tensor from $275$ (individuals) $\times 2$ (fMRI and dMRI) $\times 246$ (ROIs) $\times 246$ (ROIs) to $275$ (individuals) $\times 2$ (fMRI and dMRI) $\times 6$ (clusters) $\times 6$ (clusters). This was achieved by applying the high-order Lloyd algorithm to group the ROI data into 6 clusters. Features were extracted from the reduced tensor and combined with demographic features (age, gender, race, and HIV status). The resulting dataset was used to train a Catboost model using subsampling and nested cross-validation techniques, which achieved a prediction accuracy of 0.857 for identifying cocaine users. The model was also compared with other models, and the feature importance of the model was presented. Overall, this study highlights the potential for using tensor-based machine learning algorithms to predict cocaine use based on MRI connectomic data and presents a promising approach for identifying individuals at risk of substance abuse.

Submitted 21 October, 2023; originally announced October 2023.

arXiv:2310.12874 [pdf, other]

StoryAnalogy: Deriving Story-level Analogies from Large Language Models to Unlock Analogical Understanding

Authors: Cheng Jiayang, Lin Qiu, Tsz Ho Chan, Tianqing Fang, Weiqi Wang, Chunkit Chan, Dongyu Ru, Qipeng Guo, Hongming Zhang, Yangqiu Song, Yue Zhang, Zheng Zhang

Abstract: Analogy-making between narratives is crucial for human reasoning. In this paper, we evaluate the ability to identify and generate analogies by constructing a first-of-its-kind large-scale story-level analogy corpus, StoryAnalogy, which contains 24K story pairs from diverse domains with human annotations on two similarities from the extended Structure-Mapping Theory. We design a set of tests on StoryAnalogy, presenting the first evaluation of story-level analogy identification and generation. Interestingly, we find that the analogy identification tasks are incredibly difficult not only for sentence embedding models but also for the recent large language models (LLMs) such as ChatGPT and LLaMa. ChatGPT, for example, only achieved around 30% accuracy in multiple-choice questions (compared to over 85% accuracy for humans). Furthermore, we observe that the data in StoryAnalogy can improve the quality of analogy generation in LLMs, where a fine-tuned FlanT5-xxl model achieves comparable performance to zero-shot ChatGPT.

Submitted 23 October, 2023; v1 submitted 19 October, 2023; originally announced October 2023.

Comments: Accepted by EMNLP 2023 main conference

arXiv:2310.10383 [pdf, other]

Privacy in Large Language Models: Attacks, Defenses and Future Directions

Authors: Haoran Li, Yulin Chen, Jinglong Luo, Yan Kang, Xiaojin Zhang, Qi Hu, Chunkit Chan, Yangqiu Song

Abstract: The advancement of large language models (LLMs) has significantly enhanced the ability to effectively tackle various downstream NLP tasks and unify these tasks into generative pipelines. On the one hand, powerful language models, trained on massive textual data, have brought unparalleled accessibility and usability for both models and users. On the other hand, unrestricted access to these models can also introduce potential malicious and unintentional privacy risks. Despite ongoing efforts to address the safety and privacy concerns associated with LLMs, the problem remains unresolved. In this paper, we provide a comprehensive analysis of the current privacy attacks targeting LLMs and categorize them according to the adversary's assumed capabilities to shed light on the potential vulnerabilities present in LLMs. Then, we present a detailed overview of prominent defense strategies that have been developed to counter these privacy attacks. Beyond existing works, we identify upcoming privacy concerns as LLMs evolve. Lastly, we point out several potential avenues for future exploration.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.08864 [pdf, other]

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.

Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: Project website: https://robotics-transformer-x.github.io

arXiv:2310.06662 [pdf, other]

doi 10.1038/s41467-024-45531-5

Chiral Active Particles are Sensitive Reporter to Environmental Geometry

Authors: Chung Wing Chan, Daihui Wu, Kaiyao Qiao, Kin Long Fong, Zhiyu Yang, Yilong Han, Rui Zhang

Abstract: Chiral active particles (CAPs) are self-propelling particles that break time-reversal symmetry by orbiting or spinning, leading to intriguing behaviors. Here, we examined the dynamics of CAPs moving in 2D lattices of disk obstacles through active Brownian dynamics simulations and granular experiments with grass seeds. We find that the effective diffusivity of the CAPs is sensitive to the structure of the obstacle lattice, a feature absent in achiral active particles. We further studied the transport of CAPs in obstacle arrays under an external field and found a reentrant directional locking effect, which can be used to sort CAPs with different activities. Finally, we demonstrated that the parallelogram lattice of obstacles without mirror symmetry can separate clockwise and counter-clockwise CAPs. The mechanisms of the above three novel phenomena are qualitatively explained. As such, our work provides a basis for designing chirality-based tools for single-cell diagnosis and separation, and active particle-based environmental sensors.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.02639 [pdf, ps, other]

doi 10.1140/epjc/s10052-023-12346-5

On bilinear superintegrability for monomial matrix models in pure phase

Authors: C. -T. Chan, V. Mishnyakov, A. Popolitov, K. Tsybikov

Abstract: We argue that the recently discovered bilinear superintegrability arXiv:2206.02045 generalizes, in a non-trivial way, to monomial matrix models in pure phase. The structure is much richer: for the trivial core Schur functions required modifications are minor, and the only new ingredient is a certain (contour-dependent) permutation matrix; for non-trivial-core Schur functions, in both bi-linear and tri-linear averages the deformation is more complicated: averages acquire extra N-dependent factors and selection rule is less straightforward to imply.

Submitted 4 October, 2023; originally announced October 2023.

Journal ref: EPJC 83, 1145 (2023)

arXiv:2310.02321 [pdf, other]

doi 10.1103/PhysRevD.108.084004

Not all spacetime coordinates for general-relativistic ray tracing are created equal

Authors: Gabriele Bozzola, Chi-kwan Chan, Vasileios Paschalidis

Abstract: Models for the observational appearance of astrophysical black holes rely critically on accurate general-relativistic ray tracing and radiation transport to compute the intensity measured by a distant observer. In this paper, we illustrate how the choice of coordinates and initial conditions affect this process. In particular, we show that propagating rays from the camera to the source leads to different solutions if the spatial part of the momentum of the photon points towards the horizon or away from it. In doing this, we also show that coordinates that are well suited for numerical General-Relativistic MagnetoHydroDynamic (GRMHD) simulations are typically not optimal for generic ray tracing. We discuss the implications for black-hole images and show that radiation transport in optimal and non-optimal spacetime coordinates lead to the same images up to numerical errors and algorithmic choices.

Submitted 3 October, 2023; originally announced October 2023.

Comments: 10 pages, 6 figures, matches PRD version

Journal ref: PRD 108, 8, 084004 (2023)

arXiv:2309.16185 [pdf, other]

doi 10.1103/PhysRevLett.132.044001

Generation of Spatiotemporal Vortex Pulses by Simple Diffractive Grating

Authors: Zhiyuan Che, Wenzhe Liu, Lei Shi, C. T. Chan, Jian Zi

Abstract: Spatiotemporal vortex pulses are wave packets that carry transverse orbital angular momentum, exhibiting exotic structured wavefronts that can twist through space and time. Existing methods to generate these pulses require complex setups like spatial light modulators or computer-optimized structures. Here, we demonstrate a new approach to generate spatiotemporal vortex pulses using just a simple diffractive grating. The key is constructing a phase vortex in frequency-momentum space by leveraging symmetry, resonance, and diffraction. Our approach is applicable to any wave system. We use a liquid surface wave platform to directly demonstrate and observe the real-time generation and evolution of spatiotemporal vortex pulses. This straightforward technique provides opportunities to explore pulse dynamics and potential applications across different disciplines.

Submitted 29 September, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

arXiv:2309.14022 [pdf, other]

Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time

Authors: Cheng-Hung Chan, Cheng-Yang Yuan, Cheng Sun, Hwann-Tzong Chen

Abstract: We present a video decomposition method that facilitates layer-based editing of videos with spatiotemporally varying lighting and motion effects. Our neural model decomposes an input video into multiple layered representations, each comprising a 2D texture map, a mask for the original video, and a multiplicative residual characterizing the spatiotemporal variations in lighting conditions. A single edit on the texture maps can be propagated to the corresponding locations in the entire video frames while preserving other contents' consistencies. Our method efficiently learns the layer-based neural representations of a 1080p video in 25s per frame via coordinate hashing and allows real-time rendering of the edited result at 71 fps on a single GPU. Qualitatively, we run our method on various videos to show its effectiveness in generating high-quality editing effects. Quantitatively, we propose to adopt feature-tracking evaluation metrics for objectively assessing the consistency of video editing. Project page: https://lightbulb12294.github.io/hashing-nvd/

Submitted 25 September, 2023; originally announced September 2023.

arXiv:2309.08303 [pdf, other]

Self-Consistent Narrative Prompts on Abductive Natural Language Inference

Authors: Chunkit Chan, Xin Liu, Tsz Ho Chan, Jiayang Cheng, Yangqiu Song, Ginny Wong, Simon See

Abstract: Abduction has long been seen as crucial for narrative comprehension and reasoning about everyday situations. The abductive natural language inference ($α$NLI) task has been proposed, and this narrative text-based task aims to infer the most plausible hypothesis from the candidates given two observations. However, the inter-sentential coherence and the model consistency have not been well exploited in the previous works on this task. In this work, we propose a prompt tuning model $α$-PACE, which takes self-consistency and inter-sentential coherence into consideration. Besides, we propose a general self-consistent framework that considers various narrative sequences (e.g., linear narrative and reverse chronology) for guiding the pre-trained language model in understanding the narrative context of input. We conduct extensive experiments and thorough ablation studies to illustrate the necessity and effectiveness of $α$-PACE. The performance of our method shows significant improvement against extensive competitive baselines.

Submitted 15 September, 2023; originally announced September 2023.

Comments: Accepted at IJCNLP-AACL 2023 main track

arXiv:2309.08039 [pdf, other]

Flexible Functional Treatment Effect Estimation

Authors: Jiayi Wang, Raymond K. W. Wong, Xiaoke Zhang, Kwun Chuen Gary Chan

Abstract: We study treatment effect estimation with functional treatments where the average potential outcome functional is a function of functions, in contrast to continuous treatment effect estimation where the target is a function of real numbers. By considering a flexible scalar-on-function marginal structural model, a weight-modified kernel ridge regression (WMKRR) is adopted for estimation. The weights are constructed by directly minimizing the uniform balancing error resulting from a decomposition of the WMKRR estimator, instead of being estimated under a particular treatment selection model. Despite the complex structure of the uniform balancing error derived under WMKRR, finite-dimensional convex algorithms can be applied to efficiently solve for the proposed weights thanks to a representer theorem. The optimal convergence rate is shown to be attainable by the proposed WMKRR estimator without any smoothness assumption on the true weight function. Corresponding empirical performance is demonstrated by a simulation study and a real data application.

Submitted 14 September, 2023; originally announced September 2023.

arXiv:2309.07231 [pdf, other]

A new covariant formalism for kinetic plasma simulations in curved spacetimes

Authors: Tyler Trent, Pierre Christian, Chi-kwan Chan, Dimitrios Psaltis, Feryal Ozel

Abstract: Low density plasmas are characterized by a large scale separation between the gyromotion of particles around local magnetic fields and the macroscopic scales of the system, often making global kinetic simulations computationally intractable. The guiding center formalism has been proposed as a powerful tool to bridge the gap between these scales. Despite its usefulness, the guiding center approach has been formulated successfully only in flat spacetimes, limiting its applicability in astrophysical settings. Here, we present a new covariant formalism that leads to kinetic equations in the guiding center limit that are valid in arbitrary spacetimes. Through a variety of experiments, we demonstrate that our equations capture all known gyro-center drifts while overcoming one severe limitation imposed on numerical algorithms by the fast timescales of the particle gyromotion. This formalism will enable explorations of a variety of global plasma kinetic phenomena in the curved spacetimes around black holes and neutron stars.

Submitted 13 September, 2023; originally announced September 2023.

Comments: Accepted for publication in ApJL

arXiv:2309.07109 [pdf, ps, other]

Real-time Monitoring for the Next Core-Collapse Supernova in JUNO

Authors: Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, et al. (606 additional authors not shown)

Abstract: The core-collapse supernova (CCSN) is considered one of the most energetic astrophysical events in the universe. The early and prompt detection of neutrinos before (pre-SN) and during the supernova (SN) burst presents a unique opportunity for multi-messenger observations of CCSN events. In this study, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), a 20 kton liquid scintillator detector currently under construction in South China. The real-time monitoring system is designed to ensure both prompt alert speed and comprehensive coverage of progenitor stars. It incorporates prompt monitors on the electronic board as well as online monitors at the data acquisition stage. Assuming a false alert rate of 1 per year, this monitoring system exhibits sensitivity to pre-SN neutrinos up to a distance of approximately 1.6 (0.9) kiloparsecs and SN neutrinos up to about 370 (360) kiloparsecs for a progenitor mass of 30 solar masses, considering both normal and inverted mass ordering scenarios. The pointing ability of the CCSN is evaluated by analyzing the accumulated event anisotropy of inverse beta decay interactions from pre-SN or SN neutrinos. This, along with the early alert, can play a crucial role in facilitating follow-up multi-messenger observations of the next galactic or nearby extragalactic CCSN.

Submitted 4 December, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: 24 pages, 9 figures, accepted for the publication at JCAP

arXiv:2309.03897 [pdf, other]

ProPainter: Improving Propagation and Transformer for Video Inpainting

Authors: Shangchen Zhou, Chongyi Li, Kelvin C. K. Chan, Chen Change Loy

Abstract: Flow-based propagation and spatiotemporal Transformer are two mainstream mechanisms in video inpainting (VI). Despite the effectiveness of these components, they still suffer from some limitations that affect their performance. Previous propagation-based approaches are performed separately either in the image or feature domain. Global image propagation isolated from learning may cause spatial misalignment due to inaccurate optical flow. Moreover, memory or computational constraints limit the temporal range of feature propagation and video Transformer, preventing exploration of correspondence information from distant frames. To address these issues, we propose an improved framework, called ProPainter, which involves enhanced ProPagation and an efficient Transformer. Specifically, we introduce dual-domain propagation that combines the advantages of image and feature warping, exploiting global correspondences reliably. We also propose a mask-guided sparse video Transformer, which achieves high efficiency by discarding unnecessary and redundant tokens. With these components, ProPainter outperforms prior arts by a large margin of 1.46 dB in PSNR while maintaining appealing efficiency.

Submitted 7 September, 2023; originally announced September 2023.

Comments: Accepted by ICCV 2023. Code: https://github.com/sczhou/ProPainter

arXiv:2309.00756 [pdf, other]

Learning Risk Preferences in Markov Decision Processes: an Application to the Fourth Down Decision in Football

Authors: Nathan Sandholtz, Lucas Wu, Martin Puterman, Timothy C. Y. Chan

Abstract: For decades, National Football League (NFL) coaches' observed fourth down decisions have been largely inconsistent with prescriptions based on statistical models. In this paper, we develop a framework to explain this discrepancy using a novel inverse optimization approach. We model the fourth down decision and the subsequent sequence of plays in a game as a Markov decision process (MDP), the dynamics of which we estimate from NFL play-by-play data from the 2014 through 2022 seasons. We assume that coaches' observed decisions are optimal but that the risk preferences governing their decisions are unknown. This yields a novel inverse decision problem for which the optimality criterion, or risk measure, of the MDP is the estimand. Using the quantile function to parameterize risk, we estimate which quantile-optimal policy yields the coaches' observed decisions as minimally suboptimal. In general, we find that coaches' fourth-down behavior is consistent with optimizing low quantiles of the next-state value distribution, which corresponds to conservative risk preferences. We also find that coaches exhibit higher risk tolerances when making decisions in the opponent's half of the field than in their own, and that league average fourth down risk tolerances have increased over the seasons in our data.

Submitted 1 September, 2023; originally announced September 2023.

Comments: 33 pages, 9 figures

arXiv:2308.16684 [pdf, other]

Everyone Can Attack: Repurpose Lossy Compression as a Natural Backdoor Attack

Authors: Sze Jue Yang, Quang Nguyen, Chee Seng Chan, Khoa D. Doan

Abstract: The vulnerabilities to backdoor attacks have recently threatened the trustworthiness of machine learning models in practical applications. Conventional wisdom suggests that not everyone can be an attacker since the process of designing the trigger generation algorithm often involves significant effort and extensive experimentation to ensure the attack's stealthiness and effectiveness. Alternatively, this paper shows that there exists a more severe backdoor threat: anyone can exploit an easily-accessible algorithm for silent backdoor attacks. Specifically, this attacker can employ the widely-used lossy image compression from a plethora of compression tools to effortlessly inject a trigger pattern into an image without leaving any noticeable trace; i.e., the generated triggers are natural artifacts. One does not require extensive knowledge to click on the "convert" or "save as" button while using tools for lossy image compression. Via this attack, the adversary does not need to design a trigger generator as seen in prior works and only requires poisoning the data. Empirically, the proposed attack consistently achieves 100% attack success rate in several benchmark datasets such as MNIST, CIFAR-10, GTSRB and CelebA. More significantly, the proposed attack can still achieve almost 100% attack success rate with very small (approximately 10%) poisoning rates in the clean label setting. The generated trigger of the proposed attack using one lossy compression algorithm is also transferable across other related compression algorithms, exacerbating the severity of this backdoor threat. This work takes another crucial step toward understanding the extensive risks of backdoor attacks in practice, urging practitioners to investigate similar attacks and relevant backdoor mitigation methods.

Submitted 3 September, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

Comments: 14 pages. This paper shows everyone can mount a powerful and stealthy backdoor attack with the widely-used lossy image compression

arXiv:2308.15381 [pdf, other]

A search for pulsars around Sgr A* in the first Event Horizon Telescope dataset

Authors: Pablo Torne, Kuo Liu, Ralph P. Eatough, Jompoj Wongphechauxsorn, James M. Cordes, Gregory Desvignes, Mariafelicia De Laurentis, Michael Kramer, Scott M. Ransom, Shami Chatterjee, Robert Wharton, Ramesh Karuppusamy, Lindy Blackburn, Michael Janssen, Chi-kwan Chan, Geoffrey B. Crew, Lynn D. Matthews, Ciriaco Goddi, Helge Rottmann, Jan Wagner, Salvador Sanchez, Ignacio Ruiz, Federico Abbate, Geoffrey C. Bower, Juan J. Salamanca, et al. (261 additional authors not shown)

Abstract: The Event Horizon Telescope (EHT) observed in 2017 the supermassive black hole at the center of the Milky Way, Sagittarius A* (Sgr A*), at a frequency of 228.1 GHz ($λ$=1.3 mm). The fundamental physics tests that even a single pulsar orbiting Sgr A* would enable motivate searching for pulsars in EHT datasets. The high observing frequency means that pulsars - which typically exhibit steep emission spectra - are expected to be very faint. However, it also negates pulse scattering, an effect that could hinder pulsar detections in the Galactic Center. Additionally, magnetars or a secondary inverse Compton emission could be stronger at millimeter wavelengths than at lower frequencies. We present a search for pulsars close to Sgr A* using the data from the three most-sensitive stations in the EHT 2017 campaign: the Atacama Large Millimeter/submillimeter Array, the Large Millimeter Telescope and the IRAM 30 m Telescope. We apply three detection methods based on Fourier-domain analysis, the Fast-Folding-Algorithm and single pulse search targeting both pulsars and burst-like transient emission; using the simultaneity of the observations to confirm potential candidates. No new pulsars or significant bursts were found. Being the first pulsar search ever carried out at such high radio frequencies, we detail our analysis methods and give a detailed estimation of the sensitivity of the search. We conclude that the EHT 2017 observations are only sensitive to a small fraction ($\lesssim$2.2%) of the pulsars that may exist close to Sgr A*, motivating further searches for fainter pulsars in the region.

Submitted 29 August, 2023; originally announced August 2023.

Comments: 33 pages, 7 figures, 6 Tables. Accepted for publication in ApJ

arXiv:2308.13666 [pdf, other]

A Joint Fermi-GBM and Swift-BAT Analysis of Gravitational-Wave Candidates from the Third Gravitational-wave Observing Run

Authors: C. Fletcher, J. Wood, R. Hamburg, P. Veres, C. M. Hui, E. Bissaldi, M. S. Briggs, E. Burns, W. H. Cleveland, M. M. Giles, A. Goldstein, B. A. Hristov, D. Kocevski, S. Lesage, B. Mailyan, C. Malacaria, S. Poolakkil, A. von Kienlin, C. A. Wilson-Hodge, The Fermi Gamma-ray Burst Monitor Team, M. Crnogorčević, J. DeLaunay, A. Tohuvavohu, R. Caputo, S. B. Cenko, et al. (1674 additional authors not shown)

Abstract: We present Fermi Gamma-ray Burst Monitor (Fermi-GBM) and Swift Burst Alert Telescope (Swift-BAT) searches for gamma-ray/X-ray counterparts to gravitational wave (GW) candidate events identified during the third observing run of the Advanced LIGO and Advanced Virgo detectors. Using Fermi-GBM on-board triggers and sub-threshold gamma-ray burst (GRB) candidates found in the Fermi-GBM ground analyses, the Targeted Search and the Untargeted Search, we investigate whether there are any coincident GRBs associated with the GWs. We also search the Swift-BAT rate data around the GW times to determine whether a GRB counterpart is present. No counterparts are found. Using both the Fermi-GBM Targeted Search and the Swift-BAT search, we calculate flux upper limits and present joint upper limits on the gamma-ray luminosity of each GW. Given these limits, we constrain theoretical models for the emission of gamma-rays from binary black hole mergers.

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.10848 [pdf, other]

AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

Authors: Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chi-Min Chan, Heyang Yu, Yaxi Lu, Yi-Hsin Hung, Chen Qian, Yujia Qin, Xin Cong, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou

Abstract: Autonomous agents empowered by Large Language Models (LLMs) have undergone significant improvements, enabling them to generalize across a broad spectrum of tasks. However, in real-world scenarios, cooperation among individuals is often required to enhance the efficiency and effectiveness of task accomplishment. Hence, inspired by human group dynamics, we propose a multi-agent framework, AgentVerse, that can collaboratively and dynamically adjust its composition as a greater-than-the-sum-of-its-parts system. Our experiments demonstrate that the AgentVerse framework can effectively deploy multi-agent groups that outperform a single agent. Furthermore, we delve into the emergence of social behaviors among individual agents within a group during collaborative task accomplishment. In view of these behaviors, we discuss some possible strategies to leverage positive ones and mitigate negative ones for improving the collaborative potential of multi-agent groups. Our codes for AgentVerse will soon be released at https://github.com/OpenBMB/AgentVerse.

Submitted 23 October, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

Comments: Under review. Code at https://github.com/OpenBMB/AgentVerse/

arXiv:2308.07580 [pdf, other]

doi 10.1609/aaai.v38i20.30227

AutoLTS: Automating Cycling Stress Assessment via Contrastive Learning and Spatial Post-processing

Authors: Bo Lin, Shoshanna Saxe, Timothy C. Y. Chan

Abstract: Cycling stress assessment, which quantifies cyclists' perceived stress imposed by the built environment and motor traffic, increasingly informs cycling infrastructure planning and cycling route recommendation. However, calculating cycling stress is currently slow and data-intensive, which hinders its broader application. In this paper, we propose a deep learning framework to support accurate, fast, and large-scale cycling stress assessments for urban road networks based on street-view images. Our framework features i) a contrastive learning approach that leverages the ordinal relationship among cycling stress labels, and ii) a post-processing technique that enforces spatial smoothness in our predictions. On a dataset of 39,153 road segments collected in Toronto, Canada, our results demonstrate the effectiveness of our deep learning framework and the value of using image data for cycling stress assessment in the absence of high-quality road geometry and motor traffic data.

Submitted 15 August, 2023; originally announced August 2023.

arXiv:2308.07314 [pdf, other]

Dual Associated Encoder for Face Restoration

Authors: Yu-Ju Tsai, Yu-Lun Liu, Lu Qi, Kelvin C. K. Chan, Ming-Hsuan Yang

Abstract: Restoring facial details from low-quality (LQ) images has remained a challenging problem due to its ill-posedness induced by various degradations in the wild. The existing codebook prior mitigates the ill-posedness by leveraging an autoencoder and learned codebook of high-quality (HQ) features, achieving remarkable quality. However, existing approaches in this paradigm frequently depend on a single encoder pre-trained on HQ data for restoring HQ images, disregarding the domain gap between LQ and HQ images. As a result, the encoding of LQ inputs may be insufficient, resulting in suboptimal performance. To tackle this problem, we propose a novel dual-branch framework named DAEFR. Our method introduces an auxiliary LQ branch that extracts crucial information from the LQ inputs. Additionally, we incorporate association training to promote effective synergy between the two branches, enhancing code prediction and output quality. We evaluate the effectiveness of DAEFR on both synthetic and real-world datasets, demonstrating its superior performance in restoring facial details. Project page: https://liagm.github.io/DAEFR/

Submitted 20 January, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

Comments: ICLR 2024, Project page: https://liagm.github.io/DAEFR/

Showing 51–100 of 1,058 results for author: Chan, C