-
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
Authors:
Enshu Liu,
Junyi Zhu,
Zinan Lin,
Xuefei Ning,
Matthew B. Blaschko,
Shengen Yan,
Guohao Dai,
Huazhong Yang,
Yu Wang
Abstract:
The rapid advancement of large language models (LLMs) has led to architectures with billions to trillions of parameters, posing significant deployment challenges due to their substantial demands on memory, processing power, and energy consumption. Sparse Mixture-of-Experts (SMoE) architectures have emerged as a solution, activating only a subset of parameters per token, thereby achieving faster inference while maintaining performance. However, SMoE models still face limitations in broader deployment due to their large parameter counts and significant GPU memory requirements. In this work, we introduce a gradient-free evolutionary strategy named EEP (Efficient Expert Pruning) to enhance the pruning of experts in SMoE models. EEP relies solely on model inference (i.e., no gradient computation) and achieves greater sparsity while maintaining or even improving performance on downstream tasks. EEP can be used to reduce both the total number of experts (thus saving GPU memory) and the number of active experts (thus accelerating inference). For example, we demonstrate that pruning up to 75% of experts in Mixtral $8\times7$B-Instruct results in a substantial reduction in parameters with minimal performance loss. Remarkably, we observe improved performance on certain tasks, such as a significant increase in accuracy on the SQuAD dataset (from 53.4% to 75.4%), when pruning half of the experts. With these results, EEP not only lowers the barrier to deploying SMoE models, but also challenges the conventional understanding of model pruning by showing that fewer experts can lead to better task-specific performance without any fine-tuning. Code is available at https://github.com/imagination-research/EEP.
Submitted 30 June, 2024;
originally announced July 2024.
-
Blockchain Based Zero-Knowledge Proof of Location in IoT
Authors:
Wei Wu,
Erwu Liu,
Xinglin Gong,
Rui Wang
Abstract:
With the development of precise positioning technology, a growing number of location-based services (LBSs) facilitate people's lives. Most LBSs require proof of location (PoL) to prove that the user satisfies the service requirement, which exposes the user's privacy. In this paper, we propose a zero-knowledge proof of location (zk-PoL) protocol to better protect the user's privacy. With the zk-PoL protocol, the user can choose the necessary information to expose to the server, so that hierarchical privacy protection can be achieved. The evaluation shows that the zk-PoL has excellent security against the main attacks; moreover, its computational efficiency is independent of input parameters, and the zk-PoL is suitable for delay-tolerant LBSs.
Submitted 26 June, 2024;
originally announced June 2024.
-
Language Modeling with Editable External Knowledge
Authors:
Belinda Z. Li,
Emmy Liu,
Alexis Ross,
Abbas Zeitoun,
Graham Neubig,
Jacob Andreas
Abstract:
When the world changes, so does the text that humans write about it. How do we build language models that can be easily updated to reflect these changes? One popular approach is retrieval-augmented generation, in which new documents are inserted into a knowledge base and retrieved during prediction for downstream tasks. Most prior work on these systems has focused on improving behavior during prediction through better retrieval or reasoning. This paper introduces ERASE, which instead improves model behavior when new documents are acquired, by incrementally deleting or rewriting other entries in the knowledge base each time a document is added. In two new benchmark datasets evaluating models' ability to answer questions about a stream of news articles or conversations, ERASE improves accuracy relative to conventional retrieval-augmented generation by 7-13% (Mixtral-8x7B) and 6-10% (Llama-3-8B) absolute. Code and data are available at https://github.com/belindal/ERASE
Submitted 17 June, 2024;
originally announced June 2024.
-
Balancing Performance and Cost for Two-Hop Cooperative Communications: Stackelberg Game and Distributed Multi-Agent Reinforcement Learning
Authors:
Yuanzhe Geng,
Erwu Liu,
Wei Ni,
Rui Wang,
Yan Liu,
Hao Xu,
Chen Cai,
Abbas Jamalipour
Abstract:
This paper aims to balance performance and cost in a two-hop wireless cooperative communication network where the source and relays have contradictory optimization goals and make decisions in a distributed manner. This differs from most existing works that have typically assumed that source and relay nodes follow a schedule created implicitly by a central controller. We propose that the relays form an alliance in an attempt to maximize the benefit of relaying while the source aims to increase the channel capacity cost-effectively. To this end, we establish the trade problem as a Stackelberg game, and prove the existence of its equilibrium. Another important aspect is that we use multi-agent reinforcement learning (MARL) to approach the equilibrium in a situation where the instantaneous channel state information (CSI) is unavailable, and the source and relays do not have knowledge of each other's goal. A multi-agent deep deterministic policy gradient-based framework is designed, where the relay alliance and the source act as agents. Experiments demonstrate that the proposed method can obtain an acceptable performance that is close to the game-theoretic equilibrium for all players under time-invariant environments, which considerably outperforms its potential alternatives and is only about 2.9% away from the optimal solution.
Submitted 17 June, 2024;
originally announced June 2024.
-
Quantized Andreev conductance in semiconductor nanowires
Authors:
Yichun Gao,
Wenyu Song,
Yuhao Wang,
Zuhan Geng,
Zhan Cao,
Zehao Yu,
Shuai Yang,
Jiaye Xu,
Fangting Chen,
Zonglin Li,
Ruidong Li,
Lining Yang,
Zhaoyu Wang,
Shan Zhang,
Xiao Feng,
Tiantian Wang,
Yunyi Zang,
Lin Li,
Dong E. Liu,
Runan Shang,
Qi-Kun Xue,
Ke He,
Hao Zhang
Abstract:
Clean one-dimensional electron systems can exhibit quantized conductance. The plateau conductance doubles if the transport is dominated by Andreev reflection. Here, we report quantized conductance observed in both Andreev and normal-state transport in PbTe-Pb and PbTe-In hybrid nanowires. The Andreev plateau is observed at $4e^2/h$, twice the normal plateau value of $2e^2/h$. In comparison, Andreev conductance in the best-optimized III-V nanowires is non-quantized due to mode-mixing-induced dips (a disorder effect), despite the quantization of normal-state transport. The negligible mode mixing in PbTe hybrids indicates an unprecedented low-disorder transport regime for nanowire devices, beneficial for Majorana research.
Submitted 17 June, 2024;
originally announced June 2024.
-
A Large-scale Universal Evaluation Benchmark For Face Forgery Detection
Authors:
Yijun Bei,
Hengrui Lou,
**song Geng,
Erteng Liu,
Lechao Cheng,
Jie Song,
Mingli Song,
Zunlei Feng
Abstract:
With the rapid development of AI-generated content (AIGC) technology, the production of realistic fake facial images and videos that deceive human visual perception has become possible. Consequently, various face forgery detection techniques have been proposed to identify such fake facial content. However, evaluating the effectiveness and generalizability of these detection techniques remains a significant challenge. To address this, we have constructed a large-scale evaluation benchmark called DeepFaceGen, aimed at quantitatively assessing the effectiveness of face forgery detection and facilitating the iterative development of forgery detection technology. DeepFaceGen consists of 776,990 real face image/video samples and 773,812 face forgery image/video samples, generated using 34 mainstream face generation techniques. During the construction process, we carefully consider important factors such as content diversity, fairness across ethnicities, and availability of comprehensive labels, in order to ensure the versatility and convenience of DeepFaceGen. Subsequently, DeepFaceGen is employed in this study to evaluate and analyze the performance of 13 mainstream face forgery detection techniques from various perspectives. Through extensive experimental analysis, we derive significant findings and propose potential directions for future research. The code and dataset for DeepFaceGen are available at https://github.com/HengruiLou/DeepFaceGen.
Submitted 13 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
Fast-Fading Channel and Power Optimization of the Magnetic Inductive Cellular Network
Authors:
Honglei Ma,
Erwu Liu,
Zhijun Fang,
Rui Wang,
Yongbin Gao,
Wenjun Yu,
Dongming Zhang
Abstract:
The cellular network of magnetic induction (MI) communication holds promise in long-distance underground environments. In traditional MI communication, there is no fast-fading channel since the MI channel is treated as a quasi-static channel. However, for vehicle (mobile) MI (VMI) communication, unpredictable antenna vibration causes remarkable fast fading. As such fast fading cannot be modeled by the central limit theorem, it differs radically from other wireless fast-fading channels. Unfortunately, few studies focus on this phenomenon. In this paper, using a novel space modeling based on the electromagnetic field theorem, we propose a 3-dimensional model of the VMI antenna vibration. By proposing ``conjugate pseudo-piecewise functions'' and boundary $p(x)$ distribution, we derive the cumulative distribution function (CDF), probability density function (PDF) and the expectation of the VMI fast-fading channel. We also theoretically analyze the effects of VMI fast fading on the network throughput, including the VMI outage probability, which can be ignored in the traditional MI channel study. We draw several intriguing conclusions different from those in wireless fast-fading studies. For instance, fast fading brings more uniformly distributed channel coefficients. Finally, we propose a power control algorithm using non-cooperative game and multiagent Q-learning methods to optimize the throughput of the cellular VMI network. Simulations validate the derivation and the proposed algorithm.
Submitted 7 June, 2024;
originally announced June 2024.
-
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Authors:
Tianchen Zhao,
Tongcheng Fang,
Enshu Liu,
Rui Wan,
Widyadewi Soedarmadji,
Shiyao Li,
Zinan Lin,
Guohao Dai,
Shengen Yan,
Huazhong Yang,
Xuefei Ning,
Yu Wang
Abstract:
Diffusion transformers (DiTs) have exhibited remarkable performance in visual generation tasks, such as generating realistic images or videos based on textual instructions. However, larger model sizes and multi-frame processing for video generation lead to increased computational and memory costs, posing challenges for practical deployment on edge devices. Post-Training Quantization (PTQ) is an effective method for reducing memory costs and computational complexity. When quantizing diffusion transformers, we find that applying existing diffusion quantization methods designed for U-Net faces challenges in preserving quality. After analyzing the major challenges for quantizing diffusion transformers, we design an improved quantization scheme, ViDiT-Q (Video and Image Diffusion Transformer Quantization), to address these issues. Furthermore, we identify that highly sensitive layers and timesteps hinder quantization for lower bit-widths. To tackle this, we improve ViDiT-Q with a novel metric-decoupled mixed-precision quantization method (ViDiT-Q-MP). We validate the effectiveness of ViDiT-Q across a variety of text-to-image and video models. While baseline quantization methods fail at W8A8 and produce unreadable content at W4A8, ViDiT-Q achieves lossless W8A8 quantization. ViDiT-Q-MP achieves W4A8 with negligible visual quality degradation, resulting in a 2.5x memory optimization and a 1.5x latency speedup.
Submitted 30 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment
Authors:
Chen Zhang,
Qiang He,
Zhou Yuan,
Elvis S. Liu,
Hong Wang,
Jian Zhao,
Yang Wang
Abstract:
Deep Reinforcement Learning (DRL) agents have demonstrated impressive success in a wide range of game genres. However, existing research primarily focuses on optimizing DRL competence rather than addressing the challenge of prolonged player interaction. In this paper, we propose a practical DRL agent system for fighting games named Shūkai, which has been successfully deployed to Naruto Mobile, a popular fighting game with over 100 million registered users. Shūkai quantifies the state to enhance generalizability, introducing Heterogeneous League Training (HELT) to achieve balanced competence, generalizability, and training efficiency. Furthermore, Shūkai implements specific rewards to align the agent's behavior with human expectations. Shūkai's ability to generalize is demonstrated by its consistent competence across all characters, even though it was trained on only 13% of them. Additionally, HELT exhibits a remarkable 22% improvement in sample efficiency. Shūkai serves as a valuable training partner for players in Naruto Mobile, enabling them to enhance their abilities and skills.
Submitted 3 June, 2024;
originally announced June 2024.
-
BeerReview: A Blockchain-enabled Peer Review Platform
Authors:
Guodong **,
Zihan Zhou,
Wenzheng Tang,
Kanglei Yu,
Hao Xu,
Erwu Liu
Abstract:
In an era of increasing concerns over intellectual property rights, traditional peer review systems face challenges including plagiarism, malicious attacks, and unauthorized data access. BeerReview, a blockchain-enabled peer review platform, offers a robust solution, enabling experts and scholars to participate actively in the review process without concerns about plagiarism or security threats. Following the completion of its alpha testing, BeerReview demonstrates the potential for expanded deployment. As an open-source initiative, the platform offers improved convenience and more robust intellectual property protection within the peer review process.
Submitted 30 May, 2024;
originally announced May 2024.
-
MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
Authors:
Tianchen Zhao,
Xuefei Ning,
Tongcheng Fang,
Enshu Liu,
Guyue Huang,
Zinan Lin,
Shengen Yan,
Guohao Dai,
Yu Wang
Abstract:
Diffusion models have achieved significant visual generation quality. However, their significant computational and memory costs pose challenges for their application on resource-constrained mobile devices or even desktop GPUs. Recent few-step diffusion models reduce the inference time by reducing the denoising steps. However, their memory consumption is still excessive. Post-Training Quantization (PTQ) replaces high bit-width FP representations with low-bit integer values (INT4/8), which is an effective and efficient technique to reduce the memory cost. However, when applied to few-step diffusion models, existing quantization methods face challenges in preserving both the image quality and text alignment. To address this issue, we propose a mixed-precision quantization framework, MixDQ. First, we design a specialized BOS-aware quantization method for highly sensitive text embedding quantization. Then, we conduct metric-decoupled sensitivity analysis to measure the sensitivity of each layer. Finally, we develop an integer-programming-based method to conduct bit-width allocation. While existing quantization methods fall short at W8A8, MixDQ can achieve W8A8 without performance loss, and W4A8 with negligible visual degradation. Compared with FP16, we achieve a 3-4x reduction in model size and memory cost, and a 1.45x latency speedup.
Submitted 29 May, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Give and Take: An End-To-End Investigation of Giveaway Scam Conversion Rates
Authors:
Enze Liu,
George Kappos,
Eric Mugnier,
Luca Invernizzi,
Stefan Savage,
David Tao,
Kurt Thomas,
Geoffrey M. Voelker,
Sarah Meiklejohn
Abstract:
Scams -- fraudulent schemes designed to swindle money from victims -- have existed for as long as recorded history. However, the Internet's combination of low communication cost, global reach, and functional anonymity has allowed scam volumes to reach new heights. Designing effective interventions requires first understanding the context: how scammers reach potential victims, the earnings they make, and any potential bottlenecks for durable interventions. In this short paper, we focus on these questions in the context of cryptocurrency giveaway scams, where victims are tricked into irreversibly transferring funds to scammers under the pretense of even greater returns. Combining data from Twitter, YouTube and Twitch livestreams, landing pages, and cryptocurrency blockchains, we measure how giveaway scams operate at scale. We find that 1 in 1000 scam tweets, and 4 in 100,000 livestream views, net a victim, and that scammers managed to extract nearly \$4.62 million from just hundreds of victims during our measurement window.
Submitted 15 May, 2024;
originally announced May 2024.
-
Dual-Segment Clustering Strategy for Federated Learning in Heterogeneous Environments
Authors:
Pengcheng Sun,
Erwu Liu,
Wei Ni,
Kanglei Yu,
Rui Wang,
Abbas Jamalipour
Abstract:
Federated learning (FL) is a distributed machine learning paradigm with high efficiency and low communication load, transmitting only the parameters or gradients of the network. However, non-independent and identically distributed (Non-IID) data has a negative impact on this paradigm. Furthermore, the heterogeneity of communication quality significantly affects the accuracy of parameter transmission, degrading the performance of the FL system or even preventing its convergence. This letter proposes a dual-segment clustering (DSC) strategy, which first clusters the clients according to the heterogeneous communication conditions and then performs a second clustering by sample size and label distribution, so as to address both data and communication heterogeneity. Experimental results show that the proposed DSC strategy improves the convergence rate of FL and achieves higher accuracy in a heterogeneous environment than the classical clustering algorithm.
Submitted 15 May, 2024;
originally announced May 2024.
-
BeACONS: A Blockchain-enabled Authentication and Communications Network for Scalable IoV
Authors:
Qi Shi,
**gyi Sun,
Hanwei Fu,
Peizhe Fu,
Jiayuan Ma,
Hao Xu,
Erwu Liu
Abstract:
This paper introduces a novel blockchain-enabled authentication and communications network for scalable Internet of Vehicles, which aims to bolster security and confidentiality, diminish communications latency, and reduce dependence on centralised infrastructures like Certificate Authorities and Public Key Infrastructures by leveraging Blockchain-enabled Domain Name Services and Blockchain-enabled Mutual Authentication. The proposed network is structured into a primary layer, consisting of Road Side Units and edge servers as servers of Blockchain-enabled Domain Name Services for managing inter-vehicle communications identities, and a sub-layer within each vehicle for intra-vehicle communications via the Blockchain-enabled Mutual Authentication Protocol. This design facilitates secure connections across vehicles by coordinating between the layers, significantly improving communications security and efficiency. This study also evaluates Road Side Unit availability against the random distribution of Road Side Units along the route of different vehicles. The proposed model presents a novel pathway towards a decentralised, secure, and efficient Internet of Vehicles ecosystem, contributing to the advancement of autonomous and trustworthy vehicular networks.
Submitted 14 May, 2024;
originally announced May 2024.
-
A SER-based Device Selection Mechanism in Multi-bits Quantization Federated Learning
Authors:
Pengcheng Sun,
Erwu Liu,
Rui Wang
Abstract:
The quality of wireless communication directly affects the performance of federated learning (FL), so this paper analyzes the influence of wireless communication on FL through the symbol error rate (SER). In an FL system, non-orthogonal multiple access (NOMA) can be used as the basic communication framework to reduce the communication congestion and interference caused by multiple users, taking advantage of the superposition characteristics of wireless channels. The Minimum Mean Square Error (MMSE) based serial interference cancellation (SIC) technology is used to recover the gradient of each terminal node one by one at the receiving end. In this paper, the gradient parameters are quantized into multiple bits to retain as much gradient information as possible and to improve the tolerance of transmission errors. On this basis, we design the SER-based device selection mechanism (SER-DSM) to ensure that the learning performance is not affected by users with bad communication conditions, while accommodating as many users as possible in the learning process, which is inclusive to a certain extent. The experiments show the influence of multi-bit quantization of the gradient on FL and the necessity and superiority of the proposed SER-based device selection mechanism.
Submitted 20 April, 2024;
originally announced May 2024.
-
Online Electricity Purchase for Data Center with Dynamic Virtual Battery from Flexibility Aggregation
Authors:
Kekun Gao,
Yuejun Yan,
Yixuan Liu,
Endong Liu,
Pengcheng You
Abstract:
As a critical component of modern infrastructure, data centers account for a huge amount of power consumption and greenhouse gas emission. This paper studies the electricity purchase strategy for a data center to lower its energy cost while integrating local renewable generation under uncertainty. To facilitate efficient and scalable decision-making, we propose a two-layer hierarchy where the lower layer consists of the operation of all electrical equipment in the data center and the upper layer determines the procurement and dispatch of electricity. At the lower layer, instead of device-level scheduling in real time, we propose to exploit the inherent flexibility in demand, such as thermostatically controlled loads and flexible computing tasks, and aggregate them into virtual batteries. By this means, the upper-layer decision only needs to take into account these virtual batteries, the size of which is generally small and independent of the data center scale. We further propose an online algorithm based on Lyapunov optimization to purchase electricity from the grid with a manageable energy cost, even though the prices, renewable availability, and battery specifications are uncertain and dynamic. In particular, we show that, under mild conditions, our algorithm can achieve bounded loss compared with the offline optimal cost, while strictly respecting battery operational constraints. Extensive simulation studies validate the theoretical analysis and illustrate the tradeoff between optimality and conservativeness.
Submitted 30 April, 2024;
originally announced April 2024.
-
Curse of Dimensionality on Persistence Diagrams
Authors:
Yasuaki Hiraoka,
Yusuke Imoto,
Shu Kanazawa,
Enhao Liu
Abstract:
The stability of persistent homology has led to wide applications of the persistence diagram as a trusted topological descriptor in the presence of noise. However, with the increasing demand for high-dimension and low-sample-size data processing in modern science, it is questionable whether persistence diagrams retain their reliability in the presence of high-dimensional noise. This work aims to study the reliability of persistence diagrams in the high-dimension low-sample-size data setting. By analyzing the asymptotic behavior of persistence diagrams for high-dimensional random data, we show that persistence diagrams are no longer reliable descriptors of low-sample-size data under high-dimensional noise perturbations. We refer to this loss of reliability of persistence diagrams in such data settings as the curse of dimensionality on persistence diagrams. Next, we investigate the possibility of using normalized principal component analysis as a method for reducing the dimensionality of the high-dimensional observed data to resolve the curse of dimensionality. We show that this method can mitigate the curse of dimensionality on persistence diagrams. Our results shed some new light on the challenges of processing high-dimension low-sample-size data by persistence diagrams and provide a starting point for future research in this area.
Submitted 28 April, 2024;
originally announced April 2024.
-
SNR Maximization and Localization for UAV-IRS-Assisted Near-Field Systems
Authors:
Hanfu Zhang,
Yidan Mei,
Erwu Liu,
Rui Wang
Abstract:
This letter introduces a novel unmanned aerial vehicle (UAV)-intelligent reflecting surface (IRS) structure into near-field localization systems to enhance the design flexibility of IRS, thereby obtaining additional performance gains. Specifically, a UAV-IRS is utilized to improve the harsh wireless environment and provide localization possibilities. To improve the localization accuracy, a joint optimization problem considering UAV position and UAV-IRS passive beamforming is formulated to maximize the receiving signal-to-noise ratio (SNR). An alternative optimization algorithm is proposed to solve the complex non-convex problem leveraging the projected gradient ascent (PGA) algorithm and the principle of minimizing the phase difference of the receiving signals. Closed-form expressions for UAV-IRS phase shift are derived to reduce the algorithm complexity. In the simulations, the proposed algorithm is compared with three different schemes and outperforms the others in both receiving SNR and localization accuracy.
Submitted 24 April, 2024;
originally announced April 2024.
-
Achieving High Yield of Perpendicular SOT-MTJ Manufactured on 300 mm Wafers
Authors:
Wenlong Yang,
Zhenghui Ji,
Yang Gao,
Kaiyuan Zhou,
Qijun Guo,
Dinggui Zeng,
Shasha Wang,
Ming Wang,
Lijie Shen,
Guilin Chen,
Yihui Sun,
Enlong Liu,
Shikun He
Abstract:
The large-scale fabrication of three-terminal magnetic tunnel junctions (MTJs) with high yield is becoming increasingly crucial, especially with the growing interest in spin-orbit torque (SOT) magnetic random access memory (MRAM) as the next generation of MRAM technology. To achieve high yield and consistent device performance in MTJs with perpendicular magnetic anisotropy, an integration flow has been developed that incorporates a special MTJ etching technique and other CMOS-compatible processes on a 300 mm wafer manufacturing platform. Systematic studies have been conducted on device performance and statistical uniformity, encompassing magnetic properties, electrical switching behavior, and reliability. Achievements include a switching current of 680 uA at 2 ns, a TMR as high as 119%, ultra-high endurance (over $10^{12}$ cycles), and excellent uniformity in the fabricated SOT-MTJ devices, with a yield of up to 99.6%. The proposed integration process, featuring high yield, is anticipated to streamline the mass production of SOT-MRAM.
Submitted 13 April, 2024;
originally announced April 2024.
-
An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models
Authors:
Emmy Liu,
Graham Neubig,
Jacob Andreas
Abstract:
Modern language models (LMs) can learn to perform new tasks in different ways: in instruction following, the target task is described explicitly in natural language; in few-shot prompting, the task is specified implicitly with a small number of examples; in instruction inference, LMs are presented with in-context examples and are then prompted to generate a natural language task description before making predictions. Each of these procedures may be thought of as invoking a different form of reasoning: instruction following involves deductive reasoning, few-shot prompting involves inductive reasoning, and instruction inference involves abductive reasoning. How do these different capabilities relate? Across four LMs (from the gpt and llama families) and two learning problems (involving arithmetic functions and machine translation) we find a strong dissociation between the different types of reasoning: LMs can sometimes learn effectively from few-shot prompts even when they are unable to explain their own prediction rules; conversely, they sometimes infer useful task descriptions while completely failing to learn from human-generated descriptions of the same task. Our results highlight the non-systematic nature of reasoning even in some of today's largest LMs, and underscore the fact that very different learning mechanisms may be invoked by seemingly similar prompting procedures.
Submitted 10 April, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Gate-tunable subband degeneracy in semiconductor nanowires
Authors:
Yuhao Wang,
Wenyu Song,
Zhan Cao,
Zehao Yu,
Shuai Yang,
Zonglin Li,
Yichun Gao,
Ruidong Li,
Fangting Chen,
Zuhan Geng,
Lining Yang,
Jiaye Xu,
Zhaoyu Wang,
Shan Zhang,
Xiao Feng,
Tiantian Wang,
Yunyi Zang,
Lin Li,
Runan Shang,
Qi-Kun Xue,
Dong E. Liu,
Ke He,
Hao Zhang
Abstract:
Degeneracy and symmetry have a profound relation in quantum systems. Here, we report gate-tunable subband degeneracy in PbTe nanowires with a nearly symmetric cross-sectional shape. The degeneracy is revealed in electron transport by the absence of a quantized plateau. Utilizing a dual gate design, we can apply an electric field to lift the degeneracy, reflected as emergence of the plateau. This degeneracy and its tunable lifting were challenging to observe in previous nanowire experiments, possibly due to disorder. Numerical simulations can qualitatively capture our observation, shedding light on device parameters for future applications.
Submitted 3 April, 2024;
originally announced April 2024.
-
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
Authors:
Enshu Liu,
Junyi Zhu,
Zinan Lin,
Xuefei Ning,
Matthew B. Blaschko,
Sergey Yekhanin,
Shengen Yan,
Guohao Dai,
Huazhong Yang,
Yu Wang
Abstract:
Diffusion Models (DM) and Consistency Models (CM) are two types of popular generative models with good generation quality on various tasks. When training DM and CM, intermediate weight checkpoints are not fully utilized and only the last converged checkpoint is used. In this work, we find that high-quality model weights often lie in a basin which cannot be reached by SGD but can be obtained by proper checkpoint averaging. Based on these observations, we propose LCSC, a simple but effective and efficient method to enhance the performance of DM and CM, by combining checkpoints along the training trajectory with coefficients deduced from evolutionary search. We demonstrate the value of LCSC through two use cases: $\textbf{(a) Reducing training cost.}$ With LCSC, we only need to train DM/CM for fewer iterations and/or with smaller batch sizes to obtain sample quality comparable to that of the fully trained model. For example, LCSC achieves considerable training speedups for CM (23$\times$ on CIFAR-10 and 15$\times$ on ImageNet-64). $\textbf{(b) Enhancing pre-trained models.}$ Assuming full training is already done, LCSC can further improve the generation quality or speed of the final converged models. For example, LCSC achieves better performance with a number of function evaluations (NFE) of 1 than the base model with an NFE of 2 on consistency distillation, and decreases the NFE of DM from 15 to 9 while maintaining the generation quality on CIFAR-10. Our code is available at https://github.com/imagination-research/LCSC.
Submitted 7 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
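The core operation named in the LCSC abstract, a linear combination of saved checkpoints, can be illustrated with a minimal sketch. This is not the authors' implementation: the checkpoint layout (a dict of flat float lists), the function name `combine_checkpoints`, and the example coefficients are all assumptions made for illustration, and the evolutionary search that the abstract says produces the coefficients is not shown.

```python
from typing import Dict, List, Sequence

# Hypothetical checkpoint layout: parameter name -> flat list of floats.
Checkpoint = Dict[str, List[float]]


def combine_checkpoints(
    checkpoints: Sequence[Checkpoint], coeffs: Sequence[float]
) -> Checkpoint:
    """Return sum_i coeffs[i] * checkpoints[i], computed per parameter.

    The coefficients are taken as given; how they are chosen (for example
    by a search procedure) is outside the scope of this sketch.
    """
    if not checkpoints or len(checkpoints) != len(coeffs):
        raise ValueError("need one coefficient per checkpoint")
    combined: Checkpoint = {}
    for name, values in checkpoints[0].items():
        combined[name] = [
            sum(c * ckpt[name][i] for c, ckpt in zip(coeffs, checkpoints))
            for i in range(len(values))
        ]
    return combined


# Two toy "checkpoints" with a single two-element parameter each.
ckpt_a = {"w": [1.0, 2.0]}
ckpt_b = {"w": [3.0, 4.0]}
merged = combine_checkpoints([ckpt_a, ckpt_b], [0.5, 0.5])
```

With equal coefficients this reduces to plain checkpoint averaging; other coefficient choices give other points in the span of the saved weights.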
-
Tensorized NeuroEvolution of Augmenting Topologies for GPU Acceleration
Authors:
Lishuang Wang,
Mengfei Zhao,
Enyu Liu,
Kebin Sun,
Ran Cheng
Abstract:
The NeuroEvolution of Augmenting Topologies (NEAT) algorithm has received considerable recognition in the field of neuroevolution. Its effectiveness is derived from initiating with simple networks and incrementally evolving both their topologies and weights. Although its capability across various challenges is evident, the algorithm's computational efficiency remains an impediment, limiting its scalability potential. In response, this paper introduces a tensorization method for the NEAT algorithm, enabling the transformation of its diverse network topologies and associated operations into uniformly shaped tensors for computation. This advancement facilitates the execution of the NEAT algorithm in a parallelized manner across the entire population. Furthermore, we develop TensorNEAT, a library that implements the tensorized NEAT algorithm and its variants, such as CPPN and HyperNEAT. Building upon JAX, TensorNEAT promotes efficient parallel computations via automated function vectorization and hardware acceleration. Moreover, the TensorNEAT library supports various benchmark environments including Gym, Brax, and gymnax. Through evaluations across a spectrum of robotics control environments in Brax, TensorNEAT achieves up to 500x speedups compared to the existing implementations such as NEAT-Python. Source codes are available at: https://github.com/EMI-Group/tensorneat.
Submitted 11 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation
Authors:
Bowen Zheng,
Zihan Lin,
Enze Liu,
Chen Yang,
Enyang Bai,
Cheng Ling,
Wayne Xin Zhao,
Ji-Rong Wen
Abstract:
In online video platforms, reading or writing comments on interesting videos has become an essential part of the video watching experience. However, existing video recommender systems mainly model users' interaction behaviors with videos, lacking consideration of comments in user behavior modeling. In this paper, we propose a novel recommendation approach called LSVCR by leveraging user interaction histories with both videos and comments, so as to jointly conduct personalized video and comment recommendation. Specifically, our approach consists of two key components, namely a sequential recommendation (SR) model and a supplemental large language model (LLM) recommender. The SR model serves as the primary recommendation backbone (retained in deployment) of our approach, allowing for efficient user preference modeling. Meanwhile, we leverage the LLM recommender as a supplemental component (discarded in deployment) to better capture underlying user preferences from heterogeneous interaction behaviors. In order to integrate the merits of the SR model and the supplemental LLM recommender, we design a two-stage training paradigm. The first stage is personalized preference alignment, which aims to align the preference representations from both components, thereby enhancing the semantics of the SR model. The second stage is recommendation-oriented fine-tuning, in which the alignment-enhanced SR model is fine-tuned according to specific objectives. Extensive experiments in both video and comment recommendation tasks demonstrate the effectiveness of LSVCR. Additionally, online A/B testing on the KuaiShou platform verifies the actual benefits brought by our approach. In particular, we achieve a significant overall gain of 4.13% in comment watch time.
Submitted 20 March, 2024;
originally announced March 2024.
-
Interval Replacements of Persistence Modules
Authors:
Hideto Asashiba,
Etienne Gauthier,
Enhao Liu
Abstract:
We define two notions. The first one is a $compression\ system$ $ξ$ for a finite poset $\mathbf{P}$, which assigns each interval subposet $I$ to a poset morphism $ξ_I \colon Q_I \to \mathbf{P}$ satisfying some conditions, where $Q_I$ is a connected finite poset. An example is given by the $total$ compression system that assigns each $I$ to the inclusion of $I$ into $\mathbf{P}$. The second one is an $I$-$rank$ of a persistence module $M$ under $ξ$, the family of which is called the $interval\ rank\ invariant$ of $M$ under $ξ$. A compression system $ξ$ makes it possible to define the $interval\ replacement$ (also called the interval-decomposable approximation) not only for 2D persistence modules but also for any persistence modules over any finite poset. We will show that the forming of the interval replacement preserves the interval rank invariant, which is a stronger property than the preservation of the usual rank invariant. Moreover, to know what is preserved explicitly, we will give a formula of the $I$-rank of $M$ under $ξ$ in terms of the structure linear maps of $M$ for any compression system $ξ$, and give a sufficient condition for the $I$-rank of $M$ under $ξ$ to coincide with that under the total compression system, the value of which is equal to the generalized rank invariant introduced by Kim--Mémoli.
Submitted 17 June, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
An Empirical Analysis on the Use and Reporting of National Security Letters
Authors:
Alex Bellon,
Miro Haller,
Andrey Labunets,
Enze Liu,
Stefan Savage
Abstract:
National Security Letters (NSLs) are similar to administrative subpoenas and can be issued directly by elements of the executive branch without requiring prior approval from a court or grand jury. Importantly, NSLs authorize the imposition of nondisclosure orders (aka "gag orders") on the receiving party. Controversy about potential abuses of this authority has driven a range of legal and policy discussions. To address these concerns, both the public sector and the private sector have sought to document the usage of NSLs in aggregated form. However, each data source is limited in scope, time, and kind.
In this paper, we consolidate the available data around NSLs and answer two questions: (1) what can the public effectively learn from the reported data and does this information suffice to assess the NSL usage? (2) how accessible is this data collection? We show that longitudinal trends in the usage of NSLs can be observed. For instance, we find a significant increase in NSL requests for non-US persons and that the policy reforms to decrease the mandated nondisclosure period appear to be effective. The observed trends suggest that the current transparency mechanisms are viable safeguards against the excessive use of NSLs. However, aggregating and normalizing the data requires manual reviewing, parsing, and validating. We even find inconsistencies within and across official data sources. Overall, the laborious data collection process hinders external and internal auditing efforts and demonstrates the need for a unified and more usable dataset for NSLs.
Submitted 10 April, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
Deep Learning for Cross-Domain Data Fusion in Urban Computing: Taxonomy, Advances, and Outlook
Authors:
Xingchen Zou,
Yibo Yan,
Xixuan Hao,
Yuehong Hu,
Haomin Wen,
Erdong Liu,
Junbo Zhang,
Yong Li,
Tianrui Li,
Yu Zheng,
Yuxuan Liang
Abstract:
As cities continue to burgeon, Urban Computing emerges as a pivotal discipline for sustainable development by harnessing the power of cross-domain data fusion from diverse sources (e.g., geographical, traffic, social media, and environmental data) and modalities (e.g., spatio-temporal, visual, and textual modalities). Recently, we are witnessing a rising trend that utilizes various deep-learning methods to facilitate cross-domain data fusion in smart cities. To this end, we propose the first survey that systematically reviews the latest advancements in deep learning-based data fusion methods tailored for urban computing. Specifically, we first delve into data perspective to comprehend the role of each modality and data source. Secondly, we classify the methodology into four primary categories: feature-based, alignment-based, contrast-based, and generation-based fusion methods. Thirdly, we further categorize multi-modal urban applications into seven types: urban planning, transportation, economy, public safety, society, environment, and energy. Compared with previous surveys, we focus more on the synergy of deep learning methods with urban computing applications. Furthermore, we shed light on the interplay between Large Language Models (LLMs) and urban computing, postulating future research directions that could revolutionize the field. We firmly believe that the taxonomy, progress, and prospects delineated in our survey stand poised to significantly enrich the research community. The summary of the comprehensive and up-to-date paper list can be found at https://github.com/yoshall/Awesome-Multimodal-Urban-Computing.
Submitted 16 June, 2024; v1 submitted 29 February, 2024;
originally announced February 2024.
-
Magnetic properties of binary alloys Ni1-xMox and Ni1-yCuy close to critical concentrations
Authors:
R. -Z. Lin,
C. -H. Hsu,
E. -P. Liu,
W. -T. Chen,
C. -L. Huang
Abstract:
The search for the ferromagnetic quantum critical point (FM QCP) has always been a captivating research topic in the scientific community. In pursuit of this goal, we introduced nonmagnetic transition metals to alloy with elemental nickel, and studied the magnetic properties of nickel binary alloys Ni1-xMox and Ni1-yCuy as a function of x and y up to the critical concentrations x_{cr} and y_{cr} at which the FM transition T_C disappears. T_C-x(y) phase diagrams were constructed via the Arrott-Noakes scaling of magnetization data. An enhanced Sommerfeld coefficient (the value of C/T as T \rightarrow 0) is observed near y_{cr}, manifesting the effect of quantum fluctuations near the quantum phase transition. It is evident that C/T diverges as -log T down to 0.1 K in the vicinity of y_{cr}, which suggests a plausible FM QCP in Ni1-yCuy. However, in the case of Ni1-xMox, although the enhancement of the Sommerfeld coefficient is also observed near x_{cr}, spin glass behavior is identified through the ac magnetic susceptibility measurement. This observation rules out the possibility of the existence of the FM QCP in Ni1-xMox.
Submitted 13 May, 2024; v1 submitted 29 February, 2024;
originally announced February 2024.
-
Blockchain for Finance: A Survey
Authors:
Hanjie Wu,
Qian Yao,
Zhenguang Liu,
Butian Huang,
Yuan Zhuang,
Huayun Tang,
Erwu Liu
Abstract:
As an innovative technology for enhancing authenticity, security, and risk management, blockchain is being widely adopted in trade and finance systems. The unique capabilities of blockchain, such as immutability and transparency, enable new business models of distributed data storage, point-to-point transactions, and decentralized autonomous organizations. In this paper, we focus on blockchain-based securities trading, in which blockchain technology plays a vital role in financial services as it ultimately lifts trust and frees the need for third-party verification by using consensus-based verification. We investigate the 12 most popular blockchain platforms and elaborate on 6 platforms that are related to finance, seeking to provide a panorama of securities trading practices. Meanwhile, this survey provides a comprehensive summary of blockchain-based securities trading applications. We gather numerous practical applications of blockchain-based securities trading and categorize them into four distinct categories. For each category, we introduce a typical example and explain how blockchain contributes to solving the key problems faced by FinTech companies and researchers. Finally, we provide interesting observations ranging from mainstream blockchain-based financial institutions to security issues of decentralized finance applications, aiming to picture the current blockchain ecosystem in finance.
Submitted 27 February, 2024;
originally announced February 2024.
-
Interpretation of Intracardiac Electrograms Through Textual Representations
Authors:
William Jongwon Han,
Diana Gomez,
Avi Alok,
Chaojing Duan,
Michael A. Rosenberg,
Douglas Weber,
Emerson Liu,
Ding Zhao
Abstract:
Understanding the irregular electrical activity of atrial fibrillation (AFib) has been a key challenge in electrocardiography. For serious cases of AFib, catheter ablations are performed to collect intracardiac electrograms (EGMs). EGMs offer intricately detailed and localized electrical activity of the heart and are an ideal modality for interpretable cardiac studies. Recent advancements in artificial intelligence (AI) have allowed some works to utilize deep learning frameworks to interpret EGMs during AFib. Additionally, language models (LMs) have shown exceptional performance in being able to generalize to unseen domains, especially in healthcare. In this study, we are the first to leverage pretrained LMs for finetuning of EGM interpolation and AFib classification via masked language modeling. We formulate the EGM as a textual sequence and present competitive performance on AFib classification compared against other representations. Lastly, we provide a comprehensive interpretability study to provide a multi-perspective intuition of the model's behavior, which could greatly benefit clinical use.
Submitted 11 April, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
On the Structure and Generators of the $n$th-order Chromatic Algebra
Authors:
Ethan Yi-Heng Liu
Abstract:
This work investigates the intrinsic properties of the chromatic algebra, introduced by Fendley and Krushkal as a framework to study the chromatic polynomial. We prove that the dimension of the $n$th-order chromatic algebra is the $2n$th Riordan number, which exhibits exponential growth. We find a generating set of size $\binom{n}{2}$, and we provide a procedure to construct the basis from the generating set. We additionally provide proofs for fundamental facts about this algebra that appear to be missing from the literature. These include determining a representation of the chromatic algebra as noncrossing planar partitions and expanding the chromatic relations to include an edge case.
Submitted 11 January, 2024;
originally announced January 2024.
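To make the dimension claim in the abstract above concrete, the following sketch computes Riordan numbers with the standard three-term recurrence $(k+1)R_k = (k-1)(2R_{k-1} + 3R_{k-2})$, with $R_0 = 1$ and $R_1 = 0$. The function names are hypothetical, and `chromatic_algebra_dimension` simply evaluates the abstract's statement that the dimension of the $n$th-order chromatic algebra is the $2n$th Riordan number; it does not construct the algebra.

```python
def riordan(n: int) -> int:
    """Return the n-th Riordan number (R_0 = 1, R_1 = 0, R_2 = 1, ...)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    prev2, prev1 = 1, 0  # R_0, R_1
    if n == 0:
        return prev2
    for k in range(2, n + 1):
        # (k + 1) * R_k = (k - 1) * (2 * R_{k-1} + 3 * R_{k-2});
        # the division is exact because R_k is an integer.
        current = (k - 1) * (2 * prev1 + 3 * prev2) // (k + 1)
        prev2, prev1 = prev1, current
    return prev1


def chromatic_algebra_dimension(n: int) -> int:
    """Dimension of the n-th order chromatic algebra, per the abstract: R_{2n}."""
    return riordan(2 * n)


# Dimensions for the first few orders n = 1, ..., 5.
dimensions = [chromatic_algebra_dimension(n) for n in range(1, 6)]
```

The rapid growth of these values is consistent with the exponential growth mentioned in the abstract.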
-
Airline recovery problem under disruptions: A review
Authors:
Shuai Wu,
Enze Liu,
Rui Cao,
Qiang Bai
Abstract:
In practice, both passenger and cargo flights are vulnerable to unexpected factors, such as adverse weather, airport flow control, crew absence, unexpected aircraft maintenance, and pandemic, which can cause disruptions in flight schedules. Thus, managers need to reallocate relevant resources to ensure that the airport can return to normal operations on the basis of minimum cost, which is the airline recovery problem. Airline recovery is an active research area, with a lot of publications in recent years. To better summarize the progress of airline recovery, first of all, keywords are chosen to search the relevant studies, then software is used to analyze the existing studies in terms of the number of papers, keywords, and sources. Secondly, the airline recovery problem is divided into two categories, namely Passenger-Oriented Airline Recovery Problem (POARP) and Cargo-Oriented Airline Recovery Problem (COARP). In POARP, the existing studies are classified according to recovery strategies, including common recovery strategies, cruise speed control strategy, flexible aircraft maintenance strategy, multi-modal transportation strategy, passenger-centric recovery strategy, and clubbing of flights strategy. Moreover, the POARP is discussed from the perspectives of disruption types, recovery strategies, problem types, objective functions, and solution methods. Thirdly, POARP and COARP are compared from the perspectives of timeliness, subjectivity, flexibility, transferability, and combinability. Finally, the conclusions are drawn and future study directions are provided. For future studies, it is recommended to conduct more in-depth research on dynamic and real-time recovery, incorporating human factors into the modeling, multi-modal transportation coupling, optimization of other airport processes, combination of robust scheduling and airline recovery, and optimization algorithm improvement.
Submitted 16 January, 2024; v1 submitted 9 January, 2024;
originally announced January 2024.
-
Extracting Error Thresholds through the Framework of Approximate Quantum Error Correction Condition
Authors:
Yuanchen Zhao,
Dong E. Liu
Abstract:
The robustness of quantum memory against physical noises is measured by two methods: the exact and approximate quantum error correction (QEC) conditions for error recoverability, and the decoder-dependent error threshold which assesses if the logical error rate diminishes with system size. Here we unravel their relations and propose a unified framework to extract an intrinsic error threshold from the approximate QEC condition, which could upper bound other decoder-dependent error thresholds. Our proof establishes that relative entropy, effectively measuring deviations from exact QEC conditions, serves as the order parameter delineating the transition from asymptotic recoverability to unrecoverability. Consequently, we establish a unified framework for determining the error threshold across both exact and approximate QEC codes, addressing errors originating from noise channels as well as those from code space imperfections. This result sharpens our comprehension of error thresholds across diverse QEC codes and error models.
Submitted 28 December, 2023;
originally announced December 2023.
-
Optical probe on doping modulation of magnetic Weyl semimetal Co$_3$Sn$_2$S$_2$
Authors:
L. Wang,
S. Zhang,
B. B. Wang,
B. X. Gao,
L. Y. Cao,
X. T. Zhang,
X. Y. Zhang,
E. K. Liu,
R. Y. Chen
Abstract:
The magnetic Weyl semimetal Co$_3$Sn$_2$S$_2$ is extensively investigated due to its giant anomalous Hall effect (AHE). Recent studies demonstrate that the AHE can be effectively tuned by multi-electron Ni doping. To reveal the underlying mechanism of this significant manipulation, it is crucial to explore the band structure modification caused by Ni doping. Here, we study the electrodynamics of both pristine and Ni-doped Co$_{3-x}$Ni$_x$Sn$_2$S$_2$ with $x=$0, 0.11 and 0.17 by infrared spectroscopy. We find that the inverted energy gap around the Fermi level ($E_{F}$) gets smaller at $x=$0.11, which is supposed to enhance the Berry curvature and therefore increase the AHE. Then $E_{F}$ moves out of this gap at $x=$0.17. Additionally, the low temperature carrier density is demonstrated to increase monotonically upon doping, which is different from previous Hall measurement results. We also observe evidence of band broadening and exotic changes of high-energy interband transitions caused by doping. Our results provide detailed information about the band structure of Co$_{3-x}$Ni$_x$Sn$_2$S$_2$ at different doping levels, which will help to guide further studies on the chemical tuning of AHE.
Submitted 8 January, 2024; v1 submitted 27 December, 2023;
originally announced December 2023.
-
MotionScript: Natural Language Descriptions for Expressive 3D Human Motions
Authors:
Payam Jome Yazdian,
Eric Liu,
Li Cheng,
Angelica Lim
Abstract:
This paper proposes MotionScript, a motion-to-text conversion algorithm and natural language representation for human body motions. MotionScript aims to describe movements in greater detail and with more accuracy than previous natural language approaches. Many motion datasets describe relatively objective and simple actions with little variation in the way they are expressed (e.g. sitting, walking, dribbling a ball). But for expressive actions that contain a diversity of movements in the class (e.g. being sad, dancing), or for actions outside the domain of standard motion capture datasets (e.g. stylistic walking, sign-language), more specific and granular natural language descriptions are needed. Our proposed MotionScript descriptions differ from existing natural language representations in that they provide direct descriptions in natural language instead of simple action labels or high-level human captions. To the best of our knowledge, this is the first attempt at translating 3D motions to natural language descriptions without requiring training data. Our experiments show that when MotionScript representations are used in a text-to-motion neural task, body movements are more accurately reconstructed, and large language models can be used to generate unseen complex motions.
Submitted 19 December, 2023;
originally announced December 2023.
-
Near-Field Localization and Phase Shift Optimization for RIS-Assisted Non-Ideal OFDM Systems
Authors:
Hanfu Zhang,
Erwu Liu,
Rui Wang,
Zhe Xing,
Yan Liu
Abstract:
By incorporating reconfigurable intelligent surface (RIS) into communication-assisted localization systems, the issue of signal blockage caused by obstacles can be addressed, and passive beamforming can be employed to enhance localization accuracy. However, existing works mainly consider ideal channels and do not account for the effects of realistic impairments like carrier frequency offset (CFO) and phase noise (PN) on localization. This paper proposes an iterative joint estimation algorithm for CFO, PN, and user position based on maximum a posteriori (MAP) criterion and gradient descent (GD) algorithm. Closed-form expressions for CFO and PN updates are provided. The hybrid Cramér-Rao lower bound (HCRLB) for the estimation parameters is derived, and the ambiguity in CFO and PN estimation is analyzed. To minimize the HCRLB, a non-convex RIS shift optimization problem is formulated and is transformed into a convex semidefinite programming (SDP) problem using the technique of semidefinite relaxation (SDR) and Schur complement. After optimizing the RIS phase shift, the theoretical positioning accuracy within the area of interest (AOI) can be improved by two orders of magnitude, with a maximum positioning root mean square error (RMSE) lower than $\rm 10^{-2}m$.
Submitted 19 December, 2023;
originally announced December 2023.
-
Multi-Agent Reinforcement Learning for Connected and Automated Vehicles Control: Recent Advancements and Future Prospects
Authors:
Min Hua,
Dong Chen,
Xinda Qi,
Kun Jiang,
Zemin Eitan Liu,
Quan Zhou,
Hongming Xu
Abstract:
Connected and automated vehicles (CAVs) have emerged as a potential solution to the future challenges of developing safe, efficient, and eco-friendly transportation systems. However, CAV control presents significant challenges, given the complexity of interconnectivity and coordination required among the vehicles. To address this, multi-agent reinforcement learning (MARL), with its notable advancements in addressing complex problems in autonomous driving, robotics, and human-vehicle interaction, has emerged as a promising tool for enhancing the capabilities of CAVs. However, there is a notable absence of current reviews on the state-of-the-art MARL algorithms in the context of CAVs. Therefore, this paper delivers a comprehensive review of the application of MARL techniques within the field of CAV control. The paper begins by introducing MARL, followed by a detailed explanation of its unique advantages in addressing complex mobility and traffic scenarios that involve multiple agents. It then presents a comprehensive survey of MARL applications on the extent of control dimensions for CAVs, covering critical and typical scenarios such as platooning control, lane-changing, and unsignalized intersections. In addition, the paper provides a comprehensive review of the prominent simulation platforms used to create reliable environments for training in MARL. Lastly, the paper examines the current challenges associated with deploying MARL within CAV control and outlines potential solutions that can effectively overcome these issues. Through this review, the study highlights the tremendous potential of MARL to enhance the performance and collaboration of CAV control in terms of safety, travel efficiency, and economy.
Submitted 16 January, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models
Authors:
Enshu Liu,
Xuefei Ning,
Huazhong Yang,
Yu Wang
Abstract:
Recent years have witnessed the rapid progress and broad application of diffusion probabilistic models (DPMs). Sampling from DPMs can be viewed as solving an ordinary differential equation (ODE). Despite the promising performance, the generation of DPMs usually consumes much time due to the large number of function evaluations (NFE). Though recent works have accelerated the sampling to around 20 steps with high-order solvers, the sample quality with less than 10 NFE can still be improved. In this paper, we propose a unified sampling framework (USF) to study the optional strategies for solver. Under this framework, we further reveal that taking different solving strategies at different timesteps may help further decrease the truncation error, and a carefully designed \emph{solver schedule} has the potential to improve the sample quality by a large margin. Therefore, we propose a new sampling framework based on the exponential integral formulation that allows free choices of solver strategy at each step and design specific decisions for the framework. Moreover, we propose $S^3$, a predictor-based search method that automatically optimizes the solver schedule to get a better time-quality trade-off of sampling. We demonstrate that $S^3$ can find outstanding solver schedules which outperform the state-of-the-art sampling methods on CIFAR-10, CelebA, ImageNet, and LSUN-Bedroom datasets. Specifically, we achieve 2.69 FID with 10 NFE and 6.86 FID with 5 NFE on CIFAR-10 dataset, outperforming the SOTA method significantly. We further apply $S^3$ to Stable-Diffusion model and get an acceleration ratio of 2$\times$, showing the feasibility of sampling in very few steps without retraining the neural network.
Submitted 12 December, 2023;
originally announced December 2023.
-
Effects of domain walls and chiral supercurrent in quantum anomalous Hall Josephson junctions
Authors:
Junjie Qi,
Haiwen Liu,
Jie Liu,
Hua Jiang,
Dong E. Liu,
Chui-Zhen Chen,
Ke He,
X. C. Xie
Abstract:
The intriguing interplay between topology and superconductivity has attracted significant attention, given its potential for realizing topological superconductivity. In this study, we investigate the transport properties of the chiral Josephson effect in the quantum anomalous Hall insulators (QAHIs)-based junction. We reveal a systematic crossover from edge-state to bulk-state dominant supercurrents, with a notable $0-π$ transition observed under non-zero magnetic flux through chemical potential adjustments. This transition underscores the competition between bulk and chiral edge transport. Furthermore, we identify an evolution among three distinct quantum interference patterns: from a $2Φ_0$-periodic oscillation pattern, to a $Φ_0$-periodic oscillation pattern, and then to an asymmetric Fraunhofer pattern ($Φ_0 = h/2e$ is the flux quantum, $h$ the Planck constant, and $e$ the electron charge). Subsequently, we examine the influence of domains on quantum interference patterns. Intriguingly, a distinctive Fraunhofer-like pattern emerges due to coexistence of chiral edge states and domain wall states, even when the chemical potential is within gap. These results not only advance the theoretical understanding but also pave the way for the experimental discovery of the chiral Josephson effect based on QAHI doped with magnetic impurities.
Submitted 14 December, 2023; v1 submitted 30 November, 2023;
originally announced December 2023.
-
Compact Electrochromic Optical Recording of Bioelectric Potentials
Authors:
Kenneth Nakasone,
Chris Zavik,
Erica Liu,
Burhan Ahmed,
Dana Griffith,
Lothar Maisenbacher,
Ashwin Singh,
Yuecheng Zhou,
Bianxiao Cui,
Holger Müller
Abstract:
Electrochromic optical recording (ECORE) is a label-free method that utilizes electrochromism to optically detect electrical signals in biological cells with a high signal-to-noise ratio and is suitable for long-term recording. However, ECORE usually requires a large and intricate optical setup, making it relatively difficult to transport and to study specimens on a large scale. Here, we present a compact ECORE apparatus that drastically reduces the spatial footprint and complexity of the ECORE setup whilst maintaining high sensitivity. An autobalancing differential photodetector automates common-mode noise rejection, removing the need for manually adjustable optics, and a compact laser module conserves space compared to a typical laser mount. The result is a simple, easy-to-use, and relatively low cost system that achieves a sensitivity of 16.7 μV (within a factor of 5 of the shot noise limit), and reliably detects action potentials from Human-induced pluripotent stem cell (HiPSC) derived cardiomyocytes. This setup can be further improved to within 1.5 dB of the shot noise limit by filtering out power-line interference.
Submitted 26 November, 2023;
originally announced November 2023.
-
Program-Aided Reasoners (better) Know What They Know
Authors:
Anubha Kabra,
Sanketh Rangreji,
Yash Mathur,
Aman Madaan,
Emmy Liu,
Graham Neubig
Abstract:
Prior work shows that program-aided reasoning, in which large language models (LLMs) are combined with programs written in programming languages such as Python, can significantly improve accuracy on various reasoning tasks. However, while accuracy is essential, it is also important for such reasoners to "know what they know", which can be quantified through the calibration of the model. In this paper, we compare the calibration of Program Aided Language Models (PAL) and text-based Chain-of-thought (COT) prompting techniques over 5 datasets and 2 model types: LLaMA models and OpenAI models. Our results indicate that PAL leads to improved calibration in 75% of the instances. Our analysis uncovers that prompting styles that produce lesser diversity in generations also have more calibrated results, and thus we also experiment with inducing lower generation diversity using temperature scaling and find that for certain temperatures, PAL is not only more accurate but is also more calibrated than COT. Overall, we demonstrate that, in the majority of cases, program-aided reasoners better know what they know than text-based counterparts.
Submitted 15 November, 2023;
originally announced November 2023.
-
Divergences between Language Models and Human Brains
Authors:
Yuchen Zhou,
Emmy Liu,
Graham Neubig,
Michael J. Tarr,
Leila Wehbe
Abstract:
Do machines and humans process language in similar ways? Recent research has hinted in the affirmative, finding that brain signals can be effectively predicted using the internal representations of language models (LMs). Although such results are thought to reflect shared computational principles between LMs and human brains, there are also clear differences in how LMs and humans represent and use language. In this work, we systematically explore the divergences between human and machine language processing by examining the differences between LM representations and human brain responses to language as measured by Magnetoencephalography (MEG) across two datasets in which subjects read and listened to narrative stories. Using a data-driven approach, we identify two domains that are not captured well by LMs: social/emotional intelligence and physical commonsense. We then validate these domains with human behavioral experiments and show that fine-tuning LMs on these domains can improve their alignment with human brain responses.
Submitted 4 February, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
Aligned: A Platform-based Process for Alignment
Authors:
Ethan Shaotran,
Ido Pesok,
Sam Jones,
Emi Liu
Abstract:
We are introducing Aligned, a platform for global governance and alignment of frontier models, and eventually superintelligence. While previous efforts at the major AI labs have attempted to gather inputs for alignment, these are often conducted behind closed doors. We aim to set the foundation for a more trustworthy, public-facing approach to safety: a constitutional committee framework. Initial tests with 680 participants result in a 30-guideline constitution with 93% overall support. We show the platform naturally scales, instilling confidence and enjoyment from the community. We invite other AI labs and teams to plug and play into the Aligned ecosystem.
Submitted 15 November, 2023;
originally announced November 2023.
-
The NeurIPS 2022 Neural MMO Challenge: A Massively Multiagent Competition with Specialization and Trade
Authors:
Enhong Liu,
Joseph Suarez,
Chenhui You,
Bo Wu,
Bingcheng Chen,
Jun Hu,
Jiaxin Chen,
Xiaolong Zhu,
Clare Zhu,
Julian Togelius,
Sharada Mohanty,
Weijun Hong,
Rui Du,
Yibing Zhang,
Qinwen Wang,
Xinhang Li,
Zheng Yuan,
Xiang Li,
Yuejia Huang,
Kun Zhang,
Hanhui Yang,
Shiqi Tang,
Phillip Isola
Abstract:
In this paper, we present the results of the NeurIPS-2022 Neural MMO Challenge, which attracted 500 participants and received over 1,600 submissions. Like the previous IJCAI-2022 Neural MMO Challenge, it involved agents from 16 populations surviving in procedurally generated worlds by collecting resources and defeating opponents. This year's competition runs on the latest v1.6 Neural MMO, which introduces new equipment, combat, trading, and a better scoring system. These elements combine to pose additional robustness and generalization challenges not present in previous competitions. This paper summarizes the design and results of the challenge, explores the potential of this environment as a benchmark for learning methods, and presents some practical reinforcement learning training approaches for complex tasks with sparse rewards. Additionally, we have open-sourced our baselines, including environment wrappers, benchmarks, and visualization tools for future research.
Submitted 6 November, 2023;
originally announced November 2023.
-
Non-volatile memory based on PZT/FeGa thin film memtranstor
Authors:
**-Cheng He,
Jian Xing,
Jian-Xin Shen,
Dan Su,
En-Ke Liu,
Shou-Guo Wang,
Young Sun
Abstract:
The PZT/FeGa thin film memtranstor was prepared and the modulation of the magnetoelectric coefficient by external magnetic and electric fields was studied. The magnetoelectric coefficient of the PZT/FeGa memtranstor can be reversed by flipping the direction of magnetization of FeGa or ferroelectric polarization of PZT. Notably, the sign of the magnetoelectric coefficient can be switched repeatedly by reversing ferroelectric polarization of PZT when the external magnetic field remains constant. Moreover, the binary switching behavior can still be maintained under zero DC bias magnetic field. When the polarization direction remains stable, the magnetoelectric coefficient also does not change. This means that the magnetoelectric coefficient of PZT/FeGa is non-volatile. Furthermore, the retention and endurance characteristics of the PZT/FeGa thin film memtranstor have been investigated. These findings demonstrate the potential of the PZT/FeGa thin film memtranstor for non-volatile memory applications.
Submitted 23 October, 2023;
originally announced October 2023.
-
A Non-Hermitian Moiré Valley Filter
Authors:
Kai Shao,
Hao Geng,
Erfu Liu,
Jose L. Lado,
Wei Chen,
D. Y. Xing
Abstract:
A valley filter capable of generating a valley-polarized current is a crucial element in valleytronics, yet its implementation remains challenging. Here, we propose a valley filter made of a graphene bilayer which exhibits a 1D moiré pattern in the overlapping region of the two layers controlled by heterostrain. In the presence of a lattice modulation between layers, electrons propagating in one layer can have valley-dependent dissipation due to valley asymmetric interlayer coupling, thus giving rise to a valley-polarized current. Such a process can be described by an effective non-Hermitian theory, in which the valley filter is driven by a valley-resolved non-Hermitian skin effect. Nearly 100\% valley-polarization can be achieved within a wide parameter range and the functionality of the valley filter is electrically tunable. The non-Hermitian topological scenario of the valley filter ensures high tolerance against imperfections such as disorder and edge defects. Our work opens a new route for efficient and robust valley filters while significantly relaxing the stringent implementation requirements.
Submitted 18 April, 2024; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting
Authors:
Emmy Liu,
Aditi Chaudhary,
Graham Neubig
Abstract:
Idioms are common in everyday language, but often pose a challenge to translators because their meanings do not follow from the meanings of their parts. Despite significant advances, machine translation systems still struggle to translate idiomatic expressions. We provide a simple characterization of idiomatic translation and related issues. This allows us to conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations. To expand multilingual resources, we compile a dataset of ~4k natural sentences containing idiomatic expressions in French, Finnish, and Japanese. To improve translation of natural idioms, we introduce two straightforward yet effective techniques: the strategic upweighting of training loss on potentially idiomatic sentences, and using retrieval-augmented models. This not only improves the accuracy of a strong pretrained MT model on idiomatic sentences by up to 13% in absolute accuracy, but also holds potential benefits for non-idiomatic sentences.
Submitted 20 October, 2023; v1 submitted 10 October, 2023;
originally announced October 2023.
-
Mitigating crosstalk and residual coupling errors in superconducting quantum processors using many-body localization
Authors:
Peng Qian,
Hong-Ze Xu,
Peng Zhao,
Xiao Li,
Dong E. Liu
Abstract:
Addressing the paramount need for precise calibration in superconducting quantum qubits, especially in frequency control, this study introduces a novel calibration scheme harnessing the principles of Many-Body Localization (MBL). While existing strategies, such as Google's snake algorithm, have targeted optimization of qubit frequency parameters, our MBL-based methodology emerges as a stalwart against noise, notably crosstalk and residual coupling errors, thereby significantly enhancing quantum processor fidelity and stability without necessitating extensive optimization computation. Not only does this approach provide a marked improvement in performance, particularly where specific residue couplings are present, but it also presents a more resource-efficient and cost-effective calibration process. The research delineated herein affords fresh insights into advanced calibration strategies and propels forward the domain of superconducting quantum computation by offering a robust framework for future explorations in minimizing error and optimizing qubit performance.
Submitted 15 October, 2023; v1 submitted 10 October, 2023;
originally announced October 2023.
-
A Two-stage Based Social Preference Recognition in Multi-Agent Autonomous Driving System
Authors:
**tao Xue,
Dongkun Zhang,
Rong Xiong,
Yue Wang,
Eryun Liu
Abstract:
Multi-Agent Reinforcement Learning (MARL) has become a promising solution for constructing a multi-agent autonomous driving system (MADS) in complex and dense scenarios. But most methods consider agents acting selfishly, which leads to conflict behaviors. Some existing works incorporate the concept of social value orientation (SVO) to promote coordination, but they lack the knowledge of other agents' SVOs, resulting in conservative maneuvers. In this paper, we aim to tackle the mentioned problem by enabling the agents to understand other agents' SVOs. To accomplish this, we propose a two-stage system framework. Firstly, we train a policy by allowing the agents to share their ground truth SVOs to establish a coordinated traffic flow. Secondly, we develop a recognition network that estimates agents' SVOs and integrates it with the policy trained in the first stage. Experiments demonstrate that our developed method significantly improves the performance of the driving policy in MADS compared to two state-of-the-art MARL algorithms.
Submitted 5 October, 2023;
originally announced October 2023.
-
The Ideal of Vanishing Polynomials and the Ring of Polynomial Functions
Authors:
Matvey Borodin,
Ethan Liu,
Justin Zhang
Abstract:
Vanishing polynomials are polynomials over a ring which output $0$ for all elements in the ring. In this paper, we study the ideal of vanishing polynomials over specific types of rings, along with the closely related ring of polynomial functions. In particular, we provide several results on generating vanishing polynomials. We first analyze the ideal of vanishing polynomials over $\mathbb{Z}_n$, the ring of the integers modulo $n$. We then establish an isomorphism between the vanishing polynomials of a ring and the vanishing polynomials of the constituent rings in its decomposition. Lastly, we generalize our results to study the ideal of vanishing polynomials over arbitrary commutative rings.
Submitted 24 September, 2023;
originally announced October 2023.