-
RVISA: Reasoning and Verification for Implicit Sentiment Analysis
Authors:
Wenna Lai,
Haoran Xie,
Guandong Xu,
Qing Li
Abstract:
With an increasing social demand for fine-grained sentiment analysis (SA), implicit sentiment analysis (ISA) poses a significant challenge with the absence of salient cue words in expressions. It necessitates reliable reasoning to understand how the sentiment is aroused and thus determine implicit sentiments. In the era of Large Language Models (LLMs), Encoder-Decoder (ED) LLMs have gained popularity to serve as backbone models for SA applications, considering impressive text comprehension and reasoning ability among diverse tasks. On the other hand, Decoder-only (DO) LLMs exhibit superior natural language generation and in-context learning capabilities. However, their responses may contain misleading or inaccurate information. To identify implicit sentiment with reliable reasoning, this study proposes RVISA, a two-stage reasoning framework that harnesses the generation ability of DO LLMs and the reasoning ability of ED LLMs to train an enhanced reasoner. Specifically, we adopt three-hop reasoning prompting to explicitly furnish sentiment elements as cues. The generated rationales are utilized to fine-tune an ED LLM into a skilled reasoner. Additionally, we develop a straightforward yet effective verification mechanism to ensure the reliability of the reasoning learning. We evaluated the proposed method on two benchmark datasets and achieved state-of-the-art results in ISA performance.
Submitted 2 July, 2024;
originally announced July 2024.
-
HRSAM: Efficiently Segment Anything in High-Resolution Images
Authors:
You Huang,
Wenbin Lai,
Jiayi Ji,
Liujuan Cao,
Shengchuan Zhang,
Rongrong Ji
Abstract:
The Segment Anything Model (SAM) has significantly advanced interactive segmentation but struggles with high-resolution images crucial for high-precision segmentation. This is primarily due to the quadratic space complexity of SAM-implemented attention and the length extrapolation issue in common global attention. This study proposes HRSAM that integrates Flash Attention and incorporates Plain, Shifted and newly proposed Cycle-scan Window (PSCWin) attention to address these issues. The shifted window attention is redesigned with padding to maintain consistent window sizes, enabling effective length extrapolation. The cycle-scan window attention adopts the recently developed State Space Models (SSMs) to ensure global information exchange with minimal computational overhead. Such window-based attention allows HRSAM to perform effective attention computations on scaled input images while maintaining low latency. Moreover, we further propose HRSAM++ that additionally employs a multi-scale strategy to enhance HRSAM's performance. The experiments on the high-precision segmentation datasets HQSeg44K and DAVIS show that high-resolution inputs enable the SAM-distilled HRSAM models to outperform the teacher model while maintaining lower latency. Compared to the SOTAs, HRSAM achieves a 1.56 improvement in interactive segmentation's NoC95 metric with only 31% of the latency. HRSAM++ further enhances the performance, achieving a 1.63 improvement in NoC95 with just 38% of the latency.
Submitted 2 July, 2024;
originally announced July 2024.
-
LLMs Beyond English: Scaling the Multilingual Capability of LLMs with Cross-Lingual Feedback
Authors:
Wen Lai,
Mohsen Mesgar,
Alexander Fraser
Abstract:
To democratize large language models (LLMs) to most natural languages, it is imperative to make these models capable of understanding and generating texts in many languages, in particular low-resource ones. While recent multilingual LLMs demonstrate remarkable performance in such capabilities, these LLMs still support a limited number of human languages due to the lack of training data for low-resource languages. Moreover, these LLMs are not yet aligned with human preference for downstream tasks, which is crucial for the success of LLMs in English. In this paper, we introduce xLLaMA-100 and xBLOOM-100 (collectively xLLMs-100), which scale the multilingual capabilities of LLaMA and BLOOM to 100 languages. To do so, we construct two datasets: a multilingual instruction dataset including 100 languages, which represents the largest language coverage to date, and a cross-lingual human feedback dataset encompassing 30 languages. We perform multilingual instruction tuning on the constructed instruction data and further align the LLMs with human feedback using the DPO algorithm on our cross-lingual human feedback dataset. We evaluate the multilingual understanding and generating capabilities of xLLMs-100 on five multilingual benchmarks. Experimental results show that xLLMs-100 consistently outperforms its peers across the benchmarks by considerable margins, defining a new state-of-the-art multilingual LLM that supports 100 languages.
Submitted 3 June, 2024;
originally announced June 2024.
-
A Vlogger-augmented Graph Neural Network Model for Micro-video Recommendation
Authors:
Weijiang Lai,
Beihong Jin,
Beibei Li,
Yiyuan Zheng,
Rui Zhao
Abstract:
Existing micro-video recommendation models exploit the interactions between users and micro-videos and/or multi-modal information of micro-videos to predict the next micro-video a user will watch, ignoring the information related to vloggers, i.e., the producers of micro-videos. However, in micro-video scenarios, vloggers play a significant role in user-video interactions, since vloggers generally focus on specific topics and users tend to follow the vloggers they are interested in. Therefore, in this paper, we propose a vlogger-augmented graph neural network model, VA-GNN, which takes the effect of vloggers into consideration. Specifically, we construct a tripartite graph with users, micro-videos, and vloggers as nodes, capturing user preferences from different views, i.e., the video-view and the vlogger-view. Moreover, we conduct cross-view contrastive learning to keep the consistency between node embeddings from the two different views. In addition, when predicting the next user-video interaction, we adaptively combine the user preferences for a video itself and its vlogger. We conduct extensive experiments on two real-world datasets. The experimental results show that VA-GNN outperforms multiple existing GNN-based recommendation models.
Submitted 28 May, 2024;
originally announced May 2024.
-
Source Localization by Multidimensional Steered Response Power Mapping with Sparse Bayesian Learning
Authors:
Wei-Ting Lai,
Lachlan Birnie,
Xingyu Chen,
Amy Bastine,
Thushara D. Abhayapala,
Prasanga N. Samarasinghe
Abstract:
We propose an advanced Steered Response Power (SRP) method for localizing multiple sources. While conventional SRP performs well in adverse conditions, it still struggles in scenarios with closely neighboring sources, resulting in ambiguous SRP maps. We address this issue by applying sparsity optimization in SRP to obtain high-resolution maps. Our approach represents SRP maps as multidimensional matrices to preserve time-frequency information and further improve performance in unfavorable conditions. We use multi-dictionary Sparse Bayesian Learning to localize sources without needing prior knowledge of their quantity. We validate our method through practical experiments with a 16-channel planar microphone array and compare against three other SRP and sparsity-based methods. Our multidimensional SRP approach outperforms conventional SRP and the current state-of-the-art sparse SRP methods for localizing closely spaced sources in a reverberant room.
Submitted 20 May, 2024;
originally announced May 2024.
-
Atomic Vibration Sensor
Authors:
Wenxi Lai
Abstract:
Previously, optical glass plates, optical fibres, carbon nanotubes, semiconductor materials, piezoelectric materials and molecules have been shown to be effective transducers for sensing vibrations. In this work, we propose, for the first time, a model of a vibration sensor using single atoms at low temperature. In this apparatus, information about a mechanical vibration can be transferred into shaking of an optical lattice through one of the cavity mirrors. The shaking lattice consequently induces a Mott insulator due to quantum interference. We find that information about the vibration is encoded in the atomic current and can be extracted by Fourier transformation. The proposed atomic vibration sensor has a wide frequency detection range with high precision. This sensor model based on an atomic system opens a new area in the study of vibration sensors.
Submitted 16 May, 2024;
originally announced May 2024.
-
Monaural speech enhancement on drone via Adapter based transfer learning
Authors:
Xingyu Chen,
Hanwen Bi,
Wei-Ting Lai,
Fei Ma
Abstract:
Monaural speech enhancement on drones is challenging because the ego-noise from the rotating motors and propellers leads to extremely low signal-to-noise ratios at onboard microphones. Although recent masking-based deep neural network methods excel in monaural speech enhancement, they struggle in the challenging drone noise scenario. Furthermore, existing drone noise datasets are limited, causing models to overfit. Considering the harmonic nature of drone noise, this paper proposes a frequency domain bottleneck adapter to enable transfer learning. Specifically, the adapter's parameters are trained on drone noise while the parameters of the pre-trained Frequency Recurrent Convolutional Recurrent Network (FRCRN) are kept fixed. Evaluation results demonstrate that the proposed method can effectively enhance speech quality. Moreover, it is a more efficient alternative to fine-tuning models for various drone types, which typically requires substantial computational resources.
Submitted 16 May, 2024;
originally announced May 2024.
-
Quantum geometric tensor and the topological characterization of the extended Su-Schrieffer-Heeger model
Authors:
Xiang-Long Zeng,
Wen-Xi Lai,
Yi-Wen Wei,
Yu-Quan Ma
Abstract:
We investigate the quantum metric and topological Euler number in a cyclically modulated Su-Schrieffer-Heeger (SSH) model with long-range hopping terms. By computing the quantum geometric tensor, we derive exact expressions for the quantum metric and Berry curvature of the energy band electrons, and we obtain the phase diagram of the model marked by the first Chern number. Furthermore, we obtain the topological Euler number of the energy band based on the Gauss-Bonnet theorem on the topological characterization of the closed Bloch states manifold in the first Brillouin zone. However, some regions where the Berry curvature is identically zero in the first Brillouin zone result in the degeneracy of the quantum metric, which leads to ill-defined non-integer topological Euler numbers. Nevertheless, the non-integer "Euler number" provides valuable insights and an upper bound for the absolute values of the Chern numbers.
Submitted 11 April, 2024;
originally announced April 2024.
-
Vision-Language Model-based Physical Reasoning for Robot Liquid Perception
Authors:
Wenqiang Lai,
Yuan Gao,
Tin Lun Lam
Abstract:
There is a growing interest in applying large language models (LLMs) in robotic tasks, due to their remarkable reasoning ability and extensive knowledge learned from vast training corpora. Grounding LLMs in the physical world remains an open challenge as they can only process textual input. Recent advancements in large vision-language models (LVLMs) have enabled a more comprehensive understanding of the physical world by incorporating visual input, which provides richer contextual information than language alone. In this work, we proposed a novel paradigm that leveraged GPT-4V(ision), the state-of-the-art LVLM by OpenAI, to enable embodied agents to perceive liquid objects via image-based environmental feedback. Specifically, we exploited the physical understanding of GPT-4V to interpret the visual representation (e.g., time-series plot) of non-visual feedback (e.g., F/T sensor data), indirectly enabling multimodal perception beyond vision and language using images as proxies. We evaluated our method using 10 common household liquids with containers of various geometry and material. Without any training or fine-tuning, we demonstrated that our method can enable the robot to indirectly perceive the physical response of liquids and estimate their viscosity. We also showed that by jointly reasoning over the visual and physical attributes learned through interactions, our method could recognize liquid objects in the absence of strong visual cues (e.g., container labels with legible text or symbols), increasing the accuracy from 69.0% -- achieved by the best-performing vision-only variant -- to 86.0%.
Submitted 10 April, 2024;
originally announced April 2024.
-
Coherent insulator at arbitrary frequency in a driven atomtronic transistor
Authors:
Wenxi Lai
Abstract:
We use a numerical approach to study non-equilibrium transport of an atomic gas in a driven optical-lattice atomtronic transistor. The shaken optical-lattice transistor behaves as an insulator within some regions of shaking frequency and shaking strength. It is shown that the appearance of the insulating behavior is directly connected to the coherence of the system. This coherence is accompanied by coherent trapping of the non-equilibrium atomic gas in one of the optical wells, which stops the atomic currents. Compared with the effective Hamiltonian approach in Floquet engineering, the time-dependent Hamiltonian approach can be used in any frequency regime of a periodically driven quantum system.
Submitted 17 March, 2024;
originally announced March 2024.
-
Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches
Authors:
Yun Xin Teoh,
Alice Othmani,
Siew Li Goh,
Juliana Usman,
Khin Wee Lai
Abstract:
Knee osteoarthritis is a degenerative joint disease that induces chronic pain and disability. Bone morphological analysis is a promising tool to understand the mechanical aspect of this disorder. This study proposes a 2D bone morphological analysis using manually segmented bones to explore morphological features related to distinct pain conditions. Furthermore, six semantic segmentation algorithms are assessed for extracting femur and tibia bones from X-ray images. Our analysis reveals that the morphology of the femur undergoes significant changes in instances where pain worsens. Conversely, improvements in pain may not manifest pronounced alterations in bone shape. The few-shot-learning-based algorithm, UniverSeg, demonstrated superior segmentation results with Dice scores of 99.69% for femur and 99.60% for tibia. Regarding pain condition classification, the zero-shot-learning-based algorithm, CP-SAM, achieved the highest accuracy at 66% among all models. UniverSeg is recommended for automatic knee bone segmentation, while SAM models show potential with prompt encoder modifications for optimized outcomes. These findings highlight the effectiveness of few-shot learning for semantic segmentation and the potential of zero-shot learning in enhancing classification models for knee osteoarthritis diagnosis.
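As a point of reference for the Dice scores quoted in this abstract, the Dice coefficient between a predicted and a ground-truth binary mask can be computed as in the minimal sketch below. The function name `dice_score`, the flat 0/1 list representation, and the empty-mask convention are assumptions made for illustration; they are not taken from the paper.

```python
def dice_score(pred, truth):
    """Dice coefficient 2*|A and B| / (|A| + |B|) for two equal-length 0/1 masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example with hypothetical masks (flattened to 1D).
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 0, 1]
print(dice_score(pred, truth))
```

A score of 1.0 means the two masks coincide exactly, so values such as 99.69% indicate near-complete overlap with the manual segmentation.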
Submitted 13 March, 2024;
originally announced March 2024.
-
On Defeating Graph Analysis of Anonymous Transactions
Authors:
Christoph Egger,
Russell W. F. Lai,
Viktoria Ronge,
Ivy K. Y. Woo,
Hoover H. F. Yin
Abstract:
In a ring-signature-based anonymous cryptocurrency, signers of a transaction are hidden among a set of potential signers, called a ring, whose size is much smaller than the number of all users. The ring-membership relations specified by the sets of transactions thus induce bipartite transaction graphs, whose distribution is in turn induced by the ring sampler underlying the cryptocurrency.
Since efficient graph analysis could be performed on transaction graphs to potentially deanonymise signers, it is crucial to understand the resistance of (the transaction graphs induced by) a ring sampler against graph analysis. Of particular interest is the class of partitioning ring samplers. Although previous works showed that they provide almost optimal local anonymity, their resistance against global, e.g. graph-based, attacks was unclear.
In this work, we analyse transaction graphs induced by partitioning ring samplers. Specifically, we show (partly analytically and partly empirically) that, somewhat surprisingly, by setting the ring size to be at least logarithmic in the number of users, a graph-analysing adversary is no better than one that performs random guessing in deanonymisation, up to a constant factor of 2.
Submitted 28 February, 2024;
originally announced February 2024.
-
Dual-IMU State Estimation for Relative Localization of Two Mobile Agents
Authors:
Wenqian Lai,
Ruonan Guo,
Kejian J. Wu
Abstract:
In this paper, we address the problem of relative localization of two mobile agents. Specifically, we consider the Dual-IMU system, where each agent is equipped with one IMU, and employs relative pose observations between them. Previous works, however, typically assumed known ego motion and ignored biases of the IMUs. Instead, we study the most general case of unknown biases for both IMUs. Besides the derivation of dynamic model equations of the proposed system, we focus on the observability analysis, for the observability under general motion and the unobservable directions arising from various special motions. Through numerical simulations, we validate our key observability findings and examine their impact on the estimation accuracy and consistency. Finally, the system is implemented to achieve effective relative localization of an HMD with respect to a vehicle moving in the real world.
Submitted 6 March, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Efficient Hybrid Zoom using Camera Fusion on Mobile Phones
Authors:
Xiaotong Wu,
Wei-Sheng Lai,
YiChang Shih,
Charles Herrmann,
Michael Krainin,
Deqing Sun,
Chia-Kai Liang
Abstract:
DSLR cameras can achieve multiple zoom levels via shifting lens distances or swapping lens types. However, these techniques are not possible on smartphone devices due to space constraints. Most smartphone manufacturers adopt a hybrid zoom system: commonly a Wide (W) camera at a low zoom level and a Telephoto (T) camera at a high zoom level. To simulate zoom levels between W and T, these systems crop and digitally upsample images from W, leading to significant detail loss. In this paper, we propose an efficient system for hybrid zoom super-resolution on mobile devices, which captures a synchronous pair of W and T shots and leverages machine learning models to align and transfer details from T to W. We further develop an adaptive blending method that accounts for depth-of-field mismatches, scene occlusion, flow uncertainty, and alignment errors. To minimize the domain gap, we design a dual-phone camera rig to capture real-world inputs and ground-truths for supervised training. Our method generates a 12-megapixel image in 500ms on a mobile platform and compares favorably against state-of-the-art methods under extensive evaluation on real-world scenarios.
Submitted 2 January, 2024;
originally announced January 2024.
-
Amplitude-Ensemble Quantum-Inspired Tabu Search Algorithm for Solving 0/1 Knapsack Problems
Authors:
Kuo-Chun Tseng,
Wei-Chieh Lai,
I-Chia Chen,
Yun-Hsiang Hsiao,
Jr-Yu Chiue,
Wei-Chun Huang
Abstract:
In this paper, an improved version of QTS (Quantum-inspired Tabu Search), called "amplitude-ensemble" QTS (AE-QTS), is proposed, which enhances the utilization of population information. This makes AE-QTS more similar, in abstract concept, to the real quantum search algorithm, Grover's search algorithm, while keeping the simplicity of the algorithm. We then demonstrate AE-QTS on the classical combinatorial optimization 0/1 knapsack problem. Experimental results show that AE-QTS outperforms other algorithms, including QTS, by at least an average of 20% in all cases and even by 30% in some cases. Even as the problem complexity increases, the quality of the solutions found by our method remains superior to that of QTS. These results prove that our method has better search performance.
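For context on the benchmark problem named in this abstract, the 0/1 knapsack problem asks for a subset of items with maximum total value whose total weight does not exceed a capacity. The sketch below is a textbook dynamic-programming solver that gives the exact optimum for small instances with non-negative integer weights; it is not the QTS or AE-QTS algorithm, and the function name `knapsack_01` and the toy instance are assumptions for illustration.

```python
def knapsack_01(weights, values, capacity):
    """Exact 0/1 knapsack optimum via dynamic programming over capacities.

    Assumes non-negative integer weights and a non-negative integer capacity.
    """
    # best[c] = best total value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


# Hypothetical toy instance: the best choice is the items of weight 3 and 4.
print(knapsack_01([1, 3, 4, 5], [1, 4, 5, 7], 7))
```

An exact solver of this kind is only practical for small capacities, which is one reason heuristic searches are evaluated on larger instances.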
Submitted 17 March, 2024; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Extending Multilingual Machine Translation through Imitation Learning
Authors:
Wen Lai,
Viktor Hangya,
Alexander Fraser
Abstract:
Despite the growing variety of languages supported by existing multilingual neural machine translation (MNMT) models, most of the world's languages are still being left behind. We aim to extend large-scale MNMT models to a new language, allowing for translation between the newly added and all of the already supported languages in a challenging scenario: using only a parallel corpus between the new language and English. Previous approaches, such as continued training on parallel data including the new language, suffer from catastrophic forgetting (i.e., performance on other languages is reduced). Our novel approach Imit-MNMT treats the task as an imitation learning process, which mimics the behavior of an expert, a technique widely used in the computer vision area, but not well explored in NLP. More specifically, we construct a pseudo multi-parallel corpus of the new and the original languages by pivoting through English, and imitate the output distribution of the original MNMT model. Extensive experiments show that our approach significantly improves the translation performance between the new and the original languages, without severe catastrophic forgetting. We also demonstrate that our approach is capable of solving copy and off-target problems, which are two common issues in current large-scale MNMT models.
Submitted 14 November, 2023;
originally announced November 2023.
-
Extremal surfaces in glue-on AdS/$T\bar T$ holography
Authors:
Luis Apolo,
Peng-Xiang Hao,
Wen-Xin Lai,
Wei Song
Abstract:
$T\bar T$ deformed CFTs with positive deformation parameter have been proposed to be holographically dual to Einstein gravity in a glue-on $\mathrm{AdS}_3$ spacetime. The latter is constructed from AdS$_3$ by gluing a patch of an auxiliary AdS$_3^*$ spacetime to its asymptotic boundary. In this work, we propose a glue-on version of the Ryu-Takayanagi formula, which is given by the signed area of an extremal surface. The extremal surface is anchored at the endpoints of an interval on a cutoff surface in the glue-on geometry. It consists of an RT surface lying in the AdS$_3$ part of the spacetime and its extension to the AdS$_3^*$ region. The signed area is the length of the RT surface minus the length of the segments in AdS$_3^*$. We find that the Ryu-Takayanagi formula with the signed area reproduces the entanglement entropy of a half interval for $T\bar T$-deformed CFTs on the sphere. We then study the properties of extremal surfaces on various glue-on geometries, including Poincaré $\mathrm{AdS}_3$, global $\mathrm{AdS}_3$, and the BTZ black hole. When anchored on multiple intervals at the boundary, the signed area of the minimal surfaces undergoes phase transitions with novel properties. In all of these examples, we find that the glue-on extremal surfaces exhibit a minimum length related to the deformation parameter of $T\bar T$-deformed CFTs.
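The signed-area prescription described in this abstract can be written schematically as follows. The symbols $\tilde{A}$ (signed area), $\gamma_{\mathrm{AdS}_3}$ and $\gamma_{\mathrm{AdS}_3^*}$ (the portions of the extremal surface in each region), and $G$ (Newton's constant) are notation assumed here for illustration, and the entropy relation is the standard Ryu-Takayanagi form with the area replaced by the signed area; the paper's own notation may differ.

```latex
\tilde{A}
  = \mathrm{Length}\big(\gamma_{\mathrm{AdS}_3}\big)
  - \mathrm{Length}\big(\gamma_{\mathrm{AdS}_3^*}\big),
\qquad
S = \frac{\tilde{A}}{4G}.
```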
Submitted 29 January, 2024; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning
Authors:
Rui Zheng,
Wei Shen,
Yuan Hua,
Wenbin Lai,
Shihan Dou,
Yuhao Zhou,
Zhiheng Xi,
Xiao Wang,
Haoran Huang,
Tao Gui,
Qi Zhang,
Xuanjing Huang
Abstract:
The success of AI assistants based on language models (LLMs) hinges crucially on Reinforcement Learning from Human Feedback (RLHF), which enables the generation of responses more aligned with human preferences. As universal AI assistants, there's a growing expectation for them to perform consistently across various domains. However, previous work shows that Reinforcement Learning (RL) often exploits shortcuts to attain high rewards and overlooks challenging samples. This focus on quick reward gains undermines both the stability in training and the model's ability to generalize to new, unseen data. In this work, we propose a novel approach that can learn a consistent policy via RL across various data groups or domains. Given the challenges associated with acquiring group annotations, our method automatically classifies data into different groups, deliberately maximizing performance variance. Then, we optimize the policy to perform well on challenging groups. Lastly, leveraging the established groups, our approach adaptively adjusts the exploration space, allocating more learning capacity to more challenging data and preventing the model from over-optimizing on simpler data. Experimental results indicate that our approach significantly enhances training stability and model generalization.
Submitted 25 December, 2023; v1 submitted 18 October, 2023;
originally announced October 2023.
-
A Two-Step Approach for Narrowband Source Localization in Reverberant Rooms
Authors:
Wei-Ting Lai,
Lachlan Birnie,
Thushara Abhayapala,
Amy Bastine,
Shaoheng Xu,
Prasanga Samarasinghe
Abstract:
This paper presents a two-step approach for narrowband source localization within reverberant rooms. The first step involves dereverberation by modeling the homogeneous component of the sound field by an equivalent decomposition of planewaves using Iteratively Reweighted Least Squares (IRLS), while the second step focuses on source localization by modeling the dereverberated component as a sparse representation of point-source distribution using Orthogonal Matching Pursuit (OMP). The proposed method enhances localization accuracy with fewer measurements, particularly in environments with strong reverberation. A numerical simulation in a conference room scenario, using a uniform microphone array affixed to the wall, demonstrates real-world feasibility. Notably, the proposed method and microphone placement effectively localize sound sources within the 2D-horizontal plane without requiring prior knowledge of boundary conditions and room geometry, making it versatile for application in different room types.
Submitted 24 September, 2023;
originally announced September 2023.
-
Neutrino Imaging of the Galactic Centre and Millisecond Pulsar Population
Authors:
Paul C. W. Lai,
Matteo Agostini,
Foteini Oikonomou,
Beatrice Crudele,
Ellis R. Owen,
Kinwah Wu
Abstract:
In this work, we consider the possible presence of a large population of millisecond pulsars in the Galactic Centre. Their direct detection would be challenging due to severe pulse broadening caused by scattering of radiation. We propose a new method to constrain their population with neutrino imaging of the Galactic Centre. Millisecond pulsars are proposed cosmic-ray accelerators. The high-energy protons they produce will collide with the baryonic matter in the central molecular zone to create charged and neutral pions that decay into neutrinos and $γ$-rays, respectively. The specific neutrino and $γ$-ray fluxes must be below their corresponding observed values, allowing us to put a conservative upper limit on the millisecond pulsar population of N_MSP < 10,000 within a galacto-centric radius of 20 pc. This upper limit is sensitive to the proton acceleration efficiency of the pulsars, but is less dependent on the particle injection spectral index and the choice of mass tracers. The population will be better constrained when high resolution neutrino observations of the Galactic Centre become available. The presence of these millisecond pulsars can account for the $γ$-ray excess in the Galactic Centre.
Submitted 19 September, 2023;
originally announced September 2023.
-
Deciphering knee osteoarthritis diagnostic features with explainable artificial intelligence: A systematic review
Authors:
Yun Xin Teoh,
Alice Othmani,
Siew Li Goh,
Juliana Usman,
Khin Wee Lai
Abstract:
Existing artificial intelligence (AI) models for diagnosing knee osteoarthritis (OA) have faced criticism for their lack of transparency and interpretability, despite achieving medical-expert-like performance. This opacity makes them challenging to trust in clinical practice. Recently, explainable artificial intelligence (XAI) has emerged as a specialized technique that can provide confidence in the model's prediction by revealing how the prediction is derived, thus promoting the use of AI systems in healthcare. This paper presents the first survey of XAI techniques used for knee OA diagnosis. The XAI techniques are discussed from two perspectives: data interpretability and model interpretability. The aim of this paper is to provide valuable insights into XAI's potential towards a more reliable knee OA diagnosis approach and encourage its adoption in clinical practice.
Submitted 18 August, 2023;
originally announced August 2023.
-
Knowledge Distilled Ensemble Model for sEMG-based Silent Speech Interface
Authors:
Wenqiang Lai,
Qihan Yang,
Ye Mao,
Endong Sun,
Jiangnan Ye
Abstract:
Voice disorders affect millions of people worldwide. Surface electromyography-based Silent Speech Interfaces (sEMG-based SSIs) have been explored as a potential solution for decades. However, previous works were limited by small vocabularies and manually extracted features from raw data. To address these limitations, we propose a lightweight deep learning knowledge-distilled ensemble model for sEMG-based SSI (KDE-SSI). Our model can classify a dataset of the 26 NATO phonetic alphabet words with 3900 data samples, enabling the unambiguous generation of any English word through spelling. Extensive experiments validate the effectiveness of KDE-SSI, achieving a test accuracy of 85.9%. Our findings also shed light on an end-to-end system for portable, practical equipment.
Submitted 6 August, 2023;
originally announced August 2023.
-
Leveraging Optical Communication Fiber and AI for Distributed Water Pipe Leak Detection
Authors:
Huan Wu,
Huan-Feng Duan,
Wallace W. L. Lai,
Kun Zhu,
Xin Cheng,
Hao Yin,
Bin Zhou,
Chun-Cheung Lai,
Chao Lu,
Xiaoli Ding
Abstract:
Detecting leaks in water networks is a costly challenge. This article introduces a practical solution: the integration of an optical network with water networks for efficient leak detection. Our approach uses a fiber-optic cable to measure vibrations, enabling accurate leak identification and localization by an intelligent algorithm. We also propose a method to assess leak severity for prioritized repairs. Our solution detects even small leaks with flow rates as low as 0.027 L/s. It offers a cost-effective way to improve leak detection, enhance water management, and increase operational efficiency.
Submitted 28 July, 2023;
originally announced July 2023.
-
Secrets of RLHF in Large Language Models Part I: PPO
Authors:
Rui Zheng,
Shihan Dou,
Songyang Gao,
Yuan Hua,
Wei Shen,
Binghai Wang,
Yan Liu,
Senjie Jin,
Qin Liu,
Yuhao Zhou,
Limao Xiong,
Lu Chen,
Zhiheng Xi,
Nuo Xu,
Wenbin Lai,
Minghao Zhu,
Cheng Chang,
Zhangyue Yin,
Rongxiang Weng,
Wensen Cheng,
Haoran Huang,
Tianxiang Sun,
Hang Yan,
Tao Gui,
Qi Zhang
, et al. (2 additional authors not shown)
Abstract:
Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence. Their primary objective is to function as a human-centric (helpful, honest, and harmless) assistant. Alignment with humans assumes paramount significance, and reinforcement learning with human feedback (RLHF) emerges as the pivotal technological paradigm underpinning this pursuit. Current technical routes usually include reward models to measure human preferences, Proximal Policy Optimization (PPO) to optimize policy model outputs, and process supervision to improve step-by-step reasoning capabilities. However, due to the challenges of reward design, environment interaction, and agent training, coupled with the huge trial-and-error cost of large language models, there is a significant barrier for AI researchers to motivate the development of technical alignment and safe landing of LLMs. The stable training of RLHF remains a puzzle. In the first report, we dissect the framework of RLHF, re-evaluate the inner workings of PPO, and explore how the parts comprising PPO algorithms impact policy agent training. We identify policy constraints as the key factor for the effective implementation of the PPO algorithm. Therefore, we explore PPO-max, an advanced version of the PPO algorithm, to efficiently improve the training stability of the policy model. Based on our main results, we perform a comprehensive analysis of RLHF abilities compared with SFT models and ChatGPT. The absence of open-source implementations has posed significant challenges to the investigation of LLM alignment. Therefore, we are eager to release technical reports, reward models and PPO code, aiming to make modest contributions to the advancement of LLMs.
Submitted 18 July, 2023; v1 submitted 10 July, 2023;
originally announced July 2023.
-
Blind Beamforming for Intelligent Reflecting Surface in Fading Channels without CSI
Authors:
Wenhai Lai,
Wenyu Wang,
Fan Xu,
Xin Li,
Shaobo Niu,
Kaiming Shen
Abstract:
This paper discusses how to optimize the phase shifts of an intelligent reflecting surface (IRS) to combat channel fading without any channel state information (CSI), namely blind beamforming. Differing from most previous works based on a two-stage paradigm of first estimating channels and then optimizing phase shifts, our approach is completely data-driven, only requiring a dataset of the received signal power at the user terminal. Thus, our method does not incur extra overhead costs for channel estimation, and does not entail collaboration from the service provider either. The main idea is to choose phase shifts at random and use the corresponding conditional sample mean of the received signal power to extract the main features of the wireless environment. This blind beamforming approach guarantees an $N^2$ boost of signal-to-noise ratio (SNR), where $N$ is the number of reflective elements (REs) of the IRS, regardless of whether the direct channel is line-of-sight (LoS) or not. Moreover, blind beamforming is extended to a double-IRS system with provable performance. Finally, prototype tests show that the proposed blind beamforming method can be readily incorporated into existing communication systems in the real world; simulation tests further show that it works for a variety of fading channel models.
Submitted 30 May, 2023;
originally announced May 2023.
-
The case for an EIC Theory Alliance: Theoretical Challenges of the EIC
Authors:
Raktim Abir,
Igor Akushevich,
Tolga Altinoluk,
Daniele Paolo Anderle,
Fatma P. Aslan,
Alessandro Bacchetta,
Baha Balantekin,
Joao Barata,
Marco Battaglieri,
Carlos A. Bertulani,
Guillaume Beuf,
Chiara Bissolotti,
Daniël Boer,
M. Boglione,
Radja Boughezal,
Eric Braaten,
Nora Brambilla,
Vladimir Braun,
Duane Byer,
Francesco Giovanni Celiberto,
Yang-Ting Chien,
Ian C. Cloët,
Martha Constantinou,
Wim Cosyn,
Aurore Courtoy
, et al. (146 additional authors not shown)
Abstract:
We outline the physics opportunities provided by the Electron Ion Collider (EIC). These include the study of the parton structure of the nucleon and nuclei, the onset of gluon saturation, the production of jets and heavy flavor, hadron spectroscopy and tests of fundamental symmetries. We review the present status and future challenges in EIC theory that have to be addressed in order to realize this ambitious and impactful physics program, including how to engage a diverse and inclusive workforce. In order to address these many-fold challenges, we propose that a coordinated effort involving theory groups with differing expertise is needed. We discuss the scientific goals and scope of such an EIC Theory Alliance.
Submitted 23 May, 2023;
originally announced May 2023.
-
Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation
Authors:
Wen Lai,
Alexandra Chronopoulou,
Alexander Fraser
Abstract:
Despite advances in multilingual neural machine translation (MNMT), we argue that there are still two major challenges in this area: data imbalance and representation degeneration. The data imbalance problem refers to the imbalance in the amount of parallel corpora for all language pairs, especially for long-tail languages (i.e., very low-resource languages). The representation degeneration problem refers to the problem of encoded tokens tending to appear only in a small subspace of the full space available to the MNMT model. To solve these two issues, we propose Bi-ACL, a framework that uses only target-side monolingual data and a bilingual dictionary to improve the performance of the MNMT model. We define two modules, named bidirectional autoencoder and bidirectional contrastive learning, which we combine with an online constrained beam search and a curriculum learning sampling strategy. Extensive experiments show that our proposed method is more effective both in long-tail languages and in high-resource languages. We also demonstrate that our approach is capable of transferring knowledge between domains and languages in zero-shot scenarios.
Submitted 24 October, 2023; v1 submitted 22 May, 2023;
originally announced May 2023.
-
The Longest Subsequence-Repeated Subsequence Problem
Authors:
Manuel Lafond,
Wenfeng Lai,
Adiesha Liyanage,
Binhai Zhu
Abstract:
Motivated by computing duplication patterns in sequences, a new fundamental problem called the longest subsequence-repeated subsequence (LSRS) is proposed. Given a sequence $S$ of length $n$, a letter-repeated subsequence is a subsequence of $S$ in the form of $x_1^{d_1}x_2^{d_2}\cdots x_k^{d_k}$ with $x_i$ a subsequence of $S$, $x_j\neq x_{j+1}$ and $d_i\geq 2$ for all $i$ in $[k]$ and $j$ in $[k-1]$. We first present an $O(n^6)$ time algorithm to compute the longest cubic subsequences of all the $O(n^2)$ substrings of $S$, improving the trivial $O(n^7)$ bound. Then, an $O(n^6)$ time algorithm for computing the longest subsequence-repeated subsequence (LSRS) of $S$ is obtained. Finally we focus on two variants of this problem. We first consider the constrained version when $Σ$ is unbounded, each letter appears in $S$ at most $d$ times and all the letters in $Σ$ must appear in the solution. We show that the problem is NP-hard for $d=4$, via a reduction from a special version of SAT (which is obtained from 3-COLORING). We then show that when each letter appears in $S$ at most $d=3$ times, then the problem is solvable in $O(n^5)$ time.
Submitted 31 August, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Glue-on AdS holography for $T\bar T$-deformed CFTs
Authors:
Luis Apolo,
Peng-Xiang Hao,
Wen-Xin Lai,
Wei Song
Abstract:
The $T\bar T$ deformation is a solvable irrelevant deformation whose properties depend on the sign of the deformation parameter $μ$. In particular, $T\bar T$-deformed CFTs with $μ<0$ have been proposed to be holographically dual to Einstein gravity where the metric satisfies Dirichlet boundary conditions at a finite cutoff surface. In this paper, we put forward a holographic proposal for $T\bar T$-deformed CFTs with $μ>0$, in which case the bulk geometry is constructed by gluing a patch of AdS$_3$ to the original spacetime. As evidence, we show that the $T\bar T$ trace flow equation, the spectrum on the cylinder, and the partition function on the torus and the sphere, among other results, can all be reproduced from bulk calculations in glue-on AdS$_3$.
Submitted 23 June, 2023; v1 submitted 8 March, 2023;
originally announced March 2023.
-
Atomtronic superconducting quantum interference device in synthetic dimensions
Authors:
Wenxi Lai,
Yu-Quan Ma,
Yi-Wen Wei
Abstract:
We propose an atomtronic counterpart of the superconducting quantum interference device (SQUID) in synthetic $2$-dimensional space. The system is composed of a Bose-Einstein condensate (BEC) in two neighboring optical wells, which is coupled to an external coherent light. Furthermore, the availability of a controllable atomtronic flux qubit in the synthetic dimensions is demonstrated with this system. The control parameter for the qubit is naturally provided by the artificial magnetic flux originating from the coherent atom-light coupling. Compared with the traditional SQUID, which requires at least $2$-dimensional circuits, the synthetic dimensional SQUID can be realized in only $1$-dimensional circuits. This should be a great advantage for the scalability and integration of quantum logic gates.
Submitted 2 March, 2023;
originally announced March 2023.
-
Quantum Information Science and Technology for Nuclear Physics. Input into U.S. Long-Range Planning, 2023
Authors:
Douglas Beck,
Joseph Carlson,
Zohreh Davoudi,
Joseph Formaggio,
Sofia Quaglioni,
Martin Savage,
Joao Barata,
Tanmoy Bhattacharya,
Michael Bishof,
Ian Cloet,
Andrea Delgado,
Michael DeMarco,
Caleb Fink,
Adrien Florio,
Marianne Francois,
Dorota Grabowska,
Shannon Hoogerheide,
Mengyao Huang,
Kazuki Ikeda,
Marc Illa,
Kyungseon Joo,
Dmitri Kharzeev,
Karol Kowalski,
Wai Kin Lai,
Kyle Leach
, et al. (76 additional authors not shown)
Abstract:
In preparation for the 2023 NSAC Long Range Plan (LRP), members of the Nuclear Science community gathered to discuss the current state of, and plans for further leveraging opportunities in, QIST in NP research at the Quantum Information Science for U.S. Nuclear Physics Long Range Planning workshop, held in Santa Fe, New Mexico on January 31 - February 1, 2023. The workshop included 45 in-person participants and 53 remote attendees. The outcome of the workshop identified strategic plans and requirements for the next 5-10 years to advance quantum sensing and quantum simulations within NP, and to develop a diverse quantum-ready workforce. The plans include resolutions endorsed by the participants to address the compelling scientific opportunities at the intersections of NP and QIST. These endorsements are aligned with similar affirmations by the LRP Computational Nuclear Physics and AI/ML Workshop, the Nuclear Structure, Reactions, and Astrophysics LRP Town Hall, and the Fundamental Symmetries, Neutrons, and Neutrinos LRP Town Hall communities.
Submitted 28 February, 2023;
originally announced March 2023.
-
Coordinating Multiple Intelligent Reflecting Surfaces without Channel Information
Authors:
Fan Xu,
Jiawei Yao,
Wenhai Lai,
Kaiming Shen,
Xin Li,
Xin Chen,
Zhi-Quan Luo
Abstract:
Conventional beamforming methods for intelligent reflecting surfaces (IRSs) or reconfigurable intelligent surfaces (RISs) typically entail the full channel state information (CSI). However, the computational cost of channel acquisition soars exponentially with the number of IRSs. To bypass this difficulty, we propose a novel strategy called blind beamforming that coordinates multiple IRSs by means of statistics without knowing CSI. Blind beamforming only requires measuring the received signal power at the user terminal for a sequence of randomly generated phase shifts across all IRSs. The main idea is to extract the key statistical quantity for beamforming by exploring only a small portion of the whole solution space of phase shifts. We show that blind beamforming guarantees a signal-to-noise ratio (SNR) boost of $\Theta(N^{2L})$ under certain conditions, where $L$ is the number of IRSs and $N$ is the number of reflecting elements per IRS. The proposed conditions for achieving the optimal SNR boost of $\Theta(N^{4})$ in a double-IRS system are much easier to satisfy than the existing ones in the literature. Most importantly, the proposed conditions can be extended to a fully general $L$-IRS system. The above result significantly improves upon the state of the art in the area of multi-IRS-assisted communication. Moreover, blind beamforming is justified via field tests and simulations. In particular, as shown in our field tests at 2.6 GHz, our method yields up to 17 dB SNR boost; to the best of our knowledge, this is the first time that the use of multiple IRSs gets verified in the real world.
Submitted 8 January, 2024; v1 submitted 19 February, 2023;
originally announced February 2023.
-
Scattering Amplitude from Quantum Computing with Reduction Formula
Authors:
Tianyin Li,
Wai Kin Lai,
Enke Wang,
Hongxi Xing
Abstract:
Utilizing the Lehmann-Symanzik-Zimmermann reduction formula, we present a new general framework for computing scattering amplitudes in quantum field theory with quantum computers in a fully nonperturbative way. In this framework, one only has to construct one-particle states of zero momentum, and no wave packets of incoming particles are needed. The framework is able to incorporate scatterings of bound states, and is ideal for scatterings involving a small number of particles. We expect this framework to have particular advantages when applied to exclusive hadron scatterings. As a proof of concept, by simulations on classical hardware, we demonstrate that in the one-flavor Gross-Neveu model, the fermion propagator, the connected fermion four-point function, and the propagator of a fermion-antifermion bound state obtained from our proposed quantum algorithm have the desired pole structure crucial to the implementation of the Lehmann-Symanzik-Zimmermann reduction formula.
Submitted 26 February, 2024; v1 submitted 10 January, 2023;
originally announced January 2023.
-
Heavy hybrid decays to quarkonia
Authors:
Nora Brambilla,
Wai Kin Lai,
Abhishek Mohapatra,
Antonio Vairo
Abstract:
The decay rates of the XYZ exotics discovered in the heavy quarkonium sector are crucial observables for identifying the nature of these states. Based on the framework of nonrelativistic effective field theories, we calculate the rates of semi-inclusive decays of heavy quarkonium hybrids into conventional heavy quarkonia. We compute them at leading and subleading power in the inverse of the heavy-quark mass, extending and updating previous results. We compare our predictions with experimental data of inclusive decay rates for candidates of heavy quarkonium hybrids.
Submitted 18 December, 2022;
originally announced December 2022.
-
$m^4Adapter$: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter
Authors:
Wen Lai,
Alexandra Chronopoulou,
Alexander Fraser
Abstract:
Multilingual neural machine translation (MNMT) models yield state-of-the-art performance when evaluated on data from a domain and language pair seen at training time. However, when an MNMT model is used to translate under domain shift or to a new language pair, performance drops dramatically. We consider a very challenging scenario: adapting the MNMT model both to a new domain and to a new language pair at the same time. In this paper, we propose $m^4Adapter$ (Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter), which combines domain and language knowledge using meta-learning with adapters. We present results showing that our approach is a parameter-efficient solution which effectively adapts a model to both a new language pair and a new domain, while outperforming other adapter methods. An ablation study also shows that our approach more effectively transfers domain knowledge across different languages and language information across different domains.
Submitted 21 October, 2022;
originally announced October 2022.
-
Design and Development of a Tracked Inspection Robot
Authors:
Erika Sahari,
Weiyao Lai,
Alireza Pulles,
XiaoQi Guo,
Marc Bernhard
Abstract:
This paper presents the examination of the clever Differential with three levels of opportunity. The is the principal differential with that interprets differential speed and force to its three results when the results are under fluctuated loads, however deciphers equivalent movement and force to its results when exposed to approach loads. The kinematics and elements of the are determined and are hypothetically investigated under three different burden cases. The movement of the under the three burden cases is additionally recreated and concentrated in. The benefits of alongside its current and potential applications are introduced.
Submitted 1 September, 2022;
originally announced September 2022.
-
Impact of Loss Model Selection on Power Semiconductor Lifetime Prediction in Electric Vehicles
Authors:
Hongjian Xia,
Yi Zhang,
Dao Zhou,
Minyou Chen,
Wei Lai,
Yunhai Wei,
Huai Wang
Abstract:
Power loss estimation is an indispensable procedure for lifetime prediction of power semiconductor devices. Previous studies successfully perform steady-state power loss estimation for different applications, but this may be limited for electric vehicles (EVs) with high dynamics. Based on two EV standard driving cycle profiles, this paper gives a comparative study of power loss estimation models with two different time resolutions, i.e., the output period average and the switching period average. The correspondingly estimated power losses, thermal profiles, and lifetime clearly show that the widely applied power loss model with the output period average is limited for EV applications, in particular for the highly dynamic driving cycle. The difference in the predicted lifetime can be up to 300 times due to an unreasonable choice of the loss model, which calls for industry attention to the differences among EVs and the importance of loss model selection in lifetime prediction.
Submitted 27 August, 2022;
originally announced August 2022.
-
Design and Development of Miniature long distance multi-moving robots for 3D Smart Sensing for underground Pipe Inspection
Authors:
Alireza Pulles,
Weiyao Lai,
Erika Sahari,
XiaoQi Guo,
Marc Bernhard
Abstract:
Designing an in-pipe climbing robot that manipulates sharp gears to study complex line relationships. Traditional rolling/happening pipe climbing robots tend to slide when exploring pipe curves. The proposed gearbox connects to the farthest ground plane of a standard dual output gearbox. Instrumentation helps achieve a very well-defined deceleration sequence in which the robot slides and pulls as it moves forward. This instrument takes into account the forces exerted on each track within the line relationship and intentionally modifies the robot's track speed, unlocking the key to fine-tuning. This makes the 3 output transmissions take a lot of time. Deflection of the robot on a pipe network with various bearings and non-slip pipe bends demonstrates the integrity of the proposed structure.
Submitted 22 August, 2022;
originally announced August 2022.
-
Deployment of long distance multi-moving robots for underground pipe inspection
Authors:
Weiyao Lai,
Wei Xu,
Marc Bernhard
Abstract:
Blueprint of an in-pipe climbing robot that works with sharp transmissions to study complex line relationships. Standard wheeled/happening pipe climbing robots tend to slide when exploring pipe turns. Instruments help achieve a very distinct delay sequence in which the robot slides and drags as it progresses. The proposed transmission joins the farthest ground plane of the standard two-output transmission. This opens up a substantial time for 3 output transmissions. This instrument takes into account the force exerted on each track within the line relation to specifically alter the robot's track speed, unlocking the key to fine control. Deflection of the robot across pipe networks with different bearings and non-slip pipe bends demonstrate the integrity of the proposed structure.
Submitted 13 August, 2022;
originally announced August 2022.
-
Exploring Light-Cone Distribution Amplitudes from Quantum Computing
Authors:
Tianyin Li,
Xingyu Guo,
Wai Kin Lai,
Xiaohui Liu,
Enke Wang,
Hongxi Xing,
Dan-Bo Zhang,
Shi-Liang Zhu
Abstract:
Light-cone distribution amplitudes (LCDAs) are essential nonperturbative quantities for theoretical predictions of exclusive high-energy processes in quantum chromodynamics (QCD). We demonstrate the prospect of calculating LCDAs on a quantum computer by applying a recently proposed quantum algorithm, with staggered fermions, to the simulation of the LCDA in the (1+1)-dimensional Nambu-Jona-Lasinio (NJL) model on classical hardware. The agreement between the result from the classical simulation of the quantum algorithm and that from exact diagonalization justifies the proposed quantum algorithm. We find that the resulting LCDA in the NJL model exhibits features shared with the LCDAs obtained from QCD.
Submitted 17 October, 2023; v1 submitted 26 July, 2022;
originally announced July 2022.
-
Face Deblurring using Dual Camera Fusion on Mobile Phones
Authors:
Wei-Sheng Lai,
YiChang Shih,
Lun-Cheng Chu,
Xiaotong Wu,
Sung-Fang Tsai,
Michael Krainin,
Deqing Sun,
Chia-Kai Liang
Abstract:
Motion blur of fast-moving subjects is a longstanding problem in photography and very common on mobile phones due to limited light collection efficiency, particularly in low-light conditions. While we have witnessed great progress in image deblurring in recent years, most methods require significant computational power and have limitations in processing high-resolution photos with severe local motions. To this end, we develop a novel face deblurring system based on the dual camera fusion technique for mobile phones. The system detects subject motion to dynamically enable a reference camera, e.g., ultrawide angle camera commonly available on recent premium phones, and captures an auxiliary photo with faster shutter settings. While the main shot is low noise but blurry, the reference shot is sharp but noisy. We learn ML models to align and fuse these two shots and output a clear photo without motion blur. Our algorithm runs efficiently on Google Pixel 6, which takes 463 ms overhead per shot. Our experiments demonstrate the advantage and robustness of our system against alternative single-image, multi-frame, face-specific, and video deblurring algorithms as well as commercial products. To the best of our knowledge, our work is the first mobile solution for face motion deblurring that works reliably and robustly over thousands of images in diverse motion and lighting conditions.
Submitted 23 July, 2022;
originally announced July 2022.
-
Vision Transformer for NeRF-Based View Synthesis from a Single Input Image
Authors:
Kai-En Lin,
Lin Yen-Chen,
Wei-Sheng Lai,
Tsung-Yi Lin,
Yi-Chang Shih,
Ravi Ramamoorthi
Abstract:
Although neural radiance fields (NeRF) have shown impressive advances for novel view synthesis, most methods typically require multiple input images of the same scene with accurate camera poses. In this work, we seek to substantially reduce the inputs to a single unposed image. Existing approaches condition on local image features to reconstruct a 3D object, but often render blurry predictions at viewpoints that are far away from the source view. To address this issue, we propose to leverage both the global and local features to form an expressive 3D representation. The global features are learned from a vision transformer, while the local features are extracted from a 2D convolutional network. To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering. This novel 3D representation allows the network to reconstruct unseen regions without enforcing constraints like symmetry or canonical coordinate systems. Our method can render novel views from only a single input image and generalize across multiple object categories using a single model. Quantitative and qualitative evaluations demonstrate that the proposed method achieves state-of-the-art performance and renders richer details than existing approaches.
Submitted 13 October, 2022; v1 submitted 12 July, 2022;
originally announced July 2022.
-
Multichannel Optimal Tree-Decodable Codes are Not Always Optimal Prefix Codes
Authors:
Hoover H. F. Yin,
Harry W. H. Wong,
Mehrdad Tahernia,
Russell W. F. Lai
Abstract:
The theory of multichannel prefix codes aims to generalize the classical theory of prefix codes. Although single- and two-channel prefix codes always have decoding trees, the same cannot be said when there are more than two channels. One question is of theoretical interest: Do there exist optimal tree-decodable codes that are not optimal prefix codes? Existing literature, which focused on generalizing single-channel results, covered little about non-tree-decodable prefix codes since they have no single-channel counterparts. In this work, we study the fundamental reason behind the non-tree-decodability of prefix codes. By investigating the simplest non-tree-decodable structure, we obtain a general sufficient condition on the channel alphabets for the existence of optimal tree-decodable codes that are not optimal prefix codes.
Submitted 10 May, 2022;
originally announced May 2022.
-
Unveiling Nucleon 3D Chiral-Odd Structure with Jet Axes
Authors:
Wai Kin Lai,
Xiaohui Liu,
Manman Wang,
Hongxi Xing
Abstract:
We reinterpret jet clustering as an axis-finding procedure which, along with the proton beam, defines the virtual-photon transverse momentum $q_T$ in deep inelastic scattering (DIS). In this way, we are able to probe the nucleon intrinsic structure using jet axes in a fully inclusive manner, similar to the Drell-Yan process. We present the complete list of azimuthal asymmetries and the associated factorization formulae at leading power for deep-inelastic scattering of a nucleon. The factorization formulae involve both the conventional time-reversal-even (T-even) jet function and the T-odd one, which have access to all transverse-momentum-dependent parton distribution functions (TMD PDFs) at leading twist. Since the factorization holds as long as $q_T \ll Q$, where $Q$ is the photon virtuality, the jet-axis probe into the nucleon structure should be feasible for machines with relatively low energies such as the Electron-Ion Collider in China (EicC). We show that, within the winner-take-all (WTA) axis-finding scheme, the coupling between the T-odd jet function and the quark transversity or the Boer-Mulders function could induce sizable azimuthal asymmetries at the EicC, the EIC and HERA. We also give predictions for the azimuthal asymmetry of back-to-back dijet production in $e^+e^-$ annihilation at Belle and other energies.
Submitted 9 May, 2022;
originally announced May 2022.
-
Online Aggregation based Approximate Query Processing: A Literature Survey
Authors:
Pritom Saha Akash,
Wei-Cheng Lai,
Po-Wen Lin
Abstract:
In the current world, OLAP (Online Analytical Processing) is used intensively by modern organizations to perform ad hoc analysis of data, providing insight for better decision making. Thus, the performance of OLAP is crucial; however, it is costly to support OLAP for a large dataset. Approximate query processing (AQP) was proposed to efficiently compute approximate values as close as possible to the exact answer. Existing AQP techniques can be categorized into two groups, online aggregation and offline synopsis generation, each having its limitations and challenges. Online aggregation-based AQP progressively generates approximate results with some error estimates (i.e., confidence intervals) until the processing of all data is done. In offline synopsis generation-based AQP, synopses are generated offline using a priori knowledge such as query workload or data statistics. Later, OLAP queries are answered using these synopses. This paper focuses on surveying only the online aggregation-based AQP. For this purpose, we first discuss the research challenges in online aggregation-based AQP and summarize existing approaches to address these challenges. In addition, we discuss the advantages and limitations of existing online aggregation mechanisms. Lastly, we discuss some research challenges and opportunities for further advancing online aggregation research. Our goal is for people to understand the current progress in the online aggregation-based AQP area and find new insights into it.
Submitted 14 April, 2022;
originally announced April 2022.
-
Optical purification of materials based on atom-light coherent coupling
Authors:
Wenxi Lai
Abstract:
An optical method for precise purification of chemical elements is introduced in this paper. The materials are supposed to be in the states of gaseous beams, which are coherently coupled to an external traveling light during purification. Before decoherence occurs, atoms periodically move in the light with different speeds that depend on the masses and optical transition wavelengths of these atoms. The speed gradient leads to deflections of different atoms in different directions. The model is described by Schrödinger equations with analytical results. This method could be used for some hardly separable atoms and isotopes depending on the condition of atom coherent time. The present work opens a platform for applications of cold atom technology in the purification of atoms and molecules.
Submitted 12 February, 2023; v1 submitted 6 April, 2022;
originally announced April 2022.
-
Optical Stern-Gerlach effect via a single traveling-wave light
Authors:
Haihu Cui,
Wenxi Lai
Abstract:
In this paper, we propose a simplified model of the optical Stern-Gerlach effect based on coherent coupling between the clock transition of alkaline-earth single atoms and a traveling-wave light. It is demonstrated that spin-orbit coupling induced chiral motion in atom deflection appears under the strong atom-light interaction. The strong optical driving removes perturbation from the Doppler effect and back action effect to access the coherent system. In this process, superposition of distant matter waves connected to the arbitrary distribution of atom internal state could be predicted, which is important for the realization of atom interferometry and quantum state operation. The influences of atom relaxation and atom-atom interactions are discussed. Basic conditions of experimental design are given at the end of this work.
Submitted 12 February, 2023; v1 submitted 6 April, 2022;
originally announced April 2022.
-
Don't Get Me Wrong: How to Apply Deep Visual Interpretations to Time Series
Authors:
Christoffer Loeffler,
Wei-Cheng Lai,
Bjoern Eskofier,
Dario Zanca,
Lukas Schmidt,
Christopher Mutschler
Abstract:
The correct interpretation and understanding of deep learning models are essential in many applications. Explanatory visual interpretation approaches for image and natural language processing allow domain experts to validate and understand almost any deep learning model. However, they fall short when generalizing to arbitrary time series, which are inherently less intuitive and more diverse. Whether a visualization explains valid reasoning or captures the actual features is difficult to judge. Hence, instead of blind trust, we need an objective evaluation to obtain trustworthy quality metrics. We propose a framework of six orthogonal metrics for gradient-, propagation- or perturbation-based post-hoc visual interpretation methods for time series classification and segmentation tasks. An experimental study includes popular neural network architectures for time series and nine visual interpretation methods. We evaluate the visual interpretation methods with diverse datasets from the UCR repository and a complex, real-world dataset and study the influence of standard regularization techniques during training. We show that none of the methods consistently outperforms others on all metrics, while some are sometimes ahead. Our insights and recommendations allow experts to choose suitable visualization techniques for the model and task.
Submitted 15 September, 2023; v1 submitted 14 March, 2022;
originally announced March 2022.
-
Atom walking in a traveling-wave light
Authors:
Wenxi Lai
Abstract:
In this paper, we investigate the mechanical motion of ultra-slow single atoms, considering that each atom is coherently coupled to a traveling-wave light. The main noise in this system originates from Doppler broadening due to the continuous momentum distribution in the atom wave packet. Here, it is proved that the Doppler broadening could be effectively suppressed in the strong coupling regime. Under the coherent coupling, individual neutral atoms periodically walk in a definite direction. The direction of the motion depends on the occupation of the atom in its two internal states related to the optical transition, since the atom would be affected by attractive or repulsive forces depending on the internal states. It is analogous to the electric force acting on negatively or positively charged particles. We explain these results with the spin-orbit coupling of atoms, which is hidden in our Hamiltonian. These results have potential applications for the construction of future atomic devices.
Submitted 2 March, 2023; v1 submitted 9 February, 2022;
originally announced February 2022.
-
Deep Image Deblurring: A Survey
Authors:
Kaihao Zhang,
Wenqi Ren,
Wenhan Luo,
Wei-Sheng Lai,
Bjorn Stenger,
Ming-Hsuan Yang,
Hongdong Li
Abstract:
Image deblurring is a classic problem in low-level computer vision with the aim to recover a sharp image from a blurred input image. Advances in deep learning have led to significant progress in solving this problem, and a large number of deblurring networks have been proposed. This paper presents a comprehensive and timely survey of recently published deep-learning based image deblurring approaches, aiming to serve the community as a useful literature review. We start by discussing common causes of image blur, introduce benchmark datasets and performance metrics, and summarize different problem formulations. Next, we present a taxonomy of methods using convolutional neural networks (CNN) based on architecture, loss function, and application, offering a detailed review and comparison. In addition, we discuss some domain-specific deblurring applications including face images, text, and stereo image pairs. We conclude by discussing key challenges and future research directions.
Submitted 27 May, 2022; v1 submitted 25 January, 2022;
originally announced January 2022.