-
To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
Authors:
Bozhong Tian,
Xiaozhuan Liang,
Siyuan Cheng,
Qingbin Liu,
Mengru Wang,
Dianbo Sui,
Xi Chen,
Huajun Chen,
Ningyu Zhang
Abstract:
Large Language Models (LLMs) trained on extensive corpora inevitably retain sensitive data, such as personal privacy information and copyrighted material. Recent advancements in knowledge unlearning involve updating LLM parameters to erase specific knowledge. However, current unlearning paradigms are mired in vague forgetting boundaries, often erasing knowledge indiscriminately. In this work, we introduce KnowUnDo, a benchmark containing copyrighted content and user privacy domains to evaluate if the unlearning process inadvertently erases essential knowledge. Our findings indicate that existing unlearning methods often suffer from excessive unlearning. To address this, we propose a simple yet effective method, MemFlex, which utilizes gradient information to precisely target and unlearn sensitive parameters. Experimental results show that MemFlex is superior to existing methods in both precise knowledge unlearning and general knowledge retaining of LLMs. Code and dataset will be released at https://github.com/zjunlp/KnowUnDo.
Submitted 1 July, 2024;
originally announced July 2024.
-
The neutron array of the compact spectrometer for heavy ion experiments in Fermi energy region
Authors:
Dawei Si,
Sheng Xiao,
Yuhao Qin,
Yijie Wang,
Junhuai Xu,
Baiting Tian,
Boyuan Zhang,
Dong Guo,
Qin Zhi,
Xiaobao Wei,
Yibo Hao,
Zengxiang Wang,
Tianren Zhuo,
Yuansheng Yang,
Xianglun Wei,
Herun Yang,
Peng Ma,
Limin Duan,
Fangfang Duan,
Junbing Ma,
Shiwei Xu,
Zhen Bai,
Guo Yang,
Yanyun Yang,
Zhigang Xiao
Abstract:
The emission of neutrons from heavy ion reactions is an important observable for studying the asymmetric nuclear equation of state and the reaction dynamics. A 20-unit neutron array has been developed and mounted on the compact spectrometer for heavy ion experiments (CSHINE) to measure the neutron spectra, neutron-neutron and neutron-proton correlation functions. Each unit consists of a $\rm 15\times 15\times 15~cm^3$ plastic scintillator coupled to a $ φ=52 ~\rm mm$ photomultiplier. The Geant4 simulation with optical process is performed to investigate the time resolution and the neutron detection efficiency. The inherent time resolution of 212 ps is obtained by cosmic ray coincidence test. The n-$γ$ discrimination and time-of-flight performance are given by $\rm ^{252}Cf$ radioactive source test and beam test. The neutron energy spectra have been obtained in the angle range $30^\circ \le θ_{\rm lab} \le 51^\circ$ in the beam experiment of $^{124}$Sn+$^{124}$Sn at 25 MeV/u with CSHINE.
Submitted 20 June, 2024;
originally announced June 2024.
-
3D Unsupervised Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving
Authors:
Boyi Sun,
Yuhang Liu,
Xingxia Wang,
Bin Tian,
Long Chen,
Fei-Yue Wang
Abstract:
Point cloud data labeling is considered a time-consuming and expensive task in autonomous driving, whereas unsupervised learning can avoid it by learning point cloud representations from unannotated data. In this paper, we propose UOV, a novel 3D Unsupervised framework assisted by 2D Open-Vocabulary segmentation models. It consists of two stages: In the first stage, we innovatively integrate high-quality textual and image features of 2D open-vocabulary models and propose the Tri-Modal contrastive Pre-training (TMP). In the second stage, spatial mapping between point clouds and images is utilized to generate pseudo-labels, enabling cross-modal knowledge distillation. Besides, we introduce the Approximate Flat Interaction (AFI) to address the noise during alignment and label confusion. To validate the superiority of UOV, extensive experiments are conducted on multiple related datasets. We achieved a record-breaking 47.73% mIoU on the annotation-free point cloud segmentation task in nuScenes, surpassing the previous best model by 10.70% mIoU. Meanwhile, the performance of fine-tuning with 1% data on nuScenes and SemanticKITTI reached a remarkable 51.75% mIoU and 48.14% mIoU, outperforming all previous pre-trained models.
Submitted 24 May, 2024;
originally announced May 2024.
-
Memristive switching in the surface of a charge-density-wave topological semimetal
Authors:
Jianwen Ma,
Xianghao Meng,
Binhua Zhang,
Yuxiang Wang,
Yicheng Mou,
Wenting Lin,
Yannan Dai,
Luqiu Chen,
Haonan Wang,
Haoqi Wu,
Jiaming Gu,
Jiayu Wang,
Yuhan Du,
Chunsen Liu,
Wu Shi,
Zhenzhong Yang,
Bobo Tian,
Lin Miao,
Peng Zhou,
Chun-Gang Duan,
Changsong Xu,
Xiang Yuan,
Cheng Zhang
Abstract:
Owing to the outstanding properties provided by nontrivial band topology, topological phases of matter are considered as a promising platform towards low-dissipation electronics, efficient spin-charge conversion, and topological quantum computation. Achieving ferroelectricity in topological materials enables the non-volatile control of the quantum states, which could greatly facilitate topological electronic research. However, ferroelectricity is generally incompatible with systems featuring metallicity due to the screening effect of free carriers. In this study, we report the observation of memristive switching based on the ferroelectric surface state of a topological semimetal (TaSe4)2I. We find that the surface state of (TaSe4)2I presents out-of-plane ferroelectric polarization due to surface reconstruction. With the combination of ferroelectric surface and charge-density-wave-gapped bulk states, an electric switchable barrier height can be achieved in (TaSe4)2I-metal contact. By employing a multi-terminal grounding design, we manage to construct a prototype ferroelectric memristor based on (TaSe4)2I with on/off ratio up to 10^3, endurance over 10^3 cycles, and good retention characteristics. The origin of the ferroelectric surface state is further investigated by first-principles calculations, which reveals an interplay between ferroelectricity and band topology. The emergence of ferroelectricity in (TaSe4)2I not only demonstrates it as a rare but essential case of ferroelectric topological materials, but also opens new routes towards the implementation of topological materials in functional electronic devices.
Submitted 6 May, 2024;
originally announced May 2024.
-
Prove Symbolic Regression is NP-hard by Symbol Graph
Authors:
Jinglu Song,
Qiang Lu,
Bozhou Tian,
Jingwen Zhang,
Jake Luo,
Zhiguang Wang
Abstract:
Symbolic regression (SR) is the task of discovering a symbolic expression that fits a given data set from the space of mathematical expressions. Despite the abundance of research surrounding the SR problem, there's a scarcity of works that confirm its NP-hard nature. Therefore, this paper introduces the concept of a symbol graph as a comprehensive representation of the entire mathematical expression space, effectively illustrating the NP-hard characteristics of the SR problem. Leveraging the symbol graph, we establish a connection between the SR problem and the task of identifying an optimally fitted degree-constrained Steiner Arborescence (DCSAP). The complexity of DCSAP, which is proven to be NP-hard, directly implies the NP-hard nature of the SR problem.
Submitted 21 April, 2024;
originally announced April 2024.
-
OpenMines: A Light and Comprehensive Mining Simulation Environment for Truck Dispatching
Authors:
Shi Meng,
Bin Tian,
Xiaotong Zhang,
Shuangying Qi,
Caiji Zhang,
Qiang Zhang
Abstract:
Mine fleet management algorithms can significantly reduce operational costs and enhance productivity in mining systems. Most current fleet management algorithms are evaluated based on self-implemented or proprietary simulation environments, posing challenges for replication and comparison. This paper models the simulation environment for mine fleet management from a complex systems perspective. Building upon previous work, we introduce probabilistic, user-defined events for random event simulation and implement various evaluation metrics and baselines, effectively reflecting the robustness of fleet management algorithms against unforeseen incidents. We present ``OpenMines'', an open-source framework encompassing the entire process of mine system modeling, algorithm development, and evaluation, facilitating future algorithm comparison and replication in the field. Code is available in https://github.com/370025263/openmines.
Submitted 31 March, 2024;
originally announced April 2024.
-
ICAT: An Indoor Connected and Autonomous Testbed for Vehicle Computing
Authors:
Zhaofeng Tian,
William He,
Boyang Tian,
Ren Zhong,
Erfan Foorginejad,
Weisong Shi
Abstract:
Indoor autonomous driving testbeds have emerged to complement expensive outdoor testbeds and virtual simulations, offering scalable and cost-effective solutions for research in navigation, traffic optimization, and swarm intelligence. However, they often lack the robust sensing and computing infrastructure for advanced research. Addressing these limitations, we introduce the Indoor Connected Autonomous Testbed (ICAT), a platform that not only tackles the unique challenges of indoor autonomous driving but also innovates vehicle computing and V2X communication. Moreover, ICAT leverages digital twins through CARLA and SUMO simulations, facilitating both centralized and decentralized autonomy deployments.
Submitted 5 March, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
Trimma: Trimming Metadata Storage and Latency for Hybrid Memory Systems
Authors:
Yiwei Li,
Boyu Tian,
Mingyu Gao
Abstract:
Hybrid main memory systems combine both performance and capacity advantages from heterogeneous memory technologies. With larger capacities, higher associativities, and finer granularities, hybrid memory systems currently exhibit significant metadata storage and lookup overheads for flexibly remapping data blocks between the two memory tiers. To alleviate the inefficiencies of existing designs, we propose Trimma, the combination of a multi-level metadata structure and an efficient metadata cache design. Trimma uses a multi-level metadata table to only track truly necessary address remap entries. The saved memory space is effectively utilized as extra DRAM cache capacity to improve performance. Trimma also uses separate formats to store the entries with non-identity and identity mappings. This improves the overall remap cache hit rate, further boosting the performance. Trimma is transparent to software and compatible with various types of hybrid memory systems. When evaluated on a representative DDR4 + NVM hybrid memory system, Trimma achieves up to 2.4$\times$ and on average 58.1\% speedup benefits, compared with a state-of-the-art design that only leverages the unallocated fast memory space for caching. Trimma addresses metadata management overheads and targets future scalable large-scale hybrid memory architectures.
Submitted 26 February, 2024;
originally announced February 2024.
-
InstructEdit: Instruction-based Knowledge Editing for Large Language Models
Authors:
Ningyu Zhang,
Bozhong Tian,
Siyuan Cheng,
Xiaozhuan Liang,
Yi Hu,
Kouying Xue,
Yanjie Gou,
Xi Chen,
Huajun Chen
Abstract:
Knowledge editing for large language models can offer an efficient solution to alter a model's behavior without negatively impacting the overall performance. However, the current approaches encounter issues with limited generalizability across tasks, necessitating one distinct editor for each task, significantly hindering the broader applications. To address this, we take the first step to analyze the multi-task generalization issue in knowledge editing. Specifically, we develop an instruction-based editing technique, termed InstructEdit, which facilitates the editor's adaptation to various task performances simultaneously using simple instructions. With only one unified editor for each LLM, we empirically demonstrate that InstructEdit can improve the editor's control, leading to an average 14.86% increase in Reliability in multi-task editing setting. Furthermore, experiments involving holdout unseen task illustrate that InstructEdit consistently surpass previous strong baselines. To further investigate the underlying mechanisms of instruction-based knowledge editing, we analyze the principal components of the editing gradient directions, which unveils that instructions can help control optimization direction with stronger OOD generalization. Code and datasets are available in https://github.com/zjunlp/EasyEdit.
Submitted 28 April, 2024; v1 submitted 25 February, 2024;
originally announced February 2024.
-
MIKE: A New Benchmark for Fine-grained Multimodal Entity Knowledge Editing
Authors:
Jiaqi Li,
Miaozeng Du,
Chuanyi Zhang,
Yongrui Chen,
Nan Hu,
Guilin Qi,
Haiyun Jiang,
Siyuan Cheng,
Bozhong Tian
Abstract:
Multimodal knowledge editing represents a critical advancement in enhancing the capabilities of Multimodal Large Language Models (MLLMs). Despite its potential, current benchmarks predominantly focus on coarse-grained knowledge, leaving the intricacies of fine-grained (FG) multimodal entity knowledge largely unexplored. This gap presents a notable challenge, as FG entity recognition is pivotal for the practical deployment and effectiveness of MLLMs in diverse real-world scenarios. To bridge this gap, we introduce MIKE, a comprehensive benchmark and dataset specifically designed for the FG multimodal entity knowledge editing. MIKE encompasses a suite of tasks tailored to assess different perspectives, including Vanilla Name Answering, Entity-Level Caption, and Complex-Scenario Recognition. In addition, a new form of knowledge editing, Multi-step Editing, is introduced to evaluate the editing efficiency. Through our extensive evaluations, we demonstrate that the current state-of-the-art methods face significant challenges in tackling our proposed benchmark, underscoring the complexity of FG knowledge editing in MLLMs. Our findings spotlight the urgent need for novel approaches in this domain, setting a clear agenda for future research and development efforts within the community.
Submitted 18 February, 2024;
originally announced February 2024.
-
Key Patch Proposer: Key Patches Contain Rich Information
Authors:
Jing Xu,
Beiwen Tian,
Hao Zhao
Abstract:
In this paper, we introduce a novel algorithm named Key Patch Proposer (KPP) designed to select key patches in an image without additional training. Our experiments showcase KPP's robust capacity to capture semantic information by both reconstruction and classification tasks. The efficacy of KPP suggests its potential application in active learning for semantic segmentation. Our source code is publicly available at https://github.com/CA-TT-AC/key-patch-proposer.
Submitted 17 February, 2024;
originally announced February 2024.
-
Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images
Authors:
Xiaoxiao Long,
Yuhang Zheng,
Yupeng Zheng,
Beiwen Tian,
Cheng Lin,
Lingjie Liu,
Hao Zhao,
Guyue Zhou,
Wenping Wang
Abstract:
We introduce a novel approach to learn geometries such as depth and surface normal from images while incorporating geometric context. The difficulty of reliably capturing geometric context in existing methods impedes their ability to accurately enforce the consistency between the different geometric properties, thereby leading to a bottleneck of geometric estimation quality. We therefore propose the Adaptive Surface Normal (ASN) constraint, a simple yet efficient method. Our approach extracts geometric context that encodes the geometric variations present in the input image and correlates depth estimation with geometric constraints. By dynamically determining reliable local geometry from randomly sampled candidates, we establish a surface normal constraint, where the validity of these candidates is evaluated using the geometric context. Furthermore, our normal estimation leverages the geometric context to prioritize regions that exhibit significant geometric variations, which makes the predicted normals accurately capture intricate and detailed geometric information. Through the integration of geometric context, our method unifies depth and surface normal estimations within a cohesive framework, which enables the generation of high-quality 3D geometry from images. We validate the superiority of our approach over state-of-the-art methods through extensive evaluations and comparisons on diverse indoor and outdoor datasets, showcasing its efficiency and robustness.
Submitted 31 March, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic Dataset and New Metrics
Authors:
Beiwen Tian,
Huan-ang Gao,
Leiyao Cui,
Yupeng Zheng,
Lan Luo,
Baofeng Wang,
Rong Zhi,
Guyue Zhou,
Hao Zhao
Abstract:
In the past several years, road anomaly segmentation has been actively explored in academia and is drawing growing attention in industry. The rationale behind it is straightforward: if the autonomous car can brake before hitting an anomalous object, safety is promoted. However, this rationale naturally calls for a temporally informed setting, while existing methods and benchmarks are designed in an unrealistic frame-wise manner. To bridge this gap, we contribute the first video anomaly segmentation dataset for autonomous driving. Since placing various anomalous objects on busy roads and annotating them in every frame are dangerous and expensive, we resort to synthetic data. To improve the relevance of this synthetic dataset to real-world applications, we train a generative adversarial network conditioned on rendering G-buffers for photorealism enhancement. Our dataset consists of 120,000 high-resolution frames at a 60 FPS framerate, as recorded in 7 different towns. As an initial benchmarking, we provide baselines using the latest supervised and unsupervised road anomaly segmentation methods. Apart from conventional ones, we focus on two new metrics: temporal consistency and latency-aware streaming accuracy. We believe the latter is valuable as it measures whether an anomaly segmentation algorithm can truly prevent a car from crashing in a temporally informed setting.
Submitted 10 January, 2024;
originally announced January 2024.
-
A Comprehensive Study of Knowledge Editing for Large Language Models
Authors:
Ningyu Zhang,
Yunzhi Yao,
Bozhong Tian,
Peng Wang,
Shumin Deng,
Mengru Wang,
Zekun Xi,
Shengyu Mao,
Jintian Zhang,
Yuansheng Ni,
Siyuan Cheng,
Ziwen Xu,
Xin Xu,
Jia-Chen Gu,
Yong Jiang,
Pengjun Xie,
Fei Huang,
Lei Liang,
Zhiqiang Zhang,
Xiaowei Zhu,
Jun Zhou,
Huajun Chen
Abstract:
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. However, a primary limitation lies in the significant computational demands during training, arising from their extensive parameterization. This challenge is further intensified by the dynamic nature of the world, necessitating frequent updates to LLMs to correct outdated information or integrate new knowledge, thereby ensuring their continued relevance. Note that many applications demand continual model adjustments post-training to address deficiencies or undesirable behaviors. There is an increasing interest in efficient, lightweight methods for on-the-fly model modifications. To this end, recent years have seen a burgeoning in the techniques of knowledge editing for LLMs, which aim to efficiently modify LLMs' behaviors within specific domains while preserving overall performance across various inputs. In this paper, we first define the knowledge editing problem and then provide a comprehensive review of cutting-edge approaches. Drawing inspiration from educational and cognitive research theories, we propose a unified categorization criterion that classifies knowledge editing methods into three groups: resorting to external knowledge, merging knowledge into the model, and editing intrinsic knowledge. Furthermore, we introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches. Additionally, we provide an in-depth analysis of knowledge location, which can give a deeper understanding of the knowledge structures inherent within LLMs. Finally, we discuss several potential applications of knowledge editing, outlining its broad and impactful implications.
Submitted 28 March, 2024; v1 submitted 2 January, 2024;
originally announced January 2024.
-
Revisit to the yield ratio of triton and $^3$He as an indicator of neutron-rich neck emission
Authors:
Yijie Wang,
Mengting Wan,
Xinyue Diao,
Sheng Xiao,
Yuhao Qin,
Zhi Qin,
Dong Guo,
Dawei Si,
Boyuan Zhang,
Baiting Tian,
Fenhai Guan,
Qianghua Wu,
Xianglun Wei,
Herun Yang,
Peng Ma,
Rongjiang Hu,
Limin Duan,
Fangfang Duan,
Junbing Ma,
Shiwei Xu,
Qiang Hu,
Zhen Bai,
Yanyun Yang,
Jiansong Wang,
Wenbo Liu
, et al. (12 additional authors not shown)
Abstract:
The neutron rich neck zone created in heavy ion reaction is experimentally probed by the production of the $A=3$ isobars. The energy spectra and angular distributions of triton and $^3$He are measured with the CSHINE detector in $^{86}$Kr +$^{208}$Pb reactions at 25 MeV/u. While the energy spectrum of $^{3}$He is harder than that of triton, known as "$^{3}$He-puzzle", the yield ratio $R({\rm t/^3He})$ presents a robust rising trend with the polar angle in laboratory. Using the fission fragments to reconstruct the fission plane, the enhancement of out-plane $R({\rm t/^3He})$ is confirmed in comparison to the in-plane ratios. Transport model simulations reproduce qualitatively the experimental trends, but the quantitative agreement is not achieved. The results demonstrate that a neutron rich neck zone is formed in the reactions. Further studies are called for to understand the clustering and the isospin dynamics related to neck formation.
Submitted 13 November, 2023;
originally announced November 2023.
-
Fast Shapley Value Estimation: A Unified Approach
Authors:
Borui Zhang,
Baotong Tian,
Wenzhao Zheng,
Jie Zhou,
Jiwen Lu
Abstract:
Shapley values have emerged as a widely accepted and trustworthy tool, grounded in theoretical axioms, for addressing challenges posed by black-box models like deep neural networks. However, computing Shapley values encounters exponential complexity as the number of features increases. Various approaches, including ApproSemivalue, KernelSHAP, and FastSHAP, have been explored to expedite the computation. In our analysis of existing approaches, we observe that stochastic estimators can be unified as a linear transformation of randomly summed values from feature subsets. Based on this, we investigate the possibility of designing simple amortized estimators and propose a straightforward and efficient one, SimSHAP, by eliminating redundant techniques. Extensive experiments conducted on tabular and image datasets validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
Submitted 23 May, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
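Illustrative note on the abstract above: the exponential cost it mentions, and the idea of estimating Shapley values from randomly sampled feature subsets, can be seen in a small sketch. This is not SimSHAP or any method from the paper. It only contrasts exact enumeration with the standard permutation-sampling estimator. The function names and the three-player `toy_value` game are hypothetical and exist only for illustration.

```python
import itertools
import math
import random


def exact_shapley(value, n):
    """Exact Shapley values by enumerating every coalition (cost grows as 2^n)."""
    phi = [0.0] * n
    players = list(range(n))
    for i in players:
        others = [p for p in players if p != i]
        for k in range(n):
            # Weight of a coalition of size k that does not contain player i.
            weight = (math.factorial(k) * math.factorial(n - k - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def sampled_shapley(value, n, num_permutations=2000, seed=0):
    """Monte Carlo estimate: average marginal contributions over random orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n
    players = list(range(n))
    for _ in range(num_permutations):
        rng.shuffle(players)
        coalition = frozenset()
        prev = value(coalition)
        for p in players:
            coalition = coalition | {p}
            cur = value(coalition)
            phi[p] += cur - prev  # marginal contribution of p in this ordering
            prev = cur
    return [x / num_permutations for x in phi]


def toy_value(coalition):
    """Hypothetical game: additive weights plus a bonus when players 0 and 1 cooperate."""
    weights = [1.0, 2.0, 3.0]
    total = sum(weights[p] for p in coalition)
    if 0 in coalition and 1 in coalition:
        total += 1.0
    return total


exact = exact_shapley(toy_value, 3)
approx = sampled_shapley(toy_value, 3)
```

In this toy game the interaction bonus is split evenly between players 0 and 1, and both the exact and the sampled attributions sum to the value of the full coalition. The sampled estimator evaluates `value` a fixed number of times per permutation, rather than once per subset.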
-
Tracking dynamic flow: Decoding flow fluctuations through performance in a fine motor control task
Authors:
Bohao Tian,
Shijun Zhang,
Sirui Chen,
Yuru Zhang,
Kaiping Peng,
Hongxing Zhang,
Dangxiao Wang
Abstract:
Flow, an optimal mental state merging action and awareness, significantly impacts our emotion, performance, and well-being. However, capturing its swift fluctuations on a fine timescale is challenging due to the sparsity of the existing flow detecting tools. Here we present a fine fingertip force control (F3C) task to induce flow, wherein the task challenge is set at a compatible level with personal skill, and to quantitatively track the flow state variations from synchronous motor control performance. We extract eight performance metrics from the fingertip force sequence and reveal their significant differences under distinct flow states. Further, we built a learning-based flow decoder that aims to predict the continuous flow intensity during the user experiment through the selected performance metrics, taking the self-reported flow as the label. Cross-validation shows that the predicted flow intensity reaches significant correlation with the self-reported flow intensity (r=0.81). Based on the decoding results, we observe rapid oscillations in flow fluctuations during the intervals between sparse self-reporting probes. This study showcases the feasibility of tracking intrinsic flow variations with high temporal resolution using task performance measures and may serve as a foundation for future work aiming to take advantage of flow's dynamics to enhance performance and positive emotions.
Submitted 28 December, 2023; v1 submitted 18 October, 2023;
originally announced October 2023.
-
Can We Edit Multimodal Large Language Models?
Authors:
Siyuan Cheng,
Bozhong Tian,
Qingbin Liu,
Xi Chen,
Yongheng Wang,
Huajun Chen,
Ningyu Zhang
Abstract:
In this paper, we focus on editing Multimodal Large Language Models (MLLMs). Compared to editing single-modal LLMs, multimodal model editing is more challenging, which demands a higher level of scrutiny and careful consideration in the editing process. To facilitate research in this area, we construct a new benchmark, dubbed MMEdit, for editing multimodal LLMs and establish a suite of innovative metrics for evaluation. We conduct comprehensive experiments involving various model editing baselines and analyze the impact of editing different components for multimodal LLMs. Empirically, we notice that previous baselines can implement editing multimodal LLMs to some extent, but the effect is still barely satisfactory, indicating the potential difficulty of this task. We hope that our work can provide the NLP community with insights. Code and dataset are available at https://github.com/zjunlp/EasyEdit.
Submitted 18 April, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
PointGAT: A quantum chemical property prediction model integrating graph attention and 3D geometry
Authors:
Rong Zhang,
Rongqing Yuan,
Boxue Tian
Abstract:
Predicting quantum chemical properties is a fundamental challenge for computational chemistry. While the development of graph neural networks has advanced molecular representation learning and property prediction, their performance could be further enhanced by incorporating 3D structural geometry into 2D molecular graph representation. In this study, we introduce the PointGAT model for quantum molecular property prediction, which integrates 3D molecular coordinates with graph-attention modeling. Comparison with other current models in molecular prediction tasks showed that PointGAT could provide higher predictive accuracy in various benchmark datasets from MoleculeNet, including ESOL, FreeSolv, Lipop, HIV, and 10 out of 12 tasks of the QM9 dataset. To further examine PointGAT prediction of quantum mechanical (QM) energies, we constructed a C10 dataset comprising 11,841 charged and chiral carbocation intermediates with QM energies calculated at the DM21/6-31G*//B3LYP/6-31G* levels. Notably, PointGAT achieved an R² value of 0.950 and an MAE of 1.616 kcal/mol, outperforming other models. Additional ablation studies indicated that incorporating molecular geometry into the model resulted in markedly higher predictive accuracy, reducing the MAE value from 1.802 kcal/mol to 1.616 kcal/mol. Moreover, visualization of PointGAT atomic attention weights suggested its predictions were interpretable. Findings in this study support the application of PointGAT as a powerful and versatile tool for quantum chemical property prediction that can facilitate high-accuracy modeling for fundamental exploration of chemical space as well as drug design and molecular engineering.
Submitted 8 October, 2023;
originally announced October 2023.
-
Adaptive Unscented Kalman Filter under Minimum Error Entropy with Fiducial Points for Non-Gaussian Systems
Authors:
Boyu Tian,
Haiquan Zhao
Abstract:
The minimum error entropy (MEE) has been extensively used in the unscented Kalman filter (UKF) to handle impulsive noises or abnormal measurement data in non-Gaussian systems. However, the MEE-UKF has poor numerical stability due to the inversion of a singular matrix. In this paper, a novel UKF based on minimum error entropy with fiducial points (MEEF) is proposed to address the problem of a non-positive definite key matrix. By adding the correntropy to the error entropy, the proposed algorithm further enhances the ability to suppress impulse noise and outliers. At the same time, considering the uncertainty of the noise distribution, the modified Sage-Husa estimator of noise statistics is introduced to adaptively update the noise covariance matrix. In addition, the convergence analysis of the proposed algorithm provides guidance for the selection of the kernel width. The robustness and estimation accuracy of the proposed algorithm are demonstrated by state tracking examples under complex non-Gaussian noises.
Submitted 18 September, 2023;
originally announced September 2023.
-
Realizing chameleonlike thermal rotator with transformation-invariant metamaterials
Authors:
Fubao Yang,
Boyan Tian,
Liujun Xu
Abstract:
Heat flux rotation has important significance in thermal protection since it can shield the heat energy from a selected direction. Combining with tailored metamaterials, transformation thermotics provides a powerful way to manipulate heat flux, and various kinds of thermal meta-devices have been designed including thermal rotator. However, the existing transformation-thermotics-based thermal rotator can only work in a fixed background. Remanufacturing is inevitable when background changes, which is inconvenient and restricts the practical application. Here, we propose a novel mechanism for chameleonlike thermal rotator. The designed rotator can adaptively change its thermal conductivity with the object nearby while rotating heat flux without distorting the background temperature profile, just like a chameleon in nature. Moreover, such rotator is made of transformation-invariant material, thus its constitutive parameters do not change under arbitrary coordinate transformations. Therefore, the proposed rotator also has functionality-invariance beyond shape adjustment, and can theoretically transfer heat flux in arbitrary direction using different shapes of the same material. A prototype rotator was designed and fabricated, and its chameleonlike behavior is successfully demonstrated. Our concept provides a guidance to design chameleonlike thermal meta-devices and can be extended to other fields like acoustics, hydrodynamics, etc. The chameleonlike thermal rotator will have potential applications for the implementation of adaptive and adjustable metamaterials.
Submitted 17 September, 2023;
originally announced September 2023.
-
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
Authors:
Peng Wang,
Ningyu Zhang,
Bozhong Tian,
Zekun Xi,
Yunzhi Yao,
Ziwen Xu,
Mengru Wang,
Shengyu Mao,
Xiaohan Wang,
Siyuan Cheng,
Kangwei Liu,
Yuansheng Ni,
Guozhou Zheng,
Huajun Chen
Abstract:
Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy issues, which means they are unaware of unseen events or generate text with incorrect facts owing to outdated/noisy data. To this end, many knowledge editing approaches for LLMs have emerged -- aiming to subtly inject/edit updated knowledge or adjust undesired behavior while minimizing the impact on unrelated inputs. Nevertheless, due to significant differences among various knowledge editing methods and the variations in task setups, there is no standard implementation framework available for the community, which hinders practitioners from applying knowledge editing to applications. To address these issues, we propose EasyEdit, an easy-to-use knowledge editing framework for LLMs. It supports various cutting-edge knowledge editing approaches and can be readily applied to many well-known LLMs such as T5, GPT-J, LlaMA, etc. Empirically, we report the knowledge editing results on LlaMA-2 with EasyEdit, demonstrating that knowledge editing surpasses traditional fine-tuning in terms of reliability and generalization. We have released the source code on GitHub, along with Google Colab tutorials and comprehensive documentation for beginners to get started. Besides, we present an online system for real-time knowledge editing, and a demo video.
Submitted 23 June, 2024; v1 submitted 14 August, 2023;
originally announced August 2023.
-
WeldMon: A Cost-effective Ultrasonic Welding Machine Condition Monitoring System
Authors:
Beitong Tian,
Kuan-Chieh Lu,
Ahmadreza Eslaminia,
Yaohui Wang,
Chenhui Shao,
Klara Nahrstedt
Abstract:
Ultrasonic welding machines play a critical role in the lithium battery industry, facilitating the bonding of batteries with conductors. Ensuring high-quality welding is vital, making tool condition monitoring systems essential for early-stage quality control. However, existing monitoring methods face challenges in cost, downtime, and adaptability. In this paper, we present WeldMon, an affordable ultrasonic welding machine condition monitoring system that utilizes a custom data acquisition system and a data analysis pipeline designed for real-time analysis. Our classification algorithm combines auto-generated features and hand-crafted features, achieving superior cross-validation accuracy (95.8% on average over all testing tasks) compared to the state-of-the-art method (92.5%) in condition classification tasks. Our data augmentation approach alleviates the concept drift problem, enhancing tool condition classification accuracy by 8.3%. All algorithms run locally, requiring only 385 milliseconds to process data for each welding cycle. We deploy WeldMon and a commercial system on an actual ultrasonic welding machine, performing a comprehensive comparison. Our findings highlight the potential for developing cost-effective, high-performance, and reliable tool condition monitoring systems.
Submitted 4 August, 2023;
originally announced August 2023.
-
Minimum-consumption discrimination of quantum states via globally optimal adaptive measurements
Authors:
Boxuan Tian,
Wenzhe Yan,
Zhibo Hou,
Guo-Yong Xiang,
Chuan-Feng Li,
Guang-Can Guo
Abstract:
Reducing the average resource consumption is the central quest in discriminating non-orthogonal quantum states for a fixed admissible error rate $\varepsilon$. The globally optimal fixed local projective measurement (GOFL) for this task is found to be different from that for previous minimum-error discrimination tasks [PRL 118, 030502 (2017)]. To achieve the ultimate minimum average consumption, here we develop a general globally optimal adaptive strategy (GOA) by subtly using the updated posterior probability, which works under any error rate requirement and any one-way measurement restrictions, and can be solved by a convergent iterative relation. First, under the local measurement restrictions, our GOA is solved to serve as the local bound, which saves 16.6 copies (24%) compared with the previously best GOFL. When the more powerful two-copy collective measurements are allowed, our GOA is experimentally demonstrated to beat the local bound by 3.9 copies (6.0%). By exploiting both adaptivity and collective measurements, our work marks an important step towards minimum-consumption quantum state discrimination.
Submitted 15 October, 2023; v1 submitted 30 July, 2023;
originally announced July 2023.
-
Probing high-momentum component in nucleon momentum distribution by neutron-proton bremsstrahlung γ-rays in heavy ion reactions
Authors:
Yuhao Qin,
Qinglin Niu,
Dong Guo,
Sheng Xiao,
Baiting Tian,
Yijie Wang,
Zhi Qin,
Xinyue Diao,
Fenhai Guan,
Dawei Si,
Boyuan Zhang,
Yaopeng Zhang,
Xianglun Wei,
Herun Yang,
Peng Ma,
Rongjiang Hu,
Limin Duan,
Fangfang Duan,
Qiang Hu,
Junbing Ma,
Shiwei Xu,
Zhen Bai,
Yanyun Yang,
Hongwei Wang,
Baohua Sun, et al. (3 additional authors not shown)
Abstract:
The high momentum tail (HMT) of nucleons, as a signature of the short-range correlations in nuclei, has been investigated by the high-energy bremsstrahlung $γ$ rays produced in $^{86}$Kr + $^{124}$Sn at 25 MeV/u. The energetic photons are measured by a CsI(Tl) hodoscope mounted on the spectrometer CSHINE. The energy spectrum above 30 MeV can be reproduced by the IBUU model calculations incorporating the photon production channel from the $np$ process in which the HMTs of nucleons are considered. A non-zero HMT ratio of about $15\%$ is favored by the data. The effect of the capture channel $np \to dγ$ is demonstrated.
Submitted 20 July, 2023;
originally announced July 2023.
-
Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning
Authors:
Yufan Liu,
Boxue Tian
Abstract:
Protein-DNA interaction is critical for life activities such as replication, transcription, and splicing. Identifying protein-DNA binding residues is essential for modeling their interaction and downstream studies. However, developing accurate and efficient computational methods for this task remains challenging. Improvements in this area have the potential to drive novel applications in biotechnology and drug design. In this study, we propose a novel approach called CLAPE, which combines a pre-trained protein language model and the contrastive learning method to predict DNA binding residues. We trained the CLAPE-DB model on the protein-DNA binding sites dataset and evaluated the model performance and generalization ability through various experiments. The results showed that the AUC values of the CLAPE-DB model on the two benchmark datasets reached 0.871 and 0.881, respectively, indicating superior performance compared to other existing models. CLAPE-DB showed better generalization ability and was specific to DNA-binding sites. In addition, we trained CLAPE on different protein-ligand binding sites datasets, demonstrating that CLAPE is a general framework for binding sites prediction. To facilitate the scientific community, the benchmark datasets and codes are freely available at https://github.com/YAndrewL/clape.
Submitted 28 June, 2023;
originally announced June 2023.
-
DeepStream: Bandwidth Efficient Multi-Camera Video Streaming for Deep Learning Analytics
Authors:
Hongpeng Guo,
Beitong Tian,
Zhe Yang,
Bo Chen,
Qian Zhou,
Shengzhong Liu,
Klara Nahrstedt,
Claudiu Danilov
Abstract:
Deep learning video analytic systems process live video feeds from multiple cameras with computer vision models deployed on edge or cloud. To optimize utility for these systems, which usually corresponds to query accuracy, efficient bandwidth management for the cameras competing for the fluctuating network resources is crucial. We propose DeepStream, a bandwidth-efficient multi-camera video streaming system for deep learning video analytics. DeepStream addresses the challenge of limited and fluctuating bandwidth resources by offering several tailored solutions. We design a novel Regions of Interest detection (ROIDet) algorithm which can run in real time on resource-constrained devices, such as Raspberry Pis, to remove spatial redundancy in video frames and reduce the amount of data to be transmitted. We also propose a content-aware bandwidth optimization framework and an Elastic Transmission Mechanism that exploits correlations among video contents. We implement DeepStream on Raspberry Pis and a desktop computer. Evaluations on real-world datasets show that DeepStream's ROIDet algorithm saves up to 54% bandwidth with less than 1% accuracy drop. Additionally, DeepStream improves utility by up to 23% compared to baselines under the same bandwidth conditions.
Submitted 26 June, 2023;
originally announced June 2023.
-
Editing Large Language Models: Problems, Methods, and Opportunities
Authors:
Yunzhi Yao,
Peng Wang,
Bozhong Tian,
Siyuan Cheng,
Zhoubo Li,
Shumin Deng,
Huajun Chen,
Ningyu Zhang
Abstract:
Despite the ability to train capable LLMs, the methodology for maintaining their relevancy and rectifying errors remains elusive. To this end, the past few years have witnessed a surge in techniques for editing LLMs, the objective of which is to efficiently alter the behavior of LLMs within a specific domain without negatively impacting performance across other inputs. This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs. In particular, we provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal. We also build a new benchmark dataset to facilitate a more robust evaluation and pinpoint enduring issues intrinsic to existing techniques. Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context. Code and datasets are available at https://github.com/zjunlp/EasyEdit.
Submitted 30 November, 2023; v1 submitted 22 May, 2023;
originally announced May 2023.
-
DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection
Authors:
Huan-ang Gao,
Beiwen Tian,
Pengfei Li,
Hao Zhao,
Guyue Zhou
Abstract:
In this paper, we study the problem of semi-supervised 3D object detection, which is of great importance considering the high annotation cost for cluttered 3D indoor scenes. We resort to the robust and principled framework of self-teaching, which has triggered notable progress for semi-supervised learning recently. While this paradigm is natural for image-level or pixel-level prediction, adapting it to the detection problem is challenged by the issue of proposal matching. Prior methods are based upon two-stage pipelines, matching heuristically selected proposals generated in the first stage and resulting in spatially sparse training signals. In contrast, we propose the first semi-supervised 3D detection algorithm that works in the single-stage manner and allows spatially dense training signals. A fundamental issue of this new design is the quantization error caused by point-to-voxel discretization, which inevitably leads to misalignment between two transformed views in the voxel domain. To this end, we derive and implement closed-form rules that compensate this misalignment on-the-fly. Our results are significant, e.g., promoting ScanNet mAP@0.5 from 35.2% to 48.5% using 20% annotation. Codes and data will be publicly available.
Submitted 11 August, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
Revisiting k-NN for Fine-tuning Pre-trained Language Models
Authors:
Lei Li,
Jing Chen,
Bozhong Tian,
Ningyu Zhang
Abstract:
Pre-trained Language Models (PLMs), as parametric-based eager learners, have become the de-facto choice for current paradigms of Natural Language Processing (NLP). In contrast, k-Nearest-Neighbor (kNN) classifiers, as the lazy learning paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we revisit kNN classifiers for augmenting the PLMs-based classifiers. From the methodological level, we propose to adopt kNN with textual representations of PLMs in two steps: (1) Utilize kNN as prior knowledge to calibrate the training process. (2) Linearly interpolate the probability distribution predicted by kNN with that of the PLMs' classifier. At the heart of our approach is the implementation of kNN-calibrated training, which treats predicted results as indicators for easy versus hard examples during the training process. From the perspective of the diversity of application scenarios, we conduct extensive experiments on fine-tuning, prompt-tuning paradigms and zero-shot, few-shot and fully-supervised settings, respectively, across eight diverse end-tasks. We hope our exploration will encourage the community to revisit the power of classical methods for efficient NLP. Code and datasets are available in https://github.com/zjunlp/Revisit-KNN.
Submitted 17 June, 2023; v1 submitted 18 April, 2023;
originally announced April 2023.
-
Delving into Shape-aware Zero-shot Semantic Segmentation
Authors:
Xinyu Liu,
Beiwen Tian,
Zhen Wang,
Rui Wang,
Kehua Sheng,
Bo Zhang,
Hao Zhao,
Guyue Zhou
Abstract:
Thanks to the impressive progress of large-scale vision-language pretraining, recent recognition models can classify arbitrary objects in a zero-shot and open-set manner, with a surprisingly high accuracy. However, translating this success to semantic segmentation is not trivial, because this dense prediction task requires not only accurate semantic understanding but also fine shape delineation, and existing vision-language models are trained with image-level language descriptions. To bridge this gap, we pursue shape-aware zero-shot semantic segmentation in this study. Inspired by classical spectral methods in the image segmentation literature, we propose to leverage the eigenvectors of Laplacian matrices constructed with self-supervised pixel-wise features to promote shape-awareness. Although this simple and effective technique does not make use of the masks of seen classes at all, we demonstrate that it outperforms a state-of-the-art shape-aware formulation that aligns ground truth and predicted edges during training. We also delve into the performance gains achieved on different datasets using different backbones and draw several interesting and conclusive observations: the benefits of promoting shape-awareness are highly related to mask compactness and language embedding locality. Finally, our method sets new state-of-the-art performance for zero-shot semantic segmentation on both Pascal and COCO, with significant margins. Code and models will be accessible at https://github.com/Liuxinyv/SAZS.
Submitted 17 April, 2023;
originally announced April 2023.
-
Role of electrodes in study of hydrovoltaic effects
Authors:
Chunxiao Zheng,
Sunmiao Fang,
Weicun Chu,
** Tan,
Bingkun Tian,
Xiaofeng Jiang,
Wanlin Guo
Abstract:
The last decade has witnessed the emergence of hydrovoltaic technology, which can harvest electricity from different forms of water movement, such as raindrops, waves, flows, moisture, and natural evaporation. In particular, the evaporation-induced hydrovoltaic effect received great attention since its discovery in 2017 due to its negative heat emission property. Nevertheless, the influence of electrode reactions in evaporation-induced power generation is not negligible due to the chemical reaction between active metal electrodes and water, which leads to "exceptional" power generation. Herein, we designed a series of experiments based on air-laid paper devices with electrodes of different activities as the top and bottom electrodes. To verify the contribution of electrodes, we compared the output performance of different electrode combinations when the device is partially wetted and fully wetted. The device hydrophilicity, salt concentration, and acidity or basicity of solutions are also comprehensively investigated. It is demonstrated that the chemical reaction of active metals (Zn, Cu, Ag, etc.) with different aqueous solutions can generate considerable electrical energy and significantly distort the device performance, especially for Zn electrodes with an output voltage from ~1.26 to ~1.52 V and current from ~1.24 to ~75.69 μA. To promote the long-term development of hydrovoltaic technology, we recommend the use of inert electrodes in hydrovoltaic studies, such as Au and Pt, especially in water and moisture environments.
Submitted 7 April, 2023;
originally announced April 2023.
-
Néel-type optical skyrmions inherited from evanescent electromagnetic fields with rotational symmetry
Authors:
Bo Tian,
Jingyao Jiang,
Ningsheng Xu,
Zebo Zheng,
Ximiao Wang,
Shaojing Liu,
Wuchao Huang,
Tian Jiang,
Huanjun Chen,
Shaozhi Deng
Abstract:
Optical skyrmions, the optical analogue of topological configurations formed by three-dimensional vector fields covering the whole 4π solid angle but confined in a two-dimensional (2D) domain, have recently attracted growing interest due to their potential applications in high-density data transfer, storage, and processing. While optical skyrmions have been successfully demonstrated using different field vectors in both free-space propagating and near-field evanescent electromagnetic fields, the study of the generation and control of optical skyrmions, and of their general correlation with electromagnetic (EM) fields, is still in its infancy. Here, we theoretically propose that evanescent transverse-magnetic-polarized (TM-polarized) EM fields with rotational symmetry are actually Néel-type optical skyrmions of the electric field vectors. Such optical skyrmions maintain the rotational symmetry and are independent of the operation frequency and medium. Our proposal was verified by numerical simulations and real-space nano-imaging experiments performed on a graphene monolayer. Such a discovery can therefore not only further our understanding of the formation mechanisms of EM topological textures, but also provide a guideline for facile construction of EM skyrmions that may impact future information technologies.
Submitted 8 March, 2023;
originally announced March 2023.
-
Point Cloud Classification Using Content-based Transformer via Clustering in Feature Space
Authors:
Yahui Liu,
Bin Tian,
Yisheng Lv,
Lingxi Li,
Feiyue Wang
Abstract:
Recently, there have been some attempts of Transformer in 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention, but ignore their content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based), which clusters the sampled points with similar features into the same class and computes the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an Inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves a remarkable performance on point cloud shape classification. Especially, our method exhibits 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Submitted 8 March, 2023;
originally announced March 2023.
-
MPS-AMS: Masked Patches Selection and Adaptive Masking Strategy Based Self-Supervised Medical Image Segmentation
Authors:
Xiangtao Wang,
Ruizhi Wang,
Biao Tian,
Jiaojiao Zhang,
Shuo Zhang,
Junyang Chen,
Thomas Lukasiewicz,
Zhenghua Xu
Abstract:
Existing self-supervised learning methods based on contrastive learning and masked image modeling have demonstrated impressive performances. However, current masked image modeling methods are mainly applied to natural images, and their applications to medical images are relatively lacking. Besides, their fixed high masking strategy limits the upper bound of conditional mutual information, and the gradient noise is considerable, reducing the representation information that is learned. Motivated by these limitations, in this paper, we propose a self-supervised medical image segmentation method based on masked patches selection and an adaptive masking strategy, named MPS-AMS. We leverage the masked patches selection strategy to choose masked patches with lesions to obtain more lesion representation information, and the adaptive masking strategy is utilized to help learn more mutual information and improve performance further. Extensive experiments on three public medical image segmentation datasets (BUSI, Hecktor, and Brats2018) show that our proposed method greatly outperforms the state-of-the-art self-supervised baselines.
Submitted 27 February, 2023;
originally announced February 2023.
-
Joint Acoustic Echo Cancellation and Speech Dereverberation Using Kalman filters
Authors:
Ziteng Wang,
Yueyue Na,
Biao Tian,
Qiang Fu
Abstract:
This paper proposes a joint acoustic echo cancellation (AEC) and speech dereverberation (DR) algorithm in the short-time Fourier transform domain. The reverberant microphone signals are described using an auto-regressive (AR) model. The AR coefficients and the loudspeaker-to-microphone acoustic transfer functions (ATFs) are considered time-varying and are modeled simultaneously using a first-order Markov process. This leads to a solution where these parameters can be optimally estimated using Kalman filters. It is shown that the proposed algorithm outperforms vanilla solutions that solve AEC and DR sequentially and one state-of-the-art joint DRAEC algorithm based on semi-blind source separation, in terms of both speech quality and echo reduction performance.
Submitted 9 February, 2023;
originally announced February 2023.
-
Leveraging Contaminated Datasets to Learn Clean-Data Distribution with Purified Generative Adversarial Networks
Authors:
Bowen Tian,
Qinliang Su,
Jianxing Yu
Abstract:
Generative adversarial networks (GANs) are known for their strong ability to capture the underlying distribution of training instances. Since the seminal work on GANs, many variants have been proposed. However, almost all existing GANs are built on the assumption that the training dataset is clean. In many real-world applications this may not hold; that is, the training dataset may be contaminated by a proportion of undesired instances. When training on such datasets, existing GANs will learn a mixture distribution of desired and contaminated instances, rather than the distribution of the desired data only (the target distribution). To learn the target distribution from contaminated datasets, two purified generative adversarial networks (PuriGAN) are developed, in which the discriminators are augmented with the capability to distinguish between target and contaminated instances by leveraging an extra dataset solely composed of contamination instances. We prove that under some mild conditions, the proposed PuriGANs are guaranteed to converge to the distribution of desired instances. Experimental results on several datasets demonstrate that the proposed PuriGANs are able to generate much better images from the desired distribution than comparable baselines when trained on contaminated datasets. In addition, we demonstrate the usefulness of PuriGAN on downstream applications by applying it to the tasks of semi-supervised anomaly detection on contaminated datasets and PU-learning. Experimental results show that PuriGAN is able to deliver the best performance over comparable baselines on both tasks.
Submitted 3 February, 2023;
originally announced February 2023.
-
From Semi-supervised to Omni-supervised Room Layout Estimation Using Point Clouds
Authors:
Huan-ang Gao,
Beiwen Tian,
Pengfei Li,
Xiaoxue Chen,
Hao Zhao,
Guyue Zhou,
Yurong Chen,
Hongbin Zha
Abstract:
Room layout estimation is a long-standing robotic vision task that benefits both environment sensing and motion planning. However, layout estimation using point clouds (PCs) still suffers from data scarcity due to annotation difficulty. As such, we address the semi-supervised setting of this task based upon the idea of model exponential moving averaging. But adapting this scheme to the state-of-the-art (SOTA) solution for PC-based layout estimation is not straightforward. To this end, we define a quad set matching strategy and several consistency losses based upon metrics tailored for layout quads. Besides, we propose a new online pseudo-label harvesting algorithm that decomposes the distribution of a hybrid distance measure between quads and PC into two components. This technique does not need manual threshold selection and intuitively encourages quads to align with reliable layout points. Surprisingly, this framework also works for the fully-supervised setting, achieving a new SOTA on the ScanNet benchmark. Last but not least, we also push the semi-supervised setting to the realistic omni-supervised setting, demonstrating significantly improved performance on a newly annotated ARKitScenes testing set. Our codes, data and models are released in this repository.
Submitted 31 January, 2023;
originally announced January 2023.
-
Editing Language Model-based Knowledge Graph Embeddings
Authors:
Siyuan Cheng,
Ningyu Zhang,
Bozhong Tian,
Xi Chen,
Qingbing Liu,
Huajun Chen
Abstract:
Recent decades have witnessed the empirical success of framing Knowledge Graph (KG) embeddings via language models. However, language model-based KG embeddings are usually deployed as static artifacts, making them difficult to modify post-deployment without re-training. To address this issue, we propose a new task of editing language model-based KG embeddings in this paper. This task is designed to facilitate rapid, data-efficient updates to KG embeddings without compromising the performance of other aspects. We build four new datasets: E-FB15k237, A-FB15k237, E-WN18RR, and A-WN18RR, and evaluate several knowledge editing baselines, demonstrating the limited ability of previous models to handle the proposed challenging task. We further propose a simple yet strong baseline dubbed KGEditor, which utilizes additional parametric layers of the hypernetwork to edit/add facts. Our comprehensive experimental results reveal that KGEditor excels in updating specific facts without impacting the overall performance, even when faced with limited training resources. Code and datasets are available at https://github.com/zjunlp/PromptKG/tree/main/deltaKG.
Submitted 19 December, 2023; v1 submitted 24 January, 2023;
originally announced January 2023.
-
VIBUS: Data-efficient 3D Scene Parsing with VIewpoint Bottleneck and Uncertainty-Spectrum Modeling
Authors:
Beiwen Tian,
Liyi Luo,
Hao Zhao,
Guyue Zhou
Abstract:
Recently, 3D scene parsing with deep learning approaches has been a hot topic. However, current methods with fully-supervised models require manually annotated point-wise supervision, which is extremely user-unfriendly and time-consuming to obtain. As such, training 3D scene parsing models with sparse supervision is an intriguing alternative. We term this task data-efficient 3D scene parsing and propose an effective two-stage framework named VIBUS to resolve it by exploiting the enormous number of unlabeled points. In the first stage, we perform self-supervised representation learning on unlabeled points with the proposed Viewpoint Bottleneck loss function. The loss function is derived from an information bottleneck objective imposed on scenes under different viewpoints, making the process of representation learning free of degradation and sampling. In the second stage, pseudo labels are harvested from the sparse labels based on uncertainty-spectrum modeling. By combining data-driven uncertainty measures and 3D mesh spectrum measures (derived from normal directions and geodesic distances), a robust local affinity metric is obtained. Finite gamma/beta mixture models are used to decompose category-wise distributions of these measures, leading to automatic selection of thresholds. We evaluate VIBUS on the public benchmark ScanNet and achieve state-of-the-art results on both the validation set and the online test server. Ablation studies show that both Viewpoint Bottleneck and uncertainty-spectrum modeling bring significant improvements. Codes and models are publicly available at https://github.com/AIR-DISCOVER/VIBUS.
Submitted 20 October, 2022;
originally announced October 2022.
-
TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation
Authors:
Pengfei Li,
Beiwen Tian,
Yongliang Shi,
Xiaoxue Chen,
Hao Zhao,
Guyue Zhou,
Ya-Qin Zhang
Abstract:
Current referring expression comprehension algorithms can effectively detect or segment objects indicated by nouns, but how to understand verb reference is still under-explored. As such, we study the challenging problem of task oriented detection, which aims to find objects that best afford an action indicated by verbs like sit comfortably on. Towards a finer localization that better serves downstream applications like robot interaction, we extend the problem into task oriented instance segmentation. A unique requirement of this task is to select preferred candidates among possible alternatives. Thus we resort to the transformer architecture which naturally models pair-wise query relationships with attention, leading to the TOIST method. In order to leverage pre-trained noun referring expression comprehension models and the fact that we can access privileged noun ground truth during training, a novel noun-pronoun distillation framework is proposed. Noun prototypes are generated in an unsupervised manner and contextual pronoun features are trained to select prototypes. As such, the network remains noun-agnostic during inference. We evaluate TOIST on the large-scale task oriented dataset COCO-Tasks and achieve +10.9% higher $\rm{mAP^{box}}$ than the best-reported results. The proposed noun-pronoun distillation can boost $\rm{mAP^{box}}$ and $\rm{mAP^{mask}}$ by +2.8% and +3.8%. Codes and models are publicly available at https://github.com/AIR-DISCOVER/TOIST.
Submitted 19 October, 2022;
originally announced October 2022.
-
InTEn-LOAM: Intensity and Temporal Enhanced LiDAR Odometry and Mapping
Authors:
Shuaixin Li,
Bin Tian,
Zhu Xiaozhou,
Gui Jianjun,
Yao Wen,
Guangyun Li
Abstract:
Traditional LiDAR odometry (LO) systems mainly leverage geometric information obtained from the traversed surroundings to register laser scans and estimate LiDAR ego-motion, which may be unreliable in dynamic or unstructured environments. This paper proposes InTEn-LOAM, a low-drift and robust LiDAR odometry and mapping method that fully exploits implicit information of laser sweeps (i.e., geometric, intensity, and temporal characteristics). Scanned points are projected to cylindrical images, which facilitate the efficient and adaptive extraction of various types of features, i.e., ground, beam, facade, and reflector. We propose a novel intensity-based points registration algorithm and incorporate it into the LiDAR odometry, enabling the LO system to jointly estimate the LiDAR ego-motion using both geometric and intensity feature points. To eliminate the interference of dynamic objects, we propose a temporal-based dynamic object removal approach to filter them out before map update. Moreover, the local map is organized and downsampled using a temporal-related voxel grid filter to maintain the similarity between the current scan and the static local map. Extensive experiments are conducted on both simulated and real-world datasets. The results show that the proposed method achieves similar or better accuracy compared with the state of the art in normal driving scenarios and outperforms geometric-based LO in unstructured environments.
Submitted 12 September, 2022;
originally announced September 2022.
-
Observing the Ping-pong Modality of Isospin Degree of Freedom in Cluster Emission from Heavy Ion Reactions
Authors:
Yijie Wang,
Fenhai Guan,
Xinyue Diao,
Mengting Wan,
Yuhao Qin,
Zhi Qin,
Qianghua Wu,
Dong Guo,
Dawei Si,
Sheng Xiao,
Boyuan Zhang,
Yaopeng Zhang,
Baiting Tian,
Xianglun Wei,
Herun Yang,
Peng Ma,
Rongjiang Hu,
Limin Duan,
Fangfang Duan,
Qiang Hu,
Junbing Ma,
Shiwei Xu,
Zhen Bai,
Yanyun Yang,
Jiansong Wang
, et al. (14 additional authors not shown)
Abstract:
Two-body correlations of the isotope-resolved light and heavy clusters are measured in $^{86}$Kr+$^{\rm 208}$Pb reactions at 25 MeV/u. The yield and kinetic variables of the $A=3$ isobars, triton and $^3$He, are analyzed in coincidence with the heavy clusters of $7\le A \le 14$ emitted at the earlier chance. While the velocity spectra of both triton and $^3$He exhibit scaling behavior over the type of the heavy clusters, the yield ratios of ${\rm t/^3He}$ correlate reversely to the neutron-to-proton ratio $N/Z$ of the latter, showing the ping-pong modality of the $N/Z$ of emitted clusters. The commonality that the $N/Z$ of the residues keeps the initial system value is extended to the cluster emission in heavy ion reactions. The comparison of transport model calculations to the data is discussed.
Submitted 8 September, 2022;
originally announced September 2022.
-
Language-guided Semantic Style Transfer of 3D Indoor Scenes
Authors:
Bu Jin,
Beiwen Tian,
Hao Zhao,
Guyue Zhou
Abstract:
We address the new problem of language-guided semantic style transfer of 3D indoor scenes. The input is a 3D indoor scene mesh and several phrases that describe the target scene. Firstly, 3D vertex coordinates are mapped to RGB residues by a multi-layer perceptron. Secondly, colored 3D meshes are differentiably rendered into 2D images, via a viewpoint sampling strategy tailored for indoor scenes. Thirdly, rendered 2D images are compared to phrases, via pre-trained vision-language models. Lastly, errors are back-propagated to the multi-layer perceptron to update vertex colors corresponding to certain semantic categories. We conducted large-scale qualitative analyses and A/B user tests with the public ScanNet and SceneNN datasets. We demonstrate that: (1) the results are visually pleasing and potentially useful for multimedia applications; (2) rendering 3D indoor scenes from viewpoints consistent with human priors is important; (3) incorporating semantics significantly improves style transfer quality; (4) an HSV regularization term leads to results that are more consistent with inputs and generally rated better. Codes and user study toolbox are available at https://github.com/AIR-DISCOVER/LASST
Submitted 16 August, 2022;
originally announced August 2022.
-
Gauge-equivariant flow models for sampling in lattice field theories with pseudofermions
Authors:
Ryan Abbott,
Michael S. Albergo,
Denis Boyda,
Kyle Cranmer,
Daniel C. Hackett,
Gurtej Kanwar,
Sébastien Racanière,
Danilo J. Rezende,
Fernando Romero-López,
Phiala E. Shanahan,
Betsy Tian,
Julian M. Urban
Abstract:
This work presents gauge-equivariant architectures for flow-based sampling in fermionic lattice field theories using pseudofermions as stochastic estimators for the fermionic determinant. This is the default approach in state-of-the-art lattice field theory calculations, making this development critical to the practical application of flow models to theories such as QCD. Methods by which flow-based sampling approaches can be improved via standard techniques such as even/odd preconditioning and the Hasenbusch factorization are also outlined. Numerical demonstrations in two-dimensional U(1) and SU(3) gauge theories with $N_f=2$ flavors of fermions are provided.
Submitted 16 October, 2022; v1 submitted 18 July, 2022;
originally announced July 2022.
-
Anomaly Detection by Leveraging Incomplete Anomalous Knowledge with Anomaly-Aware Bidirectional GANs
Authors:
Bowen Tian,
Qinliang Su,
Jian Yin
Abstract:
The goal of anomaly detection is to identify anomalous samples from normal ones. In this paper, a small number of anomalies are assumed to be available at the training stage, but they are assumed to be collected only from several anomaly types, leaving the majority of anomaly types not represented in the collected anomaly dataset at all. To effectively leverage this kind of incomplete anomalous knowledge represented by the collected anomalies, we propose to learn a probability distribution that can not only model the normal samples, but also guarantee to assign low density values for the collected anomalies. To this end, an anomaly-aware generative adversarial network (GAN) is developed, which, in addition to modeling the normal samples as most GANs do, is able to explicitly avoid assigning probabilities for collected anomalous samples. Moreover, to facilitate the computation of anomaly detection criteria like reconstruction error, the proposed anomaly-aware GAN is designed to be bidirectional, attaching an encoder for the generator. Extensive experimental results demonstrate that our proposed method is able to effectively make use of the incomplete anomalous information, leading to significant performance gains compared to existing methods.
Submitted 1 May, 2022; v1 submitted 28 April, 2022;
originally announced April 2022.
-
Unsupervised Cross-Task Generalization via Retrieval Augmentation
Authors:
Bill Yuchen Lin,
Kangmin Tan,
Chris Miller,
Beiwen Tian,
Xiang Ren
Abstract:
Humans can perform unseen tasks by recalling relevant skills acquired previously and then generalizing them to the target tasks, even if there is no supervision at all. In this paper, we aim to improve this kind of cross-task generalization ability of massive multi-task language models, such as T0 and FLAN, in an unsupervised setting. We propose a retrieval-augmentation method named ReCross that takes a few unlabelled examples as queries to retrieve a small subset of upstream data and uses them to update the multi-task model for better generalization. ReCross is a straightforward yet effective retrieval method that combines both efficient dense retrieval and effective pair-wise reranking. Our results and analysis show that it significantly outperforms both non-retrieval methods and other baseline methods.
Submitted 17 October, 2022; v1 submitted 17 April, 2022;
originally announced April 2022.
-
Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness
Authors:
Dianwen Ng,
Jin Hui Pang,
Yang Xiao,
Biao Tian,
Qiang Fu,
Eng Siong Chng
Abstract:
It is critical for a keyword spotting model to have a small footprint as it typically runs on-device with low computational resources. However, maintaining the previous SOTA performance with reduced model size is challenging. In addition, a far-field and noisy environment with interference from multiple signals aggravates the problem, causing the accuracy to degrade significantly. In this paper, we present a multi-channel ConvMixer for speech command recognition. The novel architecture introduces an additional audio channel mixing for channel audio interaction in a multi-channel audio setting to achieve better noise-robust features with more efficient computation. Besides, we propose a centroid based awareness component to enhance the system by equipping it with additional spatial geometry information in the latent feature projection space. We evaluate our model using the new MISP challenge 2021 dataset. Our model achieves significant improvement against the official baseline with a 55% gain in the competition score (0.152) on raw microphone array input and a 63% (0.126) boost upon front-end speech enhancement.
Submitted 11 April, 2022;
originally announced April 2022.
-
Non-invasive chemically selective energy delivery and focusing inside a scattering medium guided by Raman scattering
Authors:
Bingxin Tian,
Bernhard Rauer,
Antoine Boniface,
Jun Han,
Sylvain Gigan,
Hilton B. de Aguiar,
Hilton de Aguiar
Abstract:
Raman scattering is a chemically selective probing mechanism with diverse applications in industry and clinical settings. Yet, most samples are optically opaque, limiting the applicability of Raman probing at depth. Here, we demonstrate chemically selective energy deposition behind a scattering medium by combining prior information on the chemical's spectrum with the measurement of a spectrally resolved Raman speckle as a feedback mechanism for wavefront shaping. We demonstrate unprecedented six-fold signal enhancement in an epi-geometry, realizing targeted energy deposition and focusing on selected Raman active particles.
Submitted 21 February, 2022;
originally announced February 2022.
-
Multi-Task Deep Residual Echo Suppression with Echo-aware Loss
Authors:
Shimin Zhang,
Ziteng Wang,
Jiayao Sun,
Yihui Fu,
Biao Tian,
Qiang Fu,
Lei Xie
Abstract:
This paper introduces the NWPU Team's entry to the ICASSP 2022 AEC Challenge. We take a hybrid approach that cascades a linear AEC with a neural post-filter. The former is used to deal with the linear echo components while the latter suppresses the residual non-linear echo components. We use a gated convolutional F-T-LSTM neural network (GFTNN) as the backbone and shape the post-filter by a multi-task learning (MTL) framework, where a voice activity detection (VAD) module is adopted as an auxiliary task along with echo suppression, with the aim of avoiding over-suppression that may cause speech distortion. Moreover, we adopt an echo-aware loss function, where the mean square error (MSE) loss can be optimized particularly for every time-frequency bin (TF-bin) according to the signal-to-echo ratio (SER), leading to further suppression of the echo. An extensive ablation study shows that the time delay estimation (TDE) module in the neural post-filter leads to better perceptual quality, and an adaptive filter with better convergence will bring consistent performance gain for the post-filter. Besides, we find that using the linear echo as the input of our neural post-filter is a better choice than using the reference signal directly. In the ICASSP 2022 AEC Challenge, our approach ranked 1st on word accuracy (WAcc) (0.817) and 3rd on both mean opinion score (MOS) (4.502) and the final score (0.864).
Submitted 20 February, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.