Search | arXiv e-print repository

UDC: A Unified Neural Divide-and-Conquer Framework for Large-Scale Combinatorial Optimization Problems

Authors: Zhi Zheng, Changliang Zhou, Tong Xialiang, Mingxuan Yuan, Zhenkun Wang

Abstract: Single-stage neural combinatorial optimization solvers have achieved near-optimal results on various small-scale combinatorial optimization (CO) problems without needing expert knowledge. However, these solvers exhibit significant performance degradation when applied to large-scale CO problems. Recently, two-stage neural methods with divide-and-conquer strategies have shown superiority in addressing large-scale CO problems. Nevertheless, the efficiency of these methods relies heavily on problem-specific heuristics in either the divide or the conquer procedure, which limits their applicability to general CO problems. Moreover, these methods employ separate training schemes and ignore the interdependencies between the dividing and conquering strategies, which often leads to sub-optimal solutions. To tackle these drawbacks, this article develops a unified neural divide-and-conquer framework (i.e., UDC) for solving general large-scale CO problems. UDC offers a Divide-Conquer-Reunion (DCR) training method to eliminate the negative impact of a sub-optimal dividing policy. Employing a high-efficiency Graph Neural Network (GNN) for global dividing and a fixed-length sub-path solver for conquering sub-problems, the proposed UDC framework demonstrates extensive applicability, achieving superior performance in 10 representative large-scale CO problems.

Submitted 29 June, 2024; originally announced July 2024.

arXiv:2406.18360 [pdf, other]

XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis

Authors: Hao Li, Ming Yuan, Yan Zhang, Chenming Wu, Chen Zhao, Chunyu Song, Haocheng Feng, Errui Ding, Dingwen Zhang, Jingdong Wang

Abstract: Thoroughly testing autonomy systems is crucial in the pursuit of safe autonomous driving vehicles. It necessitates creating safety-critical scenarios that go beyond what can be safely collected from real-world data, as many of these scenarios occur infrequently on public roads. However, the evaluation of most existing novel view synthesis (NVS) methods relies on sporadic sampling of image frames from the training data, comparing the rendered images with ground truth images using metrics. Unfortunately, this evaluation protocol falls short of meeting the actual requirements in closed-loop simulations. Specifically, the true application demands the capability to render novel views that extend beyond the original trajectory (such as cross-lane views), which are challenging to capture in the real world. To address this, this paper presents a novel driving view synthesis dataset and benchmark specifically designed for autonomous driving simulations. This dataset is unique as it includes testing images captured by deviating from the training trajectory by 1-4 meters. It comprises six sequences encompassing various time and weather conditions. Each sequence contains 450 training images, 150 testing images, and their corresponding camera poses and intrinsic parameters. Leveraging this novel dataset, we establish the first realistic benchmark for evaluating existing NVS approaches under front-only and multi-camera settings. The experimental findings underscore the significant gap that exists in current approaches, revealing their inadequate ability to fulfill the demanding prerequisites of cross-lane or closed-loop simulation. Our dataset is released publicly at the project page: https://3d-aigc.github.io/XLD/.

Submitted 26 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

Comments: project page: https://3d-aigc.github.io/XLD/

arXiv:2406.14868 [pdf, other]

Direct Multi-Turn Preference Optimization for Language Agents

Authors: Wentao Shi, Mengqi Yuan, Junkang Wu, Qifan Wang, Fuli Feng

Abstract: Adapting Large Language Models (LLMs) for agent tasks is critical in developing language agents. Direct Preference Optimization (DPO) is a promising technique for this adaptation with the alleviation of compounding errors, offering a means to directly optimize Reinforcement Learning (RL) objectives. However, applying DPO to multi-turn tasks presents challenges due to the inability to cancel the partition function. Overcoming this obstacle involves making the partition function independent of the current state and addressing length disparities between preferred and dis-preferred trajectories. In this light, we replace the policy constraint with the state-action occupancy measure constraint in the RL objective and add length normalization to the Bradley-Terry model, yielding a novel loss function named DMPO for multi-turn agent tasks with theoretical explanations. Extensive experiments on three multi-turn agent task datasets confirm the effectiveness and superiority of the DMPO loss.

Submitted 25 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.05010 [pdf, other]

Testing common invariant subspace of multilayer networks

Authors: Mingao Yuan, Qianqian Yao

Abstract: A graph (or network) is a mathematical structure that has been widely used to model relational data. As real-world systems get more complex, multilayer (or multiple) networks are employed to represent diverse patterns of relationships among the objects in the systems. One active research problem in multilayer network analysis is to study the common invariant subspace of the networks, because such a common invariant subspace could capture the fundamental structural patterns and interactions across all layers. Many methods have been proposed to estimate the common invariant subspace. However, whether real-world multilayer networks share the same common subspace remains unknown. In this paper, we make a first attempt to answer this question by means of hypothesis testing. The null hypothesis states that the multilayer networks share the same subspace, and under the alternative hypothesis, there exist at least two networks that do not have the same subspace. We propose a Weighted Degree Difference Test, derive the limiting distribution of the test statistic, and provide an analytical characterization of the power. A simulation study shows that the proposed test has satisfactory performance, and a real data application is provided.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.04699 [pdf, other]

Logic Synthesis with Generative Deep Neural Networks

Authors: Xihan Li, Xing Li, Lei Chen, Xing Zhang, Mingxuan Yuan, Jun Wang

Abstract: While deep learning has achieved significant success in various domains, its application to logic circuit design has been limited due to complex constraints and strict feasibility requirements. However, a recent generative deep neural model, "Circuit Transformer", has shown promise in this area by enabling equivalence-preserving circuit transformation on a small scale. In this paper, we introduce a logic synthesis rewriting operator based on the Circuit Transformer model, named "ctrw" (Circuit Transformer Rewriting), which incorporates the following techniques: (1) a two-stage training scheme for the Circuit Transformer tailored for logic synthesis, with iterative improvement of optimality through self-improvement training; (2) integration of the Circuit Transformer with state-of-the-art rewriting techniques to address scalability issues, allowing for guided DAG-aware rewriting. Experimental results on the IWLS 2023 contest benchmark demonstrate the effectiveness of our proposed rewriting methods.

Submitted 7 June, 2024; originally announced June 2024.

Comments: In IWLS 2024

arXiv:2406.04594 [pdf, other]

Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach

Authors: Jianbo Dong, Bin Luo, Jun Zhang, Pengcheng Zhang, Fei Feng, Yikai Zhu, Ang Liu, Zian Chen, Yi Shi, Hairong Jiao, Gang Lu, Yu Guan, Ennan Zhai, Wencong Xiao, Hanyu Zhao, Man Yuan, Siran Yang, Xiang Li, Jiamang Wang, Rui Men, Jianwei Zhang, Huang Zhong, Dennis Cai, Yuan Xie, Binzhang Fu

Abstract: The emergence of Large Language Models (LLMs) has necessitated the adoption of parallel training techniques, involving the deployment of thousands of GPUs to train a single model. Unfortunately, we have found that the efficiency of current parallel training is often suboptimal, largely due to the following two main issues. Firstly, hardware failures are inevitable, leading to interruptions in the training tasks. The inability to quickly identify the faulty components results in a substantial waste of GPU resources. Secondly, since GPUs must wait for parameter synchronization to complete before proceeding to the next round of computation, network congestion can greatly increase the waiting time for GPUs. To address these challenges, this paper introduces a communication-driven solution, namely C4. The key insights of C4 are twofold. First, in parallel training, collective communication exhibits periodic and homogeneous characteristics, so any anomalies are certainly due to some form of hardware malfunction. By leveraging this feature, C4 can rapidly identify the faulty components, swiftly isolate the anomaly, and restart the task, thereby avoiding resource wastage caused by delays in anomaly detection. Second, the predictable communication model of collective communication, involving few large flows, allows C4 to efficiently execute traffic planning, substantially reducing network congestion. C4 has been extensively implemented across our production systems, cutting error-induced overhead by roughly 30% and enhancing runtime performance by about 15% for certain applications with moderate communication costs.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.19548 [pdf, other]

RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning

Authors: Mingqi Yuan, Roger Creus Castanyer, Bo Li, Xin Jin, Glen Berseth, Wenjun Zeng

Abstract: Extrinsic rewards can effectively guide reinforcement learning (RL) agents in specific tasks. However, extrinsic rewards frequently fall short in complex environments due to the significant human effort needed for their design and annotation. This limitation underscores the necessity for intrinsic rewards, which offer auxiliary and dense signals and can enable agents to learn in an unsupervised manner. Although various intrinsic reward formulations have been proposed, their implementation and optimization details are insufficiently explored and lack standardization, thereby hindering research progress. To address this gap, we introduce RLeXplore, a unified, highly modularized, and plug-and-play framework offering reliable implementations of eight state-of-the-art intrinsic reward algorithms. Furthermore, we conduct an in-depth study that identifies critical implementation details and establishes well-justified standard practices in intrinsically-motivated RL. The source code for RLeXplore is available at https://github.com/RLE-Foundation/RLeXplore.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 25 pages, 19 figures

arXiv:2405.19531 [pdf, other]

Real-Time Dynamic Robot-Assisted Hand-Object Interaction via Motion Primitives

Authors: Mingqi Yuan, Huijiang Wang, Kai-Fung Chu, Fumiya Iida, Bo Li, Wenjun Zeng

Abstract: Advances in artificial intelligence (AI) have been propelling the evolution of human-robot interaction (HRI) technologies. However, significant challenges remain in achieving seamless interactions, particularly in tasks requiring physical contact with humans. These challenges arise from the need for accurate real-time perception of human actions, adaptive control algorithms for robots, and the effective coordination between human and robotic movements. In this paper, we propose an approach to enhancing physical HRI with a focus on dynamic robot-assisted hand-object interaction (HOI). Our methodology integrates hand pose estimation, adaptive robot control, and motion primitives to facilitate human-robot collaboration. Specifically, we employ a transformer-based algorithm to perform real-time 3D modeling of human hands from single RGB images, based on which a motion primitives model (MPM) is designed to translate human hand motions into robotic actions. The robot's action implementation is dynamically fine-tuned using the continuously updated 3D hand models. Experimental validations, including a ring-wearing task, demonstrate the system's effectiveness in adapting to real-time movements and assisting in precise task executions.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 8 pages, 10 figures

arXiv:2405.18748 [pdf]

doi 10.1016/j.egycc.2023.100118

Equity Implications of Net-Zero Emissions: A Multi-Model Analysis of Energy Expenditures Across Income Classes Under Economy-Wide Deep Decarbonization Policies

Authors: John Bistlinea, Chikara Onda, Morgan Browning, Johannes Emmerling, Gokul Iyer, Megan Mahajan, Jim McFarland, Haewon McJeon, Robbie Orvis, Francisco Ralston Fonseca, Christopher Roney, Noah Sandoval, Luis Sarmiento, John Weyant, Jared Woollacott, Mei Yuan

Abstract: With companies, states, and countries targeting net-zero emissions around midcentury, there are questions about how these targets alter household welfare and finances, including distributional effects across income groups. This paper examines the distributional dimensions of technology transitions and net-zero policies with a focus on welfare impacts across household incomes. The analysis uses a model intercomparison with a range of energy-economy models using harmonized policy scenarios reaching economy-wide, net-zero CO2 emissions across the United States in 2050. We employ a novel linking approach that connects output from detailed energy system models with survey microdata on energy expenditures across income classes to provide distributional analysis of net-zero policies. Although there are differences in model structure and input assumptions, we find broad agreement in qualitative trends in policy incidence and energy burdens across income groups. Models generally agree that direct energy expenditures for many households will likely decline over time with reference and net-zero policies. However, there is variation in the extent of changes relative to current levels, energy burdens relative to reference levels, and electricity expenditures. Policy design, primarily how climate policy revenues are used, has first-order impacts on distributional outcomes. Net-zero policy costs, in both absolute and relative terms, are unevenly distributed across households, and relative increases in energy expenditures are higher for lowest-income households. However, we also find that recycled revenues from climate policies have countervailing effects when rebated on a per-capita basis, offsetting higher energy burdens and potentially even leading to net progressive outcomes.

Submitted 29 May, 2024; originally announced May 2024.

Journal ref: 2024, Energy and Climate Change, 5: 100118

arXiv:2405.18412 [pdf, other]

Tensor Methods in High Dimensional Data Analysis: Opportunities and Challenges

Authors: Arnab Auddy, Dong Xia, Ming Yuan

Abstract: Large amounts of multidimensional data represented by multiway arrays or tensors are prevalent in modern applications across various fields such as chemometrics, genomics, physics, psychology, and signal processing. The structural complexity of such data provides vast new opportunities for modeling and analysis, but efficiently extracting information content from them, both statistically and computationally, presents unique and fundamental challenges. Addressing these challenges requires an interdisciplinary approach that brings together tools and insights from statistics, optimization, and numerical linear algebra, among other fields. Despite these hurdles, significant progress has been made in the last decade. This review seeks to examine some of the key advancements and identify common threads among them, under eight different statistical settings.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.17525 [pdf, ps, other]

SmoothGNN: Smoothing-based GNN for Unsupervised Node Anomaly Detection

Authors: Xiangyu Dong, Xingyi Zhang, Yanni Sun, Lei Chen, Mingxuan Yuan, Sibo Wang

Abstract: The smoothing issue leads to indistinguishable node representations, which poses a significant challenge in the field of graph learning. However, this issue also presents an opportunity to reveal underlying properties behind different types of nodes, which have been overlooked in previous studies. Through empirical and theoretical analysis of real-world node anomaly detection (NAD) datasets, we observe that anomalous and normal nodes show different patterns in the smoothing process, which can be leveraged to enhance NAD tasks. Motivated by these findings, in this paper, we propose a novel unsupervised NAD framework. Specifically, according to our theoretical analysis, we design a Smoothing Learning Component. Subsequently, we introduce a Smoothing-aware Spectral Graph Neural Network, which establishes the connection between the spectral space of graphs and the smoothing process. Additionally, we demonstrate that the Dirichlet Energy, which reflects the smoothness of a graph, can serve as coefficients for node representations across different dimensions of the spectral space. Building upon these observations and analyses, we devise a novel anomaly measure for the NAD task. Extensive experiments on 9 real-world datasets show that SmoothGNN outperforms the best rival by an average of 14.66% in AUC and 7.28% in Precision, with 75x running time speed-up, which validates the effectiveness and efficiency of our framework.

Submitted 27 May, 2024; originally announced May 2024.
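
Note on the SmoothGNN entry above: the Dirichlet energy it mentions is commonly defined as the sum of squared feature differences across edges, which equals trace(X^T L X) for the unnormalized graph Laplacian L = D - A. The minimal Python sketch below assumes that unnormalized form and an undirected edge list; the function name `dirichlet_energy` and the toy graph are illustrative only and are not taken from the paper, which may use a normalized variant.

```python
def dirichlet_energy(edges, features):
    """Sum of squared feature differences over undirected edges.

    Assumption: unnormalized Laplacian L = D - A, so this equals
    trace(X^T L X). `edges` lists each undirected edge once as (u, v);
    `features[i]` is the feature vector of node i.
    """
    total = 0.0
    for u, v in edges:
        total += sum((a - b) ** 2 for a, b in zip(features[u], features[v]))
    return total


# Toy path graph 0 - 1 - 2 (hypothetical data).
edges = [(0, 1), (1, 2)]
smooth = [[1.0], [1.0], [1.0]]   # identical features: fully smooth
rough = [[1.0], [-1.0], [1.0]]   # alternating features: not smooth

print(dirichlet_energy(edges, smooth))
print(dirichlet_energy(edges, rough))
```

Lower energy means neighbouring nodes carry more similar representations, which is the sense in which the quantity "reflects the smoothness of a graph".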

arXiv:2405.17272 [pdf, other]

DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems

Authors: Zhi Zheng, Shunyu Yao, Zhenkun Wang, Xialiang Tong, Mingxuan Yuan, Ke Tang

Abstract: The min-max vehicle routing problem (min-max VRP) traverses all given customers by assigning several routes and aims to minimize the length of the longest route. Recently, reinforcement learning (RL)-based sequential planning methods have exhibited advantages in solving efficiency and optimality. However, these methods fail to exploit the problem-specific properties in learning representations, resulting in less effective features for decoding optimal routes. This paper considers the sequential planning process of min-max VRPs as two coupled optimization tasks: customer partition for different routes and customer navigation in each route (i.e., partition and navigation). To effectively process min-max VRP instances, we present a novel attention-based Partition-and-Navigation encoder (P&N Encoder) that learns distinct embeddings for partition and navigation. Furthermore, we utilize an inherent symmetry in decoding routes and develop an effective agent-permutation-symmetric (APS) loss function. Experimental results demonstrate that the proposed Decoupling-Partition-Navigation (DPN) method significantly surpasses existing learning-based methods in both single-depot and multi-depot min-max VRPs. Our code is available at

Submitted 6 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.12262 [pdf, other]

Prompt Learning for Generalized Vehicle Routing

Authors: Fei Liu, Xi Lin, Weiduo Liao, Zhenkun Wang, Qingfu Zhang, Xialiang Tong, Mingxuan Yuan

Abstract: Neural combinatorial optimization (NCO) is a promising learning-based approach to solving various vehicle routing problems without much manual algorithm design. However, the current NCO methods mainly focus on the in-distribution performance, while the real-world problem instances usually come from different distributions. A costly fine-tuning approach or generalized model retraining from scratch could be needed to tackle the out-of-distribution instances. Unlike the existing methods, this work investigates an efficient prompt learning approach in NCO for cross-distribution adaptation. To be concrete, we propose a novel prompt learning method to facilitate fast zero-shot adaptation of a pre-trained model to solve routing problem instances from different distributions. The proposed model learns a set of prompts among various distributions and then selects the best-matched one to prompt a pre-trained attention model for each problem instance. Extensive experiments show that the proposed prompt learning approach facilitates the fast adaptation of pre-trained routing models. It also outperforms existing generalized models on both in-distribution prediction and zero-shot generalization to a diverse set of new tasks. Our code implementation is available online at https://github.com/FeiLiu36/PromptVRP.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.11051 [pdf, ps, other]

Darboux transformation of diffusion processes

Authors: Alexey Kuznetsov, Minjian Yuan

Abstract: Darboux transformation of a second order linear differential operator is a well-known technique with many applications in mathematics and physics. We study Darboux transformation from the point of view of Markov semigroups of diffusion processes. We construct a Darboux transform of a diffusion process through a combination of Doob's $h$-transform and a version of Siegmund duality. Our main result is a simple formula that connects transition probability densities of the two processes. We provide several examples of Darboux transformed diffusion processes related to Brownian motion and Ornstein-Uhlenbeck process. For these examples we compute explicitly the transition probability density and derive its spectral representation.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 25 pages

MSC Class: 60J60; 60J35

arXiv:2405.11024 [pdf, other]

GraSS: Combining Graph Neural Networks with Expert Knowledge for SAT Solver Selection

Authors: Zhanguang Zhang, Didier Chetelat, Joseph Cotnareanu, Amur Ghose, Wenyi Xiao, Hui-Ling Zhen, Yingxue Zhang, Jianye Hao, Mark Coates, Mingxuan Yuan

Abstract: Boolean satisfiability (SAT) problems are routinely solved by SAT solvers in real-life applications, yet solving time can vary drastically between solvers for the same instance. This has motivated research into machine learning models that can predict, for a given SAT instance, which solver to select among several options. Existing SAT solver selection methods all rely on some hand-picked instance features, which are costly to compute and ignore the structural information in SAT graphs. In this paper we present GraSS, a novel approach for automatic SAT solver selection based on tripartite graph representations of instances and a heterogeneous graph neural network (GNN) model. While GNNs have been previously adopted in other SAT-related tasks, they do not incorporate any domain-specific knowledge and ignore the runtime variation introduced by different clause orders. We enrich the graph representation with domain-specific decisions, such as novel node feature design, positional encodings for clauses in the graph, a GNN architecture tailored to our tripartite graphs and a runtime-sensitive loss function. Through extensive experiments, we demonstrate that this combination of raw representations and domain-specific choices leads to improvements in runtime for a pool of seven state-of-the-art solvers on both an industrial circuit design benchmark, and on instances from the 20-year Anniversary Track of the 2022 SAT Competition.

Submitted 17 May, 2024; originally announced May 2024.

Comments: Accepted by KDD 2024

arXiv:2405.09024 [pdf, other]

Dynamic Loss Decay based Robust Oriented Object Detection on Remote Sensing Images with Noisy Labels

Authors: Guozhang Liu, Ting Liu, Mengke Yuan, Tao Pang, Guangxing Yang, Hao Fu, Tao Wang, Tongkui Liao

Abstract: The ambiguous appearance, tiny scale, and fine-grained classes of objects in remote sensing imagery inevitably lead to noisy category-label annotations in detection datasets. However, the effects and treatment of label noise are underexplored in modern oriented remote sensing object detectors. To address this issue, we propose a robust oriented remote sensing object detection method based on a dynamic loss decay (DLD) mechanism, inspired by the two-phase "early-learning" and "memorization" learning dynamics of deep neural networks on clean and noisy samples. Specifically, we first observe the end point of the early-learning phase, termed EL, after which the models begin to memorize the false labels that significantly degrade the detection accuracy. Secondly, under the guidance of the training indicator, the losses of each sample are ranked in descending order, and we adaptively decay the losses of the top K largest ones (bad samples) in the following epochs, because these large losses are highly likely to have been calculated with wrong labels. Experimental results show that the method achieves excellent noise resistance on multiple public datasets such as HRSC2016 and DOTA-v1.0/v2.0 with synthetic category label noise. Our solution also won 2nd place in the "fine-grained object detection based on sub-meter remote sensing imagery" track with noisy labels of the 2023 National Big Data and Computing Intelligence Challenge.

Submitted 14 May, 2024; originally announced May 2024.
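
The ranking-and-decay step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `decay_top_k_losses`, the fixed `decay` factor, and the fixed `k` are assumptions made for illustration, whereas the abstract states only that the top K losses are decayed adaptively after the EL epoch.

```python
import numpy as np

def decay_top_k_losses(losses, k, decay=0.5):
    """Scale the k largest per-sample losses by `decay` (hypothetical helper).

    Assumption: a constant decay factor; the paper's adaptive schedule
    is not specified in the abstract.
    """
    losses = np.asarray(losses, dtype=float)
    if k <= 0:
        return losses.copy()
    k = min(k, losses.size)
    # Indices of the k largest losses; their relative order does not matter.
    top_idx = np.argpartition(losses, -k)[-k:]
    weights = np.ones_like(losses)
    weights[top_idx] = decay
    return losses * weights

# The two largest losses (3.0 and 2.5) are treated as likely noisy samples.
example = decay_top_k_losses([0.2, 3.0, 0.5, 2.5, 0.1], k=2, decay=0.1)
```

In a training loop, such a reweighting would be applied only after the early-learning end point, so that clean samples dominate the gradient once memorization begins.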

arXiv:2405.07131 [pdf, other]

MAxPrototyper: A Multi-Agent Generation System for Interactive User Interface Prototyping

Authors: Mingyue Yuan, Jieshan Chen, Aaron Quigley

Abstract: In automated user interactive design, designers face key challenges, including accurate representation of user intent, crafting high-quality components, and ensuring both aesthetic and semantic consistency. Addressing these challenges, we introduce MAxPrototyper, our human-centered, multi-agent system for interactive design generation. The core of MAxPrototyper is a theme design agent. It coordinates with specialized sub-agents, each responsible for generating specific parts of the design. Through an intuitive online interface, users can control the design process by providing text descriptions and layout. Enhanced by improved language and image generation models, MAxPrototyper generates each component with careful detail and contextual understanding. Its multi-agent architecture enables a multi-round interaction capability between the system and users, facilitating precise and customized design adjustments throughout the creation process.

Submitted 11 May, 2024; originally announced May 2024.

arXiv:2405.04733 [pdf, other]

One-Bit Phase Retrieval: Optimal Rates and Efficient Algorithms

Authors: Junren Chen, Ming Yuan

Abstract: In this paper, we study the sample complexity and develop efficient optimal algorithms for 1-bit phase retrieval: recovering a signal $\mathbf{x}\in\mathbb{R}^n$ from $m$ phaseless bits $\{\mathrm{sign}(|\mathbf{a}_i^\top\mathbf{x}|-\tau)\}_{i=1}^m$ generated by standard Gaussian $\mathbf{a}_i$s. By investigating a phaseless version of random hyperplane tessellation, we show that (constrained) Hamming distance minimization uniformly recovers all unstructured signals with Euclidean norm bounded away from zero and infinity to the error $\mathcal{O}((n/m)\log(m/n))$, and $\mathcal{O}((k/m)\log(mn/k^2))$ when restricting to $k$-sparse signals. Both error rates are shown to be information-theoretically optimal, up to a logarithmic factor. Intriguingly, the optimal rate for sparse recovery matches that of 1-bit compressed sensing, suggesting that the phase information is non-essential for 1-bit compressed sensing. We also develop efficient algorithms for 1-bit (sparse) phase retrieval that can achieve these error rates. Specifically, we prove that (thresholded) gradient descent with respect to the one-sided $\ell_1$-loss, when initialized via spectral methods, converges linearly and attains the near-optimal reconstruction error, with sample complexity $\mathcal{O}(n)$ for unstructured signals and $\mathcal{O}(k^2\log(n)\log^2(m/k))$ for $k$-sparse signals. Our proof is based upon the observation that a certain local (restricted) approximate invertibility condition is respected by Gaussian measurements. To show this, we utilize a delicate covering argument and derive tight concentration bounds for the directional gradients by properly conditioning on the index set of phaseless hyperplane separations, which may be of independent interest and useful for other related problems.

Submitted 7 May, 2024; originally announced May 2024.
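
The measurement model in the abstract above, $\mathrm{sign}(|\mathbf{a}_i^\top\mathbf{x}|-\tau)$ with standard Gaussian $\mathbf{a}_i$, can be simulated with a short sketch. The function names, the seed, and the values of `n`, `m`, and `tau` are illustrative assumptions and do not come from the paper; the sketch shows only how the bits are formed and why the global sign of the signal cannot be recovered from them.

```python
import numpy as np

def one_bit_phaseless_measurements(x, m, tau, seed=0):
    """Return (A, y) with y_i = sign(|a_i^T x| - tau) for Gaussian rows a_i."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, x.shape[0]))
    y = np.sign(np.abs(A @ x) - tau)
    return A, y

def hamming_distance(y1, y2):
    """Number of positions where two bit vectors disagree."""
    return int(np.sum(y1 != y2))

# Hypothetical sizes and threshold, chosen only for the demonstration.
n, m, tau = 20, 200, 1.0
x = np.ones(n) / np.sqrt(n)          # unit-norm signal
A, y = one_bit_phaseless_measurements(x, m, tau)

# x and -x give identical bits because only |a_i^T x| is observed.
y_neg = np.sign(np.abs(A @ (-x)) - tau)
sign_ambiguity = hamming_distance(y, y_neg)
```

Because the bits depend only on the magnitude of each projection, any recovery guarantee of this kind is necessarily stated up to a global sign flip.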

arXiv:2405.01906 [pdf, other]

Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization

Authors: Changliang Zhou, Xi Lin, Zhenkun Wang, Xialiang Tong, Mingxuan Yuan, Qingfu Zhang

Abstract: The neural combinatorial optimization (NCO) approach has shown great potential for solving routing problems without the requirement of expert knowledge. However, existing constructive NCO methods cannot directly solve large-scale instances, which significantly limits their application prospects. To address these crucial shortcomings, this work proposes a novel Instance-Conditioned Adaptation Model (ICAM) for better large-scale generalization of neural combinatorial optimization. In particular, we design a powerful yet lightweight instance-conditioned adaptation module for the NCO model to generate better solutions for instances across different scales. In addition, we develop an efficient three-stage reinforcement learning-based training scheme that enables the model to learn cross-scale features without any labeled optimal solution. Experimental results show that our proposed method is capable of obtaining excellent results with a very fast inference time in solving Traveling Salesman Problems (TSPs) and Capacitated Vehicle Routing Problems (CVRPs) across different scales. To the best of our knowledge, our model achieves state-of-the-art performance among all RL-based constructive methods for TSP and CVRP with up to 1,000 nodes.

Submitted 3 May, 2024; originally announced May 2024.

Comments: 17 pages, 6 figures

arXiv:2404.17360 [pdf, other]

UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning

Authors: Maoxun Yuan, Bo Cui, Tianyi Zhao, Xingxing Wei

Abstract: Semantic analysis on visible (RGB) and infrared (IR) images has gained attention for its ability to be more accurate and robust under low-illumination and complex weather conditions. Due to the lack of pre-trained foundation models on large-scale infrared image datasets, existing methods prefer to design task-specific frameworks and directly fine-tune them with pre-trained foundation models on their RGB-IR semantic relevance datasets, which results in poor scalability and limited generalization. In this work, we propose a scalable and efficient framework called UniRGB-IR to unify RGB-IR downstream tasks, in which a novel adapter is developed to efficiently introduce richer RGB-IR features into the pre-trained RGB-based foundation model. Specifically, our framework consists of a vision transformer (ViT) foundation model, a Multi-modal Feature Pool (MFP) module and a Supplementary Feature Injector (SFI) module. The MFP and SFI modules cooperate with each other as an adapter to effectively complement the ViT features with the contextual multi-scale features. During training, we freeze the entire foundation model to inherit prior knowledge and only optimize the MFP and SFI modules. Furthermore, to verify the effectiveness of our framework, we utilize ViT-Base as the pre-trained foundation model to perform extensive experiments. Experimental results on various RGB-IR downstream tasks demonstrate that our method can achieve state-of-the-art performance. The source code and results are available at https://github.com/PoTsui99/UniRGB-IR.git.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.12638 [pdf, other]

Learning to Cut via Hierarchical Sequence/Set Model for Efficient Mixed-Integer Programming

Authors: Jie Wang, Zhihai Wang, Xijun Li, Yufei Kuang, Zhihao Shi, Fangzhou Zhu, Mingxuan Yuan, Jia Zeng, Yongdong Zhang, Feng Wu

Abstract: Cutting planes (cuts) play an important role in solving mixed-integer linear programs (MILPs), which formulate many important real-world applications. Cut selection heavily depends on (P1) which cuts to prefer and (P2) how many cuts to select. Although modern MILP solvers tackle (P1)-(P2) by human-designed heuristics, machine learning carries the potential to learn more effective heuristics. However, many existing learning-based methods learn which cuts to prefer, neglecting the importance of learning how many cuts to select. Moreover, we observe that (P3) what order of selected cuts to prefer significantly impacts the efficiency of MILP solvers as well. To address these challenges, we propose a novel hierarchical sequence/set model (HEM) to learn cut selection policies. Specifically, HEM is a bi-level model: (1) a higher-level module that learns how many cuts to select, and (2) a lower-level module that formulates cut selection as a sequence/set-to-sequence learning problem and learns policies that select an ordered subset whose cardinality is determined by the higher-level module. To the best of our knowledge, HEM is the first data-driven methodology that tackles (P1)-(P3) simultaneously. Experiments demonstrate that HEM significantly improves the efficiency of solving MILPs on eleven challenging MILP benchmarks, including two real problems from Huawei.

Submitted 19 April, 2024; originally announced April 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2302.00244

arXiv:2404.07951 [pdf, other]

Visualization for physics analysis improvement and applications in BESIII

Authors: Zhi-Jun Li, Ming-Kuan Yuan, Yun-Xuan Song, Yan-Gu Li, Jing-Shu Li, Sheng-Sen Sun, Xiao-Long Wang, Zheng-Yun You, Ya-Jun Mao

Abstract: Modern particle physics experiments usually rely on highly complex and large-scale spectrometer devices. In high energy physics experiments, visualization helps detector design, data quality monitoring, offline data processing, and has great potential for improving physics analysis. In addition to the traditional physics data analysis based on statistical methods, visualization provides unique intuitive advantages in searching for rare signal events and reducing background noises. By applying the event display tool to several physics analyses in the BESIII experiment, we demonstrate that visualization can benefit potential physics discovery and improve the signal significance. With the development of modern visualization techniques, it is expected to play a more important role in future data processing and physics analysis of particle physics experiments.

Submitted 19 March, 2024; originally announced April 2024.

Comments: 19 pages, 7 figures

arXiv:2404.05404 [pdf, other]

Contouring Error Bounded Control for Biaxial Switched Linear Systems

Authors: Meng Yuan, Ye Wang, Chris Manzie, Zhezhuang Xu, Tianyou Chai

Abstract: Biaxial motion control systems are used extensively in manufacturing and printing industries. To improve throughput and reduce machine cost, lightweight materials are being proposed in structural components but may result in higher flexibility in the machine links. This flexibility is often position dependent and compromises precision of the end effector of the machine. To address the need for improved contouring accuracy in industrial machines with position-dependent structural flexibility, this paper introduces a novel contouring error-bounded control algorithm for biaxial switched linear systems. The proposed algorithm utilizes model predictive control to guarantee the satisfaction of state, input, and contouring error constraints for any admissible mode switching. In this paper, the switching signal remains unknown to the controller, although information about the minimum time the system is expected to stay in a specific mode is considered to be available. The proposed algorithm has the property of recursive feasibility and ensures the stability of the closed-loop system. The effectiveness of the proposed method is demonstrated by applying it to a high-fidelity simulation of a dual-drive industrial laser machine. The results show that the contouring error is successfully bounded within the given tolerance.

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.05168 [pdf, other]

Adapting to Covariate Shift in Real-time by Encoding Trees with Motion Equations

Authors: Tham Yik Foong, Heng Zhang, Mao Po Yuan, Danilo Vasconcellos Vargas

Abstract: Input distribution shift presents a significant problem in many real-world systems. Here we present Xenovert, an adaptive algorithm that can dynamically adapt to changes in input distribution. It is a perfect binary tree that adaptively divides a continuous input space into several intervals of uniform density while receiving a continuous stream of input. This process indirectly maps the source distribution to the shifted target distribution, preserving the data's relationship with the downstream decoder/operation, even after the shift occurs. In this paper, we demonstrated how a neural network integrated with Xenovert achieved better results in 4 out of 5 shifted datasets, saving the hurdle of retraining a machine learning model. We anticipate that Xenovert can be applied to many more applications that require adaptation to unforeseen input distribution shifts, even when the distribution shift is drastic.

Submitted 7 April, 2024; originally announced April 2024.

Comments: 7 figures, 2 tables

arXiv:2404.04878 [pdf, other]

CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data

Authors: Wei Fang, Yuxing Tang, Heng Guo, Mingze Yuan, Tony C. W. Mok, Ke Yan, Jiawen Yao, Xin Chen, Zaiyi Liu, Le Lu, Ling Zhang, Minfeng Xu

Abstract: In the realm of medical 3D data, such as CT and MRI images, prevalent anisotropic resolution is characterized by high intra-slice but diminished inter-slice resolution. The lowered resolution between adjacent slices poses challenges, hindering optimal viewing experiences and impeding the development of robust downstream analysis algorithms. Various volumetric super-resolution algorithms aim to surmount these challenges, enhancing inter-slice resolution and overall 3D medical imaging quality. However, existing approaches confront inherent challenges: 1) often tailored to specific upsampling factors, lacking flexibility for diverse clinical scenarios; 2) newly generated slices frequently suffer from over-smoothing, degrading fine details, and leading to inter-slice inconsistency. In response, this study presents CycleINR, a novel enhanced Implicit Neural Representation model for 3D medical data volumetric super-resolution. Leveraging the continuity of the learned implicit function, the CycleINR model can achieve results with arbitrary up-sampling rates, eliminating the need for separate training. Additionally, we enhance the grid sampling in CycleINR with a local attention mechanism and mitigate over-smoothing by integrating cycle-consistent loss. We introduce a new metric, Slice-wise Noise Level Inconsistency (SNLI), to quantitatively assess inter-slice noise level inconsistency. The effectiveness of our approach is demonstrated through image quality evaluations on an in-house dataset and a downstream task analysis on the Medical Segmentation Decathlon liver tumor dataset.

Submitted 7 April, 2024; originally announced April 2024.

Comments: CVPR accepted paper

arXiv:2403.19561 [pdf, other]

Self-Improved Learning for Scalable Neural Combinatorial Optimization

Authors: Fu Luo, Xi Lin, Zhenkun Wang, Xialiang Tong, Mingxuan Yuan, Qingfu Zhang

Abstract: The end-to-end neural combinatorial optimization (NCO) method shows promising performance in solving complex combinatorial optimization problems without the need for expert design. However, existing methods struggle with large-scale problems, hindering their practical applicability. To overcome this limitation, this work proposes a novel Self-Improved Learning (SIL) method for better scalability of neural combinatorial optimization. Specifically, we develop an efficient self-improved mechanism that enables direct model training on large-scale problem instances without any labeled data. Powered by an innovative local reconstruction approach, this method can iteratively generate better solutions by itself as pseudo-labels to guide efficient model training. In addition, we design a linear complexity attention mechanism for the model to efficiently handle large-scale combinatorial problem instances with low computation overhead. Comprehensive experiments on the Travelling Salesman Problem (TSP) and the Capacitated Vehicle Routing Problem (CVRP) with up to 100K nodes in both uniform and real-world distributions demonstrate the superior scalability of our method.

Submitted 2 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.19446 [pdf, other]

EDA-Driven Preprocessing for SAT Solving

Authors: Zhengyuan Shi, Tiebing Tang, Sadaf Khan, Hui-Ling Zhen, Mingxuan Yuan, Zhufei Chu, Qiang Xu

Abstract: Effective formulation of problems into Conjunctive Normal Form (CNF) is critical in modern Boolean Satisfiability (SAT) solving for optimizing solver performance. Addressing the limitations of existing methods, our Electronic Design Automation (EDA)-driven preprocessing framework introduces a novel methodology for preparing SAT instances, leveraging both circuit and CNF formats for enhanced flexibility and efficiency. Central to our approach is the integration of a new logic synthesis technique, guided by a reinforcement learning agent, and a novel cost-customized LUT mapping strategy, enabling efficient handling of diverse SAT challenges. By transforming the SAT competition benchmarks into circuit instances, our framework demonstrates substantial performance improvements, as evidenced by a 52.42% reduction on average compared to solving directly. Moreover, our framework achieves a remarkable 96.14% runtime reduction on average for a set of logic equivalence checking problems that exhibit inherent circuit structures. These results highlight the effectiveness and versatility of our approach in handling both CNF and circuit instances. The code is available at https://github.com/cure-lab/EDA4SAT.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.18768 [pdf, other]

Efficient Generation of Multi-partite Entanglement between Non-local Superconducting Qubits using Classical Feedback

Authors: Akel Hashim, Ming Yuan, Pranav Gokhale, Larry Chen, Christian Juenger, Neelay Fruitwala, Yilun Xu, Gang Huang, Liang Jiang, Irfan Siddiqi

Abstract: Quantum entanglement is one of the primary features which distinguishes quantum computers from classical computers. In gate-based quantum computing, the creation of entangled states or the distribution of entanglement across a quantum processor often requires circuit depths which grow with the number of entangled qubits. However, in teleportation-based quantum computing, one can deterministically generate entangled states with a circuit depth that is constant in the number of qubits, provided that one has access to an entangled resource state, the ability to perform mid-circuit measurements, and can rapidly transmit classical information. In this work, aided by fast classical FPGA-based control hardware with a feedback latency of only 150 ns, we explore the utility of teleportation-based protocols for generating non-local, multi-partite entanglement between superconducting qubits. First, we demonstrate well-known protocols for generating Greenberger-Horne-Zeilinger (GHZ) states and non-local CNOT gates in constant depth. Next, we utilize both protocols for implementing an unbounded fan-out (i.e., controlled-NOT-NOT) gate in constant depth between three non-local qubits. Finally, we demonstrate deterministic state teleportation and entanglement swapping between qubits on opposite sides of our quantum processor.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.17828 [pdf, other]

The Relativistic Spin Precession in the Compact Double Neutron Star System PSR J1946+2052

Authors: Lingqi Meng, Weiwei Zhu, Michael Kramer, Xueli Miao, Gregory Desvignes, Lijing Shao, Huanchen Hu, Paulo C. C. Freire, Yongkun Zhang, Mengyao Xue, Ziyao Fang, David J. Champion, Mao Yuan, Chenchen Miao, Jiarui Niu, Qiuyang Fu, Jumei Yao, Yanjun Guo, Chengmin Zhang

Abstract: We observe systematic profile changes in the visible pulsar of the compact double neutron star system PSR J1946+2052 using observations with the Five-hundred-meter Aperture Spherical radio Telescope (FAST). The interpulse of PSR J1946+2052 changed from single-peak to double-peak shape from 2018 to 2021. We attribute this evolution to the relativistic spin precession of the pulsar. With the high sensitivity of FAST, we also measure significant polarization for the first time, allowing us to model this with the precessional rotating vector model. Assuming, to the first order, a circular hollow-cone-like emission beam pattern and taking the validity of general relativity, we derive the binary's orbital inclination angle (${63^\circ}^{+5^\circ}_{-3^\circ}$) and the pulsar's spin geometry. The pulsar's spin vector and the orbital angular momentum vector are found to be only slightly misaligned (${0.21^\circ}^{+0.28^\circ}_{-0.10^\circ}$). The quoted uncertainties do not reflect the systematic uncertainties introduced by our model assumptions. By simulating future observations of profile and polarization evolution, we estimate that we could constrain the precession rate within a $43\%$ uncertainty in 9 years. Hence, we suggest that the system's profile evolution could be combined with precise pulsar timing to test general relativity in the future.

Submitted 26 March, 2024; originally announced March 2024.

Comments: 12 pages, 9 figures, accepted for publication in ApJ

arXiv:2403.13838 [pdf, other]

Circuit Transformer: End-to-end Circuit Design by Predicting the Next Gate

Authors: Xihan Li, Xing Li, Lei Chen, Xing Zhang, Mingxuan Yuan, Jun Wang

Abstract: Language, a prominent human ability to express through sequential symbols, has been computationally mastered by recent advances of large language models (LLMs). By predicting the next word recurrently with huge neural models, LLMs have shown unprecedented capabilities in understanding and reasoning. Circuit, as the "language" of electronic design, specifies the functionality of an electronic device by cascade connections of logic gates. Then, can circuits also be mastered by a sufficiently large "circuit model", which can conquer electronic design tasks by simply predicting the next logic gate? In this work, we take the first step to explore such possibilities. Two primary barriers impede the straightforward application of LLMs to circuits: their complex, non-sequential structure, and the intolerance of hallucination due to strict constraints (e.g., equivalence). For the first barrier, we encode a circuit as a memory-less, depth-first traversal trajectory, which allows Transformer-based neural models to better leverage its structural information, and predict the next gate on the trajectory as a circuit model. For the second barrier, we introduce an equivalence-preserving decoding process, which ensures that every token in the generated trajectory adheres to the specified equivalence constraints. Moreover, the circuit model can also be regarded as a stochastic policy to tackle optimization-oriented circuit design tasks. Experimentally, we trained a Transformer-based model of 88M parameters, named "Circuit Transformer", which demonstrates impressive performance in end-to-end logic synthesis. With Monte-Carlo tree search, Circuit Transformer significantly improves over resyn2 while retaining strict equivalence, showcasing the potential of generative AI in conquering electronic design challenges.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.11671 [pdf, other]

HDLdebugger: Streamlining HDL debugging with Large Language Models

Authors: Xufeng Yao, Haoyang Li, Tsz Ho Chan, Wenyi Xiao, Mingxuan Yuan, Yu Huang, Lei Chen, Bei Yu

Abstract: In the domain of chip design, Hardware Description Languages (HDLs) play a pivotal role. However, due to the complex syntax of HDLs and the limited availability of online resources, debugging HDL codes remains a difficult and time-intensive task, even for seasoned engineers. Consequently, there is a pressing need to develop automated HDL code debugging models, which can alleviate the burden on hardware engineers. Despite the strong capabilities of Large Language Models (LLMs) in generating, completing, and debugging software code, their utilization in the specialized field of HDL debugging has been limited and, to date, has not yielded satisfactory results. In this paper, we propose an LLM-assisted HDL debugging framework, namely HDLdebugger, which consists of HDL debugging data generation via a reverse engineering approach, a search engine for retrieval-augmented generation, and a retrieval-augmented LLM fine-tuning approach. Through the integration of these components, HDLdebugger can automate and streamline HDL debugging for chip design. Our comprehensive experiments, conducted on an HDL code dataset sourced from Huawei, reveal that HDLdebugger outperforms 13 cutting-edge LLM baselines, displaying exceptional effectiveness in HDL code debugging.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 13 pages, 5 figures

arXiv:2403.07257 [pdf, other]

The Dawn of AI-Native EDA: Opportunities and Challenges of Large Circuit Models

Authors: Lei Chen, Yiqi Chen, Zhufei Chu, Wenji Fang, Tsung-Yi Ho, Ru Huang, Yu Huang, Sadaf Khan, Min Li, Xingquan Li, Yu Li, Yun Liang, Jinwei Liu, Yi Liu, Yibo Lin, Guojie Luo, Zhengyuan Shi, Guangyu Sun, Dimitrios Tsaras, Runsheng Wang, Ziyi Wang, Xinming Wei, Zhiyao Xie, Qiang Xu, Chenhao Xue, et al. (14 additional authors not shown)

Abstract: Within the Electronic Design Automation (EDA) domain, AI-driven solutions have emerged as formidable tools, yet they typically augment rather than redefine existing methodologies. These solutions often repurpose deep learning models from other domains, such as vision, text, and graph analytics, applying them to circuit design without tailoring to the unique complexities of electronic circuits. Such an AI4EDA approach falls short of achieving a holistic design synthesis and understanding, overlooking the intricate interplay of electrical, logical, and physical facets of circuit data. This paper argues for a paradigm shift from AI4EDA towards AI-native EDA, integrating AI at the core of the design process. Pivotal to this vision is the development of a multimodal circuit representation learning technique, poised to provide a comprehensive understanding by harmonizing and extracting insights from varied data sources, such as functional specifications, RTL designs, circuit netlists, and physical layouts. We champion the creation of large circuit models (LCMs) that are inherently multimodal, crafted to decode and express the rich semantics and structures of circuit data, thus fostering more resilient, efficient, and inventive design methodologies. Embracing this AI-native philosophy, we foresee a trajectory that transcends the current innovation plateau in EDA, igniting a profound shift-left in electronic design methodology. The envisioned advancements herald not just an evolution of existing EDA tools but a revolution, giving rise to novel instruments of design tools that promise to radically enhance design productivity and inaugurate a new epoch where the optimization of circuit performance, power, and area (PPA) is achieved not incrementally, but through leaps that redefine the benchmarks of electronic systems' capabilities.

Submitted 1 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: The authors are ordered alphabetically. Contact: qxu@cse[dot]cuhk[dot]edu[dot]hk, gluo@pku[dot]edu[dot]cn, yuan.mingxuan@huawei[dot]com

arXiv:2403.05280 [pdf, other]

ContrastDiagnosis: Enhancing Interpretability in Lung Nodule Diagnosis Using Contrastive Learning

Authors: Chenglong Wang, Yinqiao Yi, Yida Wang, Chengxiu Zhang, Yun Liu, Kensaku Mori, Mei Yuan, Guang Yang

Abstract: With the ongoing development of deep learning, an increasing number of AI models have surpassed the performance levels of human clinical practitioners. However, the prevalence of AI diagnostic products in actual clinical practice remains significantly lower than desired. One crucial reason for this gap is the so-called 'black box' nature of AI models. Clinicians' distrust of black box models has directly hindered the clinical deployment of AI products. To address this challenge, we propose ContrastDiagnosis, a straightforward yet effective interpretable diagnosis framework. This framework is designed to introduce inherent transparency and provide extensive post-hoc explainability for deep learning models, making them more suitable for clinical medical diagnosis. ContrastDiagnosis incorporates a contrastive learning mechanism to provide a case-based reasoning diagnostic rationale, enhancing the model's transparency, and also offers post-hoc interpretability by highlighting similar areas. High diagnostic accuracy was achieved, with an AUC of 0.977, while maintaining high transparency and explainability.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.04914 [pdf]

Improving the Equation of Exchange for Cryptoasset Valuation Using Empirical Data

Authors: Stylianos Kampakis, Melody Yuan, Oritsebawo Paul Ikpobe, Linas Stankevicius

Abstract: In the evolving domain of cryptocurrency markets, accurate token valuation remains a critical aspect influencing investment decisions and policy development. Whilst the prevailing equation of exchange pricing model offers a quantitative valuation approach based on the interplay between token price, transaction volume, supply, and either velocity or holding time, it exhibits intrinsic shortcomings. Specifically, the model may not consistently delineate the relationship between average token velocity and holding time. This paper aims to refine this equation, enhancing the depth of insight into token valuation methodologies.

Submitted 7 March, 2024; originally announced March 2024.

arXiv:2403.03517 [pdf, other]

IB-Net: Initial Branch Network for Variable Decision in Boolean Satisfiability

Authors: Tsz Ho Chan, Wenyi Xiao, Junhua Huang, Huiling Zhen, Guangji Tian, Mingxuan Yuan

Abstract: Boolean Satisfiability problems are vital components in Electronic Design Automation, particularly within the Logic Equivalence Checking process. Currently, SAT solvers are employed for these problems, and neural networks have been tried as assistance to solvers. However, as SAT problems in the LEC context are distinctive due to their predominantly unsatisfiable nature and a substantial proportion of UNSAT-core variables, existing neural network assistance has proven unsuccessful in this specialized domain. To tackle this challenge, we propose IB-Net, an innovative framework utilizing graph neural networks and novel graph encoding techniques to model unsatisfiable problems and interact with state-of-the-art solvers. Extensive evaluations across solvers and datasets demonstrate IB-Net's acceleration, achieving an average runtime speedup of 5.0% on industrial data and 8.3% on SAT competition data empirically. This breakthrough advances efficient solving in LEC workflows.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 7 pages, 12 figures

arXiv:2403.00012 [pdf, other]

PreRoutGNN for Timing Prediction with Order Preserving Partition: Global Circuit Pre-training, Local Delay Learning and Attentional Cell Modeling

Authors: Ruizhe Zhong, Junjie Ye, Zhentao Tang, Shixiong Kai, Mingxuan Yuan, Jianye Hao, Junchi Yan

Abstract: Pre-routing timing prediction has been recently studied for evaluating the quality of a candidate cell placement in chip design. It involves directly estimating the timing metrics for both pin-level (slack, slew) and edge-level (net delay, cell delay), without time-consuming routing. However, it often suffers from signal decay and error accumulation due to the long timing paths in large-scale industrial circuits. To address these challenges, we propose a two-stage approach. First, we propose global circuit training to pre-train a graph auto-encoder that learns the global graph embedding from the circuit netlist. Second, we use a novel node updating scheme for message passing on GCN, following the topological sorting sequence of the learned graph embedding and circuit graph. This scheme residually models the local time delay between two adjacent pins in the updating sequence, and extracts the lookup table information inside each cell via a new attention mechanism. To handle large-scale circuits efficiently, we introduce an order preserving partition scheme that reduces memory consumption while maintaining the topological dependencies. Experiments on 21 real-world circuits achieve a new SOTA R2 of 0.93 for slack prediction, which significantly surpasses the 0.59 of the previous SOTA method. Code will be available at: https://github.com/Thinklab-SJTU/EDA-AI.

Submitted 12 March, 2024; v1 submitted 26 February, 2024; originally announced March 2024.

Comments: 13 pages, 5 figures, The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

arXiv:2402.18849 [pdf]

Enhancing Steganographic Text Extraction: Evaluating the Impact of NLP Models on Accuracy and Semantic Coherence

Authors: Mingyang Li, Maoqin Yuan, Luyao Li, Han Pengsihua

Abstract: This study discusses a new method combining image steganography technology with Natural Language Processing (NLP) large models, aimed at improving the accuracy and robustness of extracting steganographic text. Traditional Least Significant Bit (LSB) steganography techniques face challenges in accuracy and robustness of information extraction when dealing with complex character encoding, such as Chinese characters. To address this issue, this study proposes an innovative LSB-NLP hybrid framework. This framework integrates the advanced capabilities of NLP large models, such as error detection, correction, and semantic consistency analysis, as well as information reconstruction techniques, thereby significantly enhancing the robustness of steganographic text extraction. Experimental results show that the LSB-NLP hybrid framework excels in improving the extraction accuracy of steganographic text, especially in handling Chinese characters. The findings of this study not only confirm the effectiveness of combining image steganography technology and NLP large models but also propose new ideas for research and application in the field of information hiding. The successful implementation of this interdisciplinary approach demonstrates the great potential of integrating image steganography technology with natural language processing technology in solving complex information processing problems.

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.16891 [pdf, other]

Multi-Task Learning for Routing Problem with Cross-Problem Zero-Shot Generalization

Authors: Fei Liu, Xi Lin, Zhenkun Wang, Qingfu Zhang, Xialiang Tong, Mingxuan Yuan

Abstract: Vehicle routing problems (VRPs), which can be found in numerous real-world applications, have been an important research topic for several decades. Recently, the neural combinatorial optimization (NCO) approach that leverages a learning-based model to solve VRPs without manual algorithm design has gained substantial attention. However, current NCO methods typically require building one model for each routing problem, which significantly hinders their practical application for real-world industry problems with diverse attributes. In this work, we make the first attempt to tackle the crucial challenge of cross-problem generalization. In particular, we formulate VRPs as different combinations of a set of shared underlying attributes and solve them simultaneously via a single model through attribute composition. In this way, our proposed model can successfully solve VRPs with unseen attribute combinations in a zero-shot generalization manner. Extensive experiments are conducted on eleven VRP variants, benchmark datasets, and industry logistic scenarios. The results show that the unified model demonstrates superior performance in the eleven VRPs, reducing the average gap to around 5% from over 20% in the existing approach and achieving a significant performance boost on benchmark datasets as well as a real-world logistics application. The source code is included in https://github.com/FeiLiu36/MTNCO.

Submitted 12 April, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

arXiv:2402.14232 [pdf, other]

The quark flavor-violating ALPs in light of B mesons and hadron colliders

Authors: Tong Li, Zhuoni Qian, Michael A. Schmidt, Man Yuan

Abstract: Axion-like particles (ALPs) may induce flavor-changing neutral currents (FCNCs) when their Peccei-Quinn charges are not generation universal. The search for flavor-violating ALP couplings with a bottom quark has so far focused on FCNC processes of $B$ mesons at low energies. The recent measurements of $B\to K +X$ rare decays place stringent bounds on the quark flavor violations of a light ALP in different decay modes. In this work we propose a novel direct search for bottom flavor-violating interaction of a heavy ALP at the LHC and its upgrades, namely QCD production of an ALP associated with one $b$ jet and one light jet $p~p\to b~j~a$. We consider the decay of the ALP to photons, muons and invisible ALP decays. The Boosted Decision Tree (BDT) algorithm is used to analyze the events and we train the BDT classifier by feeding in the kinematic observables of signal and backgrounds. Finally, we show the complementarity between the search prospects of hadron colliders and the low-energy $B$ meson constraints from $B$ meson mixing and $B$ meson decays to a light ALP.

Submitted 26 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: 46 pages, 20 figures, 11 tables. version accepted for publication in JHEP

Report number: CPPC-2024-03

arXiv:2402.12143 [pdf, other]

Joint mode switching and resource allocation in wireless-powered RIS-aided multiuser communication systems

Authors: Mingang Yuan, Wenzhe Zhang, Gaofei Huang

Abstract: This paper investigates a wireless-powered hybrid reflecting intelligent surface (hybrid RIS)-assisted multiple access system, where the RIS can harvest energy from the radio frequency (RF) signal transmitted by an energy station (ES), and each reflecting element can flexibly switch between active mode, passive mode, and idle mode. The objective is to minimize the maximum energy consumption of the users by jointly optimizing the operating modes of each reflecting element, the amplification factor of active elements, the transmit power, and transmission time allocation, subject to quality-of-service (QoS) of each user and the available energy constraint of RIS. In the formulated optimization problem, the operating modes of each reflecting element are highly coupled with the amplification coefficient of the active reflecting elements, making it a challenging mixed-integer programming problem. To solve this problem, a hierarchical optimization method based on deep reinforcement learning is proposed, where the operating modes of each reflecting element and the amplification coefficient of active elements are obtained by solving the outer sub-problem using proximal policy optimization (PPO), and the transmit power and transmission time allocation are obtained by solving the inner sub-problem using convex optimization methods. Simulation results show that compared to the baseline scheme, the proposed scheme can reduce user energy consumption by 70%.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.11903 [pdf, other]

DiLA: Enhancing LLM Tool Learning with Differential Logic Layer

Authors: Yu Zhang, Hui-Ling Zhen, Zehua Pei, Yingzhao Lian, Lihao Yin, Mingxuan Yuan, Bei Yu

Abstract: Considering the challenges faced by large language models (LLMs) in logical reasoning and planning, prior efforts have sought to augment LLMs with access to external solvers. While progress has been made on simple reasoning problems, solving classical constraint satisfaction problems, such as the Boolean Satisfiability Problem (SAT) and Graph Coloring Problem (GCP), remains difficult for off-the-shelf solvers due to their intricate expressions and exponential search spaces. In this paper, we propose a novel differential logic layer-aided language modeling (DiLA) approach, where logical constraints are integrated into the forward and backward passes of a network layer, to provide another option for LLM tool learning. In DiLA, LLM aims to transform the language description to logic constraints and identify initial solutions of the highest quality, while the differential logic layer focuses on iteratively refining the LLM-prompted solution. Leveraging the logic layer as a bridge, DiLA enhances the logical reasoning ability of LLMs on a range of reasoning problems encoded by Boolean variables, guaranteeing the efficiency and correctness of the solution process. We evaluate the performance of DiLA on two classic reasoning problems and empirically demonstrate its consistent outperformance against existing prompt-based and solver-aided approaches.

Submitted 18 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: arXiv admin note: text overlap with arXiv:2305.12295 by other authors

arXiv:2402.08683 [pdf]

Order picking efficiency: A scattered storage and clustered allocation strategy in automated drug dispensing systems

Authors: Mengge Yuan, Ning Zhao, Kan Wu, Lulu Cheng

Abstract: In the smart hospital, optimizing prescription order fulfilment processes in outpatient pharmacies is crucial. A promising device, automated drug dispensing systems (ADDSs), has emerged to streamline these processes. These systems involve human order pickers who are assisted by ADDSs. The ADDS's robotic arm transports bins from storage locations to the input/output (I/O) points, while the pharmacist sorts the requested drugs from the bins at the I/O points. This paper focuses on coordinating the ADDS and the pharmacists to optimize the order-picking strategy. Another critical aspect of order-picking systems is the storage location assignment problem (SLAP), which determines the allocation of drugs to storage locations. In this study, we consider the ADDS as a smart warehouse and propose a two-stage scattered storage and clustered allocation (SSCA) strategy to optimize the SLAP for ADDSs. The first stage primarily adopts a scattered storage approach, and we develop a mathematical programming model to group drugs accordingly. In the second stage, we introduce a sequential alternating (SA) heuristic algorithm that takes into account the drug demand frequency and the correlation between drugs to cluster and locate them effectively. To evaluate the proposed SSCA strategy, we develop a double objective integer programming model for the order-picking problem in ADDSs to minimize the number of machines visited in prescription orders while maintaining the shortest average picking time of orders.
The numerical results demonstrate that the proposed strategy can optimize the SLAP in ADDSs and significantly improve the order-picking efficiency of ADDSs in a human-robot cooperation environment.

Submitted 18 December, 2023; originally announced February 2024.

arXiv:2402.07049 [pdf]

A Factor Graph Model of Trust for a Collaborative Multi-Agent System

Authors: Behzad Akbari, Mingfeng Yuan, Hao Wang, Haibin Zhu, **jun Shan

Abstract: In the field of Multi-Agent Systems (MAS), known for their openness, dynamism, and cooperative nature, the ability to trust the resources and services of other agents is crucial. Trust, in this setting, is the reliance and confidence an agent has in the information, behaviors, intentions, truthfulness, and capabilities of others within the system. Our paper introduces a new graphical approach that utilizes factor graphs to represent the interdependent behaviors and trustworthiness among agents. This includes modeling the behavior of robots as a trajectory of actions using a Gaussian process factor graph, which accounts for smoothness, obstacle avoidance, and trust-related factors. Our method for evaluating trust is decentralized and considers key interdependent sub-factors such as proximity safety, consistency, and cooperation. The overall system comprises a network of factor graphs that interact through trust-related factors and employs a Bayesian inference method to dynamically assess trust-based decisions with informed consent. The effectiveness of this method is validated via simulations and empirical tests with autonomous robots navigating unsignalized intersections.

Submitted 10 February, 2024; originally announced February 2024.

arXiv:2402.05789 [pdf, ps, other]

High Dimensional Factor Analysis with Weak Factors

Authors: Jungjun Choi, Ming Yuan

Abstract: This paper studies the principal components (PC) estimator for high dimensional approximate factor models with weak factors in that the factor loading ($\boldsymbol{\Lambda}^0$) scales sublinearly in the number $N$ of cross-section units, i.e., $\boldsymbol{\Lambda}^{0\top} \boldsymbol{\Lambda}^0 / N^{\alpha}$ is positive definite in the limit for some $\alpha \in (0,1)$. While the consistency and asymptotic normality of these estimates are by now well known when the factors are strong, i.e., $\alpha=1$, the statistical properties for weak factors remain less explored. Here, we show that the PC estimator maintains consistency and asymptotic normality for any $\alpha \in (0,1)$, provided suitable conditions regarding the dependence structure in the noise are met. This complements an earlier result by Onatski (2012) that the PC estimator is inconsistent when $\alpha=0$, and the more recent work by Bai and Ng (2023) who established the asymptotic normality of the PC estimator when $\alpha \in (1/2,1)$. Our proof strategy integrates the traditional eigendecomposition-based approach for factor models with leave-one-out analysis similar in spirit to those used in matrix completion and other settings. This combination allows us to deal with factors weaker than the former and at the same time relax the incoherence and independence assumptions often associated with the latter.

Submitted 8 February, 2024; originally announced February 2024.

arXiv:2402.03375 [pdf, other]

BetterV: Controlled Verilog Generation with Discriminative Guidance

Authors: Zehua Pei, Hui-Ling Zhen, Mingxuan Yuan, Yu Huang, Bei Yu

Abstract: Due to the growing complexity of modern Integrated Circuits (ICs), there is a need for automated circuit design methods. Recent years have seen rising research in hardware design language generation to facilitate the design process. In this work, we propose a Verilog generation framework, BetterV, which fine-tunes the large language models (LLMs) on processed domain-specific datasets and incorporates generative discriminators for guidance on particular design demands. The Verilog modules are collected, filtered and processed from the internet to form a clean and abundant dataset. Instruct-tuning methods are specially designed to fine-tune the LLMs to understand the knowledge about Verilog. Furthermore, data are augmented to enrich the training set and also used to train a generative discriminator on a particular downstream task, which provides guidance for the LLMs to optimize the Verilog implementation. BetterV has the ability to generate syntactically and functionally correct Verilog, which can outperform GPT-4 on the VerilogEval benchmark. With the help of a task-specific generative discriminator, BetterV can achieve remarkable improvement on various electronic design automation (EDA) downstream tasks, including the netlist node reduction for synthesis and verification runtime reduction with Boolean Satisfiability (SAT) solving.

Submitted 2 May, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

Comments: Accepted by ICML 2024

arXiv:2402.01296 [pdf, other]

Bi-CryptoNets: Leveraging Different-Level Privacy for Encrypted Inference

Authors: Man-Jie Yuan, Zheng Zou, Wei Gao

Abstract: Privacy-preserving neural networks have attracted increasing attention in recent years, and various algorithms have been developed to keep the balance between accuracy, computational complexity and information security from the cryptographic view. This work takes a different view from the input data and structure of neural networks. We decompose the input data (e.g., some images) into sensitive and insensitive segments according to importance and privacy. The sensitive segment includes some important and private information such as human faces and we take strong homomorphic encryption to keep security, whereas the insensitive one contains some background and we add perturbations. We propose the bi-CryptoNets, i.e., plaintext and ciphertext branches, to deal with two segments, respectively, and ciphertext branch could utilize the information from plaintext branch by unidirectional connections. We adopt knowledge distillation for our bi-CryptoNets by transferring representations from a well-trained teacher neural network. Empirical studies show the effectiveness and decrease of inference latency for our bi-CryptoNets.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.12224 [pdf, other]

LLM4EDA: Emerging Progress in Large Language Models for Electronic Design Automation

Authors: Ruizhe Zhong, Xingbo Du, Shixiong Kai, Zhentao Tang, Siyuan Xu, Hui-Ling Zhen, Jianye Hao, Qiang Xu, Mingxuan Yuan, Junchi Yan

Abstract: Driven by Moore's Law, the complexity and scale of modern chip design are increasing rapidly. Electronic Design Automation (EDA) has been widely applied to address the challenges encountered in the full chip design process. However, the evolution of very large-scale integrated circuits has made chip design time-consuming and resource-intensive, requiring substantial prior expert knowledge. Additionally, intermediate human control activities are crucial for seeking optimal solutions. In the system design stage, circuits are usually represented with Hardware Description Language (HDL) as a textual format. Recently, Large Language Models (LLMs) have demonstrated their capability in context understanding, logic reasoning and answer generation. Since circuits can be represented with HDL in a textual format, it is reasonable to question whether LLMs can be leveraged in the EDA field to achieve fully automated chip design and generate circuits with improved power, performance, and area (PPA). In this paper, we present a systematic study on the application of LLMs in the EDA field, categorizing it into the following cases: 1) assistant chatbot, 2) HDL and script generation, and 3) HDL verification and analysis. Additionally, we highlight the future research direction, focusing on applying LLMs in logic synthesis, physical design, multi-modal feature extraction and alignment of circuits. We collect relevant papers up-to-date in this field via the following link: https://github.com/Thinklab-SJTU/Awesome-LLM4EDA.

Submitted 28 December, 2023; originally announced January 2024.

Comments: 15 pages, 4 figures

arXiv:2401.11491 [pdf]

BA-LINS: A Frame-to-Frame Bundle Adjustment for LiDAR-Inertial Navigation

Authors: Hailiang Tang, Tisheng Zhang, Liqiang Wang, Man Yuan, Xiaoji Niu

Abstract: Bundle Adjustment (BA) has been proven to improve the accuracy of the LiDAR mapping. However, the BA method has not yet been properly employed in a dead-reckoning navigation system. In this paper, we present a frame-to-frame (F2F) BA for LiDAR-inertial navigation, named BA-LINS. Based on the direct F2F point-cloud association, the same-plane points are associated among the LiDAR keyframes. Hence, the F2F plane-point BA measurement can be constructed using the same-plane points. The LiDAR BA and the inertial measurement unit (IMU)-preintegration measurements are tightly integrated under the framework of factor graph optimization. An effective adaptive covariance estimation algorithm for LiDAR BA measurements is proposed to further improve the accuracy. We conduct exhaustive real-world experiments on public and private datasets to examine the proposed BA-LINS. The results demonstrate that BA-LINS yields superior accuracy to state-of-the-art methods. Compared to the baseline system FF-LINS, the absolute translation accuracy and state-estimation efficiency of BA-LINS are improved by 29.5% and 28.7% on the private dataset, respectively. Besides, the ablation experiment results exhibit that the proposed adaptive covariance estimation algorithm can notably improve the accuracy and robustness of BA-LINS.

Submitted 10 February, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

Comments: 14 pages, 14 figures

arXiv:2401.10731 [pdf, other]

Removal and Selection: Improving RGB-Infrared Object Detection via Coarse-to-Fine Fusion

Authors: Tianyi Zhao, Maoxun Yuan, Feng Jiang, Nan Wang, Xingxing Wei

Abstract: Object detection in visible (RGB) and infrared (IR) images has been widely applied in recent years. Leveraging the complementary characteristics of RGB and IR images, the object detector provides reliable and robust object localization from day to night. Most existing fusion strategies directly input RGB and IR images into deep neural networks, leading to inferior detection performance. However, because the RGB and IR features have modality-specific noise, these strategies will degrade the fused features as that noise propagates. Inspired by the mechanism of the human brain processing multimodal information, in this paper, we introduce a new coarse-to-fine perspective to purify and fuse two modality features. Specifically, following this perspective, we design a Redundant Spectrum Removal module to coarsely remove interfering information within each modality and a Dynamic Feature Selection module to finely select the desired features for feature fusion. To verify the effectiveness of the coarse-to-fine fusion strategy, we construct a new object detector called the Removal and Selection Detector (RSDet). Extensive experiments on three RGB-IR object detection datasets verify the superior performance of our method.

Submitted 7 May, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

Comments: 11 pages, 11 figures

arXiv:2401.05960 [pdf, other]

Machine Learning Insides OptVerse AI Solver: Design Principles and Applications

Authors: Xijun Li, Fangzhou Zhu, Hui-Ling Zhen, Weilin Luo, Meng Lu, Yimin Huang, Zhenan Fan, Zirui Zhou, Yufei Kuang, Zhihai Wang, Zijie Geng, Yang Li, Haoyang Liu, Zhiwu An, Muming Yang, Jianshu Li, Jie Wang, Junchi Yan, Defeng Sun, Tao Zhong, Yong Zhang, Jia Zeng, Mingxuan Yuan, Jianye Hao, Jun Yao, et al. (1 additional author not shown)

Abstract: In an era of digital ubiquity, efficient resource management and decision-making are paramount across numerous industries. To this end, we present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI Solver, which aims to mitigate the scarcity of real-world mathematical programming instances, and to surpass the capabilities of traditional optimization techniques. We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problems. Furthermore, we introduce a training framework leveraging augmentation policies to maintain solvers' utility in dynamic environments. Besides the data generation and augmentation, our proposed approaches also include novel ML-driven policies for personalized solver strategies, with an emphasis on applications like graph convolutional networks for initial basis selection and reinforcement learning for advanced presolving and cut selection. Additionally, we detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance. Compared with traditional solvers such as Cplex and SCIP, our ML-augmented OptVerse AI Solver demonstrates superior speed and precision across both established benchmarks and real-world scenarios, reinforcing the practical imperative and effectiveness of machine learning techniques in mathematical programming solvers.

Submitted 17 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

Showing 1–50 of 328 results for author: Yuan, M