-
Informed along the road: roadway capacity driven graph convolution network for network-wide traffic prediction
Authors:
Zilin Bian,
**gqin Gao,
Kaan Ozbay,
Fan Zuo,
Dachuan Zuo,
Zhenning Li
Abstract:
While deep learning has shown success in predicting traffic states, most methods treat it as a general prediction task without considering transportation aspects. Recently, graph neural networks have proven effective for this task, but few incorporate external factors that impact roadway capacity and traffic flow. This study introduces the Roadway Capacity Driven Graph Convolution Network (RCDGCN) model, which incorporates static and dynamic roadway capacity attributes in spatio-temporal settings to predict network-wide traffic states. The model was evaluated on two real-world datasets with different transportation factors: the ICM-495 highway network and an urban network in Manhattan, New York City. Results show RCDGCN outperformed baseline methods in forecasting accuracy. Analyses, including ablation experiments, weight analysis, and case studies, investigated the effect of capacity-related factors. The study demonstrates the potential of using RCDGCN for transportation system management.
Submitted 18 June, 2024;
originally announced June 2024.
-
RL-MUL: Multiplier Design Optimization with Deep Reinforcement Learning
Authors:
Dongsheng Zuo,
Jiadong Zhu,
Yikang Ouyang,
Yuzhe Ma
Abstract:
Multiplication is a fundamental operation in many applications, and multipliers are widely adopted in various circuits. However, optimizing multipliers is challenging and non-trivial due to the huge design space. In this paper, we propose RL-MUL, a multiplier design optimization framework based on reinforcement learning. Specifically, we utilize matrix and tensor representations for the compressor tree of a multiplier, based on which the convolutional neural networks can be seamlessly incorporated as the agent network. The agent can learn to optimize the multiplier structure based on a Pareto-driven reward which is customized to accommodate the trade-off between area and delay. Additionally, the capability of RL-MUL is extended to optimize the fused multiply-accumulator (MAC) designs. Experiments are conducted on different bit widths of multipliers. The results demonstrate that the multipliers produced by RL-MUL can dominate all baseline designs in terms of area and delay. The performance gain of RL-MUL is further validated by comparing the area and delay of processing element arrays using multipliers from RL-MUL and baseline approaches.
Submitted 31 March, 2024;
originally announced April 2024.
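The RL-MUL abstract above says the produced multipliers "dominate all baseline designs in terms of area and delay". As a minimal sketch of what dominance means for two minimized objectives, the following checks Pareto dominance over `(area, delay)` pairs. The function names and the numeric values are illustrative assumptions, not taken from the paper, and this is not the paper's reward function.

```python
from typing import List, Tuple

# A design is summarized as (area, delay); both objectives are minimized.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """Return True if `a` is no worse than `b` in both objectives
    and strictly better in at least one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Made-up (area, delay) values, for illustration only.
candidates = [(100.0, 2.0), (90.0, 2.5), (110.0, 1.8), (95.0, 2.6), (100.0, 2.2)]
front = pareto_front(candidates)
print(front)
```

Here `(95.0, 2.6)` is dropped because `(90.0, 2.5)` is better in both objectives, and `(100.0, 2.2)` is dropped because `(100.0, 2.0)` matches its area with a lower delay.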
-
E-Syn: E-Graph Rewriting with Technology-Aware Cost Functions for Logic Synthesis
Authors:
Chen Chen,
Guangyu Hu,
Dongsheng Zuo,
Cunxi Yu,
Yuzhe Ma,
Hongce Zhang
Abstract:
Logic synthesis plays a crucial role in the digital design flow. It has a decisive influence on the final Quality of Results (QoR) of the circuit implementations. However, existing multi-level logic optimization algorithms often employ greedy approaches with a series of local optimization steps. Each step breaks the circuit into small pieces (e.g., k-feasible cuts) and applies incremental changes to individual pieces separately. These local optimization steps could limit the exploration space and may miss opportunities for significant improvements. To address the limitation, this paper proposes using e-graphs in logic synthesis. The new workflow, named E-Syn, makes use of the well-established e-graph infrastructure to efficiently perform logic rewriting. It explores a diverse set of equivalent Boolean representations while allowing technology-aware cost functions to better support delay-oriented and area-oriented logic synthesis. Experiments over a wide range of benchmark designs show our proposed logic optimization approach reaches a wider design space compared to the commonly used AIG-based logic synthesis flow. It achieves on average 15.29% delay saving in delay-oriented synthesis and 6.42% area saving for area-oriented synthesis.
Submitted 25 March, 2024; v1 submitted 21 March, 2024;
originally announced March 2024.
-
Efficient Title Reranker for Fast and Improved Knowledge-Intense NLP
Authors:
Ziyi Chen,
Jize Jiang,
Daqian Zuo,
Heyi Tao,
Jun Yang,
Yuxiang Wei
Abstract:
In recent RAG approaches, rerankers play a pivotal role in refining retrieval accuracy through their ability to reveal logical relations between each query and text pair. However, existing rerankers must repeatedly encode the query and a large number of long retrieved texts. This results in high computational costs and limits the number of retrieved texts, hindering accuracy. As a remedy for this problem, we introduce the Efficient Title Reranker via Broadcasting Query Encoder, a novel technique for title reranking that achieves a 20x-40x speedup over the vanilla passage reranker. Furthermore, we introduce the Sigmoid Trick, a novel loss function customized for title reranking. Combining both techniques, we empirically validate their effectiveness, achieving state-of-the-art results on all four datasets we experimented with from the KILT knowledge benchmark.
Submitted 25 February, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Natural-language-driven Simulation Benchmark and Copilot for Efficient Production of Object Interactions in Virtual Road Scenes
Authors:
Kairui Yang,
Zihao Guo,
Gengjie Lin,
Haotian Dong,
Die Zuo,
Jibin Peng,
Zhao Huang,
Zhecheng Xu,
Fupeng Li,
Ziyun Bai,
Di Lin
Abstract:
We advocate natural-language-driven (NLD) simulation to efficiently produce interactions between multiple objects in virtual road scenes, for teaching and testing autonomous driving systems that must act quickly to avoid collisions with obstacles that move unpredictably. NLD simulation allows a brief natural-language description to control the object interactions, significantly reducing the human effort needed to create a large amount of interaction data. To facilitate research on NLD simulation, we collect the Language-to-Interaction (L2I) benchmark dataset with 120,000 natural-language descriptions of object interactions in 6 common types of road topologies. Each description is associated with programming code, which the graphics renderer can use to visually reconstruct the object interactions in the virtual scenes. As a methodological contribution, we design SimCopilot to translate the interaction descriptions into renderable code. We use the L2I dataset to evaluate SimCopilot's abilities to control object motions, generate complex interactions, and generalize interactions across road topologies. The L2I dataset and the evaluation results motivate further research on NLD simulation.
Submitted 15 December, 2023; v1 submitted 6 December, 2023;
originally announced December 2023.
-
Adaptive Frequency Green Light Optimal Speed Advisory based on Hybrid Actor-Critic Reinforcement Learning
Authors:
Ming Xu,
Dongyu Zuo
Abstract:
The Green Light Optimal Speed Advisory (GLOSA) system suggests speeds to vehicles to help them pass through intersections during green intervals, thus reducing traffic congestion and fuel consumption by minimizing the number of stops and idle times at intersections. However, previous research has focused on optimizing the GLOSA algorithm while neglecting the frequency of speed advisories issued by the GLOSA system. Specifically, some studies provide a speed advisory profile at each decision step, resulting in redundant advisories, while others calculate the optimal speed for the vehicle only once, which cannot adapt to dynamic traffic. In this paper, we propose an Adaptive Frequency GLOSA (AF-GLOSA) model based on the Hybrid Proximal Policy Optimization (H-PPO) method, which employs an actor-critic architecture with a hybrid actor network. The hybrid actor network consists of a discrete actor that outputs the control gap and a continuous actor that outputs acceleration profiles. Additionally, we design a novel reward function that considers both travel efficiency and fuel consumption. The AF-GLOSA model is evaluated against traditional GLOSA and learning-based GLOSA methods at a three-lane intersection with a traffic signal in SUMO. The results demonstrate that the AF-GLOSA model performs best in reducing average stop times, fuel consumption, and CO2 emissions.
Submitted 12 June, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
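For readers unfamiliar with the baseline idea behind GLOSA mentioned in the abstract above, the following is a minimal sketch of a one-shot, constant-speed advisory: given the distance to the stop line and the upcoming green window, pick a speed that arrives while the light is green. This is a simplified textbook-style rule, not the AF-GLOSA model or its H-PPO policy; the function name, parameters, and numbers are assumptions made for illustration.

```python
from typing import Optional


def advisory_speed(distance_m: float, green_start_s: float, green_end_s: float,
                   v_min: float, v_max: float) -> Optional[float]:
    """Return a constant speed (m/s) that reaches the stop line during the
    green window [green_start_s, green_end_s), or None if no speed within
    [v_min, v_max] can do so. Times are measured from now, in seconds."""
    if distance_m <= 0 or v_max <= 0 or green_end_s <= 0:
        return None
    # Earliest possible arrival when driving at the maximum allowed speed.
    earliest_arrival = distance_m / v_max
    # Aim for the earliest arrival that still falls inside the green window.
    target_arrival = max(earliest_arrival, green_start_s)
    if target_arrival >= green_end_s:
        return None  # The window closes before the vehicle can arrive.
    speed = distance_m / target_arrival
    return speed if speed >= v_min else None


# Green starts in 20 s and ends in 40 s; the stop line is 200 m ahead.
print(advisory_speed(200.0, 20.0, 40.0, v_min=5.0, v_max=15.0))
```

A one-shot rule like this is the kind of "calculate once" advisory that the abstract says cannot adapt to dynamic traffic; recomputing it at every step is the opposite extreme the abstract describes as redundant.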
-
Efficient Virtual View Selection for 3D Hand Pose Estimation
Authors:
Jian Cheng,
Yanguang Wan,
Dexin Zuo,
Cuixia Ma,
Jian Gu,
** Tan,
Hongan Wang,
Xiaoming Deng,
Yinda Zhang
Abstract:
3D hand pose estimation from a single depth image is a fundamental problem in computer vision and has wide applications. However, existing methods still cannot achieve satisfactory hand pose estimation results due to view variation and occlusion of the human hand. In this paper, we propose a new virtual view selection and fusion module for 3D hand pose estimation from a single depth image. We propose to automatically select multiple virtual viewpoints for pose estimation and fuse the results of all of them, and find empirically that this delivers accurate and robust pose estimation. In order to select the most effective virtual views for pose fusion, we evaluate the virtual views based on their confidence using a light-weight network obtained via network distillation. Experiments on three main benchmark datasets, NYU, ICVL and Hands2019, demonstrate that our method outperforms the state of the art on NYU and ICVL and achieves very competitive performance on Hands2019-Task1, and that our proposed virtual view selection and fusion module is effective for 3D hand pose estimation.
Submitted 29 March, 2022;
originally announced March 2022.
-
A thermodynamic framework for unified continuum models for the healing of damaged soft biological tissue
Authors:
Di Zuo,
Yiqian He,
Stéphane Avril,
Haitian Yang,
Klaus Hackl
Abstract:
When they are damaged or injured, soft biological tissues are able to self-repair and heal. Mechanics is critical during the healing process, as the damaged extracellular matrix (ECM) tends to be replaced with a new undamaged ECM supporting homeostatic stresses. Computational modeling has been commonly used to simulate the healing process. However, there is a pressing need for a unified thermodynamic theory of healing. From the viewpoint of continuum damage mechanics, some key parameters related to healing processes, for instance, the volume fraction of newly grown soft tissue and the growth deformation, can be regarded as internal variables and have related evolution equations. This paper aims to establish a unified framework, inspired by thermodynamics, for continuum damage models of the healing of soft biological tissues. The significant advantage of the proposed model is that no ad hoc equations are required for describing the healing process. Therefore, this new model is more concise and offers a universal approach to simulating the healing process. Three numerical examples are provided to demonstrate the effectiveness of the proposed model, which is in good agreement with existing works, including an application to balloon angioplasty in an arteriosclerotic artery with a fiber cap.
Submitted 29 March, 2021;
originally announced March 2021.
-
Reducing the Upfront Cost of Private Clouds with Clairvoyant Virtual Machine Placement
Authors:
Yan Zhao,
Hongwei Liu,
Yan Wang,
Zhan Zhang,
Decheng Zuo
Abstract:
Although public clouds still occupy the largest portion of the total cloud infrastructure, private clouds are attracting increasing interest from both industry and academia because of their better security and privacy control. According to existing studies, the high upfront cost is among the most critical challenges associated with private clouds. To reduce cost and improve performance, virtual machine placement (VMP) methods have been extensively investigated; however, few of these methods have focused on private clouds. This paper proposes a heterogeneous and multidimensional clairvoyant dynamic bin packing (CDBP) model, in which the scheduler can conduct more efficient VMP processes using additional information on the arrival time and duration of virtual machines to reduce the datacenter scale and thereby decrease the upfront cost of private clouds. In addition, a novel branch-and-bound algorithm with a divide-and-conquer strategy (DCBB) is proposed to effectively and efficiently handle the derived problem. One state-of-the-art and several classic VMP methods are also modified to fit the proposed model so that their performance can be observed and compared with that of our proposed algorithm. Extensive experiments are conducted on both real-world and synthetic workloads to evaluate the accuracy and efficiency of the algorithms. The experimental results demonstrate that DCBB delivers near-optimal solutions with a convergence rate that is much faster than those of the other search-based algorithms evaluated. In particular, DCBB yields the optimal solution for a real-world workload with an execution time that is an order of magnitude shorter than that required by the original branch-and-bound (BB) algorithm.
Submitted 21 December, 2018; v1 submitted 9 February, 2018;
originally announced February 2018.
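The clairvoyant setting described in the abstract above, where arrival time and duration are known when a virtual machine is placed, can be illustrated with a small first-fit sketch. This is not the paper's DCBB algorithm or its heterogeneous, multidimensional model: it assumes a single resource dimension, identical servers, and a hypothetical `VM`/`Server` data layout, and it assumes every VM's demand fits on an empty server.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VM:
    name: str
    arrival: int   # time the VM starts
    duration: int  # known in advance in the clairvoyant setting
    demand: int    # single resource dimension (e.g. CPU cores)

    @property
    def departure(self) -> int:
        return self.arrival + self.duration


@dataclass
class Server:
    capacity: int
    vms: List[VM] = field(default_factory=list)

    def fits(self, vm: VM) -> bool:
        """Check capacity over the VM's whole lifetime [arrival, departure).
        Load can only increase at arrival times, so those are the only
        instants that need checking."""
        points = [vm.arrival] + [v.arrival for v in self.vms
                                 if vm.arrival <= v.arrival < vm.departure]
        for t in points:
            load = sum(v.demand for v in self.vms if v.arrival <= t < v.departure)
            if load + vm.demand > self.capacity:
                return False
        return True


def first_fit(vms: List[VM], capacity: int) -> List[Server]:
    """Place VMs in arrival order on the first server with room for their
    entire lifetime, opening a new server only when none fits."""
    servers: List[Server] = []
    for vm in sorted(vms, key=lambda v: v.arrival):
        for server in servers:
            if server.fits(vm):
                server.vms.append(vm)
                break
        else:
            servers.append(Server(capacity, [vm]))
    return servers


# Illustrative workload; the values are made up.
workload = [VM("A", 0, 4, 6), VM("B", 1, 2, 6), VM("C", 4, 3, 6), VM("D", 3, 2, 4)]
servers = first_fit(workload, capacity=10)
print([[v.name for v in s.vms] for s in servers])
```

In this workload the total demand is 22 units, but because lifetimes are known, `C` can reuse the capacity that `A` frees at time 4, so two servers of capacity 10 suffice. The number of servers opened is the quantity that relates to the upfront cost discussed in the abstract.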
-
A deep learning approach for predicting the quality of online health expert question-answering services
Authors:
Ze Hu,
Zhan Zhang,
Qing Chen,
Haiqin Yang,
Decheng Zuo
Abstract:
Currently, a growing number of health consumers are asking health-related questions online, at any time and from anywhere, which effectively lowers the cost of health care. The most common approach is using online health expert question-answering (HQA) services, as health consumers are more willing to trust answers from professional physicians. However, these answers can be of varying quality depending on circumstance. In addition, as the available HQA services grow, how to predict the answer quality of HQA services via machine learning becomes increasingly important and challenging. In an HQA service, answers are normally short texts, which are severely affected by the data sparsity problem. Furthermore, HQA services lack community features such as best answer and user votes. Therefore, the wisdom of the crowd is not available to rate answer quality. To address these problems, in this paper, the prediction of HQA answer quality is defined as a classification task. First, based on the characteristics of HQA services and feedback from medical experts, a standard for HQA service answer quality evaluation is defined. Next, based on the characteristics of HQA services, several novel non-textual features are proposed, including surface linguistic features and social features. Finally, a deep belief network (DBN)-based HQA answer quality prediction framework is proposed to predict the quality of answers by learning the high-level hidden semantic representation from the physicians' answers. Our results show that the proposed framework overcomes the problem of overly sparse textual features in short text answers and effectively identifies high-quality answers.
Submitted 21 December, 2016;
originally announced December 2016.
-
Task & Resource Self-adaptive Embedded Real-time Operating System Microkernel for Wireless Sensor Nodes
Authors:
Kexing Xing,
Decheng Zuo,
Haiying Zhou,
Hou Kun-Mean
Abstract:
Wireless Sensor Networks (WSNs) are used in many application fields, such as military, healthcare, and environmental surveillance. WSN operating systems based on the event-driven model do not support real-time and multi-task applications, while those based on the thread-driven model consume much energy because of frequent context switches. Due to the high-density and large-scale deployment of sensor nodes, it is very difficult to collect sensor nodes to update their software. Furthermore, the sensor nodes are vulnerable to security attacks because of their broadcast communication and unattended operation. This paper presents a task and resource self-adaptive embedded real-time microkernel, which adopts a hybrid programming model and offers a two-level scheduling strategy to support real-time multi-task applications. A communication scheme, which takes the "tuple" space and "IN/OUT" primitives from "LINDA", is proposed to support collaborative and distributed tasks. In addition, this kernel implements a run-time over-the-air updating mechanism and provides a security policy to defend against attacks and ensure the reliable operation of nodes. A performance evaluation is presented, and the experimental results show that this kernel is task-oriented and resource-aware and can be used for event-driven and real-time multi-task applications.
Submitted 19 March, 2014;
originally announced March 2014.