-
WILD-SCAV: Benchmarking FPS Gaming AI on Unity3D-based Environments
Authors:
Xi Chen,
Tianyu Shi,
Qingpeng Zhao,
Yuchen Sun,
Yunfei Gao,
Xiangjun Wang
Abstract:
Recent advances in deep reinforcement learning (RL) have demonstrated complex decision-making capabilities in simulation environments such as Arcade Learning Environment, MuJoCo, and ViZDoom. However, they are hardly extensible to more complicated problems, mainly due to the lack of complexity and variations in the environments they are trained and tested on. Furthermore, they are not extensible to an open-world environment to facilitate long-term exploration research. To learn realistic task-solving capabilities, we need to develop an environment with greater diversity and complexity. We developed WILD-SCAV, a powerful and extensible environment based on a 3D open-world FPS (First-Person Shooter) game to bridge the gap. It provides realistic 3D environments of variable complexity, various tasks, and multiple modes of interaction, where agents can learn to perceive 3D environments, navigate and plan, compete and cooperate in a human-like manner. WILD-SCAV also supports different complexities, such as configurable maps with different terrains, building structures and distributions, and multi-agent settings with cooperative and competitive tasks. The experimental results on configurable complexity, multi-tasking, and multi-agent scenarios demonstrate the effectiveness of WILD-SCAV in benchmarking various RL algorithms, as well as its potential to give rise to intelligent agents with generalized task-solving abilities. Our open-source code is available at https://github.com/inspirai/wilderness-scavenger.
Submitted 14 October, 2022;
originally announced October 2022.
-
Efficient Connected and Automated Driving System with Multi-agent Graph Reinforcement Learning
Authors:
Tianyu Shi,
Jiawei Wang,
Yuankai Wu,
Luis Miranda-Moreno,
Lijun Sun
Abstract:
Connected and automated vehicles (CAVs) have attracted increasing attention recently. Their fast actuation time gives them the potential to improve the efficiency and safety of the whole transportation system. Due to technical challenges, only a proportion of vehicles will be equipped with automation, while other vehicles will remain without it. Instead of learning a reliable behavior for the ego automated vehicle, we focus on how to improve the outcomes of the whole transportation system by allowing each automated vehicle to learn to cooperate with the others and regulate human-driven traffic flow. One state-of-the-art approach uses reinforcement learning to learn an intelligent decision-making policy. However, a direct reinforcement learning framework cannot improve the performance of the whole system. In this article, we demonstrate that considering the problem in a multi-agent setting with a shared policy can help achieve better system performance than a non-shared policy in a single-agent setting. Furthermore, we find that applying an attention mechanism to interaction features can capture the interplay between agents and thereby boost cooperation. To the best of our knowledge, while previous automated driving studies mainly focus on enhancing individual driving performance, this work serves as a starting point for research on system-level multi-agent cooperation performance using graph information sharing. We conduct extensive experiments in car-following and unsignalized intersection settings. The results demonstrate that CAVs controlled by our method achieve the best performance against several state-of-the-art baselines.
Submitted 22 October, 2021; v1 submitted 6 July, 2020;
originally announced July 2020.
-
Corpus-level and Concept-based Explanations for Interpretable Document Classification
Authors:
Tian Shi,
Xuchao Zhang,
** Wang,
Chandan K. Reddy
Abstract:
Using attention weights to identify information that is important for models' decision-making is a popular approach to interpret attention-based neural networks. This is commonly realized in practice through the generation of a heat-map for every single document based on attention weights. However, this interpretation method is fragile, and contradictory examples are easy to find. In this paper, we propose a corpus-level explanation approach, which aims to capture causal relationships between keywords and model predictions via learning the importance of keywords for predicted labels across a training corpus based on attention weights. Based on this idea, we further propose a concept-based explanation method that can automatically learn higher-level concepts and their importance to model prediction tasks. Our concept-based explanation method is built upon a novel Abstraction-Aggregation Network, which can automatically cluster important keywords during an end-to-end training process. We apply these methods to the document classification task and show that they are powerful in extracting semantically meaningful keywords and concepts. Our consistency analysis results based on an attention-based Naïve Bayes classifier also demonstrate that these keywords and concepts are important for model predictions.
Submitted 30 May, 2021; v1 submitted 24 April, 2020;
originally announced April 2020.
-
Driving Decision and Control for Autonomous Lane Change based on Deep Reinforcement Learning
Authors:
Tianyu Shi,
Pin Wang,
Xuxin Cheng,
Ching-Yao Chan,
Ding Huang
Abstract:
We apply Deep Q-network (DQN) with the consideration of safety during the task for deciding whether to conduct the maneuver. Furthermore, we design two similar Deep Q learning frameworks with quadratic approximator for deciding how to select a comfortable gap and just follow the preceding vehicle. Finally, a polynomial lane change trajectory is generated and Pure Pursuit Control is implemented for path tracking. We demonstrate the effectiveness of this framework in simulation, from both the decision-making and control layers. The proposed architecture also has the potential to be extended to other autonomous driving scenarios.
Submitted 30 July, 2019; v1 submitted 23 April, 2019;
originally announced April 2019.
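The control layer mentioned in this abstract (a polynomial lane-change trajectory tracked by Pure Pursuit) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only "polynomial", so the quintic lateral profile is an assumption, as are the textbook bicycle-model form of Pure Pursuit, the function names, and the example lane width and wheelbase.

```python
import math


def lane_change_offset(s: float, lane_width: float) -> float:
    """Lateral offset of a quintic lane-change profile (assumed form).

    s is normalized progress in [0, 1]. The polynomial 10s^3 - 15s^4 + 6s^5
    starts at 0, ends at 1, and has zero first and second derivatives at
    both ends, so the maneuver starts and finishes smoothly.
    """
    s = min(max(s, 0.0), 1.0)  # clamp progress to the valid range
    return lane_width * (10 * s**3 - 15 * s**4 + 6 * s**5)


def pure_pursuit_steering(x: float, y: float, yaw: float,
                          target_x: float, target_y: float,
                          wheelbase: float) -> float:
    """Steering angle (rad) that drives a bicycle model toward a look-ahead point.

    Uses the standard Pure Pursuit relation
        delta = atan(2 * L * sin(alpha) / l_d),
    where alpha is the bearing of the target relative to the heading and
    l_d is the distance to the target.
    """
    dx, dy = target_x - x, target_y - y
    lookahead = math.hypot(dx, dy)
    if lookahead == 0.0:
        return 0.0  # already at the target point; no steering needed
    alpha = math.atan2(dy, dx) - yaw
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)


if __name__ == "__main__":
    # Hypothetical values: 3.5 m lane width, 2.5 m wheelbase.
    midpoint = lane_change_offset(0.5, 3.5)
    steer = pure_pursuit_steering(0.0, 0.0, 0.0, 10.0, midpoint, 2.5)
    print(f"offset at midpoint: {midpoint:.3f} m, steering: {steer:.4f} rad")
```

In a full pipeline the look-ahead point would be sampled from the generated trajectory at each control step; here a single point is used to keep the sketch short.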
-
Efficient Motion Planning for Automated Lane Change based on Imitation Learning and Mixed-Integer Optimization
Authors:
Chenyang Xi,
Tianyu Shi,
Yuankai Wu,
Lijun Sun
Abstract:
Intelligent motion planning is one of the core components in automated vehicles and has received extensive interest. Traditional motion planning methods suffer from several drawbacks in terms of optimality, efficiency, and generalization capability. Sampling-based methods cannot guarantee the optimality of the generated trajectories, whereas optimization-based methods cannot perform motion planning in real time and are limited by their simplified formalization. In this work, we propose a learning-based approach to handle those shortcomings. Mixed Integer Quadratic Problem (MIQP) based optimization is used to generate the optimal lane-change trajectories, which serve as the training dataset for learning-based action generation algorithms. A hierarchical supervised learning model is devised to make the fast lane-change decision. Numerous experiments have been conducted to evaluate the optimality, efficiency, and generalization capability of the proposed approach. The experimental results indicate that the proposed model outperforms several commonly used motion planning baselines.
Submitted 8 May, 2020; v1 submitted 18 April, 2019;
originally announced April 2019.
-
A Data Driven Method of Optimizing Feedforward Compensator for Autonomous Vehicle
Authors:
Tianyu Shi,
Pin Wang,
Ching-Yao Chan,
Chonghao Zou
Abstract:
A reliable controller is critical and essential for the execution of safe and smooth maneuvers of an autonomous vehicle. The controller must be robust to external disturbances, such as road surface, weather, and wind conditions. It also needs to deal with the internal parametric variations of vehicle sub-systems, including power-train efficiency, measurement errors, time delay, and so on. Moreover, as in most production vehicles, the low-control commands for the engine, brake, and steering systems are delivered through separate electronic control units. These factors introduce opacity and ineffectiveness issues in controller performance. In this paper, we design a feed-forward compensation process via a data-driven method to model and further optimize the controller performance. We apply principal component analysis to extract the most influential features. Subsequently, we adopt a time delay neural network and include the accuracy of the predicted error in a future time horizon. Utilizing the predicted error, we then design a feed-forward compensation process to improve the control performance. Finally, we demonstrate the effectiveness of the proposed feed-forward compensation process in simulation scenarios.
Submitted 30 April, 2019; v1 submitted 31 January, 2019;
originally announced January 2019.
-
Neural Abstractive Text Summarization with Sequence-to-Sequence Models
Authors:
Tian Shi,
Yaser Keneshloo,
Naren Ramakrishnan,
Chandan K. Reddy
Abstract:
In the past few years, neural abstractive text summarization with sequence-to-sequence (seq2seq) models has gained a lot of popularity. Many interesting techniques have been proposed to improve seq2seq models, making them capable of handling different challenges, such as saliency, fluency and human readability, and of generating high-quality summaries. Generally speaking, most of these techniques differ in one of these three categories: network structure, parameter inference, and decoding/generation. There are also other concerns, such as efficiency and parallelism for training a model. In this paper, we provide a comprehensive literature survey on different seq2seq models for abstractive text summarization from the viewpoint of network structures, training strategies, and summary generation algorithms. Several models were first proposed for language modeling and generation tasks, such as machine translation, and later applied to abstractive text summarization. Hence, we also provide a brief review of these models. As part of this survey, we also develop an open source library, namely, the Neural Abstractive Text Summarizer (NATS) toolkit, for abstractive text summarization. An extensive set of experiments has been conducted on the widely used CNN/Daily Mail dataset to examine the effectiveness of several different neural network components. Finally, we benchmark two models implemented in NATS on two recently released datasets, namely, Newsroom and Bytecup.
Submitted 18 September, 2020; v1 submitted 4 December, 2018;
originally announced December 2018.
-
How does climate change influence regional stability
Authors:
Tianyu Shi,
Jiayan Guo,
Xuxin Cheng,
Yu hao
Abstract:
Nowadays, different places have different regional stability, which is influenced by many factors. This paper aims to analyze the influence of climate change on regional stability. Several factors that may influence regional stability are proposed. Then Principal Component Analysis (PCA) is used to select the most relevant factors. After that, a BP neural network is established considering all the principal components to evaluate the Regional Stability (RS). Subsequently, the specific influence of climate change is analyzed, and the results show that long-term average precipitation is a main climate factor influencing the RS.
Submitted 5 August, 2018; v1 submitted 3 June, 2018;
originally announced June 2018.
-
Deep Reinforcement Learning For Sequence to Sequence Models
Authors:
Yaser Keneshloo,
Tian Shi,
Naren Ramakrishnan,
Chandan K. Reddy
Abstract:
In recent times, sequence-to-sequence (seq2seq) models have gained a lot of popularity and provide state-of-the-art performance in a wide variety of tasks such as machine translation, headline generation, text summarization, speech to text conversion, and image caption generation. The underlying framework for all these models is usually a deep neural network comprising an encoder and a decoder. Although simple encoder-decoder models produce competitive results, many researchers have proposed additional improvements over these sequence-to-sequence models, e.g., using an attention-based model over the input, pointer-generation models, and self-attention models. However, such seq2seq models suffer from two common problems: 1) exposure bias and 2) inconsistency between train/test measurement. Recently, a completely novel point of view has emerged in addressing these two problems in seq2seq models, leveraging methods from reinforcement learning (RL). In this survey, we consider seq2seq problems from the RL point of view and provide a formulation combining the power of RL methods in decision-making with sequence-to-sequence models that enable remembering long-term memories. We present some of the most recent frameworks that combine concepts from RL and deep neural networks and explain how these two areas could benefit from each other in solving complex seq2seq tasks. Our work aims to provide insights into some of the problems that inherently arise with current approaches and how we can address them with better RL models. We also provide the source code for implementing most of the RL models discussed in this paper to support the complex task of abstractive text summarization.
Submitted 15 April, 2019; v1 submitted 23 May, 2018;
originally announced May 2018.
-
Linking GloVe with word2vec
Authors:
Tianze Shi,
Zhiyuan Liu
Abstract:
The Global Vectors for word representation (GloVe), introduced by Jeffrey Pennington et al., is reported to be an efficient and effective method for learning vector representations of words. State-of-the-art performance is also provided by skip-gram with negative-sampling (SGNS) implemented in the word2vec tool. In this note, we explain the similarities between the training objectives of the two models, and show that the objective of SGNS is similar to the objective of a specialized form of GloVe, though their cost functions are defined differently.
Submitted 26 November, 2014; v1 submitted 20 November, 2014;
originally announced November 2014.
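For orientation, the two training objectives being compared can be written side by side. The forms below follow the commonly cited formulations of GloVe and SGNS rather than the notation of this note, which the abstract does not give; the symbols ($X_{ij}$ for co-occurrence counts, $f$ for the GloVe weighting function, $k$ for the number of negative samples, $P_n$ for the noise distribution) are therefore assumptions about notation.

```latex
% GloVe: weighted least-squares fit to log co-occurrence counts (minimized)
J_{\mathrm{GloVe}} = \sum_{i,j} f(X_{ij})
  \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2

% SGNS: log-likelihood of observed pairs against k sampled negatives (maximized)
J_{\mathrm{SGNS}} = \sum_{i,j} X_{ij}
  \left[ \log \sigma\!\left( w_i^{\top} \tilde{w}_j \right)
  + k \, \mathbb{E}_{j' \sim P_n}
    \log \sigma\!\left( -\, w_i^{\top} \tilde{w}_{j'} \right) \right],
\qquad \sigma(x) = \frac{1}{1 + e^{-x}}
```

Both objectives are driven by the inner product $w_i^{\top} \tilde{w}_j$ of a word vector and a context vector, which is the shared structure that makes a comparison of the two models natural.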
-
A Comparison of Spatial Predictors when Datasets Could be Very Large
Authors:
Jonathan R. Bradley,
Noel Cressie,
Tao Shi
Abstract:
In this article, we review and compare a number of methods of spatial prediction. To demonstrate the breadth of available choices, we consider both traditional and more-recently-introduced spatial predictors. Specifically, in our exposition we review: traditional stationary kriging, smoothing splines, negative-exponential distance-weighting, Fixed Rank Kriging, modified predictive processes, a stochastic partial differential equation approach, and lattice kriging. This comparison is meant to provide a service to practitioners wishing to decide between spatial predictors. Hence, we provide technical material for the unfamiliar, which includes the definition and motivation for each (deterministic and stochastic) spatial predictor. We use a benchmark dataset of $\mathrm{CO}_{2}$ data from NASA's AIRS instrument to address computational efficiencies that include CPU time and memory usage. Furthermore, the predictive performance of each spatial predictor is assessed empirically using a hold-out subset of the AIRS data.
Submitted 28 October, 2014;
originally announced October 2014.
-
Scalable Spectral Algorithms for Community Detection in Directed Networks
Authors:
Sungmin Kim,
Tao Shi
Abstract:
Community detection has been one of the central problems in network studies, and directed networks are particularly challenging due to the asymmetry among their links. In this paper, we find that incorporating the direction of links reveals new perspectives on communities regarding the two different roles, source and terminal, that a node plays in each community. Intriguingly, such communities appear to be connected with a unique spectral property of the graph Laplacian of the adjacency matrix, and we exploit this connection by using regularized SVD methods. We propose harvesting algorithms, coupled with regularized SVDs, that are linearly scalable for efficient identification of communities in huge directed networks. The proposed algorithm shows great performance and scalability on benchmark networks in simulations and successfully recovers communities in real network applications.
Submitted 23 September, 2013; v1 submitted 28 November, 2012;
originally announced November 2012.
-
Data spectroscopy: Eigenspaces of convolution operators and clustering
Authors:
Tao Shi,
Mikhail Belkin,
Bin Yu
Abstract:
This paper focuses on obtaining clustering information about a distribution from its i.i.d. samples. We develop theoretical results to understand and use clustering information contained in the eigenvectors of data adjacency matrices based on a radial kernel function with a sufficiently fast tail decay. In particular, we provide population analyses to gain insights into which eigenvectors should be used and when the clustering information for the distribution can be recovered from the sample. We learn that a fixed number of top eigenvectors might at the same time contain redundant clustering information and miss relevant clustering information. We use this insight to design the data spectroscopic clustering (DaSpec) algorithm that utilizes properly selected eigenvectors to determine the number of clusters automatically and to group the data accordingly. Our findings extend the intuitions underlying existing spectral techniques such as spectral clustering and Kernel Principal Components Analysis, and provide new understanding into their usability and modes of failure. Simulation studies and experiments on real-world data are conducted to show the potential of our algorithm. In particular, DaSpec is found to handle unbalanced groups and recover clusters of different shapes better than the competing methods.
Submitted 20 November, 2009; v1 submitted 23 July, 2008;
originally announced July 2008.