arXiv:2404.15533 [pdf, other]

Designing, simulating, and performing the 100-AV field test for the CIRCLES consortium: Methodology and Implementation of the Largest mobile traffic control experiment to date

Authors: Mostafa Ameli, Sean Mcquade, Jonathan W. Lee, Matthew Bunting, Matthew Nice, Han Wang, William Barbour, Ryan Weightman, Chris Denaro, Ryan Delorenzo, Sharon Hornstein, Jon F. Davis, Dan Timsit, Riley Wagner, Rita Xu, Malaika Mahmood, Mikail Mahmood, Maria Laura Delle Monache, Benjamin Seibold, Daniel B. Work, Jonathan Sprinkle, Benedetto Piccoli, Alexandre M. Bayen

Abstract: Previous controlled experiments on single-lane ring roads have shown that a single partially autonomous vehicle (AV) can effectively mitigate traffic waves. This naturally prompts the question of how these findings can be generalized to field operational, high-density traffic conditions. To address this question, the Congestion Impacts Reduction via CAV-in-the-loop Lagrangian Energy Smoothing (CIRCLES) Consortium conducted MegaVanderTest (MVT), a live traffic control experiment involving 100 vehicles near Nashville, TN, USA. This article is a tutorial for developing analytical and simulation-based tools essential for designing and executing a live traffic control experiment like the MVT. It presents an overview of the proposed roadmap and various procedures used in designing, monitoring, and conducting the MVT, which was the largest mobile traffic control experiment at the time. The design process is aimed at evaluating the impact of the CIRCLES AVs on surrounding traffic. The article discusses the agent-based traffic simulation framework created for this evaluation. A novel methodological framework is introduced to calibrate this microsimulation, aiming to accurately capture traffic dynamics and assess the impact of adding 100 vehicles to existing traffic. The calibration model's effectiveness is verified using data from a six-mile section of Nashville's I-24 highway. The results indicate that the proposed model establishes an effective feedback loop between the optimizer and the simulator, thereby calibrating flow and speed with different spatiotemporal characteristics to minimize the error between simulated and real-world data. Finally, we simulate AVs in multiple scenarios to assess their effect on traffic congestion. This evaluation validates the AV routes, thereby contributing to the execution of a safe and successful live traffic control experiment via AVs.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2402.17050 [pdf, other]

Reinforcement Learning Based Oscillation Dampening: Scaling up Single-Agent RL algorithms to a 100 AV highway field operational test

Authors: Kathy Jang, Nathan Lichtlé, Eugene Vinitsky, Adit Shah, Matthew Bunting, Matthew Nice, Benedetto Piccoli, Benjamin Seibold, Daniel B. Work, Maria Laura Delle Monache, Jonathan Sprinkle, Jonathan W. Lee, Alexandre M. Bayen

Abstract: In this article, we explore the technical details of the reinforcement learning (RL) algorithms that were deployed in the largest field test of automated vehicles designed to smooth traffic flow in history as of 2023, uncovering the challenges and breakthroughs that come with developing RL controllers for automated vehicles. We delve into the fundamental concepts behind RL algorithms and their application in the context of self-driving cars, discussing the developmental process from simulation to deployment in detail, from designing simulators to reward function shaping. We present the results in both simulation and deployment, discussing the flow-smoothing benefits of the RL controller. From understanding the basics of Markov decision processes to exploring advanced techniques such as deep RL, our article offers a comprehensive overview and deep dive of the theoretical foundations and practical implementations driving this rapidly evolving field. We also showcase real-world case studies and alternative research projects that highlight the impact of RL controllers in revolutionizing autonomous driving. From tackling complex urban environments to dealing with unpredictable traffic scenarios, these intelligent controllers are pushing the boundaries of what automated vehicles can achieve. Furthermore, we examine the safety considerations and hardware-focused technical details surrounding deployment of RL controllers into automated vehicles. As these algorithms learn and evolve through interactions with the environment, ensuring their behavior aligns with safety standards becomes crucial. We explore the methodologies and frameworks being developed to address these challenges, emphasizing the importance of building reliable control systems for automated vehicles.

Submitted 14 May, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.17043 [pdf, other]

Traffic Control via Connected and Automated Vehicles: An Open-Road Field Experiment with 100 CAVs

Authors: Jonathan W. Lee, Han Wang, Kathy Jang, Amaury Hayat, Matthew Bunting, Arwa Alanqary, William Barbour, Zhe Fu, Xiaoqian Gong, George Gunter, Sharon Hornstein, Abdul Rahman Kreidieh, Nathan Lichtlé, Matthew W. Nice, William A. Richardson, Adit Shah, Eugene Vinitsky, Fangyu Wu, Shengquan Xiang, Sulaiman Almatrudi, Fahd Althukair, Rahul Bhadani, Joy Carpio, Raphael Chekroun, Eric Cheng, et al. (39 additional authors not shown)

Abstract: The CIRCLES project aims to reduce instabilities in traffic flow, which are naturally occurring phenomena due to human driving behavior. These "phantom jams" or "stop-and-go waves" are a significant source of wasted energy. Toward this goal, the CIRCLES project designed a control system, referred to as the MegaController by the CIRCLES team, that could be deployed in real traffic. Our field experiment leveraged a heterogeneous fleet of 100 longitudinally-controlled vehicles as Lagrangian traffic actuators, each of which ran a controller with the architecture described in this paper. The MegaController is a hierarchical control architecture, which consists of two main layers. The upper layer is called Speed Planner, and is a centralized optimal control algorithm. It assigns speed targets to the vehicles, conveyed through the LTE cellular network. The lower layer is a control layer, running on each vehicle. It performs local actuation by overriding the stock adaptive cruise controller, using the stock on-board sensors. The Speed Planner ingests live data feeds provided by third parties, as well as data from our own control vehicles, and uses both to perform the speed assignment. The architecture of the speed planner allows for modular use of standard control techniques, such as optimal control, model predictive control, kernel methods, Deep RL, and explicit controllers. Depending on the vehicle architecture, all onboard sensing data can be accessed by the local controllers, or only some. Control inputs vary across different automakers, ranging from torque or acceleration requests for some cars to electronic selection of ACC set points in others. The proposed architecture allows for the combination of all possible settings proposed above. Most configurations were tested throughout the ramp up to the MegaVanderTest.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.16993 [pdf, other]

Hierarchical Speed Planner for Automated Vehicles: A Framework for Lagrangian Variable Speed Limit in Mixed Autonomy Traffic

Authors: Han Wang, Zhe Fu, Jonathan Lee, Hossein Nick Zinat Matin, Arwa Alanqary, Daniel Urieli, Sharon Hornstein, Abdul Rahman Kreidieh, Raphael Chekroun, William Barbour, William A. Richardson, Dan Work, Benedetto Piccoli, Benjamin Seibold, Jonathan Sprinkle, Alexandre M. Bayen, Maria Laura Delle Monache

Abstract: This paper introduces a novel control framework for Lagrangian variable speed limits in hybrid traffic flow environments utilizing automated vehicles (AVs). The framework was validated using a fleet of 100 connected automated vehicles as part of the largest coordinated open-road test designed to smooth traffic flow. The framework includes two main components: a high-level controller deployed on the server side, named Speed Planner, and low-level controllers called vehicle controllers deployed on the vehicle side. The Speed Planner designs and updates target speeds for the vehicle controllers based on real-time Traffic State Estimation (TSE) [1]. The Speed Planner comprises two modules: a TSE enhancement module and a target speed design module. The TSE enhancement module is designed to minimize the effects of inherent latency in the received traffic information and to improve the spatial and temporal resolution of the input traffic data. The target speed design module generates target speed profiles with the goal of improving traffic flow. The vehicle controllers are designed to track the target speed while responding to the surrounding situation. The numerical simulation indicates the performance of the proposed method: the bottleneck throughput has increased by 5.01%, and the speed standard deviation has been reduced by a significant 34.36%. We further showcase an operational study with a description of how the controller was implemented on a field-test with 100 AVs and its comprehensive effects on the traffic flow.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2401.09666 [pdf, other]

Traffic Smoothing Controllers for Autonomous Vehicles Using Deep Reinforcement Learning and Real-World Trajectory Data

Authors: Nathan Lichtlé, Kathy Jang, Adit Shah, Eugene Vinitsky, Jonathan W. Lee, Alexandre M. Bayen

Abstract: Designing traffic-smoothing cruise controllers that can be deployed onto autonomous vehicles is a key step towards improving traffic flow, reducing congestion, and enhancing fuel efficiency in mixed autonomy traffic. We bypass the common issue of having to carefully fine-tune a large traffic microsimulator by leveraging real-world trajectory data from the I-24 highway in Tennessee, replayed in a one-lane simulation. Using standard deep reinforcement learning methods, we train energy-reducing wave-smoothing policies. As an input to the agent, we observe the speed and distance of only the vehicle in front, which are local states readily available on most recent vehicles, as well as non-local observations about the downstream state of the traffic. We show that at a low 4% autonomous vehicle penetration rate, we achieve significant fuel savings of over 15% on trajectories exhibiting many stop-and-go waves. Finally, we analyze the smoothing effect of the controllers and demonstrate robustness to adding lane-changing into the simulation as well as the removal of downstream information.

Submitted 17 January, 2024; originally announced January 2024.

Comments: Accepted to be published as part of the 26th IEEE International Conference on Intelligent Transportation Systems (ITSC) 2023, Bilbao, Spain, September 24-28, 2023

arXiv:2310.18151 [pdf, other]

Traffic smoothing using explicit local controllers

Authors: Amaury Hayat, Arwa Alanqary, Rahul Bhadani, Christopher Denaro, Ryan J. Weightman, Shengquan Xiang, Jonathan W. Lee, Matthew Bunting, Anish Gollakota, Matthew W. Nice, Derek Gloudemans, Gergely Zachar, Jon F. Davis, Maria Laura Delle Monache, Benjamin Seibold, Alexandre M. Bayen, Jonathan Sprinkle, Daniel B. Work, Benedetto Piccoli

Abstract: The dissipation of stop-and-go waves attracted recent attention as a traffic management problem, which can be efficiently addressed by automated driving. As part of the 100 automated vehicles experiment named MegaVanderTest, feedback controls were used to induce strong dissipation via velocity smoothing. More precisely, a single vehicle driving differently in one of the four lanes of I-24 in the Nashville area was able to regularize the velocity profile by reducing oscillations in time and velocity differences among vehicles. Quantitative measures of this effect were possible due to the innovative I-24 MOTION system capable of monitoring the traffic conditions for all vehicles on the roadway. This paper presents the control design, the technological aspects involved in its deployment, and, finally, the results achieved by the experiment.

Submitted 27 October, 2023; originally announced October 2023.

Comments: 21 pages, 1 table, 9 figures

MSC Class: 93D15; 93D21; 93-05; 34H05; ACM Class: H.2.2

arXiv:2310.06297 [pdf, other]

Reducing Detailed Vehicle Energy Dynamics to Physics-Like Models

Authors: Nour Khoudari, Sulaiman Almatrudi, Rabie Ramadan, Joy Carpio, Mengsha Yao, Kenneth Butts, Alexandre M. Bayen, Jonathan W. Lee, Benjamin Seibold

Abstract: The energy demand of vehicles, particularly in unsteady drive cycles, is affected by complex dynamics internal to the engine and other powertrain components. Yet, in many applications, particularly macroscopic traffic flow modeling and optimization, structurally simple approximations to the complex vehicle dynamics are needed that nevertheless reproduce the correct effective energy behavior. This work presents a systematic model reduction pipeline that starts from complex vehicle models based on the Autonomie software and derives a hierarchy of simplified models that are fast to evaluate, easy to disseminate in open-source frameworks, and compatible with optimization frameworks. The pipeline, based on a virtual chassis dynamometer and subsequent approximation strategies, is reproducible and is applied to six different vehicle classes to produce concrete explicit energy models that represent an average vehicle in each class and leverage the accuracy and validation work of the Autonomie software.

Submitted 10 October, 2023; originally announced October 2023.

Comments: 40 pages, 9 figures

arXiv:2302.07453 [pdf, other]

Cooperative Driving for Speed Harmonization in Mixed-Traffic Environments

Authors: Zhe Fu, Abdul Rahman Kreidieh, Han Wang, Jonathan W. Lee, Maria Laura Delle Monache, Alexandre M. Bayen

Abstract: Autonomous driving systems present promising methods for congestion mitigation in mixed autonomy traffic control settings. In particular, when coupled with even modest traffic state estimates, such systems can plan and coordinate the behaviors of automated vehicles (AVs) in response to observed downstream events, thereby inhibiting the continued propagation of congestion. In this paper, we present a two-layer control strategy in which the upper layer proposes the desired speeds that predictively react to the downstream state of traffic, and the lower layer maintains safe and reasonable headways with leading vehicles. This method is demonstrated to achieve an average of over 15% energy savings within simulations of congested events observed in Interstate 24 with only 4% AV penetration, while restricting negative externalities imposed on traveling times and mobility. The proposed strategy that served as part of the "speed planner" was deployed on 100 AVs in a massive traffic experiment conducted on Nashville's I-24 in November 2022.

Submitted 3 June, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

Comments: Accepted by IEEE IV 2023
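
The two-layer idea in this abstract can be illustrated with a minimal Python sketch. This is not the authors' controller: the function names, the mean-of-downstream-speeds rule, and the `min_gap` and `time_headway` values are all hypothetical choices made for illustration.

```python
def upper_layer_target(downstream_speeds):
    """Hypothetical upper layer: propose a desired speed (m/s) from
    downstream traffic-state estimates by averaging them."""
    return sum(downstream_speeds) / len(downstream_speeds)


def lower_layer_speed(target_speed, gap, min_gap=5.0, time_headway=2.0):
    """Hypothetical lower layer: clip the proposed speed so that the gap (m)
    to the leading vehicle stays at least min_gap + time_headway * speed."""
    # Largest speed that still satisfies the headway constraint; never negative.
    safe_speed = max(0.0, (gap - min_gap) / time_headway)
    return min(target_speed, safe_speed)


# Downstream segments are slowing down, so the target drops below free-flow speed.
target = upper_layer_target([30.0, 20.0, 10.0])
# With a short gap, the headway constraint overrides the target.
commanded = lower_layer_speed(target, gap=25.0)
```

Separating the two functions mirrors the division of labor described above: the upper layer only needs coarse downstream information, while the lower layer only needs the local gap.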

arXiv:2211.13730 [pdf, other]

doi 10.3934/nhm.2023078

A Proof of Kirchhoff's First Law for Hyperbolic Conservation Laws on Networks

Authors: Alexandre M. Bayen, Alexander Keimer, Nils Müller

Abstract: Networks are essential models in many applications such as information technology, chemistry, power systems, transportation, neuroscience, and social sciences. In light of such broad applicability, a general theory of dynamical systems on networks may capture shared concepts, and provide a setting for deriving abstract properties. To this end, we develop a calculus for networks modeled as abstract metric spaces and derive an analog of Kirchhoff's first law for hyperbolic conservation laws. In dynamical systems on networks, Kirchhoff's first law connects the study of abstract global objects, and that of a computationally-beneficial edgewise-Euclidean perspective by stating its equivalence. In particular, our results show that hyperbolic conservation laws on networks can be stated without explicit Kirchhoff-type boundary conditions.

Submitted 24 November, 2022; originally announced November 2022.

Comments: 20 pages, 6 figures

MSC Class: 35R02; 35C99 (Primary) 35L65; 00A71 (Secondary)

arXiv:2208.12534 [pdf, other]

Learning energy-efficient driving behaviors by imitating experts

Authors: Abdul Rahman Kreidieh, Zhe Fu, Alexandre M. Bayen

Abstract: The rise of vehicle automation has generated significant interest in the potential role of future automated vehicles (AVs). In particular, in highly dense traffic settings, AVs are expected to serve as congestion-dampeners, mitigating the presence of instabilities that arise from various sources. However, in many applications, such maneuvers rely heavily on non-local sensing or coordination by interacting AVs, thereby rendering their adaptation to real-world settings a particularly difficult challenge. To address this challenge, this paper examines the role of imitation learning in bridging the gap between such control strategies and realistic limitations in communication and sensing. Treating one such controller as an "expert", we demonstrate that imitation learning can succeed in deriving policies that, if adopted by 5% of vehicles, may boost the energy-efficiency of networks with varying traffic conditions by 15% using only local observations. Results and code are available online at https://sites.google.com/view/il-traffic/home.

Submitted 28 June, 2022; originally announced August 2022.

arXiv:2208.00268 [pdf, other]

doi 10.1109/TASE.2022.3168621

Unified Automatic Control of Vehicular Systems with Reinforcement Learning

Authors: Zhongxia Yan, Abdul Rahman Kreidieh, Eugene Vinitsky, Alexandre M. Bayen, Cathy Wu

Abstract: Emerging vehicular systems with increasing proportions of automated components present opportunities for optimal control to mitigate congestion and increase efficiency. There has been a recent interest in applying deep reinforcement learning (DRL) to these nonlinear dynamical systems for the automatic design of effective control strategies. Despite conceptual advantages of DRL being model-free, studies typically nonetheless rely on training setups that are painstakingly specialized to specific vehicular systems. This is a key challenge to efficient analysis of diverse vehicular and mobility systems. To this end, this article contributes a streamlined methodology for vehicular microsimulation and discovers high performance control strategies with minimal manual design. A variable-agent, multi-task approach is presented for optimization of vehicular Partially Observed Markov Decision Processes. The methodology is experimentally validated on mixed autonomy traffic systems, where fractions of vehicles are automated; empirical improvement, typically 15-60% over a human driving baseline, is observed in all configurations of six diverse open or closed traffic systems. The study reveals numerous emergent behaviors resembling wave mitigation, traffic signaling, and ramp metering. Finally, the emergent behaviors are analyzed to produce interpretable control strategies, which are validated against the learned control strategies.

Submitted 30 July, 2022; originally announced August 2022.

Comments: 16 pages, 14 figures, IEEE Transactions on Automation Science and Engineering (T-ASE), 2022

arXiv:2112.14345 [pdf, other]

Reachability Analysis for FollowerStopper: Safety Analysis and Experimental Results

Authors: Fang-Chieh Chou, Marsalis Gibson, Rahul Bhadani, Alexandre M. Bayen, Jonathan Sprinkle

Abstract: Motivated by earlier work and the development of a new algorithm, the FollowerStopper, this article uses reachability analysis to verify the safety of the FollowerStopper algorithm, which is a controller designed for dampening stop-and-go traffic waves. With more than 1100 miles of driving data collected by our physical platform, we validate our analysis results by comparing them to human driving behaviors. The FollowerStopper controller has been demonstrated to dampen stop-and-go traffic waves at low speed, but previous analysis on its relative safety has been limited to upper and lower bounds of acceleration. To expand upon previous analysis, reachability analysis is used to investigate the safety at the speeds it was originally tested and also at higher speeds. Two formulations of safety analysis with different criteria are shown: distance-based and time headway-based. The FollowerStopper is considered safe with the distance-based criterion. However, simulation results demonstrate that the FollowerStopper is not representative of human drivers: it follows too closely behind vehicles, specifically at a distance humans would deem unsafe. On the other hand, under the time headway-based safety analysis, the FollowerStopper is no longer considered safe. A modified FollowerStopper is proposed to satisfy the time-based safety criterion. Simulation results of the proposed FollowerStopper show that its response represents human driver behavior better.

Submitted 28 December, 2021; originally announced December 2021.

Comments: 6 pages; 10 figures; ICRA publication
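
To make the difference between the two safety criteria named in this abstract concrete, here is a minimal Python sketch. It is not the paper's reachability analysis: the function names and the thresholds `min_gap_m` and `min_headway_s` are hypothetical values chosen only for illustration.

```python
def distance_safe(gap_m, min_gap_m=5.0):
    """Distance-based criterion: the gap to the leader must exceed a fixed distance."""
    return gap_m >= min_gap_m


def headway_safe(gap_m, speed_mps, min_headway_s=1.5):
    """Time headway-based criterion: the time needed to cover the gap at the
    follower's current speed must exceed a fixed duration."""
    if speed_mps <= 0.0:
        return True  # a stopped follower cannot close the gap
    return gap_m / speed_mps >= min_headway_s


# The same 10 m gap passes both checks at 5 m/s (2 s headway) ...
low_speed = (distance_safe(10.0), headway_safe(10.0, 5.0))
# ... but fails the time headway check at 30 m/s (about 0.33 s headway).
high_speed = (distance_safe(10.0), headway_safe(10.0, 30.0))
```

Because the distance criterion ignores speed, a controller that keeps a short fixed gap can satisfy it at highway speeds while violating the time headway criterion, which is the kind of discrepancy the abstract reports.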

arXiv:2110.11943 [pdf, other]

Solving N-player dynamic routing games with congestion: a mean field approach

Authors: Theophile Cabannes, Mathieu Lauriere, Julien Perolat, Raphael Marinier, Sertan Girgin, Sarah Perrin, Olivier Pietquin, Alexandre M. Bayen, Eric Goubault, Romuald Elie

Abstract: The recent emergence of navigational tools has changed traffic patterns and has now enabled new types of congestion-aware routing control like dynamic road pricing. Using the fundamental diagram of traffic flows - applied in macroscopic and mesoscopic traffic modeling - the article introduces a new N-player dynamic routing game with explicit congestion dynamics. The model is well-posed and can reproduce heterogeneous departure times and congestion spill back phenomena. However, as Nash equilibrium computations are PPAD-complete, solving the game becomes intractable for large but realistic numbers of vehicles N. Therefore, the corresponding mean field game is also introduced. Experiments were performed on several classical benchmark networks of the traffic community: the Pigou, Braess, and Sioux Falls networks with heterogeneous origin, destination and departure time tuples. The Pigou and the Braess examples reveal that the mean field approximation is generally very accurate and computationally efficient as soon as the number of vehicles exceeds a few dozen. On the Sioux Falls network (76 links, 100 time steps), this approach enables learning traffic dynamics with more than 14,000 vehicles.

Submitted 27 October, 2021; v1 submitted 22 October, 2021; originally announced October 2021.

arXiv:2109.14019 [pdf, other]

Longitudinal Deep Truck: Deep learning and deep reinforcement learning for modeling and control of longitudinal dynamics of heavy duty trucks

Authors: Saleh Albeaik, Trevor Wu, Ganeshnikhil Vurimi, Xiao-Yun Lu, Alexandre M. Bayen

Abstract: Heavy duty truck mechanical configuration is often tailor designed and built for specific truck mission requirements. This renders the precise derivation of analytical dynamical models and controls for these trucks from first principles challenging, tedious, and often requires several theoretical and applied areas of expertise to carry through. This article investigates deep learning and deep reinforcement learning as truck-configuration-agnostic longitudinal modeling and control approaches for heavy duty trucks. The article outlines a process to develop and validate such models and controllers and highlights relevant practical considerations. The process is applied to simulation and real full-size trucks for validation and experimental performance evaluation. The results presented demonstrate applicability of this approach to trucks of multiple configurations; models generated were accurate for control development purposes both in simulation and the field.

Submitted 28 September, 2021; originally announced September 2021.

arXiv:2104.11267 [pdf, other]

Integrated Framework of Vehicle Dynamics, Instabilities, Energy Models, and Sparse Flow Smoothing Controllers

Authors: Jonathan W. Lee, George Gunter, Rabie Ramadan, Sulaiman Almatrudi, Paige Arnold, John Aquino, William Barbour, Rahul Bhadani, Joy Carpio, Fang-Chieh Chou, Marsalis Gibson, Xiaoqian Gong, Amaury Hayat, Nour Khoudari, Abdul Rahman Kreidieh, Maya Kumar, Nathan Lichtlé, Sean McQuade, Brian Nguyen, Megan Ross, Sydney Truong, Eugene Vinitsky, Yibo Zhao, Jonathan Sprinkle, Benedetto Piccoli, et al. (3 additional authors not shown)

Abstract: This work presents an integrated framework of: vehicle dynamics models, with a particular attention to instabilities and traffic waves; vehicle energy models, with particular attention to accurate energy values for strongly unsteady driving profiles; and sparse Lagrangian controls via automated vehicles, with a focus on controls that can be executed via existing technology such as adaptive cruise control systems. This framework serves as a key building block in developing control strategies for human-in-the-loop traffic flow smoothing on real highways. In this contribution, we outline the fundamental merits of integrating vehicle dynamics and energy modeling into a single framework, and we demonstrate the energy impact of sparse flow smoothing controllers via simulation results.

Submitted 22 April, 2021; originally announced April 2021.

arXiv:2008.02691 [pdf, other]

Adaptive Coordination Offsets for Signalized Arterial Intersections using Deep Reinforcement Learning

Authors: Keith Anshilo Diaz, Damian Dailisan, Umang Sharaf, Carissa Santos, Qijian Gan, Francis Aldrine Uy, May T. Lim, Alexandre M. Bayen

Abstract: Coordinating intersections in arterial networks is critical to the performance of urban transportation systems. Deep reinforcement learning (RL) has gained traction in traffic control research along with data-driven approaches for traffic control systems. To date, proposed deep RL-based traffic schemes control phase activation or duration. Yet, such approaches may bypass low volume links for several cycles in order to optimize the network-level traffic flow. Here, we propose a deep RL framework that dynamically adjusts offsets based on traffic states and preserves the planned phase timings and order derived from model-based methods. This framework allows us to improve arterial coordination while maintaining phase order and timing predictability. Using a validated and calibrated traffic model, we trained the policy of a deep RL agent that aims to reduce travel delays in the network. We evaluated the resulting policy by comparing its performance against the phase offsets deployed along a segment of Huntington Drive in the city of Arcadia. The resulting policy dynamically readjusts phase offsets in response to changes in traffic demand. Simulation results show that the proposed deep RL agent outperformed the baseline on average, effectively reducing delay time by 13.21% in the AM Scenario, 2.42% in the Noon scenario, and 6.2% in the PM scenario when offsets are adjusted in 15-minute intervals. Finally, we also show the robustness of our agent to extreme traffic conditions, such as demand surges in off-peak hours and localized traffic incidents.

Submitted 29 August, 2022; v1 submitted 6 August, 2020; originally announced August 2020.

arXiv:2002.07386 [pdf, other]

ResiliNet: Failure-Resilient Inference in Distributed Neural Networks

Authors: Ashkan Yousefpour, Brian Q. Nguyen, Siddartha Devic, Guanhua Wang, Aboudy Kreidieh, Hans Lobel, Alexandre M. Bayen, Jason P. Jue

Abstract: Federated Learning aims to train distributed deep models without sharing the raw data with the centralized server. Similarly, in distributed inference of neural networks, by partitioning the network and distributing it across several physical nodes, activations and gradients are exchanged between physical nodes, rather than raw data. Nevertheless, when a neural network is partitioned and distributed among physical nodes, failure of physical nodes causes the failure of the neural units that are placed on those nodes, which results in a significant performance drop. Current approaches focus on resiliency of training in distributed neural networks. However, resiliency of inference in distributed neural networks is less explored. We introduce ResiliNet, a scheme for making inference in distributed neural networks resilient to physical node failures. ResiliNet combines two concepts to provide resiliency: skip hyperconnection, a concept for skipping nodes in distributed neural networks similar to skip connection in resnets, and a novel technique called failout, which is introduced in this paper. Failout simulates physical node failure conditions during training using dropout, and is specifically designed to improve the resiliency of distributed neural networks. The results of the experiments and ablation studies using three datasets confirm the ability of ResiliNet to provide inference resiliency for distributed neural networks.

Submitted 19 December, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

Comments: Accepted in FL-ICML 2020 (International Workshop on Federated Learning for User Privacy and Data Confidentiality in Conjunction with ICML 2020). Added FAQ to the end of the paper

arXiv:1912.03861 [pdf, other]

Daily Data Assimilation of a Hydrologic Model Using the Ensemble Kalman Filter

Authors: Sami A. Malek, Alexandre M. Bayen, Steven D. Glaser

Abstract: Accurate runoff forecasting is crucial for reservoir operators as it allows optimized water management, flood control and hydropower generation. Land surface models in mountainous regions depend on climatic inputs such as precipitation, temperature and solar radiation to model the water and energy dynamics and produce runoff as output. With the rapid development of cheap electronics applied in various systems, such as Wireless Sensor Networks (WSNs), satellite and airborne technologies, the prospect of practically measuring spatial Snow Water Equivalent in a dense temporal scale is increasing. We present a framework for updating the Precipitation Runoff Modeling System (PRMS) with Snow Water Equivalent (SWE) maps and runoff measurements on a daily timescale based on the Ensemble Kalman Filter (ENKF). Results show that by assimilating SWE daily, the modeled SWE gets updated accordingly, however no improvement is observed at the runoff model output. Instead, a deterioration consistently occurs. Augmenting the state space with model parameters and runoff model output allows for filter update with previous day measured runoff using the joint state-parameter method, and showed a considerable improvement in the daily runoff output of up to 60% reduction in RMSE for the wet water year 2011 relative to the no assimilation scenario, and improvement of up to 28% compared to a naive autoregressive AR(1) filter. Additional simulation years showed consistent improvement compared to no assimilation, but varied relative to the previous day autoregressive forecast during the dry year 2014.

Submitted 9 December, 2019; originally announced December 2019.

Comments: 18 pages, 5 figures, 4 tables and supplement

Journal ref: Published as Masters thesis here: https://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-101.html

arXiv:1912.02368 [pdf, other]

Inter-Level Cooperation in Hierarchical Reinforcement Learning

Authors: Abdul Rahman Kreidieh, Glen Berseth, Brandon Trabucco, Samyak Parajuli, Sergey Levine, Alexandre M. Bayen

Abstract: Hierarchies of temporally decoupled policies present a promising approach for enabling structured exploration in complex long-term planning problems. To fully achieve this approach an end-to-end training paradigm is needed. However, training these multi-level policies has had limited success due to challenges arising from interactions between the goal-assigning and goal-achieving levels within a hierarchy. In this article, we consider the policy optimization process as a multi-agent process. This allows us to draw on connections between communication and cooperation in multi-agent RL, and demonstrate the benefits of increased cooperation between sub-policies on the training performance of the overall policy. We introduce a simple yet effective technique for inducing inter-level cooperation by modifying the objective function and subsequent gradients of higher-level policies. Experimental results on a wide variety of simulated robotics and traffic control tasks demonstrate that inducing cooperation results in stronger performing policies and increased sample efficiency on a set of difficult long time horizon tasks. We also find that goal-conditioned policies trained using our method display better transfer to new tasks, highlighting the benefits of our method in learning task-agnostic lower-level behaviors. Videos and code are available at: https://sites.google.com/berkeley.edu/cooperative-hrl.

Submitted 17 November, 2021; v1 submitted 4 December, 2019; originally announced December 2019.

arXiv:1909.00995 [pdf, other]

doi 10.1145/3363347.3363366

Guardians of the Deep Fog: Failure-Resilient DNN Inference from Edge to Cloud

Authors: Ashkan Yousefpour, Siddartha Devic, Brian Q. Nguyen, Aboudy Kreidieh, Alan Liao, Alexandre M. Bayen, Jason P. Jue

Abstract: Partitioning and distributing deep neural networks (DNNs) over physical nodes such as edge, fog, or cloud nodes, could enhance sensor fusion, and reduce bandwidth and inference latency. However, when a DNN is distributed over physical nodes, failure of the physical nodes causes the failure of the DNN units that are placed on these nodes. The performance of the inference task will be unpredictable, and most likely, poor, if the distributed DNN is not specifically designed and properly trained for failures. Motivated by this, we introduce deepFogGuard, a DNN architecture augmentation scheme for making the distributed DNN inference task failure-resilient. To articulate deepFogGuard, we introduce the elements and a model for the resiliency of distributed DNN inference. Inspired by the concept of residual connections in DNNs, we introduce skip hyperconnections in distributed DNNs, which are the basis of deepFogGuard's design to provide resiliency. Next, our extensive experiments using two existing datasets for the sensing and vision applications confirm the ability of deepFogGuard to provide resiliency for distributed DNNs in edge-cloud networks.

Submitted 21 September, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

Comments: Accepted to ACM AIChallengeIoT 2019

arXiv:1908.03821 [pdf, other]

BISTRO: Berkeley Integrated System for Transportation Optimization

Authors: Sidney A. Feygin, Jessica R. Lazarus, Edward H. Forscher, Valentine Golfier-Vetterli, Jonathan W. Lee, Abhishek Gupta, Rashid A. Waraich, Colin J. R. Sheppard, Alexandre M. Bayen

Abstract: This article introduces BISTRO, a new open source transportation planning decision support system that uses an agent-based simulation and optimization approach to anticipate and develop adaptive plans for possible technological disruptions and growth scenarios. The new framework was evaluated in the context of a machine learning competition hosted within Uber Technologies, Inc., in which over 400 engineers and data scientists participated. For the purposes of this competition, a benchmark model, based on the city of Sioux Falls, South Dakota, was adapted to the BISTRO framework. An important finding of this study was that in spite of rigorous analysis and testing done prior to the competition, the two top-scoring teams discovered an unbounded region of the search space, rendering the solutions largely uninterpretable for the purposes of decision-support. On the other hand, a follow-on study aimed to fix the objective function, served to demonstrate BISTRO's utility as a human-in-the-loop cyberphysical system: one that uses scenario-based optimization algorithms as a feedback mechanism to assist urban planners with iteratively refining objective function and constraints specification on intervention strategies such that the portfolio of transportation intervention strategy alternatives eventually chosen achieves high-level regional planning goals developed through participatory stakeholder engagement practices.

Submitted 22 January, 2020; v1 submitted 10 August, 2019; originally announced August 2019.

arXiv:1907.05464 [pdf, ps, other]

Integrated Offline and Online Optimization-Based Control in a Base-Parallel Architecture

Authors: Anahita Jamshidnejad, Gabriel Gomes, Alexandre M. Bayen, Bart De Schutter

Abstract: We propose an integrated control architecture to address the gap that currently exists for efficient real-time implementation of MPC-based control approaches for highly nonlinear systems with fast dynamics and a large number of control constraints. The proposed architecture contains two types of controllers: base controllers that are tuned or optimized offline, and parallel controllers that solve an optimization-based control problem online. The control inputs computed by the base controllers provide starting points for the optimization problem of the parallel controllers, which operate in parallel within a limited time budget that does not exceed the control sampling time. The resulting control system is very flexible and its architecture can easily be modified or changed online, e.g., by adding or eliminating controllers, for online improvement of the performance of the controlled system. In a case study, the proposed control architecture is implemented for highway traffic, which is characterized by nonlinear, fast dynamics with multiple control constraints, to minimize the overall travel time of the vehicles, while increasing their total traveled distance within the fixed simulation time window. The results of the simulation show the excellent real-time (i.e., within the given time budget) performance of the proposed control architecture, with the least realized value of the overall cost function. Moreover, among the online control approaches considered for the case study, the average cost per vehicle for the base-parallel control approach is the closest to the online MPC-based controllers, which have excellent performance but may involve computation times that exceed the given time budget.

Submitted 11 July, 2019; originally announced July 2019.

MSC Class: 49-XX

arXiv:1906.04012 [pdf, other]

PDE Traffic Observer Validated on Freeway Data

Authors: Huan Yu, Qijian Gan, Alexandre M. Bayen, Miroslav Krstic

Abstract: This paper develops boundary observer for estimation of congested freeway traffic states based on Aw-Rascle-Zhang (ARZ) partial differential equations (PDE) model. Traffic state estimation refers to acquisition of traffic state information from partially observed traffic data. This problem is relevant for freeway due to its limited accessibility to real-time traffic information. We propose a model-driven approach in which estimation of aggregated traffic states in a freeway segment are obtained simply from boundary measurement of flow and velocity without knowledge of the initial states. The macroscopic traffic dynamics is represented by the ARZ model, consisting of $2 \times 2$ coupled nonlinear hyperbolic PDEs for traffic density and velocity. Analysis of the linearized ARZ model leads to the study of a hetero-directional hyperbolic PDE model for congested traffic regime. Using spatial transformation and PDE backstepping method, we construct a boundary observer consisting of a copy of the nonlinear plant with output injections from boundary measurement errors. The output injection gains are designed for the estimation error system so that the exponential stability of the error system in the $L^2$ norm and finite-time convergence to zero are guaranteed. Numerical simulations are conducted to validate the boundary observer design for estimation of the nonlinear ARZ model. In data validation, we calibrate model parameters of the ARZ model and then use vehicle trajectory data to test the performance of the observer design.

Submitted 6 June, 2019; originally announced June 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1904.12963

arXiv:1904.12963 [pdf, other]

Boundary Observer for Congested Freeway Traffic State Estimation via Aw-Rascle-Zhang model

Authors: Huan Yu, Alexandre M. Bayen, Miroslav Krstic

Abstract: This paper develops boundary observer for estimation of congested freeway traffic states based on Aw-Rascle-Zhang (ARZ) partial differential equations (PDE) model. Traffic state estimation refers to acquisition of traffic state information from partially observed traffic data. This problem is relevant for freeway due to its limited accessibility to real-time traffic information. We propose a boundary observer design so that estimates of aggregated traffic states in a freeway segment are obtained simply from boundary measurement of flow and velocity. The macroscopic traffic dynamics is represented by the ARZ model, consisting of $2 \times 2$ coupled nonlinear hyperbolic PDEs for traffic density and velocity. Analysis of the linearized ARZ model leads to the study of a hetero-directional hyperbolic PDE model for congested traffic regime. Using spatial transformation and PDE backstepping method, we construct a boundary observer with a copy of the nonlinear plant and output injection of boundary measurement errors. The output injection gains are designed for the error system of the linearized ARZ model so that the exponential stability of error system in the $L^2$ norm and finite-time convergence to zero are guaranteed. Simulations are conducted to validate the boundary observer design for nonlinear ARZ model without knowledge of initial conditions.

Submitted 29 April, 2019; originally announced April 2019.

arXiv:1903.05252 [pdf, other]

doi 10.1109/ICCA51439.2020.9264552

Zero-Shot Autonomous Vehicle Policy Transfer: From Simulation to Real-World via Adversarial Learning

Authors: Behdad Chalaki, Logan E. Beaver, Ben Remer, Kathy Jang, Eugene Vinitsky, Alexandre M. Bayen, Andreas A. Malikopoulos

Abstract: In this article, we demonstrate a zero-shot transfer of an autonomous driving policy from simulation to University of Delaware's scaled smart city with adversarial multi-agent reinforcement learning, in which an adversary attempts to decrease the net reward by perturbing both the inputs and outputs of the autonomous vehicles during training. We train the autonomous vehicles to coordinate with each other while crossing a roundabout in the presence of an adversary in simulation. The adversarial policy successfully reproduces the simulated behavior and incidentally outperforms, in terms of travel time, both a human-driving baseline and adversary-free trained policies. Finally, we demonstrate that the addition of adversarial training considerably improves the performance of the policies after transfer to the real world compared to Gaussian noise injection.

Submitted 22 June, 2020; v1 submitted 12 March, 2019; originally announced March 2019.

Comments: 6 pages, 4 figures

Journal ref: IEEE International Conference on Control & Automation, (2020), 35-40

arXiv:1803.07246 [pdf, other]

Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines

Authors: Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel

Abstract: Policy gradient methods have enjoyed great success in deep reinforcement learning but suffer from high variance of gradient estimates. The high variance problem is particularly exasperated in problems with long horizons or high-dimensional action spaces. To mitigate this issue, we derive a bias-free action-dependent baseline for variance reduction which fully exploits the structural form of the stochastic policy itself and does not make any additional assumptions about the MDP. We demonstrate and quantify the benefit of the action-dependent baseline through both theoretical analysis as well as numerical results, including an analysis of the suboptimality of the optimal state-dependent baseline. The result is a computationally efficient policy gradient algorithm, which scales to high-dimensional control problems, as demonstrated by a synthetic 2000-dimensional target matching task. Our experimental results indicate that action-dependent baselines allow for faster learning on standard reinforcement learning benchmarks and high-dimensional hand manipulation and synthetic tasks. Finally, we show that the general idea of including additional information in baselines for improved variance reduction can be extended to partially observed and multi-agent tasks.

Submitted 19 March, 2018; originally announced March 2018.

Comments: Accepted to ICLR 2018, Oral (2%)

arXiv:1710.05465 [pdf, other]

doi 10.1109/TRO.2021.3087314

Flow: A Modular Learning Framework for Mixed Autonomy Traffic

Authors: Cathy Wu, Aboudy Kreidieh, Kanaad Parvate, Eugene Vinitsky, Alexandre M Bayen

Abstract: The rapid development of autonomous vehicles (AVs) holds vast potential for transportation systems through improved safety, efficiency, and access to mobility. However, the progression of these impacts, as AVs are adopted, is not well understood. Numerous technical challenges arise from the goal of analyzing the partial adoption of autonomy: partial control and observation, multi-vehicle interactions, and the sheer variety of scenarios represented by real-world networks. To shed light into near-term AV impacts, this article studies the suitability of deep reinforcement learning (RL) for overcoming these challenges in a low AV-adoption regime. A modular learning framework is presented, which leverages deep RL to address complex traffic dynamics. Modules are composed to capture common traffic phenomena (stop-and-go traffic jams, lane changing, intersections). Learned control laws are found to improve upon human driving performance, in terms of system-level velocity, by up to 57% with only 4-7% adoption of AVs. Furthermore, in single-lane traffic, a small neural network control law with only local observation is found to eliminate stop-and-go traffic - surpassing all known model-based controllers to achieve near-optimal performance - and generalize to out-of-distribution traffic densities.

Submitted 30 December, 2021; v1 submitted 15 October, 2017; originally announced October 2017.

Comments: 17 pages, 8 figures, 5 tables. 2021 IEEE Transactions on Robotics (T-RO)

arXiv:1707.07371 [pdf, other]

Integration of Information Patterns in the Modeling and Design of Mobility Management Services

Authors: Alexander Keimer, Nicolas Laurent-Brouty, Farhad Farokhi, Hippolyte Signargout, Vladimir Cvetkovic, Alexandre M. Bayen, Karl H. Johansson

Abstract: Over the last decade, the rise of the mobile internet and the usage of mobile devices has enabled ubiquitous traffic information. With the increased adoption of specific smartphone applications, the number of users of routing applications has become large enough to disrupt traffic flow patterns in a significant manner. Similarly, but at a slightly slower pace, novel services for freight transportation and city logistics improve the efficiency of goods transportation and change the use of road infrastructure. The present article provides a general four-layer framework for modeling these new trends. The main motivation behind the development is to provide a unifying formal system description that can at the same time encompass system physics (flow and motion of vehicles) as well as coordination strategies under various information and cooperation structures. To showcase the framework, we apply it to the specific challenge of modeling and analyzing the integration of routing applications in today's transportation systems. In this framework, at the lowest layer (flow dynamics) we distinguish app users from non-app users. A distributed parameter model based on a non-local partial differential equation is introduced and analyzed. The second layer incorporates connected services (e.g., routing) and other applications used to optimize the local performance of the system. As inputs to those applications, we propose a third layer introducing the incentive design and global objectives, which are typically varying over the day depending on road and weather conditions, external events etc. The high-level planning is handled on the fourth layer taking social long-term objectives into account.

Submitted 23 July, 2017; originally announced July 2017.

Comments: 24 pages, 11 Figures

arXiv:1701.08832 [pdf, other]

Expert Level control of Ramp Metering based on Multi-task Deep Reinforcement Learning

Authors: Francois Belletti, Daniel Haziza, Gabriel Gomes, Alexandre M. Bayen

Abstract: This article shows how the recent breakthroughs in Reinforcement Learning (RL) that have enabled robots to learn to play arcade video games, walk or assemble colored bricks, can be used to perform other tasks that are currently at the core of engineering cyberphysical systems. We present the first use of RL for the control of systems modeled by discretized non-linear Partial Differential Equations (PDEs) and devise a novel algorithm to use non-parametric control techniques for large multi-agent systems. We show how neural network based RL enables the control of discretized PDEs whose parameters are unknown, random, and time-varying. We introduce an algorithm of Mutual Weight Regularization (MWR) which alleviates the curse of dimensionality of multi-agent control schemes by sharing experience between agents while giving each agent the opportunity to specialize its action policy so as to tailor it to the local parameters of the part of the system it is located in.

Submitted 30 January, 2017; originally announced January 2017.

arXiv:1603.03336 [pdf, other]

Scalable Linear Causal Inference for Irregularly Sampled Time Series with Long Range Dependencies

Authors: Francois W. Belletti, Evan R. Sparks, Michael J. Franklin, Alexandre M. Bayen, Joseph E. Gonzalez

Abstract: Linear causal analysis is central to a wide range of important application spanning finance, the physical sciences, and engineering. Much of the existing literature in linear causal analysis operates in the time domain. Unfortunately, the direct application of time domain linear causal analysis to many real-world time series presents three critical challenges: irregular temporal sampling, long range dependencies, and scale. Moreover, real-world data is often collected at irregular time intervals across vast arrays of decentralized sensors and with long range dependencies which make naive time domain correlation estimators spurious. In this paper we present a frequency domain based estimation framework which naturally handles irregularly sampled data and long range dependencies while enabled memory and communication efficient distributed processing of time series data. By operating in the frequency domain we eliminate the need to interpolate and help mitigate the effects of long range dependencies. We implement and evaluate our new work-flow in the distributed setting using Apache Spark and demonstrate on both Monte Carlo simulations and high-frequency financial trading that we can accurately recover causal structure at scale.

Submitted 10 March, 2016; originally announced March 2016.

arXiv:1601.04041 [pdf, other]

doi 10.1109/CDC.2015.7402640

Differential Privacy of Populations in Routing Games

Authors: Roy Dong, Walid Krichene, Alexandre M. Bayen, S. Shankar Sastry

Abstract: As our ground transportation infrastructure modernizes, the large amount of data being measured, transmitted, and stored motivates an analysis of the privacy aspect of these emerging cyber-physical technologies. In this paper, we consider privacy in the routing game, where the origins and destinations of drivers are considered private. This is motivated by the fact that this spatiotemporal information can easily be used as the basis for inferences about a person's activities. More specifically, we consider the differential privacy of the mapping from the amount of flow for each origin-destination pair to the traffic flow measurements on each link of a traffic network. We use a stochastic online learning framework for the population dynamics, which is known to converge to the Nash equilibrium of the routing game. We analyze the sensitivity of this process and provide theoretical guarantees on the convergence rates as well as differential privacy values for these models. We confirm these with simulations on a small example.

Submitted 15 January, 2016; originally announced January 2016.

Comments: Extended draft of paper that appears in 2015 IEEE CDC

arXiv:1511.06493 [pdf, other]

Embarrassingly Parallel Time Series Analysis for Large Scale Weak Memory Systems

Authors: Francois Belletti, Evan Sparks, Michael Franklin, Alexandre M. Bayen

Abstract: Second order stationary models in time series analysis are based on the analysis of essential statistics whose computations follow a common pattern. In particular, with a map-reduce nomenclature, most of these operations can be modeled as mapping a kernel that only depends on short windows of consecutive data and reducing the results produced by each computation. This computational pattern stems from the ergodicity of the model under consideration and is often referred to as weak or short memory when it comes to data indexed with respect to time. In the following we will show how studying weak memory systems can be done in a scalable manner thanks to a framework relying on specifically designed overlapping distributed data structures that enable fragmentation and replication of the data across many machines as well as parallelism in computations. This scheme has been implemented for Apache Spark but is certainly not system specific. Indeed we prove it is also adapted to leveraging high bandwidth fragmented memory blocks on GPUs.

Submitted 20 November, 2015; originally announced November 2015.

MSC Class: 68M14; 37M10; 62M10

arXiv:1408.0017 [pdf, other]

Learning Nash Equilibria in Congestion Games

Authors: Walid Krichene, Benjamin Drighès, Alexandre M. Bayen

Abstract: We study the repeated congestion game, in which multiple populations of players share resources, and make, at each iteration, a decentralized decision on which resources to utilize. We investigate the following question: given a model of how individual players update their strategies, does the resulting dynamics of strategy profiles converge to the set of Nash equilibria of the one-shot game? We consider in particular a model in which players update their strategies using algorithms with sublinear discounted regret. We show that the resulting sequence of strategy profiles converges to the set of Nash equilibria in the sense of Cesàro means. However, strong convergence is not guaranteed in general. We show that strong convergence can be guaranteed for a class of algorithms with a vanishing upper bound on discounted regret, and which satisfy an additional condition. We call such algorithms AREP algorithms, for Approximate REPlicator, as they can be interpreted as a discrete-time approximation of the replicator equation, which models the continuous-time evolution of population strategies, and which is known to converge for the class of congestion games. In particular, we show that the discounted Hedge algorithm belongs to the AREP class, which guarantees its strong convergence.

Submitted 31 July, 2014; originally announced August 2014.

arXiv:1406.5765 [pdf]

doi 10.1109/IECON.2014.7049320

Environmental Sensing by Wearable Device for Indoor Activity and Location Estimation

Authors: Ming Jin, Han Zou, Kevin Weekly, Ruoxi Jia, Alexandre M. Bayen, Costas J. Spanos

Abstract: We present results from a set of experiments in this pilot study to investigate the causal influence of user activity on various environmental parameters monitored by occupant-carried multi-purpose sensors. Hypotheses with respect to each type of measurement are verified, including temperature, humidity, and light level collected during eight typical activities: sitting in lab / cubicle, indoor walking / running, resting after physical activity, climbing stairs, taking elevators, and outdoor walking. Our main contribution is the development of features for activity and location recognition based on environmental measurements, which exploit location- and activity-specific characteristics and capture the trends resulting from the underlying physiological process. The features are statistically shown to have good separability and are also information-rich. Fusing environmental sensing together with acceleration is shown to achieve classification accuracy as high as 99.13%. For building applications, this study motivates a sensor fusion paradigm for learning individualized activity, location, and environmental preferences for energy management and user comfort.

Submitted 22 June, 2014; originally announced June 2014.

Comments: submitted to the 40th Annual Conference of the IEEE Industrial Electronics Society (IECON)

arXiv:1403.5085 [pdf, ps, other]

Modeling and Estimation of the Humans' Effect on the CO2 Dynamics Inside a Conference Room

Authors: Kevin Weekly, Nikolaos Bekiaris-Liberis, Alexandre M. Bayen

Abstract: We develop a data-driven, Partial Differential Equation-Ordinary Differential Equation (PDE-ODE) model that describes the response of the Carbon Dioxide (CO2) dynamics inside a conference room, due to the presence of humans, or of a user-controlled exogenous source of CO2. We conduct two controlled experiments in order to develop and tune a model whose output matches the measured output concentration of CO2 inside the room, when known inputs are applied to the model. In the first experiment, a controlled amount of CO2 gas is released inside the room from a regulated supply, and in the second, a known number of humans produce a certain amount of CO2 inside the room. For the estimation of the exogenous inputs, we design an observer, based on our model, using measurements of CO2 concentrations at two locations inside the room. Parameter identifiers are also designed, based on our model, for the online estimation of the parameters of the model. We perform several simulation studies for the illustration of our designs.

Submitted 20 March, 2014; originally announced March 2014.

arXiv:1312.1075 [pdf, ps, other]

A Necessary and Sufficient Condition for the Existence of Potential Functions for Heterogeneous Routing Games

Authors: Farhad Farokhi, Walid Krichene, Alexandre M. Bayen, Karl H. Johansson

Abstract: We study a heterogeneous routing game in which vehicles might belong to more than one type. The type determines the cost of traveling along an edge as a function of the flow of various types of vehicles over that edge. We relax the assumptions needed for the existence of a Nash equilibrium in this heterogeneous routing game. We extend the available results to present necessary and sufficient conditions for the existence of a potential function. We characterize a set of tolls that guarantee the existence of a potential function when only two types of users are participating in the game. We present an upper bound for the price of anarchy (i.e., the worst-case ratio of the social cost calculated for a Nash equilibrium over the social cost for a socially optimal flow) for the case in which only two types of players are participating in a game with affine edge cost functions. A heterogeneous routing game with vehicle platooning incentives is used as an example throughout the article to clarify the concepts and to validate the results.

Submitted 3 February, 2014; v1 submitted 4 December, 2013; originally announced December 2013.

Comments: Improved Literature Review; Updated Introduction

arXiv:1212.5327 [pdf]

doi 10.1038/srep01001

Understanding Road Usage Patterns in Urban Areas

Authors: Pu Wang, Timothy Hunter, Alexandre M. Bayen, Katja Schechtner, Marta C. González

Abstract: In this paper, we combine the most complete record of daily mobility, based on large-scale mobile phone data, with detailed Geographic Information System (GIS) data, uncovering previously hidden patterns in urban road usage. We find that the major usage of each road segment can be traced to its own - surprisingly few - driver sources. Based on this finding we propose a network of road usage by defining a bipartite network framework, demonstrating that in contrast to traditional approaches, which define road importance solely by topological measures, the role of a road segment depends on both its betweenness and its degree in the road usage network. Moreover, our ability to pinpoint the few driver sources contributing to the major traffic flow allows us to create a strategy that achieves a significant reduction of the travel time across the entire road system, compared to a benchmark approach.

Submitted 20 December, 2012; originally announced December 2012.

Comments: 47 pages, 24 figures

Journal ref: Scientific Reports 2:1001 (2012)

arXiv:1212.3393 [pdf, other]

Large Scale Estimation in Cyberphysical Systems using Streaming Data: a Case Study with Smartphone Traces

Authors: Timothy Hunter, Tathagata Das, Matei Zaharia, Pieter Abbeel, Alexandre M. Bayen

Abstract: Controlling and analyzing cyberphysical and robotics systems is increasingly becoming a Big Data challenge. Pushing this data to, and processing it in, the cloud is more efficient than on-board processing. However, current cloud-based solutions are not suitable for the latency requirements of these applications. We present a new concept, Discretized Streams or D-Streams, that enables massively scalable computations on streaming data with latencies as short as a second. We experiment with an implementation of D-Streams on top of the Spark computing framework. We demonstrate the usefulness of this concept with a novel algorithm to estimate vehicular traffic in urban networks. Our online EM algorithm can estimate traffic on a very large city network (the San Francisco Bay Area) by processing tens of thousands of observations per second, with a latency of a few seconds.

Submitted 14 December, 2012; originally announced December 2012.

Showing 1–38 of 38 results for author: Bayen, A M