Search | arXiv e-print repository

Node Compass: Multilevel Tracing and Debugging of Request Executions in JavaScript-Based Web-Servers

Authors: Herve Mbikayi Kabamba, Matthew Khouzam, Michel Dagenais

Abstract: Adequate consideration is crucial to ensure that services in a distributed application context are running satisfactorily with the resources available. Due to the asynchronous nature of tasks and the need to work with multiple layers that deliver coordinated results in a single-threaded context, analysing performance is a challenging task in event-loop-based systems. The existing performance analysis methods for environments such as Node.js rely on higher-level instrumentation but lack precision, as they cannot capture the relevant underlying application flow. As a solution, we propose a streamlined method for recovering the asynchronous execution path of requests called the Nested Bounded Context Algorithm. The proposed technique tracks the application execution flow through multiple layers and showcases it on an interactive interface for further assessment. Furthermore, we introduce the vertical span concept. This representation of a span as a multidimensional object (horizontal and vertical) with a start and end of execution, along with its sub-layers and triggered operations, enables the granular identification and diagnosis of performance issues. We proposed a new technique called the Bounded Context Tracking Algorithm for event matching and request reassembling in a multi-layer trace. The two techniques allow aligning the executions of the request in a tree-based data structure for developed visualisations. These visualisations permit performance debugging of complex performance issues in Node.js.

Submitted 19 November, 2023; originally announced January 2024.

arXiv:2311.11230 [pdf, other]

Advanced Strategies for Precise and Transparent Debugging of Performance Issues in In-Memory Data Store-Based Microservices

Authors: Herve Mbikayi Kabamba, Matthew Khouzam, Michel Dagenais

Abstract: The rise of microservice architectures has revolutionized application design, fostering adaptability and resilience. These architectures facilitate scaling and encourage collaborative efforts among specialized teams, streamlining deployment and maintenance. Critical to this ecosystem is the demand for low latency, prompting the adoption of cloud-based structures and in-memory data storage. This shift optimizes data access times, supplanting direct disk access and driving the adoption of non-relational databases. Despite their benefits, microservice architectures present challenges in system performance and debugging, particularly as complexity grows. Performance issues can readily cascade through components, jeopardizing user satisfaction and service quality. Existing monitoring approaches often require code instrumentation, demanding extensive developer involvement. Recent strategies like proxies and service meshes aim to enhance tracing transparency, but introduce added configuration complexities. Our innovative solution introduces a new framework that transparently integrates heterogeneous microservices, enabling the creation of tailored tools for fine-grained performance debugging, especially for in-memory data store-based microservices. This approach leverages transparent user-level tracing, employing a two-level abstraction analysis model to pinpoint key performance influencers. It harnesses system tracing and advanced analysis to provide visualization tools for identifying intricate performance issues. In a performance-centric landscape, this approach offers a promising solution to ensure peak efficiency and reliability for in-memory data store-based cloud applications.

Submitted 19 November, 2023; originally announced November 2023.

arXiv:2311.11095 [pdf, other]

Vnode: Low-overhead Transparent Tracing of Node.js-based Microservice Architectures

Authors: Herve Mbikayi Kabamba, Matthew Khouzam, Michel Dagenais

Abstract: Tracing serves as a key method for evaluating the performance of microservices-based architectures, which are renowned for their scalability, resource efficiency, and high availability. Despite their advantages, these architectures often pose unique debugging challenges that necessitate trade-offs, including the burden of instrumentation overhead. With Node.js emerging as a leading development environment, recognized for its rapidly growing ecosystem, there is a pressing need for innovative approaches that reduce the telemetry data collection efforts, and the overhead incurred by the environment instrumentation. In response, we introduce a new approach designed for transparent tracing and seamless deployment of microservices in cloud settings. This approach is centered around our newly developed Internal Transparent Tracing and Context Reconstruction (ITTCR) algorithm. ITTCR is adept at correlating internal metrics from various distributed trace files, to reconstruct the intricate execution contexts of microservices operating in a Node.js environment. Our method achieves transparency by directly instrumenting the Node.js virtual machine, enabling the collection and analysis of trace events in a transparent manner. This process facilitates the creation of visualization tools, enhancing the understanding and analysis of microservice performance in cloud environments.

Submitted 18 November, 2023; originally announced November 2023.

arXiv:2204.10208 [pdf, other]

doi 10.1016/j.robot.2022.104361

Message Flow Analysis with Complex Causal Links for Distributed ROS 2 Systems

Authors: Christophe Bédard, Pierre-Yves Lajoie, Giovanni Beltrame, Michel Dagenais

Abstract: Distributed robotic systems rely heavily on the publish-subscribe communication paradigm and middleware frameworks that support it, such as the Robot Operating System (ROS), to efficiently implement modular computation graphs. The ROS 2 executor, a high-level task scheduler which handles ROS 2 messages, is a performance bottleneck. We extend ros2_tracing, a framework with instrumentation and tools for real-time tracing of ROS 2, with the analysis and visualization of the flow of messages across distributed ROS 2 systems. Our method detects one-to-many and many-to-many causal links between input and output messages, including indirect causal links through simple user-level annotations. We validate our method on both synthetic and real robotic systems, and demonstrate its low runtime overhead. Moreover, the underlying intermediate execution representation database can be further leveraged to extract additional metrics and high-level results. This can provide valuable timing and scheduling information to further study and improve the ROS 2 executor as well as optimize any ROS 2 system. The source code is available at: https://github.com/christophebedard/ros2-message-flow-analysis.

Submitted 2 January, 2023; v1 submitted 21 April, 2022; originally announced April 2022.

Comments: 14 pages, 12 figures

Journal ref: Robotics and Autonomous Systems, vol. 161, p. 104361, March 2023

arXiv:2201.00393 [pdf, other]

doi 10.1109/LRA.2022.3174346

ros2_tracing: Multipurpose Low-Overhead Framework for Real-Time Tracing of ROS 2

Authors: Christophe Bédard, Ingo Lütkebohle, Michel Dagenais

Abstract: Testing and debugging have become major obstacles for robot software development, because of high system complexity and dynamic environments. Standard, middleware-based data recording does not provide sufficient information on internal computation and performance bottlenecks. Other existing methods also target very specific problems and thus cannot be used for multipurpose analysis. Moreover, they are not suitable for real-time applications. In this paper, we present ros2_tracing, a collection of flexible tracing tools and multipurpose instrumentation for ROS 2. It allows collecting runtime execution information on real-time distributed systems, using the low-overhead LTTng tracer. Tools also integrate tracing into the invaluable ROS 2 orchestration system and other usability tools. A message latency experiment shows that the end-to-end message latency overhead, when enabling all ROS 2 instrumentation, is on average 0.0033 ms, which we believe is suitable for production real-time systems. ROS 2 execution information obtained using ros2_tracing can be combined with trace data from the operating system, enabling a wider range of precise analyses, that help understand an application execution, to find the cause of performance bottlenecks and other issues. The source code is available at: https://github.com/ros2/ros2_tracing.

Submitted 30 July, 2022; v1 submitted 2 January, 2022; originally announced January 2022.

Comments: 8 pages, 8 figures, 3 tables

Journal ref: IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 6511-6518, July 2022

arXiv:2103.04954 [pdf, other]

doi 10.1109/ISSREW.2019.00102

Automatic Cause Detection of Performance Problems in Web Applications

Authors: Quentin Fournier, Naser Ezzati-Jivan, Daniel Aloise, Michel R. Dagenais

Abstract: The execution of similar units can be compared by their internal behaviors to determine the causes of their potential performance issues. For instance, by examining the internal behaviors of different fast or slow web requests more closely and by clustering and comparing their internal executions, one can determine what causes some requests to run slowly or behave in unexpected ways. In this paper, we propose a method of extracting the internal behavior of web requests as well as introduce a pipeline that detects performance issues in web requests and provides insights into their root causes. First, low-level and fine-grained information regarding each request is gathered by tracing both the user space and the kernel space. Second, further information is extracted and fed into an outlier detector. Finally, these outliers are then clustered by their behavior, and each group is analyzed separately. Experiments revealed that this pipeline is indeed able to detect slow web requests and provide additional insights into their true root causes. Notably, we were able to identify a real PHP cache contention using the proposed approach.

Submitted 8 March, 2021; originally announced March 2021.

Comments: 8 pages, 7 figures, IEEE ISSREW 2019

Journal ref: IEEE International Symposium on Software Reliability Engineering Workshops (2019) 398-405

arXiv:2103.04933 [pdf, other]

doi 10.1109/SCAM51674.2020.00022

DepGraph: Localizing Performance Bottlenecks in Multi-Core Applications Using Waiting Dependency Graphs and Software Tracing

Authors: Naser Ezzati-Jivan, Quentin Fournier, Michel R. Dagenais, Abdelwahab Hamou-Lhadj

Abstract: This paper addresses the challenge of understanding the waiting dependencies between the threads and hardware resources required to complete a task. The objective is to improve software performance by detecting the underlying bottlenecks caused by system-level blocking dependencies. In this paper, we use a system-level tracing approach to extract a Waiting Dependency Graph that shows the breakdown of a task execution among all the interleaving threads and resources. The method allows developers and system administrators to quickly discover how the total execution time is divided among its interacting threads and resources. Ultimately, the method helps detect bottlenecks and highlight their possible causes. Our experiments show the effectiveness of the proposed approach in several industry-level use cases. Three performance anomalies are analysed and explained using the proposed approach. Evaluating the method efficiency reveals that the imposed overhead never exceeds 10.1%, therefore making it suitable for in-production environments.

Submitted 8 March, 2021; originally announced March 2021.

Comments: 11 pages, 8 figures, IEEE SCAM 2020

Journal ref: IEEE 20th International Working Conference on Source Code Analysis and Manipulation (2020) 149-159

arXiv:cs/0508063 [pdf]

Disks, Partitions, Volumes and RAID Performance with the Linux Operating System

Authors: Michel R. Dagenais

Abstract: Block devices in computer operating systems typically correspond to disks or disk partitions, and are used to store files in a filesystem. Disks are not the only real or virtual devices which adhere to the block-accessible stream of bytes block device model. Files, remote devices, or even RAM may be used as virtual disks. This article examines several common combinations of block device layers used as virtual disks in the Linux operating system: disk partitions, loopback files, software RAID, Logical Volume Manager, and Network Block Devices. It measures their relative performance using different filesystems: Ext2, Ext3, ReiserFS, JFS, XFS, NFS.

Submitted 12 August, 2005; originally announced August 2005.

arXiv:cs/0507073 [pdf]

Software Performance Analysis

Authors: Michel R. Dagenais, Karim Yaghmour, Charles Levert, Makan Pourzandi

Abstract: The key to speeding up applications is often understanding where the elapsed time is spent, and why. This document reviews in depth the full array of performance analysis tools and techniques available on Linux for this task, from the traditional tools like gcov and gprof, to the more advanced tools still under development like oprofile and the Linux Trace Toolkit. The focus is more on the underlying data collection and processing algorithms, and their overhead and precision, than on the cosmetic details of the graphical user interface frontends.

Submitted 29 July, 2005; originally announced July 2005.

arXiv:cs/0506035 [pdf, ps, other]

Fast Recompilation of Object Oriented Modules

Authors: Jerome Collin, Michel Dagenais

Abstract: Once a program file is modified, the recompilation time should be minimized, without sacrificing execution speed or high level object oriented features. The recompilation time is often a problem for the large graphical interactive distributed applications tackled by modern OO languages. A compilation server and fast code generator were developed and integrated with the SRC Modula-3 compiler and Linux ELF dynamic linker. The resulting compilation and recompilation speedups are impressive. The impact of different language features, processor speed, and application size is discussed.

Submitted 10 June, 2005; originally announced June 2005.

arXiv:cs/0412039 [pdf]

Security in Carrier Class Server Applications for All-IP Networks

Authors: Marc Chatel, Michel Dagenais, Charles Levert, Makan Pourzandi

Abstract: A revolution is taking place in telecommunication networks. New services are appearing on platforms such as third generation cellular phones (3G) and broadband Internet access. This motivates the transition from mostly switched to all-IP networks. The replacement of the traditional shallow and well-defined interface to telephony networks brings increased flexibility, but also makes the network correspondingly difficult to properly secure. This paper surveys the implications of this transition on security issues in telecom applications. It does not give an exhaustive list of security tools or security protocols. Its goal is rather to introduce the reader to the security issues brought to carrier class servers by this revolution.

Submitted 9 December, 2004; originally announced December 2004.

Comments: Survey paper on the challenges of all IP networks in telecom applications

Showing 1–11 of 11 results for author: Dagenais, M