-
Learning Generic and Dynamic Locomotion of Humanoids Across Discrete Terrains
Authors:
Shangqun Yu,
Nisal Perera,
Daniel Marew,
Donghyun Kim
Abstract:
This paper addresses the challenge of terrain-adaptive dynamic locomotion in humanoid robots, a problem traditionally tackled by optimization-based methods or reinforcement learning (RL). Optimization-based methods, such as model-predictive control, excel in finding optimal reaction forces and achieving agile locomotion, especially in quadruped, but struggle with the nonlinear hybrid dynamics of l…
▽ More
This paper addresses the challenge of terrain-adaptive dynamic locomotion in humanoid robots, a problem traditionally tackled by optimization-based methods or reinforcement learning (RL). Optimization-based methods, such as model-predictive control, excel in finding optimal reaction forces and achieving agile locomotion, especially in quadruped, but struggle with the nonlinear hybrid dynamics of legged systems and the real-time computation of step location, timing, and reaction forces. Conversely, RL-based methods show promise in navigating dynamic and rough terrains but are limited by their extensive data requirements. We introduce a novel locomotion architecture that integrates a neural network policy, trained through RL in simplified environments, with a state-of-the-art motion controller combining model-predictive control (MPC) and whole-body impulse control (WBIC). The policy efficiently learns high-level locomotion strategies, such as gait selection and step positioning, without the need for full dynamics simulations. This control architecture enables humanoid robots to dynamically navigate discrete terrains, making strategic locomotion decisions (e.g., walking, jum**, and lea**) based on ground height maps. Our results demonstrate that this integrated control architecture achieves dynamic locomotion with significantly fewer training samples than conventional RL-based methods and can be transferred to different humanoid platforms without additional training. The control architecture has been extensively tested in dynamic simulations, accomplishing terrain height-based dynamic locomotion for three different robots.
△ Less
Submitted 27 May, 2024;
originally announced May 2024.
-
StaccaToe: A Single-Leg Robot that Mimics the Human Leg and Toe
Authors:
Nisal Perera,
Shangqun Yu,
Daniel Marew,
Mack Tang,
Ken Suzuki,
Aidan McCormack,
Shifan Zhu,
Yong-Jae Kim,
Donghyun Kim
Abstract:
We introduce StaccaToe, a human-scale, electric motor-powered single-leg robot designed to rival the agility of human locomotion through two distinctive attributes: an actuated toe and a co-actuation configuration inspired by the human leg. Leveraging the foundational design of HyperLeg's lower leg mechanism, we develop a stand-alone robot by incorporating new link designs, custom-designed power e…
▽ More
We introduce StaccaToe, a human-scale, electric motor-powered single-leg robot designed to rival the agility of human locomotion through two distinctive attributes: an actuated toe and a co-actuation configuration inspired by the human leg. Leveraging the foundational design of HyperLeg's lower leg mechanism, we develop a stand-alone robot by incorporating new link designs, custom-designed power electronics, and a refined control system. Unlike previous jum** robots that rely on either special mechanisms (e.g., springs and clutches) or hydraulic/pneumatic actuators, StaccaToe employs electric motors without energy storage mechanisms. This choice underscores our ultimate goal of develo** a practical, high-performance humanoid robot capable of human-like, stable walking as well as explosive dynamic movements. In this paper, we aim to empirically evaluate the balance capability and the exertion of explosive ground reaction forces of our toe and co-actuation mechanisms. Throughout extensive hardware and controller development, StaccaToe showcases its control fidelity by demonstrating a balanced tip-toe stance and dynamic jump. This study is significant for three key reasons: 1) StaccaToe represents the first human-scale, electric motor-driven single-leg robot to execute dynamic maneuvers without relying on specialized mechanisms; 2) our research provides empirical evidence of the benefits of replicating critical human leg attributes in robotic design; and 3) we explain the design process for creating agile legged robots, the details that have been scantily covered in academic literature.
△ Less
Submitted 7 April, 2024;
originally announced April 2024.
-
Design and Implementation of an Analysis Pipeline for Heterogeneous Data
Authors:
Arup Kumar Sarker,
Aymen Alsaadi,
Niranda Perera,
Mills Staylor,
Gregor von Laszewski,
Matteo Turilli,
Ozgur Ozan Kilic,
Mikhail Titov,
Andre Merzky,
Shantenu Jha,
Geoffrey Fox
Abstract:
Managing and preparing complex data for deep learning, a prevalent approach in large-scale data science can be challenging. Data transfer for model training also presents difficulties, impacting scientific fields like genomics, climate modeling, and astronomy. A large-scale solution like Google Pathways with a distributed execution environment for deep learning models exists but is proprietary. In…
▽ More
Managing and preparing complex data for deep learning, a prevalent approach in large-scale data science can be challenging. Data transfer for model training also presents difficulties, impacting scientific fields like genomics, climate modeling, and astronomy. A large-scale solution like Google Pathways with a distributed execution environment for deep learning models exists but is proprietary. Integrating existing open-source, scalable runtime tools and data frameworks on high-performance computing (HPC) platforms is crucial to address these challenges. Our objective is to establish a smooth and unified method of combining data engineering and deep learning frameworks with diverse execution capabilities that can be deployed on various high-performance computing platforms, including cloud and supercomputers. We aim to support heterogeneous systems with accelerators, where Cylon and other data engineering and deep learning frameworks can utilize heterogeneous execution. To achieve this, we propose Radical-Cylon, a heterogeneous runtime system with a parallel and distributed data framework to execute Cylon as a task of Radical Pilot. We thoroughly explain Radical-Cylon's design and development and the execution process of Cylon tasks using Radical Pilot. This approach enables the use of heterogeneous MPI-communicators across multiple nodes. Radical-Cylon achieves better performance than Bare-Metal Cylon with minimal and constant overhead. Radical-Cylon achieves (4~15)% faster execution time than batch execution while performing similar join and sort operations with 35 million and 3.5 billion rows with the same resources. The approach aims to excel in both scientific and engineering research HPC systems while demonstrating robust performance on cloud infrastructures. This dual capability fosters collaboration and innovation within the open-source scientific research community.
△ Less
Submitted 7 April, 2024; v1 submitted 23 March, 2024;
originally announced March 2024.
-
In-depth Analysis On Parallel Processing Patterns for High-Performance Dataframes
Authors:
Niranda Perera,
Arup Kumar Sarker,
Mills Staylor,
Gregor von Laszewski,
Kaiying Shan,
Supun Kamburugamuve,
Chathura Widanage,
Vibhatha Abeykoon,
Thejaka Amila Kanewela,
Geoffrey Fox
Abstract:
The Data Science domain has expanded monumentally in both research and industry communities during the past decade, predominantly owing to the Big Data revolution. Artificial Intelligence (AI) and Machine Learning (ML) are bringing more complexities to data engineering applications, which are now integrated into data processing pipelines to process terabytes of data. Typically, a significant amoun…
▽ More
The Data Science domain has expanded monumentally in both research and industry communities during the past decade, predominantly owing to the Big Data revolution. Artificial Intelligence (AI) and Machine Learning (ML) are bringing more complexities to data engineering applications, which are now integrated into data processing pipelines to process terabytes of data. Typically, a significant amount of time is spent on data preprocessing in these pipelines, and hence improving its e fficiency directly impacts the overall pipeline performance. The community has recently embraced the concept of Dataframes as the de-facto data structure for data representation and manipulation. However, the most widely used serial Dataframes today (R, pandas) experience performance limitations while working on even moderately large data sets. We believe that there is plenty of room for improvement by taking a look at this problem from a high-performance computing point of view. In a prior publication, we presented a set of parallel processing patterns for distributed dataframe operators and the reference runtime implementation, Cylon [1]. In this paper, we are expanding on the initial concept by introducing a cost model for evaluating the said patterns. Furthermore, we evaluate the performance of Cylon on the ORNL Summit supercomputer.
△ Less
Submitted 3 July, 2023;
originally announced July 2023.
-
Supercharging Distributed Computing Environments For High Performance Data Engineering
Authors:
Niranda Perera,
Kaiying Shan,
Supun Kamburugamuwe,
Thejaka Amila Kanewela,
Chathura Widanage,
Arup Sarker,
Mills Staylor,
Tianle Zhong,
Vibhatha Abeykoon,
Geoffrey Fox
Abstract:
The data engineering and data science community has embraced the idea of using Python & R dataframes for regular applications. Driven by the big data revolution and artificial intelligence, these applications are now essential in order to process terabytes of data. They can easily exceed the capabilities of a single machine, but also demand significant developer time & effort. Therefore it is esse…
▽ More
The data engineering and data science community has embraced the idea of using Python & R dataframes for regular applications. Driven by the big data revolution and artificial intelligence, these applications are now essential in order to process terabytes of data. They can easily exceed the capabilities of a single machine, but also demand significant developer time & effort. Therefore it is essential to design scalable dataframe solutions. There have been multiple attempts to tackle this problem, the most notable being the dataframe systems developed using distributed computing environments such as Dask and Ray. Even though Dask/Ray distributed computing features look very promising, we perceive that the Dask Dataframes/Ray Datasets still have room for optimization. In this paper, we present CylonFlow, an alternative distributed dataframe execution methodology that enables state-of-the-art performance and scalability on the same Dask/Ray infrastructure (thereby supercharging them!). To achieve this, we integrate a high performance dataframe system Cylon, which was originally based on an entirely different execution paradigm, into Dask and Ray. Our experiments show that on a pipeline of dataframe operators, CylonFlow achieves 30x more distributed performance than Dask Dataframes. Interestingly, it also enables superior sequential performance due to the native C++ execution of Cylon. We believe the success of Cylon & CylonFlow extends beyond the data engineering domain, and can be used to consolidate high performance computing and distributed computing ecosystems.
△ Less
Submitted 19 January, 2023;
originally announced January 2023.
-
Hybrid Cloud and HPC Approach to High-Performance Dataframes
Authors:
Kaiying Shan,
Niranda Perera,
Damitha Lenadora,
Tianle Zhong,
Arup Sarker,
Supun Kamburugamuve,
Thejaka Amila Kanewela,
Chathura Widanage,
Geoffrey Fox
Abstract:
Data pre-processing is a fundamental component in any data-driven application. With the increasing complexity of data processing operations and volume of data, Cylon, a distributed dataframe system, is developed to facilitate data processing both as a standalone application and as a library, especially for Python applications. While Cylon shows promising performance results, we experienced difficu…
▽ More
Data pre-processing is a fundamental component in any data-driven application. With the increasing complexity of data processing operations and volume of data, Cylon, a distributed dataframe system, is developed to facilitate data processing both as a standalone application and as a library, especially for Python applications. While Cylon shows promising performance results, we experienced difficulties trying to integrate with frameworks incompatible with the traditional Message Passing Interface (MPI). While MPI implementations encompass scalable and efficient communication routines, their process launching mechanisms work well with mainstream HPC systems but are incompatible with some environments that adopt their own resource management systems. In this work, we alleviated this issue by directly integrating the Unified Communication X (UCX) framework, which supports a variety of classic HPC and non-HPC process-bootstrap** mechanisms as our communication framework. While we experimented with our methodology on Cylon, the same technique can be used to bring MPI communication to other applications that do not employ MPI's built-in process management approach.
△ Less
Submitted 29 December, 2022; v1 submitted 28 December, 2022;
originally announced December 2022.
-
Evaluation of Question Answering Systems: Complexity of judging a natural language
Authors:
Amer Farea,
Zhen Yang,
Kien Duong,
Nadeesha Perera,
Frank Emmert-Streib
Abstract:
Question answering (QA) systems are among the most important and rapidly develo** research topics in natural language processing (NLP). A reason, therefore, is that a QA system allows humans to interact more naturally with a machine, e.g., via a virtual assistant or search engine. In the last decades, many QA systems have been proposed to address the requirements of different question-answering…
▽ More
Question answering (QA) systems are among the most important and rapidly develo** research topics in natural language processing (NLP). A reason, therefore, is that a QA system allows humans to interact more naturally with a machine, e.g., via a virtual assistant or search engine. In the last decades, many QA systems have been proposed to address the requirements of different question-answering tasks. Furthermore, many error scores have been introduced, e.g., based on n-gram matching, word embeddings, or contextual embeddings to measure the performance of a QA system. This survey attempts to provide a systematic overview of the general framework of QA, QA paradigms, benchmark datasets, and assessment techniques for a quantitative evaluation of QA systems. The latter is particularly important because not only is the construction of a QA system complex but also its evaluation. We hypothesize that a reason, therefore, is that the quantitative formalization of human judgment is an open problem.
△ Less
Submitted 10 September, 2022;
originally announced September 2022.
-
High Performance Dataframes from Parallel Processing Patterns
Authors:
Niranda Perera,
Supun Kamburugamuve,
Chathura Widanage,
Vibhatha Abeykoon,
Ahmet Uyar,
Kaiying Shan,
Hasara Maithree,
Damitha Lenadora,
Thejaka Amila Kanewala,
Geoffrey Fox
Abstract:
The data science community today has embraced the concept of Dataframes as the de facto standard for data representation and manipulation. Ease of use, massive operator coverage, and popularization of R and Python languages have heavily influenced this transformation. However, most widely used serial Dataframes today (R, pandas) experience performance limitations even while working on even moderat…
▽ More
The data science community today has embraced the concept of Dataframes as the de facto standard for data representation and manipulation. Ease of use, massive operator coverage, and popularization of R and Python languages have heavily influenced this transformation. However, most widely used serial Dataframes today (R, pandas) experience performance limitations even while working on even moderately large data sets. We believe that there is plenty of room for improvement by investigating the generic distributed patterns of dataframe operators. In this paper, we propose a framework that lays the foundation for building high performance distributed-memory parallel dataframe systems based on these parallel processing patterns. We also present Cylon, as a reference runtime implementation. We demonstrate how this framework has enabled Cylon achieving scalable high performance. We also underline the flexibility of the proposed API and the extensibility of the framework on different hardware. To the best of our knowledge, Cylon is the first and only distributed-memory parallel dataframe system available today.
△ Less
Submitted 13 September, 2022;
originally announced September 2022.
-
HPTMT Parallel Operators for High Performance Data Science & Data Engineering
Authors:
Vibhatha Abeykoon,
Supun Kamburugamuve,
Chathura Widanage,
Niranda Perera,
Ahmet Uyar,
Thejaka Amila Kanewala,
Gregor von Laszewski,
Geoffrey Fox
Abstract:
Data-intensive applications are becoming commonplace in all science disciplines. They are comprised of a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstractions and operators that suit the applications of different domains. Often lack of a clear definition of data structures and operators in the field ha…
▽ More
Data-intensive applications are becoming commonplace in all science disciplines. They are comprised of a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstractions and operators that suit the applications of different domains. Often lack of a clear definition of data structures and operators in the field has led to other implementations that do not work well together. The HPTMT architecture that we proposed recently, identifies a set of data structures, operators, and an execution model for creating rich data applications that links all aspects of data engineering and data science together efficiently. This paper elaborates and illustrates this architecture using an end-to-end application with deep learning and data engineering parts working together.
△ Less
Submitted 12 August, 2021;
originally announced August 2021.
-
HPTMT: Operator-Based Architecture for Scalable High-Performance Data-Intensive Frameworks
Authors:
Supun Kamburugamuve,
Chathura Widanage,
Niranda Perera,
Vibhatha Abeykoon,
Ahmet Uyar,
Thejaka Amila Kanewala,
Gregor von Laszewski,
Geoffrey Fox
Abstract:
Data-intensive applications impact many domains, and their steadily increasing size and complexity demands high-performance, highly usable environments. We integrate a set of ideas developed in various data science and data engineering frameworks. They employ a set of operators on specific data abstractions that include vectors, matrices, tensors, graphs, and tables. Our key concepts are inspired…
▽ More
Data-intensive applications impact many domains, and their steadily increasing size and complexity demands high-performance, highly usable environments. We integrate a set of ideas developed in various data science and data engineering frameworks. They employ a set of operators on specific data abstractions that include vectors, matrices, tensors, graphs, and tables. Our key concepts are inspired from systems like MPI, HPF (High-Performance Fortran), NumPy, Pandas, Spark, Modin, PyTorch, TensorFlow, RAPIDS(NVIDIA), and OneAPI (Intel). Further, it is crucial to support different languages in everyday use in the Big Data arena, including Python, R, C++, and Java. We note the importance of Apache Arrow and Parquet for enabling language agnostic high performance and interoperability. In this paper, we propose High-Performance Tensors, Matrices and Tables (HPTMT), an operator-based architecture for data-intensive applications, and identify the fundamental principles needed for performance and usability success. We illustrate these principles by a discussion of examples using our software environments, Cylon and Twister2 that embody HPTMT.
△ Less
Submitted 29 July, 2021; v1 submitted 27 July, 2021;
originally announced July 2021.
-
A Fast, Scalable, Universal Approach For Distributed Data Aggregations
Authors:
Niranda Perera,
Vibhatha Abeykoon,
Chathura Widanage,
Supun Kamburugamuve,
Thejaka Amila Kanewala,
Pulasthi Wickramasinghe,
Ahmet Uyar,
Hasara Maithree,
Damitha Lenadora,
Geoffrey Fox
Abstract:
In the current era of Big Data, data engineering has transformed into an essential field of study across many branches of science. Advancements in Artificial Intelligence (AI) have broadened the scope of data engineering and opened up new applications in both enterprise and research communities. Aggregations (also termed reduce in functional programming) are an integral functionality in these appl…
▽ More
In the current era of Big Data, data engineering has transformed into an essential field of study across many branches of science. Advancements in Artificial Intelligence (AI) have broadened the scope of data engineering and opened up new applications in both enterprise and research communities. Aggregations (also termed reduce in functional programming) are an integral functionality in these applications. They are traditionally aimed at generating meaningful information on large data-sets, and today, they are being used for engineering more effective features for complex AI models. Aggregations are usually carried out on top of data abstractions such as tables/ arrays and are combined with other operations such as grou** of values. There are frameworks that excel in the said domains individually. But, we believe that there is an essential requirement for a data analytics tool that can universally integrate with existing frameworks, and thereby increase the productivity and efficiency of the entire data analytics pipeline. Cylon endeavors to fulfill this void. In this paper, we present Cylon's fast and scalable aggregation operations implemented on top of a distributed in-memory table structure that universally integrates with existing frameworks.
△ Less
Submitted 14 December, 2020; v1 submitted 27 October, 2020;
originally announced October 2020.
-
Data Engineering for HPC with Python
Authors:
Vibhatha Abeykoon,
Niranda Perera,
Chathura Widanage,
Supun Kamburugamuve,
Thejaka Amila Kanewala,
Hasara Maithree,
Pulasthi Wickramasinghe,
Ahmet Uyar,
Geoffrey Fox
Abstract:
Data engineering is becoming an increasingly important part of scientific discoveries with the adoption of deep learning and machine learning. Data engineering deals with a variety of data formats, storage, data extraction, transformation, and data movements. One goal of data engineering is to transform data from original data to vector/matrix/tensor formats accepted by deep learning and machine l…
▽ More
Data engineering is becoming an increasingly important part of scientific discoveries with the adoption of deep learning and machine learning. Data engineering deals with a variety of data formats, storage, data extraction, transformation, and data movements. One goal of data engineering is to transform data from original data to vector/matrix/tensor formats accepted by deep learning and machine learning applications. There are many structures such as tables, graphs, and trees to represent data in these data engineering phases. Among them, tables are a versatile and commonly used format to load and process data. In this paper, we present a distributed Python API based on table abstraction for representing and processing data. Unlike existing state-of-the-art data engineering tools written purely in Python, our solution adopts high performance compute kernels in C++, with an in-memory table representation with Cython-based Python bindings. In the core system, we use MPI for distributed memory computations with a data-parallel approach for processing large datasets in HPC clusters.
△ Less
Submitted 13 October, 2020;
originally announced October 2020.
-
High Performance Data Engineering Everywhere
Authors:
Chathura Widanage,
Niranda Perera,
Vibhatha Abeykoon,
Supun Kamburugamuve,
Thejaka Amila Kanewala,
Hasara Maithree,
Pulasthi Wickramasinghe,
Ahmet Uyar,
Gurhan Gunduz,
Geoffrey Fox
Abstract:
The amazing advances being made in the fields of machine and deep learning are a highlight of the Big Data era for both enterprise and research communities. Modern applications require resources beyond a single node's ability to provide. However this is just a small part of the issues facing the overall data processing environment, which must also support a raft of data engineering for pre- and po…
▽ More
The amazing advances being made in the fields of machine and deep learning are a highlight of the Big Data era for both enterprise and research communities. Modern applications require resources beyond a single node's ability to provide. However this is just a small part of the issues facing the overall data processing environment, which must also support a raft of data engineering for pre- and post-data processing, communication, and system integration. An important requirement of data analytics tools is to be able to easily integrate with existing frameworks in a multitude of languages, thereby increasing user productivity and efficiency. All this demands an efficient and highly distributed integrated approach for data processing, yet many of today's popular data analytics tools are unable to satisfy all these requirements at the same time.
In this paper we present Cylon, an open-source high performance distributed data processing library that can be seamlessly integrated with existing Big Data and AI/ML frameworks. It is developed with a flexible C++ core on top of a compact data structure and exposes language bindings to C++, Java, and Python. We discuss Cylon's architecture in detail, and reveal how it can be imported as a library to existing applications or operate as a standalone framework. Initial experiments show that Cylon enhances popular tools such as Apache Spark and Dask with major performance improvements for key operations and better component linkages. Finally, we show how its design enables Cylon to be used cross-platform with minimum overhead, which includes popular AI tools such as PyTorch, Tensorflow, and Jupyter notebooks.
△ Less
Submitted 19 July, 2020;
originally announced July 2020.
-
KeyXtract Twitter Model - An Essential Keywords Extraction Model for Twitter Designed using NLP Tools
Authors:
Tharindu Weerasooriya,
Nandula Perera,
S. R. Liyanage
Abstract:
Since a tweet is limited to 140 characters, it is ambiguous and difficult for traditional Natural Language Processing (NLP) tools to analyse. This research presents KeyXtract which enhances the machine learning based Stanford CoreNLP Part-of-Speech (POS) tagger with the Twitter model to extract essential keywords from a tweet. The system was developed using rule-based parsers and two corpora. The…
▽ More
Since a tweet is limited to 140 characters, it is ambiguous and difficult for traditional Natural Language Processing (NLP) tools to analyse. This research presents KeyXtract which enhances the machine learning based Stanford CoreNLP Part-of-Speech (POS) tagger with the Twitter model to extract essential keywords from a tweet. The system was developed using rule-based parsers and two corpora. The data for the research was obtained from a Twitter profile of a telecommunication company. The system development consisted of two stages. At the initial stage, a domain specific corpus was compiled after analysing the tweets. The POS tagger extracted the Noun Phrases and Verb Phrases while the parsers removed noise and extracted any other keywords missed by the POS tagger. The system was evaluated using the Turing Test. After it was tested and compared against Stanford CoreNLP, the second stage of the system was developed addressing the shortcomings of the first stage. It was enhanced using Named Entity Recognition and Lemmatization. The second stage was also tested using the Turing test and its pass rate increased from 50.00% to 83.33%. The performance of the final system output was measured using the F1 score. Stanford CoreNLP with the Twitter model had an average F1 of 0.69 while the improved system had a F1 of 0.77. The accuracy of the system could be improved by using a complete domain specific corpus. Since the system used linguistic features of a sentence, it could be applied to other NLP tools.
△ Less
Submitted 9 August, 2017;
originally announced August 2017.