-
Data Extraction, Transformation, and Loading Process Automation for Algorithmic Trading Machine Learning Modelling and Performance Optimization
Authors:
Nassi Ebadifard,
Ajitesh Parihar,
Youry Khmelevsky,
Gaetan Hains,
Albert Wong,
Frank Zhang
Abstract:
A data warehouse efficiently prepares data for effective and fast data analysis and modelling using machine learning algorithms. This paper discusses existing solutions for the Data Extraction, Transformation, and Loading (ETL) process and automation for algorithmic trading algorithms. Integrating the Data Warehouses and, in the future, the Data Lakes with the Machine Learning Algorithms gives enormous opportunities in research when performance and data processing time become critical non-functional requirements.
Submitted 20 December, 2023;
originally announced December 2023.
-
Short-Term Stock Price Forecasting using exogenous variables and Machine Learning Algorithms
Authors:
Albert Wong,
Steven Whang,
Emilio Sagre,
Niha Sachin,
Gustavo Dutra,
Yew-Wei Lim,
Gaetan Hains,
Youry Khmelevsky,
Frank Zhang
Abstract:
Creating accurate predictions in the stock market has always been a significant challenge in finance. With the rise of machine learning as the next level in the forecasting area, this research paper compares four machine learning models and their accuracy in forecasting three well-known stocks traded in the NYSE in the short term from March 2020 to May 2022. We deploy, develop, and tune XGBoost, Random Forest, Multi-layer Perceptron, and Support Vector Regression models. We report the models that produce the highest accuracies from our evaluation metrics: RMSE, MAPE, MTT, and MPE. Using a training data set of 240 trading days, we find that XGBoost gives the highest accuracy despite running longer (up to 10 seconds). Results from this study may improve by further tuning the individual parameters or introducing more exogenous variables.
Submitted 17 May, 2023;
originally announced September 2023.
-
Gamers Private Network Performance Forecasting. From Raw Data to the Data Warehouse with Machine Learning and Neural Nets
Authors:
Albert Wong,
Chun Yin Chiu,
Gaétan Hains,
Jack Humphrey,
Hans Fuhrmann,
Youry Khmelevsky,
Chris Mazur
Abstract:
Gamers Private Network (GPN) is a client/server technology that guarantees a connection for online video games that is more reliable and lower latency than a standard internet connection. Users of the GPN technology benefit from a stable and high-quality gaming experience for online games, which are hosted and played across the world. After transforming a massive volume of raw networking data collected by WTFast, we have structured the cleaned data into a special-purpose data warehouse and completed the extensive analysis using machine learning and neural nets technologies, and business intelligence tools. These analyses demonstrate the ability to predict and quantify changes in the network and demonstrate the benefits gained from the use of a GPN for users when connected to an online game session.
Submitted 25 May, 2021;
originally announced July 2021.
-
Parallel Programming Applied Research Projects for Teaching Parallel Programming to Beginner Students
Authors:
Youry Khmelevsky,
Gaetan J. D. R. Hains
Abstract:
In this paper, we discuss the educational value of a few mid-size and one large applied research projects at the Computer Science Department of Okanagan College (OC) and at the Universities of Paris East Creteil (LACL) and Orleans (LIFO) in France. We found that some freshmen students are very active and eager to be involved in applied research projects starting from the second semester. They are actively participating in programming competitions and want to be involved in applied research projects to compete with sophomore and older students. Our observation is based on five NSERC Engage College and Applied Research and Development (ARD) grants, and several small applied projects. Student involvement in applied research is a key motivation and success factor in our activities, but we are also involved in transferring some results of applied research, namely programming techniques, into the parallel programming courses for beginners at the senior- and first-year MSc levels. We illustrate this feedback process with programming notions for beginners, practical tools to acquire them and the overall success/failure of students as experienced for more than 10 years in our French University courses.
Submitted 30 May, 2021; v1 submitted 27 May, 2021;
originally announced May 2021.
-
Machine Learning Prediction of Gamer's Private Networks
Authors:
Chris Mazur,
Jesse Ayers,
Gaetan Hains,
Youry Khmelevsky
Abstract:
The Gamer's Private Network (GPN) is a client/server technology created by WTFast for making the network performance of online games faster and more reliable. GPNs use middle-mile servers and proprietary algorithms to better connect online video-game players to their game's servers across a wide-area network. Online games are a massive entertainment market and network latency is a key aspect of a player's competitive edge. This market means many different approaches to network architecture are implemented by different competing companies and that those architectures are constantly evolving. Ensuring the optimal connection between a client of WTFast and the online game they wish to play is thus an incredibly difficult problem to automate. Using machine learning, we analyzed historical network data from GPN connections to explore the feasibility of network latency prediction which is a key part of optimization. Our next step will be to collect live data (including client/server load, packet and port information and specific game state information) from GPN Minecraft servers and bots. We will use this information in a Reinforcement Learning model along with predictions about latency to alter the clients' and servers' configurations for optimal network performance. These investigations and experiments will improve the quality of service and reliability of GPN systems.
Submitted 6 December, 2020;
originally announced December 2020.
-
State-of-the-Art on Query & Transaction Processing Acceleration
Authors:
Bernd Amann,
Youry Khmelevsky,
Gaetan Hains
Abstract:
The vast amount of processing power and memory bandwidth provided by modern Graphics Processing Units (GPUs) make them a platform for data-intensive applications. The database community identified GPUs as effective co-processors for data processing. In the past years, there were many approaches to make use of GPUs at different levels of a database system. In this Internal Technical Report, based on the [1] and some other research papers, we identify possible research areas at LIP6 for GPU-accelerated database management systems. We describe some key properties, typical challenges of GPU-aware database architectures, and identify major open challenges.
Submitted 26 June, 2019;
originally announced July 2019.
-
Formal methods and software engineering for DL. Security, safety and productivity for DL systems development
Authors:
Gaetan J. D. R. Hains,
Arvid Jakobsson,
Youry Khmelevsky
Abstract:
Deep Learning (DL) techniques are now widespread and being integrated into many important systems. Their classification and recognition abilities ensure their relevance for multiple application domains. As machine-learning that relies on training instead of algorithm programming, they offer a high degree of productivity. But they can be vulnerable to attacks and the verification of their correctness is only just emerging as a scientific and engineering possibility. This paper is a major update of a previously-published survey, attempting to cover all recent publications in this area. It also covers an even more recent trend, namely the design of domain-specific languages for producing and training neural nets.
Submitted 31 January, 2019;
originally announced January 2019.
-
5Gperf: signal processing performance for 5G
Authors:
G. Hains,
W. Suijlen,
W. Liang,
Z. Wu
Abstract:
The 5Gperf project was conducted by Huawei research teams in 2016-17. It was concerned with the acceleration of signal-processing algorithms for a 5G base-station prototype. It improved on already optimized SIMD-parallel CPU algorithms and designed a new software tool for higher programmer productivity when converting MATLAB code to optimized C
Submitted 25 October, 2018;
originally announced October 2018.