-
LibProf: A Python Profiler for Improving Cold Start Performance in Serverless Applications
Authors:
Syed Salauddin Mohammad Tariq,
Ali Al Zein,
Soumya Sripad Vaidya,
Arati Khanolkar,
Probir Roy
Abstract:
Serverless computing abstracts away server management, enabling automatic scaling and efficient resource utilization. However, cold-start latency remains a significant challenge, affecting end-to-end performance. Our preliminary study reveals that inefficient library initialization and usage are major contributors to this latency in Python-based serverless applications. We introduce LibProf, a Python profiler that uses dynamic program analysis to identify inefficient library initializations. LibProf collects library usage data through statistical sampling and call-path profiling, then generates a report to guide developers in addressing four types of inefficiency patterns. Systematic evaluations on 15 serverless applications demonstrate that LibProf effectively identifies inefficiencies. LibProf-guided optimization yields up to a 2.26x speedup in cold-start execution time and a 1.51x reduction in memory usage.
Submitted 17 June, 2024;
originally announced June 2024.
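As an illustration of the kind of library-initialization inefficiency the LibProf entry above describes, the following minimal sketch defers an import to the code path that actually needs it. The abstract does not name this remedy or the four patterns, so deferred importing is an assumption here, and the `handler` function, the event shape, and the use of `csv` as a stand-in for a heavy library are all hypothetical.

```python
def handler(event):
    """Hypothetical serverless handler; the event shape is an assumption."""
    rows = event["rows"]
    if event.get("format") == "csv":
        # Deferred import: csv and io (stand-ins for a heavy library) are
        # loaded only when this branch runs, not at module import time,
        # so invocations that never request CSV do not pay for them.
        import csv
        import io

        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerows(rows)
        return buf.getvalue()
    # Default path uses no extra libraries.
    return repr(rows)
```

Whether deferring an import helps depends on how often the guarded path runs; a profiler report is what would justify the change.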
-
PerfCurator: Curating a large-scale dataset of performance bug-related commits from public repositories
Authors:
Md Abul Kalam Azad,
Manoj Alexender,
Matthew Alexender,
Syed Salauddin Mohammad Tariq,
Foyzul Hassan,
Probir Roy
Abstract:
Performance bugs challenge software development, degrading performance and wasting computational resources. Software developers invest substantial effort in addressing these issues. Curating these performance bugs can offer valuable insights to the software engineering research community, aiding in developing new mitigation strategies. However, there is no large-scale open-source performance bugs dataset available. To bridge this gap, we propose PerfCurator, a repository miner that collects performance bug-related commits at scale. PerfCurator employs PcBERT-KD, a 125M parameter BERT model trained to classify performance bug-related commits. Our evaluation shows PcBERT-KD achieves accuracy comparable to 7 billion parameter LLMs but with significantly lower computational overhead, enabling cost-effective deployment on CPU clusters. Utilizing PcBERT-KD as the core component, we deployed PerfCurator on a 50-node CPU cluster to mine GitHub repositories. This extensive mining operation resulted in the construction of a large-scale dataset comprising 114K performance bug-fix commits in Python, 217.9K in C++, and 76.6K in Java. Our results demonstrate that this large-scale dataset significantly enhances the effectiveness of data-driven performance bug detection systems.
Submitted 17 June, 2024;
originally announced June 2024.
-
Computer Simulation of DNA Computing-Based Boolean Matrix Multiplication
Authors:
Muhammad Asad Tariq,
Rafay Junaid,
Muhammad Mehdy Hasnain,
Danyal Farhat
Abstract:
DNA computing is an unconventional approach to computing that harnesses the parallelism and information storage capabilities of DNA molecules. It has emerged as a promising field with potential applications in solving a variety of computationally complex problems. This paper explores a DNA computing algorithm for Boolean matrix multiplication proposed by Nobuyuki et al. (2006) using a computer simulation, inspired by similar work done in the past by Obront (2021) for the DNA computing algorithm developed by Adleman (1994) for solving the Hamiltonian path problem. We develop a Python program to simulate the logical operations involved in the DNA-based Boolean matrix multiplication algorithm. The simulation replicates the key steps of the algorithm, including DNA sequence generation and hybridization, without imitating the physical behaviour of the DNA molecules. It is intended to serve as a basic prototype for larger, more comprehensive DNA computing simulators that can be used as educational or research tools in the future. Through this work, we aim to contribute to the understanding of DNA-based computing paradigms and their potential advantages and trade-offs compared to conventional computing systems, paving the way for future research and advancements in this emerging field.
Submitted 1 June, 2024;
originally announced June 2024.
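For reference, the logical operation that the DNA-based algorithm in the entry above computes, Boolean matrix multiplication, can be sketched in a few lines. This is only the conventional definition (OR over AND), not the paper's simulation of sequence generation and hybridization; the function name `bool_matmul` is an assumption introduced for illustration.

```python
def bool_matmul(a, b):
    """Boolean product of two 0/1 matrices given as lists of lists.

    Entry (i, j) is 1 if there is some k with a[i][k] = 1 and b[k][j] = 1.
    """
    inner = len(b)          # shared dimension
    cols = len(b[0])        # columns of the result
    return [
        [int(any(row[k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for row in a
    ]
```

For example, multiplying `[[1, 1], [0, 0]]` by `[[0, 0], [0, 1]]` gives `[[0, 1], [0, 0]]`.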
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1092 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
Submitted 14 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
A Comparative Analysis of Supportive Navigation on Movie Recommenders
Authors:
Mohammad Sualeh Ali,
Muhammed Maaz Tariq,
Alina Ahmed,
Abdul Razaque Soomro,
Danysh Syed
Abstract:
This literature review covers the research and thought process that went into making a solution for the infinite scrolling problem faced in streaming services such as Netflix. Using the data collected, we have come to the conclusion that an alternate layout can somewhat alleviate the difficulty of navigating a list of movies. We also found through a comparative analysis that some layouts, the circular one in particular, are advantageous in certain settings, making them ideal candidates for a movie recommender system.
Submitted 22 November, 2023;
originally announced November 2023.
-
Cooperative Bidirectional Mixed-Traffic Overtaking
Authors:
Faizan M. Tariq,
Nilesh Suriyarachchi,
Christos Mavridis,
John S. Baras
Abstract:
Safe overtaking, especially in a bidirectional mixed-traffic setting, remains a key challenge for Connected Autonomous Vehicles (CAVs). The presence of human-driven vehicles (HDVs), behavior unpredictability, and blind spots resulting from sensor occlusion make this a challenging control problem. To overcome these difficulties, we propose a cooperative communication-based approach that utilizes the information shared between CAVs to reduce the effects of sensor occlusion while benefiting from the local velocity prediction based on past tracking data. Our control framework aims to perform overtaking maneuvers with the objective of maximizing velocity while prioritizing safety and passenger comfort. Our method is also capable of reactively adjusting its plan to dynamic changes in the environment. The performance of the proposed approach is verified using realistic traffic simulations.
Submitted 14 November, 2023;
originally announced November 2023.
-
Centralized Management of a Wifi Mesh for Autonomous Farms
Authors:
Ammar Tahir,
Yueshen Li,
Jianli **,
Changxin Zhang,
Daniel Moon,
Aganze Mihigo,
Muhammad Taimoor Tariq,
Deepak Vasisht,
Radhika Mittal
Abstract:
Emerging autonomous farming techniques rely on smart devices such as multi-spectral cameras, collecting fine-grained data, and robots performing tasks such as de-weeding, berry-picking, etc. These techniques require a high throughput network, supporting 10s of Mbps per device at the scale of tens to hundreds of devices in a large farm. We conduct a survey across 12 agronomists to understand these networking requirements of farm workloads and perform extensive measurements of WiFi 6 performance in a farm to identify the challenges in meeting them. Our measurements reveal how network capacity is fundamentally limited in such a setting, with severe degradation in network performance due to crop canopy, and spotlight farm networks as an emerging new problem domain that can benefit from smarter network resource management decisions. To that end, we design Cornet, a network for supporting on-farm applications that comprises: (i) a multi-hop mesh of WiFi routers that uses a strategic combination of 2.4GHz and 5GHz bands as informed by our measurements, and (ii) a centralized traffic engineering (TE) system that uses a novel abstraction of resource units to reason about wireless network capacity and make TE decisions (schedule flows, assign flow rates, and select routes and channels). Our evaluation, using testbeds in a farm and trace-driven simulations, shows how Cornet achieves 1.4 $\times$ higher network utilization and better meets application demands, compared to standard wireless mesh strategies.
Submitted 8 November, 2023; v1 submitted 1 November, 2023;
originally announced November 2023.
-
RCMS: Risk-Aware Crash Mitigation System for Autonomous Vehicles
Authors:
Faizan M. Tariq,
David Isele,
John S. Baras,
Sangjae Bae
Abstract:
We propose a risk-aware crash mitigation system (RCMS), to augment any existing motion planner (MP), that enables an autonomous vehicle to perform evasive maneuvers in high-risk situations and minimize the severity of collision if a crash is inevitable. In order to facilitate a smooth transition between RCMS and MP, we develop a novel activation mechanism that combines instantaneous as well as predictive collision risk evaluation strategies in a unified hysteresis-band approach. For trajectory planning, we deploy a modular receding horizon optimization-based approach that minimizes a smooth situational risk profile, while adhering to the physical road limits as well as vehicular actuator limits. We demonstrate the performance of our approach in a simulation environment.
Submitted 21 September, 2023;
originally announced September 2023.
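The hysteresis-band activation idea mentioned in the RCMS entry above can be illustrated with a minimal sketch. The abstract gives no thresholds or formula, so everything here is assumed for illustration: the function name `make_hysteresis_switch`, the `low` and `high` thresholds, and the choice to combine instantaneous and predictive risk by taking their maximum.

```python
def make_hysteresis_switch(low, high):
    """Return an update function that toggles between planner modes.

    The switch turns on when the combined risk reaches `high` and turns
    off only after it falls to `low`, so values inside the band keep the
    previous state and the system does not chatter between modes.
    """
    active = False

    def update(instantaneous_risk, predictive_risk):
        nonlocal active
        # Assumption: combine the two risk estimates by taking the larger one.
        risk = max(instantaneous_risk, predictive_risk)
        if not active and risk >= high:
            active = True
        elif active and risk <= low:
            active = False
        return active

    return update
```

With `low=0.3` and `high=0.7`, a combined risk of 0.5 leaves the switch in whatever state it was already in, which is the point of using a band rather than a single threshold.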
-
Compiling Recurrences over Dense and Sparse Arrays
Authors:
Shiv Sundram,
Muhammad Usman Tariq,
Fredrik Kjolstad
Abstract:
Recurrence equations lie at the heart of many computational paradigms including dynamic programming, graph analysis, and linear solvers. These equations are often expensive to compute and much work has gone into optimizing them for different situations. The set of recurrence implementations is a large design space across the set of all recurrences (e.g., the Viterbi and Floyd-Warshall algorithms), the choice of data structures (e.g., dense and sparse matrices), and the set of different loop orders. Optimized library implementations do not exist for most points in this design space, and developers must therefore often manually implement and optimize recurrences. We present a general framework for compiling recurrence equations into native code corresponding to any valid point in this general design space. In this framework, users specify a system of recurrences, the type of data structures for storing the input and outputs, and a set of scheduling primitives for optimization. A greedy algorithm then takes this specification and lowers it into a native program that respects the dependencies inherent to the recurrence equation. We describe the compiler transformations necessary to lower this high-level specification into native parallel code for both sparse and dense data structures and provide an algorithm for determining whether the recurrence system is solvable with the provided scheduling primitives. We evaluate the performance and correctness of the generated code on various computational tasks from domains including dense and sparse matrix solvers, dynamic programming, graph problems, and sparse tensor algebra. We demonstrate that the generated code has performance competitive with handwritten implementations in libraries.
Submitted 8 September, 2023;
originally announced September 2023.
-
Combined Machine Learning and Physics-Based Forecaster for Intra-day and 1-Week Ahead Solar Irradiance Forecasting Under Variable Weather Conditions
Authors:
Hugo Riggs,
Shahid Tufail,
Mohd Tariq,
Arif Sarwat
Abstract:
Power systems engineers are actively developing larger power plants out of photovoltaics, which poses major challenges, including intermittent power generation and poor dispatchability. The issue is that PV is a variable generation source unless additional planning and system additions are made to mitigate generation intermittency. One underlying factor that can enhance the applications around mitigating distributed energy resource intermittency challenges is forecasting the generation output. This is especially challenging with renewable energy sources, which are weather dependent, due to the random nature of weather variance. This work puts forth a forecasting model which uses the solar variables to produce a PV generation forecast and evaluates a set of machine learning models for this task. In this paper, a forecaster for intra-day irradiance prediction is proposed. This forecaster is capable of forecasting 15-minute and hourly irradiance up to one week ahead. The paper performs a correlation and sensitivity analysis of the strength of the relationship between local weather parameters and system generation. In this study, the performance of SVM, CART, ANN, and Ensemble learning was analyzed for the prediction of 15-minute intraday and day-ahead irradiance. The results show that SVM and Ensemble learning yielded the lowest MAE for 15-minute intraday and day-ahead irradiance, respectively.
Submitted 16 March, 2023;
originally announced March 2023.
-
SLAS: Speed and Lane Advisory System for Highway Navigation
Authors:
Faizan M. Tariq,
David Isele,
John S. Baras,
Sangjae Bae
Abstract:
This paper proposes a hierarchical autonomous vehicle navigation architecture, composed of a high-level speed and lane advisory system (SLAS) coupled with low-level trajectory generation and trajectory following modules. Specifically, we target a multi-lane highway driving scenario where an autonomous ego vehicle navigates in traffic. We propose a novel receding horizon mixed-integer optimization based method for SLAS with the objective to minimize travel time while accounting for passenger comfort. We further incorporate various modifications in the proposed approach to improve the overall computational efficiency and achieve real-time performance. We demonstrate the efficacy of the proposed approach in contrast to the existing methods, when applied in conjunction with state-of-the-art trajectory generation and trajectory following frameworks, in a CARLA simulation environment.
Submitted 1 March, 2023;
originally announced March 2023.
-
Integration of Data Driven Technologies in Smart Grids for Resilient and Sustainable Smart Cities: A Comprehensive Review
Authors:
Mansoor Ali,
Faisal Naeem,
Nadir Adam,
Georges Kaddoum,
Noor ul Huda,
Muhammad Adnan,
Muhammad Tariq
Abstract:
A modern-day society demands resilient, reliable, and smart urban infrastructure for effective and intelligent operations and deployment. However, unexpected, high-impact, and low-probability events such as earthquakes, tsunamis, tornadoes, and hurricanes make the design of such robust infrastructure more complex. As a result of such events, a power system infrastructure can be severely affected, leading to unprecedented events, such as blackouts. Nevertheless, the integration of smart grids into the existing framework of smart cities adds to their resilience. Therefore, designing a resilient and reliable power system network is an inevitable requirement of modern smart city infrastructure. With the deployment of the Internet of Things (IoT), smart cities infrastructures have taken a transformational turn towards introducing technologies that do not only provide ease and comfort to the citizens but are also feasible in terms of sustainability and dependability. This paper presents a holistic view of a resilient and sustainable smart city architecture that utilizes IoT, big data analytics, unmanned aerial vehicles, and smart grids through intelligent integration of renewable energy resources. In addition, the impact of disasters on the power system infrastructure is investigated and different types of optimization techniques that can be used to sustain the power flow in the network during disturbances are compared and analyzed. Furthermore, a comparative review analysis of different data-driven machine learning techniques for sustainable smart cities is performed along with the discussion on open research issues and challenges.
Submitted 3 August, 2023; v1 submitted 20 January, 2023;
originally announced January 2023.
-
Federated Learning for Privacy Preservation in Smart Healthcare Systems: A Comprehensive Survey
Authors:
Mansoor Ali,
Faisal Naeem,
Muhammad Tariq,
Geroges Kaddoum
Abstract:
Recent advances in electronic devices and communication infrastructure have revolutionized the traditional healthcare system into a smart healthcare system by using IoMT devices. However, due to the centralized training approach of artificial intelligence (AI), the use of mobile and wearable IoMT devices raises privacy concerns with respect to the information that has been communicated between hospitals and end users. The information conveyed by the IoMT devices is highly confidential and can be exposed to adversaries. In this regard, federated learning (FL), a distributed AI paradigm, has opened up new opportunities for privacy preservation in IoMT without accessing the confidential data of the participants. Further, FL provides privacy to end users as only gradients are shared during training. Given these specific properties of FL, in this paper we present privacy-related issues in IoMT. Afterwards, we present the role of FL in IoMT networks for privacy preservation and introduce some advanced FL architectures incorporating deep reinforcement learning (DRL), digital twin, and generative adversarial networks (GANs) for detecting privacy threats. Subsequently, we present some practical opportunities of FL in smart healthcare systems. At the end, we conclude this survey by providing open research challenges for FL that can be used in future smart healthcare systems.
Submitted 17 March, 2022;
originally announced March 2022.
-
Toward Experience-Driven Traffic Management and Orchestration in Digital-Twin-Enabled 6G Networks
Authors:
Muhammad Tariq,
Faisal Naeem,
H. Vincent Poor
Abstract:
The envisioned 6G networks are expected to support extremely high data rates, low-latency, and radically new applications empowered by machine learning. The futuristic 6G networks require a novel framework that can be used to operate, manage, and optimize its underlying services such as ultra-reliable and low-latency communication, and Internet of everything. In recent years, artificial intelligence (AI) has demonstrated significant success in optimizing and designing networks. The AI-enabled traffic orchestration can dynamically organize different network architectures and slices to provide quality of experience considering the dynamic nature of the wireless communication network.
In this paper, we propose a digital twin enabled network framework, empowered by AI to cater to the variability and complexity of envisioned 6G networks, to provide smart resource management and intelligent service provisioning. The digital twin paves the way for optimizing 6G services by creating a virtual representation of the 6G network along with its associated communication technologies (e.g., intelligent reflecting surfaces, terahertz and millimeter communication), computing systems (e.g., cloud computing and fog computing) with its associated algorithms (e.g., optimization and machine learning). We then discuss and review the existing AI-enabled traffic management and orchestration techniques and highlight future research directions and potential solutions in 6G networks.
Submitted 11 January, 2022;
originally announced January 2022.
-
Federated Deep Learning in Electricity Forecasting: An MCDM Approach
Authors:
Marco Repetto,
Davide La Torre,
Muhammad Tariq
Abstract:
Large-scale data analysis is growing at an exponential rate as data proliferates in our societies. This abundance of data has the advantage of allowing the decision-maker to implement complex models in scenarios that were prohibitive before. At the same time, such an amount of data requires a distributed thinking approach. In fact, Deep Learning models require plenty of resources, and distributed training is needed. This paper presents a Multicriteria approach for distributed learning. Our approach uses the Weighted Goal Programming approach in its Chebyshev formulation to build an ensemble of decision rules that optimize aprioristically defined performance metrics. Such a formulation is beneficial because it is both model and metric agnostic and provides an interpretable output for the decision-maker. We test our approach by showing a practical application in electricity demand forecasting. Our results suggest that when we allow for dataset split overlapping, the performances of our methodology are consistently above the baseline model trained on the whole dataset.
Submitted 9 January, 2022; v1 submitted 27 November, 2021;
originally announced November 2021.
-
A Privacy Preserved and Cost Efficient Control Scheme for Coronavirus Outbreak Using Call Data Record and Contact Tracing
Authors:
Shibli Nisar,
Syed Muhammad Ali Zuhaib,
Abasin Ulasyar,
Muhammad Tariq
Abstract:
Coronavirus or COVID-19, which has been declared a pandemic by the World Health Organization, has incurred huge losses to the lives of people throughout the world. Although scientists, researchers and doctors are working round the clock to develop a vaccine for COVID-19, it may take a year or two to make a safe and effective vaccine available for the world. In current circumstances, a solution must be developed to control or stop the spread of the virus. For this purpose, a novel technique based on call data record analysis (CDRA) and contact tracing is proposed that can effectively control the coronavirus outbreak. A positive coronavirus patient can be traced through CDRA and contact tracing. The technique can track the path traversed by the patient and collect the cell numbers of all those people who have met with the patient. Keeping the privacy of this group of people intact, they are contacted through their cell numbers so that they can isolate themselves till the result of their coronavirus test arrives. If the test result of a person in the group comes back positive, then he/she must be isolated and the same CDRA and contact tracing procedures are adopted for that person. A COVID-19 patient is geotagged and alerts are sent if any violation of isolation is done by the patient. Moreover, the general public is informed in advance to avoid the path followed by the patients. This cost-effective mechanism is not only capable of controlling the coronavirus outbreak but also helps in isolating the patient in his/her house.
Submitted 13 December, 2020; v1 submitted 4 October, 2020;
originally announced October 2020.
-
ABCD Neurocognitive Prediction Challenge 2019: Predicting individual residual fluid intelligence scores from cortical grey matter morphology
Authors:
Neil P. Oxtoby,
Fabio S. Ferreira,
Agoston Mihalik,
Tong Wu,
Mikael Brudfors,
Hongxiang Lin,
Anita Rau,
Stefano B. Blumberg,
Maria Robu,
Cemre Zor,
Maira Tariq,
Maria Del Mar Estarellas Garcia,
Baris Kanber,
Daniil I. Nikitichev,
Janaina Mourao-Miranda
Abstract:
We predicted residual fluid intelligence scores from T1-weighted MRI data available as part of the ABCD NP Challenge 2019, using morphological similarity of grey-matter regions across the cortex. Individual structural covariance networks (SCN) were abstracted into graph-theory metrics averaged over nodes across the brain and in data-driven communities/modules. Metrics included degree, path length, clustering coefficient, centrality, rich club coefficient, and small-worldness. These features derived from the training set were used to build various regression models for predicting residual fluid intelligence scores, with performance evaluated both using cross-validation within the training set and using the held-out validation set. Our predictions on the test set were generated with a support vector regression model trained on the training set. We found minimal improvement over predicting a zero residual fluid intelligence score across the sample population, implying that structural covariance networks calculated from T1-weighted MR imaging data provide little information about residual fluid intelligence.
Submitted 26 May, 2019;
originally announced May 2019.
-
ABCD Neurocognitive Prediction Challenge 2019: Predicting individual fluid intelligence scores from structural MRI using probabilistic segmentation and kernel ridge regression
Authors:
Agoston Mihalik,
Mikael Brudfors,
Maria Robu,
Fabio S. Ferreira,
Hongxiang Lin,
Anita Rau,
Tong Wu,
Stefano B. Blumberg,
Baris Kanber,
Maira Tariq,
Maria Del Mar Estarellas Garcia,
Cemre Zor,
Daniil I. Nikitichev,
Janaina Mourao-Miranda,
Neil P. Oxtoby
Abstract:
We applied several regression and deep learning methods to predict fluid intelligence scores from T1-weighted MRI scans as part of the ABCD Neurocognitive Prediction Challenge (ABCD-NP-Challenge) 2019. We used voxel intensities and probabilistic tissue-type labels derived from these as features to train the models. The best predictive performance (lowest mean-squared error) came from Kernel Ridge Regression (KRR; $\lambda=10$), which produced a mean-squared error of 69.7204 on the validation set and 92.1298 on the test set. This placed our group in fifth position on the validation leaderboard and in first place on the final (test) leaderboard.
Submitted 26 May, 2019;
originally announced May 2019.
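The kernel ridge regression step named in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which kernel was used, so the linear kernel here is an assumption, and the function names (`linear_kernel`, `solve_linear_system`, `krr_fit`, `krr_predict`) are hypothetical. Only the regularization value 10 comes from the abstract. The sketch solves the dual system (K + λI)α = y and predicts with f(x) = Σ αᵢ k(xᵢ, x).

```python
def linear_kernel(a, b):
    # Assumed kernel: plain dot product (the abstract does not name one).
    return sum(x * y for x, y in zip(a, b))


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Move the row with the largest pivot candidate into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def krr_fit(train_x, train_y, lam=10.0, kernel=linear_kernel):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(train_x)
    gram = [
        [kernel(train_x[i], train_x[j]) + (lam if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return solve_linear_system(gram, train_y)


def krr_predict(train_x, alpha, x, kernel=linear_kernel):
    """Predict f(x) = sum_i alpha_i * k(x_i, x)."""
    return sum(a * kernel(xi, x) for a, xi in zip(alpha, train_x))
```

With a linear kernel this coincides with ordinary ridge regression, which gives a simple way to sanity-check the dual solution on toy data.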
-
Q-Graph: Preserving Query Locality in Multi-Query Graph Processing
Authors:
Christian Mayer,
Ruben Mayer,
Jonas Grunert,
Kurt Rothermel,
Muhammad Adnan Tariq
Abstract:
Emerging user-centric graph applications such as route planning and personalized social network analysis have initiated a paradigm shift in modern graph processing systems towards multi-query analysis, i.e., processing multiple graph queries in parallel on a shared graph. These applications generate a dynamic number of localized queries around query hotspots such as popular urban areas. However, existing graph processing systems are not yet tailored towards these properties: the employed methods for graph partitioning and synchronization management disregard query locality and dynamism, which leads to high query latency. To this end, we propose the system Q-Graph for multi-query graph analysis that considers query locality on three levels. (i) The query-aware graph partitioning algorithm Q-cut maximizes query locality to reduce communication overhead. (ii) The method for synchronization management, called hybrid barrier synchronization, allows for full exploitation of local queries spanning only a subset of partitions. (iii) Both methods adapt at runtime to changing query workloads in order to maintain and exploit locality. Our experiments show that Q-cut reduces average query latency by up to 57 percent compared to static query-agnostic partitioning algorithms.
Submitted 30 May, 2018;
originally announced May 2018.
-
ADWISE: Adaptive Window-based Streaming Edge Partitioning for High-Speed Graph Processing
Authors:
Christian Mayer,
Ruben Mayer,
Muhammad Adnan Tariq,
Heiko Geppert,
Larissa Laich,
Lukas Rieger,
Kurt Rothermel
Abstract:
In recent years, the graph partitioning problem has gained importance as a mandatory preprocessing step for distributed graph processing on very large graphs. Existing graph partitioning algorithms minimize partitioning latency by assigning individual graph edges to partitions in a streaming manner, at the cost of reduced partitioning quality. However, we argue that the mere minimization of partitioning latency is not the optimal design choice in terms of minimizing total graph analysis latency, i.e., the sum of partitioning and processing latency. Instead, for complex and long-running graph processing algorithms that run on very large graphs, it is beneficial to invest more time into graph partitioning to reach a higher partitioning quality, which drastically reduces graph processing latency. In this paper, we propose ADWISE, a novel window-based streaming partitioning algorithm that increases the partitioning quality by always choosing the best edge from a set of edges for assignment to a partition. In doing so, ADWISE controls the partitioning latency by adapting the window size dynamically at run-time. Our evaluations show that ADWISE can reach the sweet spot between graph partitioning latency and graph processing latency, reducing the total latency of partitioning plus processing by up to 23-47 percent compared to the state-of-the-art.
Submitted 30 May, 2018; v1 submitted 22 December, 2017;
originally announced December 2017.
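The idea of choosing the best edge from a window, rather than assigning each edge as it arrives, can be sketched as follows. This is an illustration under stated assumptions, not the ADWISE algorithm itself: the scoring function (endpoint locality plus a load-balance bonus) is a placeholder chosen for this sketch, the window size is fixed rather than adapted at run-time, and the name `window_partition` is hypothetical.

```python
def window_partition(edges, num_partitions, window_size=4):
    """Assign streamed edges to partitions, picking the best edge in a window.

    Returns a list of (edge, partition) pairs in assignment order.
    """
    stream = iter(edges)
    window = []                                         # edges awaiting assignment
    replicas = [set() for _ in range(num_partitions)]   # vertices seen per partition
    loads = [0] * num_partitions                        # edges per partition
    result = []

    def score(edge, part):
        # Assumed score: number of endpoints already on the partition (0..2)
        # plus a balance bonus in [0, 1) that favors lightly loaded partitions.
        u, v = edge
        locality = (u in replicas[part]) + (v in replicas[part])
        max_load = max(loads)
        balance = (max_load - loads[part]) / (max_load + 1)
        return locality + balance

    def refill():
        # Pull edges from the stream until the window is full or the stream ends.
        while len(window) < window_size:
            try:
                window.append(next(stream))
            except StopIteration:
                break

    refill()
    while window:
        # Evaluate every (edge, partition) pair in the window; take the best one.
        best_edge, best_part = max(
            ((e, p) for e in window for p in range(num_partitions)),
            key=lambda pair: score(*pair),
        )
        window.remove(best_edge)
        replicas[best_part].update(best_edge)
        loads[best_part] += 1
        result.append((best_edge, best_part))
        refill()
    return result
```

With `window_size=1` the sketch degenerates to plain one-edge-at-a-time streaming assignment; larger windows spend more time per decision in exchange for more candidates to choose from, which is the trade-off the abstract describes.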
-
Knowledge is at the Edge! How to Search in Distributed Machine Learning Models
Authors:
Thomas Bach,
Muhammad Adnan Tariq,
Ruben Mayer,
Kurt Rothermel
Abstract:
With the advent of the Internet of Things and Industry 4.0, an enormous amount of data is produced at the edge of the network. Due to a lack of computing power, this data is currently sent to the cloud, where centralized machine learning models are trained to derive higher-level knowledge. With the recent development of specialized machine learning hardware for mobile devices, a new era of distributed learning is about to begin that raises a new research question: How can we search in distributed machine learning models? Machine learning at the edge of the network has many benefits, such as low-latency inference and increased privacy. Such distributed machine learning models can also be personalized for a human user, a specific context, or an application scenario. As training data stays on the devices, control over possibly sensitive data is preserved, as it is not shared with a third party. This new form of distributed learning leads to the partitioning of knowledge between many devices, which makes access difficult. In this paper we tackle the problem of finding specific knowledge by forwarding a search request (query) to a device that can answer it best. To that end, we use an entropy-based quality metric that takes the context of a query and the learning quality of a device into account. We show that our forwarding strategy can achieve over 95% accuracy in an urban mobility scenario where we use data from 30 000 people commuting in the city of Trento, Italy.
△ Less
Submitted 13 October, 2017;
originally announced October 2017.
-
SPECTRE: Supporting Consumption Policies in Window-Based Parallel Complex Event Processing
Authors:
Ruben Mayer,
Ahmad Slo,
Muhammad Adnan Tariq,
Kurt Rothermel,
Manuel Gräber,
Umakishore Ramachandran
Abstract:
Distributed Complex Event Processing (DCEP) is a paradigm to infer the occurrence of complex situations in the surrounding world from basic events like sensor readings. In doing so, DCEP operators detect event patterns on their incoming event streams. To yield high operator throughput, data parallelization frameworks divide the incoming event streams of an operator into overlapping windows that are processed in parallel by a number of operator instances. In doing so, the basic assumption is that the different windows can be processed independently from each other. However, consumption policies enforce that events can only be part of one pattern instance; then, they are consumed, i.e., removed from further pattern detection. That implies that the constituent events of a pattern instance detected in one window are excluded from all other windows as well, which breaks the data parallelism between different windows. In this paper, we tackle this problem by means of speculation: Based on the likelihood of an event's consumption in a window, subsequent windows may speculatively suppress that event. We propose the SPECTRE framework for speculative processing of multiple dependent windows in parallel. Our evaluations show an up to linear scalability of SPECTRE with the number of CPU cores.
Submitted 6 September, 2017;
originally announced September 2017.
-
Minimizing Communication Overhead in Window-Based Parallel Complex Event Processing
Authors:
Ruben Mayer,
Muhammad Adnan Tariq,
Kurt Rothermel
Abstract:
Distributed Complex Event Processing has emerged as a well-established paradigm to detect situations of interest from basic sensor streams, building an operator graph between sensors and applications. In order to detect event patterns that correspond to situations of interest, each operator correlates events on its incoming streams according to a sliding window mechanism. To increase the throughput of an operator, different windows can be assigned to different operator instances (i.e., identical operator copies), which process them in parallel. This implies that events that are part of multiple overlapping windows are replicated to different operator instances. The communication overhead of replicating the events can be reduced by assigning overlapping windows to the same operator instance. However, this imposes a higher processing load on the single operator instance, possibly overloading it. In this paper, we address the trade-off between processing load and communication overhead when assigning overlapping windows to a single operator instance. Controlling the trade-off is challenging and cannot be solved with traditional reactive methods. To this end, we propose a model-based batch scheduling controller building on prediction. Evaluations show that our approach is able to significantly save bandwidth, while keeping a user-defined latency bound in the operator instances.
Submitted 16 May, 2017;
originally announced May 2017.
-
An Efficient Framework for Information Security in Cloud Computing Using Auditing Algorithm Shell (AAS)
Authors:
M. Omer Mushtaq,
Furrakh Shahzad,
M. Owais Tariq,
Mahina Riaz,
Bushra Majeed
Abstract:
The emergence of the Cloud Computing (CC) paradigm has driven a rapid escalation and extension of new infrastructure, personnel training and licensing of new computer programs in the field of IT. It has become a fast-growing segment of the IT business in the last couple of years. However, due to the rapid growth of data, people and IT firms, the issue of information security is getting more complex. One of the major concerns of the user is: to what degree is the data safe on the cloud? In spite of all the promotional material encompassing the cloud, consortium customers are not willing to shift their business to the cloud. Data security is the major problem that has limited the scope of cloud computing. In new cloud computing infrastructure, techniques such as the Strong Secure Shell and Encryption are deployed to guarantee the authenticity of the user through log systems. The vendors utilize these logs to analyze and view their data. Therefore, this implementation is not enough to ensure security, privacy and authoritative use of the data. This paper introduces a quad-layered framework for data security, data privacy, data breaches and process-associated aspects. Using this layered architecture, we have preserved the secrecy of confidential information and tried to build the trust of the user in cloud computing. This layered framework protects confidential information by multiple means, i.e., Secure Transmission of Data, Encrypted Data and its Processing, Database Secure Shell and Internal/External Log Auditing.
Submitted 23 February, 2017;
originally announced February 2017.
-
Rate-Distortion Analysis of Quantizers with Error Feedback
Authors:
Shuichi Ohno,
Teruyuki Shiraki,
M. Rizwan Tariq,
Masaaki Nagahara
Abstract:
A Delta-Sigma modulator, which is often used to convert analog signals into digital signals, can be modeled as a static uniform quantizer with an error feedback filter. In this paper, we present a rate-distortion analysis of quantizers with error feedback, including Delta-Sigma modulators, assuming that the error owing to overloading in the static quantizer is negligible. We demonstrate that the amplitude response of the optimal error feedback filter that minimizes the mean squared quantization error can be parameterized by one parameter. This parameterization enables us to determine the optimal error feedback filter numerically. The relationship between the number of bits used for the quantization and the achievable mean squared error can be obtained using the optimal error feedback filter. This clarifies the rate-distortion property of quantizers with error feedback. Then, ideal optimal error feedback filters are approximated by practical filters using the Yule-Walker method and the linear matrix inequality-based method. Numerical examples are provided to demonstrate our analysis and synthesis.
Submitted 6 September, 2016;
originally announced September 2016.
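The model described in the abstract above, a static uniform quantizer wrapped in an error feedback loop, can be sketched in a few lines. This is a generic textbook-style illustration, not the optimal filter derived in the paper: the feedback coefficients are a free parameter here, the default `(1.0,)` gives first-order noise shaping, and the function names are hypothetical. The quantizer is unbounded, which mirrors the abstract's assumption that overload error is negligible.

```python
import math


def uniform_quantize(value, step):
    """Static uniform quantizer: round to the nearest multiple of `step`."""
    return step * math.floor(value / step + 0.5)


def error_feedback_quantize(signal, step, feedback_coeffs=(1.0,)):
    """Quantize `signal` while feeding filtered past quantization errors back.

    feedback_coeffs[k] weights the error from k + 1 samples ago. The default
    (1.0,) yields y[n] = u[n] + e[n] - e[n-1], i.e. first-order noise shaping.
    """
    past_errors = [0.0] * len(feedback_coeffs)  # most recent error first
    output = []
    for u in signal:
        # Subtract the filtered past errors from the input before quantizing.
        v = u - sum(c * e for c, e in zip(feedback_coeffs, past_errors))
        y = uniform_quantize(v, step)
        error = y - v  # quantization error of this sample
        past_errors = [error] + past_errors[:-1]
        output.append(y)
    return output
```

For a constant input, the first-order loop makes the running average of the coarse output track the input: the accumulated difference between output and input equals the most recent quantization error, which is bounded by half a step.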