Showing 1–13 of 13 results for author: Dakkak, A

  1. arXiv:2305.15758

    cs.SD eess.AS

    Towards Solving Cocktail-Party: The First Method to Build a Realistic Dataset with Ground Truths for Speech Separation

    Authors: Rawad Melhem, Assef Jafar, Oumayma Al Dakkak

    Abstract: Speech separation is very important in real-world applications such as human-machine interaction, hearing aids devices, and automatic meeting transcription. In recent years, a significant improvement occurred towards the solution based on deep learning. In fact, much attention has been drawn to supervised learning methods using synthetic mixtures datasets despite their being not representative of…

    Submitted 25 May, 2023; originally announced May 2023.

  2. arXiv:2305.08601

    cs.SE cs.CE

    DevServOps: DevOps For Product-Oriented Product-Service Systems

    Authors: Anas Dakkak, Jan Bosch, Helena Holmström Olsson

    Abstract: For companies developing web-based applications, the Dev and the Ops refer to different groups with either operational or development focus. Therefore, DevOps help these companies streamline software development and operations activities by emphasizing the collaboration between the two groups. However, for companies producing software-intensive products, the Ops would refer to customers who use an…

    Submitted 15 May, 2023; originally announced May 2023.

  3. arXiv:2002.11262

    cs.LG cs.AI

    DLSpec: A Deep Learning Task Exchange Specification

    Authors: Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-mei Hwu

    Abstract: Deep Learning (DL) innovations are being introduced at a rapid pace. However, the current lack of standard specification of DL tasks makes sharing, running, reproducing, and comparing these innovations difficult. To address this problem, we propose DLSpec, a model-, dataset-, software-, and hardware-agnostic DL specification that captures the different aspects of DL tasks. DLSpec has been tested b…

    Submitted 25 February, 2020; originally announced February 2020.

  4. arXiv:2002.08295

    cs.DC cs.LG stat.ML

    MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale

    Authors: Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-mei Hwu

    Abstract: Machine Learning (ML) and Deep Learning (DL) innovations are being introduced at such a rapid pace that researchers are hard-pressed to analyze and study them. The complicated procedures for evaluating innovations, along with the lack of standard and efficient ways of specifying and provisioning ML/DL evaluation, is a major "pain point" for the community. This paper proposes MLModelScope, an open-…

    Submitted 19 February, 2020; originally announced February 2020.

  5. arXiv:1911.08031

    cs.DC cs.LG cs.PF stat.ML

    The Design and Implementation of a Scalable DL Benchmarking Platform

    Authors: Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu

    Abstract: The current Deep Learning (DL) landscape is fast-paced and is rife with non-uniform models, hardware/software (HW/SW) stacks, but lacks a DL benchmarking platform to facilitate evaluation and comparison of DL innovations, be it models, frameworks, libraries, or hardware. Due to the lack of a benchmarking platform, the current practice of evaluating the benefits of proposed DL innovations is both a…

    Submitted 18 November, 2019; originally announced November 2019.

    Journal ref: 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), 414-425

  6. arXiv:1911.07967

    cs.LG cs.PF cs.SE stat.ML

    DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs (Extended)

    Authors: Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu

    Abstract: The past few years have seen a surge of applying Deep Learning (DL) models for a wide array of tasks such as image classification, object detection, machine translation, etc. While DL models provide an opportunity to solve otherwise intractable tasks, their adoption relies on them being optimized to meet latency and resource requirements. Benchmarking is a key step in this process but has been ham…

    Submitted 11 March, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

  7. arXiv:1911.06922

    cs.LG cs.DC cs.PF stat.ML

    Benanza: Automatic $μ$Benchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs

    Authors: Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu

    Abstract: As Deep Learning (DL) models have been increasingly used in latency-sensitive applications, there has been a growing interest in improving their response time. An important venue for such improvement is to profile the execution of these models and characterize their performance to identify possible optimization opportunities. However, the current profiling tools lack the highly desired abilities t…

    Submitted 19 February, 2020; v1 submitted 15 November, 2019; originally announced November 2019.

  8. arXiv:1908.06869

    cs.LG cs.AR cs.PF stat.ML

    XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs

    Authors: Cheng Li, Abdul Dakkak, Jinjun Xiong, Wei Wei, Lingjie Xu, Wen-mei Hwu

    Abstract: There has been a rapid proliferation of machine learning/deep learning (ML) models and wide adoption of them in many application domains. This has made profiling and characterization of ML model performance an increasingly pressing task for both hardware designers and system providers, as they would like to offer the best possible system to serve ML models with the target latency, throughput, cost…

    Submitted 2 June, 2020; v1 submitted 19 August, 2019; originally announced August 2019.

  9. arXiv:1904.12437

    cs.LG cs.AI cs.SE

    Challenges and Pitfalls of Machine Learning Evaluation and Benchmarking

    Authors: Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu

    Abstract: An increasingly complex and diverse collection of Machine Learning (ML) models as well as hardware/software stacks, collectively referred to as "ML artifacts", are being proposed - leading to a diverse landscape of ML. These ML innovations proposed have outpaced researchers' ability to analyze, study and adapt them. This is exacerbated by the complicated and sometimes non-reproducible procedures f…

    Submitted 25 June, 2019; v1 submitted 28 April, 2019; originally announced April 2019.

  10. arXiv:1811.09737

    cs.LG stat.ML

    Frustrated with Replicating Claims of a Shared Model? A Solution

    Authors: Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-mei Hwu

    Abstract: Machine Learning (ML) and Deep Learning (DL) innovations are being introduced at such a rapid pace that model owners and evaluators are hard-pressed analyzing and studying them. This is exacerbated by the complicated procedures for evaluation. The lack of standard systems and efficient techniques for specifying and provisioning ML/DL evaluation is the main cause of this "pain point". This work dis…

    Submitted 25 June, 2019; v1 submitted 23 November, 2018; originally announced November 2018.

  11. Accelerating Reduction and Scan Using Tensor Core Units

    Authors: Abdul Dakkak, Cheng Li, Isaac Gelado, Jinjun Xiong, Wen-mei Hwu

    Abstract: Driven by deep learning, there has been a surge of specialized processors for matrix multiplication, referred to as TensorCore Units (TCUs). These TCUs are capable of performing matrix multiplications on small matrices (usually 4x4 or 16x16) to accelerate the convolutional and recurrent neural networks in deep learning workloads. In this paper we leverage NVIDIA's TCU to express both reduction and…

    Submitted 23 November, 2019; v1 submitted 23 November, 2018; originally announced November 2018.

    Comments: In Proceedings of the ACM International Conference on Supercomputing (ICS '19)

  12. TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments

    Authors: Abdul Dakkak, Cheng Li, Simon Garcia de Gonzalo, Jinjun Xiong, Wen-mei Hwu

    Abstract: Deep neural networks (DNNs) have become core computation components within low latency Function as a Service (FaaS) prediction pipelines: including image recognition, object detection, natural language processing, speech synthesis, and personalized recommendation pipelines. Cloud computing, as the de-facto backbone of modern computing infrastructure for both enterprise and consumer applications, h…

    Submitted 23 November, 2018; originally announced November 2018.

    Comments: In Proceedings of CLOUD 2019

  13. arXiv:1809.08311

    cs.PF

    SCOPE: C3SR Systems Characterization and Benchmarking Framework

    Authors: Carl Pearson, Abdul Dakkak, Cheng Li, Sarah Hashash, Jinjun Xiong, Wen-mei Hwu

    Abstract: This report presents the design of the Scope infrastructure for extensible and portable benchmarking. Improvements in high-performance computing systems rely on coordination across different levels of system abstraction. Developing and defining accurate performance measurements is necessary at all levels of the system hierarchy, and should be as accessible as possible to developers with different…

    Submitted 18 September, 2018; originally announced September 2018.

    Comments: 8 pages, draft