Showing 1–24 of 24 results for author: Reda, S

Searching in archive cs.
  1. arXiv:2404.08699

    cs.CL cs.AI cs.LG

    Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in LLMs

    Authors: Ahmed Agiza, Mohamed Mostagir, Sherief Reda

    Abstract: In an era where language models are increasingly integrated into decision-making and communication, understanding the biases within Large Language Models (LLMs) becomes imperative, especially when these models are applied in the economic and political domains. This work investigates the impact of fine-tuning and data selection on economic and political biases in LLM. We explore the methodological…

    Submitted 21 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  2. arXiv:2404.00103

    cs.LG cs.CV

    PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks

    Authors: Marina Neseem, Conor McCullough, Randy Hsin, Chas Leichner, Shan Li, In Suk Chong, Andrew G. Howard, Lukasz Lew, Sherief Reda, Ville-Mikko Rautio, Daniele Moro

    Abstract: Low-precision quantization is recognized for its efficacy in neural network optimization. Our analysis reveals that non-quantized elementwise operations which are prevalent in layers such as parameterized activation functions, batch normalization, and quantization scaling dominate the inference cost of low-precision models. These non-quantized elementwise operations are commonly overlooked in SOTA…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: Accepted in CVPR 2024. 10 Figures, 9 Tables

  3. arXiv:2403.20320

    cs.CV cs.AI cs.LG

    MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning

    Authors: Ahmed Agiza, Marina Neseem, Sherief Reda

    Abstract: Adapting models pre-trained on large-scale datasets to a variety of downstream tasks is a common strategy in deep learning. Consequently, parameter-efficient fine-tuning methods have emerged as a promising way to adapt pre-trained models to different tasks while training only a minimal number of parameters. While most of these methods are designed for single-task adaptation, parameter-efficient tr…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

  4. arXiv:2402.03640

    cs.AI

    torchmSAT: A GPU-Accelerated Approximation To The Maximum Satisfiability Problem

    Authors: Abdelrahman Hosny, Sherief Reda

    Abstract: The remarkable achievements of machine learning techniques in analyzing discrete structures have drawn significant attention towards their integration into combinatorial optimization algorithms. Typically, these methodologies improve existing solvers by injecting learned models within the solving loop to enhance the efficiency of the search process. In this work, we derive a single differentiable…

    Submitted 5 February, 2024; originally announced February 2024.

  5. arXiv:2309.09039

    cs.CV

    Microscale 3-D Capacitance Tomography with a CMOS Sensor Array

    Authors: Manar Abdelatty, Joseph Incandela, Kangping Hu, Joseph W. Larkin, Sherief Reda, Jacob K. Rosenstein

    Abstract: Electrical capacitance tomography (ECT) is a nonoptical imaging technique in which a map of the interior permittivity of a volume is estimated by making capacitance measurements at its boundary and solving an inverse problem. While previous ECT demonstrations have often been at centimeter scales, ECT is not limited to macroscopic systems. In this paper, we demonstrate ECT imaging of polymer micros…

    Submitted 2 December, 2023; v1 submitted 16 September, 2023; originally announced September 2023.

  6. arXiv:2308.13803

    cs.DC

    Throughput Maximization of DNN Inference: Batching or Multi-Tenancy?

    Authors: Seyed Morteza Nabavinejad, Masoumeh Ebrahimi, Sherief Reda

    Abstract: Deployment of real-time ML services on warehouse-scale infrastructures is on the increase. Therefore, decreasing latency and increasing throughput of deep neural network (DNN) inference applications that empower those services have attracted attention from both academia and industry. A common solution to address this challenge is leveraging hardware accelerators such as GPUs. To improve the infere…

    Submitted 26 August, 2023; originally announced August 2023.

  7. arXiv:2308.03944

    cs.LG cs.AR

    GraPhSyM: Graph Physical Synthesis Model

    Authors: Ahmed Agiza, Rajarshi Roy, Teodor Dumitru Ene, Saad Godil, Sherief Reda, Bryan Catanzaro

    Abstract: In this work, we introduce GraPhSyM, a Graph Attention Network (GATv2) model for fast and accurate estimation of post-physical synthesis circuit delay and area metrics from pre-physical synthesis circuit netlists. Once trained, GraPhSyM provides accurate visibility of final design metrics to early EDA stages, such as logic synthesis, without running the slow physical synthesis flow, enabling globa…

    Submitted 7 September, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: Accepted at Proceedings of the 42nd International Conference on Computer-Aided Design (ICCAD), 2023

  8. arXiv:2307.00670

    cs.LG math.OC

    Automatic MILP Solver Configuration By Learning Problem Similarities

    Authors: Abdelrahman Hosny, Sherief Reda

    Abstract: A large number of real-world optimization problems can be formulated as Mixed Integer Linear Programs (MILP). MILP solvers expose numerous configuration parameters to control their internal algorithms. Solutions, and their associated costs or runtimes, are significantly affected by the choice of the configuration parameters, even when problem instances have the same number of decision variables an…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: To appear in Annals of Operations Research (ANOR)

  9. arXiv:2304.08594

    cs.CV

    AdaMTL: Adaptive Input-dependent Inference for Efficient Multi-Task Learning

    Authors: Marina Neseem, Ahmed Agiza, Sherief Reda

    Abstract: Modern Augmented reality applications require performing multiple tasks on each input frame simultaneously. Multi-task learning (MTL) represents an effective approach where multiple tasks share an encoder to extract representative features from the input frame, followed by task-specific decoders to generate predictions for each task. Generally, the shared encoder in MTL models needs to have a larg…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: This paper will appear in the proceedings of CVPR 2023

  10. arXiv:2207.00459

    cs.AR

    RUCA: RUntime Configurable Approximate Circuits with Self-Correcting Capability

    Authors: Jingxiao Ma, Sherief Reda

    Abstract: Approximate computing is an emerging computing paradigm that offers improved power consumption by relaxing the requirement for full accuracy. Since real-world applications may have different requirements for design accuracy, one trend of approximate computing is to design runtime quality-configurable circuits, which are able to operate under different accuracy modes with different power consumptio…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: 8 pages, 7 figures, to be published in 30th International Workshop on Logic & Synthesis

    ACM Class: B.6.3; B.2.4; C.0

  11. BitTrain: Sparse Bitmap Compression for Memory-Efficient Training on the Edge

    Authors: Abdelrahman Hosny, Marina Neseem, Sherief Reda

    Abstract: Training on the Edge enables neural networks to learn continuously from new data after deployment on memory-constrained edge devices. Previous work is mostly concerned with reducing the number of model parameters which is only beneficial for inference. However, memory footprint from activations is the main bottleneck for training on the edge. Existing incremental training methods fine-tune the las…

    Submitted 29 October, 2021; originally announced October 2021.

    Comments: 12 pages, 13 figures, to appear in the proceedings of The Sixth ACM/IEEE Symposium on Edge Computing (SEC 2021)

  12. arXiv:2108.06850

    cs.CV cs.AI

    AdaCon: Adaptive Context-Aware Object Detection for Resource-Constrained Embedded Devices

    Authors: Marina Neseem, Sherief Reda

    Abstract: Convolutional Neural Networks achieve state-of-the-art accuracy in object detection tasks. However, they have large computational and energy requirements that challenge their deployment on resource-constrained edge devices. Object detection takes an image as an input, and identifies the existing object classes as well as their locations in the image. In this paper, we leverage the prior knowledge…

    Submitted 15 August, 2021; originally announced August 2021.

    Comments: 9 pages, 6 figures, 2021 IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2021)

  13. arXiv:2102.10800

    cs.DC cs.AI

    Characterizing and Optimizing EDA Flows for the Cloud

    Authors: Abdelrahman Hosny, Sherief Reda

    Abstract: Cloud computing accelerates design space exploration in logic synthesis, and parameter tuning in physical design. However, deploying EDA jobs on the cloud requires EDA teams to deeply understand the characteristics of their jobs in cloud environments. Unfortunately, there has been little to no public information on these characteristics. Thus, in this paper, we formulate the problem of migrating E…

    Submitted 22 February, 2021; originally announced February 2021.

    Comments: Presented at DATE2021

  14. arXiv:2006.05884

    eess.SP cs.HC cs.LG

    AdaSense: Adaptive Low-Power Sensing and Activity Recognition for Wearable Devices

    Authors: Marina Neseem, Jon Nelson, Sherief Reda

    Abstract: Wearable devices have strict power and memory limitations. As a result, there is a need to optimize the power consumption on those devices without sacrificing the accuracy. This paper presents AdaSense: a sensing, feature extraction and classification co-optimized framework for Human Activity Recognition. The proposed techniques reduce the power consumption by dynamically switching among different…

    Submitted 10 June, 2020; originally announced June 2020.

    Comments: 6 pages, 7 figures, To appear in DAC 2020

  15. arXiv:1911.04021

    cs.AI cs.LG cs.NE

    DRiLLS: Deep Reinforcement Learning for Logic Synthesis

    Authors: Abdelrahman Hosny, Soheil Hashemi, Mohamed Shalan, Sherief Reda

    Abstract: Logic synthesis requires extensive tuning of the synthesis optimization flow where the quality of results (QoR) depends on the sequence of optimizations used. Efficient design space exploration is challenging due to the exponential number of possible optimization permutations. Therefore, automating the optimization process is necessary. In this work, we propose a novel reinforcement learning-based…

    Submitted 12 November, 2019; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: ASPDAC'2020

  16. arXiv:1909.03385

    cs.NE cs.CV eess.IV

    A Resource-Efficient Embedded Iris Recognition System Using Fully Convolutional Networks

    Authors: Hokchhay Tann, Heng Zhao, Sherief Reda

    Abstract: Applications of Fully Convolutional Networks (FCN) in iris segmentation have shown promising advances. For mobile and embedded systems, a significant challenge is that the proposed FCN architectures are extremely computationally demanding. In this article, we propose a resource-efficient, end-to-end iris recognition flow, which consists of FCN-based segmentation, contour fitting, followed by Daugm…

    Submitted 8 September, 2019; originally announced September 2019.

  17. arXiv:1905.02187

    cs.ET cond-mat.other q-bio.MN

    Principles of Information Storage in Small-Molecule Mixtures

    Authors: Jacob K. Rosenstein, Christopher Rose, Sherief Reda, Peter M. Weber, Eunsuk Kim, Jason Sello, Joseph Geiser, Eamonn Kennedy, Christopher Arcadia, Amanda Dombroski, Kady Oakley, Shui Ling Chen, Hokchhay Tann, Brenda M. Rubenstein

    Abstract: Molecular data systems have the potential to store information at dramatically higher density than existing electronic media. Some of the first experimental demonstrations of this idea have used DNA, but nature also uses a wide diversity of smaller non-polymeric molecules to preserve, process, and transmit information. In this paper, we present a general framework for quantifying chemical memory,…

    Submitted 6 May, 2019; originally announced May 2019.

  18. arXiv:1810.05214

    cs.ET cond-mat.other physics.chem-ph q-bio.MN

    Parallelized Linear Classification with Volumetric Chemical Perceptrons

    Authors: Christopher E. Arcadia, Hokchhay Tann, Amanda Dombroski, Kady Ferguson, Shui Ling Chen, Eunsuk Kim, Christopher Rose, Brenda M. Rubenstein, Sherief Reda, Jacob K. Rosenstein

    Abstract: In this work, we introduce a new type of linear classifier that is implemented in a chemical form. We propose a novel encoding technique which simultaneously represents multiple datasets in an array of microliter-scale chemical mixtures. Parallel computations on these datasets are performed as robotic liquid handling sequences, whose outputs are analyzed by high-performance liquid chromatography.…

    Submitted 11 October, 2018; originally announced October 2018.

    Comments: Accepted to 2018 IEEE International Conference on Rebooting Computing

  19. arXiv:1808.09651

    cs.AR

    Implications of Integrated CPU-GPU Processors on Thermal and Power Management Techniques

    Authors: Kapil Dev, Indrani Paul, Wei Huang, Yasuko Eckert, Wayne Burleson, Sherief Reda

    Abstract: Heterogeneous processors with architecturally different cores (CPU and GPU) integrated on the same die lead to new challenges and opportunities for thermal and power management techniques because of shared thermal/power budgets between these cores. In this paper, we show that new parallel programming paradigms (e.g., OpenCL) for CPU-GPU processors create a tighter coupling between the workload, th…

    Submitted 29 August, 2018; originally announced August 2018.

    Comments: 9 pages, 8 figures, 2 tables

  20. BLASYS: Approximate Logic Synthesis Using Boolean Matrix Factorization

    Authors: Soheil Hashemi, Hokchhay Tann, Sherief Reda

    Abstract: Approximate computing is an emerging paradigm where design accuracy can be traded off for benefits in design metrics such as design area, power consumption or circuit complexity. In this work, we present a novel paradigm to synthesize approximate circuits using Boolean matrix factorization (BMF). In our methodology the truth table of a sub-circuit of the design is approximated using BMF to a contr…

    Submitted 15 May, 2018; originally announced May 2018.

    Comments: To Appear in DAC'18

  21. arXiv:1801.07353

    cs.NE cs.LG stat.ML

    Flexible Deep Neural Network Processing

    Authors: Hokchhay Tann, Soheil Hashemi, Sherief Reda

    Abstract: The recent success of Deep Neural Networks (DNNs) has drastically improved the state of the art for many application domains. While achieving high accuracy performance, deploying state-of-the-art DNNs is a challenge since they typically require billions of expensive arithmetic computations. In addition, DNNs are typically deployed in ensemble to boost accuracy performance, which further exacerbate…

    Submitted 22 January, 2018; originally announced January 2018.

    Comments: 6 pages, 4 figures

  22. arXiv:1705.04288

    cs.NE

    Hardware-Software Codesign of Accurate, Multiplier-free Deep Neural Networks

    Authors: Hokchhay Tann, Soheil Hashemi, Iris Bahar, Sherief Reda

    Abstract: While Deep Neural Networks (DNNs) push the state-of-the-art in many machine learning applications, they often require millions of expensive floating-point operations for each input classification. This computation overhead limits the applicability of DNNs to low-power, embedded platforms and incurs high cost in data centers. This motivates recent interests in designing low-power, low-latency DNNs…

    Submitted 11 May, 2017; originally announced May 2017.

    Comments: 6 pages

  23. arXiv:1612.03940

    cs.NE

    Understanding the Impact of Precision Quantization on the Accuracy and Energy of Neural Networks

    Authors: Soheil Hashemi, Nicholas Anthony, Hokchhay Tann, R. Iris Bahar, Sherief Reda

    Abstract: Deep neural networks are gaining in popularity as they are used to generate state-of-the-art results for a variety of computer vision and machine learning applications. At the same time, these networks have grown in depth and complexity in order to solve harder problems. Given the limitations in power budgets dedicated to these networks, the importance of low-power, low-memory solutions has been s…

    Submitted 12 December, 2016; originally announced December 2016.

    Comments: Accepted for conference proceedings in DATE17

  24. Runtime Configurable Deep Neural Networks for Energy-Accuracy Trade-off

    Authors: Hokchhay Tann, Soheil Hashemi, R. Iris Bahar, Sherief Reda

    Abstract: We present a novel dynamic configuration technique for deep neural networks that permits step-wise energy-accuracy trade-offs during runtime. Our configuration technique adjusts the number of channels in the network dynamically depending on response time, power, and accuracy targets. To enable this dynamic configuration technique, we co-design a new training algorithm, where the network is increme…

    Submitted 20 July, 2016; v1 submitted 19 July, 2016; originally announced July 2016.