Showing 1–50 of 137 results for author: Gregg, D

  1. Using Ensemble Inference to Improve Recall of Clone Detection

    Authors: Gul Aftab Ahmed, James Vincent Patten, Yuanhua Han, Guoxian Lu, David Gregg, Jim Buckley, Muslim Chochlov

    Abstract: Large-scale source-code clone detection is a challenging task. In our previous work, we proposed an approach (SSCD) that leverages artificial neural networks and approximates nearest neighbour search to effectively and efficiently locate clones in large-scale bodies of code, in a time-efficient manner. However, our literature review suggests that the relative efficacy of differing neural network m…

    Submitted 12 February, 2024; originally announced February 2024.

    Journal ref: 2023 IEEE 17th International Workshop on Software Clones (IWSC)

  2. Using a Nearest-Neighbour, BERT-Based Approach for Scalable Clone Detection

    Authors: Muslim Chochlov, Gul Aftab Ahmed, James Vincent Patten, Guoxian Lu, Wei Hou, David Gregg, Jim Buckley

    Abstract: Code clones can detrimentally impact software maintenance and manually detecting them in very large codebases is impractical. Additionally, automated approaches find detection of Type 3 and Type 4 (inexact) clones very challenging. While the most recent artificial deep neural networks (for example BERT-based artificial neural networks) seem to be highly effective in detecting such clones, their pa…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: 10 pages, 2 figures, 38th IEEE International Conference on Software Maintenance and Evolution

  3. arXiv:2303.15199  [pdf, other]

    cs.AR

    Maple: A Processing Element for Row-Wise Product Based Sparse Tensor Accelerators

    Authors: Midia Reshadi, David Gregg

    Abstract: Sparse tensor computing is a core computational part of numerous applications in areas such as data science, graph processing, and scientific computing. Sparse tensors offer the potential of skipping unnecessary computations caused by zero values. In this paper, we propose a new strategy for extending row-wise product sparse tensor accelerators. We propose a new processing element called Maple tha…

    Submitted 27 March, 2023; originally announced March 2023.

  4. arXiv:2302.10806  [pdf, other]

    cs.AR

    Dynamic Resource Partitioning for Multi-Tenant Systolic Array Based DNN Accelerator

    Authors: Midia Reshadi, David Gregg

    Abstract: Deep neural networks (DNN) have become significant applications in both cloud-server and edge devices. Meanwhile, the growing number of DNNs on those platforms raises the need to execute multiple DNNs on the same device. This paper proposes a dynamic partitioning algorithm to perform concurrent processing of multiple DNNs on a systolic-array-based accelerator. Sharing an accelerator's storage and…

    Submitted 21 February, 2023; originally announced February 2023.

  5. arXiv:2301.05335  [pdf, other]

    astro-ph.SR astro-ph.GA astro-ph.IM

    HST Low Resolution Stellar Library

    Authors: Tathagata Pal, Islam Khan, Guy Worthey, Michael D. Gregg, David R. Silva

    Abstract: Hubble Space Telescope's (HST) Space Telescope Imaging Spectrograph (STIS) targeted 556 stars in a long-running program called Next Generation Spectral Library (NGSL) via proposals GO9088, GO9786, GO10222, and GO13776. Exposures through three low resolution gratings provide wavelength coverage from 0.2 $< λ<$ 1 $μ$m at $λ/Δλ\sim$ 1000, providing unique coverage in the ultraviolet (UV). The UV grat…

    Submitted 18 April, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: 19 pages, 20 figures, 3 tables. Full version of table 3 is available online at https://archive.stsci.edu/hlsp/lowlib

  6. The globular cluster system of the nearest Seyfert II galaxy Circinus

    Authors: C. Obasi, M. Gómez, D. Minniti, J. Alonso-García, M. Hempel, J. B. Pullen, M. D. Gregg, L. D. Baravalle, M. V. Alonso, B. I. Okere

    Abstract: Context. The globular cluster (GC) system of Circinus galaxy has not been probed previously partly because of the location of the galaxy at - 3.8$^\circ$ Galactic latitude which suffers severely from interstellar extinction, stellar crowding, and Galactic foreground contamination. However, the deep near-infrared (NIR) photometry by the VISTA Variables in the Via Láctea Extended Survey (VVVX) in co…

    Submitted 11 December, 2022; originally announced December 2022.

    Comments: 15 pages, 12 figures

    Journal ref: A&A 670, A18 (2023)

  7. LOCAL: Low-Complex Mapping Algorithm for Spatial DNN Accelerators

    Authors: Midia Reshadi, David Gregg

    Abstract: Deep neural networks are a promising solution for applications that solve problems based on learning data sets. DNN accelerators solve the processing bottleneck as a domain-specific processor. Like other hardware solutions, there must be exact compatibility between the accelerator and other software components, especially the compiler. This paper presents a LOCAL (Low Complexity mapping Algorithm)…

    Submitted 7 November, 2022; originally announced November 2022.

  8. arXiv:2210.03777  [pdf, other]

    cs.RO eess.SY

    Optimal Energy Shaping Control for a Backdrivable Hip Exoskeleton

    Authors: Jiefu Zhang, Jianping Lin, Vamsi Peddinti, Robert D. Gregg

    Abstract: Task-dependent controllers widely used in exoskeletons track predefined trajectories, which overly constrain the volitional motion of individuals with remnant voluntary mobility. Energy shaping, on the other hand, provides task-invariant assistance by altering the human body's dynamic characteristics in the closed loop. While human-exoskeleton systems are often modeled using Euler-Lagrange equatio…

    Submitted 25 March, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

  9. arXiv:2209.12365  [pdf, other]

    cs.RO

    Deep Convolutional Neural Network and Transfer Learning for Locomotion Intent Prediction

    Authors: Duong Le, Shihao Cheng, Robert D. Gregg, Maani Ghaffari

    Abstract: Powered prosthetic legs must anticipate the user's intent when switching between different locomotion modes (e.g., level walking, stair ascent/descent, ramp ascent/descent). Numerous data-driven classification techniques have demonstrated promising results for predicting user intent, but the performance of these intent prediction models on novel subjects remains undesirable. In other domains (e.g.…

    Submitted 25 September, 2022; originally announced September 2022.

  10. arXiv:2205.02131  [pdf, other]

    cs.CV cs.LG

    Domino Saliency Metrics: Improving Existing Channel Saliency Metrics with Structural Information

    Authors: Kaveena Persand, Andrew Anderson, David Gregg

    Abstract: Channel pruning is used to reduce the number of weights in a Convolutional Neural Network (CNN). Channel pruning removes slices of the weight tensor so that the convolution layer remains dense. The removal of these weight slices from a single layer causes mismatching number of feature maps between layers of the network. A simple solution is to force the number of feature map between layers to matc…

    Submitted 19 June, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

  11. Real-Time Gait Phase and Task Estimation for Controlling a Powered Ankle Exoskeleton on Extremely Uneven Terrain

    Authors: Roberto Leo Medrano, Gray Cortright Thomas, Connor G. Keais, Elliott J. Rouse, Robert D. Gregg

    Abstract: Positive biomechanical outcomes have been reported with lower-limb exoskeletons in laboratory settings, but these devices have difficulty delivering appropriate assistance in synchrony with human gait as the task or rate of phase progression change in real-world environments. This paper presents a controller for an ankle exoskeleton that uses a data-driven kinematic model to continuously estimate…

    Submitted 6 October, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

  12. arXiv:2202.07181  [pdf]

    cond-mat.mtrl-sci

    Structural and phase evolution in U$_3$Si$_2$ during steam corrosion

    Authors: Jiatu Liu, Patrick A. Burr, Joshua T. White, Vanessa K. Peterson, Pranesh Dayal, Christopher Baldwin, Deborah Wakeham, Daniel J. Gregg, Elizabeth S. Sooby, Edward G. Obbard

    Abstract: U$_3$Si$_2$ nuclear fuel is corroded in deuterated steam with in situ neutron diffraction. Density functional theory is coupled with rigorous thermodynamic description of the hydride including gas/solid entropy contributions. H absorbs in the 2$b$ interstitial site of U$_3$Si$_2$H$_x$ and moves to 8$j$ for $x\ge 0.5$. Hydriding forces lattice expansion and change in $a/c$ ratio linked to site pref…

    Submitted 14 February, 2022; originally announced February 2022.

  13. arXiv:2201.11409  [pdf, ps, other]

    cs.AR

    On the RTL Implementation of FINN Matrix Vector Compute Unit

    Authors: Syed Asad Alam, David Gregg, Giulio Gambardella, Thomas Preusser, Michaela Blott

    Abstract: FPGA-based accelerators are becoming more popular for deep neural network due to the ability to scale performance with increasing degree of specialization with dataflow architectures or custom data types. To reduce the barrier for software engineers and data scientists to adopt FPGAs, C++- and OpenCL-based design entries with high-level synthesis (HLS) have been introduced. They provide higher abs…

    Submitted 10 April, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: 22 pages, 7 tables, 16 figures

    ACM Class: B.5.0; B.2.5

  14. arXiv:2201.10369  [pdf, ps, other]

    cs.CV

    Winograd Convolution for Deep Neural Networks: Efficient Point Selection

    Authors: Syed Asad Alam, Andrew Anderson, Barbara Barabasz, David Gregg

    Abstract: Convolutional neural networks (CNNs) have dramatically improved the accuracy of tasks such as object recognition, image segmentation and interactive speech systems. CNNs require large amounts of computing resources because of computationally intensive convolution layers. Fast convolution algorithms such as Winograd convolution can greatly reduce the computational cost of these layers at a cost of p…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: 19 pages, 3 figures, 9 tables and 32 equations

    ACM Class: C.3.2; G.0

  15. arXiv:2110.01562  [pdf, other]

    cs.RO

    Enhancing Voluntary Motion with Modular, Backdrivable, Powered Hip and Knee Orthoses

    Authors: Christopher Nesler, Gray Thomas, Nikhil Divekar, Elliott J. Rouse, Robert D. Gregg

    Abstract: Mobility disabilities are prominent in society with wide-ranging detriments to affected individuals. Addressing the specific deficits of individuals within this heterogeneous population requires modular, partial-assist, lower-limb exoskeletons. This paper introduces the Modular Backdrivable Lower-limb Unloading Exoskeleton (M-BLUE), which implements high torque, low mechanical impedance actuators…

    Submitted 4 October, 2021; originally announced October 2021.

    Comments: 8 pages, 7 figures

  16. Lower-limb kinematics and kinetics during continuously varying human locomotion

    Authors: Emma Reznick, Kyle R. Embry, Ross Neuman, Edgar Bolívar-Nieto, Nicholas P. Fey, Robert D. Gregg

    Abstract: Human locomotion involves continuously variable activities including walking, running, and stair climbing over a range of speeds and inclinations as well as sit-stand, walk-run, and walk-stairs transitions. Understanding the kinematics and kinetics of the lower limbs during continuously varying locomotion is fundamental to developing robotic prostheses and exoskeletons that assist in community amb…

    Submitted 27 August, 2021; originally announced August 2021.

  17. arXiv:2102.06681  [pdf, ps, other]

    math.NA eess.SP

    Low precision logarithmic number systems: Beyond base-2

    Authors: Syed Asad Alam, James Garland, David Gregg

    Abstract: Logarithmic number systems (LNS) are used to represent real numbers in many applications using a constant base raised to a fixed-point exponent making its distribution exponential. This greatly simplifies hardware multiply, divide and square root. LNS with base-2 is most common, but in this paper we show that for low-precision LNS the choice of base has a significant impact. We make four main co…

    Submitted 12 February, 2021; originally announced February 2021.

    Comments: 22 pages, 12 figures, 8 tables, conference extension

    MSC Class: 65G50 ACM Class: C.m; G.0

    Journal ref: Syed Asad Alam, James Garland, and David Gregg. 2021. Low-precision Logarithmic Number Systems: Beyond Base-2. ACM Trans. Archit. Code Optim. 18, 4, Article 47 (December 2021), 25 pages

  18. Family sizes for complete multipartite graphs

    Authors: Danielle Gregg, Thomas W. Mattman, Zachary Porat, George Todd

    Abstract: The obstruction set for graphs with knotless embeddings is not known, but a recent paper of Goldberg, Mattman, and Naimi indicates that it is quite large. Almost all known obstructions fall into four Triangle-Y families and they ask if there is an efficient way of finding or estimating the size of such graph families. Inspired by this question, we investigate the family size for complete multipart…

    Submitted 30 March, 2021; v1 submitted 29 August, 2020; originally announced August 2020.

    Comments: 15 pages, 6 figures, 8 tables v2 - substantial revision including improved estimate of family size of K_n family

    MSC Class: Primary 05C10; Secondary 57M15; 05C35

    Journal ref: Involve 15 (2022) 669-686

  19. arXiv:2007.06563  [pdf, other]

    cs.AR cs.LG

    HOBFLOPS CNNs: Hardware Optimized Bitslice-Parallel Floating-Point Operations for Convolutional Neural Networks

    Authors: James Garland, David Gregg

    Abstract: Convolutional neural networks (CNNs) are typically trained using 16- or 32-bit floating-point (FP) and researchers show that low-precision floating-point (FP) can be highly effective for inference. Low-precision FP can be implemented in field programmable gate array (FPGA) and application-specific integrated circuit (ASIC) accelerators, but existing processors do not generally support custom preci…

    Submitted 28 February, 2021; v1 submitted 10 July, 2020; originally announced July 2020.

    Comments: 14 pages, 3 tables, 9 figures

  20. arXiv:2006.11967  [pdf, other]

    cs.LG stat.ML

    Exploiting Weight Redundancy in CNNs: Beyond Pruning and Quantization

    Authors: Yuan Wen, David Gregg

    Abstract: Pruning and quantization are proven methods for improving the performance and storage efficiency of convolutional neural networks (CNNs). Pruning removes near-zero weights in tensors and masks weak connections between neurons in neighbouring layers. Quantization reduces the precision of weights by replacing them with numerically similar values that require less storage. In this paper, we identify…

    Submitted 21 June, 2020; originally announced June 2020.

  21. arXiv:2005.10709  [pdf, other]

    cs.LG stat.ML

    TASO: Time and Space Optimization for Memory-Constrained DNN Inference

    Authors: Yuan Wen, Andrew Anderson, Valentin Radu, Michael F. P. O'Boyle, David Gregg

    Abstract: Convolutional neural networks (CNNs) are used in many embedded applications, from industrial robotics and automation systems to biometric identification on mobile devices. State-of-the-art classification is typically achieved by large networks, which are prohibitively expensive to run on mobile and embedded devices with tightly constrained memory and energy budgets. We propose an approach for ahea…

    Submitted 21 May, 2020; originally announced May 2020.

  22. Composition of Saliency Metrics for Channel Pruning with a Myopic Oracle

    Authors: Kaveena Persand, Andrew Anderson, David Gregg

    Abstract: The computation and memory needed for Convolutional Neural Network (CNN) inference can be reduced by pruning weights from the trained network. Pruning is guided by a pruning saliency, which heuristically approximates the change in the loss function associated with the removal of specific weights. Many pruning signals have been proposed, but the performance of each heuristic depends on the particul…

    Submitted 24 June, 2021; v1 submitted 3 April, 2020; originally announced April 2020.

    Journal ref: 2020 IEEE Symposium Series on Computational Intelligence (SSCI)

  23. arXiv:2001.02976  [pdf, other]

    cs.LG cs.NE

    Performance-Oriented Neural Architecture Search

    Authors: Andrew Anderson, Jing Su, Rozenn Dahyot, David Gregg

    Abstract: Hardware-Software Co-Design is a highly successful strategy for improving performance of domain-specific computing systems. We argue for the application of the same methodology to deep learning; specifically, we propose to extend neural architecture search with information about the hardware to ensure that the model designs produced are highly efficient in addition to the typical criteria around a…

    Submitted 9 January, 2020; originally announced January 2020.

    Comments: The 2019 International Conference on High Performance Computing & Simulation

  24. Taxonomy of Saliency Metrics for Channel Pruning

    Authors: Kaveena Persand, Andrew Anderson, David Gregg

    Abstract: Pruning unimportant parameters can allow deep neural networks (DNNs) to reduce their heavy computation and memory requirements. A saliency metric estimates which parameters can be safely pruned with little impact on the classification performance of the DNN. Many saliency metrics have been proposed, each within the context of a wider pruning algorithm. The result is that it is difficult to separat…

    Submitted 4 July, 2021; v1 submitted 11 June, 2019; originally announced June 2019.

    Journal ref: IEEE Access, vol. 9, pp. 120110-120126, 2021

  25. arXiv:1905.05233  [pdf, other]

    cs.LG stat.ML

    Winograd Convolution for DNNs: Beyond linear polynomials

    Authors: Barbara Barabasz, David Gregg

    Abstract: Winograd convolution is widely used in deep neural networks (DNNs). Existing work for DNNs considers only the subset Winograd algorithms that are equivalent to Toom-Cook convolution. We investigate a wider range of Winograd algorithms for DNNs and show that these additional algorithms can significantly improve floating point (FP) accuracy in many cases. We present results for three FP formats: fp3…

    Submitted 25 June, 2019; v1 submitted 13 May, 2019; originally announced May 2019.

  26. arXiv:1901.05049  [pdf, other]

    cs.LG cs.DC cs.SD eess.AS stat.ML

    Bonseyes AI Pipeline -- bringing AI to you. End-to-end integration of data, algorithms and deployment tools

    Authors: Miguel de Prado, Jing Su, Rabia Saeed, Lorenzo Keller, Noelia Vallez, Andrew Anderson, David Gregg, Luca Benini, Tim Llewellynn, Nabil Ouerhani, Rozenn Dahyot, Nuria Pazos

    Abstract: Next generation of embedded Information and Communication Technology (ICT) systems are collaborative systems able to perform autonomous tasks. The remarkable expansion of the embedded ICT market, together with the rise and breakthroughs of Artificial Intelligence (AI), have put the focus on the Edge as it stands as one of the keys for the next technological revolution: the seamless integration of…

    Submitted 11 June, 2020; v1 submitted 15 January, 2019; originally announced January 2019.

  27. arXiv:1812.04771  [pdf, other]

    cs.RO

    Robust Optimal Design of Energy Efficient Series Elastic Actuators: Application to a Powered Prosthetic Ankle

    Authors: Edgar Bolívar, Siavash Rezazadeh, Tyler Summers, Robert D. Gregg

    Abstract: Design of robotic systems that safely and efficiently operate in uncertain operational conditions, such as rehabilitation and physical assistance robots, remains an important challenge in the field. Current methods for the design of energy efficient series elastic actuators use an optimization formulation that typically assumes known operational conditions. This approach could lead to actuators th…

    Submitted 5 February, 2019; v1 submitted 11 December, 2018; originally announced December 2018.

  28. arXiv:1811.05414  [pdf, other]

    cs.RO

    A Phase Variable Approach for Improved Rhythmic and Non-Rhythmic Control of a Powered Knee-Ankle Prosthesis

    Authors: Siavash Rezazadeh, David Quintero, Nikhil Divekar, Emma Reznick, Leslie Gray, Robert D. Gregg

    Abstract: Although there has been recent progress in control of multi-joint prosthetic legs for rhythmic tasks such as walking, control of these systems for non-rhythmic motions and general real-world maneuvers is still an open problem. In this article, we develop a new controller that is capable of both rhythmic (constant-speed) walking, transitions between speeds and/or tasks, and some common volitional l…

    Submitted 4 August, 2019; v1 submitted 13 November, 2018; originally announced November 2018.

  29. arXiv:1809.10572  [pdf, other]

    cs.PF cs.CV cs.MS

    Scalar Arithmetic Multiple Data: Customizable Precision for Deep Neural Networks

    Authors: Andrew Anderson, David Gregg

    Abstract: Quantization of weights and activations in Deep Neural Networks (DNNs) is a powerful technique for network compression, and has enjoyed significant attention and success. However, much of the inference-time benefit of quantization is accessible only through the use of customized hardware accelerators or by providing an FPGA implementation of quantized arithmetic. Building on prior work, we show…

    Submitted 12 December, 2019; v1 submitted 27 September, 2018; originally announced September 2018.

  30. arXiv:1803.10986  [pdf, other]

    math.NA cs.LG stat.ML

    Error Analysis and Improving the Accuracy of Winograd Convolution for Deep Neural Networks

    Authors: Barbara Barabasz, Andrew Anderson, Kirk M. Soodhalter, David Gregg

    Abstract: Popular deep neural networks (DNNs) spend the majority of their execution time computing convolutions. The Winograd family of algorithms can greatly reduce the number of arithmetic operations required and is present in many DNN software frameworks. However, the performance gain is at the expense of a reduction in floating point (FP) numerical accuracy. In this paper, we analyse the worst case FP e…

    Submitted 1 May, 2019; v1 submitted 29 March, 2018; originally announced March 2018.

  31. arXiv:1801.10219  [pdf, other]

    cs.AR

    Low Complexity Multiply-Accumulate Units for Convolutional Neural Networks with Weight-Sharing

    Authors: James Garland, David Gregg

    Abstract: Convolutional neural networks (CNNs) are one of the most successful machine learning techniques for image, voice and video processing. CNNs require large amounts of processing capacity and memory bandwidth. Hardware accelerators have been proposed for CNNs which typically contain large numbers of multiply-accumulate (MAC) units, the multipliers of which are large in an integrated circuit (IC) gate…

    Submitted 1 May, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

  32. arXiv:1710.01079  [pdf, other]

    cs.PF cs.AI cs.CV

    Optimal DNN Primitive Selection with Partitioned Boolean Quadratic Programming

    Authors: Andrew Anderson, David Gregg

    Abstract: Deep Neural Networks (DNNs) require very large amounts of computation both for training and for inference when deployed in the field. Many different algorithms have been proposed to implement the most computationally expensive layers of DNNs. Further, each of these algorithms has a large number of variants, which offer different trade-offs of parallelism, data locality, memory footprint, and execu…

    Submitted 2 November, 2018; v1 submitted 3 October, 2017; originally announced October 2017.

  33. arXiv:1709.03395  [pdf, other]

    cs.CV

    Low-memory GEMM-based convolution algorithms for deep neural networks

    Authors: Andrew Anderson, Aravind Vasudevan, Cormac Keane, David Gregg

    Abstract: Deep neural networks (DNNs) require very large amounts of computation both for training and for inference when deployed in the field. A common approach to implementing DNNs is to recast the most computationally expensive operations as general matrix multiplication (GEMM). However, as we demonstrate in this paper, there are a great many different ways to express DNN convolution operations using GEM…

    Submitted 8 September, 2017; originally announced September 2017.

    Comments: 13 pages, 16 figures and 3 tables. arXiv admin note: text overlap with arXiv:1704.04428

  34. arXiv:1704.08449  [pdf, other]

    astro-ph.SR astro-ph.EP

    The full spectral radiative properties of Proxima Centauri

    Authors: Ignasi Ribas, Michael D. Gregg, Tabetha S. Boyajian, Emeline Bolmont

    Abstract: The discovery of Proxima b, a terrestrial temperate planet, presents the opportunity of studying a potentially habitable world in optimal conditions. A key aspect to model its habitability is to understand the radiation environment of the planet in the full spectral domain. We characterize the X-rays to mid-IR radiative properties of Proxima with the goal of providing the top-of-atmosphere fluxes…

    Submitted 27 April, 2017; originally announced April 2017.

    Comments: 12 pages, 5 figures, accepted for publication in Astronomy & Astrophysics

    Journal ref: A&A 603, A58 (2017)

  35. arXiv:1704.04428  [pdf, other]

    cs.CV cs.PF

    Parallel Multi Channel Convolution using General Matrix Multiplication

    Authors: Aravind Vasudevan, Andrew Anderson, David Gregg

    Abstract: Convolutional neural networks (CNNs) have emerged as one of the most successful machine learning technologies for image and video processing. The most computationally intensive parts of CNNs are the convolutional layers, which convolve multi-channel images with multiple kernels. A common approach to implementing convolutional layers is to expand the image into a column matrix (im2col) and perform…

    Submitted 3 July, 2017; v1 submitted 6 April, 2017; originally announced April 2017.

    Comments: Camera ready version to be published at ASAP 2017 - The 28th Annual IEEE International Conference on Application-specific Systems, Architectures and Processors. 6 pages

  36. arXiv:1701.08800  [pdf, other]

    cs.DC

    Mutual Inclusivity of the Critical Path and its Partial Schedule on Heterogeneous Systems

    Authors: Aravind Vasudevan, David Gregg

    Abstract: The critical path of a group of tasks is an important measure that is commonly used to guide task allocation and scheduling on parallel computers. The critical path is the longest chain of dependencies in an acyclic task dependence graph. A problem arises on heterogeneous parallel machines where computation and communication costs can vary between different types of processor. Existing solutions f…

    Submitted 30 January, 2017; originally announced January 2017.

  37. arXiv:1611.05378  [pdf, ps, other]

    cs.LG stat.ML

    Spectral Convolution Networks

    Authors: Maria Francesca, Arthur Hughes, David Gregg

    Abstract: Previous research has shown that computation of convolution in the frequency domain provides a significant speedup versus traditional convolution network implementations. However, this performance increase comes at the expense of repeatedly computing the transform and its inverse in order to apply other network operations such as activation, pooling, and dropout. We show, mathematically, how convo…

    Submitted 16 November, 2016; originally announced November 2016.

  38. Low Complexity Multiply Accumulate Unit for Weight-Sharing Convolutional Neural Networks

    Authors: James Garland, David Gregg

    Abstract: Convolutional Neural Networks (CNNs) are one of the most successful deep machine learning technologies for processing image, voice and video data. CNNs require large amounts of processing capacity and memory, which can exceed the resources of low power mobile and embedded systems. Several designs for hardware accelerators have been proposed for CNNs which typically contain large numbers of Multipl…

    Submitted 19 January, 2017; v1 submitted 30 August, 2016; originally announced September 2016.

    Comments: 4 pages

  39. arXiv:1602.04716  [pdf, other]

    cs.OH

    Customizable Precision of Floating-Point Arithmetic with Bitslice Vector Types

    Authors: Shixiong Xu, David Gregg

    Abstract: Customizing the precision of data can provide attractive trade-offs between accuracy and hardware resources. We propose a novel form of vector computing aimed at arrays of custom-precision floating point data. We represent these vectors in bitslice format. Bitwise instructions are used to implement arithmetic circuits in software that operate on customized bit-precision. Experiments show that this…

    Submitted 15 February, 2016; originally announced February 2016.

  40. Vectorization of Multibyte Floating Point Data Formats

    Authors: Andrew Anderson, David Gregg

    Abstract: We propose a scheme for reduced-precision representation of floating point data on a continuum between IEEE-754 floating point types. Our scheme enables the use of lower precision formats for a reduction in storage space requirements and data transfer volume. We describe how our scheme can be accelerated using existing hardware vector units on a general-purpose processor (GPP). Exploiting native v…

    Submitted 22 July, 2016; v1 submitted 26 January, 2016; originally announced January 2016.

    ACM Class: D.3.4; G.1.0; B.2.4; I.4.2

  41. arXiv:1508.01753  [pdf, other]

    cs.DS

    Practical Algorithms for Finding Extremal Sets

    Authors: Martin Marinov, Nicholas Nash, David Gregg

    Abstract: The minimal sets within a collection of sets are defined as the ones which do not have a proper subset within the collection, and the maximal sets are the ones which do not have a proper superset within the collection. Identifying extremal sets is a fundamental problem with a wide-range of applications in SAT solvers, data-mining and social network analysis. In this paper, we present two novel imp…

    Submitted 7 August, 2015; originally announced August 2015.

  42. arXiv:1507.05841  [pdf, other]

    cs.CC

    On the GI-Completeness of a Sorting Networks Isomorphism

    Authors: Martin Marinov, David Gregg

    Abstract: The subitemset isomorphism problem is really important and there are excellent practical solutions described in the literature. However, the computational complexity analysis and classification of the BZ (Bundala and Zavodny) subitemset isomorphism problem is currently an open problem. In this paper we prove that checking whether two sorting networks are BZ isomorphic to each other is GI-Complete;…

    Submitted 19 January, 2016; v1 submitted 21 July, 2015; originally announced July 2015.

    ACM Class: F.1.3; F.2.2

  43. arXiv:1502.05983  [pdf, ps, other]

    cs.DM cs.DS

    Sorting Networks: The Final Countdown

    Authors: Martin Marinov, David Gregg

    Abstract: In this paper we extend the knowledge on the problem of empirically searching for sorting networks of minimal depth. We present new search space pruning techniques for the last four levels of a candidate sorting network by considering only the output set representation of a network. We present an algorithm for checking whether an $n$-input sorting network of depth $d$ exists by considering the min…

    Submitted 20 February, 2015; originally announced February 2015.

    Comments: arXiv admin note: substantial text overlap with arXiv:1502.04748

    ACM Class: F.2.2

  44. arXiv:1502.04748  [pdf, ps, other]

    cs.DS

    The Takeoff Towards Optimal Sorting Networks

    Authors: Martin Marinov, David Gregg

    Abstract: A complete set of filters $F_n$ for the optimal-depth $n$-input sorting network problem is such that if there exists an $n$-input sorting network of depth $d$ then there exists one of the form $C \oplus C'$ for some $C \in F_n$. Previous work on the topic presents a method for finding complete sets of filters $R_{n, 1}$ and $R_{n, 2}$ that consist only of networks of depths one and two respectivel…

    Submitted 12 March, 2015; v1 submitted 16 February, 2015; originally announced February 2015.

    ACM Class: F.2.2

  45. arXiv:1310.3739  [pdf, ps, other]

    astro-ph.HE astro-ph.GA

    The X-ray Spectrum and Spectral Energy Distribution of FIRST J155633.8+351758: a LoBAL Quasar with a Probable Polar Outflow

    Authors: Robert C. Berrington, Michael S. Brotherton, Sarah C. Gallagher, Rajib Ganguly, Zhaohui Shang, Michael DiPompeo, Ritaban Chatterjee, Mark Lacy, Michael D. Gregg, Patrick B. Hall, S. A. Laurent-Muehleisen

    Abstract: We report the results of a new 60 ks Chandra X-ray Observatory Advanced CCD Imaging Spectrometer S-array (ACIS-S) observation of the reddened, radio-selected, highly polarized `FeLoBAL' quasar FIRST J1556+3517. We investigated a number of models of varied sophistication to fit the 531-photon spectrum. These models ranged from simple power laws to power laws absorbed by hydrogen gas in differing io…

    Submitted 14 October, 2013; originally announced October 2013.

    Comments: to be published in MNRAS

  46. The Lick AGN Monitoring Project 2011: Reverberation Mapping of Markarian 50

    Authors: A. J. Barth, A. Pancoast, S. J. Thorman, V. N. Bennert, D. J. Sand, W. Li, G. Canalizo, A. V. Filippenko, E. L. Gates, J. E. Greene, M. A. Malkan, D. Stern, T. Treu, J. -H. Woo, R. J. Assef, H. -J. Bae, B. J. Brewer, T. Buehler, S. B. Cenko, K. I. Clubb, M. C. Cooper, A. M. Diamond-Stanic, K. D. Hiner, S. F. Hoenig, M. D. Joner, et al. (24 additional authors not shown)

    Abstract: The Lick AGN Monitoring Project 2011 observing campaign was carried out over the course of 11 weeks in Spring 2011. Here we present the first results from this program, a measurement of the broad-line reverberation lag in the Seyfert 1 galaxy Mrk 50. Combining our data with supplemental observations obtained prior to the start of the main observing campaign, our dataset covers a total duration of…

    Submitted 31 October, 2011; originally announced November 2011.

    Comments: Accepted for publication in ApJ Letters. 6 pages, 4 figures

  47. The Globular Cluster Systems of Abell 1185

    Authors: Michael J. West, Andres Jordan, John P. Blakeslee, Patrick Cote, Michael D. Gregg, Marianne Takamiya, Ronald O. Marzke

    Abstract: We examine the properties of a previously discovered population of globular clusters in the heart of the rich galaxy cluster Abell 1185 that might be intergalactic in nature. Deep images obtained with the Advanced Camera for Surveys (ACS) aboard Hubble Space Telescope (HST) confirm the presence of ~ 1300 globular clusters brighter than I_{F814W} = 27.3 mag in a field devoid of any large galaxies.…

    Submitted 27 January, 2011; originally announced January 2011.

    Comments: Accepted for publication in Astronomy and Astrophysics, 13 pages, 15 figures

  48. Implications of Dramatic Broad Absorption Line Variability in the Quasar FBQS J1408+3054

    Authors: Patrick B. Hall, Konstantin Anosov, R. L. White, W. N. Brandt, M. D. Gregg, R. R. Gibson, R. H. Becker, D. P. Schneider

    Abstract: We have observed a dramatic change in the spectrum of the formerly heavily absorbed `overlapping-trough' iron low-ionization broad absorption line (FeLoBAL) quasar FBQS J1408+3054. Over a time span of between 0.6 and 5 rest-frame years, the Mg II trough outflowing at 12,000 km/s decreased in equivalent width by a factor of two and the Fe II troughs at the same velocity disappeared. The most likely…

    Submitted 8 December, 2010; v1 submitted 18 October, 2010; originally announced October 2010.

    Comments: Final version to appear in MNRAS: references added and factor of 2 underestimate of accretion disk size corrected, resulting in absorber constrained to be somewhat closer to the black hole. For an animated gif showing the spectral evolution of the broad absorption line troughs in this quasar, see http://www.yorku.ca/phall/film19952009.gif

  49. Spectropolarimetry of Radio-Selected Broad Absorption Line Quasars

    Authors: M. A. DiPompeo, M. S. Brotherton, R. H. Becker, H. D. Tran, M. D. Gregg, R. L. White, S. A. Laurent-Muehleisen

    Abstract: We report spectropolarimetry of 30 radio-selected broad absorption line (BAL) quasars with the Keck Observatory, 25 from the sample of Becker et al. (2000). Both high and low-ionization BAL quasars are represented, with redshifts ranging from 0.5 to 2.5. The spectropolarimetric properties of radio-selected BAL quasars are very similar to those of radio-quiet BAL quasars: a sizeable fraction (20%)…

    Submitted 30 June, 2010; originally announced July 2010.

    Journal ref: Astrophysical Journal Supplement Series, 189:83-103, 2010 July

  50. The Sloan Digital Sky Survey Quasar Lens Search. IV. Statistical Lens Sample from the Fifth Data Release

    Authors: Naohisa Inada, Masamune Oguri, Min-Su Shin, Issha Kayo, Michael A. Strauss, Joseph F. Hennawi, Tomoki Morokuma, Robert H. Becker, Richard L. White, Christopher S. Kochanek, Michael D. Gregg, Kuenley Chiu, David E. Johnston, Alejandro Clocchiatti, Gordon T. Richards, Donald P. Schneider, Joshua A. Frieman, Masataka Fukugita, J. Richard Gott III, Patrick B. Hall, Donald G. York, Francisco J. Castander, Neta A. Bahcall

    Abstract: We present the second report of our systematic search for strongly lensed quasars from the data of the Sloan Digital Sky Survey (SDSS). From extensive follow-up observations of 136 candidate objects, we find 36 lenses in the full sample of 77,429 spectroscopically confirmed quasars in the SDSS Data Release 5. We then define a complete sample of 19 lenses, including 11 from our previous search in t…

    Submitted 30 May, 2010; originally announced May 2010.

    Comments: 37 pages, 2 figures and 5 tables, accepted to AJ

    Journal ref: Astron.J.140:403-415,2010