Showing 1–7 of 7 results for author: Moro, D

Searching in archive cs.
  1. arXiv:2405.05171  [pdf, other]

    cs.LG

    Custom Gradient Estimators are Straight-Through Estimators in Disguise

    Authors: Matt Schoenbauer, Daniele Moro, Lukasz Lew, Andrew Howard

    Abstract: Quantization-aware training comes with a fundamental challenge: the derivative of quantization functions such as rounding are zero almost everywhere and nonexistent elsewhere. Various differentiable approximations of quantization functions have been proposed to address this issue. In this paper, we prove that when the learning rate is sufficiently small, a large class of weight gradient estimators…

    Submitted 22 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  2. arXiv:2404.10518  [pdf, other]

    cs.CV

    MobileNetV4 -- Universal Models for the Mobile Ecosystem

    Authors: Danfeng Qin, Chas Leichner, Manolis Delakis, Marco Fornoni, Shixin Luo, Fan Yang, Weijun Wang, Colby Banbury, Chengxi Ye, Berkin Akin, Vaibhav Aggarwal, Tenghui Zhu, Daniele Moro, Andrew Howard

    Abstract: We present the latest generation of MobileNets, known as MobileNetV4 (MNv4), featuring universally efficient architecture designs for mobile devices. At its core, we introduce the Universal Inverted Bottleneck (UIB) search block, a unified and flexible structure that merges Inverted Bottleneck (IB), ConvNext, Feed Forward Network (FFN), and a novel Extra Depthwise (ExtraDW) variant. Alongside UIB,…

    Submitted 16 April, 2024; originally announced April 2024.

  3. arXiv:2404.00103  [pdf, other]

    cs.LG cs.CV

    PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks

    Authors: Marina Neseem, Conor McCullough, Randy Hsin, Chas Leichner, Shan Li, In Suk Chong, Andrew G. Howard, Lukasz Lew, Sherief Reda, Ville-Mikko Rautio, Daniele Moro

    Abstract: Low-precision quantization is recognized for its efficacy in neural network optimization. Our analysis reveals that non-quantized elementwise operations which are prevalent in layers such as parameterized activation functions, batch normalization, and quantization scaling dominate the inference cost of low-precision models. These non-quantized elementwise operations are commonly overlooked in SOTA…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: Accepted in CVPR 2024. 10 figures, 9 tables

  4. arXiv:2401.09646  [pdf, other]

    cs.LG cs.AI cs.CL

    ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change

    Authors: David Thulke, Yingbo Gao, Petrus Pelser, Rein Brune, Rricha Jalota, Floris Fok, Michael Ramos, Ian van Wyk, Abdallah Nasir, Hayden Goldstein, Taylor Tragemann, Katie Nguyen, Ariana Fowler, Andrew Stanco, Jon Gabriel, Jordan Taylor, Dean Moro, Evgenii Tsymbalov, Juliette de Waal, Evgeny Matusov, Mudar Yaghi, Mohammad Shihadah, Hermann Ney, Christian Dugast, Jonathan Dotan, et al. (1 additional author not shown)

    Abstract: This paper introduces ClimateGPT, a model family of domain-specific large language models that synthesize interdisciplinary research on climate change. We trained two 7B models from scratch on a science-oriented dataset of 300B tokens. For the first model, the 4.2B domain-specific tokens were included during pre-training and the second was adapted to the climate domain after pre-training. Addition…

    Submitted 17 January, 2024; originally announced January 2024.

  5. arXiv:2309.00327  [pdf, other]

    cs.RO cs.MA

    Implementing BDI Continual Temporal Planning for Robotic Agents

    Authors: Alex Zanetti, Devis Dal Moro, Redi Vreto, Marco Robol, Marco Roveri, Paolo Giorgini

    Abstract: Making autonomous agents effective in real-life applications requires the ability to decide at run-time and a high degree of adaptability to unpredictable and uncontrollable events. Reacting to events is still a fundamental ability for an agent, but it has to be boosted up with proactive behaviors that allow the agent to explore alternatives and decide at run-time for optimal solutions. This calls…

    Submitted 1 September, 2023; originally announced September 2023.

  6. arXiv:2102.04270  [pdf, other]

    cs.LG cs.AR

    Enabling Binary Neural Network Training on the Edge

    Authors: Erwei Wang, James J. Davis, Daniele Moro, Piotr Zielinski, Jia Jie Lim, Claudionor Coelho, Satrajit Chatterjee, Peter Y. K. Cheung, George A. Constantinides

    Abstract: The ever-growing computational demands of increasingly complex machine learning models frequently necessitate the use of powerful cloud-based infrastructure for their training. Binary neural networks are known to be promising candidates for on-device inference due to their extreme compute and memory savings over higher-precision alternatives. However, their existing training methods require the co…

    Submitted 24 September, 2023; v1 submitted 8 February, 2021; originally announced February 2021.

  7. arXiv:1911.03283  [pdf, other]

    cs.CL

    Composing and Embedding the Words-as-Classifiers Model of Grounded Semantics

    Authors: Daniele Moro, Stacy Black, Casey Kennington

    Abstract: The words-as-classifiers model of grounded lexical semantics learns a semantic fitness score between physical entities and the words that are used to denote those entities. In this paper, we explore how such a model can incrementally perform composition and how the model can be unified with a distributional representation. For the latter, we leverage the classifier coefficients as an embedding. Fo…

    Submitted 8 November, 2019; originally announced November 2019.

    Comments: 10 pages