Showing 1–5 of 5 results for author: Neto, F F

Searching in archive cs.
  1. arXiv:2406.14971  [pdf, other]

    cs.CL cs.AI cs.LG

    Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation

    Authors: Shamane Siriwardhana, Mark McQuade, Thomas Gauthier, Lucas Atkins, Fernando Fernandes Neto, Luke Meyers, Anneketh Vij, Tyler Odenthal, Charles Goddard, Mary MacCarthy, Jacob Solawetz

    Abstract: We conducted extensive experiments on domain adaptation of the Meta-Llama-3-70B-Instruct model on SEC data, exploring its performance on both general and domain-specific benchmarks. Our focus included continual pre-training (CPT) and model merging, aiming to enhance the model's domain-specific capabilities while mitigating catastrophic forgetting. Through this study, we evaluated the impact of int…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 8 pages, 6 figures

  2. arXiv:2406.06623  [pdf, other]

    cs.LG stat.ML

    Spectrum: Targeted Training on Signal to Noise Ratio

    Authors: Eric Hartford, Lucas Atkins, Fernando Fernandes Neto, David Golchinfar

    Abstract: Efficiently post-training large language models remains a challenging task due to the vast computational resources required. We present Spectrum, a method that accelerates LLM training by selectively targeting layer modules based on their signal-to-noise ratio (SNR), and freezing the remaining modules. Our approach, which utilizes an algorithm to compute module SNRs prior to training, has shown to…

    Submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:1811.12081  [pdf, other]

    eess.SP cs.LG stat.ML

    Deep Haar Scattering Networks in Pattern Recognition: A promising approach

    Authors: Fernando Fernandes Neto, Alemayehu Admasu Solomon, Rodrigo de Losso, Claudio Garcia, Pedro Delano Cavalcanti

    Abstract: The aim of this paper is to discuss the use of Haar scattering networks, which is a very simple architecture that naturally supports a large number of stacked layers, yet with very few parameters, in a relatively broad set of pattern recognition problems, including regression and classification tasks. This architecture, basically, consists of stacking convolutional filters, that can be thought as…

    Submitted 29 November, 2018; originally announced November 2018.

  4. arXiv:1804.03236  [pdf]

    stat.ML cs.LG

    Building Function Approximators on top of Haar Scattering Networks

    Authors: Fernando Fernandes Neto

    Abstract: In this article we propose building general-purpose function approximators on top of Haar Scattering Networks. We advocate that this architecture enables a better comprehension of feature extraction, in addition to its implementation simplicity and low computational costs. We show its approximation and feature extraction capabilities in a wide range of different problems, which can be applied on s…

    Submitted 9 April, 2018; originally announced April 2018.

    Comments: 7 pages, 5 figures, to appear in International Journal of Machine Learning and Computing, vol. 8 number 3

  5. arXiv:1801.03523  [pdf]

    stat.ML cs.NE physics.comp-ph q-fin.CP

    Generative Models for Stochastic Processes Using Convolutional Neural Networks

    Authors: Fernando Fernandes Neto

    Abstract: The present paper aims to demonstrate the usage of Convolutional Neural Networks as a generative model for stochastic processes, enabling researchers from a wide range of fields (such as quantitative finance and physics) to develop a general tool for forecasts and simulations without the need to identify/assume a specific system structure or estimate its parameters.

    Submitted 8 January, 2018; originally announced January 2018.