Search | arXiv e-print repository

ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training

Authors: Adel Nabli, Louis Fournier, Pierre Erbacher, Louis Serrano, Eugene Belilovsky, Edouard Oyallon

Abstract: Training Large Language Models (LLMs) relies heavily on distributed implementations, employing multiple GPUs to compute stochastic gradients on model replicas in parallel. However, synchronizing gradients in data parallel settings induces a communication overhead increasing with the number of distributed workers, which can impede the efficiency gains of parallelization. To address this challenge, optimization algorithms reducing inter-worker communication have emerged, such as local optimization methods used in Federated Learning. While effective in minimizing communication overhead, these methods incur significant memory costs, hindering scalability: in addition to extra momentum variables, if communications are only allowed between multiple local optimization steps, then the optimizer's states cannot be sharded among workers. In response, we propose $\textbf{AC}$cumulate while $\textbf{CO}$mmunicate ($\texttt{ACCO}$), a memory-efficient optimization algorithm tailored for distributed training of LLMs. $\texttt{ACCO}$ allows sharding optimizer states across workers, overlaps gradient computations and communications to conceal communication costs, and accommodates heterogeneous hardware. Our method relies on a novel technique to mitigate the one-step delay inherent in parallel execution of gradient computations and communications, eliminating the need for warmup steps and aligning with the training dynamics of standard distributed optimization while converging faster in terms of wall-clock time. We demonstrate the effectiveness of $\texttt{ACCO}$ on several LLM training and fine-tuning tasks.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.02176 [pdf, other]

AROMA: Preserving Spatial Structure for Latent PDE Modeling with Local Neural Fields

Authors: Louis Serrano, Thomas X Wang, Etienne Le Naour, Jean-Noël Vittaut, Patrick Gallinari

Abstract: We present AROMA (Attentive Reduced Order Model with Attention), a framework designed to enhance the modeling of partial differential equations (PDEs) using local neural fields. Our flexible encoder-decoder architecture can obtain smooth latent representations of spatial physical fields from a variety of data types, including irregular-grid inputs and point clouds. This versatility eliminates the need for patching and allows efficient processing of diverse geometries. The sequential nature of our latent representation can be interpreted spatially and permits the use of a conditional transformer for modeling the temporal dynamics of PDEs. By employing a diffusion-based formulation, we achieve greater stability and enable longer rollouts compared to conventional MSE training. AROMA's superior performance in simulating 1D and 2D equations underscores the efficacy of our approach in capturing complex dynamical behaviors.

Submitted 5 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2307.13538 [pdf, other]

INFINITY: Neural Field Modeling for Reynolds-Averaged Navier-Stokes Equations

Authors: Louis Serrano, Leon Migus, Yuan Yin, Jocelyn Ahmed Mazari, Patrick Gallinari

Abstract: For numerical design, the development of efficient and accurate surrogate models is paramount. They allow us to approximate complex physical phenomena, thereby reducing the computational burden of direct numerical simulations. We propose INFINITY, a deep learning model that utilizes implicit neural representations (INRs) to address this challenge. Our framework encodes geometric information and physical fields into compact representations and learns a mapping between them to infer the physical fields. We use an airfoil design optimization problem as an example task and we evaluate our approach on the challenging AirfRANS dataset, which closely resembles real-world industrial use-cases. The experimental results demonstrate that our framework achieves state-of-the-art performance by accurately inferring physical fields throughout the volume and surface. Additionally we demonstrate its applicability in contexts such as design exploration and shape optimization: our model can correctly predict drag and lift coefficients while adhering to the equations.

Submitted 25 July, 2023; originally announced July 2023.

Comments: ICML 2023 Workshop on Synergy of Scientific and Machine Learning Modeling

Journal ref: ICML 2023 Workshop on Synergy of Scientific and Machine Learning Modeling

arXiv:2306.07266 [pdf, other]

Operator Learning with Neural Fields: Tackling PDEs on General Geometries

Authors: Louis Serrano, Lise Le Boudec, Armand Kassaï Koupaï, Thomas X Wang, Yuan Yin, Jean-Noël Vittaut, Patrick Gallinari

Abstract: Machine learning approaches for solving partial differential equations require learning mappings between function spaces. While convolutional or graph neural networks are constrained to discretized functions, neural operators present a promising milestone toward mapping functions directly. Despite impressive results they still face challenges with respect to the domain geometry and typically rely on some form of discretization. In order to alleviate such limitations, we present CORAL, a new method that leverages coordinate-based networks for solving PDEs on general geometries. CORAL is designed to remove constraints on the input mesh, making it applicable to any spatial sampling and geometry. Its ability extends to diverse problem domains, including PDE solving, spatio-temporal forecasting, and inverse problems like geometric design. CORAL demonstrates robust performance across multiple resolutions and performs well in both convex and non-convex domains, surpassing or performing on par with state-of-the-art models.

Submitted 30 November, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

Journal ref: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

arXiv:2306.05880 [pdf, other]

Time Series Continuous Modeling for Imputation and Forecasting with Implicit Neural Representations

Authors: Etienne Le Naour, Louis Serrano, Léon Migus, Yuan Yin, Ghislain Agoua, Nicolas Baskiotis, Patrick Gallinari, Vincent Guigue

Abstract: We introduce a novel modeling approach for time series imputation and forecasting, tailored to address the challenges often encountered in real-world data, such as irregular samples, missing data, or unaligned measurements from multiple sensors. Our method relies on a continuous-time-dependent model of the series' evolution dynamics. It leverages adaptations of conditional, implicit neural representations for sequential data. A modulation mechanism, driven by a meta-learning algorithm, allows adaptation to unseen samples and extrapolation beyond observed time-windows for long-term predictions. The model provides a highly flexible and unified framework for imputation and forecasting tasks across a wide range of challenging scenarios. It achieves state-of-the-art performance on classical benchmarks and outperforms alternative time-continuous models.

Submitted 22 April, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

arXiv:2012.07259 [pdf, other]

AndroEvolve: Automated Update for Android Deprecated-API Usages

Authors: Stefanus Agus Haryono, Ferdian Thung, David Lo, Lingxiao Jiang, Julia Lawall, Hong Jin Kang, Lucas Serrano, Gilles Muller

Abstract: The Android operating system (OS) is often updated, where each new version may involve API deprecation. Usages of deprecated APIs in Android apps need to be updated to ensure the apps' compatibility with the old and new versions of Android OS. In this work, we propose AndroEvolve, an automated tool to update usages of deprecated Android APIs, that addresses the limitations of the state-of-the-art tool, CocciEvolve. AndroEvolve utilizes data flow analysis to solve the problem of out-of-method-boundary variables, and variable denormalization to remove the temporary variables introduced by CocciEvolve. We evaluated the accuracy of AndroEvolve using a dataset of 360 target files and 20 deprecated Android APIs, where AndroEvolve is able to produce 319 correct updates, compared to CocciEvolve which only produces 249 correct updates. We also evaluated the readability of AndroEvolve's update results using a manual and an automatic evaluation. Both evaluations demonstrated that the code produced by AndroEvolve has higher readability than CocciEvolve's. A video demonstration of AndroEvolve is available at https://youtu.be/siU0tuMITXI.

Submitted 11 February, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

arXiv:2011.05020 [pdf, other]

AndroEvolve: Automated Android API Update with Data Flow Analysis and Variable Denormalization

Authors: Stefanus A. Haryono, Ferdian Thung, David Lo, Lingxiao Jiang, Julia Lawall, Hong Jin Kang, Lucas Serrano, Gilles Muller

Abstract: The Android operating system is frequently updated, with each version bringing a new set of APIs. New versions may involve API deprecation; Android apps using deprecated APIs need to be updated to ensure the apps' compatibility with old and new versions of Android. Updating deprecated APIs is a time-consuming endeavor. Hence, automating the updates of Android APIs can be beneficial for developers. CocciEvolve is the state-of-the-art approach for this automation. However, it has several limitations, including its inability to resolve out-of-method-boundary variables and the low code readability of its update due to the addition of temporary variables. In an attempt to further improve the performance of automated Android API update, we propose an approach named AndroEvolve, which addresses the limitations of CocciEvolve through the addition of data flow analysis and variable name denormalization. Data flow analysis enables AndroEvolve to resolve the value of any variable within the file scope. Variable name denormalization replaces temporary variables that may be present in the CocciEvolve update with appropriate values in the target file. We have evaluated the performance of AndroEvolve and the readability of its updates on 360 target files. AndroEvolve produces 26.90% more instances of correct updates compared to CocciEvolve. Moreover, our manual and automated evaluation shows that AndroEvolve updates are more readable than CocciEvolve updates.

Submitted 10 November, 2020; originally announced November 2020.

arXiv:2005.13220 [pdf, other]

Automatic Android Deprecated-API Usage Update by Learning from Single Updated Example

Authors: Stefanus Agus Haryono, Ferdian Thung, Hong Jin Kang, Lucas Serrano, Gilles Muller, Julia Lawall, David Lo, Lingxiao Jiang

Abstract: Due to the deprecation of APIs in the Android operating system, developers have to update usages of the APIs to ensure that their applications work for both the past and current versions of Android. Such updates may be widespread, non-trivial, and time-consuming. Therefore, automation of such updates will be of great benefit to developers. AppEvolve, which is the state-of-the-art tool for automating such updates, relies on having before- and after-update examples to learn from. In this work, we propose an approach named CocciEvolve that performs such updates using only a single after-update example. CocciEvolve learns edits by extracting the relevant update to a block of code from an after-update example. From preliminary experiments, we find that CocciEvolve can successfully perform 96 out of 112 updates, with a success rate of 85%.

Submitted 27 May, 2020; originally announced May 2020.

Comments: 5 pages, 8 figures. Accepted in The International Conference on Program Comprehension (ICPC) 2020, ERA Track

ACM Class: I.2.2

arXiv:1805.08168 [pdf, other]

"You Know What to Do": Proactive Detection of YouTube Videos Targeted by Coordinated Hate Attacks

Authors: Enrico Mariconti, Guillermo Suarez-Tangil, Jeremy Blackburn, Emiliano De Cristofaro, Nicolas Kourtellis, Ilias Leontiadis, Jordi Luque Serrano, Gianluca Stringhini

Abstract: Video sharing platforms like YouTube are increasingly targeted by aggression and hate attacks. Prior work has shown how these attacks often take place as a result of "raids," i.e., organized efforts by ad-hoc mobs coordinating from third-party communities. Despite the increasing relevance of this phenomenon, however, online services often lack effective countermeasures to mitigate it. Unlike well-studied problems like spam and phishing, coordinated aggressive behavior both targets and is perpetrated by humans, making defense mechanisms that look for automated activity unsuitable. Therefore, the de-facto solution is to reactively rely on user reports and human moderation. In this paper, we propose an automated solution to identify YouTube videos that are likely to be targeted by coordinated harassers from fringe communities like 4chan. First, we characterize and model YouTube videos along several axes (metadata, audio transcripts, thumbnails) based on a ground truth dataset of videos that were targeted by raids. Then, we use an ensemble of classifiers to determine the likelihood that a video will be raided with very good results (AUC up to 94%). Overall, our work provides an important first step towards deploying proactive systems to detect and mitigate coordinated hate attacks on platforms like YouTube.

Submitted 23 August, 2019; v1 submitted 21 May, 2018; originally announced May 2018.

Journal ref: 22nd ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2019)

arXiv:1311.0822 [pdf, ps, other]

Properties of maximum Lempel-Ziv complexity strings

Authors: C. A. J. Nunes, E. Estevez-Rams, B. Aragón Fernández, R. Lora Serrano

Abstract: The properties of maximum Lempel-Ziv complexity strings (MLZs) are studied for the binary case. A comparison between MLZs and random strings is carried out. The length profiles of the two types of sequences show different distribution functions. The non-stationary character of the MLZs is discussed. The issue of sensitivity to noise is also addressed. An empirical ansatz is found that fits well to the Lempel-Ziv complexity of the MLZs for all lengths up to $10^6$ symbols.

Submitted 4 November, 2013; originally announced November 2013.
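
For readers unfamiliar with the quantity studied in the entry above, here is a minimal sketch of one common definition of Lempel-Ziv complexity: the number of components in the exhaustive-history parsing of Lempel and Ziv (1976). This is an illustration only; the listed papers may use a different variant or normalization, and the function name `lz76_complexity` is an assumption, not something taken from them. The implementation favors clarity over speed and is not suited to strings near $10^6$ symbols.

```python
def lz76_complexity(s: str) -> int:
    """Count the components of the exhaustive-history (LZ76) parsing of s.

    Each component is the shortest substring starting at the current
    position that cannot be copied from earlier in the string (copies may
    overlap the current position). A final component that runs off the end
    of the string is still counted.
    """
    n = len(s)
    i = 0          # start of the current component
    count = 0      # number of components found so far
    while i < n:
        length = 1
        # Grow the candidate while it can still be copied from a position
        # strictly before i, i.e. while it occurs in s[:i + length - 1].
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


if __name__ == "__main__":
    # Parsed as 0 | 001 | 10 | 100 | 1000 | 101
    print(lz76_complexity("0001101001000101"))
    # A constant string parses as 0 | 000000000
    print(lz76_complexity("0" * 10))
```

A constant string yields only two components however long it is, while strings that keep introducing unseen patterns yield many; the maximum-complexity strings discussed above are those that reach the largest possible count for their length.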

arXiv:1311.0546 [pdf, ps, other]

doi 10.1063/1.4808251

On the non-randomness of maximum Lempel Ziv complexity sequences of finite size

Authors: E. Estevez-Rams, R. Lora Serrano, B. Aragón Fernández, I. Brito Reyes

Abstract: Random sequences attain the highest entropy rate. The estimation of entropy rate for an ergodic source can be done using the Lempel-Ziv complexity measure, yet the exact entropy rate value is only reached in the infinite limit. We prove that typical random sequences of finite length fall short of the maximum Lempel-Ziv complexity, contrary to common belief. We discuss that, for a finite length, maximum Lempel-Ziv sequences can be built from a well defined generating algorithm, which makes them of low Kolmogorov-Chaitin complexity, quite the opposite to randomness. It will be discussed that the Lempel-Ziv measure is, in this sense, less general than Kolmogorov-Chaitin complexity, as it can be fooled by an intelligent enough agent. The latter will be shown to be the case for the binary expansion of certain irrational numbers. Maximum Lempel-Ziv sequences induce a normalization that gives good estimates of entropy rate for several sources, while keeping bounded values for all sequence lengths, making it an alternative to other normalization schemes in use.

Submitted 3 November, 2013; originally announced November 2013.

Journal ref: CHAOS 23, 023118 (2013)

Showing 1–11 of 11 results for author: Serrano, L