-
ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training
Authors:
Adel Nabli,
Louis Fournier,
Pierre Erbacher,
Louis Serrano,
Eugene Belilovsky,
Edouard Oyallon
Abstract:
Training Large Language Models (LLMs) relies heavily on distributed implementations, employing multiple GPUs to compute stochastic gradients on model replicas in parallel. However, synchronizing gradients in data parallel settings induces a communication overhead increasing with the number of distributed workers, which can impede the efficiency gains of parallelization. To address this challenge, optimization algorithms reducing inter-worker communication have emerged, such as local optimization methods used in Federated Learning. While effective in minimizing communication overhead, these methods incur significant memory costs, hindering scalability: in addition to extra momentum variables, if communications are only allowed between multiple local optimization steps, then the optimizer's states cannot be sharded among workers. In response, we propose $\textbf{AC}$cumulate while $\textbf{CO}$mmunicate ($\texttt{ACCO}$), a memory-efficient optimization algorithm tailored for distributed training of LLMs. $\texttt{ACCO}$ allows to shard optimizer states across workers, overlaps gradient computations and communications to conceal communication costs, and accommodates heterogeneous hardware. Our method relies on a novel technique to mitigate the one-step delay inherent in parallel execution of gradient computations and communications, eliminating the need for warmup steps and aligning with the training dynamics of standard distributed optimization while converging faster in terms of wall-clock time. We demonstrate the effectiveness of $\texttt{ACCO}$ on several LLMs training and fine-tuning tasks.
Submitted 3 June, 2024;
originally announced June 2024.
-
AROMA: Preserving Spatial Structure for Latent PDE Modeling with Local Neural Fields
Authors:
Louis Serrano,
Thomas X Wang,
Etienne Le Naour,
Jean-Noël Vittaut,
Patrick Gallinari
Abstract:
We present AROMA (Attentive Reduced Order Model with Attention), a framework designed to enhance the modeling of partial differential equations (PDEs) using local neural fields. Our flexible encoder-decoder architecture can obtain smooth latent representations of spatial physical fields from a variety of data types, including irregular-grid inputs and point clouds. This versatility eliminates the need for patching and allows efficient processing of diverse geometries. The sequential nature of our latent representation can be interpreted spatially and permits the use of a conditional transformer for modeling the temporal dynamics of PDEs. By employing a diffusion-based formulation, we achieve greater stability and enable longer rollouts compared to conventional MSE training. AROMA's superior performance in simulating 1D and 2D equations underscores the efficacy of our approach in capturing complex dynamical behaviors.
Submitted 5 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
INFINITY: Neural Field Modeling for Reynolds-Averaged Navier-Stokes Equations
Authors:
Louis Serrano,
Leon Migus,
Yuan Yin,
Jocelyn Ahmed Mazari,
Patrick Gallinari
Abstract:
For numerical design, the development of efficient and accurate surrogate models is paramount. They allow us to approximate complex physical phenomena, thereby reducing the computational burden of direct numerical simulations. We propose INFINITY, a deep learning model that utilizes implicit neural representations (INRs) to address this challenge. Our framework encodes geometric information and physical fields into compact representations and learns a mapping between them to infer the physical fields. We use an airfoil design optimization problem as an example task and we evaluate our approach on the challenging AirfRANS dataset, which closely resembles real-world industrial use-cases. The experimental results demonstrate that our framework achieves state-of-the-art performance by accurately inferring physical fields throughout the volume and surface. Additionally, we demonstrate its applicability in contexts such as design exploration and shape optimization: our model can correctly predict drag and lift coefficients while adhering to the equations.
Submitted 25 July, 2023;
originally announced July 2023.
-
Operator Learning with Neural Fields: Tackling PDEs on General Geometries
Authors:
Louis Serrano,
Lise Le Boudec,
Armand Kassaï Koupaï,
Thomas X Wang,
Yuan Yin,
Jean-Noël Vittaut,
Patrick Gallinari
Abstract:
Machine learning approaches for solving partial differential equations require learning mappings between function spaces. While convolutional or graph neural networks are constrained to discretized functions, neural operators present a promising milestone toward mapping functions directly. Despite impressive results they still face challenges with respect to the domain geometry and typically rely on some form of discretization. In order to alleviate such limitations, we present CORAL, a new method that leverages coordinate-based networks for solving PDEs on general geometries. CORAL is designed to remove constraints on the input mesh, making it applicable to any spatial sampling and geometry. Its ability extends to diverse problem domains, including PDE solving, spatio-temporal forecasting, and inverse problems like geometric design. CORAL demonstrates robust performance across multiple resolutions and performs well in both convex and non-convex domains, surpassing or performing on par with state-of-the-art models.
Submitted 30 November, 2023; v1 submitted 12 June, 2023;
originally announced June 2023.
-
Time Series Continuous Modeling for Imputation and Forecasting with Implicit Neural Representations
Authors:
Etienne Le Naour,
Louis Serrano,
Léon Migus,
Yuan Yin,
Ghislain Agoua,
Nicolas Baskiotis,
Patrick Gallinari,
Vincent Guigue
Abstract:
We introduce a novel modeling approach for time series imputation and forecasting, tailored to address the challenges often encountered in real-world data, such as irregular samples, missing data, or unaligned measurements from multiple sensors. Our method relies on a continuous-time-dependent model of the series' evolution dynamics. It leverages adaptations of conditional, implicit neural representations for sequential data. A modulation mechanism, driven by a meta-learning algorithm, allows adaptation to unseen samples and extrapolation beyond observed time-windows for long-term predictions. The model provides a highly flexible and unified framework for imputation and forecasting tasks across a wide range of challenging scenarios. It achieves state-of-the-art performance on classical benchmarks and outperforms alternative time-continuous models.
Submitted 22 April, 2024; v1 submitted 9 June, 2023;
originally announced June 2023.
-
AndroEvolve: Automated Update for Android Deprecated-API Usages
Authors:
Stefanus Agus Haryono,
Ferdian Thung,
David Lo,
Lingxiao Jiang,
Julia Lawall,
Hong Jin Kang,
Lucas Serrano,
Gilles Muller
Abstract:
Android operating system (OS) is often updated, where each new version may involve API deprecation. Usages of deprecated APIs in Android apps need to be updated to ensure the apps' compatibility with the old and new versions of Android OS. In this work, we propose AndroEvolve, an automated tool to update usages of deprecated Android APIs, that addresses the limitations of the state-of-the-art tool, CocciEvolve. AndroEvolve utilizes data flow analysis to solve the problem of out-of-method-boundary variables, and variable denormalization to remove the temporary variables introduced by CocciEvolve. We evaluated the accuracy of AndroEvolve using a dataset of 360 target files and 20 deprecated Android APIs, where AndroEvolve is able to produce 319 correct updates, compared to CocciEvolve which only produces 249 correct updates. We also evaluated the readability of AndroEvolve's update results using a manual and an automatic evaluation. Both evaluations demonstrated that the code produced by AndroEvolve has higher readability than CocciEvolve's. A video demonstration of AndroEvolve is available at https://youtu.be/siU0tuMITXI.
Submitted 11 February, 2021; v1 submitted 14 December, 2020;
originally announced December 2020.
-
AndroEvolve: Automated Android API Update with Data Flow Analysis and Variable Denormalization
Authors:
Stefanus A. Haryono,
Ferdian Thung,
David Lo,
Lingxiao Jiang,
Julia Lawall,
Hong Jin Kang,
Lucas Serrano,
Gilles Muller
Abstract:
The Android operating system is frequently updated, with each version bringing a new set of APIs. New versions may involve API deprecation; Android apps using deprecated APIs need to be updated to ensure the apps' compatibility with old and new versions of Android. Updating deprecated APIs is a time-consuming endeavor. Hence, automating the updates of Android APIs can be beneficial for developers. CocciEvolve is the state-of-the-art approach for this automation. However, it has several limitations, including its inability to resolve out-of-method-boundary variables and the low code readability of its update due to the addition of temporary variables. In an attempt to further improve the performance of automated Android API update, we propose an approach named AndroEvolve, which addresses the limitations of CocciEvolve through the addition of data flow analysis and variable name denormalization. Data flow analysis enables AndroEvolve to resolve the value of any variable within the file scope. Variable name denormalization replaces temporary variables that may be present in the CocciEvolve update with appropriate values in the target file. We have evaluated the performance of AndroEvolve and the readability of its updates on 360 target files. AndroEvolve produces 26.90% more instances of correct updates compared to CocciEvolve. Moreover, our manual and automated evaluation shows that AndroEvolve updates are more readable than CocciEvolve updates.
Submitted 10 November, 2020;
originally announced November 2020.
-
Automatic Android Deprecated-API Usage Update by Learning from Single Updated Example
Authors:
Stefanus Agus Haryono,
Ferdian Thung,
Hong Jin Kang,
Lucas Serrano,
Gilles Muller,
Julia Lawall,
David Lo,
Lingxiao Jiang
Abstract:
Due to the deprecation of APIs in the Android operating system, developers have to update usages of the APIs to ensure that their applications work for both the past and current versions of Android. Such updates may be widespread, non-trivial, and time-consuming. Therefore, automation of such updates will be of great benefit to developers. AppEvolve, which is the state-of-the-art tool for automating such updates, relies on having before- and after-update examples to learn from. In this work, we propose an approach named CocciEvolve that performs such updates using only a single after-update example. CocciEvolve learns edits by extracting the relevant update to a block of code from an after-update example. From preliminary experiments, we find that CocciEvolve can successfully perform 96 out of 112 updates, with a success rate of 85%.
Submitted 27 May, 2020;
originally announced May 2020.
-
"You Know What to Do": Proactive Detection of YouTube Videos Targeted by Coordinated Hate Attacks
Authors:
Enrico Mariconti,
Guillermo Suarez-Tangil,
Jeremy Blackburn,
Emiliano De Cristofaro,
Nicolas Kourtellis,
Ilias Leontiadis,
Jordi Luque Serrano,
Gianluca Stringhini
Abstract:
Video sharing platforms like YouTube are increasingly targeted by aggression and hate attacks. Prior work has shown how these attacks often take place as a result of "raids," i.e., organized efforts by ad-hoc mobs coordinating from third-party communities. Despite the increasing relevance of this phenomenon, however, online services often lack effective countermeasures to mitigate it. Unlike well-studied problems like spam and phishing, coordinated aggressive behavior both targets and is perpetrated by humans, making defense mechanisms that look for automated activity unsuitable. Therefore, the de-facto solution is to reactively rely on user reports and human moderation.
In this paper, we propose an automated solution to identify YouTube videos that are likely to be targeted by coordinated harassers from fringe communities like 4chan. First, we characterize and model YouTube videos along several axes (metadata, audio transcripts, thumbnails) based on a ground truth dataset of videos that were targeted by raids. Then, we use an ensemble of classifiers to determine the likelihood that a video will be raided with very good results (AUC up to 94%). Overall, our work provides an important first step towards deploying proactive systems to detect and mitigate coordinated hate attacks on platforms like YouTube.
Submitted 23 August, 2019; v1 submitted 21 May, 2018;
originally announced May 2018.
-
Properties of maximum Lempel-Ziv complexity strings
Authors:
C. A. J. Nunes,
E. Estevez-Rams,
B. Aragón Fernández,
R. Lora Serrano
Abstract:
The properties of maximum Lempel-Ziv complexity strings (MLZs) are studied for the binary case. A comparison between MLZs and random strings is carried out. The length profiles of the two types of sequences show different distribution functions. The non-stationary character of the MLZs is discussed. The issue of sensitivity to noise is also addressed. An empirical ansatz is found that fits the Lempel-Ziv complexity of the MLZs well for all lengths up to $10^6$ symbols.
Submitted 4 November, 2013;
originally announced November 2013.
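For readers unfamiliar with the measure named in the entry above, the following is a minimal sketch of how a Lempel-Ziv complexity can be computed. It assumes the 1976 exhaustive-history definition (each new factor is the shortest prefix of the remaining string that cannot be copied from earlier in the sequence, with overlap allowed); the abstract itself does not state which variant is used, and the function name `lz76_complexity` is purely illustrative.

```python
def lz76_complexity(s: str) -> int:
    """Count the factors in the LZ76 exhaustive-history parsing of s.

    Assumption: this follows the 1976 Lempel-Ziv definition; it is a
    plain quadratic-or-worse sketch, not an optimized implementation.
    """
    n = len(s)
    i = 0          # start of the current factor
    factors = 0
    while i < n:
        length = 1
        # Grow the factor while s[i:i+length] can still be copied from
        # a position that starts before i (the source may overlap it).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        factors += 1
        i += length  # the last factor may end past n if fully copyable
    return factors


# Classic example: parsed as 0 | 001 | 10 | 100 | 1000 | 101
print(lz76_complexity("0001101001000101"))  # 6
```

A constant string such as `"0000"` parses into only two factors, which is the low-complexity end of the scale against which maximum-complexity strings are compared.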
-
On the non-randomness of maximum Lempel Ziv complexity sequences of finite size
Authors:
E. Estevez-Rams,
R. Lora Serrano,
B. Aragón Fernández,
I. Brito Reyes
Abstract:
Random sequences attain the highest entropy rate. The estimation of entropy rate for an ergodic source can be done using the Lempel-Ziv complexity measure, yet the exact entropy rate value is only reached in the infinite limit. We prove that typical random sequences of finite length fall short of the maximum Lempel-Ziv complexity, contrary to common belief. We discuss that, for a finite length, maximum Lempel-Ziv sequences can be built from a well-defined generating algorithm, which makes them of low Kolmogorov-Chaitin complexity, quite the opposite of randomness. It will be discussed that the Lempel-Ziv measure is, in this sense, less general than Kolmogorov-Chaitin complexity, as it can be fooled by an intelligent enough agent. The latter will be shown to be the case for the binary expansion of certain irrational numbers. Maximum Lempel-Ziv sequences induce a normalization that gives good estimates of entropy rate for several sources, while keeping bounded values for all sequence lengths, making it an alternative to other normalization schemes in use.
Submitted 3 November, 2013;
originally announced November 2013.