Showing 1–19 of 19 results for author: De Sterck, H

Searching in archive cs.
  1. arXiv:2407.00706  [pdf, other]

    cs.LG math.OC stat.ML

    Sum-of-norms regularized Nonnegative Matrix Factorization

    Authors: Andersen Ang, Waqas Bin Hamed, Hans De Sterck

    Abstract: When applying nonnegative matrix factorization (NMF), generally the rank parameter is unknown. Such rank in NMF, called the nonnegative rank, is usually estimated heuristically since computing the exact value of it is NP-hard. In this work, we propose an approximation method to estimate such rank while solving NMF on-the-fly. We use sum-of-norm (SON), a group-lasso structure that encourages pairwi…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 22 pages, 12 figures

  2. arXiv:2404.03081  [pdf, ps, other]

    cs.LG math.NA

    First-order PDEs for Graph Neural Networks: Advection And Burgers Equation Models

    Authors: Yifan Qu, Oliver Krzysik, Hans De Sterck, Omer Ege Kara

    Abstract: Graph Neural Networks (GNNs) have established themselves as the preferred methodology in a multitude of domains, ranging from computer vision to computational biology, especially in contexts where data inherently conform to graph structures. While many existing methods have endeavored to model GNNs using various techniques, a prevalent challenge they grapple with is the issue of over-smoothing. Th…

    Submitted 3 April, 2024; originally announced April 2024.

  3. arXiv:2310.11960  [pdf, other]

    cs.CL cs.LG

    Fast Multipole Attention: A Divide-and-Conquer Attention Mechanism for Long Sequences

    Authors: Yanming Kang, Giang Tran, Hans De Sterck

    Abstract: Transformer-based models have achieved state-of-the-art performance in many areas. However, the quadratic complexity of self-attention with respect to the input length hinders the applicability of Transformer-based models to long sequences. To address this, we present Fast Multipole Attention, a new attention mechanism that uses a divide-and-conquer strategy to reduce the time and memory complexit…

    Submitted 20 October, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  4. arXiv:2209.15203  [pdf, other]

    cs.LG cs.DC

    Downlink Compression Improves TopK Sparsification

    Authors: William Zou, Hans De Sterck, Jun Liu

    Abstract: Training large neural networks is time consuming. To speed up the process, distributed training is often used. One of the largest bottlenecks in distributed training is communicating gradients across different nodes. Different gradient compression techniques have been proposed to alleviate the communication bottleneck, including topK gradient sparsification, which truncates the gradient to the lar…

    Submitted 29 September, 2022; originally announced September 2022.

  5. arXiv:2206.01913  [pdf, other]

    eess.SY cs.LG cs.RO math.DS

    Neural Lyapunov Control of Unknown Nonlinear Systems with Stability Guarantees

    Authors: Ruikun Zhou, Thanin Quartz, Hans De Sterck, Jun Liu

    Abstract: Learning for control of dynamical systems with formal guarantees remains a challenging task. This paper proposes a learning framework to simultaneously stabilize an unknown nonlinear system with a neural controller and learn a neural Lyapunov function to certify a region of attraction (ROA) for the closed-loop system. The algorithmic structure consists of two neural networks and a satisfiability m…

    Submitted 15 October, 2022; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022

  6. arXiv:2109.14181  [pdf, other]

    math.NA cs.LG math.OC

    Anderson Acceleration as a Krylov Method with Application to Asymptotic Convergence Analysis

    Authors: Hans De Sterck, Yunhui He, Oliver A. Krzysik

    Abstract: Anderson acceleration (AA) is widely used for accelerating the convergence of nonlinear fixed-point methods $x_{k+1}=q(x_{k})$, $x_k \in \mathbb{R}^n$, but little is known about how to quantify the convergence acceleration provided by AA. As a roadway towards gaining more understanding of convergence acceleration by AA, we study AA($m$), i.e., Anderson acceleration with finite window size $m$, app…

    Submitted 23 February, 2023; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: this version resubmitted to journal on Nov 22, 2022

  7. arXiv:2109.14176  [pdf, other]

    math.OC cs.LG math.NA

    Linear Asymptotic Convergence of Anderson Acceleration: Fixed-Point Analysis

    Authors: Hans De Sterck, Yunhui He

    Abstract: We study the asymptotic convergence of AA($m$), i.e., Anderson acceleration with window size $m$ for accelerating fixed-point methods $x_{k+1}=q(x_{k})$, $x_k \in R^n$. Convergence acceleration by AA($m$) has been widely observed but is not well understood. We consider the case where the fixed-point iteration function $q(x)$ is differentiable and the convergence of the fixed-point method itself is…

    Submitted 2 May, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

  8. arXiv:2010.11358  [pdf, other]

    cs.LG cs.CL

    N-ODE Transformer: A Depth-Adaptive Variant of the Transformer Using Neural Ordinary Differential Equations

    Authors: Aaron Baier-Reinio, Hans De Sterck

    Abstract: We use neural ordinary differential equations to formulate a variant of the Transformer that is depth-adaptive in the sense that an input-dependent number of time steps is taken by the ordinary differential equation solver. Our goal in proposing the N-ODE Transformer is to investigate whether its depth-adaptivity may aid in overcoming some specific known theoretical limitations of the Transformer…

    Submitted 21 October, 2020; originally announced October 2020.

  9. arXiv:2007.02916  [pdf, other]

    math.OC cs.LG math.NA

    On the Asymptotic Linear Convergence Speed of Anderson Acceleration Applied to ADMM

    Authors: Dawei Wang, Yunhui He, Hans De Sterck

    Abstract: Empirical results show that Anderson acceleration (AA) can be a powerful mechanism to improve the asymptotic linear convergence speed of the Alternating Direction Method of Multipliers (ADMM) when ADMM by itself converges linearly. However, theoretical results to quantify this improvement do not exist yet. In this paper we explain and quantify this improvement in linear asymptotic convergence spee…

    Submitted 29 November, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

  10. arXiv:2007.01996  [pdf, other]

    math.OC cs.LG math.NA

    On the Asymptotic Linear Convergence Speed of Anderson Acceleration, Nesterov Acceleration, and Nonlinear GMRES

    Authors: Hans De Sterck, Yunhui He

    Abstract: We consider nonlinear convergence acceleration methods for fixed-point iteration $x_{k+1}=q(x_k)$, including Anderson acceleration (AA), nonlinear GMRES (NGMRES), and Nesterov-type acceleration (corresponding to AA with window size one). We focus on fixed-point methods that converge asymptotically linearly with convergence factor $ρ<1$ and that solve an underlying fully smooth and non-convex optim…

    Submitted 7 November, 2020; v1 submitted 3 July, 2020; originally announced July 2020.

    Comments: Minor update. This version accepted for publication in SIAM Journal on Scientific Computing

  11. arXiv:1810.05846  [pdf, other]

    math.OC cs.LG math.NA

    Nesterov Acceleration of Alternating Least Squares for Canonical Tensor Decomposition: Momentum Step Size Selection and Restart Mechanisms

    Authors: Drew Mitchell, Nan Ye, Hans De Sterck

    Abstract: We present Nesterov-type acceleration techniques for Alternating Least Squares (ALS) methods applied to canonical tensor decomposition. While Nesterov acceleration turns gradient descent into an optimal first-order method for convex problems by adding a momentum term with a specific weight sequence, a direct application of this method and weight sequence to ALS results in erratic convergence behav…

    Submitted 30 November, 2019; v1 submitted 13 October, 2018; originally announced October 2018.

    Comments: This version: journal revision, Nov 30, 2019

  12. arXiv:1610.02608  [pdf, other]

    cs.CE math.HO stat.OT

    Research and Education in Computational Science and Engineering

    Authors: Ulrich Rüde, Karen Willcox, Lois Curfman McInnes, Hans De Sterck, George Biros, Hans Bungartz, James Corones, Evin Cramer, James Crowley, Omar Ghattas, Max Gunzburger, Michael Hanke, Robert Harrison, Michael Heroux, Jan Hesthaven, Peter Jimack, Chris Johnson, Kirk E. Jordan, David E. Keyes, Rolf Krause, Vipin Kumar, Stefan Mayer, Juan Meza, Knut Martin Mørken, J. Tinsley Oden, et al. (8 additional authors not shown)

    Abstract: Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that…

    Submitted 31 December, 2017; v1 submitted 8 October, 2016; originally announced October 2016.

    Comments: Major revision, to appear in SIAM Review

    Report number: Argonne National Laboratory Preprint ANL/MCS-P6054-0916

    MSC Class: 00A72; 62-07; 68U20; 68W01; 68W10; 97A99; 97M10; 97N80; 97R20; 97R30

    ACM Class: G.0; G.4; I.6; J.0; J.2; J.3; J.4; J.6; J.7; K.3.2

  13. arXiv:1610.00656  [pdf, other]

    physics.soc-ph cond-mat.stat-mech cs.SI

    The Statistical Mechanics of Human Weight Change

    Authors: John C. Lang, Hans De Sterck, Daniel M. Abrams

    Abstract: In the context of the global obesity epidemic, it is important to know who becomes obese and why. However, the processes that determine the changing shape of Body Mass Index (BMI) distributions in high-income societies are not well-understood. Here we establish the statistical mechanics of human weight change, providing a fundamental new understanding of human weight distributions. By compiling an…

    Submitted 29 September, 2016; originally announced October 2016.

  14. arXiv:1601.05893  [pdf, other]

    cs.AI cs.CL cs.DB cs.IR

    GeoTextTagger: High-Precision Location Tagging of Textual Documents using a Natural Language Processing Approach

    Authors: Shawn Brunsting, Hans De Sterck, Remco Dolman, Teun van Sprundel

    Abstract: Location tagging, also known as geotagging or geolocation, is the process of assigning geographical coordinates to input data. In this paper we present an algorithm for location tagging of textual documents. Our approach makes use of previous work in natural language processing by using a state-of-the-art part-of-speech tagger and named entity recognizer to find blocks of text which may refer to l…

    Submitted 22 January, 2016; originally announced January 2016.

  15. arXiv:1510.08345  [pdf, ps, other]

    math.NA cs.DC

    A polynomial expansion line search for large-scale unconstrained minimization of smooth L2-regularized loss functions, with implementation in Apache Spark

    Authors: Michael B Hynes, Hans De Sterck

    Abstract: In large-scale unconstrained optimization algorithms such as limited memory BFGS (LBFGS), a common subproblem is a line search minimizing the loss function along a descent direction. Commonly used line searches iteratively find an approximate solution for which the Wolfe conditions are satisfied, typically requiring multiple function and gradient evaluations per line search, which is expensive in…

    Submitted 26 January, 2016; v1 submitted 28 October, 2015; originally announced October 2015.

    Comments: 9 pages, 8 figures, 2 tables. Preprint appearing in SIAM Conf on Data Mining, Miami, FL, 2016

    MSC Class: 65K05

    ACM Class: G.1.6; H.2.8

  16. arXiv:1508.03110  [pdf, ps, other]

    math.NA cs.DC cs.IR

    Algorithmic Acceleration of Parallel ALS for Collaborative Filtering: Speeding up Distributed Big Data Recommendation in Spark

    Authors: Manda Winlaw, Michael B. Hynes, Anthony Caterini, Hans De Sterck

    Abstract: Collaborative filtering algorithms are important building blocks in many practical recommendation systems. For example, many large-scale data processing environments include collaborative filtering models for which the Alternating Least Squares (ALS) algorithm is used to compute latent factor matrix decompositions. In this paper, we propose an approach to accelerate the convergence of parallel ALS…

    Submitted 10 January, 2016; v1 submitted 12 August, 2015; originally announced August 2015.

    Comments: Proceedings of ICPADS 2015, Melbourne, AU. 10 pages; 6 figures; 4 tables

    MSC Class: 65K05

    ACM Class: G.1.3; G.1.6

  17. arXiv:1501.04091  [pdf, other]

    physics.soc-ph cs.SI

    A Hierarchy of Linear Threshold Models for the Spread of Political Revolutions on Social Networks

    Authors: John C. Lang, Hans De Sterck

    Abstract: We study a linear threshold agent-based model (ABM) for the spread of political revolutions on social networks using empirical network data. We propose new techniques for building a hierarchy of simplified ordinary differential equation (ODE) based models that aim to capture essential features of the ABM, including effects of the actual networks, and give insight in the parameter regime transition…

    Submitted 16 January, 2015; originally announced January 2015.

    MSC Class: 91D10; 91D30; 70G60; 37M99

  18. arXiv:1407.2188  [pdf, other]

    math.DS cs.SI physics.soc-ph

    The influence of societal individualism on a century of tobacco use: modelling the prevalence of smoking

    Authors: John C. Lang, Daniel M. Abrams, Hans De Sterck

    Abstract: Smoking of tobacco is predicted to cause approximately six million deaths worldwide in 2014. Responding effectively to this epidemic requires a thorough understanding of how smoking behaviour is transmitted and modified. Here, we present a new mathematical model of the social dynamics that cause cigarette smoking to spread in a population. Our model predicts that more individualistic societies wil…

    Submitted 8 July, 2014; originally announced July 2014.

    MSC Class: 91D10 (Primary)

    Journal ref: BMC Public Health 15 (1280), 1-13 (2015)

  19. arXiv:1210.1841  [pdf, other]

    math.DS cs.SI physics.soc-ph

    The Arab Spring: A Simple Compartmental Model for the Dynamics of a Revolution

    Authors: John Lang, Hans De Sterck

    Abstract: The self-immolation of Mohamed Bouazizi on December 17, 2010 in the small Tunisian city of Sidi Bouzid, set off a sequence of events culminating in the revolutions of the Arab Spring. It is widely believed that the Internet and social media played a critical role in the growth and success of protests that led to the downfall of the regimes in Egypt and Tunisia. However, the precise mechanisms by w…

    Submitted 5 October, 2012; originally announced October 2012.