Showing 1–19 of 19 results for author: Alesiani, F

  1. arXiv:2407.05816  [pdf, other]

    cs.LG cs.AI

    Graph Reasoning Networks

    Authors: Markus Zopf, Francesco Alesiani

    Abstract: Graph neural networks (GNNs) are the predominant approach for graph-based machine learning. While neural networks have shown great performance at learning useful representations, they are often criticized for their limited high-level reasoning abilities. In this work, we present Graph Reasoning Networks (GRNs), a novel approach to combine the strengths of fixed and learned graph representations an…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Presented at the workshop on graphs and more complex structures for learning and reasoning at AAAI 2022

  2. arXiv:2405.14253  [pdf, other]

    cs.LG physics.comp-ph

    Higher-Rank Irreducible Cartesian Tensors for Equivariant Message Passing

    Authors: Viktor Zaverkin, Francesco Alesiani, Takashi Maruyama, Federico Errica, Henrik Christiansen, Makoto Takamoto, Nicolas Weber, Mathias Niepert

    Abstract: The ability to perform fast and accurate atomistic simulations is crucial for advancing the chemical sciences. By learning from high-quality data, machine-learned interatomic potentials achieve accuracy on par with ab initio and first-principles methods at a fraction of their computational cost. The success of machine-learned interatomic potentials arises from integrating inductive biases such as…

    Submitted 23 May, 2024; originally announced May 2024.

  3. arXiv:2312.16560  [pdf, other]

    cs.LG

    Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching

    Authors: Federico Errica, Henrik Christiansen, Viktor Zaverkin, Takashi Maruyama, Mathias Niepert, Francesco Alesiani

    Abstract: Long-range interactions are essential for the correct description of complex systems in many scientific fields. The price to pay for including them in the calculations, however, is a dramatic increase in the overall computational costs. Recently, deep graph networks have been employed as efficient, data-driven surrogate models for predicting properties of complex systems represented as graphs. The…

    Submitted 20 March, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

  4. arXiv:2312.01416  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci stat.ML

    Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials

    Authors: Viktor Zaverkin, David Holzmüller, Henrik Christiansen, Federico Errica, Francesco Alesiani, Makoto Takamoto, Mathias Niepert, Johannes Kästner

    Abstract: Efficiently creating a concise but comprehensive data set for training machine-learned interatomic potentials (MLIPs) is an under-explored problem. Active learning (AL), which uses either biased or unbiased molecular dynamics (MD) simulations to generate candidate pools, aims to address this objective. Existing biased and unbiased MD simulations, however, are prone to miss either rare events or ex…

    Submitted 3 December, 2023; originally announced December 2023.

  5. arXiv:2310.13977  [pdf, other]

    cs.LG cs.IT

    Continual Invariant Risk Minimization

    Authors: Francesco Alesiani, Shujian Yu, Mathias Niepert

    Abstract: Empirical risk minimization can lead to poor generalization behavior on unseen environments if the learned model does not capture invariant feature representations. Invariant risk minimization (IRM) is a recent proposal for discovering environment-invariant representations. IRM was introduced by Arjovsky et al. (2019) and extended by Ahuja et al. (2020). IRM assumes that all environments are avail…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: Shorter version of this paper was presented at RobustML workshop of ICLR 2021

  6. arXiv:2309.13593  [pdf, other]

    physics.comp-ph cond-mat.soft cond-mat.stat-mech cs.LG stat.CO

    Self-Tuning Hamiltonian Monte Carlo for Accelerated Sampling

    Authors: Henrik Christiansen, Federico Errica, Francesco Alesiani

    Abstract: The performance of Hamiltonian Monte Carlo simulations crucially depends on both the integration timestep and the number of integration steps. We present an adaptive general-purpose framework to automatically tune such parameters, based on a local loss function which promotes the fast exploration of phase-space. We show that a good correspondence between loss and autocorrelation time can be establ…

    Submitted 26 November, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

  7. arXiv:2304.14118  [pdf, other]

    cs.LG cs.CE physics.comp-ph physics.flu-dyn physics.geo-ph

    Learning Neural PDE Solvers with Parameter-Guided Channel Attention

    Authors: Makoto Takamoto, Francesco Alesiani, Mathias Niepert

    Abstract: Scientific Machine Learning (SciML) is concerned with the development of learned emulators of physical systems governed by partial differential equations (PDE). In application domains such as weather forecasting, molecular dynamics, and inverse design, ML-based surrogate models are increasingly used to augment or replace inefficient and often non-differentiable numerical simulation algorithms. Whi…

    Submitted 21 July, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: accepted for publication in ICML2023

  8. arXiv:2302.14473  [pdf, other]

    cs.LG math.CO math.OC

    Implicit Bilevel Optimization: Differentiating through Bilevel Optimization Programming

    Authors: Francesco Alesiani

    Abstract: Bilevel Optimization Programming is used to model complex and conflicting interactions between agents, for example in Robust AI or Privacy-preserving AI. Integrating bilevel mathematical programming within deep learning is thus an essential objective for the Machine Learning community. Previously proposed approaches only consider single-level programming. In this paper, we extend existing single-l…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: Accepted for presentation at AAAI23; code will be submitted at https://github.com/falesiani/bigrad

  9. arXiv:2210.07182  [pdf, other]

    cs.LG cs.CV physics.flu-dyn physics.geo-ph

    PDEBENCH: An Extensive Benchmark for Scientific Machine Learning

    Authors: Makoto Takamoto, Timothy Praditia, Raphael Leiteritz, Dan MacKinlay, Francesco Alesiani, Dirk Pflüger, Mathias Niepert

    Abstract: Machine learning-based modeling of physical systems has experienced increased interest in recent years. Despite some impressive progress, there is still a lack of benchmarks for Scientific ML that are easy to use but still challenging and representative of a wide range of problems. We introduce PDEBench, a benchmark suite of time-dependent simulation tasks based on Partial Differential Equations (…

    Submitted 13 March, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: 16 pages (main body) + 34 pages (supplemental material), accepted for publication in NeurIPS 2022 Track Datasets and Benchmarks

  10. arXiv:2207.04447  [pdf, other]

    cs.CL

    Human-Centric Research for NLP: Towards a Definition and Guiding Questions

    Authors: Bhushan Kotnis, Kiril Gashteovski, Julia Gastinger, Giuseppe Serra, Francesco Alesiani, Timo Sztyler, Ammar Shaker, Na Gong, Carolin Lawrence, Zhao Xu

    Abstract: With Human-Centric Research (HCR) we can steer research activities so that the research outcome is beneficial for human stakeholders, such as end users. But what exactly makes research human-centric? We address this question by providing a working definition and define how a research pipeline can be split into different stages in which human-centric components can be added. Additionally, we discus…

    Submitted 10 July, 2022; originally announced July 2022.

  11. arXiv:2206.00118  [pdf, other]

    cs.LG cs.IT

    Principle of Relevant Information for Graph Sparsification

    Authors: Shujian Yu, Francesco Alesiani, Wenzhe Yin, Robert Jenssen, Jose C. Principe

    Abstract: Graph sparsification aims to reduce the number of edges of a graph while maintaining its structural properties. In this paper, we propose the first general and effective information-theoretic formulation of graph sparsification, by taking inspiration from the Principle of Relevant Information (PRI). To this end, we extend the PRI from a standard scalar random variable setting to structured data (i…

    Submitted 31 May, 2022; originally announced June 2022.

    Comments: accepted by UAI-22

  12. arXiv:2110.06057  [pdf, other]

    cs.LG stat.ML

    Gated Information Bottleneck for Generalization in Sequential Environments

    Authors: Francesco Alesiani, Shujian Yu, Xi Yu

    Abstract: Deep neural networks suffer from poor generalization to unseen environments when the underlying data distribution is different from that in the training set. By learning minimum sufficient representations from training data, the information bottleneck (IB) approach has demonstrated its effectiveness to improve generalization in different AI applications. In this work, we propose a new neural netwo…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: manuscript accepted by IEEE ICDM-21 (regular papers), code is available at https://github.com/falesiani/GIB

  13. arXiv:2101.10160  [pdf, other]

    cs.LG cs.IT stat.ML

    Measuring Dependence with Matrix-based Entropy Functional

    Authors: Shujian Yu, Francesco Alesiani, Xi Yu, Robert Jenssen, Jose C. Principe

    Abstract: Measuring the dependence of data plays a central role in statistics and machine learning. In this work, we summarize and generalize the main idea of existing information-theoretic dependence measures into a higher-level perspective by Shearer's inequality. Based on our generalization, we then propose two measures, namely the matrix-based normalized total correlation ($T_α^*$) and the matrix-ba…

    Submitted 25 January, 2021; originally announced January 2021.

    Comments: Accepted at AAAI-21. An interpretable and differentiable dependence (or independence) measure that can be used to 1) train deep network under covariate shift and non-Gaussian noise; 2) implement a deep deterministic information bottleneck; and 3) understand the dynamics of learning of CNN. Code available at https://bit.ly/AAAI-dependence

  14. arXiv:2011.01272   

    cs.LG stat.ML

    Modular-Relatedness for Continual Learning

    Authors: Ammar Shaker, Shujian Yu, Francesco Alesiani

    Abstract: In this paper, we propose a continual learning (CL) technique that is beneficial to sequential task learners by improving their retained accuracy and reducing catastrophic forgetting. The principal target of our approach is the automatic extraction of modular parts of the neural network and then estimating the relatedness between the tasks given these modular components. This technique is applicab…

    Submitted 17 January, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: We realized that one conclusion in the submission is erroneous and disconnected from the results shown in one theorem. We decided to withdraw the current version to avoid a misleading conclusion

  15. arXiv:2011.01168  [pdf, other]

    cs.LG

    Bilevel Continual Learning

    Authors: Ammar Shaker, Francesco Alesiani, Shujian Yu, Wenzhe Yin

    Abstract: Continual learning (CL) studies the problem of learning a sequence of tasks, one at a time, such that the learning of each new task does not lead to the deterioration in performance on the previously seen ones while exploiting previously learned features. This paper presents Bilevel Continual Learning (BiCL), a general framework for continual learning that fuses bilevel optimization and recent adv…

    Submitted 2 November, 2020; originally announced November 2020.

  16. arXiv:2009.05618  [pdf, other]

    cs.LG stat.ML

    Learning an Interpretable Graph Structure in Multi-Task Learning

    Authors: Shujian Yu, Francesco Alesiani, Ammar Shaker, Wenzhe Yin

    Abstract: We present a novel methodology to jointly perform multi-task learning and infer intrinsic relationship among tasks by an interpretable and sparse graph. Unlike existing multi-task learning methodologies, the graph structure is not assumed to be known a priori or estimated separately in a preprocessing step. Instead, our graph is learned simultaneously with model parameters of each task, thus it re…

    Submitted 11 September, 2020; originally announced September 2020.

    Comments: 11 pages, 7 figures

  17. arXiv:2009.05483  [pdf, other]

    cs.LG stat.ML

    Towards Interpretable Multi-Task Learning Using Bilevel Programming

    Authors: Francesco Alesiani, Shujian Yu, Ammar Shaker, Wenzhe Yin

    Abstract: Interpretable Multi-Task Learning can be expressed as learning a sparse graph of the task relationship based on the prediction performance of the learned models. Since many natural phenomena exhibit sparse structures, enforcing sparsity on learned models reveals the underlying task relationship. Moreover, different sparsification degrees from a fully connected graph uncover various types of struc…

    Submitted 11 September, 2020; originally announced September 2020.

    Comments: Manuscript accepted at ECML PKDD 2020

  18. arXiv:2005.02196  [pdf, other]

    cs.LG cs.IT stat.ML

    Measuring the Discrepancy between Conditional Distributions: Methods, Properties and Applications

    Authors: Shujian Yu, Ammar Shaker, Francesco Alesiani, Jose C. Principe

    Abstract: We propose a simple yet powerful test statistic to quantify the discrepancy between two conditional distributions. The new statistic avoids the explicit estimation of the underlying distributions in high-dimensional space and it operates on the cone of symmetric positive semidefinite (SPS) matrices using the Bregman matrix divergence. Moreover, it inherits the merits of the correntropy function to ex…

    Submitted 28 December, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

    Comments: manuscript accepted at IJCAI 20; added additional notes on computational complexity and auto-differentiable property; code is available at https://github.com/SJYuCNEL/Bregman-Correntropy-Conditional-Divergence

  19. arXiv:1811.05695  [pdf, other]

    cs.LG stat.ML

    Efficient and Scalable Multi-task Regression on Massive Number of Tasks

    Authors: Xiao He, Francesco Alesiani, Ammar Shaker

    Abstract: Many real-world large-scale regression problems can be formulated as Multi-task Learning (MTL) problems with a massive number of tasks, as in retail and transportation domains. However, existing MTL methods still fail to offer both the generalization performance and the scalability for such problems. Scaling up MTL methods to problems with a tremendous number of tasks is a big challenge. Here, we…

    Submitted 14 November, 2018; originally announced November 2018.

    Comments: Accepted at AAAI 2019