Showing 1–5 of 5 results for author: Croci, M L

  1. arXiv:2404.00456

    cs.LG

    QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

    Authors: Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci, Bo Li, Martin Jaggi, Dan Alistarh, Torsten Hoefler, James Hensman

    Abstract: We introduce QuaRot, a new Quantization scheme based on Rotations, which is able to quantize LLMs end-to-end, including all weights, activations, and KV cache in 4 bits. QuaRot rotates LLMs in a way that removes outliers from the hidden state without changing the output, making quantization easier. This computational invariance is applied to the hidden state (residual) of the LLM, as well as to th…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 19 pages, 6 figures

  2. arXiv:2401.15024

    cs.LG cs.CL

    SliceGPT: Compress Large Language Models by Deleting Rows and Columns

    Authors: Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, James Hensman

    Abstract: Large language models have become the cornerstone of natural language processing, but their use comes with substantial costs in terms of compute and memory resources. Sparsification provides a solution to alleviate these resource constraints, and recent works have shown that trained models can be sparsified post-hoc. Existing sparsification techniques face challenges as they need additional data s…

    Submitted 9 February, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: 22 pages, 8 figures, accepted at ICLR24

  3. arXiv:2112.08237

    cs.SI

    Exposure Inequality in People Recommender Systems: The Long-Term Effects

    Authors: Francesco Fabbri, Maria Luisa Croci, Francesco Bonchi, Carlos Castillo

    Abstract: People recommender systems may affect the exposure that users receive in social networking platforms, influencing attention dynamics and potentially strengthening pre-existing inequalities that disproportionately affect certain groups. In this paper we introduce a model to simulate the feedback loop created by multiple rounds of interactions between users and a link recommender in a social netwo…

    Submitted 15 December, 2021; originally announced December 2021.

    Comments: To appear in ICWSM 2022

  4. arXiv:2104.13201

    cs.LG stat.AP

    Online parameter inference for the simulation of a Bunsen flame using heteroscedastic Bayesian neural network ensembles

    Authors: Maximilian L. Croci, Ushnish Sengupta, Matthew P. Juniper

    Abstract: This paper proposes a Bayesian data-driven machine learning method for the online inference of the parameters of a G-equation model of a ducted, premixed flame. Heteroscedastic Bayesian neural network ensembles are trained on a library of 1.7 million flame fronts simulated in LSGEN2D, a G-equation solver, to learn the Bayesian posterior distribution of the model parameters given observations. The…

    Submitted 26 April, 2021; originally announced April 2021.

    Comments: 6 pages, 3 figures

    Journal ref: ICLR 2021 Deep Learning for Simulation Workshop

  5. arXiv:2011.02838

    cs.LG physics.flu-dyn stat.AP

    Real-time parameter inference in reduced-order flame models with heteroscedastic Bayesian neural network ensembles

    Authors: Ushnish Sengupta, Maximilian L. Croci, Matthew P. Juniper

    Abstract: The estimation of model parameters with uncertainties from observed data is a ubiquitous inverse problem in science and engineering. In this paper, we suggest an inexpensive and easy to implement parameter estimation technique that uses a heteroscedastic Bayesian Neural Network trained using anchored ensembling. The heteroscedastic aleatoric error of the network models the irreducible uncertainty…

    Submitted 11 October, 2020; originally announced November 2020.

    Journal ref: Machine Learning and the Physical Sciences Workshop at the 34th Conference on Neural Information Processing Systems (NeurIPS) 2020