-
Are We Done with MMLU?
Authors:
Aryo Pradipta Gema,
Joshua Ong Jun Leang,
Giwon Hong,
Alessio Devoto,
Alberto Carlo Maria Mancino,
Rohit Saxena,
Xuanli He,
Yu Zhao,
Xiaotang Du,
Mohammad Reza Ghasemi Madani,
Claire Barale,
Robert McHardy,
Joshua Harris,
Jean Kaddour,
Emile van Krieken,
Pasquale Minervini
Abstract:
Maybe not. We identify and analyse errors in the popular Massive Multitask Language Understanding (MMLU) benchmark. Even though MMLU is widely adopted, our analysis demonstrates numerous ground truth errors that obscure the true capabilities of LLMs. For example, we find that 57% of the analysed questions in the Virology subset contain errors. To address this issue, we introduce a comprehensive framework for identifying dataset errors using a novel error taxonomy. Then, we create MMLU-Redux, which is a subset of 3,000 manually re-annotated questions across 30 MMLU subjects. Using MMLU-Redux, we demonstrate significant discrepancies with the model performance metrics that were originally reported. Our results strongly advocate for revising MMLU's error-ridden questions to enhance its future utility and reliability as a benchmark. Therefore, we open up MMLU-Redux for additional annotation https://huggingface.co/datasets/edinburgh-dawg/mmlu-redux.
Submitted 7 June, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
KGUF: Simple Knowledge-aware Graph-based Recommender with User-based Semantic Features Filtering
Authors:
Salvatore Bufi,
Alberto Carlo Maria Mancino,
Antonio Ferrara,
Daniele Malitesta,
Tommaso Di Noia,
Eugenio Di Sciascio
Abstract:
The recent integration of Graph Neural Networks (GNNs) into recommendation has led to a novel family of Collaborative Filtering (CF) approaches, namely Graph Collaborative Filtering (GCF). Following the same GNNs wave, recommender systems exploiting Knowledge Graphs (KGs) have also been successfully empowered by the GCF rationale to combine the representational power of GNNs with the semantics conveyed by KGs, giving rise to Knowledge-aware Graph Collaborative Filtering (KGCF), which uses KGs to mine hidden user intent. Nevertheless, empirical evidence suggests that computing and combining user-level intent might not always be necessary, as simpler approaches can yield comparable or superior results while keeping explicit semantic features. Under this perspective, user historical preferences become essential to refine the KG and retain the most discriminating features, thus leading to concise item representation. Driven by the assumptions above, we propose KGUF, a KGCF model that learns latent representations of semantic features in the KG to better define the item profile. By leveraging user profiles through decision trees, KGUF effectively retains only those features relevant to users. Results on three datasets justify KGUF's rationale, as our approach is able to reach performance comparable or superior to SOTA methods while maintaining a simpler formalization. Link to the repository: https://github.com/sisinflab/KGUF.
Submitted 29 March, 2024;
originally announced March 2024.
-
Jeans modelling of weakly flattened stellar systems
Authors:
Antonio Mancino,
Luca Ciotti,
Silvia Pellegrini,
Federica Giannetti
Abstract:
In the homoeoidal expansion, a given ellipsoidally stratified density distribution, and its associated potential, are expanded in the (small) density flattening parameter $η$, and usually truncated at the linear order. The truncated density-potential pair obeys exactly the Poisson equation, and it can be interpreted as the first-order expansion of the original ellipsoidal density-potential pair, or as a new autonomous system. In the first interpretation, in the solutions of the Jeans equations the quadratic terms in $η$ must be discarded (``$η$-linear'' solutions), while in the second (``$η$-quadratic'') all terms are retained. In this work we study the importance of the quadratic terms by using the ellipsoidal Plummer model and the Perfect Ellipsoid, which allow for fully analytical $η$-quadratic solutions. These solutions are then compared with those obtained numerically for the original ellipsoidal models, finding that the $η$-linear models already provide an excellent approximation of the numerical solutions. As an application, the $η$-linear Plummer model (with a central black hole) is used for the phenomenological interpretation of the dynamics of the weakly flattened and rotating globular cluster NGC 4372, confirming that this system cannot be interpreted as an isotropic rotator, a conclusion reached previously with more sophisticated studies.
Submitted 12 December, 2023;
originally announced December 2023.
-
A Topology-aware Analysis of Graph Collaborative Filtering
Authors:
Daniele Malitesta,
Claudio Pomo,
Vito Walter Anelli,
Alberto Carlo Maria Mancino,
Eugenio Di Sciascio,
Tommaso Di Noia
Abstract:
The successful integration of graph neural networks into recommender systems (RSs) has led to a novel paradigm in collaborative filtering (CF), graph collaborative filtering (graph CF). By representing user-item data as an undirected, bipartite graph, graph CF utilizes short- and long-range connections to extract collaborative signals that yield more accurate user preferences than traditional CF methods. Although the recent literature highlights the efficacy of various algorithmic strategies in graph CF, the impact of datasets and their topological features on recommendation performance is yet to be studied. To fill this gap, we propose a topology-aware analysis of graph CF. In this study, we (i) take some widely-adopted recommendation datasets and use them to generate a large set of synthetic sub-datasets through two state-of-the-art graph sampling methods, (ii) measure eleven of their classical and topological characteristics, and (iii) estimate the accuracy calculated on the generated sub-datasets considering four popular and recent graph-based RSs (i.e., LightGCN, DGCF, UltraGCN, and SVD-GCN). Finally, the investigation presents an explanatory framework that reveals the linear relationships between characteristics and accuracy measures. The results, statistically validated under different graph sampling settings, confirm the existence of solid dependencies between topological characteristics and accuracy in the graph-based recommendation, offering a new perspective on how to interpret graph CF.
Submitted 26 November, 2023; v1 submitted 21 August, 2023;
originally announced August 2023.
-
On the polytropic Bondi accretion in two-component galaxy models with a central massive BH
Authors:
Antonio Mancino,
Luca Ciotti,
Silvia Pellegrini
Abstract:
In many investigations involving accretion on a central point mass, ranging from observational studies to cosmological simulations, including semi-analytical modelling, the classical Bondi accretion theory is the standard tool widely adopted. Previous works generalised the theory to include the effects of the gravitational field of the galaxy hosting a central black hole, and of electron scattering in the optically thin limit. Here we apply this extended Bondi problem, in the general polytropic case, to a class of new two-component galaxy models recently presented. In these models, a Jaffe stellar density profile is embedded in a dark matter halo such that the total density distribution follows a $r^{-3}$ profile at large radii; the stellar dynamical quantities can be expressed in a fully analytical way. The hydrodynamical properties of the flow are set by imposing that the gas temperature at infinity is proportional to the virial temperature of the stellar component. The isothermal and adiabatic (monoatomic) cases can be solved analytically; in the other cases we explore the accretion solution numerically. As non-adiabatic accretion inevitably leads to an exchange of heat with the ambient medium, we also discuss some important thermodynamical properties of the polytropic Bondi accretion, and provide the expressions needed to compute the amount of heat exchanged with the environment, as a function of radius. The results can be useful for the subgrid treatment of accretion in numerical simulations, as well as for the interpretation of observational data.
Submitted 9 March, 2022;
originally announced March 2022.
-
A Parameter Space Exploration of High Resolution Numerically Evolved Early Type Galaxies Including AGN Feedback and Accurate Dynamical Treatment of Stellar Orbits
Authors:
Luca Ciotti,
Jeremiah P. Ostriker,
Zhaoming Gan,
Brian Xing Jiang,
Silvia Pellegrini,
Caterina Caravita,
Antonio Mancino
Abstract:
An extensive exploration of the model parameter space of axisymmetric Early-Type Galaxies (ETGs) hosting a central supermassive Black Hole (SMBH) is conducted by means of high resolution hydrodynamical simulations performed with our code MACER. Global properties such as 1) total SMBH accreted mass, 2) final X-ray luminosity and temperature of the X-ray emitting halos, 3) total amount of new stars formed from the cooling gas, 4) total ejected mass in form of supernovae and AGN feedback induced galactic winds, are obtained as a function of galaxy structure and internal dynamics. In addition to the galactic dark matter halo, the model galaxies are also embedded in a group/cluster dark matter halo; finally cosmological accretion is also included, with amount and time dependence derived from cosmological simulations. Angular momentum conservation leads to the formation of cold HI disks; these disks further evolve under the action of star formation induced by disk instabilities, of the associated mass discharge onto the central SMBH, and of the consequent AGN feedback. At the end of the simulations, the hot (metal enriched) gas mass is roughly $10\%$ the mass in the old stars, with twice as much having been ejected into the intergalactic medium. The cold gas disks are a $\approx$ kpc in size, and the metal rich new stars are in $0.1$ kpc disks. The masses of cold gas and new stars are roughly $0.1\%$ the mass of the old stars. Overall, the final systems appear to reproduce quite successfully the main global properties of real ETGs.
Submitted 25 May, 2022; v1 submitted 11 January, 2022;
originally announced January 2022.
-
Sparse Feature Factorization for Recommender Systems with Knowledge Graphs
Authors:
Vito Walter Anelli,
Tommaso Di Noia,
Eugenio Di Sciascio,
Antonio Ferrara,
Alberto Carlo Maria Mancino
Abstract:
Deep Learning and factorization-based collaborative filtering recommendation models have undoubtedly dominated the scene of recommender systems in recent years. However, despite their outstanding performance, these methods require a training time proportional to the size of the embeddings and it further increases when also side information is considered for the computation of the recommendation list. In fact, in these cases we have that with a large number of high-quality features, the resulting models are more complex and difficult to train. This paper addresses this problem by presenting KGFlex: a sparse factorization approach that grants an even greater degree of expressiveness. To achieve this result, KGFlex analyzes the historical data to understand the dimensions the user decisions depend on (e.g., movie direction, musical genre, nationality of book writer). KGFlex represents each item feature as an embedding and it models user-item interactions as a factorized entropy-driven combination of the item attributes relevant to the user. KGFlex facilitates the training process by letting users update only those relevant features on which they base their decisions. In other words, the user-item prediction is mediated by the user's personal view that considers only relevant features. An extensive experimental evaluation shows the approach's effectiveness, considering the recommendation results' accuracy, diversity, and induced bias. The public implementation of KGFlex is available at https://split.to/kgflex.
Submitted 29 July, 2021;
originally announced July 2021.
-
Two-component galaxy models with a central BH -- II. The ellipsoidal case
Authors:
L. Ciotti,
A. Mancino,
S. Pellegrini,
A. Ziaee Lorzad
Abstract:
Recently, two-component spherical galaxy models have been presented, where the stellar profile is described by a Jaffe law, and the total density by another Jaffe law, or by an $r^{-3}$ law at large radii. We extend these two families to their ellipsoidal axisymmetric counterparts: the JJe and J3e models. The total and stellar density distributions can have different flattenings and scale lengths, and the dark matter halo is defined by difference. First, the analytical conditions required to have a nowhere negative dark matter halo density are derived. The Jeans equations for the stellar component are then solved analytically, in the limit of small flattenings, also in presence of a central BH. The azimuthal velocity dispersion anisotropy is described by the Satoh $k$-decomposition. Finally, we present the analytical formulae for velocity fields near the center and at large radii, together with the various terms entering the Virial Theorem. The JJe and J3e models can be useful in a number of theoretical applications, e.g. to explore the role of the various parameters (flattening, relative scale lengths, mass ratios, rotational support) in determining the behavior of the stellar kinematical fields before performing more time-expensive integrations with specific galaxy models, to test codes of stellar dynamics, and in numerical simulations of gas flows in galaxies.
Submitted 27 October, 2020;
originally announced October 2020.
-
A new class of galaxy models with a central BH -- I. The spherical case
Authors:
L. Ciotti,
A. Mancino,
S. Pellegrini
Abstract:
The dynamical properties of spherically symmetric galaxy models, where a Jaffe (1983) stellar density profile is embedded in a total mass density decreasing as $r^{-3}$ at large radii, are presented. The orbital structure of the stellar component is described by the Osipkov--Merritt anisotropy; the dark matter halo is isotropic, and a black hole is added at the center of the galaxy. First, the conditions for a nowhere negative and monotonically decreasing dark matter halo density profile are derived; this profile can be made asymptotically coincident with a NFW profile at the center and at large radii. Then the minimum value of the anisotropy radius for phase-space consistency is derived as a function of the galaxy parameters. The Jeans equations for the stellar component are solved analytically; the projected velocity dispersion at the center and at large radii is also obtained, for generic values of the anisotropy radius. Finally, analytical expressions for the terms entering the Virial Theorem are derived, and the fiducial anisotropy limit required to prevent the onset of Radial Orbit Instability is determined as a function of the galaxy parameters. The presented models, built following an approach already adopted in our previous works, can be a useful starting point for a more advanced modeling of the dynamics of elliptical galaxies, and can be easily implemented in numerical simulations requiring a realistic dynamical model of a galaxy.
Submitted 19 September, 2019;
originally announced September 2019.