The Birkhoff completion of finite lattices

Authors: Mohammad Abdulla, Johannes Hirth, Gerd Stumme

Abstract: We introduce the Birkhoff completion as the smallest distributive lattice in which a given finite lattice can be embedded as a semi-lattice. We discuss its relationship to implicational theories, in particular to R. Wille's simply-implicational theories. By an example, we show how the Birkhoff completion can be used as a tool for ordinal data science.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.18940 [pdf, ps, other]

Conceptual Mapping of Controversies

Authors: Claude Draude, Dominik Dürrschnabel, Johannes Hirth, Viktoria Horn, Jonathan Kropf, Jörn Lamla, Gerd Stumme, Markus Uhlmann

Abstract: With our work, we contribute towards a qualitative analysis of the discourse on controversies in online news media. For this, we employ Formal Concept Analysis and the economics of conventions to derive conceptual controversy maps. In our experiments, we analyze two maps from different news journals with methods from ordinal data science. We show how these methods can be used to assess the diversity, complexity and potential bias of controversies. In addition to that, we discuss how the diagrams of concept lattices can be used to navigate between news articles.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2403.03607 [pdf, other]

The Geometric Structure of Topic Models

Authors: Johannes Hirth, Tom Hanika

Abstract: Topic models are a popular tool for clustering and analyzing textual data. They allow texts to be classified on the basis of their affiliation to the previously calculated topics. Despite their widespread use in research and application, an in-depth analysis of topic models is still an open research topic. State-of-the-art methods for interpreting topic models are based on simple visualizations, such as similarity matrices, top-term lists or embeddings, which are limited to a maximum of three dimensions. In this paper, we propose an incidence-geometric method for deriving an ordinal structure from flat topic models, such as non-negative matrix factorization. These enable the analysis of the topic model in a higher (order) dimension and the possibility of extracting conceptual relationships between several topics at once. Due to the use of conceptual scaling, our approach does not introduce any artificial topical relationships, such as artifacts of feature compression. Based on our findings, we present a new visualization paradigm for concept hierarchies based on ordinal motifs. These allow for a top-down view on topic spaces. We introduce and demonstrate the applicability of our approach based on a topic model derived from a corpus of scientific papers taken from 32 top machine learning venues.

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2304.08093 [pdf, other]

doi 10.1007/978-3-031-40960-8_12

Automatic Textual Explanations of Concept Lattices

Authors: Johannes Hirth, Viktoria Horn, Gerd Stumme, Tom Hanika

Abstract: Lattices and their order diagrams are an essential tool for communicating knowledge and insights about data. This is in particular true when applying Formal Concept Analysis. Such representations, however, are difficult to comprehend by untrained users and in general in cases where lattices are large. We tackle this problem by automatically generating textual explanations for lattices using standard scales. Our method is based on the general notion of ordinal motifs in lattices for the special case of standard scales. We show the computational complexity of identifying a small number of standard scales that cover most of the lattice structure. For these, we provide textual explanation templates, which can be applied to any occurrence of a scale in any data domain. These templates are derived using principles from human-computer interaction and allow for a comprehensive textual explanation of lattices. We demonstrate our approach on the spices planner data set, which is a medium sized formal context comprised of fifty-six meals (objects) and thirty-seven spices (attributes). The resulting 531 formal concepts can be covered by means of about 100 standard scales.

Submitted 17 April, 2023; originally announced April 2023.

MSC Class: 06A15 03G10 68T30 68T27 ACM Class: F.4.1; I.2.6

Journal ref: ICCS 2023: 138-152

arXiv:2304.04827 [pdf, ps, other]

doi 10.1016/j.ins.2023.120009

Ordinal Motifs in Lattices

Authors: Johannes Hirth, Viktoria Horn, Gerd Stumme, Tom Hanika

Abstract: Lattices are a commonly used structure for the representation and analysis of relational and ontological knowledge. In particular, the analysis of these requires a decomposition of a large and high-dimensional lattice into a set of understandably large parts. With the present work we propose /ordinal motifs/ as analytical units of meaning. We study these ordinal substructures (or standard scales) through (full) scale-measures of formal contexts from the field of formal concept analysis. We show that the underlying decision problems are NP-complete and provide results on how one can incrementally identify ordinal motifs to save computational effort. Accompanying our theoretical results, we demonstrate how ordinal motifs can be leveraged to retrieve basic meaning from a medium sized ordinal data set.

Submitted 10 April, 2023; originally announced April 2023.

MSC Class: 06A15 03G10 68T30 68T27 ACM Class: F.4.1; I.2.6

arXiv:2302.09101 [pdf, ps, other]

doi 10.1007/978-3-031-35949-1_5

Scaling Dimension

Authors: Bernhard Ganter, Tom Hanika, Johannes Hirth

Abstract: Conceptual Scaling is a useful standard tool in Formal Concept Analysis and beyond. Its mathematical theory, as elaborated in the last chapter of the FCA monograph, still has room for improvement. As it stands, even some of the basic definitions are in flux. Our contribution was triggered by the study of concept lattices for tree classifiers and the scaling methods used there. We extend some basic notions, give precise mathematical definitions for them and introduce the concept of scaling dimension. In addition to a detailed discussion of its properties, including an example, we show theoretical bounds related to the order dimension of concept lattices. We also study special subclasses, such as the ordinal and the interordinal scaling dimensions, and show for them first results and examples.

Submitted 17 February, 2023; originally announced February 2023.

MSC Class: 03G10 68T27 68T30 06B10 06A15

arXiv:2302.05270 [pdf, other]

doi 10.1016/j.ijar.2023.108930

Conceptual Views on Tree Ensemble Classifiers

Authors: Tom Hanika, Johannes Hirth

Abstract: Random Forests and related tree-based methods are popular for supervised learning from table based data. Apart from their ease of parallelization, their classification performance is also superior. However, this performance, especially parallelizability, is offset by the loss of explainability. Statistical methods are often used to compensate for this disadvantage. Yet, their ability for local explanations, and in particular for global explanations, is limited. In the present work we propose an algebraic method, rooted in lattice theory, for the (global) explanation of tree ensembles. In detail, we introduce two novel conceptual views on tree ensemble classifiers and demonstrate their explanatory capabilities on Random Forests that were trained with standard parameters.

Submitted 10 February, 2023; originally announced February 2023.

MSC Class: 68T30 03G10 68T27 06A15 97R40

Journal ref: Int. J. Approx. Reason. 159, 2023

arXiv:2209.13517 [pdf, other]

Formal Conceptual Views in Neural Networks

Authors: Johannes Hirth, Tom Hanika

Abstract: Explaining neural network models is a challenging task that remains unsolved in its entirety to this day. This is especially true for high dimensional and complex data. With the present work, we introduce two notions for conceptual views of a neural network, specifically a many-valued and a symbolic view. Both provide novel analysis methods to enable a human AI analyst to grasp deeper insights into the knowledge that is captured by the neurons of a network. We test the conceptual expressivity of our novel views through different experiments on the ImageNet and Fruit-360 data sets. Furthermore, we show to which extent the views allow to quantify the conceptual similarity of different learning architectures. Finally, we demonstrate how conceptual views can be applied for abductive learning of human comprehensible rules from neurons. In summary, with our work, we contribute to the most relevant task of globally explaining neural networks models.

Submitted 27 September, 2022; originally announced September 2022.

Comments: 17 pages, 8 figures, 9 tables

MSC Class: 68T07 68T30 03G10

arXiv:2206.07980 [pdf, other]

doi 10.1007/s11192-022-04529-w

Research Topic Flows in Co-Authorship Networks

Authors: Bastian Schäfermeier, Johannes Hirth, Tom Hanika

Abstract: In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect which is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN) we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. Our method requires for the construction of a TFN solely a corpus of publications (i.e., author and abstract information). From this, research topics are discovered automatically through non-negative matrix factorization. The thereof derived TFN allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topics, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora of altogether 20 million publications spanning more than 60 years of research in the fields computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Besides that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.

Submitted 16 June, 2022; originally announced June 2022.

Comments: 28 pages

MSC Class: 68T50 68U35 68V35 01A90 00A15

Journal ref: Scientometrics 128(9): 5051-5078 (2023)

arXiv:2106.06815 [pdf, ps, other]

doi 10.1007/978-3-030-86982-3_8

Quantifying the Conceptual Error in Dimensionality Reduction

Authors: Tom Hanika, Johannes Hirth

Abstract: Dimension reduction of data sets is a standard problem in the realm of machine learning and knowledge reasoning. They affect patterns in and dependencies on data dimensions and ultimately influence any decision-making processes. Therefore, a wide variety of reduction procedures are in use, each pursuing different objectives. A so far not considered criterion is the conceptual continuity of the reduction mapping, i.e., the preservation of the conceptual structure with respect to the original data set. Based on the notion scale-measure from formal concept analysis we present in this work a) the theoretical foundations to detect and quantify conceptual errors in data scalings; b) an experimental investigation of our approach on eleven data sets that were respectively treated with a variant of non-negative matrix factorization.

Submitted 12 June, 2021; originally announced June 2021.

Comments: 14 pages

MSC Class: 03G10; 68T27; 68T30; 06B10; 06A15; 06-08

arXiv:2102.02576 [pdf, ps, other]

doi 10.1007/978-3-030-77867-5_17

Exploring Scale-Measures of Data Sets

Authors: Tom Hanika, Johannes Hirth

Abstract: Measurement is a fundamental building block of numerous scientific models and their creation. This is in particular true for data driven science. Due to the high complexity and size of modern data sets, the necessity for the development of understandable and efficient scaling methods is at hand. A profound theory for scaling data is scale-measures, as developed in the field of formal concept analysis. Recent developments indicate that the set of all scale-measures for a given data set constitutes a lattice and does hence allow efficient exploring algorithms. In this work we study the properties of said lattice and propose a novel scale-measure exploration algorithm that is based on the well-known and proven attribute exploration approach. Our results motivate multiple applications in scale recommendation, most prominently (semi-)automatic scaling.

Submitted 4 February, 2021; originally announced February 2021.

Comments: 16 pages, 5 figures

MSC Class: 03G10; 68T27; 68T30; 06B10; 06A15; 06-08 ACM Class: F.4.1; I.2.6

arXiv:2012.05267 [pdf, ps, other]

doi 10.1016/j.ins.2022.09.005

On the Lattice of Conceptual Measurements

Authors: Tom Hanika, Johannes Hirth

Abstract: We present a novel approach for data set scaling based on scale-measures from formal concept analysis, i.e., continuous maps between closure systems, and derive a canonical representation. Moreover, we prove said scale-measures are lattice ordered with respect to the closure systems. This enables exploring the set of scale-measures through the use of meet and join operations. Furthermore we show that the lattice of scale-measures is isomorphic to the lattice of sub-closure systems that arises from the original data. Finally, we provide another representation of scale-measures using propositional logic in terms of data set features. Our theoretical findings are discussed by means of examples.

Submitted 9 December, 2020; originally announced December 2020.

Comments: 19 pages, 6 figures

MSC Class: 03G10 68T27

Journal ref: Information Sciences, 613, 2022, 453-468

arXiv:2002.11776 [pdf, other]

doi 10.1007/s10472-022-09790-6

Knowledge Cores in Large Formal Contexts

Authors: Tom Hanika, Johannes Hirth

Abstract: Knowledge computation tasks are often infeasible for large data sets. This is in particular true when deriving knowledge bases in formal concept analysis (FCA). Hence, it is essential to come up with techniques to cope with this problem. Many successful methods are based on random processes to reduce the size of the investigated data set. This, however, makes them hardly interpretable with respect to the discovered knowledge. Other approaches restrict themselves to highly supported subsets and omit rare and interesting patterns. An essentially different approach is used in network science, called $k$-cores. These are able to reflect rare patterns if they are well connected in the data set. In this work, we study $k$-cores in the realm of FCA by exploiting the natural correspondence to bi-partite graphs. This structurally motivated approach leads to a comprehensible extraction of knowledge cores from large formal contexts data sets.

Submitted 26 February, 2020; originally announced February 2020.

Comments: 13 pages, 10 figures

MSC Class: 68T30 03G10 97R40

Journal ref: Ann Math Artif Intell (2022)

Showing 1–13 of 13 results for author: Hirth, J