-
fruit-SALAD: A Style Aligned Artwork Dataset to reveal similarity perception in image embeddings
Authors:
Tillmann Ohm,
Andres Karjus,
Mikhail Tamm,
Maximilian Schich
Abstract:
The notion of visual similarity is essential for computer vision, and in applications and studies revolving around vector embeddings of images. However, the scarcity of benchmark datasets poses a significant hurdle in exploring how these models perceive similarity. Here we introduce Style Aligned Artwork Datasets (SALADs), and an example of fruit-SALAD with 10,000 images of fruit depictions. This combined semantic category and style benchmark comprises 100 instances each of 10 easy-to-recognize fruit categories, across 10 easily distinguishable styles. Leveraging a systematic pipeline of generative image synthesis, this visually diverse yet balanced benchmark demonstrates salient differences in semantic category and style similarity weights across various computational models, including machine learning models, feature extraction algorithms, and complexity measures, as well as conceptual models for reference. This meticulously designed dataset offers a controlled and balanced platform for the comparative analysis of similarity perception. The SALAD framework allows the comparison of how these models perform semantic category and style recognition tasks to go beyond the level of anecdotal knowledge, making it robustly quantifiable and qualitatively interpretable.
Submitted 3 June, 2024;
originally announced June 2024.
-
Quantifying Collection Lag in European Modern and Contemporary Art Museums
Authors:
Mar Canet Solà,
Antonina Korepanova,
Ksenia Mukhina,
Maximilian Schich
Abstract:
Museum collection strategies are governed by a variety of factors, including topical focus, acquisition funds, availability of works in the art market, donations and specific coincidental opportunities. Yet, it remains unclear if more fundamental collection patterns emerge, exist, and are shared between museums, which could for example allow an established artist to estimate when a contemporary art museum would acquire their works. Here we collect and analyze data from 12 European contemporary art museums, taking into account artwork creation dates, collection acquisition dates, and the associated artist age at both points in time. From this simple quantitative construct we are able to reveal a striking gradient of museum profiles at the aggregate level. The lag between artwork creation and acquisition can serve as a macroeconomic index of "mean museum collection lag", ranging from 3 years in the most dynamic cases (Kiasma) to 33 years in the most established institutions (Reina Sofia). Meanwhile, on the granular level, plotting artist age over collection year, and using artist-age vs artwork-collection matrices, a detailed picture becomes evident, where individual museums are characterized by shared patterns and a rich heterogeneity of idiographic details. Regularities include continuous acquisitions, systematic acquisition of older materials over time, and brief bursts, where whole oeuvres of individual artists join specific collections. Hence, we are able to shed light on the detailed collection history of museums, transcending the anecdotal nature of art historical storytelling via the provision of a quantitative context. Our approach of cultural data analysis combines expertise in art, art history, computational social science, and computer science. Our joint perspective builds a bridge between, and serves an audience of, museum professionals, art market actors, collectors, and individual artists alike.
Submitted 23 May, 2023;
originally announced May 2023.
-
Automated stance detection in complex topics and small languages: the challenging case of immigration in polarizing news media
Authors:
Mark Mets,
Andres Karjus,
Indrek Ibrus,
Maximilian Schich
Abstract:
Automated stance detection and related machine learning methods can provide useful insights for media monitoring and academic research. Many of these approaches require annotated training datasets, which limits their applicability for languages where these may not be readily available. This paper explores the applicability of large language models for automated stance detection in a challenging scenario, involving a morphologically complex, lower-resource language, and a socio-culturally complex topic, immigration. If the approach works in this case, it can be expected to perform as well or better in less demanding scenarios. We annotate a large set of pro- and anti-immigration examples, and compare the performance of multiple language models as supervised learners. We also probe the usability of ChatGPT as an instructable zero-shot classifier for the same task. Supervised learning achieves acceptable performance, and ChatGPT yields similar accuracy. This is promising as a potentially simpler and cheaper alternative for text classification tasks, including in lower-resource languages. We further use the best-performing model to investigate diachronic trends over seven years in two corpora of Estonian mainstream and right-wing populist news sources, demonstrating the applicability of the approach for news analytics and media monitoring settings, and discuss correspondences between stance changes and real-world events.
Submitted 22 May, 2023;
originally announced May 2023.
-
Collection Space Navigator: An Interactive Visualization Interface for Multidimensional Datasets
Authors:
Tillmann Ohm,
Mar Canet Solà,
Andres Karjus,
Maximilian Schich
Abstract:
We introduce the Collection Space Navigator (CSN), a browser-based visualization tool to explore, research, and curate large collections of visual digital artifacts that are associated with multidimensional data, such as vector embeddings or tables of metadata. Media objects such as images are often encoded as numerical vectors, for example based on metadata or using machine learning to embed image information. Yet, while such procedures are widespread for a range of applications, it remains a challenge to explore, analyze, and understand the resulting multidimensional spaces in a more comprehensive manner. Dimensionality reduction techniques such as t-SNE or UMAP often serve to project high-dimensional data into low-dimensional visualizations, yet require interpretation themselves as the remaining dimensions are typically abstract. Here, the Collection Space Navigator provides a customizable interface that combines two-dimensional projections with a set of configurable multidimensional filters. As a result, the user is able to view and investigate collections by zooming and scaling, by transforming between projections, and by filtering dimensions via range sliders and advanced text filters. Insights that are gained during the interaction can be fed back into the original data via ad hoc exports of filtered metadata and projections. This paper comes with a functional showcase demo using a large digitized collection of classical Western art. The Collection Space Navigator is open source. Users can reconfigure the interface to fit their own data and research needs, including projections and filter controls. The CSN is ready to serve a broad community.
Submitted 11 May, 2023;
originally announced May 2023.
-
Beyond Binary: Hypermatrix Algebra and Irreducible Arity in Higher-Order Systems
Authors:
Carlos Zapata-Carratalá,
Maximilian Schich,
Taliesin Beynon,
Xerxes D. Arsiwalla
Abstract:
Theoretical and computational frameworks of modern science are dominated by binary structures. This binary bias, seen in the ubiquity of pair-wise networks and formal operations of two arguments in mathematical models, limits our capacity to faithfully capture irreducible polyadic interactions in higher-order systems. A paradigmatic example of a higher-order interaction is the Borromean link of three interlocking rings. In this paper we propose a mathematical framework via hypergraphs and hypermatrix algebras that allows us to formalize such forms of higher-order bonding and connectivity in a parsimonious way. Our framework builds on and extends current techniques in higher-order networks -- still mostly rooted in binary structures such as adjacency matrices -- and incorporates recent developments in higher-arity structures to articulate the compositional behavior of adjacency hypermatrices. Irreducible higher-order interactions turn out to be a widespread occurrence across natural sciences and socio-cultural knowledge representation. We demonstrate this by reviewing recent results in computer science, physics, chemistry, biology, ecology, social science, and cultural analysis through the conceptual lens of irreducible higher-order interactions. We further speculate that the general phenomenon of emergence in complex systems may be characterized by spatio-temporal discrepancies of interaction arity.
Submitted 5 January, 2023;
originally announced January 2023.
-
Compression ensembles quantify aesthetic complexity and the evolution of visual art
Authors:
Andres Karjus,
Mar Canet Solà,
Tillmann Ohm,
Sebastian E. Ahnert,
Maximilian Schich
Abstract:
The quantification of visual aesthetics and complexity has a long history, the latter previously operationalized via the application of compression algorithms. Here we generalize and extend the compression approach beyond simple complexity measures to quantify algorithmic distance in historical and contemporary visual media. The proposed "ensemble" approach works by compressing a large number of transformed versions of a given input image, resulting in a vector of associated compression ratios. This approach is more efficient than other compression-based algorithmic distances, and is particularly suited for the quantitative analysis of visual artifacts, because human creative processes can be understood as algorithms in the broadest sense. Unlike comparable image embedding methods using machine learning, our approach is fully explainable through the transformations. We demonstrate that the method is cognitively plausible and fit for purpose by evaluating it against human complexity judgments, and on automated detection tasks of authorship and style. We show how the approach can be used to reveal and quantify trends in art historical data, both on the scale of centuries and in rapidly evolving contemporary NFT art markets. We further quantify temporal resemblance to disambiguate artists outside the documented mainstream from those who are deeply embedded in Zeitgeist. Finally, we note that compression ensembles constitute a quantitative representation of the concept of visual family resemblance, as distinct sets of dimensions correspond to shared visual characteristics otherwise hard to pin down. Our approach provides a new perspective for the study of visual art, algorithmic image analysis, and quantitative aesthetics more generally.
Submitted 20 May, 2022;
originally announced May 2022.
-
Network Dimensions in the Getty Provenance Index
Authors:
Maximilian Schich,
Christian Huemer,
Piotr Adamczyk,
Lev Manovich,
Yang-Yu Liu
Abstract:
In this article we make a case for a systematic application of complex network science to study art market history and more general collection dynamics. We reveal social, temporal, spatial, and conceptual network dimensions, i.e. network node and link types, previously implicit in the Getty Provenance Index (GPI). As a pioneering art history database active since the 1980s, the GPI provides online access to source material relevant for research in the history of collecting and art markets. Based on a subset of the GPI, we characterize an aggregate of more than 267,000 sales transactions connected to roughly 22,000 actors in four countries over 20 years at daily resolution from 1801 to 1820. Striving towards a deeper understanding on multiple levels, we disambiguate social dynamics of buying, brokering, and selling, while observing a general broadening of the market, where large collections are split into smaller lots. Temporally, we find annual market cycles that are shifted by country and obviously favor international exchange. Spatially, we differentiate near-monopolies from regions driven by competing sub-centers, while uncovering asymmetries of international market flux. Conceptually, we track dynamics of artist attribution that clearly behave like product categories in a very slow supermarket. Taken together, we introduce a number of meaningful network perspectives dealing with historical art auction data, beyond the analysis of social networks within a single market region. The results presented here have inspired a Linked Open Data conversion of the GPI, which is currently in progress and will allow further analysis by a broad set of researchers.
Submitted 8 June, 2017;
originally announced June 2017.
-
Figuring Out Art History
Authors:
Maximilian Schich
Abstract:
World population and the number of cultural artifacts are growing exponentially or faster, while cultural interaction approaches the fidelity of a global nervous system. Every day hundreds of millions of images are loaded into social networks by users all over the world. As this myriad of new artifacts veils the view into the past, like city lights covering the night sky, it is easy to forget that there is more than one Starry Night, the painting by Van Gogh. Like in ecology, where saving rare species may help us in treating disease, art and architectural history can reveal insights into the past, which may hold keys to our own future. With humanism under threat, facing the challenge of understanding the structure and dynamics of art and culture, both qualitatively and quantitatively, is more crucial now than it ever was. The purpose of this article is to provide perspective in the aim of figuring out the process of art history - not art history as a discipline, but the actual history of all made things, in the spirit of George Kubler and Marcel Duchamp. In other words, this article deals with the grand challenge of developing a systematic science of art and culture, no matter what, and no matter how.
Submitted 22 October, 2015;
originally announced December 2015.
-
Quantifying Cultural Histories via Person Networks in Wikipedia
Authors:
Doron Goldfarb,
Dieter Merkl,
Maximilian Schich
Abstract:
At least since Priestley's 1765 Chart of Biography, large numbers of individual person records have been used to illustrate aggregate patterns of cultural history. Wikidata, the structured database sister of Wikipedia, currently contains about 2.7 million explicit person records, across all language versions of the encyclopedia. These individuals, notable according to Wikipedia editing criteria, are connected via millions of hyperlinks between their respective Wikipedia articles. This situation provides us with the chance to go beyond the illustration of an idiosyncratic subset of individuals, as in the case of Priestley. In this work we summarize the overlap of nationalities and occupations, based on their co-occurrence in Wikidata individuals. We construct networks of co-occurring nationalities and occupations, provide insights into their respective community structure, and apply the results to select and color chronologically structured subsets of a large network of individuals, connected by Wikipedia hyperlinks. While the imagined communities of nationality are much more discrete in terms of co-occurrence than occupations, our quantifications reveal the existing overlap of nationality as much less clear-cut than in the case of occupational domains. Our work contributes to a growing body of research using biographies of notable persons to analyze cultural processes.
Submitted 22 June, 2015;
originally announced June 2015.