
fruit-SALAD: A Style Aligned Artwork Dataset to reveal similarity perception in image embeddings

Authors: Tillmann Ohm, Andres Karjus, Mikhail Tamm, Maximilian Schich

Abstract: The notion of visual similarity is essential for computer vision, and in applications and studies revolving around vector embeddings of images. However, the scarcity of benchmark datasets poses a significant hurdle in exploring how these models perceive similarity. Here we introduce Style Aligned Artwork Datasets (SALADs), and an example of fruit-SALAD with 10,000 images of fruit depictions. This combined semantic category and style benchmark comprises 100 instances each of 10 easy-to-recognize fruit categories, across 10 easily distinguishable styles. Leveraging a systematic pipeline of generative image synthesis, this visually diverse yet balanced benchmark demonstrates salient differences in semantic category and style similarity weights across various computational models, including machine learning models, feature extraction algorithms, and complexity measures, as well as conceptual models for reference. This meticulously designed dataset offers a controlled and balanced platform for the comparative analysis of similarity perception. The SALAD framework allows the comparison of how these models perform semantic category and style recognition tasks to go beyond the level of anecdotal knowledge, making it robustly quantifiable and qualitatively interpretable.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2305.14159 [pdf, other]

Quantifying Collection Lag in European Modern and Contemporary Art Museums

Authors: Mar Canet Solà, Antonina Korepanova, Ksenia Mukhina, Maximilian Schich

Abstract: Museum collection strategies are governed by a variety of factors, including topical focus, acquisition funds, availability of works in the art market, donations and specific coincidental opportunities. Yet, it remains unclear if more fundamental collection patterns emerge, exist, and are shared between museums, which could for example allow an established artist to estimate when a contemporary art museum would acquire their works. Here we collect and analyze data from 12 European contemporary art museums, taking into account artwork creation dates, collection acquisition dates, and the associated artist age at both points in time. From this simple quantitative construct we are able to reveal a striking gradient of museum profiles at the aggregate level. This lag can function to constitute a macroeconomic index of "mean museum collection lag", ranging from 3 years in the most dynamic cases (Kiasma) to 33 years in the most established institutions (Reina Sofia). Meanwhile, on the granular level, plotting artist age over collection year, and using artist-age vs artwork-collection matrices, a detailed picture becomes evident, where individual museums are characterized by shared patterns and a rich heterogeneity of ideographic details. Regularities include continuous acquisitions, systematic acquisition of older materials over time, and brief bursts, where whole oeuvres of individual artists join specific collections. Hence, we are able to shed light on the detailed collection history of museums, transcending the anecdotal nature of art historical storytelling via the provision of a quantitative context. Our approach of cultural data analysis combines expertise in art, art history, computational social science, and computer science. Our joint perspective builds a bridge between and serves an audience of museum professionals, art market actors, collectors, and individual artists alike.
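As a rough illustration of the "mean museum collection lag" index named in the abstract above, the following minimal sketch assumes the lag of a work is its acquisition year minus its creation year, averaged per museum. That definition, the function name `mean_collection_lag`, and all records are assumptions invented for illustration; they are not the authors' data or code.

```python
from statistics import mean

# Hypothetical records: (museum, creation_year, acquisition_year).
# These values are invented for illustration and do not come from the paper.
records = [
    ("Museum A", 2001, 2003),
    ("Museum A", 1998, 2002),
    ("Museum B", 1950, 1990),
    ("Museum B", 1970, 1990),
]


def mean_collection_lag(rows):
    """Return {museum: mean of (acquisition_year - creation_year)}.

    Assumption: lag is measured in whole years per artwork and then
    averaged over all artworks held by the same museum.
    """
    lags = {}
    for museum, created, acquired in rows:
        lags.setdefault(museum, []).append(acquired - created)
    return {museum: mean(values) for museum, values in lags.items()}


lag_by_museum = mean_collection_lag(records)
for museum, lag in sorted(lag_by_museum.items()):
    print(f"{museum}: mean collection lag = {lag} years")
```

A low value would correspond to a museum that acquires works soon after they are made, and a high value to one that mostly acquires older works.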

Submitted 23 May, 2023; originally announced May 2023.

Comments: 13 pages, 6 figures

arXiv:2305.13047 [pdf]

Automated stance detection in complex topics and small languages: the challenging case of immigration in polarizing news media

Authors: Mark Mets, Andres Karjus, Indrek Ibrus, Maximilian Schich

Abstract: Automated stance detection and related machine learning methods can provide useful insights for media monitoring and academic research. Many of these approaches require annotated training datasets, which limits their applicability for languages where these may not be readily available. This paper explores the applicability of large language models for automated stance detection in a challenging scenario, involving a morphologically complex, lower-resource language, and a socio-culturally complex topic, immigration. If the approach works in this case, it can be expected to perform as well or better in less demanding scenarios. We annotate a large set of pro and anti-immigration examples, and compare the performance of multiple language models as supervised learners. We also probe the usability of ChatGPT as an instructable zero-shot classifier for the same task. Supervised learning achieves acceptable performance, and ChatGPT yields similar accuracy. This is promising as a potentially simpler and cheaper alternative for text classification tasks, including in lower-resource languages. We further use the best-performing model to investigate diachronic trends over seven years in two corpora of Estonian mainstream and right-wing populist news sources, demonstrating the applicability of the approach for news analytics and media monitoring settings, and discuss correspondences between stance changes and real-world events.

Submitted 22 May, 2023; originally announced May 2023.

arXiv:2305.06809 [pdf, other]

Collection Space Navigator: An Interactive Visualization Interface for Multidimensional Datasets

Authors: Tillmann Ohm, Mar Canet Solà, Andres Karjus, Maximilian Schich

Abstract: We introduce the Collection Space Navigator (CSN), a browser-based visualization tool to explore, research, and curate large collections of visual digital artifacts that are associated with multidimensional data, such as vector embeddings or tables of metadata. Media objects such as images are often encoded as numerical vectors, e.g. based on metadata or using machine learning to embed image information. Yet, while such procedures are widespread for a range of applications, it remains a challenge to explore, analyze, and understand the resulting multidimensional spaces in a more comprehensive manner. Dimensionality reduction techniques such as t-SNE or UMAP often serve to project high-dimensional data into low-dimensional visualizations, yet require interpretation themselves as the remaining dimensions are typically abstract. Here, the Collection Space Navigator provides a customizable interface that combines two-dimensional projections with a set of configurable multidimensional filters. As a result, the user is able to view and investigate collections by zooming and scaling, by transforming between projections, and by filtering dimensions via range sliders and advanced text filters. Insights that are gained during the interaction can be fed back into the original data via ad hoc exports of filtered metadata and projections. This paper comes with a functional showcase demo using a large digitized collection of classical Western art. The Collection Space Navigator is open source. Users can reconfigure the interface to fit their own data and research needs, including projections and filter controls. The CSN is ready to serve a broad community.

Submitted 11 May, 2023; originally announced May 2023.

arXiv:2205.10271 [pdf, other]

Compression ensembles quantify aesthetic complexity and the evolution of visual art

Authors: Andres Karjus, Mar Canet Solà, Tillmann Ohm, Sebastian E. Ahnert, Maximilian Schich

Abstract: The quantification of visual aesthetics and complexity has a long history, the latter previously operationalized via the application of compression algorithms. Here we generalize and extend the compression approach beyond simple complexity measures to quantify algorithmic distance in historical and contemporary visual media. The proposed "ensemble" approach works by compressing a large number of transformed versions of a given input image, resulting in a vector of associated compression ratios. This approach is more efficient than other compression-based algorithmic distances, and is particularly suited for the quantitative analysis of visual artifacts, because human creative processes can be understood as algorithms in the broadest sense. Unlike comparable image embedding methods using machine learning, our approach is fully explainable through the transformations. We demonstrate that the method is cognitively plausible and fit for purpose by evaluating it against human complexity judgments, and on automated detection tasks of authorship and style. We show how the approach can be used to reveal and quantify trends in art historical data, both on the scale of centuries and in rapidly evolving contemporary NFT art markets. We further quantify temporal resemblance to disambiguate artists outside the documented mainstream from those who are deeply embedded in Zeitgeist. Finally, we note that compression ensembles constitute a quantitative representation of the concept of visual family resemblance, as distinct sets of dimensions correspond to shared visual characteristics otherwise hard to pin down. Our approach provides a new perspective for the study of visual art, algorithmic image analysis, and quantitative aesthetics more generally.
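The core idea described in the abstract above (compress several transformed versions of an image and collect the compression ratios into a vector) can be sketched as follows. This is only a toy sketch under stated assumptions: the four transformations, the use of `zlib` as the compressor, the representation of an image as a grid of 0-255 integers, and all function names are illustrative choices, not the transformations or compressors used by the authors.

```python
import zlib


def compression_ratio(pixels):
    """Compressed size divided by raw size for a 2D grid of 0-255 ints."""
    raw = bytes(value for row in pixels for value in row)
    return len(zlib.compress(raw, 9)) / len(raw)


# Illustrative transformations (assumptions, not the paper's set).
def identity(pixels):
    return [row[:] for row in pixels]


def threshold(pixels):
    # Binarize: values of 128 and above become 255, the rest 0.
    return [[255 if v >= 128 else 0 for v in row] for row in pixels]


def quantize(pixels):
    # Reduce to four gray levels: 0, 64, 128, 192.
    return [[(v // 64) * 64 for v in row] for row in pixels]


def transpose(pixels):
    # Swap rows and columns, which changes the byte order seen by zlib.
    return [list(column) for column in zip(*pixels)]


TRANSFORMS = [identity, threshold, quantize, transpose]


def compression_ensemble(pixels):
    """Return one compression ratio per transformation, as a list."""
    return [compression_ratio(transform(pixels)) for transform in TRANSFORMS]


# Two synthetic 32x32 test images: a uniform gray and a busy pattern.
flat = [[128] * 32 for _ in range(32)]
busy = [[(x * 37 + y * 101 + x * y * 13) % 256 for x in range(32)]
        for y in range(32)]

ens_flat = compression_ensemble(flat)
ens_busy = compression_ensemble(busy)
print("flat:", [round(r, 3) for r in ens_flat])
print("busy:", [round(r, 3) for r in ens_busy])
```

Each position in the resulting vector is tied to a named transformation, which is the sense in which such a representation stays interpretable; two images could then be compared by a distance between their vectors.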

Submitted 20 May, 2022; originally announced May 2022.

arXiv:1506.06580 [pdf, other]

Quantifying Cultural Histories via Person Networks in Wikipedia

Authors: Doron Goldfarb, Dieter Merkl, Maximilian Schich

Abstract: At least since Priestley's 1765 Chart of Biography, large numbers of individual person records have been used to illustrate aggregate patterns of cultural history. Wikidata, the structured database sister of Wikipedia, currently contains about 2.7 million explicit person records, across all language versions of the encyclopedia. These individuals, notable according to Wikipedia editing criteria, are connected via millions of hyperlinks between their respective Wikipedia articles. This situation provides us with the chance to go beyond the illustration of an idiosyncratic subset of individuals, as in the case of Priestley. In this work we summarize the overlap of nationalities and occupations, based on their co-occurrence in Wikidata individuals. We construct networks of co-occurring nationalities and occupations, provide insights into their respective community structure, and apply the results to select and color chronologically structured subsets of a large network of individuals, connected by Wikipedia hyperlinks. While the imagined communities of nationality are much more discrete in terms of co-occurrence than occupations, our quantifications reveal the existing overlap of nationality as much less clear-cut than in the case of occupational domains. Our work contributes to a growing body of research using biographies of notable persons to analyze cultural processes.

Submitted 22 June, 2015; originally announced June 2015.

Comments: 14 pages, 11 figures, Presented as conference poster at NetSci 2015

ACM Class: H.3.4; K.4.3

Showing 1–6 of 6 results for author: Schich, M