Mining United Nations General Assembly Debates

Authors: Mateusz Grzyb, Mateusz Krzyziński, Bartłomiej Sobieski, Mikołaj Spytek, Bartosz Pieliński, Daniel Dan, Anna Wróblewska

Abstract: This project explores the application of Natural Language Processing (NLP) techniques to analyse United Nations General Assembly (UNGA) speeches. Using NLP allows for the efficient processing and analysis of large volumes of textual data, enabling the extraction of semantic patterns, sentiment analysis, and topic modelling. Our goal is to deliver a comprehensive dataset and a tool (an interface with descriptive statistics and automatically extracted topics) from which political scientists can derive insights into international relations and gain a nuanced understanding of global diplomatic discourse.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 4 pages, 1 figure, 2 tables

arXiv:2312.12881 [pdf, other]

Big Tech influence over AI research revisited: memetic analysis of attribution of ideas to affiliation

Authors: Stanisław Giziński, Paulina Kaczyńska, Hubert Ruczyński, Emilia Wiśnios, Bartosz Pieliński, Przemysław Biecek, Julian Sienkiewicz

Abstract: There exists a growing discourse around the domination of Big Tech over the landscape of artificial intelligence (AI) research, yet our comprehension of this phenomenon remains cursory. This paper aims to broaden and deepen our understanding of Big Tech's reach and power within AI research. It highlights the dominance not merely in terms of sheer publication volume but rather in the propagation of new ideas or memes. Current studies often oversimplify the concept of influence to the share of affiliations in academic papers, typically sourced from limited databases such as arXiv or specific academic conferences. The main goal of this paper is to unravel the specific nuances of such influence, determining which AI ideas are predominantly driven by Big Tech entities. By employing network and memetic analysis on AI-oriented paper abstracts and their citation network, we gain deeper insight into this phenomenon. By utilizing two databases, OpenAlex and S2ORC, we are able to perform such analysis on a much larger scale than previous attempts. Our findings suggest that while Big Tech-affiliated papers are disproportionately more cited in some areas, the most cited papers are those affiliated with both Big Tech and Academia. Focusing on the most contagious memes, their attribution to specific affiliation groups (Big Tech, Academia, mixed affiliation) seems to be equally distributed between those three groups. This suggests that the notion of Big Tech domination over AI research is oversimplified in the discourse. Ultimately, this more nuanced understanding of Big Tech's and Academia's influence could inform a more symbiotic alliance between these stakeholders which would better serve the dual goals of societal welfare and the scientific integrity of AI research.

Submitted 20 December, 2023; originally announced December 2023.

arXiv:2307.16573 [pdf, other]

Deep Dive into the Language of International Relations: NLP-based Analysis of UNESCO's Summary Records

Authors: Joanna Wojciechowska, Mateusz Sypniewski, Maria Śmigielska, Igor Kamiński, Emilia Wiśnios, Hanna Schreiber, Bartosz Pieliński

Abstract: Cultural heritage is an arena of international relations that interests all states worldwide. The inscription process on the UNESCO World Heritage List and the UNESCO Representative List of the Intangible Cultural Heritage of Humanity often leads to tensions and conflicts among states. This research addresses these challenges by developing automatic tools that provide valuable insights into the decision-making processes regarding inscriptions to the two lists mentioned above. We propose innovative topic modelling and tension detection methods based on UNESCO's summary records. Our analysis achieved an accuracy of 72% in identifying tensions. Furthermore, we have developed an application tailored for diplomats, lawyers, political scientists, and international relations researchers that facilitates the efficient search of paragraphs from selected documents and statements from specific speakers about chosen topics. This application is a valuable resource for enhancing the understanding of complex decision-making dynamics within international heritage inscription procedures.

Submitted 1 August, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

Comments: Accepted for 3rd Workshop on Computational Linguistics for the Political and Social Sciences at KONVENS 2023 Conference

arXiv:2305.16750 [pdf, other]

doi 10.1007/978-3-031-36024-4_5

Automating the Analysis of Institutional Design in International Agreements

Authors: Anna Wróblewska, Bartosz Pieliński, Karolina Seweryn, Sylwia Sysko-Romańczuk, Karol Saputa, Aleksandra Wichrowska, Hanna Schreiber

Abstract: This paper explores the automatic knowledge extraction of formal institutional design - norms, rules, and actors - from international agreements. The focus was to analyze the relationship between the visibility and centrality of actors in the formal institutional design in regulating critical aspects of cultural heritage relations. The developed tool utilizes techniques such as collecting legal documents, annotating them with Institutional Grammar, and using graph analysis to explore the formal institutional design. The system was tested against the 2003 UNESCO Convention for the Safeguarding of the Intangible Cultural Heritage.

Submitted 26 May, 2023; originally announced May 2023.

Comments: 11 pages, 8 figures, accepted to ICCS 2023. arXiv admin note: substantial text overlap with arXiv:2209.00944

arXiv:2302.13099 [pdf, other]

HADES: Homologous Automated Document Exploration and Summarization

Authors: Piotr Wilczyński, Artur Żółkowski, Mateusz Krzyziński, Emilia Wiśnios, Bartosz Pieliński, Stanisław Giziński, Julian Sienkiewicz, Przemysław Biecek

Abstract: This paper introduces HADES, a novel tool for the automatic comparison of documents with similar structures. HADES is designed to streamline the work of professionals dealing with large volumes of documents, such as policy documents, legal acts, and scientific papers. The tool employs a multi-step pipeline that begins with processing PDF documents using topic modeling, summarization, and analysis of the most important words for each topic. The process concludes with an interactive web app with visualizations that facilitate the comparison of the documents. HADES has the potential to significantly improve the productivity of professionals dealing with high volumes of documents, reducing the time and effort required to complete tasks related to comparative document analysis. Our package is publicly available on GitHub.

Submitted 25 February, 2023; originally announced February 2023.

arXiv:2211.05852 [pdf, other]

Climate Policy Tracker: Pipeline for automated analysis of public climate policies

Authors: Artur Żółkowski, Mateusz Krzyziński, Piotr Wilczyński, Stanisław Giziński, Emilia Wiśnios, Bartosz Pieliński, Julian Sienkiewicz, Przemysław Biecek

Abstract: The number of standardized policy documents regarding climate policy and their publication frequency are increasing significantly. The documents are long and tedious to analyse manually, especially for policy experts, lawmakers, and citizens who lack access or domain expertise to utilize data analytics tools. Potential consequences of such a situation include reduced citizen governance and involvement in climate policies and an overall surge in analytics costs, making such analysis less accessible to the public. In this work, we use a Latent Dirichlet Allocation-based pipeline for the automatic summarization and analysis of 10 years of national energy and climate plans (NECPs) for the period from 2021 to 2030, established by 27 Member States of the European Union. We focus on analyzing policy framing, the language used to describe specific issues, to detect essential nuances in the way governments frame their climate policies and achieve climate goals. The methods leverage topic modeling and clustering for the comparative analysis of policy documents across different countries. This allows for easier integration in potential user-friendly applications for the development of theories and processes of climate policy. This would further lead to better citizen governance and engagement over climate policies and public policy research.

Submitted 10 November, 2022; originally announced November 2022.

Comments: Accepted for Tackling Climate Change with Machine Learning: workshop at NeurIPS 2022

arXiv:2209.00944 [pdf, other]

Entity Graph Extraction from Legal Acts -- a Prototype for a Use Case in Policy Design Analysis

Authors: Anna Wróblewska, Bartosz Pieliński, Karolina Seweryn, Karol Saputa, Aleksandra Wichrowska, Sylwia Sysko-Romańczuk, Hanna Schreiber

Abstract: This paper presents research on a prototype developed to serve the quantitative study of public policy design. This sub-discipline of political science focuses on identifying actors, relations between them, and tools at their disposal in health, environmental, economic, and other policies. Our system aims to automate the process of gathering legal documents, annotating them with Institutional Grammar, and using hypergraphs to analyse inter-relations between crucial entities. Our system is tested against the UNESCO Convention for the Safeguarding of the Intangible Cultural Heritage from 2003, a legal document regulating essential aspects of international relations securing cultural heritage.

Submitted 2 September, 2022; originally announced September 2022.

Comments: 17 pages, 10 figures

Report number: shortened version with more analysis - https://arxiv.longhoe.net/abs/2305.16750

MSC Class: 68U35

arXiv:2108.06216 [pdf, other]

MAIR: Framework for mining relationships between research articles, strategies, and regulations in the field of explainable artificial intelligence

Authors: Stanisław Giziński, Michał Kuźba, Bartosz Pieliński, Julian Sienkiewicz, Stanisław Łaniewski, Przemysław Biecek

Abstract: The growing number of AI applications, including for high-stakes decisions, increases the interest in Explainable and Interpretable Machine Learning (XI-ML). This trend can be seen both in the increasing number of regulations and strategies for developing trustworthy AI and the growing number of scientific papers dedicated to this topic. To ensure the sustainable development of AI, it is essential to understand the dynamics of the impact of regulation on research papers as well as the impact of scientific discourse on AI-related policies. This paper introduces a novel framework for joint analysis of AI-related policy documents and eXplainable Artificial Intelligence (XAI) research papers. The collected documents are enriched with metadata and interconnections, using various NLP methods combined with a methodology inspired by Institutional Grammar. Based on the information extracted from collected documents, we showcase a series of analyses that help understand interactions, similarities, and differences between documents at different stages of institutionalization. To the best of our knowledge, this is the first work to use automatic language analysis tools to understand the dynamics between XI-ML methods and regulations. We believe that such a system contributes to better cooperation between XAI researchers and AI policymakers.

Submitted 29 July, 2021; originally announced August 2021.

Showing 1–8 of 8 results for author: Pieliński, B