Search | arXiv e-print repository

Modular Pluralism: Pluralistic Alignment via Multi-LLM Collaboration

Authors: Shangbin Feng, Taylor Sorensen, Yuhan Liu, Jillian Fisher, Chan Young Park, Ye** Choi, Yulia Tsvetkov

Abstract: While existing alignment paradigms have been integral in develo** large language models (LLMs), LLMs often learn an averaged human preference and struggle to model diverse preferences across cultures, demographics, and communities. We propose Modular Pluralism, a modular framework based on multi-LLM collaboration for pluralistic alignment: it "plugs into" a base LLM a pool of smaller but special… ▽ More While existing alignment paradigms have been integral in develo** large language models (LLMs), LLMs often learn an averaged human preference and struggle to model diverse preferences across cultures, demographics, and communities. We propose Modular Pluralism, a modular framework based on multi-LLM collaboration for pluralistic alignment: it "plugs into" a base LLM a pool of smaller but specialized community LMs, where models collaborate in distinct modes to flexibility support three modes of pluralism: Overton, steerable, and distributional. Modular Pluralism is uniquely compatible with black-box LLMs and offers the modular control of adding new community LMs for previously underrepresented communities. We evaluate Modular Pluralism with six tasks and four datasets featuring questions/instructions with value-laden and perspective-informed responses. Extensive experiments demonstrate that Modular Pluralism advances the three pluralism objectives across six black-box and open-source LLMs. Further analysis reveals that LLMs are generally faithful to the inputs from smaller community LLMs, allowing seamless patching by adding a new community LM to better cover previously underrepresented communities. △ Less

Submitted 22 June, 2024; originally announced June 2024.

arXiv:2406.12904 [pdf, other]

Meent: Differentiable Electromagnetic Simulator for Machine Learning

Authors: Yongha Kim, Anthony W. Jung, Sanmun Kim, Kevin Octavian, Doyoung Heo, Chae** Park, Jeongmin Shin, Sunghyun Nam, Chanhyung Park, Juho Park, Sangjun Han, **myoung Lee, Seolho Kim, Min Seok Jang, Chan Y. Park

Abstract: Electromagnetic (EM) simulation plays a crucial role in analyzing and designing devices with sub-wavelength scale structures such as solar cells, semiconductor devices, image sensors, future displays and integrated photonic devices. Specifically, optics problems such as estimating semiconductor device structures and designing nanophotonic devices provide intriguing research topics with far-reachin… ▽ More Electromagnetic (EM) simulation plays a crucial role in analyzing and designing devices with sub-wavelength scale structures such as solar cells, semiconductor devices, image sensors, future displays and integrated photonic devices. Specifically, optics problems such as estimating semiconductor device structures and designing nanophotonic devices provide intriguing research topics with far-reaching real world impact. Traditional algorithms for such tasks require iteratively refining parameters through simulations, which often yield sub-optimal results due to the high computational cost of both the algorithms and EM simulations. Machine learning (ML) emerged as a promising candidate to mitigate these challenges, and optics research community has increasingly adopted ML algorithms to obtain results surpassing classical methods across various tasks. To foster a synergistic collaboration between the optics and ML communities, it is essential to have an EM simulation software that is user-friendly for both research communities. To this end, we present Meent, an EM simulation software that employs rigorous coupled-wave analysis (RCWA). Developed in Python and equipped with automatic differentiation (AD) capabilities, Meent serves as a versatile platform for integrating ML into optics research and vice versa. To demonstrate its utility as a research platform, we present three applications of Meent: 1) generating a dataset for training neural operator, 2) serving as an environment for the reinforcement learning of nanophotonic device optimization, and 3) providing a solution for inverse problems with gradient-based optimizers. These applications highlight Meent's potential to advance both EM simulation and ML methodologies. The code is available at https://github.com/kc-ml2/meent with the MIT license to promote the cross-polinations of ideas among academic researchers and industry practitioners. △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: under review

arXiv:2404.06664 [pdf, other]

CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge

Authors: Yu Ying Chiu, Liwei Jiang, Maria Antoniak, Chan Young Park, Shuyue Stella Li, Mehar Bhatia, Sahithya Ravi, Yulia Tsvetkov, Vered Shwartz, Ye** Choi

Abstract: Frontier large language models (LLMs) are developed by researchers and practitioners with skewed cultural backgrounds and on datasets with skewed sources. However, LLMs' (lack of) multicultural knowledge cannot be effectively assessed with current methods for develo** benchmarks. Existing multicultural evaluations primarily rely on expensive and restricted human annotations or potentially outdat… ▽ More Frontier large language models (LLMs) are developed by researchers and practitioners with skewed cultural backgrounds and on datasets with skewed sources. However, LLMs' (lack of) multicultural knowledge cannot be effectively assessed with current methods for develo** benchmarks. Existing multicultural evaluations primarily rely on expensive and restricted human annotations or potentially outdated internet resources. Thus, they struggle to capture the intricacy, dynamics, and diversity of cultural norms. LLM-generated benchmarks are promising, yet risk propagating the same biases they are meant to measure. To synergize the creativity and expert cultural knowledge of human annotators and the scalability and standardizability of LLM-based automation, we introduce CulturalTeaming, an interactive red-teaming system that leverages human-AI collaboration to build truly challenging evaluation dataset for assessing the multicultural knowledge of LLMs, while improving annotators' capabilities and experiences. Our study reveals that CulturalTeaming's various modes of AI assistance support annotators in creating cultural questions, that modern LLMs fail at, in a gamified manner. Importantly, the increased level of AI assistance (e.g., LLM-generated revision hints) empowers users to create more difficult questions with enhanced perceived creativity of themselves, shedding light on the promises of involving heavier AI assistance in modern evaluation dataset creation procedures. Through a series of 1-hour workshop sessions, we gather CULTURALBENCH-V0.1, a compact yet high-quality evaluation dataset with users' red-teaming attempts, that different families of modern LLMs perform with accuracy ranging from 37.7% to 72.2%, revealing a notable gap in LLMs' multicultural proficiency. △ Less

Submitted 9 April, 2024; originally announced April 2024.

Comments: Preprint (under review)

arXiv:2311.09741 [pdf, other]

P^3SUM: Preserving Author's Perspective in News Summarization with Diffusion Language Models

Authors: Yuhan Liu, Shangbin Feng, Xiaochuang Han, Vidhisha Balachandran, Chan Young Park, Sachin Kumar, Yulia Tsvetkov

Abstract: In this work, we take a first step towards designing summarization systems that are faithful to the author's intent, not only the semantic content of the article. Focusing on a case study of preserving political perspectives in news summarization, we find that existing approaches alter the political opinions and stances of news articles in more than 50% of summaries, misrepresenting the intent and… ▽ More In this work, we take a first step towards designing summarization systems that are faithful to the author's intent, not only the semantic content of the article. Focusing on a case study of preserving political perspectives in news summarization, we find that existing approaches alter the political opinions and stances of news articles in more than 50% of summaries, misrepresenting the intent and perspectives of the news authors. We thus propose P^3SUM, a diffusion model-based summarization approach controlled by political perspective classifiers. In P^3SUM, the political leaning of a generated summary is iteratively evaluated at each decoding step, and any drift from the article's original stance incurs a loss back-propagated to the embedding layers, steering the political stance of the summary at inference time. Extensive experiments on three news summarization datasets demonstrate that P^3SUM outperforms state-of-the-art summarization systems and large language models by up to 13.7% in terms of the success rate of stance preservation, with competitive performance on standard metrics of summarization quality. Our findings present a first analysis of preservation of pragmatic features in summarization, highlight the lacunae in existing summarization models -- that even state-of-the-art models often struggle to preserve author's intents -- and develop new summarization systems that are more faithful to author's perspectives. △ Less

Submitted 4 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

arXiv:2311.07115 [pdf, other]

Gen-Z: Generative Zero-Shot Text Classification with Contextualized Label Descriptions

Authors: Sachin Kumar, Chan Young Park, Yulia Tsvetkov

Abstract: Language model (LM) prompting--a popular paradigm for solving NLP tasks--has been shown to be susceptible to miscalibration and brittleness to slight prompt variations, caused by its discriminative prompting approach, i.e., predicting the label given the input. To address these issues, we propose Gen-Z--a generative prompting framework for zero-shot text classification. GEN-Z is generative, as it… ▽ More Language model (LM) prompting--a popular paradigm for solving NLP tasks--has been shown to be susceptible to miscalibration and brittleness to slight prompt variations, caused by its discriminative prompting approach, i.e., predicting the label given the input. To address these issues, we propose Gen-Z--a generative prompting framework for zero-shot text classification. GEN-Z is generative, as it measures the LM likelihood of input text, conditioned on natural language descriptions of labels. The framework is multivariate, as label descriptions allow us to seamlessly integrate additional contextual information about the labels to improve task performance. On various standard classification benchmarks, with six open-source LM families, we show that zero-shot classification with simple contextualization of the data source of the evaluation set consistently outperforms both zero-shot and few-shot baselines while improving robustness to prompt variations. Further, our approach enables personalizing classification in a zero-shot manner by incorporating author, subject, or reader information in the label descriptions. △ Less

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2305.14326 [pdf, other]

TalkUp: Paving the Way for Understanding Empowering Language

Authors: Lucille Njoo, Chan Young Park, Octavia Stappart, Marvin Thielk, Yi Chu, Yulia Tsvetkov

Abstract: Empowering language is important in many real-world contexts, from education to workplace dynamics to healthcare. Though language technologies are growing more prevalent in these contexts, empowerment has seldom been studied in NLP, and moreover, it is inherently challenging to operationalize because of its implicit nature. This work builds from linguistic and social psychology literature to explo… ▽ More Empowering language is important in many real-world contexts, from education to workplace dynamics to healthcare. Though language technologies are growing more prevalent in these contexts, empowerment has seldom been studied in NLP, and moreover, it is inherently challenging to operationalize because of its implicit nature. This work builds from linguistic and social psychology literature to explore what characterizes empowering language. We then crowdsource a novel dataset of Reddit posts labeled for empowerment, reasons why these posts are empowering to readers, and the social relationships between posters and readers. Our preliminary analyses show that this dataset, which we call TalkUp, can be used to train language models that capture empowering and disempowering language. More broadly, TalkUp provides an avenue to explore implication, presuppositions, and how social context influences the meaning of language. △ Less

Submitted 23 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: Findings of EMNLP 2023

arXiv:2305.10731 [pdf, other]

Analyzing Norm Violations in Live-Stream Chat

Authors: Jihyung Moon, Dong-Ho Lee, Hyundong Cho, Woojeong **, Chan Young Park, Minwoo Kim, Jonathan May, Jay Pujara, Sungjoon Park

Abstract: Toxic language, such as hate speech, can deter users from participating in online communities and enjoying popular platforms. Previous approaches to detecting toxic language and norm violations have been primarily concerned with conversations from online forums and social media, such as Reddit and Twitter. These approaches are less effective when applied to conversations on live-streaming platform… ▽ More Toxic language, such as hate speech, can deter users from participating in online communities and enjoying popular platforms. Previous approaches to detecting toxic language and norm violations have been primarily concerned with conversations from online forums and social media, such as Reddit and Twitter. These approaches are less effective when applied to conversations on live-streaming platforms, such as Twitch and YouTube Live, as each comment is only visible for a limited time and lacks a thread structure that establishes its relationship with other comments. In this work, we share the first NLP study dedicated to detecting norm violations in conversations on live-streaming platforms. We define norm violation categories in live-stream chats and annotate 4,583 moderated comments from Twitch. We articulate several facets of live-stream data that differ from other forums, and demonstrate that existing models perform poorly in this setting. By conducting a user study, we identify the informational context humans use in live-stream moderation, and train models leveraging context to identify norm violations. Our results show that appropriate contextual information can boost moderation performance by 35\%. △ Less

Submitted 7 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

Comments: 17 pages, 8 figures, 15 tables

arXiv:2305.08283 [pdf, other]

From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models

Authors: Shangbin Feng, Chan Young Park, Yuhan Liu, Yulia Tsvetkov

Abstract: Language models (LMs) are pretrained on diverse data sources, including news, discussion forums, books, and online encyclopedias. A significant portion of this data includes opinions and perspectives which, on one hand, celebrate democracy and diversity of ideas, and on the other hand are inherently socially biased. Our work develops new methods to (1) measure political biases in LMs trained on su… ▽ More Language models (LMs) are pretrained on diverse data sources, including news, discussion forums, books, and online encyclopedias. A significant portion of this data includes opinions and perspectives which, on one hand, celebrate democracy and diversity of ideas, and on the other hand are inherently socially biased. Our work develops new methods to (1) measure political biases in LMs trained on such corpora, along social and economic axes, and (2) measure the fairness of downstream NLP models trained on top of politically biased LMs. We focus on hate speech and misinformation detection, aiming to empirically quantify the effects of political (social, economic) biases in pretraining data on the fairness of high-stakes social-oriented tasks. Our findings reveal that pretrained LMs do have political leanings that reinforce the polarization present in pretraining corpora, propagating social biases into hate speech predictions and misinformation detectors. We discuss the implications of our findings for NLP research and propose future directions to mitigate unfairness. △ Less

Submitted 5 July, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

Comments: ACL 2023

arXiv:2211.14981 [pdf, other]

The Grind for Good Data: Understanding ML Practitioners' Struggles and Aspirations in Making Good Data

Authors: Inha Cha, Juhyun Oh, Cheul Young Park, Jiyoon Han, Hwalsuk Lee

Abstract: We thought data to be simply given, but reality tells otherwise; it is costly, situation-dependent, and muddled with dilemmas, constantly requiring human intervention. The ML community's focus on quality data is increasing in the same vein, as good data is vital for successful ML systems. Nonetheless, few works have investigated the dataset builders and the specifics of what they do and struggle t… ▽ More We thought data to be simply given, but reality tells otherwise; it is costly, situation-dependent, and muddled with dilemmas, constantly requiring human intervention. The ML community's focus on quality data is increasing in the same vein, as good data is vital for successful ML systems. Nonetheless, few works have investigated the dataset builders and the specifics of what they do and struggle to make good data. In this study, through semi-structured interviews with 19 ML experts, we present what humans actually do and consider in each step of the data construction pipeline. We further organize their struggles under three themes: 1) trade-offs from real-world constraints; 2) harmonizing assorted data workers for consistency; 3) the necessity of human intuition and tacit knowledge for processing data. Finally, we discuss why such struggles are inevitable for good data and what practitioners aspire, toward providing systematic support for data works. △ Less

Submitted 27 November, 2022; originally announced November 2022.

arXiv:2205.12633 [pdf, other]

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

Authors: Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, ** Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang , et al. (68 additional authors not shown)

Abstract: This paper reviews the challenge on constrained high dynamic range (HDR) imaging that was part of the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2022. This manuscript focuses on the competition set-up, datasets, the proposed methods and their results. The challenge aims at estimating an HDR image from multiple respective low dynamic range (LDR)… ▽ More This paper reviews the challenge on constrained high dynamic range (HDR) imaging that was part of the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2022. This manuscript focuses on the competition set-up, datasets, the proposed methods and their results. The challenge aims at estimating an HDR image from multiple respective low dynamic range (LDR) observations, which might suffer from under- or over-exposed regions and different sources of noise. The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e. solutions can not exceed a given number of operations). In Track 2, participants are asked to minimize the complexity of their solutions while imposing a constraint on fidelity scores (i.e. solutions are required to obtain a higher fidelity score than the prescribed baseline). Both tracks use the same data and metrics: Fidelity is measured by means of PSNR with respect to a ground-truth HDR image (computed both directly and with a canonical tonemap** operation), while complexity metrics include the number of Multiply-Accumulate (MAC) operations and runtime (in seconds). △ Less

Submitted 25 May, 2022; originally announced May 2022.

Comments: CVPR Workshops 2022. 15 pages, 21 figures, 2 tables

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022

arXiv:2205.12382 [pdf, other]

Challenges and Opportunities in Information Manipulation Detection: An Examination of Wartime Russian Media

Authors: Chan Young Park, Julia Mendelsohn, Anjalie Field, Yulia Tsvetkov

Abstract: NLP research on public opinion manipulation campaigns has primarily focused on detecting overt strategies such as fake news and disinformation. However, information manipulation in the ongoing Russia-Ukraine war exemplifies how governments and media also employ more nuanced strategies. We release a new dataset, VoynaSlov, containing 38M+ posts from Russian media outlets on Twitter and VKontakte, a… ▽ More NLP research on public opinion manipulation campaigns has primarily focused on detecting overt strategies such as fake news and disinformation. However, information manipulation in the ongoing Russia-Ukraine war exemplifies how governments and media also employ more nuanced strategies. We release a new dataset, VoynaSlov, containing 38M+ posts from Russian media outlets on Twitter and VKontakte, as well as public activity and responses, immediately preceding and during the 2022 Russia-Ukraine war. We apply standard and recently-developed NLP models on VoynaSlov to examine agenda setting, framing, and priming, several strategies underlying information manipulation, and reveal variation across media outlet control, social media platform, and time. Our examination of these media effects and extensive discussion of current approaches' limitations encourage further development of NLP models for understanding information manipulation in emerging crises, as well as other real-world and interdisciplinary tasks. △ Less

Submitted 24 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

Comments: Findings of EMNLP 2022

arXiv:2205.01931 [pdf, other]

Map** the landscape of histomorphological cancer phenotypes using self-supervised learning on unlabeled, unannotated pathology slides

Authors: Adalberto Claudio Quiros, Nicolas Coudray, Anna Yeaton, Xinyu Yang, Bo**g Liu, Hortense Le, Luis Chiriboga, Afreen Karimkhan, Navneet Narula, David A. Moore, Christopher Y. Park, Harvey Pass, Andre L. Moreira, John Le Quesne, Aristotelis Tsirigos, Ke Yuan

Abstract: Definitive cancer diagnosis and management depend upon the extraction of information from microscopy images by pathologists. These images contain complex information requiring time-consuming expert human interpretation that is prone to human bias. Supervised deep learning approaches have proven powerful for classification tasks, but they are inherently limited by the cost and quality of annotation… ▽ More Definitive cancer diagnosis and management depend upon the extraction of information from microscopy images by pathologists. These images contain complex information requiring time-consuming expert human interpretation that is prone to human bias. Supervised deep learning approaches have proven powerful for classification tasks, but they are inherently limited by the cost and quality of annotations used for training these models. To address this limitation of supervised methods, we developed Histomorphological Phenotype Learning (HPL), a fully blue{self-}supervised methodology that requires no expert labels or annotations and operates via the automatic discovery of discriminatory image features in small image tiles. Tiles are grouped into morphologically similar clusters which constitute a library of histomorphological phenotypes, revealing trajectories from benign to malignant tissue via inflammatory and reactive phenotypes. These clusters have distinct features which can be identified using orthogonal methods, linking histologic, molecular and clinical phenotypes. Applied to lung cancer tissues, we show that they align closely with patient survival, with histopathologically recognised tumor types and growth patterns, and with transcriptomic measures of immunophenotype. We then demonstrate that these properties are maintained in a multi-cancer study. These results show the clusters represent recurrent host responses and modes of tumor growth emerging under natural selection. Code, pre-trained models, learned embeddings, and documentation are available to the community at https://github.com/AdalbertoCq/Histomorphological-Phenotype-Learning △ Less

Submitted 1 September, 2023; v1 submitted 4 May, 2022; originally announced May 2022.

arXiv:2110.04419 [pdf, other]

Detecting Community Sensitive Norm Violations in Online Conversations

Authors: Chan Young Park, Julia Mendelsohn, Karthik Radhakrishnan, Kinjal Jain, Tushar Kanakagiri, David Jurgens, Yulia Tsvetkov

Abstract: Online platforms and communities establish their own norms that govern what behavior is acceptable within the community. Substantial effort in NLP has focused on identifying unacceptable behaviors and, recently, on forecasting them before they occur. However, these efforts have largely focused on toxicity as the sole form of community norm violation. Such focus has overlooked the much larger set o… ▽ More Online platforms and communities establish their own norms that govern what behavior is acceptable within the community. Substantial effort in NLP has focused on identifying unacceptable behaviors and, recently, on forecasting them before they occur. However, these efforts have largely focused on toxicity as the sole form of community norm violation. Such focus has overlooked the much larger set of rules that moderators enforce. Here, we introduce a new dataset focusing on a more complete spectrum of community norms and their violations in the local conversational and global community contexts. We introduce a series of models that use this data to develop context- and community-sensitive norm violation detection, showing that these changes give high performance. △ Less

Submitted 8 October, 2021; originally announced October 2021.

Comments: Findings of EMNLP 2021

arXiv:2105.11366 [pdf, other]

GMAC: A Distributional Perspective on Actor-Critic Framework

Authors: Daniel Wontae Nam, Younghoon Kim, Chan Y. Park

Abstract: In this paper, we devise a distributional framework on actor-critic as a solution to distributional instability, action type restriction, and conflation between samples and statistics. We propose a new method that minimizes the Cramér distance with the multi-step Bellman target distribution generated from a novel Sample-Replacement algorithm denoted SR($λ$), which learns the correct value distribu… ▽ More In this paper, we devise a distributional framework on actor-critic as a solution to distributional instability, action type restriction, and conflation between samples and statistics. We propose a new method that minimizes the Cramér distance with the multi-step Bellman target distribution generated from a novel Sample-Replacement algorithm denoted SR($λ$), which learns the correct value distribution under multiple Bellman operations. Parameterizing a value distribution with Gaussian Mixture Model further improves the efficiency and the performance of the method, which we name GMAC. We empirically show that GMAC captures the correct representation of value distributions and improves the performance of a conventional actor-critic method with low computational cost, in both discrete and continuous action spaces using Arcade Learning Environment (ALE) and PyBullet environment. △ Less

Submitted 15 July, 2021; v1 submitted 24 May, 2021; originally announced May 2021.

Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139:7927-7936, 2021

arXiv:2101.00078 [pdf, other]

doi 10.1145/3485447.3512134

Controlled Analyses of Social Biases in Wikipedia Bios

Authors: Anjalie Field, Chan Young Park, Kevin Z. Lin, Yulia Tsvetkov

Abstract: Social biases on Wikipedia, a widely-read global platform, could greatly influence public opinion. While prior research has examined man/woman gender bias in biography articles, possible influences of other demographic attributes limit conclusions. In this work, we present a methodology for analyzing Wikipedia pages about people that isolates dimensions of interest (e.g., gender), from other attri… ▽ More Social biases on Wikipedia, a widely-read global platform, could greatly influence public opinion. While prior research has examined man/woman gender bias in biography articles, possible influences of other demographic attributes limit conclusions. In this work, we present a methodology for analyzing Wikipedia pages about people that isolates dimensions of interest (e.g., gender), from other attributes (e.g., occupation). Given a target corpus for analysis (e.g.~biographies about women), we present a method for constructing a comparison corpus that matches the target corpus in as many attributes as possible, except the target one. We develop evaluation metrics to measure how well the comparison corpus aligns with the target corpus and then examine how articles about gender and racial minorities (cis. women, non-binary people, transgender women, and transgender men; African American, Asian American, and Hispanic/Latinx American people) differ from other articles. In addition to identifying suspect social biases, our results show that failing to control for covariates can result in different conclusions and veil biases. Our contributions include methodology that facilitates further analyses of bias in Wikipedia articles, findings that can aid Wikipedia editors in reducing biases, and a framework and evaluation metrics to guide future work in this area. △ Less

Submitted 9 February, 2022; v1 submitted 31 December, 2020; originally announced January 2021.

Comments: Accepted to the Web Conference 2022 (WWW '22)

arXiv:2010.10820 [pdf, other]

Multilingual Contextual Affective Analysis of LGBT People Portrayals in Wikipedia

Authors: Chan Young Park, Xinru Yan, Anjalie Field, Yulia Tsvetkov

Abstract: Specific lexical choices in narrative text reflect both the writer's attitudes towards people in the narrative and influence the audience's reactions. Prior work has examined descriptions of people in English using contextual affective analysis, a natural language processing (NLP) technique that seeks to analyze how people are portrayed along dimensions of power, agency, and sentiment. Our work pr… ▽ More Specific lexical choices in narrative text reflect both the writer's attitudes towards people in the narrative and influence the audience's reactions. Prior work has examined descriptions of people in English using contextual affective analysis, a natural language processing (NLP) technique that seeks to analyze how people are portrayed along dimensions of power, agency, and sentiment. Our work presents an extension of this methodology to multilingual settings, which is enabled by a new corpus that we collect and a new multilingual model. We additionally show how word connotations differ across languages and cultures, highlighting the difficulty of generalizing existing English datasets and methods. We then demonstrate the usefulness of our method by analyzing Wikipedia biography pages of members of the LGBT community across three languages: English, Russian, and Spanish. Our results show systematic differences in how the LGBT community is portrayed across languages, surfacing cultural differences in narratives and signs of social biases. Practically, this model can be used to identify Wikipedia articles for further manual analysis -- articles that might contain content gaps or an imbalanced representation of particular social groups. △ Less

Submitted 8 April, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

Comments: ICWSM 2021

arXiv:2008.01354 [pdf, other]

NLPDove at SemEval-2020 Task 12: Improving Offensive Language Detection with Cross-lingual Transfer

Authors: Hwijeen Ahn, Jimin Sun, Chan Young Park, Jungyun Seo

Abstract: This paper describes our approach to the task of identifying offensive languages in a multilingual setting. We investigate two data augmentation strategies: using additional semi-supervised labels with different thresholds and cross-lingual transfer with data selection. Leveraging the semi-supervised dataset resulted in performance improvements compared to the baseline trained solely with the manu… ▽ More This paper describes our approach to the task of identifying offensive languages in a multilingual setting. We investigate two data augmentation strategies: using additional semi-supervised labels with different thresholds and cross-lingual transfer with data selection. Leveraging the semi-supervised dataset resulted in performance improvements compared to the baseline trained solely with the manually-annotated dataset. We propose a new metric, Translation Embedding Distance, to measure the transferability of instances for cross-lingual data selection. We also introduce various preprocessing steps tailored for social media text along with methods to fine-tune the pre-trained multilingual BERT (mBERT) for offensive language identification. Our multilingual systems achieved competitive results in Greek, Danish, and Turkish at OffensEval 2020. △ Less

Submitted 4 August, 2020; originally announced August 2020.

Comments: To be published in SemEval-2020

arXiv:2006.09336 [pdf, other]

Cross-Cultural Similarity Features for Cross-Lingual Transfer Learning of Pragmatically Motivated Tasks

Authors: Jimin Sun, Hwijeen Ahn, Chan Young Park, Yulia Tsvetkov, David R. Mortensen

Abstract: Much work in cross-lingual transfer learning explored how to select better transfer languages for multilingual tasks, primarily focusing on typological and genealogical similarities between languages. We hypothesize that these measures of linguistic proximity are not enough when working with pragmatically-motivated tasks, such as sentiment analysis. As an alternative, we introduce three linguistic… ▽ More Much work in cross-lingual transfer learning explored how to select better transfer languages for multilingual tasks, primarily focusing on typological and genealogical similarities between languages. We hypothesize that these measures of linguistic proximity are not enough when working with pragmatically-motivated tasks, such as sentiment analysis. As an alternative, we introduce three linguistic features that capture cross-cultural similarities that manifest in linguistic patterns and quantify distinct aspects of language pragmatics: language context-level, figurative language, and the lexification of emotion concepts. Our analyses show that the proposed pragmatic features do capture cross-cultural similarities and align well with existing work in sociolinguistics and linguistic anthropology. We further corroborate the effectiveness of pragmatically-driven transfer in the downstream task of choosing transfer languages for cross-lingual sentiment analysis. △ Less

Submitted 8 April, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

Comments: EACL 2021

arXiv:2005.04120 [pdf, other]

doi 10.1038/s41597-020-00630-y

K-EmoCon, a multimodal sensor dataset for continuous emotion recognition in naturalistic conversations

Authors: Cheul Young Park, Narae Cha, Soowon Kang, Auk Kim, Ahsan Habib Khandoker, Leontios Hadjileontiadis, Alice Oh, Yong Jeong, Uichin Lee

Abstract: Recognizing emotions during social interactions has many potential applications with the popularization of low-cost mobile sensors, but a challenge remains with the lack of naturalistic affective interaction data. Most existing emotion datasets do not support studying idiosyncratic emotions arising in the wild as they were collected in constrained environments. Therefore, studying emotions in the… ▽ More Recognizing emotions during social interactions has many potential applications with the popularization of low-cost mobile sensors, but a challenge remains with the lack of naturalistic affective interaction data. Most existing emotion datasets do not support studying idiosyncratic emotions arising in the wild as they were collected in constrained environments. Therefore, studying emotions in the context of social interactions requires a novel dataset, and K-EmoCon is such a multimodal dataset with comprehensive annotations of continuous emotions during naturalistic conversations. The dataset contains multimodal measurements, including audiovisual recordings, EEG, and peripheral physiological signals, acquired with off-the-shelf devices from 16 sessions of approximately 10-minute long paired debates on a social issue. Distinct from previous datasets, it includes emotion annotations from all three available perspectives: self, debate partner, and external observers. Raters annotated emotional displays at intervals of every 5 seconds while viewing the debate footage, in terms of arousal-valence and 18 additional categorical emotions. The resulting K-EmoCon is the first publicly available emotion dataset accommodating the multiperspective assessment of emotions during social interactions. △ Less

Submitted 19 May, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

Comments: 20 pages, 4 figures, for associated dataset, see https://doi.org/10.5281/zenodo.3814370

Journal ref: Sci Data 7, (2020) 293

arXiv:1910.12748 [pdf, other]

A Study of Machine Learning Models in Predicting the Intention of Adolescents to Smoke Cigarettes

Authors: Seung Joon Nam, Han Min Kim, Thomas Kang, Cheol Young Park

Abstract: The use of electronic cigarette (e-cigarette) is increasing among adolescents. This is problematic since consuming nicotine at an early age can cause harmful effects in develo** teenager's brain and health. Additionally, the use of e-cigarette has a possibility of leading to the use of cigarettes, which is more severe. There were many researches about e-cigarette and cigarette that mostly focuse… ▽ More The use of electronic cigarette (e-cigarette) is increasing among adolescents. This is problematic since consuming nicotine at an early age can cause harmful effects in develo** teenager's brain and health. Additionally, the use of e-cigarette has a possibility of leading to the use of cigarettes, which is more severe. There were many researches about e-cigarette and cigarette that mostly focused on finding and analyzing causes of smoking using conventional statistics. However, there is a lack of research on develo** prediction models, which is more applicable to anti-smoking campaign, about e-cigarette and cigarette. In this paper, we research the prediction models that can be used to predict an individual e-cigarette user's (including non-e-cigarette users) intention to smoke cigarettes, so that one can be early informed about the risk of going down the path of smoking cigarettes. To construct the prediction models, five machine learning (ML) algorithms are exploited and tested for their accuracy in predicting the intention to smoke cigarettes among never smokers using data from the 2018 National Youth Tobacco Survey (NYTS). In our investigation, the Gradient Boosting Classifier, one of the prediction models, shows the highest accuracy out of all the other models. Also, with the best prediction model, we made a public website that enables users to input information to predict their intentions of smoking cigarettes. △ Less

Submitted 31 October, 2019; v1 submitted 28 October, 2019; originally announced October 2019.

arXiv:1904.12958 [pdf]

Predictive Situation Awareness for Ebola Virus Disease using a Collective Intelligence Multi-Model Integration Platform: Bayes Cloud

Authors: Cheol Young Park, Shou Matsumoto, Jubyung Ha, YoungWon Park

Abstract: The humanity has been facing a plethora of challenges associated with infectious diseases, which kill more than 6 million people a year. Although continuous efforts have been applied to relieve the potential damages from such misfortunate events, it is unquestionable that there are many persisting challenges yet to overcome. One related issue we particularly address here is the assessment and pred… ▽ More The humanity has been facing a plethora of challenges associated with infectious diseases, which kill more than 6 million people a year. Although continuous efforts have been applied to relieve the potential damages from such misfortunate events, it is unquestionable that there are many persisting challenges yet to overcome. One related issue we particularly address here is the assessment and prediction of such epidemics. In this field of study, traditional and ad-hoc models frequently fail to provide proper predictive situation awareness (PSAW), characterized by understanding the current situations and predicting the future situations. Comprehensive PSAW for infectious disease can support decision making and help to hinder disease spread. In this paper, we develop a computing system platform focusing on collective intelligence causal modeling, in order to support PSAW in the domain of infectious disease. Analyses of global epidemics require integration of multiple different data and models, which can be originated from multiple independent researchers. These models should be integrated to accurately assess and predict the infectious disease in terms of holistic view. The system shall provide three main functions: (1) collaborative causal modeling, (2) causal model integration, and (3) causal model reasoning. These functions are supported by subject-matter expert and artificial intelligence (AI), with uncertainty treatment. Subject-matter experts, as collective intelligence, develop causal models and integrate them as one joint causal model. The integrated causal model shall be used to reason about: (1) the past, regarding how the causal factors have occurred; (2) the present, regarding how the spread is going now; and (3) the future, regarding how it will proceed. Finally, we introduce one use case of predictive situation awareness for the Ebola virus disease. △ Less

Submitted 4 May, 2019; v1 submitted 29 April, 2019; originally announced April 2019.

arXiv:1806.02457 [pdf]

Reference Model of Multi-Entity Bayesian Networks for Predictive Situation Awareness

Authors: Cheol Young Park, Kathryn Blackmond Laskey

Abstract: During the past quarter-century, situation awareness (SAW) has become a critical research theme, because of its importance. Since the concept of SAW was first introduced during World War I, various versions of SAW have been researched and introduced. Predictive Situation Awareness (PSAW) focuses on the ability to predict aspects of a temporally evolving situation over time. PSAW requires a formal… ▽ More During the past quarter-century, situation awareness (SAW) has become a critical research theme, because of its importance. Since the concept of SAW was first introduced during World War I, various versions of SAW have been researched and introduced. Predictive Situation Awareness (PSAW) focuses on the ability to predict aspects of a temporally evolving situation over time. PSAW requires a formal representation and a reasoning method using such a representation. A Multi-Entity Bayesian Network (MEBN) is a knowledge representation formalism combining Bayesian Networks (BN) with First-Order Logic (FOL). MEBN can be used to represent uncertain situations (supported by BN) as well as complex situations (supported by FOL). Also, efficient reasoning algorithms for MEBN have been developed. MEBN can be a formal representation to support PSAW and has been used for several PSAW systems. Although several MEBN applications for PSAW exist, very little work can be found in the literature that attempts to generalize a MEBN model to support PSAW. In this research, we define a reference model for MEBN in PSAW, called a PSAW-MEBN reference model. The PSAW-MEBN reference model enables us to easily develop a MEBN model for PSAW by supporting the design of a MEBN model for PSAW. In this research, we introduce two example use cases using the PSAW-MEBN reference model to develop MEBN models to support PSAW: a Smart Manufacturing System and a Maritime Domain Awareness System. △ Less

Submitted 7 June, 2018; v1 submitted 6 June, 2018; originally announced June 2018.

arXiv:1806.02455 [pdf]

doi 10.3390/app9091743

MEBN-RM: A Map** between Multi-Entity Bayesian Network and Relational Model

Authors: Cheol Young Park, Kathryn Blackmond Laskey

Abstract: Multi-Entity Bayesian Network (MEBN) is a knowledge representation formalism combining Bayesian Networks (BN) with First-Order Logic (FOL). MEBN has sufficient expressive power for general-purpose knowledge representation and reasoning. Develo** a MEBN model to support a given application is a challenge, requiring definition of entities, relationships, random variables, conditional dependence re… ▽ More Multi-Entity Bayesian Network (MEBN) is a knowledge representation formalism combining Bayesian Networks (BN) with First-Order Logic (FOL). MEBN has sufficient expressive power for general-purpose knowledge representation and reasoning. Develo** a MEBN model to support a given application is a challenge, requiring definition of entities, relationships, random variables, conditional dependence relationships, and probability distributions. When available, data can be invaluable both to improve performance and to streamline development. By far the most common format for available data is the relational database (RDB). Relational databases describe and organize data according to the Relational Model (RM). Develo** a MEBN model from data stored in an RDB therefore requires map** between the two formalisms. This paper presents MEBN-RM, a set of map** rules between key elements of MEBN and RM. We identify links between the two languages (RM and MEBN) and define four levels of map** from elements of RM to elements of MEBN. These definitions are implemented in the MEBN-RM algorithm, which converts a relational schema in RM to a partial MEBN model. Through this research, the software has been released as a MEBN-RM open-source software tool. The method is illustrated through two example use cases using MEBN-RM to develop MEBN models: a Critical Infrastructure Defense System and a Smart Manufacturing System. △ Less

Submitted 7 June, 2018; v1 submitted 6 June, 2018; originally announced June 2018.

Journal ref: Applied Sciences 2019,9

arXiv:1806.02421 [pdf]

Human-aided Multi-Entity Bayesian Networks Learning from Relational Data

Authors: Cheol Young Park, Kathryn Blackmond Laskey

Abstract: An Artificial Intelligence (AI) system is an autonomous system which emulates human mental and physical activities such as Observe, Orient, Decide, and Act, called the OODA process. An AI system performing the OODA process requires a semantically rich representation to handle a complex real world situation and ability to reason under uncertainty about the situation. Multi-Entity Bayesian Networks… ▽ More An Artificial Intelligence (AI) system is an autonomous system which emulates human mental and physical activities such as Observe, Orient, Decide, and Act, called the OODA process. An AI system performing the OODA process requires a semantically rich representation to handle a complex real world situation and ability to reason under uncertainty about the situation. Multi-Entity Bayesian Networks (MEBNs) combines First-Order Logic with Bayesian Networks for representing and reasoning about uncertainty in complex, knowledge-rich domains. MEBN goes beyond standard Bayesian networks to enable reasoning about an unknown number of entities interacting with each other in various types of relationships, a key requirement for the OODA process of an AI system. MEBN models have heretofore been constructed manually by a domain expert. However, manual MEBN modeling is labor-intensive and insufficiently agile. To address these problems, an efficient method is needed for MEBN modeling. One of the methods is to use machine learning to learn a MEBN model in whole or in part from data. In the era of Big Data, data-rich environments, characterized by uncertainty and complexity, have become ubiquitous. The larger the data sample is, the more accurate the results of the machine learning approach can be. Therefore, machine learning has potential to improve the quality of MEBN models as well as the effectiveness for MEBN modeling. In this research, we study a MEBN learning framework to develop a MEBN model from a combination of domain expert's knowledge and data. To evaluate the MEBN learning framework, we conduct an experiment to compare the MEBN learning framework and the existing manual MEBN modeling in terms of development efficiency. △ Less

Submitted 6 June, 2018; originally announced June 2018.

arXiv:1806.02415 [pdf]

doi 10.3390/app9102055

Gaussian Mixture Reduction for Time-Constrained Approximate Inference in Hybrid Bayesian Networks

Authors: Cheol Young Park, Kathryn Blackmond Laskey, Paulo C. G. Costa, Shou Matsumoto

Abstract: Hybrid Bayesian Networks (HBNs), which contain both discrete and continuous variables, arise naturally in many application areas (e.g., image understanding, data fusion, medical diagnosis, fraud detection). This paper concerns inference in an important subclass of HBNs, the conditional Gaussian (CG) networks, in which all continuous random variables have Gaussian distributions and all children of… ▽ More Hybrid Bayesian Networks (HBNs), which contain both discrete and continuous variables, arise naturally in many application areas (e.g., image understanding, data fusion, medical diagnosis, fraud detection). This paper concerns inference in an important subclass of HBNs, the conditional Gaussian (CG) networks, in which all continuous random variables have Gaussian distributions and all children of continuous random variables must be continuous. Inference in CG networks can be NP-hard even for special-case structures, such as poly-trees, where inference in discrete Bayesian networks can be performed in polynomial time. Therefore, approximate inference is required. In approximate inference, it is often necessary to trade off accuracy against solution time. This paper presents an extension to the Hybrid Message Passing inference algorithm for general CG networks and an algorithm for optimizing its accuracy given a bound on computation time. The extended algorithm uses Gaussian mixture reduction to prevent an exponential increase in the number of Gaussian mixture components. The trade-off algorithm performs pre-processing to find optimal run-time settings for the extended algorithm. Experimental results for four CG networks compare performance of the extended algorithm with existing algorithms and show the optimal settings for these CG networks. △ Less

Submitted 6 June, 2018; originally announced June 2018.

Journal ref: Appl. Sci. 2019, 9, 2055

Showing 1–25 of 25 results for author: Park, C Y