Skip to main content

Showing 1–32 of 32 results for author: Durmus, E

.
  1. arXiv:2406.07814  [pdf, other

    cs.AI cs.CL cs.HC

    Collective Constitutional AI: Aligning a Language Model with Public Input

    Authors: Saffron Huang, Divya Siddarth, Liane Lovitt, Thomas I. Liao, Esin Durmus, Alex Tamkin, Deep Ganguli

    Abstract: There is growing consensus that language model (LM) developers should not be the sole deciders of LM behavior, creating a need for methods that enable the broader public to collectively shape the behavior of LM systems that affect them. To address this need, we present Collective Constitutional AI (CCAI): a multi-stage process for sourcing and integrating public input into LMs-from identifying a t… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    ACM Class: I.2.7; K.4.2

    Journal ref: Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency. 1395-1417

  2. arXiv:2404.01651  [pdf, other

    cs.CL cs.CY cs.HC cs.SI

    NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps

    Authors: Kristina Gligoric, Myra Cheng, Lucia Zheng, Esin Durmus, Dan Jurafsky

    Abstract: The use of words to convey speaker's intent is traditionally distinguished from the `mention' of words for quoting what someone said, or pointing out properties of a word. Here we show that computationally modeling this use-mention distinction is crucial for dealing with counterspeech online. Counterspeech that refutes problematic content often mentions harmful language but is not harmful itself (… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 (Main conference)

  3. arXiv:2312.03689  [pdf, other

    cs.CL

    Evaluating and Mitigating Discrimination in Language Model Decisions

    Authors: Alex Tamkin, Amanda Askell, Liane Lovitt, Esin Durmus, Nicholas Joseph, Shauna Kravec, Karina Nguyen, Jared Kaplan, Deep Ganguli

    Abstract: As language models (LMs) advance, interest is growing in applying them to high-stakes societal decisions, such as determining financing or housing eligibility. However, their potential for discrimination in such contexts raises ethical concerns, motivating the need for better methods to evaluate these risks. We present a method for proactively evaluating the potential discriminatory impact of LMs… ▽ More

    Submitted 6 December, 2023; originally announced December 2023.

  4. arXiv:2310.13798  [pdf, other

    cs.CL cs.AI

    Specific versus General Principles for Constitutional AI

    Authors: Sandipan Kundu, Yuntao Bai, Saurav Kadavath, Amanda Askell, Andrew Callahan, Anna Chen, Anna Goldie, Avital Balwit, Azalia Mirhoseini, Brayden McLean, Catherine Olsson, Cassie Evraets, Eli Tran-Johnson, Esin Durmus, Ethan Perez, Jackson Kernion, Jamie Kerr, Kamal Ndousse, Karina Nguyen, Nelson Elhage, Newton Cheng, Nicholas Schiefer, Nova DasSarma, Oliver Rausch, Robin Larson , et al. (11 additional authors not shown)

    Abstract: Human feedback can prevent overtly harmful utterances in conversational models, but may not automatically mitigate subtle problematic behaviors such as a stated desire for self-preservation or power. Constitutional AI offers an alternative, replacing human feedback with feedback from AI models conditioned only on a list of written principles. We find this approach effectively prevents the expressi… ▽ More

    Submitted 20 October, 2023; originally announced October 2023.

  5. arXiv:2310.13548  [pdf, other

    cs.CL cs.AI cs.LG stat.ML

    Towards Understanding Sycophancy in Language Models

    Authors: Mrinank Sharma, Meg Tong, Tomasz Korbak, David Duvenaud, Amanda Askell, Samuel R. Bowman, Newton Cheng, Esin Durmus, Zac Hatfield-Dodds, Scott R. Johnston, Shauna Kravec, Timothy Maxwell, Sam McCandlish, Kamal Ndousse, Oliver Rausch, Nicholas Schiefer, Da Yan, Miranda Zhang, Ethan Perez

    Abstract: Human feedback is commonly utilized to finetune AI assistants. But human feedback may also encourage model responses that match user beliefs over truthful ones, a behaviour known as sycophancy. We investigate the prevalence of sycophancy in models whose finetuning procedure made use of human feedback, and the potential role of human preference judgments in such behavior. We first demonstrate that… ▽ More

    Submitted 27 October, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: 32 pages, 20 figures

    ACM Class: I.2.6

  6. arXiv:2308.03296  [pdf, other

    cs.LG cs.CL stat.ML

    Studying Large Language Model Generalization with Influence Functions

    Authors: Roger Grosse, Juhan Bae, Cem Anil, Nelson Elhage, Alex Tamkin, Amirhossein Tajdini, Benoit Steiner, Dustin Li, Esin Durmus, Ethan Perez, Evan Hubinger, Kamilė Lukošiūtė, Karina Nguyen, Nicholas Joseph, Sam McCandlish, Jared Kaplan, Samuel R. Bowman

    Abstract: When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior? Influence functions aim to answer a counterfactual: how would the model's parameters (and hence its outputs) change if a given sequence were added to the training set?… ▽ More

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: 119 pages, 47 figures, 22 tables

  7. arXiv:2307.13702  [pdf, other

    cs.AI cs.CL cs.LG

    Measuring Faithfulness in Chain-of-Thought Reasoning

    Authors: Tamera Lanham, Anna Chen, Ansh Radhakrishnan, Benoit Steiner, Carson Denison, Danny Hernandez, Dustin Li, Esin Durmus, Evan Hubinger, Jackson Kernion, Kamilė Lukošiūtė, Karina Nguyen, Newton Cheng, Nicholas Joseph, Nicholas Schiefer, Oliver Rausch, Robin Larson, Sam McCandlish, Sandipan Kundu, Saurav Kadavath, Shannon Yang, Thomas Henighan, Timothy Maxwell, Timothy Telleen-Lawton, Tristan Hume , et al. (5 additional authors not shown)

    Abstract: Large language models (LLMs) perform better when they produce step-by-step, "Chain-of-Thought" (CoT) reasoning before answering a question, but it is unclear if the stated reasoning is a faithful explanation of the model's actual reasoning (i.e., its process for answering the question). We investigate hypotheses for how CoT reasoning may be unfaithful, by examining how the model predictions change… ▽ More

    Submitted 16 July, 2023; originally announced July 2023.

  8. arXiv:2307.11768  [pdf, other

    cs.CL cs.AI cs.LG

    Question Decomposition Improves the Faithfulness of Model-Generated Reasoning

    Authors: Ansh Radhakrishnan, Karina Nguyen, Anna Chen, Carol Chen, Carson Denison, Danny Hernandez, Esin Durmus, Evan Hubinger, Jackson Kernion, Kamilė Lukošiūtė, Newton Cheng, Nicholas Joseph, Nicholas Schiefer, Oliver Rausch, Sam McCandlish, Sheer El Showk, Tamera Lanham, Tim Maxwell, Venkatesa Chandrasekaran, Zac Hatfield-Dodds, Jared Kaplan, Jan Brauner, Samuel R. Bowman, Ethan Perez

    Abstract: As large language models (LLMs) perform more difficult tasks, it becomes harder to verify the correctness and safety of their behavior. One approach to help with this issue is to prompt LLMs to externalize their reasoning, e.g., by having them generate step-by-step reasoning as they answer a question (Chain-of-Thought; CoT). The reasoning may enable us to check the process that models use to perfo… ▽ More

    Submitted 25 July, 2023; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: For few-shot examples and prompts, see https://github.com/anthropics/DecompositionFaithfulnessPaper

  9. arXiv:2306.16388  [pdf, other

    cs.CL cs.AI

    Towards Measuring the Representation of Subjective Global Opinions in Language Models

    Authors: Esin Durmus, Karina Nguyen, Thomas I. Liao, Nicholas Schiefer, Amanda Askell, Anton Bakhtin, Carol Chen, Zac Hatfield-Dodds, Danny Hernandez, Nicholas Joseph, Liane Lovitt, Sam McCandlish, Orowa Sikder, Alex Tamkin, Janel Thamkul, Jared Kaplan, Jack Clark, Deep Ganguli

    Abstract: Large language models (LLMs) may not equitably represent diverse global perspectives on societal issues. In this paper, we develop a quantitative framework to evaluate whose opinions model-generated responses are more similar to. We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across dif… ▽ More

    Submitted 11 April, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

  10. arXiv:2306.11932  [pdf, other

    cs.SI cs.CL cs.CY cs.HC

    Opportunities and Risks of LLMs for Scalable Deliberation with Polis

    Authors: Christopher T. Small, Ivan Vendrov, Esin Durmus, Hadjar Homaei, Elizabeth Barry, Julien Cornebise, Ted Suzman, Deep Ganguli, Colin Megill

    Abstract: Polis is a platform that leverages machine intelligence to scale up deliberative processes. In this paper, we explore the opportunities and risks associated with applying Large Language Models (LLMs) towards challenges with facilitating, moderating and summarizing the results of Polis engagements. In particular, we demonstrate with pilot experiments using Anthropic's Claude that LLMs can indeed au… ▽ More

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: 31 pages (main body; 45 with Bibliography and Appendix), 6 figures

  11. arXiv:2305.18189  [pdf, other

    cs.CL cs.AI cs.CY

    Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models

    Authors: Myra Cheng, Esin Durmus, Dan Jurafsky

    Abstract: To recognize and mitigate harms from large language models (LLMs), we need to understand the prevalence and nuances of stereotypes in LLM outputs. Toward this end, we present Marked Personas, a prompt-based method to measure stereotypes in LLMs for intersectional demographic groups without any lexicon or data labeling. Grounded in the sociolinguistic concept of markedness (which characterizes expl… ▽ More

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: To appear at ACL 2023, 9 pages, 3 figures, 3 tables

  12. arXiv:2303.17548  [pdf, other

    cs.CL cs.AI cs.CY cs.LG

    Whose Opinions Do Language Models Reflect?

    Authors: Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, Tatsunori Hashimoto

    Abstract: Language models (LMs) are increasingly being used in open-ended contexts, where the opinions reflected by LMs in response to subjective queries can have a profound impact, both on user satisfaction, as well as sha** the views of society at large. In this work, we put forth a quantitative framework to investigate the opinions reflected by LMs -- by leveraging high-quality public opinion polls and… ▽ More

    Submitted 30 March, 2023; originally announced March 2023.

  13. arXiv:2301.13848  [pdf, other

    cs.CL cs.AI cs.LG

    Benchmarking Large Language Models for News Summarization

    Authors: Tianyi Zhang, Faisal Ladhak, Esin Durmus, Percy Liang, Kathleen McKeown, Tatsunori B. Hashimoto

    Abstract: Large language models (LLMs) have shown promise for automatic summarization but the reasons behind their successes are poorly understood. By conducting a human evaluation on ten LLMs across different pretraining methods, prompts, and model scales, we make two important observations. First, we find instruction tuning, and not model size, is the key to the LLM's zero-shot summarization capability. S… ▽ More

    Submitted 31 January, 2023; originally announced January 2023.

  14. arXiv:2212.10722  [pdf, other

    cs.CL

    Contrastive Error Attribution for Finetuned Language Models

    Authors: Faisal Ladhak, Esin Durmus, Tatsunori Hashimoto

    Abstract: Recent work has identified noisy and misannotated data as a core cause of hallucinations and unfaithful outputs in Natural Language Generation (NLG) tasks. Consequently, identifying and removing these examples is a key open challenge in creating reliable NLG systems. In this work, we introduce a framework to identify and remove low-quality training instances that lead to undesirable outputs, such… ▽ More

    Submitted 11 July, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  15. arXiv:2212.09746  [pdf, other

    cs.CL

    Evaluating Human-Language Model Interaction

    Authors: Mina Lee, Megha Srivastava, Amelia Hardy, John Thickstun, Esin Durmus, Ashwin Paranjape, Ines Gerard-Ursin, Xiang Lisa Li, Faisal Ladhak, Frieda Rong, Rose E. Wang, Minae Kwon, Joon Sung Park, Hancheng Cao, Tony Lee, Rishi Bommasani, Michael Bernstein, Percy Liang

    Abstract: Many real-world applications of language models (LMs), such as writing assistance and code autocomplete, involve human-LM interaction. However, most benchmarks are non-interactive in that a model produces output without human involvement. To evaluate human-LM interaction, we develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive… ▽ More

    Submitted 5 January, 2024; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI)

  16. arXiv:2211.09110  [pdf, other

    cs.CL cs.AI cs.LG

    Holistic Evaluation of Language Models

    Authors: Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao , et al. (25 additional authors not shown)

    Abstract: Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models. First, we taxonomize the vast space of potential scenarios (i.e. use cases) and metrics (i.e. desiderata) that are of interest fo… ▽ More

    Submitted 1 October, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Project page: https://crfm.stanford.edu/helm/v1.0

    Journal ref: Published in Transactions on Machine Learning Research (TMLR), 2023

  17. Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale

    Authors: Federico Bianchi, Pratyusha Kalluri, Esin Durmus, Faisal Ladhak, Myra Cheng, Debora Nozza, Tatsunori Hashimoto, Dan Jurafsky, James Zou, Aylin Caliskan

    Abstract: Machine learning models that convert user-written text descriptions into images are now widely available online and used by millions of users to generate millions of images a day. We investigate the potential for these models to amplify dangerous and complex stereotypes. We find a broad range of ordinary prompts produce stereotypes, including prompts simply mentioning traits, descriptors, occupati… ▽ More

    Submitted 7 June, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

    Comments: FAccT 2023 paper. The published version is available at 10.1145/3593013.3594095

  18. arXiv:2206.11249  [pdf, other

    cs.CL cs.AI cs.LG

    GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

    Authors: Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di **, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter , et al. (52 additional authors not shown)

    Abstract: Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables the comparison on equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, an… ▽ More

    Submitted 24 June, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

  19. arXiv:2204.09890  [pdf, other

    cs.CL

    Spurious Correlations in Reference-Free Evaluation of Text Generation

    Authors: Esin Durmus, Faisal Ladhak, Tatsunori Hashimoto

    Abstract: Model-based, reference-free evaluation metrics have been proposed as a fast and cost-effective approach to evaluate Natural Language Generation (NLG) systems. Despite promising recent results, we find evidence that reference-free evaluation metrics of summarization and dialog generation may be relying on spurious correlations with measures such as word overlap, perplexity, and length. We further o… ▽ More

    Submitted 21 April, 2022; originally announced April 2022.

    Comments: Published in ACL 2022 main conference

  20. arXiv:2203.11370  [pdf, other

    cs.CL cs.LG

    Language modeling via stochastic processes

    Authors: Rose E Wang, Esin Durmus, Noah Goodman, Tatsunori Hashimoto

    Abstract: Modern language models can generate high-quality short texts. However, they often meander or are incoherent when generating longer texts. These issues arise from the next-token-only language modeling objective. Recent work in self-supervised learning suggests that models can learn good latent representations via contrastive learning, which can be effective for discriminative tasks. Our work analyz… ▽ More

    Submitted 11 May, 2023; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: Correct claims on goal-directed decoding, adds non-goal-directed baselines. ICLR 2022 Oral. Code: https://github.com/rosewang2008/language_modeling_via_stochastic_processes

  21. arXiv:2110.01078  [pdf, other

    cs.CL

    Towards Understanding Persuasion in Computational Argumentation

    Authors: Esin Durmus

    Abstract: Opinion formation and persuasion in argumentation are affected by three major factors: the argument itself, the source of the argument, and the properties of the audience. Understanding the role of each and the interplay between them is crucial for obtaining insights regarding argument interpretation and generation. It is particularly important for building effective argument generation systems th… ▽ More

    Submitted 3 October, 2021; originally announced October 2021.

    Comments: PhD Dissertation

  22. arXiv:2108.13684  [pdf, other

    cs.CL

    Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization

    Authors: Faisal Ladhak, Esin Durmus, He He, Claire Cardie, Kathleen McKeown

    Abstract: Despite recent progress in abstractive summarization, systems still suffer from faithfulness errors. While prior work has proposed models that improve faithfulness, it is unclear whether the improvement comes from an increased level of extractiveness of the model outputs as one naive way to improve faithfulness is to make summarization models more extractive. In this work, we present a framework f… ▽ More

    Submitted 21 April, 2022; v1 submitted 31 August, 2021; originally announced August 2021.

    Comments: Published in ACL 2022 main conference

  23. arXiv:2108.07258  [pdf, other

    cs.LG cs.AI cs.CY

    On the Opportunities and Risks of Foundation Models

    Authors: Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh , et al. (89 additional authors not shown)

    Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their cap… ▽ More

    Submitted 12 July, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Report page with citation guidelines: https://crfm.stanford.edu/report.html

  24. arXiv:2102.01672  [pdf, other

    cs.CL cs.AI cs.LG

    The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics

    Authors: Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak , et al. (31 additional authors not shown)

    Abstract: We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. Due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it… ▽ More

    Submitted 1 April, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

  25. arXiv:2010.03538  [pdf, other

    cs.CL

    Exploring the Role of Argument Structure in Online Debate Persuasion

    Authors: Jialu Li, Esin Durmus, Claire Cardie

    Abstract: Online debate forums provide users a platform to express their opinions on controversial topics while being exposed to opinions from diverse set of viewpoints. Existing work in Natural Language Processing (NLP) has shown that linguistic features extracted from the debate text and features encoding the characteristics of the audience are both critical in persuasion studies. In this paper, we aim to… ▽ More

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted to EMNLP 2020

  26. arXiv:2010.03093  [pdf, other

    cs.CL

    WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization

    Authors: Faisal Ladhak, Esin Durmus, Claire Cardie, Kathleen McKeown

    Abstract: We introduce WikiLingua, a large-scale, multilingual dataset for the evaluation of crosslingual abstractive summarization systems. We extract article and summary pairs in 18 languages from WikiHow, a high quality, collaborative resource of how-to guides on a diverse set of topics written by human authors. We create gold-standard article-summary alignments across languages by aligning the images th… ▽ More

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: Findings of EMNLP 2020

  27. FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization

    Authors: Esin Durmus, He He, Mona Diab

    Abstract: Neural abstractive summarization models are prone to generate content inconsistent with the source document, i.e. unfaithful. Existing automatic metrics do not capture such mistakes effectively. We tackle the problem of evaluating faithfulness of a generated summary given its source document. We first collected human annotations of faithfulness for outputs from numerous models on two datasets. We… ▽ More

    Submitted 7 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL 2020

  28. The Role of Pragmatic and Discourse Context in Determining Argument Impact

    Authors: Esin Durmus, Faisal Ladhak, Claire Cardie

    Abstract: Research in the social sciences and psychology has shown that the persuasiveness of an argument depends not only the language employed, but also on attributes of the source/communicator, the audience, and the appropriateness and strength of the argument's claims given the pragmatic and discourse context of the argument. Among these characteristics of persuasive arguments, prior work in NLP does no… ▽ More

    Submitted 6 April, 2020; originally announced April 2020.

    Comments: EMNLP 2019

  29. Determining Relative Argument Specificity and Stance for Complex Argumentative Structures

    Authors: Esin Durmus, Faisal Ladhak, Claire Cardie

    Abstract: Systems for automatic argument generation and debate require the ability to (1) determine the stance of any claims employed in the argument and (2) assess the specificity of each claim relative to the argument context. Existing work on understanding claim specificity and stance, however, has been limited to the study of argumentative structures that are relatively shallow, most often consisting of… ▽ More

    Submitted 26 June, 2019; originally announced June 2019.

  30. A Corpus for Modeling User and Language Effects in Argumentation on Online Debating

    Authors: Esin Durmus, Claire Cardie

    Abstract: Existing argumentation datasets have succeeded in allowing researchers to develop computational methods for analyzing the content, structure and linguistic features of argumentative text. They have been much less successful in fostering studies of the effect of "user" traits -- characteristics and beliefs of the participants -- on the debate/argument outcome as this type of user information is gen… ▽ More

    Submitted 26 June, 2019; originally announced June 2019.

  31. Exploring the Role of Prior Beliefs for Argument Persuasion

    Authors: Esin Durmus, Claire Cardie

    Abstract: Public debate forums provide a common platform for exchanging opinions on a topic of interest. While recent studies in natural language processing (NLP) have provided empirical evidence that the language of the debaters and their patterns of interaction play a key role in changing the mind of a reader, research in psychology has shown that prior beliefs can affect our interpretation of an argument… ▽ More

    Submitted 26 June, 2019; originally announced June 2019.

    Comments: 11 pages

  32. IllusionPIN: Shoulder-Surfing Resistant Authentication Using Hybrid Images

    Authors: Athanasios Papadopoulos, Toan Nguyen, Emre Durmus, Nasir Memon

    Abstract: We address the problem of shoulder-surfing attacks on authentication schemes by proposing IllusionPIN (IPIN), a PIN-based authentication method that operates on touchscreen devices. IPIN uses the technique of hybrid images to blend two keypads with different digit orderings in such a way, that the user who is close to the device is seeing one keypad to enter her PIN, while the attacker who is look… ▽ More

    Submitted 22 August, 2017; originally announced August 2017.

    Comments: 15 pages, 10 figures, IEEE Transactions on Information Forensics and Security, 2017