Showing 1–30 of 30 results for author: Adali, S

  1. arXiv:2205.07970  [pdf, other]

    cs.CY cs.SI physics.soc-ph

    SciLander: Mapping the Scientific News Landscape

    Authors: Maurício Gruppi, Panayiotis Smeros, Sibel Adalı, Carlos Castillo, Karl Aberer

    Abstract: The COVID-19 pandemic has fueled the spread of misinformation on social media and the Web as a whole. The phenomenon dubbed `infodemic' has taken the challenges of information veracity and trust to new heights by massively introducing seemingly scientific and technical elements into misleading content. Despite the existing body of work on modeling and predicting misinformation, the coverage of ver…

    Submitted 16 May, 2022; originally announced May 2022.

  2. arXiv:2203.08600  [pdf, other]

    cs.CY cs.MM cs.SI

    NELA-Local: A Dataset of U.S. Local News Articles for the Study of County-level News Ecosystems

    Authors: Benjamin D. Horne, Maurício Gruppi, Kenneth Joseph, Jon Green, John P. Wihbey, Sibel Adalı

    Abstract: In this paper, we present a dataset of over 1.4M online news articles from 313 local U.S. news outlets published over 20 months (between April 4th, 2020 and December 31st, 2021). These outlets cover a geographically diverse set of communities across the United States. In order to estimate characteristics of the local audience, included with this news article data is a wide range of county-level me…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: Published at ICWSM 2022

  3. arXiv:2203.05659  [pdf, other]

    cs.CL cs.CY cs.LG cs.SI

    NELA-GT-2022: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles

    Authors: Maurício Gruppi, Benjamin D. Horne, Sibel Adalı

    Abstract: In this paper, we present the fifth installment of the NELA-GT datasets, NELA-GT-2022. The dataset contains 1,778,361 articles from 361 outlets between January 1st, 2022 and December 31st, 2022. Just as in past releases of the dataset, NELA-GT-2022 includes outlet-level veracity labels from Media Bias/Fact Check and tweets embedded in collected news articles. The NELA-GT-2022 dataset can be found…

    Submitted 17 March, 2023; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: Technical report documenting the NELA-GT recent update (NELA-GT-2022). arXiv admin note: substantial text overlap with arXiv:2102.04567

  4. arXiv:2102.04567  [pdf, other]

    cs.CY

    NELA-GT-2020: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles

    Authors: Maurício Gruppi, Benjamin D. Horne, Sibel Adalı

    Abstract: In this paper, we present an updated version of the NELA-GT-2019 dataset, entitled NELA-GT-2020. NELA-GT-2020 contains nearly 1.8M news articles from 519 sources collected between January 1st, 2020 and December 31st, 2020. Just as with NELA-GT-2018 and NELA-GT-2019, these sources come from a wide range of mainstream news sources and alternative news sources. Included in the dataset are source-leve…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: 6 pages, 4 figures. arXiv admin note: text overlap with arXiv:2003.08444

  5. arXiv:2102.00290  [pdf, other]

    cs.CL cs.LG

    Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks

    Authors: Maurício Gruppi, Sibel Adalı, Pin-Yu Chen

    Abstract: The use of language is subject to variation over time as well as across social groups and knowledge domains, leading to differences even in the monolingual scenario. Such variation in word usage is often called lexical semantic change (LSC). The goal of LSC is to characterize and quantify language variations with respect to word meaning, to measure how distinct two language sources are (that is, p…

    Submitted 30 January, 2021; originally announced February 2021.

    Comments: Published at AAAI-2021

  6. arXiv:2101.10973  [pdf, other]

    cs.SI cs.CY cs.LG

    Tell Me Who Your Friends Are: Using Content Sharing Behavior for News Source Veracity Detection

    Authors: Maurício Gruppi, Benjamin D. Horne, Sibel Adalı

    Abstract: Stopping the malicious spread and production of false and misleading news has become a top priority for researchers. Due to this prevalence, many automated methods for detecting low quality information have been introduced. The majority of these methods have used article-level features, such as their writing style, to detect veracity. While writing style models have been shown to work well in lab-…

    Submitted 15 January, 2021; originally announced January 2021.

    Comments: Preprint Version

  7. arXiv:2012.01603  [pdf, other]

    cs.CL cs.AI

    SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical Semantic Change

    Authors: Maurício Gruppi, Sibel Adali, Pin-Yu Chen

    Abstract: This paper describes SChME (Semantic Change Detection with Model Ensemble), a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change. SChME uses a model ensemble combining signals of distributional models (word embeddings) and word frequency models where each model casts a vote indicating the probability that a word suffered semantic change according to that feature. M…

    Submitted 2 December, 2020; originally announced December 2020.

  8. arXiv:2006.01211  [pdf, other]

    cs.CL cs.IR cs.LG stat.ML

    Do All Good Actors Look The Same? Exploring News Veracity Detection Across The U.S. and The U.K

    Authors: Benjamin D. Horne, Maurício Gruppi, Sibel Adalı

    Abstract: A major concern with text-based news veracity detection methods is that they may not generalize across countries and cultures. In this short paper, we explicitly test news veracity models across news data from the United States and the United Kingdom, demonstrating there is reason for concern of generalizability. Through a series of testing scenarios, we show that text-based classifiers perform poo…

    Submitted 26 May, 2020; originally announced June 2020.

    Comments: Published in ICWSM 2020 Data Challenge

  9. arXiv:2005.00596  [pdf, other]

    cs.CV cs.LG stat.ML

    Learning from Noisy Labels with Noise Modeling Network

    Authors: Zhuolin Jiang, Jan Silovsky, Man-Hung Siu, William Hartmann, Herbert Gish, Sancar Adali

    Abstract: Multi-label image classification has generated significant interest in recent years and the performance of such systems often suffers from the not so infrequent occurrence of incorrect or missing labels in the training data. In this paper, we extend the state-of-the-art of training classifiers to jointly deal with both forms of errorful data. We accomplish this by modeling noisy and missing labels…

    Submitted 1 May, 2020; originally announced May 2020.

  10. arXiv:2003.08444  [pdf, other]

    cs.CY

    NELA-GT-2019: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles

    Authors: Maurício Gruppi, Benjamin D. Horne, Sibel Adalı

    Abstract: In this paper, we present an updated version of the NELA-GT-2018 dataset (Nørregaard, Horne, and Adalı 2019), entitled NELA-GT-2019. NELA-GT-2019 contains 1.12M news articles from 260 sources collected between January 1st 2019 and December 31st 2019. Just as with NELA-GT-2018, these sources come from a wide range of mainstream news sources and alternative news sources. Included with the dataset ar…

    Submitted 26 March, 2020; v1 submitted 18 March, 2020; originally announced March 2020.

    Comments: Updated dataset for paper NELA-GT-2018: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles, originally published at ICWSM in 2019

  11. arXiv:1911.05825  [pdf, other]

    cs.CY

    Trustworthy Misinformation Mitigation with Soft Information Nudging

    Authors: Benjamin D. Horne, Maurício Gruppi, Sibel Adalı

    Abstract: Research in combating misinformation reports many negative results: facts may not change minds, especially if they come from sources that are not trusted. Individuals can disregard and justify lies told by trusted sources. This problem is made even worse by social recommendation algorithms which help amplify conspiracy theories and information confirming one's own biases due to companies' efforts…

    Submitted 13 November, 2019; originally announced November 2019.

    Comments: Published at IEEE TPS 2019

  12. arXiv:1904.01546  [pdf, other]

    cs.CY

    NELA-GT-2018: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles

    Authors: Jeppe Norregaard, Benjamin D. Horne, Sibel Adali

    Abstract: In this paper, we present a dataset of 713k articles collected between 02/2018-11/2018. These articles are collected directly from 194 news and media outlets including mainstream, hyper-partisan, and conspiracy sources. We incorporate ground truth ratings of the sources from 8 different assessment sites covering multiple dimensions of veracity, including reliability, bias, transparency, adherence…

    Submitted 2 April, 2019; originally announced April 2019.

    Comments: Published at ICWSM 2019

  13. arXiv:1904.01534  [pdf, other]

    cs.CY

    Different Spirals of Sameness: A Study of Content Sharing in Mainstream and Alternative Media

    Authors: Benjamin D. Horne, Jeppe Norregaard, Sibel Adali

    Abstract: In this paper, we analyze content sharing between news sources in the alternative and mainstream media using a dataset of 713K articles and 194 sources. We find that content sharing happens in tightly formed communities, and these communities represent relatively homogeneous portions of the media landscape. Through a mixed-method analysis, we find several primary content sharing behaviors. First, we…

    Submitted 2 April, 2019; originally announced April 2019.

    Comments: Published at ICWSM 2019

  14. arXiv:1904.01531  [pdf, other]

    cs.CY

    Rating Reliability and Bias in News Articles: Does AI Assistance Help Everyone?

    Authors: Benjamin D. Horne, Dorit Nevo, John O'Donovan, Jin-Hee Cho, Sibel Adali

    Abstract: With the spread of false and misleading information in current news, many algorithmic tools have been introduced with the aim of assessing bias and reliability in written content. However, there has been little work exploring how effective these tools are at changing human perceptions of content. To this end, we conduct a study with 654 participants to understand if algorithmic assistance improves…

    Submitted 16 May, 2019; v1 submitted 2 April, 2019; originally announced April 2019.

    Comments: Published at ICWSM 2019

  15. arXiv:1808.09270  [pdf, other]

    cs.IR cs.LG stat.ML

    Models for Predicting Community-Specific Interest in News Articles

    Authors: Benjamin D. Horne, William Dron, Sibel Adali

    Abstract: In this work, we ask two questions: 1. Can we predict the type of community interested in a news article using only features from the article content? and 2. How well do these models generalize over time? To answer these questions, we compute well-studied content-based features on over 60K news articles from 4 communities on reddit.com. We train and test models over three different time periods be…

    Submitted 27 August, 2018; originally announced August 2018.

    Comments: Published at IEEE MILCOM 2018 in Los Angeles, CA, USA

  16. arXiv:1807.06519  [pdf, other]

    cs.SI

    Is Uncertainty Always Bad?: Effect of Topic Competence on Uncertain Opinions

    Authors: Jin-Hee Cho, Sibel Adalı

    Abstract: The proliferation of information disseminated by public/social media has made decision-making highly challenging due to the wide availability of noisy, uncertain, or unverified information. Although the issue of uncertainty in information has been studied for several decades, little work has investigated how noisy (or uncertain) or valuable (or credible) information can be formulated into people's…

    Submitted 17 July, 2018; originally announced July 2018.

    Journal ref: IEEE ICC 2018

  17. arXiv:1806.02875  [pdf, ps, other]

    cs.CL

    An Exploration of Unreliable News Classification in Brazil and The U.S

    Authors: Mauricio Gruppi, Benjamin D. Horne, Sibel Adali

    Abstract: The propagation of unreliable information is on the rise in many places around the world. This expansion is facilitated by the rapid spread of information and anonymity granted by the Internet. The spread of unreliable information is a well-studied issue and it is associated with negative social impacts. In a previous work, we have identified significant differences in the structure of news article…

    Submitted 7 June, 2018; originally announced June 2018.

    Comments: Presented and Peer-Reviewed at NECO 2018

  18. arXiv:1805.05939  [pdf, other]

    cs.CY

    An Exploration of Verbatim Content Republishing by News Producers

    Authors: Benjamin D. Horne, Sibel Adali

    Abstract: In today's news ecosystem, news sources emerge frequently and can vary widely in intent. This intent can range from benign to malicious, with many tactics being used to achieve their goals. One lesser studied tactic is content republishing, which can be used to make specific stories seem more important, create uncertainty around an event, or create a perception of credibility for unreliable news s…

    Submitted 15 May, 2018; originally announced May 2018.

    Comments: Peer-reviewed by NECO 2018 Workshop

  19. arXiv:1803.10124  [pdf, other]

    cs.CY

    Sampling the News Producers: A Large News and Feature Data Set for the Study of the Complex Media Landscape

    Authors: Benjamin D. Horne, William Dron, Sara Khedr, Sibel Adali

    Abstract: The complexity and diversity of today's media landscape provides many challenges for researchers studying news producers. These producers use many different strategies to get their message believed by readers through the writing styles they employ, by repetition across different media sources with or without attribution, as well as other mechanisms that are yet to be studied deeply. To better faci…

    Submitted 16 August, 2018; v1 submitted 27 March, 2018; originally announced March 2018.

    Comments: Published at ICWSM 2018. Dataset: https://github.com/BenjaminDHorne/NELA2017-Dataset-v1 Feature Code: https://github.com/BenjaminDHorne/Language-Features-for-News

  20. arXiv:1706.03364  [pdf, ps, other]

    math.AG

    Singularities of Restriction Varieties in $OG(k, n)$

    Authors: Seçkin Adalı

    Abstract: Restriction varieties in the orthogonal Grassmannian are subvarieties of $OG(k, n)$ defined by rank conditions given by a flag that is not necessarily isotropic with respect to the relevant symmetric bilinear form. In particular, Schubert varieties of Type B and D are examples of restriction varieties. In this paper, we introduce a resolution of singularities for restriction varieties in…

    Submitted 28 July, 2017; v1 submitted 11 June, 2017; originally announced June 2017.

    Comments: Revised according to referee's comments

    MSC Class: 14M15; 14E15; 32M10

  21. arXiv:1705.06709  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Learning Spatiotemporal Features for Infrared Action Recognition with 3D Convolutional Neural Networks

    Authors: Zhuolin Jiang, Viktor Rozgic, Sancar Adali

    Abstract: Infrared (IR) imaging has the potential to enable more robust action recognition systems compared to visible spectrum cameras due to lower sensitivity to lighting conditions and appearance variability. While the action recognition task on videos collected from visible spectrum imaging has received much attention, action recognition in IR videos is significantly less explored. Our objective is to e…

    Submitted 18 May, 2017; originally announced May 2017.

  22. arXiv:1705.02673  [pdf, other]

    cs.SI

    Identifying the social signals that drive online discussions: A case study of Reddit communities

    Authors: Benjamin D. Horne, Sibel Adali, Sujoy Sikdar

    Abstract: Increasingly people form opinions based on information they consume on online social media. As a result, it is crucial to understand what type of content attracts people's attention on social media and drives discussions. In this paper we focus on online discussions. Can we predict which comments and what content gets the highest attention in an online discussion? How does this content differ from…

    Submitted 7 May, 2017; originally announced May 2017.

    Comments: © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  23. arXiv:1703.10570  [pdf, other]

    cs.SI

    The Impact of Crowds on News Engagement: A Reddit Case Study

    Authors: Benjamin D. Horne, Sibel Adali

    Abstract: Today, users are reading the news through social platforms. These platforms are built to facilitate crowd engagement, but not necessarily disseminate useful news to inform the masses. Hence, the news that is highly engaged with may not be the news that best informs. While predicting news popularity has been well studied, it has not been studied in the context of crowd manipulations. In this paper,…

    Submitted 3 November, 2017; v1 submitted 30 March, 2017; originally announced March 2017.

    Comments: Published at The 2nd International Workshop on News and Public Opinion at ICWSM 2017

  24. arXiv:1703.09398  [pdf, other]

    cs.SI cs.CL

    This Just In: Fake News Packs a Lot in Title, Uses Simpler, Repetitive Content in Text Body, More Similar to Satire than Real News

    Authors: Benjamin D. Horne, Sibel Adali

    Abstract: The problem of fake news has gained a lot of attention as it is claimed to have had a significant impact on 2016 US Presidential Elections. Fake news is not a new problem and its spread in social networks is well-studied. Often an underlying assumption in fake news discussion is that it is written to look like real news, fooling the reader who does not check for reliability of the sources or the a…

    Submitted 28 March, 2017; originally announced March 2017.

    Comments: Published at The 2nd International Workshop on News and Public Opinion at ICWSM

  25. arXiv:1611.07636  [pdf, other]

    cs.GT

    Mechanism Design for Multi-Type Housing Markets

    Authors: Sibel Adali, Sujoy Sikdar, Lirong Xia

    Abstract: We study multi-type housing markets, where there are $p\ge 2$ types of items, each agent is initially endowed one item of each type, and the goal is to design mechanisms without monetary transfer to (re)allocate items to the agents based on their preferences over bundles of items, such that each agent gets one item of each type. In sharp contrast to classical housing markets, previous studies in m…

    Submitted 22 November, 2016; originally announced November 2016.

    Comments: full version of the AAAI-17 paper

  26. arXiv:1401.3813  [pdf, other]

    stat.ML stat.AP stat.ME

    Seeded Graph Matching Via Joint Optimization of Fidelity and Commensurability

    Authors: Heather Patsolic, Sancar Adali, Joshua T. Vogelstein, Youngser Park, Carey E. Priebe, Gongkai Li, Vince Lyzinski

    Abstract: We present a novel approximate graph matching algorithm that incorporates seeded data into the graph matching paradigm. Our Joint Optimization of Fidelity and Commensurability (JOFC) algorithm embeds two graphs into a common Euclidean space where the matching inference task can be performed. Through real and simulated data examples, we demonstrate the versatility of our algorithm in matching graph…

    Submitted 8 December, 2019; v1 submitted 15 January, 2014; originally announced January 2014.

    Comments: 26 pages, 7 figures. Updated content and added application of simultaneous matching for several time-steps for zebrafish connectomes

  27. arXiv:1306.1977  [pdf, other]

    stat.ME

    Fidelity-Commensurability Tradeoff in Joint Embedding of Disparate Dissimilarities

    Authors: Sancar Adali, Carey E. Priebe

    Abstract: In various data settings, it is necessary to compare observations from disparate data sources. We assume the data is in the dissimilarity representation and investigate a joint embedding method that results in a commensurate representation of disparate dissimilarities. We further assume that there are "matched" observations from different conditions which can be considered to be highly similar, fo…

    Submitted 4 January, 2016; v1 submitted 8 June, 2013; originally announced June 2013.

  28. arXiv:1209.0367  [pdf, other]

    stat.ML

    Seeded Graph Matching

    Authors: Donniell E. Fishkind, Sancar Adali, Heather G. Patsolic, Lingyao Meng, Digvijay Singh, Vince Lyzinski, Carey E. Priebe

    Abstract: Given two graphs, the graph matching problem is to align the two vertex sets so as to minimize the number of adjacency disagreements between the two graphs. The seeded graph matching problem is the graph matching problem when we are first given a partial alignment that we are tasked with completing. In this paper, we modify the state-of-the-art approximate graph matching algorithm "FAQ" of Vogelst…

    Submitted 10 April, 2018; v1 submitted 3 September, 2012; originally announced September 2012.

    Comments: 24 pages, 10 figures

  29. arXiv:1112.5510  [pdf, other]

    stat.ME

    Manifold Matching: Joint Optimization of Fidelity and Commensurability

    Authors: Carey E. Priebe, David J. Marchette, Zhiliang Ma, Sancar Adali

    Abstract: Fusion and inference from multiple and massive disparate data sources - the requirement for our most challenging data analysis problems and the goal of our most ambitious statistical pattern recognition methodologies - has many and varied aspects which are currently the target of intense research and development. One aspect of the overall challenge is manifold matching - identifying embeddings of…

    Submitted 22 December, 2011; originally announced December 2011.

    Comments: 22 pages, 12 figures

  30. arXiv:1103.1359  [pdf, ps, other]

    cs.DM cs.SI

    An Analysis of Optimal Link Bombs

    Authors: Sibel Adali, Tina Liu, Malik Magdon-Ismail

    Abstract: We analyze the phenomenon of collusion for the purpose of boosting the pagerank of a node in an interlinked environment. We investigate the optimal attack pattern for a group of nodes (attackers) attempting to improve the ranking of a specific node (the victim). We consider attacks where the attackers can only manipulate their own outgoing links. We show that the optimal attacks in this scenario a…

    Submitted 7 March, 2011; originally announced March 2011.

    Comments: Full Version of a version which appeared in AIRweb 2005