
Showing 1–22 of 22 results for author: Singer, P

Searching in archive cs. Search in all archives.

  1. arXiv:2403.01199  [pdf]

    cs.AI

    The Case for Animal-Friendly AI

    Authors: Sankalpa Ghose, Yip Fai Tse, Kasra Rasaee, Jeff Sebo, Peter Singer

    Abstract: Artificial intelligence is seen as increasingly important, and potentially profoundly so, but the fields of AI ethics and AI engineering have not fully recognized that these technologies, including large language models (LLMs), will have massive impacts on animals. We argue that this impact matters, because animals matter morally. As a first experiment in evaluating animal consideration in LLMs,…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: AAAI 2024 Workshop on Public Sector LLMs: Algorithmic and Sociotechnical Design. 12 pages, 11 figures

  2. arXiv:2401.16818  [pdf, other]

    cs.CL cs.LG

    H2O-Danube-1.8B Technical Report

    Authors: Philipp Singer, Pascal Pfeiffer, Yauhen Babakhin, Maximilian Jeblick, Nischay Dhankhar, Gabor Fodor, Sri Satish Ambati

    Abstract: We present H2O-Danube, a series of small 1.8B language models consisting of H2O-Danube-1.8B, trained on 1T tokens, and the incremental improved H2O-Danube2-1.8B trained on an additional 2T tokens. Our models exhibit highly competitive metrics across a multitude of benchmarks and, as of the time of this writing, H2O-Danube2-1.8B achieves the top ranking on Open LLM Leaderboard for all models below…

    Submitted 15 April, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

  3. arXiv:2310.13012  [pdf, other]

    cs.CL cs.AI

    H2O Open Ecosystem for State-of-the-art Large Language Models

    Authors: Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Chun Ming Lee, Marcos V. Conde

    Abstract: Large Language Models (LLMs) represent a revolution in AI. However, they also pose many significant risks, such as the presence of biased, private, copyrighted or harmful text. For this reason we need open, transparent and safe solutions. We introduce a complete open-source ecosystem for developing and testing LLMs. The goal of this project is to boost open alternatives to closed-source approaches…

    Submitted 23 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Demo - ACL Empirical Methods in Natural Language Processing

  4. arXiv:2306.08161  [pdf, other]

    cs.CL cs.AI cs.HC cs.IR cs.LG

    h2oGPT: Democratizing Large Language Models

    Authors: Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Prithvi Prabhu, Jeff Gambera, Mark Landry, Shivam Bansal, Ryan Chesler, Chun Ming Lee, Marcos V. Conde, Pasha Stetsenko, Olivier Grellier, SriSatish Ambati

    Abstract: Applications built on top of Large Language Models (LLMs) such as GPT-4 represent a revolution in AI due to their human-level capabilities in natural language processing. However, they also pose many significant risks such as the presence of biased, private, or harmful text, and the unauthorized inclusion of copyrighted material. We introduce h2oGPT, a suite of open-source code repositories for…

    Submitted 16 June, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: Work in progress by H2O.ai, Inc

  5. arXiv:2202.10848  [pdf]

    cs.LG cs.AI cs.CY

    Speciesist bias in AI -- How AI applications perpetuate discrimination and unfair outcomes against animals

    Authors: Thilo Hagendorff, Leonie Bossert, Tse Yip Fai, Peter Singer

    Abstract: Massive efforts are made to reduce biases in both data and algorithms in order to render AI applications fair. These efforts are propelled by various high-profile cases where biased algorithmic decision-making caused harm to women, people of color, minorities, etc. However, the AI fairness field still succumbs to a blind spot, namely its insensitivity to discrimination against animals. This paper…

    Submitted 22 February, 2022; originally announced February 2022.

  6. arXiv:2107.07728  [pdf, other]

    cs.SD cs.LG eess.AS

    Recognizing bird species in diverse soundscapes under weak supervision

    Authors: Christof Henkel, Pascal Pfeiffer, Philipp Singer

    Abstract: We present a robust classification approach for avian vocalization in complex and diverse soundscapes, achieving second place in the BirdCLEF2021 challenge. We illustrate how to make full use of pre-trained convolutional neural networks, by using an efficient modeling and training routine supplemented by novel augmentation methods. Thereby, we improve the generalization of weakly labeled crowd-sou…

    Submitted 16 July, 2021; originally announced July 2021.

    Comments: All authors contributed equally, 8 pages, 4 figures, submitted to CEUR-WS

  7. arXiv:2010.01650  [pdf, other]

    cs.CV cs.LG

    Supporting large-scale image recognition with out-of-domain samples

    Authors: Christof Henkel, Philipp Singer

    Abstract: This article presents an efficient end-to-end method to perform instance-level recognition employed to the task of labeling and ranking landmark images. In a first step, we embed images in a high dimensional feature space using convolutional neural networks trained with an additive angular margin loss and classify images using visual similarity. We then efficiently re-rank predictions and filter n…

    Submitted 4 October, 2020; originally announced October 2020.

  8. arXiv:2007.11411  [pdf, other]

    physics.soc-ph cs.LG q-bio.PE

    Backtesting the predictability of COVID-19

    Authors: Dmitry Gordeev, Philipp Singer, Marios Michailidis, Mathias Müller, SriSatish Ambati

    Abstract: The advent of the COVID-19 pandemic has instigated unprecedented changes in many countries around the globe, putting a significant burden on the health sectors, affecting the macro economic conditions, and altering social interactions amongst the population. In response, the academic community has produced multiple forecasting models, approaches and algorithms to best predict the different indicat…

    Submitted 22 July, 2020; originally announced July 2020.

  9. arXiv:1702.05427  [pdf, other]

    cs.SI physics.soc-ph

    Sampling from Social Networks with Attributes

    Authors: Claudia Wagner, Philipp Singer, Fariba Karimi, Jürgen Pfeffer, Markus Strohmaier

    Abstract: Sampling from large networks represents a fundamental challenge for social network research. In this paper, we explore the sensitivity of different sampling techniques (node sampling, edge sampling, random walk sampling, and snowball sampling) on social networks with attributes. We consider the special case of networks (i) where we have one attribute with two values (e.g., male and female in the c…

    Submitted 17 February, 2017; originally announced February 2017.

    Comments: Published at WWW'17

  10. arXiv:1702.05379  [pdf, other]

    cs.SI cs.DL cs.HC

    Why We Read Wikipedia

    Authors: Philipp Singer, Florian Lemmerich, Robert West, Leila Zia, Ellery Wulczyn, Markus Strohmaier, Jure Leskovec

    Abstract: Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a…

    Submitted 16 March, 2017; v1 submitted 17 February, 2017; originally announced February 2017.

    Comments: Published in WWW'17; v2 fixes caption of Table 3

  11. arXiv:1702.00150  [pdf, other]

    physics.soc-ph cs.SI

    Visibility of minorities in social networks

    Authors: Fariba Karimi, Mathieu Génois, Claudia Wagner, Philipp Singer, Markus Strohmaier

    Abstract: Homophily can put minority groups at a disadvantage by restricting their ability to establish links with people from a majority group. This can limit the overall visibility of minorities in the network. Building on a Barabási-Albert model variation with groups and homophily, we show how the visibility of minority groups in social networks is a function of (i) their relative group size and (ii) the…

    Submitted 1 February, 2017; originally announced February 2017.

    Comments: 11 pages, 8 figures, under review

    Journal ref: Scientific Reports 2018

  12. arXiv:1612.07612  [pdf, other]

    cs.SI physics.data-an physics.soc-ph

    MixedTrails: Bayesian hypothesis comparison on heterogeneous sequential data

    Authors: Martin Becker, Florian Lemmerich, Philipp Singer, Markus Strohmaier, Andreas Hotho

    Abstract: Sequential traces of user data are frequently observed online and offline, e.g., as sequences of visited websites or as sequences of locations captured by GPS. However, understanding factors explaining the production of sequence data is a challenging task, especially since the data generation is often not homogeneous. For example, navigation behavior might change in different phases of browsing a…

    Submitted 11 July, 2017; v1 submitted 21 December, 2016; originally announced December 2016.

    Comments: Published in Data Mining and Knowledge Discovery (2017) and presented at ECML PKDD 2017

    ACM Class: H.5.3

    Journal ref: Data Mining and Knowledge Discovery (2017)

  13. arXiv:1611.02508  [pdf, other]

    cs.SI physics.soc-ph

    What Makes a Link Successful on Wikipedia?

    Authors: Dimitar Dimitrov, Philipp Singer, Florian Lemmerich, Markus Strohmaier

    Abstract: While a plethora of hypertext links exist on the Web, only a small amount of them are regularly clicked. Starting from this observation, we set out to study large-scale click data from Wikipedia in order to understand what makes a link successful. We systematically analyze effects of link properties on the popularity of links. By utilizing mixed-effects hurdle models supplemented with descriptive…

    Submitted 20 February, 2017; v1 submitted 8 November, 2016; originally announced November 2016.

  14. Evidence of Online Performance Deterioration in User Sessions on Reddit

    Authors: Philipp Singer, Emilio Ferrara, Farshad Kooti, Markus Strohmaier, Kristina Lerman

    Abstract: This article presents evidence of performance deterioration in online user sessions quantified by studying a massive dataset containing over 55 million comments posted on Reddit in April 2015. After segmenting the sessions (i.e., periods of activity without a prolonged break) depending on their intensity (i.e., how many posts users produced during sessions), we observe a general decrease in the qu…

    Submitted 26 August, 2016; v1 submitted 23 April, 2016; originally announced April 2016.

    Comments: Published in PLoS ONE

    Journal ref: PLoS ONE 11(8): e0161636, 2016

  15. Discovering and Characterizing Mobility Patterns in Urban Spaces: A Study of Manhattan Taxi Data

    Authors: Lisette Espín-Noboa, Florian Lemmerich, Philipp Singer, Markus Strohmaier

    Abstract: Nowadays, human movement in urban spaces can be traced digitally in many cases. It can be observed that movement patterns are not constant, but vary across time and space. In this work, we characterize such spatio-temporal patterns with an innovative combination of two separate approaches that have been utilized for studying human mobility in the past. First, by using non-negative tensor factorizat…

    Submitted 9 February, 2016; v1 submitted 20 January, 2016; originally announced January 2016.

    Comments: Accepted at the Location and the Web (LocWeb) workshop at WWW2016

  16. arXiv:1411.2844  [pdf, other]

    cs.SI physics.data-an physics.soc-ph

    HypTrails: A Bayesian Approach for Comparing Hypotheses About Human Trails on the Web

    Authors: Philipp Singer, Denis Helic, Andreas Hotho, Markus Strohmaier

    Abstract: When users interact with the Web today, they leave sequential digital trails on a massive scale. Examples of such human trails include Web navigation, sequences of online restaurant reviews, or online music play lists. Understanding the factors that drive the production of these trails can be useful for e.g., improving underlying network structures, predicting user clicks or enhancing recommendati…

    Submitted 26 March, 2015; v1 submitted 11 November, 2014; originally announced November 2014.

    Comments: Published in the proceedings of WWW'15

    ACM Class: H.5.3

  17. arXiv:1407.2002  [pdf, ps, other]

    cs.SI cs.AI cs.DL physics.data-an

    Discovering Beaten Paths in Collaborative Ontology-Engineering Projects using Markov Chains

    Authors: Simon Walk, Philipp Singer, Markus Strohmaier, Tania Tudorache, Mark A. Musen, Natalya F. Noy

    Abstract: Biomedical taxonomies, thesauri and ontologies in the form of the International Classification of Diseases (ICD) as a taxonomy or the National Cancer Institute Thesaurus as an OWL-based ontology, play a critical role in acquiring, representing and processing information about human health. With increasing adoption and relevance, biomedical ontologies have also significantly increased in size. For…

    Submitted 29 February, 2016; v1 submitted 8 July, 2014; originally announced July 2014.

    Comments: Published in the Journal of Biomedical Informatics

  18. How to Apply Markov Chains for Modeling Sequential Edit Patterns in Collaborative Ontology-Engineering Projects

    Authors: Simon Walk, Philipp Singer, Markus Strohmaier, Denis Helic, Natalya F. Noy, Mark Musen

    Abstract: With the growing popularity of large-scale collaborative ontology-engineering projects, such as the creation of the 11th revision of the International Classification of Diseases, we need new methods and insights to help project- and community-managers to cope with the constantly growing complexity of such projects. In this paper, we present a novel application of Markov chains to model sequential…

    Submitted 16 February, 2016; v1 submitted 5 March, 2014; originally announced March 2014.

  19. arXiv:1402.1386  [pdf, other]

    cs.SI cs.CY physics.soc-ph

    Evolution of Reddit: From the Front Page of the Internet to a Self-referential Community?

    Authors: Philipp Singer, Fabian Flöck, Clemens Meinhart, Elias Zeitfogel, Markus Strohmaier

    Abstract: In the past few years, Reddit -- a community-driven platform for submitting, commenting and rating links and text posts -- has grown exponentially, from a small community of users into one of the largest online communities on the Web. To the best of our knowledge, this work represents the most comprehensive longitudinal study of Reddit's evolution to date, studying both (i) how user submissions ha…

    Submitted 23 June, 2014; v1 submitted 6 February, 2014; originally announced February 2014.

    Comments: Published in the proceedings of WWW'14 companion

    ACM Class: H.3.5

  20. arXiv:1402.0790  [pdf, other]

    cs.SI physics.soc-ph

    Detecting Memory and Structure in Human Navigation Patterns Using Markov Chain Models of Varying Order

    Authors: Philipp Singer, Denis Helic, Behnam Taraghi, Markus Strohmaier

    Abstract: One of the most frequently used models for understanding human navigation on the Web is the Markov chain model, where Web pages are represented as states and hyperlinks as probabilities of navigating from one page to another. Predominantly, human navigation on the Web has been thought to satisfy the memoryless Markov property stating that the next page a user visits only depends on her current pag…

    Submitted 4 June, 2014; v1 submitted 4 February, 2014; originally announced February 2014.

    Journal ref: PLoS ONE, vol 9(7), 2014

  21. arXiv:1401.0629  [pdf, other]

    cs.IR cs.DL cs.SI

    Of course we share! Testing Assumptions about Social Tagging Systems

    Authors: Stephan Doerfel, Daniel Zoller, Philipp Singer, Thomas Niebler, Andreas Hotho, Markus Strohmaier

    Abstract: Social tagging systems have established themselves as an important part in today's web and have attracted the interest from our research community in a variety of investigations. The overall vision of our community is that simply through interactions with the system, i.e., through tagging and sharing of resources, users would contribute to building useful semantic structures as well as resource in…

    Submitted 28 March, 2014; v1 submitted 3 January, 2014; originally announced January 2014.

    ACM Class: H.3.4

  22. arXiv:1311.1162  [pdf, other]

    cs.CY cs.IR cs.SI physics.soc-ph

    Semantic Stability in Social Tagging Streams

    Authors: Claudia Wagner, Philipp Singer, Markus Strohmaier, Bernardo A. Huberman

    Abstract: One potential disadvantage of social tagging systems is that due to the lack of a centralized vocabulary, a crowd of users may never manage to reach a consensus on the description of resources (e.g., books, users or songs) on the Web. Yet, previous research has provided interesting evidence that the tag distributions of resources may become semantically stable over time as more and more users tag…

    Submitted 5 November, 2013; originally announced November 2013.