Showing 1–5 of 5 results for author: Schoenegger, P

  1. arXiv:2406.08170  [pdf]

    cs.CY

    Can AI Understand Human Personality? -- Comparing Human Experts and AI Systems at Predicting Personality Correlations

    Authors: Philipp Schoenegger, Spencer Greenberg, Alexander Grishin, Joshua Lewis, Lucius Caviola

    Abstract: We test the abilities of specialised deep neural networks like PersonalityMap as well as general LLMs like GPT-4o and Claude 3 Opus in understanding human personality. Specifically, we compare their ability to predict correlations between personality items to the abilities of lay people and academic experts. We find that when compared with individual humans, all AI models make better predictions t…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 45 pages, 6 figures

    MSC Class: K.4.0; J.4

  2. arXiv:2402.19379  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Rival Human Crowd Accuracy

    Authors: Philipp Schoenegger, Indre Tuminauskaite, Peter S. Park, Rafael Valdece Sousa Bastos, Philip E. Tetlock

    Abstract: Human forecasting accuracy in practice relies on the 'wisdom of the crowd' effect, in which predictions about future events are significantly improved by aggregating across a crowd of individual forecasters. Past work on the forecasting ability of large language models (LLMs) suggests that frontier LLMs, as individual forecasters, underperform compared to the gold standard of a human-crowd forecas…

    Submitted 17 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: 26 pages; 18 visualizations (nine figures, nine tables)

  3. arXiv:2402.07862  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy

    Authors: Philipp Schoenegger, Peter S. Park, Ezra Karger, Philip E. Tetlock

    Abstract: Large language models (LLMs) show impressive capabilities, matching and sometimes exceeding human performance in many domains. This study explores the potential of LLMs to augment judgement in forecasting tasks. We evaluated the impact on forecasting accuracy of two GPT-4-Turbo assistants: one designed to provide high-quality advice ('superforecasting'), and the other designed to be overconfident…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 18 pages (15 pages of main text, three pages of appendix). 10 visualizations in the main text (four figures, six tables), three additional figures in the appendix

  4. arXiv:2310.13014  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    Large Language Model Prediction Capabilities: Evidence from a Real-World Forecasting Tournament

    Authors: Philipp Schoenegger, Peter S. Park

    Abstract: Accurately predicting the future would be an important milestone in the capabilities of artificial intelligence. However, research on the ability of large language models to provide probabilistic predictions about future events remains nascent. To empirically test this ability, we enrolled OpenAI's state-of-the-art large language model, GPT-4, in a three-month forecasting tournament hosted on the…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: 13 pages, six visualizations (four figures, two tables)

  5. arXiv:2302.07267  [pdf]

    cs.HC cs.AI cs.CL

    Diminished Diversity-of-Thought in a Standard Large Language Model

    Authors: Peter S. Park, Philipp Schoenegger, Chongyang Zhu

    Abstract: We test whether Large Language Models (LLMs) can be used to simulate human participants in social-science studies. To do this, we run replications of 14 studies from the Many Labs 2 replication project with OpenAI's text-davinci-003 model, colloquially known as GPT3.5. Based on our pre-registered analyses, we find that among the eight studies we could analyse, our GPT sample replicated 37.5% of th…

    Submitted 13 September, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: 67 pages (42-page main text, 25-page SI); 12 visualizations (four tables and three figures in the main text, five figures in the SI); additional exploratory follow-up study varied the demographic details preceding the prompt; preregistered OSF database is available at https://osf.io/dzp8t/

    MSC Class: 68T50; ACM Class: I.2.7