Showing 1–50 of 54 results for author: Althoff, T

Searching in archive cs.
  1. arXiv:2406.16964

    cs.LG cs.AI

    Are Language Models Actually Useful for Time Series Forecasting?

    Authors: Mingtian Tan, Mike A. Merrill, Vinayak Gupta, Tim Althoff, Thomas Hartvigsen

    Abstract: Large language models (LLMs) are being applied to time series tasks, particularly time series forecasting. However, are language models actually useful for time series? After a series of ablation studies on three recent and popular LLM-based time series forecasting methods, we find that removing the LLM component or replacing it with a basic attention layer does not degrade the forecasting results…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 25 pages, 8 figures and 20 tables

  2. arXiv:2406.12830

    cs.CL

    What Are the Odds? Language Models Are Capable of Probabilistic Reasoning

    Authors: Akshay Paruchuri, Jake Garrison, Shun Liao, John Hernandez, Jacob Sunshine, Tim Althoff, Xin Liu, Daniel McDuff

    Abstract: Language models (LM) are capable of remarkably complex linguistic tasks; however, numerical reasoning is an area in which they frequently struggle. An important but rarely evaluated form of reasoning is understanding probability distributions. In this paper, we focus on evaluating the probabilistic reasoning capabilities of LMs using idealized and real-world statistical distributions. We perform a…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 21 pages, 9 figures, 2 tables

  3. arXiv:2406.06474

    cs.AI cs.CL

    Towards a Personal Health Large Language Model

    Authors: Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, et al. (9 additional authors not shown)

    Abstract: In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 72 pages

  4. arXiv:2406.06464

    cs.AI cs.CL

    Transforming Wearable Data into Health Insights using Large Language Model Agents

    Authors: Mike A. Merrill, Akshay Paruchuri, Naghmeh Rezaei, Geza Kovacs, Javier Perez, Yun Liu, Erik Schenck, Nova Hammerquist, Jake Sunshine, Shyam Tailor, Kumar Ayush, Hao-Wei Su, Qian He, Cory Y. McLean, Mark Malhotra, Shwetak Patel, Jiening Zhan, Tim Althoff, Daniel McDuff, Xin Liu

    Abstract: Despite the proliferation of wearable health trackers and the importance of sleep and exercise to health, deriving actionable personalized insights from wearable data remains a challenge because doing so requires non-trivial open-ended analysis of these data. The recent rise of large language model (LLM) agents, which can use tools to reason about and interact with the world, presents a promising…

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 38 pages

  5. arXiv:2406.04557

    cs.CY

    Countrywide natural experiment reveals impact of built environment on physical activity

    Authors: Tim Althoff, Boris Ivanovic, Jennifer L. Hicks, Scott L. Delp, Abby C. King, Jure Leskovec

    Abstract: While physical activity is critical to human health, most people do not meet recommended guidelines. More walkable built environments have the potential to increase activity across the population. However, previous studies on the built environment and physical activity have led to mixed findings, possibly due to methodological limitations such as small cohorts, few or single locations, over-relian…

    Submitted 6 June, 2024; originally announced June 2024.

  6. arXiv:2404.11757

    cs.CL

    Language Models Still Struggle to Zero-shot Reason about Time Series

    Authors: Mike A. Merrill, Mingtian Tan, Vinayak Gupta, Tom Hartvigsen, Tim Althoff

    Abstract: Time series are critical for decision-making in fields like finance and healthcare. Their importance has driven a recent influx of works passing time series into language models, leading to non-trivial forecasting on some datasets. But it remains unknown whether non-trivial forecasting implies that language models can reason about time series. To address this gap, we generate a first-of-its-kind e…

    Submitted 17 April, 2024; originally announced April 2024.

  7. arXiv:2403.11169

    cs.CL cs.AI

    Correcting misinformation on social media with a large language model

    Authors: Xinyi Zhou, Ashish Sharma, Amy X. Zhang, Tim Althoff

    Abstract: Real-world misinformation can be partially correct and even factual but misleading. It undermines public trust in science and democracy, particularly on social media, where it can spread rapidly. High-quality and timely correction of misinformation that identifies and explains its (in)accuracies has been shown to effectively reduce false beliefs. Despite the wide acceptance of manual correction, i…

    Submitted 30 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: 53 pages

  8. arXiv:2403.09810

    cs.HC cs.AI cs.LG

    LabelAId: Just-in-time AI Interventions for Improving Human Labeling Quality and Domain Knowledge in Crowdsourcing Systems

    Authors: Chu Li, Zhihan Zhang, Michael Saugstad, Esteban Safranchik, Minchu Kulkarni, Xiaoyu Huang, Shwetak Patel, Vikram Iyer, Tim Althoff, Jon E. Froehlich

    Abstract: Crowdsourcing platforms have transformed distributed problem-solving, yet quality control remains a persistent challenge. Traditional quality control measures, such as prescreening workers and refining instructions, often focus solely on optimizing economic output. This paper explores just-in-time AI interventions to enhance both labeling quality and domain-specific knowledge among crowdworkers. W…

    Submitted 14 March, 2024; originally announced March 2024.

  9. arXiv:2402.12556

    cs.HC cs.CL

    IMBUE: Improving Interpersonal Effectiveness through Simulation and Just-in-time Feedback with Human-Language Model Interaction

    Authors: Inna Wanyin Lin, Ashish Sharma, Christopher Michael Rytting, Adam S. Miner, Jina Suh, Tim Althoff

    Abstract: Navigating certain communication situations can be challenging due to individuals' lack of skills and the interference of strong emotions. However, effective learning opportunities are rarely accessible. In this work, we conduct a human-centered study that uses language models to simulate bespoke communication training and provide just-in-time feedback to support the practice and learning of inter…

    Submitted 19 February, 2024; originally announced February 2024.

  10. arXiv:2402.05070

    cs.AI cs.CL cs.IR

    A Roadmap to Pluralistic Alignment

    Authors: Taylor Sorensen, Jared Moore, Jillian Fisher, Mitchell Gordon, Niloofar Mireshghallah, Christopher Michael Rytting, Andre Ye, Liwei Jiang, Ximing Lu, Nouha Dziri, Tim Althoff, Yejin Choi

    Abstract: With increased power and prevalence of AI systems, it is ever more critical that AI systems are designed to serve all, i.e., people with diverse values and perspectives. However, aligning models to serve pluralistic human values remains an open research question. In this piece, we propose a roadmap to pluralistic alignment, specifically using language models as a test bed. We identify and formaliz…

    Submitted 7 February, 2024; originally announced February 2024.

  11. arXiv:2401.16610

    cs.SI cs.CY cs.HC

    Perceptions of Moderators as a Large-Scale Measure of Online Community Governance

    Authors: Galen Weld, Leon Leibmann, Amy X. Zhang, Tim Althoff

    Abstract: Millions of online communities are governed by volunteer moderators, who shape their communities by setting and enforcing rules, recruiting additional moderators, and participating in the community themselves. These moderators must regularly make decisions about how to govern, yet it is challenging to determine what governance strategies are most successful, as measuring the `success' of governanc…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 16 pages, 12 figures

  12. arXiv:2401.00820

    cs.CL cs.HC

    A Computational Framework for Behavioral Assessment of LLM Therapists

    Authors: Yu Ying Chiu, Ashish Sharma, Inna Wanyin Lin, Tim Althoff

    Abstract: The emergence of ChatGPT and other large language models (LLMs) has greatly increased interest in utilizing LLMs as therapists to support individuals struggling with mental health challenges. However, due to the lack of systematic studies, our understanding of how LLM therapists behave, i.e., ways in which they respond to clients, is significantly limited. Understanding their behavior across a wid…

    Submitted 1 January, 2024; originally announced January 2024.

  13. arXiv:2310.15461

    cs.HC cs.CL

    Facilitating Self-Guided Mental Health Interventions Through Human-Language Model Interaction: A Case Study of Cognitive Restructuring

    Authors: Ashish Sharma, Kevin Rushton, Inna Wanyin Lin, Theresa Nguyen, Tim Althoff

    Abstract: Self-guided mental health interventions, such as "do-it-yourself" tools to learn and practice coping strategies, show great promise to improve access to mental health care. However, these interventions are often cognitively demanding and emotionally triggering, creating accessibility barriers that limit their wide-scale implementation and adoption. In this paper, we study how human-language model…

    Submitted 10 April, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: CHI 2024 Camera Ready

  14. arXiv:2309.10947

    cs.HC

    How Do Analysts Understand and Verify AI-Assisted Data Analyses?

    Authors: Ken Gu, Ruoxi Shang, Tim Althoff, Chenglong Wang, Steven M. Drucker

    Abstract: Data analysis is challenging as it requires synthesizing domain knowledge, statistical expertise, and programming skills. Assistants powered by large language models (LLMs), such as ChatGPT, can assist analysts by translating natural language instructions into code. However, AI-assistant responses and analysis code can be misaligned with the analyst's intent or be seemingly correct but lead to inc…

    Submitted 4 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: Accepted to CHI 2024

  15. arXiv:2309.10108

    cs.HC

    How Do Data Analysts Respond to AI Assistance? A Wizard-of-Oz Study

    Authors: Ken Gu, Madeleine Grunde-McLaughlin, Andrew M. McNutt, Jeffrey Heer, Tim Althoff

    Abstract: Data analysis is challenging as analysts must navigate nuanced decisions that may yield divergent conclusions. AI assistants have the potential to support analysts in planning their analyses, enabling more robust decision making. Though AI-based assistants that target code execution (e.g., Github Copilot) have received significant attention, limited research addresses assistance for both analysis…

    Submitted 4 March, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: Accepted to CHI 2024

  16. arXiv:2305.08323

    cs.HC

    Approximation and Progressive Display of Multiverse Analyses

    Authors: Yang Liu, Tim Althoff, Jeffrey Heer

    Abstract: A multiverse analysis evaluates all combinations of "reasonable" analytic decisions to promote robustness and transparency, but can lead to a combinatorial explosion of analyses to compute. Long delays before assessing results prevent users from diagnosing errors and iterating early. We contribute (1) approximation algorithms for estimating multiverse sensitivity and (2) monitoring visualizations…

    Submitted 14 May, 2023; originally announced May 2023.

  17. arXiv:2305.02466

    cs.CL cs.HC cs.SI

    Cognitive Reframing of Negative Thoughts through Human-Language Model Interaction

    Authors: Ashish Sharma, Kevin Rushton, Inna Wanyin Lin, David Wadden, Khendra G. Lucas, Adam S. Miner, Theresa Nguyen, Tim Althoff

    Abstract: A proven therapeutic technique to overcome negative thoughts is to replace them with a more hopeful "reframed thought." Although therapy can help people practice and learn this Cognitive Reframing of Negative Thoughts, clinician shortages and mental health stigma commonly limit people's access to therapy. In this paper, we conduct a human-centered study of how language models may assist people in…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: Accepted for publication at ACL 2023

  18. arXiv:2303.14177

    cs.CL cs.AI

    Scaling Expert Language Models with Unsupervised Domain Discovery

    Authors: Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, Luke Zettlemoyer

    Abstract: Large language models are typically trained densely: all parameters are updated with respect to all inputs. This requires synchronization of billions of parameters across thousands of GPUs. We introduce a simple but effective method to asynchronously train large, sparse language models on arbitrary text corpora. Our method clusters a corpus into sets of related documents, trains a separate expert…

    Submitted 24 March, 2023; originally announced March 2023.

  19. arXiv:2211.02733

    cs.LG cs.AI cs.HC

    GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization

    Authors: Xuhai Xu, Han Zhang, Yasaman Sefidgar, Yiyi Ren, Xin Liu, Woosuk Seo, Jennifer Brown, Kevin Kuehn, Mike Merrill, Paula Nurius, Shwetak Patel, Tim Althoff, Margaret E. Morris, Eve Riskin, Jennifer Mankoff, Anind K. Dey

    Abstract: Recent research has demonstrated the capability of behavior signals captured by smartphones and wearables for longitudinal behavior modeling. However, there is a lack of a comprehensive public dataset that serves as an open testbed for fair comparison among algorithms. Moreover, prior studies mainly evaluate algorithms using data from a single population within a short period, without measuring th…

    Submitted 4 March, 2023; v1 submitted 4 November, 2022; originally announced November 2022.

    Comments: Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track

    MSC Class: 68T09 ACM Class: I.2.1; E.m

  20. arXiv:2210.15144

    cs.CL cs.CY

    Gendered Mental Health Stigma in Masked Language Models

    Authors: Inna Wanyin Lin, Lucille Njoo, Anjalie Field, Ashish Sharma, Katharina Reinecke, Tim Althoff, Yulia Tsvetkov

    Abstract: Mental health stigma prevents many individuals from receiving the appropriate care, and social psychology studies have shown that mental health tends to be overlooked in men. In this work, we investigate gendered mental health stigma in masked language models. In doing so, we operationalize mental health stigma by developing a framework grounded in psychology research: we use clinical psychology l…

    Submitted 11 April, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  21. arXiv:2210.03804

    cs.HC cs.SE

    Understanding and Supporting Debugging Workflows in Multiverse Analysis

    Authors: Ken Gu, Eunice Jun, Tim Althoff

    Abstract: Multiverse analysis, a paradigm for statistical analysis that considers all combinations of reasonable analysis choices in parallel, promises to improve transparency and reproducibility. Although recent tools help analysts specify multiverse analyses, they remain difficult to use in practice. In this work, we identify debugging as a key barrier due to the latency from running analyses to detecting…

    Submitted 4 June, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: CHI 2023

    Journal ref: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23), April 23-28, 2023, Hamburg, Germany. ACM, New York, NY, USA

  22. arXiv:2208.03306

    cs.CL

    Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models

    Authors: Margaret Li, Suchin Gururangan, Tim Dettmers, Mike Lewis, Tim Althoff, Noah A. Smith, Luke Zettlemoyer

    Abstract: We present Branch-Train-Merge (BTM), a communication-efficient algorithm for embarrassingly parallel training of large language models (LLMs). We show it is possible to independently train subparts of a new class of LLMs on different subsets of the data, eliminating the massive multi-node synchronization currently required to train LLMs. BTM learns a set of independent expert LMs (ELMs), each spec…

    Submitted 5 August, 2022; originally announced August 2022.

  23. arXiv:2205.13607

    cs.LG cs.HC

    Self-supervised Pretraining and Transfer Learning Enable Flu and COVID-19 Predictions in Small Mobile Sensing Datasets

    Authors: Mike A. Merrill, Tim Althoff

    Abstract: Detailed mobile sensing data from phones, watches, and fitness trackers offer an unparalleled opportunity to quantify and act upon previously unmeasurable behavioral changes in order to improve individual health and accelerate responses to emerging diseases. Unlike in natural language processing and computer vision, deep representation learning has yet to broadly impact this domain, in which the v…

    Submitted 2 June, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

  24. arXiv:2203.15144

    cs.CL cs.HC cs.SI

    Human-AI Collaboration Enables More Empathic Conversations in Text-based Peer-to-Peer Mental Health Support

    Authors: Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff

    Abstract: Advances in artificial intelligence (AI) are enabling systems that augment and collaborate with humans to perform simple, mechanistic tasks like scheduling meetings and grammar-checking text. However, such Human-AI collaboration poses challenges for more complex, creative tasks, such as carrying out empathic conversations, due to difficulties of AI systems in understanding complex human emotions a…

    Submitted 28 March, 2022; originally announced March 2022.

  25. arXiv:2111.05835

    cs.SI cs.CY cs.HC

    What Makes Online Communities 'Better'? Measuring Values, Consensus, and Conflict across Thousands of Subreddits

    Authors: Galen Weld, Amy X. Zhang, Tim Althoff

    Abstract: Making online social communities 'better' is a challenging undertaking, as online communities are extraordinarily varied in their size, topical focus, and governance. As such, what is valued by one community may not be valued by another. However, community values are challenging to measure as they are rarely explicitly stated. In this work, we measure community values through the first large-scale…

    Submitted 9 May, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

    Comments: 12 pages, 8 figures, 4 tables; to appear at ICWSM 2022

  26. arXiv:2109.05152

    cs.HC cs.CY cs.SI

    Making Online Communities 'Better': A Taxonomy of Community Values on Reddit

    Authors: Galen Weld, Amy X. Zhang, Tim Althoff

    Abstract: Many researchers studying online communities seek to make them better. However, beyond a small set of widely-held values, such as combating misinformation and abuse, determining what 'better' means can be challenging, as community members may disagree, values may be in conflict, and different communities may have differing preferences as a whole. In this work, we present the first study that elici…

    Submitted 20 September, 2023; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: to appear at ICWSM 2024

  27. arXiv:2107.06097

    cs.LG cs.HC

    Transformer-Based Behavioral Representation Learning Enables Transfer Learning for Mobile Sensing in Small Datasets

    Authors: Mike A. Merrill, Tim Althoff

    Abstract: While deep learning has revolutionized research and applications in NLP and computer vision, this has not yet been the case for behavioral modeling and behavioral health applications. This is because the domain's datasets are smaller, have heterogeneous datatypes, and typically exhibit a large degree of missingness. Therefore, off-the-shelf deep learning models require significant, often prohibiti…

    Submitted 9 July, 2021; originally announced July 2021.

  28. arXiv:2104.13490

    cs.CY

    Leveraging Community and Author Context to Explain the Performance and Bias of Text-Based Deception Detection Models

    Authors: Galen Weld, Ellyn Ayton, Tim Althoff, Maria Glenski

    Abstract: Deceptive news posts shared in online communities can be detected with NLP models, and much recent research has focused on the development of such models. In this work, we use characteristics of online communities and authors -- the context of how and where content is posted -- to explain the performance of a neural network deception detection model and identify sub-populations who are disproporti…

    Submitted 27 April, 2021; originally announced April 2021.

  29. arXiv:2102.12523

    cs.HC cs.CY q-bio.NC

    Online Mobile App Usage as an Indicator of Sleep Behavior and Job Performance

    Authors: Chunjong Park, Morelle Arian, Xin Liu, Leon Sasson, Jeffrey Kahn, Shwetak Patel, Alex Mariakakis, Tim Althoff

    Abstract: Sleep is critical to human function, mediating factors like memory, mood, energy, and alertness; therefore, it is commonly conjectured that a good night's sleep is important for job performance. However, both real-world sleep behavior and job performance are hard to measure at scale. In this work, we show that people's everyday interactions with online mobile apps can reveal insights into their jo…

    Submitted 24 February, 2021; originally announced February 2021.

  30. arXiv:2102.08537

    cs.CY

    Political Bias and Factualness in News Sharing across more than 100,000 Online Communities

    Authors: Galen Weld, Maria Glenski, Tim Althoff

    Abstract: As civil discourse increasingly takes place online, misinformation and the polarization of news shared in online communities have become ever more relevant concerns with real world harms across our society. Studying online news sharing at scale is challenging due to the massive volume of content which is shared by millions of users across thousands of communities. Therefore, existing research has…

    Submitted 9 May, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: 12 pages, 7 figures. Published at ICWSM 2021

  31. arXiv:2101.07714

    cs.CL cs.SI

    Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach

    Authors: Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff

    Abstract: Online peer-to-peer support platforms enable conversations between millions of people who seek and provide mental health support. If successful, web-based mental health conversations could improve access to treatment and reduce the global disease burden. Psychologists have repeatedly demonstrated that empathy, the ability to understand and feel the emotions and experiences of others, is a key comp…

    Submitted 16 May, 2021; v1 submitted 19 January, 2021; originally announced January 2021.

    Comments: Published at WWW 2021

  32. arXiv:2009.09961

    cs.CL

    Adjusting for Confounders with Text: Challenges and an Empirical Evaluation Framework for Causal Inference

    Authors: Galen Weld, Peter West, Maria Glenski, David Arbour, Ryan Rossi, Tim Althoff

    Abstract: Causal inference studies using textual social media data can provide actionable insights on human behavior. Making accurate causal inferences with text requires controlling for confounding which could otherwise impart bias. Recently, many different methods for adjusting for confounders have been proposed, and we show that these existing methods disagree with one another on two datasets inspired by…

    Submitted 6 May, 2022; v1 submitted 21 September, 2020; originally announced September 2020.

    Comments: to appear at ICWSM 2022

  33. arXiv:2009.08441

    cs.CL cs.SI

    A Computational Approach to Understanding Empathy Expressed in Text-Based Mental Health Support

    Authors: Ashish Sharma, Adam S. Miner, David C. Atkins, Tim Althoff

    Abstract: Empathy is critical to successful mental health support. Empathy measurement has predominantly occurred in synchronous, face-to-face settings, and may not translate to asynchronous, text-based contexts. Because millions of people use text-based platforms for mental health support, understanding empathy in these contexts is crucial. In this work, we present a computational approach to understanding…

    Submitted 17 September, 2020; originally announced September 2020.

    Comments: Accepted for publication at EMNLP 2020

  34. arXiv:2008.12828

    cs.LG cs.DL stat.ML

    CORAL: COde RepresentAtion Learning with Weakly-Supervised Transformers for Analyzing Data Analysis

    Authors: Ge Zhang, Mike A. Merrill, Yang Liu, Jeffrey Heer, Tim Althoff

    Abstract: Large scale analysis of source code, and in particular scientific source code, holds the promise of better understanding the data science process, identifying analytical best practices, and providing insights to the builders of scientific toolkits. However, large corpora have remained unanalyzed in depth, as descriptive labels are absent and require expert domain knowledge to generate. We propose…

    Submitted 28 August, 2020; originally announced August 2020.

  35. arXiv:2008.07045

    cs.CY cs.IR

    Population-Scale Study of Human Needs During the COVID-19 Pandemic: Analysis and Implications

    Authors: Jina Suh, Eric Horvitz, Ryen W. White, Tim Althoff

    Abstract: Most work to date on mitigating the COVID-19 pandemic is focused urgently on biomedicine and epidemiology. Yet, pandemic-related policy decisions cannot be made on health information alone. Decisions need to consider the broader impacts on people and their needs. Quantifying human needs across the population is challenging as it requires high geo-temporal granularity, high coverage across the popu…

    Submitted 14 January, 2021; v1 submitted 16 August, 2020; originally announced August 2020.

  36. Boba: Authoring and Visualizing Multiverse Analyses

    Authors: Yang Liu, Alex Kale, Tim Althoff, Jeffrey Heer

    Abstract: Multiverse analysis is an approach to data analysis in which all "reasonable" analytic decisions are evaluated in parallel and interpreted collectively, in order to foster robustness and transparency. However, specifying a multiverse is demanding because analysts must manage myriad variants from a cross-product of analytic decisions, and the results require nuanced interpretation. We contribute Bo…

    Submitted 30 July, 2020; v1 submitted 10 July, 2020; originally announced July 2020.

    Comments: submitted to IEEE Transactions on Visualization and Computer Graphics (Proc. VAST)

  37. arXiv:2005.09225

    cs.SI cs.CL cs.CY

    The Effect of Moderation on Online Mental Health Conversations

    Authors: David Wadden, Tal August, Qisheng Li, Tim Althoff

    Abstract: Many people struggling with mental health issues are unable to access adequate care due to high costs and a shortage of mental health professionals, leading to a global mental health crisis. Online mental health communities can help mitigate this crisis by offering a scalable, easily accessible alternative to in-person sessions with therapists or support groups. However, people seeking emotional o…

    Submitted 22 April, 2021; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: Accepted as a full paper at ICWSM 2021. 12 pages, 12 figures, 3 tables

  38. arXiv:2004.04999

    cs.SI

    Engagement Patterns of Peer-to-Peer Interactions on Mental Health Platforms

    Authors: Ashish Sharma, Monojit Choudhury, Tim Althoff, Amit Sharma

    Abstract: Mental illness is a global health problem, but access to mental healthcare resources remain poor worldwide. Online peer-to-peer support platforms attempt to alleviate this fundamental gap by enabling those who struggle with mental illness to provide and receive social support from their peers. However, successful social support requires users to engage with each other and failures may have serious…

    Submitted 10 April, 2020; originally announced April 2020.

    Comments: Accepted to ICWSM 2020

  39. Paths Explored, Paths Omitted, Paths Obscured: Decision Points & Selective Reporting in End-to-End Data Analysis

    Authors: Yang Liu, Tim Althoff, Jeffrey Heer

    Abstract: Drawing reliable inferences from data involves many, sometimes arbitrary, decisions across phases of data collection, wrangling, and modeling. As different choices can lead to diverging conclusions, understanding how researchers make analytic decisions is important for supporting robust and replicable analysis. In this study, we pore over nine published research studies and conduct semi-structured…

    Submitted 8 January, 2020; v1 submitted 29 October, 2019; originally announced October 2019.

  40. Goal-setting And Achievement In Activity Tracking Apps: A Case Study Of MyFitnessPal

    Authors: Mitchell L. Gordon, Tim Althoff, Jure Leskovec

    Abstract: Activity tracking apps often make use of goals as one of their core motivational tools. There are two critical components to this tool: setting a goal, and subsequently achieving that goal. Despite its crucial role in how a number of prominent self-tracking apps function, there has been relatively little investigation of the goal-setting and achievement aspects of self-tracking apps. Here we exp…

    Submitted 4 April, 2019; originally announced April 2019.

    Journal ref: WWW 2019: The Web Conference 2019

  41. arXiv:1812.01696

    cs.LG cs.CY stat.ML

    Learning Individualized Cardiovascular Responses from Large-scale Wearable Sensors Data

    Authors: Haraldur T. Hallgrímsson, Filip Jankovic, Tim Althoff, Luca Foschini

    Abstract: We consider the problem of modeling cardiovascular responses to physical activity and sleep changes captured by wearable sensors in free living conditions. We use an attentional convolutional neural network to learn parsimonious signatures of individual cardiovascular response from data recorded at the minute level resolution over several months on a cohort of 80k people. We demonstrate internal v…

    Submitted 4 December, 2018; originally announced December 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

    Report number: ML4H/2018/238

  42. Modeling Interdependent and Periodic Real-World Action Sequences

    Authors: Takeshi Kurashima, Tim Althoff, Jure Leskovec

    Abstract: Mobile health applications, including those that track activities such as exercise, sleep, and diet, are becoming widely used. Accurately predicting human actions is essential for targeted recommendations that could improve our health and for personalization of these applications. However, making such predictions is extremely difficult due to the complexities of human behavior, which consists of a…

    Submitted 25 February, 2018; originally announced February 2018.

    Comments: Accepted at WWW 2018

  43. I'll Be Back: On the Multiple Lives of Users of a Mobile Activity Tracking Application

    Authors: Zhiyuan Lin, Tim Althoff, Jure Leskovec

    Abstract: Mobile health applications that track activities, such as exercise, sleep, and diet, are becoming widely used. While these activity tracking applications have the potential to improve our health, user engagement and retention are critical factors for their success. However, long-term user engagement patterns in real-world activity tracking applications are not yet well understood. Here we study us…

    Submitted 25 February, 2018; originally announced February 2018.

    Journal ref: WWW 2018: The 2018 Web Conference

  44. arXiv:1712.05748  [pdf, other]

    cs.SI cs.CY q-bio.QM stat.AP

    Modeling Individual Cyclic Variation in Human Behavior

    Authors: Emma Pierson, Tim Althoff, Jure Leskovec

    Abstract: Cycles are fundamental to human health and behavior. However, modeling cycles in time series data is challenging because in most cases the cycles are not labeled or directly observed and need to be inferred from multidimensional measurements taken over time. Here, we present CyHMMs, a cyclic hidden Markov model method for detecting and modeling cycles in a collection of multidimensional heterogene…

    Submitted 20 April, 2018; v1 submitted 15 December, 2017; originally announced December 2017.

    Comments: Accepted at WWW 2018

  45. arXiv:1702.07437  [pdf, other]

    cs.CY cs.HC cs.MM cs.SI

    How Gamification Affects Physical Activity: Large-scale Analysis of Walking Challenges in a Mobile Application

    Authors: Ali Shameli, Tim Althoff, Amin Saberi, Jure Leskovec

    Abstract: Gamification represents an effective way to incentivize user behavior across a number of computing applications. However, despite the fact that physical activity is essential for a healthy lifestyle, surprisingly little is known about how gamification and in particular competitions shape human physical activity. Here we study how competitions affect physical activity. We focus on walking challenge…

    Submitted 23 February, 2017; originally announced February 2017.

    Comments: WWW 2017

  46. arXiv:1701.07083  [pdf, other]

    cs.HC cs.CY cs.IR q-bio.NC

    Harnessing the Web for Population-Scale Physiological Sensing: A Case Study of Sleep and Performance

    Authors: Tim Althoff, Eric Horvitz, Ryen W. White, Jamie Zeitzer

    Abstract: Human cognitive performance is critical to productivity, learning, and accident avoidance. Cognitive performance varies throughout each day and is in part driven by intrinsic, near 24-hour circadian rhythms. Prior research on the impact of sleep and circadian rhythms on cognitive performance has typically been restricted to small-scale laboratory-based studies that do not capture the variability o…

    Submitted 24 February, 2017; v1 submitted 21 January, 2017; originally announced January 2017.

    Comments: Published in Proceedings of WWW 2017

  47. arXiv:1612.03053  [pdf, other]

    cs.SI cs.CY cs.HC

    Online Actions with Offline Impact: How Online Social Networks Influence Online and Offline User Behavior

    Authors: Tim Althoff, Pranav Jindal, Jure Leskovec

    Abstract: Many of today's most widely used computing applications utilize social networking features and allow users to connect, follow each other, share content, and comment on others' posts. However, despite the widespread adoption of these features, there is little understanding of the consequences that social networking has on user retention, engagement, and online as well as offline behavior. Here, we…

    Submitted 16 December, 2016; v1 submitted 9 December, 2016; originally announced December 2016.

    Comments: Published in Proceedings of tenth ACM International Conference on Web Search and Data Mining, WSDM 2017

  48. arXiv:1610.02085  [pdf, other]

    cs.CY cs.HC cs.IR

    Influence of Pokémon Go on Physical Activity: Study and Implications

    Authors: Tim Althoff, Ryen W. White, Eric Horvitz

    Abstract: Physical activity helps people maintain a healthy weight and reduces the risk for several chronic diseases. Although this knowledge is widely recognized, adults and children in many countries around the world do not get recommended amounts of physical activity. While many interventions are found to be ineffective at increasing physical activity or reaching inactive populations, there have been ane…

    Submitted 27 October, 2016; v1 submitted 6 October, 2016; originally announced October 2016.

  49. arXiv:1605.04462  [pdf, other]

    cs.CL cs.CY cs.SI

    Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health

    Authors: Tim Althoff, Kevin Clark, Jure Leskovec

    Abstract: Mental illness is one of the most pressing public health issues of our time. While counseling and psychotherapy can be effective treatments, our knowledge about how to conduct successful counseling conversations has been limited due to lack of large-scale data with labeled outcomes of the conversations. In this paper, we present a large-scale, quantitative study on the discourse of text-message-ba…

    Submitted 14 August, 2016; v1 submitted 14 May, 2016; originally announced May 2016.

    Comments: Preprint of paper accepted to TACL, Transactions of the Association for Computational Linguistics, 2016

  50. arXiv:1503.02729  [pdf, other]

    cs.CY cs.SI

    Donor Retention in Online Crowdfunding Communities: A Case Study of DonorsChoose.org

    Authors: Tim Althoff, Jure Leskovec

    Abstract: Online crowdfunding platforms like DonorsChoose.org and Kickstarter allow specific projects to get funded by targeted contributions from a large number of people. Critical for the success of crowdfunding communities is recruitment and continued engagement of donors. With donor attrition rates above 70%, a significant challenge for online crowdfunding platforms as well as traditional offline non-pr…

    Submitted 9 March, 2015; originally announced March 2015.

    Comments: preprint version of WWW 2015 paper