Showing 1–8 of 8 results for author: Satpute, A

Searching in archive cs.
  1. arXiv:2406.03858  [pdf, ps, other]

    cs.IR cs.DL

    Reducing the climate impact of data portals: a case study

    Authors: Noah Gießing, Madhurima Deb, Ankit Satpute, Moritz Schubotz, Olaf Teschke

    Abstract: The carbon footprint share of the information and communication technology (ICT) sector has steadily increased in the past decade and is predicted to make up as much as 23% of global emissions in 2030. This shows a pressing need for developers, including the information retrieval community, to make their code more energy-efficient. In this project proposal, we discuss techniques to reduce the en…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 4 pages

  2. arXiv:2404.00344  [pdf, other]

    cs.CL cs.AI cs.IR

    Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange

    Authors: Ankit Satpute, Noah Giessing, Andre Greiner-Petter, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, Bela Gipp

    Abstract: Large Language Models (LLMs) have demonstrated exceptional capabilities in various natural language tasks, often achieving performances that surpass those of humans. Despite these advancements, the domain of mathematics presents a distinctive challenge, primarily due to its specialized structure and the precision it demands. In this study, we adopted a two-step approach for investigating the profi…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted for publication at the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), July 14–18, 2024, Washington, D.C., USA

  3. Taxonomy of Mathematical Plagiarism

    Authors: Ankit Satpute, Andre Greiner-Petter, Noah Gießing, Isabel Beckenbach, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, Bela Gipp

    Abstract: Plagiarism is a pressing concern, even more so with the availability of large language models. Existing plagiarism detection systems reliably find copied and moderately reworded text but fail for idea plagiarism, especially in mathematical science, which heavily uses formal mathematical notation. We make two contributions. First, we establish a taxonomy of mathematical content reuse by annotating…

    Submitted 31 May, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 46th European Conference on Information Retrieval (ECIR)

  4. arXiv:2305.13193  [pdf, other]

    cs.IR

    TEIMMA: The First Content Reuse Annotator for Text, Images, and Math

    Authors: Ankit Satpute, André Greiner-Petter, Moritz Schubotz, Norman Meuschke, Akiko Aizawa, Olaf Teschke, Bela Gipp

    Abstract: This demo paper presents the first tool to annotate the reuse of text, images, and mathematical formulae in a document pair -- TEIMMA. Annotating content reuse is particularly useful to develop plagiarism detection algorithms. Real-world content reuse is often obfuscated, which makes it challenging to identify such cases. TEIMMA allows entering the obfuscation type to enable novel classifications…

    Submitted 13 June, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

  5. Caching and Reproducibility: Making Data Science experiments faster and FAIRer

    Authors: Moritz Schubotz, Ankit Satpute, Andre Greiner-Petter, Akiko Aizawa, Bela Gipp

    Abstract: Small to medium-scale data science experiments often rely on research software developed ad-hoc by individual scientists or small teams. Often there is no time to make the research software fast, reusable, and open access. The consequence is twofold. First, subsequent researchers must spend significant work hours building upon the proposed hypotheses or experimental framework. In the worst case, o…

    Submitted 9 November, 2022; v1 submitted 8 November, 2022; originally announced November 2022.

    Comments: 8 pages, 1 table

    Journal ref: Frontiers in Research Metrics and Analytics, volume 7, 2022

  6. arXiv:2110.12392  [pdf, other]

    q-bio.NC cs.LG

    Variation is the Norm: Brain State Dynamics Evoked By Emotional Video Clips

    Authors: Ashutosh Singh, Christiana Westlin, Hedwig Eisenbarth, Elizabeth A. Reynolds Losin, Jessica R. Andrews-Hanna, Tor D. Wager, Ajay B. Satpute, Lisa Feldman Barrett, Dana H. Brooks, Deniz Erdogmus

    Abstract: For the last several decades, emotion research has attempted to identify a "biomarker" or consistent pattern of brain activity to characterize a single category of emotion (e.g., fear) that will remain consistent across all instances of that category, regardless of individual and context. In this study, we investigated variation rather than consistency during emotional experiences while people wat…

    Submitted 24 October, 2021; originally announced October 2021.

  7. arXiv:2003.09779  [pdf, other]

    cs.LG stat.ML

    Deep Markov Spatio-Temporal Factorization

    Authors: Amirreza Farnoosh, Behnaz Rezaei, Eli Zachary Sennesh, Zulqarnain Khan, Jennifer Dy, Ajay Satpute, J Benjamin Hutchinson, Jan-Willem van de Meent, Sarah Ostadabbas

    Abstract: We introduce deep Markov spatio-temporal factorization (DMSTF), a generative model for dynamical analysis of spatio-temporal data. Like other factor analysis methods, DMSTF approximates high dimensional data by a product between time dependent weights and spatially dependent factors. These weights and factors are in turn represented in terms of lower dimensional latents inferred using stochastic v…

    Submitted 18 August, 2020; v1 submitted 21 March, 2020; originally announced March 2020.

  8. arXiv:1906.08901  [pdf, other]

    cs.LG eess.IV stat.ML

    Neural Topographic Factor Analysis for fMRI Data

    Authors: Eli Sennesh, Zulqarnain Khan, Yiyu Wang, Jennifer Dy, Ajay B. Satpute, J. Benjamin Hutchinson, Jan-Willem van de Meent

    Abstract: Neuroimaging studies produce gigabytes of spatio-temporal data for a small number of participants and stimuli. Rarely do researchers attempt to model and examine how individual participants vary from each other -- a question that should be addressable even in small samples given the right statistical tools. We propose Neural Topographic Factor Analysis (NTFA), a probabilistic factor analysis model…

    Submitted 20 November, 2020; v1 submitted 20 June, 2019; originally announced June 2019.

    Comments: 15 pages, 9 figures, associated source code available at https://github.com/neu-spiral/HTFATorch

    Journal ref: Advances in Neural Information Processing Systems 34 (2020)