Showing 1–14 of 14 results for author: Stelmakh, I

  1. arXiv:2403.01015  [pdf, other]

    cs.CY cs.DL

    A Randomized Controlled Trial on Anonymizing Reviewers to Each Other in Peer Review Discussions

    Authors: Charvi Rastogi, Xiangchen Song, Zhijing Jin, Ivan Stelmakh, Hal Daumé III, Kun Zhang, Nihar B. Shah

    Abstract: Peer review often involves reviewers submitting their independent reviews, followed by a discussion among reviewers of each paper. A question among policymakers is whether the reviewers of a paper should be anonymous to each other during the discussion. We shed light on this by conducting a randomized controlled trial at the UAI 2022 conference. We randomly split the reviewers and papers into two…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 18 pages, 4 figures, 3 tables

  2. arXiv:2311.09497  [pdf, other]

    cs.DL cs.GT

    Peer Reviews of Peer Reviews: A Randomized Controlled Trial and Other Experiments

    Authors: Alexander Goldberg, Ivan Stelmakh, Kyunghyun Cho, Alice Oh, Alekh Agarwal, Danielle Belgrave, Nihar B. Shah

    Abstract: Is it possible to reliably evaluate the quality of peer reviews? We study this question driven by two primary motivations -- incentivizing high-quality reviewing using assessed quality of reviews and measuring changes to review quality in experiments. We conduct a large scale study at the NeurIPS 2022 conference, a top-tier conference in machine learning, in which we invited (meta)-reviewers and a…

    Submitted 15 November, 2023; originally announced November 2023.

  3. arXiv:2303.16750  [pdf, other]

    cs.IR cs.DL cs.LG

    A Gold Standard Dataset for the Reviewer Assignment Problem

    Authors: Ivan Stelmakh, John Wieting, Graham Neubig, Nihar B. Shah

    Abstract: Many peer-review venues are either using or looking to use algorithms to assign submissions to reviewers. The crux of such automated approaches is the notion of the "similarity score"--a numerical estimate of the expertise of a reviewer in reviewing a paper--and many algorithms have been proposed to compute these scores. However, these algorithms have not been subjected to a principled comparison,…

    Submitted 23 March, 2023; originally announced March 2023.

  4. arXiv:2211.12966  [pdf, other]

    cs.LG cs.DB cs.DL

    How do Authors' Perceptions of their Papers Compare with Co-authors' Perceptions and Peer-review Decisions?

    Authors: Charvi Rastogi, Ivan Stelmakh, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan, Zhenyu Xue, Hal Daumé III, Emma Pierson, Nihar B. Shah

    Abstract: How do author perceptions match up to the outcomes of the peer-review process and perceptions of others? In a top-tier computer science conference (NeurIPS 2021) with more than 23,000 submitting authors and 9,000 submitted papers, we survey the authors on three questions: (i) their predicted probability of acceptance for each of their papers, (ii) their perceived ranking of their own papers based…

    Submitted 22 November, 2022; originally announced November 2022.

  5. arXiv:2204.06092  [pdf, other]

    cs.CL

    ASQA: Factoid Questions Meet Long-Form Answers

    Authors: Ivan Stelmakh, Yi Luan, Bhuwan Dhingra, Ming-Wei Chang

    Abstract: An abundance of datasets and availability of reliable evaluation metrics have resulted in strong progress in factoid question answering (QA). This progress, however, does not easily transfer to the task of long-form QA, where the goal is to answer questions that require in-depth explanations. The hurdles include (i) a lack of high-quality data, and (ii) the absence of a well-defined notion of the…

    Submitted 22 January, 2023; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: A minor bug in computing the ROUGE score was fixed. The fix **did not** result in any changes in observations and conclusions

  6. arXiv:2203.17259  [pdf, other]

    cs.DL stat.AP

    To ArXiv or not to ArXiv: A Study Quantifying Pros and Cons of Posting Preprints Online

    Authors: Charvi Rastogi, Ivan Stelmakh, Xinwei Shen, Marina Meila, Federico Echenique, Shuchi Chawla, Nihar B. Shah

    Abstract: Double-blind conferences have engaged in debates over whether to allow authors to post their papers online on arXiv or elsewhere during the review process. Independently, some authors of research papers face the dilemma of whether to put their papers on arXiv due to its pros and cons. We conduct a study to substantiate this debate and dilemma via quantitative measurements. Specifically, we conduct…

    Submitted 11 June, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: 17 pages, 3 figures

  7. Cite-seeing and Reviewing: A Study on Citation Bias in Peer Review

    Authors: Ivan Stelmakh, Charvi Rastogi, Ryan Liu, Shuchi Chawla, Federico Echenique, Nihar B. Shah

    Abstract: Citations play an important role in researchers' careers as a key factor in evaluation of scientific impact. Many anecdotes advise authors to exploit this fact and cite prospective reviewers to try obtaining a more positive evaluation for their submission. In this work, we investigate if such a citation bias actually exists: Does the citation of a reviewer's own work in a submission cause them to…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: 19 pages, 3 figures

  8. arXiv:2107.01091  [pdf, other]

    cs.SD cs.HC cs.LG eess.AS

    CrowdSpeech and VoxDIY: Benchmark Datasets for Crowdsourced Audio Transcription

    Authors: Nikita Pavlichenko, Ivan Stelmakh, Dmitry Ustalov

    Abstract: Domain-specific data is the crux of the successful transfer of machine learning systems from benchmarks to real life. In simple problems such as image classification, crowdsourcing has become one of the standard tools for cheap and time-efficient data collection: thanks in large part to advances in research on aggregation methods. However, the applicability of crowdsourcing to more complex tasks (…

    Submitted 20 October, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

  9. arXiv:2012.00714  [pdf, other]

    stat.ML cs.IT cs.LG

    Debiasing Evaluations That are Biased by Evaluations

    Authors: Jingyan Wang, Ivan Stelmakh, Yuting Wei, Nihar B. Shah

    Abstract: It is common to evaluate a set of items by soliciting people to rate them. For example, universities ask students to rate the teaching quality of their instructors, and conference organizers ask authors of submissions to evaluate the quality of the reviews. However, in these applications, students often give a higher rating to a course if they receive higher grades in a course, and authors often g…

    Submitted 1 December, 2020; originally announced December 2020.

  10. arXiv:2011.15083  [pdf, other]

    cs.HC cs.LG stat.AP

    A Large Scale Randomized Controlled Trial on Herding in Peer-Review Discussions

    Authors: Ivan Stelmakh, Charvi Rastogi, Nihar B. Shah, Aarti Singh, Hal Daumé III

    Abstract: Peer review is the backbone of academia and humans constitute a cornerstone of this process, being responsible for reviewing papers and making the final acceptance/rejection decisions. Given that human decision making is known to be susceptible to various cognitive biases, it is important to understand which (if any) biases are present in the peer-review process and design the pipeline such that t…

    Submitted 30 November, 2020; originally announced November 2020.

  11. arXiv:2011.15050  [pdf, other]

    cs.HC cs.LG

    A Novice-Reviewer Experiment to Address Scarcity of Qualified Reviewers in Large Conferences

    Authors: Ivan Stelmakh, Nihar B. Shah, Aarti Singh, Hal Daumé III

    Abstract: Conference peer review constitutes a human-computation process whose importance cannot be overstated: not only does it identify the best submissions for acceptance, but, ultimately, it impacts the future of the whole research area by promoting some ideas and restraining others. A surge in the number of submissions received by leading AI conferences has challenged the sustainability of the review proc…

    Submitted 30 November, 2020; originally announced November 2020.

  12. arXiv:2011.14646  [pdf, other]

    cs.DL cs.LG stat.AP

    Prior and Prejudice: The Novice Reviewers' Bias against Resubmissions in Conference Peer Review

    Authors: Ivan Stelmakh, Nihar B. Shah, Aarti Singh, Hal Daumé III

    Abstract: Modern machine learning and computer science conferences are experiencing a surge in the number of submissions that challenges the quality of peer review as the number of competent reviewers is growing at a much slower rate. To curb this trend and reduce the burden on reviewers, several conferences have started encouraging or even requiring authors to declare the previous submission history of the…

    Submitted 30 November, 2020; originally announced November 2020.

  13. arXiv:2010.04041  [pdf, other]

    cs.MA cs.GT cs.LG

    Catch Me if I Can: Detecting Strategic Behaviour in Peer Assessment

    Authors: Ivan Stelmakh, Nihar B. Shah, Aarti Singh

    Abstract: We consider the issue of strategic behaviour in various peer-assessment tasks, including peer grading of exams or homeworks and peer review in hiring or promotions. When a peer-assessment task is competitive (e.g., when students are graded on a curve), agents may be incentivized to misreport evaluations in order to improve their own final standing. Our focus is on designing methods for detection o…

    Submitted 8 October, 2020; originally announced October 2020.

  14. arXiv:1806.06237  [pdf, other]

    stat.ML cs.DS cs.IT cs.LG

    PeerReview4All: Fair and Accurate Reviewer Assignment in Peer Review

    Authors: Ivan Stelmakh, Nihar B. Shah, Aarti Singh

    Abstract: We consider the problem of automated assignment of papers to reviewers in conference peer review, with a focus on fairness and statistical accuracy. Our fairness objective is to maximize the review quality of the most disadvantaged paper, in contrast to the commonly used objective of maximizing the total quality over all papers. We design an assignment algorithm based on an incremental max-flow pr…

    Submitted 14 November, 2019; v1 submitted 16 June, 2018; originally announced June 2018.