-
Quantum interrogation using weak value measurement
Authors:
Muhammad Abdullah Ijaz,
Syed Bilal Hyder Shah,
Muhammad Sabieh Anwar
Abstract:
We propose a scheme for quantum interrogation measurements using constructive interference and post-selection to achieve single-pass high-efficiency detection for imperfect or semi-transparent absorbers. We illustrate that our method works for heralded single-photon as well as weak attenuated sources. We also study the influence of error from our equipment and show that post-selection makes our scheme robust against noise. We further demonstrate that with a small extension, we can quantify the transmittance of the imperfect absorber by using the process of weak value amplification (WVA).
Submitted 1 July, 2024;
originally announced July 2024.
-
How We Built Cedar: A Verification-Guided Approach
Authors:
Craig Disselkoen,
Aaron Eline,
Shaobo He,
Kyle Headley,
Michael Hicks,
Kesha Hietala,
John Kastner,
Anwar Mamat,
Matt McCutchen,
Neha Rungta,
Bhakti Shah,
Emina Torlak,
Andrew Wells
Abstract:
This paper presents verification-guided development (VGD), a software engineering process we used to build Cedar, a new policy language for expressive, fast, safe, and analyzable authorization. Developing a system with VGD involves writing an executable model of the system and mechanically proving properties about the model; writing production code for the system and using differential random testing (DRT) to check that the production code matches the model; and using property-based testing (PBT) to check properties of unmodeled parts of the production code. Using VGD for Cedar, we can build fast, idiomatic production code, prove our model correct, and find and fix subtle implementation bugs that evade code reviews and unit testing. While carrying out proofs, we found and fixed 4 bugs in Cedar's policy validator, and DRT and PBT helped us find and fix 21 additional bugs in various parts of Cedar.
Submitted 1 July, 2024;
originally announced July 2024.
-
What Can Natural Language Processing Do for Peer Review?
Authors:
Ilia Kuznetsov,
Osama Mohammed Afzal,
Koen Dercksen,
Nils Dycke,
Alexander Goldberg,
Tom Hope,
Dirk Hovy,
Jonathan K. Kummerfeld,
Anne Lauscher,
Kevin Leyton-Brown,
Sheng Lu,
Mausam,
Margot Mieskes,
Aurélie Névéol,
Danish Pruthi,
Lizhen Qu,
Roy Schwartz,
Noah A. Smith,
Thamar Solorio,
Jingyan Wang,
Xiaodan Zhu,
Anna Rogers,
Nihar B. Shah,
Iryna Gurevych
Abstract:
The number of scientific articles produced every year is growing rapidly. Providing quality control over them is crucial for scientists and, ultimately, for the public good. In modern science, this process is largely delegated to peer review -- a distributed procedure in which each submission is evaluated by several independent experts in the field. Peer review is widely used, yet it is hard, time-consuming, and prone to error. Since the artifacts involved in peer review -- manuscripts, reviews, discussions -- are largely text-based, Natural Language Processing has great potential to improve reviewing. As the emergence of large language models (LLMs) has enabled NLP assistance for many new tasks, the discussion on machine-assisted peer review is picking up the pace. Yet, where exactly is help needed, where can NLP help, and where should it stand aside? The goal of our paper is to provide a foundation for the future efforts in NLP for peer-reviewing assistance. We discuss peer review as a general process, exemplified by reviewing at AI conferences. We detail each step of the process from manuscript submission to camera-ready revision, and discuss the associated challenges and opportunities for NLP assistance, illustrated by existing work. We then turn to the big challenges in NLP for peer review as a whole, including data acquisition and licensing, operationalization and experimentation, and ethical issues. To help consolidate community efforts, we create a companion repository that aggregates key datasets pertaining to peer review. Finally, we issue a detailed call for action for the scientific community, NLP and AI researchers, policymakers, and funding bodies to help bring the research in NLP for peer review forward. We hope that our work will help set the agenda for research in machine-assisted scientific quality control in the age of AI, within the NLP community and beyond.
Submitted 10 May, 2024;
originally announced May 2024.
-
ViCAR: Visualizing Categories with Automated Rewriting in Coq
Authors:
Bhakti Shah,
William Spencer,
Laura Zielinski,
Ben Caldwell,
Adrian Lehmann,
Robert Rand
Abstract:
We present ViCAR, a library for working with monoidal categories in the Coq proof assistant. ViCAR provides definitions for categorical structures that users can instantiate with their own verification projects. Upon verifying relevant coherence conditions, ViCAR gives a set of lemmas and tactics for manipulating categorical structures. We also provide a visualizer that can display any composition and tensor product of morphisms as a string diagram, showing its categorical structure. This enables graphical reasoning and automated rewriting for Coq projects with monoidal structures.
Submitted 11 April, 2024;
originally announced April 2024.
-
A Randomized Controlled Trial on Anonymizing Reviewers to Each Other in Peer Review Discussions
Authors:
Charvi Rastogi,
Xiangchen Song,
Zhijing Jin,
Ivan Stelmakh,
Hal Daumé III,
Kun Zhang,
Nihar B. Shah
Abstract:
Peer review often involves reviewers submitting their independent reviews, followed by a discussion among reviewers of each paper. A question among policymakers is whether the reviewers of a paper should be anonymous to each other during the discussion. We shed light on this by conducting a randomized controlled trial at the UAI 2022 conference. We randomly split the reviewers and papers into two conditions -- one with anonymous discussions and the other with non-anonymous discussions -- and conduct an anonymous survey of all reviewers, to address the following questions: 1. Do reviewers discuss more in one of the conditions? Marginally more in anonymous (n = 2281, p = 0.051). 2. Does seniority have more influence on final decisions when non-anonymous? Yes, the decisions are closer to senior reviewers' scores in the non-anonymous condition than in anonymous (n = 484, p = 0.04). 3. Are reviewers more polite in one of the conditions? No significant difference in politeness of reviewers' text-based responses (n = 1125, p = 0.72). 4. Do reviewers' self-reported experiences differ across the two conditions? No significant difference for each of the five questions asked (n = 132 and p > 0.3). 5. Do reviewers prefer one condition over the other? Yes, there is a weak preference for anonymous discussions (n = 159 and Cohen's d = 0.25). 6. What do reviewers consider important to make policy on anonymity among reviewers? Reviewers' feeling of safety in expressing their opinions was rated most important, while polite communication among reviewers was rated least important (n = 159). 7. Have reviewers experienced dishonest behavior due to non-anonymity in discussions? Yes, roughly 7% of respondents answered affirmatively (n = 167). Overall, this experiment reveals evidence supporting an anonymous discussion setup in the peer-review process, in terms of the evaluation criteria considered.
Submitted 1 March, 2024;
originally announced March 2024.
-
On the Detection of Reviewer-Author Collusion Rings From Paper Bidding
Authors:
Steven Jecmen,
Nihar B. Shah,
Fei Fang,
Leman Akoglu
Abstract:
A major threat to the peer-review systems of computer science conferences is the existence of "collusion rings" between reviewers. In such collusion rings, reviewers who have also submitted their own papers to the conference work together to manipulate the conference's paper assignment, with the aim of being assigned to review each other's papers. The most straightforward way that colluding reviewers can manipulate the paper assignment is by indicating their interest in each other's papers through strategic paper bidding. One potential approach to solve this important problem would be to detect the colluding reviewers from their manipulated bids, after which the conference can take appropriate action. While prior work has developed effective techniques to detect other kinds of fraud, no research has yet established that detecting collusion rings is even possible. In this work, we tackle the question of whether it is feasible to detect collusion rings from the paper bidding. To answer this question, we conduct empirical analysis of two realistic conference bidding datasets, including evaluations of existing algorithms for fraud detection in other applications. We find that collusion rings can achieve considerable success at manipulating the paper assignment while remaining hidden from detection: for example, in one dataset, undetected colluders are able to achieve assignment to up to 30% of the papers authored by other colluders. In addition, when 10 colluders bid on all of each other's papers, no detection algorithm outputs a group of reviewers with more than 31% overlap with the true colluders. These results suggest that collusion cannot be effectively detected from the bidding using popular existing tools, demonstrating the need to develop more complex detection algorithms as well as those that leverage additional metadata (e.g., reviewer-paper text-similarity scores).
Submitted 10 March, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Towards Understanding the Challenges of Bug Localization in Deep Learning Systems
Authors:
Sigma Jahan,
Mehil B. Shah,
Mohammad Masudur Rahman
Abstract:
Software bugs cost the global economy billions of dollars annually and claim ~50% of the programming time from software developers. Locating these bugs is crucial for their resolution but challenging. It is even more challenging in deep-learning systems due to their black-box nature. Bugs in these systems are also hidden not only in the code but also in the models and training data, which might make traditional debugging methods less effective. In this article, we conduct a large-scale empirical study to better understand the challenges of localizing bugs in deep-learning systems. First, we determine the bug localization performance of four existing techniques using 2,365 bugs from deep-learning systems and 2,913 from traditional software. We found these techniques significantly underperform in localizing deep-learning system bugs. Second, we evaluate how different bug types in deep-learning systems impact bug localization. We found that the effectiveness of localization techniques varies with bug type due to their unique challenges. For example, tensor bugs were easier to locate due to their structural nature, while all techniques struggled with GPU bugs due to their external dependencies. Third, we investigate the impact of bugs' extrinsic nature on localization in deep-learning systems. We found that deep-learning bugs are often extrinsic and thus connected to artifacts other than source code (e.g., GPU, training data), contributing to the poor performance of existing localization methods.
Submitted 1 February, 2024;
originally announced February 2024.
-
Interferometric Single-Shot Parity Measurement in an InAs-Al Hybrid Device
Authors:
Morteza Aghaee,
Alejandro Alcaraz Ramirez,
Zulfi Alam,
Rizwan Ali,
Mariusz Andrzejczuk,
Andrey Antipov,
Mikhail Astafev,
Amin Barzegar,
Bela Bauer,
Jonathan Becker,
Umesh Kumar Bhaskar,
Alex Bocharov,
Srini Boddapati,
David Bohn,
Jouri Bommer,
Leo Bourdet,
Arnaud Bousquet,
Samuel Boutin,
Lucas Casparis,
Benjamin James Chapman,
Sohail Chatoor,
Anna Wulff Christensen,
Cassandra Chua,
Patrick Codd,
William Cole
, et al. (137 additional authors not shown)
Abstract:
The fusion of non-Abelian anyons or topological defects is a fundamental operation in measurement-only topological quantum computation. In topological superconductors, this operation amounts to a determination of the shared fermion parity of Majorana zero modes. As a step towards this, we implement a single-shot interferometric measurement of fermion parity in indium arsenide-aluminum heterostructures with a gate-defined nanowire. The interferometer is formed by tunnel-coupling the proximitized nanowire to quantum dots. The nanowire causes a state-dependent shift of these quantum dots' quantum capacitance of up to 1 fF. Our quantum capacitance measurements show flux h/2e-periodic bimodality with a signal-to-noise ratio of 1 in 3.7 $μ$s at optimal flux values. From the time traces of the quantum capacitance measurements, we extract a dwell time in the two associated states that is longer than 1 ms at in-plane magnetic fields of approximately 2 T. These results are consistent with a measurement of the fermion parity encoded in a pair of Majorana zero modes that are separated by approximately 3 $μ$m and subjected to a low rate of poisoning by non-equilibrium quasiparticles. The large capacitance shift and long poisoning time enable a parity measurement error probability of 1%.
Submitted 2 April, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Towards Enhancing the Reproducibility of Deep Learning Bugs: An Empirical Study
Authors:
Mehil B. Shah,
Mohammad Masudur Rahman,
Foutse Khomh
Abstract:
Context: Deep learning has achieved remarkable progress in various domains. However, like any software system, deep learning systems contain bugs, some of which can have severe impacts, as evidenced by crashes involving autonomous vehicles. Despite substantial advancements in deep learning techniques, little research has focused on reproducing deep learning bugs, which is an essential step for their resolution. Existing literature suggests that only 3% of deep learning bugs are reproducible, underscoring the need for further research.
Objective: This paper examines the reproducibility of deep learning bugs. We identify edit actions and useful information that could improve the reproducibility of deep learning bugs.
Method: First, we construct a dataset of 668 deep-learning bugs from Stack Overflow and GitHub across three frameworks and 22 architectures. Second, out of the 668 bugs, we select 165 bugs using stratified sampling and attempt to determine their reproducibility. While reproducing these bugs, we identify edit actions and useful information for their reproduction. Third, we used the Apriori algorithm to identify useful information and edit actions required to reproduce specific types of bugs. Finally, we conducted a user study involving 22 developers to assess the effectiveness of our findings in real-life settings.
Results: We successfully reproduced 148 out of 165 bugs attempted. We identified ten edit actions and five useful types of component information that can help us reproduce the deep learning bugs. With the help of our findings, the developers were able to reproduce 22.92% more bugs and reduce their reproduction time by 24.35%.
Conclusions: Our research addresses the critical issue of deep learning bug reproducibility. Practitioners and researchers can leverage our findings to improve deep learning bug reproducibility.
Submitted 18 June, 2024; v1 submitted 5 January, 2024;
originally announced January 2024.
-
VyZX: Formal Verification of a Graphical Quantum Language
Authors:
Adrian Lehmann,
Ben Caldwell,
Bhakti Shah,
Robert Rand
Abstract:
Mathematical representations of graphs often resemble adjacency matrices or lists, representations that facilitate whiteboard reasoning and algorithm design. In the realm of proof assistants, inductive representations effectively define semantics for formal reasoning. This highlights a gap where algorithm design and proof assistants require a fundamentally different structure of graphs, particularly for process theories which represent programs using graphs. To address this gap, we present VyZX, a verified library for reasoning about inductively defined graphical languages. These inductive constructs arise naturally from category theory definitions. A key goal for VyZX is to Verify the ZX-calculus, a graphical language for reasoning about quantum computation. The ZX-calculus comes with a collection of diagrammatic rewrite rules that preserve the graph's semantic interpretation. We show how inductive graphs in VyZX are used to prove the correctness of the ZX-calculus rewrite rules and apply them in practice using standard proof assistant techniques. VyZX integrates easily with the proof engineer's workflow through visualization and automation.
Submitted 20 November, 2023;
originally announced November 2023.
-
Peer Reviews of Peer Reviews: A Randomized Controlled Trial and Other Experiments
Authors:
Alexander Goldberg,
Ivan Stelmakh,
Kyunghyun Cho,
Alice Oh,
Alekh Agarwal,
Danielle Belgrave,
Nihar B. Shah
Abstract:
Is it possible to reliably evaluate the quality of peer reviews? We study this question driven by two primary motivations -- incentivizing high-quality reviewing using assessed quality of reviews and measuring changes to review quality in experiments. We conduct a large-scale study at the NeurIPS 2022 conference, a top-tier conference in machine learning, in which we invited (meta)-reviewers and authors to evaluate reviews given to submitted papers. First, we conduct an RCT to examine bias due to the length of reviews. We generate elongated versions of reviews by adding substantial amounts of non-informative content. Participants in the control group evaluate the original reviews, whereas participants in the experimental group evaluate the artificially lengthened versions. We find that lengthened reviews are scored (statistically significantly) higher in quality than the original reviews. Additionally, in analysis of observational data we find that authors are positively biased towards reviews recommending acceptance of their own papers, even after controlling for confounders of review length, quality, and different numbers of papers per author. We also measure disagreement rates between multiple evaluations of the same review of 28%-32%, which is comparable to that of paper reviewers at NeurIPS. Further, we assess the amount of miscalibration of evaluators of reviews using a linear model of quality scores and find that it is similar to estimates of miscalibration of paper reviewers at NeurIPS. Finally, we estimate the amount of variability in subjective opinions around how to map individual criteria to overall scores of review quality and find that it is roughly the same as that in the review of papers. Our results suggest that the various problems that exist in reviews of papers -- inconsistency, bias towards irrelevant factors, miscalibration, subjectivity -- also arise in reviewing of reviews.
Submitted 15 November, 2023;
originally announced November 2023.
-
Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding
Authors:
Shaik Basheeruddin Shah,
Pradyumna Pradhan,
Wei Pu,
Ramunaidu Randhi,
Miguel R. D. Rodrigues,
Yonina C. Eldar
Abstract:
Solving linear inverse problems plays a crucial role in numerous applications. Algorithm unfolding based, model-aware data-driven approaches have gained significant attention for effectively addressing these problems. Learned iterative soft-thresholding algorithm (LISTA) and alternating direction method of multipliers compressive sensing network (ADMM-CSNet) are two widely used such approaches, based on ISTA and ADMM algorithms, respectively. In this work, we study optimization guarantees, i.e., achieving near-zero training loss with the increase in the number of learning epochs, for finite-layer unfolded networks such as LISTA and ADMM-CSNet with smooth soft-thresholding in an over-parameterized (OP) regime. We achieve this by leveraging a modified version of the Polyak-Lojasiewicz, denoted PL$^*$, condition. Satisfying the PL$^*$ condition within a specific region of the loss landscape ensures the existence of a global minimum and exponential convergence from initialization using gradient descent based methods. Hence, we provide conditions, in terms of the network width and the number of training samples, on these unfolded networks for the PL$^*$ condition to hold. We achieve this by deriving the Hessian spectral norm of these networks. Additionally, we show that the threshold on the number of training samples increases with the increase in the network width. Furthermore, we compare the threshold on training samples of unfolded networks with that of a standard fully-connected feed-forward network (FFNN) with smooth soft-thresholding non-linearity. We prove that unfolded networks have a higher threshold value than FFNN. Consequently, one can expect a better expected error for unfolded networks than FFNN.
Submitted 12 September, 2023;
originally announced September 2023.
-
Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?
Authors:
Ayushi Agarwal,
Nisarg Patel,
Neeraj Varshney,
Mihir Parmar,
Pavan Mallina,
Aryan Bhavin Shah,
Srihari Raju Sangaraju,
Tirth Patel,
Nihar Thakkar,
Chitta Baral
Abstract:
Though state-of-the-art (SOTA) NLP systems have achieved remarkable performance on a variety of language understanding tasks, they primarily focus on questions that have a correct and a definitive answer. However, in real-world applications, users often ask questions that don't have a definitive answer. Incorrectly answering such questions certainly hampers a system's reliability and trustworthiness. Can SOTA models accurately identify such questions and provide a reasonable response?
To investigate the above question, we introduce QnotA, a dataset consisting of five different categories of questions that don't have definitive answers. Furthermore, for each QnotA instance, we also provide a corresponding QA instance, i.e., an alternate question that "can be" answered. With this data, we formulate three evaluation tasks that test a system's ability to 'identify', 'distinguish', and 'justify' QnotA questions. Through comprehensive experiments, we show that even SOTA models including GPT-3 and Flan T5 do not fare well on these tasks and lag considerably behind the human performance baseline. We conduct a thorough analysis which further leads to several interesting findings. Overall, we believe our work and findings will encourage and facilitate further research in this important area and help develop more robust models.
Submitted 8 September, 2023;
originally announced September 2023.
-
Demonstrating a long-coherence dual-rail erasure qubit using tunable transmons
Authors:
Harry Levine,
Arbel Haim,
Jimmy S. C. Hung,
Nasser Alidoust,
Mahmoud Kalaee,
Laura DeLorenzo,
E. Alex Wollack,
Patricio Arrangoiz-Arriola,
Amirhossein Khalajhedayati,
Rohan Sanil,
Hesam Moradinejad,
Yotam Vaknin,
Aleksander Kubica,
David Hover,
Shahriar Aghaeimeibodi,
Joshua Ari Alcid,
Christopher Baek,
James Barnett,
Kaustubh Bawdekar,
Przemyslaw Bienias,
Hugh Carson,
Cliff Chen,
Li Chen,
Harut Chinkezian,
Eric M. Chisholm
, et al. (88 additional authors not shown)
Abstract:
Quantum error correction with erasure qubits promises significant advantages over standard error correction due to favorable thresholds for erasure errors. To realize this advantage in practice requires a qubit for which nearly all errors are such erasure errors, and the ability to check for erasure errors without dephasing the qubit. We demonstrate that a "dual-rail qubit" consisting of a pair of resonantly coupled transmons can form a highly coherent erasure qubit, where transmon $T_1$ errors are converted into erasure errors and residual dephasing is strongly suppressed, leading to millisecond-scale coherence within the qubit subspace. We show that single-qubit gates are limited primarily by erasure errors, with erasure probability $p_\text{erasure} = 2.19(2)\times 10^{-3}$ per gate while the residual errors are $\sim 40$ times lower. We further demonstrate mid-circuit detection of erasure errors while introducing $< 0.1\%$ dephasing error per check. Finally, we show that the suppression of transmon noise allows this dual-rail qubit to preserve high coherence over a broad tunable operating range, offering an improved capacity to avoid frequency collisions. This work establishes transmon-based dual-rail qubits as an attractive building block for hardware-efficient quantum error correction.
Submitted 20 March, 2024; v1 submitted 17 July, 2023;
originally announced July 2023.
-
Testing for Reviewer Anchoring in Peer Review: A Randomized Controlled Trial
Authors:
Ryan Liu,
Steven Jecmen,
Vincent Conitzer,
Fei Fang,
Nihar B. Shah
Abstract:
Peer review frequently follows a process where reviewers first provide initial reviews, authors respond to these reviews, then reviewers update their reviews based on the authors' response. There is mixed evidence regarding whether this process is useful, including frequent anecdotal complaints that reviewers insufficiently update their scores. In this study, we aim to investigate whether reviewers anchor to their original scores when updating their reviews, which serves as a potential explanation for the lack of updates in reviewer scores.
We design a novel randomized controlled trial to test if reviewers exhibit anchoring. In the experimental condition, participants initially see a flawed version of a paper that is later corrected, while in the control condition, participants only see the correct version. We take various measures to ensure that in the absence of anchoring, reviewers in the experimental group should revise their scores to be identically distributed to the scores from the control group. Furthermore, we construct the reviewed paper to maximize the difference between the flawed and corrected versions, and employ deception to hide the true experiment purpose.
Our randomized controlled trial consists of 108 researchers as participants. First, we find that our intervention was successful at creating a difference in perceived paper quality between the flawed and corrected versions: Using a permutation test with the Mann-Whitney U statistic, we find that the experimental group's initial scores are lower than the control group's scores in both the Evaluation category (Vargha-Delaney A=0.64, p=0.0096) and Overall score (A=0.59, p=0.058). Next, we test for anchoring by comparing the experimental group's revised scores with the control group's scores. We find no significant evidence of anchoring in either the Overall (A=0.50, p=0.61) or Evaluation category (A=0.49, p=0.61).
Submitted 11 July, 2023;
originally announced July 2023.
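
The group comparison described in this abstract (a permutation test on the Mann-Whitney U statistic, reported as a Vargha-Delaney A effect size) can be illustrated with a minimal sketch. The function names, the one-sided direction of the test, and the number of permutations are assumptions made for illustration; the abstract does not specify these details, and this is not the authors' code.

```python
import random


def vargha_delaney_a(x, y):
    """Probability that a random draw from x exceeds one from y (ties count half).

    This equals the Mann-Whitney U statistic divided by len(x) * len(y).
    """
    wins = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    return wins / (len(x) * len(y))


def permutation_test(x, y, n_perm=10_000, seed=0):
    """One-sided permutation p-value for 'x tends to be larger than y'.

    Assumption: a one-sided test with A as the statistic; the study's exact
    procedure may differ.
    """
    rng = random.Random(seed)
    observed = vargha_delaney_a(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled scores to the two groups.
        rng.shuffle(pooled)
        a = vargha_delaney_a(pooled[:len(x)], pooled[len(x):])
        if a >= observed:
            hits += 1
    # The add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_test(control_scores, experimental_scores)` with two hypothetical lists of review scores returns the effect size A and the p-value. An A near 0.5 means the two score distributions overlap almost completely, which is how the reported A=0.50 and A=0.49 for the revised scores should be read.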
-
Intersectionality and Testimonial Injustice in Medical Records
Authors:
Kenya S. Andrews,
Bhuvani Shah,
Lu Cheng
Abstract:
Detecting testimonial injustice is an essential element of addressing inequities and promoting inclusive healthcare practices, many of which are life-critical. However, using a single demographic factor to detect testimonial injustice does not fully encompass the nuanced identities that contribute to a patient's experience. Further, some injustices may only be evident when examining the nuances that arise through the lens of intersectionality. Ignoring such injustices can result in poor quality of care or life-endangering events. Thus, considering intersectionality could result in more accurate classifications and just decisions. To illustrate this, we use real-world medical data to determine whether medical records exhibit words that could lead to testimonial injustice, employ fairness metrics (e.g. demographic parity, differential intersectional fairness, and subgroup fairness) to assess the extent to which subgroups are experiencing testimonial injustice, and analyze how the intersectionality of demographic features (e.g. gender and race) makes a difference in uncovering testimonial injustice. From our analysis, we found that with intersectionality we can better see disparities in how subgroups are treated, and that there are differences in how someone is treated based on the intersection of their demographic attributes. This has not been previously studied in clinical records, nor has it been proven through empirical study.
Submitted 20 June, 2023;
originally announced June 2023.
-
Machine Vision Using Cellphone Camera: A Comparison of deep networks for classifying three challenging denominations of Indian Coins
Authors:
Keyur D. Joshi,
Dhruv Shah,
Varshil Shah,
Nilay Gandhi,
Sanket J. Shah,
Sanket B. Shah
Abstract:
Indian currency coins come in a variety of denominations. Of all the varieties, Rs.1, Rs.2, and Rs.5 have similar diameters. The majority of the coin styles in market circulation for the Rs.1 and Rs.2 denominations are nearly the same except for the numerals on their reverse side. If a coin is resting on its obverse side, the correct denomination is not distinguishable by humans. Therefore, it was hypothesized that a digital image of a coin resting on either side could be classified into its correct denomination by training a deep neural network model. The digital images were generated using cheap cell phone cameras. To find the most suitable deep neural network architecture, four were selected for comparison based on a preliminary analysis. The results confirm that two of the four deep neural network models can classify the correct denomination from either side of a coin with an accuracy of 97%.
Submitted 12 May, 2023;
originally announced June 2023.
-
ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing
Authors:
Ryan Liu,
Nihar B. Shah
Abstract:
Given the rapid ascent of large language models (LLMs), we study the question: (How) can large language models help in reviewing of scientific papers or proposals? We first conduct some pilot studies where we find that (i) GPT-4 outperforms other LLMs (Bard, Vicuna, Koala, Alpaca, LLaMa, Dolly, OpenAssistant, StableLM), and (ii) prompting with a specific question (e.g., to identify errors) outperforms prompting to simply write a review. With these insights, we study the use of LLMs (specifically, GPT-4) for three tasks:
1. Identifying errors: We construct 13 short computer science papers each with a deliberately inserted error, and ask the LLM to check for the correctness of these papers. We observe that the LLM finds errors in 7 of them, spanning both mathematical and conceptual errors.
2. Verifying checklists: We task the LLM to verify 16 closed-ended checklist questions in the respective sections of 15 NeurIPS 2022 papers. We find that across 119 {checklist question, paper} pairs, the LLM had an 86.6% accuracy.
3. Choosing the "better" paper: We generate 10 pairs of abstracts, deliberately designing each pair in such a way that one abstract was clearly superior to the other. The LLM, however, struggled to discern these relatively straightforward distinctions accurately, committing errors in its evaluations for 6 out of the 10 pairs.
Based on these experiments, we think that LLMs have a promising use as reviewing assistants for specific reviewing tasks, but not (yet) for complete evaluations of papers or proposals.
Submitted 1 June, 2023;
originally announced June 2023.
-
Counterfactual Evaluation of Peer-Review Assignment Policies
Authors:
Martin Saveski,
Steven Jecmen,
Nihar B. Shah,
Johan Ugander
Abstract:
Peer review assignment algorithms aim to match research papers to suitable expert reviewers, working to maximize the quality of the resulting reviews. A key challenge in designing effective assignment policies is evaluating how changes to the assignment algorithm map to changes in review quality. In this work, we leverage recently proposed policies that introduce randomness in peer-review assignment--in order to mitigate fraud--as a valuable opportunity to evaluate counterfactual assignment policies. Specifically, we exploit how such randomized assignments provide a positive probability of observing the reviews of many assignment policies of interest. To address challenges in applying standard off-policy evaluation methods, such as violations of positivity, we introduce novel methods for partial identification based on monotonicity and Lipschitz smoothness assumptions for the mapping between reviewer-paper covariates and outcomes. We apply our methods to peer-review data from two computer science venues: the TPDP'21 workshop (95 papers and 35 reviewers) and the AAAI'22 conference (8,450 papers and 3,145 reviewers). We consider estimates of (i) the effect on review quality when changing weights in the assignment algorithm, e.g., weighting reviewers' bids vs. textual similarity (between the reviewer's past papers and the submission), and (ii) the "cost of randomization", capturing the difference in expected quality between the perturbed and unperturbed optimal match. We find that placing higher weight on text similarity results in higher review quality and that introducing randomization in the reviewer-paper assignment only marginally reduces the review quality. Our methods for partial identification may be of independent interest, while our off-policy approach can likely find use evaluating a broad class of algorithmic matching systems.
Submitted 26 May, 2023;
originally announced May 2023.
-
A Gold Standard Dataset for the Reviewer Assignment Problem
Authors:
Ivan Stelmakh,
John Wieting,
Graham Neubig,
Nihar B. Shah
Abstract:
Many peer-review venues are either using or looking to use algorithms to assign submissions to reviewers. The crux of such automated approaches is the notion of the "similarity score"--a numerical estimate of the expertise of a reviewer in reviewing a paper--and many algorithms have been proposed to compute these scores. However, these algorithms have not been subjected to a principled comparison, making it difficult for stakeholders to choose the algorithm in an evidence-based manner. The key challenge in comparing existing algorithms and developing better algorithms is the lack of the publicly available gold-standard data that would be needed to perform reproducible research. We address this challenge by collecting a novel dataset of similarity scores that we release to the research community. Our dataset consists of 477 self-reported expertise scores provided by 58 researchers who evaluated their expertise in reviewing papers they have read previously.
We use this data to compare several popular algorithms employed in computer science conferences and come up with recommendations for stakeholders. Our main findings are as follows. First, all algorithms make a non-trivial amount of error. For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases, highlighting the vital need for more research on the similarity-computation problem. Second, most existing algorithms are designed to work with titles and abstracts of papers, and in this regime the Specter+MFR algorithm performs best. Third, to improve performance, it may be important to develop modern deep-learning based algorithms that can make use of the full texts of papers: the classical TF-IDF algorithm enhanced with full texts of papers is on par with the deep-learning based Specter+MFR that cannot make use of this information.
Submitted 23 March, 2023;
originally announced March 2023.
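
To make the notion of a text-based "similarity score" in this abstract concrete, here is a minimal sketch of classical TF-IDF with cosine similarity. The whitespace tokenizer, the smoothed IDF formula, and all function names are assumptions chosen for illustration; the abstract does not describe the exact TF-IDF variant evaluated in the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization.
    return text.lower().split()


def tfidf_vectors(texts):
    """Return one sparse {term: weight} vector per input text."""
    docs = [tokenize(t) for t in texts]
    n_docs = len(docs)
    # Document frequency: in how many texts each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

In a reviewer-assignment setting, one text would be built from a reviewer's past papers and another from the submission, and `cosine` of their vectors would serve as the similarity score. Longer inputs (full texts rather than titles and abstracts) give the term statistics more to work with, which fits the abstract's third finding.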
-
Metric-Free Exploration for Topological Mapping by Task and Motion Imitation in Feature Space
Authors:
Yuhang He,
Irving Fang,
Yiming Li,
Rushi Bhavesh Shah,
Chen Feng
Abstract:
We propose DeepExplorer, a simple and lightweight metric-free exploration method for topological mapping of unknown environments. It performs task and motion planning (TAMP) entirely in image feature space. The task planner is a recurrent network using the latest image observation sequence to hallucinate a feature as the next-best exploration goal. The motion planner then utilizes the current and the hallucinated features to generate an action taking the agent towards that goal. The two planners are jointly trained via deeply-supervised imitation learning from expert demonstrations. During exploration, we iteratively call the two planners to predict the next action, and the topological map is built by constantly appending the latest image observation and action to the map and using visual place recognition (VPR) for loop closing. The resulting topological map efficiently represents an environment's connectivity and traversability, so it can be used for tasks such as visual navigation. We show DeepExplorer's exploration efficiency and strong sim2sim generalization capability on large-scale simulation datasets like Gibson and MP3D. Its effectiveness is further validated via the image-goal navigation performance on the resulting topological map. We further show its strong zero-shot sim2real generalization capability in real-world experiments. The source code is available at https://ai4ce.github.io/DeepExplorer/.
Submitted 16 March, 2023;
originally announced March 2023.
-
Assisting Human Decisions in Document Matching
Authors:
Joon Sik Kim,
Valerie Chen,
Danish Pruthi,
Nihar B. Shah,
Ameet Talwalkar
Abstract:
Many practical applications, ranging from paper-reviewer assignment in peer review to job-applicant matching for hiring, require human decision makers to identify relevant matches by combining their expertise with predictions from machine learning models. In many such model-assisted document matching tasks, the decision makers have stressed the need for assistive information about the model outputs (or the data) to facilitate their decisions. In this paper, we devise a proxy matching task that allows us to evaluate which kinds of assistive information improve decision makers' performance (in terms of accuracy and time). Through a crowdsourced (N=271 participants) study, we find that providing black-box model explanations reduces users' accuracy on the matching task, contrary to the commonly-held belief that they can be helpful by allowing better understanding of the model. On the other hand, custom methods that are designed to closely attend to some task-specific desiderata are found to be effective in improving user performance. Surprisingly, we also find that the users' perceived utility of assistive information is misaligned with their objective utility (measured through their task performance).
Submitted 16 February, 2023;
originally announced February 2023.
-
Design, fabrication and large scale qualification of cosmic muon veto scintillator detectors
Authors:
Mandar Saraf,
Pandi Raj Chinnappan,
Aditya Deodhar,
Mamta Jangra,
J. Krishnamoorthi,
Gobinda Majumder,
Veera Padmavathy,
K. C. Ravindran,
Raj Bhupen Shah,
Ravindra Shinde,
B. Satyanarayana
Abstract:
The INO collaboration is designing a cosmic muon veto detector (CMVD) to cover the mini-ICAL detector, which is operational at the IICHEP transit campus, Madurai, in South India. The aim of the CMVD is to study the feasibility of building an experiment to record rare events at a shallow depth of around 100 m, and to use plastic scintillators to veto atmospheric muons, separating them from muons produced by rare interactions within the target mass of the detector. The efficiency of such a veto detector should be better than 99.99%, with a false positive rate of less than $10^{-5}$.
The CMVD is being built using extruded plastic scintillator (EPS) strips to detect and tag atmospheric muons. More than 700 EPS strips are required to build the CMVD. Two EPS strips are pasted together to make a di-counter (DC), and wavelength shifting fibres are embedded inside the EPS strips to trap the scintillation light generated by a passing cosmic ray muon and transmit it as secondary photons to the Silicon Photo-Multipliers (SiPMs) mounted at the two ends of the DCs. Since the efficiency requirement of the veto detector is rather high, it is imperative to thoroughly test every component used for building the CMVD. A cosmic ray muon telescope has been set up using the DCs to qualify all the DCs that will be fabricated. In this paper we discuss the details of the design and fabrication of the DCs, the cosmic muon setup and the electronics used for their testing, and the test results.
Submitted 4 May, 2023; v1 submitted 29 January, 2023;
originally announced January 2023.
-
The Role of Author Identities in Peer Review
Authors:
Nihar B. Shah
Abstract:
There is widespread debate on whether to anonymize author identities in peer review. The key argument for anonymization is to mitigate bias, whereas arguments against anonymization posit various uses of author identities in the review process. The Innovations in Theoretical Computer Science (ITCS) 2023 conference adopted a middle ground by initially anonymizing the author identities from reviewers, revealing them after the reviewer had submitted their initial reviews, and allowing the reviewer to change their review subsequently. We present an analysis of the reviews pertaining to the identification and use of author identities. Our key findings are: (I) A majority of reviewers self-report not knowing and being unable to guess the authors' identities for the papers they were reviewing. (II) After the initial submission of reviews, 7.1% of reviews changed their overall merit score and 3.8% changed their self-reported reviewer expertise. (III) There is a very weak and statistically insignificant correlation of the rank of authors' affiliations with the change in overall merit; there is a weak but statistically significant correlation with respect to change in reviewer expertise. We also conducted an anonymous survey to obtain opinions from reviewers and authors. The main findings from the 200 survey responses are: (i) A vast majority of participants favor anonymizing author identities in some form. (ii) The "middle-ground" initiative of ITCS 2023 was appreciated. (iii) Detecting conflicts of interest is a challenge that needs to be addressed if author identities are anonymized. Overall, these findings support anonymization of author identities in some form (e.g., as was done in ITCS 2023), as long as there is a robust and efficient way to check conflicts of interest.
Submitted 25 June, 2023; v1 submitted 31 December, 2022;
originally announced January 2023.
-
Sign Language to Text Conversion in Real Time using Transfer Learning
Authors:
Shubham Thakar,
Samveg Shah,
Bhavya Shah,
Anant V. Nimkar
Abstract:
People who are hearing impaired face many obstacles in communication and require an interpreter to comprehend what a person is saying. Despite ongoing research, existing models lack the ability to make accurate predictions. We therefore propose a deep learning model trained on American Sign Language (ASL) that takes ASL gestures as input and translates them into text. To achieve the translation, a Convolutional Neural Network (CNN) model and a transfer learning model based on the VGG16 architecture are used. Accuracy improves from 94% with the CNN to 98.7% with transfer learning, an improvement of about 5 percentage points. An application integrating the deep learning model has also been built.
Submitted 7 December, 2022; v1 submitted 13 November, 2022;
originally announced November 2022.
-
How do Authors' Perceptions of their Papers Compare with Co-authors' Perceptions and Peer-review Decisions?
Authors:
Charvi Rastogi,
Ivan Stelmakh,
Alina Beygelzimer,
Yann N. Dauphin,
Percy Liang,
Jennifer Wortman Vaughan,
Zhenyu Xue,
Hal Daumé III,
Emma Pierson,
Nihar B. Shah
Abstract:
How do author perceptions match up to the outcomes of the peer-review process and perceptions of others? In a top-tier computer science conference (NeurIPS 2021) with more than 23,000 submitting authors and 9,000 submitted papers, we survey the authors on three questions: (i) their predicted probability of acceptance for each of their papers, (ii) their perceived ranking of their own papers based on scientific contribution, and (iii) the change in their perception about their own papers after seeing the reviews. The salient results are: (1) Authors have roughly a three-fold overestimate of the acceptance probability of their papers: The median prediction is 70% for an approximately 25% acceptance rate. (2) Female authors exhibit a marginally higher (statistically significant) miscalibration than male authors; predictions of authors invited to serve as meta-reviewers or reviewers are similarly calibrated, but better than authors who were not invited to review. (3) Authors' relative ranking of scientific contribution of two submissions they made generally agrees (93%) with their predicted acceptance probabilities, but in a notable 7% of responses authors think their better paper will face a worse outcome. (4) The author-provided rankings disagreed with the peer-review decisions about a third of the time; when co-authors ranked their jointly authored papers, co-authors disagreed at a similar rate -- about a third of the time. (5) At least 30% of respondents of both accepted and rejected papers said that their perception of their own paper improved after the review process. The stakeholders in peer review should take these findings into account in setting their expectations from peer review.
Submitted 22 November, 2022;
originally announced November 2022.
-
Batching of Tasks by Users of Pseudonymous Forums: Anonymity Compromise and Protection
Authors:
Alexander Goldberg,
Giulia Fanti,
Nihar B. Shah
Abstract:
There are a number of forums where people participate under pseudonyms. One example is peer review, where the identity of reviewers for any paper is confidential. When participating in these forums, people frequently engage in "batching": executing multiple related tasks (e.g., commenting on multiple papers) at nearly the same time. Our empirical analysis shows that batching is common in two applications we consider: peer review and Wikipedia edits. In this paper, we identify and address the risk of deanonymization arising from linking batched tasks. To protect against linkage attacks, we take the approach of adding delay to the posting time of batched tasks. We first show that under some natural assumptions, no delay mechanism can provide a meaningful differential privacy guarantee. We therefore propose a "one-sided" formulation of differential privacy for protecting against linkage attacks. We design a mechanism that adds zero-inflated uniform delay to events and show it can preserve privacy. We prove that this noise distribution is in fact optimal in minimizing expected delay among mechanisms adding independent noise to each event, thereby establishing the Pareto frontier of the trade-off between the expected delay for batched and unbatched events. Finally, we conduct a series of experiments on Wikipedia and Bitcoin data that corroborate the practical utility of our algorithm in obfuscating batching without introducing onerous delay to a system.
Submitted 11 September, 2023; v1 submitted 22 November, 2022;
originally announced November 2022.
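
The "zero-inflated uniform delay" named in this abstract can be illustrated by its sampling step alone. In the sketch below, the parameter names `p_zero` (probability of adding no delay) and `max_delay` (upper end of the uniform range) are hypothetical, as are the function names. The abstract does not say how the paper sets these values to meet its one-sided privacy guarantee, so nothing here should be read as the authors' mechanism in full.

```python
import random


def zero_inflated_uniform_delay(rng, p_zero, max_delay):
    """Sample a delay: exactly 0 with probability p_zero, else Uniform(0, max_delay)."""
    if rng.random() < p_zero:
        return 0.0
    return rng.uniform(0.0, max_delay)


def delayed_posting_times(event_times, p_zero, max_delay, seed=0):
    """Add an independent delay to each event's posting time."""
    rng = random.Random(seed)
    return [
        t + zero_inflated_uniform_delay(rng, p_zero, max_delay)
        for t in event_times
    ]


def expected_delay(p_zero, max_delay):
    """Mean of the zero-inflated uniform distribution."""
    return (1.0 - p_zero) * max_delay / 2.0
```

Under these assumptions, raising `p_zero` lowers the expected delay for every event but leaves more batched events posted at their true, linkable times, which is the kind of trade-off between delay and protection that the abstract describes.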
-
A Comparative Study of Machine Learning and Deep Learning Techniques for Prediction of Co2 Emission in Cars
Authors:
Samveg Shah,
Shubham Thakar,
Kashish Jain,
Bhavya Shah,
Sudhir Dhage
Abstract:
The most recent concern of all people on Earth is the increase in the concentration of greenhouse gas in the atmosphere. The concentration of these gases has risen rapidly over the last century and if the trend continues it can cause many adverse climatic changes. There have been ways implemented to curb this by the government by limiting processes that emit a higher amount of CO2, one such greenhouse gas. However, there is mounting evidence that the CO2 numbers supplied by the government do not accurately reflect the performance of automobiles on the road. Our proposal of using artificial intelligence techniques to improve a previously rudimentary process takes a radical tack, but it fits the bill given the situation. To determine which algorithms and models produce the greatest outcomes, we compared them all and explored a novel method of ensembling them. Further, this can be used to foretell the rise in global temperature and to ground crucial policy decisions like the adoption of electric vehicles. To estimate emissions from vehicles, we used machine learning, deep learning, and ensemble learning on a massive dataset.
Submitted 15 November, 2022;
originally announced November 2022.
-
Allocation Schemes in Analytic Evaluation: Applicant-Centric Holistic or Attribute-Centric Segmented?
Authors:
Jingyan Wang,
Carmel Baharav,
Nihar B. Shah,
Anita Williams Woolley,
R Ravi
Abstract:
Many applications such as hiring and university admissions involve evaluation and selection of applicants. These tasks are fundamentally difficult, and require combining evidence from multiple different aspects (what we term "attributes"). In these applications, the number of applicants is often large, and a common practice is to assign the task to multiple evaluators in a distributed fashion. Specifically, in the often-used holistic allocation, each evaluator is assigned a subset of the applicants, and is asked to assess all relevant information for their assigned applicants. However, such an evaluation process is subject to issues such as miscalibration (evaluators see only a small fraction of the applicants and may not get a good sense of relative quality), and discrimination (evaluators are influenced by irrelevant information about the applicants). We identify that such attribute-based evaluation allows alternative allocation schemes. Specifically, we consider assigning each evaluator more applicants but fewer attributes per applicant, termed segmented allocation. We compare segmented allocation to holistic allocation on several dimensions via theoretical and experimental methods. We establish various tradeoffs between these two approaches, and identify conditions under which one approach results in more accurate evaluation than the other.
Submitted 18 September, 2022;
originally announced September 2022.
-
Properties and device performance of BN thin films grown on GaN by pulsed laser deposition
Authors:
Abhijit Biswas,
Mingfei Xu,
Kai Fu,
Jingan Zhou,
Rui Xu,
Anand B. Puthirath,
Jordan A. Hachtel,
Chenxi Li,
Sathvik Ajay Iyengar,
Harikishan Kannan,
Xiang Zhang,
Tia Gray,
Robert Vajtai,
A. Glen Birdwell,
Mahesh R. Neupane,
Dmitry A. Ruzmetov,
Pankaj B. Shah,
Tony Ivanov,
Hanyu Zhu,
Yuji Zhao,
Pulickel M. Ajayan
Abstract:
Wide and ultrawide-bandgap semiconductors lie at the heart of next-generation high-power, high-frequency electronics. Here, we report the growth of ultrawide-bandgap boron nitride (BN) thin films on wide-bandgap gallium nitride (GaN) by pulsed laser deposition. Comprehensive spectroscopic (core level and valence band XPS, FTIR, Raman) and microscopic (AFM and STEM) characterizations confirm the growth of BN thin films on GaN. Optically, we observed that BN/GaN heterostructure is second-harmonic generation active. Moreover, we fabricated the BN/GaN heterostructure-based Schottky diode that demonstrates rectifying characteristics, lower turn-on voltage, and an improved breakdown capability (234 V) as compared to GaN (168 V), owing to the higher breakdown electrical field of BN. Our approach is an early step towards bridging the gap between wide and ultrawide-bandgap materials for potential optoelectronics as well as next-generation high-power electronics.
Submitted 1 September, 2022;
originally announced September 2022.
-
Performance Study of Time Series Databases
Authors:
Bonil Shah,
P. M. Jat,
Kalyan Sashidhar
Abstract:
The growth of big-data sectors such as the Internet of Things (IoT) generates enormous volumes of data. As IoT devices generate a vast volume of time-series data, the popularity of Time Series Databases (TSDBs) has grown alongside the rise of IoT. Time series databases are developed to manage and analyze huge amounts of time series data. However, it is not easy to choose the best one among them. The most popular benchmarks compare the performance of different databases to each other but use random or synthetic data that applies to only one domain. As a result, these benchmarks may not always accurately represent real-world performance. A comprehensive comparison of the performance of time series databases on real datasets is therefore needed. The experiment shows significant performance differences for data injection time and query execution time when comparing real and synthetic datasets. The results are reported and analyzed.
Submitted 30 August, 2022;
originally announced August 2022.
-
Unidirectional domain growth of hexagonal boron nitride thin films
Authors:
Abhijit Biswas,
Qiyuan Ruan,
Frank Lee,
Chenxi Li,
Sathvik Ajay Iyengar,
Anand B. Puthirath,
Xiang Zhang,
Harikishan Kannan,
Tia Gray,
A. Glen Birdwell,
Mahesh R. Neupane,
Pankaj B. Shah,
Dmitry A. Ruzmetov,
Tony G. Ivanov,
Robert Vajtai,
Manoj Tripathi,
Alan Dalton,
Boris I. Yakobson,
Pulickel M. Ajayan
Abstract:
Two-dimensional van der Waals (2D-vdW) layered hexagonal boron nitride (h-BN) has gained tremendous research interest over recent years due to its unconventional domain growth morphology, fascinating properties and application potential as an excellent dielectric layer for 2D-based nano-electronics. However, the unidirectional domain growth of h-BN thin films directly on insulating substrates remains significantly challenging because of high bonding anisotropy and growth kinetics that are more complex than those of conventional thin-film growth, resulting in the formation of randomly oriented domain morphology and hindering its usefulness in integrated nano-devices. Here, ultra-wide bandgap h-BN thin films are grown directly on low-miscut, atomically smooth, highly insulating c-plane sapphire substrates (without using any metal catalytic layer) by pulsed laser deposition, showing remarkable unidirectional triangular-shaped domain morphology. This unidirectional domain growth is attributed to the step-edge guided nucleation caused by reducing the film-substrate interfacial symmetry and energy, thereby breaking the degeneracy of nucleation sites of random domains, as revealed by density functional theory (DFT) calculations. Through extensive characterizations, we further demonstrate the excellent single crystal-like functional properties of the films. Our findings might pave the way for feasible large-area direct growth of electronic-quality h-BN thin films on insulating substrates for high-performance 2D-electronics, and in addition would be beneficial for hetero engineering of 2D-vdW materials with emergent phenomena.
Submitted 26 January, 2023; v1 submitted 19 August, 2022;
originally announced August 2022.
-
Unravelling the room temperature growth of two-dimensional h-BN nanosheets for multifunctional applications
Authors:
Abhijit Biswas,
Rishi Maiti,
Frank Lee,
Cecilia Y. Chen,
Tao Li,
Anand B. Puthirath,
Sathvik Ajay Iyengar,
Chenxi Li,
Xiang Zhang,
Harikishan Kannan,
Tia Gray,
Md Abid Shahriar Rahman Saadi,
Jacob Elkins,
A. Glen Birdwell,
Mahesh R. Neupane,
Pankaj B. Shah,
Dmitry A. Ruzmetov,
Tony G. Ivanov,
Robert Vajtai,
Yuji Zhao,
Alexander L. Gaeta,
Manoj Tripathi,
Alan Dalton,
Pulickel M. Ajayan
Abstract:
Room temperature growth of two-dimensional van der Waals (2D-vdW) materials is indispensable for state-of-the-art nanotechnology. Low-temperature growth removes the requirement of elevated growth temperatures and the high thermal budgets that accompany them. Moreover, for electronic applications, low or room temperature growth reduces the possibility of intrinsic film-substrate interfacial thermal diffusion, which deteriorates functional properties and consequent device performance. Here, we demonstrate the growth of ultrawide-bandgap boron nitride (BN) at room temperature by using the pulsed laser deposition (PLD) process and demonstrate various functionalities for potential applications. Comprehensive chemical, spectroscopic and microscopic characterization confirms the growth of ordered nanosheet-like hexagonal BN. Functionally, the nanosheets show hydrophobicity, high lubricity (low coefficient of friction), a low refractive index within the visible to near-infrared wavelength range, and room temperature single-photon quantum emission. Our work unveils an important step that opens up a plethora of potential applications for room temperature grown h-BN nanosheets, as the growth can be feasible on any given substrate, thus creating a scenario for h-BN on demand at a frugal thermal budget.
Submitted 12 October, 2023; v1 submitted 19 August, 2022;
originally announced August 2022.
-
Tradeoffs in Preventing Manipulation in Paper Bidding for Reviewer Assignment
Authors:
Steven Jecmen,
Nihar B. Shah,
Fei Fang,
Vincent Conitzer
Abstract:
Many conferences rely on paper bidding as a key component of their reviewer assignment procedure. These bids are then taken into account when assigning reviewers to help ensure that each reviewer is assigned to suitable papers. However, despite the benefits of using bids, reliance on paper bidding can allow malicious reviewers to manipulate the paper assignment for unethical purposes (e.g., getting assigned to a friend's paper). Several different approaches to preventing this manipulation have been proposed and deployed. In this paper, we enumerate certain desirable properties that algorithms for addressing bid manipulation should satisfy. We then offer a high-level analysis of various approaches along with directions for future investigation.
Submitted 22 July, 2022;
originally announced July 2022.
-
A Dataset on Malicious Paper Bidding in Peer Review
Authors:
Steven Jecmen,
Minji Yoon,
Vincent Conitzer,
Nihar B. Shah,
Fei Fang
Abstract:
In conference peer review, reviewers are often asked to provide "bids" on each submitted paper that express their interest in reviewing that paper. A paper assignment algorithm then uses these bids (along with other data) to compute a high-quality assignment of reviewers to papers. However, this process has been exploited by malicious reviewers who strategically bid in order to unethically manipulate the paper assignment, crucially undermining the peer review process. For example, these reviewers may aim to get assigned to a friend's paper as part of a quid-pro-quo deal. A critical impediment towards creating and evaluating methods to mitigate this issue is the lack of any publicly-available data on malicious paper bidding. In this work, we collect and publicly release a novel dataset to fill this gap, collected from a mock conference activity where participants were instructed to bid either honestly or maliciously. We further provide a descriptive analysis of the bidding behavior, including our categorization of different strategies employed by participants. Finally, we evaluate the ability of each strategy to manipulate the assignment, and also evaluate the performance of some simple algorithms meant to detect malicious bidding. The performance of these detection algorithms can be taken as a baseline for future research on detecting malicious bidding.
Submitted 10 March, 2023; v1 submitted 24 June, 2022;
originally announced July 2022.
-
Integrating Rankings into Quantized Scores in Peer Review
Authors:
Yusha Liu,
Yichong Xu,
Nihar B. Shah,
Aarti Singh
Abstract:
In peer review, reviewers are usually asked to provide scores for the papers. The scores are then used by Area Chairs or Program Chairs in various ways in the decision-making process. The scores are usually elicited in a quantized form to accommodate the limited cognitive ability of humans to describe their opinions in numerical values. It has been found that the quantized scores suffer from a large number of ties, thereby leading to a significant loss of information. To mitigate this issue, conferences have started to ask reviewers to additionally provide a ranking of the papers they have reviewed. There are however two key challenges. First, there is no standard procedure for using this ranking information and Area Chairs may use it in different ways (including simply ignoring it), thereby leading to arbitrariness in the peer-review process. Second, there are no suitable interfaces for judicious use of this data nor methods to incorporate it in existing workflows, thereby leading to inefficiencies. We take a principled approach to integrate the ranking information into the scores. The output of our method is an updated score pertaining to each review that also incorporates the rankings. Our approach addresses the two aforementioned challenges by: (i) ensuring that rankings are incorporated into the updated scores in the same manner for all papers, thereby mitigating arbitrariness, and (ii) allowing seamless use of existing interfaces and workflows designed for scores. We empirically evaluate our method on synthetic datasets as well as on peer reviews from the ICLR 2017 conference, and find that it reduces the error by approximately 30% as compared to the best performing baseline on the ICLR 2017 data.
Submitted 5 April, 2022;
originally announced April 2022.
-
Deep Speech Based End-to-End Automated Speech Recognition (ASR) for Indian-English Accents
Authors:
Priyank Dubey,
Bilal Shah
Abstract:
Automated Speech Recognition (ASR) is an interdisciplinary application of computer science and linguistics that enables us to derive a transcription from an uttered speech waveform. It finds several military applications, such as in high-performance fighter aircraft, helicopters, and air-traffic control. Beyond the military, speech recognition is used in healthcare, for persons with disabilities, and in many other areas. ASR has been an active research area, and several models and algorithms for speech to text (STT) have been proposed. One of the most recent is Mozilla Deep Speech, which is based on the Deep Speech research paper by Baidu. Deep Speech is a state-of-the-art speech recognition system developed using end-to-end deep learning; it is trained using a well-optimized Recurrent Neural Network (RNN) training system utilizing multiple Graphics Processing Units (GPUs). This training is mostly done using American-English accent datasets, which results in poor generalizability to other English accents. India is a land of vast diversity, which can be seen even in speech: there are several English accents that vary from state to state. In this work, we use a transfer learning approach with the most recent Deep Speech model, deepspeech-0.9.3, to develop an end-to-end speech recognition system for Indian-English accents. This work utilizes fine-tuning and data augmentation to further optimize and improve the Deep Speech ASR system. Indic TTS data of Indian-English accents is used for transfer learning and fine-tuning the pre-trained Deep Speech model. A general comparison is made among the untrained model, our trained model and other available speech recognition services for Indian-English accents.
Submitted 2 April, 2022;
originally announced April 2022.
-
To ArXiv or not to ArXiv: A Study Quantifying Pros and Cons of Posting Preprints Online
Authors:
Charvi Rastogi,
Ivan Stelmakh,
Xinwei Shen,
Marina Meila,
Federico Echenique,
Shuchi Chawla,
Nihar B. Shah
Abstract:
Double-blind conferences have engaged in debates over whether to allow authors to post their papers online on arXiv or elsewhere during the review process. Independently, some authors of research papers face the dilemma of whether to put their papers on arXiv due to its pros and cons. We conduct a study to substantiate this debate and dilemma via quantitative measurements. Specifically, we conducted surveys of reviewers in two top-tier double-blind computer science conferences -- ICML 2021 (5361 submissions and 4699 reviewers) and EC 2021 (498 submissions and 190 reviewers). Our two main findings are as follows. First, more than a third of the reviewers self-report searching online for a paper they are assigned to review. Second, outside the review process, we find that preprints from better-ranked affiliations see a weakly higher visibility, with a correlation of 0.06 in ICML and 0.05 in EC. In particular, papers associated with the top-10-ranked affiliations had a visibility of approximately 11% in ICML and 22% in EC, whereas the remaining papers had a visibility of 7% and 18% respectively.
Submitted 11 June, 2022; v1 submitted 31 March, 2022;
originally announced March 2022.
-
Cite-seeing and Reviewing: A Study on Citation Bias in Peer Review
Authors:
Ivan Stelmakh,
Charvi Rastogi,
Ryan Liu,
Shuchi Chawla,
Federico Echenique,
Nihar B. Shah
Abstract:
Citations play an important role in researchers' careers as a key factor in evaluation of scientific impact. Many anecdotes advise authors to exploit this fact and cite prospective reviewers to try to obtain a more positive evaluation for their submission. In this work, we investigate if such a citation bias actually exists: Does the citation of a reviewer's own work in a submission cause them to be positively biased towards the submission? In conjunction with the review process of two flagship conferences in machine learning and algorithmic economics, we execute an observational study to test for citation bias in peer review. In our analysis, we carefully account for various confounding factors such as paper quality and reviewer expertise, and apply different modeling techniques to alleviate concerns regarding model mismatch. Overall, our analysis involves 1,314 papers and 1,717 reviewers and detects citation bias in both venues we consider. In terms of the effect size, by citing a reviewer's work, a submission has a non-trivial chance of getting a higher score from the reviewer: the expected increase in the score is approximately 0.23 on a 5-point Likert item. For reference, a one-point increase of a score by a single reviewer improves the position of a submission by 11% on average.
Submitted 31 March, 2022;
originally announced March 2022.
-
Calibration with Privacy in Peer Review
Authors:
Wenxin Ding,
Gautam Kamath,
Weina Wang,
Nihar B. Shah
Abstract:
Reviewers in peer review are often miscalibrated: they may be strict, lenient, extreme, moderate, etc. A number of algorithms have previously been proposed to calibrate reviews. Such attempts of calibration can however leak sensitive information about which reviewer reviewed which paper. In this paper, we identify this problem of calibration with privacy, and provide a foundational building block to address it. Specifically, we present a theoretical study of this problem under a simplified-yet-challenging model involving two reviewers, two papers, and an MAP-computing adversary. Our main results establish the Pareto frontier of the tradeoff between privacy (preventing the adversary from inferring reviewer identity) and utility (accepting better papers), and design explicit computationally-efficient algorithms that we prove are Pareto optimal.
Submitted 26 January, 2022;
originally announced January 2022.
-
Strategyproofing Peer Assessment via Partitioning: The Price in Terms of Evaluators' Expertise
Authors:
Komal Dhull,
Steven Jecmen,
Pravesh Kothari,
Nihar B. Shah
Abstract:
Strategic behavior is a fundamental problem in a variety of real-world applications that require some form of peer assessment, such as peer grading of homeworks, grant proposal review, conference peer review of scientific papers, and peer assessment of employees in organizations. Since an individual's own work is in competition with the submissions they are evaluating, they may provide dishonest evaluations to increase the relative standing of their own submission. This issue is typically addressed by partitioning the individuals and assigning them to evaluate the work of only those from different subsets. Although this method ensures strategyproofness, each submission may require a different type of expertise for effective evaluation. In this paper, we focus on finding an assignment of evaluators to submissions that maximizes assigned evaluators' expertise subject to the constraint of strategyproofness. We analyze the price of strategyproofness: that is, the amount of compromise on the assigned evaluators' expertise required in order to get strategyproofness. We establish several polynomial-time algorithms for strategyproof assignment along with assignment-quality guarantees. Finally, we evaluate the methods on a dataset from conference peer review.
Submitted 28 August, 2022; v1 submitted 25 January, 2022;
originally announced January 2022.
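The partitioning constraint described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes each individual authors exactly one submission, ignores evaluator load limits, and simply picks, for each submission, the highest-expertise evaluators from other groups. The names `partitioned_assignment`, `total_expertise`, the `expertise` matrix and the group labels are all hypothetical.

```python
def partitioned_assignment(expertise, group, k):
    """Pick k evaluators per submission from outside the author's group.

    expertise[e][s] is the (assumed) expertise of evaluator e on submission s,
    where submission s is authored by individual s. group[i] is the partition
    label of individual i. Load limits on evaluators are ignored in this sketch.
    """
    n = len(group)
    assignment = {}
    for s in range(n):
        # Only evaluators from a different subset than the author are eligible.
        eligible = [e for e in range(n) if group[e] != group[s]]
        eligible.sort(key=lambda e: expertise[e][s], reverse=True)
        assignment[s] = eligible[:k]
    return assignment


def total_expertise(expertise, assignment):
    """Sum of expertise over all assigned (evaluator, submission) pairs."""
    return sum(expertise[e][s] for s, evs in assignment.items() for e in evs)


# Illustrative data: four individuals, made-up expertise values.
expertise = [
    [0, 9, 1, 2],
    [8, 0, 3, 1],
    [2, 4, 0, 7],
    [5, 1, 6, 0],
]
partitioned = partitioned_assignment(expertise, [0, 0, 1, 1], k=1)
# Baseline: every individual in their own group, so only self-review is excluded.
baseline = partitioned_assignment(expertise, [0, 1, 2, 3], k=1)
gap = total_expertise(expertise, baseline) - total_expertise(expertise, partitioned)
```

Here `gap` is one simple way to express the expertise given up by respecting the partition on this toy instance; the paper's formal definition of the price of strategyproofness may differ.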
-
Frequency Centric Defense Mechanisms against Adversarial Examples
Authors:
Sanket B. Shah,
Param Raval,
Harin Khakhi,
Mehul S. Raval
Abstract:
An adversarial example (AE) aims at fooling a Convolutional Neural Network by introducing small perturbations in the input image. The proposed work uses the magnitude and phase of the Fourier spectrum and the entropy of the image to defend against AEs. We demonstrate the defense in two ways: by training an adversarial detector and by denoising the adversarial effect. Experiments were conducted on the low-resolution CIFAR-10 and high-resolution ImageNet datasets. The adversarial detector has 99% accuracy for FGSM and PGD attacks on the CIFAR-10 dataset. However, the detection accuracy falls to 50% for the more sophisticated DeepFool and Carlini & Wagner attacks on ImageNet. We overcome this limitation by using an autoencoder and show that 70% of AEs are correctly classified after denoising.
Submitted 26 October, 2021;
originally announced October 2021.
-
Low-Reynolds-number aerodynamic characteristics of airfoils with piezocomposite trailing control surfaces
Authors:
Kai Zhang,
Bharg Shah,
Onur Bilgen
Abstract:
Morphing wings composed of fixed leading sections with piezocomposite trailing control surfaces have emerged as a novel active control technique for unmanned aerial vehicles. However, the wake dynamics and aerodynamic performance of such hybrid airfoil configurations have not been thoroughly investigated. In this paper, direct numerical simulations of two-dimensional flows over hybrid airfoils composed of NACA 0012 leading sections with piezocomposite trailing control surfaces are performed at a fixed Reynolds number of 1000. The effects of length and camber of the trailing control surface on the laminar aerodynamic characteristics are studied over a wide range of angles of attack. It is shown that the flow behind the airfoil exhibits different features, including steady flow, periodic vortex shedding, and quasi-periodic vortex shedding for different configurations. The transition between these wake states occurs at slightly smaller angles of attack compared to the nominal NACA 0012 airfoil. While the drag coefficients of the different configurations remain close to each other at a fixed angle of attack, the lift coefficient of the hybrid airfoil is positively affected by the length and camber of the trailing control surface. The mechanism of lift generation is examined by surface pressure distributions and a force element analysis. It is revealed that with increased camber of the trailing control surface, the flow on both the suction and pressure sides of the airfoil is modified in a beneficial way to enhance lift. Increasing the length ratio significantly modifies only the flow near the aft section on the pressure side. The results herein provide a laminar aerodynamic characterization of hybrid airfoils with trailing control surfaces, and could potentially aid the design of control techniques for next-generation small unmanned aerial vehicles.
Submitted 21 September, 2021;
originally announced September 2021.
-
Near-Optimal Reviewer Splitting in Two-Phase Paper Reviewing and Conference Experiment Design
Authors:
Steven Jecmen,
Hanrui Zhang,
Ryan Liu,
Fei Fang,
Vincent Conitzer,
Nihar B. Shah
Abstract:
Many scientific conferences employ a two-phase paper review process, where some papers are assigned additional reviewers after the initial reviews are submitted. Many conferences also design and run experiments on their paper review process, where some papers are assigned reviewers who provide reviews under an experimental condition. In this paper, we consider the question: how should reviewers be divided between phases or conditions in order to maximize total assignment similarity? We make several contributions towards answering this question. First, we prove that when the set of papers requiring additional review is unknown, a simplified variant of this problem is NP-hard. Second, we empirically show that across several datasets pertaining to real conference data, dividing reviewers between phases/conditions uniformly at random allows an assignment that is nearly as good as the oracle optimal assignment. This uniformly random choice is practical for both the two-phase and conference experiment design settings. Third, we provide explanations of this phenomenon by providing theoretical bounds on the suboptimality of this random strategy under certain natural conditions. From these easily-interpretable conditions, we provide actionable insights to conference program chairs about whether a random reviewer split is suitable for their conference.
Submitted 13 August, 2021;
originally announced August 2021.
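The comparison between a uniformly random reviewer split and the oracle split, as described in this abstract, can be sketched on a toy instance. This is only an illustration under simplifying assumptions (one reviewer per paper, each reviewer used at most once, brute-force search), not the authors' method or data; all function names and the `sim` matrix are hypothetical.

```python
import random
from itertools import combinations, permutations


def best_matching_value(sim, papers, reviewers):
    """Brute-force maximum total similarity, one distinct reviewer per paper.

    sim[r][p] is the (assumed) similarity between reviewer r and paper p.
    """
    if len(reviewers) < len(papers):
        return float("-inf")  # infeasible: not enough reviewers
    return max(
        sum(sim[r][p] for r, p in zip(chosen, papers))
        for chosen in permutations(reviewers, len(papers))
    )


def split_value(sim, phase1_papers, phase2_papers, phase1_reviewers, reviewers):
    """Total similarity when phase1_reviewers serve phase 1 and the rest phase 2."""
    r1 = [r for r in reviewers if r in phase1_reviewers]
    r2 = [r for r in reviewers if r not in phase1_reviewers]
    return (best_matching_value(sim, phase1_papers, r1)
            + best_matching_value(sim, phase2_papers, r2))


def random_split(reviewers, size, rng):
    """Choose the phase-1 reviewers uniformly at random."""
    return set(rng.sample(reviewers, size))


def oracle_value(sim, phase1_papers, phase2_papers, reviewers, size):
    """Best achievable value over every possible split of the given size."""
    return max(
        split_value(sim, phase1_papers, phase2_papers, set(c), reviewers)
        for c in combinations(reviewers, size)
    )


# Illustrative similarities: 4 reviewers (rows) by 4 papers (columns).
sim = [
    [9, 1, 2, 3],
    [2, 8, 1, 4],
    [3, 2, 7, 1],
    [1, 3, 2, 6],
]
reviewers = [0, 1, 2, 3]
rng = random.Random(0)
chosen = random_split(reviewers, 2, rng)
random_val = split_value(sim, [0, 1], [2, 3], chosen, reviewers)
oracle_val = oracle_value(sim, [0, 1], [2, 3], reviewers, 2)
```

On any instance `random_val` is at most `oracle_val`; how close the two are in practice is the empirical question the paper studies, and this toy instance says nothing about that.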
-
Fair Decision-Making for Food Inspections
Authors:
Shubham Singh,
Bhuvni Shah,
Chris Kanich,
Ian A. Kash
Abstract:
Data and algorithms are essential and complementary parts of a large-scale decision-making process. However, their injudicious use can lead to unforeseen consequences, as has been observed by researchers and activists alike in the recent past. In this paper, we revisit the application of predictive models by the Chicago Department of Public Health to schedule restaurant inspections and prioritize the detection of critical food code violations. We perform the first analysis of the model's fairness to the population served by the restaurants in terms of average time to find a critical violation. We find that the model treats inspections unequally based on the sanitarian who conducted the inspection and that, in turn, there are geographic disparities in the benefits of the model. We examine four alternate methods of model training and two alternative ways of scheduling using the model and find that the latter generate more desirable results. The challenges from this application point to important directions for future work around fairness with collective entities rather than individuals, the use of critical violations as a proxy, and the disconnect between fair classification and fairness in the dynamic scheduling system.
Submitted 9 March, 2022; v1 submitted 12 August, 2021;
originally announced August 2021.
-
Orthogonal and Non-Orthogonal Signal Representations Using New Transformation Matrices Having NPM Structure
Authors:
Shaik Basheeruddin Shah,
Vijay Kumar Chakka,
Arikatla Satyanarayana Reddy
Abstract:
In this paper, we introduce two types of real-valued sums known as Complex Conjugate Pair Sums (CCPSs), denoted as CCPS$^{(1)}$ and CCPS$^{(2)}$, and discuss a few of their properties. Using each type of CCPS and its circular shifts, we construct two non-orthogonal Nested Periodic Matrices (NPMs). As NPMs are non-singular, this introduces two non-orthogonal transforms known as Complex Conjugate Periodic Transforms (CCPTs), denoted as CCPT$^{(1)}$ and CCPT$^{(2)}$. We propose another NPM, which uses both types of CCPSs such that its columns are mutually orthogonal; this transform is known as Orthogonal CCPT (OCCPT). After a brief study of a few OCCPT properties like periodicity, circular shift, etc., we present two different interpretations of it. Further, we propose a Decimation-In-Time (DIT) based fast computation algorithm for OCCPT (termed FOCCPT), applicable whenever the length of the signal is equal to $2^v,\ v \in \mathbb{N}$. The proposed sums and transforms are inspired by Ramanujan sums and the Ramanujan Period Transform (RPT). Finally, we show that the period (both divisor and non-divisor) and frequency information of a signal can be estimated using the proposed transforms with a significant reduction in computational complexity over the Discrete Fourier Transform (DFT).
Submitted 20 June, 2021;
originally announced July 2021.
-
On Complex Conjugate Pair Sums and Complex Conjugate Subspaces
Authors:
Shaik Basheeruddin Shah,
Vijay Kumar Chakka,
Arikatla Satyanarayana Reddy
Abstract:
In this letter, we study a few properties of Complex Conjugate Pair Sums (CCPSs) and Complex Conjugate Subspaces (CCSs). Initially, we consider an LTI system whose impulse response is one period of a CCPS. For a given input x(n), we prove that the output of this system is equivalent to computing the first order derivative of x(n). Further, with some constraints on the impulse response, the system output is also equivalent to the second order derivative. With this, we show that finer edge detection in an image can be achieved using CCPSs as the impulse response than using Ramanujan Sums (RSs). Later, the computation of projections onto a CCS is studied. Here the projection matrix has a circulant structure, which makes the computation of projections easier. Finally, we prove that a CCS is shift-invariant and closed under the operation of circular cross-correlation.
Submitted 5 June, 2021;
originally announced June 2021.
-
A New Signal Representation Using Complex Conjugate Pair Sums
Authors:
Shaik Basheeruddin Shah,
Vijay Kumar Chakka,
Arikatla Satyanarayana Reddy
Abstract:
This letter introduces a real-valued summation known as the Complex Conjugate Pair Sum (CCPS). The space spanned by a CCPS and its one circular downshift is called a Complex Conjugate Subspace (CCS). For a given positive integer $N\geq3$, there exist $\frac{\varphi(N)}{2}$ CCPSs forming $\frac{\varphi(N)}{2}$ CCSs, where $\varphi(N)$ is Euler's totient function. We prove that these CCSs are mutually orthogonal and that their direct sum forms a $\varphi(N)$-dimensional subspace $s_N$ of $\mathbb{C}^N$. We propose that any signal of finite length $N$ can be represented as a linear combination of elements from a special basis of $s_d$, for each divisor $d$ of $N$. This defines a new transform named the Complex Conjugate Periodic Transform (CCPT). We then compare CCPT with the DFT (Discrete Fourier Transform) and the RPT (Ramanujan Periodic Transform). We show that CCPT can be used to estimate the period, hidden periods, and frequency information of a signal, whereas RPT does not provide the frequency information. For a complex-valued input signal, CCPT offers a computational benefit over DFT. A CCPT dictionary-based method is proposed to extract non-divisor period information.
Submitted 20 June, 2021;
originally announced June 2021.
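The dimension count behind the CCPT entry above can be checked numerically. The following is a minimal sketch, not the authors' code: the abstract states that $s_N$ has dimension $\varphi(N)$ for $N\geq3$, and the sketch assumes (this is not stated in the abstract) that $s_1$ and $s_2$ are one-dimensional, matching $\varphi(1)=\varphi(2)=1$. Under that assumption, Gauss's identity $\sum_{d\mid N}\varphi(d)=N$ shows that the subspaces $s_d$ over all divisors $d$ of $N$ together have exactly $N$ dimensions. The function names are hypothetical.

```python
from math import gcd


def totient(n: int) -> int:
    """Euler's totient: count of k in 1..n with gcd(k, n) == 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def divisors(n: int) -> list[int]:
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def ccs_count(n: int) -> int:
    """Number of CCSs for a given N >= 3, i.e. phi(N) / 2 as stated in the abstract."""
    if n < 3:
        raise ValueError("the abstract defines this count only for N >= 3")
    return totient(n) // 2


def total_dimension(n: int) -> int:
    """Sum of phi(d) over divisors d of n (assumes dim s_d = phi(d) for every d)."""
    return sum(totient(d) for d in divisors(n))


if __name__ == "__main__":
    for n in (3, 8, 12, 30):
        print(n, ccs_count(n), total_dimension(n))
```

For every length tried, `total_dimension(n)` equals `n`, which is consistent with the claim that a length-$N$ signal can be expanded over bases of $s_d$ for the divisors $d$ of $N$.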
-
A Reactive Molecular Dynamics Study of Hydrogenation on Diamond Surfaces
Authors:
Eliezer F. Oliveira,
Mahesh R. Neupane,
Chenxi Li,
Harikishan Kannan,
Xiang Zhang,
Anand B. Puthirath,
Pankaj B. Shah,
A. Glen Birdwell,
Tony G. Ivanov,
Robert Vajtai,
Douglas S. Galvao,
Pulickel M. Ajayan
Abstract:
Hydrogenated diamond has been regarded as a promising material in electronic device applications, especially in field-effect transistors (FETs). However, the quality of diamond hydrogenation has not yet been established, nor has the specific orientation that would provide the optimum hydrogen coverage. In addition, most theoretical work in the literature uses models with 100% hydrogenated diamond surfaces to study electronic properties, which is far from the experimentally observed hydrogen coverage. In this work, we have carried out a detailed study using fully atomistic reactive molecular dynamics (MD) simulations on low-index diamond surfaces, i.e., (001), (013), (110), (113), and (111), to evaluate the quality and hydrogenation thresholds on different diamond surfaces and their possible effects on electronic properties. Our simulation results indicate that 100% hydrogenation of these surfaces is hard to achieve because of the steric repulsion between the terminating hydrogen atoms. Among all the considered surfaces, the (001), (110), and (113) surfaces incorporate a larger number of hydrogen atoms and passivate the surface dangling bonds. Our results on hydrogen stability also suggest that these surfaces with optimum hydrogen coverage are robust under extreme conditions and could provide homogeneous p-type surface conductivity in the diamond surfaces, a key requirement for high-field, high-frequency device applications.
Submitted 25 May, 2021;
originally announced May 2021.
-
Composing Modeling and Simulation with Machine Learning in Julia
Authors:
Chris Rackauckas,
Ranjan Anantharaman,
Alan Edelman,
Shashi Gowda,
Maja Gwozdz,
Anand Jain,
Chris Laughman,
Yingbo Ma,
Francesco Martinuzzi,
Avik Pal,
Utkarsh Rajput,
Elliot Saba,
Viral B. Shah
Abstract:
In this paper we introduce JuliaSim, a high-performance programming environment designed to blend traditional modeling and simulation with machine learning. JuliaSim can build accelerated surrogates from component-based models, such as those conforming to the FMI standard, using continuous-time echo state networks (CTESN). The foundation of this environment, ModelingToolkit.jl, is an acausal modeling language which can compose the trained surrogates as components within its staged compilation process. As a complement, we present the JuliaSim model library, a standard library with differential-algebraic equations and pre-trained surrogates, which can be composed using the modeling system for design, optimization, and control. We demonstrate the effectiveness of the surrogate-accelerated modeling and simulation approach on HVAC dynamics by showing that the CTESN surrogates accurately capture the dynamics of an HVAC cycle at less than 4% error while accelerating its simulation by 340x. We illustrate the use of surrogate acceleration in the design process via global optimization of simulation parameters using the embedded surrogate, yielding a speedup of two orders of magnitude in finding the optimum. We showcase the surrogate deployed in a co-simulation loop, as a drop-in replacement for one of the coupled FMUs, allowing engineers to effectively explore the design space of a coupled system. Together, these results demonstrate a workflow for automating the integration of machine learning techniques into traditional modeling and simulation processes.
Submitted 12 May, 2021;
originally announced May 2021.