-
Coherent control of a triangular exchange-only spin qubit
Authors:
Edwin Acuna,
Joseph D. Broz,
Kaushal Shyamsundar,
Antonio B. Mei,
Colin P. Feeney,
Valerie Smetanka,
Tiffany Davis,
Kangmu Lee,
Maxwell D. Choi,
Brydon Boyd,
June Suh,
Wonill D. Ha,
Cameron Jennings,
Andrew S. Pan,
Daniel S. Sanchez,
Matthew D. Reed,
Jason R. Petta
Abstract:
We demonstrate coherent control of a three-electron exchange-only spin qubit with the quantum dots arranged in a close-packed triangular geometry. The device is tuned to confine one electron in each quantum dot, as evidenced by pairwise charge stability diagrams. Time-domain control of the exchange coupling is demonstrated and qubit performance is characterized using blind randomized benchmarking, with an average single-qubit gate fidelity F = 99.84%. The compact triangular device geometry can be readily scaled to larger two-dimensional quantum dot arrays with high connectivity.
Submitted 5 June, 2024;
originally announced June 2024.
-
GotFunding: A grant recommendation system based on scientific articles
Authors:
Tong Zeng,
Daniel E. Acuna
Abstract:
Obtaining funding is an important part of becoming a successful scientist. Junior faculty spend a great deal of time finding the right agencies and programs that best match their research profile. But what are the factors that influence the best publication--grant matching? Some universities might employ pre-award personnel to understand these factors, but not all institutions can afford to hire them. Historical records of publications funded by grants can help us understand the matching process and also help us develop recommendation systems to automate it. In this work, we present GotFunding (Grant recOmmendaTion based on past FUNDING), a recommendation system trained on the National Institutes of Health's (NIH) grant--publication records. Our system achieves high performance (NDCG@1 = 0.945) by casting the problem as learning to rank. By analyzing the features that make predictions effective, we show that the ranking weighs most heavily 1) the year difference between publication and grant, 2) the amount of information provided in the publication, and 3) the relevance of the publication to the grant. We discuss future improvements of the system and an online tool for scientists to try.
Submitted 21 May, 2024;
originally announced May 2024.
-
Modeling citation worthiness by using attention-based bidirectional long short-term memory networks and interpretable models
Authors:
Tong Zeng,
Daniel E. Acuna
Abstract:
Scientists learn early on how to cite scientific sources to support their claims. Sometimes, however, scientists have challenges determining where a citation should be situated -- or, even worse, fail to cite a source altogether. Automatically detecting sentences that need a citation (i.e., citation worthiness) could solve both of these issues, leading to more robust and well-constructed scientific arguments. Previous researchers have applied machine learning to this task but have used small datasets and models that do not take advantage of recent algorithmic developments such as attention mechanisms in deep learning. We hypothesize that we can develop significantly accurate deep learning architectures that learn from large supervised datasets constructed from open access publications. In this work, we propose a Bidirectional Long Short-Term Memory (BiLSTM) network with attention mechanism and contextual information to detect sentences that need citations. We also produce a new, large dataset (PMOA-CITE) based on PubMed Open Access Subset, which is orders of magnitude larger than previous datasets. Our experiments show that our architecture achieves state-of-the-art performance on the standard ACL-ARC dataset ($F_{1}=0.507$) and exhibits high performance ($F_{1}=0.856$) on the new PMOA-CITE. Moreover, we show that it can transfer learning across these datasets. We further use interpretable models to illuminate how specific language is used to promote and inhibit citations. We discover that sections and surrounding sentences are crucial for our improved predictions. We further examined purported mispredictions of the model, and uncovered systematic human mistakes in citation behavior and source data. This opens the door for our model to check documents during pre-submission and pre-archival procedures. We make this new dataset, the code, and a web-based tool available to the community.
Submitted 20 May, 2024;
originally announced May 2024.
-
The complementary contributions of academia and industry to AI research
Authors:
Lizhen Liang,
Han Zhuang,
James Zou,
Daniel E. Acuna
Abstract:
Artificial intelligence (AI) has seen tremendous development in industry and academia. However, striking recent advances by industry have stunned the world, inviting a fresh perspective on the role of academic research in this field. Here, we characterize the impact and type of AI produced by both environments over the last 25 years and establish several patterns. We find that articles published by teams consisting exclusively of industry researchers tend to get greater attention, with a higher chance of being highly cited and citation-disruptive, and several times more likely to produce state-of-the-art models. In contrast, we find that exclusively academic teams publish the bulk of AI research and tend to produce higher novelty work, with single papers having several times higher likelihood of being unconventional and atypical. The respective impact-novelty advantages of industry and academia are robust to controls for subfield, team size, seniority, and prestige. We find that academic-industry collaborations struggle to replicate the novelty of academic teams and tend to look similar to industry teams. Together, our findings identify the unique and nearly irreplaceable contributions that both academia and industry make toward the healthy progress of AI.
Submitted 3 January, 2024;
originally announced January 2024.
-
Effects of Same-Race Mentorship Preferences on Academic Performance and Survival
Authors:
Meijun Liu,
Yi Bu,
Daifeng Li,
Ying Ding,
Daniel E. Acuna
Abstract:
Same-race mentorship preference refers to mentors or mentees forming connections significantly influenced by a shared race. Although racial diversity in science has been well-studied and linked to favorable outcomes, the extent and effects of same-race mentorship preferences remain largely underexplored. Here, we analyze 465,355 mentor-mentee pairs from more than 60 research areas over the last 70 years to investigate the effect of same-race mentorship preferences on mentees' academic performance and survival. We use causal inference and statistical matching to measure same-race mentorship preferences while accounting for racial demographic variations across institutions, time periods, and research fields. Our findings reveal a pervasive same-race mentorship propensity across races, fields, and universities of varying research intensity. We observe an increase in same-race mentorship propensity over the years, further reinforced inter-generationally within a mentorship lineage. This propensity is more pronounced for minorities (Asians, Blacks, and Hispanics). Our results reveal that mentees under the supervision of mentors with high same-race propensity experience significantly lower productivity, impact, and collaboration reach during and after training, ultimately leading to a 27.6% reduced likelihood of remaining in academia. In contrast, a mentorship approach devoid of racial propensity appears to offer the best prospects for academic performance and persistence. These findings underscore the importance of mentorship diversity for academic success and shed light on factors contributing to minority underrepresentation in science.
Submitted 4 May, 2024; v1 submitted 13 October, 2023;
originally announced October 2023.
-
The Impact of Heterogeneous Shared Leadership in Scientific Teams
Authors:
Huimin Xu,
Meijun Liu,
Yi Bu,
Shu**g Sun,
Yi Zhang,
Chenwei Zhang,
Daniel E. Acuna,
Steven Gray,
Eric Meyer,
Ying Ding
Abstract:
Leadership is evolving dynamically from an individual endeavor to shared efforts. This paper aims to advance our understanding of shared leadership in scientific teams. We define three kinds of leaders, junior (10-15), mid (15-20), and senior (20+) based on career age. By considering the combinations of any two leaders, we distinguish shared leadership as heterogeneous when leaders are in different age cohorts and homogeneous when leaders are in the same age cohort. Drawing on 1,845,351 CS, 254,039 Sociology, and 193,338 Business teams with two leaders in the OpenAlex dataset, we identify that heterogeneous shared leadership brings higher citation impact for teams than homogeneous shared leadership. Specifically, when junior leaders are paired with senior leaders, it significantly increases team citation ranking by 1-2%, in comparison with two leaders of similar age. We explore the patterns between homogeneous leaders and heterogeneous leaders from team scale, expertise composition, and knowledge recency perspectives. Compared with homogeneous leaders, heterogeneous leaders are more adaptive in large teams, have more diverse expertise, and trace both the newest and oldest references.
Submitted 27 June, 2023;
originally announced June 2023.
-
Paraphrase Identification with Deep Learning: A Review of Datasets and Methods
Authors:
Chao Zhou,
Cheng Qiu,
Daniel E. Acuna
Abstract:
The rapid advancement of AI technology has made text generation tools like GPT-3 and ChatGPT increasingly accessible, scalable, and effective. This can pose a serious threat to the credibility of various forms of media, including scientific literature and news sources, if these technologies are used for plagiarism. Despite the development of automated methods for paraphrase identification, detecting this type of plagiarism remains a challenge due to the disparate nature of the datasets on which these methods are trained. In this study, we review traditional and current approaches to paraphrase identification and propose a refined typology of paraphrases. We also investigate how this typology is represented in popular datasets and how under-representation of certain types of paraphrases impacts detection capabilities. Finally, we outline new directions for future research and datasets in the pursuit of more effective paraphrase detection using AI.
Submitted 13 December, 2022;
originally announced December 2022.
-
Primal Conjecture in Matrix $E_φ^9$
Authors:
Paul Marrero,
Eduardo Acuña
Abstract:
In this paper we propose a conjecture about integer solutions to any equations, based on Primal algebra; specifically, this conjecture is a corollary of the Acuña Theorem in that article. We also propose some problems that could be solved if the conjecture is correct.
Submitted 30 May, 2022; v1 submitted 18 May, 2022;
originally announced May 2022.
-
Predicting the longevity of resources shared in scientific publications
Authors:
Daniel E. Acuna,
Jian Jian,
Tong Zeng,
Lizhen Liang,
Han Zhuang
Abstract:
Research has shown that most resources shared in articles (e.g., URLs to code or data) are not kept up to date and mostly disappear from the web after some years (Zeng et al., 2019). Little is known about the factors that differentiate and predict the longevity of these resources. This article explores a range of explanatory features related to the publication venue, authors, references, and where the resource is shared. We analyze an extensive repository of publications and, through web archival services, reconstruct how they looked at different time points. We discover that the most important factors are related to where and how the resource is shared, and surprisingly little is explained by the author's reputation or prestige of the journal. By examining the places where long-lasting resources are shared, we suggest that it is critical to disseminate and create standards with modern technologies. Finally, we discuss implications for reproducibility and recognizing scientific datasets as first-class citizens.
Submitted 23 March, 2022;
originally announced March 2022.
-
Computable Model Discovery and High-Level-Programming Approximations to Algorithmic Complexity
Authors:
Vladimir Lemusa,
Eduardo Acuña,
Víctor Zamora,
Francisco Hernandez-Quiroz,
Hector Zenil
Abstract:
Motivated by algorithmic information theory, the problem of program discovery can help find candidates of underlying generative mechanisms of natural and artificial phenomena. The uncomputability of such inverse problem, however, significantly restricts a wider application of exhaustive methods. Here we present a proof of concept of an approach based on IMP, a high-level imperative programming language. Its main advantage is that conceptually complex computational routines are more succinctly expressed, unlike lower-level models such as Turing machines or cellular automata. We investigate if a more expressive higher-level programming language can be more efficient at generating approximations to algorithmic complexity of recursive functions, often of particular mathematical interest.
Submitted 22 December, 2021;
originally announced December 2021.
-
EILEEN: A recommendation system for scientific publications and grants
Authors:
Daniel E. Acuna,
Kartik Nagre,
Priya Matnani
Abstract:
Finding relevant scientific articles is crucial for advancing knowledge. Recommendation systems are helpful for this purpose, although they have only been applied to science recently. This article describes EILEEN (Exploratory Innovator of LitEraturE Networks), a recommendation system for scientific publications and grants with open source code and datasets. We describe EILEEN's architecture for ingesting and processing documents and modeling the recommendation system and keyphrase estimator. Using a unique dataset of log-in user behavior, we validate our recommendation system against Latent Semantic Analysis (LSA) and the standard ranking from Elasticsearch (Lucene scoring). We find that a learning-to-rank model with Random Forest achieves an AUC of 0.9, significantly outperforming both baselines. Our results suggest that we can substantially improve science recommendations and learn about scientists' behavior through their search behavior. We make our system available through eileen.io.
Submitted 23 March, 2022; v1 submitted 18 October, 2021;
originally announced October 2021.
-
A flexible design platform for Si/SiGe exchange-only qubits with low disorder
Authors:
Wonill Ha,
Sieu D. Ha,
Maxwell D. Choi,
Yan Tang,
Adele E. Schmitz,
Mark P. Levendorf,
Kangmu Lee,
James M. Chappell,
Tower S. Adams,
Daniel R. Hulbert,
Edwin Acuna,
Ramsey S. Noah,
Justine W. Matten,
Michael P. Jura,
Jeffrey A. Wright,
Matthew T. Rakher,
Matthew G. Borselli
Abstract:
Spin-based silicon quantum dots are an attractive qubit technology for quantum information processing with respect to coherence time, control, and engineering. Here we present an exchange-only Si qubit device platform that combines the throughput of CMOS-like wafer processing with the versatility of direct-write lithography. The technology, which we coin "SLEDGE," features dot-shaped gates that are patterned simultaneously on one topographical plane and subsequently connected by vias to interconnect metal lines. The process design enables non-trivial layouts as well as flexibility in gate dimensions, material selection, and additional device features such as for rf qubit control. We show that the SLEDGE process has reduced electrostatic disorder with respect to traditional overlapping gate devices with lift-off metallization, and we present spin coherent exchange oscillations and single qubit blind randomized benchmarking data.
Submitted 22 July, 2021;
originally announced July 2021.
-
A dataset of mentorship in science with semantic and demographic estimations
Authors:
Qing Ke,
Lizhen Liang,
Ying Ding,
Stephen V. David,
Daniel E. Acuna
Abstract:
Mentorship in science is crucial for topic choice, career decisions, and the success of mentees and mentors. Typically, researchers who study mentorship use article co-authorship and doctoral dissertation datasets. However, available datasets of this type focus on narrow selections of fields and miss out on early career and non-publication-related interactions. Here, we describe MENTORSHIP, a crowdsourced dataset of 743176 mentorship relationships among 738989 scientists across 112 fields that avoids these shortcomings. We enrich the scientists' profiles with publication data from the Microsoft Academic Graph and "semantic" representations of research using deep learning content analysis. Because gender and race have become critical dimensions when analyzing mentorship and disparities in science, we also provide estimations of these factors. We perform extensive validations of the profile--publication matching, semantic content, and demographic inferences. We anticipate this dataset will spur the study of mentorship in science and deepen our understanding of its role in scientists' career outcomes.
Submitted 11 June, 2021;
originally announced June 2021.
-
Existential refinement on the search of integer solutions for the diophantine equation $x^3+y^3+z^3=n$
Authors:
Samuel Flores,
Eduardo Acuña,
Paul Marrero
Abstract:
We propose a new algorithm, called Sam, to determine the existence of solutions to the equation $x^3+y^3+z^3=n$ for a fixed value $n > 0$ unknown.
Submitted 12 May, 2022; v1 submitted 25 March, 2021;
originally announced March 2021.
-
Detuning Axis Pulsed Spectroscopy of Valley-Orbital States in Si/SiGe Quantum Dots
Authors:
Edward H. Chen,
Kate Raach,
Andrew Pan,
Andrey A. Kiselev,
Edwin Acuna,
Jacob Z. Blumoff,
Teresa Brecht,
Maxwell Choi,
Wonill Ha,
Daniel Hulbert,
Michael P. Jura,
Tyler Keating,
Ramsey Noah,
Bo Sun,
Bryan J. Thomas,
Matthew Borselli,
C. A. C. Jackson,
Matthew T. Rakher,
Richard S. Ross
Abstract:
Silicon quantum dot qubits must contend with low-lying valley excited states which are sensitive functions of the quantum well heterostructure and disorder; quantifying and maximizing the energies of these states are critical to improving device performance. We describe a spectroscopic method for probing excited states in isolated Si/SiGe double quantum dots using standard baseband pulsing techniques, easing the extraction of energy spectra in multiple-dot devices. We use this method to measure dozens of valley excited state energies spanning multiple wafers, quantum dots, and orbital states, crucial for evaluating the dependence of valley splitting on quantum well width and other epitaxial conditions. Our results suggest that narrower wells can be beneficial for improving valley splittings, but this effect can be confounded by variations in growth and fabrication conditions. These results underscore the importance of valley splitting measurements for guiding the development of Si qubits.
Submitted 26 February, 2021; v1 submitted 9 October, 2020;
originally announced October 2020.
-
Magnetic Gradient Fluctuations from Quadrupolar $^{73}$Ge in Si/SiGe Exchange-Only Qubits
Authors:
J. Kerckhoff,
B. Sun,
B. H. Fong,
C. Jones,
A. A. Kiselev,
D. W. Barnes,
R. S. Noah,
E. Acuna,
M. Akmal,
S. D. Ha,
J. A. Wright,
B. J. Thomas,
C. A. C. Jackson,
L. F. Edge,
K. Eng,
R. S. Ross,
T. D. Ladd
Abstract:
We study the time-fluctuating magnetic gradient noise mechanisms in pairs of Si/SiGe quantum dots using exchange echo noise spectroscopy. We find through a combination of spectral inversion and correspondence to theoretical modeling that quadrupolar precession of the $^{73}$Ge nuclei plays a key role in the spin-echo decay time $T_2$, with a characteristic dependence on magnetic field and the width of the Si quantum well. The $^{73}$Ge noise peaks appear at the fundamental and first harmonic of the $^{73}$Ge Larmor resonance, superimposed over $1/f$ noise due to $^{29}$Si dipole-dipole dynamics, and are dependent on material epitaxy and applied magnetic field. These results may inform the needs of dynamical decoupling when using Si/SiGe quantum dots as qubits in quantum information processing devices.
Submitted 17 September, 2020;
originally announced September 2020.
-
Estimating a Null Model of Scientific Image Reuse to Support Research Integrity Investigations
Authors:
Daniel E. Acuna,
Ziyue Xiang
Abstract:
When there is a suspicious figure reuse case in science, research integrity investigators often find it difficult to rebut authors claiming that "it happened by chance". In other words, when there is a "collision" of image features, it is difficult to justify whether it appears rarely or not. In this article, we provide a method to predict the rarity of an image feature by statistically estimating the chance of it randomly occurring across all scientific imagery. Our method is based on high-dimensional density estimation of ORB features using 7+ million images in the PubMed Open Access Subset dataset. We show that this method can lead to meaningful feedback during research integrity investigations by providing a null hypothesis for scientific image reuse and thus a p-value during deliberations. We apply the model to a sample of increasingly complex imagery and confirm that it produces progressively smaller p-values, as expected. We discuss applications to research integrity investigations as well as future work.
Submitted 21 February, 2020;
originally announced March 2020.
-
Scientific Image Tampering Detection Based On Noise Inconsistencies: A Method And Datasets
Authors:
Ziyue Xiang,
Daniel E. Acuna
Abstract:
Scientific image tampering is a problem that affects not only authors but also the general perception of the research community. Although previous researchers have developed methods to identify tampering in natural images, these methods may not thrive under the scientific setting as scientific images have different statistics, format, quality, and intentions. Therefore, we propose a scientific-image specific tampering detection method based on noise inconsistencies, which is capable of learning and generalizing to different fields of science. We train and test our method on a new dataset of manipulated western blot and microscopy imagery, which aims at emulating problematic images in science. The test results show that our method can detect various types of image manipulation in different scenarios robustly, and it outperforms existing general-purpose image tampering detection schemes. We discuss applications beyond these two types of images and suggest next steps for making detection of problematic images a systematic step in peer review and science in general.
Submitted 4 March, 2020; v1 submitted 21 January, 2020;
originally announced January 2020.
-
Assigning credit to scientific datasets using article citation networks
Authors:
Tong Zeng,
Longfeng Wu,
Sarah Bratt,
Daniel E. Acuna
Abstract:
A citation is a well-established mechanism for connecting scientific artifacts. Citation networks are used by citation analysis for a variety of reasons, prominently to give credit to scientists' work. However, because of current citation practices, scientists tend to cite only publications, leaving out other types of artifacts such as datasets. Datasets then do not get appropriate credit even though they are increasingly reused and experimented with. We develop a network flow measure, called DataRank, aimed at solving this gap. DataRank assigns a relative value to each node in the network based on how citations flow through the graph, differentiating publication and dataset flow rates. We evaluate the quality of DataRank by estimating its accuracy at predicting the usage of real datasets: web visits to GenBank and downloads of Figshare datasets. We show that DataRank is better at predicting this usage compared to alternatives while offering additional interpretable outcomes. We discuss improvements to citation behavior and algorithms to properly track and assign credit to datasets.
Submitted 16 January, 2020;
originally announced January 2020.
-
Artificial mental phenomena: Psychophysics as a framework to detect perception biases in AI models
Authors:
Lizhen Liang,
Daniel E. Acuna
Abstract:
Detecting biases in artificial intelligence has become difficult because of the impenetrable nature of deep learning. The central difficulty is in relating unobservable phenomena deep inside models with observable, outside quantities that we can measure from inputs and outputs. For example, can we detect gendered perceptions of occupations (e.g., female librarian, male electrician) using questions to and answers from a word embedding-based system? Current techniques for detecting biases are often customized for a task, dataset, or method, which limits their generalization. In this work, we draw from psychophysics in experimental psychology, a field meant to relate quantities from the real world (i.e., "Physics") to subjective measures in the mind (i.e., "Psyche"), to propose an intellectually coherent and generalizable framework to detect biases in AI. Specifically, we adapt the two-alternative forced choice task (2AFC) to estimate potential biases and the strength of those biases in black-box models. We successfully reproduce previously known biased perceptions in word embeddings and sentiment analysis predictions. We discuss how concepts in experimental psychology can be naturally applied to understanding artificial mental phenomena, and how psychophysics can form a useful methodological foundation to study fairness in AI.
Submitted 15 December, 2019;
originally announced December 2019.
-
The effect of novelty on the future impact of scientific grants
Authors:
Han Zhuang,
Daniel E. Acuna
Abstract:
Government funding agencies and foundations tend to perceive novelty as necessary for scientific impact and hence prefer to fund novel instead of incremental projects. Evidence linking novelty and the eventual impact of a grant is surprisingly scarce, however. Here, we examine this link by analyzing 920,000 publications funded by 170,000 grants from the National Science Foundation (NSF) and the National Institutes of Health (NIH) between 2008 and 2016. We use machine learning to quantify grant novelty at the time of funding and relate that measure to the citation dynamics of these publications. Our results show that grant novelty leads to robust increases in citations while controlling for the principal investigator's grant experience, award amount, year of publication, prestige of the journal, and team size. All else held constant, an article resulting from a fully novel grant would on average receive twice the citations of an article resulting from a fully incremental grant. We also find that novel grants produce as many articles as incremental grants while publishing in higher-prestige journals. Taken together, our results provide compelling evidence supporting NSF, NIH, and many other funding agencies' emphases on novelty.
Submitted 6 November, 2019;
originally announced November 2019.
-
Experimental demonstration of the connection between quantum contextuality and graph theory
Authors:
Gustavo Cañas,
Evelyn Acuña,
Jaime Cariñe,
Johanna F. Barra,
Esteban S. Gómez,
Guilherme B. Xavier,
Gustavo Lima,
Adán Cabello
Abstract:
We report a method that exploits a connection between quantum contextuality and graph theory to reveal any form of quantum contextuality in high-precision experiments. We use this technique to identify a graph which corresponds to an extreme form of quantum contextuality unnoticed before and test it using high-dimensional quantum states encoded in the linear transverse momentum of single photons. Our results open the door to the experimental exploration of quantum contextuality in all its forms, including those needed for quantum computation.
Submitted 22 July, 2016;
originally announced July 2016.
-
Science Concierge: A fast content-based recommendation system for scientific publications
Authors:
Titipat Achakulvisut,
Daniel E. Acuna,
Tulakan Ruangrong,
Konrad Kording
Abstract:
Finding relevant publications is important for scientists who have to cope with an exponentially increasing volume of scholarly material. Algorithms can help with this task as they do for music, movie, and product recommendations. However, we know little about the performance of these algorithms with scholarly material. Here, we develop an algorithm, and an accompanying Python library, that implements a recommendation system based on the content of articles. Design principles are to adapt to new content, provide near-real-time suggestions, and be open source. We tested the library on 15K posters from the Society for Neuroscience Conference 2015. Human-curated topics are used to cross-validate parameters in the algorithm and produce a similarity metric that maximally correlates with human judgments. We show that our algorithm significantly outperformed suggestions based on keywords. The work presented here promises to make the exploration of scholarly material faster and more accurate.
Submitted 11 May, 2016; v1 submitted 4 April, 2016;
originally announced April 2016.
-
Non-Gaussian state generation certified using the EPR-steering inequality
Authors:
E. S. Gómez,
G. Cañas,
E. Acuña,
W. A. T. Nogueira,
G. Lima
Abstract:
For practical reasons, experimental and theoretical continuous-variable (CV) quantum information (QI) has been heavily based on Gaussian states. Nevertheless, many CV-QI protocols require the use of non-Gaussian states and operations. Here, we show that the Einstein-Podolsky-Rosen steering inequality can be used to obtain a practical witness for the generation of pure bipartite non-Gaussian states. Although the scenario requires pure states, we show its broad relevance by reporting the experimental observation of the non-Gaussianity of the CV two-photon state generated in the process of spontaneous parametric down-conversion (SPDC). The observed non-Gaussianity is due only to the intrinsic phase-matching conditions of SPDC.
Submitted 10 January, 2015;
originally announced January 2015.