Search | arXiv e-print repository

arXiv:2407.06090 [pdf, other]

The Need for a Recurring Large-Scale Benchmarking Survey to Continually Evaluate Sampling Methods and Administration Modes: Lessons from the 2022 Collaborative Midterm Survey

Authors: Peter K. Enns, Colleen L. Barry, James N. Druckman, Sergio Garcia-Rios, David C. Wilson, Jonathon P. Schuldt

Abstract: As survey methods adapt to technological and societal changes, a growing body of research seeks to understand the tradeoffs associated with various sampling methods and administration modes. We show how the NSF-funded 2022 Collaborative Midterm Survey (CMS) can be used as a dynamic and transparent framework for evaluating which sampling approaches - or combination of approaches - are best suited for various research goals. The CMS is ideally suited for this purpose because it includes almost 20,000 respondents interviewed using two administration modes (phone and online) and data drawn from random digit dialing, random address-based sampling, a probability-based panel, two nonprobability panels, and two nonprobability marketplaces. The analysis considers three types of population benchmarks (election data, administrative records, and large government surveys) and focuses on the national-level estimates as well as oversamples in three states (California, Florida, and Wisconsin). In addition to documenting how each of the survey strategies performed, we develop a strategy to assess how different combinations of approaches compare to different population benchmarks in order to guide researchers combining sampling methods and sources. We conclude by providing specific recommendations to public opinion and election survey researchers and demonstrating how our approach could be applied to a large government survey conducted at regular intervals to provide ongoing guidance to researchers, government, businesses, and nonprofits regarding the most appropriate survey sampling and administration methods.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2307.03730 [pdf, other]

doi 10.1063/5.0142596

First bromine doped cryogenic implosion at the National Ignition Facility

Authors: A. C. Hayes, G. Kyrala, M. Gooden, J. B. Wilhelmy, L. Kot, P. Volegov, C. Wilde, B. Haines, Gerard Jungman, R. S. Rundberg, D. C. Wilson, C. Velsko, W. Cassata, E. Henry, C. Yeamans, C. Cerjan, T. Ma, T. Doppner, A. Nikroo, O. Hurricane, D. Callahan, D. Hinkel, D. Schneider, B. Bachmann, F. Graziani, et al. (7 additional authors not shown)

Abstract: We report on the first experiment dedicated to the study of nuclear reactions on dopants in a cryogenic capsule at the National Ignition Facility (NIF). This was accomplished using bromine doping in the inner layers of the CH ablator of a capsule identical to that used in the NIF shot N140520. The capsule was doped with 3$\times$10$^{16}$ bromine atoms. The doped capsule shot, N170730, resulted in a DT yield that was 2.6 times lower than the undoped equivalent. The Radiochemical Analysis of Gaseous Samples (RAGS) system was used to collect and detect $^{79}$Kr atoms resulting from energetic deuteron and proton ion reactions on $^{79}$Br. RAGS was also used to detect $^{13}$N produced dominantly by knock-on deuteron reactions on the $^{12}$C in the ablator. High-energy reaction-in-flight neutrons were detected via the $^{209}$Bi(n,4n)$^{206}$Bi reaction, using bismuth activation foils located 50 cm outside of the target capsule. The robustness of the RAGS signals suggests that the use of nuclear reactions on dopants as diagnostics is quite feasible.

Submitted 7 July, 2023; originally announced July 2023.

Report number: LA-UR-22-28149

arXiv:2111.15592 [pdf, other]

MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale

Authors: Kasra Hosseini, Daniel C. S. Wilson, Kaspar Beelen, Katherine McDonough

Abstract: We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). This library transforms the way historians can use maps by turning extensive, homogeneous map sets into searchable primary sources. MapReader allows users with little or no computer vision expertise to i) retrieve maps via web-servers; ii) preprocess and divide them into patches; iii) annotate patches; iv) train, fine-tune, and evaluate deep neural network models; and v) create structured data about map content. We demonstrate how MapReader enables historians to interpret a collection of $\approx$16K nineteenth-century Ordnance Survey map sheets ($\approx$30.5M patches), foregrounding the challenge of translating visual markers into machine-readable data. We present a case study focusing on British rail infrastructure and buildings as depicted on these maps. We also show how the outputs from the MapReader pipeline can be linked to other, external datasets, which we use to evaluate as well as enrich and interpret the results. We release $\approx$62K manually annotated patches used here for training and evaluating the models.

Submitted 30 November, 2021; originally announced November 2021.

Comments: 13 pages, 9 figures

arXiv:2005.11140 [pdf, other]

Living Machines: A study of atypical animacy

Authors: Mariona Coll Ardanuy, Federico Nanni, Kaspar Beelen, Kasra Hosseini, Ruth Ahnert, Jon Lawrence, Katherine McDonough, Giorgia Tolfo, Daniel CS Wilson, Barbara McGillivray

Abstract: This paper proposes a new approach to animacy detection, the task of determining whether an entity is represented as animate in a text. In particular, this work is focused on atypical animacy and examines the scenario in which typically inanimate objects, specifically machines, are given animate attributes. To address it, we have created the first dataset for atypical animacy detection, based on nineteenth-century sentences in English, with machines represented as either animate or inanimate. Our method builds on recent innovations in language modeling, specifically BERT contextualized word embeddings, to better capture fine-grained contextual properties of words. We present a fully unsupervised pipeline, which can be easily adapted to different contexts, and report its performance on an established animacy dataset and our newly introduced resource. We show that our method provides a substantially more accurate characterization of atypical animacy, especially when applied to highly complex forms of language use.

Submitted 19 November, 2020; v1 submitted 22 May, 2020; originally announced May 2020.

Comments: 12 pages, 1 figure

arXiv:1701.09179 [pdf]

The first cryogenic DT layered, beryllium capsule implosion at the National Ignition Facility

Authors: D. C. Wilson, J. L. Kline, S. A. Yi, A. N. Simakov, G. A. Kyrala, R. E. Olson, T. S. Perry, F. E. Merrill, S. Batha, A. B. Zylstra, D. A. Callahan, W. Cassata, E. L. Dewald, S. W. Haan, D. E. Hinkel, O. A. Hurricane, N. Izumi, T. Ma, A. G. MacPhee, J. L. Milovich, J. E. Ralph, J. R. Rygg, M. B. Schneider, S. Sepke, D. J. Strozzi, et al. (4 additional authors not shown)

Abstract: NIF experiments with Be capsules have followed a path of the highly successful "high-foot" CH capsules. Several keyhole and ConA targets preceded a DT layered shot. In addition to backscatter subtraction, laser drive multipliers were needed to match observed X-ray drives. Those for the picket (0.95), trough (1.0) and second pulse (0.80) were determined by VISAR measurements. The time dependence of the Dante total x-ray flux and its fraction > 1.8 keV reflect the time dependence of the multipliers. A two step drive multiplier for the main pulse can match implosion times, but Dante measurements suggest the drive multiplier must increase late in time. With a single set of time dependent, multi-level multipliers the Dante data are well matched. These same third pulse drive multipliers also match the implosion times and Dante signals for two CH capsule DT. One discrepancy in the calculations is the X-ray flux in the picket. Calculations over-estimate the flux > 1.8 keV by a factor of ~100, while getting the total flux correctly. These harder X-rays cause an expansion of the Be/fuel interface of 2-3 km/s before the arrival of the first shock. VISAR measurements show only 0.2 to 0.3 km/s. The X-ray drive on the DT Be capsule was further degraded by a random decrease of 9% in the total picket flux. This small change caused the capsule fuel to change from an adiabat of 1.8 to 2.3 by mistiming of the first and second shocks. With this shock tuning and adjustments to the calculation, the first NIF Be capsule implosion achieved 29% of calculated yield, comparable to the CH DT capsules of 68% and 21%. Inclusion of a large M1 asymmetry in the DT ice layer and mixing from instability growth may help explain this final degradation. In summary, when driven similarly, the Be capsules performed like CH capsules. Performance degradation for both seems to be dominated by drive and capsule asymmetries.

Submitted 31 January, 2017; originally announced January 2017.

Report number: LA-UR-16-20355

arXiv:1404.5372 [pdf, other]

doi 10.1080/19475683.2014.904440

Linking Geographic Vocabularies through WordNet

Authors: Andrea Ballatore, Michela Bertolotto, David C. Wilson

Abstract: The linked open data (LOD) paradigm has emerged as a promising approach to structuring and sharing geospatial information. One of the major obstacles to this vision lies in the difficulties found in the automatic integration between heterogeneous vocabularies and ontologies that provide the semantic backbone of the growing constellation of open geo-knowledge bases. In this article, we show how to utilize WordNet as a semantic hub to increase the integration of LOD. With this purpose in mind, we devise Voc2WordNet, an unsupervised mapping technique between a given vocabulary and WordNet, combining intensional and extensional aspects of the geographic terms. Voc2WordNet is evaluated against a sample of human-generated alignments with the OpenStreetMap (OSM) Semantic Network, a crowdsourced geospatial resource, and the GeoNames ontology, the vocabulary of a large digital gazetteer. These empirical results indicate that the approach can obtain high precision and recall.

Submitted 21 April, 2014; originally announced April 2014.

Comments: 21 pages, 1 figure

Journal ref: Annals of GIS, 20 (2), 2014, 73-84

arXiv:1402.3371 [pdf, other]

doi 10.1007/s10707-013-0197-8

An evaluative baseline for geo-semantic relatedness and similarity

Authors: Andrea Ballatore, Michela Bertolotto, David C. Wilson

Abstract: In geographic information science and semantics, the computation of semantic similarity is widely recognised as key to supporting a vast number of tasks in information integration and retrieval. By contrast, the role of geo-semantic relatedness has been largely ignored. In natural language processing, semantic relatedness is often confused with the more specific semantic similarity. In this article, we discuss a notion of geo-semantic relatedness based on Lehrer's semantic fields, and we compare it with geo-semantic similarity. We then describe and validate the Geo Relatedness and Similarity Dataset (GeReSiD), a new open dataset designed to evaluate computational measures of geo-semantic relatedness and similarity. This dataset is larger than existing datasets of this kind, and includes 97 geographic terms combined into 50 term pairs rated by 203 human subjects. GeReSiD is available online and can be used as an evaluation baseline to determine empirically to what degree a given computational model approximates geo-semantic relatedness and similarity.

Submitted 14 February, 2014; originally announced February 2014.

Comments: GeoInformatica 2014

arXiv:1401.2610 [pdf, other]

doi 10.1007/978-3-642-37688-7_5

A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web

Authors: Andrea Ballatore, David C. Wilson, Michela Bertolotto

Abstract: Over the past decade, rapid advances in web technologies, coupled with innovative models of spatial data collection and consumption, have generated a robust growth in geo-referenced information, resulting in spatial information overload. Increasing 'geographic intelligence' in traditional text-based information retrieval has become a prominent approach to respond to this issue and to fulfill users' spatial information needs. Numerous efforts in the Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the Linking Open Data initiative have converged in a constellation of open knowledge bases, freely available online. In this article, we survey these open knowledge bases, focusing on their geospatial dimension. Particular attention is devoted to the crucial issue of the quality of geo-knowledge bases, as well as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic Network, is outlined as our contribution to this area. Research directions in information integration and Geographic Information Retrieval (GIR) are then reviewed, with a critical discussion of their current limitations and future prospects.

Submitted 12 January, 2014; originally announced January 2014.

Journal ref: in Quality Issues in the Management of Web Information, ISRL 50, pp. 93-120, Springer, 2013

arXiv:1401.2517 [pdf, ps, other]

doi 10.5311/JOSIS.2013.7.128

The semantic similarity ensemble

Authors: Andrea Ballatore, Michela Bertolotto, David C. Wilson

Abstract: Computational measures of semantic similarity between geographic terms provide valuable support across geographic information retrieval, data mining, and information integration. To date, a wide variety of approaches to geo-semantic similarity have been devised. A judgment of similarity is not intrinsically right or wrong, but obtains a certain degree of cognitive plausibility, depending on how closely it mimics human behavior. Thus selecting the most appropriate measure for a specific task is a significant challenge. To address this issue, we make an analogy between computational similarity measures and soliciting domain expert opinions, which incorporate a subjective set of beliefs, perceptions, hypotheses, and epistemic biases. Following this analogy, we define the semantic similarity ensemble (SSE) as a composition of different similarity measures, acting as a panel of experts having to reach a decision on the semantic similarity of a set of geographic terms. The approach is evaluated in comparison to human judgments, and results indicate that an SSE performs better than the average of its parts. Although the best member tends to outperform the ensemble, all ensembles outperform the average performance of each ensemble's member. Hence, in contexts where the best measure is unknown, the ensemble provides a more cognitively plausible approach.

Submitted 11 January, 2014; originally announced January 2014.

Comments: Special feature on Semantic and Conceptual Issues in GIS (SeCoGIS)

Journal ref: Journal of Spatial Information Science (JOSIS), Number 7 (2013), pp. 27-44

arXiv:0908.1416 [pdf, ps, other]

A Characterization of Hyperbolic Affine Iterated Function Systems

Authors: Ross Atkins, Michael F. Barnsley, Andrew Vince, David C. Wilson

Abstract: The two main theorems of this paper provide a characterization of hyperbolic affine iterated function systems defined on $\mathbb{R}^m$. Atsushi Kameyama (Distances on Topological Self-Similar Sets, Proceedings of Symposia in Pure Mathematics, Volume 72.1, 2004) asked the following fundamental question: given a topological self-similar set, does there exist an associated system of contraction mappings? Our theorems imply an affirmative answer to Kameyama's question for self-similar sets derived from affine transformations on $\mathbb{R}^m$.

Submitted 10 August, 2009; originally announced August 2009.

MSC Class: 54H25

Showing 1–10 of 10 results for author: Wilson, D C