Search | arXiv e-print repository

XLTime: A Cross-Lingual Knowledge Transfer Framework for Temporal Expression Extraction

Authors: Yuwei Cao, William Groves, Tanay Kumar Saha, Joel R. Tetreault, Alex Jaimes, Hao Peng, Philip S. Yu

Abstract: Temporal Expression Extraction (TEE) is essential for understanding time in natural language. It has applications in Natural Language Processing (NLP) tasks such as question answering, information retrieval, and causal inference. To date, work in this area has mostly focused on English as there is a scarcity of labeled data for other languages. We propose XLTime, a novel framework for multilingual TEE. XLTime works on top of pre-trained language models and leverages multi-task learning to prompt cross-language knowledge transfer both from English and within the non-English languages. XLTime alleviates problems caused by a shortage of data in the target language. We apply XLTime with different language models and show that it outperforms the previous automatic SOTA methods on French, Spanish, Portuguese, and Basque, by large margins. XLTime also closes the gap considerably on the handcrafted HeidelTime method.

Submitted 3 May, 2022; originally announced May 2022.

Comments: This paper is accepted by the Findings of NAACL 2022

arXiv:1912.13332 [pdf, other]

Unsupervised Detection of Sub-events in Large Scale Disasters

Authors: Chidubem Arachie, Manas Gaur, Sam Anzaroot, William Groves, Ke Zhang, Alejandro Jaimes

Abstract: Social media plays a major role during and after major natural disasters (e.g., hurricanes, large-scale fires, etc.), as people "on the ground" post useful information on what is actually happening. Given the large volume of posts, a major challenge is identifying the information that is useful and actionable. Emergency responders are largely interested in finding out what events are taking place so they can properly plan and deploy resources. In this paper we address the problem of automatically identifying important sub-events (within a large-scale emergency "event", such as a hurricane). In particular, we present a novel, unsupervised learning framework to detect sub-events in Tweets for retrospective crisis analysis. We first extract noun-verb pairs and phrases from raw tweets as sub-event candidates. Then, we learn a semantic embedding of extracted noun-verb pairs and phrases, and rank them against a crisis-specific ontology. We filter out noisy and irrelevant information, then cluster the noun-verb pairs and phrases so that the top-ranked ones describe the most important sub-events. Through quantitative experiments on two large crisis data sets (Hurricane Harvey and the 2015 Nepal Earthquake), we demonstrate the effectiveness of our approach over the state-of-the-art. Our qualitative evaluation shows better performance compared to our baseline.

Submitted 13 December, 2019; originally announced December 2019.

Comments: AAAI-20 Social Impact Track

arXiv:1807.00021 [pdf, other]

Reflectionless Filter Topologies Supporting Arbitrary Ladder Prototypes

Authors: Matthew A. Morgan, Wavley M. Groves III, Tod A. Boyd

Abstract: Modifications of the authors' previously-published, generalized, lumped-element, reflectionless filter topologies are presented which remove the original constraints on the relative values of their prototype parameters. Thus, any transfer function which can be realized as the transmission or reflection coefficient of a conventional ladder prototype may now be implemented in reflectionless form, that is, having the same transfer function in transmission but with identically zero reflection coefficient at both ports and at all frequencies from DC to infinity, given ideal elements. The theoretical basis of these modifications is explained, and then tested via the construction of passive, reflectionless low-pass filter prototypes that, in the prior topology, would have required negative reactive elements. The results show excellent agreement with theory.

Submitted 29 June, 2018; originally announced July 2018.

Comments: 8 pages, 17 figures, to be submitted

arXiv:1803.04473 [pdf, other]

doi 10.3847/1538-3881/aab965

Performance of a highly sensitive, 19-element, dual-polarization, cryogenic L-band Phased Array Feed on the Green Bank Telescope

Authors: D. Anish Roshi, W. Shillue, B. Simon, K. F. Warnick, B. Jeffs, D. J. Pisano, R. Prestage, S. White, J. R. Fisher, M. Morgan, R. Black, M. Burnett, J. Diao, M. Ruzindana, V. van Tonder, L. Hawkins, P. Marganian, T. Chamberlin, J. Ray, N. M. Pingel, K. Rajwade, D. R. Lorimer, A. Rane, J. Castro, W. Groves, et al. (4 additional authors not shown)

Abstract: A new 1.4 GHz 19-element, dual-polarization, cryogenic phased array feed (PAF) radio astronomy receiver has been developed for the Robert C. Byrd Green Bank Telescope (GBT) as part of the FLAG (Focal L-band Array for the GBT) project. Commissioning observations of calibrator radio sources show that this receiver has the lowest reported beamformed system temperature ($T_{\rm sys}$) normalized by aperture efficiency ($η$) of any phased array receiver to date. The measured $T_{\rm sys}/η$ is $25.4 \pm 2.5$ K near 1350 MHz for the boresight beam, which is comparable to the performance of the current 1.4 GHz cryogenic single feed receiver on the GBT. The degradation in $T_{\rm sys}/η$ at $\sim$ 4 arcmin (required for Nyquist sampling) and $\sim$ 8 arcmin offsets from the boresight is, respectively, $\sim$ 1\% and $\sim$ 20\% of the boresight value. The survey speed of the PAF with seven formed beams is larger by a factor between 2.1 and 7 compared to a single beam system depending on the observing application. The measured performance, both in frequency and offset from boresight, qualitatively agrees with predictions from a rigorous electromagnetic model of the PAF. The astronomical utility of the receiver is demonstrated by observations of the pulsar B0329+54 and an extended HII region, the Rosette Nebula. The enhanced survey speed with the new PAF receiver will enable the GBT to carry out exciting new science, such as more efficient observations of diffuse, extended neutral hydrogen emission from galactic in-flows and searches for Fast Radio Bursts.

Submitted 12 March, 2018; originally announced March 2018.

Comments: 24 pages, 16 figures, to appear in Astronomical Journal

arXiv:1711.02204 [pdf, other]

A Highly-Sensitive Cryogenic Phased Array Feed for the Green Bank Telescope

Authors: D. Anish Roshi, W. Shillue, J. R. Fisher, M. Morgan, J. Castro, W. Groves, T. Boyd, B. Simon, L. Hawkins, V. van Tonder, J. D. Nelson, J. Ray, T. Chamberlain, S. White, R. Black, K. F. Warnick, B. Jeffs, R. Prestage

Abstract: In this paper, we describe the development of a new L-band (1.4 GHz) Cryogenic Phased Array Feed (PAF) system, referred to as the GBT2 array. Results from initial measurements made with the GBT2 array are also presented. The PAF was developed for the Green Bank Telescope (GBT) as part of the Focal L-band Array for the GBT (FLAG) project. During the first stage of the development work (Phase I), a prototype cryogenic 19-element dual-polarized array with "Kite" dipole elements was developed and tested on the GBT. The measured system temperature over efficiency ($T_{sys}/η$) ratio for the boresight beam of the Kite array was 45.5 K at 1.55 GHz. The off-boresight $T_{sys}/η$ shows an increase of 13 K at an offset equal to the half power beam width (7$^{'}$.2 at 1.7 GHz). Our measurements indicate that the off-boresight degradation and field-of-view (FoV) limitation of the Kite array are simply due to the fixed array size. To increase the FoV, a new 19-element GBT2 array with larger array spacing was developed during FLAG Phase II. The frequency response of the array was optimized from 1.2 to 1.6 GHz. A system with a larger cryostat, new low noise amplifiers (LNAs), down-conversion and digitization close to the front end, unformatted digital transmission over fiber, ROACH II based polyphase filter banks (PFBs) with a bandwidth of 150 MHz, and a data acquisition system that records voltage samples from one of the PFB channels were all developed. The data presented here are processed off-line.
The receiver temperature measured with the new system is 17 K at 1.4 GHz, an improvement $>$ 8 K over the previous Kite array. Measurements with the GBT2 array on the telescope are in progress. A real time 150 MHz beamformer is also being developed as part of an NSF-funded collaboration between NRAO/GBO/BYU \& West Virginia University (Beamformer Project) to support science observations.

Submitted 6 November, 2017; originally announced November 2017.

Comments: 4 pages, 7 figures, to appear in the proceedings of 32nd URSI GASS, August 2017

arXiv:1612.04313 [pdf]

doi 10.1088/1538-3873/aa7115

A Cryogenic SiGe Low Noise Amplifier Optimized for Phased Array Feeds

Authors: Wavley M. Groves III, Matthew A. Morgan

Abstract: The growing number of phased array feeds (PAF) being built for radio astronomy demonstrates an increasing need for low noise amplifiers (LNA) that are designed for repeatability, low noise, and ease of manufacture. Specific design features which help to achieve these goals include the use of unpackaged transistors (for cryogenic operation), single-polarity biasing, straight plug-in RF interfaces to facilitate installation and re-work, and the use of off-the-shelf components. The focal L-band array for the Green Bank Telescope (FLAG) is a cooperative effort by Brigham Young University (BYU) and the National Radio Astronomy Observatory (NRAO) using warm dipole antennae and cryogenic Silicon Germanium Heterojunction Bipolar Transistor (SiGe HBT) LNAs. These LNAs have an average in-band gain of 38 dB and an average noise temperature of 4.85 Kelvin. Although the FLAG instrument was the driving instrument behind this development, most of the key features of the design and the advantages they offer apply broadly to other array feeds, including independent-beam and phased, and to many antenna types such as horn, dipole, Vivaldi, connected-bowtie, etc. This paper will focus on the unique requirements array feeds have for low noise amplifiers and how amplifier manufacturing can accommodate these needs.

Submitted 5 May, 2017; v1 submitted 13 December, 2016; originally announced December 2016.
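
To put the two figures quoted in this abstract (38 dB average gain, 4.85 K average noise temperature) into other common units, the sketch below converts the gain to a linear power ratio and the noise temperature to a noise figure. It assumes the conventional 290 K reference temperature for the noise-figure definition, which the abstract does not state; the function names are illustrative only.

```python
import math

# Conventional reference temperature (kelvin) for noise-figure definitions.
# Assumption: the abstract does not say which reference applies.
T_REF_K = 290.0

def db_to_power_ratio(gain_db):
    """Convert a power gain in decibels to a linear power ratio."""
    return 10.0 ** (gain_db / 10.0)

def noise_temperature_to_figure_db(t_noise_k, t_ref_k=T_REF_K):
    """Convert an equivalent noise temperature to a noise figure in dB."""
    return 10.0 * math.log10(1.0 + t_noise_k / t_ref_k)

gain_linear = db_to_power_ratio(38.0)         # about 6.3e3
nf_db = noise_temperature_to_figure_db(4.85)  # about 0.07 dB

print(f"38 dB gain  -> power ratio of roughly {gain_linear:.0f}")
print(f"4.85 K      -> noise figure of roughly {nf_db:.3f} dB")
```

A noise figure this close to 0 dB is one reason cryogenic amplifiers are usually specified by noise temperature rather than noise figure.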

arXiv:1211.6581 [pdf, other]

doi 10.1007/s10994-016-5546-z

Multi-Target Regression via Input Space Expansion: Treating Targets as Inputs

Authors: Eleftherios Spyromitros-Xioufis, Grigorios Tsoumakas, William Groves, Ioannis Vlahavas

Abstract: In many practical applications of supervised learning the task involves the prediction of multiple target variables from a common set of input variables. When the prediction targets are binary the task is called multi-label classification, while when the targets are continuous the task is called multi-target regression. In both tasks, target variables often exhibit statistical dependencies and exploiting them in order to improve predictive accuracy is a core challenge. A family of multi-label classification methods addresses this challenge by building a separate model for each target on an expanded input space where other targets are treated as additional input variables. Despite the success of these methods in the multi-label classification domain, their applicability and effectiveness in multi-target regression have not been studied until now. In this paper, we introduce two new methods for multi-target regression, called Stacked Single-Target and Ensemble of Regressor Chains, by adapting two popular multi-label classification methods of this family. Furthermore, we highlight an inherent problem of these methods - a discrepancy of the values of the additional input variables between training and prediction - and develop extensions that use out-of-sample estimates of the target variables during training in order to tackle this problem.
The results of an extensive experimental evaluation carried out on a large and diverse collection of datasets show that, when the discrepancy is appropriately mitigated, the proposed methods attain consistent improvements over the independent regressions baseline. Moreover, two versions of Ensemble of Regressor Chains perform significantly better than four state-of-the-art methods including regularization-based multi-task learning methods and a multi-objective random forest approach.

Submitted 27 January, 2016; v1 submitted 28 November, 2012; originally announced November 2012.

Comments: Accepted for publication in Machine Learning journal. This replacement contains major improvements compared to the previous version, including a deeper theoretical and experimental analysis and an extended discussion of related work
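
The idea in the abstract above, treating other targets as extra inputs and feeding the second-stage models out-of-sample estimates during training, can be illustrated with a small sketch. This is not the authors' implementation: the k-nearest-neighbour base learner, the modulo-based fold assignment, the toy data, and all function names are assumptions chosen to keep the example dependency-free.

```python
from math import dist

def knn_fit(X, y, k=2):
    """'Train' a k-NN regressor by storing the data (stand-in base learner)."""
    return (list(X), list(y), k)

def knn_predict(model, x):
    """Predict the mean target of the k stored rows closest to x."""
    X, y, k = model
    nearest = sorted(range(len(X)), key=lambda i: dist(X[i], x))[:k]
    return sum(y[i] for i in nearest) / len(nearest)

def out_of_sample_estimates(X, y, folds=3, k=2):
    """Estimate y[i] with a model that never saw row i (simple k-fold CV)."""
    estimates = [0.0] * len(X)
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        model = knn_fit([X[i] for i in train], [y[i] for i in train], k)
        for i in range(f, len(X), folds):  # rows held out in this fold
            estimates[i] = knn_predict(model, X[i])
    return estimates

def fit_stacked(X, Y, folds=3, k=2):
    """Y is a list of target columns. Returns (stage-1 models, stage-2 models)."""
    stage1 = [knn_fit(X, y, k) for y in Y]
    # Expand the input space with out-of-sample estimates of every target, so
    # stage 2 trains on the same kind of noisy inputs it sees at prediction time.
    cv = [out_of_sample_estimates(X, y, folds, k) for y in Y]
    X_expanded = [list(x) + [col[i] for col in cv] for i, x in enumerate(X)]
    stage2 = [knn_fit(X_expanded, y, k) for y in Y]
    return stage1, stage2

def predict_stacked(models, x):
    """Stage 1 estimates all targets; stage 2 refines them on the expanded input."""
    stage1, stage2 = models
    first_pass = [knn_predict(m, x) for m in stage1]
    return [knn_predict(m, list(x) + first_pass) for m in stage2]

# Toy data: one input feature, two correlated targets.
X_toy = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
Y_toy = [[0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]]
models = fit_stacked(X_toy, Y_toy)
prediction = predict_stacked(models, [2.5])
print(prediction)
```

Replacing `out_of_sample_estimates` with in-sample predictions would reproduce the training/prediction discrepancy the abstract describes: the second-stage models would learn from estimates that are more accurate than anything available at prediction time.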

Showing 1–7 of 7 results for author: Groves, W