Showing 1–45 of 45 results for author: Cross, J

Searching in archive cs.
  1. arXiv:2310.10555  [pdf, other]

    cs.LG physics.flu-dyn

    Population-based wind farm monitoring based on a spatial autoregressive approach

    Authors: W. Lin, K. Worden, E. J. Cross

    Abstract: An important challenge faced by wind farm operators is to reduce operation and maintenance cost. Structural health monitoring provides a means of cost reduction through minimising unnecessary maintenance trips as well as prolonging turbine service life. Population-based structural health monitoring can further reduce the cost of health monitoring systems by implementing one system for multiple str…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 8 pages, 4 figures, submitted to the Modern Practice in Stress and Vibration Analysis (MPSVA) 2022 Conference Proceedings

  2. arXiv:2310.05807  [pdf, other]

    cs.LG cs.CE

    Sharing Information Between Machine Tools to Improve Surface Finish Forecasting

    Authors: Daniel R. Clarkson, Lawrence A. Bull, Tina A. Dardeno, Chandula T. Wickramarachchi, Elizabeth J. Cross, Timothy J. Rogers, Keith Worden, Nikolaos Dervilis, Aidan J. Hughes

    Abstract: At present, most surface-quality prediction methods can only perform single-task prediction which results in under-utilised datasets, repetitive work and increased experimental costs. To counter this, the authors propose a Bayesian hierarchical model to predict surface-roughness measurements for a turning machining process. The hierarchical model is compared to multiple independent Bayesian linear…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: Submitted to International Workshop on Structural Health Monitoring 2023, Stanford University, California, USA

  3. arXiv:2309.10656  [pdf, other]

    cs.LG

    A spectrum of physics-informed Gaussian processes for regression in engineering

    Authors: Elizabeth J Cross, Timothy J Rogers, Daniel J Pitchforth, Samuel J Gibson, Matthew R Jones

    Abstract: Despite the growing availability of sensing and data in general, we remain unable to fully characterise many in-service engineering systems and structures from a purely data-driven approach. The vast data and resources available to capture human activity are unmatched in our engineered world, and, even in cases where data could be referred to as ``big,'' they will rarely hold information across op…

    Submitted 19 September, 2023; originally announced September 2023.

  4. arXiv:2305.08657  [pdf, other]

    stat.ML cs.LG stat.AP

    Encoding Domain Expertise into Multilevel Models for Source Location

    Authors: Lawrence A. Bull, Matthew R. Jones, Elizabeth J. Cross, Andrew Duncan, Mark Girolami

    Abstract: Data from populations of systems are prevalent in many industrial applications. Machines and infrastructure are increasingly instrumented with sensing systems, emitting streams of telemetry data with complex interdependencies. In practice, data-centric monitoring procedures tend to consider these assets (and respective models) as distinct -- operating in isolation and associated with independent d…

    Submitted 15 May, 2023; originally announced May 2023.

  5. Low-field magnetic resonance image enhancement via stochastic image quality transfer

    Authors: Hongxiang Lin, Matteo Figini, Felice D'Arco, Godwin Ogbole, Ryutaro Tanno, Stefano B. Blumberg, Lisa Ronan, Biobele J. Brown, David W. Carmichael, Ikeoluwa Lagunju, Judith Helen Cross, Delmiro Fernandez-Reyes, Daniel C. Alexander

    Abstract: Low-field (<1T) magnetic resonance imaging (MRI) scanners remain in widespread use in low- and middle-income countries (LMICs) and are commonly used for some applications in higher income countries e.g. for small child patients with obesity, claustrophobia, implants, or tattoos. However, low-field MR images commonly have lower resolution and poorer contrast than images from high field (1.5T, 3T, a…

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: Accepted in Medical Image Analysis

  6. arXiv:2302.03528  [pdf, other]

    cs.CL

    Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages

    Authors: Simeng Sun, Maha Elbayad, Anna Sun, James Cross

    Abstract: With multilingual machine translation (MMT) models continuing to grow in size and number of supported languages, it is natural to reuse and upgrade existing models to save computation as data becomes available in more languages. However, adding new languages requires updating the vocabulary, which complicates the reuse of embeddings. The question of how to reuse existing models while also making a…

    Submitted 7 February, 2023; originally announced February 2023.

    Comments: Accepted to EACL 2023 (Main)

  7. arXiv:2207.04672  [pdf]

    cs.CL cs.AI

    No Language Left Behind: Scaling Human-Centered Machine Translation

    Authors: NLLB Team, Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, et al. (14 additional authors not shown)

    Abstract: Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today. However, such efforts have coalesced around a small subset of languages, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200 language barrier while ensuring safe, high quality res…

    Submitted 25 August, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: 190 pages

    MSC Class: 68T50; ACM Class: I.2.7

  8. Physics-informed machine learning for Structural Health Monitoring

    Authors: Elizabeth J Cross, Samuel J Gibson, Matthew R Jones, Daniel J Pitchforth, Sikai Zhang, Timothy J Rogers

    Abstract: The use of machine learning in Structural Health Monitoring is becoming more common, as many of the inherent tasks (such as regression and classification) in developing condition-based assessment fall naturally into its remit. This chapter introduces the concept of physics-informed machine learning, where one adapts ML algorithms to account for the physical insight an engineer will often have of t…

    Submitted 30 June, 2022; originally announced June 2022.

  9. arXiv:2206.02079  [pdf, other]

    cs.CL

    Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders

    Authors: Xiang Kong, Adithya Renduchintala, James Cross, Yuqing Tang, Jiatao Gu, Xian Li

    Abstract: Recent work in multilingual translation advances translation quality surpassing bilingual baselines using deep transformer models with increased capacity. However, the extra latency and memory costs introduced by this approach may make it unacceptable for efficiency-constrained applications. It has recently been shown for bilingual translation that using a deep encoder and shallow decoder (DESD) c…

    Submitted 4 June, 2022; originally announced June 2022.

    Comments: EACL 2021

  10. arXiv:2206.01495  [pdf, other]

    cs.LG cs.SD eess.AS

    Constraining Gaussian processes for physics-informed acoustic emission mapping

    Authors: Matthew R Jones, Timothy J Rogers, Elizabeth J Cross

    Abstract: The automated localisation of damage in structures is a challenging but critical ingredient in the path towards predictive or condition-based maintenance of high value structures. The use of acoustic emission time of arrival mapping is a promising approach to this challenge, but is severely hindered by the need to collect a dense set of artificial acoustic emission measurements across the structur…

    Submitted 3 June, 2022; originally announced June 2022.

  11. arXiv:2205.10835  [pdf, other]

    cs.CL

    Multilingual Machine Translation with Hyper-Adapters

    Authors: Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale

    Abstract: Multilingual machine translation suffers from negative interference across languages. A common solution is to relax parameter sharing with language-specific modules like adapters. However, adapters of related languages are unable to transfer information, and their total number of parameters becomes prohibitively expensive as the number of languages grows. In this work, we overcome these drawbacks…

    Submitted 5 December, 2022; v1 submitted 22 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022 camera-ready version. Code at github.com/cbaziotis/fairseq under the "hyperadapters" branch (see instructions at https://github.com/cbaziotis/fairseq/tree/hyperadapters/examples/adapters)

  12. arXiv:2205.06266  [pdf, other]

    cs.CL

    Lifting the Curse of Multilinguality by Pre-training Modular Transformers

    Authors: Jonas Pfeiffer, Naman Goyal, Xi Victoria Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe

    Abstract: Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages. We address this issue by introducing language-specific modules, which allows us to grow the total capacity of the model, while keeping the total number of trainable parameters per language constant. In contrast with prior work that learn…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: NAACL 2022

  13. arXiv:2204.14268  [pdf, other]

    cs.CL cs.AI

    How Robust is Neural Machine Translation to Language Imbalance in Multilingual Tokenizer Training?

    Authors: Shiyue Zhang, Vishrav Chaudhary, Naman Goyal, James Cross, Guillaume Wenzek, Mohit Bansal, Francisco Guzman

    Abstract: A multilingual tokenizer is a fundamental component of multilingual neural machine translation. It is trained from a multilingual corpus. Since a skewed data distribution is considered to be harmful, a sampling strategy is usually used to balance languages in the corpus. However, few works have systematically answered how language imbalance in tokenizer training affects downstream performance. In…

    Submitted 10 September, 2022; v1 submitted 29 April, 2022; originally announced April 2022.

    Comments: AMTA 2022

  14. arXiv:2203.13867  [pdf, other]

    cs.CL cs.LG

    Data Selection Curriculum for Neural Machine Translation

    Authors: Tasnim Mohiuddin, Philipp Koehn, Vishrav Chaudhary, James Cross, Shruti Bhosale, Shafiq Joty

    Abstract: Neural Machine Translation (NMT) models are typically trained on heterogeneous data that are concatenated and randomly shuffled. However, not all of the training data are equally useful to the model. Curriculum training aims to present the data to the NMT models in a meaningful order. In this work, we introduce a two-stage curriculum training framework for NMT where we fine-tune a base NMT model o…

    Submitted 25 March, 2022; originally announced March 2022.

  15. Bayesian Modelling of Multivalued Power Curves from an Operational Wind Farm

    Authors: L. A. Bull, P. A. Gardner, T. J. Rogers, N. Dervilis, E. J. Cross, E. Papatheou, A. E. Maguire, C. Campos, K. Worden

    Abstract: Power curves capture the relationship between wind speed and output power for a specific wind turbine. Accurate regression models of this function prove useful in monitoring, maintenance, design, and planning. In practice, however, the measurements do not always correspond to the ideal curve: power curtailments will appear as (additional) functional components. Such multivalued relationships canno…

    Submitted 30 November, 2021; originally announced November 2021.

    Journal ref: Mechanical Systems and Signal Processing (2021): 108530

  16. arXiv:2110.08246  [pdf, ps, other]

    cs.CL

    Tricks for Training Sparse Translation Models

    Authors: Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, Mike Lewis, Angela Fan

    Abstract: Multi-task learning with an unbalanced data distribution skews model learning towards high resource tasks, especially when model capacity is fixed and fully shared across all tasks. Sparse scaling architectures, such as BASELayers, provide flexible mechanisms for different tasks to have a variable number of parameters, which can be useful to counterbalance skewed data distributions. We find that t…

    Submitted 15 October, 2021; originally announced October 2021.

  17. arXiv:2110.07804  [pdf, other]

    cs.CL

    Alternative Input Signals Ease Transfer in Multilingual Machine Translation

    Authors: Simeng Sun, Angela Fan, James Cross, Vishrav Chaudhary, Chau Tran, Philipp Koehn, Francisco Guzman

    Abstract: Recent work in multilingual machine translation (MMT) has focused on the potential of positive transfer between languages, particularly cases where higher-resourced languages can benefit lower-resourced ones. While training an MMT model, the supervision signals learned from one language pair can be transferred to the other via the tokens shared by multiple source languages. However, the transfer i…

    Submitted 14 October, 2021; originally announced October 2021.

  18. arXiv:2109.08627  [pdf, other]

    cs.CL

    Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications

    Authors: Shuo Sun, Ahmed El-Kishky, Vishrav Chaudhary, James Cross, Francisco Guzmán, Lucia Specia

    Abstract: Sentence-level Quality estimation (QE) of machine translation is traditionally formulated as a regression task, and the performance of QE models is typically measured by Pearson correlation with human labels. Recent QE models have achieved previously-unseen levels of correlation with human judgments, but they rely on large multilingual contextualized language models that are computationally expens…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021

  19. arXiv:2108.03265  [pdf, other]

    cs.CL

    Facebook AI WMT21 News Translation Task Submission

    Authors: Chau Tran, Shruti Bhosale, James Cross, Philipp Koehn, Sergey Edunov, Angela Fan

    Abstract: We describe Facebook's multilingual model submission to the WMT2021 shared task on news translation. We participate in 14 language directions: English to and from Czech, German, Hausa, Icelandic, Japanese, Russian, and Chinese. To develop systems covering all these directions, we focus on multilingual models. We utilize data from all available sources --- WMT, large-scale data mining, and in-domai…

    Submitted 6 August, 2021; originally announced August 2021.

  20. arXiv:2106.11891  [pdf, other]

    cs.CL

    On the Evaluation of Machine Translation for Terminology Consistency

    Authors: Md Mahfuz ibn Alam, Antonios Anastasopoulos, Laurent Besacier, James Cross, Matthias Gallé, Philipp Koehn, Vassilina Nikoulina

    Abstract: As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies. In many scenarios and particularly in cases of domain adaptation, one expects the MT output to adhere to the constraints provided by a terminology. In this work, we propose metrics to measure the consistency of MT output with…

    Submitted 24 June, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: preprint

  21. arXiv:2106.08247  [pdf, ps, other]

    stat.ML cs.LG

    Canonical-Correlation-Based Fast Feature Selection

    Authors: Sikai Zhang, Tingna Wang, Keith Worden, Elizabeth J. Cross

    Abstract: This paper proposes a canonical-correlation-based filter method for feature selection. The sum of squared canonical correlation coefficients is adopted as the feature ranking criterion. The proposed method boosts the computational speed of the ranking criterion in greedy search. The supporting theorems developed for the feature selection method are fundamental to the understanding of the canonical…

    Submitted 15 June, 2021; originally announced June 2021.

  22. arXiv:2105.13813  [pdf, other]

    cs.LG eess.SP physics.flu-dyn

    Grey-box models for wave loading prediction

    Authors: Daniel J Pitchforth, Timothy J Rogers, Ulf T Tygesen, Elizabeth J Cross

    Abstract: The quantification of wave loading on offshore structures and components is a crucial element in the assessment of their useful remaining life. In many applications the well-known Morison's equation is employed to estimate the forcing from waves with assumed particle velocities and accelerations. This paper develops a grey-box modelling approach to improve the predictions of the force on structura…

    Submitted 30 June, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

    Journal ref: Mechanical Systems and Signal Processing, Volume 159, 2021

  23. arXiv:2104.08597  [pdf, other]

    cs.CL

    XLEnt: Mining a Large Cross-lingual Entity Dataset with Lexical-Semantic-Phonetic Word Alignment

    Authors: Ahmed El-Kishky, Adithya Renduchintala, James Cross, Francisco Guzmán, Philipp Koehn

    Abstract: Cross-lingual named-entity lexica are an important resource to multilingual NLP tasks such as machine translation and cross-lingual wikification. While knowledge bases contain a large number of entities in high-resource languages such as English and French, corresponding entities for lower-resource languages are often missing. To address this, we propose Lexical-Semantic-Phonetic Align (LSP-Align)…

    Submitted 10 September, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

  24. arXiv:2103.01676  [pdf, other]

    stat.ML cs.LG eess.SP

    Probabilistic Inference for Structural Health Monitoring: New Modes of Learning from Data

    Authors: Lawrence A. Bull, Paul Gardner, Timothy J. Rogers, Elizabeth J. Cross, Nikolaos Dervilis, Keith Worden

    Abstract: In data-driven SHM, the signals recorded from systems in operation can be noisy and incomplete. Data corresponding to each of the operational, environmental, and damage states are rarely available a priori; furthermore, labelling to describe the measurements is often unavailable. In consequence, the algorithms used to implement SHM should be robust and adaptive, while accommodating for missing inf…

    Submitted 2 March, 2021; originally announced March 2021.

    Comments: This material may be downloaded for personal use only. Any other use requires prior permission of the American Society of Civil Engineers. This material may be found at https://doi.org/10.1061/AJRUA6.0001106

    Journal ref: ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering 7.1 (2021): 03120003

  25. arXiv:2101.11711  [pdf, other]

    cs.LG physics.data-an

    Damage detection in operational wind turbine blades using a new approach based on machine learning

    Authors: Kartik Chandrasekhar, Nevena Stevanovic, Elizabeth J. Cross, Nikolaos Dervilis, Keith Worden

    Abstract: The application of reliable structural health monitoring (SHM) technologies to operational wind turbine blades is a challenging task, due to the uncertain nature of the environments they operate in. In this paper, a novel SHM methodology, which uses Gaussian Processes (GPs) is proposed. The methodology takes advantage of the fact that the blades on a turbine are nominally identical in structural p…

    Submitted 25 January, 2021; originally announced January 2021.

    Journal ref: This is an author produced version of a paper subsequently published in Renewable Energy, Elsevier, 2021. Uploaded in accordance with the publisher's self-archiving policy

  26. Structured Machine Learning Tools for Modelling Characteristics of Guided Waves

    Authors: Marcus Haywood-Alexander, Nikolaos Dervilis, Keith Worden, Elizabeth J. Cross, Robin S. Mills, Timothy J. Rogers

    Abstract: The use of ultrasonic guided waves to probe the materials/structures for damage continues to increase in popularity for non-destructive evaluation (NDE) and structural health monitoring (SHM). The use of high-frequency waves such as these offers an advantage over low-frequency methods from their ability to detect damage on a smaller scale. However, in order to assess damage in a structure, and imp…

    Submitted 5 January, 2021; originally announced January 2021.

    Comments: 33 pages, 11 figures

  27. arXiv:2012.15127  [pdf, other]

    cs.CL

    Improving Zero-Shot Translation by Disentangling Positional Information

    Authors: Danni Liu, Jan Niehues, James Cross, Francisco Guzmán, Xian Li

    Abstract: Multilingual neural machine translation has shown the capability of directly translating between language pairs unseen in training, i.e. zero-shot translation. Despite being conceptually attractive, it often suffers from low output quality. The difficulty of generalizing to new translation directions suggests the model representations are highly specific to those language pairs seen in training. W…

    Submitted 30 June, 2021; v1 submitted 30 December, 2020; originally announced December 2020.

    Comments: ACL 2021

  28. arXiv:2012.11058  [pdf, other]

    cs.LG cs.SD eess.AS

    A Bayesian methodology for localising acoustic emission sources in complex structures

    Authors: Matthew R. Jones, Tim J. Rogers, Keith Worden, Elizabeth J. Cross

    Abstract: In the field of structural health monitoring (SHM), the acquisition of acoustic emissions to localise damage sources has emerged as a popular approach. Despite recent advances, the task of locating damage within composite materials and structures that contain non-trivial geometrical features, still poses a significant challenge. Within this paper, a Bayesian source localisation strategy that is ro…

    Submitted 20 December, 2020; originally announced December 2020.

    Comments: 17 pages, 7 figures

  29. arXiv:2008.10077  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Learn to Talk via Proactive Knowledge Transfer

    Authors: Qing Sun, James Cross

    Abstract: Knowledge Transfer has been applied in solving a wide variety of problems. For example, knowledge can be transferred between tasks (e.g., learning to handle novel situations by leveraging prior knowledge) or between agents (e.g., learning from others without direct experience). Without loss of generality, we relate knowledge transfer to KL-divergence minimization, i.e., matching the (belief) distr…

    Submitted 23 August, 2020; originally announced August 2020.

  30. arXiv:2006.10369  [pdf, other]

    cs.CL

    Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation

    Authors: Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah A. Smith

    Abstract: Much recent effort has been invested in non-autoregressive neural machine translation, which appears to be an efficient alternative to state-of-the-art autoregressive machine translation on modern GPUs. In contrast to the latter, where generation is sequential, the former allows generation to be parallelized across target token positions. Some of the latest non-autoregressive models have achieved…

    Submitted 24 June, 2021; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: ICLR 2021 Final Version

  31. arXiv:2003.07216  [pdf, other]

    eess.IV cs.CV physics.med-ph

    Image Quality Transfer Enhances Contrast and Resolution of Low-Field Brain MRI in African Paediatric Epilepsy Patients

    Authors: Matteo Figini, Hongxiang Lin, Godwin Ogbole, Felice D Arco, Stefano B. Blumberg, David W. Carmichael, Ryutaro Tanno, Enrico Kaden, Biobele J. Brown, Ikeoluwa Lagunju, Helen J. Cross, Delmiro Fernandez-Reyes, Daniel C. Alexander

    Abstract: 1.5T or 3T scanners are the current standard for clinical MRI, but low-field (<1T) scanners are still common in many lower- and middle-income countries for reasons of cost and robustness to power failures. Compared to modern high-field scanners, low-field scanners provide images with lower signal-to-noise ratio at equivalent resolution, leaving practitioners to compensate by using large slice thic…

    Submitted 18 March, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

    Comments: 6 pages, 3 figures, accepted at ICLR 2020 workshop on Artificial Intelligence for Affordable Healthcare

  32. arXiv:2001.05136  [pdf, other]

    cs.CL

    Non-Autoregressive Machine Translation with Disentangled Context Transformer

    Authors: Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu

    Abstract: State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens. The sequential nature of this generation process causes fundamental latency in inference since we cannot generate multiple tokens in each sentence in parallel. We propose an attention-masking based model, called Disentangled Context (DisCo)…

    Submitted 30 June, 2020; v1 submitted 15 January, 2020; originally announced January 2020.

    Comments: ICML 2020

  33. arXiv:1912.11936  [pdf, other]

    cs.HC cs.AI cs.SI

    Smell Pittsburgh: Engaging Community Citizen Science for Air Quality

    Authors: Yen-Chia Hsu, Jennifer Cross, Paul Dille, Michael Tasota, Beatrice Dias, Randy Sargent, Ting-Hao 'Kenneth' Huang, Illah Nourbakhsh

    Abstract: Urban air pollution has been linked to various human health concerns, including cardiopulmonary diseases. Communities who suffer from poor air quality often rely on experts to identify pollution sources due to the lack of accessible tools. Taking this into account, we developed Smell Pittsburgh, a system that enables community members to report odors and track where these odors are frequently conc…

    Submitted 20 November, 2020; v1 submitted 26 December, 2019; originally announced December 2019.

    Comments: Accepted by ACM Transactions on Interactive Intelligent Systems in 2020. This is an extended version of arXiv:1810.11143, which was accepted by the ACM IUI 2019 conference. arXiv admin note: substantial text overlap with arXiv:1810.11143

  34. arXiv:1909.12406  [pdf, other]

    cs.CL

    Monotonic Multihead Attention

    Authors: Xutai Ma, Juan Pino, James Cross, Liezl Puzon, Jiatao Gu

    Abstract: Simultaneous machine translation models start generating a target sequence before they have encoded or read the source sequence. Recent approaches for this task either apply a fixed policy on a state-of-the art Transformer model, or a learnable monotonic attention on a weaker recurrent neural network-based structure. In this paper, we propose a new attention mechanism, Monotonic Multihead Attentio…

    Submitted 26 September, 2019; originally announced September 2019.

  35. arXiv:1909.06763  [pdf, other]

    eess.IV cs.CV

    Deep Learning for Low-Field to High-Field MR: Image Quality Transfer with Probabilistic Decimation Simulator

    Authors: Hongxiang Lin, Matteo Figini, Ryutaro Tanno, Stefano B. Blumberg, Enrico Kaden, Godwin Ogbole, Biobele J. Brown, Felice D'Arco, David W. Carmichael, Ikeoluwa Lagunju, Helen J. Cross, Delmiro Fernandez-Reyes, Daniel C. Alexander

    Abstract: MR images scanned at low magnetic field ($<1$T) have lower resolution in the slice direction and lower contrast, due to a relatively small signal-to-noise ratio (SNR) than those from high field (typically 1.5T and 3T). We adapt the recent idea of Image Quality Transfer (IQT) to enhance very low-field structural images aiming to estimate the resolution, spatial coverage, and contrast of high-field…

    Submitted 15 September, 2019; originally announced September 2019.

  36. arXiv:1810.11143  [pdf, other]

    cs.HC

    Smell Pittsburgh: Community-Empowered Mobile Smell Reporting System

    Authors: Yen-Chia Hsu, Jennifer Cross, Paul Dille, Michael Tasota, Beatrice Dias, Randy Sargent, Ting-Hao 'Kenneth' Huang, Illah Nourbakhsh

    Abstract: Urban air pollution has been linked to various human health considerations, including cardiopulmonary diseases. Communities who suffer from poor air quality often rely on experts to identify pollution sources due to the lack of accessible tools. Taking this into account, we developed Smell Pittsburgh, a system that enables community members to report odors and track where these odors are frequentl…

    Submitted 1 July, 2020; v1 submitted 25 October, 2018; originally announced October 2018.

    Comments: Accepted by ACM IUI 2019 conference, with error corrections

  37. arXiv:1809.00125  [pdf, other]

    cs.CL

    Simple Fusion: Return of the Language Model

    Authors: Felix Stahlberg, James Cross, Veselin Stoyanov

    Abstract: Neural Machine Translation (NMT) typically leverages monolingual data in training through backtranslation. We investigate an alternative simple method to use monolingual data for NMT training: We combine the scores of a pre-trained and fixed language model (LM) with the scores of a translation model (TM) while the TM is trained from scratch. To achieve that, we train the translation model to predi…

    Submitted 24 January, 2019; v1 submitted 1 September, 2018; originally announced September 2018.

    Comments: WMT18 paper

  38. arXiv:1804.03293  [pdf, other]

    cs.HC

    Community-Empowered Air Quality Monitoring System

    Authors: Yen-Chia Hsu, Paul Dille, Jennifer Cross, Beatrice Dias, Randy Sargent, Illah Nourbakhsh

    Abstract: Developing information technology to democratize scientific knowledge and support citizen empowerment is a challenging task. In our case, a local community suffered from air pollution caused by industrial activity. The residents lacked the technological fluency to gather and curate diverse scientific data to advocate for regulatory change. We collaborated with the community in developing an air qu…

    Submitted 9 April, 2018; originally announced April 2018.

    Comments: Accepted by 2017 ACM Conference on Human Factors in Computing Systems (CHI 2017)

  39. arXiv:1804.03263  [pdf, other]

    cs.HC

    Visualization Tool for Environmental Sensing and Public Health Data

    Authors: Yen-Chia Hsu, Jennifer Cross, Paul Dille, Illah Nourbakhsh, Leann Leiter, Ryan Grode

    Abstract: To assist residents affected by oil and gas development, public health professionals in a non-profit organization have collected community data, including symptoms, air quality, and personal stories. However, the organization was unable to aggregate and visualize these data computationally. We present the Environmental Health Channel, an interactive web-based tool for visualizing environmental sen…

    Submitted 9 April, 2018; originally announced April 2018.

    Comments: Accepted by 2018 ACM Conference Companion Publication on Designing Interactive Systems (DIS 2018)

  40. arXiv:1612.06475  [pdf, other]

    cs.CL

    Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles

    Authors: James Cross, Liang Huang

    Abstract: Parsing accuracy using efficient greedy transition systems has improved dramatically in recent years thanks to neural networks. Despite striking results in dependency parsing, however, neural models have not surpassed state-of-the-art approaches in constituency parsing. To remedy this, we introduce a new shift-reduce system whose stack contains merely sentence spans, represented by a bare minimum…

    Submitted 19 December, 2016; originally announced December 2016.

    Comments: EMNLP 2016

  41. arXiv:1608.06459  [pdf, other]

    cs.CL cs.CY

    Tracking Amendments to Legislation and Other Political Texts with a Novel Minimum-Edit-Distance Algorithm: DocuToads

    Authors: Henrik Hermansson, James P. Cross

    Abstract: Political scientists often find themselves tracking amendments to political texts. As different actors weigh in, texts change as they are drafted and redrafted, reflecting political preferences and power. This study provides a novel solution to the problem of detecting amendments to political text based upon minimum edit distances. We demonstrate the usefulness of two language-insensitive, trans…

    Submitted 23 August, 2016; originally announced August 2016.

  42. arXiv:1607.03055  [pdf, other]

    cs.CL cs.CY

    Exploring the Political Agenda of the European Parliament Using a Dynamic Topic Modeling Approach

    Authors: Derek Greene, James P. Cross

    Abstract: This study analyzes the political agenda of the European Parliament (EP) plenary, how it has evolved over time, and the manner in which Members of the European Parliament (MEPs) have reacted to external and internal stimuli when making plenary speeches. To unveil the plenary agenda and detect latent themes in legislative speeches over time, MEP speech content is analyzed using a new dynamic topic…

    Submitted 11 July, 2016; originally announced July 2016.

    Comments: Long version including appendix. arXiv admin note: substantial text overlap with arXiv:1505.07302

  43. arXiv:1606.06406  [pdf, other]

    cs.CL

    Incremental Parsing with Minimal Features Using Bi-Directional LSTM

    Authors: James Cross, Liang Huang

    Abstract: Recently, neural network approaches for parsing have largely automated the combination of individual features, but still rely on (often a larger number of) atomic features created from human linguistic intuition, and potentially omitting important global context. To further reduce feature engineering to the bare minimum, we use bi-directional LSTM sentence representations to model a parser state w…

    Submitted 20 June, 2016; originally announced June 2016.

    Comments: Pre-print of paper appearing in ACL 2016

  44. arXiv:1511.06312  [pdf, ps, other]

    cs.CL

    Good, Better, Best: Choosing Word Embedding Context

    Authors: James Cross, Bing Xiang, Bowen Zhou

    Abstract: We propose two methods of learning vector representations of words and phrases that each combine sentence context with structural features extracted from dependency trees. Using several variations of neural network classifier, we show that these combined methods lead to improved performance when used as input features for supervised term-matching.

    Submitted 19 November, 2015; originally announced November 2015.

  45. arXiv:1505.07302  [pdf, other]

    cs.CL cs.CY

    Unveiling the Political Agenda of the European Parliament Plenary: A Topical Analysis

    Authors: Derek Greene, James P. Cross

    Abstract: This study analyzes political interactions in the European Parliament (EP) by considering how the political agenda of the plenary sessions has evolved over time and the manner in which Members of the European Parliament (MEPs) have reacted to external and internal stimuli when making Parliamentary speeches. It does so by considering the context in which speeches are made, and the content of those…

    Submitted 7 July, 2015; v1 submitted 27 May, 2015; originally announced May 2015.

    Comments: Add link to implementation code on Github