Showing 1–26 of 26 results for author: Church, K

  1. arXiv:2407.05836  [pdf, other]

    cs.IR

    Academic Article Recommendation Using Multiple Perspectives

    Authors: Kenneth Church, Omar Alonso, Peter Vickers, Jiameng Sun, Abteen Ebrahimi, Raman Chandrasekar

    Abstract: We argue that Content-based filtering (CBF) and Graph-based methods (GB) complement one another in Academic Search recommendations. The scientific literature can be viewed as a conversation between authors and the audience. CBF uses abstracts to infer authors' positions, and GB uses citations to infer responses from the audience. In this paper, we describe nine differences between CBF and GB, as w…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2406.19504  [pdf, other]

    cs.CL

    Are Generative Language Models Multicultural? A Study on Hausa Culture and Emotions using ChatGPT

    Authors: Ibrahim Said Ahmad, Shiran Dudy, Resmi Ramachandranpillai, Kenneth Church

    Abstract: Large Language Models (LLMs), such as ChatGPT, are widely used to generate content for various purposes and audiences. However, these models may not reflect the cultural and emotional diversity of their users, especially for low-resource languages. In this paper, we investigate how ChatGPT represents Hausa's culture and emotions. We compare responses generated by ChatGPT with those provided by nat…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2403.18251  [pdf, other]

    cs.CL

    Since the Scientific Literature Is Multilingual, Our Models Should Be Too

    Authors: Abteen Ebrahimi, Kenneth Church

    Abstract: English has long been assumed the $\textit{lingua franca}$ of scientific research, and this notion is reflected in the natural language processing (NLP) research involving scientific document representation. In this position piece, we quantitatively show that the literature is largely multilingual and argue that current models and benchmarks should reflect this linguistic diversity. We provide evi…

    Submitted 27 March, 2024; originally announced March 2024.

  4. arXiv:2307.15456  [pdf, other]

    cs.LG cs.AI math.DS math.OC

    Worrisome Properties of Neural Network Controllers and Their Symbolic Representations

    Authors: Jacek Cyranka, Kevin E M Church, Jean-Philippe Lessard

    Abstract: We raise concerns about controllers' robustness in simple reinforcement learning benchmark problems. We focus on neural network controllers and their low neuron and symbolic abstractions. A typical controller reaching high mean return values still generates an abundance of persistent low-return solutions, which is a highly undesirable property, easily exploitable by an adversary. We find that the…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: accepted to ECAI23

    Journal ref: Frontiers in Artificial Intelligence and Applications, ECAI 2023

  5. arXiv:2211.10780  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis on Diversity over Language and Culture

    Authors: Youssef Mohamed, Mohamed Abdelfattah, Shyma Alhuwaider, Feifan Li, Xiangliang Zhang, Kenneth Ward Church, Mohamed Elhoseiny

    Abstract: This paper introduces ArtELingo, a new benchmark and dataset, designed to encourage work on diversity across languages and cultures. Following ArtEmis, a collection of 80k artworks from WikiArt with 0.45M emotion labels and English-only captions, ArtELingo adds another 0.79M annotations in Arabic and Chinese, plus 4.8K in Spanish to evaluate "cultural-transfer" performance. More than 51K artworks…

    Submitted 19 November, 2022; originally announced November 2022.

    Comments: 9 pages, Accepted at EMNLP 22, for more details see https://www.artelingo.org/

  6. arXiv:2204.12672  [pdf, other]

    cs.CL

    Data-Driven Adaptive Simultaneous Machine Translation

    Authors: Guangxu Xun, Mingbo Ma, Yuchen Bian, Xingyu Cai, Jiaji Huang, Renjie Zheng, Junkun Chen, Jiahong Yuan, Kenneth Church, Liang Huang

    Abstract: In simultaneous translation (SimulMT), the most widely used strategy is the wait-k policy thanks to its simplicity and effectiveness in balancing translation quality and latency. However, wait-k suffers from two major limitations: (a) it is a fixed policy that can not adaptively adjust latency given context, and (b) its training is much slower than full-sentence translation. To alleviate these iss…

    Submitted 26 April, 2022; originally announced April 2022.

  7. arXiv:2203.03763  [pdf, other]

    gr-qc math-ph math.CA math.DS

    Periodic orbits in Hořava-Lifshitz cosmologies

    Authors: Kevin E. M. Church, Olivier Hénot, Phillipo Lappicy, Jean-Philippe Lessard, Hauke Sprink

    Abstract: We consider spatially homogeneous Hořava-Lifshitz (HL) models that perturb General Relativity (GR) by a parameter $v\in (0,1)$ such that GR occurs at $v=1/2$. We describe the dynamics for the extremal case $v=0$, which possess the usual Bianchi hierarchy: type $\mathrm{I}$ (Kasner circle of equilibria), type $\mathrm{II}$ (heteroclinics that induce the Kasner map) and type…

    Submitted 7 December, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: 21 pages, 7 figures. arXiv admin note: text overlap with arXiv:2012.07614

  8. arXiv:2202.13326  [pdf, other]

    math.DS math.NA

    Computer-assisted proofs of Hopf bubbles and degenerate Hopf bifurcations

    Authors: Kevin Church, Elena Queirolo

    Abstract: We present a computer-assisted approach to prove the existence of Hopf bubbles and degenerate Hopf bifurcations in ordinary and delay differential equations. We apply the method to rigorously investigate these nonlocal bifurcation structures in the FitzHugh-Nagumo equation, the extended Lorenz-84 model and a time-delay SI model.

    Submitted 27 February, 2022; originally announced February 2022.

    Comments: 51 pages, 15 figures

    MSC Class: 34K18 (Primary); 37G15 (Secondary)

  9. arXiv:2201.01942  [pdf, other]

    cs.LG stat.ML

    Efficiently Disentangle Causal Representations

    Authors: Yuanpeng Li, Joel Hestness, Mohamed Elhoseiny, Liang Zhao, Kenneth Church

    Abstract: This paper proposes an efficient approach to learning disentangled representations with causal mechanisms based on the difference of conditional probabilities in original and new distributions. We approximate the difference with models' generalization abilities so that it fits in the standard machine learning framework and can be efficiently computed. In contrast to the state-of-the-art approach,…

    Submitted 1 January, 2024; v1 submitted 6 January, 2022; originally announced January 2022.

    Comments: 17 pages, 7 figures

    Report number: Causal-01

  10. arXiv:2111.03628  [pdf, other]

    cs.AI stat.ML

    Exploiting a Zoo of Checkpoints for Unseen Tasks

    Authors: Jiaji Huang, Qiang Qiu, Kenneth Church

    Abstract: There are so many models in the literature that it is difficult for practitioners to decide which combinations are likely to be effective for a new task. This paper attempts to address this question by capturing relationships among checkpoints published on the web. We model the space of tasks as a Gaussian process. The covariance can be estimated from checkpoints and unlabeled probing data. With t…

    Submitted 5 November, 2021; originally announced November 2021.

    Comments: Accepted in Neurips 2021

  11. arXiv:2108.01132  [pdf, other]

    cs.CL

    The Role of Phonetic Units in Speech Emotion Recognition

    Authors: Jiahong Yuan, Xingyu Cai, Renjie Zheng, Liang Huang, Kenneth Church

    Abstract: We propose a method for emotion recognition through emotion-dependent speech recognition using Wav2vec 2.0. Our method achieved a significant improvement over most previously reported results on IEMOCAP, a benchmark emotion dataset. Different types of phonetic units are employed and compared in terms of accuracy and robustness of emotion recognition within and across datasets and languages. Models…

    Submitted 2 August, 2021; originally announced August 2021.

  12. arXiv:2108.01129  [pdf, other]

    cs.CL cs.SD eess.AS

    Decoupling recognition and transcription in Mandarin ASR

    Authors: Jiahong Yuan, Xingyu Cai, Dongji Gao, Renjie Zheng, Liang Huang, Kenneth Church

    Abstract: Much of the recent literature on automatic speech recognition (ASR) is taking an end-to-end approach. Unlike English where the writing system is closely related to sound, Chinese characters (Hanzi) represent meaning, not sound. We propose factoring audio -> Hanzi into two sub-tasks: (1) audio -> Pinyin and (2) Pinyin -> Hanzi, where Pinyin is a system of phonetic transcription of standard Chinese.…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: submitted to ASRU 2021

  13. arXiv:2108.01122  [pdf, other]

    cs.CL

    Automatic recognition of suprasegmentals in speech

    Authors: Jiahong Yuan, Neville Ryant, Xingyu Cai, Kenneth Church, Mark Liberman

    Abstract: This study reports our efforts to improve automatic recognition of suprasegmentals by fine-tuning wav2vec 2.0 with CTC, a method that has been successful in automatic speech recognition. We demonstrate that the method can improve the state-of-the-art on automatic recognition of syllables, tones, and pitch accents. Utilizing segmental information, by employing tonal finals or tonal syllables as rec…

    Submitted 3 August, 2021; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: submitted to ASRU 2021

  14. arXiv:2105.05915  [pdf, other]

    cs.CL cs.LG

    Better than BERT but Worse than Baseline

    Authors: Boxiang Liu, Jiaji Huang, Xingyu Cai, Kenneth Church

    Abstract: This paper compares BERT-SQuAD and Ab3P on the Abbreviation Definition Identification (ADI) task. ADI inputs a text and outputs short forms (abbreviations/acronyms) and long forms (expansions). BERT with reranking improves over BERT without reranking but fails to reach the Ab3P rule-based baseline. What is BERT missing? Reranking introduces two new features: charmatch and freq. The first feature i…

    Submitted 12 May, 2021; originally announced May 2021.

    Comments: 6 pages, 2 figures, 5 tables

  15. arXiv:2012.01477  [pdf, other]

    eess.AS cs.SD

    The Third DIHARD Diarization Challenge

    Authors: Neville Ryant, Prachi Singh, Venkat Krishnamohan, Rajat Varma, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Liberman

    Abstract: DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain. Speaker diarization was evaluated under two speech activity conditions (diarization from a reference speech activity vs. diarization from scratch) and 11 diverse domains. The domains span…

    Submitted 5 April, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

    Comments: arXiv admin note: text overlap with arXiv:1906.07839

  16. arXiv:2010.10048  [pdf, other]

    cs.CL cs.AI

    Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training

    Authors: Renjie Zheng, Mingbo Ma, Baigong Zheng, Kaibo Liu, Jiahong Yuan, Kenneth Church, Liang Huang

    Abstract: Simultaneous speech-to-speech translation is widely useful but extremely challenging, since it needs to generate target-language speech concurrently with the source-language speech, with only a few seconds delay. In addition, it needs to continuously translate a stream of sentences, but all recent solutions merely focus on the single-sentence scenario. As a result, current approaches accumulate la…

    Submitted 21 October, 2020; v1 submitted 20 October, 2020; originally announced October 2020.

    Comments: 10 pages, accepted by Findings of EMNLP 2020

    Journal ref: Findings of EMNLP 2020

  17. arXiv:2006.05815  [pdf, other]

    eess.AS cs.SD

    Third DIHARD Challenge Evaluation Plan

    Authors: Neville Ryant, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Liberman

    Abstract: This paper introduces the third DIHARD challenge, the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain. The challenge comprises two tracks evaluating diarization performance when starting from a reference speech segmentation (track 1) and diarization from ra…

    Submitted 2 December, 2020; v1 submitted 4 June, 2020; originally announced June 2020.

    Comments: Version 1.2 - Planned schedule updated - Updated numbers in tables from final versions of development/evaluation sets - Corrected typo

  18. arXiv:2004.00436  [pdf, other]

    cs.CV cs.LG stat.ML

    Exploring Long Tail Visual Relationship Recognition with Large Vocabulary

    Authors: Sherif Abdelkarim, Aniket Agarwal, Panos Achlioptas, Jun Chen, Jiaji Huang, Boyang Li, Kenneth Church, Mohamed Elhoseiny

    Abstract: Several approaches have been proposed in recent literature to alleviate the long-tail problem, mainly in object classification tasks. In this paper, we make the first large-scale study concerning the task of Long-Tail Visual Relationship Recognition (LTVRR). LTVRR aims at improving the learning of structured visual relationships that come from the long-tail (e.g., "rabbit grazing on grass"). In th…

    Submitted 25 September, 2021; v1 submitted 25 March, 2020; originally announced April 2020.

    ACM Class: I.2.10; I.5.0; I.4.0

  19. arXiv:1912.07766  [pdf, ps, other]

    eess.SY math.OC

    User manual and tutorial for ISIM1s: a tiny MATLAB package for single stage invariant manifold-guided impulsive stabilization of delay equations

    Authors: Kevin E. M. Church

    Abstract: ISIM1s consists of a few MATLAB functions and a script that can be used to derive stabilizing impulsive controllers for delay differential equations. This document serves as both a manual and tutorial on the functionality of the ISIM1s package. Brief background on the theoretically guaranteed stabilization scenario are provided before the primary MATLAB script is explained. The tutorial demonstrat…

    Submitted 12 February, 2021; v1 submitted 16 December, 2019; originally announced December 2019.

  20. arXiv:1911.02750  [pdf, other]

    cs.CL cs.SD eess.AS

    Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework

    Authors: Mingbo Ma, Baigong Zheng, Kaibo Liu, Renjie Zheng, Hairong Liu, Kainan Peng, Kenneth Church, Liang Huang

    Abstract: Text-to-speech synthesis (TTS) has witnessed rapid progress in recent years, where neural methods became capable of producing audios with high naturalness. However, these efforts still suffer from two types of latencies: (a) the {\em computational latency} (synthesizing time), which grows linearly with the sentence length even with parallel approaches, and (b) the {\em input latency} in scenarios…

    Submitted 6 October, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: Findings of EMNLP 2020

  21. arXiv:1906.07839  [pdf, ps, other]

    eess.AS cs.CL

    The Second DIHARD Diarization Challenge: Dataset, task, and baselines

    Authors: Neville Ryant, Kenneth Church, Christopher Cieri, Alejandrina Cristia, Jun Du, Sriram Ganapathy, Mark Liberman

    Abstract: This paper introduces the second DIHARD challenge, the second in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain. The challenge comprises four tracks evaluating diarization performance under two input conditions (single channel vs. multi-channel) and two segmentatio…

    Submitted 18 June, 2019; originally announced June 2019.

    Comments: Accepted by Interspeech 2019

  22. arXiv:1810.10045  [pdf, other]

    cs.CL

    Language Modeling at Scale

    Authors: Mostofa Patwary, Milind Chabbi, Heewoo Jun, Jiaji Huang, Gregory Diamos, Kenneth Church

    Abstract: We show how Zipf's Law can be used to scale up language modeling (LM) to take advantage of more training data and more GPUs. LM plays a key role in many important natural language applications such as speech recognition and machine translation. Scaling up LM is important since it is widely accepted by the community that there is no data like more data. Eventually, we would like to train on terabyt…

    Submitted 23 October, 2018; originally announced October 2018.

  23. arXiv:1510.05192  [pdf]

    cs.HC cs.CY

    Three Hours a Day: Understanding Current Teen Practices of Smartphone Application Use

    Authors: Frank Bentley, Karen Church, Beverly Harrison, Kent Lyons, Matthew Rafalow

    Abstract: Teens are using mobile devices for an increasing number of activities. Smartphones and a variety of mobile apps for communication, entertainment, and productivity have become an integral part of their lives. This mobile phone use has evolved rapidly as technology has changed and thus studies from even 2 or 3 years ago may not reflect new patterns and practices as smartphones have become more sophi…

    Submitted 17 October, 2015; originally announced October 2015.

    ACM Class: H.5.m

  24. arXiv:1505.03014  [pdf, other]

    cs.IR

    Frappe: Understanding the Usage and Perception of Mobile App Recommendations In-The-Wild

    Authors: Linas Baltrunas, Karen Church, Alexandros Karatzoglou, Nuria Oliver

    Abstract: This paper describes a real world deployment of a context-aware mobile app recommender system (RS) called Frappe. Utilizing a hybrid-approach, we conducted a large-scale app market deployment with 1000 Android users combined with a small-scale local user study involving 33 users. The resulting usage logs and subjective feedback enabled us to gather key insights into (1) context-dependent app usage…

    Submitted 12 May, 2015; originally announced May 2015.

    Report number: 11

    ACM Class: H.3.3; H.5.2

  25. arXiv:cs/0610155  [pdf, ps, other]

    cs.DS cs.IR cs.LG

    Nonlinear Estimators and Tail Bounds for Dimension Reduction in $l_1$ Using Cauchy Random Projections

    Authors: Ping Li, Trevor J. Hastie, Kenneth W. Church

    Abstract: For dimension reduction in $l_1$, the method of {\em Cauchy random projections} multiplies the original data matrix $\mathbf{A} \in\mathbb{R}^{n\times D}$ with a random matrix $\mathbf{R} \in \mathbb{R}^{D\times k}$ ($k\ll\min(n,D)$) whose entries are i.i.d. samples of the standard Cauchy C(0,1). Because of the impossibility results, one can not hope to recover the pairwise $l_1$ distances in…

    Submitted 27 October, 2006; originally announced October 2006.

  26. arXiv:cmp-lg/9407021  [pdf, ps]

    cs.CL

    K-vec: A New Approach for Aligning Parallel Texts

    Authors: Pascale Fung, Kenneth Church

    Abstract: Various methods have been proposed for aligning texts in two or more languages such as the Canadian Parliamentary Debates (Hansards). Some of these methods generate a bilingual lexicon as a by-product. We present an alternative alignment strategy which we call K-vec, that starts by estimating the lexicon. For example, it discovers that the English word "fisheries" is similar to the French "pe^che…

    Submitted 25 July, 1994; originally announced July 1994.

    Comments: 7 pages, uuencoded, compressed PostScript; Proc. COLING-94