Showing 1–12 of 12 results for author: Ture, F

  1. arXiv:2406.08482

    cs.CV cs.CL

    Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation

    Authors: Raphael Tang, Xinyu Zhang, Lixinyu Xu, Yao Lu, Wenyan Li, Pontus Stenetorp, Jimmy Lin, Ferhan Ture

    Abstract: Diffusion models are the state of the art in text-to-image generation, but their perceptual variability remains understudied. In this paper, we examine how prompts affect image variability in black-box diffusion-based models. We propose W1KP, a human-calibrated measure of variability in a set of images, bootstrapped from existing image-pair perceptual distances. Current datasets do not cover recen…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 13 pages, 11 figures

  2. "Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time

    Authors: Scott Rome, Tianwen Chen, Raphael Tang, Luwei Zhou, Ferhan Ture

    Abstract: Customer service is how companies interface with their customers. It can contribute heavily towards the overall customer satisfaction. However, high-quality service can become expensive, creating an incentive to make it as cost efficient as possible and prompting most companies to utilize AI-powered assistants, or "chat bots". On the other hand, human-to-human interaction is still desired by custo…

    Submitted 6 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  3. arXiv:2311.18812

    cs.CL

    What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations

    Authors: Raphael Tang, Xinyu Zhang, Jimmy Lin, Ferhan Ture

    Abstract: Do large language models (LLMs) exhibit sociodemographic biases, even when they decline to respond? To bypass their refusal to "speak," we study this research question by probing contextualized embeddings and exploring whether this bias is encoded in its latent representations. We propose a logistic Bradley-Terry probe which predicts word pair preferences of LLMs from the words' hidden vectors. We…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: 10 pages, 5 figures

  4. arXiv:2310.07712

    cs.CL cs.LG

    Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models

    Authors: Raphael Tang, Xinyu Zhang, Xueguang Ma, Jimmy Lin, Ferhan Ture

    Abstract: Large language models (LLMs) exhibit positional bias in how they use context, which especially complicates listwise ranking. To address this, we propose permutation self-consistency, a form of self-consistency over ranking list outputs of black-box LLMs. Our key idea is to marginalize out different list orders in the prompt to produce an order-independent ranking with less positional bias. First,…

    Submitted 22 April, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted for publication at NAACL 2024. First two authors contributed equally; 10 pages, 6 figures

  5. arXiv:2211.11740

    cs.CL cs.SD eess.AS

    SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale

    Authors: Raphael Tang, Karun Kumar, Gefei Yang, Akshat Pandey, Yajie Mao, Vladislav Belyaev, Madhuri Emmadi, Craig Murray, Ferhan Ture, Jimmy Lin

    Abstract: End-to-end automatic speech recognition systems represent the state of the art, but they rely on thousands of hours of manually annotated speech for training, as well as heavyweight computation for inference. Of course, this impedes commercialization since most companies lack vast human and computational resources. In this paper, we explore training and deploying an ASR system in the label-scarce,…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted to EMNLP 2022 Industry Track; 9 pages, 7 figures

  6. arXiv:2210.04885

    cs.CV cs.CL

    What the DAAM: Interpreting Stable Diffusion Using Cross Attention

    Authors: Raphael Tang, Linqing Liu, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Pontus Stenetorp, Jimmy Lin, Ferhan Ture

    Abstract: Large-scale diffusion neural networks represent a substantial milestone in text-to-image generation, but they remain poorly understood, lacking interpretability analyses. In this paper, we perform a text-image attribution analysis on Stable Diffusion, a recently open-sourced model. To produce pixel-level attribution maps, we upscale and aggregate cross-attention word-pixel scores in the denoising…

    Submitted 8 December, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: First two authors contributed equally. 13 pages, 15 figures

  7. arXiv:1812.07754

    cs.CL

    Streaming Voice Query Recognition using Causal Convolutional Recurrent Neural Networks

    Authors: Raphael Tang, Gefei Yang, Hong Wei, Yajie Mao, Ferhan Ture, Jimmy Lin

    Abstract: Voice-enabled commercial products are ubiquitous, typically enabled by lightweight on-device keyword spotting (KWS) and full automatic speech recognition (ASR) in the cloud. ASR systems require significant computational resources in training and for inference, not to mention copious amounts of annotated speech data. KWS systems, on the other hand, are less resource-intensive but have limited capab…

    Submitted 18 December, 2018; originally announced December 2018.

    Comments: 5 pages, 2 figures, submitted to ICASSP 2019

  8. arXiv:1805.08159

    cs.IR cs.CL

    Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search

    Authors: Jinfeng Rao, Wei Yang, Yuhao Zhang, Ferhan Ture, Jimmy Lin

    Abstract: Despite substantial interest in applications of neural networks to information retrieval, neural ranking models have only been applied to standard ad hoc retrieval tasks over web pages and newswire documents. This paper proposes MP-HCNN (Multi-Perspective Hierarchical Convolutional Neural Network), a novel neural ranking model specifically designed for ranking short social media posts. We identify…

    Submitted 21 June, 2019; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: AAAI 2019, 10 pages

  9. arXiv:1707.07792

    cs.IR cs.CL

    Integrating Lexical and Temporal Signals in Neural Ranking Models for Searching Social Media Streams

    Authors: Jinfeng Rao, Hua He, Haotian Zhang, Ferhan Ture, Royal Sequiera, Salman Mohammed, Jimmy Lin

    Abstract: Time is an important relevance signal when searching streams of social media posts. The distribution of document timestamps from the results of an initial query can be leveraged to infer the distribution of relevant documents, which can then be used to rerank the initial results. Previous experiments have shown that kernel density estimation is a simple yet effective implementation of this idea. T…

    Submitted 24 July, 2017; originally announced July 2017.

    Comments: SIGIR 2017 Workshop on Neural Information Retrieval (Neu-IR'17), August 7-11, 2017, Shinjuku, Tokyo, Japan

  10. arXiv:1705.04892

    cs.IR

    Talking to Your TV: Context-Aware Voice Search with Hierarchical Recurrent Neural Networks

    Authors: Jinfeng Rao, Ferhan Ture, Hua He, Oliver Jojic, Jimmy Lin

    Abstract: We tackle the novel problem of navigational voice queries posed against an entertainment system, where viewers interact with a voice-enabled remote controller to specify the program to watch. This is a difficult problem for several reasons: such queries are short, even shorter than comparable voice queries in other domains, which offers fewer opportunities for deciphering user intent. Furthermore,…

    Submitted 13 May, 2017; originally announced May 2017.

  11. arXiv:1609.08210

    cs.CL cs.AI

    Learning to Translate for Multilingual Question Answering

    Authors: Ferhan Ture, Elizabeth Boschee

    Abstract: In multilingual question answering, either the question needs to be translated into the document language, or vice versa. In addition to direction, there are multiple methods to perform the translation, four of which we explore in this paper: word-based, 10-best, context-based, and grammar-based. We build a feature for each combination of translation direction and method, and train a model that le…

    Submitted 26 September, 2016; originally announced September 2016.

    Comments: 12 pages. To appear in EMNLP'16

  12. arXiv:1606.05029

    cs.CL

    No Need to Pay Attention: Simple Recurrent Neural Networks Work! (for Answering "Simple" Questions)

    Authors: Ferhan Ture, Oliver Jojic

    Abstract: First-order factoid question answering assumes that the question can be answered by a single fact in a knowledge base (KB). While this does not seem like a challenging task, many recent attempts that apply either complex linguistic reasoning or deep neural networks achieve 65%-76% accuracy on benchmark sets. Our approach formulates the task as two machine learning problems: detecting the entities…

    Submitted 28 July, 2017; v1 submitted 15 June, 2016; originally announced June 2016.

    Comments: 7 pages, to appear in EMNLP 2017