
Showing 1–50 of 55 results for author: Lei, T

  1. arXiv:2405.01060  [pdf, other]

    cs.LG cs.AI cs.CV eess.IV

    A text-based, generative deep learning model for soil reflectance spectrum simulation in the VIS-NIR (400-2499 nm) bands

    Authors: Tong Lei, Brian N. Bailey

    Abstract: Simulating soil reflectance spectra is invaluable for soil-plant radiative modeling and training machine learning models, yet it is difficult as the intricate relationships between soil structure and its constituents. To address this, a fully data-driven soil optics generative model (SOGM) for simulation of soil reflectance spectra based on soil property inputs was developed. The model is trained…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: The paper has been submitted to Remote sensing of Environment and revised

  2. arXiv:2404.06194  [pdf, other]

    cs.CV

    Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection

    Authors: Ting Lei, Shaofeng Yin, Yang Liu

    Abstract: Open-vocabulary human-object interaction (HOI) detection, which is concerned with the problem of detecting novel HOIs guided by natural language, is crucial for understanding human-centric scenes. However, prior zero-shot HOI detectors often employ the same levels of feature maps to model HOIs with varying distances, leading to suboptimal performance in scenes containing human-object pairs with a…

    Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  3. arXiv:2403.16826  [pdf, ps, other]

    cs.IT

    A Progressive Codebook Optimization Scheme for Sparse Code Multiple Access in Downlink Channels

    Authors: Tuofeng Lei, Qu Luo, Shuyan Ni, Shimiao Chen, Xin Song, Pei Xiao

    Abstract: Sparse code multiple access (SCMA) is a promising technique for enabling massive connectivity and high spectrum efficiency in future machine-type communication networks. However, its performance crucially depends on well-designed multi-dimensional codebooks. In this paper, we propose a novel progressive codebook optimization scheme that can achieve near-optimal performance over downlink fading cha…

    Submitted 4 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  4. arXiv:2403.09611  [pdf, other]

    cs.CV cs.CL cs.LG

    MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

    Authors: Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Floris Weers, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Ankur Jain, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, et al. (7 additional authors not shown)

    Abstract: In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the vision language connector, and various pre-training data choices, we identified several crucial design lessons. For example, we demonstrate that for la…

    Submitted 18 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  5. arXiv:2401.03331  [pdf, other]

    cs.CV cs.LG

    Walnut Detection Through Deep Learning Enhanced by Multispectral Synthetic Images

    Authors: Kaiming Fu, Tong Lei, Maryia Halubok, Brian N. Bailey

    Abstract: The accurate identification of walnuts within orchards brings forth a plethora of advantages, profoundly amplifying the efficiency and productivity of walnut orchard management. Nevertheless, the unique characteristics of walnut trees, characterized by their closely resembling shapes, colors, and textures between the walnuts and leaves, present a formidable challenge in precisely distinguishing be…

    Submitted 31 October, 2023; originally announced January 2024.

    Comments: This work was presented at IEEE/RSI International Conference on Intelligent Robots and Systems (IROS) Workshop

  6. Enhancing Communication Efficiency of Semantic Transmission via Joint Processing Technique

    Authors: Xumin Pu, Tiantian Lei, Wanli Wen, Qianbin Chen

    Abstract: This work presents a novel semantic transmission framework in wireless networks, leveraging the joint processing technique. Our framework enables multiple cooperating base stations to efficiently transmit semantic information to multiple users simultaneously. To enhance the semantic communication efficiency of the transmission framework, we formulate an optimization problem with the objective of m…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 6 pages, 6 figures

  7. arXiv:2311.15436  [pdf, other]

    cs.CL

    Learning to Skip for Language Modeling

    Authors: Dewen Zeng, Nan Du, Tao Wang, Yuanzhong Xu, Tao Lei, Zhifeng Chen, Claire Cui

    Abstract: Overparameterized large-scale language models have impressive generalization performance of in-context few-shot learning. However, most language models allocate the same amount of parameters or computation to each token, disregarding the complexity or importance of the input data. We argue that in language model pretraining, a variable amount of computation should be assigned to different tokens,…

    Submitted 26 November, 2023; originally announced November 2023.

  8. arXiv:2309.03696  [pdf, other]

    cs.CV

    Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory

    Authors: Ting Lei, Fabian Caba, Qingchao Chen, Hailin Jin, Yuxin Peng, Yang Liu

    Abstract: Human Object Interaction (HOI) detection aims to localize and infer the relationships between a human and an object. Arguably, training supervised models for this task from scratch presents challenges due to the performance drop over rare classes and the high computational cost and time required to handle long-tailed distributions of HOIs in complex HOI scenes in realistic settings. This observati…

    Submitted 7 September, 2023; originally announced September 2023.

  9. arXiv:2306.04086  [pdf, other]

    eess.IV cs.CV

    TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for Medical Image Segmentation

    Authors: Rui Sun, Tao Lei, Weichuan Zhang, Yong Wan, Yong Xia, Asoke K. Nandi

    Abstract: The hybrid architecture of convolution neural networks (CNN) and Transformer has been the most popular method for medical image segmentation. However, the existing networks based on the hybrid architecture suffer from two problems. First, although the CNN branch can capture image local features by using convolution operation, the vanilla convolution is unable to achieve adaptive extraction of imag…

    Submitted 19 December, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2306.03373

  10. CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

    Authors: Tao Lei, Rui Sun, Xuan Wang, Yingbo Wang, Xi He, Asoke Nandi

    Abstract: The hybrid architecture of convolutional neural networks (CNNs) and Transformer are very popular for medical image segmentation. However, it suffers from two challenges. First, although a CNNs branch can capture the local image features using vanilla convolution, it cannot achieve adaptive feature learning. Second, although a Transformer branch can capture the global features, it ignores the chann…

    Submitted 19 December, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: 9 pages, 3 figures, 3 tables

    Journal ref: The 32nd International Joint Conference on Artificial Intelligence, IJCAI2023, MACAO

  11. arXiv:2306.01988  [pdf, other]

    cs.CV

    Lightweight Structure-aware Transformer Network for VHR Remote Sensing Image Change Detection

    Authors: Tao Lei, Yetong Xu, Hailong Ning, Zhiyong Lv, Chongdan Min, Yaochu Jin, Asoke K. Nandi

    Abstract: Popular Transformer networks have been successfully applied to remote sensing (RS) image change detection (CD) identifications and achieve better results than most convolutional neural networks (CNNs), but they still suffer from two main problems. First, the computational complexity of the Transformer grows quadratically with the increase of image spatial resolution, which is unfavorable to very h…

    Submitted 2 June, 2023; originally announced June 2023.

  12. arXiv:2306.00812  [pdf, other]

    eess.AS cs.SD

    Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

    Authors: Xiaohuai Le, Tong Lei, Li Chen, Yiqing Guo, Chao He, Cheng Chen, Xianjun Xia, Hua Gao, Yijian Xiao, Piao Ding, Shenyi Song, Jing Lu

    Abstract: With fewer feature dimensions, filter banks are often used in light-weight full-band speech enhancement models. In order to further enhance the coarse speech in the sub-band domain, it is necessary to apply a post-filtering for harmonic retrieval. The signal processing-based comb filters used in RNNoise and PercepNet have limited performance and may cause speech quality degradation due to inaccura…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: accepted by Interspeech 2023

  13. arXiv:2304.04947  [pdf, other]

    cs.CL

    Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference

    Authors: Tao Lei, Junwen Bai, Siddhartha Brahma, Joshua Ainslie, Kenton Lee, Yanqi Zhou, Nan Du, Vincent Y. Zhao, Yuexin Wu, Bo Li, Yu Zhang, Ming-Wei Chang

    Abstract: We propose Conditional Adapter (CoDA), a parameter-efficient transfer learning method that also improves inference efficiency. CoDA generalizes beyond standard adapter approaches to enable a new way of balancing speed and accuracy using conditional computation. Starting with an existing dense pretrained model, CoDA adds sparse activation together with a small number of new parameters and a light-w…

    Submitted 26 November, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

    Comments: NeurIPS camera ready version

  14. arXiv:2304.01982  [pdf, other]

    cs.CL cs.IR

    Rethinking the Role of Token Retrieval in Multi-Vector Retrieval

    Authors: Jinhyuk Lee, Zhuyun Dai, Sai Meher Karthik Duddu, Tao Lei, Iftekhar Naim, Ming-Wei Chang, Vincent Y. Zhao

    Abstract: Multi-vector retrieval models such as ColBERT [Khattab and Zaharia, 2020] allow token-level interactions between queries and documents, and hence achieve state of the art on many information retrieval benchmarks. However, their non-linear scoring function cannot be scaled to millions of documents, necessitating a three-stage process for inference: retrieving initial candidates via token retrieval,…

    Submitted 8 April, 2024; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023. Code available at https://github.com/google-deepmind/xtr

  15. arXiv:2303.09752  [pdf, other]

    cs.CL cs.LG

    CoLT5: Faster Long-Range Transformers with Conditional Computation

    Authors: Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai

    Abstract: Many natural language processing tasks benefit from long inputs, but processing long documents with Transformers is expensive -- not only due to quadratic attention complexity but also from applying feedforward and projection layers to every token. However, not all tokens are equally important, especially for longer documents. We propose CoLT5, a long-input Transformer model that builds on this in…

    Submitted 23 October, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted at EMNLP 2023

  16. arXiv:2212.01742  [pdf, other]

    cs.CV

    Lightweight Facial Attractiveness Prediction Using Dual Label Distribution

    Authors: Shu Liu, Enquan Huang, Ziyu Zhou, Yan Xu, Xiaoyan Kui, Tao Lei, Hongying Meng

    Abstract: Facial attractiveness prediction (FAP) aims to assess facial attractiveness automatically based on human aesthetic perception. Previous methods using deep convolutional neural networks have improved the performance, but their large-scale models have led to a deficiency in flexibility. In addition, most methods fail to take full advantage of the dataset. In this paper, we present a novel end-to-end…

    Submitted 24 April, 2024; v1 submitted 3 December, 2022; originally announced December 2022.

  17. arXiv:2211.01267  [pdf, other]

    cs.CL cs.IR

    Multi-Vector Retrieval as Sparse Alignment

    Authors: Yujie Qian, Jinhyuk Lee, Sai Meher Karthik Duddu, Zhuyun Dai, Siddhartha Brahma, Iftekhar Naim, Tao Lei, Vincent Y. Zhao

    Abstract: Multi-vector retrieval models improve over single-vector dual encoders on many information retrieval tasks. In this paper, we cast the multi-vector retrieval problem as sparse alignment between query and document tokens. We propose AligneR, a novel multi-vector retrieval model that learns sparsified pairwise alignments between query and document tokens (e.g. `dog' vs. `puppy') and per-token unary…

    Submitted 2 November, 2022; originally announced November 2022.

  18. arXiv:2210.03929  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    EgoTaskQA: Understanding Human Tasks in Egocentric Videos

    Authors: Baoxiong Jia, Ting Lei, Song-Chun Zhu, Siyuan Huang

    Abstract: Understanding human tasks through video observations is an essential capability of intelligent agents. The challenges of such capability lie in the difficulty of generating a detailed understanding of situated actions, their effects on object states (i.e., state changes), and their causal dependencies. These challenges are further aggravated by the natural parallelism from multi-tasking and partia…

    Submitted 8 October, 2022; originally announced October 2022.

    Comments: Published at NeurIPS Track on Datasets and Benchmarks 2022

  19. arXiv:2209.04702  [pdf, other]

    cs.CL

    Adaptive Meta-learner via Gradient Similarity for Few-shot Text Classification

    Authors: Tianyi Lei, Honghui Hu, Qiaoyang Luo, Dezhong Peng, Xu Wang

    Abstract: Few-shot text classification aims to classify the text under the few-shot scenario. Most of the previous methods adopt optimization-based meta learning to obtain task distribution. However, due to the neglect of matching between the few amount of samples and complicated models, as well as the distinction between useful and useless task features, these methods suffer from the overfitting issue. To…

    Submitted 28 July, 2023; v1 submitted 10 September, 2022; originally announced September 2022.

    Comments: COLING 2022

  20. Inference skipping for more efficient real-time speech enhancement with parallel RNNs

    Authors: Xiaohuai Le, Tong Lei, Kai Chen, Jing Lu

    Abstract: Deep neural network (DNN) based speech enhancement models have attracted extensive attention due to their promising performance. However, it is difficult to deploy a powerful DNN in real-time applications because of its high computational cost. Typical compression methods such as pruning and quantization do not make good use of the data characteristics. In this paper, we introduce the Skip-RNN str…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: 11 pages, 8 figures, accepted by IEEE/ACM TASLP

  21. arXiv:2207.02687  [pdf, other]

    cs.CV

    Team PKU-WICT-MIPL PIC Makeup Temporal Video Grounding Challenge 2022 Technical Report

    Authors: Minghang Zheng, Dejie Yang, Zhongjie Ye, Ting Lei, Yuxin Peng, Yang Liu

    Abstract: In this technical report, we briefly introduce the solutions of our team `PKU-WICT-MIPL' for the PIC Makeup Temporal Video Grounding (MTVG) Challenge in ACM-MM 2022. Given an untrimmed makeup video and a step query, the MTVG aims to localize a temporal moment of the target makeup step in the video. To tackle this task, we propose a phrase relationship mining framework to exploit the temporal local…

    Submitted 6 July, 2022; originally announced July 2022.

    Comments: 2nd Place in PIC Makeup Temporal Video Grounding (MTVG) Challenge in ACM-MM 2022

  22. arXiv:2205.12674  [pdf, other]

    cs.CL cs.LG

    Training Language Models with Memory Augmentation

    Authors: Zexuan Zhong, Tao Lei, Danqi Chen

    Abstract: Recent work has improved language models (LMs) remarkably by equipping them with a non-parametric memory component. However, most existing approaches only introduce memories at testing time or represent them using a separately trained encoder, resulting in suboptimal training of the language model. In this work, we present TRIME, a novel yet simple training approach designed for training LMs with…

    Submitted 29 November, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022. Our code and models are available at https://github.com/princeton-nlp/TRIME

  23. arXiv:2205.11588  [pdf, other]

    cs.CL cs.AI

    Simple Recurrence Improves Masked Language Models

    Authors: Tao Lei, Ran Tian, Jasmijn Bastings, Ankur P. Parikh

    Abstract: In this work, we explore whether modeling recurrence into the Transformer architecture can both be beneficial and efficient, by building an extremely simple recurrent module into the Transformer. We compare our model to baselines following the training and evaluation recipe of BERT. Our results confirm that recurrence can indeed improve Transformer models by a consistent margin, without requiring…

    Submitted 23 May, 2022; originally announced May 2022.

  24. arXiv:2202.09368  [pdf, other]

    cs.LG cs.AI

    Mixture-of-Experts with Expert Choice Routing

    Authors: Yanqi Zhou, Tao Lei, Hanxiao Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew Dai, Zhifeng Chen, Quoc Le, James Laudon

    Abstract: Sparsely-activated Mixture-of-experts (MoE) models allow the number of parameters to greatly increase while keeping the amount of computation for a given token or a given sample unchanged. However, a poor expert routing strategy (e.g. one resulting in load imbalance) can cause certain experts to be under-trained, leading to an expert being under or over-specialized. Prior work allocates a fixed nu…

    Submitted 13 October, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

  25. arXiv:2110.05571  [pdf, other]

    eess.AS cs.CL

    SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition

    Authors: Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Han, Shinji Watanabe

    Abstract: The Transformer architecture has been well adopted as a dominant architecture in most sequence transduction tasks including automatic speech recognition (ASR), since its attention mechanism excels in capturing long-range dependencies. While models built solely upon attention can be better parallelized than regular RNN, a novel network architecture, SRU++, was recently proposed. By combining the fa…

    Submitted 11 October, 2021; originally announced October 2021.

  26. arXiv:2108.07846  [pdf, other]

    cs.CV cs.AI

    Channel-Temporal Attention for First-Person Video Domain Adaptation

    Authors: Xianyuan Liu, Shuo Zhou, Tao Lei, Haiping Lu

    Abstract: Unsupervised Domain Adaptation (UDA) can transfer knowledge from labeled source data to unlabeled target data of the same categories. However, UDA for first-person action recognition is an under-explored problem, with lack of datasets and limited consideration of first-person video characteristics. This paper focuses on addressing this problem. Firstly, we propose two small-scale first-person vide…

    Submitted 19 August, 2021; v1 submitted 17 August, 2021; originally announced August 2021.

  27. arXiv:2106.12023   

    cs.CV

    Team PyKale (xy9) Submission to the EPIC-Kitchens 2021 Unsupervised Domain Adaptation Challenge for Action Recognition

    Authors: Xianyuan Liu, Raivo Koot, Shuo Zhou, Tao Lei, Haiping Lu

    Abstract: This report describes the technical details of our submission to the EPIC-Kitchens 2021 Unsupervised Domain Adaptation Challenge for Action Recognition. The EPIC-Kitchens dataset is more difficult than other video domain adaptation datasets due to multi-tasks with more modalities. Firstly, to participate in the challenge, we employ a transformer to capture the spatial information from each modalit…

    Submitted 9 August, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: This paper is not good enough for publication--no need to occupy resources here

  28. arXiv:2104.03465  [pdf, other]

    cs.CL

    Nutribullets Hybrid: Multi-document Health Summarization

    Authors: Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay

    Abstract: We present a method for generating comparative summaries that highlights similarities and contradictions in input documents. The key challenge in creating such summaries is the lack of large parallel training data required for training typical summarization systems. To this end, we introduce a hybrid generation approach inspired by traditional concept-to-text systems. To enable accurate comparison…

    Submitted 7 April, 2021; originally announced April 2021.

    Comments: NAACL 2021 Camera Ready

  29. arXiv:2103.11921  [pdf, other]

    cs.CL

    Nutri-bullets: Summarizing Health Studies by Composing Segments

    Authors: Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay

    Abstract: We introduce \emph{Nutri-bullets}, a multi-document summarization task for health and nutrition. First, we present two datasets of food and health summaries from multiple scientific studies. Furthermore, we propose a novel \emph{extract-compose} model to solve the problem in the regime of limited parallel data. We explicitly select key spans from several abstracts using a policy network, followed…

    Submitted 22 March, 2021; originally announced March 2021.

    Comments: 12 pages

    Journal ref: AAAI 2021 Camera Ready

  30. arXiv:2102.12459  [pdf, other]

    cs.CL cs.LG

    When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute

    Authors: Tao Lei

    Abstract: Large language models have become increasingly difficult to train because of the growing computation time and cost. In this work, we present SRU++, a highly-efficient architecture that combines fast recurrence and attention for sequence modeling. SRU++ exhibits strong modeling capacity and training efficiency. On standard language modeling tasks such as Enwik8, Wiki-103 and Billion Word datasets,…

    Submitted 14 September, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

    Journal ref: EMNLP 2021

  31. arXiv:2009.13120  [pdf, other]

    eess.IV cs.CV

    Medical Image Segmentation Using Deep Learning: A Survey

    Authors: Risheng Wang, Tao Lei, Ruixia Cui, Bingtao Zhang, Hongying Meng, Asoke K. Nandi

    Abstract: Deep learning has been widely used for medical image segmentation and a large number of papers has been presented recording the success of deep learning in the field. In this paper, we present a comprehensive thematic survey on medical image segmentation using deep learning techniques. This paper makes two original contributions. Firstly, compared to traditional surveys that directly divide litera…

    Submitted 22 December, 2021; v1 submitted 28 September, 2020; originally announced September 2020.

  32. arXiv:2009.07253  [pdf, other]

    cs.CL cs.LG

    Autoregressive Knowledge Distillation through Imitation Learning

    Authors: Alexander Lin, Jeremy Wohlwend, Howard Chen, Tao Lei

    Abstract: The performance of autoregressive models on natural language generation tasks has dramatically improved due to the adoption of deep, self-attentive architectures. However, these gains have come at the cost of hindering inference speed, making state-of-the-art models cumbersome to deploy in real-world, time-sensitive settings. We develop a compression technique for autoregressive models that is dri…

    Submitted 28 October, 2020; v1 submitted 15 September, 2020; originally announced September 2020.

  33. arXiv:2005.13111  [pdf, other]

    cs.LG cs.CL stat.ML

    Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport

    Authors: Kyle Swanson, Lili Yu, Tao Lei

    Abstract: Selecting input features of top relevance has become a popular method for building self-explaining models. In this work, we extend this selective rationalization approach to text matching, where the goal is to jointly select and align text pieces, such as tokens or sentences, as a justification for the downstream prediction. Our approach employs optimal transport (OT) to find a minimal cost alignm…

    Submitted 26 May, 2020; originally announced May 2020.

    Comments: To appear at ACL 2020

  34. arXiv:2005.10469  [pdf, other]

    eess.AS cs.CL cs.SD

    ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition

    Authors: Jing Pan, Joshua Shapiro, Jeremy Wohlwend, Kyu J. Han, Tao Lei, Tao Ma

    Abstract: In this paper we present state-of-the-art (SOTA) performance on the LibriSpeech corpus with two novel neural network architectures, a multistream CNN for acoustic modeling and a self-attentive simple recurrent unit (SRU) for language modeling. In the hybrid ASR framework, the multistream CNN acoustic model processes an input of speech frames in multiple parallel pipelines where each stream has a u…

    Submitted 21 May, 2020; originally announced May 2020.

    Comments: Submitted to Interspeech 2020

  35. arXiv:1911.05033  [pdf]

    eess.IV cs.CV cs.MM

    Visual cryptography in single-pixel imaging

    Authors: Shuming Jiao, Jun Feng, Yang Gao, Ting Lei, Xiaocong Yuan

    Abstract: Two novel visual cryptography (VC) schemes are proposed by combining VC with single-pixel imaging (SPI) for the first time. It is pointed out that the overlapping of visual key images in VC is similar to the superposition of pixel intensities by a single-pixel detector in SPI. In the first scheme, QR-code VC is designed by using opaque sheets instead of transparent sheets. The secret image can be…

    Submitted 12 November, 2019; originally announced November 2019.

  36. arXiv:1911.03598  [pdf, other]

    cs.CL cs.HC cs.IR cs.LG

    Interactive Classification by Asking Informative Questions

    Authors: Lili Yu, Howard Chen, Sida Wang, Tao Lei, Yoav Artzi

    Abstract: We study the potential for interaction in natural language classification. We add a limited form of interaction for intent classification, where users provide an initial query using natural language, and the system asks for additional information using binary or multi-choice questions. At each turn, our system decides between asking the most informative question or making the final classification…

    Submitted 3 May, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Accepted at ACL 2020

  37. arXiv:1911.01026  [pdf, other]

    cs.LG cs.CL cs.IR stat.ML

    Metric Learning for Dynamic Text Classification

    Authors: Jeremy Wohlwend, Ethan R. Elenberg, Samuel Altschul, Shawn Henry, Tao Lei

    Abstract: Traditional text classifiers are limited to predicting over a fixed set of labels. However, in many real-world applications the label set is frequently changing. For example, in intent classification, new intents may be added over time while others are removed. We propose to address the problem of dynamic text classification by replacing the traditional, fixed-size output layer with a learned, sem…

    Submitted 3 November, 2019; originally announced November 2019.

  38. arXiv:1911.00353  [pdf]

    cs.CV eess.IV

    Does deep learning always outperform simple linear regression in optical imaging?

    Authors: Shuming Jiao, Yang Gao, Jun Feng, Ting Lei, Xiaocong Yuan

    Abstract: Deep learning has been extensively applied in many optical imaging applications in recent years. Despite the success, the limitations and drawbacks of deep learning in optical imaging have been seldom investigated. In this work, we show that conventional linear-regression-based methods can outperform the previously proposed deep learning approaches for two black-box optical imaging problems in som…

    Submitted 17 January, 2020; v1 submitted 31 October, 2019; originally announced November 2019.

  39. arXiv:1910.11222  [pdf]

    eess.IV cs.CV

    Data hiding in complex-amplitude modulation using a digital micromirror device

    Authors: Shuming Jiao, Dongfang Zhang, Chonglei Zhang, Yang Gao, Ting Lei, Xiaocong Yuan

    Abstract: A digital micromirror device (DMD) is an amplitude-type spatial light modulator. However, a complex-amplitude light modulation with a DMD can be achieved using the superpixel scheme. In the superpixel scheme, we notice that multiple different DMD local block patterns may correspond to the same complex superpixel value. Based on this inherent encoding redundancy, a large amount of external data can…

    Submitted 24 October, 2019; originally announced October 2019.

  40. Structured Pruning of Large Language Models

    Authors: Ziheng Wang, Jeremy Wohlwend, Tao Lei

    Abstract: Large language models have recently achieved state of the art performance across a wide variety of natural language tasks. Meanwhile, the size of these models and their latency have significantly increased, which makes their usage costly, and raises an interesting question: do language models need to be large? We study this question through the lens of model compression. We present a generic, stru…

    Submitted 28 March, 2021; v1 submitted 10 October, 2019; originally announced October 2019.

  41. arXiv:1906.03209  [pdf, other]

    cs.CL

    Building a Production Model for Retrieval-Based Chatbots

    Authors: Kyle Swanson, Lili Yu, Christopher Fox, Jeremy Wohlwend, Tao Lei

    Abstract: Response suggestion is an important task for building human-computer conversation systems. Recent approaches to conversation modeling have introduced new model architectures with impressive results, but relatively little attention has been paid to whether these models would be practical in a production setting. In this paper, we describe the unique challenges of building a production retrieval-bas…

    Submitted 1 August, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

  42. arXiv:1905.13594  [pdf]

    eess.IV cs.CR cs.CV

    Known-plaintext attack and ciphertext-only attack for encrypted single-pixel imaging

    Authors: Shuming Jiao, Yang Gao, Ting Lei, Zhenwei Xie, Xiaocong Yuan

    Abstract: In many previous works, a single-pixel imaging (SPI) system is constructed as an optical image encryption system. Unauthorized users are not able to reconstruct the plaintext image from the ciphertext intensity sequence without knowing the illumination pattern key. However, little cryptanalysis about encrypted SPI has been investigated in the past. In this work, we propose a known-plaintext attack…

    Submitted 31 May, 2019; originally announced May 2019.

  43. Optical machine learning with incoherent light and a single-pixel detector

    Authors: Shuming Jiao, Jun Feng, Yang Gao, Ting Lei, Zhenwei Xie, Xiaocong Yuan

    Abstract: An optical diffractive neural network (DNN) can be implemented with a cascaded phase mask architecture. Like an optical computer, the system can perform machine learning tasks such as number digit recognition in an all-optical manner. However, the system can only work under coherent light illumination and the precision requirement in practical experiments is quite high. This paper proposes an opti…

    Submitted 24 November, 2019; v1 submitted 24 April, 2019; originally announced April 2019.

    Journal ref: Optics Letters 44(21) 2019

  44. Adaptive Morphological Reconstruction for Seeded Image Segmentation

    Authors: Tao Lei, Xiaohong Jia, Tongliang Liu, Shigang Liu, Hongying Meng, Asoke K. Nandi

    Abstract: Morphological reconstruction (MR) is often employed by seeded image segmentation algorithms such as watershed transform and power watershed as it is able to filter seeds (regional minima) to reduce over-segmentation. However, MR might mistakenly filter meaningful seeds that are required for generating accurate segmentation and it is also sensitive to the scale because a single-scale structuring el…

    Submitted 8 April, 2019; originally announced April 2019.

  45. arXiv:1903.04709  [pdf, other]

    cs.DC cs.NI

    Service Capacity Enhanced Task Offloading and Resource Allocation in Multi-Server Edge Computing Environment

    Authors: Wei Du, Tao Lei, Qiang He, Wei Liu, Qiwang Lei, Hailiang Zhao, Wei Wang

    Abstract: An edge computing environment features multiple edge servers and multiple service clients. In this environment, mobile service providers can offload client-side computation tasks from service clients' devices onto edge servers to reduce service latency and power consumption experienced by the clients. A critical issue that has yet to be properly addressed is how to allocate edge computing resource…

    Submitted 11 March, 2019; originally announced March 2019.

    Comments: This paper has been accepted by Early Submission Phase of ICWS2019

  46. Multiple-image encryption and hiding with an optical diffractive neural network

    Authors: Yang Gao, Shuming Jiao, Juncheng Fang, Ting Lei, Zhenwei Xie, Xiaocong Yuan

    Abstract: A cascaded phase-only mask architecture (or an optical diffractive neural network) can be employed for different optical information processing tasks such as pattern recognition, orbital angular momentum (OAM) mode conversion, image salience detection and image encryption. However, for optical encryption and watermarking applications, such a system usually cannot process multiple pairs of input im…

    Submitted 10 February, 2020; v1 submitted 21 February, 2019; originally announced February 2019.

  47. arXiv:1809.02255  [pdf, other]

    cs.CL

    Adversarial Domain Adaptation for Duplicate Question Detection

    Authors: Darsh J Shah, Tao Lei, Alessandro Moschitti, Salvatore Romeo, Preslav Nakov

    Abstract: We address the problem of detecting duplicate questions in forums, which is an important step towards automating the process of answering new questions. As finding and annotating such potential duplicates manually is very tedious and costly, automatic methods based on machine learning are a viable alternative. However, many forums do not have annotated data, i.e., questions labeled by experts as d…

    Submitted 6 September, 2018; originally announced September 2018.

    Comments: EMNLP 2018 short paper - camera ready. 8 pages

  48. arXiv:1806.01340  [pdf]

    eess.IV cs.CV

    Design of optimal illumination patterns in single-pixel imaging using image dictionaries

    Authors: Jun Feng, Shuming Jiao, Yang Gao, Ting Lei, Xiaocong Yuan

    Abstract: Single-pixel imaging (SPI) has a major drawback that many sequential illuminations are required for capturing one single image with long acquisition time. Basis illumination patterns such as Fourier patterns and Hadamard patterns can achieve much better imaging efficiency than random patterns. But the performance is still sub-optimal since the basis patterns are fixed and non-adaptive for varying…

    Submitted 17 January, 2020; v1 submitted 4 June, 2018; originally announced June 2018.

  49. arXiv:1709.02755  [pdf, other]

    cs.CL cs.NE

    Simple Recurrent Units for Highly Parallelizable Recurrence

    Authors: Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, Yoav Artzi

    Abstract: Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit (SRU), a light recurrent unit that balances model capacity and scalability. SRU is designed to provide expressive recurrence, enable highly parallelized implementation, and comes with careful initialization to facilitate tr…

    Submitted 7 September, 2018; v1 submitted 8 September, 2017; originally announced September 2017.

    Comments: EMNLP

  50. arXiv:1705.09655  [pdf, other]

    cs.CL cs.LG

    Style Transfer from Non-Parallel Text by Cross-Alignment

    Authors: Tianxiao Shen, Tao Lei, Regina Barzilay, Tommi Jaakkola

    Abstract: This paper focuses on style transfer on the basis of non-parallel text. This is an instance of a broad family of problems including machine translation, decipherment, and sentiment modification. The key challenge is to separate the content from other aspects such as style. We assume a shared latent content distribution across different text corpora, and propose a method that leverages refined alig…

    Submitted 6 November, 2017; v1 submitted 26 May, 2017; originally announced May 2017.

    Comments: NIPS 2017 camera-ready. Added human evaluation on sentiment transfer