Showing 1–50 of 66 results for author: Li, O

Searching in archive cs.
  1. arXiv:2406.19435  [pdf, other]

    cs.CV

    A Sanity Check for AI-generated Image Detection

    Authors: Shilin Yan, Ouxiang Li, Jiayin Cai, Yanbin Hao, Xiaolong Jiang, Yao Hu, Weidi Xie

    Abstract: With the rapid development of generative models, discerning AI-generated content has evoked increasing attention from both industry and academia. In this paper, we conduct a sanity check on "whether the task of AI-generated image detection has been solved". To start with, we present the Chameleon dataset, consisting of AI-generated images that are genuinely challenging for human perception. To quantify th…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Project page: https://shilinyan99.github.io/AIDE Code: https://github.com/shilinyan99/AIDE

  2. arXiv:2406.17660  [pdf, other]

    cs.LG

    Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients

    Authors: Aashiq Muhamed, Oscar Li, David Woodruff, Mona Diab, Virginia Smith

    Abstract: Large language model (LLM) training and finetuning are often bottlenecked by limited GPU memory. While existing projection-based optimization methods address this by projecting gradients into a lower-dimensional subspace to reduce optimizer state memory, they typically rely on dense projection matrices, which can introduce computational and memory overheads. In this work, we propose Grass (GRAdien… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  3. arXiv:2402.14547  [pdf, other]

    cs.LG cs.AI cs.CL cs.DB

    OmniPred: Language Models as Universal Regressors

    Authors: Xingyou Song, Oscar Li, Chansoo Lee, Bangding Yang, Daiyi Peng, Sagi Perel, Yutian Chen

    Abstract: Over the broad landscape of experimental design, regression has been a powerful tool to accurately predict the outcome metrics of a system or model given a set of parameters, but has been traditionally restricted to methods which are only applicable to a specific task. In this paper, we propose OmniPred, a framework for training language models as universal end-to-end regressors over $(x,y)$ evalu… ▽ More

    Submitted 4 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: 24 pages, 10 figures. Code can be found in https://github.com/google-research/optformer/tree/main/optformer/omnipred

  4. arXiv:2310.14563  [pdf, other]

    cs.CL cs.CY

    NormDial: A Comparable Bilingual Synthetic Dialog Dataset for Modeling Social Norm Adherence and Violation

    Authors: Oliver Li, Mallika Subramanian, Arkadiy Saakyan, Sky CH-Wang, Smaranda Muresan

    Abstract: Social norms fundamentally shape interpersonal communication. We present NormDial, a high-quality dyadic dialogue dataset with turn-by-turn annotations of social norm adherences and violations for Chinese and American cultures. Introducing the task of social norm observance detection, our dataset is synthetically generated in both Chinese and English using a human-in-the-loop pipeline by prompting… ▽ More

    Submitted 24 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Main Conference, Short Paper; Data at https://github.com/Aochong-Li/NormDial

  5. arXiv:2305.14492  [pdf, other]

    cs.CL

    Sociocultural Norm Similarities and Differences via Situational Alignment and Explainable Textual Entailment

    Authors: Sky CH-Wang, Arkadiy Saakyan, Oliver Li, Zhou Yu, Smaranda Muresan

    Abstract: Designing systems that can reason across cultures requires that they are grounded in the norms of the contexts in which they operate. However, current research on developing computational models of social norms has primarily focused on American society. Here, we propose a novel approach to discover and compare descriptive social norms across Chinese and American cultures. We demonstrate our approa…

    Submitted 22 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 Main Conference (Long Paper)

  6. arXiv:2304.12180  [pdf, other]

    cs.NE cs.AI cs.LG

    Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies

    Authors: Oscar Li, James Harrison, Jascha Sohl-Dickstein, Virginia Smith, Luke Metz

    Abstract: Unrolled computation graphs are prevalent throughout machine learning but present challenges to automatic differentiation (AD) gradient estimation methods when their loss functions exhibit extreme local sensitivity, discontinuity, or blackbox characteristics. In such scenarios, online evolution strategies methods are a more capable alternative, while being more parallelizable than vanilla evolutio…

    Submitted 9 December, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023. 41 pages. Code available at https://github.com/OscarcarLi/Noise-Reuse-Evolution-Strategies

  7. arXiv:2304.11938  [pdf, other]

    cs.SE cs.AI

    Is ChatGPT the Ultimate Programming Assistant -- How far is it?

    Authors: Haoye Tian, Weiqi Lu, Tsz On Li, Xunzhu Tang, Shing-Chi Cheung, Jacques Klein, Tegawendé F. Bissyandé

    Abstract: Recently, the ChatGPT LLM has received great attention: it can be used as a bot for discussing source code, prompting it to suggest changes, provide descriptions or even generate code. Typical demonstrations generally focus on existing benchmarks, which may have been used in model training (i.e., data leakage). To assess the feasibility of using an LLM as a useful assistant bot for programmers, we… ▽ More

    Submitted 31 August, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

  8. arXiv:2303.08698  [pdf, other]

    cs.CV cs.AI

    Bi-directional Distribution Alignment for Transductive Zero-Shot Learning

    Authors: Zhicai Wang, Yanbin Hao, Tingting Mu, Ouxiang Li, Shuo Wang, Xiangnan He

    Abstract: It is well-known that zero-shot learning (ZSL) can suffer severely from the problem of domain shift, where the true and learned data distributions for the unseen classes do not match. Although transductive ZSL (TZSL) attempts to improve this by allowing the use of unlabelled examples from the unseen classes, there is still a high level of distribution shift. We propose a novel TZSL model (named as… ▽ More

    Submitted 19 March, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: CVPR2023

  9. arXiv:2210.09396  [pdf, other]

    cs.CL cs.AI cs.CY cs.SD eess.AS

    Affective Idiosyncratic Responses to Music

    Authors: Sky CH-Wang, Evan Li, Oliver Li, Smaranda Muresan, Zhou Yu

    Abstract: Affective responses to music are highly personal. Despite consensus that idiosyncratic factors play a key role in regulating how listeners emotionally respond to music, precisely measuring the marginal effects of these variables has proved challenging. To address this gap, we develop computational methods to measure affective responses to music from over 403M listener comments on a Chinese social… ▽ More

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022 Main Conference; see Github https://github.com/skychwang/music-emotions

  10. arXiv:2208.01508  [pdf, other]

    cs.SE cs.AI

    COMET: Coverage-guided Model Generation For Deep Learning Library Testing

    Authors: Meiziniu Li, Jialun Cao, Yongqiang Tian, Tsz On Li, Ming Wen, Shing-Chi Cheung

    Abstract: Recent deep learning (DL) applications are mostly built on top of DL libraries. The quality assurance of these libraries is critical to the dependable deployment of DL applications. Techniques have been proposed to generate various DL models and apply them to test these libraries. However, their test effectiveness is constrained by the diversity of layer API calls in their generated DL models. Our… ▽ More

    Submitted 30 January, 2023; v1 submitted 2 August, 2022; originally announced August 2022.

    Comments: 34 pages, 12 figures

    ACM Class: D.2.5; I.2.5

  11. SQRQuerier: A Visual Querying Framework for Cross-national Survey Data Recycling

    Authors: Yamei Tu, Olga Li, Junpeng Wang, Han-Wei Shen, Przemek Powalko, Irina Tomescu-Dubrow, Kazimierz M. Slomczynski, Spyros Blanas, J. Craig Jenkins

    Abstract: Public opinion surveys constitute a powerful tool to study peoples' attitudes and behaviors in comparative perspectives. However, even worldwide surveys provide only partial geographic and time coverage, which hinders comprehensive knowledge production. To broaden the scope of comparison, social scientists turn to ex-post harmonization of variables from datasets that cover similar topics but in di… ▽ More

    Submitted 25 January, 2022; originally announced January 2022.

    Journal ref: IEEE Transactions on Visualization and Computer Graphics Volume: 29, Issue: 6, 01 June 2023 pgs. 2862-2874

  12. arXiv:2109.05797  [pdf, ps, other]

    cs.CL

    Show Me How To Revise: Improving Lexically Constrained Sentence Generation with XLNet

    Authors: Xingwei He, Victor O. K. Li

    Abstract: Lexically constrained sentence generation allows the incorporation of prior knowledge such as lexical constraints into the output. This technique has been applied to machine translation, and dialog response generation. Previous work usually used Markov Chain Monte Carlo (MCMC) sampling to generate lexically constrained sentences, but they randomly determined the position to be edited and the actio… ▽ More

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted by AAAI 2021

  13. arXiv:2103.14587  [pdf, other]

    cs.LG cs.CY

    Deep-AIR: A Hybrid CNN-LSTM Framework for Air Quality Modeling in Metropolitan Cities

    Authors: Yang Han, Qi Zhang, Victor O. K. Li, Jacqueline C. K. Lam

    Abstract: Air pollution has long been a serious environmental health challenge, especially in metropolitan cities, where air pollutant concentrations are exacerbated by the street canyon effect and high building density. Whilst accurately monitoring and forecasting air pollution are highly crucial, existing data-driven models fail to fully address the complex interaction between air pollution and urban dyna… ▽ More

    Submitted 25 March, 2021; originally announced March 2021.

  14. arXiv:2103.12910  [pdf, other]

    cs.HC

    AQEyes: Visual Analytics for Anomaly Detection and Examination of Air Quality Data

    Authors: Dongyu Liu, Kalyan Veeramachaneni, Alexander Geiger, Victor O. K. Li, Huamin Qu

    Abstract: Anomaly detection plays a key role in air quality analysis by enhancing situational awareness and alerting users to potential hazards. However, existing anomaly detection approaches for air quality analysis have their own limitations regarding parameter selection (e.g., need for extensive domain knowledge), computational expense, general applicability (e.g., require labeled data), interpretability… ▽ More

    Submitted 23 March, 2021; originally announced March 2021.

    Comments: 11 pages, 6 figures

  15. arXiv:2102.11503  [pdf, other]

    cs.LG

    Two Sides of Meta-Learning Evaluation: In vs. Out of Distribution

    Authors: Amrith Setlur, Oscar Li, Virginia Smith

    Abstract: We categorize meta-learning evaluation into two settings: $\textit{in-distribution}$ [ID], in which the train and test tasks are sampled $\textit{iid}$ from the same underlying task distribution, and $\textit{out-of-distribution}$ [OOD], in which they are not. While most meta-learning theory and some FSL applications follow the ID setting, we identify that most existing few-shot classification ben… ▽ More

    Submitted 27 October, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

  16. arXiv:2102.08504  [pdf, other]

    cs.LG cs.CR

    Label Leakage and Protection in Two-party Split Learning

    Authors: Oscar Li, Jiankai Sun, Xin Yang, Weihao Gao, Hongyi Zhang, Junyuan Xie, Virginia Smith, Chong Wang

    Abstract: Two-party split learning is a popular technique for learning a model across feature-partitioned data. In this work, we explore whether it is possible for one party to steal the private label information from the other party during split training, and whether there are methods that can protect against such attacks. Specifically, we first formulate a realistic threat model and propose a privacy loss… ▽ More

    Submitted 24 May, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: Accepted to ICLR 2022 (https://openreview.net/forum?id=cOtBRgsf2fO)

  17. arXiv:2011.14048  [pdf, other]

    cs.LG stat.ML

    Is Support Set Diversity Necessary for Meta-Learning?

    Authors: Amrith Setlur, Oscar Li, Virginia Smith

    Abstract: Meta-learning is a popular framework for learning with limited data in which an algorithm is produced by training over multiple few-shot learning tasks. For classification problems, these tasks are typically constructed by sampling a small number of support and query examples from a subset of the classes. While conventional wisdom is that task diversity should improve the performance of meta-learn… ▽ More

    Submitted 7 October, 2021; v1 submitted 27 November, 2020; originally announced November 2020.

    Journal ref: NeurIPS 2020 Workshop on Meta-learning

  18. arXiv:2010.02646  [pdf, other]

    cs.CL

    On the Sparsity of Neural Machine Translation Models

    Authors: Yong Wang, Longyue Wang, Victor O. K. Li, Zhaopeng Tu

    Abstract: Modern neural machine translation (NMT) models employ a large number of parameters, which leads to serious over-parameterization and typically causes the underutilization of computational resources. In response to this problem, we empirically investigate whether the redundant parameters can be reused to achieve better performance. Experiments and analyses are systematically conducted on different… ▽ More

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  19. arXiv:2004.09681  [pdf, other]

    cs.CV

    Facial Action Unit Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution

    Authors: Yingruo Fan, Jacqueline C. K. Lam, Victor O. K. Li

    Abstract: The intensity estimation of facial action units (AUs) is challenging due to subtle changes in the person's facial appearance. Previous approaches mainly rely on probabilistic models or predefined rules for modeling co-occurrence relationships among AUs, leading to limited generalization. In contrast, we present a new learning framework that automatically learns the latent relationships of AUs via… ▽ More

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: Accepted at AAAI2020

  20. arXiv:1911.09912  [pdf, other]

    cs.CL

    Go From the General to the Particular: Multi-Domain Translation with Domain Transformation Networks

    Authors: Yong Wang, Longyue Wang, Shuming Shi, Victor O. K. Li, Zhaopeng Tu

    Abstract: The key challenge of multi-domain translation lies in simultaneously encoding both the general knowledge shared across domains and the particular knowledge distinctive to each domain in a unified model. Previous work shows that the standard neural machine translation (NMT) model, trained on mixed-domain data, generally captures the general knowledge, but misses the domain-specific knowledge. In re… ▽ More

    Submitted 22 November, 2019; originally announced November 2019.

    Comments: AAAI 2020

  21. Max-min Fairness of K-user Cooperative Rate-Splitting in MISO Broadcast Channel with User Relaying

    Authors: Yijie Mao, Bruno Clerckx, Jian Zhang, Victor O. K. Li, Mohammed Arafah

    Abstract: Cooperative Rate-Splitting (CRS) strategy, relying on linearly precoded rate-splitting at the transmitter and opportunistic transmission of the common message by the relaying user, has recently been shown to outperform typical Non-cooperative Rate-Splitting (NRS), Cooperative Non-Orthogonal Multiple Access (C-NOMA) and Space Division Multiple Access (SDMA) in a two-user Multiple Input Single Outpu… ▽ More

    Submitted 5 August, 2020; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: accepted by IEEE Transactions on Wireless Communications

  22. arXiv:1906.10651  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Interpretable Image Recognition with Hierarchical Prototypes

    Authors: Peter Hase, Chaofan Chen, Oscar Li, Cynthia Rudin

    Abstract: Vision models are interpretable when they classify objects on the basis of features that a person can directly understand. Recently, methods relying on visual feature prototypes have been developed for this purpose. However, in contrast to how humans categorize objects, these approaches have not yet made use of any taxonomical organization of class labels. With such an approach, for instance, we m… ▽ More

    Submitted 24 August, 2019; v1 submitted 25 June, 2019; originally announced June 2019.

    Comments: Published as a full paper at HCOMP 2019

  23. arXiv:1906.01181  [pdf, other]

    cs.CL

    Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

    Authors: Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O. K. Li

    Abstract: Zero-shot translation, translating between language pairs on which a Neural Machine Translation (NMT) system has never been trained, is an emergent property when training the system in multilingual settings. However, naive training for zero-shot NMT easily fails, and is sensitive to hyper-parameter setting. The performance typically lags far behind the more conventional pivot-based approach which… ▽ More

    Submitted 3 June, 2019; originally announced June 2019.

    Comments: Accepted by ACL 2019

  24. arXiv:1905.01422  [pdf, other]

    cs.LG math.OC stat.ML

    An Adaptive Remote Stochastic Gradient Method for Training Neural Networks

    Authors: Yushu Chen, Hao Jing, Wenlai Zhao, Zhiqiang Liu, Ouyi Li, Liang Qiao, Wei Xue, Guangwen Yang

    Abstract: We present the remote stochastic gradient (RSG) method, which computes the gradients at configurable remote observation points, in order to improve the convergence rate and suppress gradient noise at the same time for different curvatures. RSG is further combined with adaptive methods to construct ARSG for acceleration. The method is efficient in computation and memory, and is straightforward to i… ▽ More

    Submitted 6 September, 2020; v1 submitted 3 May, 2019; originally announced May 2019.

    Comments: The generalization is improved by modifying the preconditioner. For training ResNet-50 on ImageNet, ARSG outperforms ADAM in convergence speed and meanwhile it surpasses SGD in generalization. We also present a convergence bound in non-convex settings

  25. arXiv:1902.07851  [pdf, ps, other]

    cs.IT

    Rate-Splitting for Multi-User Multi-Antenna Wireless Information and Power Transfer

    Authors: Yijie Mao, Bruno Clerckx, Victor O. K. Li

    Abstract: In a multi-user multi-antenna Simultaneous Wireless Information and Power Transfer (SWIPT) network, the transmitter sends information to the Information Receivers (IRs) and energy to Energy Receivers (ERs) concurrently. A conventional approach is based on Multi-User Linear Precoding (MU--LP) where each IR directly decodes the intended stream by fully treating the interference from other IRs and ER… ▽ More

    Submitted 2 July, 2019; v1 submitted 20 February, 2019; originally announced February 2019.

    Comments: 5 pages, 3 figures. This is the latest version. The typos in the version accepted by SPAWC 2019 have been corrected

  26. arXiv:1808.08437  [pdf, other]

    cs.CL cs.LG

    Meta-Learning for Low-Resource Neural Machine Translation

    Authors: Jiatao Gu, Yong Wang, Yun Chen, Kyunghyun Cho, Victor O. K. Li

    Abstract: In this paper, we propose to extend the recently introduced model-agnostic meta-learning algorithm (MAML) for low-resource neural machine translation (NMT). We frame low-resource translation as a meta-learning problem, and we learn to adapt to low-resource languages based on multilingual high-resource language tasks. We use the universal lexical representation (Gu et al., 2018) to overcome t…

    Submitted 25 August, 2018; originally announced August 2018.

    Comments: Accepted as a full paper at EMNLP 2018

  27. arXiv:1808.08325  [pdf, ps, other]

    cs.IT

    Rate-Splitting for Multi-Antenna Non-Orthogonal Unicast and Multicast Transmission: Spectral and Energy Efficiency Analysis

    Authors: Yijie Mao, Bruno Clerckx, Victor O. K. Li

    Abstract: In a Non-Orthogonal Unicast and Multicast (NOUM) transmission system, a multicast stream intended to all the receivers is superimposed in the power domain on the unicast streams. One layer of Successive Interference Cancellation (SIC) is required at each receiver to remove the multicast stream before decoding its intended unicast stream. In this paper, we first show that a linearly-precoded 1-laye… ▽ More

    Submitted 19 September, 2019; v1 submitted 24 August, 2018; originally announced August 2018.

    Comments: Accepted by IEEE Transactions on Communications

  28. arXiv:1808.02252  [pdf, other]

    cs.DC

    Efficient and DoS-resistant Consensus for Permissioned Blockchains

    Authors: Xusheng Chen, Shixiong Zhao, Ji Qi, Jianyu Jiang, Haoze Song, Cheng Wang, Tsz On Li, T. -H. Hubert Chan, Fengwei Zhang, Xiapu Luo, Sen Wang, Gong Zhang, Heming Cui

    Abstract: Existing permissioned blockchain systems designate a fixed and explicit group of committee nodes to run a consensus protocol that confirms the same sequence of blocks among all nodes. Unfortunately, when such a permissioned blockchain runs in a large scale on the Internet, these explicit committee nodes can be easily turned down by denial-of-service (DoS) or network partition attacks. Although wor… ▽ More

    Submitted 14 December, 2020; v1 submitted 7 August, 2018; originally announced August 2018.

  29. arXiv:1807.10575  [pdf]

    cs.CV cs.HC

    Multi-Region Ensemble Convolutional Neural Network for Facial Expression Recognition

    Authors: Yingruo Fan, Jacqueline C. K. Lam, Victor O. K. Li

    Abstract: Facial expressions play an important role in conveying the emotional states of human beings. Recently, deep learning approaches have been applied to the image recognition field due to the discriminative power of Convolutional Neural Networks (CNNs). In this paper, we first propose a novel Multi-Region Ensemble CNN (MRE-CNN) framework for facial expression recognition, which aims to enhance the learning…

    Submitted 11 July, 2018; originally announced July 2018.

    Comments: 10 pages, 5 figures, Accepted by ICANN 2018

  30. arXiv:1807.02872  [pdf, other]

    cs.LG stat.ML

    Large Margin Few-Shot Learning

    Authors: Yong Wang, Xiao-Ming Wu, Qimai Li, Jiatao Gu, Wangmeng Xiang, Lei Zhang, Victor O. K. Li

    Abstract: The key issue of few-shot learning is learning to generalize. This paper proposes a large margin principle to improve the generalization capacity of metric based methods for few-shot learning. To realize it, we develop a unified framework to learn a more discriminative metric space by augmenting the classification loss function with a large margin distance loss function for training. Extensive exp… ▽ More

    Submitted 21 September, 2018; v1 submitted 8 July, 2018; originally announced July 2018.

    Comments: 17 pages, 5 figures, 7 tables

  31. arXiv:1806.10574  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    This Looks Like That: Deep Learning for Interpretable Image Recognition

    Authors: Chaofan Chen, Oscar Li, Chaofan Tao, Alina Jade Barnett, Jonathan Su, Cynthia Rudin

    Abstract: When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image, and pointing out prototypical aspects of one class or another. The mounting evidence for each of the classes helps us make our final decision. In this work, we introduce a deep network architecture -- prototypical part network (ProtoPNet), that reasons in a similar way: the networ… ▽ More

    Submitted 28 December, 2019; v1 submitted 27 June, 2018; originally announced June 2018.

    Comments: Chaofan Chen and Oscar Li contributed equally to this work. This work has been accepted for spotlight presentation (top 3% of papers) at NeurIPS 2019

    Journal ref: Advances in Neural Information Processing Systems 32 (NeurIPS 2019)

  32. arXiv:1804.10516  [pdf, ps, other]

    cs.IT

    Rate-Splitting Multiple Access for Coordinated Multi-Point Joint Transmission

    Authors: Yijie Mao, Bruno Clerckx, Victor O. K. Li

    Abstract: As a promising downlink multiple access scheme, Rate-Splitting Multiple Access (RSMA) has been shown to achieve superior spectral and energy efficiencies compared with Space-Division Multiple Access (SDMA) and Non-Orthogonal Multiple Access (NOMA) in downlink single-cell systems. By relying on linearly precoded rate-splitting at the transmitter and successive interference cancellation at the recei… ▽ More

    Submitted 16 January, 2019; v1 submitted 27 April, 2018; originally announced April 2018.

    Comments: 6 pages, 6 figures

  33. Energy Efficiency of Rate-Splitting Multiple Access, and Performance Benefits over SDMA and NOMA

    Authors: Yijie Mao, Bruno Clerckx, Victor O. K. Li

    Abstract: Rate-Splitting Multiple Access (RSMA) is a general and powerful multiple access framework for downlink multi-antenna systems, and contains Space-Division Multiple Access (SDMA) and Non-Orthogonal Multiple Access (NOMA) as special cases. RSMA relies on linearly precoded rate-splitting with Successive Interference Cancellation (SIC) to decode part of the interference and treat the remaining part of… ▽ More

    Submitted 21 November, 2018; v1 submitted 23 April, 2018; originally announced April 2018.

    Comments: 6 pages, 5 figures

    Journal ref: 2018 15th International Symposium on Wireless Communication Systems (ISWCS), Lisbon, 2018

  34. arXiv:1804.07915  [pdf, other]

    cs.CL

    A Stable and Effective Learning Strategy for Trainable Greedy Decoding

    Authors: Yun Chen, Victor O. K. Li, Kyunghyun Cho, Samuel R. Bowman

    Abstract: Beam search is a widely used approximate search strategy for neural network decoders, and it generally outperforms simple greedy decoding on tasks like machine translation. However, this improvement comes at substantial computational cost. In this paper, we propose a flexible new method that allows us to reap nearly the full benefits of beam search with nearly no additional computational cost. The… ▽ More

    Submitted 27 August, 2018; v1 submitted 21 April, 2018; originally announced April 2018.

    Comments: Accepted by EMNLP 2018

  35. Rate-Splitting for Multi-Antenna Non-Orthogonal Unicast and Multicast Transmission

    Authors: Yijie Mao, Bruno Clerckx, Victor O. K. Li

    Abstract: In a superimposed unicast and multicast transmission system, one layer of Successive Interference Cancellation (SIC) is required at each receiver to remove the multicast stream before decoding the unicast stream. In this paper, we show that a linearly-precoded Rate-Splitting (RS) strategy at the transmitter can efficiently exploit this existing SIC receiver architecture. By splitting the unicast m… ▽ More

    Submitted 16 February, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

    Comments: arXiv admin note: text overlap with arXiv:1710.11018

    Journal ref: 2018 IEEE 19th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Kalamata, 2018, pp. 1-5

  36. arXiv:1802.05368  [pdf, other]

    cs.CL

    Universal Neural Machine Translation for Extremely Low Resource Languages

    Authors: Jiatao Gu, Hany Hassan, Jacob Devlin, Victor O. K. Li

    Abstract: In this paper, we propose a new universal machine translation approach focusing on languages with a limited amount of parallel data. Our proposed approach utilizes a transfer-learning approach to share lexical and sentence level representations across multiple source languages into one target language. The lexical part is shared through a Universal Lexical Representation to support multilingual wo… ▽ More

    Submitted 16 April, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

    Comments: NAACL-HLT 2018

  37. arXiv:1802.03116  [pdf, other]

    cs.CL

    Zero-Resource Neural Machine Translation with Multi-Agent Communication Game

    Authors: Yun Chen, Yang Liu, Victor O. K. Li

    Abstract: While end-to-end neural machine translation (NMT) has achieved notable success in the past years in translating a handful of resource-rich language pairs, it still suffers from the data scarcity problem for low-resource language pairs and domains. To tackle this problem, we propose an interactive multimodal framework for zero-resource neural machine translation. Instead of being passively exposed… ▽ More

    Submitted 8 February, 2018; originally announced February 2018.

    Comments: Published at AAAI-18

  38. arXiv:1711.07652  [pdf, other]

    eess.SY cs.CE

    A Unified Framework for Wide Area Measurement System Planning

    Authors: James J. Q. Yu, Albert Y. S. Lam, David J. Hill, Victor O. K. Li

    Abstract: Wide area measurement system (WAMS) is one of the essential components in the future power system. To make WAMS construction plans, practical models of the power network observability, reliability, and underlying communication infrastructures need to be considered. To address this challenging problem, in this paper we propose a unified framework for WAMS planning to cover most realistic concerns i… ▽ More

    Submitted 21 November, 2017; originally announced November 2017.

  39. Delay Aware Intelligent Transient Stability Assessment System

    Authors: James J. Q. Yu, Albert Y. S. Lam, David J. Hill, Victor O. K. Li

    Abstract: Transient stability assessment is a critical tool for power system design and operation. With the emerging advanced synchrophasor measurement techniques, machine learning methods are playing an increasingly important role in power system stability assessment. However, most existing research makes a strong assumption that the measurement data transmission delay is negligible. In this paper, we focu… ▽ More

    Submitted 21 November, 2017; originally announced November 2017.

  40. arXiv:1711.02281  [pdf, other]

    cs.CL cs.LG

    Non-Autoregressive Neural Machine Translation

    Authors: Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher

    Abstract: Existing approaches to neural machine translation condition each output word on previously generated outputs. We introduce a model that avoids this autoregressive property and produces its outputs in parallel, allowing an order of magnitude lower latency during inference. Through knowledge distillation, the use of input token fertilities as a latent variable, and policy gradient fine-tuning, we ac… ▽ More

    Submitted 8 March, 2018; v1 submitted 6 November, 2017; originally announced November 2017.

    Comments: Accepted by ICLR 2018

  41. Rate-Splitting Multiple Access for Downlink Communication Systems: Bridging, Generalizing and Outperforming SDMA and NOMA

    Authors: Yijie Mao, Bruno Clerckx, Victor O. K. Li

    Abstract: Space-Division Multiple Access (SDMA) utilizes linear precoding to separate users in the spatial domain and relies on fully treating any residual multi-user interference as noise. Non-Orthogonal Multiple Access (NOMA) uses linearly precoded superposition coding with successive interference cancellation (SIC) and relies on user grouping and ordering to enforce some users to fully decode and cancel…

    Submitted 17 April, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    Journal ref: EURASIP Journal on Wireless Communications and Networking, vol. 2018, no. 1, p. 133, May 2018

  42. arXiv:1710.04806  [pdf, other]

    cs.AI cs.LG stat.ML

    Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions

    Authors: Oscar Li, Hao Liu, Chaofan Chen, Cynthia Rudin

    Abstract: Deep neural networks are widely used for classification. These deep models often suffer from a lack of interpretability -- they are particularly difficult to understand because of their non-linear nature. As a result, neural networks are often treated as "black box" models, and in the past, have been trained purely to optimize the accuracy of predictions. In this work, we create a novel network ar…

    Submitted 21 November, 2017; v1 submitted 13 October, 2017; originally announced October 2017.

    Comments: The first two authors contributed equally, 8 pages, accepted in AAAI 2018

  43. arXiv:1706.07518  [pdf, other]

    cs.CL

    Neural Machine Translation with Gumbel-Greedy Decoding

    Authors: Jiatao Gu, Daniel Jiwoong Im, Victor O. K. Li

    Abstract: Previous neural machine translation models used some heuristic search algorithms (e.g., beam search) in order to avoid solving the maximum a posteriori problem over translation sentences at test time. In this paper, we propose the Gumbel-Greedy Decoding which trains a generative network to predict translation under a trained model. We solve such a problem using the Gumbel-Softmax reparameterizatio…

    Submitted 22 June, 2017; originally announced June 2017.

  44. arXiv:1705.07267  [pdf, other]

    cs.CL cs.AI cs.LG

    Search Engine Guided Non-Parametric Neural Machine Translation

    Authors: Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O. K. Li

    Abstract: In this paper, we extend an attention-based neural machine translation (NMT) model by allowing it to access an entire training set of parallel sentence pairs even after training. The proposed approach consists of two stages. In the first stage--retrieval stage--, an off-the-shelf, black-box search engine is used to retrieve a small subset of sentence pairs from a training set given a source senten…

    Submitted 8 March, 2018; v1 submitted 20 May, 2017; originally announced May 2017.

    Comments: Accepted by AAAI 2018

  45. arXiv:1705.00753  [pdf, other]

    cs.CL

    A Teacher-Student Framework for Zero-Resource Neural Machine Translation

    Authors: Yun Chen, Yang Liu, Yong Cheng, Victor O. K. Li

    Abstract: While end-to-end neural machine translation (NMT) has made remarkable progress recently, it still suffers from the data scarcity problem for low-resource language pairs and domains. In this paper, we propose a method for zero-resource NMT by assuming that parallel sentences have close probabilities of generating a sentence in a third language. Based on this assumption, our method is able to train…

    Submitted 1 May, 2017; originally announced May 2017.

    Comments: Accepted as a long paper by ACL 2017

  46. arXiv:1702.02429  [pdf, other]

    cs.CL cs.LG

    Trainable Greedy Decoding for Neural Machine Translation

    Authors: Jiatao Gu, Kyunghyun Cho, Victor O. K. Li

    Abstract: Recent research in neural machine translation has largely focused on two aspects; neural network architectures and end-to-end learning algorithms. The problem of decoding, however, has received relatively little attention from the research community. In this paper, we solely focus on the problem of decoding given a trained neural machine translation model. Instead of trying to build a new decoding…

    Submitted 8 February, 2017; originally announced February 2017.

    Comments: 10 pages

  47. arXiv:1612.05506  [pdf, other]

    cs.IT

    Cache-Enabled Heterogeneous Cellular Networks: Optimal Tier-Level Content Placement

    Authors: Juan Wen, Kaibin Huang, Sheng Yang, Victor O. K. Li

    Abstract: Caching popular contents at base stations (BSs) of a heterogeneous cellular network (HCN) avoids frequent information passage from content providers to the network edge, thereby reducing latency and alleviating traffic congestion in backhaul links. In general, the optimal strategies for content placement in HCNs remain largely unknown and deriving them forms the theme of this paper. To this end, w…

    Submitted 17 June, 2017; v1 submitted 16 December, 2016; originally announced December 2016.

    Comments: 15 pages, 7 figures

  48. arXiv:1610.07045  [pdf, other]

    cs.AI

    pg-Causality: Identifying Spatiotemporal Causal Pathways for Air Pollutants with Urban Big Data

    Authors: Julie Yixuan Zhu, Chao Zhang, Huichu Zhang, Shi Zhi, Victor O. K. Li, Jiawei Han, Yu Zheng

    Abstract: Many countries are suffering from severe air pollution. Understanding how different air pollutants accumulate and propagate is critical to making relevant public policies. In this paper, we use urban big data (air quality data and meteorological data) to identify the spatiotemporal (ST) causal pathways for air pollutants. This problem is challenging because: (1) there are numerous noisy and…

    Submitted 18 April, 2018; v1 submitted 22 October, 2016; originally announced October 2016.

  49. arXiv:1610.00388  [pdf, other]

    cs.CL cs.LG

    Learning to Translate in Real-time with Neural Machine Translation

    Authors: Jiatao Gu, Graham Neubig, Kyunghyun Cho, Victor O. K. Li

    Abstract: Translating in real-time, a.k.a. simultaneous translation, outputs translation words before the input sentence ends, which is a challenging problem for conventional machine translation methods. We propose a neural machine translation (NMT) framework for simultaneous translation in which an agent learns to make decisions on when to translate from the interaction with a pre-trained NMT environment.…

    Submitted 10 January, 2017; v1 submitted 2 October, 2016; originally announced October 2016.

    Comments: 10 pages, camera ready

  50. arXiv:1603.06393  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Incorporating Copying Mechanism in Sequence-to-Sequence Learning

    Authors: Jiatao Gu, Zhengdong Lu, Hang Li, Victor O. K. Li

    Abstract: We address an important problem in sequence-to-sequence (Seq2Seq) learning referred to as copying, in which certain segments in the input sequence are selectively replicated in the output sequence. A similar phenomenon is observable in human language communication. For example, humans tend to repeat entity names or even long phrases in conversation. The challenge with regard to copying in Seq2Seq…

    Submitted 8 June, 2016; v1 submitted 21 March, 2016; originally announced March 2016.

    Comments: 10 pages, 5 figures, accepted by ACL2016