Showing 1–49 of 49 results for author: Tuan, L

  1. arXiv:2407.03788

    cs.CV cs.CL

    Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning

    Authors: Thong Nguyen, Yi Bin, Xiaobao Wu, Xinshuai Dong, Zhiyuan Hu, Khoi Le, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

    Abstract: Data quality stands at the forefront of deciding the effectiveness of video-language representation learning. However, video-text pairs in previous data typically do not align perfectly with each other, which might lead to video-language representations that do not accurately reflect cross-modal semantics. Moreover, previous data also possess an uneven distribution of concepts, thereby hampering t…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  2. arXiv:2406.09717

    cs.CL

    UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages

    Authors: Trinh Pham, Khoi M. Le, Luu Anh Tuan

    Abstract: In this paper, we introduce UniBridge (Cross-Lingual Transfer Learning with Optimized Embeddings and Vocabulary), a comprehensive approach developed to improve the effectiveness of Cross-Lingual Transfer Learning, particularly in languages with limited resources. Our approach tackles two essential elements of a language model: the initialization of embeddings and the optimal vocabulary size. Speci…

    Submitted 17 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: First two authors contribute equally. Accepted at ACL 2024

  3. arXiv:2406.07301

    eess.SY

    Optimal Scheduling of Battery Storage Systems in the Swedish Multi-FCR Market Incorporating Battery Degradation and Technical Requirements

    Authors: Nima Mirzaei Alavijeh, Rahmat Khezri, Mohammadreza Mazidi, David Steen, Le Anh Tuan

    Abstract: This paper develops a novel mixed-integer linear programming (MILP) model for optimal participation of battery energy storage systems (BESSs) in the Swedish frequency containment reserve (FCR) markets. The developed model aims to maximize the battery owner's potential profit by considering battery degradation and participation in multiple FCR markets, i.e., FCR in normal operation (FCR-N), and FCR…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE Transactions on Power Systems

  4. arXiv:2406.06852

    cs.CR cs.AI cs.CL

    A Survey of Backdoor Attacks and Defenses on Large Language Models: Implications for Security Measures

    Authors: Shuai Zhao, Meihuizi Jia, Zhongliang Guo, Leilei Gan, Jie Fu, Yichao Feng, Fengjun Pan, Luu Anh Tuan

    Abstract: The large language models (LLMs), which bridge the gap between human language understanding and complex problem-solving, achieve state-of-the-art performance on several NLP tasks, particularly in few-shot and zero-shot settings. Despite the demonstrable efficacy of LLMs, due to constraints on computational resources, users have to engage with open-source language models or outsource the entire tra…

    Submitted 13 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  5. arXiv:2406.05615

    cs.CL

    Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

    Authors: Thong Nguyen, Yi Bin, Junbin Xiao, Leigang Qu, Yicong Li, Jay Zhangjie Wu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

    Abstract: Humans use multiple senses to comprehend the environment. Vision and language are two of the most vital senses since they allow us to easily communicate our thoughts and perceive the world around us. There has been a lot of interest in creating video-language understanding systems with human-like senses since a video-language pair can mimic both our linguistic medium and visual environment with te…

    Submitted 1 July, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL 2024 (Findings)

  6. arXiv:2403.18423

    cs.CL cs.LG

    SemRoDe: Macro Adversarial Training to Learn Representations That are Robust to Word-Level Attacks

    Authors: Brian Formento, Wenjie Feng, Chuan Sheng Foo, Luu Anh Tuan, See-Kiong Ng

    Abstract: Language models (LMs) are indispensable tools for natural language processing tasks, but their vulnerability to adversarial attacks remains a concern. While current research has explored adversarial training techniques, their improvements to defend against word-level attacks have been limited. In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial T…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Published in NAACL 2024 (Main Track)

  7. arXiv:2403.16685

    cs.CL cs.CY

    ToXCL: A Unified Framework for Toxic Speech Detection and Explanation

    Authors: Nhat M. Hoang, Xuan Long Do, Duc Anh Do, Duc Anh Vu, Luu Anh Tuan

    Abstract: The proliferation of online toxic speech is a pertinent problem posing threats to demographic groups. While explicit toxic speech contains offensive lexical signals, implicit one consists of coded or indirect language. Therefore, it is crucial for models not only to detect implicit toxic speech but also to explain its toxicity. This draws a unique need for unified frameworks that can effectively d…

    Submitted 20 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted at NAACL 2024 (Main Conference)

  8. arXiv:2402.12168

    cs.CR cs.AI cs.CL

    Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning

    Authors: Shuai Zhao, Leilei Gan, Luu Anh Tuan, Jie Fu, Lingjuan Lyu, Meihuizi Jia, Jinming Wen

    Abstract: Recently, various parameter-efficient fine-tuning (PEFT) strategies for application to language models have been proposed and successfully implemented. However, this raises the question of whether PEFT, which only updates a limited set of model parameters, constitutes security vulnerabilities when confronted with weight-poisoning backdoor attacks. In this study, we show that PEFT is more susceptib…

    Submitted 29 March, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: NAACL Findings 2024

  9. arXiv:2401.05949

    cs.CL cs.AI cs.CR

    Universal Vulnerabilities in Large Language Models: Backdoor Attacks for In-context Learning

    Authors: Shuai Zhao, Meihuizi Jia, Luu Anh Tuan, Fengjun Pan, Jinming Wen

    Abstract: In-context learning, a paradigm bridging the gap between pre-training and fine-tuning, has demonstrated high efficacy in several NLP tasks, especially in few-shot settings. Despite being widely applied, in-context learning is vulnerable to malicious attacks. In this work, we raise security concerns regarding this paradigm. Our studies demonstrate that an attacker can manipulate the behavior of lar…

    Submitted 16 February, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  10. arXiv:2312.06950

    cs.CV cs.CL

    READ-PVLA: Recurrent Adapter with Partial Video-Language Alignment for Parameter-Efficient Transfer Learning in Low-Resource Video-Language Modeling

    Authors: Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Khoi Le, Zhiyuan Hu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

    Abstract: Fully fine-tuning pretrained large-scale transformer models has become a popular paradigm for video-language modeling tasks, such as temporal language grounding and video-language summarization. With a growing number of tasks and limited training data, such full fine-tuning approach leads to costly model storage and unstable training. To overcome these shortcomings, we introduce lightweight adapte…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted at AAAI 2024

  11. arXiv:2312.02549

    cs.CV cs.CL

    DemaFormer: Damped Exponential Moving Average Transformer with Energy-Based Modeling for Temporal Language Grounding

    Authors: Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

    Abstract: Temporal Language Grounding seeks to localize video moments that semantically correspond to a natural language query. Recent advances employ the attention mechanism to learn the relations between video moments and the text query. However, naive attention might not be able to appropriately capture such relations, resulting in ineffective distributions where target video moments are difficult to sep…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: Accepted at EMNLP 2023 (Findings)

  12. arXiv:2312.02227

    cs.LG cs.CL

    Improving Multimodal Sentiment Analysis: Supervised Angular Margin-based Contrastive Learning for Enhanced Fusion Representation

    Authors: Cong-Duy Nguyen, Thong Nguyen, Duc Anh Vu, Luu Anh Tuan

    Abstract: The effectiveness of a model is heavily reliant on the quality of the fusion representation of multiple modalities in multimodal sentiment analysis. Moreover, each modality is extracted from raw input and integrated with the rest to construct a multimodal representation. Although previous methods have proposed multimodal representations and achieved promising results, most of them focus on forming…

    Submitted 3 December, 2023; originally announced December 2023.

  13. arXiv:2312.01592

    cs.CL

    Expand BERT Representation with Visual Information via Grounded Language Learning with Multimodal Partial Alignment

    Authors: Cong-Duy Nguyen, The-Anh Vu-Le, Thong Nguyen, Tho Quan, Luu Anh Tuan

    Abstract: Language models have been supervised with both language-only objective and visual grounding in existing studies of visual-grounded language learning. However, due to differences in the distribution and scale of visual-grounded datasets and language corpora, the language model tends to mix up the context of the tokens that occurred in the grounded data with those that do not. As a result, during re…

    Submitted 9 January, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

  14. arXiv:2311.09277

    cs.CL

    Contrastive Chain-of-Thought Prompting

    Authors: Yew Ken Chia, Guizhen Chen, Luu Anh Tuan, Soujanya Poria, Lidong Bing

    Abstract: Despite the success of chain of thought in enhancing language model reasoning, the underlying process remains less well understood. Although logically sound reasoning appears inherently crucial for chain of thought, prior studies surprisingly reveal minimal impact when using invalid demonstrations instead. Furthermore, the conventional chain of thought does not inform language models on what mista…

    Submitted 15 November, 2023; originally announced November 2023.

  15. arXiv:2311.09022

    cs.CL

    Exploring the Potential of Large Language Models in Computational Argumentation

    Authors: Guizhen Chen, Liying Cheng, Luu Anh Tuan, Lidong Bing

    Abstract: Computational argumentation has become an essential tool in various domains, including law, public policy, and artificial intelligence. It is an emerging research field in natural language processing that attracts increasing attention. Research on computational argumentation mainly involves two types of tasks: argument mining and argument generation. As large language models (LLMs) have demonstrat…

    Submitted 1 July, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted at ACL 2024 Main

  16. arXiv:2310.08975

    cs.CL cs.AI

    ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models

    Authors: Haoran Luo, Haihong E, Zichen Tang, Shiyao Peng, Yikai Guo, Wentai Zhang, Chenghao Ma, Guanting Dong, Meina Song, Wei Lin, Yifan Zhu, Luu Anh Tuan

    Abstract: Knowledge Base Question Answering (KBQA) aims to answer natural language questions over large-scale knowledge bases (KBs), which can be summarized into two crucial steps: knowledge retrieval and semantic parsing. However, three core challenges remain: inefficient knowledge retrieval, mistakes of retrieval adversely impacting semantic parsing, and the complexity of previous KBQA methods. To tackle…

    Submitted 30 May, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Accepted by Findings of ACL 2024

    Journal ref: ACL 2024

  17. arXiv:2310.08069

    cs.SE cs.CL cs.IR cs.LG

    Rethinking Negative Pairs in Code Search

    Authors: Haochen Li, Xin Zhou, Luu Anh Tuan, Chunyan Miao

    Abstract: Recently, contrastive learning has become a key component in fine-tuning code search models for software development efficiency and effectiveness. It pulls together positive code snippets while pushing negative samples away given search queries. Among contrastive learning, InfoNCE is the most widely used loss function due to its better performance. However, the following problems in negative sampl…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023

  18. Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models

    Authors: Shuai Zhao, Jinming Wen, Luu Anh Tuan, Junbo Zhao, Jie Fu

    Abstract: The prompt-based learning paradigm, which bridges the gap between pre-training and fine-tuning, achieves state-of-the-art performance on several NLP tasks, particularly in few-shot settings. Despite being widely applied, prompt-based learning is vulnerable to backdoor attacks. Textual backdoor attacks are designed to introduce targeted vulnerabilities into models by poisoning a subset of training…

    Submitted 10 November, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: Accepted to appear at the main conference of EMNLP 2023

    Journal ref: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

  19. arXiv:2302.13473

    cs.LG

    Towards Interpretable Federated Learning

    Authors: Anran Li, Rui Liu, Ming Hu, Luu Anh Tuan, Han Yu

    Abstract: Federated learning (FL) enables multiple data owners to build machine learning models collaboratively without exposing their private local data. In order for FL to achieve widespread adoption, it is important to balance the need for performance, privacy-preservation and interpretability, especially in mission critical applications such as finance and healthcare. Thus, interpretable federated learn…

    Submitted 26 February, 2023; originally announced February 2023.

    Comments: Survey of interpretable federated learning

  20. arXiv:2211.08238

    cs.CL cs.AI

    Exploiting Contrastive Learning and Numerical Evidence for Confusing Legal Judgment Prediction

    Authors: Leilei Gan, Baokui Li, Kun Kuang, Yating Zhang, Lei Wang, Luu Anh Tuan, Yi Yang, Fei Wu

    Abstract: Given the fact description text of a legal case, legal judgment prediction (LJP) aims to predict the case's charge, law article and penalty term. A core problem of LJP is how to distinguish confusing legal cases, where only subtle text differences exist. Previous studies fail to distinguish different classification errors with a standard cross-entropy classification loss, and ignore the numbers in…

    Submitted 21 October, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: Accepted to Findings of EMNLP 2023

  21. arXiv:2211.02878

    cs.CL cs.CR cs.LG

    Textual Manifold-based Defense Against Natural Language Adversarial Examples

    Authors: Dang Minh Nguyen, Luu Anh Tuan

    Abstract: Recent studies on adversarial images have shown that they tend to leave the underlying low-dimensional data manifold, making them significantly more challenging for current models to make correct predictions. This so-called off-manifold conjecture has inspired a novel line of defenses against adversarial attacks on images. In this study, we find a similar phenomenon occurs in the contextualized em…

    Submitted 5 November, 2022; originally announced November 2022.

  22. arXiv:2207.04240

    eess.SY

    Deep Reinforcement Learning for Long-Term Voltage Stability Control

    Authors: Hannes Hagmar, Le Anh Tuan, Robert Eriksson

    Abstract: Deep reinforcement learning (DRL) is a machine learning-based method suited for complex and high-dimensional control problems. In this study, a real-time control system based on DRL is developed for long-term voltage stability events. The possibility of using system services from demand response (DR) and energy storage systems (ESS) as control measures to stabilize the system is investigated. The…

    Submitted 9 July, 2022; originally announced July 2022.

    Comments: In proceedings of the 11th Bulk Power Systems Dynamics and Control Symposium (IREP 2022), July 25-30, 2022, Banff, Canada

    Report number: IREP2022-9

  23. arXiv:2112.11668

    cs.CL cs.LG

    How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?

    Authors: Xinshuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang

    Abstract: The fine-tuning of pre-trained language models has a great success in many NLP fields. Yet, it is strikingly vulnerable to adversarial examples, e.g., word substitution attacks using only synonyms can easily fool a BERT-based sentiment analysis model. In this paper, we demonstrate that adversarial training, the prevalent defense technique, does not directly fit a conventional fine-tuning scenario,…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: Accepted by NeurIPS-2021

  24. arXiv:2112.03473

    cs.CL

    Improving Neural Cross-Lingual Summarization via Employing Optimal Transport Distance for Knowledge Distillation

    Authors: Thong Nguyen, Luu Anh Tuan

    Abstract: Current state-of-the-art cross-lingual summarization models employ multi-task learning paradigm, which works on a shared vocabulary module and relies on the self-attention mechanism to attend among tokens in two languages. However, correlation learned by self-attention is often loose and implicit, inefficient in capturing crucial cross-lingual representations between languages. The matter worsens…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: Accepted by 36th AAAI Conference on Artificial Intelligence (AAAI 2022)

  25. arXiv:2012.00336

    eess.SY

    Comparison of security margin estimation methods under various load configurations

    Authors: Hannes Hagmar, Robert Eriksson, Le Anh Tuan

    Abstract: The post-contingency loadability limit (PCLL) and the secure operating limit (SOL) are the two main approaches used in computing the security margins of an electric power system. While the SOL is significantly more computationally demanding than the PCLL, it can account for the dynamic response after a disturbance and generally provides a better measure of the security margin. In this study, the d…

    Submitted 1 December, 2020; originally announced December 2020.

    Comments: 8 pages

  26. arXiv:2009.01733

    cond-mat.mtrl-sci

    Variation of TiO2/SiO2 mixed layers induced by different Xe+ ion energies

    Authors: Tran Van Phuc, Miroslaw Kulik, Afag Madadzada, Dorota Kolodynska, Le Hong Khiem, Phan Luong Tuan, Nguyen Ngoc Anh

    Abstract: The broadening and optical parameters of TiO2/SiO2 transition layers depending on the ion energy have been investigated using the Rutherford Backscattering Spectrometry (RBS) and Ellipsometry Spectroscopy (ES) methods. The TiO2/SiO2 samples were irradiated by Xe+ ions with energies of 100, 150, 200 and 250 keV. The depth profiles of the elements determined by the RBS spectra show the structure a…

    Submitted 3 September, 2020; originally announced September 2020.

    Comments: 20 pages, 9 figures

  27. arXiv:1910.10274

    cs.CL

    Capturing Greater Context for Question Generation

    Authors: Luu Anh Tuan, Darsh J Shah, Regina Barzilay

    Abstract: Automatic question generation can benefit many applications ranging from dialogue systems to reading comprehension. While questions are often asked with respect to long documents, there are many challenges with modeling such long documents. Many existing techniques generate questions by effectively looking at one sentence at a time, leading to questions that are easy and not reflective of the huma…

    Submitted 22 October, 2019; originally announced October 2019.

  28. arXiv:1908.05554

    eess.SY eess.SP

    Voltage Instability Prediction Using a Deep Recurrent Neural Network

    Authors: Hannes Hagmar, Lang Tong, Robert Eriksson, Le Anh Tuan

    Abstract: This paper develops a new method for voltage instability prediction using a recurrent neural network with long short-term memory. The method is aimed to be used as a supplementary warning system for system operators, capable of assessing whether the current state will cause voltage instability issues several minutes into the future. The proposed method uses a long sequence-based network, where both…

    Submitted 15 August, 2019; originally announced August 2019.

    Comments: 8 pages

  29. arXiv:1906.04393

    cs.CL cs.LG

    Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks

    Authors: Yi Tay, Aston Zhang, Luu Anh Tuan, Jinfeng Rao, Shuai Zhang, Shuohang Wang, Jie Fu, Siu Cheung Hui

    Abstract: Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient. This paper proposes a series of lightweight and memory efficient neural architectures for a potpourri of natural language processing (NLP) tasks. To this end, our models exploit computation using Quaternion algebra and hypercomplex spaces, enabling not only expressive inter-component interactions but…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: ACL 2019

  30. arXiv:1905.10847

    cs.CL cs.AI cs.IR

    Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives

    Authors: Yi Tay, Shuohang Wang, Luu Anh Tuan, Jie Fu, Minh C. Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, Aston Zhang

    Abstract: This paper tackles the problem of reading comprehension over long narratives where documents easily span over thousands of tokens. We propose a curriculum learning (CL) based Pointer-Generator framework for reading/sampling over large documents, enabling diverse training of the neural model based on the notion of alternating contextual difficulty. This can be interpreted as a form of domain random…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: Accepted to ACL 2019

  31. arXiv:1811.09786

    cs.CL cs.AI cs.IR cs.NE

    Recurrently Controlled Recurrent Networks

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui

    Abstract: Recurrent neural networks (RNNs) such as long short-term memory and gated recurrent units are pivotal building blocks across a broad spectrum of sequence modeling problems. This paper proposes a recurrently controlled recurrent network (RCRN) for expressive and powerful sequence encoding. More concretely, the key idea behind our approach is to learn the recurrent gating functions using recurrent n…

    Submitted 24 November, 2018; originally announced November 2018.

    Comments: NIPS 2018

  32. arXiv:1811.04210

    cs.CL cs.AI cs.IR cs.NE

    Densely Connected Attention Propagation for Reading Comprehension

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui, Jian Su

    Abstract: We propose DecaProp (Densely Connected Attention Propagation), a new densely connected neural architecture for reading comprehension (RC). There are two distinct characteristics of our model. Firstly, our model densely connects all pairwise layers of the network, modeling relationships between passage and query across all hierarchical levels. Secondly, the dense connectors in our network are learn…

    Submitted 2 April, 2019; v1 submitted 10 November, 2018; originally announced November 2018.

    Comments: NIPS 2018

  33. arXiv:1810.02938

    cs.CL cs.AI cs.IR

    Co-Stack Residual Affinity Networks with Multi-level Attention Refinement for Matching Text Sequences

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui

    Abstract: Learning a matching function between two text sequences is a long standing problem in NLP research. This task enables many potential applications such as question answering and paraphrase identification. This paper proposes Co-Stack Residual Affinity Networks (CSRAN), a new and universal neural architecture for this problem. CSRAN is a deep architecture, involving stacked (multi-layered) recurrent…

    Submitted 6 October, 2018; originally announced October 2018.

    Comments: EMNLP 2018

  34. arXiv:1806.06446   

    cs.IR cs.AI cs.LG cs.NE

    Self-Attentive Neural Collaborative Filtering

    Authors: Yi Tay, Shuai Zhang, Luu Anh Tuan, Siu Cheung Hui

    Abstract: This paper has been withdrawn as we discovered a bug in our tensorflow implementation that involved accidental mixing of vectors across batches. This led to different inference results given different batch sizes which is completely strange. The performance scores still remain the same but we concluded that it was not the self-attention that contributed to the performance. We are withdrawing the…

    Submitted 19 July, 2018; v1 submitted 17 June, 2018; originally announced June 2018.

    Comments: We discovered a bug in our tensorflow implementation that involved accidental mixing of vectors across batches, rendering the main claim of the paper incorrect. We are withdrawing this paper until we find out why

  35. arXiv:1806.00778

    cs.CL cs.AI cs.IR

    Multi-Cast Attention Networks for Retrieval-based Question Answering and Response Prediction

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui

    Abstract: Attention is typically used to select informative sub-phrases that are used for prediction. This paper investigates the novel use of attention as a form of feature augmentation, i.e., casted attention. We propose Multi-Cast Attention Networks (MCAN), a new attention mechanism and general model architecture for a potpourri of ranking tasks in the conversational modeling and question answering domain…

    Submitted 3 June, 2018; originally announced June 2018.

    Comments: Accepted to KDD 2018 (Paper titled only "Multi-Cast Attention Networks" in KDD version)

  36. arXiv:1805.08387

    cond-mat.stat-mech

    Correlation length in a generalized two-dimensional XY model

    Authors: Duong Xuan Nui, Le Tuan, Nguyen Duc Trung Kien, Pham Thanh Huy, Hung T. Dang, Dao Xuan Viet

    Abstract: The measurements of the magnetic and nematic correlation lengths in a generalization of the two dimensional XY model on the square lattice are presented using classical Monte Carlo simulation. The full phase diagram is re-examined based on these correlation lengths, demonstrating their power in studying generalized XY models. The ratio between the correlation length and the lattice size has distin…

    Submitted 16 October, 2018; v1 submitted 22 May, 2018; originally announced May 2018.

    Comments: 10 pages, 7 figures

    Journal ref: Phys. Rev. B 98, 144421 (2018)

  37. arXiv:1805.02856

    cs.CL cs.AI cs.IR

    Reasoning with Sarcasm by Reading In-between

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui, Jian Su

    Abstract: Sarcasm is a sophisticated speech act which commonly manifests on social communities such as Twitter and Reddit. The prevalence of sarcasm on the social web is highly disruptive to opinion mining systems due to not only its tendency of polarity flipping but also usage of figurative language. Sarcasm commonly manifests with a contrastive theme either between positive-negative sentiments or between…

    Submitted 8 May, 2018; originally announced May 2018.

    Comments: Accepted to ACL2018

  38. arXiv:1803.09074

    cs.CL cs.AI cs.NE

    Multi-range Reasoning for Machine Comprehension

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui

    Abstract: We propose MRU (Multi-Range Reasoning Units), a new fast compositional encoder for machine comprehension (MC). Our proposed MRU encoders are characterized by multi-ranged gating, executing a series of parameterized contract-and-expand layers for learning gating vectors that benefit from long and short-term dependencies. The aims of our approach are as follows: (1) learning representations that are…

    Submitted 24 March, 2018; originally announced March 2018.

  39. arXiv:1801.09251

    cs.CL cs.AI cs.IR

    Multi-Pointer Co-Attention Networks for Recommendation

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui

    Abstract: Many recent state-of-the-art recommender systems such as D-ATT, TransNet and DeepCoNN exploit reviews for representation learning. This paper proposes a new neural architecture for recommendation with reviews. Our model operates on a multi-hierarchical paradigm and is based on the intuition that not all reviews are created equal, i.e., only a select few are important. The importance, however, shou…

    Submitted 21 June, 2018; v1 submitted 28 January, 2018; originally announced January 2018.

    Comments: Accepted to KDD 2018 (Research Track)

  40. arXiv:1801.00102

    cs.CL cs.AI

    Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui

    Abstract: This paper presents a new deep learning architecture for Natural Language Inference (NLI). Firstly, we introduce a new architecture where alignment pairs are compared, compressed and then propagated to upper layers for enhanced representation learning. Secondly, we adopt factorization layers for efficient and expressive compression of alignment vectors into scalar features, which are then used to…

    Submitted 10 September, 2018; v1 submitted 30 December, 2017; originally announced January 2018.

    Comments: EMNLP 2018 CRC and Update CAFE + ELMo result on SNLI

  41. arXiv:1711.07656

    cs.CL cs.AI cs.IR

    Cross Temporal Recurrent Networks for Ranking Question Answer Pairs

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui

    Abstract: Temporal gates play a significant role in modern recurrent-based neural encoders, enabling fine-grained control over recursive compositional operations over time. In recurrent models such as the long short-term memory (LSTM), temporal gates control the amount of information retained or discarded over time, not only playing an important role in influencing the learned representations but also servi…

    Submitted 21 November, 2017; originally announced November 2017.

    Comments: Accepted to AAAI2018

  42. arXiv:1711.04981

    cs.AI cs.CL

    SkipFlow: Incorporating Neural Coherence Features for End-to-End Automatic Text Scoring

    Authors: Yi Tay, Minh C. Phan, Luu Anh Tuan, Siu Cheung Hui

    Abstract: Deep learning has demonstrated tremendous potential for Automatic Text Scoring (ATS) tasks. In this paper, we describe a new neural architecture that enhances vanilla neural network models with auxiliary neural coherence features. Our new method proposes a new SkipFlow mechanism that models relationships between snapshots of the hidden representations of a long short-term memory (LSTM) ne…

    Submitted 14 November, 2017; originally announced November 2017.

    Comments: Accepted to AAAI 2018

  43. arXiv:1708.04828  [pdf, other]

    cs.AI cs.IR

    Multi-task Neural Network for Non-discrete Attribute Prediction in Knowledge Graphs

    Authors: Yi Tay, Luu Anh Tuan, Minh C. Phan, Siu Cheung Hui

    Abstract: Many popular knowledge graphs such as Freebase, YAGO or DBPedia maintain a list of non-discrete attributes for each entity. Intuitively, these attributes such as height, price or population count are able to richly characterize entities in knowledge graphs. This additional source of information may help to alleviate the inherent sparsity and incompleteness problems that are prevalent in knowledge g…

    Submitted 16 August, 2017; originally announced August 2017.

    Comments: Accepted at CIKM 2017

  44. Hyperbolic Representation Learning for Fast and Efficient Neural Question Answering

    Authors: Yi Tay, Luu Anh Tuan, Siu Cheung Hui

    Abstract: The dominant neural architectures in question answer retrieval are based on recurrent or convolutional encoders configured with complex word matching layers. Given that recent architectural innovations are mostly new word interaction layers or attention-based matching mechanisms, it seems to be a well-established fact that these components are mandatory for good performance. Unfortunately, the mem…

    Submitted 23 November, 2017; v1 submitted 25 July, 2017; originally announced July 2017.

    Comments: Accepted at WSDM 2018

  45. Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture

    Authors: Yi Tay, Minh C. Phan, Luu Anh Tuan, Siu Cheung Hui

    Abstract: We describe a new deep learning architecture for learning to rank question answer pairs. Our approach extends the long short-term memory (LSTM) network with holographic composition to model the relationship between question and answer representations. As opposed to the neural tensor layer that has been adopted recently, the holographic composition provides the benefits of scalable and rich represe…

    Submitted 20 July, 2017; originally announced July 2017.

    Comments: SIGIR 2017 Full Paper

  46. arXiv:1706.05461  [pdf, other]

    cs.CV

    Truly Multi-modal YouTube-8M Video Classification with Video, Audio, and Text

    Authors: Zhe Wang, Kingsley Kuan, Mathieu Ravaut, Gaurav Manek, Sibo Song, Yuan Fang, Seokhwan Kim, Nancy Chen, Luis Fernando D'Haro, Luu Anh Tuan, Hongyuan Zhu, Zeng Zeng, Ngai Man Cheung, Georgios Piliouras, Jie Lin, Vijay Chandrasekhar

    Abstract: The YouTube-8M video classification challenge requires teams to classify 0.7 million videos into one or more of 4,716 classes. In this Kaggle competition, we placed in the top 3% out of 650 participants using released video and audio features. Beyond that, we extend the original competition by including text information in the classification, making this a truly multi-modal approach with vision, a…

    Submitted 9 July, 2017; v1 submitted 16 June, 2017; originally announced June 2017.

    Comments: 8 pages, Accepted to CVPR'17 Workshop on YouTube-8M Large-Scale Video Understanding

  47. arXiv:1305.6358  [pdf]

    cs.CY

    Using Blogs to Promote Writing Skill in ESL Classroom

    Authors: Melor Md Yunus, Julian Lau Kiing Tuan, Hadi Salehi

    Abstract: This study provides details on the motivational factors for using blogs as an essential tool to promote students' writing skills in ESL classrooms. The study aims to discuss how using blogs may be integrated into classroom activities to promote students' writing skills as well as polishing their skills. It would also illustrate the features offered in blogs as well as the motivational essence that i…

    Submitted 27 May, 2013; originally announced May 2013.

    Comments: 5 pages

    Journal ref: Proceedings of the 4th International Conference on Education and Educational Technologies (EET '13), 109-113, 2013

  48. A State-Based Regression Formulation for Domains with Sensing Actions and Incomplete Information

    Authors: Le-Chi Tuan, Chitta Baral, Tran Cao Son

    Abstract: We present a state-based regression function for planning domains where an agent does not have complete information and may have sensing actions. We consider binary domains and employ a three-valued characterization of domains with sensing actions to define the regression function. We prove the soundness and completeness of our regression formulation with respect to the definition of progression…

    Submitted 1 October, 2006; v1 submitted 19 September, 2006; originally announced September 2006.

    Comments: 34 pages, 7 Figures

    ACM Class: I.2.4; I.2.8

    Journal ref: Logical Methods in Computer Science, Volume 2, Issue 4 (October 2, 2006) lmcs:2238

  49. arXiv:cs/0405071  [pdf, ps, other]

    cs.AI

    Regression with respect to sensing actions and partial states

    Authors: Le-Chi Tuan, Chitta Baral, Tran Cao Son

    Abstract: In this paper, we present a state-based regression function for planning domains where an agent does not have complete information and may have sensing actions. We consider binary domains and employ the 0-approximation [Son & Baral 2001] to define the regression function. In binary domains, the use of 0-approximation means using 3-valued states. Although planning using this approach is incomplet…

    Submitted 21 May, 2004; originally announced May 2004.

    Comments: 38 pages

    ACM Class: I.2.4; I.2.8