Showing 1–21 of 21 results for author: Fung, G

  1. arXiv:2207.10284  [pdf, other]

    cs.LG cs.CL eess.SP

    Multi Resolution Analysis (MRA) for Approximate Self-Attention

    Authors: Zhanpeng Zeng, Sourav Pal, Jeffery Kline, Glenn M Fung, Vikas Singh

    Abstract: Transformers have emerged as a preferred model for many tasks in natural language processing and vision. Recent efforts on training and deploying Transformers more efficiently have identified many strategies to approximate the self-attention matrix, a key module in a Transformer architecture. Effective ideas include various prespecified sparsity patterns, low-rank basis expansions and combination…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: ICML 2022

  2. arXiv:2111.09714  [pdf, other]

    cs.LG cs.CL

    You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling

    Authors: Zhanpeng Zeng, Yunyang Xiong, Sathya N. Ravi, Shailesh Acharya, Glenn Fung, Vikas Singh

    Abstract: Transformer-based models are widely used in natural language processing (NLP). Central to the transformer model is the self-attention mechanism, which captures the interactions of token pairs in the input sequences and depends quadratically on the sequence length. Training such models on longer sequences is expensive. In this paper, we show that a Bernoulli sampling attention mechanism based on Lo…

    Submitted 18 November, 2021; originally announced November 2021.

    Comments: Proceedings of the 38th ICML (2021)

  3. arXiv:2111.00007  [pdf, other]

    cs.CV cs.LG

    Domain Agnostic Few-Shot Learning For Document Intelligence

    Authors: Jaya Krishna Mandivarapu, Eric Bunch, Glenn Fung

    Abstract: Few-shot learning aims to generalize to novel classes with only a few samples with class labels. Research in few-shot learning has borrowed techniques from transfer learning, metric learning, meta-learning, and Bayesian methods. These methods also aim to train models from limited training samples, and while encouraging performance has been achieved, they often fail to generalize to novel domains.…

    Submitted 28 October, 2021; originally announced November 2021.

  4. arXiv:2110.08254  [pdf, other]

    cs.LG cs.CL

    Inconsistent Few-Shot Relation Classification via Cross-Attentional Prototype Networks with Contrastive Learning

    Authors: Hongru Wang, Zhijing Jin, Jiarun Cao, Gabriel Pui Cheong Fung, Kam-Fai Wong

    Abstract: Standard few-shot relation classification (RC) is designed to learn a robust classifier with only few labeled data for each class. However, previous works rarely investigate the effects of a different number of classes (i.e., $N$-way) and number of labeled data per class (i.e., $K$-shot) during training vs. testing. In this work, we define a new task, \textit{inconsistent few-shot RC}, where the m…

    Submitted 13 October, 2021; originally announced October 2021.

  5. arXiv:2109.05234  [pdf, other]

    cs.CL cs.AI

    Prior Omission of Dissimilar Source Domain(s) for Cost-Effective Few-Shot Learning

    Authors: Zezhong Wang, Hongru Wang, Kwan Wai Chung, Jia Zhu, Gabriel Pui Cheong Fung, Kam-Fai Wong

    Abstract: Few-shot slot tagging is an emerging research topic in the field of Natural Language Understanding (NLU). With sufficient annotated data from source domains, the key challenge is how to train and adapt the model to another target domain which only has few labels. Conventional few-shot approaches use all the data from the source domains without considering inter-domain relations and implicitly assu…

    Submitted 11 September, 2021; originally announced September 2021.

  6. arXiv:2109.05187  [pdf, other]

    cs.CL cs.AI

    TopicRefine: Joint Topic Prediction and Dialogue Response Generation for Multi-turn End-to-End Dialogue System

    Authors: Hongru Wang, Mingyu Cui, Zimo Zhou, Gabriel Pui Cheong Fung, Kam-Fai Wong

    Abstract: A multi-turn dialogue always follows a specific topic thread, and topic shift at the discourse level occurs naturally as the conversation progresses, necessitating the model's ability to capture different topics and generate topic-aware responses. Previous research has either predicted the topic first and then generated the relevant response, or simply applied the attention mechanism to all topics…

    Submitted 11 September, 2021; originally announced September 2021.

  7. arXiv:2106.13802  [pdf, other]

    cs.CV

    Efficient Document Image Classification Using Region-Based Graph Neural Network

    Authors: Jaya Krishna Mandivarapu, Eric Bunch, Qian You, Glenn Fung

    Abstract: Document image classification remains a popular research area because it can be commercialized in many enterprise applications across different industries. Recent advancements in large pre-trained computer vision and language models and graph neural networks have lent document image classification many tools. However, using large pre-trained models usually requires substantial computing resources wh…

    Submitted 25 June, 2021; originally announced June 2021.

  8. arXiv:2106.00827  [pdf, other]

    cs.LG math.AT stat.ML

    Weighting vectors for machine learning: numerical harmonic analysis applied to boundary detection

    Authors: Eric Bunch, Jeffery Kline, Daniel Dickinson, Suhaas Bhat, Glenn Fung

    Abstract: Metric space magnitude, an active field of research in algebraic topology, is a scalar quantity that summarizes the effective number of distinct points that live in a general metric space. The {\em weighting vector} is a closely-related concept that captures, in a nontrivial way, much of the underlying geometry of the original metric space. Recent work has demonstrated that when the metric space i…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: 16 pages. arXiv admin note: text overlap with arXiv:2006.14063

  9. arXiv:2102.03902  [pdf, other]

    cs.CL cs.LG

    Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention

    Authors: Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh

    Abstract: Transformers have emerged as a powerful tool for a broad range of natural language processing tasks. A key component that drives the impressive performance of Transformers is the self-attention mechanism that encodes the influence or dependence of other tokens on each specific token. While beneficial, the quadratic complexity of self-attention on the input sequence length has limited its applicati…

    Submitted 31 March, 2021; v1 submitted 7 February, 2021; originally announced February 2021.

    Comments: AAAI 2021; Code and supplement available at https://github.com/mlpen/Nystromformer

  10. arXiv:2102.00426  [pdf, other]

    cs.LG cs.HC cs.IR

    A Simple yet Brisk and Efficient Active Learning Platform for Text Classification

    Authors: Teja Kanchinadam, Qian You, Keith Westpfahl, James Kim, Siva Gunda, Sebastian Seith, Glenn Fung

    Abstract: In this work, we propose the use of a fully managed machine learning service, which utilizes active learning to directly build models from unstructured data. With this tool, business users can quickly and easily build machine learning models and then directly deploy them into a production ready hosted environment without much involvement from data scientists. Our approach leverages state-of-the-ar…

    Submitted 31 January, 2021; originally announced February 2021.

  11. arXiv:2102.00420  [pdf, other]

    cs.LG cs.CL

    Graph Neural Networks to Predict Customer Satisfaction Following Interactions with a Corporate Call Center

    Authors: Teja Kanchinadam, Zihang Meng, Joseph Bockhorst, Vikas Singh, Glenn Fung

    Abstract: Customer satisfaction is an important factor in creating and maintaining long-term relationships with customers. Near real-time identification of potentially dissatisfied customers following phone calls can provide organizations the opportunity to take meaningful interventions and to foster ongoing customer satisfaction and loyalty. This work describes a fully operational system we have developed…

    Submitted 31 January, 2021; originally announced February 2021.

  12. arXiv:2012.15036  [pdf, other]

    cs.LG stat.ML

    SGD Distributional Dynamics of Three Layer Neural Networks

    Authors: Victor Luo, Yazhen Wang, Glenn Fung

    Abstract: With the rise of big data analytics, multi-layer neural networks have surfaced as one of the most powerful machine learning methods. However, their theoretical mathematical properties are still not fully understood. Training a neural network requires optimizing a non-convex objective function, typically done using stochastic gradient descent (SGD). In this paper, we seek to extend the mean field r…

    Submitted 29 December, 2020; originally announced December 2020.

  13. arXiv:2012.06010  [pdf, other]

    math.AT

    Simplicial 2-Complex Convolutional Neural Nets

    Authors: Eric Bunch, Qian You, Glenn Fung, Vikas Singh

    Abstract: Recently, neural network architectures have been developed to accommodate when the data has the structure of a graph or, more generally, a hypergraph. While useful, graph structures can be potentially limiting. Hypergraph structures in general do not account for higher order relations between their hyperedges. Simplicial complexes offer a middle ground, with a rich theory to draw on. We develop a…

    Submitted 10 December, 2020; originally announced December 2020.

    Comments: 5 pages, accepted to TDA and Beyond: Workshop at NeurIPS 2020

  14. arXiv:2011.08772  [pdf, other]

    cs.CL cs.AI

    KddRES: A Multi-level Knowledge-driven Dialogue Dataset for Restaurant Towards Customized Dialogue System

    Authors: Hongru Wang, Min Li, Zimo Zhou, Gabriel Pui Cheong Fung, Kam-Fai Wong

    Abstract: Compared with the CrossWOZ (Chinese) and MultiWOZ (English) datasets, which have coarse-grained information, there is no dataset that handles fine-grained and hierarchical level information properly. In this paper, we publish the first Cantonese knowledge-driven Dialogue Dataset for REStaurant (KddRES) in Hong Kong, which grounds the information in multi-turn conversations to one specific restaurant. Our…

    Submitted 14 December, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: 8 pages, 2 figures

  15. arXiv:2006.14063  [pdf, other]

    cs.LG math.AT stat.ML

    Practical applications of metric space magnitude and weighting vectors

    Authors: Eric Bunch, Daniel Dickinson, Jeffery Kline, Glenn Fung

    Abstract: Metric space magnitude, an active subject of research in algebraic topology, originally arose in the context of biology, where it was used to represent the effective number of distinct species in an environment. In a more general setting, the magnitude of a metric space is a real number that aims to quantify the effective number of distinct points in the space. The contribution of each point to a…

    Submitted 2 July, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: 9 pages

    MSC Class: 68T99

  16. arXiv:2006.09161  [pdf, other]

    cs.CL cs.AI cs.LG

    CUHK at SemEval-2020 Task 4: CommonSense Explanation, Reasoning and Prediction with Multi-task Learning

    Authors: Hongru Wang, Xiangru Tang, Sunny Lai, Kwong Sak Leung, Jia Zhu, Gabriel Pui Cheong Fung, Kam-Fai Wong

    Abstract: This paper describes our system submitted to task 4 of SemEval 2020: Commonsense Validation and Explanation (ComVE), which consists of three sub-tasks. The task is to directly validate whether or not a given sentence makes sense and to require the model to explain it. Based on the BERT architecture with a multi-task setting, we propose an effective and interpretable "Explain, Reason and Predict" (ERP)…

    Submitted 27 July, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

  17. arXiv:1909.12398  [pdf, other]

    cs.CV cs.LG

    Optimizing Nondecomposable Data Dependent Regularizers via Lagrangian Reparameterization offers Significant Performance and Efficiency Gains

    Authors: Sathya N. Ravi, Abhay Venkatesh, Glenn Moo Fung, Vikas Singh

    Abstract: Data dependent regularization is known to benefit a wide variety of problems in machine learning. Often, these regularizers cannot be easily decomposed into a sum over a finite number of terms, e.g., a sum over individual example-wise terms. The $F_β$ measure, Area under the ROC curve (AUCROC) and Precision at a fixed recall (P@R) are some prominent examples that are used in many applications. We…

    Submitted 26 September, 2019; originally announced September 2019.

  18. arXiv:1908.02692  [pdf, other]

    math.AT

    Approximating the Convex Hull via Metric Space Magnitude

    Authors: Glenn Fung, Eric Bunch, Dan Dickinson

    Abstract: Magnitude of a finite metric space and the related notion of magnitude functions on metric spaces is an active area of research in algebraic topology. Magnitude originally arose in the context of biology, where it represents the number of effective species in an environment; when applied to a one-parameter family of metric spaces $tX$ with scale parameter $t$, the magnitude captures much of the un…

    Submitted 7 August, 2019; originally announced August 2019.

    Comments: 15 pages, 3 figures

  19. arXiv:1811.03268  [pdf, other]

    cs.CV

    Ordinal Regression using Noisy Pairwise Comparisons for Body Mass Index Range Estimation

    Authors: Luisa Polania, Dongning Wang, Glenn Fung

    Abstract: Ordinal regression aims to classify instances into ordinal categories. In this paper, body mass index (BMI) category estimation from facial images is cast as an ordinal regression problem. In particular, noisy binary search algorithms based on pairwise comparisons are employed to exploit the ordinal relationship among BMI categories. Comparisons are performed with Siamese architectures, one of whi…

    Submitted 7 November, 2018; originally announced November 2018.

    Comments: Paper accepted for publication at the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV 2019)

  20. arXiv:1605.09432  [pdf, other]

    cs.HC cs.LG

    Evaluating Crowdsourcing Participants in the Absence of Ground-Truth

    Authors: Ramanathan Subramanian, Romer Rosales, Glenn Fung, Jennifer Dy

    Abstract: Given a supervised/semi-supervised learning scenario where multiple annotators are available, we consider the problem of identification of adversarial or unreliable annotators.

    Submitted 30 May, 2016; originally announced May 2016.

    Comments: 4 pages, 5 figures, Workshop on Human Computation for Science and Computational Sustainability, NIPS 2012, Lake Tahoe, NV. 7 Dec 2012

  21. arXiv:1203.3529  [pdf]

    cs.LG cs.AI stat.ML

    Modeling Multiple Annotator Expertise in the Semi-Supervised Learning Scenario

    Authors: Yan Yan, Romer Rosales, Glenn Fung, Jennifer Dy

    Abstract: Learning algorithms normally assume that there is at most one annotation or label per data point. However, in some scenarios, such as medical diagnosis and on-line collaboration, multiple annotations may be available. In either case, obtaining labels for data points can be expensive and time-consuming (in some circumstances ground-truth may not exist). Semi-supervised learning approaches have shown…

    Submitted 15 March, 2012; originally announced March 2012.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-674-682