Showing 1–13 of 13 results for author: Carbonell, J G

  1. arXiv:2001.11258  [pdf, ps, other]

    cs.CL cs.CY cs.LG

    Harnessing Code Switching to Transcend the Linguistic Barrier

    Authors: Ashiqur R. KhudaBukhsh, Shriphani Palakodety, Jaime G. Carbonell

    Abstract: Code mixing (or code switching) is a common phenomenon observed in social-media content generated by a linguistically diverse user-base. Studies show that in the Indian sub-continent, a substantial fraction of social media posts exhibit code switching. While the difficulties posed by code mixed documents to further downstream analyses are well-understood, lending visibility to code mixed documents…

    Submitted 15 June, 2020; v1 submitted 30 January, 2020; originally announced January 2020.

  2. arXiv:1910.03206  [pdf, ps, other]

    cs.CY cs.CL cs.IR cs.LG

    Voice for the Voiceless: Active Sampling to Detect Comments Supporting the Rohingyas

    Authors: Shriphani Palakodety, Ashiqur R. KhudaBukhsh, Jaime G. Carbonell

    Abstract: The Rohingya refugee crisis is one of the biggest humanitarian crises of modern times with more than 600,000 Rohingyas rendered homeless according to the United Nations High Commissioner for Refugees. While it has received sustained press attention globally, no comprehensive research has been performed on social media pertaining to this large evolving crisis. In this work, we construct a substanti…

    Submitted 6 January, 2020; v1 submitted 8 October, 2019; originally announced October 2019.

  3. arXiv:1909.12940  [pdf, ps, other]

    cs.CY cs.CL cs.LG

    Hope Speech Detection: A Computational Analysis of the Voice of Peace

    Authors: Shriphani Palakodety, Ashiqur R. KhudaBukhsh, Jaime G. Carbonell

    Abstract: The recent Pulwama terror attack (February 14, 2019, Pulwama, Kashmir) triggered a chain of escalating events between India and Pakistan adding another episode to their 70-year-old dispute over Kashmir. The present era of ubiquitous social media has never seen nuclear powers closer to war. In this paper, we analyze this evolving international crisis via a substantial corpus constructed using comm…

    Submitted 24 February, 2020; v1 submitted 11 September, 2019; originally announced September 2019.

    Comments: Minor edits

  4. arXiv:1908.08983  [pdf, other]

    cs.CL

    A Little Annotation does a Lot of Good: A Study in Bootstrapping Low-resource Named Entity Recognizers

    Authors: Aditi Chaudhary, Jiateng Xie, Zaid Sheikh, Graham Neubig, Jaime G. Carbonell

    Abstract: Most state-of-the-art models for named entity recognition (NER) rely on the availability of large amounts of labeled data, making them challenging to extend to new, lower-resourced languages. However, there are now several proposed approaches involving either cross-lingual transfer learning, which learns from other highly resourced languages, or active learning, which efficiently selects effective…

    Submitted 23 August, 2019; originally announced August 2019.

    Comments: Accepted at EMNLP 2019

  5. arXiv:1907.10129  [pdf, other]

    cs.CL

    CMU-01 at the SIGMORPHON 2019 Shared Task on Crosslinguality and Context in Morphology

    Authors: Aditi Chaudhary, Elizabeth Salesky, Gayatri Bhat, David R. Mortensen, Jaime G. Carbonell, Yulia Tsvetkov

    Abstract: This paper presents the submission by the CMU-01 team to the SIGMORPHON 2019 task 2 of Morphological Analysis and Lemmatization in Context. This task requires us to produce the lemma and morpho-syntactic description of each token in a sequence, for 107 treebanks. We approach this task with a hierarchical neural conditional random field (CRF) model which predicts each coarse-grained feature (e.g. PO…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: In Proceedings of the ACL-SIGMORPHON 2019 Shared Task: Crosslinguality and Context in Morphology

  6. arXiv:1808.09500  [pdf]

    cs.CL

    Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations

    Authors: Aditi Chaudhary, Chunting Zhou, Lori Levin, Graham Neubig, David R. Mortensen, Jaime G. Carbonell

    Abstract: Much work in Natural Language Processing (NLP) has been for resource-rich languages, making generalization to new, less-resourced languages challenging. We present two approaches for improving generalization to low-resourced languages by adapting continuous word representations using linguistically motivated subword units: phonemes, morphemes and graphemes. Our method requires neither parallel cor…

    Submitted 28 August, 2018; originally announced August 2018.

    Comments: Accepted at EMNLP 2018

  7. arXiv:1806.00179  [pdf, other]

    cs.LG cs.CV stat.ML

    The Nonlinearity Coefficient - Predicting Generalization in Deep Neural Networks

    Authors: George Philipp, Jaime G. Carbonell

    Abstract: For a long time, designing neural architectures that exhibit high performance was considered a dark art that required expert hand-tuning. One of the few well-known guidelines for architecture design is the avoidance of exploding gradients, though even this guideline has remained relatively vague and circumstantial. We introduce the nonlinearity coefficient (NLC), a measurement of the complexity of…

    Submitted 30 January, 2019; v1 submitted 31 May, 2018; originally announced June 2018.

    Comments: Previous name: The Nonlinearity Coefficient - Predicting Overfitting in Deep Neural Networks

  8. arXiv:1712.05577  [pdf, other]

    cs.LG cs.CV

    The exploding gradient problem demystified - definition, prevalence, impact, origin, tradeoffs, and solutions

    Authors: George Philipp, Dawn Song, Jaime G. Carbonell

    Abstract: Whereas it is believed that techniques such as Adam, batch normalization and, more recently, SeLU nonlinearities "solve" the exploding gradient problem, we show that this is not the case in general and that in a range of popular MLP architectures, exploding gradients exist and that they limit the depth to which networks can be effectively trained, both in theory and in practice. We explain why exp…

    Submitted 6 April, 2018; v1 submitted 15 December, 2017; originally announced December 2017.

    Comments: An earlier version of this paper was named "Gradients explode - Deep Networks are shallow - ResNet explained" and presented at the ICLR 2018 workshop (https://openreview.net/forum?id=rJjcdFkPM)

  9. arXiv:1712.05440  [pdf, other]

    cs.LG cs.GT

    Nonparametric Neural Networks

    Authors: George Philipp, Jaime G. Carbonell

    Abstract: Automatically determining the optimal size of a neural network for a given task without prior information currently requires an expensive global search and training many networks from scratch. In this paper, we address the problem of automatically finding a good network size during a single training cycle. We introduce *nonparametric neural networks*, a non-probabilistic framework for conducting o…

    Submitted 14 December, 2017; originally announced December 2017.

    Comments: ICLR 2017

  10. arXiv:1509.02447  [pdf, other]

    eess.SY math.OC

    Efficient Structured Matrix Rank Minimization

    Authors: Adams Wei Yu, Wanli Ma, Yaoliang Yu, Jaime G. Carbonell, Suvrit Sra

    Abstract: We study the problem of finding structured low-rank matrices using nuclear norm regularization where the structure is encoded by a linear map. In contrast to most known approaches for linearly structured rank minimization, we do not (a) use the full SVD, nor (b) resort to augmented Lagrangian techniques, nor (c) solve linear systems per iteration. Instead, we formulate the problem differently so t…

    Submitted 8 September, 2015; originally announced September 2015.

  11. arXiv:1202.3708  [pdf]

    cs.LG stat.ML

    Smoothing Proximal Gradient Method for General Structured Sparse Learning

    Authors: Xi Chen, Qihang Lin, Seyoung Kim, Jaime G. Carbonell, Eric P. Xing

    Abstract: We study the problem of learning high dimensional regression models regularized by a structured-sparsity-inducing penalty that encodes prior structural information on either input or output sides. We consider two widely adopted types of such penalties as our motivating examples: 1) overlapping group lasso penalty, based on the l1/l2 mixed-norm penalty, and 2) graph-guided fusion penalty. For both…

    Submitted 14 February, 2012; originally announced February 2012.

    Comments: arXiv admin note: substantial text overlap with arXiv:1005.4717

    Report number: UAI-P-2011-PG-105-114

  12. arXiv:1005.4717  [pdf, ps, other]

    stat.ML cs.LG math.OC stat.AP stat.CO

    Smoothing proximal gradient method for general structured sparse regression

    Authors: Xi Chen, Qihang Lin, Seyoung Kim, Jaime G. Carbonell, Eric P. Xing

    Abstract: We study the problem of estimating high-dimensional regression models regularized by a structured sparsity-inducing penalty that encodes prior structural information on either the input or output variables. We consider two widely adopted types of penalties of this kind as motivating examples: (1) the general overlapping-group-lasso penalty, generalized from the group-lasso penalty; and (2) the gra…

    Submitted 29 June, 2012; v1 submitted 25 May, 2010; originally announced May 2010.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/11-AOAS514

    Report number: IMS-AOAS-AOAS514

    Journal ref: Annals of Applied Statistics 2012, Vol. 6, No. 2, 719-752

  13. arXiv:1005.3579  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Graph-Structured Multi-task Regression and an Efficient Optimization Method for General Fused Lasso

    Authors: Xi Chen, Seyoung Kim, Qihang Lin, Jaime G. Carbonell, Eric P. Xing

    Abstract: We consider the problem of learning a structured multi-task regression, where the output consists of multiple responses that are related by a graph and the correlated response variables are dependent on the common inputs in a sparse but synergistic manner. Previous methods such as l1/l2-regularized multi-task regression assume that all of the output variables are equally related to the inputs, alt…

    Submitted 19 May, 2010; originally announced May 2010.

    Comments: 21 pages, 7 figures