Showing 1–28 of 28 results for author: Jebara, T

  1. arXiv:2205.04528  [pdf, other]

    cs.LG

    Selectively Contextual Bandits

    Authors: Claudia Roberts, Maria Dimakopoulou, Qifeng Qiao, Ashok Chandrashekhar, Tony Jebara

    Abstract: Contextual bandits are widely used in industrial personalization systems. These online learning frameworks learn a treatment assignment policy in the presence of treatment effects that vary with the observed contextual features of the users. While personalization creates a rich user experience that reflects individual interests, there are benefits of a shared experience across a community that enab…

    Submitted 9 May, 2022; originally announced May 2022.

  2. arXiv:2103.13420  [pdf, other]

    cs.LG cs.AI

    Active Multitask Learning with Committees

    Authors: Jingxi Xu, Da Tang, Tony Jebara

    Abstract: The cost of annotating training data has traditionally been a bottleneck for supervised learning approaches. The problem is further exacerbated when supervised learning is applied to a number of correlated tasks simultaneously since the amount of labels required scales with the number of tasks. To mitigate this concern, we propose an active multitask learning algorithm that achieves knowledge tran…

    Submitted 24 March, 2021; originally announced March 2021.

  3. arXiv:1906.06419  [pdf, other]

    cs.LG stat.ML

    Learning Correlated Latent Representations with Adaptive Priors

    Authors: Da Tang, Dawen Liang, Nicholas Ruozzi, Tony Jebara

    Abstract: Variational Auto-Encoders (VAEs) have been widely applied for learning compact, low-dimensional latent representations of high-dimensional data. When the correlation structure among data points is available, previous work proposed Correlated Variational Auto-Encoders (CVAEs), which employ a structured mixture model as prior and a structured variational posterior for each mixture component to enfor…

    Submitted 18 December, 2019; v1 submitted 14 June, 2019; originally announced June 2019.

    Comments: 16 pages, 1 figure, 5 tables

  4. arXiv:1905.12052  [pdf, other]

    cs.LG stat.ML

    A New Distribution on the Simplex with Auto-Encoding Applications

    Authors: Andrew Stirn, Tony Jebara, David A Knowles

    Abstract: We construct a new distribution for the simplex using the Kumaraswamy distribution and an ordered stick-breaking process. We explore and develop the theoretical properties of this new distribution and prove that it exhibits symmetry under the same conditions as the well-known Dirichlet. Like the Dirichlet, the new distribution is adept at capturing sparsity but, unlike the Dirichlet, has an exact…

    Submitted 14 December, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: 15 pages, 6 figures, 1 table

  5. arXiv:1905.05335  [pdf, other]

    cs.LG stat.ML

    Correlated Variational Auto-Encoders

    Authors: Da Tang, Dawen Liang, Tony Jebara, Nicholas Ruozzi

    Abstract: Variational Auto-Encoders (VAEs) are capable of learning latent representations for high dimensional data. However, due to the i.i.d. assumption, VAEs only optimize the singleton variational distributions and fail to account for the correlations between data points, which might be crucial for learning latent representations from dataset where a priori we know correlations exist. We propose Correla…

    Submitted 17 April, 2020; v1 submitted 13 May, 2019; originally announced May 2019.

    Comments: International Conference on Machine Learning (ICML), 2019

  6. arXiv:1905.03818  [pdf, other]

    cs.LG stat.ML

    Beta Survival Models

    Authors: David Hubbard, Benoit Rostykus, Yves Raimond, Tony Jebara

    Abstract: This article analyzes the problem of estimating the time until an event occurs, also known as survival modeling. We observe through substantial experiments on large real-world datasets and use-cases that populations are largely heterogeneous. Sub-populations have different mean and variance in their survival rates requiring flexible models that capture heterogeneity. We leverage a classical extens…

    Submitted 9 May, 2019; originally announced May 2019.

    Comments: 11 pages, 9 figures

  7. arXiv:1812.00856  [pdf, other]

    cs.LG stat.ML

    Thompson Sampling for Noncompliant Bandits

    Authors: Andrew Stirn, Tony Jebara

    Abstract: Thompson sampling, a Bayesian method for balancing exploration and exploitation in bandit problems, has theoretical guarantees and exhibits strong empirical performance in many domains. Traditional Thompson sampling, however, assumes perfect compliance, where an agent's chosen action is treated as the implemented action. This article introduces a stochastic noncompliance model that relaxes this as…

    Submitted 3 December, 2018; originally announced December 2018.

    Comments: 21 pages, 5 figures

  8. arXiv:1807.06651  [pdf, other]

    stat.ML cs.IR cs.LG

    Item Recommendation with Variational Autoencoders and Heterogenous Priors

    Authors: Giannis Karamanolakis, Kevin Raji Cherian, Ananth Ravi Narayan, Jie Yuan, Da Tang, Tony Jebara

    Abstract: In recent years, Variational Autoencoders (VAEs) have been shown to be highly effective in both standard collaborative filtering applications and extensions such as incorporation of implicit feedback. We extend VAEs to collaborative filtering with side information, for instance when ratings are combined with explicit text feedback from the user. Instead of using a user-agnostic standard Gaussian p…

    Submitted 6 October, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: Accepted for the 3rd Workshop on Deep Learning for Recommender Systems (DLRS 2018), held in conjunction with the 12th ACM Conference on Recommender Systems (RecSys 2018) in Vancouver, Canada

  9. arXiv:1804.07855  [pdf, other]

    cs.CL cs.AI cs.LG

    Subgoal Discovery for Hierarchical Dialogue Policy Learning

    Authors: Da Tang, Xiujun Li, Jianfeng Gao, Chong Wang, Lihong Li, Tony Jebara

    Abstract: Developing agents to engage in complex goal-oriented dialogues is challenging partly because the main learning signals are very sparse in long conversations. In this paper, we propose a divide-and-conquer approach that discovers and exploits the hidden structure of the task to enable efficient policy learning. First, given successful example dialogues, we propose the Subgoal Discovery Network (SDN…

    Submitted 22 September, 2018; v1 submitted 20 April, 2018; originally announced April 2018.

    Comments: 11 pages, 6 figures, EMNLP 2018

  10. arXiv:1804.05454  [pdf, ps, other]

    math.ST cs.LG q-fin.PM

    A refinement of Bennett's inequality with applications to portfolio optimization

    Authors: Tony Jebara

    Abstract: A refinement of Bennett's inequality is introduced which is strictly tighter than the classical bound. The new bound establishes the convergence of the average of independent random variables to its expected value. It also carefully exploits information about the potentially heterogeneous mean, variance, and ceiling of each random variable. The bound is strictly sharper in the homogeneous setting…

    Submitted 15 April, 2018; originally announced April 2018.

  11. arXiv:1802.05814  [pdf, other]

    stat.ML cs.IR cs.LG

    Variational Autoencoders for Collaborative Filtering

    Authors: Dawen Liang, Rahul G. Krishnan, Matthew D. Hoffman, Tony Jebara

    Abstract: We extend variational autoencoders (VAEs) to collaborative filtering for implicit feedback. This non-linear probabilistic model enables us to go beyond the limited modeling capacity of linear factor models which still largely dominate collaborative filtering research. We introduce a generative model with multinomial likelihood and use Bayesian inference for parameter estimation. Despite widespread…

    Submitted 15 February, 2018; originally announced February 2018.

    Comments: 10 pages, 3 figures. WWW 2018

  12. arXiv:1611.00838  [pdf, other]

    stat.ML cs.CV cs.LG

    Initialization and Coordinate Optimization for Multi-way Matching

    Authors: Da Tang, Tony Jebara

    Abstract: We consider the problem of consistently matching multiple sets of elements to each other, which is a common task in fields such as computer vision. To solve the underlying NP-hard objective, existing methods often relax or approximate it, but end up with unsatisfying empirical performance due to a misaligned objective. We propose a coordinate update algorithm that directly optimizes the target obj…

    Submitted 18 July, 2019; v1 submitted 2 November, 2016; originally announced November 2016.

    Comments: Artificial Intelligence and Statistics (AISTATS), 2017

  13. arXiv:1610.07797  [pdf, other]

    math.OC cs.LG stat.ML

    Frank-Wolfe Algorithms for Saddle Point Problems

    Authors: Gauthier Gidel, Tony Jebara, Simon Lacoste-Julien

    Abstract: We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained smooth convex-concave saddle point (SP) problems. Remarkably, the method only requires access to linear minimization oracles. Leveraging recent advances in FW optimization, we provide the first proof of convergence of a FW-type saddle point solver over polytopes, thereby partially answering a 30 year-old conjecture. We also…

    Submitted 3 March, 2017; v1 submitted 25 October, 2016; originally announced October 2016.

    Comments: Appears in: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017). 39 pages

    MSC Class: 90C52; 90C90; 68T05 ACM Class: G.1.6; I.2.6

  14. arXiv:1511.05212  [pdf, other]

    cs.LG

    Binary embeddings with structured hashed projections

    Authors: Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann LeCun

    Abstract: We consider the hashing mechanism for constructing binary embeddings, that involves pseudo-random projections followed by nonlinear (sign function) mappings. The pseudo-random projection is described by a matrix, where not all entries are independent random variables but instead a fixed "budget of randomness" is distributed across the matrix. Such matrices can be efficiently stored in sub-quadrati…

    Submitted 1 July, 2016; v1 submitted 16 November, 2015; originally announced November 2015.

    Comments: arXiv admin note: text overlap with arXiv:1505.03190

  15. arXiv:1504.01119  [pdf, ps, other]

    cs.DM math.CO

    Coloring tournaments with forbidden substructures

    Authors: Krzysztof Choromanski, Tony Jebara

    Abstract: Coloring graphs is an important algorithmic problem in combinatorics with many applications in computer science. In this paper we study coloring tournaments. A chromatic number of a random tournament is of order $Ω(\frac{n}{\log(n)})$. The question arises whether the chromatic number can be proven to be smaller for more structured nontrivial classes of tournaments. We analyze the class of tourname…

    Submitted 5 April, 2015; originally announced April 2015.

  16. arXiv:1503.01228  [pdf, other]

    cs.LG cs.CV stat.ML

    Bethe Learning of Conditional Random Fields via MAP Decoding

    Authors: Kui Tang, Nicholas Ruozzi, David Belanger, Tony Jebara

    Abstract: Many machine learning tasks can be formulated in terms of predicting structured outputs. In frameworks such as the structured support vector machine (SVM-Struct) and the structured perceptron, discriminative functions are learned by iteratively applying efficient maximum a posteriori (MAP) decoding. However, maximum likelihood estimation (MLE) of probabilistic models over these same structured spa…

    Submitted 4 March, 2015; originally announced March 2015.

    Comments: 19 pages (9 supplementary), 10 figures (3 supplementary)

  17. arXiv:1402.5902  [pdf, ps, other]

    stat.ML cs.LG

    On Learning from Label Proportions

    Authors: Felix X. Yu, Krzysztof Choromanski, Sanjiv Kumar, Tony Jebara, Shih-Fu Chang

    Abstract: Learning from Label Proportions (LLP) is a learning setting, where the training data is provided in groups, or "bags", and only the proportion of each class in each bag is known. The task is to learn a model to predict the class labels of the individual instances. LLP has broad applications in political science, marketing, healthcare, and computer vision. This work answers the fundamental question…

    Submitted 11 February, 2015; v1 submitted 24 February, 2014; originally announced February 2014.

  18. arXiv:1401.0044  [pdf, ps, other]

    cs.LG

    Approximating the Bethe partition function

    Authors: Adrian Weller, Tony Jebara

    Abstract: When belief propagation (BP) converges, it does so to a stationary point of the Bethe free energy $F$, and is often strikingly accurate. However, it may converge only to a local optimum or may not converge at all. An algorithm was recently introduced for attractive binary pairwise MRFs which is guaranteed to return an $ε$-approximation to the global minimum of $F$ in polynomial time provided the m…

    Submitted 30 December, 2013; originally announced January 2014.

    Report number: cucs-031-13

  19. arXiv:1309.6872  [pdf]

    cs.AI cs.DS

    On MAP Inference by MWSS on Perfect Graphs

    Authors: Adrian Weller, Tony S. Jebara

    Abstract: Finding the most likely (MAP) configuration of a Markov random field (MRF) is NP-hard in general. A promising, recent technique is to reduce the problem to finding a maximum weight stable set (MWSS) on a derived weighted graph, which if perfect, allows inference in polynomial time. We derive new results for this approach, including a general decomposition theorem for MRFs of any order and number o…

    Submitted 26 September, 2013; originally announced September 2013.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

    Report number: UAI-P-2013-PG-684-693

  20. arXiv:1309.5605  [pdf, ps, other]

    cs.LG

    Stochastic Bound Majorization

    Authors: Anna Choromanska, Tony Jebara

    Abstract: Recently a majorization method for optimizing partition functions of log-linear models was proposed alongside a novel quadratic variational upper-bound. In the batch setting, it outperformed state-of-the-art first- and second-order optimization methods on various learning tasks. We propose a stochastic version of this bound majorization method as well as a low-rank modification for high-dimensiona…

    Submitted 22 September, 2013; originally announced September 2013.

  21. arXiv:1309.1369  [pdf, other]

    stat.ML cs.LG math.NA stat.CO

    Semistochastic Quadratic Bound Methods

    Authors: Aleksandr Y. Aravkin, Anna Choromanska, Tony Jebara, Dimitri Kanevsky

    Abstract: Partition functions arise in a variety of settings, including conditional random fields, logistic regression, and latent gaussian models. In this paper, we consider semistochastic quadratic bound (SQB) methods for maximum likelihood inference based on partition function optimization. Batch methods based on the quadratic bound were recently proposed for this class of problems, and performed favorab…

    Submitted 17 February, 2014; v1 submitted 5 September, 2013; originally announced September 2013.

    Comments: 11 pages, 1 figure

    MSC Class: 90C55; 90C15; 62H30

  22. arXiv:1306.0886  [pdf, other]

    cs.LG stat.ML

    $\propto$SVM for learning with label proportions

    Authors: Felix X. Yu, Dong Liu, Sanjiv Kumar, Tony Jebara, Shih-Fu Chang

    Abstract: We study the problem of learning with label proportions in which the training data is provided in groups and only the proportion of each class in each group is known. We propose a new method called proportion-SVM, or $\propto$SVM, which explicitly models the latent unknown instance labels together with the known group label proportions in a large-margin framework. Unlike the existing works, our ap…

    Submitted 4 June, 2013; originally announced June 2013.

    Comments: Appears in Proceedings of the 30th International Conference on Machine Learning (ICML 2013)

  23. arXiv:1301.3865  [pdf]

    cs.LG stat.ML

    Feature Selection and Dualities in Maximum Entropy Discrimination

    Authors: Tony S. Jebara, Tommi S. Jaakkola

    Abstract: Incorporating feature selection into a classification or regression method often carries a number of advantages. In this paper we formalize feature selection specifically from a discriminative perspective of improving classification/regression accuracy. The feature selection method is developed as an extension to the recently proposed maximum entropy discrimination (MED) framework. We describe MED…

    Submitted 16 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000)

    Report number: UAI-P-2000-PG-291-300

  24. arXiv:1301.0015  [pdf, ps, other]

    cs.LG stat.ML

    Bethe Bounds and Approximating the Global Optimum

    Authors: Adrian Weller, Tony Jebara

    Abstract: Inference in general Markov random fields (MRFs) is NP-hard, though identifying the maximum a posteriori (MAP) configuration of pairwise MRFs with submodular cost functions is efficiently solvable using graph cuts. Marginal inference, however, even for this restricted class, is in #P. We prove new formulations of derivatives of the Bethe free energy, provide bounds on the derivatives and bracket t…

    Submitted 31 December, 2012; originally announced January 2013.

  25. arXiv:1207.4148  [pdf]

    cs.LG stat.ML

    Dynamical Systems Trees

    Authors: Andrew Howard, Tony S. Jebara

    Abstract: We propose dynamical systems trees (DSTs) as a flexible class of models for describing multiple processes that interact via a hierarchy of aggregating parent chains. DSTs extend Kalman filters, hidden Markov models and nonlinear dynamical systems to an interactive group scenario. Various individual processes interact as communities and sub-communities in a tree structure that is unrolled in time.…

    Submitted 11 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)

    Report number: UAI-P-2004-PG-260-267

  26. arXiv:1206.3269  [pdf]

    cs.LG stat.ML

    Bayesian Out-Trees

    Authors: Tony S. Jebara

    Abstract: A Bayesian treatment of latent directed graph structure for non-iid data is provided where each child datum is sampled with a directed conditional dependence on a single unknown parent datum. The latent graph structure is assumed to lie in the family of directed out-tree graphs which leads to efficient Bayesian inference. The latent likelihood of the data and its gradients are computable in closed…

    Submitted 13 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

    Report number: UAI-P-2008-PG-315-324

  27. arXiv:1205.2639  [pdf]

    cs.AI cs.DM cs.DS

    MAP Estimation, Message Passing, and Perfect Graphs

    Authors: Tony S. Jebara

    Abstract: Efficiently finding the maximum a posteriori (MAP) configuration of a graphical model is an important problem which is often implemented using message passing algorithms. The optimality of such algorithms is only well established for singly-connected graphs and other limited settings. This article extends the set of graphs where MAP estimation is in P and where message passing recovers the exact s…

    Submitted 9 May, 2012; originally announced May 2012.

    Comments: Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

    Report number: UAI-P-2009-PG-258-267

  28. arXiv:0908.1769  [pdf, ps, other]

    cs.LG cs.IT

    Approximating the Permanent with Belief Propagation

    Authors: Bert Huang, Tony Jebara

    Abstract: This work describes a method of approximating matrix permanents efficiently using belief propagation. We formulate a probability distribution whose partition function is exactly the permanent, then use Bethe free energy to approximate this partition function. After deriving some speedups to standard belief propagation, the resulting algorithm requires $O(n^2)$ time per iteration. Finally, we demo…

    Submitted 12 August, 2009; originally announced August 2009.