
Showing 1–24 of 24 results for author: Zemel, R S

  1. arXiv:2009.04806

    cs.CV cs.LG cs.NE stat.ML

    SketchEmbedNet: Learning Novel Concepts by Imitating Drawings

    Authors: Alexander Wang, Mengye Ren, Richard S. Zemel

    Abstract: Sketch drawings capture the salient information of visual concepts. Previous work has shown that neural networks are capable of producing sketches of natural objects drawn from a small number of classes. While earlier approaches focus on generation quality or retrieval, we explore properties of image representations learned by training a model to produce sketches of images. We show that this gener…

    Submitted 22 June, 2021; v1 submitted 27 August, 2020; originally announced September 2020.

    Comments: ICML 2021

  2. arXiv:2007.04546

    cs.LG cs.CV stat.ML

    Wandering Within a World: Online Contextualized Few-Shot Learning

    Authors: Mengye Ren, Michael L. Iuzzolino, Michael C. Mozer, Richard S. Zemel

    Abstract: We aim to bridge the gap between typical human and machine-learning environments by extending the standard framework of few-shot learning to an online, continual setting. In this setting, episodes do not have separate training and testing phases, and instead models are evaluated online while learning novel classes. As in the real world, where the presence of spatiotemporal context helps us retriev…

    Submitted 22 April, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: ICLR 2021

  3. arXiv:1910.00760

    cs.LG stat.ML

    Efficient Graph Generation with Graph Recurrent Attention Networks

    Authors: Renjie Liao, Yujia Li, Yang Song, Shenlong Wang, Charlie Nash, William L. Hamilton, David Duvenaud, Raquel Urtasun, Richard S. Zemel

    Abstract: We propose a new family of efficient and expressive deep generative models of graphs, called Graph Recurrent Attention Networks (GRANs). Our model generates graphs one block of nodes and associated edges at a time. The block size and sampling stride allow us to trade off sample quality for efficiency. Compared to previous RNN-based graph generative models, our framework better captures the auto-re…

    Submitted 17 July, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

    Comments: Neural Information Processing Systems (NeurIPS) 2019

  4. arXiv:1901.01484

    cs.LG stat.ML

    LanczosNet: Multi-Scale Deep Graph Convolutional Networks

    Authors: Renjie Liao, Zhizhen Zhao, Raquel Urtasun, Richard S. Zemel

    Abstract: We propose the Lanczos network (LanczosNet), which uses the Lanczos algorithm to construct low rank approximations of the graph Laplacian for graph convolution. Relying on the tridiagonal decomposition of the Lanczos algorithm, we not only efficiently exploit multi-scale information via fast approximated computation of matrix power but also design learnable spectral filters. Being fully differenti…

    Submitted 23 October, 2019; v1 submitted 5 January, 2019; originally announced January 2019.

    Comments: The International Conference on Learning Representations (ICLR) 2019

  5. arXiv:1810.07218

    cs.LG cs.CV stat.ML

    Incremental Few-Shot Learning with Attention Attractor Networks

    Authors: Mengye Ren, Renjie Liao, Ethan Fetaya, Richard S. Zemel

    Abstract: Machine learning classifiers are often trained to recognize a set of pre-defined classes. However, in many applications, it is often desirable to have the flexibility of learning additional concepts, with limited data and without re-training on the full training set. This paper addresses this problem, incremental few-shot learning, where a regular classification network has already been trained to…

    Submitted 6 October, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

    Comments: NeurIPS 2019

  6. arXiv:1803.00676

    cs.LG cs.CV stat.ML

    Meta-Learning for Semi-Supervised Few-Shot Classification

    Authors: Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell, Kevin Swersky, Joshua B. Tenenbaum, Hugo Larochelle, Richard S. Zemel

    Abstract: In few-shot classification, we are interested in learning algorithms that train a classifier from only a handful of labeled examples. Recent progress in few-shot classification has featured meta-learning, in which a parameterized model for a learning algorithm is defined and trained on episodes representing different classification problems, each with a small labeled training set and its correspon…

    Submitted 1 March, 2018; originally announced March 2018.

    Comments: Published as a conference paper at ICLR 2018. 15 pages

  7. arXiv:1703.05175

    cs.LG stat.ML

    Prototypical Networks for Few-shot Learning

    Authors: Jake Snell, Kevin Swersky, Richard S. Zemel

    Abstract: We propose prototypical networks for the problem of few-shot classification, where a classifier must generalize to new classes not seen in the training set, given only a small number of examples of each new class. Prototypical networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for f…

    Submitted 19 June, 2017; v1 submitted 15 March, 2017; originally announced March 2017.

  8. arXiv:1611.04520

    cs.LG stat.ML

    Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes

    Authors: Mengye Ren, Renjie Liao, Raquel Urtasun, Fabian H. Sinz, Richard S. Zemel

    Abstract: Normalization techniques have only recently begun to be exploited in supervised learning tasks. Batch normalization exploits mini-batch statistics to normalize the activations. This was shown to speed up training and result in better models. However its success has been very limited when dealing with recurrent neural networks. On the other hand, layer normalization normalizes the activations acros…

    Submitted 6 March, 2017; v1 submitted 14 November, 2016; originally announced November 2016.

    Comments: Published as a conference paper at ICLR 2017

  9. arXiv:1605.09410

    cs.LG cs.CV

    End-to-End Instance Segmentation with Recurrent Attention

    Authors: Mengye Ren, Richard S. Zemel

    Abstract: While convolutional neural networks have gained impressive success recently in solving structured prediction problems such as semantic segmentation, it remains a challenge to differentiate individual object instances in the scene. Instance segmentation is very important in a variety of applications, such as autonomous driving, image captioning, and visual question answering. Techniques that combin…

    Submitted 12 July, 2017; v1 submitted 30 May, 2016; originally announced May 2016.

    Comments: CVPR 2017

  10. arXiv:1511.06411

    cs.LG

    Training Deep Neural Networks via Direct Loss Minimization

    Authors: Yang Song, Alexander G. Schwing, Richard S. Zemel, Raquel Urtasun

    Abstract: Supervised training of deep neural nets typically relies on minimizing cross-entropy. However, in many domains, we are interested in performing well on metrics specific to the application. In this paper we propose a direct loss minimization approach to train deep neural networks, which provably minimizes the application-specific loss function. This is often non-trivial, since these functions are n…

    Submitted 1 June, 2016; v1 submitted 19 November, 2015; originally announced November 2015.

    Comments: ICML 2016

  11. arXiv:1511.06409

    cs.LG cs.CV

    Learning to Generate Images with Perceptual Similarity Metrics

    Authors: Jake Snell, Karl Ridgeway, Renjie Liao, Brett D. Roads, Michael C. Mozer, Richard S. Zemel

    Abstract: Deep networks are increasingly being applied to problems involving image synthesis, e.g., generating images from textual descriptions and reconstructing an input image from a compact representation. Supervised training of image-synthesis networks typically uses a pixel-wise loss (PL) to indicate the mismatch between a generated image and its corresponding target image. We propose instead to use a…

    Submitted 23 January, 2017; v1 submitted 19 November, 2015; originally announced November 2015.

  12. arXiv:1506.06726

    cs.CL cs.LG

    Skip-Thought Vectors

    Authors: Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler

    Abstract: We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next introduce a simple vocabulary expansion me…

    Submitted 22 June, 2015; originally announced June 2015.

    Comments: 11 pages

  13. arXiv:1411.2539

    cs.LG cs.CL cs.CV

    Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models

    Authors: Ryan Kiros, Ruslan Salakhutdinov, Richard S. Zemel

    Abstract: Inspired by recent advances in multimodal learning and machine translation, we introduce an encoder-decoder pipeline that learns (a): a multimodal joint embedding space with images and text and (b): a novel language model for decoding distributed representations from our space. Our pipeline effectively unifies joint image-text embedding models with multimodal neural language models. We introduce t…

    Submitted 10 November, 2014; originally announced November 2014.

    Comments: 13 pages. NIPS 2014 deep learning workshop

  14. arXiv:1406.2710

    cs.LG cs.CL

    A Multiplicative Model for Learning Distributed Text-Based Attribute Representations

    Authors: Ryan Kiros, Richard S. Zemel, Ruslan Salakhutdinov

    Abstract: In this paper we propose a general framework for learning distributed representations of attributes: characteristics of text whose representations can be jointly learned with word embeddings. Attributes can correspond to document indicators (to learn sentence vectors), language indicators (to learn distributed language representations), meta-data and side information (such as the age, gender and i…

    Submitted 10 June, 2014; originally announced June 2014.

    Comments: 11 pages. An earlier version was accepted to the ICML-2014 Workshop on Knowledge-Powered Deep Learning for Text Mining

  15. arXiv:1402.0929

    stat.ML cs.LG

    Input Warping for Bayesian Optimization of Non-stationary Functions

    Authors: Jasper Snoek, Kevin Swersky, Richard S. Zemel, Ryan P. Adams

    Abstract: Bayesian optimization has proven to be a highly effective methodology for the global optimization of unknown, expensive and multimodal functions. The ability to accurately model distributions over functions is critical to the effectiveness of Bayesian optimization. Although Gaussian processes provide a flexible prior over functions which can be queried efficiently, there are various classes of fun…

    Submitted 11 June, 2014; v1 submitted 4 February, 2014; originally announced February 2014.

  16. arXiv:1212.2513

    cs.LG stat.ML

    Efficient Parametric Projection Pursuit Density Estimation

    Authors: Max Welling, Richard S. Zemel, Geoffrey E. Hinton

    Abstract: Product models of low dimensional experts are a powerful way to avoid the curse of dimensionality. We present the "under-complete product of experts" (UPoE), where each expert models a one dimensional projection of the data. The UPoE is fully tractable and may be interpreted as a parametric probabilistic model for projection pursuit. Its ML learning rules are identical to the…

    Submitted 19 October, 2012; originally announced December 2012.

    Comments: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

    Report number: UAI-P-2003-PG-575-582

  17. arXiv:1212.2442

    cs.IR cs.LG stat.ML

    Active Collaborative Filtering

    Authors: Craig Boutilier, Richard S. Zemel, Benjamin Marlin

    Abstract: Collaborative filtering (CF) allows the preferences of multiple users to be pooled to make recommendations regarding unseen products. We consider in this paper the problem of online and interactive CF: given the current ratings associated with a user, what queries (new ratings) would most improve the quality of the recommendations made? We cast this in terms of expected value of i…

    Submitted 19 October, 2012; originally announced December 2012.

    Comments: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

    Report number: UAI-P-2003-PG-98-106

  18. arXiv:1210.4899

    cs.LG stat.ML

    Fast Exact Inference for Recursive Cardinality Models

    Authors: Daniel Tarlow, Kevin Swersky, Richard S. Zemel, Ryan Prescott Adams, Brendan J. Frey

    Abstract: Cardinality potentials are a generally useful class of high order potential that affect probabilities based on how many of D binary variables are active. Maximum a posteriori (MAP) inference for cardinality potential models is well-understood, with efficient computations taking O(D log D) time. Yet efficient marginalization and sampling have not been addressed as thoroughly in the machine learning c…

    Submitted 16 October, 2012; originally announced October 2012.

    Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

    Report number: UAI-P-2012-PG-825-834

  19. arXiv:1206.5267

    cs.LG cs.IR stat.ML

    Collaborative Filtering and the Missing at Random Assumption

    Authors: Benjamin Marlin, Richard S. Zemel, Sam Roweis, Malcolm Slaney

    Abstract: Rating prediction is an important application, and a popular research topic in collaborative filtering. However, both the validity of learning algorithms, and the validity of standard testing procedures rest on the assumption that missing ratings are missing at random (MAR). In this paper we present the results of a user study in which we collect a random sample of ratings from current users of an…

    Submitted 20 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007)

    Report number: UAI-P-2007-PG-267-275

  20. arXiv:1206.3294

    cs.LG stat.ML

    Flexible Priors for Exemplar-based Clustering

    Authors: Daniel Tarlow, Richard S. Zemel, Brendan J. Frey

    Abstract: Exemplar-based clustering methods have been shown to produce state-of-the-art results on a number of synthetic and real-world clustering problems. They are appealing because they offer computational benefits over latent-mean models and can handle arbitrary pairwise similarity measures between data points. However, when trying to recover underlying structure in clustering problems, tailored similar…

    Submitted 13 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

    Report number: UAI-P-2008-PG-537-545

  21. arXiv:1202.3706

    cs.IR cs.AI

    A Framework for Optimizing Paper Matching

    Authors: Laurent Charlin, Richard S. Zemel, Craig Boutilier

    Abstract: At the heart of many scientific conferences is the problem of matching submitted papers to suitable reviewers. Arriving at a good assignment is a major and important challenge for any conference organizer. In this paper we propose a framework to optimize paper-to-reviewer assignments. Our framework uses suitability scores to measure pairwise affinity between papers and reviewers. We show how learn…

    Submitted 14 February, 2012; originally announced February 2012.

    Report number: UAI-P-2011-PG-86-95

  22. arXiv:1107.1805

    stat.ML cs.AI

    Loss-sensitive Training of Probabilistic Conditional Random Fields

    Authors: Maksims N. Volkovs, Hugo Larochelle, Richard S. Zemel

    Abstract: We consider the problem of training probabilistic conditional random fields (CRFs) in the context of a task where performance is measured using a specific loss function. While maximum likelihood is the most common approach to training CRFs, it ignores the inherent structure of the task's loss function. We describe alternatives to maximum likelihood which take that loss into account. These include…

    Submitted 9 July, 2011; originally announced July 2011.

  23. arXiv:1106.1925

    stat.ML cs.IR cs.LG

    Ranking via Sinkhorn Propagation

    Authors: Ryan Prescott Adams, Richard S. Zemel

    Abstract: It is of increasing importance to develop learning methods for ranking. In contrast to many learning objectives, however, the ranking problem presents difficulties due to the fact that the space of permutations is not smooth. In this paper, we examine the class of rank-linear objective functions, which includes popular metrics such as precision and discounted cumulative gain. In particular, we obs…

    Submitted 13 June, 2011; v1 submitted 9 June, 2011; originally announced June 2011.

    Comments: Submitted

  24. arXiv:1105.1178

    cs.LG cs.DS stat.ML

    Interpreting Graph Cuts as a Max-Product Algorithm

    Authors: Daniel Tarlow, Inmar E. Givoni, Richard S. Zemel, Brendan J. Frey

    Abstract: The maximum a posteriori (MAP) configuration of binary variable models with submodular graph-structured energy functions can be found efficiently and exactly by graph cuts. Max-product belief propagation (MP) has been shown to be suboptimal on this class of energy functions by a canonical counterexample where MP converges to a suboptimal fixed point (Kulesza & Pereira, 2008). In this work, we sh…

    Submitted 5 May, 2011; originally announced May 2011.