Showing 1–18 of 18 results for author: Raiko, T

  1. arXiv:1612.03266  [pdf, other]

    cs.CL

    A Character-Word Compositional Neural Language Model for Finnish

    Authors: Matti Lankinen, Hannes Heikinheimo, Pyry Takala, Tapani Raiko, Juha Karhunen

    Abstract: Inspired by recent research, we explore ways to model the highly morphological Finnish language at the level of characters while maintaining the performance of word-level models. We propose a new Character-to-Word-to-Character (C2W2C) compositional language model that uses characters as input and output while still internally processing word level embeddings. Our preliminary experiments, using the…

    Submitted 10 December, 2016; originally announced December 2016.

  2. arXiv:1606.02280  [pdf, ps, other]

    cs.CV

    Semi-Supervised Domain Adaptation for Weakly Labeled Semantic Video Object Segmentation

    Authors: Huiling Wang, Tapani Raiko, Lasse Lensu, Tinghuai Wang, Juha Karhunen

    Abstract: Deep convolutional neural networks (CNNs) have been immensely successful in many high-level computer vision tasks given large labeled datasets. However, for video semantic object segmentation, a domain where labels are scarce, effectively exploiting the representation power of CNN with limited training data remains a challenge. Simply borrowing the existing pretrained CNN image recognition model f…

    Submitted 7 June, 2016; originally announced June 2016.

  3. arXiv:1602.02282  [pdf, other]

    stat.ML cs.LG

    Ladder Variational Autoencoders

    Authors: Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, Ole Winther

    Abstract: Variational Autoencoders are powerful models for unsupervised learning. However deep models with several layers of dependent stochastic variables are difficult to train which limits the improvements obtained using these highly expressive models. We propose a new inference model, the Ladder Variational Autoencoder, that recursively corrects the generative distribution by a data dependent approximat…

    Submitted 27 May, 2016; v1 submitted 6 February, 2016; originally announced February 2016.

  4. arXiv:1511.06727  [pdf, other]

    cs.LG

    Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters

    Authors: Jelena Luketina, Mathias Berglund, Klaus Greff, Tapani Raiko

    Abstract: Hyperparameter selection generally relies on running multiple full training trials, with selection based on validation set performance. We propose a gradient-based approach for locally adjusting hyperparameters during training of the model. Hyperparameters are adjusted so as to make the model parameter gradients, and hence updates, more advantageous for the validation cost. We explore the approach…

    Submitted 17 June, 2016; v1 submitted 20 November, 2015; originally announced November 2015.

    Comments: 9 pages, 7 figures. Accepted at ICML 2016

  5. arXiv:1507.02672  [pdf, other]

    cs.NE cs.LG stat.ML

    Semi-Supervised Learning with Ladder Networks

    Authors: Antti Rasmus, Harri Valpola, Mikko Honkala, Mathias Berglund, Tapani Raiko

    Abstract: We combine supervised learning with unsupervised learning in deep neural networks. The proposed model is trained to simultaneously minimize the sum of supervised and unsupervised cost functions by backpropagation, avoiding the need for layer-wise pre-training. Our work builds on the Ladder network proposed by Valpola (2015), which we extend by combining the model with supervision. We show that the…

    Submitted 24 November, 2015; v1 submitted 9 July, 2015; originally announced July 2015.

    Comments: Revised denoising function, updated results, fixed typos

  6. arXiv:1505.04771  [pdf, other]

    cs.LG cs.AI cs.CL cs.NE

    DopeLearning: A Computational Approach to Rap Lyrics Generation

    Authors: Eric Malmi, Pyry Takala, Hannu Toivonen, Tapani Raiko, Aristides Gionis

    Abstract: Writing rap lyrics requires both creativity to construct a meaningful, interesting story and lyrical skills to produce complex rhyme patterns, which form the cornerstone of good flow. We present a rap lyrics generation method that captures both of these aspects. First, we develop a prediction model to identify the next line of existing lyrics from a set of candidate next lines. This model is based…

    Submitted 9 June, 2016; v1 submitted 18 May, 2015; originally announced May 2015.

    Comments: This is a pre-print of an article appearing at KDD'16

    ACM Class: I.2.7; H.3.3

  7. arXiv:1504.08215  [pdf, other]

    cs.LG cs.NE stat.ML

    Lateral Connections in Denoising Autoencoders Support Supervised Learning

    Authors: Antti Rasmus, Harri Valpola, Tapani Raiko

    Abstract: We show how a deep denoising autoencoder with lateral connections can be used as an auxiliary unsupervised learning task to support supervised learning. The proposed model is trained to minimize simultaneously the sum of supervised and unsupervised cost functions by back-propagation, avoiding the need for layer-wise pretraining. It improves the state of the art significantly in the permutation-inv…

    Submitted 30 April, 2015; originally announced April 2015.

  8. arXiv:1504.01575  [pdf, other]

    cs.LG cs.NE

    Bidirectional Recurrent Neural Networks as Generative Models - Reconstructing Gaps in Time Series

    Authors: Mathias Berglund, Tapani Raiko, Mikko Honkala, Leo Kärkkäinen, Akos Vetek, Juha Karhunen

    Abstract: Bidirectional recurrent neural networks (RNN) are trained to predict both in the positive and negative time directions simultaneously. They have not been used commonly in unsupervised tasks, because a probabilistic interpretation of the model has been difficult. Recently, two different frameworks, GSN and NADE, provide a connection between reconstruction and probabilistic modeling, which makes the…

    Submitted 2 November, 2015; v1 submitted 7 April, 2015; originally announced April 2015.

  9. arXiv:1412.7210  [pdf, other]

    cs.NE cs.CV cs.LG stat.ML

    Denoising autoencoder with modulated lateral connections learns invariant representations of natural images

    Authors: Antti Rasmus, Tapani Raiko, Harri Valpola

    Abstract: Suitable lateral connections between encoder and decoder are shown to allow higher layers of a denoising autoencoder (dAE) to focus on invariant representations. In regular autoencoders, detailed information needs to be carried through the highest layers but lateral connections from encoder to decoder relieve this pressure. It is shown that abstract invariant features can be translated to detailed…

    Submitted 31 March, 2015; v1 submitted 22 December, 2014; originally announced December 2014.

    Comments: Presentation at ICLR 2015 workshop

  10. Linear State-Space Model with Time-Varying Dynamics

    Authors: Jaakko Luttinen, Tapani Raiko, Alexander Ilin

    Abstract: This paper introduces a linear state-space model with time-varying dynamics. The time dependency is obtained by forming the state dynamics matrix as a time-varying linear combination of a set of matrices. The time dependency of the weights in the linear combination is modelled by another linear Gaussian dynamical model allowing the model to learn how the dynamics of the process changes. Previous a…

    Submitted 3 October, 2014; v1 submitted 2 October, 2014; originally announced October 2014.

    Comments: The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-662-44851-9_22

    Journal ref: Machine Learning and Knowledge Discovery in Databases. Lecture Notes in Computer Science Volume 8725, 2014, pp 338-353

  11. arXiv:1406.2989  [pdf, other]

    stat.ML cs.LG cs.NE

    Techniques for Learning Binary Stochastic Feedforward Neural Networks

    Authors: Tapani Raiko, Mathias Berglund, Guillaume Alain, Laurent Dinh

    Abstract: Stochastic binary hidden units in a multi-layer perceptron (MLP) network give at least three potential benefits when compared to deterministic MLP networks. (1) They allow to learn one-to-many type of mappings. (2) They can be used in structured prediction problems, where modeling the internal structure of the output is important. (3) Stochasticity has been shown to be an excellent regularizer, wh…

    Submitted 9 April, 2015; v1 submitted 11 June, 2014; originally announced June 2014.

  12. arXiv:1406.1485  [pdf, other]

    stat.ML cs.LG

    Iterative Neural Autoregressive Distribution Estimator (NADE-k)

    Authors: Tapani Raiko, Li Yao, Kyunghyun Cho, Yoshua Bengio

    Abstract: Training of the neural autoregressive density estimator (NADE) can be viewed as doing one step of probabilistic inference on missing values in data. We propose a new model that extends this inference scheme to multiple steps, arguing that it is easier to learn to improve a reconstruction in $k$ steps rather than to learn to reconstruct in a single inference step. The proposed model is an unsupervi…

    Submitted 5 December, 2014; v1 submitted 5 June, 2014; originally announced June 2014.

    Comments: Accepted at Neural Information Processing Systems (NIPS) 2014

  13. arXiv:1312.6002  [pdf, other]

    cs.NE cs.LG stat.ML

    Stochastic Gradient Estimate Variance in Contrastive Divergence and Persistent Contrastive Divergence

    Authors: Mathias Berglund, Tapani Raiko

    Abstract: Contrastive Divergence (CD) and Persistent Contrastive Divergence (PCD) are popular methods for training the weights of Restricted Boltzmann Machines. However, both methods use an approximate method for sampling from the model distribution. As a side effect, these approximations yield significantly different biases and variances for stochastic gradient estimates of individual data points. It is we…

    Submitted 14 February, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

    Comments: ICLR2014 Workshop Track submission. Rephrased parts of text. Results unchanged

    MSC Class: 62M45 ACM Class: I.2.6

  14. arXiv:1301.3476  [pdf, other]

    cs.LG cs.CV stat.ML

    Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities

    Authors: Tommi Vatanen, Tapani Raiko, Harri Valpola, Yann LeCun

    Abstract: Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. We continue the work by firstly introducing a third transformation to normalize the scale of the outputs of each hidden neuron, and secondly by analyzing the connection…

    Submitted 11 March, 2013; v1 submitted 15 January, 2013; originally announced January 2013.

    Comments: 10 pages, 5 figures, ICLR2013

  15. arXiv:1207.1380  [pdf]

    cs.MS cs.LG stat.ML

    Bayes Blocks: An Implementation of the Variational Bayesian Building Blocks Framework

    Authors: Markus Harva, Tapani Raiko, Antti Honkela, Harri Valpola, Juha Karhunen

    Abstract: A software library for constructing and learning probabilistic models is presented. The library offers a set of building blocks from which a large variety of static and dynamic models can be built. These include hierarchical models for variances of other variables and many nonlinear models. The underlying variational Bayesian machinery, providing for fast and robust estimation but being mathematic…

    Submitted 4 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

    Report number: UAI-P-2005-PG-259-266

  16. arXiv:1207.1353  [pdf]

    cs.AI

    'Say EM' for Selecting Probabilistic Models for Logical Sequences

    Authors: Kristian Kersting, Tapani Raiko

    Abstract: Many real world sequences such as protein secondary structures or shell logs exhibit rich internal structures. Traditional probabilistic models of sequences, however, consider sequences of flat symbols only. Logical hidden Markov models have been proposed as one solution. They deal with logical sequences, i.e., sequences over an alphabet of logical atoms. This comes at the expense of a more comp…

    Submitted 4 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

    Report number: UAI-P-2005-PG-300-307

  17. arXiv:1112.3329  [pdf, other]

    physics.data-an hep-ex stat.AP stat.ML

    Semi-Supervised Anomaly Detection - Towards Model-Independent Searches of New Physics

    Authors: Mikael Kuusela, Tommi Vatanen, Eric Malmi, Tapani Raiko, Timo Aaltonen, Yoshikazu Nagai

    Abstract: Most classification algorithms used in high energy physics fall under the category of supervised machine learning. Such methods require a training set containing both signal and background events and are prone to classification errors should this training data be systematically inaccurate for example due to the assumed MC model. To complement such model-dependent searches, we propose an algorithm…

    Submitted 16 April, 2012; v1 submitted 14 December, 2011; originally announced December 2011.

    Comments: Proceedings of ACAT 2011 conference (Uxbridge, UK), 9 pages, 4 figures

  18. Logical Hidden Markov Models

    Authors: L. De Raedt, K. Kersting, T. Raiko

    Abstract: Logical hidden Markov models (LOHMMs) upgrade traditional hidden Markov models to deal with sequences of structured symbols in the form of logical atoms, rather than flat characters. This note formally introduces LOHMMs and presents solutions to the three central inference problems for LOHMMs: evaluation, most likely hidden state sequence and parameter estimation. The resulting representation and…

    Submitted 9 September, 2011; originally announced September 2011.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 25, pages 425-456, 2006