Showing 1–14 of 14 results for author: Airola, A

Searching in archive cs.
  1. arXiv:2403.15012  [pdf, other]

    cs.LG stat.ML

    Empirical investigation of multi-source cross-validation in clinical machine learning

    Authors: Tuija Leinonen, David Wong, Ali Wahab, Ramesh Nadarajah, Matti Kaisti, Antti Airola

    Abstract: Traditionally, machine learning-based clinical prediction models have been trained and evaluated on patient data from a single source, such as a hospital. Cross-validation methods can be used to estimate the accuracy of such models on new patients originating from the same source, by repeated random splitting of the data. However, such estimates tend to be highly overoptimistic when compared to ac…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 14 pages, 3 figures

  2. arXiv:2403.13612  [pdf, other]

    cs.LG stat.ML

    Does Differentially Private Synthetic Data Lead to Synthetic Discoveries?

    Authors: Ileana Montoya Perez, Parisa Movahedi, Valtteri Nieminen, Antti Airola, Tapio Pahikkala

    Abstract: Background: Synthetic data has been proposed as a solution for sharing anonymized versions of sensitive biomedical datasets. Ideally, synthetic data should preserve the structure and statistical properties of the original data, while protecting the privacy of the individual subjects. Differential privacy (DP) is currently considered the gold standard approach for balancing this trade-off. Object…

    Submitted 20 March, 2024; originally announced March 2024.

  3. arXiv:2111.06175  [pdf, other]

    cs.LG cs.AI

    Training neural networks with synthetic electrocardiograms

    Authors: Matti Kaisti, Juho Laitala, Antti Airola

    Abstract: We present a method for training neural networks with synthetic electrocardiograms that mimic signals produced by a wearable single lead electrocardiogram monitor. We use domain randomization where the synthetic signal properties such as the waveform shape, RR-intervals and noise are varied for every training example. Models trained with synthetic data are compared to their counterparts trained wi…

    Submitted 11 November, 2021; originally announced November 2021.

  4. arXiv:2103.11856  [pdf, other]

    cs.LG cs.IT math.CO

    A Link between Coding Theory and Cross-Validation with Applications

    Authors: Tapio Pahikkala, Parisa Movahedi, Ileana Montoya, Havu Miikonen, Stephan Foldes, Antti Airola, Laszlo Major

    Abstract: How many different binary classification problems a single learning algorithm can solve on a fixed data with exactly zero or at most a given number of cross-validation errors? While the number in the former case is known to be limited by the no-free-lunch theorem, we show that the exact answers are given by the theory of error detecting codes. As a case study, we focus on the AUC performance measu…

    Submitted 9 February, 2024; v1 submitted 22 March, 2021; originally announced March 2021.

  5. Generalized vec trick for fast learning of pairwise kernel models

    Authors: Markus Viljanen, Antti Airola, Tapio Pahikkala

    Abstract: Pairwise learning corresponds to the supervised learning setting where the goal is to make predictions for pairs of objects. Prominent applications include predicting drug-target or protein-protein interactions, or customer-product preferences. In this work, we present a comprehensive review of pairwise kernels, that have been proposed for incorporating prior knowledge about the relationship betwe…

    Submitted 4 February, 2022; v1 submitted 2 September, 2020; originally announced September 2020.

    Comments: 36 pages, 9 figures

  6. arXiv:1803.01575  [pdf, other]

    stat.ML cs.LG

    A Comparative Study of Pairwise Learning Methods based on Kernel Ridge Regression

    Authors: Michiel Stock, Tapio Pahikkala, Antti Airola, Bernard De Baets, Willem Waegeman

    Abstract: Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction or network inference problems. During the last decade kernel methods have played a dominant role in pairwise learning. They still obtain a state-of-the-art predictive performance, but a theoretical analysis of their behavio…

    Submitted 5 March, 2018; originally announced March 2018.

    Comments: arXiv admin note: text overlap with arXiv:1606.04275

  7. arXiv:1701.02359  [pdf, other]

    stat.AP cs.AI stat.ML

    Playtime Measurement with Survival Analysis

    Authors: Markus Viljanen, Antti Airola, Jukka Heikkonen, Tapio Pahikkala

    Abstract: Maximizing product use is a central goal of many businesses, which makes retention and monetization two central analytics metrics in games. Player retention may refer to various duration variables quantifying product use: total playtime or session playtime are popular research targets, and active playtime is well-suited for subscription games. Such research often has the goal of increasing player…

    Submitted 4 January, 2017; originally announced January 2017.

  8. arXiv:1606.04275  [pdf, other]

    cs.LG

    Efficient Pairwise Learning Using Kernel Ridge Regression: an Exact Two-Step Method

    Authors: Michiel Stock, Tapio Pahikkala, Antti Airola, Bernard De Baets, Willem Waegeman

    Abstract: Pairwise learning or dyadic prediction concerns the prediction of properties for pairs of objects. It can be seen as an umbrella covering various machine learning problems such as matrix completion, collaborative filtering, multi-task learning, transfer learning, network prediction and zero-shot learning. In this work we analyze kernel-based methods for pairwise learning, with a particular focus o…

    Submitted 14 June, 2016; originally announced June 2016.

  9. Fast Kronecker product kernel methods via generalized vec trick

    Authors: Antti Airola, Tapio Pahikkala

    Abstract: Kronecker product kernel provides the standard approach in the kernel methods literature for learning from graph data, where edges are labeled and both start and end vertices have their own feature representations. The methods allow generalization to such new edges, whose start and end vertices do not appear in the training data, a setting known as zero-shot or zero-data learning. Such a setting o…

    Submitted 19 April, 2017; v1 submitted 7 January, 2016; originally announced January 2016.

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems, Volume: 29, Issue: 8, Aug. 2018, pages 3374 - 3387

  10. arXiv:1506.05950  [pdf, ps, other]

    cs.LG stat.ML

    Spectral Analysis of Symmetric and Anti-Symmetric Pairwise Kernels

    Authors: Tapio Pahikkala, Markus Viljanen, Antti Airola, Willem Waegeman

    Abstract: We consider the problem of learning regression functions from pairwise data when there exists prior knowledge that the relation to be learned is symmetric or anti-symmetric. Such prior knowledge is commonly enforced by symmetrizing or anti-symmetrizing pairwise kernel functions. Through spectral analysis, we show that these transformations reduce the kernel's effective dimension. Further, we provi…

    Submitted 19 June, 2015; originally announced June 2015.

  11. arXiv:1405.4423  [pdf, other]

    cs.LG

    A two-step learning approach for solving full and almost full cold start problems in dyadic prediction

    Authors: Tapio Pahikkala, Michiel Stock, Antti Airola, Tero Aittokallio, Bernard De Baets, Willem Waegeman

    Abstract: Dyadic prediction methods operate on pairs of objects (dyads), aiming to infer labels for out-of-sample dyads. We consider the full and almost full cold start problem in dyadic prediction, a setting that occurs when both objects in an out-of-sample dyad have not been observed during training, or if one of them has been observed, but very few times. A popular approach for addressing this problem is…

    Submitted 17 May, 2014; originally announced May 2014.

  12. arXiv:1405.4394  [pdf, other]

    cs.LG cs.CE q-bio.QM stat.ML

    Identification of functionally related enzymes by learning-to-rank methods

    Authors: Michiel Stock, Thomas Fober, Eyke Hüllermeier, Serghei Glinca, Gerhard Klebe, Tapio Pahikkala, Antti Airola, Bernard De Baets, Willem Waegeman

    Abstract: Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases. To this end, one usually departs from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures or surfaces. For a given query, the search operation results in a ranking of the enzymes in the da…

    Submitted 17 May, 2014; originally announced May 2014.

  13. arXiv:1209.4825  [pdf, ps, other]

    cs.LG stat.ML

    Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data

    Authors: Tapio Pahikkala, Antti Airola, Michiel Stock, Bernard De Baets, Willem Waegeman

    Abstract: In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficie…

    Submitted 8 June, 2013; v1 submitted 21 September, 2012; originally announced September 2012.

  14. A kernel-based framework for learning graded relations from data

    Authors: Willem Waegeman, Tapio Pahikkala, Antti Airola, Tapio Salakoski, Michiel Stock, Bernard De Baets

    Abstract: Driven by a large number of potential applications in areas like bioinformatics, information retrieval and social network analysis, the problem setting of inferring relations between pairs of data objects has recently been investigated quite intensively in the machine learning community. To this end, current approaches typically consider datasets containing crisp relations, so that standard classi…

    Submitted 28 November, 2011; originally announced November 2011.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

    Journal ref: IEEE Transactions on Fuzzy Systems, Volume: 20, Issue: 6, Dec. 2012, pages 1090 - 1101