Showing 1–6 of 6 results for author: Taştan, Ö

  1. arXiv:2111.03837  [pdf]

    cs.CL cs.IR cs.LG

    Focusing on Potential Named Entities During Active Label Acquisition

    Authors: Ali Osman Berk Sapci, Oznur Tastan, Reyyan Yeniterzi

    Abstract: Named entity recognition (NER) aims to identify mentions of named entities in an unstructured text and classify them into predefined named entity classes. While deep learning-based pre-trained language models help to achieve good predictive performances in NER, many domain-specific NER applications still call for a substantial amount of labeled data. Active learning (AL), a general framework for t…

    Submitted 13 June, 2023; v1 submitted 6 November, 2021; originally announced November 2021.

    Comments: 20 pages, 8 figures

    Journal ref: Natural Language Engineering, pp. 1-23, 2023

  2. arXiv:2110.11446  [pdf, other]

    cs.CR cs.LG q-bio.GN

    ML with HE: Privacy Preserving Machine Learning Inferences for Genome Studies

    Authors: Ş. S. Mağara, C. Yıldırım, F. Yaman, B. Dilekoğlu, F. R. Tutaş, E. Öztürk, K. Kaya, Ö. Taştan, E. Savaş

    Abstract: Preserving the privacy and security of big data in the context of cloud computing, while maintaining a certain level of efficiency of its processing remains to be a subject, open for improvement. One of the most popular applications epitomizing said concerns is found to be useful in genome analysis. This work proposes a secure multi-label tumor classification method using homomorphic encryption, w…

    Submitted 1 February, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

  3. arXiv:2106.05211  [pdf, other]

    cs.CR cs.HC

    Near-Optimal Privacy-Utility Tradeoff in Genomic Studies Using Selective SNP Hiding

    Authors: Nour Almadhoun Alserr, Gulce Kale, Onur Mutlu, Oznur Tastan, Erman Ayday

    Abstract: Motivation: Researchers need a rich trove of genomic datasets that they can leverage to gain a better understanding of the genetic basis of the human genome and identify associations between phenotypes and specific parts of DNA. However, sharing genomic datasets that include sensitive genetic or medical information of individuals can lead to serious privacy-related consequences if data lands in th…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: 9 pages, 9 figures

  4. arXiv:1812.02497  [pdf, other]

    cs.LG stat.ML

    Active Learning Methods based on Statistical Leverage Scores

    Authors: Cem Orhan, Oznur Tastan

    Abstract: In many real-world machine learning applications, unlabeled data are abundant whereas class labels are expensive and scarce. An active learner aims to obtain a model of high accuracy with as few labeled instances as possible by effectively selecting useful examples for labeling. We propose a new selection criterion that is based on statistical leverage scores and present two novel active learning…

    Submitted 6 December, 2018; originally announced December 2018.

    Comments: Submitted to Machine Learning Journal, EMLP 2019 journal track

  5. arXiv:1612.04431  [pdf, other]

    cs.CE cs.LG

    Identification of Cancer Patient Subgroups via Smoothed Shortest Path Graph Kernel

    Authors: Ali Burak Ünal, Öznur Taştan

    Abstract: Characterizing patient somatic mutations through next-generation sequencing technologies opens up possibilities for refining cancer subtypes. However, catalogues of mutations reveal that only a small fraction of the genes are altered frequently in patients. On the other hand different genomic alterations may perturb the same pathways. We propose a novel clustering procedure that quantifies the sim…

    Submitted 15 December, 2016; v1 submitted 13 December, 2016; originally announced December 2016.

    Comments: NIPS Workshop on Machine Learning in Computational Biology, Barcelona, Spain, December 10, 2016

  6. arXiv:1507.04155  [pdf, ps, other]

    cs.LG stat.ML

    ALEVS: Active Learning by Statistical Leverage Sampling

    Authors: Cem Orhan, Öznur Taştan

    Abstract: Active learning aims to obtain a classifier of high accuracy by using fewer label requests in comparison to passive learning by selecting effective queries. Many active learning methods have been developed in the past two decades, which sample queries based on informativeness or representativeness of unlabeled data points. In this work, we explore a novel querying criterion based on statistical le…

    Submitted 15 July, 2015; originally announced July 2015.

    Comments: 4 pages, presented as contributed talk in ICML 2015 Active Learning Workshop