Showing 1–9 of 9 results for author: Ozkirimli, E

  1. arXiv:2209.00981  [pdf, other]

    cs.LG cs.CL q-bio.BM q-bio.QM stat.ML

    Exploiting Pretrained Biochemical Language Models for Targeted Drug Design

    Authors: Gökçe Uludoğan, Elif Ozkirimli, Kutlu O. Ulgen, Nilgün Karalı, Arzucan Özgür

    Abstract: Motivation: The development of novel compounds targeting proteins of interest is one of the most important tasks in the pharmaceutical industry. Deep generative models have been applied to targeted molecular design and have shown promising results. Recently, target-specific molecule generation has been viewed as a translation between the protein language and the chemical language. However, such a…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: 12 pages, to appear in Bioinformatics

  2. arXiv:2111.06664  [pdf]

    cs.CL cs.LG

    Extraction of Medication Names from Twitter Using Augmentation and an Ensemble of Language Models

    Authors: Igor Kulev, Berkay Köprü, Raul Rodriguez-Esteban, Diego Saldana, Yi Huang, Alessandro La Torraca, Elif Ozkirimli

    Abstract: The BioCreative VII Track 3 challenge focused on the identification of medication names in Twitter user timelines. For our submission to this challenge, we expanded the available training data by using several data augmentation techniques. The augmented data was then used to fine-tune an ensemble of language models that had been pre-trained on general-domain Twitter content. The proposed approach…

    Submitted 12 November, 2021; originally announced November 2021.

    Comments: Proceedings of the BioCreative VII Challenge Evaluation Workshop

  3. arXiv:2109.04712  [pdf, other]

    cs.CL

    Balancing Methods for Multi-label Text Classification with Long-Tailed Class Distribution

    Authors: Yi Huang, Buse Giledereli, Abdullatif Köksal, Arzucan Özgür, Elif Ozkirimli

    Abstract: Multi-label text classification is a challenging task because it requires capturing label dependencies. It becomes even more challenging when class distribution is long-tailed. Resampling and re-weighting are common approaches used for addressing the class imbalance problem, however, they are not effective when there is label dependency besides class imbalance because they result in oversampling o…

    Submitted 15 October, 2021; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021

  4. arXiv:2107.05556  [pdf, other]

    q-bio.QM cs.LG

    DebiasedDTA: A Framework for Improving the Generalizability of Drug-Target Affinity Prediction Models

    Authors: Rıza Özçelik, Alperen Bağ, Berk Atıl, Melih Barsbey, Arzucan Özgür, Elif Özkırımlı

    Abstract: Computational models that accurately predict the binding affinity of an input protein-chemical pair can accelerate drug discovery studies. These models are trained on available protein-chemical interaction datasets, which may contain dataset biases that may lead the model to learn dataset-specific patterns, instead of generalizable relationships. As a result, the prediction performance of models d…

    Submitted 8 January, 2023; v1 submitted 4 July, 2021; originally announced July 2021.

  5. arXiv:2009.02526  [pdf, other]

    cs.IR cs.LG q-bio.MN

    Vapur: A Search Engine to Find Related Protein-Compound Pairs in COVID-19 Literature

    Authors: Abdullatif Köksal, Hilal Dönmez, Rıza Özçelik, Elif Ozkirimli, Arzucan Özgür

    Abstract: Coronavirus Disease of 2019 (COVID-19) created dire consequences globally and triggered an intense scientific effort from different domains. The resulting publications created a huge text collection in which finding the studies related to a biomolecule of interest is challenging for general purpose search engines because the publications are rich in domain specific terminology. Here, we present Va…

    Submitted 13 October, 2020; v1 submitted 5 September, 2020; originally announced September 2020.

    Comments: EMNLP 2020 - COVID-19 Workshop

  6. arXiv:2002.06053  [pdf, other]

    q-bio.BM cs.CL cs.LG stat.ML

    Exploring Chemical Space using Natural Language Processing Methodologies for Drug Discovery

    Authors: Hakime Öztürk, Arzucan Özgür, Philippe Schwaller, Teodoro Laino, Elif Ozkirimli

    Abstract: Text-based representations of chemicals and proteins can be thought of as unstructured languages codified by humans to describe domain-specific knowledge. Advances in natural language processing (NLP) methodologies in the processing of spoken languages accelerated the application of NLP to elucidate hidden knowledge in textual representations of these biochemical entities and then use it to constr…

    Submitted 10 February, 2020; originally announced February 2020.

  7. arXiv:1902.04166  [pdf, ps, other]

    q-bio.QM cs.LG stat.ML

    WideDTA: prediction of drug-target binding affinity

    Authors: Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür

    Abstract: Motivation: Prediction of the interaction affinity between proteins and compounds is a major challenge in the drug discovery process. WideDTA is a deep-learning based prediction model that employs chemical and biological textual sequence information to predict binding affinity. Results: WideDTA uses four text-based information sources, namely the protein sequence, ligand SMILES, protein domains…

    Submitted 4 February, 2019; originally announced February 2019.

  8. arXiv:1811.00761  [pdf, other]

    cs.LG q-bio.QM stat.ML

    ChemBoost: A chemical language based approach for protein-ligand binding affinity prediction

    Authors: Rıza Özçelik, Hakime Öztürk, Arzucan Özgür, Elif Ozkirimli

    Abstract: Identification of high affinity drug-target interactions is a major research question in drug discovery. Proteins are generally represented by their structures or sequences. However, structures are available only for a small subset of biomolecules and sequence similarity is not always correlated with functional similarity. We propose ChemBoost, a chemical language based approach for affinity predi…

    Submitted 18 December, 2020; v1 submitted 2 November, 2018; originally announced November 2018.

    Comments: Molecular Informatics

  9. DeepDTA: Deep Drug-Target Binding Affinity Prediction

    Authors: Hakime Öztürk, Elif Ozkirimli, Arzucan Özgür

    Abstract: The identification of novel drug-target (DT) interactions is a substantial part of the drug discovery process. Most of the computational methods that have been proposed to predict DT interactions have focused on binary classification, where the goal is to determine whether a DT pair interacts or not. However, protein-ligand interactions assume a continuum of binding strength values, also called bi…

    Submitted 5 June, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

    Comments: extended version