
Showing 1–7 of 7 results for author: Özçelik, R

  1. arXiv:2401.03590  [pdf, other]

    cs.CL

    Building Efficient and Effective OpenQA Systems for Low-Resource Languages

    Authors: Emrah Budur, Rıza Özçelik, Dilara Soylu, Omar Khattab, Tunga Güngör, Christopher Potts

    Abstract: Question answering (QA) is the task of answering questions posed in natural language with free-form natural language answers extracted from a given passage. In the OpenQA variant, only a question text is given, and the system must retrieve relevant passages from an unstructured knowledge source and use them to provide answers, which is the case in the mainstream QA systems on the Web. QA systems c…

    Submitted 4 June, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  2. arXiv:2212.13295  [pdf, other]

    q-bio.BM cs.LG

    Structure-based drug discovery with deep learning

    Authors: Rıza Özçelik, Derek van Tilborg, José Jiménez-Luna, Francesca Grisoni

    Abstract: Artificial intelligence (AI) in the form of deep learning bears promise for drug discovery and chemical biology, e.g., to predict protein structure and molecular bioactivity, plan organic synthesis, and design molecules de novo. While most of the deep learning efforts in drug discovery have focused on ligand-based approaches, structure-based drug discovery has the potential t…

    Submitted 26 December, 2022; originally announced December 2022.

  3. arXiv:2210.14642  [pdf, other]

    q-bio.BM

    Exploring Data-Driven Chemical SMILES Tokenization Approaches to Identify Key Protein-Ligand Binding Moieties

    Authors: Asu Büşra Temizer, Gökçe Uludoğan, Rıza Özçelik, Taha Koulani, Elif Ozkirimli, Kutlu O. Ulgen, Nilgün Karalı, Arzucan Özgür

    Abstract: Machine learning models have found numerous successful applications in computational drug discovery. A large body of these models represents molecules as sequences since molecular sequences are easily available, simple, and informative. The sequence-based models often segment molecular sequences into pieces called chemical words (analogous to the words that make up sentences in human languages) an…

    Submitted 25 September, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: 16 pages, 11 figures, new computational analysis and extended case studies

  4. arXiv:2107.05556  [pdf, other]

    q-bio.QM cs.LG

    DebiasedDTA: A Framework for Improving the Generalizability of Drug-Target Affinity Prediction Models

    Authors: Rıza Özçelik, Alperen Bağ, Berk Atıl, Melih Barsbey, Arzucan Özgür, Elif Özkırımlı

    Abstract: Computational models that accurately predict the binding affinity of an input protein-chemical pair can accelerate drug discovery studies. These models are trained on available protein-chemical interaction datasets, which may contain dataset biases that may lead the model to learn dataset-specific patterns, instead of generalizable relationships. As a result, the prediction performance of models d…

    Submitted 8 January, 2023; v1 submitted 4 July, 2021; originally announced July 2021.

  5. arXiv:2009.02526  [pdf, other]

    cs.IR cs.LG q-bio.MN

    Vapur: A Search Engine to Find Related Protein-Compound Pairs in COVID-19 Literature

    Authors: Abdullatif Köksal, Hilal Dönmez, Rıza Özçelik, Elif Ozkirimli, Arzucan Özgür

    Abstract: Coronavirus Disease of 2019 (COVID-19) created dire consequences globally and triggered an intense scientific effort from different domains. The resulting publications created a huge text collection in which finding the studies related to a biomolecule of interest is challenging for general purpose search engines because the publications are rich in domain specific terminology. Here, we present Va…

    Submitted 13 October, 2020; v1 submitted 5 September, 2020; originally announced September 2020.

    Comments: EMNLP 2020 - COVID-19 Workshop

  6. arXiv:2004.14963  [pdf, other]

    cs.CL

    Data and Representation for Turkish Natural Language Inference

    Authors: Emrah Budur, Rıza Özçelik, Tunga Güngör, Christopher Potts

    Abstract: Large annotated datasets in NLP are overwhelmingly in English. This is an obstacle to progress in other languages. Unfortunately, obtaining new annotated resources for each task in each language would be prohibitively expensive. At the same time, commercial machine translation systems are now robust. Can we leverage these systems to translate English-language datasets automatically? In this paper,…

    Submitted 20 October, 2020; v1 submitted 30 April, 2020; originally announced April 2020.

    Comments: Accepted to EMNLP 2020

  7. arXiv:1811.00761  [pdf, other]

    cs.LG q-bio.QM stat.ML

    ChemBoost: A chemical language based approach for protein-ligand binding affinity prediction

    Authors: Rıza Özçelik, Hakime Öztürk, Arzucan Özgür, Elif Ozkirimli

    Abstract: Identification of high affinity drug-target interactions is a major research question in drug discovery. Proteins are generally represented by their structures or sequences. However, structures are available only for a small subset of biomolecules and sequence similarity is not always correlated with functional similarity. We propose ChemBoost, a chemical language based approach for affinity predi…

    Submitted 18 December, 2020; v1 submitted 2 November, 2018; originally announced November 2018.

    Comments: Molecular Informatics