Showing 1–24 of 24 results for author: Yuret, D

Searching in archive cs.
  1. arXiv:2407.02486  [pdf, other]

    cs.CL cs.AI cs.LG

    Neurocache: Efficient Vector Retrieval for Long-range Language Modeling

    Authors: Ali Safaya, Deniz Yuret

    Abstract: This paper introduces Neurocache, an approach to extend the effective context size of large language models (LLMs) using an external vector cache to store its past states. Like recent vector retrieval approaches, Neurocache uses an efficient k-nearest-neighbor (kNN) algorithm to retrieve relevant past states and incorporate them into the attention process. Neurocache improves upon previous methods…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Long paper, published at the main conference NAACL'24

  2. arXiv:2405.04685  [pdf, other]

    cs.CL cs.AI cs.LG

    Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking

    Authors: Emre Can Acikgoz, Mete Erdogan, Deniz Yuret

    Abstract: Large Language Models (LLMs) are becoming crucial across various fields, emphasizing the urgency for high-quality models in underrepresented languages. This study explores the unique challenges faced by low-resource languages, such as data scarcity, model selection, evaluation, and computational limitations, with a special focus on Turkish. We conduct an in-depth analysis to evaluate the impact of…

    Submitted 7 May, 2024; originally announced May 2024.

  3. arXiv:2404.12013  [pdf, other]

    cs.CL

    Sequential Compositional Generalization in Multimodal Models

    Authors: Semih Yagcioglu, Osman Batur İnce, Aykut Erdem, Erkut Erdem, Desmond Elliott, Deniz Yuret

    Abstract: The rise of large-scale multimodal models has paved the pathway for groundbreaking advances in generative modeling and reasoning, unlocking transformative applications in a variety of complex tasks. However, a pressing question that remains is their genuine capability for stronger forms of generalization, which has been largely underexplored in the multimodal setting. Our study aims to address thi…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: Accepted to the main conference of NAACL (2024) as a long paper

  4. arXiv:2308.09096  [pdf, other]

    cs.CV cs.AI cs.IR

    Identity-Aware Semi-Supervised Learning for Comic Character Re-Identification

    Authors: Gürkan Soykan, Deniz Yuret, Tevfik Metin Sezgin

    Abstract: Character re-identification, recognizing characters consistently across different panels in comics, presents significant challenges due to limited annotated data and complex variations in character appearances. To tackle this issue, we introduce a robust semi-supervised framework that combines metric learning with a novel 'Identity-Aware' self-supervision method by contrastive learning of face and…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: 18 pages, 9 Figures

  5. arXiv:2307.08397  [pdf, other]

    cs.CV

    CLIP-Guided StyleGAN Inversion for Text-Driven Real Image Editing

    Authors: Ahmet Canberk Baykal, Abdul Basit Anees, Duygu Ceylan, Erkut Erdem, Aykut Erdem, Deniz Yuret

    Abstract: Researchers have recently begun exploring the use of StyleGAN-based models for real image editing. One particularly interesting application is using natural language descriptions to guide the editing process. Existing approaches for editing images using language either resort to instance-level latent code optimization or map predefined text prompts to some editing directions in the latent space. H…

    Submitted 18 July, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted for publication in ACM Transactions on Graphics

  6. arXiv:2306.03521  [pdf, other]

    cs.LG cond-mat.stat-mech

    Machine learning in and out of equilibrium

    Authors: Shishir Adhikari, Alkan Kabakçıoğlu, Alexander Strang, Deniz Yuret, Michael Hinczewski

    Abstract: The algorithms used to train neural networks, like stochastic gradient descent (SGD), have close parallels to natural processes that navigate a high-dimensional parameter space -- for example protein folding or evolution. Our study uses a Fokker-Planck approach, adapted from statistical physics, to explore these parallels in a single, unified framework. We focus in particular on the stationary sta…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: 24 pages, 6 figures

  7. arXiv:2212.14674  [pdf, other]

    cs.CL cs.AI

    A Comprehensive Gold Standard and Benchmark for Comics Text Detection and Recognition

    Authors: Gürkan Soykan, Deniz Yuret, Tevfik Metin Sezgin

    Abstract: This study focuses on improving the optical character recognition (OCR) data for panels in the COMICS dataset, the largest dataset containing text and images from comic books. To do this, we developed a pipeline for OCR processing and labeling of comic books and created the first text detection and recognition datasets for western comics, called "COMICS Text+: Detection" and "COMICS Text+: Recogni…

    Submitted 27 December, 2022; originally announced December 2022.

    Comments: 33 pages, 10 figures, 16 tables

  8. arXiv:2211.10641  [pdf, other]

    cs.CV cs.LG

    Domain-Adaptive Self-Supervised Pre-Training for Face & Body Detection in Drawings

    Authors: Barış Batuhan Topal, Deniz Yuret, Tevfik Metin Sezgin

    Abstract: Drawings are powerful means of pictorial abstraction and communication. Understanding diverse forms of drawings, including digital arts, cartoons, and comics, has been a major problem of interest for the computer vision and computer graphics communities. Although there are large amounts of digitized drawings from comic books and cartoons, they contain vast stylistic variations, which necessitate e…

    Submitted 25 April, 2023; v1 submitted 19 November, 2022; originally announced November 2022.

    Comments: Preprint, 8 pages of the paper itself + 7 pages of Supplementary Material

  9. arXiv:2211.01736  [pdf, other]

    cs.CL cs.AI cs.LG

    Transformers on Multilingual Clause-Level Morphology

    Authors: Emre Can Acikgoz, Tilek Chubakov, Müge Kural, Gözde Gül Şahin, Deniz Yuret

    Abstract: This paper describes our winning systems in MRL: The 1st Shared Task on Multilingual Clause-level Morphology (EMNLP 2022 Workshop) designed by KUIS AI NLP team. We present our work for all three parts of the shared task: inflection, reinflection, and analysis. We mainly explore transformers with two approaches: (i) training models from scratch in combination with data augmentation, and (ii) transf…

    Submitted 13 November, 2022; v1 submitted 3 November, 2022; originally announced November 2022.

  10. arXiv:2209.07999  [pdf, other]

    cs.LG cs.AI cs.CV cs.IT eess.IV

    Self-Supervised Learning with an Information Maximization Criterion

    Authors: Serdar Ozsoy, Shadi Hamdan, Sercan Ö. Arik, Deniz Yuret, Alper T. Erdogan

    Abstract: Self-supervised learning allows AI systems to learn effective representations from large amounts of data using tasks that do not require costly labeling. Mode collapse, i.e., the model producing identical representations for all inputs, is a central problem to many self-supervised learning approaches, making self-supervised tasks, such as matching distorted variants of the inputs, ineffective. In…

    Submitted 16 September, 2022; originally announced September 2022.

    ACM Class: I.2; I.4; I.5

  11. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  12. arXiv:2203.01215  [pdf, other]

    cs.CL

    Mukayese: Turkish NLP Strikes Back

    Authors: Ali Safaya, Emirhan Kurtuluş, Arda Göktoğan, Deniz Yuret

    Abstract: Having sufficient resources for language X lifts it from the under-resourced languages class, but not necessarily from the under-researched class. In this paper, we address the problem of the absence of organized benchmarks in the Turkish language. We demonstrate that languages such as Turkish are left behind the state-of-the-art in NLP applications. As a solution, we present Mukayese, a set of NL…

    Submitted 16 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: Accepted at Findings of ACL 2022 (Camera Ready)

  13. arXiv:2012.04293  [pdf, other]

    cs.AI cs.CL cs.CV

    CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions

    Authors: Tayfun Ates, M. Samil Atesoglu, Cagatay Yigit, Ilker Kesen, Mert Kobas, Erkut Erdem, Aykut Erdem, Tilbe Goksun, Deniz Yuret

    Abstract: Humans are able to perceive, understand and reason about causal events. Developing models with similar physical and causal understanding capabilities is a long-standing goal of artificial intelligence. As a step towards this direction, we introduce CRAFT, a new video question answering dataset that requires causal reasoning about physical forces and object interactions. It contains 58K video and q…

    Submitted 1 March, 2022; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: Accepted to Findings of ACL 2022

  14. arXiv:2008.00351  [pdf, ps, other]

    cs.CL

    Cross-context News Corpus for Protest Events related Knowledge Base Construction

    Authors: Ali Hürriyetoğlu, Erdem Yörük, Deniz Yüret, Osman Mutlu, Çağrı Yoltar, Fırat Duruşan, Burak Gürel

    Abstract: We describe a gold standard corpus of protest events that comprise of various local and international sources from various countries in English. The corpus contains document, sentence, and token level annotations. This corpus facilitates creating machine learning models that automatically classify news articles and extract protest event-related information, constructing knowledge bases which enabl…

    Submitted 1 August, 2020; originally announced August 2020.

    Comments: Presented at Automated Knowledge Base Construction (AKBC 2020) conference. See: https://www.akbc.ws/2020/papers/7NZkNhLCjp

  15. arXiv:2008.00345  [pdf, other]

    cs.CL

    Overview of CLEF 2019 Lab ProtestNews: Extracting Protests from News in a Cross-context Setting

    Authors: Ali Hürriyetoğlu, Erdem Yörük, Deniz Yüret, Çağrı Yoltar, Burak Gürel, Fırat Duruşan, Osman Mutlu, Arda Akdemir

    Abstract: We present an overview of the CLEF-2019 Lab ProtestNews on Extracting Protests from News in the context of generalizable natural language processing. The lab consists of document, sentence, and token level information classification and extraction tasks that were referred as task 1, task 2, and task 3 respectively in the scope of this lab. The tasks required the participants to identify protest re…

    Submitted 1 August, 2020; originally announced August 2020.

    Comments: Conference and Labs of the Evaluation Forum (CLEF 2019), Overview of the Protest News analysis

  16. arXiv:2007.13184  [pdf, other]

    cs.CL

    KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media

    Authors: Ali Safaya, Moutasem Abdullatif, Deniz Yuret

    Abstract: In this paper, we describe our approach to utilize pre-trained BERT models with Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is a part of the SemEval 2020. We show that combining CNN with BERT is better than using BERT on its own, and we emphasize the importance of utilizing pre-trained language models for d…

    Submitted 26 July, 2020; originally announced July 2020.

    Comments: to be published in the proceedings of the 14th International Workshop on Semantic Evaluation (SemEval2020), Association for Computational Linguistics (ACL)

    ACM Class: I.2.7

  17. arXiv:2003.12739  [pdf, other]

    cs.CV cs.CL cs.LG

    Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters

    Authors: İlker Kesen, Ozan Arkan Can, Erkut Erdem, Aykut Erdem, Deniz Yuret

    Abstract: How to best integrate linguistic and perceptual processing in multi-modal tasks that involve language and vision is an important open problem. In this work, we argue that the common practice of using language in a top-down manner, to direct visual attention over high-level visual features, may not be optimal. We hypothesize that the use of language to also condition the bottom-up processing from p…

    Submitted 23 June, 2022; v1 submitted 28 March, 2020; originally announced March 2020.

    Comments: 13 pages, 6 figures, 6 tables. Appeared in MULA Workshop at CVPR 2022

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 4610-4620

  18. arXiv:1904.13324  [pdf, other]

    cs.AI cs.RO

    Learning from Implicit Information in Natural Language Instructions for Robotic Manipulations

    Authors: Ozan Arkan Can, Pedro Zuidberg Dos Martires, Andreas Persson, Julian Gaal, Amy Loutfi, Luc De Raedt, Deniz Yuret, Alessandro Saffiotti

    Abstract: Human-robot interaction often occurs in the form of instructions given from a human to a robot. For a robot to successfully follow instructions, a common representation of the world and objects in it should be shared between humans and the robot so that the instructions can be grounded. Achieving this representation can be done via learning, where both the world representation and the language gro…

    Submitted 30 April, 2019; originally announced April 2019.

  19. arXiv:1805.07952  [pdf, other]

    cs.CL

    A new dataset and model for learning to understand navigational instructions

    Authors: Ozan Arkan Can, Deniz Yuret

    Abstract: In this paper, we present a state-of-the-art model and introduce a new dataset for grounded language learning. Our goal is to develop a model that can learn to follow new instructions given prior instruction-perception-action examples. We based our work on the SAIL dataset which consists of navigational instructions and actions in a maze-like environment. The new model we propose achieves the best…

    Submitted 21 May, 2018; originally announced May 2018.

  20. Morphological analysis using a sequence decoder

    Authors: Ekin Akyürek, Erenay Dayanık, Deniz Yuret

    Abstract: We introduce Morse, a recurrent encoder-decoder model that produces morphological analyses of each word in a sentence. The encoder turns the relevant information about the word and its context into a fixed size vector representation and the decoder generates the sequence of characters for the lemma followed by a sequence of individual morphological features. We show that generating morphological f…

    Submitted 24 September, 2019; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: Final TACL version

    Journal ref: Transactions Of The Association For Computational Linguistics, 7, 567-579 (2019)

  21. arXiv:1604.02201  [pdf, other]

    cs.CL

    Transfer Learning for Low-Resource Neural Machine Translation

    Authors: Barret Zoph, Deniz Yuret, Jonathan May, Kevin Knight

    Abstract: The encoder-decoder framework for neural machine translation (NMT) has been shown effective in large data scenarios, but is much less effective for low-resource languages. We present a transfer learning method that significantly improves Bleu scores across a range of low-resource languages. Our key idea is to first train a high-resource language pair (the parent model), then transfer some of the l…

    Submitted 7 April, 2016; originally announced April 2016.

    Comments: 8 pages

  22. arXiv:1407.6853  [pdf, ps, other]

    cs.CL

    Substitute Based SCODE Word Embeddings in Supervised NLP Tasks

    Authors: Volkan Cirik, Deniz Yuret

    Abstract: We analyze a word embedding method in supervised tasks. It maps words on a sphere such that words co-occurring in similar contexts lie closely. The similarity of contexts is measured by the distribution of substitutes that can fill them. We compared word embeddings, including more recent representations, in Named Entity Recognition (NER), Chunking, and Dependency Parsing. We examine our framework…

    Submitted 25 July, 2014; originally announced July 2014.

    Comments: 11 pages

  23. FASTSUBS: An Efficient and Exact Procedure for Finding the Most Likely Lexical Substitutes Based on an N-gram Language Model

    Authors: Deniz Yuret

    Abstract: Lexical substitutes have found use in areas such as paraphrasing, text simplification, machine translation, word sense disambiguation, and part of speech induction. However the computational complexity of accurately identifying the most likely substitutes for a word has made large scale experiments difficult. In this paper I introduce a new search algorithm, FASTSUBS, that is guaranteed to find th…

    Submitted 1 September, 2012; v1 submitted 24 May, 2012; originally announced May 2012.

    Comments: 4 pages, 1 figure, to appear in IEEE Signal Processing Letters

  24. Discovery of Linguistic Relations Using Lexical Attraction

    Authors: Deniz Yuret

    Abstract: This work has been motivated by two long term goals: to understand how humans learn language and to build programs that can understand language. Using a representation that makes the relevant features explicit is a prerequisite for successful learning and understanding. Therefore, I chose to represent relations between individual words explicitly in my model. Lexical attraction is defined as the…

    Submitted 27 May, 1998; originally announced May 1998.

    Comments: dissertation, 56 pages