Showing 1–16 of 16 results for author: Can, F

  1. arXiv:2310.00665  [pdf, other]

    cs.LG

    Balancing Efficiency vs. Effectiveness and Providing Missing Label Robustness in Multi-Label Stream Classification

    Authors: Sepehr Bakhshi, Fazli Can

    Abstract: Available works addressing multi-label classification in a data stream environment focus on proposing accurate models; however, these models often exhibit inefficiency and cannot balance effectiveness and efficiency. In this work, we propose a neural network-based approach that tackles this issue and is suitable for high-dimensional multi-label classification. Our model uses a selective concept dr…

    Submitted 1 October, 2023; originally announced October 2023.

  2. arXiv:2308.14175  [pdf, other]

    cs.LG

    Leveraging Linear Independence of Component Classifiers: Optimizing Size and Prediction Accuracy for Online Ensembles

    Authors: Enes Bektas, Fazli Can

    Abstract: Ensembles, which employ a set of classifiers to enhance classification accuracy collectively, are crucial in the era of big data. However, although there is general agreement that ensemble size and prediction accuracy are related, the exact nature of this relationship is still unknown. We introduce a novel perspective, rooted in the linear independence of classifier's votes, to analyz…

    Submitted 27 August, 2023; originally announced August 2023.

  3. arXiv:2308.10807  [pdf, ps, other]

    cs.LG cs.AI cs.IR

    DynED: Dynamic Ensemble Diversification in Data Stream Classification

    Authors: Soheil Abadifard, Sepehr Bakhshi, Sanaz Gheibuni, Fazli Can

    Abstract: Ensemble methods are commonly used in classification due to their remarkable performance. Achieving high accuracy in a data stream environment is a challenging task considering disruptive changes in the data distribution, also known as concept drift. A greater diversity of ensemble components is known to enhance prediction accuracy in such settings. Despite the diversity of components within an en…

    Submitted 6 September, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM '23), October 21–25, 2023, Birmingham, United Kingdom

  4. arXiv:2210.12383  [pdf, other]

    cs.CL

    Stance Detection and Open Research Avenues

    Authors: Dilek Küçük, Fazli Can

    Abstract: This tutorial aims to cover the state-of-the-art on stance detection and address open research avenues for interested researchers and practitioners. Stance detection is a recent research topic where the stance towards a given target or target set is determined based on the given content and there are significant application opportunities of stance detection in various domains. The tutorial compris…

    Submitted 22 October, 2022; originally announced October 2022.

  5. arXiv:2210.05401  [pdf, other]

    cs.SI cs.CL cs.IR

    MiDe22: An Annotated Multi-Event Tweet Dataset for Misinformation Detection

    Authors: Cagri Toraman, Oguzhan Ozcelik, Furkan Şahinuç, Fazli Can

    Abstract: The rapid dissemination of misinformation through online social networks poses a pressing issue with harmful consequences jeopardizing human health, public safety, democracy, and the economy; therefore, urgent action is required to address this problem. In this study, we construct a new human-annotated dataset, called MiDe22, having 5,284 English and 5,064 Turkish tweets with their misinformation…

    Submitted 11 July, 2024; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Published at LREC-COLING 2024

  6. arXiv:2202.00070  [pdf, other]

    cs.LG cs.IR

    Implicit Concept Drift Detection for Multi-label Data Streams

    Authors: Ege Berkay Gulcan, Fazli Can

    Abstract: Many real-world applications adopt multi-label data streams as the need for algorithms to deal with rapidly changing data increases. Changes in data distribution, also known as concept drift, cause the existing classification models to rapidly lose their effectiveness. To assist the classifiers, we propose a novel algorithm called Label Dependency Drift Detector (LD3), an implicit (unsupervised) c…

    Submitted 31 January, 2022; originally announced February 2022.

    Comments: 18 pages, 7 figures, submitted to Artificial Intelligence Review

  7. arXiv:2110.03540  [pdf, other]

    cs.LG cs.AI

    A Broad Ensemble Learning System for Drifting Stream Classification

    Authors: Sepehr Bakhshi, Pouya Ghahramanian, Hamed Bonab, Fazli Can

    Abstract: In a data stream environment, classification models must handle concept drift efficiently and effectively. Ensemble methods are widely used for this purpose; however, the ones available in the literature either use a large data chunk to update the model or learn the data one by one. In the former, the model may miss the changes in the data distribution, and in the latter, the model may suffer from…

    Submitted 14 March, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: Submitted to IEEE Access

  8. arXiv:2109.07611  [pdf, other]

    cs.LG cs.IR

    On-the-Fly Ensemble Pruning in Evolving Data Streams

    Authors: Sanem Elbasi, Alican Büyükçakır, Hamed Bonab, Fazli Can

    Abstract: Ensemble pruning is the process of selecting a subset of component classifiers from an ensemble which performs at least as well as the original ensemble while reducing storage and computational costs. Ensemble pruning in data streams is a largely unexplored area of research. It requires analysis of ensemble components as they are running on the stream, and differentiation of useful classifiers from redu…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: 5 pages, 2 figures

  9. arXiv:2001.11639  [pdf, other]

    cs.CV

    ParkingSticker: A Real-World Object Detection Dataset

    Authors: Caroline Potts, Ethem F. Can, Aysu Ezen-Can, Xiangqian Hu

    Abstract: We present a new and challenging object detection dataset, ParkingSticker, which mimics the type of data available in industry problems more closely than popular existing datasets like PASCAL VOC. ParkingSticker contains 1,871 images that come from a security camera's video footage. The objective is to identify parking stickers on cars approaching a gate that the security camera faces. Bounding bo…

    Submitted 12 February, 2020; v1 submitted 30 January, 2020; originally announced January 2020.

    Comments: 8 pages, 8 figures; Updated authors

  10. arXiv:2001.05857  [pdf, other]

    cs.CV cs.LG

    The Effect of Data Ordering in Image Classification

    Authors: Ethem F. Can, Aysu Ezen-Can

    Abstract: The success stories from deep learning models increase every day spanning different tasks from image classification to natural language understanding. With the increasing popularity of these models, scientists spend more and more time finding the optimal parameters and best model architectures for their tasks. In this paper, we focus on the ingredient that feeds these machines: the data. We hypoth…

    Submitted 8 January, 2020; originally announced January 2020.

    Journal ref: Under consideration at Pattern Recognition Letters 2020

  11. arXiv:1901.04787  [pdf, ps, other]

    cs.CL

    A Tweet Dataset Annotated for Named Entity Recognition and Stance Detection

    Authors: Dilek Küçük, Fazli Can

    Abstract: Annotated datasets in different domains are critical for many supervised learning-based solutions to related problems and for the evaluation of the proposed solutions. Topics in natural language processing (NLP) similarly require annotated datasets to be used for such purposes. In this paper, we target two NLP problems, named entity recognition and stance detection, and present the details of a…

    Submitted 16 January, 2019; v1 submitted 15 January, 2019; originally announced January 2019.

    Comments: 4 pages; resource URLs are made properly accessible (by clicking them)

  12. arXiv:1809.09994  [pdf, other]

    cs.LG cs.IR stat.ML

    A Novel Online Stacked Ensemble for Multi-Label Stream Classification

    Authors: Alican Büyükçakır, Hamed Bonab, Fazli Can

    Abstract: As data streams become more prevalent, the necessity for online algorithms that mine this transient and dynamic data becomes clearer. Multi-label data stream classification is a supervised learning problem where each instance in the data stream is classified into one or more pre-defined sets of labels. Many methods have been proposed to tackle this problem, including but not limited to ensemble-ba…

    Submitted 26 September, 2018; originally announced September 2018.

    Comments: 10 pages, 4 figures. To appear in ACM CIKM 2018, in Torino, Italy

  13. arXiv:1806.04511  [pdf, other]

    cs.CL cs.IR

    Multilingual Sentiment Analysis: An RNN-Based Framework for Limited Data

    Authors: Ethem F. Can, Aysu Ezen-Can, Fazli Can

    Abstract: Sentiment analysis is a widely studied NLP task where the goal is to determine opinions, emotions, and evaluations of users towards a product, an entity or a service that they are reviewing. One of the biggest challenges for sentiment analysis is that it is highly language dependent. Word embeddings, sentiment lexicons, and even annotated data are language specific. Further, optimizing models for…

    Submitted 8 June, 2018; originally announced June 2018.

    Comments: ACM SIGIR 2018 Workshop on Learning from Limited or Noisy Data (LND4IR'18)

  14. arXiv:1803.08910  [pdf, ps, other]

    cs.CL

    Stance Detection on Tweets: An SVM-based Approach

    Authors: Dilek Küçük, Fazli Can

    Abstract: Stance detection is a subproblem of sentiment analysis where the stance of the author of a piece of natural language text for a particular target (either explicitly stated in the text or not) is explored. The stance output is usually given as Favor, Against, or Neither. In this paper, we target stance detection on sports-related tweets and present the performance results of our SVM-based stance…

    Submitted 23 March, 2018; originally announced March 2018.

    Comments: 13 pages

  15. arXiv:1709.02925  [pdf, other]

    cs.LG stat.ML

    Less Is More: A Comprehensive Framework for the Number of Components of Ensemble Classifiers

    Authors: Hamed Bonab, Fazli Can

    Abstract: The number of component classifiers chosen for an ensemble greatly impacts the prediction ability. In this paper, we use a geometric framework for a priori determining the ensemble size, which is applicable to most existing batch and online ensemble classifiers. There are only a limited number of studies on the ensemble size examining Majority Voting (MV) and Weighted Majority Voting (WMV). Alm…

    Submitted 29 September, 2018; v1 submitted 9 September, 2017; originally announced September 2017.

    Comments: This is an extended version of the work presented as a short paper at the Conference on Information and Knowledge Management (CIKM), 2016

  16. arXiv:1709.02800  [pdf, other]

    cs.LG

    GOOWE: Geometrically Optimum and Online-Weighted Ensemble Classifier for Evolving Data Streams

    Authors: Hamed R. Bonab, Fazli Can

    Abstract: Designing adaptive classifiers for an evolving data stream is a challenging task due to the data size and its dynamically changing nature. Combining individual classifiers in an online setting, the ensemble approach, is a well-known solution. It is possible that a subset of classifiers in the ensemble outperforms others in a time-varying fashion. However, optimum weight assignment for component cl…

    Submitted 7 September, 2017; originally announced September 2017.

    Comments: 33 Pages, Accepted for publication in The ACM Transactions on Knowledge Discovery from Data (TKDD) in August 2017