Showing 1–6 of 6 results for author: Jabbar, H

  1. arXiv:2404.10780  [pdf]

    cs.CR cs.CY

    Phishing Website Detection Using a Combined Model of ANN and LSTM

    Authors: Muhammad Shoaib Farooq, Hina Jabbar

    Abstract: In this digital era, our lives highly depend on the internet and worldwide technology. Wide usage of technology and platforms of communication makes our lives better and easier. But on the other side it carries out some security issues and cruel activities, phishing is one activity of these cruel activities. It is a type of cybercrime, which has the purpose of stealing the personal information of…

    Submitted 24 March, 2024; originally announced April 2024.

    Comments: Pages 9, Figures 5

  2. arXiv:2312.10188  [pdf, other]

    cs.LG

    WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data

    Authors: Maurice Weber, Carlo Siebenschuh, Rory Butler, Anton Alexandrov, Valdemar Thanner, Georgios Tsolakis, Haris Jabbar, Ian Foster, Bo Li, Rick Stevens, Ce Zhang

    Abstract: We introduce WordScape, a novel pipeline for the creation of cross-disciplinary, multilingual corpora comprising millions of pages with annotations for document layout detection. Relating visual and textual items on document pages has gained further significance with the advent of multimodal models. Various approaches proved effective for visual question answering or layout segmentation. However,…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023 Datasets and Benchmarks

  3. arXiv:2307.07262  [pdf, other]

    cs.CL

    MorphPiece : A Linguistic Tokenizer for Large Language Models

    Authors: Haris Jabbar

    Abstract: Tokenization is a critical part of modern NLP pipelines. However, contemporary tokenizers for Large Language Models are based on statistical analysis of text corpora, without much consideration to the linguistic features. I propose a linguistically motivated tokenization scheme, MorphPiece, which is based partly on morphological segmentation of the underlying text. A GPT-style causal language mode…

    Submitted 3 February, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: Manuscript under review. Patent pending

  4. arXiv:2204.12225  [pdf, other]

    cs.CL

    Flow-Adapter Architecture for Unsupervised Machine Translation

    Authors: Yihong Liu, Haris Jabbar, Hinrich Schütze

    Abstract: In this work, we propose a flow-adapter architecture for unsupervised NMT. It leverages normalizing flows to explicitly model the distributions of sentence-level latent representations, which are subsequently used in conjunction with the attention mechanism for the translation task. The primary novelties of our model are: (a) capturing language-specific sentence representations separately for each…

    Submitted 26 April, 2022; originally announced April 2022.

    Comments: ACL 2022

  5. arXiv:2203.11764  [pdf, other]

    cs.CL cs.AI

    Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments

    Authors: Antonis Maronikolakis, Axel Wisiorek, Leah Nann, Haris Jabbar, Sahana Udupa, Hinrich Schuetze

    Abstract: Building on current work on multilingual hate speech (e.g., Ousidhoum et al. (2019)) and hate speech reduction (e.g., Sap et al. (2020)), we present XTREMESPEECH, a new hate speech dataset containing 20,297 social media passages from Brazil, Germany, India and Kenya. The key novelty is that we directly involve the affected communities in collecting and annotating the data - as opposed to giving co…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022 Findings

  6. Disentangling magnetic hardening and molecular spin chain contributions to exchange bias in ferromagnet/molecule bilayers

    Authors: Samy Boukari, Hashim Jabbar, Filip Schleicher, Manuel Gruber, Jacek Arabski, Victor Da Costa, Guy Schmerber, Prashanth Rengasamy, Bertrand Vileno, Wolfgang Weber, Martin Bowen, Eric Beaurepaire

    Abstract: We performed SQUID and FMR magnetometry experiments to clarify the relationship between two reported magnetic exchange effects arising from interfacial spin-polarized charge transfer within ferromagnetic metal (FM)/molecule bilayers: the magnetic hardening effect, and spinterface-stabilized molecular spin chains. To disentangle these effects, both of which can affect the FM magnetization reversal,…

    Submitted 20 December, 2017; originally announced December 2017.

    Comments: None