Showing 1–5 of 5 results for author: Van Dijck, G

  1. arXiv:2402.15059

    cs.CL cs.IR

    ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval

    Authors: Antoine Louis, Vageesh Saxena, Gijs van Dijck, Gerasimos Spanakis

    Abstract: State-of-the-art neural retrievers predominantly focus on high-resource languages like English, which impedes their adoption in retrieval scenarios involving other languages. Current approaches circumvent the lack of high-quality labeled data in non-English languages by leveraging multilingual pretrained language models capable of cross-lingual transfer. However, these models require substantial t…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Under review. Code is available at https://github.com/ant-louis/xm-retrievers

  2. arXiv:2310.05484

    cs.CL cs.CY cs.LG

    IDTraffickers: An Authorship Attribution Dataset to link and connect Potential Human-Trafficking Operations on Text Escort Advertisements

    Authors: Vageesh Saxena, Benjamin Bashpole, Gijs Van Dijck, Gerasimos Spanakis

    Abstract: Human trafficking (HT) is a pervasive global issue affecting vulnerable individuals, violating their fundamental human rights. Investigations reveal that a significant number of HT cases are associated with online advertisements (ads), particularly in escort markets. Consequently, identifying and connecting HT vendors has become increasingly challenging for Law Enforcement Agencies (LEAs). To addr…

    Submitted 9 October, 2023; originally announced October 2023.

  3. arXiv:2309.17050

    cs.CL

    Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models

    Authors: Antoine Louis, Gijs van Dijck, Gerasimos Spanakis

    Abstract: Many individuals are likely to face a legal dispute at some point in their lives, but their lack of understanding of how to navigate these complex issues often renders them vulnerable. The advancement of natural language processing opens new avenues for bridging this legal literacy gap through the development of automated legal aid systems. However, existing legal question answering (LQA) approach…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: Under review. Code is available at https://github.com/maastrichtlawtech/lleqa

  4. arXiv:2305.02763

    cs.CY cs.CL cs.CR cs.LG

    VendorLink: An NLP approach for Identifying & Linking Vendor Migrants & Potential Aliases on Darknet Markets

    Authors: Vageesh Saxena, Nils Rethmeier, Gijs Van Dijck, Gerasimos Spanakis

    Abstract: The anonymity on the Darknet allows vendors to stay undetected by using multiple vendor aliases or frequently migrating between markets. Consequently, illegal markets and their connections are challenging to uncover on the Darknet. To identify relationships between illegal markets and their vendors, we propose VendorLink, an NLP-based approach that examines writing patterns to verify, identify, an…

    Submitted 4 May, 2023; originally announced May 2023.

  5. arXiv:2301.12847

    cs.IR cs.CL

    Finding the Law: Enhancing Statutory Article Retrieval via Graph Neural Networks

    Authors: Antoine Louis, Gijs van Dijck, Gerasimos Spanakis

    Abstract: Statutory article retrieval (SAR), the task of retrieving statute law articles relevant to a legal question, is a promising application of legal text processing. In particular, high-quality SAR systems can improve the work efficiency of legal professionals and provide basic legal assistance to citizens in need at no cost. Unlike traditional ad-hoc information retrieval, where each document is cons…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: EACL 2023. Code is available at https://github.com/maastrichtlawtech/gdsr