Showing 1–22 of 22 results for author: Araujo, H

Searching in archive cs.
  1. arXiv:2407.02099  [pdf, other]

    cs.CL

    Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior

    Authors: Pedro Henrique Luz de Araujo, Benjamin Roth

    Abstract: One way to personalize and steer generations from large language models (LLM) is to assign a persona: a role that describes how the user expects the LLM to behave (e.g., a helpful assistant, a teacher, a woman). This paper investigates how personas affect diverse aspects of model behavior. We assign to seven LLMs 162 personas from 12 categories spanning variables like gender, sexual orientation, a…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 20 pages, 12 figures

  2. arXiv:2406.18589  [pdf, other]

    cs.CV cs.LG

    Text-Guided Alternative Image Clustering

    Authors: Andreas Stephan, Lukas Miklautz, Collin Leiber, Pedro Henrique Luz de Araujo, Dominik Répás, Claudia Plant, Benjamin Roth

    Abstract: Traditional image clustering techniques only find a single grouping within visual data. In particular, they do not provide a possibility to explicitly define multiple types of clustering. This work explores the potential of large vision-language models to facilitate alternative image clustering. We propose Text-Guided Alternative Image Consensus Clustering (TGAICC), a novel approach that leverages…

    Submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2405.03004  [pdf, other]

    cs.CL cs.LG

    Exploring prompts to elicit memorization in masked language model-based named entity recognition

    Authors: Yuxi Xia, Anastasiia Sedova, Pedro Henrique Luz de Araujo, Vasiliki Kougia, Lisa Nußbaumer, Benjamin Roth

    Abstract: Training data memorization in language models impacts model capability (generalization) and safety (privacy risk). This paper focuses on analyzing prompts' impact on detecting the memorization of 6 masked language model-based named entity recognition models. Specifically, we employ a diverse set of 400 automatically generated prompts, and a pairwise dataset where each pair consists of one person's…

    Submitted 5 May, 2024; originally announced May 2024.

  4. arXiv:2403.08425  [pdf, other]

    cs.AI

    Specification Overfitting in Artificial Intelligence

    Authors: Benjamin Roth, Pedro Henrique Luz de Araujo, Yuxi Xia, Saskia Kaltenbrunner, Christoph Korab

    Abstract: Machine learning (ML) and artificial intelligence (AI) approaches are often criticized for their inherent bias and for their lack of control, accountability, and transparency. Consequently, regulatory bodies struggle with containing this technology's potential negative side effects. High-level requirements such as fairness and robustness need to be formalized into concrete specification metrics, i…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 40 pages, 2 figures

  5. arXiv:2402.07586  [pdf, other]

    cs.LG

    Unveiling Group-Specific Distributed Concept Drift: A Fairness Imperative in Federated Learning

    Authors: Teresa Salazar, João Gama, Helder Araújo, Pedro Henriques Abreu

    Abstract: In the evolving field of machine learning, ensuring fairness has become a critical concern, prompting the development of algorithms designed to mitigate discriminatory outcomes in decision-making processes. However, achieving fairness in the presence of group-specific concept drift remains an unexplored frontier, and our research represents pioneering efforts in this regard. Group-specific concept…

    Submitted 13 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    MSC Class: 68T01; ACM Class: I.2.m

  6. arXiv:2311.08481  [pdf, other]

    cs.CL

    Functionality learning through specification instructions

    Authors: Pedro Henrique Luz de Araujo, Benjamin Roth

    Abstract: Test suites assess natural language processing models' performance on specific functionalities: cases of interest involving model robustness, fairness, or particular linguistic capabilities. They enable fine-grained evaluations of model aspects that would otherwise go unnoticed in standard evaluation datasets, but they do not address the problem of how to fix the failure cases. Previous work has e…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 33 pages, 8 figures

  7. Cross-functional Analysis of Generalisation in Behavioural Learning

    Authors: Pedro Henrique Luz de Araujo, Benjamin Roth

    Abstract: In behavioural testing, system functionalities underrepresented in the standard evaluation setting (with a held-out test set) are validated through controlled input-output pairs. Optimising performance on the behavioural tests during training (behavioural learning) would improve coverage of phenomena not sufficiently represented in the i.i.d. data and could lead to seemingly more robust models. Ho…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 16 pages, 1 figure. To be published in the Transactions of the Association for Computational Linguistics (TACL). This preprint is a pre-MIT Press publication version

    Journal ref: Transactions of the Association for Computational Linguistics 11, 2023, 1066-1081

  8. arXiv:2210.15365  [pdf, other]

    cs.CV cs.LG

    Li3DeTr: A LiDAR based 3D Detection Transformer

    Authors: Gopi Krishna Erabati, Helder Araujo

    Abstract: Inspired by recent advances in vision transformers for object detection, we propose Li3DeTr, an end-to-end LiDAR based 3D Detection Transformer for autonomous driving, that inputs LiDAR point clouds and regresses 3D bounding boxes. The LiDAR local and global features are encoded using sparse convolution and multi-scale deformable attention respectively. In the decoder head, firstly, in the novel L…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

  9. arXiv:2210.15316  [pdf, other]

    cs.CV cs.LG cs.RO

    MSF3DDETR: Multi-Sensor Fusion 3D Detection Transformer for Autonomous Driving

    Authors: Gopi Krishna Erabati, Helder Araujo

    Abstract: 3D object detection is a significant task for autonomous driving. Recently with the progress of vision transformers, the 2D object detection problem is being treated with the set-to-set loss. Inspired by these approaches on 2D object detection and an approach for multi-view 3D object detection DETR3D, we propose MSF3DDETR: Multi-Sensor Fusion 3D Detection Transformer architecture to fuse image and…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Accepted at the ICPR 2022 Workshop DLVDR2022

  10. FAIR-FATE: Fair Federated Learning with Momentum

    Authors: Teresa Salazar, Miguel Fernandes, Helder Araujo, Pedro Henriques Abreu

    Abstract: While fairness-aware machine learning algorithms have been receiving increasing attention, the focus has been on centralized machine learning, leaving decentralized methods underexplored. Federated Learning is a decentralized form of machine learning where clients train local models with a server aggregating them to obtain a shared global model. Data heterogeneity amongst clients is a common chara…

    Submitted 2 July, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

    Comments: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this contribution is published in ICCS 2023 - Lecture Notes in Computer Science, vol 14073, Springer, and is available online at https://doi.org/10.1007/978-3-031-35995-8_37

    MSC Class: 68T07; ACM Class: I.2.m

    Journal ref: Computational Science - ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14073. Springer, Cham

  11. Sequence-aware multimodal page classification of Brazilian legal documents

    Authors: Pedro H. Luz de Araujo, Ana Paula G. S. de Almeida, Fabricio A. Braz, Nilton C. da Silva, Flavio de Barros Vidal, Teofilo E. de Campos

    Abstract: The Brazilian Supreme Court receives tens of thousands of cases each semester. Court employees spend thousands of hours to execute the initial analysis and classification of those cases -- which takes effort away from posterior, more complex stages of the case management workflow. In this paper, we explore multimodal classification of documents from Brazil's Supreme Court. We train and evaluate ou…

    Submitted 15 July, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

    Comments: 11 pages, 6 figures. This preprint, which was originally written on 8 April 2021, has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in the International Journal on Document Analysis and Recognition, and is available online at https://doi.org/10.1007/s10032-022-00406-7 and https://rdcu.be/cRvvV

    Journal ref: International Journal on Document Analysis and Recognition, 2022

  12. Checking HateCheck: a cross-functional analysis of behaviour-aware learning for hate speech detection

    Authors: Pedro Henrique Luz de Araujo, Benjamin Roth

    Abstract: Behavioural testing -- verifying system capabilities by validating human-designed input-output pairs -- is an alternative evaluation method of natural language processing systems proposed to address the shortcomings of the standard approach: computing metrics on held-out data. While behavioural tests capture human prior knowledge and insights, there has been little exploration on how to leverage t…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: 9 pages, 5 figures. Accepted at the First Workshop on Efficient Benchmarking in NLP (NLP Power!)

    Journal ref: In Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP, 2022, pages 75-83, Dublin, Ireland. Association for Computational Linguistics

  13. arXiv:2006.16670  [pdf, other]

    cs.CV

    EndoSLAM Dataset and An Unsupervised Monocular Visual Odometry and Depth Estimation Approach for Endoscopic Videos: Endo-SfMLearner

    Authors: Kutsev Bengisu Ozyoruk, Guliz Irem Gokceler, Gulfize Coskun, Kagan Incetan, Yasin Almalioglu, Faisal Mahmood, Eva Curto, Luis Perdigoto, Marina Oliveira, Hasan Sahin, Helder Araujo, Henrique Alexandrino, Nicholas J. Durr, Hunter B. Gilbert, Mehmet Turan

    Abstract: Deep learning techniques hold promise to develop dense topography reconstruction and pose estimation methods for endoscopic videos. However, currently available datasets do not support effective quantitative benchmarking. In this paper, we introduce a comprehensive endoscopic SLAM dataset consisting of 3D point cloud data for six porcine organs, capsule and standard endoscopy recordings as well as…

    Submitted 1 October, 2020; v1 submitted 30 June, 2020; originally announced June 2020.

    Comments: 27 pages, 16 figures

  14. arXiv:1803.01048  [pdf, other]

    cs.RO

    Magnetic-Visual Sensor Fusion-based Dense 3D Reconstruction and Localization for Endoscopic Capsule Robots

    Authors: Mehmet Turan, Yasin Almalioglu, Evin Pinar Ornek, Helder Araujo, Mehmet Fatih Yanik, Metin Sitti

    Abstract: Reliable and real-time 3D reconstruction and localization functionality is a crucial prerequisite for the navigation of actively controlled capsule endoscopic robots as an emerging, minimally invasive diagnostic and therapeutic technology for use in the gastrointestinal (GI) tract. In this study, we propose a fully dense, non-rigidly deformable, strictly real-time, intraoperative map fusion approa…

    Submitted 2 March, 2018; originally announced March 2018.

    Comments: submitted to IROS 2018

  15. arXiv:1709.06451  [pdf, other]

    cs.CV

    3D Reconstruction with Low Resolution, Small Baseline and High Radial Distortion Stereo Images

    Authors: Tiago Dias, Helder Araujo, Pedro Miraldo

    Abstract: In this paper we analyze and compare approaches for 3D reconstruction from low-resolution (250x250), high radial distortion stereo images, which are acquired with small baseline (approximately 1mm). These images are acquired with the system NanEye Stereo manufactured by CMOSIS/AWAIBA. These stereo cameras have also small apertures, which means that high levels of illumination are required. The goa…

    Submitted 19 September, 2017; originally announced September 2017.

    Journal ref: ACM Int'l Conf. Distributed Smart Cameras (ICDSC), 2016

  16. arXiv:1709.03401  [pdf, other]

    cs.RO

    EndoSensorFusion: Particle Filtering-Based Multi-sensory Data Fusion with Switching State-Space Model for Endoscopic Capsule Robots

    Authors: Mehmet Turan, Yasin Almalioglu, Hunter Gilbert, Helder Araujo, Taylan Cemgil, Metin Sitti

    Abstract: A reliable, real time multi-sensor fusion functionality is crucial for localization of actively controlled capsule endoscopy robots, which are an emerging, minimally invasive diagnostic and therapeutic technology for the gastrointestinal (GI) tract. In this study, we propose a novel multi-sensor fusion approach based on a particle filter that incorporates an online estimation of sensor reliability…

    Submitted 25 September, 2017; v1 submitted 8 September, 2017; originally announced September 2017.

    Comments: submitted to ICRA 2018. arXiv admin note: text overlap with arXiv:1705.06196

  17. Sparse-then-Dense Alignment based 3D Map Reconstruction Method for Endoscopic Capsule Robots

    Authors: Mehmet Turan, Yusuf Yigit Pilavci, Ipek Ganiyusufoglu, Helder Araujo, Ender Konukoglu, Metin Sitti

    Abstract: Since the development of capsule endoscopy technology, substantial progress was made in converting passive capsule endoscopes to robotic active capsule endoscopes which can be controlled by the doctor. However, robotic capsule endoscopy still has some challenges. In particular, the use of such devices to generate a precise and globally consistent three-dimensional (3D) map of the entire inner or…

    Submitted 29 August, 2017; originally announced August 2017.

    Comments: arXiv admin note: text overlap with arXiv:1705.06524

  18. Deep EndoVO: A Recurrent Convolutional Neural Network (RCNN) based Visual Odometry Approach for Endoscopic Capsule Robots

    Authors: Mehmet Turan, Yasin Almalioglu, Helder Araujo, Ender Konukoglu, Metin Sitti

    Abstract: Ingestible wireless capsule endoscopy is an emerging minimally invasive diagnostic technology for inspection of the GI tract and diagnosis of a wide range of diseases and pathologies. Medical device companies and many research groups have recently made substantial progresses in converting passive capsule endoscopes to active capsule robots, enabling more accurate, precise, and intuitive detection…

    Submitted 8 September, 2017; v1 submitted 22 August, 2017; originally announced August 2017.

  19. arXiv:1705.06524  [pdf, other]

    cs.CV

    A fully dense and globally consistent 3D map reconstruction approach for GI tract to enhance therapeutic relevance of the endoscopic capsule robot

    Authors: Mehmet Turan, Yusuf Yigit Pilavci, Redhwan Jamiruddin, Helder Araujo, Ender Konukoglu, Metin Sitti

    Abstract: In the gastrointestinal (GI) tract endoscopy field, ingestible wireless capsule endoscopy is emerging as a novel, minimally invasive diagnostic technology for inspection of the GI tract and diagnosis of a wide range of diseases and pathologies. Since the development of this technology, medical device companies and many research groups have made substantial progress in converting passive capsule en…

    Submitted 18 May, 2017; originally announced May 2017.

  20. arXiv:1705.06196  [pdf, other]

    cs.CV

    Magnetic-Visual Sensor Fusion based Medical SLAM for Endoscopic Capsule Robot

    Authors: Mehmet Turan, Yasin Almalioglu, Hunter Gilbert, Helder Araujo, Ender Konukoglu, Metin Sitti

    Abstract: A reliable, real-time simultaneous localization and mapping (SLAM) method is crucial for the navigation of actively controlled capsule endoscopy robots. These robots are an emerging, minimally invasive diagnostic and therapeutic technology for use in the gastrointestinal (GI) tract. In this study, we propose a dense, non-rigidly deformable, and real-time map fusion approach for actively controlled…

    Submitted 5 November, 2017; v1 submitted 17 May, 2017; originally announced May 2017.

  21. A Non-Rigid Map Fusion-Based RGB-Depth SLAM Method for Endoscopic Capsule Robots

    Authors: Mehmet Turan, Yasin Almalioglu, Helder Araujo, Ender Konukoglu, Metin Sitti

    Abstract: In the gastrointestinal (GI) tract endoscopy field, ingestible wireless capsule endoscopy is considered as a minimally invasive novel diagnostic technology to inspect the entire GI tract and to diagnose various diseases and pathologies. Since the development of this technology, medical device companies and many groups have made significant progress to turn such passive capsule endoscopes into robo…

    Submitted 15 May, 2017; originally announced May 2017.

  22. arXiv:1602.05990  [pdf, ps, other]

    cs.CV cs.RO

    Plücker Correction Problem: Analysis and Improvements in Efficiency

    Authors: João R. Cardoso, Pedro Miraldo, Helder Araujo

    Abstract: A given six dimensional vector represents a 3D straight line in Plucker coordinates if its coordinates satisfy the Klein quadric constraint. In many problems aiming to find the Plucker coordinates of lines, noise in the data and other type of errors contribute for obtaining 6D vectors that do not correspond to lines, because of that constraint. A common procedure to overcome this drawback is to…

    Submitted 18 February, 2016; originally announced February 2016.