Showing 1–20 of 20 results for author: Hernandez, N

Searching in archive cs.
  1. arXiv:2405.19519  [pdf, other]

    cs.CL cs.AI

    Two-layer retrieval augmented generation framework for low-resource medical question-answering: proof of concept using Reddit data

    Authors: Sudeshna Das, Yao Ge, Yuting Guo, Swati Rajwal, JaMor Hairston, Jeanne Powell, Drew Walker, Snigdha Peddireddy, Sahithi Lakamana, Selen Bozkurt, Matthew Reyna, Reza Sameni, Yunyu Xiao, Sangmi Kim, Rasheeta Chandler, Natalie Hernandez, Danielle Mowery, Rachel Wightman, Jennifer Love, Anthony Spadaro, Jeanmarie Perrone, Abeed Sarker

    Abstract: Retrieval augmented generation (RAG) provides the capability to constrain generative model outputs, and mitigate the possibility of hallucination, by providing relevant in-context text. The number of tokens a generative large language model (LLM) can incorporate as context is finite, thus limiting the volume of knowledge from which to generate an answer. We propose a two-layer RAG framework for qu…

    Submitted 29 May, 2024; originally announced May 2024.

  2. arXiv:2405.12500  [pdf, ps, other]

    cs.LG

    Entropic associative memory for real world images

    Authors: Noé Hernández, Rafael Morales, Luis A. Pineda

    Abstract: The entropic associative memory (EAM) is a computational model of natural memory incorporating some of its putative properties of being associative, distributed, declarative, abstractive and constructive. Previous experiments satisfactorily tested the model on structured, homogeneous and conventional data: images of manuscripts digits and letters, images of clothing, and phone representations. In…

    Submitted 21 May, 2024; originally announced May 2024.

  3. arXiv:2403.00241  [pdf, other]

    cs.CL

    CASIMIR: A Corpus of Scientific Articles enhanced with Multiple Author-Integrated Revisions

    Authors: Leane Jourdan, Florian Boudin, Nicolas Hernandez, Richard Dufour

    Abstract: Writing a scientific article is a challenging task as it is a highly codified and specific genre, consequently proficiency in written communication is essential for effectively conveying research findings and ideas. In this article, we propose an original textual resource on the revision step of the writing process of scientific articles. This new dataset, called CASIMIR, contains the multiple rev…

    Submitted 19 March, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: Accepted at LREC-Coling 2024

  4. arXiv:2402.12036  [pdf, other]

    cs.CL

    Language Model Adaptation to Specialized Domains through Selective Masking based on Genre and Topical Characteristics

    Authors: Anas Belfathi, Ygor Gallina, Nicolas Hernandez, Richard Dufour, Laura Monceaux

    Abstract: Recent advances in pre-trained language modeling have facilitated significant progress across various natural language processing (NLP) tasks. Word masking during model training constitutes a pivotal component of language modeling in architectures like BERT. However, the prevalent method of word masking relies on random selection, potentially disregarding domain-specific linguistic attributes. In…

    Submitted 26 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  5. arXiv:2310.17413  [pdf, other]

    cs.CL

    Harnessing GPT-3.5-turbo for Rhetorical Role Prediction in Legal Cases

    Authors: Anas Belfathi, Nicolas Hernandez, Laura Monceaux

    Abstract: We propose a comprehensive study of one-stage elicitation techniques for querying a large pre-trained generative transformer (GPT-3.5-turbo) in the rhetorical role prediction task of legal cases. This task is known as requiring textual context to be addressed. Our study explores strategies such as zero-few shots, task specification with definitions and clarification of annotation ambiguities, text…

    Submitted 26 October, 2023; originally announced October 2023.

    Journal ref: JURIX 2023 - The 36th International Conference on Legal Knowledge and Information System, Maastricht, the Netherlands

  6. arXiv:2310.05276  [pdf, other]

    cs.CL

    Enhancing Pre-Trained Language Models with Sentence Position Embeddings for Rhetorical Roles Recognition in Legal Opinions

    Authors: Anas Belfathi, Nicolas Hernandez, Laura Monceaux

    Abstract: The legal domain is a vast and complex field that involves a considerable amount of text analysis, including laws, legal arguments, and legal opinions. Legal practitioners must analyze these texts to understand legal cases, research legal precedents, and prepare legal documents. The size of legal opinions continues to grow, making it increasingly challenging to develop a model that can accurately…

    Submitted 8 October, 2023; originally announced October 2023.

    Comments: Workshop on Automated Semantic Analysis of Information in Legal Text

    Journal ref: ASAIL 2023: Proceedings of the Sixth Workshop on Automated Semantic Analysis of Information in Legal Text (ASAIL 2023), June 23, 2023, Braga, Portugal

  7. arXiv:2303.16726  [pdf, other]

    cs.CL

    Text revision in Scientific Writing Assistance: An Overview

    Authors: Léane Jourdan, Florian Boudin, Richard Dufour, Nicolas Hernandez

    Abstract: Writing a scientific article is a challenging task as it is a highly codified genre. Good writing skills are essential to properly convey ideas and results of research work. Since the majority of scientific articles are currently written in English, this exercise is all the more difficult for non-native English speakers as they additionally have to face language issues. This article aims to provid…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: Published at 13th International Workshop on Bibliometric-enhanced Information Retrieval 12 pages

    Journal ref: ceur-ws.Vol-3617(2023)22-36

  8. arXiv:2210.11673  [pdf, other]

    cs.SI cs.CR

    Strategies and Vulnerabilities of Participants in Venezuelan Influence Operations

    Authors: Ruben Recabarren, Bogdan Carbunar, Nestor Hernandez, Ashfaq Ali Shafin

    Abstract: Studies of online influence operations, coordinated efforts to disseminate and amplify disinformation, focus on forensic analysis of social networks or of publicly available datasets of trolls and bot accounts. However, little is known about the experiences and challenges of human participants in influence operations. We conducted semi-structured interviews with 19 influence operations participant…

    Submitted 20 October, 2022; originally announced October 2022.

  9. arXiv:2207.01407  [pdf, other]

    cs.CV cs.AI

    Vehicle Trajectory Prediction on Highways Using Bird Eye View Representations and Deep Learning

    Authors: Rubén Izquierdo, Álvaro Quintanar, David Fernández Llorca, Iván García Daza, Noelia Hernández, Ignacio Parra, Miguel Ángel Sotelo

    Abstract: This work presents a novel method for predicting vehicle trajectories in highway scenarios using efficient bird's eye view representations and convolutional neural networks. Vehicle positions, motion histories, road configuration, and vehicle interactions are easily included in the prediction model using basic visual representations. The U-net model has been selected as the prediction kernel to ge…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: This work has been accepted for publication at Applied Intelligence

  10. Entropic Associative Memory for Manuscript Symbols

    Authors: Rafael Morales, Noé Hernández, Ricardo Cruz, Victor D. Cruz, Luis A. Pineda

    Abstract: Manuscript symbols can be stored, recognized and retrieved from an entropic digital memory that is associative and distributed but yet declarative; memory retrieval is a constructive operation, memory cues to objects not contained in the memory are rejected directly without search, and memory operations can be performed through parallel computations. Manuscript symbols, both letters and numerals,…

    Submitted 16 February, 2022; originally announced February 2022.

    Comments: 24 pages, 13 figures

  11. arXiv:2111.10400  [pdf, other]

    cs.CR cs.CY cs.SI

    RacketStore: Measurements of ASO Deception in Google Play via Mobile and App Usage

    Authors: Nestor Hernandez, Ruben Recabarren, Bogdan Carbunar, Syed Ishtiaque Ahmed

    Abstract: Online app search optimization (ASO) platforms that provide bulk installs and fake reviews for paying app developers in order to fraudulently boost their search rank in app stores, were shown to employ diverse and complex strategies that successfully evade state-of-the-art detection methods. In this paper we introduce RacketStore, a platform to collect data from Android devices of participating AS…

    Submitted 19 November, 2021; originally announced November 2021.

    Journal ref: ACM IMC Internet Measurement Conference (2021)

  12. WiFiNet: WiFi-based indoor localisation using CNNs

    Authors: Noelia Hernández, Ignacio Parra, Héctor Corrales, Rubén Izquierdo, Augusto Luis Ballardini, Carlota Salinas, Iván Garcia

    Abstract: Different technologies have been proposed to provide indoor localisation: magnetic field, Bluetooth, WiFi, etc. Among them, WiFi is the one with the highest availability and highest accuracy. This fact allows for an ubiquitous accurate localisation available for almost any environment and any device. However, WiFi-based localisation is still an open problem. In this article, we propose a new Wi…

    Submitted 14 April, 2021; originally announced April 2021.

    Journal ref: Expert Systems with Applications, Volume 177, 1 September 2021

  13. arXiv:2012.07121  [pdf, other]

    cs.RO cs.AI

    Deliberative and Conceptual Inference in Service Robots

    Authors: Luis A. Pineda, Noé Hernández, Arturo Rodríguez, Ricardo Cruz, Gibrán Fuentes

    Abstract: Service robots need to reason to support people in daily life situations. Reasoning is an expensive resource that should be used on demand whenever the expectations of the robot do not match the situation of the world and the execution of the task is broken down; in such scenarios the robot must perform the common sense daily life inference cycle consisting on diagnosing what happened, deciding wh…

    Submitted 13 December, 2020; originally announced December 2020.

    Comments: 31 pages, 7 figures and 2 appendices

  14. arXiv:2009.06371  [pdf, ps, other]

    cs.AI cs.MS

    SeqROCTM: A Matlab toolbox for the analysis of Sequence of Random Objects driven by Context Tree Models

    Authors: Noslen Hernández, Aline Duarte

    Abstract: In several research problems we deal with probabilistic sequences of inputs (e.g., sequence of stimuli) from which an agent generates a corresponding sequence of responses and it is of interest to model the relation between them. A new class of stochastic processes, namely sequences of random objects driven by context tree models, has been introduced to model such relation in the context…

    Submitted 22 July, 2021; v1 submitted 8 September, 2020; originally announced September 2020.

  15. arXiv:1912.06105  [pdf, other]

    quant-ph cs.IT

    Bell Diagonal and Werner state generation: entanglement, non-locality, steering and discord on the IBM quantum computer

    Authors: Elias Riedel Gårding, Nicolas Schwaller, Su Yeon Chang, Samuel Bosch, Willy Robert Laborde, Javier Naya Hernandez, Chun Lam Chan, Frédéric Gessler, Xinyu Si, Marc-André Dupertuis, Nicolas Macris

    Abstract: We propose the first correct special-purpose quantum circuits for preparation of Bell-diagonal states (BDS), and implement them on the IBM Quantum computer, characterizing and testing complex aspects of their quantum correlations in the full parameter space. Among the circuits proposed, one involves only two quantum bits but requires adapted quantum tomography routines handling classical bits in p…

    Submitted 16 May, 2021; v1 submitted 12 December, 2019; originally announced December 2019.

    Comments: 20 pages, 23 figures

  16. arXiv:1907.12878  [pdf, other]

    cs.CL

    Deep Retrieval-Based Dialogue Systems: A Short Review

    Authors: Basma El Amel Boussaha, Nicolas Hernandez, Christine Jacquin, Emmanuel Morin

    Abstract: Building dialogue systems that naturally converse with humans is being an attractive and an active research domain. Multiple systems are being designed everyday and several datasets are being available. For this reason, it is being hard to keep an up-to-date state-of-the-art. In this work, we present the latest and most relevant retrieval-based dialogue systems and the available datasets used to b…

    Submitted 30 July, 2019; originally announced July 2019.

  17. Search Rank Fraud De-Anonymization in Online Systems

    Authors: Mizanur Rahman, Nestor Hernandez, Bogdan Carbunar, Duen Horng Chau

    Abstract: We introduce the fraud de-anonymization problem, that goes beyond fraud detection, to unmask the human masterminds responsible for posting search rank fraud in online systems. We collect and study search rank fraud data from Upwork, and survey the capabilities and behaviors of 58 search rank fraudsters recruited from 6 crowdsourcing sites. We propose Dolos, a fraud de-anonymization system that lev…

    Submitted 23 June, 2018; originally announced June 2018.

    Comments: The 29th ACM Conference on Hypertext and Social Media, July 2018

  18. arXiv:1709.03576  [pdf, other]

    cs.CY

    Capturing the contributions of the semantic web to the IoT: a unifying vision

    Authors: Nicolas Seydoux, Khalil Drira, Nathalie Hernandez, Thierry Monteil

    Abstract: The Internet of Things (IoT) is a technological topic with a very important societal impact. IoT application domains are various and include: smart cities, precision farming, smart factories, and smart buildings. The diversity of these application domains is the source of the very high technological heterogeneity in the IoT, leading to interoperability issues. The semantic web principles and techn…

    Submitted 30 August, 2017; originally announced September 2017.

    Comments: 23 pages, 5 tables, 3 figures, submission to the Semantic Web Journal (Internet of Things special issue) and subject of an extended abstract to the SWIT2017 (https://swit2017.github.io/) workshop

  19. Marimba: A Tool for Verifying Properties of Hidden Markov Models

    Authors: Noe Hernandez, Kerstin Eder, Evgeni Magid, Jesus Savage, David A. Rosenblueth

    Abstract: The formal verification of properties of Hidden Markov Models (HMMs) is highly desirable for gaining confidence in the correctness of the model and the corresponding system. A significant step towards HMM verification was the development by Zhang et al. of a family of logics for verifying HMMs, called POCTL*, and its model checking algorithm. As far as we know, the verification tool we present her…

    Submitted 28 October, 2015; v1 submitted 20 July, 2015; originally announced July 2015.

    Comments: Tool paper accepted in the 13th International Symposium on Automated Technology for Verification and Analysis (ATVA 2015)

  20. arXiv:0902.2345  [pdf, other]

    cs.CL

    What's in a Message?

    Authors: Stergos D. Afantenos, Nicolas Hernandez

    Abstract: In this paper we present the first step in a larger series of experiments for the induction of predicate/argument structures. The structures that we are inducing are very similar to the conceptual structures that are used in Frame Semantics (such as FrameNet). Those structures are called messages and they were previously used in the context of a multi-document summarization system of evolving ev…

    Submitted 13 February, 2009; originally announced February 2009.

    Journal ref: 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2009), workshop on Cognitive Aspects of Computational Language Acquisition. Athens, Greece