Skip to main content

Showing 1–4 of 4 results for author: Ramis, C B

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.19102  [pdf, other

    cs.CL cs.AI cs.IR

    Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs

    Authors: Lokesh Mishra, Sohayl Dhibi, Yusik Kim, Cesar Berrospi Ramis, Shubham Gupta, Michele Dolfi, Peter Staar

    Abstract: Environment, Social, and Governance (ESG) KPIs assess an organization's performance on issues such as climate change, greenhouse gas emissions, water consumption, waste management, human rights, diversity, and policies. ESG reports convey this valuable quantitative information through tables. Unfortunately, extracting this information is difficult due to high variability in the table structure as… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted at the NLP4Climate workshop in the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)

  2. arXiv:2405.10725  [pdf, other

    cs.CL cs.IR

    INDUS: Effective and Efficient Language Models for Scientific Applications

    Authors: Bishwaranjan Bhattacharjee, Aashka Trivedi, Masayasu Muraoka, Muthukumaran Ramasubramanian, Takuma Udagawa, Iksha Gurung, Rong Zhang, Bharath Dandala, Rahul Ramachandran, Manil Maskey, Kaylin Bugbee, Mike Little, Elizabeth Fancher, Lauren Sanders, Sylvain Costes, Sergi Blanco-Cuaresma, Kelly Lockhart, Thomas Allen, Felix Grezes, Megan Ansdell, Alberto Accomazzi, Yousef El-Kurdi, Davis Wertheimer, Birgit Pfitzmann, Cesar Berrospi Ramis , et al. (9 additional authors not shown)

    Abstract: Large language models (LLMs) trained on general domain corpora showed remarkable results on natural language processing (NLP) tasks. However, previous research demonstrated LLMs trained using domain-focused corpora perform better on specialized tasks. Inspired by this pivotal insight, we developed INDUS, a comprehensive suite of LLMs tailored for the Earth science, biology, physics, heliophysics,… ▽ More

    Submitted 20 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  3. arXiv:2306.12802  [pdf, other

    cs.LG cs.AI q-bio.BM

    Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery

    Authors: Hoang Thanh Lam, Marco Luca Sbodio, Marcos Martínez Galindo, Mykhaylo Zayats, Raúl Fernández-Díaz, Víctor Valls, Gabriele Picco, Cesar Berrospi Ramis, Vanessa López

    Abstract: Recent research on predicting the binding affinity between drug molecules and proteins use representations learned, through unsupervised learning techniques, from large databases of molecule SMILES and protein sequences. While these representations have significantly enhanced the predictions, they are usually based on a limited set of modalities, and they do not exploit available knowledge about e… ▽ More

    Submitted 19 October, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

  4. Delivering Document Conversion as a Cloud Service with High Throughput and Responsiveness

    Authors: Christoph Auer, Michele Dolfi, André Carvalho, Cesar Berrospi Ramis, Peter W. J. Staar

    Abstract: Document understanding is a key business process in the data-driven economy since documents are central to knowledge discovery and business insights. Converting documents into a machine-processable format is a particular challenge here due to their huge variability in formats and complex structure. Accordingly, many algorithms and machine-learning methods emerged to solve particular tasks such as… ▽ More

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: 11 pages, 7 figures, to be published in IEEE CLOUD 2022

    ACM Class: I.7.5; I.2.1; C.1.4; C.4