
Scholarly Question Answering using Large Language Models in the NFDI4DataScience Gateway

Authors: Hamed Babaei Giglou, Tilahun Abedissa Taffa, Rana Abdullah, Aida Usmanova, Ricardo Usbeck, Jennifer D'Souza, Sören Auer

Abstract: This paper introduces a scholarly Question Answering (QA) system on top of the NFDI4DataScience Gateway, employing a Retrieval Augmented Generation-based (RAG) approach. The NFDI4DS Gateway, as a foundational framework, offers a unified and intuitive interface for querying various scientific databases using federated search. The RAG-based scholarly QA, powered by a Large Language Model (LLM), facilitates dynamic interaction with search results, enhancing filtering capabilities and fostering a conversational engagement with the Gateway search. The effectiveness of both the Gateway and the scholarly QA system is demonstrated through experimental analysis.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 13 pages main content, 16 pages overall, 3 Figures, accepted for publication at NSLP 2024 workshop at ESWC 2024

arXiv:2312.03095 [pdf, other]

doi 10.1109/ACCESS.2024.3371585

Understanding Environmental Posts: Sentiment and Emotion Analysis of Social Media Data

Authors: Daniyar Amangeldi, Aida Usmanova, Pakizar Shamoi

Abstract: Social media is now the predominant source of information due to the availability of immediate public response. As a result, social media data has become a valuable resource for comprehending public sentiments. Studies have shown that it can amplify ideas and influence public sentiments. This study analyzes the public perception of climate change and the environment over a decade from 2014 to 2023. Using the Pointwise Mutual Information (PMI) algorithm, we identify sentiment and explore prevailing emotions expressed within environmental tweets across various social media platforms, namely Twitter, Reddit, and YouTube. Accuracy on a human-annotated dataset was 0.65, higher than the Vader score but lower than that of an expert rater (0.90). Our findings suggest that negative environmental tweets are far more common than positive or neutral ones. Climate change, air quality, emissions, plastic, and recycling are the most discussed topics on all social media platforms, highlighting that they are of huge global concern. The most common emotions in environmental tweets are fear, trust, and anticipation, demonstrating the wide and complex nature of public reactions. By identifying patterns and trends in opinions related to the environment, we hope to provide insights that can help raise awareness regarding environmental issues, inform the development of interventions, and adapt further actions to meet environmental challenges.

Submitted 5 December, 2023; originally announced December 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Journal ref: IEEE Access

arXiv:2207.08181 [pdf, other]

Federated Continual Learning through distillation in pervasive computing

Authors: Anastasiia Usmanova, François Portet, Philippe Lalanda, German Vega

Abstract: Federated Learning has been introduced as a new machine learning paradigm enhancing the use of local devices. At a server level, FL regularly aggregates models learned locally on distributed clients to obtain a more general model. Current solutions rely on the availability of large amounts of stored data at the client side in order to fine-tune the models sent by the server. Such a setting is not realistic in mobile pervasive computing, where data storage must be kept low and data characteristics can change dramatically. To account for this variability, a solution is to use the data regularly collected by the client to progressively adapt the received model. But such a naive approach exposes clients to the well-known problem of catastrophic forgetting. To address this problem, we have defined a Federated Continual Learning approach which is mainly based on distillation. Our approach allows a better use of resources, eliminating the need to retrain from scratch at the arrival of new data and reducing memory usage by limiting the amount of data to be stored. This proposal has been evaluated in the Human Activity Recognition (HAR) domain and has been shown to effectively reduce the catastrophic forgetting effect.

Submitted 17 July, 2022; originally announced July 2022.

Comments: Anastasiia Usmanova, François Portet, Philippe Lalanda and German Vega. Federated Continual Learning through distillation in pervasive computing (short paper), SMARTCOMP2022, Espoo, Finland, June 20-24, 2022

arXiv:2207.08180 [pdf, other]

doi 10.1109/PerComWorkshops53856.2022.9767246

Federated Learning and catastrophic forgetting in pervasive computing: demonstration in HAR domain

Authors: Anastasiia Usmanova, François Portet, Philippe Lalanda, German Vega

Abstract: Federated Learning has been introduced as a new machine learning paradigm enhancing the use of local devices. At a server level, FL regularly aggregates models learned locally on distributed clients to obtain a more general model. In this way, no private data is sent over the network, and the communication cost is reduced. However, current solutions rely on the availability of large amounts of stored data at the client side in order to fine-tune the models sent by the server. Such a setting is not realistic in mobile pervasive computing, where data storage must be kept low and data characteristics (distribution) can change dramatically. To account for this variability, a solution is to use the data regularly collected by the client to progressively adapt the received model. But such a naive approach exposes clients to the well-known problem of catastrophic forgetting. The purpose of this paper is to demonstrate this problem in the mobile human activity recognition context on smartphones.

Submitted 17 July, 2022; originally announced July 2022.

Comments: A. Usmanova, F. Portet, P. Lalanda and G. Vega, "Federated Learning and catastrophic forgetting in pervasive computing: demonstration in HAR domain," 2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), 2022, pp. 310-315

arXiv:2109.04197 [pdf, other]

A distillation-based approach integrating continual learning and federated learning for pervasive services

Authors: Anastasiia Usmanova, François Portet, Philippe Lalanda, German Vega

Abstract: Federated Learning, a new machine learning paradigm enhancing the use of edge devices, is receiving a lot of attention in the pervasive community to support the development of smart services. Nevertheless, this approach still needs to be adapted to the specificity of the pervasive domain. In particular, issues related to continual learning need to be addressed. In this paper, we present a distillation-based approach dealing with catastrophic forgetting in a federated learning scenario. Specifically, Human Activity Recognition tasks are used as a demonstration domain.

Submitted 9 September, 2021; originally announced September 2021.

Journal ref: 3rd Workshop on Continual and Multimodal Learning for Internet of Things -- Co-located with IJCAI 2021, Aug 2021, Montreal, Canada

Showing 1–5 of 5 results for author: Usmanova, A