
Cheap Ways of Extracting Clinical Markers from Texts

Authors: Anastasia Sandu, Teodor Mihailescu, Sergiu Nisioi

Abstract: This paper describes the work of the UniBuc Archaeology team for CLPsych's 2024 Shared Task, which involved finding evidence within the text supporting the assigned suicide risk level. Two types of evidence were required: highlights (extracting relevant spans within the text) and summaries (aggregating evidence into a synthesis). Our work focuses on evaluating Large Language Models (LLM) as opposed to an alternative method that is much more memory and resource efficient. The first approach employs a good old-fashioned machine learning (GOML) pipeline consisting of a tf-idf vectorizer with a logistic regression classifier, whose representative features are used to extract relevant highlights. The second, more resource intensive, uses an LLM for generating the summaries and is guided by chain-of-thought to provide sequences of text indicating clinical markers.

Submitted 17 March, 2024; originally announced March 2024.

Comments: https://github.com/nlp-unibuc/clpsych24-task
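
The GOML pipeline named in the abstract above (tf-idf features, a logistic regression classifier, and highlights taken from the most representative features) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the toy posts, the binary labels, the smoothed idf formula, the SGD settings, and every function name (`tokenize`, `fit_idf`, `tfidf`, `train_logreg`, `highlight`) are assumptions made for illustration, and the sketch returns single words rather than the text spans the shared task asks for.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fit_idf(token_docs):
    """Learn a vocabulary index and smoothed idf weights (assumed formula)."""
    n_docs = len(token_docs)
    doc_freq = Counter(tok for doc in token_docs for tok in set(doc))
    vocab = {tok: i for i, tok in enumerate(sorted(doc_freq))}
    idf = {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
           for tok, df in doc_freq.items()}
    return vocab, idf


def tfidf(tokens, vocab, idf):
    """Sparse, L2-normalised tf-idf vector as {feature_index: value}."""
    counts = Counter(tok for tok in tokens if tok in vocab)
    vec = {vocab[tok]: c * idf[tok] for tok, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {i: v / norm for i, v in vec.items()}


def train_logreg(vectors, labels, dim, epochs=300, lr=0.5):
    """Binary logistic regression fitted with plain SGD."""
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            z = bias + sum(weights[i] * v for i, v in x.items())
            grad = 1.0 / (1.0 + math.exp(-z)) - y  # predicted prob minus label
            for i, v in x.items():
                weights[i] -= lr * grad * v
            bias -= lr * grad
    return weights, bias


def highlight(text, vocab, weights, top_k=2):
    """Return up to top_k in-vocabulary words with the largest positive weights."""
    seen = {tok for tok in tokenize(text) if tok in vocab}
    ranked = sorted(seen, key=lambda tok: weights[vocab[tok]], reverse=True)
    return [tok for tok in ranked[:top_k] if weights[vocab[tok]] > 0]


# Invented toy posts (NOT shared-task data); 1 = higher risk, 0 = lower risk.
posts = [
    "I feel hopeless and alone",
    "Everything is hopeless, I cannot sleep",
    "I had a great day with friends",
    "Feeling great after a walk with friends",
]
labels = [1, 1, 0, 0]

token_docs = [tokenize(p) for p in posts]
vocab, idf = fit_idf(token_docs)
vectors = [tfidf(doc, vocab, idf) for doc in token_docs]
weights, bias = train_logreg(vectors, labels, len(vocab))

print(highlight("I feel hopeless tonight", vocab, weights))
```

Because the highlights are read directly off the classifier's weights, the whole pipeline runs on a CPU with a few kilobytes of state, which is the kind of resource contrast with an LLM that the abstract draws.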

arXiv:1703.04336 [pdf, other]

A Visual Representation of Wittgenstein's Tractatus Logico-Philosophicus

Authors: Anca Bucur, Sergiu Nisioi

Abstract: In this paper we present a data visualization method together with its potential usefulness in digital humanities and philosophy of language. We compile a multilingual parallel corpus from different versions of Wittgenstein's Tractatus Logico-Philosophicus, including the original in German and translations into English, Spanish, French, and Russian. Using this corpus, we compute a similarity measure between propositions and render a visual network of relations for different languages.

Submitted 13 March, 2017; originally announced March 2017.

Comments: Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)
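
The idea in the abstract above of computing a similarity measure between propositions and turning it into a network can be sketched as follows. The abstract does not say which measure the authors use, so bag-of-words cosine similarity, the 0.3 edge threshold, and the function names (`bag_of_words`, `cosine`, `similarity_edges`) are all assumptions chosen for illustration; the three English propositions are included only as sample input.

```python
import math
import re
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Count lowercase word tokens in a proposition."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[tok] * b[tok] for tok in a if tok in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_edges(propositions, threshold=0.3):
    """Return (id_a, id_b, similarity) for every pair at or above the threshold."""
    bags = {pid: bag_of_words(text) for pid, text in propositions.items()}
    edges = []
    for a, b in combinations(sorted(bags), 2):
        sim = cosine(bags[a], bags[b])
        if sim >= threshold:
            edges.append((a, b, sim))
    return edges


# Sample propositions keyed by their Tractatus numbering.
propositions = {
    "1": "The world is everything that is the case.",
    "1.1": "The world is the totality of facts, not of things.",
    "7": "Whereof one cannot speak, thereof one must be silent.",
}

edges = similarity_edges(propositions)
for a, b, sim in edges:
    print(f"{a} -- {b}: {sim:.2f}")
```

The resulting edge list can be handed to any graph-drawing tool. Running the same procedure on each translation would give one network per language, which is the comparison the abstract describes.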

arXiv:1609.03204 [pdf, other]

On the Similarities Between Native, Non-native and Translated Texts

Authors: Ella Rabinovich, Sergiu Nisioi, Noam Ordan, Shuly Wintner

Abstract: We present a computational analysis of three language varieties: native, advanced non-native, and translation. Our goal is to investigate the similarities and differences between non-native language productions and translations, contrasting both with native language. Using a collection of computational methods we establish three main results: (1) the three types of texts are easily distinguishable; (2) non-native language and translations are closer to each other than each of them is to native language; and (3) some of these characteristics depend on the source or native language, while others do not, reflecting, perhaps, unified principles that similarly affect translations and non-native language.

Submitted 11 September, 2016; originally announced September 2016.

Comments: ACL2016, 12 pages

Showing 1–3 of 3 results for author: Nisioi, S