-
EPIDetect: Video-based convulsive seizure detection in chronic epilepsy mouse model for anti-epilepsy drug screening
Authors:
Junming Ren,
Zhoujian Xiao,
Yujia Zhang,
Yujie Yang,
Ling He,
Ezra Yoon,
Stephen Temitayo Bello,
Xi Chen,
Dapeng Wu,
Micky Tortorella,
Jufang He
Abstract:
In preclinical translational studies, drug candidates with remarkable anti-epileptic efficacy demonstrate long-term suppression of spontaneous recurrent seizures (SRSs), particularly convulsive seizures (CSs), in mouse models of chronic epilepsy. However, current methods for monitoring CSs are limited by invasiveness, the need for specific laboratory settings, high cost, and complex operation, all of which hinder drug screening efforts. In this study, a camera-based system for automated detection of CSs in chronically epileptic mice is established for the first time to screen potential anti-epilepsy drugs.
Submitted 31 May, 2024;
originally announced May 2024.
-
Analyzing COVID-19 Vaccination Sentiments in Nigerian Cyberspace: Insights from a Manually Annotated Twitter Dataset
Authors:
Ibrahim Said Ahmad,
Lukman Jibril Aliyu,
Abubakar Auwal Khalid,
Saminu Muhammad Aliyu,
Shamsuddeen Hassan Muhammad,
Idris Abdulmumin,
Bala Mairiga Abduljalil,
Bello Shehu Bello,
Amina Imam Abubakar
Abstract:
Numerous successes have been achieved in combating the COVID-19 pandemic, initially using various precautionary measures like lockdowns, social distancing, and the use of face masks. More recently, various vaccinations have been developed to aid in the prevention or reduction of the severity of the COVID-19 infection. Despite the effectiveness of the precautionary measures and the vaccines, there are several controversies that are widely shared on social media platforms like Twitter. In this paper, we explore the use of state-of-the-art transformer-based language models to study people's acceptance of vaccines in Nigeria. We developed a novel dataset by crawling multi-lingual tweets using relevant hashtags and keywords. Our analysis and visualizations revealed that most tweets expressed neutral sentiments about COVID-19 vaccines, with some individuals expressing positive views, and there was no strong preference for specific vaccine types, although Moderna received slightly more positive sentiment. We also found that fine-tuning a pre-trained LLM with an appropriate dataset can yield competitive results, even if the LLM was not initially pre-trained on the specific language of that dataset.
Submitted 23 January, 2024;
originally announced January 2024.
-
Dynamic Retrieval Augmented Generation of Ontologies using Artificial Intelligence (DRAGON-AI)
Authors:
Sabrina Toro,
Anna V Anagnostopoulos,
Sue Bello,
Kai Blumberg,
Rhiannon Cameron,
Leigh Carmody,
Alexander D Diehl,
Damion Dooley,
William Duncan,
Petra Fey,
Pascale Gaudet,
Nomi L Harris,
Marcin Joachimiak,
Leila Kiani,
Tiago Lubiana,
Monica C Munoz-Torres,
Shawn O'Neil,
David Osumi-Sutherland,
Aleix Puig,
Justin P Reese,
Leonore Reiser,
Sofia Robb,
Troy Ruemping,
James Seager,
Eric Sid
, et al. (5 additional authors not shown)
Abstract:
Background: Ontologies are fundamental components of informatics infrastructure in domains such as biomedical, environmental, and food sciences, representing consensus knowledge in an accurate and computable form. However, their construction and maintenance demand substantial resources and necessitate substantial collaboration between domain experts, curators, and ontology experts. We present Dynamic Retrieval Augmented Generation of Ontologies using AI (DRAGON-AI), an ontology generation method employing Large Language Models (LLMs) and Retrieval Augmented Generation (RAG). DRAGON-AI can generate textual and logical ontology components, drawing from existing knowledge in multiple ontologies and unstructured text sources.
Results: We assessed the performance of DRAGON-AI on de novo term construction across ten diverse ontologies, making use of extensive manual evaluation of results. Our method has high precision for relationship generation, though slightly lower than that of logic-based reasoning. Our method is also able to generate definitions deemed acceptable by expert evaluators, but these scored worse than human-authored definitions. Notably, evaluators with the highest level of confidence in a domain were better able to discern flaws in AI-generated definitions. We also demonstrated the ability of DRAGON-AI to incorporate natural language instructions in the form of GitHub issues.
Conclusions: These findings suggest DRAGON-AI's potential to substantially aid the manual ontology construction process. However, our results also underscore the importance of having expert curators and ontology editors drive the ontology generation process.
Submitted 12 June, 2024; v1 submitted 17 December, 2023;
originally announced December 2023.
-
Overview on electrical issues faced during the SPIDER experimental campaigns
Authors:
Alberto Maistrello,
Matteo Agostini,
Marco Bigi,
Matteo Brombin,
Mattia Dan,
Riccardo Casagrande,
Marco De Nardi,
Alberto Ferro,
Elena Gaio,
Palak Jain,
Francesco Lunardon,
Nicolò Marconato,
Diego Marcuzzi,
Mauro Recchia,
Tommaso Patton,
Mauro Pavei,
Francesco Santoro,
Vanni Toigo,
Loris Zanotto,
Marco Barbisan,
Lucio Baseggio,
Marco Bernardi,
Giovanni Berton,
Marco Boldrin,
Samuele Dal Bello
, et al. (17 additional authors not shown)
Abstract:
SPIDER is the full-scale prototype of the ion source of the ITER Heating Neutral Beam Injector, where negative ions of hydrogen or deuterium are produced by an RF-generated plasma and accelerated with a set of grids up to ~100 keV. The Power Supply System is composed of high-voltage DC power supplies capable of handling frequent grid breakdowns, high-current DC generators for the magnetic filter field, and RF generators for plasma generation. During the first 3 years of SPIDER operation, various electrical issues were discovered, understood, and addressed thanks to in-depth analyses of the experimental results supported by modelling activities. The paper gives an overview of the observed phenomena and the analyses carried out to understand them, of the effectiveness of the short-term modifications made to SPIDER to address the issues encountered, and of the design principles of long-term solutions to be introduced during the currently ongoing long shutdown.
Submitted 5 April, 2023;
originally announced April 2023.
-
Lessons learned after three years of SPIDER operation and the first MITICA integrated tests
Authors:
D. Marcuzzi,
V. Toigo,
M. Boldrin,
G. Chitarin,
S. Dal Bello,
L. Grando,
A. Luchetta,
R. Pasqualotto,
M. Pavei,
G. Serianni,
L. Zanotto,
R. Agnello,
P. Agostinetti,
M. Agostini,
D. Aprile,
M. Barbisan,
M. Battistella,
G. Berton,
M. Bigi,
M. Brombin,
V. Candela,
V. Candeloro,
A. Canton,
R. Casagrande,
C. Cavallini
, et al. (117 additional authors not shown)
Abstract:
ITER envisages the use of two heating neutral beam injectors plus an optional one as part of the auxiliary heating and current drive system. The 16.5 MW expected neutral beam power per injector is several notches higher than that of existing facilities worldwide. A Neutral Beam Test Facility (NBTF) was established at Consorzio RFX, exploiting the synergy of two test beds, SPIDER and MITICA. SPIDER is dedicated to developing and characterizing large efficient negative ion sources at relevant parameters in ITER-like conditions: source and accelerator located in the same vacuum where the beam propagates, immunity to electromagnetic interference from multiple radio-frequency (RF) antennas, and avoidance of RF-induced discharges on the outside of the source. Three years of experiments on SPIDER have identified the design modifications necessary to enable full performance. The source is presently in a long shutdown phase to incorporate lessons from the experimental campaign. In parallel, developments on MITICA, the full-scale prototype of the ITER NBI featuring a 1 MV accelerator and ion neutralization, are underway, including manufacturing of in-vessel components, while power supplies and auxiliary plants are already under final testing and commissioning. Integration, commissioning and tests of the 1 MV power supplies are essential for this first-of-a-kind system, unparalleled in both research and industry. The integrated test to confirm 1 MV output by combining inverter systems, DC generators and transmission lines exposed errors and accidents in some components. To realize a concrete system for ITER, solutions for the repair and improvement of the system were developed. Hence, the NBTF is emerging as a necessary facility, given the large gap with existing injectors, effectively dedicated to identifying issues and finding solutions to enable successful ITER NBI operations in a time-bound fashion.
Submitted 4 April, 2023;
originally announced April 2023.
-
Conceptual design of the Gas Injection and Vacuum System for DTT NBI
Authors:
P. Agostinetti,
S. Dal Bello,
F. Dinh,
A. Ferrara,
M. Fincato,
L. Grando,
M. Mura,
A. Murari,
E. Sartori,
M. Siragusa,
F. Siviero,
F. Veronese
Abstract:
The Divertor Tokamak Test (DTT) is a new experimental facility whose construction is starting in Frascati, Rome, Italy; its main goals are improving the understanding of plasma-wall interactions and supporting the development of ITER and DEMO. DTT will be equipped with a Neutral Beam Injector (NBI) based on negative deuterium ions, designed to inject 10 MW of power to the tokamak.
A fundamental system for the good operations of the DTT NBI will be its Gas injection and Vacuum System (GVS). Indeed, the efficiency of the entire NBI strongly depends on the good performance of its GVS.
The GVS for DTT NBI will be composed of two systems working in parallel: a grounded section connected to the main vacuum vessel, and a high voltage part connected to the ion source vessel and working at -510 kV voltage. The grounded part will feature a fore vacuum system (given by screw and roots pumps) plus a high vacuum system based on turbo-molecular pumps located on the side walls of the vessel and Non-Evaporable Getter (NEG) pumps located inside the vessel on the upper and lower surfaces. On the other hand, the high voltage part will feature a fore vacuum system (given by two compact screw pumps mounted on the external surface for the ion source vessel) plus a high vacuum system based on turbo-molecular pumps also located on the sidewalls of the ion source vessel. A dedicated deuterium gas injection will feed the process gas to the ion source and the neutralizer.
This paper gives a description of the conceptual design of the GVS for DTT NBI, and of the procedure followed to optimize this system considering the operational requirements and the other constraints of the DTT NBI.
Submitted 5 April, 2023; v1 submitted 30 March, 2023;
originally announced March 2023.
-
OSRE: Object-to-Spot Rotation Estimation for Bike Parking Assessment
Authors:
Saghir Alfasly,
Zaid Al-huda,
Saifullah Bello,
Ahmed Elazab,
Jian Lu,
Chen Xu
Abstract:
Current deep models provide remarkable object detection in terms of object classification and localization. However, estimating object rotation with respect to other visual objects in the visual context of an input image still lacks deep studies due to the unavailability of object datasets with rotation annotations.
This paper tackles these two challenges to solve the rotation estimation of a parked bike with respect to its parking area. First, we leverage the power of 3D graphics to build a camera-agnostic well-annotated Synthetic Bike Rotation Dataset (SynthBRSet). Then, we propose an object-to-spot rotation estimator (OSRE) by extending the object detection task to further regress the bike rotations in two axes. Since our model is purely trained on synthetic data, we adopt image smoothing techniques when deploying it on real-world images. The proposed OSRE is evaluated on synthetic and real-world data providing promising results. Our data and code are available at https://github.com/saghiralfasly/OSRE-Project.
Submitted 1 March, 2023;
originally announced March 2023.
-
AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
Authors:
Shamsuddeen Hassan Muhammad,
Idris Abdulmumin,
Abinew Ali Ayele,
Nedjma Ousidhoum,
David Ifeoluwa Adelani,
Seid Muhie Yimam,
Ibrahim Sa'id Ahmad,
Meriem Beloucif,
Saif M. Mohammad,
Sebastian Ruder,
Oumaima Hourrane,
Pavel Brazdil,
Felermino Dário Mário António Ali,
Davis David,
Salomey Osei,
Bello Shehu Bello,
Falalu Ibrahim,
Tajuddeen Gwadabe,
Samuel Rutunda,
Tadesse Belay,
Wendimu Baye Messelle,
Hailu Beshada Balcha,
Sisay Adugna Chala,
Hagos Tesfahun Gebremichael,
Bernard Opoku
, et al. (1 additional author not shown)
Abstract:
Africa is home to over 2,000 languages from more than six language families and has the highest linguistic diversity among all continents. These include 75 languages with at least one million speakers each. Yet, there is little NLP research conducted on African languages. Crucial to enabling such research is the availability of high-quality annotated datasets. In this paper, we introduce AfriSenti, a sentiment analysis benchmark that contains a total of >110,000 tweets in 14 African languages (Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and Yorùbá) from four language families. The tweets were annotated by native speakers and used in the AfriSenti-SemEval shared task (The AfriSenti Shared Task had over 200 participants. See website at https://afrisenti-semeval.github.io). We describe the data collection methodology, annotation process, and the challenges we dealt with when curating each dataset. We further report baseline experiments conducted on the different datasets and discuss their usefulness.
Submitted 4 November, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
#EndSARS Protest: Discourse and Mobilisation on Twitter
Authors:
Bello Shehu Bello,
Muhammad Abubakar Alhassan,
Isa Inuwa-Dutse
Abstract:
Using the @NGRPresident Twitter handle, the Government of Nigeria issued a special directive banning the Special Anti-Robbery Squad (SARS) with immediate effect. SARS is a special police unit under the Nigeria Police Force tasked with the responsibility of fighting violent crimes. However, the unit has been accused of waves of human rights abuse across the nation. According to a report by Amnesty International, 82 cases of police brutality were committed between January 2017 and May 2020. This has led to one of the major protests demanding more measures to be taken. The #EndSARS hashtag was widely used by the protesters to amplify their messages and reach out to wider communities on Twitter. In this study, we present a critical analysis of how the online protest unfolded. Essentially, we examine how the protest evolves on Twitter, the nature of engagement with the protest themes, the factors influencing the protest and public perceptions about the online movement. We found that the mobilisation strategies include direct and indirect engagements with influential users, sharing direct stories and vicarious experiences. Also, there is evidence that suggests the deployment of automated accounts to promote the cause of the protest. In terms of participation, over 70% of the protest is confined within a few states in Nigeria, and the diaspora communities also lent their voices to the movement. The most active users are not those with high followership, and the majority of the protesters (88%) used mobile devices to mobilise and report on the protest. We also examined how social media users interact with the movement and the response from the wider online communities. The themes in the online discourse are mostly about #EndSARS and vicarious experiences with the police; however, there are also topics around police reform and demand for regime change.
Submitted 15 January, 2023;
originally announced January 2023.
-
Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation
Authors:
Idris Abdulmumin,
Satya Ranjan Dash,
Musa Abdullahi Dawud,
Shantipriya Parida,
Shamsuddeen Hassan Muhammad,
Ibrahim Sa'id Ahmad,
Subhadarshi Panda,
Ondřej Bojar,
Bashir Shehu Galadanci,
Bello Shehu Bello
Abstract:
Multi-modal Machine Translation (MMT) enables the use of visual information to enhance the quality of translations. The visual information can serve as a valuable piece of context information to decrease the ambiguity of input sentences. Despite the increasing popularity of such a technique, good and sizeable datasets are scarce, limiting the full extent of their potential. Hausa, a Chadic language, is a member of the Afro-Asiatic language family. It is estimated that about 100 to 150 million people speak the language, with more than 80 million indigenous speakers. This is more than any of the other Chadic languages. Despite a large number of speakers, the Hausa language is considered low-resource in natural language processing (NLP). This is due to the absence of sufficient resources to implement most NLP tasks. While some datasets exist, they are either scarce, machine-generated, or in the religious domain. Therefore, there is a need to create training and evaluation data for implementing machine learning tasks and bridging the research gap in the language. This work presents the Hausa Visual Genome (HaVG), a dataset that contains the description of an image or a section within the image in Hausa and its equivalent in English. To prepare the dataset, we started by translating the English description of the images in the Hindi Visual Genome (HVG) into Hausa automatically. Afterward, the synthetic Hausa data was carefully post-edited considering the respective images. The dataset comprises 32,923 images and their descriptions that are divided into training, development, test, and challenge test set. The Hausa Visual Genome is the first dataset of its kind and can be used for Hausa-English machine translation, multi-modal research, and image description, among various other natural language processing and generation tasks.
Submitted 6 May, 2022; v1 submitted 2 May, 2022;
originally announced May 2022.
-
Spectral analysis in broken sheared waveguides
Authors:
Diana C. S. Bello,
Alessandra A. Verri
Abstract:
Let $Ω\subset \mathbb R^3$ be a broken sheared waveguide, i.e., it is built by translating a cross-section in a constant direction along a broken line in $\mathbb R^3$. We prove that the discrete spectrum of the Dirichlet Laplacian operator in $Ω$ is non-empty and finite. Furthermore, we show a particular geometry for $Ω$ which implies that the total multiplicity of the discrete spectrum equals 1.
Submitted 17 July, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis
Authors:
Shamsuddeen Hassan Muhammad,
David Ifeoluwa Adelani,
Sebastian Ruder,
Ibrahim Said Ahmad,
Idris Abdulmumin,
Bello Shehu Bello,
Monojit Choudhury,
Chris Chinenye Emezue,
Saheed Salahudeen Abdullahi,
Anuoluwapo Aremu,
Alipio Jeorge,
Pavel Brazdil
Abstract:
Sentiment analysis is one of the most widely studied applications in NLP, but most work focuses on languages with large amounts of data. We introduce the first large-scale human-annotated Twitter sentiment dataset for the four most widely spoken languages in Nigeria (Hausa, Igbo, Nigerian-Pidgin, and Yorùbá) consisting of around 30,000 annotated tweets per language (and 14,000 for Nigerian-Pidgin), including a significant fraction of code-mixed tweets. We propose text collection, filtering, processing and labeling methods that enable us to create datasets for these low-resource languages. We evaluate a range of pre-trained models and transfer strategies on the dataset. We find that language-specific models and language-adaptive fine-tuning generally perform best. We release the datasets, trained models, sentiment lexicons, and code to incentivize research on sentiment analysis in under-represented languages.
Submitted 18 June, 2022; v1 submitted 20 January, 2022;
originally announced January 2022.
-
Stochastic Graph Transformation For Social Network Modeling
Authors:
Nicolas Behr,
Bello Shehu Bello,
Sebastian Ehmes,
Reiko Heckel
Abstract:
Adaptive networks model social, physical, technical, or biological systems as attributed graphs evolving at the level of both their topology and data. They are naturally described by graph transformation, but the majority of authors take an approach inspired by the physical sciences, combining an informal description of the operations with programmed simulations, and systems of ODEs as the only abstract mathematical description. We show that we can capture a range of social network models, the so-called voter models, as stochastic attributed graph transformation systems, demonstrate the benefits of this representation and establish its relation to the non-standard probabilistic view adopted in the literature. We use the theory and tools of graph transformation to analyze and simulate the models and propose a new variant of a standard stochastic simulation algorithm to recreate the results observed.
Submitted 21 December, 2021;
originally announced December 2021.
-
A Simple Standard for Sharing Ontological Mappings (SSSOM)
Authors:
Nicolas Matentzoglu,
James P. Balhoff,
Susan M. Bello,
Chris Bizon,
Matthew Brush,
Tiffany J. Callahan,
Christopher G Chute,
William D. Duncan,
Chris T. Evelo,
Davera Gabriel,
John Graybeal,
Alasdair Gray,
Benjamin M. Gyori,
Melissa Haendel,
Henriette Harmse,
Nomi L. Harris,
Ian Harrow,
Harshad Hegde,
Amelia L. Hoyt,
Charles T. Hoyt,
Dazhi Jiao,
Ernesto Jiménez-Ruiz,
Simon Jupp,
Hyeongsik Kim,
Sebastian Koehler
, et al. (19 additional authors not shown)
Abstract:
Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Are they associated in some other way? Such relationships between the mapped terms are often not documented, leading to incorrect assumptions and making them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Also, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones.
The Simple Standard for Sharing Ontological Mappings (SSSOM) addresses these problems by: 1. Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. 2. Defining an easy-to-use table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data standards. 3. Implementing open and community-driven collaborative workflows designed to evolve the standard continuously to address changing requirements and mapping practices. 4. Providing reference tools and software libraries for working with the standard.
In this paper, we present the SSSOM standard, describe several use cases, and survey some existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable, and Reusable (FAIR). The SSSOM specification is at http://w3id.org/sssom/spec.
Submitted 13 December, 2021;
originally announced December 2021.
-
Self-harm: detection and support on Twitter
Authors:
Muhammad Abubakar Alhassan,
Isa Inuwa-Dutse,
Bello Shehu Bello,
Diane Pennington
Abstract:
Since the advent of online social media platforms such as Twitter and Facebook, useful health-related studies have been conducted using the information posted by online participants. Personal health-related issues such as mental health, self-harm and depression have been studied because users often share their stories on such platforms. Online users resort to sharing because the empathy and support from online communities are crucial in helping the affected individuals. A preliminary analysis shows how contents related to non-suicidal self-injury (NSSI) proliferate on Twitter. Thus, we use Twitter to collect relevant data, analyse, and proffer ways of supporting users prone to NSSI behaviour. Our approach utilises a custom crawler to retrieve relevant tweets from self-reporting users and relevant organisations interested in combating self-harm. Through textual analysis, we identify six major categories of self-harming users consisting of inflicted, anti-self-harm, support seekers, recovered, pro-self-harm and at risk. The inflicted category dominates the collection. From an engagement perspective, we show how online users respond to the information posted by self-harm support organisations on Twitter. By noting the most engaged organisations, we apply a useful technique to uncover the organisations' strategy. The online participants show a strong inclination towards online posts associated with mental health related attributes. Our study is based on the premise that social media can be used as a tool to support proactive measures to ease the negative impact of self-harm. Consequently, we proffer ways to prevent potential users from engaging in self-harm and support affected users through a set of recommendations. To support further research, the dataset will be made available for interested researchers.
Submitted 31 March, 2021;
originally announced April 2021.
-
Review: deep learning on 3D point clouds
Authors:
Saifullahi Aminu Bello,
Shangshu Yu,
Cheng Wang
Abstract:
A point cloud is a set of points defined in 3D metric space. Point clouds have become one of the most significant data formats for 3D representation. They are gaining popularity as a result of the increased availability of acquisition devices, such as LiDAR, as well as increased application in areas such as robotics, autonomous driving, and augmented and virtual reality. Deep learning is now the most powerful tool for data processing in computer vision, becoming the most preferred technique for tasks such as classification, segmentation, and detection. While deep learning techniques are mainly applied to data with a structured grid, a point cloud, on the other hand, is unstructured. This lack of structure makes the direct use of deep learning for point cloud processing very challenging. Earlier approaches overcome this challenge by preprocessing the point cloud into a structured grid format, at the cost of increased computation or loss of depth information. Recently, however, many state-of-the-art deep learning techniques that operate directly on point clouds have been developed. This paper surveys recent state-of-the-art deep learning techniques that focus mainly on point cloud data. We first briefly discuss the major challenges faced when using deep learning directly on point clouds, as well as earlier approaches that overcome these challenges by preprocessing the point cloud into a structured grid. We then review the various state-of-the-art deep learning approaches that directly process point clouds in their unstructured form. We introduce the popular 3D point cloud benchmark datasets, and we further discuss the application of deep learning in popular 3D vision tasks, including classification, segmentation and detection.
Submitted 17 January, 2020;
originally announced January 2020.
-
Lexical analysis of automated accounts on Twitter
Authors:
Isa Inuwa-Dutse,
Bello Shehu Bello,
Ioannis Korkontzelos
Abstract:
In recent years, social bots have been using increasingly sophisticated strategies that make detection challenging. While many approaches and features have been proposed, social bots evade detection and interact much like humans, making it difficult to distinguish real human accounts from bot accounts. For detection systems, various features under the broader categories of account profile, tweet content, network and temporal pattern have been utilised. The use of tweet content features is limited to analysis of basic terms such as URLs, hashtags, named entities and sentiment. Given a set of tweet contents with no obvious pattern, can we distinguish content produced by social bots from that of humans? We aim to answer this question by analysing the lexical richness of tweets produced by the respective accounts using large collections of different datasets. Our results show a clear margin between the two classes in lexical diversity, lexical sophistication and distribution of emoticons. We found that the proposed lexical features significantly improve the performance of classifying both account types. These features are useful for training a standard machine learning classifier for effective detection of social bot accounts. A new dataset is made freely available for further exploration.
Submitted 19 December, 2018;
originally announced December 2018.