-
ODIN: Open Data In Neurophysiology: Advancements, Solutions & Challenges
Authors:
Colleen J. Gillon,
Cody Baker,
Ryan Ly,
Edoardo Balzani,
Bingni W. Brunton,
Manuel Schottdorf,
Satrajit Ghosh,
Nima Dehghani
Abstract:
Across the life sciences, an ongoing effort over the last 50 years has made data and methods more reproducible and transparent. This openness has led to transformative insights and vastly accelerated scientific progress. For example, structural biology and genomics have undertaken systematic collection and publication of protein sequences and structures over the past half-century, and these data have led to scientific breakthroughs that were unthinkable when data collection first began. We believe that neuroscience is poised to follow the same path, and that principles of open data and open science will transform our understanding of the nervous system in ways that are impossible to predict at the moment.
To this end, new social structures along with active and open scientific communities are essential to facilitate and expand the still limited adoption of open science practices in our field. Unified by shared values of openness, we set out to organize a symposium for Open Data in Neuroscience (ODIN) to strengthen our community and facilitate transformative neuroscience research at large. In this report, we share what we learned during this first ODIN event. We also lay out plans for how to grow this movement, document emerging conversations, and propose a path toward a better and more transparent science of tomorrow.
Submitted 1 July, 2024;
originally announced July 2024.
-
Methods for Linking Data to Online Resources and Ontologies with Applications to Neurophysiology
Authors:
Matthew Avaylon,
Ryan Ly,
Andrew Tritt,
Benjamin Dichter,
Kristofer E. Bouchard,
Christopher J. Mungall,
Oliver Ruebel
Abstract:
Across many domains, large swaths of digital assets are being stored across distributed data repositories, e.g., the DANDI Archive [8]. The distribution and diversity of these repositories impede researchers from formally defining terminology within experiments, integrating information across datasets, and easily querying, reusing, and analyzing data that follow the FAIR principles [15]. As such, it has become increasingly important to have a standardized method to attach contextual metadata to datasets. Neuroscience is an exemplary use case of this issue due to the complex multimodal nature of experiments. Here, we present the HDMF External Resources Data (HERD) standard and related tools, enabling researchers to annotate new and existing datasets by mapping external references to the data without requiring modification of the original dataset. We integrated HERD closely with Neurodata Without Borders (NWB) [2], a widely used data standard for sharing and storing neurophysiology data. By integrating with NWB, our tools provide neuroscientists with the capability to more easily create and manage neurophysiology data in compliance with controlled sets of terms, enhancing rigor and accuracy of data and facilitating data reuse.
Submitted 30 May, 2024;
originally announced June 2024.
-
CrEIMBO: Cross Ensemble Interactions in Multi-view Brain Observations
Authors:
Noga Mudrik,
Ryan Ly,
Oliver Ruebel,
Adam S. Charles
Abstract:
Modern recordings of neural activity provide diverse observations of neurons across brain areas, behavioral conditions, and subjects -- thus presenting an exciting opportunity to reveal the fundamentals of brain-wide dynamics underlying cognitive function. Current methods, however, often fail to fully harness the richness of such data as they either provide an uninterpretable representation (e.g., via "black box" deep networks) or over-simplify the model (e.g., assume stationary dynamics or analyze each session independently). Here, instead of regarding asynchronous recordings that lack alignment in neural identity or brain areas as a limitation, we exploit these diverse views of the same brain system to learn a unified model of brain dynamics. We assume that brain observations stem from the joint activity of a set of functional neural ensembles (groups of co-active neurons) that are similar in functionality across recordings, and propose to discover the ensembles and their non-stationary dynamical interactions in a new model we term CrEIMBO (Cross-Ensemble Interactions in Multi-view Brain Observations). CrEIMBO identifies the composition of the per-session neural ensembles through graph-driven dictionary learning and models the ensemble dynamics as a latent sparse time-varying decomposition of global sub-circuits, thereby capturing non-stationary dynamics. CrEIMBO identifies multiple co-active sub-circuits while maintaining representation interpretability due to sharing sub-circuits across sessions. CrEIMBO distinguishes session-specific from global (session-invariant) computations by exploring when distinct sub-circuits are active. We demonstrate CrEIMBO's ability to recover ground truth components in synthetic data and uncover meaningful brain dynamics, capturing cross-subject and inter- and intra-area variability, in high-density electrode recordings of humans performing a memory task.
Submitted 27 May, 2024;
originally announced May 2024.
-
The Artificial Intelligence Ontology: LLM-assisted construction of AI concept hierarchies
Authors:
Marcin P. Joachimiak,
Mark A. Miller,
J. Harry Caufield,
Ryan Ly,
Nomi L. Harris,
Andrew Tritt,
Christopher J. Mungall,
Kristofer E. Bouchard
Abstract:
The Artificial Intelligence Ontology (AIO) is a systematization of artificial intelligence (AI) concepts, methodologies, and their interrelations. Developed via manual curation, with the additional assistance of large language models (LLMs), AIO aims to address the rapidly evolving landscape of AI by providing a comprehensive framework that encompasses both technical and ethical aspects of AI technologies. The primary audience for AIO includes AI researchers, developers, and educators seeking standardized terminology and concepts within the AI domain. The ontology is structured around six top-level branches: Networks, Layers, Functions, LLMs, Preprocessing, and Bias, each designed to support the modular composition of AI methods and facilitate a deeper understanding of deep learning architectures and ethical considerations in AI.
AIO's development utilized the Ontology Development Kit (ODK) for its creation and maintenance, with its content being dynamically updated through AI-driven curation support. This approach not only ensures the ontology's relevance amidst the fast-paced advancements in AI but also significantly enhances its utility for researchers, developers, and educators by simplifying the integration of new AI concepts and methodologies.
The ontology's utility is demonstrated through the annotation of AI methods data in a catalog of AI research publications and the integration into the BioPortal ontology resource, highlighting its potential for cross-disciplinary research. The AIO ontology is open source and is available on GitHub (https://github.com/berkeleybop/artificial-intelligence-ontology) and BioPortal (https://bioportal.bioontology.org/ontologies/AIO).
Submitted 3 April, 2024;
originally announced April 2024.
-
Remote Sensing and Machine Learning for Food Crop Production Data in Africa Post-COVID-19
Authors:
Racine Ly,
Khadim Dia,
Mariam Diallo
Abstract:
In the agricultural sector, the COVID-19 pandemic threatens to lead to a severe food security crisis in the region, with disruptions in the food supply chain and agricultural production expected to contract between 2.6% and 7%. On the food crop production side, travel bans and border closures, together with the late reception and use of agricultural inputs such as imported seeds, fertilizers, and pesticides, could lead to poor food crop production performance. Another layer of disruption introduced by the mobility restriction measures is the scarcity of agricultural workers, mainly seasonal workers. The lockdown measures and border closures limit seasonal workers' ability to get to the farm on time for planting and harvesting activities. Moreover, most imported agricultural inputs travel by air, which the pandemic has heavily impacted. Such transportation disruptions can also negatively affect the food crop production system.
This chapter assesses food crop production levels in 2020 -- before the harvesting period -- in all African regions and for four staples: maize, cassava, rice, and wheat. The production levels are predicted by combining biogeophysical remote sensing data retrieved from satellite images with artificial neural networks (ANNs), a machine learning technique. The remote sensing products are used as input variables and the ANNs as the predictive modeling framework. The input remote sensing products are the Normalized Difference Vegetation Index (NDVI), the daytime Land Surface Temperature (LST), rainfall data, and agricultural lands' Evapotranspiration (ET). The output maps and data are made publicly available on a web-based platform, AAgWa (Africa Agriculture Watch, www.aagwa.org), to facilitate access to such information for policymakers, decision-makers, and other stakeholders.
Submitted 14 July, 2021;
originally announced August 2021.
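The abstract above names NDVI as one of the remote sensing inputs. As a reminder of what that index measures, here is a minimal sketch of the conventional per-pixel definition, NDVI = (NIR - Red) / (NIR + Red). The function name and the choice to return 0.0 for pixels with no signal are assumptions made for illustration; this is not the chapter's processing pipeline.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are surface reflectances in the near-infrared and red bands.
    """
    denom = nir + red
    if denom == 0:
        # Assumption for this sketch: treat a pixel with no signal as 0.0
        # rather than raising a division error.
        return 0.0
    return (nir - red) / denom


# Dense vegetation reflects strongly in NIR and weakly in red, so NDVI is high.
vegetated = ndvi(nir=0.5, red=0.1)
# Equal reflectance in both bands gives an NDVI of zero.
bare = ndvi(nir=0.3, red=0.3)
```

Values near 1 indicate dense green vegetation, while values near 0 or below indicate bare soil, water, or cloud.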
-
Machine Learning Challenges and Opportunities in the African Agricultural Sector -- A General Perspective
Authors:
Racine Ly
Abstract:
The improvement of computers' capacities, advancements in algorithmic techniques, and the significant increase of available data have enabled the recent developments of Artificial Intelligence (AI) technology. One of its branches, called Machine Learning (ML), has shown strong capacities in mimicking characteristics attributed to human intelligence, such as vision, speech, and problem-solving. However, as previous technological revolutions suggest, its most significant impacts can be expected in sectors that were not traditional users of the technology. The agricultural sector is vital for African economies; improving yields, mitigating losses, and managing natural resources effectively are crucial in an era of climate change. Machine Learning adds value by making predictions, and hence has the potential to reduce uncertainties and risk across sectors, in this case the agricultural sector. The purpose of this paper is to contextualize and discuss barriers to ML-based solutions for African agriculture. In the second section, we provide an overview of ML technology from a historical and technical perspective and its main driving force. In the third section, we provide a brief review of the current use of ML in agriculture. Finally, in section 4, we discuss the growing interest in ML in Africa and the potential barriers to creating and using ML-based solutions in the agricultural sector.
Submitted 11 July, 2021;
originally announced July 2021.
-
Forecasting Commodity Prices Using Long Short-Term Memory Neural Networks
Authors:
Racine Ly,
Fousseini Traore,
Khadim Dia
Abstract:
This paper applies a recurrent neural network (RNN) method to forecast cotton and oil prices. We show how these new tools from machine learning, particularly Long Short-Term Memory (LSTM) models, complement traditional methods. Our results show that machine learning methods fit the data reasonably well but do not systematically outperform classical methods such as Autoregressive Integrated Moving Average (ARIMA) models in terms of out-of-sample forecasts. However, averaging the forecasts from the two types of models provides better results than either method alone. Compared to the ARIMA and the LSTM, the Root Mean Squared Error (RMSE) of the average forecast was 0.21 and 21.49 percent lower, respectively, for cotton. For oil, the forecast averaging does not provide improvements in terms of RMSE. We suggest using a forecast averaging method and extending our analysis to a wide range of commodity prices.
Submitted 15 January, 2021; v1 submitted 8 January, 2021;
originally announced January 2021.
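The forecast averaging described in the abstract above can be illustrated with a small sketch. It assumes equal weights for the two models, which the abstract does not specify, and uses made-up numbers rather than the paper's cotton or oil data. All function and variable names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def average_forecasts(forecast_a, forecast_b):
    """Point-by-point equal-weight average of two forecasts (an assumption)."""
    return [(a + b) / 2 for a, b in zip(forecast_a, forecast_b)]


def pct_reduction(baseline_rmse, new_rmse):
    """Percentage by which new_rmse is lower than baseline_rmse."""
    return 100 * (baseline_rmse - new_rmse) / baseline_rmse


# Illustrative numbers only: one model overshoots, the other undershoots.
actual = [100.0, 102.0, 101.0, 105.0]
arima_like = [101.0, 103.0, 102.0, 106.0]
lstm_like = [98.0, 100.0, 99.0, 103.0]

combined = average_forecasts(arima_like, lstm_like)
gain_vs_arima = pct_reduction(rmse(actual, arima_like), rmse(actual, combined))
gain_vs_lstm = pct_reduction(rmse(actual, lstm_like), rmse(actual, combined))
```

In this toy case the errors of the two models have opposite signs, so averaging partly cancels them. When both models err in the same direction, the average can be no better than the stronger model, which is consistent with the abstract reporting no RMSE improvement for oil.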