-
Best practices for the manual curation of Intrinsically Disordered Proteins in DisProt
Authors:
Federica Quaglia,
Anastasia Chasapi,
Maria Victoria Nugnes,
Maria Cristina Aspromonte,
Emanuela Leonardi,
Damiano Piovesan,
Silvio C. E. Tosatto
Abstract:
The DisProt database is a significant resource containing manually curated data on experimentally validated intrinsically disordered proteins (IDPs) and regions (IDRs) from the literature. Developed in 2005, its primary goal was to collect structural and functional information on proteins that lack a fixed three-dimensional (3D) structure. Today, DisProt has evolved into a major repository that not only collects experimental data but also contributes significantly to our understanding of the roles of IDPs/IDRs in various biological processes, such as autophagy or the life cycle mechanisms in viruses, or their involvement in diseases (such as cancer and neurodevelopmental disorders). DisProt offers detailed information on the structural states of IDPs/IDRs, including state transitions, interactions, and their functions, all provided as curated annotations. One of the central activities of DisProt is the meticulous curation of experimental data from the literature. For this reason, to ensure that every expert and volunteer curator possesses the requisite knowledge for data evaluation, collection, and integration, training courses and curation materials are available. However, biocuration guidelines concur on the importance of developing robust guidelines that not only provide critical information about data consistency but also ensure data acquisition. This guideline aims to provide both biocurators and external users with best practices for manually curating IDPs and IDRs in DisProt. It describes every step of the literature curation process and provides use cases of IDP curation within DisProt.
Database URL: https://disprot.org/
Submitted 25 October, 2023;
originally announced October 2023.
-
CAFA-evaluator: A Python Tool for Benchmarking Ontological Classification Methods
Authors:
Damiano Piovesan,
Davide Zago,
Parnal Joshi,
M. Clara De Paolis Kaluza,
Mahta Mehdiabadi,
Rashika Ramola,
Alexander Miguel Monzon,
Walter Reade,
Iddo Friedberg,
Predrag Radivojac,
Silvio C. E. Tosatto
Abstract:
We present CAFA-evaluator, a powerful Python program designed to evaluate the performance of prediction methods on targets with hierarchical concept dependencies. It generalizes multi-label evaluation to modern ontologies where the prediction targets are drawn from a directed acyclic graph and achieves high efficiency by leveraging matrix computation and topological sorting. The program requirements include a small number of standard Python libraries, making CAFA-evaluator easy to maintain. The code replicates the Critical Assessment of protein Function Annotation (CAFA) benchmarking, which evaluates predictions of the consistent subgraphs in Gene Ontology. Owing to its reliability and accuracy, the organizers have selected CAFA-evaluator as the official CAFA evaluation software.
Submitted 12 March, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
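The idea of evaluating predictions over a directed acyclic graph with topological sorting can be illustrated with a small sketch. This is not the CAFA-evaluator API: the toy ontology `PARENTS`, the functions `propagate` and `precision_recall`, and the choice of propagating the maximum score to ancestors are all assumptions made for illustration, and the sketch uses plain dictionaries rather than matrices.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a"],
}


def propagate(scores, parents):
    """Give every term the maximum score found among itself and its descendants.

    Assumption: max-propagation is used so that the thresholded prediction
    forms a consistent subgraph (a predicted term implies its ancestors).
    """
    # static_order() yields each term after all of its parents.
    order = list(TopologicalSorter(parents).static_order())
    out = {term: scores.get(term, 0.0) for term in order}
    # Walk in reverse so children are visited before their parents.
    for term in reversed(order):
        for parent in parents.get(term, []):
            out[parent] = max(out[parent], out[term])
    return out


def precision_recall(propagated, truth, threshold):
    """Compute precision and recall for one target at a score threshold."""
    predicted = {t for t, s in propagated.items() if s >= threshold}
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Raw predictions for a single hypothetical target.
raw_scores = {"c": 0.8, "b": 0.3}
propagated = propagate(raw_scores, PARENTS)
# Ground truth, already closed under the parent relation.
truth = {"root", "a"}
precision, recall = precision_recall(propagated, truth, threshold=0.5)
```

Visiting children before parents means each term is processed once, which is the property that topological ordering provides. A matrix-based implementation would apply the same propagation to many targets at once.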
-
Avoiding Pragmatic Oddity: A Bottom-up Defeasible Deontic Logic
Authors:
Guido Governatori,
Silvano Colombo Tosatto,
Antonino Rotolo
Abstract:
This paper presents an extension of Defeasible Deontic Logic to deal with the Pragmatic Oddity problem. The logic applies three general principles: (1) the Pragmatic Oddity problem must be solved within a general logical treatment of CTD reasoning; (2) non-monotonic methods must be adopted to handle CTD reasoning; (3) logical models of CTD reasoning must be computationally feasible and, if possible, efficient. The proposed extension of Defeasible Deontic Logic elaborates a preliminary version of the model proposed by Governatori and Rotolo (2019). The previous solution was based on particular characteristics of the (constructive, top-down) proof theory of the logic. However, that method introduces some degree of non-determinism. To avoid the problem, we provide a bottom-up characterisation of the logic. The new characterisation offers insights for the efficient implementation of the logic and allows us to establish the computational complexity of the problem.
Submitted 9 September, 2022;
originally announced September 2022.
-
Proving Regulatory Compliance: Full Compliance Against an Expressive Unconditional Obligation is coNP-Complete
Authors:
Silvano Colombo Tosatto,
Guido Governatori,
Nick van Beest
Abstract:
Organisations are required to show that their procedures and processes satisfy the relevant regulatory requirements. The computational complexity of proving regulatory compliance is known to be generally hard. However, for some of its simpler variants the computational complexity is still unknown. We focus on the eight variants of the problem that can be identified by the following binary properties: whether the requirements consist of one or multiple obligations, whether the obligations are conditional or always in force, and whether only propositional literals or formulae can be used to describe the obligations. This paper in particular shows that proving full compliance of a model against a single unconditional obligation whose elements can be described using formulae is coNP-complete. Finally, we show how this result allows us to fully map the computational complexity of these variants for proving full and non-compliance, while for partial compliance the complexity result of one of the variants is still missing.
Submitted 11 December, 2022; v1 submitted 12 May, 2021;
originally announced May 2021.
-
DOME: Recommendations for supervised machine learning validation in biology
Authors:
Ian Walsh,
Dmytro Fishman,
Dario Garcia-Gasulla,
Tiina Titma,
Gianluca Pollastri,
The ELIXIR Machine Learning focus group,
Jen Harrow,
Fotis E. Psomopoulos,
Silvio C. E. Tosatto
Abstract:
Modern biology frequently relies on machine learning to provide predictions and improve decision processes. There have been recent calls for more scrutiny on machine learning performance and possible limitations. Here we present a set of community-wide recommendations aiming to help establish standards of supervised machine learning validation in biology. Adopting a structured methods description for machine learning based on data, optimization, model, evaluation (DOME) will aim to help both reviewers and readers to better understand and assess the performance and limitations of a method or outcome. The recommendations are formulated as questions to anyone wishing to pursue implementation of a machine learning algorithm. Answers to these questions can be easily included in the supplementary material of published papers.
Submitted 7 January, 2021; v1 submitted 25 June, 2020;
originally announced June 2020.
-
Business Process Full Compliance with Respect to a Set of Conditional Obligation in Polynomial Time
Authors:
Silvano Colombo Tosatto,
Guido Governatori,
Nick Van Beest
Abstract:
In this paper, we present a new methodology to evaluate whether a business process model is fully compliant with a regulatory framework composed of a set of conditional obligations. The methodology is based on failure delta-constraints that are evaluated on bottom-up aggregations of a tree-like representation of business process models. While the generic problem of proving full compliance is coNP-complete, we show that verifying full compliance can be done in polynomial time using our methodology, for an acyclic structured process model given a regulatory framework composed of a set of conditional obligations, whose elements are restricted to be represented by propositional literals.
Submitted 27 January, 2020;
originally announced January 2020.