-
CDJUR-BR -- A Golden Collection of Legal Document from Brazilian Justice with Fine-Grained Named Entities
Authors:
Antonio Mauricio,
Vladia Pinheiro,
Vasco Furtado,
João Araújo Monteiro Neto,
Francisco das Chagas Jucá Bomfim,
André Câmara Ferreira da Costa,
Raquel Silveira,
Nilsiton Aragão
Abstract:
A basic task for most Legal Artificial Intelligence (Legal AI) applications is Named Entity Recognition (NER). However, texts produced in the context of legal practice make references to entities that are not trivially recognized by the currently available NERs. There is a lack of categorization of legislation, jurisprudence, evidence, penalties, the roles of people in a legal process (judge, lawyer, victim, defendant, witness), types of locations (crime location, defendant's address), etc. There is therefore still a need for a robust golden collection, annotated with fine-grained entities of the legal domain, that covers the various documents of a legal process, such as petitions, inquiries, complaints, decisions and sentences. In this article, we describe the development of the Golden Collection of the Brazilian Judiciary (CDJUR-BR), which covers a set of fine-grained named entities annotated by experts in legal documents. The creation of CDJUR-BR followed its own methodology, designed to make the collection comprehensive and robust. Together with the CDJUR-BR repository, we provide a BERT-based NER model trained on CDJUR-BR, whose results indicated the prevalence of the CDJUR-BR.
Submitted 19 May, 2023;
originally announced May 2023.
-
Machine learning in the prediction of cardiac epicardial and mediastinal fat volumes
Authors:
É. O. Rodrigues,
V. H. A. Pinheiro,
P. Liatsis,
A. Conci
Abstract:
We propose a methodology to predict the cardiac epicardial and mediastinal fat volumes in computed tomography images using regression algorithms. The obtained results indicate that it is feasible to predict these fats with a high degree of correlation, thus alleviating the requirement for manual or automatic segmentation of both fat volumes. Instead, segmenting just one of them suffices, while the volume of the other may be predicted fairly precisely. The correlation coefficient obtained by the Rotation Forest algorithm using MLP Regressor for predicting the mediastinal fat based on the epicardial fat was 0.9876, with a relative absolute error of 14.4% and a root relative squared error of 15.7%. The best correlation coefficient obtained in the prediction of the epicardial fat based on the mediastinal was 0.9683 with a relative absolute error of 19.6% and a relative squared error of 24.9%. Moreover, we analysed the feasibility of using linear regressors, which provide an intuitive interpretation of the underlying approximations. In this case, the obtained correlation coefficient was 0.9534 for predicting the mediastinal fat based on the epicardial, with a relative absolute error of 31.6% and a root relative squared error of 30.1%. On the prediction of the epicardial fat based on the mediastinal fat, the correlation coefficient was 0.8531, with a relative absolute error of 50.43% and a root relative squared error of 52.06%. In summary, it is possible to speed up general medical analyses and some segmentation and quantification methods that are currently employed in the state-of-the-art by using this prediction approach, which consequently reduces costs and therefore enables preventive treatments that may lead to a reduction of health problems.
Submitted 30 August, 2022;
originally announced August 2022.
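The three figures quoted in the abstract above (correlation coefficient, relative absolute error, root relative squared error) can be illustrated with a minimal sketch. It assumes the conventional definitions, in which both relative errors are normalised by the error of a baseline that always predicts the mean of the actual values; the abstract does not spell out its formulas. The function name `regression_metrics` and the sample volumes are hypothetical and are not data from the paper.

```python
import math


def regression_metrics(actual, predicted):
    """Return (correlation, relative absolute error, root relative squared error).

    Both relative errors are fractions (multiply by 100 for a percentage) and
    are normalised by a baseline that always predicts the mean of `actual`.
    This normalisation is an assumption; the abstract does not define it.
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    # Pearson correlation between actual and predicted values.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    corr = cov / math.sqrt(ss_a * ss_p)

    # Relative absolute error: total absolute error over the baseline's.
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    rae = abs_err / sum(abs(a - mean_a) for a in actual)

    # Root relative squared error: square root of the squared-error ratio.
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    rrse = math.sqrt(sq_err / ss_a)

    return corr, rae, rrse


if __name__ == "__main__":
    # Hypothetical fat volumes, for illustration only.
    actual = [80.0, 120.0, 95.0, 150.0, 110.0]
    predicted = [85.0, 115.0, 100.0, 140.0, 112.0]
    corr, rae, rrse = regression_metrics(actual, predicted)
    print(f"correlation={corr:.4f}  RAE={rae:.1%}  RRSE={rrse:.1%}")
```

Under these definitions a relative error below 100% means the regressor beats the mean-predicting baseline, which is why a high correlation can coexist with a relative absolute error of 30% or more.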
-
Community detection in complex networks to identify bottlenecks and wasted resources in bus systems
Authors:
Carlos Caminha,
Vasco Furtado,
Vládia Pinheiro,
Caio Ponte
Abstract:
We propose a methodology to help understand the shortcomings of public transportation in a city by mining complex networks that represent the supply and demand of public transport. We show how to build these networks from data on smart card use in buses, applying algorithms that estimate an origin-destination (OD) matrix and reconstruct the complete itinerary of the passengers. Overlapping the two networks sheds light on potential overload and waste in the offer of resources, which can be mitigated with strategies for balancing supply and demand.
Submitted 31 March, 2017; v1 submitted 12 June, 2016;
originally announced June 2016.