-
Abordagem probabilística para análise de confiabilidade de dados gerados em sequenciamentos multiplex na plataforma ABI SOLiD
Abstract: Next-generation sequencers such as the Illumina and SOLiD platforms generate large amounts of data, commonly more than 10 gigabytes of text files. In particular, the SOLiD platform allows multiple samples to be sequenced in a single run, called a multiplex run, through a tagging system called Barcode. This feature requires a computational process to separate the data by sample because the sequencer…
Submitted 11 August, 2021; v1 submitted 27 July, 2021; originally announced July 2021.
Comments: 8 pages, 4 figures, 2 tables. Published in Portuguese in the proceedings (Anais) of the XLIII Simpósio Brasileiro de Pesquisa Operacional (SBPO 2011), 2011. URL: http://www.din.uem.br/sbpo/sbpo2011/pdf/87903.pdf
-
SimCleaner -- Sistema de Padronização de Bases de Dados utilizando Funções de Similaridade
Abstract: The Knowledge Discovery in Databases (KDD) process enables the detection of patterns in databases, but this analysis may be compromised if the database is inconsistent, which makes data cleaning techniques necessary. This paper presents a tool based on similarity functions to support database preprocessing; it performed efficiently in the standardization of a System of Public Security of…
Submitted 11 August, 2021; v1 submitted 27 July, 2021; originally announced July 2021.
Comments: 6 pages, 5 figures, 1 table. Published in Portuguese in the proceedings (Anais) of the XIV Semana de Informática (SEMINF) and Escola Regional de Informática Norte (ERIN), 2011