Data Collection and Analysis of French Dialects
Authors:
Omar Shaur Choudhry,
Paul Omara Odida,
Joshua Reiner,
Keiron Appleyard,
Danielle Kushnir,
William Toon
Abstract:
This paper discusses creating and analysing a new dataset for data mining and text analytics research, contributing to a joint Leeds University research project for the Corpus of National Dialects. This report investigates machine learning classifiers to classify samples of French dialect text across various French-speaking countries. Following the steps of the CRISP-DM methodology, this report ex…
▽ More
This paper discusses creating and analysing a new dataset for data mining and text analytics research, contributing to a joint Leeds University research project for the Corpus of National Dialects. This report investigates machine learning classifiers to classify samples of French dialect text across various French-speaking countries. Following the steps of the CRISP-DM methodology, this report explores the data collection process, data quality issues and data conversion for text analysis. Finally, after applying suitable data mining techniques, the evaluation methods, best overall features and classifiers and conclusions are discussed.
△ Less
Submitted 1 August, 2022;
originally announced August 2022.
MicroAnalyzer: A Python Tool for Automated Bacterial Analysis with Fluorescence Microscopy
Authors:
Jonathan Reiner,
Guy Azran,
Gal Hyams
Abstract:
Fluorescence microscopy is a widely used method among cell biologists for studying the localization and co-localization of fluorescent protein. For microbial cell biologists, these studies often include tedious and time-consuming manual segmentation of bacteria and of the fluorescence clusters or working with multiple programs. Here, we present MicroAnalyzer - a tool that automates these tasks by…
▽ More
Fluorescence microscopy is a widely used method among cell biologists for studying the localization and co-localization of fluorescent protein. For microbial cell biologists, these studies often include tedious and time-consuming manual segmentation of bacteria and of the fluorescence clusters or working with multiple programs. Here, we present MicroAnalyzer - a tool that automates these tasks by providing an end-to-end platform for microscope image analysis. While such tools do exist, they are costly, black-boxed programs. Microanalyzer offers an open-source alternative to these tools, allowing flexibility and expandability by advanced users. MicroAnalyzer provides accurate cell and fluorescence cluster segmentation based on state-of-the-art deep-learning segmentation models, combined with ad-hoc post-processing and Colicoords - an open-source cell image analysis tool for calculating general cell and fluorescence measurements. Using these methods, it performs better than generic approaches since the dynamic nature of neural networks allows for a quick adaptation to experiment restrictions and assumptions. Other existing tools do not consider experiment assumptions, nor do they provide fluorescence cluster detection without the need for any specialized equipment. The key goal of MicroAnalyzer is to automate the entire process of cell and fluorescence image analysis "from microscope to database", meaning it does not require any further input from the researcher except for the initial deep-learning model training. In this fashion, it allows the researchers to concentrate on the bigger picture instead of granular, eye-straining labor
△ Less
Submitted 26 September, 2020;
originally announced September 2020.