-
CMISR: Circular Medical Image Super-Resolution
Authors:
Honggui Li,
Nahid Md Lokman Hossain,
Maria Trocan,
Dimitri Galayko,
Mohamad Sawan
Abstract:
Classical methods of medical image super-resolution (MISR) use an open-loop architecture with an implicit under-resolution (UR) unit and an explicit super-resolution (SR) unit. The UR unit can always be given, assumed, or estimated, while the SR unit is elaborately designed according to various SR algorithms. The closed-loop feedback mechanism is widely employed in current MISR approaches and can efficiently improve their performance. The feedback mechanism may be divided into two categories: local feedback and global feedback. This paper proposes a global feedback-based closed-cycle framework, circular MISR (CMISR), with unambiguous UR and advanced SR elements. A mathematical model and closed-loop equation of CMISR are built. A mathematical proof with Taylor-series approximation indicates that CMISR has zero recovery error in steady state. In addition, CMISR has a plug-and-play characteristic that fuses model-based and learning-based approaches and can be built on any existing MISR algorithm. Five CMISR algorithms are proposed, each based on a state-of-the-art open-loop MISR algorithm. Experimental results with three scale factors and on three open medical image datasets show that CMISR is superior to MISR in reconstruction performance and is particularly suited to medical images with strong edges or intense contrast.
Submitted 29 February, 2024; v1 submitted 15 August, 2023;
originally announced August 2023.
-
Word level Bangla Sign Language Dataset for Continuous BSL Recognition
Authors:
Md Shamimul Islam,
A. J. M. Akhtarujjaman Joha,
Md Nur Hossain,
Sohaib Abdullah,
Ibrahim Elwarfalli,
Md Mahedi Hasan
Abstract:
A robust sign language recognition system can greatly alleviate communication barriers, particularly for people who struggle with verbal communication. This is crucial for human growth and progress as it enables the expression of thoughts, feelings, and ideas. However, sign recognition is a complex task that faces numerous challenges such as the same gesture patterns for multiple signs, lighting, clothing, carrying conditions, and the presence of large poses, as well as illumination discrepancies across different views. Additionally, the absence of an extensive Bangla sign language video dataset makes it even more challenging to operate recognition systems, particularly when utilizing deep learning techniques. To address this issue, we first created a large-scale dataset called MVBSL-W50, which comprises 50 isolated words across 13 categories. Second, we developed an attention-based Bi-GRU model that captures the temporal dynamics of pose information for individuals communicating through sign language. The proposed model utilizes human pose information, which has been shown to be successful in analyzing sign language patterns. By focusing solely on movement information and disregarding body appearance and environmental factors, the model is simplified and can achieve faster performance. The accuracy of the model is reported to be 85.64%.
Submitted 9 April, 2023; v1 submitted 22 February, 2023;
originally announced February 2023.
-
DPCSpell: A Transformer-based Detector-Purificator-Corrector Framework for Spelling Error Correction of Bangla and Resource Scarce Indic Languages
Authors:
Mehedi Hasan Bijoy,
Nahid Hossain,
Salekul Islam,
Swakkhar Shatabda
Abstract:
Spelling error correction is the task of identifying and rectifying misspelled words in texts. It is an active research topic in Natural Language Processing because of its numerous applications in human language understanding. Phonetically or visually similar yet semantically distinct characters make it an arduous task in any language. Earlier efforts on spelling error correction in Bangla and resource-scarce Indic languages focused on rule-based, statistical, and machine learning-based methods, which we found rather inefficient. In particular, machine learning-based approaches, which exhibit superior performance to rule-based and statistical methods, are ineffective as they correct each character regardless of its appropriateness. In this work, we propose a novel detector-purificator-corrector framework based on denoising transformers that addresses these issues. Moreover, we present a method for large-scale corpus creation from scratch, which in turn resolves the resource limitation problem of any left-to-right scripted language. The empirical outcomes demonstrate the effectiveness of our approach, which outperforms previous state-of-the-art methods by a significant margin for Bangla spelling error correction. The models and corpus are publicly available at https://tinyurl.com/DPCSpell.
Submitted 7 November, 2022;
originally announced November 2022.
-
Bangla hate speech detection on social media using attention-based recurrent neural network
Authors:
Amit Kumar Das,
Abdullah Al Asif,
Anik Paul,
Md. Nur Hossain
Abstract:
Hate speech has spread more rapidly through the daily use of technology and, most notably, through the sharing of negative opinions or feelings on social media. Although numerous works have been carried out on detecting hate speech in English, German, and other languages, very few have been carried out in the context of the Bengali language, even though millions of people communicate on social media in Bengali. The few existing works need improvements in both accuracy and interpretability. This article proposes an encoder-decoder-based machine learning model, a popular tool in NLP, to classify users' Bengali comments on Facebook pages. A dataset of 7,425 Bengali comments, covering seven distinct categories of hate speech, was used to train and evaluate our model. For extracting and encoding local features from the comments, 1D convolutional layers were used. Finally, attention-mechanism, LSTM, and GRU based decoders were used for predicting hate speech categories. Among the three encoder-decoder algorithms, the attention-based decoder obtained the best accuracy (77%).
Submitted 30 March, 2022;
originally announced March 2022.
-
DataWords: Getting Contrarian with Text, Structured Data and Explanations
Authors:
Stephen I. Gallant,
Mirza Nasir Hossain
Abstract:
Our goal is to build classification models using a combination of free-text and structured data. To do this, we represent structured data by text sentences, DataWords, so that similar data items are mapped into the same sentence. This permits modeling a mixture of text and structured data by using only text-modeling algorithms. Several examples illustrate that it is possible to improve text classification performance by first running extraction tools (named entity recognition), then converting the output to DataWords, and adding the DataWords to the original text -- before model building and classification. This approach also allows us to produce explanations for inferences in terms of both free text and structured data.
Submitted 17 February, 2022; v1 submitted 9 November, 2021;
originally announced November 2021.
-
IoT Solution for Winter Survival of Indoor Plants
Authors:
Md Saroar Jahan,
Jhuma kabir Mim,
Sampo Niittyviita,
Santeri Moberg,
Murad Ahmad,
Nijar Hossain
Abstract:
Not only does cold climate pose a problem for outdoor plants during winter in the northern hemisphere, but for indoor plants as well: low sunlight, low humidity, and simultaneous cold breezes from windows and heat from radiators all cause problems for indoor plants. People often treat their indoor plants like mere decoration, which can often lead to health issues for the plant or even death of the plant, especially during winter. A plant monitoring system was developed to solve this problem, collecting information on plants' indoor environmental conditions (light, humidity, and temperature) and providing this information in an accessible format for the user. Preliminary functional tests were conducted in similar settings where the system would be used. In addition, the concept was evaluated by interviewing an expert in the field of horticulture.
The evaluation results indicate that this kind of system could prove useful; however, the tests indicated that the system requires further development to achieve more practical value and wider usage.
Submitted 9 June, 2021;
originally announced June 2021.
-
Machine Learning and Meta-Analysis Approach to Identify Patient Comorbidities and Symptoms that Increased Risk of Mortality in COVID-19
Authors:
Sakifa Aktar,
Ashis Talukder,
Md. Martuza Ahamad,
A. H. M. Kamal,
Jahidur Rahman Khan,
Md. Protikuzzaman,
Nasif Hossain,
Julian M. W. Quinn,
Mathew A. Summers,
Teng Liaw,
Valsamma Eapen,
Mohammad Ali Moni
Abstract:
Background: Providing appropriate care for people suffering from COVID-19, the disease caused by the pandemic SARS-CoV-2 virus, is a significant global challenge. Many individuals who become infected have pre-existing conditions that may interact with COVID-19 to increase symptom severity and mortality risk. COVID-19 patient comorbidities are likely to be informative about individual risk of severe illness and mortality. Accurately determining how comorbidities are associated with severe symptoms and mortality would thus greatly assist in COVID-19 care planning and provision.
Methods: To assess the interaction of patient comorbidities with COVID-19 severity and mortality we performed a meta-analysis of the published global literature, and machine learning predictive analysis using an aggregated COVID-19 global dataset.
Results: Our meta-analysis identified chronic obstructive pulmonary disease (COPD), cerebrovascular disease (CEVD), cardiovascular disease (CVD), type 2 diabetes, malignancy, and hypertension as most significantly associated with COVID-19 severity in the current published literature. Machine learning classification using novel aggregated cohort data similarly found COPD, CVD, CKD, type 2 diabetes, malignancy and hypertension, as well as asthma, as the most significant features for classifying those deceased versus those who survived COVID-19. While age and gender were the most significant predictors of mortality, in terms of symptom-comorbidity combinations, it was observed that Pneumonia-Hypertension, Pneumonia-Diabetes and Acute Respiratory Distress Syndrome (ARDS)-Hypertension showed the most significant effects on COVID-19 mortality.
Conclusions: These results highlight the patient cohorts most at risk of COVID-19 related severe morbidity and mortality, which has implications for prioritization of hospital resources.
Submitted 21 August, 2020;
originally announced August 2020.
-
TransForm: Formally Specifying Transistency Models and Synthesizing Enhanced Litmus Tests
Authors:
Naorin Hossain,
Caroline Trippel,
Margaret Martonosi
Abstract:
Memory consistency models (MCMs) specify the legal ordering and visibility of shared memory accesses in a parallel program. Traditionally, instruction set architecture (ISA) MCMs assume that relevant program-visible memory ordering behaviors only result from shared memory interactions that take place between user-level program instructions. This assumption fails to account for virtual memory (VM) implementations that may result in additional shared memory interactions between user-level program instructions and both 1) system-level operations (e.g., address remappings and translation lookaside buffer invalidations initiated by system calls) and 2) hardware-level operations (e.g., hardware page table walks and dirty bit updates) during a user-level program's execution. These additional shared memory interactions can impact the observable memory ordering behaviors of user-level programs. Thus, memory transistency models (MTMs) have been coined as a superset of MCMs to additionally articulate VM-aware consistency rules. However, no prior work has enabled formal MTM specifications, nor methods to support their automated analysis.
To fill the above gap, this paper presents the TransForm framework. First, TransForm features an axiomatic vocabulary for formally specifying MTMs. Second, TransForm includes a synthesis engine to support the automated generation of litmus tests enhanced with MTM features (i.e., enhanced litmus tests, or ELTs) when supplied with a TransForm MTM specification. As a case study, we formally define an estimated MTM for Intel x86 processors, called x86t_elt, that is based on observations made by an ELT-based evaluation of an Intel x86 MTM implementation from prior work and available public documentation. Given x86t_elt and a synthesis bound as input, TransForm's synthesis engine successfully produces a set of ELTs including relevant ELTs from prior work.
Submitted 11 August, 2020; v1 submitted 8 August, 2020;
originally announced August 2020.
-
SemEval-2020 Task 7: Assessing Humor in Edited News Headlines
Authors:
Nabil Hossain,
John Krumm,
Michael Gamon,
Henry Kautz
Abstract:
This paper describes the SemEval-2020 shared task "Assessing Humor in Edited News Headlines." The task's dataset contains news headlines in which short edits were applied to make them funny, and the funniness of these edited headlines was rated using crowdsourcing. This task includes two subtasks, the first of which is to estimate the funniness of headlines on a humor scale in the interval 0-3. The second subtask is to predict, for a pair of edited versions of the same original headline, which is the funnier version. To date, this task is the most popular shared computational humor task, attracting 48 teams for the first subtask and 31 teams for the second.
Submitted 1 August, 2020;
originally announced August 2020.
-
"Judge me by my size (noun), do you?" YodaLib: A Demographic-Aware Humor Generation Framework
Authors:
Aparna Garimella,
Carmen Banea,
Nabil Hossain,
Rada Mihalcea
Abstract:
The subjective nature of humor makes computerized humor generation a challenging task. We propose an automatic humor generation framework for filling the blanks in Mad Libs stories, while accounting for the demographic backgrounds of the desired audience. We collect a dataset consisting of such stories, which are filled in and judged by carefully selected workers on Amazon Mechanical Turk. We build upon the BERT platform to predict location-biased word fillings in incomplete sentences, and we fine-tune BERT to classify location-specific humor in a sentence. We leverage these components to produce YodaLib, a fully automated Mad Libs style humor generation framework, which selects and ranks appropriate candidate words and sentences in order to generate a coherent and funny story tailored to certain demographics. Our experimental results indicate that YodaLib outperforms a previous semi-automated approach proposed for this task, while also surpassing human annotators in both qualitative and quantitative analyses.
Submitted 31 May, 2020;
originally announced June 2020.
-
Stimulating Creativity with FunLines: A Case Study of Humor Generation in Headlines
Authors:
Nabil Hossain,
John Krumm,
Tanvir Sajed,
Henry Kautz
Abstract:
Building datasets of creative text, such as humor, is quite challenging. We introduce FunLines, a competitive game where players edit news headlines to make them funny, and where they rate the funniness of headlines edited by others. FunLines makes the humor generation process fun, interactive, collaborative, rewarding and educational, keeping players engaged and providing humor data at a very low cost compared to traditional crowdsourcing approaches. FunLines offers useful performance feedback, assisting players in getting better over time at generating and assessing humor, as our analysis shows. This helps to further increase the quality of the generated dataset. We show the effectiveness of this data by training humor classification models that outperform a previous benchmark, and we release this dataset to the public.
Submitted 5 February, 2020;
originally announced February 2020.
-
"President Vows to Cut <Taxes> Hair": Dataset and Analysis of Creative Text Editing for Humorous Headlines
Authors:
Nabil Hossain,
John Krumm,
Michael Gamon
Abstract:
We introduce, release, and analyze a new dataset, called Humicroedit, for research in computational humor. Our publicly available data consists of regular English news headlines paired with versions of the same headlines that contain simple replacement edits designed to make them funny. We carefully curated crowdsourced editors to create funny headlines and judges to score a total of 15,095 edited headlines, with five judges per headline. The simple edits, usually just a single word replacement, mean we can apply straightforward analysis techniques to determine what makes our edited headlines humorous. We show how the data support classic theories of humor, such as incongruity, superiority, and setup/punchline. Finally, we develop baseline classifiers that can predict whether or not an edited headline is funny, which is a first step toward automatically generating humorous headlines as an approach to creating topical humor.
Submitted 1 June, 2019;
originally announced June 2019.
-
Analyzing Uncivil Speech Provocation and Implicit Topics in Online Political News
Authors:
Rijul Magu,
Nabil Hossain,
Henry Kautz
Abstract:
Online news has made dissemination of information a faster and more efficient process. Additionally, the shift from a print medium to an online interface has enabled user interactions, creating a space to mutually understand the reader responses generated by the consumption of news articles. Intermittently, the positive environment is transformed into a hate-spewing contest, with the amount and target of incivility varying depending on the specific news website in question. In this paper, we develop methods to study the emergence of incivility within the reader communities in news sites. First, we create a dataset of political news articles and their reader comments from partisan news sites. Then, we train classifiers to predict different aspects of uncivil speech in comments. We apply these classifiers to predict whether a news article is likely to provoke a substantial portion of reader comments containing uncivil language by analyzing only the article's content. Finally, we devise a technique to "read between the lines" --- finding the topics of discussions that an article triggers among its readers without frequent, explicit mentions of these topics in its content.
Submitted 27 July, 2018;
originally announced July 2018.
-
SLEUTH: Real-time Attack Scenario Reconstruction from COTS Audit Data
Authors:
Md Nahid Hossain,
Sadegh M Milajerdi,
Junao Wang,
Birhanu Eshete,
Rigel Gjomemo,
R Sekar,
Scott Stoller,
VN Venkatakrishnan
Abstract:
We present an approach and system for real-time reconstruction of attack scenarios on an enterprise host. To meet the scalability and real-time needs of the problem, we develop a platform-neutral, main-memory based, dependency graph abstraction of audit-log data. We then present efficient, tag-based techniques for attack detection and reconstruction, including source identification and impact analysis. We also develop methods to reveal the big picture of attacks by construction of compact, visual graphs of attack steps. Our system participated in a red team evaluation organized by DARPA and was able to successfully detect and reconstruct the details of the red team's attacks on hosts running Windows, FreeBSD and Linux.
Submitted 6 January, 2018;
originally announced January 2018.
-
Attack Analysis Results for Adversarial Engagement 1 of the DARPA Transparent Computing Program
Authors:
Birhanu Eshete,
Rigel Gjomemo,
Md Nahid Hossain,
Sadegh Momeni,
R. Sekar,
Scott Stoller,
V. N. Venkatakrishnan,
Junao Wang
Abstract:
This report presents attack analysis results of the first adversarial engagement event stream for the first engagement of the DARPA TC program conducted in October 2016. The analysis was performed by Stony Brook University and University of Illinois at Chicago. The findings in this report are obtained without prior knowledge of the attacks conducted.
Submitted 21 October, 2016;
originally announced October 2016.
-
Inferring Fine-grained Details on User Activities and Home Location from Social Media: Detecting Drinking-While-Tweeting Patterns in Communities
Authors:
Nabil Hossain,
Tianran Hu,
Roghayeh Feizi,
Ann Marie White,
Jiebo Luo,
Henry Kautz
Abstract:
Nearly all previous work on geo-locating latent states and activities from social media confounds general discussions about activities, self-reports of users participating in those activities at times in the past or future, and self-reports made at the immediate time and place the activity occurs. Activities, such as alcohol consumption, may occur at different places and types of places, and it is important not only to detect the local regions where these activities occur, but also to analyze the degree of participation in them by local residents. In this paper, we develop new machine learning based methods for fine-grained localization of activities and home locations from Twitter data. We apply these methods to discover and compare alcohol consumption patterns in a large urban area, New York City, and a more suburban and rural area, Monroe County. We find positive correlations between the rate of alcohol consumption reported among a community's Twitter users and the density of alcohol outlets, demonstrating that the degree of correlation varies significantly between urban and suburban areas. While our experiments are focused on alcohol use, our methods for locating homes and distinguishing temporally-specific self-reports are applicable to a broad range of behaviors and latent states.
Submitted 10 March, 2016;
originally announced March 2016.