Search | arXiv e-print repository

Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models

Authors: Abhishek Kumar, Robert Morabito, Sanzhar Umbet, Jad Kabbara, Ali Emami

Abstract: As the use of Large Language Models (LLMs) becomes more widespread, understanding their self-evaluation of confidence in generated responses becomes increasingly important as it is integral to the reliability of the output of these models. We introduce the concept of Confidence-Probability Alignment, which connects an LLM's internal confidence, quantified by token probabilities, to the confidence conveyed in the model's response when explicitly asked about its certainty. Using various datasets and prompting techniques that encourage model introspection, we probe the alignment between models' internal and expressed confidence. These techniques encompass using structured evaluation scales to rate confidence, including answer options when prompting, and eliciting the model's confidence level for outputs it does not recognize as its own. Notably, among the models analyzed, OpenAI's GPT-4 showed the strongest confidence-probability alignment, with an average Spearman's $\hat{\rho}$ of 0.42, across a wide range of tasks. Our work contributes to the ongoing efforts to facilitate risk assessment in the application of LLMs and to further our understanding of model trustworthiness.

Submitted 15 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

Comments: 9 pages (excluding references), accepted to ACL 2024 Main Conference

arXiv:2405.16277 [pdf, other]

Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge

Authors: Brendan Park, Madeline Janecek, Naser Ezzati-Jivan, Yifeng Li, Ali Emami

Abstract: Large Language Models (LLMs) have demonstrated remarkable success in tasks like the Winograd Schema Challenge (WSC), showcasing advanced textual common-sense reasoning. However, applying this reasoning to multimodal domains, where understanding text and images together is essential, remains a substantial challenge. To address this, we introduce WinoVis, a novel dataset specifically designed to probe text-to-image models on pronoun disambiguation within multimodal contexts. Utilizing GPT-4 for prompt generation and Diffusion Attentive Attribution Maps (DAAM) for heatmap analysis, we propose a novel evaluation framework that isolates the models' ability in pronoun disambiguation from other visual processing challenges. Evaluation of successive model versions reveals that, despite incremental advancements, Stable Diffusion 2.0 achieves a precision of 56.7% on WinoVis, only marginally surpassing random guessing. Further error analysis identifies important areas for future research aimed at advancing text-to-image models in their ability to interpret and interact with the complex visual world.

Submitted 3 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

Comments: 9 pages (excluding references), accepted to ACL 2024 Main Conference

arXiv:2405.14555 [pdf, other]

Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models

Authors: Abhishek Kumar, Sarfaroz Yunusov, Ali Emami

Abstract: Research on Large Language Models (LLMs) has often neglected subtle biases that, although less apparent, can significantly influence the models' outputs toward particular social narratives. This study addresses two such biases within LLMs: representative bias, which denotes a tendency of LLMs to generate outputs that mirror the experiences of certain identity groups, and affinity bias, reflecting the models' evaluative preferences for specific narratives or viewpoints. We introduce two novel metrics to measure these biases: the Representative Bias Score (RBS) and the Affinity Bias Score (ABS), and present the Creativity-Oriented Generation Suite (CoGS), a collection of open-ended tasks such as short story writing and poetry composition, designed with customized rubrics to detect these subtle biases. Our analysis uncovers marked representative biases in prominent LLMs, with a preference for identities associated with being white, straight, and men. Furthermore, our investigation of affinity bias reveals distinctive evaluative patterns within each model, akin to 'bias fingerprints'. This trend is also seen in human evaluators, highlighting a complex interplay between human and machine bias perceptions.

Submitted 3 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: 9 pages (excluding references), accepted to ACL 2024 Main Conference

arXiv:2402.13372 [pdf, other]

EvoGrad: A Dynamic Take on the Winograd Schema Challenge with Human Adversaries

Authors: Jing Han Sun, Ali Emami

Abstract: While Large Language Models (LLMs) excel at the Winograd Schema Challenge (WSC), a coreference resolution task testing common-sense reasoning through pronoun disambiguation, they struggle with instances that feature minor alterations or rewording. To address this, we introduce EvoGrad, an open-source platform that harnesses a human-in-the-loop approach to create a dynamic dataset tailored to such altered WSC instances. Leveraging ChatGPT's capabilities, we expand our task instances from 182 to 3,691, setting a new benchmark for diverse common-sense reasoning datasets. Additionally, we introduce the error depth metric, assessing model stability in dynamic tasks. Our results emphasize the challenge posed by EvoGrad: even the best performing LLM, GPT-3.5, achieves an accuracy of 65.0% with an average error depth of 7.2, a stark contrast to human performance of 92.8% accuracy without perturbation errors. This highlights ongoing model limitations and the value of dynamic datasets in uncovering them.

Submitted 22 February, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

Comments: Accepted for publication in main proceedings of LREC-COLING 2024, 16 pages, 3 figures

arXiv:2401.17703 [pdf, other]

WSC+: Enhancing The Winograd Schema Challenge Using Tree-of-Experts

Authors: Pardis Sadat Zahraei, Ali Emami

Abstract: The Winograd Schema Challenge (WSC) serves as a prominent benchmark for evaluating machine understanding. While Large Language Models (LLMs) excel at answering WSC questions, their ability to generate such questions remains less explored. In this work, we propose Tree-of-Experts (ToE), a novel prompting method which enhances the generation of WSC instances (50% valid cases vs. 10% in recent methods). Using this approach, we introduce WSC+, a novel dataset comprising 3,026 LLM-generated sentences. Notably, we extend the WSC framework by incorporating new 'ambiguous' and 'offensive' categories, providing a deeper insight into model overconfidence and bias. Our analysis reveals nuances in generation-evaluation consistency, suggesting that LLMs may not always outperform in evaluating their own generated questions when compared to those crafted by other models. On WSC+, GPT-4, the top-performing LLM, achieves an accuracy of 68.7%, significantly below the human benchmark of 95.1%.

Submitted 31 January, 2024; originally announced January 2024.

Comments: Accepted for publication in main proceedings of EACL 2024 conference, 22 pages, 16 figures

ACM Class: I.2.7; K.4.1

arXiv:2310.15466 [pdf]

EKGNet: A 10.96μW Fully Analog Neural Network for Intra-Patient Arrhythmia Classification

Authors: Benyamin Haghi, Lin Ma, Sahin Lale, Anima Anandkumar, Azita Emami

Abstract: We present an integrated approach by combining analog computing and deep learning for electrocardiogram (ECG) arrhythmia classification. We propose EKGNet, a hardware-efficient and fully analog arrhythmia classification architecture that achieves high accuracy with low power consumption. The proposed architecture leverages the energy efficiency of transistors operating in the subthreshold region, eliminating the need for analog-to-digital converters (ADC) and static random access memory (SRAM). The system design includes a novel analog sequential Multiply-Accumulate (MAC) circuit that mitigates process, supply voltage, and temperature variations. Experimental evaluations on PhysioNet's MIT-BIH and PTB Diagnostics datasets demonstrate the effectiveness of the proposed method, achieving average balanced accuracy of 95% and 94.25% for intra-patient arrhythmia classification and myocardial infarction (MI) classification, respectively. This innovative approach presents a promising avenue for developing low-power arrhythmia classification systems with enhanced accuracy and transferability in biomedical applications.

Submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted on IEEE Biomedical Circuits and Systems (BioCAS) 2023

arXiv:2305.14307 [pdf, other]

Debiasing should be Good and Bad: Measuring the Consistency of Debiasing Techniques in Language Models

Authors: Robert Morabito, Jad Kabbara, Ali Emami

Abstract: Debiasing methods that seek to mitigate the tendency of Language Models (LMs) to occasionally output toxic or inappropriate text have recently gained traction. In this paper, we propose a standardized protocol which distinguishes methods that yield not only desirable results, but are also consistent with their mechanisms and specifications. For example, we ask, given a debiasing method that is developed to reduce toxicity in LMs, if the definition of toxicity used by the debiasing method is reversed, would the debiasing results also be reversed? We used such considerations to devise three criteria for our new protocol: Specification Polarity, Specification Importance, and Domain Transferability. As a case study, we apply our protocol to a popular debiasing method, Self-Debiasing, and compare it to one we propose, called Instructive Debiasing, and demonstrate that consistency is as important an aspect to debiasing viability as is simply a desirable result. We show that our protocol provides essential insights into the generalizability and interpretability of debiasing methods that may otherwise go overlooked.

Submitted 23 May, 2023; originally announced May 2023.

Comments: 9 pages (excluding references), accepted at ACL Findings 2023

arXiv:2212.12055 [pdf, ps, other]

DRL-based Energy-Efficient Baseband Function Deployments for Service-Oriented Open RAN

Authors: Haiyuan Li, Amin Emami, Karcius Assis, Antonis Vafeas, Ruizhi Yang, Reza Nejabati, Shuangyi Yan, Dimitra Simeonidou

Abstract: Open Radio Access Network (Open RAN) has gained tremendous attention from industry and academia with decentralized baseband functions across multiple processing units located at different places. However, the ever-expanding scope of RANs, along with fluctuations in resource utilization across different locations and timeframes, necessitates the implementation of robust function management policies to minimize network energy consumption. Most recently developed strategies neglected the activation time and the required energy for the server activation process, while this process could offset the potential energy savings gained from server hibernation. Furthermore, user plane functions, which can be deployed on edge computing servers to provide low-latency services, have not been sufficiently considered. In this paper, a multi-agent deep reinforcement learning (DRL) based function deployment algorithm, coupled with a heuristic method, has been developed to minimize energy consumption while fulfilling multiple requests and adhering to latency and resource constraints. In an 8-MEC network, the DRL-based solution approaches the performance of the benchmark while offering up to 51% energy savings compared to existing approaches. In a larger network of 14-MEC, it maintains a 38% energy-saving advantage and ensures real-time response capabilities. Furthermore, this paper prototypes an Open RAN testbed to verify the feasibility of the proposed solution.

Submitted 4 October, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

arXiv:2211.08570 [pdf]

Dynamic-Pix2Pix: Noise Injected cGAN for Modeling Input and Target Domain Joint Distributions with Limited Training Data

Authors: Mohammadreza Naderi, Nader Karimi, Ali Emami, Shahram Shirani, Shadrokh Samavi

Abstract: Learning to translate images from a source to a target domain with applications such as converting simple line drawing to oil painting has attracted significant attention. The quality of translated images is directly related to two crucial issues. First, the consistency of the output distribution with that of the target is essential. Second, the generated output should have a high correlation with the input. Conditional Generative Adversarial Networks, cGANs, are the most common models for translating images. The performance of a cGAN drops when we use a limited training dataset. In this work, we increase the Pix2Pix (a form of cGAN) target distribution modeling ability with the help of dynamic neural network theory. Our model has two learning cycles. The model learns the correlation between input and ground truth in the first cycle. Then, the model's architecture is refined in the second cycle to learn the target distribution from noise input. These processes are executed in each iteration of the training procedure. Helping the cGAN learn the target distribution from noise input results in a better model generalization during the test time and allows the model to fit almost perfectly to the target domain distribution. As a result, our model surpasses the Pix2Pix model in segmenting HC18 and Montgomery's chest x-ray images. Both qualitative and Dice scores show the superiority of our model. Although our proposed method does not use thousands of additional data for pretraining, it produces comparable results for in-domain and out-of-domain generalization compared to the state-of-the-art methods.

Submitted 15 November, 2022; originally announced November 2022.

Comments: 15 pages, 7 figures

arXiv:2202.10034 [pdf]

DGAFF: Deep Genetic Algorithm Fitness Formation for EEG Bio-Signal Channel Selection

Authors: Ghazaleh Ghorbanzadeh, Zahra Nabizadeh, Nader Karimi, Pejman Khadivi, Ali Emami, Shadrokh Samavi

Abstract: Brain-computer interface systems aim to greatly facilitate human-computer interaction by directly translating brain signals for computers. Recently, using many electrodes has led to better performance in these systems. However, increasing the number of recorded electrodes leads to additional time, hardware, and computational costs besides undesired complications of the recording process. Channel selection has been utilized to decrease data dimension and eliminate irrelevant channels while reducing the noise effects. Furthermore, the technique lowers the time and computational costs in real-time applications. We present a channel selection method, which combines a sequential search method with a genetic algorithm called Deep GA Fitness Formation (DGAFF). The proposed method accelerates the convergence of the genetic algorithm and increases the system's performance. The system evaluation is based on a lightweight deep neural network that automates the whole model training process. The proposed method outperforms other channel selection methods in classifying motor imagery on the utilized dataset.

Submitted 21 February, 2022; originally announced February 2022.

Comments: 15 pages, 4 figures

arXiv:2201.09377 [pdf, other]

An Application of Pseudo-Log-Likelihoods to Natural Language Scoring

Authors: Darren Abramson, Ali Emami

Abstract: Language models built using semi-supervised machine learning on large corpora of natural language have very quickly enveloped the fields of natural language generation and understanding. In this paper we apply a zero-shot approach independently developed by a number of researchers now gaining recognition as a significant alternative to fine-tuning for evaluation on common sense tasks. A language model with relatively few parameters and training steps compared to a more recent language model (T5) can outperform it on a recent large data set (TimeDial), while displaying robustness in its performance across a similar class of language tasks. Surprisingly, this result is achieved by using a hyperparameter-free zero-shot method with the smaller model, compared to fine-tuning to the larger model. We argue that robustness of the smaller model ought to be understood in terms of compositionality, in a sense that we draw from recent literature on a class of similar models. We identify a practical cost for our method and model: high GPU-time for natural language evaluation. The zero-shot measurement technique that produces remarkable stability, both for ALBERT and other BERT variants, is an application of pseudo-log-likelihoods to masked language models for the relative measurement of probability for substitution alternatives in forced choice language tasks such as the Winograd Schema Challenge, Winogrande, and others. One contribution of this paper is to bring together a number of similar, but independent strands of research. We produce some absolute state-of-the-art results for common sense reasoning in binary choice tasks, performing better than any published result in the literature, including fine-tuned efforts. We show a remarkable consistency of the model's performance under adversarial settings, which we argue is best explained by the model's compositionality of representations.

Submitted 23 January, 2022; originally announced January 2022.

arXiv:2011.04767 [pdf, other]

An Analysis of Dataset Overlap on Winograd-Style Tasks

Authors: Ali Emami, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung

Abstract: The Winograd Schema Challenge (WSC) and variants inspired by it have become important benchmarks for common-sense reasoning (CSR). Model performance on the WSC has quickly progressed from chance-level to near-human using neural language models trained on massive corpora. In this paper, we analyze the effects of varying degrees of overlap between these training corpora and the test instances in WSC-style tasks. We find that a large number of test instances overlap considerably with the corpora on which state-of-the-art models are (pre)trained, and that a significant drop in classification accuracy occurs when we evaluate models on instances with minimal overlap. Based on these results, we develop the KnowRef-60K dataset, which consists of over 60k pronoun disambiguation problems scraped from web data. KnowRef-60K is the largest corpus to date for WSC-style common-sense reasoning and exhibits a significantly lower proportion of overlaps with current pretraining corpora.

Submitted 9 November, 2020; originally announced November 2020.

Comments: 11 pages with references, accepted at COLING 2020

Journal ref: Coling2020

arXiv:2007.12764 [pdf]

Selection of Proper EEG Channels for Subject Intention Classification Using Deep Learning

Authors: Ghazale Ghorbanzade, Zahra Nabizadeh-ShahreBabak, Shadrokh Samavi, Nader Karimi, Ali Emami, Pejman Khadivi

Abstract: Brain signals could be used to control devices to assist individuals with disabilities. Signals such as electroencephalograms are complicated and hard to interpret. A set of signals are collected and should be classified to identify the intention of the subject. Different approaches have tried to reduce the number of channels before sending them to a classifier. We are proposing a deep learning-based method for selecting an informative subset of channels that produce high classification accuracy. The proposed network could be trained for an individual subject for the selection of an appropriate set of channels. Reduction of the number of channels could reduce the complexity of brain-computer-interface devices. Our method could find a subset of channels. The accuracy of our approach is comparable with a model trained on all channels. Hence, our model's temporal and power costs are low, while its accuracy is kept high.

Submitted 23 May, 2021; v1 submitted 24 July, 2020; originally announced July 2020.

Comments: 10 pages, 2 figures

arXiv:2002.01975 [pdf]

Brain Tumor Segmentation by Cascaded Deep Neural Networks Using Multiple Image Scales

Authors: Zahra Sobhaninia, Safiyeh Rezaei, Nader Karimi, Ali Emami, Shadrokh Samavi

Abstract: Intracranial tumors are groups of cells that usually grow uncontrollably. One out of four cancer deaths is due to brain tumors. Early detection and evaluation of brain tumors is an essential preventive medical step that is performed by magnetic resonance imaging (MRI). Many segmentation techniques exist for this purpose. Low segmentation accuracy is the main drawback of existing methods. In this paper, we use a deep learning method to boost the accuracy of tumor segmentation in MR images. A cascade approach is used with multiple scales of images to induce both local and global views and help the network reach higher accuracies. Our experimental results show that using multiple scales and two cascade networks is advantageous.

Submitted 5 February, 2020; originally announced February 2020.

Comments: 5 pages and 4 images

arXiv:1911.00909 [pdf]

Gland Segmentation in Histopathological Images by Deep Neural Network

Authors: Safiye Rezaei, Ali Emami, Nader Karimi, Shadrokh Samavi

Abstract: Histology method is vital in the diagnosis and prognosis of cancers and many other diseases. For the analysis of histopathological images, we need to detect and segment all gland structures. These images are very challenging, and the task of segmentation is challenging even for specialists. Segmentation of glands determines the grade of cancers such as colon, breast, and prostate. Given that deep neural networks have achieved high performance in medical images, we propose a method based on the LinkNet network for gland segmentation. We examine the effects of using different loss functions. Using the Warwick-QU dataset, which contains two test sets and one train set, we show that our approach is comparable to state-of-the-art methods. Finally, it is shown that enhancing the gland edges and the use of hematoxylin components can improve the performance of the proposed model.

Submitted 3 November, 2019; originally announced November 2019.

Comments: 5 pages, 3 figures

arXiv:1911.00908 [pdf]

Localization of Fetal Head in Ultrasound Images by Multiscale View and Deep Neural Networks

Authors: Zahra Sobhaninia, Ali Emami, Nader Karimi, Shadrokh Samavi

Abstract: One of the routine examinations that are used for prenatal care in many countries is ultrasound imaging. This procedure provides various information about fetus health and development, the progress of the pregnancy, and the baby's due date. Some of the biometric parameters of the fetus, like fetal head circumference (HC), must be measured to check the fetus's health and growth. In this paper, we investigated the effects of using multi-scale inputs in the network. We also propose a light convolutional neural network for automatic HC measurement. Experimental results on an ultrasound dataset of the fetus in different trimesters of pregnancy show that the segmentation accuracy and HC evaluations performed by a light convolutional neural network are comparable to deep convolutional neural networks. The proposed network has fewer parameters and requires less training time.

Submitted 3 November, 2019; originally announced November 2019.

Comments: 5 pages, 4 figures

arXiv:1911.00382 [pdf]

BlessMark: A Blind Diagnostically-Lossless Watermarking Framework for Medical Applications Based on Deep Neural Networks

Authors: Hamidreza Zarrabi, Ali Emami, Pejman Khadivi, Nader Karimi, Shadrokh Samavi

Abstract: Nowadays, with the development of public network usage, medical information is transmitted throughout the hospitals. A watermarking system can help protect the confidentiality of medical information distributed over the internet. In medical images, regions-of-interest (ROI) contain diagnostic information. The watermark should be embedded only into non-regions-of-interest (NROI) to keep diagnostic information without distortion. Recently, ROI-based watermarking has attracted the attention of the medical research community. The ROI map can be used as an embedding key to improve confidentiality protection. However, in most existing works, the ROI map that is used for the embedding process must be sent as side-information along with the watermarked image. This side information is a disadvantage and makes the extraction process non-blind. Also, most existing algorithms do not recover NROI of the original cover image after the extraction of the watermark. In this paper, we propose a framework for blind diagnostically-lossless watermarking, which iteratively embeds only into NROI. The significance of the proposed framework is in satisfying the confidentiality of the patient information through a blind watermarking system, while it preserves diagnostic/medical information of the image throughout the watermarking process. A deep neural network is used to recognize the ROI map in the embedding, extraction, and recovery processes. In the extraction process, the same ROI map of the embedding process is recognized without requiring any additional information. Hence, the watermark is blindly extracted from the NROI.

Submitted 11 May, 2020; v1 submitted 1 November, 2019; originally announced November 2019.

Comments: Drs. Soroushmehr and Najarian declared that they had not contributions to the paper. I removed their names

arXiv:1909.00273 [pdf]

Fetal Ultrasound Image Segmentation for Measuring Biometric Parameters Using Multi-Task Deep Learning

Authors: Zahra Sobhaninia, Shima Rafiei, Ali Emami, Nader Karimi, Kayvan Najarian, Shadrokh Samavi, S. M. Reza Soroushmehr

Abstract: Ultrasound imaging is a standard examination during pregnancy that can be used for measuring specific biometric parameters towards prenatal diagnosis and estimating gestational age. Fetal head circumference (HC) is one of the significant factors to determine the fetus growth and health. In this paper, a multi-task deep convolutional neural network is proposed for automatic segmentation and estimation of HC ellipse by minimizing a compound cost function composed of segmentation dice score and MSE of ellipse parameters. Experimental results on fetus ultrasound dataset in different trimesters of pregnancy show that the segmentation results and the extracted HC match well with the radiologist annotations. The obtained dice scores of the fetal head segmentation and the accuracy of HC evaluations are comparable to the state-of-the-art.

Submitted 31 August, 2019; originally announced September 2019.

arXiv:1909.00270 [pdf]

Gland Segmentation in Histopathology Images Using Deep Networks and Handcrafted Features

Authors: Safiyeh Rezaei, Ali Emami, Hamidreza Zarrabi, Shima Rafiei, Kayvan Najarian, Nader Karimi, Shadrokh Samavi, S. M. Reza Soroushmehr

Abstract: Histopathology images contain essential information for medical diagnosis and prognosis of cancerous disease. Segmentation of glands in histopathology images is a primary step for analysis and diagnosis of an unhealthy patient. Due to the widespread application and the great success of deep neural networks in intelligent medical diagnosis and histopathology, we propose a modified version of LinkNet for gland segmentation and recognition of malignant cases. We show that using specific handcrafted features such as invariant local binary pattern drastically improves the system performance. The experimental results demonstrate the competency of the proposed system against state-of-the-art methods. We achieved the best results in testing on section B images of the Warwick-QU dataset and obtained comparable results on section A images.

Submitted 31 August, 2019; originally announced September 2019.

arXiv:1811.01778 [pdf, ps, other]

doi 10.18653/v1/D19-1335

How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG

Authors: Paul Trichelair, Ali Emami, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung

Abstract: Recent studies have significantly improved the state-of-the-art on common-sense reasoning (CSR) benchmarks like the Winograd Schema Challenge (WSC) and SWAG. The question we ask in this paper is whether improved performance on these benchmarks represents genuine progress towards common-sense-enabled systems. We make case studies of both benchmarks and design protocols that clarify and qualify the results of previous work by analyzing threats to the validity of previous experimental designs. Our protocols account for several properties prevalent in common-sense benchmarks including size limitations, structural regularities, and variable instance difficulty.

Submitted 10 September, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

Comments: 7 pages

arXiv:1811.01747 [pdf, ps, other]

The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution

Authors: Ali Emami, Paul Trichelair, Adam Trischler, Kaheer Suleman, Hannes Schulz, Jackie Chi Kit Cheung

Abstract: We introduce a new benchmark for coreference resolution and NLI, Knowref, that targets common-sense understanding and world knowledge. Previous coreference resolution tasks can largely be solved by exploiting the number and gender of the antecedents, or have been handcrafted and do not reflect the diversity of naturally occurring text. We present a corpus of over 8,000 annotated text passages with ambiguous pronominal anaphora. These instances are both challenging and realistic. We show that various coreference systems, whether rule-based, feature-rich, or neural, perform significantly worse on the task than humans, who display high inter-annotator agreement. To explain this performance gap, we show empirically that state-of-the-art models often fail to capture context, instead relying on the gender or number of candidate antecedents to make a decision. We then use problem-specific insights to propose a data-augmentation trick called antecedent switching to alleviate this tendency in models. Finally, we show that antecedent switching yields promising results on other tasks as well: we use it to achieve state-of-the-art results on the GAP coreference task.

Submitted 13 June, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

Comments: 9 pages (excluding references), accepted for ACL 2019

arXiv:1810.07248 [pdf, other]

ReDMark: Framework for Residual Diffusion Watermarking on Deep Networks

Authors: Mahdi Ahmadi, Alireza Norouzi, S. M. Reza Soroushmehr, Nader Karimi, Kayvan Najarian, Shadrokh Samavi, Ali Emami

Abstract: Due to the rapid growth of machine learning tools and specifically deep networks in various computer vision and image processing areas, application of Convolutional Neural Networks for watermarking have recently emerged. In this paper, we propose a deep end-to-end diffusion watermarking framework (ReDMark) which can be adapted for any desired transform space. The framework is composed of two Fully Convolutional Neural Networks with the residual structure for embedding and extraction. The whole deep network is trained end-to-end to conduct a blind secure watermarking. The framework is customizable for the level of robustness vs. imperceptibility. It is also adjustable for the trade-off between capacity and robustness. The proposed framework simulates various attacks as a differentiable network layer to facilitate end-to-end training. For JPEG attack, a differentiable approximation is utilized, which drastically improves the watermarking robustness to this attack. Another important characteristic of the proposed framework, which leads to improved security and robustness, is its capability to diffuse watermark information among a relatively wide area of the image. Comparative results versus recent state-of-the-art researches highlight the superiority of the proposed framework in terms of imperceptibility and robustness.

Submitted 11 December, 2018; v1 submitted 16 October, 2018; originally announced October 2018.

Comments: 33 pages (Single column), 10 figures, 5 tables, one appendix

arXiv:1810.01375 [pdf, ps, other]

A Knowledge Hunting Framework for Common Sense Reasoning

Authors: Ali Emami, Noelia De La Cruz, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung

Abstract: We introduce an automatic system that achieves state-of-the-art results on the Winograd Schema Challenge (WSC), a common sense reasoning task that requires diverse, complex forms of inference and knowledge. Our method uses a knowledge hunting module to gather text from the web, which serves as evidence for candidate problem resolutions. Given an input problem, our system generates relevant queries to send to a search engine, then extracts and classifies knowledge from the returned results and weighs them to make a resolution. Our approach improves F1 performance on the full WSC by 0.21 over the previous best and represents the first system to exceed 0.5 F1. We further demonstrate that the approach is competitive on the Choice of Plausible Alternatives (COPA) task, which suggests that it is generally applicable.

Submitted 2 October, 2018; originally announced October 2018.

Comments: 10 pages, accepted at EMNLP 2018

arXiv:1809.07786 [pdf]

Brain Tumor Segmentation Using Deep Learning by Type Specific Sorting of Images

Authors: Zahra Sobhaninia, Safiyeh Rezaei, Alireza Noroozi, Mehdi Ahmadi, Hamidreza Zarrabi, Nader Karimi, Ali Emami, Shadrokh Samavi

Abstract: Recently deep learning has been playing a major role in the field of computer vision. One of its applications is the reduction of human judgment in the diagnosis of diseases. Especially, brain tumor diagnosis requires high accuracy, where minute errors in judgment may lead to disaster. For this reason, brain tumor segmentation is an important challenge for medical purposes. Currently several methods exist for tumor segmentation but they all lack high accuracy. Here we present a solution for brain tumor segmenting by using deep learning. In this work, we studied different angles of brain MR images and applied different networks for segmentation. The effect of using separate networks for segmentation of MR images is evaluated by comparing the results with a single network. Experimental evaluations of the networks show that Dice score of 0.73 is achieved for a single network and 0.79 is obtained for multiple networks.

Submitted 20 September, 2018; originally announced September 2018.

Comments: 4 pages, 3 figures

arXiv:1803.00406 [pdf]

Left ventricle segmentation By modelling uncertainty in prediction of deep convolutional neural networks and adaptive thresholding inference

Authors: Alireza Norouzi, Ali Emami, S. M. Reza Soroushmehr, Nader Karimi, Shadrokh Samavi, Kayvan Najarian

Abstract: Deep neural networks have shown great achievements in solving complex problems. However, there are fundamental problems that limit their real world applications. Lack of measurable criteria for estimating uncertainty in the network outputs is one of these problems. In this paper, we address this limitation by introducing deformation to the network input and measuring the level of stability in the network's output. We calculate simple random transformations to estimate the prediction uncertainty of deep convolutional neural networks. For a real use-case, we apply this method to left ventricle segmentation in MRI cardiac images. We also propose an adaptive thresholding method to consider the deep neural network uncertainty. Experimental results demonstrate state-of-the-art performance and highlight the capabilities of simple methods in conjunction with deep neural networks.

Submitted 23 February, 2018; originally announced March 2018.

Comments: 5 pages, 3 figures

arXiv:1802.07781 [pdf]

Lossless Image Compression Algorithm for Wireless Capsule Endoscopy by Content-Based Classification of Image Blocks

Authors: Atefe Rajaeefar, Ali Emami, S. M. Reza Soroushmehr, Nader Karimi, Shadrokh Samavi, Kayvan Najarian

Abstract: Recent advances in capsule endoscopy systems have introduced new methods and capabilities. The capsule endoscopy system, by observing the entire digestive tract, has significantly improved diagnosing gastrointestinal disorders and diseases. The system has challenges such as the need to enhance the quality of the transmitted images, low frame rates of transmission, and battery lifetime that need to be addressed. One of the important parts of a capsule endoscopy system is the image compression unit. Better compression of images increases the frame rate and hence improves the diagnosis process. In this paper a high precision compression algorithm with high compression ratio is proposed. In this algorithm we use the similarity between frames to compress the data more efficiently.

Submitted 21 February, 2018; originally announced February 2018.

Comments: 4 pages, 5 figures

arXiv:1802.07769 [pdf]

Lossless Compression of Angiogram Foreground with Visual Quality Preservation of Background

Authors: Mahdi Ahmadi, Ali Emami, Mohsen Hajabdollahi, S. M. Reza Soroushmehr, Nader Karimi, Shadrokh Samavi, Kayvan Najarian

Abstract: By increasing the volume of telemedicine information, the need for medical image compression has become more important. In angiographic images, a small ratio of the entire image usually belongs to the vasculature that provides crucial information for diagnosis. Other parts of the image are diagnostically less important and can be compressed with higher compression ratio. However, the quality of those parts affect the visual perception of the image as well. Existing methods compress foreground and background of angiographic images using different techniques. In this paper we first utilize convolutional neural network to segment vessels and then represent a hierarchical block processing algorithm capable of both eliminating the background redundancies and preserving the overall visual quality of angiograms.

Submitted 21 February, 2018; originally announced February 2018.

Comments: 4 pages, 7 figures

arXiv:1801.05264 [pdf]

Adaptive Reversible Watermarking Based on Linear Prediction for Medical Videos

Authors: Hamidreza Zarrabi, Ali Emami, Nader Karimi, Shadrokh Samavi

Abstract: Reversible video watermarking can guarantee that the watermark logo and the original frame can be recovered from the watermarked frame without any distortion. Although reversible video watermarking has successfully been applied in multimedia, its application has not been extensively explored in medical videos. Reversible watermarking in medical videos is still a challenging problem. The existing reversible video watermarking algorithms, which are based on error prediction expansion, use motion vectors for prediction. In this study, we propose an adaptive reversible watermarking method for medical videos. We suggest using temporal correlations for improving the prediction accuracy. Hence, two temporal neighbor pixels in upcoming frames are used alongside the four spatial rhombus neighboring pixels to minimize the prediction error. To the best of our knowledge, this is the first time this method is applied to medical videos. The method helps to protect patients' personal and medical information by watermarking, i.e., increase the security of Health Information Systems (HIS). Experimental results demonstrate the high quality of the proposed watermarking method based on PSNR metric and a large capacity for data hiding in medical videos.

Submitted 28 May, 2018; v1 submitted 8 January, 2018; originally announced January 2018.

Comments: Algorithms are now presented in a standard format

arXiv:1501.02246 [pdf]

The Effect of Wedge Tip Angles on Stress Intensity Factors in the Contact Problem between Tilted Wedge and a Half Plane with an Edge Crack Using Digital Image Correlation

Authors: Seyedmeysam Khaleghian, Anahita Emami, Mohammad Yadegari, Nasser Soltani

Abstract: The first and second mode stress intensity factors (SIFs) of a contact problem between a half-plane with an edge crack and an asymmetric tilted wedge were obtained using experimental method of Digital Image Correlation (DIC). In this technique, displacement and strain fields can be measured using two digital images of the same sample at different stages of loading. However, several images were taken consequently in each stage of this experiment to avoid the noise effect. A pair of images of each stage was compared to each other. Then, the correlation coefficients between them were studied using a computer code. The pairs with the correlation coefficient higher than 0.8 were selected as the acceptable match for displacement measurements near the crack tip. Subsequently, the SIFs of specimens were calculated using displacement fields obtained from DIC method. The effect of wedge tips angle on their SIFs was also studied. Moreover, the results of DIC method were compared with the results of photoelasticity method and a close agreement between them was observed.

Submitted 6 January, 2015; originally announced January 2015.

Comments: 12 pages, 11 figures, The International Conference on Experimental Solid Mechanics and Dynamics (X-MECH-2012)

arXiv:1501.02245 [pdf]

Image Processing Code for Sharpening Photoelastic Fringe Patterns and Its Usage in Determination of Stress Intensity Factors in a Sample Contact Problem

Authors: Seyedmeysam Khaleghian, Anahita Emami, Nasser Soltani

Abstract: This study presented a type of image processing code which is used for sharpening photoelastic fringe patterns of transparent materials in photoelastic experiences to determine the stress distribution. C-Sharp software was utilized for coding the algorithm of this image processing method. For evaluation of this code, the results of a photoelastic experience of a sample contact problem between a half-plane with an oblique edge crack and a tilted wedge using this image processing method was compared with the FEM results of the same problem in order to obtain the stress intensity factors (SIF) of the specimen. A good agreement between experimental results extracted from this method of image processing and computational results was observed.

Submitted 6 January, 2015; originally announced January 2015.

Comments: 4 pages, 5 figures, ICME 2011, Tehran, Iran

arXiv:1501.01930 [pdf]

Design, Analysis, and Simulation of a Pipe-Welding Robot with Fixed Plinth

Authors: Anahita Emami, Seyedmeysam Khaleghian, Mohammad Mahjoob Jahromi

Abstract: Industrial requirements concerning the increased efficiency and high rate of manufacturing result in the development of manufacturer robots, and a vast group of these types of robots is used for welding. This study presented the design, analysis, and simulation of a pipe-welding robot with fixed plinth for a constant circular welding around the pipes. Design of a welding robot capable of keeping the electrode orientation, welding speed, and distance between electrode and pipe surface constant can improve the quality of welding; thus, a five-linked articulated robot was designed for this purpose. Solving of direct and diverse kinematics and dynamics equations of the robot was done by means of Matlab software. The robot was also simulated using a program written in Matlab and the diagrams of angles, velocities, and accelerations of all the arms, and the applied force and torque of each arm required for drive the mechanism were obtained.

Submitted 6 January, 2015; originally announced January 2015.

Comments: 6 pages, 11 figures, 3rd International Conference on Manufacturing Engineering, ICME2011, Tehran, Iran

Showing 1–31 of 31 results for author: Emami, A