-
Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment
Authors:
Kota Shamanth Ramanath Nayak,
Leila Kosseim
Abstract:
This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts. Our model, developed as a part of the recent SemEval task, is based on fine-tuning individual language models (BERT, XLM-RoBERTa, and mBERT) and leveraging a mean-based ensemble model in addition to dataset augmentation through paraphrase generation from ChatGPT. The scope of the study encompasses enhancing model performance through innovative training techniques and data augmentation strategies. The problem addressed is the effective identification and classification of multiple persuasive techniques in meme texts, a task complicated by the diversity and complexity of such content. The objective of the paper is to improve detection accuracy by refining model training methods and examining the impact of balanced versus unbalanced training datasets. Novelty in the results and discussion lies in the finding that training with paraphrases enhances model performance, yet a balanced training set proves more advantageous than a larger unbalanced one. Additionally, the analysis reveals the potential pitfalls of indiscriminate incorporation of paraphrases from diverse distributions, which can introduce substantial noise. Results with the SemEval 2024 data confirm these insights, demonstrating improved model efficacy with the proposed methods.
Submitted 1 July, 2024;
originally announced July 2024.
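The mean-based ensemble mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each fine-tuned model outputs per-label sigmoid probabilities; the probability values, the 0.5 threshold, and the function name `mean_ensemble` are hypothetical and not taken from the paper.

```python
import numpy as np

def mean_ensemble(prob_list, threshold=0.5):
    """Average per-label probabilities across models, then threshold.

    prob_list: list of arrays, each of shape (n_samples, n_labels).
    The threshold is an assumption; the paper's value is not stated here.
    """
    probs = np.stack(prob_list, axis=0)      # (n_models, n_samples, n_labels)
    mean_probs = probs.mean(axis=0)          # unweighted mean over models
    labels = (mean_probs >= threshold).astype(int)
    return mean_probs, labels

# Hypothetical outputs for 2 meme texts and 3 persuasion-technique labels.
bert = np.array([[0.9, 0.2, 0.6], [0.1, 0.7, 0.4]])
xlmr = np.array([[0.8, 0.4, 0.3], [0.2, 0.9, 0.5]])
mbert = np.array([[0.7, 0.3, 0.3], [0.3, 0.8, 0.3]])

mean_probs, labels = mean_ensemble([bert, xlmr, mbert])
print(labels)
```

In this toy input, the third label of the first text is predicted by one model alone (0.6) but is rejected after averaging (0.4), which is the smoothing effect an unweighted mean ensemble provides.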
-
A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images
Authors:
Yongwan Lim,
Asterios Toutios,
Yannick Bliesener,
Ye Tian,
Sajan Goud Lingala,
Colin Vaz,
Tanner Sorensen,
Miran Oh,
Sarah Harper,
Weiyi Chen,
Yoonjeong Lee,
Johannes Töger,
Mairym Lloréns Montesserin,
Caitlin Smith,
Bianca Godinez,
Louis Goldstein,
Dani Byrd,
Krishna S. Nayak,
Shrikanth S. Narayanan
Abstract:
Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is, however, limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. The imaging of the rapidly moving articulators and dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically-relevant biomarkers. The present dataset offers a unique corpus of 2D sagittal-view RT-MRI videos along with synchronized audio for 75 subjects performing linguistically motivated speech tasks, alongside the corresponding first-ever public domain raw RT-MRI data. The dataset also includes 3D volumetric vocal tract MRI during sustained speech sounds and high-resolution static anatomical T2-weighted upper airway MRI for each subject.
Submitted 15 February, 2021;
originally announced February 2021.
-
Attention-gated convolutional neural networks for off-resonance correction of spiral real-time MRI
Authors:
Yongwan Lim,
Shrikanth S. Narayanan,
Krishna S. Nayak
Abstract:
Spiral acquisitions are preferred in real-time MRI because of their efficiency, which has made it possible to capture vocal tract dynamics during natural speech. A fundamental limitation of spirals is blurring and signal loss due to off-resonance, which degrades image quality at air-tissue boundaries. Here, we present a new CNN-based off-resonance correction method that incorporates an attention-gate mechanism. This leverages spatial and channel relationships of filtered outputs and improves the expressiveness of the networks. We demonstrate improved performance with the attention-gate, on 1.5 Tesla spiral speech RT-MRI, compared to existing off-resonance correction methods.
Submitted 14 February, 2021;
originally announced February 2021.
-
Deblurring for Spiral Real-Time MRI Using Convolutional Neural Networks
Authors:
Yongwan Lim,
Shrikanth S Narayanan,
Krishna S Nayak
Abstract:
Spiral acquisitions are preferred in real-time MRI because of their time efficiency. A fundamental limitation of spirals is image blurring due to off-resonance, which degrades image quality significantly at air-tissue boundaries. Here, we demonstrate a simple CNN-based deblurring method for spiral real-time MRI of human speech production. We show the CNN-based deblurring is capable of restoring blurred vocal tract tissue boundaries, without a need for exam-specific field maps. Deblurring performance is superior to a current auto-calibrated method, and slightly inferior to ideal reconstruction with perfect knowledge of the field maps.
Submitted 29 May, 2020; v1 submitted 26 January, 2020;
originally announced January 2020.
-
Robust Autocalibrated Structured Low-Rank EPI Ghost Correction
Authors:
Rodrigo A. Lobos,
W. Scott Hoge,
Ahsan Javed,
Congyu Liao,
Kawin Setsompop,
Krishna S. Nayak,
Justin P. Haldar
Abstract:
Purpose: We propose and evaluate a new structured low-rank method for EPI ghost correction called Robust Autocalibrated LORAKS (RAC-LORAKS). The method can be used to suppress EPI ghosts arising from the differences between different readout gradient polarities and/or the differences between different shots. It does not require conventional EPI navigator signals, and is robust to imperfect autocalibration data.
Methods: Autocalibrated LORAKS is a previous structured low-rank method for EPI ghost correction that uses GRAPPA-type autocalibration data to enable high-quality ghost correction. This method works well when the autocalibration data is pristine, but performance degrades substantially when the autocalibration information is imperfect. RAC-LORAKS generalizes Autocalibrated LORAKS in two ways. First, it does not completely trust the information from autocalibration data, and instead considers the autocalibration and EPI data simultaneously when estimating low-rank matrix structure. And second, it uses complementary information from the autocalibration data to improve EPI reconstruction in a multi-contrast joint reconstruction framework. RAC-LORAKS is evaluated using simulations and in vivo data, including comparisons to state-of-the-art methods.
Results: RAC-LORAKS is demonstrated to have good ghost elimination performance compared to state-of-the-art methods in several complicated EPI acquisition scenarios (including gradient-echo brain imaging, diffusion-encoded brain imaging, and cardiac imaging).
Conclusion: RAC-LORAKS provides effective suppression of EPI ghosts and is robust to imperfect autocalibration data.
Submitted 1 October, 2020; v1 submitted 30 July, 2019;
originally announced July 2019.
-
Accuracy, Uncertainty, and Adaptability of Automatic Myocardial ASL Segmentation using Deep CNN
Authors:
Hung P. Do,
Yi Guo,
Andrew J. Yoon,
Krishna S. Nayak
Abstract:
PURPOSE: To apply deep CNN to the segmentation task in myocardial arterial spin labeled (ASL) perfusion imaging and to develop methods that measure uncertainty and that adapt the CNN model to a specific false positive vs. false negative tradeoff.
METHODS: The Monte Carlo dropout (MCD) U-Net was trained on data from 22 subjects and tested on data from 6 heart transplant recipients. Manual segmentation and regional myocardial blood flow (MBF) were available for comparison. We consider two global uncertainty measures, named Dice Uncertainty and MCD Uncertainty, which were calculated with and without the use of manual segmentation, respectively. Tversky loss function with a hyperparameter $β$ was used to adapt the model to a specific false positive vs. false negative tradeoff.
RESULTS: The MCD U-Net achieved Dice coefficient of mean(std) = 0.91(0.04) on the test set. MBF measured using automatic segmentations was highly correlated to that measured using the manual segmentation ($R^2$ = 0.96). Dice Uncertainty and MCD Uncertainty were in good agreement ($R^2$ = 0.64). As $β$ increased, the false positive rate systematically decreased and false negative rate systematically increased.
CONCLUSION: We demonstrate the feasibility of deep CNN for automatic segmentation of myocardial ASL, with good accuracy. We also introduce two simple methods for assessing model uncertainty. Finally, we demonstrate the ability to adapt the CNN model to a specific false positive vs. false negative tradeoff. These findings are directly relevant to automatic segmentation in quantitative cardiac MRI and are broadly applicable to automatic segmentation problems in diagnostic imaging.
Submitted 4 November, 2019; v1 submitted 10 December, 2018;
originally announced December 2018.
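The role of the Tversky hyperparameter $β$ in the abstract above can be sketched as follows. This is an illustration under an assumed parameterization, not the paper's implementation: $β$ weights false positives and $(1-β)$ weights false negatives, chosen so that a larger $β$ discourages false positives, matching the reported trend. With $β = 0.5$ the index equals the Dice coefficient. The masks and the function names are hypothetical.

```python
import numpy as np

def tversky_index(pred, target, beta=0.5, eps=1e-7):
    """Tversky index for binary masks (assumed weighting: beta on FP, 1-beta on FN)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    tp = np.sum(pred * target)            # true positives
    fp = np.sum(pred * (1.0 - target))    # false positives
    fn = np.sum((1.0 - pred) * target)    # false negatives
    return (tp + eps) / (tp + beta * fp + (1.0 - beta) * fn + eps)

def tversky_loss(pred, target, beta=0.5):
    """Loss to minimize: one minus the Tversky index."""
    return 1.0 - tversky_index(pred, target, beta)

# Hypothetical 6-pixel masks: one over-segmentation and one under-segmentation.
target = np.array([1, 1, 1, 0, 0, 0])
over = np.array([1, 1, 1, 1, 0, 0])    # one false positive
under = np.array([1, 1, 0, 0, 0, 0])   # one false negative

for beta in (0.2, 0.5, 0.8):
    print(beta, tversky_loss(over, target, beta), tversky_loss(under, target, beta))
```

Under this weighting, a high $β$ makes the over-segmented mask costlier than the under-segmented one, and a low $β$ reverses the ordering, which is how a single hyperparameter can steer the false positive vs. false negative tradeoff.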
-
Tracer Kinetic Models as Temporal Constraints during DCE-MRI reconstruction
Authors:
Sajan Goud Lingala,
Yi Guo,
R. Marc Lebel,
Yinghua Zhu,
Yannick Bliesener,
Meng Law,
Krishna S. Nayak
Abstract:
Purpose: To apply tracer kinetic models as temporal constraints during reconstruction of under-sampled dynamic contrast enhanced (DCE) MRI.
Methods: A library of concentration vs. time profiles is simulated for a range of physiological kinetic parameters. The library is reduced to a dictionary of temporal bases, where each profile is approximated by a sparse linear combination of the bases. Image reconstruction is formulated as estimation of concentration profiles and sparse model coefficients with a fixed sparsity level. Simulations are performed to evaluate modeling error, and error statistics in kinetic parameter estimation in the presence of noise. Retrospective under-sampling experiments are performed on a brain tumor DCE digital reference object (DRO) at different signal-to-noise levels (SNR=20-40) at (k-t) space under-sampling factor (R=20), and 12 brain tumor in vivo 3T datasets at (R=20-40). The approach is compared against an existing compressed sensing based temporal finite-difference (tFD) reconstruction approach.
Results: Simulations demonstrate that sparsity levels of 2 and 3 model the library profiles from the Patlak and extended Tofts-Kety (ETK) models, respectively. Noise sensitivity analysis showed equivalent kinetic parameter estimation error statistics from noisy concentration profiles and model-approximated profiles. DRO-based experiments showed good fidelity in recovery of kinetic maps from 20-fold under-sampled data at SNRs between 10-30. In vivo experiments demonstrated reduced bias and uncertainty in kinetic mapping with the proposed approach compared to tFD at R>=20.
Conclusions: Tracer kinetic models can be applied as temporal constraints during DCE-MRI reconstruction, enabling more accurate reconstruction from under-sampled data. The approach is flexible, can use several kinetic models, and does not require tuning of regularization parameters.
Submitted 24 July, 2017;
originally announced July 2017.
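The library-of-profiles idea in the abstract above can be illustrated with a small sketch. It is not the paper's dictionary construction. It only simulates a Patlak library using the standard Patlak form, C(t) = K^trans ∫ Cp dτ + v_p Cp(t), and checks its rank with an SVD. The arterial input function, time grid, and parameter ranges are hypothetical.

```python
import numpy as np

# Hypothetical time grid (minutes) and bi-exponential arterial input function.
t = np.linspace(0.0, 5.0, 50)
cp = 5.0 * (np.exp(-0.5 * t) - np.exp(-4.0 * t))

# Cumulative trapezoidal integral of cp, starting at zero.
steps = 0.5 * (cp[1:] + cp[:-1]) * np.diff(t)
cp_int = np.concatenate(([0.0], np.cumsum(steps)))

def patlak(ktrans, vp):
    """Standard Patlak tissue concentration: linear in ktrans and vp."""
    return ktrans * cp_int + vp * cp

# Library of profiles over assumed (not paper-specified) parameter ranges.
ktrans_vals = np.linspace(0.01, 0.3, 20)
vp_vals = np.linspace(0.01, 0.2, 20)
library = np.array([patlak(k, v) for k in ktrans_vals for v in vp_vals])

# Numerical rank: count singular values above a relative tolerance.
s = np.linalg.svd(library, compute_uv=False)
rank = int(np.sum(s > 1e-10 * s[0]))
print(library.shape, rank)
```

Because every noiseless Patlak profile is a combination of the same two time courses, the simulated library has numerical rank 2, which is consistent with the sparsity level of 2 reported for the Patlak model.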