Search | arXiv e-print repository

doi 10.1145/3643795.3648395

PromptSet: A Programmer's Prompting Dataset

Authors: Kaiser Pister, Dhruba Jyoti Paul, Patrick Brophy, Ishan Joshi

Abstract: The rise of capabilities expressed by large language models has been quickly followed by the integration of the same complex systems into application level logic. Algorithms, programs, systems, and companies are built around structured prompting to black box models where the majority of the design and implementation lies in capturing and quantifying the `agent mode'. The standard way to shape a closed language model is to prime it for a specific task with a tailored prompt, often initially handwritten by a human. The textual prompts co-evolve with the codebase, taking shape over the course of project life as artifacts which must be reviewed and maintained, just as the traditional code files might be. Unlike traditional code, we find that prompts do not receive effective static testing and linting to prevent runtime issues. In this work, we present a novel dataset called PromptSet, with more than 61,000 unique developer prompts used in open source Python programs. We perform analysis on this dataset and introduce the notion of a static linter for prompts. Released with this publication is a HuggingFace dataset and a Github repository to recreate collection and processing efforts, both under the name \texttt{pisterlabs/promptset}.

Submitted 26 February, 2024; originally announced February 2024.

Comments: 8 pages, ICSE '24 LLM4Code Workshop

arXiv:2306.14169 [pdf]

A Web-based Mpox Skin Lesion Detection System Using State-of-the-art Deep Learning Models Considering Racial Diversity

Authors: Shams Nafisa Ali, Md. Tazuddin Ahmed, Tasnim Jahan, Joydip Paul, S. M. Sakeef Sani, Nawsabah Noor, Anzirun Nahar Asma, Taufiq Hasan

Abstract: The recent 'Mpox' outbreak, formerly known as 'Monkeypox', has become a significant public health concern and has spread to over 110 countries globally. The challenge of clinically diagnosing mpox early on is due, in part, to its similarity to other types of rashes. Computer-aided screening tools have been proven valuable in cases where Polymerase Chain Reaction (PCR) based diagnosis is not immediately available. Deep learning methods are powerful in learning complex data representations, but their efficacy largely depends on adequate training data. To address this challenge, we present the "Mpox Skin Lesion Dataset Version 2.0 (MSLD v2.0)" as a follow-up to the previously released openly accessible dataset, one of the first datasets containing mpox lesion images. This dataset contains images of patients with mpox and five other non-mpox classes (chickenpox, measles, hand-foot-mouth disease, cowpox, and healthy). We benchmark the performance of several state-of-the-art deep learning models, including VGG16, ResNet50, DenseNet121, MobileNetV2, EfficientNetB3, InceptionV3, and Xception, to classify mpox and other infectious skin diseases. In order to reduce the impact of racial bias, we utilize a color space data augmentation method to increase skin color variability during training. Additionally, by leveraging transfer learning implemented with pre-trained weights generated from the HAM10000 dataset, an extensive collection of pigmented skin lesion images, we achieved the best overall accuracy of $83.59\pm2.11\%$. Finally, the developed models are incorporated within a prototype web application to analyze uploaded skin images by a user and determine whether a subject is a suspected mpox patient.

Submitted 25 June, 2023; originally announced June 2023.

arXiv:2302.09657 [pdf]

Table Tennis Stroke Detection and Recognition Using Ball Trajectory Data

Authors: Kaustubh Milind Kulkarni, Rohan S Jamadagni, Jeffrey Aaron Paul, Sucheth Shenoy

Abstract: In this work, the novel task of detecting and classifying table tennis strokes solely using the ball trajectory has been explored. A single camera setup positioned in the umpire's view has been employed to procure a dataset consisting of six stroke classes executed by four professional table tennis players. Ball tracking using YOLOv4, a traditional object detection model, and TrackNetv2, a temporal heatmap based model, have been implemented on our dataset and their performances have been benchmarked. A mathematical approach developed to extract temporal boundaries of strokes using the ball trajectory data yielded a total of 2023 valid strokes in our dataset, while also detecting services and missed strokes successfully. The temporal convolutional network developed performed stroke recognition on completely unseen data with an accuracy of 87.155%. Several machine learning and deep learning based model architectures have been trained for stroke recognition using ball trajectory input and benchmarked based on their performances. While stroke recognition in the field of table tennis has been extensively explored based on human action recognition using video data focused on the player's actions, the use of ball trajectory data for the same is an unexplored characteristic of the sport. Hence, the motivation behind the work is to demonstrate that meaningful inferences such as stroke detection and recognition can be drawn using minimal input information.

Submitted 19 February, 2023; originally announced February 2023.

Comments: 9 pages, 5 figures, 6 tables

arXiv:2207.03342 [pdf, other]

Monkeypox Skin Lesion Detection Using Deep Learning Models: A Feasibility Study

Authors: Shams Nafisa Ali, Md. Tazuddin Ahmed, Joydip Paul, Tasnim Jahan, S. M. Sakeef Sani, Nawsabah Noor, Taufiq Hasan

Abstract: The recent monkeypox outbreak has become a public health concern due to its rapid spread in more than 40 countries outside Africa. Clinical diagnosis of monkeypox in an early stage is challenging due to its similarity with chickenpox and measles. In cases where the confirmatory Polymerase Chain Reaction (PCR) tests are not readily available, computer-assisted detection of monkeypox lesions could be beneficial for surveillance and rapid identification of suspected cases. Deep learning methods have been found effective in the automated detection of skin lesions, provided that sufficient training examples are available. However, as of now, such datasets are not available for the monkeypox disease. In the current study, we first develop the ``Monkeypox Skin Lesion Dataset (MSLD)" consisting skin lesion images of monkeypox, chickenpox, and measles. The images are mainly collected from websites, news portals, and publicly accessible case reports. Data augmentation is used to increase the sample size, and a 3-fold cross-validation experiment is set up. In the next step, several pre-trained deep learning models, namely, VGG-16, ResNet50, and InceptionV3 are employed to classify monkeypox and other diseases. An ensemble of the three models is also developed. ResNet50 achieves the best overall accuracy of $82.96(\pm4.57\%)$, while VGG16 and the ensemble system achieved accuracies of $81.48(\pm6.87\%)$ and $79.26(\pm1.05\%)$, respectively. A prototype web-application is also developed as an online monkeypox screening tool. While the initial results on this limited dataset are promising, a larger demographically diverse dataset is required to further enhance the generalizability of these models.

Submitted 6 July, 2022; originally announced July 2022.

Comments: 4 pages, 6 figures, conference

arXiv:2110.04291 [pdf, other]

doi 10.1016/j.knosys.2022.108453

Local and Global Context-Based Pairwise Models for Sentence Ordering

Authors: Ruskin Raj Manku, Aditya Jyoti Paul

Abstract: Sentence Ordering refers to the task of rearranging a set of sentences into the appropriate coherent order. For this task, most previous approaches have explored global context-based end-to-end methods using Sequence Generation techniques. In this paper, we put forward a set of robust local and global context-based pairwise ordering strategies, leveraging which our prediction strategies outperform all previous works in this domain. Our proposed encoding method utilizes the paragraph's rich global contextual information to predict the pairwise order using novel transformer architectures. Analysis of the two proposed decoding strategies helps better explain error propagation in pairwise models. This approach is the most accurate pure pairwise model and our encoding strategy also significantly improves the performance of other recent approaches that use pairwise models, including the previous state-of-the-art, demonstrating the research novelty and generalizability of this work. Additionally, we show how the pre-training task for ALBERT helps it to significantly outperform BERT, despite having considerably lesser parameters. The extensive experimental results, architectural analysis and ablation studies demonstrate the effectiveness and superiority of the proposed models compared to the previous state-of-the-art, besides providing a much better understanding of the functioning of pairwise models.

Submitted 20 August, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

Comments: This is a post-print of an article published in Knowledge-Based Systems. For the journal-typeset version, please see https://www.sciencedirect.com/science/article/abs/pii/S0950705122001873?via%3Dihub

ACM Class: I.2.7; H.3.3; H.3.1

Journal ref: Knowledge-Based Systems, Volume 243, May 2022, 108453

arXiv:2108.00981 [pdf, other]

PSA-GAN: Progressive Self Attention GANs for Synthetic Time Series

Authors: Jeha Paul, Bohlke-Schneider Michael, Mercado Pedro, Kapoor Shubham, Singh Nirwan Rajbir, Flunkert Valentin, Gasthaus Jan, Januschowski Tim

Abstract: Realistic synthetic time series data of sufficient length enables practical applications in time series modeling tasks, such as forecasting, but remains a challenge. In this paper we present PSA-GAN, a generative adversarial network (GAN) that generates long time series samples of high quality using progressive growing of GANs and self-attention. We show that PSA-GAN can be used to reduce the error in two downstream forecasting tasks over baselines that only use real data. We also introduce a Frechet-Inception Distance-like score, Context-FID, assessing the quality of synthetic time series samples. In our downstream tasks, we find that the lowest scoring models correspond to the best-performing ones. Therefore, Context-FID could be a useful tool to develop time series GAN models.

Submitted 28 March, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

arXiv:2107.14070 [pdf]

Machine Learning Advances aiding Recognition and Classification of Indian Monuments and Landmarks

Authors: Aditya Jyoti Paul, Smaranjit Ghose, Kanishka Aggarwal, Niketha Nethaji, Shivam Pal, Arnab Dutta Purkayastha

Abstract: Tourism in India plays a quintessential role in the country's economy with an estimated 9.2% GDP share for the year 2018. With a yearly growth rate of 6.2%, the industry holds a huge potential for being the primary driver of the economy as observed in the nations of the Middle East like the United Arab Emirates. The historical and cultural diversity exhibited throughout the geography of the nation is a unique spectacle for people around the world and therefore serves to attract tourists in tens of millions in number every year. Traditionally, tour guides or academic professionals who study these heritage monuments were responsible for providing information to the visitors regarding their architectural and historical significance. However, unfortunately this system has several caveats when considered on a large scale such as unavailability of sufficient trained people, lack of accurate information, failure to convey the richness of details in an attractive format etc. Recently, machine learning approaches revolving around the usage of monument pictures have been shown to be useful for rudimentary analysis of heritage sights. This paper serves as a survey of the research endeavors undertaken in this direction which would eventually provide insights for building an automated decision system that could be utilized to make the experience of tourism in India more modernized for visitors.

Submitted 29 July, 2021; originally announced July 2021.

Comments: Currently under review

arXiv:2107.14061 [pdf]

The Need and Status of Sea Turtle Conservation and Survey of Associated Computer Vision Advances

Authors: Aditya Jyoti Paul

Abstract: For over hundreds of millions of years, sea turtles and their ancestors have swum in the vast expanses of the ocean. They have undergone a number of evolutionary changes, leading to speciation and sub-speciation. However, in the past few decades, some of the most notable forces driving the genetic variance and population decline have been global warming and anthropogenic impact ranging from large-scale poaching, collecting turtle eggs for food, besides dumping trash including plastic waste into the ocean. This leads to severe detrimental effects in the sea turtle population, driving them to extinction. This research focusses on the forces causing the decline in sea turtle population, the necessity for the global conservation efforts along with its successes and failures, followed by an in-depth analysis of the modern advances in detection and recognition of sea turtles, involving Machine Learning and Computer Vision systems, aiding the conservation efforts.

Submitted 29 July, 2021; originally announced July 2021.

Comments: Currently under review

arXiv:2107.00615 [pdf]

A linear phase evolution model for reduction of temporal unwrapping and field estimation errors in multi-echo GRE

Authors: Joseph Suresh Paul, Sreekanth Madhusoodhanan

Abstract: This article aims at developing a model based optimization for reduction of temporal unwrapping and field estimation errors in multi-echo acquisition of Gradient Echo sequence. Using the assumption that the phase is linear along the temporal dimension, the field estimation is performed by application of unity rank approximation to the Hankel matrix formed using the complex exponential of the channel combined phase at each echo time. For the purpose of maintaining consistency with the observed complex data, the linear phase evolution model is formulated as an optimization problem with a cost function that involves a fidelity term and a unity rank prior, implemented using alternating minimization. Itoh's algorithm applied to the multi-echo phase estimated from this linear phase evolution model is able to reduce the unwrapping errors as compared to the unwrapping when directly applied to the measured phase. Secondly, the improved accuracy of the frequency fit in comparison to estimation using weighted least-square regression and penalized maximum likelihood is demonstrated using numerical simulation of field perturbation due to magnetic susceptibility effect. It is shown that the field can be estimated with 80 percent reduction in mean absolute error in comparison to wLSR and 66 percent reduction with respect to penalized maximum likelihood. The improvement in performance becomes more pronounced with increasing strengths of field gradient magnitudes and echo spacing.

Submitted 26 June, 2021; originally announced July 2021.

Comments: 29pages, 8 figures

ACM Class: J.2

arXiv:2106.15472 [pdf]

Robust Multi-echo GRE Phase processing using a unity rank enforced complex exponential model

Authors: Joseph Suresh Paul, Sreekanth Madhusoodhanan

Abstract: Purpose: Develop a processing scheme for Gradient Echo (GRE) phase to enable restoration of susceptibility-related (SuR) features in regions affected by imperfect phase unwrapping, background suppression and low signal-to-noise ratio (SNR) due to phase dispersion. Theory and Methods: The predictable components sampled across the echo dimension in a multi-echo GRE sequence are recovered by rank minimizing a Hankel matrix formed using the complex exponential of the background suppressed phase. To estimate the single frequency component that relates to the susceptibility induced field, it is required to maintain consistency with the measured phase after background suppression, penalized by a unity rank approximation (URA) prior. This is formulated as an optimization problem, implemented using the alternating direction method of multiplier (ADMM). Results: With in vivo multi-echo GRE data, the magnitude susceptibility weighted image (SWI) reconstructed using URA prior shows additional venous structures that are obscured due to phase dispersion and noise in regions subject to remnant non-local field variations. The performance is compared with the susceptibility map weighted imaging (SMWI) and the standard SWI. It is also shown using numerical simulation that quantitative susceptibility map (QSM) computed from the reconstructed phase exhibits reduced artifacts and quantification error. In vivo experiments reveal iron depositions in insular, motor cortex and superior frontal gyrus that are not identified in standard QSM. Conclusion: URA processed GRE phase is less sensitive to imperfections in the phase pre-processing techniques, and thereby enable robust estimation of SWI and QSM.

Submitted 26 June, 2021; originally announced June 2021.

Comments: 33 pages, 9 figures

ACM Class: J.2

arXiv:2106.01739 [pdf]

Advances in Classifying the Stages of Diabetic Retinopathy Using Convolutional Neural Networks in Low Memory Edge Devices

Authors: Aditya Jyoti Paul

Abstract: Diabetic Retinopathy (DR) is a severe complication that may lead to retinal vascular damage and is one of the leading causes of vision impairment and blindness. DR broadly is classified into two stages - non-proliferative (NPDR), where there are almost no symptoms, except a few microaneurysms, and proliferative (PDR) involving a huge number of microaneurysms and hemorrhages, soft and hard exudates, neo-vascularization, macular ischemia or a combination of these, making it easier to detect. More specifically, DR is usually classified into five levels, labeled 0-4, from 0 indicating no DR to 4 which is most severe. This paper firstly presents a discussion on the risk factors of the disease, then surveys the recent literature on the topic followed by examining certain techniques which were found to be highly effective in improving the prognosis accuracy. Finally, a convolutional neural network model is proposed to detect all the stages of DR on a low-memory edge microcontroller. The model has a size of just 5.9 MB, accuracy and F1 score both of 94% and an inference speed of about 20 frames per second.

Submitted 3 June, 2021; originally announced June 2021.

Comments: This paper is currently under review at IEEE MASCON 2021. http://ieeemascon.in

MSC Class: 68T45; 68T10; 68T07; 68U10 ACM Class: I.2.10; I.4.8; I.5.1; J.3; I.4.1; K.4.2

arXiv:2103.14915 [pdf, other]

doi 10.1145/3448016.3457253

Cache-Efficient Fork-Processing Patterns on Large Graphs

Authors: Shengliang Lu, Shixuan Sun, Johns Paul, Yuchen Li, Bingsheng He

Abstract: As large graph processing emerges, we observe a costly fork-processing pattern (FPP) that is common in many graph algorithms. The unique feature of the FPP is that it launches many independent queries from different source vertices on the same graph. For example, an algorithm in analyzing the network community profile can execute Personalized PageRanks that start from tens of thousands of source vertices at the same time. We study the efficiency of handling FPPs in state-of-the-art graph processing systems on multi-core architectures. We find that those systems suffer from severe cache miss penalty because of the irregular and uncoordinated memory accesses in processing FPPs. In this paper, we propose ForkGraph, a cache-efficient FPP processing system on multi-core architectures. To improve the cache reuse, we divide the graph into partitions each sized of LLC capacity, and the queries in an FPP are buffered and executed on the partition basis. We further develop efficient intra- and inter-partition execution strategies for efficiency. For intra-partition processing, since the graph partition fits into LLC, we propose to execute each graph query with efficient sequential algorithms (in contrast with parallel algorithms in existing parallel graph processing systems) and present an atomic-free query processing by consolidating contending operations to cache-resident graph partition. For inter-partition processing, we propose yielding and priority-based scheduling, to reduce redundant work in processing. Besides, we theoretically prove that ForkGraph performs the same amount of work, to within a constant factor, as the fastest known sequential algorithms in FPP queries processing, which is work efficient. Our evaluations on real-world graphs show that ForkGraph significantly outperforms state-of-the-art graph processing systems with two orders of magnitude speedups.

Submitted 10 April, 2021; v1 submitted 27 March, 2021; originally announced March 2021.

Comments: in SIGMOD 2021

ACM Class: H.2

arXiv:2102.11103 [pdf, other]

User Factor Adaptation for User Embedding via Multitask Learning

Authors: Xiaolei Huang, Michael J. Paul, Robin Burke, Franck Dernoncourt, Mark Dredze

Abstract: Language varies across users and their interested fields in social media data: words authored by a user across his/her interests may have different meanings (e.g., cool) or sentiments (e.g., fast). However, most of the existing methods to train user embeddings ignore the variations across user interests, such as product and movie categories (e.g., drama vs. action). In this study, we treat the user interest as domains and empirically examine how the user language can vary across the user factor in three English social media datasets. We then propose a user embedding model to account for the language variability of user interests via a multitask learning framework. The model learns user language and its variations without human supervision. While existing work mainly evaluated the user embedding by extrinsic tasks, we propose an intrinsic evaluation via clustering and evaluate user embeddings by an extrinsic task, text classification. The experiments on the three English-language social media datasets show that our proposed approach can generally outperform baselines via adapting the user factor.

Submitted 22 February, 2021; originally announced February 2021.

Comments: Accepted in the Second Workshop on Domain Adaptation for Natural Language Processing (Adapted-NLP)

arXiv:2012.04156 [pdf]

doi 10.1109/RAICS51191.2020.9332470

An Efficient Analyses of the Behavior of One Dimensional Chaotic Maps using 0-1 Test and Three State Test

Authors: Joan S. Muthu, Aditya Jyoti Paul, P. Murali

Abstract: In this paper, a rigorous analysis of the behavior of the standard logistic map, Logistic Tent system (LTS), Logistic-Sine system (LSS) and Tent-Sine system (TSS) is performed using 0-1 test and three state test (3ST). In this work, it has been proved that the strength of the chaotic behavior is not uniform. Through extensive experiment and analysis, the strong and weak chaotic regions of LTS, LSS and TSS have been identified. This would enable researchers using these maps, to have better choices of control parameters as key values, for stronger encryption. In addition, this paper serves as a precursor to stronger testing practices in cryptosystem research, as Lyapunov exponent alone has been shown to fail as a true representation of the chaotic nature of a map.

Submitted 13 February, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

Comments: 6 pages, Published in IEEE RAICS 2020, see https://www.raics.in

MSC Class: 37H20; 34F10; 34H10; 49J15; 49K15; 47J15 ACM Class: G.1.0; G.1.2; G.1.3; G.2.3; G.4; C.3; E.3; I.6.4

Journal ref: 2020 IEEE Recent Advances in Intelligent Computational Systems (RAICS), 2020, pp. 125-130

arXiv:2011.14858 [pdf, other]

doi 10.1007/978-981-16-0749-3_52

A Tiny CNN Architecture for Medical Face Mask Detection for Resource-Constrained Endpoints

Authors: Puranjay Mohan, Aditya Jyoti Paul, Abhay Chirania

Abstract: The world is going through one of the most dangerous pandemics of all time with the rapid spread of the novel coronavirus (COVID-19). According to the World Health Organisation, the most effective way to thwart the transmission of coronavirus is to wear medical face masks. Monitoring the use of face masks in public places has been a challenge because manual monitoring could be unsafe. This paper proposes an architecture for detecting medical face masks for deployment on resource-constrained endpoints having extremely low memory footprints. A small development board with an ARM Cortex-M7 microcontroller clocked at 480 Mhz and having just 496 KB of framebuffer RAM, has been used for the deployment of the model. Using the TensorFlow Lite framework, the model is quantized to further reduce its size. The proposed model is 138 KB post quantization and runs at the inference speed of 30 FPS.

Submitted 3 June, 2021; v1 submitted 30 November, 2020; originally announced November 2020.

Comments: 11 pages, Published in Springer LNEE at http://link.springer.com/chapter/10.1007%2F978-981-16-0749-3_52

MSC Class: 68T45; 68T10; 68T07; 68U10 ACM Class: C.3; I.2.6; I.2.10; I.4.9; I.5.1; I.5.2; I.5.4; I.5.5; K.4.1; K.4.3

Journal ref: Innovations in Electrical and Electronic Engineering. Lecture Notes in Electrical Engineering, vol 756, pp 657-670, Springer, Singapore, 2021

arXiv:2011.13741 [pdf]

doi 10.1109/RAICS51191.2020.9332480

Rethinking Generalization in American Sign Language Prediction for Edge Devices with Extremely Low Memory Footprint

Authors: Aditya Jyoti Paul, Puranjay Mohan, Stuti Sehgal

Abstract: Due to the boom in technical compute in the last few years, the world has seen massive advances in artificially intelligent systems solving diverse real-world problems. But a major roadblock in the ubiquitous acceptance of these models is their enormous computational complexity and memory footprint. Hence efficient architectures and training techniques are required for deployment on extremely low resource inference endpoints. This paper proposes an architecture for detection of alphabets in American Sign Language on an ARM Cortex-M7 microcontroller having just 496 KB of framebuffer RAM. Leveraging parameter quantization is a common technique that might cause varying drops in test accuracy. This paper proposes using interpolation as augmentation amongst other techniques as an efficient method of reducing this drop, which also helps the model generalize well to previously unseen noisy data. The proposed model is about 185 KB post-quantization and inference speed is 20 frames per second.

Submitted 13 February, 2021; v1 submitted 27 November, 2020; originally announced November 2020.

Comments: 6 pages, Published in IEEE RAICS 2020, see https://raics.in

MSC Class: 68T45; 68T10; 68T07; 68U10 ACM Class: I.2.10; I.4.8; I.5.1; J.3; I.4.1; K.4.2

Journal ref: 2020 IEEE Recent Advances in Intelligent Computational Systems (RAICS), 2020, pp. 147-152

arXiv:2011.13740 [pdf]

doi 10.1109/RAICS51191.2020.9332513

Recent Advances in Selective Image Encryption and its Indispensability due to COVID-19

Authors: Aditya Jyoti Paul

Abstract: The COVID-19 pandemic serves as a grim reminder of the unexpected nature of these outbreaks and gives rise to a unique set of research challenges in a variety of fields. As people all over the world adjust to this new 'normal', with most workplaces, from companies to educational institutions shifting online, enormous surges in the transmission of images and videos have been observed, creating record-breaking stresses on the internet backbone. At the same time, maintaining the privacy and security of the users' data is of immense importance, this is where fast and efficient image encryption algorithms play a vital role. This paper discusses the calamitous effects of the pandemic on the world population and how their changes in multimedia consumption have led to an urgent need for the development and deployment of secure and fast image encryption, especially selective image encryption techniques. It carefully surveys the most recent advances in this field, discusses their real-world effects and finally explores some future research avenues, to provide swift relief and recover from the disastrous effects of the pandemic.

Submitted 13 February, 2021; v1 submitted 27 November, 2020; originally announced November 2020.

Comments: 6 pages, Published in IEEE RAICS 2020, see https://raics.in

MSC Class: 68P25; 68P30; 94A60; 91C99; 68W27 ACM Class: E.3; I.4; J.3; J.4; K.4

Journal ref: 2020 IEEE Recent Advances in Intelligent Computational Systems (RAICS), 2020, pp. 201-206

arXiv:2009.11225 [pdf]

doi 10.1049/ccs.2020.0018

Randomized fast no-loss expert system to play tic tac toe like a human

Authors: Aditya Jyoti Paul

Abstract: This paper introduces a blazingly fast, no-loss expert system for Tic Tac Toe using Decision Trees called T3DT, that tries to emulate human gameplay as closely as possible. It does not make use of any brute force, minimax or evolutionary techniques, but is still always unbeatable. In order to make the gameplay more human-like, randomization is prioritized and T3DT randomly chooses one of the multiple optimal moves at each step. Since it does not need to analyse the complete game tree at any point, T3DT is exceptionally faster than any brute force or minimax algorithm, this has been shown theoretically as well as empirically from clock-time analyses in this paper. T3DT also doesn't need the data sets or the time to train an evolutionary model, making it a practical no-loss approach to play Tic Tac Toe.

Submitted 10 November, 2020; v1 submitted 23 September, 2020; originally announced September 2020.

Comments: Author's version of the paper published in IET Cognitive Computation and Systems. For the journal-typeset version, please see https://doi.org/10.1049/ccs.2020.0018

MSC Class: 68T37; 68T35; 68T27; 68T30; 68T20 ACM Class: I.2.1; I.2.3; I.2.4; I.2.8; I.6.3; I.6.4

Journal ref: Cognitive Computation and Systems, Volume 2, Issue 4, December 2020, pp. 231 - 241

arXiv:2005.00524 [pdf, other]

Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries

Authors: Mozhi Zhang, Yoshinari Fujinuma, Michael J. Paul, Jordan Boyd-Graber

Abstract: Cross-lingual word embeddings (CLWE) are often evaluated on bilingual lexicon induction (BLI). Recent CLWE methods use linear projections, which underfit the training dictionary, to generalize on BLI. However, underfitting can hinder generalization to other downstream tasks that rely on words from the training dictionary. We address this limitation by retrofitting CLWE to the training dictionary, which pulls training translation pairs closer in the embedding space and overfits the training dictionary. This simple post-processing step often improves accuracy on two downstream tasks, despite lowering BLI test accuracy. We also retrofit to both the training dictionary and a synthetic dictionary induced from CLWE, which sometimes generalizes even better on downstream tasks. Our results confirm the importance of fully exploiting training dictionary in downstream tasks and explains why BLI is a flawed CLWE evaluation.

Submitted 1 May, 2020; originally announced May 2020.

Comments: ACL 2020

arXiv:2003.11531 [pdf, other]

The Medical Scribe: Corpus Development and Model Performance Analyses

Authors: Izhak Shafran, Nan Du, Linh Tran, Amanda Perry, Lauren Keyes, Mark Knichel, Ashley Domin, Lei Huang, Yuhui Chen, Gang Li, Mingqiu Wang, Laurent El Shafey, Hagen Soltau, Justin S. Paul

Abstract: There is a growing interest in creating tools to assist in clinical note generation using the audio of provider-patient encounters. Motivated by this goal and with the help of providers and medical scribes, we developed an annotation scheme to extract relevant clinical concepts. We used this annotation scheme to label a corpus of about 6k clinical encounters. This was used to train a state-of-the-art tagging model. We report ontologies, labeling results, model performances, and detailed analyses of the results. Our results show that the entities related to medications can be extracted with a relatively high accuracy of 0.90 F-score, followed by symptoms at 0.72 F-score, and conditions at 0.57 F-score. In our task, we not only identify where the symptoms are mentioned but also map them to canonical forms as they appear in the clinical notes. Of the different types of errors, in about 19-38% of the cases, we find that the model output was correct, and about 17-32% of the errors do not impact the clinical note. Taken together, the models developed in this work are more useful than the F-scores reflect, making it a promising approach for practical applications.

Submitted 11 March, 2020; originally announced March 2020.

Comments: Extended version of the paper accepted at LREC 2020

Journal ref: Proceedings of Language Resources and Evaluation, 2020

arXiv:2002.10361 [pdf, other]

Multilingual Twitter Corpus and Baselines for Evaluating Demographic Bias in Hate Speech Recognition

Authors: Xiaolei Huang, Linzi Xing, Franck Dernoncourt, Michael J. Paul

Abstract: Existing research on fairness evaluation of document classification models mainly uses synthetic monolingual data without ground truth for author demographic attributes. In this work, we assemble and publish a multilingual Twitter corpus for the task of hate speech detection with inferred four author demographic factors: age, country, gender and race/ethnicity. The corpus covers five languages: English, Italian, Polish, Portuguese and Spanish. We evaluate the inferred demographic labels with a crowdsourcing platform, Figure Eight. To examine factors that can cause biases, we take an empirical analysis of demographic predictability on the English corpus. We measure the performance of four popular document classifiers and evaluate the fairness and bias of the baseline classifiers on the author-level demographic attributes.

Submitted 3 March, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

Comments: Accepted at LREC 2020

arXiv:1912.04219 [pdf, other]

FaultNet: Faulty Rail-Valves Detection using Deep Learning and Computer Vision

Authors: Ramanpreet Singh Pahwa, Jin Chao, Jestine Paul, Yiqun Li, Ma Tin Lay Nwe, Shudong Xie, Ashish James, Arulmurugan Ambikapathi, Zeng Zeng, Vijay Ramaseshan Chandrasekhar

Abstract: Regular inspection of rail valves and engines is an important task to ensure the safety and efficiency of railway networks around the globe. Over the past decade, computer vision and pattern recognition based techniques have gained traction for such inspection and defect detection tasks. An automated end-to-end trained system can potentially provide a low-cost, high throughput, and cheap alternative to manual visual inspection of these components. However, such systems require a huge amount of defective images for networks to understand complex defects. In this paper, a multi-phase deep learning based technique is proposed to perform accurate fault detection of rail-valves. Our approach uses a two-step method to perform high precision image segmentation of rail-valves resulting in pixel-wise accurate segmentation. Thereafter, a computer vision technique is used to identify faulty valves. We demonstrate that the proposed approach results in improved detection performance when compared to current state-of-the-art techniques used in fault detection.

Submitted 8 November, 2019; originally announced December 2019.

Comments: 8 pages, 8 figures, ITSC 2019

Journal ref: IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE - ITSC 2019

arXiv:1909.06283 [pdf, other]

Toward Automated Quest Generation in Text-Adventure Games

Authors: Prithviraj Ammanabrolu, William Broniec, Alex Mueller, Jeremy Paul, Mark O. Riedl

Abstract: Interactive fictions, or text-adventures, are games in which a player interacts with a world entirely through textual descriptions and text actions. Text-adventure games are typically structured as puzzles or quests wherein the player must execute certain actions in a certain order to succeed. In this paper, we consider the problem of procedurally generating a quest, defined as a series of actions required to progress towards a goal, in a text-adventure game. Quest generation in text environments is challenging because they must be semantically coherent. We present and evaluate two quest generation techniques: (1) a Markov model, and (2) a neural generative model. We specifically look at generating quests about cooking and train our models on recipe data. We evaluate our techniques with human participant studies looking at perceived creativity and coherence.

Submitted 19 August, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

Comments: In Proceedings of the International Conference on Computational Creativity (ICCC-20)

arXiv:1909.03524 [pdf, other]

Evaluating Topic Quality with Posterior Variability

Authors: Linzi Xing, Michael J. Paul, Giuseppe Carenini

Abstract: Probabilistic topic models such as latent Dirichlet allocation (LDA) are popularly used with Bayesian inference methods such as Gibbs sampling to learn posterior distributions over topic model parameters. We derive a novel measure of LDA topic quality using the variability of the posterior distributions. Compared to several existing baselines for automatic topic evaluation, the proposed metric achieves state-of-the-art correlations with human judgments of topic quality in experiments on three corpora. We additionally demonstrate that topic quality estimation can be further improved using a supervised estimator that combines multiple metrics.

Submitted 15 September, 2019; v1 submitted 8 September, 2019; originally announced September 2019.

Comments: 8 pages

arXiv:1906.01926 [pdf, other]

doi 10.18653/v1/P19-1489

A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity

Authors: Yoshinari Fujinuma, Jordan Boyd-Graber, Michael J. Paul

Abstract: Cross-lingual word embeddings encode the meaning of words from different languages into a shared low-dimensional space. An important requirement for many downstream tasks is that word similarity should be independent of language - i.e., word vectors within one language should not be more similar to each other than to words in another language. We measure this characteristic using modularity, a network measurement that measures the strength of clusters in a graph. Modularity has a moderate to strong correlation with three downstream tasks, even though modularity is based only on the structure of embeddings and does not require any external resources. We show through experiments that modularity can serve as an intrinsic validation metric to improve unsupervised cross-lingual word embeddings, particularly on distant language pairs in low-resource settings.

Submitted 5 June, 2019; originally announced June 2019.

Comments: Accepted to ACL 2019, camera-ready

arXiv:1810.05867 [pdf, other]

An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models

Authors: Shudong Hao, Michael J. Paul

Abstract: Probabilistic topic modeling is a popular choice as the first step of crosslingual tasks to enable knowledge transfer and extract multilingual features. While many multilingual topic models have been developed, their assumptions on the training corpus are quite varied, and it is not clear how well the models can be applied under various training conditions. In this paper, we systematically study the knowledge transfer mechanisms behind different multilingual topic models, and through a broad set of experiments with four models on ten languages, we provide empirical insights that can inform the selection and future development of multilingual topic models.

Submitted 10 June, 2019; v1 submitted 13 October, 2018; originally announced October 2018.

arXiv:1809.06665 [pdf]

Compressed Sensing Parallel MRI with Adaptive Shrinkage TV Regularization

Authors: Raji Susan Mathew, Joseph Suresh Paul

Abstract: Compressed sensing (CS) methods in magnetic resonance imaging (MRI) offer rapid acquisition and improved image quality but require iterative reconstruction schemes with regularization to enforce sparsity. Regardless of the difficulty in obtaining a fast numerical solution, the total variation (TV) regularization is a preferred choice due to its edge-preserving and structure recovery capabilities. While many approaches have been proposed to overcome the non-differentiability of the TV cost term, an iterative shrinkage based formulation allows recovering an image through recursive application of linear filtering and soft thresholding. However, providing an optimal setting for the regularization parameter is critical due to its direct impact on the rate of convergence as well as steady state error. In this paper, a regularizer adaptively varying in the derivative space is proposed, that follows the generalized discrepancy principle (GDP). The implementation proceeds by adaptively reducing the discrepancy level expressed as the absolute difference between TV norms of the consistency error and the sparse approximation error. A criterion based on the absolute difference between TV norms of consistency and sparse approximation errors is used to update the threshold. Application of the adaptive shrinkage TV regularizer to CS recovery of parallel MRI (pMRI) and temporal gradient adaptation in dynamic MRI are shown to result in improved image quality with accelerated convergence. In addition, the adaptive TV-based iterative shrinkage (ATVIS) provides a significant speed advantage over the fast iterative shrinkage-thresholding algorithm (FISTA).

Submitted 18 September, 2018; originally announced September 2018.

Comments: 27 pages, 9 figures

arXiv:1806.04270 [pdf, other]

Learning Multilingual Topics from Incomparable Corpus

Authors: Shudong Hao, Michael J. Paul

Abstract: Multilingual topic models enable crosslingual tasks by extracting consistent topics from multilingual corpora. Most models require parallel or comparable training corpora, which limits their ability to generalize. In this paper, we first demystify the knowledge transfer mechanism behind multilingual topic models by defining an alternative but equivalent formulation. Based on this analysis, we then relax the assumption of training data required by most existing models, creating a model that only requires a dictionary for training. Experiments show that our new method effectively learns coherent multilingual topics from partially and fully incomparable corpora with limited amounts of dictionary resources.

Submitted 11 June, 2018; originally announced June 2018.

Comments: To appear in International Conference on Computational Linguistics (COLING), 2018

arXiv:1804.10184 [pdf, other]

Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation

Authors: Shudong Hao, Jordan Boyd-Graber, Michael J. Paul

Abstract: Multilingual topic models enable document analysis across languages through coherent multilingual summaries of the data. However, there is no standard and effective metric to evaluate the quality of multilingual topics. We introduce a new intrinsic evaluation of multilingual topic models that correlates well with human judgments of multilingual topic coherence as well as performance in downstream applications. Importantly, we also study evaluation for low-resource languages. Because standard metrics fail to accurately measure topic quality when robust external resources are unavailable, we propose an adaptation model that improves the accuracy and reliability of these metrics in low-resource settings.

Submitted 26 April, 2018; originally announced April 2018.

Comments: North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), New Orleans, Louisiana. June 2018

arXiv:1610.09498 [pdf]

A MAP-MRF filter for phase-sensitive coil combination in autocalibrating partially parallel susceptibility weighted MRI

Authors: Sreekanth Madhusoodhanan, Joseph Suresh Paul

Abstract: A statistical approach for combination of channel phases is developed for optimizing the Contrast-to-Noise Ratio (CNR) in Susceptibility Weighted Images (SWI) acquired using autocalibrating partially parallel techniques. The unwrapped phase images of each coil are filtered using local random field based probabilistic weights, derived using energy functions representative of noisy sensitivity and tissue information pertaining to venous structure in the individual channel phase images. The channel energy functions are obtained as functions of local image intensities, first or second order clique phase difference and a threshold scaling parameter dependent on the input noise level. Whereas the expectation of the individual energy functions with respect to the noise distribution in clique phase differences is to be maximized for optimal filtering, the expectation of tissue energy function decreases and noise energy function increases with increase in threshold scale parameter. The optimum scaling parameter is shown to occur at the point where expectations of both energy functions contribute to the largest possible extent. It is shown that implementation of the filter in the same lines as that of Iterated Conditional Modes (ICM) algorithm provides structural enhancement in the coil combined phase, with reduced noise amplification. Application to simulated and in vivo multi-channel SWI shows that CNR of combined phase obtained using MAP-MRF filter is higher as compared to that of coil combination using weighted average.

Submitted 29 October, 2016; originally announced October 2016.

Comments: Submitted to IEEE TMI, At the end of the document the rebuttal is added. Expecting comments from other researchers

arXiv:1507.05243 [pdf]

Hand Gesture Recognition Library

Authors: Jonathan Fidelis Paul, Dibyabiva Seth, Cijo Paul, Jayati Ghosh Dastidar

Abstract: In this paper we have presented a hand gesture recognition library. Various functions include detecting cluster count, cluster orientation, finger pointing direction, etc. To use these functions first the input image needs to be processed into a logical array for which a function has been developed. The library has been developed keeping flexibility in mind and thus provides application developers a wide range of options to develop custom gestures.

Submitted 18 July, 2015; originally announced July 2015.

Journal ref: International Journal of Science and Applied Information Technology, Volume 3, No.2, March - April 2014

arXiv:1501.03320 [pdf]

Image enhancement in intensity projected multichannel MRI using spatially adaptive directional anisotropic diffusion

Authors: P. K. Akshara, J. S. Paul

Abstract: Anisotropic Diffusion is widely used for noise reduction with simultaneous preservation of vascular structures in maximum intensity projected (MIP) angiograms. However, extension to minimum intensity projected (mIP) venograms in Susceptibility Weighted Imaging (SWI) poses difficulties due to spatially varying baseline. Here, we introduce a modified version of the directional anisotropic diffusion which allows us to simultaneously reduce the noise and enhance vascular structures reconstructed using both M/mIP angiograms. This method is based on spatial adaptation of the diffusion function, separately in the directions of the gradient, and along those of the minimum and maximum curvatures. The existing approach of directional anisotropic diffusion uses binary switched diffusion function to ensure diffusion along the direction of maximum curvature stopped near the vessel borders. Here, the choice of a threshold for detecting the upper limit of diffusion becomes difficult in the presence of spatially varying baseline. Also, the approach of using vesselness measure to steer the diffusion process results in structural discontinuities due to junction suppression in mIP. The merits of the proposed method include elimination of the need for an apriori choice of a threshold to detect the vessel, and problems due to junction suppression. The proposed method is also extended to multi-channel phase contrast angiogram.

Submitted 14 January, 2015; originally announced January 2015.

arXiv:1501.03271 [pdf]

Higher dimensional homodyne filtering for suppression of incidental phase artifacts in multichannel MRI

Authors: Joseph Suresh Paul, Uma Krishna Swamy Pillai

Abstract: The aim of this paper is to introduce procedural steps for extension of the 1D homodyne phase correction for k-space truncation in all gradient encoding directions. Compared to the existing method applied to 2D partial k-space, signal losses introduced by the phase correction filter is observed to be minimal for the extended approach. In addition, the modified form of phase correction mitigates Incidental Phase Artifacts (IPA) due to truncation. For parallel imaging with undersampling along phase encode direction, the extended homodyne filtering is shown to be effective for minimizing these artifacts when each of the channel k-spaces are truncated along both phase and frequency encode directions. This is illustrated with 2D partial k-space for flow compensated multichannel Susceptibility Weighted Imaging (SWI). Extension of our method to 3D partial k-space shows improved reconstruction of flow information in phase contrast angiography.

Submitted 14 January, 2015; originally announced January 2015.

arXiv:1405.2911 [pdf]

Resource Prediction for Humanoid Robots

Authors: Manfred Kröhnert, Nikolaus Vahrenkamp, Johny Paul, Walter Stechele, Tamim Asfour

Abstract: Humanoid robots are designed to operate in human centered environments where they execute a multitude of challenging tasks, each differing in complexity, resource requirements, and execution time. In such highly dynamic surroundings it is desirable to anticipate upcoming situations in order to predict future resource requirements such as CPU or memory usage. Resource prediction information is essential for detecting upcoming resource bottlenecks or conflicts and can be used to enhance resource negotiation processes or to perform speculative resource allocation. In this paper we present a prediction model based on Markov chains for predicting the behavior of the humanoid robot ARMAR-III in human robot interaction scenarios. Robot state information required by the prediction algorithm is gathered through self-monitoring and combined with environmental context information. Adding resource profiles allows generating probability distributions of possible future resource demands. Online learning of model parameters is made possible through disclosure mechanisms provided by the robot framework ArmarX.

Submitted 12 May, 2014; originally announced May 2014.

Comments: Presented at 1st Workshop on Resource Awareness and Adaptivity in Multi-Core Computing (Racing 2014) (arXiv:1405.2281)

Report number: Racing/2014/05

arXiv:1405.2908 [pdf]

Resource-Aware Programming for Robotic Vision

Authors: Johny Paul, Walter Stechele, Manfred Kröhnert, Tamim Asfour

Abstract: Humanoid robots are designed to operate in human centered environments. They face changing, dynamic environments in which they need to fulfill a multitude of challenging tasks. Such tasks differ in complexity, resource requirements, and execution time. Latest computer architectures of humanoid robots consist of several industrial PCs containing single- or dual-core processors. According to the SIA roadmap for semiconductors, many-core chips with hundreds to thousands of cores are expected to be available in the next decade. Utilizing the full power of a chip with huge amounts of resources requires new computing paradigms and methodologies. In this paper, we analyze a resource-aware computing methodology named Invasive Computing, to address these challenges. The benefits and limitations of the new programming model is analyzed using two widely used computer vision algorithms, the Harris Corner detector and SIFT (Scale Invariant Feature Transform) feature matching. The result indicate that the new programming model together with the extensions within the application layer, makes them highly adaptable; leading to better quality in the results obtained.

Submitted 12 May, 2014; originally announced May 2014.

Comments: Presented at 1st Workshop on Resource Awareness and Adaptivity in Multi-Core Computing (Racing 2014) (arXiv:1405.2281)

Report number: Racing/2014/02

arXiv:1303.2439 [pdf]

Voxel-wise Weighted MR Image Enhancement using an Extended Neighborhood Filter

Authors: Joseph Suresh Paul, Joshin John Mathew, Souparnika Kandoth Naroth, Chandrasekar Kesavadas

Abstract: We present an edge preserving and denoising filter for enhancing the features in images, which contain an ROI having a narrow spatial extent. Typical examples include angiograms, or ROI spatially distributed in multiple locations and contained within an outlying region, such as in multiple-sclerosis. The filtering involves determination of multiplicative weights in the spatial domain using an extended set of neighborhood directions. Equivalently, the filtering operation may be interpreted as a combination of directional filters in the frequency domain, with selective weighting for spatial frequencies contained within each direction. The advantages of the proposed filter in comparison to specialized non-linear filters, which operate on diffusion principle, are illustrated using numerical phantom data. The performance evaluation is carried out on simulated images from BrainWeb database for multiple-sclerosis, acute ischemic stroke using clinically acquired FLAIR images and MR angiograms.

Submitted 11 March, 2013; originally announced March 2013.

arXiv:1303.2437 [pdf]

Least-Squares FIR Models of Low-Resolution MR data for Efficient Phase-Error Compensation with Simultaneous Artefact Removal

Authors: Joseph Suresh Paul, Uma Krishna Swamy Pillai, Ny** Thomas

Abstract: Signal space models in both phase-encode, and frequency-encode directions are presented for extrapolation of 2D partial kspace. Using the boxcar representation of low-resolution spatial data, and a geometrical representation of signal space vectors in both positive and negative phase-encode directions, a robust predictor is constructed using a series of signal space projections. Compared to some of the existing phase-correction methods that require acquisition of a pre-determined set of fractional kspace lines, the proposed predictor is found to be more efficient, due to its capability of exhibiting an equivalent degree of performance using only half the number of fractional lines. Robust filtering of noisy data is achieved using a second signal space model in the frequency-encode direction, bypassing the requirement of a prior highpass filtering operation. The signal space is constructed from Fourier Transformed samples of each row in the low-resolution image. A set of FIR filters are estimated by fitting a least squares model to this signal space. Partial kspace extrapolation using the FIR filters is shown to result in artifact-free reconstruction, particularly in respect of Gibbs ringing and streaking type artifacts.

Submitted 11 March, 2013; originally announced March 2013.

Showing 1–37 of 37 results for author: Paul, J