Classification Tree-based Active Learning: A Wrapper Approach

Authors: Ashna Jose, Emilie Devijver, Massih-Reza Amini, Noel Jakse, Roberta Poloni

Abstract: Supervised machine learning often requires large training sets to train accurate models, yet obtaining large amounts of labeled data is not always feasible. Hence, it becomes crucial to explore active learning methods for reducing the size of training sets while maintaining high accuracy. The aim is to select the optimal subset of data for labeling from an initial unlabeled set, ensuring precise prediction of outcomes. However, conventional active learning approaches are comparable to classical random sampling. This paper proposes a wrapper active learning method for classification, organizing the sampling process into a tree structure, that improves state-of-the-art algorithms. A classification tree constructed on an initial set of labeled samples is considered to decompose the space into low-entropy regions. Input-space based criteria are used thereafter to sub-sample from these regions, the total number of points to be labeled being decomposed into each region. This adaptation proves to be a significant enhancement over existing active learning methods. Through experiments conducted on various benchmark data sets, the paper demonstrates the efficacy of the proposed framework by being effective in constructing accurate classification models, even when provided with a severely restricted labeled data set.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2401.09049 [pdf, other]

Enhancing Lidar-based Object Detection in Adverse Weather using Offset Sequences in Time

Authors: Raphael van Kempen, Tim Rehbronn, Abin Jose, Johannes Stegmaier, Bastian Lampe, Timo Woopen, Lutz Eckstein

Abstract: Automated vehicles require an accurate perception of their surroundings for safe and efficient driving. Lidar-based object detection is a widely used method for environment perception, but its performance is significantly affected by adverse weather conditions such as rain and fog. In this work, we investigate various strategies for enhancing the robustness of lidar-based object detection by processing sequential data samples generated by lidar sensors. Our approaches leverage temporal information to improve a lidar object detection model, without the need for additional filtering or pre-processing steps. We compare $10$ different neural network architectures that process point cloud sequences including a novel augmentation strategy introducing a temporal offset between frames of a sequence during training and evaluate the effectiveness of all strategies on lidar point clouds under adverse weather conditions through experiments. Our research provides a comprehensive study of effective methods for mitigating the effects of adverse weather on the reliability of lidar-based object detection using sequential data that are evaluated using public datasets such as nuScenes, Dense, and the Canadian Adverse Driving Conditions Dataset. Our findings demonstrate that our novel method, involving temporal offset augmentation through randomized frame skipping in sequences, enhances object detection accuracy compared to both the baseline model (Pillar-based Object Detection) and no augmentation.

Submitted 17 January, 2024; originally announced January 2024.

Comments: Published as part of the III. International Conference on Electrical, Computer and Energy Technologies (ICECET 2023), Cape Town, South Africa, November 16-17, 2023

arXiv:2311.13978 [pdf, other]

MedISure: Towards Assuring Machine Learning-based Medical Image Classifiers using Mixup Boundary Analysis

Authors: Adam Byfield, William Poulett, Ben Wallace, Anusha Jose, Shatakshi Tyagi, Smita Shembekar, Adnan Qayyum, Junaid Qadir, Muhammad Bilal

Abstract: Machine learning (ML) models are becoming integral in healthcare technologies, presenting a critical need for formal assurance to validate their safety, fairness, robustness, and trustworthiness. These models are inherently prone to errors, potentially posing serious risks to patient health and could even cause irreparable harm. Traditional software assurance techniques rely on fixed code and do not directly apply to ML models since these algorithms are adaptable and learn from curated datasets through a training process. However, adapting established principles, such as boundary testing using synthetic test data can effectively bridge this gap. To this end, we present a novel technique called Mix-Up Boundary Analysis (MUBA) that facilitates evaluating image classifiers in terms of prediction fairness. We evaluated MUBA for two important medical imaging tasks -- brain tumour classification and breast cancer classification -- and achieved promising results. This research aims to showcase the importance of adapting traditional assurance principles for assessing ML models to enhance the safety and reliability of healthcare technologies. To facilitate future research, we plan to publicly release our code for MUBA.

Submitted 23 November, 2023; originally announced November 2023.

arXiv:2309.17425 [pdf, other]

Data Filtering Networks

Authors: Alex Fang, Albin Madappally Jose, Amit Jain, Ludwig Schmidt, Alexander Toshev, Vaishaal Shankar

Abstract: Large training sets have become a cornerstone of machine learning and are the foundation for recent advances in language modeling and multimodal learning. While data curation for pre-training is often still ad-hoc, one common paradigm is to first collect a massive pool of data from the Web and then filter this candidate pool down to an actual training set via various heuristics. In this work, we study the problem of learning a data filtering network (DFN) for this second step of filtering a large uncurated dataset. Our key finding is that the quality of a network for filtering is distinct from its performance on downstream tasks: for instance, a model that performs well on ImageNet can yield worse training sets than a model with low ImageNet accuracy that is trained on a small amount of high-quality data. Based on our insights, we construct new data filtering networks that induce state-of-the-art image-text datasets. Specifically, our best performing dataset DFN-5B enables us to train state-of-the-art CLIP models for their compute budgets: among other improvements on a variety of tasks, a ViT-H trained on our dataset achieves 84.4% zero-shot transfer accuracy on ImageNet, out-performing models trained on other datasets such as LAION-2B, DataComp-1B, or OpenAI's WIT. In order to facilitate further research in dataset design, we also release a new 2 billion example dataset DFN-2B and show that high performance data filtering networks can be trained from scratch using only publicly available data.

Submitted 5 November, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

arXiv:2308.13442 [pdf, other]

Unlocking Fine-Grained Details with Wavelet-based High-Frequency Enhancement in Transformers

Authors: Reza Azad, Amirhossein Kazerouni, Alaa Sulaiman, Afshin Bozorgpour, Ehsan Khodapanah Aghdam, Abin Jose, Dorit Merhof

Abstract: Medical image segmentation is a critical task that plays a vital role in diagnosis, treatment planning, and disease monitoring. Accurate segmentation of anatomical structures and abnormalities from medical images can aid in the early detection and treatment of various diseases. In this paper, we address the local feature deficiency of the Transformer model by carefully re-designing the self-attention map to produce accurate dense prediction in medical images. To this end, we first apply the wavelet transformation to decompose the input feature map into low-frequency (LF) and high-frequency (HF) subbands. The LF segment is associated with coarse-grained features while the HF components preserve fine-grained features such as texture and edge information. Next, we reformulate the self-attention operation using the efficient Transformer to perform both spatial and context attention on top of the frequency representation. Furthermore, to intensify the importance of the boundary information, we impose an additional attention map by creating a Gaussian pyramid on top of the HF components. Moreover, we propose a multi-scale context enhancement block within skip connections to adaptively model inter-scale dependencies to overcome the semantic gap among stages of the encoder and decoder modules. Throughout comprehensive experiments, we demonstrate the effectiveness of our strategy on multi-organ and skin lesion segmentation benchmarks. The implementation code will be available upon acceptance. GitHub: https://github.com/mindflow-institue/WaveFormer

Submitted 12 September, 2023; v1 submitted 25 August, 2023; originally announced August 2023.

Comments: Accepted in MICCAI 2023 workshop MLMI

Journal ref: MICCAI 2023 workshop

arXiv:2308.07335 [pdf, other]

An Encoder-Decoder Approach for Packing Circles

Authors: Akshay Kiran Jose, Gangadhar Karevvanavar, Rajshekhar V Bhat

Abstract: The problem of packing smaller objects within a larger object has been of interest for decades. In these problems, in addition to the requirement that the smaller objects must lie completely inside the larger objects, they are expected to not overlap or have minimum overlap with each other. Due to this, the problem of packing turns out to be a non-convex problem, obtaining whose optimal solution is challenging. As such, several heuristic approaches have been used for obtaining sub-optimal solutions in general, and provably optimal solutions for some special instances. In this paper, we propose a novel encoder-decoder architecture consisting of an encoder block, a perturbation block and a decoder block, for packing identical circles within a larger circle. In our approach, the encoder takes the index of a circle to be packed as an input and outputs its center through a normalization layer, the perturbation layer adds controlled perturbations to the center, ensuring that it does not deviate beyond the radius of the smaller circle to be packed, and the decoder takes the perturbed center as input and estimates the index of the intended circle for packing. We parameterize the encoder and decoder by a neural network and optimize it to reduce an error between the decoder's estimated index and the actual index of the circle provided as input to the encoder. The proposed approach can be generalized to pack objects of higher dimensions and different shapes by carefully choosing normalization and perturbation layers. The approach gives a sub-optimal solution and is able to pack smaller objects within a larger object with competitive performance with respect to classical methods.

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2306.14256 [pdf, other]

doi 10.1007/s41870-023-01342-3

A Multilingual Translator to SQL with Database Schema Pruning to Improve Self-Attention

Authors: Marcelo Archanjo Jose, Fabio Gagliardi Cozman

Abstract: Long sequences of text are challenging in the context of transformers, due to quadratic memory increase in the self-attention mechanism. As this issue directly affects the translation from natural language to SQL queries (as techniques usually take as input a concatenated text with the question and the database schema), we present techniques that allow long text sequences to be handled by transformers with up to 512 input tokens. We propose a training process with database schema pruning (removal of tables and columns names that are useless for the query of interest). In addition, we used a multilingual approach with the mT5-large model fine-tuned with a data-augmented Spider dataset in four languages simultaneously: English, Portuguese, Spanish, and French. Our proposed technique used the Spider dataset and increased the exact set match accuracy results from 0.718 to 0.736 in a validation dataset (Dev). Source code, evaluations, and checkpoints are available at: https://github.com/C4AI/gap-text2sql

Submitted 25 June, 2023; originally announced June 2023.

Comments: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in International Journal of Information Technology, and is available online at https://doi.org/10.1007/s41870-023-01342-3 . SharedIt link: https://rdcu.be/dff19

MSC Class: 68T07; 68T50 ACM Class: I.2.7; H.3.3

arXiv:2301.13081 [pdf, other]

STAIR: Learning Sparse Text and Image Representation in Grounded Tokens

Authors: Chen Chen, Bowen Zhang, Liangliang Cao, Jiguang Shen, Tom Gunter, Albin Madappally Jose, Alexander Toshev, Jonathon Shlens, Ruoming Pang, Yinfei Yang

Abstract: Image and text retrieval is one of the foundational tasks in the vision and language domain with multiple real-world applications. State-of-the-art approaches, e.g. CLIP, ALIGN, represent images and texts as dense embeddings and calculate the similarity in the dense embedding space as the matching score. On the other hand, sparse semantic features like bag-of-words models are more interpretable, but believed to suffer from inferior accuracy than dense representations. In this work, we show that it is possible to build a sparse semantic representation that is as powerful as, or even better than, dense representations. We extend the CLIP model and build a sparse text and image representation (STAIR), where the image and text are mapped to a sparse token space. Each token in the space is a (sub-)word in the vocabulary, which is not only interpretable but also easy to integrate with existing information retrieval systems. The STAIR model significantly outperforms a CLIP model with +$4.9\%$ and +$4.3\%$ absolute Recall@1 improvement on COCO-5k text$\rightarrow$image and image$\rightarrow$text retrieval respectively. It also achieved better performance on both ImageNet zero-shot and linear probing compared to CLIP.

Submitted 7 February, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

arXiv:2301.10227 [pdf, ps, other]

Denoising Diffusion Probabilistic Models for Generation of Realistic Fully-Annotated Microscopy Image Data Sets

Authors: Dennis Eschweiler, Rüveyda Yilmaz, Matisse Baumann, Ina Laube, Rijo Roy, Abin Jose, Daniel Brückner, Johannes Stegmaier

Abstract: Recent advances in computer vision have led to significant progress in the generation of realistic image data, with denoising diffusion probabilistic models proving to be a particularly effective method. In this study, we demonstrate that diffusion models can effectively generate fully-annotated microscopy image data sets through an unsupervised and intuitive approach, using rough sketches of desired structures as the starting point. The proposed pipeline helps to reduce the reliance on manual annotations when training deep learning-based segmentation approaches and enables the segmentation of diverse datasets without the need for human annotations. This approach holds great promise in streamlining the data generation process and enabling a more efficient and scalable training of segmentation models, as we show in the example of different practical experiments involving various organisms and cell types.

Submitted 8 August, 2023; v1 submitted 2 January, 2023; originally announced January 2023.

Comments: 9 pages, 2 figures

arXiv:2301.03505 [pdf, other]

Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review

Authors: Reza Azad, Amirhossein Kazerouni, Moein Heidari, Ehsan Khodapanah Aghdam, Amirali Molaei, Yiwei Jia, Abin Jose, Rijo Roy, Dorit Merhof

Abstract: The remarkable performance of the Transformer architecture in natural language processing has recently also triggered broad interest in Computer Vision. Among other merits, Transformers are witnessed as capable of learning long-range dependencies and spatial correlations, which is a clear advantage over convolutional neural networks (CNNs), which have been the de facto standard in Computer Vision problems so far. Thus, Transformers have become an integral part of modern medical image analysis. In this review, we provide an encyclopedic review of the applications of Transformers in medical imaging. Specifically, we present a systematic and thorough review of relevant recent Transformer literature for different medical image analysis tasks, including classification, segmentation, detection, registration, synthesis, and clinical report generation. For each of these applications, we investigate the novelty, strengths and weaknesses of the different proposed strategies and develop taxonomies highlighting key properties and contributions. Further, if applicable, we outline current benchmarks on different datasets. Finally, we summarize key challenges and discuss different future research directions. In addition, we have provided cited papers with their corresponding implementations in https://github.com/mindflow-institue/Awesome-Transformer.

Submitted 5 November, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

Comments: https://www.sciencedirect.com/science/article/abs/pii/S1361841523002608

arXiv:2209.07928 [pdf, other]

The BLue Amazon Brain (BLAB): A Modular Architecture of Services about the Brazilian Maritime Territory

Authors: Paulo Pirozelli, Ais B. R. Castro, Ana Luiza C. de Oliveira, André S. Oliveira, Flávio N. Cação, Igor C. Silveira, João G. M. Campos, Laura C. Motheo, Leticia F. Figueiredo, Lucas F. A. O. Pellicer, Marcelo A. José, Marcos M. José, Pedro de M. Ligabue, Ricardo S. Grava, Rodrigo M. Tavares, Vinícius B. Matos, Yan V. Sym, Anna H. R. Costa, Anarosa A. F. Brandão, Denis D. Mauá, Fabio G. Cozman, Sarajane M. Peres

Abstract: We describe the first steps in the development of an artificial agent focused on the Brazilian maritime territory, a large region within the South Atlantic also known as the Blue Amazon. The "BLue Amazon Brain" (BLAB) integrates a number of services aimed at disseminating information about this region and its importance, functioning as a tool for environmental awareness. The main service provided by BLAB is a conversational facility that deals with complex questions about the Blue Amazon, called BLAB-Chat; its central component is a controller that manages several task-oriented natural language processing modules (e.g., question answering and summarizer systems). These modules have access to an internal data lake as well as to third-party databases. A news reporter (BLAB-Reporter) and a purposely-developed wiki (BLAB-Wiki) are also part of the BLAB service architecture. In this paper, we describe our current version of BLAB's architecture (interface, backend, web services, NLP modules, and resources) and comment on the challenges we have faced so far, such as the lack of training data and the scattered state of domain information. Solving these issues presents a considerable challenge in the development of artificial intelligence for technical domains.

Submitted 6 September, 2022; originally announced September 2022.

Journal ref: AI: Modeling Oceans and Climate Change (IJCAI-ECAI), 2022

arXiv:2202.04048 [pdf, other]

doi 10.1007/978-3-030-98305-5_26

Integrating question answering and text-to-SQL in Portuguese

Authors: Marcos Menon José, Marcelo Archanjo José, Denis Deratani Mauá, Fábio Gagliardi Cozman

Abstract: Deep learning transformers have drastically improved systems that automatically answer questions in natural language. However, different questions demand different answering techniques; here we propose, build and validate an architecture that integrates different modules to answer two distinct kinds of queries. Our architecture takes a free-form natural language text and classifies it to send it either to a Neural Question Answering Reasoner or a Natural Language parser to SQL. We implemented a complete system for the Portuguese language, using some of the main tools available for the language and translating training and testing datasets. Experiments show that our system selects the appropriate answering method with high accuracy (over 99%), thus validating a modular question answering strategy.

Submitted 21 September, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

Comments: Published at International Conference on the Computational Processing of Portuguese (PROPOR 2022)

Journal ref: Computational Processing of the Portuguese Language 2022

arXiv:2110.05688 [pdf]

Inclusive Design: Accessibility Settings for People with Cognitive Disabilities

Authors: Trae Waggoner, Julia Ann Jose, Ashwin Nair, Sudarsan Manikandan

Abstract: The advancement of technology has progressed faster than any other field in the world and with the development of these new technologies, it is important to make sure that these tools can be used by everyone, including people with disabilities. Accessibility options in computing devices help ensure that everyone has the same access to advanced technologies. Unfortunately, for those who require more unique and sometimes challenging accommodations, such as people with Amyotrophic lateral sclerosis (ALS), the most commonly used accessibility features are simply not enough. While assistive technology for those with ALS does exist, it requires multiple peripheral devices that can become quite expensive collectively. The purpose of this paper is to suggest a more affordable and readily available option for ALS assistive technology that can be implemented on a smartphone or tablet.

Submitted 11 October, 2021; originally announced October 2021.

arXiv:2110.05661 [pdf]

BotNet Detection on Social Media

Authors: Aniket Chandrakant Devle, Julia Ann Jose, Abhay Shrinivas Saraswathula, Shubham Mehta, Siddhant Srivastava, Sirisha Kona, Sudheera Daggumalli

Abstract: As our reliance on social media platforms and web services increases day by day, exploiters view these platforms as an opportunity to manipulate our thoughts and actions. These platforms have become an open playground for social bot accounts. Social bots not only learn human conversations, manners, and presence but also manipulate public opinion, act as scammers, manipulate stock markets, and so on. There has been evidence of bots manipulating people's opinions and thoughts which can be a great threat to democracy. Identification and prevention of such campaigns that release or create these bots have become critical. Our goal in this paper is to leverage web mining techniques to help detect fake bots on social media platforms such as Twitter, thereby mitigating the spread of disinformation.

Submitted 27 November, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

arXiv:2110.03546 [pdf, other]

doi 10.1007/978-3-030-91699-2_35

mRAT-SQL+GAP:A Portuguese Text-to-SQL Transformer

Authors: Marcelo Archanjo José, Fabio Gagliardi Cozman

Abstract: The translation of natural language questions to SQL queries has attracted growing attention, in particular in connection with transformers and similar language models. A large number of techniques are geared towards the English language; in this work, we thus investigated translation to SQL when input questions are given in the Portuguese language. To do so, we properly adapted state-of-the-art tools and resources. We changed the RAT-SQL+GAP system by relying on a multilingual BART model (we report tests with other language models), and we produced a translated version of the Spider dataset. Our experiments expose interesting phenomena that arise when non-English languages are targeted; in particular, it is better to train with original and translated training datasets together, even if a single target language is desired. This multilingual BART model fine-tuned with a double-size training dataset (English and Portuguese) achieved 83% of the baseline, making inferences for the Portuguese test dataset. This investigation can help other researchers to produce results in Machine Learning in a language different from English. Our multilingual ready version of RAT-SQL+GAP and the data are available, open-sourced as mRAT-SQL+GAP at: https://github.com/C4AI/gap-text2sql

Submitted 29 November, 2021; v1 submitted 7 October, 2021; originally announced October 2021.

Comments: Published in: Intelligent Systems. BRACIS 2021. Lecture Notes in Computer Science

MSC Class: 68T07; 68T50 ACM Class: I.2.7; H.3.3

Journal ref: vol 13074, 2021, pp 511-525

arXiv:2106.10542 [pdf, other]

Reversible Colour Density Compression of Images using cGANs

Authors: Arun Jose, Abraham Francis

Abstract: Image compression using colour densities is historically impractical to decompress losslessly. We examine the use of conditional generative adversarial networks in making this transformation more feasible, through learning a mapping between the images and a loss function to train on. We show that this method is effective at producing visually lossless generations, indicating that efficient colour compression is viable.

Submitted 19 June, 2021; originally announced June 2021.

Comments: 7 pages, 2 figures

arXiv:2101.02557 [pdf]

Continuous Glucose Monitoring Prediction

Authors: Julia Ann Jose, Trae Waggoner, Sudarsan Manikandan

Abstract: Diabetes is one of the deadliest diseases in the world and affects nearly 10 percent of the global adult population. Fortunately, powerful new technologies allow for a consistent and reliable treatment plan for people with diabetes. One major development is a system called continuous blood glucose monitoring (CGM). In this review, we look at three different continuous meal detection algorithms that were developed using given CGM data from patients with diabetes. From this analysis, an initial meal prediction algorithm was also developed utilizing these methods.

Submitted 4 January, 2021; originally announced January 2021.

arXiv:2001.11400 [pdf, other]

Optimized Feature Space Learning for Generating Efficient Binary Codes for Image Retrieval

Authors: Abin Jose, Erik Stefan Ottlik, Christian Rohlfing, Jens-Rainer Ohm

Abstract: In this paper we propose an approach for learning a low-dimensional optimized feature space with minimum intra-class variance and maximum inter-class variance. We address the problem of high dimensionality of feature vectors extracted from neural networks by taking care of the global statistics of the feature space. The classical approach of Linear Discriminant Analysis (LDA) is generally used for generating an optimized low-dimensional feature space for single-labeled images. Since image retrieval involves both multi-labeled and single-labeled images, we utilize the equivalence between LDA and Canonical Correlation Analysis (CCA) to generate an optimized feature space for single-labeled images and use CCA to generate an optimized feature space for multi-labeled images. Our approach correlates the projections of feature vectors with label vectors in our CCA-based network architecture. The neural network minimizes a loss function which maximizes the correlation coefficients. We binarize our generated feature vectors with the popular Iterative Quantization (ITQ) approach and also propose an ensemble network to generate binary codes of desired bit length for image retrieval. Our measurement of mean average precision shows competitive results on other state-of-the-art single-labeled and multi-labeled image retrieval datasets.

Submitted 30 January, 2020; originally announced January 2020.

Comments: 14 pages, 7 figures

arXiv:1910.00739 [pdf, other]

OpenUAV Cloud Testbed: a Collaborative Design Studio for Field Robotics

Authors: Harish Anand, Stephen A. Rees, Zhiang Chen, Ashwin Jose, Sarah Bearman, Prasad Antervedi, Jnaneshwar Das

Abstract: Simulations play a crucial role in robotics research and education. This paper presents the OpenUAV testbed, an open-source, easy-to-use, web-based, and reproducible software system that enables students and researchers to run robotic simulations on the cloud. We have built upon our previous work and have addressed some of the educational and research challenges associated with the prior work. The critical contributions of the paper to the robotics and automation community are threefold: First, OpenUAV saves students and researchers from tedious and complicated software setups by providing web-browser-based Linux desktop sessions with standard robotics software like Gazebo, ROS, and a flight autonomy stack. Second, it provides a method for saving an individual's research work with its dependencies for the work's future reproducibility. Third, the platform provides a mechanism to support photorealistic robotics simulations by combining Unity game engine-based camera rendering and Gazebo physics. The paper addresses a research need for photorealistic simulations and describes a methodology for creating a photorealistic aquatic simulation. We also present the various academic and research use-cases of this platform to improve robotics education and research, especially during times like the COVID-19 pandemic, when virtual collaboration is necessary.

Submitted 6 May, 2021; v1 submitted 1 October, 2019; originally announced October 2019.

Comments: 8 pages, Submitted to IEEE CASE 2021 for review, GitHub: https://github.com/Open-UAV/openuav-turbovnc Webpage: https://openuav.us

Showing 1–19 of 19 results for author: Jose, A