-
Shallow Recurrent Decoder for Reduced Order Modeling of Plasma Dynamics
Authors:
J. Nathan Kutz,
Maryam Reza,
Farbod Faraji,
Aaron Knoll
Abstract:
Reduced order models are becoming increasingly important for rendering complex and multiscale spatio-temporal dynamics computationally tractable. The computational efficiency of such surrogate models is especially important for design, exhaustive exploration and physical understanding. Plasma simulations, in particular those applied to the study of ${\bf E}\times {\bf B}$ plasma discharges and technologies, such as Hall thrusters, require substantial computational resources in order to resolve the multidimensional dynamics that span across wide spatial and temporal scales. Although high-fidelity computational tools are available to simulate such systems over limited conditions and in highly simplified geometries, simulations of full-size systems and/or extensive parametric studies over many geometric configurations and under different physical conditions are computationally intractable with conventional numerical tools. Thus, scientific studies and industrially oriented modeling of plasma systems, including the important ${\bf E}\times {\bf B}$ technologies, stand to significantly benefit from reduced order modeling algorithms. We develop a model reduction scheme based upon a {\em Shallow REcurrent Decoder} (SHRED) architecture. The scheme uses a neural network for encoding limited sensor measurements in time (sequence-to-sequence encoding) to full state-space reconstructions via a decoder network. Based upon the theory of separation of variables, the SHRED architecture is capable of (i) reconstructing full spatio-temporal fields with as few as three point sensors, even the fields that are not measured with sensor feeds but that are in dynamic coupling with the measured field, and (ii) forecasting the future state of the system using neural network roll-outs from the trained time encoding model.
Submitted 20 May, 2024;
originally announced May 2024.
-
Spatially Optimized Compact Deep Metric Learning Model for Similarity Search
Authors:
Md. Farhadul Islam,
Md. Tanzim Reza,
Meem Arafat Manab,
Mohammad Rakibul Hasan Mahin,
Sarah Zabeen,
Jannatun Noor
Abstract:
Spatial optimization is often overlooked in many computer vision tasks. Filters should be able to recognize the features of an object regardless of where it is in the image. Similarity search is a crucial task where spatial features decide an important output. The capacity of convolution to capture visual patterns across various locations is limited. In contrast to convolution, the involution kernel is dynamically created at each pixel based on the pixel value and parameters that have been learned. This study demonstrates that utilizing a single layer of involution feature extractor alongside a compact convolution model significantly enhances the performance of similarity search. Additionally, we improve predictions by using the GELU activation function rather than ReLU. The negligible number of weight parameters in involution, combined with a compact model that performs better, makes the model very useful in real-world implementations. Our proposed model is below 1 megabyte in size. We have experimented with our proposed methodology and other models on the CIFAR-10, FashionMNIST, and MNIST datasets. Our proposed method outperforms the other models on all three datasets.
Submitted 9 April, 2024;
originally announced April 2024.
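The abstract above describes involution as a kernel that is created at each pixel from that pixel's value and learned parameters. A minimal NumPy sketch of that idea follows. It is not the authors' model: the function name `involution2d`, the two weight matrices `w_reduce` and `w_span`, the ReLU between them, and the choice of a single kernel shared across all channels are assumptions made for illustration.

```python
import numpy as np

def involution2d(x, w_reduce, w_span, k=3):
    """Apply a per-pixel, dynamically generated k x k kernel.

    x        : (H, W, C) float feature map
    w_reduce : (C, r) hypothetical learned weights (channel reduction)
    w_span   : (r, k*k) hypothetical learned weights (kernel generation)
    """
    H, W, C = x.shape
    pad = k // 2
    # Zero-pad the spatial dimensions so the output keeps the input size.
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    # Kernel generation: each pixel's own feature vector yields its kernel.
    hidden = np.maximum(x @ w_reduce, 0.0)            # (H, W, r)
    kernels = (hidden @ w_span).reshape(H, W, k, k)   # (H, W, k, k)
    # Weighted sum over the k x k neighbourhood; the kernel differs per
    # pixel but is shared across channels (one group, an assumption).
    out = np.zeros_like(x)
    for du in range(k):
        for dv in range(k):
            out += kernels[:, :, du, dv, None] * xp[du:du + H, dv:dv + W, :]
    return out
```

Unlike a convolution, whose kernel is fixed across positions, the weights here change from pixel to pixel while the parameter count stays at `C*r + r*k*k`.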
-
Model, Analyze, and Comprehend User Interactions and Various Attributes within a Social Media Platform
Authors:
Md Kaykobad Reza,
S M Maksudul Alam,
Yiran Luo,
Youzhe Liu
Abstract:
How can we effectively model, analyze, and comprehend user interactions and various attributes within a social media platform based on post-comment relationships? In this study, we propose a novel graph-based approach to model and analyze user interactions within a social media platform based on post-comment relationships. We construct a user interaction graph from social media data and analyze it to gain insights into community dynamics, user behavior, and content preferences. Our investigation reveals that while 56.05% of the active users are strongly connected within the community, only 0.8% of them significantly contribute to its dynamics. Moreover, we observe temporal variations in community activity, with certain periods experiencing heightened engagement. Additionally, our findings highlight a correlation between user activity and popularity, showing that more active users are generally more popular. Alongside these, a preference for positive and informative content is also observed, with 82.41% of users preferring such content. Overall, our study provides a comprehensive framework for understanding and managing online communities, leveraging graph-based techniques to gain valuable insights into user behavior and community dynamics.
Submitted 23 March, 2024;
originally announced March 2024.
-
Data-driven local operator finding for reduced-order modelling of plasma systems: II. Application to parametric dynamics
Authors:
Farbod Faraji,
Maryam Reza,
Aaron Knoll,
J. Nathan Kutz
Abstract:
Real-world systems often exhibit dynamics influenced by various parameters, either inherent or externally controllable, necessitating models capable of reliably capturing these parametric behaviors. Plasma technologies exemplify such systems. For example, phenomena governing global dynamics in Hall thrusters (a spacecraft propulsion technology) vary with various parameters, such as the "self-sustained electric field". In this Part II, following on the introduction of our novel data-driven local operator finding algorithm, Phi Method, in Part I, we showcase the method's effectiveness in learning parametric dynamics to predict system behavior across unseen parameter spaces. We present two adaptations: the "parametric Phi Method" and the "ensemble Phi Method", which are demonstrated through 2D fluid-flow-past-a-cylinder and 1D Hall-thruster-plasma-discharge problems. Comparative evaluation against parametric OPT-DMD in the fluid case demonstrates superior predictive performance of the parametric Phi Method. Across both test cases, parametric and ensemble Phi Method reliably recover governing parametric PDEs and offer accurate predictions over test parameters. Ensemble ROM analysis underscores Phi Method's robust learning of dominant dynamic coefficients with high confidence.
Submitted 3 March, 2024;
originally announced March 2024.
-
Data-driven local operator finding for reduced-order modelling of plasma systems: I. Concept and verifications
Authors:
Farbod Faraji,
Maryam Reza,
Aaron Knoll,
J. Nathan Kutz
Abstract:
Reduced-order plasma models that can efficiently predict plasma behavior across various settings and configurations are highly sought after yet elusive. The demand for such models has surged in the past decade due to their potential to facilitate scientific research and expedite the development of plasma technologies. In line with the advancements in computational power and data-driven methods, we introduce the "Phi Method" in this two-part article. Part I presents this novel algorithm, which employs constrained regression on a candidate term library informed by numerical discretization schemes to discover discretized systems of differential equations. We demonstrate Phi Method's efficacy in deriving reliable and robust reduced-order models (ROMs) for three test cases: the Lorenz attractor, flow past a cylinder, and a 1D Hall-thruster-representative plasma. Part II will delve into the method's application for parametric dynamics discovery. Our results show that ROMs derived from the Phi Method provide remarkably accurate predictions of systems' behavior, whether derived from steady-state or transient-state data. This underscores the method's potential for transforming plasma system modeling.
Submitted 3 March, 2024;
originally announced March 2024.
-
A Computer Vision Based Approach for Stalking Detection Using a CNN-LSTM-MLP Hybrid Fusion Model
Authors:
Murad Hasan,
Shahriar Iqbal,
Md. Billal Hossain Faisal,
Md. Musnad Hossin Neloy,
Md. Tonmoy Kabir,
Md. Tanzim Reza,
Md. Golam Rabiul Alam,
Md Zia Uddin
Abstract:
Criminal and suspicious activity detection has become a popular research topic in recent years. The rapid growth of computer vision technologies has had a crucial impact on solving this issue. However, physical stalking detection is still a less explored area despite the evolution of modern technology. Nowadays, stalking in public places has become a common occurrence with women being the most affected. Stalking is a visible action that usually occurs before any criminal activity begins as the stalker begins to follow, loiter, and stare at the victim before committing any criminal activity such as assault, kidnapping, rape, and so on. Therefore, it has become a necessity to detect stalking as all of these criminal activities can be stopped in the first place through stalking detection. In this research, we propose a novel deep learning-based hybrid fusion model to detect potential stalkers from a single video with a minimal number of frames. We extract multiple relevant features, such as facial landmarks, head pose estimation, and relative distance, as numerical values from video frames. This data is fed into a multilayer perceptron (MLP) to perform a classification task between a stalking and a non-stalking scenario. Simultaneously, the video frames are fed into a combination of convolutional and LSTM models to extract the spatio-temporal features. We use a fusion of these numerical and spatio-temporal features to build a classifier to detect stalking incidents. Additionally, we introduce a dataset consisting of stalking and non-stalking videos gathered from various feature films and television series, which is also used to train the model. The experimental results show the efficiency and dynamism of our proposed stalker detection system, achieving 89.58% testing accuracy with a significant improvement as compared to the state-of-the-art approaches.
Submitted 5 February, 2024;
originally announced February 2024.
-
Labeling Indoor Scenes with Fusion of Out-of-the-Box Perception Models
Authors:
Yimeng Li,
Navid Rajabi,
Sulabh Shrestha,
Md Alimoor Reza,
Jana Kosecka
Abstract:
The image annotation stage is a critical and often the most time-consuming part required for training and evaluating object detection and semantic segmentation models. Deployment of the existing models in novel environments often requires detecting novel semantic classes not present in the training data. Furthermore, indoor scenes contain significant viewpoint variations, which need to be handled properly by trained perception models. We propose to leverage the recent advancements in state-of-the-art models for bottom-up segmentation (SAM), object detection (Detic), and semantic segmentation (MaskFormer), all trained on large-scale datasets. We aim to develop a cost-effective labeling approach to obtain pseudo-labels for semantic segmentation and object instance detection in indoor environments, with the ultimate goal of facilitating the training of lightweight models for various downstream tasks. We also propose a multi-view labeling fusion stage, which considers the setting where multiple views of the scenes are available and can be used to identify and rectify single-view inconsistencies. We demonstrate the effectiveness of the proposed approach on the Active Vision dataset and the ADE20K dataset. We evaluate the quality of our labeling process by comparing it with human annotations. Also, we demonstrate the effectiveness of the obtained labels in downstream tasks such as object goal navigation and part discovery. In the context of object goal navigation, we depict enhanced performance using this fusion approach compared to a zero-shot baseline that utilizes large monolithic vision-language pre-trained models.
Submitted 17 November, 2023;
originally announced November 2023.
-
GC-MVSNet: Multi-View, Multi-Scale, Geometrically-Consistent Multi-View Stereo
Authors:
Vibhas K. Vats,
Sripad Joshi,
David J. Crandall,
Md. Alimoor Reza,
Soon-heung Jung
Abstract:
Traditional multi-view stereo (MVS) methods rely heavily on photometric and geometric consistency constraints, but newer machine learning-based MVS methods check geometric consistency across multiple source views only as a post-processing step. In this paper, we present a novel approach that explicitly encourages geometric consistency of reference view depth maps across multiple source views at different scales during learning (see Fig. 1). We find that adding this geometric consistency loss significantly accelerates learning by explicitly penalizing geometrically inconsistent pixels, reducing the training iteration requirements to nearly half that of other MVS methods. Our extensive experiments show that our approach achieves a new state-of-the-art on the DTU and BlendedMVS datasets, and competitive results on the Tanks and Temples benchmark. To the best of our knowledge, GC-MVSNet is the first attempt to enforce multi-view, multi-scale geometric consistency during learning.
Submitted 21 December, 2023; v1 submitted 30 October, 2023;
originally announced October 2023.
-
Impact of Guidance and Interaction Strategies for LLM Use on Learner Performance and Perception
Authors:
Harsh Kumar,
Ilya Musabirov,
Mohi Reza,
Jiakai Shi,
Xinyuan Wang,
Joseph Jay Williams,
Anastasia Kuzminykh,
Michael Liut
Abstract:
Personalized chatbot-based teaching assistants can be crucial in addressing increasing classroom sizes, especially where direct teacher presence is limited. Large language models (LLMs) offer a promising avenue, with increasing research exploring their educational utility. However, the challenge lies not only in establishing the efficacy of LLMs but also in discerning the nuances of interaction between learners and these models, which impact learners' engagement and results. We conducted a formative study in an undergraduate computer science classroom (N=145) and a controlled experiment on Prolific (N=356) to explore the impact of four pedagogically informed guidance strategies on the learners' performance, confidence and trust in LLMs. Direct LLM answers marginally improved performance, while refining student solutions fostered trust. Structured guidance reduced random queries as well as instances of students copy-pasting assignment questions to the LLM. Our work highlights the role that teachers can play in shaping LLM-supported learning environments.
Submitted 23 January, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Robust Multimodal Learning with Missing Modalities via Parameter-Efficient Adaptation
Authors:
Md Kaykobad Reza,
Ashley Prater-Bennette,
M. Salman Asif
Abstract:
Multimodal learning seeks to utilize data from multiple sources to improve the overall performance of downstream tasks. It is desirable for redundancies in the data to make multimodal systems robust to missing or corrupted observations in some correlated modalities. However, we observe that the performance of several existing multimodal networks significantly deteriorates if one or multiple modalities are absent at test time. To enable robustness to missing modalities, we propose a simple and parameter-efficient adaptation procedure for pretrained multimodal networks. In particular, we exploit modulation of intermediate features to compensate for the missing modalities. We demonstrate that such adaptation can partially bridge the performance drop due to missing modalities and outperform independent, dedicated networks trained for the available modality combinations in some cases. The proposed adaptation requires an extremely small number of parameters (e.g., fewer than 0.7% of the total parameters) and is applicable to a wide range of modality combinations and tasks. We conduct a series of experiments to highlight the missing modality robustness of our proposed method on 5 different datasets for multimodal semantic segmentation, multimodal material segmentation, and multimodal sentiment analysis tasks. Our proposed method demonstrates versatility across various tasks and datasets, and outperforms existing methods for robust multimodal learning with missing modalities.
Submitted 26 February, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
ABScribe: Rapid Exploration & Organization of Multiple Writing Variations in Human-AI Co-Writing Tasks using Large Language Models
Authors:
Mohi Reza,
Nathan Laundry,
Ilya Musabirov,
Peter Dushniku,
Zhi Yuan "Michael" Yu,
Kashish Mittal,
Tovi Grossman,
Michael Liut,
Anastasia Kuzminykh,
Joseph Jay Williams
Abstract:
Exploring alternative ideas by rewriting text is integral to the writing process. State-of-the-art Large Language Models (LLMs) can simplify writing variation generation. However, current interfaces pose challenges for simultaneous consideration of multiple variations: creating new variations without overwriting text can be difficult, and pasting them sequentially can clutter documents, increasing workload and disrupting writers' flow. To tackle this, we present ABScribe, an interface that supports rapid, yet visually structured, exploration and organization of writing variations in human-AI co-writing tasks. With ABScribe, users can swiftly modify variations using LLM prompts, which are auto-converted into reusable buttons. Variations are stored adjacently within text fields for rapid in-place comparisons using mouse-over interactions on a popup toolbar. Our user study with 12 writers shows that ABScribe significantly reduces task workload (d = 1.20, p < 0.001), enhances user perceptions of the revision process (d = 2.41, p < 0.001) compared to a popular baseline workflow, and provides insights into how writers explore variations using LLMs.
Submitted 27 March, 2024; v1 submitted 29 September, 2023;
originally announced October 2023.
-
MMSFormer: Multimodal Transformer for Material and Semantic Segmentation
Authors:
Md Kaykobad Reza,
Ashley Prater-Bennette,
M. Salman Asif
Abstract:
Leveraging information across diverse modalities is known to enhance performance on multimodal segmentation tasks. However, effectively fusing information from different modalities remains challenging due to the unique characteristics of each modality. In this paper, we propose a novel fusion strategy that can effectively fuse information from different modality combinations. We also propose a new model named Multi-Modal Segmentation TransFormer (MMSFormer) that incorporates the proposed fusion strategy to perform multimodal material and semantic segmentation tasks. MMSFormer outperforms current state-of-the-art models on three different datasets. As we begin with only one input modality, performance improves progressively as additional modalities are incorporated, showcasing the effectiveness of the fusion block in combining useful information from diverse input modalities. Ablation studies show that different modules in the fusion block are crucial for overall model performance. Furthermore, our ablation studies also highlight the capacity of different input modalities to improve performance in the identification of different types of materials. The code and pretrained models will be made available at https://github.com/csiplab/MMSFormer.
Submitted 7 April, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
-
Dynamic Mode Decomposition for data-driven analysis and reduced-order modelling of ExB plasmas: II. dynamics forecasting
Authors:
Farbod Faraji,
Maryam Reza,
Aaron Knoll,
J. Nathan Kutz
Abstract:
In part I of the article, we demonstrated that a variant of the Dynamic Mode Decomposition (DMD) algorithm based on variable projection optimization, called Optimized DMD (OPT-DMD), enables a robust identification of the dominant spatiotemporally coherent modes underlying the data across various test cases representing different physical parameters in an ExB simulation configuration. As the OPT-DMD can be constrained to produce stable reduced-order models (ROMs) by construction, in this paper, we extend the application of the OPT-DMD and investigate the capabilities of the linear ROM from this algorithm toward forecasting in time of the plasma dynamics in configurations representative of the radial-azimuthal and axial-azimuthal cross-sections of a Hall thruster and over a range of simulation parameters in each test case. The predictive capacity of the OPT-DMD ROM is assessed primarily in terms of short-term dynamics forecast or, in other words, for large ratios of training-to-test data. However, the utility of the ROM for long-term dynamics forecasting is also presented for an example case in the radial-azimuthal configuration. The model's predictive performance is heterogeneous across various test cases. Nonetheless, a remarkable predictiveness is observed in the test cases that do not exhibit highly transient behaviors. Moreover, in all investigated cases, the error between the ground-truth and the reconstructed data from the OPT-DMD ROM remains bounded over time within both the training and the test window. As a result, despite its limitation in terms of generalized applicability to all plasma conditions, the OPT-DMD is proven as a reliable method to develop low computational cost and highly predictive data-driven reduced-order models in systems with a quasi-periodic global evolution of the plasma state.
Submitted 25 August, 2023;
originally announced August 2023.
-
Dynamic Mode Decomposition for data-driven analysis and reduced-order modelling of ExB plasmas: I. Extraction of spatiotemporally coherent patterns
Authors:
Farbod Faraji,
Maryam Reza,
Aaron Knoll,
J. Nathan Kutz
Abstract:
In this two-part article, we evaluate the utility and the generalizability of the Dynamic Mode Decomposition (DMD) algorithm for data-driven analysis and reduced-order modelling of plasma dynamics in cross-field ExB configurations. The DMD algorithm is an interpretable data-driven method that finds a best-fit linear model describing the time evolution of spatiotemporally coherent structures (patterns) in data. We have applied the DMD to extensive high-fidelity datasets generated using a particle-in-cell (PIC) code based on a cost-efficient reduced-order PIC scheme. In this part, we first provide an overview of the concept of DMD and its underpinning Proper Orthogonal and Singular Value Decomposition methods. Two of the main DMD variants are next introduced. We then present and discuss the results of the DMD application in terms of the identification and extraction of the dominant spatiotemporal modes from high-fidelity data over a range of simulation conditions. We demonstrate that the DMD variant based on variable projection optimization (OPT-DMD) outperforms the basic DMD method in identification of the modes underlying the data, leading to notably more reliable reconstruction of the ground-truth. Furthermore, we show in multiple test cases that the discrete frequency spectrum of OPT-DMD-extracted modes is consistent with the temporal spectrum from the Fast Fourier Transform of the data. This observation implies that the OPT-DMD augments the conventional spectral analyses by being able to uniquely reveal the spatial structure of the dominant modes in the frequency spectra, thus, yielding more accessible, comprehensive information on the spatiotemporal characteristics of the plasma phenomena.
Submitted 25 August, 2023;
originally announced August 2023.
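The abstract above describes DMD as finding a best-fit linear model for the time evolution of coherent structures in data. As a rough illustration, the sketch below implements the basic (exact) DMD via a truncated SVD in NumPy. It is not the OPT-DMD variant the authors favour, and the function name `exact_dmd` and its arguments are assumptions made here.

```python
import numpy as np

def exact_dmd(X, rank):
    """Basic (exact) DMD of a snapshot matrix X with shape (space, time).

    Returns the eigenvalues and spatial modes of the best-fit linear
    operator A satisfying X[:, 1:] ~ A @ X[:, :-1], truncated to `rank`.
    """
    X1, X2 = X[:, :-1], X[:, 1:]
    # A truncated SVD of the first snapshot set gives a POD basis.
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    V = Vh.conj().T
    # Project the operator onto the POD basis: U^H X2 V S^{-1}.
    # Dividing by s scales columns, i.e. right-multiplies by S^{-1}.
    A_tilde = (U.conj().T @ X2 @ V) / s
    eigvals, W = np.linalg.eig(A_tilde)
    # Lift the reduced eigenvectors back to the full state space.
    modes = ((X2 @ V) / s) @ W
    return eigvals, modes
```

Each eigenvalue gives the per-step growth or decay (its magnitude) and oscillation frequency (its angle) of the matching spatial mode, which is why the discrete DMD spectrum can be compared with a Fourier spectrum of the data.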
-
CGBA: Curvature-aware Geometric Black-box Attack
Authors:
Md Farhamdur Reza,
Ali Rahmati,
Tianfu Wu,
Huaiyu Dai
Abstract:
Decision-based black-box attacks often necessitate a large number of queries to craft an adversarial example. Moreover, decision-based attacks based on querying boundary points in the estimated normal vector direction often suffer from inefficiency and convergence issues. In this paper, we propose a novel query-efficient curvature-aware geometric decision-based black-box attack (CGBA) that conducts boundary search along a semicircular path on a restricted 2D plane to ensure finding a boundary point successfully irrespective of the boundary curvature. While the proposed CGBA attack can work effectively for an arbitrary decision boundary, it is particularly efficient in exploiting the low curvature to craft high-quality adversarial examples, which is widely seen and experimentally verified in commonly used classifiers under non-targeted attacks. In contrast, the decision boundaries often exhibit higher curvature under targeted attacks. Thus, we develop a new query-efficient variant, CGBA-H, that is adapted for the targeted attack. In addition, we further design an algorithm to obtain a better initial boundary point at the expense of some extra queries, which considerably enhances the performance of the targeted attack. Extensive experiments are conducted to evaluate the performance of our proposed methods against some well-known classifiers on the ImageNet and CIFAR10 datasets, demonstrating the superiority of CGBA and CGBA-H over state-of-the-art non-targeted and targeted attacks, respectively. The source code is available at https://github.com/Farhamdur/CGBA.
Submitted 6 August, 2023;
originally announced August 2023.
-
AnoMalNet: Outlier Detection based Malaria Cell Image Classification Method Leveraging Deep Autoencoder
Authors:
Aminul Huq,
Md Tanzim Reza,
Shahriar Hossain,
Shakib Mahmud Dipto
Abstract:
Class imbalance is a pervasive issue in the field of disease classification from medical images. It is necessary to balance out the class distribution while training a model for decent results. However, in the case of rare medical diseases, images from affected patients are much harder to come by compared to images from non-affected patients, resulting in unwanted class imbalance. Various processes of tackling class imbalance issues have been explored so far, each having its fair share of drawbacks. In this research, we propose an outlier detection based binary medical image classification technique which can handle even the most extreme case of class imbalance. We have utilized a dataset of malaria parasitized and uninfected cells. An autoencoder model titled AnoMalNet is trained with only the uninfected cell images at the beginning and then used to classify both the affected and non-affected cell images by thresholding a loss value. We have achieved an accuracy, precision, recall, and F1 score of 98.49%, 97.07%, 100%, and 98.52% respectively, performing better than large deep learning models and other published works. As our proposed approach can provide competitive results without needing the disease-positive samples during training, it should prove to be useful in binary disease classification on imbalanced datasets.
Submitted 20 February, 2024; v1 submitted 10 March, 2023;
originally announced March 2023.
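
The loss-thresholding step described in the AnoMalNet abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the rule "mean plus k standard deviations" for picking the cutoff, the function names `fit_threshold` and `classify`, and the loss values are all assumptions made for the example. The abstract only states that a loss value is thresholded after training on uninfected cells.

```python
from statistics import mean, stdev


def fit_threshold(normal_losses, k=3.0):
    """Pick a cutoff from losses measured on uninfected (normal) samples only.

    Assumption: cutoff = mean + k * sample standard deviation. The abstract
    does not say how the threshold is chosen.
    """
    return mean(normal_losses) + k * stdev(normal_losses)


def classify(loss, threshold):
    """Treat a sample as an outlier when its reconstruction loss exceeds the cutoff."""
    return "parasitized" if loss > threshold else "uninfected"


# Hypothetical reconstruction losses; real values would come from an
# autoencoder trained only on uninfected cell images.
normal_losses = [0.010, 0.012, 0.011, 0.013, 0.009]
threshold = fit_threshold(normal_losses)

# One loss close to the normal range, one far above it.
labels = [classify(loss, threshold) for loss in (0.011, 0.080)]
```

Because only the normal class is needed to set the cutoff, the same pattern applies when disease-positive samples are unavailable at training time, which is the situation the abstract targets.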
-
Semantically-informed Hierarchical Event Modeling
Authors:
Shubhashis Roy Dipta,
Mehdi Rezaee,
Francis Ferraro
Abstract:
Prior work has shown that coupling sequential latent variable models with semantic ontological knowledge can improve the representational capabilities of event modeling approaches. In this work, we present a novel, doubly hierarchical, semi-supervised event modeling framework that provides structural hierarchy while also accounting for ontological hierarchy. Our approach consists of multiple layers of structured latent variables, where each successive layer compresses and abstracts the previous layers. We guide this compression through the injection of structured ontological knowledge that is defined at the type level of events: importantly, our model allows for partial injection of semantic knowledge and it does not depend on observing instances at any particular level of the semantic ontology. Across two different datasets and four different evaluation metrics, we demonstrate that our approach is able to out-perform the previous state-of-the-art approaches by up to 8.5%, demonstrating the benefits of structured and semantic hierarchical knowledge for event modeling.
Submitted 30 May, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
Explainable AI based Glaucoma Detection using Transfer Learning and LIME
Authors:
Touhidul Islam Chayan,
Anita Islam,
Eftykhar Rahman,
Md. Tanzim Reza,
Tasnim Sakib Apon,
MD. Golam Rabiul Alam
Abstract:
Glaucoma is the second leading cause of partial or complete blindness among all visual deficiencies. It mainly occurs because of excessive pressure in the eye due to anxiety or depression, which damages the optic nerve and creates complications in vision. Traditional glaucoma screening is a time-consuming process that requires medical professionals' constant attention, and even so, time constraints and pressure sometimes cause them to misclassify cases, which leads to wrong treatment. Numerous efforts have been made to automate the entire glaucoma classification procedure; however, these existing models generally have a black-box character that prevents users from understanding the key reasons behind a prediction, and thus medical practitioners generally cannot rely on these systems. In this article, after comparing various pre-trained models, we propose a transfer learning model that is able to classify glaucoma with 94.71\% accuracy. In addition, we have utilized Local Interpretable Model-Agnostic Explanations (LIME), which introduces explainability into our system. This improvement enables medical professionals to obtain important and comprehensive information that aids them in making judgments. It also lessens the opacity and fragility of traditional deep learning models.
Submitted 7 October, 2022;
originally announced October 2022.
-
Experimenting with Experimentation: Rethinking The Role of Experimentation in Educational Design
Authors:
Mohi Reza,
Akmar Chowdhury,
Aidan Li,
Mahathi Gandhamaneni,
Joseph Jay Williams
Abstract:
What if we take a broader view of what it means to run an education experiment? In this paper, we explore opportunities that arise when we think beyond the commonly-held notion that the purpose of an experiment is to either accept or reject a pre-defined hypothesis and instead, reconsider experimentation as a means to explore the complex design space of creating and improving instructional content. This is an approach we call experiment-inspired design. Then, to operationalize these ideas in a real-world experimentation venue, we investigate the implications of running a sequence of interventions teaching first-year students "meta-skills": transferable skills applicable to multiple areas of their lives, such as planning, and managing stress. Finally, using two examples as case studies for meta-skills interventions (stress-reappraisal and mental contrasting with implementation intentions), we reflect on our experiences with experiment-inspired design and share six preliminary lessons on how to use experimentation for design.
Submitted 9 August, 2022;
originally announced August 2022.
-
RevUp: Revise and Update Information Bottleneck for Event Representation
Authors:
Mehdi Rezaee,
Francis Ferraro
Abstract:
The existence of external (``side'') semantic knowledge has been shown to result in more expressive computational event models. To enable the use of side information that may be noisy or missing, we propose a semi-supervised information bottleneck-based discrete latent variable model. We reparameterize the model's discrete variables with auxiliary continuous latent variables and a light-weight hierarchical structure. Our model is trained to minimize the mutual information between the observed data and optional side knowledge that is not already captured by the new, auxiliary variables. We theoretically show that our approach generalizes past approaches, and perform an empirical case study of our approach on event modeling. We corroborate our theoretical results with strong empirical experiments, showing that the proposed method outperforms previously proposed approaches on multiple datasets.
Submitted 14 February, 2023; v1 submitted 24 May, 2022;
originally announced May 2022.
-
Real Time Action Recognition from Video Footage
Authors:
Tasnim Sakib Apon,
Mushfiqul Islam Chowdhury,
MD Zubair Reza,
Arpita Datta,
Syeda Tan**a Hasan,
MD. Golam Rabiul Alam
Abstract:
The crime rate is increasing in proportion to the growth rate of the population. The most prominent approach to tackling the issue has been to introduce Closed-Circuit Television (CCTV) camera-based surveillance. Video surveillance cameras have added a new dimension to crime detection. Several research works on autonomous security camera surveillance are currently ongoing, where the fundamental goal is to discover violent activity from video feeds. From the technical viewpoint, this is a challenging problem because analyzing a set of frames, i.e., videos in the temporal dimension, to detect violence might need careful machine learning model training to reduce false results. This research focuses on this problem by integrating state-of-the-art deep learning methods to ensure a robust pipeline for autonomous surveillance for detecting violent activities, e.g., kicking, punching, and slapping. Initially, we designed a dataset of this specific interest, which contains 600 videos (200 for each action). Later, we utilized existing pre-trained model architectures to extract features and then used a deep learning network for classification. We also evaluated accuracy and confusion matrices for different pre-trained architectures such as VGG16, InceptionV3, ResNet50, Xception and MobileNet V2, among which VGG16 and MobileNet V2 performed better.
Submitted 13 December, 2021;
originally announced December 2021.
-
Error Diagnosis of Deep Monocular Depth Estimation Models
Authors:
Jagpreet Chawla,
Nikhil Thakurdesai,
Anuj Godase,
Md Reza,
David Crandall,
Soon-Heung Jung
Abstract:
Estimating depth from a monocular image is an ill-posed problem: when the camera projects a 3D scene onto a 2D plane, depth information is inherently and permanently lost. Nevertheless, recent work has shown impressive results in estimating 3D structure from 2D images using deep learning. In this paper, we put on an introspective hat and analyze state-of-the-art monocular depth estimation models in indoor scenes to understand these models' limitations and error patterns. To address errors in depth estimation, we introduce a novel Depth Error Detection Network (DEDN) that spatially identifies erroneous depth predictions in the monocular depth estimation models. By experimenting with multiple state-of-the-art monocular indoor depth estimation models on multiple datasets, we show that our proposed depth error detection network can identify a significant number of errors in the predicted depth maps. Our module is flexible and can be readily plugged into any monocular depth prediction network to help diagnose its results. Additionally, we propose a simple yet effective Depth Error Correction Network (DECN) that iteratively corrects errors based on our initial error diagnosis.
Submitted 15 November, 2021;
originally announced December 2021.
-
Discriminative and Generative Transformer-based Models For Situation Entity Classification
Authors:
Mehdi Rezaee,
Kasra Darvish,
Gaoussou Youssouf Kebe,
Francis Ferraro
Abstract:
We re-examine the situation entity (SE) classification task with varying amounts of available training data. We exploit a Transformer-based variational autoencoder to encode sentences into a lower dimensional latent space, which is used to generate the text and learn a SE classifier. Test set and cross-genre evaluations show that when training data is plentiful, the proposed model can improve over the previous discriminative state-of-the-art models. Our approach performs disproportionately better with smaller amounts of training data, but when faced with extremely small sets (4 instances per label), generative RNN methods outperform transformers. Our work provides guidance for future efforts on SE and semantic prediction tasks, and low-label training regimes.
Submitted 15 September, 2021;
originally announced September 2021.
-
DistB-Condo: Distributed Blockchain-based IoT-SDN Model for Smart Condominium
Authors:
Anichur Rahman,
Md. Jahidul Islam,
Ziaur Rahman,
Md. Mahfuz Reza,
Adnan Anwar,
M. A. Parvez Mahmud,
Mostofa Kamal Nasir,
Rafidah Md Noor
Abstract:
A condominium network refers to intra-organization networks in which smart buildings or apartments are connected and share resources over the network. A secured communication platform or channel has been highlighted as a key requirement for a reliable condominium, which can be ensured by utilizing advanced techniques and platforms like Software-Defined Network (SDN), Network Function Virtualization (NFV) and Blockchain (BC). These technologies provide a robust and secured platform to meet all kinds of challenges, such as safety, confidentiality, flexibility, efficiency, and availability. This work suggests a distributed, scalable IoT-SDN with Blockchain-based NFV framework for a smart condominium (DistB-Condo) that can act as an efficient secured platform for a small community. Moreover, the Blockchain-based IoT-SDN with NFV framework provides the combined benefits of leading technologies. It also presents an optimized Cluster Head Selection (CHS) algorithm for selecting a Cluster Head (CH) among the clusters that efficiently saves energy. Besides, a decentralized and secured Blockchain approach has been introduced that provides stronger security and privacy to the desired condominium network. Our proposed approach also has the ability to detect attacks in an IoT environment. Finally, this article evaluates the performance of the proposed architecture using different parameters (e.g., throughput, packet arrival rate, and response time). The proposed approach outperforms the existing OF-Based SDN. DistB-Condo has better throughput on average, and its bandwidth (Mbps) is much higher than that of the OF-Based SDN approach in the presence of attacks. Also, the proposed model has an average response time 5% lower than that of the core model.
Submitted 17 December, 2020;
originally announced December 2020.
-
Using a Bi-directional LSTM Model with Attention Mechanism trained on MIDI Data for Generating Unique Music
Authors:
Ashish Ranjan,
Varun Nagesh Jolly Behera,
Motahar Reza
Abstract:
Generating music is an interesting and challenging problem in the field of machine learning. Mimicking human creativity has been popular in recent years, especially in the field of computer vision and image processing. With the advent of GANs, it is possible to generate new, similar images based on training data. But this cannot be done for music in the same way, as music has an extra temporal dimension. So it is necessary to understand how music is represented in digital form. When building models that perform this generative task, the learning and generation part is done in some high-level representation such as MIDI (Musical Instrument Digital Interface) or scores. This paper proposes a bi-directional LSTM (Long short-term memory) model with an attention mechanism capable of generating a similar type of music based on MIDI data. The music generated by the model follows the theme/style of the music the model is trained on. Also, due to the nature of MIDI, the tempo, instrument, and other parameters can be defined and changed after generation.
Submitted 2 November, 2020;
originally announced November 2020.
-
A Parallel Approach for Real-Time Face Recognition from a Large Database
Authors:
Ashish Ranjan,
Varun Nagesh Jolly Behera,
Motahar Reza
Abstract:
We present a new facial recognition system, capable of identifying a person, provided their likeness has been previously stored in the system, in real time. The system is based on storing and comparing facial embeddings of the subject, and identifying them later within a live video feed. This system is highly accurate, and is able to tag people with their ID in real time. It is able to do so, even when using a database containing thousands of facial embeddings, by using a parallelized searching technique. This makes the system quite fast and allows it to be highly scalable.
Submitted 1 November, 2020;
originally announced November 2020.
-
Graph based Clustering Algorithm for Social Community Transmission Prediction of COVID-19
Authors:
Varun Nagesh Jolly Behera,
Ashish Ranjan,
Motahar Reza
Abstract:
A system to model the spread of COVID-19 cases after lockdown is proposed, with the aim of defining new preventive measures based on hotspots using a graph clustering algorithm. This method allows for more lenient measures in areas less prone to the virus spread. There exist methods to model the spread of the virus by predicting the number of confirmed cases. The proposed system, however, focuses more on the preventive side of the solution from a geographical point of view, by predicting the areas or regions that may become hotspots for the virus in the near future. The fact that the virus can only be transmitted by being in close proximity to an already infected person suggests that the regions that can easily be reached from an existing hotspot have a higher chance of becoming a new hotspot. Moreover, in smaller regions, even after strict provisions, positive cases have been found. To account for this, the geographic distance to the nearest hotspots can be used as a measure of the likelihood of the region also becoming a hotspot. In this paper, a weighted graph of regions is constructed, with the regions themselves as weighted nodes, the number of active cases as the node weights, and the distances as edge weights. The graph can be completely connected or connected based on a distance threshold. The nodes are the administrative regions, and the distance measure indicates the possible transmission between separate communities. Using this data, the potential regions that can become hotspots can be predicted, and preventive measures can be devised.
Submitted 31 October, 2020;
originally announced November 2020.
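
The graph construction described in the COVID-19 abstract above (regions as nodes weighted by active cases, distances as edge weights, optionally pruned by a distance threshold) can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_region_graph`, the planar coordinates, the use of Euclidean distance in place of true geographic distance, and the sample regions are all hypothetical, and the clustering and hotspot-prediction steps are not shown.

```python
from itertools import combinations
from math import dist


def build_region_graph(regions, max_distance=None):
    """Build a weighted region graph.

    regions: dict mapping region name -> (x, y, active_cases).
    max_distance: if given, keep only edges no longer than this value;
                  if None, the graph is completely connected.
    Returns (node_weights, edges), where edges maps (a, b) -> distance.
    """
    node_weights = {name: cases for name, (_, _, cases) in regions.items()}
    edges = {}
    for a, b in combinations(sorted(regions), 2):
        # Assumption: straight-line distance on planar coordinates.
        d = dist(regions[a][:2], regions[b][:2])
        if max_distance is None or d <= max_distance:
            edges[(a, b)] = d
    return node_weights, edges


# Hypothetical regions: (x, y, active cases).
regions = {"A": (0, 0, 120), "B": (3, 4, 15), "C": (30, 40, 0)}

node_weights, full_edges = build_region_graph(regions)
_, near_edges = build_region_graph(regions, max_distance=10)
```

With the threshold applied, only region pairs close enough for plausible transmission stay connected, which is the role the abstract assigns to the distance measure.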
-
A Discrete Variational Recurrent Topic Model without the Reparametrization Trick
Authors:
Mehdi Rezaee,
Francis Ferraro
Abstract:
We show how to learn a neural topic model with discrete random variables---one that explicitly models each word's assigned topic---using neural variational inference that does not rely on stochastic backpropagation to handle the discrete variables. The model we utilize combines the expressive power of neural methods for representing sequences of text with the topic model's ability to capture global, thematic coherence. Using neural variational inference, we show improved perplexity and document understanding across multiple corpora. We examine the effect of prior parameters on both the model and the variational parameters, and demonstrate how our approach can compete with and surpass a popular topic model implementation on an automatic measure of topic quality.
Submitted 22 October, 2020;
originally announced October 2020.
-
A Novel Demodulation Scheme for Secure and Reliable UWB Distance Bounding
Authors:
Milad Rezaee,
Dave Singelee,
Bart Preneel
Abstract:
Relay attacks pose an important threat in wireless ranging and authentication systems. Distance bounding protocols have been proposed as an effective countermeasure against these attacks and allow a verifier and a prover to establish an upper bound on the distance between them. However, secure distance bounding protocols are hard to realize in practice due to stringent implementation requirements. In this paper, we look into a yet unexplored research area and show how the security strength of Ultra Wide Band (UWB) distance bounding protocols can be significantly increased by imposing several additional security constraints during demodulation and decoding at the receiver. We demonstrate that for equal reliability metrics as in state-of-the-art UWB distance bounding protocols, our solution achieves a reduction of the success probability of a relay attack by a factor of 40. Moreover, we also argue that our security solution only needs to be combined with pulse masking and a distance commitment to achieve these security bounds and there is no need to have pulse reordering in our modulation.
Submitted 26 October, 2020; v1 submitted 20 October, 2020;
originally announced October 2020.
-
Event Representation with Sequential, Semi-Supervised Discrete Variables
Authors:
Mehdi Rezaee,
Francis Ferraro
Abstract:
Within the context of event modeling and understanding, we propose a new method for neural sequence modeling that takes partially-observed sequences of discrete, external knowledge into account. We construct a sequential neural variational autoencoder, which uses Gumbel-Softmax reparametrization within a carefully defined encoder, to allow for successful backpropagation during training. The core idea is to allow semi-supervised external discrete knowledge to guide, but not restrict, the variational latent parameters during training. Our experiments indicate that our approach not only outperforms multiple baselines and the state-of-the-art in narrative script induction, but also converges more quickly.
Submitted 12 April, 2021; v1 submitted 9 October, 2020;
originally announced October 2020.
-
A Computational Model of Early Word Learning from the Infant's Point of View
Authors:
Satoshi Tsutsui,
Arjun Chandrasekaran,
Md Alimoor Reza,
David Crandall,
Chen Yu
Abstract:
Human infants have the remarkable ability to learn the associations between object names and visual objects from inherently ambiguous experiences. Researchers in cognitive science and developmental psychology have built formal models that implement in-principle learning algorithms, and then used pre-selected and pre-cleaned datasets to test the abilities of the models to find statistical regularities in the input data. In contrast to previous modeling approaches, the present study used egocentric video and gaze data collected from infant learners during natural toy play with their parents. This allowed us to capture the learning environment from the perspective of the learner's own point of view. We then used a Convolutional Neural Network (CNN) model to process sensory data from the infant's point of view and learn name-object associations from scratch. As the first model that takes raw egocentric video to simulate infant word learning, the present study provides a proof of principle that the problem of early word learning can be solved, using actual visual data perceived by infant learners. Moreover, we conducted simulation experiments to systematically determine how visual, perceptual, and attentional properties of infants' sensory experiences may affect word learning.
Submitted 4 June, 2020;
originally announced June 2020.
-
A Survey on Compressive Sensing: Classical Results and Recent Advancements
Authors:
Ahmad Mousavi,
Mehdi Rezaee,
Ramin Ayanzadeh
Abstract:
Recovering sparse signals from linear measurements has demonstrated outstanding utility in a vast variety of real-world applications. Compressive sensing is the topic that studies the associated raised questions for the possibility of a successful recovery. This topic is well-nourished and numerous results are available in the literature. However, their dispersity makes it challenging and time-consuming for readers and practitioners to quickly grasp its main ideas and classical algorithms, and further touch upon the recent advancements in this surging field. Besides, the sparsity notion has already demonstrated its effectiveness in many contemporary fields. Thus, these results are useful and inspiring for further investigation of related questions in these emerging fields from new perspectives. In this survey, we gather and overview vital classical tools and algorithms in compressive sensing and describe significant recent advancements. We conclude this survey by a numerical comparison of the performance of described approaches on an interesting application.
Submitted 22 July, 2020; v1 submitted 2 August, 2019;
originally announced August 2019.
-
Active Object Manipulation Facilitates Visual Object Learning: An Egocentric Vision Study
Authors:
Satoshi Tsutsui,
Dian Zhi,
Md Alimoor Reza,
David Crandall,
Chen Yu
Abstract:
Inspired by the remarkable ability of the infant visual learning system, a recent study collected first-person images from children to analyze the `training data' that they receive. We conduct a follow-up study that investigates two additional directions. First, given that infants can quickly learn to recognize a new object without much supervision (i.e. few-shot learning), we limit the number of training images. Second, we investigate how children control the supervision signals they receive during learning based on hand manipulation of objects. Our experimental results suggest that supervision with hand manipulation is better than without hands, and the trend is consistent even when a small number of images is available.
Submitted 4 June, 2019;
originally announced June 2019.
-
Large Scale Local Online Similarity/Distance Learning Framework based on Passive/Aggressive
Authors:
Baida Hamdan,
Davood Zabihzadeh,
Monsefi Reza
Abstract:
Similarity/distance measures play a key role in many machine learning, pattern recognition, and data mining algorithms, which has led to the emergence of the metric learning field. Many metric learning algorithms learn a global distance function from data that satisfy the constraints of the problem. However, in many real-world datasets where the discrimination power of features varies across the different regions of the input space, a global metric is often unable to capture the complexity of the task. To address this challenge, local metric learning methods have been proposed that learn multiple metrics across the different regions of the input space. Some advantages of these methods are high flexibility and the ability to learn a nonlinear mapping, but these are typically achieved at the expense of higher time requirements and overfitting. To overcome these challenges, this research presents an online multiple metric learning framework. Each metric in the proposed framework is composed of a global and a local component learned simultaneously. Adding a global component to a local metric efficiently reduces the problem of overfitting. The proposed framework is also scalable with both the sample size and the dimension of the input data. To the best of our knowledge, this is the first local online similarity/distance learning framework based on PA (Passive/Aggressive). In addition, for scalability with the dimension of the input data, DRP (Dual Random Projection) is extended for local online learning in the present work. This enables our methods to run efficiently on high-dimensional datasets while maintaining their predictive performance. The proposed framework provides a straightforward local extension to any global online similarity/distance learning algorithm based on PA.
△ Less
Submitted 5 April, 2018;
originally announced April 2018.
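For orientation, the kind of global passive-aggressive (PA) similarity update that such a framework extends can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: it uses a bilinear similarity S(a, b) = a^T W b with a triplet hinge loss (in the style of well-known PA-based online similarity learners), and the function name `pa_similarity_step`, the margin of 1, and the aggressiveness parameter `C` are all hypothetical choices made for the example. The local components and the DRP extension described in the abstract are not modeled.

```python
import numpy as np

def pa_similarity_step(W, p, p_pos, p_neg, C=0.1):
    """One passive-aggressive update of a bilinear similarity S(a, b) = a^T W b.

    Illustrative global PA step only (assumed form, not taken from the paper).
    """
    # Triplet hinge loss: p should be more similar to p_pos than to p_neg
    # by a margin of 1.
    loss = max(0.0, 1.0 - p @ W @ p_pos + p @ W @ p_neg)
    # Direction in which W must move to reduce the loss on this triplet.
    V = np.outer(p, p_pos - p_neg)
    norm_sq = float(np.sum(V * V))
    if loss == 0.0 or norm_sq == 0.0:
        return W, loss  # "passive": constraint already met (or no usable direction)
    # "Aggressive": step just far enough to satisfy the margin, capped by C.
    tau = min(C, loss / norm_sq)
    return W + tau * V, loss
```

With a large `C`, a single step moves `W` exactly far enough that the triplet's margin constraint is met; a small `C` trades that for robustness to noisy triplets.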
-
Prodorshok I: A Bengali Isolated Speech Dataset for Voice-Based Assistive Technologies - A comparative analysis of the effects of data augmentation on HMM-GMM and DNN classifiers
Authors:
Mohi Reza,
Warida Rashid,
Moin Mostakim
Abstract:
Prodorshok I is a Bengali isolated word dataset tailored to help create speaker-independent, voice-command driven automated speech recognition (ASR) based assistive technologies to help improve human-computer interaction (HCI). This paper presents the results of an objective analysis that was undertaken using a subset of words from Prodorshok I to assess its reliability in ASR systems that utilize Hidden Markov Models (HMM) with Gaussian emissions and Deep Neural Networks (DNN). The results show that simple data augmentation involving a small pitch shift can make surprisingly tangible improvements to accuracy levels in speech recognition.
Submitted 10 December, 2017;
originally announced December 2017.
-
The Optimal Route and Stops for a Group of Users in a Road Network
Authors:
Radi Muhammad Reza,
Mohammed Eunus Ali,
Muhammad Aamir Cheema
Abstract:
Recently, with the advancement of the GPS-enabled cellular technologies, the location-based services (LBS) have gained in popularity. Nowadays, an increasingly larger number of map-based applications enable users to ask a wider variety of queries. Researchers have studied the ride-sharing, the carpooling, the vehicle routing, and the collective travel planning problems extensively in recent years. Collective traveling has the benefit of being environment-friendly by reducing the global travel cost, the greenhouse gas emission, and the energy consumption. In this paper, we introduce several optimization problems to recommend a suitable route and stops of a vehicle, in a road network, for a group of users intending to travel collectively. The goal of each problem is to minimize the aggregate cost of the individual travelers' paths and the shared route under various constraints. First, we formulate the problem of determining the optimal pair of end-stops, given a set of queries that originate and terminate near the two prospective end regions. We outline a baseline polynomial-time algorithm and propose a new faster solution - both calculating an exact answer. In our approach, we utilize the path-coherence property of road networks to develop an efficient algorithm. Second, we define the problem of calculating the optimal route and intermediate stops of a vehicle that picks up and drops off passengers en-route, given its start and end stoppages, and a set of path queries from users. We outline an exact solution of both time and space complexities exponential in the number of queries. Then, we propose a novel polynomial-time-and-space heuristic algorithm that performs reasonably well in practice. We also analyze several variants of this problem under different constraints. Last, we perform extensive experiments that demonstrate the efficiency and accuracy of our algorithms.
Submitted 23 June, 2017;
originally announced June 2017.
-
On Optimal Online Algorithms for Energy Harvesting Systems with Continuous Energy and Data Arrivals
Authors:
Milad Rezaee,
Mahtab Mirmohseni,
Mohammad Reza Aref
Abstract:
Energy harvesting (EH) has been developed to extend the lifetimes of energy-limited communication systems. In this letter, we consider a single-user EH communication system, in which both of the arrival data and the harvested energy curves are modeled as general functions. Unlike most of the works in the field, we investigate the online algorithms which only acquire the causal information of the arrival data and the harvested energy processes. We study how well the optimal online algorithm works compared with the optimal offline algorithm, and thus our goal is to find the lower and upper bounds for the ratio of the completion time in the optimal online algorithm to the optimal offline algorithm. We propose two online algorithms which achieve the upper bound of 2 on this ratio. Also, we show that this ratio is 2 for the optimal online algorithm.
Submitted 19 January, 2017;
originally announced January 2017.
-
Optimal Transmission Policies for Multi-hop Energy Harvesting Systems
Authors:
Milad Rezaee,
Mahtab Mirmohseni,
Vaneet Aggarwal,
Mohammad Reza Aref
Abstract:
In this paper, we consider a multi-hop energy harvesting (EH) communication system in a full-duplex mode, where arrival data and harvested energy curves in the source and the relays are modeled as general functions. This model includes the EH system with discrete arrival processes as a special case. We investigate the throughput maximization problem considering minimum utilized energy in the source and relays and find the optimal offline algorithm. We show that the optimal solution of the two-hop transmission problem has three main steps: (i) solving a point-to-point throughput maximization problem at the source; (ii) solving a point-to-point throughput maximization problem at the relay (after applying the solution of the first step as the input of this second problem); (iii) minimizing utilized energy in the source. In addition, we show how the optimal algorithm for the completion time minimization problem can be derived from the proposed algorithm for the throughput maximization problem. Also, for the throughput maximization problem, we propose an online algorithm and show that it is more efficient than the benchmark one (which is a direct application of an existing point-to-point online algorithm to the multi-hop system).
Submitted 30 December, 2016;
originally announced December 2016.
-
Multiview RGB-D Dataset for Object Instance Detection
Authors:
Georgios Georgakis,
Md Alimoor Reza,
Arsalan Mousavian,
Phi-Hung Le,
Jana Kosecka
Abstract:
This paper presents a new multi-view RGB-D dataset of nine kitchen scenes, each containing several objects in realistic cluttered environments including a subset of objects from the BigBird dataset. The viewpoints of the scenes are densely sampled and objects in the scenes are annotated with bounding boxes and in the 3D point cloud. Also, an approach for detection and recognition is presented, which is comprised of two parts: i) a new multi-view 3D proposal generation method and ii) the development of several recognition baselines using AlexNet to score our proposals, which is trained either on crops of the dataset or on synthetically composited training images. Finally, we compare the performance of the object proposals and a detection baseline to the Washington RGB-D Scenes (WRGB-D) dataset and demonstrate that our Kitchen scenes dataset is more challenging for object detection and recognition. The dataset is available at: http://cs.gmu.edu/~robot/gmu-kitchens.html.
Submitted 25 September, 2016;
originally announced September 2016.
-
Reinforcement Learning for Semantic Segmentation in Indoor Scenes
Authors:
Md. Alimoor Reza,
Jana Kosecka
Abstract:
Future advancements in robot autonomy and sophistication of robotics tasks rest on robust, efficient, and task-dependent semantic understanding of the environment. Semantic segmentation is the problem of simultaneous segmentation and categorization of a partition of sensory data. The majority of current approaches tackle this using multi-class segmentation and labeling in a Conditional Random Field (CRF) framework or by generating multiple object hypotheses and combining them sequentially. In practical settings, the subset of semantic labels that is needed depends on the task and the particular scene, and labelling every single pixel is not always necessary. We pursue these observations in developing a more modular and flexible approach to multi-class parsing of RGBD data based on learning strategies for combining independent binary object-vs-background segmentations in place of the usual monolithic multi-label CRF approach. Parameters for the independent binary segmentation models can be learned very efficiently, and the combination strategy, learned using reinforcement learning, can be set independently and can vary over different tasks and environments. Accuracy is comparable to state-of-the-art methods on a subset of the NYU-V2 dataset of indoor scenes, while providing additional flexibility and modularity.
Submitted 3 June, 2016;
originally announced June 2016.
-
Energy Harvesting Systems with Continuous Energy and Data Arrivals: the Optimal Offline and a Heuristic Online Algorithms
Authors:
Milad Rezaee,
Mahtab Mirmohseni,
Mohammad Reza Aref
Abstract:
Energy harvesting has been developed as an effective technology for communication systems in order to extend the lifetime of these systems. In this work, we consider a single-user energy harvesting wireless communication system, in which arrival data and harvested energy curves are modeled as continuous functions. For the single-user model, our first goal is to find an offline algorithm that maximizes the amount of data transmitted to the receiver node by a given deadline. If more than one scheme exists that transmits the maximum data, we choose the one with minimum utilized energy at the transmitter node. Next, we propose an online algorithm for this system. We also consider a multi-hop energy harvesting wireless communication system in a full-duplex mode and find the optimal offline algorithm to maximize the throughput.
Submitted 22 January, 2016; v1 submitted 12 June, 2015;
originally announced June 2015.
-
Interference Alignment via Message-Passing
Authors:
Maxime Guillaud,
Mohsen Rezaee,
Gerald Matz
Abstract:
We introduce an iterative solution to the problem of interference alignment (IA) over MIMO channels based on a message-passing formulation. We propose a parameterization of the messages that enables the computation of IA precoders by a min-sum algorithm over continuous variable spaces -- under this parameterization, suitable approximations of the messages can be computed in closed-form. We show that the iterative leakage minimization algorithm of Cadambe et al. is a special case of our message-passing algorithm, obtained for a particular schedule. Finally, we show that the proposed algorithm compares favorably to iterative leakage minimization in terms of convergence speed, and discuss a distributed implementation.
Submitted 9 October, 2013;
originally announced October 2013.
-
Measuring the similarity of PML documents with RFID-based sensors
Authors:
Zhong-qin Wang,
Ning Ye,
Malekian Reza,
Ting-ting Zhao,
Ru-chuan Wang
Abstract:
The Electronic Product Code (EPC) Network is an important part of the Internet of Things. The Physical Mark-Up Language (PML) is used to represent and describe data related to objects in the EPC Network. The PML documents that each component uses to exchange data in the EPC Network system are XML documents based on the PML Core schema. To manage the huge number of PML documents of tags captured by Radio Frequency Identification (RFID) readers, it is necessary to develop high-performance technology, such as filtering and integrating these tag data. In this paper, we therefore propose an approach for measuring the similarity of PML documents based on a Bayesian Network of several sensors. With respect to the features of PML, when measuring the similarity, we first reduce the redundant data, except for the EPC information. On this basis, the Bayesian Network model derived from the structure of the PML documents being compared is constructed.
Submitted 12 September, 2013;
originally announced September 2013.
-
CSIT Sharing over Finite Capacity Backhaul for Spatial Interference Alignment
Authors:
Mohsen Rezaee,
Maxime Guillaud,
Fredrik Lindqvist
Abstract:
Cellular systems that employ time division duplexing (TDD) transmission are good candidates for implementation of interference alignment (IA) in the downlink since channel reciprocity enables the estimation of the channel state by the base stations (BS) in the uplink phase. However, the interfering BSs need to share their channel estimates via backhaul links of finite capacity. A quantization scheme is proposed which reduces the amount of information exchange (compared to conventional methods) required to achieve IA in a TDD system. The scaling (with the transmit power) of the number of bits to be exchanged between the BSs that is sufficient to preserve the multiplexing gain of IA is derived.
Submitted 5 February, 2013;
originally announced February 2013.
-
Interference Alignment with Quantized Grassmannian Feedback in the K-user Constant MIMO Interference Channel
Authors:
Mohsen Rezaee,
Maxime Guillaud
Abstract:
A simple channel state information (CSI) feedback scheme is proposed for interference alignment (IA) over the K-user constant Multiple-Input-Multiple-Output Interference Channel (MIMO IC). The proposed technique relies on the identification of invariants in the IA equations, which enables the reformulation of the CSI quantization problem as a single quantization on the Grassmann manifold at each receiver. The scaling of the number of feedback bits with the transmit power sufficient to preserve the multiplexing gain that can be achieved under perfect CSI is established. We show that the CSI feedback requirements of the proposed technique are better (lower) than what is required when using previously published methods, for system dimensions (number of users and antennas) of practical interest. Furthermore, we show through simulations that this advantage persists at low SNR, in the sense that the proposed technique yields a higher sum-rate performance for a given number of feedback bits. Finally, to complement our analysis, we introduce a statistical model that faithfully captures the properties of the quantization error obtained for random vector quantization (RVQ) on the Grassmann manifold for large codebooks; this enables the numerical (Monte-Carlo) analysis of general Grassmannian RVQ schemes for codebook sizes that would be impractically large to simulate.
Submitted 15 February, 2015; v1 submitted 30 July, 2012;
originally announced July 2012.
-
Transmission of Voice Signal: BER Performance Analysis of Different FEC Schemes Based OFDM System over Various Channels
Authors:
Md. Golam Rashed,
M. Hasnat Kabir,
Md. Selim Reza,
Md. Matiqul Islam,
Rifat Ara Shams,
Saleh Masum,
Sheikh Enayet Ullah
Abstract:
In this paper, we investigate the impact of Forward Error Correction (FEC) codes, namely Cyclic Redundancy Code and Convolution Code, on the performance of an OFDM wireless communication system for speech signal transmission over both AWGN and fading (Rayleigh and Rician) channels in terms of Bit Error Probability. The simulation has been done in conjunction with QPSK digital modulation and compared with uncoded results. In the fading channels, it is found via computer simulation that the Convolution interleaved OFDM system outperforms both the CRC interleaved OFDM system and the uncoded OFDM system.
Submitted 17 July, 2012;
originally announced July 2012.
-
An Approach of Digital Image Copyright Protection by Using Watermarking Technology
Authors:
Md. Selim Reza,
Mohammed Shafiul Alam Khan,
Md. Golam Robiul Alam,
Serajul Islam
Abstract:
A digital watermarking system is paramount for safeguarding valuable resources and information. Digital watermarks are generally imperceptible to the human eye and ear. Digital watermarks can be used in video, audio and digital images for a wide variety of applications such as copy prevention right management, authentication and filtering of internet content. The proposed system is able to protect copyright or owner identification of digital media, such as audio, image, video, or text. The system permutes the watermark and embeds the permuted watermark into the wavelet coefficients of the original image by using a key. The key is randomly generated and used to select the locations in the wavelet domain in which to embed the permuted watermark. Finally, the system combines the concepts of cryptography and digital watermarking techniques to implement a more secure digital watermarking system.
Submitted 28 May, 2012;
originally announced May 2012.
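The embedding pipeline summarized in this abstract (permute the watermark, then add it to key-selected wavelet coefficients) can be sketched roughly as below. Everything specific here is an assumption made for illustration and is not taken from the paper: a one-level Haar wavelet, additive embedding in a single detail subband, an integer `key` that seeds both the permutation and the location choice, the `strength` parameter, and non-blind extraction that compares against the original image. The helper names `haar2d`, `ihaar2d`, `embed` and `extract` are hypothetical, and the image is assumed to have even height and width.

```python
import numpy as np

def haar2d(x):
    """One-level orthonormal 2D Haar transform; returns four subbands."""
    s = np.sqrt(2.0)
    a = (x[0::2, :] + x[1::2, :]) / s   # sums of adjacent rows
    d = (x[0::2, :] - x[1::2, :]) / s   # differences of adjacent rows
    ll = (a[:, 0::2] + a[:, 1::2]) / s
    lh = (a[:, 0::2] - a[:, 1::2]) / s
    hl = (d[:, 0::2] + d[:, 1::2]) / s
    hh = (d[:, 0::2] - d[:, 1::2]) / s
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d."""
    s = np.sqrt(2.0)
    h, w = ll.shape
    a = np.empty((h, 2 * w))
    d = np.empty((h, 2 * w))
    a[:, 0::2], a[:, 1::2] = (ll + lh) / s, (ll - lh) / s
    d[:, 0::2], d[:, 1::2] = (hl + hh) / s, (hl - hh) / s
    x = np.empty((2 * h, 2 * w))
    x[0::2, :], x[1::2, :] = (a + d) / s, (a - d) / s
    return x

def embed(image, bits, key, strength=8.0):
    """Permute the watermark bits and add them at key-selected coefficients."""
    rng = np.random.default_rng(key)
    ll, lh, hl, hh = haar2d(np.asarray(image, dtype=float))
    perm = rng.permutation(bits.size)                           # key-dependent scrambling
    locs = rng.choice(hl.size, size=bits.size, replace=False)   # key-selected positions
    flat = hl.ravel().copy()
    flat[locs] += strength * (2.0 * bits[perm] - 1.0)           # bit 1 -> +strength, 0 -> -strength
    return ihaar2d(ll, lh, flat.reshape(hl.shape), hh)

def extract(marked, original, n_bits, key):
    """Non-blind extraction: compare coefficients with the original image."""
    rng = np.random.default_rng(key)
    hl_marked = haar2d(np.asarray(marked, dtype=float))[2]
    hl_orig = haar2d(np.asarray(original, dtype=float))[2]
    perm = rng.permutation(n_bits)                              # same draws as in embed
    locs = rng.choice(hl_orig.size, size=n_bits, replace=False)
    scrambled = ((hl_marked - hl_orig).ravel()[locs] > 0).astype(int)
    bits = np.empty(n_bits, dtype=int)
    bits[perm] = scrambled                                      # undo the permutation
    return bits
```

Without the key, neither the coefficient locations nor the bit order can be reproduced, which is the sense in which such a scheme borrows from cryptography. A practical system would also have to handle rounding to pixel values and robustness to compression, which this sketch ignores.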
-
Performance Evaluation of SCM-WDM System Using Different Linecoding
Authors:
Md. Shamim Reza,
Md. Maruf Hossain,
Adnan Ahmed Chowdhury,
S. M. Shamim Reza,
Md. Moshiur Rahman
Abstract:
This paper investigates the theoretical performance analysis for a subcarrier multiplexed (SCM) wavelength division multiplexing (WDM) optical transmission system in the presence of optical beat interference (OBI), which occurs during the photo detection process. We present a comparison for improving the performance of the SCM-WDM system in the presence of OBI. Non-return-to-zero (NRZ), Manchester and Miller code (MC) line coding are used for performance investigation of the SCM-WDM system. A suitable signal bandwidth is selected and 200 kHz is considered as the channel bandwidth. The power spectra of the signal and cross components for these line codes are analyzed. Comparison results are evaluated in terms of the signal to OBI ratio for the three line coding schemes, which is called the signal to interference ratio (SIR). It is found that there is a significant increase in the SIR by employing Miller code compared to NRZ and Manchester for the same data rate. For example, for 10 subcarriers, the achievable SIR is about -24 dB for the Miller coded system compared to -46 dB for the NRZ coded system and -49 dB for the Manchester coded system. The results agree satisfactorily with the expected results.
Submitted 26 April, 2010;
originally announced April 2010.
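To put the quoted SIR figures in perspective, the short calculation below converts the differences in dB to linear power ratios. It relies only on the approximate values stated in the abstract and on the standard definition of the decibel for power ratios; the symbol $\Delta$ is introduced here for illustration.

```latex
\begin{align*}
\Delta_{\mathrm{MC/NRZ}} &= -24\ \mathrm{dB} - (-46\ \mathrm{dB}) = 22\ \mathrm{dB}
  &&\Rightarrow\quad 10^{22/10} \approx 158 \\
\Delta_{\mathrm{MC/Manchester}} &= -24\ \mathrm{dB} - (-49\ \mathrm{dB}) = 25\ \mathrm{dB}
  &&\Rightarrow\quad 10^{25/10} \approx 316
\end{align*}
```

In linear terms, the Miller coded system's SIR at 10 subcarriers is therefore roughly 158 times that of the NRZ coded system and roughly 316 times that of the Manchester coded system.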
-
Evaluation of Burst Loss Rate of an Optical Burst Switching (OBS) Network with Wavelength Conversion Capability
Authors:
Md. Shamim Reza,
Md. Maruf Hossain,
Satya Prasad Majumder
Abstract:
This paper presents a new analytical model for calculating the burst loss rate (BLR) in a slotted optical burst switched network. The analytical result leads to a framework which provides guidelines for optical burst switched networks. A wavelength converter is used for burst contention resolution. The effect of several design parameters, such as burst arrival probability, wavelength conversion capability, number of slots per burst and number of wavelengths, on the above performance measure is incorporated. We also extend the analytical result for the BLR to different types of service classes, where each service class has a reserved number of wavelengths in a network with a fixed number of wavelengths. We also introduce an algorithm to calculate the resultant number of wavelengths for each service class depending on various scenarios.
Submitted 26 April, 2010;
originally announced April 2010.