-
ST-DPGAN: A Privacy-preserving Framework for Spatiotemporal Data Generation
Authors:
Wei Shao,
Rongyi Zhu,
Cai Yang,
Chandra Thapa,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Rui Zhang,
DuYong Kim,
Hamid Menouar,
Flora D. Salim
Abstract:
Spatiotemporal data is prevalent in a wide range of edge devices, such as those used in personal communication and financial transactions. Recent advancements have sparked a growing interest in integrating spatiotemporal analysis with large-scale language models. However, spatiotemporal data often contains sensitive information, making it unsuitable for open third-party access. To address this challenge, we propose a Graph-GAN-based model for generating privacy-protected spatiotemporal data. Our approach incorporates spatial and temporal attention blocks in the discriminator and a spatiotemporal deconvolution structure in the generator. These enhancements enable efficient training under Gaussian noise to achieve differential privacy. Extensive experiments conducted on three real-world spatiotemporal datasets validate the efficacy of our model. Our method provides a privacy guarantee while maintaining the data utility. The prediction model trained on our generated data maintains a competitive performance compared to the model trained on the original data.
Submitted 4 June, 2024;
originally announced June 2024.
-
Odd-frequency superconducting pairing and multiple Majorana edge modes in driven topological superconductors
Authors:
Eslam Ahmed,
Shun Tamura,
Yukio Tanaka,
Jorge Cayao
Abstract:
Majorana zero modes have been shown to be the simplest quasiparticles exhibiting pure odd-frequency pairing, an effect that has so far been theoretically established in the static regime. In this work we investigate the formation of Majorana modes and odd-frequency pairing in $p$-wave spin-polarized superconductors under a time-dependent drive. We first show that the driven system hosts multiple Majorana modes emerging at zero and $π$, whose formation can be controlled by an appropriate tuning of the drive frequency and chemical potential. Then we explore the induced pair correlations and find that odd-frequency spin-polarized $s$-wave pairing is broadly induced, acquiring large values in the presence of Majorana modes. We discover that, while odd-frequency pairing is proportional to $\sim1/ω$ in the presence of Majorana zero modes, it is proportional to $\sim 1/(ω-π\hbar/T)$ in the presence of Majorana $π$ modes, where $T$ is the periodicity of the drive. Furthermore, we find that the amount of odd-frequency pairing becomes larger when multiple Majorana modes appear but the overall divergent profile as a function of frequency remains. Our work thus paves the way for understanding the emergent pair correlations in driven topological superconductors.
Submitted 15 April, 2024;
originally announced April 2024.
-
Scientific machine learning for closure models in multiscale problems: a review
Authors:
Benjamin Sanderse,
Panos Stinis,
Romit Maulik,
Shady E. Ahmed
Abstract:
Closure problems are omnipresent when simulating multiscale systems, where some quantities and processes cannot be fully prescribed despite their effects on the simulation's accuracy. Recently, scientific machine learning approaches have been proposed as a way to tackle the closure problem, combining traditional (physics-based) modeling with data-driven (machine-learned) techniques, typically through enriching differential equations with neural networks. This paper reviews the different reduced model forms, distinguished by the degree to which they include known physics, and the different objectives of a priori and a posteriori learning. The importance of adhering to physical laws (such as symmetries and conservation laws) in choosing the reduced model form and choosing the learning method is discussed. The effect of spatial and temporal discretization and recent trends toward discretization-invariant models are reviewed. In addition, we make the connections between closure problems and several other research disciplines: inverse problems, Mori-Zwanzig theory, and multi-fidelity methods. In conclusion, much progress has been made with scientific machine learning approaches for solving closure problems, but many challenges remain. In particular, the generalizability and interpretability of learned models is a major issue that needs to be addressed further.
Submitted 5 March, 2024;
originally announced March 2024.
-
Malicious Package Detection using Metadata Information
Authors:
S. Halder,
M. Bewong,
A. Mahboubi,
Y. Jiang,
R. Islam,
Z. Islam,
R. Ip,
E. Ahmed,
G. Ramachandran,
A. Babar
Abstract:
Protecting software supply chains from malicious packages is paramount in the evolving landscape of software development. Attacks on the software supply chain involve attackers injecting harmful software into commonly used packages or libraries in a software repository. For instance, JavaScript uses Node Package Manager (NPM), and Python uses Python Package Index (PyPI) as their respective package repositories. In the past, NPM has had vulnerabilities such as the event-stream incident, where a malicious package was introduced into a popular NPM package, potentially impacting a wide range of projects. As the integration of third-party packages becomes increasingly ubiquitous in modern software development, accelerating the creation and deployment of applications, the need for a robust detection mechanism has become critical. On the other hand, due to the sheer volume of new packages being released daily, the task of identifying malicious packages presents a significant challenge. To address this issue, in this paper, we introduce a metadata-based malicious package detection model, MeMPtec. This model extracts a set of features from package metadata information. These extracted features are classified as either easy-to-manipulate (ETM) or difficult-to-manipulate (DTM) features based on monotonicity and restricted control properties. By utilising these metadata features, we not only improve the effectiveness of detecting malicious packages but also demonstrate the model's resistance to adversarial attacks in comparison with existing state-of-the-art approaches. Our experiments indicate a significant reduction in both false positives (up to 97.56%) and false negatives (up to 91.86%).
Submitted 12 February, 2024;
originally announced February 2024.
-
Numerical solution of the Newtonian plane Couette flow with linear dynamic wall slip
Authors:
Muner M. A. Hasan,
Ethar A. A. Ahmed,
Ahmed F. Ghaleb,
Moustafa S. Abou-Dina,
Georgios C. Georgiou
Abstract:
An efficient numerical approach based on weighted average finite differences is used to solve the Newtonian plane Couette flow with wall slip, obeying a dynamic slip law that generalizes the Navier slip law with the inclusion of a relaxation term. Slip is exhibited only along the fixed plate, and the motion is triggered by the motion of the other plate. Three different cases are considered for the motion of the moving plate, i.e., constant speed, oscillating speed, and a single-period sinusoidal speed. The velocity and the volumetric flow rate are calculated in all cases and comparisons are made with the results of other methods and available results in the literature. The numerical outcomes confirm the damping with time and the lagging effects arising from the Navier and dynamic wall slip conditions and demonstrate the hysteretic behavior of the slip velocity in following the harmonic boundary motion.
Submitted 8 February, 2024;
originally announced February 2024.
-
TTMFN: Two-stream Transformer-based Multimodal Fusion Network for Survival Prediction
Authors:
Ruiquan Ge,
Xiangyang Hu,
Rungen Huang,
Gangyong Jia,
Yaqi Wang,
Renshu Gu,
Changmiao Wang,
Elazab Ahmed,
Linyan Wang,
Juan Ye,
Ye Li
Abstract:
Survival prediction plays a crucial role in assisting clinicians with the development of cancer treatment protocols. Recent evidence shows that multimodal data can help in the diagnosis of cancer disease and improve survival prediction. Currently, deep learning-based approaches have experienced increasing success in survival prediction by integrating pathological images and gene expression data. However, most existing approaches overlook the intra-modality latent information and the complex inter-modality correlations. Furthermore, existing modalities do not fully exploit the immense representational capabilities of neural networks for feature aggregation and disregard the importance of relationships between features. Therefore, it is highly recommended to address these issues in order to enhance the prediction performance by proposing a novel deep learning-based method. We propose a novel framework named Two-stream Transformer-based Multimodal Fusion Network for survival prediction (TTMFN), which integrates pathological images and gene expression data. In TTMFN, we present a two-stream multimodal co-attention transformer module to take full advantage of the complex relationships between different modalities and the potential connections within the modalities. Additionally, we develop a multi-head attention pooling approach to effectively aggregate the feature representations of the two modalities. The experiment results on four datasets from The Cancer Genome Atlas demonstrate that TTMFN can achieve the best performance or competitive results compared to the state-of-the-art methods in predicting the overall survival of patients.
Submitted 12 November, 2023;
originally announced November 2023.
-
Stacked networks improve physics-informed training: applications to neural networks and deep operator networks
Authors:
Amanda A Howard,
Sarah H Murphy,
Shady E Ahmed,
Panos Stinis
Abstract:
Physics-informed neural networks and operator networks have shown promise for effectively solving equations modeling physical systems. However, these networks can be difficult or impossible to train accurately for some systems of equations. We present a novel multifidelity framework for stacking physics-informed neural networks and operator networks that facilitates training. We successively build a chain of networks, where the output at one step can act as a low-fidelity input for training the next step, gradually increasing the expressivity of the learned model. The equations imposed at each step of the iterative process can be the same or different (akin to simulated annealing). The iterative (stacking) nature of the proposed method allows us to progressively learn features of a solution that are hard to learn directly. Through benchmark problems including a nonlinear pendulum, the wave equation, and the viscous Burgers equation, we show how stacking can be used to improve the accuracy and reduce the required size of physics-informed neural networks and operator networks.
Submitted 20 November, 2023; v1 submitted 11 November, 2023;
originally announced November 2023.
-
Model-based script synthesis for fuzzing
Authors:
Zian Liu,
Chao Chen,
Muhammad Ejaz Ahmed,
Jun Zhang,
Dongxi Liu
Abstract:
Kernel fuzzing is important for finding critical kernel vulnerabilities. Fuzzing closed-source (e.g., Windows) operating system kernels is even more challenging due to the lack of source code. Existing approaches fuzz the kernel by modeling syscall sequences from traces or from static analysis of system code. However, a common limitation is that they do not learn and mutate the syscall sequences to reach different kernel states, which can potentially result in more bugs or crashes.
In this paper, we propose WinkFuzz, an approach to learn and mutate traced syscall sequences in order to reach different kernel states. WinkFuzz learns syscall dependencies from the trace, identifies potential syscalls in the trace that can have dependent subsequent syscalls, and applies the dependencies to insert more syscalls while preserving the dependencies into the trace. Then WinkFuzz fuzzes the synthesized new syscall sequence to find system crashes.
We applied WinkFuzz to four seed applications and found a total increase in syscall number of 70.8%, with a success rate of 61%, within three insert levels. The average time for tracing, dependency analysis, recovering model script, and synthesizing script was 600, 39, 34, and 129 seconds respectively. The instant fuzzing rate is 3742 syscall executions per second. However, the average fuzz efficiency dropped to 155 syscall executions per second when the initializing time, waiting time, and other factors were taken into account. We fuzzed each seed application for 24 seconds and, on average, obtained 12.25 crashes within that time frame.
Submitted 8 August, 2023;
originally announced August 2023.
-
SemDiff: Binary Similarity Detection by Diffing Key-Semantics Graphs
Authors:
Zian Liu,
Zhi Zhang,
Siqi Ma,
Dongxi Liu,
Jun Zhang,
Chao Chen,
Shigang Liu,
Muhammad Ejaz Ahmed,
Yang Xiang
Abstract:
Binary similarity detection is a critical technique that has been applied in many real-world scenarios where source code is not available, e.g., bug search, malware analysis, and code plagiarism detection. Existing works are ineffective in detecting similar binaries in cases where different compiling optimizations, compilers, source code versions, or obfuscation are deployed.
We observe that none of these cases changes a binary's key code behaviors, although they significantly modify its syntax and structure. With this key observation, we extract a set of key instructions from a binary to capture its key code behaviors. By detecting the similarity between two binaries' key instructions, we can address the ineffectiveness of existing works. Specifically, we translate each extracted key instruction into a self-defined key expression, generating a key-semantics graph based on the binary's control flow. Each node in the key-semantics graph denotes a key instruction, and the node attribute is the key expression. To quantify the similarity between two given key-semantics graphs, we first serialize each graph into a sequence of key expressions by topological sort. Then, we tokenize and concatenate key expressions to generate token lists. We calculate the locality-sensitive hash value for all token lists and quantify their similarity. We implement a prototype, called SemDiff, consisting of two modules: graph generation and graph diffing. The first module generates a pair of key-semantics graphs and the second module diffs the graphs. Our evaluation results show that overall, SemDiff outperforms state-of-the-art tools when detecting the similarity of binaries generated from different optimization levels, compilers, and obfuscations. SemDiff is also effective for library version search and finding similar vulnerabilities in firmware.
Submitted 2 August, 2023;
originally announced August 2023.
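The graph-diffing steps named in the SemDiff abstract (serialize each key-semantics graph by topological sort, tokenize the key expressions, hash the token lists, compare) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which locality-sensitive hash is used, so MinHash is swapped in here as one common choice, and the node names, key expressions, and function names are hypothetical.

```python
import hashlib
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def serialize(graph, expr):
    """Order nodes topologically and return their key expressions.

    graph maps each node to the set of its predecessors;
    expr maps each node to its (hypothetical) key expression string.
    """
    order = TopologicalSorter(graph).static_order()
    return [expr[node] for node in order]


def tokenize(expressions):
    """Split every key expression on whitespace and concatenate the tokens."""
    tokens = []
    for expression in expressions:
        tokens.extend(expression.split())
    return tokens


def minhash(tokens, num_hashes=64):
    """MinHash signature of a non-empty token list (assumed LSH scheme)."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{tok}".encode(), digest_size=8).digest(),
                "big",
            )
            for tok in tokens
        ))
    return signature


def similarity(sig_a, sig_b):
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Two toy graphs: n1 -> n2 -> n3, differing only in one operand.
graph = {"n2": {"n1"}, "n3": {"n2"}}
expr_a = {"n1": "load r1 mem", "n2": "add r1 r2", "n3": "store mem r1"}
expr_b = {"n1": "load r1 mem", "n2": "add r1 r3", "n3": "store mem r1"}

sig_a = minhash(tokenize(serialize(graph, expr_a)))
sig_b = minhash(tokenize(serialize(graph, expr_b)))
print(similarity(sig_a, sig_b))
```

Because MinHash treats the tokens as a set, this sketch ignores token order after serialization; a scheme that hashes token n-grams would retain more of the ordering that the topological sort provides.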
-
VulMatch: Binary-level Vulnerability Detection Through Signature
Authors:
Zian Liu,
Lei Pan,
Chao Chen,
Ejaz Ahmed,
Shigang Liu,
Jun Zhang,
Dongxi Liu
Abstract:
Similar vulnerabilities recur in real-world software products because of code reuse, especially in widely reused third-party code and libraries. Detecting repeating vulnerabilities like 1-day and N-day vulnerabilities is an important cyber security task. Unfortunately, the state-of-the-art methods suffer from poor performance because they detect patch existence instead of vulnerability existence and infer the vulnerability signature directly from binary code. In this paper, we propose VulMatch to extract precise vulnerability-related binary instructions to generate the vulnerability-related signature. VulMatch detects vulnerability existence based on binary signatures. Unlike previous approaches, VulMatch accurately locates vulnerability-related instructions by utilizing source and binary codes. Our experiments were conducted using over 1000 vulnerable instances across seven open-source projects. VulMatch significantly outperformed the baseline tools Asm2vec and Palmtree. Besides the performance advantages over the baseline tools, VulMatch offers a better feature by providing explainable reasons during vulnerability detection. Our empirical studies demonstrate that VulMatch detects fine-grained vulnerabilities that the state-of-the-art tools struggle with. Our experiment on commercial firmware demonstrates that VulMatch is able to find vulnerabilities in real-world scenarios.
Submitted 17 January, 2024; v1 submitted 1 August, 2023;
originally announced August 2023.
-
DeepMPR: Enhancing Opportunistic Routing in Wireless Networks through Multi-Agent Deep Reinforcement Learning
Authors:
Saeed Kaviani,
Bo Ryu,
Ejaz Ahmed,
Deokseong Kim,
Jae Kim,
Carrie Spiker,
Blake Harnden
Abstract:
Opportunistic routing relies on the broadcast capability of wireless networks. It brings higher reliability and robustness in highly dynamic and/or severe environments such as mobile or vehicular ad-hoc networks (MANETs/VANETs). To reduce the cost of broadcast, multicast routing schemes use the connected dominating set (CDS) or multi-point relaying (MPR) set to decrease the network overhead; hence, their selection algorithms are critical. Common MPR selection algorithms are heuristic, rely on coordination between nodes, need high computational power for large networks, and are difficult to tune for network uncertainties. In this paper, we use multi-agent deep reinforcement learning to design a novel MPR multicast routing technique, DeepMPR, which outperforms the OLSR MPR selection algorithm without requiring MPR announcement messages from the neighbors. Our evaluation results demonstrate the performance gains of our trained DeepMPR multicast forwarding policy compared to other popular techniques.
Submitted 16 June, 2023;
originally announced June 2023.
-
Open Source-based Over-The-Air 5G New Radio Sidelink Testbed
Authors:
Melissa Elkadi,
Doekseong Kim,
Ejaz Ahmed,
Moein Sadeghi,
Anh Le,
Paul Russell,
Bo Ryu
Abstract:
The focus of this paper is to demonstrate an over-the-air (OTA) 5G new radio (NR) sidelink communication prototype. 5G NR sidelink communications allow NR UEs to transfer data independently without the assistance of a base station (gNB), which enables V2X communications, including platooning, autonomous driving, sensor extension, industrial IoT, public safety communication and much more. Our design leverages the open-source OpenAirInterface5G (OAI) software, which operates on software-defined radios (SDRs) and can be easily extended for mesh networking. The software includes all signal processing components specified by the 3GPP 5G sidelink standards, including Low-Density Parity Check (LDPC) encoding/decoding, polar encoding/decoding, data and control multiplexing, modulation/demodulation, and orthogonal frequency-division multiplexing (OFDM) modulation/demodulation. It can be configured to operate with different bands, bandwidths, and antenna settings. The first milestone in this work was to demonstrate the completed Physical Sidelink Broadcast Channel (PSBCH) development, which conducts synchronization between a Synchronization Reference (SyncRef) UE and a nearby UE. The SyncRef UE broadcasts a sidelink synchronization signal block (S-SSB) periodically, which the nearby UE detects and uses to synchronize its timing and frequency components with the SyncRef UE. Once a connection is established, the next developmental milestone is to transmit real data (text messages) via the Physical Sidelink Shared Channel (PSSCH). Our PHY sidelink framework is tested using both an RF simulator and an OTA testbed with multiple nearby UEs. Beyond the development of synchronization and data transmission/reception in 5G sidelink, we conclude with various performance tests and validation experiments. The results of these metrics show that our simulator is comparable to the OTA testbed.
Submitted 6 October, 2023; v1 submitted 15 June, 2023;
originally announced June 2023.
-
STUDY: Socially Aware Temporally Causal Decoder Recommender Systems
Authors:
Eltayeb Ahmed,
Diana Mincu,
Lauren Harrell,
Katherine Heller,
Subhrajit Roy
Abstract:
Recommender systems are widely used to help people find items that are tailored to their interests. These interests are often influenced by social networks, making it important to use social network information effectively in recommender systems. This is especially true for demographic groups with interests that differ from the majority. This paper introduces STUDY, a Socially-aware Temporally caUsal Decoder recommender sYstem. STUDY introduces a new socially-aware recommender system architecture that is significantly more efficient to learn and train than existing methods. STUDY performs joint inference over socially connected groups in a single forward pass of a modified transformer decoder network. We demonstrate the benefits of STUDY in the recommendation of books for students who are dyslexic, or struggling readers. Dyslexic students often have difficulty engaging with reading material, making it critical to recommend books that are tailored to their interests. We worked with our non-profit partner Learning Ally to evaluate STUDY on a dataset of struggling readers. STUDY was able to generate recommendations that more accurately predicted student engagement, when compared with existing methods.
Submitted 5 September, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
On the dual advantage of placing observations through forward sensitivity analysis
Authors:
Shady E Ahmed,
Omer San,
Sivaramakrishnan Lakshmivarahan,
John M Lewis
Abstract:
The four-dimensional variational data assimilation methodology for assimilating noisy observations into a deterministic model has been the workhorse of forecasting centers for over three decades. While this method provides a computationally efficient framework for dynamic data assimilation, it is largely silent on the important question concerning the minimum number and placement of observations. To answer this question, we demonstrate the dual advantage of placing the observations where the square of the sensitivity of the model solution with respect to the unknown control variables, called forward sensitivities, attains its maximum. Therefore, we can force the observability Gramian to be of full rank, which in turn guarantees efficient recovery of the optimal values of the control variables, which is the first of the two advantages of this strategy. We further show that the proposed strategy of placing observations has another inherent optimality: the square of the sensitivity of the optimal estimates of the control with respect to the observations (used to obtain these estimates) attains its minimum value, a second advantage that is a direct consequence of the above strategy for placing observations. Our analytical framework and numerical experiments on linear and nonlinear systems confirm the effectiveness of our proposed strategy.
Submitted 29 April, 2023;
originally announced May 2023.
-
Socially Assistive Robots as Decision Makers in the Wild: Insights from a Participatory Design Workshop
Authors:
Eshtiak Ahmed,
Laura Cosio,
Juho Hamari,
Oğuz 'Oz' Buruk
Abstract:
Socially Assistive Robots (SARs) are becoming more popular every day because of their effectiveness in handling social situations. However, social robots are perceived as intelligent, and thus their decision-making process might have a significant effect on how they are perceived and how effective they are. In this paper, we present the findings from a participatory design (PD) study consisting of 5 design workshops with 30 participants, focusing on several decision-making scenarios of SARs in the wild. Based on the findings of the PD study, we discuss 5 directions that could aid the design of decision-making systems of SARs in the wild.
Submitted 18 April, 2023;
originally announced April 2023.
-
Forward Sensitivity Analysis and Mode Dependent Control for Closure Modeling of Galerkin Systems
Authors:
Shady E. Ahmed,
Omer San
Abstract:
Model reduction by projection-based approaches is often associated with losing some of the important features that contribute towards the dynamics of the retained scales. As a result, a mismatch occurs between the predicted trajectories of the original system and the truncated one. We put forth a framework to apply a continuous time control signal in the latent space of the reduced order model (ROM) to account for the effect of truncation. We set the control input using parameterized models by following energy transfer principles. Our methodology relies on observing the system behavior in the physical space and using the projection operator to restrict the feedback signal into the latent space. Then, we leverage the forward sensitivity method (FSM) to derive relationships between the feedback and the desired mode-dependent control. We test the performance of the proposed approach using two test cases, corresponding to viscous Burgers and vortex merger problems at high Reynolds number. Results show that the ROM trajectory with the applied FSM control closely matches its target values in both the data-dense and data-sparse regimes.
Submitted 10 April, 2023;
originally announced April 2023.
-
A Multifidelity deep operator network approach to closure for multiscale systems
Authors:
Shady E. Ahmed,
Panos Stinis
Abstract:
Projection-based reduced order models (PROMs) have shown promise in representing the behavior of multiscale systems using a small set of generalized (or latent) variables. Despite their success, PROMs can be susceptible to inaccuracies, even instabilities, due to the improper accounting of the interaction between the resolved and unresolved scales of the multiscale system (known as the closure problem). In the current work, we interpret closure as a multifidelity problem and use a multifidelity deep operator network (DeepONet) framework to address it. In addition, to enhance the stability and accuracy of the multifidelity-based closure, we employ the recently developed "in-the-loop" training approach from the literature on coupling physics and machine learning models. The resulting approach is tested on shock advection for the one-dimensional viscous Burgers equation and vortex merging using the two-dimensional Navier-Stokes equations. The numerical experiments show significant improvement of the predictive ability of the closure-corrected PROM over the un-corrected one both in the interpolative and the extrapolative regimes.
Submitted 1 June, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
Response to "On the giant deformation and ferroelectricity of guanidinium nitrate" by Marek Szafrański and Andrzej Katrusiak
Authors:
Durga Prasad Karothu,
Rodrigo Ferreira,
Ghada Dushaq,
Ejaz Ahmed,
Luca Catalano,
Jad Mahmoud Halabi,
Zainab Alhaddad,
Ibrahim Tahir,
Liang Li,
Sharmarke Mohamed,
Mahmoud Rasras,
Panče Naumov
Abstract:
Following a well-established practice of publishing commentaries to articles of other authors who work on materials that were earlier studied by them (n.b. six published comments[1-6]), Marek Szafrański (MS) and Andrzej Katrusiak (AK) have filed on the preprint server arXiv a manuscript entitled "On the giant deformation and ferroelectricity of guanidinium nitrate"[7] with comments on our article "Exceptionally high work density of a ferroelectric dynamic organic crystal around room temperature" published in Nature Communications (2022, 13, 2823).[8] Both in the submitted comment as well as in the required (by the journal) direct communication with us preceding its posting, MS and AK have expressed dissatisfaction with the choice of literature references in our article, for which they felt that their previous work on this material has not been cited to a sufficient extent. In their comment, they summarize their other remarks on our article as "the structural determinations of GN [guanidinium nitrate] crystals, their phase transitions and associated giant deformation, as well as its detailed structural mechanism, the molecular dynamics and dielectric properties were reported before, while the semiconductivity, ferroelectricity, and fatigue resistance of the GN [guanidinium nitrate] crystals cannot be confirmed."[7] Apart from the sentiments of MS and AK on our choice of cited literature, we find their comments on the scientific content of our article to be strongly biased towards their own results and unfounded. Below, we provide a detailed response to their comments.
Submitted 7 September, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
Beamforming and Device Selection Design in Federated Learning with Over-the-air Aggregation
Authors:
Faeze Moradi Kalarde,
Min Dong,
Ben Liang,
Yahia A. Eldemerdash Ahmed,
Ho Ting Cheng
Abstract:
Federated learning (FL) with over-the-air computation can efficiently utilize the communication bandwidth but is susceptible to analog aggregation error. Excluding those devices with weak channel conditions can reduce the aggregation error, but it also limits the amount of local training data for FL, which can reduce the training convergence rate. In this work, we jointly design uplink receiver beamforming and device selection for over-the-air FL over time-varying wireless channels to maximize the training convergence rate. We reformulate this stochastic optimization problem into a mixed-integer program using an upper bound on the global training loss over communication rounds. We then propose a Greedy Spatial Device Selection (GSDS) approach, which uses a sequential procedure to select devices based on a measure capturing both the channel strength and the channel correlation to the selected devices. We show that given the selected devices, the receiver beamforming optimization problem is equivalent to downlink single-group multicast beamforming. To reduce the computational complexity, we also propose an Alternating-optimization-based Device Selection and Beamforming (ADSBF) approach, which solves the receiver beamforming and device selection subproblems alternatingly. In particular, despite the device selection being an integer problem, we are able to develop an efficient algorithm to find its optimal solution.
Simulation results with real-world image classification demonstrate that our proposed methods achieve faster convergence with significantly lower computational complexity than existing alternatives. Furthermore, although ADSBF shows marginally inferior performance to GSDS, it offers the advantage of lower computational complexity when the number of devices is large.
Submitted 6 March, 2024; v1 submitted 28 February, 2023;
originally announced February 2023.
-
PyOED: An Extensible Suite for Data Assimilation and Model-Constrained Optimal Design of Experiments
Authors:
Abhijit Chowdhary,
Shady E. Ahmed,
Ahmed Attia
Abstract:
This paper describes PyOED, a highly extensible scientific package that enables developing and testing model-constrained optimal experimental design (OED) for inverse problems. Specifically, PyOED aims to be a comprehensive Python toolkit for model-constrained OED. The package targets scientists and researchers interested in understanding the details of OED formulations and approaches. It is also meant to enable researchers to experiment with standard and innovative OED technologies with a wide range of test problems (e.g., simulation models). OED, inverse problems (e.g., Bayesian inversion), and data assimilation (DA) are closely related research fields, and their formulations overlap significantly. Thus, PyOED is continuously being expanded with a plethora of Bayesian inversion, DA, and OED methods as well as new scientific simulation models, observation error models, and observation operators. These pieces are added such that they can be permuted to enable testing OED methods in various settings of varying complexities. The PyOED core is completely written in Python and utilizes the inherent object-oriented capabilities; however, the current version of PyOED is meant to be extensible rather than scalable. Specifically, PyOED is developed to enable rapid development and benchmarking of OED methods with minimal coding effort and to maximize code reutilization. This paper provides a brief description of the PyOED layout and philosophy and provides a set of exemplary test cases and tutorials to demonstrate the potential of the package.
Submitted 19 December, 2023; v1 submitted 19 January, 2023;
originally announced January 2023.
-
Nonintrusive reduced order modeling of convective Boussinesq flows
Authors:
Pedram H. Dabaghian,
Shady E. Ahmed,
Omer San
Abstract:
In this paper, we formulate three nonintrusive methods and systematically explore their performance in terms of the ability to reconstruct the quantities of interest and their predictive capabilities. The methods include deterministic dynamic mode decomposition (DMD), randomized DMD and nonlinear proper orthogonal decomposition (NLPOD). We apply these methods to a convection dominated fluid flow problem governed by the Boussinesq equations. We analyze the reconstruction results primarily at two different times, considering different noise levels synthetically added into the data snapshots. Overall, our results indicate that, with a proper selection of the number of retained modes and neural network architectures, all three approaches make predictions that are in good agreement with the full order model solution. However, we find that the NLPOD approach seems more robust for higher noise levels compared to both DMD approaches.
Submitted 14 December, 2022;
originally announced December 2022.
-
Unraveling Threat Intelligence Through the Lens of Malicious URL Campaigns
Authors:
Mahathir Almashor,
Ejaz Ahmed,
Benjamin Pick,
Sharif Abuadbba,
Jason Xue,
Raj Gaire,
Shuo Wang,
Seyit Camtepe,
Surya Nepal
Abstract:
The daily deluge of alerts is a sombre reality for Security Operations Centre (SOC) personnel worldwide. They are at the forefront of an organisation's cybersecurity infrastructure, and face the unenviable task of prioritising threats amongst a flood of abstruse alerts triggered by their Security Information and Event Management (SIEM) systems. URLs found within malicious communications form the bulk of such alerts, and pinpointing pertinent patterns within them allows teams to rapidly deescalate potential or extant threats. This need for vigilance has been traditionally filled with machine-learning based log analysis tools and anomaly detection concepts. To sidestep machine learning approaches, we instead propose to analyse suspicious URLs from SIEM alerts via the perspective of malicious URL campaigns. By first grouping URLs within 311M records gathered from VirusTotal into 2.6M suspicious clusters, we thereafter discovered 77.8K malicious campaigns. Corroborating our suspicions, we found 9.9M unique URLs attributable to 18.3K multi-URL campaigns, and that worryingly, only 2.97% of campaigns were found by security vendors. We also confer insights on evasive tactics such as ever lengthier URLs and more diverse domain names, with selected case studies exposing other adversarial techniques. By characterising the concerted campaigns driving these URL alerts, we hope to inform SOC teams of current threat trends, and thus arm them with better threat intelligence.
Submitted 26 August, 2022;
originally announced August 2022.
-
Physics Guided Machine Learning for Variational Multiscale Reduced Order Modeling
Authors:
Shady E. Ahmed,
Omer San,
Adil Rasheed,
Traian Iliescu,
Alessandro Veneziani
Abstract:
We propose a new physics guided machine learning (PGML) paradigm that leverages the variational multiscale (VMS) framework and available data to dramatically increase the accuracy of reduced order models (ROMs) at a modest computational cost. The hierarchical structure of the ROM basis and the VMS framework enable a natural separation of the resolved and unresolved ROM spatial scales. Modern PGML algorithms are used to construct novel models for the interaction among the resolved and unresolved ROM scales. Specifically, the new framework builds ROM operators that are closest to the true interaction terms in the VMS framework. Finally, machine learning is used to reduce the projection error and further increase the ROM accuracy. Our numerical experiments for a two-dimensional vorticity transport problem show that the novel PGML-VMS-ROM paradigm maintains the low computational cost of current ROMs, while significantly increasing the ROM accuracy.
Submitted 24 May, 2022;
originally announced May 2022.
-
Transformer-Based Language Models for Software Vulnerability Detection
Authors:
Chandra Thapa,
Seung Ick Jang,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Josef Pieprzyk,
Surya Nepal
Abstract:
Large transformer-based language models demonstrate excellent performance in natural language processing. By considering the transferability of the knowledge gained by these models in one domain to other related domains, and the closeness of natural languages to high-level programming languages, such as C/C++, this work studies how to leverage (large) transformer-based language models in detecting software vulnerabilities and how good these models are for vulnerability detection tasks. In this regard, firstly, a systematic (cohesive) framework that details source code translation, model preparation, and inference is presented. Then, an empirical analysis is performed with software vulnerability datasets with C/C++ source code having multiple vulnerabilities corresponding to the library function call, pointer usage, array usage, and arithmetic expression. Our empirical results demonstrate the good performance of the language models in vulnerability detection. Moreover, these language models have better performance metrics, such as F1-score, than the contemporary models, namely bidirectional long short-term memory and bidirectional gated recurrent unit. Experimenting with the language models is always challenging due to the requirement of computing resources, platforms, libraries, and dependencies. Thus, this paper also analyses the popular platforms to efficiently fine-tune these models and presents recommendations for choosing the platforms.
Submitted 5 September, 2022; v1 submitted 7 April, 2022;
originally announced April 2022.
-
Towards Web Phishing Detection Limitations and Mitigation
Authors:
Alsharif Abuadbba,
Shuo Wang,
Mahathir Almashor,
Muhammed Ejaz Ahmed,
Raj Gaire,
Seyit Camtepe,
Surya Nepal
Abstract:
Web phishing remains a serious cyber threat responsible for most data breaches. Machine Learning (ML)-based anti-phishing detectors are seen as an effective countermeasure, and are increasingly adopted by web-browsers and software products. However, with an average of 10K phishing links reported per hour to platforms such as PhishTank and VirusTotal (VT), the deficiencies of such ML-based solutions are laid bare. We first explore how phishing sites bypass ML-based detection with a deep dive into 13K phishing pages targeting major brands such as Facebook. Results show successful evasion is caused by: (1) use of benign services to obscure phishing URLs; (2) high similarity between the HTML structures of phishing and benign pages; (3) hiding the ultimate phishing content within JavaScript and running such scripts only on the client; (4) looking beyond typical credentials and credit cards for new content such as IDs and documents; (5) hiding phishing content until after human interaction. We attribute the root cause to the dependency of ML-based models on the vertical feature space (webpage content). These solutions rely only on what phishers present within the page itself. Thus, we propose Anti-SubtlePhish, a more resilient model based on logistic regression. The key augmentation is the inclusion of a horizontal feature space, which examines correlation variables between the final render of suspicious pages against what trusted services have recorded (e.g., PageRank). To defeat (1) and (2), we correlate information between WHOIS, PageRank, and page analytics. To combat (3), (4) and (5), we correlate features after rendering the page. Experiments with 100K phishing/benign sites show promising accuracy (98.8%). We also obtained 100% accuracy against 0-day phishing pages that were manually crafted, comparing well to the 0% recorded by VT vendors over the first four days.
Submitted 3 April, 2022;
originally announced April 2022.
-
Sketching Methods for Dynamic Mode Decomposition in Spherical Shallow Water Equations
Authors:
Shady E. Ahmed,
Omer San,
Diana A. Bistrian,
Ionel M. Navon
Abstract:
Dynamic mode decomposition (DMD) is an emerging methodology that has recently attracted computational scientists working on nonintrusive reduced order modeling. One of the major strengths that DMD possesses is having ground theoretical roots from the Koopman approximation theory. Indeed, DMD may be viewed as the data-driven realization of the famous Koopman operator. Nonetheless, the stable implementation of DMD incurs computing the singular value decomposition of the input data matrix. This, in turn, makes the process computationally demanding for high dimensional systems. In order to alleviate this burden, we develop a framework based on sketching methods, wherein a sketch of a matrix is simply another matrix which is significantly smaller, but still sufficiently approximates the original system. Such sketching or embedding is performed by applying random transformations, with certain properties, on the input matrix to yield a compressed version of the initial system. Hence, many of the expensive computations can be carried out on the smaller matrix, thereby accelerating the solution of the original problem. We conduct numerical experiments using the spherical shallow water equations as a prototypical model in the context of geophysical flows. The performance of several sketching approaches is evaluated for capturing the range and co-range of the data matrix. The proposed sketching-based framework can accelerate various portions of the DMD algorithm, compared to classical methods that operate directly on the raw input data. This eventually leads to substantial computational gains that are vital for digital twinning of high dimensional systems.
Submitted 11 January, 2022;
originally announced January 2022.
-
DeepCQ+: Robust and Scalable Routing with Multi-Agent Deep Reinforcement Learning for Highly Dynamic Networks
Authors:
Saeed Kaviani,
Bo Ryu,
Ejaz Ahmed,
Kevin Larson,
Anh Le,
Alex Yahja,
Jae H. Kim
Abstract:
Highly dynamic mobile ad-hoc networks (MANETs) remain one of the most challenging environments to develop and deploy robust, efficient, and scalable routing protocols. In this paper, we present the DeepCQ+ routing protocol, which in a novel manner integrates emerging multi-agent deep reinforcement learning (MADRL) techniques into existing Q-learning-based routing protocols and their variants and achieves persistently higher performance across a wide range of topology and mobility configurations. While keeping the overall protocol structure of the Q-learning-based routing protocols, DeepCQ+ replaces statically configured parameterized thresholds and hand-written rules with carefully designed MADRL agents such that no configuration of such parameters is required a priori. Extensive simulation shows that DeepCQ+ yields significantly increased end-to-end throughput with lower overhead and no apparent degradation of end-to-end delays (hop counts) compared to its Q-learning based counterparts. Qualitatively, and perhaps more significantly, DeepCQ+ maintains remarkably similar performance gains under many scenarios that it was not trained for in terms of network sizes, mobility conditions, and traffic dynamics. To the best of our knowledge, this is the first successful application of the MADRL framework for the MANET routing problem that demonstrates a high degree of scalability and robustness even under environments that are outside the trained range of scenarios. This implies that our MARL-based DeepCQ+ design solution significantly improves the performance of Q-learning based CQ+ baseline approach for comparison and increases its practicality and explainability because the real-world MANET environment will likely vary outside the trained range of MANET scenarios. Additional techniques to further increase the gains in performance and scalability are discussed.
Submitted 29 November, 2021;
originally announced November 2021.
-
Ridge-Type Shrinkage Estimators in Low and High Dimensional Beta Regression Model with Application in Econometrics and Medicine
Authors:
Ejaz Ahmed,
Reza Arabi Belaghi,
Yasin Asar,
Abdulkhadir Hussein
Abstract:
The beta regression model is useful in the analysis of bounded continuous outcomes such as proportions. It is well known that for any regression model, the presence of multicollinearity leads to poor performance of the maximum likelihood estimators. Ridge type estimators have been proposed to alleviate the adverse effects of multicollinearity. Furthermore, when some of the predictors have insignificant or weak effects on the outcomes, it is desirable to recover as much information as possible from these predictors instead of discarding them altogether. In this paper we propose ridge type shrinkage estimators for the low and high dimensional beta regression model, which address the above two issues simultaneously. We compute the biases and variances of the proposed estimators in closed form and use Monte Carlo simulations to evaluate their performance. The results show that, both in low and high dimensional data, the performance of the proposed estimators is superior to that of ridge estimators that discard weak or insignificant predictors. We conclude this paper by applying the proposed methods to two real data sets from econometrics and medicine.
Submitted 27 November, 2021;
originally announced November 2021.
-
NatiDroid: Cross-Language Android Permission Specification
Authors:
Chaoran Li,
Xiao Chen,
Ruoxi Sun,
Jason Xue,
Sheng Wen,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Yang Xiang
Abstract:
The Android system manages access to sensitive APIs by permission enforcement. An application (app) must declare proper permissions before invoking specific Android APIs. However, there is no official documentation providing the complete list of permission-protected APIs and the corresponding permissions to date. Researchers have spent significant efforts extracting such API protection mapping from the Android API framework, which leverages static code analysis to determine if specific permissions are required before accessing an API. Nevertheless, none of them has attempted to analyze the protection mapping in the native library (i.e., code written in C and C++), an essential component of the Android framework that handles communication with the lower-level hardware, such as cameras and sensors. While the protection mapping can be utilized to detect various security vulnerabilities in Android apps, such as permission over-privilege and component hijacking, imprecise mapping will lead to false results in detecting such security vulnerabilities. To fill this gap, we develop a prototype system, named NatiDroid, to facilitate the cross-language static analysis to benchmark against two state-of-the-art tools, termed Axplorer and Arcade. We evaluate NatiDroid on more than 11,000 Android apps, including system apps from custom Android ROMs and third-party apps from Google Play. Our NatiDroid can identify up to 464 new API-permission mappings, in contrast to the worst-case results derived from both Axplorer and Arcade, where approximately 71% of apps have at least one false positive in permission over-privilege and up to 3.6% of apps have at least one false negative in component hijacking. Additionally, we identify that 24 components with at least one Native-triggered component hijacking vulnerability are misidentified by two benchmarks.
Submitted 15 November, 2021;
originally announced November 2021.
-
Nonlinear proper orthogonal decomposition for convection-dominated flows
Authors:
Shady E. Ahmed,
Omer San,
Adil Rasheed,
Traian Iliescu
Abstract:
Autoencoder techniques find increasingly common use in reduced order modeling as a means to create a latent space. This reduced order representation offers a modular data-driven modeling approach for nonlinear dynamical systems when integrated with a time series predictive model. In this letter, we put forth a nonlinear proper orthogonal decomposition (POD) framework, which is an end-to-end Galerkin-free model combining autoencoders with long short-term memory networks for dynamics. By eliminating the projection error due to the truncation of Galerkin models, a key enabler of the proposed nonintrusive approach is the kinematic construction of a nonlinear mapping between the full-rank expansion of the POD coefficients and the latent space where the dynamics evolve. We test our framework for model reduction of a convection-dominated system, which is generally challenging for reduced order models. Our approach not only improves the accuracy, but also significantly reduces the computational cost of training and testing.
Submitted 5 November, 2021; v1 submitted 15 October, 2021;
originally announced October 2021.
-
A Tutorial on Trace-based Simulations of Mobile Ad-hoc Networks on the Example of Aeronautical Communications
Authors:
Musab Ahmed Eltayeb Ahmed,
Konrad Fuger,
Sebastian Lindner,
Fatema Khan,
Andreas Timm-Giel
Abstract:
The OMNeT++ simulator is well-suited for the simulation of randomized user behavior in communication networks. However, there are scenarios where such a random model is unsuited to evaluate a communication system, and this paper attempts to highlight such a case. Using this example of ad-hoc communication between aircraft mid-flight, a tutorial-style description is attempted that shall show how the OMNeT++ simulator can be used when a wealth of real-world trace data is available. In particular, it is described how mobility trace files can be directly used within OMNeT++, and how to link the generation of data messages to this mobility data. This is explained via an example simulation that evaluates a communication network in which an aircraft notifies the ground control when it enters or leaves a specific geographic region. Additionally, a novel trace-based application has been developed to achieve this link between mobility and message generation. Furthermore, a new TDMA-based medium access protocol for decentralized communication networks is presented, which is oracle-based and thus allows a TDMA-like behavior of medium access without causing any overhead; it can be useful when upper-layer protocols should be evaluated under the assumption of TDMA-like behavior, but isolated from the effects of a full-fledged TDMA protocol. Finally, physical layer behavior is often either overly simplistic or overly computationally expensive. For the latter case, when a detailed channel model is available but its evaluation requires prohibitive computational effort, then averaging its behavior into trace data can find a middle ground between efficient evaluation and realistic representation. Hence, a novel trace-based radio model has been developed that makes use of an SNR to PER mapping. In the spirit of open science, all implementations have been made available under open licenses.
Submitted 27 September, 2021;
originally announced September 2021.
-
Characterizing Malicious URL Campaigns
Authors:
Mahathir Almashor,
Ejaz Ahmed,
Benjamin Pick,
Sharif Abuadbba,
Raj Gaire,
Seyit Camtepe,
Surya Nepal
Abstract:
URLs are central to a myriad of cyber-security threats, from phishing to the distribution of malware. Their inherent ease of use and familiarity is continuously abused by attackers to evade defences and deceive end-users. Seemingly dissimilar URLs are being used in an organized way to perform phishing attacks and distribute malware. We refer to such behaviours as campaigns, with the hypothesis being that attacks are often coordinated to maximize success rates and develop evasion tactics. The aim is to gain better insights into campaigns, bolster our grasp of their characteristics, and thus help the community devise more robust solutions. To this end, we performed extensive research and analysis into 311M records containing 77M unique real-world URLs that were submitted to VirusTotal from Dec 2019 to Jan 2020. From this dataset, 2.6M suspicious campaigns were identified based on their attached metadata, of which 77,810 were doubly verified as malicious. Using the 38.1M records and 9.9M URLs within these malicious campaigns, we provide varied insights such as their targeted victim brands as well as URL sizes and heterogeneity. Some surprising findings were observed, such as detection rates falling to just 13.27% for campaigns that employ more than 100 unique URLs. The paper concludes with several case studies that illustrate the common malicious techniques employed by attackers to imperil users and circumvent defences.
Submitted 28 August, 2021;
originally announced August 2021.
-
Generating Cyber Threat Intelligence to Discover Potential Security Threats Using Classification and Topic Modeling
Authors:
Md Imran Hossen,
Ashraful Islam,
Farzana Anowar,
Eshtiak Ahmed,
Mohammad Masudur Rahman,
Xiali Hei
Abstract:
Due to the variety of cyber-attacks or threats, the cybersecurity community enhances the traditional security control mechanisms to an advanced level so that automated tools can counter potential security threats. Very recently, Cyber Threat Intelligence (CTI) has been presented as one of the proactive and robust mechanisms because of its automated cybersecurity threat prediction. Generally, CTI collects and analyses data from various sources, e.g., online security forums and social media where cyber enthusiasts, analysts, and even cybercriminals discuss cyber or computer security-related topics, and discovers potential threats based on the analysis. As the manual analysis of every such discussion (posts on online platforms) is time-consuming, inefficient, and susceptible to errors, CTI as an automated tool can perform uniquely to detect cyber threats. In this paper, we identify and explore relevant CTI from hacker forums utilizing different supervised (classification) and unsupervised learning (topic modeling) techniques. To this end, we collect data from a real hacker forum and construct two datasets: a binary dataset and a multi-class dataset. We then apply several classifiers along with deep neural network-based classifiers and use them on the datasets to compare their performances. We also employ the classifiers on a labeled leaked dataset as our ground truth. We further explore the datasets using unsupervised techniques. For this purpose, we leverage two topic modeling algorithms, namely Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF).
Submitted 14 November, 2022; v1 submitted 15 August, 2021;
originally announced August 2021.
-
On closures for reduced order models $-$ A spectrum of first-principle to machine-learned avenues
Authors:
Shady E. Ahmed,
Suraj Pawar,
Omer San,
Adil Rasheed,
Traian Iliescu,
Bernd R. Noack
Abstract:
For over a century, reduced order models (ROMs) have been a fundamental discipline of theoretical fluid mechanics. Early examples include Galerkin models inspired by the Orr-Sommerfeld stability equation and numerous vortex models, of which the von Kármán vortex street is one of the most prominent. Subsequent ROMs typically relied on first principles, like mathematical Galerkin models, weakly nonlinear stability theory, and two- and three-dimensional vortex models. Aubry et al. [N. Aubry, P. Holmes, J. Lumley, and E. Stone, Journal of Fluid Mechanics, 192, 115-173 (1988)] pioneered data-driven proper orthogonal decomposition (POD) modeling. In early POD modeling, available data was used to build an optimal basis, which was then utilized in a classical Galerkin procedure to construct the ROM. But data has made a profound impact on ROMs beyond the Galerkin expansion. In this paper, we take a modest step and illustrate the impact of data-driven modeling on one significant ROM area. Specifically, we focus on ROM closures, which are correction terms that are added to classical ROMs in order to model the effect of the discarded ROM modes in under-resolved simulations. Through simple examples, we illustrate the main modeling principles used to construct classical ROMs, motivate and introduce modern ROM closures, and show how data-driven modeling, artificial intelligence, and machine learning have changed the standard ROM methodology over the last two decades. Finally, we outline our vision on how state-of-the-art data-driven modeling can continue to reshape the field of reduced order modeling.
Submitted 23 August, 2021; v1 submitted 28 June, 2021;
originally announced June 2021.
-
Joint Linear Trend Recovery Using L1 Regularization
Authors:
Xiaoli Gao,
Ejaz Ahmed
Abstract:
This paper studies the recovery of a joint piece-wise linear trend from a time series using an L1 regularization approach, called L1 trend filtering (Kim, Koh and Boyd, 2009). We provide some sufficient conditions under which an L1 trend filter can be well-behaved in terms of mean estimation and change point detection. The result is two-fold: for the mean estimation, an almost optimal consistent rate is obtained; for the change point detection, the slope change in direction can be recovered with high probability. In addition, we show that the weak irrepresentable condition, a necessary condition for the LASSO model to be sign consistent (Zhao and Yu, 2006), is not necessary for the consistent change point detection. The performance of the L1 trend filter is evaluated by some finite sample simulation studies.
Submitted 30 April, 2021;
originally announced April 2021.
-
A Self-Supervised Auxiliary Loss for Deep RL in Partially Observable Settings
Authors:
Eltayeb Ahmed,
Luisa Zintgraf,
Christian A. Schroeder de Witt,
Nicolas Usunier
Abstract:
In this work we explore an auxiliary loss useful for reinforcement learning in environments where strong performing agents are required to be able to navigate a spatial environment. The auxiliary loss proposed is to minimize the classification error of a neural network classifier that predicts whether or not a pair of states sampled from the agent's current episode trajectory are in order. The classifier takes as input a pair of states as well as the agent's memory. The motivation for this auxiliary loss is that there is a strong correlation between which of a pair of states is more recent in the agent's episode trajectory and which of the two states is spatially closer to the agent. Our hypothesis is that learning features to answer this question encourages the agent to learn and internalize in memory representations of states that facilitate spatial reasoning. We tested this auxiliary loss on a navigation task in a gridworld and achieved a 9.6% increase in accumulative episode reward compared to a strong baseline approach.
Submitted 17 April, 2021;
originally announced April 2021.
-
Grand challenges and emergent modes of convergence science
Authors:
Alexander M. Petersen,
Mohammed E. Ahmed,
Ioannis Pavlidis
Abstract:
To address complex problems, scholars are increasingly faced with challenges of integrating diverse knowledge domains. We analyzed the evolution of this convergence paradigm in the broad ecosystem of brain science, which provides a real-time testbed for evaluating two modes of cross-domain integration - subject area exploration via expansive learning and cross-disciplinary collaboration among domain experts. We show that research involving both modes features a 16% citation premium relative to a mono-disciplinary baseline. Further comparison of research integrating neighboring versus distant research domains shows that the cross-disciplinary mode is essential for integrating across relatively large disciplinary distances. Yet we find research utilizing cross-domain subject area exploration alone - a convergence shortcut - to be growing in prevalence at roughly 3% per year, significantly faster than the alternative cross-disciplinary mode, despite being less effective at integrating domains and markedly less impactful. By measuring shifts in the prevalence and impact of different convergence modes in the 5-year intervals before and after 2013, our results indicate that these counterproductive patterns may relate to competitive pressures associated with global Human Brain flagship funding initiatives. Without additional policy guidance, such Grand Challenge flagships may unintentionally incentivize such convergence shortcuts, thereby undercutting the advantages of cross-disciplinary teams in tackling challenges calling on convergence.
Submitted 21 March, 2021;
originally announced March 2021.
-
Physical Reasoning Using Dynamics-Aware Models
Authors:
Eltayeb Ahmed,
Anton Bakhtin,
Laurens van der Maaten,
Rohit Girdhar
Abstract:
A common approach to solving physical reasoning tasks is to train a value learner on example tasks. A limitation of such an approach is that it requires learning about object dynamics solely from reward values assigned to the final state of a rollout of the environment. This study aims to address this limitation by augmenting the reward value with self-supervised signals about object dynamics. Specifically, we train the model to characterize the similarity of two environment rollouts, jointly with predicting the outcome of the reasoning task. This similarity can be defined as a distance measure between the trajectory of objects in the two rollouts, or learned directly from pixels using a contrastive formulation. Empirically, we find that this approach leads to substantial performance improvements on the PHYRE benchmark for physical reasoning (Bakhtin et al., 2019), establishing a new state-of-the-art.
Submitted 1 September, 2021; v1 submitted 20 February, 2021;
originally announced February 2021.
-
Modeling the process of speciation using a multi-scale framework including error estimates
Authors:
Mats K. Brun,
Elyes Ahmed,
Jan Martin Nordbotten,
Nils Christian Stenseth
Abstract:
This paper concerns the modeling and numerical simulation of the process of speciation. In particular, given conditions for which one or more speciation events within an ecosystem occur, our aim is to develop the necessary modeling and simulation tools. Care is also taken to establish a solid mathematical foundation on which our modeling framework is built. This is the subject of the first half of the paper. The second half is devoted to developing a multi-scale framework for eco-evolutionary modeling, where the relevant scales are that of species and individual/population, respectively. Hence, a system of interacting species can be described at the species level, while for branching species a population level description is necessary. Our multi-scale framework thus consists of coupling the species and population level models where speciation events are detected in advance and then resolved at the population scale until the branching is complete. Moreover, since the population level model is formulated as a PDE, we first establish the well-posedness in the time-discrete setting, and then derive the a posteriori error estimates, which provide a fully computable upper bound on an energy-type error, including also for the case of general smooth distributions (which will be useful for the detection of speciation events). Several numerical tests validate our framework in practice.
Submitted 7 October, 2021; v1 submitted 9 February, 2021;
originally announced February 2021.
-
Forward sensitivity analysis of the FitzHugh-Nagumo system: Parameter estimation
Authors:
Shady E. Ahmed,
Omer San,
Sivaramakrishnan Lakshmivarahan
Abstract:
The FitzHugh-Nagumo (FHN) model, from computational neuroscience, has attracted attention in nonlinear dynamics studies as it describes the behavior of excitable systems and exhibits interesting bifurcation properties. The accurate estimation of the model parameters is vital to understand how the solution trajectory evolves in time. To this end, we provide a forward sensitivity method (FSM) approach to quantify the main model parameters using sparse measurement data. FSM constitutes a variational data assimilation technique which integrates model sensitivities into the process of fitting the model to the observations. We analyse the applicability of FSM to update the FHN model parameters and predict its dynamical characteristics. Furthermore, we highlight a few guidelines for observations placement to control the shape of the cost functional and improve the parameter inference iterations.
Submitted 5 February, 2021;
originally announced February 2021.
-
Peeler: Profiling Kernel-Level Events to Detect Ransomware
Authors:
Muhammad Ejaz Ahmed,
Hyoungshick Kim,
Seyit Camtepe,
Surya Nepal
Abstract:
Ransomware is a growing threat that typically operates by either encrypting a victim's files or locking a victim's computer until the victim pays a ransom. However, it is still challenging to detect such malware in a timely manner with existing traditional malware detection techniques. In this paper, we present a novel ransomware detection system, called "Peeler" (Profiling kErnEl-Level Events to detect Ransomware). Peeler deviates from signatures for individual ransomware samples and relies on common and generic characteristics of ransomware depicted at the kernel-level. Analyzing diverse ransomware families, we observed ransomware's inherent behavioral characteristics such as stealth operations performed before the attack, file I/O request patterns, process spawning, and correlations among kernel-level events. Based on those characteristics, we develop Peeler that continuously monitors a target system's kernel events and detects ransomware attacks on the system. Our experimental results show that Peeler achieves more than 99% detection rate with 0.58% false-positive rate against 43 distinct ransomware families, containing samples from both crypto and screen-locker types of ransomware. For crypto ransomware, Peeler detects them promptly after only one file is lost (within 115 milliseconds on average). Peeler utilizes around 4.9% of CPU time with only 9.8 MB memory under the normal workload condition. Our analysis demonstrates that Peeler can efficiently detect diverse malware families by monitoring their kernel-level events.
Submitted 29 January, 2021;
originally announced January 2021.
-
A posteriori error estimates for hierarchical mixed-dimensional elliptic equations
Authors:
Jhabriel Varela,
Elyes Ahmed,
Eirik Keilegavlen,
Jan Martin Nordbotten,
Florin Adrian Radu
Abstract:
Mixed-dimensional elliptic equations exhibiting a hierarchical structure are commonly used to model problems with high aspect ratio inclusions, such as flow in fractured porous media. We derive general abstract estimates based on the theory of functional a posteriori error estimates, for which guaranteed upper bounds for the primal and dual variables and two-sided bounds for the primal-dual pair are obtained. We improve on the abstract results obtained with the functional approach by proposing four different ways of estimating the residual errors based on the extent the approximate solution has conservation properties, i.e.: (1) no conservation, (2) subdomain conservation, (3) grid-level conservation, and (4) exact conservation. This treatment results in sharper and fully computable estimates when mass is conserved either at the grid level or exactly, with a comparable structure to those obtained from grid-based a posteriori techniques. We demonstrate the practical effectiveness of our theoretical results through numerical experiments using four different discretization methods for synthetic problems and applications based on benchmarks of flow in fractured porous media.
Submitted 19 April, 2022; v1 submitted 20 January, 2021;
originally announced January 2021.
-
Hybrid analysis and modeling for next generation of digital twins
Authors:
Suraj Pawar,
Shady E. Ahmed,
Omer San,
Adil Rasheed
Abstract:
Physics-based modeling has been the workhorse for many decades in many scientific and engineering applications ranging from wind power and weather forecasting to aircraft design. Recently, data-driven models are increasingly becoming popular in many branches of science and engineering due to their non-intrusive nature and online learning capability. Despite the robust performance of data-driven models, they are faced with challenges of poor generalizability and difficulty in interpretation. These challenges have encouraged the integration of physics-based models with data-driven models, herein denoted hybrid analysis and modeling (HAM). We propose two different frameworks under the HAM paradigm for applications relevant to wind energy in order to bring physical realism within emerging digital twin technologies. The physics-guided machine learning (PGML) framework reduces the uncertainty of neural network predictions by embedding physics-based features from a simplified model at intermediate layers and its performance is demonstrated for the aerodynamic force prediction task. Our results show that the proposed PGML framework achieves approximately 75% reduction in uncertainty for smaller angles of attack. The interface learning (IL) framework illustrates how different solvers can be coupled to produce a multi-fidelity model and is successfully applied for the Boussinesq equations that govern a broad class of transport processes. The IL approach paves the way for seamless integration of multi-scale, multi-physics and multi-fidelity models (M^3 models).
Submitted 14 January, 2021;
originally announced January 2021.
-
Robust and Scalable Routing with Multi-Agent Deep Reinforcement Learning for MANETs
Authors:
Saeed Kaviani,
Bo Ryu,
Ejaz Ahmed,
Kevin A. Larson,
Anh Le,
Alex Yahja,
Jae H. Kim
Abstract:
Highly dynamic mobile ad-hoc networks (MANETs) are continuing to serve as one of the most challenging environments to develop and deploy robust, efficient, and scalable routing protocols. In this paper, we present DeepCQ+ routing which, in a novel manner, integrates emerging multi-agent deep reinforcement learning (MADRL) techniques into existing Q-learning-based routing protocols and their variants, and achieves persistently higher performance across a wide range of MANET configurations while training only on a limited range of network parameters and conditions. Quantitatively, DeepCQ+ shows consistently higher end-to-end throughput with lower overhead compared to its Q-learning-based counterparts with the overall gain of 10-15% in its efficiency. Qualitatively and more significantly, DeepCQ+ maintains remarkably similar performance gains under many scenarios that it was not trained for in terms of network sizes, mobility conditions, and traffic dynamics. To the best of our knowledge, this is the first successful demonstration of MADRL for the MANET routing problem that achieves and maintains a high degree of scalability and robustness even in the environments that are outside the trained range of scenarios. This implies that the proposed hybrid design approach of DeepCQ+ that combines MADRL and Q-learning significantly increases its practicality and explainability because the real-world MANET environment will likely vary outside the trained range of MANET scenarios.
Submitted 28 March, 2021; v1 submitted 8 January, 2021;
originally announced January 2021.
-
Decamouflage: A Framework to Detect Image-Scaling Attacks on Convolutional Neural Networks
Authors:
Bedeuro Kim,
Alsharif Abuadbba,
Yansong Gao,
Yifeng Zheng,
Muhammad Ejaz Ahmed,
Hyoungshick Kim,
Surya Nepal
Abstract:
As an essential processing step in computer vision applications, image resizing or scaling, more specifically downsampling, has to be applied before feeding a normally large image into a convolutional neural network (CNN) model because CNN models typically take small fixed-size images as inputs. However, image scaling functions could be adversarially abused to perform a newly revealed attack called image-scaling attack, which can affect a wide range of computer vision applications building upon image-scaling functions.
This work presents an image-scaling attack detection framework, termed Decamouflage. Decamouflage consists of three independent detection methods: (1) rescaling, (2) filtering/pooling, and (3) steganalysis. While each of these three methods is efficient standalone, they can work in an ensemble manner not only to improve the detection accuracy but also to harden against potential adaptive attacks. Decamouflage has a pre-determined detection threshold that is generic. More precisely, as we have validated, the threshold determined from one dataset is also applicable to other different datasets. Extensive experiments show that Decamouflage achieves detection accuracy of 99.9% and 99.8% in the white-box (with the knowledge of attack algorithms) and the black-box (without the knowledge of attack algorithms) settings, respectively. To corroborate the efficiency of Decamouflage, we have also measured its run-time overhead on a personal computer with an i5 CPU and found that Decamouflage can detect image-scaling attacks in milliseconds. Overall, Decamouflage can accurately detect image-scaling attacks in both white-box and black-box settings with acceptable run-time overhead.
Submitted 7 October, 2020;
originally announced October 2020.
-
Interface learning in fluid dynamics: statistical inference of closures within micro-macro coupling models
Authors:
Suraj Pawar,
Shady E. Ahmed,
Omer San
Abstract:
Many complex multiphysics systems in fluid dynamics involve using solvers with varied levels of approximations in different regions of the computational domain to resolve multiple spatiotemporal scales present in the flow. The accuracy of the solution is governed by how the information is exchanged between these solvers at the interface and several methods have been devised for such coupling problems. In this article, we construct a data-driven model by spatially coupling a microscale lattice Boltzmann method (LBM) solver and macroscale finite difference method (FDM) solver for reaction-diffusion systems. The coupling between the micro-macro solvers has a one-to-many mapping at the interface, leading to the interface closure problem, and we propose a statistical inference method based on neural networks to learn this closure relation. The performance of the proposed framework in a bifidelity setting partitioned between the FDM and LBM domain shows its promise for complex systems where analytical relations between micro-macro solvers are not available.
Submitted 10 August, 2020;
originally announced August 2020.
-
A nudged hybrid analysis and modeling approach for realtime wake-vortex transport and decay prediction
Authors:
Shady Ahmed,
Suraj Pawar,
Omer San,
Adil Rasheed,
Mandar Tabib
Abstract:
We put forth a long short-term memory (LSTM) nudging framework for the enhancement of reduced order models (ROMs) of fluid flows utilizing noisy measurements for air traffic improvements. Toward emerging applications of digital twins in aviation, the proposed approach allows for constructing a realtime predictive tool for wake-vortex transport and decay systems. We build on the fact that in realistic applications, there are uncertainties in initial and boundary conditions, model parameters, as well as measurements. Moreover, conventional nonlinear ROMs based on Galerkin projection (GROMs) suffer from imperfection and solution instabilities, especially for advection-dominated flows with slow decay in the Kolmogorov width. In the presented LSTM nudging (LSTM-N) approach, we fuse forecasts from a combination of imperfect GROM and uncertain state estimates, with sparse Eulerian sensor measurements to provide more reliable predictions in a dynamical data assimilation framework. We illustrate our concept by solving a two-dimensional vorticity transport equation. We investigate the effects of measurement noise and state estimate uncertainty on the performance of the LSTM-N approach. We also demonstrate that it can sufficiently handle different levels of temporal and spatial measurement sparsity, and offers huge potential in developing next-generation digital twin technologies.
Submitted 5 March, 2021; v1 submitted 5 August, 2020;
originally announced August 2020.
-
Multifidelity Computing for Coupling Full and Reduced Order Models
Authors:
Shady E. Ahmed,
Omer San,
Kursat Kara,
Rami Younis,
Adil Rasheed
Abstract:
Hybrid physics-machine learning models are increasingly being used in simulations of transport processes. Many complex multiphysics systems relevant to scientific and engineering applications include multiple spatiotemporal scales and comprise a multifidelity problem sharing an interface between various formulations or heterogeneous computational entities. To this end, we present a robust hybrid analysis and modeling approach combining a physics-based full order model (FOM) and a data-driven reduced order model (ROM) to form the building blocks of an integrated approach among mixed fidelity descriptions toward predictive digital twin technologies. At the interface, we introduce a long short-term memory network to bridge these high- and low-fidelity models in various forms of interfacial error correction or prolongation. The proposed interface learning approaches are tested as a new way to address ROM-FOM coupling problems in nonlinear advection-diffusion flow situations with a bifidelity setup that captures the essence of a broad class of transport processes.
Submitted 12 February, 2021; v1 submitted 13 July, 2020;
originally announced July 2020.
-
Interface learning of multiphysics and multiscale systems
Authors:
Shady E. Ahmed,
Omer San,
Kursat Kara,
Rami Younis,
Adil Rasheed
Abstract:
Complex natural or engineered systems comprise multiple characteristic scales, multiple spatiotemporal domains, and even multiple physical closure laws. To address such challenges, we introduce an interface learning paradigm and put forth a data-driven closure approach based on memory embedding to provide physically correct boundary conditions at the interface. To enable interface learning for hyperbolic systems by taking the domain of influence and wave structures into account, we put forth the concept of upwind learning towards a physics-informed domain decomposition. The promise of the proposed approach is shown for a set of canonical illustrative problems. We highlight that high-performance computing environments can benefit from this methodology to reduce communication costs among processing units in emerging machine-learning-ready heterogeneous platforms toward the exascale era.
Submitted 31 October, 2020; v1 submitted 17 June, 2020;
originally announced June 2020.
-
COVID-19: Social Media Sentiment Analysis on Reopening
Authors:
Mohammed Emtiaz Ahmed,
Md Rafiqul Islam Rabin,
Farah Naz Chowdhury
Abstract:
The novel coronavirus (COVID-19) pandemic is the most talked-about topic on social media platforms in 2020. People are using social media such as Twitter to express their opinions and share information on a number of issues related to COVID-19 during this stay-at-home order. In this paper, we investigate the sentiment and emotion of people in the United States on the subject of reopening. We choose the social media platform Twitter for our analysis and study the Tweets to discover the sentimental perspective, emotional perspective, and triggering words towards the reopening. During this COVID-19 pandemic, researchers have analysed various social media datasets regarding lockdown and stay-at-home orders. However, in our analysis, we are particularly interested in analysing public sentiment on reopening. Our major finding is that when all states resorted to lockdown in March, people showed a dominant emotion of fear, but as reopening starts people have less fear. While this may be true, daily positive cases are rising during this reopening phase compared to the lockdown situation. Overall, people have a less negative sentiment towards the situation of reopening.
Submitted 1 June, 2020;
originally announced June 2020.