-
ST-DPGAN: A Privacy-preserving Framework for Spatiotemporal Data Generation
Authors:
Wei Shao,
Rongyi Zhu,
Cai Yang,
Chandra Thapa,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Rui Zhang,
DuYong Kim,
Hamid Menouar,
Flora D. Salim
Abstract:
Spatiotemporal data is prevalent in a wide range of edge devices, such as those used in personal communication and financial transactions. Recent advancements have sparked a growing interest in integrating spatiotemporal analysis with large-scale language models. However, spatiotemporal data often contains sensitive information, making it unsuitable for open third-party access. To address this challenge, we propose a Graph-GAN-based model for generating privacy-protected spatiotemporal data. Our approach incorporates spatial and temporal attention blocks in the discriminator and a spatiotemporal deconvolution structure in the generator. These enhancements enable efficient training under Gaussian noise to achieve differential privacy. Extensive experiments conducted on three real-world spatiotemporal datasets validate the efficacy of our model. Our method provides a privacy guarantee while maintaining the data utility. The prediction model trained on our generated data maintains a competitive performance compared to the model trained on the original data.
Submitted 4 June, 2024;
originally announced June 2024.
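The ST-DPGAN abstract says only that training runs "under Gaussian noise to achieve differential privacy". As a rough illustration of what such a step commonly looks like, the sketch below clips each per-example gradient and adds Gaussian noise before averaging, in the style of generic DP-SGD. It is an assumption about the general technique, not the authors' implementation; the names `clip_gradient`, `noisy_mean_gradient`, `max_norm`, and `noise_multiplier` are hypothetical.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Rescale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def noisy_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    The noise standard deviation is noise_multiplier * max_norm, the usual
    Gaussian-mechanism calibration for a sum with L2 sensitivity max_norm.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        for i, g in enumerate(clip_gradient(grad, max_norm)):
            total[i] += g
    sigma = noise_multiplier * max_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [n / len(per_example_grads) for n in noisy]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so the demo is repeatable
    grads = [[3.0, 4.0], [0.3, 0.4]]
    print(noisy_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single record can move the update, which is what lets the added noise translate into a privacy guarantee; the actual privacy budget depends on the noise multiplier, sampling rate, and number of steps, none of which the abstract specifies.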
-
Malicious Package Detection using Metadata Information
Authors:
S. Halder,
M. Bewong,
A. Mahboubi,
Y. Jiang,
R. Islam,
Z. Islam,
R. Ip,
E. Ahmed,
G. Ramachandran,
A. Babar
Abstract:
Protecting software supply chains from malicious packages is paramount in the evolving landscape of software development. Attacks on the software supply chain involve attackers injecting harmful software into commonly used packages or libraries in a software repository. For instance, JavaScript uses Node Package Manager (NPM), and Python uses Python Package Index (PyPI) as their respective package repositories. In the past, NPM has had vulnerabilities such as the event-stream incident, where a malicious package was introduced into a popular NPM package, potentially impacting a wide range of projects. As the integration of third-party packages becomes increasingly ubiquitous in modern software development, accelerating the creation and deployment of applications, the need for a robust detection mechanism has become critical. On the other hand, due to the sheer volume of new packages being released daily, the task of identifying malicious packages presents a significant challenge. To address this issue, in this paper, we introduce a metadata-based malicious package detection model, MeMPtec. This model extracts a set of features from package metadata information. These extracted features are classified as either easy-to-manipulate (ETM) or difficult-to-manipulate (DTM) features based on monotonicity and restricted control properties. By utilising these metadata features, we not only improve the effectiveness of detecting malicious packages but also demonstrate the model's resistance to adversarial attacks in comparison with existing state-of-the-art approaches. Our experiments indicate a significant reduction in both false positives (up to 97.56%) and false negatives (up to 91.86%).
Submitted 12 February, 2024;
originally announced February 2024.
-
TTMFN: Two-stream Transformer-based Multimodal Fusion Network for Survival Prediction
Authors:
Ruiquan Ge,
Xiangyang Hu,
Rungen Huang,
Gangyong Jia,
Yaqi Wang,
Renshu Gu,
Changmiao Wang,
Elazab Ahmed,
Linyan Wang,
Juan Ye,
Ye Li
Abstract:
Survival prediction plays a crucial role in assisting clinicians with the development of cancer treatment protocols. Recent evidence shows that multimodal data can help in the diagnosis of cancer disease and improve survival prediction. Currently, deep learning-based approaches have experienced increasing success in survival prediction by integrating pathological images and gene expression data. However, most existing approaches overlook the intra-modality latent information and the complex inter-modality correlations. Furthermore, existing modalities do not fully exploit the immense representational capabilities of neural networks for feature aggregation and disregard the importance of relationships between features. Therefore, it is highly recommended to address these issues in order to enhance the prediction performance by proposing a novel deep learning-based method. We propose a novel framework named Two-stream Transformer-based Multimodal Fusion Network for survival prediction (TTMFN), which integrates pathological images and gene expression data. In TTMFN, we present a two-stream multimodal co-attention transformer module to take full advantage of the complex relationships between different modalities and the potential connections within the modalities. Additionally, we develop a multi-head attention pooling approach to effectively aggregate the feature representations of the two modalities. The experiment results on four datasets from The Cancer Genome Atlas demonstrate that TTMFN can achieve the best performance or competitive results compared to the state-of-the-art methods in predicting the overall survival of patients.
Submitted 12 November, 2023;
originally announced November 2023.
-
Stacked networks improve physics-informed training: applications to neural networks and deep operator networks
Authors:
Amanda A Howard,
Sarah H Murphy,
Shady E Ahmed,
Panos Stinis
Abstract:
Physics-informed neural networks and operator networks have shown promise for effectively solving equations modeling physical systems. However, these networks can be difficult or impossible to train accurately for some systems of equations. We present a novel multifidelity framework for stacking physics-informed neural networks and operator networks that facilitates training. We successively build a chain of networks, where the output at one step can act as a low-fidelity input for training the next step, gradually increasing the expressivity of the learned model. The equations imposed at each step of the iterative process can be the same or different (akin to simulated annealing). The iterative (stacking) nature of the proposed method allows us to progressively learn features of a solution that are hard to learn directly. Through benchmark problems including a nonlinear pendulum, the wave equation, and the viscous Burgers equation, we show how stacking can be used to improve the accuracy and reduce the required size of physics-informed neural networks and operator networks.
Submitted 20 November, 2023; v1 submitted 11 November, 2023;
originally announced November 2023.
-
Model-based Script Synthesis for Fuzzing
Authors:
Zian Liu,
Chao Chen,
Muhammad Ejaz Ahmed,
Jun Zhang,
Dongxi Liu
Abstract:
Kernel fuzzing is important for finding critical kernel vulnerabilities. Closed-source (e.g., Windows) operating system kernel fuzzing is even more challenging due to the lack of source code. Existing approaches fuzz the kernel by modeling syscall sequences from traces or static analysis of system code. However, a common limitation is that they do not learn and mutate the syscall sequences to reach different kernel states, which can potentially result in more bugs or crashes.
In this paper, we propose WinkFuzz, an approach to learn and mutate traced syscall sequences in order to reach different kernel states. WinkFuzz learns syscall dependencies from the trace, identifies potential syscalls in the trace that can have dependent subsequent syscalls, and applies the dependencies to insert more syscalls into the trace while preserving those dependencies. Then WinkFuzz fuzzes the synthesized new syscall sequence to find system crashes.
We applied WinkFuzz to four seed applications and found a total increase in syscall number of 70.8%, with a success rate of 61%, within three insert levels. The average time for tracing, dependency analysis, recovering model script, and synthesizing script was 600, 39, 34, and 129 seconds respectively. The instant fuzzing rate is 3742 syscall executions per second. However, the average fuzz efficiency dropped to 155 syscall executions per second when the initializing time, waiting time, and other factors were taken into account. We fuzzed each seed application for 24 seconds and, on average, obtained 12.25 crashes within that time frame.
Submitted 8 August, 2023;
originally announced August 2023.
-
SemDiff: Binary Similarity Detection by Diffing Key-Semantics Graphs
Authors:
Zian Liu,
Zhi Zhang,
Siqi Ma,
Dongxi Liu,
Jun Zhang,
Chao Chen,
Shigang Liu,
Muhammad Ejaz Ahmed,
Yang Xiang
Abstract:
Binary similarity detection is a critical technique that has been applied in many real-world scenarios where source code is not available, e.g., bug search, malware analysis, and code plagiarism detection. Existing works are ineffective in detecting similar binaries in cases where different compiling optimizations, compilers, source code versions, or obfuscation are deployed.
We observe that none of these cases changes a binary's key code behaviors, although they significantly modify its syntax and structure. With this key observation, we extract a set of key instructions from a binary to capture its key code behaviors. By detecting the similarity between two binaries' key instructions, we can address the ineffectiveness of existing works. Specifically, we translate each extracted key instruction into a self-defined key expression, generating a key-semantics graph based on the binary's control flow. Each node in the key-semantics graph denotes a key instruction, and the node attribute is the key expression. To quantify the similarity between two given key-semantics graphs, we first serialize each graph into a sequence of key expressions by topological sort. Then, we tokenize and concatenate key expressions to generate token lists. We calculate the locality-sensitive hash value for all token lists and quantify their similarity. We implement a prototype, called SemDiff, consisting of two modules: graph generation and graph diffing. The first module generates a pair of key-semantics graphs and the second module diffs the graphs. Our evaluation results show that overall, SemDiff outperforms state-of-the-art tools when detecting the similarity of binaries generated from different optimization levels, compilers, and obfuscations. SemDiff is also effective for library version search and finding similar vulnerabilities in firmware.
Submitted 2 August, 2023;
originally announced August 2023.
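The SemDiff abstract outlines a pipeline: serialize each key-semantics graph by topological sort, tokenize the key expressions, hash the token lists with a locality-sensitive hash, and compare. A minimal sketch of that pipeline follows. It rests on several assumptions not stated in the abstract: the graph is acyclic, key expressions are plain strings (the ones in the usage example are made up), and MinHash over token bigrams stands in for the unspecified locality-sensitive hash. All function names are hypothetical.

```python
import hashlib
import re
from collections import deque


def topo_serialize(nodes, edges):
    """Return key expressions in a topological order (Kahn's algorithm).

    nodes: dict mapping node id -> key expression string.
    edges: list of (src, dst) node-id pairs. Ties are broken by node id.
    """
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        n = ready.popleft()
        order.append(nodes[n])
        for dst in sorted(successors[n]):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    if len(order) != len(nodes):
        raise ValueError("graph has a cycle; this sketch assumes a DAG")
    return order


def tokenize(expressions):
    """Split expressions into identifier, number, and symbol tokens."""
    tokens = []
    for expr in expressions:
        tokens.extend(re.findall(r"[A-Za-z_]\w*|\d+|\S", expr))
    return tokens


def minhash_signature(tokens, num_hashes=64):
    """MinHash over token bigrams (assumed stand-in for the paper's LSH)."""
    shingles = {" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)}
    shingles = shingles or set(tokens)
    if not shingles:
        return [0] * num_hashes
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.sha1(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in shingles
        ))
    return signature


def similarity(sig_a, sig_b):
    """Fraction of matching positions; estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


if __name__ == "__main__":
    # Made-up key expressions; the paper's expression syntax is not given.
    graph = {1: "rax = mem[rdi]", 2: "rax = rax + 8", 3: "ret rax"}
    edges = [(1, 2), (2, 3)]
    sig = minhash_signature(tokenize(topo_serialize(graph, edges)))
    print(similarity(sig, sig))
```

Comparing fixed-length signatures instead of full token lists keeps each pairwise comparison cheap, which matters when one binary is searched against many candidates.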
-
VulMatch: Binary-level Vulnerability Detection Through Signature
Authors:
Zian Liu,
Lei Pan,
Chao Chen,
Ejaz Ahmed,
Shigang Liu,
Jun Zhang,
Dongxi Liu
Abstract:
Similar vulnerabilities recur in real-world software products because of code reuse, especially in widely reused third-party code and libraries. Detecting repeating vulnerabilities like 1-day and N-day vulnerabilities is an important cyber security task. Unfortunately, the state-of-the-art methods suffer from poor performance because they detect patch existence instead of vulnerability existence and infer the vulnerability signature directly from binary code. In this paper, we propose VulMatch to extract precise vulnerability-related binary instructions to generate the vulnerability-related signature. VulMatch detects vulnerability existence based on binary signatures. Unlike previous approaches, VulMatch accurately locates vulnerability-related instructions by utilizing source and binary codes. Our experiments were conducted using over 1000 vulnerable instances across seven open-source projects. VulMatch significantly outperformed the baseline tools Asm2vec and Palmtree. Besides the performance advantages over the baseline tools, VulMatch also provides explainable reasons during vulnerability detection. Our empirical studies demonstrate that VulMatch detects fine-grained vulnerabilities that the state-of-the-art tools struggle with. Our experiment on commercial firmware demonstrates that VulMatch is able to find vulnerabilities in real-world scenarios.
Submitted 17 January, 2024; v1 submitted 1 August, 2023;
originally announced August 2023.
-
DeepMPR: Enhancing Opportunistic Routing in Wireless Networks through Multi-Agent Deep Reinforcement Learning
Authors:
Saeed Kaviani,
Bo Ryu,
Ejaz Ahmed,
Deokseong Kim,
Jae Kim,
Carrie Spiker,
Blake Harnden
Abstract:
Opportunistic routing relies on the broadcast capability of wireless networks. It brings higher reliability and robustness in highly dynamic and/or severe environments such as mobile or vehicular ad-hoc networks (MANETs/VANETs). To reduce the cost of broadcast, multicast routing schemes use the connected dominating set (CDS) or multi-point relaying (MPR) set to decrease the network overhead, and hence their selection algorithms are critical. Common MPR selection algorithms are heuristic, rely on coordination between nodes, need high computational power for large networks, and are difficult to tune for network uncertainties. In this paper, we use multi-agent deep reinforcement learning to design a novel MPR multicast routing technique, DeepMPR, which outperforms the OLSR MPR selection algorithm without requiring MPR announcement messages from the neighbors. Our evaluation results demonstrate the performance gains of our trained DeepMPR multicast forwarding policy compared to other popular techniques.
Submitted 16 June, 2023;
originally announced June 2023.
-
Open Source-based Over-The-Air 5G New Radio Sidelink Testbed
Authors:
Melissa Elkadi,
Doekseong Kim,
Ejaz Ahmed,
Moein Sadeghi,
Anh Le,
Paul Russell,
Bo Ryu
Abstract:
The focus of this paper is to demonstrate an over-the-air (OTA) 5G new radio (NR) sidelink communication prototype. 5G NR sidelink communications allow NR UEs to transfer data independently without the assistance of a base station (gNB), which enables V2X communications, including platooning, autonomous driving, sensor extension, industrial IoT, public safety communication and much more. Our design leverages the open-source OpenAirInterface5G (OAI) software, which operates on software-defined radios (SDRs) and can be easily extended for mesh networking. The software includes all signal processing components specified by the 3GPP 5G sidelink standards, including Low-Density Parity Check (LDPC) encoding/decoding, polar encoding/decoding, data and control multiplexing, modulation/demodulation, and orthogonal frequency-division multiplexing (OFDM) modulation/demodulation. It can be configured to operate with different bands, bandwidths, and antenna settings. The first milestone in this work was to demonstrate the completed Physical Sidelink Broadcast Channel (PSBCH) development, which conducts synchronization between a Synchronization Reference (SyncRef) UE and a nearby UE. The SyncRef UE broadcasts a sidelink synchronization signal block (S-SSB) periodically, which the nearby UE detects and uses to synchronize its timing and frequency components with the SyncRef UE. Once a connection is established, the next developmental milestone is to transmit real data (text messages) via the Physical Sidelink Shared Channel (PSSCH). Our PHY sidelink framework is tested using both an RF simulator and an OTA testbed with multiple nearby UEs. Beyond the development of synchronization and data transmission/reception in 5G sidelink, we conclude with various performance tests and validation experiments. The results of these metrics show that our simulator is comparable to the OTA testbed.
Submitted 6 October, 2023; v1 submitted 15 June, 2023;
originally announced June 2023.
-
STUDY: Socially Aware Temporally Causal Decoder Recommender Systems
Authors:
Eltayeb Ahmed,
Diana Mincu,
Lauren Harrell,
Katherine Heller,
Subhrajit Roy
Abstract:
Recommender systems are widely used to help people find items that are tailored to their interests. These interests are often influenced by social networks, making it important to use social network information effectively in recommender systems. This is especially true for demographic groups with interests that differ from the majority. This paper introduces STUDY, a Socially-aware Temporally caUsal Decoder recommender sYstem. STUDY introduces a new socially-aware recommender system architecture that is significantly more efficient to learn and train than existing methods. STUDY performs joint inference over socially connected groups in a single forward pass of a modified transformer decoder network. We demonstrate the benefits of STUDY in the recommendation of books for students who are dyslexic, or struggling readers. Dyslexic students often have difficulty engaging with reading material, making it critical to recommend books that are tailored to their interests. We worked with our non-profit partner Learning Ally to evaluate STUDY on a dataset of struggling readers. STUDY was able to generate recommendations that more accurately predicted student engagement, when compared with existing methods.
Submitted 5 September, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
Socially Assistive Robots as Decision Makers in the Wild: Insights from a Participatory Design Workshop
Authors:
Eshtiak Ahmed,
Laura Cosio,
Juho Hamari,
Oğuz 'Oz' Buruk
Abstract:
Socially Assistive Robots (SARs) are becoming more popular every day because of their effectiveness in handling social situations. However, social robots are perceived as intelligent, and thus their decision-making process might have a significant effect on how they are perceived and how effective they are. In this paper, we present the findings from a participatory design (PD) study consisting of 5 design workshops with 30 participants, focusing on several decision-making scenarios of SARs in the wild. Based on the findings of the PD study, we discuss 5 directions that could aid the design of decision-making systems of SARs in the wild.
Submitted 18 April, 2023;
originally announced April 2023.
-
A Multifidelity deep operator network approach to closure for multiscale systems
Authors:
Shady E. Ahmed,
Panos Stinis
Abstract:
Projection-based reduced order models (PROMs) have shown promise in representing the behavior of multiscale systems using a small set of generalized (or latent) variables. Despite their success, PROMs can be susceptible to inaccuracies, even instabilities, due to the improper accounting of the interaction between the resolved and unresolved scales of the multiscale system (known as the closure problem). In the current work, we interpret closure as a multifidelity problem and use a multifidelity deep operator network (DeepONet) framework to address it. In addition, to enhance the stability and accuracy of the multifidelity-based closure, we employ the recently developed "in-the-loop" training approach from the literature on coupling physics and machine learning models. The resulting approach is tested on shock advection for the one-dimensional viscous Burgers equation and vortex merging using the two-dimensional Navier-Stokes equations. The numerical experiments show significant improvement of the predictive ability of the closure-corrected PROM over the un-corrected one both in the interpolative and the extrapolative regimes.
Submitted 1 June, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
Beamforming and Device Selection Design in Federated Learning with Over-the-air Aggregation
Authors:
Faeze Moradi Kalarde,
Min Dong,
Ben Liang,
Yahia A. Eldemerdash Ahmed,
Ho Ting Cheng
Abstract:
Federated learning (FL) with over-the-air computation can efficiently utilize the communication bandwidth but is susceptible to analog aggregation error. Excluding those devices with weak channel conditions can reduce the aggregation error, but it also limits the amount of local training data for FL, which can reduce the training convergence rate. In this work, we jointly design uplink receiver beamforming and device selection for over-the-air FL over time-varying wireless channels to maximize the training convergence rate. We reformulate this stochastic optimization problem into a mixed-integer program using an upper bound on the global training loss over communication rounds. We then propose a Greedy Spatial Device Selection (GSDS) approach, which uses a sequential procedure to select devices based on a measure capturing both the channel strength and the channel correlation to the selected devices. We show that given the selected devices, the receiver beamforming optimization problem is equivalent to downlink single-group multicast beamforming. To reduce the computational complexity, we also propose an Alternating-optimization-based Device Selection and Beamforming (ADSBF) approach, which solves the receiver beamforming and device selection subproblems alternatingly. In particular, despite the device selection being an integer problem, we are able to develop an efficient algorithm to find its optimal solution.
Simulation results with real-world image classification demonstrate that our proposed methods achieve faster convergence with significantly lower computational complexity than existing alternatives. Furthermore, although ADSBF shows marginally inferior performance to GSDS, it offers the advantage of lower computational complexity when the number of devices is large.
Submitted 6 March, 2024; v1 submitted 28 February, 2023;
originally announced February 2023.
-
PyOED: An Extensible Suite for Data Assimilation and Model-Constrained Optimal Design of Experiments
Authors:
Abhijit Chowdhary,
Shady E. Ahmed,
Ahmed Attia
Abstract:
This paper describes PyOED, a highly extensible scientific package that enables developing and testing model-constrained optimal experimental design (OED) for inverse problems. Specifically, PyOED aims to be a comprehensive Python toolkit for model-constrained OED. The package targets scientists and researchers interested in understanding the details of OED formulations and approaches. It is also meant to enable researchers to experiment with standard and innovative OED technologies with a wide range of test problems (e.g., simulation models). OED, inverse problems (e.g., Bayesian inversion), and data assimilation (DA) are closely related research fields, and their formulations overlap significantly. Thus, PyOED is continuously being expanded with a plethora of Bayesian inversion, DA, and OED methods as well as new scientific simulation models, observation error models, and observation operators. These pieces are added such that they can be permuted to enable testing OED methods in various settings of varying complexities. The PyOED core is completely written in Python and utilizes the inherent object-oriented capabilities; however, the current version of PyOED is meant to be extensible rather than scalable. Specifically, PyOED is developed to enable rapid development and benchmarking of OED methods with minimal coding effort and to maximize code reutilization. This paper provides a brief description of the PyOED layout and philosophy and provides a set of exemplary test cases and tutorials to demonstrate the potential of the package.
Submitted 19 December, 2023; v1 submitted 19 January, 2023;
originally announced January 2023.
-
Unraveling Threat Intelligence Through the Lens of Malicious URL Campaigns
Authors:
Mahathir Almashor,
Ejaz Ahmed,
Benjamin Pick,
Sharif Abuadbba,
Jason Xue,
Raj Gaire,
Shuo Wang,
Seyit Camtepe,
Surya Nepal
Abstract:
The daily deluge of alerts is a sombre reality for Security Operations Centre (SOC) personnel worldwide. They are at the forefront of an organisation's cybersecurity infrastructure, and face the unenviable task of prioritising threats amongst a flood of abstruse alerts triggered by their Security Information and Event Management (SIEM) systems. URLs found within malicious communications form the bulk of such alerts, and pinpointing pertinent patterns within them allows teams to rapidly deescalate potential or extant threats. This need for vigilance has been traditionally filled with machine-learning based log analysis tools and anomaly detection concepts. To sidestep machine learning approaches, we instead propose to analyse suspicious URLs from SIEM alerts via the perspective of malicious URL campaigns. By first grouping URLs within 311M records gathered from VirusTotal into 2.6M suspicious clusters, we thereafter discovered 77.8K malicious campaigns. Corroborating our suspicions, we found 9.9M unique URLs attributable to 18.3K multi-URL campaigns, and that worryingly, only 2.97% of campaigns were found by security vendors. We also confer insights on evasive tactics such as ever lengthier URLs and more diverse domain names, with selected case studies exposing other adversarial techniques. By characterising the concerted campaigns driving these URL alerts, we hope to inform SOC teams of current threat trends, and thus arm them with better threat intelligence.
Submitted 26 August, 2022;
originally announced August 2022.
-
Physics Guided Machine Learning for Variational Multiscale Reduced Order Modeling
Authors:
Shady E. Ahmed,
Omer San,
Adil Rasheed,
Traian Iliescu,
Alessandro Veneziani
Abstract:
We propose a new physics guided machine learning (PGML) paradigm that leverages the variational multiscale (VMS) framework and available data to dramatically increase the accuracy of reduced order models (ROMs) at a modest computational cost. The hierarchical structure of the ROM basis and the VMS framework enable a natural separation of the resolved and unresolved ROM spatial scales. Modern PGML algorithms are used to construct novel models for the interaction among the resolved and unresolved ROM scales. Specifically, the new framework builds ROM operators that are closest to the true interaction terms in the VMS framework. Finally, machine learning is used to reduce the projection error and further increase the ROM accuracy. Our numerical experiments for a two-dimensional vorticity transport problem show that the novel PGML-VMS-ROM paradigm maintains the low computational cost of current ROMs, while significantly increasing the ROM accuracy.
Submitted 24 May, 2022;
originally announced May 2022.
-
Transformer-Based Language Models for Software Vulnerability Detection
Authors:
Chandra Thapa,
Seung Ick Jang,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Josef Pieprzyk,
Surya Nepal
Abstract:
The large transformer-based language models demonstrate excellent performance in natural language processing. By considering the transferability of the knowledge gained by these models in one domain to other related domains, and the closeness of natural languages to high-level programming languages, such as C/C++, this work studies how to leverage (large) transformer-based language models in detecting software vulnerabilities and how good are these models for vulnerability detection tasks. In this regard, firstly, a systematic (cohesive) framework that details source code translation, model preparation, and inference is presented. Then, an empirical analysis is performed with software vulnerability datasets with C/C++ source codes having multiple vulnerabilities corresponding to the library function call, pointer usage, array usage, and arithmetic expression. Our empirical results demonstrate the good performance of the language models in vulnerability detection. Moreover, these language models have better performance metrics, such as F1-score, than the contemporary models, namely bidirectional long short-term memory and bidirectional gated recurrent unit. Experimenting with the language models is always challenging due to the requirement of computing resources, platforms, libraries, and dependencies. Thus, this paper also analyses the popular platforms to efficiently fine-tune these models and present recommendations while choosing the platforms.
Submitted 5 September, 2022; v1 submitted 7 April, 2022;
originally announced April 2022.
-
Towards Web Phishing Detection Limitations and Mitigation
Authors:
Alsharif Abuadbba,
Shuo Wang,
Mahathir Almashor,
Muhammed Ejaz Ahmed,
Raj Gaire,
Seyit Camtepe,
Surya Nepal
Abstract:
Web phishing remains a serious cyber threat responsible for most data breaches. Machine Learning (ML)-based anti-phishing detectors are seen as an effective countermeasure, and are increasingly adopted by web-browsers and software products. However, with an average of 10K phishing links reported per hour to platforms such as PhishTank and VirusTotal (VT), the deficiencies of such ML-based solutions are laid bare. We first explore how phishing sites bypass ML-based detection with a deep dive into 13K phishing pages targeting major brands such as Facebook. Results show successful evasion is caused by: (1) use of benign services to obscure phishing URLs; (2) high similarity between the HTML structures of phishing and benign pages; (3) hiding the ultimate phishing content within Javascript and running such scripts only on the client; (4) looking beyond typical credentials and credit cards for new content such as IDs and documents; (5) hiding phishing content until after human interaction. We attribute the root cause to the dependency of ML-based models on the vertical feature space (webpage content). These solutions rely only on what phishers present within the page itself. Thus, we propose Anti-SubtlePhish, a more resilient model based on logistic regression. The key augmentation is the inclusion of a horizontal feature space, which examines correlation variables between the final render of suspicious pages against what trusted services have recorded (e.g., PageRank). To defeat (1) and (2), we correlate information between WHOIS, PageRank, and page analytics. To combat (3), (4) and (5), we correlate features after rendering the page. Experiments with 100K phishing/benign sites show promising accuracy (98.8%). We also obtained 100% accuracy against 0-day phishing pages that were manually crafted, comparing well to the 0% recorded by VT vendors over the first four days.
Submitted 3 April, 2022;
originally announced April 2022.
-
DeepCQ+: Robust and Scalable Routing with Multi-Agent Deep Reinforcement Learning for Highly Dynamic Networks
Authors:
Saeed Kaviani,
Bo Ryu,
Ejaz Ahmed,
Kevin Larson,
Anh Le,
Alex Yahja,
Jae H. Kim
Abstract:
Highly dynamic mobile ad-hoc networks (MANETs) remain as one of the most challenging environments to develop and deploy robust, efficient, and scalable routing protocols. In this paper, we present DeepCQ+ routing protocol which, in a novel manner integrates emerging multi-agent deep reinforcement learning (MADRL) techniques into existing Q-learning-based routing protocols and their variants and achieves persistently higher performance across a wide range of topology and mobility configurations. While keeping the overall protocol structure of the Q-learning-based routing protocols, DeepCQ+ replaces statically configured parameterized thresholds and hand-written rules with carefully designed MADRL agents such that no configuration of such parameters is required a priori. Extensive simulation shows that DeepCQ+ yields significantly increased end-to-end throughput with lower overhead and no apparent degradation of end-to-end delays (hop counts) compared to its Q-learning based counterparts. Qualitatively, and perhaps more significantly, DeepCQ+ maintains remarkably similar performance gains under many scenarios that it was not trained for in terms of network sizes, mobility conditions, and traffic dynamics. To the best of our knowledge, this is the first successful application of the MADRL framework for the MANET routing problem that demonstrates a high degree of scalability and robustness even under environments that are outside the trained range of scenarios. This implies that our MARL-based DeepCQ+ design solution significantly improves the performance of Q-learning based CQ+ baseline approach for comparison and increases its practicality and explainability because the real-world MANET environment will likely vary outside the trained range of MANET scenarios. Additional techniques to further increase the gains in performance and scalability are discussed.
Submitted 29 November, 2021;
originally announced November 2021.
-
NatiDroid: Cross-Language Android Permission Specification
Authors:
Chaoran Li,
Xiao Chen,
Ruoxi Sun,
Jason Xue,
Sheng Wen,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Yang Xiang
Abstract:
The Android system manages access to sensitive APIs by permission enforcement. An application (app) must declare proper permissions before invoking specific Android APIs. However, there is no official documentation providing the complete list of permission-protected APIs and the corresponding permissions to date. Researchers have spent significant efforts extracting such API protection mapping from the Android API framework, which leverages static code analysis to determine if specific permissions are required before accessing an API. Nevertheless, none of them has attempted to analyze the protection mapping in the native library (i.e., code written in C and C++), an essential component of the Android framework that handles communication with the lower-level hardware, such as cameras and sensors. While the protection mapping can be utilized to detect various security vulnerabilities in Android apps, such as permission over-privilege and component hijacking, imprecise mapping will lead to false results in detecting such security vulnerabilities. To fill this gap, we develop a prototype system, named NatiDroid, to facilitate the cross-language static analysis to benchmark against two state-of-the-art tools, termed Axplorer and Arcade. We evaluate NatiDroid on more than 11,000 Android apps, including system apps from custom Android ROMs and third-party apps from the Google Play. Our NatiDroid can identify up to 464 new API-permission mappings, in contrast to the worst-case results derived from both Axplorer and Arcade, where approximately 71% apps have at least one false positive in permission over-privilege and up to 3.6% apps have at least one false negative in component hijacking. Additionally, we identify that 24 components with at least one Native-triggered component hijacking vulnerability are misidentified by two benchmarks.
Submitted 15 November, 2021;
originally announced November 2021.
-
Nonlinear proper orthogonal decomposition for convection-dominated flows
Authors:
Shady E. Ahmed,
Omer San,
Adil Rasheed,
Traian Iliescu
Abstract:
Autoencoder techniques find increasingly common use in reduced order modeling as a means to create a latent space. This reduced order representation offers a modular data-driven modeling approach for nonlinear dynamical systems when integrated with a time series predictive model. In this letter, we put forth a nonlinear proper orthogonal decomposition (POD) framework, which is an end-to-end Galerkin-free model combining autoencoders with long short-term memory networks for dynamics. By eliminating the projection error due to the truncation of Galerkin models, a key enabler of the proposed nonintrusive approach is the kinematic construction of a nonlinear mapping between the full-rank expansion of the POD coefficients and the latent space where the dynamics evolve. We test our framework for model reduction of a convection-dominated system, which is generally challenging for reduced order models. Our approach not only improves the accuracy, but also significantly reduces the computational cost of training and testing.
Submitted 5 November, 2021; v1 submitted 15 October, 2021;
originally announced October 2021.
-
A Tutorial on Trace-based Simulations of Mobile Ad-hoc Networks on the Example of Aeronautical Communications
Authors:
Musab Ahmed Eltayeb Ahmed,
Konrad Fuger,
Sebastian Lindner,
Fatema Khan,
Andreas Timm-Giel
Abstract:
The OMNeT++ simulator is well-suited for the simulation of randomized user behavior in communication networks. However, there are scenarios, where such a random model is unsuited to evaluate a communication system, and this paper attempts to highlight such a case. Using this example of ad-hoc communication between aircraft mid-flight, a tutorial-style description is attempted that shall show how the OMNeT++ simulator can be used when a wealth of real-world trace data is available. In particular, it is described how mobility trace files can be directly used within OMNeT++, and how to link the generation of data messages to this mobility data. This is explained via an example simulation that evaluates a communication network in which an aircraft notifies the ground control when it enters or leaves a specific geographic region. Additionally, a novel trace-based application has been developed to achieve this link between mobility and message generation. Furthermore, a new TDMA-based medium access protocol for decentralized communication networks is presented, which is oracle-based and thus allows a TDMA-like behavior of medium access without causing any overhead; it can be useful when upper-layer protocols should be evaluated under the assumption of TDMA-like behavior, but isolated from the effects of a full-fledged TDMA protocol. Finally, physical layer behavior is often either overly simplistic or overly computationally expensive. For the latter case, when a detailed channel model is available but its evaluation requires prohibitive computational effort, then averaging its behavior into trace data can find a middle ground between efficient evaluation and realistic representation. Hence, a novel trace-based radio model has been developed that makes use of an SNR to PER mapping. In the spirit of open science, all implementations have been made available under open licenses.
Submitted 27 September, 2021;
originally announced September 2021.
-
Characterizing Malicious URL Campaigns
Authors:
Mahathir Almashor,
Ejaz Ahmed,
Benjamin Pick,
Sharif Abuadbba,
Raj Gaire,
Seyit Camtepe,
Surya Nepal
Abstract:
URLs are central to a myriad of cyber-security threats, from phishing to the distribution of malware. Their inherent ease of use and familiarity is continuously abused by attackers to evade defences and deceive end-users. Seemingly dissimilar URLs are being used in an organized way to perform phishing attacks and distribute malware. We refer to such behaviours as campaigns, with the hypothesis being that attacks are often coordinated to maximize success rates and develop evasion tactics. The aim is to gain better insights into campaigns, bolster our grasp of their characteristics, and thus aid the community devise more robust solutions. To this end, we performed extensive research and analysis into 311M records containing 77M unique real-world URLs that were submitted to VirusTotal from Dec 2019 to Jan 2020. From this dataset, 2.6M suspicious campaigns were identified based on their attached metadata, of which 77,810 were doubly verified as malicious. Using the 38.1M records and 9.9M URLs within these malicious campaigns, we provide varied insights such as their targeted victim brands as well as URL sizes and heterogeneity. Some surprising findings were observed, such as detection rates falling to just 13.27% for campaigns that employ more than 100 unique URLs. The paper concludes with several case-studies that illustrate the common malicious techniques employed by attackers to imperil users and circumvent defences.
Submitted 28 August, 2021;
originally announced August 2021.
-
Generating Cyber Threat Intelligence to Discover Potential Security Threats Using Classification and Topic Modeling
Authors:
Md Imran Hossen,
Ashraful Islam,
Farzana Anowar,
Eshtiak Ahmed,
Mohammad Masudur Rahman,
Xiali Hei
Abstract:
Due to the variety of cyber-attacks or threats, the cybersecurity community enhances the traditional security control mechanisms to an advanced level so that automated tools can encounter potential security threats. Very recently, Cyber Threat Intelligence (CTI) has been presented as one of the proactive and robust mechanisms because of its automated cybersecurity threat prediction. Generally, CTI collects and analyses data from various sources e.g., online security forums, social media where cyber enthusiasts, analysts, even cybercriminals discuss cyber or computer security-related topics and discovers potential threats based on the analysis. As the manual analysis of every such discussion (posts on online platforms) is time-consuming, inefficient, and susceptible to errors, CTI as an automated tool can perform uniquely to detect cyber threats. In this paper, we identify and explore relevant CTI from hacker forums utilizing different supervised (classification) and unsupervised learning (topic modeling) techniques. To this end, we collect data from a real hacker forum and constructed two datasets: a binary dataset and a multi-class dataset. We then apply several classifiers along with deep neural network-based classifiers and use them on the datasets to compare their performances. We also employ the classifiers on a labeled leaked dataset as our ground truth. We further explore the datasets using unsupervised techniques. For this purpose, we leverage two topic modeling algorithms namely Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF).
Submitted 14 November, 2022; v1 submitted 15 August, 2021;
originally announced August 2021.
-
A Self-Supervised Auxiliary Loss for Deep RL in Partially Observable Settings
Authors:
Eltayeb Ahmed,
Luisa Zintgraf,
Christian A. Schroeder de Witt,
Nicolas Usunier
Abstract:
In this work we explore an auxiliary loss useful for reinforcement learning in environments where strong performing agents are required to be able to navigate a spatial environment. The auxiliary loss proposed is to minimize the classification error of a neural network classifier that predicts whether or not a pair of states sampled from the agents current episode trajectory are in order. The classifier takes as input a pair of states as well as the agent's memory. The motivation for this auxiliary loss is that there is a strong correlation with which of a pair of states is more recent in the agents episode trajectory and which of the two states is spatially closer to the agent. Our hypothesis is that learning features to answer this question encourages the agent to learn and internalize in memory representations of states that facilitate spatial reasoning. We tested this auxiliary loss on a navigation task in a gridworld and achieved 9.6% increase in accumulative episode reward compared to a strong baseline approach.
Submitted 17 April, 2021;
originally announced April 2021.
-
Grand challenges and emergent modes of convergence science
Authors:
Alexander M. Petersen,
Mohammed E. Ahmed,
Ioannis Pavlidis
Abstract:
To address complex problems, scholars are increasingly faced with challenges of integrating diverse knowledge domains. We analyzed the evolution of this convergence paradigm in the broad ecosystem of brain science, which provides a real-time testbed for evaluating two modes of cross-domain integration - subject area exploration via expansive learning and cross-disciplinary collaboration among domain experts. We show that research involving both modes features a 16% citation premium relative to a mono-disciplinary baseline. Further comparison of research integrating neighboring versus distant research domains shows that the cross-disciplinary mode is essential for integrating across relatively large disciplinary distances. Yet we find research utilizing cross-domain subject area exploration alone - a convergence shortcut - to be growing in prevalence at roughly 3% per year, significantly faster than the alternative cross-disciplinary mode, despite being less effective at integrating domains and markedly less impactful. By measuring shifts in the prevalence and impact of different convergence modes in the 5-year intervals before and after 2013, our results indicate that these counterproductive patterns may relate to competitive pressures associated with global Human Brain flagship funding initiatives. Without additional policy guidance, such Grand Challenge flagships may unintentionally incentivize such convergence shortcuts, thereby undercutting the advantages of cross-disciplinary teams in tackling challenges calling on convergence.
Submitted 21 March, 2021;
originally announced March 2021.
-
Physical Reasoning Using Dynamics-Aware Models
Authors:
Eltayeb Ahmed,
Anton Bakhtin,
Laurens van der Maaten,
Rohit Girdhar
Abstract:
A common approach to solving physical reasoning tasks is to train a value learner on example tasks. A limitation of such an approach is that it requires learning about object dynamics solely from reward values assigned to the final state of a rollout of the environment. This study aims to address this limitation by augmenting the reward value with self-supervised signals about object dynamics. Specifically, we train the model to characterize the similarity of two environment rollouts, jointly with predicting the outcome of the reasoning task. This similarity can be defined as a distance measure between the trajectory of objects in the two rollouts, or learned directly from pixels using a contrastive formulation. Empirically, we find that this approach leads to substantial performance improvements on the PHYRE benchmark for physical reasoning (Bakhtin et al., 2019), establishing a new state-of-the-art.
Submitted 1 September, 2021; v1 submitted 20 February, 2021;
originally announced February 2021.
-
Peeler: Profiling Kernel-Level Events to Detect Ransomware
Authors:
Muhammad Ejaz Ahmed,
Hyoungshick Kim,
Seyit Camtepe,
Surya Nepal
Abstract:
Ransomware is a growing threat that typically operates by either encrypting a victim's files or locking a victim's computer until the victim pays a ransom. However, it is still challenging to detect such malware timely with existing traditional malware detection techniques. In this paper, we present a novel ransomware detection system, called "Peeler" (Profiling kErnEl -Level Events to detect Ransomware). Peeler deviates from signatures for individual ransomware samples and relies on common and generic characteristics of ransomware depicted at the kernel-level. Analyzing diverse ransomware families, we observed ransomware's inherent behavioral characteristics such as stealth operations performed before the attack, file I/O request patterns, process spawning, and correlations among kernel-level events. Based on those characteristics, we develop Peeler that continuously monitors a target system's kernel events and detects ransomware attacks on the system. Our experimental results show that Peeler achieves more than 99% detection rate with 0.58% false-positive rate against 43 distinct ransomware families, containing samples from both crypto and screen-locker types of ransomware. For crypto ransomware, Peeler detects them promptly after only one file is lost (within 115 milliseconds on average). Peeler utilizes around 4.9% of CPU time with only 9.8 MB memory under the normal workload condition. Our analysis demonstrates that Peeler can efficiently detect diverse malware families by monitoring their kernel-level events.
Submitted 29 January, 2021;
originally announced January 2021.
-
Robust and Scalable Routing with Multi-Agent Deep Reinforcement Learning for MANETs
Authors:
Saeed Kaviani,
Bo Ryu,
Ejaz Ahmed,
Kevin A. Larson,
Anh Le,
Alex Yahja,
Jae H. Kim
Abstract:
Highly dynamic mobile ad-hoc networks (MANETs) are continuing to serve as one of the most challenging environments to develop and deploy robust, efficient, and scalable routing protocols. In this paper, we present DeepCQ+ routing which, in a novel manner, integrates emerging multi-agent deep reinforcement learning (MADRL) techniques into existing Q-learning-based routing protocols and their variants, and achieves persistently higher performance across a wide range of MANET configurations while training only on a limited range of network parameters and conditions. Quantitatively, DeepCQ+ shows consistently higher end-to-end throughput with lower overhead compared to its Q-learning-based counterparts with the overall gain of 10-15% in its efficiency. Qualitatively and more significantly, DeepCQ+ maintains remarkably similar performance gains under many scenarios that it was not trained for in terms of network sizes, mobility conditions, and traffic dynamics. To the best of our knowledge, this is the first successful demonstration of MADRL for the MANET routing problem that achieves and maintains a high degree of scalability and robustness even in the environments that are outside the trained range of scenarios. This implies that the proposed hybrid design approach of DeepCQ+ that combines MADRL and Q-learning significantly increases its practicality and explainability because the real-world MANET environment will likely vary outside the trained range of MANET scenarios.
Submitted 28 March, 2021; v1 submitted 8 January, 2021;
originally announced January 2021.
-
Decamouflage: A Framework to Detect Image-Scaling Attacks on Convolutional Neural Networks
Authors:
Bedeuro Kim,
Alsharif Abuadbba,
Yansong Gao,
Yifeng Zheng,
Muhammad Ejaz Ahmed,
Hyoungshick Kim,
Surya Nepal
Abstract:
As an essential processing step in computer vision applications, image resizing or scaling, more specifically downsampling, has to be applied before feeding a normally large image into a convolutional neural network (CNN) model because CNN models typically take small fixed-size images as inputs. However, image scaling functions could be adversarially abused to perform a newly revealed attack called image-scaling attack, which can affect a wide range of computer vision applications building upon image-scaling functions.
This work presents an image-scaling attack detection framework, termed as Decamouflage. Decamouflage consists of three independent detection methods: (1) rescaling, (2) filtering/pooling, and (3) steganalysis. While each of these three methods is efficient standalone, they can work in an ensemble manner not only to improve the detection accuracy but also to harden potential adaptive attacks. Decamouflage has a pre-determined detection threshold that is generic. More precisely, as we have validated, the threshold determined from one dataset is also applicable to other different datasets. Extensive experiments show that Decamouflage achieves detection accuracy of 99.9% and 99.8% in the white-box (with the knowledge of attack algorithms) and the black-box (without the knowledge of attack algorithms) settings, respectively. To corroborate the efficiency of Decamouflage, we have also measured its run-time overhead on a personal PC with an i5 CPU and found that Decamouflage can detect image-scaling attacks in milliseconds. Overall, Decamouflage can accurately detect image scaling attacks in both white-box and black-box settings with acceptable run-time overhead.
Submitted 7 October, 2020;
originally announced October 2020.
-
A nudged hybrid analysis and modeling approach for realtime wake-vortex transport and decay prediction
Authors:
Shady Ahmed,
Suraj Pawar,
Omer San,
Adil Rasheed,
Mandar Tabib
Abstract:
We put forth a long short-term memory (LSTM) nudging framework for the enhancement of reduced order models (ROMs) of fluid flows utilizing noisy measurements for air traffic improvements. Toward emerging applications of digital twins in aviation, the proposed approach allows for constructing a realtime predictive tool for wake-vortex transport and decay systems. We build on the fact that in realistic application, there are uncertainties in initial and boundary conditions, model parameters, as well as measurements. Moreover, conventional nonlinear ROMs based on Galerkin projection (GROMs) suffer from imperfection and solution instabilities, especially for advection-dominated flows with slow decay in the Kolmogorov width. In the presented LSTM nudging (LSTM-N) approach, we fuse forecasts from a combination of imperfect GROM and uncertain state estimates, with sparse Eulerian sensor measurements to provide more reliable predictions in a dynamical data assimilation framework. We illustrate our concept by solving a two-dimensional vorticity transport equation. We investigate the effects of measurements noise and state estimate uncertainty on the performance of the LSTM-N behavior. We also demonstrate that it can sufficiently handle different levels of temporal and spatial measurement sparsity, and offer a huge potential in developing next-generation digital twin technologies.
Submitted 5 March, 2021; v1 submitted 5 August, 2020;
originally announced August 2020.
-
Interface learning of multiphysics and multiscale systems
Authors:
Shady E. Ahmed,
Omer San,
Kursat Kara,
Rami Younis,
Adil Rasheed
Abstract:
Complex natural or engineered systems comprise multiple characteristic scales, multiple spatiotemporal domains, and even multiple physical closure laws. To address such challenges, we introduce an interface learning paradigm and put forth a data-driven closure approach based on memory embedding to provide physically correct boundary conditions at the interface. To enable interface learning for hyperbolic systems by taking the domain of influence and wave structures into account, we put forth the concept of upwind learning towards a physics-informed domain decomposition. The promise of the proposed approach is shown for a set of canonical illustrative problems. We highlight that high-performance computing environments can benefit from this methodology to reduce communication costs among processing units in emerging machine learning ready heterogeneous platforms toward the exascale era.
Submitted 31 October, 2020; v1 submitted 17 June, 2020;
originally announced June 2020.
-
COVID-19: Social Media Sentiment Analysis on Reopening
Authors:
Mohammed Emtiaz Ahmed,
Md Rafiqul Islam Rabin,
Farah Naz Chowdhury
Abstract:
The novel coronavirus (COVID-19) pandemic is the most talked-about topic on social media platforms in 2020. People are using social media such as Twitter to express their opinions and share information on a number of issues related to COVID-19 during the stay-at-home orders. In this paper, we investigate the sentiment and emotion of people in the United States on the subject of reopening. We choose the social media platform Twitter for our analysis and study the Tweets to discover the sentimental perspective, emotional perspective, and triggering words towards the reopening. During this COVID-19 pandemic, researchers have analyzed various social media datasets regarding lockdown and stay-at-home orders. However, in our analysis, we are particularly interested in analysing public sentiment on reopening. Our major finding is that when all states resorted to lockdown in March, people showed a dominant emotion of fear, but as reopening starts people have less fear. While this may be true, due to this reopening phase daily positive cases are rising compared to the lockdown situation. Overall, people have a less negative sentiment towards the situation of reopening.
Submitted 1 June, 2020;
originally announced June 2020.
-
Energy Harvesting in 5G Networks: Taxonomy, Requirements, Challenges, and Future Directions
Authors:
Muhammad Imran,
Latif U. Khan,
Ibrar Yaqoob,
Ejaz Ahmed,
Muhammad Ahsan Qureshi,
Arif Ahmed
Abstract:
Consciousness of energy saving is increasing in fifth-generation (5G) wireless networks due to the high energy consumption issue. Energy harvesting technology is a possible appealing solution for ultimately prolonging the lifetime of devices and networks. Although considerable research efforts have been conducted in the context of using energy harvesting technology in 5G wireless networks, these efforts are in their infancy, and a tutorial on this topic is still lacking. This study aims to discuss the beneficial role of energy harvesting technology in 5G networks. We categorize and classify the literature available on energy harvesting in 5G networks by devising a taxonomy based on energy sources; energy harvesting devices, phases, and models; energy conversion methods, and energy propagation medium. The key requirements for enabling energy harvesting in 5G networks are also outlined. Several core research challenges that remain to be addressed are discussed. Furthermore, future research directions are provided.
Submitted 2 October, 2019;
originally announced October 2019.
-
A survey on Deep Learning Advances on Different 3D Data Representations
Authors:
Eman Ahmed,
Alexandre Saint,
Abd El Rahman Shabayek,
Kseniya Cherenkova,
Rig Das,
Gleb Gusev,
Djamila Aouada,
Bjorn Ottersten
Abstract:
3D data is a valuable asset in the computer vision field as it provides rich information about the full geometry of sensed objects and scenes. With the recent availability of both large 3D datasets and computational power, it is now possible to consider applying deep learning to learn specific tasks on 3D data such as segmentation, recognition and correspondence. Depending on the considered 3D data representation, different challenges may be foreseen in using existing deep learning architectures. In this work, we provide a comprehensive overview of various 3D data representations, highlighting the difference between Euclidean and non-Euclidean ones. We also discuss how Deep Learning methods are applied to each representation, analyzing the challenges to overcome.
Submitted 6 April, 2019; v1 submitted 4 August, 2018;
originally announced August 2018.
-
Optimal Spectrum Sensing Policy with Traffic Classification in RF-Powered CRNs
Authors:
Hae Sol Lee,
Muhammad Ejaz Ahmed,
Dong In Kim
Abstract:
An orthogonal frequency division multiple access (OFDMA)-based primary user (PU) network is considered, which provides different spectral access/energy harvesting opportunities in RF-powered cognitive radio networks (CRNs). In this scenario, we propose an optimal spectrum sensing policy for opportunistic spectrum access/energy harvesting under both the PU collision and energy causality constraints. PU subchannels can have different traffic patterns and exhibit distinct idle/busy frequencies, due to which the spectral access/energy harvesting opportunities are application specific. Secondary user (SU) collects traffic pattern information through observation of the PU subchannels and classifies the idle/busy period statistics for each subchannel. Based on the statistics, we invoke stochastic models for evaluating SU capacity by which the energy detection threshold for spectrum sensing can be adjusted with higher sensing accuracy. To this end, we employ the Markov decision process (MDP) model obtained by quantizing the amount of SU battery and the duty cycle model obtained by the ratio of average harvested energy and energy consumption rates. We demonstrate the effectiveness of the proposed stochastic models through comparison with the optimal one obtained from an exhaustive method.
Submitted 6 April, 2018; v1 submitted 24 March, 2018;
originally announced March 2018.
-
House price estimation from visual and textual features
Authors:
Eman Ahmed,
Mohamed Moustafa
Abstract:
Most existing automatic house price estimation systems rely only on some textual data like its neighborhood area and the number of rooms. The final price is estimated by a human agent who visits the house and assesses it visually. In this paper, we propose extracting visual features from house photographs and combining them with the house's textual information. The combined features are fed to a fully connected multilayer Neural Network (NN) that estimates the house price as its single output. To train and evaluate our network, we have collected the first houses dataset (to our knowledge) that combines both images and textual attributes. The dataset is composed of 535 sample houses from the state of California, USA. Our experiments showed that adding the visual features increased the R-value by a factor of 3 and decreased the Mean Square Error (MSE) by one order of magnitude compared with textual-only features. Additionally, when trained on the benchmark textual-only features housing dataset, our proposed NN still outperformed the existing model published results.
Submitted 27 September, 2016;
originally announced September 2016.
-
Adaptive Beaconing Approaches for Vehicular ad hoc Networks: A Survey
Authors:
Syed Adeel Ali Shah,
Ejaz Ahmed,
Feng Xia,
Ahmad Karim,
Muhammad Shiraz,
Rafidah MD Noor
Abstract:
Vehicular communication requires vehicles to self-organize through the exchange of periodic beacons. Recent analysis on beaconing indicates that the standards for beaconing restrict the desired performance of vehicular applications. This situation can be attributed to the quality of the available transmission medium, persistent change in the traffic situation and the inability of standards to cope with application requirements. To this end, this paper is motivated by the classifications and capability evaluations of existing adaptive beaconing approaches. To begin with, we explore the anatomy and the performance requirements of beaconing. Then, the beaconing design is analyzed to introduce a design-based beaconing taxonomy. A survey of the state-of-the-art is conducted with an emphasis on the salient features of the beaconing approaches. We also evaluate the capabilities of beaconing approaches using several key parameters. A comparison among beaconing approaches is presented, which is based on the architectural and implementation characteristics. The paper concludes by discussing open challenges in the field.
Submitted 24 May, 2016;
originally announced May 2016.
-
All-Digital Self-interference Cancellation Technique for Full-duplex Systems
Authors:
Elsayed Ahmed,
Ahmed M. Eltawil
Abstract:
Full-duplex systems are expected to double the spectral efficiency compared to conventional half-duplex systems if the self-interference signal can be significantly mitigated. Digital cancellation is one of the lowest complexity self-interference cancellation techniques in full-duplex systems. However, its mitigation capability is very limited, mainly due to transmitter and receiver circuit's impairments. In this paper, we propose a novel digital self-interference cancellation technique for full-duplex systems. The proposed technique is shown to significantly mitigate the self-interference signal as well as the associated transmitter and receiver impairments. In the proposed technique, an auxiliary receiver chain is used to obtain a digital-domain copy of the transmitted Radio Frequency (RF) self-interference signal. The self-interference copy is then used in the digital-domain to cancel out both the self-interference signal and the associated impairments. Furthermore, to alleviate the receiver phase noise effect, a common oscillator is shared between the auxiliary and ordinary receiver chains. A thorough analytical and numerical analysis for the effect of the transmitter and receiver impairments on the cancellation capability of the proposed technique is presented. Finally, the overall performance is numerically investigated showing that using the proposed technique, the self-interference signal could be mitigated to ~3dB higher than the receiver noise floor, which results in up to 76% rate improvement compared to conventional half-duplex systems at 20dBm transmit power values.
Submitted 20 June, 2014;
originally announced June 2014.
-
Full-Duplex Systems Using Multi-Reconfigurable Antennas
Authors:
Elsayed Ahmed,
Ahmed M. Eltawil,
Zhouyuan Li,
Bedri A. Cetiner
Abstract:
Full-duplex systems are expected to achieve 100% rate improvement over half-duplex systems if the self-interference signal can be significantly mitigated. In this paper, we propose the first full-duplex system utilizing Multi-Reconfigurable Antenna (MRA) with ~90% rate improvement compared to half-duplex systems. MRA is a dynamically reconfigurable antenna structure, that is capable of changing its properties according to certain input configurations. A comprehensive experimental analysis is conducted to characterize the system performance in typical indoor environments. The experiments are performed using a fabricated MRA that has 4096 configurable radiation patterns. The achieved MRA-based passive self-interference suppression is investigated, with detailed analysis for the MRA training overhead. In addition, a heuristic-based approach is proposed to reduce the MRA training overhead. The results show that at 1% training overhead, a total of 95dB self-interference cancellation is achieved in typical indoor environments. The 95dB self-interference cancellation is experimentally shown to be sufficient for 90% full-duplex rate improvement compared to half-duplex systems.
Submitted 29 May, 2014;
originally announced May 2014.
-
On Phase Noise Suppression in Full-Duplex Systems
Authors:
Elsayed Ahmed,
Ahmed M. Eltawil
Abstract:
Oscillator phase noise has been shown to be one of the main performance limiting factors in full-duplex systems. In this paper, we consider the problem of self-interference cancellation with phase noise suppression in full-duplex systems. The feasibility of performing phase noise suppression in full-duplex systems in terms of both complexity and achieved gain is analytically and experimentally investigated. First, the effect of phase noise on full-duplex systems and the possibility of performing phase noise suppression are studied. Two different phase noise suppression techniques with a detailed complexity analysis are then proposed. For each suppression technique, both free-running and phase locked loop based oscillators are considered. Due to the fact that full-duplex system performance highly depends on hardware impairments, experimental analysis is essential for reliable results. In this paper, the performance of the proposed techniques is experimentally investigated in a typical indoor environment. The experimental results are shown to confirm the results obtained from numerical simulations on two different experimental research platforms. At the end, the tradeoff between the required complexity and the gain achieved using phase noise suppression is discussed.
Submitted 6 November, 2014; v1 submitted 24 January, 2014;
originally announced January 2014.
-
Self-Interference Cancellation with Phase Noise Induced ICI Suppression for Full-Duplex Systems
Authors:
Elsayed Ahmed,
Ahmed M. Eltawil,
Ashutosh Sabharwal
Abstract:
One of the main bottlenecks in practical full-duplex systems is the oscillator phase noise, which bounds the possible cancellable self-interference power. In this paper, a digital-domain self-interference cancellation scheme for full-duplex orthogonal frequency division multiplexing systems is proposed. The proposed scheme increases the amount of cancellable self-interference power by suppressing the effect of both transmitter and receiver oscillator phase noise. The proposed scheme consists of two main phases, an estimation phase and a cancellation phase. In the estimation phase, the minimum mean square error estimator is used to jointly estimate the transmitter and receiver phase noise associated with the incoming self-interference signal. In the cancellation phase, the estimated phase noise is used to suppress the intercarrier interference caused by the phase noise associated with the incoming self-interference signal. The performance of the proposed scheme is numerically investigated under different operating conditions. It is demonstrated that the proposed scheme could achieve up to 9dB more self-interference cancellation than the existing digital-domain cancellation schemes that ignore the intercarrier interference suppression.
Submitted 15 July, 2013;
originally announced July 2013.
-
Self-Interference Cancellation with Nonlinear Distortion Suppression for Full-Duplex Systems
Authors:
Elsayed Ahmed,
Ahmed M. Eltawil,
Ashutosh Sabharwal
Abstract:
In full-duplex systems, due to the strong self-interference signal, system nonlinearities become a significant limiting factor that bounds the possible cancellable self-interference power. In this paper, a self-interference cancellation scheme for full-duplex orthogonal frequency division multiplexing systems is proposed. The proposed scheme increases the amount of cancellable self-interference power by suppressing the distortion caused by the transmitter and receiver nonlinearities. An iterative technique is used to jointly estimate the self-interference channel and the nonlinearity coefficients required to suppress the distortion signal. The performance is numerically investigated showing that the proposed scheme achieves a performance that is less than 0.5dB off the performance of a linear full-duplex system.
Submitted 23 September, 2013; v1 submitted 14 July, 2013;
originally announced July 2013.
-
Cloud-Based Augmentation for Mobile Devices: Motivation, Taxonomies, and Open Challenges
Authors:
Saeid Abolfazli,
Zohreh Sanaei,
Ejaz Ahmed,
Abdullah Gani,
Rajkumar Buyya
Abstract:
Recently, Cloud-based Mobile Augmentation (CMA) approaches have gained remarkable ground in academia and industry. CMA is the state-of-the-art mobile augmentation model that employs resource-rich clouds to increase, enhance, and optimize the computing capabilities of mobile devices, aiming at the execution of resource-intensive mobile applications. Augmented mobile devices are envisioned to perform extensive computations and to store big data beyond their intrinsic capabilities with the least footprint and vulnerability. Researchers utilize varied cloud-based computing resources (e.g., distant clouds and nearby mobile nodes) to meet various computing requirements of mobile users. However, employing cloud-based computing resources is not a straightforward panacea. Comprehending the critical factors that impact the augmentation process and the optimum selection of cloud-based resource types are some of the challenges that hinder CMA adaptability. This paper comprehensively surveys the mobile augmentation domain and presents a taxonomy of CMA approaches. The objective of this study is to highlight the effects of remote resources on the quality and reliability of augmentation processes and to discuss the challenges and opportunities of employing varied cloud-based resources in augmenting mobile devices. We present the definition, motivation, and taxonomy of augmentation types, including traditional and cloud-based. We critically analyze the state-of-the-art CMA approaches and classify them into four groups (distant fixed, proximate fixed, proximate mobile, and hybrid) to present a taxonomy. Vital decision-making and performance limitation factors that influence the adoption of CMA approaches are introduced, and an exemplary decision-making flowchart for future CMA approaches is presented. The impacts of CMA approaches on mobile computing are discussed, and open challenges are presented as future research directions.
Submitted 20 June, 2013;
originally announced June 2013.
-
Rate Gain Region and Design Tradeoffs for Full-Duplex Wireless Communications
Authors:
Elsayed Ahmed,
Ahmed Eltawil,
Ashutosh Sabharwal
Abstract:
In this paper, we analytically study the regime in which practical full-duplex systems can achieve larger rates than equivalent half-duplex systems. The key challenge in practical full-duplex systems is the uncancelled self-interference signal, which is caused by a combination of hardware and implementation imperfections. Thus, we first present a signal model which captures the effect of significant impairments such as oscillator phase noise, low-noise amplifier noise figure, mixer noise, and analog-to-digital converter quantization noise. Using the detailed signal model, we study the rate gain region, which is defined as the region of received signal-of-interest strength where full-duplex systems outperform half-duplex systems in terms of achievable rate. The rate gain region is derived as a piece-wise linear approximation in log-domain, and numerical results show that the approximation closely matches the exact region. Our analysis shows that when phase noise dominates mixer and quantization noise, full-duplex systems can use either active analog cancellation or base-band digital cancellation to achieve near-identical rate gain regions. Finally, as a design example, we numerically investigate the full-duplex system performance and rate gain region in typical indoor environments for practical wireless applications.
Submitted 24 January, 2014; v1 submitted 7 March, 2013;
originally announced March 2013.
-
On Dynamical Cournot Game on a Graph
Authors:
E. Ahmed,
M. I. Shehata,
H. A. A. El-Saka
Abstract:
A Cournot dynamical game is studied on a graph. The stability of the system is studied. The prisoner's dilemma game is used to model natural gas transmission.
Submitted 13 July, 2012;
originally announced August 2012.
-
Building MultiView Analyst Profile From Multidimensional Query Logs: From Consensual to Conflicting Preferences
Authors:
Eya Ben Ahmed,
Ahlem Nabli,
Faïez Gargouri
Abstract:
In order to provide results suited to the analyst's needs, user preference summarization is widely used in several domains. In this paper, we introduce a new approach for user profile construction from OLAP query logs. The key idea is to learn the user's preferences by drawing evidence from OLAP logs. The analyst preferences are clustered into three main pools: (i) consensual or non-conflicting preferences, referring to the same preferences for all analysts; (ii) semi-conflicting preferences, corresponding to similar preferences for some analysts; (iii) conflicting preferences, related to disjoint preferences for all analysts. To build a generic and global model that accurately describes the analyst, we enrich the obtained characteristics by including several views, namely the personal view, the professional view and the behavioral view. After that, the multiview profile extracted from the multidimensional database can be annotated.
Submitted 15 March, 2012;
originally announced March 2012.
-
Usage Des Mesures Pour La Génération Des Règles d'Associations Cycliques
Authors:
Eya Ben Ahmed,
Ahlem Nabli,
Faïez Gargouri
Abstract:
Online analytical processing (OLAP) does not provide any explanation of the correlations discovered between data. Thus, the coupling of OLAP and data mining, especially association rules, is considered an efficient solution to this problem. In this context, we mainly focus on a particular class of association rules, namely cyclic association rules. These rules aim to discover patterns that display regular variation over user-defined intervals. Generally, the generated patterns do not take advantage of the specificities of the multidimensional context, namely the consideration of the measures and their aggregations. In this paper, we introduce a novel method for extracting cyclic association rules from measures, and we redefine the evaluation metrics of association rule quality, inspired by the concept of temporal summarizability of measures, through the integration of appropriate aggregation functions. To prove the usefulness of our approach, we conduct an empirical study on a real data warehouse.
Submitted 9 September, 2012; v1 submitted 27 December, 2011;
originally announced December 2011.
-
A Survey of User-Centric Data Warehouses: From Personalization to Recommendation
Authors:
Eya Ben Ahmed,
Ahlem Nabli,
Faïez Gargouri
Abstract:
Providing customized support for OLAP brings tremendous challenges to OLAP technology. At the crossroads of preferences and the data warehouse, two emerging trends stand out, namely: (i) personalization and (ii) recommendation. Despite the panoply of proposed approaches, the issues of the user-centric data warehouse community have not yet been addressed. In this paper we give an overview of several user-centric data warehouse proposals. We also discuss the two promising concepts in this area, namely the personalization and the recommendation of data warehouses. We compare the current approaches with one another with respect to some criteria.
Submitted 9 July, 2011;
originally announced July 2011.
-
Towards an incremental maintenance of cyclic association rules
Authors:
Eya ben Ahmed,
Mohamed Salah Gouider
Abstract:
Recently, cyclic association rules have been introduced in order to discover rules from items characterized by their regular variation over time. In real-life situations, temporal databases are often appended or updated. Rescanning the whole database every time is highly expensive, while existing incremental mining techniques can efficiently solve such a problem. In this paper, we propose an incremental algorithm for cyclic association rule maintenance. The experiments carried out on our proposal highlight its efficiency and performance.
Submitted 26 September, 2010;
originally announced September 2010.