-
An RFP dataset for Real, Fake, and Partially fake audio detection
Authors:
Abdulazeez AlAli,
George Theodorakopoulos
Abstract:
Recent advances in deep learning have enabled the creation of natural-sounding synthesised speech. However, attackers have also utilised these technologies to conduct attacks such as phishing. Numerous public datasets have been created to facilitate the development of effective detection models. However, available datasets contain only entirely fake audio; therefore, detection models may miss attacks that replace a short section of the real audio with fake audio. In recognition of this problem, the current paper presents the RFP dataset, which comprises five distinct audio types: partial fake (PF), audio with noise, voice conversion (VC), text-to-speech (TTS), and real. The data are then used to evaluate several detection models, revealing that the available detection models incur a markedly higher equal error rate (EER) when detecting PF audio than when detecting entirely fake audio. The lowest EER recorded was 25.42%. Therefore, we believe that creators of detection models must seriously consider using datasets like RFP that include PF and other types of fake audio.
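The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. A minimal sketch of how an EER could be estimated from detector scores is given below. It assumes that a higher score means "more likely real" and approximates the EER by sweeping the observed scores as thresholds; the function name and the sample scores are illustrative and do not come from the paper.

```python
def equal_error_rate(real_scores, fake_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: higher score = more likely real. A sample is accepted
    as real when its score is >= the threshold.
    """
    if not real_scores or not fake_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = None
    eer = None
    for threshold in sorted(set(real_scores) | set(fake_scores)):
        # False rejection: real audio scored below the threshold.
        frr = sum(s < threshold for s in real_scores) / len(real_scores)
        # False acceptance: fake audio scored at or above the threshold.
        far = sum(s >= threshold for s in fake_scores) / len(fake_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2  # midpoint where the two rates are closest
    return eer


# Illustrative scores only: one real clip scores low, one fake clip scores high.
real = [0.9, 0.8, 0.3, 0.7]
fake = [0.1, 0.2, 0.75, 0.4]
print(equal_error_rate(real, fake))
```

With few samples the two error rates may never be exactly equal, which is why the sketch returns the midpoint at the threshold where they are closest.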
Submitted 26 April, 2024;
originally announced April 2024.
-
ViFiT: Reconstructing Vision Trajectories from IMU and Wi-Fi Fine Time Measurements
Authors:
Bryan Bo Cao,
Abrar Alali,
Hansi Liu,
Nicholas Meegan,
Marco Gruteser,
Kristin Dana,
Ashwin Ashok,
Shubham Jain
Abstract:
Tracking subjects in videos is one of the most widely used functions in camera-based IoT applications such as security surveillance, smart city traffic safety enhancement, and vehicle-to-pedestrian communication. In the computer vision domain, tracking is usually achieved by first detecting subjects with bounding boxes, then associating detected bounding boxes across video frames. For many IoT systems, images captured by cameras are usually sent over the network to be processed at a different site that has more powerful computing resources than edge devices. However, sending entire frames through the network causes significant bandwidth consumption that may exceed the system bandwidth constraints. To tackle this problem, we propose ViFiT, a transformer-based model that reconstructs vision bounding box trajectories from phone data (IMU and Fine Time Measurements). It leverages the transformer's ability to model long-term time-series data. ViFiT is evaluated on the Vi-Fi Dataset, a large-scale multimodal dataset covering five diverse real-world scenes, including indoor and outdoor environments. Because no existing metric jointly captures tracking quality and video bandwidth reduction, we propose a novel evaluation framework dubbed Minimum Required Frames (MRF) and Minimum Required Frames Ratio (MRFR). ViFiT achieves an MRFR of 0.65, outperforming the 0.98 of X-Translator, the state-of-the-art LSTM encoder-decoder approach for cross-modal reconstruction, and yields a frame reduction rate of 97.76%.
Submitted 4 October, 2023;
originally announced October 2023.
-
ADOPT: A system for Alerting Drivers to Occluded Pedestrian Traffic
Authors:
Abrar Alali,
Stephan Olariu,
Shubham Jain
Abstract:
Recent statistics reveal an alarming increase in accidents involving pedestrians (especially children) crossing the street. A common philosophy of existing pedestrian detection approaches is that this task should be undertaken by the moving cars themselves. In sharp departure from this philosophy, we propose to enlist the help of cars parked along the sidewalk to detect and protect crossing pedestrians. In support of this goal, we propose ADOPT: a system for Alerting Drivers to Occluded Pedestrian Traffic. ADOPT lays the theoretical foundations of a system that uses parked cars to: (1) detect the presence of a group of crossing pedestrians - a crossing cohort; (2) predict the time the last member of the cohort takes to clear the street; (3) send alert messages to those approaching cars that may reach the crossing area while pedestrians are still in the street; and, (4) show how approaching cars can adjust their speed, given several simultaneous crossing locations. Importantly, in ADOPT all communications occur over very short distances and at very low power. Our extensive simulations using SUMO-generated pedestrian and car traffic have shown the effectiveness of ADOPT in detecting and protecting crossing pedestrians.
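Steps (2) to (4) in the abstract amount to comparing two times: when the last pedestrian clears the street and when an approaching car reaches the crossing. The sketch below illustrates that comparison under strong simplifying assumptions that are not taken from the paper: constant walking and driving speeds, a single crossing location, and hypothetical function names and parameters. It is not ADOPT's actual model.

```python
def clearing_time(street_width_m, slowest_walk_speed_mps, start_delay_s=0.0):
    """Time until the last cohort member clears the street.

    Assumption: the last member walks the full street width at a
    constant speed after an optional start delay.
    """
    if slowest_walk_speed_mps <= 0:
        raise ValueError("walking speed must be positive")
    return start_delay_s + street_width_m / slowest_walk_speed_mps


def needs_alert(distance_to_crossing_m, car_speed_mps, t_clear_s):
    """True if the car would reach the crossing before it is clear."""
    if car_speed_mps <= 0:
        return False  # a stopped car never reaches the crossing
    return distance_to_crossing_m / car_speed_mps < t_clear_s


def advised_speed(distance_to_crossing_m, car_speed_mps, t_clear_s):
    """Speed at which the car arrives exactly when the crossing clears."""
    if not needs_alert(distance_to_crossing_m, car_speed_mps, t_clear_s):
        return car_speed_mps  # no adjustment needed
    return distance_to_crossing_m / t_clear_s


# Illustrative numbers only: a 10 m wide street, slowest walker at 1 m/s.
t_clear = clearing_time(10.0, 1.0)
print(needs_alert(100.0, 15.0, t_clear))
print(advised_speed(100.0, 15.0, t_clear))
```

Handling several simultaneous crossing locations, as in step (4), would require reasoning over all crossings along the car's route, which this sketch does not attempt.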
Submitted 20 October, 2022;
originally announced December 2022.
-
ViFiCon: Vision and Wireless Association Via Self-Supervised Contrastive Learning
Authors:
Nicholas Meegan,
Hansi Liu,
Bryan Cao,
Abrar Alali,
Kristin Dana,
Marco Gruteser,
Shubham Jain,
Ashwin Ashok
Abstract:
We introduce ViFiCon, a self-supervised contrastive learning scheme which uses synchronized information across vision and wireless modalities to perform cross-modal association. Specifically, the system uses pedestrian data collected from RGB-D camera footage as well as WiFi Fine Time Measurements (FTM) from a user's smartphone device. We represent the temporal sequence by stacking multi-person depth data spatially within a banded image. Depth data from RGB-D (vision domain) is inherently linked with an observable pedestrian, but FTM data (wireless domain) is associated only with a smartphone on the network. To formulate the cross-modal association problem as self-supervised, the network learns a scene-wide synchronization of the two modalities as a pretext task, and then uses that learned representation for the downstream task of associating individual bounding boxes with specific smartphones, i.e. associating vision and wireless information. We use a pre-trained region proposal model on the camera footage and then feed the extrapolated bounding box information into a dual-branch convolutional neural network along with the FTM data. We show that, compared to fully supervised SoTA models, ViFiCon achieves high-performance vision-to-wireless association, finding which bounding box corresponds to which smartphone device, without hand-labeled association examples for training data.
Submitted 11 October, 2022;
originally announced October 2022.
-
Finite-Volume Simulation of Capillary-Dominated Flow in Matrix-Fracture Systems using Interface Conditions
Authors:
Ammar Alali,
Francois Hamon,
Bradley Mallison,
Hamdi Tchelepi
Abstract:
In numerical simulations of multiphase flow and transport in fractured porous media, the estimation of the hydrocarbon recovery requires accurately predicting the capillary-driven imbibition rate of the wetting phase initially present in the fracture into the low-permeability matrix. In the fully implicit finite-volume scheme, this entails a robust methodology that captures the capillary flux at the interface between the matrix and the fracture even when very coarse control volumes are used to discretize the matrix. Here, we investigate the application of discrete interface conditions at the matrix-fracture interface to improve the accuracy of the flux computation without relying on extreme grid refinement. In particular, we study the interaction of the upwinding scheme with the discrete interface conditions. Considering first capillary-dominated spontaneous imbibition and then forced imbibition with viscous, buoyancy, and capillary forces, we illustrate the importance of the interface conditions to accurately capture the matrix-fracture flux and correctly represent the flow dynamics in the problem.
Submitted 28 May, 2020; v1 submitted 5 July, 2019;
originally announced July 2019.
-
Modeling and simulation of multiprocessor systems MPSoC by SystemC/TLM2
Authors:
Abdelhakim Alali,
Ismail Assayad,
Mohamed Sadik
Abstract:
Current manufacturing technology allows the integration of a complex multiprocessor system on one piece of silicon (MPSoC, for Multiprocessor System-on-Chip). One way to manage the growing complexity of these systems is to increase the level of abstraction and to address the system-level design. In this paper, we focus on the implementation in the SystemC language with TLM (Transaction Level Model) to model an MPSoC platform. Our main contribution is to define a comprehensive, fast and accurate method for designing and evaluating the performance of MPSoC systems. The studied MPSoC is composed of MicroBlaze microprocessors, memory, a timer, a VGA and an interrupt handler, with two examples of software. This paper has two novel contributions: the first is to develop this MPSoC at CABA and TLM for ISS (Instruction Set Simulator), Native simulations and timed Programmer's View (PV+T); the second is to show that with PV+T simulations we can achieve timing fidelity with higher speeds than CABA simulations and have almost the same precision.
Submitted 5 August, 2014;
originally announced August 2014.