-
Energy efficiency analysis of ammonia-fueled power systems for vehicles considering residual heat recovery
Authors:
Zexin Nie,
Yi Huang,
Guangyu Tian
Abstract:
Ammonia, known as a good hydrogen carrier, shows great potential for use as a zero-carbon fuel for vehicles. However, both the internal combustion engine (ICE) and the proton exchange membrane fuel cell (PEMFC), the currently available engines used by the vehicle, require hydrogen decomposed from ammonia. On-board hydrogen production is an energy-intensive process that significantly reduces system efficiency. Therefore, energy recovery from the system's residual heat is essential to promote system efficiency. ICEs and FCs require different amounts of hydrogen, and they produce residual heat of different quality and quantity, so the system efficiency is determined not only by the engine operating point, but also by the measures and ratios of residual heat recovery. To thoroughly understand the relationships between system energy efficiency and system configuration as well as system parameters, this paper studies three typical power systems with different configurations. Models of the three systems are set up for system energy efficiency analysis, and simulations are carried out under different conditions to evaluate system output power and energy efficiency. By analyzing the simulation results, the factors that most significantly impact the system efficiency are identified, and guidelines for system design and parameter optimization are proposed.
Submitted 17 June, 2024;
originally announced June 2024.
-
Attention-aided Outdoor Localization in Commercial 5G NR Systems
Authors:
Guoda Tian,
Dino Pjanić,
Xuesong Cai,
Bo Bernhardsson,
Fredrik Tufvesson
Abstract:
The integration of high-precision cellular localization and machine learning (ML) is considered a cornerstone technique in future cellular navigation systems, offering unparalleled accuracy and functionality. This study focuses on localization based on uplink channel measurements in a fifth-generation (5G) new radio (NR) system. An attention-aided ML-based single-snapshot localization pipeline is presented, which consists of several cascaded blocks, namely a signal processing block, an attention-aided block, and an uncertainty estimation block. Specifically, the signal processing block generates an impulse response beam matrix for all beams. The attention-aided block trains on the channel impulse responses using an attention-aided network, which captures the correlation between impulse responses for different beams. The uncertainty estimation block predicts the probability density function of the UE position, thereby also indicating the confidence level of the localization result. Two representative uncertainty estimation techniques, the negative log-likelihood and the regression-by-classification techniques, are applied and compared. Furthermore, for dynamic measurements with multiple snapshots available, we combine the proposed pipeline with a Kalman filter to enhance localization accuracy. To evaluate our approach, we extract channel impulse responses for different beams from a commercial base station. The outdoor measurement campaign covers Line-of-Sight (LoS), Non-Line-of-Sight (NLoS), and a mix of LoS and NLoS scenarios. The results show that sub-meter localization accuracy can be achieved.
Submitted 15 May, 2024;
originally announced May 2024.
-
Unveiling Four Key Factors for Tire Force Control Allocation in 4WID-4WIS Electric Vehicles at Handling Limits
Authors:
Ao Lu,
Runfeng Li,
Yunchang Yu,
Ziwang Lu,
Guangyu Tian
Abstract:
The four-wheel independent drive and four-wheel independent steering (4WID-4WIS) configurations enhance control flexibility and dynamic performance potential for more integrated electric vehicles. This paper comprehensively analyzes the impacts of four key factors on tire force control allocation: vertical load estimation, actuator dynamic characteristics, tire force constraints, and wheel steering precision at handling limits. The study demonstrates that precise vertical load estimation enhances lateral force allocation accuracy. Additionally, the self-compensating effect of lateral tire forces minimizes the impact of small deviations in vertical load estimation on tire force control allocation. A novel control allocation method considering actuator dynamics is introduced, effectively improving yaw rate response and reducing tracking errors. Considering tire-road adhesion and actuator rate constraints, an innovative method to calculate the real-time attainable tire force volume is proposed based on the tire slip ratio and slip angle. Feedforward control with bump steer compensation is implemented to improve wheel steering precision and lateral tire force control accuracy. Matlab/Simulink and Carsim co-simulation results emphasize the importance of these key factors' individual impacts and combined effects. This analysis offers valuable insights for developing advanced tire force control allocation strategies in 4WID-4WIS electric vehicles.
Submitted 19 March, 2024;
originally announced March 2024.
-
GMPC: Geometric Model Predictive Control for Wheeled Mobile Robot Trajectory Tracking
Authors:
Jiawei Tang,
Shuang Wu,
Bo Lan,
Yahui Dong,
Yuqiang **,
Guangjian Tian,
Wen-An Zhang,
Ling Shi
Abstract:
The configuration of most robotic systems lies in continuous transformation groups. However, in mobile robot trajectory tracking, many recent works still naively utilize optimization methods for elements in vector space without considering the manifold constraint of the robot configuration. In this letter, we propose a geometric model predictive control (MPC) framework for wheeled mobile robot trajectory tracking. We first derive the error dynamics of the wheeled mobile robot trajectory tracking by considering its manifold constraint and kinematic constraint simultaneously. After that, we utilize the relationship between the Lie group and Lie algebra to convexify the tracking control problem, which enables us to solve the problem efficiently. Thanks to the Lie group formulation, our method tracks the trajectory more smoothly than existing nonlinear MPC. Simulations and physical experiments verify the effectiveness of our proposed methods. Our pure Python-based simulation platform is publicly available to benefit further research in the community.
Submitted 12 March, 2024;
originally announced March 2024.
-
Sensor Attacks and Resilient Defense on HVAC Systems for Energy Market Signal Tracking
Authors:
Guanyu Tian,
Qun Zhou Sun,
Yiyuan Qiao
Abstract:
The power flexibility from smart buildings makes them suitable candidates for providing grid services. The building automation system (BAS) that employs model predictive control (MPC) for grid services relies heavily on sensor data gathered from IoT-based HVAC systems through communication networks. However, cyber-attacks that tamper sensor values can compromise the accuracy and flexibility of HVAC system power adjustment. Existing studies on grid-interactive buildings mainly focus on the efficiency and flexibility of buildings' participation in grid operations, while the security aspect is lacking. In this paper, we investigate the effects of cyber-attacks on HVAC systems in grid-interactive buildings, specifically their power-tracking performance. We design a stochastic optimization-based stealthy sensor attack and a corresponding defense strategy using a resilient control framework. The attack and its defense are tested in a physical model of a test building with a single-chiller HVAC system. Simulation results demonstrate that minor falsifications caused by a stealthy sensor attack can significantly alter the power profile, leading to large power tracking errors. However, the resilient control framework can reduce the power tracking error by over 70% under such attacks without filtering out compromised data.
Submitted 23 October, 2023;
originally announced October 2023.
-
LuViRA Dataset Validation and Discussion: Comparing Vision, Radio, and Audio Sensors for Indoor Localization
Authors:
Ilayda Yaman,
Guoda Tian,
Erik Tegler,
Jens Gulin,
Nikhil Challa,
Fredrik Tufvesson,
Ove Edfors,
Kalle Astrom,
Steffen Malkowsky,
Liang Liu
Abstract:
We present a unique comparative analysis, and evaluation of vision, radio, and audio based localization algorithms. We create the first baseline for the aforementioned sensors using the recently published Lund University Vision, Radio, and Audio (LuViRA) dataset, where all the sensors are synchronized and measured in the same environment. Some of the challenges of using each specific sensor for indoor localization tasks are highlighted. Each sensor is paired with a current state-of-the-art localization algorithm and evaluated for different aspects: localization accuracy, reliability and sensitivity to environment changes, calibration requirements, and potential system complexity. Specifically, the evaluation covers the ORB-SLAM3 algorithm for vision-based localization with an RGB-D camera, a machine-learning algorithm for radio-based localization with massive MIMO technology, and the SFS2 algorithm for audio-based localization with distributed microphones. The results can serve as a guideline and basis for further development of robust and high-precision multi-sensory localization systems, e.g., through sensor fusion, context, and environment-aware adaptation.
Submitted 25 April, 2024; v1 submitted 6 September, 2023;
originally announced September 2023.
-
ViG-UNet: Vision Graph Neural Networks for Medical Image Segmentation
Authors:
Juntao Jiang,
Xiyu Chen,
Guanzhong Tian,
Yong Liu
Abstract:
Deep neural networks have been widely used in medical image analysis and medical image segmentation is one of the most important tasks. U-shaped neural networks with encoder-decoder are prevailing and have succeeded greatly in various segmentation tasks. While CNNs treat an image as a grid of pixels in Euclidean space and Transformers recognize an image as a sequence of patches, graph-based representation is more generalized and can construct connections for each part of an image. In this paper, we propose a novel ViG-UNet, a graph neural network-based U-shaped architecture with the encoder, the decoder, the bottleneck, and skip connections. The downsampling and upsampling modules are also carefully designed. The experimental results on ISIC 2016, ISIC 2017 and Kvasir-SEG datasets demonstrate that our proposed architecture outperforms most existing classic and state-of-the-art U-shaped networks.
Submitted 7 June, 2023;
originally announced June 2023.
-
High-Precision Machine-Learning Based Indoor Localization with Massive MIMO System
Authors:
Guoda Tian,
Ilayda Yaman,
Michiel Sandra,
Xuesong Cai,
Liang Liu,
Fredrik Tufvesson
Abstract:
High-precision cellular-based localization is one of the key technologies for next-generation communication systems. In this paper, we investigate the potential of applying machine learning (ML) to a massive multiple-input multiple-output (MIMO) system to enhance localization accuracy. We analyze a new ML-based localization pipeline that has two parallel fully connected neural networks (FCNN). The first FCNN takes the instantaneous spatial covariance matrix to capture angular information, while the second FCNN takes the channel impulse responses to capture delay information. We fuse the estimated coordinates of these two FCNNs for further accuracy improvement. To test the localization algorithm, we performed an indoor measurement campaign with a massive MIMO testbed at 3.7GHz. In the measured scenario, the proposed pipeline can achieve centimeter-level accuracy by combining delay and angular information.
Submitted 7 March, 2023;
originally announced March 2023.
-
The LuViRA Dataset: Synchronized Vision, Radio, and Audio Sensors for Indoor Localization
Authors:
Ilayda Yaman,
Guoda Tian,
Martin Larsson,
Patrik Persson,
Michiel Sandra,
Alexander Dürr,
Erik Tegler,
Nikhil Challa,
Henrik Garde,
Fredrik Tufvesson,
Kalle Åström,
Ove Edfors,
Steffen Malkowsky,
Liang Liu
Abstract:
We present a synchronized multisensory dataset for accurate and robust indoor localization: the Lund University Vision, Radio, and Audio (LuViRA) Dataset. The dataset includes color images, corresponding depth maps, inertial measurement unit (IMU) readings, channel response between a 5G massive multiple-input and multiple-output (MIMO) testbed and user equipment, audio recorded by 12 microphones, and accurate six degrees of freedom (6DOF) pose ground truth of 0.5 mm. We synchronize these sensors to ensure that all data is recorded simultaneously. A camera, speaker, and transmit antenna are placed on top of a slowly moving service robot, and 89 trajectories are recorded. Each trajectory includes 20 to 50 seconds of recorded sensor data and ground truth labels. Data from different sensors can be used separately or jointly to perform localization tasks, and data from the motion capture (mocap) system is used to verify the results obtained by the localization algorithms. The main aim of this dataset is to enable research on sensor fusion with the most commonly used sensors for localization tasks. Moreover, the full dataset or some parts of it can also be used for other research areas such as channel estimation, image classification, etc. Our dataset is available at: https://github.com/ilaydayaman/LuViRA_Dataset
Submitted 26 April, 2024; v1 submitted 10 February, 2023;
originally announced February 2023.
-
Convolutional Neural Network Modelling for MODIS Land Surface Temperature Super-Resolution
Authors:
Binh Minh Nguyen,
Ganglin Tian,
Minh-Triet Vo,
Aurélie Michel,
Thomas Corpetti,
Carlos Granero-Belinchon
Abstract:
Nowadays, thermal infrared satellite remote sensors make it possible to extract very interesting information at large scale, in particular Land Surface Temperature (LST). However, such data are limited in spatial and/or temporal resolution, which prevents analysis at fine scales. For example, the MODIS satellite provides daily acquisitions at 1 km spatial resolution, which is not sufficient to deal with highly heterogeneous environments such as agricultural parcels. Therefore, image super-resolution is a crucial task to better exploit MODIS LSTs. This issue is tackled in this paper. We introduce a deep learning-based algorithm, named Multi-residual U-Net, for super-resolution of single MODIS LST images. Our proposed network is a modified version of the U-Net architecture, which aims at super-resolving the input LST image from 1 km to 250 m per pixel. The results show that our Multi-residual U-Net outperforms other state-of-the-art methods.
Submitted 1 April, 2022; v1 submitted 22 February, 2022;
originally announced February 2022.
-
SelFSR: Self-Conditioned Face Super-Resolution in the Wild via Flow Field Degradation Network
Authors:
Xianfang Zeng,
Jiangning Zhang,
Liang Liu,
Guangzhong Tian,
Yong Liu
Abstract:
In spite of the success on benchmark datasets, most advanced face super-resolution models perform poorly in real scenarios since the remarkable domain gap between the real images and the synthesized training pairs. To tackle this problem, we propose a novel domain-adaptive degradation network for face super-resolution in the wild. This degradation network predicts a flow field along with an intermediate low resolution image. Then, the degraded counterpart is generated by warping the intermediate image. With the preference of capturing motion blur, such a model performs better at preserving identity consistency between the original images and the degraded. We further present the self-conditioned block for super-resolution network. This block takes the input image as a condition term to effectively utilize facial structure information, eliminating the reliance on explicit priors, e.g. facial landmarks or boundary. Our model achieves state-of-the-art performance on both CelebA and real-world face dataset. The former demonstrates the powerful generative ability of our proposed architecture while the latter shows great identity consistency and perceptual quality in real-world images.
Submitted 20 December, 2021;
originally announced December 2021.
-
Sensing and Classification Using Massive MIMO: A Tensor Decomposition-Based Approach
Authors:
B. R. Manoj,
Guoda Tian,
Sara Gunnarsson,
Fredrik Tufvesson,
Erik G. Larsson
Abstract:
Wireless-based activity sensing has gained significant attention due to its wide range of applications. We investigate radio-based multi-class classification of human activities using massive multiple-input multiple-output (MIMO) channel measurements in line-of-sight and non line-of-sight scenarios. We propose a tensor decomposition-based algorithm to extract features by exploiting the complex correlation characteristics across time, frequency, and space from channel tensors formed from the measurements, followed by a neural network that learns the relationship between the input features and output target labels. Through evaluations of real measurement data, it is demonstrated that the classification accuracy using a massive MIMO array achieves significantly better results compared to the state-of-the-art even for a smaller experimental data set.
Submitted 2 September, 2021;
originally announced September 2021.
-
HVAC Scheduling under Data Uncertainties: A Distributionally Robust Approach
Authors:
Guanyu Tian,
Qun Zhou,
Samy Faddel,
Wenyi Wang
Abstract:
The heating, ventilation and air condition (HVAC) system consumes the most energy in commercial buildings, consisting over 60% of total energy usage in the U.S. Flexible HVAC system setpoint scheduling could potentially save building energy costs. This paper first studies deterministic optimization, robust optimization, and stochastic optimization to minimize the daily operation cost with constraints of indoor air temperature comfort and mechanic operating requirement. Considering the uncertainties from ambient temperature, a Wasserstein metric-based distributionally robust optimization (DRO) method is proposed to enhance the robustness of the optimal schedule against the uncertainty of probabilistic prediction errors. The schedule is optimized under the worst-case distribution within an ambiguity set defined by the Wasserstein metric. The proposed DRO method is initially formulated as a two-stage problem and then reformulated into a tractable mixed-integer linear programming (MILP) form. The paper evaluates the feasibility and optimality of the optimized schedules for a real commercial building. The numerical results indicate that the costs of the proposed DRO method are up to 6.6% lower compared with conventional techniques of optimization under uncertainties. They also provide granular risk-benefit options for decision making in demand response programs.
Submitted 9 March, 2021;
originally announced March 2021.
-
Quantitative Evaluation of Crack Depths on Thin Aluminum Plate using Eddy Current Pulse-Compression Thermography
Authors:
Qiuji Yi,
Hamed Malekmohammadi,
Gui Yun Tian,
Stefano Laureti,
Marco Ricci
Abstract:
Eddy current stimulated thermography is an emerging technique for non-destructive testing and evaluation of conductive materials. However, quantitative estimation of the depth of subsurface defects in metallic materials by thermography techniques remains challenging due to significant lateral thermal diffusion. This work presents the application of eddy current pulse compression thermography to detect surface and subsurface defects with various depths in an aluminum sample. Kernel Principal Component analysis and Low Rank Sparse modelling were used to enhance the defective area, and cross point feature was exploited to quantitatively evaluate the defects depth. Based on experimental results, it is shown that the crossing point feature has a monotonic relationship with surface and subsurface defects depth, and it can also indicate whether the defect is within or beyond the eddy current skin depth. In addition, the comparison study between aluminum and composites in terms of impulse response and proposed features are also presented.
Submitted 24 February, 2021;
originally announced February 2021.
-
Moving Object Classification with a Sub-6 GHz Massive MIMO Array using Real Data
Authors:
B. R. Manoj,
Guoda Tian,
Sara Gunnarsson,
Fredrik Tufvesson,
Erik G. Larsson
Abstract:
Classification between different activities in an indoor environment using wireless signals is an emerging technology for various applications, including intrusion detection, patient care, and smart homes. Researchers have shown different methods to classify activities and their potential benefits by utilizing WiFi signals. In this paper, we analyze classification of moving objects by employing machine learning on real data from a massive multi-input-multi-output (MIMO) system in an indoor environment. We conduct measurements for different activities in both line-of-sight and non line-of-sight scenarios with a massive MIMO testbed operating at 3.7 GHz. We propose algorithms to exploit amplitude and phase-based features for the classification task. For the considered setup, we benchmark the classification performance and show that we can achieve up to 98% accuracy using real massive MIMO data, even with a small number of experiments. Furthermore, we demonstrate the gain in performance with a massive MIMO system as compared with that of a limited number of antennas such as in WiFi devices.
Submitted 9 February, 2021;
originally announced February 2021.
-
Amplitude and Phase Estimation for Absolute Calibration of Massive MIMO Front-Ends
Authors:
Guoda Tian,
Harsh Tataria,
Fredrik Tufvesson
Abstract:
Massive multiple-input multiple-output (MIMO) promises significantly higher performance relative to conventional multiuser systems. However, the promised gains of massive MIMO systems rely heavily on the accuracy of the absolute front-end calibration, as well as quality of channel estimates at the base station (BS). In this paper, we analyze user equipment-aided calibration mechanism to estimate the amplitude scaling and phase drift at each radio-frequency chain interfacing with the BS array. Assuming a uniform linear array at the BS and Ricean fading, we obtain the estimation parameters with moment-based (amplitude, phase) and maximum-likelihood (phase-only) estimation techniques. In stark contrast to previous works, we mathematically articulate the equivalence of the two approaches for phase estimation. Furthermore, we rigorously derive a Cramer-Rao lower bound to characterize the accuracy of the two estimators. Via numerical simulations, we evaluate the estimator performance with varying dominant line-of-sight powers, dominant angles-of-arrival, and signal-to-noise ratios.
Submitted 25 February, 2020;
originally announced February 2020.
-
Audio2Face: Generating Speech/Face Animation from Single Audio with Attention-Based Bidirectional LSTM Networks
Authors:
Guanzhong Tian,
Yi Yuan,
Yong Liu
Abstract:
We propose an end to end deep learning approach for generating real-time facial animation from just audio. Specifically, our deep architecture employs deep bidirectional long short-term memory network and attention mechanism to discover the latent representations of time-varying contextual information within the speech and recognize the significance of different information contributed to certain face status. Therefore, our model is able to drive different levels of facial movements at inference and automatically keep up with the corresponding pitch and latent speaking style in the input audio, with no assumption or further human intervention. Evaluation results show that our method could not only generate accurate lip movements from audio, but also successfully regress the speaker's time-varying facial movements.
Submitted 27 May, 2019;
originally announced May 2019.