-
Frieren: Efficient Video-to-Audio Generation with Rectified Flow Matching
Authors:
Yongqi Wang,
Wenxiang Guo,
Rongjie Huang,
Jiawei Huang,
Zehan Wang,
Fuming You,
Ruiqi Li,
Zhou Zhao
Abstract:
Video-to-audio (V2A) generation aims to synthesize content-matching audio from silent video, and it remains challenging to build V2A models with high generation quality, efficiency, and visual-audio temporal synchrony. We propose Frieren, a V2A model based on rectified flow matching. Frieren regresses the conditional transport vector field from noise to spectrogram latent with straight paths and conducts sampling by solving an ODE, outperforming autoregressive and score-based models in terms of audio quality. By employing a non-autoregressive vector field estimator based on a feed-forward transformer and channel-level cross-modal feature fusion with strong temporal alignment, our model generates audio that is highly synchronized with the input video. Furthermore, through reflow and one-step distillation with a guided vector field, our model can generate decent audio in a few sampling steps, or even only one. Experiments indicate that Frieren achieves state-of-the-art performance in both generation quality and temporal alignment on VGGSound, with alignment accuracy reaching 97.22% and a 6.2% improvement in inception score over the strong diffusion-based baseline. Audio samples are available at http://frieren-v2a.github.io.
Submitted 1 June, 2024;
originally announced June 2024.
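Illustrative note: the abstract above describes sampling by solving an ODE along straight transport paths. The following is a minimal Python sketch of why straight paths allow few-step sampling. It is not the Frieren implementation: the toy field `straight_path_velocity` is a hypothetical stand-in for a learned vector field estimator, and `euler_sample` is a plain fixed-step Euler solver chosen here as an assumption.

```python
import numpy as np


def straight_path_velocity(x, t, target):
    """Toy velocity field (assumption): every trajectory is a straight line
    that reaches `target` at t = 1. A trained model would replace this."""
    return (target - x) / (1.0 - t)


def euler_sample(x0, velocity, num_steps):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # largest t used is 1 - dt, so 1 - t never hits zero
        x = x + dt * velocity(x, t)
    return x


# Hypothetical "noise" start point and "data" end point.
noise = np.array([1.5, -0.3, 0.8])
data = np.array([0.0, 2.0, -1.0])


def field(x, t):
    return straight_path_velocity(x, t, data)


one_step = euler_sample(noise, field, num_steps=1)
many_steps = euler_sample(noise, field, num_steps=25)
```

Because the toy trajectories are exactly straight, a single Euler step lands on the same end point as 25 steps. With a learned field the paths are only approximately straight, which is the gap that reflow and distillation are described as narrowing.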
-
Text-to-Song: Towards Controllable Music Generation Incorporating Vocals and Accompaniment
Authors:
Zhiqing Hong,
Rongjie Huang,
Xize Cheng,
Yongqi Wang,
Ruiqi Li,
Fuming You,
Zhou Zhao,
Zhimeng Zhang
Abstract:
A song is a combination of singing voice and accompaniment. However, existing works focus on singing voice synthesis and music generation independently, and little attention has been paid to song synthesis. In this work, we propose a novel task called text-to-song synthesis, which incorporates both vocal and accompaniment generation. We develop Melodist, a two-stage text-to-song method that consists of singing voice synthesis (SVS) and vocal-to-accompaniment (V2A) synthesis. Melodist leverages tri-tower contrastive pretraining to learn more effective text representations for controllable V2A synthesis. A Chinese song dataset mined from a music website is built to alleviate data scarcity for our research. The evaluation results on our dataset demonstrate that Melodist can synthesize songs with comparable quality and style consistency. Audio samples can be found at https://text2songMelodist.github.io/Sample/.
Submitted 20 May, 2024; v1 submitted 14 April, 2024;
originally announced April 2024.
-
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt
Authors:
Yongqi Wang,
Ruofan Hu,
Rongjie Huang,
Zhiqing Hong,
Ruiqi Li,
Wenrui Liu,
Fuming You,
Tao Jin,
Zhou Zhao
Abstract:
Recent singing-voice-synthesis (SVS) methods have achieved remarkable audio quality and naturalness, yet they lack the capability to control the style attributes of the synthesized singing explicitly. We propose Prompt-Singer, the first SVS method that enables control over singer gender, vocal range, and volume with natural language. We adopt a model architecture based on a decoder-only transformer with a multi-scale hierarchy, and design a range-melody decoupled pitch representation that enables text-conditioned vocal range control while keeping melodic accuracy. Furthermore, we explore various experiment settings, including different types of text representations, text encoder fine-tuning, and introducing speech data to alleviate data scarcity, aiming to facilitate further research. Experiments show that our model achieves favorable control ability and audio quality. Audio samples are available at http://prompt-singer.github.io.
Submitted 18 March, 2024;
originally announced March 2024.
-
Generative AI and Process Systems Engineering: The Next Frontier
Authors:
Benjamin Decardi-Nelson,
Abdulelah S. Alshehri,
Akshay Ajagekar,
Fengqi You
Abstract:
This article explores how emerging generative artificial intelligence (GenAI) models, such as large language models (LLMs), can enhance solution methodologies within process systems engineering (PSE). These cutting-edge GenAI models, particularly foundation models (FMs), which are pre-trained on extensive, general-purpose datasets, offer versatile adaptability for a broad range of tasks, including responding to queries, image generation, and complex decision-making. Given the close relationship between advancements in PSE and developments in computing and systems technologies, exploring the synergy between GenAI and PSE is essential. We begin our discussion with a compact overview of both classic and emerging GenAI models, including FMs, and then dive into their applications within key PSE domains: synthesis and design, optimization and integration, and process monitoring and control. In each domain, we explore how GenAI models could potentially advance PSE methodologies, providing insights and prospects for each area. Furthermore, the article identifies and discusses potential challenges in fully leveraging GenAI within PSE, including multiscale modeling, data requirements, evaluation metrics and benchmarks, and trust and safety, thereby deepening the discourse on effective GenAI integration into systems analysis, design, optimization, operations, monitoring, and control. This paper provides a guide for future research focused on the applications of emerging GenAI in PSE.
Submitted 6 May, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
Quantum Computing Assisted Deep Learning for Fault Detection and Diagnosis in Industrial Process Systems
Authors:
Akshay Ajagekar,
Fengqi You
Abstract:
Quantum computing (QC) and deep learning techniques have attracted widespread attention in recent years. This paper proposes QC-based deep learning methods for fault diagnosis that exploit their unique capabilities to overcome the computational challenges faced by conventional data-driven approaches performed on classical computers. Deep belief networks are integrated into the proposed fault diagnosis model and are used to extract features at different levels for normal and faulty process operations. The QC-based fault diagnosis model uses a quantum computing assisted generative training process followed by discriminative training to address the shortcomings of classical algorithms. To demonstrate its applicability and efficiency, the proposed fault diagnosis method is applied to process monitoring of a continuous stirred tank reactor (CSTR) and the Tennessee Eastman (TE) process. The proposed QC-based deep learning approach achieves superior fault detection and diagnosis performance, with average fault detection rates of 79.2% and 99.39% for the CSTR and TE process, respectively.
Submitted 1 October, 2020; v1 submitted 29 February, 2020;
originally announced March 2020.
-
Efficient Greenhouse Temperature Control with Data-Driven Robust Model Predictive Control
Authors:
Wei-Han Chen,
Fengqi You
Abstract:
Appropriate greenhouse temperature should be maintained to ensure crop production while minimizing energy consumption. Even though weather forecasts can provide a certain amount of information to improve control performance, they are not perfect, and forecast error may cause the temperature to deviate from the acceptable range. To address the inherent uncertainty in weather that affects control accuracy, this paper develops a data-driven robust model predictive control (MPC) approach for greenhouse temperature control. The dynamic model is obtained from thermal resistance-capacitance modeling derived by the Building Resistance-Capacitance Modeling (BRCM) toolbox. Uncertainty sets of ambient temperature and solar radiation are captured by a support vector clustering technique, and they are further tuned for better quality by a training-calibration procedure. A case study that implements the carefully chosen uncertainty sets in robust model predictive control shows that the data-driven robust MPC has better control performance than rule-based control, certainty equivalent MPC, and robust MPC.
Submitted 31 December, 2019; v1 submitted 29 December, 2019;
originally announced December 2019.
-
A Posteriori Probabilistic Bounds of Convex Scenario Programs with Validation Tests
Authors:
Chao Shang,
Fengqi You
Abstract:
Scenario programs have established themselves as efficient tools towards decision-making under uncertainty. To assess the quality of scenario-based solutions a posteriori, validation tests based on Bernoulli trials have been widely adopted in practice. However, to reach a theoretically reliable judgement of risk, one typically needs to collect massive validation samples. In this work, we propose new a posteriori bounds for convex scenario programs with validation tests, which are dependent on both realizations of support constraints and performance on out-of-sample validation data. The proposed bounds enjoy wide generality in that many existing theoretical results can be incorporated as particular cases. To facilitate practical use, a systematic approach for parameterizing a posteriori probability bounds is also developed, which is shown to possess a variety of desirable properties allowing for easy implementations and clear interpretations. By synthesizing comprehensive information about support constraints and validation tests, improved risk evaluation can be achieved for randomized solutions in comparison with existing a posteriori bounds. Case studies on controller design of aircraft lateral motion are presented to validate the effectiveness of the proposed a posteriori bounds.
Submitted 13 September, 2020; v1 submitted 27 March, 2019;
originally announced March 2019.
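Illustrative note: the abstract above observes that validation tests based on Bernoulli trials can require massive validation samples. The sketch below uses the classical one-sided Hoeffding inequality to show the scale involved. This is a textbook bound, not the a posteriori bound proposed in the paper, and the function names and parameters are hypothetical.

```python
import math


def hoeffding_violation_bound(num_violations, num_samples, beta):
    """Upper bound on the true violation probability that holds with
    confidence at least 1 - beta, given i.i.d. validation samples."""
    if num_samples <= 0 or not 0.0 < beta < 1.0:
        raise ValueError("need num_samples > 0 and 0 < beta < 1")
    empirical_rate = num_violations / num_samples
    margin = math.sqrt(math.log(1.0 / beta) / (2.0 * num_samples))
    return min(1.0, empirical_rate + margin)


def samples_for_margin(margin, beta):
    """Smallest sample count whose Hoeffding margin is at most `margin`."""
    return math.ceil(math.log(1.0 / beta) / (2.0 * margin ** 2))


# Certifying a 1% margin at confidence 1 - 1e-3 with this bound alone.
required = samples_for_margin(0.01, 1e-3)
bound_with_no_violations = hoeffding_violation_bound(0, required, 1e-3)
```

Under these assumptions, tens of thousands of validation samples are needed even when no violation is observed, which is the kind of cost that motivates sharper bounds using additional information.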
-
Robust Model Predictive Control of Irrigation Systems with Active Uncertainty Learning and Data Analytics
Authors:
Chao Shang,
Wei-Han Chen,
Abraham Duncan Stroock,
Fengqi You
Abstract:
We develop a novel data-driven robust model predictive control (DDRMPC) approach for automatic control of irrigation systems. The fundamental idea is to integrate both mechanistic models, which describe dynamics in soil moisture variations, and data-driven models, which characterize uncertainty in forecast errors of evapotranspiration and precipitation, into a holistic systems control framework. To better capture the support of the uncertainty distribution, we take a new learning-based approach by constructing uncertainty sets from historical data. For evapotranspiration forecast error, the support vector clustering-based uncertainty set is adopted, which can be conveniently built from historical data. As for precipitation forecast errors, we analyze the dependence of their distribution on forecast values, and further design a tailored uncertainty set based on the properties of this type of uncertainty. In this way, the overall uncertainty distribution can be elaborately described, which finally contributes to rational and efficient control decisions. To assure the quality of data-driven uncertainty sets, a training-calibration scheme is used to provide theoretical performance guarantees. A generalized affine decision rule is adopted to obtain tractable approximations of optimal control problems, thereby ensuring the practicability of DDRMPC. Case studies using real data show that DDRMPC can reliably maintain soil moisture above the safety level and avoid crop devastation. The proposed DDRMPC approach leads to a 40% reduction of total water consumption compared to the fine-tuned open-loop control strategy. In comparison with the carefully tuned rule-based control and certainty equivalent model predictive control, the proposed DDRMPC approach can significantly reduce the total water consumption and improve the control performance.
Submitted 23 May, 2019; v1 submitted 13 October, 2018;
originally announced October 2018.
-
A Transformation-Proximal Bundle Algorithm for Multistage Adaptive Robust Optimization and Application to Constrained Robust Optimal Control
Authors:
Chao Ning,
Fengqi You
Abstract:
This paper presents a novel transformation-proximal bundle algorithm for multistage adaptive robust optimization problems. By partitioning recourse decisions into state and control decisions, the proposed algorithm applies affine control policy only to state decisions and allows control decisions to be fully adaptive, thus transforming the original problem into an equivalent two-stage Adaptive Robust Optimization (ARO) problem. Importantly, this multi-to-two transformation is general enough to be employed with any two-stage ARO solution algorithms, thus opening a new avenue for a variety of multistage ARO algorithms. The proximal bundle method is developed for the resulting two-stage problem along with convergence analysis. In an inventory control application, the affine disturbance-feedback control policy suffers from a severe suboptimality with an average gap of 34.88%, while the proposed algorithm generates an average gap of merely 1.68%.
Submitted 29 December, 2019; v1 submitted 13 October, 2018;
originally announced October 2018.
-
A data-driven robust optimization approach to scenario-based stochastic model predictive control
Authors:
Chao Shang,
Fengqi You
Abstract:
Stochastic model predictive control (SMPC) has been a promising solution to complex control problems under uncertain disturbances. However, traditional SMPC approaches either require exact knowledge of probabilistic distributions, or rely on massive scenarios that are generated to represent uncertainties. In this paper, a novel scenario-based SMPC approach is proposed by actively learning a data-driven uncertainty set from available data with machine learning techniques. A systematic procedure is then proposed to further calibrate the uncertainty set, which gives an appropriate probabilistic guarantee. The resulting data-driven uncertainty set is more compact than traditional norm-based sets, and can help reduce the conservatism of control actions. Meanwhile, the proposed method requires fewer data samples than traditional scenario-based SMPC approaches, thereby enhancing the practicability of SMPC. Finally, the optimal control problem is cast as a single-stage robust optimization problem, which can be solved efficiently by deriving the robust counterpart problem. Feasibility and stability issues are also discussed in detail. The efficacy of the proposed approach is demonstrated through a two-mass-spring system and a building energy control problem under uncertain disturbances.
Submitted 14 January, 2019; v1 submitted 13 July, 2018;
originally announced July 2018.
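Illustrative note: the abstract above mentions solving a robust optimization problem by deriving its robust counterpart. The sketch below shows that idea for the simplest traditional case, a single linear constraint under box (infinity-norm) uncertainty, where the worst case has a closed form. The box set and all names are assumptions made for illustration; the paper's data-driven uncertainty set is different and is not reproduced here.

```python
import itertools

import numpy as np


def worst_case_lhs(x, a_nominal, a_radius):
    """Closed-form maximum of a @ x over the box |a - a_nominal| <= a_radius.
    The robust counterpart of `a @ x <= b for all a in the box` is
    `worst_case_lhs(x, a_nominal, a_radius) <= b`."""
    x = np.asarray(x, dtype=float)
    return float(np.dot(a_nominal, x) + np.dot(a_radius, np.abs(x)))


def brute_force_worst_case(x, a_nominal, a_radius):
    """Check by enumerating box vertices; practical only in low dimension."""
    x = np.asarray(x, dtype=float)
    a_nominal = np.asarray(a_nominal, dtype=float)
    a_radius = np.asarray(a_radius, dtype=float)
    best = -np.inf
    for signs in itertools.product((-1.0, 1.0), repeat=len(x)):
        a = a_nominal + np.asarray(signs) * a_radius
        best = max(best, float(np.dot(a, x)))
    return best


# Hypothetical decision vector and uncertainty description.
x = [1.0, -2.0, 0.5]
a_nominal = [2.0, 1.0, -1.0]
a_radius = [0.1, 0.2, 0.3]

closed_form = worst_case_lhs(x, a_nominal, a_radius)
enumerated = brute_force_worst_case(x, a_nominal, a_radius)
```

The closed form replaces infinitely many constraints with one deterministic constraint. A more compact uncertainty set shrinks the worst-case term, which corresponds to the reduced conservatism the abstract refers to.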
-
Data-Driven Stochastic Robust Optimization: A General Computational Framework and Algorithm for Optimization under Uncertainty in the Big Data Era
Authors:
Chao Ning,
Fengqi You
Abstract:
A novel data-driven stochastic robust optimization (DDSRO) framework is proposed for optimization under uncertainty leveraging labeled multi-class uncertainty data. Uncertainty data in large datasets are often collected from various conditions, which are encoded by class labels. Machine learning methods including Dirichlet process mixture model and maximum likelihood estimation are employed for uncertainty modeling. A DDSRO framework is further proposed based on the data-driven uncertainty model through a bi-level optimization structure. The outer optimization problem follows a two-stage stochastic programming approach to optimize the expected objective across different data classes; adaptive robust optimization is nested as the inner problem to ensure the robustness of the solution while maintaining computational tractability. A decomposition-based algorithm is further developed to solve the resulting multi-level optimization problem efficiently. Case studies on process network design and planning are presented to demonstrate the applicability of the proposed framework and algorithm.
Submitted 29 December, 2017; v1 submitted 28 July, 2017;
originally announced July 2017.