-
Multi-service collaboration and composition of cloud manufacturing customized production based on problem decomposition
Authors:
Hao Yue,
Yingtao Wu,
Min Wang,
Hesuan Hu,
Weimin Wu,
Jihui Zhang
Abstract:
A cloud manufacturing system is service-oriented and knowledge-based, and it can provide solutions for large-scale customized production. Service resource allocation is the primary factor that constrains production time and cost in cloud manufacturing customized production (CMCP). To improve efficiency and reduce cost in CMCP, we propose a new framework that considers collaboration among services with the same functionality. A mathematical evaluation formulation for the service composition and service usage scheme is constructed with the following critical indexes: completion time, cost, and number of selected services. Subsequently, a problem-decomposition-based genetic algorithm is designed to obtain the optimal service compositions with service usage schemes. A smart clothing customization case illustrates the effectiveness and efficiency of the proposed method. Finally, the results of simulation experiments and comparisons show that the solutions obtained by our method have the minimum time, a lower cost, and fewer selected services.
Submitted 27 June, 2024;
originally announced June 2024.
-
MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection
Authors:
Da Mu,
Zhicheng Zhang,
Haobo Yue
Abstract:
Sound Event Localization and Detection (SELD) involves detecting and localizing sound events using multichannel sound recordings. The previously proposed Event-Independent Network V2 (EINV2) has achieved outstanding performance on SELD. However, it still faces challenges in effectively extracting features across spectral, spatial, and temporal domains. This paper proposes a three-stage network structure, the Multi-scale Feature Fusion (MFF) module, to fully extract multi-scale features across spectral, spatial, and temporal domains. The MFF module uses a parallel subnetwork architecture to generate multi-scale spectral and spatial features. The TF-Convolution Module is employed to provide multi-scale temporal features. We incorporate MFF into EINV2 and term the proposed method MFF-EINV2. Experimental results on the 2022 and 2023 DCASE challenge Task 3 datasets show the effectiveness of MFF-EINV2, which achieves state-of-the-art (SOTA) performance compared to published methods.
Submitted 15 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection
Authors:
Haobo Yue,
Zhicheng Zhang,
Da Mu,
Yonghao Dang,
Jianqin Yin,
Jin Tang
Abstract:
Recently, 2D convolution has been found ill-suited to sound event detection (SED). It enforces translation equivariance on sound events along the frequency axis, which is not a shift-invariant dimension. To address this issue, dynamic convolution is used to model the frequency dependency of sound events. In this paper, we propose the first full-dynamic method, named full-frequency dynamic convolution (FFDConv). FFDConv generates frequency kernels for every frequency band, so frequency-dependent modeling is built directly into its structure. It physically furnishes 2D convolution with the capability of frequency-dependent modeling. FFDConv not only outperforms the baseline by 6.6% on the DESED real validation dataset in terms of PSDS1, but also outperforms the other full-dynamic methods. In addition, by visualizing features of sound events, we observe that FFDConv can effectively extract coherent features in specific frequency bands, consistent with the vocal continuity of sound events. This shows that FFDConv has strong frequency-dependent perception ability.
Submitted 10 January, 2024;
originally announced January 2024.
-
Application of BERT in Wind Power Forecasting-Teletraan's Solution in Baidu KDD Cup 2022
Authors:
Longxing Tan,
Hongying Yue
Abstract:
Wind energy has drawn increasing attention owing to its important role in carbon neutrality and sustainable development. When wind power is integrated into the power grid, precise forecasting is necessary for the sustainability and security of the system. However, the unpredictable nature of wind and the need for long-sequence prediction make forecasting especially challenging. In this technical report, we introduce the BERT model applied to the Baidu KDD Cup 2022; daily fluctuation is added by post-processing to bring the predicted results in line with daily periodicity. Our solution achieves 3rd place out of 2490 teams. The code is released at https://github.com/LongxingTan/KDD2022-Baidu
Submitted 18 July, 2023;
originally announced July 2023.
-
Efficient HDR Reconstruction from Real-World Raw Images
Authors:
Qirui Yang,
Yihao Liu,
Qihua Chen,
Huanjing Yue,
Kun Li,
Jingyu Yang
Abstract:
The widespread usage of high-definition screens on edge devices stimulates a strong demand for efficient high dynamic range (HDR) algorithms. However, many existing HDR methods either deliver unsatisfactory results or consume too much computational and memory resources, hindering their application to high-resolution images (usually with more than 12 megapixels) in practice. In addition, existing HDR dataset collection methods are often labor-intensive. In this work, we take a new perspective: we identify an opportunity to reconstruct HDR directly from raw images and investigate novel neural network structures that benefit deployment on mobile devices. Our key insights are threefold: (1) we develop a lightweight, efficient HDR model, RepUNet, using the structural re-parameterization technique to achieve fast and robust HDR; (2) we design a new computational raw HDR data formation pipeline and construct a real-world raw HDR dataset, RealRaw-HDR; (3) we propose a plug-and-play motion alignment loss to mitigate motion ghosting under limited bandwidth conditions. Our model contains fewer than 830K parameters and takes less than 3 ms to process an image of 4K resolution using one RTX 3090 GPU. While being highly efficient, our model also outperforms the state-of-the-art HDR methods in terms of PSNR, SSIM, and a color difference metric.
Submitted 5 June, 2024; v1 submitted 17 June, 2023;
originally announced June 2023.
-
HDR Video Reconstruction with a Large Dynamic Dataset in Raw and sRGB Domains
Authors:
Huanjing Yue,
Yubo Peng,
Biting Yu,
Xuanwu Yin,
Zhenyu Zhou,
Jingyu Yang
Abstract:
High dynamic range (HDR) video reconstruction is attracting more and more attention due to its superior visual quality compared with low dynamic range (LDR) videos. The availability of LDR-HDR training pairs is essential for the HDR reconstruction quality. However, there are still no real LDR-HDR pairs for dynamic scenes due to the difficulty in capturing LDR-HDR frames simultaneously. In this work, we propose to utilize a staggered sensor to capture two alternate exposure images simultaneously, which are then fused into an HDR frame in both raw and sRGB domains. In this way, we build a large-scale LDR-HDR video dataset with 85 scenes, each containing 60 frames. Based on this dataset, we further propose Raw-HDRNet, which utilizes the raw LDR frames as inputs. We propose a pyramid flow-guided deformation convolution to align neighboring frames. Experimental results demonstrate that 1) the proposed dataset can improve the HDR reconstruction performance on real scenes for three benchmark networks; 2) compared with sRGB inputs, utilizing raw inputs can further improve the reconstruction quality, and our proposed Raw-HDRNet is a strong baseline for raw HDR reconstruction. Our dataset and code will be released after the acceptance of this paper.
Submitted 12 April, 2023; v1 submitted 10 April, 2023;
originally announced April 2023.
-
Unsupervised HDR Image and Video Tone Mapping via Contrastive Learning
Authors:
Cong Cao,
Huanjing Yue,
Xin Liu,
Jingyu Yang
Abstract:
Capturing high dynamic range (HDR) images (videos) is attractive because it can reveal the details in both dark and bright regions. Since mainstream screens only support low dynamic range (LDR) content, a tone mapping algorithm is required to compress the dynamic range of HDR images (videos). Although image tone mapping has been widely explored, video tone mapping is lagging behind, especially for deep-learning-based methods, due to the lack of HDR-LDR video pairs. In this work, we propose a unified framework (IVTMNet) for unsupervised image and video tone mapping. To improve unsupervised training, we propose a domain- and instance-based contrastive learning loss. Instead of using a universal feature extractor, such as VGG, to extract the features for similarity measurement, we propose a novel latent code, which is an aggregation of the brightness and contrast of extracted features, to measure the similarity of different pairs. In total, we construct two negative pairs and three positive pairs to constrain the latent codes of tone-mapped results. For the network structure, we propose a spatial-feature-enhanced (SFE) module to enable information exchange and transformation of nonlocal regions. For video tone mapping, we propose a temporal-feature-replaced (TFR) module to efficiently utilize the temporal correlation and improve the temporal consistency of video tone-mapped results. We construct a large-scale unpaired HDR-LDR video dataset to facilitate the unsupervised training process for video tone mapping. Experimental results demonstrate that our method outperforms state-of-the-art image and video tone mapping methods. Our code and dataset are available at https://github.com/cao-cong/UnCLTMO.
Submitted 26 June, 2023; v1 submitted 13 March, 2023;
originally announced March 2023.
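To make concrete what "compress the dynamic range" means in the abstract above, here is a minimal sketch of a classic global tone mapping operator (the Reinhard operator with log-average scaling). It is not IVTMNet and not the authors' method; the function name `reinhard_tone_map` and the parameters `key` and `eps` are choices made for this illustration.

```python
import math


def reinhard_tone_map(luminance, key=0.18, eps=1e-6):
    """Map non-negative HDR luminance values into [0, 1).

    luminance: non-empty 2D list of non-negative floats.
    key: target "middle grey" used to scale the scene (illustrative default).
    eps: small offset that keeps log() defined for zero-valued pixels.
    """
    flat = [v for row in luminance for v in row]
    # Log-average luminance summarizes the overall scene brightness.
    log_avg = math.exp(sum(math.log(eps + v) for v in flat) / len(flat))
    scale = key / log_avg
    # L / (1 + L) compresses large values smoothly while preserving order.
    return [[(scale * v) / (1.0 + scale * v) for v in row] for row in luminance]


hdr = [[0.01, 1.0], [100.0, 10000.0]]  # six orders of magnitude
ldr = reinhard_tone_map(hdr)
```

The output keeps the brightness ordering of the input but fits in a displayable range. A learned method such as the one in the abstract replaces this fixed curve with a network.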
-
Real-RawVSR: Real-World Raw Video Super-Resolution with a Benchmark Dataset
Authors:
Huanjing Yue,
Zhiming Zhang,
Jingyu Yang
Abstract:
In recent years, real image super-resolution (SR) has achieved promising results due to the development of SR datasets and corresponding real SR methods. In contrast, the field of real video SR is lagging behind, especially for real raw videos. Considering the superiority of raw image SR over sRGB image SR, we construct a real-world raw video SR (Real-RawVSR) dataset and propose a corresponding SR method. We utilize two DSLR cameras and a beam-splitter to simultaneously capture low-resolution (LR) and high-resolution (HR) raw videos with 2x, 3x, and 4x magnifications. There are 450 video pairs in our dataset, with scenes varying from indoor to outdoor, and motions including camera and object movements. To our knowledge, this is the first real-world raw VSR dataset. Since the raw video is characterized by the Bayer pattern, we propose a two-branch network, which deals with both the packed RGGB sequence and the original Bayer pattern sequence, and the two branches are complementary to each other. After going through the proposed co-alignment, interaction, fusion, and reconstruction modules, we generate the corresponding HR sRGB sequence. Experimental results demonstrate that the proposed method outperforms benchmark real and synthetic video SR methods with either raw or sRGB inputs. Our code and dataset are available at https://github.com/zmzhang1998/Real-RawVSR.
Submitted 26 September, 2022;
originally announced September 2022.
-
Improved mathematical models of structured-light modulation analysis technique for contaminant and defect detection
Authors:
Yiyang Huang,
Huimin Yue,
Yuyao Fang,
Yiping Song,
Yong Liu
Abstract:
Surface quality inspection of optical components is critical in the optical and electronic industries. Structured-Light Modulation Analysis Technique (SMAT) is a novel method recently proposed for contaminant and defect detection on specular surfaces and transparent objects, and this approach was verified to be effective in eliminating ambient light. The mechanisms and mathematical models of SMAT were analyzed and established based on the theory of photometry and the optical characteristics of contaminants and defects. However, some phenomena observed in the actual detection process still cannot be well explained. To better analyze the phenomena in practical circumstances, improved mathematical models of SMAT are constructed in this paper based on the surface topography of contaminants and defects. These mathematical models can be used as tools for analyzing various contaminants and defects in different systems, and they provide effective guidance for subsequent work. Simulations and experiments on the modulation and the luminous flux of fringe patterns have been implemented to verify the validity of these mathematical models. In addition, by using fringe patterns with mutually perpendicular sinusoidal directions, the two obtained modulation images can be merged to solve the incomplete information acquisition issue caused by the differentiated response of modulation.
Submitted 7 May, 2020;
originally announced May 2020.
-
Supervised Raw Video Denoising with a Benchmark Dataset on Dynamic Scenes
Authors:
Huanjing Yue,
Cong Cao,
Lei Liao,
Ronghe Chu,
Jingyu Yang
Abstract:
In recent years, the supervised learning strategy for real noisy image denoising has been emerging and has achieved promising results. In contrast, realistic noise removal for raw noisy videos is rarely studied due to the lack of noisy-clean pairs for dynamic scenes. Clean video frames for dynamic scenes cannot be captured with a long-exposure shutter or by averaging multiple shots, as was done for static images. In this paper, we solve this problem by creating motions for controllable objects, such as toys, and capturing each static moment multiple times to generate clean video frames. In this way, we construct a dataset with 55 groups of noisy-clean videos with ISO values ranging from 1600 to 25600. To our knowledge, this is the first dynamic video dataset with noisy-clean pairs. Correspondingly, we propose a raw video denoising network (RViDeNet) by exploring the temporal, spatial, and channel correlations of video frames. Since the raw video has Bayer patterns, we pack it into four sub-sequences, i.e., RGBG sequences, which are denoised by the proposed RViDeNet separately and finally fused into a clean video. In addition, our network outputs not only a raw denoising result but also the sRGB result by going through an image signal processing (ISP) module, which enables users to generate the sRGB result with their favourite ISPs. Experimental results demonstrate that our method outperforms state-of-the-art video and raw image denoising algorithms on both indoor and outdoor videos.
Submitted 31 March, 2020;
originally announced March 2020.
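The packing step described in the abstract above (splitting a Bayer-pattern raw frame into four colour sub-sequences) can be sketched as follows. This is a minimal illustration, not RViDeNet's code: the RGGB layout with R at the top-left pixel, the plane names, and the functions `pack_bayer` and `unpack_bayer` are all assumptions made for the example.

```python
def pack_bayer(frame):
    """Split an H x W Bayer mosaic into four (H/2) x (W/2) colour planes.

    Assumes an RGGB layout: R at (0, 0), G at (0, 1) and (1, 0), B at (1, 1).
    frame: 2D list of pixel values with even height and width.
    """
    h, w = len(frame), len(frame[0])
    if h % 2 or w % 2:
        raise ValueError("frame height and width must be even")
    planes = {"R": [], "G1": [], "B": [], "G2": []}
    for y in range(0, h, 2):
        planes["R"].append([frame[y][x] for x in range(0, w, 2)])
        planes["G1"].append([frame[y][x + 1] for x in range(0, w, 2)])
        planes["G2"].append([frame[y + 1][x] for x in range(0, w, 2)])
        planes["B"].append([frame[y + 1][x + 1] for x in range(0, w, 2)])
    return planes


def unpack_bayer(planes):
    """Inverse of pack_bayer: interleave the four planes back into a mosaic."""
    hh, hw = len(planes["R"]), len(planes["R"][0])
    frame = [[0] * (2 * hw) for _ in range(2 * hh)]
    for y in range(hh):
        for x in range(hw):
            frame[2 * y][2 * x] = planes["R"][y][x]
            frame[2 * y][2 * x + 1] = planes["G1"][y][x]
            frame[2 * y + 1][2 * x] = planes["G2"][y][x]
            frame[2 * y + 1][2 * x + 1] = planes["B"][y][x]
    return frame


mosaic = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
planes = pack_bayer(mosaic)
restored = unpack_bayer(planes)  # identical to mosaic
```

Applying `pack_bayer` to every frame of a video yields four half-resolution sequences, each containing a single colour, which can then be processed separately and fused again with `unpack_bayer`.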
-
Payload-agnostic Decoupling and Hybrid Vibration Isolation Control for a Maglev Platform with Redundant Actuation
Authors:
Zhaopei Gong,
Liang Ding,
Shaozhen Li,
Honghao Yue,
Haibo Gao,
Zongquan Deng
Abstract:
Payload-specific vibration control may be suitable for a particular task but lacks the generality and transferability required to adapt to various payloads. Self-decoupling and robust vibration control are the crucial problems in achieving payload-agnostic vibration control. However, these problems remain unsolved.
In this article, we present a maglev vibration isolation platform (MVIP), which aims to attenuate vibration in payload-agnostic tasks under a dynamic environment. Since efforts to suppress disturbance inevitably encounter coupling problems, we analyze their causes and propose unique and effective solutions.
To achieve payload-agnostic vibration control, we propose a new control strategy, which is the main contribution of this article. It consists of a self-construct radial basis function neural network inversion (SRBFNNI) decoupling scheme and hybrid adaptive feed-forward internal model control (HAFIMC). The former enables the MVIP to create a self-inverse model with little prior knowledge and achieve self-decoupling. For the unique structure of the MVIP, the vibration control problem is stated and addressed by the proposed HAFIMC, which uses the adaptive part to deal with periodic disturbance and the internal model part to deal with stability.
Submitted 31 March, 2020; v1 submitted 26 March, 2020;
originally announced March 2020.
-
System Integration and Control Design of a Maglev Platform for Space Vibration Isolation
Authors:
Zhaopei Gong,
Liang Ding,
Honghao Yue,
Haibo Gao,
Rongqiang Liu,
Zongquan Deng,
Yifan Lu
Abstract:
Micro-vibration has been a dominant factor impairing the performance of scientific experiments that are expected to be deployed in a micro-gravity environment such as Spacelab. It has a serious impact on scientific experiments requiring a quasi-static environment. Therefore, we proposed a maglev vibration isolation platform (MVIP) operating in six degrees of freedom (DOF) to fulfill the environmental requirements. In view of the non-contact and large-stroke requirements for micro-vibration isolation, an optimization method was utilized to design the actuator. Mathematical models of the actuator's pronounced nonlinearity were established so that its output can be compensated according to the floater's varying position and the system's performance requirements can be met. Furthermore, to adapt to an energy-limited environment such as Spacelab, an optimum allocation scheme was put forward. Taking the actuator's nonlinearity into account, accuracy and minimum energy consumption can be obtained simultaneously. For operation in six DOF, methods for nonlinear compensation and system decoupling were discussed, and the necessary controller was also presented. Simulation and experiments validate the system's performance. With a movement range of 10x10x8 mm and rotations of 200 mrad, a decay ratio of -40 dB/Dec between 1-10 Hz was obtained under closed-loop control.
Submitted 25 March, 2020; v1 submitted 27 November, 2019;
originally announced November 2019.
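For readers unfamiliar with the "-40 dB/Dec" figure quoted in the abstract above, the short sketch below converts a roll-off slope into an amplitude ratio. It models an idealized straight-line roll-off only; the corner frequency of 1 Hz in the usage lines and the function names are assumptions for illustration, not values or code from the paper.

```python
import math


def attenuation_db(freq_hz, corner_hz, slope_db_per_decade=-40.0):
    """Gain in dB of an idealized straight-line roll-off above corner_hz."""
    if freq_hz <= corner_hz:
        return 0.0  # flat below the corner in this simplified model
    return slope_db_per_decade * math.log10(freq_hz / corner_hz)


def db_to_amplitude_ratio(db):
    """Convert a gain in dB to an output/input amplitude ratio."""
    return 10.0 ** (db / 20.0)


gain_at_10hz = attenuation_db(10.0, 1.0)      # one decade above the corner
ratio = db_to_amplitude_ratio(gain_at_10hz)   # fraction of vibration transmitted
```

Under these assumptions, a slope of -40 dB per decade means that vibration amplitude transmitted at 10 Hz is one hundredth of that at 1 Hz.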