-
Improving Both Domain Robustness and Domain Adaptability in Machine Translation
Authors:
Wen Lai,
Jindřich Libovický,
Alexander Fraser
Abstract:
We consider two problems of NMT domain adaptation using meta-learning. First, we want to reach domain robustness, i.e., we want to reach high quality on both domains seen in the training data and unseen domains. Second, we want our systems to be adaptive, i.e., making it possible to finetune systems with just hundreds of in-domain parallel sentences. We study the domain adaptability of meta-learning when improving the domain robustness of the model. In this paper, we propose a novel approach, RMLNMT (Robust Meta-Learning Framework for Neural Machine Translation Domain Adaptation), which improves the robustness of existing meta-learning models. More specifically, we show how to use a domain classifier in curriculum learning and we integrate the word-level domain mixing model into the meta-learning framework with a balanced sampling strategy. Experiments on English$\rightarrow$German and English$\rightarrow$Chinese translation show that RMLNMT improves in terms of both domain robustness and domain adaptability in seen and unseen domains. Our source code is available at https://github.com/lavine-lmu/RMLNMT.
Submitted 4 October, 2022; v1 submitted 15 December, 2021;
originally announced December 2021.
-
Beyond the Longest Letter-duplicated Subsequence Problem
Authors:
Wenfeng Lai,
Adiesha Liyanage,
Binhai Zhu,
Peng Zou
Abstract:
Given a sequence $S$ of length $n$, a letter-duplicated subsequence is a subsequence of $S$ in the form of $x_1^{d_1}x_2^{d_2}\cdots x_k^{d_k}$ with $x_i\in\Sigma$, $x_j\neq x_{j+1}$ and $d_i\geq 2$ for all $i$ in $[k]$ and $j$ in $[k-1]$. A linear time algorithm for computing the longest letter-duplicated subsequence (LLDS) of $S$ can be easily obtained. In this paper, we focus on two variants of this problem. We first consider the constrained version when $\Sigma$ is unbounded, each letter appears in $S$ at least 6 times and all the letters in $\Sigma$ must appear in the solution. We show that the problem is NP-hard (a further twist indicates that the problem does not admit any polynomial time approximation). The reduction is from possibly the simplest version of SAT that is NP-complete, $(\leq 2,1,\leq 3)$-SAT, where each variable appears at most twice positively and exactly once negatively, and each clause contains at most three literals and some clauses must contain exactly two literals. (We hope that this technique will serve as a general tool to help us prove the NP-hardness of some trickier sequence problems involving only one sequence -- much harder than with at least two input sequences, which we apply successfully at the end of the paper on some extra variations of the LLDS problem.) We then show that when each letter appears in $S$ at most 3 times, the problem admits a factor $1.5-O(\frac{1}{n})$ approximation. Finally, we consider the weighted version, where the weight of a block $x_i^{d_i} (d_i\geq 2)$ could be any positive function which might not grow with $d_i$. We give a non-trivial $O(n^2)$ time dynamic programming algorithm for this version, i.e., computing an LD-subsequence of $S$ whose weight is maximized.
Submitted 4 January, 2022; v1 submitted 10 December, 2021;
originally announced December 2021.
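For readers unfamiliar with the LLDS problem defined in the abstract above, the following is a minimal sketch of a straightforward quadratic-time dynamic program for the unconstrained, unweighted case. It is not the linear-time algorithm the abstract mentions, nor the authors' code. The function name `llds_length` is hypothetical. The sketch relies on the observation that two adjacent blocks of the same letter can be merged into one longer block, so the condition $x_j\neq x_{j+1}$ does not change the optimal length.

```python
def llds_length(s: str) -> int:
    """Length of a longest letter-duplicated subsequence of s (O(n^2) sketch)."""
    n = len(s)
    # best[i] = optimal length using only the prefix s[0:i]
    best = [0] * (n + 1)
    for i in range(1, n + 1):
        # Option 1: position i-1 is not used at all.
        best[i] = best[i - 1]
        c = s[i - 1]
        count = 1  # occurrences of c seen so far in s[j:i]
        # Option 2: the last block ends at i-1 and starts at some j with s[j] == c.
        for j in range(i - 2, -1, -1):
            if s[j] == c:
                count += 1  # block takes every c in s[j:i], so count >= 2
                best[i] = max(best[i], best[j] + count)
    return best[n]


if __name__ == "__main__":
    for text in ["aabb", "abab", "aabaa", "a"]:
        print(text, llds_length(text))
```

Each candidate block uses every occurrence of its letter between its first and last position, which is optimal once those two endpoints are fixed; the remaining blocks are drawn from the prefix before the block's start.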
-
Correcting Face Distortion in Wide-Angle Videos
Authors:
Wei-Sheng Lai,
YiChang Shih,
Chia-Kai Liang,
Ming-Hsuan Yang
Abstract:
Video blogs and selfies are popular social media formats, which are often captured by wide-angle cameras to show human subjects and expanded background. Unfortunately, due to perspective projection, faces near corners and edges exhibit apparent distortions that stretch and squish the facial features, resulting in poor video quality. In this work, we present a video warping algorithm to correct these distortions. Our key idea is to apply stereographic projection locally on the facial regions. We formulate a mesh warp problem using spatial-temporal energy minimization and minimize background deformation using a line-preservation term to maintain the straight edges in the background. To address temporal coherency, we constrain the temporal smoothness on the warping meshes and facial trajectories through the latent variables. For performance evaluation, we develop a wide-angle video dataset with a wide range of focal lengths. The user study shows that 83.9% of users prefer our algorithm over other alternatives based on perspective projection.
Submitted 18 November, 2021;
originally announced November 2021.
-
Magnetic Field Structure and Faraday Rotation of the Plerionic Supernova Remnant G21.5$-$0.9
Authors:
Paul C. W. Lai,
C. -Y. Ng,
Niccolo' Bucciantini
Abstract:
We present a polarimetric study of the pulsar wind nebula (PWN) in supernova remnant G21.5$-$0.9 using archival Very Large Array (VLA) data. The rotation measure (RM) map of the PWN shows a symmetric pattern that aligns with the presumed pulsar spin axis direction, implying a significant contribution of RM from the nebula. We suggest that the spatial variation of the internal RM is mostly caused by non-uniform distribution of electrons originated from the supernova ejecta. Our high-resolution radio polarization map reveals an overall radial $B$-field. We construct a simple model with an overall radial $B$-field and turbulence in small scale. The model can reproduce many of the observed features of the PWN, including the polarization pattern and polarized fraction. The results also reject a large-scale toroidal $B$-field which implies that the toroidal field observed in the inner PWN cannot propagate to the entire nebula.
Submitted 22 April, 2022; v1 submitted 9 September, 2021;
originally announced September 2021.
-
Toward Real-World Super-Resolution via Adaptive Downsampling Models
Authors:
Sanghyun Son,
Jaeha Kim,
Wei-Sheng Lai,
Ming-Husan Yang,
Kyoung Mu Lee
Abstract:
Most image super-resolution (SR) methods are developed on synthetic low-resolution (LR) and high-resolution (HR) image pairs that are constructed by a predetermined operation, e.g., bicubic downsampling. As existing methods typically learn an inverse mapping of the specific function, they produce blurry results when applied to real-world images whose exact formulation is different and unknown. Therefore, several methods attempt to synthesize much more diverse LR samples or learn a realistic downsampling model. However, due to restrictive assumptions on the downsampling process, they are still biased and less generalizable. This study proposes a novel method to simulate an unknown downsampling process without imposing restrictive prior knowledge. We propose a generalizable low-frequency loss (LFL) in the adversarial training framework to imitate the distribution of target LR images without using any paired examples. Furthermore, we design an adaptive data loss (ADL) for the downsampler, which can be adaptively learned and updated from the data during the training loops. Extensive experiments validate that our downsampling model can facilitate existing SR methods to perform more accurate reconstructions on various synthetic and real-world examples than the conventional approaches.
Submitted 8 September, 2021;
originally announced September 2021.
-
Inhomogeneous light photovoltaic effect in neighboring quantum dots
Authors:
Wenxi Lai
Abstract:
The photovoltaic effect of double quantum dots under nonuniform light field intensity has been studied theoretically. Compared with the traditional p-n type photovoltaic effect, the inhomogeneous light field provides an asymmetric potential that polarizes the electron number distribution in the neighboring quantum dots and thereby gives rise to a net current. The current density and efficiency of such solar cells are estimated to be comparable to those of traditional p-n type material based solar cells. The motion of electrons is described using a quantum master equation around room temperature. The inhomogeneous light photovoltaic effect has potential applications in the development of more economical solar cells.
Submitted 20 May, 2022; v1 submitted 7 August, 2021;
originally announced August 2021.
-
Photovoltaic transistor of atoms due to spin-orbit coupling in three optical traps
Authors:
Haihu Cui,
Mingzhu Zhang,
Wenxi Lai
Abstract:
In this paper, the spin-orbit coupling induced photovoltaic effect of cold atoms is studied in a three-trap system, which is a two-dimensional extension of a two-trap system reported previously. It is proposed here that the atom coherence length is one of the important influences on the resistance of this photovoltaic battery. Current properties of the system for different geometrical structures of the trapping potentials are discussed. Numerical results show that extending the number of traps could directly increase the current. A quantum master equation at finite temperature is used to treat this open system. This work may give a theoretical basis for further development of the photovoltaic effect of neutral atoms.
Submitted 27 July, 2021;
originally announced July 2021.
-
Asymmetric Field Photovoltaic Effect of Neutral Atoms
Authors:
Wenxi Lai,
Jinyan Niu,
Yu-Quan Ma,
W. M. Liu
Abstract:
The photovoltaic effect of neutral atoms using inhomogeneous light in a double-trap open system is studied theoretically. Using an asymmetric external driving field to replace the original asymmetric chemical potential of atoms, we create polarization of the atom population in the double-trap system. The polarization of the atom number distribution induces a net current of atoms, which serve as the collected carriers in the cell. The cell can work even under partially coherent light. The whole configuration is described by a quantum master equation considering weak tunneling between the system and its reservoirs at finite temperature. The model of neutral atoms could be extended to more general quantum particles in principle.
Submitted 20 July, 2021; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Partonic collinear structure by quantum computing
Authors:
Tianyin Li,
Xingyu Guo,
Wai Kin Lai,
Xiaohui Liu,
Enke Wang,
Hongxi Xing,
Dan-Bo Zhang,
Shi-Liang Zhu
Abstract:
We present a systematic quantum algorithm, which integrates both the hadronic state preparation and the evaluation of real-time light-front correlators, to study parton distribution functions (PDFs). As a proof of concept, we demonstrate the first direct simulation of the PDFs in the 1+1 dimensional Nambu-Jona-Lasinio model. We show the results obtained by exact diagonalization and by quantum computation using classical hardware. The agreement between these two distinct methods and the qualitative consistency with QCD PDFs validate the proposed quantum algorithm. Our work suggests the encouraging prospects of calculating QCD PDFs on current and near-term quantum devices. The presented quantum algorithm is expected to have many applications in high energy particle and nuclear physics.
Submitted 17 October, 2023; v1 submitted 7 June, 2021;
originally announced June 2021.
-
Stylizing 3D Scene via Implicit Representation and HyperNetwork
Authors:
Pei-Ze Chiang,
Meng-Shiun Tsai,
Hung-Yu Tseng,
Wei-sheng Lai,
Wei-Chen Chiu
Abstract:
In this work, we aim to address the 3D scene stylization problem - generating stylized images of the scene at arbitrary novel view angles. A straightforward solution is to combine existing novel view synthesis and image/video style transfer approaches, which often leads to blurry results or inconsistent appearance. Inspired by the high-quality results of the neural radiance fields (NeRF) method, we propose a joint framework to directly render novel views with the desired style. Our framework consists of two components: an implicit representation of the 3D scene with the neural radiance fields model, and a hypernetwork to transfer the style information into the scene representation. In particular, our implicit representation model disentangles the scene into the geometry and appearance branches, and the hypernetwork learns to predict the parameters of the appearance branch from the reference style image. To alleviate the training difficulties and memory burden, we propose a two-stage training procedure and a patch sub-sampling approach to optimize the style and content losses with the neural radiance fields model. After optimization, our model is able to render consistent novel views at arbitrary view angles with arbitrary style. Both quantitative evaluation and human subject study have demonstrated that the proposed method generates faithful stylization results with consistent appearance across different views.
Submitted 16 January, 2022; v1 submitted 27 May, 2021;
originally announced May 2021.
-
On Multi-Channel Huffman Codes for Asymmetric-Alphabet Channels
Authors:
Hoover H. F. Yin,
Xishi Wang,
Ka Hei Ng,
Russell W. F. Lai,
Lucien K. L. Ng,
Jack P. K. Ma
Abstract:
Zero-error single-channel source coding has been studied extensively over the past decades. Its natural multi-channel generalization is however not well investigated. While the special case with multiple symmetric-alphabet channels was studied a decade ago, codes in such setting have no advantage over single-channel codes in data compression, making them worthless in most applications. With essentially no development since the last decade, in this paper, we break the stalemate by showing that it is possible to beat single-channel source codes in terms of compression assuming asymmetric-alphabet channels. We present the multi-channel analog of several classical results in single-channel source coding, such as that a multi-channel Huffman code is an optimal tree-decodable code. We also show some evidences that finding an efficient construction of multi-channel Huffman codes may be hard. Nevertheless, we propose a suboptimal code construction whose redundancy is guaranteed to be no larger than that of an optimal single-channel source code.
Submitted 8 May, 2021;
originally announced May 2021.
-
The Variational Bayesian Inference for Network Autoregression Models
Authors:
Wei-Ting Lai,
Ray-Bing Chen,
Ying Chen,
Thorsten Koch
Abstract:
We develop a variational Bayesian (VB) approach for estimating large-scale dynamic network models in the network autoregression framework. The VB approach allows for the automatic identification of the dynamic structure of such a model and obtains a direct approximation of the posterior density. Compared to Markov Chain Monte Carlo (MCMC) based sampling approaches, the VB approach achieves enhanced computational efficiency without sacrificing estimation accuracy. In the simulation study conducted here, the proposed VB approach detects various types of proper active structures for dynamic network models. Compared to the alternative approach, the proposed method achieves similar or better accuracy, and its computational time is halved. In a real data analysis scenario of day-ahead natural gas flow prediction in the German gas transmission network with 51 nodes between October 2013 and September 2015, the VB approach delivers promising forecasting accuracy along with clearly detected structures in terms of dynamic dependence.
Submitted 18 February, 2021;
originally announced February 2021.
-
Hybrid Neural Fusion for Full-frame Video Stabilization
Authors:
Yu-Lun Liu,
Wei-Sheng Lai,
Ming-Hsuan Yang,
Yung-Yu Chuang,
Jia-Bin Huang
Abstract:
Existing video stabilization methods often generate visible distortion or require aggressive cropping of frame boundaries, resulting in a smaller field of view. In this work, we present a frame synthesis algorithm to achieve full-frame video stabilization. We first estimate dense warp fields from neighboring frames and then synthesize the stabilized frame by fusing the warped contents. Our core technical novelty lies in the learning-based hybrid-space fusion that alleviates artifacts caused by optical flow inaccuracy and fast-moving objects. We validate the effectiveness of our method on the NUS, selfie, and DeepStab video datasets. Extensive experiment results demonstrate the merits of our approach over prior video stabilization methods.
Submitted 23 August, 2021; v1 submitted 11 February, 2021;
originally announced February 2021.
-
Deep Online Fused Video Stabilization
Authors:
Zhenmei Shi,
Fuhao Shi,
Wei-Sheng Lai,
Chia-Kai Liang,
Yingyu Liang
Abstract:
We present a deep neural network (DNN) that uses both sensor data (gyroscope) and image content (optical flow) to stabilize videos through unsupervised learning. The network fuses optical flow with real/virtual camera pose histories into a joint motion representation. Next, the LSTM block infers the new virtual camera pose, and this virtual pose is used to generate a warping grid that stabilizes the frame. A novel relative motion representation as well as a multi-stage training process are presented to optimize our model without any supervision. To the best of our knowledge, this is the first DNN solution that adopts both sensor data and image for stabilization. We validate the proposed framework through ablation studies and demonstrate that the proposed method outperforms the state-of-the-art alternative solutions via quantitative evaluations and a user study.
Submitted 3 April, 2021; v1 submitted 1 February, 2021;
originally announced February 2021.
-
When does the Physarum Solver Distinguish the Shortest Path from other Paths: the Transition Point and its Applications
Authors:
Yusheng Huang,
Dong Chu,
Joel Weijia Lai,
Yong Deng,
Kang Hao Cheong
Abstract:
Physarum solver, also called the physarum polycephalum inspired algorithm (PPA), is a newly developed bio-inspired algorithm that has an inherent ability to find the shortest path in a given graph. Recent research has proposed methods to develop this algorithm further by accelerating the original PPA (OPPA)'s path-finding process. However, when does the PPA ascertain that the shortest path has been found? Is there a point after which the PPA could distinguish the shortest path from other paths? By innovatively proposing the concept of the dominant path (D-Path), the exact moment, named the transition point (T-Point), when the PPA finds the shortest path can be identified. Based on the D-Path and T-Point, a newly accelerated PPA named OPPA-D using the proposed termination criterion is developed which is superior to all other baseline algorithms according to the experiments conducted in this paper. The validity and the superiority of the proposed termination criterion is also demonstrated. Furthermore, an evaluation method is proposed to provide new insights for the comparison of different accelerated OPPAs. The breakthrough of this paper lies in using D-path and T-point to terminate the OPPA. The novel termination criterion reveals the actual performance of this OPPA. This OPPA is the fastest algorithm, outperforming some so-called accelerated OPPAs. Furthermore, we explain why some existing works inappropriately claim to be accelerated algorithms is in fact a product of inappropriate termination criterion, thus giving rise to the illusion that the method is accelerated.
Submitted 8 January, 2021;
originally announced January 2021.
-
Portrait Neural Radiance Fields from a Single Image
Authors:
Chen Gao,
Yichang Shih,
Wei-Sheng Lai,
Chia-Kai Liang,
Jia-Bin Huang
Abstract:
We present a method for estimating Neural Radiance Fields (NeRF) from a single headshot portrait. While NeRF has demonstrated high-quality view synthesis, it requires multiple images of static scenes and thus impractical for casual captures and moving subjects. In this work, we propose to pretrain the weights of a multilayer perceptron (MLP), which implicitly models the volumetric density and colors, with a meta-learning framework using a light stage portrait dataset. To improve the generalization to unseen faces, we train the MLP in the canonical coordinate space approximated by 3D face morphable models. We quantitatively evaluate the method using controlled captures and demonstrate the generalization to real portrait images, showing favorable results against state-of-the-arts.
Submitted 16 April, 2021; v1 submitted 10 December, 2020;
originally announced December 2020.
-
Real-time Localized Photorealistic Video Style Transfer
Authors:
Xide Xia,
Tianfan Xue,
Wei-sheng Lai,
Zheng Sun,
Abby Chang,
Brian Kulis,
Jiawen Chen
Abstract:
We present a novel algorithm for transferring artistic styles of semantically meaningful local regions of an image onto local regions of a target video while preserving its photorealism. Local regions may be selected either fully automatically from an image, through using video segmentation algorithms, or from casual user guidance such as scribbles. Our method, based on a deep neural network architecture inspired by recent work in photorealistic style transfer, is real-time and works on arbitrary inputs without runtime optimization once trained on a diverse dataset of artistic styles. By augmenting our video dataset with noisy semantic labels and jointly optimizing over style, content, mask, and temporal losses, our method can cope with a variety of imperfections in the input and produce temporally coherent videos without visual artifacts. We demonstrate our method on a variety of style images and target videos, including the ability to transfer different styles onto multiple objects simultaneously, and smoothly transition between styles in time.
Submitted 20 October, 2020;
originally announced October 2020.
-
Learning to See Through Obstructions with Layered Decomposition
Authors:
Yu-Lun Liu,
Wei-Sheng Lai,
Ming-Hsuan Yang,
Yung-Yu Chuang,
Jia-Bin Huang
Abstract:
We present a learning-based approach for removing unwanted obstructions, such as window reflections, fence occlusions, or adherent raindrops, from a short sequence of images captured by a moving camera. Our method leverages motion differences between the background and obstructing elements to recover both layers. Specifically, we alternate between estimating dense optical flow fields of the two layers and reconstructing each layer from the flow-warped images via a deep convolutional neural network. This learning-based layer reconstruction module facilitates accommodating potential errors in the flow estimation and brittle assumptions, such as brightness consistency. We show that the proposed approach learned from synthetically generated data performs well to real images. Experimental results on numerous challenging scenarios of reflection and fence removal demonstrate the effectiveness of the proposed method.
Submitted 25 July, 2021; v1 submitted 11 August, 2020;
originally announced August 2020.
-
Ubicomp Digital 2020 -- Handwriting classification using a convolutional recurrent network
Authors:
Wei-Cheng Lai,
Hendrik Schröter
Abstract:
The Ubicomp Digital 2020 -- Time Series Classification Challenge from STABILO is a challenge about multi-variate time series classification. The data were collected from 100 volunteer writers and contain 15 features measured with multiple sensors on a pen. In this paper, we use a neural network to classify the data into 52 classes, that is, lower and upper cases of Arabic letters. The proposed architecture of the neural network is a CNN-LSTM network. It combines a convolutional neural network (CNN) for short-term context with a long short-term memory (LSTM) layer for long-term dependencies. We reached an accuracy of 68% on our writer-exclusive test set and 64.6% on the blind challenge test set, resulting in second place.
Submitted 3 August, 2020;
originally announced August 2020.
-
A Three-limb Teleoperated Robotic System with Foot Control for Flexible Endoscopic Surgery
Authors:
Yanpei Huang,
Wenjie Lai,
Lin Cao,
Jiajun Liu,
Xiaoguo Li,
Etienne Burdet,
Soo Jay Phee
Abstract:
Flexible endoscopy requires high skills to manipulate both the endoscope and associated instruments. In most robotic flexible endoscopic systems, the endoscope and instruments are controlled separately by two operators, which may result in communication errors and inefficient operation. We present a novel teleoperation robotic endoscopic system that can be commanded by a surgeon alone. This 13 degrees-of-freedom (DoF) system integrates a foot-controlled robotic flexible endoscope and two hand-controlled robotic endoscopic instruments (a robotic grasper and a robotic cauterizing hook). A foot-controlled human-machine interface maps the natural foot gestures to the 4-DoF movements of the endoscope, and two hand-controlled interfaces map the movements of the two hands to the two instruments individually. The proposed robotic system was validated in an ex-vivo experiment carried out by six subjects, where foot control was also compared with a sequential clutch-based hand control scheme. The participants could successfully teleoperate the endoscope and the two instruments to cut the tissues at scattered target areas in a porcine stomach. Foot control yielded 43.7% faster task completion and required less mental effort than the clutch-based hand control scheme. The system introduced in this paper is intuitive for three-limb manipulation, even for operators without experience of handling the endoscope and robotic instruments. This three-limb teleoperated robotic system enables one surgeon to intuitively control three endoscopic tools which normally require two operators, leading to reduced manpower, fewer communication errors, and improved efficiency.
Submitted 12 July, 2020;
originally announced July 2020.
-
Learning to See Through Obstructions
Authors:
Yu-Lun Liu,
Wei-Sheng Lai,
Ming-Hsuan Yang,
Yung-Yu Chuang,
Jia-Bin Huang
Abstract:
We present a learning-based approach for removing unwanted obstructions, such as window reflections, fence occlusions or raindrops, from a short sequence of images captured by a moving camera. Our method leverages the motion differences between the background and the obstructing elements to recover both layers. Specifically, we alternate between estimating dense optical flow fields of the two layers and reconstructing each layer from the flow-warped images via a deep convolutional neural network. The learning-based layer reconstruction allows us to accommodate potential errors in the flow estimation and brittle assumptions such as brightness consistency. We show that training on synthetically generated data transfers well to real images. Our results on numerous challenging scenarios of reflection and fence removal demonstrate the effectiveness of the proposed method.
Submitted 2 April, 2020;
originally announced April 2020.
-
Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline
Authors:
Yu-Lun Liu,
Wei-Sheng Lai,
Yu-Sheng Chen,
Yi-Lung Kao,
Ming-Hsuan Yang,
Yung-Yu Chuang,
Jia-Bin Huang
Abstract:
Recovering a high dynamic range (HDR) image from a single low dynamic range (LDR) input image is challenging due to missing details in under-/over-exposed regions caused by quantization and saturation of camera sensors. In contrast to existing learning-based methods, our core idea is to incorporate the domain knowledge of the LDR image formation pipeline into our model. We model the HDR-to-LDR image formation pipeline as (1) dynamic range clipping, (2) non-linear mapping from a camera response function, and (3) quantization. We then propose to learn three specialized CNNs to reverse these steps. By decomposing the problem into specific sub-tasks, we impose effective physical constraints to facilitate the training of individual sub-networks. Finally, we jointly fine-tune the entire model end-to-end to reduce error accumulation. With extensive quantitative and qualitative experiments on diverse image datasets, we demonstrate that the proposed method performs favorably against state-of-the-art single-image HDR reconstruction algorithms.
Submitted 2 April, 2020;
originally announced April 2020.
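The three-step HDR-to-LDR formation pipeline named in the abstract above (clipping, camera response mapping, quantization) can be illustrated with a minimal forward-model sketch. This is not the authors' code: the function name `hdr_to_ldr`, the gamma-curve camera response function, and the 8-bit default are assumptions chosen only for illustration, and the paper's learned CNNs that reverse these steps are not reproduced here.

```python
def hdr_to_ldr(hdr_pixels, gamma=2.2, bits=8):
    """Toy forward model of LDR image formation for a list of scalar pixels.

    Assumptions (not from the paper): the camera response function is a
    simple gamma curve, and the sensor range is normalized to [0, 1].
    """
    levels = (1 << bits) - 1  # e.g. 255 for 8-bit output
    ldr = []
    for value in hdr_pixels:
        # (1) Dynamic range clipping: values outside [0, 1] saturate.
        clipped = min(max(value, 0.0), 1.0)
        # (2) Non-linear mapping with a (hypothetical) gamma-style response.
        mapped = clipped ** (1.0 / gamma)
        # (3) Quantization to integer code values.
        ldr.append(round(mapped * levels))
    return ldr


if __name__ == "__main__":
    # Over-exposed values (2.0, 4.0) collapse to the same code value,
    # which is the information an inverse model has to hallucinate.
    print(hdr_to_ldr([0.0, 0.25, 1.0, 2.0, 4.0]))
```

Each step loses information in a different way (saturation, compression, rounding), which is consistent with the abstract's motivation for reversing the steps with separate sub-networks.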
-
Gated Fusion Network for Degraded Image Super Resolution
Authors:
Xinyi Zhang,
Hang Dong,
Zhe Hu,
Wei-Sheng Lai,
Fei Wang,
Ming-Hsuan Yang
Abstract:
Single image super resolution aims to enhance image quality with respect to spatial content, which is a fundamental task in computer vision. In this work, we address the task of single frame super resolution in the presence of image degradation, e.g., blur, haze, or rain streaks. Due to the limitations of frame capturing and formation processes, image degradation is inevitable, and the artifacts would be exacerbated by super resolution methods. To address this problem, we propose a dual-branch convolutional neural network to extract base features and recovered features separately. The base features contain local and global information of the input image. On the other hand, the recovered features focus on the degraded regions and are used to remove the degradation. Those features are then fused through a recursive gate module to obtain sharp features for super resolution. By decomposing the feature extraction step into two task-independent streams, the dual-branch model can facilitate the training process by avoiding learning the mixed degradation all-in-one and thus enhance the final high-resolution prediction results. We evaluate the proposed method in three degradation scenarios. Experiments on these scenarios demonstrate that the proposed method performs more efficiently and favorably against the state-of-the-art approaches on benchmark datasets.
Submitted 4 March, 2020; v1 submitted 2 March, 2020;
originally announced March 2020.
-
Broadband Supercontinuum Generation in PCF, HNLF and ZBLAN Fiber with a Carbon-Nanotube-based Passively Mode-locked Erbium-doped Fiber Laser
Authors:
Sivasankara Rao Yemineni,
Wenn Jing Lai,
Arokiaswami Alphones,
Shum Ping Perry
Abstract:
We demonstrate broadband supercontinuum (SC) generation in photonic crystal fiber (PCF), highly nonlinear fiber (HNLF) and ZBLAN (ZrF4-BaF2-LaF3-AlF3-NaF) fiber with a passively mode-locked erbium-doped fiber laser (EDFL). The passively mode-locked EDFL incorporates a CNT-based saturable absorber and has achieved a pulse width of 620 fs with a pulse repetition rate of 18 MHz. The spectral broadening inside each fiber has been observed with respect to the variation in seed pulse power. SC spectrum bandwidths of 1050 nm, 1400 nm, and 2000 nm have been achieved using PCF, HNLF, and ZBLAN fiber, respectively.
Submitted 4 February, 2020;
originally announced February 2020.
-
Exploiting Semantics for Face Image Deblurring
Authors:
Ziyi Shen,
Wei-Sheng Lai,
Tingfa Xu,
Jan Kautz,
Ming-Hsuan Yang
Abstract:
In this paper, we propose an effective and efficient face deblurring algorithm by exploiting semantic cues via deep convolutional neural networks. As human faces are highly structured and share unified facial components (e.g., eyes and mouths), such semantic information provides a strong prior for restoration. We incorporate face semantic labels as input priors and propose an adaptive structural loss to regularize facial local structures within an end-to-end deep convolutional neural network. Specifically, we first use a coarse deblurring network to reduce the motion blur on the input face image. We then adopt a parsing network to extract the semantic features from the coarse deblurred image. Finally, the fine deblurring network utilizes the semantic information to restore a clear face image. We train the network with perceptual and adversarial losses to generate photo-realistic results. The proposed method restores sharp images with more accurate facial features and details. Quantitative and qualitative evaluations demonstrate that the proposed face deblurring algorithm performs favorably against the state-of-the-art methods in terms of restoration quality, face recognition and execution speed.
Submitted 6 April, 2020; v1 submitted 19 January, 2020;
originally announced January 2020.
-
Visual Question Answering on 360° Images
Authors:
Shih-Han Chou,
Wei-Lun Chao,
Wei-Sheng Lai,
Min Sun,
Ming-Hsuan Yang
Abstract:
In this work, we introduce VQA 360°, a novel task of visual question answering on 360° images. Unlike a normal field-of-view image, a 360° image captures the entire visual content around the optical center of a camera, demanding more sophisticated spatial understanding and reasoning. To address this problem, we collect the first VQA 360° dataset, containing around 17,000 real-world image-question-answer triplets for a variety of question types. We then study two different VQA models on VQA 360°, including one conventional model that takes an equirectangular image (with intrinsic distortion) as input and one dedicated model that first projects a 360° image onto cubemaps and subsequently aggregates the information from multiple spatial resolutions. We demonstrate that the cubemap-based model with multi-level fusion and attention diffusion performs favorably against other variants and the equirectangular-based models. Nevertheless, the gap between the humans' and machines' performance reveals the need for more advanced VQA 360° algorithms. We, therefore, expect our dataset and studies to serve as the benchmark for future development in this challenging task. Dataset, code, and pre-trained models are available online.
Submitted 10 January, 2020;
originally announced January 2020.
-
QCD spin effects in the heavy hybrid potentials and spectra
Authors:
Nora Brambilla,
Wai Kin Lai,
Jorge Segovia,
Jaume Tarrús Castellà
Abstract:
The spin-dependent operators for heavy quarkonium hybrids have been recently obtained in a nonrelativistic effective field theory approach up to next-to-leading order in the heavy-quark mass expansion. In the effective field theory for hybrids several operators not found in standard quarkonia appear, including an operator suppressed by only one power of the heavy-quark mass. We compute the matching coefficients for these operators in the short heavy-quark-antiquark distance regime, $r\ll 1/Λ_{\rm QCD}$, by matching weakly-coupled potential NRQCD to the effective field theory for hybrids. In this regime the perturbative and nonperturbative contributions to the matching coefficients factorize, and the latter can be expressed in terms of purely gluonic correlators whose form we explicitly calculate with the aid of the transformation properties of the gluon fields under discrete symmetries. We detail our previous comparison with direct lattice computations of the charmonium hybrid spectrum, from which the unknown nonperturbative contributions can be obtained, and extend it to data sets with different light-quark masses.
Submitted 14 February, 2020; v1 submitted 30 August, 2019;
originally announced August 2019.
-
Video Stitching for Linear Camera Arrays
Authors:
Wei-Sheng Lai,
Orazio Gallo,
Jinwei Gu,
Deqing Sun,
Ming-Hsuan Yang,
Jan Kautz
Abstract:
Despite the long history of image and video stitching research, existing academic and commercial solutions still produce strong artifacts. In this work, we propose a wide-baseline video stitching algorithm for linear camera arrays that is temporally stable and tolerant to strong parallax. Our key insight is that stitching can be cast as a problem of learning a smooth spatial interpolation between the input videos. To solve this problem, inspired by pushbroom cameras, we introduce a fast pushbroom interpolation layer and propose a novel pushbroom stitching network, which learns a dense flow field to smoothly align the multiple input videos for spatial interpolation. Our approach outperforms the state-of-the-art by a significant margin, as we show with a user study, and has immediate applications in many areas such as virtual reality, immersive telepresence, autonomous driving, and video surveillance.
Submitted 31 July, 2019;
originally announced July 2019.
-
Order $v^4$ corrections to Higgs boson decay into $J/ψ+ γ$
Authors:
Nora Brambilla,
Hee Sok Chung,
Wai Kin Lai,
Vladyslav Shtabovenko,
Antonio Vairo
Abstract:
The process $H \to J/ψ+ γ$, where $H$ is the Higgs particle, provides a way to probe the size and the sign of the Higgs-charm coupling. In order to improve the theoretical control of the decay rate, we compute order $v^4$ corrections to the decay rate based on the nonrelativistic QCD factorization formalism. The perturbative calculation is carried out by using automated computer codes. We also resum logarithms of the ratio of the masses of the Higgs boson and the $J/ψ$ to all orders in the strong coupling constant $α_s$ to next-to-leading logarithmic accuracy. In our numerical result for the decay rate, we improve the theoretical uncertainty, while our central value is in agreement with previous studies within errors. We also present numerical results for $H \to Υ(nS) + γ$ for $n=1,2$, and 3, which turn out to be extremely sensitive to the Higgs-bottom coupling.
Submitted 26 September, 2019; v1 submitted 15 July, 2019;
originally announced July 2019.
-
Decision Procedure for the Existence of Two-Channel Prefix-Free Codes
Authors:
Hoover H. F. Yin,
Ka Hei Ng,
Yu Ting Shing,
Russell W. F. Lai,
Xishi Wang
Abstract:
The Kraft inequality gives a necessary and sufficient condition for the existence of a single-channel prefix-free code. However, the multichannel Kraft inequality does not imply the existence of a multichannel prefix-free code in general. It is natural to ask whether there exists an efficient decision procedure for the existence of multichannel prefix-free codes. In this paper, we tackle the two-channel case of the above problem by relating it to a constrained rectangle packing problem. Although a general rectangle packing problem is NP-complete, the extra imposed constraints allow us to propose an algorithm which can solve the problem efficiently.
Submitted 27 April, 2019;
originally announced April 2019.
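For the single-channel case mentioned in the abstract above, the classical Kraft inequality states that codeword lengths $l_1, \dots, l_n$ over an alphabet of size $D$ admit a prefix-free code if and only if $\sum_i D^{-l_i} \le 1$. The sketch below checks that condition and builds a binary code by the standard canonical construction. It covers only the single-channel case; the two-channel decision procedure of the paper is not reproduced, and the function names are hypothetical.

```python
from fractions import Fraction


def kraft_sum(lengths, alphabet_size=2):
    """Exact value of sum(D ** -l) over the requested codeword lengths."""
    return sum(Fraction(1, alphabet_size ** l) for l in lengths)


def prefix_free_code_exists(lengths, alphabet_size=2):
    """Single-channel Kraft test: a prefix-free code exists iff the sum <= 1."""
    return kraft_sum(lengths, alphabet_size) <= 1


def build_binary_prefix_free_code(lengths):
    """Canonical binary construction; codewords are returned by sorted length.

    Assumes every length is a positive integer.
    """
    if not prefix_free_code_exists(lengths):
        raise ValueError("lengths violate the Kraft inequality")
    codewords = []
    code = 0
    previous = None
    for length in sorted(lengths):
        if previous is not None:
            # Move to the next free node, then descend to the new depth.
            code = (code + 1) << (length - previous)
        codewords.append(format(code, "0{}b".format(length)))
        previous = length
    return codewords


if __name__ == "__main__":
    print(prefix_free_code_exists([1, 2, 3, 3]))   # Kraft sum is exactly 1
    print(build_binary_prefix_free_code([1, 2, 3, 3]))
    print(prefix_free_code_exists([1, 1, 2]))      # Kraft sum is 5/4
```

Exact rational arithmetic via `Fraction` avoids floating-point error when the Kraft sum is exactly 1.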
-
Depth-Aware Video Frame Interpolation
Authors:
Wenbo Bao,
Wei-Sheng Lai,
Chao Ma,
Xiaoyun Zhang,
Zhiyong Gao,
Ming-Hsuan Yang
Abstract:
Video frame interpolation aims to synthesize nonexistent frames in-between the original frames. While significant advances have been made from the recent deep convolutional neural networks, the quality of interpolation is often reduced due to large object motion or occlusion. In this work, we propose a video frame interpolation method which explicitly detects the occlusion by exploring the depth information. Specifically, we develop a depth-aware flow projection layer to synthesize intermediate flows that preferably sample closer objects than farther ones. In addition, we learn hierarchical features to gather contextual information from neighboring pixels. The proposed model then warps the input frames, depth maps, and contextual features based on the optical flow and local interpolation kernels for synthesizing the output frame. Our model is compact, efficient, and fully differentiable. Quantitative and qualitative results demonstrate that the proposed model performs favorably against state-of-the-art frame interpolation methods on a wide variety of datasets.
Submitted 1 April, 2019;
originally announced April 2019.
-
Photovoltaic Effect of Atomtronics Induced by Artificial Gauge Field
Authors:
Wenxi Lai,
Yuquan Ma,
W. M. Liu
Abstract:
We investigate the photovoltaic effect of atomtronics induced by an artificial gauge field in four optical potentials. The effective magnetic flux gives rise to a polarization of the atom occupation probability, which creates a current of atomtronics. The relation between the atomic current and the magnetic flux behaves like the current-phase property in a Josephson junction. The photovoltaic cell is well defined by the open atomic system, which has an effective voltage and two different poles that correspond to two internal states of atomtronics. The atom flow is controllable by changing the direction of the incident light and other system parameters. The atomic current intensity can be detected in experiments through the optical emission spectrum.
Submitted 21 February, 2019;
originally announced February 2019.
-
EFT determination of the heavy-hybrid spin potential
Authors:
Wai Kin Lai
Abstract:
We study the spin splitting in the heavy quarkonium hybrid spectrum within the framework of a nonrelativistic effective field theory. We derive for the first time the spin-dependent part of the heavy-quark-antiquark potential for heavy quarkonium hybrids to order $1/m^2$ in the heavy-quark-mass expansion. We find that several operators that are not found in standard quarkonia appear, most remarkably an operator suppressed by only one power of the heavy-quark mass. By matching the weakly-coupled pNRQCD to the effective field theory in the regime of short heavy-quark-antiquark distances, we work out the matching coefficients of the spin-dependent operators, which are factorized into a perturbative and a nonperturbative part. The nonperturbative part can be expressed in terms of purely gluonic correlators. We fit the nonperturbative parts of the matching coefficients to lattice data of the charmonium hybrid spectrum and obtain results that respect the power counting. Using the obtained nonperturbative pieces, we compute the bottomonium hybrid spectrum with the spin-dependent potential, for which results from the lattice are still sparse.
Submitted 3 December, 2018;
originally announced December 2018.
-
MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement
Authors:
Wenbo Bao,
Wei-Sheng Lai,
Xiaoyun Zhang,
Zhiyong Gao,
Ming-Hsuan Yang
Abstract:
Motion estimation (ME) and motion compensation (MC) have been widely used for classical video frame interpolation systems over the past decades. Recently, a number of data-driven frame interpolation methods based on convolutional neural networks have been proposed. However, existing learning-based methods typically estimate either flow or compensation kernels, thereby limiting performance on both computational efficiency and interpolation accuracy. In this work, we propose a motion estimation and compensation driven neural network for video frame interpolation. A novel adaptive warping layer is developed to integrate both optical flow and interpolation kernels to synthesize target frame pixels. This layer is fully differentiable such that both the flow and kernel estimation networks can be optimized jointly. The proposed model benefits from the advantages of motion estimation and compensation methods without using hand-crafted features. Compared to existing methods, our approach is computationally efficient and able to generate more visually appealing results. Furthermore, the proposed MEMC-Net can be seamlessly adapted to several video enhancement tasks, e.g., super-resolution, denoising, and deblocking. Extensive quantitative and qualitative evaluations demonstrate that the proposed method performs favorably against the state-of-the-art video frame interpolation and enhancement algorithms on a wide range of datasets.
Submitted 5 September, 2019; v1 submitted 20 October, 2018;
originally announced October 2018.
-
Learning Blind Video Temporal Consistency
Authors:
Wei-Sheng Lai,
Jia-Bin Huang,
Oliver Wang,
Eli Shechtman,
Ersin Yumer,
Ming-Hsuan Yang
Abstract:
Applying image processing algorithms independently to each frame of a video often leads to undesired inconsistent results over time. Developing temporally consistent video-based extensions, however, requires domain knowledge for individual tasks and is unable to generalize to other applications. In this paper, we present an efficient end-to-end approach based on a deep recurrent network for enforcing temporal consistency in a video. Our method takes the original unprocessed and per-frame processed videos as inputs to produce a temporally consistent video. Consequently, our approach is agnostic to specific image processing algorithms applied on the original video. We train the proposed network by minimizing both short-term and long-term temporal losses as well as the perceptual loss to strike a balance between temporal stability and perceptual similarity with the processed frames. At test time, our model does not require computing optical flow and thus achieves real-time speed even for high-resolution videos. We show that our single model can handle multiple and unseen tasks, including but not limited to artistic style transfer, enhancement, colorization, image-to-image translation and intrinsic image decomposition. Extensive objective evaluation and subject study demonstrate that the proposed approach performs favorably against the state-of-the-art methods on various types of videos.
Submitted 1 August, 2018;
originally announced August 2018.
-
Gated Fusion Network for Joint Image Deblurring and Super-Resolution
Authors:
Xinyi Zhang,
Hang Dong,
Zhe Hu,
Wei-Sheng Lai,
Fei Wang,
Ming-Hsuan Yang
Abstract:
Single-image super-resolution is a fundamental task for vision applications to enhance the image quality with respect to spatial resolution. If the input image contains degraded pixels, the artifacts caused by the degradation could be amplified by super-resolution methods. Image blur is a common degradation source. Images captured by moving or still cameras are inevitably affected by motion blur due to relative movements between sensors and objects. In this work, we focus on the super-resolution task in the presence of motion blur. We propose a deep gated fusion convolutional neural network to generate a clear high-resolution frame from a single natural image with severe blur. By decomposing the feature extraction step into two task-independent streams, the dual-branch design can facilitate the training process by avoiding learning the mixed degradation all-in-one and thus enhance the final high-resolution prediction results. Extensive experiments demonstrate that our method generates sharper super-resolved images from low-resolution inputs with high computational efficiency.
Submitted 27 July, 2018;
originally announced July 2018.
-
Conduction Bands of Atomic Tunneling Ring in Artificial Gauge Field Assisted Opened Optical Traps
Authors:
Wenxi Lai,
Yuquan Ma,
W. M. Liu
Abstract:
We show conduction bands of artificial-gauge-field-assisted atom flow in a triangular optical lattice. The conduction bands result from the periodic boundary condition on the phases of atoms induced by the artificial magnetic flux. The positions of the conduction bands depend on the geometry of the atom trajectory. We consider a cell of the triangular optical lattice that is an open system connected to its environment of fermionic atom clouds. The chemical potentials of the atom clouds are the same, and the atom flow is created entirely by a clock-laser-induced spin-orbit coupling. Our results are important for the control of atom flow in quantum circuits.
Submitted 30 June, 2018;
originally announced July 2018.
-
Spin structure of heavy-quark hybrids
Authors:
Nora Brambilla,
Wai Kin Lai,
Jorge Segovia,
Jaume Tarrús Castellà,
Antonio Vairo
Abstract:
A unique feature of quantum chromodynamics (QCD), the theory of strong interactions, is the possibility for gluonic degrees of freedom to participate in the construction of physical hadrons, which are color singlets, in an analogous manner to valence quarks. Hadrons with no valence quarks are called glueballs, while hadrons where both gluons and valence quarks combine to form a color singlet are called hybrids. The unambiguous identification of such states in the experimental hadron spectrum has thus far not been possible. Glueballs are particularly difficult to establish experimentally since the lowest lying ones are expected to strongly mix with conventional mesons. On the other hand, hybrids should be easier to single out because the set of quantum numbers available to their lowest excitations may be exotic, i.e., not realized in conventional quark-antiquark systems. Particularly promising for discovery appear to be heavy hybrids, which are made of gluons and a heavy-quark-antiquark pair (charm or bottom). In the heavy-quark sector systematic tools can be used that are not available in the light-quark sector. In this paper we use a nonrelativistic effective field theory to uncover for the first time the full spin structure of heavy-quark hybrids up to $1/m^2$-terms in the heavy-quark-mass expansion. We show that such terms display novel characteristics at variance with our consolidated experience on the fine and hyperfine splittings in atomic, molecular and nuclear physics. We determine the nonperturbative contributions to the matching coefficients of the effective field theory by fitting our results to lattice-QCD determinations of the charmonium hybrid spectrum and extrapolate the results to the bottomonium hybrid sector where lattice-QCD determinations are still challenging.
Submitted 14 February, 2020; v1 submitted 20 May, 2018;
originally announced May 2018.
-
Continuous Terrain Guarding with Two-Sided Guards
Authors:
Wei-Yu Lai,
Tien-Ruey Hsiang
Abstract:
Herein, we consider the continuous 1.5-dimensional (1.5D) terrain guarding problem with two-sided guarding. Given an x-monotone chain T, we determine the minimum number of vertex guards such that all points of T are two-sided guarded. A point p is two-sided guarded if there exist two vertices vi (left of p) and vj (right of p) that both see p. A vertex vi sees a point p on T if the line segment connecting vi to p is on or above T. We demonstrate that the continuous 1.5D terrain guarding problem can be transformed into the discrete terrain guarding problem with a finite point set X, such that if X is two-sided guarded, then T is also two-sided guarded. Through this transformation, we obtain an optimal algorithm that solves the continuous 1.5D terrain guarding problem under two-sided guarding.
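The visibility condition stated in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it only checks whether one point of T sees another and whether a point is two-sided guarded by a given guard set. The function names `sees` and `two_sided_guarded`, the tolerance `eps`, and the reading of "left of" and "right of" as strict inequalities on x are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def sees(terrain: List[Point], v: Point, p: Point, eps: float = 1e-9) -> bool:
    """Return True if the segment from v to p lies on or above the terrain.

    Assumes `terrain` lists the vertices of an x-monotone chain with strictly
    increasing x, and that both v and p lie on that chain.
    """
    (x1, y1), (x2, y2) = sorted([v, p])
    if x1 == x2:
        return True  # v and p coincide
    # The chain is piecewise linear, so it can rise above the segment only
    # at a chain vertex strictly between the two endpoints.
    for x, y in terrain:
        if x1 < x < x2:
            segment_y = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
            if y > segment_y + eps:
                return False
    return True


def two_sided_guarded(terrain: List[Point], guards: List[Point], p: Point) -> bool:
    """Return True if p is seen by a guard on its left and a guard on its right.

    Assumption: "left" and "right" are taken as strict inequalities on x.
    """
    left = any(g[0] < p[0] and sees(terrain, g, p) for g in guards)
    right = any(g[0] > p[0] and sees(terrain, g, p) for g in guards)
    return left and right
```

For example, on the zigzag chain `[(0, 0), (1, 2), (2, 0), (3, 2), (4, 0)]`, the valley vertex `(2, 0)` is two-sided guarded by the two peaks `(1, 2)` and `(3, 2)`, but not by `(1, 2)` alone.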
Submitted 7 May, 2018;
originally announced May 2018.
-
A Linear-Time Approximation Algorithm for the Orthogonal Terrain Guarding Problem
Authors:
Wei-Yu Lai,
Tien-Ruey Hsiang
Abstract:
In this paper, we consider the 1.5-dimensional orthogonal terrain guarding problem. In this problem, we are given an x-monotone chain T in which each edge is either horizontal or vertical, and we determine the minimum number of vertex guards needed to guard all vertices of T. A vertex vi sees a point p on T if the line segment connecting vi to p is on or above T. We provide an optimal O(n)-time algorithm for a subproblem of the orthogonal terrain guarding problem. In this subproblem, we determine the minimum number of vertex guards for all right (left) convex vertices of T. Finally, we provide a 2-approximation algorithm that solves the 1.5-dimensional orthogonal terrain guarding problem in O(n) time.
Submitted 9 May, 2018; v1 submitted 30 April, 2018;
originally announced April 2018.
-
Learning a Discriminative Prior for Blind Image Deblurring
Authors:
Lerenhan Li,
Jinshan Pan,
Wei-Sheng Lai,
Changxin Gao,
Nong Sang,
Ming-Hsuan Yang
Abstract:
We present an effective blind image deblurring method based on a data-driven discriminative prior. Our work is motivated by the fact that a good image prior should favor clear images over blurred images. In this work, we formulate the image prior as a binary classifier which can be achieved by a deep convolutional neural network (CNN). The learned prior is able to distinguish whether an input image is clear or not. Embedded into the maximum a posteriori (MAP) framework, it helps blind deblurring in various scenarios, including natural, face, text, and low-illumination images. However, it is difficult to optimize the deblurring method with the learned image prior as it involves a non-linear CNN. Therefore, we develop an efficient numerical approach based on the half-quadratic splitting method and gradient descent algorithm to solve the proposed model. Furthermore, the proposed model can be easily extended to non-uniform deblurring. Both qualitative and quantitative experimental results show that our method performs favorably against state-of-the-art algorithms as well as domain-specific image deblurring approaches.
Submitted 4 April, 2018; v1 submitted 8 March, 2018;
originally announced March 2018.
-
Deep Semantic Face Deblurring
Authors:
Ziyi Shen,
Wei-Sheng Lai,
Tingfa Xu,
Jan Kautz,
Ming-Hsuan Yang
Abstract:
In this paper, we present an effective and efficient face deblurring algorithm by exploiting semantic cues via deep convolutional neural networks (CNNs). As face images are highly structured and share several key semantic components (e.g., eyes and mouths), the semantic information of a face provides a strong prior for restoration. As such, we propose to incorporate global semantic priors as input and impose local structure losses to regularize the output within a multi-scale deep CNN. We train the network with perceptual and adversarial losses to generate photo-realistic results and develop an incremental training strategy to handle random blur kernels in the wild. Quantitative and qualitative evaluations demonstrate that the proposed face deblurring algorithm restores sharp images with more facial details and performs favorably against state-of-the-art methods in terms of restoration quality, face recognition and execution speed.
Submitted 16 March, 2018; v1 submitted 8 March, 2018;
originally announced March 2018.
-
Two-component self-contracted droplets: long-range attraction and confinement effects
Authors:
Adrien Benusiglio,
Nate Cira,
Anna Wei Lai,
Manu Prakash
Abstract:
Marangoni self-contracted droplets are formed by a mixture of two liquids, one of larger surface tension and larger evaporation rate than the other. Due to evaporation, the droplets contract to a stable contact angle instead of spreading on a wetting substrate. This gives them unique properties, including absence of pinning force and ability to move under vapor gradients, self- and externally imposed. We first model the dynamics of attraction in an unconfined geometry and then study the effects of confinement on the attraction range and dynamics, going from minimal confinement (vertical boundary), to medium confinement (2-D vapor diffusion) and eventually strong confinement (1-D). "Self-induced" motion is observed when single droplets are placed close to a vapor boundary toward which they are attracted, the boundary acting as an image droplet with respect to itself. When two droplets are confined between two horizontal plates, they interact at a longer distance with modified dynamics. Finally, confining the droplet in a tunnel, the range of attraction is greatly enhanced, as the droplet moves all the way up the tunnel when an external humidity gradient is imposed. "Self-induced" motion is also observed, as the droplet can move by itself towards the center of the tunnel. Confinement greatly increases the range at which droplets interact as well as their lifetime and thus greatly expands the control and design possibilities for applications offered by self-contracted droplets.
Submitted 16 November, 2017;
originally announced November 2017.
-
Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks
Authors:
Wei-Sheng Lai,
Jia-Bin Huang,
Narendra Ahuja,
Ming-Hsuan Yang
Abstract:
Convolutional neural networks have recently demonstrated high-quality reconstruction for single image super-resolution. However, existing methods often require a large number of network parameters and entail heavy computational loads at runtime for generating high-accuracy super-resolution results. In this paper, we propose the deep Laplacian Pyramid Super-Resolution Network for fast and accurate image super-resolution. The proposed network progressively reconstructs the sub-band residuals of high-resolution images at multiple pyramid levels. In contrast to existing methods that involve the bicubic interpolation for pre-processing (which results in large feature maps), the proposed method directly extracts features from the low-resolution input space and thereby entails low computational loads. We train the proposed network with deep supervision using the robust Charbonnier loss functions and achieve high-quality image reconstruction. Furthermore, we utilize the recursive layers to share parameters across as well as within pyramid levels, and thus drastically reduce the number of parameters. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of run-time and image quality.
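The Charbonnier loss named in the abstract above is commonly written as $\rho(x) = \sqrt{x^2 + \epsilon^2}$, a smooth approximation of the absolute error. The following is a minimal Python sketch of that standard form only, not the authors' training code. The function name `charbonnier_loss`, the averaging over elements, and the default `eps=1e-3` are assumptions, since the abstract does not state them.

```python
import math
from typing import Sequence


def charbonnier_loss(pred: Sequence[float], target: Sequence[float],
                     eps: float = 1e-3) -> float:
    """Mean Charbonnier penalty sqrt((pred - target)^2 + eps^2).

    Assumption: eps = 1e-3 and mean reduction are illustrative defaults.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)
```

For large residuals the penalty grows roughly linearly, like an L1 loss, which is what makes it less sensitive to outliers than a squared error. Near zero it stays differentiable and bottoms out at `eps`.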
Submitted 9 August, 2018; v1 submitted 4 October, 2017;
originally announced October 2017.
-
Optical switching of defect charge states in 4H-SiC
Authors:
D. Andrew Golter,
Chih Wei Lai
Abstract:
We demonstrate optically induced switching between bright and dark charged divacancy defects in 4H-SiC. Photoluminescence excitation and time-resolved photoluminescence measurements reveal the excitation conditions for such charge conversion. For an energy below ~1.3 eV (above ~950 nm), the PL is suppressed by more than two orders of magnitude. The PL is recovered in the presence of a higher energy repump laser with a time-averaged intensity less than 0.1% that of the excitation field. Under a repump of 2.33 eV (532 nm), the PL increases rapidly, with a time constant ~30 $μ$s. By contrast, when the repump is switched off, the PL decreases first within ~100-200 $μ$s, followed by a much slower decay of a few seconds. We attribute these effects to the conversion between two different charge states. Under an excitation at energy levels below 1.3 eV, V$_{Si}$V$_C$$^0$ are converted into a dark charge state. A repump laser with an energy above 1.3 eV can excite this charged state and recover the bright neutral state. This optically induced charge switching can lead to charge-state fluctuations but can be exploited for long-term data storage or nuclear-spin-based quantum memory.
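The energy and wavelength pairs quoted in the abstract above (about 1.3 eV with about 950 nm, and 2.33 eV with 532 nm) follow from the photon relation $\lambda = hc/E$. A short Python sketch of the conversion is given below. The function name `wavelength_nm` is hypothetical, and the constant is the standard value of $hc$ in eV nm.

```python
# Standard value of h*c expressed in eV * nm.
HC_EV_NM = 1239.841984


def wavelength_nm(energy_ev: float) -> float:
    """Convert a photon energy in eV to a vacuum wavelength in nm."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev
```

With this relation, 2.33 eV corresponds to roughly 532 nm and 1.3 eV to roughly 954 nm, consistent with the approximate values in the abstract. Lower photon energy means longer wavelength, which is why "below ~1.3 eV" pairs with "above ~950 nm".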
Submitted 6 October, 2017; v1 submitted 5 July, 2017;
originally announced July 2017.
-
Constraining the Mass Scale of a Lorentz-Violating Hamiltonian with the Measurement of Astrophysical Neutrino-Flavor Composition
Authors:
Kwang-Chang Lai,
Wei-Hao Lai,
Guey-Lin Lin
Abstract:
We study Lorentz violation effects on flavor transitions of high-energy astrophysical neutrinos. It is shown that the presence of a Lorentz-violating Hamiltonian can drastically change the flavor transition probabilities of astrophysical neutrinos. Predictions of Lorentz violation effects on flavor compositions of astrophysical neutrinos arriving on Earth are compared with the IceCube flavor composition measurement, which analyzes astrophysical neutrino events in the energy range between $25~{\rm TeV}$ and $2.8~{\rm PeV}$. Such a comparison indicates that the future IceCube-Gen2 will be able to place stringent constraints on the Lorentz-violating Hamiltonian in the neutrino sector. We work out the expected sensitivities of IceCube-Gen2 to dimension-$3$ CPT-odd and dimension-$4$ CPT-even operators in the Lorentz-violating Hamiltonian. The expected sensitivities can improve on the current constraints obtained from other types of experiments by more than two orders of magnitude for a certain range of the parameter space.
Submitted 12 January, 2018; v1 submitted 13 April, 2017;
originally announced April 2017.
-
Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution
Authors:
Wei-Sheng Lai,
Jia-Bin Huang,
Narendra Ahuja,
Ming-Hsuan Yang
Abstract:
Convolutional neural networks have recently demonstrated high-quality reconstruction for single-image super-resolution. In this paper, we propose the Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively reconstruct the sub-band residuals of high-resolution images. At each pyramid level, our model takes coarse-resolution feature maps as input, predicts the high-frequency residuals, and uses transposed convolutions for upsampling to the finer level. Our method does not require bicubic interpolation as a pre-processing step and thus dramatically reduces the computational complexity. We train the proposed LapSRN with deep supervision using a robust Charbonnier loss function and achieve high-quality reconstruction. Furthermore, our network generates multi-scale predictions in one feed-forward pass through the progressive reconstruction, thereby facilitating resource-aware applications. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of speed and accuracy.
Submitted 9 October, 2017; v1 submitted 12 April, 2017;
originally announced April 2017.
-
Semantic-driven Generation of Hyperlapse from $360^\circ$ Video
Authors:
Wei-Sheng Lai,
Yujia Huang,
Neel Joshi,
Chris Buehler,
Ming-Hsuan Yang,
Sing Bing Kang
Abstract:
We present a system for converting a fully panoramic ($360^\circ$) video into a normal field-of-view (NFOV) hyperlapse for an optimal viewing experience. Our system exploits visual saliency and semantics to non-uniformly sample in space and time for generating hyperlapses. In addition, users can optionally choose objects of interest for customizing the hyperlapses. We first stabilize an input $360^\circ$ video by smoothing the rotation between adjacent frames and then compute regions of interest and saliency scores. An initial hyperlapse is generated by optimizing the saliency and motion smoothness followed by the saliency-aware frame selection. We further smooth the result using an efficient 2D video stabilization approach that adaptively selects the motion model to generate the final hyperlapse. We validate the design of our system by showing results for a variety of scenes and comparing against the state-of-the-art method through a user study.
Submitted 9 October, 2017; v1 submitted 31 March, 2017;
originally announced March 2017.
-
Learning Fully Convolutional Networks for Iterative Non-blind Deconvolution
Authors:
Jiawei Zhang,
Jinshan Pan,
Wei-Sheng Lai,
Rynson Lau,
Ming-Hsuan Yang
Abstract:
In this paper, we propose a fully convolutional network for iterative non-blind deconvolution. We decompose the non-blind deconvolution problem into image denoising and image deconvolution. We train an FCNN to remove noise in the gradient domain and use the learned gradients to guide the image deconvolution step. In contrast to the existing deep neural network based methods, we iteratively deconvolve the blurred images in a multi-stage framework. The proposed method is able to learn an adaptive image prior, which keeps both local (details) and global (structures) information. Both quantitative and qualitative evaluations on benchmark datasets demonstrate that the proposed method performs favorably against state-of-the-art algorithms in terms of quality and speed.
Submitted 20 November, 2016;
originally announced November 2016.
-
Overview of recent physics results from MAST
Authors:
A Kirk,
J Adamek,
RJ Akers,
S Allan,
L Appel,
F Arese Lucini,
M Barnes,
T Barrett,
N Ben Ayed,
W Boeglin,
J Bradley,
P K Browning,
J Brunner,
P Cahyna,
M Carr,
F Casson,
M Cecconello,
C Challis,
IT Chapman,
S Chapman,
S Conroy,
N Conway,
WA Cooper,
M Cox,
N Crocker
, et al. (138 additional authors not shown)
Abstract:
New results from MAST are presented that focus on validating models in order to extrapolate to future devices. Measurements during start-up experiments have shown how the bulk ion temperature rise scales with the square of the reconnecting field. During the current ramp-up, models are not able to correctly predict the current diffusion. Experiments have been performed looking at edge and core turbulence. At the edge, detailed studies have revealed how filament characteristics are responsible for determining the near and far SOL density profiles. In the core, the intrinsic rotation and electron-scale turbulence have been measured. The role that the fast ion gradient has on redistributing fast ions through fishbone modes has led to a redesign of the neutral beam injector on MAST Upgrade. In H-mode, the turbulence at the pedestal top has been shown to be consistent with being due to electron temperature gradient modes. A reconnection process appears to occur during ELMs, and the number of filaments released determines the power profile at the divertor. Resonant magnetic perturbations can mitigate ELMs provided the edge peeling response is maximised and the core kink response minimised. The mitigation of intrinsic error fields with toroidal mode number n>1 has been shown to be important for plasma performance.
Submitted 18 November, 2016;
originally announced November 2016.