Search | arXiv e-print repository

arXiv:2011.01510 [pdf]

doi 10.1038/s41467-021-25721-1

Pressure induced superconductivity in MnSe

Authors: T. L. Hung, C. H. Huang, L. Z. Deng, M. N. Ou, Y. Y. Chen, M. K. Wu, S. Y. Huyan, C. W. Chu, P. J. Chen, T. K. Lee

Abstract: The rich phenomena in FeSe and related compounds have attracted great interest, as they provide fertile material for gaining further insight into the mechanism of high temperature superconductivity. A natural follow-up was to look into the possibility of superconductivity in MnSe. It was shown that MnP becomes superconducting with Tc ~ 1 K under pressure. We demonstrate in this work that high pressure can effectively suppress the complex magnetic character that MnSe crystals exhibit at ambient conditions. MnSe under pressure is found to undergo several structural transformations: the cubic phase first partially transforms to the hexagonal phase at about 12 GPa, the crystal exhibits the coexistence of cubic, hexagonal and orthorhombic phases from 16 GPa to 30 GPa, and above 30 GPa the crystal shows a single orthorhombic phase. Superconductivity with Tc ~ 5 K was first observed at pressure ~12 GPa by magnetic measurements (~16 GPa by resistive measurements). The highest Tc is ~ 9 K (magnetic result) at ~35 GPa. Our observations suggest the observed superconductivity may closely relate to the pressure-induced structural change. However, the interface between the metallic and insulating boundaries may also play an important role in the pressure induced superconductivity in MnSe.

Submitted 3 November, 2020; originally announced November 2020.

Comments: 13 pages, 8 figures. TLH, CHH and LZD contribute equally to this work

arXiv:2010.14725 [pdf, other]

CASS-NAT: CTC Alignment-based Single Step Non-autoregressive Transformer for Speech Recognition

Authors: Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao

Abstract: We propose a CTC alignment-based single step non-autoregressive transformer (CASS-NAT) for speech recognition. Specifically, the CTC alignment contains the information of (a) the number of tokens for decoder input, and (b) the time span of acoustics for each token. This information is used to extract an acoustic representation for each token in parallel, referred to as token-level acoustic embedding, which substitutes the word embedding in the autoregressive transformer (AT) to achieve parallel generation in the decoder. During inference, an error-based alignment sampling method is proposed to be applied to the CTC output space, reducing the WER and retaining the parallelism as well. Experimental results show that the proposed method achieves WERs of 3.8%/9.1% on the Librispeech test clean/other datasets without an external LM, and a CER of 5.8% on the Aishell1 Mandarin corpus. Compared to the AT baseline, the CASS-NAT has a performance reduction on WER, but is 51.2x faster in terms of RTF. When decoding with an oracle CTC alignment, the lower bound of WER without LM reaches 2.3% on the test-clean set, indicating the potential of the proposed method.

Submitted 11 February, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: Accepted to ICASSP2021, camera ready version

arXiv:2010.11943 [pdf, other]

Few-Shot Adaptation of Generative Adversarial Networks

Authors: Esther Robb, Wen-Sheng Chu, Abhishek Kumar, Jia-Bin Huang

Abstract: Generative Adversarial Networks (GANs) have shown remarkable performance in image synthesis tasks, but typically require a large number of training samples to achieve high-quality synthesis. This paper proposes a simple and effective method, Few-Shot GAN (FSGAN), for adapting GANs in few-shot settings (less than 100 images). FSGAN repurposes component analysis techniques and learns to adapt the singular values of the pre-trained weights while freezing the corresponding singular vectors. This provides a highly expressive parameter space for adaptation while constraining changes to the pretrained weights. We validate our method in a challenging few-shot setting of 5-100 images in the target domain. We show that our method has significant visual quality gains compared with existing GAN adaptation methods. We report qualitative and quantitative results showing the effectiveness of our method. We additionally highlight a problem for few-shot synthesis in the standard quantitative metric used by data-efficient image synthesis works. Code and additional results are available at http://e-271.github.io/few-shot-gan.

Submitted 22 October, 2020; originally announced October 2020.

arXiv:2009.08953 [pdf]

doi 10.1364/OL.410608

An on-chip tunable micro-disk laser fabricated on Er3+ doped lithium niobate on insulator (LNOI)

Authors: Zhe Wang, Zhiwei Fang, Zhaoxiang Liu, Wei Chu, Yuan Zhou, Jianhao Zhang, Rongbo Wu, Min Wang, Tao Lu, Ya Cheng

Abstract: We demonstrate a C-band wavelength-tunable microlaser with an Er3+ doped high quality (~1.02x10^6) lithium niobate microdisk resonator. With a 976 nm continuous-wave pump laser, lasing action can be observed at a pump power threshold as low as ~250 μW at room temperature. Furthermore, the microdisk laser wavelength can be tuned by varying the pump laser power, showing a tuning efficiency of ~-17.03 pm/mW at low pump power below 13 mW, and 10.58 pm/mW at high pump power above 13 mW.

Submitted 18 September, 2020; originally announced September 2020.

arXiv:2009.07448 [pdf, other]

Question Directed Graph Attention Network for Numerical Reasoning over Text

Authors: Kunlong Chen, Weidi Xu, Xingyi Cheng, Zou Xiaochuan, Yuyu Zhang, Le Song, Taifeng Wang, Yuan Qi, Wei Chu

Abstract: Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question directed graph attention network to drive multi-step numerical reasoning over this context graph. The code link is at: https://github.com/emnlp2020qdgat/QDGAT

Submitted 19 November, 2023; v1 submitted 15 September, 2020; originally announced September 2020.

Comments: Accepted at EMNLP 2020

arXiv:2009.05932 [pdf]

Dynamics for droplet-based electricity generators

Authors: Xiang Wang, Sunmiao Fang, Jin Tan, Tao Hu, Weicun Chu, Jun Yin, Jianxin Zhou, Wanlin Guo

Abstract: The finding of droplet-based electricity generator (DEG), based on the moving boundary of electrical double layer, has triggered great research enthusiasm, and a breakthrough in instantaneous electric power density was achieved recently. However, the dynamic mechanism for such droplet-based electricity generators remains elusive, impeding optimization of the DEG for practical applications. Through comprehensive experiments, we developed a dynamic model of surface charge density that can explain the underlying mechanism for the DEGs. The spreading droplet in touch with the top electrode can be equivalently regarded as an additional part of the top plate of the DEG capacitor, and the change of droplet area causes the change of surface charge density of the plates, driving electrons to migrate between the two plates. The insight of the dynamic mechanism paves a way for optimal design and practical applications of DEGs.

Submitted 13 September, 2020; originally announced September 2020.

arXiv:2008.09163 [pdf]

doi 10.1021/acs.cgd.0c00614

Novel polymorphic phase of BaCu2As2: impact of flux for new phase formation in crystal growth

Authors: Hanlin Wu, Sheng Li, Zheng Wu, Xiqu Wang, Gareth A. Ofenstein, Sunah Kwon, Moon J. Kim, Paul C. W. Chu, Bing Lv

Abstract: In this work, we have thoroughly studied the effects of flux composition and temperature on the crystal growth of the BaCu2As2 compound. While Pb and CuAs self-flux produce the well-known α-phase ThCr2Si2-type structure (Z=2), a new polymorphic phase of BaCu2As2 (β phase) with a much larger c lattice parameter (Z=10), which could be considered an intergrowth of the ThCr2Si2- and CaBe2Ge2-type structures, has been discovered via Sn flux growth. We have characterized this structure through single-crystal X-ray diffraction, transmission electron microscopy (TEM), and scanning transmission electron microscopy (STEM) studies. Furthermore, we compare this new polymorphic intergrowth structure with the α-phase BaCu2As2 (ThCr2Si2 type with Z=2) and the β-phase BaCu2Sb2 (intergrowth of ThCr2Si2 and CaBe2Ge2 types with Z=6), both with the same space group I4/mmm. Electrical transport studies reveal p-type carriers and magnetoresistivity up to 22% at 5 K and under a magnetic field of 7 T. Our work suggests a new route for the discovery of new polymorphic structures through flux and temperature control during material synthesis.

Submitted 20 August, 2020; originally announced August 2020.

Comments: 15 pages, 4 figures

Journal ref: Crystal Growth & Design 2020

arXiv:2008.05090 [pdf, other]

Learning to Caricature via Semantic Shape Transform

Authors: Wenqing Chu, Wei-Chih Hung, Yi-Hsuan Tsai, Yu-Ting Chang, Yijun Li, Deng Cai, Ming-Hsuan Yang

Abstract: Caricature is an artistic drawing created to abstract or exaggerate facial features of a person. Rendering visually pleasing caricatures is a difficult task that requires professional skills, and thus it is of great interest to design a method to automatically generate such drawings. To deal with large shape changes, we propose an algorithm based on a semantic shape transform to produce diverse and plausible shape exaggerations. Specifically, we predict pixel-wise semantic correspondences and perform image warping on the input photo to achieve dense shape transformation. We show that the proposed framework is able to render visually pleasing shape exaggerations while maintaining their facial structures. In addition, our model allows users to manipulate the shape via the semantic map. We demonstrate the effectiveness of our approach on a large photograph-caricature benchmark dataset with comparisons to the state-of-the-art methods.

Submitted 13 August, 2020; v1 submitted 11 August, 2020; originally announced August 2020.

Comments: Submitted to IJCV, code and model are available at https://github.com/wenqingchu/Semantic-CariGANs/

arXiv:2006.11562 [pdf]

doi 10.35848/1347-4065/aba0d6

High-index-contrast single-mode optical waveguides fabricated on lithium niobate by photolithography assisted chemo-mechanical etching (PLACE)

Authors: Jianhao Zhang, Rongbo Wu, Min Wang, Zhiwei Fang, Jintian Lin, Junxia Zhou, Renhong Gao, Zhe Wang, Wei Chu, Ya Cheng

Abstract: We report fabrication of low loss single mode waveguides on lithium niobate on insulator (LNOI) cladded by a layer of SiO2. Our technique, termed photolithography assisted chemo-mechanical etching (PLACE), relies on patterning of a chromium film into the mask shape by femtosecond laser micromachining and subsequent chemo-mechanical etching of the lithium niobate thin film. The high-index-contrast single mode waveguide is measured to have a propagation loss of 0.13 dB/cm. Furthermore, waveguide tapers are fabricated for boosting the coupling efficiency.

Submitted 3 May, 2020; originally announced June 2020.

Comments: 4 pages, 5 figures

arXiv:2005.09195 [pdf, other]

Riemannian Proximal Policy Optimization

Authors: Shijun Wang, Baocheng Zhu, Chen Li, Mingzhe Wu, James Zhang, Wei Chu, Yuan Qi

Abstract: In this paper, we propose a general Riemannian proximal optimization algorithm with guaranteed convergence to solve Markov decision process (MDP) problems. To model policy functions in MDP, we employ a Gaussian mixture model (GMM) and formulate it as a nonconvex optimization problem in the Riemannian space of positive semidefinite matrices. For two given policy functions, we also provide a lower bound on policy improvement by using bounds derived from the Wasserstein distance of GMMs. Preliminary experiments show the efficacy of our proposed Riemannian proximal policy optimization algorithm.

Submitted 18 May, 2020; originally announced May 2020.

Comments: 12 pages, 1 figure

arXiv:2005.00772 [pdf, ps, other]

Alternating Convolutions of Catalan Numbers

Authors: Wenchang Chu

Abstract: A new class of alternating convolutions concerning binomial coefficients and Catalan numbers are evaluated in closed forms.

Submitted 6 March, 2021; v1 submitted 2 May, 2020; originally announced May 2020.

MSC Class: 05A10; 33C15

arXiv:2004.14166 [pdf, other]

SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check

Authors: Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu, Yuan Qi

Abstract: Chinese Spelling Check (CSC) is a task to detect and correct spelling errors in Chinese natural language. Existing methods have made attempts to incorporate the similarity knowledge between Chinese characters. However, they take the similarity knowledge as either an external input resource or just heuristic rules. This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN). The model builds a graph over the characters, and SpellGCN is learned to map this graph into a set of inter-dependent character classifiers. These classifiers are applied to the representations extracted by another network, such as BERT, enabling the whole network to be end-to-end trainable. Experiments (The dataset and all code for this paper are available at https://github.com/ACL2020SpellGCN/SpellGCN) are conducted on three human-annotated datasets. Our method achieves superior performance against previous models by a large margin.

Submitted 13 May, 2020; v1 submitted 25 April, 2020; originally announced April 2020.

Comments: Accepted by ACL2020

arXiv:2004.12559 [pdf]

Response to Comment on "Low-frequency lattice phonons in halide perovskites explain high defect tolerance toward electron-hole recombination"

Authors: Weibin Chu, Qijing Zheng, Oleg V. Prezhdo, Jin Zhao, Wissam A. Saidi

Abstract: Recently we proposed that defect tolerance in the hybrid perovskites is due to their characteristic low-frequency lattice phonon modes that decrease the non-adiabatic coupling and weaken the overlap between the free carrier and defect states [Sci. Adv. 6 7, eaaw7453 (2020)]. Kim and Walsh disagree with the interpretation and argue that there are flaws in our employed methodology. Herein we address their concerns and show that their conclusions are not valid due to misunderstandings of nonadiabatic transition.

Submitted 26 April, 2020; originally announced April 2020.

arXiv:2004.11265 [pdf, ps, other]

doi 10.1103/PhysRevMaterials.4.054406

Absence of long-range order in an $XY$ pyrochlore antiferromagnet Er$_2$AlSbO$_7$

Authors: H. L. Che, Z. Y. Zhao, X. Rao, L. G. Chu, N. Li, W. J. Chu, P. Gao, X. Y. Yue, Y. Zhou, Q. J. Li, Q. Huang, E. S. Choi, Y. Y. Han, Z. Z. He, H. D. Zhou, X. Zhao, X. F. Sun

Abstract: Rare-earth pyrochlores are known to exhibit exotic magnetic phenomena. We report a study of crystal growth and characterizations of a new rare-earth compound Er$_2$AlSbO$_7$, in which Al$^{3+}$ and Sb$^{5+}$ ions share the same positions with a random distribution. The magnetism is studied by magnetic susceptibility, specific heat and thermal conductivity measurements at low temperatures down to several tens of milli-kelvin. Different from the other reported Er-based pyrochlores exhibiting distinct magnetically ordered states, a spin-freezing transition is detected in Er$_2$AlSbO$_7$ below 0.37 K, which is primarily ascribed to the inherent structural disorder. A cluster spin-glass state is proposed in view of the frequency dependence of the peak position in the ac susceptibility. In addition, the temperature and field dependence of thermal conductivity indicates rather strong spin fluctuations, which are probably due to the phase competition.

Submitted 23 April, 2020; originally announced April 2020.

Comments: 10 pages, 8 figures, accepted for publication Phys. Rev. Materials

Journal ref: Phys. Rev. Materials 4, 054406 (2020)

arXiv:2004.08883 [pdf, other]

Variational Policy Propagation for Multi-agent Reinforcement Learning

Authors: Chao Qu, Hui Li, Chang Liu, Junwu Xiong, James Zhang, Wei Chu, Weiqiang Wang, Yuan Qi, Le Song

Abstract: We propose a collaborative multi-agent reinforcement learning algorithm named variational policy propagation (VPP) to learn a joint policy through the interactions over agents. We prove that the joint policy is a Markov Random Field under some mild conditions, which in turn reduces the policy space effectively. We integrate the variational inference as special differentiable layers in policy such that the actions can be efficiently sampled from the Markov Random Field and the overall policy is differentiable. We evaluate our algorithm on several large scale challenging tasks and demonstrate that it outperforms previous state-of-the-art methods.

Submitted 29 January, 2022; v1 submitted 19 April, 2020; originally announced April 2020.

Comments: The title of previous version was "Intention Propagation for Multi-agent Reinforcement Learning"

arXiv:2004.05224 [pdf, other]

doi 10.1109/TITS.2020.3023541

Deep Learning for Image and Point Cloud Fusion in Autonomous Driving: A Review

Authors: Yaodong Cui, Ren Chen, Wenbo Chu, Long Chen, Daxin Tian, Ying Li, Dongpu Cao

Abstract: Autonomous vehicles have experienced rapid development in the past few years. However, achieving full autonomy is not a trivial task, due to the nature of the complex and dynamic driving environment. Therefore, autonomous vehicles are equipped with a suite of different sensors to ensure robust, accurate environmental perception. In particular, camera-LiDAR fusion is becoming an emerging research theme. However, so far there has been no critical review that focuses on deep-learning-based camera-LiDAR fusion methods. To bridge this gap and motivate future research, this paper is devoted to reviewing recent deep-learning-based data fusion approaches that leverage both image and point cloud. This review gives a brief overview of deep learning on image and point cloud data processing, followed by in-depth reviews of camera-LiDAR fusion methods in depth completion, object detection, semantic segmentation, tracking and online cross-sensor calibration, which are organized based on their respective fusion levels. Furthermore, we compare these methods on publicly available datasets. Finally, we identify gaps and overlooked challenges between current academic research and real-world applications. Based on these observations, we provide our insights and point out promising research directions.

Submitted 9 September, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

Journal ref: IEEE Transactions on Intelligent Transportation Systems (2021)

arXiv:2004.05122 [pdf]

doi 10.1073/pnas.1922108117

Room-temperature skyrmion phase in bulk Cu2OSeO3 under high pressures

Authors: Liangzi Deng, Hung-Cheng Wu, Alexander P. Litvinchuk, Noah F. Q. Yuan, Jey-Jau Lee, Rabin Dahal, Helmuth Berger, Hung-Duen Yang, Ching-Wu Chu

Abstract: A skyrmion state in a non-centrosymmetric helimagnet displays topologically protected spin textures with profound technological implications for high density information storage, ultrafast spintronics, and effective microwave devices. Usually, its equilibrium state in a bulk helimagnet occurs only over a very restricted magnetic-field--temperature phase space and often in the low temperature region near the magnetic transition temperature Tc. We have expanded and enhanced the skyrmion phase region from the small range of 55-58.5 K to 5-300 K in single-crystalline Cu2OSeO3 by pressures up to 42.1 GPa through a series of phase transitions from the cubic P2$_1$3, through orthorhombic P2$_1$2$_1$2$_1$ and monoclinic P2$_1$, and finally to the triclinic P1 phase, using our newly developed ultrasensitive high-pressure magnetization technique. The results are in agreement with our Ginzburg-Landau free energy analyses, showing that pressures tend to stabilize the skyrmion states and at higher temperatures. The observations also indicate that the skyrmion state can be achieved at higher temperatures in various crystal symmetries, suggesting the insensitivity of skyrmions to the underlying crystal lattices and thus the possible more ubiquitous presence of skyrmions in helimagnets.

Submitted 10 April, 2020; originally announced April 2020.

Comments: 22 pages, 5 figures and 3 supplementary figures

Journal ref: Proceedings of the National Academy of Sciences Apr 2020, 201922108

arXiv:2004.03894 [pdf]

Incubation Induced Light Concentration Beyond the Diffraction Limit for High-Resolution Glass Printing

Authors: Haisu Zhang, Peng Wang, Wei Chu, Jianping Yu, Wenbo Li, Jia Qi, Zhanshan Wang, Ya Cheng

Abstract: In the past two decades, tremendous efforts have been exerted to understand and control the delivery of ultrashort laser pulses into various types of transparent materials ranging from glass and crystal to polymer and even bio-materials. This approach opens up the route toward determinative and highly localized modification within the transparent materials, enabling three-dimensional (3D) micromachining of the materials into sophisticated structures and devices with extreme geometrical flexibility. Owing to the linear diffraction and nonlinear self-focusing effects, the focal volume typically exhibits an asymmetric profile stretching along the longitudinal direction. This effect becomes more severe when focusing deeply into the transparent substrates for printing objects of large heights. In this work a new laser-material interaction regime is identified with the exceptional incubation effect originating from self-regulated multiple-pulse interactions with accumulated material changes. Our finding reveals a focal-volume-invariant modification deep inside the fused silica glass, in striking contrast to the traditional belief that the geometrical shape of the laser induced modification follows the intensity distribution of the inscription laser. A macro-scale geometrically complex glass sculpture is successfully manufactured with the incubation assisted ultrashort laser inscription at uniform micrometer resolutions in all three dimensions.

Submitted 8 April, 2020; originally announced April 2020.

Comments: 16 pages, 5 figures

arXiv:2003.05112 [pdf, other]

PONAS: Progressive One-shot Neural Architecture Search for Very Efficient Deployment

Authors: Sian-Yao Huang, Wei-Ta Chu

Abstract: We achieve very efficient deep learning model deployment that designs neural network architectures to fit different hardware constraints. Given a constraint, most neural architecture search (NAS) methods either sample a set of sub-networks according to a pre-trained accuracy predictor, or adopt the evolutionary algorithm to evolve specialized networks from the supernet. Both approaches are time consuming. Here our key idea for very efficient deployment is, when searching the architecture space, constructing a table that stores the validation accuracy of all candidate blocks at all layers. For a stricter hardware constraint, the architecture of a specialized network can be very efficiently determined based on this table by picking the best candidate blocks that yield the least accuracy loss. To accomplish this idea, we propose Progressive One-shot Neural Architecture Search (PONAS) that combines advantages of progressive NAS and one-shot methods. In PONAS, we propose a two-stage training scheme, including the meta training stage and the fine-tuning stage, to make the search process efficient and stable. During search, we evaluate candidate blocks in different layers and construct the accuracy table that is to be used in deployment. Comprehensive experiments verify that PONAS is extremely flexible, and is able to find the architecture of a specialized network in around 10 seconds. In ImageNet classification, 75.2% top-1 accuracy can be obtained, which is comparable with the state of the art.

Submitted 9 April, 2020; v1 submitted 11 March, 2020; originally announced March 2020.

arXiv:2002.00190 [pdf, ps, other]

Improving Performance Estimation for FPGA-based Accelerators for Convolutional Neural Networks

Authors: Martin Ferianc, Hongxiang Fan, Ringo S. W. Chu, Jakub Stano, Wayne Luk

Abstract: Field-programmable gate array (FPGA) based accelerators are being widely used for acceleration of convolutional neural networks (CNNs) due to their potential in improving the performance and reconfigurability for specific application instances. To determine the optimal configuration of an FPGA-based accelerator, it is necessary to explore the design space and an accurate performance prediction plays an important role during the exploration. This work introduces a novel method for fast and accurate estimation of latency based on a Gaussian process parametrised by an analytic approximation and coupled with runtime data. The experiments conducted on three different CNNs on an FPGA-based accelerator on Intel Arria 10 GX 1150 demonstrated a 30.7% improvement in accuracy with respect to the mean absolute error in comparison to a standard analytic method in leave-one-out cross-validation.

Submitted 1 February, 2020; originally announced February 2020.

Comments: This article is accepted for publication at ARC'2020

arXiv:2001.03589 [pdf]

Freeform microfluidic networks encapsulated in laser printed three-dimensional macro-scale glass objects

Authors: Zijie Lin, Jian Xu, Yunpeng Song, Xiaolong Li, Peng Wang, Wei Chu, Zhenhua Wang, Ya Cheng

Abstract: Large-scale microfluidic microsystems with complex three-dimensional (3D) configurations are in high demand for both fundamental research and industrial application, holding the potential to foster a wide range of innovative applications such as lab-on-a-chip and organ-on-a-chip as well as continuous-flow manufacturing of fine chemicals. However, freeform fabrication of such systems remains challenging for most of the current fabrication techniques in terms of fabrication resolution, flexibility, and achievable footprint size. Here, we report ultrashort pulse laser microfabrication of freeform microfluidic circuits with high aspect ratios and tunable diameters embedded in 3D printed glass objects. We achieve uniform microfluidic channel diameter by carefully distributing a string of extra access ports along the microfluidic channels to avoid over-etching in the thin microfluidic channels. After the chemical etching is completed, the extra access ports are sealed using carbon dioxide laser induced localized glass melting. We demonstrate a model hand of fused silica with a size of ~3 cm * 2.7 cm * 1.1 cm in which the whole blood vessel system is encapsulated.

Submitted 17 October, 2019; originally announced January 2020.

Comments: 27 pages, 5 figures

arXiv:2001.03168 [pdf]

A compact and efficient three-dimensional microfluidic mixer

Authors: Wenbo Li, Wei Chu, Peng Wang, Jia Qi, Zhe Wang, Jintian Lin, Min Wang, Ya Cheng

Abstract: Microfluidic mixing is a fundamental functionality in most lab on a chip (LOC) systems, whereas realization of efficient mixing is challenging in microfluidic channels due to the small Reynolds numbers. Here, we design and fabricate a compact three-dimensional (3D) micromixer to enable efficient mixing at various flow rates. The performance of the fabricated micromixer was examined using blue and red inks. The extreme flexibility in fabricating microfluidic structures of arbitrary 3D geometries using femtosecond laser micromachining allows us to tackle the major disadvantageous effects for optimizing the mixing efficiency.

Submitted 7 October, 2019; originally announced January 2020.

Comments: 18 pages, 5 figures

arXiv:1911.11107 [pdf]

doi 10.1038/s41467-020-18041-3

Possible itinerant excitations and quantum spin state transitions in the effective spin-1/2 triangular-lattice antiferromagnet Na$_2$BaCo(PO$_4$)$_2$

Authors: N. Li, Q. Huang, X. Y. Yue, W. J. Chu, Q. Chen, E. S. Choi, X. Zhao, H. D. Zhou, X. F. Sun

Abstract: The most fascinating feature of certain two-dimensional (2D) gapless quantum spin liquids (QSLs) is that their spinon excitations behave like the fermionic carriers of a paramagnetic metal. The spinon Fermi surface is then expected to produce a linear increase of the thermal conductivity with temperature that should manifest via a residual value ($κ_0/T$) in the zero-temperature limit. However, this linear-in-T behavior has been reported for very few QSL candidates. Here, we studied the ultralow-temperature thermal conductivity of an effective spin-1/2 triangular QSL candidate Na$_2$BaCo(PO$_4$)$_2$, which has an antiferromagnetic order at very low temperature ($T_N \sim$ 148 mK), and observed a finite $κ_0/T$ extrapolated from the data above $T_N$. Moreover, while approaching zero temperature, it exhibits a series of quantum spin state transitions with applied field along the $c$ axis. These observations indicate that Na$_2$BaCo(PO$_4$)$_2$ possibly behaves as a gapless QSL with itinerant spin excitations above $T_N$ and its strong quantum spin fluctuations persist below $T_N$.

Submitted 16 September, 2020; v1 submitted 25 November, 2019; originally announced November 2019.

Comments: 24 pages, 5 figures, with Supplementary Information

Journal ref: Nature Commun. 11, 4216 (2020)

arXiv:1911.00148 [pdf, ps, other]

A reduced-order modeling approach for electron transport in molecular junctions

Authors: Weiqi Chu, Xiantao Li

Abstract: To describe non-equilibrium transport processes in a quantum device with infinite baths, we propose to formulate the problems as a reduced-order problem. Starting with the Liouville-von Neumann equation for the density-matrix, the reduced-order technique yields a finite system with open boundary conditions. We show that with appropriate choices of subspaces, the reduced model can be obtained systematically from the Petrov-Galerkin projection. The self-energy associated with the bath emerges naturally. The results from the numerical experiments indicate that the reduced models are able to capture both the transient and steady states.

Submitted 31 October, 2019; originally announced November 2019.

arXiv:1910.13596 [pdf]

doi 10.1103/PhysRevA.101.043404

Extreme nonlinear Raman interaction of an ultrashort nitrogen ion laser with an impulsively excited molecular wavepacket

Authors: Zhaoxiang Liu, Jinping Yao, Haisu Zhang, Bo Xu, Jinming Chen, Fangbo Zhang, Zhihao Zhang, Yuexin Wan, Wei Chu, Zhenhua Wang, Ya Cheng

Abstract: We report generation of cascaded rotational Raman scattering up to the 58th order in coherently excited CO_2 molecules. The high-order Raman scattering, which produces a quasiperiodic frequency comb with more than 600 sidebands, is obtained using an intense femtosecond laser to impulsively excite rotational coherence and the femtosecond-laser-induced N_2^+ lasing to generate cascaded Raman signals. The novel configuration allows this experiment to be performed with a single femtosecond laser beam at free-space standoff locations. It is revealed that the efficient spectral extension of Raman signals is attributed to the specific spectra-temporal structures of N_2^+ lasing, the ideal spatial overlap of femtosecond laser and N_2^+ lasing, and the guiding effect of molecular alignment. The Raman spectrum extending above 2000 cm^-1 naturally corresponds to a femtosecond pulse train due to the periodic revivals of molecular rotational wavepackets.

Submitted 29 October, 2019; originally announced October 2019.

Comments: 17 pages, 4 figures

Journal ref: Phys. Rev. A 101, 043404 (2020)

arXiv:1910.09697 [pdf, other]

doi 10.1063/1.5132958

Transition from Intrinsic to Extrinsic Anomalous Hall Effect in the Ferromagnetic Weyl Semimetal PrAlGe$_{1-x}$Si$_x$

Authors: Hung-Yu Yang, Bahadur Singh, Baozhu Lu, Cheng-Yi Huang, Faranak Bahrami, Wei-Chi Chu, David Graf, Shin-Ming Huang, Baokai Wang, Hsin Lin, Darius Torchinsky, Arun Bansil, Fazel Tafti

Abstract: Recent reports of a large anomalous Hall effect (AHE) in ferromagnetic Weyl semimetals (FM WSM) have led to a resurgence of interest in this enigmatic phenomenon. However, due to a lack of tunable materials, the interplay between the intrinsic mechanism caused by Berry curvature and extrinsic mechanisms due to scattering remains unclear in FM WSMs. In this contribution, we present a thorough investigation of both the extrinsic and intrinsic AHE in a new family of FM WSMs, PrAlGe$_{1-x}$Si$_x$, where $x$ can be tuned continuously. From DFT calculations, we show that the two end members, PrAlGe and PrAlSi, have different Fermi surfaces but similar Weyl node structures. Experimentally, we observe moderate changes in the anomalous Hall coefficient ($R_S$) but significant changes in the ordinary Hall coefficient ($R_0$) in PrAlGe$_{1-x}$Si$_x$ as a function of $x$, confirming a change of Fermi surface. By comparing the magnitude of $R_0$ and $R_S$, we identify two regimes; $|R_0|<|R_S|$ when $x\le0.5$ and $|R_0|>|R_S|$ when $x>0.5$. Through a detailed scaling analysis, we discover a universal anomalous Hall conductivity (AHC) from intrinsic contribution when $x\le0.5$. Such universal AHC is absent when $x>0.5$. Thus, we point out the significance of the extrinsic mechanisms in FM WSMs and report the first observation of a transition from intrinsic to extrinsic AHE in PrAlGe$_{1-x}$Si$_x$.

Submitted 21 October, 2019; originally announced October 2019.

Comments: 11 pages, 11 figures, 3 tables

Journal ref: APL Materials 8, 011111 (2020)

arXiv:1910.08711 [pdf, other]

Correlation Maximized Structural Similarity Loss for Semantic Segmentation

Authors: Shuai Zhao, Boxi Wu, Wenqing Chu, Yao Hu, Deng Cai

Abstract: Most semantic segmentation models treat semantic segmentation as a pixel-wise classification task and use a pixel-wise classification error as their optimization criterion. However, the pixel-wise error ignores the strong dependencies among the pixels in an image, which limits the performance of the model. Several ways to incorporate the structure information of the objects have been investigated, e.g., conditional random fields (CRF), image structure priors based methods, and generative adversarial network (GAN). Nevertheless, these methods usually require extra model branches or additional memories, and some of them show limited improvements. In contrast, we propose a simple yet effective structural similarity loss (SSL) to encode the structure information of the objects, which only requires a few additional computational resources in the training phase. Inspired by the widely-used structural similarity (SSIM) index in image quality assessment, we use the linear correlation between two images to quantify their structural similarity. The goal of the proposed SSL is to pay more attention to the positions whose associated predictions lead to a low degree of linear correlation between two corresponding regions in the ground truth map and the predicted map. Thus the model can achieve a strong structural similarity between the two maps through minimizing the SSL over the whole map. The experimental results demonstrate that our method can achieve substantial and consistent improvements in performance on the PASCAL VOC 2012 and Cityscapes datasets. The code will be released soon.

Submitted 19 October, 2019; originally announced October 2019.

arXiv:1909.03405 [pdf, other]

Symmetric Regularization based BERT for Pair-wise Semantic Reasoning

Authors: Weidi Xu, Xingyi Cheng, Kunlong Chen, Wei Wang, Bin Bi, Ming Yan, Chen Wu, Luo Si, Wei Chu, Taifeng Wang

Abstract: The ability of semantic reasoning over the sentence pair is essential for many natural language understanding tasks, e.g., natural language inference and machine reading comprehension. A recent significant improvement in these tasks comes from BERT. As reported, the next sentence prediction (NSP) in BERT, which learns the contextual relationship between two sentences, is of great significance for downstream problems with sentence-pair input. Despite the effectiveness of NSP, we suggest that NSP still lacks the essential signal to distinguish between entailment and shallow correlation. To remedy this, we propose to augment the NSP task to a 3-class categorization task, which includes a category for previous sentence prediction (PSP). The involvement of PSP encourages the model to focus on the informative semantics to determine the sentence order, thereby improving the ability of semantic understanding. This simple modification yields remarkable improvement against vanilla BERT. To further incorporate the document-level information, the scope of NSP and PSP is expanded into a broader range, i.e., NSP and PSP also include close but nonsuccessive sentences, the noise of which is mitigated by the label-smoothing technique. Both qualitative and quantitative experimental results demonstrate the effectiveness of the proposed method. Our method consistently improves the performance on the NLI and MRC benchmarks, including the challenging HANS dataset, suggesting that the document-level task is still promising for the pre-training.

Submitted 17 June, 2021; v1 submitted 8 September, 2019; originally announced September 2019.

Comments: 8 pages, 3 figures, 6 tables

arXiv:1909.02218 [pdf, other]

doi 10.1109/TIP.2018.2859820

A Better Way to Attend: Attention with Trees for Video Question Answering

Authors: Hongyang Xue, Wenqing Chu, Zhou Zhao, Deng Cai

Abstract: We propose a new attention model for video question answering. The main idea of the attention models is to locate the most informative parts of the visual data. The attention mechanisms are quite popular these days. However, most existing visual attention mechanisms regard the question as a whole. They ignore the word-level semantics where each word can have different attentions and some words need no attention. Neither do they consider the semantic structure of the sentences. Although the Extended Soft Attention (E-SA) model for video question answering leverages the word-level attention, it performs poorly on long question sentences. In this paper, we propose the heterogeneous tree-structured memory network (HTreeMN) for video question answering. Our proposed approach is based upon the syntax parse trees of the question sentences. The HTreeMN treats the words differently: the visual words are processed with an attention module and the verbal ones are not. It also utilizes the semantic structure of the sentences by combining the neighbors based on the recursive structure of the parse trees. The understandings of the words and the videos are propagated and merged from leaves to the root. Furthermore, we build a hierarchical attention mechanism to distill the attended features. We evaluate our approach on two datasets. The experimental results show the superiority of our HTreeMN model over the other attention models, especially on complex questions. Our code is available at https://github.com/ZJULearning/TreeAttention

Submitted 5 September, 2019; originally announced September 2019.

Comments: 12 pages

Journal ref: IEEE Transactions on Image Processing (Volume: 27, Issue: 11, Nov. 2018)

arXiv:1909.00911 [pdf, other]

doi 10.1073/pnas.1915333117

Investigations of the Underlying Mechanisms of HIF-1α and CITED2 Binding to TAZ1

Authors: Wen-Ting Chu, Xiakun Chu, Jin Wang

Abstract: The TAZ1 domain of CREB binding protein is crucial for transcriptional regulation and recognizes multiple targets. The interactions between TAZ1 and its specific targets are related to the cellular hypoxic negative feedback regulation. Previous experiments reported that one of the TAZ1 targets CITED2 is an efficient competitor of another target HIF-1α. Here by developing the structure-based models of TAZ1 complexes we have uncovered the underlying mechanisms of the competitions between HIF-1α and CITED2 binding to TAZ1. Our results are consistent with the experimental hypothesis on the competition mechanisms and the apparent affinity. In addition, the simulations prove the dominant position of forming TAZ1-CITED2 complex in both thermodynamics and kinetics. For thermodynamics, TAZ1-CITED2 is the lowest basin located on the free energy surface of binding in the ternary system. For kinetics, the results suggest that CITED2 binds to TAZ1 faster than HIF-1α. Besides, the analysis of contact map and f values in this study will be helpful for further experiments on TAZ1 systems.

Submitted 2 September, 2019; originally announced September 2019.

Comments: 12 pages, 6 figures

arXiv:1909.00399 [pdf]

doi 10.1364/OL.44.005953

Efficient Electro-optical Tuning of Optical Frequency Microcomb on a Monolithically Integrated High-Q Lithium Niobate Microdisk

Authors: Zhiwei Fang, Haipeng Luo, Jintian Lin, Min Wang, Jianhao Zhang, Rongbo Wu, Junxia Zhou, Wei Chu, Tao Lu, Ya Cheng

Abstract: We demonstrate efficient tuning of a monolithically integrated lithium niobate (LN) microdisk optical frequency microcomb. Utilizing the high optical quality (Q) factor (i.e., Q~7.1*10^6) of the microdisk, the microcomb spans over a spectral bandwidth of ~200 nm at a pump power as low as 20.4 mW. Combining the large electro-optic coefficient of LN and optimum design of the geometry of microelectrodes, we demonstrate electro-optical tuning of the comb with a spectral range of 400 pm and a tuning efficiency of ~38 pm/100V.

Submitted 1 September, 2019; originally announced September 2019.

arXiv:1908.05908 [pdf, other]

BERT-Based Multi-Head Selection for Joint Entity-Relation Extraction

Authors: Weipeng Huang, Xingyi Cheng, Taifeng Wang, Wei Chu

Abstract: In this paper, we report our method for the Information Extraction task in 2019 Language and Intelligence Challenge. We incorporate BERT into the multi-head selection framework for joint entity-relation extraction. This model extends existing approaches from three perspectives. First, BERT is adopted as a feature extraction layer at the bottom of the multi-head selection framework. We further optimize BERT by introducing a semantic-enhanced task during BERT pre-training. Second, we introduce a large-scale Baidu Baike corpus for entity recognition pre-training, which amounts to weakly supervised learning since there is no actual named entity label. Third, soft label embedding is proposed to effectively transmit information between entity recognition and relation extraction. Combining these three contributions, we enhance the information extracting ability of the multi-head selection model and achieve F1-score 0.876 on testset-1 with a single model. By ensembling four variants of our model, we finally achieve F1 score 0.892 (1st place) on testset-1 and F1 score 0.8924 (2nd place) on testset-2.

Submitted 26 September, 2019; v1 submitted 16 August, 2019; originally announced August 2019.

Comments: To appear at NLPCC 2019

arXiv:1908.01856 [pdf]

doi 10.1364/OL.44.004698

Fabrication of a multifunctional photonic integrated chip on lithium niobate on insulator using femtosecond laser assisted chemo-mechanical polish

Authors: Rongbo Wu, Jintian Lin, Min Wang, Zhiwei Fang, Wei Chu, Jianhao Zhang, Junxia Zhou, Ya Cheng

Abstract: We report fabrication of a multifunctional photonic integrated chip on lithium niobate on insulator (LNOI), which is achieved by femtosecond laser assisted chemo-mechanical polish. We demonstrate a high extinction ratio beam splitter, a 1 * 6 optical switch, and a balanced 3 * 3 interferometer on the fabricated chip by reconfiguring the microelectrode array integrated with the multifunctional photonic circuit.

Submitted 8 July, 2019; originally announced August 2019.

Comments: 6 pages, 5 figures

arXiv:1908.00154 [pdf, other]

doi 10.1103/PhysRevResearch.3.023170

Dirac state switching in transition metal diarsenides

Authors: Gyanendra Dhakal, M. Mofazzel Hosen, Wei-Chi Chu, Bahadur Singh, Klauss Dimitri, BaoKai Wang, Firoza Kabir, Christopher Sims, Sabin Regmi, William Neff, Dariusz Kaczorowski, Arun Bansil, Madhab Neupane

Abstract: Topological Dirac and Weyl semimetals, which support low-energy quasiparticles in condensed matter physics, are currently attracting intense interest due to exotic physical properties such as large magnetoresistance and high carrier mobilities. Transition metal diarsenides such as MoAs2 and WAs2 have been reported to harbor very high magnetoresistance suggesting the possible existence of a topological quantum state, although this conclusion remains dubious. Here, based on systematic angle-resolved photoemission spectroscopy (ARPES) measurements and parallel first-principles calculations, we investigate the electronic properties of TAs2 (T = Mo, W). Importantly, clear evidence for switching the single-Dirac cone surface state in MoAs2 with the cleaving plane is observed, whereas a Dirac state is not observed in WAs2 despite its high magnetoresistance. Our study thus reveals the key role of the terminated plane in a low-symmetry system, and provides a new perspective on how termination can drive dramatic changes in electronic structures.

Submitted 31 July, 2019; originally announced August 2019.

Comments: 8 pages, 4 figures

Journal ref: Phys. Rev. Research 3, 023170 (2021)

arXiv:1907.11188 [pdf]

High-precision measurement of a propagation loss of single-mode optical waveguides on lithium niobate on insulator

Authors: Jintian Lin, Junxia Zhou, Rongbo Wu, Min Wang, Zhiwei Fang, Wei Chu, Jianhao Zhang, Lingling Qiao, Ya Cheng

Abstract: We demonstrate fabrication of single-mode optical waveguides on lithium niobate on insulator (LNOI) by optical patterning combined with chemo-mechanical polishing. The fabricated LNOI waveguides have a nearly symmetric mode profile of a mode field size of ~2.5 micron (full-width at half maximum). We develop a high-precision measurement approach by which the single mode waveguides are characterized to have a propagation loss of ~0.042 dB/cm.

Submitted 7 June, 2019; originally announced July 2019.

Comments: 4 pages, 5 figures

arXiv:1907.07011 [pdf, other]

Improving Semantic Segmentation via Dilated Affinity

Authors: Boxi Wu, Shuai Zhao, Wenqing Chu, Zheng Yang, Deng Cai

Abstract: Introducing explicit constraints on the structural predictions has been an effective way to improve the performance of semantic segmentation models. Existing methods are mainly based on insufficient hand-crafted rules that only partially capture the image structure, and some methods can also suffer from the efficiency issue. As a result, most of the state-of-the-art fully convolutional networks did not adopt these techniques. In this work, we propose a simple, fast yet effective method that exploits structural information through direct supervision with minor additional expense. To be specific, our method explicitly requires the network to predict semantic segmentation as well as dilated affinity, which is a sparse version of pair-wise pixel affinity. The capability of telling the relationships between pixels is directly built into the model and enhances the quality of segmentation in two stages. 1) Joint training with dilated affinity can provide robust feature representations and thus lead to finer segmentation results. 2) The extra output of affinity information can be further utilized to refine the original segmentation with a fast propagation process. Consistent improvements are observed on various benchmark datasets when applying our framework to the existing state-of-the-art model. Codes will be released soon.

Submitted 26 July, 2019; v1 submitted 16 July, 2019; originally announced July 2019.

Comments: 10 pages, 5 figures, under review of NIPS2019

arXiv:1907.01794 [pdf, ps, other]

Two efficient gradient methods with approximately optimal stepsizes based on regularization models for unconstrained optimization

Authors: Zexian Liu, Wangli Chu, Hongwei Liu

Abstract: It is widely accepted that the stepsize is of great significance to the gradient method. Two efficient gradient methods with approximately optimal stepsizes mainly based on regularization models are proposed for unconstrained optimization. More exactly, if the objective function is not close to a quadratic function on the line segment between the current and latest iterates, regularization models are exploited carefully to generate approximately optimal stepsizes. Otherwise, quadratic approximation models are used. In particular, when the curvature is non-positive, special regularization models are developed. The convergence of the proposed methods is established under weak conditions. Extensive numerical experiments indicated the proposed method is superior to the BBQ method (SIAM J. Optim. 2021,31(4), 3068-3096) and other efficient gradient methods, and is competitive with two famous and efficient conjugate gradient software packages CG_DESCENT (5.0) (SIAM J. Optim. 16(1), 170-192, 2005) and CGOPT (1.0) (SIAM J. Optim. 23(1), 296-320, 2013) for the CUTEr library. Due to the surprising efficiency, we believe that gradient methods with approximately optimal stepsizes can become strong candidates for large-scale unconstrained optimization.

Submitted 20 January, 2022; v1 submitted 3 July, 2019; originally announced July 2019.

arXiv:1906.11981 [pdf, other]

Convolution Based Spectral Partitioning Architecture for Hyperspectral Image Classification

Authors: Ringo S. W. Chu, Ho-Cheung Ng, Xiwei Wang, Wayne Luk

Abstract: Hyperspectral images (HSIs) can distinguish materials with high number of spectral bands, which is widely adopted in remote sensing applications and benefits in high accuracy land cover classifications. However, HSIs processing are tangled with the problem of high dimensionality and limited amount of labelled data. To address these challenges, this paper proposes a deep learning architecture using three dimensional convolutional neural networks with spectral partitioning to perform effective feature extraction. We conduct experiments using Indian Pines and Salinas scenes acquired by NASA Airborne Visible/Infra-Red Imaging Spectrometer. In comparison to prior results, our architecture shows competitive performance for classification results over current methods.

Submitted 27 June, 2019; originally announced June 2019.

Comments: Accepted for publication in IGARSS'2019

arXiv:1906.11834 [pdf, other]

Optimizing CNN-based Hyperspectral Image Classification on FPGAs

Authors: Shuanglong Liu, Ringo S. W. Chu, Xiwei Wang, Wayne Luk

Abstract: Hyperspectral image (HSI) classification has been widely adopted in applications involving remote sensing imagery analysis which require high classification accuracy and real-time processing speed. Methods based on Convolutional neural networks (CNNs) have been proven to achieve state-of-the-art accuracy in classifying HSIs. However, CNN models are often too computationally intensive to achieve real-time response due to the high dimensional nature of HSI, compared to traditional methods such as Support Vector Machines (SVMs). Besides, previous CNN models used in HSI are not specially designed for efficient implementation on embedded devices such as FPGAs. This paper proposes a novel CNN-based algorithm for HSI classification which takes into account hardware efficiency. A customized architecture which enables the proposed algorithm to be mapped effectively onto FPGA resources is then proposed to support real-time on-board classification with low power consumption. Implementation results show that our proposed accelerator on a Xilinx Zynq 706 FPGA board achieves more than 70x faster than an Intel 8-core Xeon CPU and 3x faster than an NVIDIA GeForce 1080 GPU. Compared to previous SVM-based FPGA accelerators, we achieve comparable processing speed but provide a much higher classification accuracy.

Submitted 27 June, 2019; originally announced June 2019.

Comments: This article is accepted for publication at ARC'2019

arXiv:1905.05091 [pdf, other]

Weakly-supervised Caricature Face Parsing through Domain Adaptation

Authors: Wenqing Chu, Wei-Chih Hung, Yi-Hsuan Tsai, Deng Cai, Ming-Hsuan Yang

Abstract: A caricature is an artistic form of a person's picture in which certain striking characteristics are abstracted or exaggerated in order to create a humor or sarcasm effect. For numerous caricature related applications such as attribute recognition and caricature editing, face parsing is an essential pre-processing step that provides a complete facial structure understanding. However, current state-of-the-art face parsing methods require large amounts of labeled data on the pixel-level and such process for caricature is tedious and labor-intensive. For real photos, there are numerous labeled datasets for face parsing. Thus, we formulate caricature face parsing as a domain adaptation problem, where real photos play the role of the source domain, adapting to the target caricatures. Specifically, we first leverage a spatial transformer based network to enable shape domain shifts. A feed-forward style transfer network is then utilized to capture texture-level domain gaps. With these two steps, we synthesize face caricatures from real photos, and thus we can use parsing ground truths of the original photos to learn the parsing model. Experimental results on the synthetic and real caricatures demonstrate the effectiveness of the proposed domain adaptation algorithm. Code is available at: https://github.com/ZJULearning/CariFaceParsing .

Submitted 13 May, 2019; originally announced May 2019.

Comments: Accepted in ICIP 2019, code and model are available at https://github.com/ZJULearning/CariFaceParsing

arXiv:1904.04027 [pdf]

Three-dimensional laser printing of macro-scale glass objects at a micro-scale resolution

Authors: Peng Wang, Wei Chu, Wenbo Li, Yuanxin Tan, Fang Liu, Min Wang, Zhe Wang, Jia Qi, **tian Lin, Fangbo Zhang, Zhanshan Wang, Ya Cheng

Abstract: Three-dimensional (3D) printing has allowed for production of geometrically complex 3D objects with extreme flexibility, which is currently undergoing rapid expansions in terms of materials, functionalities, as well as areas of application. When attempting to print 3D microstructures in glass, femtosecond laser induced chemical etching (FLICE) has proved itself a powerful approach. Here, we demonstrate fabrication of macro-scale 3D glass objects of large heights up to ~3.8 cm with a well-balanced (i.e., lateral vs longitudinal) spatial resolution of ~20 μm. The remarkable accomplishment is achieved by revealing an unexplored regime in the interaction of ultrafast laser pulses with fused silica which results in aberration-free focusing of the laser pulses deeply inside fused silica.

Submitted 11 February, 2019; originally announced April 2019.

Comments: 13 pages, 7 figures

arXiv:1903.04190 [pdf, other]

Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning

Authors: Weipeng Huang, Xingyi Cheng, Kunlong Chen, Taifeng Wang, Wei Chu

Abstract: The ambiguous annotation criteria lead to divergence of Chinese Word Segmentation (CWS) datasets in various granularities. Multi-criteria Chinese word segmentation aims to capture various annotation criteria among datasets and leverage their common underlying knowledge. In this paper, we propose a domain adaptive segmenter to exploit diverse criteria of various datasets. Our model is based on Bidirectional Encoder Representations from Transformers (BERT), which is responsible for introducing open-domain knowledge. Private and shared projection layers are proposed to capture domain-specific knowledge and common knowledge, respectively. We also optimize computational efficiency via distillation, quantization, and compiler optimization. Experiments show that our segmenter outperforms the previous state of the art (SOTA) models on 10 CWS datasets with superior efficiency.

Submitted 9 October, 2020; v1 submitted 11 March, 2019; originally announced March 2019.

Comments: Accepted at COLING 2020

arXiv:1903.04124 [pdf, other]

Singing voice conversion with non-parallel data

Authors: Xin Chen, Wei Chu, **xi Guo, Ning Xu

Abstract: Singing voice conversion is a task to convert a song sang by a source singer to the voice of a target singer. In this paper, we propose using a parallel data free, many-to-one voice conversion technique on singing voices. A phonetic posterior feature is first generated by decoding singing voices through a robust Automatic Speech Recognition Engine (ASR). Then, a trained Recurrent Neural Network (RNN) with a Deep Bidirectional Long Short Term Memory (DBLSTM) structure is used to model the mapping from person-independent content to the acoustic features of the target person. F0 and aperiodic are obtained through the original singing voice, and used with acoustic features to reconstruct the target singing voice through a vocoder. In the obtained singing voice, the targeted and sourced singers sound similar. To our knowledge, this is the first study that uses non parallel data to train a singing voice conversion system. Subjective evaluations demonstrate that the proposed method effectively converts singing voices.

Submitted 11 March, 2019; originally announced March 2019.

Comments: Accepted to MIPR 2019

arXiv:1903.00468 [pdf]

doi 10.3390/nano9030439

Environmental Remediation Applications of Carbon Nanotube and Graphene Oxide: Adsorption and Catalysis

Authors: Yanqing Wang, Can Panl, Adavan Kiliyankil Vipin, Ling Sun, Wei Chu

Abstract: Environmental issues such as the wastewater have influenced each aspect of our lives. Coupling the existing remediation solutions with exploring new functional carbon nanomaterials (e.g. carbon nanotube, graphene oxide, graphene) by various perspectives shall open up a new venue to understand the environmental issues, phenomenon and find out the ways to get along with the nature. This review makes an attempt to provide an overview of potential environmental remediation solutions to the diverse challenges happening by using low-dimensional carbon nanomaterials and their composites as adsorbents, catalysts or catalysts support towards for the social sustainability.

Submitted 25 February, 2019; originally announced March 2019.

Comments: accepted review paper

arXiv:1902.10136 [pdf, other]

Observation of autoionization dynamics and sub-cycle quantum beating in electronic molecular wave packets

Authors: M. Reduzzi, W. -C. Chu, C. Feng, A. Dubrouil, J. Hummert, F. Calegari, F. Frassetto, L. Poletto, O. Kornilov, M. Nisoli, C. -D. Lin, G. Sansone

Abstract: The coherent interaction with ultrashort light pulses is a powerful strategy for monitoring and controlling the dynamics of wave packets in all states of matter. As light presents an oscillation period of a few femtoseconds ($T=2.6$~fs in the near infrared spectral range), the fundamental light-matter interaction occurs on the sub-cycle timescale, i.e. in a few hundred attoseconds. In this work, we resolve the dynamics of autoionizing states on the femtosecond timescale and observe the sub-cycle evolution of a coherent electronic wave packet in a diatomic molecule, exploiting a tunable ultrashort extreme ultraviolet pulse and a synchronized infrared field. The experimental observations are based on measuring the variations of the extreme ultraviolet radiation transmitted through the molecular gas. The different mechanisms contributing to the wave packet dynamics are investigated through theoretical simulations and a simple three level model. The method is general and can be extended to the investigation of more complex systems.

Submitted 26 February, 2019; originally announced February 2019.

arXiv:1901.10404 [pdf]

doi 10.1073/pnas.1819512116

Higher superconducting transition temperature by breaking the universal pressure relation

Authors: Liangzi Deng, Yong** Zheng, Zheng Wu, Shuyuan Huyan, Hung-Cheng Wu, Yifan Nie, Kyeongjae Cho, Ching-Wu Chu

Abstract: By investigating the bulk superconducting state via dc magnetization measurements, we have discovered a common resurgence of the superconductive transition temperatures (Tcs) of the monolayer Bi2Sr2CuO6+δ (Bi2201) and bilayer Bi2Sr2CaCu2O8+δ (Bi2212) to beyond the maximum Tcs (Tc-maxs) predicted by the universal relation between Tc and doping (p) or pressure (P) at higher pressures. The Tc of under-doped Bi2201 initially increases from 9.6 K at ambient to a peak at ~ 23 K at ~ 26 GPa and then drops as expected from the universal Tc-P relation. However, at pressures above ~ 40 GPa, Tc rises rapidly without any sign of saturation up to ~ 30 K at ~ 51 GPa. Similarly, the Tc for the slightly overdoped Bi2212 increases after passing a broad valley between 20-36 GPa and reaches ~ 90 K without any sign of saturation at ~ 56 GPa. We have therefore attributed this Tc-resurgence to a possible pressure-induced electronic transition in the cuprate compounds due to a charge transfer between the Cu 3d_(x^2-y^2) and the O 2p bands projected from a hybrid bonding state, leading to an increase of the density of states at the Fermi level, in agreement with our density functional theory calculations. Similar Tc-P behavior has also been reported in the trilayer Bi2Sr2Ca2Cu3O10+δ (Bi2223). These observations suggest that higher Tcs than those previously reported for the layered cuprate high temperature superconductors can be achieved by breaking away from the universal Tc-P relation through the application of higher pressures.

Submitted 29 January, 2019; originally announced January 2019.

Comments: 13 pages, including 5 figures

arXiv:1812.10661 [pdf]

doi 10.1016/j.apsusc.2019.04.211

Polarization-insensitive space-selective etching in fused silica induced by picosecond laser irradiation

Authors: Xiaolong Li, Jian Xu, Zijie Lin, Jia Qi, Peng Wang, Wei Chu, Zhiwei Fang, Zhenhua Wang, Zhifang Chai, Ya Cheng

Abstract: It is well known that when the fused silica is irradiated with focused femtosecond laser beams, space selective chemical etching can be achieved. The etching rate depends sensitively on the polarization of the laser. Surprisingly, we observe that by chirping the Fourier-transform-limited femtosecond laser pulses to picosecond pulses, the polarization dependence of the etching rate disappears, whereas an efficient etching rate can still be maintained. Observation with a scanning electron microscope reveals that the chirped pulses can induce interconnected nanocracks in the irradiated areas which facilitates efficient introduction of the etchant into the microchannel. The reported technology is of great use for fabrication of three-dimensional (3D) microfluidic systems and glass-based 3D printing.

Submitted 27 December, 2018; originally announced December 2018.

Comments: 9 pages, 5 figures, 1 table

arXiv:1812.07126 [pdf, other]

BandNet: A Neural Network-based, Multi-Instrument Beatles-Style MIDI Music Composition Machine

Authors: Yichao Zhou, Wei Chu, Sam Young, Xin Chen

Abstract: In this paper, we propose a recurrent neural network (RNN)-based MIDI music composition machine that is able to learn musical knowledge from existing Beatles' songs and generate music in the style of the Beatles with little human intervention. In the learning stage, a sequence of stylistically uniform, multiple-channel music samples was modeled by a RNN. In the composition stage, a short clip of randomly-generated music was used as a seed for the RNN to start music score prediction. To form structured music, segments of generated music from different seeds were concatenated together. To improve the quality and structure of the generated music, we integrated music theory knowledge into the model, such as controlling the spacing of gaps in the vocal melody, normalizing the timing of chord changes, and requiring notes to be related to the song's key (C major, for example). This integration improved the quality of the generated music as verified by a professional composer. We also conducted a subjective listening test that showed our generated music was close to original music by the Beatles in terms of style similarity, professional quality, and interestingness. Generated music samples are at https://goo.gl/uaLXoB.

Submitted 17 December, 2018; originally announced December 2018.

arXiv:1811.10158 [pdf, other]

Reinforcement Learning for Uplift Modeling

Authors: Chenchen Li, Xiang Yan, Xiaotie Deng, Yuan Qi, Wei Chu, Le Song, Junlong Qiao, Jianshan He, Junwu Xiong

Abstract: Uplift modeling aims to directly model the incremental impact of a treatment on an individual response. In this work, we address the problem from a new angle and reformulate it as a Markov Decision Process (MDP). We conducted extensive experiments on both a synthetic dataset and real-world scenarios, and showed that our method can achieve significant improvement over previous methods.

Submitted 4 February, 2019; v1 submitted 25 November, 2018; originally announced November 2018.

arXiv:1811.08611 [pdf, other]

A Novel Integrated Framework for Learning both Text Detection and Recognition

Authors: Wanchen Sui, Qing Zhang, Jun Yang, Wei Chu

Abstract: In this paper, we propose a novel integrated framework for learning both text detection and recognition. For most of the existing methods, detection and recognition are treated as two isolated tasks and trained separately, since parameters of detection and recognition models are different and two models target to optimize their own loss functions during individual training processes. In contrast to those methods, by sharing model parameters, we merge the detection model and recognition model into a single end-to-end trainable model and train the joint model for two tasks simultaneously. The shared parameters not only help effectively reduce the computational load in inference process, but also improve the end-to-end text detection-recognition accuracy. In addition, we design a simpler and faster sequence learning method for the recognition network based on a succession of stacked convolutional layers without any recurrent structure, this is proved feasible and dramatically improves inference speed. Extensive experiments on different datasets demonstrate that the proposed method achieves very promising results.

Submitted 21 November, 2018; originally announced November 2018.

Showing 101–150 of 391 results for author: Chu, W