-
Adaptive Attribute and Structure Subspace Clustering Network
Authors:
Zhihao Peng,
Hui Liu,
Yuheng Jia,
Junhui Hou
Abstract:
Deep self-expressiveness-based subspace clustering methods have demonstrated effectiveness. However, existing works only consider the attribute information to conduct the self-expressiveness, which may limit the clustering performance. In this paper, we propose a novel adaptive attribute and structure subspace clustering network (AASSC-Net) to simultaneously consider the attribute and structure information in an adaptive graph fusion manner. Specifically, we first exploit an auto-encoder to represent input data samples with latent features for the construction of an attribute matrix. We also construct a mixed signed and symmetric structure matrix to capture the local geometric structure underlying data samples. Then, we perform self-expressiveness on the constructed attribute and structure matrices to learn their affinity graphs separately. Finally, we design a novel attention-based fusion module to adaptively leverage these two affinity graphs to construct a more discriminative affinity graph. Extensive experimental results on commonly used benchmark datasets demonstrate that our AASSC-Net significantly outperforms state-of-the-art methods. In addition, we conduct comprehensive ablation studies to discuss the effectiveness of the designed modules. The code will be publicly available at https://github.com/ZhihaoPENG-CityU.
Submitted 22 April, 2022; v1 submitted 28 September, 2021;
originally announced September 2021.
-
Gutzwiller approximation approach to the SU(4) $t$-$J$ model
Authors:
Jia-Cheng He,
Jie Hou,
Yan Chen
Abstract:
We develop the Gutzwiller approximation method to obtain the renormalized Hamiltonian of the SU(4) $t$-$J$ model with the corresponding renormalization factors. Subsequently, a mean-field theory is employed on the renormalized Hamiltonian of the model on the honeycomb lattice under the scenario of a cooperative condensation of carriers moving in the resonating valence bond state of flavors. In particular, we find that the extended $s$-wave superconducting state is more favorable than the $d\pm id$-wave superconducting state in the doping range close to quarter filling. The pairing states of the SU(4) case reveal the property that the spin-singlet pairing and the spin-triplet pairing can coexist simultaneously. Our results might provide new insights into the twisted bilayer graphene system.
Submitted 4 June, 2022; v1 submitted 11 September, 2021;
originally announced September 2021.
-
Quality assurance test and Failure Analysis of SiPM Arrays of GECAM Satellites
Authors:
D. L. Zhang,
M. Gao,
X. L. Sun,
X. Q. Li,
Z. H. An,
X. Y. Wen,
C. Cai,
Z. Chang,
G. Chen,
C. Chen,
Y. Y. Du,
R. Gao,
K. Gong,
D. Y. Guo,
J. J. He,
D. J. Hou,
Y. G. Li,
C. Y. Li,
G. Li,
L. Li,
X. F. Li,
M. S. Li,
X. H. Liang,
X. J. Liu,
Y. Q. Liu
, et al. (23 additional authors not shown)
Abstract:
The Gravitational wave high-energy Electromagnetic Counterpart All-sky Monitor (GECAM) satellite consists of two small satellites. Each GECAM payload contains 25 gamma-ray detectors (GRD) and 8 charged particle detectors (CPD). GRD is the main detector, which can detect gamma-rays and particles and localize Gamma-Ray Bursts (GRB), while CPD is used to help GRD discriminate between gamma-ray bursts and charged particle bursts. The GRD makes use of lanthanum bromide (LaBr3) crystals read out by SiPM. As all available SiPM devices are commercial grade, quality assurance tests need to be performed in accordance with the aerospace specifications. In this paper, we present the results of quality assurance tests, especially a detailed mechanism analysis of failed devices during the development of GECAM. This paper also summarizes the application experience of commercial-grade SiPM devices in aerospace payloads, and provides suggestions for forthcoming SiPM space applications.
Submitted 9 December, 2021; v1 submitted 1 September, 2021;
originally announced September 2021.
-
GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer
Authors:
Shuaicheng Li,
Qianggang Cao,
Lingbo Liu,
Kunlin Yang,
Shinan Liu,
Jun Hou,
Shuai Yi
Abstract:
Group activity recognition is a crucial yet challenging problem, whose core lies in fully exploring spatial-temporal interactions among individuals and generating reasonable group representations. However, previous methods either model spatial and temporal information separately, or directly aggregate individual features to form group features. To address these issues, we propose a novel group activity recognition network termed GroupFormer. It captures spatial-temporal contextual information jointly to augment the individual and group representations effectively with a clustered spatial-temporal transformer. Specifically, our GroupFormer has three appealing advantages: (1) A tailor-modified Transformer, Clustered Spatial-Temporal Transformer, is proposed to enhance the individual representation and group representation. (2) It models the spatial and temporal dependencies integrally and utilizes decoders to build the bridge between the spatial and temporal information. (3) A clustered attention mechanism is utilized to dynamically divide individuals into multiple clusters for better learning activity-aware semantic representations. Moreover, experimental results show that the proposed framework outperforms state-of-the-art methods on the Volleyball dataset and Collective Activity dataset. Code is available at https://github.com/xueyee/GroupFormer.
Submitted 28 August, 2021;
originally announced August 2021.
-
How well is angular momentum accretion modelled in semi-analytic galaxy formation models?
Authors:
Jun Hou,
Cedric G. Lacey,
Carlos S. Frenk
Abstract:
Gas cooling and accretion in haloes delivers mass and angular momentum onto galaxies. In this work, we investigate the accuracy of the modelling of this important process in several different semi-analytic (SA) galaxy formation models (GALFORM, L-GALAXIES and MORGANA) through comparisons with a hydrodynamical simulation performed with the moving-mesh code AREPO. Both SA models and the simulation were run without any feedback or metal enrichment, in order to focus on the cooling and accretion process. All of the SA models considered here assume that gas cools from a spherical halo. We found that the assumption that the gas conserves its angular momentum when moving from the virial radius, $r_{\rm vir}$, to the central region of the halo, $r\sim 0.1 r_{\rm vir}$, is approximately consistent with the results from our simulation, in which gas typically retains $70-80\%$ of its angular momentum during this process. We also found that, compared to the simulation, the MORGANA model tends to overestimate the mean specific angular momentum of cooled-down gas, the L-GALAXIES model also tends to overestimate this in low-redshift massive haloes, while the two older GALFORM models tend to underestimate the angular momentum. In general, the predictions of the new GALFORM cooling model developed by Hou et al. agree the best with the simulation.
Submitted 25 August, 2021;
originally announced August 2021.
-
Superpixel-guided Discriminative Low-rank Representation of Hyperspectral Images for Classification
Authors:
Shujun Yang,
Junhui Hou,
Yuheng Jia,
Shaohui Mei,
Qian Du
Abstract:
In this paper, we propose a novel classification scheme for the remotely sensed hyperspectral image (HSI), namely SP-DLRR, by comprehensively exploring its unique characteristics, including the local spatial information and low-rankness. SP-DLRR is mainly composed of two modules, i.e., the classification-guided superpixel segmentation and the discriminative low-rank representation, which are iteratively conducted. Specifically, by utilizing the local spatial information and incorporating the predictions from a typical classifier, the first module segments pixels of an input HSI (or its restoration generated by the second module) into superpixels. According to the resulting superpixels, the pixels of the input HSI are then grouped into clusters and fed into our novel discriminative low-rank representation model with an effective numerical solution. Such a model is capable of increasing the intra-class similarity by suppressing the spectral variations locally while promoting the inter-class discriminability globally, leading to a restored HSI with more discriminative pixels. Experimental results on three benchmark datasets demonstrate the significant superiority of SP-DLRR over state-of-the-art methods, especially for the case with an extremely limited number of training pixels.
Submitted 25 October, 2021; v1 submitted 25 August, 2021;
originally announced August 2021.
-
Implicit Profiling Estimation for Semiparametric Models with Bundled Parameters
Authors:
Yucong Lin,
**hua Su,
Yang Liu,
Jue Hou,
Feifei Wang
Abstract:
Solving semiparametric models can be computationally challenging because the dimension of the parameter space may grow large with increasing sample size. The classical Newton's method becomes quite slow and unstable with intensive calculation of the large Hessian matrix and its inverse. Iterative methods that separately update parameters for the finite dimensional component and the infinite dimensional component have been developed to speed up a single iteration, but they often take more steps until convergence or even sometimes sacrifice estimation precision due to a sub-optimal update direction. We propose a computationally efficient implicit profiling algorithm that simultaneously achieves the fast iteration step of iterative methods and the optimal update direction of Newton's method by profiling out the infinite dimensional component as a function of the finite dimensional component. We devise a first order approximation when the profiling function has no explicit analytical form. We show that our implicit profiling method always solves any local quadratic programming problem in two steps. In two numerical experiments under semiparametric transformation models and GARCH-M models, we demonstrate the computational efficiency and statistical precision of our implicit profiling method.
Submitted 17 August, 2021;
originally announced August 2021.
-
Learning Dynamic Interpolation for Extremely Sparse Light Fields with Wide Baselines
Authors:
Mantang Guo,
**g **,
Hui Liu,
Junhui Hou
Abstract:
In this paper, we tackle the problem of dense light field (LF) reconstruction from sparsely-sampled ones with wide baselines and propose a learnable model, namely dynamic interpolation, to replace the commonly-used geometry warping operation. Specifically, with the estimated geometric relation between input views, we first construct a lightweight neural network to dynamically learn weights for interpolating neighbouring pixels from input views to synthesize each pixel of novel views independently. In contrast to the fixed and content-independent weights employed in the geometry warping operation, the learned interpolation weights implicitly incorporate the correspondences between the source and novel views and adapt to different image content information. Then, we recover the spatial correlation between the independently synthesized pixels of each novel view by referring to that of input views using a geometry-based spatial refinement module. We also constrain the angular correlation between the novel views through a disparity-oriented LF structure loss. Experimental results on LF datasets with wide baselines show that the reconstructed LFs achieve much higher PSNR/SSIM and preserve the LF parallax structure better than state-of-the-art methods. The source code is publicly available at https://github.com/MantangGuo/DI4SLF.
Submitted 18 August, 2021; v1 submitted 16 August, 2021;
originally announced August 2021.
-
Semantic-embedded Unsupervised Spectral Reconstruction from Single RGB Images in the Wild
Authors:
Zhiyu Zhu,
Hui Liu,
Junhui Hou,
Huanqiang Zeng,
Qingfu Zhang
Abstract:
This paper investigates the problem of reconstructing hyperspectral (HS) images from single RGB images captured by commercial cameras, without using paired HS and RGB images during training. To tackle this challenge, we propose a new lightweight and end-to-end learning-based framework. Specifically, on the basis of the intrinsic imaging degradation model of RGB images from HS images, we progressively spread the differences between input RGB images and re-projected RGB images from recovered HS images via effective unsupervised camera spectral response function estimation. To enable the learning without paired ground-truth HS images as supervision, we adopt the adversarial learning manner and boost it with a simple yet effective $\mathcal{L}_1$ gradient clipping scheme. Besides, we embed the semantic information of input RGB images to locally regularize the unsupervised learning, which is expected to promote pixels with identical semantics to have consistent spectral signatures. In addition to conducting quantitative experiments over two widely-used datasets for HS image reconstruction from synthetic RGB images, we also evaluate our method by applying recovered HS images from real RGB images to HS-based visual tracking. Extensive results show that our method significantly outperforms state-of-the-art unsupervised methods and even exceeds the latest supervised method under some settings. The source code is publicly available at https://github.com/zbzhzhy/Unsupervised-Spectral-Reconstruction.
Submitted 16 August, 2021; v1 submitted 15 August, 2021;
originally announced August 2021.
-
Deep Amended Gradient Descent for Efficient Spectral Reconstruction from Single RGB Images
Authors:
Zhiyu Zhu,
Hui Liu,
Junhui Hou,
Sen Jia,
Qingfu Zhang
Abstract:
This paper investigates the problem of recovering hyperspectral (HS) images from single RGB images. To tackle such a severely ill-posed problem, we propose a physically-interpretable, compact, efficient, and end-to-end learning-based framework, namely AGD-Net. Precisely, by taking advantage of the imaging process, we first formulate the problem explicitly based on the classic gradient descent algorithm. Then, we design a lightweight neural network with a multi-stage architecture to mimic the formed amended gradient descent process, in which efficient convolution and novel spectral zero-mean normalization are proposed to effectively extract spatial-spectral features for regressing an initialization, a basic gradient, and an incremental gradient. Besides, based on the approximate low-rank property of HS images, we propose a novel rank loss to promote the similarity between the global structures of reconstructed and ground-truth HS images, which is optimized with our singular value weighting strategy during training. Moreover, AGD-Net, a single network after one-time training, is flexible to handle the reconstruction with various spectral response functions. Extensive experiments over three commonly-used benchmark datasets demonstrate that AGD-Net can improve the reconstruction quality by more than 1.0 dB on average while saving 67$\times$ parameters and 32$\times$ FLOPs, compared with state-of-the-art methods. The code will be publicly available at https://github.com/zbzhzhy/GD-Net.
Submitted 12 August, 2021;
originally announced August 2021.
-
Attention-driven Graph Clustering Network
Authors:
Zhihao Peng,
Hui Liu,
Yuheng Jia,
Junhui Hou
Abstract:
The combination of the traditional convolutional network (i.e., an auto-encoder) and the graph convolutional network has attracted much attention in clustering, in which the auto-encoder extracts the node attribute feature and the graph convolutional network captures the topological graph feature. However, the existing works (i) lack a flexible combination mechanism to adaptively fuse those two kinds of features for learning the discriminative representation and (ii) overlook the multi-scale information embedded at different layers for subsequent cluster assignment, leading to inferior clustering results. To this end, we propose a novel deep clustering method named Attention-driven Graph Clustering Network (AGCN). Specifically, AGCN exploits a heterogeneity-wise fusion module to dynamically fuse the node attribute feature and the topological graph feature. Moreover, AGCN develops a scale-wise fusion module to adaptively aggregate the multi-scale features embedded at different layers. Based on a unified optimization framework, AGCN can jointly perform feature learning and cluster assignment in an unsupervised fashion. Compared with the existing deep clustering methods, our method is more flexible and effective since it comprehensively considers the numerous and discriminative information embedded in the network and directly produces the clustering results. Extensive quantitative and qualitative results on commonly used benchmark datasets validate that our AGCN consistently outperforms state-of-the-art methods.
Submitted 11 August, 2021;
originally announced August 2021.
-
Analytic Gaussian Covariance Matrices for Galaxy $N$-Point Correlation Functions
Authors:
Jiamin Hou,
Robert N. Cahn,
Oliver H. E. Philcox,
Zachary Slepian
Abstract:
We derive analytic covariance matrices for the $N$-Point Correlation Functions (NPCFs) of galaxies in the Gaussian limit. Our results are given for arbitrary $N$ and projected onto the isotropic basis functions of Cahn & Slepian (2020), recently shown to facilitate efficient NPCF estimation. A numerical implementation of the 4PCF covariance is compared to the sample covariance obtained from a set of lognormal simulations, Quijote dark matter halo catalogues, and MultiDark-Patchy galaxy mocks, with the latter including realistic survey geometry. The analytic formalism gives reasonable predictions for the covariances estimated from mock simulations with a periodic-box geometry. Furthermore, fitting for an effective volume and number density by maximizing a likelihood based on Kullback-Leibler divergence is shown to partially compensate for the effects of a non-uniform window function.
Submitted 3 August, 2021;
originally announced August 2021.
-
A First Detection of the Connected 4-Point Correlation Function of Galaxies Using the BOSS CMASS Sample
Authors:
Oliver H. E. Philcox,
Jiamin Hou,
Zachary Slepian
Abstract:
We present an $8.1σ$ detection of the non-Gaussian 4-Point Correlation Function (4PCF) using a sample of $N_{\rm g} \approx 8\times 10^5$ galaxies from the BOSS CMASS dataset. Our measurement uses the $\mathcal{O}(N_{\rm g}^2)$ NPCF estimator of Philcox et al. (2021), including a new modification to subtract the disconnected 4PCF contribution (arising from the product of two 2PCFs) at the estimator level. This approach is unlike previous work and ensures that our signal is a robust detection of gravitationally-induced non-Gaussianity. The estimator is validated with a suite of lognormal simulations, and the analytic form of the disconnected contribution is discussed. Due to the high dimensionality of the 4PCF, data compression is required; we use a signal-to-noise-based scheme calibrated from theoretical covariance matrices to restrict to $\sim$ $100$ basis vectors. The compression has minimal impact on the detection significance and facilitates traditional $χ^2$-like analyses using a suite of mock catalogs. The significance is stable with respect to different treatments of noise in the sample covariance (arising from the limited number of mocks), but decreases to $4.7σ$ when a minimum galaxy separation of $14 h^{-1}\mathrm{Mpc}$ is enforced on the 4PCF tetrahedra (such that the statistic can be modelled more easily). The detectability of the 4PCF in the quasi-linear regime implies that it will become a useful tool in constraining cosmological and galaxy formation parameters from upcoming spectroscopic surveys.
Submitted 3 August, 2021;
originally announced August 2021.
-
Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection
Authors:
Yinmin Zhang,
Xinzhu Ma,
Shuai Yi,
Jun Hou,
Zhihui Wang,
Wanli Ouyang,
Dan Xu
Abstract:
As a crucial task of autonomous driving, 3D object detection has made great progress in recent years. However, monocular 3D object detection remains a challenging problem due to the unsatisfactory performance in depth estimation. Most existing monocular methods typically directly regress the scene depth while ignoring important relationships between the depth and various geometric elements (e.g. bounding box sizes, 3D object dimensions, and object poses). In this paper, we propose to learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection. Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised. We further implement and embed the proposed formula to enable geometry-aware deep representation learning, allowing effective 2D and 3D interactions for boosting the depth estimation. Moreover, we provide a strong baseline through addressing substantial misalignment between 2D annotation and projected boxes to ensure robust learning with the proposed geometric formula. Experiments on the KITTI dataset show that our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting. The model and code will be released at https://github.com/YinminZhang/MonoGeo.
Submitted 24 April, 2024; v1 submitted 29 July, 2021;
originally announced July 2021.
-
Divide-and-Assemble: Learning Block-wise Memory for Unsupervised Anomaly Detection
Authors:
**lei Hou,
Yingying Zhang,
Qiaoyong Zhong,
Di Xie,
Shiliang Pu,
Hong Zhou
Abstract:
Reconstruction-based methods play an important role in unsupervised anomaly detection in images. Ideally, we expect a perfect reconstruction for normal samples and poor reconstruction for abnormal samples. Since the generalizability of deep neural networks is difficult to control, existing models such as autoencoder do not work well. In this work, we interpret the reconstruction of an image as a divide-and-assemble procedure. Surprisingly, by varying the granularity of division on feature maps, we are able to modulate the reconstruction capability of the model for both normal and abnormal samples. That is, finer granularity leads to better reconstruction, while coarser granularity leads to poorer reconstruction. With proper granularity, the gap between the reconstruction error of normal and abnormal samples can be maximized. The divide-and-assemble framework is implemented by embedding a novel multi-scale block-wise memory module into an autoencoder network. Besides, we introduce adversarial learning and explore the semantic latent representation of the discriminator, which improves the detection of subtle anomaly. We achieve state-of-the-art performance on the challenging MVTec AD dataset. Remarkably, we improve the vanilla autoencoder model by 10.1% in terms of the AUROC score.
Submitted 27 July, 2021;
originally announced July 2021.
-
Video Crowd Localization with Multi-focus Gaussian Neighborhood Attention and a Large-Scale Benchmark
Authors:
Haopeng Li,
Lingbo Liu,
Kunlin Yang,
Shinan Liu,
Junyu Gao,
Bin Zhao,
Rui Zhang,
Jun Hou
Abstract:
Video crowd localization is a crucial yet challenging task, which aims to estimate exact locations of human heads in the given crowded videos. To model spatial-temporal dependencies of human mobility, we propose a multi-focus Gaussian neighborhood attention (GNA), which can effectively exploit long-range correspondences while maintaining the spatial topological structure of the input videos. In particular, our GNA can also capture the scale variation of human heads well using the equipped multi-focus mechanism. Based on the multi-focus GNA, we develop a unified neural network called GNANet to accurately locate head centers in video clips by fully aggregating spatial-temporal information via a scene modeling module and a context cross-attention module. Moreover, to facilitate future research in this field, we introduce a large-scale crowd video benchmark named VSCrowd, which consists of 60K+ frames captured in various surveillance scenarios and 2M+ head annotations. Finally, we conduct extensive experiments on three datasets including our SenseCrowd, and the experimental results show that the proposed method is capable of achieving state-of-the-art performance for both video crowd localization and counting.
Submitted 8 August, 2022; v1 submitted 19 July, 2021;
originally announced July 2021.
-
Attention-Guided Progressive Neural Texture Fusion for High Dynamic Range Image Restoration
Authors:
Jie Chen,
Zaifeng Yang,
Tsz Nam Chan,
Hui Li,
Junhui Hou,
Lap-Pui Chau
Abstract:
High Dynamic Range (HDR) imaging via multi-exposure fusion is an important task for most modern imaging platforms. In spite of recent developments in both hardware and algorithm innovations, challenges remain over content association ambiguities caused by saturation, motion, and various artifacts introduced during multi-exposure fusion such as ghosting, noise, and blur. In this work, we propose an Attention-guided Progressive Neural Texture Fusion (APNT-Fusion) HDR restoration model which aims to address these issues within one framework. An efficient two-stream structure is proposed which separately focuses on texture feature transfer over saturated regions and multi-exposure tonal and texture feature fusion. A neural feature transfer mechanism is proposed which establishes spatial correspondence between different exposures based on multi-scale VGG features in the masked saturated HDR domain for discriminative contextual clues over the ambiguous image areas. A progressive texture blending module is designed to blend the encoded two-stream features in a multi-scale and progressive manner. In addition, we introduce several novel attention mechanisms, i.e., the motion attention module detects and suppresses the content discrepancies among the reference images; the saturation attention module facilitates differentiating the misalignment caused by saturation from those caused by motion; and the scale attention module ensures texture blending consistency between different coder/decoder scales. We carry out comprehensive qualitative and quantitative evaluations and ablation studies, which validate that these novel modules work coherently under the same framework and outperform state-of-the-art methods.
Submitted 13 July, 2021;
originally announced July 2021.
-
PU-Flow: a Point Cloud Upsampling Network with Normalizing Flows
Authors:
Aihua Mao,
Zihui Du,
Junhui Hou,
Yaqi Duan,
Yong-Jin Liu,
Ying He
Abstract:
Point cloud upsampling aims to generate dense point clouds from given sparse ones, which is a challenging task due to the irregular and unordered nature of point sets. To address this issue, we present a novel deep learning-based model, called PU-Flow, which incorporates normalizing flows and weight prediction techniques to produce dense points uniformly distributed on the underlying surface. Specifically, we exploit the invertible characteristics of normalizing flows to transform points between Euclidean and latent spaces and formulate the upsampling process as an ensemble of neighbouring points in a latent space, where the ensemble weights are adaptively learned from local geometric context. Extensive experiments show that our method is competitive and, in most test cases, it outperforms state-of-the-art methods in terms of reconstruction quality, proximity-to-surface accuracy, and computation efficiency. The source code will be publicly available at https://github.com/unknownue/pu-flow.
Submitted 7 June, 2022; v1 submitted 13 July, 2021;
originally announced July 2021.
-
Primordial non-Gaussianity from the Completed SDSS-IV extended Baryon Oscillation Spectroscopic Survey I: Catalogue Preparation and Systematic Mitigation
Authors:
Mehdi Rezaie,
Ashley J. Ross,
Hee-Jong Seo,
Eva-Maria Mueller,
Will J. Percival,
Grant Merz,
Reza Katebi,
Razvan C. Bunescu,
Julian Bautista,
Joel R. Brownstein,
Etienne Burtin,
Kyle Dawson,
Héctor Gil-Marín,
Jiamin Hou,
Eleanor B. Lyke,
Axel de la Macorra,
Graziano Rossi,
Donald P. Schneider,
Pauline Zarrouk,
Gong-Bo Zhao
Abstract:
We investigate the large-scale clustering of the final spectroscopic sample of quasars from the recently completed extended Baryon Oscillation Spectroscopic Survey (eBOSS). The sample contains $343708$ objects in the redshift range $0.8<z<2.2$ and $72667$ objects with redshifts $2.2<z<3.5$, covering an effective area of $4699~{\rm deg}^{2}$. We develop a neural network-based approach to mitigate spurious fluctuations in the density field caused by spatial variations in the quality of the imaging data used to select targets for follow-up spectroscopy. Simulations are used with the same angular and radial distributions as the real data to estimate covariance matrices, perform error analyses, and assess residual systematic uncertainties. We measure the mean density contrast and cross-correlations of the eBOSS quasars against maps of potential sources of imaging systematics to address algorithm effectiveness, finding that the neural network-based approach outperforms standard linear regression. Stellar density is one of the most important sources of spurious fluctuations, and a new template constructed using data from the Gaia spacecraft provides the best match to the observed quasar clustering. The end-product from this work is a new value-added quasar catalogue with the improved weights to correct for nonlinear imaging systematic effects, which will be made public. Our quasar catalogue is used to measure the local-type primordial non-Gaussianity in our companion paper, Mueller et al. in preparation.
Submitted 25 June, 2021;
originally announced June 2021.
-
The Breaking of Geometric Constraint of Classical Dimers on the Square Lattice
Authors:
Hongxu Yao,
Jiaze Li,
Jintao Hou
Abstract:
We study a model of two-dimensional classical dimers on the square lattice with strong geometric constraints (there is exactly one bond with the nearest point for every point in the lattice). This model corresponds to the quantum dimer model suggested by D.S. Rokhsar and S.A. Kivelson (1988). We use the directed-loop algorithm to show that the system undergoes a Berezinskii-Kosterlitz-Thouless (BKT) transition at finite temperatures. After that, if we destroy the geometric constraint of dimers, the topological transition turns into a quasi-first-order transition. For the dimer updates, we also introduce a new cluster updating algorithm called the edged cluster algorithm. By this method, we succeed in rapidly traversing the winding (topological) sections uniformly and widening the effective matrical ensemble to include more topological sections.
Submitted 17 June, 2021;
originally announced June 2021.
-
Defect-free arbitrary-geometry assembly of mixed-species atom arrays
Authors:
Cheng Sheng,
Jiayi Hou,
Xiaodong He,
Kunpeng Wang,
Ruijun Guo,
Jun Zhuang,
Bahtiyar Mamat,
Peng Xu,
Min Liu,
Jin Wang,
Mingsheng Zhan
Abstract:
Optically trapped mixed-species single atom arrays with arbitrary geometries are an attractive and promising platform for various applications, because tunable quantum systems with multiple components provide extra degrees of freedom for experimental control. Here, we report the first demonstration of two-dimensional $6\times4$ dual-species atom assembly with a filling fraction of 0.88 (0.89) for $^{85}$Rb ($^{87}$Rb) atoms. This mixed-species atomic synthetic is achieved via rearranging initially randomly distributed atoms using a sorting algorithm (heuristic heteronuclear algorithm) which is proposed for bottom-up atom assembly with both user-defined geometries and two-species atom number ratios. Our fully tunable hybrid-atom system of scalable advantages is a good starting point for high-fidelity quantum logic, many-body quantum simulation and forming defect-free single molecule arrays.
Submitted 10 June, 2021;
originally announced June 2021.
-
Seismic Inverse Modeling Method based on Generative Adversarial Network
Authors:
Pengfei Xie,
YanShu Yin,
JiaGen Hou,
Mei Chen,
Lixin Wang
Abstract:
Seismic inverse modeling is a common method in reservoir prediction and it plays a vital role in the exploration and development of oil and gas. Conventional seismic inversion methods are difficult to combine with complicated and abstract knowledge of the geological mode, and their uncertainty is difficult to assess. This paper proposes an inversion modeling method based on a generative adversarial network (GAN) that is consistent with geology, well logs, and seismic data. GAN is one of the most promising generative model algorithms; it extracts the spatial structure and abstract features of training images. The trained GAN can reproduce models with a specific mode. In our test, 1000 models were generated in 1 second. Based on the trained GAN after assessment, the optimal model can be calculated through a Bayesian inversion framework. Results show that the inversion models conform to the observation data and have low uncertainty while being generated quickly. This seismic inverse modeling method increases the efficiency and quality of inversion iteration. It is worth studying and applying to the fusion of seismic data and geological knowledge.
Submitted 8 June, 2021;
originally announced June 2021.
-
Occlusion-aware Unsupervised Learning of Depth from 4-D Light Fields
Authors:
Jing Jin,
Junhui Hou
Abstract:
Depth estimation is a fundamental issue in 4-D light field processing and analysis. Although recent supervised learning-based light field depth estimation methods have significantly improved the accuracy and efficiency of traditional optimization-based ones, these methods rely on the training over light field data with ground-truth depth maps which are challenging to obtain or even unavailable for real-world light field data. Besides, due to the inevitable gap (or domain difference) between real-world and synthetic data, they may suffer from serious performance degradation when generalizing the models trained with synthetic data to real-world data. By contrast, we propose an unsupervised learning-based method, which does not require ground-truth depth as supervision during training. Specifically, based on the basic knowledge of the unique geometry structure of light field data, we present an occlusion-aware strategy to improve the accuracy on occlusion areas, in which we explore the angular coherence among subsets of the light field views to estimate initial depth maps, and utilize a constrained unsupervised loss to learn their corresponding reliability for final depth prediction. Additionally, we adopt a multi-scale network with a weighted smoothness loss to handle the textureless areas. Experimental results on synthetic data show that our method can significantly shrink the performance gap between the previous unsupervised method and supervised ones, and produce depth maps with comparable accuracy to traditional methods with obviously reduced computational cost. Moreover, experiments on real-world datasets show that our method can avoid the domain shift problem presented in supervised methods, demonstrating the great potential of our method.
Submitted 23 January, 2022; v1 submitted 6 June, 2021;
originally announced June 2021.
-
ENCORE: An $\mathcal{O}(N_{\rm g}^2)$ Estimator for Galaxy $N$-Point Correlation Functions
Authors:
Oliver H. E. Philcox,
Zachary Slepian,
Jiamin Hou,
Craig Warner,
Robert N. Cahn,
Daniel J. Eisenstein
Abstract:
We present a new algorithm for efficiently computing the $N$-point correlation functions (NPCFs) of a 3D density field for arbitrary $N$. This can be applied both to a discrete spectroscopic galaxy survey and a continuous field. By expanding the statistics in a separable basis of isotropic functions built from spherical harmonics, the NPCFs can be estimated by counting pairs of particles in space, leading to an algorithm with complexity $\mathcal{O}(N_{\rm g}^2)$ for $N_{\rm g}$ particles, or $\mathcal{O}\left(N_\mathrm{FFT}\log N_\mathrm{FFT}\right)$ when using a Fast Fourier Transform with $N_\mathrm{FFT}$ grid-points. In practice, the rate-limiting step for $N>3$ will often be the summation of the histogrammed spherical harmonic coefficients, particularly if the number of radial and angular bins is large. In this case, the algorithm scales linearly with $N_{\rm g}$. The approach is implemented in the ENCORE code, which can compute the 3PCF, 4PCF, 5PCF, and 6PCF of a BOSS-like galaxy survey in $\sim$ $100$ CPU-hours, including the corrections necessary for non-uniform survey geometries. We discuss the implementation in depth, along with its GPU acceleration, and provide practical demonstration on realistic galaxy catalogs. Our approach can be straightforwardly applied to current and future datasets to unlock the potential of constraining cosmology from the higher-point functions.
Submitted 13 October, 2021; v1 submitted 18 May, 2021;
originally announced May 2021.
-
arXiv:2105.06465
[pdf]
physics.optics
cond-mat.mes-hall
cond-mat.mtrl-sci
cond-mat.other
physics.app-ph
Self-Hybridized Polaritonic Emission from Layered Perovskites
Authors:
Surendra B. Anantharaman,
Christopher E. Stevens,
Jason Lynch,
Baokun Song,
Jin Hou,
Huiqin Zhang,
Kiyoung Jo,
Pawan Kumar,
Jean-Christophe Blancon,
Aditya D. Mohite,
Joshua R. Hendrickson,
Deep Jariwala
Abstract:
Light-matter coupling in excitonic materials has been the subject of intense investigation due to emergence of new excitonic materials. Two-dimensional layered hybrid organic/inorganic perovskites (2D HOIPs) support strongly bound excitons at room-temperatures with some of the highest oscillator strengths and electric loss tangents among the known excitonic materials. Here, we report strong light-matter coupling in Ruddlesden-Popper phase 2D-HOIPs crystals without the necessity of an external cavity. We report concurrent occurrence of multiple-orders of hybrid light-matter states via both reflectance and luminescence spectroscopy in thick (> 100 nm) crystals and near-unity absorption in thin (< 20 nm) crystals. We observe resonances with quality factors > 250 in hybridized exciton-polaritons and identify a linear correlation between exciton-polariton mode splitting and extinction coefficient of the various 2D-HOIPs. Our work opens the door to studying polariton dynamics in self-hybridized and open cavity systems with broad applications in optoelectronics and photochemistry.
Submitted 13 May, 2021;
originally announced May 2021.
-
Gait Characterization in Duchenne Muscular Dystrophy (DMD) Using a Single-Sensor Accelerometer: Classical Machine Learning and Deep Learning Approaches
Authors:
Albara Ah Ramli,
Xin Liu,
Kelly Berndt,
Erica Goude,
Jiahui Hou,
Lynea B. Kaethler,
Rex Liu,
Amanda Lopez,
Alina Nicorici,
Corey Owens,
David Rodriguez,
Jane Wang,
Huanle Zhang,
Daniel Aranki,
Craig M. McDonald,
Erik K. Henricson
Abstract:
Differences in gait patterns of children with Duchenne muscular dystrophy (DMD) and typically-developing (TD) peers are visible to the eye, but quantifications of those differences outside of the gait laboratory have been elusive. In this work, we measured vertical, mediolateral, and anteroposterior acceleration using a waist-worn iPhone accelerometer during ambulation across a typical range of velocities. Fifteen TD and fifteen DMD children from 3-16 years of age underwent eight walking/running activities, including five 25-meter walk/run speed-calibration tests at a slow walk to running speeds (SC-L1 to SC-L5), a 6-minute walk test (6MWT), a 100-meter fast-walk/jog/run (100MRW), and a free walk (FW). For clinical anchoring purposes, participants completed a North Star Ambulatory Assessment (NSAA). We extracted temporospatial gait clinical features (CFs) and applied multiple machine learning (ML) approaches to differentiate between DMD and TD children using extracted temporospatial gait CFs and raw data. Extracted temporospatial gait CFs showed reduced step length and a greater mediolateral component of total power (TP) consistent with shorter strides and Trendelenburg-like gait commonly observed in DMD. ML approaches using temporospatial gait CFs and raw data varied in effectiveness at differentiating between DMD and TD controls at different speeds, with an accuracy of up to 100%. We demonstrate that by using ML with accelerometer data from a consumer-grade smartphone, we can capture DMD-associated gait characteristics in toddlers to teens.
Submitted 10 July, 2023; v1 submitted 12 May, 2021;
originally announced May 2021.
-
Dissipative superfluid hydrodynamics for the unitary Fermi gas
Authors:
Jiaxun Hou,
Thomas Schaefer
Abstract:
In this work we establish constraints on the temperature dependence of the shear viscosity $η$ in the superfluid phase of a dilute Fermi gas in the unitary limit. Our results are based on analyzing experiments that measure the aspect ratio of a deformed cloud after release from an optical trap. We discuss how to apply the two-fluid formalism to the unitary gas, and provide a suitable parametrization of the equation of state. We show that in expansion experiments the difference between the normal and superfluid velocities remains small, and can be treated as a perturbation. We find that expansion experiments favor a shear viscosity that decreases significantly in the superfluid regime. Using an exponential parametrization we find $η(T_c/(2T_F))< 0.37η(T_c/T_F)$, where $T_c$ is the critical temperature, $T_F$ is the local Fermi temperature of the gas.
Submitted 31 July, 2021; v1 submitted 11 May, 2021;
originally announced May 2021.
-
Nonlinear dynamics in a synthetic momentum state lattice
Authors:
Fangzhao Alex An,
Bhuvanesh Sundar,
Junpeng Hou,
Xi-Wang Luo,
Eric J. Meier,
Chuanwei Zhang,
Kaden R. A. Hazzard,
Bryce Gadway
Abstract:
The scope of analog simulation in atomic, molecular, and optical systems has expanded greatly over the past decades. Recently, the idea of synthetic dimensions -- in which transport occurs in a space spanned by internal or motional states coupled by field-driven transitions -- has played a key role in this expansion. While approaches based on synthetic dimensions have led to rapid advances in single-particle Hamiltonian engineering, strong interaction effects have been conspicuously absent from most synthetic dimensions platforms. Here, in a lattice of coupled atomic momentum states, we show that atomic interactions result in large and qualitative changes to dynamics in the synthetic dimension. We explore how the interplay of nonlinear interactions and coherent tunneling enriches the dynamics of a one-band tight-binding model, giving rise to macroscopic self-trapping and phase-driven Josephson dynamics with a nonsinusoidal current-phase relationship, which can be viewed as stemming from a nonlinear band structure arising from interactions.
Submitted 10 May, 2021;
originally announced May 2021.
-
Surrogate Assisted Semi-supervised Inference for High Dimensional Risk Prediction
Authors:
Jue Hou,
Zijian Guo,
Tianxi Cai
Abstract:
Risk modeling with EHR data is challenging due to a lack of direct observations on the disease outcome, and the high dimensionality of the candidate predictors. In this paper, we develop a surrogate assisted semi-supervised-learning (SAS) approach to risk modeling with high dimensional predictors, leveraging a large unlabeled dataset on candidate predictors and surrogates of outcome, as well as a small labeled dataset with annotated outcomes. The SAS procedure borrows information from surrogates along with candidate predictors to impute the unobserved outcomes via a sparse working imputation model with moment conditions to achieve robustness against mis-specification in the imputation model and a one-step bias correction to enable interval estimation for the predicted risk. We demonstrate that the SAS procedure provides valid inference for the predicted risk derived from a high dimensional working model, even when the underlying risk prediction model is dense and the risk model is mis-specified. We present an extensive simulation study to demonstrate the superiority of our SSL approach compared to existing supervised methods. We apply the method to derive genetic risk prediction of type-2 diabetes mellitus using an EHR biobank cohort.
Submitted 3 May, 2021;
originally announced May 2021.
-
Underwater Image Enhancement via Medium Transmission-Guided Multi-Color Space Embedding
Authors:
Chongyi Li,
Saeed Anwar,
Junhui Hou,
Runmin Cong,
Chunle Guo,
Wenqi Ren
Abstract:
Underwater images suffer from color casts and low contrast due to wavelength- and distance-dependent attenuation and scattering. To solve these two degradation issues, we present an underwater image enhancement network via medium transmission-guided multi-color space embedding, called Ucolor. Concretely, we first propose a multi-color space encoder network, which enriches the diversity of feature representations by incorporating the characteristics of different color spaces into a unified structure. Coupled with an attention mechanism, the most discriminative features extracted from multiple color spaces are adaptively integrated and highlighted. Inspired by underwater imaging physical models, we design a medium transmission (indicating the percentage of the scene radiance reaching the camera)-guided decoder network to enhance the response of the network towards quality-degraded regions. As a result, our network can effectively improve the visual quality of underwater images by exploiting multiple color spaces embedding and the advantages of both physical model-based and learning-based methods. Extensive experiments demonstrate that our Ucolor achieves superior performance against state-of-the-art methods in terms of both visual quality and quantitative metrics.
Submitted 27 April, 2021;
originally announced April 2021.
-
Pri3D: Can 3D Priors Help 2D Representation Learning?
Authors:
Ji Hou,
Saining Xie,
Benjamin Graham,
Angela Dai,
Matthias Nießner
Abstract:
Recent advances in 3D perception have shown impressive progress in understanding geometric structures of 3D shapes and even scenes. Inspired by these advances in geometric understanding, we aim to imbue image-based perception with representations learned under geometric constraints. We introduce an approach to learn view-invariant, geometry-aware representations for network pre-training, based on multi-view RGB-D data, that can then be effectively transferred to downstream 2D tasks. We propose to employ contrastive learning under both multi-view image constraints and image-geometry constraints to encode 3D priors into learned 2D representations. This results not only in improvement over 2D-only representation learning on the image-based tasks of semantic segmentation, instance segmentation, and object detection on real-world indoor datasets, but moreover, provides significant improvement in the low data regime. We show a significant improvement of 6.0% on semantic segmentation on full data as well as 11.9% on 20% data against baselines on ScanNet.
Submitted 18 December, 2021; v1 submitted 22 April, 2021;
originally announced April 2021.
-
HMM-Free Encoder Pre-Training for Streaming RNN Transducer
Authors:
Lu Huang,
Jingyu Sun,
Yufeng Tang,
Junfeng Hou,
Jinkun Chen,
Jun Zhang,
Zejun Ma
Abstract:
This work describes an encoder pre-training procedure using frame-wise labels to improve the training of the streaming recurrent neural network transducer (RNN-T) model. Streaming RNN-T trained from scratch usually performs worse than non-streaming RNN-T. Although it is common to address this issue through pre-training components of RNN-T with other criteria or frame-wise alignment guidance, the alignment is not easily available in an end-to-end manner. In this work, frame-wise alignment, used to pre-train the streaming RNN-T's encoder, is generated without using an HMM-based system. Therefore an all-neural framework equipping HMM-free encoder pre-training is constructed. This is achieved by expanding the spikes of a CTC model to their left/right blank frames, and two expanding strategies are proposed. To the best of our knowledge, this is the first work to simulate HMM-based frame-wise labels using a CTC model for pre-training. Experiments conducted on LibriSpeech and MLS English tasks show that the proposed pre-training procedure, compared with random initialization, reduces the WER by a relative 5%~11% and the emission latency by 60 ms. Besides, the method is lexicon-free, so it is friendly to new languages without a manually designed lexicon.
Submitted 10 June, 2021; v1 submitted 2 April, 2021;
originally announced April 2021.
-
3D Wireless Channel Modeling for Multi-layer Network on Chip
Authors:
Chao Ren,
Jingze Hou,
Biao Pan
Abstract:
The resource constraints and accuracy requirements for Internet of Things (IoT) memory chips need three-dimensional (3D) monolithic integrated circuits, of which the increasing stack layers (currently more than 176) also cause excessive energy consumption and increasing wire length. In this paper, a novel 3D wireless network on chips (3DWiNoCs) model that transmits signals directly to the destination in an arbitrary layer is proposed and characterized. However, due to the reflection and refraction characteristics in each layer, the complex and diverse wireless paths in 3DWiNoC add great difficulty to the channel characterization. To facilitate the modeling in the massive-layer NoC situation, both boundary-less and boundary-constrained 3DWiNoC models are proposed, of which the channel gain can be obtained by a computationally efficient approximate algorithm. These 3DWiNoC models with the approximation algorithm can well characterize the 3DWiNoC channel in terms of complete reflection and refraction characteristics, and avoid massive wired connections, high power consumption of cross-layer communication, and high complexity of 3DWiNoC channel characterization. Numerical results show that: 1) the difference rate between the two models is lower than 0.001% (signal transmitted through 20 layers); 2) the channel gain decreases sharply if refract time increases; and 3) the approximate algorithm can achieve an acceptable accuracy (error rate lower than 0.1%).
Submitted 8 April, 2021;
originally announced April 2021.
-
Learning Image Aesthetic Assessment from Object-level Visual Components
Authors:
Jingwen Hou,
Sheng Yang,
Weisi Lin,
Baoquan Zhao,
Yuming Fang
Abstract:
As it is said by Van Gogh, great things are done by a series of small things brought together. Aesthetic experience arises from the aggregation of underlying visual components. However, most existing deep image aesthetic assessment (IAA) methods over-simplify the IAA process by failing to model image aesthetics with clearly-defined visual components as building blocks. As a result, the connection between resulting aesthetic predictions and underlying visual components is mostly invisible and hard to be explicitly controlled, which limits the model in both performance and interpretability. This work aims to model image aesthetics from the level of visual components. Specifically, object-level regions detected by a generic object detector are defined as visual components, namely object-level visual components (OVCs). Then generic features representing OVCs are aggregated for the aesthetic prediction based upon proposed object-level and graph attention mechanisms, which dynamically determines the importance of individual OVCs and relevance between OVC pairs, respectively. Experimental results confirm the superiority of our framework over previous relevant methods in terms of SRCC and PLCC on the aesthetic rating distribution prediction. Besides, quantitative analysis is done towards model interpretation by observing how OVCs contribute to aesthetic predictions, whose results are found to be supported by psychology on aesthetics and photography rules. To the best of our knowledge, this is the first attempt at the interpretation of a deep IAA model.
Submitted 4 April, 2021;
originally announced April 2021.
-
Prediction of three-phase relative permeabilities of Berea sandstone using lattice Boltzmann method
Authors:
Sheng Li,
Fei Jiang,
Bei Wei,
Jian Hou,
Haihu Liu
Abstract:
Three-phase flows through a pore network of Berea sandstone are studied numerically under the critical interfacial tension condition. Results show that the relative permeability of each fluid increases as its own saturation increases. The specific interfacial length between wetting and non-wetting fluids decreases monotonically with increasing saturation of the intermediate-wetting fluid, while the other two specific interfacial lengths exhibit a non-monotonic variation. As the wetting (non-wetting) fluid becomes less wetting (non-wetting), the relative permeability of the wetting fluid increases monotonically, while the other two relative permeabilities show a non-monotonic trend. Due to the presence of a spreading layer, the specific interfacial length between wetting and non-wetting fluids always stabilizes at a low level. As the viscosity ratio of wetting (non-wetting) to intermediate-wetting fluids increases, the relative permeability of the wetting (non-wetting) fluid increases. With the viscosity ratio deviating from unity, the phase interfaces become increasingly unstable, leading to an increased specific interfacial length.
Submitted 15 March, 2021;
originally announced March 2021.
-
Risk Prediction with Imperfect Survival Outcome Information from Electronic Health Records
Authors:
Stephanie F. Chan,
Jue Hou,
Xuan Wang,
Tianxi Cai
Abstract:
Readily available proxies for the time of disease onset, such as the time of the first diagnostic code, can lead to substantial risk prediction error when analyses are based on poor proxies. Due to the lack of detailed documentation and the labor intensiveness of manual annotation, it is often only feasible to ascertain for a small subset the current status of the disease by a follow-up time rather than the exact time. In this paper, we aim to develop risk prediction models for the onset time that efficiently leverage both a small number of labels on current status and a large number of unlabeled observations on imperfect proxies. Under a semiparametric transformation model for onset and a highly flexible measurement error model for proxy onset time, we propose a semi-supervised risk prediction method that efficiently combines information from proxies and limited labels. From an initial estimator solely based on the labelled subset, we perform a one-step correction with the full data augmenting against a mean zero rank correlation score derived from the proxies. We establish the consistency and asymptotic normality of the proposed semi-supervised estimator and provide a resampling procedure for interval estimation. Simulation studies demonstrate that the proposed estimator performs well in finite samples. We illustrate the proposed estimator by developing a genetic risk prediction model for obesity using data from Partners Biobank Electronic Health Records (EHR).
Submitted 7 March, 2021;
originally announced March 2021.
-
Self-supervised Symmetric Nonnegative Matrix Factorization
Authors:
Yuheng Jia,
Hui Liu,
Junhui Hou,
Sam Kwong,
Qingfu Zhang
Abstract:
Symmetric nonnegative matrix factorization (SNMF) has been demonstrated to be a powerful method for data clustering. However, SNMF is mathematically formulated as a non-convex optimization problem, making it sensitive to the initialization of variables. Inspired by ensemble clustering that aims to seek a better clustering result from a set of clustering results, we propose self-supervised SNMF (S$^3$NMF), which is capable of boosting clustering performance progressively by taking advantage of the sensitivity to initialization characteristic of SNMF, without relying on any additional information. Specifically, we first perform SNMF repeatedly with a random nonnegative matrix for initialization each time, leading to multiple decomposed matrices. Then, we rank the quality of the resulting matrices with adaptively learned weights, from which a new similarity matrix that is expected to be more discriminative is reconstructed for SNMF again. These two steps are iterated until the stopping criterion/maximum number of iterations is reached. We mathematically formulate S$^3$NMF as a constrained optimization problem, and provide an alternative optimization algorithm to solve it with the theoretical convergence guaranteed. Extensive experimental results on $10$ commonly used benchmark datasets demonstrate the significant advantage of our S$^3$NMF over $12$ state-of-the-art methods in terms of $5$ quantitative metrics. The source code is publicly available at https://github.com/jyh-learning/SSSNMF.
Submitted 2 March, 2021;
originally announced March 2021.
-
The NPU System for the 2020 Personalized Voice Trigger Challenge
Authors:
Jingyong Hou,
Li Zhang,
Yihui Fu,
Qing Wang,
Zhanheng Yang,
Qijie Shao,
Lei Xie
Abstract:
This paper describes the system developed by the NPU team for the 2020 personalized voice trigger challenge. Our submitted system consists of two independently trained subsystems: a small footprint keyword spotting (KWS) system and a speaker verification (SV) system. For the KWS system, a multi-scale dilated temporal convolutional (MDTC) network is proposed to detect the wake-up word (WuW). For SV system, Write something here. The KWS predicts posterior probabilities of whether an audio utterance contains WuW and estimates the location of WuW at the same time. When the posterior probability of WuW reaches a predefined threshold, the identity information of the triggered segment is determined by the SV system. On the evaluation dataset, our submitted system obtains detection costs of 0.081 and 0.091 in close talking and far-field tasks, respectively.
Submitted 26 February, 2021;
originally announced February 2021.
-
Light Field Reconstruction via Deep Adaptive Fusion of Hybrid Lenses
Authors:
Jing Jin,
Mantang Guo,
Junhui Hou,
Hui Liu,
Hongkai Xiong
Abstract:
This paper explores the problem of reconstructing high-resolution light field (LF) images from hybrid lenses, including a high-resolution camera surrounded by multiple low-resolution cameras. The performance of existing methods is still limited, as they produce either blurry results on plain textured areas or distortions around depth discontinuous boundaries. To tackle this challenge, we propose a novel end-to-end learning-based approach, which can comprehensively utilize the specific characteristics of the input from two complementary and parallel perspectives. Specifically, one module regresses a spatially consistent intermediate estimation by learning a deep multidimensional and cross-domain feature representation, while the other module warps another intermediate estimation, which maintains the high-frequency textures, by propagating the information of the high-resolution view. We finally leverage the advantages of the two intermediate estimations adaptively via the learned attention maps, leading to the final high-resolution LF image with satisfactory results on both plain textured areas and depth discontinuous boundaries. Besides, to promote the effectiveness of our method trained with simulated hybrid data on real hybrid data captured by a hybrid LF imaging system, we carefully design the network architecture and the training strategy. Extensive experiments on both real and simulated hybrid data demonstrate the significant superiority of our approach over state-of-the-art ones. To the best of our knowledge, this is the first end-to-end deep learning method for LF reconstruction from a real hybrid input. We believe our framework could potentially decrease the cost of high-resolution LF data acquisition and benefit LF data storage and transmission.
Submitted 17 June, 2023; v1 submitted 14 February, 2021;
originally announced February 2021.
-
Artificial Intelligence Advances for De Novo Molecular Structure Modeling in Cryo-EM
Authors:
Dong Si,
Andrew Nakamura,
Runbang Tang,
Haowen Guan,
Jie Hou,
Ammaar Firozi,
Renzhi Cao,
Kyle Hippe,
Minglei Zhao
Abstract:
Cryo-electron microscopy (cryo-EM) has become a major experimental technique to determine the structures of large protein complexes and molecular assemblies, as evidenced by the 2017 Nobel Prize. Although cryo-EM has been drastically improved to generate high-resolution three-dimensional (3D) maps that contain detailed structural information about macromolecules, the computational methods for using the data to automatically build structure models are lagging far behind. The traditional cryo-EM model building approach is template-based homology modeling. Manual de novo modeling is very time-consuming when no template model is found in the database. In recent years, de novo cryo-EM modeling using machine learning (ML) and deep learning (DL) has ranked among the top-performing methods in macromolecular structure modeling. Deep-learning-based de novo cryo-EM modeling is an important application of artificial intelligence, with impressive results and great potential for the next generation of molecular biomedicine. Accordingly, we systematically review the representative ML/DL-based de novo cryo-EM modeling methods. Their significance is discussed from both practical and methodological viewpoints. We also briefly describe the background of the cryo-EM data processing workflow. Overall, this review provides an introductory guide to modern research on artificial intelligence (AI) for de novo molecular structure modeling and future directions in this emerging field.
Submitted 23 February, 2021; v1 submitted 11 February, 2021;
originally announced February 2021.
-
Quantum engineering with hybrid magnonics systems and materials
Authors:
D. D. Awschalom,
C. H. R. Du,
R. He,
F. J. Heremans,
A. Hoffmann,
J. T. Hou,
H. Kurebayashi,
Y. Li,
L. Liu,
V. Novosad,
J. Sklenar,
S. E. Sullivan,
D. Sun,
H. Tang,
V. Tiberkevich,
C. Trevillian,
A. W. Tsen,
L. R. Weiss,
W. Zhang,
X. Zhang,
L. Zhao,
C. W. Zollitsch
Abstract:
Quantum technology has made tremendous strides over the past two decades with remarkable advances in materials engineering, circuit design and dynamic operation. In particular, the integration of different quantum modules has benefited from hybrid quantum systems, which provide an important pathway for harnessing the different natural advantages of complementary quantum systems and for engineering new functionalities. This review focuses on the current frontiers with respect to utilizing magnetic excitations or magnons for novel quantum functionality. Magnons are the fundamental excitations of magnetically ordered solid-state materials and provide great tunability and flexibility for interacting with various quantum modules for integration in diverse quantum systems. The concomitant rich variety of physics and material selections enables exploration of novel quantum phenomena in materials science and engineering. In addition, the relative ease of generating strong coupling and forming hybrid dynamic systems with other excitations makes hybrid magnonics a unique platform for quantum engineering. We start our discussion with circuit-based hybrid magnonic systems, which are coupled with microwave photons and acoustic phonons. Subsequently, we focus on the recent progress of magnon-magnon coupling within confined magnetic systems. Next we highlight new opportunities for understanding the interactions between magnons and nitrogen-vacancy centers for quantum sensing and implementing quantum interconnects. Lastly, we focus on the spin excitations and magnon spectra of novel quantum materials investigated with advanced optical characterization.
Submitted 5 February, 2021;
originally announced February 2021.
-
Lax Connections in $T\bar{T}$-deformed Integrable Field Theories
Authors:
Bin Chen,
Jue Hou,
Jia Tian
Abstract:
In this work, we try to construct the Lax connections of $T\bar{T}$-deformed integrable field theories in two different ways. With reasonable assumptions, we make an ansatz and find the Lax pairs in the $T\bar{T}$-deformed affine Toda theories and the principal chiral model by solving the Lax equations directly. This way is straightforward but may be hard to apply to general models. We then make use of the dynamical coordinate transformation to read the Lax connection in the deformed theory from the undeformed one. We find that once the inverse of the transformation is available, the Lax connection can be read easily. We show the construction explicitly for a few classes of scalar models, and find consistency with the ones obtained in the first way.
Submitted 2 February, 2021;
originally announced February 2021.
-
Pseudo-Goldstone Excitations in a Striped Bose-Einstein Condensate
Authors:
Guan-Qiang Li,
Xi-Wang Luo,
Junpeng Hou,
Chuanwei Zhang
Abstract:
Significant experimental progress has been made recently for observing long-sought supersolid-like states in Bose-Einstein condensates, where spatial translational symmetry is spontaneously broken by anisotropic interactions to form a stripe order. Meanwhile, the superfluid stripe ground state was also observed by applying a weak optical lattice that forces the symmetry breaking. Despite the similarity of the ground states, here we show that these two symmetry breaking mechanisms can be distinguished by their collective excitation spectra. In contrast to gapless Goldstone modes of the \textit{spontaneous} stripe state, we propose that the excitation spectra of the \textit{forced} stripe phase can provide direct experimental evidence for the long-sought gapped pseudo-Goldstone modes. We characterize the pseudo-Goldstone mode of such a lattice-induced stripe phase through its excitation spectrum and static structure factor. Our work may pave the way for exploring spontaneous and forced/approximate symmetry breaking mechanisms in different physical systems.
Submitted 17 January, 2021;
originally announced January 2021.
-
Spectroscopic observations of a flare-related coronal jet
Authors:
Q. M. Zhang,
Z. H. Huang,
Y. J. Hou,
D. Li,
Z. J. Ning,
Z. Wu
Abstract:
Coronal jets are ubiquitous in active regions (ARs) and coronal holes. In this paper, we study a coronal jet related to a C3.4 circular-ribbon flare in active region 12434 on 2015 October 16. Two minifilaments were located under a 3D fan-spine structure before the flare. The flare was generated by the eruption of one filament. The kinetic evolution of the jet was divided into two phases: a slow rise phase at a speed of $\sim$131 km s$^{-1}$ and a fast rise phase at a speed of $\sim$363 km s$^{-1}$ in the plane-of-sky. The slow rise phase may correspond to the impulsive reconnection at the breakout current sheet. The fast rise phase may correspond to magnetic reconnection at the flare current sheet. The transition between the two phases occurred at $\sim$09:00:40 UT. The blueshifted Doppler velocities of the jet in the Si {\sc iv} 1402.80 Å line range from -34 to -120 km s$^{-1}$. The accelerated high-energy electrons are composed of three groups. Those propagating upward along open field generate type \textrm{III} radio bursts, while those propagating downward produce HXR emissions and drive chromospheric condensation observed in the Si {\sc iv} line. The electrons trapped in the rising filament generate a microwave burst lasting for $\le$40 s. Bidirectional outflows at the base of the jet are manifested by significant line broadenings of the Si {\sc iv} line. The blueshifted Doppler velocities of outflows range from -13 to -101 km s$^{-1}$. The redshifted Doppler velocities of outflows range from $\sim$17 to $\sim$170 km s$^{-1}$. Our multiwavelength observations of the flare-related jet are in favor of the breakout jet model and are important for understanding the acceleration and transport of nonthermal electrons.
Submitted 17 January, 2021;
originally announced January 2021.
-
Geometric frustration produces long-sought Bose metal phase of quantum matter
Authors:
Anthony Hegg,
Jinning Hou,
Wei Ku
Abstract:
Two of the most prominent phases of bosonic matter are the superfluid with perfect flow and the insulator with no flow. A now decades-old mystery unexpectedly arose when experimental observations indicated that bosons could organize into the formation of an entirely different intervening third phase: the Bose metal with dissipative flow. The most viable theory for such a Bose metal to date invokes the use of the extrinsic property of impurity-based disorder; however, a generic intrinsic quantum Bose metal state is still lacking. We propose a universal homogeneous theory for a Bose metal in which geometric frustration confines the essential quantum coherence to a lower dimension. The result is a gapless insulator characterized by dissipative flow that vanishes in the low-energy limit. This failed insulator exemplifies a frustration-dominated regime that is only enhanced by additional scattering sources at low energy and therefore produces a Bose metal that thrives under realistic experimental conditions.
Submitted 22 November, 2021; v1 submitted 15 January, 2021;
originally announced January 2021.
-
CorrNet3D: Unsupervised End-to-end Learning of Dense Correspondence for 3D Point Clouds
Authors:
Yiming Zeng,
Yue Qian,
Zhiyu Zhu,
Junhui Hou,
Hui Yuan,
Ying He
Abstract:
Motivated by the intuition that one can transform two aligned point clouds to each other more easily and meaningfully than a misaligned pair, we propose CorrNet3D -- the first unsupervised and end-to-end deep learning-based framework -- to drive the learning of dense correspondence between 3D shapes by means of deformation-like reconstruction to overcome the need for annotated data. Specifically, CorrNet3D consists of a deep feature embedding module and two novel modules called correspondence indicator and symmetric deformer. Feeding a pair of raw point clouds, our model first learns the pointwise features and passes them into the indicator to generate a learnable correspondence matrix used to permute the input pair. The symmetric deformer, with an additional regularized loss, transforms the two permuted point clouds to each other to drive the unsupervised learning of the correspondence. The extensive experiments on both synthetic and real-world datasets of rigid and non-rigid 3D shapes show our CorrNet3D outperforms state-of-the-art methods to a large extent, including those taking meshes as input. CorrNet3D is a flexible framework in that it can be easily adapted to supervised learning if annotated data are available. The source code and pre-trained model will be available at https://github.com/ZENGYIMING-EAMON/CorrNet3D.git.
Submitted 3 May, 2021; v1 submitted 31 December, 2020;
originally announced December 2020.
-
Note on the nonrelativistic $T\bar{T}$ deformation
Authors:
Bin Chen,
Jue Hou,
Jia Tian
Abstract:
In this paper, we present our study on the $T\bar{T}$-deformation of non-relativistic complex scalar field theory. We find the closed form of the deformed Lagrangian by using the perturbation and the method of characteristics. Furthermore we compute the exact energy spectrum of the deformed free theory by using the Brillouin-Wigner perturbation theory in an appropriate regularization scheme.
Submitted 6 July, 2021; v1 submitted 27 December, 2020;
originally announced December 2020.
-
Cross-Domain Latent Modulation for Variational Transfer Learning
Authors:
Jinyong Hou,
Jeremiah D. Deng,
Stephen Cranefield,
Xuejie Ding
Abstract:
We propose a cross-domain latent modulation mechanism within a variational autoencoder (VAE) framework to enable improved transfer learning. Our key idea is to procure deep representations from one data domain and use them as a perturbation to the reparameterization of the latent variable in another domain. Specifically, deep representations of the source and target domains are first extracted by a unified inference model and aligned by employing gradient reversal. Second, the learned deep representations are cross-modulated to the latent encoding of the alternate domain. The consistency between the reconstruction from the modulated latent encoding and the generation using deep representation samples is then enforced in order to produce inter-class alignment in the latent space. We apply the proposed model to a number of transfer learning tasks including unsupervised domain adaptation and image-to-image translation. Experimental results show that our model gives competitive performance.
Submitted 21 December, 2020;
originally announced December 2020.
-
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
Authors:
Ji Hou,
Benjamin Graham,
Matthias Nießner,
Saining Xie
Abstract:
The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g. point clouds) are notoriously hard. For example, the number of scenes (e.g. indoor rooms) that can be accessed and scanned might be limited; even given sufficient data, acquiring 3D labels (e.g. instance masks) requires intensive human labor. In this paper, we explore data-efficient learning for 3D point clouds. As a first step in this direction, we propose Contrastive Scene Contexts, a 3D pre-training method that makes use of both point-level correspondences and spatial contexts in a scene. Our method achieves state-of-the-art results on a suite of benchmarks where training data or labels are scarce. Our study reveals that exhaustive labelling of 3D point clouds might be unnecessary; and remarkably, on ScanNet, even using 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance that uses full annotations.
Submitted 25 June, 2021; v1 submitted 16 December, 2020;
originally announced December 2020.
-
Clustering Ensemble Meets Low-rank Tensor Approximation
Authors:
Yuheng Jia,
Hui Liu,
Junhui Hou,
Qingfu Zhang
Abstract:
This paper explores the problem of clustering ensemble, which aims to combine multiple base clusterings to produce better performance than that of the individual one. The existing clustering ensemble methods generally construct a co-association matrix, which indicates the pairwise similarity between samples, as the weighted linear combination of the connective matrices from different base clusterings, and the resulting co-association matrix is then adopted as the input of an off-the-shelf clustering algorithm, e.g., spectral clustering. However, the co-association matrix may be dominated by poor base clusterings, resulting in inferior performance. In this paper, we propose a novel low-rank tensor approximation-based method to solve the problem from a global perspective. Specifically, by inspecting whether two samples are clustered to an identical cluster under different base clusterings, we derive a coherent-link matrix, which contains limited but highly reliable relationships between samples. We then stack the coherent-link matrix and the co-association matrix to form a three-dimensional tensor, the low-rankness property of which is further explored to propagate the information of the coherent-link matrix to the co-association matrix, producing a refined co-association matrix. We formulate the proposed method as a convex constrained optimization problem and solve it efficiently. Experimental results over 7 benchmark data sets show that the proposed model achieves a breakthrough in clustering performance, compared with 12 state-of-the-art methods. To the best of our knowledge, this is the first work to explore the potential of low-rank tensor on clustering ensemble, which is fundamentally different from previous approaches.
Submitted 16 December, 2020;
originally announced December 2020.