AutoFT: Automatic Fine-Tune for Parameters Transfer Learning in Click-Through Rate Prediction

Authors: Xiangli Yang, Qing Liu, Rong Su, Ruiming Tang, Zhirong Liu, Xiuqiang He

Abstract: Recommender systems are often asked to serve multiple recommendation scenarios or domains. Fine-tuning a pre-trained CTR model from source domains and adapting it to a target domain allows knowledge transferring. However, optimizing all the parameters of the pre-trained network may result in over-fitting if the target dataset is small and the number of parameters is large. This leads us to think of directly reusing parameters in the pre-trained model which represent more general features learned from multiple domains. However, the design of freezing or fine-tuning layers of parameters requires much manual effort since the decision highly depends on the pre-trained model and target instances. In this work, we propose an end-to-end transfer learning framework, called Automatic Fine-Tuning (AutoFT), for CTR prediction. AutoFT consists of a field-wise transfer policy and a layer-wise transfer policy. The field-wise transfer policy decides how the pre-trained embedding representations are frozen or fine-tuned based on the given instance from the target domain. The layer-wise transfer policy decides how the high-order feature representations are transferred layer by layer. Extensive experiments on two public benchmark datasets and one private industrial dataset demonstrate that AutoFT can significantly improve the performance of CTR prediction compared with state-of-the-art transferring approaches.

Submitted 9 June, 2021; originally announced June 2021.

Comments: 10 pages

arXiv:2106.00314 [pdf, other]

Dual Graph enhanced Embedding Neural Network for CTR Prediction

Authors: Wei Guo, Rong Su, Renhao Tan, Huifeng Guo, Yingxue Zhang, Zhirong Liu, Ruiming Tang, Xiuqiang He

Abstract: CTR prediction, which aims to estimate the probability that a user will click an item, plays a crucial role in online advertising and recommender system. Feature interaction modeling based and user interest mining based methods are the two kinds of most popular techniques that have been extensively explored for many years and have made great progress for CTR prediction. However, (1) feature interaction based methods which rely heavily on the co-occurrence of different features, may suffer from the feature sparsity problem (i.e., many features appear few times); (2) user interest mining based methods which need rich user behaviors to obtain user's diverse interests, are easy to encounter the behavior sparsity problem (i.e., many users have very short behavior sequences). To solve these problems, we propose a novel module named Dual Graph enhanced Embedding, which is compatible with various CTR prediction models to alleviate these two problems. We further propose a Dual Graph enhanced Embedding Neural Network (DG-ENN) for CTR prediction. Dual Graph enhanced Embedding exploits the strengths of graph representation with two carefully designed learning strategies (divide-and-conquer, curriculum-learning-inspired organized learning) to refine the embedding. We conduct comprehensive experiments on three real-world industrial datasets. The experimental results show that our proposed DG-ENN significantly outperforms state-of-the-art CTR prediction models. Moreover, when applying to state-of-the-art CTR prediction models, Dual graph enhanced embedding always obtains better performance. Further case studies prove that our proposed dual graph enhanced embedding could alleviate the feature sparsity and behavior sparsity problems. Our framework will be open-source based on MindSpore in the near future.

Submitted 8 June, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

Comments: KDD 2021

arXiv:2104.10584 [pdf, other]

Deep Learning for Click-Through Rate Estimation

Authors: Weinan Zhang, Jiarui Qin, Wei Guo, Ruiming Tang, Xiuqiang He

Abstract: Click-through rate (CTR) estimation plays as a core function module in various personalized online services, including online advertising, recommender systems, and web search etc. From 2015, the success of deep learning started to benefit CTR estimation performance and now deep CTR models have been widely applied in many industrial platforms. In this survey, we provide a comprehensive review of deep learning models for CTR estimation tasks. First, we take a review of the transfer from shallow to deep CTR models and explain why going deep is a necessary trend of development. Second, we concentrate on explicit feature interaction learning modules of deep CTR models. Then, as an important perspective on large platforms with abundant user histories, deep behavior models are discussed. Moreover, the recently emerged automated methods for deep CTR architecture design are presented. Finally, we summarize the survey and discuss the future prospects of this field.

Submitted 21 April, 2021; originally announced April 2021.

Comments: Paper accepted at IJCAI 2021 (Survey Track)

arXiv:2104.08542 [pdf, other]

ScaleFreeCTR: MixCache-based Distributed Training System for CTR Models with Huge Embedding Table

Authors: Huifeng Guo, Wei Guo, Yong Gao, Ruiming Tang, Xiuqiang He, Wenzhi Liu

Abstract: Because of the superior feature representation ability of deep learning, various deep Click-Through Rate (CTR) models are deployed in the commercial systems by industrial companies. To achieve better performance, it is necessary to train the deep CTR models on huge volume of training data efficiently, which makes speeding up the training process an essential problem. Different from the models with dense training data, the training data for CTR models is usually high-dimensional and sparse. To transform the high-dimensional sparse input into low-dimensional dense real-value vectors, almost all deep CTR models adopt the embedding layer, which easily reaches hundreds of GB or even TB. Since a single GPU cannot afford to accommodate all the embedding parameters, when performing distributed training, it is not reasonable to conduct the data-parallelism only. Therefore, existing distributed training platforms for recommendation adopt model-parallelism. Specifically, they use CPU (Host) memory of servers to maintain and update the embedding parameters and utilize GPU worker to conduct forward and backward computations. Unfortunately, these platforms suffer from two bottlenecks: (1) the latency of pull & push operations between Host and GPU; (2) parameters update and synchronization in the CPU servers. To address such bottlenecks, in this paper, we propose the ScaleFreeCTR: a MixCache-based distributed training system for CTR models. Specifically, in SFCTR, we also store huge embedding table in CPU but utilize GPU instead of CPU to conduct embedding synchronization efficiently. To reduce the latency of data transfer between both GPU-Host and GPU-GPU, the MixCache mechanism and Virtual Sparse Id operation are proposed. Comprehensive experiments and ablation studies are conducted to demonstrate the effectiveness and efficiency of SFCTR.

Submitted 11 May, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

Comments: 10 pages

arXiv:2104.07986 [pdf, other]

Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image

Authors: Cheng Yang, Jia Zheng, Xili Dai, Rui Tang, Yi Ma, Xiaojun Yuan

Abstract: Single-image room layout reconstruction aims to reconstruct the enclosed 3D structure of a room from a single image. Most previous work relies on the cuboid-shape prior. This paper considers a more general indoor assumption, i.e., the room layout consists of a single ceiling, a single floor, and several vertical walls. To this end, we first employ Convolutional Neural Networks to detect planes and vertical lines between adjacent walls. Meanwhile, estimating the 3D parameters for each plane. Then, a simple yet effective geometric reasoning method is adopted to achieve room layout reconstruction. Furthermore, we optimize the 3D plane parameters to reconstruct a geometrically consistent room layout between planes and lines. The experimental results on public datasets validate the effectiveness and efficiency of our method.

Submitted 11 October, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

Comments: To appear in WACV 2022. The first two authors contribute equally. Code is available at https://github.com/Cyang0515/NonCuboidRoom

arXiv:2104.06077 [pdf, other]

doi 10.1145/3442381.3449913

An Adversarial Imitation Click Model for Information Retrieval

Authors: Xinyi Dai, Jianghao Lin, Weinan Zhang, Shuai Li, Weiwen Liu, Ruiming Tang, Xiuqiang He, Jianye Hao, Jun Wang, Yong Yu

Abstract: Modern information retrieval systems, including web search, ads placement, and recommender systems, typically rely on learning from user feedback. Click models, which study how users interact with a ranked list of items, provide a useful understanding of user feedback for learning ranking models. Constructing "right" dependencies is the key of any successful click model. However, probabilistic graphical models (PGMs) have to rely on manually assigned dependencies, and oversimplify user behaviors. Existing neural network based methods promote PGMs by enhancing the expressive ability and allowing flexible dependencies, but still suffer from exposure bias and inferior estimation. In this paper, we propose a novel framework, Adversarial Imitation Click Model (AICM), based on imitation learning. Firstly, we explicitly learn the reward function that recovers users' intrinsic utility and underlying intentions. Secondly, we model user interactions with a ranked list as a dynamic system instead of one-step click prediction, alleviating the exposure bias problem. Finally, we minimize the JS divergence through adversarial training and learn a stable distribution of click sequences, which makes AICM generalize well across different distributions of ranked lists. A theoretical analysis has indicated that AICM reduces the exposure bias from $O(T^2)$ to $O(T)$. Our studies on a public web search dataset show that AICM not only outperforms state-of-the-art models in traditional click metrics but also achieves superior performance in addressing the exposure bias and recovering the underlying patterns of click sequences.

Submitted 19 April, 2021; v1 submitted 13 April, 2021; originally announced April 2021.

Comments: Accepted to WWW 2021

arXiv:2103.17022 [pdf, other]

Layout-Guided Novel View Synthesis from a Single Indoor Panorama

Authors: Jiale Xu, Jia Zheng, Yanyu Xu, Rui Tang, Shenghua Gao

Abstract: Existing view synthesis methods mainly focus on the perspective images and have shown promising results. However, due to the limited field-of-view of the pinhole camera, the performance quickly degrades when large camera movements are adopted. In this paper, we make the first attempt to generate novel views from a single indoor panorama and take the large camera translations into consideration. To tackle this challenging problem, we first use Convolutional Neural Networks (CNNs) to extract the deep features and estimate the depth map from the source-view image. Then, we leverage the room layout prior, a strong structural constraint of the indoor scene, to guide the generation of target views. More concretely, we estimate the room layout in the source view and transform it into the target viewpoint as guidance. Meanwhile, we also constrain the room layout of the generated target-view images to enforce geometric consistency. To validate the effectiveness of our method, we further build a large-scale photo-realistic dataset containing both small and large camera translations. The experimental results on our challenging dataset demonstrate that our method achieves state-of-the-art performance. The project page is at https://github.com/bluestyle97/PNVS.

Submitted 31 March, 2021; originally announced March 2021.

Comments: To appear in CVPR 2021

arXiv:2103.14782 [pdf, ps, other]

doi 10.1103/PhysRevApplied.17.024071

Scalable and Robust Photonic Integrated Unitary Converter

Authors: Ryota Tanomura, Rui Tang, Toshikazu Umezaki, Go Soma, Takuo Tanemura, Yoshiaki Nakano

Abstract: Optical unitary converter (OUC) that can convert a set of N mutually orthogonal optical modes into another set of arbitrary N orthogonal modes is expected to be the key device in diverse applications, including the optical communication, deep learning, and quantum computing. While various types of OUC have been demonstrated on photonic integration platforms, its sensitivity against a slight deviation in the waveguide dimension has been the crucial issue in scaling N. Here, we demonstrate that an OUC based on the concept of multi-plane light conversion (MPLC) shows outstanding robustness against waveguide deviations. Moreover, it becomes more and more insensitive to fabrication errors as we increase N, which is in clear contrast to the conventional OUC architecture, composed of 2 $\times$ 2 Mach-Zehnder interferometers. The physical origin behind this unique robustness and scalability is studied by considering a generalized OUC configuration. As a result, we reveal that the number of coupled modes in each stage plays an essential role in determining the sensitivity of the entire OUC. The maximal robustness is attained when all-to-all-coupled interferometers are employed, which are naturally implemented in MPLC-OUC.

Submitted 26 March, 2021; originally announced March 2021.

Journal ref: Physical Review Applied 2022

arXiv:2102.10262 [pdf]

One-pot green process to synthesize controllable surface terminations MXenes in molten salts

Authors: Miao Shen, Weiyan Jiang, Liang Guo, Sufang Zhao, Rui Tang, Jianqiang Wang

Abstract: Surface terminations for 2D MXene have dramatic impacts on physicochemical properties. The commonly etching methods usually introduce -F surface termination or metallic into MXene. Here, we present a new molten salt assisted electrochemical etching (MS-E-etching) method to synthesize fluorine-free Ti3C2Tx without metallics. Due to performing electrons as reaction agent, the cathode reduction and anode etching can be spatially isolated, thus no metallic presents in Ti3C2Tx product. Moreover, the Tx surface terminations can be directly modified from -Cl to -O and/or -S in one pot process. The obtained -O terminated MXenes exhibited capacitance of 225 and 205 F/g at 1 and 10 A/g, confirming high reversibility of redox reactions. This one-pot process greatly shortens the modification procedures as well as enriches the surface functional terminations. More importantly, the recovered salt after synthesis can be recycled and reused, which brands it as a green sustainable method.

Submitted 20 February, 2021; originally announced February 2021.

Comments: 12 pages, 5 figures

arXiv:2102.06125 [pdf]

doi 10.1002/wcms.1542

Artificial Intelligence Advances for De Novo Molecular Structure Modeling in Cryo-EM

Authors: Dong Si, Andrew Nakamura, Runbang Tang, Haowen Guan, Jie Hou, Ammaar Firozi, Renzhi Cao, Kyle Hippe, Minglei Zhao

Abstract: Cryo-electron microscopy (cryo-EM) has become a major experimental technique to determine the structures of large protein complexes and molecular assemblies, as evidenced by the 2017 Nobel Prize. Although cryo-EM has been drastically improved to generate high-resolution three-dimensional (3D) maps that contain detailed structural information about macromolecules, the computational methods for using the data to automatically build structure models are lagging far behind. The traditional cryo-EM model building approach is template-based homology modeling. Manual de novo modeling is very time-consuming when no template model is found in the database. In recent years, de novo cryo-EM modeling using machine learning (ML) and deep learning (DL) has ranked among the top-performing methods in macromolecular structure modeling. Deep-learning-based de novo cryo-EM modeling is an important application of artificial intelligence, with impressive results and great potential for the next generation of molecular biomedicine. Accordingly, we systematically review the representative ML/DL-based de novo cryo-EM modeling methods. And their significances are discussed from both practical and methodological viewpoints. We also briefly describe the background of cryo-EM data processing workflow. Overall, this review provides an introductory guide to modern research on artificial intelligence (AI) for de novo molecular structure modeling and future directions in this emerging field.

Submitted 23 February, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

Journal ref: Wiley Interdisciplinary Reviews: Computational Molecular Science, e1542 (2021)

arXiv:2101.04849 [pdf, other]

Probabilistic Metric Learning with Adaptive Margin for Top-K Recommendation

Authors: Chen Ma, Liheng Ma, Yingxue Zhang, Ruiming Tang, Xue Liu, Mark Coates

Abstract: Personalized recommender systems are playing an increasingly important role as more content and services become available and users struggle to identify what might interest them. Although matrix factorization and deep learning based methods have proved effective in user preference modeling, they violate the triangle inequality and fail to capture fine-grained preference information. To tackle this, we develop a distance-based recommendation model with several novel aspects: (i) each user and item are parameterized by Gaussian distributions to capture the learning uncertainties; (ii) an adaptive margin generation scheme is proposed to generate the margins regarding different training triplets; (iii) explicit user-user/item-item similarity modeling is incorporated in the objective function. The Wasserstein distance is employed to determine preferences because it obeys the triangle inequality and can measure the distance between probabilistic distributions. Via a comparison using five real-world datasets with state-of-the-art methods, the proposed model outperforms the best existing models by 4-22% in terms of recall@K on Top-K recommendation.

Submitted 12 January, 2021; originally announced January 2021.

Comments: Accepted by the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2020 Research Track)

arXiv:2012.15140 [pdf]

doi 10.1103/PhysRevMaterials.4.124204

Pressure effect on the topologically nontrivial electronic state and transport of lutecium monobismuthide

Authors: H. Gu, F. Tang, Y. -R. Ruan, J. -M. Zhang, R. -J. Tang, W. Zhao, R. Zhao, L. Zhang, Z. -D. Han, B. Qian, X. -F. Jiang, Y. Fang

Abstract: Rare-earth monopnictides are predicted to be nontrivial semimetal candidates and show pressure-induced superconductivity. Here, we grow LuBi single crystal and study the magnetization, transport behaviors and electronic band structures to reveal its topological semimetal feature and superconductivity under pressure. At 0 GPa, the quantum oscillations indicate that there are several topologically nontrivial carrier pockets around the Fermi level, among which the hole ones are isotropic in shape, while the electron ones are anisotropic and responsible for the angular magnetoresistance. Upon compression, the superconductivity emerges in the titled compound, showing a similar pressure dependence as that observed in LaBi. Our calculation suggests that the electronic band structures are robust at low- and high-pressure respectively and thus the topological features are always preserved. Besides, the nearly pressure-independent density of state in LuBi indicates that the conventional electron-phonon coupling appears to play a minor role in the superconductivity.

Submitted 30 December, 2020; originally announced December 2020.

arXiv:2012.13838 [pdf, other]

Inserting Information Bottlenecks for Attribution in Transformers

Authors: Zhiying Jiang, Raphael Tang, Ji Xin, Jimmy Lin

Abstract: Pretrained transformers achieve the state of the art across tasks in natural language processing, motivating researchers to investigate their inner mechanisms. One common direction is to understand what features are important for prediction. In this paper, we apply information bottlenecks to analyze the attribution of each feature for prediction on a black-box model. We use BERT as the example and evaluate our approach both quantitatively and qualitatively. We show the effectiveness of our method in terms of attribution and the ability to provide insight into how information flows through layers. We demonstrate that our technique outperforms two competitive methods in degradation tests on four datasets. Code is available at https://github.com/bazingagin/IBA.

Submitted 4 August, 2021; v1 submitted 26 December, 2020; originally announced December 2020.

Comments: refine formula

Journal ref: In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings (pp. 3850-3857)

arXiv:2012.13650 [pdf, ps, other]

A Theory of Updating Ambiguous Information

Authors: Rui Tang

Abstract: We introduce a new updating rule, the conditional maximum likelihood rule (CML) for updating ambiguous information. The CML formula replaces the likelihood term in Bayes' rule with the maximal likelihood of the given signal conditional on the state. We show that CML satisfies a new axiom, increased sensitivity after updating, while other updating rules do not. With CML, a decision maker's posterior is unaffected by the order in which independent signals arrive. CML also accommodates recent experimental findings on updating signals of unknown accuracy and has simple predictions on learning with such signals. We show that an information designer can almost achieve her maximal payoff with a suitable ambiguous information structure whenever the agent updates according to CML.

Submitted 25 December, 2020; originally announced December 2020.

arXiv:2012.08986 [pdf, other]

doi 10.1145/3447548.3467077

An Embedding Learning Framework for Numerical Features in CTR Prediction

Authors: Huifeng Guo, Bo Chen, Ruiming Tang, Weinan Zhang, Zhenguo Li, Xiuqiang He

Abstract: Click-Through Rate (CTR) prediction is critical for industrial recommender systems, where most deep CTR models follow an Embedding & Feature Interaction paradigm. However, the majority of methods focus on designing network architectures to better capture feature interactions while the feature embedding, especially for numerical features, has been overlooked. Existing approaches for numerical features are difficult to capture informative knowledge because of the low capacity or hard discretization based on the offline expertise feature engineering. In this paper, we propose a novel embedding learning framework for numerical features in CTR prediction (AutoDis) with high model capacity, end-to-end training and unique representation properties preserved. AutoDis consists of three core components: meta-embeddings, automatic discretization and aggregation. Specifically, we propose meta-embeddings for each numerical field to learn global knowledge from the perspective of field with a manageable number of parameters. Then the differentiable automatic discretization performs soft discretization and captures the correlations between the numerical features and meta-embeddings. Finally, distinctive and informative embeddings are learned via an aggregation function. Comprehensive experiments on two public and one industrial datasets are conducted to validate the effectiveness of AutoDis. Moreover, AutoDis has been deployed onto a mainstream advertising platform, where online A/B test demonstrates the improvement over the base model by 2.1% and 2.7% in terms of CTR and eCPM, respectively. In addition, the code of our framework is publicly available in MindSpore (https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/recommend/autodis).

Submitted 22 May, 2021; v1 submitted 16 December, 2020; originally announced December 2020.

Comments: 9 pages

arXiv:2011.12975 [pdf, ps, other]

Large-scale geometry of the saddle connection graph

Authors: Valentina Disarlo, Hui** Pan, Anja Randecker, Robert Tang

Abstract: We prove that the saddle connection graph associated to any half-translation surface is 4-hyperbolic and uniformly quasi-isometric to the regular countably infinite-valent tree. Consequently, the saddle connection graph is not quasi-isometrically rigid. We also characterise its Gromov boundary as the set of straight foliations with no saddle connections. In our arguments, we give a generalisation of the unicorn paths in the arc graph which may be of independent interest.

Submitted 9 December, 2020; v1 submitted 25 November, 2020; originally announced November 2020.

Comments: 28 pages, 9 figures; v2: corrected statement of Corollary 1.3 (not affecting any other part of the paper)

arXiv:2011.08960 [pdf, other]

Deep Serial Number: Computational Watermarking for DNN Intellectual Property Protection

Authors: Ruixiang Tang, Mengnan Du, Xia Hu

Abstract: In this paper, we present DSN (Deep Serial Number), a simple yet effective watermarking algorithm designed specifically for deep neural networks (DNNs). Unlike traditional methods that incorporate identification signals into DNNs, our approach explores a novel Intellectual Property (IP) protection mechanism for DNNs, effectively thwarting adversaries from using stolen networks. Inspired by the success of serial numbers in safeguarding conventional software IP, we propose the first implementation of serial number embedding within DNNs. To achieve this, DSN is integrated into a knowledge distillation framework, in which a private teacher DNN is initially trained. Subsequently, its knowledge is distilled and imparted to a series of customized student DNNs. Each customer DNN functions correctly only upon input of a valid serial number. Experimental results across various applications demonstrate DSN's efficacy in preventing unauthorized usage without compromising the original DNN performance. The experiments further show that DSN is resistant to different categories of watermark attacks.

Submitted 26 July, 2023; v1 submitted 17 November, 2020; originally announced November 2020.

arXiv:2011.00550 [pdf, other]

doi 10.1145/3340531.3412756

U-rank: Utility-oriented Learning to Rank with Implicit Feedback

Authors: Xinyi Dai, Jiawei Hou, Qing Liu, Yunjia Xi, Ruiming Tang, Weinan Zhang, Xiuqiang He, Jun Wang, Yong Yu

Abstract: Learning to rank with implicit feedback is one of the most important tasks in many real-world information systems where the objective is some specific utility, e.g., clicks and revenue. However, we point out that existing methods based on probabilistic ranking principle do not necessarily achieve the highest utility. To this end, we propose a novel ranking framework called U-rank that directly optimizes the expected utility of the ranking list. With a position-aware deep click-through rate prediction model, we address the attention bias considering both query-level and item-level features. Due to the item-specific attention bias modeling, the optimization for expected utility corresponds to a maximum weight matching on the item-position bipartite graph. We base the optimization of this objective in an efficient Lambdaloss framework, which is supported by both theoretical and empirical analysis. We conduct extensive experiments for both web search and recommender systems over three benchmark datasets and two proprietary datasets, where the performance gain of U-rank over state-of-the-arts is demonstrated. Moreover, our proposed U-rank has been deployed on a large-scale commercial recommender and a large improvement over the production baseline has been observed in an online A/B testing.

Submitted 1 November, 2020; originally announced November 2020.

arXiv:2010.04913 [pdf, other]

Interpretable Neural Computation for Real-World Compositional Visual Question Answering

Authors: Ruixue Tang, Chao Ma

Abstract: There are two main lines of research on visual question answering (VQA): compositional model with explicit multi-hop reasoning, and monolithic network with implicit reasoning in the latent feature space. The former excels in interpretability and compositionality but fails on real-world images, while the latter usually achieves better performance due to model flexibility and parameter efficiency. We aim to combine the two to build an interpretable framework for real-world compositional VQA. In our framework, images and questions are disentangled into scene graphs and programs, and a symbolic program executor runs on them with full transparency to select the attention regions, which are then iteratively passed to a visual-linguistic pre-trained encoder to predict answers. Experiments conducted on the GQA benchmark demonstrate that our framework outperforms the compositional prior arts and achieves competitive accuracy among monolithic ones. With respect to the validity, plausibility and distribution metrics, our framework surpasses others by a considerable margin.

Submitted 10 October, 2020; originally announced October 2020.

Comments: PRCV 2020

arXiv:2010.04881 [pdf, ps, other]

doi 10.1016/j.geomphys.2021.104148

Twilled 3-Lie algebras, generalized matched pairs of 3-Lie algebras and O-operators

Authors: Shuai Hou, Yunhe Sheng, Rong Tang

Abstract: In this paper, first we introduce the notion of a twilled 3-Lie algebra, and construct an $L_\infty$-algebra, whose Maurer-Cartan elements give rise to new twilled 3-Lie algebras by twisting. In particular, we recover the Lie $3$-algebra whose Maurer-Cartan elements are O-operators (also called relative Rota-Baxter operators) on 3-Lie algebras. Then we introduce the notion of generalized matched pairs of 3-Lie algebras using generalized representations of 3-Lie algebras, which will give rise to twilled 3-Lie algebras. The usual matched pairs of 3-Lie algebras correspond to a special class of twilled 3-Lie algebras, which we call strict twilled 3-Lie algebras. Finally, we use O-operators to construct explicit twilled 3-Lie algebras, and explain why an $r$-matrix for a 3-Lie algebra can not give rise to a double construction 3-Lie bialgebra. Examples of twilled 3-Lie algebras are given to illustrate the various interesting phenomenon.

Submitted 9 October, 2020; originally announced October 2020.

Comments: 19 pages, comments are welcome

Journal ref: J. Geom. Phys. 163 (2021.05), 104148, 1-15

arXiv:2010.00851 [pdf, ps, other]

doi 10.3390/e23111408

On the Achievable Rate Region of the $ K $-Receiver Broadcast Channels via Exhaustive Message Splitting

Authors: Rui Tang, Songjie Xie, Youlong Wu

Abstract: This paper focuses on $ K $-receiver discrete-time memoryless broadcast channels (DM-BCs) with private messages, where the transmitter wishes to convey $K$ private messages to $K$ receivers respectively. A general inner bound on the capacity region is proposed based on an exhaustive message splitting and a $K$-level modified Marton's coding. The key idea is to split every message into $ \sum_{j=1}^K {K\choose j} $ submessages each corresponding to a set of users who are assigned to recover them, and then send these submessages through codewords that are jointly typical with each other. To guarantee the joint typicality among all transmitted codewords, a sufficient condition on the subcodebooks sizes is derived through a newly establishing hierarchical covering lemma, which extends the 2-level multivariate covering lemma to the $K$-level case including $(2^{K}-1)$ random variables with more intricate dependence. As the number of auxiliary random variables and rate constraints both increase linearly with $(2^{K}-1)$, the standard Fourier-Motzkin elimination procedure becomes infeasible when $K$ is large. To tackle this problem, we obtain the final form of achievable rate region with a special observation of disjoint unions of sets that constitute the power set of $ \{1,\dots,K\}$. The proposed achievable rate region allows arbitrary input probability mass functions (pmfs) and improves over all previously known ones for $ K$-receiver ($K\geq 3$) BCs whose input pmfs should satisfy certain Markov chain(s).

Submitted 2 October, 2020; originally announced October 2020.

arXiv:2009.11096 [pdf, ps, other]

doi 10.1007/s00220-021-04032-y

The controlling $L_\infty$-algebra, cohomology and homotopy of embedding tensors and Lie-Leibniz triples

Authors: Yunhe Sheng, Rong Tang, Chenchang Zhu

Abstract: In this paper, we first construct the controlling algebras of embedding tensors and Lie-Leibniz triples, which turn out to be a graded Lie algebra and an $L_\infty$-algebra respectively. Then we introduce representations and cohomologies of embedding tensors and Lie-Leibniz triples, and show that there is a long exact sequence connecting various cohomologies. As applications, we classify infinitesimal deformations and central extensions using the second cohomology groups. Finally, we introduce the notion of a homotopy embedding tensor which will induce a Leibniz$_\infty$-algebra. We realize Kotov and Strobl's construction of an $L_\infty$-algebra from an embedding tensor, to a functor from the category of homotopy embedding tensors to that of Leibniz$_\infty$-algebras, and a functor further to that of $L_\infty$-algebras.

Submitted 21 January, 2021; v1 submitted 23 September, 2020; originally announced September 2020.

Comments: 32 pages, comments are welcome

Journal ref: Comm. Math. Phys. 386 (2021), 269-304

arXiv:2009.02147 [pdf, other]

A Practical Incremental Method to Train Deep CTR Models

Authors: Yichao Wang, Huifeng Guo, Ruiming Tang, Zhirong Liu, Xiuqiang He

Abstract: Deep learning models in recommender systems are usually trained in the batch mode, namely iteratively trained on a fixed-size window of training data. Such batch mode training of deep learning models suffers from low training efficiency, which may lead to performance degradation when the model is not produced on time. To tackle this issue, incremental learning is proposed and has received much attention recently. Incremental learning has great potential in recommender systems, as two consecutive window of training data overlap most of the volume. It aims to update the model incrementally with only the newly incoming samples from the timestamp when the model is updated last time, which is much more efficient than the batch mode training. However, most of the incremental learning methods focus on the research area of image recognition where new tasks or classes are learned over time. In this work, we introduce a practical incremental method to train deep CTR models, which consists of three decoupled modules (namely, data, feature and model module). Our method can achieve comparable performance to the conventional batch mode training with much better training efficiency. We conduct extensive experiments on a public benchmark and a private dataset to demonstrate the effectiveness of our proposed method.

Submitted 4 September, 2020; originally announced September 2020.

arXiv:2008.13517 [pdf, other]

doi 10.1145/3340531.3412754

GraphSAIL: Graph Structure Aware Incremental Learning for Recommender Systems

Authors: Yishi Xu, Yingxue Zhang, Wei Guo, Huifeng Guo, Ruiming Tang, Mark Coates

Abstract: Given the convenience of collecting information through online services, recommender systems now consume large scale data and play a more important role in improving user experience. With the recent emergence of Graph Neural Networks (GNNs), GNN-based recommender models have shown the advantage of modeling the recommender system as a user-item bipartite graph to learn representations of users and items. However, such models are expensive to train and difficult to perform frequent updates to provide the most up-to-date recommendations. In this work, we propose to update GNN-based recommender models incrementally so that the computation time can be greatly reduced and models can be updated more frequently. We develop a Graph Structure Aware Incremental Learning framework, GraphSAIL, to address the commonly experienced catastrophic forgetting problem that occurs when training a model in an incremental fashion. Our approach preserves a user's long-term preference (or an item's long-term property) during incremental model updating. GraphSAIL implements a graph structure preservation strategy which explicitly preserves each node's local structure, global structure, and self-information, respectively. We argue that our incremental training framework is the first attempt tailored for GNN based recommender systems and demonstrate its improvement compared to other incremental learning techniques on two public datasets. We further verify the effectiveness of our framework on a large-scale industrial dataset.

Submitted 1 September, 2020; v1 submitted 25 August, 2020; originally announced August 2020.

Comments: Accepted by CIKM2020 Applied Research Track

arXiv:2008.11567 [pdf, other]

Item Tagging for Information Retrieval: A Tripartite Graph Neural Network based Approach

Authors: Kelong Mao, Xi Xiao, Jieming Zhu, Biao Lu, Ruiming Tang, Xiuqiang He

Abstract: Tagging has been recognized as a successful practice to boost relevance matching for information retrieval (IR), especially when items lack rich textual descriptions. A lot of research has been done for either multi-label text categorization or image annotation. However, there is a lack of published work that targets at item tagging specifically for IR. Directly applying a traditional multi-label classification model for item tagging is sub-optimal, due to the ignorance of unique characteristics in IR. In this work, we propose to formulate item tagging as a link prediction problem between item nodes and tag nodes. To enrich the representation of items, we leverage the query logs available in IR tasks, and construct a query-item-tag tripartite graph. This formulation results in a TagGNN model that utilizes heterogeneous graph neural networks with multiple types of nodes and edges. Different from previous research, we also optimize both full tag prediction and partial tag completion cases in a unified framework via a primary-dual loss mechanism. Experimental results on both open and industrial datasets show that our TagGNN approach outperforms the state-of-the-art multi-label classification approaches.

Submitted 26 August, 2020; originally announced August 2020.

Comments: Accepted by SIGIR 2020

arXiv:2008.09606 [pdf, other]

Howl: A Deployed, Open-Source Wake Word Detection System

Authors: Raphael Tang, Jaejun Lee, Afsaneh Razi, Julia Cambre, Ian Bicking, Jofish Kaye, Jimmy Lin

Abstract: We describe Howl, an open-source wake word detection toolkit with native support for open speech datasets, like Mozilla Common Voice and Google Speech Commands. We report benchmark results on Speech Commands and our own freely available wake word detection dataset, built from MCV. We operationalize our system for Firefox Voice, a plugin enabling speech interactivity for the Firefox web browser. Howl represents, to the best of our knowledge, the first fully productionized yet open-source wake word detection toolkit with a web browser deployment target. Our codebase is at https://github.com/castorini/howl.

Submitted 21 August, 2020; originally announced August 2020.

Comments: The first two authors contributed equally

arXiv:2008.07838 [pdf, other]

doi 10.1016/j.knosys.2021.107141

Improving adversarial robustness of deep neural networks by using semantic information

Authors: Lina Wang, Rui Tang, Yawei Yue, Xingshu Chen, Wei Wang, Yi Zhu, Xuemei Zeng

Abstract: The vulnerability of deep neural networks (DNNs) to adversarial attack, which is an attack that can mislead state-of-the-art classifiers into making an incorrect classification with high confidence by deliberately perturbing the original inputs, raises concerns about the robustness of DNNs to such attacks. Adversarial training, which is the main heuristic method for improving adversarial robustness and the first line of defense against adversarial attacks, requires many sample-by-sample calculations to increase training size and is usually insufficiently strong for an entire network. This paper provides a new perspective on the issue of adversarial robustness, one that shifts the focus from the network as a whole to the critical part of the region close to the decision boundary corresponding to a given class. From this perspective, we propose a method to generate a single but image-agnostic adversarial perturbation that carries the semantic information implying the directions to the fragile parts on the decision boundary and causes inputs to be misclassified as a specified target. We call the adversarial training based on such perturbations "region adversarial training" (RAT), which resembles classical adversarial training but is distinguished in that it reinforces the semantic information missing in the relevant regions. Experimental results on the MNIST and CIFAR-10 datasets show that this approach greatly improves adversarial robustness even using a very small dataset from the training data; moreover, it can defend against FGSM adversarial attacks that have a completely different pattern from the model seen during retraining.

Submitted 16 June, 2021; v1 submitted 18 August, 2020; originally announced August 2020.

Comments: 13 pages, 9 figures

ACM Class: I.2.6

Journal ref: [J]. Knowledge-Based Systems, 2021: 107141

arXiv:2008.06714 [pdf, ps, other]

doi 10.1007/s00220-020-03881-3

Deformations and homotopy theory of relative Rota-Baxter Lie algebras

Authors: Andrey Lazarev, Yunhe Sheng, Rong Tang

Abstract: We determine the $L_\infty$-algebra that controls deformations of a relative Rota-Baxter Lie algebra and show that it is an extension of the dg Lie algebra controlling deformations of the underlying LieRep pair by the dg Lie algebra controlling deformations of the relative Rota-Baxter operator. Consequently, we define the cohomology of relative Rota-Baxter Lie algebras and relate it to their infinitesimal deformations. A large class of relative Rota-Baxter Lie algebras is obtained from triangular Lie bialgebras and we construct a map between the corresponding deformation complexes. Next, the notion of a homotopy relative Rota-Baxter Lie algebra is introduced. We show that a class of homotopy relative Rota-Baxter Lie algebras is intimately related to pre-Lie$_\infty$-algebras.

Submitted 15 August, 2020; originally announced August 2020.

Comments: 31 pages, to appear in Comm. Math. Phys

Journal ref: Comm. Math. Phys. 383 (2021), 595-631

arXiv:2008.05212 [pdf, other]

Interlayer Link Prediction in Multiplex Social Networks Based on Multiple Types of Consistency between Embedding Vectors

Authors: Rui Tang, Zhenxiong Miao, Shuyu Jiang, Xingshu Chen, Haizhou Wang, Wei Wang

Abstract: Online users are typically active on multiple social media networks (SMNs), which constitute a multiplex social network. It is becoming increasingly challenging to determine whether given accounts on different SMNs belong to the same user; this can be expressed as an interlayer link prediction problem in a multiplex network. To address the challenge of predicting interlayer links, feature or structure information is leveraged. Existing methods that use network embedding techniques to address this problem focus on learning a mapping function to unify all nodes into a common latent representation space for prediction; positional relationships between unmatched nodes and their common matched neighbors (CMNs) are not utilized. Furthermore, the layers are often modeled as unweighted graphs, ignoring the strengths of the relationships between nodes. To address these limitations, we propose a framework based on multiple types of consistency between embedding vectors (MulCEV). In MulCEV, the traditional embedding-based method is applied to obtain the degree of consistency between the vectors representing the unmatched nodes, and a proposed distance consistency index based on the positions of nodes in each latent space provides additional clues for prediction. By associating these two types of consistency, the effective information in the latent spaces is fully utilized. Additionally, MulCEV models the layers as weighted graphs to obtain representation. In this way, the higher the strength of the relationship between nodes, the more similar their embedding vectors in the latent representation space will be. The results of our experiments on several real-world datasets demonstrate that the proposed MulCEV framework markedly outperforms current embedding-based methods, especially when the number of training iterations is small.

Submitted 9 November, 2021; v1 submitted 12 August, 2020; originally announced August 2020.

arXiv:2008.00059 [pdf, ps, other]

Homotopy relative Rota-Baxter Lie algebras, triangular $L_\infty$-bialgebras and higher derived brackets

Authors: Andrey Lazarev, Yunhe Sheng, Rong Tang

Abstract: We describe $L_\infty$-algebras governing homotopy relative Rota-Baxter Lie algebras and triangular $L_\infty$-bialgebras, and establish a map between them. Our formulas are based on a functorial approach to Voronov's higher derived brackets construction which is of independent interest.

Submitted 31 July, 2020; originally announced August 2020.

Comments: 20 pages

MSC Class: 17B40; 17B56; 17B62; 17B63

arXiv:2007.09592 [pdf, other]

Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering

Authors: Ruixue Tang, Chao Ma, Wei Emma Zhang, Qi Wu, Xiaokang Yang

Abstract: Visual Question Answering (VQA) has achieved great success thanks to the fast development of deep neural networks (DNN). On the other hand, the data augmentation, as one of the major tricks for DNN, has been widely used in many computer vision tasks. However, there are few works studying the data augmentation problem for VQA and none of the existing image based augmentation schemes (such as rotation and flipping) can be directly applied to VQA due to its semantic structure -- an $\langle image, question, answer\rangle$ triplet needs to be maintained correctly. For example, a direction related Question-Answer (QA) pair may not be true if the associated image is rotated or flipped. In this paper, instead of directly manipulating images and questions, we use generated adversarial examples for both images and questions as the augmented data. The augmented examples do not change the visual properties presented in the image as well as the semantic meaning of the question, the correctness of the $\langle image, question, answer\rangle$ is thus still maintained. We then use adversarial learning to train a classic VQA model (BUTD) with our augmented data. We find that we not only improve the overall performance on VQAv2, but also can withstand adversarial attack effectively, compared to the baseline model. The source code is available at https://github.com/zaynmi/seada-vqa.

Submitted 19 July, 2020; originally announced July 2020.

Comments: To appear in ECCV 2020

arXiv:2007.07846 [pdf, other]

Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset

Authors: Edwin Zhang, Nikhil Gupta, Raphael Tang, Xiao Han, Ronak Pradeep, Kuang Lu, Yue Zhang, Rodrigo Nogueira, Kyunghyun Cho, Hui Fang, Jimmy Lin

Abstract: We present Covidex, a search engine that exploits the latest neural ranking models to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI. Our system has been online and serving users since late March 2020. The Covidex is the user application component of our three-pronged strategy to develop technologies for helping domain experts tackle the ongoing global pandemic. In addition, we provide robust and easy-to-use keyword search infrastructure that exploits mature fusion-based methods as well as standalone neural ranking models that can be incorporated into other applications. These techniques have been evaluated in the ongoing TREC-COVID challenge: Our infrastructure and baselines have been adopted by many participants, including some of the highest-scoring runs in rounds 1, 2, and 3. In round 3, we report the highest-scoring run that takes advantage of previous training data and the second-highest fully automatic run.

Submitted 14 July, 2020; originally announced July 2020.

Comments: arXiv admin note: text overlap with arXiv:2004.05125

arXiv:2007.02717 [pdf, other]

doi 10.1088/1751-8121/abb511

Joint separable numerical range and bipartite entanglement witness

Authors: Pan Wu, Runhua Tang

Abstract: In 2017 an idea considering a pair of Hermitian operators of product form was published, which is called ultrafine entanglement witnessing. In 2018 some rigorous results were given. Here we improve their work. First we point this idea can be directly derived from an earlier concept named joint separable numerical range and explain how it works as a series of witnesses. Second by a simple method we present a sufficient condition for an effective pair. Finally we prove this condition is necessary for optimization.

Submitted 6 July, 2020; originally announced July 2020.

Comments: Linear algebra

arXiv:2007.00585 [pdf, other]

doi 10.1109/TMI.2020.3043303

Dual-energy X-ray dark-field material decomposition

Authors: Thorsten Sellerer, Korbinian Mechlem, Ruizhi Tang, Kirsten Taphorn, Franz Pfeiffer, Julia Herzen

Abstract: Dual-energy imaging is a clinically well-established technique that offers several advantages over conventional X-ray imaging. By performing measurements with two distinct X-ray spectra, differences in energy-dependent attenuation are exploited to obtain material-specific information. This information is used in various imaging applications to improve clinical diagnosis. In recent years, grating-based X-ray dark-field imaging has received increasing attention in the imaging community. The X-ray dark-field signal originates from ultra small-angle scattering within an object and thus provides information about the microstructure far below the spatial resolution of the imaging system. This property has led to a number of promising future imaging applications that are currently being investigated. However, different microstructures can hardly be distinguished with current X-ray dark-field imaging techniques, since the detected dark-field signal only represents the total amount of ultra small-angle scattering. To overcome these limitations, we present a novel concept called dual-energy X-ray dark-field material decomposition, which transfers the basic material decomposition approach from attenuation-based dual-energy imaging to the dark-field imaging modality. We develop a physical model and algorithms for dual-energy dark-field material decomposition and evaluate the proposed concept in experimental measurements. Our results suggest that by sampling the energy-dependent dark-field signal with two different X-ray spectra, a decomposition into two different microstructured materials is possible. Similar to dual-energy imaging, the additional microstructure-specific information could be useful for clinical diagnosis.

Submitted 1 July, 2020; originally announced July 2020.

Journal ref: IEEE Transactions on Medical Imaging (2020)

arXiv:2006.13125 [pdf, ps, other]

doi 10.1109/TIP.2021.3083447

DeepQTMT: A Deep Learning Approach for Fast QTMT-based CU Partition of Intra-mode VVC

Authors: Tianyi Li, Mai Xu, Runzhi Tang, Ying Chen, Qunliang Xing

Abstract: Versatile Video Coding (VVC), as the latest standard, significantly improves the coding efficiency over its ancestor standard High Efficiency Video Coding (HEVC), but at the expense of sharply increased complexity. In VVC, the quad-tree plus multi-type tree (QTMT) structure of coding unit (CU) partition accounts for over 97% of the encoding time, due to the brute-force search for recursive rate-distortion (RD) optimization. Instead of the brute-force QTMT search, this paper proposes a deep learning approach to predict the QTMT-based CU partition, for drastically accelerating the encoding process of intra-mode VVC. First, we establish a large-scale database containing sufficient CU partition patterns with diverse video content, which can facilitate the data-driven VVC complexity reduction. Next, we propose a multi-stage exit CNN (MSE-CNN) model with an early-exit mechanism to determine the CU partition, in accord with the flexible QTMT structure at multiple stages. Then, we design an adaptive loss function for training the MSE-CNN model, synthesizing both the uncertain number of split modes and the target on minimized RD cost. Finally, a multi-threshold decision scheme is developed, achieving desirable trade-off between complexity and RD performance. Experimental results demonstrate that our approach can reduce the encoding time of VVC by 44.65%-66.88% with the negligible Bjøntegaard delta bit-rate (BD-BR) of 1.322%-3.188%, which significantly outperforms other state-of-the-art approaches.

Submitted 6 June, 2021; v1 submitted 23 June, 2020; originally announced June 2020.

Comments: 14 pages, 10 figures, 7 tables. Published in IEEE Transactions on Image Processing (TIP), 2021

Journal ref: in IEEE Transactions on Image Processing, vol. 30, pp. 5377-5390, 2021

arXiv:2006.10389 [pdf, other]

Interactive Recommender System via Knowledge Graph-enhanced Reinforcement Learning

Authors: Sijin Zhou, Xinyi Dai, Haokun Chen, Weinan Zhang, Kan Ren, Ruiming Tang, Xiuqiang He, Yong Yu

Abstract: Interactive recommender system (IRS) has drawn huge attention because of its flexible recommendation strategy and the consideration of optimal long-term user experiences. To deal with the dynamic user preference and optimize accumulative utilities, researchers have introduced reinforcement learning (RL) into IRS. However, RL methods share a common issue of sample efficiency, i.e., huge amount of interaction data is required to train an effective recommendation policy, which is caused by the sparse user responses and the large action space consisting of a large number of candidate items. Moreover, it is infeasible to collect much data with explorative policies in online environments, which will probably harm user experience. In this work, we investigate the potential of leveraging knowledge graph (KG) in dealing with these issues of RL methods for IRS, which provides rich side information for recommendation decision making. Instead of learning RL policies from scratch, we make use of the prior knowledge of the item correlation learned from KG to (i) guide the candidate selection for better candidate item retrieval, (ii) enrich the representation of items and user states, and (iii) propagate user preferences among the correlated items over KG to deal with the sparsity of user feedback. Comprehensive experiments have been conducted on two real-world datasets, which demonstrate the superiority of our approach with significant improvements against state-of-the-arts.

Submitted 18 June, 2020; originally announced June 2020.

arXiv:2006.08315 [pdf, other]

Mitigating Gender Bias in Captioning Systems

Authors: Ruixiang Tang, Mengnan Du, Yuening Li, Zirui Liu, Na Zou, Xia Hu

Abstract: Image captioning has made substantial progress with huge supporting image collections sourced from the web. However, recent studies have pointed out that captioning datasets, such as COCO, contain gender bias found in web corpora. As a result, learning models could heavily rely on the learned priors and image context for gender identification, leading to incorrect or even offensive errors. To encourage models to learn correct gender features, we reorganize the COCO dataset and present two new splits COCO-GB V1 and V2 datasets where the train and test sets have different gender-context joint distribution. Models relying on contextual cues will suffer from huge gender prediction errors on the anti-stereotypical test data. Benchmarking experiments reveal that most captioning models learn gender bias, leading to high gender prediction errors, especially for women. To alleviate the unwanted bias, we propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence. Experimental results validate that GAIC can significantly reduce gender prediction errors with a competitive caption quality. Our codes and the designed benchmark datasets are available at https://github.com/datamllab/Mitigating_Gender_Bias_In_Captioning_System.

Submitted 20 April, 2021; v1 submitted 15 June, 2020; originally announced June 2020.

arXiv:2006.08131 [pdf, other]

An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks

Authors: Ruixiang Tang, Mengnan Du, Ninghao Liu, Fan Yang, Xia Hu

Abstract: With the widespread use of deep neural networks (DNNs) in high-stake applications, the security problem of the DNN models has received extensive attention. In this paper, we investigate a specific security problem called trojan attack, which aims to attack deployed DNN systems relying on the hidden trigger patterns inserted by malicious hackers. We propose a training-free attack approach which is different from previous work, in which trojaned behaviors are injected by retraining model on a poisoned dataset. Specifically, we do not change parameters in the original model but insert a tiny trojan module (TrojanNet) into the target model. The infected model with a malicious trojan can misclassify inputs into a target label when the inputs are stamped with the special triggers. The proposed TrojanNet has several nice properties including (1) it activates by tiny trigger patterns and keeps silent for other signals, (2) it is model-agnostic and could be injected into most DNNs, dramatically expanding its attack scenarios, and (3) the training-free mechanism saves massive training efforts comparing to conventional trojan attack methods. The experimental results show that TrojanNet can inject the trojan into all labels simultaneously (all-label trojan attack) and achieves 100% attack success rate without affecting model accuracy on original tasks. Experimental analysis further demonstrates that state-of-the-art trojan detection algorithms fail to detect TrojanNet attack. The code is available at https://github.com/trx14/TrojanNet.

Submitted 18 June, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

arXiv:2006.04480 [pdf]

doi 10.1080/19466315.2020.1785543

Assessing the Impact of COVID-19 on the Objective and Analysis of Oncology Clinical Trials -- Application of the Estimand Framework

Authors: Evgeny Degtyarev, Kaspar Rufibach, Yue Shentu, Godwin Yung, Michelle Casey, Stefan Englert, Feng Liu, Yi Liu, Oliver Sailer, Jonathan Siegel, Steven Sun, Rui Tang, Jiangxiu Zhou

Abstract: COVID-19 outbreak has rapidly evolved into a global pandemic. The impact of COVID-19 on patient journeys in oncology represents a new risk to interpretation of trial results and its broad applicability for future clinical practice. We identify key intercurrent events that may occur due to COVID-19 in oncology clinical trials with a focus on time-to-event endpoints and discuss considerations pertaining to the other estimand attributes introduced in the ICH E9 addendum. We propose strategies to handle COVID-19 related intercurrent events, depending on their relationship with malignancy and treatment and the interpretability of data after them. We argue that the clinical trial objective from a world without COVID-19 pandemic remains valid. The estimand framework provides a common language to discuss the impact of COVID-19 in a structured and transparent manner. This demonstrates that the applicability of the framework may even go beyond what it was initially intended for.

Submitted 21 June, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

Comments: Paper written on behalf of the industry working group on estimands in oncology (www.oncoestimand.org). Accepted for publication in a special issue of Statistics in Biopharmaceutical Research

Journal ref: Statistics in Biopharmaceutical Research, 2020, 12(4), 427-437

arXiv:2005.00729 [pdf, ps, other]

doi 10.1142/S0219887820501741

Deformations of relative Rota-Baxter operators on Leibniz algebras

Authors: Rong Tang, Yunhe Sheng, Yanqiu Zhou

Abstract: In this paper, we introduce the cohomology theory of relative Rota-Baxter operators on Leibniz algebras. We use the cohomological approach to study linear and formal deformations of relative Rota-Baxter operators. In particular, the notion of Nijenhuis elements is introduced to characterize trivial linear deformations. Formal deformations and extendibility of order n deformations of a relative Rota-Baxter operator are also characterized in terms of the cohomology theory.

Submitted 2 May, 2020; originally announced May 2020.

Comments: arXiv admin note: text overlap with arXiv:1803.09287

Journal ref: Int. J. Geom. Methods M. Vol. 17, No. 12, 2050174 (2020)

arXiv:2004.13705 [pdf, other]

Showing Your Work Doesn't Always Work

Authors: Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yaoliang Yu, Jimmy Lin

Abstract: In natural language processing, a recently popular line of work explores how to best report the experimental results of neural networks. One exemplar publication, titled "Show Your Work: Improved Reporting of Experimental Results," advocates for reporting the expected validation effectiveness of the best-tuned model, with respect to the computational budget. In the present work, we critically examine this paper. As far as statistical generalizability is concerned, we find unspoken pitfalls and caveats with this approach. We analytically show that their estimator is biased and uses error-prone assumptions. We find that the estimator favors negative errors and yields poor bootstrapped confidence intervals. We derive an unbiased alternative and bolster our claims with empirical evidence from statistical simulation. Our codebase is at http://github.com/castorini/meanmax.

Submitted 28 April, 2020; originally announced April 2020.

Comments: Accepted to ACL 2020

arXiv:2004.12993 [pdf, other]

DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

Authors: Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin

Abstract: Large-scale pre-trained language models such as BERT have brought significant improvements to NLP applications. However, they are also notorious for being slow in inference, which makes them difficult to deploy in real-time applications. We propose a simple but effective method, DeeBERT, to accelerate BERT inference. Our approach allows samples to exit earlier without passing through the entire model. Experiments show that DeeBERT is able to save up to ~40% inference time with minimal degradation in model quality. Further analyses show different behaviors in the BERT transformer layers and also reveal their redundancy. Our work provides new ideas to efficiently apply deep transformer-based models to downstream tasks. Code is available at https://github.com/castorini/DeeBERT.

Submitted 27 April, 2020; originally announced April 2020.

Comments: Accepted at ACL 2020

arXiv:2004.11339 [pdf, ps, other]

Rapidly Bootstrapping a Question Answering Dataset for COVID-19

Authors: Raphael Tang, Rodrigo Nogueira, Edwin Zhang, Nikhil Gupta, Phuong Cam, Kyunghyun Cho, Jimmy Lin

Abstract: We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19, built by hand from knowledge gathered from Kaggle's COVID-19 Open Research Dataset Challenge. To our knowledge, this is the first publicly available resource of its type, and intended as a stopgap measure for guiding research until more substantial evaluation resources become available. While this dataset, comprising 124 question-article pairs as of the present version 0.1 release, does not have sufficient examples for supervised machine learning, we believe that it can be helpful for evaluating the zero-shot or transfer capabilities of existing models on topics specifically related to COVID-19. This paper describes our methodology for constructing the dataset and presents the effectiveness of a number of baselines, including term-based techniques and various transformer-based models. The dataset is available at http://covidqa.ai/

Submitted 23 April, 2020; originally announced April 2020.

arXiv:2004.06390 [pdf, other]

Personalized Re-ranking for Improving Diversity in Live Recommender Systems

Authors: Yichao Wang, Xiangyu Zhang, Zhirong Liu, Zhenhua Dong, Xinhua Feng, Ruiming Tang, Xiuqiang He

Abstract: Users of industrial recommender systems are normally suggested a list of items at one time. Ideally, such list-wise recommendation should provide diverse and relevant options to the users. However, in practice, list-wise recommendation is implemented as top-N recommendation. Top-N recommendation selects the first N items from candidates to display. The list is generated by a ranking function, which is learned from labeled data to optimize accuracy. However, top-N recommendation may lead to suboptimal results, as it focuses on the accuracy of each individual item independently and overlooks mutual influence between items. Therefore, we propose a personalized re-ranking model for improving diversity of the recommendation list in real recommender systems. The proposed re-ranking model can be easily deployed as a follow-up component after any existing ranking function. The re-ranking model improves the diversity by employing a personalized Determinantal Point Process (DPP). DPP has been applied in some recommender systems to improve the diversity and increase the user engagement. However, DPP does not take into account the fact that users may have individual propensities to the diversity. To overcome this limitation, our re-ranking model proposes a personalized DPP to model the trade-off between accuracy and diversity for each individual user. We implement and deploy the personalized DPP model on a large-scale industrial recommender system. Experimental results both offline and online demonstrate the efficiency of our proposed re-ranking model.

Submitted 20 April, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

arXiv:2003.11235 [pdf, other]

AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction

Authors: Bin Liu, Chenxu Zhu, Guilin Li, Weinan Zhang, Jincai Lai, Ruiming Tang, Xiuqiang He, Zhenguo Li, Yong Yu

Abstract: Learning feature interactions is crucial for click-through rate (CTR) prediction in recommender systems. In most existing deep learning models, feature interactions are either manually designed or simply enumerated. However, enumerating all feature interactions brings large memory and computation cost. Even worse, useless interactions may introduce noise and complicate the training process. In this work, we propose a two-stage algorithm called Automatic Feature Interaction Selection (AutoFIS). AutoFIS can automatically identify important feature interactions for factorization models with computational cost just equivalent to training the target model to convergence. In the \emph{search stage}, instead of searching over a discrete set of candidate feature interactions, we relax the choices to be continuous by introducing the architecture parameters. By implementing a regularized optimizer over the architecture parameters, the model can automatically identify and remove the redundant feature interactions during the training process of the model. In the \emph{re-train stage}, we keep the architecture parameters serving as an attention unit to further boost the performance. Offline experiments on three large-scale datasets (two public benchmarks, one private) demonstrate that AutoFIS can significantly improve various FM based models. AutoFIS has been deployed onto the training platform of Huawei App Store recommendation service, where a 10-day online A/B test demonstrated that AutoFIS improved the DeepFM model by 20.3\% and 20.1\% in terms of CTR and CVR respectively.

Submitted 3 July, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

Comments: KDD 2020 ADS track oral accepted

arXiv:2003.04797 [pdf]

Dam Burst: A region-merging-based image segmentation method

Authors: Rui Tang, Wenlong Song, ** Guan, Huibin Ge, Deke Kong

Abstract: Until now, all single level segmentation algorithms except CNN-based ones lead to over segmentation. And CNN-based segmentation algorithms have their own problems. To avoid over segmentation, multiple thresholds of criteria are adopted in region merging process to produce hierarchical segmentation results. However, extreme over segmentation remains in the low level of the hierarchy, and outstanding tiny objects are merged into their large neighbors in the high level of the hierarchy. This paper proposes a region-merging-based image segmentation method that we call Dam Burst. As a single level segmentation algorithm, this method avoids over segmentation and retains details at the same time. It is so named because it simulates a flood from underground destroying the dams between water pools. We treat edge detection results as a strengthening structure of a dam if they lie on the dam. To simulate a flood from underground, regions are merged in ascending order of the average gradient inside the region.

Submitted 25 February, 2020; originally announced March 2020.

arXiv:2003.00397 [pdf, other]

Intelligent Home 3D: Automatic 3D-House Design from Linguistic Descriptions Only

Authors: Qi Chen, Qi Wu, Rui Tang, Yuhan Wang, Shuai Wang, Mingkui Tan

Abstract: Home design is a complex task that normally requires architects to finish with their professional skills and tools. It would be fascinating if one could produce a house plan intuitively, without much knowledge of home design or experience with complex design tools, for example via natural language. In this paper, we formulate it as a language conditioned visual content generation problem that is further divided into a floor plan generation and an interior texture (such as floor and wall) synthesis task. The only control signal of the generation process is the linguistic expression given by users that describes the house details. To this end, we propose a House Plan Generative Model (HPGM) that first translates the language input to a structural graph representation and then predicts the layout of rooms with a Graph Conditioned Layout Prediction Network (GC LPN) and generates the interior texture with a Language Conditioned Texture GAN (LCT-GAN). With some post-processing, the final product of this task is a 3D house model. To train and evaluate our model, we build the first Text-to-3D House Model dataset.

Submitted 29 February, 2020; originally announced March 2020.

Comments: To appear in CVPR2020

arXiv:2002.07272 [pdf, other]

doi 10.1016/j.nuclphysb.2021.115354

Revisit to the $b\to cτν$ transition: in and beyond the SM

Authors: Kingman Cheung, Zhuo-Ran Huang, Hua-Dong Li, Cai-Dian Lü, Ying-nan Mao, Ru-Ying Tang

Abstract: We perform an analysis of the $b\to cτν$ data, including $R(D^{(*)})$, $R(J/ψ)$, $P_τ(D^{*})$ and $F_L^{D^*}$, within and beyond the Standard Model (SM). We fit the $B\to D^{(*)}$ hadronic form factors in the HQET parametrization to the lattice and the light-cone sum rule (LCSR) results, applying the general strong unitarity bounds corresponding to $J^P=1^-$, $1^+$, $0^-$ and $0^+$. Using the obtained HQET relations between helicity amplitudes, we give the strong unitarity bounds on individual helicity amplitudes, which can be used in the BGL fits. Using the fitted form factors and taking into account the most recent Belle measurement of $R(D^{(*)})$ we investigate the model-independent and the leptoquark model explanations of the $b\to cτν$ anomalies. Specifically, we consider the one-operator, the two-operator new physics (NP) scenarios and the NP models with a single $R_2$, $S_1$ or $U_1$ leptoquark which is supposed to be able to address the $b\to cτν$ anomalies, and our results show that the $R_2$ leptoquark model is in tension with the limit $\mathcal B(B_c\to τν)<10\%$. Furthermore, we give predictions for the various observables in the SM and the NP scenarios/leptoquark models based on the present form factor study and the analysis of NP.

Submitted 29 March, 2021; v1 submitted 17 February, 2020; originally announced February 2020.

Comments: 35 pages, 5 figures and 14 tables; Matches the published version in NPB

Report number: NCTS-PH/2002

arXiv:2002.02140 [pdf, ps, other]

Interlayer link prediction in multiplex social networks: an iterative degree penalty algorithm

Authors: Rui Tang, Shuyu Jiang, Xingshu Chen, Haizhou Wang, Wenxian Wang, Wei Wang

Abstract: Online social network (OSN) applications provide different experiences; for example, posting a short text on Twitter and sharing photographs on Instagram. Multiple OSNs constitute a multiplex network. For privacy protection and usage purposes, accounts belonging to the same user in different OSNs may have different usernames, photographs, and introductions. Interlayer link prediction in multiplex network aims at identifying whether the accounts in different OSNs belong to the same person, which can aid in tasks including cybercriminal behavior modeling and customer interest analysis. Many real-world OSNs exhibit a scale-free degree distribution; thus, neighbors with different degrees may exert different influences on the node matching degrees across different OSNs. We developed an iterative degree penalty (IDP) algorithm for interlayer link prediction in the multiplex network. First, we proposed a degree penalty principle that assigns a greater weight to a common matched neighbor with fewer connections. Second, we applied node adjacency matrix multiplication for efficiently obtaining the matching degree of all unmatched node pairs. Thereafter, we used the approved maximum value method to obtain the interlayer link prediction results from the matching degree matrix. Finally, the prediction results were inserted into the priori interlayer node pair set and the above processes were performed iteratively until all unmatched nodes in one layer were matched or all matching degrees of the unmatched node pairs were equal to 0. Experiments demonstrated that our advanced IDP algorithm significantly outperforms current network structure-based methods when the multiplex network average degree and node overlapping rate are low.

Submitted 6 February, 2020; originally announced February 2020.

arXiv:2001.03007 [pdf, other]

Fastest Frozen Temperature for a Thermodynamic System

Authors: X. Y. Zhou, Z. Q. Yang, X. R. Tang, X. Wang, Q. H. Liu

Abstract: For a thermodynamic system obeying both the equipartition theorem at high temperature and the third law at low temperature, the curve showing the relationship between the specific heat and the temperature has two common behaviors: it terminates at zero when the temperature is zero Kelvin and converges to a constant as the temperature becomes higher and higher. Since it is always possible to find the characteristic temperature $T_{C}$ to mark the excited temperature as the specific heat almost reaches the equipartition value, it is reasonable to find a temperature in the low temperature interval, complementary to $T_{C}$. The present study reports a possibly universal existence of such a temperature $\vartheta$, defined as that at which the specific heat falls \textit{fastest} with decreasing temperature. For the Debye model of solids, above the temperature $\vartheta$ the Debye's law starts to fail.

Submitted 4 May, 2020; v1 submitted 9 January, 2020; originally announced January 2020.

Comments: 8 pages, 5 figures, manuscript completely rewritten and presentation greatly improved

Showing 201–250 of 324 results for author: Tang, R