-
Dynamic Against Dynamic: An Open-set Self-learning Framework
Authors:
Haifeng Yang,
Chuanxing Geng,
Pong C. Yuen,
Songcan Chen
Abstract:
In open-set recognition, existing methods generally learn statically fixed decision boundaries using known classes to reject unknown classes. Though they have achieved promising results, such decision boundaries are evidently insufficient for universal unknown classes in dynamic and open scenarios, as these classes can potentially appear at any position in the feature space. Moreover, these methods simply reject unknown class samples during testing without making any effective use of them. In fact, such samples can constitute the true instantiated representation of the unknown classes and further enhance the model's performance. To address these issues, this paper proposes a novel dynamic-against-dynamic idea, i.e., a dynamic method against a dynamically changing open-set world, for which an open-set self-learning (OSSL) framework is developed. OSSL starts with a good closed-set classifier trained on known classes and utilizes available test samples for model adaptation during testing, thus gaining adaptability to changing data distributions. In particular, a novel self-matching module is designed for OSSL, which automatically identifies known class samples while rejecting unknown class samples; the rejected samples are then used as the instantiated representation of unknown classes to enhance the discriminability of the model. Our method establishes new performance milestones on almost all standard and cross-data benchmarks.
Submitted 2 May, 2024; v1 submitted 27 April, 2024;
originally announced April 2024.
-
SAT-DIFF: A Tree Diffing Framework Using SAT Solving
Authors:
Chuqin Geng,
Haolin Ye,
Yihan Zhang,
Brigitte Pientka,
Xujie Si
Abstract:
Computing differences between tree-structured data is a critical but challenging problem in software analysis. In this paper, we propose a novel tree diffing approach called SatDiff, which reformulates the structural diffing problem into a MaxSAT problem. By encoding the necessary transformations from the source tree to the target tree, SatDiff generates correct, minimal, and type safe low-level edit scripts with formal guarantees. We then synthesize concise high-level edit scripts by effectively merging low-level edits in the appropriate topological order. Our empirical results demonstrate that SatDiff outperforms existing heuristic-based approaches by a significant margin in terms of conciseness while maintaining a reasonable runtime.
Submitted 6 April, 2024;
originally announced April 2024.
-
Learning Minimal NAP Specifications for Neural Network Verification
Authors:
Chuqin Geng,
Zhaoyue Wang,
Haolin Ye,
Saifei Liao,
Xujie Si
Abstract:
Specifications play a crucial role in neural network verification. They define the precise input regions we aim to verify, typically represented as L-infinity norm balls. While recent research suggests using neural activation patterns (NAPs) as specifications for verifying unseen test set data, it focuses on computing the most refined NAPs, often limited to very small regions in the input space. In this paper, we study the following problem: Given a neural network, find a minimal (coarsest) NAP that is sufficient for formal verification of the network's robustness. Finding the minimal NAP specification not only expands verifiable bounds but also provides insights into which neurons contribute to the model's robustness. To address this problem, we propose several exact and approximate approaches. Our exact approaches leverage the verification tool to find minimal NAP specifications in either a deterministic or statistical manner, whereas the approximate methods efficiently estimate minimal NAPs using adversarial examples and local gradients, without making calls to the verification tool. This allows us to inspect potential causal links between neurons and the robustness of state-of-the-art neural networks, a task for which existing verification frameworks fail to scale. Our experimental results suggest that minimal NAP specifications require much smaller fractions of neurons compared to the most refined NAP specifications, yet they can significantly expand the verifiable boundaries to several orders of magnitude larger.
Submitted 11 June, 2024; v1 submitted 6 April, 2024;
originally announced April 2024.
-
IsoPredict: Dynamic Predictive Analysis for Detecting Unserializable Behaviors in Weakly Isolated Data Store Applications
Authors:
Chujun Geng,
Spyros Blanas,
Michael D. Bond,
Yang Wang
Abstract:
This paper presents the first dynamic predictive analysis for data store applications under weak isolation levels, called IsoPredict. Given an observed serializable execution of a data store application, IsoPredict generates and solves SMT constraints to find an unserializable execution that is a feasible execution of the application. IsoPredict introduces novel techniques that handle divergent application behavior; solve mutually recursive sets of constraints; and balance coverage, precision, and performance. An evaluation on four transactional data store benchmarks shows that IsoPredict often predicts unserializable behaviors, 99% of which are feasible.
Submitted 6 April, 2024;
originally announced April 2024.
-
Improving Adversarial Energy-Based Model via Diffusion Process
Authors:
Cong Geng,
Tian Han,
Peng-Tao Jiang,
Hao Zhang,
Jinwei Chen,
Søren Hauberg,
Bo Li
Abstract:
Generative models have shown strong generation ability, while efficient likelihood estimation is less explored. Energy-based models (EBMs) define a flexible energy function to parameterize unnormalized densities efficiently but are notorious for being difficult to train. Adversarial EBMs introduce a generator to form a minimax training game to avoid the expensive MCMC sampling used in traditional EBMs, but a noticeable gap between adversarial EBMs and other strong generative models still exists. Inspired by diffusion-based models, we embed EBMs into each denoising step to split a long generation process into several smaller steps. In addition, we employ a symmetric Jeffrey divergence and introduce a variational posterior distribution for the generator's training to address the main challenges that exist in adversarial EBMs. Our experiments show significant improvement in generation compared to existing adversarial EBMs, while also providing a useful energy function for efficient density estimation.
Submitted 8 June, 2024; v1 submitted 3 March, 2024;
originally announced March 2024.
-
All Beings Are Equal in Open Set Recognition
Authors:
Chaohua Li,
Enhao Zhang,
Chuanxing Geng,
SongCan Chen
Abstract:
In open-set recognition (OSR), a promising strategy is exploiting pseudo-unknown data outside the given $K$ known classes as an additional $(K+1)$-th class to explicitly model potential open space. However, treating unknown classes without distinction is unequal for them relative to known classes, due to the category-agnostic and scale-agnostic nature of the unknowns. This inevitably not only disrupts the inherent distributions of unknown classes but also incurs both class-wise and instance-wise imbalances between known and unknown classes. Ideally, the OSR problem should model the whole class space as $K+\infty$, but enumerating all unknowns is impractical. Since the core of OSR is to effectively model the boundaries of known classes, focusing only on the unknowns near the boundaries of targeted known classes seems sufficient. Thus, as a compromise, we convert the open classes from infinite to $K$ with a novel concept, Target-Aware Universum (TAU), and propose a simple yet effective framework, Dual Contrastive Learning with Target-Aware Universum (DCTAU). In detail, guided by the targeted known classes, TAU automatically expands the unknown classes from the previous $1$ to $K$, effectively alleviating the distribution disruption and the imbalance issues mentioned above. Then, a novel Dual Contrastive (DC) loss is designed, where all instances, irrespective of known or TAU, are considered as positives to contrast with their respective negatives. Experimental results indicate DCTAU sets a new state-of-the-art.
Submitted 31 January, 2024;
originally announced January 2024.
-
Session-Based Recommendation by Exploiting Substitutable and Complementary Relationships from Multi-behavior Data
Authors:
Huizi Wu,
Cong Geng,
Hui Fang
Abstract:
Session-based recommendation (SR) aims to dynamically recommend items to a user based on a sequence of the most recent user-item interactions. Most existing studies on SR adopt advanced deep learning methods. However, the majority only consider a single behavior type (e.g., click), while the few that consider multi-typed behaviors fail to take full advantage of the relationships between products (items). To this end, this paper proposes a novel approach, called Substitutable and Complementary Relationships from Multi-behavior Data (denoted as SCRM), to better explore the relationships between products for effective recommendation. Specifically, we first construct substitutable and complementary graphs based on a user's sequential behaviors in every session by jointly considering 'click' and 'purchase' behaviors. We then design a denoising network to remove false relationships, and further consider constraints on the two relationships via a particularly designed loss function. Extensive experiments on two e-commerce datasets demonstrate the superiority of our model over state-of-the-art methods, and the effectiveness of every component in SCRM.
Submitted 14 January, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
TorchProbe: Fuzzing Dynamic Deep Learning Compilers
Authors:
Qidong Su,
Chuqin Geng,
Gennady Pekhimenko,
Xujie Si
Abstract:
Static and dynamic computational graphs represent two distinct approaches to constructing deep learning frameworks. The former prioritizes compiler-based optimizations, while the latter focuses on programmability and user-friendliness. The recent release of PyTorch 2.0, which supports compiling arbitrary deep learning programs in Python, signifies a new direction in the evolution of deep learning infrastructure to incorporate compiler techniques in a more dynamic manner and support more dynamic language features like dynamic control flows and closures. Given PyTorch's seamless integration with Python, its compiler aims to support arbitrary deep learning code written in Python. However, the inherent dynamism of Python poses challenges to the completeness and robustness of the compiler. While recent research has introduced fuzzing to test deep learning compilers, there is still a lack of comprehensive analysis on how to test dynamic features. To address this issue, we propose several code transformations to generate test cases involving dynamic features. These transformations preserve the program's semantics, ensuring that any discrepancy between the transformed and original programs indicates the presence of a bug. Through our approach, we have successfully identified twenty previously unknown bugs in the PyTorch compiler and its underlying tensor compiler Triton.
Submitted 30 October, 2023;
originally announced October 2023.
-
Beyond Myopia: Learning from Positive and Unlabeled Data through Holistic Predictive Trends
Authors:
Xinrui Wang,
Wenhai Wan,
Chuanxin Geng,
Shaoyuan LI,
Songcan Chen
Abstract:
Learning binary classifiers from positive and unlabeled data (PUL) is vital in many real-world applications, especially when verifying negative examples is difficult. Despite the impressive empirical performance of recent PUL methods, challenges like accumulated errors and increased estimation bias persist due to the absence of negative labels. In this paper, we unveil an intriguing yet long-overlooked observation in PUL: resampling the positive data in each training iteration to ensure a balanced distribution between positive and unlabeled examples results in strong early-stage performance. Furthermore, predictive trends for positive and negative classes display distinctly different patterns. Specifically, the scores (output probability) of unlabeled negative examples consistently decrease, while those of unlabeled positive examples show largely chaotic trends. Instead of focusing on classification within individual time frames, we innovatively adopt a holistic approach, interpreting the scores of each example as a temporal point process (TPP). This reformulates the core problem of PUL as recognizing trends in these scores. We then propose a novel TPP-inspired measure for trend detection and prove its asymptotic unbiasedness in predicting changes. Notably, our method accomplishes PUL without requiring additional parameter tuning or prior assumptions, offering an alternative perspective for tackling this problem. Extensive experiments verify the superiority of our method, particularly in a highly imbalanced real-world setting, where it achieves improvements of up to 11.3% in key metrics. The code is available at https://github.com/wxr99/HolisticPU.
Submitted 6 October, 2023;
originally announced October 2023.
-
Tree-Structured Shading Decomposition
Authors:
Chen Geng,
Hong-Xing Yu,
Sharon Zhang,
Maneesh Agrawala,
Jiajun Wu
Abstract:
We study inferring a tree-structured representation from a single image for object shading. Prior work typically uses the parametric or measured representation to model shading, which is neither interpretable nor easily editable. We propose using the shade tree representation, which combines basic shading nodes and compositing methods to factorize object surface shading. The shade tree representation enables novice users who are unfamiliar with the physical shading process to edit object shading in an efficient and intuitive manner. A main challenge in inferring the shade tree is that the inference problem involves both the discrete tree structure and the continuous parameters of the tree nodes. We propose a hybrid approach to address this issue. We introduce an auto-regressive inference model to generate a rough estimation of the tree structure and node parameters, and then we fine-tune the inferred shade tree through an optimization algorithm. We show experiments on synthetic images, captured reflectance, real images, and non-realistic vector drawings, allowing downstream applications such as material editing, vectorized shading, and relighting. Project website: https://chen-geng.com/inv-shade-trees
Submitted 13 September, 2023;
originally announced September 2023.
-
Relightable and Animatable Neural Avatar from Sparse-View Video
Authors:
Zhen Xu,
Sida Peng,
Chen Geng,
Linzhan Mou,
Zihan Yan,
Jiaming Sun,
Hujun Bao,
Xiaowei Zhou
Abstract:
This paper tackles the challenge of creating relightable and animatable neural avatars from sparse-view (or even monocular) videos of dynamic humans under unknown illumination. Compared to studio environments, this setting is more practical and accessible but poses an extremely challenging ill-posed problem. Previous neural human reconstruction methods are able to reconstruct animatable avatars from sparse views using deformed Signed Distance Fields (SDF) but cannot recover material parameters for relighting. While differentiable inverse rendering-based methods have succeeded in material recovery of static objects, it is not straightforward to extend them to dynamic humans as it is computationally intensive to compute pixel-surface intersection and light visibility on deformed SDFs for inverse rendering. To solve this challenge, we propose a Hierarchical Distance Query (HDQ) algorithm to approximate the world space distances under arbitrary human poses. Specifically, we estimate coarse distances based on a parametric human model and compute fine distances by exploiting the local deformation invariance of SDF. Based on the HDQ algorithm, we leverage sphere tracing to efficiently estimate the surface intersection and light visibility. This allows us to develop the first system to recover animatable and relightable neural avatars from sparse view (or monocular) inputs. Experiments demonstrate that our approach is able to produce superior results compared to state-of-the-art methods. Our code will be released for reproducibility.
Submitted 17 August, 2023; v1 submitted 15 August, 2023;
originally announced August 2023.
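The abstract above relies on sphere tracing to find surface intersections on a signed distance field. The following is a minimal sketch of plain sphere tracing only, not the paper's Hierarchical Distance Query algorithm: the analytic `sphere_sdf`, the function names, and all parameter values are assumptions chosen for illustration.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Signed distance from point p to a sphere (hypothetical test scene)."""
    return math.dist(p, center) - radius

def sphere_trace(sdf, origin, direction, max_steps=64, eps=1e-6, max_dist=100.0):
    """March along a unit-length ray, stepping by the SDF value each time.

    Returns the ray parameter t at the first surface hit, or None on a miss.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:       # close enough to the surface: report a hit
            return t
        t += dist            # safe step: no surface lies within `dist` of p
        if t > max_dist:     # ray has left the scene
            break
    return None

hit = sphere_trace(sphere_sdf, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
miss = sphere_trace(sphere_sdf, (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

Each step is safe because the SDF value bounds the distance to the nearest surface, which is why a cheap distance query suffices for both intersection and visibility tests.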
-
Can ChatGPT Pass An Introductory Level Functional Language Programming Course?
Authors:
Chuqin Geng,
Yihan Zhang,
Brigitte Pientka,
Xujie Si
Abstract:
The recent introduction of ChatGPT has drawn significant attention from both industry and academia due to its impressive capabilities in solving a diverse range of tasks, including language translation, text summarization, and computer programming. Its capability for writing, modifying, and even correcting code together with its ease of use and access is already dramatically impacting computer science education. This paper aims to explore how well ChatGPT can perform in an introductory-level functional language programming course. In our systematic evaluation, we treated ChatGPT as one of our students and demonstrated that it can achieve a grade B- and its rank in the class is 155 out of 314 students overall. Our comprehensive evaluation provides valuable insights into ChatGPT's impact from both student and instructor perspectives. Additionally, we identify several potential benefits that ChatGPT can offer to both groups. Overall, we believe that this study significantly clarifies and advances our understanding of ChatGPT's capabilities and potential impact on computer science education.
Submitted 3 May, 2023; v1 submitted 29 April, 2023;
originally announced May 2023.
-
Learning Neural Volumetric Representations of Dynamic Humans in Minutes
Authors:
Chen Geng,
Sida Peng,
Zhen Xu,
Hujun Bao,
Xiaowei Zhou
Abstract:
This paper addresses the challenge of quickly reconstructing free-viewpoint videos of dynamic humans from sparse multi-view videos. Some recent works represent the dynamic human as a canonical neural radiance field (NeRF) and a motion field, which are learned from videos through differentiable rendering. But the per-scene optimization generally requires hours. Other generalizable NeRF models leverage learned prior from datasets and reduce the optimization time by only finetuning on new scenes at the cost of visual fidelity. In this paper, we propose a novel method for learning neural volumetric videos of dynamic humans from sparse view videos in minutes with competitive visual quality. Specifically, we define a novel part-based voxelized human representation to better distribute the representational power of the network to different human parts. Furthermore, we propose a novel 2D motion parameterization scheme to increase the convergence rate of deformation field learning. Experiments demonstrate that our model can be learned 100 times faster than prior per-scene optimization methods while being competitive in the rendering quality. Training our model on a $512 \times 512$ video with 100 frames typically takes about 5 minutes on a single RTX 3090 GPU. The code will be released on our project page: https://zju3dv.github.io/instant_nvr
Submitted 23 February, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Identifying Different Student Clusters in Functional Programming Assignments: From Quick Learners to Struggling Students
Authors:
Chuqin Geng,
Wenwen Xu,
Yingjie Xu,
Brigitte Pientka,
Xujie Si
Abstract:
Instructors and students alike are often focused on the grade in programming assignments as a key measure of how well a student is mastering the material and whether a student is struggling. This can be, however, misleading. Especially when students have access to auto-graders, their grades may be heavily skewed. In this paper, we analyze student assignment submission data collected from a functional programming course taught at McGill University, incorporating a wide range of features. In addition to the grade, we consider activity time data, time spent, and the number of static errors. This allows us to identify four clusters of students: "Quick-learning", "Hardworking", "Satisficing", and "Struggling" through clustering algorithms. We then analyze how work habits, working duration, the range of errors, and the ability to fix errors impact different clusters of students. This structured analysis provides valuable insights for instructors to actively help different types of students and emphasize different aspects of their overall course design. It also provides insights for students themselves to understand which aspects they still struggle with and allows them to seek clarification and adjust their work habits.
Submitted 6 January, 2023;
originally announced January 2023.
-
A Generic Reinforced Explainable Framework with Knowledge Graph for Session-based Recommendation
Authors:
Huizi Wu,
Hui Fang,
Zhu Sun,
Cong Geng,
Xinyu Kong,
Yew-Soon Ong
Abstract:
Session-based recommendation (SR) has gained increasing attention in recent years. A great number of studies have been devoted to designing complex algorithms to improve recommendation performance, where deep learning methods account for the majority. However, most of these methods are black-box ones and fail to provide explanations that facilitate users' understanding, which might lead to lowered user satisfaction and reduced system revenues. Therefore, in our study, we propose a generic Reinforced Explainable framework with Knowledge graph for Session-based recommendation (i.e., REKS), which strives to improve the existing black-box SR models (denoted as non-explainable ones) with a Markov decision process. In particular, we construct a knowledge graph with session behaviors and treat SR models as part of the policy network of the Markov decision process. Based on our particularly designed state vector, reward strategy, and loss function, the reinforcement learning (RL)-based framework not only achieves improved recommendation accuracy, but also provides appropriate explanations at the same time. Finally, we instantiate REKS in five representative, state-of-the-art SR models (i.e., GRU4REC, NARM, SR-GNN, GCSAN, BERT4REC), whereby extensive experiments with these methods on four datasets demonstrate the effectiveness of our framework on both recommendation and explanation tasks.
Submitted 15 December, 2022; v1 submitted 13 December, 2022;
originally announced December 2022.
-
Scalar Invariant Networks with Zero Bias
Authors:
Chuqin Geng,
Xiaojie Xu,
Haolin Ye,
Xujie Si
Abstract:
Just like weights, bias terms are the learnable parameters of many popular machine learning models, including neural networks. Biases are thought to enhance the representational power of neural networks, enabling them to solve a variety of tasks in computer vision. However, we argue that biases can be disregarded for some image-related tasks such as image classification, by considering the intrinsic distribution of images in the input space and desired model properties from first principles. Our findings suggest that zero-bias neural networks can perform comparably to biased networks for practical image classification tasks. We demonstrate that zero-bias neural networks possess a valuable property called scalar (multiplication) invariance. This means that the prediction of the network remains unchanged when the contrast of the input image is altered. We extend scalar invariance to more general cases, enabling formal verification of certain convex regions of the input space. Additionally, we prove that zero-bias neural networks are fair in predicting the zero image. Unlike state-of-the-art models that may exhibit bias toward certain labels, zero-bias networks have uniform belief in all labels. We believe dropping bias terms can be considered as a geometric prior in designing neural network architecture for image classification, which shares the spirit of adapting convolutions as the translational invariance prior. The robustness and fairness advantages of zero-bias neural networks may also indicate a promising path towards trustworthy and ethical AI.
Submitted 29 May, 2023; v1 submitted 15 November, 2022;
originally announced November 2022.
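The scalar invariance property described in the abstract above can be checked numerically on a toy model. The sketch below is an illustration under stated assumptions, not the authors' code: it uses a small, randomly initialized ReLU multilayer perceptron with no bias terms, and the layer sizes, seed, helper names, and scale factor `alpha` are all hypothetical. Because ReLU is positively homogeneous, scaling the input by a positive `alpha` scales every logit by `alpha`, so the predicted label is unchanged.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def zero_bias_mlp(weights, x):
    """Forward pass of a ReLU MLP without bias terms; returns the logits."""
    h = x
    for W in weights[:-1]:
        h = relu(matvec(W, h))
    return matvec(weights[-1], h)

def argmax(v):
    return max(range(len(v)), key=lambda i: v[i])

rng = random.Random(0)  # fixed seed so the run is reproducible

def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Hypothetical 4 -> 8 -> 8 -> 3 network and a random "image" of 4 pixels.
weights = [rand_matrix(8, 4), rand_matrix(8, 8), rand_matrix(3, 8)]
x = [rng.uniform(0.0, 1.0) for _ in range(4)]
alpha = 0.25  # positive contrast change

logits = zero_bias_mlp(weights, x)
scaled_logits = zero_bias_mlp(weights, [alpha * xi for xi in x])
zero_logits = zero_bias_mlp(weights, [0.0] * 4)
```

The zero input maps to all-zero logits, which a softmax would turn into a uniform distribution over labels, matching the abstract's remark about the zero image.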
-
Towards Reliable Neural Specifications
Authors:
Chuqin Geng,
Nham Le,
Xiaojie Xu,
Zhaoyue Wang,
Arie Gurfinkel,
Xujie Si
Abstract:
Having reliable specifications is an unavoidable challenge in achieving verifiable correctness, robustness, and interpretability of AI systems. Existing specifications for neural networks are in the paradigm of data as specification. That is, the local neighborhood centering around a reference input is considered to be correct (or robust). While existing specifications contribute to verifying adversarial robustness, a significant problem in many research domains, our empirical study shows that those verified regions are somewhat tight, and thus fail to allow verification of test set inputs, making them impractical for some real-world applications. To this end, we propose a new family of specifications called neural representation as specification, which uses the intrinsic information of neural networks - neural activation patterns (NAPs), rather than input data to specify the correctness and/or robustness of neural network predictions. We present a simple statistical approach to mining neural activation patterns. To show the effectiveness of discovered NAPs, we formally verify several important properties, such as various types of misclassifications will never happen for a given NAP, and there is no ambiguity between different NAPs. We show that by using NAP, we can verify a significant region of the input space, while still recalling 84% of the data on MNIST. Moreover, we can push the verifiable bound to 10 times larger on the CIFAR10 benchmark. Thus, we argue that NAPs can potentially be used as a more reliable and extensible specification for neural network verification.
Submitted 17 March, 2023; v1 submitted 28 October, 2022;
originally announced October 2022.
-
Novice Type Error Diagnosis with Natural Language Models
Authors:
Chuqin Geng,
Haolin Ye,
Yixuan Li,
Tianyu Han,
Brigitte Pientka,
Xujie Si
Abstract:
Strong static type systems help programmers eliminate many errors without much burden of supplying type annotations. However, this flexibility makes it highly non-trivial to diagnose ill-typed programs, especially for novice programmers. Compared to classic constraint solving and optimization-based approaches, the data-driven approach has shown great promise in identifying the root causes of type errors with higher accuracy. Instead of relying on hand-engineered features, this work explores natural language models for type error localization, which can be trained in an end-to-end fashion without requiring any features. We demonstrate that, for novice type error diagnosis, the language model-based approach significantly outperforms the previous state-of-the-art data-driven approach. Specifically, our model could predict type errors correctly 62% of the time, outperforming the state-of-the-art Nate's data-driven model by 11%, in a more rigorous accuracy metric. Furthermore, we also apply structural probes to explain the performance difference between different language models.
Submitted 7 October, 2022;
originally announced October 2022.
-
Class-Aware Universum Inspired Re-Balance Learning for Long-Tailed Recognition
Authors:
Enhao Zhang,
Chuanxing Geng,
Songcan Chen
Abstract:
Data augmentation for minority classes is an effective strategy for long-tailed recognition, and a large number of methods have been developed around it. Although these methods all ensure the balance in sample quantity, the quality of the augmented samples is not always satisfactory for recognition, being prone to such problems as over-fitting, lack of diversity, semantic drift, etc. To address these issues, we propose the Class-aware Universum Inspired Re-balance Learning (CaUIRL) for long-tailed recognition, which endows the Universum with class-aware ability to re-balance individual minority classes from both sample quantity and quality. In particular, we theoretically prove that the classifiers learned by CaUIRL are consistent with those learned under the balanced condition from a Bayesian perspective. In addition, we further develop a higher-order mixup approach, which can automatically generate class-aware Universum (CaU) data without resorting to any external data. Unlike the traditional Universum, such generated Universum additionally takes the domain similarity, class separability, and sample diversity into account. Extensive experiments on benchmark datasets demonstrate the surprising advantages of our method; in particular, the top-1 accuracy in minority classes is improved by 1.9% to 6% compared to the state-of-the-art method.
Submitted 11 August, 2022; v1 submitted 26 July, 2022;
originally announced July 2022.
-
View-labels Are Indispensable: A Multifacet Complementarity Study of Multi-view Clustering
Authors:
Chuanxing Geng,
Aiyang Han,
Songcan Chen
Abstract:
Consistency and complementarity are two key ingredients for boosting multi-view clustering (MVC). Recently with the introduction of popular contrastive learning, the consistency learning of views has been further enhanced in MVC, leading to promising performance. However, by contrast, the complementarity has not received sufficient attention except just in the feature facet, where the Hilbert-Schmidt Independence Criterion (HSIC) term or the independent encoder-decoder network is usually adopted to capture view-specific information. This motivates us to reconsider the complementarity learning of views comprehensively from multiple facets including the feature, view-label and contrast facets, while maintaining the view consistency. We empirically find that all the facets contribute to the complementarity learning, especially the view-label facet, which is usually neglected by existing methods. Based on this, we develop a novel Multifacet Complementarity learning framework for Multi-View Clustering (MCMVC), which fuses multifacet complementarity information, especially explicitly embedding the view-label information. To the best of our knowledge, this is the first time view-labels have been used explicitly to guide the complementarity learning of views. Compared with the SOTA baseline, MCMVC achieves remarkable improvements, e.g., by average margins over $5.00\%$ and $7.00\%$ respectively in complete and incomplete MVC settings on Caltech101-20 in terms of three evaluation metrics.
Submitted 18 August, 2022; v1 submitted 5 May, 2022;
originally announced May 2022.
-
The Extremal GDoF Gain of Optimal versus Binary Power Control in $K$ User Interference Networks Is $\Theta(\sqrt{K})$
Authors:
Yao-Chia Chan,
Pouya Pezeshkpour,
Chunhua Geng,
Syed A. Jafar
Abstract:
Using ideas from Generalized Degrees of Freedom (GDoF) analyses and extremal network theory, this work studies the extremal gain of optimal power control over binary (on/off) power control, especially in large interference networks, in search of new theoretical insights. Whereas numerical studies have already established that in most practical settings binary power control is close to optimal, the extremal analysis shows not only that there exist settings where the gain from optimal power control can be quite significant, but also bounds the extremal values of such gains from a GDoF perspective. As its main contribution, this work explicitly characterizes the extremal GDoF gain of optimal over binary power control as $\Theta\left(\sqrt{K}\right)$ for all $K$. In particular, the extremal gain is bounded between $\lfloor \sqrt{K}\rfloor$ and $2.5\sqrt{K}$ for every $K$. For $K=2,3,4,5,6$ users, the precise extremal gain is found to be $1, 3/2, 2, 9/4$ and $41/16$, respectively. Networks shown to achieve the extremal gain may be interpreted as multi-tier heterogeneous networks. It is worthwhile to note that because of their focus on asymptotic analysis, the sharp characterizations of extremal gains are valuable primarily from a theoretical perspective, and not as contradictions to the conventional wisdom that binary power control is generally close to optimal in practical, non-asymptotic settings.
Submitted 5 May, 2022; v1 submitted 4 May, 2022;
originally announced May 2022.
-
Universum-inspired Supervised Contrastive Learning
Authors:
Aiyang Han,
Chuanxing Geng,
Songcan Chen
Abstract:
As an effective data augmentation method, Mixup synthesizes an extra amount of samples through linear interpolations. Despite its theoretical dependency on data properties, Mixup reportedly performs well as a regularizer and calibrator contributing reliable robustness and generalization to deep model training. In this paper, inspired by Universum Learning which uses out-of-class samples to assist the target tasks, we investigate Mixup from a largely under-explored perspective: the potential to generate in-domain samples that belong to none of the target classes, that is, universum. We find that in the framework of supervised contrastive learning, Mixup-induced universum can serve as surprisingly high-quality hard negatives, greatly relieving the need for large batch sizes in contrastive learning. With these findings, we propose Universum-inspired supervised Contrastive learning (UniCon), which incorporates the Mixup strategy to generate Mixup-induced universum as universum negatives and pushes them apart from anchor samples of the target classes. We extend our method to the unsupervised setting, proposing the Unsupervised Universum-inspired contrastive model (Un-Uni). Our approach not only improves Mixup with hard labels, but also introduces a novel measure to generate universum data. With a linear classifier on the learned representations, UniCon shows state-of-the-art performance on various datasets. Specifically, UniCon achieves 81.7% top-1 accuracy on CIFAR-100, surpassing the state of the art by a significant margin of 5.2% with a much smaller batch size, typically 256 in UniCon vs. 1024 in SupCon using ResNet-50. Un-Uni also outperforms SOTA methods on CIFAR-100. The code of this paper is released on https://github.com/hannaiiyanggit/UniCon.
Submitted 31 October, 2023; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Causality and Correlation Graph Modeling for Effective and Explainable Session-based Recommendation
Authors:
Huizi Wu,
Cong Geng,
Hui Fang
Abstract:
Session-based recommendation, which has witnessed booming interest recently, focuses on predicting the next item(s) a user will be interested in based on an anonymous session. Most existing studies adopt complex deep learning techniques (e.g., graph neural networks) for effective session-based recommendation. However, they merely address co-occurrence between items, and fail to distinguish well between causality and correlation relationships. Considering the varied interpretations and characteristics of causality and correlation relationships between items, in this study, we propose a novel method denoted as CGSR by jointly modeling causality and correlation relationships between items. In particular, we construct cause, effect and correlation graphs from sessions by simultaneously considering the false causality problem. We further design a graph neural network-based method for session-based recommendation. To conclude, we strive to explore the relationship between items from specific "causality" (directed) and "correlation" (undirected) perspectives. Extensive experiments on three datasets show that our model outperforms other state-of-the-art methods in terms of recommendation accuracy. Moreover, we further propose an explainable framework on CGSR, and demonstrate the explainability of our model via case studies on the Amazon dataset.
Submitted 17 May, 2023; v1 submitted 26 January, 2022;
originally announced January 2022.
-
Bounds all around: training energy-based models with bidirectional bounds
Authors:
Cong Geng,
Jia Wang,
Zhiyong Gao,
Jes Frellsen,
Søren Hauberg
Abstract:
Energy-based models (EBMs) provide an elegant framework for density estimation, but they are notoriously difficult to train. Recent work has established links to generative adversarial networks, where the EBM is trained through a minimax game with a variational value function. We propose a bidirectional bound on the EBM log-likelihood, such that we maximize a lower bound and minimize an upper bound when solving the minimax game. We link one bound to a gradient penalty that stabilizes training, thereby providing grounding for best engineering practice. To evaluate the bounds we develop a new and efficient estimator of the Jacobi-determinant of the EBM generator. We demonstrate that these developments significantly stabilize training and yield high-quality density estimation and sample generation.
Submitted 2 November, 2021; v1 submitted 1 November, 2021;
originally announced November 2021.
-
Deep Learning-based Segmentation of Cerebral Aneurysms in 3D TOF-MRA using Coarse-to-Fine Framework
Authors:
Meng Chen,
Chen Geng,
Dongdong Wang,
Jiajun Zhang,
Ruoyu Di,
Fengmei Li,
Zhiyong Zhou,
Sirong Piao,
Yuxin Li,
Yaikang Dai
Abstract:
BACKGROUND AND PURPOSE: Cerebral aneurysm is one of the most common cerebrovascular diseases, and SAH caused by its rupture has a very high mortality and disability rate. Existing automatic segmentation methods based on DLMs with TOF-MRA modality could not segment edge voxels very well, so our goal is to realize more accurate segmentation of cerebral aneurysms in 3D TOF-MRA with the help of DLMs. MATERIALS AND METHODS: In this research, we proposed an automatic segmentation framework of cerebral aneurysm in 3D TOF-MRA. The framework was composed of two segmentation networks ranging from coarse to fine. The coarse segmentation network, namely DeepMedic, completed the coarse segmentation of cerebral aneurysms, and the processed results were fed into the fine segmentation network, namely dual-channel SE_3D U-Net trained with a weighted loss function, for fine segmentation. Images from ADAM2020 (n=113) were used for training and validation and images from another center (n=45) were used for testing. The segmentation metrics we used include DSC, HD, and VS. RESULTS: The trained cerebral aneurysm segmentation model achieved DSC of 0.75, HD of 1.52, and VS of 0.91 on the validation cohort. On the totally independent test cohort, our method achieved the highest DSC of 0.12, the lowest HD of 11.61, and the highest VS of 0.16 in comparison with state-of-the-art segmentation networks. CONCLUSIONS: The coarse-to-fine framework, which is composed of DeepMedic and dual-channel SE_3D U-Net, can segment cerebral aneurysms in 3D TOF-MRA with superior accuracy.
Submitted 26 October, 2021;
originally announced October 2021.
-
An Automatic Detection Method Of Cerebral Aneurysms In Time-Of-Flight Magnetic Resonance Angiography Images Based On Attention 3D U-Net
Authors:
Chen Geng,
Meng Chen,
Ruoyu Di,
Dongdong Wang,
Liqin Yang,
Wei Xia,
Yuxin Li,
Daoying Geng
Abstract:
Background: Subarachnoid hemorrhage caused by ruptured cerebral aneurysm often leads to fatal consequences. However, if the aneurysm can be found and treated during asymptomatic periods, the probability of rupture can be greatly reduced. At present, time-of-flight magnetic resonance angiography is one of the most commonly used non-invasive screening techniques for cerebral aneurysm, and the application of deep learning technology in aneurysm detection can effectively improve the screening effect of aneurysm. Existing studies have found that three-dimensional features play an important role in aneurysm detection, but they require a large amount of training data and have problems such as a high false positive rate. Methods: This paper proposed a novel method for aneurysm detection. First, a fully automatic cerebral artery segmentation algorithm without training data was used to extract the volume of interest, and then the 3D U-Net was improved by the 3D SENet module to establish an aneurysm detection model. Eventually a set of fully automated, end-to-end aneurysm detection methods have been formed. Results: A total of 231 magnetic resonance angiography image data were used in this study, among which 132 were training sets, 34 were internal test sets and 65 were external test sets. The presented method obtained 97.89% sensitivity in the five-fold cross-validation and obtained 91.0% sensitivity with 2.48 false positives/case in the detection of the external test sets. Conclusions: Compared with the results of our previous studies and other studies, the method in this paper achieves a very competitive sensitivity with less training data and maintains a low false positive rate. As the only method currently using 3D U-Net for aneurysm detection, it proves the feasibility and superior performance of this network in aneurysm detection, and also explores the potential of the channel attention mechanism in this task.
Submitted 25 October, 2021;
originally announced October 2021.
-
Differentiable Programming of Isometric Tensor Networks
Authors:
Chenhua Geng,
Hong-Ye Hu,
Yijian Zou
Abstract:
Differentiable programming is a new programming paradigm which enables large scale optimization through automatic calculation of gradients, also known as auto-differentiation. This concept emerges from deep learning, and has also been generalized to tensor network optimizations. Here, we extend differentiable programming to tensor networks with isometric constraints, with applications to the multiscale entanglement renormalization ansatz (MERA) and tensor network renormalization (TNR). By introducing several gradient-based optimization methods for the isometric tensor network and comparing them with the Evenbly-Vidal method, we show that auto-differentiation performs better in both stability and accuracy. We numerically tested our methods on the 1D critical quantum Ising spin chain and the 2D classical Ising model. We calculate the ground state energy for the 1D quantum model, the internal energy for the classical model, and the scaling dimensions of scaling operators, and find they all agree well with the theory.
Submitted 31 October, 2021; v1 submitted 8 October, 2021;
originally announced October 2021.
-
Experimental Study on Probabilistic ToA and AoA Joint Localization in Real Indoor Environments
Authors:
Chunhua Geng,
Traian E. Abrudan,
Veli-Matti Kolmonen,
Howard Huang
Abstract:
In this paper, we study probabilistic time-of-arrival (ToA) and angle-of-arrival (AoA) joint localization in real indoor environments. To mitigate the effects of multipath propagation, the joint localization algorithm incorporates into the likelihood function Gaussian mixture models (GMM) and the Von Mises-Fisher distribution to model time bias errors and angular uncertainty, respectively. We evaluate the algorithm performance using a proprietary prototype deployed in an indoor factory environment with infrastructure receivers in each of the four corners at the ceiling of a 10 meter by 20 meter section. The field test results show that our joint probabilistic localization algorithm significantly outperforms baselines using only ToA or AoA measurements and achieves 2-D sub-meter accuracy at the 90%-ile. We also numerically demonstrate that the joint localization algorithm is more robust to synchronization errors than the baseline using ToA measurements only.
Submitted 31 March, 2021; v1 submitted 22 February, 2021;
originally announced February 2021.
-
Multilevel Topological Interference Management: A TIM-TIN Perspective
Authors:
Chunhua Geng,
Hua Sun,
Syed A. Jafar
Abstract:
The robust principles of treating interference as noise (TIN) when it is sufficiently weak, and avoiding it when it is not, form the background of this work. Combining TIN with the topological interference management (TIM) framework that identifies optimal interference avoidance schemes, we formulate a TIM-TIN problem for multilevel topological interference management, wherein only a coarse knowledge of channel strengths and no knowledge of channel phases is available to transmitters. To address the TIM-TIN problem, we first propose an analytical baseline approach, which decomposes a network into TIN and TIM components, allocates the signal power levels to each user in the TIN component, allocates signal vector space dimensions to each user in the TIM component, and guarantees that the product of the two is an achievable number of signal dimensions available to each user in the original network. Next, a distributed numerical algorithm called ZEST is developed. The convergence of the algorithm is demonstrated, leading to the duality of the TIM-TIN problem (in terms of GDoF). Numerical results are also provided to demonstrate the superior sum-rate performance and fast convergence of ZEST.
Submitted 8 February, 2021;
originally announced February 2021.
-
Leave Zero Out: Towards a No-Cross-Validation Approach for Model Selection
Authors:
Weikai Li,
Chuanxing Geng,
Songcan Chen
Abstract:
As the main workhorse for model selection, Cross Validation (CV) has achieved empirical success due to its simplicity and intuitiveness. However, despite its ubiquitous role, CV often falls into the following notorious dilemmas. On the one hand, for small data cases, CV suffers from a conservatively biased estimation, since some part of the limited data has to be held out for validation. On the other hand, for large data cases, CV tends to be extremely cumbersome, e.g., intolerably time-consuming, due to the repeated training procedures. Naturally, a straightforward ambition for CV is to validate the models with far less computational cost, while making full use of the entire given data-set for training. Thus, instead of holding out the given data, a cheap and theoretically guaranteed auxiliary/augmented validation is derived strategically in this paper. Such an embarrassingly simple strategy only needs to train models on the entire given data-set once, making the model selection considerably more efficient. In addition, the proposed validation approach is suitable for a wide range of learning settings, because both the augmentation and the out-of-sample estimation are independent of the learning process. In the end, we demonstrate the accuracy and computational benefits of our proposed method by extensive evaluation on multiple data-sets, models and tasks.
Submitted 28 December, 2020; v1 submitted 24 December, 2020;
originally announced December 2020.
-
Omni-GAN: On the Secrets of cGANs and Beyond
Authors:
Peng Zhou,
Lingxi Xie,
Bingbing Ni,
Cong Geng,
Qi Tian
Abstract:
The conditional generative adversarial network (cGAN) is a powerful tool for generating high-quality images, but existing approaches mostly suffer from unsatisfying performance or the risk of mode collapse. This paper presents Omni-GAN, a variant of cGAN that reveals the devil in designing a proper discriminator for training the model. The key is to ensure that the discriminator receives strong supervision to perceive the concepts and moderate regularization to avoid collapse. Omni-GAN is easily implemented and freely integrated with off-the-shelf encoding methods (e.g., implicit neural representation, INR). Experiments validate the superior performance of Omni-GAN and Omni-INR-GAN in a wide range of image generation and restoration tasks. In particular, Omni-INR-GAN sets new records on the ImageNet dataset with impressive Inception scores of 262.85 and 343.22 for the image sizes of 128 and 256, respectively, surpassing the previous records by 100+ points. Moreover, leveraging the generator prior, Omni-INR-GAN can extrapolate low-resolution images to arbitrary resolution, even up to x60+ higher resolution. Code is available.
Submitted 28 March, 2021; v1 submitted 25 November, 2020;
originally announced November 2020.
-
Generative Model without Prior Distribution Matching
Authors:
Cong Geng,
Jia Wang,
Li Chen,
Zhiyong Gao
Abstract:
Variational Autoencoder (VAE) and its variations are classic generative models that learn a low-dimensional latent representation to satisfy some prior distribution (e.g., Gaussian distribution). Their advantages over GAN are that they can simultaneously generate high dimensional data and learn latent representations to reconstruct the inputs. However, it has been observed that a trade-off exists between reconstruction and generation since matching the prior distribution may destroy the geometric structure of the data manifold. To mitigate this problem, we propose to let the prior match the embedding distribution rather than forcing the latent variables to fit the prior. The embedding distribution is trained using a simple regularized autoencoder architecture which preserves the geometric structure to the maximum. Then an adversarial strategy is employed to achieve a latent mapping. We provide both theoretical and experimental support for the effectiveness of our method, which alleviates the contradiction between preserving the topological properties of the data manifold and distribution matching in latent space.
Submitted 23 September, 2020;
originally announced September 2020.
-
Deep-learning enhancement of large scale numerical simulations
Authors:
Caspar van Leeuwen,
Damian Podareanu,
Valeriu Codreanu,
Maxwell X. Cai,
Axel Berg,
Simon Portegies Zwart,
Robin Stoffer,
Menno Veerman,
Chiel van Heerwaarden,
Sydney Otten,
Sascha Caron,
Cunliang Geng,
Francesco Ambrosetti,
Alexandre M. J. J. Bonvin
Abstract:
Traditional simulations on High-Performance Computing (HPC) systems typically involve modeling very large domains and/or very complex equations. HPC systems allow running large models, but limits in performance increase that have become more prominent in the last 5-10 years will likely be experienced. Therefore new approaches are needed to increase application performance. Deep learning appears to be a promising way to achieve this. Recently deep learning has been employed to enhance solving problems that traditionally are solved with large-scale numerical simulations using HPC. This type of application, deep learning for high-performance computing, is the theme of this whitepaper. Our goal is to provide concrete guidelines to scientists and others that would like to explore opportunities for applying deep learning approaches in their own large-scale numerical simulations. These guidelines have been extracted from a number of experiments that have been undertaken in various scientific domains over the last two years, and which are described in more detail in the Appendix. Additionally, we share the most important lessons that we have learned.
Submitted 30 March, 2020;
originally announced April 2020.
-
A Multi-view Perspective of Self-supervised Learning
Authors:
Chuanxing Geng,
Zhenghao Tan,
Songcan Chen
Abstract:
As a newly emerging unsupervised learning paradigm, self-supervised learning (SSL) has recently gained widespread attention. It usually introduces a pretext task without manual annotation of data. With its help, SSL effectively learns the feature representation beneficial for downstream tasks. Thus the pretext task plays a key role. However, the study of its design, and especially of its essence, is currently still open. In this paper, we borrow a multi-view perspective to decouple a class of popular pretext tasks into a combination of view data augmentation (VDA) and view label classification (VLC), where we attempt to explore the essence of such pretext tasks while providing some insights into their design. Specifically, a simple multi-view learning framework (SSL-MV) is designed, which assists the feature learning of downstream tasks (original view) through the same tasks on the augmented views. SSL-MV focuses on VDA while abandoning VLC, empirically uncovering that it is VDA, rather than the generally considered VLC, that dominates the performance of such SSL. Additionally, thanks to replacing VLC with VDA tasks, SSL-MV also enables an integrated inference combining the predictions from the augmented views, further improving the performance. Experiments on several benchmark datasets demonstrate its advantages.
Submitted 15 May, 2020; v1 submitted 22 February, 2020;
originally announced March 2020.
-
Uniform Interpolation Constrained Geodesic Learning on Data Manifold
Authors:
Cong Geng,
Jia Wang,
Li Chen,
Wenbo Bao,
Chu Chu,
Zhiyong Gao
Abstract:
In this paper, we propose a method to learn a minimizing geodesic within a data manifold. Along the learned geodesic, our method can generate high-quality interpolations between two given data samples. Specifically, we use an autoencoder network to map data samples into latent space and perform interpolation via an interpolation network. We add prior geometric information to regularize our autoencoder for the convexity of representations so that for any given interpolation approach, the generated interpolations remain within the distribution of the data manifold. Before the learning of a geodesic, a proper Riemannian metric should be defined. Therefore, we induce a Riemannian metric by the canonical metric in the Euclidean space which the data manifold is isometrically immersed in. Based on this defined Riemannian metric, we introduce a constant speed loss and a minimizing geodesic loss to regularize the interpolation network to generate uniform interpolation along the learned geodesic on the manifold. We provide a theoretical analysis of our model and use image translation as an example to demonstrate the effectiveness of our method.
Submitted 14 August, 2020; v1 submitted 12 February, 2020;
originally announced February 2020.
-
End-to-end speech enhancement based on discrete cosine transform
Authors:
Chuang Geng,
Lei Wang
Abstract:
Previous speech enhancement methods focus on estimating the short-time spectrum of speech signals due to its short-term stability. However, these methods often only estimate the clean magnitude spectrum and reuse the noisy phase when resynthesizing speech signals, which is unlikely to yield a valid short-time Fourier transform (STFT). Recently, DNN-based speech enhancement methods mainly estimate the magnitude and phase spectra jointly. These methods usually give better performance than magnitude spectrum estimation but need much larger computation and memory overhead. In this paper, we propose using the Discrete Cosine Transform (DCT) to reconstruct a valid short-time spectrum. Under the U-net structure, we enhance the real spectrogram and finally achieve perfect performance.
Submitted 22 October, 2019; v1 submitted 17 October, 2019;
originally announced October 2019.
-
Visual and Semantic Prototypes-Jointly Guided CNN for Generalized Zero-shot Learning
Authors:
Chuanxing Geng,
Lue Tao,
Songcan Chen
Abstract:
In the process of exploring the world, the curiosity constantly drives humans to cognize new things. Supposing you are a zoologist, for a presented animal image, you can recognize it immediately if you know its class. Otherwise, you would more likely attempt to cognize it by exploiting the side-information (e.g., semantic information, etc.) you have accumulated. Inspired by this, this paper decomposes the generalized zero-shot learning (G-ZSL) task into an open set recognition (OSR) task and a zero-shot learning (ZSL) task, where OSR recognizes seen classes (if we have seen (or known) them) and rejects unseen classes (if we have never seen (or known) them before), while ZSL identifies the unseen classes rejected by the former. Simultaneously, without violating OSR's assumptions (only known class knowledge is available in training), we also first attempt to explore a new generalized open set recognition (G-OSR) by introducing the accumulated side-information from known classes to OSR. For G-ZSL, such a decomposition effectively solves the class overfitting problem with easily misclassifying unseen classes as seen classes. The problem is ubiquitous in most existing G-ZSL methods. On the other hand, for G-OSR, introducing such semantic information of known classes not only improves the recognition performance but also endows OSR with the cognitive ability of unknown classes. Specifically, a visual and semantic prototypes-jointly guided convolutional neural network (VSG-CNN) is proposed to fulfill these two tasks (G-ZSL and G-OSR) in a unified end-to-end learning framework. Extensive experiments on benchmark datasets demonstrate the advantages of our learning framework.
Submitted 14 August, 2019; v1 submitted 11 August, 2019;
originally announced August 2019.
-
Optimal Secure GDoF of Symmetric Gaussian Wiretap Channel with a Helper
Authors:
**yuan Chen,
Chunhua Geng
Abstract:
We study a symmetric Gaussian wiretap channel with a helper, where a confidential message is sent from a transmitter to a legitimate receiver, in the presence of a helper and an eavesdropper, under a weak notion of secrecy constraint. For this setting, we characterize the optimal secure generalized degrees-of-freedom (GDoF). The result reveals that, adding a helper can significantly increase the secure GDoF of the wiretap channel. The result is supported by a new converse and a new scheme. In the proposed scheme, the helper sends a cooperative jamming signal at a specific power level and direction. In this way, it minimizes the penalty in GDoF incurred by the secrecy constraint. In the secure rate analysis, the techniques of noise removal and signal separation are used.
Submitted 11 September, 2019; v1 submitted 26 December, 2018;
originally announced December 2018.
-
Recent Advances in Open Set Recognition: A Survey
Authors:
Chuanxing Geng,
Sheng-jun Huang,
Songcan Chen
Abstract:
In real-world recognition/classification tasks, limited by various objective factors, it is usually difficult to collect training samples to exhaust all classes when training a recognizer or classifier. A more realistic scenario is open set recognition (OSR), where incomplete knowledge of the world exists at training time, and unknown classes can be submitted to an algorithm during testing, requiring the classifiers to not only accurately classify the seen classes, but also effectively deal with the unseen ones. This paper provides a comprehensive survey of existing open set recognition techniques covering various aspects ranging from related definitions, representations of models, datasets, evaluation criteria, and algorithm comparisons. Furthermore, we briefly analyze the relationships between OSR and its related tasks including zero-shot, one-shot (few-shot) recognition/learning techniques, classification with reject option, and so forth. Additionally, we also overview the open world recognition which can be seen as a natural extension of OSR. Importantly, we highlight the limitations of existing approaches and point out some promising subsequent research directions in this field.
Submitted 21 March, 2020; v1 submitted 20 November, 2018;
originally announced November 2018.
-
Collective decision for open set recognition
Authors:
Chuanxing Geng,
Songcan Chen
Abstract:
In open set recognition (OSR), almost all existing methods are designed specially for recognizing individual instances, even when these instances arrive collectively in batches. In making a decision, recognizers either reject them or categorize them into some known class using an empirically set threshold. Thus the decision threshold plays a key role. However, its selection usually depends on the knowledge of known classes, inevitably incurring risks due to the lack of available information from unknown classes. On the other hand, a more realistic OSR system should NOT just rest on a reject decision but should go further, especially in discovering the hidden unknown classes among the rejected instances, to which existing OSR methods pay no special attention. In this paper, we introduce a novel collective/batch decision strategy with the aim of extending existing OSR to new class discovery while considering correlations among the testing instances. Specifically, a collective decision-based OSR framework (CD-OSR) is proposed by slightly modifying the Hierarchical Dirichlet process (HDP). Thanks to HDP, our CD-OSR does not need to define the decision threshold and can implement the open set recognition and new class discovery simultaneously. Finally, extensive experiments on benchmark datasets indicate the validity of CD-OSR.
Submitted 21 March, 2020; v1 submitted 28 June, 2018;
originally announced June 2018.
-
On the Optimality of Treating Interference as Noise: Compound Interference Networks
Authors:
Chunhua Geng,
Syed A. Jafar
Abstract:
In a K-user Gaussian interference channel, it has been shown by Geng et al. that if for each user the desired signal strength is no less than the sum of the strengths of the strongest interference from this user and the strongest interference to this user (all values in dB scale), then power control and treating interference as noise (TIN) is optimal from the perspective of generalized degrees of freedom (GDoF) and achieves the entire channel capacity region to within a constant gap. In this work, we generalize the optimality of TIN to compound networks. We show that for a K-user compound Gaussian interference channel, if in every possible state for each receiver, the channel always satisfies the TIN-optimality condition identified by Geng et al., then the GDoF region of the compound channel is the intersection of the GDoF regions of all possible network realizations, which is achievable by power control and TIN. Furthermore, we demonstrate that for a general K-user compound interference channel, regardless of the number of states of each receiver, we can always construct a counterpart K-user regular interference channel that has the same TIN region as the original compound channel. The regular interference channel has only one state for each receiver, which may be different from all of the original states. Solving the GDoF-based power control problem for the compound channel is equivalent to solving the same problem in its regular counterpart. Exploring the power control problem further we develop a centralized power control scheme for K-user compound interference channels, to achieve all the Pareto optimal GDoF tuples. Finally, based on this scheme, we devise an iterative power control algorithm which requires at most K updates to obtain the globally optimal power allocation for any feasible GDoF tuple.
Submitted 8 December, 2014;
originally announced December 2014.
-
On the Optimality of Treating Interference as Noise: General Message Sets
Authors:
Chunhua Geng,
Hua Sun,
Syed A. Jafar
Abstract:
In a K-user Gaussian interference channel, it has been shown that if for each user the desired signal strength is no less than the sum of the strengths of the strongest interference from this user and the strongest interference to this user (all values in dB scale), then treating interference as noise (TIN) is optimal from the perspective of generalized degrees-of-freedom (GDoF) and achieves the entire channel capacity region to within a constant gap. In this work, we show that for such TIN-optimal interference channels, even if the message set is expanded to include an independent message from each transmitter to each receiver, operating the new channel as the original interference channel and treating interference as noise is still optimal for the sum capacity up to a constant gap. Furthermore, we extend the result to the sum-GDoF optimality of TIN in the general setting of X channels with arbitrary numbers of transmitters and receivers.
Submitted 12 January, 2014;
originally announced January 2014.
-
Multilevel Topological Interference Management
Authors:
Chunhua Geng,
Hua Sun,
Syed A. Jafar
Abstract:
The robust principles of treating interference as noise (TIN) when it is sufficiently weak, and avoiding it when it is not, form the background for this work. Combining TIN with the topological interference management (TIM) framework that identifies optimal interference avoidance schemes, a baseline TIM-TIN approach is proposed which decomposes a network into TIN and TIM components, allocates the signal power levels to each user in the TIN component, allocates signal vector space dimensions to each user in the TIM component, and guarantees that the product of the two is an achievable number of signal dimensions available to each user in the original network.
Submitted 25 August, 2013;
originally announced August 2013.
-
On the Optimality of Treating Interference as Noise
Authors:
Chunhua Geng,
Navid Naderializadeh,
A. Salman Avestimehr,
Syed A. Jafar
Abstract:
It is shown that in the K-user interference channel, if for each user the desired signal strength is no less than the sum of the strengths of the strongest interference from this user and the strongest interference to this user (all values in dB scale), then the simple scheme of using point to point Gaussian codebooks with appropriate power levels at each transmitter and treating interference as noise at every receiver (in short, TIN scheme) achieves all points in the capacity region to within a constant gap. The generalized degrees of freedom (GDoF) region under this condition is a polyhedron, which is shown to be fully achieved by the same scheme, without the need for time-sharing. The results are proved by first deriving a polyhedral relaxation of the GDoF region achieved by TIN, then providing a dual characterization of this polyhedral region via the use of potential functions, and finally proving the optimality of this region in the desired regime.
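The condition stated in the abstract above can be checked mechanically. The following is a minimal sketch only, not the authors' code: the function name `tin_condition_holds` and the matrix layout (`alpha[i][j]` is the strength, in dB scale, of the link from transmitter `j` to receiver `i`) are assumptions made for illustration.

```python
from typing import List


def tin_condition_holds(alpha: List[List[float]]) -> bool:
    """Check the condition from the abstract for every user.

    alpha[i][j] is an assumed layout: the strength (dB scale) of the link
    from transmitter j to receiver i, so alpha[i][i] is user i's desired
    signal strength.
    """
    k = len(alpha)
    for i in range(k):
        others = [j for j in range(k) if j != i]
        if not others:
            continue  # a single user has no interference to compare against
        # Strongest interference caused by user i at any other receiver.
        strongest_caused = max(alpha[j][i] for j in others)
        # Strongest interference received by user i from any other transmitter.
        strongest_received = max(alpha[i][j] for j in others)
        if alpha[i][i] < strongest_caused + strongest_received:
            return False
    return True


# Hypothetical example values, chosen only to exercise the check.
weak_interference = [
    [2.0, 0.5, 0.3],
    [0.6, 2.0, 0.4],
    [0.2, 0.7, 1.5],
]
strong_interference = [
    [1.0, 0.8],
    [0.8, 1.0],
]
```

With these made-up values, the first matrix satisfies the condition for all three users, while the second fails because each user's cross links sum to more than the desired link.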
Submitted 20 May, 2013;
originally announced May 2013.
-
Topological Interference Management with Alternating Connectivity
Authors:
Hua Sun,
Chunhua Geng,
Syed A. Jafar
Abstract:
The topological interference management problem refers to the study of the capacity of partially connected linear (wired and wireless) communication networks with no channel state information at the transmitters (no CSIT) beyond the network topology, i.e., a knowledge of which channel coefficients are zero (weaker than the noise floor in the wireless case). While the problem is originally studied with fixed topology, in this work we explore the implications of varying connectivity, through a series of simple and conceptually representative examples. Specifically, we highlight the synergistic benefits of coding across alternating topologies.
Submitted 16 February, 2013;
originally announced February 2013.
-
Degrees of Freedom of MIMO X Networks: Spatial Scale Invariance, One-Sided Decomposability and Linear Feasibility
Authors:
Hua Sun,
Chunhua Geng,
Tiangao Gou,
Syed A. Jafar
Abstract:
We show that an M X N user MIMO X network with A antennas at each node has AMN/(M+N-1) degrees of freedom (DoF), thus resolving in this case a discrepancy between the spatial scale invariance conjecture (scaling the number of antennas at each node by a constant factor will scale the total DoF by the same factor) and a decomposability property of overconstrained wireless networks. While the best previously-known general DoF outer bound is consistent with the spatial invariance conjecture, the best previously-known general DoF inner bound, inspired by the K user MIMO interference channel, was based on the decomposition of every transmitter and receiver into multiple single antenna nodes, transforming the network into an AM X AN user SISO X network. While such a decomposition is DoF optimal for the K user MIMO interference channel, a gap remained between the best inner and outer bound for the MIMO X channel. Here we close this gap with the new insight that the MIMO X network is only one-sided decomposable, i.e., either all the transmitters or all the receivers (but not both) can be decomposed by splitting multiple antenna nodes into multiple single antenna nodes without loss of DoF. The result is extended to SIMO and MISO X networks as well and in each case the DoF results satisfy the spatial scale invariance property. In addition, the feasibility of linear interference alignment is investigated based only on spatial beamforming without symbol extensions. Similar to MIMO interference networks, we show that when the problem is improper, it is infeasible.
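As a small illustration of the DoF expression AMN/(M+N-1) quoted in the abstract above, the sketch below evaluates it with exact fractions and shows the spatial scale invariance property (scaling A scales the DoF by the same factor). The function name `mimo_x_dof` is hypothetical and introduced only for this example.

```python
from fractions import Fraction


def mimo_x_dof(m: int, n: int, a: int) -> Fraction:
    """Evaluate A*M*N / (M + N - 1) exactly, using the abstract's symbols.

    m and n are the two dimensions of the M X N network and a is the
    number of antennas at each node.
    """
    if m < 1 or n < 1 or a < 1:
        raise ValueError("m, n and a must be positive integers")
    return Fraction(a * m * n, m + n - 1)


# Doubling the antennas at every node doubles the total DoF.
single_antenna = mimo_x_dof(3, 2, 1)
double_antenna = mimo_x_dof(3, 2, 2)
```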
Submitted 21 August, 2013; v1 submitted 25 July, 2012;
originally announced July 2012.
-
Selective Multipath Interference Canceller with Linear Equalization for DS-UWB Systems with Low Spreading Factor
Authors:
Chunhua Geng,
Yukui Pei,
Ning Ge
Abstract:
In high rate DS-UWB systems with low spreading factor, the selective multipath interference canceller with linear equalization (SMPIC-LE) is developed to alleviate severe multipath interferences induced by the poor orthogonality of spreading codes. The SMPIC iteratively mitigates the strongest inter-path interference, inter-chip interference and inter-symbol interference, while the former two are unresolvable in conventional RAKE-decision feedback equalizer (DFE) receivers. The numerical results and complexity analysis demonstrate that SMPIC-LE with proper parameters provides an attractive overall advantage in performance and computational complexity compared with RAKE-DFE. In addition, it approaches the matched filter bound well as the RAKE finger in SMPIC increases.
Submitted 21 December, 2010;
originally announced December 2010.
-
Impact of Mistiming on the Achievable Information Rate of Rake Receivers in DS-UWB Systems
Authors:
Chunhua Geng,
Yukui Pei,
Jiaqi Zhang,
Ning Ge
Abstract:
In this paper, we investigate the impact of mistiming on the performance of Rake receivers in direct-sequence ultra-wideband (DS-UWB) systems from the perspective of the achievable information rate. A generalized expression for the performance degradation due to mistiming is derived. Monte Carlo simulations based on this expression are then conducted, which demonstrate that the performance loss has little relationship with the target achievable information rate, but varies significantly with the system bandwidth and the multipath diversity order, which reflects design trade-offs among the system timing requirement, the bandwidth and the implementation complexity. In addition, the performance degradations of Rake receivers with different multipath component selection schemes and combining techniques are compared. Among these receivers, the widely used maximal ratio combining (MRC) selective-Rake (S-Rake) suffers the largest performance loss in the presence of mistiming.
Submitted 20 December, 2010;
originally announced December 2010.