-
Contrastive Learning for Self-Supervised Pre-Training of Point Cloud Segmentation Networks With Image Data
Authors:
Andrej Janda,
Brandon Wagstaff,
Edwin G. Ng,
Jonathan Kelly
Abstract:
Reducing the quantity of annotations required for supervised training is vital when labels are scarce and costly. This reduction is particularly important for semantic segmentation tasks involving 3D datasets, which are often significantly smaller and more challenging to annotate than their image-based counterparts. Self-supervised pre-training on unlabelled data is one way to reduce the amount of manual annotations needed. Previous work has focused on pre-training with point clouds exclusively. While useful, this approach often requires two or more registered views. In the present work, we combine image and point cloud modalities by first learning self-supervised image features and then using these features to train a 3D model. By incorporating image data, which is often included in many 3D datasets, our pre-training method only requires a single scan of a scene and can be applied to cases where localization information is unavailable. We demonstrate that our pre-training approach, despite using single scans, achieves comparable performance to other multi-scan, point cloud-only methods.
Submitted 4 September, 2023; v1 submitted 17 January, 2023;
originally announced January 2023.
-
Self-Supervised Pre-training of 3D Point Cloud Networks with Image Data
Authors:
Andrej Janda,
Brandon Wagstaff,
Edwin G. Ng,
Jonathan Kelly
Abstract:
Reducing the quantity of annotations required for supervised training is vital when labels are scarce and costly. This reduction is especially important for semantic segmentation tasks involving 3D datasets that are often significantly smaller and more challenging to annotate than their image-based counterparts. Self-supervised pre-training on large unlabelled datasets is one way to reduce the amount of manual annotations needed. Previous work has focused on pre-training with point cloud data exclusively; this approach often requires two or more registered views. In the present work, we combine image and point cloud modalities, by first learning self-supervised image features and then using these features to train a 3D model. By incorporating image data, which is often included in many 3D datasets, our pre-training method only requires a single scan of a scene. We demonstrate that our pre-training approach, despite using single scans, achieves comparable performance to other multi-scan, point cloud-only methods.
Submitted 16 December, 2022; v1 submitted 21 November, 2022;
originally announced November 2022.
-
TransPOS: Transformers for Consolidating Different POS Tagset Datasets
Authors:
Alex Li,
Ilyas Bankole-Hameed,
Ranadeep Singh,
Gabriel Shen Han Ng,
Akshat Gupta
Abstract:
In the hope of expanding training data, researchers often want to merge two or more datasets that were created using different labeling schemes. This paper considers two datasets that label part-of-speech (POS) tags under different tagging schemes and leverages the supervised labels of one dataset to help generate labels for the other dataset. This paper further discusses the theoretical difficulties of this approach and proposes a novel supervised architecture employing Transformers to tackle the problem of consolidating two completely disjoint datasets. The results diverge from initial expectations and discourage exploration into the use of disjoint labels to consolidate datasets with different labels.
Submitted 24 September, 2022;
originally announced September 2022.
-
Deep Learning and Spectral Embedding for Graph Partitioning
Authors:
Alice Gatti,
Zhixiong Hu,
Tess Smidt,
Esmond G. Ng,
Pieter Ghysels
Abstract:
We present a graph bisection and partitioning algorithm based on graph neural networks. For each node in the graph, the network outputs probabilities for each of the partitions. The graph neural network consists of two modules: an embedding phase and a partitioning phase. The embedding phase is trained first by minimizing a loss function inspired by spectral graph theory. The partitioning module is trained through a loss function that corresponds to the expected value of the normalized cut. Both parts of the neural network rely on SAGE convolutional layers and graph coarsening using heavy edge matching. The multilevel structure of the neural network is inspired by the multigrid algorithm. Our approach generalizes very well to bigger graphs and has partition quality comparable to METIS, Scotch and spectral partitioning, with shorter runtime compared to METIS and spectral partitioning.
Submitted 8 December, 2021; v1 submitted 16 October, 2021;
originally announced October 2021.
-
The Need for a Fine-grained approach in Just-in-Time Defect Prediction
Authors:
Giuseppe Ng,
Charibeth Cheng
Abstract:
With software system complexity leading to the rise of software defects, research efforts have been devoted to techniques for predicting software defects and to Just-in-time (JIT) defect prediction, which predicts whether a code change is defective. Even when features are used to identify potentially defective code changes, the inspection effort is still significant. As a code change can impact several files, we investigate an open source project to identify potential gaps with features from the JIT perspective. In addition, given the lack of publicly available JIT datasets that link the features with actual commits, we also present a new dataset that can be utilized in JIT and semantic analysis.
Submitted 1 October, 2021;
originally announced October 2021.
-
Investigation of Dataset Features for Just-in-Time Defect Prediction
Authors:
Giuseppe Ng,
Charibeth Cheng
Abstract:
Just-in-time (JIT) defect prediction refers to the technique of predicting whether a code change is defective. Many contributions have been made in this area through the excellent dataset by Kamei. In this paper, we revisit the dataset and highlight preprocessing difficulties with the dataset and the limitations of the dataset on unsupervised learning. Secondly, we propose certain features in the Kamei dataset that can be used for training models. Lastly, we discuss the limitations of the dataset's features.
Submitted 25 September, 2021;
originally announced September 2021.
-
Student and Faculty Adviser Insights in an Agile Methodology Integrated Filipino Company-Sponsored I.T. Capstone Program
Authors:
Giuseppe Ng,
Rey Vincenzo Cruz
Abstract:
To improve the Information Technology (I.T.) graduate skill set, students need to be immersed in as realistic a software development environment as possible. In continuing our work on integrating Agile Methodology into the Capstone Program of our Bachelor of Science in I.T. (BSIT) degree program, this paper discusses the student challenges and difficulties during the software development project, and provides recommendations on improving the students' overall learning process in such a program. We collected survey data from the whole population of 90 BSIT students across four academic years about their experience with their client and the Capstone Program itself. Conceptual content analysis was then applied to discover and describe underlying themes. Also, faculty advisers were tasked with writing about their interactions, thoughts, and observations on their respective student group advisees. These showed issues with time management, communication, and competency. Also, groups that excelled exhibited better team coordination and a complete grasp of the Agile methodology. For future implementations, clearer task definition and reducing the skill gaps are necessary for better execution.
Submitted 13 April, 2021;
originally announced April 2021.
-
Graph Partitioning and Sparse Matrix Ordering using Reinforcement Learning and Graph Neural Networks
Authors:
Alice Gatti,
Zhixiong Hu,
Tess Smidt,
Esmond G. Ng,
Pieter Ghysels
Abstract:
We present a novel method for graph partitioning, based on reinforcement learning and graph convolutional neural networks. Our approach is to recursively partition coarser representations of a given graph. The neural network is implemented using SAGE graph convolution layers, and trained using an advantage actor critic (A2C) agent. We present two variants, one for finding an edge separator that minimizes the normalized cut or quotient cut, and one that finds a small vertex separator. The vertex separators are then used to construct a nested dissection ordering to permute a sparse matrix so that its triangular factorization will incur less fill-in. The partitioning quality is compared with partitions obtained using METIS and SCOTCH, and the nested dissection ordering is evaluated in the sparse solver SuperLU. Our results show that the proposed method achieves similar partitioning quality as METIS and SCOTCH. Furthermore, the method generalizes across different classes of graphs, and works well on a variety of graphs from the SuiteSparse sparse matrix collection.
Submitted 28 June, 2021; v1 submitted 8 April, 2021;
originally announced April 2021.
-
Pushing the Limits of Non-Autoregressive Speech Recognition
Authors:
Edwin G. Ng,
Chung-Cheng Chiu,
Yu Zhang,
William Chan
Abstract:
We combine recent advancements in end-to-end speech recognition to non-autoregressive automatic speech recognition. We push the limits of non-autoregressive state-of-the-art results for multiple datasets: LibriSpeech, Fisher+Switchboard and Wall Street Journal. Key to our recipe, we leverage CTC on giant Conformer neural network architectures with SpecAugment and wav2vec2 pre-training. We achieve 1.8%/3.6% WER on LibriSpeech test/test-other sets, 5.1%/9.8% WER on Switchboard, and 3.4% on the Wall Street Journal, all without a language model.
Submitted 11 September, 2021; v1 submitted 7 April, 2021;
originally announced April 2021.
-
Understanding Guided Image Captioning Performance across Domains
Authors:
Edwin G. Ng,
Bo Pang,
Piyush Sharma,
Radu Soricut
Abstract:
Image captioning models generally lack the capability to take into account user interest, and usually default to global descriptions that try to balance readability, informativeness, and information overload. On the other hand, VQA models generally lack the ability to provide long descriptive answers, while expecting the textual question to be quite precise. We present a method to control the concepts that an image caption should focus on, using an additional input called the guiding text that refers to either groundable or ungroundable concepts in the image. Our model consists of a Transformer-based multimodal encoder that uses the guiding text together with global and object-level image features to derive early-fusion representations used to generate the guided caption. While models trained on Visual Genome data have an in-domain advantage of fitting well when guided with automatic object labels, we find that guided captioning models trained on Conceptual Captions generalize better on out-of-domain images and guiding texts. Our human-evaluation results indicate that attempting in-the-wild guided image captioning requires access to large, unrestricted-domain training datasets, and that increased style diversity (even without increasing the number of unique tokens) is a key factor for improved performance.
Submitted 10 November, 2021; v1 submitted 3 December, 2020;
originally announced December 2020.
-
Acoustic prediction of flowrate: varying liquid jet stream onto a free surface
Authors:
Balamurali B T,
Edwin Jonathan Aslim,
Yun Shu Lynn Ng,
Tricia Li,
Chuen Kuo,
Jacob Shihang Chen,
Dorien Herremans,
Lay Guat Ng,
Jer-Ming Chen
Abstract:
Information on liquid jet stream flow is crucial in many real world applications. In a large number of cases, these flows fall directly onto free surfaces (e.g. pools), creating a splash with accompanying splashing sounds. The sound produced is supplied by energy interactions between the liquid jet stream and the passive free surface. In this investigation, we collect the sound of a water jet of varying flowrate falling into a pool of water, and use this sound to predict the flowrate and flowrate trajectory involved. Two approaches are employed: one uses machine-learning models trained using audio features extracted from the collected sound to predict the flowrate (and subsequently the flowrate trajectory). In contrast, the second method directly uses acoustic parameters related to the spectral energy of the liquid-liquid interaction to estimate the flowrate trajectory. The actual flowrate, however, is determined directly using a gravimetric method: tracking the change in mass of the pooling liquid over time. We show here that the two methods agree well with the actual flowrate and offer comparable performance in accurately predicting the flowrate trajectory, and accordingly offer insights for potential real-life applications using sound.
Submitted 16 June, 2020;
originally announced June 2020.
-
A Study of an Agile Methodology with Scrum Approach to the Filipino Company-Sponsored I.T. Capstone Program
Authors:
Giuseppe C. Ng
Abstract:
Purpose - The research aims to show the relevance of company client sponsored student projects in the University of Asia and the Pacific Information Technology (UA&P IT) Capstone Program through the use of an Agile Methodology with Scrum Approach. Method - The modified program is employed on two batches with content analysis and survey results as benchmarks. Results - Surveys at the end of the sprints for both clients and students revealed that the length of the sprint was a critical factor in the development of the information system, and that students learned from addressing additional challenges such as academic load, team pressure and communication issues. Conclusion - Overall results showed that clients were impressed and keen to adopt the student works. Recommendations - Maintainability aspects of the research can be analyzed for future studies. Increasing the sample size with additional batches could lead to discovery of additional factors not previously seen. Research Implications - The research could help improve other Capstone Programs while improving communication with company clients.
Submitted 3 February, 2019;
originally announced February 2019.
-
Deep learning: Extrapolation tool for ab initio nuclear theory
Authors:
Gianina Alina Negoita,
James P. Vary,
Glenn R. Luecke,
Pieter Maris,
Andrey M. Shirokov,
Ik Jae Shin,
Youngman Kim,
Esmond G. Ng,
Chao Yang,
Matthew Lockner,
Gurpur M. Prabhu
Abstract:
Ab initio approaches in nuclear theory, such as the no-core shell model (NCSM), have been developed for approximately solving finite nuclei with realistic strong interactions. The NCSM and other approaches require an extrapolation of the results obtained in a finite basis space to the infinite basis space limit and assessment of the uncertainty of those extrapolations. Each observable requires a separate extrapolation and most observables have no proven extrapolation method. We propose a feed-forward artificial neural network (ANN) method as an extrapolation tool to obtain the ground state energy and the ground state point-proton root-mean-square (rms) radius along with their extrapolation uncertainties. The designed ANNs are sufficient to produce results for these two very different observables in $^6$Li from the ab initio NCSM results in small basis spaces that satisfy the following theoretical physics condition: independence of basis space parameters in the limit of extremely large matrices. Comparisons of the ANN results with other extrapolation methods are also provided.
Submitted 6 June, 2019; v1 submitted 5 October, 2018;
originally announced October 2018.
-
A Model Order Reduction Algorithm for Estimating the Absorption Spectrum
Authors:
Roel Van Beeumen,
David B. Williams-Young,
Joseph M. Kasper,
Chao Yang,
Esmond G. Ng,
Xiaosong Li
Abstract:
The ab initio description of the spectral interior of the absorption spectrum poses both a theoretical and computational challenge for modern electronic structure theory. Due to the often spectrally dense character of this domain in the quantum propagator's eigenspectrum for medium-to-large sized systems, traditional approaches based on the partial diagonalization of the propagator often encounter oscillatory and stagnating convergence. Electronic structure methods which solve the molecular response problem through the solution of spectrally shifted linear systems, such as the complex polarization propagator, offer an alternative approach which is agnostic to the underlying spectral density or domain location. This generality comes at a seemingly high computational cost associated with solving a large linear system for each spectral shift in some discretization of the spectral domain of interest. We present a novel, adaptive solution based on model order reduction techniques via interpolation. Model order reduction reduces the computational complexity of mathematical models and is ubiquitous in the simulation of dynamical systems. The efficiency and effectiveness of the proposed algorithm in the ab initio prediction of X-Ray absorption spectra is demonstrated using a test set of challenging water clusters which are spectrally dense in the neighborhood of the oxygen K-edge. Based on a single, user defined tolerance we automatically determine the order of the reduced models and approximate the absorption spectrum up to the given tolerance. We also illustrate that the automatically determined model order increases logarithmically with the problem dimension, compared to a linear increase of the number of eigenvalues within the energy window. Furthermore, we observed that the computational cost of the proposed algorithm only scales quadratically with respect to the problem dimension.
Submitted 30 August, 2017; v1 submitted 19 April, 2017;
originally announced April 2017.
-
The Reverse Cuthill-McKee Algorithm in Distributed-Memory
Authors:
Ariful Azad,
Mathias Jacquelin,
Aydin Buluc,
Esmond G. Ng
Abstract:
Ordering vertices of a graph is key to minimize fill-in and data structure size in sparse direct solvers, maximize locality in iterative solvers, and improve performance in graph algorithms. Except for naturally parallelizable ordering methods such as nested dissection, many important ordering methods have not been efficiently mapped to distributed-memory architectures. In this paper, we present the first-ever distributed-memory implementation of the reverse Cuthill-McKee (RCM) algorithm for reducing the profile of a sparse matrix. Our parallelization uses a two-dimensional sparse matrix decomposition. We achieve high performance by decomposing the problem into a small number of primitives and utilizing optimized implementations of these primitives. Our implementation shows strong scaling up to 1024 cores for smaller matrices and up to 4096 cores for larger matrices.
Submitted 25 October, 2016;
originally announced October 2016.
-
Accelerating Nuclear Configuration Interaction Calculations through a Preconditioned Block Iterative Eigensolver
Authors:
Meiyue Shao,
Hasan Metin Aktulga,
Chao Yang,
Esmond G. Ng,
Pieter Maris,
James P. Vary
Abstract:
We describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. We also discuss implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.
Submitted 8 September, 2017; v1 submitted 6 September, 2016;
originally announced September 2016.
-
Advancing Nuclear Physics Through TOPS Solvers and Tools
Authors:
E Ng,
J Sarich,
S M Wild,
T Munson,
H Aktulga,
C Yang,
P Maris,
J P Vary,
N Schunck,
M G Bertolli,
M Kortelainen,
W Nazarewicz,
T Papenbrock,
M V Stoitsov
Abstract:
At the heart of many scientific applications is the solution of algebraic systems, such as linear systems of equations, eigenvalue problems, and optimization problems, to name a few. TOPS, which stands for Towards Optimal Petascale Simulations, is a SciDAC applied math center focused on the development of solvers for tackling these algebraic systems, as well as the deployment of such technologies in large-scale scientific applications of interest to the U.S. Department of Energy. In this paper, we highlight some of the solver technologies we have developed in optimization and matrix computations. We also describe some accomplishments achieved using these technologies in UNEDF, a SciDAC application project on nuclear physics.
Submitted 8 October, 2011;
originally announced October 2011.