-
Noise-induced quantum synchronization and maximally entangled mixed states in superconducting circuits
Authors:
Ziyu Tao,
Finn Schmolke,
Chang-Kang Hu,
Wenhui Huang,
Yuxuan Zhou,
Jiawei Zhang,
Ji Chu,
Libo Zhang,
Xuandong Sun,
Zecheng Guo,
**g**g Niu,
Wenle Weng,
Song Liu,
Youpeng Zhong,
Dian Tan,
Dapeng Yu,
Eric Lutz
Abstract:
Random fluctuations can lead to cooperative effects in complex systems. We here report the experimental observation of noise-induced quantum synchronization in a chain of superconducting transmon qubits with nearest-neighbor interactions. The application of Gaussian white noise to a single site leads to synchronous oscillations in the entire chain. We show that the two synchronized end qubits are entangled, with nonzero concurrence, and that they belong to a class of generalized Bell states known as maximally entangled mixed states, whose entanglement cannot be increased by any global unitary. We further demonstrate the stability against frequency detuning of both synchronization and entanglement by determining the corresponding generalized Arnold tongue diagrams. Our results highlight the constructive influence of noise in a quantum many-body system and uncover the potential role of synchronization for mixed-state quantum information science.
Submitted 14 June, 2024;
originally announced June 2024.
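As a rough illustration of the "nonzero concurrence" criterion mentioned in the abstract, the sketch below evaluates the known closed-form concurrence for a two-qubit density matrix of X form (only diagonal and anti-diagonal entries nonzero). The X-form restriction, the function names, and the Werner-type test state are assumptions made for this example and are not taken from the paper.

```python
from math import sqrt

def x_state_concurrence(rho11, rho22, rho33, rho44, rho14, rho23):
    """Concurrence of a two-qubit X state from its nonzero entries.

    Diagonal entries are populations in the basis |00>, |01>, |10>, |11>;
    rho14 and rho23 are the two independent off-diagonal coherences.
    """
    return 2.0 * max(
        0.0,
        abs(rho14) - sqrt(rho22 * rho33),
        abs(rho23) - sqrt(rho11 * rho44),
    )

def werner_entries(p):
    """Entries of p*|Phi+><Phi+| + (1 - p)*I/4 (hypothetical test state)."""
    outer = (1.0 + p) / 4.0   # rho11 = rho44
    inner = (1.0 - p) / 4.0   # rho22 = rho33
    return outer, inner, inner, outer, p / 2.0, 0.0

bell = x_state_concurrence(*werner_entries(1.0))       # pure Bell state
half = x_state_concurrence(*werner_entries(0.5))       # partially mixed
mixed = x_state_concurrence(*werner_entries(0.0))      # maximally mixed
```

For this family the formula reduces to max(0, (3p - 1)/2), so the state is entangled only for p > 1/3.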
-
RaNeuS: Ray-adaptive Neural Surface Reconstruction
Authors:
Yida Wang,
David Joseph Tan,
Nassir Navab,
Federico Tombari
Abstract:
Our objective is to leverage a differentiable radiance field, e.g., NeRF, to reconstruct detailed 3D surfaces in addition to producing the standard novel view renderings. There have been related methods that perform such tasks, usually by utilizing a signed distance field (SDF). However, the state-of-the-art approaches still fail to correctly reconstruct small-scale details, such as leaves, ropes, and textile surfaces. Considering that different methods formulate and optimize the projection from SDF to radiance field with a globally constant Eikonal regularization, we improve on them with a ray-wise weighting factor that prioritizes the rendering and zero-crossing surface fitting on top of establishing a perfect SDF. We propose to adaptively adjust the regularization on the signed distance field so that unsatisfying rendering rays do not enforce strong Eikonal regularization, which is ineffective, and to allow the gradients from regions with well-learned radiance to be effectively back-propagated to the SDF. Consequently, the two objectives are balanced in order to generate accurate and detailed surfaces. Additionally, concerning whether there is a geometric bias between the zero-crossing surface in the SDF and rendering points in the radiance field, the projection becomes adjustable as well, depending on different 3D locations during optimization. Our proposed RaNeuS is extensively evaluated on both synthetic and real datasets, achieving state-of-the-art results on both novel view synthesis and geometric reconstruction.
Submitted 14 June, 2024;
originally announced June 2024.
-
ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis
Authors:
Dehua Tao,
Daxin Tan,
Yu Ting Yeung,
Xiao Chen,
Tan Lee
Abstract:
Representing speech as discretized units has numerous benefits in supporting downstream spoken language processing tasks. However, the approach has been less explored in speech synthesis of tonal languages like Mandarin Chinese. Our preliminary experiments on Chinese speech synthesis reveal the issue of "tone shift", where a synthesized speech utterance contains correct base syllables but incorrect tones. To address the issue, we propose the ToneUnit framework, which leverages annotated data with tone labels as CTC supervision to learn tone-aware discrete speech units for Mandarin Chinese speech. Our findings indicate that the discrete units acquired through ToneUnit resolve the "tone shift" issue in synthesized Chinese speech and yield favorable results in English synthesis. Moreover, the experimental results suggest that finite scalar quantization enhances the effectiveness of ToneUnit. Notably, ToneUnit can work effectively even with minimal annotated data.
Submitted 13 June, 2024;
originally announced June 2024.
-
Compilation for Dynamically Field-Programmable Qubit Arrays with Efficient and Provably Near-Optimal Scheduling
Authors:
Daniel Bochen Tan,
Wan-Hsuan Lin,
Jason Cong
Abstract:
Dynamically field-programmable qubit arrays based on neutral atoms have high fidelity and highly parallel gates for quantum computing. However, it is challenging for compilers to fully leverage the novel flexibility offered by such hardware while respecting its various constraints. In this study, we break down the compilation for this architecture into three tasks: scheduling, placement, and routing. We formulate these three problems and present efficient solutions to them. Notably, our scheduling based on graph edge coloring is provably near-optimal in terms of two-qubit gate stage count (at most one more than the optimum), the fidelity bottleneck of this platform. As a result, our compiler, Enola, produces higher fidelity results compared to existing works, e.g., 3.7X stage reduction and 5.9X fidelity improvement on the benchmark set used by OLSQ-DPQA, the current state of the art. Additionally, Enola is highly scalable, e.g., within 30 minutes, it can compile circuits with 10,000 qubits, a scale sufficient for the current era of quantum computing. Enola is open source at https://github.com/UCLA-VAST/Enola
Submitted 23 May, 2024;
originally announced May 2024.
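To make the link between edge coloring and gate stages concrete, here is a minimal sketch that treats each two-qubit gate as a graph edge and each color as a stage. It uses a plain greedy coloring, which is not Enola's scheduler and does not carry the "at most one more than the optimum" guarantee stated in the abstract. It also assumes that all gates commute, so their order is free. The function name and the example gate list are hypothetical.

```python
def greedy_stages(gates):
    """Assign each two-qubit gate (an edge) to the first stage in which
    neither of its qubits is already busy. Each stage is then a matching."""
    busy = []      # busy[s] = set of qubits used in stage s
    stages = []    # stages[s] = list of gates scheduled in stage s
    for a, b in gates:
        for s in range(len(stages)):
            if a not in busy[s] and b not in busy[s]:
                break
        else:
            # No existing stage can host this gate: open a new one.
            busy.append(set())
            stages.append([])
            s = len(stages) - 1
        busy[s].update((a, b))
        stages[s].append((a, b))
    return stages

# Hypothetical commuting gates on four qubits (edges of a small graph).
gates = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
stages = greedy_stages(gates)
```

Qubit 0 takes part in three gates here, so no schedule can use fewer than three stages; a greedy pass can need more stages than that lower bound on other inputs.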
-
Towards Generalist Robot Learning from Internet Video: A Survey
Authors:
Robert McCarthy,
Daniel C. H. Tan,
Dominik Schmidt,
Fernando Acero,
Nathan Herr,
Yilun Du,
Thomas G. Thuruthel,
Zhibin Li
Abstract:
This survey presents an overview of methods for learning from video (LfV) in the context of reinforcement learning (RL) and robotics. We focus on methods capable of scaling to large internet video datasets and, in the process, extracting foundational knowledge about the world's dynamics and physical human behaviour. Such methods hold great promise for developing general-purpose robots.
We open with an overview of fundamental concepts relevant to the LfV-for-robotics setting. This includes a discussion of the exciting benefits LfV methods can offer (e.g., improved generalization beyond the available robot data) and commentary on key LfV challenges (e.g., missing information in video and LfV distribution shifts). Our literature review begins with an analysis of video foundation model techniques that can extract knowledge from large, heterogeneous video datasets. Next, we review methods that specifically leverage video data for robot learning. Here, we categorise work according to which RL knowledge modality (KM) benefits from the use of video data. We additionally highlight techniques for mitigating LfV challenges, including reviewing action representations that address missing action labels in video.
Finally, we examine LfV datasets and benchmarks, before concluding with a discussion of challenges and opportunities in LfV. Here, we advocate for scalable foundation model approaches that can leverage the full range of internet video data, and that target the learning of the most promising RL KMs: the policy and dynamics model. Overall, we hope this survey will serve as a comprehensive reference for the emerging field of LfV, catalysing further research in the area and facilitating progress towards the development of general-purpose robots.
Submitted 7 June, 2024; v1 submitted 30 April, 2024;
originally announced April 2024.
-
A SAT Scalpel for Lattice Surgery: Representation and Synthesis of Subroutines for Surface-Code Fault-Tolerant Quantum Computing
Authors:
Daniel Bochen Tan,
Murphy Yuezhen Niu,
Craig Gidney
Abstract:
Quantum error correction is necessary for large-scale quantum computing. A promising quantum error correcting code is the surface code. For this code, fault-tolerant quantum computing (FTQC) can be performed via lattice surgery, i.e., splitting and merging patches of code. Given the frequent use of certain lattice-surgery subroutines (LaS), it becomes crucial to optimize their design in order to minimize the overall spacetime volume of FTQC. In this study, we define the variables to represent LaS and the constraints on these variables. Leveraging this formulation, we develop a synthesizer for LaS, LaSsynth, that encodes a LaS construction problem into a SAT instance, subsequently querying SAT solvers for a solution. Starting from a baseline design, we can gradually invoke the solver with shrinking spacetime volume to derive more compact designs. Due to our foundational formulation and the use of SAT solvers, LaSsynth can exhaustively explore the design space, yielding optimal designs in volume. For example, it achieves 8% and 18% volume reduction respectively over two state-of-the-art human designs for the 15-to-1 T-factory, a bottleneck in FTQC.
Submitted 17 May, 2024; v1 submitted 28 April, 2024;
originally announced April 2024.
-
Closeby Habitable Exoplanet Survey (CHES). I. Astrometric Noise and Planetary Detection Efficiency due to Stellar Spots and Faculae
Authors:
Chunhui Bao,
Jianghui Ji,
Dongjie Tan,
Guo Chen,
Xiumin Huang,
Su Wang,
Yao Dong
Abstract:
The Closeby Habitable Exoplanet Survey (CHES) is dedicated to the astrometric exploration for habitable-zone Earth-like planets orbiting solar-type stars in close proximity, achieving unprecedented micro-arcsecond precision. Given the elevated precision, thorough consideration of photocenter jitters induced by stellar activity becomes imperative. This study endeavors to model the stellar activity of solar-type stars, compute astrometric noise, and delineate the detection limits of habitable planets within the astrometric domain. Simulations were conducted for identified primary targets of CHES, involving the generation of simulated observed data for astrometry and photometry, accounting for the impact of stellar activity. Estimation of activity levels in our samples was achieved through chromospheric activity indices, revealing that over 90% of stars exhibited photocenter jitters below 1 $μ\mathrm{as}$. Notably, certain proximate stars, such as $α$ Cen A and B, displayed more discernible noise arising from stellar activity. Subsequent tests were performed to evaluate detection performance, unveiling that stellar activity tends to have a less pronounced impact on planetary detectability for the majority of stars. Approximately 95% of targets demonstrated a detection efficiency exceeding 80%. However, for several cold stars, e.g., HD 32450 and HD 21531, with the habitable zones close to the stars, a reduction in detection efficiency was observed. These findings offer invaluable insights into the intricate interplay between stellar activity and astrometric precision, significantly advancing our understanding in the search for habitable planets.
Submitted 17 April, 2024;
originally announced April 2024.
-
Low-Cost Generation and Evaluation of Dictionary Example Sentences
Authors:
Bill Cai,
Clarence Boon Liang Ng,
Daniel Tan,
Shelvia Hotama
Abstract:
Dictionary example sentences play an important role in illustrating word definitions and usage, but manually creating quality sentences is challenging. Prior works have demonstrated that language models can be trained to generate example sentences. However, they relied on costly customized models and word sense datasets for generation and evaluation of their work. Rapid advancements in foundational models present the opportunity to create low-cost, zero-shot methods for the generation and evaluation of dictionary example sentences. We introduce a new automatic evaluation metric called OxfordEval that measures the win-rate of generated sentences against existing Oxford Dictionary sentences. OxfordEval shows high alignment with human judgments, enabling large-scale automated quality evaluation. We experiment with various LLMs and configurations to generate dictionary sentences across word classes. We complement this with a novel approach of using masked language models to identify and select sentences that best exemplify word meaning. The eventual model, FM-MLM, achieves over 85.1% win rate against Oxford baseline sentences according to OxfordEval, compared to 39.8% win rate for prior model-generated sentences.
Submitted 9 April, 2024;
originally announced April 2024.
-
On certain properties of the $p$-unitary Cayley graph over a finite ring
Authors:
Tung T. Nguyen,
Nguyen Duy Tân
Abstract:
In recent work, we study certain Cayley graphs associated with a finite commutative ring and their multiplicative subgroups. Among various results that we prove, we provide the necessary and sufficient conditions for such a Cayley graph to be prime. In this paper, we continue this line of research. Specifically, we investigate some basic properties of certain $p$-unitary Cayley graphs associated with a finite commutative ring. In particular, under some mild conditions, we provide the necessary and sufficient conditions for this graph to be prime.
Submitted 8 March, 2024;
originally announced March 2024.
-
Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed
Authors:
Yifan Wang,
Xingyi He,
Sida Peng,
Dongli Tan,
Xiaowei Zhou
Abstract:
We present a novel method for efficiently producing semi-dense matches across images. The previous detector-free matcher LoFTR has shown remarkable matching capability in handling large-viewpoint change and texture-poor scenarios but suffers from low efficiency. We revisit its design choices and derive multiple improvements for both efficiency and accuracy. One key observation is that performing the transformer over the entire feature map is redundant due to shared local information; we therefore propose an aggregated attention mechanism with adaptive token selection for efficiency. Furthermore, we find that spatial variance exists in LoFTR's fine correlation module, which is adverse to matching accuracy. A novel two-stage correlation layer is proposed to achieve accurate subpixel correspondences for accuracy improvement. Our efficiency-optimized model is $\sim 2.5\times$ faster than LoFTR and can even surpass the state-of-the-art efficient sparse matching pipeline SuperPoint + LightGlue. Moreover, extensive experiments show that our method can achieve higher accuracy compared with competitive semi-dense matchers, with considerable efficiency benefits. This opens up exciting prospects for large-scale or latency-sensitive applications such as image retrieval and 3D reconstruction. Project page: https://zju3dv.github.io/efficientloftr.
Submitted 11 March, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
Personalised Drug Identifier for Cancer Treatment with Transformers using Auxiliary Information
Authors:
Aishwarya Jayagopal,
Hansheng Xue,
Ziyang He,
Robert J. Walsh,
Krishna Kumar Hariprasannan,
David Shao Peng Tan,
Tuan Zea Tan,
Jason J. Pitt,
Anand D. Jeyasekharan,
Vaibhav Rajan
Abstract:
Cancer remains a global challenge due to its growing clinical and economic burden. Its uniquely personal manifestation, which makes treatment difficult, has fuelled the quest for personalized treatment strategies. Thus, genomic profiling is increasingly becoming part of clinical diagnostic panels. Effective use of such panels requires accurate drug response prediction (DRP) models, which are challenging to build due to limited labelled patient data. Previous methods to address this problem have used various forms of transfer learning. However, they do not explicitly model the variable length sequential structure of the list of mutations in such diagnostic panels. Further, they do not utilize auxiliary information (like patient survival) for model training. We address these limitations through a novel transformer based method, which surpasses the performance of state-of-the-art DRP models on benchmark data. We also present the design of a treatment recommendation system (TRS), which is currently deployed at the National University Hospital, Singapore and is being evaluated in a clinical trial.
Submitted 16 February, 2024;
originally announced February 2024.
-
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning
Authors:
Shengyi Huang,
Quentin Gallouédec,
Florian Felten,
Antonin Raffin,
Rousslan Fernand Julien Dossa,
Yanxiao Zhao,
Ryan Sullivan,
Viktor Makoviychuk,
Denys Makoviichuk,
Mohamad H. Danesh,
Cyril Roumégous,
Jiayi Weng,
Chufan Chen,
Md Masudur Rahman,
João G. M. Araújo,
Guorui Quan,
Daniel Tan,
Timo Klein,
Rujikorn Charakorn,
Mark Towers,
Yann Berthelot,
Kinal Mehta,
Dipam Chakraborty,
Arjun KG,
Valentin Charraut
, et al. (8 additional authors not shown)
Abstract:
In many Reinforcement Learning (RL) papers, learning curves are useful indicators to measure the effectiveness of RL algorithms. However, the complete raw data of the learning curves are rarely available. As a result, it is usually necessary to reproduce the experiments from scratch, which can be time-consuming and error-prone. We present Open RL Benchmark, a set of fully tracked RL experiments, including not only the usual data such as episodic return, but also all algorithm-specific and system metrics. Open RL Benchmark is community-driven: anyone can download, use, and contribute to the data. At the time of writing, more than 25,000 runs have been tracked, for a cumulative duration of more than 8 years. Open RL Benchmark covers a wide range of RL libraries and reference implementations. Special care is taken to ensure that each experiment is precisely reproducible by providing not only the full parameters, but also the versions of the dependencies used to generate it. In addition, Open RL Benchmark comes with a command-line interface (CLI) for easy fetching and generating figures to present the results. In this document, we include two case studies to demonstrate the usefulness of Open RL Benchmark in practice. To the best of our knowledge, Open RL Benchmark is the first RL benchmark of its kind, and the authors hope that it will improve and facilitate the work of researchers in the field.
Submitted 5 February, 2024;
originally announced February 2024.
-
cDVGAN: One Flexible Model for Multi-class Gravitational Wave Signal and Glitch Generation
Authors:
Tom Dooney,
Lyana Curier,
Daniel Tan,
Melissa Lopez,
Chris Van Den Broeck,
Stefano Bromuri
Abstract:
Simulating realistic time-domain observations of gravitational waves (GWs) and GW detector glitches can help in advancing GW data analysis. Simulated data can be used in downstream tasks by augmenting datasets for signal searches, balancing data sets for machine learning, and validating detection schemes. In this work, we present Conditional Derivative GAN (cDVGAN), a novel conditional model in the Generative Adversarial Network framework for simulating multiple classes of time-domain observations that represent gravitational waves (GWs) and detector glitches. cDVGAN can also generate generalized hybrid samples that span the variation between classes through interpolation in the conditioned class vector. cDVGAN introduces an additional player into the typical 2-player adversarial game of GANs, where an auxiliary discriminator analyzes the first-order derivative time-series. Our results show that this provides synthetic data that better captures the features of the original data. cDVGAN conditions on three classes, two denoised from LIGO blip and tomte glitch events from its 3rd observing run (O3), and the third representing binary black hole (BBH) mergers. Our proposed cDVGAN outperforms 4 different baseline GAN models in replicating the features of the three classes. Specifically, our experiments show that training convolutional neural networks (CNNs) with our cDVGAN-generated data improves the detection of samples embedded in detector noise beyond the synthetic data from other state-of-the-art GAN models. Our best synthetic dataset yields as much as a 4.2% increase in area-under-the-curve (AUC) performance compared to synthetic datasets from baseline GANs. Moreover, training the CNN with hybrid samples from our cDVGAN outperforms CNNs trained only on the standard classes, when identifying real samples embedded in LIGO detector background (4% AUC improvement for cDVGAN).
Submitted 5 June, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Connectedness of the Gromov boundary of fine curve graphs
Authors:
Yusen Long,
Dong Tan
Abstract:
In this paper, we study the topological properties of the Gromov boundary of the fine curve graph of an orientable finite-type surface of genus at least 2. This graph, consisting of topological curves, has much richer dynamics than the classical curve graph. Using the techniques introduced by Wright [Wri23], we show that this boundary is (path) connected and that the spheres in the non-separating fine curve graph are connected.
Submitted 28 February, 2024; v1 submitted 27 January, 2024;
originally announced January 2024.
-
Depth-Optimal Addressing of 2D Qubit Array with 1D Controls Based on Exact Binary Matrix Factorization
Authors:
Daniel Bochen Tan,
Shuohao **,
Jason Cong
Abstract:
Reducing control complexity is essential for achieving large-scale quantum computing. However, reducing control knobs may compromise the ability to independently address each qubit. Recent progress in neutral atom-based platforms suggests that rectangular (row-column) addressing may strike a balance between control granularity and flexibility for 2D qubit arrays. This scheme allows addressing qubits on the intersections of a set of rows and columns each time. While quadratically reducing controls, it may necessitate more depth. We formulate the depth-optimal rectangular addressing problem as exact binary matrix factorization, an NP-hard problem also appearing in communication complexity and combinatorial optimization. We introduce a satisfiability modulo theories-based solver for this problem, and a heuristic, row packing, performing close to the optimal solver on various benchmarks. Furthermore, we discuss rectangular addressing in the context of fault-tolerant quantum computing, leveraging a natural two-level structure.
Submitted 22 March, 2024; v1 submitted 24 January, 2024;
originally announced January 2024.
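The rectangular addressing idea can be illustrated with a small sketch: a 0/1 target matrix marks which qubits must be addressed, and each addressing step is a rectangle (a set of rows times a set of columns). The baseline below merges rows with identical patterns into one rectangle and then checks that every 1 is covered exactly once. It is neither the paper's SMT-based solver nor its row packing heuristic, and it is generally not depth-optimal. All names and the example matrix are hypothetical.

```python
def group_identical_rows(matrix):
    """Split a 0/1 matrix into rectangles (row set, column set), one per
    distinct nonzero row pattern. Each 1 is covered exactly once."""
    groups = {}
    for r, row in enumerate(matrix):
        cols = tuple(c for c, bit in enumerate(row) if bit)
        if cols:
            groups.setdefault(cols, []).append(r)
    return [(rows, list(cols)) for cols, rows in groups.items()]

def reconstruct(rectangles, n_rows, n_cols):
    """Sum the rectangles back into a matrix of cover counts."""
    out = [[0] * n_cols for _ in range(n_rows)]
    for rows, cols in rectangles:
        for r in rows:
            for c in cols:
                out[r][c] += 1
    return out

# Hypothetical 3x3 addressing pattern.
target = [
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
]
rects = group_identical_rows(target)
covered = reconstruct(rects, 3, 3)
```

Because the cover counts equal the target entry by entry, the rectangles form an exact factorization; the number of rectangles is the addressing depth this baseline achieves.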
-
Easy JavaScript Simulation (EJSS) Data Analytics for Singapore
Authors:
Loo Kang Wee,
Darren Tan,
Félix Jesús Garcia Clemente,
Francisco Esquembre
Abstract:
We have integrated Easy JavaScript Simulation (EJSS) Data Analytics into the national Learning Management System for Singapore schools, known as the Singapore Student Learning Space (SLS). EJSS Data Analytics enhances the teaching and learning experience for educators and students by enabling educators to monitor and evaluate students' interactions with interactive computer simulations. The data analytics and visualisation capabilities are delivered using the Moodle platform and version 1.3 of the specifications for Learning Tools Interoperability (LTI). In this paper, we showcase the potential for EJSS Data Analytics to identify students' learning difficulties and misconceptions. Four examples of EJSS Data Analytics applications are provided to illustrate insights on aspects that include understanding a student's sequential actions leading to specific task outcomes, the frequency of task attempts by each student, and the ratio of students achieving correct versus incorrect task completions. We identify five key considerations for designing the EJSS teacher dashboard. These considerations relate to Student Thought Process, Student Behaviour, Student Engagement, Student Choice, and Teacher Feedback. These five facets provide a framework for aligning our design efforts with the needs of students and teachers, also drawing upon research in data analytics for education.
Submitted 21 January, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
On prime Cayley graphs
Authors:
Maria Chudnovsky,
Michal Cizek,
Logan Crew,
Ján Mináč,
Tung T. Nguyen,
Sophie Spirkl,
Nguyên Duy Tân
Abstract:
The decomposition of complex networks into smaller, interconnected components is a central challenge in network theory with a wide range of potential applications. In this paper, we utilize tools from group theory and ring theory to study this problem when the network is a Cayley graph. In particular, we answer the following question: Which Cayley graphs are prime?
Submitted 11 January, 2024;
originally announced January 2024.
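For small cases, the question "which Cayley graphs are prime?" can be explored by brute force. The sketch below assumes the usual graph-theoretic definition, in which a graph is prime if it has no homogeneous set X with 2 <= |X| < |V|, that is, no such set where every outside vertex is adjacent to all of X or to none of it. The restriction to cyclic groups Z_n, the function names, and the two examples are assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations

def cayley_graph_zn(n, connection_set):
    """Adjacency sets of Cay(Z_n, S); S is assumed symmetric and 0 not in S."""
    return {v: {(v + s) % n for s in connection_set} for v in range(n)}

def is_homogeneous(adj, subset):
    """True if every outside vertex sees all of `subset` or none of it."""
    inside = set(subset)
    for v in adj:
        if v in inside:
            continue
        hits = len(adj[v] & inside)
        if hits not in (0, len(inside)):
            return False
    return True

def is_prime(adj):
    """Brute-force check: no homogeneous set X with 2 <= |X| < |V|."""
    vertices = list(adj)
    for size in range(2, len(vertices)):
        for subset in combinations(vertices, size):
            if is_homogeneous(adj, subset):
                return False
    return True

five_cycle = cayley_graph_zn(5, {1, 4})   # Cay(Z_5, {+1, -1})
four_cycle = cayley_graph_zn(4, {1, 3})   # Cay(Z_4, {+1, -1})
```

In the 4-cycle the pair {0, 2} is homogeneous, since vertices 1 and 3 are each adjacent to both, so that graph is not prime; the 5-cycle has no such set. The search is exponential in the number of vertices and is only meant for tiny examples.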
-
Demonstration of a low loss, highly stable and re-useable edge coupler for high heralding efficiency and low g^(2) (0) SOI correlated photon pair sources
Authors:
**yi Du,
George F. R. Chen,
Hongwei Gao,
James A. Grieve,
Dawn T. H. Tan,
Alexander Ling
Abstract:
We report a stable, low loss method for coupling light from silicon-on-insulator (SOI) photonic chips into optical fibers. The technique is realized using an on-chip tapered waveguide and a cleaved small core optical fiber. The on-chip taper is monolithic and does not require a patterned cladding, thus simplifying the chip fabrication process. The optical fiber segment is composed of a centimeter-long small core fiber (UHNA7) which is spliced to SMF-28 fiber with less than -0.1 dB loss. We observe an overall coupling loss of -0.64 dB with this design. The chip edge and fiber tip can be butt coupled without damaging the on-chip taper or fiber. Friction between the surfaces maintains alignment, leading to an observed coupling fluctuation of ±0.1 dB during a ten-day continuous measurement without use of any adhesive. This technique minimizes the potential for generating Raman noise in the fiber, and has good stability compared to coupling strategies based on longer UHNA fibers or fragile lensed fibers. We also applied the edge coupler to a correlated photon pair source and observed a raw coincidence count rate of 1.21 million cps and a raw heralding efficiency of 21.3%. We achieved an autocorrelation function $g^{(2)}(0)$ as low as 0.0004 in the low pump power regime.
Submitted 14 March, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance
Authors:
Lukas Hoyer,
David Joseph Tan,
Muhammad Ferjad Naeem,
Luc Van Gool,
Federico Tombari
Abstract:
In semi-supervised semantic segmentation, a model is trained with a limited number of labeled images along with a large corpus of unlabeled images to reduce the high annotation effort. While previous methods are able to learn good segmentation boundaries, they are prone to confuse classes with similar visual appearance due to the limited supervision. On the other hand, vision-language models (VLMs) are able to learn diverse semantic knowledge from image-caption datasets but produce noisy segmentation due to the image-level training. In SemiVL, we propose to integrate rich priors from VLM pre-training into semi-supervised semantic segmentation to learn better semantic decision boundaries. To adapt the VLM from global to local reasoning, we introduce a spatial fine-tuning strategy for label-efficient learning. Further, we design a language-guided decoder to jointly reason over vision and language. Finally, we propose to handle inherent ambiguities in class labels by providing the model with language guidance in the form of class definitions. We evaluate SemiVL on 4 semantic segmentation datasets, where it significantly outperforms previous semi-supervised methods. For instance, SemiVL improves the state-of-the-art by +13.5 mIoU on COCO with 232 annotated images and by +6.1 mIoU on Pascal VOC with 92 labels. Project page: https://github.com/google-research/semivl
Submitted 27 November, 2023;
originally announced November 2023.
-
Q-Pilot: Field Programmable Qubit Array Compilation with Flying Ancillas
Authors:
Hanrui Wang,
Daniel Bochen Tan,
Pengyu Liu,
Yilian Liu,
Jiaqi Gu,
Jason Cong,
Song Han
Abstract:
Neutral atom arrays have become a promising platform for quantum computing, especially the field programmable qubit array (FPQA) endowed with the unique capability of atom movement. This feature allows dynamic alterations in qubit connectivity during runtime, which can reduce the cost of executing long-range gates and improve parallelism. However, this added flexibility introduces new challenges in circuit compilation. Inspired by the placement and routing strategies for FPGAs, we propose to map all data qubits to fixed atoms while utilizing movable atoms to route for 2-qubit gates between data qubits. Coined flying ancillas, these mobile atoms function as ancilla qubits, dynamically generated and recycled during execution. We present Q-Pilot, a scalable compiler for FPQA employing flying ancillas to maximize circuit parallelism. For two important quantum applications, quantum simulation and the Quantum Approximate Optimization Algorithm (QAOA), we devise domain-specific routing strategies. In comparison to alternative technologies such as superconducting devices or fixed atom arrays, Q-Pilot effectively harnesses the flexibility of FPQA, achieving reductions of 1.4x, 27.7x, and 6.3x in circuit depth for 100-qubit random, quantum simulation, and QAOA circuits, respectively.
Submitted 6 May, 2024; v1 submitted 25 November, 2023;
originally announced November 2023.
-
Atomique: A Quantum Compiler for Reconfigurable Neutral Atom Arrays
Authors:
Hanrui Wang,
Pengyu Liu,
Daniel Bochen Tan,
Yilian Liu,
Jiaqi Gu,
David Z. Pan,
Jason Cong,
Umut A. Acar,
Song Han
Abstract:
The neutral atom array has gained prominence in quantum computing for its scalability and operation fidelity. Previous works focus on fixed atom arrays (FAAs) that require extensive SWAP operations for long-range interactions. This work explores a novel architecture, reconfigurable atom arrays (RAAs), also known as field programmable qubit arrays (FPQAs), which allows for coherent atom movements during circuit execution under some constraints. Such atom movements, which are unique to this architecture, could reduce the cost of long-range interactions significantly if the atom movements could be scheduled strategically.
In this work, we introduce Atomique, a compilation framework designed for qubit mapping, atom movement, and gate scheduling for RAA. Atomique contains a qubit-array mapper to decide the coarse-grained mapping of the qubits to arrays, leveraging MAX k-Cut on a constructed gate frequency graph to minimize SWAP overhead. Subsequently, a qubit-atom mapper determines the fine-grained mapping of qubits to specific atoms in the array and considers load balance to prevent hardware constraint violations. We further propose a router that identifies parallel gates, schedules them simultaneously, and reduces depth. We evaluate Atomique across 20+ diverse benchmarks, including generic circuits (arbitrary, QASMBench, SupermarQ), quantum simulation, and QAOA circuits. Atomique consistently outperforms IBM Superconducting, FAA with long-range gates, and FAA with rectangular and triangular topologies, achieving significant reductions in depth and the number of two-qubit gates.
Submitted 2 May, 2024; v1 submitted 25 November, 2023;
originally announced November 2023.
-
PyMsOfa: A Python Package for the Standards of Fundamental Astronomy (SOFA) Service
Authors:
Jianghui Ji,
Dongjie Tan,
Chunhui Bao,
Xiumin Huang,
Shoucun Hu,
Yao Dong,
Su Wang
Abstract:
The Standards of Fundamental Astronomy (SOFA) is a service provided by the International Astronomical Union (IAU) that offers algorithms and software for astronomical calculations, released in two versions, FORTRAN 77 and ANSI C. In this work, we implement the Python package PyMsOfa for the SOFA service in three ways: (1) a Python wrapper package based on a foreign function library for Python (ctypes), (2) a Python wrapper package using the foreign function interface for Python calling C code (cffi), and (3) a Python package written directly in pure Python from the SOFA subroutines. The package PyMsOfa fully implements 247 functions of the original SOFA routines. In addition, PyMsOfa has been extensively tested, and its results are exactly consistent with the test examples given by the original SOFA. This Python package is suitable not only for the astrometric detection of habitable planets in the Closeby Habitable Exoplanet Survey (CHES) mission (Ji et al. 2022), but also for frontier themes of black holes and dark matter related to astrometric calculations and other fields. The source code is available via https://github.com/CHES2023/PyMsOfa.
Submitted 17 October, 2023; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Tackling VQA with Pretrained Foundation Models without Further Training
Authors:
Alvin De Jun Tan,
Bingquan Shen
Abstract:
Large language models (LLMs) have achieved state-of-the-art results in many natural language processing tasks. They have also demonstrated the ability to adapt well to different tasks through zero-shot or few-shot settings. Given the capability of these LLMs, researchers have looked into how to adopt them for Visual Question Answering (VQA). Many methods require further training to align the image and text embeddings. However, these methods are computationally expensive and require large-scale image-text datasets for training. In this paper, we explore a method of combining pretrained LLMs and other foundation models without further training to solve the VQA problem. The general idea is to use natural language to represent the images such that the LLM can understand the images. We explore different decoding strategies for generating textual representations of the image and evaluate their performance on the VQAv2 dataset.
Submitted 27 September, 2023;
originally announced September 2023.
-
Transferability of Representations Learned using Supervised Contrastive Learning Trained on a Multi-Domain Dataset
Authors:
Alvin De Jun Tan,
Clement Tan,
Chai Kiat Yeo
Abstract:
Contrastive learning has been shown to learn better quality representations than training with cross-entropy loss. These representations also transfer better to downstream datasets from different domains. However, little work has been done to explore the transferability of representations learned using contrastive learning when trained on a multi-domain dataset. In this paper, a study has been conducted using the Supervised Contrastive Learning framework to learn representations from the multi-domain DomainNet dataset and then evaluate the transferability of the representations learned on other downstream datasets. The fixed feature linear evaluation protocol is used to evaluate the transferability on 7 downstream datasets that were chosen across different domains. The results obtained are compared to a baseline model that was trained using the widely used cross-entropy loss. Empirical results from the experiments showed that on average, the Supervised Contrastive Learning model performed 6.05% better than the baseline model on the 7 downstream datasets. The findings suggest that Supervised Contrastive Learning models can potentially learn more robust representations that transfer better across domains than cross-entropy models when trained on a multi-domain dataset.
Submitted 27 September, 2023;
originally announced September 2023.
-
On the arithmetic of the join rings over finite fields
Authors:
Sunil K. Chebolu,
Jonathan Merzel,
Ján Mináč,
Tung T. Nguyen,
Federico Pasini,
Nguyên Duy Tân
Abstract:
Given a collection $\{ G_i\}_{i=1}^d$ of finite groups and a ring $R$, we have previously introduced and studied certain foundational properties of the join ring $\mathcal{J}_{G_1, G_2, \ldots, G_d}(R)$. This ring bridges two extreme worlds: matrix rings $M_n(R)$ on one end, and group rings $R[G]$ on the other. The construction of this ring was motivated by various problems in graph theory, network theory, nonlinear dynamics, and neuroscience. In this paper, we continue our investigations of this ring, focusing more on its arithmetic properties. We begin by constructing a generalized augmentation map that gives a structural decomposition of this ring. This decomposition allows us to compute the zeta function of the join of group rings. We show that the join of group rings is a natural home for studying the concept of simultaneous primitive roots for a given set of primes. This concept is related to the order of the unit group of the join of group rings. Finally, we characterize the join of group rings over finite fields with the property that the order of every unit divides a fixed number. Remarkably, Mersenne and Fermat primes unexpectedly emerge within the context of this exploration.
Submitted 13 April, 2024; v1 submitted 25 August, 2023;
originally announced August 2023.
-
Fekete polynomials of principal Dirichlet characters
Authors:
Shiva Chidambaram,
Ján Mináč,
Tung T. Nguyen,
Nguyen Duy Tân
Abstract:
Fekete polynomials associated to quadratic Dirichlet characters have interesting arithmetic properties, and have been studied in many works. In this paper, we study a seemingly simpler yet rich variant: the Fekete polynomial $F_n(x) = \sum_{a=1}^n χ_n(a) x^a$ associated to a principal Dirichlet character $χ_n$ of modulus $n$. We investigate the cyclotomic factors of $F_n$ and conjecturally describe all of them. One interesting observation from our computations is that the non-cyclotomic part $f_n$ of $F_n(x)/x$ seems to be always irreducible. We study this factor closely in the special case that $n$ is a product of two odd primes, proving separability in specific cases, and studying its coefficients and special values. Combining these theoretical results with computational evidence lets us identify the Galois group of $f_n$ for small $n$, and raises precise questions in general.
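The definition $F_n(x) = \sum_{a=1}^n χ_n(a) x^a$ in the abstract above can be made concrete with a minimal sketch. It assumes the standard convention that the principal character satisfies $χ_n(a) = 1$ when $\gcd(a, n) = 1$ and $χ_n(a) = 0$ otherwise. The function names are hypothetical and are not taken from the paper.

```python
from math import gcd


def fekete_principal_coeffs(n):
    """Coefficient list c[0..n] of F_n(x) = sum_{a=1}^n chi_n(a) x^a.

    Assumption: chi_n is the principal character mod n, so chi_n(a) is 1
    when gcd(a, n) == 1 and 0 otherwise. c[a] is the coefficient of x^a.
    """
    # The a >= 1 guard keeps the constant term at 0 even when n == 1.
    return [1 if a >= 1 and gcd(a, n) == 1 else 0 for a in range(n + 1)]


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


coeffs_15 = fekete_principal_coeffs(15)
value_at_one = evaluate(coeffs_15, 1)         # counts the a in 1..15 coprime to 15
value_at_minus_one = evaluate(coeffs_15, -1)  # zero means x + 1 divides F_15
```

Under this convention $F_n(1)$ equals Euler's totient $φ(n)$. For $n = 15$ the sketch gives $F_{15}(1) = 8$ and $F_{15}(-1) = 0$, so $x + 1$ divides $F_{15}$, a small instance of the cyclotomic factors the abstract mentions.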
Submitted 11 December, 2023; v1 submitted 27 July, 2023;
originally announced July 2023.
-
Value Functions are Control Barrier Functions: Verification of Safe Policies using Control Theory
Authors:
Daniel C. H. Tan,
Fernando Acero,
Robert McCarthy,
Dimitrios Kanoulas,
Zhibin Li
Abstract:
Guaranteeing safe behaviour of reinforcement learning (RL) policies poses significant challenges for safety-critical applications, despite RL's generality and scalability. To address this, we propose a new approach to apply verification methods from control theory to learned value functions. By analyzing task structures for safety preservation, we formalize original theorems that establish links between value functions and control barrier functions. Further, we propose novel metrics for verifying value functions in safe control tasks and practical implementation details to improve learning. Our work presents a novel method for certificate learning, which unlocks a diversity of verification techniques from control theory for RL policies, and marks a significant step towards a formal framework for the general, scalable, and verifiable design of RL-based control systems. Code and videos are available at https://rl-cbf.github.io/
Submitted 5 December, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Compiling Quantum Circuits for Dynamically Field-Programmable Neutral Atoms Array Processors
Authors:
Daniel Bochen Tan,
Dolev Bluvstein,
Mikhail D. Lukin,
Jason Cong
Abstract:
Dynamically field-programmable qubit arrays (DPQA) have recently emerged as a promising platform for quantum information processing. In DPQA, atomic qubits are selectively loaded into arrays of optical traps that can be reconfigured during the computation itself. Leveraging qubit transport and parallel, entangling quantum operations, different pairs of qubits, even those initially far away, can be entangled at different stages of the quantum program execution. Such reconfigurability and non-local connectivity present new challenges for compilation, especially in the layout synthesis step which places and routes the qubits and schedules the gates. In this paper, we consider a DPQA architecture that contains multiple arrays and supports 2D array movements, representing cutting-edge experimental platforms. Within this architecture, we discretize the state space and formulate layout synthesis as a satisfiability modulo theories problem, which can be solved by existing solvers optimally in terms of circuit depth. For a set of benchmark circuits generated by random graphs with complex connectivities, our compiler OLSQ-DPQA reduces the number of two-qubit entangling gates on small problem instances by 1.7x compared to optimal compilation results on a fixed planar architecture. To further improve scalability and practicality of the method, we introduce a greedy heuristic inspired by the iterative peeling approach in classical integrated circuit routing. Using a hybrid approach that combines the greedy and optimal methods, we demonstrate that our DPQA-based compiled circuits feature reduced scaling overhead compared to a fixed grid architecture, resulting in 5.1x fewer two-qubit gates for 90-qubit quantum circuits. These methods enable programmable, complex quantum circuits with neutral atom quantum computers, and inform both future compilers and future hardware choices.
Submitted 1 July, 2024; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Multi-Level Variational Spectroscopy using a Programmable Quantum Simulator
Authors:
Zhikun Han,
Chufan Lyu,
Yuxuan Zhou,
Jiahao Yuan,
Ji Chu,
Wuerkaixi Nuerbolati,
Hao Jia,
Lifu Nie,
Weiwei Wei,
Zusheng Yang,
Libo Zhang,
Ziyan Zhang,
Chang-Kang Hu,
Ling Hu,
Jian Li,
Dian Tan,
Abolfazl Bayat,
Song Liu,
Fei Yan,
Dapeng Yu
Abstract:
Energy spectroscopy is a powerful tool with diverse applications across various disciplines. The advent of programmable digital quantum simulators opens new possibilities for conducting spectroscopy on various models using a single device. Variational quantum-classical algorithms have emerged as a promising approach for achieving such tasks on near-term quantum simulators, despite facing significant quantum and classical resource overheads. Here, we experimentally demonstrate multi-level variational spectroscopy for fundamental many-body Hamiltonians using a superconducting programmable digital quantum simulator. By exploiting symmetries, we effectively reduce circuit depth and optimization parameters allowing us to go beyond the ground state. Combined with the subspace search method, we achieve full spectroscopy for a 4-qubit Heisenberg spin chain, yielding an average deviation of 0.13 between experimental and theoretical energies, assuming unity coupling strength. Our method, when extended to 8-qubit Heisenberg and transverse-field Ising Hamiltonians, successfully determines the three lowest energy levels. In achieving the above, we introduce a circuit-agnostic waveform compilation method that enhances the robustness of our simulator against signal crosstalk. Our study highlights symmetry-assisted resource efficiency in variational quantum algorithms and lays the foundation for practical spectroscopy on near-term quantum simulators, with potential applications in quantum chemistry and condensed matter physics.
Submitted 3 June, 2023;
originally announced June 2023.
-
Elucidating the Role of Prelithiation in Si-based Anodes for Interface Stabilization
Authors:
Shuang Bai,
Wurigumula Bao,
Kun Qian,
Bing Han,
Weikang Li,
Baharak Sayahpour,
Bhagath Screenarayanan,
Darren H. S. Tan,
So-yeon Ham,
Ying Shirley Meng
Abstract:
Prelithiation, a facile and effective method to compensate for the lithium inventory loss in the initial cycle, has progressed considerably on both the anode and cathode sides. However, much less research has been devoted to the effect of prelithiation on interface stabilization for long-term cycling of Si-based anodes. An in-depth quantitative analysis of the interface that forms during the prelithiation of SiO$_x$ is presented here, and the results are compared with prelithiation of Si anodes. A local structure probe combined with detailed electrochemical analysis reveals that a characteristic mosaic interface is formed on both prelithiated SiO$_x$ and Si anodes. This mosaic interface, containing multiple lithium silicate phases, is fundamentally different from the solid electrolyte interface (SEI) formed without prelithiation. The ideal conductivity and mechanical properties of lithium silicates enable improved cycling stability of both prelithiated anodes. With a higher ratio of lithium silicates due to the oxygen participation, the prelithiated SiO$_{1.3}$ anode improves the initial coulombic efficiency to 94% in a full cell and delivers good cycling retention after hundreds of cycles under lean electrolyte conditions. The insights provided in this work could be used to further optimize high-Si-loading anodes in future high energy density batteries.
Submitted 13 April, 2023;
originally announced April 2023.
-
P-Transformer: A Prompt-based Multimodal Transformer Architecture For Medical Tabular Data
Authors:
Yucheng Ruan,
Xiang Lan,
Daniel J. Tan,
Hairil Rizal Abdullah,
Mengling Feng
Abstract:
Medical tabular data, abundant in Electronic Health Records (EHRs), is a valuable resource for diverse medical tasks such as risk prediction. While deep learning approaches, particularly transformer-based models, have shown remarkable performance in tabular data prediction, problems remain in adapting existing work effectively to the medical domain, such as under-utilization of unstructured free-texts, limited exploration of textual information in structured data, and data corruption. To address these issues, we propose P-Transformer, a Prompt-based multimodal Transformer architecture designed specifically for medical tabular data. This framework consists of two critical components: a tabular cell embedding generator and a tabular transformer. The former efficiently encodes diverse modalities from both structured and unstructured tabular data into a harmonized language semantic space with the help of a pre-trained sentence encoder and medical prompts. The latter integrates cell representations to generate patient embeddings for various medical tasks. In comprehensive experiments on two real-world datasets for three medical tasks, P-Transformer demonstrated improvements of 10.9%/11.0% on RMSE/MAE, 0.5%/2.2% on RMSE/MAE, and 1.6%/0.8% on BACC/AUROC compared to state-of-the-art (SOTA) baselines in predictability. Notably, the model exhibited strong resilience to data corruption in the structured data, particularly when the corruption rates are high.
Submitted 9 January, 2024; v1 submitted 30 March, 2023;
originally announced March 2023.
-
Interaction-induced topological pumping in a solid-state quantum system
Authors:
Ziyu Tao,
Wenhui Huang,
Jingjing Niu,
Libo Zhang,
Yongguan Ke,
Xiu Gu,
Ling Lin,
Jiawei Qiu,
Xuandong Sun,
Xiaohan Yang,
Jiajian Zhang,
Jiawei Zhang,
Shuxiang Zhao,
Yuxuan Zhou,
Xiaowei Deng,
Changkang Hu,
Ling Hu,
Jian Li,
Yang Liu,
Dian Tan,
Yuan Xu,
Tongxing Yan,
Yuanzhen Chen,
Chaohong Lee,
Youpeng Zhong
, et al. (2 additional authors not shown)
Abstract:
As the basis for generating multi-particle quantum correlations, inter-particle interaction plays a crucial role in collective quantum phenomena, quantum phase transitions, and quantum information processing. It can profoundly alter the band structure of quantum many-body systems and give rise to exotic topological phenomena. Conventional topological pumping, which has been well demonstrated in driven linear or noninteracting systems, may break down in the presence of strong interaction. However, the interplay between band topology and interaction could also induce emergent topological pumping of interacting particles, but its experimental realization has proven challenging. Here we demonstrate interaction-induced topological pumping in a solid-state quantum system comprising an array of 36 superconducting qubits. With strong interaction inherent in the qubits and site-resolved controllability of the lattice potential and hopping strength, we realize the topological Thouless pumping of single and two bounded particles. Beyond these topological phenomena with linear or noninteracting counterparts, we also observe topologically resonant tunneling and asymmetric edge-state transport of interacting particles. Our work creates a paradigm for multi-particle topological effects, and provides a new pathway to the study of exotic topological phenomena, many-body quantum transport, and quantum information transfer.
Submitted 8 March, 2023;
originally announced March 2023.
-
Low-loss interconnects for modular superconducting quantum processors
Authors:
Jingjing Niu,
Libo Zhang,
Yang Liu,
Jiawei Qiu,
Wenhui Huang,
Jiaxiang Huang,
Hao Jia,
Jiawei Liu,
Ziyu Tao,
Weiwei Wei,
Yuxuan Zhou,
Wanjing Zou,
Yuanzhen Chen,
Xiaowei Deng,
Xiuhao Deng,
Changkang Hu,
Ling Hu,
Jian Li,
Dian Tan,
Yuan Xu,
Fei Yan,
Tongxing Yan,
Song Liu,
Youpeng Zhong,
Andrew N. Cleland
, et al. (1 additional authors not shown)
Abstract:
Scaling is now a key challenge in superconducting quantum computing. One solution is to build modular systems in which smaller-scale quantum modules are individually constructed and calibrated, and then assembled into a larger architecture. This, however, requires the development of suitable interconnects. Here, we report low-loss interconnects based on pure aluminium coaxial cables and on-chip impedance transformers featuring quality factors up to $8.1 \times 10^5$, which is comparable to the performance of our transmon qubits fabricated on single-crystal sapphire substrate. We use these interconnects to link five quantum modules with inter-module quantum state transfer and Bell state fidelities up to 99\%. To benchmark the overall performance of the processor, we create maximally-entangled, multi-qubit Greenberger-Horne-Zeilinger (GHZ) states. The generated inter-module four-qubit GHZ state exhibits 92.0\% fidelity. We also entangle up to 12 qubits in a GHZ state with $55.8 \pm 1.8\%$ fidelity, which is above the genuine multipartite entanglement threshold of 1/2. These results represent a viable modular approach for large-scale superconducting quantum processors.
Submitted 6 February, 2023;
originally announced February 2023.
-
Perceptive Locomotion with Controllable Pace and Natural Gait Transitions Over Uneven Terrains
Authors:
Daniel Chee Hian Tan,
Jenny Zhang,
Michael Chuah,
Zhibin Li
Abstract:
This work developed a learning framework for perceptive legged locomotion that combines visual feedback, proprioceptive information, and active gait regulation of foot-ground contacts. The perception requires only one forward-facing camera to obtain the heightmap, and the active regulation of gait paces and traveling velocity are realized through our formulation of CPG-based high-level imitation of foot-ground contacts. Through this framework, an end-user has the ability to command task-level inputs to control different walking speeds and gait frequencies according to the traversal of different terrains, which enables more reliable negotiation with encountered obstacles. The results demonstrated that the learned perceptive locomotion policy followed task-level control inputs with intended behaviors, and was robust in presence of unseen terrains and external force perturbations. A video demonstration can be found at https://youtu.be/OTzlWzDfAe8, and the codebase at https://github.com/jennyzzt/perceptual-locomotion.
△ Less
Submitted 30 January, 2023; v1 submitted 25 January, 2023;
originally announced January 2023.
-
AP: Selective Activation for De-sparsifying Pruned Neural Networks
Authors:
Shiyu Liu,
Rohan Ghosh,
Dylan Tan,
Mehul Motani
Abstract:
The rectified linear unit (ReLU) is a highly successful activation function in neural networks as it allows networks to easily obtain sparse representations, which reduces overfitting in overparameterized networks. However, in network pruning, we find that the sparsity introduced by ReLU, which we quantify by a term called dynamic dead neuron rate (DNR), is not beneficial for the pruned network. Interestingly, the more the network is pruned, the smaller the dynamic DNR becomes during optimization. This motivates us to propose a method to explicitly reduce the dynamic DNR for the pruned network, i.e., de-sparsify the network. We refer to our method as Activating-while-Pruning (AP). We note that AP does not function as a stand-alone method, as it does not evaluate the importance of weights. Instead, it works in tandem with existing pruning methods and aims to improve their performance by selective activation of nodes to reduce the dynamic DNR. We conduct extensive experiments using popular networks (e.g., ResNet, VGG) via two classical and three state-of-the-art pruning methods. The experimental results on public datasets (e.g., CIFAR-10/100) suggest that AP works well with existing pruning methods and improves the performance by 3% - 4%. For larger scale datasets (e.g., ImageNet) and state-of-the-art networks (e.g., vision transformer), we observe an improvement of 2% - 3% with AP as opposed to without. Lastly, we conduct an ablation study to examine the effectiveness of the components comprising AP.
Submitted 9 December, 2022;
originally announced December 2022.
-
Analysis and Utilization of Entrainment on Acoustic and Emotion Features in User-agent Dialogue
Authors:
Daxin Tan,
Nikos Kargas,
David McHardy,
Constantinos Papayiannis,
Antonio Bonafonte,
Marek Strelec,
Jonas Rohnke,
Agis Oikonomou Filandras,
Trevor Wood
Abstract:
Entrainment is the phenomenon by which an interlocutor adapts their speaking style to align with their partner in conversations. It has been found in different dimensions, such as acoustic, prosodic, lexical, and syntactic. In this work, we explore and utilize the entrainment phenomenon to improve spoken dialogue systems for voice assistants. We first examine the existence of the entrainment phenomenon in human-to-human dialogues with respect to acoustic features and then extend the analysis to emotion features. The analysis results show strong evidence of entrainment in terms of both acoustic and emotion features. Based on these findings, we implement two entrainment policies and assess whether integrating the entrainment principle into a Text-to-Speech (TTS) system improves the synthesis performance and the user experience. We find that integrating the entrainment principle into a TTS system improves performance when considering acoustic features, while no obvious improvement is observed when considering emotion features.
Submitted 6 December, 2022;
originally announced December 2022.
-
On the Paley graph of a quadratic character
Authors:
Ján Mináč,
Lyle Muller,
Tung T. Nguyen,
Nguyen Duy Tân
Abstract:
Paley graphs form a nice link between the distribution of quadratic residues and graph theory. These graphs possess remarkable properties which make them useful in several branches of mathematics. Classically, for each prime number $p$ we can construct the corresponding Paley graph using quadratic and non-quadratic residues modulo $p$. Therefore, Paley graphs are naturally associated with the Legendre symbol at $p$ which is a quadratic Dirichlet character of conductor $p$. In this article, we introduce the generalized Paley graphs. These are graphs that are associated with a general quadratic Dirichlet character. We will then provide some of their basic properties. In particular, we describe their spectrum explicitly. We then use those generalized Paley graphs to construct some new families of Ramanujan graphs. Finally, using special values of $L$-functions, we provide an effective upper bound for their Cheeger number.
Submitted 6 December, 2023; v1 submitted 4 December, 2022;
originally announced December 2022.
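The classical construction mentioned in the abstract above (one Paley graph per prime, built from quadratic residues) can be sketched as follows. This is a minimal illustration of the textbook case only, not the generalized construction of the paper; the function names are hypothetical, and the sketch assumes the caller passes a prime $p$ with $p \equiv 1 \pmod 4$ (primality is not checked).

```python
def quadratic_residues(p):
    """Nonzero quadratic residues modulo p (p assumed prime)."""
    return {(x * x) % p for x in range(1, p)}


def paley_graph(p):
    """Adjacency sets of the classical Paley graph on Z/pZ.

    Assumption: p is a prime with p % 4 == 1, so that -1 is a square
    and the relation "a - b is a nonzero residue" is symmetric.
    """
    if p % 4 != 1:
        raise ValueError("this sketch assumes p = 1 (mod 4)")
    residues = quadratic_residues(p)
    return {
        a: {b for b in range(p) if (a - b) % p in residues}
        for a in range(p)
    }


graph = paley_graph(13)
```

Each vertex of the resulting graph has degree $(p-1)/2$, which is the regularity the spectral statements in the abstract build on.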
-
Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion
Authors:
Dario Pavllo,
David Joseph Tan,
Marie-Julie Rakotosaona,
Federico Tombari
Abstract:
Neural Radiance Fields (NeRF) coupled with GANs represent a promising direction in the area of 3D reconstruction from a single view, owing to their ability to efficiently model arbitrary topologies. Recent work in this area, however, has mostly focused on synthetic datasets where exact ground-truth poses are known, and has overlooked pose estimation, which is important for certain downstream applications such as augmented reality (AR) and robotics. We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available. Our approach recovers an SDF-parameterized 3D shape, pose, and appearance from a single image of an object, without exploiting multiple views during training. More specifically, we leverage an unconditional 3D-aware generator, to which we apply a hybrid inversion scheme where a model produces a first guess of the solution which is then refined via optimization. Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios. We demonstrate state-of-the-art results on a variety of real and synthetic benchmarks.
Submitted 20 March, 2023; v1 submitted 21 November, 2022;
originally announced November 2022.
-
Optimal geodesics for boundary points of the Gardiner-Masur compactification
Authors:
Xiaoke Lou,
Weixu Su,
Dong Tan
Abstract:
The Gardiner-Masur compactification of Teichmüller space is homeomorphic to the horofunction compactification of the Teichmüller metric. Let $ξ$ and $η$ be a pair of boundary points in the Gardiner-Masur compactification that fill up the surface. We show that there is a unique Teichmüller geodesic which is optimal for the horofunctions corresponding to $ξ$ and $η$. In particular, when $ξ$ and $η$ are Busemann points that fill up the surface, the geodesic converges to $ξ$ in forward direction and to $η$ in backward direction. As an application, we show that if $\mathbf{G}_n$ is a sequence of Teichmüller geodesics passing through $X_n$ and $Y_n$ such that $X_n \to ξ$ and $Y_n \to η$, then $\mathbf{G}_n$ converges to a unique Teichmüller geodesic.
Submitted 27 July, 2023; v1 submitted 11 October, 2022;
originally announced October 2022.
-
Automated Sex Classification of Children's Voices and Changes in Differentiating Factors with Age
Authors:
Fuling Chen,
Roberto Togneri,
Murray Maybery,
Diana Weiting Tan
Abstract:
Sex classification of children's voices allows for an investigation of the development of secondary sex characteristics which has been a key interest in the field of speech analysis. This research investigated a broad range of acoustic features from scripted and spontaneous speech and applied a hierarchical clustering-based machine learning model to distinguish the sex of children aged between 5 and 15 years. We proposed an optimal feature set and our modelling achieved an average F1 score (the harmonic mean of the precision and recall) of 0.84 across all ages. Our results suggest that the sex classification is generally more accurate when a model is developed for each year group rather than for children in 4-year age bands, with classification accuracy being better for older age groups. We found that spontaneous speech could provide more helpful cues in sex classification than scripted speech, especially for children younger than 7 years. For younger age groups, a broad range of acoustic factors contributed evenly to sex classification, while for older age groups, F0-related acoustic factors were found to be the most critical predictors generally. Other important acoustic factors for older age groups include vocal tract length estimators, spectral flux, loudness and unvoiced features.
Submitted 26 September, 2022;
originally announced September 2022.
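The abstract above reports an F1 score, defined there as the harmonic mean of precision and recall. A minimal sketch of that definition follows; the function name is hypothetical, and returning 0.0 when both inputs are zero is a convention assumed here, not something the abstract specifies.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall.

    Assumption: returns 0.0 when both inputs are zero, to avoid
    dividing by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because it is a harmonic mean, the score is pulled toward the smaller of the two inputs, so a classifier cannot reach a high F1 on recall alone.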
-
ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement
Authors:
Dongli Tan,
Jiang-Jiang Liu,
Xingyu Chen,
Chao Chen,
Ruixin Zhang,
Yunhang Shen,
Shouhong Ding,
Rongrong Ji
Abstract:
Modeling sparse and dense image matching within a unified functional correspondence model has recently attracted increasing research interest. However, existing efforts mainly focus on improving matching accuracy while ignoring its efficiency, which is crucial for real-world applications. In this paper, we propose an efficient structure named Efficient Correspondence Transformer (ECO-TR) that finds correspondences in a coarse-to-fine manner, which significantly improves the efficiency of the functional correspondence model. To achieve this, multiple transformer blocks are connected stage-wise to gradually refine the predicted coordinates upon a shared multi-scale feature extraction network. Given a pair of images and arbitrary query coordinates, all the correspondences are predicted within a single feed-forward pass. We further propose an adaptive query-clustering strategy and an uncertainty-based outlier detection module to cooperate with the proposed framework for faster and better predictions. Experiments on various sparse and dense matching tasks demonstrate the superiority of our method in both efficiency and effectiveness over existing state-of-the-art methods.
Submitted 25 September, 2022;
originally announced September 2022.
-
A Hybrid Deep Learning Model-based Remaining Useful Life Estimation for Reed Relay with Degradation Pattern Clustering
Authors:
Chinthaka Gamanayake,
Yan Qin,
Chau Yuen,
Lahiru Jayasinghe,
Dominique-Ea Tan,
Jenny Low
Abstract:
The reed relay serves as a fundamental component of functional testing, which closely relates to the successful quality inspection of electronics. To provide accurate remaining useful life (RUL) estimation for reed relays, a hybrid deep learning network with degradation pattern clustering is proposed based on the following three considerations. First, multiple degradation behaviors are observed for reed relays, and hence a dynamic time warping-based $K$-means clustering is offered to distinguish degradation patterns from each other. Second, although proper selection of features is of great significance, few studies are available to guide the selection. The proposed method recommends operational rules for easy implementation. Third, a neural network for remaining useful life estimation (RULNet) is proposed to address the weakness of the convolutional neural network (CNN) in capturing temporal information of sequential data; it incorporates temporal correlation ability after the high-level feature representation of the convolutional operation. In this way, three variants of RULNet are constructed with health indicators, features with self-organizing map, or features with curve fitting. Ultimately, the proposed hybrid model is compared with typical baseline models, including CNN and long short-term memory network (LSTM), on a practical reed relay dataset with two distinct degradation manners. The results from both degradation cases demonstrate that the proposed method outperforms CNN and LSTM in terms of root mean squared error.
Submitted 14 September, 2022;
originally announced September 2022.
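The clustering step in the abstract above relies on dynamic time warping (DTW) as the distance between degradation curves. The sketch below shows the standard DTW recurrence on two numeric sequences; it is a generic textbook version, not the authors' implementation, and the function name and the absolute-difference local cost are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumption: the local cost between two samples is their absolute
    difference; no warping-window constraint is applied.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Unlike a pointwise Euclidean distance, this measure tolerates curves that follow the same shape at different speeds, which is why it suits sequences of unequal length.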
-
On the joins of group rings
Authors:
Sunil K. Chebolu,
Jonathan L. Merzel,
Ján Mináč,
Lyle Muller,
Tung T. Nguyen,
Federico W. Pasini,
Nguyen Duy Tân
Abstract:
Given a collection $\{ G_i\}_{i=1}^d$ of finite groups and a ring $R$, we define a subring of the ring $M_n(R)$ ($n = \sum_{i=1}^d |G_i|$) that encompasses all the individual group rings $R[G_i]$ along the diagonal blocks as $G_i$-circulant matrices. The precise definition of this ring was inspired by a construction in graph theory known as the joined union of graphs. We call this ring the join of group rings and denote it by $\mathcal{J}_{G_1,\dots, G_d}(R)$. In this paper, we present a systematic study of the algebraic structure of $\mathcal{J}_{G_1,\dots, G_d}(R)$. We show that it has a ring structure and characterize its center, group of units, and Jacobson radical. When $R=k$ is an algebraically closed field, we derive a formula for the number of irreducible modules over $\mathcal{J}_{G_1,\dots, G_d}(k)$. We also show how a blockwise extension of the Fourier transform provides both a generalization of the Circulant Diagonalization Theorem to joins of circulant matrices and an explicit isomorphism between the join algebra and its Wedderburn components.
Submitted 1 April, 2023; v1 submitted 15 August, 2022;
originally announced August 2022.
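The Circulant Diagonalization Theorem that the abstract above generalizes can be checked numerically in its classical form: for a circulant matrix over the complex numbers (the cyclic-group case), the Fourier vectors are eigenvectors. The sketch below covers only that classical case, not the join construction of the paper; all function names are hypothetical.

```python
import cmath


def circulant(first_row):
    """Circulant matrix whose (i, j) entry is first_row[(j - i) mod n]."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def fourier_eigenpair(first_row, k):
    """k-th eigenvalue and Fourier eigenvector of circulant(first_row)."""
    n = len(first_row)
    omega = cmath.exp(2j * cmath.pi / n)  # primitive n-th root of unity
    eigenvalue = sum(c * omega ** (m * k) for m, c in enumerate(first_row))
    eigenvector = [omega ** (j * k) for j in range(n)]
    return eigenvalue, eigenvector


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]
```

Multiplying `circulant(row)` by the vector returned for index `k` reproduces that vector scaled by the returned eigenvalue, up to floating-point error.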
-
On the arithmetic of generalized Fekete polynomials
Authors:
Ján Mináč,
Tung T. Nguyen,
Nguyen Duy Tân
Abstract:
For each prime number $p$ one can associate a Fekete polynomial with coefficients $-1$ or $1$ except the constant term, which is 0. These are classical polynomials that have been studied extensively in the framework of analytic number theory. In a recent paper, we showed that these polynomials also encode interesting arithmetic information. In this paper, we define generalized Fekete polynomials associated with quadratic characters whose conductors could be a composite number. We then investigate the appearance of cyclotomic factors of these generalized Fekete polynomials. Based on this investigation, we introduce a compact version of Fekete polynomials as well as their trace polynomials. We then study the Galois groups of these Fekete polynomials using modular techniques. In particular, we discover some surprising extra symmetries which imply some restrictions on the corresponding Galois groups. Finally, based on both theoretical and numerical data, we propose a precise conjecture on the structure of these Galois groups.
Submitted 2 December, 2023; v1 submitted 23 June, 2022;
originally announced June 2022.
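For the classical case described in the abstract above, the Fekete polynomial of an odd prime $p$ has the Legendre symbols $(a/p)$ as coefficients, so every coefficient is $\pm 1$ except the constant term, which is 0. A minimal sketch follows, assuming an odd prime input (not checked) and using Euler's criterion; the function names are hypothetical, and the generalized polynomials of the paper are not covered.

```python
def legendre_symbol(a, p):
    """Legendre symbol (a/p) via Euler's criterion; p assumed an odd prime."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def fekete_coefficients(p):
    """Coefficients c_0, ..., c_{p-1} of the Fekete polynomial of p."""
    return [legendre_symbol(a, p) for a in range(p)]


def evaluate(coeffs, x):
    """Evaluate a polynomial given by ascending coefficients (Horner's rule)."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value
```

Evaluating at $x = 1$ gives the sum of all Legendre symbols modulo $p$, which vanishes because residues and non-residues are equally numerous.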
-
SoftPool++: An Encoder-Decoder Network for Point Cloud Completion
Authors:
Yida Wang,
David Joseph Tan,
Nassir Navab,
Federico Tombari
Abstract:
We propose a novel convolutional operator for the task of point cloud completion. One striking characteristic of our approach is that, in contrast to related work, it does not require any max-pooling or voxelization operation. Instead, the proposed operator used to learn the point cloud embedding in the encoder extracts permutation-invariant features from the point cloud via a soft-pooling of feature activations, which are able to preserve fine-grained geometric details. These features are then passed on to a decoder architecture. Due to the compression in the encoder, a typical limitation of this type of architecture is that it tends to lose parts of the input shape structure. We propose to overcome this limitation by using skip connections specifically devised for point clouds, where links between corresponding layers in the encoder and the decoder are established. As part of these connections, we introduce a transformation matrix that projects the features from the encoder to the decoder and vice versa. The quantitative and qualitative results on the task of object completion from partial scans on the ShapeNet dataset show that incorporating our approach achieves state-of-the-art performance in shape completion at both low and high resolutions.
Submitted 8 May, 2022;
originally announced May 2022.
-
CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction
Authors:
Daxin Tan,
Liqun Deng,
Nianzu Zheng,
Yu Ting Yeung,
Xin Jiang,
Xiao Chen,
Tan Lee
Abstract:
This study proposes a fully automated system for speech correction and accent reduction. Consider the application scenario in which a recorded speech audio contains certain errors, e.g., inappropriate words or mispronunciations, that need to be corrected. The proposed system, named CorrectSpeech, performs the correction in three steps: recognizing the recorded speech and converting it into a time-stamped symbol sequence, aligning the recognized symbol sequence with the target text to determine the locations and types of required edit operations, and generating the corrected speech. Experiments show that the quality and naturalness of the corrected speech depend on the performance of the speech recognition and alignment modules, as well as the granularity level of the editing operations. The proposed system is evaluated on two corpora: a manually perturbed version of VCTK and L2-ARCTIC. The results demonstrate that our system is able to correct mispronunciation and reduce accent in speech recordings. Audio samples are available online for demonstration at https://daxintan-cuhk.github.io/CorrectSpeech/.
Submitted 13 October, 2022; v1 submitted 11 April, 2022;
originally announced April 2022.
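The second step in the abstract above, aligning a recognized symbol sequence with the target text to find the locations and types of edit operations, can be illustrated with Python's standard `difflib`. This is a stand-in chosen for illustration, not the aligner used in CorrectSpeech; the function name and the word-level tokens are assumptions.

```python
from difflib import SequenceMatcher


def edit_operations(recognized, target):
    """List the non-matching opcodes between two symbol sequences.

    Each item is (tag, i1, i2, j1, j2): recognized[i1:i2] should be
    replaced, deleted, or extended to become target[j1:j2].
    """
    matcher = SequenceMatcher(a=recognized, b=target, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


# Hypothetical word-level example: one mispronounced or wrong word.
ops = edit_operations(["the", "cat", "sat"], ["the", "dog", "sat"])
```

In a pipeline like the one described, the index ranges would be mapped back to time stamps so that only the affected audio segment is regenerated.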
-
Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech
Authors:
Guangyan Zhang,
Kaitao Song,
Xu Tan,
Daxin Tan,
Yuzi Yan,
Yanqing Liu,
Gang Wang,
Wei Zhou,
Tao Qin,
Tan Lee,
Sheng Zhao
Abstract:
Recently, leveraging BERT pre-training to improve the phoneme encoder in text to speech (TTS) has drawn increasing attention. However, existing works apply pre-training with character-based units to enhance the TTS phoneme encoder, which is inconsistent with the TTS fine-tuning that takes phonemes as input. Pre-training only with phonemes as input can alleviate the input mismatch but lacks the ability to model rich representations and semantic information due to the limited phoneme vocabulary. In this paper, we propose Mixed-Phoneme BERT, a novel variant of the BERT model that uses mixed phoneme and sup-phoneme representations to enhance the learning capability. Specifically, we merge the adjacent phonemes into sup-phonemes and combine the phoneme sequence and the merged sup-phoneme sequence as the model input, which can enhance the model capacity to learn rich contextual representations. Experiment results demonstrate that our proposed Mixed-Phoneme BERT significantly improves the TTS performance with 0.30 CMOS gain compared with the FastSpeech 2 baseline. The Mixed-Phoneme BERT achieves 3x inference speedup and similar voice quality to the previous TTS pre-trained model PnG BERT.
Submitted 19 July, 2022; v1 submitted 31 March, 2022;
originally announced March 2022.
-
Learning Local Displacements for Point Cloud Completion
Authors:
Yida Wang,
David Joseph Tan,
Nassir Navab,
Federico Tombari
Abstract:
We propose a novel approach aimed at object and semantic scene completion from a partial scan represented as a 3D point cloud. Our architecture relies on three novel layers that are used successively within an encoder-decoder structure and specifically developed for the task at hand. The first one carries out feature extraction by matching the point features to a set of pre-trained local descriptors. Then, to avoid losing individual descriptors as part of standard operations such as max-pooling, we propose an alternative neighbor-pooling operation that relies on adopting the feature vectors with the highest activations. Finally, up-sampling in the decoder modifies our feature extraction in order to increase the output dimension. While this model is already able to achieve competitive results with the state of the art, we further propose a way to increase the versatility of our approach to process point clouds. To this aim, we introduce a second model that assembles our layers within a transformer architecture. We evaluate both architectures on object and indoor scene completion tasks, achieving state-of-the-art performance.
Submitted 30 March, 2022;
originally announced March 2022.
-
Native Conditional $i$SWAP Operation with Superconducting Artificial Atoms
Authors:
Chang-Kang Hu,
Jiahao Yuan,
Bruno A. Veloso,
Jiawei Qiu,
Yuxuan Zhou,
Libo Zhang,
Ji Chu,
Orkesh Nurbolat,
Ling Hu,
Jian Li,
Yuan Xu,
Youpeng Zhong,
Song Liu,
Fei Yan,
Dian Tan,
R. Bachelard,
Alan C. Santos,
C. J. Villas-Boas,
Dapeng Yu
Abstract:
Controlling the flow of quantum information is a fundamental task for quantum computers, which is unfeasible to realize on classical devices. Coherent devices which can process quantum states are thus required to route the quantum states that encode information. In this paper we demonstrate experimentally the smallest quantum transistor with a superconducting quantum processor which is composed of a collector qubit, an emitter qubit, and a coupler (transistor gate). The interaction strength between the collector and emitter qubits is controlled by the frequency and state of the coupler, effectively implementing a quantum switch. Through the coupler-state-dependent Heisenberg (inherent) interaction between the qubits, a single-step (native) conditional $i$SWAP operation can be applied. To this end, we find that it is important to take higher energy levels into consideration to achieve a native and high-fidelity transistor operation. Using quantum process tomography, we obtain an operation fidelity of $92.36\%$ when the transistor gate is open ($i$SWAP implementation) and $95.23\%$ in the case of a closed gate (identity gate implementation). The architecture has strong potential in quantum information processing applications with superconducting qubits.
Submitted 1 October, 2023; v1 submitted 18 March, 2022;
originally announced March 2022.
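The ideal action of the conditional $i$SWAP described in the abstract above can be written down as a small matrix sketch: the standard $i$SWAP unitary when the transistor gate is open, and the identity when it is closed. The sketch makes strong simplifying assumptions that are not from the paper: the coupler is treated as a classical flag `gate_open`, higher energy levels and decoherence are ignored, and the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ for (collector, emitter) is chosen for illustration.

```python
# Ideal two-qubit iSWAP unitary in the basis |00>, |01>, |10>, |11>.
ISWAP = [
    [1, 0, 0, 0],
    [0, 0, 1j, 0],
    [0, 1j, 0, 0],
    [0, 0, 0, 1],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def conditional_iswap(state, gate_open):
    """Apply iSWAP if the (classical, assumed) gate flag is open, else identity."""
    gate = ISWAP if gate_open else IDENTITY
    return [sum(gate[i][j] * state[j] for j in range(4)) for i in range(4)]


# |01> is swapped to i|10> when the gate is open and left alone when closed.
swapped = conditional_iswap([0, 1, 0, 0], gate_open=True)
unchanged = conditional_iswap([0, 1, 0, 0], gate_open=False)
```

The states $|00\rangle$ and $|11\rangle$ are unaffected in both cases, so the gate only moves a single excitation between the two qubits.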
-
Galois module structure of some elementary $p$-abelian extensions
Authors:
Lauren Heller,
Jan Minac,
Tung T. Nguyen,
Andrew Schultz,
Nguyen Duy Tan
Abstract:
We determine the Galois module structure of the parameterizing space of elementary $p$-abelian extensions of a field $K$ when $\text{Gal}(K/F)$ is any finite $p$-group, under the assumption that the maximal pro-$p$ quotient of the absolute Galois group of $F$ is a free, finitely generated pro-$p$ group, and that $F$ contains a primitive $p$th root of unity if $\text{char}(F) \neq p$.
Submitted 6 January, 2023; v1 submitted 4 March, 2022;
originally announced March 2022.