Search | arXiv e-print repository

doi 10.3847/1538-4357/ad0735

Accretion Flow Properties of EXO 1846-031 During its Multi-Peaked Outburst After Long Quiescence

Authors: Sujoy Kumar Nath, Dipak Debnath, Kaushik Chatterjee, Riya Bhowmick, Hsiang-Kuang Chang, Sandip K. Chakrabarti

Abstract: We study the recent outburst of the black hole candidate EXO 1846-031 which went into an outburst in 2019 after almost 34 years in quiescence. We use archival data from Swift/XRT, MAXI/GSC, NICER/XTI and NuSTAR/FPM satellites/instruments to study the evolution of the spectral and temporal properties of the source during the outburst. Low energy (2-10 keV) X-ray flux of the outburst shows multiple peaks making it a multipeak outburst. Evolving type-C quasi-periodic oscillations (QPOs) are observed in the NICER data in the hard, hard intermediate and soft intermediate states. We use the physical Two Component Advective Flow (TCAF) model to analyze the combined spectra of multiple satellite instruments. According to the TCAF model, the accreting matter is divided into Keplerian and sub-Keplerian parts, and the variation in the observed spectra in different spectral states arises out of the variable contributions of these two types of accreting matter in the total accretion rate. Studying the evolution of the accretion rates and other properties of the accretion flow obtained from the spectral analysis, we show how the multiple peaks in the outburst flux arise out of the variable supply of accreting matter from the pile-up radius. We determine the probable mass of the black hole to be $10.4^{+0.1}_{-0.2}~M_\odot$ from the spectral analysis with the TCAF model. We also estimate the viscous time scale of the source in this outburst to be $\sim 8$ days from the peak difference of the Keplerian and sub-Keplerian mass accretion rates.

Submitted 6 June, 2024; v1 submitted 10 July, 2023; originally announced July 2023.

Comments: 16 pages, 8 Figures, 2 Tables

Journal ref: 2024, ApJ, 960, 5

arXiv:2307.04138 [pdf, other]

doi 10.1145/3593013.3594116

On The Impact of Machine Learning Randomness on Group Fairness

Authors: Prakhar Ganesh, Hongyan Chang, Martin Strobel, Reza Shokri

Abstract: Statistical measures for group fairness in machine learning reflect the gap in performance of algorithms across different groups. These measures, however, exhibit a high variance between different training instances, which makes them unreliable for empirical evaluation of fairness. What causes this high variance? We investigate the impact on group fairness of different sources of randomness in training neural networks. We show that the variance in group fairness measures is rooted in the high volatility of the learning process on under-represented groups. Further, we recognize the dominant source of randomness as the stochasticity of data order during training. Based on these findings, we show how one can control group-level accuracy (i.e., model fairness), with high efficiency and negligible impact on the model's overall performance, by simply changing the data order for a single epoch.

Submitted 9 July, 2023; originally announced July 2023.

Comments: 10 pages + Appendix

arXiv:2307.03865 [pdf, other]

Identifying metabolites from protein identifiers with P2M

Authors: Christine H. Chang, Bryan J. Killinger, Ryan S. Renslow, Sean M. Colby

Abstract: The identification of metabolites from complex biological samples often involves matching experimental mass spectrometry data to signatures of compounds derived from massive chemical databases. However, misidentifications may result due to the complexity of potential chemical space that leads to databases containing compounds with nearly identical structures. Prior knowledge of compounds that may be enzymatically consumed or produced by an organism can help reduce misidentifications by restricting initial database searching to compounds that are likely to be present in a biological system. While databases such as UniProt allow for the identification of small molecules that may be consumed or generated by enzymes encoded in an organism's genome, currently no tool exists for identifying SMILES strings of metabolites associated with protein identifiers and expanding R-containing substructures to fully defined, biologically relevant chemical structures. Here we present Proteome2Metabolome (P2M), a tool that performs these tasks using external database querying behind a simple command line interface. Beyond mass spectrometry based applications, P2M can be generally used to identify biologically relevant chemical structures likely to be observed in a biological system.

Submitted 7 July, 2023; originally announced July 2023.

arXiv:2307.02213 [pdf]

Multi-level recording in dual-layer FePt-C granular film for heat-assisted magnetic recording

Authors: P. Tozman, S. Isogami, I. Suzuki, A. Bolyachkin, H. Sepehri-Amin, S. J. Greaves, H. Suto, Y. Sasaki, H. T. Y. Chang, Y Kubota, P. Steiner, P. -W. Huang, K. Hono, Y. K. Takahashi

Abstract: Multi-level magnetic recording is a new concept for increasing the data storage capacity of hard disk drives. However, its implementation has been limited by a lack of suitable media capable of storing information at multiple levels. Herein, we overcome this problem by developing dual FePt-C nanogranular films separated by a Ru-C breaking layer with a cubic crystal structure. The FePt grains in the bottom and top layers of the developed media exhibited different effective magnetocrystalline anisotropies and Curie temperatures. The former is realized by different degrees of ordering in the L10-FePt grains, whereas the latter was attributed to the diffusion of Ru, thereby enabling separate magnetic recordings at each layer under different magnetic fields and temperatures. Furthermore, the magnetic measurements and heat-assisted magnetic recording simulations showed that these media enabled 3-level recording and could potentially be extended to 4-level recording, as the up-down and down-up states exhibited non-zero magnetization.

Submitted 5 July, 2023; originally announced July 2023.

arXiv:2306.16601 [pdf, other]

An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs

Authors: Haihao Shen, Hengyu Meng, Bo Dong, Zhe Wang, Ofir Zafrir, Yi Ding, Yu Luo, Hanwen Chang, Qun Gao, Ziheng Wang, Guy Boudoukh, Moshe Wasserblat

Abstract: In recent years, Transformer-based language models have become the standard approach for natural language processing tasks. However, stringent throughput and latency requirements in industrial applications are limiting their adoption. To mitigate the gap, model compression techniques such as structured pruning are being used to improve inference efficiency. However, most existing neural network inference runtimes lack adequate support for structured sparsity. In this paper, we propose an efficient sparse deep learning inference software stack for Transformer-based language models where the weights are pruned with constant block size. Our sparse software accelerator leverages Intel Deep Learning Boost to maximize the performance of sparse matrix - dense matrix multiplication (commonly abbreviated as SpMM) on CPUs. Our SpMM kernel outperforms the existing sparse libraries (oneMKL, TVM, and LIBXSMM) by an order of magnitude on a wide range of GEMM shapes under 5 representative sparsity ratios (70%, 75%, 80%, 85%, 90%). Moreover, our SpMM kernel shows up to 5x speedup over dense GEMM kernel of oneDNN, a well-optimized dense library widely used in industry. We apply our sparse accelerator on widely-used Transformer-based language models including Bert-Mini, DistilBERT, Bert-Base, and BERT-Large. Our sparse inference software shows up to 1.5x speedup over Neural Magic's Deepsparse under the same configurations on Xeon on Amazon Web Services under proxy production latency constraints. We also compare our solution with two framework-based inference solutions, ONNX Runtime and PyTorch, and demonstrate up to 37x speedup over ONNX Runtime and 345x over PyTorch on Xeon under the latency constraints. All the source code is publicly available on GitHub: https://github.com/intel/intel-extension-for-transformers.

Submitted 28 June, 2023; originally announced June 2023.

arXiv:2306.15372 [pdf, other]

Excitation's lifetime extracted from electron-photon (EELS-CL) nanosecond-scale temporal coincidences

Authors: Nadezda Varkentina, Yves Auad, Steffi Y. Woo, Florian Castioni, Jean-Denis Blazit, Marcel Tencé, Huan-Cheng Chang, Jeson Chen, Kenji Watanabe, Takashi Taniguchi, Mathieu Kociak, Luiz H. G. Tizei

Abstract: Electron-photon temporal correlations in electron energy loss (EELS) and cathodoluminescence (CL) spectroscopies have recently been used to measure the relative quantum efficiency of materials. This combined spectroscopy, named Cathodoluminescence excitation spectroscopy (CLE), allows the identification of excitation and decay channels which are hidden in average measurements. Here, we demonstrate that CLE can also be used to measure an excitation's decay time. In addition, the decay time as a function of the excitation energy is accessed, as the energy for each electron-photon pair is probed. We used two well-known insulating materials to characterize this technique, nanodiamonds with \textit{NV$^0$} defect emission and h-BN with a \textit{4.1 eV} defect emission. Both also exhibit marked transition radiations, whose extremely short decay times can be used to characterize the instrumental response function. It is found to be typically 2 ns, in agreement with the expected limit of the EELS detector temporal resolution. The measured lifetimes of \textit{NV$^0$} centers in diamond nanoparticles (20 to 40 ns) and the \textit{4.1 eV} defect in h-BN flakes ($<$ 2 ns) match those reported previously for these materials.

Submitted 11 November, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

arXiv:2306.15087 [pdf, other]

WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models

Authors: Virginia K. Felkner, Ho-Chun Herbert Chang, Eugene Jang, Jonathan May

Abstract: We present WinoQueer: a benchmark specifically designed to measure whether large language models (LLMs) encode biases that are harmful to the LGBTQ+ community. The benchmark is community-sourced, via application of a novel method that generates a bias benchmark from a community survey. We apply our benchmark to several popular LLMs and find that off-the-shelf models generally do exhibit considerable anti-queer bias. Finally, we show that LLM bias against a marginalized community can be somewhat mitigated by finetuning on data written about or by members of that community, and that social media text written by community members is more effective than news text written about the community by non-members. Our method for community-in-the-loop benchmark development provides a blueprint for future researchers to develop community-driven, harms-grounded LLM benchmarks for other marginalized communities.

Submitted 26 June, 2023; originally announced June 2023.

Comments: Accepted to ACL 2023 (main conference). Camera-ready version

arXiv:2306.15081 [pdf, other]

Science with a small two-band UV-photometry mission II: Observations of stars and stellar systems

Authors: J. Krtička, J. Benáček, J. Budaj, D. Korčáková, A. Pál, M. Piecka, M. Zejda, V. Bakış, M. Brož, Hsiang-Kuang Chang, N. Faltová, R. Gális, D. Jadlovský, J. Janík, J. Kára, J. Kolář, I. Krtičková, J. Kubát, B. Kubátová, P. Kurfürst, M. Labaj, J. Merc, Z. Mikulášek, F. Münz, E. Paunzen, et al. (10 additional authors not shown)

Abstract: We outline the impact of a small two-band UV-photometry satellite mission on the field of stellar physics, magnetospheres of stars, binaries, stellar clusters, interstellar matter, and exoplanets. On specific examples of different types of stars and stellar systems, we discuss particular requirements for such satellite missions in terms of specific mission parameters such as bandpass, precision, cadence, and mission duration. We show that such a mission may provide crucial data not only for hot stars that emit most of their light in UV, but also for cool stars, where UV traces their activity. This is important, for instance, for exoplanetary studies, because the level of stellar activity influences habitability. While the main asset of the two-band UV mission rests in time-domain astronomy, an example of open clusters proves that such a mission would be important also for the study of stellar populations. Properties of the interstellar dust are best explored when combining optical and IR information with observations in UV. It is well known that dust absorbs UV radiation efficiently. Consequently, we outline how such a UV mission can be used to detect eclipses of sufficiently hot stars by various dusty objects and study disks, rings, clouds, disintegrating exoplanets or exoasteroids. Furthermore, UV radiation can be used to study the cooling of neutron stars providing information about the extreme states of matter in the interiors of neutron stars and used for mapping heated spots on their surfaces.

Submitted 21 February, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: Accepted for publication in Space Science Reviews; corrected version including comments of the referee

arXiv:2306.15080 [pdf, other]

Science with a small two-band UV-photometry mission I: Mission description and follow-up observations of stellar transients

Authors: N. Werner, J. Řípa, C. Thöne, F. Münz, P. Kurfürst, M. Jelínek, F. Hroch, J. Benáček, M. Topinka, G. Lukes-Gerakopoulos, M. Zajaček, M. Labaj, M. Prišegen, J. Krtička, J. Merc, A. Pál, O. Pejcha, V. Dániel, J. Jon, R. Šošovička, J. Gromeš, J. Václavík, L. Steiger, J. Segiňák, E. Behar, et al. (11 additional authors not shown)

Abstract: This is the first in a collection of three papers introducing the science with an ultra-violet (UV) space telescope on an approximately 130 kg small satellite with a moderately fast re-pointing capability and a real-time alert communication system approved for a Czech national space mission. The mission, called Quick Ultra-Violet Kilonova surveyor - QUVIK, will provide key follow-up capabilities to increase the discovery potential of gravitational wave observatories and future wide-field multi-wavelength surveys. The primary objective of the mission is the measurement of the UV brightness evolution of kilonovae, resulting from mergers of neutron stars, to distinguish between different explosion scenarios. The mission, which is designed to be complementary to the Ultraviolet Transient Astronomy Satellite - ULTRASAT, will also provide unique follow-up capabilities for other transients both in the near- and far-UV bands. Between the observations of transients, the satellite will target other objects described in this collection of papers, which demonstrates that a small and relatively affordable dedicated UV-space telescope can be transformative for many fields of astrophysics.

Submitted 10 January, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: Accepted for publication in Space Science Reviews

arXiv:2306.14118 [pdf]

Machine Learning and Consumer Data

Authors: Hannah H. Chang, Anirban Mukherjee

Abstract: The digital revolution has led to the digitization of human behavior, creating unprecedented opportunities to understand observable actions on an unmatched scale. Emerging phenomena such as crowdfunding and crowdsourcing have further illuminated consumer behavior while also introducing new behavioral patterns. However, the sheer volume and complexity of this data present significant challenges for marketing researchers and practitioners. Traditional methods used to analyze consumer data fall short in handling the breadth, precision, and scale of emerging data sources. To address this, computational methods have been developed to manage the "big data" associated with consumer behavior, which typically includes structured data, textual data, audial data, and visual data. These methods, particularly machine learning, allow for effective parsing and processing of multi-faceted data. Given these recent developments, this review article seeks to familiarize researchers and practitioners with new data sources and analysis techniques for studying consumer behavior at scale. It serves as an introduction to the application of computational social science in understanding and leveraging publicly available consumer data.

Submitted 24 June, 2023; originally announced June 2023.

arXiv:2306.09333 [pdf, other]

doi 10.1126/science.adi7877

Dynamics of magnetization at infinite temperature in a Heisenberg spin chain

Authors: Eliott Rosenberg, Trond Andersen, Rhine Samajdar, Andre Petukhov, Jesse Hoke, Dmitry Abanin, Andreas Bengtsson, Ilya Drozdov, Catherine Erickson, Paul Klimov, Xiao Mi, Alexis Morvan, Matthew Neeley, Charles Neill, Rajeev Acharya, Richard Allen, Kyle Anderson, Markus Ansmann, Frank Arute, Kunal Arya, Abraham Asfaw, Juan Atalaya, Joseph Bardin, A. Bilmes, Gina Bortoli, et al. (156 additional authors not shown)

Abstract: Understanding universal aspects of quantum dynamics is an unresolved problem in statistical mechanics. In particular, the spin dynamics of the 1D Heisenberg model were conjectured to belong to the Kardar-Parisi-Zhang (KPZ) universality class based on the scaling of the infinite-temperature spin-spin correlation function. In a chain of 46 superconducting qubits, we study the probability distribution, $P(\mathcal{M})$, of the magnetization transferred across the chain's center. The first two moments of $P(\mathcal{M})$ show superdiffusive behavior, a hallmark of KPZ universality. However, the third and fourth moments rule out the KPZ conjecture and allow for evaluating other theories. Our results highlight the importance of studying higher moments in determining dynamic universality classes and provide key insights into universal behavior in quantum systems.

Submitted 4 April, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

Journal ref: Science 384, 48-53 (2024)

arXiv:2306.06235 [pdf, ps, other]

Resolving the Steiner Point Removal Problem in Planar Graphs via Shortcut Partitions

Authors: Hsien-Chih Chang, Jonathan Conroy, Hung Le, Lazar Milenkovic, Shay Solomon, Cuong Than

Abstract: Recently the authors [CCLMST23] introduced the notion of shortcut partition of planar graphs and obtained several results from the partition, including a tree cover with $O(1)$ trees for planar metrics and an additive embedding into small treewidth graphs. In this note, we apply the same partition to resolve the Steiner point removal (SPR) problem in planar graphs: Given any set $K$ of terminals in an arbitrary edge-weighted planar graph $G$, we construct a minor $M$ of $G$ whose vertex set is $K$, which preserves the shortest-path distances between all pairs of terminals in $G$ up to a constant factor. This resolves in the affirmative an open problem that has been asked repeatedly in the literature.

Submitted 13 September, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: Manuscript not intended for publication. The results have been subsumed by arXiv:2308.00555 from the same authors

arXiv:2306.06215 [pdf, other]

Covering Planar Metrics (and Beyond): O(1) Trees Suffice

Authors: Hsien-Chih Chang, Jonathan Conroy, Hung Le, Lazar Milenkovic, Shay Solomon, Cuong Than

Abstract: While research on the geometry of planar graphs has been active in the past decades, many properties of planar metrics remain mysterious. This paper studies a fundamental aspect of the planar graph geometry: covering planar metrics by a small collection of simpler metrics. Specifically, a \emph{tree cover} of a metric space $(X, δ)$ is a collection of trees, so that every pair of points $u$ and $v$ in $X$ has a low-distortion path in at least one of the trees. The celebrated "Dumbbell Theorem" [ADMSS95] states that any low-dimensional Euclidean space admits a tree cover with $O(1)$ trees and distortion $1+\varepsilon$, for any fixed $\varepsilon \in (0,1)$. This result has found numerous algorithmic applications, and has been generalized to the wider family of doubling metrics [BFN19]. Does the same result hold for planar metrics? A positive answer would add further evidence to the well-observed connection between Euclidean/doubling metrics and planar metrics. In this work, we answer this fundamental question affirmatively. Specifically, we show that for any given fixed $\varepsilon \in (0,1)$, any planar metric can be covered by $O(1)$ trees with distortion $1+\varepsilon$. Our result for planar metrics follows from a rather general framework: First we reduce the problem to constructing tree covers with \emph{additive distortion}. Then we introduce the notion of \emph{shortcut partition}, and draw a connection between shortcut partition and additive tree cover. Finally we prove the existence of a shortcut partition for any planar metric, using new insights regarding the grid-like structure of planar graphs. [...]

Submitted 5 November, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: Abstract truncated to fit arXiv limits

arXiv:2306.04943 [pdf, other]

The Common Fundamental Plane of X-ray Emissions from Pulsars and Magnetars in Quiescence

Authors: Che-Yen Chu, Hsiang-Kuang Chang

Abstract: Magnetars are a unique class of neutron stars characterized by their incredibly strong magnetic fields. Unlike normal pulsars whose X-ray emission is driven by rotational energy loss, magnetars exhibit distinct X-ray emissions thought to be driven by their strong magnetic fields. Here we present the results of magnetar X-ray spectra analysis in their quiescent state. Most of the spectra of magnetars can be fitted with a model consisting of a power-law and a black body component. We found that the luminosity of the power-law component can be described by a function of black-body temperature and emission-region radius. The same relation was seen in pulsars whose X-ray emission mechanism is thought to be different. The fact that magnetars and pulsars share a common fundamental plane in the space spanned by non-thermal X-ray luminosity, surface temperature and the radius of the thermally emitting region presents both challenges and hints to theoretical models for a complete comprehension of the magnetospheric emissions from these two classes of neutron stars.

Submitted 17 August, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

Comments: 10 pages, 4 figures. Submitted to MNRAS

arXiv:2306.04916 [pdf]

Triplet State and Auger-Type Excitation Originating from Two-Electron Tunneling in Field Emission Resonance on Ag(100)

Authors: Shin-Ming Lu, Ho-Hsiang Chang, Wei-Bin Su, Wen-Yuan Chan, Kung-Hsuan Lin, Chia-Seng Chang

Abstract: In this study, we discovered that the energy gap above the vacuum level in the projected bulk band structure of Ag(100) prevents electrons in the first-order field emission resonance (FER) from inducing the surface plasmons. This mechanism allows light emission from FER to reveal characteristics of triplet states and Auger-type excitation resulting from two-electron tunneling in FER. According to optical spectra, surface plasmons can be induced by electrons in the zeroth-order FER. However, corresponding radiative decay can also trigger Auger-type excitation, whose energy state is influenced by the sharpness-dependent image potential acting on the scanning tunneling microscope tip.

Submitted 7 June, 2023; originally announced June 2023.

Comments: 19 pages, 8 figures

MSC Class: 81-02

arXiv:2306.03601 [pdf, ps, other]

The Creative Frontier of Generative AI: Managing the Novelty-Usefulness Tradeoff

Authors: Anirban Mukherjee, Hannah Chang

Abstract: In this paper, drawing inspiration from the human creativity literature, we explore the optimal balance between novelty and usefulness in generative Artificial Intelligence (AI) systems. We posit that overemphasizing either aspect can lead to limitations such as hallucinations and memorization. Hallucinations, characterized by AI responses containing random inaccuracies or falsehoods, emerge when models prioritize novelty over usefulness. Memorization, where AI models reproduce content from their training data, results from an excessive focus on usefulness, potentially limiting creativity. To address these challenges, we propose a framework that includes domain-specific analysis, data and transfer learning, user preferences and customization, custom evaluation metrics, and collaboration mechanisms. Our approach aims to generate content that is both novel and useful within specific domains, while considering the unique requirements of various contexts.

Submitted 6 June, 2023; originally announced June 2023.

arXiv:2306.00984 [pdf, other]

StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners

Authors: Yonglong Tian, Lijie Fan, Phillip Isola, Huiwen Chang, Dilip Krishnan

Abstract: We investigate the potential of learning visual representations using synthetic images generated by text-to-image models. This is a natural question in the light of the excellent performance of such models in generating high-quality images. We consider specifically Stable Diffusion, one of the leading open source text-to-image models. We show that (1) when the generative model is configured with proper classifier-free guidance scale, training self-supervised methods on synthetic images can match or beat the real image counterpart; (2) by treating the multiple images generated from the same text prompt as positives for each other, we develop a multi-positive contrastive learning method, which we call StableRep. With solely synthetic images, the representations learned by StableRep surpass the performance of representations learned by SimCLR and CLIP using the same set of text prompts and corresponding real images, on large scale datasets. When we further add language supervision, StableRep trained with 20M synthetic images achieves better accuracy than CLIP trained with 50M real images.

Submitted 26 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: code is available at: https://github.com/google-research/syn-rep-learn
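The multi-positive contrastive idea described in the StableRep abstract (images generated from the same text prompt act as positives for each other) can be illustrated with a short sketch. This is not the authors' implementation: the function name, the temperature value, and the choice of a cross-entropy against a uniform target over same-prompt samples are assumptions made for illustration only.

```python
import math


def multi_positive_contrastive_loss(embeddings, prompt_ids, temperature=0.1):
    """Illustrative multi-positive contrastive loss (hypothetical helper).

    embeddings: list of non-zero vectors (lists of floats), one per image.
    prompt_ids: list of prompt identifiers; equal ids mark positives.
    For each anchor, a softmax over similarities to all other samples is
    compared by cross-entropy with a uniform target over its positives.
    """
    # L2-normalise so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    total, counted = 0.0, 0
    for i in range(len(normed)):
        others = [j for j in range(len(normed)) if j != i]
        positives = [j for j in others if prompt_ids[j] == prompt_ids[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing

        logits = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in others
        }
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits.values())
        log_z = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        # Cross-entropy with a uniform target distribution over positives.
        total += -sum(logits[j] - log_z for j in positives) / len(positives)
        counted += 1

    return total / counted if counted else 0.0
```

Under these assumptions, the loss is small when same-prompt embeddings point in the same direction and large when positives are far apart, which is the behaviour a contrastive objective of this kind is meant to have.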

arXiv:2306.00983 [pdf, other]

StyleDrop: Text-to-Image Generation in Any Style

Authors: Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan

Abstract: Pre-trained large text-to-image models synthesize impressive images with an appropriate use of text prompts. However, ambiguities inherent in natural language and out-of-distribution effects make it hard to synthesize image styles that leverage a specific design pattern, texture or material. In this paper, we introduce StyleDrop, a method that enables the synthesis of images that faithfully follow a specific style using a text-to-image model. The proposed method is extremely versatile and captures nuances and details of a user-provided style, such as color schemes, shading, design patterns, and local and global effects. It efficiently learns a new style by fine-tuning very few trainable parameters (less than $1\%$ of total model parameters) and improving the quality via iterative training with either human or automated feedback. Better yet, StyleDrop is able to deliver impressive results even when the user supplies only a single image that specifies the desired style. An extensive study shows that, for the task of style tuning text-to-image models, StyleDrop implemented on Muse convincingly outperforms other methods, including DreamBooth and textual inversion on Imagen or Stable Diffusion. More results are available at our project website: https://styledrop.github.io

Submitted 1 June, 2023; originally announced June 2023.

Comments: Preprint. Project page at https://styledrop.github.io

arXiv:2306.00763 [pdf, other]

Learning Disentangled Prompts for Compositional Image Synthesis

Authors: Kihyuk Sohn, Albert Shaw, Yuan Hao, Han Zhang, Luisa Polania, Huiwen Chang, Lu Jiang, Irfan Essa

Abstract: We study domain-adaptive image synthesis, the problem of teaching pretrained image generative models a new style or concept from as few as one image to synthesize novel images, to better understand compositional image synthesis. We present a framework that leverages a pretrained class-conditional generation model and visual prompt tuning. Specifically, we propose a novel source class distilled visual prompt that learns disentangled prompts of semantic (e.g., class) and domain (e.g., style) from a few images. The learned domain prompt is then used to synthesize images of any class in the style of the target domain. We conduct studies on various target domains with the number of images ranging from one to a few to many, and present qualitative results that show the compositional generalization of our method. Moreover, we show that our method can help improve zero-shot domain adaptation classification accuracy.

Submitted 1 June, 2023; originally announced June 2023.

Comments: tech report

arXiv:2305.17896 [pdf, other]

Continuous and Noninvasive Measurement of Arterial Pulse Pressure and Pressure Waveform using an Image-free Ultrasound System

Authors: Lirui Xu, Pang Wu, Pan Xia, Fanglin Geng, Peng Wang, Xianxiang Chen, Zhenfeng Li, Lidong Du, Shu** Liu, Li Li, Hongbo Chang, Zhen Fang

Abstract: The beat-to-beat local pulse pressure (PP) and blood pressure waveform of arteries, especially central arteries, are important indicators of the course of cardiovascular diseases (CVDs). Nevertheless, measuring them noninvasively remains a challenge in the clinic. This work presents a three-element image-free ultrasound system with a low-computational method for real-time measurement of local pulse wave velocity (PWV) and diameter waveforms, enabling real-time, noninvasive, and continuous measurement of PP and blood pressure waveforms without calibration. The developed system has been validated in vitro and in vivo. In in vitro cardiovascular phantom experiments, the results demonstrated high accuracy in the measurement of PP (error < 3 mmHg) and blood pressure waveform (root-mean-square error (RMSE) < 2 mmHg, correlation coefficient (r) > 0.99). In subsequent human carotid experiments, the system was compared with an arterial tonometer and showed excellent PP accuracy (mean absolute error (MAE) = 3.7 ± 3.4 mmHg) and pressure waveform similarity (RMSE = 3.7 ± 1.6 mmHg, r = 0.98 ± 0.01). Furthermore, comparative experiments with the volume clamp device demonstrated the system's ability to accurately trace blood pressure changes (induced by deep breathing) over a period of one minute, with the MAE of DBP, MAP, and SBP within 5 ± 8 mmHg.
The present results demonstrate the accuracy and reliability of the developed system for continuous and noninvasive measurement of arterial PP and blood pressure waveforms, with potential applications in the diagnosis and prevention of CVDs.

Submitted 29 May, 2023; originally announced May 2023.

Comments: 13 pages, 12 figures

arXiv:2305.12289 [pdf, other]

Revisiting the Architectures like Pointer Networks to Efficiently Improve the Next Word Distribution, Summarization Factuality, and Beyond

Authors: Haw-Shiuan Chang, Zonghai Yao, Alolika Gon, Hong Yu, Andrew McCallum

Abstract: Is the output softmax layer, which is adopted by most language models (LMs), always the best way to compute the next word probability? Given so many attention layers in a modern transformer-based LM, are the pointer networks redundant nowadays? In this study, we discover that the answers to both questions are no. This is because the softmax bottleneck sometimes prevents the LMs from predicting the desired distribution and the pointer networks can be used to break the bottleneck efficiently. Based on this finding, we propose several softmax alternatives by simplifying the pointer networks and accelerating the word-by-word rerankers. In GPT-2, our proposals are significantly better and more efficient than mixture of softmax, a state-of-the-art softmax alternative. In summarization experiments, without significantly decreasing its training/testing speed, our best method based on T5-Small improves the factCC score by 2 points on the CNN/DM and XSUM datasets, and improves MAUVE scores by 30% on the BookSum paragraph-level dataset.

Submitted 20 May, 2023; originally announced May 2023.

Comments: ACL Findings 2023

arXiv:2305.11442 [pdf, other]

Zero-Shot Text Classification via Self-Supervised Tuning

Authors: Chaoqun Liu, Wenxuan Zhang, Guizhen Chen, Xiaobao Wu, Anh Tuan Luu, Chip Hong Chang, Lidong Bing

Abstract: Existing solutions to zero-shot text classification either conduct prompting with pre-trained language models, which is sensitive to the choices of templates, or rely on large-scale annotated data of relevant tasks for meta-tuning. In this work, we propose a new paradigm based on self-supervised learning to solve zero-shot text classification tasks by tuning the language models with unlabeled data, called self-supervised tuning. By exploring the inherent structure of free texts, we propose a new learning objective called first sentence prediction to bridge the gap between unlabeled data and text classification tasks. After tuning the model to learn to predict the first sentence in a paragraph based on the rest, the model is able to conduct zero-shot inference on unseen tasks such as topic classification and sentiment analysis. Experimental results show that our model outperforms the state-of-the-art baselines on 7 out of 10 tasks. Moreover, the analysis reveals that our model is less sensitive to the prompt design. Our code and pre-trained models are publicly available at https://github.com/DAMO-NLP-SG/SSTuning.

Submitted 25 May, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

Comments: Accepted to the Findings of ACL 2023

arXiv:2305.11237 [pdf, other]

DRL meets DSA Networks: Convergence Analysis and Its Application to System Design

Authors: Ramin Safavinejad, Hao-Hsuan Chang, Lingjia Liu

Abstract: In dynamic spectrum access (DSA) networks, secondary users (SUs) need to opportunistically access primary users' (PUs) radio spectrum without causing significant interference. Since the interaction between the SU and the PU systems is limited, deep reinforcement learning (DRL) has been introduced to help SUs conduct spectrum access. Specifically, deep recurrent Q network (DRQN) has been utilized in DSA networks for SUs to aggregate the information from recent experiences to make spectrum access decisions. DRQN is notorious for its poor sample efficiency, in the sense that it needs a rather large number of training data samples to tune its parameters, which is a computationally demanding task. In our recent work, deep echo state network (DEQN) has been introduced to DSA networks to address the sample efficiency issue of DRQN. In this paper, we analytically show that DEQN requires fewer training samples than DRQN to converge to the best policy. Furthermore, we introduce a method to determine the right hyperparameters for the DEQN, providing system design guidance for DEQN-based DSA networks. Extensive performance evaluation confirms that the DEQN-based DSA strategy is the superior choice with regard to computational power while outperforming DRQN-based DSA strategies.

Submitted 18 May, 2023; originally announced May 2023.

arXiv:2305.11072 [pdf, other]

Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering

Authors: Heng-Jui Chang, Alexander H. Liu, James Glass

Abstract: Self-supervised speech representation models have succeeded in various tasks, but improving them for content-related problems using unlabeled data is challenging. We propose speaker-invariant clustering (Spin), a novel self-supervised learning method that clusters speech representations and performs swapped prediction between the original and speaker-perturbed utterances. Spin disentangles speaker information and preserves content representations with just 45 minutes of fine-tuning on a single GPU. Spin improves pre-trained networks and outperforms prior methods in speech recognition and acoustic unit discovery.

Submitted 18 May, 2023; originally announced May 2023.

Comments: Accepted to Interspeech 2023

arXiv:2305.10621 [pdf, other]

TSoR: TCP Socket over RDMA Container Network for Cloud Native Computing

Authors: Yulin Sun, Qingming Qu, Chenxingyu Zhao, Arvind Krishnamurthy, Hong Chang, Ying Xiong

Abstract: Cloud-native containerized applications constantly seek high-performance and easy-to-operate container network solutions. RDMA network is a potential enabler with higher throughput and lower latency than the standard TCP/IP network stack. However, several challenges remain in equipping containerized applications with RDMA network: 1) How to deliver transparent improvements without modifying application code; 2) How to integrate RDMA-based network solutions with container orchestration systems; 3) How to efficiently utilize RDMA for container networks. In this paper, we present an RDMA-based container network solution, TCP Socket over RDMA (TSoR), which addresses all the above challenges. To transparently accelerate applications using POSIX socket interfaces without modifications, we integrate TSoR with a container runtime that can intercept system calls for socket interfaces. To be compatible with orchestration systems like Kubernetes, TSoR implements a container network following the Kubernetes network model and satisfies all requirements of the model. To leverage RDMA benefits, TSoR designs a high-performance network stack that efficiently transfers TCP traffic using RDMA network. Thus, TSoR provides a turn-key solution for existing Kubernetes clusters to adopt the high-performance RDMA network with minimal effort. Our evaluation results show that TSoR provides up to 2.3x higher throughput and 64% lower latency for existing containerized applications, such as Redis key-value store and Node.js web server, with no code changes.
TSoR code will be open-sourced.

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2305.10005 [pdf, other]

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

Authors: Alexander H. Liu, Heng-Jui Chang, Michael Auli, Wei-Ning Hsu, James R. Glass

Abstract: In this paper, we introduce self-distillation and online clustering for self-supervised speech representation learning (DinoSR) which combines masked language modeling, self-distillation, and online clustering. We show that these concepts complement each other and result in a strong representation learning model for speech. DinoSR first extracts contextualized embeddings from the input audio with a teacher network, then runs an online clustering system on the embeddings to yield a machine-discovered phone inventory, and finally uses the discretized tokens to guide a student network. We show that DinoSR surpasses previous state-of-the-art performance in several downstream tasks, and provide a detailed analysis of the model and the learned discrete units.

Submitted 16 January, 2024; v1 submitted 17 May, 2023; originally announced May 2023.

arXiv:2305.09906 [pdf, ps, other]

Fast computation of exact confidence intervals for randomized experiments with binary outcomes

Authors: P. M. Aronow, Haoge Chang, Patrick Lopatto

Abstract: Given a randomized experiment with binary outcomes, exact confidence intervals for the average causal effect of the treatment can be computed through a series of permutation tests. This approach requires minimal assumptions and is valid for all sample sizes, as it does not rely on large-sample approximations such as the central limit theorem. We show that these confidence intervals can be found in $O(n \log n)$ permutation tests in the case of balanced designs, where the treatment and control groups have equal sizes, and $O(n^2)$ permutation tests in the general case. Prior to this work, the most efficient known constructions required $O(n^2)$ such tests in the balanced case [Li and Ding, 2016], and $O(n^4)$ tests in the general case [Rigdon and Hudgens, 2015]. Our results thus facilitate exact inference as a viable option for randomized experiments far larger than those accessible by previous methods.

Submitted 16 May, 2023; originally announced May 2023.

Comments: 37 pages
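The abstract above builds confidence intervals from a series of permutation tests. As a rough illustration of that building block only, the sketch below runs a single exact randomization test of the sharp null hypothesis of no treatment effect by enumerating every possible assignment. It does not reproduce the paper's fast interval construction; the function name, the difference-in-means statistic, and the two-sided p-value convention are assumptions chosen for this example.

```python
from itertools import combinations


def exact_randomization_p_value(outcomes, treated_idx):
    """Two-sided exact p-value under the sharp null of no effect.

    outcomes: list of 0/1 outcomes, one per unit.
    treated_idx: indices of the units that actually received treatment
                 (assumed to satisfy 0 < len(treated_idx) < len(outcomes)).
    Hypothetical helper: enumerates all assignments with the same number
    of treated units, so it is only practical for small experiments.
    """
    n = len(outcomes)
    m = len(treated_idx)
    outcome_sum = sum(outcomes)

    def diff_in_means(idx):
        treated_sum = sum(outcomes[i] for i in idx)
        control_sum = outcome_sum - treated_sum
        return treated_sum / m - control_sum / (n - m)

    observed = abs(diff_in_means(treated_idx))
    at_least_as_extreme = 0
    total = 0
    for idx in combinations(range(n), m):
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(diff_in_means(idx)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total
```

Inverting tests of this kind over a grid of hypothesised effects is the general idea behind an exact interval; the cost of the brute-force enumeration here is what makes a reduction in the number of required tests valuable.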

arXiv:2305.09712 [pdf, other]

doi 10.1103/PhysRevD.108.036006

Gravitational-wave signatures from reheating

Authors: Manuel A. Buen-Abad, Jae Hyeok Chang, Anson Hook

Abstract: We initiate a study of the gravitational-wave signatures of a phase transition that occurs as the Universe's temperature increases during reheating. The gravitational-wave signatures of such a heating phase transition are different from those of a cooling phase transition, and their detection could allow us to probe reheating. In the lucky case that the gravitational-wave signatures from both the heating and cooling phase transitions were to be observed, information about reheating could in principle be obtained utilizing the correlations between the two transitions. Frictional effects, leading to a constant bubble-wall speed in one case, will instead behave as an "antifriction" force in the other and accelerate the bubble wall. This antifriction will often take the bubble into a runaway regime, significantly enhancing the amplitude of the heating phase transition gravitational-wave signal. The efficiency, strength, and duration of the phase transitions will be similarly correlated in a reheating-dependent way.

Submitted 2 October, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

Comments: 13 pages + references and appendices, 15 figures. v2: version published in PRD. v3: fixed typo in some captions

arXiv:2305.04414 [pdf, ps, other]

Untrained Neural Network based Bayesian Detector for OTFS Modulation Systems

Authors: Hao Chang, Alva Kosasih, Wibowo Hardjawana, Xinwei Qu, Branka Vucetic

Abstract: The orthogonal time frequency space (OTFS) symbol detector design for high mobility communication scenarios has received considerable attention lately. Current state-of-the-art OTFS detectors can mainly be divided into two categories: iterative and training-based deep neural network (DNN) detectors. Many practical iterative detectors rely on a minimum-mean-square-error (MMSE) denoiser to get the initial symbol estimates. However, their computational complexity increases exponentially with the number of detected symbols. Training-based DNN detectors typically suffer from dependency on the availability of large computation resources and the fidelity of synthetic datasets for the training phase, which are both costly. In this paper, we propose an untrained DNN based on the deep image prior (DIP) and decoder architecture, referred to as D-DIP, that replaces the MMSE denoiser in the iterative detector. DIP is a type of DNN that requires no training, which makes it beneficial in OTFS detector design. Then we propose to combine the D-DIP denoiser with the Bayesian parallel interference cancellation (BPIC) detector to perform iterative symbol detection, referred to as D-DIP-BPIC. Our simulation results show that the symbol error rate (SER) performance of the proposed D-DIP-BPIC detector outperforms practical state-of-the-art detectors by 0.5 dB and retains low computational complexity.

Submitted 7 May, 2023; originally announced May 2023.

arXiv:2305.03749 [pdf, other]

Spectral distortions of astrophysical blackbodies as axion probes

Authors: Jae Hyeok Chang, Reza Ebadi, Xuheng Luo, Erwin H. Tanin

Abstract: Recent studies reveal that more than a dozen white dwarfs displaying near-perfect blackbody spectra in the optical range have been lurking in the Sloan Digital Sky Survey catalog. We point out that, in a way analogous to the Cosmic Microwave Background, these stars serve as excellent testbeds for new physics. Specifically, we show how their observed lack of spectral distortions translates into limits on the parameter space of axions with electromagnetic coupling. The prospects for future improvements are also discussed.

Submitted 5 May, 2023; originally announced May 2023.

Comments: 14 pages, 8 figures

Report number: UMD-PP-022-05

arXiv:2305.03259 [pdf, other]

Clothes Grasping and Unfolding Based on RGB-D Semantic Segmentation

Authors: Xingyu Zhu, Xin Wang, Jonathan Freer, Hyung ** Chang, Yixing Gao

Abstract: Clothes grasping and unfolding is a core step in robotic-assisted dressing. Most existing works leverage depth images of clothes to train a deep learning-based model to recognize suitable grasping points. These methods often utilize physics engines to synthesize depth images to reduce the cost of real labeled data collection. However, the natural domain gap between synthetic and real images often leads to poor performance of these methods on real data. Furthermore, these approaches often struggle in scenarios where grasping points are occluded by the clothing item itself. To address the above challenges, we propose a novel Bi-directional Fractal Cross Fusion Network (BiFCNet) for semantic segmentation, enabling recognition of graspable regions in order to provide more possibilities for grasping. Instead of using depth images only, we also utilize RGB images with rich color features as input to our network in which the Fractal Cross Fusion (FCF) module fuses RGB and depth data by considering global complex features based on fractal geometry. To reduce the cost of real data collection, we further propose a data augmentation method based on an adversarial strategy, in which the color and geometric transformations simultaneously process RGB and depth data while maintaining the label correspondence.
Finally, we present a pipeline for clothes grasping and unfolding from the perspective of semantic segmentation, through the addition of a strategy for grasp point selection from segmentation regions based on clothing flatness measures, while taking into account the grasping direction. We evaluate our BiFCNet on the public dataset NYUDv2 and obtain comparable performance to current state-of-the-art models. We also deploy our model on a Baxter robot, running extensive grasping and unfolding experiments as part of our ablation studies, achieving an 84% success rate.

Submitted 8 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

Comments: This paper is accepted to ICRA 2023

arXiv:2305.03165 [pdf, other]

Understanding the Benefits of Hardware-Accelerated Communication in Model-Serving Applications

Authors: Walid A. Hanafy, Limin Wang, Hyunseok Chang, Sarit Mukherjee, T. V. Lakshman, Prashant Shenoy

Abstract: It is commonly assumed that the end-to-end networking performance of edge offloading is purely dictated by that of the network connectivity between end devices and edge computing facilities, where ongoing innovation in 5G/6G networking can help. However, with the growing complexity of edge-offloaded computation and dynamic load balancing requirements, an offloaded task often goes through a multi-stage pipeline that spans across multiple compute nodes and proxies interconnected via a dedicated network fabric within a given edge computing facility. As the latest hardware-accelerated transport technologies such as RDMA and GPUDirect RDMA are adopted to build such network fabric, there is a need for good understanding of the full potential of these technologies in the context of computation offload and the effect of different factors such as GPU scheduling and characteristics of computation on the net performance gain achievable by these technologies. This paper unveils detailed insights into the latency overhead in typical machine learning (ML)-based computation pipelines and analyzes the potential benefits of adopting hardware-accelerated communication. To this end, we build a model-serving framework that supports various communication mechanisms. Using the framework, we identify performance bottlenecks in state-of-the-art model-serving pipelines and show how hardware-accelerated communication can alleviate them. For example, we show that GPUDirect RDMA can save 15-50% of model-serving latency, which amounts to 70-160 ms.

Submitted 10 July, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

arXiv:2305.02236 [pdf, other]

High-sensitivity extreme-ultraviolet transient absorption spectroscopy enabled by machine learning

Authors: Tobias Heinrich, Hung-Tzu Chang, Sergey Zayko, Murat Sivis, Claus Ropers

Abstract: We introduce a machine-learning-based approach to enhance the sensitivity of optical-extreme ultraviolet (XUV) transient absorption spectroscopy. A reference spectrum is used as input to a three-layer feed-forward neural network, allowing for an efficient elimination of source noise from measurement data. In pump-probe experiments using high-harmonic radiation, we show a more than tenfold improvement in noise suppression in XUV transient absorption spectra compared to conventional referencing. Utilizing strong spectral correlations in the source fluctuations, the network facilitates a pixel-wise noise reduction without the need for wavelength calibration of the reference spectrum. The presented method can be adapted to a wide range of beam lines and enables the investigation of subtle electron and lattice dynamics in the weak excitation regime, relevant for the study of photovoltaics and photoinduced phase transitions of strongly correlated materials.

Submitted 3 May, 2023; originally announced May 2023.

arXiv:2304.11751 [pdf, other]

Score-Based Diffusion Models as Principled Priors for Inverse Imaging

Authors: Berthy T. Feng, Jamie Smith, Michael Rubinstein, Huiwen Chang, Katherine L. Bouman, William T. Freeman

Abstract: Priors are essential for reconstructing images from noisy and/or incomplete measurements. The choice of the prior determines both the quality and uncertainty of recovered images. We propose turning score-based diffusion models into principled image priors ("score-based priors") for analyzing a posterior of images given measurements. Previously, probabilistic priors were limited to handcrafted regularizers and simple distributions. In this work, we empirically validate the theoretically-proven probability function of a score-based diffusion model. We show how to sample from resulting posteriors by using this probability function for variational inference. Our results, including experiments on denoising, deblurring, and interferometric imaging, suggest that score-based priors enable principled inference with a sophisticated, data-driven image prior.

Submitted 28 August, 2023; v1 submitted 23 April, 2023; originally announced April 2023.

Comments: ICCV 2023

arXiv:2304.11119 [pdf, other]

Phase transition in Random Circuit Sampling

Authors: A. Morvan, B. Villalonga, X. Mi, S. Mandrà, A. Bengtsson, P. V. Klimov, Z. Chen, S. Hong, C. Erickson, I. K. Drozdov, J. Chau, G. Laun, R. Movassagh, A. Asfaw, L. T. A. N. Brandão, R. Peralta, D. Abanin, R. Acharya, R. Allen, T. I. Andersen, K. Anderson, M. Ansmann, F. Arute, K. Arya, J. Atalaya, et al. (160 additional authors not shown)

Abstract: Undesired coupling to the surrounding environment destroys long-range correlations on quantum processors and hinders the coherent evolution in the nominally available computational space. This incoherent noise is an outstanding challenge to fully leverage the computation power of near-term quantum processors. It has been shown that benchmarking Random Circuit Sampling (RCS) with Cross-Entropy Benchmarking (XEB) can provide a reliable estimate of the effective size of the Hilbert space coherently available. The extent to which the presence of noise can trivialize the outputs of a given quantum algorithm, i.e. making it spoofable by a classical computation, is an unanswered question. Here, by implementing an RCS algorithm we demonstrate experimentally that there are two phase transitions observable with XEB, which we explain theoretically with a statistical model. The first is a dynamical transition as a function of the number of cycles and is the continuation of the anti-concentration point in the noiseless case. The second is a quantum phase transition controlled by the error per cycle; to identify it analytically and experimentally, we create a weak link model which allows varying the strength of noise versus coherent evolution. Furthermore, by presenting an RCS experiment with 67 qubits at 32 cycles, we demonstrate that the computational cost of our experiment is beyond the capabilities of existing classical supercomputers, even when accounting for the inevitable presence of noise. Our experimental and theoretical work establishes the existence of transitions to a stable computationally complex phase that is reachable with current quantum processors.

Submitted 21 December, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

arXiv:2304.10718 [pdf]

Room-temperature van der Waals 2D ferromagnet switching by spin-orbit torques

Authors: Weihao Li, Wenkai Zhu, Gaojie Zhang, Hao Wu, Shouguo Zhu, Runze Li, Enze Zhang, Xiaomin Zhang, Yongcheng Deng, Jing Zhang, Lixia Zhao, Haixin Chang, Kaiyou Wang

Abstract: Emerging wide varieties of the two-dimensional (2D) van der Waals (vdW) magnets with atomically thin and smooth interfaces holds great promise for next-generation spintronic devices. However, due to the lower Curie temperature of the vdW 2D ferromagnets than room temperature, electrically manipulating its magnetization at room temperature has not been realized. In this work, we demonstrate the perpendicular magnetization of 2D vdW ferromagnet Fe3GaTe2 can be effectively switched at room temperature in Fe3GaTe2/Pt bilayer by spin-orbit torques (SOTs) with a relatively low current density of $1.3\times10^{7}$ A/cm$^2$. Moreover, the high SOT efficiency of ξ_{DL}~0.22 is quantitatively determined by harmonic measurements, which is higher than those in Pt-based heavy metal/conventional ferromagnet devices. Our findings of room-temperature vdW 2D ferromagnet switching by SOTs provide a significant basis for the development of vdW-ferromagnet-based spintronic applications.

Submitted 20 April, 2023; originally announced April 2023.

arXiv:2304.07445 [pdf, other]

A framework for fully autonomous design of materials via multiobjective optimization and active learning: challenges and next steps

Authors: Tyler H. Chang, Jakob R. Elias, Stefan M. Wild, Santanu Chaudhuri, Joseph A. Libera

Abstract: In order to deploy machine learning in a real-world self-driving laboratory where data acquisition is costly and there are multiple competing design criteria, systems need to be able to intelligently sample while balancing performance trade-offs and constraints. For these reasons, we present an active learning process based on multiobjective black-box optimization with continuously updated machine learning models. This workflow is built on open-source technologies for real-time data streaming and modular multiobjective optimization software development. We demonstrate a proof of concept for this workflow through the autonomous operation of a continuous-flow chemistry laboratory, which identifies ideal manufacturing conditions for the electrolyte 2,2,2-trifluoroethyl methyl carbonate.

Submitted 14 April, 2023; originally announced April 2023.

arXiv:2304.06881 [pdf, other]

Designing a Framework for Solving Multiobjective Simulation Optimization Problems

Authors: Tyler H. Chang, Stefan M. Wild

Abstract: Multiobjective simulation optimization (MOSO) problems are optimization problems with multiple conflicting objectives, where evaluation of at least one of the objectives depends on a black-box numerical code or real-world experiment, which we refer to as a simulation. This paper describes the design goals driving the development of the parallel MOSO library ParMOO. We derive these goals from the research trends and real-world requirements that arise when designing and deploying solvers for generic MOSO problems. Our specific design goals were to provide a customizable MOSO framework that allows for exploitation of simulation-based problem structures, ease of deployment in scientific workflows, maintainability, and flexibility in our support for many problem types. We explain how we have achieved these goals in the ParMOO library and provide two examples demonstrating how customized ParMOO solvers can be quickly built and deployed in real-world MOSO problems.

Submitted 6 July, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

arXiv:2304.06818 [pdf, other]

Soundini: Sound-Guided Diffusion for Natural Video Editing

Authors: Seung Hyun Lee, Sieun Kim, Innfarn Yoo, Feng Yang, Donghyeon Cho, Youngseo Kim, Huiwen Chang, Jinkyu Kim, Sangpil Kim

Abstract: We propose a method for adding sound-guided visual effects to specific regions of videos with a zero-shot setting. Animating the appearance of the visual effect is challenging because each frame of the edited video should have visual changes while maintaining temporal consistency. Moreover, existing video editing solutions focus on temporal consistency across frames, ignoring the visual style variations over time, e.g., thunderstorm, wave, fire crackling. To overcome this limitation, we utilize temporal sound features for the dynamic style. Specifically, we guide denoising diffusion probabilistic models with an audio latent representation in the audio-visual latent space. To the best of our knowledge, our work is the first to explore sound-guided natural video editing from various sound sources with sound-specialized properties, such as intensity, timbre, and volume. Additionally, we design optical flow-based guidance to generate temporally consistent video frames, capturing the pixel-wise relationship between adjacent frames. Experimental results show that our method outperforms existing video editing techniques, producing more realistic visual effects that reflect the properties of sound. Please visit our page: https://kuai-lab.github.io/soundini-gallery/.

Submitted 13 April, 2023; originally announced April 2023.

arXiv:2304.06073 [pdf, ps, other]

doi 10.1007/JHEP06(2023)172

Higher derivative couplings of hypermultiplets

Authors: Hao-Yuan Chang, Ergin Sezgin, Yoshiaki Tanii

Abstract: We construct the four-derivative supersymmetric extension of $(1,0), 6D$ supergravity coupled to Yang-Mills and hypermultiplets. The hypermultiplet scalars are taken to parametrize the quaternionic projective space $Hp(n)=Sp(n,1)/Sp(n)\times Sp(1)_R$. The hyperscalar kinetic term is not deformed, and the quaternionic Kähler structure and symmetries of $Hp(n)$ are preserved. The result is a three parameter Lagrangian supersymmetric up to first order in these parameters. Considering the case of $Hp(1)$ we compare our result with that obtained from the compactification of $10D$ heterotic supergravity on four-torus, consistently truncated to $N=(1,0)$, in which the hyperscalars parametrize $SO(4,1)/SO(4)$. We find that depending on how $Sp(1) \subset Sp(1,1)$ is embedded in $SO(4)$, the results agree for a specific value of the parameter that governs the higher derivative hypermultiplet couplings.

Submitted 12 April, 2023; originally announced April 2023.

Comments: 29 pages

Report number: MI-HET-799, STUPP-23-261

arXiv:2304.04168 [pdf, other]

Adversarially Robust Neural Architecture Search for Graph Neural Networks

Authors: Beini Xie, Heng Chang, Ziwei Zhang, Xin Wang, Daixin Wang, Zhiqiang Zhang, Rex Ying, Wenwu Zhu

Abstract: Graph Neural Networks (GNNs) obtain tremendous success in modeling relational data. Still, they are prone to adversarial attacks, which are massive threats to applying GNNs to risk-sensitive domains. Existing defensive methods neither guarantee performance facing new data/tasks or adversarial attacks nor provide insights to understand GNN robustness from an architectural perspective. Neural Architecture Search (NAS) has the potential to solve this problem by automating GNN architecture designs. Nevertheless, current graph NAS approaches lack robust design and are vulnerable to adversarial attacks. To tackle these challenges, we propose a novel Robust Neural Architecture search framework for GNNs (G-RNA). Specifically, we design a robust search space for the message-passing mechanism by adding graph structure mask operations into the search space, which comprises various defensive operation candidates and allows us to search for defensive GNNs. Furthermore, we define a robustness metric to guide the search procedure, which helps to filter robust architectures. In this way, G-RNA helps understand GNN robustness from an architectural perspective and effectively searches for optimal adversarial robust GNNs. Extensive experimental results on benchmark datasets show that G-RNA significantly outperforms manually designed robust GNNs and vanilla graph NAS baselines by 12.1% to 23.4% under adversarial attacks.

Submitted 9 April, 2023; originally announced April 2023.

Comments: Accepted as a conference paper at CVPR 2023

arXiv:2304.02419 [pdf, other]

TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration

Authors: Kehong Gong, Dongze Lian, Heng Chang, Chuan Guo, Zihang Jiang, Xinxin Zuo, Michael Bi Mi, Xinchao Wang

Abstract: We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities. Unlike existing works that generate dance movements using a single modality such as music, our goal is to produce richer dance movements guided by the instructive information provided by the text. However, the lack of paired motion data with both music and text modalities limits the ability to generate dance movements that integrate both. To alleviate this challenge, we propose to utilize a 3D human motion VQ-VAE to project the motions of the two datasets into a latent space consisting of quantized vectors, which effectively mix the motion tokens from the two datasets with different distributions for training. Additionally, we propose a cross-modal transformer to integrate text instructions into motion generation architecture for generating 3D dance movements without degrading the performance of music-conditioned dance generation. To better evaluate the quality of the generated motion, we introduce two novel metrics, namely Motion Prediction Distance (MPD) and Freezing Score (FS), to measure the coherence and freezing percentage of the generated motion. Extensive experiments show that our approach can generate realistic and coherent dance movements conditioned on both text and music while maintaining comparable performance with the two single modalities. Code is available at https://garfield-kh.github.io/TM2D/.

Submitted 1 October, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

Comments: Accepted by ICCV2023

arXiv:2303.18155 [pdf, other]

doi 10.1103/PhysRevB.108.195427

Highly anisotropic optical conductivities in two-dimensional tilted semi-Dirac bands

Authors: Chang-Xu Yan, Chao-Yang Tan, Hong Guo, Hao-Ran Chang

Abstract: Within linear response theory, the absorptive part of highly anisotropic optical conductivities are analytically calculated for distinct tilts in two-dimensional (2D) tilted semi-Dirac bands (SDBs). The transverse optical conductivities always vanish. The interband longitudinal optical conductivities (LOCs) in 2D tilted SDBs differ qualitatively in the power-law scaling of $\omega$ as $\mathrm{Re}\,\sigma_{\perp}^{\mathrm{IB}}(\omega)\propto\sigma_0\sqrt{\omega}$ and $\mathrm{Re}\,\sigma_{\parallel}^{\mathrm{IB}}(\omega)\propto\sigma_0/\sqrt{\omega}$. By contrast, the intraband LOCs in 2D tilted SDBs depend on $\mu$ in the power-law scaling as $\mathrm{Re}\,\sigma_{\perp}^{\mathrm{D}}(\omega)\propto\sigma_0\mu\sqrt{\mu}$ and $\mathrm{Re}\,\sigma_{\parallel}^{\mathrm{D}}(\omega)\propto\sigma_0\mu/\sqrt{\mu}$. The tilt-dependent behaviors of LOCs could qualitatively characterize distinct impact of band tilting in 2D tilted SDBs. In particular, for arbitrary tilt $t$ satisfying $0<t\le 2$, the interband LOCs always possess a robust fixed point at $\omega=2\mu$. The power-law scalings and tilt-dependent behaviors further dictate significant differences in the asymptotic background values and angular dependence of LOCs. Our theoretical predictions should be valid for a broad class of 2D tilted SDB materials, and can also be used to fingerprint 2D tilted SDB from 2D untilted SDB as well as tilted Dirac bands.

Submitted 23 November, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

Comments: 19 pages, 5 figures

Journal ref: Phys. Rev. B 108, 195427 (2023)

arXiv:2303.16395 [pdf, ps, other]

doi 10.1088/1367-2630/ad0fa9

High-fidelity Rydberg controlled-Z gates with optimal pulses

Authors: T. H. Chang, T. N. Wang, H. H. Jen, Y. -C. Chen

Abstract: High-fidelity control-$Z$ ($C_Z$) gates are essential and mandatory to build a large-scale quantum computer. In neutral atoms, the strong dipole-dipole interactions between their Rydberg states make them one of the pioneering platforms to implement $C_Z$ gates. Here we numerically investigate the time-optimal pulses to generate a high-fidelity Rydberg $C_{Z}$ gate in a three-level ladder-type atomic system. By tuning the temporal shapes of Gaussian or segmented pulses, the populations on the intermediate excited states are shown to be suppressed within the symmetric gate operation protocol, which leads to a $C_{Z}$ gate with a high Bell fidelity up to $99.92\%$. These optimized pulses are robust to thermal fluctuations and the excitation field variations. Our results promise a high-fidelity and fast gate operation under amenable and controllable experimental parameters, which goes beyond the adiabatic operation regime under a finite Blockade strength.

Submitted 24 November, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

Comments: 6 figures

Journal ref: New J. Phys. 25, 123007 (2023)

arXiv:2303.15743 [pdf, other]

HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation

Authors: Linfang Zheng, Chen Wang, Yinghan Sun, Esha Dasgupta, Hua Chen, Ales Leonardis, Wei Zhang, Hyung Jin Chang

Abstract: In this paper, we focus on the problem of category-level object pose estimation, which is challenging due to the large intra-category shape variation. 3D graph convolution (3D-GC) based methods have been widely used to extract local geometric features, but they have limitations for complex shaped objects and are sensitive to noise. Moreover, the scale and translation invariant properties of 3D-GC restrict the perception of an object's size and translation information. In this paper, we propose a simple network structure, the HS-layer, which extends 3D-GC to extract hybrid scope latent features from point cloud data for category-level object pose estimation tasks. The proposed HS-layer: 1) is able to perceive local-global geometric structure and global information, 2) is robust to noise, and 3) can encode size and translation information. Our experiments show that the simple replacement of the 3D-GC layer with the proposed HS-layer on the baseline method (GPV-Pose) achieves a significant improvement, with the performance increased by 14.5% on 5d2cm metric and 10.3% on IoU75. Our method outperforms the state-of-the-art methods by a large margin (8.3% on 5d2cm, 6.9% on IoU75) on the REAL275 dataset and runs in real-time (50 FPS).

Submitted 28 March, 2023; originally announced March 2023.

Comments: Accepted by the 2023 IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR)

arXiv:2303.14653 [pdf, other]

SDTracker: Synthetic Data Based Multi-Object Tracking

Authors: Yingda Guan, Zhengyang Feng, Huiying Chang, Kuo Du, Tingting Li, Min Wang

Abstract: We present SDTracker, a method that harnesses the potential of synthetic data for multi-object tracking of real-world scenes in a domain generalization and semi-supervised fashion. First, we use the ImageNet dataset as an auxiliary to randomize the style of synthetic data. With out-of-domain data, we further enforce pyramid consistency loss across different "stylized" images from the same sample to learn domain invariant features. Second, we adopt the pseudo-labeling method to effectively utilize the unlabeled MOT17 training data. To obtain high-quality pseudo-labels, we apply proximal policy optimization (PPO2) algorithm to search confidence thresholds for each sequence. When using the unlabeled MOT17 training set, combined with the pure-motion tracking strategy upgraded via developed post-processing, we finally reach 61.4 HOTA.

Submitted 26 March, 2023; originally announced March 2023.

Comments: cvpr2022 workshop

arXiv:2303.09828 [pdf, other]

Model Reference Gaussian Process Regression: Data-Driven State Feedback Controller

Authors: Hyuntae Kim, Hamin Chang, Hyungbo Shim

Abstract: This paper proposes a data-driven state feedback controller that enables reference tracking for nonlinear discrete-time systems. The controller is designed based on the identified inverse model of the system and a given reference model, assuming that the identification of the inverse model is carried out using only the system's state/input measurements. When its results are provided, we present conditions that guarantee a certain level of reference tracking performance, regardless of the identification method employed for the inverse model. Specifically, when Gaussian process regression (GPR) is used as the identification method, we propose sufficient conditions for the required data by applying some lemmas related to identification errors to the aforementioned conditions, ensuring that the Model reference-GPR (MR-GPR) controller can guarantee a certain level of reference tracking performance. Finally, an example is provided to demonstrate the effectiveness of the MR-GPR controller.

Submitted 17 March, 2023; originally announced March 2023.

Comments: 6pages, 3figures, Submitted to LCSS/CDC 2023

arXiv:2303.08514 [pdf, other]

Deep Learning for Iris Recognition: A Review

Authors: Yimin Yin, Siliang He, Renye Zhang, Hongli Chang, Xu Han, Jinghua Zhang

Abstract: Iris recognition is a secure biometric technology known for its stability and privacy. With no two irises being identical and little change throughout a person's lifetime, iris recognition is considered more reliable and less susceptible to external factors than other biometric recognition methods. Unlike traditional machine learning-based iris recognition methods, deep learning technology does not rely on feature engineering and boasts excellent performance. This paper collects 120 relevant papers to summarize the development of iris recognition based on deep learning. We first introduce the background of iris recognition and the motivation and contribution of this survey. Then, we present the common datasets widely used in iris recognition. After that, we summarize the key tasks involved in the process of iris recognition based on deep learning technology, including identification, segmentation, presentation attack detection, and localization. Finally, we discuss the challenges and potential development of iris recognition. This review provides a comprehensive sight of the research of iris recognition based on deep learning.

Submitted 15 March, 2023; originally announced March 2023.

arXiv:2303.08475 [pdf, other]

Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video

Authors: Runyang Feng, Yixing Gao, Xueqing Ma, Tze Ho Elden Tse, Hyung Jin Chang

Abstract: Temporal modeling is crucial for multi-frame human pose estimation. Most existing methods directly employ optical flow or deformable convolution to predict full-spectrum motion fields, which might incur numerous irrelevant cues, such as a nearby person or background. Without further efforts to excavate meaningful motion priors, their results are suboptimal, especially in complicated spatiotemporal interactions. On the other hand, the temporal difference has the ability to encode representative motion information which can potentially be valuable for pose estimation but has not been fully exploited. In this paper, we present a novel multi-frame human pose estimation framework, which employs temporal differences across frames to model dynamic contexts and engages mutual information objectively to facilitate useful motion information disentanglement. To be specific, we design a multi-stage Temporal Difference Encoder that performs incremental cascaded learning conditioned on multi-stage feature difference sequences to derive informative motion representation. We further propose a Representation Disentanglement module from the mutual information perspective, which can grasp discriminative task-relevant motion signals by explicitly defining useful and noisy constituents of the raw motion features and minimizing their mutual information. These place us to rank No.1 in the Crowd Pose Estimation in Complex Events Challenge on benchmark dataset HiEve, and achieve state-of-the-art performance on three benchmarks PoseTrack2017, PoseTrack2018, and PoseTrack21.

Submitted 8 May, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: This paper is accepted to CVPR 2023

arXiv:2303.05661 [pdf, other]

The multiplexed light storage of Orbital Angular Momentum based on atomic ensembles

Authors: Xin Yang, Hong Chang, Jinwen Wang, Yan Ma, Yun Chen, Shuwei Qiu, Zehao Shen, Chengyuan Wang, Quan Quan, Dong Wei, Haixia Chen, Mingtao Cao, Hong Gao, Fuli Li

Abstract: The improvement of the multi-mode capability of quantum memory can further improve the utilization efficiency of the quantum memory and reduce the requirement of quantum communication for storage units. In this letter, we experimentally investigate the multi-mode light multiplexing storage of orbital angular momentum (OAM) mode based on rubidium vapor, and demultiplexing by a photonic OAM mode splitter which combines a Sagnac loop with two dove prisms. Our results show a mode extinction ratio higher than 80% at 1 μs of storage time. Meanwhile, two OAM modes have been multiplexing stored and demultiplexed in our experimental configuration. We believe the experimental scheme may provide a possibility for high channel capacity and multi-mode quantum multiplexed quantum storage based on atomic ensembles.

Submitted 9 March, 2023; originally announced March 2023.

Showing 151–200 of 935 results for author: Chang, H