-
Recalibrating Gravitational Wave Phenomenological Waveform Model
Authors:
Kelvin K. H. Lam,
Kaze W. K. Wong,
Thomas D. P. Edwards
Abstract:
We investigate the possibility of improving the accuracy of the phenomenological waveform model, IMRPhenomD, by jointly optimizing all the calibration coefficients at once, given a set of numerical relativity (NR) waveforms. When IMRPhenomD was first calibrated to NR waveforms, different parts (i.e., the inspiral, merger, and ringdown) of the waveform were calibrated separately. Using ripple, a library of waveform models compatible with automatic differentiation, we can, for the first time, perform gradient-based optimization on all the waveform coefficients at the same time. This joint optimization process allows us to capture previously ignored correlations between separate parts of the waveform. We found that after recalibration, the median mismatch between the model and NR waveforms decreases by 50%. We further explore how different regions of the source parameter space respond to the optimization procedure. We find that the degree of improvement correlates with the spins of the source. This work shows a promising avenue to help understand and treat systematic error in waveform models.
Submitted 29 June, 2023;
originally announced June 2023.
-
Early Planet Formation in Embedded Disks (eDisk). I. Overview of the Program and First Results
Authors:
Nagayoshi Ohashi,
John J. Tobin,
Jes K. Jørgensen,
Shigehisa Takakuwa,
Patrick Sheehan,
Yuri Aikawa,
Zhi-Yun Li,
Leslie W. Looney,
Jonathan P. Williams,
Yusuke Aso,
Rajeeb Sharma,
Jinshi Sai,
Yoshihide Yamato,
Jeong-Eun Lee,
Kengo Tomida,
Hsi-Wei Yen,
Frankie J Encalada,
Christian Flores,
Sacha Gavino,
Miyu Kido,
Ilseung Han,
Zhe-Yu Daniel Lin,
Suchitra Narayanan,
Nguyen Thi Phuong,
Alejandro Santamaría-Miranda
, et al. (12 additional authors not shown)
Abstract:
We present an overview of the Large Program, ``Early Planet Formation in Embedded Disks (eDisk)'', conducted with the Atacama Large Millimeter/submillimeter Array (ALMA). The ubiquitous detections of substructures, particularly rings and gaps, in protoplanetary disks around T Tauri stars raise the possibility that at least some planet formation may have already started during the embedded stages of star formation. In order to address exactly how and when planet formation is initiated, the program focuses on searching for substructures in disks around 12 Class 0 and 7 Class I protostars in nearby ($<200$ pc) star-forming regions through 1.3 mm continuum observations at a resolution of $\sim7$ au (0.04"). The initial results show that the continuum emission, mostly arising from dust disks around the sample protostars, has relatively few distinctive substructures, such as rings and spirals, in marked contrast to Class II disks. The dramatic difference may suggest that substructures quickly develop in disks when the systems evolve from protostars to Class II sources or, alternatively, that high optical depth of the continuum emission could obscure internal structures. Kinematic information obtained through CO isotopologue lines and other lines reveals the presence of Keplerian disks around protostars, providing us with crucial physical parameters, in particular, the dynamical mass of the central protostars. We describe the background of the eDisk program, the sample selection and their ALMA observations, the data reduction, and also highlight representative first-look results.
Submitted 27 June, 2023;
originally announced June 2023.
-
Smoothed $f$-Divergence Distributionally Robust Optimization
Authors:
Zhenyuan Liu,
Bart P. G. Van Parys,
Henry Lam
Abstract:
In data-driven optimization, sample average approximation (SAA) is known to suffer from the so-called optimizer's curse that causes an over-optimistic evaluation of the solution performance. We argue that a special type of distributionally robust optimization (DRO) formulation offers theoretical advantages in correcting for this optimizer's curse compared to simple ``margin'' adjustments to SAA and other DRO approaches: It attains a statistical bound on the out-of-sample performance, for a wide class of objective functions and distributions, that is nearly tightest in terms of exponential decay rate. This DRO uses an ambiguity set based on a Kullback-Leibler (KL) divergence smoothed by the Wasserstein or Lévy-Prokhorov (LP) distance via a suitable distance optimization. Computationally, we also show that such a DRO, and its generalized versions using smoothed $f$-divergence, are not harder than DRO problems based on $f$-divergence or Wasserstein distances, rendering our DRO formulations both statistically optimal and computationally viable.
Submitted 12 October, 2023; v1 submitted 24 June, 2023;
originally announced June 2023.
-
Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery
Authors:
Hoang Thanh Lam,
Marco Luca Sbodio,
Marcos Martínez Galindo,
Mykhaylo Zayats,
Raúl Fernández-Díaz,
Víctor Valls,
Gabriele Picco,
Cesar Berrospi Ramis,
Vanessa López
Abstract:
Recent research on predicting the binding affinity between drug molecules and proteins uses representations learned, through unsupervised learning techniques, from large databases of molecule SMILES and protein sequences. While these representations have significantly enhanced the predictions, they are usually based on a limited set of modalities, and they do not exploit available knowledge about existing relations among molecules and proteins. In this study, we demonstrate that by incorporating knowledge graphs from diverse sources and modalities into the sequences or SMILES representation, we can further enrich the representation and achieve state-of-the-art results for drug-target binding affinity prediction in the established Therapeutic Data Commons (TDC) benchmarks. We release a set of multimodal knowledge graphs, integrating data from seven public data sources, and containing over 30 million triples. Our intention is to foster additional research to explore how multimodal knowledge enhanced protein/molecule embeddings can improve prediction tasks, including prediction of binding affinity. We also release some pretrained models learned from our multimodal knowledge graphs, along with source code for running standard benchmark tasks for prediction of binding affinity.
Submitted 19 October, 2023; v1 submitted 22 June, 2023;
originally announced June 2023.
-
Optimizer's Information Criterion: Dissecting and Correcting Bias in Data-Driven Optimization
Authors:
Garud Iyengar,
Henry Lam,
Tianyu Wang
Abstract:
In data-driven optimization, the sample performance of the obtained decision typically incurs an optimistic bias against the true performance, a phenomenon commonly known as the Optimizer's Curse and intimately related to overfitting in machine learning. Common techniques to correct this bias, such as cross-validation, require repeatedly solving additional optimization problems and are therefore computationally expensive. We develop a general bias correction approach, building on what we call Optimizer's Information Criterion (OIC), that directly approximates the first-order bias and does not require solving any additional optimization problems. Our OIC generalizes the celebrated Akaike Information Criterion to evaluate the objective performance in data-driven optimization, which crucially involves not only model fitting but also its interplay with the downstream optimization. As such it can be used for decision selection instead of only model selection. We apply our approach to a range of data-driven optimization formulations comprising empirical and parametric models, their regularized counterparts, and furthermore contextual optimization. Finally, we provide numerical validation on the superior performance of our approach under synthetic and real-world datasets.
Submitted 16 October, 2023; v1 submitted 16 June, 2023;
originally announced June 2023.
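The optimistic bias described in the abstract above can be reproduced numerically. The following is a minimal sketch of the Optimizer's Curse only, not of the OIC correction itself; the function name, the Gaussian cost model, and all parameter values are assumptions made for illustration. Every decision has true expected cost 0, so the average in-sample cost of the selected decision equals the bias.

```python
import numpy as np

def saa_optimistic_bias(num_decisions=10, num_samples=20, num_trials=2000, seed=0):
    """Average in-sample cost of the SAA-selected decision (hypothetical toy setup).

    Every decision has true expected cost 0, so any negative value returned
    here is pure optimistic bias from picking the best-looking sample mean.
    """
    rng = np.random.default_rng(seed)
    # costs[trial, sample, decision]: i.i.d. standard normal costs (assumed model)
    costs = rng.standard_normal((num_trials, num_samples, num_decisions))
    sample_means = costs.mean(axis=1)          # SAA objective for each decision
    in_sample_best = sample_means.min(axis=1)  # value of the SAA minimizer
    return in_sample_best.mean()

if __name__ == "__main__":
    print("bias with 10 decisions:", saa_optimistic_bias())
    print("bias with 1 decision:  ", saa_optimistic_bias(num_decisions=1))
```

With a single decision there is nothing to select, so the estimate stays close to the true cost; the bias appears only once the minimum is taken over several noisy sample means.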
-
Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks
Authors:
Ziyi Huang,
Henry Lam,
Haofeng Zhang
Abstract:
Uncertainty quantification (UQ) is important for reliability assessment and enhancement of machine learning models. In deep learning, uncertainties arise not only from data, but also from the training procedure, which often injects substantial noise and bias. These hinder the attainment of statistical guarantees and, moreover, impose computational challenges on UQ due to the need for repeated network retraining. Building upon the recent neural tangent kernel theory, we create statistically guaranteed schemes to principally \emph{characterize}, and \emph{remove}, the uncertainty of over-parameterized neural networks with very low computation effort. In particular, our approach, based on what we call a procedural-noise-correcting (PNC) predictor, removes the procedural uncertainty by using only \emph{one} auxiliary network that is trained on a suitably labeled dataset, instead of many retrained networks employed in deep ensembles. Moreover, by combining our PNC predictor with suitable light-computation resampling methods, we build several approaches to construct asymptotically exact-coverage confidence intervals using as few as four trained networks without additional overheads.
Submitted 9 November, 2023; v1 submitted 9 June, 2023;
originally announced June 2023.
-
Ground State Degeneracy of Infinite-Component Chern-Simons-Maxwell Theories
Authors:
Xie Chen,
Ho Tat Lam,
Xiuqi Ma
Abstract:
Infinite-component Chern-Simons-Maxwell theories with a periodic $K$ matrix provide abundant examples of gapped and gapless, foliated and non-foliated fracton orders. In this paper, we study the ground state degeneracy of these theories. We show that the ground state degeneracy exhibits various patterns as a function of the linear system size -- the size of the $K$ matrix. It can grow exponentially or polynomially, cycle over finitely many values, or fluctuate erratically inside an exponential envelope. We relate these different patterns of the ground state degeneracy to the roots of the ``determinant polynomial'', a Laurent polynomial, associated to the periodic $K$ matrix. These roots also determine whether the theory is gapped or gapless. Based on the ground state degeneracy, we formulate a necessary condition for a gapped theory to be a foliated fracton order.
Submitted 31 May, 2023;
originally announced June 2023.
-
Revisiting the Reliability of Psychological Scales on Large Language Models
Authors:
Jen-tse Huang,
Wenxuan Wang,
Man Ho Lam,
Eric John Li,
Wenxiang Jiao,
Michael R. Lyu
Abstract:
Recent research has extended beyond assessing the performance of Large Language Models (LLMs) to examining their characteristics from a psychological standpoint, acknowledging the necessity of understanding their behavioral characteristics. The administration of personality tests to LLMs has emerged as a noteworthy area in this context. However, the suitability of employing psychological scales, initially devised for humans, on LLMs is a matter of ongoing debate. Our study aims to determine the reliability of applying personality assessments to LLMs, explicitly investigating whether LLMs demonstrate consistent personality traits. Analyzing responses under 2,500 settings reveals that gpt-3.5-turbo shows consistency in responses to the Big Five Inventory, indicating a high degree of reliability. Furthermore, our research explores the potential of gpt-3.5-turbo to emulate diverse personalities and represent various groups, which is a capability increasingly sought after in social sciences for substituting human participants with LLMs to reduce costs. Our findings reveal that LLMs have the potential to represent different personalities with specific prompt instructions. By shedding light on the personalization of LLMs, our study endeavors to pave the way for future explorations in this field. We have made our experimental results and the corresponding code openly accessible via https://github.com/CUHK-ARISE/LLMPersonality.
Submitted 28 December, 2023; v1 submitted 31 May, 2023;
originally announced May 2023.
-
Short-term Temporal Dependency Detection under Heterogeneous Event Dynamic with Hawkes Processes
Authors:
Yu Chen,
Fengpei Li,
Anderson Schneider,
Yuriy Nevmyvaka,
Asohan Amarasingham,
Henry Lam
Abstract:
Many event sequence data exhibit mutually exciting or inhibiting patterns. Reliable detection of such temporal dependency is crucial for scientific investigation. The de facto model is the Multivariate Hawkes Process (MHP), whose impact function naturally encodes a causal structure in Granger causality. However, the vast majority of existing methods use a direct or nonlinear transform of the standard MHP intensity with a constant baseline, which is inconsistent with real-world data. Under irregular and unknown heterogeneous intensity, capturing temporal dependency is hard, as one struggles to distinguish the effect of mutual interaction from that of intensity fluctuation. In this paper, we address the short-term temporal dependency detection issue. We show that the maximum likelihood estimation (MLE) for cross-impact from MHP has an error that cannot be eliminated but may be reduced by an order of magnitude, using the heterogeneous intensity not of the target HP but of the interacting HP. We then propose a robust and computationally efficient method modified from MLE that does not rely on the prior estimation of the heterogeneous intensity and is thus applicable in a data-limited regime (e.g., few-shot, no repeated observations). Extensive experiments on various datasets show that our method outperforms existing ones by notable margins, with highlighted novel applications in neuroscience.
Submitted 28 May, 2023;
originally announced May 2023.
-
Constrained Proximal Policy Optimization
Authors:
Chengbin Xuan,
Feng Zhang,
Faliang Yin,
Hak-Keung Lam
Abstract:
The problem of constrained reinforcement learning (CRL) holds significant importance as it provides a framework for addressing critical safety satisfaction concerns in the field of reinforcement learning (RL). However, with the introduction of constraint satisfaction, the current CRL methods necessitate the utilization of second-order optimization or primal-dual frameworks with additional Lagrangian multipliers, resulting in increased complexity and inefficiency during implementation. To address these issues, we propose a novel first-order feasible method named Constrained Proximal Policy Optimization (CPPO). By treating the CRL problem as a probabilistic inference problem, our approach integrates the Expectation-Maximization framework to solve it through two steps: 1) calculating the optimal policy distribution within the feasible region (E-step), and 2) conducting a first-order update to adjust the current policy towards the optimal policy obtained in the E-step (M-step). We establish the relationship between the probability ratios and KL divergence to convert the E-step into a convex optimization problem. Furthermore, we develop an iterative heuristic algorithm from a geometric perspective to solve this problem. Additionally, we introduce a conservative update mechanism to overcome the constraint violation issue that occurs in the existing feasible region method. Empirical evaluations conducted in complex and uncertain environments validate the effectiveness of our proposed method, as it performs at least as well as other baselines.
Submitted 23 May, 2023;
originally announced May 2023.
-
Eye-SpatialNet: Spatial Information Extraction from Ophthalmology Notes
Authors:
Surabhi Datta,
Tasneem Kaochar,
Hio Cheng Lam,
Nelly Nwosu,
Luca Giancardo,
Alice Z. Chuang,
Robert M. Feldman,
Kirk Roberts
Abstract:
We introduce an annotated corpus of 600 ophthalmology notes labeled with detailed spatial and contextual information of ophthalmic entities. We extend our previously proposed frame semantics-based spatial representation schema, Rad-SpatialNet, to represent spatial language in ophthalmology text, resulting in the Eye-SpatialNet schema. The spatially-grounded entities are findings, procedures, and drugs. To accurately capture all spatial details, we add some domain-specific elements in Eye-SpatialNet. The annotated corpus contains 1715 spatial triggers, 7308 findings, 2424 anatomies, and 9914 descriptors. To automatically extract the spatial information, we employ a two-turn question answering approach based on the transformer language model BERT. The results are promising, with F1 scores of 89.31, 74.86, and 88.47 for spatial triggers, Figure, and Ground frame elements, respectively. This is the first work to represent and extract a wide variety of clinical information in ophthalmology. Extracting detailed information can benefit ophthalmology applications and research targeted toward disease progression and screening.
Submitted 19 May, 2023;
originally announced May 2023.
-
Geometric local systems on the projective line minus four points
Authors:
Yeuk Hay Joshua Lam,
Daniel Litt
Abstract:
Let $J(m)$ be an $m\times m$ Jordan block with eigenvalue $1$. For $λ\in \mathbb{C}\setminus\{0,1\}$, we explicitly construct all rank $2$ local systems of geometric origin on $\mathbb{P}^1\setminus\{0,1,λ, \infty\}$, with local monodromy conjugate to $J(2)$ at $0,1,λ$ and conjugate to $-J(2)$ at $\infty$. The construction relies on Katz's middle convolution operation. We use our construction to prove two conjectures of Sun-Yang-Zuo (one of which was proven earlier by Lin-Sheng-Wang; the other was proven independently from us by Yang-Zuo).
Submitted 18 May, 2023;
originally announced May 2023.
-
Uncertainty Quantification and Confidence Intervals for Naive Rare-Event Estimators
Authors:
Yuanlu Bai,
Henry Lam
Abstract:
We consider the estimation of rare-event probabilities using sample proportions output by naive Monte Carlo or collected data. Unlike estimators based on variance reduction techniques, this naive estimator does not have an a priori relative efficiency guarantee. On the other hand, due to the recent surge of sophisticated rare-event problems arising in safety evaluations of intelligent systems, efficiency-guaranteed variance reduction may face implementation challenges which, coupled with the availability of computation or data collection power, motivate the use of such a naive estimator. In this paper we study the uncertainty quantification, namely the construction, coverage validity and tightness of confidence intervals, for rare-event probabilities using only sample proportions. In addition to the known normality, Wilson's and exact intervals, we investigate and compare them with two new intervals derived from Chernoff's inequality and the Berry-Esseen theorem. Moreover, we generalize our results to the natural situation where sampling stops by reaching a target number of rare-event hits. Our findings show that the normality and Wilson's intervals are not always valid, but they are close to the newly developed valid intervals in terms of half-width. In contrast, the exact interval is conservative, but safely guarantees the attainment of the nominal confidence level. Our new intervals, while being more conservative than the exact interval, provide useful insights in understanding the tightness of the considered intervals.
Submitted 26 April, 2024; v1 submitted 3 May, 2023;
originally announced May 2023.
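Two of the known intervals named in the abstract above, the normality (Wald) interval and Wilson's interval, have standard textbook formulas that can be sketched in a few lines. This sketch does not implement the paper's new Chernoff-based or Berry-Esseen-based intervals; the function names and the example counts are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def normal_interval(hits, n, alpha=0.05):
    """Normality (Wald) interval for a proportion from `hits` successes in `n` trials."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = hits / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)

def wilson_interval(hits, n, alpha=0.05):
    """Wilson score interval for the same sample proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = hits / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return max(0.0, center - half), min(1.0, center + half)

if __name__ == "__main__":
    # Hypothetical rare-event data: few or zero hits in many samples.
    for hits in (0, 3):
        print(hits, normal_interval(hits, 1000), wilson_interval(hits, 1000))
```

With zero observed hits the normality interval collapses to a single point at 0, while Wilson's interval keeps a positive upper bound, which is one reason the behavior of these intervals in the rare-event regime deserves scrutiny.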
-
Estimate-Then-Optimize versus Integrated-Estimation-Optimization versus Sample Average Approximation: A Stochastic Dominance Perspective
Authors:
Adam N. Elmachtoub,
Henry Lam,
Haofeng Zhang,
Yunfan Zhao
Abstract:
In data-driven stochastic optimization, model parameters of the underlying distribution need to be estimated from data in addition to the optimization task. Recent literature considers integrating the estimation and optimization processes by selecting model parameters that lead to the best empirical objective performance. This integrated approach, which we call integrated-estimation-optimization (IEO), can be readily shown to outperform simple estimate-then-optimize (ETO) when the model is misspecified. In this paper, we show that a reverse behavior appears when the model class is well-specified and there is sufficient data. Specifically, for a general class of nonlinear stochastic optimization problems, we show that simple ETO outperforms IEO asymptotically when the model class covers the ground truth, in the strong sense of stochastic dominance of the regret. Namely, the entire distribution of the regret, not only its mean or other moments, is always better for ETO compared to IEO. Our results also apply to constrained, contextual optimization problems where the decision depends on observed features. Whenever applicable, we also demonstrate how standard sample average approximation (SAA) performs the worst when the model class is well-specified in terms of regret, and best when it is misspecified. Finally, we provide experimental results to support our theoretical comparisons and illustrate when our insights hold in finite-sample regimes and under various degrees of misspecification.
Submitted 6 August, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Thermal Models of Asteroids with Two-band Combinations of Wide-field Infrared Survey Explorer Cryogenic Data
Authors:
Emily A. Whittaker,
Jean-Luc Margot,
Adrian L. H. Lam,
Nathan Myhrvold
Abstract:
We used the reparameterized Near-Earth Asteroid Thermal Model to model observations of a curated set of over 4000 asteroids from the Wide-field Infrared Survey Explorer in two wavelength bands (W2-3 or W3-4) and compared the results to previous results from all four wavelength bands (W1-4). This comparison was done with the goal of elucidating unique aspects of modeling two-band observations so that any potential biases or shortcomings for planned two-band surveys (e.g., the NASA Near-Earth Object Surveyor Mission) can be anticipated and quantified. The W2-3 two-band fits usually yielded slightly smaller diameters than the four-band fits, with a median diameter difference of -10%, with the 5% and 95% quantiles of the distribution at -32% and -1.5%, respectively. We conducted similar comparisons for W3-4, in part because the longest wavelength bands are expected to provide the best two-band results. We found that the W3-4 two-band diameters are slightly larger than the four-band results, with a median diameter difference of 11% and the 5% and 95% quantiles of the distribution at -2.1% and 26%, respectively. The diameter uncertainty, obtained with bootstrap analysis, is larger by 30% and 35% (median values) for the W2-3 and W3-4 fits, respectively, than for the corresponding four-band fits. Using 23 high-quality stellar occultation diameters as a benchmark, we found that the median errors of W2-3 and W3-4 diameter estimates are -15% and +12%, respectively, whereas the median error of the four-band fits is 9.3%. Although the W2-3 and W3-4 diameters appear to have greater systematic errors and uncertainties than their four-band counterparts, two-band estimates remain useful because they improve upon diameter estimates obtained from visible photometry alone.
Submitted 12 April, 2023;
originally announced April 2023.
-
Randomized low-rank approximation of parameter-dependent matrices
Authors:
Daniel Kressner,
Hei Yin Lam
Abstract:
This work considers the low-rank approximation of a matrix $A(t)$ depending on a parameter $t$ in a compact set $D \subset \mathbb{R}^d$. Application areas that give rise to such problems include computational statistics and dynamical systems. Randomized algorithms are an increasingly popular approach for performing low-rank approximation and they usually proceed by multiplying the matrix with random dimension reduction matrices (DRMs). Applying such algorithms directly to $A(t)$ would involve different, independent DRMs for every $t$, which is not only expensive but also leads to inherently non-smooth approximations. In this work, we propose to use constant DRMs, that is, $A(t)$ is multiplied with the same DRM for every $t$. The resulting parameter-dependent extensions of two popular randomized algorithms, the randomized singular value decomposition and the generalized Nyström method, are computationally attractive, especially when $A(t)$ admits an affine linear decomposition with respect to $t$. We perform a probabilistic analysis for both algorithms, deriving bounds on the expected value as well as failure probabilities for the approximation error when using Gaussian random DRMs. Both the theoretical results and the numerical experiments show that the use of constant DRMs does not impair their effectiveness; our methods reliably return quasi-best low-rank approximations.
Submitted 17 April, 2024; v1 submitted 24 February, 2023;
originally announced February 2023.
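The constant-DRM idea in the abstract above can be sketched for the randomized SVD (range-finder) variant as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the oversampling value, and the affine test matrix $A(t) = A_0 + t A_1$ are all hypothetical.

```python
import numpy as np

def constant_drm_rsvd(A_of_t, ts, n_cols, rank, oversample=4, seed=0):
    """Approximate A(t) by Q(t) Q(t)^T A(t), reusing ONE Gaussian DRM for every t.

    A_of_t : callable returning the (m x n_cols) matrix A(t)  (hypothetical interface)
    ts     : iterable of parameter values
    """
    rng = np.random.default_rng(seed)
    # A single Gaussian dimension reduction matrix, shared across all parameter values.
    omega = rng.standard_normal((n_cols, rank + oversample))
    approximations = []
    for t in ts:
        A = A_of_t(t)
        Q, _ = np.linalg.qr(A @ omega)   # orthonormal basis containing range(A(t) @ omega)
        approximations.append(Q @ (Q.T @ A))
    return approximations

def make_affine_family(m=50, n=40, r=2, seed=1):
    """Hypothetical test family A(t) = A0 + t * A1 with rank(A(t)) <= 2r."""
    rng = np.random.default_rng(seed)
    A0 = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
    A1 = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
    return lambda t: A0 + t * A1

if __name__ == "__main__":
    A_of_t = make_affine_family()
    ts = np.linspace(0.0, 1.0, 5)
    for t, A_hat in zip(ts, constant_drm_rsvd(A_of_t, ts, n_cols=40, rank=4)):
        A = A_of_t(t)
        print(t, np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```

For an affine family, `A(t) @ omega` equals `A0 @ omega + t * (A1 @ omega)`, so the two sketches could be computed once and recombined for each `t`; the loop above recomputes the product only to keep the example short.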
-
Design and Mechanics of Cable-Driven Rolling Diaphragm Transmission for High-Transparency Robotic Motion
Authors:
Hoi Man Lam,
W. Jared Walker,
Lucas Jonasch,
Dimitri Schreiber,
Michael C. Yip
Abstract:
Applications of rolling diaphragm transmissions for medical and teleoperated robotics are of great interest, due to the low friction of rolling diaphragms combined with the power density and stiffness of hydraulic transmissions. However, the stiffness-enabling pressure preloads can form a tradeoff against bearing loading in some rolling diaphragm layouts, and transmission setup can be difficult. Utilization of cable drives complements the rolling diaphragm transmission's advantages, but maintaining cable tension is crucial for optimal and consistent performance. In this paper, a coaxial opposed rolling diaphragm layout with cable drive and an electronic transmission control system are investigated, with a focus on system reliability and scalability. Mechanical features are proposed which enable force balancing, decoupling of transmission pressure from bearing loads, and maintenance of cable tension. Key considerations and procedures for automation of transmission setup, phasing, and operation are also presented. We also present an analysis of system stiffness to identify key compliance contributors, and conduct experiments to validate prototype design performance.
Submitted 24 February, 2023;
originally announced February 2023.
-
Distributed Learning in Heterogeneous Environment: federated learning with adaptive aggregation and computation reduction
Authors:
Jingxin Li,
Toktam Mahmoodi,
Hak-Keung Lam
Abstract:
Although federated learning has achieved many breakthroughs recently, the heterogeneous nature of the learning environment greatly limits its performance and hinders its real-world applications. The heterogeneous data, time-varying wireless conditions and computing-limited devices are three main challenges, which often result in an unstable training process and degraded accuracy. Herein, we propose strategies to address these challenges. Targeting the heterogeneous data distribution, we propose a novel adaptive mixing aggregation (AMA) scheme that mixes the model updates from previous rounds with current rounds to avoid large model shifts and thus maintain training stability. We further propose a novel staleness-based weighting scheme for the asynchronous model updates caused by the dynamic wireless environment. Lastly, we propose a novel CPU-friendly computation-reduction scheme based on transfer learning by sharing the feature extractor (FES) and letting the computing-limited devices update only the classifier. The simulation results show that the proposed framework outperforms existing state-of-the-art solutions and increases the test accuracy and training stability by up to 2.38% and 93.10%, respectively. Additionally, the proposed framework can tolerate communication delay of up to 15 rounds under a moderate delay environment without significant accuracy degradation.
Submitted 16 February, 2023;
originally announced February 2023.
-
ripple: Differentiable and Hardware-Accelerated Waveforms for Gravitational Wave Data Analysis
Authors:
Thomas D. P. Edwards,
Kaze W. K. Wong,
Kelvin K. H. Lam,
Adam Coogan,
Daniel Foreman-Mackey,
Maximiliano Isi,
Aaron Zimmerman
Abstract:
We propose the use of automatic differentiation through the programming framework jax for accelerating a variety of analysis tasks throughout gravitational wave (GW) science. Firstly, we demonstrate that complete waveforms which cover the inspiral, merger, and ringdown of binary black holes (i.e. IMRPhenomD) can be written in jax and demonstrate that the serial evaluation speed of the waveform (and its derivative) is similar to the lalsuite implementation in C. Moreover, jax allows for GPU-accelerated waveform calls which can be over an order of magnitude faster than serial evaluation on a CPU. We then focus on three applications where efficient and differentiable waveforms are essential. Firstly, we demonstrate how gradient descent can be used to optimize the $\sim 200$ coefficients that are used to calibrate the waveform model. In particular, we demonstrate that the typical match with numerical relativity waveforms can be improved by more than 50% without any additional overhead. Secondly, we show that Fisher forecasting calculations can be sped up by $\sim 100\times$ (on a CPU) with no loss in accuracy. This increased speed makes population forecasting substantially simpler. Finally, we show that gradient-based samplers like Hamiltonian Monte Carlo lead to significantly reduced autocorrelation values when compared to traditional Monte Carlo methods. Since differentiable waveforms have substantial advantages for a variety of tasks throughout GW science, we propose that waveform developers use jax to build new waveforms moving forward. Our waveform code, ripple, can be found at https://github.com/tedwards2412/ripple, and will continue to be updated with new waveforms as they are implemented.
Submitted 8 February, 2023;
originally announced February 2023.
-
Multi-Task Deep Recommender Systems: A Survey
Authors:
Yuhao Wang,
Ha Tsz Lam,
Yi Wong,
Ziru Liu,
Xiangyu Zhao,
Yichao Wang,
Bo Chen,
Huifeng Guo,
Ruiming Tang
Abstract:
Multi-task learning (MTL) aims at learning related tasks in a unified model to achieve mutual improvement among tasks considering their shared knowledge. It is an important topic in recommendation due to the demand for multi-task prediction considering performance and efficiency. Although MTL has been well studied and developed, there is still a lack of systematic review in the recommendation community. To fill the gap, we provide a comprehensive review of existing multi-task deep recommender systems (MTDRS) in this survey. To be specific, the problem definition of MTDRS is first given, and it is compared with other related areas. Next, the development of MTDRS is depicted and the taxonomy is introduced from the task relation and methodology aspects. Specifically, the task relation is categorized into parallel, cascaded, and auxiliary with main, while the methodology is grouped into parameter sharing, optimization, and training mechanism. The survey concludes by summarizing the application and public datasets of MTDRS and highlighting the challenges and future directions of the field.
Submitted 8 February, 2023; v1 submitted 7 February, 2023;
originally announced February 2023.
-
A Distributionally Robust Optimization Framework for Extreme Event Estimation
Authors:
Yuanlu Bai,
Henry Lam,
Xinyu Zhang
Abstract:
Conventional methods for extreme event estimation rely on well-chosen parametric models asymptotically justified from extreme value theory (EVT). These methods, while powerful and theoretically grounded, can encounter a difficult bias-variance tradeoff that is exacerbated when the data size is small, which degrades the reliability of the tail estimation. In this paper, we study a framework based on the recently surging literature of distributionally robust optimization. This approach can be viewed as a nonparametric alternative to conventional EVT, by imposing a general shape belief on the tail instead of a parametric assumption and using worst-case optimization as a resolution to handle the nonparametric uncertainty. We explain how this approach bypasses the bias-variance tradeoff in EVT. On the other hand, we face a conservativeness-variance tradeoff, and we describe how to tackle it. We also demonstrate computational tools for the involved optimization problems and compare our performance with conventional EVT across a range of numerical examples.
Submitted 3 January, 2023;
originally announced January 2023.
-
Decreasing behavior of the depth functions of edge ideals
Authors:
Ha Thi Thu Hien,
Ha Minh Lam,
Ngo Viet Trung
Abstract:
Let $I$ be the edge ideal of a connected non-bipartite graph and $R$ the base polynomial ring. Then $\operatorname{depth} R/I \ge 1$ and $\operatorname{depth} R/I^t = 0$ for $t \gg 1$. We give combinatorial conditions for $\operatorname{depth} R/I^t = 1$ for some $t$ in between and show that the depth function is non-increasing thereafter. In particular, the depth function quickly decreases to 0 after reaching 1. We show that if $\operatorname{depth} R/I = 1$ then $\operatorname{depth} R/I^2 = 0$ and if $\operatorname{depth} R/I^2 = 1$ then $\operatorname{depth} R/I^5 = 0$. Other similar results suggest that if $\operatorname{depth} R/I^t = 1$ then $\operatorname{depth} R/I^{t+3} = 0$. This is a surprising phenomenon because the depth of a power can determine a smaller depth of another power. Furthermore, we are able to give a simple combinatorial criterion for $\operatorname{depth} R/I^{(t)} = 1$ for $t \gg 1$ and show that the condition $\operatorname{depth} R/I^{(t)} = 1$ is persistent, where $I^{(t)}$ denotes the $t$-th symbolic power of $I$.
Submitted 22 January, 2023; v1 submitted 30 December, 2022;
originally announced December 2022.
-
Obstructions to Gapped Phases from Non-Invertible Symmetries
Authors:
Anuj Apte,
Clay Cordova,
Ho Tat Lam
Abstract:
Quantum systems in 3+1-dimensions that are invariant under gauging a one-form symmetry enjoy novel non-invertible duality symmetries encoded by topological defects. These symmetries are renormalization group invariants which constrain dynamics. We show that such non-invertible symmetries often forbid a symmetry-preserving vacuum state with a gapped spectrum. In particular, we prove that a self-dual theory with $\mathbb{Z}_{N}^{(1)}$ one-form symmetry is gapless or spontaneously breaks the self-duality symmetry unless $N=k^{2}\ell$ where $-1$ is a quadratic residue modulo $\ell$. We also extend these results to non-invertible symmetries arising from invariance under more general gauging operations including e.g. triality symmetries. Along the way, we discover how duality defects in symmetry protected topological phases have a hidden time-reversal symmetry that organizes their basic properties. These non-invertible symmetries are realized in lattice gauge theories, which serve to illustrate our results.
Submitted 30 December, 2022;
originally announced December 2022.
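The arithmetic condition quoted in the abstract above, $N = k^{2}\ell$ with $-1$ a quadratic residue modulo $\ell$, is easy to test for small $N$. The sketch below is only a brute-force check under one reading of that condition: it asks whether $x^2 \equiv -1 \pmod{\ell}$ is solvable and treats $\ell = 1$ as trivially solvable. The function names are hypothetical.

```python
def minus_one_is_square_mod(ell):
    """True if x*x = -1 (mod ell) has a solution; ell = 1 is trivially solvable."""
    return any((x * x + 1) % ell == 0 for x in range(ell))

def obstruction_is_evaded(N):
    """True if N = k*k*ell for some k >= 1 with -1 a square modulo ell.

    Under the reading assumed here, these are the values of N for which
    the obstruction stated in the abstract does not apply.
    """
    k = 1
    while k * k <= N:
        if N % (k * k) == 0 and minus_one_is_square_mod(N // (k * k)):
            return True
        k += 1
    return False

# Values of N up to 20 that satisfy the condition.
allowed = [N for N in range(1, 21) if obstruction_is_evaded(N)]
```

For instance, $N = 3$ fails the test (no square is congruent to $-1$ modulo 3), while $N = 18 = 3^2 \cdot 2$ passes with $k = 3$, $\ell = 2$.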
-
On irreducibility of modules of Whittaker type: twisted modules and nonabelian orbifolds
Authors:
Drazen Adamovic,
Ching Hung Lam,
Veronika Pedic Tomic,
Nina Yu
Abstract:
In arXiv:1811.04649, we extended the Dong-Mason theorem on irreducibility of modules for cyclic orbifold vertex algebras to the entire category of weak modules and applied this result to Whittaker modules. In this paper we present further generalizations of these results for nonabelian orbifolds of vertex operator superalgebras. Let $V$ be a vertex superalgebra with a countable dimension and let $G$ be a finite subgroup of $\mathrm{Aut}(V)$. Assume that $h\in Z(G)$ where $Z(G)$ is the center of the group $G$. For any irreducible $h$-twisted (weak) $V$-module $M$, we prove that if $M\not\cong g\circ M$ for all $g\in G$ then $M$ is also irreducible as $V^G$-module. We also apply this result to examples and give irreducibility of modules of Whittaker type for orbifolds of Neveu-Schwarz vertex superalgebras, Heisenberg vertex algebras, Virasoro vertex operator algebra and Heisenberg-Virasoro vertex algebra.
Submitted 28 December, 2022;
originally announced December 2022.
-
Non-Invertible Gauss Law and Axions
Authors:
Yichul Choi,
Ho Tat Lam,
Shu-Heng Shao
Abstract:
In axion-Maxwell theory at the minimal axion-photon coupling, we find non-invertible 0- and 1-form global symmetries arising from the naive shift and center symmetries. Since the Gauss law is anomalous, there is no conserved, gauge-invariant, and quantized electric charge. Rather, using half higher gauging, we find a non-invertible Gauss law associated with a non-invertible 1-form global symmetry, which is related to the Page charge. These symmetries act invertibly on the axion field and Wilson line, but non-invertibly on the monopoles and axion strings, leading to selection rules related to the Witten effect. We also derive various crossing relations between the defects. The non-invertible 0- and 1-form global symmetries mix with other invertible symmetries in a way reminiscent of a higher-group symmetry. Using this non-invertible higher symmetry structure, we derive universal inequalities on the energy scales where different infrared symmetries emerge in any renormalization group flow to the axion-Maxwell theory. Finally, we discuss implications for the Weak Gravity Conjecture and the Completeness Hypothesis in quantum gravity.
Submitted 17 September, 2023; v1 submitted 8 December, 2022;
originally announced December 2022.
-
Hedging Complexity in Generalization via a Parametric Distributionally Robust Optimization Framework
Authors:
Garud Iyengar,
Henry Lam,
Tianyu Wang
Abstract:
Empirical risk minimization (ERM) and distributionally robust optimization (DRO) are popular approaches for solving stochastic optimization problems that appear in operations management and machine learning. Existing generalization error bounds for these methods depend on either the complexity of the cost function or dimension of the random perturbations. Consequently, the performance of these methods can be poor for high-dimensional problems with complex objective functions. We propose a simple approach in which the distribution of random perturbations is approximated using a parametric family of distributions. This mitigates both sources of complexity; however, it introduces a model misspecification error. We show that this new source of error can be controlled by suitable DRO formulations. Our proposed parametric DRO approach has significantly improved generalization bounds over existing ERM and DRO methods and parametric ERM for a wide variety of settings. Our method is particularly effective under distribution shifts and works broadly in contextual optimization. We also illustrate the superior performance of our approach on both synthetic and real-data portfolio optimization and regression tasks.
Submitted 24 September, 2023; v1 submitted 2 December, 2022;
originally announced December 2022.
-
Determination of 1929 Asteroid Rotation Periods from WISE Data
Authors:
Adrian L. H. Lam,
Jean-Luc Margot,
Emily Whittaker,
Nathan Myhrvold
Abstract:
We used 22 $μ$m (W4) Wide-field Infrared Survey Explorer (WISE) observations of 4420 asteroids to analyze lightcurves and determined spin period estimates for 1929 asteroids. We fit second-order Fourier models at a large number of trial frequencies to the W4 data and analyzed the resulting periodograms. We initially excluded rotational frequencies exceeding 7.57 rotations per day (P < 3.17 hr), which are not sampled adequately by WISE, and periods that exceed twice the WISE observation interval, which is typically 36 hr. Three solutions accurately capture the vast majority of the rotational frequencies in our sample: the best-fit frequency and its mirrors around 3.78 and 7.57 rotations per day. By comparing our solutions to a high-quality control group of 752 asteroid spin periods, we found that one of our solutions is accurate (within 5%) in 88% of the cases. The best-fit, secondary, and tertiary solutions are accurate in 55%, 27%, and 6% of the cases, respectively. We also observed that suppression of aliased solutions was more effective with non-uniform sampling than with quasi-uniform sampling.
Submitted 14 March, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
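The period search described in the abstract above (second-order Fourier models fitted at many trial frequencies) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it uses unweighted least squares, a synthetic noise-perturbed lightcurve with non-uniform sampling, times in days, and frequencies in rotations per day. All function names and the synthetic amplitudes are assumptions.

```python
import numpy as np

def fourier_design(t, freq, order=2):
    """Design matrix for a constant plus `order` Fourier harmonics of `freq`."""
    cols = [np.ones_like(t)]
    for k in range(1, order + 1):
        phase = 2.0 * np.pi * k * freq * t
        cols += [np.cos(phase), np.sin(phase)]
    return np.column_stack(cols)

def periodogram(t, y, freqs, order=2):
    """Residual sum of squares of the Fourier fit at each trial frequency."""
    rss = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        X = fourier_design(t, f, order)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        rss[i] = resid @ resid
    return rss

def synthetic_lightcurve(freq=4.0, n=40, span_days=1.5, noise=0.005, seed=0):
    """Hypothetical two-harmonic lightcurve sampled at random times."""
    rng = np.random.default_rng(seed)
    t = np.sort(rng.uniform(0.0, span_days, size=n))
    y = (0.3 * np.cos(2 * np.pi * freq * t + 0.4)
         + 0.2 * np.cos(4 * np.pi * freq * t + 1.1)
         + rng.normal(0.0, noise, size=n))
    return t, y

t, y = synthetic_lightcurve()
freqs = np.linspace(0.5, 7.5, 1401)        # trial frequencies, rotations per day
best_freq = freqs[np.argmin(periodogram(t, y, freqs))]
```

The best-fit frequency is the trial value with the smallest residual; in practice one would also inspect secondary minima, which is where aliased (mirrored) solutions of the kind mentioned in the abstract would be expected to show up.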
-
Unitary forms for holomorphic vertex operator algebras of central charge $24$
Authors:
Ching Hung Lam
Abstract:
We prove that all holomorphic vertex operator algebras of central charge $24$ with non-trivial weight one subspaces are unitary. The main method is to use the orbifold construction of a holomorphic VOA $V$ of central charge $24$ directly from a Niemeier lattice VOA $V_N$. We show that it is possible to extend the unitary form for the lattice VOA $V_N$ to the holomorphic VOA $V$ by using the orbifold construction and some information of the automorphism group $\mathrm{Aut}(V)$.
Submitted 29 November, 2022;
originally announced November 2022.
-
Gapless Infinite-component Chern-Simons-Maxwell Theories
Authors:
Xie Chen,
Ho Tat Lam,
Xiuqi Ma
Abstract:
The infinite-component Chern-Simons-Maxwell (iCSM) theory is a 3+1D generalization of the 2+1D Chern-Simons-Maxwell theory by including an infinite number of coupled gauge fields. It can be used to describe interesting 3+1D systems. In Phys. Rev. B 105, 195124 (2022), it was used to construct gapped fracton models both within and beyond the foliation framework. In this paper, we study the nontrivial features of gapless iCSM theories. In particular, we find that while gapless 2+1D Maxwell theories are confined and not robust due to monopole effects, gapless iCSM theories are deconfined and robust against all local perturbations and hence represent a robust 3+1D deconfined gapless order. The gaplessness of the gapless iCSM theory can be understood as a consequence of the spontaneous breaking of an exotic one-form symmetry. Moreover, for a subclass of the gapless iCSM theories, we find interesting topological features in the correlation and response of the system. Finally, for this subclass of theories, we propose a fully continuous field theory description of the model that captures all these features.
Submitted 18 November, 2022;
originally announced November 2022.
-
Motivic local systems on curves and Maeda's conjecture
Authors:
Yeuk Hay Joshua Lam
Abstract:
We show that only finitely many complex genus two curves and four punctured spheres admit rank two local systems of geometric origin, and moreover each carries finitely many. This gives further counterexamples to a conjecture of Esnault and Kerz: counterexamples over very general curves were recently obtained by Landesman and Litt. In the second part we prove an analogue of this result in positive characteristic, namely that over $\overline{\mathbb{F}}_p$, only finitely many genus two curves admit non-trivial rank two local systems pulled back from a fixed quaternionic Shimura variety, and the same for $\mathbb{P}^1$ minus four points; conjecturally, every rank two local system arises as such a pullback. This provides results towards Maeda's conjecture on Galois orbits of eigenforms over function fields. The proofs make use of ideas from the work of Landesman and Litt such as isomonodromy, as well as crucially the description of the Goren-Oort strata due to Tian and Xiao.
Submitted 11 November, 2022;
originally announced November 2022.
-
Boundedness of trace fields of rank two local systems
Authors:
Yeuk Hay Joshua Lam
Abstract:
Let $p$ be a fixed prime number, and $q$ a power of $p$. For any curve over $\mathbb{F}_q$ and any local system on it, we have a number field generated by the traces of Frobenii at closed points, known as the trace field. We show that as we range over all pointed curves of type $(g,n)$ in characteristic $p$ and rank two local systems satisfying a condition at infinity, the set of trace fields which are unramified at $p$ and of bounded degree is finite. This proves observations of Kontsevich obtained via numerical computations, which are in turn closely related to the analogue of Maeda's conjecture over function fields. We also prove a similar finiteness result across all primes $p$. One of the key steps in the proofs is the boundedness of abelian schemes of $\mathrm{GL}_2$-type over curves in positive characteristics, which is an analogue of Faltings' Arakelov theorem for abelian varieties in our setting.
Submitted 18 November, 2022; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Adaptive Data Fusion for Multi-task Non-smooth Optimization
Authors:
Henry Lam,
Kaizheng Wang,
Yuhang Wu,
Yichen Zhang
Abstract:
We study the problem of multi-task non-smooth optimization that arises ubiquitously in statistical learning, decision-making and risk management. We develop a data fusion approach that adaptively leverages commonalities among a large number of objectives to improve sample efficiency while tackling their unknown heterogeneities. We provide sharp statistical guarantees for our approach. Numerical experiments on both synthetic and real data demonstrate significant advantages of our approach over benchmarks.
Submitted 21 October, 2022;
originally announced October 2022.
-
Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables
Authors:
Mengdi Xu,
Peide Huang,
Yaru Niu,
Visak Kumar,
Jielin Qiu,
Chao Fang,
Kuan-Hui Lee,
Xuewei Qi,
Henry Lam,
Bo Li,
Ding Zhao
Abstract:
One key challenge for multi-task reinforcement learning (RL) in practice is the absence of task indicators. Robust RL has been applied to deal with task ambiguity, but may result in over-conservative policies. To balance the worst-case (robustness) and average performance, we propose Group Distributionally Robust Markov Decision Process (GDR-MDP), a flexible hierarchical MDP formulation that encodes task groups via a latent mixture model. GDR-MDP identifies the optimal policy that maximizes the expected return under the worst-possible qualified belief over task groups within an ambiguity set. We rigorously show that GDR-MDP's hierarchical structure improves distributional robustness by adding regularization to the worst possible outcomes. We then develop deep RL algorithms for GDR-MDP for both value-based and policy-based RL methods. Extensive experiments on Box2D control tasks, MuJoCo benchmarks, and Google football platforms show that our algorithms outperform classic robust training algorithms across diverse environments in terms of robustness under belief uncertainties. Demos are available on our project page (https://sites.google.com/view/gdr-rl/home).
Submitted 21 October, 2022;
originally announced October 2022.
-
Bootstrap in High Dimension with Low Computation
Authors:
Henry Lam,
Zhenyuan Liu
Abstract:
The bootstrap is a popular data-driven method to quantify statistical uncertainty, but for modern high-dimensional problems, it could suffer from huge computational costs due to the need to repeatedly generate resamples and refit models. We study the use of bootstraps in high-dimensional environments with a small number of resamples. In particular, we show that with a recent "cheap" bootstrap perspective, using a number of resamples as small as one could attain valid coverage even when the dimension grows closely with the sample size, thus strongly supporting the implementability of the bootstrap for large-scale problems. We validate our theoretical results and compare the performance of our approach with other benchmarks via a range of experiments.
Submitted 19 June, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
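To make the "small number of resamples" idea in the abstract above concrete, here is a minimal sketch of a cheap-bootstrap-style confidence interval. The abstract does not spell out the interval, so the formula used here is an assumption: the point estimate plus or minus a Student-$t$ quantile with $B$ degrees of freedom times $S$, where $S^2$ is the mean squared deviation of the $B$ resample estimates from the point estimate. The function name and the small hard-coded quantile table (95% two-sided level only) are illustrative.

```python
import math
import random
import statistics

# 0.975 quantiles of Student's t for a few degrees of freedom (standard table values).
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 10: 2.228}

def cheap_bootstrap_ci(data, estimator, n_resamples=1, seed=0):
    """95% interval from only `n_resamples` bootstrap refits (assumed formula)."""
    if n_resamples not in T_975:
        raise ValueError("no tabulated t quantile for this number of resamples")
    rng = random.Random(seed)
    point = estimator(data)
    sq_dev = 0.0
    for _ in range(n_resamples):
        resample = rng.choices(data, k=len(data))  # sample with replacement
        sq_dev += (estimator(resample) - point) ** 2
    spread = math.sqrt(sq_dev / n_resamples)       # deviations taken around the point estimate
    half_width = T_975[n_resamples] * spread
    return point - half_width, point + half_width

# Usage on synthetic data: interval for the mean from five resamples.
gen = random.Random(42)
sample = [gen.gauss(0.0, 1.0) for _ in range(200)]
low, high = cheap_bootstrap_ci(sample, statistics.mean, n_resamples=5)
```

With a single resample the quantile is about 12.7, so the interval is wide, but only one extra model fit is needed; this is the trade-off the sketch is meant to show.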
-
EmbryosFormer: Deformable Transformer and Collaborative Encoding-Decoding for Embryos Stage Development Classification
Authors:
Tien-Phat Nguyen,
Trong-Thang Pham,
Tri Nguyen,
Hieu Le,
Dung Nguyen,
Hau Lam,
Phong Nguyen,
Jennifer Fowler,
Minh-Triet Tran,
Ngan Le
Abstract:
The timing of cell divisions in early embryos during the In-Vitro Fertilization (IVF) process is a key predictor of embryo viability. However, observing cell divisions in Time-Lapse Monitoring (TLM) is a time-consuming process and depends heavily on experts. In this paper, we propose EmbryosFormer, a computational model to automatically detect and classify cell divisions from original time-lapse images. Our proposed network is designed as an encoder-decoder deformable transformer with collaborative heads. The transformer contracting path predicts per-image labels and is optimized by a classification head. The transformer expanding path models the temporal coherency between embryo images to enforce a monotonic non-decreasing constraint and is optimized by a segmentation head. Both contracting and expanding paths are synergetically learned by a collaboration head. We have benchmarked our proposed EmbryosFormer on two datasets: a public dataset of mouse embryos up to the 8-cell stage and an in-house dataset of human embryos up to the 4-cell stage. Source code: https://github.com/UARK-AICV/Embryos.
Submitted 6 October, 2022;
originally announced October 2022.
-
Gapped Lineon and Fracton Models on Graphs
Authors:
Pranay Gorantla,
Ho Tat Lam,
Nathan Seiberg,
Shu-Heng Shao
Abstract:
We introduce a $\mathbb{Z}_N$ stabilizer code that can be defined on any spatial lattice of the form $Γ\times C_{L_z}$, where $Γ$ is a general graph. We also present the low-energy limit of this stabilizer code as a Euclidean lattice action, which we refer to as the anisotropic $\mathbb{Z}_N$ Laplacian model. It is gapped, robust (i.e., stable under small deformations), and has lineons. Its ground state degeneracy (GSD) is expressed in terms of a "mod $N$-reduction" of the Jacobian group of the graph $Γ$. In the special case when space is an $L\times L\times L_z$ cubic lattice, the logarithm of the GSD depends on $L$ in an erratic way and grows no faster than $O(L)$. We also discuss another gapped model, the $\mathbb{Z}_N$ Laplacian model, which can be defined on any graph. It has fractons and a similarly strange GSD.
Submitted 20 November, 2022; v1 submitted 7 October, 2022;
originally announced October 2022.
-
2+1d Compact Lifshitz Theory, Tensor Gauge Theory, and Fractons
Authors:
Pranay Gorantla,
Ho Tat Lam,
Nathan Seiberg,
Shu-Heng Shao
Abstract:
The 2+1d continuum Lifshitz theory of a free compact scalar field plays a prominent role in a variety of quantum systems in condensed matter physics and high energy physics. It is known that in compact space, it has an infinite ground state degeneracy. In order to understand this theory better, we consider two candidate lattice regularizations of it using the modified Villain formalism. We show that these two lattice theories have significantly different global symmetries (including a dipole global symmetry), anomalies, ground state degeneracies, and dualities. In particular, one of them is self-dual. Given these theories and their global symmetries, we can couple them to corresponding gauge theories. These are two different $U(1)$ tensor gauge theories. The resulting models have excitations with restricted mobility, i.e., fractons. Finally, we give an exact lattice realization of the fracton/lineon-elasticity dualities for the Lifshitz theory, scalar and vector charge gauge theories.
Submitted 23 July, 2023; v1 submitted 20 September, 2022;
originally announced September 2022.
-
Detecting Political Biases of Named Entities and Hashtags on Twitter
Authors:
Jeffrey Zhu,
Yining Wang,
Pei Zhou,
Wen Hong Lam,
Mason A. Porter,
Yizhou Sun
Abstract:
Ideological divisions in the United States have become increasingly prominent in daily communication. Accordingly, there has been much research on political polarization, including many recent efforts that take a computational perspective. By detecting political biases in a corpus of text, one can attempt to describe and discern the polarity of that text. Intuitively, the named entities (i.e., the nouns and the phrases that act as nouns) and hashtags in text often carry information about political views. For example, people who use the term "pro-choice" are likely to be liberal, whereas people who use the term "pro-life" are likely to be conservative. In this paper, we seek to reveal political polarities in social-media text data and to quantify these polarities by explicitly assigning a polarity score to entities and hashtags. Although this idea is straightforward, it is difficult to perform such inference in a trustworthy quantitative way. Key challenges include the small number of known labels, the continuous spectrum of political views, and the preservation of both a polarity score and a polarity-neutral semantic meaning in an embedding vector of words. To attempt to overcome these challenges, we propose the Polarity-aware Embedding Multi-task learning (PEM) model. This model consists of (1) a self-supervised context-preservation task, (2) an attention-based tweet-level polarity-inference task, and (3) an adversarial learning task that promotes independence between an embedding's polarity dimension and its semantic dimensions. Our experimental results demonstrate that our PEM model can successfully learn polarity-aware embeddings that perform well on classification tasks. We examine a variety of applications and we thereby demonstrate the effectiveness of our PEM model. We also discuss important limitations of our work and encourage caution when applying it to real-world scenarios.
Submitted 17 March, 2023; v1 submitted 16 September, 2022;
originally announced September 2022.
-
Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics
Authors:
C. Chen,
Y. P. Huang,
W. H. K. Lam,
T. L. Pan,
S. C. Hsu,
A. Sumalee,
R. X. Zhong
Abstract:
Existing data-driven and feedback traffic control strategies do not consider the heterogeneity of real-time data measurements. Besides, traditional reinforcement learning (RL) methods for traffic control usually converge slowly because they lack data efficiency. Moreover, conventional optimal perimeter control schemes require exact knowledge of the system dynamics and thus would be fragile to endogenous uncertainties. To handle these challenges, this work proposes an integral reinforcement learning (IRL) based approach to learning the macroscopic traffic dynamics for adaptive optimal perimeter control. This work makes the following primary contributions to the transportation literature: (a) A continuous-time control is developed with discrete gain updates to adapt to the discrete-time sensor data. (b) To reduce the sampling complexity and use the available data more efficiently, the experience replay (ER) technique is introduced to the IRL algorithm. (c) The proposed method relaxes the requirement on model calibration in a "model-free" manner that enables robustness against modeling uncertainty and enhances the real-time performance via a data-driven RL algorithm. (d) The convergence of the IRL-based algorithms and the stability of the controlled traffic dynamics are proven via the Lyapunov theory. The optimal control law is parameterized and then approximated by neural networks (NN), which moderates the computational complexity. Both state and input constraints are considered while no model linearization is required. Numerical examples and simulation experiments are presented to verify the effectiveness and efficiency of the proposed method.
Submitted 13 September, 2022;
originally announced September 2022.
-
Lifts of supersingular abelian varieties with small Mumford-Tate groups
Authors:
Yeuk Hay Joshua Lam,
Abhishek Oswal
Abstract:
We investigate to what extent an abelian variety over a finite field can be lifted to one in characteristic zero with small Mumford-Tate group. We prove that supersingular abelian surfaces, respectively threefolds, can be lifted to ones isogenous to a square, respectively product, of elliptic curves. On the other hand, we show that supersingular abelian threefolds cannot be lifted to one isogenous to the cube of an elliptic curve over the Witt vectors.
Submitted 16 August, 2022;
originally announced August 2022.
-
Improving COVID-19 CT Classification of CNNs by Learning Parameter-Efficient Representation
Authors:
Yujia Xu,
Hak-Keung Lam,
Guangyu Jia,
Jian Jiang,
Junkai Liao,
Xinqi Bao
Abstract:
The COVID-19 pandemic continues to spread rapidly over the world and causes a tremendous crisis in global human health and the economy. Its early detection and diagnosis are crucial for controlling the further spread. Many deep learning-based methods have been proposed to assist clinicians in automatic COVID-19 diagnosis based on computed tomography imaging. However, challenges still remain, including low data diversity in existing datasets, and unsatisfactory detection resulting from insufficient accuracy and sensitivity of deep learning models. To enhance the data diversity, we design augmentation techniques of incremental levels and apply them to the largest open-access benchmark dataset, COVIDx CT-2A. Meanwhile, similarity regularization (SR) derived from contrastive learning is proposed in this study to enable CNNs to learn more parameter-efficient representations, thus improving the accuracy and sensitivity of CNNs. The results on seven commonly used CNNs demonstrate that CNN performance can be improved stably through applying the designed augmentation and SR techniques. In particular, DenseNet121 with SR achieves an average test accuracy of 99.44% in three trials for three-category classification, including normal, non-COVID-19 pneumonia, and COVID-19 pneumonia. The achieved precision, sensitivity, and specificity for the COVID-19 pneumonia category are 98.40%, 99.59%, and 99.50%, respectively. These statistics suggest that our method has surpassed the existing state-of-the-art methods on the COVIDx CT-2A dataset.
Submitted 9 August, 2022;
originally announced August 2022.
-
Non-isometric pairs of Riemannian manifolds with the same Guillemin-Ruelle zeta function
Authors:
Hy Lam
Abstract:
In 1985, T. Sunada constructed a vast collection of non-isometric Laplace-isospectral pairs $(M_1,g_1)$, resp. $(M_2,g_2)$ of Riemannian manifolds. He further proves that the Ruelle zeta functions $Z_g(s):= \prod_γ(1 - e^{-sL(γ)})^{-1}$ of $(M_1,g_1)$, resp. $(M_2,g_2)$ coincide, where $\{γ\}$ runs over the primitive closed geodesics of $(M,g)$ and $L(γ)$ is the length of $γ$. In this article, we use the method of intertwining operators on the unit cosphere bundle to prove that the same Sunada pairs have identical Guillemin-Ruelle dynamical L-functions $L_G(s) = \sum_{γ\in \mathscr{G}}\frac{L_γ^\# e^{-sL_γ}}{|\det(I -\mathbf{P}_γ)|}$, where the sum runs over all closed geodesics.
Submitted 20 August, 2022; v1 submitted 9 August, 2022;
originally announced August 2022.
-
Non-invertible Time-reversal Symmetry
Authors:
Yichul Choi,
Ho Tat Lam,
Shu-Heng Shao
Abstract:
In gauge theory, it is commonly stated that time-reversal symmetry only exists at $θ=0$ or $π$ for a $2π$-periodic $θ$-angle. In this paper, we point out that in both the free Maxwell theory and massive QED, there is a non-invertible time-reversal symmetry at every rational $θ$-angle, i.e., $θ= πp/N$. The non-invertible time-reversal symmetry is implemented by a conserved, anti-linear operator without an inverse. It is a composition of the naive time-reversal transformation and a fractional quantum Hall state. We also find similar non-invertible time-reversal symmetries in non-Abelian gauge theories, including the $\mathcal{N}=4$ $SU(2)$ super Yang-Mills theory along the locus $|τ|=1$ on the conformal manifold.
Submitted 8 August, 2022;
originally announced August 2022.
-
Time-Frequency Distributions of Heart Sound Signals: A Comparative Study using Convolutional Neural Networks
Authors:
Xinqi Bao,
Yujia Xu,
Hak-Keung Lam,
Mohamed Trabelsi,
Ines Chihi,
Lilia Sidhom,
Ernest N. Kamavuako
Abstract:
Time-Frequency Distributions (TFDs) support the heart sound characterisation and classification in early cardiac screening. However, despite the frequent use of TFDs in signal analysis, no study has comprehensively compared their performance on deep learning for automatic diagnosis. Furthermore, the combination of signal processing methods as inputs for Convolutional Neural Networks (CNNs) has proved to be a practical approach to increasing signal classification performance. Therefore, this study aimed to investigate the optimal use of TFD/combined TFDs as input for CNNs. The presented results revealed that: 1) The transformation of the heart sound signal into the TF domain achieves higher classification performance than using raw signals. Among the TFDs, the difference in the performance was slight for all the CNN models (within $1.3\%$ in average accuracy). However, Continuous wavelet transform (CWT) and Chirplet transform (CT) outperformed the rest. 2) The appropriate increase of the CNN capacity and architecture optimisation can improve the performance, while the network architecture should not be overly complicated. Based on the ResNet or SEResNet family results, the increase in the number of parameters and the depth of the structure do not noticeably improve the performance. 3) Combining TFDs as CNN inputs did not significantly improve the classification results. The findings of this study provided the knowledge for selecting TFDs as CNN input and designing CNN architecture for heart sound classification.
Submitted 5 August, 2022;
originally announced August 2022.
-
Grain Growth During Protostellar Disk Formation
Authors:
Yisheng Tu,
Zhi-Yun Li,
Ka Ho Lam
Abstract:
Recent observations indicate that mm/cm-sized grains may exist in the embedded protostellar disks. How such large grains grow from the micron size (or less) in the earliest phase of star formation remains relatively unexplored. In this study we take a first step to model the grain growth in the protostellar environment, using two-dimensional (2D axisymmetric) radiation hydrodynamic and grain growth simulations. We show that the grain growth calculations can be greatly simplified by the "terminal velocity approximation", where the dust drift velocity relative to the gas is proportional to its stopping time, which is proportional to the grain size. We find that the grain-grain collision from size-dependent terminal velocity alone is too slow to convert a significant fraction of the initially micron-sized grains into mm/cm sizes during the deeply embedded Class 0 phase. Substantial grain growth is achieved when the grain-grain collision speed is enhanced by a factor of 4. The dust growth above and below the disk midplane enables the grains to settle faster towards the midplane, which increases the local dust-to-gas ratio, which, in turn, speeds up further growth there. How this needed enhancement can be achieved is unclear, although turbulence is a strong possibility that deserves further exploration.
Submitted 28 July, 2022;
originally announced July 2022.
-
Fractons on Graphs and Complexity
Authors:
Pranay Gorantla,
Ho Tat Lam,
Shu-Heng Shao
Abstract:
We introduce two exotic lattice models on a general spatial graph. The first one is a matter theory of a compact Lifshitz scalar field, while the second one is a certain rank-2 $U(1)$ gauge theory of fractons. Both lattice models are defined via the discrete Laplacian operator on a general graph. We unveil an intriguing correspondence between the physical observables of these lattice models and graph theory quantities. For instance, the ground state degeneracy of the matter theory equals the number of spanning trees of the spatial graph, which is a common measure of complexity in graph theory ("GSD = complexity"). The discrete global symmetry is identified as the Jacobian group of the graph. In the gauge theory, superselection sectors of fractons are in one-to-one correspondence with the divisor classes in graph theory. In particular, under mild assumptions on the spatial graph, the fracton immobility is proven using a graph-theoretic Abel-Jacobi map.
Submitted 5 November, 2022; v1 submitted 18 July, 2022;
originally announced July 2022.
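
The "GSD = complexity" statement in the abstract above identifies the ground state degeneracy of the matter theory with the number of spanning trees of the spatial graph. As a rough illustration only (this is not the authors' code, and the function name `count_spanning_trees` is hypothetical), that number can be computed with Kirchhoff's matrix-tree theorem: delete one row and the matching column of the graph Laplacian and take the determinant. The sketch below assumes an undirected graph given as an edge list on vertices `0..n-1` and uses exact rational arithmetic.

```python
from fractions import Fraction


def count_spanning_trees(n, edges):
    """Count spanning trees of an undirected graph on vertices 0..n-1.

    Uses Kirchhoff's matrix-tree theorem: the count equals any cofactor
    of the graph Laplacian L = D - A.
    """
    # Build the Laplacian; repeated edges are counted with multiplicity.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1

    # Delete the last row and column to obtain the reduced Laplacian.
    m = [row[:-1] for row in lap[:-1]]
    size = n - 1

    # Exact determinant by Gaussian elimination over the rationals.
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0  # singular reduced Laplacian: the graph is disconnected
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)


# A 4-cycle has one spanning tree per removed edge.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(count_spanning_trees(4, cycle4))
```

Exact fractions are used instead of floating point so that the result is an integer without rounding; for large graphs a numerical determinant would be faster but would need care with precision.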
-
Ice Age: Chemo-dynamical modeling of Cha-MMS1 to predict new solid-phase species for detection with JWST
Authors:
Mihwa Jin,
Ka Ho Lam,
Melissa K. McClure,
Jeroen Terwisscha van Scheltinga,
Zhi-Yun Li,
Adwin Boogert,
Eric Herbst,
Shane W. Davis,
Robin T. Garrod
Abstract:
Chemical models and experiments indicate that interstellar dust grains and their ice mantles play an important role in the production of complex organic molecules (COMs). To date, the most complex solid-phase molecule detected with certainty in the ISM is methanol, but the James Webb Space Telescope (JWST) may be able to identify still larger organic species. In this study, we use a coupled chemo-dynamical model to predict new candidate species for JWST detection toward the young star-forming core Cha-MMS1, combining the gas-grain chemical kinetic code MAGICKAL with a 1-D radiative hydrodynamics simulation using Athena++. With this model, the relative abundances of the main ice constituents with respect to water toward the core center match well with typical observational values, providing a firm basis to explore the ice chemistry. Six oxygen-bearing COMs (ethanol, dimethyl ether, acetaldehyde, methyl formate, methoxy methanol, and acetic acid), as well as formic acid, show abundances as high as, or exceeding, 0.01% with respect to water ice. Based on the modeled ice composition, the infrared spectrum is synthesized to diagnose the detectability of the new ice species. The contribution of COMs to IR absorption bands is minor compared to the main ice constituents, and the identification of COM ice toward the core center of Cha-MMS1 with the JWST NIRCAM/Wide Field Slitless Spectroscopy (2.4-5.0 micron) may be unlikely. However, MIRI observations (5-28 micron) toward COM-rich environments where solid-phase COM abundances exceed 1% with respect to the water ice column density might reveal the distinctive ice features of COMs.
Submitted 9 July, 2022;
originally announced July 2022.
-
Centrifugal Barrier and Super-Keplerian Rotation in Protostellar Disk Formation
Authors:
Dylan C. Jones,
Ka Ho Lam,
Zhi-Yun Li,
Yisheng Tu
Abstract:
With the advent of ALMA, it is now possible to observationally constrain how disks form around deeply embedded protostars. In particular, the recent ALMA C3H2 line observations of the nearby protostar L1527 have been interpreted as evidence for the so-called "centrifugal barrier," where the protostellar envelope infall is gradually decelerated to a stop by the centrifugal force in a region of super-Keplerian rotation. To test the concept of centrifugal barrier, which was originally based on angular-momentum-conserving collapse of a rotating test particle around a fixed point mass, we carry out simple axisymmetric hydrodynamic simulations of protostellar disk formation including a minimum set of ingredients: self-gravity, rotation, and a prescribed viscosity that enables the disk to accrete. We find that a super-Keplerian region can indeed exist when the viscosity is relatively large but, unlike the classic picture of centrifugal barrier, the infalling envelope material is not decelerated solely by the centrifugal force. The region has more specific angular momentum than its surrounding envelope material, which points to an origin in outward angular momentum transport in the disk (subject to the constraint of disk expansion by the infalling envelope), rather than the spin-up of the envelope material envisioned in the classic picture as it falls closer to the center in order to conserve angular momentum. For smaller viscosities, the super-Keplerian rotation is weaker or nonexistent. We conclude that, despite the existence of super-Keplerian rotation in some parameter regime, the classic picture of centrifugal barrier is not supported by our simulations.
Submitted 7 July, 2022;
originally announced July 2022.
-
Evaluating Aleatoric Uncertainty via Conditional Generative Models
Authors:
Ziyi Huang,
Henry Lam,
Haofeng Zhang
Abstract:
Aleatoric uncertainty quantification seeks distributional knowledge of random responses, which is important for reliability analysis and robustness improvement in machine learning applications. Previous research on aleatoric uncertainty estimation mainly targets closed-form conditional densities or variances, which requires strong restrictions on the data distribution or dimensionality. To overcome these restrictions, we study conditional generative models for aleatoric uncertainty estimation. We introduce two metrics to measure the discrepancy between two conditional distributions that suit these models. Both metrics can be easily and unbiasedly computed via Monte Carlo simulation of the conditional generative models, thus facilitating their evaluation and training. We demonstrate numerically how our metrics provide correct measurements of conditional distributional discrepancies and can be used to train conditional models competitive against existing benchmarks.
Submitted 9 June, 2022;
originally announced June 2022.
-
Learning to Generate Prompts for Dialogue Generation through Reinforcement Learning
Authors:
Hsuan Su,
Pohan Chi,
Shih-Cheng Huang,
Chung Ho Lam,
Saurav Sahay,
Shang-Tse Chen,
Hung-yi Lee
Abstract:
Much literature has shown that prompt-based learning is an efficient method to make use of large pre-trained language models. Recent works also exhibit the possibility of steering a chatbot's output by plugging in an appropriate prompt. Gradient-based methods are often used to perturb the prompts. However, some language models are not even available to the public. In this work, we first explore the combination of prompting and reinforcement learning (RL) to steer models' generation without accessing any of the models' parameters. Second, to reduce the training effort and enhance the generalizability to unseen tasks, we apply multi-task learning to make the model learn to generalize to new tasks better. The experimental results show that our proposed method can successfully control several state-of-the-art (SOTA) dialogue models without accessing their parameters. Furthermore, the model demonstrates a strong ability to quickly adapt to an unseen task in fewer steps than the baseline model.
Submitted 13 October, 2022; v1 submitted 8 June, 2022;
originally announced June 2022.