Zero Inflation as a Missing Data Problem: a Proxy-based Approach

Authors: Trung Phung, Jaron J. R. Lee, Opeyemi Oladapo-Shittu, Eili Y. Klein, Ayse Pinar Gurses, Susan M. Hannum, Kimberly Weems, Jill A. Marsteller, Sara E. Cosgrove, Sara C. Keller, Ilya Shpitser

Abstract: A common type of zero-inflated data has certain true values incorrectly replaced by zeros due to data recording conventions (rare outcomes assumed to be absent) or details of data recording equipment (e.g. artificial zeros in gene expression data). Existing methods for zero-inflated data either fit the observed data likelihood via parametric mixture models that explicitly represent excess zeros, or aim to replace excess zeros by imputed values. If the goal of the analysis relies on knowing true data realizations, a particular challenge with zero-inflated data is identifiability, since it is difficult to correctly determine which observed zeros are real and which are inflated. This paper views zero-inflated data as a general type of missing data problem, where the observability indicator for a potentially censored variable is itself unobserved whenever a zero is recorded. We show that, without additional assumptions, target parameters involving a zero-inflated variable are not identified. However, if a proxy of the missingness indicator is observed, a modification of the effect restoration approach of Kuroki and Pearl allows identification and estimation, provided the proxy-indicator relationship is known. If this relationship is unknown, our approach yields a partial identification strategy for sensitivity analysis. Specifically, we show that only certain proxy-indicator relationships are compatible with the observed data distribution. We give an analytic bound for this relationship in cases with a categorical outcome, which is sharp in certain models. For more complex cases, sharp numerical bounds may be computed using methods in Duarte et al. [2023]. We illustrate our method via simulation studies and a data application on central line-associated bloodstream infections (CLABSIs).

Submitted 2 July, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

Comments: 28 pages, 8 figures, accepted for the 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)

arXiv:2405.12594 [pdf, other]

Statistical Qubit Freezing Extending Physical Limit of Quantum Annealers

Authors: Jeung Rac Lee, June-Koo Kevin Rhee, Changjun Kim, Bo Hyun Choi

Abstract: Adiabatic quantum annealers encounter scalability challenges because the energy gaps between ground and excited states diminish exponentially fast as the qubit count increases. This introduces errors in identifying ground states, compounded by thermal noise. We propose a novel algorithmic scheme called statistical qubit freezing (SQF) that selectively fixes the state of statistically deterministic qubits in the annealing Hamiltonian model of the given problem. Applying freezing repeatedly, SQF significantly enhances the spectral gap of an adiabatic process, for example by up to 60% compared to traditional annealing methods in D-Wave's standard quantum Ising machine solution, effectively overcoming the fundamental limitations.

Submitted 27 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: 11 pages, 6 figures

arXiv:2404.14219 [pdf, other]

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra , et al. (90 additional authors not shown)

Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset for training, a scaled-up version of the one used for phi-2, composed of heavily filtered publicly available web data and synthetic data. The model is also further aligned for robustness, safety, and chat format. We also provide some initial parameter-scaling results with 7B and 14B models trained for 4.8T tokens, called phi-3-small and phi-3-medium, both significantly more capable than phi-3-mini (e.g., respectively 75% and 78% on MMLU, and 8.7 and 8.9 on MT-bench). Moreover, we introduce phi-3-vision, a 4.2 billion parameter model based on phi-3-mini with strong reasoning capabilities for image and text prompts.

Submitted 23 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: 19 pages

arXiv:2404.06602 [pdf, ps, other]

A General Identification Algorithm For Data Fusion Problems Under Systematic Selection

Authors: Jaron J. R. Lee, AmirEmad Ghassami, Ilya Shpitser

Abstract: Causal inference is made challenging by confounding, selection bias, and other complications. A common approach to addressing these difficulties is the inclusion of auxiliary data on the superpopulation of interest. Such data may measure a different set of variables, or be obtained under different experimental conditions than the primary dataset. Analysis based on multiple datasets must carefully account for similarities between datasets, while appropriately accounting for differences. In addition, selection of experimental units into different datasets may be systematic; similar difficulties are encountered in missing data problems. Existing methods for combining datasets either do not consider this issue, or assume simple selection mechanisms. In this paper, we provide a general approach, based on graphical causal models, for causal inference from data on the same superpopulation that is obtained under different experimental conditions. Our framework allows both arbitrary unobserved confounding, and arbitrary selection processes into different experimental regimes in our data. We describe how systematic selection processes may be organized into a hierarchy similar to censoring processes in missing data: selected completely at random (SCAR), selected at random (SAR), and selected not at random (SNAR). In addition, we provide a general identification algorithm for interventional distributions in this setting.

Submitted 15 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: 17 pages

arXiv:2312.07399 [pdf, other]

Large Language Models are Clinical Reasoners: Reasoning-Aware Diagnosis Framework with Prompt-Generated Rationales

Authors: Taeyoon Kwon, Kai Tzu-iunn Ong, Dong** Kang, Seungjun Moon, Jeong Ryong Lee, Dosik Hwang, Yongsik Sim, Beomseok Sohn, Dongha Lee, **young Yeo

Abstract: Machine reasoning has made great progress in recent years owing to large language models (LLMs). In the clinical domain, however, most NLP-driven projects mainly focus on clinical classification or reading comprehension, and under-explore clinical reasoning for disease diagnosis because rationale annotation by clinicians is expensive. In this work, we present a "reasoning-aware" diagnosis framework that rationalizes the diagnostic process via prompt-based learning in a time- and labor-efficient manner, and learns to reason over the prompt-generated rationales. Specifically, we address clinical reasoning for disease diagnosis, where the LLM generates diagnostic rationales providing its insight on presented patient data and the reasoning path towards the diagnosis, namely Clinical Chain-of-Thought (Clinical CoT). We empirically demonstrate the clinical reasoning ability of LLMs/LMs via extensive experiments and analyses on both rationale generation and disease diagnosis in various settings. We further propose a novel set of criteria for evaluating the potential of machine-generated rationales for real-world clinical settings, facilitating and benefiting future research in this area.

Submitted 10 May, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

Comments: Accepted to AAAI 2024

arXiv:2311.18145 [pdf, ps, other]

Sparsifying generalized linear models

Authors: Arun Jambulapati, James R. Lee, Yang P. Liu, Aaron Sidford

Abstract: We consider the sparsification of sums $F : \mathbb{R}^n \to \mathbb{R}$ where $F(x) = f_1(\langle a_1,x\rangle) + \cdots + f_m(\langle a_m,x\rangle)$ for vectors $a_1,\ldots,a_m \in \mathbb{R}^n$ and functions $f_1,\ldots,f_m : \mathbb{R} \to \mathbb{R}_+$. We show that $(1+\varepsilon)$-approximate sparsifiers of $F$ with support size $\frac{n}{\varepsilon^2} (\log \frac{n}{\varepsilon})^{O(1)}$ exist whenever the functions $f_1,\ldots,f_m$ are symmetric, monotone, and satisfy natural growth bounds. Additionally, we give efficient algorithms to compute such a sparsifier assuming each $f_i$ can be evaluated efficiently. Our results generalize the classic case of $\ell_p$ sparsification, where $f_i(z) = |z|^p$, for $p \in (0, 2]$, and give the first near-linear size sparsifiers in the well-studied setting of the Huber loss function and its generalizations, e.g., $f_i(z) = \min\{|z|^p, |z|^2\}$ for $0 < p \leq 2$. Our sparsification algorithm can be applied to give near-optimal reductions for optimizing a variety of generalized linear models including $\ell_p$ regression for $p \in (1, 2]$ to high accuracy, via solving $(\log n)^{O(1)}$ sparse regression instances with $m \le n(\log n)^{O(1)}$, plus runtime proportional to the number of nonzero entries in the vectors $a_1, \dots, a_m$.

Submitted 29 November, 2023; originally announced November 2023.
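
To make the objects in the abstract above concrete, here is a minimal Python sketch, not the authors' sparsification algorithm. It evaluates $F(x) = \sum_i f_i(\langle a_i, x\rangle)$ for the Huber-type loss $f_i(z) = \min\{|z|^p, |z|^2\}$, and a reweighted sum $\sum_i w_i f_i(\langle a_i, x\rangle)$ in which only some weights are nonzero. The names `huber_type` and `weighted_glm_sum`, the matrix `A`, and the weights are assumptions chosen for illustration; the weights are hand-picked, not produced by any sparsifier.

```python
import numpy as np

def huber_type(z, p=1.0):
    """Huber-type loss f(z) = min(|z|^p, |z|^2), applied elementwise."""
    a = np.abs(np.asarray(z, dtype=float))
    return np.minimum(a ** p, a ** 2)

def weighted_glm_sum(A, x, weights=None, p=1.0):
    """Evaluate sum_i w_i * f(<a_i, x>), where a_i are the rows of A.

    With weights=None every w_i is 1, which gives the unsparsified F(x).
    """
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    values = huber_type(A @ x, p)
    if weights is None:
        weights = np.ones(A.shape[0])
    return float(np.dot(np.asarray(weights, dtype=float), values))

# Illustrative data only (hypothetical, not from the paper).
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
x = np.array([0.5, 1.0])
full_value = weighted_glm_sum(A, x)                        # all three terms
reweighted = weighted_glm_sum(A, x, weights=[2.0, 0.0, 1.0])  # one term dropped
```

A $(1+\varepsilon)$-approximate sparsifier in the sense of the abstract would be a weight vector with few nonzero entries for which the reweighted value stays within a $(1+\varepsilon)$ factor of the full value for every `x`; the hand-picked weights above carry no such guarantee.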

arXiv:2310.20468 [pdf, other]

An Introduction to Causal Inference Methods for Observational Human-Robot Interaction Research

Authors: Jaron J. R. Lee, Gopika Ajaykumar, Ilya Shpitser, Chien-Ming Huang

Abstract: Quantitative methods in Human-Robot Interaction (HRI) research have primarily relied upon randomized, controlled experiments in laboratory settings. However, such experiments are not always feasible when external validity, ethical constraints, and ease of data collection are of concern. Furthermore, as consumer robots become increasingly available, increasing amounts of real-world data will be available to HRI researchers, which prompts the need for quantitative approaches tailored to the analysis of observational data. In this article, we present an alternate approach towards quantitative research for HRI researchers using methods from causal inference that can enable researchers to identify causal relationships in observational settings where randomized, controlled experiments cannot be run. We highlight different scenarios that HRI research with consumer household robots may involve to contextualize how methods from causal inference can be applied to observational HRI research. We then provide a tutorial summarizing key concepts from causal inference using a graphical model perspective and link to code examples throughout the article, which are available at https://gitlab.com/causal/causal_hri. Our work paves the way for further discussion on new approaches towards observational HRI research while providing a starting point for HRI researchers to add causal inference techniques to their analytical toolbox.

Submitted 31 October, 2023; originally announced October 2023.

Comments: 28 pages

arXiv:2305.09049 [pdf, ps, other]

Sparsifying sums of norms

Authors: Arun Jambulapati, James R. Lee, Yang P. Liu, Aaron Sidford

Abstract: For any norms $N_1,\ldots,N_m$ on $\mathbb{R}^n$ and $N(x) := N_1(x)+\cdots+N_m(x)$, we show there is a sparsified norm $\tilde{N}(x) = w_1 N_1(x) + \cdots + w_m N_m(x)$ such that $|N(x) - \tilde{N}(x)| \leq \varepsilon N(x)$ for all $x \in \mathbb{R}^n$, where $w_1,\ldots,w_m$ are non-negative weights, of which only $O(\varepsilon^{-2} n \log(n/\varepsilon) (\log n)^{2.5})$ are non-zero. Additionally, if $N$ is $\mathrm{poly}(n)$-equivalent to the Euclidean norm on $\mathbb{R}^n$, then such weights can be found with high probability in time $O(m (\log n)^{O(1)} + \mathrm{poly}(n)) T$, where $T$ is the time required to evaluate a norm $N_i$. This immediately yields analogous statements for sparsifying sums of symmetric submodular functions. More generally, we show how to sparsify sums of $p$th powers of norms when the sum is $p$-uniformly smooth.

Submitted 30 November, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

arXiv:2301.11477 [pdf, other]

Ananke: A Python Package For Causal Inference Using Graphical Models

Authors: Jaron J. R. Lee, Rohit Bhattacharya, Razieh Nabi, Ilya Shpitser

Abstract: We implement Ananke: an object-oriented Python package for causal inference with graphical models. At the top of our inheritance structure is an easily extensible Graph class that provides an interface to several broadly useful graph-based algorithms and methods for visualization. We use best practices of object-oriented programming to implement subclasses of the Graph superclass that correspond to types of causal graphs that are popular in the current literature. This includes directed acyclic graphs for modeling causally sufficient systems, acyclic directed mixed graphs for modeling unmeasured confounding, and chain graphs for modeling data dependence and interference. Within these subclasses, we implement specialized algorithms for common statistical and causal modeling tasks, such as separation criteria for reading conditional independence, nonparametric identification, and parametric and semiparametric estimation of model parameters. Here, we present a broad overview of the package and example usage for a problem with unmeasured confounding. Up-to-date documentation is available at https://ananke.readthedocs.io/en/latest/.

Submitted 26 January, 2023; originally announced January 2023.

arXiv:2301.10306 [pdf]

doi 10.1063/5.0157654

Two-Level Systems in Nucleated and Non-Nucleated Epitaxial alpha-Tantalum films

Authors: L. D. Alegria, D. M. Tennant, K. R. Chaves, J. R. I. Lee, S. R. O'Kelley, Y. J. Rosen, J L DuBois

Abstract: Building usefully coherent superconducting quantum processors depends on reducing losses in their constituent materials. Tantalum, like niobium, has proven utility as the primary superconducting layer within highly coherent qubits. But, unlike Nb, high temperatures are typically used to stabilize the desirable body-centered-cubic phase, alpha-Ta, during thin film deposition. It has long been known that a thin Nb layer permits the room-temperature nucleation of alpha-Ta, although neither an epitaxial process nor few-photon microwave loss measurements have been reported for Nb-nucleated Ta films prior to this study. We compare resonators patterned from Ta films grown at high temperature (500 °C) and films nucleated at room temperature, in order to understand the impact of crystalline order on quantum coherence. In both cases, films grew with Al2O3 (001) || Ta (110), indicating that the epitaxial orientation is independent of temperature and is preserved across the Nb/Ta interface. We use conventional low-power spectroscopy to measure two-level system (TLS) loss, as well as an electric-field bias technique to measure the effective dipole moments of TLS in the surfaces of resonators. In our measurements, Nb-nucleated Ta resonators had a greater loss tangent (1.5 +/- 0.1 x 10^-5) than non-nucleated resonators (5 +/- 1 x 10^-6), in approximate proportion to defect densities as characterized by X-ray diffraction (0.27° vs 0.18° [110] reflection width) and electron microscopy (30 nm vs 70 nm domain size). The dependence of the loss tangent on domain size indicates that the development of more ordered Ta films is likely to lead to improvements in qubit coherence times. Moreover, low-temperature alpha-Ta epitaxy may enable the growth of new, microstate-free heterostructures which would not withstand high temperature processing.

Submitted 18 September, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

Journal ref: Appl. Phys. Lett. 123, 062601 (2023)

arXiv:2209.04539 [pdf, ps, other]

Spectral hypergraph sparsification via chaining

Authors: James R. Lee

Abstract: In a hypergraph on $n$ vertices where $D$ is the maximum size of a hyperedge, there is a weighted hypergraph spectral $\varepsilon$-sparsifier with at most $O(\varepsilon^{-2} \log(D) \cdot n \log n)$ hyperedges. This improves over the bound of Kapralov, Krauthgamer, Tardos and Yoshida (2021) who achieve $O(\varepsilon^{-4} n (\log n)^3)$, as well as the bound $O(\varepsilon^{-2} D^3 n \log n)$ obtained by Bansal, Svensson, and Trevisan (2019). The same sparsification result was obtained independently by Jambulapati, Liu, and Sidford (2022).

Submitted 23 September, 2022; v1 submitted 9 September, 2022; originally announced September 2022.

Comments: Incorrect example replaced by Remark 1.2; Definition of the distance corrected; reference to JLS'22 added

arXiv:2203.02807 [pdf, other]

Off-Policy Evaluation in Embedded Spaces

Authors: Jaron J. R. Lee, David Arbour, Georgios Theocharous

Abstract: Off-policy evaluation methods are important in recommendation systems and search engines, where data collected under an existing logging policy is used to estimate the performance of a new proposed policy. A common approach to this problem is weighting, where data is weighted by a density ratio between the probability of actions given contexts in the target and logged policies. In practice, two issues often arise. First, many problems have very large action spaces and we may not observe rewards for most actions, and so in finite samples we may encounter a positivity violation. Second, many recommendation systems are not probabilistic and so having access to logging and target policy densities may not be feasible. To address these issues, we introduce the featurized embedded permutation weighting estimator. The estimator computes the density ratio in an action embedding space, which reduces the possibility of positivity violations. The density ratio is computed leveraging recent advances in normalizing flows and density ratio estimation as a classification problem, in order to obtain estimates which are feasible in practice.

Submitted 2 January, 2023; v1 submitted 5 March, 2022; originally announced March 2022.

Comments: 9 pages, appeared at NeurIPS 2021 Workshop "Causal Inference Challenges in Sequential Decision Making: Bridging Theory and Practice", presented virtually Dec 14th 2021
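
The "weighting" approach that the abstract above describes as common can be sketched in a few lines of Python. This is plain inverse-propensity weighting with known policy probabilities, not the featurized embedded permutation weighting estimator the paper introduces. The function name `ipw_value` and the toy numbers are assumptions made for illustration.

```python
import numpy as np

def ipw_value(rewards, target_probs, logging_probs):
    """Importance-weighted estimate of the target policy's mean reward.

    Each logged reward is weighted by the density ratio
    pi_target(a | x) / pi_logging(a | x) for the action actually taken.
    """
    rewards = np.asarray(rewards, dtype=float)
    target_probs = np.asarray(target_probs, dtype=float)
    logging_probs = np.asarray(logging_probs, dtype=float)
    if np.any(logging_probs <= 0):
        # Positivity violation: the logging policy never takes this action,
        # so the density ratio is undefined.
        raise ValueError("logging probabilities must be strictly positive")
    weights = target_probs / logging_probs
    return float(np.mean(weights * rewards))

# Toy logged data (hypothetical): four interactions.
estimate = ipw_value(
    rewards=[1.0, 0.0, 1.0, 1.0],
    target_probs=[0.5, 0.5, 0.5, 0.5],
    logging_probs=[0.25, 0.5, 0.5, 0.25],
)
```

The explicit check on `logging_probs` mirrors the positivity issue raised in the abstract: when the action space is large, many actions have logging probability at or near zero, and the ratio becomes undefined or highly variable.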

arXiv:2111.10908 [pdf, other]

Multiscale entropic regularization for MTS on general metric spaces

Authors: Farzam Ebrahimnejad, James R. Lee

Abstract: We present an $O((\log n)^2)$-competitive algorithm for metrical task systems (MTS) on any $n$-point metric space that is also $1$-competitive for service costs. This matches the competitive ratio achieved by Bubeck, Cohen, Lee, and Lee (2019) and the refined competitive ratios obtained by Coester and Lee (2019). Those algorithms work by first randomly embedding the metric space into an ultrametric and then solving MTS there. In contrast, our algorithm is cast as regularized gradient descent where the regularizer is a multiscale metric entropy defined directly on the metric space. This answers an open question of Bubeck (Highlights of Algorithms, 2019).

Submitted 21 November, 2021; originally announced November 2021.

Comments: 23 pages, 1 figure, to appear in ITCS '22

arXiv:2107.09790 [pdf, other]

Non-existence of annular separators in geometric graphs

Authors: Farzam Ebrahimnejad, James R. Lee

Abstract: Benjamini and Papasoglou (2011) showed that planar graphs with uniform polynomial volume growth admit $1$-dimensional annular separators: The vertices at graph distance $R$ from any vertex can be separated from those at distance $2R$ by removing at most $O(R)$ vertices. They asked whether geometric $d$-dimensional graphs with uniform polynomial volume growth similarly admit $(d-1)$-dimensional annular separators when $d > 2$. We show that this fails in a strong sense: For any $d \geq 3$ and every $s \geq 1$, there is a collection of interior-disjoint spheres in $\mathbb{R}^d$ whose tangency graph $G$ has uniform polynomial growth, but such that all annular separators in $G$ have cardinality at least $R^s$.

Submitted 20 July, 2021; originally announced July 2021.

Comments: 17 pages, 7 figures

arXiv:2105.08868 [pdf, other]

Markov-Restricted Analysis of Randomized Trials with Non-Monotone Missing Binary Outcomes: Sensitivity Analysis and Identification Results

Authors: Daniel O. Scharfstein, Jaron J. R. Lee, Aidan McDermott, Aimee Campbell, Edward Nunes, Abigail G. Matthews, Ilya Shpitser

Abstract: Scharfstein et al. (2021) developed a sensitivity analysis model for analyzing randomized trials with repeatedly measured binary outcomes that are subject to nonmonotone missingness. Their approach becomes computationally intractable when the number of repeated measures is large (e.g., greater than 15). In this paper, we repair this problem by introducing an $m$th-order Markovian restriction. We establish identification by representing the model as a directed acyclic graph (DAG). We illustrate our methodology in the context of a randomized trial designed to evaluate a web-delivered psychosocial intervention to reduce substance use, assessed by testing urine samples twice weekly for 12 weeks, among patients entering outpatient addiction treatment. We evaluate the finite sample properties of our method in a realistic simulation study. Our methods have been integrated into the R package entitled slabm.

Submitted 18 May, 2021; originally announced May 2021.

arXiv:2104.03165 [pdf, other]

TB-Net: A Tailored, Self-Attention Deep Convolutional Neural Network Design for Detection of Tuberculosis Cases from Chest X-ray Images

Authors: Alexander Wong, James Ren Hou Lee, Hadi Rahmat-Khah, Ali Sabri, Amer Alaref

Abstract: Tuberculosis (TB) remains a global health problem, and is the leading cause of death from an infectious disease. A crucial step in the treatment of tuberculosis is screening high risk populations and the early detection of the disease, with chest x-ray (CXR) imaging being the most widely-used imaging modality. As such, there has been significant recent interest in artificial intelligence-based TB screening solutions for use in resource-limited scenarios where there is a lack of trained healthcare workers with expertise in CXR interpretation. Motivated by this pressing need and the recent recommendation by the World Health Organization (WHO) for the use of computer-aided diagnosis of TB, we introduce TB-Net, a self-attention deep convolutional neural network tailored for TB case screening. More specifically, we leveraged machine-driven design exploration to build a highly customized deep neural network architecture with attention condensers. We conducted an explainability-driven performance validation process to validate TB-Net's decision-making behaviour. Experiments on CXR data from a multi-national patient cohort showed that the proposed TB-Net is able to achieve accuracy/sensitivity/specificity of 99.86%/100.0%/99.71%. Radiologist validation was conducted on select cases by two board-certified radiologists with over 10 and 19 years of experience, respectively, and showed consistency between radiologist interpretation and critical factors leveraged by TB-Net for TB case detection for the case where radiologists identified anomalies. While not a production-ready solution, we hope that the open-source release of TB-Net as part of the COVID-Net initiative will support researchers, clinicians, and citizen data scientists in advancing this field in the fight against this global public health crisis.

Submitted 13 April, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

Comments: 10 pages

arXiv:2103.04008 [pdf, other]

Fibrosis-Net: A Tailored Deep Convolutional Neural Network Design for Prediction of Pulmonary Fibrosis Progression from Chest CT Images

Authors: Alexander Wong, Jack Lu, Adam Dorfman, Paul McInnis, Mahmoud Famouri, Daniel Manary, James Ren Hou Lee, Michael Lynch

Abstract: Pulmonary fibrosis is a devastating chronic lung disease that causes irreparable lung tissue scarring and damage, results in progressive loss of lung capacity, and has no known cure. A critical step in the treatment and management of pulmonary fibrosis is the assessment of lung function decline, with computed tomography (CT) imaging being a particularly effective method for determining the extent of lung damage caused by pulmonary fibrosis. Motivated by this, we introduce Fibrosis-Net, a deep convolutional neural network design tailored for the prediction of pulmonary fibrosis progression from chest CT images. More specifically, machine-driven design exploration was leveraged to determine a strong architectural design for CT lung analysis, upon which we build a customized network design tailored for predicting forced vital capacity (FVC) based on a patient's CT scan, initial spirometry measurement, and clinical metadata. Finally, we leverage an explainability-driven performance validation strategy to study the decision-making behaviour of Fibrosis-Net so as to verify that predictions are based on relevant visual indicators in CT images. Experiments using a patient cohort from the OSIC Pulmonary Fibrosis Progression Challenge showed that the proposed Fibrosis-Net is able to achieve a significantly higher modified Laplace Log Likelihood score than the winning solutions on the challenge. Furthermore, explainability-driven performance validation demonstrated that the proposed Fibrosis-Net exhibits correct decision-making behaviour by leveraging clinically-relevant visual indicators in CT images when making predictions on pulmonary fibrosis progress. While Fibrosis-Net is not yet a production-ready clinical assessment solution, we hope that its release in an open-source manner will encourage researchers, clinicians, and citizen data scientists alike to leverage and build upon it.

Submitted 20 April, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

Comments: 12 pages

arXiv:2012.08899 [pdf, other]

doi 10.1063/5.0040782

Analytical Gradients for Molecular-Orbital-Based Machine Learning

Authors: Sebastian J. R. Lee, Tamara Husch, Feizhi Ding, Thomas F. Miller III

Abstract: Molecular-orbital-based machine learning (MOB-ML) enables the prediction of accurate correlation energies at the cost of obtaining molecular orbitals. Here, we present the derivation, implementation, and numerical demonstration of MOB-ML analytical nuclear gradients which are formulated in a general Lagrangian framework to enforce orthogonality, localization, and Brillouin constraints on the molecular orbitals. The MOB-ML gradient framework is general with respect to the regression technique (e.g., Gaussian process regression or neural networks) and the MOB feature design. We show that MOB-ML gradients are highly accurate compared to other ML methods on the ISO17 data set while only being trained on energies for hundreds of molecules compared to energies and gradients for hundreds of thousands of molecules for the other ML methods. The MOB-ML gradients are also shown to yield accurate optimized structures, at a computational cost for the gradient evaluation that is comparable to Hartree-Fock theory or hybrid DFT.

Submitted 16 December, 2020; originally announced December 2020.

arXiv:2011.10702 [pdf, other]

CancerNet-SCa: Tailored Deep Neural Network Designs for Detection of Skin Cancer from Dermoscopy Images

Authors: James Ren Hou Lee, Maya Pavlova, Mahmoud Famouri, Alexander Wong

Abstract: Skin cancer continues to be the most frequently diagnosed form of cancer in the U.S., with not only significant effects on health and well-being but also significant economic costs associated with treatment. A crucial step to the treatment and management of skin cancer is effective skin cancer detection due to strong prognosis when treated at an early stage, with one of the key screening approaches being dermoscopy examination. Motivated by the advances of deep learning and inspired by the open source initiatives in the research community, in this study we introduce CancerNet-SCa, a suite of deep neural network designs tailored for the detection of skin cancer from dermoscopy images that is open source and available to the general public as part of the Cancer-Net initiative. To the best of the authors' knowledge, CancerNet-SCa comprises of the first machine-designed deep neural network architecture designs tailored specifically for skin cancer detection, one of which possessing a self-attention architecture design with attention condensers. Furthermore, we investigate and audit the behaviour of CancerNet-SCa in a responsible and transparent manner via explainability-driven model auditing. While CancerNet-SCa is not a production-ready screening solution, the hope is that the release of CancerNet-SCa in open source, open access form will encourage researchers, clinicians, and citizen data scientists alike to leverage and build upon them.

Submitted 20 November, 2020; originally announced November 2020.

Comments: 8 pages

arXiv:2010.11884 [pdf]

AEGIS: A real-time multimodal augmented reality computer vision based system to assist facial expression recognition for individuals with autism spectrum disorder

Authors: James Ren Hou Lee, Alexander Wong

Abstract: The ability to interpret social cues comes naturally for most people, but for those living with Autism Spectrum Disorder (ASD), some experience a deficiency in this area. This paper presents the development of a multimodal augmented reality (AR) system which combines the use of computer vision and deep convolutional neural networks (CNN) in order to assist individuals with the detection and interpretation of facial expressions in social settings. The proposed system, which we call AEGIS (Augmented-reality Expression Guided Interpretation System), is an assistive technology deployable on a variety of user devices including tablets, smartphones, video conference systems, or smartglasses, showcasing its extreme flexibility and wide range of use cases, to allow integration into daily life with ease. Given a streaming video camera source, each real-world frame is passed into AEGIS, processed for facial bounding boxes, and then fed into our novel deep convolutional time windowed neural network (TimeConvNet). We leverage both spatial and temporal information in order to provide an accurate expression prediction, which is then converted into its corresponding visualization and drawn on top of the original video frame. The system runs in real-time, requires minimal set up and is simple to use. With the use of AEGIS, we can assist individuals living with ASD to learn to better identify expressions and thus improve their social experiences.

Submitted 22 October, 2020; originally announced October 2020.

Comments: 4 pages, 1 figure

arXiv:2010.03626 [pdf, other]

doi 10.1063/5.0032362

Improved accuracy and transferability of molecular-orbital-based machine learning: Organics, transition-metal complexes, non-covalent interactions, and transition states

Authors: Tamara Husch, Jiace Sun, Lixue Cheng, Sebastian J. R. Lee, Thomas F. Miller III

Abstract: Molecular-orbital-based machine learning (MOB-ML) provides a general framework for the prediction of accurate correlation energies at the cost of obtaining molecular orbitals. We demonstrate the importance of preserving physical constraints, including invariance conditions and size consistency, when generating the input for the machine learning model. Numerical improvements are demonstrated for different data sets covering total and relative energies for thermally accessible organic and transition-metal containing molecules, non-covalent interactions, and transition-state energies. MOB-ML requires training data from only 1% of the QM7b-T data set (i.e., only 70 organic molecules with seven and fewer heavy atoms) to predict the total energy of the remaining 99% of this data set with sub-kcal/mol accuracy. This MOB-ML model is significantly more accurate than other methods when transferred to a data set comprised of thirteen heavy atom molecules, exhibiting no loss of accuracy on a size intensive (i.e., per-electron) basis. It is shown that MOB-ML also works well for extrapolating to transition-state structures, predicting the barrier region for malonaldehyde intramolecular proton-transfer to within 0.35 kcal/mol when only trained on reactant/product-like structures. Finally, the use of the Gaussian process variance enables an active learning strategy for extending MOB-ML model to new regions of chemical space with minimal effort. We demonstrate this active learning strategy by extending a QM7b-T model to describe non-covalent interactions in the protein backbone-backbone interaction data set to an accuracy of 0.28 kcal/mol.

Submitted 16 October, 2020; v1 submitted 7 October, 2020; originally announced October 2020.

arXiv:2007.06548 [pdf, other]

Relations between scaling exponents in unimodular random graphs

Authors: James R. Lee

Abstract: We investigate the validity of the "Einstein relations" in the general setting of unimodular random networks. These are equalities relating scaling exponents: $d_w = d_f + \tilde{ζ}$ and $d_s = 2 d_f/d_w$, where $d_w$ is the walk dimension, $d_f$ is the fractal dimension, $d_s$ is the spectral dimension, and $\tilde{ζ}$ is the resistance exponent. Roughly speaking, this relates the mean displacement and return probability of a random walker to the density and conductivity of the underlying medium. We show that if $d_f$ and $\tilde{ζ} \geq 0$ exist, then $d_w$ and $d_s$ exist, and the aforementioned equalities hold. Moreover, our primary new estimate is the relation $d_w \geq d_f + \tilde{ζ}$, which is established for all $\tilde{ζ} \in \mathbb{R}$. For the uniform infinite planar triangulation (UIPT), this yields the consequence $d_w=4$ using $d_f=4$ (Angel 2003) and $\tilde{ζ}=0$ (established here as a consequence of the Liouville Quantum Gravity theory, following Gwynne-Miller 2017 and Ding-Gwynne 2020). The conclusion $d_w=4$ had been previously established by Gwynne and Hutchcroft (2018) using more elaborate methods. A new consequence is that $d_w = d_f$ for the uniform infinite Schnyder-wood decorated triangulation, implying that the simple random walk is subdiffusive, since $d_f > 2$ (Ding and Gwynne 2020). For the random walk on $\mathbb{Z}^2$ driven by conductances from an exponentiated Gaussian free field with exponent $γ > 0$, one has $d_f = d_f(γ)$ and $\tilde{ζ}=0$ (Biskup, Ding, and Goswami 2020). This yields $d_s=2$ and $d_w = d_f$, confirming two predictions of those authors.
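As a quick check of the last example, substituting a resistance exponent of zero into the two Einstein relations quoted in the abstract reproduces both stated conclusions. This is only arithmetic on the abstract's own formulas, not a summary of the paper's argument:

```latex
\[
d_w = d_f + \tilde{\zeta} = d_f + 0 = d_f,
\qquad
d_s = \frac{2 d_f}{d_w} = \frac{2 d_f}{d_f} = 2 .
\]
```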

Submitted 6 May, 2021; v1 submitted 13 July, 2020; originally announced July 2020.

Comments: 35 pages, 2 figures; updated references

arXiv:2006.15759 [pdf, other]

EmotionNet Nano: An Efficient Deep Convolutional Neural Network Design for Real-time Facial Expression Recognition

Authors: James Ren Hou Lee, Linda Wang, Alexander Wong

Abstract: While recent advances in deep learning have led to significant improvements in facial expression classification (FEC), a major challenge that remains a bottleneck for the widespread deployment of such systems is their high architectural and computational complexities. This is especially challenging given the operational requirements of various FEC applications, such as safety, marketing, learning, and assistive living, where real-time requirements on low-cost embedded devices is desired. Motivated by this need for a compact, low latency, yet accurate system capable of performing FEC in real-time on low-cost embedded devices, this study proposes EmotionNet Nano, an efficient deep convolutional neural network created through a human-machine collaborative design strategy, where human experience is combined with machine meticulousness and speed in order to craft a deep neural network design catered towards real-time embedded usage. Two different variants of EmotionNet Nano are presented, each with a different trade-off between architectural and computational complexity and accuracy. Experimental results using the CK+ facial expression benchmark dataset demonstrate that the proposed EmotionNet Nano networks demonstrated accuracies comparable to state-of-the-art in FEC networks, while requiring significantly fewer parameters (e.g., 23$\times$ fewer at a higher accuracy). Furthermore, we demonstrate that the proposed EmotionNet Nano networks achieved real-time inference speeds (e.g. $>25$ FPS and $>70$ FPS at 15W and 30W, respectively) and high energy efficiency (e.g. $>1.7$ images/sec/watt at 15W) on an ARM embedded processor, thus further illustrating the efficacy of EmotionNet Nano for deployment on embedded devices.

Submitted 28 June, 2020; originally announced June 2020.

Comments: 9 pages

arXiv:2005.08934 [pdf, other]

Chemical subdiffusivity of critical 2D percolation

Authors: Shirshendu Ganguly, James R. Lee

Abstract: We show that random walk on the incipient infinite cluster (IIC) of two-dimensional critical percolation is subdiffusive in the chemical distance (i.e., in the intrinsic graph metric). Kesten (1986) famously showed that this is true for the Euclidean distance, but it is known that the chemical distance is typically asymptotically larger. More generally, we show that subdiffusivity in the chemical distance holds for stationary random graphs of polynomial volume growth, as long as there is a multi-scale way of covering the graph so that "deep patches" have "thin backbones". Our estimates are quantitative and give explicit bounds in terms of the one and two-arm exponents $η_2 > η_1 > 0$: For $d$-dimensional models, the mean chemical displacement after $T$ steps of random walk scales asymptotically slower than $T^{1/β}$, whenever \[ β< 2 + \frac{η_2-η_1}{d-η_1}\,. \] Using the conjectured values of $η_2 = η_1 + 1/4$ and $η_1 = 5/48$ for 2D lattices, the latter quantity is $2+12/91$.
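The closing value can be verified by direct substitution. The short calculation below plugs the conjectured exponents quoted in the abstract, together with $d = 2$, into the stated bound; it is a worked check of the arithmetic only:

```latex
\[
\eta_2 - \eta_1 = \frac{1}{4},
\qquad
d - \eta_1 = 2 - \frac{5}{48} = \frac{91}{48},
\qquad
2 + \frac{\eta_2 - \eta_1}{d - \eta_1}
  = 2 + \frac{1}{4}\cdot\frac{48}{91}
  = 2 + \frac{12}{91} .
\]
```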

Submitted 22 July, 2021; v1 submitted 18 May, 2020; originally announced May 2020.

Comments: 16 pages, 2 figures

arXiv:2005.03139 [pdf, other]

On planar graphs of uniform polynomial growth

Authors: Farzam Ebrahimnejad, James R. Lee

Abstract: Consider an infinite planar graph with uniform polynomial growth of degree d > 2. Many examples of such graphs exhibit similar geometric and spectral properties, and it has been conjectured that this is necessary. We present a family of counterexamples. In particular, we show that for every rational d > 2, there is a planar graph with uniform polynomial growth of degree d on which the random walk is transient, disproving a conjecture of Benjamini (2011). By a well-known theorem of Benjamini and Schramm, such a graph cannot be a unimodular random graph. We also give examples of unimodular random planar graphs of uniform polynomial growth with unexpected properties. For instance, graphs of (almost sure) uniform polynomial growth of every rational degree d > 2 for which the speed exponent of the walk is larger than 1/d, and in which the complements of all balls are connected. This resolves negatively two questions of Benjamini and Papasoglou (2011).

Submitted 9 March, 2021; v1 submitted 6 May, 2020; originally announced May 2020.

Comments: 26 pages, 6 figures

arXiv:2004.01157 [pdf, ps, other]

Identification Methods With Arbitrary Interventional Distributions as Inputs

Authors: Jaron J. R. Lee, Ilya Shpitser

Abstract: Causal inference quantifies cause-effect relationships by estimating counterfactual parameters from data. This entails using \emph{identification theory} to establish a link between counterfactual parameters of interest and distributions from which data is available. A line of work characterized non-parametric identification for a wide variety of causal parameters in terms of the \emph{observed data distribution}. More recently, identification results have been extended to settings where experimental data from interventional distributions is also available. In this paper, we use Single World Intervention Graphs and a nested factorization of models associated with mixed graphs to give a very simple view of existing identification theory for experimental data. We use this view to yield general identification algorithms for settings where the input distributions consist of an arbitrary set of observational and experimental distributions, including marginal and conditional distributions. We show that for problems where inputs are interventional marginal distributions of a certain type (ancestral marginals), our algorithm is complete.

Submitted 15 April, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

arXiv:2003.01791 [pdf, other]

TimeConvNets: A Deep Time Windowed Convolution Neural Network Design for Real-time Video Facial Expression Recognition

Authors: James Ren Hou Lee, Alexander Wong

Abstract: A core challenge faced by the majority of individuals with Autism Spectrum Disorder (ASD) is an impaired ability to infer other people's emotions based on their facial expressions. With significant recent advances in machine learning, one potential approach to leveraging technology to assist such individuals to better recognize facial expressions and reduce the risk of possible loneliness and depression due to social isolation is the design of computer vision-driven facial expression recognition systems. Motivated by this social need as well as the low latency requirement of such systems, this study explores a novel deep time windowed convolutional neural network design (TimeConvNets) for the purpose of real-time video facial expression recognition. More specifically, we explore an efficient convolutional deep neural network design for spatiotemporal encoding of time windowed video frame sub-sequences and study the respective balance between speed and accuracy. Furthermore, to evaluate the proposed TimeConvNet design, we introduce a more difficult dataset called BigFaceX, composed of a modified aggregation of the extended Cohn-Kanade (CK+), BAUM-1, and the eNTERFACE public datasets. Different variants of the proposed TimeConvNet design with different backbone network architectures were evaluated using BigFaceX alongside other network designs for capturing spatiotemporal information, and experimental results demonstrate that TimeConvNets can better capture the transient nuances of facial expressions and boost classification accuracy while maintaining a low inference time.

Submitted 3 March, 2020; originally announced March 2020.

Comments: 8 pages, 3 figures

arXiv:1906.04270 [pdf, ps, other]

Pure entropic regularization for metrical task systems

Authors: Christian Coester, James R. Lee

Abstract: We show that on every $n$-point HST metric, there is a randomized online algorithm for metrical task systems (MTS) that is $1$-competitive for service costs and $O(\log n)$-competitive for movement costs. In general, these refined guarantees are optimal up to the implicit constant. While an $O(\log n)$-competitive algorithm for MTS on HST metrics was developed by Bubeck et al. (SODA 2019), that approach could only establish an $O((\log n)^2)$-competitive ratio when the service costs are required to be $O(1)$-competitive. Our algorithm can be viewed as an instantiation of online mirror descent with the regularizer derived from a multiscale conditional entropy. In fact, our algorithm satisfies a set of even more refined guarantees; we are able to exploit this property to combine it with known random embedding theorems and obtain, for any $n$-point metric space, a randomized algorithm that is $1$-competitive for service costs and $O((\log n)^2)$-competitive for movement costs.

Submitted 3 September, 2020; v1 submitted 10 June, 2019; originally announced June 2019.

Comments: COLT 2019

arXiv:1903.05830 [pdf, other]

doi 10.1063/1.5109882

Analytical Gradients for Projection-Based Wavefunction-in-DFT Embedding

Authors: Sebastian J. R. Lee, Feizhi Ding, Frederick R. Manby, Thomas F. Miller III

Abstract: Projection-based embedding provides a simple, robust, and accurate approach for describing a small part of a chemical system at the level of a correlated wavefunction method while the remainder of the system is described at the level of density functional theory. Here, we present the derivation, implementation, and numerical demonstration of analytical nuclear gradients for projection-based wavefunction-in-density functional theory (WF-in-DFT) embedding. The gradients are formulated in the Lagrangian framework to enforce orthogonality, localization, and Brillouin constraints on the molecular orbitals. An important aspect of the gradient theory is that WF contributions to the total WF-in-DFT gradient can be simply evaluated using existing WF gradient implementations without modification. Another simplifying aspect is that Kohn-Sham (KS) DFT contributions to the projection-based embedding gradient do not require knowledge of the WF calculation beyond the relaxed WF density. Projection-based WF-in-DFT embedding gradients are thus easily generalized to any combination of WF and KS-DFT methods. We provide numerical demonstration of the method for several applications, including calculation of a minimum energy pathway for a hydride transfer in a cobalt-based molecular catalyst using the nudged-elastic-band method at the CCSD-in-DFT level of theory, which reveals large differences from the transition state geometry predicted using DFT.

Submitted 19 August, 2019; v1 submitted 14 March, 2019; originally announced March 2019.

Comments: 15 pages, 4 figures

Journal ref: J. Chem. Phys. 151, 064112 (2019)

arXiv:1811.02685 [pdf, other]

Flow-Cut Gaps and Face Covers in Planar Graphs

Authors: Robert Krauthgamer, James R. Lee, Havana Rika

Abstract: The relationship between the sparsest cut and the maximum concurrent multi-flow in graphs has been studied extensively. For general graphs with $k$ terminal pairs, the flow-cut gap is $O(\log k)$, and this is tight. But when topological restrictions are placed on the flow network, the situation is far less clear. In particular, it has been conjectured that the flow-cut gap in planar networks is $O(1)$, while the known bounds place the gap somewhere between $2$ (Lee and Raghavendra, 2003) and $O(\sqrt{\log k})$ (Rao, 1999). A seminal result of Okamura and Seymour (1981) shows that when all the terminals of a planar network lie on a single face, the flow-cut gap is exactly $1$. This setting can be generalized by considering planar networks where the terminals lie on $γ > 1$ faces in some fixed planar drawing. Lee and Sidiropoulos (2009) proved that the flow-cut gap is bounded by a function of $γ$, and Chekuri, Shepherd, and Weibel (2013) showed that the gap is at most $3γ$. We prove that the flow-cut gap is $O(\log γ)$, by showing that the edge-weighted shortest-path metric induced on the terminals admits a stochastic embedding into trees with distortion $O(\log γ)$, which is tight. The preceding results refer to the setting of edge-capacitated networks. For vertex-capacitated networks, it can be significantly more challenging to control flow-cut gaps. While there is no exact vertex-capacitated version of the Okamura-Seymour Theorem, an approximate version holds; Lee, Mendel, and Moharrami (2015) showed that the vertex-capacitated flow-cut gap is $O(1)$ on planar networks whose terminals lie on a single face. We prove that the flow-cut gap is $O(γ)$ for vertex-capacitated instances when the terminals lie on at most $γ$ faces. In fact, this result holds in the more general setting of submodular vertex capacities.

Submitted 6 November, 2018; originally announced November 2018.

arXiv:1807.04404 [pdf, ps, other]

Metrical task systems on trees via mirror descent and unfair gluing

Authors: Sébastien Bubeck, Michael B. Cohen, James R. Lee, Yin Tat Lee

Abstract: We consider metrical task systems on tree metrics, and present an $O(\mathrm{depth} \times \log n)$-competitive randomized algorithm based on the mirror descent framework introduced in our prior work on the $k$-server problem. For the special case of hierarchically separated trees (HSTs), we use mirror descent to refine the standard approach based on gluing unfair metrical task systems. This yields an $O(\log n)$-competitive algorithm for HSTs, thus removing an extraneous $\log\log n$ in the bound of Fiat and Mendel (2003). Combined with well-known HST embedding theorems, this also gives an $O((\log n)^2)$-competitive randomized algorithm for every $n$-point metric space.

Submitted 25 November, 2020; v1 submitted 11 July, 2018; originally announced July 2018.

arXiv:1711.01789

Fusible HSTs and the randomized k-server conjecture

Authors: James R. Lee

Abstract: We exhibit an $O((\log k)^6)$-competitive randomized algorithm for the $k$-server problem on any metric space. It is shown that a potential-based algorithm for the fractional $k$-server problem on hierarchically separated trees (HSTs) with competitive ratio $f(k)$ can be used to obtain a randomized algorithm for any metric space with competitive ratio $f(k)^2 O((\log k)^2)$. Employing the $O((\log k)^2)$-competitive algorithm for HSTs from our joint work with Bubeck, Cohen, Lee, and Mądry (2017) yields the claimed bound. The best previous result independent of the geometry of the underlying metric space is the $2k-1$ competitive ratio established for the deterministic work function algorithm by Koutsoupias and Papadimitriou (1995). Even for the special case when the underlying metric space is the real line, the best known competitive ratio was $k$. Since deterministic algorithms can do no better than $k$ on any metric space with at least $k+1$ points, this establishes that for every metric space on which the problem is non-trivial, randomized algorithms give an exponential improvement over deterministic algorithms.

Submitted 28 July, 2021; v1 submitted 6 November, 2017; originally announced November 2017.

Comments: There is a gap in the argument in Section 5.3.2 that requires a substantial revision to correct. See the author's web page for up to date information

arXiv:1711.01085 [pdf, ps, other]

k-server via multiscale entropic regularization

Authors: Sebastien Bubeck, Michael B. Cohen, James R. Lee, Yin Tat Lee, Aleksander Madry

Abstract: We present an $O((\log k)^2)$-competitive randomized algorithm for the $k$-server problem on hierarchically separated trees (HSTs). This is the first $o(k)$-competitive randomized algorithm for which the competitive ratio is independent of the size of the underlying HST. Our algorithm is designed in the framework of online mirror descent where the mirror map is a multiscale entropy. When combined with Bartal's static HST embedding reduction, this leads to an $O((\log k)^2 \log n)$-competitive algorithm on any $n$-point metric space. We give a new dynamic HST embedding that yields an $O((\log k)^3 \log Δ)$-competitive algorithm on any metric space where the ratio of the largest to smallest non-zero distance is at most $Δ$.

Submitted 3 November, 2017; originally announced November 2017.

arXiv:1701.07227 [pdf, other]

Discrete uniformizing metrics on distributional limits of sphere packings

Authors: James R. Lee

Abstract: Suppose that $\{G_n\}$ is a sequence of finite graphs such that each $G_n$ is the tangency graph of a sphere packing in $\mathbb{R}^d$. Let $ρ_n$ be a uniformly random vertex of $G_n$ and suppose that $(G,ρ)$ is the distributional limit of $\{(G_n,ρ_n)\}$ in the sense of Benjamini and Schramm. Then the conformal growth exponent of $(G,ρ)$ is at most $d$. In other words, there exists a unimodular "unit volume" weighting of the graph metric on $(G,ρ)$ such that the volume growth of balls in the weighted path metric is bounded by a polynomial of degree $d$. This generalizes to limits of graphs that can be "coarsely" packed in an Ahlfors $d$-regular metric measure space. Using our previous work, this implies that, under moment conditions on the degree of the root $ρ$, the almost sure spectral dimension of $G$ is at most $d$. This fact was known previously only for graphs packed in $\mathbb{R}^2$ (planar graphs), and the case of $d > 2$ eluded approaches based on extremal length. In the process of bounding the spectral dimension, we establish that the spectral measure of $(G,ρ)$ is dominated by a variant of the $d$-dimensional Weyl law.

Submitted 11 February, 2018; v1 submitted 25 January, 2017; originally announced January 2017.

arXiv:1701.01598 [pdf, other]

Conformal growth rates and spectral geometry on distributional limits of graphs

Authors: James R. Lee

Abstract: For a unimodular random graph $(G,ρ)$, we consider deformations of its intrinsic path metric by a (random) weighting of its vertices. This leads to the notion of the conformal growth exponent of $(G,ρ)$, which is the best asymptotic degree of volume growth of balls that can be achieved by such a reweighting. Under moment conditions on the degree of the root, we show that the conformal growth exponent of a unimodular random graph bounds its almost sure spectral dimension. This has interesting consequences for many low-dimensional models. The consequences in dimension two are particularly strong. It establishes that models like the uniform infinite planar triangulation (UIPT) and quadrangulation (UIPQ) almost surely have spectral dimension at most two. It also establishes a conjecture of Benjamini and Schramm (2001) by extending their Recurrence Theorem from planar graphs to arbitrary families of $H$-minor free graphs. More generally, it strengthens the work of Gurel-Gurevich and Nachmias (2013) who established recurrence for distributional limits of planar graphs when the degree of the root has exponential tails. We further present a general method for proving subdiffusivity of the random walk on a large class of models, including UIPT and UIPQ, using only the volume growth profile of balls in the intrinsic metric.

Submitted 1 June, 2020; v1 submitted 6 January, 2017; originally announced January 2017.

Comments: Addressed referee comments

arXiv:1609.04040 [pdf, other]

Diffusive estimates for random walks on stationary random graphs of polynomial growth

Authors: Shirshendu Ganguly, James R. Lee, Yuval Peres

Abstract: Let $(G,ρ)$ be a stationary random graph, and use $B^G_ρ(r)$ to denote the ball of radius $r$ about $ρ$ in $G$. Suppose that $(G,ρ)$ has annealed polynomial growth, in the sense that $\mathbb{E}[|B^G_ρ(r)|] \leq O(r^k)$ for some $k > 0$ and every $r \geq 1$. Then there is an infinite sequence of times $\{t_n\}$ at which the random walk $\{X_t\}$ on $(G,ρ)$ is at most diffusive: Almost surely (over the choice of $(G,ρ)$), there is a number $C > 0$ such that \[ \mathbb{E} \left[\mathrm{dist}_G(X_0, X_{t_n})^2 \mid X_0 = ρ, (G,ρ)\right]\leq C t_n\qquad \forall n \geq 1\,. \] This result is new even in the case when $G$ is a stationary random subgraph of $\mathbb{Z}^d$. Combined with the work of Benjamini, Duminil-Copin, Kozma, and Yadin (2015), it implies that $G$ almost surely does not admit a non-constant harmonic function of sublinear growth. To complement this, we argue that passing to a subsequence of times $\{t_n\}$ is necessary, as there are stationary random graphs of (almost sure) polynomial growth where the random walk is almost surely superdiffusive at an infinite subset of times.

Submitted 13 September, 2016; originally announced September 2016.

arXiv:1608.01612 [pdf, other]

Separators in region intersection graphs

Authors: James R. Lee

Abstract: For undirected graphs $G=(V,E)$ and $G_0=(V_0,E_0)$, say that $G$ is a region intersection graph over $G_0$ if there is a family of connected subsets $\{ R_u \subseteq V_0 : u \in V \}$ of $G_0$ such that $\{u,v\} \in E \iff R_u \cap R_v \neq \emptyset$. We show if $G_0$ excludes the complete graph $K_h$ as a minor for some $h \geq 1$, then every region intersection graph $G$ over $G_0$ with $m$ edges has a balanced separator with at most $c_h \sqrt{m}$ nodes, where $c_h$ is a constant depending only on $h$. If $G$ additionally has uniformly bounded vertex degrees, then such a separator is found by spectral partitioning. A string graph is the intersection graph of continuous arcs in the plane. The preceding result implies that every string graph with $m$ edges has a balanced separator of size $O(\sqrt{m})$. This bound is optimal, as it generalizes the planar separator theorem. It confirms a conjecture of Fox and Pach (2010), and improves over the $O(\sqrt{m} \log m)$ bound of Matousek (2013).

Submitted 27 July, 2017; v1 submitted 4 August, 2016; originally announced August 2016.

Comments: Minor fixes; references added

MSC Class: 05C; 52C

arXiv:1604.06859 [pdf, ps, other]

Transport-entropy inequalities and curvature in discrete-space Markov chains

Authors: Ronen Eldan, James R. Lee, Joseph Lehec

Abstract: We show that if the random walk on a graph has positive coarse Ricci curvature in the sense of Ollivier, then the stationary measure satisfies a W^1 transport-entropy inequality. Peres and Tetali have conjectured a stronger consequence, that a modified log-Sobolev inequality (MLSI) should hold, in analogy with the setting of Markov diffusions. We discuss how our entropy interpolation approach suggests a natural attack on the MLSI conjecture.

Submitted 27 December, 2016; v1 submitted 23 April, 2016; originally announced April 2016.

Comments: To appear in "A Journey through Discrete Mathematics. A Tribute to Jiri Matousek"

arXiv:1511.03192 [pdf]

doi 10.1103/PhysRevB.92.174421

Strongly Coupled Electronic, Magnetic, and Lattice Degrees of Freedom in LaCo5 under Pressure

Authors: Ryan L. Stillwell, Jason R. Jeffries, Scott K. McCall, Jonathan R. I. Lee, Samuel T. Weir, Yogesh K. Vohra

Abstract: We have performed the first high-pressure magnetotransport and x-ray diffraction measurements on ferromagnetic LaCo5, confirming the theoretically predicted electronic topological transition driving the magneto-elastic collapse seen in the related compound YCo5. Our x-ray diffraction results show an anisotropic lattice collapse of the c-axis near 10 GPa that is also commensurate with a change in the majority charge carriers evident from high-pressure Hall effect measurements. The coupling of the electronic, magnetic and lattice degrees of freedom is further substantiated by the evolution of the anomalous Hall effect, which couples to the magnetization of the ordered state of LaCo5.

Submitted 10 November, 2015; originally announced November 2015.

Comments: 11 pages, 3 figures in main text with appendix of 5 pages and 3 figures. Submitted to PRB

arXiv:1508.07109 [pdf, ps, other]

Covering the large spectrum and generalized Riesz products

Authors: James R. Lee

Abstract: Chang's Lemma is a widely employed result in additive combinatorics. It gives bounds on the dimension of the large spectrum of probability distributions on finite abelian groups. Recently, Bloom (2016) presented a powerful variant of Chang's Lemma that yields the strongest known quantitative version of Roth's theorem on 3-term arithmetic progressions in dense subsets of the integers. In this note, we show how such theorems can be derived from the approximation of probability measures via entropy maximization.

Submitted 28 December, 2016; v1 submitted 28 August, 2015; originally announced August 2015.

arXiv:1411.6317 [pdf, ps, other]

Lower bounds on the size of semidefinite programming relaxations

Authors: James R. Lee, Prasad Raghavendra, David Steurer

Abstract: We introduce a method for proving lower bounds on the efficacy of semidefinite programming (SDP) relaxations for combinatorial problems. In particular, we show that the cut, TSP, and stable set polytopes on $n$-vertex graphs are not the linear image of the feasible region of any SDP (i.e., any spectrahedron) of dimension less than $2^{n^c}$, for some constant $c > 0$. This result yields the first super-polynomial lower bounds on the semidefinite extension complexity of any explicit family of polytopes. Our results follow from a general technique for proving lower bounds on the positive semidefinite rank of a matrix. To this end, we establish a close connection between arbitrary SDPs and those arising from the sum-of-squares SDP hierarchy. For approximating maximum constraint satisfaction problems, we prove that SDPs of polynomial-size are equivalent in power to those arising from degree-$O(1)$ sum-of-squares relaxations. This result implies, for instance, that no family of polynomial-size SDP relaxations can achieve better than a 7/8-approximation for MAX-3-SAT.

Submitted 23 November, 2014; originally announced November 2014.

arXiv:1410.3887 [pdf, other]

doi 10.1215/00127094-2017-0048

Regularization under diffusion and anti-concentration of the information content

Authors: Ronen Eldan, James R. Lee

Abstract: Under the Ornstein-Uhlenbeck semigroup $\{U_t\}$, any non-negative measurable $f : \mathbb R^n \to \mathbb R_+$ exhibits a uniform tail bound better than that implied by Markov's inequality and conservation of mass: For every $α\geq e^3$, and $t > 0$, \[ γ_n\left(\left\{x \in \mathbb R^n : U_t f(x) > α\int f\,dγ_n\right\}\right) \leq C(t) \frac{1}α \sqrt{\frac{\log \log α}{\log α}}\] where $γ_n$ is the $n$-dimensional Gaussian measure and $C(t)$ is a constant depending only on $t$. This confirms positively the Gaussian limiting case of Talagrand's convolution conjecture (1989). This is shown to follow from a more general phenomenon. Suppose that $f : \mathbb{R}^n \to \mathbb{R}_+$ is semi-log-convex in the sense that for some $β> 0$, for all $x \in \mathbb{R}^n$, the eigenvalues of $\nabla^2 \log f(x)$ are at least $-β$. Then $f$ satisfies a tail bound asymptotically better than that implied by Markov's inequality.

Submitted 10 October, 2017; v1 submitted 14 October, 2014; originally announced October 2014.

Comments: The bound is improved and the proof has been significantly simplified

Journal ref: Duke Math. J. 167, no. 5 (2018), 969-993

arXiv:1405.5980 [pdf, ps, other]

A Gaussian upper bound for martingale small-ball probabilities

Authors: James R. Lee, Yuval Peres, Charles K. Smart

Abstract: Consider a discrete-time martingale $\{X_t\}$ taking values in a Hilbert space $\mathcal H$. We show that if for some $L \geq 1$, the bounds $\mathbb{E} \left[\|X_{t+1}-X_t\|_{\mathcal H}^2 \mid X_t\right]=1$ and $\|X_{t+1}-X_t\|_{\mathcal H} \leq L$ are satisfied for all times $t \geq 0$, then there is a constant $c = c(L)$ such that for $1 \leq R \leq \sqrt{t}$, \[\mathbb{P}(\|X_t\|_{\mathcal H} \leq R \mid X_0 = x_0) \leq c \frac{R}{\sqrt{t}} e^{-\|x_0\|_{\mathcal H}^2/(6 L^2 t)}\,.\] Following [Lee-Peres, Ann. Probab. 2013], this has applications to diffusive estimates for random walks on vertex-transitive graphs.

Submitted 9 September, 2015; v1 submitted 23 May, 2014; originally announced May 2014.

arXiv:1309.0563 [pdf, ps, other]

Approximate Constraint Satisfaction Requires Large LP Relaxations

Authors: Siu On Chan, James R. Lee, Prasad Raghavendra, David Steurer

Abstract: We prove super-polynomial lower bounds on the size of linear programming relaxations for approximation versions of constraint satisfaction problems. We show that for these problems, polynomial-sized linear programs are exactly as powerful as programs arising from a constant number of rounds of the Sherali-Adams hierarchy. In particular, any polynomial-sized linear program for Max Cut has an integrality gap of 1/2 and any such linear program for Max 3-Sat has an integrality gap of 7/8.

Submitted 8 February, 2016; v1 submitted 2 September, 2013; originally announced September 2013.

Comments: 29 pages; significant revisions, new references, simpler proofs

arXiv:1308.6702 [pdf, other]

doi 10.1145/2554797.2554816,

doi 10.1109/TIT.2020.2979704

Adversarial hypothesis testing and a quantum Stein's Lemma for restricted measurements

Authors: Fernando G. S. L. Brandao, Aram W. Harrow, James R. Lee, Yuval Peres

Abstract: Recall the classical hypothesis testing setting with two convex sets of probability distributions P and Q. One receives either n i.i.d. samples from a distribution p in P or from a distribution q in Q and wants to decide from which set the points were sampled. It is known that the optimal exponential rate at which errors decrease can be achieved by a simple maximum-likelihood ratio test which does not depend on p or q, but only on the sets P and Q. We consider an adaptive generalization of this model where the choice of p in P and q in Q can change in each sample in some way that depends arbitrarily on the previous samples. In other words, in the k'th round, an adversary, having observed all the previous samples in rounds 1,...,k-1, chooses p_k in P and q_k in Q, with the goal of confusing the hypothesis test. We prove that even in this case, the optimal exponential error rate can be achieved by a simple maximum-likelihood test that depends only on P and Q. We then show that the adversarial model has applications in hypothesis testing for quantum states using restricted measurements. For example, it can be used to study the problem of distinguishing entangled states from the set of all separable states using only measurements that can be implemented with local operations and classical communication (LOCC). The basic idea is that in our setup, the deleterious effects of entanglement can be simulated by an adaptive classical adversary.
We prove a quantum Stein's Lemma in this setting: In many circumstances, the optimal hypothesis testing rate is equal to an appropriate notion of quantum relative entropy between two states. In particular, our arguments yield an alternate proof of Li and Winter's recent strengthening of strong subadditivity for quantum relative entropy.

Submitted 9 March, 2020; v1 submitted 30 August, 2013; originally announced August 2013.

Comments: 34 pages. v4. fixes bugs in proofs and adds detail

Journal ref: Proc. of 5th ITCS, pp. 183-194 (2014), IEEE Trans. Inf. Theory, vol 66, no 8, pp. 5037-5054 (2020)

arXiv:1302.6542 [pdf, ps, other]

A lower bound on dimension reduction for trees in \ell_1

Authors: James R. Lee, Mohammad Moharrami

Abstract: There is a constant c > 0 such that for every $ε\in (0,1)$ and $n \geq 1/ε^2$, the following holds. Any mapping from the $n$-point star metric into $\ell_1^d$ with bi-Lipschitz distortion $1+ε$ requires dimension $$d \geq {c\log n\over ε^2\log (1/ε)}.$$

Submitted 27 February, 2013; v1 submitted 26 February, 2013; originally announced February 2013.

arXiv:1301.6296 [pdf, ps, other]

On expanders from the action of GL(2,Z)

Authors: James R. Lee

Abstract: Consider the undirected graph $G_n=(V_n, E_n)$ where $V_n = (Z/nZ)^2$ and $E_n$ contains an edge from $(x,y)$ to $(x+1,y)$, $(x,y+1)$, $(x+y,y)$, and $(x,y+x)$ for every $(x,y) \in V_n$. Gabber and Galil, following Margulis, gave an elementary proof that $\{G_n\}$ forms an expander family. In this note, we present a somewhat simpler proof of this fact, and demonstrate its utility by isolating a key property of the linear transformations $(x,y) -> (x+y,y), (x,y+x)$ that yields expansion. As an example, consider any invertible, integral matrix $S \in GL_2(Z)$ and let $G^S_n = (V_n, E^S_n)$ where $E^S_n$ contains, for every $(x,y) \in V_n$, an edge from $(x,y)$ to $(x+1,y)$, $(x,y+1)$, $S(x,y)$, and $S^T(x,y)$, where $S^T$ denotes the transpose of $S$. Then {G_n^S} forms an expander family if and only if a related infinite graph has positive Cheeger constant. This latter property turns out to be elementary to analyze and can be used to show that {G_n^S} are expanders precisely when the trace of S is non-zero and S is not equal to its transpose. We also present some other generalizations.

Submitted 14 January, 2024; v1 submitted 26 January, 2013; originally announced January 2013.

Comments: This is an unpublished note. It is essentially identical to v3 from 2013. I am placing it here so that it can be reliably referenced

arXiv:1209.2744 [pdf, other]

doi 10.1007/s10107-014-0810-0

A node-capacitated Okamura-Seymour theorem

Authors: James R. Lee, Manor Mendel, Mohammad Moharrami

Abstract: The classical Okamura-Seymour theorem states that for an edge-capacitated, multi-commodity flow instance in which all terminals lie on a single face of a planar graph, there exists a feasible concurrent flow if and only if the cut conditions are satisfied. Simple examples show that a similar theorem is impossible in the node-capacitated setting. Nevertheless, we prove that an approximate flow/cut theorem does hold: For some universal c > 0, if the node cut conditions are satisfied, then one can simultaneously route a c-fraction of all the demands. This answers an open question of Chekuri and Kawarabayashi. More generally, we show that this holds in the setting of multi-commodity polymatroid networks introduced by Chekuri et al. Our approach employs a new type of random metric embedding in order to round the convex programs corresponding to these more general flow problems.

Submitted 12 September, 2012; originally announced September 2012.

Comments: 30 pages, 5 figures

arXiv:1208.6088 [pdf, other]

Markov type and threshold embeddings

Authors: Jian Ding, James R. Lee, Yuval Peres

Abstract: For two metric spaces X and Y, say that X threshold-embeds into Y if there exist a number K > 0 and a family of Lipschitz maps $\{ f_τ : X \to Y : τ> 0 \}$ such that for every $x,y \in X$, \[ d_X(x,y) \geq τ\implies d_Y(f_τ(x),f_τ(y)) \geq \|f_τ\|_{\mathrm{Lip}} τ/K \] where $\|f_τ\|_{\mathrm{Lip}}$ denotes the Lipschitz constant of $f_τ$. We show that if a metric space X threshold-embeds into a Hilbert space, then X has Markov type 2. As a consequence, planar graph metrics and doubling metrics have Markov type 2, answering questions of Naor, Peres, Schramm, and Sheffield. More generally, if a metric space X threshold-embeds into a p-uniformly smooth Banach space, then X has Markov type p. This suggests some non-linear analogs of Kwapien's theorem. For instance, a subset $X \subseteq L_1$ threshold-embeds into Hilbert space if and only if X has Markov type 2.

Submitted 20 September, 2013; v1 submitted 30 August, 2012; originally announced August 2012.

arXiv:1205.3980 [pdf, ps, other]

A note on mixing times of planar random walks

Authors: James R. Lee, Teng Qin

Abstract: We present an infinite family of finite planar graphs $\{X_n\}$ with degree at most five and such that for some constant $c > 0$, $$ λ_1(X_n) \geq c\left(\frac{\log \mathrm{diam}(X_n)}{\mathrm{diam}(X_n)}\right)^2\,, $$ where $λ_1$ denotes the smallest non-zero eigenvalue of the graph Laplacian. This significantly simplifies a construction of Louder and Souto. We also remark that such a lower bound cannot hold when the diameter is replaced by the average squared distance: There exists a constant $c > 0$ such that for any family $\{X_n\}$ of planar graphs we have $$ λ_1(X_n) \leq c \left(\frac{1}{|X_n|^2} \sum_{x,y \in X_n} d(x,y)^2\right)^{-1}\,, $$ where $d$ denotes the path metric on $X_n$.

Submitted 17 May, 2012; originally announced May 2012.

Showing 1–50 of 69 results for author: Lee, J R