Search | arXiv e-print repository

Analyzing Quality, Bias, and Performance in Text-to-Image Generative Models

Authors: Nila Masrourisaadat, Nazanin Sedaghatkish, Fatemeh Sarshartehrani, Edward A. Fox

Abstract: Advances in generative models have led to significant interest in image synthesis, demonstrating the ability to generate high-quality images for a diverse range of text prompts. Despite this progress, most studies ignore the presence of bias. In this paper, we examine several text-to-image models not only by qualitatively assessing their performance in generating accurate images of human faces, groups, and specified numbers of objects but also by presenting a social bias analysis. As expected, models with larger capacity generate higher-quality images. However, we also document the inherent gender or social biases these models possess, offering a more complete understanding of their impact and limitations.

Submitted 28 June, 2024; originally announced July 2024.

Comments: 20 pages, 8 figures

ACM Class: I.2.6; I.2.10; I.2.7; I.4.10

arXiv:2405.18579 [pdf]

doi 10.1145/3663384.3663407

Public Technologies Transforming Work of the Public and the Public Sector

Authors: Seyun Kim, Bonnie Fan, Willa Yunqi Yang, Jessie Ramey, Sarah E Fox, Haiyi Zhu, John Zimmerman, Motahhare Eslami

Abstract: Technologies adopted by the public sector have transformed the work practices of employees in public agencies by creating different means of communication and decision-making. Although much of the recent research in the future of work domain has concentrated on the effects of technological advancements on public sector employees, the influence on work practices of external stakeholders engaging with this sector remains under-explored. In this paper, we focus on a digital platform called OneStop which is deployed by several building departments across the U.S. and aims to integrate various steps and services into a single point of online contact between public sector employees and the public. Drawing on semi-structured interviews with 22 stakeholders, including local business owners, experts involved in the construction process, community representatives, and building department employees, we investigate how this technology transition has impacted the work of these different stakeholders. We observe a multifaceted perspective and experience caused by the adoption of OneStop. OneStop exacerbated inequitable practices for local business owners due to a lack of face-to-face interactions with the department employees. For the public sector employees, OneStop standardized the work practices, representing the building department's priorities and values. Based on our findings, we discuss tensions around standardization, equality, and equity in technology transition, as well as design implications for equitable practices in the public sector.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2404.14644 [pdf, other]

Identifying sparse treatment effects

Authors: Yu** Jeong, Emily Fox, Ramesh Johari

Abstract: Based on technological advances in sensing modalities, randomized trials with primary outcomes represented as high-dimensional vectors have become increasingly prevalent. For example, these outcomes could be week-long time-series data from wearable devices or high-dimensional neuroimaging data, such as from functional magnetic resonance imaging. This paper focuses on randomized treatment studies with such high-dimensional outcomes characterized by sparse treatment effects, where interventions may influence a small number of dimensions, e.g., small temporal windows or specific brain regions. Conventional practices, such as using fixed, low-dimensional summaries of the outcomes, result in significantly reduced power for detecting treatment effects. To address this limitation, we propose a procedure that involves subset selection followed by inference. Specifically, given a potentially large set of outcome summaries, we identify the subset that captures treatment effects, which requires only one call to the Lasso, and subsequently conduct inference on the selected subset. Via theoretical analysis as well as simulations, we demonstrate that our method asymptotically selects the correct subset and increases statistical power.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.05050 [pdf, other]

Co-design Accessible Public Robots: Insights from People with Mobility Disability, Robotic Practitioners and Their Collaborations

Authors: Howard Ziyu Han, Franklin Mingzhe Li, Alesandra Baca Vazquez, Daragh Byrne, Nikolas Martelaro, Sarah E Fox

Abstract: Sidewalk robots are increasingly common across the globe. Yet, their operation on public paths poses challenges for people with mobility disabilities (PwMD) who face barriers to accessibility, such as insufficient curb cuts. We interviewed 15 PwMD to understand how they perceive sidewalk robots. Findings indicated that PwMD feel they have to compete for space on the sidewalk when robots are introduced. We next interviewed eight robotics practitioners to learn about their attitudes towards accessibility. Practitioners described how issues often stem from robotic companies addressing accessibility only after problems arise. Both interview groups underscored the importance of integrating accessibility from the outset. Building on this finding, we held four co-design workshops with PwMD and practitioners in pairs. These convenings brought to bear accessibility needs around robots operating in public spaces and in the public interest. Our study aims to set the stage for a more inclusive future around public service robots.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2403.12878 [pdf, other]

Fréchet Edit Distance

Authors: Emily Fox, Amir Nayyeri, Jonathan James Perry, Benjamin Raichel

Abstract: We define and investigate the Fréchet edit distance problem. Given two polygonal curves $π$ and $σ$ and a threshold value $δ>0$, we seek the minimum number of edits to $σ$ such that the Fréchet distance between the edited $σ$ and $π$ is at most $δ$. For the edit operations we consider three cases, namely, deletion of vertices, insertion of vertices, or both. For this basic problem we consider a number of variants. Specifically, we provide polynomial time algorithms for both discrete and continuous Fréchet edit distance variants, as well as hardness results for weak Fréchet edit distance variants.
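To make the problem statement concrete, the sketch below computes the standard discrete Fréchet distance by dynamic programming and then finds the minimum number of vertex deletions by exhaustive search. This is a brute-force illustration only, not the polynomial-time algorithms the abstract refers to. The function names, the deletion-only setting, and the assumption that any vertex (including endpoints) may be deleted as long as one vertex remains are all choices made for this example.

```python
from itertools import combinations
from math import dist


def discrete_frechet(p, q):
    """Discrete Fréchet distance between vertex sequences p and q."""
    n, m = len(p), len(q)
    # d[i][j] holds the discrete Fréchet distance between p[:i+1] and q[:j+1].
    d = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = dist(p[i], q[j])
            if i == 0 and j == 0:
                d[i][j] = cost
            elif i == 0:
                d[i][j] = max(d[0][j - 1], cost)
            elif j == 0:
                d[i][j] = max(d[i - 1][0], cost)
            else:
                best_prev = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
                d[i][j] = max(best_prev, cost)
    return d[n - 1][m - 1]


def min_deletions(sigma, pi, delta):
    """Fewest vertex deletions from sigma so that the discrete Fréchet
    distance to pi is at most delta (exponential-time brute force).
    Returns None if no non-empty subsequence of sigma works."""
    n = len(sigma)
    for k in range(n):  # try 0, 1, 2, ... deletions; keep at least one vertex
        for keep in combinations(range(n), n - k):
            edited = [sigma[i] for i in keep]
            if discrete_frechet(edited, pi) <= delta:
                return k
    return None


pi = [(0, 0), (1, 0), (2, 0)]
sigma = [(0, 0), (1, 5), (2, 0)]  # the middle vertex is an outlier
print(min_deletions(sigma, pi, 1.0))  # prints 1
```

Deleting the outlier vertex brings the discrete Fréchet distance down from 5 to 1, so a single edit suffices for $δ = 1$.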

Submitted 19 March, 2024; originally announced March 2024.

Comments: To appear in SoCG 2024

arXiv:2402.17879 [pdf, other]

Automated Statistical Model Discovery with Language Models

Authors: Michael Y. Li, Emily B. Fox, Noah D. Goodman

Abstract: Statistical model discovery is a challenging search over a vast space of models subject to domain-specific constraints. Efficiently searching over this space requires expertise in modeling and the problem domain. Motivated by the domain knowledge and programming capabilities of large language models (LMs), we introduce a method for language model driven automated statistical model discovery. We cast our automated procedure within the principled framework of Box's Loop: the LM iterates between proposing statistical models represented as probabilistic programs, acting as a modeler, and critiquing those models, acting as a domain expert. By leveraging LMs, we do not have to define a domain-specific language of models or design a handcrafted search procedure, which are key restrictions of previous systems. We evaluate our method in three settings in probabilistic modeling: searching within a restricted space of models, searching over an open-ended space, and improving expert models under natural language constraints (e.g., this model should be interpretable to an ecologist). Our method identifies models on par with human expert designed models and extends classic models in interpretable ways. Our results highlight the promise of LM-driven model discovery.

Submitted 22 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

Comments: ICML 2024

arXiv:2402.17233 [pdf, other]

Hybrid$^2$ Neural ODE Causal Modeling and an Application to Glycemic Response

Authors: Bob Junyi Zou, Matthew E. Levine, Dessi P. Zaharieva, Ramesh Johari, Emily B. Fox

Abstract: Hybrid models composing mechanistic ODE-based dynamics with flexible and expressive neural network components have grown rapidly in popularity, especially in scientific domains where such ODE-based modeling offers important interpretability and validated causal grounding (e.g., for counterfactual reasoning). The incorporation of mechanistic models also provides inductive bias in standard blackbox modeling approaches, critical when learning from small datasets or partially observed, complex systems. Unfortunately, as the hybrid models become more flexible, the causal grounding provided by the mechanistic model can quickly be lost. We address this problem by leveraging another common source of domain knowledge: ranking of treatment effects for a set of interventions, even if the precise treatment effect is unknown. We encode this information in a causal loss that we combine with the standard predictive loss to arrive at a hybrid loss that biases our learning towards causally valid hybrid models. We demonstrate our ability to achieve a win-win, state-of-the-art predictive performance and causal validity, in the challenging task of modeling glucose dynamics post-exercise in individuals with type 1 diabetes.

Submitted 11 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

arXiv:2312.03344 [pdf, other]

Interpretable Mechanistic Representations for Meal-level Glycemic Control in the Wild

Authors: Ke Alexander Wang, Emily B. Fox

Abstract: Diabetes encompasses a complex landscape of glycemic control that varies widely among individuals. However, current methods do not faithfully capture this variability at the meal level. On the one hand, expert-crafted features lack the flexibility of data-driven methods; on the other hand, learned representations tend to be uninterpretable, which hampers clinical adoption. In this paper, we propose a hybrid variational autoencoder to learn interpretable representations of CGM and meal data. Our method grounds the latent space to the inputs of a mechanistic differential equation, producing embeddings that reflect physiological quantities, such as insulin sensitivity, glucose effectiveness, and basal glucose levels. Moreover, we introduce a novel method to infer the glucose appearance rate, making the mechanistic model robust to unreliable meal logs. On a dataset of CGM and self-reported meals from individuals with type-2 diabetes and pre-diabetes, our unsupervised representation discovers a separation between individuals proportional to their disease severity. Our embeddings produce clusters that are up to 4x better than naive, expert, black-box, and pure mechanistic features. Our method provides a nuanced, yet interpretable, embedding space to compare glycemic control within and across individuals, directly learnable from in-the-wild data.

Submitted 6 December, 2023; originally announced December 2023.

Comments: Proceedings of Machine Learning for Health (ML4H) 2023. Code available at: https://github.com/KeAWang/interpretable-cgm-representations

arXiv:2311.06300 [pdf, other]

AI Chatbot for Generating Episodic Future Thinking (EFT) Cue Texts for Health

Authors: Sareh Ahmadi, Edward A. Fox

Abstract: We describe an AI-powered chatbot to aid with health improvement by generating Episodic Future Thinking (EFT) cue texts that should reduce delay discounting. In prior studies, EFT has been shown to address maladaptive health behaviors. Those studies involved participants, working with researchers, vividly imagining future events, and writing a description that they subsequently will frequently review, to ensure a shift from an inclination towards immediate rewards. That should promote behavior change, aiding in health tasks such as treatment adherence and lifestyle modifications. The AI chatbot is designed to guide users in generating personalized EFTs, automating the current labor-intensive interview-based process. This can enhance the efficiency of EFT interventions and make them more accessible, targeting specifically those with limited educational backgrounds or communication challenges. By leveraging AI for EFT intervention, we anticipate broadened access and improved health outcomes across diverse populations.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2311.04262 [pdf, other]

ETDPC: A Multimodality Framework for Classifying Pages in Electronic Theses and Dissertations

Authors: Muntabir Hasan Choudhury, Lamia Salsabil, William A. Ingram, Edward A. Fox, Jian Wu

Abstract: Electronic theses and dissertations (ETDs) have been proposed, advocated, and generated for more than 25 years. Although ETDs are hosted by commercial or institutional digital library repositories, they are still an understudied type of scholarly big data, partially because they are usually longer than conference proceedings and journals. Segmenting ETDs will allow researchers to study sectional content. Readers can navigate to particular pages of interest, discover, and explore the content buried in these long documents. Most existing frameworks on document page classification are designed for classifying general documents and perform poorly on ETDs. In this paper, we propose ETDPC. Its backbone is a two-stream multimodal model with a cross-attention network to classify ETD pages into 13 categories. To overcome the challenge of imbalanced labeled samples, we augmented data for minority categories and employed a hierarchical classifier. ETDPC outperforms the state-of-the-art models in all categories, achieving an F1 of 0.84 to 0.96 for 9 out of 13 categories. We also demonstrated its data efficiency. The code and data can be found on GitHub (https://github.com/lamps-lab/ETDMiner/tree/master/etd_segmentation).

Submitted 7 November, 2023; originally announced November 2023.

Comments: 10 pages, 3 figures, accepted to Innovative Applications of Artificial Intelligence (IAAI-24)

arXiv:2310.18427 [pdf, ps, other]

doi 10.1109/JCDL57899.2023.00049

Maximizing Equitable Reach and Accessibility of ETDs

Authors: William A. Ingram, Jian Wu, Edward A. Fox

Abstract: This poster addresses accessibility issues of electronic theses and dissertations (ETDs) in digital libraries (DLs). ETDs are available primarily as PDF files, which present barriers to equitable access, especially for users with visual impairments, cognitive or learning disabilities, or for anyone needing more efficient and effective ways of finding relevant information within these long documents. We propose using AI techniques, including natural language processing (NLP), computer vision, and text analysis, to convert PDFs into machine-readable HTML documents with semantic tags and structure, extracting figures and tables, and generating summaries and keywords. Our goal is to increase the accessibility of ETDs and to make this important scholarship available to a wider audience.

Submitted 27 October, 2023; originally announced October 2023.

Journal ref: 2023 ACM/IEEE Joint Conference on Digital Libraries (JCDL), Santa Fe, NM, USA, 2023, pp. 256-257

arXiv:2308.10829 [pdf, other]

doi 10.1007/JHEP12(2023)171

Initial-Final and Initial-Initial antenna functions for real radiation at next-to-leading order

Authors: Elliot Fox, Nigel Glover

Abstract: The antenna subtraction method has achieved remarkable success in various processes relevant to the Large Hadron Collider. In Reference [1], an algorithm was proposed for constructing real-radiation antenna functions for electron-positron annihilation, directly from specified unresolved limits, accommodating any number of real emissions. Here, we extend this algorithm to build antennae involving partons in the initial state, specifically the initial-final and initial-initial antennae. Using this extended algorithm, we explicitly construct all NLO QCD antenna functions and compare them with previously extracted antenna functions derived from matrix elements. Additionally, we rigorously match the integration of the antenna functions over the initial-final and initial-initial unresolved phase space with the previous approach, providing an independent validation of our results. The improved antenna functions are more compact and reduced in number, making them more readily applicable for higher-order calculations.

Submitted 11 February, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

Comments: 33 pages, updated to match JHEP submission

Report number: IPPP/23/44, ZU-TH 47/23

arXiv:2307.14899 [pdf, other]

Retrieval-based Text Selection for Addressing Class-Imbalanced Data in Classification

Authors: Sareh Ahmadi, Aditya Shah, Edward Fox

Abstract: This paper addresses the problem of selecting a set of texts for annotation in text classification using retrieval methods when there are limits on the number of annotations due to constraints on human resources. An additional challenge addressed is dealing with binary categories that have a small number of positive instances, reflecting severe class imbalance. In our situation, where annotation occurs over a long time period, the selection of texts to be annotated can be made in batches, with previous annotations guiding the choice of the next set. To address these challenges, the paper proposes leveraging SHAP to construct a quality set of queries for Elasticsearch and semantic search, to try to identify optimal sets of texts for annotation that will help with class imbalance. The approach is tested on sets of cue texts describing possible future events, constructed by participants involved in studies aimed to help with the management of obesity and diabetes. We introduce an effective method for selecting a small set of texts for annotation and building high-quality classifiers. We integrate vector search, semantic search, and machine learning classifiers to yield a good solution. Our experiments demonstrate improved F1 scores for the minority classes in binary classification.

Submitted 9 November, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

arXiv:2307.07440 [pdf, ps, other]

A simple deterministic near-linear time approximation scheme for transshipment with arbitrary positive edge costs

Authors: Emily Fox

Abstract: We describe a simple deterministic near-linear time approximation scheme for uncapacitated minimum cost flow in undirected graphs with real edge weights, a problem also known as transshipment. Specifically, our algorithm takes as input a (connected) undirected graph $G = (V, E)$, vertex demands $b \in \mathbb{R}^V$ such that $\sum_{v \in V} b(v) = 0$, positive edge costs $c \in \mathbb{R}_{>0}^E$, and a parameter $\varepsilon > 0$. In $O(\varepsilon^{-2} m \log^{O(1)} n)$ time, it returns a flow $f$ such that the net flow out of each vertex is equal to the vertex's demand and the cost of the flow is within a $(1 + \varepsilon)$ factor of optimal. Our algorithm is combinatorial and has no running time dependency on the demands or edge costs. With the exception of a recent result presented at STOC 2022 for polynomially bounded edge weights, all almost- and near-linear time approximation schemes for transshipment relied on randomization to embed the problem instance into low-dimensional space. Our algorithm instead deterministically approximates the cost of routing decisions that would be made if the input were subject to a random tree embedding. To avoid computing the $Ω(n^2)$ vertex-vertex distances that an approximation of this kind suggests, we also take advantage of the clustering method used in the well-known Thorup-Zwick distance oracle.
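The input/output contract in this abstract can be illustrated with a small checker. The sketch below does not implement the approximation scheme; it only verifies that a candidate flow meets the demands (net flow out of each vertex equals $b(v)$) and computes its cost. The data representation (a dict keyed by directed pairs for the flow, `frozenset` keys for undirected edge costs) and all function names are assumptions made for this example.

```python
def net_outflow(flow, vertices):
    """Return {v: outflow - inflow}; flow maps (u, v) to the amount sent u -> v."""
    net = {v: 0.0 for v in vertices}
    for (u, v), amount in flow.items():
        net[u] += amount
        net[v] -= amount
    return net


def is_feasible(flow, demands, tol=1e-9):
    """True if the net flow out of every vertex equals its demand b(v)."""
    net = net_outflow(flow, demands)
    return all(abs(net[v] - demands[v]) <= tol for v in demands)


def flow_cost(flow, costs):
    """Total cost: amount on each edge times that edge's positive cost.
    Costs are keyed by frozenset({u, v}) because the graph is undirected."""
    return sum(abs(amount) * costs[frozenset((u, v))]
               for (u, v), amount in flow.items())


def within_factor(cost, optimal_cost, eps):
    """True if cost is within a (1 + eps) factor of the optimal cost."""
    return cost <= (1.0 + eps) * optimal_cost


# Path graph a - b - c: one unit must travel from a to c.
demands = {"a": 1.0, "b": 0.0, "c": -1.0}  # demands sum to zero
costs = {frozenset(("a", "b")): 2.0, frozenset(("b", "c")): 3.0}
flow = {("a", "b"): 1.0, ("b", "c"): 1.0}

print(is_feasible(flow, demands))  # prints True
print(flow_cost(flow, costs))      # prints 5.0
```

On this path graph the only route from `a` to `c` uses both edges, so the flow shown is optimal and trivially satisfies the $(1 + \varepsilon)$ guarantee for any $\varepsilon > 0$.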

Submitted 26 June, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

Comments: Accepted for ESA 2024; v3: ESA 2024 reviewer suggestions

arXiv:2305.01638 [pdf, other]

Sequence Modeling with Multiresolution Convolutional Memory

Authors: Jiaxin Shi, Ke Alexander Wang, Emily B. Fox

Abstract: Efficiently capturing the long-range patterns in sequential data sources salient to a given task, such as classification and generative modeling, poses a fundamental challenge. Popular approaches in the space trade off between the memory burden of brute-force enumeration and comparison, as in transformers, the computational burden of complicated sequential dependencies, as in recurrent neural networks, or the parameter burden of convolutional networks with many or large filters. We instead take inspiration from wavelet-based multiresolution analysis to define a new building block for sequence modeling, which we call a MultiresLayer. The key component of our model is the multiresolution convolution, capturing multiscale trends in the input sequence. Our MultiresConv can be implemented with shared filters across a dilated causal convolution tree. Thus it garners the computational advantages of convolutional networks and the principled theoretical motivation of wavelet decompositions. Our MultiresLayer is straightforward to implement, requires significantly fewer parameters, and maintains at most an $\mathcal{O}(N\log N)$ memory footprint for a length $N$ sequence. Yet, by stacking such layers, our model yields state-of-the-art performance on a number of sequence classification and autoregressive density estimation tasks using CIFAR-10, ListOps, and PTB-XL datasets.

Submitted 1 November, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

Comments: ICML 2023, Source code: https://github.com/thjashin/multires-conv

arXiv:2304.14300 [pdf, other]

Learning Absorption Rates in Glucose-Insulin Dynamics from Meal Covariates

Authors: Ke Alexander Wang, Matthew E. Levine, Jiaxin Shi, Emily B. Fox

Abstract: Traditional models of glucose-insulin dynamics rely on heuristic parameterizations chosen to fit observations within a laboratory setting. However, these models cannot describe glucose dynamics in daily life. One source of failure is in their descriptions of glucose absorption rates after meal events. A meal's macronutritional content has nuanced effects on the absorption profile, which is difficult to model mechanistically. In this paper, we propose to learn the effects of macronutrition content from glucose-insulin data and meal covariates. Given macronutrition information and meal times, we use a neural network to predict an individual's glucose absorption rate. We use this neural rate function as the control function in a differential equation of glucose dynamics, enabling end-to-end training. On simulated data, our approach is able to closely approximate true absorption rates, resulting in better forecasts than heuristic parameterizations, despite only observing glucose, insulin, and macronutritional information. Our work readily generalizes to meal events with higher-dimensional covariates, such as images, setting the stage for glucose dynamics models that are personalized to each individual's daily life.

Submitted 27 April, 2023; originally announced April 2023.

Comments: Work presented at NeurIPS 2022 Workshop on Learning from Time Series for Health (TS4H). arXiv admin note: substantial text overlap with arXiv:2302.11939

arXiv:2303.17661 [pdf, other]

MetaEnhance: Metadata Quality Improvement for Electronic Theses and Dissertations of University Libraries

Authors: Muntabir Hasan Choudhury, Lamia Salsabil, Himarsha R. Jayanetti, Jian Wu, William A. Ingram, Edward A. Fox

Abstract: Metadata quality is crucial for digital objects to be discovered through digital library interfaces. However, due to various reasons, the metadata of digital objects often exhibits incomplete, inconsistent, and incorrect values. We investigate methods to automatically detect, correct, and canonicalize scholarly metadata, using seven key fields of electronic theses and dissertations (ETDs) as a case study. We propose MetaEnhance, a framework that utilizes state-of-the-art artificial intelligence methods to improve the quality of these fields. To evaluate MetaEnhance, we compiled a metadata quality evaluation benchmark containing 500 ETDs, by combining subsets sampled using multiple criteria. We tested MetaEnhance on this benchmark and found that the proposed methods achieved nearly perfect F1-scores in detecting errors and F1-scores in correcting errors ranging from 0.85 to 1.00 for five of seven fields.

Submitted 30 March, 2023; originally announced March 2023.

Comments: 7 pages, 3 tables, and 1 figure. Accepted by 2023 ACM/IEEE Joint Conference on Digital Libraries (JCDL '23) as a short paper

arXiv:2301.13321 [pdf, other]

Censorship Resistance in On-Chain Auctions

Authors: Elijah Fox, Mallesh Pai, Max Resnick

Abstract: Modern blockchains guarantee that submitted transactions will be included eventually; a property formally known as liveness. But financial activity requires transactions to be included in a timely manner. Unfortunately, classical liveness is not strong enough to guarantee this, particularly in the presence of a motivated adversary who benefits from censoring transactions. We define censorship resistance as the amount it would cost the adversary to censor a transaction for a fixed interval of time as a function of the associated tip. This definition has two advantages, first it captures the fact that transactions with a higher miner tip can be more costly to censor, and therefore are more likely to swiftly make their way onto the chain. Second, it applies to a finite time window, so it can be used to assess whether a blockchain is capable of hosting financial activity that relies on timely inclusion. We apply this definition in the context of auctions. Auctions are a building block for many financial applications, and censoring competing bids offers an easy-to-model motivation for our adversary. Traditional proof-of-stake blockchains have poor enough censorship resistance that it is difficult to retain the integrity of an auction when bids can only be submitted in a single block. As the number of bidders $n$ in a single block auction increases, the probability that the winner is not the adversary, and the economic efficiency of the auction, both decrease faster than $1/n$. Running the auction over multiple blocks, each with a different proposer, alleviates the problem only if the number of blocks grows faster than the number of bidders. We argue that blockchains with more than one concurrent proposer can have strong censorship resistance. We achieve this by setting up a prisoner's dilemma among the proposers using conditional tips.

Submitted 26 June, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: 27 pages, 2 figures

arXiv:2211.03891 [pdf, ps, other]

A deterministic near-linear time approximation scheme for geometric transportation

Authors: Emily Fox, Jiashuai Lu

Abstract: Given a set of points $P = (P^+ \sqcup P^-) \subset \mathbb{R}^d$ for some constant $d$ and a supply function $μ:P\to \mathbb{R}$ such that $μ(p) > 0~\forall p \in P^+$, $μ(p) < 0~\forall p \in P^-$, and $\sum_{p\in P}{μ(p)} = 0$, the geometric transportation problem asks one to find a transportation map $τ: P^+\times P^-\to \mathbb{R}_{\ge 0}$ such that $\sum_{q\in P^-}{τ(p, q)} = μ(p)~\forall p \in P^+$, $\sum_{p\in P^+}{τ(p, q)} = -μ(q)~ \forall q \in P^-$, and the weighted sum of Euclidean distances for the pairs $\sum_{(p,q)\in P^+\times P^-}τ(p, q)\cdot ||q-p||_2$ is minimized. We present the first deterministic algorithm that computes, in near-linear time, a transportation map whose cost is within a $(1 + \varepsilon)$ factor of optimal. More precisely, our algorithm runs in $O(n\varepsilon^{-(d+2)}\log^5{n}\log{\log{n}})$ time for any constant $\varepsilon > 0$. Surprisingly, our result is not only a generalization of a bipartite matching one to arbitrary instances of geometric transportation, but it also reduces the running time for all previously known $(1 + \varepsilon)$-approximation algorithms, randomized or deterministic, even for geometric bipartite matching. In particular, we give the first $(1 + \varepsilon)$-approximate deterministic algorithm for geometric bipartite matching and the first $(1 + \varepsilon)$-approximate deterministic or randomized algorithm for geometric transportation with no dependence on $d$ in the exponent of the running time's polylog. As an additional application of our main ideas, we also give the first randomized near-linear $O(\varepsilon^{-2} m \log^{O(1)} n)$ time $(1 + \varepsilon)$-approximation algorithm for the uncapacitated minimum cost flow (transshipment) problem in undirected graphs with arbitrary real edge costs.

Submitted 27 September, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

Comments: To appear in FOCS 2023. 24 pages. Update 2: Added corrections for minimum cost flow approximation scheme. Addressed reviewer comments. Update 1: Adds a new randomized near-linear time approximation scheme for uncapacitated minimum cost flow in undirected graphs (transshipment) with arbitrary edge costs. References more recent work in geometric bipartite matching

arXiv:2107.00516 [pdf, other]

Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations

Authors: Muntabir Hasan Choudhury, Himarsha R. Jayanetti, Jian Wu, William A. Ingram, Edward A. Fox

Abstract: Electronic Theses and Dissertations (ETDs) contain domain knowledge that can be used for many digital library tasks, such as analyzing citation networks and predicting research trends. Automatic metadata extraction is important to build scalable digital library search engines. Most existing methods are designed for born-digital documents, so they often fail to extract metadata from scanned documents such as for ETDs. Traditional sequence tagging methods mainly rely on text-based features. In this paper, we propose a conditional random field (CRF) model that combines text-based and visual features. To verify the robustness of our model, we extended an existing corpus and created a new ground truth corpus consisting of 500 ETD cover pages with human validated metadata. Our experiments show that CRF with visual features outperformed both a heuristic and a CRF model with only text-based features. The proposed model achieved 81.3%-96% F1 measure on seven metadata fields. The data and source code are publicly available on Google Drive (https://tinyurl.com/y8kxzwrp) and a GitHub repository (https://github.com/lamps-lab/ETDMiner/tree/master/etd_crf), respectively.

Submitted 1 July, 2021; originally announced July 2021.

Comments: 7 pages, 4 figures, 1 table. Accepted by JCDL '21 as a short paper

arXiv:2106.15320 [pdf, other]

ScanBank: A Benchmark Dataset for Figure Extraction from Scanned Electronic Theses and Dissertations

Authors: Sampanna Yashwant Kahu, William A. Ingram, Edward A. Fox, Jian Wu

Abstract: We focus on electronic theses and dissertations (ETDs), aiming to improve access and expand their utility, since more than 6 million are publicly available, and they constitute an important corpus to aid research and education across disciplines. The corpus is growing as new born-digital documents are included, and since millions of older theses and dissertations have been converted to digital form to be disseminated electronically in institutional repositories. In ETDs, as with other scholarly works, figures and tables can communicate a large amount of information in a concise way. Although methods have been proposed for extracting figures and tables from born-digital PDFs, they do not work well with scanned ETDs. Considering this problem, our assessment of state-of-the-art figure extraction systems is that the reason they do not function well on scanned PDFs is that they have only been trained on born-digital documents. To address this limitation, we present ScanBank, a new dataset containing 10 thousand scanned page images, manually labeled by humans as to the presence of the 3.3 thousand figures or tables found therein. We use this dataset to train a deep neural network model based on YOLOv5 to accurately extract figures and tables from scanned ETDs. We pose and answer important research questions aimed at finding better methods for figure extraction from scanned documents. One of those concerns the value for training, of data augmentation techniques applied to born-digital documents which are used to train models better suited for figure extraction from scanned documents. To the best of our knowledge, ScanBank is the first manually annotated dataset for figure and table extraction for scanned ETDs. A YOLOv5-based model, trained on ScanBank, outperforms existing comparable open-source and freely available baseline methods by a considerable margin.

Submitted 23 June, 2021; originally announced June 2021.

Comments: 16 pages, 3 figures, submitted to ACM/IEEE Joint Conference on Digital Libraries

arXiv:2105.08843 [pdf, ps, other]

doi 10.1063/5.0056796

Bulk dissipation in the quantum anomalous Hall effect

Authors: Linsey K. Rodenbach, Ilan T. Rosen, Eli J. Fox, Peng Zhang, Lei Pan, Kang L. Wang, Marc A. Kastner, David Goldhaber-Gordon

Abstract: Even at the lowest accessible temperatures, measurements of the quantum anomalous Hall (QAH) effect have indicated the presence of parasitic dissipative conduction channels. There is no consensus whether parasitic conduction is related to processes in the bulk or along the edges. Here, we approach this problem by comparing transport measurements of Hall bar and Corbino geometry devices fabricated from Cr-doped (BiSb)$_2$Te$_3$. We identify bulk conduction as the dominant source of dissipation at all values of temperature and in-plane electric field. Furthermore, we observe identical breakdown phenomenology in both geometries, indicating that breakdown of the QAH phase is a bulk process. The methodology developed in this study could be used to identify dissipative conduction mechanisms in new QAH materials, ultimately guiding material development towards realization of the QAH effect at higher temperatures.

Submitted 5 August, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

Comments: 9 pages, 4 figures, with 11 pages of supplementary information

Journal ref: APL Materials 9, 081116 (2021)

arXiv:2105.02675 [pdf, other]

Granger Causality: A Review and Recent Advances

Authors: Ali Shojaie, Emily B. Fox

Abstract: Introduced more than a half century ago, Granger causality has become a popular tool for analyzing time series data in many application domains, from economics and finance to genomics and neuroscience. Despite this popularity, the validity of this notion for inferring causal relationships among time series has remained the topic of continuous debate. Moreover, while the original definition was general, limitations in computational tools have primarily limited the applications of Granger causality to simple bivariate vector auto-regressive processes or pairwise relationships among a set of variables. Starting with a review of early developments and debates, this paper discusses recent advances that address various shortcomings of the earlier approaches, from models for high-dimensional time series to more recent developments that account for nonlinear and non-Gaussian observations and allow for sub-sampled and mixed frequency time series.

Submitted 6 May, 2021; v1 submitted 5 May, 2021; originally announced May 2021.

Comments: 40 pages, 12 figures

arXiv:2105.01870 [pdf, other]

doi 10.1073/pnas.2118482119

Unusual magnetotransport in twisted bilayer graphene

Authors: Joe Finney, Aaron L. Sharpe, Eli J. Fox, Connie L. Hsueh, Daniel E. Parker, Matthew Yankowitz, Shaowen Chen, Kenji Watanabe, Takashi Taniguchi, Cory R. Dean, Ashvin Vishwanath, Marc Kastner, David Goldhaber-Gordon

Abstract: We present transport measurements of bilayer graphene with 1.38° interlayer twist and apparent additional alignment to its hexagonal boron nitride cladding. As with other devices with twist angles substantially larger than the magic angle of 1.1°, we do not observe correlated insulating states or band reorganization. However, we do observe several highly unusual behaviors in magnetotransport. For a large range of densities around half filling of the moiré bands, magnetoresistance is large and quadratic. Over these same densities, the magnetoresistance minima corresponding to gaps between Landau levels split and bend as a function of density and field. We reproduce the same splitting and bending behavior in a simple tight-binding model of Hofstadter's butterfly on a square lattice with anisotropic hopping terms. These features appear to be a generic class of experimental manifestations of Hofstadter's butterfly and may provide insight into the emergent states of twisted bilayer graphene.

Submitted 11 June, 2021; v1 submitted 5 May, 2021; originally announced May 2021.

Comments: 8 pages, 4 figures; updated supplemental material; added additional device measurements and updated toy model

arXiv:2104.12231 [pdf, other]

Model-based metrics: Sample-efficient estimates of predictive model subpopulation performance

Authors: Andrew C. Miller, Leon A. Gatys, Joseph Futoma, Emily B. Fox

Abstract: Machine learning models $-$ now commonly developed to screen, diagnose, or predict health conditions $-$ are evaluated with a variety of performance metrics. An important first step in assessing the practical utility of a model is to evaluate its average performance over an entire population of interest. In many settings, it is also critical that the model makes good predictions within predefined subpopulations. For instance, showing that a model is fair or equitable requires evaluating the model's performance in different demographic subgroups. However, subpopulation performance metrics are typically computed using only data from that subgroup, resulting in higher variance estimates for smaller groups. We devise a procedure to measure subpopulation performance that can be more sample-efficient than the typical subsample estimates. We propose using an evaluation model $-$ a model that describes the conditional distribution of the predictive model score $-$ to form model-based metric (MBM) estimates. Our procedure incorporates model checking and validation, and we propose a computationally efficient approximation of the traditional nonparametric bootstrap to form confidence intervals. We evaluate MBMs on two main tasks: a semi-synthetic setting where ground truth metrics are available and a real-world hospital readmission prediction task. We find that MBMs consistently produce more accurate and lower variance estimates of model performance for small subpopulations.

Submitted 25 April, 2021; originally announced April 2021.

Comments: 27 pages, 8 figures

arXiv:2104.12219 [pdf, other]

Breiman's two cultures: You don't have to choose sides

Authors: Andrew C. Miller, Nicholas J. Foti, Emily B. Fox

Abstract: Breiman's classic paper casts data analysis as a choice between two cultures: data modelers and algorithmic modelers. Stated broadly, data modelers use simple, interpretable models with well-understood theoretical properties to analyze data. Algorithmic modelers prioritize predictive accuracy and use more flexible function approximations to analyze data. This dichotomy overlooks a third set of models $-$ mechanistic models derived from scientific theories (e.g., ODE/SDE simulators). Mechanistic models encode application-specific scientific knowledge about the data. And while these categories represent extreme points in model space, modern computational and algorithmic tools enable us to interpolate between these points, producing flexible, interpretable, and scientifically-informed hybrids that can enjoy accurate and robust predictions, and resolve issues with data analysis that Breiman describes, such as the Rashomon effect and Occam's dilemma. Challenges still remain in finding an appropriate point in model space, with many choices on how to compose model components and the degree to which each component informs inferences.

Submitted 25 April, 2021; originally announced April 2021.

Comments: Commentary to appear in a special issue of Observational Studies, discussing Leo Breiman's paper "Statistical Modeling: The Two Cultures" (https://doi.org/10.1214/ss/1009213726)

arXiv:2102.04039 [pdf, other]

doi 10.1021/acs.nanolett.1c00696

Evidence of orbital ferromagnetism in twisted bilayer graphene aligned to hexagonal boron nitride

Authors: Aaron L. Sharpe, Eli J. Fox, Arthur W. Barnard, Joe Finney, Kenji Watanabe, Takashi Taniguchi, Marc A. Kastner, David Goldhaber-Gordon

Abstract: We have previously reported ferromagnetism evinced by a large hysteretic anomalous Hall effect in twisted bilayer graphene (tBLG). Subsequent measurements of a quantized Hall resistance and small longitudinal resistance confirmed that this magnetic state is a Chern insulator. Here we report that, when tilting the sample in an external magnetic field, the ferromagnetism is highly anisotropic. Because spin-orbit coupling is negligible in graphene such anisotropy is unlikely to come from spin, but rather favors theories in which the ferromagnetism is orbital. We know of no other case in which ferromagnetism has a purely orbital origin. For an applied in-plane field larger than $5\ \mathrm{T}$, the out-of-plane magnetization is destroyed, suggesting a transition to a new phase.

Submitted 18 February, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

Comments: 30 pages, 10 figures

arXiv:2012.11774 [pdf, other]

doi 10.1016/j.ins.2021.12.018

Differentially Private Synthetic Medical Data Generation using Convolutional GANs

Authors: Amirsina Torfi, Edward A. Fox, Chandan K. Reddy

Abstract: Deep learning models have demonstrated superior performance in several application problems, such as image classification and speech processing. However, creating a deep learning model using health record data requires addressing certain privacy challenges that bring unique concerns to researchers working in this domain. One effective way to handle such private data issues is to generate realistic synthetic data that can provide practically acceptable data quality and correspondingly the model performance. To tackle this challenge, we develop a differentially private framework for synthetic data generation using Rényi differential privacy. Our approach builds on convolutional autoencoders and convolutional generative adversarial networks to preserve some of the critical characteristics of the generated synthetic data. In addition, our model can also capture the temporal information and feature correlations that might be present in the original data. We demonstrate that our model outperforms existing state-of-the-art models under the same privacy budget using several publicly available benchmark medical datasets in both supervised and unsupervised settings.

Submitted 21 December, 2020; originally announced December 2020.

arXiv:2012.10075 [pdf]

doi 10.1021/acs.nanolett.1c03699

Tunable ferromagnetism at non-integer filling of a moiré superlattice

Authors: Guorui Chen, Aaron L. Sharpe, Eli J. Fox, Shaoxin Wang, Bosai Lyu, Lili Jiang, Hongyuan Li, Kenji Watanabe, Takashi Taniguchi, Michael F. Crommie, M. A. Kastner, Zhiwen Shi, David Goldhaber-Gordon, Yuanbo Zhang, Feng Wang

Abstract: The flat bands resulting from moiré superlattices in magic-angle twisted bilayer graphene (MATBG) and ABC-trilayer graphene aligned with hexagonal boron nitride (ABC-TLG/hBN) have been shown to give rise to fascinating correlated electron phenomena such as correlated insulators and superconductivity. More recently, orbital magnetism associated with correlated Chern insulators was found in this class of layered structures centered at integer multiples of n0, the density corresponding to one electron per moiré superlattice unit cell. Here we report the experimental observation of ferromagnetism at fractional filling of a flat Chern band in an ABC-TLG/hBN moiré superlattice. The ferromagnetic state exhibits prominent ferromagnetic hysteresis behavior with large anomalous Hall resistivity in a broad region of densities, centered in the valence miniband at n = -2.3 n0. This ferromagnetism depends very sensitively on the control parameters in the moiré system: not only the magnitude of the anomalous Hall signal, but also the sign of the hysteretic ferromagnetic response can be modulated by tuning the carrier density and displacement field. Our discovery of electrically tunable ferromagnetism in a moiré Chern band at non-integer filling highlights the opportunities for exploring new correlated ferromagnetic states in moiré heterostructures.

Submitted 18 December, 2020; originally announced December 2020.

Comments: 12 pages, 4 figures

arXiv:2012.00110 [pdf, other]

Representing and Denoising Wearable ECG Recordings

Authors: Jeffrey Chan, Andrew C. Miller, Emily B. Fox

Abstract: Modern wearable devices are embedded with a range of noninvasive biomarker sensors that hold promise for improving detection and treatment of disease. One such sensor is the single-lead electrocardiogram (ECG) which measures electrical signals in the heart. The benefits of the sheer volume of ECG measurements with rich longitudinal structure made possible by wearables come at the price of potentially noisier measurements compared to clinical ECGs, e.g., due to movement. In this work, we develop a statistical model to simulate a structured noise process in ECGs derived from a wearable sensor, design a beat-to-beat representation that is conducive for analyzing variation, and devise a factor analysis-based method to denoise the ECG. We study synthetic data generated using a realistic ECG simulator and a structured noise model. At varying levels of signal-to-noise, we quantitatively measure an upper bound on performance and compare estimates from linear and non-linear models. Finally, we apply our method to a set of ECGs collected by wearables in a mobile health study.

Submitted 30 November, 2020; originally announced December 2020.

Comments: ML for Mobile Health Workshop, NeurIPS 2020

arXiv:2010.03549 [pdf, other]

On the Evaluation of Generative Adversarial Networks By Discriminative Models

Authors: Amirsina Torfi, Mohammadreza Beyki, Edward A. Fox

Abstract: Generative Adversarial Networks (GANs) can accurately model complex multi-dimensional data and generate realistic samples. However, due to their implicit estimation of data distributions, their evaluation is a challenging task. The majority of research efforts associated with tackling this issue were validated by qualitative visual evaluation. Such approaches do not generalize well beyond the image domain. Since many of those evaluation metrics are proposed and bound to the vision domain, they are difficult to apply to other domains. Quantitative measures are necessary to better guide the training and comparison of different GANs models. In this work, we leverage Siamese neural networks to propose a domain-agnostic evaluation metric: (1) with a qualitative evaluation that is consistent with human evaluation, (2) that is robust relative to common GAN issues such as mode dropping and invention, and (3) does not require any pretrained classifier. The empirical results in this paper demonstrate the superiority of this method compared to the popular Inception Score and are competitive with the FID score.

Submitted 7 October, 2020; originally announced October 2020.

Comments: Accepted to be published in ICPR 2020

arXiv:2009.04485 [pdf, other]

Aspect Classification for Legal Depositions

Authors: Saurabh Chakravarty, Satvik Chekuri, Maanav Mehrotra, Edward A. Fox

Abstract: Attorneys and others have a strong interest in having a digital library with suitable services (e.g., summarizing, searching, and browsing) to help them work with large corpora of legal depositions. Their needs often involve understanding the semantics of such documents. That depends in part on the role of the deponent, e.g., plaintiff, defendant, law enforcement personnel, expert, etc. In the case of tort litigation associated with property and casualty insurance claims, such as relating to an injury, it is important to know not only about liability, but also about events, accidents, physical conditions, and treatments. We hypothesize that a legal deposition consists of various aspects that are discussed as part of the deponent testimony. Accordingly, we developed an ontology of aspects in a legal deposition for accident and injury cases. Using that, we have developed a classifier that can identify portions of text for each of the aspects of interest. Doing so was complicated by the peculiarities of this genre, e.g., that deposition transcripts generally consist of data in the form of question-answer (QA) pairs. Accordingly, our automated system starts with pre-processing, and then transforms the QA pairs into a canonical form made up of declarative sentences. Classifying the declarative sentences that are generated, according to the aspect, can then help with downstream tasks such as summarization, segmentation, question-answering, and information retrieval. Our methods have achieved a classification F1 score of 0.83. Having the aspects classified with a good accuracy will help in choosing QA pairs that can be used as candidate summary sentences, and to generate an informative summary for legal professionals or insurance claim agents. Our methodology could be extended to legal depositions of other kinds, and to aid services like searching.

Submitted 9 September, 2020; originally announced September 2020.

Comments: 19 pages, 3 figures, 11 tables, detailed version of shorter paper being submitted to a conference

arXiv:2008.02852 [pdf, other]

Learning Insulin-Glucose Dynamics in the Wild

Authors: Andrew C. Miller, Nicholas J. Foti, Emily Fox

Abstract: We develop a new model of insulin-glucose dynamics for forecasting blood glucose in type 1 diabetics. We augment an existing biomedical model by introducing time-varying dynamics driven by a machine learning sequence model. Our model maintains a physiologically plausible inductive bias and clinically interpretable parameters -- e.g., insulin sensitivity -- while inheriting the flexibility of modern pattern recognition algorithms. Critical to modeling success are the flexible, but structured representations of subject variability with a sequence model. In contrast, less constrained models like the LSTM fail to provide reliable or physiologically plausible forecasts. We conduct an extensive empirical study. We show that allowing biomedical model dynamics to vary in time improves forecasting at long time horizons, up to six hours, and produces forecasts consistent with the physiological effects of insulin and carbohydrates.

Submitted 6 August, 2020; originally announced August 2020.

Comments: Machine Learning for Healthcare 2020

arXiv:2007.14405 [pdf, other]

The Atacama Cosmology Telescope: Delensed Power Spectra and Parameters

Authors: Dongwon Han, Neelima Sehgal, Amanda MacInnis, Alexander van Engelen, Blake D. Sherwin, Mathew S. Madhavacheril, Simone Aiola, Nicholas Battaglia, James A. Beall, Daniel T. Becker, Erminia Calabrese, Steve K. Choi, Omar Darwish, Edward V. Denison, Mark J. Devlin, Jo Dunkley, Simone Ferraro, Anna E. Fox, Matthew Hasselfield, J. Colin Hill, Gene C. Hilton, Matt Hilton, Renée Hložek, Johannes Hubmayr, John P. Hughes, et al. (17 additional authors not shown)

Abstract: We present LCDM cosmological parameter constraints obtained from delensed microwave background power spectra. Lensing maps from a subset of DR4 data from the Atacama Cosmology Telescope (ACT) are used to undo the lensing effect in ACT spectra observed at 150 and 98 GHz. At 150 GHz, we remove the lensing distortion with an effective efficiency of 30% (TT), 30% (EE), 26% (TE) and 20% (BB); this results in detections of the delensing effect at 8.7 sigma (TT), 5.1 sigma (EE), 2.6 sigma (TE), and 2.4 sigma (BB) significance. The combination of 150 and 98 GHz TT, EE, and TE delensed spectra is well fit by a standard LCDM model. We also measure the shift in best-fit parameters when fitting delensed versus lensed spectra; while this shift does not inform our ability to measure cosmological parameters, it does provide a three-way consistency check among the lensing inferred from the best-fit parameters, the lensing in the CMB power spectrum, and the reconstructed lensing map. This shift is predicted to be zero when fitting with the correct model since both lensed and delensed spectra originate from the same region of sky. Fitting with a LCDM model and marginalizing over foregrounds, we find that the shift in cosmological parameters is consistent with zero. Our results show that gravitational lensing of the microwave background is internally consistent within the framework of the standard cosmological model.

Submitted 13 November, 2020; v1 submitted 28 July, 2020; originally announced July 2020.

Comments: 29 pages, 17 figures, version matches that accepted by JCAP

Journal ref: JCAP, Issue 01, article id. 031 (2021)

arXiv:2007.07290 [pdf, other]

doi 10.1088/1475-7516/2020/12/046

The Atacama Cosmology Telescope: DR5 maps of 18,000 square degrees of the microwave sky from ACT 2008-2018 data

Authors: Sigurd Naess, Simone Aiola, Jason E. Austermann, Nick Battaglia, James A. Beall, Daniel T. Becker, Richard J. Bond, Erminia Calabrese, Steve K. Choi, Nicholas F. Cothard, Kevin T. Crowley, Omar Darwish, Rahul Datta, Edward V. Denison, Mark Devlin, Cody J. Duell, Shannon M. Duff, Adriaan J. Duivenvoorden, Jo Dunkley, Rolando Dünner, Anna E. Fox, Patricio A. Gallardo, Mark Halpern, Dongwon Han, Matthew Hasselfield, et al. (37 additional authors not shown)

Abstract: This paper presents a maximum-likelihood algorithm for combining sky maps with disparate sky coverage, angular resolution and spatially varying anisotropic noise into a single map of the sky. We use this to merge hundreds of individual maps covering the 2008-2018 ACT observing seasons, resulting in by far the deepest ACT maps released so far. We also combine the maps with the full Planck maps, resulting in maps that have the best features of both Planck and ACT: Planck's nearly white noise on intermediate and large angular scales and ACT's high-resolution and sensitivity on small angular scales. The maps cover over 18,000 square degrees, nearly half the full sky, at 100, 150 and 220 GHz. They reveal 4,000 optically-confirmed clusters through the Sunyaev Zel'dovich effect (SZ) and 18,500 point source candidates at $> 5σ$, the largest single collection of SZ clusters and millimeter wave sources to date. The multi-frequency maps provide millimeter images of nearby galaxies and individual Milky Way nebulae, and even clear detections of several nearby stars. Other anticipated uses of these maps include, for example, thermal SZ and kinematic SZ cluster stacking, CMB cluster lensing and galactic dust science. The method itself has negligible bias. However, due to the preliminary nature of some of the component data sets, we caution that these maps should not be used for precision cosmological analysis. The maps are part of ACT DR5, and are available on LAMBDA at https://lambda.gsfc.nasa.gov/product/act/actpol_prod_table.cfm.
There is also a web atlas at https://phy-act1.princeton.edu/public/snaess/actpol/dr5/atlas.

Submitted 17 February, 2021; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 38 pages, 29 figures, data release on lambda. Published in JCAP

arXiv:2007.07289 [pdf, other]

doi 10.1088/1475-7516/2020/12/045

The Atacama Cosmology Telescope: A Measurement of the Cosmic Microwave Background Power Spectra at 98 and 150 GHz

Authors: Steve K. Choi, Matthew Hasselfield, Shuay-Pwu Patty Ho, Brian Koopman, Marius Lungu, Maximilian H. Abitbol, Graeme E. Addison, Peter A. R. Ade, Simone Aiola, David Alonso, Mandana Amiri, Stefania Amodeo, Elio Angile, Jason E. Austermann, Taylor Baildon, Nick Battaglia, James A. Beall, Rachel Bean, Daniel T. Becker, J Richard Bond, Sarah Marie Bruno, Erminia Calabrese, Victoria Calafut, Luis E. Campusano, Felipe Carrero, et al. (114 additional authors not shown)

Abstract: We present the temperature and polarization angular power spectra of the CMB measured by the Atacama Cosmology Telescope (ACT) from 5400 deg$^2$ of the 2013-2016 survey, which covers $>$15000 deg$^2$ at 98 and 150 GHz. For this analysis we adopt a blinding strategy to help avoid confirmation bias and, related to this, show numerous checks for systematic error done before unblinding. Using the likelihood for the cosmological analysis we constrain secondary sources of anisotropy and foreground emission, and derive a "CMB-only" spectrum that extends to $\ell=4000$. At large angular scales, foreground emission at 150 GHz is $\sim$1% of TT and EE within our selected regions and consistent with that found by Planck. Using the same likelihood, we obtain the cosmological parameters for $Λ$CDM for the ACT data alone with a prior on the optical depth of $τ=0.065\pm0.015$. $Λ$CDM is a good fit. The best-fit model has a reduced $χ^2$ of 1.07 (PTE=0.07) with $H_0=67.9\pm1.5$ km/s/Mpc. We show that the lensing BB signal is consistent with $Λ$CDM and limit the celestial EB polarization angle to $ψ_P =-0.07^{\circ}\pm0.09^{\circ}$. We directly cross correlate ACT with Planck and observe generally good agreement but with some discrepancies in TE. All data on which this analysis is based will be publicly released.

Submitted 23 November, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 44 pages, 27 figures, products available on the NASA LAMBDA website, version accepted for publication in JCAP

arXiv:2007.07288 [pdf, other]

doi 10.1088/1475-7516/2020/12/047

The Atacama Cosmology Telescope: DR4 Maps and Cosmological Parameters

Authors: Simone Aiola, Erminia Calabrese, Loïc Maurin, Sigurd Naess, Benjamin L. Schmitt, Maximilian H. Abitbol, Graeme E. Addison, Peter A. R. Ade, David Alonso, Mandana Amiri, Stefania Amodeo, Elio Angile, Jason E. Austermann, Taylor Baildon, Nick Battaglia, James A. Beall, Rachel Bean, Daniel T. Becker, J Richard Bond, Sarah Marie Bruno, Victoria Calafut, Luis E. Campusano, Felipe Carrero, Grace E. Chesmore, Hsiao-mei Cho, et al. (116 additional authors not shown)

Abstract: We present new arcminute-resolution maps of the Cosmic Microwave Background temperature and polarization anisotropy from the Atacama Cosmology Telescope, using data taken from 2013-2016 at 98 and 150 GHz. The maps cover more than 17,000 deg$^2$, the deepest 600 deg$^2$ with noise levels below 10 $μ$K-arcmin. We use the power spectrum derived from almost 6,000 deg$^2$ of these maps to constrain cosmology. The ACT data enable a measurement of the angular scale of features in both the divergence-like polarization and the temperature anisotropy, tracing both the velocity and density at last-scattering. From these one can derive the distance to the last-scattering surface and thus infer the local expansion rate, $H_0$. By combining ACT data with large-scale information from WMAP we measure $H_0 = 67.6 \pm 1.1$ km/s/Mpc, at 68% confidence, in excellent agreement with the independently-measured Planck satellite estimate (from ACT alone we find $H_0 = 67.9 \pm 1.5$ km/s/Mpc). The $Λ$CDM model provides a good fit to the ACT data, and we find no evidence for deviations: both the spatial curvature, and the departure from the standard lensing signal in the spectrum, are zero to within 1$σ$; the number of relativistic species, the primordial Helium fraction, and the running of the spectral index are consistent with $Λ$CDM predictions to within $1.5 - 2.2σ$.
We compare ACT, WMAP, and Planck at the parameter level and find good consistency; we investigate how the constraints on the correlated spectral index and baryon density parameters readjust when adding CMB large-scale information that ACT does not measure. The DR4 products presented here will be publicly released on the NASA Legacy Archive for Microwave Background Data Analysis.

Submitted 3 December, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 33 pages, 24 figures, products available on the NASA LAMBDA website, version accepted for publication in JCAP

arXiv:2004.01139 [pdf, other]

doi 10.1093/mnras/staa3438

The Atacama Cosmology Telescope: A CMB lensing mass map over 2100 square degrees of sky and its cross-correlation with BOSS-CMASS galaxies

Authors: Omar Darwish, Mathew S. Madhavacheril, Blake Sherwin, Simone Aiola, Nicholas Battaglia, James A. Beall, Daniel T. Becker, J. Richard Bond, Erminia Calabrese, Steve Choi, Mark J. Devlin, Jo Dunkley, Rolando Dünner, Simone Ferraro, Anna E. Fox, Patricio A. Gallardo, Yilun Guan, Mark Halpern, Dongwon Han, Matthew Hasselfield, J. Colin Hill, Gene C. Hilton, Matt Hilton, Adam D. Hincks, Shuay-Pwu Patty Ho, et al. (28 additional authors not shown)

Abstract: We construct cosmic microwave background lensing mass maps using data from the 2014 and 2015 seasons of observations with the Atacama Cosmology Telescope (ACT). These maps cover 2100 square degrees of sky and overlap with a wide variety of optical surveys. The maps are signal dominated on large scales and have fidelity such that their correlation with the cosmic infrared background is clearly visible by eye. We also create lensing maps with thermal Sunyaev-Zel'dovich contamination removed using a novel cleaning procedure that only slightly degrades the lensing signal-to-noise ratio. The cross-spectrum between the cleaned lensing map and the BOSS CMASS galaxy sample is detected at $10$-$σ$ significance, with an amplitude of $A=1.02 \pm 0.10$ relative to the Planck best-fit LCDM cosmological model with fiducial linear galaxy bias. Our measurement lays the foundation for lensing cross-correlation science with current ACT data and beyond.

Submitted 3 April, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

Comments: 16 pages, 11 figures, lensing map products will be made available on LAMBDA as part of the upcoming ACT data release, v2 corrects author list

arXiv:2003.12206 [pdf, other]

Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program)

Authors: Joelle Pineau, Philippe Vincent-Lamarre, Koustuv Sinha, Vincent Larivière, Alina Beygelzimer, Florence d'Alché-Buc, Emily Fox, Hugo Larochelle

Abstract: One of the challenges in machine learning research is to ensure that presented and published results are sound and reliable. Reproducibility, that is obtaining similar results as presented in a paper or talk, using the same code and data (when available), is a necessary step to verify the reliability of research findings. Reproducibility is also an important step to promote open and accessible research, thereby allowing the scientific community to quickly integrate new findings and convert ideas to practice. Reproducibility also promotes the use of robust experimental workflows, which potentially reduce unintentional errors. In 2019, the Neural Information Processing Systems (NeurIPS) conference, the premier international conference for research in machine learning, introduced a reproducibility program, designed to improve the standards across the community for how we conduct, communicate, and evaluate machine learning research. The program contained three components: a code submission policy, a community-wide reproducibility challenge, and the inclusion of the Machine Learning Reproducibility checklist as part of the paper submission process. In this paper, we describe each of these components, how it was deployed, as well as what we were able to learn from this initiative.

Submitted 30 December, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

Comments: To appear at JMLR, 16 pages + Appendix

arXiv:2003.01200 [pdf, other]

Natural Language Processing Advancements By Deep Learning: A Survey

Authors: Amirsina Torfi, Rouzbeh A. Shirvani, Yaser Keneshloo, Nader Tavaf, Edward A. Fox

Abstract: Natural Language Processing (NLP) helps empower intelligent machines by enhancing a better understanding of the human language for linguistic-based human-computer communication. Recent developments in computational power and the advent of large amounts of linguistic data have heightened the need and demand for automating semantic analysis using data-driven approaches. The utilization of data-driven strategies is pervasive now due to the significant improvements demonstrated through the usage of deep learning methods in areas such as Computer Vision, Automatic Speech Recognition, and in particular, NLP. This survey categorizes and addresses the different aspects and applications of NLP that have benefited from deep learning. It covers core NLP tasks and applications and describes how deep learning methods and models advance these areas. We further analyze and compare different approaches and state-of-the-art models.

Submitted 27 February, 2021; v1 submitted 2 March, 2020; originally announced March 2020.

arXiv:2001.10465 [pdf, ps, other]

doi 10.1103/PhysRevD.101.083527

The Atacama Cosmology Telescope: Constraints on Cosmic Birefringence

Authors: Toshiya Namikawa, Yilun Guan, Omar Darwish, Blake D. Sherwin, Simone Aiola, Nicholas Battaglia, James A. Beall, Daniel T. Becker, J. Richard Bond, Erminia Calabrese, Grace E. Chesmore, Steve K. Choi, Mark J. Devlin, Joanna Dunkley, Rolando Dünner, Anna E. Fox, Patricio A. Gallardo, Vera Gluscevic, Dongwon Han, Matthew Hasselfield, Gene C. Hilton, Adam D. Hincks, Renée Hložek, Johannes Hubmayr, Kevin Huffenberger, et al. (29 additional authors not shown)

Abstract: We present new constraints on anisotropic birefringence of the cosmic microwave background polarization using two seasons of data from the Atacama Cosmology Telescope covering $456$ square degrees of sky. The birefringence power spectrum, measured using a curved-sky quadratic estimator, is consistent with zero. Our results provide the tightest current constraint on birefringence over a range of angular scales between $5$ arcminutes and $9$ degrees. We improve previous upper limits on the amplitude of a scale-invariant birefringence power spectrum by a factor of between $2$ and $3$. Assuming a nearly-massless axion field during inflation, our result is equivalent to a $2\,σ$ upper limit on the Chern-Simons coupling constant between axions and photons of $g_{αγ}<4.0\times 10^{-2}/H_I$ where $H_I$ is the inflationary Hubble scale.

Submitted 21 April, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

Comments: 18 pages, 3 figures, Accepted for publication in PRD

Journal ref: Phys. Rev. D 101, 083527 (2020)

arXiv:2001.09346 [pdf, other]

CorGAN: Correlation-Capturing Convolutional Generative Adversarial Networks for Generating Synthetic Healthcare Records

Authors: Amirsina Torfi, Edward A. Fox

Abstract: Deep learning models have demonstrated high-quality performance in areas such as image classification and speech processing. However, creating a deep learning model using electronic health record (EHR) data requires addressing particular privacy challenges that are unique to researchers in this domain. This matter focuses attention on generating realistic synthetic data while ensuring privacy. In this paper, we propose a novel framework called correlation-capturing Generative Adversarial Network (CorGAN), to generate synthetic healthcare records. In CorGAN we utilize Convolutional Neural Networks to capture the correlations between adjacent medical features in the data representation space by combining Convolutional Generative Adversarial Networks and Convolutional Autoencoders. To demonstrate the model fidelity, we show that CorGAN generates synthetic data with performance similar to that of real data in various Machine Learning settings such as classification and prediction. We also give a privacy assessment and report on statistical analysis regarding realistic characteristics of the synthetic data. The software of this work is open-source and is available at: https://github.com/astorfi/cor-gan.

Submitted 4 March, 2020; v1 submitted 25 January, 2020; originally announced January 2020.

Comments: Accepted to be published in the 33rd International FLAIRS Conference, AI in Healthcare Informatics

arXiv:1911.05683 [pdf, other]

Modeling patterns of smartphone usage and their relationship to cognitive health

Authors: Jonas Rauber, Emily B. Fox, Leon A. Gatys

Abstract: The ubiquity of smartphone usage in many people's lives makes it a rich source of information about a person's mental and cognitive state. In this work we analyze 12 weeks of phone usage data from 113 older adults, 31 with diagnosed cognitive impairment and 82 without. We develop structured models of users' smartphone interactions to reveal differences in phone usage patterns between people with and without cognitive impairment. In particular, we focus on inferring specific types of phone usage sessions that are predictive of cognitive impairment. Our model achieves an AUROC of 0.79 when discriminating between healthy and symptomatic subjects, and its interpretability enables novel insights into which aspects of phone usage strongly relate with cognitive health in our dataset.

Submitted 13 November, 2019; originally announced November 2019.

Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract

arXiv:1905.07473 [pdf, other]

Adaptively Truncating Backpropagation Through Time to Control Gradient Bias

Authors: Christopher Aicher, Nicholas J. Foti, Emily B. Fox

Abstract: Truncated backpropagation through time (TBPTT) is a popular method for learning in recurrent neural networks (RNNs) that saves computation and memory at the cost of bias by truncating backpropagation after a fixed number of lags. In practice, choosing the optimal truncation length is difficult: TBPTT will not converge if the truncation length is too small, or will converge slowly if it is too large. We propose an adaptive TBPTT scheme that converts the problem from choosing a temporal lag to one of choosing a tolerable amount of gradient bias. For many realistic RNNs, the TBPTT gradients decay geometrically in expectation for large lags; under this condition, we can control the bias by varying the truncation length adaptively. For RNNs with smooth activation functions, we prove that this bias controls the convergence rate of SGD with biased gradients for our non-convex loss. Using this theory, we develop a practical method for adaptively estimating the truncation length during training. We evaluate our adaptive TBPTT method on synthetic data and language modeling tasks and find that our adaptive TBPTT ameliorates the computational pitfalls of fixed TBPTT.

Submitted 1 July, 2019; v1 submitted 17 May, 2019; originally announced May 2019.

arXiv:1905.06535 [pdf]

doi 10.1038/s41586-020-2049-7

Tunable Correlated Chern Insulator and Ferromagnetism in Trilayer Graphene/Boron Nitride Moiré Superlattice

Authors: Guorui Chen, Aaron L. Sharpe, Eli J. Fox, Ya-Hui Zhang, Shaoxin Wang, Lili Jiang, Bosai Lyu, Hongyuan Li, Kenji Watanabe, Takashi Taniguchi, Zhiwen Shi, T. Senthil, David Goldhaber-Gordon, Yuanbo Zhang, Feng Wang

Abstract: Studies on two-dimensional electron systems in a strong magnetic field first revealed the quantum Hall (QH) effect, a topological state of matter featuring a finite Chern number (C) and chiral edge states. Haldane later theorized that Chern insulators with integer QH effects could appear in lattice models with complex hopping parameters even at zero magnetic field. The ABC-trilayer graphene/hexagonal boron nitride (TLG/hBN) moiré superlattice provides an attractive platform to explore Chern insulators because it features nearly flat moiré minibands with a valley-dependent electrically tunable Chern number. Here we report the experimental observation of a correlated Chern insulator in a TLG/hBN moiré superlattice. We show that reversing the direction of the applied vertical electric field switches TLG/hBN's moiré minibands between zero and finite Chern numbers, as revealed by dramatic changes in magneto-transport behavior. For topological hole minibands tuned to have a finite Chern number, we focus on 1/4 filling, corresponding to one hole per moiré unit cell. The Hall resistance is well quantized at $h/2e^2$, i.e. C = 2, for |B| > 0.4 T. The correlated Chern insulator is ferromagnetic, exhibiting significant magnetic hysteresis and a large anomalous Hall signal at zero magnetic field. Our discovery of a C = 2 Chern insulator at zero magnetic field should open up exciting opportunities for discovering novel correlated topological states, possibly with novel topological excitations, in nearly flat and topologically nontrivial moiré minibands.

Submitted 16 May, 2019; originally announced May 2019.

Comments: 16 pages, 4 figures, and 2 extended figures

arXiv:1901.10568 [pdf, other]

Stochastic Gradient MCMC for Nonlinear State Space Models

Authors: Christopher Aicher, Srshti Putcha, Christopher Nemeth, Paul Fearnhead, Emily B. Fox

Abstract: State space models (SSMs) provide a flexible framework for modeling complex time series via a latent stochastic process. Inference for nonlinear, non-Gaussian SSMs is often tackled with particle methods that do not scale well to long time series. The challenge is two-fold: not only do computations scale linearly with time, as in the linear case, but particle filters additionally suffer from increasing particle degeneracy with longer series. Stochastic gradient MCMC methods have been developed to scale Bayesian inference for finite-state hidden Markov models and linear SSMs using buffered stochastic gradient estimates to account for temporal dependencies. We extend these stochastic gradient estimators to nonlinear SSMs using particle methods. We present error bounds that account for both buffering error and particle error in the case of nonlinear SSMs that are log-concave in the latent process. We evaluate our proposed particle buffered stochastic gradient using stochastic gradient MCMC for inference on both long sequential synthetic and minute-resolution financial returns data, demonstrating the importance of this class of methods.

Submitted 16 July, 2023; v1 submitted 29 January, 2019; originally announced January 2019.

Comments: To appear in Bayesian Analysis

arXiv:1901.04621 [pdf]

doi 10.1038/s41586-019-1393-y

Signatures of Gate-Tunable Superconductivity in Trilayer Graphene/Boron Nitride Moiré Superlattice

Authors: Guorui Chen, Aaron L. Sharpe, Patrick Gallagher, Ilan T. Rosen, Eli Fox, Lili Jiang, Bosai Lyu, Hongyuan Li, Kenji Watanabe, Takashi Taniguchi, Jeil Jung, Zhiwen Shi, David Goldhaber-Gordon, Yuanbo Zhang, Feng Wang

Abstract: Understanding the mechanism of high temperature (high Tc) superconductivity is a central problem in condensed matter physics. It is often speculated that high Tc superconductivity arises from a doped Mott insulator as described by the Hubbard model. An exact solution of the Hubbard model, however, is extremely challenging due to the strong electron-electron correlation. Therefore, it is highly desirable to experimentally study a model Hubbard system in which the unconventional superconductivity can be continuously tuned by varying the Hubbard parameters. Here we report signatures of tunable superconductivity in ABC-trilayer graphene (TLG) / boron nitride (hBN) moiré superlattice. Unlike "magic angle" twisted bilayer graphene, theoretical calculations show that under a vertical displacement field the ABC-TLG/hBN heterostructure features an isolated flat valence miniband associated with a Hubbard model on a triangular superlattice. Upon applying such a displacement field we find experimentally that the ABC-TLG/hBN superlattice displays Mott insulating states below 20 Kelvin at 1/4 and 1/2 fillings, corresponding to 1 and 2 holes per unit cell, respectively. Upon further cooling, signatures of superconducting domes emerge below 1 kelvin for the electron- and hole-doped sides of the 1/4 filling Mott state. The electronic behavior in the TLG/hBN superlattice is expected to depend sensitively on the interplay between the electron-electron interaction and the miniband bandwidth, which can be tuned continuously with the displacement field D.
By simply varying the D field, we demonstrate transitions from the candidate superconductor to Mott insulator and metallic phases. Our study shows that TLG/hBN heterostructures offer an attractive model system to explore rich correlated behavior emerging in the tunable triangular Hubbard model.

Submitted 14 January, 2019; originally announced January 2019.

Comments: 14 pages, 4 figures

Journal ref: Nature 572, 215-219 (2019)

arXiv:1901.03520 [pdf, other]

doi 10.1126/science.aaw3780

Emergent ferromagnetism near three-quarters filling in twisted bilayer graphene

Authors: Aaron L. Sharpe, Eli J. Fox, Arthur W. Barnard, Joe Finney, Kenji Watanabe, Takashi Taniguchi, M. A. Kastner, David Goldhaber-Gordon

Abstract: When two sheets of graphene are stacked at a small twist angle, the resulting flat superlattice minibands are expected to strongly enhance electron-electron interactions. Here we present evidence that near three-quarters ($3/4$) filling of the conduction miniband these enhanced interactions drive the twisted bilayer graphene into a ferromagnetic state. We observe emergent ferromagnetic hysteresis, with a giant anomalous Hall (AH) effect as large as $10.4\ \mathrm{kΩ}$ and signs of chiral edge states in a narrow density range around an apparent insulating state at $3/4$. Surprisingly, the magnetization of the sample can be reversed by applying a small DC current. Although the AH resistance is not quantized and dissipation is significant, we suggest that the system is an incipient Chern insulator.

Submitted 11 January, 2019; originally announced January 2019.

Comments: 18 pages, 4 figures

Journal ref: Science 365, 605-608 (2019)

arXiv:1812.10236 [pdf, other]

Comparing Spatial Regression to Random Forests for Large Environmental Data Sets

Authors: Eric W. Fox, Jay M. Ver Hoef, Anthony R. Olsen

Abstract: Environmental data may be "large" due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates with nonlinear relationships, whereas spatial regression, when using reduced rank methods, has a reputation for good predictive performance when using many records that are spatially autocorrelated. In this study, we compare these two techniques using a data set containing the macroinvertebrate multimetric index (MMI) at 1859 stream sites with over 200 landscape covariates. A primary application is mapping MMI predictions and prediction errors at 1.1 million perennial stream reaches across the conterminous United States. For the spatial regression model, we develop a novel transformation procedure that estimates Box-Cox transformations to linearize covariate relationships and handles possibly zero-inflated covariates. We find that the spatial regression model with transformations, and a subsequent selection of significant covariates, has cross-validation performance slightly better than random forests. We also find that prediction interval coverage is close to nominal for each method, but that spatial regression prediction intervals tend to be narrower and have less variability than quantile regression forest prediction intervals. A simulation study is used to generalize results and clarify advantages of each modeling approach.

Submitted 26 December, 2018; originally announced December 2018.

arXiv:1811.10202 [pdf, other]

A Hybrid Model for Role-related User Classification on Twitter

Authors: Liuqing Li, Ziqian Song, Xuan Zhang, Edward A. Fox

Abstract: To aid a variety of research studies, we propose TWIROLE, a hybrid model for role-related user classification on Twitter, which detects male-related, female-related, and brand-related (i.e., organization or institution) users. TWIROLE leverages features from tweet contents, user profiles, and profile images, and then applies our hybrid model to identify a user's role. To evaluate it, we used two existing large datasets about Twitter users, and conducted both intra- and inter-comparison experiments. TWIROLE outperforms existing methods and obtains more balanced results over the several roles. We also confirm that user names and profile images are good indicators for this task. Our research extends prior work that does not consider brand-related users, and is an aid to future evaluation efforts relative to investigations that rely upon self-labeled datasets.

Submitted 26 November, 2018; originally announced November 2018.

Showing 1–50 of 106 results for author: Fox, E