On in-silico estimation of left ventricular end-diastolic pressure from cardiac strains

Authors: Emilio A. Mendiola, Raza Rana Mehdi, Dipan J. Shah, Reza Avazmohammadi

Abstract: Left ventricular diastolic dysfunction (LVDD) is a group of diseases that adversely affect the passive phase of the cardiac cycle and can lead to heart failure. While left ventricular end-diastolic pressure (LVEDP) is a valuable prognostic measure in LVDD patients, traditional invasive methods of measuring LVEDP present risks and limitations, highlighting the need for alternative approaches. This paper investigates the possibility of measuring LVEDP non-invasively using inverse in-silico modeling. We propose the adoption of patient-specific cardiac modeling and simulation to estimate LVEDP and myocardial stiffness from cardiac strains. We have developed a high-fidelity patient-specific computational model of the left ventricle. Through an inverse modeling approach, myocardial stiffness and LVEDP were accurately estimated from cardiac strains that can be acquired from in vivo imaging, indicating the feasibility of computational modeling to augment current approaches in the measurement of ventricular pressure. Integration of such computational platforms into clinical practice holds promise for early detection and comprehensive assessment of LVDD with reduced risk for patients.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.18334 [pdf, other]

SketchQL Demonstration: Zero-shot Video Moment Querying with Sketches

Authors: Renzhi Wu, Pramod Chunduri, Dristi J Shah, Ashmitha Julius Aravind, Ali Payani, Xu Chu, Joy Arulraj, Kexin Rong

Abstract: In this paper, we will present SketchQL, a video database management system (VDBMS) for retrieving video moments with a sketch-based query interface. This novel interface allows users to specify object trajectory events with simple mouse drag-and-drop operations. Users can use trajectories of single objects as building blocks to compose complex events. Using a pre-trained model that encodes trajectory similarity, SketchQL achieves zero-shot video moments retrieval by performing similarity searches over the video to identify clips that are the most similar to the visual query. In this demonstration, we introduce the graphic user interface of SketchQL and detail its functionalities and interaction mechanisms. We also demonstrate the end-to-end usage of SketchQL from query composition to video moments retrieval using real-world scenarios.

Submitted 30 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

Journal ref: Published on International Conference on Very Large Databases 2024

arXiv:2404.15683 [pdf, other]

AnoFPDM: Anomaly Segmentation with Forward Process of Diffusion Models for Brain MRI

Authors: Yiming Che, Fazle Rafsani, Jay Shah, Md Mahfuzur Rahman Siddiquee, Teresa Wu

Abstract: Weakly-supervised diffusion models (DMs) in anomaly segmentation, leveraging image-level labels, have attracted significant attention for their superior performance compared to unsupervised methods. It eliminates the need for pixel-level labels in training, offering a more cost-effective alternative to supervised methods. However, existing methods are not fully weakly-supervised because they heavily rely on costly pixel-level labels for hyperparameter tuning in inference. To tackle this challenge, we introduce Anomaly Segmentation with Forward Process of Diffusion Models (AnoFPDM), a fully weakly-supervised framework that operates without the need of pixel-level labels. Leveraging the unguided forward process as a reference for the guided forward process, we select hyperparameters such as the noise scale, the threshold for segmentation and the guidance strength. We aggregate anomaly maps from guided forward process, enhancing the signal strength of anomalous regions. Remarkably, our proposed method outperforms recent state-of-the-art weakly-supervised approaches, even without utilizing pixel-level labels.

Submitted 29 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

Comments: v2: updated introduction, experiments and supplementary material

arXiv:2403.17124 [pdf, other]

Grounding Language Plans in Demonstrations Through Counterfactual Perturbations

Authors: Yanwei Wang, Tsun-Hsuan Wang, Jiayuan Mao, Michael Hagenow, Julie Shah

Abstract: Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI. Whereas prior works have focused on leveraging LLMs directly for planning in symbolic spaces, this work uses LLMs to guide the search of task structures and constraints implicit in multi-step demonstrations. Specifically, we borrow from manipulation planning literature the concept of mode families, which group robot configurations by specific motion constraints, to serve as an abstraction layer between the high-level language representations of an LLM and the low-level physical trajectories of a robot. By replaying a few human demonstrations with synthetic perturbations, we generate coverage over the demonstrations' state space with additional successful executions as well as counterfactuals that fail the task. Our explanation-based learning framework trains an end-to-end differentiable neural network to predict successful trajectories from failures and as a by-product learns classifiers that ground low-level states and images in mode families without dense labeling. The learned grounding classifiers can further be used to translate language plans into reactive policies in the physical domain in an interpretable manner. We show our approach improves the interpretability and reactivity of imitation learning through 2D navigation and simulated and real robot manipulation tasks. Website: https://yanweiw.github.io/glide

Submitted 29 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: ICLR 2024 Spotlight

arXiv:2403.15469 [pdf, other]

Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning

Authors: Shivam Ratnakant Mhaskar, Nirmesh J. Shah, Mohammadi Zaki, Ashishkumar P. Gudmalwar, Pankaj Wasnik, Rajiv Ratn Shah

Abstract: Traditional Automatic Video Dubbing (AVD) pipeline consists of three key modules, namely, Automatic Speech Recognition (ASR), Neural Machine Translation (NMT), and Text-to-Speech (TTS). Within AVD pipelines, isometric-NMT algorithms are employed to regulate the length of the synthesized output text. This is done to guarantee synchronization with respect to the alignment of video and audio subsequent to the dubbing process. Previous approaches have focused on aligning the number of characters and words in the source and target language texts of Machine Translation models. However, our approach aims to align the number of phonemes instead, as they are closely associated with speech duration. In this paper, we present the development of an isometric NMT system using Reinforcement Learning (RL), with a focus on optimizing the alignment of phoneme counts in the source and target language sentence pairs. To evaluate our models, we propose the Phoneme Count Compliance (PCC) score, which is a measure of length compliance. Our approach demonstrates a substantial improvement of approximately 36% in the PCC score compared to the state-of-the-art models when applied to English-Hindi language pairs. Moreover, we propose a student-teacher architecture within the framework of our RL approach to maintain a trade-off between the phoneme count and translation quality.

Submitted 20 March, 2024; originally announced March 2024.

Comments: Accepted in NAACL2024 Findings

arXiv:2403.10522 [pdf, other]

doi 10.1109/WACV57701.2024.00770

Ordinal Classification with Distance Regularization for Robust Brain Age Prediction

Authors: Jay Shah, Md Mahfuzur Rahman Siddiquee, Yi Su, Teresa Wu, Baoxin Li

Abstract: Age is one of the major known risk factors for Alzheimer's Disease (AD). Detecting AD early is crucial for effective treatment and preventing irreversible brain damage. Brain age, a measure derived from brain imaging reflecting structural changes due to aging, may have the potential to identify AD onset, assess disease risk, and plan targeted interventions. Deep learning-based regression techniques to predict brain age from magnetic resonance imaging (MRI) scans have shown great accuracy recently. However, these methods are subject to an inherent regression to the mean effect, which causes a systematic bias resulting in an overestimation of brain age in young subjects and underestimation in old subjects. This weakens the reliability of predicted brain age as a valid biomarker for downstream clinical applications. Here, we reformulate the brain age prediction task from regression to classification to address the issue of systematic bias. Recognizing the importance of preserving ordinal information from ages to understand aging trajectory and monitor aging longitudinally, we propose a novel ORdinal Distance Encoded Regularization (ORDER) loss that incorporates the order of age labels, enhancing the model's ability to capture age-related patterns. Extensive experiments and ablation studies demonstrate that this framework reduces systematic bias, outperforms state-of-art methods by statistically significant margins, and can better capture subtle differences between clinical groups in an independent AD dataset.
Our implementation is publicly available at https://github.com/jaygshah/Robust-Brain-Age-Prediction.

Submitted 6 May, 2024; v1 submitted 25 October, 2023; originally announced March 2024.

Comments: Accepted in WACV 2024

arXiv:2403.08231 [pdf, other]

Object Permanence Filter for Robust Tracking with Interactive Robots

Authors: Shaoting Peng, Margaret X. Wang, Julie A. Shah, Nadia Figueroa

Abstract: Object permanence, which refers to the concept that objects continue to exist even when they are no longer perceivable through the senses, is a crucial aspect of human cognitive development. In this work, we seek to incorporate this understanding into interactive robots by proposing a set of assumptions and rules to represent object permanence in multi-object, multi-agent interactive scenarios. We integrate these rules into the particle filter, resulting in the Object Permanence Filter (OPF). For multi-object scenarios, we propose an ensemble of K interconnected OPFs, where each filter predicts plausible object tracks that are resilient to missing, noisy, and kinematically or dynamically infeasible measurements, thus bringing perceptional robustness. Through several interactive scenarios, we demonstrate that the proposed OPF approach provides robust tracking in human-robot interactive tasks agnostic to measurement type, even in the presence of prolonged and complete occlusion. Webpage: https://opfilter.github.io/.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 2024 IEEE International Conference on Robotics and Automation (ICRA)

arXiv:2402.18759 [pdf, other]

Learning with Language-Guided State Abstractions

Authors: Andi Peng, Ilia Sucholutsky, Belinda Z. Li, Theodore R. Sumers, Thomas L. Griffiths, Jacob Andreas, Julie A. Shah

Abstract: We describe a framework for using natural language to design state abstractions for imitation learning. Generalizable policy learning in high-dimensional observation spaces is facilitated by well-designed state representations, which can surface important features of an environment and hide irrelevant ones. These state representations are typically manually specified, or derived from other labor-intensive labeling procedures. Our method, LGA (language-guided abstraction), uses a combination of natural language supervision and background knowledge from language models (LMs) to automatically build state representations tailored to unseen tasks. In LGA, a user first provides a (possibly incomplete) description of a target task in natural language; next, a pre-trained LM translates this task description into a state abstraction function that masks out irrelevant features; finally, an imitation policy is trained using a small number of demonstrations and LGA-generated abstract states. Experiments on simulated robotic tasks show that LGA yields state abstractions similar to those designed by humans, but in a fraction of the time, and that these abstractions improve generalization and robustness in the presence of spurious correlations and ambiguous specifications. We illustrate the utility of the learned abstractions on mobile manipulation tasks with a Spot robot.

Submitted 6 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: ICLR 2024

arXiv:2402.15427 [pdf, other]

doi 10.1145/3613904.3642427

Understanding Entrainment in Human Groups: Optimising Human-Robot Collaboration from Lessons Learned during Human-Human Collaboration

Authors: Eike Schneiders, Christopher Fourie, Stanley Celestin, Julie Shah, Malte Jung

Abstract: Successful entrainment during collaboration positively affects trust, willingness to collaborate, and likeability towards collaborators. In this paper, we present a mixed-method study to investigate characteristics of successful entrainment leading to pair and group-based synchronisation. Drawing inspiration from industrial settings, we designed a fast-paced, short-cycle repetitive task. Using motion tracking, we investigated entrainment in both dyadic and triadic task completion. Furthermore, we utilise audio-video recordings and semi-structured interviews to contextualise participants' experiences. This paper contributes to the Human-Computer/Robot Interaction (HCI/HRI) literature using a human-centred approach to identify characteristics of entrainment during pair- and group-based collaboration. We present five characteristics related to successful entrainment. These are related to the occurrence of entrainment, leader-follower patterns, interpersonal communication, the importance of the point-of-assembly, and the value of acoustic feedback. Finally, we present three design considerations for future research and design on collaboration with robots.

Submitted 23 February, 2024; originally announced February 2024.

Comments: Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24), May 11--16, 2024, Honolulu, HI, USA

arXiv:2402.13428 [pdf]

Emergence and dynamics of delusions and hallucinations across stages in early psychosis

Authors: Catalina Mourgues-Codern, David Benrimoh, Jay Gandhi, Emily A. Farina, Raina Vin, Tihare Zamorano, Deven Parekh, Ashok Malla, Ridha Joober, Martin Lepage, Srividya N. Iyer, Jean Addington, Carrie E. Bearden, Kristin S. Cadenhead, Barbara Cornblatt, Matcheri Keshavan, William S. Stone, Daniel H. Mathalon, Diana O. Perkins, Elaine F. Walker, Tyrone D. Cannon, Scott W. Woods, Jai L. Shah, Albert R. Powers

Abstract: Hallucinations and delusions are often grouped together within the positive symptoms of psychosis. However, recent evidence suggests they may be driven by distinct computational and neural mechanisms. Examining the time course of their emergence may provide insights into the relationship between these underlying mechanisms. Participants from the second (N = 719) and third (N = 699) iterations of the North American Prodrome Longitudinal Study (NAPLS 2 and 3) were assessed for timing of CHR-P-level delusion and hallucination onset. Pre-onset symptom patterns in first-episode psychosis patients (FEP) from the Prevention and Early Intervention Program for Psychosis (PEPP-Montreal; N = 694) were also assessed. Symptom onset was determined at baseline assessment and the evolution of symptom patterns examined over 24 months. In all three samples, participants were more likely to report the onset of delusion-spectrum symptoms prior to hallucination-spectrum symptoms (odds ratios (OR): NAPLS 2 = 4.09; NAPLS 3 = 4.14; PEPP, Z = 7.01, P < 0.001) and to present with only delusions compared to only hallucinations (OR: NAPLS 2 = 5.6; NAPLS 3 = 11.11; PEPP = 42.75). Re-emergence of delusions after remission was also more common than re-emergence of hallucinations (Ps < 0.05), and hallucinations more often resolved first (Ps < 0.001). In both CHR-P samples, ratings of delusional ideation fell with the onset of hallucinations (P = 0.007). Delusions tend to emerge before hallucinations and may play a role in their development.
Further work should examine the relationship between the mechanisms driving these symptoms and its utility for diagnosis and treatment.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.06941 [pdf, other]

Achieving Low Latency at Low Outage: Multilevel Coding for mmWave Channels

Authors: Mine Gokce Dogan, Jaimin Shah, Martina Cardone, Christina Fragouli, Wei Mao, Hosein Nikopour, Rath Vannithamby

Abstract: Millimeter-wave (mmWave) spectrum is expected to support data-intensive applications that require ultra-reliable low-latency communications (URLLC). However, mmWave links are highly sensitive to blockage, which may lead to disruptions in the communication. Traditional techniques that build resilience against such blockages (among which are interleaving and feedback mechanisms) incur delays that are too large to effectively support URLLC. This calls for novel techniques that ensure resilient URLLC. In this paper, we propose to deploy multilevel codes over space and over time. These codes offer several benefits, such as they allow to control what information is received and they provide different reliability guarantees for different information streams based on their priority. We also show that deploying these codes leads to attractive trade-offs between rate, delay, and outage probability. A practically-relevant aspect of the proposed technique is that it offers resilience while incurring a low operational complexity.

Submitted 10 February, 2024; originally announced February 2024.

arXiv:2402.03081 [pdf, other]

Preference-Conditioned Language-Guided Abstraction

Authors: Andi Peng, Andreea Bobu, Belinda Z. Li, Theodore R. Sumers, Ilia Sucholutsky, Nishanth Kumar, Thomas L. Griffiths, Julie A. Shah

Abstract: Learning from demonstrations is a common way for users to teach robots, but it is prone to spurious feature correlations. Recent work constructs state abstractions, i.e. visual representations containing task-relevant features, from language as a way to perform more generalizable learning. However, these abstractions also depend on a user's preference for what matters in a task, which may be hard to describe or infeasible to exhaustively specify using language alone. How do we construct abstractions to capture these latent preferences? We observe that how humans behave reveals how they see the world. Our key insight is that changes in human behavior inform us that there are differences in preferences for how humans see the world, i.e. their state abstractions. In this work, we propose using language models (LMs) to query for those preferences directly given knowledge that a change in behavior has occurred. In our framework, we use the LM in two ways: first, given a text description of the task and knowledge of behavioral change between states, we query the LM for possible hidden preferences; second, given the most likely preference, we query the LM to construct the state abstraction. In this framework, the LM is also able to ask the human directly when uncertain about its own estimate. We demonstrate our framework's ability to construct effective preference-conditioned abstractions in simulated experiments, a user study, as well as on a real Spot robot performing mobile manipulation tasks.

Submitted 5 February, 2024; originally announced February 2024.

Comments: HRI 2024

arXiv:2402.01536 [pdf, other]

doi 10.1145/3635636.3656204

Homogenization Effects of Large Language Models on Human Creative Ideation

Authors: Barrett R. Anderson, Jash Hemant Shah, Max Kreminski

Abstract: Large language models (LLMs) are now being used in a wide variety of contexts, including as creativity support tools (CSTs) intended to help their users come up with new ideas. But do LLMs actually support user creativity? We hypothesized that the use of an LLM as a CST might make the LLM's users feel more creative, and even broaden the range of ideas suggested by each individual user, but also homogenize the ideas suggested by different users. We conducted a 36-participant comparative user study and found, in accordance with the homogenization hypothesis, that different users tended to produce less semantically distinct ideas with ChatGPT than with an alternative CST. Additionally, ChatGPT users generated a greater number of more detailed ideas, but felt less responsible for the ideas they generated. We discuss potential implications of these findings for users, designers, and developers of LLM-based CSTs.

Submitted 10 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to C&C 2024

arXiv:2312.11918 [pdf, other]

A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library

Authors: Ganesh Bikshandi, Jay Shah

Abstract: We provide an optimized implementation of the forward pass of FlashAttention-2, a popular memory-aware scaled dot-product attention algorithm, as a custom fused CUDA kernel targeting NVIDIA Hopper architecture and written using the open-source CUTLASS library. In doing so, we explain the challenges and techniques involved in fusing online-softmax with back-to-back GEMM kernels, utilizing the Hopper-specific Tensor Memory Accelerator (TMA) and Warpgroup Matrix-Multiply-Accumulate (WGMMA) instructions, defining and transforming CUTLASS Layouts and Tensors, overlapping copy and GEMM operations, and choosing optimal tile sizes for the Q, K and V attention matrices while balancing the register pressure and shared memory utilization. In head-to-head benchmarks on a single H100 PCIe GPU for some common choices of hyperparameters, we observe 20-50% higher FLOPs/s over a version of FlashAttention-2 optimized for last-generation NVIDIA Ampere architecture.

Submitted 19 December, 2023; originally announced December 2023.

Comments: 13 pages, comments welcome

arXiv:2311.11448 [pdf]

Fast and Facile Synthesis Route to Epitaxial Oxide Membrane Using a Sacrificial Layer

Authors: Shivasheesh Varshney, Sooho Choo, Liam Thompson, Zhifei Yang, Jay Shah, Jiaxuan Wen, Steven J. Koester, K. Andre Mkhoyan, Alexander McLeod, Bharat Jalan

Abstract: The advancement in thin-film exfoliation for synthesizing oxide membranes has opened up new possibilities for creating artificially-assembled heterostructures with structurally and chemically incompatible materials. The sacrificial layer method is a promising approach to exfoliate as-grown films from a compatible material system, allowing their integration with dissimilar materials. Nonetheless, the conventional sacrificial layers often possess intricate stoichiometry, thereby constraining their practicality and adaptability, particularly when considering techniques like Molecular Beam Epitaxy (MBE). This is where easy-to-grow binary alkaline earth metal oxides with a rock salt crystal structure are useful. These oxides, which include (Mg, Ca, Sr, Ba)O, can be used as a sacrificial layer covering a much broader range of lattice parameters compared to conventional sacrificial layers and are easily dissolvable in deionized water. In this study, we show the epitaxial growth of single-crystalline perovskite SrTiO3 (STO) on sacrificial layers consisting of crystalline SrO, BaO, and Ba1-xCaxO films, employing a hybrid MBE method. Our results highlight the rapid (< 5 minutes) dissolution of the sacrificial layer when immersed in deionized water, facilitating the fabrication of millimeter-sized STO membranes.
Using high-resolution x-ray diffraction, atomic-force microscopy, scanning transmission electron microscopy, impedance spectroscopy, and scattering-type near-field optical microscopy (SNOM), we demonstrate epitaxial STO membranes with bulk-like intrinsic dielectric properties. The employment of alkaline earth metal oxides as sacrificial layers is likely to simplify membrane synthesis, particularly with MBE, thus expanding research possibilities.

Submitted 19 November, 2023; originally announced November 2023.

Comments: 36 pages, 4 figures

arXiv:2310.17550 [pdf, other]

Human-Guided Complexity-Controlled Abstractions

Authors: Andi Peng, Mycal Tucker, Eoin Kenny, Noga Zaslavsky, Pulkit Agrawal, Julie Shah

Abstract: Neural networks often learn task-specific latent representations that fail to generalize to novel settings or tasks. Conversely, humans learn discrete representations (i.e., concepts or words) at a variety of abstraction levels (e.g., "bird" vs. "sparrow") and deploy the appropriate abstraction based on task. Inspired by this, we train neural models to generate a spectrum of discrete representations, and control the complexity of the representations (roughly, how many bits are allocated for encoding inputs) by tuning the entropy of the distribution over representations. In finetuning experiments, using only a small number of labeled examples for a new task, we show that (1) tuning the representation to a task-appropriate complexity level supports the highest finetuning performance, and (2) in a human-participant study, users were able to identify the appropriate complexity level for a downstream task using visualizations of discrete representations. Our results indicate a promising direction for rapid model finetuning by leveraging human insight.

Submitted 27 October, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

Comments: NeurIPS 2023

arXiv:2310.07822 [pdf, other]

Body-mounted MR-conditional Robot for Minimally Invasive Liver Intervention

Authors: Zhefeng Huang, Anthony L. Gunderman, Samuel E. Wilcox, Saikat Sengupta, Jay Shah, Aiming Lu, David Woodrum, Yue Chen

Abstract: MR-guided microwave ablation (MWA) has proven effective in treating hepatocellular carcinoma (HCC) with small-sized tumors, but the state-of-the-art technique suffers from sub-optimal workflow due to speed and accuracy of needle placement. This paper presents a compact body-mounted MR-conditional robot that can operate in closed-bore MR scanners for accurate needle guidance. The robotic platform consists of two stacked Cartesian XY stages, each with two degrees of freedom, that facilitate needle guidance. The robot is actuated using 3D-printed pneumatic turbines with MR-conditional bevel gear transmission systems. Pneumatic valves and control mechatronics are located inside the MRI control room and are connected to the robot with pneumatic transmission lines and optical fibers. Free space experiments indicated robot-assisted needle insertion error of 2.6$\pm$1.3 mm at an insertion depth of 80 mm. The MR-guided phantom studies were conducted to verify the MR-conditionality and targeting performance of the robot. Future work will focus on the system optimization and validations in animal trials.

Submitted 25 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: 10 figures

arXiv:2310.07802 [pdf, other]

An Information Bottleneck Characterization of the Understanding-Workload Tradeoff

Authors: Lindsay Sanneman, Mycal Tucker, Julie Shah

Abstract: Recent advances in artificial intelligence (AI) have underscored the need for explainable AI (XAI) to support human understanding of AI systems. Consideration of human factors that impact explanation efficacy, such as mental workload and human understanding, is central to effective XAI design. Existing work in XAI has demonstrated a tradeoff between understanding and workload induced by different types of explanations. Explaining complex concepts through abstractions (hand-crafted groupings of related problem features) has been shown to effectively address and balance this workload-understanding tradeoff. In this work, we characterize the workload-understanding balance via the Information Bottleneck method: an information-theoretic approach which automatically generates abstractions that maximize informativeness and minimize complexity. In particular, we establish empirical connections between workload and complexity and between understanding and informativeness through human-subject experiments. This empirical link between human factors and information-theoretic concepts provides an important mathematical characterization of the workload-understanding tradeoff which enables user-tailored XAI design.

Submitted 11 October, 2023; originally announced October 2023.

arXiv:2310.02486 [pdf, other]

OCU-Net: A Novel U-Net Architecture for Enhanced Oral Cancer Segmentation

Authors: Ahmed Albishri, Syed Jawad Hussain Shah, Yugyung Lee, Rong Wang

Abstract: Accurate detection of oral cancer is crucial for improving patient outcomes. However, the field faces two key challenges: the scarcity of deep learning-based image segmentation research specifically targeting oral cancer and the lack of annotated data. Our study proposes OCU-Net, a pioneering U-Net image segmentation architecture exclusively designed to detect oral cancer in hematoxylin and eosin (H&E) stained image datasets. OCU-Net incorporates advanced deep learning modules, such as the Channel and Spatial Attention Fusion (CSAF) module, a novel and innovative feature that emphasizes important channel and spatial areas in H&E images while exploring contextual information. In addition, OCU-Net integrates other innovative components such as Squeeze-and-Excite (SE) attention module, Atrous Spatial Pyramid Pooling (ASPP) module, residual blocks, and multi-scale fusion. The incorporation of these modules showed superior performance for oral cancer segmentation for two datasets used in this research. Furthermore, we effectively utilized the efficient ImageNet pre-trained MobileNet-V2 model as a backbone of our OCU-Net to create OCU-Netm, an enhanced version achieving state-of-the-art results. Comprehensive evaluation demonstrates that OCU-Net and OCU-Netm outperformed existing segmentation methods, highlighting their precision in identifying cancer cells in H&E images from OCDC and ORCA datasets.

Submitted 3 October, 2023; originally announced October 2023.

arXiv:2309.14314 [pdf, other]

doi 10.1103/PhysRevX.14.011043

Symmetry breaking and ascending in the magnetic kagome metal FeGe

Authors: Shangfei Wu, Mason Klemm, Jay Shah, Ethan T. Ritz, Chunruo Duan, Xiaokun Teng, Bin Gao, Feng Ye, Masaaki Matsuda, Fankang Li, Xianghan Xu, Ming Yi, Turan Birol, Pengcheng Dai, Girsh Blumberg

Abstract: Spontaneous symmetry breaking, the phenomenon where an infinitesimal perturbation can cause the system to break the underlying symmetry, is a cornerstone concept in the understanding of interacting solid-state systems. In a typical series of temperature-driven phase transitions, higher temperature phases are more symmetric due to the stabilizing effect of entropy that becomes dominant as the temperature is increased. However, the opposite is rare but possible when there are multiple degrees of freedom in the system. Here, we present such an example of a symmetry-ascending phenomenon in a magnetic kagome metal FeGe by utilizing neutron Larmor diffraction and Raman spectroscopy. In the paramagnetic state at 460 K, we confirm that the crystal structure is indeed a hexagonal kagome lattice. On cooling to $T_N$, the crystal structure changes from hexagonal to monoclinic with in-plane lattice distortions on the order of $10^{-4}$ and the associated splitting of the doubly degenerate phonon mode of the pristine kagome lattice. Upon further cooling to $T_{CDW}$, the kagome lattice shows a small negative thermal expansion, and the crystal structure becomes more symmetric gradually upon further cooling. Increasing the crystalline symmetry upon cooling is unusual; it originates from an extremely weak structural instability that coexists and competes with the CDW and magnetic orders. These observations are against the expectations for a simple model with a single order parameter, and hence can only be explained by a Landau free energy expansion that takes into account multiple lattice, charge, and spin degrees of freedom. Thus, the determination of the crystalline lattice symmetry as well as the unusual spin-lattice coupling is a first step towards understanding the rich electronic and magnetic properties of the system and sheds new light on intertwined orders where the lattice degree of freedom is no longer dominant.

Submitted 8 March, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: 20 pages with 10 figures, replaced with journal version

Journal ref: Phys. Rev. X 14, 011043 (2024)

arXiv:2308.14089 [pdf, other]

MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records

Authors: Scott L. Fleming, Alejandro Lozano, William J. Haberkorn, Jenelle A. Jindal, Eduardo P. Reis, Rahul Thapa, Louis Blankemeier, Julian Z. Genkins, Ethan Steinberg, Ashwin Nayak, Birju S. Patel, Chia-Chun Chiang, Alison Callahan, Zepeng Huo, Sergios Gatidis, Scott J. Adams, Oluseyi Fayanju, Shreya J. Shah, Thomas Savage, Ethan Goh, Akshay S. Chaudhari, Nima Aghaeepour, Christopher Sharp, Michael A. Pfeffer, Percy Liang, et al. (5 additional authors not shown)

Abstract: The ability of large language models (LLMs) to follow natural language instructions with human-level fluency suggests many opportunities in healthcare to reduce administrative burden and improve quality of care. However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging. Existing question answering datasets for electronic health record (EHR) data fail to capture the complexity of information needs and documentation burdens experienced by clinicians. To address these challenges, we introduce MedAlign, a benchmark dataset of 983 natural language instructions for EHR data. MedAlign is curated by 15 clinicians (7 specialities), includes clinician-written reference responses for 303 instructions, and provides 276 longitudinal EHRs for grounding instruction-response pairs. We used MedAlign to evaluate 6 general domain LLMs, having clinicians rank the accuracy and quality of each LLM response. We found high error rates, ranging from 35% (GPT-4) to 68% (MPT-7B-Instruct), and an 8.3% drop in accuracy moving from 32k to 2k context lengths for GPT-4. Finally, we report correlations between clinician rankings and automated natural language generation metrics as a way to rank LLMs without human review. We make MedAlign available under a research data use agreement to enable LLM evaluations on tasks aligned with clinician needs and preferences.

Submitted 24 December, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

arXiv:2307.06333 [pdf, other]

Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation

Authors: Andi Peng, Aviv Netanyahu, Mark Ho, Tianmin Shu, Andreea Bobu, Julie Shah, Pulkit Agrawal

Abstract: Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments. Data augmentation can increase robustness by making the model invariant to task-irrelevant changes in the agent's observation. However, designers don't know which concepts are irrelevant a priori, especially when different end users have different preferences about how the task is performed. We propose an interactive framework to leverage feedback directly from the user to identify personalized task-irrelevant concepts. Our key idea is to generate counterfactual demonstrations that allow users to quickly identify possible task-relevant and irrelevant concepts. The knowledge of task-irrelevant concepts is then used to perform data augmentation and thus obtain a policy adapted to personalized user objectives. We present experiments validating our framework on discrete and continuous control tasks with real human users. Our method (1) enables users to better understand agent failure, (2) reduces the number of demonstrations required for fine-tuning, and (3) aligns the agent to individual user task preferences.

Submitted 13 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

Comments: International Conference on Machine Learning (ICML) 2023

arXiv:2306.06084 [pdf]

doi 10.1109/M2VIP55626.2022.10041089

Machine Vision Using Cellphone Camera: A Comparison of deep networks for classifying three challenging denominations of Indian Coins

Authors: Keyur D. Joshi, Dhruv Shah, Varshil Shah, Nilay Gandhi, Sanket J. Shah, Sanket B. Shah

Abstract: Indian currency coins come in a variety of denominations. Of all the varieties, Rs.1, Rs.2, and Rs.5 have similar diameters. The majority of the coin styles in market circulation for denominations of Rs.1 and Rs.2 coins are nearly the same except for numerals on the reverse side. If a coin is resting on its obverse side, the correct denomination is not distinguishable by humans. Therefore, it was hypothesized that a digital image of a coin resting on either side could be classified into its correct denomination by training a deep neural network model. The digital images were generated by using cheap cell phone cameras. To find the most suitable deep neural network architecture, four were selected based on the preliminary analysis carried out for comparison. The results confirm that two of the four deep neural network models can classify the correct denomination from either side of a coin with an accuracy of 97%.

Submitted 12 May, 2023; originally announced June 2023.

Comments: 6 Pages, 4 Figures, 6 Tables, Conference paper

arXiv:2305.11271 [pdf, other]

doi 10.24963/ijcai.2023/330

Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue

Authors: Cristian-Paul Bara, Ziqiao Ma, Yingzhuo Yu, Julie Shah, Joyce Chai

Abstract: Collaborative tasks often begin with partial task knowledge and incomplete initial plans from each partner. To complete these tasks, agents need to engage in situated communication with their partners and coordinate their partial plans towards a complete plan to achieve a joint task goal. While such collaboration seems effortless in a human-human team, it is highly challenging for human-AI collaboration. To address this limitation, this paper takes a step towards collaborative plan acquisition, where humans and agents strive to learn and communicate with each other to acquire a complete plan for joint tasks. Specifically, we formulate a novel problem for agents to predict the missing task knowledge for themselves and for their partners based on rich perceptual and dialogue history. We extend a situated dialogue benchmark for symmetric collaborative tasks in a 3D blocks world and investigate computational strategies for plan acquisition. Our empirical results suggest that predicting the partner's missing knowledge is a more viable approach than predicting one's own. We show that explicit modeling of the partner's dialogue moves and mental states produces improved and more stable results than without. These results provide insight for future AI agents that can predict what knowledge their partner is missing and, therefore, can proactively communicate such information to help their partner acquire such missing knowledge toward a common understanding of joint tasks.

Submitted 18 May, 2023; originally announced May 2023.

Journal ref: International Joint Conferences on Artificial Intelligence (IJCAI 2023)

arXiv:2305.06259 [pdf]

Symmetry and nonlinearity of spin wave resonance excited by focused surface acoustic waves

Authors: Piyush J. Shah, Derek A. Bas, Abbass Hamadeh, Michael Wolf, Andrew Franson, Michael Newburger, Philipp Pirro, Mathias Weiler, Michael R. Page

Abstract: The use of a complex ferromagnetic system to manipulate GHz surface acoustic waves is a rich current topic under investigation, but the high-power nonlinear regime is under-explored. We introduce focused surface acoustic waves, which provide a way to access this regime with modest equipment. Symmetry of the magneto-acoustic interaction can be tuned by interdigitated transducer design which can introduce additional strain components. Here, we compare the impact of focused acoustic waves versus standard unidirectional acoustic waves in significantly enhancing the magnon-phonon coupling behavior. Analytical simulation results based on modified Landau-Lifshitz-Gilbert theory show good agreement with experimental findings. We also report nonlinear input power dependence of the transmission through the device. This experimental observation is supported by the micromagnetic simulation using mumax3 to model the nonlinear dependence. These results pave the way for extending the understanding and design of acoustic wave devices for exploration of acoustically driven spin wave resonance physics.

Submitted 10 May, 2023; originally announced May 2023.

Comments: 13 pages, 8 figures

arXiv:2302.11330 [pdf, other]

doi 10.3847/PSJ/acbe40

doi 10.7910/DVN/XEGCCO

Modeling the formation of Selk impact crater on Titan: Implications for Dragonfly

Authors: Shigeru Wakita, Brandon C. Johnson, Jason M. Soderblom, Jahnavi Shah, Catherine D. Neish, Jordan K. Steckloff

Abstract: Selk crater is an $\sim$ 80 km diameter impact crater on the Saturnian icy satellite, Titan. Melt pools associated with impact craters like Selk provide environments where liquid water and organics can mix and produce biomolecules like amino acids. It is partly for this reason that the Selk region has been selected as the area that NASA's Dragonfly mission will explore and address one of its primary goals: to search for biological signatures on Titan. Here we simulate Selk-sized impact craters on Titan to better understand the formation of Selk and its melt pool. We consider several structures for the icy target material by changing the thickness of the methane clathrate layer, which has a substantial effect on the target thermal structure and crater formation. Our numerical results show that a 4 km diameter impactor produces a Selk-sized crater when 5-15 km thick methane clathrate layers are considered. We confirm the production of melt pools in these cases and find that the melt volumes are similar regardless of methane clathrate layer thickness. The distribution of the melted material, however, is sensitive to the thickness of the methane clathrate layer. The melt pool appears as a torus-like shape with a few km depth in the case of a 10-15 km thick methane clathrate layer, and as a shallower layer in the case of a 5 km thick clathrate layer. Melt pools of this thickness may take tens of thousands of years to freeze, allowing more time for complex organics to form.

Submitted 22 February, 2023; originally announced February 2023.

Comments: 32 pages, 11 figures, accepted for publication in PSJ

arXiv:2302.09200 [pdf, other]

Brainomaly: Unsupervised Neurologic Disease Detection Utilizing Unannotated T1-weighted Brain MR Images

Authors: Md Mahfuzur Rahman Siddiquee, Jay Shah, Teresa Wu, Catherine Chong, Todd J. Schwedt, Gina Dumkrieger, Simona Nikolova, Baoxin Li

Abstract: Harnessing the power of deep neural networks in the medical imaging domain is challenging due to the difficulties in acquiring large annotated datasets, especially for rare diseases, which involve high costs, time, and effort for annotation. Unsupervised disease detection methods, such as anomaly detection, can significantly reduce human effort in these scenarios. While anomaly detection typically focuses on learning from images of healthy subjects only, real-world situations often present unannotated datasets with a mixture of healthy and diseased subjects. Recent studies have demonstrated that utilizing such unannotated images can improve unsupervised disease and anomaly detection. However, these methods do not utilize knowledge specific to registered neuroimages, resulting in a subpar performance in neurologic disease detection. To address this limitation, we propose Brainomaly, a GAN-based image-to-image translation method specifically designed for neurologic disease detection. Brainomaly not only offers tailored image-to-image translation suitable for neuroimages but also leverages unannotated mixed images to achieve superior neurologic disease detection. Additionally, we address the issue of model selection for inference without annotated samples by proposing a pseudo-AUC metric, further enhancing Brainomaly's detection performance. Extensive experiments and ablation studies demonstrate that Brainomaly outperforms existing state-of-the-art unsupervised disease and anomaly detection methods by significant margins in Alzheimer's disease detection using a publicly available dataset and headache detection using an institutional dataset. The code is available from https://github.com/mahfuzmohammad/Brainomaly.

Submitted 16 August, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

Comments: Accepted in WACV 2024

arXiv:2302.01928 [pdf, other]

doi 10.1145/3610977.3634987

Aligning Robot and Human Representations

Authors: Andreea Bobu, Andi Peng, Pulkit Agrawal, Julie Shah, Anca D. Dragan

Abstract: To act in the world, robots rely on a representation of salient task aspects: for example, to carry a coffee mug, a robot may consider movement efficiency or mug orientation in its behavior. However, if we want robots to act for and with people, their representations must not be just functional but also reflective of what humans care about, i.e. they must be aligned. We observe that current learning approaches suffer from representation misalignment, where the robot's learned representation does not capture the human's representation. We suggest that because humans are the ultimate evaluator of robot performance, we must explicitly focus our efforts on aligning learned representations with humans, in addition to learning the downstream task. We advocate that current representation learning approaches in robotics should be studied from the perspective of how well they accomplish the objective of representation alignment. We mathematically define the problem, identify its key desiderata, and situate current methods within this formalism. We conclude by suggesting future directions for exploring open challenges.

Submitted 28 January, 2024; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: 14 pages, 3 figures, 1 table

arXiv:2301.04657 [pdf, other]

Quantum spin ice in three-dimensional Rydberg atom arrays

Authors: Jeet Shah, Gautam Nambiar, Alexey V. Gorshkov, Victor Galitski

Abstract: Quantum spin liquids are exotic phases of matter whose low-energy physics is described as the deconfined phase of an emergent gauge theory. With recent theory proposals and an experiment showing preliminary signs of $\mathbb{Z}_2$ topological order [G. Semeghini et al., Science 374, 1242 (2021)], Rydberg atom arrays have emerged as a promising platform to realize a quantum spin liquid. In this work, we propose a way to realize a $U(1)$ quantum spin liquid in three spatial dimensions, described by the deconfined phase of $U(1)$ gauge theory in a pyrochlore lattice Rydberg atom array. We study the ground state phase diagram of the proposed Rydberg system as a function of experimentally relevant parameters. Within our calculation, we find that by tuning the Rabi frequency, one can access both the confinement-deconfinement transition driven by a proliferation of "magnetic" monopoles and the Higgs transition driven by a proliferation of "electric" charges of the emergent gauge theory. We suggest experimental probes for distinguishing the deconfined phase from ordered phases. This work serves as a proposal to access a confinement-deconfinement transition in three spatial dimensions on a Rydberg-based quantum simulator.

Submitted 14 June, 2024; v1 submitted 11 January, 2023; originally announced January 2023.

Comments: 28+5 pages, 15+2 figures

arXiv:2211.06318 [pdf]

Artificial Intelligence and Life in 2030: The One Hundred Year Study on Artificial Intelligence

Authors: Peter Stone, Rodney Brooks, Erik Brynjolfsson, Ryan Calo, Oren Etzioni, Greg Hager, Julia Hirschberg, Shivaram Kalyanakrishnan, Ece Kamar, Sarit Kraus, Kevin Leyton-Brown, David Parkes, William Press, AnnaLee Saxenian, Julie Shah, Milind Tambe, Astro Teller

Abstract: In September 2016, Stanford's "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the first report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society. It was written by a panel of 17 study authors, each of whom is deeply rooted in AI research, chaired by Peter Stone of the University of Texas at Austin. The report, entitled "Artificial Intelligence and Life in 2030," examines eight domains of typical urban settings on which AI is likely to have impact over the coming years: transportation, home and service robots, healthcare, education, public safety and security, low-resource communities, employment and workplace, and entertainment. It aims to provide the general public with a scientifically and technologically accurate portrayal of the current state of AI and its potential and to help guide decisions in industry and governments, as well as to inform research and development in the field. The charge for this report was given to the panel by the AI100 Standing Committee, chaired by Barbara Grosz of Harvard University.

Submitted 31 October, 2022; originally announced November 2022.

Comments: 52 pages, https://ai100.stanford.edu/2016-report

arXiv:2211.03587 [pdf, other]

Generalized Product-of-Experts for Learning Multimodal Representations in Noisy Environments

Authors: Abhinav Joshi, Naman Gupta, Jinang Shah, Binod Bhattarai, Ashutosh Modi, Danail Stoyanov

Abstract: A real-world application or setting involves interaction between different modalities (e.g., video, speech, text). In order to process the multimodal information automatically and use it for an end application, Multimodal Representation Learning (MRL) has emerged as an active area of research in recent times. MRL involves learning reliable and robust representations of information from heterogeneous sources and fusing them. However, in practice, the data acquired from different sources are typically noisy. In some extreme cases, a noise of large magnitude can completely alter the semantics of the data leading to inconsistencies in the parallel multimodal data. In this paper, we propose a novel method for multimodal representation learning in a noisy environment via the generalized product of experts technique. In the proposed method, we train a separate network for each modality to assess the credibility of information coming from that modality, and subsequently, the contribution from each modality is dynamically varied while estimating the joint distribution. We evaluate our method on two challenging benchmarks from two diverse domains: multimodal 3D hand-pose estimation and multimodal surgical video segmentation. We attain state-of-the-art performance on both benchmarks. Our extensive quantitative and qualitative evaluations show the advantages of our method compared to previous approaches.

Submitted 7 November, 2022; originally announced November 2022.

Comments: 11 Pages, Accepted at ICMI 2022 Oral
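
Illustrative note on the entry above: the abstract describes weighting each modality's contribution by a learned credibility score when estimating the joint distribution. A minimal sketch of that idea, assuming one-dimensional Gaussian experts and fixed credibility weights (the paper's networks, distributions, and training procedure are not given in the abstract, and the name `gpoe_fuse` is hypothetical), uses the standard closed form for a weighted product of Gaussians:

```python
def gpoe_fuse(means, variances, weights):
    """Fuse 1-D Gaussian experts N(mean_i, variance_i), each raised to weight_i.

    The weighted product of Gaussians is again Gaussian, with
    precision = sum_i weight_i / variance_i and a precision-weighted mean.
    The weights stand in for per-modality credibility scores (an assumption;
    in the paper these would come from learned networks).
    """
    if not (len(means) == len(variances) == len(weights)):
        raise ValueError("means, variances and weights must have equal length")
    precision = sum(w / v for w, v in zip(weights, variances))
    if precision <= 0:
        raise ValueError("at least one expert needs a positive weight")
    mean = sum(w * m / v for m, v, w in zip(means, variances, weights)) / precision
    return mean, 1.0 / precision


# Two equally trusted experts: the fused mean lies halfway between them.
print(gpoe_fuse([0.0, 2.0], [1.0, 1.0], [1.0, 1.0]))
# A weight of zero removes a (noisy) expert from the fused estimate entirely.
print(gpoe_fuse([0.0, 10.0], [1.0, 1.0], [1.0, 0.0]))
```

Lowering a modality's weight widens its effective variance, so a modality judged unreliable pulls the fused estimate less.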

arXiv:2210.15767 [pdf]

Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report

Authors: Michael L. Littman, Ifeoma Ajunwa, Guy Berger, Craig Boutilier, Morgan Currie, Finale Doshi-Velez, Gillian Hadfield, Michael C. Horowitz, Charles Isbell, Hiroaki Kitano, Karen Levy, Terah Lyons, Melanie Mitchell, Julie Shah, Steven Sloman, Shannon Vallor, Toby Walsh

Abstract: In September 2021, the "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the second report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society. It was written by a panel of 17 study authors, each of whom is deeply rooted in AI research, chaired by Michael Littman of Brown University. The report, entitled "Gathering Strength, Gathering Storms," answers a set of 14 questions probing critical areas of AI development addressing the major risks and dangers of AI, its effects on society, its public perception and the future of the field. The report concludes that AI has made a major leap from the lab to people's lives in recent years, which increases the urgency to understand its potential negative effects. The questions were developed by the AI100 Standing Committee, chaired by Peter Stone of the University of Texas at Austin, consisting of a group of AI leaders with expertise in computer science, sociology, ethics, economics, and other disciplines.

Submitted 27 October, 2022; originally announced October 2022.

Comments: 82 pages, https://ai100.stanford.edu/gathering-strength-gathering-storms-one-hundred-year-study-artificial-intelligence-ai100-2021-study

arXiv:2209.01822 [pdf, other]

HealthyGAN: Learning from Unannotated Medical Images to Detect Anomalies Associated with Human Disease

Authors: Md Mahfuzur Rahman Siddiquee, Jay Shah, Teresa Wu, Catherine Chong, Todd Schwedt, Baoxin Li

Abstract: Automated anomaly detection from medical images, such as MRIs and X-rays, can significantly reduce human effort in disease diagnosis. Owing to the complexity of modeling anomalies and the high cost of manual annotation by domain experts (e.g., radiologists), a typical technique in the current medical imaging literature has focused on deriving diagnostic models from healthy subjects only, assuming the model will detect the images from patients as outliers. However, in many real-world scenarios, unannotated datasets with a mix of both healthy and diseased individuals are abundant. Therefore, this paper poses the research question of how to improve unsupervised anomaly detection by utilizing (1) an unannotated set of mixed images, in addition to (2) the set of healthy images as being used in the literature. To answer the question, we propose HealthyGAN, a novel one-directional image-to-image translation method, which learns to translate the images from the mixed dataset to only healthy images. Being one-directional, HealthyGAN relaxes the requirement of cycle consistency of existing unpaired image-to-image translation methods, which is unattainable with mixed unannotated data. Once the translation is learned, we generate a difference map for any given image by subtracting its translated output. Regions of significant responses in the difference map correspond to potential anomalies (if any).
Our HealthyGAN outperforms the conventional state-of-the-art methods by significant margins on two publicly available datasets: COVID-19 and NIH ChestX-ray14, and one institutional dataset collected from Mayo Clinic. The implementation is publicly available at https://github.com/mahfuzmohammad/HealthyGAN.

Submitted 5 September, 2022; originally announced September 2022.

Comments: International Workshop on Simulation and Synthesis in Medical Imaging, MICCAI, 2022
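
Illustrative note on the entry above: the abstract's final step builds a difference map by subtracting the translated ("healthy") output from the input image and reads strong responses as potential anomalies. A minimal sketch of that step alone, assuming grayscale images stored as nested lists, an absolute difference, and a fixed threshold (all three are assumptions, the function names are hypothetical, and the translation network itself is not modeled):

```python
def difference_map(image, translated):
    """Per-pixel absolute difference between an image and its translated output."""
    return [
        [abs(a - b) for a, b in zip(row_img, row_tr)]
        for row_img, row_tr in zip(image, translated)
    ]


def anomaly_mask(diff, threshold):
    """Flag pixels whose difference exceeds a chosen (hypothetical) threshold."""
    return [[d > threshold for d in row] for row in diff]


# Toy 2x2 example: only the top-right pixel changes under translation.
image = [[0.1, 0.9], [0.2, 0.2]]
translated = [[0.1, 0.3], [0.2, 0.2]]
mask = anomaly_mask(difference_map(image, translated), threshold=0.5)
print(mask)  # [[False, True], [False, False]]
```

For an image the translator leaves unchanged, the difference map is zero everywhere and no region is flagged, which matches the abstract's "(if any)".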

arXiv:2207.00088 [pdf, other]

Towards Human-Agent Communication via the Information Bottleneck Principle

Authors: Mycal Tucker, Julie Shah, Roger Levy, Noga Zaslavsky

Abstract: Emergent communication research often focuses on optimizing task-specific utility as a driver for communication. However, human languages appear to evolve under pressure to efficiently compress meanings into communication signals by optimizing the Information Bottleneck tradeoff between informativeness and complexity. In this work, we study how trading off these three factors -- utility, informativeness, and complexity -- shapes emergent communication, including compared to human communication. To this end, we propose Vector-Quantized Variational Information Bottleneck (VQ-VIB), a method for training neural agents to compress inputs into discrete signals embedded in a continuous space. We train agents via VQ-VIB and compare their performance to previously proposed neural architectures in grounded environments and in a Lewis reference game. Across all neural architectures and settings, taking into account communicative informativeness benefits communication convergence rates, and penalizing communicative complexity leads to human-like lexicon sizes while maintaining high utility. Additionally, we find that VQ-VIB outperforms other discrete communication methods. This work demonstrates how fundamental principles that are believed to characterize human language evolution may inform emergent communication in artificial agents.

Submitted 30 June, 2022; originally announced July 2022.

arXiv:2206.04632 [pdf, other]

Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations

Authors: Yanwei Wang, Nadia Figueroa, Shen Li, Ankit Shah, Julie Shah

Abstract: Learning from demonstration (LfD) has succeeded in tasks featuring a long time horizon. However, when the problem complexity also includes human-in-the-loop perturbations, state-of-the-art approaches do not guarantee the successful reproduction of a task. In this work, we identify the roots of this challenge as the failure of a learned continuous policy to satisfy the discrete plan implicit in the demonstration. By utilizing modes (rather than subgoals) as the discrete abstraction and motion policies with both mode invariance and goal reachability properties, we prove our learned continuous policy can simulate any discrete plan specified by a linear temporal logic (LTL) formula. Consequently, an imitator is robust to both task- and motion-level perturbations and guaranteed to achieve task success. Project page: https://yanweiw.github.io/tli/

Submitted 14 December, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

Comments: CoRL 2022 Oral Talk

arXiv:2205.13997 [pdf, other]

Prototype Based Classification from Hierarchy to Fairness

Authors: Mycal Tucker, Julie Shah

Abstract: Artificial neural nets can represent and classify many types of data but are often tailored to particular applications -- e.g., for "fair" or "hierarchical" classification. Once an architecture has been selected, it is often difficult for humans to adjust models for a new task; for example, a hierarchical classifier cannot be easily transformed into a fair classifier that shields a protected field. Our contribution in this work is a new neural network architecture, the concept subspace network (CSN), which generalizes existing specialized classifiers to produce a unified model capable of learning a spectrum of multi-concept relationships. We demonstrate that CSNs reproduce state-of-the-art results in fair classification when enforcing concept independence, may be transformed into hierarchical classifiers, or even reconcile fairness and hierarchy within a single classifier. The CSN is inspired by existing prototype-based classifiers that promote interpretability.

Submitted 27 May, 2022; originally announced May 2022.

arXiv:2205.08696 [pdf, other]

The Solvability of Interpretability Evaluation Metrics

Authors: Yilun Zhou, Julie Shah

Abstract: Feature attribution methods are popular for explaining neural network predictions, and they are often evaluated on metrics such as comprehensiveness and sufficiency. In this paper, we highlight an intriguing property of these metrics: their solvability. Concretely, we can define the problem of optimizing an explanation for a metric, which can be solved by beam search. This observation leads to the obvious yet unaddressed question: why do we use explainers (e.g., LIME) not based on solving the target metric, if the metric value represents explanation quality? We present a series of investigations showing strong performance of this beam search explainer and discuss its broader implication: a definition-evaluation duality of interpretability concepts. We implement the explainer and release the Python solvex package for models of text, image and tabular domains.

Submitted 2 February, 2023; v1 submitted 17 May, 2022; originally announced May 2022.

Comments: EACL 2023 (Findings). Project website at https://yilunzhou.github.io/solvability/

arXiv:2205.00130 [pdf, other]

ExSum: From Local Explanations to Model Understanding

Authors: Yilun Zhou, Marco Tulio Ribeiro, Julie Shah

Abstract: Interpretability methods are developed to understand the working mechanisms of black-box models, which is crucial to their responsible deployment. Fulfilling this goal requires both that the explanations generated by these methods are correct and that people can easily and reliably understand them. While the former has been addressed in prior work, the latter is often overlooked, resulting in informal model understanding derived from a handful of local explanations. In this paper, we introduce explanation summary (ExSum), a mathematical framework for quantifying model understanding, and propose metrics for its quality assessment. On two domains, ExSum highlights various limitations in the current practice, helps develop accurate model understanding, and reveals easily overlooked properties of the model. We also connect understandability to other properties of explanations such as human alignment, robustness, and counterfactual minimality and plausibility.

Submitted 29 April, 2022; originally announced May 2022.

Comments: NAACL 2022. The project website is at https://yilunzhou.github.io/exsum/

arXiv:2204.09722 [pdf, other]

When Does Syntax Mediate Neural Language Model Performance? Evidence from Dropout Probes

Authors: Mycal Tucker, Tiwalayo Eisape, Peng Qian, Roger Levy, Julie Shah

Abstract: Recent causal probing literature reveals when language models and syntactic probes use similar representations. Such techniques may yield "false negative" causality results: models may use representations of syntax, but probes may have learned to use redundant encodings of the same syntactic information. We demonstrate that models do encode syntactic information redundantly and introduce a new probe design that guides probes to consider all syntactic information present in embeddings. Using these probes, we find evidence for the use of syntax in models where prior methods did not, allowing us to boost model performance by injecting syntactic information into representations.

Submitted 20 April, 2022; originally announced April 2022.

arXiv:2203.00072 [pdf, ps, other]

Parametrized and equivariant higher algebra

Authors: Denis Nardin, Jay Shah

Abstract: We develop the rudiments of a theory of parametrized $\infty$-operads, including parametrized generalizations of monoidal envelopes, Day convolution, operadic left Kan extensions, results on limits and colimits of algebras, and the symmetric monoidal Yoneda embedding.

Submitted 28 February, 2022; originally announced March 2022.

Comments: Draft, 60 pages

MSC Class: 18N70

arXiv:2202.12258 [pdf, other]

A Method for Waste Segregation using Convolutional Neural Networks

Authors: Jash Shah, Sagar Kamat

Abstract: Segregation of garbage is a primary concern in many nations across the world. Even though we are in the modern era, many people still do not know how to distinguish between organic and recyclable waste. It is because of this that the world is facing a major crisis of waste disposal. In this paper, we try to use deep learning algorithms to help solve this problem of waste classification. The waste is classified into two categories like organic and recyclable. Our proposed model achieves an accuracy of 94.9%. Although the other two models also show promising results, the Proposed Model stands out with the greatest accuracy. With the help of deep learning, one of the greatest obstacles to efficient waste management can finally be removed.

Submitted 23 February, 2022; originally announced February 2022.

arXiv:2201.12938 [pdf, other]

Probe-Based Interventions for Modifying Agent Behavior

Authors: Mycal Tucker, William Kuhl, Khizer Shahid, Seth Karten, Katia Sycara, Julie Shah

Abstract: Neural nets are powerful function approximators, but the behavior of a given neural net, once trained, cannot be easily modified. We wish, however, for people to be able to influence neural agents' actions despite the agents never training with humans, which we formalize as a human-assisted decision-making problem. Inspired by prior art initially developed for model explainability, we develop a method for updating representations in pre-trained neural nets according to externally-specified properties. In experiments, we show how our method may be used to improve human-agent team performance for a variety of neural networks from image classifiers to agents in multi-agent reinforcement learning settings.

Submitted 26 January, 2022; originally announced January 2022.

arXiv:2201.09587 [pdf, other]

doi 10.3847/PSJ/ac4e91

doi 10.7910/DVN/SBESWM

Methane-saturated layers limit the observability of impact craters on Titan

Authors: Shigeru Wakita, Brandon C. Johnson, Jason M. Soderblom, Jahnavi Shah, Catherine D. Neish

Abstract: As the only icy satellite with a thick atmosphere and liquids on its surface, Titan represents a unique end-member to study the impact cratering process. Unlike craters on other Saturnian satellites, Titan's craters are preferentially located in high-elevation regions near the equator. This led to the hypothesis that the presence of liquid methane in Titan's lowlands affects crater morphology, making them difficult to identify. This is because surfaces covered by weak fluid-saturated sediment limit the topographic expression of impact craters, as sediment moves into the crater cavity shortly after formation. Here we simulate crater-forming impacts on Titan's surface, exploring how a methane-saturated layer overlying a methane-clathrate layer affects crater formation. Our numerical results show that impacts form smaller craters in a methane-clathrate basement than a water-ice basement, due to the differences in strength. We find that the addition of a methane-saturated layer atop this basement reduces crater depths and influences crater morphology. The morphology of impact craters formed in a thin methane-saturated layer are similar to those in a "dry" target, but a thick saturated layer produces an impact structure with little to no topography. A thick methane-saturated layer (thicker than 40% of the impactor diameter) could explain the dearth of craters in the low-elevation regions on Titan.

Submitted 24 January, 2022; originally announced January 2022.

Comments: 33 pages, 12 figures, accepted for publication in PSJ

arXiv:2112.15442 [pdf, other]

Mythological Medical Machine Learning: Boosting the Performance of a Deep Learning Medical Data Classifier Using Realistic Physiological Models

Authors: Ismail Sadiq, Erick A. Perez-Alday, Amit J. Shah, Ali Bahrami Rad, Reza Sameni, Gari D. Clifford

Abstract: Objective: To determine if a realistic, but computationally efficient model of the electrocardiogram can be used to pre-train a deep neural network (DNN) with a wide range of morphologies and abnormalities specific to a given condition - T-wave Alternans (TWA) as a result of Post-Traumatic Stress Disorder, or PTSD - and significantly boost performance on a small database of rare individuals. Approach: Using a previously validated artificial ECG model, we generated 180,000 artificial ECGs with or without significant TWA, with varying heart rate, breathing rate, TWA amplitude, and ECG morphology. A DNN, trained on over 70,000 patients to classify 25 different rhythms, was modified the output layer to a binary class (TWA or no-TWA, or equivalently, PTSD or no-PTSD), and transfer learning was performed on the artificial ECG. In a final transfer learning step, the DNN was trained and cross-validated on ECG from 12 PTSD and 24 controls for all combinations of using the three databases. Main results: The best performing approach (AUROC = 0.77, Accuracy = 0.72, F1-score = 0.64) was found by performing both transfer learning steps, using the pre-trained arrhythmia DNN, the artificial data and the real PTSD-related ECG data. Removing the artificial data from training led to the largest drop in performance. Removing the arrhythmia data from training provided a modest, but significant, drop in performance. The final model showed no significant drop in performance on the artificial data, indicating no overfitting.
Significance: In healthcare, it is common to only have a small collection of high-quality data and labels, or a larger database with much lower quality (and less relevant) labels. The paradigm presented here, involving model-based performance boosting, provides a solution through transfer learning on a large realistic artificial database, and a partially relevant real database.

Submitted 28 December, 2021; originally announced December 2021.

Comments: Presented at the University of Chicago Data Science Institute Dec 6th 2021. See: https://www.youtube.com/watch?v=B36CGi8ODCw and https://datascience.uchicago.edu/events/dss-gari-clifford/

MSC Class: 92C30; 92C32; 03H10; 62H30; 68Q07; 8T07; 78-10; 92-10; 62R07; 68T09; 68T10 ACM Class: I.5.1; I.5.2; I.5.4; I.6.3; I.2.1; J.3

arXiv:2112.07462 [pdf, other]

On the equivalence of two theories of real cyclotomic spectra

Authors: J. D. Quigley, Jay Shah

Abstract: We give a new formula for real topological cyclic homology that refines the fiber sequence formula discovered by Nikolaus and Scholze for topological cyclic homology to one involving genuine $C_2$-spectra. To accomplish this, we give a new definition of the $\infty$-category of real cyclotomic spectra that replaces the usage of genuinely equivariant dihedral spectra with the parametrized Tate construction. We then define an $\infty$-categorical version of Høgenhaven's O(2)-orthogonal cyclotomic spectra, construct a forgetful functor relating the two theories, and show that this functor restricts to an equivalence between full subcategories of appropriately bounded-below objects. As an application, we compute the real topological cyclic homology of perfect $\mathbb{F}_p$-algebras for all primes $p$.

Submitted 6 January, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

Comments: Major revision and expansion of sections 6-7 of arXiv:1909.03920. 81 pages. v2: minor edits

MSC Class: 19D55; 55P42; 55P43; 55P91; 16E40; 13D03

arXiv:2112.03858 [pdf, other]

Reducing Target Group Bias in Hate Speech Detectors

Authors: Darsh J Shah, Sinong Wang, Han Fang, Hao Ma, Luke Zettlemoyer

Abstract: The ubiquity of offensive and hateful content on online fora necessitates the need for automatic solutions that detect such content competently across target groups. In this paper we show that text classification models trained on large publicly available datasets despite having a high overall performance, may significantly under-perform on several protected groups. On the \citet{vidgen2020learning} dataset, we find the accuracy to be 37% lower on an under annotated Black Women target group and 12% lower on Immigrants, where hate speech involves a distinct style. To address this, we propose to perform token-level hate sense disambiguation, and utilize tokens' hate sense representations for detection, modeling more general signals. On two publicly available datasets, we observe that the variance in model accuracy across target groups drops by at least 30%, improving the average target group performance by 4% and worst case performance by 13%.

Submitted 7 December, 2021; originally announced December 2021.

arXiv:2111.10471 [pdf]

SNPs Filtered by Allele Frequency Improve the Prediction of Hypertension Subtypes

Authors: Yiming Li, Sanjiv J. Shah, Donna Arnett, Ryan Irvin, Yuan Luo

Abstract: Hypertension is the leading global cause of cardiovascular disease and premature death. Distinct hypertension subtypes may vary in their prognoses and require different treatments. An individual's risk for hypertension is determined by genetic and environmental factors as well as their interactions. In this work, we studied 911 African Americans and 1,171 European Americans in the Hypertension Genetic Epidemiology Network (HyperGEN) cohort. We built hypertension subtype classification models using both environmental variables and sets of genetic features selected based on different criteria. The fitted prediction models provided insights into the genetic landscape of hypertension subtypes, which may aid personalized diagnosis and treatment of hypertension in the future.

Submitted 19 November, 2021; originally announced November 2021.

Comments: Submitted to the 12th International Workshop on Biomedical and Health Informatics (BHI 2021)

arXiv:2111.09940 [pdf, other]

Chiral Phase Change Nanomaterials

Authors: Joshua A. Burrow, Md Shah Alam, Evan M. Smith, Riad Yahiaoui, Ryan Laing, Piyush J. Shah, Thomas A. Searles, Shivashankar Vangala, Joshua R. Hendrickson, Andrew Sarangan, Imad Agha

Abstract: Chiral nanostructures offer the ability to respond to the vector nature of a light beam at the nanoscale. While naturally chiral materials offer a path towards scalability, engineered structures offer a path to wavelength tunability through geometric manipulation. Neither approach, however, allows for temporal control of chirality. Therefore, in the best of all worlds, it is crucial to realize chiral materials that possess the quality of scalability, tailored wavelength response, and dynamic control at high speeds. Here, a new class of intrinsically chiral phase change nanomaterials (PCNMs) is proposed and explored, based on a scalable bottom-up fabrication technique with a high degree of control in three dimensions. Angular resolved Mueller Matrix and spectroscopic ellipsometry are performed to characterize the optical birefringence and dichroism, and a numerical model is provided to explain the origin of optical activity. This work achieves the critical goal of demonstrating high-speed dynamic switching of chirality over 50,000 cycles via the underlying PCNM.

Submitted 18 November, 2021; originally announced November 2021.

Comments: 21 pages, 10 page supplement, 16 figures

arXiv:2110.15750 [pdf]

Process Design and Economics of Production of p-Aminophenol

Authors: Chinmay Ghoroi, Jay Shah, Devanshu Thakar, Sakshi Baheti

Abstract: Para-Aminophenol is one of the key chemicals required for the synthesis of Paracetamol, an analgesic and antipyretic drug. Data shows a large fraction of India's demand for Para-Aminophenol being met through imports from China. The uncertainty in the India-China relations would affect the supply and price of this "Key Starting Material." This report is a detailed business plan for setting up a plant and producing Para-Aminophenol in India at a competitive price. The plant is simulated in AspenPlus V8 and different Material Balances and Energy Balances calculations are carried out. The plant produces 22.7 kmols Para-Aminophenol per hour with a purity of 99.9%. Along with the simulation, economic analysis is carried out for this plant to determine the financial parameters like Payback Period and Return on Investment.

Submitted 29 October, 2021; originally announced October 2021.

Comments: 23 pages, 5 figures

arXiv:2110.09584 [pdf, other]

Set-based State Estimation with Probabilistic Consistency Guarantee under Epistemic Uncertainty

Authors: Shen Li, Theodoros Stouraitis, Michael Gienger, Sethu Vijayakumar, Julie A. Shah

Abstract: Consistent state estimation is challenging, especially under the epistemic uncertainties arising from learned (nonlinear) dynamic and observation models. In this work, we propose a set-based estimation algorithm, named Gaussian Process-Zonotopic Kalman Filter (GP-ZKF), that produces zonotopic state estimates while respecting both the epistemic uncertainties in the learned models and aleatoric uncertainties. Our method guarantees probabilistic consistency, in the sense that the true states are bounded by sets (zonotopes) across all time steps, with high probability. We formally relate GP-ZKF with the corresponding stochastic approach, GP-EKF, in the case of learned (nonlinear) models. In particular, when linearization errors and aleatoric uncertainties are omitted and epistemic uncertainties are simplified, GP-ZKF reduces to GP-EKF. We empirically demonstrate our method's efficacy in both a simulated pendulum domain and a real-world robot-assisted dressing domain, where GP-ZKF produced more consistent and less conservative set-based estimates than all baseline stochastic methods.

Submitted 25 February, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

Comments: Published at IEEE Robotics and Automation Letters, 2022. Video: https://www.youtube.com/watch?v=CvIPJlALaFU Copyright: 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any media, including reprinting/republishing for any purposes, creating new works, for resale or redistribution, or reuse of any copyrighted component of this work

Showing 1–50 of 150 results for author: Shah, J