-
On in-silico estimation of left ventricular end-diastolic pressure from cardiac strains
Authors:
Emilio A. Mendiola,
Raza Rana Mehdi,
Dipan J. Shah,
Reza Avazmohammadi
Abstract:
Left ventricular diastolic dysfunction (LVDD) is a group of diseases that adversely affect the passive phase of the cardiac cycle and can lead to heart failure. While left ventricular end-diastolic pressure (LVEDP) is a valuable prognostic measure in LVDD patients, traditional invasive methods of measuring LVEDP present risks and limitations, highlighting the need for alternative approaches. This paper investigates the possibility of measuring LVEDP non-invasively using inverse in-silico modeling. We propose the adoption of patient-specific cardiac modeling and simulation to estimate LVEDP and myocardial stiffness from cardiac strains. We have developed a high-fidelity patient-specific computational model of the left ventricle. Through an inverse modeling approach, myocardial stiffness and LVEDP were accurately estimated from cardiac strains that can be acquired from in vivo imaging, indicating the feasibility of computational modeling to augment current approaches in the measurement of ventricular pressure. Integration of such computational platforms into clinical practice holds promise for early detection and comprehensive assessment of LVDD with reduced risk for patients.
Submitted 28 May, 2024;
originally announced May 2024.
-
SketchQL Demonstration: Zero-shot Video Moment Querying with Sketches
Authors:
Renzhi Wu,
Pramod Chunduri,
Dristi J Shah,
Ashmitha Julius Aravind,
Ali Payani,
Xu Chu,
Joy Arulraj,
Kexin Rong
Abstract:
In this paper, we will present SketchQL, a video database management system (VDBMS) for retrieving video moments with a sketch-based query interface. This novel interface allows users to specify object trajectory events with simple mouse drag-and-drop operations. Users can use trajectories of single objects as building blocks to compose complex events. Using a pre-trained model that encodes trajectory similarity, SketchQL achieves zero-shot video moments retrieval by performing similarity searches over the video to identify clips that are the most similar to the visual query. In this demonstration, we introduce the graphic user interface of SketchQL and detail its functionalities and interaction mechanisms. We also demonstrate the end-to-end usage of SketchQL from query composition to video moments retrieval using real-world scenarios.
Submitted 30 June, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
AnoFPDM: Anomaly Segmentation with Forward Process of Diffusion Models for Brain MRI
Authors:
Yiming Che,
Fazle Rafsani,
Jay Shah,
Md Mahfuzur Rahman Siddiquee,
Teresa Wu
Abstract:
Weakly-supervised diffusion models (DMs) in anomaly segmentation, leveraging image-level labels, have attracted significant attention for their superior performance compared to unsupervised methods. It eliminates the need for pixel-level labels in training, offering a more cost-effective alternative to supervised methods. However, existing methods are not fully weakly-supervised because they heavily rely on costly pixel-level labels for hyperparameter tuning in inference. To tackle this challenge, we introduce Anomaly Segmentation with Forward Process of Diffusion Models (AnoFPDM), a fully weakly-supervised framework that operates without the need of pixel-level labels. Leveraging the unguided forward process as a reference for the guided forward process, we select hyperparameters such as the noise scale, the threshold for segmentation and the guidance strength. We aggregate anomaly maps from guided forward process, enhancing the signal strength of anomalous regions. Remarkably, our proposed method outperforms recent state-of-the-art weakly-supervised approaches, even without utilizing pixel-level labels.
Submitted 29 June, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
Grounding Language Plans in Demonstrations Through Counterfactual Perturbations
Authors:
Yanwei Wang,
Tsun-Hsuan Wang,
Jiayuan Mao,
Michael Hagenow,
Julie Shah
Abstract:
Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI. Whereas prior works have focused on leveraging LLMs directly for planning in symbolic spaces, this work uses LLMs to guide the search of task structures and constraints implicit in multi-step demonstrations. Specifically, we borrow from manipulation planning literature the concept of mode families, which group robot configurations by specific motion constraints, to serve as an abstraction layer between the high-level language representations of an LLM and the low-level physical trajectories of a robot. By replaying a few human demonstrations with synthetic perturbations, we generate coverage over the demonstrations' state space with additional successful executions as well as counterfactuals that fail the task. Our explanation-based learning framework trains an end-to-end differentiable neural network to predict successful trajectories from failures and as a by-product learns classifiers that ground low-level states and images in mode families without dense labeling. The learned grounding classifiers can further be used to translate language plans into reactive policies in the physical domain in an interpretable manner. We show our approach improves the interpretability and reactivity of imitation learning through 2D navigation and simulated and real robot manipulation tasks. Website: https://yanweiw.github.io/glide
Submitted 29 April, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning
Authors:
Shivam Ratnakant Mhaskar,
Nirmesh J. Shah,
Mohammadi Zaki,
Ashishkumar P. Gudmalwar,
Pankaj Wasnik,
Rajiv Ratn Shah
Abstract:
The traditional Automatic Video Dubbing (AVD) pipeline consists of three key modules, namely, Automatic Speech Recognition (ASR), Neural Machine Translation (NMT), and Text-to-Speech (TTS). Within AVD pipelines, isometric-NMT algorithms are employed to regulate the length of the synthesized output text. This is done to guarantee synchronization with respect to the alignment of video and audio subsequent to the dubbing process. Previous approaches have focused on aligning the number of characters and words in the source and target language texts of Machine Translation models. However, our approach aims to align the number of phonemes instead, as they are closely associated with speech duration. In this paper, we present the development of an isometric NMT system using Reinforcement Learning (RL), with a focus on optimizing the alignment of phoneme counts in the source and target language sentence pairs. To evaluate our models, we propose the Phoneme Count Compliance (PCC) score, which is a measure of length compliance. Our approach demonstrates a substantial improvement of approximately 36% in the PCC score compared to the state-of-the-art models when applied to English-Hindi language pairs. Moreover, we propose a student-teacher architecture within the framework of our RL approach to maintain a trade-off between the phoneme count and translation quality.
Submitted 20 March, 2024;
originally announced March 2024.
-
Ordinal Classification with Distance Regularization for Robust Brain Age Prediction
Authors:
Jay Shah,
Md Mahfuzur Rahman Siddiquee,
Yi Su,
Teresa Wu,
Baoxin Li
Abstract:
Age is one of the major known risk factors for Alzheimer's Disease (AD). Detecting AD early is crucial for effective treatment and preventing irreversible brain damage. Brain age, a measure derived from brain imaging reflecting structural changes due to aging, may have the potential to identify AD onset, assess disease risk, and plan targeted interventions. Deep learning-based regression techniques to predict brain age from magnetic resonance imaging (MRI) scans have shown great accuracy recently. However, these methods are subject to an inherent regression to the mean effect, which causes a systematic bias resulting in an overestimation of brain age in young subjects and underestimation in old subjects. This weakens the reliability of predicted brain age as a valid biomarker for downstream clinical applications. Here, we reformulate the brain age prediction task from regression to classification to address the issue of systematic bias. Recognizing the importance of preserving ordinal information from ages to understand aging trajectory and monitor aging longitudinally, we propose a novel ORdinal Distance Encoded Regularization (ORDER) loss that incorporates the order of age labels, enhancing the model's ability to capture age-related patterns. Extensive experiments and ablation studies demonstrate that this framework reduces systematic bias, outperforms state-of-art methods by statistically significant margins, and can better capture subtle differences between clinical groups in an independent AD dataset. Our implementation is publicly available at https://github.com/jaygshah/Robust-Brain-Age-Prediction.
Submitted 6 May, 2024; v1 submitted 25 October, 2023;
originally announced March 2024.
-
Object Permanence Filter for Robust Tracking with Interactive Robots
Authors:
Shaoting Peng,
Margaret X. Wang,
Julie A. Shah,
Nadia Figueroa
Abstract:
Object permanence, which refers to the concept that objects continue to exist even when they are no longer perceivable through the senses, is a crucial aspect of human cognitive development. In this work, we seek to incorporate this understanding into interactive robots by proposing a set of assumptions and rules to represent object permanence in multi-object, multi-agent interactive scenarios. We integrate these rules into the particle filter, resulting in the Object Permanence Filter (OPF). For multi-object scenarios, we propose an ensemble of K interconnected OPFs, where each filter predicts plausible object tracks that are resilient to missing, noisy, and kinematically or dynamically infeasible measurements, thus bringing perceptional robustness. Through several interactive scenarios, we demonstrate that the proposed OPF approach provides robust tracking in human-robot interactive tasks agnostic to measurement type, even in the presence of prolonged and complete occlusion. Webpage: https://opfilter.github.io/.
Submitted 13 March, 2024;
originally announced March 2024.
-
Learning with Language-Guided State Abstractions
Authors:
Andi Peng,
Ilia Sucholutsky,
Belinda Z. Li,
Theodore R. Sumers,
Thomas L. Griffiths,
Jacob Andreas,
Julie A. Shah
Abstract:
We describe a framework for using natural language to design state abstractions for imitation learning. Generalizable policy learning in high-dimensional observation spaces is facilitated by well-designed state representations, which can surface important features of an environment and hide irrelevant ones. These state representations are typically manually specified, or derived from other labor-intensive labeling procedures. Our method, LGA (language-guided abstraction), uses a combination of natural language supervision and background knowledge from language models (LMs) to automatically build state representations tailored to unseen tasks. In LGA, a user first provides a (possibly incomplete) description of a target task in natural language; next, a pre-trained LM translates this task description into a state abstraction function that masks out irrelevant features; finally, an imitation policy is trained using a small number of demonstrations and LGA-generated abstract states. Experiments on simulated robotic tasks show that LGA yields state abstractions similar to those designed by humans, but in a fraction of the time, and that these abstractions improve generalization and robustness in the presence of spurious correlations and ambiguous specifications. We illustrate the utility of the learned abstractions on mobile manipulation tasks with a Spot robot.
Submitted 6 March, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Understanding Entrainment in Human Groups: Optimising Human-Robot Collaboration from Lessons Learned during Human-Human Collaboration
Authors:
Eike Schneiders,
Christopher Fourie,
Stanley Celestin,
Julie Shah,
Malte Jung
Abstract:
Successful entrainment during collaboration positively affects trust, willingness to collaborate, and likeability towards collaborators. In this paper, we present a mixed-method study to investigate characteristics of successful entrainment leading to pair and group-based synchronisation. Drawing inspiration from industrial settings, we designed a fast-paced, short-cycle repetitive task. Using motion tracking, we investigated entrainment in both dyadic and triadic task completion. Furthermore, we utilise audio-video recordings and semi-structured interviews to contextualise participants' experiences. This paper contributes to the Human-Computer/Robot Interaction (HCI/HRI) literature using a human-centred approach to identify characteristics of entrainment during pair- and group-based collaboration. We present five characteristics related to successful entrainment. These are related to the occurrence of entrainment, leader-follower patterns, interpersonal communication, the importance of the point-of-assembly, and the value of acoustic feedback. Finally, we present three design considerations for future research and design on collaboration with robots.
Submitted 23 February, 2024;
originally announced February 2024.
-
Emergence and dynamics of delusions and hallucinations across stages in early psychosis
Authors:
Catalina Mourgues-Codern,
David Benrimoh,
Jay Gandhi,
Emily A. Farina,
Raina Vin,
Tihare Zamorano,
Deven Parekh,
Ashok Malla,
Ridha Joober,
Martin Lepage,
Srividya N. Iyer,
Jean Addington,
Carrie E. Bearden,
Kristin S. Cadenhead,
Barbara Cornblatt,
Matcheri Keshavan,
William S. Stone,
Daniel H. Mathalon,
Diana O. Perkins,
Elaine F. Walker,
Tyrone D. Cannon,
Scott W. Woods,
Jai L. Shah,
Albert R. Powers
Abstract:
Hallucinations and delusions are often grouped together within the positive symptoms of psychosis. However, recent evidence suggests they may be driven by distinct computational and neural mechanisms. Examining the time course of their emergence may provide insights into the relationship between these underlying mechanisms. Participants from the second (N = 719) and third (N = 699) iterations of the North American Prodrome Longitudinal Study (NAPLS 2 and 3) were assessed for timing of CHR-P-level delusion and hallucination onset. Pre-onset symptom patterns in first-episode psychosis patients (FEP) from the Prevention and Early Intervention Program for Psychosis (PEPP-Montreal; N = 694) were also assessed. Symptom onset was determined at baseline assessment and the evolution of symptom patterns examined over 24 months. In all three samples, participants were more likely to report the onset of delusion-spectrum symptoms prior to hallucination-spectrum symptoms (odds ratios (OR): NAPLS 2 = 4.09; NAPLS 3 = 4.14; PEPP, Z = 7.01, P < 0.001) and to present with only delusions compared to only hallucinations (OR: NAPLS 2 = 5.6; NAPLS 3 = 11.11; PEPP = 42.75). Re-emergence of delusions after remission was also more common than re-emergence of hallucinations (Ps < 0.05), and hallucinations more often resolved first (Ps < 0.001). In both CHR-P samples, ratings of delusional ideation fell with the onset of hallucinations (P = 0.007). Delusions tend to emerge before hallucinations and may play a role in their development. Further work should examine the relationship between the mechanisms driving these symptoms and its utility for diagnosis and treatment.
Submitted 20 February, 2024;
originally announced February 2024.
-
Achieving Low Latency at Low Outage: Multilevel Coding for mmWave Channels
Authors:
Mine Gokce Dogan,
Jaimin Shah,
Martina Cardone,
Christina Fragouli,
Wei Mao,
Hosein Nikopour,
Rath Vannithamby
Abstract:
Millimeter-wave (mmWave) spectrum is expected to support data-intensive applications that require ultra-reliable low-latency communications (URLLC). However, mmWave links are highly sensitive to blockage, which may lead to disruptions in the communication. Traditional techniques that build resilience against such blockages (among which are interleaving and feedback mechanisms) incur delays that are too large to effectively support URLLC. This calls for novel techniques that ensure resilient URLLC. In this paper, we propose to deploy multilevel codes over space and over time. These codes offer several benefits: they allow control over what information is received, and they provide different reliability guarantees for different information streams based on their priority. We also show that deploying these codes leads to attractive trade-offs between rate, delay, and outage probability. A practically relevant aspect of the proposed technique is that it offers resilience while incurring a low operational complexity.
Submitted 10 February, 2024;
originally announced February 2024.
-
Preference-Conditioned Language-Guided Abstraction
Authors:
Andi Peng,
Andreea Bobu,
Belinda Z. Li,
Theodore R. Sumers,
Ilia Sucholutsky,
Nishanth Kumar,
Thomas L. Griffiths,
Julie A. Shah
Abstract:
Learning from demonstrations is a common way for users to teach robots, but it is prone to spurious feature correlations. Recent work constructs state abstractions, i.e. visual representations containing task-relevant features, from language as a way to perform more generalizable learning. However, these abstractions also depend on a user's preference for what matters in a task, which may be hard to describe or infeasible to exhaustively specify using language alone. How do we construct abstractions to capture these latent preferences? We observe that how humans behave reveals how they see the world. Our key insight is that changes in human behavior inform us that there are differences in preferences for how humans see the world, i.e. their state abstractions. In this work, we propose using language models (LMs) to query for those preferences directly given knowledge that a change in behavior has occurred. In our framework, we use the LM in two ways: first, given a text description of the task and knowledge of behavioral change between states, we query the LM for possible hidden preferences; second, given the most likely preference, we query the LM to construct the state abstraction. In this framework, the LM is also able to ask the human directly when uncertain about its own estimate. We demonstrate our framework's ability to construct effective preference-conditioned abstractions in simulated experiments, a user study, as well as on a real Spot robot performing mobile manipulation tasks.
Submitted 5 February, 2024;
originally announced February 2024.
-
Homogenization Effects of Large Language Models on Human Creative Ideation
Authors:
Barrett R. Anderson,
Jash Hemant Shah,
Max Kreminski
Abstract:
Large language models (LLMs) are now being used in a wide variety of contexts, including as creativity support tools (CSTs) intended to help their users come up with new ideas. But do LLMs actually support user creativity? We hypothesized that the use of an LLM as a CST might make the LLM's users feel more creative, and even broaden the range of ideas suggested by each individual user, but also homogenize the ideas suggested by different users. We conducted a 36-participant comparative user study and found, in accordance with the homogenization hypothesis, that different users tended to produce less semantically distinct ideas with ChatGPT than with an alternative CST. Additionally, ChatGPT users generated a greater number of more detailed ideas, but felt less responsible for the ideas they generated. We discuss potential implications of these findings for users, designers, and developers of LLM-based CSTs.
Submitted 10 May, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library
Authors:
Ganesh Bikshandi,
Jay Shah
Abstract:
We provide an optimized implementation of the forward pass of FlashAttention-2, a popular memory-aware scaled dot-product attention algorithm, as a custom fused CUDA kernel targeting NVIDIA Hopper architecture and written using the open-source CUTLASS library. In doing so, we explain the challenges and techniques involved in fusing online-softmax with back-to-back GEMM kernels, utilizing the Hopper-specific Tensor Memory Accelerator (TMA) and Warpgroup Matrix-Multiply-Accumulate (WGMMA) instructions, defining and transforming CUTLASS Layouts and Tensors, overlapping copy and GEMM operations, and choosing optimal tile sizes for the Q, K and V attention matrices while balancing the register pressure and shared memory utilization. In head-to-head benchmarks on a single H100 PCIe GPU for some common choices of hyperparameters, we observe 20-50% higher FLOPs/s over a version of FlashAttention-2 optimized for last-generation NVIDIA Ampere architecture.
Submitted 19 December, 2023;
originally announced December 2023.
-
Fast and Facile Synthesis Route to Epitaxial Oxide Membrane Using a Sacrificial Layer
Authors:
Shivasheesh Varshney,
Sooho Choo,
Liam Thompson,
Zhifei Yang,
Jay Shah,
Jiaxuan Wen,
Steven J. Koester,
K. Andre Mkhoyan,
Alexander McLeod,
Bharat Jalan
Abstract:
The advancement in thin-film exfoliation for synthesizing oxide membranes has opened up new possibilities for creating artificially-assembled heterostructures with structurally and chemically incompatible materials. The sacrificial layer method is a promising approach to exfoliate as-grown films from a compatible material system, allowing their integration with dissimilar materials. Nonetheless, the conventional sacrificial layers often possess intricate stoichiometry, thereby constraining their practicality and adaptability, particularly when considering techniques like Molecular Beam Epitaxy (MBE). This is where easy-to-grow binary alkaline earth metal oxides with a rock salt crystal structure are useful. These oxides, which include (Mg, Ca, Sr, Ba)O, can be used as a sacrificial layer covering a much broader range of lattice parameters compared to conventional sacrificial layers and are easily dissolvable in deionized water. In this study, we show the epitaxial growth of single-crystalline perovskite SrTiO3 (STO) on sacrificial layers consisting of crystalline SrO, BaO, and Ba1-xCaxO films, employing a hybrid MBE method. Our results highlight the rapid (< 5 minutes) dissolution of the sacrificial layer when immersed in deionized water, facilitating the fabrication of millimeter-sized STO membranes. Using high-resolution x-ray diffraction, atomic-force microscopy, scanning transmission electron microscopy, impedance spectroscopy, and scattering-type near-field optical microscopy (SNOM), we demonstrate epitaxial STO membranes with bulk-like intrinsic dielectric properties. The employment of alkaline earth metal oxides as sacrificial layers is likely to simplify membrane synthesis, particularly with MBE, thus expanding research possibilities.
Submitted 19 November, 2023;
originally announced November 2023.
-
Human-Guided Complexity-Controlled Abstractions
Authors:
Andi Peng,
Mycal Tucker,
Eoin Kenny,
Noga Zaslavsky,
Pulkit Agrawal,
Julie Shah
Abstract:
Neural networks often learn task-specific latent representations that fail to generalize to novel settings or tasks. Conversely, humans learn discrete representations (i.e., concepts or words) at a variety of abstraction levels (e.g., "bird" vs. "sparrow") and deploy the appropriate abstraction based on task. Inspired by this, we train neural models to generate a spectrum of discrete representations, and control the complexity of the representations (roughly, how many bits are allocated for encoding inputs) by tuning the entropy of the distribution over representations. In finetuning experiments, using only a small number of labeled examples for a new task, we show that (1) tuning the representation to a task-appropriate complexity level supports the highest finetuning performance, and (2) in a human-participant study, users were able to identify the appropriate complexity level for a downstream task using visualizations of discrete representations. Our results indicate a promising direction for rapid model finetuning by leveraging human insight.
Submitted 27 October, 2023; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Body-mounted MR-conditional Robot for Minimally Invasive Liver Intervention
Authors:
Zhefeng Huang,
Anthony L. Gunderman,
Samuel E. Wilcox,
Saikat Sengupta,
Jay Shah,
Aiming Lu,
David Woodrum,
Yue Chen
Abstract:
MR-guided microwave ablation (MWA) has proven effective in treating hepatocellular carcinoma (HCC) with small-sized tumors, but the state-of-the-art technique suffers from sub-optimal workflow due to speed and accuracy of needle placement. This paper presents a compact body-mounted MR-conditional robot that can operate in closed-bore MR scanners for accurate needle guidance. The robotic platform consists of two stacked Cartesian XY stages, each with two degrees of freedom, that facilitate needle guidance. The robot is actuated using 3D-printed pneumatic turbines with MR-conditional bevel gear transmission systems. Pneumatic valves and control mechatronics are located inside the MRI control room and are connected to the robot with pneumatic transmission lines and optical fibers. Free space experiments indicated robot-assisted needle insertion error of 2.6$\pm$1.3 mm at an insertion depth of 80 mm. The MR-guided phantom studies were conducted to verify the MR-conditionality and targeting performance of the robot. Future work will focus on the system optimization and validations in animal trials.
Submitted 25 March, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
An Information Bottleneck Characterization of the Understanding-Workload Tradeoff
Authors:
Lindsay Sanneman,
Mycal Tucker,
Julie Shah
Abstract:
Recent advances in artificial intelligence (AI) have underscored the need for explainable AI (XAI) to support human understanding of AI systems. Consideration of human factors that impact explanation efficacy, such as mental workload and human understanding, is central to effective XAI design. Existing work in XAI has demonstrated a tradeoff between understanding and workload induced by different types of explanations. Explaining complex concepts through abstractions (hand-crafted groupings of related problem features) has been shown to effectively address and balance this workload-understanding tradeoff. In this work, we characterize the workload-understanding balance via the Information Bottleneck method: an information-theoretic approach which automatically generates abstractions that maximize informativeness and minimize complexity. In particular, we establish empirical connections between workload and complexity and between understanding and informativeness through human-subject experiments. This empirical link between human factors and information-theoretic concepts provides an important mathematical characterization of the workload-understanding tradeoff which enables user-tailored XAI design.
Submitted 11 October, 2023;
originally announced October 2023.
-
OCU-Net: A Novel U-Net Architecture for Enhanced Oral Cancer Segmentation
Authors:
Ahmed Albishri,
Syed Jawad Hussain Shah,
Yugyung Lee,
Rong Wang
Abstract:
Accurate detection of oral cancer is crucial for improving patient outcomes. However, the field faces two key challenges: the scarcity of deep learning-based image segmentation research specifically targeting oral cancer and the lack of annotated data. Our study proposes OCU-Net, a pioneering U-Net image segmentation architecture exclusively designed to detect oral cancer in hematoxylin and eosin (H&E) stained image datasets. OCU-Net incorporates advanced deep learning modules, such as the Channel and Spatial Attention Fusion (CSAF) module, a novel and innovative feature that emphasizes important channel and spatial areas in H&E images while exploring contextual information. In addition, OCU-Net integrates other innovative components such as Squeeze-and-Excite (SE) attention module, Atrous Spatial Pyramid Pooling (ASPP) module, residual blocks, and multi-scale fusion. The incorporation of these modules showed superior performance for oral cancer segmentation for two datasets used in this research. Furthermore, we effectively utilized the efficient ImageNet pre-trained MobileNet-V2 model as a backbone of our OCU-Net to create OCU-Netm, an enhanced version achieving state-of-the-art results. Comprehensive evaluation demonstrates that OCU-Net and OCU-Netm outperformed existing segmentation methods, highlighting their precision in identifying cancer cells in H&E images from OCDC and ORCA datasets.
Submitted 3 October, 2023;
originally announced October 2023.
-
Symmetry breaking and ascending in the magnetic kagome metal FeGe
Authors:
Shangfei Wu,
Mason Klemm,
Jay Shah,
Ethan T. Ritz,
Chunruo Duan,
Xiaokun Teng,
Bin Gao,
Feng Ye,
Masaaki Matsuda,
Fankang Li,
Xianghan Xu,
Ming Yi,
Turan Birol,
Pengcheng Dai,
Girsh Blumberg
Abstract:
Spontaneous symmetry breaking, the phenomenon where an infinitesimal perturbation can cause the system to break the underlying symmetry, is a cornerstone concept in the understanding of interacting solid-state systems. In a typical series of temperature-driven phase transitions, higher temperature phases are more symmetric due to the stabilizing effect of entropy that becomes dominant as the temperature is increased. However, the opposite is rare but possible when there are multiple degrees of freedom in the system. Here, we present such an example of a symmetry-ascending phenomenon in a magnetic kagome metal FeGe by utilizing neutron Larmor diffraction and Raman spectroscopy. In the paramagnetic state at 460 K, we confirm that the crystal structure is indeed a hexagonal kagome lattice. On cooling to $T_N$, the crystal structure changes from hexagonal to monoclinic with in-plane lattice distortions on the order of $10^{-4}$ and the associated splitting of the doubly degenerate phonon mode of the pristine kagome lattice. Upon further cooling to $T_{CDW}$, the kagome lattice shows a small negative thermal expansion, and the crystal structure becomes more symmetric gradually upon further cooling. Increasing the crystalline symmetry upon cooling is unusual; it originates from an extremely weak structural instability that coexists and competes with the CDW and magnetic orders. These observations are against the expectations for a simple model with a single order parameter, and hence can only be explained by a Landau free energy expansion that takes into account multiple lattice, charge, and spin degrees of freedom. Thus, the determination of the crystalline lattice symmetry as well as the unusual spin-lattice coupling is a first step towards understanding the rich electronic and magnetic properties of the system and sheds new light on intertwined orders where the lattice degree of freedom is no longer dominant.
Submitted 8 March, 2024; v1 submitted 25 September, 2023;
originally announced September 2023.
-
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
Authors:
Scott L. Fleming,
Alejandro Lozano,
William J. Haberkorn,
Jenelle A. Jindal,
Eduardo P. Reis,
Rahul Thapa,
Louis Blankemeier,
Julian Z. Genkins,
Ethan Steinberg,
Ashwin Nayak,
Birju S. Patel,
Chia-Chun Chiang,
Alison Callahan,
Zepeng Huo,
Sergios Gatidis,
Scott J. Adams,
Oluseyi Fayanju,
Shreya J. Shah,
Thomas Savage,
Ethan Goh,
Akshay S. Chaudhari,
Nima Aghaeepour,
Christopher Sharp,
Michael A. Pfeffer,
Percy Liang
, et al. (5 additional authors not shown)
Abstract:
The ability of large language models (LLMs) to follow natural language instructions with human-level fluency suggests many opportunities in healthcare to reduce administrative burden and improve quality of care. However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging. Existing question answering datasets for electronic health record (EHR) data fail to capture the complexity of information needs and documentation burdens experienced by clinicians. To address these challenges, we introduce MedAlign, a benchmark dataset of 983 natural language instructions for EHR data. MedAlign is curated by 15 clinicians (7 specialities), includes clinician-written reference responses for 303 instructions, and provides 276 longitudinal EHRs for grounding instruction-response pairs. We used MedAlign to evaluate 6 general domain LLMs, having clinicians rank the accuracy and quality of each LLM response. We found high error rates, ranging from 35% (GPT-4) to 68% (MPT-7B-Instruct), and an 8.3% drop in accuracy moving from 32k to 2k context lengths for GPT-4. Finally, we report correlations between clinician rankings and automated natural language generation metrics as a way to rank LLMs without human review. We make MedAlign available under a research data use agreement to enable LLM evaluations on tasks aligned with clinician needs and preferences.
Submitted 24 December, 2023; v1 submitted 27 August, 2023;
originally announced August 2023.
-
Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation
Authors:
Andi Peng,
Aviv Netanyahu,
Mark Ho,
Tianmin Shu,
Andreea Bobu,
Julie Shah,
Pulkit Agrawal
Abstract:
Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments. Data augmentation can increase robustness by making the model invariant to task-irrelevant changes in the agent's observation. However, designers don't know which concepts are irrelevant a priori, especially when different end users have different preferences about how the task is performed. We propose an interactive framework to leverage feedback directly from the user to identify personalized task-irrelevant concepts. Our key idea is to generate counterfactual demonstrations that allow users to quickly identify possible task-relevant and irrelevant concepts. The knowledge of task-irrelevant concepts is then used to perform data augmentation and thus obtain a policy adapted to personalized user objectives. We present experiments validating our framework on discrete and continuous control tasks with real human users. Our method (1) enables users to better understand agent failure, (2) reduces the number of demonstrations required for fine-tuning, and (3) aligns the agent to individual user task preferences.
Submitted 13 July, 2023; v1 submitted 12 July, 2023;
originally announced July 2023.
-
Machine Vision Using Cellphone Camera: A Comparison of deep networks for classifying three challenging denominations of Indian Coins
Authors:
Keyur D. Joshi,
Dhruv Shah,
Varshil Shah,
Nilay Gandhi,
Sanket J. Shah,
Sanket B. Shah
Abstract:
Indian currency coins come in a variety of denominations. Of all the varieties, Rs.1, Rs.2, and Rs.5 have similar diameters. The majority of the coin styles in market circulation for denominations of Rs.1 and Rs.2 are nearly the same except for the numerals on their reverse side. If a coin is resting on its obverse side, the correct denomination is not distinguishable by humans. Therefore, it was hypothesized that a digital image of a coin resting on either side could be classified into its correct denomination by training a deep neural network model. The digital images were generated by using cheap cell phone cameras. To find the most suitable deep neural network architecture, four were selected for comparison based on a preliminary analysis. The results confirm that two of the four deep neural network models can classify the correct denomination from either side of a coin with an accuracy of 97%.
Submitted 12 May, 2023;
originally announced June 2023.
-
Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue
Authors:
Cristian-Paul Bara,
Ziqiao Ma,
Yingzhuo Yu,
Julie Shah,
Joyce Chai
Abstract:
Collaborative tasks often begin with partial task knowledge and incomplete initial plans from each partner. To complete these tasks, agents need to engage in situated communication with their partners and coordinate their partial plans towards a complete plan to achieve a joint task goal. While such collaboration seems effortless in a human-human team, it is highly challenging for human-AI collaboration. To address this limitation, this paper takes a step towards collaborative plan acquisition, where humans and agents strive to learn and communicate with each other to acquire a complete plan for joint tasks. Specifically, we formulate a novel problem for agents to predict the missing task knowledge for themselves and for their partners based on rich perceptual and dialogue history. We extend a situated dialogue benchmark for symmetric collaborative tasks in a 3D blocks world and investigate computational strategies for plan acquisition. Our empirical results suggest that predicting the partner's missing knowledge is a more viable approach than predicting one's own. We show that explicit modeling of the partner's dialogue moves and mental states produces improved and more stable results than without. These results provide insight for future AI agents that can predict what knowledge their partner is missing and, therefore, can proactively communicate such information to help their partner acquire such missing knowledge toward a common understanding of joint tasks.
Submitted 18 May, 2023;
originally announced May 2023.
-
Symmetry and nonlinearity of spin wave resonance excited by focused surface acoustic waves
Authors:
Piyush J. Shah,
Derek A. Bas,
Abbass Hamadeh,
Michael Wolf,
Andrew Franson,
Michael Newburger,
Philipp Pirro,
Mathias Weiler,
Michael R. Page
Abstract:
The use of a complex ferromagnetic system to manipulate GHz surface acoustic waves is a rich current topic under investigation, but the high-power nonlinear regime is under-explored. We introduce focused surface acoustic waves, which provide a way to access this regime with modest equipment. Symmetry of the magneto-acoustic interaction can be tuned by interdigitated transducer design which can introduce additional strain components. Here, we compare the impact of focused acoustic waves versus standard unidirectional acoustic waves in significantly enhancing the magnon-phonon coupling behavior. Analytical simulation results based on modified Landau-Lifshitz-Gilbert theory show good agreement with experimental findings. We also report nonlinear input power dependence of the transmission through the device. This experimental observation is supported by the micromagnetic simulation using mumax3 to model the nonlinear dependence. These results pave the way for extending the understanding and design of acoustic wave devices for exploration of acoustically driven spin wave resonance physics.
Submitted 10 May, 2023;
originally announced May 2023.
-
Modeling the formation of Selk impact crater on Titan: Implications for Dragonfly
Authors:
Shigeru Wakita,
Brandon C. Johnson,
Jason M. Soderblom,
Jahnavi Shah,
Catherine D. Neish,
Jordan K. Steckloff
Abstract:
Selk crater is a $\sim$80 km diameter impact crater on the Saturnian icy satellite, Titan. Melt pools associated with impact craters like Selk provide environments where liquid water and organics can mix and produce biomolecules like amino acids. It is partly for this reason that the Selk region has been selected as the area that NASA's Dragonfly mission will explore and address one of its primary goals: to search for biological signatures on Titan. Here we simulate Selk-sized impact craters on Titan to better understand the formation of Selk and its melt pool. We consider several structures for the icy target material by changing the thickness of the methane clathrate layer, which has a substantial effect on the target thermal structure and crater formation. Our numerical results show that a 4 km diameter impactor produces a Selk-sized crater when 5-15 km thick methane clathrate layers are considered. We confirm the production of melt pools in these cases and find that the melt volumes are similar regardless of methane clathrate layer thickness. The distribution of the melted material, however, is sensitive to the thickness of the methane clathrate layer. The melt pool appears as a torus-like shape with a depth of a few km in the case of a 10-15 km thick methane clathrate layer, and as a shallower layer in the case of a 5 km thick clathrate layer. Melt pools of this thickness may take tens of thousands of years to freeze, allowing more time for complex organics to form.
Submitted 22 February, 2023;
originally announced February 2023.
-
Brainomaly: Unsupervised Neurologic Disease Detection Utilizing Unannotated T1-weighted Brain MR Images
Authors:
Md Mahfuzur Rahman Siddiquee,
Jay Shah,
Teresa Wu,
Catherine Chong,
Todd J. Schwedt,
Gina Dumkrieger,
Simona Nikolova,
Baoxin Li
Abstract:
Harnessing the power of deep neural networks in the medical imaging domain is challenging due to the difficulties in acquiring large annotated datasets, especially for rare diseases, which involve high costs, time, and effort for annotation. Unsupervised disease detection methods, such as anomaly detection, can significantly reduce human effort in these scenarios. While anomaly detection typically focuses on learning from images of healthy subjects only, real-world situations often present unannotated datasets with a mixture of healthy and diseased subjects. Recent studies have demonstrated that utilizing such unannotated images can improve unsupervised disease and anomaly detection. However, these methods do not utilize knowledge specific to registered neuroimages, resulting in a subpar performance in neurologic disease detection. To address this limitation, we propose Brainomaly, a GAN-based image-to-image translation method specifically designed for neurologic disease detection. Brainomaly not only offers tailored image-to-image translation suitable for neuroimages but also leverages unannotated mixed images to achieve superior neurologic disease detection. Additionally, we address the issue of model selection for inference without annotated samples by proposing a pseudo-AUC metric, further enhancing Brainomaly's detection performance. Extensive experiments and ablation studies demonstrate that Brainomaly outperforms existing state-of-the-art unsupervised disease and anomaly detection methods by significant margins in Alzheimer's disease detection using a publicly available dataset and headache detection using an institutional dataset. The code is available from https://github.com/mahfuzmohammad/Brainomaly.
Submitted 16 August, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
Aligning Robot and Human Representations
Authors:
Andreea Bobu,
Andi Peng,
Pulkit Agrawal,
Julie Shah,
Anca D. Dragan
Abstract:
To act in the world, robots rely on a representation of salient task aspects: for example, to carry a coffee mug, a robot may consider movement efficiency or mug orientation in its behavior. However, if we want robots to act for and with people, their representations must not be just functional but also reflective of what humans care about, i.e. they must be aligned. We observe that current learning approaches suffer from representation misalignment, where the robot's learned representation does not capture the human's representation. We suggest that because humans are the ultimate evaluator of robot performance, we must explicitly focus our efforts on aligning learned representations with humans, in addition to learning the downstream task. We advocate that current representation learning approaches in robotics should be studied from the perspective of how well they accomplish the objective of representation alignment. We mathematically define the problem, identify its key desiderata, and situate current methods within this formalism. We conclude by suggesting future directions for exploring open challenges.
Submitted 28 January, 2024; v1 submitted 3 February, 2023;
originally announced February 2023.
-
Quantum spin ice in three-dimensional Rydberg atom arrays
Authors:
Jeet Shah,
Gautam Nambiar,
Alexey V. Gorshkov,
Victor Galitski
Abstract:
Quantum spin liquids are exotic phases of matter whose low-energy physics is described as the deconfined phase of an emergent gauge theory. With recent theory proposals and an experiment showing preliminary signs of $\mathbb{Z}_2$ topological order [G. Semeghini et al., Science 374, 1242 (2021)], Rydberg atom arrays have emerged as a promising platform to realize a quantum spin liquid. In this work, we propose a way to realize a $U(1)$ quantum spin liquid in three spatial dimensions, described by the deconfined phase of $U(1)$ gauge theory in a pyrochlore lattice Rydberg atom array. We study the ground state phase diagram of the proposed Rydberg system as a function of experimentally relevant parameters. Within our calculation, we find that by tuning the Rabi frequency, one can access both the confinement-deconfinement transition driven by a proliferation of "magnetic" monopoles and the Higgs transition driven by a proliferation of "electric" charges of the emergent gauge theory. We suggest experimental probes for distinguishing the deconfined phase from ordered phases. This work serves as a proposal to access a confinement-deconfinement transition in three spatial dimensions on a Rydberg-based quantum simulator.
Submitted 14 June, 2024; v1 submitted 11 January, 2023;
originally announced January 2023.
-
Artificial Intelligence and Life in 2030: The One Hundred Year Study on Artificial Intelligence
Authors:
Peter Stone,
Rodney Brooks,
Erik Brynjolfsson,
Ryan Calo,
Oren Etzioni,
Greg Hager,
Julia Hirschberg,
Shivaram Kalyanakrishnan,
Ece Kamar,
Sarit Kraus,
Kevin Leyton-Brown,
David Parkes,
William Press,
AnnaLee Saxenian,
Julie Shah,
Milind Tambe,
Astro Teller
Abstract:
In September 2016, Stanford's "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the first report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society. It was written by a panel of 17 study authors, each of whom is deeply rooted in AI research, chaired by Peter Stone of the University of Texas at Austin. The report, entitled "Artificial Intelligence and Life in 2030," examines eight domains of typical urban settings on which AI is likely to have impact over the coming years: transportation, home and service robots, healthcare, education, public safety and security, low-resource communities, employment and workplace, and entertainment. It aims to provide the general public with a scientifically and technologically accurate portrayal of the current state of AI and its potential and to help guide decisions in industry and governments, as well as to inform research and development in the field. The charge for this report was given to the panel by the AI100 Standing Committee, chaired by Barbara Grosz of Harvard University.
Submitted 31 October, 2022;
originally announced November 2022.
-
Generalized Product-of-Experts for Learning Multimodal Representations in Noisy Environments
Authors:
Abhinav Joshi,
Naman Gupta,
Jinang Shah,
Binod Bhattarai,
Ashutosh Modi,
Danail Stoyanov
Abstract:
A real-world application or setting involves interaction between different modalities (e.g., video, speech, text). In order to process the multimodal information automatically and use it for an end application, Multimodal Representation Learning (MRL) has emerged as an active area of research in recent times. MRL involves learning reliable and robust representations of information from heterogeneous sources and fusing them. However, in practice, the data acquired from different sources are typically noisy. In some extreme cases, a noise of large magnitude can completely alter the semantics of the data leading to inconsistencies in the parallel multimodal data. In this paper, we propose a novel method for multimodal representation learning in a noisy environment via the generalized product of experts technique. In the proposed method, we train a separate network for each modality to assess the credibility of information coming from that modality, and subsequently, the contribution from each modality is dynamically varied while estimating the joint distribution. We evaluate our method on two challenging benchmarks from two diverse domains: multimodal 3D hand-pose estimation and multimodal surgical video segmentation. We attain state-of-the-art performance on both benchmarks. Our extensive quantitative and qualitative evaluations show the advantages of our method compared to previous approaches.
Submitted 7 November, 2022;
originally announced November 2022.
-
Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report
Authors:
Michael L. Littman,
Ifeoma Ajunwa,
Guy Berger,
Craig Boutilier,
Morgan Currie,
Finale Doshi-Velez,
Gillian Hadfield,
Michael C. Horowitz,
Charles Isbell,
Hiroaki Kitano,
Karen Levy,
Terah Lyons,
Melanie Mitchell,
Julie Shah,
Steven Sloman,
Shannon Vallor,
Toby Walsh
Abstract:
In September 2021, the "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the second report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society. It was written by a panel of 17 study authors, each of whom is deeply rooted in AI research, chaired by Michael Littman of Brown University. The report, entitled "Gathering Strength, Gathering Storms," answers a set of 14 questions probing critical areas of AI development addressing the major risks and dangers of AI, its effects on society, its public perception and the future of the field. The report concludes that AI has made a major leap from the lab to people's lives in recent years, which increases the urgency to understand its potential negative effects. The questions were developed by the AI100 Standing Committee, chaired by Peter Stone of the University of Texas at Austin, consisting of a group of AI leaders with expertise in computer science, sociology, ethics, economics, and other disciplines.
Submitted 27 October, 2022;
originally announced October 2022.
-
HealthyGAN: Learning from Unannotated Medical Images to Detect Anomalies Associated with Human Disease
Authors:
Md Mahfuzur Rahman Siddiquee,
Jay Shah,
Teresa Wu,
Catherine Chong,
Todd Schwedt,
Baoxin Li
Abstract:
Automated anomaly detection from medical images, such as MRIs and X-rays, can significantly reduce human effort in disease diagnosis. Owing to the complexity of modeling anomalies and the high cost of manual annotation by domain experts (e.g., radiologists), a typical technique in the current medical imaging literature has focused on deriving diagnostic models from healthy subjects only, assuming the model will detect the images from patients as outliers. However, in many real-world scenarios, unannotated datasets with a mix of both healthy and diseased individuals are abundant. Therefore, this paper poses the research question of how to improve unsupervised anomaly detection by utilizing (1) an unannotated set of mixed images, in addition to (2) the set of healthy images as used in the literature. To answer the question, we propose HealthyGAN, a novel one-directional image-to-image translation method, which learns to translate the images from the mixed dataset to only healthy images. Being one-directional, HealthyGAN relaxes the requirement of cycle consistency of existing unpaired image-to-image translation methods, which is unattainable with mixed unannotated data. Once the translation is learned, we generate a difference map for any given image by subtracting its translated output. Regions of significant responses in the difference map correspond to potential anomalies (if any). Our HealthyGAN outperforms the conventional state-of-the-art methods by significant margins on two publicly available datasets: COVID-19 and NIH ChestX-ray14, and one institutional dataset collected from Mayo Clinic. The implementation is publicly available at https://github.com/mahfuzmohammad/HealthyGAN.
Submitted 5 September, 2022;
originally announced September 2022.
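The difference-map step mentioned in the HealthyGAN abstract above can be sketched as follows. This is a hedged illustration, not the released implementation: the translated "healthy" image is assumed to be already available as an array, taking the absolute value of the difference and applying a fixed threshold are assumptions made here, and the names `difference_map` and `anomaly_mask` are hypothetical.

```python
import numpy as np

def difference_map(image, translated):
    """Per-pixel absolute difference between an input image and its
    translated (assumed healthy) counterpart."""
    return np.abs(image.astype(float) - translated.astype(float))

def anomaly_mask(diff, threshold):
    """Flag pixels whose response exceeds a chosen threshold (an assumption;
    the abstract only speaks of 'regions of significant responses')."""
    return diff > threshold

# Toy example: a 4x4 image with one bright pixel that the (pretend)
# translation removed.
image = np.zeros((4, 4))
image[1, 2] = 1.0
translated = np.zeros((4, 4))

diff = difference_map(image, translated)
mask = anomaly_mask(diff, threshold=0.5)
```

In this toy case only the pixel at row 1, column 2 is flagged, since it is the only location where the input and its translation differ.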
-
Towards Human-Agent Communication via the Information Bottleneck Principle
Authors:
Mycal Tucker,
Julie Shah,
Roger Levy,
Noga Zaslavsky
Abstract:
Emergent communication research often focuses on optimizing task-specific utility as a driver for communication. However, human languages appear to evolve under pressure to efficiently compress meanings into communication signals by optimizing the Information Bottleneck tradeoff between informativeness and complexity. In this work, we study how trading off these three factors -- utility, informativeness, and complexity -- shapes emergent communication, including how it compares to human communication. To this end, we propose Vector-Quantized Variational Information Bottleneck (VQ-VIB), a method for training neural agents to compress inputs into discrete signals embedded in a continuous space. We train agents via VQ-VIB and compare their performance to previously proposed neural architectures in grounded environments and in a Lewis reference game. Across all neural architectures and settings, taking into account communicative informativeness benefits communication convergence rates, and penalizing communicative complexity leads to human-like lexicon sizes while maintaining high utility. Additionally, we find that VQ-VIB outperforms other discrete communication methods. This work demonstrates how fundamental principles that are believed to characterize human language evolution may inform emergent communication in artificial agents.
Submitted 30 June, 2022;
originally announced July 2022.
-
Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations
Authors:
Yanwei Wang,
Nadia Figueroa,
Shen Li,
Ankit Shah,
Julie Shah
Abstract:
Learning from demonstration (LfD) has succeeded in tasks featuring a long time horizon. However, when the problem complexity also includes human-in-the-loop perturbations, state-of-the-art approaches do not guarantee the successful reproduction of a task. In this work, we identify the roots of this challenge as the failure of a learned continuous policy to satisfy the discrete plan implicit in the demonstration. By utilizing modes (rather than subgoals) as the discrete abstraction and motion policies with both mode invariance and goal reachability properties, we prove our learned continuous policy can simulate any discrete plan specified by a linear temporal logic (LTL) formula. Consequently, an imitator is robust to both task- and motion-level perturbations and guaranteed to achieve task success. Project page: https://yanweiw.github.io/tli/
Submitted 14 December, 2022; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Prototype Based Classification from Hierarchy to Fairness
Authors:
Mycal Tucker,
Julie Shah
Abstract:
Artificial neural nets can represent and classify many types of data but are often tailored to particular applications -- e.g., for "fair" or "hierarchical" classification. Once an architecture has been selected, it is often difficult for humans to adjust models for a new task; for example, a hierarchical classifier cannot be easily transformed into a fair classifier that shields a protected field. Our contribution in this work is a new neural network architecture, the concept subspace network (CSN), which generalizes existing specialized classifiers to produce a unified model capable of learning a spectrum of multi-concept relationships. We demonstrate that CSNs reproduce state-of-the-art results in fair classification when enforcing concept independence, may be transformed into hierarchical classifiers, or even reconcile fairness and hierarchy within a single classifier. The CSN is inspired by existing prototype-based classifiers that promote interpretability.
Submitted 27 May, 2022;
originally announced May 2022.
-
The Solvability of Interpretability Evaluation Metrics
Authors:
Yilun Zhou,
Julie Shah
Abstract:
Feature attribution methods are popular for explaining neural network predictions, and they are often evaluated on metrics such as comprehensiveness and sufficiency. In this paper, we highlight an intriguing property of these metrics: their solvability. Concretely, we can define the problem of optimizing an explanation for a metric, which can be solved by beam search. This observation leads to the obvious yet unaddressed question: why do we use explainers (e.g., LIME) not based on solving the target metric, if the metric value represents explanation quality? We present a series of investigations showing strong performance of this beam search explainer and discuss its broader implication: a definition-evaluation duality of interpretability concepts. We implement the explainer and release the Python solvex package for models of text, image and tabular domains.
Submitted 2 February, 2023; v1 submitted 17 May, 2022;
originally announced May 2022.
-
ExSum: From Local Explanations to Model Understanding
Authors:
Yilun Zhou,
Marco Tulio Ribeiro,
Julie Shah
Abstract:
Interpretability methods are developed to understand the working mechanisms of black-box models, which is crucial to their responsible deployment. Fulfilling this goal requires both that the explanations generated by these methods are correct and that people can easily and reliably understand them. While the former has been addressed in prior work, the latter is often overlooked, resulting in informal model understanding derived from a handful of local explanations. In this paper, we introduce explanation summary (ExSum), a mathematical framework for quantifying model understanding, and propose metrics for its quality assessment. On two domains, ExSum highlights various limitations in the current practice, helps develop accurate model understanding, and reveals easily overlooked properties of the model. We also connect understandability to other properties of explanations such as human alignment, robustness, and counterfactual minimality and plausibility.
Submitted 29 April, 2022;
originally announced May 2022.
-
When Does Syntax Mediate Neural Language Model Performance? Evidence from Dropout Probes
Authors:
Mycal Tucker,
Tiwalayo Eisape,
Peng Qian,
Roger Levy,
Julie Shah
Abstract:
Recent causal probing literature reveals when language models and syntactic probes use similar representations. Such techniques may yield "false negative" causality results: models may use representations of syntax, but probes may have learned to use redundant encodings of the same syntactic information. We demonstrate that models do encode syntactic information redundantly and introduce a new probe design that guides probes to consider all syntactic information present in embeddings. Using these probes, we find evidence for the use of syntax in models where prior methods did not, allowing us to boost model performance by injecting syntactic information into representations.
Submitted 20 April, 2022;
originally announced April 2022.
-
Parametrized and equivariant higher algebra
Authors:
Denis Nardin,
Jay Shah
Abstract:
We develop the rudiments of a theory of parametrized $\infty$-operads, including parametrized generalizations of monoidal envelopes, Day convolution, operadic left Kan extensions, results on limits and colimits of algebras, and the symmetric monoidal Yoneda embedding.
Submitted 28 February, 2022;
originally announced March 2022.
-
A Method for Waste Segregation using Convolutional Neural Networks
Authors:
Jash Shah,
Sagar Kamat
Abstract:
Segregation of garbage is a primary concern in many nations across the world. Even though we are in the modern era, many people still do not know how to distinguish between organic and recyclable waste. It is because of this that the world is facing a major crisis of waste disposal. In this paper, we try to use deep learning algorithms to help solve this problem of waste classification. The waste is classified into two categories: organic and recyclable. Our proposed model achieves an accuracy of 94.9%. Although the other two models also show promising results, the proposed model stands out with the greatest accuracy. With the help of deep learning, one of the greatest obstacles to efficient waste management can finally be removed.
Submitted 23 February, 2022;
originally announced February 2022.
-
Probe-Based Interventions for Modifying Agent Behavior
Authors:
Mycal Tucker,
William Kuhl,
Khizer Shahid,
Seth Karten,
Katia Sycara,
Julie Shah
Abstract:
Neural nets are powerful function approximators, but the behavior of a given neural net, once trained, cannot be easily modified. We wish, however, for people to be able to influence neural agents' actions despite the agents never training with humans, which we formalize as a human-assisted decision-making problem. Inspired by prior art initially developed for model explainability, we develop a method for updating representations in pre-trained neural nets according to externally-specified properties. In experiments, we show how our method may be used to improve human-agent team performance for a variety of neural networks from image classifiers to agents in multi-agent reinforcement learning settings.
Submitted 26 January, 2022;
originally announced January 2022.
-
Methane-saturated layers limit the observability of impact craters on Titan
Authors:
Shigeru Wakita,
Brandon C. Johnson,
Jason M. Soderblom,
Jahnavi Shah,
Catherine D. Neish
Abstract:
As the only icy satellite with a thick atmosphere and liquids on its surface, Titan represents a unique end-member to study the impact cratering process. Unlike craters on other Saturnian satellites, Titan's craters are preferentially located in high-elevation regions near the equator. This led to the hypothesis that the presence of liquid methane in Titan's lowlands affects crater morphology, making them difficult to identify. This is because surfaces covered by weak fluid-saturated sediment limit the topographic expression of impact craters, as sediment moves into the crater cavity shortly after formation. Here we simulate crater-forming impacts on Titan's surface, exploring how a methane-saturated layer overlying a methane-clathrate layer affects crater formation. Our numerical results show that impacts form smaller craters in a methane-clathrate basement than a water-ice basement, due to the differences in strength. We find that the addition of a methane-saturated layer atop this basement reduces crater depths and influences crater morphology. The morphology of impact craters formed in a thin methane-saturated layer is similar to that in a "dry" target, but a thick saturated layer produces an impact structure with little to no topography. A thick methane-saturated layer (thicker than 40% of the impactor diameter) could explain the dearth of craters in the low-elevation regions on Titan.
Submitted 24 January, 2022;
originally announced January 2022.
-
Mythological Medical Machine Learning: Boosting the Performance of a Deep Learning Medical Data Classifier Using Realistic Physiological Models
Authors:
Ismail Sadiq,
Erick A. Perez-Alday,
Amit J. Shah,
Ali Bahrami Rad,
Reza Sameni,
Gari D. Clifford
Abstract:
Objective: To determine if a realistic, but computationally efficient model of the electrocardiogram can be used to pre-train a deep neural network (DNN) with a wide range of morphologies and abnormalities specific to a given condition - T-wave Alternans (TWA) as a result of Post-Traumatic Stress Disorder, or PTSD - and significantly boost performance on a small database of rare individuals.
Approach: Using a previously validated artificial ECG model, we generated 180,000 artificial ECGs with or without significant TWA, with varying heart rate, breathing rate, TWA amplitude, and ECG morphology. A DNN, trained on over 70,000 patients to classify 25 different rhythms, had its output layer modified to a binary class (TWA or no-TWA, or equivalently, PTSD or no-PTSD), and transfer learning was performed on the artificial ECGs. In a final transfer learning step, the DNN was trained and cross-validated on ECGs from 12 PTSD subjects and 24 controls for all combinations of using the three databases.
Main results: The best performing approach (AUROC = 0.77, Accuracy = 0.72, F1-score = 0.64) was found by performing both transfer learning steps, using the pre-trained arrhythmia DNN, the artificial data and the real PTSD-related ECG data. Removing the artificial data from training led to the largest drop in performance. Removing the arrhythmia data from training provided a modest, but significant, drop in performance. The final model showed no significant drop in performance on the artificial data, indicating no overfitting.
Significance: In healthcare, it is common to only have a small collection of high-quality data and labels, or a larger database with much lower quality (and less relevant) labels. The paradigm presented here, involving model-based performance boosting, provides a solution through transfer learning on a large realistic artificial database, and a partially relevant real database.
Submitted 28 December, 2021;
originally announced December 2021.
-
On the equivalence of two theories of real cyclotomic spectra
Authors:
J. D. Quigley,
Jay Shah
Abstract:
We give a new formula for real topological cyclic homology that refines the fiber sequence formula discovered by Nikolaus and Scholze for topological cyclic homology to one involving genuine $C_2$-spectra. To accomplish this, we give a new definition of the $\infty$-category of real cyclotomic spectra that replaces the usage of genuinely equivariant dihedral spectra with the parametrized Tate construction. We then define an $\infty$-categorical version of Høgenhaven's O(2)-orthogonal cyclotomic spectra, construct a forgetful functor relating the two theories, and show that this functor restricts to an equivalence between full subcategories of appropriately bounded-below objects. As an application, we compute the real topological cyclic homology of perfect $\mathbb{F}_p$-algebras for all primes $p$.
Submitted 6 January, 2022; v1 submitted 14 December, 2021;
originally announced December 2021.
-
Reducing Target Group Bias in Hate Speech Detectors
Authors:
Darsh J Shah,
Sinong Wang,
Han Fang,
Hao Ma,
Luke Zettlemoyer
Abstract:
The ubiquity of offensive and hateful content on online fora necessitates automatic solutions that detect such content competently across target groups. In this paper we show that text classification models trained on large publicly available datasets, despite having a high overall performance, may significantly under-perform on several protected groups. On the \citet{vidgen2020learning} dataset, we find the accuracy to be 37% lower on an under-annotated Black Women target group and 12% lower on Immigrants, where hate speech involves a distinct style. To address this, we propose to perform token-level hate sense disambiguation, and utilize tokens' hate sense representations for detection, modeling more general signals. On two publicly available datasets, we observe that the variance in model accuracy across target groups drops by at least 30%, improving the average target group performance by 4% and worst case performance by 13%.
Submitted 7 December, 2021;
originally announced December 2021.
-
SNPs Filtered by Allele Frequency Improve the Prediction of Hypertension Subtypes
Authors:
Yiming Li,
Sanjiv J. Shah,
Donna Arnett,
Ryan Irvin,
Yuan Luo
Abstract:
Hypertension is the leading global cause of cardiovascular disease and premature death. Distinct hypertension subtypes may vary in their prognoses and require different treatments. An individual's risk for hypertension is determined by genetic and environmental factors as well as their interactions. In this work, we studied 911 African Americans and 1,171 European Americans in the Hypertension Genetic Epidemiology Network (HyperGEN) cohort. We built hypertension subtype classification models using both environmental variables and sets of genetic features selected based on different criteria. The fitted prediction models provided insights into the genetic landscape of hypertension subtypes, which may aid personalized diagnosis and treatment of hypertension in the future.
Submitted 19 November, 2021;
originally announced November 2021.
-
Chiral Phase Change Nanomaterials
Authors:
Joshua A. Burrow,
Md Shah Alam,
Evan M. Smith,
Riad Yahiaoui,
Ryan Laing,
Piyush J. Shah,
Thomas A. Searles,
Shivashankar Vangala,
Joshua R. Hendrickson,
Andrew Sarangan,
Imad Agha
Abstract:
Chiral nanostructures offer the ability to respond to the vector nature of a light beam at the nanoscale. While naturally chiral materials offer a path towards scalability, engineered structures offer a path to wavelength tunability through geometric manipulation. Neither approach, however, allows for temporal control of chirality. Therefore, in the best of all worlds, it is crucial to realize chiral materials that possess the qualities of scalability, tailored wavelength response, and dynamic control at high speeds. Here, a new class of intrinsically chiral phase change nanomaterials (PCNMs) is proposed and explored, based on a scalable bottom-up fabrication technique with a high degree of control in three dimensions. Angle-resolved Mueller matrix and spectroscopic ellipsometry are performed to characterize the optical birefringence and dichroism, and a numerical model is provided to explain the origin of optical activity. This work achieves the critical goal of demonstrating high-speed dynamic switching of chirality over 50,000 cycles via the underlying PCNM.
Submitted 18 November, 2021;
originally announced November 2021.
-
Process Design and Economics of Production of p-Aminophenol
Authors:
Chinmay Ghoroi,
Jay Shah,
Devanshu Thakar,
Sakshi Baheti
Abstract:
Para-Aminophenol is one of the key chemicals required for the synthesis of Paracetamol, an analgesic and antipyretic drug. Data show that a large fraction of India's demand for Para-Aminophenol is met through imports from China. The uncertainty in India-China relations would affect the supply and price of this "Key Starting Material." This report is a detailed business plan for setting up a plant and producing Para-Aminophenol in India at a competitive price. The plant is simulated in AspenPlus V8, and material and energy balance calculations are carried out. The plant produces 22.7 kmol of Para-Aminophenol per hour with a purity of 99.9%. Along with the simulation, economic analysis is carried out for this plant to determine financial parameters such as Payback Period and Return on Investment.
Submitted 29 October, 2021;
originally announced October 2021.
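To put the production figure in the abstract above into mass terms, the molar flow can be converted with the molar mass of p-aminophenol. This is a small worked calculation, not a figure from the report: the molar mass of about 109.13 kg/kmol (formula C6H7NO) is supplied here as an outside assumption, and the variable names are hypothetical.

```python
# Molar flow quoted in the abstract (kmol of p-aminophenol per hour).
molar_flow_kmol_per_h = 22.7

# Molar mass of p-aminophenol, C6H7NO (assumed value, not from the report).
molar_mass_kg_per_kmol = 109.13

# Mass flow = molar flow * molar mass.
mass_flow_kg_per_h = molar_flow_kmol_per_h * molar_mass_kg_per_kmol

print(f"{mass_flow_kg_per_h:.1f} kg/h")
```

Under that assumption the plant output corresponds to roughly 2.5 tonnes of product per hour; any annual figure would additionally depend on operating hours, which the abstract does not state.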
-
Set-based State Estimation with Probabilistic Consistency Guarantee under Epistemic Uncertainty
Authors:
Shen Li,
Theodoros Stouraitis,
Michael Gienger,
Sethu Vijayakumar,
Julie A. Shah
Abstract:
Consistent state estimation is challenging, especially under the epistemic uncertainties arising from learned (nonlinear) dynamic and observation models. In this work, we propose a set-based estimation algorithm, named Gaussian Process-Zonotopic Kalman Filter (GP-ZKF), that produces zonotopic state estimates while respecting both the epistemic uncertainties in the learned models and aleatoric uncertainties. Our method guarantees probabilistic consistency, in the sense that the true states are bounded by sets (zonotopes) across all time steps, with high probability. We formally relate GP-ZKF with the corresponding stochastic approach, GP-EKF, in the case of learned (nonlinear) models. In particular, when linearization errors and aleatoric uncertainties are omitted and epistemic uncertainties are simplified, GP-ZKF reduces to GP-EKF. We empirically demonstrate our method's efficacy in both a simulated pendulum domain and a real-world robot-assisted dressing domain, where GP-ZKF produced more consistent and less conservative set-based estimates than all baseline stochastic methods.
Submitted 25 February, 2022; v1 submitted 18 October, 2021;
originally announced October 2021.