-
Drone-Based Antenna Beam Calibration in the High Arctic
Authors:
Lawrence Herman,
Christopher Barbarie,
Mohan Agrawal,
Vlad Calinescu,
Simon Chen,
H. Cynthia Chiang,
Cherie K. Day,
Eamon Egan,
Stephen Fay,
Kit Gerodias,
Maya Goss,
Michael Hétu,
Daniel C. Jacobs,
Marc-Olivier R. Lalonde,
Francis McGee,
Loïc Miara,
John Orlowski-Scherer,
Jonathan Sievers
Abstract:
The development of low-frequency radio astronomy experiments for detecting 21-cm line emission from hydrogen presents new opportunities for creative solutions to the challenge of characterizing an antenna beam pattern. The Array of Long Baseline Antennas for Taking Radio Observations from the Seventy-ninth parallel (ALBATROS) is a new radio interferometer sited in the Canadian high Arctic that aims to map Galactic foregrounds at frequencies below $\sim$30 MHz. We present PteroSoar, a custom-built hexacopter outfitted with a transmitter, that will be used to characterize the beam patterns of ALBATROS and other experiments. The PteroSoar drone hardware is motivated by the need for user-servicing at remote sites and environmental factors that are unique to the high Arctic. In particular, magnetic heading is unreliable because the magnetic field lines near the north pole are almost vertical. We therefore implement moving baseline real time kinematic (RTK) positioning with two GPS units to obtain heading solutions with $\sim$1$^\circ$ accuracy. We present a preliminary beam map of an ALBATROS antenna, thus demonstrating successful PteroSoar operation in the high Arctic.
Submitted 30 June, 2024;
originally announced July 2024.
-
The influence of solvent on surface adsorption and desorption
Authors:
Ardavan Farahvash,
Mayank Agrawal,
Adam P. Willard,
Andrew A. Peterson
Abstract:
The adsorption and desorption of reactants and products from a solid surface is essential for achieving sustained surface chemical reactions. At a liquid-solid interface, these processes can involve the collective reorganization of interfacial solvent molecules in order to accommodate the adsorbing or desorbing species. Identifying the role of solvent in adsorption and desorption is important for advancing our understanding of surface chemical rates and mechanisms and for enabling the rational design and optimization of surface chemical systems. In this manuscript we use all-atom molecular dynamics simulation and transition path sampling to identify water's role in the desorption of CO from a Pt(100) surface in contact with liquid water. We demonstrate that the solvation of CO, as quantified by the water coordination number, is an essential component of the desorption reaction coordinate. We use metadynamics to compute the desorption free energy surface and conclude based on its features that desorption proceeds via a two-step mechanism whereby the final detachment of CO from the surface is preceded by the formation of a nascent solvation shell.
Submitted 28 May, 2024;
originally announced May 2024.
-
Unsupervised Threat Hunting using Continuous Bag-of-Terms-and-Time (CBoTT)
Authors:
Varol Kayhan,
Shivendu Shivendu,
Rouzbeh Behnia,
Clinton Daniel,
Manish Agrawal
Abstract:
Threat hunting is sifting through system logs to detect malicious activities that might have bypassed existing security measures. It can be performed in several ways, one of which is based on detecting anomalies. We propose an unsupervised framework, called continuous bag-of-terms-and-time (CBoTT), and publish its application programming interface (API) to help researchers and cybersecurity analysts perform anomaly-based threat hunting among SIEM logs geared toward process auditing on endpoint devices. Analyses show that our framework consistently outperforms benchmark approaches. When logs are sorted by likelihood of being an anomaly (from most likely to least), our approach identifies anomalies at higher percentiles (between 1.82-6.46) while benchmark approaches identify the same anomalies at lower percentiles (between 3.25-80.92). This framework can be used by other researchers to conduct benchmark analyses and cybersecurity analysts to find anomalies in SIEM logs.
Submitted 15 March, 2024;
originally announced March 2024.
-
Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium
Authors:
Hyewon Jeong,
Sarah Jabbour,
Yuzhe Yang,
Rahul Thapta,
Hussein Mozannar,
William Jongwon Han,
Nikita Mehandru,
Michael Wornow,
Vladislav Lialin,
Xin Liu,
Alejandro Lozano,
Jiacheng Zhu,
Rafal Dariusz Kocielnik,
Keith Harrigian,
Haoran Zhang,
Edward Lee,
Milos Vukadinovic,
Aparna Balagopalan,
Vincent Jeanselme,
Katherine Matton,
Ilker Demirel,
Jason Fries,
Parisa Rashidi,
Brett Beaulieu-Jones,
Xuhai Orson Xu
, et al. (18 additional authors not shown)
Abstract:
The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. Encouraged by the successful virtual roundtables in the previous year, we organized eleven in-person roundtables and four virtual roundtables at ML4H 2022. The organization of the research roundtables at the conference involved 17 Senior Chairs and 19 Junior Chairs across 11 tables. Each roundtable session included invited senior chairs (with substantial experience in the field), junior chairs (responsible for facilitating the discussion), and attendees from diverse backgrounds with interest in the session's topic. Herein we detail the organization process and compile takeaways from these roundtable discussions, including recent advances, applications, and open challenges for each topic. We conclude with a summary and lessons learned across all roundtables. This document serves as a comprehensive review paper, summarizing the recent advancements in machine learning for healthcare as contributed by foremost researchers in the field.
Submitted 5 April, 2024; v1 submitted 3 March, 2024;
originally announced March 2024.
-
A novel data generation scheme for surrogate modelling with deep operator networks
Authors:
Shivam Choubey,
Birupaksha Pal,
Manish Agrawal
Abstract:
Operator-based neural network architectures such as DeepONets have emerged as a promising tool for the surrogate modeling of physical systems. In general, towards operator surrogate modeling, the training data is generated by solving the PDEs using techniques such as the Finite Element Method (FEM). The computationally intensive nature of data generation is one of the biggest bottlenecks in deploying these surrogate models for practical applications. In this study, we propose a novel methodology to alleviate the computational burden associated with training data generation for DeepONets. Unlike existing literature, the proposed framework for data generation does not use any partial differential equation integration strategy, thereby significantly reducing the computational cost associated with generating the training dataset for DeepONet. In the proposed strategy, first, the output field is generated randomly, satisfying the boundary conditions using Gaussian Process Regression (GPR). From the output field, the input source field can be calculated easily using finite difference techniques. The proposed methodology can be extended to other operator learning methods, making the approach widely applicable. To validate the proposed approach, we employ the heat equations as the model problem and develop the surrogate model for numerous boundary value problems.
Submitted 24 February, 2024;
originally announced February 2024.
-
A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models
Authors:
Stefan Hegselmann,
Shannon Zejiang Shen,
Florian Gierse,
Monica Agrawal,
David Sontag,
Xiaoyi Jiang
Abstract:
Patients often face difficulties in understanding their hospitalizations, while healthcare workers have limited resources to provide explanations. In this work, we investigate the potential of large language models to generate patient summaries based on doctors' notes and study the effect of training data on the faithfulness and quality of the generated summaries. To this end, we release (i) a rigorous labeling protocol for errors in medical texts and (ii) a publicly available dataset of annotated hallucinations in 100 doctor-written and 100 generated summaries. We show that fine-tuning on hallucination-free data effectively reduces hallucinations from 2.60 to 1.55 per summary for Llama 2, while preserving relevant information. We observe a similar effect on GPT-4 (0.70 to 0.40), when the few-shot examples are hallucination-free. We also conduct a qualitative evaluation using hallucination-free and improved training data. We find that common quantitative metrics do not correlate well with faithfulness and quality. Finally, we test GPT-4 for automatic hallucination detection, which clearly outperforms common baselines.
Submitted 25 June, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study
Authors:
Niklas Mannhardt,
Elizabeth Bondi-Kelly,
Barbara Lam,
Chloe O'Connell,
Mercy Asiedu,
Hussein Mozannar,
Monica Agrawal,
Alejandro Buendia,
Tatiana Urman,
Irbaz B. Riaz,
Catherine E. Ricciardi,
Marzyeh Ghassemi,
David Sontag
Abstract:
Patients derive numerous benefits from reading their clinical notes, including an increased sense of control over their health and improved understanding of their care plan. However, complex medical concepts and jargon within clinical notes hinder patient comprehension and may lead to anxiety. We developed a patient-facing tool to make clinical notes more readable, leveraging large language models (LLMs) to simplify, extract information from, and add context to notes. We prompt engineered GPT-4 to perform these augmentation tasks on real clinical notes donated by breast cancer survivors and synthetic notes generated by a clinician, a total of 12 notes with 3868 words. In June 2023, 200 female-identifying US-based participants were randomly assigned three clinical notes with varying levels of augmentations using our tool. Participants answered questions about each note, evaluating their understanding of follow-up actions and self-reported confidence. We found that augmentations were associated with a significant increase in action understanding score (0.63 $\pm$ 0.04 for select augmentations, compared to 0.54 $\pm$ 0.02 for the control) with p=0.002. In-depth interviews of self-identifying breast cancer patients (N=7) were also conducted via video conferencing. Augmentations, especially definitions, elicited positive responses among the seven participants, with some concerns about relying on LLMs. Augmentations were evaluated for errors by clinicians, and we found misleading errors occur, with errors more common in real donated notes than synthetic notes, illustrating the importance of carefully written clinical notes. Augmentations improve some but not all readability metrics. This work demonstrates the potential of LLMs to improve patients' experience with clinical notes at a lower burden to clinicians. However, having a human in the loop is important to correct potential model errors.
Submitted 17 January, 2024;
originally announced January 2024.
-
Use large language models to promote equity
Authors:
Emma Pierson,
Divya Shanmugam,
Rajiv Movva,
Jon Kleinberg,
Monica Agrawal,
Mark Dredze,
Kadija Ferryman,
Judy Wawira Gichoya,
Dan Jurafsky,
Pang Wei Koh,
Karen Levy,
Sendhil Mullainathan,
Ziad Obermeyer,
Harini Suresh,
Keyon Vafa
Abstract:
Advances in large language models (LLMs) have driven an explosion of interest about their societal impacts. Much of the discourse around how they will impact social equity has been cautionary or negative, focusing on questions like "how might LLMs be biased and how would we mitigate those biases?" This is a vital discussion: the ways in which AI generally, and LLMs specifically, can entrench biases have been well-documented. But equally vital, and much less discussed, is the more opportunity-focused counterpoint: "what promising applications do LLMs enable that could promote equity?" If LLMs are to enable a more equitable world, it is not enough just to play defense against their biases and failure modes. We must also go on offense, applying them positively to equity-enhancing use cases to increase opportunities for underserved groups and reduce societal discrimination. There are many choices which determine the impact of AI, and a fundamental choice very early in the pipeline is the problems we choose to apply it to. If we focus only later in the pipeline -- making LLMs marginally more fair as they facilitate use cases which intrinsically entrench power -- we will miss an important opportunity to guide them to equitable impacts. Here, we highlight the emerging potential of LLMs to promote equity by presenting four newly possible, promising research directions, while keeping risks and cautionary points in clear view.
Submitted 22 December, 2023;
originally announced December 2023.
-
Conceptualizing Machine Learning for Dynamic Information Retrieval of Electronic Health Record Notes
Authors:
Sharon Jiang,
Shannon Shen,
Monica Agrawal,
Barbara Lam,
Nicholas Kurtzman,
Steven Horng,
David Karger,
David Sontag
Abstract:
The large amount of time clinicians spend sifting through patient notes and documenting in electronic health records (EHRs) is a leading cause of clinician burnout. By proactively and dynamically retrieving relevant notes during the documentation process, we can reduce the effort required to find relevant patient history. In this work, we conceptualize the use of EHR audit logs for machine learning as a source of supervision of note relevance in a specific clinical context, at a particular point in time. Our evaluation focuses on the dynamic retrieval in the emergency department, a high acuity setting with unique patterns of information retrieval and note writing. We show that our methods can achieve an AUC of 0.963 for predicting which notes will be read in an individual note writing session. We additionally conduct a user study with several clinicians and find that our framework can help clinicians retrieve relevant information more efficiently. Demonstrating that our framework and methods can perform well in this demanding setting is a promising proof of concept that they will translate to other clinical settings and data modalities (e.g., labs, medications, imaging).
Submitted 9 August, 2023;
originally announced August 2023.
-
Pareto-Secure Machine Learning (PSML): Fingerprinting and Securing Inference Serving Systems
Authors:
Debopam Sanyal,
Jui-Tse Hung,
Manav Agrawal,
Prahlad Jasti,
Shahab Nikkhoo,
Somesh Jha,
Tianhao Wang,
Sibin Mohan,
Alexey Tumanov
Abstract:
Model-serving systems have become increasingly popular, especially in real-time web applications. In such systems, users send queries to the server and specify the desired performance metrics (e.g., desired accuracy, latency). The server maintains a set of models (model zoo) in the back-end and serves the queries based on the specified metrics. This paper examines the security, specifically robustness against model extraction attacks, of such systems. Existing black-box attacks assume a single model can be repeatedly selected for serving inference requests. Modern inference serving systems break this assumption. Thus, they cannot be directly applied to extract a victim model, as models are hidden behind a layer of abstraction exposed by the serving system. An attacker can no longer identify which model she is interacting with. To this end, we first propose a query-efficient fingerprinting algorithm to enable the attacker to trigger any desired model consistently. We show that by using our fingerprinting algorithm, model extraction can have fidelity and accuracy scores within $1\%$ of the scores obtained when attacking a single, explicitly specified model, as well as up to $14.6\%$ gain in accuracy and up to $7.7\%$ gain in fidelity compared to the naive attack. Second, we counter the proposed attack with a noise-based defense mechanism that thwarts fingerprinting by adding noise to the specified performance metrics. The proposed defense strategy reduces the attack's accuracy and fidelity by up to $9.8\%$ and $4.8\%$, respectively (on medium-sized model extraction). Third, we show that the proposed defense induces a fundamental trade-off between the level of protection and system goodput, achieving configurable and significant victim model extraction protection while maintaining acceptable goodput ($>80\%$). We implement the proposed defense in a real system with plans to open source.
Submitted 6 August, 2023; v1 submitted 3 July, 2023;
originally announced July 2023.
-
Single device offset-free magnetic field sensing principle with tunable sensitivity and linear range based on spin-orbit-torques
Authors:
Sabri Koraltan,
Christin Schmitt,
Florian Bruckner,
Claas Abert,
Klemens Prügl,
Michael Kirsch,
Rahul Gupta,
Sebastian Zeilinger,
Joshua M. Salazar-Mejía,
Milan Agrawal,
Johannes Güttinger,
Armin Satz,
Gerhard Jakob,
Mathias Kläui,
Dieter Suess
Abstract:
We propose a novel device concept using spin-orbit-torques to realize a magnetic field sensor, where we eliminate the sensor offset using a differential measurement concept. We derive a simple analytical formulation for the sensor signal and demonstrate its validity with numerical investigations using macrospin simulations. The sensitivity and the measurable linear sensing range in the proposed concept can be tuned by either varying the effective magnetic anisotropy or by varying the magnitude of the injected currents. We show that undesired perturbation fields normal to the sensitive direction preserve the zero-offset property and only slightly modulate the sensitivity of the proposed sensor. Higher-harmonics voltage analysis on a Hall cross experimentally confirms the linearity and tunability via current strength. Additionally, the sensor exhibits a non-vanishing offset in the experiment which we attribute to the anomalous Nernst effect.
Submitted 23 March, 2023;
originally announced March 2023.
-
Machine Learning for Health symposium 2022 -- Extended Abstract track
Authors:
Antonio Parziale,
Monica Agrawal,
Shalmali Joshi,
Irene Y. Chen,
Shengpu Tang,
Luis Oala,
Adarsh Subbaswamy
Abstract:
A collection of the extended abstracts that were presented at the 2nd Machine Learning for Health symposium (ML4H 2022), which was held both virtually and in person on November 28, 2022, in New Orleans, Louisiana, USA. Machine Learning for Health (ML4H) is a longstanding venue for research into machine learning for health, including both theoretical works and applied works. ML4H 2022 featured two submission tracks: a proceedings track, which encompassed full-length submissions of technically mature and rigorous work, and an extended abstract track, which would accept less mature, but innovative research for discussion. All the manuscripts submitted to ML4H Symposium underwent a double-blind peer-review process. Extended abstracts included in this collection describe innovative machine learning research focused on relevant problems in health and biomedicine.
Submitted 28 November, 2022;
originally announced November 2022.
-
AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders
Authors:
Wele Gedara Chaminda Bandara,
Naman Patel,
Ali Gholami,
Mehdi Nikkhah,
Motilal Agrawal,
Vishal M. Patel
Abstract:
Masked Autoencoders (MAEs) learn generalizable representations for image, text, audio, video, etc., by reconstructing masked input data from tokens of the visible data. Current MAE approaches for videos rely on random patch, tube, or frame-based masking strategies to select these tokens. This paper proposes AdaMAE, an adaptive masking strategy for MAEs that is end-to-end trainable. Our adaptive masking strategy samples visible tokens based on the semantic context using an auxiliary sampling network. This network estimates a categorical distribution over spacetime-patch tokens. The tokens that increase the expected reconstruction error are rewarded and selected as visible tokens, motivated by the policy gradient algorithm in reinforcement learning. We show that AdaMAE samples more tokens from the high spatiotemporal information regions, thereby allowing us to mask 95% of tokens, resulting in lower memory requirements and faster pre-training. We conduct ablation studies on the Something-Something v2 (SSv2) dataset to demonstrate the efficacy of our adaptive sampling approach and report state-of-the-art results of 70.0% and 81.7% in top-1 accuracy on SSv2 and Kinetics-400 action classification datasets with a ViT-Base backbone and 800 pre-training epochs.
Submitted 16 November, 2022;
originally announced November 2022.
-
Determination of photo-nuclear cross section of $^{61}$Ni($γ$,xp) reaction via surrogate ratio technique
Authors:
Shaima Akbar,
M. M Musthafa,
Midhun C. V,
Antony Joseph,
S. V. Suryanarayana,
A. Pal,
S. Santra,
P. C. Rout,
Jyoti Pandey,
Bhawna Pandey,
H. M. Agrawal,
K. C Jagadeesan,
S. Ganesan
Abstract:
The photo-nuclear reaction cross section of the $^{61}$Ni($γ$,xp) reaction has been measured by employing the surrogate reaction technique. This indirect method is used for the first time to obtain the cross section of a photo-nuclear reaction. The compound nucleus $^{61}$Ni$^{*}$ was populated using the transfer reaction $^{59}$Co($^{6}$Li,$α$) at E$_{lab}=$ 40.5 MeV. To calculate the surrogate ratio, $^{60}$Ni($γ$,xp) was selected as the reference reaction and the corresponding compound nucleus $^{60}$Ni$^{*}$ was populated using the transfer reaction $^{56}$Fe($^{6}$Li,d) at E$_{lab}=$ 35.9 MeV. The experimental cross section data of the reference reaction have been taken from the EXFOR data libraries. Compound nuclear cross section calculations have been done using the EMPIRE 3.2.3 code.
Submitted 14 November, 2022; v1 submitted 4 November, 2022;
originally announced November 2022.
-
TabLLM: Few-shot Classification of Tabular Data with Large Language Models
Authors:
Stefan Hegselmann,
Alejandro Buendia,
Hunter Lang,
Monica Agrawal,
Xiaoyi Jiang,
David Sontag
Abstract:
We study the application of large language models to zero-shot and few-shot classification of tabular data. We prompt the large language model with a serialization of the tabular data to a natural-language string, together with a short description of the classification problem. In the few-shot setting, we fine-tune the large language model using some labeled examples. We evaluate several serialization methods including templates, table-to-text models, and large language models. Despite its simplicity, we find that this technique outperforms prior deep-learning-based tabular classification methods on several benchmark datasets. In most cases, even zero-shot classification obtains non-trivial performance, illustrating the method's ability to exploit prior knowledge encoded in large language models. Unlike many deep learning methods for tabular datasets, this approach is also competitive with strong traditional baselines like gradient-boosted trees, especially in the very-few-shot setting.
Submitted 17 March, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
-
Discovery of double BSS sequences in the old Galactic open cluster Berkeley 17
Authors:
Khushboo K Rao,
Souradeep Bhattacharya,
Kaushar Vaidya,
Manan Agrawal
Abstract:
Blue straggler stars (BSS) are peculiar objects which normally appear as a single broad sequence along the extension of the main sequence. Only four globular clusters (GCs) have been observed to have two distinct and parallel BSS sequences. For the first time for any open cluster (OC), we report double BSS sequences in Berkeley 17. Using the machine-learning based membership algorithm ML-MOC on Gaia EDR3 data, we identify 627 cluster members, including 21 BSS candidates out to 15 arcmin from the cluster center. Both the BSS sequences are almost equally populated and parallel to one another in Gaia as well as in Pan-STARRS colour-magnitude diagram (CMD). We statistically confirm their presence and report that both BSS sequences are highly segregated compared to the reference population out to $\sim$5.5 arcmin and not segregated thereafter. The lower densities of OCs make BSS formation impossible via the collisional channel. Therefore, mass transfer seems to be the only viable channel for forming candidates of both sequences. The gap between the red and blue BSS sequences, on the other hand, is significant and presents a great opportunity to understand the connection between BSS formation and internal as well as external dynamics of the parent clusters.
Submitted 8 October, 2022; v1 submitted 5 October, 2022;
originally announced October 2022.
-
GANTouch: An Attack-Resilient Framework for Touch-based Continuous Authentication System
Authors:
Mohit Agrawal,
Pragyan Mehrotra,
Rajesh Kumar,
Rajiv Ratn Shah
Abstract:
Previous studies have shown that commonly studied (vanilla) implementations of touch-based continuous authentication systems (V-TCAS) are susceptible to active adversarial attempts. This study presents a novel Generative Adversarial Network assisted TCAS (G-TCAS) framework and compares it to the V-TCAS under three active adversarial environments, viz. Zero-effort, Population, and Random-vector. The Zero-effort environment was implemented in two variations, viz. Zero-effort (same-dataset) and Zero-effort (cross-dataset). The first involved a Zero-effort attack from the same dataset, while the second used three different datasets. G-TCAS showed more resilience than V-TCAS under the Population and Random-vector environments, which are more damaging adversarial scenarios than Zero-effort. On average, the increase in the false accept rates (FARs) for V-TCAS was much higher (27.5% and 21.5%) than for G-TCAS (14% and 12.5%) for Population and Random-vector attacks, respectively. Moreover, we performed a fairness analysis of TCAS for different genders and found TCAS to be fair across genders. The findings suggest that we should evaluate TCAS under active adversarial environments and affirm the usefulness of GANs in the TCAS pipeline.
Submitted 2 October, 2022;
originally announced October 2022.
-
Document Image Binarization in JPEG Compressed Domain using Dual Discriminator Generative Adversarial Networks
Authors:
Bulla Rajesh,
Manav Kamlesh Agrawal,
Milan Bhuva,
Kisalaya Kishore,
Mohammed Javed
Abstract:
Image binarization techniques are widely used to enhance noisy and/or degraded images for Document Image Analysis (DIA) applications such as word spotting, document retrieval, and OCR. Most existing techniques feed pixel images into convolutional neural networks to accomplish document binarization, which may not produce effective results when working with compressed images that need to be processed without full decompression. Therefore, this paper proposes binarizing document images directly from their JPEG compressed streams by employing Dual Discriminator Generative Adversarial Networks (DD-GANs). Here the two discriminator networks, Global and Local, work on different image ratios and use focal loss as generator loss. The proposed model has been thoroughly tested with different versions of the DIBCO dataset, which pose challenges like holes, erased or smudged ink, dust, and misplaced fibres. The model proved to be highly robust and efficient in terms of both time and space complexity, and it achieved state-of-the-art performance in the JPEG compressed domain.
Submitted 13 September, 2022;
originally announced September 2022.
-
Large Language Models are Few-Shot Clinical Information Extractors
Authors:
Monica Agrawal,
Stefan Hegselmann,
Hunter Lang,
Yoon Kim,
David Sontag
Abstract:
A long-running goal of the clinical NLP community is the extraction of important variables trapped in clinical notes. However, roadblocks have included dataset shift from the general domain and a lack of public clinical corpora and annotations. In this work, we show that large language models, such as InstructGPT, perform well at zero- and few-shot information extraction from clinical text despite not being trained specifically for the clinical domain. Whereas text classification and generation performance have already been studied extensively in such models, here we additionally demonstrate how to leverage them to tackle a diverse set of NLP tasks which require more structured outputs, including span identification, token-level sequence classification, and relation extraction. Further, due to the dearth of available data to evaluate these systems, we introduce new datasets for benchmarking few-shot clinical information extraction based on a manual re-annotation of the CASI dataset for new tasks. On the clinical extraction tasks we studied, the GPT-3 systems significantly outperform existing zero- and few-shot baselines.
Submitted 30 November, 2022; v1 submitted 25 May, 2022;
originally announced May 2022.
-
Co-training Improves Prompt-based Learning for Large Language Models
Authors:
Hunter Lang,
Monica Agrawal,
Yoon Kim,
David Sontag
Abstract:
We demonstrate that co-training (Blum & Mitchell, 1998) can improve the performance of prompt-based learning by using unlabeled data. While prompting has emerged as a promising paradigm for few-shot and zero-shot learning, it is often brittle and requires much larger models compared to the standard supervised setup. We find that co-training makes it possible to improve the original prompt model and at the same time learn a smaller, downstream task-specific model. In the case where we only have partial access to a prompt model (e.g., output probabilities from GPT-3 (Brown et al., 2020)) we learn a calibration model over the prompt outputs. When we have full access to the prompt model's gradients but full finetuning remains prohibitively expensive (e.g., T0 (Sanh et al., 2021)), we learn a set of soft prompt continuous vectors to iteratively update the prompt model. We find that models trained in this manner can significantly improve performance on challenging datasets where there is currently a large gap between prompt-based learning and fully-supervised models.
Submitted 1 February, 2022;
originally announced February 2022.
-
Leveraging Time Irreversibility with Order-Contrastive Pre-training
Authors:
Monica Agrawal,
Hunter Lang,
Michael Offin,
Lior Gazit,
David Sontag
Abstract:
Label-scarce, high-dimensional domains such as healthcare present a challenge for modern machine learning techniques. To overcome the difficulties posed by a lack of labeled data, we explore an "order-contrastive" method for self-supervised pre-training on longitudinal data. We sample pairs of time segments, switch the order for half of them, and train a model to predict whether a given pair is in the correct order. Intuitively, the ordering task allows the model to attend to the least time-reversible features (for example, features that indicate progression of a chronic disease). The same features are often useful for downstream tasks of interest. To quantify this, we study a simple theoretical setting where we prove a finite-sample guarantee for the downstream error of a representation learned with order-contrastive pre-training. Empirically, in synthetic and longitudinal healthcare settings, we demonstrate the effectiveness of order-contrastive pre-training in the small-data regime over supervised learning and other self-supervised pre-training baselines. Our results indicate that pre-training methods designed for particular classes of distributions and downstream tasks can improve the performance of self-supervised learning.
Submitted 29 March, 2022; v1 submitted 3 November, 2021;
originally announced November 2021.
-
MedKnowts: Unified Documentation and Information Retrieval for Electronic Health Records
Authors:
Luke Murray,
Divya Gopinath,
Monica Agrawal,
Steven Horng,
David Sontag,
David R. Karger
Abstract:
Clinical documentation can be transformed by Electronic Health Records, yet the documentation process is still a tedious, time-consuming, and error-prone process. Clinicians are faced with multi-faceted requirements and fragmented interfaces for information exploration and documentation. These challenges are only exacerbated in the Emergency Department -- clinicians often see 35 patients in one shift, during which they have to synthesize an often previously unknown patient's medical records in order to reach a tailored diagnosis and treatment plan. To better support this information synthesis, clinical documentation tools must enable rapid contextual access to the patient's medical record. MedKnowts is an integrated note-taking editor and information retrieval system which unifies the documentation and search process and provides concise synthesized concept-oriented slices of the patient's medical record. MedKnowts automatically captures structured data while still allowing users the flexibility of natural language. MedKnowts leverages this structure to enable easier parsing of long notes, auto-populated text, and proactive information retrieval, easing the documentation burden.
Submitted 23 September, 2021;
originally announced September 2021.
-
WeightScale: Interpreting Weight Change in Neural Networks
Authors:
Ayush Manish Agrawal,
Atharva Tendle,
Harshvardhan Sikka,
Sahib Singh
Abstract:
Interpreting the learning dynamics of neural networks can provide useful insights into how networks learn and inform the development of better training and design approaches. We present an approach to interpret learning in neural networks by measuring relative weight change on a per-layer basis and dynamically aggregating emerging trends through a combination of dimensionality reduction and clustering, which allows us to scale to very deep networks. We use this approach to investigate learning in the context of vision tasks across a variety of state-of-the-art networks and provide insights into the learning behavior of these networks, including how task complexity affects layer-wise learning in deeper layers of networks.
Submitted 26 March, 2022; v1 submitted 7 July, 2021;
originally announced July 2021.
-
A Multi-Lead Fusion Method for the Accurate Delineation of QRS Complex Location in 12 Lead ECG Signal
Authors:
Chhaviraj Chauhan,
Monika Agrawal,
Pooja Sabherwal
Abstract:
This paper presents a multi-lead fusion method for the accurate and automated detection of the QRS complex location in 12 lead ECG (Electrocardiogram) signals. The proposed multi-lead fusion method accurately delineates the QRS complex by the fusion of detected QRS complexes of the individual 12 leads. The proposed algorithm consists of two major stages. Firstly, the QRS complex location of each lead is detected by the single lead QRS detection algorithm. Secondly, the multi-lead fusion algorithm combines the information of the QRS complex locations obtained in each of the 12 leads. The performance of the proposed algorithm is improved in terms of Sensitivity and Positive Predictivity by discarding the false positives. The proposed method is validated on the ECG signals with various artifacts, inter and intra subject variations. The performance of the proposed method is validated on the long duration recorded ECG signals of St. Petersburg INCART database with Sensitivity of 99.87% and Positive Predictivity of 99.96% and on the short duration recorded ECG signals of CSE (Common Standards for Electrocardiography) multi-lead database with 100% Sensitivity and 99.13% Positive Predictivity.
Submitted 27 July, 2021; v1 submitted 12 July, 2021;
originally announced July 2021.
-
Defending Touch-based Continuous Authentication Systems from Active Adversaries Using Generative Adversarial Networks
Authors:
Mohit Agrawal,
Pragyan Mehrotra,
Rajesh Kumar,
Rajiv Ratn Shah
Abstract:
Previous studies have demonstrated that commonly studied (vanilla) touch-based continuous authentication systems (V-TCAS) are susceptible to population attack. This paper proposes a novel Generative Adversarial Network assisted TCAS (G-TCAS) framework, which showed more resilience to the population attack. G-TCAS framework was tested on a dataset of 117 users who interacted with a smartphone and tablet pair. On average, the increase in the false accept rates (FARs) for V-TCAS was much higher (22%) than G-TCAS (13%) for the smartphone. Likewise, the increase in the FARs for V-TCAS was 25% compared to G-TCAS (6%) for the tablet.
Submitted 15 June, 2021;
originally announced June 2021.
-
Deep neural networks based predictive-generative framework for designing composite materials
Authors:
Ashank,
Soumen Chakravarty,
Pranshu Garg,
Ankit Kumar,
Manish Agrawal,
Prabhat K. Agnihotri
Abstract:
Designing composite materials to meet application requirements is fundamentally a challenging and time-consuming task. Here we report the development of a deep neural network based computational framework capable of solving the forward (predictive) as well as the inverse (generative) design problem. The predictor model is based on the popular convolutional neural network architecture and trained with the help of finite element simulations. Further, the developed property predictor model is used as a feedback mechanism in the neural network based generator model. The proposed predictive-generative model can be used to obtain the micro-structure for maximization of particular elastic properties as well as for specified elastic constants. One of the major hurdles for deployment of deep learning techniques in composite material design is the intensive computational resources required to generate the training data sets. To this end, a novel data augmentation scheme is presented. Applying the data augmentation scheme yields significant savings of computational resources in the training phase. The proposed data augmentation approach is general and can be used in any setting involving periodic micro-structures. The efficacy of the predictive-generative model is demonstrated through various examples. It is envisaged that the developed model will significantly reduce the cost and time associated with the composite material design process for advanced applications.
Submitted 4 May, 2021;
originally announced May 2021.
-
Chandrayaan-2 Dual-Frequency SAR (DFSAR): Performance Characterization and Initial Results
Authors:
Sriram S. Bhiravarasu,
Tathagata Chakraborty,
Deepak Putrevu,
Dharmendra K. Pandey,
Anup K. Das,
V. M. Ramanujam,
Raghav Mehra,
Parikshit Parasher,
Krishna M. Agrawal,
Shubham Gupta,
Gaurav S. Seth,
Amit Shukla,
Nikhil Y. Pandya,
Sanjay Trivedi,
Arundhati Misra,
Rajeev Jyoti,
Raj Kumar
Abstract:
The Dual-Frequency synthetic aperture radar (DFSAR) system manifested on the Chandrayaan-2 spacecraft represents a significant step forward in radar exploration of solid solar system objects. It combines SAR at two wavelengths (L- and S-bands) and multiple resolutions with several polarimetric modes in one lightweight ($\sim$ 20 kg) package. The resulting data from DFSAR support calculation of the 2$\times$2 complex scattering matrix for each resolution cell, which enables lunar near surface characterization in terms of radar polarization properties at different wavelengths and incidence angles. In this paper, we report on the calibration and preliminary performance characterization of DFSAR data based on the analysis of a sample set of crater regions on the Moon. Our calibration analysis provided a means to compare on-orbit performance with pre-launch measurements and the results matched with the pre-launch expected values. Our initial results show that craters in both permanently shadowed regions (PSRs) and non-PSRs that are classified as Circular Polarization Ratio (CPR)-anomalous in previous S-band radar analyses appear anomalous at L-band also. We also observe that material evolution and physical properties at their interior and proximal ejecta are decoupled. For Byrgius C crater region, we compare our analysis of dual-frequency radar data with the predicted behaviours of theoretical scattering models. If crater age estimates are available, comparison of their radar polarization properties at multiple wavelengths similar to that of the three unnamed south polar crater regions shown in this study may provide new insights into how the rockiness of craters evolves with time.
Submitted 29 April, 2021;
originally announced April 2021.
-
Assessing the Impact of Automated Suggestions on Decision Making: Domain Experts Mediate Model Errors but Take Less Initiative
Authors:
Ariel Levy,
Monica Agrawal,
Arvind Satyanarayan,
David Sontag
Abstract:
Automated decision support can accelerate tedious tasks as users can focus their attention where it is needed most. However, a key concern is whether users overly trust or cede agency to automation. In this paper, we investigate the effects of introducing automation to annotating clinical texts--a multi-step, error-prone task of identifying clinical concepts (e.g., procedures) in medical notes, and mapping them to labels in a large ontology. We consider two forms of decision aid: recommending which labels to map concepts to, and pre-populating annotation suggestions. Through laboratory studies, we find that 18 clinicians generally build intuition of when to rely on automation and when to exercise their own judgement. However, when presented with fully pre-populated suggestions, these expert users exhibit less agency: accepting improper mentions, and taking less initiative in creating additional annotations. Our findings inform how systems and algorithms should be designed to mitigate the observed issues.
Submitted 29 March, 2021; v1 submitted 8 March, 2021;
originally announced March 2021.
-
SUTRA: A Novel Approach to Modelling Pandemics with Applications to COVID-19
Authors:
Manindra Agrawal,
Madhuri Kanitkar,
Deepu Phillip,
Tanima Hajra,
Arti Singh,
Avaneesh Singh,
Prabal Pratap Singh,
Mathukumalli Vidyasagar
Abstract:
The Covid-19 pandemic has two key properties: (i) asymptomatic cases (both detected and undetected) that can result in new infections, and (ii) time-varying characteristics due to new variants, Non-Pharmaceutical Interventions etc. We develop a model called SUTRA (Susceptible, Undetected though infected, Tested positive, and Removed Analysis) that takes into account both of these two key properties. While applying the model to a region, two parameters of the model can be learnt from the number of daily new cases found in the region. Using the learnt values of the parameters the model can predict the number of daily new cases so long as the learnt parameters do not change substantially. Whenever any of the two parameters changes due to the key property (ii) above, the SUTRA model can detect that the values of one or both of the parameters have changed. Further, the model has the capability to relearn the changed parameter values, and then use these to carry out the prediction of the trajectory of the pandemic for the region of concern. The SUTRA approach can be applied at various levels of granularity, from an entire country to a district, more specifically, to any large enough region for which the data of daily new cases are available.
We have applied the SUTRA model to thirty-two countries, covering more than half of the world's population. Our conclusions are: (i) The model is able to capture the past trajectories very well. Moreover, the parameter values, which we can estimate robustly, help quantify the impact of changes in the pandemic characteristics. (ii) Unless the pandemic characteristics change significantly, the model has good predictive capability. (iii) Natural immunity provides significantly better protection against infection than the currently available vaccines.
Submitted 25 October, 2022; v1 submitted 22 January, 2021;
originally announced January 2021.
-
Muscle-inspired flexible mechanical logic architecture for colloidal robotics
Authors:
Mayank Agrawal,
Sharon C. Glotzer
Abstract:
Materials that respond to external stimuli by expanding or contracting provide a transduction route that integrates sensing and actuation powered directly by the stimuli. This motivates us to build colloidal scale robots using these materials that can morph into arbitrary configurations. For intelligent use of global stimuli in robotic systems, computation ability needs to be incorporated within them. The challenge is to design an architecture that is compact, material agnostic, stable under stochastic forces and can employ stimuli-responsive materials. We present an architecture that computes combinatorial logic using mechanical gates that use muscle-like response - expansion and contraction - as circuit signal with additional benefits of logic circuitry being physically flexible and able to be retrofit to arbitrary robot bodies. We mathematically analyze gate geometry and discuss tuning it for the given requirements of signal dimension and magnitude. We validate the function and stability of the design at the colloidal scale using Brownian dynamics simulations. We also demonstrate the gate design using a 3D printed model. Finally, we simulate a complete robot that folds into Tetris shapes.
Submitted 16 December, 2020;
originally announced December 2020.
-
A Novel Multimodal Music Genre Classifier using Hierarchical Attention and Convolutional Neural Network
Authors:
Manish Agrawal,
Abhilash Nandy
Abstract:
Music genre classification is one of the trending topics in current Music Information Retrieval (MIR) research. Since genre depends on more than the audio profile alone, we also make use of the textual content provided as lyrics of the corresponding song. We implemented a CNN-based feature extractor for spectrograms to incorporate the acoustic features and a Hierarchical Attention Network based feature extractor for lyrics. We then classify the music track based on the resulting fused feature vector.
Submitted 24 November, 2020;
originally announced November 2020.
-
Investigating Learning in Deep Neural Networks using Layer-Wise Weight Change
Authors:
Ayush Manish Agrawal,
Atharva Tendle,
Harshvardhan Sikka,
Sahib Singh,
Amr Kayid
Abstract:
Understanding the per-layer learning dynamics of deep neural networks is of significant interest as it may provide insights into how neural networks learn and the potential for better training regimens. We investigate learning in Deep Convolutional Neural Networks (CNNs) by measuring the relative weight change of layers while training. Several interesting trends emerge in a variety of CNN architectures across various computer vision classification tasks, including the overall increase in relative weight change of later layers as compared to earlier ones.
Submitted 30 November, 2020; v1 submitted 12 November, 2020;
originally announced November 2020.
-
Robust Benchmarking for Machine Learning of Clinical Entity Extraction
Authors:
Monica Agrawal,
Chloe O'Connell,
Yasmin Fatemi,
Ariel Levy,
David Sontag
Abstract:
Clinical studies often require understanding elements of a patient's narrative that exist only in free text clinical notes. To transform notes into structured data for downstream use, these elements are commonly extracted and normalized to medical vocabularies. In this work, we audit the performance of and indicate areas of improvement for state-of-the-art systems. We find that high task accuracies for clinical entity normalization systems on the 2019 n2c2 Shared Task are misleading, and underlying performance is still brittle. Normalization accuracy is high for common concepts (95.3%), but much lower for concepts unseen in training data (69.3%). We demonstrate that current approaches are hindered in part by inconsistencies in medical vocabularies, limitations of existing labeling schemas, and narrow evaluation techniques. We reformulate the annotation framework for clinical entity extraction to factor in these issues to allow for robust end-to-end system benchmarking. We evaluate concordance of annotations from our new framework between two annotators and achieve a Jaccard similarity of 0.73 for entity recognition and an agreement of 0.83 for entity normalization. We propose a path forward to address the demonstrated need for the creation of a reference standard to spur method development in entity recognition and normalization.
Submitted 31 July, 2020;
originally announced July 2020.
-
Fast, Structured Clinical Documentation via Contextual Autocomplete
Authors:
Divya Gopinath,
Monica Agrawal,
Luke Murray,
Steven Horng,
David Karger,
David Sontag
Abstract:
We present a system that uses a learned autocompletion mechanism to facilitate rapid creation of semi-structured clinical documentation. We dynamically suggest relevant clinical concepts as a doctor drafts a note by leveraging features from both unstructured and structured medical data. By constraining our architecture to shallow neural networks, we are able to make these suggestions in real time. Furthermore, as our algorithm is used to write a note, we can automatically annotate the documentation with clean labels of clinical concepts drawn from medical vocabularies, making notes more structured and readable for physicians, patients, and future algorithms. To our knowledge, this system is the only machine learning-based documentation utility for clinical notes deployed in a live hospital setting, and it reduces keystroke burden of clinical concepts by 67% in real environments.
Submitted 29 July, 2020;
originally announced July 2020.
-
PClean: Bayesian Data Cleaning at Scale with Domain-Specific Probabilistic Programming
Authors:
Alexander K. Lew,
Monica Agrawal,
David Sontag,
Vikash K. Mansinghka
Abstract:
Data cleaning is naturally framed as probabilistic inference in a generative model of ground-truth data and likely errors, but the diversity of real-world error patterns and the hardness of inference make Bayesian approaches difficult to automate. We present PClean, a probabilistic programming language (PPL) for leveraging dataset-specific knowledge to automate Bayesian cleaning. Compared to general-purpose PPLs, PClean tackles a restricted problem domain, enabling three modeling and inference innovations: (1) a non-parametric model of relational database instances, which users' programs customize; (2) a novel sequential Monte Carlo inference algorithm that exploits the structure of PClean's model class; and (3) a compiler that generates near-optimal SMC proposals and blocked-Gibbs rejuvenation kernels based on the user's model and data. We show empirically that short (< 50-line) PClean programs can: be faster and more accurate than generic PPL inference on data-cleaning benchmarks; match state-of-the-art data-cleaning systems in terms of accuracy and runtime (unlike generic PPL inference in the same runtime); and scale to real-world datasets with millions of records.
Submitted 18 November, 2022; v1 submitted 23 July, 2020;
originally announced July 2020.
-
On the Inference of Soft Biometrics from Typing Patterns Collected in a Multi-device Environment
Authors:
Vishaal Udandarao,
Mohit Agrawal,
Rajesh Kumar,
Rajiv Ratn Shah
Abstract:
In this paper, we study the inference of gender, major/minor (computer science, non-computer science), typing style, age, and height from the typing patterns collected from 117 individuals in a multi-device environment. The inference of the first three identifiers was considered as classification tasks, while the rest as regression tasks. For classification tasks, we benchmark the performance of six classical machine learning (ML) and four deep learning (DL) classifiers. On the other hand, for regression tasks, we evaluated three ML and four DL-based regressors. The overall experiment consisted of two text-entry (free and fixed) and four device (Desktop, Tablet, Phone, and Combined) configurations. The best arrangements achieved accuracies of 96.15%, 93.02%, and 87.80% for typing style, gender, and major/minor, respectively, and mean absolute errors of 1.77 years and 2.65 inches for age and height, respectively. The results are promising considering the variety of application scenarios that we have listed in this work.
Submitted 16 June, 2020;
originally announced June 2020.
-
Model-assisted cohort selection with bias analysis for generating large-scale cohorts from the EHR for oncology research
Authors:
Benjamin Birnbaum,
Nathan Nussbaum,
Katharina Seidl-Rathkopf,
Monica Agrawal,
Melissa Estevez,
Evan Estola,
Joshua Haimson,
Lucy He,
Peter Larson,
Paul Richardson
Abstract:
Objective: Electronic health records (EHRs) are a promising source of data for health outcomes research in oncology. A challenge in using EHR data is that selecting cohorts of patients often requires information in unstructured parts of the record. Machine learning has been used to address this, but even high-performing algorithms may select patients in a non-random manner and bias the resulting cohort. To improve the efficiency of cohort selection while measuring potential bias, we introduce a technique called Model-Assisted Cohort Selection (MACS) with Bias Analysis and apply it to the selection of metastatic breast cancer (mBC) patients. Materials and Methods: We trained a model on 17,263 patients using term-frequency inverse-document-frequency (TF-IDF) and logistic regression. We used a test set of 17,292 patients to measure algorithm performance and perform Bias Analysis. We compared the cohort generated by MACS to the cohort that would have been generated without MACS as reference standard, first by comparing distributions of an extensive set of clinical and demographic variables and then by comparing the results of two analyses addressing existing example research questions. Results: Our algorithm had an area under the curve (AUC) of 0.976, a sensitivity of 96.0%, and an abstraction efficiency gain of 77.9%. During Bias Analysis, we found no large differences in baseline characteristics and no differences in the example analyses. Conclusion: MACS with bias analysis can significantly improve the efficiency of cohort selection on EHR data while instilling confidence that outcomes research performed on the resulting cohort will not be biased.
Submitted 13 January, 2020;
originally announced January 2020.
-
Scaling up Psychology via Scientific Regret Minimization: A Case Study in Moral Decisions
Authors:
Mayank Agrawal,
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Do large datasets provide value to psychologists? Without a systematic methodology for working with such datasets, there is a valid concern that analyses will produce noise artifacts rather than true effects. In this paper, we offer a way to enable researchers to systematically build models and identify novel phenomena in large datasets. One traditional approach is to analyze the residuals of models---the biggest errors they make in predicting the data---to discover what might be missing from those models. However, once a dataset is sufficiently large, machine learning algorithms approximate the true underlying function better than the data, suggesting instead that the predictions of these data-driven models should be used to guide model-building. We call this approach "Scientific Regret Minimization" (SRM) as it focuses on minimizing errors for cases that we know should have been predictable. We demonstrate this methodology on a subset of the Moral Machine dataset, a public collection of roughly forty million moral decisions. Using SRM, we found that incorporating a set of deontological principles that capture dimensions along which groups of agents can vary (e.g. sex and age) improves a computational model of human moral judgment. Furthermore, we were able to identify and independently validate three interesting moral phenomena: criminal dehumanization, age of responsibility, and asymmetric notions of responsibility.
Submitted 8 January, 2020; v1 submitted 16 October, 2019;
originally announced October 2019.
-
Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph
Authors:
Irene Y. Chen,
Monica Agrawal,
Steven Horng,
David Sontag
Abstract:
Increasingly large electronic health records (EHRs) provide an opportunity to algorithmically learn medical knowledge. In one prominent example, a causal health knowledge graph could learn relationships between diseases and symptoms and then serve as a diagnostic tool to be refined with additional clinical input. Prior research has demonstrated the ability to construct such a graph from over 270,000 emergency department patient visits. In this work, we describe methods to evaluate a health knowledge graph for robustness. Moving beyond precision and recall, we analyze for which diseases and for which patients the graph is most accurate. We identify sample size and unmeasured confounders as major sources of error in the health knowledge graph. We introduce a method to leverage non-linear functions in building the causal graph to better understand existing model assumptions. Finally, to assess model generalizability, we extend to a larger set of complete patient visits within a hospital system. We conclude with a discussion on how to robustly extract medical knowledge from EHRs.
Submitted 1 October, 2019;
originally announced October 2019.
-
Using Machine Learning to Guide Cognitive Modeling: A Case Study in Moral Reasoning
Authors:
Mayank Agrawal,
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Large-scale behavioral datasets enable researchers to use complex machine learning algorithms to better predict human behavior, yet this increased predictive power does not always lead to a better understanding of the behavior in question. In this paper, we outline a data-driven, iterative procedure that allows cognitive scientists to use machine learning to generate models that are both interpretable and accurate. We demonstrate this method in the domain of moral decision-making, where standard experimental approaches often identify relevant principles that influence human judgments, but fail to generalize these findings to "real world" situations that place these principles in conflict. The recently released Moral Machine dataset allows us to build a powerful model that can predict the outcomes of these conflicts while remaining simple enough to explain the basis behind human decisions.
Submitted 10 May, 2019; v1 submitted 18 February, 2019;
originally announced February 2019.
-
The Query Complexity of a Permutation-Based Variant of Mastermind
Authors:
Peyman Afshani,
Manindra Agrawal,
Benjamin Doerr,
Carola Doerr,
Kasper Green Larsen,
Kurt Mehlhorn
Abstract:
We study the query complexity of a permutation-based variant of the guessing game Mastermind. In this variant, the secret is a pair $(z,π)$ which consists of a binary string $z \in \{0,1\}^n$ and a permutation $π$ of $[n]$. The secret must be unveiled by asking queries of the form $x \in \{0,1\}^n$. For each such query, we are returned the score \[ f_{z,π}(x):= \max \{ i \in [0..n]\mid \forall j \leq i: z_{π(j)} = x_{π(j)}\}\,;\] i.e., the score of $x$ is the length of the longest common prefix of $x$ and $z$ with respect to the order imposed by $π$. The goal is to minimize the number of queries needed to identify $(z,π)$. This problem originates from the study of black-box optimization heuristics, where it is known as the \textsc{LeadingOnes} problem.
In this work, we prove matching upper and lower bounds for the deterministic and randomized query complexity of this game, which are $Θ(n \log n)$ and $Θ(n \log \log n)$, respectively.
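The score function defined in the abstract can be illustrated with a minimal Python sketch. This is only an illustration of the definition of $f_{z,π}(x)$, not the authors' code; the function name `score`, the 0-based indexing, and the example values are assumptions introduced here.

```python
from typing import Sequence


def score(z: Sequence[int], pi: Sequence[int], x: Sequence[int]) -> int:
    """Return the largest i such that x agrees with z on the first i
    positions visited in the order given by the permutation pi.

    Assumption: pi is a 0-based permutation of range(len(z)), whereas the
    abstract indexes positions from 1.
    """
    agreed = 0
    for pos in pi:
        if z[pos] != x[pos]:
            break  # first disagreement ends the common prefix
        agreed += 1
    return agreed


# Hypothetical secret (z, pi) and one query x.
z = [1, 0, 1, 1]
pi = [2, 0, 3, 1]   # positions are inspected in the order 2, 0, 3, 1
x = [1, 1, 1, 0]
print(score(z, pi, x))  # positions 2 and 0 agree, position 3 differs -> 2
```

Querying the secret string itself returns the maximum score $n$, which is how a player can confirm that $z$ has been identified.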
Submitted 20 December, 2018;
originally announced December 2018.
-
TIFTI: A Framework for Extracting Drug Intervals from Longitudinal Clinic Notes
Authors:
Monica Agrawal,
Griffin Adams,
Nathan Nussbaum,
Benjamin Birnbaum
Abstract:
Oral drugs are becoming increasingly common in oncology care. In contrast to intravenous chemotherapy, which is administered in the clinic and carefully tracked via structured electronic health records (EHRs), oral drug treatment is self-administered and therefore not tracked as well. Often, the details of oral cancer treatment occur only in unstructured clinic notes. Extracting this information is critical to understanding a patient's treatment history. Yet, this is a challenging task because treatment intervals must be inferred longitudinally from both explicit mentions in the text and document timestamps. In this work, we present TIFTI (Temporally Integrated Framework for Treatment Intervals), a robust framework for extracting oral drug treatment intervals from a patient's unstructured notes. TIFTI leverages distinct sources of temporal information by breaking the problem down into two separate subtasks: document-level sequence labeling and date extraction. On a labeled dataset of metastatic renal-cell carcinoma (RCC) patients, it exactly matched the labeled start date in 46% of the examples (86% of the examples within 30 days), and it exactly matched the labeled end date in 52% of the examples (78% of the examples within 30 days). Without retraining, the model achieved a similar level of performance on a labeled dataset of advanced non-small-cell lung cancer (NSCLC) patients.
Submitted 3 December, 2018; v1 submitted 30 November, 2018;
originally announced November 2018.
-
Application of Vector Sensor for Underwater Acoustic Communications
Authors:
Farheen Fauziya,
Brejesh Lall,
Monika Agrawal
Abstract:
The use of vector sensors as receivers for Underwater Acoustic Communications systems is gaining popularity. It has become important to obtain performance measures for such communication systems to quantify their efficacy. The fundamental advantage of using a vector sensor as a receiver is that a single sensor is able to provide the diversity gains offered by MIMO systems. In a recent work, a novel framework for evaluating the capacity of the underwater channel was proposed. The approach is based on modeling the channel as a set of paths along which the signal arrives at the receiver with different Angles of Arrival. In this work, we build on that framework to provide a bound on the achievable capacity of such a system. The analytical bounds have been compared against simulation results for a vector sensor based SIMO underwater communications system. The channel parameters are modeled by analysing the statistics generated with the Bellhop simulation tool. This representation of the channel is flexible and allows for characterizing channels at different geographical locations and at different time instances. This characterization in terms of channel parameters enables the computing of the performance measure (channel capacity bound) for different geographical locations.
Submitted 18 April, 2018;
originally announced April 2018.
-
Modeling polypharmacy side effects with graph convolutional networks
Authors:
Marinka Zitnik,
Monica Agrawal,
Jure Leskovec
Abstract:
The use of drug combinations, termed polypharmacy, is common to treat patients with complex diseases and co-existing conditions. However, a major consequence of polypharmacy is a much higher risk of adverse side effects for the patient. Polypharmacy side effects emerge because of drug-drug interactions, in which activity of one drug may change if taken with another drug. The knowledge of drug interactions is limited because these complex relationships are rare, and are usually not observed in relatively small clinical testing. Discovering polypharmacy side effects thus remains an important challenge with significant implications for patient mortality. Here, we present Decagon, an approach for modeling polypharmacy side effects. The approach constructs a multimodal graph of protein-protein interactions, drug-protein target interactions, and the polypharmacy side effects, which are represented as drug-drug interactions, where each side effect is an edge of a different type. Decagon is developed specifically to handle such multimodal graphs with a large number of edge types. Our approach develops a new graph convolutional neural network for multirelational link prediction in multimodal networks. Decagon predicts the exact side effect, if any, through which a given drug combination manifests clinically. Decagon accurately predicts polypharmacy side effects, outperforming baselines by up to 69%. We find that it automatically learns representations of side effects indicative of co-occurrence of polypharmacy in patients. Furthermore, Decagon models particularly well side effects with a strong molecular basis, while on predominantly non-molecular side effects, it achieves good performance because of effective sharing of model parameters across edge types. Decagon creates opportunities to use large pharmacogenomic and patient data to flag and prioritize side effects for follow-up analysis.
Submitted 26 April, 2018; v1 submitted 1 February, 2018;
originally announced February 2018.
-
Large-scale analysis of disease pathways in the human interactome
Authors:
Monica Agrawal,
Marinka Zitnik,
Jure Leskovec
Abstract:
Discovering disease pathways, which can be defined as sets of proteins associated with a given disease, is an important problem that has the potential to provide clinically actionable insights for disease diagnosis, prognosis, and treatment. Computational methods aid the discovery by relying on protein-protein interaction (PPI) networks. They start with a few known disease-associated proteins and aim to find the rest of the pathway by exploring the PPI network around the known disease proteins. However, the success of such methods has been limited, and failure cases have not been well understood. Here we study the PPI network structure of 519 disease pathways. We find that 90% of pathways do not correspond to single well-connected components in the PPI network. Instead, proteins associated with a single disease tend to form many separate connected components/regions in the network. We then evaluate state-of-the-art disease pathway discovery methods and show that their performance is especially poor on diseases with disconnected pathways. Thus, we conclude that network connectivity structure alone may not be sufficient for disease pathway discovery. However, we show that higher-order network structures, such as small subgraphs of the pathway, provide a promising direction for the development of new methods.
Submitted 3 December, 2017;
originally announced December 2017.
-
Enhanced Array Aperture using Higher Order Statistics for DoA Estimation
Authors:
Payal Gupta,
Monika Agrawal
Abstract:
Recently, higher order statistics (HOS) and sparsity-based arrays have been among the most discussed techniques for estimating the Direction of Arrival (DoA). They not only provide enhanced Degrees of Freedom (DoF) to handle underdetermined cases but also improve the estimation accuracy of the system. To achieve high accuracy and a larger number of DoF with a limited number of sensors, we propose a method based on fourth-order statistics. The aperture of the virtual array becomes O(16N^4) using N physical sensors. The proposed method can be extended to higher orders of HOS, which increases the DoF manyfold. Numerical simulations validate the claims that the proposed method increases the resolution capacity and maximizes the DoF among all earlier proposed methods.
Submitted 19 April, 2018; v1 submitted 15 November, 2017;
originally announced November 2017.
-
An Evaluation of Digital Image Forgery Detection Approaches
Authors:
Abhishek Kashyap,
Rajesh Singh Parmar,
Megha Agrawal,
Hariom Gupta
Abstract:
With the advance of image processing software and editing tools, a digital image can be easily manipulated. Detecting image manipulation is vital because an image can be used as legal evidence, in crime scene investigation, and in numerous other fields. Image forgery detection techniques aim to verify the authenticity of digital images with no prior information about the original image. There are numerous ways of altering an image, for example, resampling, splicing, and copy-move. In this paper, we examine different types of image forgery and their detection techniques, focusing mainly on pixel-based image forgery detection techniques.
Submitted 30 March, 2017; v1 submitted 29 March, 2017;
originally announced March 2017.
-
Small hitting-sets for tiny arithmetic circuits or: How to turn bad designs into good
Authors:
Manindra Agrawal,
Michael Forbes,
Sumanta Ghosh,
Nitin Saxena
Abstract:
We show that if we can design poly($s$)-time hitting-sets for $Σ\wedge^aΣΠ^{O(\log s)}$ circuits of size $s$, where $a=ω(1)$ is arbitrarily small and the number of variables, or arity $n$, is $O(\log s)$, then we can derandomize blackbox PIT for general circuits in quasipolynomial time. This also establishes that either E$\not\subseteq$\#P/poly or that VP$\ne$VNP. In fact, we show that one only needs a poly($s$)-time hitting-set against individual-degree $a'=ω(1)$ polynomials that are computable by a size-$s$ arity-$(\log s)$ $ΣΠΣ$ circuit (note: $Π$ fanin may be $s$). Alternatively, we claim that, to understand VP one only needs to find hitting-sets, for depth-$3$, that have a small parameterized complexity. Another tiny family of interest is when we restrict the arity $n=ω(1)$ to be arbitrarily small. We show that if we can design poly($s,μ(n)$)-time hitting-sets for size-$s$ arity-$n$ $ΣΠΣ\wedge$ circuits (resp.~$Σ\wedge^aΣΠ$), where function $μ$ is arbitrary, then we can solve PIT for VP in quasipoly-time, and prove the corresponding lower bounds. Our methods are strong enough to prove a surprising {\em arity reduction} for PIT-- to solve the general problem completely it suffices to find a blackbox PIT with time-complexity $sd2^{O(n)}$. We give several examples of ($\log s$)-variate circuits where a new measure (called cone-size) helps in devising poly-time hitting-sets, but the same question for their $s$-variate versions is open till date: For eg., diagonal depth-$3$ circuits, and in general, models that have a {\em small} partial derivative space. We also introduce a new concept, called cone-closed basis isolation, and provide example models where it occurs, or can be achieved by a small shift.
Submitted 23 February, 2017;
originally announced February 2017.
-
Low-damping transmission of spin waves through YIG/Pt-based layered structures for spin-orbit-torque applications
Authors:
Dmytro A. Bozhko,
Alexander A. Serga,
Milan Agrawal,
Burkard Hillebrands,
Mikhail P. Kostylev
Abstract:
We show that in YIG-Pt bi-layers, which are widely used in experiments on the spin transfer torque and spin Hall effects, the spin-wave amplitude significantly decreases in comparison to a single YIG film due to the excitation of microwave eddy currents in a Pt coat. By introducing a novel excitation geometry, where the Pt layer faces the ground plane of a microstrip line structure, we suppressed the excitation of the eddy currents in the Pt layer and, thus, achieved a large increase in the transmission of the Damon-Eshbach surface spin wave. At the same time, no visible influence of an external dc current applied to the Pt layer on the spin-wave amplitude in the YIG-Pt bi-layer was observed in our experiments with YIG films of micrometer thickness.
Submitted 30 March, 2016;
originally announced March 2016.
-
Spin-transfer torque based damping control of parametrically excited spin waves in a magnetic insulator
Authors:
V. Lauer,
D. A. Bozhko,
T. Brächer,
P. Pirro,
V. I. Vasyuchka,
A. A. Serga,
M. B. Jungfleisch,
M. Agrawal,
Yu. V. Kobljanskyj,
G. A. Melkov,
C. Dubs,
B. Hillebrands,
A. V. Chumak
Abstract:
The damping of spin waves parametrically excited in the magnetic insulator Yttrium Iron Garnet (YIG) is controlled by a dc current passed through an adjacent normal-metal film. The experiment is performed on a macroscopically sized YIG(100nm)/Pt(10nm) bilayer of 4x2 mm^2 lateral dimensions. The spin-wave relaxation frequency is determined via the threshold of the parametric instability measured by Brillouin light scattering (BLS) spectroscopy. The application of a dc current to the Pt film leads to the formation of a spin-polarized electron current normal to the film plane due to the spin Hall effect (SHE). This spin current exerts a spin transfer torque (STT) in the YIG film and, thus, changes the spin-wave damping. Depending on the polarity of the applied dc current with respect to the magnetization direction, the damping can be increased or decreased. The magnitude of its variation is proportional to the applied current. A variation in the relaxation frequency of +/-7.5% is achieved for an applied dc current density of 5*10^10 A/m^2.
Submitted 29 August, 2015;
originally announced August 2015.