
Showing 1–50 of 96 results for author: Kannan, A

  1. arXiv:2406.06712  [pdf, ps, other]

    math.RT

    Classification of Non-Degenerate Symmetric Bilinear and Quadratic Forms in the Verlinde Category $\mathrm{Ver}_4^+$

    Authors: Iz Chen, Arun S. Kannan, Krishna Pothapragada

    Abstract: Although Deligne's theorem classifies all symmetric tensor categories (STCs) with moderate growth over algebraically closed fields of characteristic zero, the classification does not extend to positive characteristic. At the forefront of the study of STCs is the search for an analog to Deligne's theorem in positive characteristic, and it has become increasingly apparent that the Verlinde categorie…

    Submitted 10 June, 2024; originally announced June 2024.

  2. arXiv:2406.02778  [pdf, other]

    cs.LG

    MS-IMAP -- A Multi-Scale Graph Embedding Approach for Interpretable Manifold Learning

    Authors: Shay Deutsch, Lionel Yelibi, Alex Tong Lin, Arjun Ravi Kannan

    Abstract: Deriving meaningful representations from complex, high-dimensional data in unsupervised settings is crucial across diverse machine learning applications. This paper introduces a framework for multi-scale graph network embedding based on spectral graph wavelets that employs a contrastive learning approach. A significant feature of the proposed embedding is its capacity to establish a correspondence…

    Submitted 5 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  3. arXiv:2404.02786  [pdf, ps, other]

    math.RT

    The Steinberg Tensor Product Theorem for General Linear Group Schemes in the Verlinde Category

    Authors: Arun S. Kannan

    Abstract: The Steinberg tensor product theorem is a fundamental result in the modular representation theory of reductive algebraic groups. It describes any finite-dimensional simple module of highest weight $λ$ over such a group as the tensor product of Frobenius twists of simple modules with highest weights the weights appearing in a $p$-adic decomposition of $λ$, thereby reducing the character problem to…

    Submitted 3 April, 2024; originally announced April 2024.

  4. arXiv:2404.01719  [pdf, ps, other]

    math.RA

    From the Albert algebra to Kac's ten-dimensional Jordan superalgebra via tensor categories in characteristic 5

    Authors: Alberto Elduque, Pavel Etingof, Arun S. Kannan

    Abstract: Kac's ten-dimensional simple Jordan superalgebra over a field of characteristic 5 is obtained from a process of semisimplification, via tensor categories, from the exceptional simple Jordan algebra (or Albert algebra), together with a suitable order 5 automorphism. This explains McCrimmon's 'bizarre result' asserting that, in characteristic 5, Kac's superalgebra is a sort of 'degree 3 Jordan super…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 22 pages

    MSC Class: Primary 17C40; Secondary 17C70; 17B25; 18M15

  5. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  6. arXiv:2312.14976  [pdf, other]

    cs.CV cs.CY

    Gaussian Harmony: Attaining Fairness in Diffusion-based Face Generation Models

    Authors: Basudha Pal, Arunkumar Kannan, Ram Prabhakar Kathirvel, Alice J. O'Toole, Rama Chellappa

    Abstract: Diffusion models have achieved great progress in face generation. However, these models amplify the bias in the generation process, leading to an imbalance in distribution of sensitive attributes such as age, gender and race. This paper proposes a novel solution to this problem by balancing the facial attributes of the generated images. We mitigate the bias by localizing the means of the facial at…

    Submitted 21 December, 2023; originally announced December 2023.

  7. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  8. arXiv:2311.18071  [pdf, other]

    cs.CV

    Turn Down the Noise: Leveraging Diffusion Models for Test-time Adaptation via Pseudo-label Ensembling

    Authors: Mrigank Raman, Rohan Shah, Akash Kannan, Pranit Chawla

    Abstract: The goal of test-time adaptation is to adapt a source-pretrained model to a continuously changing target domain without relying on any source data. Typically, this is either done by updating the parameters of the model (model adaptation) using inputs from the target domain or by modifying the inputs themselves (input adaptation). However, methods that modify the model suffer from the issue of comp…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted to Workshop on Distribution Shifts: New Frontiers with Foundation Models at Neurips 2023

  9. arXiv:2311.08303  [pdf, other]

    cs.CL cs.AI

    Extrinsically-Focused Evaluation of Omissions in Medical Summarization

    Authors: Elliot Schumacher, Daniel Rosenthal, Varun Nair, Luladay Price, Geoffrey Tso, Anitha Kannan

    Abstract: The goal of automated summarization techniques (Paice, 1990; Kupiec et al, 1995) is to condense text by focusing on the most critical information. Generative large language models (LLMs) have shown to be robust summarizers, yet traditional metrics struggle to capture resulting performance (Goyal et al, 2022) in more powerful LLMs. In safety-critical domains such as medicine, more rigorous evaluati…

    Submitted 14 November, 2023; originally announced November 2023.

  10. arXiv:2310.19797  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    DEFT: Dexterous Fine-Tuning for Real-World Hand Policies

    Authors: Aditya Kannan, Kenneth Shaw, Shikhar Bahl, Pragna Mannam, Deepak Pathak

    Abstract: Dexterity is often seen as a cornerstone of complex manipulation. Humans are able to perform a host of skills with their hands, from making food to operating tools. In this paper, we investigate these challenges, especially in the case of soft, deformable objects as well as complex, relatively long-horizon tasks. However, learning such behaviors from scratch can be data inefficient. To circumvent…

    Submitted 12 December, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: In CoRL 2023. Website at https://dexterous-finetuning.github.io/

  11. arXiv:2306.03652  [pdf, other]

    cs.CL

    Injecting knowledge into language generation: a case study in auto-charting after-visit care instructions from medical dialogue

    Authors: Maksim Eremeev, Ilya Valmianski, Xavier Amatriain, Anitha Kannan

    Abstract: Factual correctness is often the limiting factor in practical applications of natural language generation in high-stakes domains such as healthcare. An essential requirement for maintaining factuality is the ability to deal with rare tokens. This paper focuses on rare tokens that appear in both the source and the reference sequences, and which, when missed during generation, decrease the factual c…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: ACL 2023 (main conference)

  12. arXiv:2305.14394  [pdf, other]

    cs.NE cs.AI cs.LG q-bio.NC

    Unsupervised Spiking Neural Network Model of Prefrontal Cortex to study Task Switching with Synaptic deficiency

    Authors: Ashwin Viswanathan Kannan, Goutam Mylavarapu, Johnson P Thomas

    Abstract: In this study, we build a computational model of Prefrontal Cortex (PFC) using Spiking Neural Networks (SNN) to understand how neurons adapt and respond to tasks switched under short and longer duration of stimulus changes. We also explore behavioral deficits arising out of the PFC lesions by simulating lesioned states in our Spiking architecture model. Although there are some computational models…

    Submitted 23 May, 2023; originally announced May 2023.

  13. arXiv:2305.05982  [pdf, other]

    cs.CL cs.AI cs.LG

    Generating medically-accurate summaries of patient-provider dialogue: A multi-stage approach using large language models

    Authors: Varun Nair, Elliot Schumacher, Anitha Kannan

    Abstract: A medical provider's summary of a patient visit serves several critical purposes, including clinical decision-making, facilitating hand-offs between providers, and as a reference for the patient. An effective summary is required to be coherent and accurately capture all the medically relevant information in the dialogue, despite the complexity of patient-generated language. Even minor inaccuracies…

    Submitted 10 May, 2023; originally announced May 2023.

  14. arXiv:2304.14364  [pdf, other]

    cs.CL cs.AI cs.LG

    CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants

    Authors: Albert Yu Sun, Varun Nair, Elliot Schumacher, Anitha Kannan

    Abstract: A wave of new task-based virtual assistants has been fueled by increasingly powerful large language models (LLMs), such as GPT-4 (OpenAI, 2023). A major challenge in deploying LLM-based virtual conversational assistants in real world settings is ensuring they operate within what is admissible for the task. To overcome this challenge, the designers of these virtual assistants rely on an independent…

    Submitted 3 April, 2024; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: To appear in NAACL 2024

  15. arXiv:2304.01974  [pdf, other]

    cs.CL cs.IR

    Dialogue-Contextualized Re-ranking for Medical History-Taking

    Authors: Jian Zhu, Ilya Valmianski, Anitha Kannan

    Abstract: AI-driven medical history-taking is an important component in symptom checking, automated patient intake, triage, and other AI virtual care applications. As history-taking is extremely varied, machine learning models require a significant amount of data to train. To overcome this challenge, existing systems are developed using indirect data or expert knowledge. This leads to a training-inference g…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: Code and pre-trained S4 checkpoints will be available after publication

  16. arXiv:2303.17071  [pdf, other]

    cs.CL cs.AI cs.LG

    DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents

    Authors: Varun Nair, Elliot Schumacher, Geoffrey Tso, Anitha Kannan

    Abstract: Large language models (LLMs) have emerged as valuable tools for many natural language understanding tasks. In safety-critical applications such as healthcare, the utility of these models is governed by their ability to generate outputs that are factually accurate and complete. In this work, we present dialog-enabled resolving agents (DERA). DERA is a paradigm made possible by the increased convers…

    Submitted 29 March, 2023; originally announced March 2023.

  17. arXiv:2303.10216  [pdf, other]

    cs.LG math.PR

    Approximation of group explainers with coalition structure using Monte Carlo sampling on the product space of coalitions and features

    Authors: Konstandinos Kotsiopoulos, Alexey Miroshnikov, Khashayar Filom, Arjun Ravi Kannan

    Abstract: In recent years, many Machine Learning (ML) explanation techniques have been designed using ideas from cooperative game theory. These game-theoretic explainers suffer from high complexity, hindering their exact computation in practical settings. In our work, we focus on a wide class of linear game values, as well as coalitional values, for the marginal game based on a given ML model and predictor…

    Submitted 18 April, 2024; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: 31 pages, 6 figures

  18. arXiv:2302.12862  [pdf, other]

    cs.LG cs.DC

    FLINT: A Platform for Federated Learning Integration

    Authors: Ewen Wang, Ajay Kannan, Yuefeng Liang, Boyi Chen, Mosharaf Chowdhury

    Abstract: Cross-device federated learning (FL) has been well-studied from algorithmic, system scalability, and training speed perspectives. Nonetheless, moving from centralized training to cross-device FL for millions or billions of devices presents many risks, including performance loss, developer inertia, poor user experience, and unexpected application failures. In addition, the corresponding infrastruct…

    Submitted 10 March, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

    Comments: Preprint for MLSys 2023

    MSC Class: F.2.2; I.2.7

  19. On marginal feature attributions of tree-based models

    Authors: Khashayar Filom, Alexey Miroshnikov, Konstandinos Kotsiopoulos, Arjun Ravi Kannan

    Abstract: Due to their power and ease of use, tree-based machine learning models, such as random forests and gradient-boosted tree ensembles, have become very popular. To interpret them, local feature attributions based on marginal expectations, e.g. marginal (interventional) Shapley, Owen or Banzhaf values, may be employed. Such methods are true to the model and implementation invariant, i.e. dependent onl…

    Submitted 5 May, 2024; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: Minor corrections. 30 pages+appendix (64 pages in total), 10 figures. To appear in Foundations of Data Science

    MSC Class: Primary: 68T01; 91A12; 91A80; 05A19; Secondary: 91A68; 91A06; 05C05

  20. arXiv:2301.12282  [pdf, other]

    math.OC

    Benefits of Multiobjective Learning in Solar Energy Prediction

    Authors: Aswin Kannan

    Abstract: While the space of renewable energy forecasting has received significant attention in the last decade, literature has primarily focused on machine learning models that train on only one objective at a time. A host of classification (and regression) tasks in energy markets lead to highly imbalanced training data. Say, to balance reserves, it is natural for market regulators to have a choice to be m…

    Submitted 28 January, 2023; originally announced January 2023.

    Comments: 8 pages

    Journal ref: Proceedings of the AI2SE, AAAI, 2023

  21. arXiv:2211.02102  [pdf, other]

    eess.SP

    Beyond Codebook-Based Analog Beamforming at mmWave: Compressed Sensing and Machine Learning Methods

    Authors: Hamed Pezeshki, Fabio Valerio Massoli, Arash Behboodi, Taesang Yoo, Arumugam Kannan, Mahmoud Taherzadeh Boroujeni, Qiaoyu Li, Tao Luo, Joseph B. Soriaga

    Abstract: Analog beamforming is the predominant approach for millimeter wave (mmWave) communication given its favorable characteristics for limited-resource devices. In this work, we aim at reducing the spectral efficiency gap between analog and digital beamforming methods. We propose a method for refined beam selection based on the estimated raw channel. The channel estimation, an underdetermined problem,…

    Submitted 3 November, 2022; originally announced November 2022.

  22. arXiv:2210.02658  [pdf, other]

    cs.CL

    Learning functional sections in medical conversations: iterative pseudo-labeling and human-in-the-loop approach

    Authors: Mengqian Wang, Ilya Valmianski, Xavier Amatriain, Anitha Kannan

    Abstract: Medical conversations between patients and medical professionals have implicit functional sections, such as "history taking", "summarization", "education", and "care plan." In this work, we are interested in learning to automatically extract these sections. A direct approach would require collecting large amounts of expert annotations for this task, which is inherently costly due to the contextual…

    Submitted 7 October, 2022; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Changed the github link as it was invalid

  23. arXiv:2207.05817  [pdf, other]

    cs.CL cs.AI cs.LG

    OSLAT: Open Set Label Attention Transformer for Medical Entity Retrieval and Span Extraction

    Authors: Raymond Li, Ilya Valmianski, Li Deng, Xavier Amatriain, Anitha Kannan

    Abstract: Medical entity span extraction and linking are critical steps for many healthcare NLP tasks. Most existing entity extraction methods either have a fixed vocabulary of medical entities or require span annotations. In this paper, we propose a method for linking an open set of entities that does not require any span annotations. Our method, Open Set Label Attention Transformer (OSLAT), uses the label…

    Submitted 20 November, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

    Comments: 18 pages, 2 figures, Camera-Ready for ML4H 2022 (Proceedings Track)

  24. arXiv:2202.11043  [pdf, other]

    stat.ML cs.CR cs.LG econ.EM

    Differentially Private Estimation of Heterogeneous Causal Effects

    Authors: Fengshi Niu, Harsha Nori, Brian Quistorff, Rich Caruana, Donald Ngwe, Aadharsh Kannan

    Abstract: Estimating heterogeneous treatment effects in domains such as healthcare or social science often involves sensitive data where protecting privacy is important. We introduce a general meta-algorithm for estimating conditional average treatment effects (CATE) with differential privacy (DP) guarantees. Our meta-algorithm can work with simple, single-stage CATE estimators such as S-learner and more co…

    Submitted 22 February, 2022; originally announced February 2022.

  25. Representation Stability and Finite Orthogonal Groups

    Authors: Zifan Wang, Arun S. Kannan

    Abstract: In this paper, we prove stability results about orthogonal groups over finite commutative rings where 2 is a unit. Inspired by Putman and Sam (2017), we construct a category $\mathbf{OrI}(R)$ and prove a Noetherianity theorem for the category of $\mathbf{OrI}(R)$-modules. This implies an asymptotic structure theorem for orthogonal groups. In addition, we show general homological stability theorems…

    Submitted 20 February, 2022; originally announced February 2022.

    Comments: 21 pages, 0 figures

    MSC Class: 16P40; 18A25; 18Gxx; 20J05

  26. arXiv:2112.01467  [pdf, ps, other]

    math.RT math.GR math.RA

    Stable Centres II: Finite Classical Groups

    Authors: Arun S. Kannan, Christopher Ryba

    Abstract: Farahat and Higman constructed an algebra $\mathrm{FH}$ interpolating the centres of symmetric group algebras $Z(\mathbb{Z}S_n)$ by proving that the structure constants in these rings are "polynomial in $n$". Inspired by a construction of $\mathrm{FH}$ due to Ivanov and Kerov, we prove for $G_n = GL_n, U_n, Sp_{2n}, O_n$, that the structure constants of $Z(\mathbb{Z}G_n(\mathbb{F}_q))$ are "polyno…

    Submitted 2 December, 2021; originally announced December 2021.

    Comments: 38 pages

  27. arXiv:2111.11259  [pdf, other]

    cs.LG math.PR

    Model-agnostic bias mitigation methods with regressor distribution control for Wasserstein-based fairness metrics

    Authors: Alexey Miroshnikov, Konstandinos Kotsiopoulos, Ryan Franks, Arjun Ravi Kannan

    Abstract: This article is a companion paper to our earlier work Miroshnikov et al. (2021) on fairness interpretability, which introduces bias explanations. In the current work, we propose a bias mitigation methodology based upon the construction of post-processed models with fairer regressor distributions for Wasserstein-based fairness metrics. By identifying the list of predictors contributing the most to…

    Submitted 19 November, 2021; originally announced November 2021.

    Comments: 29 pages, 32 figures

    MSC Class: 49Q22; 91A12; 68T01

  28. arXiv:2111.09381  [pdf, other]

    cs.CL cs.AI cs.LG

    MEDCOD: A Medically-Accurate, Emotive, Diverse, and Controllable Dialog System

    Authors: Rhys Compton, Ilya Valmianski, Li Deng, Costa Huang, Namit Katariya, Xavier Amatriain, Anitha Kannan

    Abstract: We present MEDCOD, a Medically-Accurate, Emotive, Diverse, and Controllable Dialog system with a unique approach to the natural language generator module. MEDCOD has been developed and evaluated specifically for the history taking task. It integrates the advantage of a traditional modular approach to incorporate (medical) domain knowledge with modern deep learning techniques to generate flexible,…

    Submitted 17 November, 2021; originally announced November 2021.

    Comments: 9 pages. Accepted at Machine Learning for Health (ML4H) 2021

  29. arXiv:2111.07564  [pdf, other]

    cs.CL cs.AI cs.LG

    Adding more data does not always help: A study in medical conversation summarization with PEGASUS

    Authors: Varun Nair, Namit Katariya, Xavier Amatriain, Ilya Valmianski, Anitha Kannan

    Abstract: Medical conversation summarization is integral in capturing information gathered during interactions between patients and physicians. Summarized conversations are used to facilitate patient hand-offs between physicians, and as part of providing care in the future. Summaries, however, can be time-consuming to produce and require domain expertise. Modern pre-trained NLP models such as PEGASUS have e…

    Submitted 28 November, 2021; v1 submitted 15 November, 2021; originally announced November 2021.

    Comments: Accepted to Machine Learning for Healthcare Workshop, NeurIPS 2021

  30. arXiv:2110.07356  [pdf, other]

    cs.CL cs.AI cs.LG

    Medically Aware GPT-3 as a Data Generator for Medical Dialogue Summarization

    Authors: Bharath Chintagunta, Namit Katariya, Xavier Amatriain, Anitha Kannan

    Abstract: In medical dialogue summarization, summaries must be coherent and must capture all the medically relevant information in the dialogue. However, learning effective models for summarization require large amounts of labeled data which is especially hard to obtain. We present an algorithm to create synthetic training data with an explicit focus on capturing medically relevant information. We utilize G…

    Submitted 9 September, 2021; originally announced October 2021.

    Comments: Accepted to Machine learning for healthcare 2021

  31. New Constructions of Exceptional Simple Lie Superalgebras with Integer Cartan Matrix in Characteristics 3 and 5 via Tensor Categories

    Authors: Arun S. Kannan

    Abstract: Using tensor categories, we present new constructions of several of the exceptional simple Lie superalgebras with integer Cartan matrix in characteristic $p = 3$ and $p = 5$ from the complete classification of modular Lie superalgebras with indecomposable Cartan matrix and their simple subquotients over algebraically closed fields by Bouarroudj, Grozman, and Leites in 2009. Specifically, let…

    Submitted 16 May, 2022; v1 submitted 12 August, 2021; originally announced August 2021.

  32. arXiv:2104.12950  [pdf, other]

    cs.AI cs.CL

    Document Structure aware Relational Graph Convolutional Networks for Ontology Population

    Authors: Abhay M Shalghar, Ayush Kumar, Balaji Ganesan, Aswin Kannan, Akshay Parekh, Shobha G

    Abstract: Ontologies comprising of concepts, their attributes, and relationships are used in many knowledge based AI systems. While there have been efforts towards populating domain specific ontologies, we examine the role of document structure in learning ontological relationships between concepts in any document corpus. Inspired by ideas from hypernym discovery and explainability, our method performs abou…

    Submitted 12 April, 2022; v1 submitted 26 April, 2021; originally announced April 2021.

    Comments: 8 pages single column, 5 figures. DLG4NLP Workshop at ICLR 2022

  33. arXiv:2104.04487  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Language model fusion for streaming end to end speech recognition

    Authors: Rodrigo Cabrera, Xiaofeng Liu, Mohammadreza Ghodsi, Zebulun Matteson, Eugene Weinstein, Anjuli Kannan

    Abstract: Streaming processing of speech audio is required for many contemporary practical speech recognition tasks. Even with the large corpora of manually transcribed speech data available today, it is impossible for such corpora to cover adequately the long tail of linguistic content that's important for tasks such as open-ended dictation and voice search. We seek to address both the streaming and the ta…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: 5 pages

  34. arXiv:2103.04878  [pdf, ps, other]

    math.QA math.CT math.RT

    Lectures on Symmetric Tensor Categories

    Authors: Pavel Etingof, Arun S. Kannan

    Abstract: This is an expanded version of the notes by the second author of the lectures on symmetric tensor categories given by the first author at Ohio State University in March 2019 and later at ICRA-2020 in November 2020. We review some aspects of the current state of the theory of symmetric tensor categories and discuss their applications, including ones unavailable in the literature.

    Submitted 10 November, 2021; v1 submitted 8 March, 2021; originally announced March 2021.

    Comments: 34 pages, latex; v2 discusses results of the new paper [CEO], and derives stronger corollaries in the appendix

  35. arXiv:2102.10878  [pdf, other]

    cs.GT math.PR

    Stability theory of game-theoretic group feature explanations for machine learning models

    Authors: Alexey Miroshnikov, Konstandinos Kotsiopoulos, Khashayar Filom, Arjun Ravi Kannan

    Abstract: In this article, we study feature attributions of Machine Learning (ML) models originating from linear game values and coalitional values defined as operators on appropriate functional spaces. The main focus is on random games based on the conditional and marginal expectations. The first part of our work formulates a stability theory for these explanation operators by establishing certain bounds f…

    Submitted 3 April, 2024; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: 76 pages, 41 figures. Major revision. The title has been changed

    MSC Class: 91A06; 91A12; 91A80; 46N30; 46N99; 68T01

  36. arXiv:2011.06874  [pdf, other]

    cs.CL cs.LG

    Medical symptom recognition from patient text: An active learning approach for long-tailed multilabel distributions

    Authors: Ali Mottaghi, Prathusha K Sarma, Xavier Amatriain, Serena Yeung, Anitha Kannan

    Abstract: We study the problem of medical symptoms recognition from patient text, for the purposes of gathering pertinent information from the patient (known as history-taking). A typical patient text is often descriptive of the symptoms the patient is experiencing and a single instance of such a text can be "labeled" with multiple symptoms. This makes learning a medical symptoms recognizer challenging on a…

    Submitted 28 March, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

  37. Wasserstein-based fairness interpretability framework for machine learning models

    Authors: Alexey Miroshnikov, Konstandinos Kotsiopoulos, Ryan Franks, Arjun Ravi Kannan

    Abstract: The objective of this article is to introduce a fairness interpretability framework for measuring and explaining the bias in classification and regression models at the level of a distribution. In our work, we measure the model bias across sub-population distributions in the model output using the Wasserstein metric. To properly quantify the contributions of predictors, we take into account the fa…

    Submitted 8 March, 2022; v1 submitted 5 November, 2020; originally announced November 2020.

    Comments: 39 pages. (submitted for publication)

    MSC Class: 49Q22; 91A12; 68T01; 90C08

    Journal ref: Machine Learning Journal (2022), Springer

  38. Simulations of Argon Plasma Decay in a Thermionic Converter

    Authors: R. E. Groenewald, S. Clark, A. Kannan, P. Scherpelz

    Abstract: The dynamics of an argon plasma in the gap of a thermionic diode is investigated using particle-in-cell (PIC) simulations. The time-averaged diode current, as a function of the relative electrical potential between the electrodes, is studied while the plasma density depletes due to recombination on the electrode surfaces. Simulations were performed in both 1D and 2D and significant differences wer…

    Submitted 31 January, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

    Journal ref: Phys. Rev. E 103, 023207 (2021)

  39. arXiv:2009.08666  [pdf, other]

    cs.CL cs.AI cs.LG

    Dr. Summarize: Global Summarization of Medical Dialogue by Exploiting Local Structures

    Authors: Anirudh Joshi, Namit Katariya, Xavier Amatriain, Anitha Kannan

    Abstract: Understanding a medical conversation between a patient and a physician poses a unique natural language understanding challenge since it combines elements of standard open ended conversation with very domain specific elements that require expertise and medical knowledge. Summarization of medical conversations is a particularly important aspect of medical conversation understanding since it addresse…

    Submitted 18 September, 2020; originally announced September 2020.

    Comments: Accepted for publication in Findings of EMNLP at EMNLP 2020

  40. arXiv:2008.13546  [pdf, other]

    cs.IR cs.CL cs.LG

    Effective Transfer Learning for Identifying Similar Questions: Matching User Questions to COVID-19 FAQs

    Authors: Clara H. McCreery, Namit Katariya, Anitha Kannan, Manish Chablani, Xavier Amatriain

    Abstract: People increasingly search online for answers to their medical questions but the rate at which medical questions are asked online significantly exceeds the capacity of qualified people to answer them. This leaves many questions unanswered or inadequately answered. Many of these questions are not unique, and reliable identification of similar questions would enable more efficient and effective ques…

    Submitted 4 August, 2020; originally announced August 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1910.04192

  41. arXiv:2008.03323  [pdf, other]

    cs.AI cs.LG

    COVID-19 in differential diagnosis of online symptom assessments

    Authors: Anitha Kannan, Richard Chen, Vignesh Venkataraman, Geoffrey J. Tso, Xavier Amatriain

    Abstract: The COVID-19 pandemic has magnified an already existing trend of people looking for healthcare solutions online. One class of solutions are symptom checkers, which have become very popular in the context of COVID-19. Traditional symptom checkers, however, are based on manually curated expert systems that are inflexible and hard to modify, especially in a quickly changing situation like the one we…

    Submitted 30 November, 2020; v1 submitted 7 August, 2020; originally announced August 2020.

    Comments: Accepted at the Machine Learning for Health (ML4H) at NeurIPS 2020 - Extended Abstract

  42. arXiv:2006.11965  [pdf, other]

    cond-mat.soft physics.flu-dyn

    Bubble coalescence in worm-like micellar solutions

    Authors: Vineeth Chandran Suja, Aadithya Kannan, Bruce Kubicka, Alex Hadidi, Gerald G. Fuller

    Abstract: Surfactants in aqueous solutions self-assemble in the presence of salt, to form long, flexible worm-like micelles (WLM). WLM solutions exhibit viscoelastic properties and are used in many applications, such as for cosmetic products, drag reduction and hydraulic fracturing. The dynamics of bubbles in WLM solutions are important considerations for the stability of many of these products. In this man…

    Submitted 21 June, 2020; originally announced June 2020.

    Comments: Preprint submitted to Langmuir. 9 pages and 5 figures

  43. Characters for Projective Modules in the BGG Category $\mathcal{O}$ for the Orthosymplectic Lie Superalgebra $\mathfrak{osp}(3|4)$

    Authors: Arun S. Kannan, Honglin Zhu

    Abstract: We determine the Verma multiplicities of standard filtrations of projective modules for integral atypical blocks in the BGG category $\mathcal{O}$ for the orthosymplectic Lie superalgebra $\mathfrak{osp}(3|4)$ by way of translation functors. We then explicitly determine the composition factor multiplicities of Verma modules using BGG reciprocity.

    Submitted 20 November, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: arXiv admin note: text overlap with arXiv:1810.13050

  44. arXiv:2004.09752  [pdf, other]

    physics.flu-dyn

    Symmetry breaking and chaos in evaporation driven Marangoni flows over bubbles

    Authors: Vineeth Chandran Suja, Alex Hadidi, Aadithya Kannan, Gerald G. Fuller

    Abstract: Understanding the dynamics of liquid films that make up bubbles is of practical and fundamental importance. Practically, this understanding is crucial for tuning bubble stability, while fundamentally, thin films are an excellent platform to study 2D flows. Here we study the spatiotemporal film thickness dynamics of bubbles subjected to evaporation driven Marangoni flows. Initially, we demonstrate…

    Submitted 21 April, 2020; originally announced April 2020.

    Comments: Under review in Nature Communications; 11 pages, 6 figures

  45. arXiv:2004.09571  [pdf, other]

    eess.AS cs.SD stat.ML

    Language-agnostic Multilingual Modeling

    Authors: Arindrima Datta, Bhuvana Ramabhadran, Jesse Emond, Anjuli Kannan, Brian Roark

    Abstract: Multilingual Automated Speech Recognition (ASR) systems allow for the joint training of data-rich and data-scarce languages in a single model. This enables data and parameter sharing across languages, which is especially beneficial for the data-scarce languages. However, most state-of-the-art multilingual models require the encoding of language information and therefore are not as flexible or scal…

    Submitted 20 April, 2020; originally announced April 2020.

  46. arXiv:2003.12710  [pdf, other]

    cs.CL cs.LG cs.SD

    A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

    Authors: Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, et al. (4 additional authors not shown)

    Abstract: Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking. In this paper, we develop a first-pass Recurrent Neural Network Transducer (RNN-T) model and a second-pass Listen, Attend, Spell (LAS) rescorer that…

    Submitted 1 May, 2020; v1 submitted 28 March, 2020; originally announced March 2020.

    Comments: In Proceedings of IEEE ICASSP 2020

  47. arXiv:1912.08041  [pdf, other]

    cs.LG stat.ML

    The accuracy vs. coverage trade-off in patient-facing diagnosis models

    Authors: Anitha Kannan, Jason Alan Fries, Eric Kramer, Jen Jen Chen, Nigam Shah, Xavier Amatriain

    Abstract: A third of adults in America use the Internet to diagnose medical concerns, and online symptom checkers are increasingly part of this process. These tools are powered by diagnosis models similar to clinical decision support systems, with the primary difference being the coverage of symptoms and diagnoses. To be useful to patients and physicians, these models must have high accuracy while covering…

    Submitted 11 December, 2019; originally announced December 2019.

  48. arXiv:1911.08554  [pdf, other]

    cs.CL cs.AI cs.LG

    Classification as Decoder: Trading Flexibility for Control in Medical Dialogue

    Authors: Sam Shleifer, Manish Chablani, Anitha Kannan, Namit Katariya, Xavier Amatriain

    Abstract: Generative seq2seq dialogue systems are trained to predict the next word in dialogues that have already occurred. They can learn from large unlabeled conversation datasets, build a deeper understanding of conversational context, and generate a wide variety of responses. This flexibility comes at the cost of control, a concerning tradeoff in doctor/patient interactions. Inaccuracies, typos, or unde…

    Submitted 15 November, 2019; originally announced November 2019.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract. arXiv admin note: substantial text overlap with arXiv:1910.03476

  49. arXiv:1911.05531  [pdf, other]

    q-bio.BM cs.LG stat.ML

    Accurate Protein Structure Prediction by Embeddings and Deep Learning Representations

    Authors: Iddo Drori, Darshan Thaker, Arjun Srivatsa, Daniel Jeong, Yueqi Wang, Linyong Nan, Fan Wu, Dimitri Leggas, Jinhao Lei, Weiyi Lu, Weilong Fu, Yuan Gao, Sashank Karri, Anand Kannan, Antonio Moretti, Mohammed AlQuraishi, Chen Keasar, Itsik Pe'er

    Abstract: Proteins are the major building blocks of life, and actuators of almost all chemical and biophysical events in living organisms. Their native structures in turn enable their biological functions which have a fundamental role in drug design. This motivates predicting the structure of a protein from its sequence of amino acids, a fundamental problem in computational biology. In this work, we demonst…

    Submitted 8 November, 2019; originally announced November 2019.

    Journal ref: Machine Learning in Computational Biology, 2019

  50. arXiv:1911.02242  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    A comparison of end-to-end models for long-form speech recognition

    Authors: Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara Sainath, Yonghui Wu

    Abstract: End-to-end automatic speech recognition (ASR) models, including both attention-based models and the recurrent neural network transducer (RNN-T), have shown superior performance compared to conventional systems. However, previous studies have focused primarily on short utterances that typically last for just a few seconds or, at most, a few tens of seconds. Whether such architectures are practical…

    Submitted 6 November, 2019; originally announced November 2019.

    Comments: ASRU camera-ready version