
Showing 1–50 of 57 results for author: Ramamurthy, K N

  1. arXiv:2405.20163  [pdf, other]

    cs.CL cs.AI

    Reasoning about concepts with LLMs: Inconsistencies abound

    Authors: Rosario Uceda-Sosa, Karthikeyan Natesan Ramamurthy, Maria Chang, Moninder Singh

    Abstract: The ability to summarize and organize knowledge into abstract concepts is key to learning and reasoning. Many industrial applications rely on the consistent and systematic use of concepts, especially when dealing with decision-critical knowledge. However, we demonstrate that, when methodically questioned, large language models (LLMs) often display and demonstrate significant inconsistencies in the…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 15 pages, 5 figures, 3 tables

  2. arXiv:2403.14459  [pdf, other]

    cs.CL cs.AI

    Multi-Level Explanations for Generative Language Models

    Authors: Lucas Monteiro Paes, Dennis Wei, Hyo Jin Do, Hendrik Strobelt, Ronny Luss, Amit Dhurandhar, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Werner Geyer, Soumya Ghosh

    Abstract: Perturbation-based explanation methods such as LIME and SHAP are commonly applied to text classification. This work focuses on their extension to generative language models. To address the challenges of text as output and long text inputs, we propose a general framework called MExGen that can be instantiated with different attribution algorithms. To handle text output, we introduce the notion of s…

    Submitted 21 March, 2024; originally announced March 2024.

  3. arXiv:2403.09704  [pdf, other]

    cs.CL cs.AI cs.LG

    Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

    Authors: Swapnaja Achintalwar, Ioana Baldini, Djallel Bouneffouf, Joan Byamugisha, Maria Chang, Pierre Dognin, Eitan Farchi, Ndivhuwo Makondo, Aleksandra Mojsilovic, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Inkit Padhi, Orna Raz, Jesus Rios, Prasanna Sattigeri, Moninder Singh, Siphiwe Thwala, Rosario A. Uceda-Sosa, Kush R. Varshney

    Abstract: The alignment of large language models is usually done by model providers to add or control behaviors that are common or universally understood across use cases and contexts. In contrast, in this article, we present an approach and architecture that empowers application developers to tune a model to their particular values, social norms, laws and other regulations, and orchestrate between potentia…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 7 pages, 5 figures

  4. arXiv:2402.14860  [pdf, other]

    cs.CL cs.AI cs.LG

    Ranking Large Language Models without Ground Truth

    Authors: Amit Dhurandhar, Rahul Nair, Moninder Singh, Elizabeth Daly, Karthikeyan Natesan Ramamurthy

    Abstract: Evaluation and ranking of large language models (LLMs) has become an important problem with the proliferation of these models and their impact. Evaluation methods either require human responses which are expensive to acquire or use pairs of LLMs to evaluate each other which can be unreliable. In this paper, we provide a novel perspective where, given a dataset of prompts (viz. questions, instructi…

    Submitted 10 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted to ACL 2024

  5. arXiv:2402.11168  [pdf, other]

    cs.LG cs.AI

    Trust Regions for Explanations via Black-Box Probabilistic Certification

    Authors: Amit Dhurandhar, Swagatam Haldar, Dennis Wei, Karthikeyan Natesan Ramamurthy

    Abstract: Given the black box nature of machine learning models, a plethora of explainability methods have been developed to decipher the factors behind individual decisions. In this paper, we introduce a novel problem of black box (probabilistic) explanation certification. We ask the question: Given a black box model with only query access, an explanation for an example and a quality metric (viz. fidelity,…

    Submitted 5 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted to ICML 2024

  6. arXiv:2402.08871  [pdf, other]

    cs.LG stat.ML

    Position: Topological Deep Learning is the New Frontier for Relational Learning

    Authors: Theodore Papamarkou, Tolga Birdal, Michael Bronstein, Gunnar Carlsson, Justin Curry, Yue Gao, Mustafa Hajij, Roland Kwitt, Pietro Liò, Paolo Di Lorenzo, Vasileios Maroulas, Nina Miolane, Farzana Nasrin, Karthikeyan Natesan Ramamurthy, Bastian Rieck, Simone Scardapane, Michael T. Schaub, Petar Veličković, Bei Wang, Yusu Wang, Guo-Wei Wei, Ghada Zamzmi

    Abstract: Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning setting…

    Submitted 30 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  7. arXiv:2402.02441  [pdf, other]

    cs.LG cs.AI cs.MS stat.CO

    TopoX: A Suite of Python Packages for Machine Learning on Topological Domains

    Authors: Mustafa Hajij, Mathilde Papillon, Florian Frantzen, Jens Agerberg, Ibrahem AlJabea, Ruben Ballester, Claudio Battiloro, Guillermo Bernárdez, Tolga Birdal, Aiden Brent, Peter Chin, Sergio Escalera, Simone Fiorellino, Odin Hoff Gardaa, Gurusankar Gopalakrishnan, Devendra Govil, Josef Hoppe, Maneel Reddy Karri, Jude Khouja, Manuel Lecha, Neal Livesay, Jan Meißner, Soham Mukherjee, Alexander Nikitin, Theodore Papamarkou, et al. (18 additional authors not shown)

    Abstract: We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order…

    Submitted 17 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  8. arXiv:2312.11862  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Topo-MLP: A Simplicial Network Without Message Passing

    Authors: Karthikeyan Natesan Ramamurthy, Aldo Guzmán-Sáenz, Mustafa Hajij

    Abstract: Due to their ability to model meaningful higher order relations among a set of entities, higher order network models have emerged recently as a powerful alternative to graph-based network models, which are only capable of modeling binary relationships. The message passing paradigm is still dominantly used to learn representations even for higher order network models. While powerful, message passing ca…

    Submitted 19 December, 2023; originally announced December 2023.

  9. ICML 2023 Topological Deep Learning Challenge: Design and Results

    Authors: Mathilde Papillon, Mustafa Hajij, Helen Jenne, Johan Mathe, Audun Myers, Theodore Papamarkou, Tolga Birdal, Tamal Dey, Tim Doster, Tegan Emerson, Gurusankar Gopalakrishnan, Devendra Govil, Aldo Guzmán-Sáenz, Henry Kvinge, Neal Livesay, Soham Mukherjee, Shreyas N. Samaga, Karthikeyan Natesan Ramamurthy, Maneel Reddy Karri, Paul Rosen, Sophia Sanborn, Robin Walters, Jens Agerberg, Sadrodin Barikbin, Claudio Battiloro, et al. (31 additional authors not shown)

    Abstract: This paper presents the computational challenge on topological deep learning that was hosted within the ICML 2023 Workshop on Topology and Geometry in Machine Learning. The competition asked participants to provide open-source implementations of topological neural networks from the literature by contributing to the Python packages TopoNetX (data processing) and TopoModelX (deep learning). The chal…

    Submitted 18 January, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

  10. arXiv:2305.19466  [pdf, other]

    cs.CL cs.AI cs.LG

    The Impact of Positional Encoding on Length Generalization in Transformers

    Authors: Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy

    Abstract: Length generalization, the ability to generalize from small training context sizes to larger ones, is a critical challenge in the development of Transformer-based language models. Positional encoding (PE) has been identified as a major factor influencing length generalization, but the exact impact of different PE schemes on extrapolation in downstream tasks remains unclear. In this paper, we condu…

    Submitted 6 November, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted at NeurIPS 2023; 15 pages and 22 pages Appendix

  11. arXiv:2302.09190  [pdf, other]

    cs.LG cs.CY

    Function Composition in Trustworthy Machine Learning: Implementation Choices, Insights, and Questions

    Authors: Manish Nagireddy, Moninder Singh, Samuel C. Hoffman, Evaline Ju, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

    Abstract: Ensuring trustworthiness in machine learning (ML) models is a multi-dimensional task. In addition to the traditional notion of predictive performance, other notions such as privacy, fairness, robustness to distribution shift, adversarial robustness, interpretability, explainability, and uncertainty quantification are important considerations to evaluate and improve (if deficient). However, these s…

    Submitted 17 February, 2023; originally announced February 2023.

  12. arXiv:2301.12616  [pdf, other]

    cs.LG stat.ME

    Active Sequential Two-Sample Testing

    Authors: Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha

    Abstract: A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample features) are inexpensive to access, but their group memberships (or labels) are costly. To address the problem, we devise the first active sequential two…

    Submitted 27 June, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

  13. arXiv:2210.06475  [pdf, other]

    cs.LG cs.CL

    Equi-Tuning: Group Equivariant Fine-Tuning of Pretrained Models

    Authors: Sourya Basu, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy, Vijil Chenthamarakshan, Kush R. Varshney, Lav R. Varshney, Payel Das

    Abstract: We introduce equi-tuning, a novel fine-tuning method that transforms (potentially non-equivariant) pretrained models into group equivariant models while incurring minimum $L_2$ loss between the feature representations of the pretrained and the equivariant models. Large pretrained models can be equi-tuned for different groups to satisfy the needs of various downstream tasks. Equi-tuned models benef…

    Submitted 4 February, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Journal ref: AAAI 2023

  14. arXiv:2206.00606  [pdf, other]

    cs.LG cs.CV cs.SI math.AT stat.ML

    Topological Deep Learning: Going Beyond Graph Data

    Authors: Mustafa Hajij, Ghada Zamzmi, Theodore Papamarkou, Nina Miolane, Aldo Guzmán-Sáenz, Karthikeyan Natesan Ramamurthy, Tolga Birdal, Tamal K. Dey, Soham Mukherjee, Shreyas N. Samaga, Neal Livesay, Robin Walters, Paul Rosen, Michael T. Schaub

    Abstract: Topological deep learning is a rapidly growing field that pertains to the development of deep learning models for data supported on topological domains such as simplicial complexes, cell complexes, and hypergraphs, which generalize many domains encountered in scientific computations. In this paper, we present a unifying deep learning framework built upon a richer data structure that includes widel…

    Submitted 19 May, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

  15. arXiv:2202.01153  [pdf, other]

    cs.LG

    Analogies and Feature Attributions for Model Agnostic Explanation of Similarity Learners

    Authors: Karthikeyan Natesan Ramamurthy, Amit Dhurandhar, Dennis Wei, Zaid Bin Tariq

    Abstract: Post-hoc explanations for black box models have been studied extensively in classification and regression settings. However, explanations for models that output similarity between two inputs have received comparatively less attention. In this paper, we provide model agnostic local explanations for similarity learners applicable to tabular and text data. We first propose a method that provides fe…

    Submitted 2 February, 2022; originally announced February 2022.

  16. arXiv:2112.03529  [pdf, ps, other]

    cs.CL

    Ground-Truth, Whose Truth? -- Examining the Challenges with Annotating Toxic Text Datasets

    Authors: Kofi Arhin, Ioana Baldini, Dennis Wei, Karthikeyan Natesan Ramamurthy, Moninder Singh

    Abstract: The use of machine learning (ML)-based language models (LMs) to monitor content online is on the rise. For toxic text identification, task-specific fine-tuning of these models is performed using datasets labeled by annotators who provide ground-truth labels in an effort to distinguish between offensive and normal content. These projects have led to the development, improvement, and expansion of l…

    Submitted 7 December, 2021; originally announced December 2021.

    Comments: 15 pages

  17. arXiv:2111.08861  [pdf, other]

    cs.LG stat.ML

    A label-efficient two-sample test

    Authors: Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

    Abstract: Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are easily measured whereas sample labels are unknown and costly to obtain. Accordingly, we devise a three-stage framework in service of performing an effective two…

    Submitted 19 July, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: Accepted to the 38th conference on Uncertainty in Artificial Intelligence (UAI2022)

  18. arXiv:2110.02491  [pdf, ps, other]

    cs.LG cs.NE math.CT stat.ML

    Data-Centric AI Requires Rethinking Data Notion

    Authors: Mustafa Hajij, Ghada Zamzmi, Karthikeyan Natesan Ramamurthy, Aldo Guzman Saenz

    Abstract: The transition towards data-centric AI requires revisiting data notions from mathematical and implementational standpoints to obtain unified data-centric machine learning packages. Towards this end, this work proposes unifying principles offered by categorical and cochain notions of data, and discusses the importance of these principles in data-centric AI transition. In the categorical notion, dat…

    Submitted 2 December, 2021; v1 submitted 6 October, 2021; originally announced October 2021.

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021), NeurIPS Data-Centric AI Workshop

  19. arXiv:2108.01250  [pdf, other]

    cs.CL cs.LG

    Your fairness may vary: Pretrained language model fairness in toxic text classification

    Authors: Ioana Baldini, Dennis Wei, Karthikeyan Natesan Ramamurthy, Mikhail Yurochkin, Moninder Singh

    Abstract: The popularity of pretrained language models in natural language processing systems calls for a careful evaluation of such models in down-stream tasks, which have a higher potential for societal impact. The evaluation of such systems usually focuses on accuracy measures. Our findings in this paper call for attention to be paid to fairness measures as well. Through the analysis of more than a dozen…

    Submitted 13 April, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: Findings of ACL 2022

  20. arXiv:2106.04464  [pdf, other]

    physics.chem-ph cs.LG math.AT

    Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations

    Authors: Yair Schiff, Vijil Chenthamarakshan, Samuel Hoffman, Karthikeyan Natesan Ramamurthy, Payel Das

    Abstract: Deep generative models have emerged as a powerful tool for learning useful molecular representations and designing novel molecules with desired properties, with applications in drug discovery and material design. However, most existing deep generative models are restricted due to lack of spatial information. Here we propose augmentation of deep generative models with topological data analysis (TDA…

    Submitted 15 February, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: Accepted to ICASSP, 2022

  21. arXiv:2106.01410  [pdf, other]

    cs.AI

    Uncertainty Quantification 360: A Holistic Toolkit for Quantifying and Communicating the Uncertainty of AI

    Authors: Soumya Ghosh, Q. Vera Liao, Karthikeyan Natesan Ramamurthy, Jiri Navratil, Prasanna Sattigeri, Kush R. Varshney, Yunfeng Zhang

    Abstract: In this paper, we describe an open source Python toolkit named Uncertainty Quantification 360 (UQ360) for the uncertainty quantification of AI models. The goal of this toolkit is twofold: first, to provide a broad range of capabilities to streamline as well as foster the common practices of quantifying, evaluating, improving, and communicating uncertainty in the AI application development lifecycl…

    Submitted 3 June, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Added references

  22. arXiv:2011.09645  [pdf, other]

    cs.LG

    Finding the Homology of Decision Boundaries with Active Learning

    Authors: Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

    Abstract: Accurately and efficiently characterizing the decision boundary of classifiers is important for problems related to model selection and meta-learning. Inspired by topological data analysis, the characterization of decision boundaries using their homology has recently emerged as a general and powerful tool. In this paper, we propose an active learning algorithm to recover the homology of decision b…

    Submitted 18 November, 2020; originally announced November 2020.

    Journal ref: Advances in Neural Information Processing Systems 33 (2020)

  23. arXiv:2010.08548  [pdf, other]

    q-bio.BM cs.LG

    Characterizing the Latent Space of Molecular Deep Generative Models with Persistent Homology Metrics

    Authors: Yair Schiff, Vijil Chenthamarakshan, Karthikeyan Natesan Ramamurthy, Payel Das

    Abstract: Deep generative models are increasingly becoming integral parts of the in silico molecule design pipeline and have dual goals of learning the chemical and structural features that render candidate molecules viable while also being flexible enough to generate novel designs. Specifically, Variational Auto Encoders (VAEs) are generative models in which encoder-decoder network pairs are trained to rec…

    Submitted 7 June, 2021; v1 submitted 18 October, 2020; originally announced October 2020.

    Comments: Accepted to and presented as spotlight poster at the Topological Data Analysis and Beyond Workshop at the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

  24. arXiv:2005.00060  [pdf, other]

    cs.LG cs.CV stat.ML

    Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

    Authors: Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, Xue Lin

    Abstract: Mode connectivity provides novel geometric insights on analyzing loss landscapes and enables building high-accuracy pathways between well-trained neural networks. In this work, we propose to employ mode connectivity in loss landscapes to study the adversarial robustness of deep neural networks, and provide novel methods for improving this robustness. Our experiments cover various types of adversar…

    Submitted 2 July, 2020; v1 submitted 30 April, 2020; originally announced May 2020.

    Comments: accepted by ICLR 2020

  25. arXiv:2003.06005  [pdf, other]

    cs.LG cs.AI stat.ML

    Model Agnostic Multilevel Explanations

    Authors: Karthikeyan Natesan Ramamurthy, Bhanukiran Vinzamuri, Yunfeng Zhang, Amit Dhurandhar

    Abstract: In recent years, post-hoc local instance-level and global dataset-level explainability of black-box models has received a lot of attention. Much less attention has been given to obtaining insights at intermediate or group levels, which is a need outlined in recent works that study the challenges in realizing the guidelines in the General Data Protection Regulation (GDPR). In this paper, we propose…

    Submitted 12 March, 2020; originally announced March 2020.

    Comments: 21 pages, 9 figures, 1 table

    Journal ref: NeurIPS 2020

  26. arXiv:1911.07819  [pdf, other]

    cs.CL cs.LG stat.ML

    Drug Repurposing for Cancer: An NLP Approach to Identify Low-Cost Therapies

    Authors: Shivashankar Subramanian, Ioana Baldini, Sushma Ravichandran, Dmitriy A. Katz-Rogozhnikov, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Kush R. Varshney, Annmarie Wang, Pradeep Mangalath, Laura B. Kleiman

    Abstract: More than 200 generic drugs approved by the U.S. Food and Drug Administration for non-cancer indications have shown promise for treating cancer. Due to their long history of safe patient use, low cost, and widespread availability, repurposing of generic drugs represents a major opportunity to rapidly improve outcomes for cancer patients and reduce healthcare costs worldwide. Evidence on the effica…

    Submitted 5 December, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

  27. arXiv:1911.01509  [pdf, ps, other]

    cs.LG cs.CY stat.ML

    Understanding racial bias in health using the Medical Expenditure Panel Survey data

    Authors: Moninder Singh, Karthikeyan Natesan Ramamurthy

    Abstract: Over the years, several studies have demonstrated that there exist significant disparities in health indicators in the United States population across various groups. Healthcare expense is used as a proxy for health in algorithms that drive healthcare systems and this exacerbates the existing bias. In this work, we focus on the presence of racial bias in health indicators in the publicly available…

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: 8 pages, 8 tables

  28. arXiv:1906.02299  [pdf, other]

    cs.LG cs.AI stat.ML

    Teaching AI to Explain its Decisions Using Embeddings and Multi-Task Learning

    Authors: Noel C. F. Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R. Varshney, Dennis Wei, Aleksandra Mojsilović

    Abstract: Using machine learning in high-stakes applications often requires predictions to be accompanied by explanations comprehensible to the domain user, who has ultimate responsibility for decisions and outcomes. Recently, a new framework for providing explanations, called TED, has been proposed to provide meaningful explanations for predictions. This framework augments training data to include explanat…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: presented at 2019 ICML Workshop on Human in the Loop Learning (HILL 2019), Long Beach, USA. arXiv admin note: substantial text overlap with arXiv:1805.11648

  29. arXiv:1906.01769  [pdf, other]

    cs.CV cs.LG math.AT

    PI-Net: A Deep Learning Approach to Extract Topological Persistence Images

    Authors: Anirudh Som, Hongjun Choi, Karthikeyan Natesan Ramamurthy, Matthew Buman, Pavan Turaga

    Abstract: Topological features such as persistence diagrams and their functional approximations like persistence images (PIs) have been showing substantial promise for machine learning and computer vision applications. This is greatly attributed to the robustness topological representations provide against different types of physical nuisance variables seen in real-world data, such as view-point, illuminati…

    Submitted 23 May, 2020; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: 10 pages, 8 figures, 4 tables

  30. arXiv:1906.00066  [pdf, other]

    cs.LG cs.IT stat.ML

    Optimized Score Transformation for Consistent Fair Classification

    Authors: Dennis Wei, Karthikeyan Natesan Ramamurthy, Flavio du Pin Calmon

    Abstract: This paper considers fair probabilistic binary classification where the outputs of primary interest are predicted probabilities, commonly referred to as scores. We formulate the problem of transforming scores to satisfy fairness constraints that are linear in conditional means of scores while minimizing a cross-entropy objective. The formulation can be applied directly to post-process classifier o…

    Submitted 29 October, 2021; v1 submitted 31 May, 2019; originally announced June 2019.

    Comments: 78 pages, 16 figures. Published in Journal of Machine Learning Research. Earlier version published at the 2020 International Conference on Artificial Intelligence and Statistics (AISTATS)

  31. arXiv:1905.13291  [pdf, other]

    cs.CV cs.LG

    Counting and Segmenting Sorghum Heads

    Authors: Min-hwan Oh, Peder Olsen, Karthikeyan Natesan Ramamurthy

    Abstract: Phenotyping is the process of measuring an organism's observable traits. Manual phenotyping of crops is a labor-intensive, time-consuming, costly, and error prone process. Accurate, automated, high-throughput phenotyping can relieve a huge burden in the crop breeding pipeline. In this paper, we propose a scalable, high-throughput approach to automatically count and segment panicles (heads), a key…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: 23 pages, 23 figures, 5 tables

  32. arXiv:1903.07427  [pdf, other]

    cs.CV cs.LG

    Crowd Counting with Decomposed Uncertainty

    Authors: Min-hwan Oh, Peder A. Olsen, Karthikeyan Natesan Ramamurthy

    Abstract: Research in neural networks in the field of computer vision has achieved remarkable accuracy for point estimation. However, the uncertainty in the estimation is rarely addressed. Uncertainty quantification accompanied by point estimation can lead to a more informed decision, and even improve the prediction quality. In this work, we focus on uncertainty estimation in the domain of crowd counting. W…

    Submitted 22 April, 2020; v1 submitted 15 March, 2019; originally announced March 2019.

    Comments: Accepted in AAAI 2020 (Main Technical Track)

  33. arXiv:1812.06135  [pdf, other]

    cs.LG cs.CY stat.ML

    Bias Mitigation Post-processing for Individual and Group Fairness

    Authors: Pranay K. Lohia, Karthikeyan Natesan Ramamurthy, Manish Bhide, Diptikalyan Saha, Kush R. Varshney, Ruchir Puri

    Abstract: Whereas previous post-processing approaches for increasing the fairness of predictions of biased classifiers address only group fairness, we propose a method for increasing both individual and group fairness. Our novel framework includes an individual bias detector used to prioritize data samples in a bias mitigation algorithm aiming to improve the group fairness measure of disparate impact. We sh…

    Submitted 14 December, 2018; originally announced December 2018.

    Comments: 5 pages, 4 figures

  34. arXiv:1811.04896  [pdf, other]

    cs.AI

    TED: Teaching AI to Explain its Decisions

    Authors: Michael Hind, Dennis Wei, Murray Campbell, Noel C. F. Codella, Amit Dhurandhar, Aleksandra Mojsilović, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

    Abstract: Artificial intelligence systems are being increasingly deployed due to their potential to increase the efficiency, scale, consistency, fairness, and accuracy of decisions. However, as many of these systems are opaque in their operation, there is a growing demand for such systems to provide explanations for their decisions. Conventional approaches to this problem attempt to expose or discover the i…

    Submitted 15 June, 2019; v1 submitted 12 November, 2018; originally announced November 2018.

    Comments: This article leverages some content from arXiv:1805.11648; presented at ACM/AAAI AIES'19

  35. arXiv:1810.01943  [pdf, other]

    cs.AI

    AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias

    Authors: Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, Yunfeng Zhang

    Abstract: Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This paper introduces a new open source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license (https://github.com/ibm/aif360). The main objectives of this…

    Submitted 3 October, 2018; originally announced October 2018.

    Comments: 20 pages

  36. arXiv:1808.07261  [pdf, ps, other]

    cs.CY cs.AI

    FactSheets: Increasing Trust in AI Services through Supplier's Declarations of Conformity

    Authors: Matthew Arnold, Rachel K. E. Bellamy, Michael Hind, Stephanie Houde, Sameep Mehta, Aleksandra Mojsilovic, Ravi Nair, Karthikeyan Natesan Ramamurthy, Darrell Reimer, Alexandra Olteanu, David Piorkowski, Jason Tsay, Kush R. Varshney

    Abstract: Accuracy is an important concern for suppliers of artificial intelligence (AI) services, but considerations beyond accuracy, such as safety (which includes fairness and explainability), security, and provenance, are also critical elements to engender consumers' trust in a service. Many industries use transparent, standardized, but often not legally required documents called supplier's declarations…

    Submitted 7 February, 2019; v1 submitted 22 August, 2018; originally announced August 2018.

    Comments: 31 pages

  37. arXiv:1807.10400  [pdf, other]

    cs.CV

    Perturbation Robust Representations of Topological Persistence Diagrams

    Authors: Anirudh Som, Kowshik Thopalli, Karthikeyan Natesan Ramamurthy, Vinay Venkataraman, Ankita Shukla, Pavan Turaga

    Abstract: Topological methods for data analysis present opportunities for enforcing certain invariances of broad interest in computer vision, including view-point in activity analysis, articulation in shape analysis, and measurement invariance in non-linear dynamical modeling. The increasing success of these methods is attributed to the complementary information that topology provides, as well as availabili…

    Submitted 26 July, 2018; originally announced July 2018.

    Comments: 19 pages, 4 figures, 6 tables

  38. arXiv:1805.11648  [pdf, other]

    cs.AI

    Teaching Meaningful Explanations

    Authors: Noel C. F. Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R. Varshney, Dennis Wei, Aleksandra Mojsilovic

    Abstract: The adoption of machine learning in high-stakes applications such as healthcare and law has lagged in part because predictions are not accompanied by explanations comprehensible to the domain user, who often holds the ultimate responsibility for decisions and outcomes. In this paper, we propose an approach to generate such explanations in which training data is augmented to include, in addition to…

    Submitted 10 September, 2018; v1 submitted 29 May, 2018; originally announced May 2018.

    Comments: 9 pages

  39. arXiv:1805.09949  [pdf, other]

    stat.ML cs.LG

    Topological Data Analysis of Decision Boundaries with Application to Model Selection

    Authors: Karthikeyan Natesan Ramamurthy, Kush R. Varshney, Krishnan Mody

    Abstract: We propose the labeled Čech complex, the plain labeled Vietoris-Rips complex, and the locally scaled labeled Vietoris-Rips complex to perform persistent homology inference of decision boundaries in classification tasks. We provide theoretical conditions and analysis for recovering the homology of a decision boundary from samples. Our main objective is quantification of deep neural network complexi…

    Submitted 24 May, 2018; originally announced May 2018.

    Comments: Reproducible software available, 17 pages, 10 figures, 12 tables

  40. arXiv:1804.10961  [pdf, other]

    stat.ML cs.LG

    Simultaneous Parameter Learning and Bi-Clustering for Multi-Response Models

    Authors: Ming Yu, Karthikeyan Natesan Ramamurthy, Addie Thompson, Aurélie Lozano

    Abstract: We consider multi-response and multitask regression models, where the parameter matrix to be estimated is expected to have an unknown grouping structure. The groupings can be along tasks, or features, or both, the last one indicating a bi-cluster or "checkerboard" structure. Discovering this grouping structure along with parameter inference makes sense in several applications, such as multi-respon…

    Submitted 29 April, 2018; originally announced April 2018.

    Comments: 15 pages, 15 figures

  41. arXiv:1712.07106  [pdf, other]

    stat.ML cs.LG

    Exploring High-Dimensional Structure via Axis-Aligned Decomposition of Linear Projections

    Authors: Jayaraman J. Thiagarajan, Shusen Liu, Karthikeyan Natesan Ramamurthy, Peer-Timo Bremer

    Abstract: Two-dimensional embeddings remain the dominant approach to visualize high dimensional data. The choice of embeddings ranges from highly non-linear ones, which can capture complex relationships but are difficult to interpret quantitatively, to axis-aligned projections, which are easy to interpret but are limited to bivariate relationships. Linear projection can be considered as a compromise between co…

    Submitted 19 December, 2017; v1 submitted 19 December, 2017; originally announced December 2017.

  42. arXiv:1711.01514  [pdf, ps, other]

    stat.ML cs.LG

    Distribution-Preserving k-Anonymity

    Authors: Dennis Wei, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

    Abstract: Preserving the privacy of individuals by protecting their sensitive attributes is an important consideration during microdata release. However, it is equally important to preserve the quality or utility of the data for at least some targeted workloads. We propose a novel framework for privacy preservation based on the k-anonymity model that is ideally suited for workloads that require preserving t…

    Submitted 4 November, 2017; originally announced November 2017.

    Comments: Portions of this work were first presented at the 2015 SIAM International Conference on Data Mining

  43. arXiv:1710.06876  [pdf, other]

    cs.CY

    An End-To-End Machine Learning Pipeline That Ensures Fairness Policies

    Authors: Samiulla Shaikh, Harit Vishwakarma, Sameep Mehta, Kush R. Varshney, Karthikeyan Natesan Ramamurthy, Dennis Wei

    Abstract: In consequential real-world applications, machine learning (ML) based systems are expected to provide fair and non-discriminatory decisions on candidates from groups defined by protected attributes such as gender and race. These expectations are set via policies or regulations governing data usage and decision criteria (sometimes explicitly calling out decisions by automated systems). Often, the d…

    Submitted 18 October, 2017; originally announced October 2017.

    Comments: Presented at the Data For Good Exchange 2017

  44. arXiv:1708.07954  [pdf, other]

    cs.CV

    Distributed Bundle Adjustment

    Authors: Karthikeyan Natesan Ramamurthy, Chung-Ching Lin, Aleksandr Aravkin, Sharath Pankanti, Raphael Viguier

    Abstract: Most methods for Bundle Adjustment (BA) in computer vision are either centralized or operate incrementally. This leads to poor scaling and affects the quality of solution as the number of images grows in large scale structure from motion (SfM). Furthermore, they cannot be used in scenarios where image acquisition and processing must be distributed. We address this problem with a new distributed BA…

    Submitted 26 August, 2017; originally announced August 2017.

    Comments: 9 pages

  45. arXiv:1708.00069  [pdf, other]

    stat.ML cs.CV cs.LG

    Learning Robust Representations for Computer Vision

    Authors: Peng Zheng, Aleksandr Y. Aravkin, Karthikeyan Natesan Ramamurthy, Jayaraman Jayaraman Thiagarajan

    Abstract: Unsupervised learning techniques in computer vision often require learning latent representations, such as low-dimensional linear and non-linear subspaces. Noise and outliers in the data can frustrate these approaches by obscuring the latent spaces. Our main goal is deeper understanding and new development of robust approaches for representation learning. We provide a new interpretation for exis…

    Submitted 31 July, 2017; originally announced August 2017.

    Comments: 8 pages, 7 pages

  46. arXiv:1704.03354  [pdf, other]

    stat.ML cs.CY cs.IT

    Optimized Data Pre-Processing for Discrimination Prevention

    Authors: Flavio P. Calmon, Dennis Wei, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

    Abstract: Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling discrimination, limiting distortion in individual data samples, and preserving utility. We characterize the imp…

    Submitted 11 April, 2017; originally announced April 2017.

  47. arXiv:1612.09007  [pdf, ps, other]

    stat.ML cs.LG

    A Deep Learning Approach To Multiple Kernel Fusion

    Authors: Huan Song, Jayaraman J. Thiagarajan, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy, Andreas Spanias

    Abstract: Kernel fusion is a popular and effective approach for combining multiple features that characterize different aspects of data. Traditional approaches for Multiple Kernel Learning (MKL) attempt to learn the parameters for combining the kernels through sophisticated optimization procedures. In this paper, we propose an alternative approach that creates dense embeddings for data using the kernel simi…

    Submitted 28 December, 2016; originally announced December 2016.

  48. arXiv:1611.07429  [pdf, other]

    stat.ML cs.LG

    TreeView: Peeking into Deep Neural Networks Via Feature-Space Partitioning

    Authors: Jayaraman J. Thiagarajan, Bhavya Kailkhura, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy

    Abstract: With the advent of highly predictive but opaque deep learning models, it has become more important than ever to understand and explain the predictions of such models. Existing approaches define interpretability as the inverse of complexity and achieve interpretability at the cost of accuracy. This introduces a risk of producing interpretable but misleading explanations. As humans, we are prone to…

    Submitted 22 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  49. arXiv:1605.08912  [pdf, other]

    math.AT cs.CG cs.CV math.DG math.ST

    A Riemannian Framework for Statistical Analysis of Topological Persistence Diagrams

    Authors: Rushil Anirudh, Vinay Venkataraman, Karthikeyan Natesan Ramamurthy, Pavan Turaga

    Abstract: Topological data analysis is becoming a popular way to study high dimensional feature spaces without any contextual clues or assumptions. This paper concerns itself with one popular topological feature, which is the number of $d$-dimensional holes in the dataset, also known as the Betti-$d$ number. The persistence of the Betti numbers over various scales is encoded into a persistence diagram (PD),…

    Submitted 28 May, 2016; originally announced May 2016.

    Comments: Accepted at DiffCVML 2016 (CVPR 2016 Workshops)

  50. arXiv:1603.05310  [pdf, other]

    cs.CG cs.CV

    Persistent Homology of Attractors For Action Recognition

    Authors: Vinay Venkataraman, Karthikeyan Natesan Ramamurthy, Pavan Turaga

    Abstract: In this paper, we propose a novel framework for dynamical analysis of human actions from 3D motion capture data using topological data analysis. We model human actions using the topological features of the attractor of the dynamical system. We reconstruct the phase-space of time series corresponding to actions using time-delay embedding, and compute the persistent homology of the phase-space recon…

    Submitted 16 March, 2016; originally announced March 2016.

    Comments: 5 pages, Under review in International Conference on Image Processing