
Showing 1–50 of 155 results for author: Khanna, S

  1. arXiv:2406.13573  [pdf, other]

    cs.DS

    Improved Bounds for Fully Dynamic Matching via Ordered Ruzsa-Szemerédi Graphs

    Authors: Sepehr Assadi, Sanjeev Khanna

    Abstract: In a very recent breakthrough, Behnezhad and Ghafari [arXiv'24] developed a novel fully dynamic randomized algorithm for maintaining a $(1-ε)$-approximation of maximum matching with amortized update time potentially much better than the trivial $O(n)$ update time. The runtime of the BG algorithm is parameterized via the following graph theoretical concept: * For any $n$, define $ORS(n)$ -- stand…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 24 pages, 2 figures

  2. arXiv:2406.10973  [pdf, other]

    cs.CV cs.AI

    ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts

    Authors: Samar Khanna, Medhanie Irgau, David B. Lobell, Stefano Ermon

    Abstract: Parameter-efficient fine-tuning (PEFT) techniques such as low-rank adaptation (LoRA) can effectively adapt large pre-trained foundation models to downstream tasks using only a small fraction (0.1%-10%) of the original trainable weights. An under-explored question of PEFT is in extending the pre-training phase without supervised labels; that is, can we adapt a pre-trained foundation model to a new…

    Submitted 16 June, 2024; originally announced June 2024.

  3. arXiv:2406.09127  [pdf, other]

    astro-ph.GA

    The Milky Way as Seen by Classical Cepheids II: Spiral Structure

    Authors: Ronald Drimmel, Shourya Khanna, Eloisa Poggio, Dorota M. Skowron

    Abstract: As a relatively young and bright population, and the archetype of standard candles, classical Cepheids make an ideal population to trace non-axisymmetric structure in the young stellar disk to large distances. We use the new distances derived in Paper I based on mid-IR WISE photometry for a selected sample of 2857 dynamically young Cepheids to trace the spiral arms of the Milky Way. The Perseus an…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: (10 pages, 9 figures, submitted)

  4. arXiv:2406.09113  [pdf, other]

    astro-ph.GA astro-ph.SR

    The Milky Way as seen by Classical Cepheids I: distances based on mid-infrared photometry

    Authors: Dorota M. Skowron, Ronald Drimmel, Shourya Khanna, Alessandro Spagna, Eloisa Poggio, Pau Ramos

    Abstract: Classical Cepheids are the archetype of the standard candle, thanks to the period-luminosity relation, which allows their intrinsic brightness to be measured. They are also relatively young and bright, potentially making them excellent tracers of the young stellar population that is responsible for shaping the visible aspect of our Galaxy, the Milky Way. However, being observers embedded in the dusty i…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 21 pages, 21 figures, 3 tables; submitted to ApJS

  5. arXiv:2405.20861  [pdf, other]

    cs.DS

    Maximum Bipartite Matching in $n^{2+o(1)}$ Time via a Combinatorial Algorithm

    Authors: Julia Chuzhoy, Sanjeev Khanna

    Abstract: Maximum bipartite matching (MBM) is a fundamental problem in combinatorial optimization with a long and rich history. A classic result of Hopcroft and Karp (1973) provides an $O(m \sqrt{n})$-time algorithm for the problem, where $n$ and $m$ are the number of vertices and edges in the input graph, respectively. For dense graphs, an approach based on fast matrix multiplication achieves a running tim…

    Submitted 31 May, 2024; originally announced May 2024.

  6. arXiv:2405.15843  [pdf, other]

    cs.CV cs.AI

    SpotNet: An Image Centric, Lidar Anchored Approach To Long Range Perception

    Authors: Louis Foucard, Samar Khanna, Yi Shi, Chi-Kuei Liu, Quinn Z Shen, Thuyen Ngo, Zi-Xiang Xia

    Abstract: In this paper, we propose SpotNet: a fast, single stage, image-centric but LiDAR anchored approach for long range 3D object detection. We demonstrate that our approach to LiDAR/image sensor fusion, combined with the joint learning of 2D and 3D detection tasks, can lead to accurate 3D object detection with very sparse LiDAR support. Unlike more recent bird's-eye-view (BEV) sensor-fusion methods whi…

    Submitted 24 May, 2024; originally announced May 2024.

  7. arXiv:2405.09594  [pdf, other]

    eess.IV cs.CV cs.LG

    Learning Generalized Medical Image Representations through Image-Graph Contrastive Pretraining

    Authors: Sameer Khanna, Daniel Michael, Marinka Zitnik, Pranav Rajpurkar

    Abstract: Medical image interpretation using deep learning has shown promise but often requires extensive expert-annotated datasets. To reduce this annotation burden, we develop an Image-Graph Contrastive Learning framework that pairs chest X-rays with structured report knowledge graphs automatically extracted from radiology notes. Our approach uniquely encodes the disconnected graph components via a relati…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted into Machine Learning for Health (ML4H) 2023

  8. arXiv:2405.01585  [pdf, other]

    cs.AI cs.CL cs.IR

    Tabular Embedding Model (TEM): Finetuning Embedding Models For Tabular RAG Applications

    Authors: Sujit Khanna, Shishir Subedi

    Abstract: In recent times, Large Language Models have exhibited tremendous capabilities, especially in the areas of mathematics, code generation and general-purpose reasoning. However, for specialized domains, especially in applications that require parsing and analyzing large chunks of numeric or tabular data, even state-of-the-art (SOTA) models struggle. In this paper, we introduce a new approach to solving d…

    Submitted 28 April, 2024; originally announced May 2024.

    Comments: 11 pages, 5 figures

  9. arXiv:2404.18283  [pdf, other]

    cond-mat.mtrl-sci physics.app-ph

    Fast \textit{ab initio} design of high-entropy magnetic thin films

    Authors: Dinesh Bista, Willie B. Beeson, Turbasu Sengupta, Jerome Jackson, Shiv N Khanna, Kai Liu, Gen Yin

    Abstract: We show that the magnetic properties of high-entropy alloys (HEAs) can be captured by \textit{ab initio} calculations within the coherent potential approximation, where the atomic details of the high-entropy mixing are considered as an effective medium that possesses the translational symmetry of the lattice. This is demonstrated using the face-centered cubic (FCC) phase of $\textrm{FeCoNiMnCu}$ a…

    Submitted 28 April, 2024; originally announced April 2024.

  10. arXiv:2404.14127  [pdf, other]

    astro-ph.GA

    Gaia DR3 detectability of unresolved binary systems

    Authors: Alfred Castro-Ginard, Zephyr Penoyre, Andrew R. Casey, Anthony G. A. Brown, Vasily Belokurov, Tristan Cantat-Gaudin, Ronald Drimmel, Morgan Fouesneau, Shourya Khanna, Evgeny P. Kurbatov, Adrian M. Price-Whelan, Hans-Walter Rix, Richard L. Smart

    Abstract: Gaia cannot individually resolve very close binary systems; however, the collected data can still be used to identify them. A powerful indicator of stellar multiplicity is the source's reported Renormalized Unit Weight Error (ruwe), which effectively captures the astrometric deviations from single-source solutions. We aim to characterise the imprints left on ruwe caused by binarity. By flagging po…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 11 pages, 10 figures. Accepted for publication in A&A

  11. arXiv:2404.10486  [pdf, other]

    astro-ph.GA astro-ph.SR

    Discovery of a dormant 33 solar-mass black hole in pre-release Gaia astrometry

    Authors: Gaia Collaboration, P. Panuzzo, T. Mazeh, F. Arenou, B. Holl, E. Caffau, A. Jorissen, C. Babusiaux, P. Gavras, J. Sahlmann, U. Bastian, Ł. Wyrzykowski, L. Eyer, N. Leclerc, N. Bauchet, A. Bombrun, N. Mowlavi, G. M. Seabroke, D. Teyssier, E. Balbinot, A. Helmi, A. G. A. Brown, A. Vallenari, T. Prusti, J. H. J. de Bruijne, et al. (390 additional authors not shown)

    Abstract: Gravitational waves from black-hole merging events have revealed a population of extra-galactic BHs residing in short-period binaries with masses that are higher than expected based on most stellar evolution models - and also higher than known stellar-origin black holes in our Galaxy. It has been proposed that those high-mass BHs are the remnants of massive metal-poor stars. Gaia astrometry is exp…

    Submitted 19 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: 23 pages, accepted for publication in A&A Letters. New version with small fixes

  12. arXiv:2404.06327  [pdf, other]

    cs.DS

    Characterizations of Sparsifiability for Affine CSPs and Symmetric CSPs

    Authors: Sanjeev Khanna, Aaron L. Putterman, Madhu Sudan

    Abstract: CSP sparsification, introduced by Kogan and Krauthgamer (ITCS 2015), considers the following question: when can an instance of a constraint satisfaction problem be sparsified (by retaining a weighted subset of the constraints) while still roughly capturing the weight of constraints satisfied by {\em every} assignment. CSP sparsification generalizes and abstracts other commonly studied problems inc…

    Submitted 9 April, 2024; originally announced April 2024.

  13. arXiv:2403.08613  [pdf, other]

    cs.SI cs.AI cs.LG

    Link Prediction for Social Networks using Representation Learning and Heuristic-based Features

    Authors: Samarth Khanna, Sree Bhattacharyya, Sudipto Ghosh, Kushagra Agarwal, Asit Kumar Das

    Abstract: The exponential growth in scale and relevance of social networks enables them to provide expansive insights. Predicting missing links in social networks efficiently can help in various modern-day business applications ranging from generating recommendations to influence analysis. Several categories of solutions exist for the same. Here, we explore various feature extraction techniques to generate r…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted to the MAISoN Workshop at IJCAI 2023

  14. Parallel Approximate Maximum Flows in Near-Linear Work and Polylogarithmic Depth

    Authors: Arpit Agarwal, Sanjeev Khanna, Huan Li, Prathamesh Patil, Chen Wang, Nathan White, Peilin Zhong

    Abstract: We present a parallel algorithm for the $(1-ε)$-approximate maximum flow problem in capacitated, undirected graphs with $n$ vertices and $m$ edges, achieving $O(ε^{-3}\text{polylog} n)$ depth and $O(m ε^{-3} \text{polylog} n)$ work in the PRAM model. Although near-linear time sequential algorithms for this problem have been known for almost a decade, no parallel algorithms that simultaneously achi…

    Submitted 22 February, 2024; originally announced February 2024.

  15. arXiv:2402.13151  [pdf, other]

    cs.DS

    Almost-Tight Bounds on Preserving Cuts in Classes of Submodular Hypergraphs

    Authors: Sanjeev Khanna, Aaron L. Putterman, Madhu Sudan

    Abstract: Recently, a number of variants of the notion of cut-preserving hypergraph sparsification have been studied in the literature. These variants include directed hypergraph sparsification, submodular hypergraph sparsification, general notions of approximation including spectral approximations, and more general notions like sketching that can answer cut queries using more general data structures than j…

    Submitted 20 February, 2024; originally announced February 2024.

  16. arXiv:2402.13140  [pdf, ps, other]

    quant-ph

    Comment on a no-go theorem for $ψ$-ontic models

    Authors: Laurens Walleghem, Shashaank Khanna, Rutvij Bhavsar

    Abstract: In a recent paper [Carcassi, Oldofredi and Aidala, Found Phys 54, 14 (2024)] it is claimed that the whole Harrigan--Spekkens framework of ontological models is inconsistent with quantum theory. They show this by showing that all pure quantum states in $ψ$-ontic models must be orthogonal. In this note, we identify some crucial mistakes in their argument to the extent that the main claim is incorrec…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 5 pages including references. All comments welcome!

  17. arXiv:2402.02680  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Large Language Models are Geographically Biased

    Authors: Rohin Manvi, Samar Khanna, Marshall Burke, David Lobell, Stefano Ermon

    Abstract: Large Language Models (LLMs) inherently carry the biases contained in their training corpora, which can lead to the perpetuation of societal harm. As the impact of these foundation models grows, understanding and evaluating their biases becomes crucial to achieving fairness and accuracy. We propose to study what LLMs know about the world we live in through the lens of geography. This approach is p…

    Submitted 4 February, 2024; originally announced February 2024.

  18. arXiv:2401.18059  [pdf, other]

    cs.CL cs.LG

    RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

    Authors: Parth Sarthi, Salman Abdullah, Aditi Tuli, Shubh Khanna, Anna Goldie, Christopher D. Manning

    Abstract: Retrieval-augmented language models can better adapt to changes in world state and incorporate long-tail knowledge. However, most existing methods retrieve only short contiguous chunks from a retrieval corpus, limiting holistic understanding of the overall document context. We introduce the novel approach of recursively embedding, clustering, and summarizing chunks of text, constructing a tree wit…

    Submitted 31 January, 2024; originally announced January 2024.

  19. arXiv:2401.05023  [pdf, other]

    astro-ph.GA astro-ph.IM

    Uniting Gaia and APOGEE to unveil the cosmic chemistry of the Milky Way disc

    Authors: Tristan Cantat-Gaudin, Morgan Fouesneau, Hans-Walter Rix, Anthony G. A. Brown, Ronald Drimmel, Alfred Castro-Ginard, Shourya Khanna, Vasily Belokurov, Andrew R. Casey

    Abstract: The spatial distribution of Galactic stars with different chemical abundances encodes information on the processes that drove the formation and evolution of the Milky Way. Survey selection functions are indispensable for analysing astronomical catalogues produced by large-scale surveys. The use of these selection functions in data modelling is more complex when data from different surveys are to b…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 21 pages, 18 figures, accepted for publication in Astronomy & Astrophysics

  20. arXiv:2312.12584  [pdf, ps, other]

    cs.DS

    A Faster Combinatorial Algorithm for Maximum Bipartite Matching

    Authors: Julia Chuzhoy, Sanjeev Khanna

    Abstract: The maximum bipartite matching problem is among the most fundamental and well-studied problems in combinatorial optimization. A beautiful and celebrated combinatorial algorithm of Hopcroft and Karp (1973) shows that maximum bipartite matching can be solved in $O(m \sqrt{n})$ time on a graph with $n$ vertices and $m$ edges. For the case of very dense graphs, a fast matrix multiplication based appro…

    Submitted 19 December, 2023; originally announced December 2023.

  21. arXiv:2312.03606  [pdf, other]

    cs.CV cs.AI cs.LG

    DiffusionSat: A Generative Foundation Model for Satellite Imagery

    Authors: Samar Khanna, Patrick Liu, Linqi Zhou, Chenlin Meng, Robin Rombach, Marshall Burke, David Lobell, Stefano Ermon

    Abstract: Diffusion models have achieved state-of-the-art results on many modalities including images, speech, and video. However, existing models are not tailored to support remote sensing data, which is widely used in important applications including environmental monitoring and crop-yield prediction. Satellite images are significantly different from natural images -- they can be multi-spectral, irregular…

    Submitted 25 May, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: Published at ICLR 2024

  22. arXiv:2311.16654  [pdf]

    cs.LG

    Elucidating Discrepancy in Explanations of Predictive Models Developed using EMR

    Authors: Aida Brankovic, Wenjie Huang, David Cook, Sankalp Khanna, Konstanty Bialkowski

    Abstract: The lack of transparency and explainability hinders the clinical adoption of Machine learning (ML) algorithms. While explainable artificial intelligence (XAI) methods have been proposed, little research has focused on the agreement between these methods and expert clinical knowledge. This study applies current state-of-the-art explainability methods to clinical decision support algorithms develope…

    Submitted 28 November, 2023; originally announced November 2023.

  23. arXiv:2311.00788  [pdf, other]

    cs.DS

    Code Sparsification and its Applications

    Authors: Sanjeev Khanna, Aaron L Putterman, Madhu Sudan

    Abstract: We introduce a notion of code sparsification that generalizes the notion of cut sparsification in graphs. For a (linear) code $\mathcal{C} \subseteq \mathbb{F}_q^n$ of dimension $k$ a $(1 \pm ε)$-sparsification of size $s$ is given by a weighted set $S \subseteq [n]$ with $|S| \leq s$ such that for every codeword $c \in \mathcal{C}$ the projection $c|_S$ of $c$ to the set $S$ has (weighted) hammin…

    Submitted 1 November, 2023; originally announced November 2023.

  24. arXiv:2310.14573  [pdf, other]

    cs.CL

    Exploring the Boundaries of GPT-4 in Radiology

    Authors: Qianchu Liu, Stephanie Hyland, Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Maria Teodora Wetscherek, Robert Tinn, Harshita Sharma, Fernando Pérez-García, Anton Schwaighofer, Pranav Rajpurkar, Sameer Tajdin Khanna, Hoifung Poon, Naoto Usuyama, Anja Thieme, Aditya V. Nori, Matthew P. Lungren, Ozan Oktay, Javier Alvarez-Valle

    Abstract: The recent success of general-domain large language models (LLMs) has significantly changed the natural language processing paradigm towards a unified foundation model across domains and applications. In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-s…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 main

  25. arXiv:2310.06551  [pdf, other]

    astro-ph.SR astro-ph.GA

    Gaia Focused Product Release: Sources from Service Interface Function image analysis -- Half a million new sources in omega Centauri

    Authors: Gaia Collaboration, K. Weingrill, A. Mints, J. Castañeda, Z. Kostrzewa-Rutkowska, M. Davidson, F. De Angeli, J. Hernández, F. Torra, M. Ramos-Lerate, C. Babusiaux, M. Biermann, C. Crowley, D. W. Evans, L. Lindegren, J. M. Martín-Fleitas, L. Palaversa, D. Ruz Mieres, K. Tisanić, A. G. A. Brown, A. Vallenari, T. Prusti, J. H. J. de Bruijne, F. Arenou, A. Barbier, et al. (378 additional authors not shown)

    Abstract: Gaia's readout window strategy is challenged by very dense fields in the sky. Therefore, in addition to standard Gaia observations, full Sky Mapper (SM) images were recorded for nine selected regions in the sky. A new software pipeline exploits these Service Interface Function (SIF) images of crowded fields (CFs), making use of the availability of the full two-dimensional (2D) information. This ne…

    Submitted 8 November, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

    Journal ref: A&A 680, A35 (2023)

  26. arXiv:2310.06295  [pdf, other]

    astro-ph.GA astro-ph.CO astro-ph.IM

    Gaia Focused Product Release: A catalogue of sources around quasars to search for strongly lensed quasars

    Authors: Gaia Collaboration, A. Krone-Martins, C. Ducourant, L. Galluccio, L. Delchambre, I. Oreshina-Slezak, R. Teixeira, J. Braine, J. -F. Le Campion, F. Mignard, W. Roux, A. Blazere, L. Pegoraro, A. G. A. Brown, A. Vallenari, T. Prusti, J. H. J. de Bruijne, F. Arenou, C. Babusiaux, A. Barbier, M. Biermann, O. L. Creevey, D. W. Evans, L. Eyer, R. Guerra, et al. (376 additional authors not shown)

    Abstract: Context. Strongly lensed quasars are fundamental sources for cosmology. The Gaia space mission covers the entire sky with the unprecedented resolution of $0.18$" in the optical, making it an ideal instrument to search for gravitational lenses down to the limiting magnitude of 21. Nevertheless, the previous Gaia Data Releases are known to be incomplete for small angular separations such as those ex…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: 35 pages, 60 figures, accepted for publication by Astronomy and Astrophysics

    Journal ref: A&A 685, A130 (2024)

  27. arXiv:2310.06213  [pdf, other]

    cs.CL cs.LG

    GeoLLM: Extracting Geospatial Knowledge from Large Language Models

    Authors: Rohin Manvi, Samar Khanna, Gengchen Mai, Marshall Burke, David Lobell, Stefano Ermon

    Abstract: The application of machine learning (ML) in a range of geospatial tasks is increasingly common but often relies on globally available covariates such as satellite imagery that can either be expensive or lack predictive power. Here we explore the question of whether the vast amounts of knowledge found in Internet language corpora, now compressed within large language models (LLMs), can be leveraged…

    Submitted 24 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted to ICLR 2024

  28. arXiv:2310.06051  [pdf, other]

    astro-ph.SR

    Gaia Focused Product Release: Radial velocity time series of long-period variables

    Authors: Gaia Collaboration, M. Trabucchi, N. Mowlavi, T. Lebzelter, I. Lecoeur-Taibi, M. Audard, L. Eyer, P. García-Lario, P. Gavras, B. Holl, G. Jevardat de Fombelle, K. Nienartowicz, L. Rimoldini, P. Sartoretti, R. Blomme, Y. Frémat, O. Marchal, Y. Damerdji, A. G. A. Brown, A. Guerrier, P. Panuzzo, D. Katz, G. M. Seabroke, K. Benson, et al. (382 additional authors not shown)

    Abstract: The third Gaia Data Release (DR3) provided photometric time series of more than 2 million long-period variable (LPV) candidates. Anticipating the publication of full radial-velocity (RV) in DR4, this Focused Product Release (FPR) provides RV time series for a selection of LPVs with high-quality observations. We describe the production and content of the Gaia catalog of LPV RV time series, and the…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: 36 pages, 38 figures

  29. arXiv:2309.16948  [pdf, other]

    cs.CV cs.AI

    Denoising Diffusion Bridge Models

    Authors: Linqi Zhou, Aaron Lou, Samar Khanna, Stefano Ermon

    Abstract: Diffusion models are powerful generative models that map noise to data using stochastic processes. However, for many applications such as image editing, the model input comes from a distribution that is not random noise. As such, diffusion models must rely on cumbersome methods like guidance or projected sampling to incorporate this information in the generative process. In our work, we propose De…

    Submitted 5 December, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: Github: https://github.com/alexzhou907/DDBM/

  30. arXiv:2309.10403  [pdf, other]

    cs.SI physics.soc-ph

    INDoRI: Indian Dataset of Recipes and Ingredients and its Ingredient Network

    Authors: Sandeep Khanna, Chiranjoy Chattopadhyay, Suman Kundu

    Abstract: Exploring and comprehending the culinary heritage of a nation holds a captivating allure. It offers insights into the structure and qualities of its cuisine. The endeavor becomes more accessible with the availability of a well-organized dataset. In this paper, we present the introduction of INDoRI (Indian Dataset of Recipes and Ingredients), a compilation drawn from seven distinct online platforms…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: 11 pages, 4 figures, 3 tables

  31. arXiv:2308.13957  [pdf, other]

    cs.CV cs.AI cs.LG

    Differentiable Weight Masks for Domain Transfer

    Authors: Samar Khanna, Skanda Vaidyanath, Akash Velu

    Abstract: One of the major drawbacks of deep learning models for computer vision has been their inability to retain multiple sources of information in a modular fashion. For instance, given a network that has been trained on a source task, we would like to re-train this network on a similar, yet different, target task while maintaining its performance on the source task. Simultaneously, researchers have ext…

    Submitted 7 October, 2023; v1 submitted 26 August, 2023; originally announced August 2023.

    Comments: Published in Out of Distribution Generalization in Computer Vision (OOD-CV) workshop at ICCV 2023

  32. arXiv:2308.05046  [pdf, other]

    cs.CL cs.LG

    RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction

    Authors: Sameer Khanna, Adam Dejl, Kibo Yoon, Quoc Hung Truong, Hanh Duong, Agustina Saenz, Pranav Rajpurkar

    Abstract: We present RadGraph2, a novel dataset for extracting information from radiology reports that focuses on capturing changes in disease state and device placement over time. We introduce a hierarchical schema that organizes entities based on their relationships and show that using this hierarchy during training improves the performance of an information extraction model. Specifically, we propose a mo…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted at Machine Learning for Healthcare 2023

  33. arXiv:2308.02380  [pdf, other]

    quant-ph math.ST stat.ML

    Classifying Causal Structures: Ascertaining when Classical Correlations are Constrained by Inequalities

    Authors: Shashaank Khanna, Marina Maciel Ansanelli, Matthew F. Pusey, Elie Wolfe

    Abstract: The classical causal relations between a set of variables, some observed and some latent, can induce both equality constraints (typically conditional independences) as well as inequality constraints (Instrumental and Bell inequalities being prototypical examples) on their compatible distribution over the observed variables. Enumerating a causal structure's implied inequality constraints is general…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: 37+12 pages, 13 figures, 4 tables

    Journal ref: Phys. Rev. Research 6, 023038 (2024)

  34. arXiv:2307.10573  [pdf, other]

    cs.AI

    Invalid Logic, Equivalent Gains: The Bizarreness of Reasoning in Language Model Prompting

    Authors: Rylan Schaeffer, Kateryna Pistunova, Samar Khanna, Sarthak Consul, Sanmi Koyejo

    Abstract: Language models can be prompted to reason through problems in a manner that significantly improves performance. However, \textit{why} such prompting improves performance is unclear. Recent work showed that using logically \textit{invalid} Chain-of-Thought (CoT) prompting improves performance almost as much as logically \textit{valid} CoT prompting, and that editing CoT prompts to replace problem-s…

    Submitted 22 July, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: ICML 2023 Workshop: Knowledge and Logical Reasoning in the Era of Data-driven Learning

  35. arXiv:2307.09834  [pdf, other]

    cs.IR

    Who Provides the Largest Megaphone? The Role of Google News in Promoting Russian State-Affiliated News Sources

    Authors: Keeley Erhardt, Saurabh Khanna

    Abstract: The Internet has not only digitized but also democratized information access across the globe. This gradual but path-breaking move to online information propagation has resulted in search engines playing an increasingly prominent role in shaping access to human knowledge. When an Internet user enters a query, the search engine sorts through the hundreds of billions of possible webpages to determin…

    Submitted 19 July, 2023; originally announced July 2023.

    Journal ref: The 9th International Conference on Computational Social Science (IC2S2). 2023

  36. arXiv:2306.11985  [pdf, other]

    cs.LG cs.CY

    Evaluation of Popular XAI Applied to Clinical Prediction Models: Can They be Trusted?

    Authors: Aida Brankovic, David Cook, Jessica Rahman, Wenjie Huang, Sankalp Khanna

    Abstract: The absence of transparency and explainability hinders the clinical adoption of Machine learning (ML) algorithms. Although various methods of explainable artificial intelligence (XAI) have been suggested, there is a lack of literature that delves into their practicality and assesses them based on criteria that could foster trust in clinical environments. To address this gap this study evaluates tw…

    Submitted 20 June, 2023; originally announced June 2023.

  37. Estimating the selection function of Gaia DR3 sub-samples

    Authors: A. Castro-Ginard, A. G. A. Brown, Z. Kostrzewa-Rutkowska, T. Cantat-Gaudin, R. Drimmel, S. Oh, V. Belokurov, A. R. Casey, M. Fouesneau, S. Khanna, A. M. Price-Whelan, H. W. Rix

    Abstract: Understanding which sources are present in an astronomical catalogue and which are not is crucial for the accurate interpretation of astronomical data. In particular, for the multidimensional Gaia data, filters and cuts on different parameters or measurements introduce a selection function that may unintentionally alter scientific conclusions in subtle ways. We aim to develop a methodology to est…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: 13 pages, 12 figures. Submitted to A&A

    Journal ref: A&A 677, A37 (2023)

  38. arXiv:2212.00265  [pdf, other]

    cs.CL cs.LG

    PIZZA: A new benchmark for complex end-to-end task-oriented parsing

    Authors: Konstantine Arkoudas, Nicolas Guenon des Mesnards, Melanie Rubino, Sandesh Swamy, Saarthak Khanna, Weiqi Sun, Khan Haidar

    Abstract: Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate. This paper continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semant…

    Submitted 30 November, 2022; originally announced December 2022.

    Comments: Accepted for publication at AMLC 2022

  39. arXiv:2211.03893  [pdf, other]

    cs.DS

    Query Complexity of the Metric Steiner Tree Problem

    Authors: Yu Chen, Sanjeev Khanna, Zihan Tan

    Abstract: We study the query complexity of the metric Steiner Tree problem, where we are given an $n \times n$ metric on a set $V$ of vertices along with a set $T \subseteq V$ of $k$ terminals, and the goal is to find a tree of minimum cost that contains all terminals in $T$. The query complexity for the related minimum spanning tree (MST) problem is well-understood: for any fixed $\varepsilon > 0$, one can…

    Submitted 7 November, 2022; originally announced November 2022.

  40. arXiv:2209.07729  [pdf, ps, other]

    cs.DS

    On Weighted Graph Sparsification by Linear Sketching

    Authors: Yu Chen, Sanjeev Khanna, Huan Li

    Abstract: A seminal work of [Ahn-Guha-McGregor, PODS'12] showed that one can compute a cut sparsifier of an unweighted undirected graph by taking a near-linear number of linear measurements on the graph. Subsequent works also studied computing other graph sparsifiers using linear sketching, and obtained near-linear upper bounds for spectral sparsifiers [Kapralov-Lee-Musco-Musco-Sidford, FOCS'14] and first n…

    Submitted 16 September, 2022; originally announced September 2022.

  41. arXiv:2208.09335  [pdf, other]

    astro-ph.GA astro-ph.IM

    An empirical model of the Gaia DR3 selection function

    Authors: Tristan Cantat-Gaudin, Morgan Fouesneau, Hans-Walter Rix, Anthony G. A. Brown, Alfred Castro-Ginard, Ronald Drimmel, David W. Hogg, Andrew R. Casey, Shourya Khanna, Semyeong Oh, Adrian M. Price Whelan, Vasily Belokurov, Andrew K. Saydjari, Gregory M. Green

    Abstract: Interpreting and modelling astronomical catalogues requires an understanding of the catalogues' completeness or selection function: objects of what properties had a chance to end up in the catalogue. Here we set out to empirically quantify the completeness of the overall Gaia DR3 catalogue. This task is not straightforward because Gaia is the all-sky optical survey with the highest angular resolut… ▽ More

    Submitted 6 September, 2022; v1 submitted 19 August, 2022; originally announced August 2022.

    Comments: submitted to A&A

    Journal ref: A&A 669, A55 (2023)

  42. Gaia Data Release 3: Summary of the content and survey properties

    Authors: Gaia Collaboration, A. Vallenari, A. G. A. Brown, T. Prusti, J. H. J. de Bruijne, F. Arenou, C. Babusiaux, M. Biermann, O. L. Creevey, C. Ducourant, D. W. Evans, L. Eyer, R. Guerra, A. Hutton, C. Jordi, S. A. Klioner, U. L. Lammers, L. Lindegren, X. Luri, F. Mignard, C. Panem, D. Pourbaix, S. Randich, P. Sartoretti, C. Soubiran , et al. (431 additional authors not shown)

    Abstract: We present the third data release of the European Space Agency's Gaia mission, GDR3. The GDR3 catalogue is the outcome of the processing of raw data collected with the Gaia instruments during the first 34 months of the mission by the Gaia Data Processing and Analysis Consortium. The GDR3 catalogue contains the same source list, celestial positions, proper motions, parallaxes, and broad band photom… ▽ More

    Submitted 30 July, 2022; originally announced August 2022.

    Comments: 23 pages, 2 figures

  43. A new resonance-like feature in the outer disc of the Milky Way

    Authors: Ronald Drimmel, Shourya Khanna, Elena D'Onghia, Thorsten Tepper-García, Joss Bland-Hawthorn, Laurent Chemin, Vincenzo Ripepi, Mercé Romero-Gómez, Pau Ramos, Eloisa Poggio, Rene Andrae, Ronny Blomme, Tristan Cantat-Gaudin, Alfred Castro-Ginard, Gisella Clementini, Francesca Figueras, Yves Frémat, Morgan Fouesneau, Alex Lobel, Douglas Marshall, Tatiana Muraveva

    Abstract: Modern astrometric and spectroscopic surveys have revealed a wealth of structure in the phase space of stars in the Milky Way, with evidence of resonance features and non-equilibrium processes. Using Gaia's third data release, we present evidence of a new resonance-like feature in the outer disc of the Milky Way. The feature is most evident in the angular momentum distribution of the young Classic… ▽ More

    Submitted 20 January, 2023; v1 submitted 26 July, 2022; originally announced July 2022.

    Comments: 13 pages, 11 figures, in press for A&A

    Journal ref: A&A 670, A10 (2023)

  44. arXiv:2207.09354  [pdf, ps, other]

    cs.DS

    On Regularity Lemma and Barriers in Streaming and Dynamic Matching

    Authors: Sepehr Assadi, Soheil Behnezhad, Sanjeev Khanna, Huan Li

    Abstract: We present a new approach for finding matchings in dense graphs by building on Szemerédi's celebrated Regularity Lemma. This allows us to obtain non-trivial albeit slight improvements over longstanding bounds for matchings in streaming and dynamic graphs. In particular, we establish the following results for $n$-vertex graphs: * A deterministic single-pass streaming algorithm that finds a… ▽ More

    Submitted 19 July, 2022; originally announced July 2022.

  45. arXiv:2207.08051  [pdf, other]

    cs.CV cs.AI

    SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery

    Authors: Yezhen Cong, Samar Khanna, Chenlin Meng, Patrick Liu, Erik Rozi, Yutong He, Marshall Burke, David B. Lobell, Stefano Ermon

    Abstract: Unsupervised pre-training methods for large vision models have been shown to enhance performance on downstream supervised tasks. Developing similar techniques for satellite imagery presents significant opportunities as unlabelled data is plentiful and the inherent temporal and multi-spectral structure provides avenues to further improve existing pre-training strategies. In this paper, we present SatMAE… ▽ More

    Submitted 15 January, 2023; v1 submitted 16 July, 2022; originally announced July 2022.

    Comments: Published at NeurIPS 2022. The first two listed names contributed equally to this project

  46. Gaia Data Release 3: Reflectance spectra of Solar System small bodies

    Authors: Gaia Collaboration, L. Galluccio, M. Delbo, F. De Angeli, T. Pauwels, P. Tanga, F. Mignard, A. Cellino, A. G. A. Brown, K. Muinonen, A. Penttila, S. Jordan, A. Vallenari, T. Prusti, J. H. J. de Bruijne, F. Arenou, C. Babusiaux, M. Biermann, O. L. Creevey, C. Ducourant, D. W. Evans, L. Eyer, R. Guerra, A. Hutton, C. Jordi , et al. (422 additional authors not shown)

    Abstract: The Gaia mission of the European Space Agency (ESA) has been routinely observing Solar System objects (SSOs) since the beginning of its operations in August 2014. The Gaia data release three (DR3) includes, for the first time, the mean reflectance spectra of a selected sample of 60 518 SSOs, primarily asteroids, observed between August 5, 2014, and May 28, 2017. Each reflectance spectrum was deriv… ▽ More

    Submitted 24 June, 2022; originally announced June 2022.

    Comments: 30 pages, 26 figures

  47. arXiv:2206.07633  [pdf, other]

    cs.DS cs.LG

    Sublinear Algorithms for Hierarchical Clustering

    Authors: Arpit Agarwal, Sanjeev Khanna, Huan Li, Prathamesh Patil

    Abstract: Hierarchical clustering over graphs is a fundamental task in data mining and machine learning with applications in domains such as phylogenetics, social network analysis, and information retrieval. Specifically, we consider the recently popularized objective function for hierarchical clustering due to Dasgupta. Previous algorithms for (approximately) minimizing this objective function require line… ▽ More

    Submitted 15 June, 2022; originally announced June 2022.

  48. arXiv:2206.06693  [pdf, other]

    astro-ph.IM astro-ph.EP astro-ph.GA astro-ph.HE astro-ph.SR

    ET White Paper: To Find the First Earth 2.0

    Authors: Jian Ge, Hui Zhang, Weicheng Zang, Hongping Deng, Shude Mao, Ji-Wei Xie, Hui-Gen Liu, Ji-Lin Zhou, Kevin Willis, Chelsea Huang, Steve B. Howell, Fabo Feng, Jiapeng Zhu, Xinyu Yao, Beibei Liu, Masataka Aizawa, Wei Zhu, Ya-Ping Li, Bo Ma, Quanzhi Ye, Jie Yu, Maosheng Xiang, Cong Yu, Shangfei Liu, Ming Yang , et al. (142 additional authors not shown)

    Abstract: We propose to develop a wide-field and ultra-high-precision photometric survey mission, temporarily named "Earth 2.0 (ET)". This mission is designed to measure, for the first time, the occurrence rate and the orbital distributions of Earth-sized planets. ET consists of seven 30cm telescopes, to be launched to the Earth-Sun's L2 point. Six of these are transit telescopes with a field of view of 500… ▽ More

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: 116 pages, 79 figures

  49. Gaia Data Release 3: Mapping the asymmetric disc of the Milky Way

    Authors: Gaia Collaboration, R. Drimmel, M. Romero-Gomez, L. Chemin, P. Ramos, E. Poggio, V. Ripepi, R. Andrae, R. Blomme, T. Cantat-Gaudin, A. Castro-Ginard, G. Clementini, F. Figueras, M. Fouesneau, Y. Fremat, K. Jardine, S. Khanna, A. Lobel, D. J. Marshall, T. Muraveva, A. G. A. Brown, A. Vallenari, T. Prusti, J. H. J. de Bruijne, F. Arenou , et al. (431 additional authors not shown)

    Abstract: With the most recent Gaia data release the number of sources with complete 6D phase space information (position and velocity) has increased to well over 33 million stars, while stellar astrophysical parameters are provided for more than 470 million sources, in addition to the identification of over 11 million variable stars. Using the astrophysical parameters and variability classifications provid… ▽ More

    Submitted 5 August, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: 35 pages, 27 figures, accepted for publication in A&A special Gaia DR3 issue. V2: abstract completed. V3: complete author list and link to data: https://drive.google.com/drive/u/1/folders/1yOJPjYmM7QK5XVsqaiSOTuwDQNti2LlZ

    Journal ref: A&A 674, A37 (2023)

  50. Gaia Data Release 3: Pulsations in main sequence OBAF-type stars

    Authors: Gaia Collaboration, J. De Ridder, V. Ripepi, C. Aerts, L. Palaversa, L. Eyer, B. Holl, M. Audard, L. Rimoldini, A. G. A. Brown, A. Vallenari, T. Prusti, J. H. J. de Bruijne, F. Arenou, C. Babusiaux, M. Biermann, O. L. Creevey, C. Ducourant, D. W. Evans, R. Guerra, A. Hutton, C. Jordi, S. A. Klioner, U. L. Lammers, L. Lindegren , et al. (423 additional authors not shown)

    Abstract: The third Gaia data release provides photometric time series covering 34 months for about 10 million stars. For many of those stars, a characterisation in Fourier space and their variability classification are also provided. This paper focuses on intermediate- to high-mass (IHM) main sequence pulsators (M >= 1.3 Msun) of spectral types O, B, A, or F, known as beta Cep, slowly pulsating B (SPB), del… ▽ More

    Submitted 16 August, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

    Journal ref: A&A 674, A36 (2023)