-
A Critical Study of What Code-LLMs (Do Not) Learn
Authors:
Abhinav Anand,
Shweta Verma,
Krishna Narasimhan,
Mira Mezini
Abstract:
Large Language Models trained on code corpora (code-LLMs) have demonstrated impressive performance in various coding assistance tasks. However, despite their increased size and training dataset, code-LLMs still have limitations such as suggesting code with syntactic errors, variable misuse, etc. Some studies argue that code-LLMs perform well on coding tasks because they use self-attention and hidden representations to encode relations among input tokens. However, previous works have not studied what code properties are not encoded by code-LLMs. In this paper, we conduct a fine-grained analysis of attention maps and hidden representations of code-LLMs. Our study indicates that code-LLMs only encode relations among specific subsets of input tokens. Specifically, by categorizing input tokens into syntactic tokens and identifiers, we found that models encode relations among syntactic tokens and among identifiers, but they fail to encode relations between syntactic tokens and identifiers. We also found that fine-tuned models encode these relations poorly compared to their pre-trained counterparts. Additionally, larger models with billions of parameters encode significantly less information about code than models with only a few hundred million parameters.
Submitted 17 June, 2024;
originally announced June 2024.
-
No perspective, no perception!! Perspective-aware Healthcare Answer Summarization
Authors:
Gauri Naik,
Sharad Chandakacherla,
Shweta Yadav,
Md. Shad Akhtar
Abstract:
Healthcare Community Question Answering (CQA) forums offer an accessible platform for individuals seeking information on various healthcare-related topics. People find such platforms suitable for self-disclosure, seeking medical opinions, finding simplified explanations for their medical conditions, and answering others' questions. However, answers on these forums are typically diverse and prone to off-topic discussions. It can be challenging for readers to sift through numerous answers and extract meaningful insights, making answer summarization a crucial task for CQA forums. While several efforts have been made to summarize the community answers, most of them are limited to the open domain and overlook the different perspectives offered by these answers. To address this problem, this paper proposes a novel task of perspective-specific answer summarization. We identify various perspectives within healthcare-related responses and frame a perspective-driven abstractive summary covering all responses. To achieve this, we annotate 3167 CQA threads with 6193 perspective-aware summaries in our PUMA dataset. Further, we propose PLASMA, a prompt-driven controllable summarization model. To encapsulate the perspective-specific conditions, we design an energy-controlled loss function for the optimization. We also leverage the prefix tuner to learn the intricacies of healthcare perspective summarization. Our evaluation against five baselines suggests the superior performance of PLASMA by a margin of 1.5-21%. We supplement our experiments with ablation and qualitative analysis.
Submitted 13 June, 2024;
originally announced June 2024.
-
The PLATO Mission
Authors:
Heike Rauer,
Conny Aerts,
Juan Cabrera,
Magali Deleuil,
Anders Erikson,
Laurent Gizon,
Mariejo Goupil,
Ana Heras,
Jose Lorenzo-Alvarez,
Filippo Marliani,
Cesar Martin-Garcia,
J. Miguel Mas-Hesse,
Laurence O'Rourke,
Hugh Osborn,
Isabella Pagano,
Giampaolo Piotto,
Don Pollacco,
Roberto Ragazzoni,
Gavin Ramsay,
Stéphane Udry,
Thierry Appourchaux,
Willy Benz,
Alexis Brandeker,
Manuel Güdel,
Eduardo Janot-Pacheco
, et al. (801 additional authors not shown)
Abstract:
PLATO (PLAnetary Transits and Oscillations of stars) is ESA's M3 mission designed to detect and characterise extrasolar planets and perform asteroseismic monitoring of a large number of stars. PLATO will detect small planets (down to <2 R_(Earth)) around bright stars (<11 mag), including terrestrial planets in the habitable zone of solar-like stars. With the complement of radial velocity observations from the ground, planets will be characterised for their radius, mass, and age with high accuracy (5 %, 10 %, 10 % for an Earth-Sun combination respectively). PLATO will provide us with a large-scale catalogue of well-characterised small planets up to intermediate orbital periods, relevant for a meaningful comparison to planet formation theories and to better understand planet evolution. It will make possible comparative exoplanetology to place our Solar System planets in a broader context. In parallel, PLATO will study (host) stars using asteroseismology, allowing us to determine the stellar properties with high accuracy, substantially enhancing our knowledge of stellar structure and evolution.
The payload instrument consists of 26 cameras with 12 cm aperture each. For at least four years, the mission will perform high-precision photometric measurements. Here we review the science objectives, present PLATO's target samples and fields, provide an overview of expected core science performance as well as a description of the instrument and the mission profile at the beginning of the serial production of the flight cameras. PLATO is scheduled for launch at the end of 2026. This overview therefore provides a summary of the mission to the community in preparation for the upcoming operational phases.
Submitted 8 June, 2024;
originally announced June 2024.
-
Quantum hardware demonstrations of relativistic calculations of molecular electric dipole moments: from light to heavy systems using Variational Quantum Eigensolver
Authors:
Palak Chawla,
Shweta,
K. R. Swain,
Tushti Patel,
Renu Bala,
Disha Shetty,
Kenji Sugisaki,
Sudhindu Bikash Mandal,
Jordi Riu,
Jan Nogue,
V. S. Prasannaa,
B. P. Das
Abstract:
The quantum-classical hybrid Variational Quantum Eigensolver (VQE) algorithm is recognized to be the method of choice to obtain ground state energies of quantum many-body systems in the noisy intermediate scale quantum (NISQ) era. This study not only extends the VQE algorithm to the relativistic regime, but also calculates a property other than energy, namely the molecular permanent electric dipole moment (PDM). We carry out 18-qubit quantum simulations to obtain ground state energies as well as PDMs of single-valence diatomic molecules, ranging from the light BeH to the heavy radioactive RaH molecule. We investigate the correlation trends in these systems as well as assess the precision in our results. Furthermore, we measure the PDM of the moderately heavy SrH and SrF molecules on the optimized unitary coupled cluster state, using the state-of-the-art IonQ Aria-I quantum computer in an active space of 6 qubits. The associated quantum circuits for these computations were extensively optimized in view of limitations imposed by NISQ hardware. To that end, we employ an array of techniques, including the use of point group symmetries, integrating ZX-Calculus into our pipeline-based circuit optimization, and energy sort VQE procedure. Through these methods, we compress our 6-qubit quantum circuit from 280 two-qubit gates to 37 two-qubit gates (with a marginal trade-off of 0.33 and 0.31 percent in the PDM for SrH and SrF in their respective 6-spin orbital active spaces). We anticipate that our proof-of-concept demonstration lays the groundwork for future quantum hardware calculations involving heavy atoms and molecules.
Submitted 7 June, 2024;
originally announced June 2024.
-
Chemical Space-Informed Machine Learning Models for Rapid Predictions of X-ray Photoelectron Spectra of Organic Molecules
Authors:
Susmita Tripathy,
Surajit Das,
Shweta Jindal,
Raghunathan Ramakrishnan
Abstract:
We present machine learning models based on kernel-ridge regression for predicting X-ray photoelectron spectra of organic molecules originating from the $K$-shell ionization energies of carbon (C), nitrogen (N), oxygen (O), and fluorine (F) atoms. We constructed the training dataset through high-throughput calculations of $K$-shell core-electron binding energies (CEBEs) for 12,880 small organic molecules in the bigQM7$ω$ dataset, employing the $Δ$-SCF formalism coupled with meta-GGA-DFT and a variationally converged basis set. The models are cost-effective, as they require the atomic coordinates of a molecule generated using universal force fields while estimating the target-level CEBEs corresponding to DFT-level equilibrium geometry. We explore transfer learning by utilizing the atomic environment feature vectors learned using a graph neural network framework in kernel-ridge regression. Additionally, we enhance accuracy within the $Δ$-machine learning framework by leveraging inexpensive baseline spectra derived from Kohn--Sham eigenvalues. When applied to 208 combinatorially substituted uracil molecules larger than those in the training set, our analyses suggest that the models may not provide quantitatively accurate predictions of CEBEs but offer a strong linear correlation relevant for virtual high-throughput screening. We present the dataset and models as the Python module, ${\tt cebeconf}$, to facilitate further explorations.
Submitted 30 May, 2024;
originally announced May 2024.
-
Bridging eResearch Infrastructure and Experimental Materials Science Process in the Quantum Data Hub
Authors:
Amarnath Gupta,
Shweta Purawat,
Subhasis Dasgupta,
Pratyush Karmakar,
Elaine Chi,
Ilkay Altintas
Abstract:
Experimental materials science is experiencing significant growth due to automated experimentation and AI techniques. Integrated autonomous platforms are emerging, combining generative models, robotics, simulations, and automated systems for material synthesis. However, two major challenges remain: democratizing access to these technologies and creating accessible infrastructure for under-resourced scientists. This paper introduces the Quantum Data Hub (QDH), a community-accessible research infrastructure aimed at researchers working with quantum materials. QDH integrates with the National Data Platform, adhering to FAIR principles while proposing additional UNIT principles for usability, navigability, interpretability, and timeliness. The QDH facilitates collaboration and extensibility, allowing seamless integration of new researchers, instruments, and data into the system.
Submitted 30 May, 2024;
originally announced May 2024.
-
Towards Fairness in Provably Communication-Efficient Federated Recommender Systems
Authors:
Kirandeep Kaur,
Sujit Gujar,
Shweta Jain
Abstract:
To reduce the communication overhead caused by parallel training of multiple clients, various federated learning (FL) techniques use random client sampling. Nonetheless, ensuring the efficacy of random sampling and determining the optimal number of clients to sample in federated recommender systems (FRSs) remains challenging due to the isolated nature of each user as a separate client. This challenge is exacerbated in models where public and private features can be separated, and FL allows communication of only public features (item gradients). In this study, we establish sample complexity bounds that dictate the ideal number of clients required for improved communication efficiency and retained accuracy in such models. In line with our theoretical findings, we empirically demonstrate that RS-FairFRS reduces communication cost (~47%). Second, we demonstrate the presence of class imbalance among clients that raises a substantial equity concern for FRSs. Unlike centralized machine learning, clients in FRS cannot share raw data, including sensitive attributes. For this, we introduce RS-FairFRS, the first fairness-under-unawareness FRS built upon a random-sampling-based FRS. While random sampling improves communication efficiency, we propose a novel two-phase dual-fair update technique to achieve fairness without revealing protected attributes of active clients participating in training. Our results on real-world datasets and different sensitive features illustrate a significant reduction in demographic bias (~40%), offering a promising path to achieving fairness and communication efficiency in FRSs without compromising the overall accuracy of FRS.
Submitted 2 May, 2024;
originally announced May 2024.
-
Aspect-oriented Consumer Health Answer Summarization
Authors:
Rochana Chaturvedi,
Abari Bhattacharya,
Shweta Yadav
Abstract:
Community Question-Answering (CQA) forums have revolutionized how people seek information, especially those related to their healthcare needs, placing their trust in the collective wisdom of the public. However, there can be several answers in response to a single query, which makes it hard to grasp the key information related to the specific health concern. Typically, CQA forums feature a single top-voted answer as a representative summary for each query. However, a single answer overlooks the alternative solutions and other information frequently offered in other responses. Our research focuses on aspect-based summarization of health answers to address this limitation. Summarization of responses under different aspects such as suggestions, information, personal experiences, and questions can enhance the usability of the platforms. We formalize a multi-stage annotation guideline and contribute a unique dataset comprising aspect-based human-written health answer summaries. We build an automated multi-faceted answer summarization pipeline with this dataset based on task-specific fine-tuning of several state-of-the-art models. The pipeline leverages question similarity to retrieve relevant answer sentences, subsequently classifying them into the appropriate aspect type. Following this, we employ several recent abstractive summarization models to generate aspect-based summaries. Finally, we present a comprehensive human analysis and find that our summaries rank high in capturing relevant content and a wide range of solutions.
Submitted 10 May, 2024;
originally announced May 2024.
-
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
Authors:
Mayank Mishra,
Matt Stallone,
Gaoyuan Zhang,
Yikang Shen,
Aditya Prasad,
Adriana Meza Soria,
Michele Merler,
Parameswaran Selvam,
Saptha Surendran,
Shivdeep Singh,
Manish Sethi,
Xuan-Hong Dang,
Pengyuan Li,
Kun-Lung Wu,
Syed Zawad,
Andrew Coleman,
Matthew White,
Mark Lewis,
Raju Pavuluri,
Yan Koyfman,
Boris Lublinsky,
Maximilien de Bayser,
Ibrahim Abdelaziz,
Kinjal Basu,
Mayank Agarwal
, et al. (21 additional authors not shown)
Abstract:
Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabilities, including code generation, fixing bugs, explaining and documenting code, maintaining repositories, and more. In this work, we introduce the Granite series of decoder-only code models for code generative tasks, trained with code written in 116 programming languages. The Granite Code models family consists of models ranging in size from 3 to 34 billion parameters, suitable for applications ranging from complex application modernization tasks to on-device memory-constrained use cases. Evaluation on a comprehensive set of tasks demonstrates that Granite Code models consistently reach state-of-the-art performance among available open-source code LLMs. The Granite Code model family was optimized for enterprise software development workflows and performs well across a range of coding tasks (e.g. code generation, fixing and explanation), making it a versatile all-around code model. We release all our Granite Code models under an Apache 2.0 license for both research and commercial use.
Submitted 7 May, 2024;
originally announced May 2024.
-
Rolling in the Shadows: Analyzing the Extraction of MEV Across Layer-2 Rollups
Authors:
Christof Ferreira Torres,
Albin Mamuti,
Ben Weintraub,
Cristina Nita-Rotaru,
Shweta Shinde
Abstract:
The emergence of decentralized finance has transformed asset trading on the blockchain, making traditional financial instruments more accessible while also introducing a series of exploitative economic practices known as Maximal Extractable Value (MEV). Concurrently, decentralized finance has embraced rollup-based Layer-2 solutions to facilitate asset trading at reduced transaction costs compared to Layer-1 solutions such as Ethereum. However, rollups lack a public mempool like Ethereum, making the extraction of MEV more challenging. In this paper, we investigate the prevalence and impact of MEV on Ethereum and prominent rollups such as Arbitrum, Optimism, and zkSync over a nearly three-year period. Our analysis encompasses various metrics including volume, profits, costs, competition, and response time to MEV opportunities. We discover that MEV is widespread on rollups, with trading volume comparable to Ethereum. We also find that, although MEV costs are lower on rollups, profits are also significantly lower compared to Ethereum. Additionally, we examine the prevalence of sandwich attacks on rollups. While our findings did not detect any sandwiching activity on popular rollups, we did identify the potential for cross-layer sandwich attacks facilitated by transactions that are sent across rollups and Ethereum. Consequently, we propose and evaluate the feasibility of three novel attacks that exploit cross-layer transactions, revealing that attackers could have already earned approximately 2 million USD through cross-layer sandwich attacks.
Submitted 30 April, 2024;
originally announced May 2024.
-
SIGY: Breaking Intel SGX Enclaves with Malicious Exceptions & Signals
Authors:
Supraja Sridhara,
Andrin Bertschi,
Benedict Schlüter,
Shweta Shinde
Abstract:
User programs recover from hardware exceptions and respond to signals by executing custom handlers that they register specifically for such events. We present the SIGY attack, which abuses this programming model on Intel SGX to break the confidentiality and integrity guarantees of enclaves. SIGY uses the untrusted OS to deliver fake hardware events and inject fake signals in an enclave at any point. Such unintended execution of benign program-defined handlers in an enclave corrupts its state and violates execution integrity. 7 runtimes and library OSes (OpenEnclave, Gramine, Scone, Asylo, Teaclave, Occlum, EnclaveOS) are vulnerable to SIGY. 8 languages supported in Intel SGX have programming constructs that are vulnerable to SIGY. We use SIGY to demonstrate 4 proof-of-concept exploits on webservers (Nginx, Node.js) to leak secrets and data analytics workloads in different languages (C and Java) to break execution integrity.
Submitted 22 April, 2024;
originally announced April 2024.
-
Towards Enhancing Health Coaching Dialogue in Low-Resource Settings
Authors:
Yue Zhou,
Barbara Di Eugenio,
Brian Ziebart,
Lisa Sharp,
Bing Liu,
Ben Gerber,
Nikolaos Agadakos,
Shweta Yadav
Abstract:
Health coaching helps patients identify and accomplish lifestyle-related goals, effectively improving the control of chronic diseases and mitigating mental health conditions. However, health coaching is cost-prohibitive due to its highly personalized and labor-intensive nature. In this paper, we propose to build a dialogue system that converses with the patients, helps them create and accomplish specific goals, and can address their emotions with empathy. However, building such a system is challenging since real-world health coaching datasets are limited and empathy is subtle. Thus, we propose a modularized health coaching dialogue system with simplified NLU and NLG frameworks combined with mechanism-conditioned empathetic response generation. Through automatic and human evaluation, we show that our system generates more empathetic, fluent, and coherent responses and outperforms the state-of-the-art in NLU tasks while requiring less annotation. We view our approach as a key step towards building automated and more accessible health coaching systems.
Submitted 12 April, 2024;
originally announced April 2024.
-
WeSee: Using Malicious #VC Interrupts to Break AMD SEV-SNP
Authors:
Benedict Schlüter,
Supraja Sridhara,
Andrin Bertschi,
Shweta Shinde
Abstract:
AMD SEV-SNP offers VM-level trusted execution environments (TEEs) to protect the confidentiality and integrity of sensitive cloud workloads from an untrusted hypervisor controlled by the cloud provider. AMD introduced a new exception, #VC, to facilitate the communication between the VM and the untrusted hypervisor. We present the WeSee attack, where the hypervisor injects malicious #VC into a victim VM's CPU to compromise the security guarantees of AMD SEV-SNP. Specifically, WeSee injects interrupt number 29, which delivers a #VC exception to the VM, which then executes the corresponding handler that performs data and register copies between the VM and the hypervisor. WeSee shows that using well-crafted #VC injections, the attacker can induce arbitrary behavior in the VM. Our case studies demonstrate that WeSee can leak sensitive VM information (kTLS keys for NGINX), corrupt kernel data (firewall rules), and inject arbitrary code (launch a root shell from the kernel space).
Submitted 4 April, 2024;
originally announced April 2024.
-
Heckler: Breaking Confidential VMs with Malicious Interrupts
Authors:
Benedict Schlüter,
Supraja Sridhara,
Mark Kuhne,
Andrin Bertschi,
Shweta Shinde
Abstract:
Hardware-based trusted execution environments (TEEs) offer an isolation granularity of virtual machine abstraction. They provide confidential VMs (CVMs) that host security-sensitive code and data. AMD SEV-SNP and Intel TDX enable CVMs and are now available on popular cloud platforms. The untrusted hypervisor in these settings is in control of several resource management and configuration tasks, including interrupts. We present Heckler, a new attack wherein the hypervisor injects malicious non-timer interrupts to break the confidentiality and integrity of CVMs. Our insight is to use the interrupt handlers that have global effects, such that we can manipulate a CVM's register states to change the data and control flow. With AMD SEV-SNP and Intel TDX, we demonstrate Heckler on OpenSSH and sudo to bypass authentication. On AMD SEV-SNP we break execution integrity of C, Java, and Julia applications that perform statistical and text analysis. We explain the gaps in current defenses and outline guidelines for future defenses.
Submitted 4 April, 2024;
originally announced April 2024.
-
Benchmarking Object Detectors with COCO: A New Path Forward
Authors:
Shweta Singh,
Aayan Yadav,
Jitesh Jain,
Humphrey Shi,
Justin Johnson,
Karan Desai
Abstract:
The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade. Like every dataset, COCO contains subtle errors and imperfections stemming from its annotation procedure. With the advent of high-performing models, we ask whether these errors of COCO are hindering its utility in reliably benchmarking further progress. In search of an answer, we inspect thousands of masks from COCO (2017 version) and uncover different types of errors such as imprecise mask boundaries, non-exhaustively annotated instances, and mislabeled masks. Due to the prevalence of COCO, we choose to correct these errors to maintain continuity with prior research. We develop COCO-ReM (Refined Masks), a cleaner set of annotations with visibly better mask quality than COCO-2017. We evaluate fifty object detectors and find that models that predict visually sharper masks score higher on COCO-ReM, affirming that they were being incorrectly penalized due to errors in COCO-2017. Moreover, our models trained using COCO-ReM converge faster and score higher than their larger variants trained using COCO-2017, highlighting the importance of data quality in improving object detectors. With these findings, we advocate using COCO-ReM for future object detection research. Our dataset is available at https://cocorem.xyz
Submitted 27 March, 2024;
originally announced March 2024.
-
Semi-Analytical Methods for Population Balance models involving Aggregation and Breakage processes: A comparative study
Authors:
Shweta,
Saddam Hussain,
Rajesh Kumar
Abstract:
Population balance models often integrate fundamental kernels, including sum, gelling and Brownian aggregation kernels. These kernels have demonstrated extensive utility across various disciplines such as aerosol physics, chemical engineering, astrophysics, pharmaceutical sciences and mathematical biology for the purpose of elucidating particle dynamics. The objective of this study is to refine the semi-analytical solutions derived from current methodologies in addressing the nonlinear aggregation and coupled aggregation-breakage population balance equations. This work presents a unique semi-analytical approach based on the homotopy analysis method (HAM) to solve pure aggregation and coupled aggregation-fragmentation population balance equations, which are integro-partial differential equations. By decomposing the non-linear operator, we investigate how to utilize the convergence control parameter to expedite the convergence of the HAM solution towards its precise values in the proposed method.
Submitted 18 March, 2024;
originally announced March 2024.
-
Shell-model study of $\log ft$ values for $^{139,140,141}$Ba $\rightarrow$ $^{139,140,141}$La transitions
Authors:
Shweta Sharma,
Praveen C. Srivastava
Abstract:
In the present work, beta-decay properties such as $\log ft$ values and half-lives have been systematically studied corresponding to Ba isotopes using large-scale shell-model calculations. An extensive comparison of beta decay results corresponding to $^{141}$Ba$\rightarrow$ $^{141}$La using shell-model calculations is made with the recently available experimental data. In addition, we have also calculated the nuclear and beta decay properties corresponding to $^{139}$Ba$\rightarrow$ $^{139}$La and $^{140}$Ba$\rightarrow$ $^{140}$La transitions. The model-space considered here is $Z=50-82$ and $N=82-126$ with $^{132}$Sn core, and the interaction employed here is jj56pnb interaction. The beta decay results using shell-model calculations for all the mentioned isotopes are compared with the available experimental data. This is the first theoretical interpretation corresponding to recent experimental data.
Submitted 26 February, 2024;
originally announced February 2024.
-
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model
Authors:
Tanzila Rahman,
Shweta Mahajan,
Hsin-Ying Lee,
Jian Ren,
Sergey Tulyakov,
Leonid Sigal
Abstract:
Text-to-image (TTI) diffusion models have demonstrated impressive results in generating high-resolution images of complex and imaginative scenes. Recent approaches have further extended these methods with personalization techniques that allow them to integrate user-illustrated concepts (e.g., the user him/herself) using a few sample image illustrations. However, the ability to generate images with multiple interacting concepts, such as human subjects, as well as concepts that may be entangled in one, or across multiple, image illustrations remains elusive. In this work, we propose a concept-driven TTI personalization framework that addresses these core challenges. We build on existing works that learn custom tokens for user-illustrated concepts, allowing those to interact with existing text tokens in the TTI model. However, importantly, to disentangle and better learn the concepts in question, we jointly learn (latent) segmentation masks that disentangle these concepts in user-provided image illustrations. We do so by introducing an Expectation Maximization (EM)-like optimization procedure where we alternate between learning the custom tokens and estimating masks encompassing corresponding concepts in user-supplied images. We obtain these masks based on cross-attention, from within the U-Net parameterized latent diffusion model and subsequent Dense CRF optimization. We illustrate that such joint alternating refinement leads to the learning of better tokens for concepts and, as a by-product, latent masks. We illustrate the benefits of the proposed approach qualitatively and quantitatively (through user studies) with a number of examples and use cases that can combine up to three entangled concepts.
Submitted 18 February, 2024;
originally announced February 2024.
-
Fairness of Exposure in Online Restless Multi-armed Bandits
Authors:
Archit Sood,
Shweta Jain,
Sujit Gujar
Abstract:
Restless multi-armed bandits (RMABs) generalize the multi-armed bandits where each arm exhibits Markovian behavior and transitions according to their transition dynamics. Solutions to RMAB exist for both offline and online cases. However, they do not consider the distribution of pulls among the arms. Studies have shown that optimal policies lead to unfairness, where some arms are not exposed enough. Existing works in fairness in RMABs focus heavily on the offline case, which diminishes their application in real-world scenarios where the environment is largely unknown. In the online scenario, we propose the first fair RMAB framework, where each arm receives pulls in proportion to its merit. We define the merit of an arm as a function of its stationary reward distribution. We prove that our algorithm achieves sublinear fairness regret in the single pull case $O(\sqrt{T\ln T})$, with $T$ being the total number of episodes. Empirically, we show that our algorithm performs well in the multi-pull scenario as well.
Submitted 9 February, 2024;
originally announced February 2024.
-
Simultaneously Achieving Group Exposure Fairness and Within-Group Meritocracy in Stochastic Bandits
Authors:
Subham Pokhriyal,
Shweta Jain,
Ganesh Ghalme,
Swapnil Dhamal,
Sujit Gujar
Abstract:
Existing approaches to fairness in stochastic multi-armed bandits (MAB) primarily focus on exposure guarantee to individual arms. When arms are naturally grouped by certain attribute(s), we propose Bi-Level Fairness, which considers two levels of fairness. At the first level, Bi-Level Fairness guarantees a certain minimum exposure to each group. To address the unbalanced allocation of pulls to individual arms within a group, we consider meritocratic fairness at the second level, which ensures that each arm is pulled according to its merit within the group. Our work shows that we can adapt a UCB-based algorithm to achieve a Bi-Level Fairness by providing (i) anytime Group Exposure Fairness guarantees and (ii) ensuring individual-level Meritocratic Fairness within each group. We first show that one can decompose regret bounds into two components: (a) regret due to anytime group exposure fairness and (b) regret due to meritocratic fairness within each group. Our proposed algorithm BF-UCB balances these two regrets optimally to achieve the upper bound of $O(\sqrt{T})$ on regret; $T$ being the stopping time. With the help of simulated experiments, we further show that BF-UCB achieves sub-linear regret; provides better group and individual exposure guarantees compared to existing algorithms; and does not result in a significant drop in reward with respect to UCB algorithm, which does not impose any fairness constraint.
Submitted 8 February, 2024;
originally announced February 2024.
-
Fairness and Privacy Guarantees in Federated Contextual Bandits
Authors:
Sambhav Solanki,
Shweta Jain,
Sujit Gujar
Abstract:
This paper considers the contextual multi-armed bandit (CMAB) problem with fairness and privacy guarantees in a federated environment. We consider merit-based exposure as the desired fair outcome, which provides exposure to each action in proportion to the reward associated. We model the algorithm's effectiveness using fairness regret, which captures the difference between fair optimal policy and the policy output by the algorithm. Applying fair CMAB algorithm to each agent individually leads to fairness regret linear in the number of agents. We propose that collaborative -- federated learning can be more effective and provide the algorithm Fed-FairX-LinUCB that also ensures differential privacy. The primary challenge in extending the existing privacy framework is designing the communication protocol for communicating required information across agents. A naive protocol can either lead to weaker privacy guarantees or higher regret. We design a novel communication protocol that allows for (i) Sub-linear theoretical bounds on fairness regret for Fed-FairX-LinUCB and comparable bounds for the private counterpart, Priv-FairX-LinUCB (relative to single-agent learning), (ii) Effective use of privacy budget in Priv-FairX-LinUCB. We demonstrate the efficacy of our proposed algorithm with extensive simulations-based experiments. We show that both Fed-FairX-LinUCB and Priv-FairX-LinUCB achieve near-optimal fairness regret.
Submitted 5 February, 2024;
originally announced February 2024.
-
Early prediction of onset of sepsis in Clinical Setting
Authors:
Fahim Mohammad,
Lakshmi Arunachalam,
Samanway Sadhu,
Boudewijn Aasman,
Shweta Garg,
Adil Ahmed,
Silvie Colman,
Meena Arunachalam,
Sudhir Kulkarni,
Parsa Mirhaji
Abstract:
This study proposes the use of Machine Learning models to predict the early onset of sepsis using deidentified clinical data from Montefiore Medical Center in Bronx, NY, USA. A supervised learning approach was adopted, wherein an XGBoost model was trained utilizing 80\% of the train dataset, encompassing 107 features (including the original and derived features). Subsequently, the model was evaluated on the remaining 20\% of the test data. The model was validated on prospective data that was entirely unseen during the training phase. To assess the model's performance at the individual patient level and timeliness of the prediction, a normalized utility score was employed, a widely recognized scoring methodology for sepsis detection, as outlined in the PhysioNet Sepsis Challenge paper. Metrics such as F1 Score, Sensitivity, Specificity, and Flag Rate were also devised. The model achieved a normalized utility score of 0.494 on test data and 0.378 on prospective data at threshold 0.3. The F1 scores were 80.8\% and 67.1\% respectively for the test data and the prospective data for the same threshold, highlighting its potential to be integrated into clinical decision-making processes effectively. These results bear testament to the model's robust predictive capabilities and its potential to substantially impact clinical decision-making processes.
Submitted 5 February, 2024;
originally announced February 2024.
-
Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
Authors:
Shweta Mahajan,
Tanzila Rahman,
Kwang Moo Yi,
Leonid Sigal
Abstract:
The quality of the prompts provided to text-to-image diffusion models determines how faithful the generated content is to the user's intent, often requiring `prompt engineering'. To harness visual concepts from target images without prompt engineering, current approaches largely rely on embedding inversion by optimizing and then mapping them to pseudo-tokens. However, working with such high-dimensional vector representations is challenging because they lack semantics and interpretability, and only allow simple vector operations when using them. Instead, this work focuses on inverting the diffusion model to obtain interpretable language prompts directly. The challenge of doing this lies in the fact that the resulting optimization problem is fundamentally discrete and the space of prompts is exponentially large; this makes using standard optimization techniques, such as stochastic gradient descent, difficult. To this end, we utilize a delayed projection scheme to optimize for prompts representative of the vocabulary space in the model. Further, we leverage the findings that different timesteps of the diffusion process cater to different levels of detail in an image. The later, noisy, timesteps of the forward diffusion process correspond to the semantic information, and therefore, prompt inversion in this range provides tokens representative of the image semantics. We show that our approach can identify semantically interpretable and meaningful prompts for a target image which can be used to synthesize diverse images with similar content. We further illustrate the application of the optimized prompts in evolutionary image generation and concept removal.
Submitted 19 December, 2023;
originally announced December 2023.
-
ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models
Authors:
Jeong-gi Kwak,
Erqun Dong,
Yuhe Jin,
Hanseok Ko,
Shweta Mahajan,
Kwang Moo Yi
Abstract:
Generating novel views of an object from a single image is a challenging task. It requires an understanding of the underlying 3D structure of the object from an image and rendering high-quality, spatially consistent new views. While recent methods for view synthesis based on diffusion have shown great progress, achieving consistency among various view estimates and at the same time abiding by the desired camera pose remains a critical problem yet to be solved. In this work, we demonstrate a strikingly simple method, where we utilize a pre-trained video diffusion model to solve this problem. Our key idea is that synthesizing a novel view could be reformulated as synthesizing a video of a camera going around the object of interest -- a scanning video -- which then allows us to leverage the powerful priors that a video diffusion model would have learned. Thus, to perform novel-view synthesis, we create a smooth camera trajectory to the target view that we wish to render, and denoise using both a view-conditioned diffusion model and a video diffusion model. By doing so, we obtain a highly consistent novel view synthesis, outperforming the state of the art.
Submitted 3 December, 2023;
originally announced December 2023.
-
Unsupervised Keypoints from Pretrained Diffusion Models
Authors:
Eric Hedlin,
Gopal Sharma,
Shweta Mahajan,
Xingzhe He,
Hossam Isack,
Abhishek Kar,
Helge Rhodin,
Andrea Tagliasacchi,
Kwang Moo Yi
Abstract:
Unsupervised learning of keypoints and landmarks has seen significant progress with the help of modern neural network architectures, but performance is yet to match the supervised counterpart, making their practicability questionable. We leverage the emergent knowledge within text-to-image diffusion models, towards more robust unsupervised keypoints. Our core idea is to find text embeddings that would cause the generative model to consistently attend to compact regions in images (i.e. keypoints). To do so, we simply optimize the text embedding such that the cross-attention maps within the denoising network are localized as Gaussians with small standard deviations. We validate our performance on multiple datasets: the CelebA, CUB-200-2011, Tai-Chi-HD, DeepFashion, and Human3.6m datasets. We achieve significantly improved accuracy, sometimes even outperforming supervised ones, particularly for data that is non-aligned and less curated. Our code is publicly available and can be found through our project page: https://ubc-vision.github.io/StableKeypoints/
Submitted 21 May, 2024; v1 submitted 29 November, 2023;
originally announced December 2023.
-
The magnetically quiet solar surface dominates HARPS-N solar RVs during low activity
Authors:
Ben S. Lakeland,
Tim Naylor,
Raphaëlle Haywood,
Nadège Meunier,
Federica Rescigno,
Shweta Dalal,
Annelies Mortier,
Samantha J. Thompson,
Andrew Collier Cameron,
Xavier Dumusque,
Mercedes López-Morales,
Francesco Pepe,
Ken Rice,
Alessandro Sozzetti,
Stéphane Udry,
Eric Ford,
Adriano Ghedina,
Marcello Lodi
Abstract:
Using images from the Helioseismic and Magnetic Imager aboard the \textit{Solar Dynamics Observatory} (SDO/HMI), we extract the radial-velocity (RV) signal arising from the suppression of convective blue-shift and from bright faculae and dark sunspots transiting the rotating solar disc. We remove these rotationally modulated magnetic-activity contributions from simultaneous radial velocities observed by the HARPS-N solar feed to produce a radial-velocity time series arising from the magnetically quiet solar surface (the 'inactive-region radial velocities'). We find that the level of variability in the inactive-region radial velocities remains constant over the almost 7 year baseline and shows no correlation with well-known activity indicators. With an RMS of roughly 1 m/s, the inactive-region radial-velocity time series dominates the total RV variability budget during the decline of solar cycle 24. Finally, we compare the variability amplitude and timescale of the inactive-region radial velocities with simulations of supergranulation. We find consistency between the inactive-region radial-velocity and simulated time series, indicating that supergranulation is a significant contribution to the overall solar radial velocity variability, and may be the main source of variability towards solar minimum. This work highlights supergranulation as a key barrier to detecting Earth twins.
Submitted 27 November, 2023;
originally announced November 2023.
-
Monte Carlo Simulation Development and Implementation of the GiBUU Model for Neutrino Experiments
Authors:
Leonidas Aliaga Soplin,
Raquel Castillo Fernandez,
Jasper Gustafson,
Declan Quinn,
Shweta Yadav
Abstract:
This paper introduces a Monte Carlo simulation generated with the GiBUU model for neutrino experiments. The simulation generates realistic neutrino event samples, contributing to the prediction and interpretation of experimental outcomes. The results showcase the performance of the GiBUU-based simulation framework, emphasizing its fidelity to the original GiBUU cross-section model. This first implementation enables future work on developing the infrastructure to propagate systematic uncertainties. These contributions enhance the precision of experimental predictions and provide a platform for further exploration in future studies.
Submitted 24 November, 2023;
originally announced November 2023.
-
Giant Outer Transiting Exoplanet Mass (GOT 'EM) Survey: III. Recovery and Confirmation of a Temperate, Mildly Eccentric, Single-Transit Jupiter Orbiting TOI-2010
Authors:
Christopher R. Mann,
Paul A. Dalba,
David Lafrenière,
Benjamin J. Fulton,
Guillaume Hébrard,
Isabelle Boisse,
Shweta Dalal,
Magali Deleuil,
Xavier Delfosse,
Olivier Demangeon,
Thierry Forveille,
Neda Heidari,
Flavien Kiefer,
Eder Martioli,
Claire Moutou,
Michael Endl,
William D. Cochran,
Phillip MacQueen,
Franck Marchis,
Diana Dragomir,
Arvind F. Gupta,
Dax L. Feliz,
Belinda A. Nicholson,
Carl Ziegler,
Steven Villanueva Jr.
, et al. (26 additional authors not shown)
Abstract:
Large-scale exoplanet surveys like the TESS mission are powerful tools for discovering large numbers of exoplanet candidates. Single-transit events are commonplace within the resulting candidate list due to the unavoidable limitation of observing baseline. These single-transit planets often remain unverified due to their unknown orbital period and consequent difficulty in scheduling follow up observations. In some cases, radial velocity (RV) follow up can constrain the period enough to enable a future targeted transit detection. We present the confirmation of one such planet: TOI-2010 b. Nearly three years of RV coverage determined the period to a level where a broad window search could be undertaken with the Near-Earth Object Surveillance Satellite (NEOSSat), detecting an additional transit. An additional detection in a much later TESS sector solidified our final parameter estimation. We find TOI-2010 b to be a Jovian planet ($M_P = 1.29 \ M_{\rm Jup}$, $R_P = 1.05 \ R_{\rm Jup}$) on a mildly eccentric orbit ($e = 0.21$) with a period of $P = 141.83403$ days. Assuming a simple model with no albedo and perfect heat redistribution, the equilibrium temperature ranges from about 360 K to 450 K from apoastron to periastron. Its wide orbit and bright host star ($V=9.85$) make TOI-2010 b a valuable test-bed for future low-insolation atmospheric analysis.
Submitted 16 November, 2023;
originally announced November 2023.
-
The motivation for flexible star-formation histories from spatially resolved scales within galaxies
Authors:
Shweta Jain,
Sandro Tacchella,
Moein Mosleh
Abstract:
The estimation of galaxy stellar masses depends on the assumed prior of the star-formation history (SFH) and spatial scale of the analysis (spatially resolved versus integrated scales). In this paper, we connect the prescription of the SFH in the Spectral Energy Distribution (SED) fitting to spatially resolved scales ($\sim\mathrm{kpc}$) to shed light on the systematics involved when estimating stellar masses. Specifically, we fit the integrated photometry of $\sim970$ massive (log (M$_{\star}$/M$_{\odot}) = 9.8-11.6$), intermediate redshift ($z=0.5-2.0$) galaxies with $\texttt{Prospector}$, assuming both exponentially declining tau model and flexible SFHs. We complement these fits with the results of spatially resolved SFH estimates obtained by pixel-by-pixel SED fitting, which assume tau models for individual pixels. These spatially resolved SFHs show a large diversity in shapes, which can largely be accounted for by the flexible SFHs with $\texttt{Prospector}$. The differences in the stellar masses from those two approaches are overall in good agreement (average difference of $\sim 0.07$ dex). Contrarily, the simpler tau model SFHs typically miss the oldest episode of star formation, leading to an underestimation of the stellar mass by $\sim 0.3$ dex. We further compare the derived global specific star-formation rate (sSFR), the mass-weighted stellar age (t$_{50}$), and the star-formation timescale ($\tau_{\mathrm{SF}}$) obtained from the different SFH approaches. We conclude that the spatially resolved scales within galaxies motivate a flexible SFH on global scales to account for the diversity of SFHs and counteract the effects of outshining of older stellar populations by younger ones.
Submitted 27 October, 2023;
originally announced October 2023.
-
Study of the Energetic X-ray Superflares from the active fast rotator AB Doradus
Authors:
Shweta Didel,
Jeewan C. Pandey,
A. K. Srivastava,
Gurpreet Singh
Abstract:
We present the analyses of intense X-ray flares detected on the active fast rotator AB Dor using observations from XMM-Newton. A total of 21 flares are detected, and 13 flares are analysed in detail. The total X-ray energy of these flares is found to be in the range of 10$^{34-36}$ erg, in which the peak flare flux increased up to 34 times from the pre-/post-flaring states for the strongest observed flare. The duration of these flaring events is found to be 0.7 to 5.8 hrs. The quiescent state X-ray spectra are found to be explained by a three-temperature plasma with average temperatures of 0.29, 0.95, and 1.9 keV, respectively. The temperatures, emission measures, and abundances are found to be varying during the flares. The peak flare temperature was found in the 31-89 MK range, whereas the peak emission measure was 10$^{52.5-54.7}$ cm$^{-3}$. The abundances vary during the flares and increase by a factor of $\sim$3 from the quiescent value for the strongest detected flare. The variation in individual abundances follows the inverse-FIP effect in quiescent and flare phases. The X-ray light curves of AB Dor are found to exhibit rotational modulation. The semi-loop lengths of the flaring events are derived in the range of 10$^{9.9-10.7}$ cm, whereas the minimum magnetic field to confine the plasma in the flaring loop is estimated between 200 and 700 G.
Submitted 20 October, 2023;
originally announced October 2023.
-
EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods
Authors:
Samyadeep Basu,
Mehrdad Saberi,
Shweta Bhardwaj,
Atoosa Malemir Chegini,
Daniela Massiceti,
Maziar Sanjabi,
Shell Xu Hu,
Soheil Feizi
Abstract:
A plethora of text-guided image editing methods have recently been developed by leveraging the impressive capabilities of large-scale diffusion-based generative models such as Imagen and Stable Diffusion. A standardized evaluation protocol, however, does not exist to compare methods across different types of fine-grained edits. To address this gap, we introduce EditVal, a standardized benchmark for quantitatively evaluating text-guided image editing methods. EditVal consists of a curated dataset of images, a set of editable attributes for each image drawn from 13 possible edit types, and an automated evaluation pipeline that uses pre-trained vision-language models to assess the fidelity of generated images for each edit type. We use EditVal to benchmark 8 cutting-edge diffusion-based editing methods including SINE, Imagic and Instruct-Pix2Pix. We complement this with a large-scale human study where we show that EditVal's automated evaluation pipeline is strongly correlated with human preferences for the edit types we considered. From both the human study and automated evaluation, we find that: (i) Instruct-Pix2Pix, Null-Text and SINE are the top-performing methods averaged across different edit types, however {\it only} Instruct-Pix2Pix and Null-Text are able to preserve original image properties; (ii) most of the editing methods fail at edits involving spatial operations (e.g., changing the position of an object); (iii) there is no `winner' method which ranks the best individually across a range of different edit types. We hope that our benchmark can pave the way to developing more reliable text-guided image editing tools in the future. We will publicly release EditVal, and all associated code and human-study templates to support these research directions in https://deep-ml-research.github.io/editval/.
Submitted 3 October, 2023;
originally announced October 2023.
-
High order approximation to Caputo derivative on graded mesh and time-fractional diffusion equation for non-smooth solutions
Authors:
Shweta Kumari,
Abhishek Kumar Singh,
Vaibhav Mehandiratta,
Mani Mehra
Abstract:
In this paper, a high-order approximation to Caputo-type time-fractional diffusion equations involving an initial-time singularity of the solution is proposed. At first, we employ a numerical algorithm based on the Lagrange polynomial interpolation to approximate the Caputo derivative on the non-uniform mesh. Then truncation error rate and the optimal grading constant of the approximation on a graded mesh are obtained as $\min\{4-\alpha,r\alpha\}$ and $\frac{4-\alpha}{\alpha}$, respectively, where $\alpha\in(0,1)$ is the order of fractional derivative and $r\geq 1$ is the mesh grading parameter. Using this new approximation, a difference scheme for the Caputo-type time-fractional diffusion equation on graded temporal mesh is formulated. The scheme proves to be uniquely solvable for general $r$. Then we derive the unconditional stability of the scheme on uniform mesh. The convergence of the scheme, in particular for $r=1$, is analyzed for non-smooth solutions and concluded for smooth solutions. Finally, the accuracy of the scheme is verified by analyzing the error through a few numerical examples.
△ Less
Submitted 23 September, 2023;
originally announced September 2023.
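The rate and grading constant quoted in the abstract above can be explored numerically. The sketch below is only an illustration: the mesh formula $t_j = T (j/N)^r$ is the usual definition of a graded mesh and is assumed here, since the abstract does not state it, and the function names are hypothetical.

```python
def graded_mesh(T, N, r):
    """Mesh points t_j = T * (j / N) ** r for j = 0..N.

    Assumption: this is the standard graded-mesh definition; r = 1 gives a
    uniform mesh, larger r clusters points near the initial time t = 0.
    """
    return [T * (j / N) ** r for j in range(N + 1)]


def truncation_rate(alpha, r):
    """Truncation error rate min{4 - alpha, r * alpha} quoted in the abstract."""
    return min(4 - alpha, r * alpha)


def optimal_grading(alpha):
    """Grading constant (4 - alpha) / alpha quoted in the abstract."""
    return (4 - alpha) / alpha


if __name__ == "__main__":
    alpha = 0.5
    for r in (1, 2, optimal_grading(alpha)):
        print(f"alpha={alpha}, r={r}: rate={truncation_rate(alpha, r)}")
    print(graded_mesh(1.0, 4, 2))
```

For a given $α$, the rate grows linearly in $r$ until $r$ reaches $\frac{4-α}α$, after which it stays at $4-α$.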
-
DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms
Authors:
Shweta Patwa,
Danyu Sun,
Amir Gilad,
Ashwin Machanavajjhala,
Sudeepa Roy
Abstract:
Synthetic data generation methods, and in particular, private synthetic data generation methods, are gaining popularity as a means to make copies of sensitive databases that can be shared widely for research and data analysis. Some of the fundamental operations in data analysis include analyzing aggregated statistics, e.g., count, sum, or median, on a subset of data satisfying some conditions. When synthetic data is generated, users may be interested in knowing if their aggregated queries generating such statistics can be reliably answered on the synthetic data, for instance, to decide if the synthetic data is suitable for specific tasks. However, the standard data generation systems do not provide "per-query" quality guarantees on the synthetic data, and the users have no way of knowing how much the aggregated statistics on the synthetic data can be trusted. To address this problem, we present a novel framework named DP-PQD (differentially-private per-query decider) to detect if the query answers on the private and synthetic datasets are within a user-specified threshold of each other while guaranteeing differential privacy. We give a suite of private algorithms for per-query deciders for count, sum, and median queries, analyze their properties, and evaluate them experimentally.
Submitted 15 September, 2023;
originally announced September 2023.
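To make the idea of a per-query decider concrete, here is a minimal sketch for a count query using the textbook Laplace mechanism. It is not the algorithm from the paper: the function names are hypothetical, and it assumes the synthetic dataset is treated as public, so only the private count (sensitivity 1) needs noise.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two iid exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def count_gap_decider(private_count, synthetic_count, threshold, epsilon, rng):
    """Noisy test of whether |private - synthetic| is within `threshold`.

    Assumption: adding or removing one private row changes the count by at
    most 1, so the gap has sensitivity 1 and Laplace noise of scale
    1 / epsilon suffices for epsilon-differential privacy of this one answer.
    """
    noisy_gap = abs(private_count - synthetic_count) + laplace_noise(1 / epsilon, rng)
    return noisy_gap <= threshold


if __name__ == "__main__":
    rng = random.Random(0)
    print(count_gap_decider(1000, 990, threshold=50, epsilon=1.0, rng=rng))
```

The answer is randomized by design, so gaps close to the threshold may be misclassified; analyzing that error is the kind of property the abstract says the authors study.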
-
Identifying Interpretable Subspaces in Image Representations
Authors:
Neha Kalibhat,
Shweta Bhardwaj,
Bayan Bruss,
Hamed Firooz,
Maziar Sanjabi,
Soheil Feizi
Abstract:
We propose Automatic Feature Explanation using Contrasting Concepts (FALCON), an interpretability framework to explain features of image representations. For a target feature, FALCON captions its highly activating cropped images using a large captioning dataset (like LAION-400m) and a pre-trained vision-language model like CLIP. Each word among the captions is scored and ranked leading to a small number of shared, human-understandable concepts that closely describe the target feature. FALCON also applies contrastive interpretation using lowly activating (counterfactual) images, to eliminate spurious concepts. Although many existing approaches interpret features independently, we observe in state-of-the-art self-supervised and supervised models, that less than 20% of the representation space can be explained by individual features. We show that features in larger spaces become more interpretable when studied in groups and can be explained with high-order scoring concepts through FALCON. We discuss how extracted concepts can be used to explain and debug failures in downstream tasks. Finally, we present a technique to transfer concepts from one (explainable) representation space to another unseen representation space by learning a simple linear transformation. Code available at https://github.com/NehaKalibhat/falcon-explain.
Submitted 7 September, 2023; v1 submitted 19 July, 2023;
originally announced July 2023.
-
Predicting convective blueshift and radial-velocity dispersion due to granulation for FGK stars
Authors:
S. Dalal,
R. D. Haywood,
A. Mortier,
W. J. Chaplin,
N. Meunier
Abstract:
To detect Earth-mass planets using the Doppler method, a major obstacle is to differentiate the planetary signal from intrinsic stellar variability (e.g., pulsations, granulation, spots and plages). Convective blueshift, which results from small-scale convection at the surface of Sun-like stars, is relevant for Earth-twin detections as it exhibits Doppler noise on the order of 1 m/s. Here, we present a simple model for convective blueshift based on fundamental equations of stellar structure. Our model successfully matches observations of convective blueshift for FGK stars. Based on our model, we also compute the intrinsic noise floor for stellar granulation in the radial velocity observations. We find that for a given mass range, stars with higher metallicities display lower radial-velocity dispersion due to granulation, in agreement with MHD simulations. We also provide a set of formulae to predict the amplitude of radial-velocity dispersion due to granulation as a function of stellar parameters. Our work is vital in identifying the most amenable stellar targets for EPRV surveys and radial velocity follow-up programmes for TESS, CHEOPS, and the upcoming PLATO mission.
Submitted 13 October, 2023; v1 submitted 13 July, 2023;
originally announced July 2023.
-
Revealing the impact of polystyrene-functionalization of Au octahedral nanocrystals of different sizes on formation and structure of mesocrystals
Authors:
Dmitry Lapkin,
Shweta Singh,
Felizitas Kirner,
Sebastian Sturm,
Dameli Assalauova,
Alexandr Ignatenko,
Thomas Wiek,
Thomas Gemming,
Axel Lubk,
Knut Müller-Caspary,
Azat Khadiev,
Dmitri Novikov,
Elena V. Sturm,
Ivan A. Vartanyants
Abstract:
The self-assembly of anisotropic nanocrystals (stabilized by organic capping molecules) with pre-selected composition, size, and shape allows for the creation of nanostructured materials with unique structures and features. For such a material, the shape and packing of the individual nanoparticles play an important role. This work presents a synthesis procedure for ω-thiol-terminated polystyrene (PS-SH) functionalized gold nanooctahedra of variable size (edge length 37, 46, 58, and 72 nm). The impact of polymer chain length (Mw: 11k, 22k, 43k, and 66k g/mol) on the growth of colloidal crystals (e.g. mesocrystals) and their resulting crystal structure is investigated. Small-angle X-ray scattering (SAXS) and scanning transmission electron microscopy (STEM) methods provide a detailed structural examination of the self-assembled faceted mesocrystals based on octahedral gold nanoparticles of different size and surface functionalization. Three-dimensional angular X-ray cross-correlation analysis (AXCCA) enables high-precision determination of the superlattice structure and relative orientation of nanoparticles in mesocrystals. This approach allows us to perform non-destructive characterization of mesocrystalline materials and reveals their structure with resolution down to the nanometer scale.
Submitted 26 June, 2023;
originally announced June 2023.
-
Is Your Wallet Snitching On You? An Analysis on the Privacy Implications of Web3
Authors:
Christof Ferreira Torres,
Fiona Willi,
Shweta Shinde
Abstract:
With the recent hype around the Metaverse and NFTs, Web3 is getting more and more popular. The goal of Web3 is to decentralize the web via decentralized applications. Wallets play a crucial role as they act as an interface between these applications and the user. Wallets such as MetaMask are being used by millions of users nowadays. Unfortunately, Web3 is often advertised as more secure and private. However, decentralized applications as well as wallets are based on traditional technologies, which are not designed with privacy of users in mind. In this paper, we analyze the privacy implications that Web3 technologies such as decentralized applications and wallets have on users. To this end, we build a framework that measures exposure of wallet information. First, we study whether information about installed wallets is being used to track users online. We analyze the top 100K websites and find evidence of 1,325 websites running scripts that probe whether users have wallets installed in their browser. Second, we measure whether decentralized applications and wallets leak the user's unique wallet address to third-parties. We intercept the traffic of 616 decentralized applications and 100 wallets and find over 2000 leaks across 211 applications and more than 300 leaks across 13 wallets. Our study shows that Web3 poses a threat to users' privacy and requires new designs towards more privacy-aware wallet architectures.
Submitted 13 June, 2023;
originally announced June 2023.
-
Privacy Aware Question-Answering System for Online Mental Health Risk Assessment
Authors:
Prateek Chhikara,
Ujjwal Pasupulety,
John Marshall,
Dhiraj Chaurasia,
Shweta Kumari
Abstract:
Social media platforms have enabled individuals suffering from mental illnesses to share their lived experiences and find the online support necessary to cope. However, many users fail to receive genuine clinical support, thus exacerbating their symptoms. Screening users based on what they post online can aid providers in administering targeted healthcare and minimize false positives. Pre-trained Language Models (LMs) can assess users' social media data and classify them in terms of their mental health risk. We propose a Question-Answering (QA) approach to assess mental health risk using the Unified-QA model on two large mental health datasets. To protect user data, we extend Unified-QA by anonymizing the model training process using differential privacy. Our results demonstrate the effectiveness of modeling risk assessment as a QA task, specifically for mental health use cases. Furthermore, the model's performance decreases by less than 1% with the inclusion of differential privacy. The proposed system's performance is indicative of a promising research direction that will lead to the development of privacy-aware diagnostic systems.
Submitted 8 June, 2023;
originally announced June 2023.
-
Two Warm Neptunes transiting HIP 9618 revealed by TESS & Cheops
Authors:
Hugh P. Osborn,
Grzegorz Nowak,
Guillaume Hébrard,
Thomas Masseron,
J. Lillo-Box,
Enric Pallé,
Anja Bekkelien,
Hans-Gustav Florén,
Pascal Guterman,
Attila E. Simon,
V. Adibekyan,
Allyson Bieryla,
Luca Borsato,
Alexis Brandeker,
David R. Ciardi,
Andrew Collier Cameron,
Karen A. Collins,
Jo A. Egger,
Davide Gandolfi,
Matthew J. Hooton,
David W. Latham,
Monika Lendl,
Elisabeth C. Matthews,
Amy Tuson,
Solène Ulmer-Moll
, et al. (104 additional authors not shown)
Abstract:
HIP 9618 (HD 12572, TOI-1471, TIC 306263608) is a bright ($G=9.0$ mag) solar analogue. TESS photometry revealed the star to have two candidate planets with radii of $3.9 \pm 0.044$ $R_\oplus$ (HIP 9618 b) and $3.343 \pm 0.039$ $R_\oplus$ (HIP 9618 c). While the 20.77291 day period of HIP 9618 b was measured unambiguously, HIP 9618 c showed only two transits separated by a 680-day gap in the time series, leaving many possibilities for the period. To solve this issue, CHEOPS performed targeted photometry of period aliases to attempt to recover the true period of planet c, and successfully determined the true period to be 52.56349 d. High-resolution spectroscopy with HARPS-N, SOPHIE and CAFE revealed a mass of $10.0 \pm 3.1 M_\oplus$ for HIP 9618 b, which, according to our interior structure models, corresponds to a $6.8\pm1.4\%$ gas fraction. HIP 9618 c appears to have a lower mass than HIP 9618 b, with a 3-sigma upper limit of $< 18M_\oplus$. Follow-up and archival RV measurements also reveal a clear long-term trend which, when combined with imaging and astrometric information, reveal a low-mass companion ($0.08^{+0.12}_{-0.05} M_\odot$) orbiting at $26^{+19}_{-11}$ au. This detection makes HIP 9618 one of only five bright ($K<8$ mag) transiting multi-planet systems known to host a planet with $P>50$ d, opening the door for the atmospheric characterisation of warm ($T_{\rm eq}<750$ K) sub-Neptunes.
Submitted 7 June, 2023;
originally announced June 2023.
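The period ambiguity described in the abstract above arises because two transits separated by a gap only constrain the period to be the gap divided by a whole number of orbits. A minimal sketch of enumerating such aliases follows; the function name and the minimum-period cutoff are hypothetical, and the rounded 680-day figure is used only as an illustrative input, so the output is not expected to reproduce the reported period exactly.

```python
def period_aliases(gap_days, min_period_days):
    """Candidate periods gap / n for n = 1, 2, ... down to a minimum period.

    Assumption: exactly two transits were seen, separated by `gap_days`, and
    periods below `min_period_days` are excluded by other data.
    """
    aliases = []
    n = 1
    while gap_days / n >= min_period_days:
        aliases.append(gap_days / n)
        n += 1
    return aliases


if __name__ == "__main__":
    for n, period in enumerate(period_aliases(680.0, 40.0), start=1):
        print(f"n={n:2d}  P={period:.2f} d")
```

Targeted follow-up photometry, as the abstract describes, checks for a transit at the times predicted by each remaining alias.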
-
A General Framework for Uncertainty Quantification via Neural SDE-RNN
Authors:
Shweta Dahale,
Sai Munikoti,
Balasubramaniam Natarajan
Abstract:
Uncertainty quantification is a critical yet unsolved challenge for deep learning, especially for the time series imputation with irregularly sampled measurements. To tackle this problem, we propose a novel framework based on the principles of recurrent neural networks and neural stochastic differential equations for reconciling irregularly sampled measurements. We impute measurements at any arbitrary timescale and quantify the uncertainty in the imputations in a principled manner. Specifically, we derive analytical expressions for quantifying and propagating the epistemic and aleatoric uncertainty across time instants. Our experiments on the IEEE 37 bus test distribution system reveal that our framework can outperform state-of-the-art uncertainty quantification approaches for time-series data imputations.
Submitted 1 June, 2023;
originally announced June 2023.
-
ACAI: Protecting Accelerator Execution with Arm Confidential Computing Architecture
Authors:
Supraja Sridhara,
Andrin Bertschi,
Benedict Schlüter,
Mark Kuhne,
Fabio Aliberti,
Shweta Shinde
Abstract:
Trusted execution environments in several existing and upcoming CPUs demonstrate the success of confidential computing, with the caveat that tenants cannot securely use accelerators such as GPUs and FPGAs. In this paper, we reconsider the Arm Confidential Computing Architecture (CCA) design, an upcoming TEE feature in Armv9-A, to address this gap. We observe that CCA offers the right abstraction and mechanisms to allow confidential VMs to use accelerators as a first-class abstraction. We build ACAI, a CCA-based solution, with a principled approach of extending CCA security invariants to device-side access to address several critical security gaps. Our experimental results on GPU and FPGA demonstrate the feasibility of ACAI while maintaining security guarantees.
Submitted 25 October, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Unsupervised Semantic Correspondence Using Stable Diffusion
Authors:
Eric Hedlin,
Gopal Sharma,
Shweta Mahajan,
Hossam Isack,
Abhishek Kar,
Andrea Tagliasacchi,
Kwang Moo Yi
Abstract:
Text-to-image diffusion models are now capable of generating images that are often indistinguishable from real images. To generate such images, these models must understand the semantics of the objects they are asked to generate. In this work we show that, without any training, one can leverage this semantic knowledge within diffusion models to find semantic correspondences - locations in multiple images that have the same semantic meaning. Specifically, given an image, we optimize the prompt embeddings of these models for maximum attention on the regions of interest. These optimized embeddings capture semantic information about the location, which can then be transferred to another image. By doing so we obtain results on par with the strongly supervised state of the art on the PF-Willow dataset and significantly outperform (20.9% relative for the SPair-71k dataset) any existing weakly or unsupervised method on PF-Willow, CUB-200 and SPair-71k datasets.
Submitted 23 December, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Stereo-Electronic Factors Influencing the Stability of Hydroperoxyalkyl Radicals: Transferability of Chemical Trends across Hydrocarbons and ab initio Methods
Authors:
Saurabh Chandra Kandpal,
Kgalaletso P. Otukile,
Shweta Jindal,
Salini Senthil,
Cameron Matthews,
Sabyasachi Chakraborty,
Lyudmila V. Moskaleva,
Raghunathan Ramakrishnan
Abstract:
The hydroperoxyalkyl radicals (.QOOH) are known to play a significant role in combustion and tropospheric processes, yet their direct spectroscopic detection remains challenging. In this study, we investigate molecular stereo-electronic effects influencing the kinetic and thermodynamic stability of a .QOOH along its formation path from the precursor, alkylperoxyl radical (ROO.), and the depletion path resulting in the formation of cyclic ether + .OH. We focus on reactive intermediates encountered in the oxidation of acyclic hydrocarbon radicals: ethyl, isopropyl, isobutyl, tert-butyl, neopentyl, and their alicyclic counterparts: cyclohexyl, cyclohexenyl, and cyclohexadienyl. We report reaction energies and barriers calculated with the highly accurate method Weizmann-1 (W1) for the channels: ROO. <=> .QOOH, ROO. <=> alkene + .OOH, .QOOH <=> alkene + .OOH, and .QOOH <=> cyclic ether + .OH. Using W1 results as a reference, we have systematically benchmarked the accuracy of popular density functional theory (DFT), composite thermochemistry methods, and an explicitly correlated coupled-cluster method. We ascertain inductive, resonance, and steric effects on the overall stability of .QOOH and computationally investigate the possibility of forming more stable species. With new reactions as test cases, we probe the capacity of various ab initio methods to yield quantitative insights on the elementary steps of combustion.
Submitted 21 September, 2023; v1 submitted 22 May, 2023;
originally announced May 2023.
-
Quadratic Functional Encryption for Secure Training in Vertical Federated Learning
Authors:
Shuangyi Chen,
Anuja Modi,
Shweta Agrawal,
Ashish Khisti
Abstract:
Vertical federated learning (VFL) enables the collaborative training of machine learning (ML) models in settings where the data is distributed amongst multiple parties who wish to protect the privacy of their individual data. Notably, in VFL, the labels are available to a single party and the complete feature set is formed only when data from all parties is combined. Recently, Xu et al. proposed a new framework called FedV for secure gradient computation for VFL using multi-input functional encryption. In this work, we explain how some of the information leakage in Xu et al. can be avoided by using Quadratic functional encryption when training generalized linear models for vertical federated learning.
Submitted 19 June, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
Stochastic Subgraph Neighborhood Pooling for Subgraph Classification
Authors:
Shweta Ann Jacob,
Paul Louis,
Amirali Salehi-Abari
Abstract:
Subgraph classification is an emerging field in graph representation learning where the task is to classify a group of nodes (i.e., a subgraph) within a graph. Subgraph classification has applications such as predicting the cellular function of a group of proteins or identifying rare diseases given a collection of phenotypes. Graph neural networks (GNNs) are the de facto solution for node, link, and graph-level tasks but fail to perform well on subgraph classification tasks. Even GNNs tailored for graph classification are not directly transferable to subgraph classification as they ignore the external topology of the subgraph, thus failing to capture how the subgraph is located within the larger graph. The current state-of-the-art models for subgraph classification address this shortcoming through either labeling tricks or multiple message-passing channels, both of which impose a computation burden and are not scalable to large graphs. To address the scalability issue while maintaining generalization, we propose Stochastic Subgraph Neighborhood Pooling (SSNP), which jointly aggregates the subgraph and its neighborhood (i.e., external topology) information without any computationally expensive operations such as labeling tricks. To improve scalability and generalization further, we also propose a simple data augmentation pre-processing step for SSNP that creates multiple sparse views of the subgraph neighborhood. We show that our model is more expressive than GNNs without labeling tricks. Our extensive experiments demonstrate that our models outperform current state-of-the-art methods (with a margin of up to 2%) while being up to 3X faster in training.
Submitted 17 April, 2023;
originally announced April 2023.
-
$k$-SUM in the Sparse Regime
Authors:
Shweta Agrawal,
Sagnik Saha,
Nikolaj I. Schwartzbach,
Akhil Vanukuri,
Prashant Nalini Vasudevan
Abstract:
In the average-case $k$-SUM problem, given $r$ integers chosen uniformly at random from $\{0,\dots,M-1\}$, the objective is to find a ``solution'' set of $k$ numbers that sum to $0$ modulo $M$. In the dense regime of $M \leq r^k$, where solutions exist with high probability, the complexity of these problems is well understood. Much less is known in the sparse regime of $M\gg r^k$, where solutions are unlikely to exist.
In this work, we initiate the study of the sparse regime for $k$-SUM and its variant $k$-XOR, especially their planted versions, where a random solution is planted in a randomly generated instance and has to be recovered. We provide evidence for the hardness of these problems and suggest new applications to cryptography.
Complexity. First we study the complexity of these problems in the sparse regime and show:
- Conditional Lower Bounds. Assuming established conjectures about the hardness of average-case (non-planted) $k$-SUM/$k$-XOR when $M = r^k$, we provide non-trivial lower bounds on the running time of algorithms for planted $k$-SUM when $r^k\leq M\leq r^{2k}$.
- Hardness Amplification. We show that for any $M \geq r^k$, if an algorithm running in time $T$ solves planted $k$-SUM/$k$-XOR with success probability $Ω(1/\text{polylog}(r))$, then there is an algorithm running in time $\tilde{O}(T)$ that solves it with probability $(1-o(1))$.
- New Reductions and Algorithms. We provide reductions for $k$-SUM/$k$-XOR from search to decision, as well as worst-case and average-case reductions to the Subset Sum problem from $k$-SUM, as well as a new algorithm for average-case $k$-XOR at low densities.
Cryptography. We show that by additionally assuming mild hardness of $k$-XOR, we can construct Public Key Encryption (PKE) from a weaker variant of the Learning Parity with Noise (LPN) problem than was known before.
Submitted 21 November, 2023; v1 submitted 4 April, 2023;
originally announced April 2023.
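The planted problem described in the abstract above can be illustrated with a small generator and an exhaustive solver. This is a sketch under stated assumptions, not the construction or any algorithm from the paper: the planting procedure (pick $k$ random positions and overwrite one of them so the chosen values sum to $0$ modulo $M$) is one simple choice, and all function names are hypothetical.

```python
import random
from itertools import combinations


def planted_k_sum_instance(r, k, M, rng):
    """Draw r uniform values in {0, ..., M-1} and plant one k-subset summing to 0 mod M."""
    xs = [rng.randrange(M) for _ in range(r)]
    idx = rng.sample(range(r), k)
    # Overwrite the last chosen position so the k chosen values sum to 0 mod M.
    partial = sum(xs[i] for i in idx[:-1]) % M
    xs[idx[-1]] = (-partial) % M
    return xs, sorted(idx)


def brute_force_k_sum(xs, k, M):
    """Return the first k-subset of indices whose values sum to 0 mod M, or None."""
    for combo in combinations(range(len(xs)), k):
        if sum(xs[i] for i in combo) % M == 0:
            return list(combo)
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    r, k, M = 12, 3, 10**9  # M far above r**k, i.e. the sparse regime
    xs, planted = planted_k_sum_instance(r, k, M, rng)
    print("planted:", planted, "found:", brute_force_k_sum(xs, k, M))
```

The exhaustive search examines on the order of $r^k$ subsets, which is the trivial baseline against which the running-time lower bounds in the abstract are stated.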
-
Non-perturbative Generation of Light Antiquark Flavor Asymmetry in Proton
Authors:
Shweta Choudhary,
Pranjal Srivastava,
Harleen Dahiya
Abstract:
We compute the light antiquark flavor asymmetry in the proton using the Chiral Quark Model ($χ_{\rm QM}$). The distribution functions for the light antiquarks $\bar{d}(x)$ and $\bar{u}(x)$ have been extracted with the help of experimental data from NuSea/E866 and HERMES for the Bjorken-$x$ range $0.015 < x < 0.35$ as well as from the most recent SeaQuest data for an extended $x$ range $0.13 < x < 0.45$. Our results on the $\bar{d}(x)-\bar{u}(x)$, $\frac{\bar{d}(x)}{\bar{u}(x)}$ and Gottfried Integral $I_G$ are in agreement with the experimental data and confirm the presence of enhanced $\bar{d}$ sea whose origin is purely non-perturbative based on chiral symmetry breaking in QCD.
Submitted 21 March, 2023;
originally announced March 2023.
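For reference, the Gottfried integral mentioned in the abstract above is conventionally related to the antiquark asymmetry as follows. This is the standard textbook form, assuming isospin symmetry between the proton and the neutron; the abstract itself does not spell it out.

$$I_G = \int_0^1 \frac{F_2^p(x) - F_2^n(x)}{x}\,dx = \frac{1}{3} - \frac{2}{3}\int_0^1 \left[\bar{d}(x) - \bar{u}(x)\right] dx$$

A flavor-symmetric sea, $\bar{d}(x) = \bar{u}(x)$, gives $I_G = 1/3$, so an enhanced $\bar{d}$ sea lowers $I_G$ below that value.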
-
Yukawa-Casimir wormholes in f(Q) gravity
Authors:
Ambuj Kumar Mishra,
Shweta,
Umesh Kumar Sharma
Abstract:
Casimir energy is always suggested as a possible source to create a traversable wormhole. It is also used to demonstrate the existence of negative energy, which can be created in a lab. To generalize this idea, a Yukawa modification of the Casimir source has been considered by Remo Garattini (Eur. Phys. J. C 81 no.9, 824, 2021). In this work, we explore the Yukawa-Casimir wormholes in symmetric teleparallel gravity. We have taken four different forms of $f(Q)$ to obtain wormhole solutions powered by the original Casimir energy source and the Yukawa modification of the Casimir energy source. In power law form $f(Q)= αQ^2 + β$ and quadratic form $f(Q)= αQ^2 + βQ + γ$, where $α, β, γ$ are constants and $Q$ is the non-metricity scalar, we find that the wormhole throat is filled with non-exotic matter. We find self-sustained traversable wormholes in the Casimir source where null energy conditions are violated in all specific forms of $f(Q)$, while after Yukawa modification it is observed that violation of null energy conditions is restricted to some regions in the vicinity of the throat.
Submitted 23 February, 2023;
originally announced March 2023.
-
Informality, Education-Occupation Mismatch, and Wages: Evidence from India
Authors:
Shweta Bahl,
Ajay Sharma
Abstract:
This article examines the intertwining relationship between informality and education-occupation mismatch (EOM) and the consequent impact on wages. In particular, we discuss two issues: first, the relative importance of informality and education-occupation mismatch in determining wages, and second, the relevance of EOM for formal and informal workers. The analysis reveals that although both informality and EOM are significant determinants of wages, the former is more crucial for a developing country like India. Further, we find that EOM is one of the crucial determinants of wages for formal workers, but it is not critical for informal workers. The study highlights the need for considering the bifurcation of formal-informal workers to understand the complete dynamics of EOM, especially for developing countries where informality is predominant.
Submitted 25 February, 2023;
originally announced February 2023.
-
A Novel Demand Response Model and Method for Peak Reduction in Smart Grids -- PowerTAC
Authors:
Sanjay Chandlekar,
Arthik Boroju,
Shweta Jain,
Sujit Gujar
Abstract:
One of the widely used peak reduction methods in smart grids is demand response, where one analyzes the shift in customers' (agents') usage patterns in response to the signal from the distribution company. Often, these signals are in the form of incentives offered to agents. This work studies the effect of incentives on the probabilities of accepting such offers in a real-world smart grid simulator, PowerTAC. We first show that there exists a function that depicts the probability of an agent reducing its load as a function of the discounts offered to them. We call it reduction probability (RP). RP function is further parametrized by the rate of reduction (RR), which can differ for each agent. We provide an optimal algorithm, MJS--ExpResponse, that outputs the discounts to each agent by maximizing the expected reduction under a budget constraint. When RRs are unknown, we propose a Multi-Armed Bandit (MAB) based online algorithm, namely MJSUCB--ExpResponse, to learn RRs. Experimentally we show that it exhibits sublinear regret. Finally, we showcase the efficacy of the proposed algorithm in mitigating demand peaks in a real-world smart grid system using the PowerTAC simulator as a test bed.
Submitted 24 February, 2023;
originally announced February 2023.