Search | arXiv e-print repository

doi 10.1145/3673422.3674896

To Switch or Not to Switch to TCP Prague? Incentives for Adoption in a Partial L4S Deployment

Authors: Fatih Berkay Sarpkaya, Ashutosh Srivastava, Fraida Fund, Shivendra Panwar

Abstract: The Low Latency, Low Loss, Scalable Throughput (L4S) architecture has the potential to reduce queuing delay when it is deployed at endpoints and routers throughout the Internet. However, it is not clear how TCP Prague, a prototype scalable congestion control for L4S, behaves when L4S is not yet universally deployed. Specifically, we consider the question: in a partial L4S deployment, will a user b… ▽ More The Low Latency, Low Loss, Scalable Throughput (L4S) architecture has the potential to reduce queuing delay when it is deployed at endpoints and routers throughout the Internet. However, it is not clear how TCP Prague, a prototype scalable congestion control for L4S, behaves when L4S is not yet universally deployed. Specifically, we consider the question: in a partial L4S deployment, will a user benefit by unilaterally switching from the status quo TCP to TCP Prague? To address this question, we evaluate the performance of a TCP Prague flow when sharing an L4S or non-L4S bottleneck queue with a non-L4S flow. Our findings suggest that the L4S congestion control, TCP Prague, has less favorable throughput or fairness properties than TCP Cubic or BBR in some coexistence scenarios, which may hinder adoption. △ Less

Submitted 29 June, 2024; originally announced July 2024.

Comments: Accepted to ACM Applied Networking Research Workshop (ANRW), 2024

arXiv:2406.17235 [pdf, other]

Task-Agnostic Federated Learning

Authors: Zhengtao Yao, Hong Nguyen, Ajitesh Srivastava, Jose Luis Ambite

Abstract: In the realm of medical imaging, leveraging large-scale datasets from various institutions is crucial for develo** precise deep learning models, yet privacy concerns frequently impede data sharing. federated learning (FL) emerges as a prominent solution for preserving privacy while facilitating collaborative learning. However, its application in real-world scenarios faces several obstacles, such… ▽ More In the realm of medical imaging, leveraging large-scale datasets from various institutions is crucial for develo** precise deep learning models, yet privacy concerns frequently impede data sharing. federated learning (FL) emerges as a prominent solution for preserving privacy while facilitating collaborative learning. However, its application in real-world scenarios faces several obstacles, such as task & data heterogeneity, label scarcity, non-identically distributed (non-IID) data, computational vaiation, etc. In real-world, medical institutions may not want to disclose their tasks to FL server and generalization challenge of out-of-network institutions with un-seen task want to join the on-going federated system. This study address task-agnostic and generalization problem on un-seen tasks by adapting self-supervised FL framework. Utilizing Vision Transformer (ViT) as consensus feature encoder for self-supervised pre-training, no initial labels required, the framework enabling effective representation learning across diverse datasets and tasks. Our extensive evaluations, using various real-world non-IID medical imaging datasets, validate our approach's efficacy, retaining 90\% of F1 accuracy with only 5\% of the training data typically required for centralized approaches and exhibiting superior adaptability to out-of-distribution task. The result indicate that federated learning architecture can be a potential approach toward multi-task foundation modeling. △ Less

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.14861 [pdf, other]

Resilience of the Electric Grid through Trustable IoT-Coordinated Assets

Authors: Vineet J. Nair, Venkatesh Venkataramanan, Priyank Srivastava, Partha S. Sarker, Anurag Srivastava, Laurentiu D. Marinovici, Jun Zha, Christopher Irwin, Prateek Mittal, John Williams, H. Vincent Poor, Anuradha M. Annaswamy

Abstract: The electricity grid has evolved from a physical system to a cyber-physical system with digital devices that perform measurement, control, communication, computation, and actuation. The increased penetration of distributed energy resources (DERs) that include renewable generation, flexible loads, and storage provides extraordinary opportunities for improvements in efficiency and sustainability. Ho… ▽ More The electricity grid has evolved from a physical system to a cyber-physical system with digital devices that perform measurement, control, communication, computation, and actuation. The increased penetration of distributed energy resources (DERs) that include renewable generation, flexible loads, and storage provides extraordinary opportunities for improvements in efficiency and sustainability. However, they can introduce new vulnerabilities in the form of cyberattacks, which can cause significant challenges in ensuring grid resilience. %, i.e. the ability to rapidly restore grid services in the face of severe disruptions. We propose a framework in this paper for achieving grid resilience through suitably coordinated assets including a network of Internet of Things (IoT) devices. A local electricity market is proposed to identify trustable assets and carry out this coordination. Situational Awareness (SA) of locally available DERs with the ability to inject power or reduce consumption is enabled by the market, together with a monitoring procedure for their trustability and commitment. With this SA, we show that a variety of cyberattacks can be mitigated using local trustable resources without stressing the bulk grid. The demonstrations are carried out using a variety of platforms with a high-fidelity co-simulation platform, real-time hardware-in-the-loop validation, and a utility-friendly simulator. △ Less

Submitted 21 June, 2024; originally announced June 2024.

Comments: Submitted to the Proceedings of the National Academy of Sciences (PNAS), under review

arXiv:2406.13232 [pdf]

Towards Robust Evaluation: A Comprehensive Taxonomy of Datasets and Metrics for Open Domain Question Answering in the Era of Large Language Models

Authors: Akchay Srivastava, Atif Memon

Abstract: Open Domain Question Answering (ODQA) within natural language processing involves building systems that answer factual questions using large-scale knowledge corpora. Recent advances stem from the confluence of several factors, such as large-scale training datasets, deep learning techniques, and the rise of large language models. High-quality datasets are used to train models on realistic scenarios… ▽ More Open Domain Question Answering (ODQA) within natural language processing involves building systems that answer factual questions using large-scale knowledge corpora. Recent advances stem from the confluence of several factors, such as large-scale training datasets, deep learning techniques, and the rise of large language models. High-quality datasets are used to train models on realistic scenarios and enable the evaluation of the system on potentially unseen data. Standardized metrics facilitate comparisons between different ODQA systems, allowing researchers to objectively track advancements in the field. Our study presents a thorough examination of the current landscape of ODQA benchmarking by reviewing 52 datasets and 20 evaluation techniques across textual and multimodal modalities. We introduce a novel taxonomy for ODQA datasets that incorporates both the modality and difficulty of the question types. Additionally, we present a structured organization of ODQA evaluation metrics along with a critical analysis of their inherent trade-offs. Our study aims to empower researchers by providing a framework for the robust evaluation of modern question-answering systems. We conclude by identifying the current challenges and outlining promising avenues for future research and development. △ Less

Submitted 19 June, 2024; originally announced June 2024.

Comments: 22 pages, 13 tables, 7 figures

arXiv:2406.12458 [pdf, other]

Planning Using Schrödinger Bridge Diffusion Models

Authors: Adarsh Srivastava

Abstract: Offline planning often struggles with poor sampling efficiency as it tries to learn policies from scratch. Especially with diffusion models, such cold start practices mean that both training and sampling become very expensive. We hypothesize that certain environment constraint priors or cheaply available policies make it unnecessary to learn from scratch, and explore a way to incorporate such prio… ▽ More Offline planning often struggles with poor sampling efficiency as it tries to learn policies from scratch. Especially with diffusion models, such cold start practices mean that both training and sampling become very expensive. We hypothesize that certain environment constraint priors or cheaply available policies make it unnecessary to learn from scratch, and explore a way to incorporate such priors in the learning process. To achieve that, we borrow a variation of the Schrödinger bridge formulation from the image-to-image setting and apply it to planning tasks. We study the performance on some planning tasks and compare the performance against the DDPM formulation. The code for this work is available at https://github.com/adrshsrvstv/bridge_diffusion_planning. △ Less

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.07840 [pdf, other]

SynthForge: Synthesizing High-Quality Face Dataset with Controllable 3D Generative Models

Authors: Abhay Rawat, Shubham Dokania, Astitva Srivastava, Shuaib Ahmed, Haiwen Feng, Rahul Tallamraju

Abstract: Recent advancements in generative models have unlocked the capabilities to render photo-realistic data in a controllable fashion. Trained on the real data, these generative models are capable of producing realistic samples with minimal to no domain gap, as compared to the traditional graphics rendering. However, using the data generated using such models for training downstream tasks remains under… ▽ More Recent advancements in generative models have unlocked the capabilities to render photo-realistic data in a controllable fashion. Trained on the real data, these generative models are capable of producing realistic samples with minimal to no domain gap, as compared to the traditional graphics rendering. However, using the data generated using such models for training downstream tasks remains under-explored, mainly due to the lack of 3D consistent annotations. Moreover, controllable generative models are learned from massive data and their latent space is often too vast to obtain meaningful sample distributions for downstream task with limited generation. To overcome these challenges, we extract 3D consistent annotations from an existing controllable generative model, making the data useful for downstream tasks. Our experiments show competitive performance against state-of-the-art models using only generated synthetic data, demonstrating potential for solving downstream tasks. Project page: https://synth-forge.github.io △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: 11 pages, 4 figures, 3 tables. Under Review

arXiv:2406.06608 [pdf, other]

The Prompt Report: A Systematic Survey of Prompting Techniques

Authors: Sander Schulhoff, Michael Ilie, Nishant Balepur, Konstantine Kahadze, Amanda Liu, Chenglei Si, Yinheng Li, Aayush Gupta, HyoJung Han, Sevien Schulhoff, Pranav Sandeep Dulepet, Saurav Vidyadhara, Dayeon Ki, Sweta Agrawal, Chau Pham, Gerson Kroiz, Feileen Li, Hudson Tao, Ashay Srivastava, Hevander Da Costa, Saloni Gupta, Megan L. Rogers, Inna Goncearenco, Giuseppe Sarli, Igor Galynker , et al. (6 additional authors not shown)

Abstract: Generative Artificial Intelligence (GenAI) systems are being increasingly deployed across all parts of industry and research settings. Developers and end users interact with these systems through the use of prompting or prompt engineering. While prompting is a widespread and highly researched concept, there exists conflicting terminology and a poor ontological understanding of what constitutes a p… ▽ More Generative Artificial Intelligence (GenAI) systems are being increasingly deployed across all parts of industry and research settings. Developers and end users interact with these systems through the use of prompting or prompt engineering. While prompting is a widespread and highly researched concept, there exists conflicting terminology and a poor ontological understanding of what constitutes a prompt due to the area's nascency. This paper establishes a structured understanding of prompts, by assembling a taxonomy of prompting techniques and analyzing their use. We present a comprehensive vocabulary of 33 vocabulary terms, a taxonomy of 58 text-only prompting techniques, and 40 techniques for other modalities. We further present a meta-analysis of the entire literature on natural language prefix-prompting. △ Less

Submitted 16 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.21050 [pdf, other]

Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models

Authors: Xinxi Zhang, Song Wen, Ligong Han, Felix Juefei-Xu, Akash Srivastava, Junzhou Huang, Hao Wang, Molei Tao, Dimitris N. Metaxas

Abstract: Adapting large-scale pre-trained generative models in a parameter-efficient manner is gaining traction. Traditional methods like low rank adaptation achieve parameter efficiency by imposing constraints but may not be optimal for tasks requiring high representation capacity. We propose a novel spectrum-aware adaptation framework for generative models. Our method adjusts both singular values and the… ▽ More Adapting large-scale pre-trained generative models in a parameter-efficient manner is gaining traction. Traditional methods like low rank adaptation achieve parameter efficiency by imposing constraints but may not be optimal for tasks requiring high representation capacity. We propose a novel spectrum-aware adaptation framework for generative models. Our method adjusts both singular values and their basis vectors of pretrained weights. Using the Kronecker product and efficient Stiefel optimizers, we achieve parameter-efficient adaptation of orthogonal matrices. We introduce Spectral Orthogonal Decomposition Adaptation (SODA), which balances computational efficiency and representation capacity. Extensive evaluations on text-to-image diffusion models demonstrate SODA's effectiveness, offering a spectrum-aware alternative to existing fine-tuning methods. △ Less

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.20592 [pdf, other]

LInK: Learning Joint Representations of Design and Performance Spaces through Contrastive Learning for Mechanism Synthesis

Authors: Amin Heyrani Nobari, Akash Srivastava, Dan Gutfreund, Kai Xu, Faez Ahmed

Abstract: In this paper, we introduce LInK, a novel framework that integrates contrastive learning of performance and design space with optimization techniques for solving complex inverse problems in engineering design with discrete and continuous variables. We focus on the path synthesis problem for planar linkage mechanisms. By leveraging a multi-modal and transformation-invariant contrastive learning fra… ▽ More In this paper, we introduce LInK, a novel framework that integrates contrastive learning of performance and design space with optimization techniques for solving complex inverse problems in engineering design with discrete and continuous variables. We focus on the path synthesis problem for planar linkage mechanisms. By leveraging a multi-modal and transformation-invariant contrastive learning framework, LInK learns a joint representation that captures complex physics and design representations of mechanisms, enabling rapid retrieval from a vast dataset of over 10 million mechanisms. This approach improves precision through the warm start of a hierarchical unconstrained nonlinear optimization algorithm, combining the robustness of traditional optimization with the speed and adaptability of modern deep learning methods. Our results on an existing benchmark demonstrate that LInK outperforms existing methods with 28 times less error compared to a state-of-the-art approach while taking 20 times less time on an existing benchmark. Moreover, we introduce a significantly more challenging benchmark, named LINK-ABC, which involves synthesizing linkages that trace the trajectories of English capital alphabets - an inverse design benchmark task that existing methods struggle with due to large non-linearities and tiny feasible space. Our results demonstrate that LInK not only advances the field of mechanism design but also broadens the applicability of contrastive learning and optimization to other areas of engineering. △ Less

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19310 [pdf, other]

Network Connectivity--Information Freshness Tradeoff in Information Dissemination Over Networks

Authors: Arunabh Srivastava, Sennur Ulukus

Abstract: We consider a gossip network consisting of a source generating updates and $n$ nodes connected according to a given graph structure. The source keeps updates of a process, that might be generated or observed, and shares them with the gossi** network. The nodes in the network communicate with their neighbors and disseminate these version updates using a push-style gossip strategy. We use the vers… ▽ More We consider a gossip network consisting of a source generating updates and $n$ nodes connected according to a given graph structure. The source keeps updates of a process, that might be generated or observed, and shares them with the gossi** network. The nodes in the network communicate with their neighbors and disseminate these version updates using a push-style gossip strategy. We use the version age metric to quantify the timeliness of information at the nodes. We first find an upper bound for the average version age for a set of nodes in a general network. Using this, we find the average version age scaling of a node in several network graph structures, such as two-dimensional grids, generalized rings and hyper-cubes. Prior to our work, it was known that when $n$ nodes are connected on a ring the version age scales as $O(n^{\frac{1}{2}})$, and when they are connected on a fully-connected graph the version age scales as $O(\log n)$. Ours is the first work to show an age scaling result for a connectivity structure other than the ring and the fully-connected network, which constitute the two extremes of network connectivity. Our work helps fill the gap between these two extremes by analyzing a large variety of graphs with intermediate connectivity, thus providing insight into the relationship between the connectivity structure of the network and the version age, and uncovering a network connectivity--information freshness tradeoff. △ Less

Submitted 29 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2307.08670, arXiv:2308.12266

arXiv:2405.18670 [pdf, other]

Adapting Differentially Private Synthetic Data to Relational Databases

Authors: Kaveh Alimohammadi, Hao Wang, Ojas Gulati, Akash Srivastava, Navid Azizan

Abstract: Existing differentially private (DP) synthetic data generation mechanisms typically assume a single-source table. In practice, data is often distributed across multiple tables with relationships across tables. In this paper, we introduce the first-of-its-kind algorithm that can be combined with any existing DP mechanisms to generate synthetic relational databases. Our algorithm iteratively refines… ▽ More Existing differentially private (DP) synthetic data generation mechanisms typically assume a single-source table. In practice, data is often distributed across multiple tables with relationships across tables. In this paper, we introduce the first-of-its-kind algorithm that can be combined with any existing DP mechanisms to generate synthetic relational databases. Our algorithm iteratively refines the relationship between individual synthetic tables to minimize their approximation errors in terms of low-order marginal distributions while maintaining referential integrity. Finally, we provide both DP and theoretical utility guarantees for our algorithm. △ Less

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.12052 [pdf, other]

Parallelization of the K-Means Algorithm with Applications to Big Data Clustering

Authors: Ashish Srivastava, Mohammed Nawfal

Abstract: The K-Means clustering using LLoyd's algorithm is an iterative approach to partition the given dataset into K different clusters. The algorithm assigns each point to the cluster based on the following objective function \[\ \min Σ_{i=1}^{n}||x_i-μ_{x_i}||^2\] The serial algorithm involves iterative steps where we compute the distance of each datapoint from the centroids and assign the datapoint… ▽ More The K-Means clustering using LLoyd's algorithm is an iterative approach to partition the given dataset into K different clusters. The algorithm assigns each point to the cluster based on the following objective function \[\ \min Σ_{i=1}^{n}||x_i-μ_{x_i}||^2\] The serial algorithm involves iterative steps where we compute the distance of each datapoint from the centroids and assign the datapoint to the nearest centroid. This approach is essentially known as the expectation-maximization step. Clustering involves extensive computations to calculate distances at each iteration, which increases as the number of data points increases. This provides scope for parallelism. However, we must ensure that in a parallel process, each thread has access to the updated centroid value and no racing condition exists on any centroid values. We will compare two different approaches in this project. The first approach is an OpenMP flat synchronous method where all processes are run in parallel, and we use synchronization to ensure safe updates of clusters. The second approach we adopt is a GPU based parallelization approach using OpenACC wherein we will try to make use of GPU architecture to parallelize chunks of the algorithm to observe decreased computation time. We will analyze metrics such as speed up, efficiency,time taken with varying data points, and number of processes to compare the two approaches and understand the relative performance improvement we can get. △ Less

Submitted 20 May, 2024; originally announced May 2024.

Comments: 7 Pages, 5 tables, 12 figures

arXiv:2405.06639 [pdf, other]

Value Augmented Sampling for Language Model Alignment and Personalization

Authors: Seungwook Han, Idan Shenfeld, Akash Srivastava, Yoon Kim, Pulkit Agrawal

Abstract: Aligning Large Language Models (LLMs) to cater to different human preferences, learning new skills, and unlearning harmful behavior is an important problem. Search-based methods, such as Best-of-N or Monte-Carlo Tree Search, are performant, but impractical for LLM adaptation due to their high inference cost. On the other hand, using Reinforcement Learning (RL) for adaptation is computationally eff… ▽ More Aligning Large Language Models (LLMs) to cater to different human preferences, learning new skills, and unlearning harmful behavior is an important problem. Search-based methods, such as Best-of-N or Monte-Carlo Tree Search, are performant, but impractical for LLM adaptation due to their high inference cost. On the other hand, using Reinforcement Learning (RL) for adaptation is computationally efficient, but performs worse due to the optimization challenges in co-training the value function and the policy. We present a new framework for reward optimization, Value Augmented Sampling (VAS), that can maximize different reward functions using data sampled from only the initial, frozen LLM. VAS solves for the optimal reward-maximizing policy without co-training the policy and the value function, making the optimization stable, outperforming established baselines, such as PPO and DPO, on standard benchmarks, and achieving comparable results to Best-of-128 with lower inference cost. Unlike existing RL methods that require changing the weights of the LLM, VAS does not require access to the weights of the pre-trained LLM. Thus, it can even adapt LLMs (e.g., ChatGPT), which are available only as APIs. In addition, our algorithm unlocks the new capability of composing several rewards and controlling the extent of each one during deployment time, paving the road ahead for the future of aligned, personalized LLMs. △ Less

Submitted 10 May, 2024; originally announced May 2024.

Comments: Website: https://sites.google.com/view/llm-vas

arXiv:2404.12643 [pdf]

AipanVR: A Virtual Reality Experience for Preserving Uttarakhand's Traditional Art Form

Authors: Nishant Chaudhary, Mihir Raj, Richik Bhattacharjee, Anmol Srivastava, Rakesh Sah, Pankaj Badoni

Abstract: This paper presents a demonstration of the developed prototype showcasing a way to preserve the Intangible Cultural Heritage of Uttarakhand, India. Aipan is a traditional art form practiced in the Kumaon region in the state of Uttarakhand. It is typically used to decorate floors and walls at places of worship or entrances of homes and is considered auspicious to begin any work or event. This art i… ▽ More This paper presents a demonstration of the developed prototype showcasing a way to preserve the Intangible Cultural Heritage of Uttarakhand, India. Aipan is a traditional art form practiced in the Kumaon region in the state of Uttarakhand. It is typically used to decorate floors and walls at places of worship or entrances of homes and is considered auspicious to begin any work or event. This art is associated with a great degree of social, cultural as well as religious significance and is passed from generation to generation. However, in the present era of modernization and technological advancements, this art form now stands on the verge of depletion. This study presents a humble attempt to preserve this vanishing art form through the use of Virtual Reality (VR). Ethnographic studies were conducted in Almora, Nainital, and Haldwani regions of Uttarakhand to trace the origins as well as to gain a deeper understanding of this art form. A total of ten (N =10) Aipan designers were interviewed. Several interesting insights are revealed through these studies that show the potential to be incorporated as a VR experience. △ Less

Submitted 19 April, 2024; originally announced April 2024.

Comments: Demonstrated at ISMAR 2020

arXiv:2404.12261 [pdf]

Design And Flight Testing Of LQRi Attitude Control For Quadcopter UAV

Authors: Astik Srivastava, S. Indu, Richa Sharma

Abstract: This paper presents the design, implementation, and flight test results of linear quadratic integral regulator (LQRi) based attitude control for a quadcopter UAV. We present the derivation of the mathematical model for the kinematics and dynamics of the UAV, along with the linearized state space representation of the system about hover conditions. LQR and LQRi controllers are then designed to stab… ▽ More This paper presents the design, implementation, and flight test results of linear quadratic integral regulator (LQRi) based attitude control for a quadcopter UAV. We present the derivation of the mathematical model for the kinematics and dynamics of the UAV, along with the linearized state space representation of the system about hover conditions. LQR and LQRi controllers are then designed to stabilize the UAV in hover conditions and to track desired attitude commands. The controllers are then implemented onboard the Pixhawk flight controller and flight test results are discussed. Finally, the code related to this paper has been published open-source for replication and further research △ Less

Submitted 18 April, 2024; originally announced April 2024.

Comments: This research is still work under progress. The paper has been posted here for wider review by community

arXiv:2404.07304 [pdf, other]

We're Calling an Intervention: Exploring the Fundamental Hurdles in Adapting Language Models to Nonstandard Text

Authors: Aarohi Srivastava, David Chiang

Abstract: We present a suite of experiments that allow us to understand the underlying challenges of language model adaptation to nonstandard text. We do so by designing interventions that approximate several types of linguistic variation and their interactions with existing biases of language models. Applying our interventions during language model adaptation with varying size and nature of training data,… ▽ More We present a suite of experiments that allow us to understand the underlying challenges of language model adaptation to nonstandard text. We do so by designing interventions that approximate several types of linguistic variation and their interactions with existing biases of language models. Applying our interventions during language model adaptation with varying size and nature of training data, we gain important insights into when knowledge transfer can be successful, as well as the aspects of linguistic variation that are particularly difficult for language models to deal with. For instance, on text with character-level variation, performance improves with even a few training examples but approaches a plateau, suggesting that more data is not the solution. In contrast, on text with variation involving new words or meanings, far more data is needed, but it leads to a massive breakthrough in performance. Our findings reveal that existing models lack the necessary infrastructure to handle diverse forms of nonstandard text and linguistic variation, guiding the development of more resilient language modeling techniques for the future. We make the code for our interventions, which can be applied to any English text data, publicly available. △ Less

Submitted 15 June, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: Preprint

arXiv:2404.03843 [pdf, other]

Scaling Motion Forecasting Models with Ensemble Distillation

Authors: Scott Ettinger, Kratarth Goel, Avikalp Srivastava, Rami Al-Rfou

Abstract: Motion forecasting has become an increasingly critical component of autonomous robotic systems. Onboard compute budgets typically limit the accuracy of real-time systems. In this work we propose methods of improving motion forecasting systems subject to limited compute budgets by combining model ensemble and distillation techniques. The use of ensembles of deep neural networks has been shown to im… ▽ More Motion forecasting has become an increasingly critical component of autonomous robotic systems. Onboard compute budgets typically limit the accuracy of real-time systems. In this work we propose methods of improving motion forecasting systems subject to limited compute budgets by combining model ensemble and distillation techniques. The use of ensembles of deep neural networks has been shown to improve generalization accuracy in many application domains. We first demonstrate significant performance gains by creating a large ensemble of optimized single models. We then develop a generalized framework to distill motion forecasting model ensembles into small student models which retain high performance with a fraction of the computing cost. For this study we focus on the task of motion forecasting using real world data from autonomous driving systems. We develop ensemble models that are very competitive on the Waymo Open Motion Dataset (WOMD) and Argoverse leaderboards. From these ensembles, we train distilled student models which have high performance at a fraction of the compute costs. These experiments demonstrate distillation from ensembles as an effective method for improving accuracy of predictive models for robotic systems with limited compute budgets. △ Less

Submitted 13 May, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: 11 pages, 14 figures

arXiv:2403.17541 [pdf, other]

WordRobe: Text-Guided Generation of Textured 3D Garments

Authors: Astitva Srivastava, Pranav Manu, Amit Raj, Varun Jampani, Avinash Sharma

Abstract: In this paper, we tackle a new and challenging problem of text-driven generation of 3D garments with high-quality textures. We propose "WordRobe", a novel framework for the generation of unposed & textured 3D garment meshes from user-friendly text prompts. We achieve this by first learning a latent representation of 3D garments using a novel coarse-to-fine training strategy and a loss for latent d… ▽ More In this paper, we tackle a new and challenging problem of text-driven generation of 3D garments with high-quality textures. We propose "WordRobe", a novel framework for the generation of unposed & textured 3D garment meshes from user-friendly text prompts. We achieve this by first learning a latent representation of 3D garments using a novel coarse-to-fine training strategy and a loss for latent disentanglement, promoting better latent interpolation. Subsequently, we align the garment latent space to the CLIP embedding space in a weakly supervised manner, enabling text-driven 3D garment generation and editing. For appearance modeling, we leverage the zero-shot generation capability of ControlNet to synthesize view-consistent texture maps in a single feed-forward inference step, thereby drastically decreasing the generation time as compared to existing methods. We demonstrate superior performance over current SOTAs for learning 3D garment latent space, garment interpolation, and text-driven texture synthesis, supported by quantitative evaluation and qualitative user study. The unposed 3D garment meshes generated using WordRobe can be directly fed to standard cloth simulation & animation pipelines without any post-processing. △ Less

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.11009 [pdf, other]

DIALECTBENCH: A NLP Benchmark for Dialects, Varieties, and Closely-Related Languages

Authors: Fahim Faisal, Orevaoghene Ahia, Aarohi Srivastava, Kabir Ahuja, David Chiang, Yulia Tsvetkov, Antonios Anastasopoulos

Abstract: Language technologies should be judged on their usefulness in real-world use cases. An often overlooked aspect in natural language processing (NLP) research and evaluation is language variation in the form of non-standard dialects or language varieties (hereafter, varieties). Most NLP benchmarks are limited to standard language varieties. To fill this gap, we propose DIALECTBENCH, the first-ever l… ▽ More Language technologies should be judged on their usefulness in real-world use cases. An often overlooked aspect in natural language processing (NLP) research and evaluation is language variation in the form of non-standard dialects or language varieties (hereafter, varieties). Most NLP benchmarks are limited to standard language varieties. To fill this gap, we propose DIALECTBENCH, the first-ever large-scale benchmark for NLP on varieties, which aggregates an extensive set of task-varied variety datasets (10 text-level tasks covering 281 varieties). This allows for a comprehensive evaluation of NLP system performance on different language varieties. We provide substantial evidence of performance disparities between standard and non-standard language varieties, and we also identify language clusters with large performance divergence across tasks. We believe DIALECTBENCH provides a comprehensive view of the current state of NLP for language varieties and one step towards advancing it further. Code/data: https://github.com/ffaisal93/DialectBench △ Less

Submitted 16 March, 2024; originally announced March 2024.

Comments: Equal contribution: Fahim Faisal, Orevaoghene Ahia

arXiv:2403.02292 [pdf, other]

A Decade of Privacy-Relevant Android App Reviews: Large Scale Trends

Authors: Omer Akgul, Sai Teja Peddinti, Nina Taft, Michelle L. Mazurek, Hamza Harkous, Animesh Srivastava, Benoit Seguin

Abstract: We present an analysis of 12 million instances of privacy-relevant reviews publicly visible on the Google Play Store that span a 10 year period. By leveraging state of the art NLP techniques, we examine what users have been writing about privacy along multiple dimensions: time, countries, app types, diverse privacy topics, and even across a spectrum of emotions. We find consistent growth of privac… ▽ More We present an analysis of 12 million instances of privacy-relevant reviews publicly visible on the Google Play Store that span a 10 year period. By leveraging state of the art NLP techniques, we examine what users have been writing about privacy along multiple dimensions: time, countries, app types, diverse privacy topics, and even across a spectrum of emotions. We find consistent growth of privacy-relevant reviews, and explore topics that are trending (such as Data Deletion and Data Theft), as well as those on the decline (such as privacy-relevant reviews on sensitive permissions). We find that although privacy reviews come from more than 200 countries, 33 countries provide 90% of privacy reviews. We conduct a comparison across countries by examining the distribution of privacy topics a country's users write about, and find that geographic proximity is not a reliable indicator that nearby countries have similar privacy perspectives. We uncover some countries with unique patterns and explore those herein. Surprisingly, we uncover that it is not uncommon for reviews that discuss privacy to be positive (32%); many users express pleasure about privacy features within apps or privacy-focused apps. We also uncover some unexpected behaviors, such as the use of reviews to deliver privacy disclaimers to developers. Finally, we demonstrate the value of analyzing app reviews with our approach as a complement to existing methods for understanding users' perspectives about privacy △ Less

Submitted 15 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: This is the extended version of the paper accepted to USENIX Security 2024

arXiv:2403.01081 [pdf, other]

LAB: Large-Scale Alignment for ChatBots

Authors: Shivchander Sudalairaj, Abhishek Bhandwaldar, Aldo Pareja, Kai Xu, David D. Cox, Akash Srivastava

Abstract: This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. Leveraging a taxonomy-guided synthetic data generation process and a multi-phase tuning framework, LAB significantly reduces reliance on expensive human annotations and proprietary models like GPT-… ▽ More This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. Leveraging a taxonomy-guided synthetic data generation process and a multi-phase tuning framework, LAB significantly reduces reliance on expensive human annotations and proprietary models like GPT-4. We demonstrate that LAB-trained models can achieve competitive performance across several benchmarks compared to models trained with traditional human-annotated or GPT-4 generated synthetic data. Thus offering a scalable, cost-effective solution for enhancing LLM capabilities and instruction-following behaviors without the drawbacks of catastrophic forgetting, marking a step forward in the efficient training of LLMs for a wide range of applications. △ Less

Submitted 29 April, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

Comments: Corresponding Author: Akash Srivastava. Equal Contribution: Shivchander Sudalairaj, Abhishek Bhandwaldar, Aldo Pareja, Akash Srivastava, Code: https://github.com/instructlab

arXiv:2403.00026 [pdf, other]

Learning to Deliver: a Foundation Model for the Montreal Capacitated Vehicle Routing Problem

Authors: Samuel J. K. Chin, Matthias Winkenbach, Akash Srivastava

Abstract: In this paper, we present the Foundation Model for the Montreal Capacitated Vehicle Routing Problem (FM-MCVRP), a novel Deep Learning (DL) model that approximates high-quality solutions to a variant of the Capacitated Vehicle Routing Problem (CVRP) that characterizes many real-world applications. The so-called Montreal Capacitated Vehicle Routing Problem (MCVRP), first formally described by Bengio… ▽ More In this paper, we present the Foundation Model for the Montreal Capacitated Vehicle Routing Problem (FM-MCVRP), a novel Deep Learning (DL) model that approximates high-quality solutions to a variant of the Capacitated Vehicle Routing Problem (CVRP) that characterizes many real-world applications. The so-called Montreal Capacitated Vehicle Routing Problem (MCVRP), first formally described by Bengio et al. (2021), is defined on a fixed and finite graph, which is analogous to a city. Each MCVRP instance is essentially the sub-graph connecting a randomly sampled subset of the nodes in the fixed graph, which represent a set of potential addresses in a real-world delivery problem on a given day. Our work exploits this problem structure to frame the MCVRP as an analogous Natural Language Processing (NLP) task. Specifically, we leverage a Transformer architecture embedded in a Large Language Model (LLM) framework to train our model in a supervised manner on computationally inexpensive, sub-optimal MCVRP solutions obtained algorithmically. Through comprehensive computational experiments, we show that FM-MCVRP produces better MCVRP solutions than the training data and generalizes to larger sized problem instances not seen during training. Even when compared to near-optimal solutions from state-of-the-art heuristics, FM-MCVRP yields competitive results despite being trained on inferior data. For instance, for 400-customer problems, FM-MCVRP solutions on average fall within 2% of the benchmark. Our results further demonstrate that unlike prior works in the literature, FM-MCVRP is a unified model, which performs consistently and reliably on a range of problem instance sizes and parameter values such as the vehicle capacity. △ Less

Submitted 28 February, 2024; originally announced March 2024.

arXiv:2402.19464 [pdf, other]

Curiosity-driven Red-teaming for Large Language Models

Authors: Zhang-Wei Hong, Idan Shenfeld, Tsun-Hsuan Wang, Yung-Sung Chuang, Aldo Pareja, James Glass, Akash Srivastava, Pulkit Agrawal

Abstract: Large language models (LLMs) hold great potential for many natural language applications but risk generating incorrect or toxic content. To probe when an LLM generates unwanted content, the current paradigm is to recruit a \textit{red team} of human testers to design input prompts (i.e., test cases) that elicit undesirable responses from LLMs. However, relying solely on human testers is expensive… ▽ More Large language models (LLMs) hold great potential for many natural language applications but risk generating incorrect or toxic content. To probe when an LLM generates unwanted content, the current paradigm is to recruit a \textit{red team} of human testers to design input prompts (i.e., test cases) that elicit undesirable responses from LLMs. However, relying solely on human testers is expensive and time-consuming. Recent works automate red teaming by training a separate red team LLM with reinforcement learning (RL) to generate test cases that maximize the chance of eliciting undesirable responses from the target LLM. However, current RL methods are only able to generate a small number of effective test cases resulting in a low coverage of the span of prompts that elicit undesirable responses from the target LLM. To overcome this limitation, we draw a connection between the problem of increasing the coverage of generated test cases and the well-studied approach of curiosity-driven exploration that optimizes for novelty. Our method of curiosity-driven red teaming (CRT) achieves greater coverage of test cases while mantaining or increasing their effectiveness compared to existing methods. Our method, CRT successfully provokes toxic responses from LLaMA2 model that has been heavily fine-tuned using human preferences to avoid toxic outputs. Code is available at \url{https://github.com/Improbable-AI/curiosity_redteam} △ Less

Submitted 29 February, 2024; originally announced February 2024.

Comments: Published at ICLR 2024

arXiv:2402.19052 [pdf]

Exploring the Efficacy of Large Language Models in Summarizing Mental Health Counseling Sessions: A Benchmark Study

Authors: Prottay Kumar Adhikary, Aseem Srivastava, Shivani Kumar, Salam Michael Singh, Puneet Manuja, **i K Gopinath, Vijay Krishnan, Swati Kedia, Koushik Sinha Deb, Tanmoy Chakraborty

Abstract: Comprehensive summaries of sessions enable an effective continuity in mental health counseling, facilitating informed therapy planning. Yet, manual summarization presents a significant challenge, diverting experts' attention from the core counseling process. This study evaluates the effectiveness of state-of-the-art Large Language Models (LLMs) in selectively summarizing various components of ther… ▽ More Comprehensive summaries of sessions enable an effective continuity in mental health counseling, facilitating informed therapy planning. Yet, manual summarization presents a significant challenge, diverting experts' attention from the core counseling process. This study evaluates the effectiveness of state-of-the-art Large Language Models (LLMs) in selectively summarizing various components of therapy sessions through aspect-based summarization, aiming to benchmark their performance. We introduce MentalCLOUDS, a counseling-component guided summarization dataset consisting of 191 counseling sessions with summaries focused on three distinct counseling components (aka counseling aspects). Additionally, we assess the capabilities of 11 state-of-the-art LLMs in addressing the task of component-guided summarization in counseling. The generated summaries are evaluated quantitatively using standard summarization metrics and verified qualitatively by mental health professionals. Our findings demonstrate the superior performance of task-specific LLMs such as MentalLlama, Mistral, and MentalBART in terms of standard quantitative metrics such as Rouge-1, Rouge-2, Rouge-L, and BERTScore across all aspects of counseling components. Further, expert evaluation reveals that Mistral supersedes both MentalLlama and MentalBART based on six parameters -- affective attitude, burden, ethicality, coherence, opportunity costs, and perceived effectiveness. However, these models share the same weakness by demonstrating a potential for improvement in the opportunity costs and perceived effectiveness metrics. △ Less

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.07173 [pdf, other]

INSITE: labelling medical images using submodular functions and semi-supervised data programming

Authors: Akshat Gautam, Anurag Shandilya, Akshit Srivastava, Venkatapathy Subramanian, Ganesh Ramakrishnan, Kshitij Jadhav

Abstract: The necessity of large amounts of labeled data to train deep models, especially in medical imaging creates an implementation bottleneck in resource-constrained settings. In Insite (labelINg medical imageS usIng submodular funcTions and sEmi-supervised data programming) we apply informed subset selection to identify a small number of most representative or diverse images from a huge pool of unlabel… ▽ More The necessity of large amounts of labeled data to train deep models, especially in medical imaging creates an implementation bottleneck in resource-constrained settings. In Insite (labelINg medical imageS usIng submodular funcTions and sEmi-supervised data programming) we apply informed subset selection to identify a small number of most representative or diverse images from a huge pool of unlabelled data subsequently annotated by a domain expert. The newly annotated images are then used as exemplars to develop several data programming-driven labeling functions. These labelling functions output a predicted-label and a similarity score when given an unlabelled image as an input. A consensus is brought amongst the outputs of these labeling functions by using a label aggregator function to assign the final predicted label to each unlabelled data point. We demonstrate that informed subset selection followed by semi-supervised data programming methods using these images as exemplars perform better than other state-of-the-art semi-supervised methods. Further, for the first time we demonstrate that this can be achieved through a small set of images used as exemplars. △ Less

Submitted 11 February, 2024; originally announced February 2024.

arXiv:2402.06377 [pdf, other]

High-Precision Geosteering via Reinforcement Learning and Particle Filters

Authors: Ressi Bonti Muhammad, Apoorv Srivastava, Sergey Alyaev, Reidar Brumer Bratvold, Daniel M. Tartakovsky

Abstract: Geosteering, a key component of drilling operations, traditionally involves manual interpretation of various data sources such as well-log data. This introduces subjective biases and inconsistent procedures. Academic attempts to solve geosteering decision optimization with greedy optimization and Approximate Dynamic Programming (ADP) showed promise but lacked adaptivity to realistic diverse scenar… ▽ More Geosteering, a key component of drilling operations, traditionally involves manual interpretation of various data sources such as well-log data. This introduces subjective biases and inconsistent procedures. Academic attempts to solve geosteering decision optimization with greedy optimization and Approximate Dynamic Programming (ADP) showed promise but lacked adaptivity to realistic diverse scenarios. Reinforcement learning (RL) offers a solution to these challenges, facilitating optimal decision-making through reward-based iterative learning. State estimation methods, e.g., particle filter (PF), provide a complementary strategy for geosteering decision-making based on online information. We integrate an RL-based geosteering with PF to address realistic geosteering scenarios. Our framework deploys PF to process real-time well-log data to estimate the location of the well relative to the stratigraphic layers, which then informs the RL-based decision-making process. We compare our method's performance with that of using solely either RL or PF. Our findings indicate a synergy between RL and PF in yielding optimized geosteering decisions. △ Less

Submitted 9 February, 2024; originally announced February 2024.

Comments: 40 pages

arXiv:2401.03390 [pdf, other]

Dynamics-based Feature Augmentation of Graph Neural Networks for Variant Emergence Prediction

Authors: Majd Al Aawar, Srikar Mutnuri, Mansooreh Montazerin, Ajitesh Srivastava

Abstract: During the COVID-19 pandemic, a major driver of new surges has been the emergence of new variants. When a new variant emerges in one or more countries, other nations monitor its spread in preparation for its potential arrival. The impact of the new variant and the timings of epidemic peaks in a country highly depend on when the variant arrives. The current methods for predicting the spread of new… ▽ More During the COVID-19 pandemic, a major driver of new surges has been the emergence of new variants. When a new variant emerges in one or more countries, other nations monitor its spread in preparation for its potential arrival. The impact of the new variant and the timings of epidemic peaks in a country highly depend on when the variant arrives. The current methods for predicting the spread of new variants rely on statistical modeling, however, these methods work only when the new variant has already arrived in the region of interest and has a significant prevalence. Can we predict when a variant existing elsewhere will arrive in a given region? To address this question, we propose a variant-dynamics-informed Graph Neural Network (GNN) approach. First, we derive the dynamics of variant prevalence across pairs of regions (countries) that apply to a large class of epidemic models. The dynamics motivate the introduction of certain features in the GNN. We demonstrate that our proposed dynamics-informed GNN outperforms all the baselines, including the currently pervasive framework of Physics-Informed Neural Networks (PINNs). To advance research in this area, we introduce a benchmarking tool to assess a user-defined model's prediction performance across 87 countries and 36 variants. △ Less

Submitted 28 May, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

arXiv:2401.03108 [pdf, other]

Dress-Me-Up: A Dataset & Method for Self-Supervised 3D Garment Retargeting

Authors: Shanthika Naik, Kunwar Singh, Astitva Srivastava, Dhawal Sirikonda, Amit Raj, Varun Jampani, Avinash Sharma

Abstract: We propose a novel self-supervised framework for retargeting non-parameterized 3D garments onto 3D human avatars of arbitrary shapes and poses, enabling 3D virtual try-on (VTON). Existing self-supervised 3D retargeting methods only support parametric and canonical garments, which can only be draped over parametric body, e.g. SMPL. To facilitate the non-parametric garments and body, we propose a no… ▽ More We propose a novel self-supervised framework for retargeting non-parameterized 3D garments onto 3D human avatars of arbitrary shapes and poses, enabling 3D virtual try-on (VTON). Existing self-supervised 3D retargeting methods only support parametric and canonical garments, which can only be draped over parametric body, e.g. SMPL. To facilitate the non-parametric garments and body, we propose a novel method that introduces Isomap Embedding based correspondences matching between the garment and the human body to get a coarse alignment between the two meshes. We perform neural refinement of the coarse alignment in a self-supervised setting. Further, we leverage a Laplacian detail integration method for preserving the inherent details of the input garment. For evaluating our 3D non-parametric garment retargeting framework, we propose a dataset of 255 real-world garments with realistic noise and topological deformations. The dataset contains $44$ unique garments worn by 15 different subjects in 5 distinctive poses, captured using a multi-view RGBD capture setup. We show superior retargeting quality on non-parametric garments and human avatars over existing state-of-the-art methods, acting as the first-ever baseline on the proposed dataset for non-parametric 3D garment retargeting. △ Less

Submitted 5 January, 2024; originally announced January 2024.

arXiv:2401.00338 [pdf]

A Rapid Sco** Review and Conceptual Analysis of the Educational Metaverse in the Global South: Socio-Technical Perspectives

Authors: Anmol Srivastava

Abstract: This paper presents a conceptual insight into the Design of the Metaverse to facilitate educational transformation in selected develo** nations within the Global South regions, e.g., India. These regions are often afflicted with socio-economic challenges but rich in cultural diversity. By utilizing a socio-technical design approach, this study explores the specific needs and opportunities presen… ▽ More This paper presents a conceptual insight into the Design of the Metaverse to facilitate educational transformation in selected develo** nations within the Global South regions, e.g., India. These regions are often afflicted with socio-economic challenges but rich in cultural diversity. By utilizing a socio-technical design approach, this study explores the specific needs and opportunities presented by these diverse settings. A rapid sco** review of the scant existing literature is conducted to provide fundamental insights. A novel design methodology was formulated that utilized ChatGPT for ideation, brainstorming, and literature survey query generation. This paper aims not only to shed light on the educational possibilities enabled by the Metaverse but also to highlight design considerations unique to the Global South. △ Less

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2312.17270 [pdf, other]

Anticipated Network Surveillance -- An extrapolated study to predict cyber-attacks using Machine Learning and Data Analytics

Authors: Aviral Srivastava, Dhyan Thakkar, Dr. Sharda Valiveti, Dr. Pooja Shah, Dr. Gaurang Raval

Abstract: Machine learning and data mining techniques are utiized for enhancement of the security of any network. Researchers used machine learning for pattern detection, anomaly detection, dynamic policy setting, etc. The methods allow the program to learn from data and make decisions without human intervention, consuming a huge training period and computation power. This paper discusses a novel technique… ▽ More Machine learning and data mining techniques are utiized for enhancement of the security of any network. Researchers used machine learning for pattern detection, anomaly detection, dynamic policy setting, etc. The methods allow the program to learn from data and make decisions without human intervention, consuming a huge training period and computation power. This paper discusses a novel technique to predict an upcoming attack in a network based on several data parameters. The dataset is continuous in real-time implementation. The proposed model comprises dataset pre-processing, and training, followed by the testing phase. Based on the results of the testing phase, the best model is selected using which, event class which may lead to an attack is extracted. The event statistics are used for attack △ Less

Submitted 26 December, 2023; originally announced December 2023.

arXiv:2312.16163 [pdf, other]

Age of Information in Gossip Networks: A Friendly Introduction and Literature Survey

Authors: Priyanka Kaswan, Purbesh Mitra, Arunabh Srivastava, Sennur Ulukus

Abstract: Gossi** is a communication mechanism, used for fast information dissemination in a network, where each node of the network randomly shares its information with the neighboring nodes. To characterize the notion of fastness in the context of gossip networks, age of information (AoI) is used as a timeliness metric. In this article, we summarize the recent works related to timely gossi** in a netw… ▽ More Gossi** is a communication mechanism, used for fast information dissemination in a network, where each node of the network randomly shares its information with the neighboring nodes. To characterize the notion of fastness in the context of gossip networks, age of information (AoI) is used as a timeliness metric. In this article, we summarize the recent works related to timely gossi** in a network. We start with the introduction of randomized gossip algorithms as an epidemic algorithm for database maintenance, and how the gossi** literature was later developed in the context of rumor spreading, message passing and distributed mean estimation. Then, we motivate the need for timely gossi** in applications such as source tracking and decentralized learning. We evaluate timeliness scaling of gossi** in various network topologies, such as, fully connected, ring, grid, generalized ring, hierarchical, and sparse asymmetric networks. We discuss age-aware gossi** and the higher order moments of the age process. We also consider different variations of gossi** in networks, such as, file slicing and network coding, reliable and unreliable sources, information mutation, different adversarial actions in gossi**, and energy harvesting sensors. Finally, we conclude this article with a few open problems and future directions in timely gossi**. △ Less

Submitted 26 December, 2023; originally announced December 2023.

arXiv:2311.16766 [pdf, other]

Rescuing referral failures during automated diagnosis of domain-shifted medical images

Authors: Anuj Srivastava, Karm Patel, Pradeep Shenoy, Devarajan Sridharan

Abstract: The success of deep learning models deployed in the real world depends critically on their ability to generalize well across diverse data domains. Here, we address a fundamental challenge with selective classification during automated diagnosis with domain-shifted medical images. In this scenario, models must learn to avoid making predictions when label confidence is low, especially when tested wi… ▽ More The success of deep learning models deployed in the real world depends critically on their ability to generalize well across diverse data domains. Here, we address a fundamental challenge with selective classification during automated diagnosis with domain-shifted medical images. In this scenario, models must learn to avoid making predictions when label confidence is low, especially when tested with samples far removed from the training set (covariate shift). Such uncertain cases are typically referred to the clinician for further analysis and evaluation. Yet, we show that even state-of-the-art domain generalization approaches fail severely during referral when tested on medical images acquired from a different demographic or using a different technology. We examine two benchmark diagnostic medical imaging datasets exhibiting strong covariate shifts: i) diabetic retinopathy prediction with retinal fundus images and ii) multilabel disease prediction with chest X-ray images. We show that predictive uncertainty estimates do not generalize well under covariate shifts leading to non-monotonic referral curves, and severe drops in performance (up to 50%) at high referral rates (>70%). We evaluate novel combinations of robust generalization and post hoc referral approaches, that rescue these failures and achieve significant performance improvements, typically >10%, over baseline methods. Our study identifies a critical challenge with referral in domain-shifted medical images and finds key applications in reliable, automated disease diagnosis. △ Less

Submitted 28 November, 2023; originally announced November 2023.

arXiv:2311.00116 [pdf, other]

BERTwich: Extending BERT's Capabilities to Model Dialectal and Noisy Text

Authors: Aarohi Srivastava, David Chiang

Abstract: Real-world NLP applications often deal with nonstandard text (e.g., dialectal, informal, or misspelled text). However, language models like BERT deteriorate in the face of dialect variation or noise. How do we push BERT's modeling capabilities to encompass nonstandard text? Fine-tuning helps, but it is designed for specializing a model to a task and does not seem to bring about the deeper, more pe… ▽ More Real-world NLP applications often deal with nonstandard text (e.g., dialectal, informal, or misspelled text). However, language models like BERT deteriorate in the face of dialect variation or noise. How do we push BERT's modeling capabilities to encompass nonstandard text? Fine-tuning helps, but it is designed for specializing a model to a task and does not seem to bring about the deeper, more pervasive changes needed to adapt a model to nonstandard language. In this paper, we introduce the novel idea of sandwiching BERT's encoder stack between additional encoder layers trained to perform masked language modeling on noisy text. We find that our approach, paired with recent work on including character-level noise in fine-tuning data, can promote zero-shot transfer to dialectal text, as well as reduce the distance in the embedding space between words and their noisy counterparts. △ Less

Submitted 31 October, 2023; originally announced November 2023.

Comments: Accepted for publication in Findings of the ACL: EMNLP 2023

arXiv:2310.18820 [pdf, other]

Demand-Side Threats to Power Grid Operations from IoT-Enabled Edge

Authors: Subhash Lakshminarayana, Carsten Maple, Andrew Larkins, Daryl Flack, Christopher Few, Anurag. K. Srivastava

Abstract: The growing adoption of Internet-of-Things (IoT)-enabled energy smart appliances (ESAs) at the consumer end, such as smart heat pumps, electric vehicle chargers, etc., is seen as key to enabling demand-side response (DSR) services. However, these smart appliances are often poorly engineered from a security point of view and present a new threat to power grid operations. They may become convenient… ▽ More The growing adoption of Internet-of-Things (IoT)-enabled energy smart appliances (ESAs) at the consumer end, such as smart heat pumps, electric vehicle chargers, etc., is seen as key to enabling demand-side response (DSR) services. However, these smart appliances are often poorly engineered from a security point of view and present a new threat to power grid operations. They may become convenient entry points for malicious parties to gain access to the system and disrupt important grid operations by abruptly changing the demand. Unlike utility-side and SCADA assets, ESAs are not monitored continuously due to their large numbers and the lack of extensive monitoring infrastructure at consumer sites. This article presents an in-depth analysis of the demand side threats to power grid operations including (i) an overview of the vulnerabilities in ESAs and the wider risk from the DSR ecosystem and (ii) key factors influencing the attack impact on power grid operations. Finally, it presents measures to improve the cyber-physical resilience of power grids, putting them in the context of ongoing efforts from the industry and regulatory bodies worldwide. △ Less

Submitted 28 October, 2023; originally announced October 2023.

arXiv:2310.10902 [pdf, other]

Reuse Kernels or Activations? A Flexible Dataflow for Low-latency Spectral CNN Acceleration

Authors: Yue Niu, Rajgopal Kannan, Ajitesh Srivastava, Viktor Prasanna

Abstract: Spectral-domain CNNs have been shown to be more efficient than traditional spatial CNNs in terms of reducing computation complexity. However they come with a `kernel explosion' problem that, even after compression (pruning), imposes a high memory burden and off-chip bandwidth requirement for kernel access. This creates a performance gap between the potential acceleration offered by compression and… ▽ More Spectral-domain CNNs have been shown to be more efficient than traditional spatial CNNs in terms of reducing computation complexity. However they come with a `kernel explosion' problem that, even after compression (pruning), imposes a high memory burden and off-chip bandwidth requirement for kernel access. This creates a performance gap between the potential acceleration offered by compression and actual FPGA implementation performance, especially for low-latency CNN inference. In this paper, we develop a principled approach to overcoming this performance gap and designing a low-latency, low-bandwidth, spectral sparse CNN accelerator on FPGAs. First, we analyze the bandwidth-storage tradeoff of sparse convolutional layers and locate communication bottlenecks. We then develop a dataflow for flexibly optimizing data reuse in different layers to minimize off-chip communication. Finally, we propose a novel scheduling algorithm to optimally schedule the on-chip memory access of multiple sparse kernels and minimize read conflicts. On a state-of-the-art FPGA platform, our design reduces data transfers by 42\% with DSP utilization up to 90\% and achieves inference latency of 9 ms for VGG16, compared to the baseline state-of-the-art latency of 68 ms. △ Less

Submitted 16 October, 2023; originally announced October 2023.

Comments: 11 pages, 11 figures Accepted to ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA) 2020

arXiv:2310.09729 [pdf, other]

Private Synthetic Data Meets Ensemble Learning

Authors: Haoyuan Sun, Navid Azizan, Akash Srivastava, Hao Wang

Abstract: When machine learning models are trained on synthetic data and then deployed on real data, there is often a performance drop due to the distribution shift between synthetic and real data. In this paper, we introduce a new ensemble strategy for training downstream models, with the goal of enhancing their performance when used on real data. We generate multiple synthetic datasets by applying a diffe… ▽ More When machine learning models are trained on synthetic data and then deployed on real data, there is often a performance drop due to the distribution shift between synthetic and real data. In this paper, we introduce a new ensemble strategy for training downstream models, with the goal of enhancing their performance when used on real data. We generate multiple synthetic datasets by applying a differential privacy (DP) mechanism several times in parallel and then ensemble the downstream models trained on these datasets. While each synthetic dataset might deviate more from the real data distribution, they collectively increase sample diversity. This may enhance the robustness of downstream models against distribution shifts. Our extensive experiments reveal that while ensembling does not enhance downstream performance (compared with training a single model) for models trained on synthetic data generated by marginal-based or workload-based DP mechanisms, our proposed ensemble strategy does improve the performance for models trained using GAN-based DP mechanisms in terms of both accuracy and calibration of downstream models. △ Less

Submitted 15 October, 2023; originally announced October 2023.

arXiv:2310.06257 [pdf, other]

SCAR: Power Side-Channel Analysis at RTL-Level

Authors: Amisha Srivastava, Sanjay Das, Navnil Choudhury, Rafail Psiakis, Pedro Henrique Silva, Debjit Pal, Kanad Basu

Abstract: Power side-channel attacks exploit the dynamic power consumption of cryptographic operations to leak sensitive information of encryption hardware. Therefore, it is necessary to conduct power side-channel analysis for assessing the susceptibility of cryptographic systems and mitigating potential risks. Existing power side-channel analysis primarily focuses on post-silicon implementations, which are… ▽ More Power side-channel attacks exploit the dynamic power consumption of cryptographic operations to leak sensitive information of encryption hardware. Therefore, it is necessary to conduct power side-channel analysis for assessing the susceptibility of cryptographic systems and mitigating potential risks. Existing power side-channel analysis primarily focuses on post-silicon implementations, which are inflexible in addressing design flaws, leading to costly and time-consuming post-fabrication design re-spins. Hence, pre-silicon power side-channel analysis is required for early detection of vulnerabilities to improve design robustness. In this paper, we introduce SCAR, a novel pre-silicon power side-channel analysis framework based on Graph Neural Networks (GNN). SCAR converts register-transfer level (RTL) designs of encryption hardware into control-data flow graphs and use that to detect the design modules susceptible to side-channel leakage. Furthermore, we incorporate a deep learning-based explainer in SCAR to generate quantifiable and human-accessible explanation of our detection and localization decisions. We have also developed a fortification component as a part of SCAR that uses large-language models (LLM) to automatically generate and insert additional design code at the localized zone to shore up the side-channel leakage. When evaluated on popular encryption algorithms like AES, RSA, and PRESENT, and postquantum cryptography algorithms like Saber and CRYSTALS-Kyber, SCAR, achieves up to 94.49% localization accuracy, 100% precision, and 90.48% recall. Additionally, through explainability analysis, SCAR reduces features for GNN model training by 57% while maintaining comparable accuracy. We believe that SCAR will transform the security-critical hardware design cycle, resulting in faster design closure at a reduced design cost. △ Less

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2310.04413 [pdf, other]

Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets

Authors: Zhang-Wei Hong, Aviral Kumar, Sathwik Karnik, Abhishek Bhandwaldar, Akash Srivastava, Joni Pajarinen, Romain Laroche, Abhishek Gupta, Pulkit Agrawal

Abstract: Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning techniques such as behavior cloning is to find a policy that achieves a higher average return than the trajectories constituting the dataset. However, we empirica… ▽ More Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning techniques such as behavior cloning is to find a policy that achieves a higher average return than the trajectories constituting the dataset. However, we empirically find that when a dataset is dominated by suboptimal trajectories, state-of-the-art offline RL algorithms do not substantially improve over the average return of trajectories in the dataset. We argue this is due to an assumption made by current offline RL algorithms of staying close to the trajectories in the dataset. If the dataset primarily consists of sub-optimal trajectories, this assumption forces the policy to mimic the suboptimal actions. We overcome this issue by proposing a sampling strategy that enables the policy to only be constrained to ``good data" rather than all actions in the dataset (i.e., uniform sampling). We present a realization of the sampling strategy and an algorithm that can be used as a plug-and-play module in standard offline RL algorithms. Our evaluation demonstrates significant performance gains in 72 imbalanced datasets, D4RL dataset, and across three different offline RL algorithms. Code is available at https://github.com/Improbable-AI/dw-offline-rl. △ Less

Submitted 11 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: Accepted NeurIPS 2023

Journal ref: NeurIPS 2023

arXiv:2310.02399 [pdf, other]

Can 5G NR Sidelink communications support wireless augmented reality?

Authors: Ashutosh Srivastava, Qing Zhao, Yi Lu, ** Wang, Qi Qu, Zhu Ji, Yee Sin Chan, Shivendra S. Panwar

Abstract: Smart glasses that support augmented reality (AR) have the potential to become the consumer's primary medium of connecting to the future internet. For the best quality of user experience, AR glasses must have a small form factor and long battery life, while satisfying the data rate and latency requirements of AR applications. To extend the AR glasses' battery life, the computation and processing i… ▽ More Smart glasses that support augmented reality (AR) have the potential to become the consumer's primary medium of connecting to the future internet. For the best quality of user experience, AR glasses must have a small form factor and long battery life, while satisfying the data rate and latency requirements of AR applications. To extend the AR glasses' battery life, the computation and processing involved in AR may be offloaded to a companion device, such as a smartphone, through a wireless connection. Sidelink (SL), i.e., the D2D communication interface of 5G NR, is a potential candidate for this wireless link. In this paper, we use system-level simulations to analyze the feasibility of NR SL for supporting AR. Our simulator incorporates the PHY layer structure and MAC layer resource scheduling of 3GPP SL, standard 3GPP channel models, and MCS configurations. Our results suggest that the current SL standard specifications are insufficient for high-end AR use cases with heavy interaction but can support simpler previews and file transfers. We further propose two enhancements to SL resource allocation, which have the potential to offer significant performance improvements for AR applications. △ Less

Submitted 3 October, 2023; originally announced October 2023.

Comments: 7 pages, 7 figures, accepted for publication in 2023 IEEE Global Communications Conference: Mobile and Wireless Networks (Globecom 2023 MWN), Kuala Lumpur, Malaysia, Dec. 2023

arXiv:2310.01077 [pdf, other]

On Fulfilling the Exigent Need for Automating and Modernizing Logistics Infrastructure in India: Enabling AI-based Integration, Digitalization, and Smart Automation of Industrial Parks and Robotic Warehouses

Authors: Shaurya Shriyam, Prashant Palkar, Amber Srivastava

Abstract: To stay competitive, the Low- or Middle-Income Countries (LMICs) need to embrace Industry 4.0 and Logistics 4.0. This requires government-level interventions and policy-making to incentivize quality product solutions and drive innovation in traditionally resistant economic sectors. In this position paper, we support the establishment of Smart Industrial Parks (SIPs) with a focus on enhancing opera… ▽ More To stay competitive, the Low- or Middle-Income Countries (LMICs) need to embrace Industry 4.0 and Logistics 4.0. This requires government-level interventions and policy-making to incentivize quality product solutions and drive innovation in traditionally resistant economic sectors. In this position paper, we support the establishment of Smart Industrial Parks (SIPs) with a focus on enhancing operational efficiencies and bringing together MSMEs and startups targeting niche clientele with innovative Industry 4.0 solutions. SIPs along with the phased deployment of well-planned robotic automation technologies shall enable bringing down India's untenable logistics costs. Toward the successful execution of SIPs, we are required to implement the efficient allocation of manufacturing resources and capabilities within SIPs. Thus, we emphasize the importance of efficient resource utilization, collaboration, and technology adoption in industrial parks to promote industrial development and economic growth. We advocate the use of a cloud-based cyber-physical system for real-time data access and analysis in SIPs. Such centralized cloud-based monitoring of factory floors, warehouses, and industrial units using IoT infrastructure shall improve decision-making, efficiency, and safety. Digital Twins (DTs), which are cyber-replicas of physical systems, could play a significant role in enabling simulation, optimization, and real-time monitoring of smart manufacturing and distributed manufacturing systems. However, there are several challenges involved in implementing DTs in distributed manufacturing systems, such as defining data schemas and collaboration protocols, ensuring interoperability, the need for effective authentication technology, distributed machine learning models, and scalability to manage multiple DTs. △ Less

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2310.00892 [pdf, other]

No Offense Taken: Eliciting Offensiveness from Language Models

Authors: Anugya Srivastava, Rahul Ahuja, Rohith Mukku

Abstract: This work was completed in May 2022. For safe and reliable deployment of language models in the real world, testing needs to be robust. This robustness can be characterized by the difficulty and diversity of the test cases we evaluate these models on. Limitations in human-in-the-loop test case generation has prompted an advent of automated test case generation approaches. In particular, we focus… ▽ More This work was completed in May 2022. For safe and reliable deployment of language models in the real world, testing needs to be robust. This robustness can be characterized by the difficulty and diversity of the test cases we evaluate these models on. Limitations in human-in-the-loop test case generation has prompted an advent of automated test case generation approaches. In particular, we focus on Red Teaming Language Models with Language Models by Perez et al.(2022). Our contributions include develo** a pipeline for automated test case generation via red teaming that leverages publicly available smaller language models (LMs), experimenting with different target LMs and red classifiers, and generating a corpus of test cases that can help in eliciting offensive responses from widely deployed LMs and identifying their failure modes. △ Less

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2309.15067 [pdf, other]

Logic Locking based Trojans: A Friend Turns Foe

Authors: Yuntao Liu, Aruna Jayasena, Prabhat Mishra, Ankur Srivastava

Abstract: Logic locking and hardware Trojans are two fields in hardware security that have been mostly developed independently from each other. In this paper, we identify the relationship between these two fields. We find that a common structure that exists in many logic locking techniques has desirable properties of hardware Trojans (HWT). We then construct a novel type of HWT, called Trojans based on Logi… ▽ More Logic locking and hardware Trojans are two fields in hardware security that have been mostly developed independently from each other. In this paper, we identify the relationship between these two fields. We find that a common structure that exists in many logic locking techniques has desirable properties of hardware Trojans (HWT). We then construct a novel type of HWT, called Trojans based on Logic Locking (TroLL), in a way that can evade state-of-the-art ATPG-based HWT detection techniques. In an effort to detect TroLL, we propose customization of existing state-of-the-art ATPG-based HWT detection approaches as well as adapting the SAT-based attacks on logic locking to HWT detection. In our experiments, we use random sampling as reference. It is shown that the customized ATPG-based approaches are the best performing but only offer limited improvement over random sampling. Moreover, their efficacy also diminishes as TroLL's triggers become longer, i.e., have more bits specified). We thereby highlight the need to find a scalable HWT detection approach for TroLL. △ Less

Submitted 26 September, 2023; originally announced September 2023.

Comments: 9 pages, double column, 8 figures, IEEE format

arXiv:2309.13136 [pdf, other]

Contextual Emotion Estimation from Image Captions

Authors: Vera Yang, Archita Srivastava, Yasaman Etesam, Chuxuan Zhang, Angelica Lim

Abstract: Emotion estimation in images is a challenging task, typically using computer vision methods to directly estimate people's emotions using face, body pose and contextual cues. In this paper, we explore whether Large Language Models (LLMs) can support the contextual emotion estimation task, by first captioning images, then using an LLM for inference. First, we must understand: how well do LLMs percei… ▽ More Emotion estimation in images is a challenging task, typically using computer vision methods to directly estimate people's emotions using face, body pose and contextual cues. In this paper, we explore whether Large Language Models (LLMs) can support the contextual emotion estimation task, by first captioning images, then using an LLM for inference. First, we must understand: how well do LLMs perceive human emotions? And which parts of the information enable them to determine emotions? One initial challenge is to construct a caption that describes a person within a scene with information relevant for emotion perception. Towards this goal, we propose a set of natural language descriptors for faces, bodies, interactions, and environments. We use them to manually generate captions and emotion annotations for a subset of 331 images from the EMOTIC dataset. These captions offer an interpretable representation for emotion estimation, towards understanding how elements of a scene affect emotion perception in LLMs and beyond. Secondly, we test the capability of a large language model to infer an emotion from the resulting image captions. We find that GPT-3.5, specifically the text-davinci-003 model, provides surprisingly reasonable emotion predictions consistent with human annotations, but accuracy can depend on the emotion concept. Overall, the results suggest promise in the image captioning and LLM approach. △ Less

Submitted 22 September, 2023; originally announced September 2023.

Comments: Accepted to ACII 2023. Project page: http://rosielab.github.io/emotion-captions/

arXiv:2309.10694 [pdf, other]

"With Great Power Comes Great Responsibility!": Student and Instructor Perspectives on the influence of LLMs on Undergraduate Engineering Education

Authors: Ishika Joshi, Ritvik Budhiraja, Pranav Deepak Tanna, Lovenya Jain, Mihika Deshpande, Arjun Srivastava, Srinivas Rallapalli, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar

Abstract: The rise in popularity of Large Language Models (LLMs) has prompted discussions in academic circles, with students exploring LLM-based tools for coursework inquiries and instructors exploring them for teaching and research. Even though a lot of work is underway to create LLM-based tools tailored for students and instructors, there is a lack of comprehensive user studies that capture the perspectiv… ▽ More The rise in popularity of Large Language Models (LLMs) has prompted discussions in academic circles, with students exploring LLM-based tools for coursework inquiries and instructors exploring them for teaching and research. Even though a lot of work is underway to create LLM-based tools tailored for students and instructors, there is a lack of comprehensive user studies that capture the perspectives of students and instructors regarding LLMs. This paper addresses this gap by conducting surveys and interviews within undergraduate engineering universities in India. Using 1306 survey responses among students, 112 student interviews, and 27 instructor interviews around the academic usage of ChatGPT (a popular LLM), this paper offers insights into the current usage patterns, perceived benefits, threats, and challenges, as well as recommendations for enhancing the adoption of LLMs among students and instructors. These insights are further utilized to discuss the practical implications of LLMs in undergraduate engineering education and beyond. △ Less

Submitted 30 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: Under review

arXiv:2309.08587 [pdf, other]

Compositional Foundation Models for Hierarchical Planning

Authors: Anurag Ajay, Seungwook Han, Yilun Du, Shuang Li, Abhi Gupta, Tommi Jaakkola, Josh Tenenbaum, Leslie Kaelbling, Akash Srivastava, Pulkit Agrawal

Abstract: To make effective decisions in novel environments with long-horizon goals, it is crucial to engage in hierarchical reasoning across spatial and temporal scales. This entails planning abstract subgoal sequences, visually reasoning about the underlying plans, and executing actions in accordance with the devised plan through visual-motor control. We propose Compositional Foundation Models for Hierarc… ▽ More To make effective decisions in novel environments with long-horizon goals, it is crucial to engage in hierarchical reasoning across spatial and temporal scales. This entails planning abstract subgoal sequences, visually reasoning about the underlying plans, and executing actions in accordance with the devised plan through visual-motor control. We propose Compositional Foundation Models for Hierarchical Planning (HiP), a foundation model which leverages multiple expert foundation model trained on language, vision and action data individually jointly together to solve long-horizon tasks. We use a large language model to construct symbolic plans that are grounded in the environment through a large video diffusion model. Generated video plans are then grounded to visual-motor control, through an inverse dynamics model that infers actions from generated videos. To enable effective reasoning within this hierarchy, we enforce consistency between the models via iterative refinement. We illustrate the efficacy and adaptability of our approach in three different long-horizon table-top manipulation tasks. △ Less

Submitted 21 September, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: Website: https://hierarchical-planning-foundation-model.github.io/

arXiv:2309.03579 [pdf, other]

DTW+S: Shape-based Comparison of Time-series with Ordered Local Trend

Authors: Ajitesh Srivastava

Abstract: Measuring distance or similarity between time-series data is a fundamental aspect of many applications including classification, clustering, and ensembling/alignment. Existing measures may fail to capture similarities among local trends (shapes) and may even produce misleading results. Our goal is to develop a measure that looks for similar trends occurring around similar times and is easily inter… ▽ More Measuring distance or similarity between time-series data is a fundamental aspect of many applications including classification, clustering, and ensembling/alignment. Existing measures may fail to capture similarities among local trends (shapes) and may even produce misleading results. Our goal is to develop a measure that looks for similar trends occurring around similar times and is easily interpretable for researchers in applied domains. This is particularly useful for applications where time-series have a sequence of meaningful local trends that are ordered, such as in epidemics (a surge to an increase to a peak to a decrease). We propose a novel measure, DTW+S, which creates an interpretable "closeness-preserving" matrix representation of the time-series, where each column represents local trends, and then it applies Dynamic Time War** to compute distances between these matrices. We present a theoretical analysis that supports the choice of this representation. We demonstrate the utility of DTW+S in several tasks. For the clustering of epidemic curves, we show that DTW+S is the only measure able to produce good clustering compared to the baselines. For ensemble building, we propose a combination of DTW+S and barycenter averaging that results in the best preservation of characteristics of the underlying trajectories. We also demonstrate that our approach results in better classification compared to Dynamic Time War** for a class of datasets, particularly when local trends rather than scale play a decisive role. △ Less

Submitted 29 November, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

Comments: 11 pages, 11 figures Update: Included barycenter averaging with DTW+S along with results

arXiv:2309.01618 [pdf, other]

Critical Behavioral Traits Foster Peer Engagement in Online Mental Health Communities

Authors: Aseem Srivastava, Tanya Gupta, Alison Cerezo, Sarah Peregrine, Lord, Md Shad Akhtar, Tanmoy Chakraborty

Abstract: Online Mental Health Communities (OMHCs), such as Reddit, have witnessed a surge in popularity as go-to platforms for seeking information and support in managing mental health needs. Platforms like Reddit offer immediate interactions with peers, granting users a vital space for seeking mental health assistance. However, the largely unregulated nature of these platforms introduces intricate challen… ▽ More Online Mental Health Communities (OMHCs), such as Reddit, have witnessed a surge in popularity as go-to platforms for seeking information and support in managing mental health needs. Platforms like Reddit offer immediate interactions with peers, granting users a vital space for seeking mental health assistance. However, the largely unregulated nature of these platforms introduces intricate challenges for both users and society at large. This study explores the factors that drive peer engagement within counseling threads, aiming to enhance our understanding of this critical phenomenon. We introduce BeCOPE, a novel behavior encoded Peer counseling dataset comprising over 10,118 posts and 58,279 comments sourced from 21 mental health-specific subreddits. The dataset is annotated using three major fine-grained behavior labels: (a) intent, (b) criticism, and (c) readability, along with the emotion labels. Our analysis indicates the prominence of ``self-criticism'' as the most prevalent form of criticism expressed by help-seekers, accounting for a significant 43% of interactions. Intriguingly, we observe that individuals who explicitly express their need for help are 18.01% more likely to receive assistance compared to those who present ``surveys'' or engage in ``rants.'' Furthermore, we highlight the pivotal role of well-articulated problem descriptions, showing that superior readability effectively doubles the likelihood of receiving the sought-after support. Our study emphasizes the essential role of OMHCs in offering personalized guidance and unveils behavior-driven engagement patterns. △ Less

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2309.01108 [pdf, other]

Acoustic-to-articulatory inversion for dysarthric speech: Are pre-trained self-supervised representations favorable?

Authors: Sarthak Kumar Maharana, Krishna Kamal Adidam, Shoumik Nandi, Ajitesh Srivastava

Abstract: Acoustic-to-articulatory inversion (AAI) involves map** from the acoustic to the articulatory space. Signal-processing features like the MFCCs, have been widely used for the AAI task. For subjects with dysarthric speech, AAI is challenging because of an imprecise and indistinct pronunciation. In this work, we perform AAI for dysarthric speech using representations from pre-trained self-supervise… ▽ More Acoustic-to-articulatory inversion (AAI) involves map** from the acoustic to the articulatory space. Signal-processing features like the MFCCs, have been widely used for the AAI task. For subjects with dysarthric speech, AAI is challenging because of an imprecise and indistinct pronunciation. In this work, we perform AAI for dysarthric speech using representations from pre-trained self-supervised learning (SSL) models. We demonstrate the impact of different pre-trained features on this challenging AAI task, at low-resource conditions. In addition, we also condition x-vectors to the extracted SSL features to train a BLSTM network. In the seen case, we experiment with three AAI training schemes (subject-specific, pooled, and fine-tuned). The results, consistent across training schemes, reveal that DeCoAR, in the fine-tuned scheme, achieves a relative improvement of the Pearson Correlation Coefficient (CC) by ~1.81% and ~4.56% for healthy controls and patients, respectively, over MFCCs. We observe similar average trends for different SSL features in the unseen case. Overall, SSL networks like wav2vec, APC, and DeCoAR, trained with feature reconstruction or future timestep prediction tasks, perform well in predicting dysarthric articulatory trajectories. △ Less

Submitted 9 February, 2024; v1 submitted 3 September, 2023; originally announced September 2023.

Comments: Accepted to IEEE ICASSP Workshops 2024

arXiv:2308.12266 [pdf, other]

Age of Gossip on Generalized Rings

Authors: Arunabh Srivastava, Sennur Ulukus

Abstract: We consider a gossip network consisting of a source forwarding updates and $n$ nodes placed geometrically in a ring formation. Each node gossips with $f(n)$ nodes on either side, thus communicating with $2f(n)$ nodes in total. $f(n)$ is a sub-linear, non-decreasing and positive function. The source keeps updates of a process, that might be generated or observed, and shares them with the nodes in t… ▽ More We consider a gossip network consisting of a source forwarding updates and $n$ nodes placed geometrically in a ring formation. Each node gossips with $f(n)$ nodes on either side, thus communicating with $2f(n)$ nodes in total. $f(n)$ is a sub-linear, non-decreasing and positive function. The source keeps updates of a process, that might be generated or observed, and shares them with the nodes in the ring network. The nodes in the ring network communicate with their neighbors and disseminate these version updates using a push-style gossip strategy. We use the version age metric to quantify the timeliness of information at the nodes. Prior to this work, it was shown that the version age scales as $O(n^{\frac{1}{2}})$ in a ring network, i.e., when $f(n)=1$, and as $O(\log{n})$ in a fully-connected network, i.e., when $2f(n)=n-1$. In this paper, we find an upper bound for the average version age for a set of nodes in such a network in terms of the number of nodes $n$ and the number of gossiped neighbors $2 f(n)$. We show that if $f(n) = Ω(\frac{n}{\log^2{n}})$, then the version age still scales as $θ(\log{n})$. We also show that if $f(n)$ is a rational function, then the version age also scales as a rational function. In particular, if $f(n)=n^α$, then version age is $O(n^\frac{1-α}{2})$. Finally, through numerical calculations we verify that, for all practical purposes, if $f(n) = Ω(n^{0.6})$, the version age scales as $O(\log{n})$. △ Less

Submitted 23 August, 2023; originally announced August 2023.

arXiv:2308.11042 [pdf, other]

Unlocking Hardware Security Assurance: The Potential of LLMs

Authors: Xingyu Meng, Amisha Srivastava, Ayush Arunachalam, Avik Ray, Pedro Henrique Silva, Rafail Psiakis, Yiorgos Makris, Kanad Basu

Abstract: System-on-Chips (SoCs) form the crux of modern computing systems. SoCs enable high-level integration through the utilization of multiple Intellectual Property (IP) cores. However, the integration of multiple IP cores also presents unique challenges owing to their inherent vulnerabilities, thereby compromising the security of the entire system. Hence, it is imperative to perform hardware security v… ▽ More System-on-Chips (SoCs) form the crux of modern computing systems. SoCs enable high-level integration through the utilization of multiple Intellectual Property (IP) cores. However, the integration of multiple IP cores also presents unique challenges owing to their inherent vulnerabilities, thereby compromising the security of the entire system. Hence, it is imperative to perform hardware security validation to address these concerns. The efficiency of this validation procedure is contingent on the quality of the SoC security properties provided. However, generating security properties with traditional approaches often requires expert intervention and is limited to a few IPs, thereby resulting in a time-consuming and non-robust process. To address this issue, we, for the first time, propose a novel and automated Natural Language Processing (NLP)-based Security Property Generator (NSPG). Specifically, our approach utilizes hardware documentation in order to propose the first hardware security-specific language model, HS-BERT, for extracting security properties dedicated to hardware design. To evaluate our proposed technique, we trained the HS-BERT model using sentences from RISC-V, OpenRISC, MIPS, OpenSPARC, and OpenTitan SoC documentation. When assessedb on five untrained OpenTitan hardware IP documents, NSPG was able to extract 326 security properties from 1723 sentences. This, in turn, aided in identifying eight security bugs in the OpenTitan SoC design presented in the hardware hacking competition, Hack@DAC 2022. △ Less

Submitted 21 August, 2023; originally announced August 2023.

Showing 1–50 of 208 results for author: Srivastava, A