Showing 1–50 of 385 results for author: Jain, R

Searching in archive cs.
  1. arXiv:2406.18058  [pdf]

    cs.CR

    Fuzzing at Scale: The Untold Story of the Scheduler

    Authors: Ivica Nikolic, Racchit Jain

    Abstract: How to search for bugs in 1,000 programs using a pre-existing fuzzer and a standard PC? We consider this problem and show that a well-designed strategy that determines which programs to fuzz and for how long can greatly impact the number of bugs found across the programs. In fact, the impact of employing an effective strategy is comparable to that of utilizing a state-of-the-art fuzzer. The consid…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.15754  [pdf, other]

    cs.CV cs.CL cs.LG cs.SD eess.AS

    Multimodal Segmentation for Vocal Tract Modeling

    Authors: Rishi Jain, Bohan Yu, Peter Wu, Tejas Prabhune, Gopala Anumanchipalli

    Abstract: Accurate modeling of the vocal tract is necessary to construct articulatory representations for interpretable speech processing and linguistics. However, vocal tract modeling is challenging because many internal articulators are occluded from external motion capture technologies. Real-time magnetic resonance imaging (RT-MRI) allows measuring precise movements of internal articulators during speech…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  3. arXiv:2406.15563  [pdf, ps, other]

    cs.DS cs.DM

    Exponential Time Approximation for Coloring 3-Colorable Graphs

    Authors: Venkatesan Guruswami, Rhea Jain

    Abstract: The problem of efficiently coloring $3$-colorable graphs with few colors has received much attention on both the algorithmic and inapproximability fronts. We consider exponential time approximations, in which given a parameter $r$, we aim to develop an $r$-approximation algorithm with the best possible runtime, providing a tradeoff between runtime and approximation ratio. In this vein, an algorith…

    Submitted 21 June, 2024; originally announced June 2024.

  4. arXiv:2406.14290  [pdf, ps, other]

    cs.CY cs.SI

    Examining the Implications of Deepfakes for Election Integrity

    Authors: Hriday Ranka, Mokshit Surana, Neel Kothari, Veer Pariawala, Pratyay Banerjee, Aditya Surve, Sainath Reddy Sankepally, Raghav Jain, Jhagrut Lalwani, Swapneel Mehta

    Abstract: It is becoming cheaper to launch disinformation operations at scale using AI-generated content, in particular 'deepfake' technology. We have observed instances of deepfakes in political campaigns, where generated content is employed to both bolster the credibility of certain narratives (reinforcing outcomes) and manipulate public perception to the detriment of targeted candidates or causes (advers…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted at the AAAI 2024 conference, AI for Credible Elections Workshop-AI4CE 2024

  5. arXiv:2406.09574  [pdf, other]

    cs.LG

    Online Bandit Learning with Offline Preference Data

    Authors: Akhil Agnihotri, Rahul Jain, Deepak Ramachandran, Zheng Wen

    Abstract: Reinforcement Learning with Human Feedback (RLHF) is at the core of fine-tuning methods for generative AI models for language and images. Such feedback is often sought as rank or preference feedback from human raters, as opposed to eliciting scores since the latter tends to be very noisy. On the other hand, RL theory and algorithms predominantly assume that a reward feedback is available. In parti…

    Submitted 13 June, 2024; originally announced June 2024.

  6. arXiv:2406.09563  [pdf, other]

    cs.LG

    e-COP : Episodic Constrained Optimization of Policies

    Authors: Akhil Agnihotri, Rahul Jain, Deepak Ramachandran, Sahil Singla

    Abstract: In this paper, we present the $\texttt{e-COP}$ algorithm, the first policy optimization algorithm for constrained Reinforcement Learning (RL) in episodic (finite horizon) settings. Such formulations are applicable when there are separate sets of optimization criteria and constraints on a system's behavior. We approach this problem by first establishing a policy difference lemma for the episodic se…

    Submitted 13 June, 2024; originally announced June 2024.

  7. arXiv:2406.08354  [pdf, other]

    cs.CV cs.AI cs.LG

    DocSynthv2: A Practical Autoregressive Modeling for Document Generation

    Authors: Sanket Biswas, Rajiv Jain, Vlad I. Morariu, Jiuxiang Gu, Puneet Mathur, Curtis Wigington, Tong Sun, Josep Lladós

    Abstract: While the generation of document layouts has been extensively explored, comprehensive document generation encompassing both layout and content presents a more complex challenge. This paper delves into this advanced domain, proposing a novel approach called DocSynthv2 through the development of a simple yet effective autoregressive structured model. Our model, distinct in its integration of both la…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Spotlight (Oral) Acceptance to CVPR 2024 Workshop for Graphic Design Understanding and Generation (GDUG)

  8. arXiv:2406.05344  [pdf, other]

    cs.CL

    MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention

    Authors: Prince Jha, Raghav Jain, Konika Mandal, Aman Chadha, Sriparna Saha, Pushpak Bhattacharyya

    Abstract: In the digital world, memes present a unique challenge for content moderation due to their potential to spread harmful content. Although detection methods have improved, proactive solutions such as intervention are still limited, with current research focusing mostly on text-based content, neglecting the widespread influence of multimodal content like memes. Addressing this gap, we present \textit…

    Submitted 8 June, 2024; originally announced June 2024.

  9. arXiv:2406.00833  [pdf, other]

    cs.CY cs.AI

    Harvard Undergraduate Survey on Generative AI

    Authors: Shikoh Hirabayashi, Rishab Jain, Nikola Jurković, Gabriel Wu

    Abstract: How has generative AI impacted the experiences of college students? We study the influence of AI on the study habits, class choices, and career prospects of Harvard undergraduates (n=326), finding that almost 90% of students use generative AI. For roughly 25% of these students, AI has begun to substitute for attending office hours and completing required readings. Half of students are concerned th…

    Submitted 2 June, 2024; originally announced June 2024.

  10. arXiv:2405.20648  [pdf, other]

    cs.CV cs.CL cs.LG

    Shotluck Holmes: A Family of Efficient Small-Scale Large Language Vision Models For Video Captioning and Summarization

    Authors: Richard Luo, Austin Peng, Adithya Vasudev, Rishabh Jain

    Abstract: Video is an increasingly prominent and information-dense medium, yet it poses substantial challenges for language models. A typical video consists of a sequence of shorter segments, or shots, that collectively form a coherent narrative. Each shot is analogous to a word in a sentence where multiple data streams of information (such as visual and auditory data) must be processed simultaneously. Comp…

    Submitted 31 May, 2024; originally announced May 2024.

  11. arXiv:2405.15090  [pdf, other]

    cs.LG stat.ML

    Pure Exploration for Constrained Best Mixed Arm Identification with a Fixed Budget

    Authors: Dengwang Tang, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

    Abstract: In this paper, we introduce the constrained best mixed arm identification (CBMAI) problem with a fixed budget. This is a pure exploration problem in a stochastic finite armed bandit model. Each arm is associated with a reward and multiple types of costs from unknown distributions. Unlike the unconstrained best arm identification problem, the optimal solution for the CBMAI problem may be a randomiz…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures, 1 table

  12. arXiv:2405.13333  [pdf]

    cs.NI

    Service Mesh: Architectures, Applications, and Implementations

    Authors: Behrooz Farkiani, Raj Jain

    Abstract: The scalability and flexibility of microservice architecture have led to major changes in cloud-native application architectures. However, the complexity of managing thousands of small services written in different languages and handling the exchange of data between them have caused significant management challenges. Service mesh is a promising solution that could mitigate these problems by introd…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 20 pages

  13. arXiv:2405.04777  [pdf, other]

    cs.CL

    Empathy Through Multimodality in Conversational Interfaces

    Authors: Mahyar Abbasian, Iman Azimi, Mohammad Feli, Amir M. Rahmani, Ramesh Jain

    Abstract: Agents represent one of the most emerging applications of Large Language Models (LLMs) and Generative AI, with their effectiveness hinging on multimodal capabilities to navigate complex user environments. Conversational Health Agents (CHAs), a prime example of this, are redefining healthcare by offering nuanced support that transcends textual analysis to incorporate emotional intelligence. This pa…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 7 pages, 2 figures, 2 tables, conference paper

  14. arXiv:2404.18353  [pdf, other]

    cs.CR cs.AI cs.PL

    Do Neutral Prompts Produce Insecure Code? FormAI-v2 Dataset: Labelling Vulnerabilities in Code Generated by Large Language Models

    Authors: Norbert Tihanyi, Tamas Bisztray, Mohamed Amine Ferrag, Ridhi Jain, Lucas C. Cordeiro

    Abstract: This study provides a comparative analysis of state-of-the-art large language models (LLMs), analyzing how likely they generate vulnerabilities when writing simple C programs using a neutral zero-shot prompt. We address a significant gap in the literature concerning the security properties of code produced by these models without specific directives. N. Tihanyi et al. introduced the FormAI dataset…

    Submitted 28 April, 2024; originally announced April 2024.

  15. arXiv:2404.16870  [pdf, ps, other]

    cs.CR cs.AI cs.LG

    LEMDA: A Novel Feature Engineering Method for Intrusion Detection in IoT Systems

    Authors: Ali Ghubaish, Zebo Yang, Aiman Erbad, Raj Jain

    Abstract: Intrusion detection systems (IDS) for the Internet of Things (IoT) systems can use AI-based models to ensure secure communications. IoT systems tend to have many connected devices producing massive amounts of data with high dimensionality, which requires complex models. Complex models have notorious problems such as overfitting, low interpretability, and high computational complexity. Adding model…

    Submitted 20 April, 2024; originally announced April 2024.

  16. arXiv:2404.16725  [pdf, ps, other]

    cs.DS

    Approximation Algorithms for Hop Constrained and Buy-at-Bulk Network Design via Hop Constrained Oblivious Routing

    Authors: Chandra Chekuri, Rhea Jain

    Abstract: We consider two-cost network design models in which edges of the input graph have an associated cost and length. We build upon recent advances in hop-constrained oblivious routing to obtain two sets of results. We address multicommodity buy-at-bulk network design in the nonuniform setting. Existing poly-logarithmic approximations are based on the junction tree approach [CHKS09,KN11]. We obtain a…

    Submitted 25 April, 2024; originally announced April 2024.

  17. arXiv:2404.06768  [pdf, ps, other]

    cs.IT math.RA

    A new approach to construct minimal linear codes over $\mathbb{F}_{3}$

    Authors: Wajid M. Shaikh, Rupali S. Jain, B. Surendranath Reddy, Bhagyashri S. Patil, Sahar M. A. Maqbol

    Abstract: In this article, we present two new approaches to construct minimal linear codes of dimension $n+1$ over $\mathbb{F}_{3}$ using characteristic and ternary functions. We also obtain the weight distributions of these constructed minimal linear codes. We further show that a specific class of these codes violates Ashikhmin-Barg condition.

    Submitted 10 April, 2024; originally announced April 2024.

    Journal ref: MJMS-2024-0154

  18. arXiv:2404.03220  [pdf, ps, other]

    quant-ph cs.CR

    Commitments are equivalent to one-way state generators

    Authors: Rishabh Batra, Rahul Jain

    Abstract: One-way state generators (OWSG) are natural quantum analogs to classical one-way functions. We show that $O\left(\frac{n}{\log(n)}\right)$-copy OWSGs ($n$ represents the input length) are equivalent to $poly(n)$-copy OWSG and to quantum commitments. Since known results show that $o\left(\frac{n}{\log(n)}\right)$-copy OWSG cannot imply commitments, this shows that $O\left(\frac{n}{\log(n)}\right)$-…

    Submitted 17 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: minor changes to previous version

  19. arXiv:2404.03150  [pdf, other]

    cs.CL cs.AI

    NLP at UC Santa Cruz at SemEval-2024 Task 5: Legal Answer Validation using Few-Shot Multi-Choice QA

    Authors: Anish Pahilajani, Samyak Rajesh Jain, Devasha Trivedi

    Abstract: This paper presents our submission to the SemEval 2024 Task 5: The Legal Argument Reasoning Task in Civil Procedure. We present two approaches to solving the task of legal answer validation, given an introduction to the case, a question and an answer candidate. Firstly, we fine-tuned pre-trained BERT-based models and found that models trained on domain knowledge perform better. Secondly, we perfor…

    Submitted 3 April, 2024; originally announced April 2024.

  20. arXiv:2404.00477  [pdf, other]

    cs.LG cs.AR

    DE-HNN: An effective neural model for Circuit Netlist representation

    Authors: Zhishang Luo, Truong Son Hy, Puoya Tabaghi, Donghyeon Koh, Michael Defferrard, Elahe Rezaei, Ryan Carey, Rhett Davis, Rajeev Jain, Yusu Wang

    Abstract: The run-time for optimization tools used in chip design has grown with the complexity of designs to the point where it can take several days to go through one design cycle which has become a bottleneck. Designers want fast tools that can quickly give feedback on a design. Using the input and output data of the tools from past designs, one can attempt to build a machine learning model that predicts…

    Submitted 16 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

  21. arXiv:2403.15547  [pdf, ps, other]

    cs.DS

    Approximation Algorithms for Network Design in Non-Uniform Fault Models

    Authors: Chandra Chekuri, Rhea Jain

    Abstract: The Survivable Network Design problem (SNDP) is a well-studied problem, motivated by the design of networks that are robust to faults under the assumption that any subset of edges up to a specific number can fail. We consider non-uniform fault models where the subset of edges that fail can be specified in different ways. Our primary interest is in the flexible graph connectivity model, in which th…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: A preliminary version of this paper appeared in the Proc. of ICALP 2023 (10.4230/LIPIcs.ICALP.2023.36), which combines and extends results from two earlier versions: arXiv:2209.12273 (for the first set of results) and arXiv:2211.08324 (for the second set of results)

  22. arXiv:2403.14416  [pdf, other]

    quant-ph cs.IT

    Quantum Channel Simulation in Fidelity is no more difficult than State Splitting

    Authors: Michael X. Cao, Rahul Jain, Marco Tomamichel

    Abstract: Characterizing the minimal communication needed for the quantum channel simulation is a fundamental task in the quantum information theory. In this paper, we show that, in fidelity, the quantum channel simulation can be directly achieved via quantum state splitting without using a technique known as the de Finetti reduction, and thus provide a pair of tighter one-shot bounds. Using the bounds, we…

    Submitted 24 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  23. arXiv:2403.13350  [pdf, ps, other]

    cs.IT math.RA

    Construction of Minimal Binary Linear Codes of dimension $n+3$

    Authors: Wajid M. Shaikh, Rupali S. Jain, B. Surendranath Reddy, Bhagyashri S. Patil

    Abstract: In this paper, we will give the generic construction of a binary linear code of dimension $n+3$ and derive the necessary and sufficient conditions for the constructed code to be minimal. Using generic construction, a new family of minimal binary linear code will be constructed from a special class of Boolean functions violating the Ashikhmin-Barg condition. We also obtain the weight distribution o…

    Submitted 20 March, 2024; originally announced March 2024.

    MSC Class: 94B05; 94C10; 94A60

  24. arXiv:2403.13106  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Knowing Your Nonlinearities: Shapley Interactions Reveal the Underlying Structure of Data

    Authors: Divyansh Singhvi, Andrej Erkelens, Raghav Jain, Diganta Misra, Naomi Saphra

    Abstract: Measuring nonlinear feature interaction is an established approach to understanding complex patterns of attribution in many models. In this paper, we use Shapley Taylor interaction indices (STII) to analyze the impact of underlying data structure on model representations in a variety of modalities, tasks, and architectures. Considering linguistic structure in masked and auto-regressive language mo…

    Submitted 19 March, 2024; originally announced March 2024.

  25. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  26. arXiv:2403.01317  [pdf, other]

    cs.LG cs.AR

    Less is More: Hop-Wise Graph Attention for Scalable and Generalizable Learning on Circuits

    Authors: Chenhui Deng, Zichao Yue, Cunxi Yu, Gokce Sarar, Ryan Carey, Rajeev Jain, Zhiru Zhang

    Abstract: While graph neural networks (GNNs) have gained popularity for learning circuit representations in various electronic design automation (EDA) tasks, they face challenges in scalability when applied to large graphs and exhibit limited generalizability to new designs. These limitations make them less practical for addressing large-scale, complex circuit problems. In this work we propose HOGA, a novel…

    Submitted 10 April, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: Published as a conference paper at Design Automation Conference (DAC) 2024

  27. arXiv:2403.00781  [pdf, other]

    cs.IR cs.AI cs.LG cs.MM

    ChatDiet: Empowering Personalized Nutrition-Oriented Food Recommender Chatbots through an LLM-Augmented Framework

    Authors: Zhongqi Yang, Elahe Khatibi, Nitish Nagesh, Mahyar Abbasian, Iman Azimi, Ramesh Jain, Amir M. Rahmani

    Abstract: The profound impact of food on health necessitates advanced nutrition-oriented food recommendation services. Conventional methods often lack the crucial elements of personalization, explainability, and interactivity. While Large Language Models (LLMs) bring interpretability and explainability, their standalone use falls short of achieving true personalization. In this paper, we introduce ChatDiet,…

    Submitted 16 March, 2024; v1 submitted 18 February, 2024; originally announced March 2024.

    Comments: Accepted by The IEEE/ACM international conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) 2024

  28. arXiv:2403.00141  [pdf, other]

    cs.CL cs.AI

    EROS: Entity-Driven Controlled Policy Document Summarization

    Authors: Joykirat Singh, Sehban Fazili, Rohan Jain, Md Shad Akhtar

    Abstract: Privacy policy documents have a crucial role in educating individuals about the collection, usage, and protection of users' personal data by organizations. However, they are notorious for their lengthy, complex, and convoluted language especially involving privacy-related entities. Hence, they pose a significant challenge to users who attempt to comprehend organization's data usage policy. In this…

    Submitted 29 February, 2024; originally announced March 2024.

    Comments: Accepted in LREC-COLING 2024

  29. arXiv:2402.11477  [pdf, other]

    cs.CY

    Studying Differential Mental Health Expressions in India

    Authors: Khushi Shelat, Sunny Rai, Devansh R Jain, Kishen Sivabalan, Young Min Cho, Maitreyi Redkar, Samindara Sawant, Sharath Chandra Guntuku

    Abstract: Psychosocial stressors and the symptomatology of mental disorders vary across cultures. However, current understandings of mental health expressions on social media are predominantly derived from studies in WEIRD (Western, Educated, Industrialized, Rich, and Democratic) contexts. In this paper, we analyze mental health posts on Reddit made by individuals in India, to identify variations in online…

    Submitted 16 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  30. arXiv:2402.10400  [pdf, other]

    cs.CL

    Chain of Logic: Rule-Based Reasoning with Large Language Models

    Authors: Sergio Servantez, Joe Barrow, Kristian Hammond, Rajiv Jain

    Abstract: Rule-based reasoning, a fundamental type of legal reasoning, enables us to draw conclusions by accurately applying a rule to a set of facts. We explore causal language models as rule-based reasoners, specifically with respect to compositional rules - rules consisting of multiple elements which form a complex logical expression. Reasoning about compositional rules is challenging because it requires…

    Submitted 23 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  31. arXiv:2402.10153  [pdf, other]

    cs.CL

    Knowledge-Infused LLM-Powered Conversational Health Agent: A Case Study for Diabetes Patients

    Authors: Mahyar Abbasian, Zhongqi Yang, Elahe Khatibi, Pengfei Zhang, Nitish Nagesh, Iman Azimi, Ramesh Jain, Amir M. Rahmani

    Abstract: Effective diabetes management is crucial for maintaining health in diabetic patients. Large Language Models (LLMs) have opened new avenues for diabetes management, facilitating their efficacy. However, current LLM-based approaches are limited by their dependence on general sources and lack of integration with domain-specific knowledge, leading to inaccurate responses. In this paper, we propose a k…

    Submitted 28 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 4 pages, 3 figures, and 2 tables, conference paper

  32. arXiv:2402.07688  [pdf, other]

    cs.AI cs.CR

    CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge

    Authors: Norbert Tihanyi, Mohamed Amine Ferrag, Ridhi Jain, Tamas Bisztray, Merouane Debbah

    Abstract: Large Language Models (LLMs) are increasingly used across various domains, from software development to cyber threat intelligence. Understanding all the different fields of cybersecurity, which includes topics such as cryptography, reverse engineering, and risk assessment, poses a challenge even for human experts. To accurately test the general knowledge of LLMs in cybersecurity, the research comm…

    Submitted 3 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  33. arXiv:2402.07477  [pdf, other]

    cs.AI

    Food Recommendation as Language Processing (F-RLP): A Personalized and Contextual Paradigm

    Authors: Ali Rostami, Ramesh Jain, Amir M. Rahmani

    Abstract: State-of-the-art rule-based and classification-based food recommendation systems face significant challenges in becoming practical and useful. This difficulty arises primarily because most machine learning models struggle with problems characterized by an almost infinite number of classes and a limited number of samples within an unbalanced dataset. Conversely, the emergence of Large Language Mode…

    Submitted 14 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  34. arXiv:2402.04754  [pdf, other]

    cs.CV cs.LG

    Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints

    Authors: Jian Chen, Ruiyi Zhang, Yufan Zhou, Rajiv Jain, Zhiqiang Xu, Ryan Rossi, Changyou Chen

    Abstract: Controllable layout generation refers to the process of creating a plausible visual arrangement of elements within a graphic design (e.g., document and web designs) with constraints representing design intentions. Although recent diffusion-based models have achieved state-of-the-art FID scores, they tend to exhibit more pronounced misalignment compared to earlier transformer-based models. In this…

    Submitted 15 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted by ICLR 2024

  35. arXiv:2401.09899  [pdf, other]

    cs.CL

    Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations

    Authors: Prince Jha, Krishanu Maity, Raghav Jain, Apoorv Verma, Sriparna Saha, Pushpak Bhattacharyya

    Abstract: Internet memes have gained significant influence in communicating political, psychological, and sociocultural ideas. While memes are often humorous, there has been a rise in the use of memes for trolling and cyberbullying. Although a wide variety of effective deep learning-based models have been developed for detecting offensive multimodal memes, only a few works have been done on explainability a…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: EACL2024

  36. arXiv:2401.09023  [pdf, other]

    cs.CL

    Explain Thyself Bully: Sentiment Aided Cyberbullying Detection with Explanation

    Authors: Krishanu Maity, Prince Jha, Raghav Jain, Sriparna Saha, Pushpak Bhattacharyya

    Abstract: Cyberbullying has become a big issue with the popularity of different social media networks and online communication apps. While plenty of research is going on to develop better models for cyberbullying detection in monolingual language, there is very little research on the code-mixed languages and explainability aspect of cyberbullying. Recent laws like "right to explanations" of General Data Pro…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: ICDAR 2023

  37. arXiv:2401.06413  [pdf]

    cs.HC

    Why Doesn't Microsoft Let Me Sleep? How Automaticity of Windows Updates Impacts User Autonomy

    Authors: Sanju Ahuja, Ridhi Jain, Jyoti Kumar

    Abstract: 'Automating the user away' has been designated as a dark pattern in literature for performing tasks without user consent or confirmation. However, limited studies have been reported on how users experience the sense of autonomy when digital systems fully or partially bypass consent. More research is required to understand what makes automaticity a threat to autonomy. To address this gap, a qualita…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: 6 pages, 2 figures

  38. arXiv:2401.04120  [pdf, other]

    cs.HC cs.AI cs.CL

    Generation Z's Ability to Discriminate Between AI-generated and Human-Authored Text on Discord

    Authors: Dhruv Ramu, Rishab Jain, Aditya Jain

    Abstract: The growing popularity of generative artificial intelligence (AI) chatbots such as ChatGPT is having transformative effects on social media. As the prevalence of AI-generated content grows, concerns have been raised regarding privacy and misinformation online. Among social media platforms, Discord enables AI integrations -- making their primarily "Generation Z" userbase particularly exposed to AI-…

    Submitted 31 December, 2023; originally announced January 2024.

  39. arXiv:2401.01596  [pdf, other]

    cs.AI cs.CL

    MedSumm: A Multimodal Approach to Summarizing Code-Mixed Hindi-English Clinical Queries

    Authors: Akash Ghosh, Arkadeep Acharya, Prince Jha, Aniket Gaudgaul, Rajdeep Majumdar, Sriparna Saha, Aman Chadha, Raghav Jain, Setu Sinha, Shivani Agarwal

    Abstract: In the healthcare domain, summarizing medical questions posed by patients is critical for improving doctor-patient interactions and medical decision-making. Although medical data has grown in complexity and quantity, the current body of research in this domain has primarily concentrated on text-based methods, overlooking the integration of visual cues. Also prior works in the area of medical quest…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: ECIR 2024

  40. arXiv:2312.14300  [pdf, other]

    cs.NI quant-ph

    Asynchronous Entanglement Routing for the Quantum Internet

    Authors: Zebo Yang, Ali Ghubaish, Raj Jain, Hassan Shapourian, Alireza Shabani

    Abstract: With the emergence of the Quantum Internet, the need for advanced quantum networking techniques has significantly risen. Various models of quantum repeaters have been presented, each delineating a unique strategy to ensure quantum communication over long distances. We focus on repeaters that employ entanglement generation and swapping. This revolves around establishing remote end-to-end entangleme…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: This article has been accepted for publication in the AVS Quantum Science journal

    Report number: 013801

    Journal ref: AVS Quantum Sci. 6 (2024)

  41. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  42. arXiv:2312.11541  [pdf, other]

    cs.AI cs.CL

    CLIPSyntel: CLIP and LLM Synergy for Multimodal Question Summarization in Healthcare

    Authors: Akash Ghosh, Arkadeep Acharya, Raghav Jain, Sriparna Saha, Aman Chadha, Setu Sinha

    Abstract: In the era of modern healthcare, swiftly generating medical question summaries is crucial for informed and timely patient care. Despite the increasing complexity and volume of medical data, existing studies have focused solely on text-based summarization, neglecting the integration of visual information. Recognizing the untapped potential of combining textual queries with visual representations of…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  43. arXiv:2312.10057  [pdf, other]

    cs.CY cs.HC

    Generative AI in Writing Research Papers: A New Type of Algorithmic Bias and Uncertainty in Scholarly Work

    Authors: Rishab Jain, Aditya Jain

    Abstract: The use of artificial intelligence (AI) in research across all disciplines is becoming ubiquitous. However, this ubiquity is largely driven by hyperspecific AI models developed during scientific studies for accomplishing a well-defined, data-dense task. These AI models introduce apparent, human-recognizable biases because they are trained with finite, specific data sets and parameters. However, th…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: 10 pages, 6 figures

    ACM Class: I.2.m

  44. arXiv:2311.14712  [pdf, ps, other]

    cs.SI

    Multiagent Simulators for Social Networks

    Authors: Aditya Surve, Archit Rathod, Mokshit Surana, Gautam Malpani, Aneesh Shamraj, Sainath Reddy Sankepally, Raghav Jain, Swapneel S Mehta

    Abstract: Multiagent social network simulations are an avenue that can bridge the communication gap between the public and private platforms in order to develop solutions to a complex array of issues relating to online safety. While there are significant challenges relating to the scale of multiagent simulations, efficient learning from observational and interventional data to accurately model micro and mac…

    Submitted 16 November, 2023; originally announced November 2023.

  45. arXiv:2311.11030  [pdf]

    cs.AI cs.CY cs.HC

    Data Center Audio/Video Intelligence on Device (DAVID) -- An Edge-AI Platform for Smart-Toys

    Authors: Gabriel Cosache, Francisco Salgado, Cosmin Rotariu, George Sterpu, Rishabh Jain, Peter Corcoran

    Abstract: An overview is given of the DAVID Smart-Toy platform, one of the first Edge AI platform designs to incorporate advanced low-power data processing by neural inference models co-located with the relevant image or audio sensors. There is also on-board capability for in-device text-to-speech generation. Two alternative embodiments are presented: a smart Teddy-bear, and a roving dog-like robot. The pla…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: The 12th IEEE Conference on Speech Technology and Human Dialogue (SpeD 2023) URL: https://sped.pub.ro/

  46. arXiv:2311.10731  [pdf]

    cs.LG physics.med-ph physics.soc-ph

    Gender-Based Comparative Study of Type 2 Diabetes Risk Factors in Kolkata, India: A Machine Learning Approach

    Authors: Rahul Jain, Anoushka Saha, Gourav Daga, Durba Bhattacharya, Madhura Das Gupta, Sourav Chowdhury, Suparna Roychowdhury

    Abstract: Type 2 diabetes mellitus represents a prevalent and widespread global health concern, necessitating a comprehensive assessment of its risk factors. This study aimed towards learning whether there is any differential impact of age, lifestyle, BMI and waist-to-height ratio on the risk of Type 2 diabetes mellitus in males and females in Kolkata, West Bengal, India based on a sample observed from the…

    Submitted 14 October, 2023; originally announced November 2023.

    Comments: 10 pages, 7 tables, 3 figures, submitted to a conference

  47. arXiv:2311.06307  [pdf]

    cs.HC cs.AI cs.SD eess.AS

    Synthetic Speaking Children -- Why We Need Them and How to Make Them

    Authors: Muhammad Ali Farooq, Dan Bigioi, Rishabh Jain, Wang Yao, Mariam Yiwere, Peter Corcoran

    Abstract: Contemporary Human Computer Interaction (HCI) research relies primarily on neural network models for machine vision and speech understanding of a system user. Such models require extensively annotated training datasets for optimal performance and when building interfaces for users from a vulnerable population such as young children, GDPR introduces significant complexities in data collection, mana…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Presented at SpeD 23

  48. arXiv:2311.04936  [pdf]

    cs.CL cs.AI cs.SD eess.AS

    A comparative analysis between Conformer-Transducer, Whisper, and wav2vec2 for improving the child speech recognition

    Authors: Andrei Barcovschi, Rishabh Jain, Peter Corcoran

    Abstract: Automatic Speech Recognition (ASR) systems have progressed significantly in their performance on adult speech data; however, transcribing child speech remains challenging due to the acoustic differences in the characteristics of child and adult voices. This work aims to explore the potential of adapting state-of-the-art Conformer-transducer models to child speech to improve child speech recognitio…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Presented at SpeD 23

  49. arXiv:2311.04313  [pdf]

    cs.SD cs.AI eess.AS

    Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning

    Authors: Rishabh Jain, Peter Corcoran

    Abstract: Speech synthesis technology has witnessed significant advancements in recent years, enabling the creation of natural and expressive synthetic speech. One area of particular interest is the generation of synthetic child speech, which presents unique challenges due to children's distinct vocal characteristics and developmental stages. This paper presents a novel approach that leverages the Fastpitch…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Presented in SpeD 23

  50. arXiv:2311.02400  [pdf, other]

    cs.CY

    From Plate to Production: Artificial Intelligence in Modern Consumer-Driven Food Systems

    Authors: Weiqing Min, Pengfei Zhou, Leyi Xu, Tao Liu, Tianhao Li, Mingyu Huang, Ying Jin, Yifan Yi, Min Wen, Shuqiang Jiang, Ramesh Jain

    Abstract: Global food systems confront the urgent challenge of supplying sustainable, nutritious diets in the face of escalating demands. The advent of Artificial Intelligence (AI) is bringing in a personal choice revolution, wherein AI-driven individual decisions transform food systems from dinner tables, to the farms, and back to our plates. In this context, AI algorithms refine personal dietary choices,…

    Submitted 4 November, 2023; originally announced November 2023.