-
SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models
Authors:
Somnath Banerjee,
Soham Tripathy,
Sayan Layek,
Shanu Kumar,
Animesh Mukherjee,
Rima Hazra
Abstract:
Safety-aligned language models often exhibit fragile and imbalanced safety mechanisms, increasing the likelihood of generating unsafe content. In addition, incorporating new knowledge through editing techniques to language models can further compromise safety. To address these issues, we propose SafeInfer, a context-adaptive, decoding-time safety alignment strategy for generating safe responses to…
▽ More
Safety-aligned language models often exhibit fragile and imbalanced safety mechanisms, increasing the likelihood of generating unsafe content. In addition, incorporating new knowledge through editing techniques to language models can further compromise safety. To address these issues, we propose SafeInfer, a context-adaptive, decoding-time safety alignment strategy for generating safe responses to user queries. SafeInfer comprises two phases: the safety amplification phase, which employs safe demonstration examples to adjust the model's hidden states and increase the likelihood of safer outputs, and the safety-guided decoding phase, which influences token selection based on safety-optimized distributions, ensuring the generated content complies with ethical guidelines. Further, we present HarmEval, a novel benchmark for extensive safety evaluations, designed to address potential misuse scenarios in accordance with the policies of leading AI tech giants.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations
Authors:
Rima Hazra,
Sayan Layek,
Somnath Banerjee,
Soujanya Poria
Abstract:
Ensuring the safe alignment of large language models (LLMs) with human values is critical as they become integral to applications like translation and question answering. Current alignment methods struggle with dynamic user intentions and complex objectives, making models vulnerable to generating harmful content. We propose Safety Arithmetic, a training-free framework enhancing LLM safety across d…
▽ More
Ensuring the safe alignment of large language models (LLMs) with human values is critical as they become integral to applications like translation and question answering. Current alignment methods struggle with dynamic user intentions and complex objectives, making models vulnerable to generating harmful content. We propose Safety Arithmetic, a training-free framework enhancing LLM safety across different scenarios: Base models, Supervised fine-tuned models (SFT), and Edited models. Safety Arithmetic involves Harm Direction Removal to avoid harmful content and Safety Alignment to promote safe responses. Additionally, we present NoIntentEdit, a dataset highlighting edit instances that could compromise model safety if used unintentionally. Our experiments show that Safety Arithmetic significantly improves safety measures, reduces over-safety, and maintains model utility, outperforming existing methods in ensuring safe content generation.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance
Authors:
Somnath Banerjee,
Avik Halder,
Rajarshi Mandal,
Sayan Layek,
Ian Soboroff,
Rima Hazra,
Animesh Mukherjee
Abstract:
The integration of pretrained language models (PLMs) like BERT and GPT has revolutionized NLP, particularly for English, but it has also created linguistic imbalances. This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. We evaluate the performance of models such as Mistral, TowerInstruct, OpenHathi, Tamil-Ll…
▽ More
The integration of pretrained language models (PLMs) like BERT and GPT has revolutionized NLP, particularly for English, but it has also created linguistic imbalances. This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. We evaluate the performance of models such as Mistral, TowerInstruct, OpenHathi, Tamil-Llama, and Kan-Llama across languages including English, German, French, Italian, Spanish, Hindi, Tamil, and Kannada. Our research identifies significant discrepancies in normal and merged models concerning cross-lingual consistency. We employ strategies like 'each language for itself' (ELFI) and 'each language for others' (ELFO) to stress-test these models. Our findings demonstrate the potential for LLMs to overcome linguistic barriers, laying the groundwork for future research in achieving linguistic inclusivity in AI technologies.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
REvolve: Reward Evolution with Large Language Models for Autonomous Driving
Authors:
Rishi Hazra,
Alkis Sygkounas,
Andreas Persson,
Amy Loutfi,
Pedro Zuidberg Dos Martires
Abstract:
Designing effective reward functions is crucial to training reinforcement learning (RL) algorithms. However, this design is non-trivial, even for domain experts, due to the subjective nature of certain tasks that are hard to quantify explicitly. In recent works, large language models (LLMs) have been used for reward generation from natural language task descriptions, leveraging their extensive ins…
▽ More
Designing effective reward functions is crucial to training reinforcement learning (RL) algorithms. However, this design is non-trivial, even for domain experts, due to the subjective nature of certain tasks that are hard to quantify explicitly. In recent works, large language models (LLMs) have been used for reward generation from natural language task descriptions, leveraging their extensive instruction tuning and commonsense understanding of human behavior. In this work, we hypothesize that LLMs, guided by human feedback, can be used to formulate human-aligned reward functions. Specifically, we study this in the challenging setting of autonomous driving (AD), wherein notions of "good" driving are tacit and hard to quantify. To this end, we introduce REvolve, an evolutionary framework that uses LLMs for reward design in AD. REvolve creates and refines reward functions by utilizing human feedback to guide the evolution process, effectively translating implicit human knowledge into explicit reward functions for training (deep) RL agents. We demonstrate that agents trained on REvolve-designed rewards align closely with human driving standards, thereby outperforming other state-of-the-art baselines.
△ Less
Submitted 3 June, 2024;
originally announced June 2024.
-
DistALANER: Distantly Supervised Active Learning Augmented Named Entity Recognition in the Open Source Software Ecosystem
Authors:
Somnath Banerjee,
Avik Dutta,
Aaditya Agrawal,
Rima Hazra,
Animesh Mukherjee
Abstract:
With the AI revolution in place, the trend for building automated systems to support professionals in different domains such as the open source software systems, healthcare systems, banking systems, transportation systems and many others have become increasingly prominent. A crucial requirement in the automation of support tools for such systems is the early identification of named entities, which…
▽ More
With the AI revolution in place, the trend for building automated systems to support professionals in different domains such as the open source software systems, healthcare systems, banking systems, transportation systems and many others have become increasingly prominent. A crucial requirement in the automation of support tools for such systems is the early identification of named entities, which serves as a foundation for develo** specialized functionalities. However, due to the specific nature of each domain, different technical terminologies and specialized languages, expert annotation of available data becomes expensive and challenging. In light of these challenges, this paper proposes a novel named entity recognition (NER) technique specifically tailored for the open-source software systems. Our approach aims to address the scarcity of annotated software data by employing a comprehensive two-step distantly supervised annotation process. This process strategically leverages language heuristics, unique lookup tables, external knowledge sources, and an active learning approach. By harnessing these powerful techniques, we not only enhance model performance but also effectively mitigate the limitations associated with cost and the scarcity of expert annotators. It is noteworthy that our model significantly outperforms the state-of-the-art LLMs by a substantial margin. We also show the effectiveness of NER in the downstream task of relation extraction.
△ Less
Submitted 20 June, 2024; v1 submitted 25 February, 2024;
originally announced February 2024.
-
How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries
Authors:
Somnath Banerjee,
Sayan Layek,
Rima Hazra,
Animesh Mukherjee
Abstract:
In this study, we tackle a growing concern around the safety and ethical use of large language models (LLMs). Despite their potential, these models can be tricked into producing harmful or unethical content through various sophisticated methods, including 'jailbreaking' techniques and targeted manipulation. Our work zeroes in on a specific issue: to what extent LLMs can be led astray by asking the…
▽ More
In this study, we tackle a growing concern around the safety and ethical use of large language models (LLMs). Despite their potential, these models can be tricked into producing harmful or unethical content through various sophisticated methods, including 'jailbreaking' techniques and targeted manipulation. Our work zeroes in on a specific issue: to what extent LLMs can be led astray by asking them to generate responses that are instruction-centric such as a pseudocode, a program or a software snippet as opposed to vanilla text. To investigate this question, we introduce TechHazardQA, a dataset containing complex queries which should be answered in both text and instruction-centric formats (e.g., pseudocodes), aimed at identifying triggers for unethical responses. We query a series of LLMs -- Llama-2-13b, Llama-2-7b, Mistral-V2 and Mistral 8X7B -- and ask them to generate both text and instruction-centric responses. For evaluation we report the harmfulness score metric as well as judgements from GPT-4 and humans. Overall, we observe that asking LLMs to produce instruction-centric responses enhances the unethical response generation by ~2-38% across the models. As an additional objective, we investigate the impact of model editing using the ROME technique, which further increases the propensity for generating undesirable content. In particular, asking edited LLMs to generate instruction-centric responses further increases the unethical response generation by ~3-16% across the different models.
△ Less
Submitted 15 March, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context
Authors:
Somnath Banerjee,
Amruit Sahoo,
Sayan Layek,
Avik Dutta,
Rima Hazra,
Animesh Mukherjee
Abstract:
In the continuously advancing AI landscape, crafting context-rich and meaningful responses via Large Language Models (LLMs) is essential. Researchers are becoming more aware of the challenges that LLMs with fewer parameters encounter when trying to provide suitable answers to open-ended questions. To address these hurdles, the integration of cutting-edge strategies, augmentation of rich external d…
▽ More
In the continuously advancing AI landscape, crafting context-rich and meaningful responses via Large Language Models (LLMs) is essential. Researchers are becoming more aware of the challenges that LLMs with fewer parameters encounter when trying to provide suitable answers to open-ended questions. To address these hurdles, the integration of cutting-edge strategies, augmentation of rich external domain knowledge to LLMs, offers significant improvements. This paper introduces a novel framework that combines graph-driven context retrieval in conjunction to knowledge graphs based enhancement, honing the proficiency of LLMs, especially in domain specific community question answering platforms like AskUbuntu, Unix, and ServerFault. We conduct experiments on various LLMs with different parameter sizes to evaluate their ability to ground knowledge and determine factual accuracy in answers to open-ended questions. Our methodology GraphContextGen consistently outperforms dominant text-based retrieval systems, demonstrating its robustness and adaptability to a larger number of use cases. This advancement highlights the importance of pairing context rich data retrieval with LLMs, offering a renewed approach to knowledge sourcing and generation in AI systems. We also show that, due to rich contextual data retrieval, the crucial entities, along with the generated answer, remain factually coherent with the gold answer.
△ Less
Submitted 5 March, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
Sowing the Wind, Rea** the Whirlwind: The Impact of Editing Language Models
Authors:
Rima Hazra,
Sayan Layek,
Somnath Banerjee,
Soujanya Poria
Abstract:
In the rapidly advancing field of artificial intelligence, the concept of Red-Teaming or Jailbreaking large language models (LLMs) has emerged as a crucial area of study. This approach is especially significant in terms of assessing and enhancing the safety and robustness of these models. This paper investigates the intricate consequences of such modifications through model editing, uncovering a c…
▽ More
In the rapidly advancing field of artificial intelligence, the concept of Red-Teaming or Jailbreaking large language models (LLMs) has emerged as a crucial area of study. This approach is especially significant in terms of assessing and enhancing the safety and robustness of these models. This paper investigates the intricate consequences of such modifications through model editing, uncovering a complex relationship between enhancing model accuracy and preserving its ethical integrity. Our in-depth analysis reveals a striking paradox: while injecting accurate information is crucial for model reliability, it can paradoxically destabilize the model's foundational framework, resulting in unpredictable and potentially unsafe behaviors. Additionally, we propose a benchmark dataset NicheHazardQA to investigate this unsafe behavior both within the same and cross topical domain. This aspect of our research sheds light on how the edits, impact the model's safety metrics and guardrails. Our findings show that model editing serves as a cost-effective tool for topical red-teaming by methodically applying targeted edits and evaluating the resultant model behavior.
△ Less
Submitted 16 May, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
Redefining Developer Assistance: Through Large Language Models in Software Ecosystem
Authors:
Somnath Banerjee,
Avik Dutta,
Sayan Layek,
Amruit Sahoo,
Sam Conrad Joyce,
Rima Hazra
Abstract:
In this paper, we delve into the advancement of domain-specific Large Language Models (LLMs) with a focus on their application in software development. We introduce DevAssistLlama, a model developed through instruction tuning, to assist developers in processing software-related natural language queries. This model, a variant of instruction tuned LLM, is particularly adept at handling intricate tec…
▽ More
In this paper, we delve into the advancement of domain-specific Large Language Models (LLMs) with a focus on their application in software development. We introduce DevAssistLlama, a model developed through instruction tuning, to assist developers in processing software-related natural language queries. This model, a variant of instruction tuned LLM, is particularly adept at handling intricate technical documentation, enhancing developer capability in software specific tasks. The creation of DevAssistLlama involved constructing an extensive instruction dataset from various software systems, enabling effective handling of Named Entity Recognition (NER), Relation Extraction (RE), and Link Prediction (LP). Our results demonstrate DevAssistLlama's superior capabilities in these tasks, in comparison with other models including ChatGPT. This research not only highlights the potential of specialized LLMs in software development also the pioneer LLM for this domain.
△ Less
Submitted 15 March, 2024; v1 submitted 9 December, 2023;
originally announced December 2023.
-
Modeling interdisciplinary interactions among Physics, Mathematics & Computer Science
Authors:
Rima Hazra,
Mayank Singh,
Pawan Goyal,
Bibhas Adhikari,
Animesh Mukherjee
Abstract:
Interdisciplinarity has over the recent years have gained tremendous importance and has become one of the key ways of doing cutting edge research. In this paper we attempt to model the citation flow across three different fields -- Physics (PHY), Mathematics (MA) and Computer Science (CS). For instance, is there a specific pattern in which these fields cite one another? We carry out experiments on…
▽ More
Interdisciplinarity has over the recent years have gained tremendous importance and has become one of the key ways of doing cutting edge research. In this paper we attempt to model the citation flow across three different fields -- Physics (PHY), Mathematics (MA) and Computer Science (CS). For instance, is there a specific pattern in which these fields cite one another? We carry out experiments on a dataset comprising more than 1.2 million articles taken from these three fields. We quantify the citation interactions among these three fields through temporal bucket signatures. We present numerical models based on variants of the recently proposed relay-linking framework to explain the citation dynamics across the three disciplines. These models make a modest attempt to unfold the underlying principles of how citation links could have been formed across the three fields over time.
△ Less
Submitted 19 September, 2023;
originally announced September 2023.
-
Evaluating the Ebb and Flow: An In-depth Analysis of Question-Answering Trends across Diverse Platforms
Authors:
Rima Hazra,
Agnik Saha,
Somnath Banerjee,
Animesh Mukherjee
Abstract:
Community Question Answering (CQA) platforms steadily gain popularity as they provide users with fast responses to their queries. The swiftness of these responses is contingent on a mixture of query-specific and user-related elements. This paper scrutinizes these contributing factors within the context of six highly popular CQA platforms, identified through their standout answering speed. Our inve…
▽ More
Community Question Answering (CQA) platforms steadily gain popularity as they provide users with fast responses to their queries. The swiftness of these responses is contingent on a mixture of query-specific and user-related elements. This paper scrutinizes these contributing factors within the context of six highly popular CQA platforms, identified through their standout answering speed. Our investigation reveals a correlation between the time taken to yield the first response to a question and several variables: the metadata, the formulation of the questions, and the level of interaction among users. Additionally, by employing conventional machine learning models to analyze these metadata and patterns of user interaction, we endeavor to predict which queries will receive their initial responses promptly.
△ Less
Submitted 15 March, 2024; v1 submitted 12 September, 2023;
originally announced September 2023.
-
Duplicate Question Retrieval and Confirmation Time Prediction in Software Communities
Authors:
Rima Hazra,
Debanjan Saha,
Amruit Sahoo,
Somnath Banerjee,
Animesh Mukherjee
Abstract:
Community Question Answering (CQA) in different domains is growing at a large scale because of the availability of several platforms and huge shareable information among users. With the rapid growth of such online platforms, a massive amount of archived data makes it difficult for moderators to retrieve possible duplicates for a new question and identify and confirm existing question pairs as dupl…
▽ More
Community Question Answering (CQA) in different domains is growing at a large scale because of the availability of several platforms and huge shareable information among users. With the rapid growth of such online platforms, a massive amount of archived data makes it difficult for moderators to retrieve possible duplicates for a new question and identify and confirm existing question pairs as duplicates at the right time. This problem is even more critical in CQAs corresponding to large software systems like askubuntu where moderators need to be experts to comprehend something as a duplicate. Note that the prime challenge in such CQA platforms is that the moderators are themselves experts and are therefore usually extremely busy with their time being extraordinarily expensive. To facilitate the task of the moderators, in this work, we have tackled two significant issues for the askubuntu CQA platform: (1) retrieval of duplicate questions given a new question and (2) duplicate question confirmation time prediction. In the first task, we focus on retrieving duplicate questions from a question pool for a particular newly posted question. In the second task, we solve a regression problem to rank a pair of questions that could potentially take a long time to get confirmed as duplicates. For duplicate question retrieval, we propose a Siamese neural network based approach by exploiting both text and network-based features, which outperforms several state-of-the-art baseline techniques. Our method outperforms DupPredictor and DUPE by 5% and 7% respectively. For duplicate confirmation time prediction, we have used both the standard machine learning models and neural network along with the text and graph-based features. We obtain Spearman's rank correlation of 0.20 and 0.213 (statistically significant) for text and graph based features respectively.
△ Less
Submitted 5 March, 2024; v1 submitted 10 September, 2023;
originally announced September 2023.
-
SayCanPay: Heuristic Planning with Large Language Models using Learnable Domain Knowledge
Authors:
Rishi Hazra,
Pedro Zuidberg Dos Martires,
Luc De Raedt
Abstract:
Large Language Models (LLMs) have demonstrated impressive planning abilities due to their vast "world knowledge". Yet, obtaining plans that are both feasible (grounded in affordances) and cost-effective (in plan length), remains a challenge, despite recent progress. This contrasts with heuristic planning methods that employ domain knowledge (formalized in action models such as PDDL) and heuristic…
▽ More
Large Language Models (LLMs) have demonstrated impressive planning abilities due to their vast "world knowledge". Yet, obtaining plans that are both feasible (grounded in affordances) and cost-effective (in plan length), remains a challenge, despite recent progress. This contrasts with heuristic planning methods that employ domain knowledge (formalized in action models such as PDDL) and heuristic search to generate feasible, optimal plans. Inspired by this, we propose to combine the power of LLMs and heuristic planning by leveraging the world knowledge of LLMs and the principles of heuristic search. Our approach, SayCanPay, employs LLMs to generate actions (Say) guided by learnable domain knowledge, that evaluates actions' feasibility (Can) and long-term reward/payoff (Pay), and heuristic search to select the best sequence of actions. Our contributions are (1) a novel framing of the LLM planning problem in the context of heuristic planning, (2) integrating grounding and cost-effective elements into the generated plans, and (3) using heuristic search over actions. Our extensive evaluations show that our model surpasses other LLM planning approaches.
△ Less
Submitted 1 January, 2024; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Deep Explainable Relational Reinforcement Learning: A Neuro-Symbolic Approach
Authors:
Rishi Hazra,
Luc De Raedt
Abstract:
Despite numerous successes in Deep Reinforcement Learning (DRL), the learned policies are not interpretable. Moreover, since DRL does not exploit symbolic relational representations, it has difficulties in co** with structural changes in its environment (such as increasing the number of objects). Relational Reinforcement Learning, on the other hand, inherits the relational representations from s…
▽ More
Despite numerous successes in Deep Reinforcement Learning (DRL), the learned policies are not interpretable. Moreover, since DRL does not exploit symbolic relational representations, it has difficulties in co** with structural changes in its environment (such as increasing the number of objects). Relational Reinforcement Learning, on the other hand, inherits the relational representations from symbolic planning to learn reusable policies. However, it has so far been unable to scale up and exploit the power of deep neural networks. We propose Deep Explainable Relational Reinforcement Learning (DERRL), a framework that exploits the best of both -- neural and symbolic worlds. By resorting to a neuro-symbolic approach, DERRL combines relational representations and constraints from symbolic planning with deep learning to extract interpretable policies. These policies are in the form of logical rules that explain how each decision (or action) is arrived at. Through several experiments, in setups like the Countdown Game, Blocks World, Gridworld, and Traffic, we show that the policies learned by DERRL can be applied to different configurations and contexts, hence generalizing to environmental modifications.
△ Less
Submitted 14 July, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
EgoTV: Egocentric Task Verification from Natural Language Task Descriptions
Authors:
Rishi Hazra,
Brian Chen,
Akshara Rai,
Nitin Kamra,
Ruta Desai
Abstract:
To enable progress towards egocentric agents capable of understanding everyday tasks specified in natural language, we propose a benchmark and a synthetic dataset called Egocentric Task Verification (EgoTV). The goal in EgoTV is to verify the execution of tasks from egocentric videos based on the natural language description of these tasks. EgoTV contains pairs of videos and their task description…
▽ More
To enable progress towards egocentric agents capable of understanding everyday tasks specified in natural language, we propose a benchmark and a synthetic dataset called Egocentric Task Verification (EgoTV). The goal in EgoTV is to verify the execution of tasks from egocentric videos based on the natural language description of these tasks. EgoTV contains pairs of videos and their task descriptions for multi-step tasks -- these tasks contain multiple sub-task decompositions, state changes, object interactions, and sub-task ordering constraints. In addition, EgoTV also provides abstracted task descriptions that contain only partial details about ways to accomplish a task. Consequently, EgoTV requires causal, temporal, and compositional reasoning of video and language modalities, which is missing in existing datasets. We also find that existing vision-language models struggle at such all round reasoning needed for task verification in EgoTV. Inspired by the needs of EgoTV, we propose a novel Neuro-Symbolic Grounding (NSG) approach that leverages symbolic representations to capture the compositional and temporal structure of tasks. We demonstrate NSG's capability towards task tracking and verification on our EgoTV dataset and a real-world dataset derived from CrossTask (CTV). We open-source the EgoTV and CTV datasets and the NSG model for future research on egocentric assistive agents.
△ Less
Submitted 25 September, 2023; v1 submitted 29 March, 2023;
originally announced March 2023.
-
Is this bug severe? A text-cum-graph based model for bug severity prediction
Authors:
Rima Hazra,
Arpit Dwivedi,
Animesh Mukherjee
Abstract:
Repositories of large software systems have become commonplace. This massive expansion has resulted in the emergence of various problems in these software platforms including identification of (i) bug-prone packages, (ii) critical bugs, and (iii) severity of bugs. One of the important goals would be to mine these bugs and recommend them to the developers to resolve them. The first step to this is…
▽ More
Repositories of large software systems have become commonplace. This massive expansion has resulted in the emergence of various problems in these software platforms including identification of (i) bug-prone packages, (ii) critical bugs, and (iii) severity of bugs. One of the important goals would be to mine these bugs and recommend them to the developers to resolve them. The first step to this is that one has to accurately detect the extent of severity of the bugs. In this paper, we take up this task of predicting the severity of bugs in the near future. Contextualized neural models built on the text description of a bug and the user comments about the bug help to achieve reasonably good performance. Further information on how the bugs are related to each other in terms of the ways they affect packages can be summarised in the form of a graph and used along with the text to get additional benefits.
△ Less
Submitted 1 July, 2022;
originally announced July 2022.
-
Zero-Shot Generalization using Intrinsically Motivated Compositional Emergent Protocols
Authors:
Rishi Hazra,
Sonu Dixit,
Sayambhu Sen
Abstract:
Human language has been described as a system that makes \textit{use of finite means to express an unlimited array of thoughts}. Of particular interest is the aspect of compositionality, whereby, the meaning of a compound language expression can be deduced from the meaning of its constituent parts. If artificial agents can develop compositional communication protocols akin to human language, they…
▽ More
Human language has been described as a system that makes \textit{use of finite means to express an unlimited array of thoughts}. Of particular interest is the aspect of compositionality, whereby, the meaning of a compound language expression can be deduced from the meaning of its constituent parts. If artificial agents can develop compositional communication protocols akin to human language, they can be made to seamlessly generalize to unseen combinations. Studies have recognized the role of curiosity in enabling linguistic development in children. In this paper, we seek to use this intrinsic feedback in inducing a systematic and unambiguous protolanguage. We demonstrate how compositionality can enable agents to not only interact with unseen objects but also transfer skills from one task to another in a zero-shot setting: \textit{Can an agent, trained to `pull' and `push twice', `pull twice'?}.
△ Less
Submitted 11 May, 2021;
originally announced May 2021.
-
gComm: An environment for investigating generalization in Grounded Language Acquisition
Authors:
Rishi Hazra,
Sonu Dixit
Abstract:
gComm is a step towards develo** a robust platform to foster research in grounded language acquisition in a more challenging and realistic setting. It comprises a 2-d grid environment with a set of agents (a stationary speaker and a mobile listener connected via a communication channel) exposed to a continuous array of tasks in a partially observable setting. The key to solving these tasks lies…
▽ More
gComm is a step towards develo** a robust platform to foster research in grounded language acquisition in a more challenging and realistic setting. It comprises a 2-d grid environment with a set of agents (a stationary speaker and a mobile listener connected via a communication channel) exposed to a continuous array of tasks in a partially observable setting. The key to solving these tasks lies in agents develo** linguistic abilities and utilizing them for efficiently exploring the environment. The speaker and listener have access to information provided in different modalities, i.e. the speaker's input is a natural language instruction that contains the target and task specifications and the listener's input is its grid-view. Each must rely on the other to complete the assigned task, however, the only way they can achieve the same, is to develop and use some form of communication. gComm provides several tools for studying different forms of communication and assessing their generalization.
△ Less
Submitted 15 May, 2021; v1 submitted 9 May, 2021;
originally announced May 2021.
-
Active$^2$ Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation
Authors:
Rishi Hazra,
Parag Dutta,
Shubham Gupta,
Mohammed Abdul Qaathir,
Ambedkar Dukkipati
Abstract:
While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and time-consuming. Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for manual annotation…
▽ More
While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and time-consuming. Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for manual annotation based on their estimated utility in training the given model. In this paper, we argue that since AL strategies choose examples independently, they may potentially select similar examples, all of which may not contribute significantly to the learning process. Our proposed approach, Active$\mathbf{^2}$ Learning (A$\mathbf{^2}$L), actively adapts to the deep learning model being trained to eliminate further such redundant examples chosen by an AL strategy. We show that A$\mathbf{^2}$L is widely applicable by using it in conjunction with several different AL strategies and NLP tasks. We empirically demonstrate that the proposed approach is further able to reduce the data requirements of state-of-the-art AL strategies by an absolute percentage reduction of $\approx\mathbf{3-25\%}$ on multiple NLP tasks while achieving the same performance with no additional computation overhead.
△ Less
Submitted 3 April, 2021; v1 submitted 11 March, 2021;
originally announced March 2021.
-
Joint Autoregressive and Graph Models for Software and Developer Social Networks
Authors:
Rima Hazra,
Hardik Aggarwal,
Pawan Goyal,
Animesh Mukherjee,
Soumen Chakrabarti
Abstract:
Social network research has focused on hyperlink graphs, bibliographic citations, friend/follow patterns, influence spread, etc. Large software repositories also form a highly valuable networked artifact, usually in the form of a collection of packages, their developers, dependencies among them, and bug reports. This "social network of code" is rarely studied by social network researchers. We intr…
▽ More
Social network research has focused on hyperlink graphs, bibliographic citations, friend/follow patterns, influence spread, etc. Large software repositories also form a highly valuable networked artifact, usually in the form of a collection of packages, their developers, dependencies among them, and bug reports. This "social network of code" is rarely studied by social network researchers. We introduce two new problems in this setting. These problems are well-motivated in the software engineering community but not closely studied by social network scientists. The first is to identify packages that are most likely to be troubled by bugs in the immediate future, thereby demanding the greatest attention. The second is to recommend developers to packages for the next development cycle. Simple autoregression can be applied to historical data for both problems, but we propose a novel method to integrate network-derived features and demonstrate that our method brings additional benefits. Apart from formalizing these problems and proposing new baseline approaches, we prepare and contribute a substantial dataset connecting multiple attributes built from the long-term history of 20 releases of Ubuntu, growing to over 25,000 packages with their dependency links, maintained by over 3,800 developers, with over 280k bug reports.
△ Less
Submitted 21 January, 2021;
originally announced January 2021.
-
Mining the online infosphere: A survey
Authors:
Sayantan Adak,
Souvic Chakraborty,
Paramtia Das,
Mithun Das,
Abhisek Dash,
Rima Hazra,
Binny Mathew,
Punyajoy Saha,
Soumya Sarkar,
Animesh Mukherjee
Abstract:
The evolution of AI-based system and applications had pervaded everyday life to make decisions that have momentous impact on individuals and society. With the staggering growth of online data, often termed as the Online Infosphere it has become paramount to monitor the infosphere to ensure social good as the AI-based decisions are severely dependent on it. The goal of this survey is to provide a c…
▽ More
The evolution of AI-based system and applications had pervaded everyday life to make decisions that have momentous impact on individuals and society. With the staggering growth of online data, often termed as the Online Infosphere it has become paramount to monitor the infosphere to ensure social good as the AI-based decisions are severely dependent on it. The goal of this survey is to provide a comprehensive review of some of the most important research areas related to infosphere, focusing on the technical challenges and potential solutions. The survey also outlines some of the important future directions. We begin by discussions focused on the collaborative systems that have emerged within the infosphere with a special thrust on Wikipedia. In the follow up we demonstrate how the infosphere has been instrumental in the growth of scientific citations and collaborations thus fueling interdisciplinary research. Finally, we illustrate the issues related to the governance of the infosphere such as the tackling of the (a) rising hateful and abusive behavior and (b) bias and discrimination in different online platforms and news reporting.
△ Less
Submitted 2 January, 2021;
originally announced January 2021.
-
Intrinsically Motivated Compositional Language Emergence
Authors:
Rishi Hazra,
Sonu Dixit,
Sayambhu Sen
Abstract:
Recently, there has been a great deal of research in emergent communication on artificial agents interacting in simulated environments. Recent studies have revealed that, in general, emergent languages do not follow the compositionality patterns of natural language. To deal with this, existing works have proposed a limited channel capacity as an important constraint for learning highly composition…
▽ More
Recently, there has been a great deal of research in emergent communication on artificial agents interacting in simulated environments. Recent studies have revealed that, in general, emergent languages do not follow the compositionality patterns of natural language. To deal with this, existing works have proposed a limited channel capacity as an important constraint for learning highly compositional languages. In this paper, we show that this is not a sufficient condition and propose an intrinsic reward framework for improving compositionality in emergent communication. We use a reinforcement learning setting with two agents -- a \textit{task-aware} Speaker and a \textit{state-aware} Listener that are required to communicate to perform a set of tasks. Through our experiments on three different referential game setups, including a novel environment gComm, we show intrinsic rewards improve compositionality scores by $\approx \mathbf{1.5-2}$ times that of existing frameworks that use limited channel capacity.
△ Less
Submitted 27 January, 2023; v1 submitted 9 December, 2020;
originally announced December 2020.
-
Characterising authors on the extent of their paper acceptance: A case study of the Journal of High Energy Physics
Authors:
Rima Hazra,
Aryan,
Hardik Aggarwal,
Matteo Marsili,
Animesh Mukherjee
Abstract:
New researchers are usually very curious about the recipe that could accelerate the chances of their paper getting accepted in a reputed forum (journal/conference). In search of such a recipe, we investigate the profile and peer review text of authors whose papers almost always get accepted at a venue (Journal of High Energy Physics in our current work). We find authors with high acceptance rate a…
▽ More
New researchers are usually very curious about the recipe that could accelerate the chances of their paper getting accepted in a reputed forum (journal/conference). In search of such a recipe, we investigate the profile and peer review text of authors whose papers almost always get accepted at a venue (Journal of High Energy Physics in our current work). We find authors with high acceptance rate are likely to have a high number of citations, high $h$-index, higher number of collaborators etc. We notice that they receive relatively lengthy and positive reviews for their papers. In addition, we also construct three networks -- co-reviewer, co-citation and collaboration network and study the network-centric features and intra- and inter-category edge interactions. We find that the authors with high acceptance rate are more `central' in these networks; the volume of intra- and inter-category interactions are also drastically different for the authors with high acceptance rate compared to the other authors. Finally, using the above set of features, we train standard machine learning models (random forest, XGBoost) and obtain very high class wise precision and recall. In a followup discussion we also narrate how apart from the author characteristics, the peer-review system might itself have a role in propelling the distinction among the different categories which could lead to potential discrimination and unfairness and calls for further investigation by the system admins.
△ Less
Submitted 11 June, 2020;
originally announced June 2020.
-
Networked Multi-Agent Reinforcement Learning with Emergent Communication
Authors:
Shubham Gupta,
Rishi Hazra,
Ambedkar Dukkipati
Abstract:
Multi-Agent Reinforcement Learning (MARL) methods find optimal policies for agents that operate in the presence of other learning agents. Central to achieving this is how the agents coordinate. One way to coordinate is by learning to communicate with each other. Can the agents develop a language while learning to perform a common task? In this paper, we formulate and study a MARL problem where coo…
▽ More
Multi-Agent Reinforcement Learning (MARL) methods find optimal policies for agents that operate in the presence of other learning agents. Central to achieving this is how the agents coordinate. One way to coordinate is by learning to communicate with each other. Can the agents develop a language while learning to perform a common task? In this paper, we formulate and study a MARL problem where cooperative agents are connected to each other via a fixed underlying network. These agents can communicate along the edges of this network by exchanging discrete symbols. However, the semantics of these symbols are not predefined and, during training, the agents are required to develop a language that helps them in accomplishing their goals. We propose a method for training these agents using emergent communication. We demonstrate the applicability of the proposed framework by applying it to the problem of managing traffic controllers, where we achieve state-of-the-art performance as compared to a number of strong baselines. More importantly, we perform a detailed analysis of the emergent communication to show, for instance, that the developed language is grounded and demonstrate its relationship with the underlying network topology. To the best of our knowledge, this is the only work that performs an in depth analysis of emergent communication in a networked MARL setting while being applicable to a broad class of problems.
△ Less
Submitted 9 April, 2020; v1 submitted 6 April, 2020;
originally announced April 2020.
-
Active$^2$ Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation
Authors:
Rishi Hazra,
Parag Dutta,
Shubham Gupta,
Mohammed Abdul Qaathir,
Ambedkar Dukkipati
Abstract:
While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and time-consuming. Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for manual annotation…
▽ More
While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and time-consuming. Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for manual annotation based on their estimated utility in training the given model. In this paper, we argue that since AL strategies choose examples independently, they may potentially select similar examples, all of which may not contribute significantly to the learning process. Our proposed approach, Active$\mathbf{^2}$ Learning (A$\mathbf{^2}$L), actively adapts to the deep learning model being trained to eliminate such redundant examples chosen by an AL strategy. We show that A$\mathbf{^2}$L is widely applicable by using it in conjunction with several different AL strategies and NLP tasks. We empirically demonstrate that the proposed approach is further able to reduce the data requirements of state-of-the-art AL strategies by $\approx \mathbf{3-25\%}$ on an absolute scale on multiple NLP tasks while achieving the same performance with virtually no additional computation overhead.
△ Less
Submitted 6 April, 2021; v1 submitted 1 November, 2019;
originally announced November 2019.
-
The rise and rise of interdisciplinary research: Understanding the interaction dynamics of three major fields -- Physics, Mathematics & Computer Science
Authors:
Rima Hazra,
Mayank Singh,
Pawan Goyal,
Bibhas Adhikari,
Animesh Mukherjee
Abstract:
The distinction between sciences is becoming increasingly more artificial -- an approach from one area can be easily applied to the other. More exciting research nowadays is happening perhaps at the interfaces of disciplines like Physics, Mathematics and Computer Science. How do these interfaces emerge and interact? For instance, is there a specific pattern in which these fields cite each other? I…
▽ More
The distinction between sciences is becoming increasingly more artificial -- an approach from one area can be easily applied to the other. More exciting research nowadays is happening perhaps at the interfaces of disciplines like Physics, Mathematics and Computer Science. How do these interfaces emerge and interact? For instance, is there a specific pattern in which these fields cite each other? In this article, we investigate a collection of more than 1.2 million papers from three different scientific disciplines -- Physics, Mathematics, and Computer Science. We show how over a timescale the citation patterns from the core science fields (Physics, Mathematics) to the applied and fast-growing field of Computer Science have drastically increased. Further, we observe how certain subfields in these disciplines are shrinking while others are becoming tremendously popular. For instance, an intriguing observation is that citations from Mathematics to the subfield of machine learning in Computer Science in recent times are exponentially increasing.
△ Less
Submitted 10 August, 2019;
originally announced August 2019.