-
Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks
Authors:
Ibrahim Abdelaziz,
Kinjal Basu,
Mayank Agarwal,
Sadhana Kumaravel,
Matthew Stallone,
Rameswar Panda,
Yara Rizk,
GP Bhargav,
Maxwell Crouse,
Chulaka Gunasekara,
Shajith Ikbal,
Sachin Joshi,
Hima Karanam,
Vineet Kumar,
Asim Munawar,
Sumit Neelam,
Dinesh Raghu,
Udit Sharma,
Adriana Meza Soria,
Dheeraj Sreedhar,
Praveen Venkateswaran,
Merve Unuvar,
David Cox,
Salim Roukos,
Luis Lastras
, et al. (1 additional authors not shown)
Abstract:
Large language models (LLMs) have recently shown tremendous promise in serving as the backbone to agentic systems, as demonstrated by their performance in multi-faceted, challenging benchmarks like SWE-Bench and Agent-Bench. However, to realize the true potential of LLMs as autonomous agents, they must learn to identify, call, and interact with external tools and application program interfaces (APIs) to complete complex tasks. These tasks together are termed function calling. Endowing LLMs with function calling abilities leads to a myriad of advantages, such as access to current and domain-specific information in databases and knowledge sources, and the ability to outsource tasks that can be reliably performed by tools, e.g., a Python interpreter or calculator. While there has been significant progress in function calling with LLMs, there is still a dearth of open models that perform on par with proprietary LLMs like GPT, Claude, and Gemini. Therefore, in this work, we introduce the GRANITE-20B-FUNCTIONCALLING model under an Apache 2.0 license. The model is trained using a multi-task training approach on seven fundamental tasks encompassed in function calling, those being Nested Function Calling, Function Chaining, Parallel Functions, Function Name Detection, Parameter-Value Pair Detection, Next-Best Function, and Response Generation. We present a comprehensive evaluation on multiple out-of-domain datasets comparing GRANITE-20B-FUNCTIONCALLING to more than 15 other best proprietary and open models. GRANITE-20B-FUNCTIONCALLING provides the best performance among all open models on the Berkeley Function Calling Leaderboard and fourth overall. As a result of the diverse tasks and datasets used for training our model, we show that GRANITE-20B-FUNCTIONCALLING has better generalizability on multiple tasks in seven different evaluation datasets.
Submitted 27 June, 2024;
originally announced July 2024.
-
An Approach to Build Zero-Shot Slot-Filling System for Industry-Grade Conversational Assistants
Authors:
G P Shrivatsa Bhargav,
Sumit Neelam,
Udit Sharma,
Shajith Ikbal,
Dheeraj Sreedhar,
Hima Karanam,
Sachindra Joshi,
Pankaj Dhoolia,
Dinesh Garg,
Kyle Croutwater,
Haode Qi,
Eric Wayne,
J William Murdock
Abstract:
We present an approach to building a Large Language Model (LLM) based slot-filling system to perform Dialogue State Tracking in conversational assistants serving across a wide variety of industry-grade applications. Key requirements of this system include: 1) usage of smaller-sized models to meet low latency requirements and to enable convenient and cost-effective cloud and customer premise deployments, and 2) zero-shot capabilities to serve across a wide variety of domains, slot types and conversational scenarios. We adopt a fine-tuning approach where a pre-trained LLM is fine-tuned into a slot-filling model using task specific data. The fine-tuning data is prepared carefully to cover a wide variety of slot-filling task scenarios that the model is expected to face across various domains. We give details of the data preparation and model building process. We also give a detailed analysis of the results of our experimental evaluations. Results show that our prescribed approach for slot-filling model building has resulted in 6.9% relative improvement of F1 metric over the best baseline on a realistic benchmark, while at the same time reducing the latency by 57%. Moreover, the data we prepared has helped improve F1 by 4.2% relative on average across various slot-types.
Submitted 13 June, 2024;
originally announced June 2024.
-
An Ensemble Approach for Automated Theorem Proving Based on Efficient Name Invariant Graph Neural Representations
Authors:
Achille Fokoue,
Ibrahim Abdelaziz,
Maxwell Crouse,
Shajith Ikbal,
Akihiro Kishimoto,
Guilherme Lima,
Ndivhuwo Makondo,
Radu Marinescu
Abstract:
Using reinforcement learning for automated theorem proving has recently received much attention. Current approaches use representations of logical statements that often rely on the names used in these statements and, as a result, the models are generally not transferable from one domain to another. The size of these representations and whether to include the whole theory or part of it are other important decisions that affect the performance of these approaches as well as their runtime efficiency. In this paper, we present NIAGRA, an ensemble Name InvAriant Graph RepresentAtion. NIAGRA addresses this problem by using 1) improved Graph Neural Networks for learning name-invariant formula representations that are tailored to their unique characteristics and 2) an efficient ensemble approach for automated theorem proving. Our experimental evaluation shows state-of-the-art performance on multiple datasets from different domains with improvements up to 10% compared to the best learning-based approaches. Furthermore, transfer learning experiments show that our approach significantly outperforms other learning-based approaches by up to 28%.
Submitted 15 May, 2023;
originally announced May 2023.
-
Targeted Extraction of Temporal Facts from Textual Resources for Improved Temporal Question Answering over Knowledge Bases
Authors:
Nithish Kannen,
Udit Sharma,
Sumit Neelam,
Dinesh Khandelwal,
Shajith Ikbal,
Hima Karanam,
L Venkata Subramaniam
Abstract:
Knowledge Base Question Answering (KBQA) systems have the goal of answering complex natural language questions by reasoning over relevant facts retrieved from Knowledge Bases (KB). One of the major challenges faced by these systems is their inability to retrieve all relevant facts due to factors such as incomplete KB and entity/relation linking errors. In this paper, we address this particular challenge for systems handling a specific category of questions called temporal questions, where answer derivation involves reasoning over facts asserting points/intervals of time for various events. We propose a novel approach where a targeted temporal fact extraction technique is used to assist KBQA whenever it fails to retrieve temporal facts from the KB. We use $λ$-expressions of the questions to logically represent the component facts and the reasoning steps needed to derive the answer. This allows us to spot those facts that failed to get retrieved from the KB and generate textual queries to extract them from the textual resources in an open-domain question answering fashion. We evaluated our approach on a benchmark temporal question answering dataset considering Wikidata and Wikipedia respectively as the KB and textual resource. Experimental results show a significant $\sim$30\% relative improvement in answer accuracy, demonstrating the effectiveness of our approach.
Submitted 21 March, 2022;
originally announced March 2022.
-
A Benchmark for Generalizable and Interpretable Temporal Question Answering over Knowledge Bases
Authors:
Sumit Neelam,
Udit Sharma,
Hima Karanam,
Shajith Ikbal,
Pavan Kapanipathi,
Ibrahim Abdelaziz,
Nandana Mihindukulasooriya,
Young-Suk Lee,
Santosh Srivastava,
Cezar Pendus,
Saswati Dana,
Dinesh Garg,
Achille Fokoue,
G P Shrivatsa Bhargav,
Dinesh Khandelwal,
Srinivas Ravishankar,
Sairam Gurajada,
Maria Chang,
Rosario Uceda-Sosa,
Salim Roukos,
Alexander Gray,
Guilherme Lima,
Ryan Riegel,
Francois Luus,
L Venkata Subramaniam
Abstract:
Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most existing KBQA datasets focus primarily on generic multi-hop reasoning over explicit facts, largely ignoring other reasoning types such as temporal, spatial, and taxonomic reasoning. In this paper, we present a benchmark dataset for temporal reasoning, TempQA-WD, to encourage research in extending the present approaches to target a more challenging set of complex reasoning tasks. Specifically, our benchmark is a temporal question answering dataset with the following advantages: (a) it is based on Wikidata, which is the most frequently curated, openly available knowledge base, (b) it includes intermediate SPARQL queries to facilitate the evaluation of semantic parsing based approaches for KBQA, and (c) it generalizes to multiple knowledge bases: Freebase and Wikidata. The TempQA-WD dataset is available at https://github.com/IBM/tempqa-wd.
Submitted 15 January, 2022;
originally announced January 2022.
-
SYGMA: System for Generalizable Modular Question Answering Over Knowledge Bases
Authors:
Sumit Neelam,
Udit Sharma,
Hima Karanam,
Shajith Ikbal,
Pavan Kapanipathi,
Ibrahim Abdelaziz,
Nandana Mihindukulasooriya,
Young-Suk Lee,
Santosh Srivastava,
Cezar Pendus,
Saswati Dana,
Dinesh Garg,
Achille Fokoue,
G P Shrivatsa Bhargav,
Dinesh Khandelwal,
Srinivas Ravishankar,
Sairam Gurajada,
Maria Chang,
Rosario Uceda-Sosa,
Salim Roukos,
Alexander Gray,
Guilherme Lima,
Ryan Riegel,
Francois Luus,
L Venkata Subramaniam
Abstract:
Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most KBQA systems struggle with generalizability, particularly on two dimensions: (a) across multiple reasoning types where both datasets and systems have primarily focused on multi-hop reasoning, and (b) across multiple knowledge bases, where KBQA approaches are specifically tuned to a single knowledge base. In this paper, we present SYGMA, a modular approach facilitating generalizability across multiple knowledge bases and multiple reasoning types. Specifically, SYGMA contains three high level modules: 1) KB-agnostic question understanding module that is common across KBs 2) Rules to support additional reasoning types and 3) KB-specific question mapping and answering module to address the KB-specific aspects of the answer extraction. We demonstrate effectiveness of our system by evaluating on datasets belonging to two distinct knowledge bases, DBpedia and Wikidata. In addition, to demonstrate extensibility to additional reasoning types we evaluate on multi-hop reasoning datasets and a new Temporal KBQA benchmark dataset on Wikidata, named TempQA-WD, introduced in this paper. We show that our generalizable approach has better competitive performance on multiple datasets on DBpedia and Wikidata that require both multi-hop and temporal reasoning.
Submitted 27 September, 2021;
originally announced September 2021.
-
Learning to Guide a Saturation-Based Theorem Prover
Authors:
Ibrahim Abdelaziz,
Maxwell Crouse,
Bassem Makni,
Vernon Austil,
Cristina Cornelio,
Shajith Ikbal,
Pavan Kapanipathi,
Ndivhuwo Makondo,
Kavitha Srinivas,
Michael Witbrock,
Achille Fokoue
Abstract:
Traditional automated theorem provers have relied on manually tuned heuristics to guide how they perform proof search. Recently, however, there has been a surge of interest in the design of learning mechanisms that can be integrated into theorem provers to improve their performance automatically. In this work, we introduce TRAIL, a deep learning-based approach to theorem proving that characterizes core elements of saturation-based theorem proving within a neural framework. TRAIL leverages (a) an effective graph neural network for representing logical formulas, (b) a novel neural representation of the state of a saturation-based theorem prover in terms of processed clauses and available actions, and (c) a novel representation of the inference selection process as an attention-based action policy. We show through a systematic analysis that these components allow TRAIL to significantly outperform previous reinforcement learning-based theorem provers on two standard benchmark datasets (up to 36% more theorems proved). In addition, to the best of our knowledge, TRAIL is the first reinforcement learning-based approach to exceed the performance of a state-of-the-art traditional theorem prover on a standard theorem proving benchmark (solving up to 17% more problems).
Submitted 7 June, 2021;
originally announced June 2021.
-
HEDRA: A Bio-Inspired Modular Tensegrity Soft Robot With Polyhedral Parallel Modules
Authors:
Vishal Ramadoss,
Keerthi Sagar,
Mohamed Sadiq Ikbal,
Jesus Hiram Lugo Calles,
Matteo Zoppi
Abstract:
There is a surge of research interest in the field of tensegrity robotics. Robots developed under this paradigm provide many advantages and have distinguishing features in terms of structural compliance, dexterity, safety, and weight reduction. This paper proposes a new robotic mechanism based on tensegrity ('tension-integrity') robots and reconfigurable modular robots. The specific actuation schemes for this tensegrity robot with multiple degrees of freedom are presented. This article describes an easy-to-assemble 350 mm tensegrity based robot prototype built by stacking a series of rigid struts linked with tensegrity joints that have no direct rigid contact with each other. The functionality of the proposed robot, which integrates the polyhedral parallel structure as its skeleton with a series of tensegrity joints, is validated by experimental results. The proposed manipulator is capable of reaching bending angles up to 76 degrees. An adaptive cable driven underactuated robotic gripper is designed and attached to the tensegrity manipulator for grasping objects of different shapes, weights, and sizes.
Submitted 28 November, 2020;
originally announced November 2020.
-
Logical Neural Networks
Authors:
Ryan Riegel,
Alexander Gray,
Francois Luus,
Naweed Khan,
Ndivhuwo Makondo,
Ismail Yunus Akhalwaya,
Haifeng Qian,
Ronald Fagin,
Francisco Barahona,
Udit Sharma,
Shajith Ikbal,
Hima Karanam,
Sumit Neelam,
Ankita Likhyani,
Santosh Srivastava
Abstract:
We propose a novel framework seamlessly providing key properties of both neural nets (learning) and symbolic logic (knowledge and reasoning). Every neuron has a meaning as a component of a formula in a weighted real-valued logic, yielding a highly interpretable disentangled representation. Inference is omnidirectional rather than focused on predefined target variables, and corresponds to logical reasoning, including classical first-order logic theorem proving as a special case. The model is end-to-end differentiable, and learning minimizes a novel loss function capturing logical contradiction, yielding resilience to inconsistent knowledge. It also enables the open-world assumption by maintaining bounds on truth values which can have probabilistic semantics, yielding resilience to incomplete knowledge.
Submitted 23 June, 2020;
originally announced June 2020.