Showing 1–35 of 35 results for author: Abdelaziz, I

  1. arXiv:2407.01619  [pdf, other]

    cs.LG cs.AI cs.DB

    TabSketchFM: Sketch-based Tabular Representation Learning for Data Discovery over Data Lakes

    Authors: Aamod Khatiwada, Harsha Kokel, Ibrahim Abdelaziz, Subhajit Chaudhury, Julian Dolby, Oktie Hassanzadeh, Zhenhan Huang, Tejaswini Pedapati, Horst Samulowitz, Kavitha Srinivas

    Abstract: Enterprises have a growing need to identify relevant tables in data lakes; e.g. tables that are unionable, joinable, or subsets of each other. Tabular neural models can be helpful for such data discovery tasks. In this paper, we present TabSketchFM, a neural tabular model for data discovery over data lakes. First, we propose a novel pre-training sketch-based approach to enhance the effectiveness o…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2307.04217

  2. arXiv:2407.00121  [pdf, other]

    cs.LG cs.AI cs.CL

    Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks

    Authors: Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Sadhana Kumaravel, Matthew Stallone, Rameswar Panda, Yara Rizk, GP Bhargav, Maxwell Crouse, Chulaka Gunasekara, Shajith Ikbal, Sachin Joshi, Hima Karanam, Vineet Kumar, Asim Munawar, Sumit Neelam, Dinesh Raghu, Udit Sharma, Adriana Meza Soria, Dheeraj Sreedhar, Praveen Venkateswaran, Merve Unuvar, David Cox, Salim Roukos, Luis Lastras, et al. (1 additional author not shown)

    Abstract: Large language models (LLMs) have recently shown tremendous promise in serving as the backbone to agentic systems, as demonstrated by their performance in multi-faceted, challenging benchmarks like SWE-Bench and Agent-Bench. However, to realize the true potential of LLMs as autonomous agents, they must learn to identify, call, and interact with external tools and application program interfaces (AP…

    Submitted 27 June, 2024; originally announced July 2024.

  3. arXiv:2405.04324  [pdf, other]

    cs.AI cs.CL cs.SE

    Granite Code Models: A Family of Open Foundation Models for Code Intelligence

    Authors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, et al. (21 additional authors not shown)

    Abstract: Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabili…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Corresponding Authors: Rameswar Panda, Ruchir Puri; Equal Contributors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang

  4. arXiv:2402.15491  [pdf, other]

    cs.CL cs.AI

    API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs

    Authors: Kinjal Basu, Ibrahim Abdelaziz, Subhajit Chaudhury, Soham Dan, Maxwell Crouse, Asim Munawar, Sadhana Kumaravel, Vinod Muthusamy, Pavan Kapanipathi, Luis A. Lastras

    Abstract: There is a growing need for Large Language Models (LLMs) to effectively use tools and external Application Programming Interfaces (APIs) to plan and complete tasks. As such, there is tremendous interest in methods that can acquire sufficient quantities of train and test data that involve calls to tools / APIs. Two lines of research have emerged as the predominant strategies for addressing this cha…

    Submitted 20 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted at ACL'24-main conference

  5. arXiv:2310.08535  [pdf, other]

    cs.AI cs.CL

    Formally Specifying the High-Level Behavior of LLM-Based Agents

    Authors: Maxwell Crouse, Ibrahim Abdelaziz, Ramon Astudillo, Kinjal Basu, Soham Dan, Sadhana Kumaravel, Achille Fokoue, Pavan Kapanipathi, Salim Roukos, Luis Lastras

    Abstract: Autonomous, goal-driven agents powered by LLMs have recently emerged as promising tools for solving challenging problems without the need for task-specific finetuned models that can be expensive to procure. Currently, the design and implementation of such agents is ad hoc, as the wide variety of tasks that LLM-based agents may be applied to naturally means there can be no one-size-fits-all approac…

    Submitted 24 January, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Preprint under review

  6. arXiv:2307.04217  [pdf, other]

    cs.DB cs.AI

    LakeBench: Benchmarks for Data Discovery over Data Lakes

    Authors: Kavitha Srinivas, Julian Dolby, Ibrahim Abdelaziz, Oktie Hassanzadeh, Harsha Kokel, Aamod Khatiwada, Tejaswini Pedapati, Subhajit Chaudhury, Horst Samulowitz

    Abstract: Within enterprises, there is a growing need to intelligently navigate data lakes, specifically focusing on data discovery. Of particular importance to enterprises is the ability to find related tables in data repositories. These tables can be unionable, joinable, or subsets of each other. There is a dearth of benchmarks for these tasks in the public domain, with related work targeting private data…

    Submitted 9 July, 2023; originally announced July 2023.

  7. arXiv:2306.10452  [pdf, other]

    cs.CL

    MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types

    Authors: Keerthiram Murugesan, Sarathkrishna Swaminathan, Soham Dan, Subhajit Chaudhury, Chulaka Gunasekara, Maxwell Crouse, Diwakar Mahajan, Ibrahim Abdelaziz, Achille Fokoue, Pavan Kapanipathi, Salim Roukos, Alexander Gray

    Abstract: With the growing interest in large language models, the need for evaluating the quality of machine text compared to reference (typically human-generated) text has become focal attention. Most recent works focus either on task-specific evaluation metrics or study the properties of machine-generated text captured by the existing metrics. In this work, we propose a new evaluation scheme to model huma…

    Submitted 17 June, 2023; originally announced June 2023.

    Comments: Accepted at ACL 2023 (ACL Findings Long)

  8. arXiv:2305.08676  [pdf, other]

    cs.AI cs.LO

    An Ensemble Approach for Automated Theorem Proving Based on Efficient Name Invariant Graph Neural Representations

    Authors: Achille Fokoue, Ibrahim Abdelaziz, Maxwell Crouse, Shajith Ikbal, Akihiro Kishimoto, Guilherme Lima, Ndivhuwo Makondo, Radu Marinescu

    Abstract: Using reinforcement learning for automated theorem proving has recently received much attention. Current approaches use representations of logical statements that often rely on the names used in these statements and, as a result, the models are generally not transferable from one domain to another. The size of these representations and whether to include the whole theory or part of it are other im…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: Accepted to IJCAI 2023

  9. arXiv:2301.05108  [pdf, other]

    cs.PL cs.AI

    Serenity: Library Based Python Code Analysis for Code Completion and Automated Machine Learning

    Authors: Wenting Zhao, Ibrahim Abdelaziz, Julian Dolby, Kavitha Srinivas, Mossad Helali, Essam Mansour

    Abstract: Dynamically typed languages such as Python have become very popular. Among other strengths, Python's dynamic nature and its straightforward linking to native code have made it the de-facto language for many research areas such as Artificial Intelligence. This flexibility, however, makes static analysis very hard. While creating a sound, or a soundy, analysis for Python remains an open problem, we…

    Submitted 4 January, 2023; originally announced January 2023.

  10. arXiv:2209.05828  [pdf, other]

    cs.AI cs.DB

    Expressive Reasoning Graph Store: A Unified Framework for Managing RDF and Property Graph Databases

    Authors: Sumit Neelam, Udit Sharma, Sumit Bhatia, Hima Karanam, Ankita Likhyani, Ibrahim Abdelaziz, Achille Fokoue, L. V. Subramaniam

    Abstract: Resource Description Framework (RDF) and Property Graph (PG) are the two most commonly used data models for representing, storing, and querying graph data. We present Expressive Reasoning Graph Store (ERGS) -- a graph store built on top of JanusGraph (a Property Graph store) that also allows storing and querying of RDF datasets. First, we describe how RDF data can be translated into a Property Gra…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: 16 pages, 3 figures, 9 tables

  11. arXiv:2204.08554  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    CBR-iKB: A Case-Based Reasoning Approach for Question Answering over Incomplete Knowledge Bases

    Authors: Dung Thai, Srinivas Ravishankar, Ibrahim Abdelaziz, Mudit Chaudhary, Nandana Mihindukulasooriya, Tahira Naseem, Rajarshi Das, Pavan Kapanipathi, Achille Fokoue, Andrew McCallum

    Abstract: Knowledge bases (KBs) are often incomplete and constantly changing in practice. Yet, in many question answering applications coupled with knowledge bases, the sparse nature of KBs is often overlooked. To this end, we propose a case-based reasoning approach, CBR-iKB, for knowledge base question answering (KBQA) with incomplete-KB as our main focus. Our method ensembles decisions from multiple reaso…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 8 pages, 3 figures, 4 tables

  12. arXiv:2201.12242  [pdf, other]

    cs.PL

    Large Scale Generation of Labeled Type Data for Python

    Authors: Ibrahim Abdelaziz, Julian Dolby, Kavitha Srinivas

    Abstract: Recently, dynamically typed languages, such as Python, have gained unprecedented popularity. Although these languages alleviate the need for mandatory type annotations, types still play a critical role in program understanding and preventing runtime errors. An attractive option is to infer types automatically to get static guarantees without writing types. Existing inference techniques rely mostly…

    Submitted 6 February, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

  13. arXiv:2201.05793  [pdf, other]

    cs.CL cs.AI

    A Benchmark for Generalizable and Interpretable Temporal Question Answering over Knowledge Bases

    Authors: Sumit Neelam, Udit Sharma, Hima Karanam, Shajith Ikbal, Pavan Kapanipathi, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Young-Suk Lee, Santosh Srivastava, Cezar Pendus, Saswati Dana, Dinesh Garg, Achille Fokoue, G P Shrivatsa Bhargav, Dinesh Khandelwal, Srinivas Ravishankar, Sairam Gurajada, Maria Chang, Rosario Uceda-Sosa, Salim Roukos, Alexander Gray, Guilherme Lima, Ryan Riegel, Francois Luus, L Venkata Subramaniam

    Abstract: Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most existing KBQA datasets focus primarily on generic multi-hop reasoning over explicit facts, largely ignoring other reasoning types such as temporal, spatial, and taxonomic reasoning. In this paper, we present a benchmark dataset for temporal reasoning, TempQA-…

    Submitted 15 January, 2022; originally announced January 2022.

    Comments: 7 pages, 2 figures, 7 tables. arXiv admin note: substantial text overlap with arXiv:2109.13430

  14. arXiv:2112.07877  [pdf, other]

    cs.CL

    Learning to Transpile AMR into SPARQL

    Authors: Mihaela Bornea, Ramon Fernandez Astudillo, Tahira Naseem, Nandana Mihindukulasooriya, Ibrahim Abdelaziz, Pavan Kapanipathi, Radu Florian, Salim Roukos

    Abstract: We propose a transition-based system to transpile Abstract Meaning Representation (AMR) into SPARQL for Knowledge Base Question Answering (KBQA). This allows us to delegate part of the semantic representation to a strongly pre-trained semantic parser, while learning transpiling with small amount of paired data. We depart from recent work relating AMR and SPARQL constructs, but rather than applying…

    Submitted 8 December, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

  15. arXiv:2111.05825  [pdf, other]

    cs.CL cs.AI

    A Two-Stage Approach towards Generalization in Knowledge Base Question Answering

    Authors: Srinivas Ravishankar, June Thai, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Tahira Naseem, Pavan Kapanipathi, Gaetano Rossiello, Achille Fokoue

    Abstract: Most existing approaches for Knowledge Base Question Answering (KBQA) focus on a specific underlying knowledge base either because of inherent assumptions in the approach, or because evaluating it on a different knowledge base requires non-trivial changes. However, many popular knowledge bases share similarities in their underlying schemas that can be leveraged to facilitate generalization across…

    Submitted 17 November, 2021; v1 submitted 10 November, 2021; originally announced November 2021.

  16. arXiv:2111.00083  [pdf, other]

    cs.LG

    A Scalable AutoML Approach Based on Graph Neural Networks

    Authors: Mossad Helali, Essam Mansour, Ibrahim Abdelaziz, Julian Dolby, Kavitha Srinivas

    Abstract: AutoML systems build machine learning models automatically by performing a search over valid data transformations and learners, along with hyper-parameter optimization for each learner. Many AutoML systems use meta-learning to guide search for optimal pipelines. In this work, we present a novel meta-learning system called KGpip which, (1) builds a database of datasets and corresponding pipelines b…

    Submitted 14 July, 2022; v1 submitted 29 October, 2021; originally announced November 2021.

    Comments: 14 pages, 9 figures. Accepted in VLDB22

  17. arXiv:2109.13430  [pdf, other]

    cs.CL cs.AI

    SYGMA: System for Generalizable Modular Question Answering Over Knowledge Bases

    Authors: Sumit Neelam, Udit Sharma, Hima Karanam, Shajith Ikbal, Pavan Kapanipathi, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Young-Suk Lee, Santosh Srivastava, Cezar Pendus, Saswati Dana, Dinesh Garg, Achille Fokoue, G P Shrivatsa Bhargav, Dinesh Khandelwal, Srinivas Ravishankar, Sairam Gurajada, Maria Chang, Rosario Uceda-Sosa, Salim Roukos, Alexander Gray, Guilherme Lima, Ryan Riegel, Francois Luus, L Venkata Subramaniam

    Abstract: Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most KBQA systems struggle with generalizability, particularly on two dimensions: (a) across multiple reasoning types where both datasets and systems have primarily focused on multi-hop reasoning, and (b) across multiple knowledge bases, where KBQA approaches are…

    Submitted 27 September, 2021; originally announced September 2021.

  18. arXiv:2109.09566  [pdf, other]

    cs.AI cs.LG cs.LO

    Combining Rules and Embeddings via Neuro-Symbolic AI for Knowledge Base Completion

    Authors: Prithviraj Sen, Breno W. S. R. Carvalho, Ibrahim Abdelaziz, Pavan Kapanipathi, Francois Luus, Salim Roukos, Alexander Gray

    Abstract: Recent interest in Knowledge Base Completion (KBC) has led to a plethora of approaches based on reinforcement learning, inductive logic programming and graph embeddings. In particular, rule-based KBC has led to interpretable rules while being comparable in performance with graph embeddings. Even within rule-based KBC, there exist different approaches that lead to rules of varying quality and previ…

    Submitted 16 September, 2021; originally announced September 2021.

  19. arXiv:2109.07452  [pdf, other]

    cs.CL cs.AI

    Can Machines Read Coding Manuals Yet? -- A Benchmark for Building Better Language Models for Code Understanding

    Authors: Ibrahim Abdelaziz, Julian Dolby, Jamie McCusker, Kavitha Srinivas

    Abstract: Code understanding is an increasingly important application of Artificial Intelligence. A fundamental aspect of understanding code is understanding text about code, e.g., documentation and forum discussions. Pre-trained language models (e.g., BERT) are a popular approach for various NLP tasks, and there are now a variety of benchmarks, such as GLUE, to help improve the development of such models f…

    Submitted 15 September, 2021; originally announced September 2021.

  20. arXiv:2108.07337  [pdf, other]

    cs.CL cs.AI

    Generative Relation Linking for Question Answering over Knowledge Bases

    Authors: Gaetano Rossiello, Nandana Mihindukulasooriya, Ibrahim Abdelaziz, Mihaela Bornea, Alfio Gliozzo, Tahira Naseem, Pavan Kapanipathi

    Abstract: Relation linking is essential to enable question answering over knowledge bases. Although there are various efforts to improve relation linking performance, the current state-of-the-art methods do not achieve optimal results, therefore, negatively impacting the overall end-to-end question answering performance. In this work, we propose a novel approach for relation linking framing it as a generati…

    Submitted 16 August, 2021; originally announced August 2021.

    Comments: Accepted at the 20th International Semantic Web Conference (ISWC 2021)

  21. arXiv:2106.03906  [pdf, other]

    cs.AI cs.LO

    Learning to Guide a Saturation-Based Theorem Prover

    Authors: Ibrahim Abdelaziz, Maxwell Crouse, Bassem Makni, Vernon Austil, Cristina Cornelio, Shajith Ikbal, Pavan Kapanipathi, Ndivhuwo Makondo, Kavitha Srinivas, Michael Witbrock, Achille Fokoue

    Abstract: Traditional automated theorem provers have relied on manually tuned heuristics to guide how they perform proof search. Recently, however, there has been a surge of interest in the design of learning mechanisms that can be integrated into theorem provers to improve their performance automatically. In this work, we introduce TRAIL, a deep learning-based approach to theorem proving that characterizes…

    Submitted 7 June, 2021; originally announced June 2021.

  22. arXiv:2012.01707  [pdf, other]

    cs.CL cs.AI

    Leveraging Abstract Meaning Representation for Knowledge Base Question Answering

    Authors: Pavan Kapanipathi, Ibrahim Abdelaziz, Srinivas Ravishankar, Salim Roukos, Alexander Gray, Ramon Astudillo, Maria Chang, Cristina Cornelio, Saswati Dana, Achille Fokoue, Dinesh Garg, Alfio Gliozzo, Sairam Gurajada, Hima Karanam, Naweed Khan, Dinesh Khandelwal, Young-Suk Lee, Yunyao Li, Francois Luus, Ndivhuwo Makondo, Nandana Mihindukulasooriya, Tahira Naseem, Sumit Neelam, Lucian Popa, Revanth Reddy, et al. (5 additional authors not shown)

    Abstract: Knowledge base question answering (KBQA) is an important task in Natural Language Processing. Existing approaches face significant challenges including complex question understanding, necessity for reasoning, and lack of large end-to-end training datasets. In this work, we propose Neuro-Symbolic Question Answering (NSQA), a modular KBQA system, that leverages (1) Abstract Meaning Representation (AM…

    Submitted 2 June, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: Accepted to Findings of ACL

  23. Leveraging Semantic Parsing for Relation Linking over Knowledge Bases

    Authors: Nandana Mihindukulasooriya, Gaetano Rossiello, Pavan Kapanipathi, Ibrahim Abdelaziz, Srinivas Ravishankar, Mo Yu, Alfio Gliozzo, Salim Roukos, Alexander Gray

    Abstract: Knowledgebase question answering systems are heavily dependent on relation extraction and linking modules. However, the task of extracting and linking relations from text to knowledgebases faces two primary challenges; the ambiguity of natural language and lack of training data. To overcome these challenges, we present SLING, a relation linking framework which leverages semantic parsing using Abst…

    Submitted 16 September, 2020; originally announced September 2020.

    Comments: Accepted at the 19th International Semantic Web Conference (ISWC 2020)

    MSC Class: 68T35 ACM Class: I.2.7; I.2.4

  24. arXiv:2004.03573  [pdf, other]

    cs.AI cs.NE cs.SC

    Neural Analogical Matching

    Authors: Maxwell Crouse, Constantine Nakos, Ibrahim Abdelaziz, Kenneth Forbus

    Abstract: Analogy is core to human cognition. It allows us to solve problems based on prior experience, it governs the way we conceptualize new information, and it even influences our visual perception. The importance of analogy to humans has made it an active area of research in the broader field of artificial intelligence, resulting in data-efficient models that learn and reason in human-like ways. While…

    Submitted 15 December, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: AAAI version

  25. arXiv:2004.01588  [pdf, other]

    cs.CV

    HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose Estimation from a Single Depth Map

    Authors: Jameel Malik, Ibrahim Abdelaziz, Ahmed Elhayek, Soshi Shimada, Sk Aziz Ali, Vladislav Golyanik, Christian Theobalt, Didier Stricker

    Abstract: 3D hand shape and pose estimation from a single depth map is a new and challenging computer vision problem with many applications. The state-of-the-art methods directly regress 3D hand meshes from 2D depth images via 2D convolutional neural networks, which leads to artefacts in the estimations due to perspective distortions in the images. In contrast, we propose a novel architecture with 3D convol…

    Submitted 3 April, 2020; originally announced April 2020.

    Comments: 10 pages, 8 figures, 5 tables, CVPR

  26. arXiv:2002.09440  [pdf, other]

    cs.DB cs.AI

    A Toolkit for Generating Code Knowledge Graphs

    Authors: Ibrahim Abdelaziz, Julian Dolby, Jamie McCusker, Kavitha Srinivas

    Abstract: Knowledge graphs have been proven extremely useful in powering diverse applications in semantic search and natural language understanding. In this paper, we present GraphGen4Code, a toolkit to build code knowledge graphs that can similarly power various applications such as program search, code understanding, bug detection, and code automation. GraphGen4Code uses generic techniques to capture code…

    Submitted 27 September, 2021; v1 submitted 21 February, 2020; originally announced February 2020.

  27. arXiv:2002.03514  [pdf, other]

    cs.AI

    Explainable Deep RDFS Reasoner

    Authors: Bassem Makni, Ibrahim Abdelaziz, James Hendler

    Abstract: Recent research efforts aiming to bridge the Neural-Symbolic gap for RDFS reasoning proved empirically that deep learning techniques can be used to learn RDFS inference rules. However, one of their main deficiencies compared to rule-based reasoners is the lack of derivations for the inferred triples (i.e. explainability in AI terms). In this paper, we build on these approaches to provide not only…

    Submitted 9 February, 2020; originally announced February 2020.

    Comments: StarAI 2020

  28. arXiv:2002.00423  [pdf, other]

    cs.AI cs.LG cs.LO

    An Experimental Study of Formula Embeddings for Automated Theorem Proving in First-Order Logic

    Authors: Ibrahim Abdelaziz, Veronika Thost, Maxwell Crouse, Achille Fokoue

    Abstract: Automated theorem proving in first-order logic is an active research area which is successfully supported by machine learning. While there have been various proposals for encoding logical formulas into numerical vectors -- from simple strings to more involved graph-based embeddings -- little is known about how these different encodings compare. In this paper, we study and experimentally compare pa…

    Submitted 15 March, 2020; v1 submitted 2 February, 2020; originally announced February 2020.

    Comments: 7 pages, preprint, under review

  29. arXiv:1911.06904  [pdf, other]

    cs.AI cs.LG cs.LO cs.SC

    Improving Graph Neural Network Representations of Logical Formulae with Subgraph Pooling

    Authors: Maxwell Crouse, Ibrahim Abdelaziz, Cristina Cornelio, Veronika Thost, Lingfei Wu, Kenneth Forbus, Achille Fokoue

    Abstract: Recent advances in the integration of deep learning with automated theorem proving have centered around the representation of logical formulae as inputs to deep learning systems. In particular, there has been a growing interest in adapting structure-aware neural methods to work with the underlying graph representations of logical expressions. While more effective than character and token-level app…

    Submitted 5 June, 2020; v1 submitted 15 November, 2019; originally announced November 2019.

  30. arXiv:1911.02065  [pdf, other]

    cs.AI cs.LG cs.LO

    A Deep Reinforcement Learning Approach to First-Order Logic Theorem Proving

    Authors: Maxwell Crouse, Ibrahim Abdelaziz, Bassem Makni, Spencer Whitehead, Cristina Cornelio, Pavan Kapanipathi, Kavitha Srinivas, Veronika Thost, Michael Witbrock, Achille Fokoue

    Abstract: Automated theorem provers have traditionally relied on manually tuned heuristics to guide how they perform proof search. Deep reinforcement learning has been proposed as a way to obviate the need for such heuristics, however, its deployment in automated theorem proving remains a challenge. In this paper we introduce TRAIL, a system that applies deep reinforcement learning to saturation-based theor…

    Submitted 15 September, 2020; v1 submitted 5 November, 2019; originally announced November 2019.

  31. arXiv:1911.02060  [pdf, other]

    cs.CL cs.AI

    Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

    Authors: Pavan Kapanipathi, Veronika Thost, Siva Sankalp Patel, Spencer Whitehead, Ibrahim Abdelaziz, Avinash Balakrishnan, Maria Chang, Kshitij Fadnis, Chulaka Gunasekara, Bassem Makni, Nicholas Mattei, Kartik Talamadupula, Achille Fokoue

    Abstract: Textual entailment is a fundamental task in natural language processing. Most approaches for solving the problem use only the textual content present in training data. A few approaches have shown that information from external knowledge sources like knowledge graphs (KGs) can add value, in addition to the textual content, by providing background knowledge that may be critical for a task. However,…

    Submitted 21 November, 2019; v1 submitted 5 November, 2019; originally announced November 2019.

  32. arXiv:1809.05724  [pdf, other]

    cs.AI cs.CL cs.LG

    Improving Natural Language Inference Using External Knowledge in the Science Questions Domain

    Authors: Xiaoyan Wang, Pavan Kapanipathi, Ryan Musa, Mo Yu, Kartik Talamadupula, Ibrahim Abdelaziz, Maria Chang, Achille Fokoue, Bassem Makni, Nicholas Mattei, Michael Witbrock

    Abstract: Natural Language Inference (NLI) is fundamental to many Natural Language Processing (NLP) applications including semantic search and question answering. The NLI problem has gained significant attention thanks to the release of large scale, challenging datasets. Present approaches to the problem largely focus on learning-based methods that use only textual information in order to classify whether a…

    Submitted 20 November, 2018; v1 submitted 15 September, 2018; originally announced September 2018.

    Comments: 9 pages, 3 figures, 5 tables

  33. arXiv:1505.02728  [pdf, ps, other]

    cs.DB

    Adaptive Partitioning for Very Large RDF Data

    Authors: Razen Harbi, Ibrahim Abdelaziz, Panos Kalnis, Nikos Mamoulis, Yasser Ebrahim, Majed Sahli

    Abstract: Distributed RDF systems partition data across multiple computer nodes (workers). Some systems perform cheap hash partitioning, which may result in expensive query evaluation, while others apply heuristics aiming at minimizing inter-node communication during query evaluation. This requires an expensive data preprocessing phase, leading to high startup costs for very large RDF knowledge bases. Aprio…

    Submitted 11 May, 2015; originally announced May 2015.

    Comments: 25 pages

  34. arXiv:1412.7626  [pdf, other]

    cs.CV

    AltecOnDB: A Large-Vocabulary Arabic Online Handwriting Recognition Database

    Authors: Ibrahim Abdelaziz, Sherif Abdou

    Abstract: Arabic is a semitic language characterized by a complex and rich morphology. The exceptional degree of ambiguity in the writing system, the rich morphology, and the highly complex word formation process of roots and patterns all contribute to making computational approaches to Arabic very challenging. As a result, a practical handwriting recognition system should support large vocabulary to provid…

    Submitted 24 December, 2014; originally announced December 2014.

    Comments: The preprint is in submission

  35. arXiv:1410.4688  [pdf, ps, other]

    cs.CV

    Large Vocabulary Arabic Online Handwriting Recognition System

    Authors: Ibrahim Abdelaziz, Sherif Abdou, Hassanin Al-Barhamtoshy

    Abstract: Arabic handwriting is a consonantal and cursive writing. The analysis of Arabic script is further complicated due to obligatory dots/strokes that are placed above or below most letters and usually written delayed in order. Due to ambiguities and diversities of writing styles, recognition systems are generally based on a set of possible words called lexicon. When the lexicon is small, recognition a…

    Submitted 17 October, 2015; v1 submitted 17 October, 2014; originally announced October 2014.

    Comments: Preprint submitted to Pattern Analysis and Applications Journal