Showing 1–11 of 11 results for author: Kalnis, P

  1. arXiv:2403.05752  [pdf, other]

    cs.LG cs.AI

    Task-Oriented GNNs Training on Large Knowledge Graphs for Accurate and Efficient Modeling

    Authors: Hussein Abdallah, Waleed Afandi, Panos Kalnis, Essam Mansour

    Abstract: A Knowledge Graph (KG) is a heterogeneous graph encompassing a diverse range of node and edge types. Heterogeneous Graph Neural Networks (HGNNs) are popular for training machine learning tasks like node classification and link prediction on KGs. However, HGNN methods exhibit excessive complexity influenced by the KG's size, density, and the number of node and edge types. AI practitioners handcraft…

    Submitted 22 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: 12 pages, 9 figures, 3 tables, ICDE 2024

  2. arXiv:2303.00595  [pdf, other]

    cs.AI cs.CL cs.DB

    A Universal Question-Answering Platform for Knowledge Graphs

    Authors: Reham Omar, Ishika Dhall, Panos Kalnis, Essam Mansour

    Abstract: Knowledge from diverse application domains is organized as knowledge graphs (KGs) that are stored in RDF engines accessible in the web via SPARQL endpoints. Expressing a well-formed SPARQL query requires information about the graph structure and the exact URIs of its components, which is impractical for the average user. Question answering (QA) systems assist by translating natural language questi…

    Submitted 8 August, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: The paper is accepted to SIGMOD 2023

  3. arXiv:2302.06466  [pdf, ps, other]

    cs.CL cs.AI cs.IR

    ChatGPT versus Traditional Question Answering for Knowledge Graphs: Current Status and Future Directions Towards Knowledge Graph Chatbots

    Authors: Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour

    Abstract: Conversational AI and Question-Answering systems (QASs) for knowledge graphs (KGs) are both emerging research areas: they empower users with natural language interfaces for extracting information easily and effectively. Conversational AI simulates conversations with humans; however, it is limited by the data captured in the training datasets. In contrast, QASs retrieve the most recent information…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: 9 pages

  4. arXiv:2108.00951  [pdf, other]

    cs.LG cs.DC math.OC

    Rethinking gradient sparsification as total error minimization

    Authors: Atal Narayan Sahu, Aritra Dutta, Ahmed M. Abdelmoniem, Trambak Banerjee, Marco Canini, Panos Kalnis

    Abstract: Gradient compression is a widely-established remedy to tackle the communication bottleneck in distributed training of large deep neural networks (DNNs). Under the error-feedback framework, Top-$k$ sparsification, sometimes with $k$ as little as $0.1\%$ of the gradient size, enables training to the same model quality as the uncompressed case for a similar iteration count. From the optimization pers…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: 33 pages, 31 figures

  5. arXiv:2102.03112  [pdf]

    cs.LG

    DeepReduce: A Sparse-tensor Communication Framework for Distributed Deep Learning

    Authors: Kelly Kostopoulou, Hang Xu, Aritra Dutta, Xin Li, Alexandros Ntoulas, Panos Kalnis

    Abstract: Sparse tensors appear frequently in distributed deep learning, either as a direct artifact of the deep neural network's gradients, or as a result of an explicit sparsification process. Existing communication primitives are agnostic to the peculiarities of deep learning; consequently, they impose unnecessary communication overhead. This paper introduces DeepReduce, a versatile framework for the com…

    Submitted 5 February, 2021; originally announced February 2021.

  6. arXiv:1911.08250  [pdf, other]

    cs.DC cs.LG math.OC

    On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning

    Authors: Aritra Dutta, El Houcine Bergou, Ahmed M. Abdelmoniem, Chen-Yu Ho, Atal Narayan Sahu, Marco Canini, Panos Kalnis

    Abstract: Compressed communication, in the form of sparsification or quantization of stochastic gradients, is employed to reduce communication costs in distributed data-parallel training of deep neural networks. However, there exists a discrepancy between theory and practice: while theoretical analysis of most existing compression methods assumes compression is applied to the gradients of the entire model,…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: To appear in Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

    Journal ref: In Proceedings of Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

  7. arXiv:1903.06701  [pdf, other]

    cs.DC cs.LG cs.NI stat.ML

    Scaling Distributed Machine Learning with In-Network Aggregation

    Authors: Amedeo Sapio, Marco Canini, Chen-Yu Ho, Jacob Nelson, Panos Kalnis, Changhoon Kim, Arvind Krishnamurthy, Masoud Moshref, Dan R. K. Ports, Peter Richtárik

    Abstract: Training machine learning models in parallel is an increasingly important workload. We accelerate distributed parallel training by designing a communication primitive that uses a programmable switch dataplane to execute a key step of the training process. Our approach, SwitchML, reduces the volume of exchanged data by aggregating the model updates from multiple workers in the network. We co-design…

    Submitted 30 September, 2020; v1 submitted 22 February, 2019; originally announced March 2019.

  8. arXiv:1610.06052  [pdf, other]

    cs.SI

    Scheduling Broadcasts in a Network of Timelines

    Authors: Emaad Manzoor, Haewoon Kwak, Panos Kalnis

    Abstract: Broadcasts and timelines are the primary mechanism of information exchange in online social platforms today. Services like Facebook, Twitter and Instagram have enabled ordinary people to reach large audiences spanning cultures and countries, while their massive popularity has created increasingly competitive marketplaces of attention. Timing broadcasts to capture the attention of such geographical…

    Submitted 19 October, 2016; originally announced October 2016.

    Comments: Submitted to KDD 2015

  9. arXiv:1505.02728  [pdf, ps, other]

    cs.DB

    Adaptive Partitioning for Very Large RDF Data

    Authors: Razen Harbi, Ibrahim Abdelaziz, Panos Kalnis, Nikos Mamoulis, Yasser Ebrahim, Majed Sahli

    Abstract: Distributed RDF systems partition data across multiple computer nodes (workers). Some systems perform cheap hash partitioning, which may result in expensive query evaluation, while others apply heuristics aiming at minimizing inter-node communication during query evaluation. This requires an expensive data preprocessing phase, leading to high startup costs for very large RDF knowledge bases. Aprio…

    Submitted 11 May, 2015; originally announced May 2015.

    Comments: 25 pages

  10. arXiv:1405.4979  [pdf]

    cs.DB

    PHD-Store: An Adaptive SPARQL Engine with Dynamic Partitioning for Distributed RDF Repositories

    Authors: Razen Al-Harbi, Yasser Ebrahim, Panos Kalnis

    Abstract: Many repositories utilize the versatile RDF model to publish data. Repositories are typically distributed and geographically remote, but data are interconnected (e.g., the Semantic Web) and queried globally by a language such as SPARQL. Due to the network cost and the nature of the queries, the execution time can be prohibitively high. Current solutions attempt to minimize the network cost by redi…

    Submitted 20 May, 2014; originally announced May 2014.

  11. arXiv:1109.6884  [pdf, other]

    cs.DB

    ERA: Efficient Serial and Parallel Suffix Tree Construction for Very Long Strings

    Authors: Essam Mansour, Amin Allam, Spiros Skiadopoulos, Panos Kalnis

    Abstract: The suffix tree is a data structure for indexing strings. It is used in a variety of applications such as bioinformatics, time series analysis, clustering, text editing and data compression. However, when the string and the resulting suffix tree are too large to fit into the main memory, most existing construction algorithms become very inefficient. This paper presents a disk-based suffix tree con…

    Submitted 30 September, 2011; originally announced September 2011.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 1, pp. 49-60 (2011)