Showing 1–16 of 16 results for author: Salihoglu, S

Searching in archive cs.
  1. arXiv:2402.01731  [pdf]

    cs.CY

    Integration of Artificial Intelligence in Educational Measurement: Efficacy of ChatGPT in Data Generation within the Scope of Item Response Theory

    Authors: Hatice Gurdil, Yesim Beril Soguksu, Salih Salihoglu, Fatma Coskun

    Abstract: The aim of this study is to investigate the effectiveness of ChatGPT 3.5 in developing algorithms for data generation within the framework of Item Response Theory (IRT) using the R programming language. In this context, validity examinations were conducted on data sets generated according to the Two-Parameter Logistic Model (2PLM) with algorithms written by ChatGPT 3.5 and researchers. These exami…

    Submitted 3 July, 2024; v1 submitted 28 January, 2024; originally announced February 2024.

  2. arXiv:2401.12830  [pdf, other]

    cs.LG cs.AI

    Enhancing Next Destination Prediction: A Novel LSTM Approach Using Real-World Airline Data

    Authors: Salih Salihoglu, Gulser Koksal, Orhan Abar

    Abstract: In the modern transportation industry, accurate prediction of travelers' next destinations brings multiple benefits to companies, such as customer satisfaction and targeted marketing. This study focuses on developing a precise model that captures the sequential patterns and dependencies in travel data, enabling accurate predictions of individual travelers' future destinations. To achieve this, a n…

    Submitted 23 January, 2024; originally announced January 2024.

  3. Optimizing Differentially-Maintained Recursive Queries on Dynamic Graphs

    Authors: Khaled Ammar, Siddhartha Sahu, Semih Salihoglu, M. Tamer Özsu

    Abstract: Differential computation (DC) is a highly general incremental computation/view maintenance technique that can maintain the output of an arbitrary and possibly recursive dataflow computation upon changes to its base inputs. As such, it is a promising technique for graph database management systems (GDBMS) that support continuous recursive queries over dynamic graphs. Although differential computati…

    Submitted 30 July, 2022; originally announced August 2022.

    Journal ref: PVLDB, 15(11): 3186 - 3198, 2022

  4. arXiv:2108.10540  [pdf, other]

    cs.DB

    Making RDBMSs Efficient on Graph Workloads Through Predefined Joins

    Authors: Guodong Jin, Semih Salihoglu

    Abstract: Joins in native graph database management systems (GDBMSs) are predefined to the system as edges, which are indexed in adjacency list indices and serve as pointers. This contrasts with and can be more performant than value-based joins in RDBMSs and has led researchers to investigate ways to integrate predefined joins directly into RDBMSs. Existing approaches adopt a strict separation of graph and…

    Submitted 24 August, 2021; originally announced August 2021.

  5. arXiv:2105.08878  [pdf, other]

    cs.DB

    Accurate Summary-based Cardinality Estimation Through the Lens of Cardinality Estimation Graphs

    Authors: Jeremy Chen, Yuqing Huang, Mushi Wang, Semih Salihoglu, Ken Salem

    Abstract: We study two classes of summary-based cardinality estimators that use statistics about input relations and small-size joins in the context of graph database management systems: (i) optimistic estimators that make uniformity and conditional independence assumptions; and (ii) the recent pessimistic estimators that use information theoretic linear programs. We begin by addressing the problem of how t…

    Submitted 18 May, 2021; originally announced May 2021.

    ACM Class: H.2.4

  6. arXiv:2103.02284  [pdf, other]

    cs.DB

    Columnar Storage and List-based Processing for Graph Database Management Systems

    Authors: Pranjal Gupta, Amine Mhedhbi, Semih Salihoglu

    Abstract: We revisit column-oriented storage and query processing techniques in the context of contemporary graph database management systems (GDBMSs). Similar to column-oriented RDBMSs, GDBMSs support read-heavy analytical workloads that however have fundamentally different data access patterns than traditional analytical workloads. We first derive a set of desiderata for optimizing storage and query proce…

    Submitted 27 October, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: VLDB Conference 2021 (https://www.vldb.org/pvldb/vol14/p2491-gupta.pdf)

  7. arXiv:2012.06171  [pdf, other]

    cs.DC cs.DB

    The Future is Big Graphs! A Community View on Graph Processing Systems

    Authors: Sherif Sakr, Angela Bonifati, Hannes Voigt, Alexandru Iosup, Khaled Ammar, Renzo Angles, Walid Aref, Marcelo Arenas, Maciej Besta, Peter A. Boncz, Khuzaima Daudjee, Emanuele Della Valle, Stefania Dumbrava, Olaf Hartig, Bernhard Haslhofer, Tim Hegeman, Jan Hidders, Katja Hose, Adriana Iamnitchi, Vasiliki Kalavri, Hugo Kapp, Wim Martens, M. Tamer Özsu, Eric Peukert, Stefan Plantikow, et al. (16 additional authors not shown)

    Abstract: Graphs are by nature unifying abstractions that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads understand these abstractions, future problems will require new abstractions and systems. What needs to happen in the next decade for big graph processing to continue t…

    Submitted 11 December, 2020; originally announced December 2020.

    Comments: 12 pages, 3 figures, collaboration between the large-scale systems and data management communities, work started at the Dagstuhl Seminar 19491 on Big Graph Processing Systems, to be published in the Communications of the ACM

    ACM Class: C.3; E.0; H.2; J.0

  8. Graphsurge: Graph Analytics on View Collections Using Differential Computation

    Authors: Siddhartha Sahu, Semih Salihoglu

    Abstract: This paper presents the design and implementation of a new open-source view-based graph analytics system called Graphsurge. Graphsurge is designed to support applications that analyze multiple snapshots or views of a large-scale graph. Users program Graphsurge through a declarative graph view definition language (GVDL) to create views over input graphs and a Differential Dataflow-based programming…

    Submitted 4 March, 2021; v1 submitted 10 April, 2020; originally announced April 2020.

  9. arXiv:2004.00130  [pdf, other]

    cs.DB

    A+ Indexes: Tunable and Space-Efficient Adjacency Lists in Graph Database Management Systems

    Authors: Amine Mhedhbi, Pranjal Gupta, Shahid Khaliq, Semih Salihoglu

    Abstract: Graph database management systems (GDBMSs) are highly optimized to perform fast traversals, i.e., joins of vertices with their neighbours, by indexing the neighbourhoods of vertices in adjacency lists. However, existing GDBMSs have system-specific and fixed adjacency list structures, which makes each system efficient on only a fixed set of workloads. We describe a new tunable indexing subsystem fo…

    Submitted 3 March, 2021; v1 submitted 31 March, 2020; originally announced April 2020.

  10. arXiv:1909.12102  [pdf, other]

    cs.DB

    Box Covers and Domain Orderings for Beyond Worst-Case Join Processing

    Authors: Kaleb Alway, Eric Blais, Semih Salihoglu

    Abstract: Recent beyond worst-case optimal join algorithms Minesweeper and its generalization Tetris have brought the theory of indexing and join processing together by developing a geometric framework for joins. These algorithms take as input an index $\mathcal{B}$, referred to as a box cover, that stores output gaps that can be inferred from traditional indexes, such as B+ trees or tries, on the input rel…

    Submitted 10 January, 2021; v1 submitted 26 September, 2019; originally announced September 2019.

  11. arXiv:1903.02076  [pdf, other]

    cs.DB

    Optimizing Subgraph Queries by Combining Binary and Worst-Case Optimal Joins

    Authors: Amine Mhedhbi, Semih Salihoglu

    Abstract: We study the problem of optimizing subgraph queries using the new worst-case optimal join plans. Worst-case optimal plans evaluate queries by matching one query vertex at a time using multiway intersections. The core problem in optimizing worst-case optimal plans is to pick an ordering of the query vertices to match. We design a cost-based optimizer that (i) picks efficient query vertex orderings…

    Submitted 2 June, 2019; v1 submitted 5 March, 2019; originally announced March 2019.

  12. arXiv:1802.03760  [pdf, other]

    cs.DC cs.DB

    Distributed Evaluation of Subgraph Queries Using Worstcase Optimal LowMemory Dataflows

    Authors: Khaled Ammar, Frank McSherry, Semih Salihoglu, Manas Joglekar

    Abstract: We study the problem of finding and monitoring fixed-size subgraphs in a continually changing large-scale graph. We present the first approach that (i) performs worst-case optimal computation and communication, (ii) maintains a total memory footprint linear in the number of input edges, and (iii) scales down per-worker computation, communication, and memory requirements linearly as the number of w…

    Submitted 11 February, 2018; originally announced February 2018.

  13. The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: Extended Survey

    Authors: Siddhartha Sahu, Amine Mhedhbi, Semih Salihoglu, Jimmy Lin, M. Tamer Özsu

    Abstract: Graph processing is becoming increasingly prevalent across many application domains. In spite of this prevalence, there is little research about how graphs are actually used in practice. We performed an extensive study that consisted of an online survey of 89 users, a review of the mailing lists, source repositories, and whitepapers of a large suite of graph software products, and in-person interv…

    Submitted 4 September, 2019; v1 submitted 10 September, 2017; originally announced September 2017.

    Journal ref: The VLDB Journal, 2019

  14. arXiv:1410.4156  [pdf, other]

    cs.DB

    GYM: A Multiround Join Algorithm In MapReduce

    Authors: Foto Afrati, Manas Joglekar, Christopher Ré, Semih Salihoglu, Jeffrey D. Ullman

    Abstract: Multiround algorithms are now commonly used in distributed data processing systems, yet the extent to which algorithms can benefit from running more rounds is not well understood. This paper answers this question for a spectrum of rounds for the problem of computing the equijoin of $n$ relations. Specifically, given any query $Q$ with width $w$, intersection width $\mathit{iw}$, input size…

    Submitted 25 January, 2017; v1 submitted 15 October, 2014; originally announced October 2014.

  15. arXiv:1206.4377  [pdf, other]

    cs.DC cs.DS

    Upper and Lower Bounds on the Cost of a Map-Reduce Computation

    Authors: Foto N. Afrati, Anish Das Sarma, Semih Salihoglu, Jeffrey D. Ullman

    Abstract: In this paper we study the tradeoff between parallelism and communication cost in a map-reduce computation. For any problem that is not "embarrassingly parallel," the finer we partition the work of the reducers so that more parallelism can be extracted, the greater will be the total communication between mappers and reducers. We introduce a model of problems that can be solved in a single round of…

    Submitted 19 June, 2012; originally announced June 2012.

    Comments: 14 pages

  16. arXiv:1204.1754  [pdf, other]

    cs.DB cs.DC

    Vision Paper: Towards an Understanding of the Limits of Map-Reduce Computation

    Authors: Foto N. Afrati, Anish Das Sarma, Semih Salihoglu, Jeffrey D. Ullman

    Abstract: A significant amount of recent research work has addressed the problem of solving various data management problems in the cloud. The major algorithmic challenges in map-reduce computations involve balancing a multitude of factors such as the number of machines available for mappers/reducers, their memory requirements, and communication cost (total amount of data sent from mappers to reducers). Mos…

    Submitted 8 April, 2012; originally announced April 2012.

    Comments: 5 pages