Showing 1–24 of 24 results for author: Golab, L

  1. arXiv:2405.13000

    cs.CL cs.AI cs.IR

    RAGE Against the Machine: Retrieval-Augmented LLM Explanations

    Authors: Joel Rorseth, Parke Godfrey, Lukasz Golab, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: This paper demonstrates RAGE, an interactive tool for explaining Large Language Models (LLMs) augmented with retrieval capabilities; i.e., able to query external sources and pull relevant information into their input context. Our explanations are counterfactual in the sense that they identify parts of the input context that, when removed, change the answer to the question posed to the LLM. RAGE in…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Accepted by ICDE 2024 (Demonstration Track)

  2. arXiv:2405.12881

    cs.DB cs.AI

    Explaining Expert Search and Team Formation Systems with ExES

    Authors: Kiarash Golzadeh, Lukasz Golab, Jaroslaw Szlichta

    Abstract: Expert search and team formation systems operate on collaboration networks, with nodes representing individuals, labeled with their skills, and edges denoting collaboration relationships. Given a keyword query corresponding to the desired skills, these systems identify experts that best match the query. However, state-of-the-art solutions to this problem lack transparency. To address this issue, w…

    Submitted 21 May, 2024; originally announced May 2024.

  3. arXiv:2307.09312

    cs.CL cs.LG cs.MM cs.SI

    Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media

    Authors: Liam Hebert, Gaurav Sahu, Yuxuan Guo, Nanda Kishore Sreenivas, Lukasz Golab, Robin Cohen

    Abstract: We present the Multi-Modal Discussion Transformer (mDT), a novel method for detecting hate speech in online social networks such as Reddit discussions. In contrast to traditional comment-only methods, our approach to labelling a comment as hate speech involves a holistic analysis of text and images grounded in the discussion context. This is done by leveraging graph transformers to capture the cont…

    Submitted 22 February, 2024; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: Accepted to AAAI 2024 (AI for Social Impact Track)

  4. CREDENCE: Counterfactual Explanations for Document Ranking

    Authors: Joel Rorseth, Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: Towards better explainability in the field of information retrieval, we present CREDENCE, an interactive tool capable of generating counterfactual explanations for document rankers. Embracing the unique properties of the ranking problem, we present counterfactual explanations in terms of document perturbations, query perturbations, and even other documents. Additionally, users may build and test t…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted by ICDE 2023 (Demonstration Track)

  5. arXiv:2301.10871

    cs.LG cs.CL cs.SI

    Qualitative Analysis of a Graph Transformer Approach to Addressing Hate Speech: Adapting to Dynamically Changing Content

    Authors: Liam Hebert, Hong Yi Chen, Robin Cohen, Lukasz Golab

    Abstract: Our work advances an approach for predicting hate speech in social media, drawing out the critical need to consider the discussions that follow a post to successfully detect when hateful discourse may arise. Using graph transformer networks, coupled with modelling attention and BERT-level natural language processing, our approach can capture context and anticipate upcoming anti-social behaviour. I…

    Submitted 30 April, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted at AAAI 2023 AI for Social Good

  6. arXiv:2301.04248

    cs.CL cs.LG cs.SI

    Predicting Hateful Discussions on Reddit using Graph Transformer Networks and Communal Context

    Authors: Liam Hebert, Lukasz Golab, Robin Cohen

    Abstract: We propose a system to predict harmful discussions on social media platforms. Our solution uses contextual deep language models and proposes the novel idea of integrating state-of-the-art Graph Transformer Networks to analyze all conversations that follow an initial post. This framework also supports adapting to future comments as the conversation unfolds. In addition, we study whether a community…

    Submitted 10 January, 2023; originally announced January 2023.

    Comments: Accepted and Presented at WI-IAT 22

  7. arXiv:2205.13697

    cs.LG cs.AI cs.MA

    FedFormer: Contextual Federation with Attention in Reinforcement Learning

    Authors: Liam Hebert, Lukasz Golab, Pascal Poupart, Robin Cohen

    Abstract: A core issue in multi-agent federated reinforcement learning is defining how to aggregate insights from multiple agents. This is commonly done by taking the average of each participating agent's model weights into one common model (FedAvg). We instead propose FedFormer, a novel federation strategy that utilizes Transformer Attention to contextually aggregate embeddings from models originating from…

    Submitted 2 March, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Our source code can be found at https://github.com/liamhebert/FedFormer. Accepted at AAMAS 2023

  8. arXiv:2203.09742

    cs.CL

    GRS: Combining Generation and Revision in Unsupervised Sentence Simplification

    Authors: Mohammad Dehghan, Dhruv Kumar, Lukasz Golab

    Abstract: We propose GRS: an unsupervised approach to sentence simplification that combines text generation and text revision. We start with an iterative framework in which an input sentence is revised using explicit edit operations, and add paraphrasing as a new edit operation. This allows us to combine the advantages of generative and revision-based approaches: paraphrasing captures complex edit operation…

    Submitted 22 March, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

    Comments: The paper has been accepted to Findings of ACL 2022

  9. arXiv:2105.12190

    cs.SI cs.IR

    Climate Action During COVID-19 Recovery and Beyond: A Twitter Text Mining Study

    Authors: Mohammad S. Parsa, Lukasz Golab, Srinivasan Keshav

    Abstract: The Coronavirus pandemic created a global crisis that prompted immediate large-scale action, including economic shutdowns and mobility restrictions. These actions have had devastating effects on the economy, but some positive effects on the environment. As the world recovers from the pandemic, we ask the following question: What is the public attitude towards climate action during COVID-19 recover…

    Submitted 25 May, 2021; originally announced May 2021.

  10. arXiv:2105.08105

    cs.DB

    Discovery and Contextual Data Cleaning with Ontology Functional Dependencies

    Authors: Zheng Zheng, Longtao Zheng, Morteza Alipour Langouri, Fei Chiang, Lukasz Golab, Jaroslaw Szlichta

    Abstract: Functional Dependencies (FDs) define attribute relationships based on syntactic equality, and, when used in data cleaning, they erroneously label syntactically different but semantically equivalent values as errors. We explore dependency-based data cleaning with Ontology Functional Dependencies (OFDs), which express semantic attribute relationships such as synonyms and is-a hierarchies defined by an…

    Submitted 12 March, 2022; v1 submitted 17 May, 2021; originally announced May 2021.

  11. arXiv:2101.06801

    cs.DB

    Real-Time LSM-Trees for HTAP Workloads

    Authors: Hemant Saxena, Lukasz Golab, Stratos Idreos, Ihab F. Ilyas

    Abstract: Real-time analytics systems employ hybrid data layouts in which data are stored in different formats throughout their lifecycle. Recent data are stored in a row-oriented format to serve OLTP workloads and support high insert rates, while older data are transformed to a column-oriented format for OLAP access patterns. We observe that a Log-Structured Merge (LSM) Tree is a natural fit for a lifecycl…

    Submitted 14 July, 2022; v1 submitted 17 January, 2021; originally announced January 2021.

  12. arXiv:2101.02174

    cs.DB

    Efficient Discovery of Approximate Order Dependencies

    Authors: Reza Karegar, Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: Order dependencies (ODs) capture relationships between ordered domains of attributes. Approximate ODs (AODs) capture such relationships even when there exist exceptions in the data. During automated discovery of ODs, validation is the process of verifying whether an OD holds. We present an algorithm for validating approximate ODs with significantly improved runtime performance over existing method…

    Submitted 6 January, 2021; originally announced January 2021.

  13. arXiv:2006.09639

    cs.CL

    Iterative Edit-Based Unsupervised Sentence Simplification

    Authors: Dhruv Kumar, Lili Mou, Lukasz Golab, Olga Vechtomova

    Abstract: We present a novel iterative, edit-based approach to unsupervised sentence simplification. Our model is guided by a scoring function involving fluency, simplicity, and meaning preservation. Then, we iteratively perform word and phrase-level edits on the complex sentence. Compared with previous approaches, our model does not require a parallel training set, but is more controllable and interpretabl…

    Submitted 16 June, 2020; originally announced June 2020.

    Comments: The paper has been accepted to ACL 2020

  14. arXiv:2005.14068

    cs.DB

    Discovering Domain Orders through Order Dependencies

    Authors: Reza Karegar, Melicaalsadat Mirsafian, Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: Much real-world data come with explicitly defined domain orders; e.g., lexicographic order for strings, numeric for integers, and chronological for time. Our goal is to discover implicit domain orders that we do not already know; for instance, that the order of months in the Chinese Lunar calendar is Corner < Apricot < Peach. To do so, we enhance data profiling methods by discovering implicit doma…

    Submitted 7 September, 2021; v1 submitted 28 May, 2020; originally announced May 2020.

  15. arXiv:1910.07110

    cs.DC

    Consentio: Managing Consent to Data Access using Permissioned Blockchains

    Authors: Rishav Raj Agarwal, Dhruv Kumar, Lukasz Golab, Srinivasan Keshav

    Abstract: The increasing amount of personal data is raising serious issues in the context of privacy, security, and data ownership. Entities whose data are being collected can benefit from mechanisms to manage the parties that can access their data and to audit who has accessed their data. Consent management systems address these issues. We present Consentio, a scalable consent management system based on th…

    Submitted 9 March, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: minor changes after reviewer comments

  16. arXiv:1906.11229

    cs.DC

    XOX Fabric: A hybrid approach to blockchain transaction execution

    Authors: Christian Gorenflo, Lukasz Golab, Srinivasan Keshav

    Abstract: Performance and scalability are major concerns for blockchains: permissionless systems are typically limited by slow proof of X consensus algorithms and sequential post-order transaction execution on every node of the network. By introducing a small amount of trust in their participants, permissioned blockchain systems such as Hyperledger Fabric can benefit from more efficient consensus algorithms…

    Submitted 9 March, 2020; v1 submitted 26 June, 2019; originally announced June 2019.

  17. arXiv:1905.02010

    cs.DB

    Errata Note: Discovering Order Dependencies through Order Compatibility

    Authors: Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: A number of extensions to the classical notion of functional dependencies have been proposed to express and enforce application semantics. One of these extensions is that of order dependencies (ODs), which express rules involving order. The article entitled "Discovering Order Dependencies through Order Compatibility" by Consonni et al., published in the EDBT conference proceedings in March 2019, i…

    Submitted 6 May, 2019; originally announced May 2019.

    Comments: 5

  18. arXiv:1903.05228

    cs.DB

    Distributed Dependency Discovery

    Authors: Hemant Saxena, Lukasz Golab, Ihab F. Ilyas

    Abstract: We analyze the problem of discovering dependencies from distributed big data. Existing (non-distributed) algorithms focus on minimizing computation by pruning the search space of possible dependencies. However, distributed algorithms must also optimize communication costs, especially in shared-nothing settings, leading to a more complex optimization space. To understand this space, we introduce si…

    Submitted 12 March, 2019; originally announced March 2019.

  19. arXiv:1901.00910

    cs.DC

    FastFabric: Scaling Hyperledger Fabric to 20,000 Transactions per Second

    Authors: Christian Gorenflo, Stephen Lee, Lukasz Golab, S. Keshav

    Abstract: Blockchain technologies are expected to make a significant impact on a variety of industries. However, one issue holding them back is their limited transaction throughput, especially compared to established solutions such as distributed database systems. In this paper, we re-architect a modern permissioned blockchain system, Hyperledger Fabric, to increase transaction throughput from 3,000 to 20,0…

    Submitted 4 March, 2019; v1 submitted 3 January, 2019; originally announced January 2019.

    Comments: Minor revisions based on reviewer feedback

  20. arXiv:1611.02992

    cs.SI

    Authority-based Team Discovery in Social Networks

    Authors: Morteza Zihayat, Aijun An, Lukasz Golab, Mehdi Kargar, Jaroslaw Szlichta

    Abstract: Given a social network of experts, we address the problem of discovering a team of experts that collectively holds a set of skills required to complete a given project. Most prior work ranks possible solutions by communication cost, represented by edge weights in the expert network. Our contribution is to take experts' authority into account, represented by node weights. We formulate several proble…

    Submitted 15 November, 2016; v1 submitted 8 November, 2016; originally announced November 2016.

    Comments: 6 pages

  21. arXiv:1608.06169

    cs.DB

    Effective and Complete Discovery of Order Dependencies via Set-based Axiomatization

    Authors: Jaroslaw Szlichta, Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava

    Abstract: Integrity constraints (ICs) provide a valuable tool for expressing and enforcing application semantics. However, formulating constraints manually requires domain expertise, is prone to human errors, and may be excessively time consuming, especially on large datasets. Hence, proposals for automatic discovery have been made for some classes of ICs, such as functional dependencies (FDs), and recently…

    Submitted 23 August, 2016; v1 submitted 22 August, 2016; originally announced August 2016.

    Comments: 14 pages

  22. arXiv:1512.06395

    cs.DB

    Effective Keyword Search in Graphs

    Authors: Mehdi Kargar, Lukasz Golab, Jaroslaw Szlichta

    Abstract: In a node-labeled graph, keyword search finds subtrees of the graph whose nodes contain all of the query keywords. This provides a way to query graph databases that neither requires mastery of a query language such as SPARQL, nor a deep knowledge of the database schema. Previous work ranks answer trees using combinations of structural and content-based metrics, such as path lengths between keyword…

    Submitted 29 March, 2016; v1 submitted 20 December, 2015; originally announced December 2015.

    Comments: 7 pages, 9 figures

  23. arXiv:1312.0285

    cs.DB

    Distributed Data Placement via Graph Partitioning

    Authors: Lukasz Golab, Marios Hadjieleftheriou, Howard Karloff, Barna Saha

    Abstract: With the widespread use of shared-nothing clusters of servers, there has been a proliferation of distributed object stores that offer high availability, reliability and enhanced performance for MapReduce-style workloads. However, relational workloads cannot always be evaluated efficiently using MapReduce without extensive data migrations, which cause network congestion and reduced query throughput…

    Submitted 1 December, 2013; originally announced December 2013.

  24. arXiv:1207.5226

    cs.DB

    On the Relative Trust between Inconsistent Data and Inaccurate Constraints

    Authors: George Beskales, Ihab F. Ilyas, Lukasz Golab, Artur Galiullin

    Abstract: Functional dependencies (FDs) specify the intended data semantics while violations of FDs indicate deviation from these semantics. In this paper, we study a data cleaning problem in which the FDs may not be completely correct, e.g., due to data evolution or incomplete knowledge of the data semantics. We argue that the notion of relative trust is a crucial aspect of this problem: if the FDs are out…

    Submitted 24 July, 2012; v1 submitted 22 July, 2012; originally announced July 2012.