Skip to main content

Showing 1–33 of 33 results for author: Godfrey, P

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.13000  [pdf, other

    cs.CL cs.AI cs.IR

    RAGE Against the Machine: Retrieval-Augmented LLM Explanations

    Authors: Joel Rorseth, Parke Godfrey, Lukasz Golab, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: This paper demonstrates RAGE, an interactive tool for explaining Large Language Models (LLMs) augmented with retrieval capabilities; i.e., able to query external sources and pull relevant information into their input context. Our explanations are counterfactual in the sense that they identify parts of the input context that, when removed, change the answer to the question posed to the LLM. RAGE in… ▽ More

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Accepted by ICDE 2024 (Demonstration Track)

  2. arXiv:2312.07813  [pdf, other

    cs.OS cs.LG

    On a Foundation Model for Operating Systems

    Authors: Divyanshu Saxena, Nihal Sharma, Donghyun Kim, Rohit Dwivedula, Jiayi Chen, Chenxi Yang, Sriram Ravula, Zichao Hu, Aditya Akella, Sebastian Angel, Joydeep Biswas, Swarat Chaudhuri, Isil Dillig, Alex Dimakis, P. Brighten Godfrey, Daehyeok Kim, Chris Rossbach, Gang Wang

    Abstract: This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components such as CPU, memory, and network subsystems are interrelated and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in… ▽ More

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: Machine Learning for Systems Workshop at 37th NeurIPS Conference, 2023, New Orleans, LA, USA

  3. arXiv:2311.02800  [pdf, other

    cs.DC

    Kivi: Verification for Cluster Management

    Authors: Bingzhe Liu, Gangmuk Lim, Ryan Beckett, P. Brighten Godfrey

    Abstract: Modern cloud infrastructure is powered by cluster management systems such as Kubernetes and Docker Swarm. While these systems seek to minimize users' operational burden, the complex, dynamic, and non-deterministic nature of these systems makes them hard to reason about, potentially leading to failures ranging from performance degradation to outages. We present Kivi, the first system for verifying… ▽ More

    Submitted 5 November, 2023; originally announced November 2023.

    Comments: 18 pages

  4. arXiv:2305.03348  [pdf, other

    cs.NI

    Flock: Accurate network fault localization at scale

    Authors: Vipul Harsh, Tong Meng, Kapil Agrawal, P. Brighten Godfrey

    Abstract: Inferring the root cause of failures among thousands of components in a data center network is challenging, especially for "gray" failures that are not reported directly by switches. Faults can be localized through end-to-end measurements, but past localization schemes are either too slow for large-scale networks or sacrifice accuracy. We describe Flock, a network fault localization algorithm and… ▽ More

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: To appear in ACM PACMNET, Vol 1, June 2023

  5. CREDENCE: Counterfactual Explanations for Document Ranking

    Authors: Joel Rorseth, Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: Towards better explainability in the field of information retrieval, we present CREDENCE, an interactive tool capable of generating counterfactual explanations for document rankers. Embracing the unique properties of the ranking problem, we present counterfactual explanations in terms of document perturbations, query perturbations, and even other documents. Additionally, users may build and test t… ▽ More

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted by ICDE 2023 (Demonstration Track)

  6. arXiv:2101.02174  [pdf, other

    cs.DB

    Efficient Discovery of Approximate Order Dependencies

    Authors: Reza Karegar, Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: Order dependencies (ODs) capture relationships between ordered domains of attributes. Approximate ODs (AODs) capture such relationships even when there exist exceptions in the data. During automated discovery of ODs, validation is the process of verifying whether an OD holds. We present an algorithm for validating approximate ODs with significantly improved runtime performance over existing method… ▽ More

    Submitted 6 January, 2021; originally announced January 2021.

  7. arXiv:2011.04838  [pdf, other

    cs.DB

    Answer Graph: Factorization Matters in Large Graphs

    Authors: Zahid Abul-Basher, Nikolay Yakovets, Parke Godfrey, Stanley Clark, Mark Chignell

    Abstract: Our answer-graph method to evaluate SPARQL conjunctive queries (CQs) finds a factorized answer set first, an answer graph, and then finds the embedding tuples from this. This approach can reduce greatly the cost to evaluate CQs. This affords a second advantage: we can construct a cost-based planner. We present the answer-graph approach, and overview our prototype system, Wireframe. We then offer p… ▽ More

    Submitted 9 November, 2020; originally announced November 2020.

  8. arXiv:2005.14068  [pdf, other

    cs.DB

    Discovering Domain Orders through Order Dependencies

    Authors: Reza Karegar, Melicaalsadat Mirsafian, Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: Much real-world data come with explicitly defined domain orders; e.g., lexicographic order for strings, numeric for integers, and chronological for time. Our goal is to discover implicit domain orders that we do not already know; for instance, that the order of months in the Chinese Lunar calendar is Corner < Apricot < Peach. To do so, we enhance data profiling methods by discovering implicit doma… ▽ More

    Submitted 7 September, 2021; v1 submitted 28 May, 2020; originally announced May 2020.

  9. arXiv:1911.02128  [pdf, other

    cs.NI

    Plankton: Scalable network configuration verification through model checking

    Authors: Santhosh Prabhu, Kuan-Yen Chou, Ali Kheradmand, P. Brighten Godfrey, Matthew Caesar

    Abstract: Network configuration verification enables operators to ensure that the network will behave as intended, prior to deployment of their configurations. Although techniques ranging from graph algorithms to SMT solvers have been proposed, scalable configuration verification with sufficient protocol support continues to be a challenge. In this paper, we show that by combining equivalence partitioning w… ▽ More

    Submitted 5 November, 2019; originally announced November 2019.

    Comments: NSDI'20

  10. arXiv:1905.02010  [pdf, other

    cs.DB

    Errata Note: Discovering Order Dependencies through Order Compatibility

    Authors: Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava, Jaroslaw Szlichta

    Abstract: A number of extensions to the classical notion of functional dependencies have been proposed to express and enforce application semantics. One of these extensions is that of order dependencies (ODs), which express rules involving order. The article entitled "Discovering Order Dependencies through Order Compatibility" by Consonni et al., published in the EDBT conference proceedings in March 2019, i… ▽ More

    Submitted 6 May, 2019; originally announced May 2019.

    Comments: 5

  11. arXiv:1901.02049  [pdf, other

    cs.DB

    Guided Automated Learning for query workload re-Optimization

    Authors: Guilherme Damasio, Vincent Corvinelli, Parke Godfrey, Piotr Mierzejewski, Alexandar Mihaylov, Jaroslaw Szlichta, Calisto Zuzarte

    Abstract: Query optimization is a hallmark of database systems enabling complex SQL queries of today's applications to be run efficiently. The query optimizer often fails to find the best plan, when logical subtleties in business queries and schemas circumvent it. When a query runs more expensively than is viable or warranted, determination of the performance issues is usually performed manually in consulta… ▽ More

    Submitted 22 May, 2019; v1 submitted 7 January, 2019; originally announced January 2019.

    Comments: 12 pages, 15 figures

  12. arXiv:1811.10737  [pdf, other

    cs.NI

    Dissecting Latency in the Internet's Fiber Infrastructure

    Authors: Ilker Nadi Bozkurt, Waqar Aqeel, Debopam Bhattacherjee, Balakrishnan Chandrasekaran, Philip Brighten Godfrey, Gregory Laughlin, Bruce M. Maggs, Ankit Singla

    Abstract: The recent publication of the `InterTubes' map of long-haul fiber-optic cables in the contiguous United States invites an exciting question: how much faster would the Internet be if routes were chosen to minimize latency? Previous measurement campaigns suggest the following rule of thumb for estimating Internet latency: multiply line-of-sight distance by 2.1, then divide by the speed of light in f… ▽ More

    Submitted 26 November, 2018; originally announced November 2018.

  13. arXiv:1811.00212  [pdf, other

    cs.NI

    Expander Datacenters: From Theory to Practice

    Authors: Vipul Harsh, Sangeetha Abdu Jyothi, Inderdeep Singh, P. Brighten Godfrey

    Abstract: Recent work has shown that expander-based data center topologies are robust and can yield superior performance over Clos topologies. However, to achieve these benefits, previous proposals use routing and transport schemes that impede quick industry adoption. In this paper, we examine if expanders can be effective for the technology and environments practical in today's data centers, including the… ▽ More

    Submitted 31 October, 2018; originally announced November 2018.

    Comments: 15 pages, 17 figures

  14. arXiv:1810.03259  [pdf, other

    cs.NI

    Internet Congestion Control via Deep Reinforcement Learning

    Authors: Nathan Jay, Noga H. Rotman, P. Brighten Godfrey, Michael Schapira, Aviv Tamar

    Abstract: We present and investigate a novel and timely application domain for deep reinforcement learning (RL): Internet congestion control. Congestion control is the core networking task of modulating traffic sources' data-transmission rates to efficiently utilize network capacity, and is the subject of extensive attention in light of the advent of Internet services such as live video, virtual reality, In… ▽ More

    Submitted 21 May, 2019; v1 submitted 7 October, 2018; originally announced October 2018.

    Comments: 10 pages, accepted to ICML 2019

  15. arXiv:1809.10897  [pdf, other

    cs.NI

    cISP: A Speed-of-Light Internet Service Provider

    Authors: Debopam Bhattacherjee, Sangeetha Abdu Jyothi, Ilker Nadi Bozkurt, Muhammad Tirmazi, Waqar Aqeel, Anthony Aguirre, Balakrishnan Chandrasekaran, P. Brighten Godfrey, Gregory P. Laughlin, Bruce M. Maggs, Ankit Singla

    Abstract: Low latency is a requirement for a variety of interactive network applications. The Internet, however, is not optimized for latency. We thus explore the design of cost-effective wide-area networks that move data over paths very close to great-circle paths, at speeds very close to the speed of light in vacuum. Our cISP design augments the Internet's fiber with free-space wireless connectivity. cISP… ▽ More

    Submitted 10 October, 2018; v1 submitted 28 September, 2018; originally announced September 2018.

  16. arXiv:1608.06169  [pdf, other

    cs.DB

    Effective and Complete Discovery of Order Dependencies via Set-based Axiomatization

    Authors: Jaroslaw Szlichta, Parke Godfrey, Lukasz Golab, Mehdi Kargar, Divesh Srivastava

    Abstract: Integrity constraints (ICs) provide a valuable tool for expressing and enforcing application semantics. However, formulating constraints manually requires domain expertise, is prone to human errors, and may be excessively time consuming, especially on large datasets. Hence, proposals for automatic discovery have been made for some classes of ICs, such as functional dependencies (FDs), and recently… ▽ More

    Submitted 23 August, 2016; v1 submitted 22 August, 2016; originally announced August 2016.

    Comments: 14 pages

  17. arXiv:1505.03449  [pdf, other

    cs.NI

    Towards a Speed of Light Internet

    Authors: Ankit Singla, Balakrishnan Chandrasekaran, P. Brighten Godfrey, Bruce Maggs

    Abstract: In principle, a network can transfer data at nearly the speed of light. Today's Internet, however, is much slower: our measurements show that latencies are typically more than one, and often more than two orders of magnitude larger than the lower bound implied by the speed of light. Closing this gap would not only add value to today's Internet applications, but might also open the door to exciting… ▽ More

    Submitted 13 May, 2015; originally announced May 2015.

  18. arXiv:1504.08262  [pdf, other

    cs.DB

    Towards Query Optimization for SPARQL Property Paths

    Authors: Nikolay Yakovets, Parke Godfrey, Jarek Gryz

    Abstract: The extension of SPARQL in version 1.1 with property paths offers a type of regular path query for RDF graph databases. Such queries are difficult to optimize and evaluate efficiently, however. We have embarked on a project, Waveguide, to build a cost-based optimizer for SPARQL queries with property paths. Waveguide builds a query plan - a waveguide plan (WGP) - which guides the query evaluation.… ▽ More

    Submitted 30 April, 2015; originally announced April 2015.

  19. arXiv:1402.2531  [pdf, other

    cs.NI

    Measuring and Understanding Throughput of Network Topologies

    Authors: Sangeetha Abdu Jyothi, Ankit Singla, P. Brighten Godfrey, Alexandra Kolla

    Abstract: High throughput is of particular interest in data center and HPC networks. Although myriad network topologies have been proposed, a broad head-to-head comparison across topologies and across traffic patterns is absent, and the right way to compare worst-case throughput performance is a subtle problem. In this paper, we develop a framework to benchmark the throughput of network topologies, using… ▽ More

    Submitted 14 November, 2016; v1 submitted 11 February, 2014; originally announced February 2014.

  20. arXiv:1309.7066  [pdf, other

    cs.NI

    High Throughput Data Center Topology Design

    Authors: Ankit Singla, P. Brighten Godfrey, Alexandra Kolla

    Abstract: With high throughput networks acquiring a crucial role in supporting data-intensive applications, a variety of data center network topologies have been proposed to achieve high capacity at low cost. While this literature explores a large number of design points, even in the limited case of a network of identical switches, no proposal has been able to claim any notion of optimality. The case of het… ▽ More

    Submitted 12 February, 2014; v1 submitted 26 September, 2013; originally announced September 2013.

    Comments: 15 pages

  21. arXiv:1309.0874  [pdf

    cs.DC cs.DS cs.SI physics.soc-ph

    Shortest Paths in Microseconds

    Authors: Rachit Agarwal, Matthew Caesar, P. Brighten Godfrey, Ben Y. Zhao

    Abstract: Computing shortest paths is a fundamental primitive for several social network applications including socially-sensitive ranking, location-aware search, social auctions and social network privacy. Since these applications compute paths in response to a user query, the goal is to minimize latency while maintaining feasible memory requirements. We present ASAP, a system that achieves this goal by ex… ▽ More

    Submitted 3 September, 2013; originally announced September 2013.

    Comments: Extended version of WOSN'12 paper: new techniques (reduced memory, faster computations), distributed (MapReduce) algorithm, multiple paths between a source-destination pair

  22. arXiv:1306.3707  [pdf, other

    cs.NI

    Low latency via redundancy

    Authors: Ashish Vulimiri, P. Brighten Godfrey, Radhika Mittal, Justine Sherry, Sylvia Ratnasamy, Scott Shenker

    Abstract: Low latency is critical for interactive networked applications. But while we know how to scale systems to increase capacity, reducing latency --- especially the tail of the latency distribution --- can be much more difficult. In this paper, we argue that the use of redundancy is an effective way to convert extra capacity into reduced latency. By initiating redundant operations across diverse resou… ▽ More

    Submitted 16 June, 2013; originally announced June 2013.

  23. arXiv:1306.3534  [pdf, other

    cs.NI cs.DC

    A cost-benefit analysis of low latency via added utilization

    Authors: Ashish Vulimiri, P. Brighten Godfrey, Sri Varsha Gorge, Zitian Liu, Scott Shenker

    Abstract: Several recently proposed techniques achieve latency reduction by trading it off for some amount of additional bandwidth usage. But how would one quantify whether the tradeoff is actually beneficial in a given system? We develop an economic cost vs. benefit analysis for answering this question. We use the analysis to derive a benchmark for wide-area client-server applications, and demonstrate how… ▽ More

    Submitted 4 December, 2014; v1 submitted 14 June, 2013; originally announced June 2013.

  24. arXiv:1302.6156  [pdf, other

    cs.NI

    Scalable Routing on Flat Names

    Authors: Ankit Singla, P. Brighten Godfrey, Kevin Fall, Gianluca Iannaccone, Sylvia Ratnasamy

    Abstract: We introduce a protocol which routes on flat, location-independent identifiers with guaranteed scalability and low stretch. Our design builds on theoretical advances in the area of compact routing, and is the first to realize these guarantees in a dynamic distributed setting.

    Submitted 25 February, 2013; originally announced February 2013.

    Comments: 13 pages

    Journal ref: Extends our ACM CoNEXT 2010 paper with proofs for the theoretical results

  25. arXiv:1208.0084  [pdf, other

    cs.DB

    Fundamentals of Order Dependencies

    Authors: Jaroslaw Szlichta, Parke Godfrey, Jarek Gryz

    Abstract: Dependencies have played a significant role in database design for many years. They have also been shown to be useful in query optimization. In this paper, we discuss dependencies between lexicographically ordered sets of tuples. We introduce formally the concept of order dependency and present a set of axioms (inference rules) for them. We show how query rewrites based on these axioms can be used… ▽ More

    Submitted 31 July, 2012; originally announced August 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 11, pp. 1220-1231 (2012)

  26. arXiv:1206.2057  [pdf, other

    cs.NI

    Finishing Flows Quickly with Preemptive Scheduling

    Authors: Chi-Yao Hong, Matthew Caesar, P. Brighten Godfrey

    Abstract: Today's data centers face extreme challenges in providing low latency. However, fair sharing, a principle commonly adopted in current congestion control protocols, is far from optimal for satisfying latency requirements. We propose Preemptive Distributed Quick (PDQ) flow scheduling, a protocol designed to complete flows quickly and meet flow deadlines. PDQ enables flow preemption to approximate… ▽ More

    Submitted 12 June, 2012; v1 submitted 10 June, 2012; originally announced June 2012.

    Comments: The conference version was published in SIGCOMM 2012

  27. arXiv:1206.1134  [pdf

    cs.SI cs.DB physics.soc-ph

    Shortest Paths in Less Than a Millisecond

    Authors: Rachit Agarwal, Matthew Caesar, P. Brighten Godfrey, Ben Y. Zhao

    Abstract: We consider the problem of answering point-to-point shortest path queries on massive social networks. The goal is to answer queries within tens of milliseconds while minimizing the memory requirements. We present a technique that achieves this goal for an extremely large fraction of path queries by exploiting the structure of the social networks. Using evaluations on real-world datasets, we argu… ▽ More

    Submitted 6 June, 2012; originally announced June 2012.

    Comments: 6 pages; to appear in SIGCOMM WOSN 2012

  28. arXiv:1201.2703  [pdf

    cs.DS cs.DC cs.NI cs.SI

    Faster Approximate Distance Queries and Compact Routing in Sparse Graphs

    Authors: Rachit Agarwal, P. Brighten Godfrey, Sariel Har-Peled

    Abstract: A distance oracle is a compact representation of the shortest distance matrix of a graph. It can be queried to approximate shortest paths between any pair of vertices. Any distance oracle that returns paths of worst-case stretch (2k-1) must require space $Ω(n^{1 + 1/k})$ for graphs of n nodes. The hard cases that enforce this lower bound are, however, rather dense graphs with average degree Ω(n^{1… ▽ More

    Submitted 12 January, 2012; originally announced January 2012.

    Comments: 20 pages, an earlier version appeared in INFOCOM 2011, this version presents data structures with improved space/query-time trade-off

  29. arXiv:1201.1661  [pdf, ps, other

    cs.NI

    Slick Packets

    Authors: Giang T. K. Nguyen, Rachit Agarwal, Junda Liu, Matthew Caesar, P. Brighten Godfrey, Scott Shenker

    Abstract: Source-controlled routing has been proposed as a way to improve flexibility of future network architectures, as well as simplifying the data plane. However, if a packet specifies its path, this precludes fast local re-routing within the network. We propose SlickPackets, a novel solution that allows packets to slip around failures by specifying alternate paths in their headers, in the form of compa… ▽ More

    Submitted 8 January, 2012; originally announced January 2012.

    Comments: This is the full version of a paper with the same title that appeared in ACM SIGMETRICS 2011, with the inclusion of the appendix. 16 pages

    ACM Class: C.2.1; C.2.2; C.2.6

  30. arXiv:1110.1687  [pdf, other

    cs.NI

    Jellyfish: Networking Data Centers Randomly

    Authors: Ankit Singla, Chi-Yao Hong, Lucian Popa, P. Brighten Godfrey

    Abstract: Industry experience indicates that the ability to incrementally expand data centers is essential. However, existing high-bandwidth network designs have rigid structure that interferes with incremental expansion. We present Jellyfish, a high-capacity network interconnect, which, by adopting a random graph topology, yields itself naturally to incremental expansion. Somewhat surprisingly, Jellyfish i… ▽ More

    Submitted 20 April, 2012; v1 submitted 7 October, 2011; originally announced October 2011.

    Comments: 14 pages, 12 figures

  31. arXiv:1108.0192  [pdf, other

    cs.NI

    BGP Stability is Precarious

    Authors: P. Brighten Godfrey

    Abstract: We note a fact which is simple, but may be useful for the networking research community: essentially any change to BGP's decision process can cause divergence --- or convergence when BGP would otherwise diverge.

    Submitted 31 July, 2011; originally announced August 2011.

  32. arXiv:0803.0632  [pdf, ps, other

    cs.NI cs.IT

    Network Coding for Distributed Storage Systems

    Authors: Alexandros G. Dimakis, P. Brighten Godfrey, Yunnan Wu, Martin J. Wainwright, Kannan Ramchandran

    Abstract: Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since… ▽ More

    Submitted 5 March, 2008; originally announced March 2008.

    Report number: EECS 14546

  33. arXiv:cs/0702015  [pdf, ps, other

    cs.IT cs.NI

    Network Coding for Distributed Storage Systems

    Authors: Alexandros G. Dimakis, P. Brighten Godfrey, Martin J. Wainwright, Kannan Ramchandran

    Abstract: Peer-to-peer distributed storage systems provide reliable access to data through redundancy spread over nodes across the Internet. A key goal is to minimize the amount of bandwidth used to maintain that redundancy. Storing a file using an erasure code, in fragments spread across nodes, promises to require less redundancy and hence less maintenance bandwidth than simple replication to provide the… ▽ More

    Submitted 2 February, 2007; originally announced February 2007.

    Comments: To appear in INFOCOM 2007