-
Report on the First HIPstIR Workshop on the Future of Information Retrieval
Authors:
Laura Dietz,
Bhaskar Mitra,
Jeremy Pickens,
Hana Anber,
Sandeep Avula,
Asia Biega,
Adrian Boteanu,
Shubham Chatterjee,
Jeff Dalton,
Shiri Dori-Hacohen,
John Foley,
Henry Feild,
Ben Gamari,
Rosie Jones,
Pallika Kanani,
Sumanta Kashyapi,
Widad Machmouchi,
Matthew Mitsui,
Steve Nole,
Alexandre Tachard Passos,
Jordan Ramsdell,
Adam Roegiest,
David Smith,
Alessandro Sordoni
Abstract:
The vision of HIPstIR is that early-stage information retrieval (IR) researchers get together to develop a future for non-mainstream ideas and research agendas in IR. The first iteration of this vision materialized in the form of a three-day workshop in Portsmouth, New Hampshire, attended by 24 researchers across academia and industry. Attendees pre-submitted one or more topics that they wanted to pitch at the meeting. Over the three days of the workshop, we self-organized into groups and worked on six specific proposals of common interest. In this report, we present an overview of the workshop and brief summaries of the six proposals that resulted from it.
Submitted 20 December, 2019;
originally announced December 2019.
-
Element Distinctness, Frequency Moments, and Sliding Windows
Authors:
Paul Beame,
Raphael Clifford,
Widad Machmouchi
Abstract:
We derive new time-space tradeoff lower bounds and algorithms for exactly computing statistics of input data, including frequency moments, element distinctness, and order statistics, that are simple to calculate for sorted data. We develop a randomized algorithm for the element distinctness problem whose time $T$ and space $S$ satisfy $T \in O(n^{3/2}/S^{1/2})$, smaller than previous lower bounds for comparison-based algorithms, showing that element distinctness is strictly easier than sorting for randomized branching programs. This algorithm is based on a new time and space efficient algorithm for finding all collisions of a function $f$ from a finite set to itself that are reachable by iterating $f$ from a given set of starting points. We further show that our element distinctness algorithm can be extended at only a polylogarithmic factor cost to solve the element distinctness problem over sliding windows, where the task is to take an input of length $2n-1$ and produce an output for each window of length $n$, giving $n$ outputs in total. In contrast, we show a time-space tradeoff lower bound of $T \in \Omega(n^2/S)$ for randomized branching programs to compute the number of distinct elements over sliding windows. The same lower bound holds for computing the low-order bit of $F_0$ and computing any frequency moment $F_k$, $k \neq 1$. This shows that those frequency moments and the decision problem $F_0 \bmod 2$ are strictly harder than element distinctness. We complement this lower bound with a $T \in O(n^2/S)$ comparison-based deterministic RAM algorithm for exactly computing $F_k$ over sliding windows, nearly matching both our lower bound for the sliding-window version and the comparison-based lower bounds for the single-window version. We further exhibit a quantum algorithm for $F_0$ over sliding windows with $T \in O(n^{3/2}/S^{1/2})$. Finally, we consider the computations of order statistics over sliding windows.
Submitted 14 September, 2013;
originally announced September 2013.
-
Sliding Windows with Limited Storage
Authors:
Paul Beame,
Raphael Clifford,
Widad Machmouchi
Abstract:
We consider time-space tradeoffs for exactly computing frequency moments and order statistics over sliding windows. Given an input of length 2n-1, the task is to output the function of each window of length n, giving n outputs in total. Computations over sliding windows are related to direct sum problems except that inputs to instances almost completely overlap.
We show an average case and randomized time-space tradeoff lower bound of $TS \in \Omega(n^2)$ for multi-way branching programs, and hence standard RAM and word-RAM models, to compute the number of distinct elements, $F_0$, in sliding windows over alphabet $[n]$. The same lower bound holds for computing the low-order bit of $F_0$ and computing any frequency moment $F_k$ for $k \neq 1$. We complement this lower bound with a $TS \in \tilde{O}(n^2)$ deterministic RAM algorithm for exactly computing $F_k$ in sliding windows.
We show time-space separations between the complexity of sliding-window element distinctness and that of sliding-window $F_0 \bmod 2$ computation. In particular, for alphabet $[n]$ there is a very simple errorless sliding-window algorithm for element distinctness that runs in $O(n)$ time on average and uses $O(\log n)$ space.
We show that any algorithm for a single element distinctness instance can be extended to an algorithm for the sliding-window version of element distinctness with at most a polylogarithmic increase in the time-space product.
Finally, we show that order statistics such as the maximum and minimum can be computed over sliding windows with only a logarithmic increase in time, but that a $TS \in \Omega(n^2)$ lower bound holds for sliding-window computation of order statistics such as the median, which amounts to a nearly linear increase in time when space is small.
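For readers unfamiliar with the problem statement, the following is a minimal sketch of the sliding-window task itself: an input of length 2n-1 is mapped to n outputs, one per window of length n. It is a naive linear-space baseline written for illustration, not the time-space tradeoff algorithm from the paper; the function name `sliding_frequency_moment` and its parameters are assumptions introduced here.

```python
from collections import Counter


def sliding_frequency_moment(xs, n, k=0):
    """Return F_k for each of the n length-n windows of xs, where len(xs) == 2n-1.

    Illustrative baseline only: it keeps a full frequency table (O(n) space)
    and is not the space-bounded algorithm described in the abstract.
    """
    if len(xs) != 2 * n - 1:
        raise ValueError("expected an input of length 2n-1")

    counts = Counter(xs[:n])  # frequencies within the current window

    def moment():
        # F_0 is the number of distinct elements; for k >= 1,
        # F_k is the sum of the k-th powers of the frequencies.
        if k == 0:
            return len(counts)
        return sum(c ** k for c in counts.values())

    out = [moment()]
    for i in range(n, 2 * n - 1):
        counts[xs[i]] += 1          # element entering the window
        old = xs[i - n]             # element leaving the window
        counts[old] -= 1
        if counts[old] == 0:
            del counts[old]
        out.append(moment())
    return out
```

With `k=1` every window yields n, which is consistent with the abstract excluding k = 1 from the lower bound: F_1 is simply the window length.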
Submitted 17 September, 2013; v1 submitted 18 December, 2012;
originally announced December 2012.
-
Local-Testability and Self-Correctability of q-ary Sparse Linear Codes
Authors:
Widad Machmouchi
Abstract:
We prove that q-ary sparse codes with small bias are self-correctable and locally testable. This generalizes a result of Kaufman and Sudan, who proved the local testability and correctability of binary sparse codes with small bias. We use properties of q-ary Krawtchouk polynomials and the MacWilliams identity, which relates the weight distribution of a code to the weight distribution of its dual, to derive bounds on the error probability of the randomized tester and self-corrector we analyze.
Submitted 15 December, 2010;
originally announced December 2010.
-
The Quantum Query Complexity of AC0
Authors:
Paul Beame,
Widad Machmouchi
Abstract:
We show that any quantum algorithm deciding whether an input function $f$ from $[n]$ to $[n]$ is 2-to-1 or almost 2-to-1 requires $\Theta(n)$ queries to $f$. The same lower bound holds for determining whether or not a function $f$ from $[2n-2]$ to $[n]$ is surjective. These results yield a nearly linear $\Omega(n/\log n)$ lower bound on the quantum query complexity of $\mathsf{AC}^0$. The best previous lower bound known for any $\mathsf{AC}^0$ function was the $\Omega((n/\log n)^{2/3})$ bound given by Aaronson and Shi's $\Omega(n^{2/3})$ lower bound for the element distinctness problem.
Submitted 30 January, 2012; v1 submitted 14 August, 2010;
originally announced August 2010.