Showing 1–12 of 12 results for author: Ingber, A

  1. Bridging Dense and Sparse Maximum Inner Product Search

    Authors: Sebastian Bruch, Franco Maria Nardini, Amir Ingber, Edo Liberty

    Abstract: Maximum inner product search (MIPS) over dense and sparse vectors have progressed independently in a bifurcated literature for decades; the latter is better known as top-$k$ retrieval in Information Retrieval. This duality exists because sparse and dense vectors serve different end goals. That is despite the fact that they are manifestations of the same mathematical problem. In this work, we ask i…

    Submitted 16 September, 2023; originally announced September 2023.

  2. An Approximate Algorithm for Maximum Inner Product Search over Streaming Sparse Vectors

    Authors: Sebastian Bruch, Franco Maria Nardini, Amir Ingber, Edo Liberty

    Abstract: Maximum Inner Product Search or top-k retrieval on sparse vectors is well-understood in information retrieval, with a number of mature algorithms that solve it exactly. However, all existing algorithms are tailored to text and frequency-based similarity measures. To achieve optimal memory footprint and query latency, they rely on the near stationarity of documents and on laws governing natural lan…

    Submitted 25 January, 2023; originally announced January 2023.

  3. An Analysis of Fusion Functions for Hybrid Retrieval

    Authors: Sebastian Bruch, Siyu Gai, Amir Ingber

    Abstract: We study hybrid search in text retrieval where lexical and semantic search are fused together with the intuition that the two are complementary in how they model relevance. In particular, we examine fusion by a convex combination (CC) of lexical and semantic scores, as well as the Reciprocal Rank Fusion (RRF) method, and identify their advantages and potential pitfalls. Contrary to existing studie…

    Submitted 4 May, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  4. arXiv:2110.02065

    cs.IR cs.LG

    SDR: Efficient Neural Re-ranking using Succinct Document Representation

    Authors: Nachshon Cohen, Amit Portnoy, Besnik Fetahu, Amir Ingber

    Abstract: BERT based ranking models have achieved superior performance on various information retrieval tasks. However, the large number of parameters and complex self-attention operation come at a significant latency overhead. To remedy this, recent works propose late-interaction architectures, which allow pre-computation of intermediate document representations, thus reducing the runtime latency. Nonethel…

    Submitted 3 October, 2021; originally announced October 2021.

  5. arXiv:1506.03407

    cs.IT

    Strong Successive Refinability and Rate-Distortion-Complexity Tradeoff

    Authors: Albert No, Amir Ingber, Tsachy Weissman

    Abstract: We investigate the second order asymptotics (source dispersion) of the successive refinement problem. Similarly to the classical definition of a successively refinable source, we say that a source is strongly successively refinable if successive refinement coding can achieve the second order optimum rate (including the dispersion terms) at both decoders. We establish a sufficient condition for str…

    Submitted 15 March, 2016; v1 submitted 10 June, 2015; originally announced June 2015.

  6. arXiv:1404.5173

    cs.IT

    Compression for Quadratic Similarity Queries: Finite Blocklength and Practical Schemes

    Authors: Fabian Steiner, Steffen Dempfle, Amir Ingber, Tsachy Weissman

    Abstract: We study the problem of compression for the purpose of similarity identification, where similarity is measured by the mean square Euclidean distance between vectors. While the asymptotical fundamental limits of the problem - the minimal compression rate and the error exponent - were found in a previous work, in this paper we focus on the nonasymptotic domain and on practical, implementable schemes…

    Submitted 10 May, 2014; v1 submitted 21 April, 2014; originally announced April 2014.

    Comments: minor clarifications and wording updates compared to v1

  7. arXiv:1312.2063

    cs.IT cs.DB cs.IR

    The Minimal Compression Rate for Similarity Identification

    Authors: Amir Ingber, Tsachy Weissman

    Abstract: Traditionally, data compression deals with the problem of concisely representing a data source, e.g. a sequence of letters, for the purpose of eventual reproduction (either exact or approximate). In this work we are interested in the case where the goal is to answer similarity queries about the compressed sequence, i.e. to identify whether or not the original sequence is similar to a given query s…

    Submitted 7 December, 2013; originally announced December 2013.

    Comments: 45 pages, 6 figures. Submitted to IEEE Transactions on Information Theory

  8. Compression for Quadratic Similarity Queries

    Authors: Amir Ingber, Thomas Courtade, Tsachy Weissman

    Abstract: The problem of performing similarity queries on compressed data is considered. We focus on the quadratic similarity measure, and study the fundamental tradeoff between compression rate, sequence length, and reliability of queries performed on compressed data. For a Gaussian source, we show that queries can be answered reliably if and only if the compression rate exceeds a given threshold - the ide…

    Submitted 24 July, 2013; originally announced July 2013.

    Comments: 39 pages, 6 figures, submitted to IEEE Trans. on Information Theory

  9. arXiv:1109.6310

    cs.IT

    The Dispersion of Joint Source-Channel Coding

    Authors: Da Wang, Amir Ingber, Yuval Kochman

    Abstract: In this work we investigate the behavior of the distortion threshold that can be guaranteed in joint source-channel coding, to within a prescribed excess-distortion probability. We show that the gap between this threshold and the optimal average distortion is governed by a constant that we call the joint source-channel dispersion. This constant can be easily computed, since it is the sum of the so…

    Submitted 7 December, 2011; v1 submitted 28 September, 2011; originally announced September 2011.

    Comments: Extended version of work presented in the 2011 Allerton conference

  10. Finite Dimensional Infinite Constellations

    Authors: Amir Ingber, Ram Zamir, Meir Feder

    Abstract: In the setting of a Gaussian channel without power constraints, proposed by Poltyrev, the codewords are points in an n-dimensional Euclidean space (an infinite constellation) and the tradeoff between their density and the error probability is considered. The capacity in this setting is the highest achievable normalized log density (NLD) with vanishing error probability. This capacity as well as er…

    Submitted 5 September, 2011; v1 submitted 1 March, 2011; originally announced March 2011.

    Comments: 54 pages, 13 figures. Submitted to IEEE Transactions on Information Theory

    Journal ref: IEEE Trans. on Information Theory, Vol. 59, Issue 3, pp. 1630-1656, 2013

  11. arXiv:1102.2598

    cs.IT

    The Dispersion of Lossy Source Coding

    Authors: Amir Ingber, Yuval Kochman

    Abstract: In this work we investigate the behavior of the minimal rate needed in order to guarantee a given probability that the distortion exceeds a prescribed threshold, at some fixed finite quantization block length. We show that the excess coding rate above the rate-distortion function is inversely proportional (to the first order) to the square root of the block length. We give an explicit expression f…

    Submitted 13 February, 2011; originally announced February 2011.

    Comments: 2011 Data Compression Conference, to appear (submitted Nov. 2010)

  12. arXiv:1007.1407

    cs.IT

    Parallel Bit Interleaved Coded Modulation

    Authors: Amir Ingber, Meir Feder

    Abstract: A new variant of bit interleaved coded modulation (BICM) is proposed. In the new scheme, called Parallel BICM, L identical binary codes are used in parallel using a mapper, a newly proposed finite-length interleaver and a binary dither signal. As opposed to previous approaches, the scheme does not rely on any assumptions of an ideal, infinite-length interleaver. Over a memoryless channel, the new…

    Submitted 17 August, 2010; v1 submitted 8 July, 2010; originally announced July 2010.

    Comments: 19 pages, 15 figures. A shorter version will be presented at the 48th Allerton Conference on Communication, Control, and Computing (Allerton 2010)