
A System Development Kit for Big Data Applications on FPGA-based Clusters: The EVEREST Approach

Authors: Christian Pilato, Subhadeep Banik, Jakub Beranek, Fabien Brocheton, Jeronimo Castrillon, Riccardo Cevasco, Radim Cmar, Serena Curzel, Fabrizio Ferrandi, Karl F. A. Friebel, Antonella Galizia, Matteo Grasso, Paulo Silva, Jan Martinovic, Gianluca Palermo, Michele Paolino, Andrea Parodi, Antonio Parodi, Fabio Pintus, Raphael Polig, David Poulet, Francesco Regazzoni, Burkhard Ringlein, Roberto Rocco, Katerina Slaninova, et al. (6 additional authors not shown)

Abstract: Modern big data workflows are characterized by computationally intensive kernels. The simulated results are often combined with knowledge extracted from AI models to ultimately support decision-making. These energy-hungry workflows are increasingly executed in data centers with energy-efficient hardware accelerators since FPGAs are well-suited for this task due to their inherent parallelism. We present the H2020 project EVEREST, which has developed a system development kit (SDK) to simplify the creation of FPGA-accelerated kernels and manage the execution at runtime through a virtualization environment. This paper describes the main components of the EVEREST SDK and the benefits that can be achieved in our use cases.

Submitted 19 February, 2024; originally announced February 2024.

Comments: Accepted for presentation at DATE 2024 (multi-partner project session)

arXiv:2311.04644 [pdf, ps, other]

Bounds on the density of smooth lattice coverings

Authors: Or Ordentlich, Oded Regev, Barak Weiss

Abstract: Let $K$ be a convex body in $\mathbb{R}^n$, let $L$ be a lattice with covolume one, and let $η>0$. We say that $K$ and $L$ form an $η$-smooth cover if each point $x \in \mathbb{R}^n$ is covered by $(1 \pm η) vol(K)$ translates of $K$ by $L$. We prove that for any positive $σ, η$, asymptotically as $n \to \infty$, for any $K$ of volume $n^{3+σ}$, one can find a lattice $L$ for which $L, K$ form an $η$-smooth cover. Moreover, this property is satisfied with high probability for a lattice chosen randomly, according to the Haar-Siegel measure on the space of lattices. Similar results hold for random construction A lattices, albeit with a worse power law, provided the ratio between the covering and packing radii of $\mathbb{Z}^n$ with respect to $K$ is at most polynomial in $n$. Our proofs rely on a recent breakthrough by Dhar and Dvir on the discrete Kakeya problem.

Submitted 8 November, 2023; originally announced November 2023.
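
Note on the definition above: the $η$-smooth covering condition can be spelled out as a two-sided bound on the number of lattice translates containing each point. This is only a reading of the abstract's wording; the counting notation $\#\{\cdot\}$ and the symbol $\ell$ for a lattice point are introduced here for illustration and are not taken from the paper.

```latex
\[
  (1-\eta)\,\mathrm{vol}(K)
  \;\le\;
  \#\{\ell \in L : x \in \ell + K\}
  \;\le\;
  (1+\eta)\,\mathrm{vol}(K)
  \qquad \text{for all } x \in \mathbb{R}^n .
\]
```

Because $L$ has covolume one, the number of translates covering a point, averaged over a fundamental domain of $L$, equals $\mathrm{vol}(K)$, so the condition says that every point is covered close to the average number of times.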

arXiv:2308.01379 [pdf, other]

doi 10.1145/3592124

Computational Long Exposure Mobile Photography

Authors: Eric Tabellion, Nikhil Karnad, Noa Glaser, Ben Weiss, David E. Jacobs, Yael Pritch

Abstract: Long exposure photography produces stunning imagery, representing moving elements in a scene with motion-blur. It is generally employed in two modalities, producing either a foreground or a background blur effect. Foreground blur images are traditionally captured on a tripod-mounted camera and portray blurred moving foreground elements, such as silky water or light trails, over a perfectly sharp background landscape. Background blur images, also called panning photography, are captured while the camera is tracking a moving subject, to produce an image of a sharp subject over a background blurred by relative motion. Both techniques are notoriously challenging and require additional equipment and advanced skills. In this paper, we describe a computational burst photography system that operates in a hand-held smartphone camera app, and achieves these effects fully automatically, at the tap of the shutter button. Our approach first detects and segments the salient subject. We track the scene motion over multiple frames and align the images in order to preserve desired sharpness and to produce aesthetically pleasing motion streaks. We capture an under-exposed burst and select the subset of input frames that will produce blur trails of controlled length, regardless of scene or camera motion velocity. We predict inter-frame motion and synthesize motion-blur to fill the temporal gaps between the input frames. Finally, we composite the blurred image with the sharp regular exposure to protect the sharpness of faces or areas of the scene that are barely moving, and produce a final high resolution and high dynamic range (HDR) photograph. Our system democratizes a capability previously reserved to professionals, and makes this creative style accessible to most casual photographers. More information and supplementary material can be found on our project webpage: https://motion-mode.github.io/

Submitted 2 August, 2023; originally announced August 2023.

Comments: 15 pages, 17 figures

ACM Class: I.4; I.3.3; I.2.10

Journal ref: ACM Trans. Graph. 42, 4, Article 48 (August 2023)

arXiv:2110.04517 [pdf, other]

Extending Multi-Text Sentence Fusion Resources via Pyramid Annotations

Authors: Daniela Brook Weiss, Paul Roit, Ori Ernst, Ido Dagan

Abstract: NLP models that compare or consolidate information across multiple documents often struggle when challenged with recognizing substantial information redundancies across the texts. For example, in multi-document summarization it is crucial to identify salient information across texts and then generate a non-redundant summary, while facing repeated and usually differently-phrased salient content. To facilitate researching such challenges, the sentence-level task of sentence fusion was proposed, yet previous datasets for this task were very limited in their size and scope. In this paper, we revisit and substantially extend previous dataset creation efforts. With careful modifications, relabeling and employing complementing data sources, we were able to triple the size of a notable earlier dataset. Moreover, we show that our extended version uses more representative texts for multi-document tasks and provides a larger and more diverse training set, which substantially improves model training.

Submitted 9 October, 2021; originally announced October 2021.

arXiv:2109.12655 [pdf, other]

QA-Align: Representing Cross-Text Content Overlap by Aligning Question-Answer Propositions

Authors: Daniela Brook Weiss, Paul Roit, Ayal Klein, Ori Ernst, Ido Dagan

Abstract: Multi-text applications, such as multi-document summarization, are typically required to model redundancies across related texts. Current methods confronting consolidation struggle to fuse overlapping information. In order to explicitly represent content overlap, we propose to align predicate-argument relations across texts, providing a potential scaffold for information consolidation. We go beyond clustering coreferring mentions, and instead model overlap with respect to redundancy at a propositional level, rather than merely detecting shared referents. Our setting exploits QA-SRL, utilizing question-answer pairs to capture predicate-argument relations, facilitating laymen annotation of cross-text alignments. We employ crowd-workers for constructing a dataset of QA-based alignments, and present a baseline QA alignment model trained over our dataset. Analyses show that our new task is semantically challenging, capturing content overlap beyond lexical similarity and complements cross-document coreference with proposition-level links, offering potential use for downstream tasks.

Submitted 26 September, 2021; originally announced September 2021.

Comments: Accepted to EMNLP 2021, Main Conference

arXiv:2106.06293 [pdf, other]

Acceleration-as-a-μService: A Cloud-native Monte-Carlo Option Pricing Engine on CPUs, GPUs and Disaggregated FPGAs

Authors: Dionysios Diamantopoulos, Raphael Polig, Burkhard Ringlein, Mitra Purandare, Beat Weiss, Christoph Hagleitner, Mark Lantz, Francois Abel

Abstract: The evolution of cloud applications into loosely-coupled microservices opens new opportunities for hardware accelerators to improve workload performance. Existing accelerator techniques for cloud sacrifice the consolidation benefits of microservices. This paper presents CloudiFi, a framework to deploy and compare accelerators as a cloud service. We evaluate our framework in the context of a financial workload and present early results indicating up to 485x gains in microservice response time.

Submitted 11 June, 2021; originally announced June 2021.

Comments: 3 pages, 6 figures

arXiv:2104.07608 [pdf, other]

Camera View Adjustment Prediction for Improving Image Composition

Authors: Yu-Chuan Su, Raviteja Vemulapalli, Ben Weiss, Chun-Te Chu, Philip Andrew Mansfield, Lior Shapira, Colvin Pitts

Abstract: Image composition plays an important role in the quality of a photo. However, not every camera user possesses the knowledge and expertise required for capturing well-composed photos. While post-capture cropping can improve the composition sometimes, it does not work in many common scenarios in which the photographer needs to adjust the camera view to capture the best shot. To address this issue, we propose a deep learning-based approach that provides suggestions to the photographer on how to adjust the camera view before capturing. By optimizing the composition before a photo is captured, our system helps photographers to capture better photos. As there is no publicly-available dataset for this task, we create a view adjustment dataset by repurposing existing image cropping datasets. Furthermore, we propose a two-stage semi-supervised approach that utilizes both labeled and unlabeled images for training a view adjustment model. Experiment results show that the proposed semi-supervised approach outperforms the corresponding supervised alternatives, and our user study results show that the suggested view adjustment improves image composition 79% of the time.

Submitted 15 April, 2021; originally announced April 2021.

arXiv:2006.00340 [pdf, ps, other]

New bounds on the density of lattice coverings

Authors: Or Ordentlich, Oded Regev, Barak Weiss

Abstract: We obtain new upper bounds on the minimal density of lattice coverings of Euclidean space by dilates of a convex body K. We also obtain bounds on the probability (with respect to the natural Haar-Siegel measure on the space of lattices) that a randomly chosen lattice L satisfies that L+K is all of space. As a step in the proof, we utilize and strengthen results on the discrete Kakeya problem.

Submitted 30 May, 2020; originally announced June 2020.

MSC Class: 11H31; 94B75; 11T30

arXiv:2005.00400 [pdf, other]

Multi-episodic Perceived Quality of an Audio-on-Demand Service

Authors: Dennis Guse, Oliver Hohlfeld, Anna Wunderlich, Benjamin Weiss, Sebastian Möller

Abstract: QoE is traditionally evaluated by using short stimuli usually representing parts or single usage episodes. This opens the question of how the overall service perception involving multiple usage episodes can be evaluated, a question of high practical relevance to service operators. Despite initial research on this challenging aspect of multi-episodic perceived quality, the underlying quality formation processes and their factors are still to be discovered. We present a multi-episodic experiment of an Audio on Demand service over a usage period of 6 days with 93 participants. Our work directly extends prior work investigating the impact of time between usage episodes. The results show similar effects; the recency effect is also not statistically significant. In addition, we extend prediction of multi-episodic judgments by accounting for the observed saturation.

Submitted 1 May, 2020; originally announced May 2020.

Comments: To appear at IEEE QoMEX 2020

ACM Class: H.5.1; H.5.5; C.2.m

arXiv:1909.13361 [pdf, other]

A Longitudinal Framework for Predicting Nonresponse in Panel Surveys

Authors: Christoph Kern, Bernd Weiss, Jan-Philipp Kolb

Abstract: Nonresponse in panel studies can lead to a substantial loss in data quality due to its potential to introduce bias and distort survey estimates. Recent work investigates the usage of machine learning to predict nonresponse in advance, such that predicted nonresponse propensities can be used to inform the data collection process. However, predicting nonresponse in panel studies requires accounting for the longitudinal data structure in terms of model building, tuning, and evaluation. This study proposes a longitudinal framework for predicting nonresponse with machine learning and multiple panel waves and illustrates its application. With respect to model building, this approach utilizes information from multiple waves by introducing features that aggregate previous (non)response patterns. Concerning model tuning and evaluation, temporal cross-validation is employed by iterating through pairs of panel waves such that the training and test sets move in time. Implementing this approach with data from a German probability-based mixed-mode panel shows that aggregating information over multiple panel waves can be used to build prediction models with competitive and robust performance over all test waves.

Submitted 2 November, 2019; v1 submitted 29 September, 2019; originally announced September 2019.
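
Note on the method above: the temporal cross-validation idea (train on one wave, test on the next, then move both forward) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the data layout `responses[w][i]`, the helper names, the two aggregate features, and the threshold rule that stands in for a real machine learning model.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: responses[w][i] is 1 if panelist i responded in wave w, else 0.
Responses = Sequence[Sequence[int]]
Features = List[Tuple[float, int]]


def history_features(responses: Responses, wave: int) -> Features:
    """Aggregate each panelist's (non)response pattern over the waves before `wave`."""
    feats: Features = []
    for i in range(len(responses[0])):
        past = [responses[w][i] for w in range(wave)]
        rate = sum(past) / len(past) if past else 0.0  # share of past waves answered
        last = past[-1] if past else 0                 # response in the previous wave
        feats.append((rate, last))
    return feats


def temporal_splits(n_waves: int, first_target: int = 2) -> List[Tuple[int, int]]:
    """Pairs (train_wave, test_wave) of consecutive waves that move forward in time."""
    return [(t, t + 1) for t in range(first_target, n_waves - 1)]


def accuracy(feats: Features, labels: Sequence[int], thr: float) -> float:
    """Share of panelists whose response is predicted correctly by `rate >= thr`."""
    hits = sum(int(rate >= thr) == y for (rate, _), y in zip(feats, labels))
    return hits / len(labels)


def fit_threshold(feats: Features, labels: Sequence[int]) -> float:
    """Stand-in 'model': pick the response-rate threshold with the best training accuracy."""
    best_thr, best_acc = 0.0, -1.0
    for thr in (0.0, 0.25, 0.5, 0.75, 1.0):
        acc = accuracy(feats, labels, thr)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr


def evaluate(responses: Responses) -> Dict[int, float]:
    """Fit on each training wave, score on the following wave, keyed by test wave."""
    scores: Dict[int, float] = {}
    for train_wave, test_wave in temporal_splits(len(responses)):
        thr = fit_threshold(history_features(responses, train_wave), responses[train_wave])
        scores[test_wave] = accuracy(
            history_features(responses, test_wave), responses[test_wave], thr
        )
    return scores
```

The point of the sketch is that features for a target wave use only earlier waves, and the model scored on wave t+1 is fitted on wave t, so no information from the future leaks into training.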

arXiv:1907.08228 [pdf, other]

TED-On: A Total Error Framework for Digital Traces of Human Behavior on Online Platforms

Authors: Indira Sen, Fabian Floeck, Katrin Weller, Bernd Weiss, Claudia Wagner

Abstract: People's activities and opinions recorded as digital traces online, especially on social media and other web-based platforms, offer increasingly informative pictures of the public. They promise to allow inferences about populations beyond the users of the platforms on which the traces are recorded, representing real potential for the Social Sciences and a complement to survey-based research. But the use of digital traces brings its own complexities and new error sources to the research enterprise. Recently, researchers have begun to discuss the errors that can occur when digital traces are used to learn about humans and social phenomena. This article synthesizes this discussion and proposes a systematic way to categorize potential errors, inspired by the Total Survey Error (TSE) Framework developed for survey methodology. We introduce a conceptual framework to diagnose, understand, and document errors that may occur in studies based on such digital traces. While there are clear parallels to the well-known error sources in the TSE framework, the new "Total Error Framework for Digital Traces of Human Behavior on Online Platforms" (TED-On) identifies several types of error that are specific to the use of digital traces. By providing a standard vocabulary to describe these errors, the proposed framework is intended to advance communication and research concerning the use of digital traces in scientific social research.

Submitted 3 June, 2021; v1 submitted 18 July, 2019; originally announced July 2019.

Comments: 20 pages, 2 figures, Longer version of paper set to appear in Public Opinion Quarterly. Updating terminology

arXiv:1406.3807 [pdf, other]

Dense forests and Danzer sets

Authors: Yaar Solomon, Barak Weiss

Abstract: A set $Y\subseteq\mathbb{R}^d$ that intersects every convex set of volume $1$ is called a Danzer set. It is not known whether there are Danzer sets in $\mathbb{R}^d$ with growth rate $O(T^d)$. We prove that natural candidates, such as discrete sets that arise from substitutions and from cut-and-project constructions, are not Danzer sets. For cut and project sets our proof relies on the dynamics of homogeneous flows. We consider a weakening of the Danzer problem, the existence of uniformly discrete dense forests, and we use homogeneous dynamics (in particular Ratner's theorems on unipotent flows) to construct such sets. We also prove an equivalence between the above problem and a well-known combinatorial problem, and deduce the existence of Danzer sets with growth rate $O(T^d\log T)$, improving the previous bound of $O(T^d\log^{d-1} T)$.

Submitted 10 July, 2014; v1 submitted 15 June, 2014; originally announced June 2014.

arXiv:1304.6505 [pdf]

Software Design Principles of a DFS Tower A-CWP Prototype

Authors: Felix Schmitt, Ralf Heidger, Stephen Straub, Benjamin Weiß

Abstract: SESAR is supposed to boost the development of new operational procedures together with the supporting systems in order to modernize the pan-European air traffic management (ATM). One consequence of this development is that more and more information is presented to - and has to be processed by - air traffic control officers (ATCOs). Thus, there is a strong need for a software design concept that fosters the development of an advanced (tower) controller working position (A-CWP) that comprehensively integrates the still counting amount of information while reducing the data management workload of ATCOs. We report on our first hands-on experiences obtained during the development of an A-CWP prototype that was used in two SESAR validation sessions.

Submitted 24 April, 2013; originally announced April 2013.

Comments: Presented at the International Symposium on Enhanced Solutions for Aircraft and Vehicle Surveillance Applications (ESAVS 2013)

arXiv:0907.1201 [pdf, ps, other]

Generating Product Systems

Authors: Nir Avni, Benjamin Weiss

Abstract: Generalizing Krieger's finite generation theorem, we give conditions for an ergodic system to be generated by a pair of partitions, each required to be measurable with respect to a given sub-algebra, and also required to have a fixed size.

Submitted 7 July, 2009; originally announced July 2009.

MSC Class: 28D20; 60G10; 37A35; 68P30

arXiv:0808.2964 [pdf, ps, other]

Estimating the Lengths of Memory Words

Authors: Gusztav Morvai, Benjamin Weiss

Abstract: For a stationary stochastic process $\{X_n\}$ with values in some set $A$, a finite word $w \in A^K$ is called a memory word if the conditional probability of $X_0$ given the past is constant on the cylinder set defined by $X_{-K}^{-1}=w$. It is called a minimal memory word if no proper suffix of $w$ is also a memory word. For example, in a $K$-step Markov process all words of length $K$ are memory words but not necessarily minimal. We consider the problem of determining the lengths of the longest minimal memory words and the shortest memory words of an unknown process $\{X_n\}$ based on sequentially observing the outputs of a single sample $\{ξ_1,ξ_2,...ξ_n\}$. We will give a universal estimator which converges almost surely to the length of the longest minimal memory word and show that no such universal estimator exists for the length of the shortest memory word. The alphabet $A$ may be finite or countable.

Submitted 21 August, 2008; originally announced August 2008.

Journal ref: IEEE Transactions on Information Theory, Vol. 54, No. 8. (2008), pp. 3804-3807

arXiv:0803.4332 [pdf, ps, other]

On Sequential Estimation and Prediction for Discrete Time Series

Authors: G. Morvai, B. Weiss

Abstract: The problem of extracting as much information as possible from a sequence of observations of a stationary stochastic process $X_0,X_1,...X_n$ has been considered by many authors from different points of view. It has long been known through the work of D. Bailey that no universal estimator for $\textbf{P}(X_{n+1}|X_0,X_1,...X_n)$ can be found which converges to the true estimator almost surely. Despite this result, for restricted classes of processes, or for sequences of estimators along stopping times, universal estimators can be found. We present here a survey of some of the recent work that has been done along these lines.

Submitted 30 March, 2008; originally announced March 2008.

Journal ref: Stochastics and Dynamics, Vol. 7, No. 4. pp. 417-437, 2007

arXiv:0712.0105 [pdf, ps, other]

doi 10.1016/j.anihpb.2005.11.001

On estimating the memory for finitarily Markovian processes

Authors: Gusztav Morvai, Benjamin Weiss

Abstract: Finitarily Markovian processes are those processes $\{X_n\}_{n=-\infty}^{\infty}$ for which there is a finite $K$ ($K = K(\{X_n\}_{n=-\infty}^0)$) such that the conditional distribution of $X_1$ given the entire past is equal to the conditional distribution of $X_1$ given only $\{X_n\}_{n=1-K}^0$. The least such value of $K$ is called the memory length. We give a rather complete analysis of the problems of universally estimating the least such value of $K$, both in the backward sense that we have just described and in the forward sense, where one observes successive values of $\{X_n\}$ for $n \geq 0$ and asks for the least value $K$ such that the conditional distribution of $X_{n+1}$ given $\{X_i\}_{i=n-K+1}^n$ is the same as the conditional distribution of $X_{n+1}$ given $\{X_i\}_{i=-\infty}^n$. We allow for finite or countably infinite alphabet size.

Submitted 3 December, 2007; originally announced December 2007.

Journal ref: Ann. Inst. H. Poincare Probab. Statist. 43 (2007), no. 1, 15--30

arXiv:0711.3856 [pdf, ps, other]

doi 10.1016/j.anihpb.2004.07.002

Forward estimation for ergodic time series

Authors: Gusztav Morvai, Benjamin Weiss

Abstract: The forward estimation problem for stationary and ergodic time series $\{X_n\}_{n=0}^{\infty}$ taking values from a finite alphabet ${\cal X}$ is to estimate the probability that $X_{n+1}=x$ based on the observations $X_i$, $0\le i\le n$ without prior knowledge of the distribution of the process $\{X_n\}$. We present a simple procedure $g_n$ which is evaluated on the data segment $(X_0,...,X_n)$ and for which, ${\rm error}(n) = |g_{n}(x)-P(X_{n+1}=x |X_0,...,X_n)|\to 0$ almost surely for a subclass of all stationary and ergodic time series, while for the full class the Cesaro average of the error tends to zero almost surely and moreover, the error tends to zero in probability.

Submitted 24 November, 2007; originally announced November 2007.

Journal ref: Ann. Inst. H. Poincare Probab. Statist. 41 (2005), no. 5, 859--870

arXiv:0711.0472 [pdf, ps, other]

Order estimation of Markov chains

Authors: G. Morvai, B. Weiss

Abstract: We describe estimators $χ_n(X_0,X_1,...,X_n)$, which when applied to an unknown stationary process taking values from a countable alphabet ${\cal X}$, converge almost surely to $k$ in case the process is a $k$-th order Markov chain and to infinity otherwise.

Submitted 3 November, 2007; originally announced November 2007.

Journal ref: IEEE Trans. Inform. Theory 51 (2005), no. 4, 1496--1497

arXiv:0711.0471 [pdf, ps, other]

Prediction for discrete time series

Authors: G. Morvai, B. Weiss

Abstract: Let $\{X_n\}$ be a stationary and ergodic time series taking values from a finite or countably infinite set ${\cal X}$. Assume that the distribution of the process is otherwise unknown. We propose a sequence of stopping times $λ_n$ along which we will be able to estimate the conditional probability $P(X_{λ_n+1}=x|X_0,...,X_{λ_n})$ from data segment $(X_0,...,X_{λ_n})$ in a pointwise consistent way for a restricted class of stationary and ergodic finite or countably infinite alphabet time series which includes among others all stationary and ergodic finitarily Markovian processes. If the stationary and ergodic process turns out to be finitarily Markovian (among others, all stationary and ergodic Markov chains are included in this class) then $ \lim_{n\to \infty} {n\over λ_n}>0$ almost surely. If the stationary and ergodic process turns out to possess finite entropy rate then $λ_n$ is upper bounded by a polynomial, eventually almost surely.

Submitted 3 November, 2007; originally announced November 2007.

Journal ref: Probab. Theory Related Fields 132 (2005), no. 1, 1--12

arXiv:0711.0350 [pdf, ps, other]

Intermittent estimation of stationary time series

Authors: G. Morvai, B. Weiss

Abstract: Let $\{X_n\}_{n=0}^{\infty}$ be a stationary real-valued time series with unknown distribution. Our goal is to estimate the conditional expectation of $X_{n+1}$ based on the observations $X_i$, $0\le i\le n$ in a strongly consistent way. Bailey and Ryabko proved that this is not possible even for ergodic binary time series if one estimates at all values of $n$. We propose a very simple algorithm which will make prediction infinitely often at carefully selected stopping times chosen by our rule. We show that under certain conditions our procedure is strongly (pointwise) consistent, and $L_2$ consistent without any condition. An upper bound on the growth of the stopping times is also presented in this paper.

Submitted 2 November, 2007; originally announced November 2007.

Journal ref: Test 13 (2004), no. 2, 525--542

arXiv:0710.5144 [pdf, ps, other]

Forecasting for stationary binary time series

Authors: Gusztav Morvai, Benjamin Weiss

Abstract: The forecasting problem for a stationary and ergodic binary time series $\{X_n\}_{n=0}^{\infty}$ is to estimate the probability that $X_{n+1}=1$ based on the observations $X_i$, $0\le i\le n$ without prior knowledge of the distribution of the process $\{X_n\}$. It is known that this is not possible if one estimates at all values of $n$. We present a simple procedure which will attempt to make such a prediction infinitely often at carefully selected stopping times chosen by the algorithm. We show that the proposed procedure is consistent under certain conditions, and we estimate the growth rate of the stopping times.

Submitted 26 October, 2007; originally announced October 2007.

Journal ref: Acta Appl. Math. 79 (2003), no. 1-2, 25--34

arXiv:0710.3775 [pdf, ps, other]

On classifying processes

Authors: Gusztav Morvai, Benjamin Weiss

Abstract: We prove several results concerning classifications, based on successive observations $(X_1,..., X_n)$ of an unknown stationary and ergodic process, for membership in a given class of processes, such as the class of all finite order Markov chains.

Submitted 19 October, 2007; originally announced October 2007.

Journal ref: Bernoulli 11 (2005), no. 3, pp. 523--532

arXiv:0710.3773 [pdf, ps, other]

Limitations on intermittent forecasting

Authors: Gusztav Morvai, Benjamin Weiss

Abstract: Bailey showed that the general pointwise forecasting problem for stationary and ergodic time series has a negative solution. However, it is known that for Markov chains the problem can be solved. Morvai showed that there is a stopping time sequence $\{λ_n\}$ such that $P(X_{λ_n+1}=1|X_0,...,X_{λ_n})$ can be estimated from samples $(X_0,...,X_{λ_n})$ such that the difference between the conditional probability and the estimate vanishes along these stopping times for all stationary and ergodic binary time series. We will show it is not possible to estimate the above conditional probability along a stopping time sequence for all stationary and ergodic binary time series in a pointwise sense such that if the time series turns out to be a Markov chain, the predictor will predict eventually for all $n$.

Submitted 19 October, 2007; originally announced October 2007.

Journal ref: Statist. Probab. Lett. 72 (2005), no. 4, 285--290

arXiv:0710.3757 [pdf, ps, other]

Inferring the conditional mean

Authors: Gusztav Morvai, Benjamin Weiss

Abstract: Consider a stationary real-valued time series $\{X_n\}_{n=0}^{\infty}$ with a priori unknown distribution. The goal is to estimate the conditional expectation $E(X_{n+1}|X_0,..., X_n)$ based on the observations $(X_0,..., X_n)$ in a pointwise consistent way. It is well known that this is not possible at all values of $n$. We will estimate it along stopping times.

Submitted 19 October, 2007; originally announced October 2007.

Journal ref: Theory Stoch. Process. 11 (2005), no. 1-2, pp. 112--120

arXiv:cs/0109067 [pdf]

Voice over IP in the Local Exchange: A Case Study

Authors: Martin B. H. Weiss, Hak-Ju Kim

Abstract: There have been a small number of cost studies of Voice over IP (VoIP) in the academic literature. Generally, they have been for abstract networks, have not been focused on the public switched telephone network, or they have not included the operating costs. This paper presents the operating cost portion of our ongoing research project comparing circuit-switched and IP network costs for an existing local exchange carrier. We have found that (1) The operating cost differential between IP and circuit switching for this LEC will be small; and (2) A substantial majority of a telco's operating cost lies in customer service and outside plant maintenance, which will be incurred equally in both networks in a pure substitution scenario. Thus, the operating cost difference lies in the actual cost differences of the switching technologies. This appears to be less than 10%-15% of the total operating cost of the network. Thus, even if the cost differences for substitute services were large, the overall impact on the telco's financial performance would be small. But IP has some hidden benefits on the operations side. Most notably, data and voice services could be managed with the same systems infrastructure, meaning that the incremental operations cost of rolling out new services would likely be much lower, since it would all be IP.

Submitted 24 September, 2001; originally announced September 2001.

Comments: 29th TPRC Conference, 2001

Report number: TPRC-2001-053 ACM Class: K.4.m Miscellaneous

Showing 1–26 of 26 results for author: Weiss, B