-
A System Development Kit for Big Data Applications on FPGA-based Clusters: The EVEREST Approach
Authors:
Christian Pilato,
Subhadeep Banik,
Jakub Beranek,
Fabien Brocheton,
Jeronimo Castrillon,
Riccardo Cevasco,
Radim Cmar,
Serena Curzel,
Fabrizio Ferrandi,
Karl F. A. Friebel,
Antonella Galizia,
Matteo Grasso,
Paulo Silva,
Jan Martinovic,
Gianluca Palermo,
Michele Paolino,
Andrea Parodi,
Antonio Parodi,
Fabio Pintus,
Raphael Polig,
David Poulet,
Francesco Regazzoni,
Burkhard Ringlein,
Roberto Rocco,
Katerina Slaninova,
et al. (6 additional authors not shown)
Abstract:
Modern big data workflows are characterized by computationally intensive kernels. The simulated results are often combined with knowledge extracted from AI models to ultimately support decision-making. These energy-hungry workflows are increasingly executed in data centers with energy-efficient hardware accelerators since FPGAs are well-suited for this task due to their inherent parallelism. We present the H2020 project EVEREST, which has developed a system development kit (SDK) to simplify the creation of FPGA-accelerated kernels and manage the execution at runtime through a virtualization environment. This paper describes the main components of the EVEREST SDK and the benefits that can be achieved in our use cases.
Submitted 19 February, 2024;
originally announced February 2024.
-
Bounds on the density of smooth lattice coverings
Authors:
Or Ordentlich,
Oded Regev,
Barak Weiss
Abstract:
Let $K$ be a convex body in $\mathbb{R}^n$, let $L$ be a lattice with covolume one, and let $η>0$. We say that $K$ and $L$ form an $η$-smooth cover if each point $x \in \mathbb{R}^n$ is covered by $(1 \pm η) vol(K)$ translates of $K$ by $L$. We prove that for any positive $σ, η$, asymptotically as $n \to \infty$, for any $K$ of volume $n^{3+σ}$, one can find a lattice $L$ for which $L, K$ form an $η$-smooth cover. Moreover, this property is satisfied with high probability for a lattice chosen randomly, according to the Haar-Siegel measure on the space of lattices. Similar results hold for random construction A lattices, albeit with a worse power law, provided the ratio between the covering and packing radii of $\mathbb{Z}^n$ with respect to $K$ is at most polynomial in $n$. Our proofs rely on a recent breakthrough by Dhar and Dvir on the discrete Kakeya problem.
Submitted 8 November, 2023;
originally announced November 2023.
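The covering-count definition in the abstract above can be made concrete with a small Python sketch. It rests on simplifying assumptions that are not taken from the paper: dimension $n = 2$, the lattice $L = \mathbb{Z}^2$ (covolume one), and $K$ a closed disk of radius `r` centred at the origin. The function names `cover_count` and `smoothness` are hypothetical, and `smoothness` only samples a finite grid of points, so it gives a lower bound on the true deviation $η$.

```python
import math

def cover_count(x, y, r):
    """Number of translates K + l, l in Z^2, that contain the point (x, y),
    where K is the closed disk of radius r centred at the origin."""
    count = 0
    # Only lattice points within distance r of (x, y) can contribute.
    for a in range(math.floor(x - r), math.ceil(x + r) + 1):
        for b in range(math.floor(y - r), math.ceil(y + r) + 1):
            if (x - a) ** 2 + (y - b) ** 2 <= r * r:
                count += 1
    return count

def smoothness(r, grid=20):
    """Largest relative deviation |count / vol(K) - 1| over a grid of sample
    points in the unit cell (a lower bound on the eta of the definition)."""
    vol = math.pi * r * r
    worst = 0.0
    for i in range(grid):
        for j in range(grid):
            c = cover_count(i / grid, j / grid, r)
            worst = max(worst, abs(c / vol - 1.0))
    return worst
```

For a small disk the count fluctuates strongly from point to point, so the sampled deviation is large; the result in the abstract concerns high dimension and suitably chosen lattices, which this toy example does not attempt to reproduce.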
-
Computational Long Exposure Mobile Photography
Authors:
Eric Tabellion,
Nikhil Karnad,
Noa Glaser,
Ben Weiss,
David E. Jacobs,
Yael Pritch
Abstract:
Long exposure photography produces stunning imagery, representing moving elements in a scene with motion-blur. It is generally employed in two modalities, producing either a foreground or a background blur effect. Foreground blur images are traditionally captured on a tripod-mounted camera and portray blurred moving foreground elements, such as silky water or light trails, over a perfectly sharp background landscape. Background blur images, also called panning photography, are captured while the camera is tracking a moving subject, to produce an image of a sharp subject over a background blurred by relative motion. Both techniques are notoriously challenging and require additional equipment and advanced skills. In this paper, we describe a computational burst photography system that operates in a hand-held smartphone camera app, and achieves these effects fully automatically, at the tap of the shutter button. Our approach first detects and segments the salient subject. We track the scene motion over multiple frames and align the images in order to preserve desired sharpness and to produce aesthetically pleasing motion streaks. We capture an under-exposed burst and select the subset of input frames that will produce blur trails of controlled length, regardless of scene or camera motion velocity. We predict inter-frame motion and synthesize motion-blur to fill the temporal gaps between the input frames. Finally, we composite the blurred image with the sharp regular exposure to protect the sharpness of faces or areas of the scene that are barely moving, and produce a final high resolution and high dynamic range (HDR) photograph. Our system democratizes a capability previously reserved to professionals, and makes this creative style accessible to most casual photographers.
More information and supplementary material can be found on our project webpage: https://motion-mode.github.io/
Submitted 2 August, 2023;
originally announced August 2023.
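Two of the steps named in the abstract above, synthesizing blur from a burst and compositing the result with a sharp exposure, can be illustrated with a heavily simplified Python sketch. It is not the authors' pipeline: it assumes the frames are already aligned grayscale images stored as lists of rows, it uses a plain per-pixel mean in place of the paper's motion prediction and frame selection, and the names `average_frames` and `composite` and the per-pixel `mask` are hypothetical.

```python
def average_frames(frames):
    """Per-pixel mean of equally sized grayscale frames (lists of rows).
    Averaging aligned frames is a crude stand-in for synthesized motion blur."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(width)]
            for y in range(height)]

def composite(sharp, blurred, mask):
    """Blend per pixel: mask 1.0 keeps the sharp frame, 0.0 keeps the blur."""
    return [[m * s + (1.0 - m) * b for s, b, m in zip(srow, brow, mrow)]
            for srow, brow, mrow in zip(sharp, blurred, mask)]
```

In this sketch the mask plays the role of the protected regions (for example faces or barely moving areas) mentioned in the abstract; how such a mask is computed is outside the scope of the example.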
-
Extending Multi-Text Sentence Fusion Resources via Pyramid Annotations
Authors:
Daniela Brook Weiss,
Paul Roit,
Ori Ernst,
Ido Dagan
Abstract:
NLP models that compare or consolidate information across multiple documents often struggle when challenged with recognizing substantial information redundancies across the texts. For example, in multi-document summarization it is crucial to identify salient information across texts and then generate a non-redundant summary, while facing repeated and usually differently-phrased salient content. To facilitate research on such challenges, the sentence-level task of sentence fusion was proposed, yet previous datasets for this task were very limited in their size and scope. In this paper, we revisit and substantially extend previous dataset creation efforts. With careful modifications, relabeling and employing complementary data sources, we were able to triple the size of a notable earlier dataset. Moreover, we show that our extended version uses more representative texts for multi-document tasks and provides a larger and more diverse training set, which substantially improves model training.
Submitted 9 October, 2021;
originally announced October 2021.
-
QA-Align: Representing Cross-Text Content Overlap by Aligning Question-Answer Propositions
Authors:
Daniela Brook Weiss,
Paul Roit,
Ayal Klein,
Ori Ernst,
Ido Dagan
Abstract:
Multi-text applications, such as multi-document summarization, are typically required to model redundancies across related texts. Current methods confronting consolidation struggle to fuse overlapping information. In order to explicitly represent content overlap, we propose to align predicate-argument relations across texts, providing a potential scaffold for information consolidation. We go beyond clustering coreferring mentions, and instead model overlap with respect to redundancy at a propositional level, rather than merely detecting shared referents. Our setting exploits QA-SRL, utilizing question-answer pairs to capture predicate-argument relations, facilitating laymen annotation of cross-text alignments. We employ crowd-workers for constructing a dataset of QA-based alignments, and present a baseline QA alignment model trained over our dataset. Analyses show that our new task is semantically challenging, capturing content overlap beyond lexical similarity, and that it complements cross-document coreference with proposition-level links, offering potential use for downstream tasks.
Submitted 26 September, 2021;
originally announced September 2021.
-
Acceleration-as-a-μService: A Cloud-native Monte-Carlo Option Pricing Engine on CPUs, GPUs and Disaggregated FPGAs
Authors:
Dionysios Diamantopoulos,
Raphael Polig,
Burkhard Ringlein,
Mitra Purandare,
Beat Weiss,
Christoph Hagleitner,
Mark Lantz,
Francois Abel
Abstract:
The evolution of cloud applications into loosely-coupled microservices opens new opportunities for hardware accelerators to improve workload performance. Existing accelerator techniques for cloud sacrifice the consolidation benefits of microservices. This paper presents CloudiFi, a framework to deploy and compare accelerators as a cloud service. We evaluate our framework in the context of a financial workload and present early results indicating up to 485x gains in microservice response time.
Submitted 11 June, 2021;
originally announced June 2021.
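For readers unfamiliar with the workload named in the title above, a Monte Carlo option pricer can be sketched in a few lines of Python. The abstract does not specify the option type or the price model, so this example assumes a European call under geometric Brownian motion; the function names and parameters are hypothetical, and the closed-form Black-Scholes price is included only as a sanity check on the simulation.

```python
import math
import random

def mc_european_call(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Monte Carlo estimate of a European call price (assumed model: GBM)."""
    rng = random.Random(seed)                       # seeded for repeatability
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        # Terminal price of one simulated path, then its call payoff.
        st = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(st - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * total / n_paths

def black_scholes_call(s0, strike, rate, sigma, maturity):
    """Closed-form Black-Scholes price, used here only as a reference value."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)
```

Each path is independent of the others, which is the property that makes this kind of workload a natural fit for parallel accelerators.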
-
Camera View Adjustment Prediction for Improving Image Composition
Authors:
Yu-Chuan Su,
Raviteja Vemulapalli,
Ben Weiss,
Chun-Te Chu,
Philip Andrew Mansfield,
Lior Shapira,
Colvin Pitts
Abstract:
Image composition plays an important role in the quality of a photo. However, not every camera user possesses the knowledge and expertise required for capturing well-composed photos. While post-capture cropping can improve the composition sometimes, it does not work in many common scenarios in which the photographer needs to adjust the camera view to capture the best shot. To address this issue, we propose a deep learning-based approach that provides suggestions to the photographer on how to adjust the camera view before capturing. By optimizing the composition before a photo is captured, our system helps photographers to capture better photos. As there is no publicly-available dataset for this task, we create a view adjustment dataset by repurposing existing image cropping datasets. Furthermore, we propose a two-stage semi-supervised approach that utilizes both labeled and unlabeled images for training a view adjustment model. Experimental results show that the proposed semi-supervised approach outperforms the corresponding supervised alternatives, and our user study results show that the suggested view adjustment improves image composition 79% of the time.
Submitted 15 April, 2021;
originally announced April 2021.
-
New bounds on the density of lattice coverings
Authors:
Or Ordentlich,
Oded Regev,
Barak Weiss
Abstract:
We obtain new upper bounds on the minimal density of lattice coverings of Euclidean space by dilates of a convex body K. We also obtain bounds on the probability (with respect to the natural Haar-Siegel measure on the space of lattices) that a randomly chosen lattice L satisfies that L+K is all of space. As a step in the proof, we utilize and strengthen results on the discrete Kakeya problem.
Submitted 30 May, 2020;
originally announced June 2020.
-
Multi-episodic Perceived Quality of an Audio-on-Demand Service
Authors:
Dennis Guse,
Oliver Hohlfeld,
Anna Wunderlich,
Benjamin Weiss,
Sebastian Möller
Abstract:
QoE is traditionally evaluated by using short stimuli usually representing parts or single usage episodes. This raises the question of how the overall service perception involving multiple usage episodes can be evaluated, a question of high practical relevance to service operators. Despite initial research on this challenging aspect of multi-episodic perceived quality, the underlying quality formation processes and their factors are still to be discovered. We present a multi-episodic experiment of an Audio on Demand service over a usage period of 6 days with 93 participants. Our work directly extends prior work investigating the impact of time between usage episodes. The results show similar effects; also, the recency effect is not statistically significant. In addition, we extend prediction of multi-episodic judgments by accounting for the observed saturation.
Submitted 1 May, 2020;
originally announced May 2020.
-
A Longitudinal Framework for Predicting Nonresponse in Panel Surveys
Authors:
Christoph Kern,
Bernd Weiss,
Jan-Philipp Kolb
Abstract:
Nonresponse in panel studies can lead to a substantial loss in data quality due to its potential to introduce bias and distort survey estimates. Recent work investigates the usage of machine learning to predict nonresponse in advance, such that predicted nonresponse propensities can be used to inform the data collection process. However, predicting nonresponse in panel studies requires accounting for the longitudinal data structure in terms of model building, tuning, and evaluation. This study proposes a longitudinal framework for predicting nonresponse with machine learning and multiple panel waves and illustrates its application. With respect to model building, this approach utilizes information from multiple waves by introducing features that aggregate previous (non)response patterns. Concerning model tuning and evaluation, temporal cross-validation is employed by iterating through pairs of panel waves such that the training and test sets move in time. Implementing this approach with data from a German probability-based mixed-mode panel shows that aggregating information over multiple panel waves can be used to build prediction models with competitive and robust performance over all test waves.
Submitted 2 November, 2019; v1 submitted 29 September, 2019;
originally announced September 2019.
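The two ideas in the abstract above, features that aggregate previous (non)response patterns and temporal cross-validation over pairs of panel waves, can be sketched in Python as follows. This is only an illustration under stated assumptions: the data layout `responses[w][i]` (1 if respondent `i` answered wave `w`, else 0), the two example features, and the function names are hypothetical and not taken from the paper, and no particular learning algorithm is implied.

```python
def history_features(responses, upto):
    """Aggregate each respondent's (non)response pattern over waves 0..upto-1."""
    n_respondents = len(responses[0])
    feats = []
    for i in range(n_respondents):
        past = [responses[w][i] for w in range(upto)]
        feats.append({
            "response_rate": sum(past) / len(past),  # share of past waves answered
            "responded_last": past[-1],              # behaviour in the latest wave
        })
    return feats

def temporal_splits(responses, first_target=1):
    """Yield (t, train, test) so that training and test sets move in time.

    Train: features from waves before t, labels from wave t.
    Test:  features from waves before t + 1, labels from wave t + 1.
    """
    n_waves = len(responses)
    for t in range(first_target, n_waves - 1):
        train = (history_features(responses, t), responses[t])
        test = (history_features(responses, t + 1), responses[t + 1])
        yield t, train, test
```

A model fitted on each `train` pair would be scored on the matching `test` pair, so every evaluation uses only information that was available before the wave being predicted.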
-
TED-On: A Total Error Framework for Digital Traces of Human Behavior on Online Platforms
Authors:
Indira Sen,
Fabian Floeck,
Katrin Weller,
Bernd Weiss,
Claudia Wagner
Abstract:
People's activities and opinions recorded as digital traces online, especially on social media and other web-based platforms, offer increasingly informative pictures of the public. They promise to allow inferences about populations beyond the users of the platforms on which the traces are recorded, representing real potential for the Social Sciences and a complement to survey-based research. But the use of digital traces brings its own complexities and new error sources to the research enterprise. Recently, researchers have begun to discuss the errors that can occur when digital traces are used to learn about humans and social phenomena. This article synthesizes this discussion and proposes a systematic way to categorize potential errors, inspired by the Total Survey Error (TSE) Framework developed for survey methodology. We introduce a conceptual framework to diagnose, understand, and document errors that may occur in studies based on such digital traces. While there are clear parallels to the well-known error sources in the TSE framework, the new "Total Error Framework for Digital Traces of Human Behavior on Online Platforms" (TED-On) identifies several types of error that are specific to the use of digital traces. By providing a standard vocabulary to describe these errors, the proposed framework is intended to advance communication and research concerning the use of digital traces in scientific social research.
Submitted 3 June, 2021; v1 submitted 18 July, 2019;
originally announced July 2019.
-
Dense forests and Danzer sets
Authors:
Yaar Solomon,
Barak Weiss
Abstract:
A set $Y\subseteq\mathbb{R}^d$ that intersects every convex set of volume $1$ is called a Danzer set. It is not known whether there are Danzer sets in $\mathbb{R}^d$ with growth rate $O(T^d)$. We prove that natural candidates, such as discrete sets that arise from substitutions and from cut-and-project constructions, are not Danzer sets. For cut-and-project sets our proof relies on the dynamics of homogeneous flows. We consider a weakening of the Danzer problem, the existence of uniformly discrete dense forests, and we use homogeneous dynamics (in particular Ratner's theorems on unipotent flows) to construct such sets. We also prove an equivalence between the above problem and a well-known combinatorial problem, and deduce the existence of Danzer sets with growth rate $O(T^d\log T)$, improving the previous bound of $O(T^d\log^{d-1} T)$.
Submitted 10 July, 2014; v1 submitted 15 June, 2014;
originally announced June 2014.
-
Software Design Principles of a DFS Tower A-CWP Prototype
Authors:
Felix Schmitt,
Ralf Heidger,
Stephen Straub,
Benjamin Weiß
Abstract:
SESAR is supposed to boost the development of new operational procedures together with the supporting systems in order to modernize the pan-European air traffic management (ATM). One consequence of this development is that more and more information is presented to, and has to be processed by, air traffic control officers (ATCOs). Thus, there is a strong need for a software design concept that fosters the development of an advanced (tower) controller working position (A-CWP) that comprehensively integrates the still growing amount of information while reducing the data management workload of ATCOs. We report on our first hands-on experiences obtained during the development of an A-CWP prototype that was used in two SESAR validation sessions.
Submitted 24 April, 2013;
originally announced April 2013.
-
Generating Product Systems
Authors:
Nir Avni,
Benjamin Weiss
Abstract:
Generalizing Krieger's finite generation theorem, we give conditions for an ergodic system to be generated by a pair of partitions, each required to be measurable with respect to a given sub-algebra, and also required to have a fixed size.
Submitted 7 July, 2009;
originally announced July 2009.
-
Estimating the Lengths of Memory Words
Authors:
Gusztav Morvai,
Benjamin Weiss
Abstract:
For a stationary stochastic process $\{X_n\}$ with values in some set $A$, a finite word $w \in A^K$ is called a memory word if the conditional probability of $X_0$ given the past is constant on the cylinder set defined by $X_{-K}^{-1}=w$. It is called a minimal memory word if no proper suffix of $w$ is also a memory word. For example, in a $K$-step Markov process all words of length $K$ are memory words but not necessarily minimal. We consider the problem of determining the lengths of the longest minimal memory words and the shortest memory words of an unknown process $\{X_n\}$ based on sequentially observing the outputs of a single sample $\{ξ_1,ξ_2,...ξ_n\}$. We will give a universal estimator which converges almost surely to the length of the longest minimal memory word and show that no such universal estimator exists for the length of the shortest memory word. The alphabet $A$ may be finite or countable.
Submitted 21 August, 2008;
originally announced August 2008.
-
On Sequential Estimation and Prediction for Discrete Time Series
Authors:
G. Morvai,
B. Weiss
Abstract:
The problem of extracting as much information as possible from a sequence of observations of a stationary stochastic process $X_0,X_1,...X_n$ has been considered by many authors from different points of view. It has long been known through the work of D. Bailey that no universal estimator for $\textbf{P}(X_{n+1}|X_0,X_1,...X_n)$ can be found which converges to the true estimator almost surely. Despite this result, for restricted classes of processes, or for sequences of estimators along stopping times, universal estimators can be found. We present here a survey of some of the recent work that has been done along these lines.
Submitted 30 March, 2008;
originally announced March 2008.
-
On estimating the memory for finitarily Markovian processes
Authors:
Gusztav Morvai,
Benjamin Weiss
Abstract:
Finitarily Markovian processes are those processes $\{X_n\}_{n=-\infty}^{\infty}$ for which there is a finite $K$ ($K = K(\{X_n\}_{n=-\infty}^0)$) such that the conditional distribution of $X_1$ given the entire past is equal to the conditional distribution of $X_1$ given only $\{X_n\}_{n=1-K}^0$. The least such value of $K$ is called the memory length. We give a rather complete analysis of the problems of universally estimating the least such value of $K$, both in the backward sense that we have just described and in the forward sense, where one observes successive values of $\{X_n\}$ for $n \geq 0$ and asks for the least value $K$ such that the conditional distribution of $X_{n+1}$ given $\{X_i\}_{i=n-K+1}^n$ is the same as the conditional distribution of $X_{n+1}$ given $\{X_i\}_{i=-\infty}^n$. We allow for finite or countably infinite alphabet size.
Submitted 3 December, 2007;
originally announced December 2007.
-
Forward estimation for ergodic time series
Authors:
Gusztav Morvai,
Benjamin Weiss
Abstract:
The forward estimation problem for stationary and ergodic time series $\{X_n\}_{n=0}^{\infty}$ taking values from a finite alphabet ${\cal X}$ is to estimate the probability that $X_{n+1}=x$ based on the observations $X_i$, $0\le i\le n$ without prior knowledge of the distribution of the process $\{X_n\}$. We present a simple procedure $g_n$ which is evaluated on the data segment $(X_0,...,X_n)$ and for which ${\rm error}(n) = |g_{n}(x)-P(X_{n+1}=x |X_0,...,X_n)|\to 0$ almost surely for a subclass of all stationary and ergodic time series, while for the full class the Cesaro average of the error tends to zero almost surely and, moreover, the error tends to zero in probability.
Submitted 24 November, 2007;
originally announced November 2007.
-
Order estimation of Markov chains
Authors:
G. Morvai,
B. Weiss
Abstract:
We describe estimators $χ_n(X_0,X_1,...,X_n)$, which when applied to an unknown stationary process taking values from a countable alphabet ${\cal X}$, converge almost surely to $k$ in case the process is a $k$-th order Markov chain and to infinity otherwise.
Submitted 3 November, 2007;
originally announced November 2007.
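To give a feel for what estimating the order of a Markov chain from a single sample path involves, here is a naive Python heuristic based on empirical conditional entropies. It is explicitly not the estimator of the paper above and carries none of its guarantees: the cap `k_max`, the tolerance `tol`, and the function names are assumptions made for this sketch, and for a process that is not a finite-order Markov chain the heuristic simply returns some value up to `k_max` rather than diverging to infinity.

```python
import math
from collections import Counter, defaultdict

def conditional_entropy(seq, k):
    """Empirical entropy (in bits) of the next symbol given the previous k symbols."""
    contexts = defaultdict(Counter)
    for i in range(k, len(seq)):
        contexts[tuple(seq[i - k:i])][seq[i]] += 1
    total = len(seq) - k
    h = 0.0
    for counts in contexts.values():
        n_ctx = sum(counts.values())
        for c in counts.values():
            # Weight -log2 p(symbol | context) by the joint frequency c / total.
            h -= (c / total) * math.log2(c / n_ctx)
    return h

def naive_order(seq, k_max=4, tol=0.05):
    """Smallest k whose conditional entropy is within tol of the value at k_max."""
    entropies = [conditional_entropy(seq, k) for k in range(k_max + 1)]
    for k, h in enumerate(entropies):
        if h <= entropies[-1] + tol:
            return k
    return k_max  # not reached when tol >= 0; kept as a defensive default
```

On the periodic sequence 0, 0, 1, 1, 0, 0, 1, 1, ... the next symbol is determined by the previous two symbols but not by the previous one, so the heuristic reports order 2.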
-
Prediction for discrete time series
Authors:
G. Morvai,
B. Weiss
Abstract:
Let $\{X_n\}$ be a stationary and ergodic time series taking values from a finite or countably infinite set ${\cal X}$. Assume that the distribution of the process is otherwise unknown. We propose a sequence of stopping times $λ_n$ along which we will be able to estimate the conditional probability $P(X_{λ_n+1}=x|X_0,...,X_{λ_n})$ from data segment $(X_0,...,X_{λ_n})$ in a pointwise consistent way for a restricted class of stationary and ergodic finite or countably infinite alphabet time series which includes among others all stationary and ergodic finitarily Markovian processes. If the stationary and ergodic process turns out to be finitarily Markovian (among others, all stationary and ergodic Markov chains are included in this class) then $ \lim_{n\to \infty} {n\over λ_n}>0$ almost surely. If the stationary and ergodic process turns out to possess finite entropy rate then $λ_n$ is upper bounded by a polynomial, eventually almost surely.
Submitted 3 November, 2007;
originally announced November 2007.
-
Intermittent estimation of stationary time series
Authors:
G. Morvai,
B. Weiss
Abstract:
Let $\{X_n\}_{n=0}^{\infty}$ be a stationary real-valued time series with unknown distribution. Our goal is to estimate the conditional expectation of $X_{n+1}$ based on the observations $X_i$, $0\le i\le n$ in a strongly consistent way. Bailey and Ryabko proved that this is not possible even for ergodic binary time series if one estimates at all values of $n$. We propose a very simple algorithm which will make prediction infinitely often at carefully selected stopping times chosen by our rule. We show that under certain conditions our procedure is strongly (pointwise) consistent, and $L_2$ consistent without any condition. An upper bound on the growth of the stopping times is also presented in this paper.
Submitted 2 November, 2007;
originally announced November 2007.
-
Forecasting for stationary binary time series
Authors:
Gusztav Morvai,
Benjamin Weiss
Abstract:
The forecasting problem for a stationary and ergodic binary time series $\{X_n\}_{n=0}^{\infty}$ is to estimate the probability that $X_{n+1}=1$ based on the observations $X_i$, $0\le i\le n$ without prior knowledge of the distribution of the process $\{X_n\}$. It is known that this is not possible if one estimates at all values of $n$. We present a simple procedure which will attempt to make such a prediction infinitely often at carefully selected stopping times chosen by the algorithm. We show that the proposed procedure is consistent under certain conditions, and we estimate the growth rate of the stopping times.
Submitted 26 October, 2007;
originally announced October 2007.
-
On classifying processes
Authors:
Gusztav Morvai,
Benjamin Weiss
Abstract:
We prove several results concerning classifications, based on successive observations $(X_1,..., X_n)$ of an unknown stationary and ergodic process, for membership in a given class of processes, such as the class of all finite order Markov chains.
Submitted 19 October, 2007;
originally announced October 2007.
-
Limitations on intermittent forecasting
Authors:
Gusztav Morvai,
Benjamin Weiss
Abstract:
Bailey showed that the general pointwise forecasting problem for stationary and ergodic time series has a negative solution. However, it is known that for Markov chains the problem can be solved. Morvai showed that there is a stopping time sequence $\{λ_n\}$ such that $P(X_{λ_n+1}=1|X_0,...,X_{λ_n})$ can be estimated from samples $(X_0,...,X_{λ_n})$ such that the difference between the conditional probability and the estimate vanishes along these stopping times for all stationary and ergodic binary time series. We will show it is not possible to estimate the above conditional probability along a stopping time sequence for all stationary and ergodic binary time series in a pointwise sense such that if the time series turns out to be a Markov chain, the predictor will predict eventually for all $n$.
Submitted 19 October, 2007;
originally announced October 2007.
-
Inferring the conditional mean
Authors:
Gusztav Morvai,
Benjamin Weiss
Abstract:
Consider a stationary real-valued time series $\{X_n\}_{n=0}^{\infty}$ with a priori unknown distribution. The goal is to estimate the conditional expectation $E(X_{n+1}|X_0,..., X_n)$ based on the observations $(X_0,..., X_n)$ in a pointwise consistent way. It is well known that this is not possible at all values of $n$. We will estimate it along stopping times.
Submitted 19 October, 2007;
originally announced October 2007.
-
Voice over IP in the Local Exchange: A Case Study
Authors:
Martin B. H. Weiss,
Hak-Ju Kim
Abstract:
There have been a small number of cost studies of Voice over IP (VoIP) in the academic literature. Generally, they have been for abstract networks, have not been focused on the public switched telephone network, or they have not included the operating costs. This paper presents the operating cost portion of our ongoing research project comparing circuit-switched and IP network costs for an existing local exchange carrier.
We have found that (1) the operating cost differential between IP and circuit switching for this LEC will be small; and (2) a substantial majority of a telco's operating cost lies in customer service and outside plant maintenance, which will be incurred equally in both networks in a pure substitution scenario. Thus, the operating cost difference lies in the actual cost differences of the switching technologies. This appears to be less than 10%-15% of the total operating cost of the network. Thus, even if the cost differences for substitute services were large, the overall impact on the telco's financial performance would be small. But IP has some hidden benefits on the operations side. Most notably, data and voice services could be managed with the same systems infrastructure, meaning that the incremental operations cost of rolling out new services would likely be much lower, since it would all be IP.
Submitted 24 September, 2001;
originally announced September 2001.