Search | arXiv e-print repository

A Framework for Rate Efficient Control of Distributed Discrete Systems

Authors: Jie Ren, Solmaz Torabi, John MacLaren Walsh

Abstract: A key issue in the control of distributed discrete systems modeled as Markov decision processes is that often the state of the system is not directly observable at any single location in the system. The participants in the control scheme must share information with one another regarding the state of the system in order to collectively make informed control decisions, but this information sharing can be costly. Harnessing recent results from information theory regarding distributed function computation, in this paper we derive, for several information sharing model structures, the minimum amount of control information that must be exchanged to enable local participants to derive the same control decisions as an imaginary omniscient controller having full knowledge of the global state. Incorporating consideration for this amount of information that must be exchanged into the reward enables one to trade off the competing objectives of minimizing this control information exchange and maximizing the performance of the controller. An alternating optimization framework is then provided to help find the efficient controllers and messaging schemes. A series of running examples from wireless resource allocation illustrates the ideas and design tradeoffs.

Submitted 28 April, 2017; originally announced April 2017.

arXiv:1704.01891 [pdf, other]

On Multi-source Networks: Enumeration, Rate Region Computation, and Hierarchy

Authors: Congduan Li, Steven Weber, John MacLaren Walsh

Abstract: Recent algorithmic developments have enabled computers to automatically determine and prove the capacity regions of small hypergraph networks under network coding. A structural theory relating network coding problems of different sizes is developed to make best use of this newfound computational capability. A formal notion of network minimality is developed which removes components of a network coding problem that are inessential to its core complexity. Equivalence between different network coding problems under relabeling is formalized via group actions, and an algorithm that can directly list single representatives from each equivalence class of minimal networks up to a prescribed network size is presented. This algorithm, together with rate region software, is leveraged to create a database containing the rate regions for all minimal network coding problems with five or fewer sources and edges, a collection of 744119 equivalence classes representing more than 9 million networks. In order to best learn from this database, and to leverage it to infer rate regions and their characteristics of networks at scale, a hierarchy between different network coding problems is created with a new theory of combinations and embedding operators.

Submitted 6 April, 2017; originally announced April 2017.

Comments: 20 pages with double column, revision of previous submission arXiv:1507.05728

arXiv:1607.06833 [pdf, other]

Explicit Polyhedral Bounds on Network Coding Rate Regions via Entropy Function Region: Algorithms, Symmetry, and Computation

Authors: Jayant Apte, John MacLaren Walsh

Abstract: Automating the solutions of multiple network information theory problems, stretching from fundamental concerns such as determining all information inequalities and the limitations of linear codes, to applied ones such as designing coded networks, distributed storage systems, and caching systems, can be posed as polyhedral projections. These problems are demonstrated to exhibit multiple types of polyhedral symmetries. It is shown how these symmetries can be exploited to reduce the complexity of solving these problems through polyhedral projection.

Submitted 6 July, 2017; v1 submitted 22 July, 2016; originally announced July 2016.

Comments: 23 pages, 15 figures

arXiv:1605.04598 [pdf, other]

Constrained Linear Representability of Polymatroids and Algorithms for Computing Achievability Proofs in Network Coding

Authors: Jayant Apte, John MacLaren Walsh

Abstract: The constrained linear representability problem (CLRP) for polymatroids determines whether there exists a polymatroid that is linear over a specified field while satisfying a collection of constraints on the rank function. Using a computer to test whether a certain rate vector is achievable with vector linear network codes for a multi-source network coding instance and whether there exists a multi-linear secret sharing scheme achieving a specified information ratio for a given secret sharing instance are shown to be special cases of CLRP. Methods for solving CLRP built from group theoretic techniques for combinatorial generation are developed and described. These techniques form the core of an information theoretic achievability prover; an implementation accompanies the article, and several computational experiments with interesting instances of network coding and secret sharing demonstrating the utility of the method are provided.

Submitted 1 February, 2017; v1 submitted 15 May, 2016; originally announced May 2016.

Comments: submitted to IEEE Transactions on Information Theory, (this version: corrected figure 9)

arXiv:1605.01744 [pdf, other]

Improving Automated Patent Claim Parsing: Dataset, System, and Experiments

Authors: Mengke Hu, David Cinciruk, John MacLaren Walsh

Abstract: Off-the-shelf natural language processing software performs poorly when parsing patent claims owing to their use of irregular language relative to the corpora built from news articles and the web typically utilized to train this software. Stopping short of the extensive and expensive process of accumulating a large enough dataset to completely retrain parsers for patent claims, a method of adapting existing natural language processing software towards patent claims via forced part of speech tag correction is proposed. An Amazon Mechanical Turk collection campaign organized to generate a public corpus to train such an improved claim parsing system is discussed, identifying lessons learned during the campaign that can be of use in future NLP dataset collection campaigns with AMT. Experiments utilizing this corpus and other patent claim sets measure the parsing performance improvement garnered via the claim parsing system. Finally, the utility of the improved claim parsing system within other patent processing applications is demonstrated via experiments showing improved automated patent subject classification when the new claim parsing system is utilized to generate the features.

Submitted 5 May, 2016; originally announced May 2016.

arXiv:1512.03324 [pdf, other]

Mapping the Region of Entropic Vectors with Support Enumeration & Information Geometry

Authors: Yunshu Liu, John MacLaren Walsh

Abstract: The region of entropic vectors is a convex cone that has been shown to be at the core of many fundamental limits for problems in multiterminal data compression, network coding, and multimedia transmission. This cone has been shown to be non-polyhedral for four or more random variables; however, its boundary remains unknown for four or more discrete random variables. Methods for specifying probability distributions that are in faces and on the boundary of the convex cone are derived, then utilized to map optimized inner bounds to the unknown part of the entropy region. The first method utilizes tools and algorithms from abstract algebra to efficiently determine those supports for the joint probability mass functions for four or more random variables that can, for some appropriate set of non-zero probabilities, yield entropic vectors in the gap between the best known inner and outer bounds. These supports are utilized, together with numerical optimization over non-zero probabilities, to provide inner bounds to the unknown part of the entropy region. Next, information geometry is utilized to parameterize and study the structure of probability distributions on these supports yielding entropic vectors in the faces of entropy and in the unknown part of the entropy region.

Submitted 10 December, 2015; originally announced December 2015.

arXiv:1507.05728 [pdf, other]

On Multi-source Networks: Enumeration, Rate Region Computation, and Hierarchy

Authors: Congduan Li, Steven Weber, John MacLaren Walsh

Abstract: This paper investigates the enumeration, rate region computation, and hierarchy of general multi-source multi-sink hyperedge networks under network coding, which includes multiple network models, such as independent distributed storage systems and index coding problems, as special cases. A notion of minimal networks and a notion of network equivalence under group action are defined. An efficient algorithm capable of directly listing single minimal canonical representatives from each network equivalence class is presented and utilized to list all minimal canonical networks with up to 5 sources and hyperedges. Computational tools are then applied to obtain the rate regions of all of these canonical networks, providing exact expressions for 744,119 newly solved network coding rate regions corresponding to more than 2 trillion isomorphic network coding problems. In order to better understand and analyze the huge repository of rate regions through hierarchy, several embedding and combination operations are defined so that the rate region of the network after operation can be derived from the rate regions of networks involved in the operation. The embedding operations enable the definition and determination of a list of forbidden network minors for the sufficiency of classes of linear codes. The combination operations enable the rate regions of some larger networks to be obtained as the combination of the rate regions of smaller networks. The integration of both the combinations and embedding operators is then shown to enable the calculation of rate regions for many networks not reachable via combination operations alone.

Submitted 21 July, 2015; originally announced July 2015.

Comments: 63 pages, submitted to TransIT

arXiv:1505.04202 [pdf, other]

doi 10.1109/TSP.2015.2483479

Interactive Scalar Quantization for Distributed Resource Allocation

Authors: Bradford D. Boyle, Jie Ren, John MacLaren Walsh, Steven Weber

Abstract: In many resource allocation problems, a centralized controller needs to award some resource to a user selected from a collection of distributed users with the goal of maximizing the utility the user would receive from the resource. This can be modeled as the controller computing an extremum of the distributed users' utilities. The overhead rate necessary to enable the controller to reproduce the users' local state can be prohibitively high. An approach to reduce this overhead is interactive communication wherein rate savings are achieved by tolerating an increase in delay. In this paper, we consider the design of a simple achievable scheme based on successive refinements of scalar quantization at each user. The optimal quantization policy is computed via a dynamic program and we demonstrate that tolerating a small increase in delay can yield significant rate savings. We then consider two simpler quantization policies to investigate the scaling properties of the rate-delay trade-offs. Using a combination of these simpler policies, the performance of the optimal policy can be closely approximated with lower computational costs.

Submitted 6 September, 2015; v1 submitted 15 May, 2015; originally announced May 2015.

Comments: 31 pages, 9 figures. Submitted on 2015-05-15 to IEEE Transactions on Signal Processing. Revised 2015-09-06

arXiv:1504.03344 [pdf, other]

doi 10.1088/0004-637X/807/2/171

High-Resolution Spectroscopic Study of Extremely Metal-Poor Star Candidates from the SkyMapper Survey

Authors: Heather R. Jacobson, Stefan Keller, Anna Frebel, Andrew R. Casey, Martin Asplund, Michael S. Bessell, Gary S. Da Costa, Karin Lind, Anna F. Marino, John E. Norris, Jose M. Pena, Brian P. Schmidt, Patrick Tisserand, Jennifer M. Walsh, David Yong, Qinsi Yu

Abstract: The SkyMapper Southern Sky Survey is carrying out a search for the most metal-poor stars in the Galaxy. It identifies candidates by way of its unique filter set that allows for estimation of stellar atmospheric parameters. The set includes a narrow filter centered on the Ca II K 3933A line, enabling a robust estimate of stellar metallicity. Promising candidates are then confirmed with spectroscopy. We present the analysis of Magellan-MIKE high-resolution spectroscopy of 122 metal-poor stars found by SkyMapper in the first two years of commissioning observations. 41 stars have [Fe/H] <= -3.0. Nine have [Fe/H] <= -3.5, with three at [Fe/H] ~ -4. A 1D LTE abundance analysis of the elements Li, C, Na, Mg, Al, Si, Ca, Sc, Ti, Cr, Mn, Co, Ni, Zn, Sr, Ba and Eu shows these stars have [X/Fe] ratios typical of other halo stars. One star with low [X/Fe] values appears to be "Fe-enhanced," while another star has an extremely large [Sr/Ba] ratio: >2. Only one other star is known to have a comparable value. Seven stars are "CEMP-no" stars ([C/Fe] > 0.7, [Ba/Fe] < 0). 21 stars exhibit mild r-process element enhancements (0.3 <= [Eu/Fe] < 1.0), while four stars have [Eu/Fe] >= 1.0. These results demonstrate the ability to identify extremely metal-poor stars from SkyMapper photometry, pointing to increased sample sizes and a better characterization of the metal-poor tail of the halo metallicity distribution function in the future.

Submitted 13 July, 2015; v1 submitted 13 April, 2015; originally announced April 2015.

Comments: Minor corrections to text, missing data added to Tables 3 and 4; updated to match published version. Complete tables included in source

Journal ref: 2015 ApJ, 807, 171

arXiv:1408.3661 [pdf, other]

Overhead Performance Tradeoffs - A Resource Allocation Perspective

Authors: Jie Ren, Bradford D. Boyle, Gwanmo Ku, Steven Weber, John MacLaren Walsh

Abstract: A key aspect of many resource allocation problems is the need for the resource controller to compute a function, such as the max or arg max, of the competing users' metrics. Information must be exchanged between the competing users and the resource controller in order for this function to be computed. In many practical resource controllers the competing users' metrics are communicated to the resource controller, which then computes the desired extremization function. However, in this paper it is shown that information rate savings can be obtained by recognizing that the controller only needs to determine the result of this extremization function. If the extremization function is to be computed losslessly, the rate savings are shown in most cases to be at most 2 bits independent of the number of competing users. Motivated by the small savings in the lossless case, simple achievable schemes for both the lossy and interactive variants of this problem are considered. It is shown that both of these approaches have the potential to realize large rate savings, especially in the case where the number of competing users is large. For the lossy variant, it is shown that the proposed simple achievable schemes are in fact close to the fundamental limit given by the rate distortion function.

Submitted 15 August, 2014; originally announced August 2014.

Comments: 70 pages, 18 figures, Submitted to IEEE Transactions on Information Theory on 2014-08-14

arXiv:1408.3469 [pdf, other]

doi 10.1109/TIT.2016.2640302

Properties of an Aloha-like stability region

Authors: Nan Xie, John MacLaren Walsh, Steven Weber

Abstract: A well-known inner bound on the stability region of the finite-user slotted Aloha protocol is the set of all arrival rates for which there exists some choice of the contention probabilities such that the associated worst-case service rate for each user exceeds the user's arrival rate, denoted $Λ$. Although testing membership in $Λ$ of a given arrival rate can be posed as a convex program, it is nonetheless of interest to understand the properties of this set. In this paper we develop new results of this nature, including $i)$ an equivalence between membership in $Λ$ and the existence of a positive root of a given polynomial, $ii)$ a method to construct a vector of contention probabilities to stabilize any stabilizable arrival rate vector, $iii)$ the volume of $Λ$, $iv)$ explicit polyhedral, spherical, and ellipsoid inner and outer bounds on $Λ$, and $v)$ characterization of the generalized convexity properties of a natural ``excess rate'' function associated with $Λ$, including the convexity of the set of contention probabilities that stabilize a given arrival rate vector.

Submitted 4 January, 2017; v1 submitted 15 August, 2014; originally announced August 2014.

Comments: 28 pages, 9 figures. Submitted August 15, 2014, revised September 21, 2015 and August 31, 2016, and accepted November 06, 2016 for publication in IEEE Transactions on Information Theory. Preliminary results presented at ISIT 2010, ITA 2010, and ITA 2011. DOI: 10.1109/TIT.2016.2640302. Copyright transferred to IEEE. This is the last version uploaded by the authors prior to the IEEE proofing process

arXiv:1407.5659 [pdf, other]

Multilevel Diversity Coding Systems: Rate Regions, Codes, Computation, & Forbidden Minors

Authors: Congduan Li, Steven Weber, John MacLaren Walsh

Abstract: The rate regions of multilevel diversity coding systems (MDCS), a sub-class of the broader family of multi-source multi-sink networks with special structure, are investigated. After showing how to enumerate all non-isomorphic MDCS instances of a given size, the Shannon outer bound and several achievable inner bounds based on linear codes are given for the rate region of each non-isomorphic instance. For thousands of MDCS instances, the bounds match, and hence exact rate regions are proven. Results gained from these computations are summarized in key statistics involving aspects such as the sufficiency of scalar binary codes, the necessary size of vector binary codes, etc. Also, it is shown how to generate computer aided human readable converse proofs, as well as how to construct the codes for an achievability proof. Based on this large repository of rate regions, a series of results about general MDCS cases that they inspired are introduced and proved. In particular, a series of embedding operations that preserve the property of sufficiency of scalar or vector codes are presented. The utility of these operations is demonstrated by boiling the thousands of MDCS instances for which binary scalar codes are insufficient down to 12 forbidden smallest embedded MDCS instances.

Submitted 26 August, 2014; v1 submitted 21 July, 2014; originally announced July 2014.

Comments: Submitted to IEEE Transactions on Information Theory, 52 pages

Showing 1–12 of 12 results for author: Walsh, J M