-
Hybrid-Dynamic Ehrenfeucht-Fraïssé Games
Authors:
Guillermo Badia,
Daniel Gaina,
Alexander Knapp,
Tomasz Kowalski,
Martin Wirsing
Abstract:
Ehrenfeucht-Fraïssé games provide a means to characterize elementary equivalence for first-order logic and, by standard translation, also for modal logics. We propose a novel generalization of Ehrenfeucht-Fraïssé games to hybrid-dynamic logics which is direct and fully modular: it is parameterized by the features of the hybrid language we wish to include, for instance, the modal and hybrid language operators as well as first-order existential quantification. We use these games to establish a new modular Fraïssé-Hintikka Theorem for hybrid-dynamic propositional logic and its various fragments. We study the relationship between countable game equivalence (determined by countable Ehrenfeucht-Fraïssé games) and bisimulation (determined by countable back-and-forth systems). In general, the former turns out to be weaker than the latter, but under certain conditions on the language, the two coincide. We also use games to prove that, for reachable image-finite Kripke structures, elementary equivalence implies isomorphism.
Submitted 4 June, 2024;
originally announced June 2024.
-
Qualitative representations of chromatic algebras
Authors:
Badriah Al Juaid,
Marcel Jackson,
James Koussas,
Tomasz Kowalski
Abstract:
Conventional Ramsey-theoretic investigations for edge-colourings of complete graphs are framed around avoidance of certain configurations. Motivated by considerations arising in the field of Qualitative Reasoning, we explore edge-colourings that, in addition to forbidding certain triangle configurations, also require others to be present. These conditions have natural combinatorial interest in their own right, but also correspond to qualitative representability of certain nonassociative relation algebras, which we will call chromatic.
Submitted 9 January, 2022;
originally announced January 2022.
-
A Hybrid Ensemble Feature Selection Design for Candidate Biomarkers Discovery from Transcriptome Profiles
Authors:
Felipe Colombelli,
Thayne Woycinck Kowalski,
Mariana Recamonde-Mendoza
Abstract:
The discovery of disease biomarkers from gene expression data has been greatly advanced by feature selection (FS) methods, especially using ensemble FS (EFS) strategies with perturbation at the data level (i.e., homogeneous, Hom-EFS) or method level (i.e., heterogeneous, Het-EFS). Here we propose a Hybrid EFS (Hyb-EFS) design that explores both types of perturbation to improve the stability and the predictive power of candidate biomarkers. With this, Hyb-EFS aims to disrupt associations of good performance with a single dataset, a single algorithm, or a specific combination of both, which is particularly interesting for better reproducibility of genomic biomarkers. We investigated the adequacy of our approach for microarray data related to four types of cancer, carrying out an extensive comparison with other ensemble and single FS approaches. Five FS methods were used in our experiments: Wx, Symmetrical Uncertainty (SU), Gain Ratio (GR), Characteristic Direction (GeoDE), and ReliefF. We observed that the Hyb-EFS and Het-EFS approaches attenuated the large performance variation observed for most single FS and Hom-EFS across distinct datasets. Also, the Hyb-EFS improved upon the stability of the Het-EFS within our domain. Comparing the Hyb-EFS and Het-EFS composed of the top-performing selectors (Wx, GR, and SU), our hybrid approach surpassed the equivalent heterogeneous design and the best Hom-EFS (Hom-Wx). Interestingly, the rankings produced by our Hyb-EFS reached greater biological plausibility, with a notably high enrichment for cancer-related genes and pathways. Thus, our experiments suggest the potential of the proposed Hybrid EFS design in discovering candidate biomarkers from microarray data. Finally, we provide an open-source framework to support similar analyses in other domains, both as a user-friendly application and a plain Python package.
Submitted 31 July, 2021;
originally announced August 2021.
-
Faster range minimum queries
Authors:
Tomasz Kowalski,
Szymon Grabowski
Abstract:
Range Minimum Query (RMQ) is an important building block of many compressed data structures and string matching algorithms. Although this problem is essentially solved in theory, with sophisticated data structures allowing for constant-time queries, practical performance and construction time also matter. Additionally, there are offline scenarios in which the number of queries, $q$, is rather small and given beforehand, which encourages the use of a simpler approach. In this work, we present a simple data structure, with very fast construction, which handles queries in constant time on average. This algorithm, however, requires access to the input data during queries (which is not the case for sophisticated RMQ solutions). We subsequently refine our technique, combining it with one of the existing succinct solutions with $O(1)$ worst-case time queries and no access to the input array. The resulting hybrid is still a memory-frugal data structure, usually spending up to about $3n$ bits, and providing competitive query times, especially for wide ranges. We also show how to make our baseline data structure more compact. Experimental results demonstrate that the proposed BbST (Block-based Sparse Table) variants are competitive with existing solutions, also in the offline scenario.
Submitted 28 November, 2017;
originally announced November 2017.
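The abstract above describes a block-based sparse table that still needs the input array at query time. A minimal sketch of that general idea follows, assuming a fixed block size and a plain linear scan of the partial blocks at either end of the range. The class name `BlockSparseTable`, its parameters, and the scanning strategy are illustrative assumptions, not the authors' BbST implementation.

```python
from typing import List


class BlockSparseTable:
    """Hypothetical sketch: a sparse table built over block minima."""

    def __init__(self, data: List[int], block_size: int = 4) -> None:
        self.data = data
        self.b = block_size
        n_blocks = (len(data) + block_size - 1) // block_size
        # Level 0: position of the minimum inside each block.
        level0 = []
        for k in range(n_blocks):
            lo = k * block_size
            hi = min(lo + block_size, len(data))
            level0.append(min(range(lo, hi), key=data.__getitem__))
        # table[j][k]: position of the minimum over blocks k .. k + 2**j - 1.
        self.table = [level0]
        j = 1
        while (1 << j) <= n_blocks:
            prev = self.table[-1]
            half = 1 << (j - 1)
            row = [self._better(prev[k], prev[k + half])
                   for k in range(n_blocks - (1 << j) + 1)]
            self.table.append(row)
            j += 1

    def _better(self, i: int, j: int) -> int:
        # Keep position i unless position j holds a strictly smaller value.
        return j if self.data[j] < self.data[i] else i

    def _scan(self, lo: int, hi: int) -> int:
        # Linear scan of data[lo..hi]; this is where the input array is needed.
        return min(range(lo, hi + 1), key=self.data.__getitem__)

    def query(self, left: int, right: int) -> int:
        """Return a position of the minimum of data[left..right], inclusive."""
        bl, br = left // self.b, right // self.b
        if br - bl <= 1:
            # No whole block lies strictly inside the range: just scan it.
            return self._scan(left, right)
        first, last = bl + 1, br - 1  # whole blocks covered by the range
        j = (last - first + 1).bit_length() - 1
        best = self._better(self.table[j][first],
                            self.table[j][last - (1 << j) + 1])
        head = self._scan(left, first * self.b - 1)  # partial block on the left
        tail = self._scan(br * self.b, right)        # partial block on the right
        return self._better(self._better(head, best), tail)
```

For wide ranges most of the work is the two table lookups over whole blocks, while the scans touch at most two blocks of the input. Narrow ranges fall back to a direct scan.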
-
Faster batched range minimum queries
Authors:
Szymon Grabowski,
Tomasz Kowalski
Abstract:
Range Minimum Query (RMQ) is an important building block of many compressed data structures and string matching algorithms. Although this problem is essentially solved in theory, with sophisticated data structures allowing for constant-time queries, there are scenarios in which the number of queries, $q$, is rather small and given beforehand, which encourages the use of a simpler approach. A recent work by Alzamel et al. starts by contracting the input array to a much shorter one, with its size proportional to $q$. In this work, we build upon their solution, speeding up the handling of small batches of queries by a factor of 3.8--7.8 (the gap grows with $q$). The key idea that helped us achieve this advantage is adapting the well-known Sparse Table technique to work on blocks, with speculative block minima comparisons. We also propose an even faster (but possibly more space-consuming) variant without the array contraction.
Submitted 10 July, 2017; v1 submitted 21 June, 2017;
originally announced June 2017.
-
Suffix arrays with a twist
Authors:
Tomasz Kowalski,
Szymon Grabowski,
Kimmo Fredriksson,
Marcin Raniszewski
Abstract:
The suffix array is a classic full-text index, combining effectiveness with simplicity. We discuss three approaches aiming to improve its efficiency even more: changes to the navigation, changes to the data layout, and adding extra data. In short, we show that $(i)$ how we search for the right interval boundary significantly impacts the overall search speed, $(ii)$ a B-tree data layout easily wins over the standard one, $(iii)$ the well-known idea of a lookup table for the prefixes of the suffixes can be refined using compression, $(iv)$ caching prefixes of the suffixes in a helper array can pose a(nother) practical space-time tradeoff.
Submitted 27 July, 2016;
originally announced July 2016.
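For orientation, the baseline that the abstract above sets out to improve can be sketched as two binary searches over a plain suffix array, one per interval boundary. This is the textbook approach with a naive construction, shown only as an illustration. The function names `build_suffix_array` and `find_interval` are hypothetical and none of the paper's refinements (B-tree layout, lookup tables, prefix caching) are included.

```python
from typing import List, Tuple


def build_suffix_array(text: str) -> List[int]:
    # Naive construction by sorting suffixes; fine for small illustrative inputs.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_interval(text: str, sa: List[int], pattern: str) -> Tuple[int, int]:
    """Return the half-open range [left, right) of suffix array entries
    whose suffixes start with pattern."""
    m = len(pattern)
    # Left boundary: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    left = lo
    # Right boundary: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return left, lo


text = "banana"
sa = build_suffix_array(text)
left, right = find_interval(text, sa, "ana")
occurrences = sorted(sa[left:right])  # starting positions of "ana" in text
```

The second search starts from the left boundary already found, which is one simple instance of the observation that the way the right boundary is located affects overall search cost.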
-
Algebraic foundations for qualitative calculi and networks
Authors:
Robin Hirsch,
Marcel Jackson,
Tomasz Kowalski
Abstract:
A qualitative representation $φ$ is like an ordinary representation of a relation algebra, but instead of requiring $(a ; b)^φ = a^φ \mid b^φ$, as we do for ordinary representations, we only require that $c^φ \supseteq a^φ \mid b^φ \iff c \geq a ; b$, for each $c$ in the algebra. A constraint network is qualitatively satisfiable if its nodes can be mapped to elements of a qualitative representation, preserving the constraints. If a constraint network is satisfiable then it is clearly qualitatively satisfiable, but the converse can fail. However, for a wide range of relation algebras including the point algebra, the Allen Interval Algebra, RCC8 and many others, a network is satisfiable if and only if it is qualitatively satisfiable.
Unlike ordinary composition, the weak composition arising from qualitative representations need not be associative, so we can generalise by considering network satisfaction problems over non-associative algebras. We prove that computationally, qualitative representations have many advantages over ordinary representations: whereas many finite relation algebras have only infinite representations, every finite qualitatively representable algebra has a finite qualitative representation; the representability problem for (the atom structures of) finite non-associative algebras is NP-complete; the network satisfaction problem over a finite qualitatively representable algebra is always in NP; the validity of equations over qualitative representations is co-NP-complete. On the other hand we prove that there is no finite axiomatisation of the class of qualitatively representable algebras.
Submitted 19 June, 2017; v1 submitted 29 June, 2016;
originally announced June 2016.
-
Indexing arbitrary-length $k$-mers in sequencing reads
Authors:
Tomasz Kowalski,
Szymon Grabowski,
Sebastian Deorowicz
Abstract:
We propose a lightweight data structure for indexing and querying collections of NGS read data in main memory. The data structure supports the interface proposed in the pioneering work by Philippe et al. for counting and locating $k$-mers in sequencing reads. Our solution, PgSA (pseudogenome suffix array), based on finding overlapping reads, is competitive with the existing algorithms in space use, query times, or both. The main applications of our index include variant calling, error correction and analysis of reads from RNA-seq experiments.
Submitted 13 February, 2015; v1 submitted 6 February, 2015;
originally announced February 2015.
-
Complexity and polymorphisms for digraph constraint problems under some basic constructions
Authors:
Marcel Jackson,
Tomasz Kowalski,
Todd Niven
Abstract:
The role of polymorphisms in determining the complexity of constraint satisfaction problems is well established.
In this context we study the stability of CSP complexity and polymorphism properties under some basic graph theoretic constructions. As applications we observe a collapse in the applicability of algorithms for CSPs over directed graphs with both a total source and a total sink: the corresponding CSP is solvable by the "few subpowers algorithm" if and only if it is solvable by a local consistency check algorithm. Moreover, we find that the property of "strict width" and solvability by few subpowers are unstable under first order reductions. The analysis also yields a complete characterisation of the main polymorphism properties for digraphs whose symmetric closure is a complete graph.
Submitted 21 July, 2016; v1 submitted 17 April, 2013;
originally announced April 2013.