Search | arXiv e-print repository

arXiv:2406.02094 [pdf, other]

Hybrid-Dynamic Ehrenfeucht-Fraïssé Games

Authors: Guillermo Badia, Daniel Gaina, Alexander Knapp, Tomasz Kowalski, Martin Wirsing

Abstract: Ehrenfeucht-Fraïssé games provide means to characterize elementary equivalence for first-order logic, and by standard translation also for modal logics. We propose a novel generalization of Ehrenfeucht-Fraïssé games to hybrid-dynamic logics which is direct and fully modular: parameterized by the features of the hybrid language we wish to include, for instance, the modal and hybrid language operators as well as first-order existential quantification. We use these games to establish a new modular Fraïssé-Hintikka Theorem for hybrid-dynamic propositional logic and its various fragments. We study the relationship between countable game equivalence (determined by countable Ehrenfeucht-Fraïssé games) and bisimulation (determined by countable back-and-forth systems). In general, the former turns out to be weaker than the latter, but under certain conditions on the language, the two coincide. We also use games to prove that for reachable image-finite Kripke structures elementary equivalence implies isomorphism.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2201.03098 [pdf, ps, other]

Qualitative representations of chromatic algebras

Authors: Badriah Al Juaid, Marcel Jackson, James Koussas, Tomasz Kowalski

Abstract: Conventional Ramsey-theoretic investigations for edge-colourings of complete graphs are framed around avoidance of certain configurations. Motivated by considerations arising in the field of Qualitative Reasoning, we explore edge colourings that in addition to forbidding certain triangle configurations also require others to be present. These conditions have natural combinatorial interest in their own right, but also correspond to qualitative representability of certain nonassociative relation algebras, which we will call chromatic.

Submitted 9 January, 2022; originally announced January 2022.

MSC Class: 05C70; 05C55; 51E15; 03G15; 68T27 ACM Class: G.2.1

arXiv:2108.00290 [pdf, other]

A Hybrid Ensemble Feature Selection Design for Candidate Biomarkers Discovery from Transcriptome Profiles

Authors: Felipe Colombelli, Thayne Woycinck Kowalski, Mariana Recamonde-Mendoza

Abstract: The discovery of disease biomarkers from gene expression data has been greatly advanced by feature selection (FS) methods, especially using ensemble FS (EFS) strategies with perturbation at the data level (i.e., homogeneous, Hom-EFS) or method level (i.e., heterogeneous, Het-EFS). Here we propose a Hybrid EFS (Hyb-EFS) design that explores both types of perturbation to improve the stability and the predictive power of candidate biomarkers. With this, Hyb-EFS aims to disrupt associations of good performance with a single dataset, single algorithm, or a specific combination of both, which is particularly interesting for better reproducibility of genomic biomarkers. We investigated the adequacy of our approach for microarray data related to four types of cancer, carrying out an extensive comparison with other ensemble and single FS approaches. Five FS methods were used in our experiments: Wx, Symmetrical Uncertainty (SU), Gain Ratio (GR), Characteristic Direction (GeoDE), and ReliefF. We observed that the Hyb-EFS and Het-EFS approaches attenuated the large performance variation observed for most single FS and Hom-EFS across distinct datasets. Also, the Hyb-EFS improved upon the stability of the Het-EFS within our domain. Comparing the Hyb-EFS and Het-EFS composed of the top-performing selectors (Wx, GR, and SU), our hybrid approach surpassed the equivalent heterogeneous design and the best Hom-EFS (Hom-Wx). Interestingly, the rankings produced by our Hyb-EFS reached greater biological plausibility, with a notably high enrichment for cancer-related genes and pathways. Thus, our experiments suggest the potential of the proposed Hybrid EFS design in discovering candidate biomarkers from microarray data. Finally, we provide an open-source framework to support similar analyses in other domains, both as a user-friendly application and a plain Python package.

Submitted 31 July, 2021; originally announced August 2021.

Comments: 22 pages, 11 figures, 1 table

ACM Class: I.5.2; I.2.1

arXiv:1711.10385 [pdf, other]

Faster range minimum queries

Authors: Tomasz Kowalski, Szymon Grabowski

Abstract: Range Minimum Query (RMQ) is an important building block of many compressed data structures and string matching algorithms. Although this problem is essentially solved in theory, with sophisticated data structures allowing for constant time queries, practical performance and construction time also matter. Additionally, there are offline scenarios in which the number of queries, $q$, is rather small and given beforehand, which encourages the use of a simpler approach. In this work, we present a simple data structure, with very fast construction, which handles queries in constant time on average. This algorithm, however, requires access to the input data during queries (which is not the case for sophisticated RMQ solutions). We subsequently refine our technique, combining it with one of the existing succinct solutions with $O(1)$ worst-case time queries and no access to the input array. The resulting hybrid is still a memory-frugal data structure, usually using up to about $3n$ bits, and providing competitive query times, especially for wide ranges. We also show how to make our baseline data structure more compact. Experimental results demonstrate that the proposed BbST (Block-based Sparse Table) variants are competitive with existing solutions, also in the offline scenario.

Submitted 28 November, 2017; originally announced November 2017.

Comments: A (very preliminary) version of the manuscript was presented in Prague Stringology Conference 2017

MSC Class: 68W32 ACM Class: F.2.2
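
The abstract above describes a sparse table built over blocks that still needs the input array at query time. The following is only a minimal sketch of that general idea, not the authors' BbST implementation: the class name `BlockSparseTable`, the fixed `block_size` parameter, and the choice to scan the partial blocks at both ends of a range are all assumptions made for illustration.

```python
class BlockSparseTable:
    """Sparse table over per-block minima; needs the input array at query time."""

    def __init__(self, data, block_size=4):
        if not data:
            raise ValueError("data must be non-empty")
        self.data = data
        self.b = block_size
        # Position of the (leftmost) minimum inside each block.
        block_min = [
            min(range(s, min(s + block_size, len(data))), key=data.__getitem__)
            for s in range(0, len(data), block_size)
        ]
        # table[k][i] = position of the minimum over blocks i .. i + 2^k - 1.
        self.table = [block_min]
        k = 1
        while (1 << k) <= len(block_min):
            prev = self.table[-1]
            half = 1 << (k - 1)
            self.table.append([
                self._better(prev[i], prev[i + half])
                for i in range(len(block_min) - (1 << k) + 1)
            ])
            k += 1

    def _better(self, i, j):
        # Keep the left candidate i on ties, so the leftmost minimum wins.
        return j if self.data[j] < self.data[i] else i

    def _scan(self, lo, hi):
        # Leftmost minimum position in data[lo..hi] (inclusive) by direct scan.
        return min(range(lo, hi + 1), key=self.data.__getitem__)

    def query(self, lo, hi):
        """Return the position of the leftmost minimum in data[lo..hi]."""
        first = -(-lo // self.b)       # first block fully inside the range
        last = (hi + 1) // self.b - 1  # last block fully inside the range
        if first > last:
            return self._scan(lo, hi)
        k = (last - first + 1).bit_length() - 1
        best = self._better(self.table[k][first],
                            self.table[k][last - (1 << k) + 1])
        if lo < first * self.b:        # partial block on the left
            best = self._better(self._scan(lo, first * self.b - 1), best)
        if hi >= (last + 1) * self.b:  # partial block on the right
            best = self._better(best, self._scan((last + 1) * self.b, hi))
        return best
```

In this sketch a query costs two table lookups plus a scan of fewer than `2 * block_size` input elements, so `block_size` trades table size against scanning work; the speculative comparisons and succinct hybrid mentioned in the abstracts are not modelled here.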

arXiv:1706.06940 [pdf, ps, other]

Faster batched range minimum queries

Authors: Szymon Grabowski, Tomasz Kowalski

Abstract: Range Minimum Query (RMQ) is an important building block of many compressed data structures and string matching algorithms. Although this problem is essentially solved in theory, with sophisticated data structures allowing for constant time queries, there are scenarios in which the number of queries, $q$, is rather small and given beforehand, which encourages the use of a simpler approach. A recent work by Alzamel et al. starts with contracting the input array to a much shorter one, with its size proportional to $q$. In this work, we build upon their solution, speeding up handling small batches of queries by a factor of 3.8--7.8 (the gap grows with $q$). The key idea that helped us achieve this advantage is adapting the well-known Sparse Table technique to work on blocks, with speculative block minima comparisons. We also propose an even faster (but possibly more space-consuming) variant without the array contraction.

Submitted 10 July, 2017; v1 submitted 21 June, 2017; originally announced June 2017.

Comments: Accepted to Prague Stringology Conference 2017. Compared to v1, bugs in Table 2 were fixed

MSC Class: 68W32 ACM Class: F.2.2

arXiv:1607.08176 [pdf, ps, other]

Suffix arrays with a twist

Authors: Tomasz Kowalski, Szymon Grabowski, Kimmo Fredriksson, Marcin Raniszewski

Abstract: The suffix array is a classic full-text index, combining effectiveness with simplicity. We discuss three approaches aiming to improve its efficiency even more: changes to the navigation, data layout and adding extra data. In short, we show that $(i)$ how we search for the right interval boundary significantly impacts the overall search speed, $(ii)$ a B-tree data layout easily wins over the standard one, $(iii)$ the well-known idea of a lookup table for the prefixes of the suffixes can be refined using compression, $(iv)$ caching prefixes of the suffixes in a helper array can pose a(nother) practical space-time tradeoff.

Submitted 27 July, 2016; originally announced July 2016.

MSC Class: 68W32 ACM Class: F.2.2
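
For readers unfamiliar with the interval boundaries mentioned in the abstract above, here is a minimal sketch of plain suffix array pattern search using two binary searches. It is a textbook baseline under stated assumptions, not any of the paper's variants: the function names `build_suffix_array` and `sa_interval` are hypothetical, and the construction is deliberately naive.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def sa_interval(text, sa, pattern):
    """Return (left, right) so that sa[left:right] lists every suffix
    of `text` that starts with `pattern`."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    # Left boundary: first suffix whose length-m prefix is >= pattern.
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    left = lo
    # Right boundary: first suffix whose length-m prefix is > pattern.
    # The search restarts from `left`, since the right boundary cannot precede it.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return left, lo


text = "banana"
sa = build_suffix_array(text)
left, right = sa_interval(text, sa, "ana")
occurrences = sorted(sa[left:right])  # starting positions of "ana" in text
```

The second loop is the step the abstract singles out: here it is a full binary search over the remaining range, which is only one of several ways the right boundary could be located.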

arXiv:1606.09140 [pdf, ps, other]

doi 10.1016/j.tcs.2019.02.033

Algebraic foundations for qualitative calculi and networks

Authors: Robin Hirsch, Marcel Jackson, Tomasz Kowalski

Abstract: A qualitative representation $φ$ is like an ordinary representation of a relation algebra, but instead of requiring $(a ; b)^φ = a^φ | b^φ$, as we do for ordinary representations, we only require that $c^φ \supseteq a^φ | b^φ \iff c \geq a ; b$, for each $c$ in the algebra. A constraint network is qualitatively satisfiable if its nodes can be mapped to elements of a qualitative representation, preserving the constraints. If a constraint network is satisfiable then it is clearly qualitatively satisfiable, but the converse can fail. However, for a wide range of relation algebras including the point algebra, the Allen Interval Algebra, RCC8 and many others, a network is satisfiable if and only if it is qualitatively satisfiable. Unlike ordinary composition, the weak composition arising from qualitative representations need not be associative, so we can generalise by considering network satisfaction problems over non-associative algebras. We prove that computationally, qualitative representations have many advantages over ordinary representations: whereas many finite relation algebras have only infinite representations, every finite qualitatively representable algebra has a finite qualitative representation; the representability problem for (the atom structures of) finite non-associative algebras is NP-complete; the network satisfaction problem over a finite qualitatively representable algebra is always in NP; the validity of equations over qualitative representations is co-NP-complete. On the other hand we prove that there is no finite axiomatisation of the class of qualitatively representable algebras.

Submitted 19 June, 2017; v1 submitted 29 June, 2016; originally announced June 2016.

Comments: 22 pages

MSC Class: 68T30

Journal ref: Theoretical Computer Science 768 (2019) 99-116

arXiv:1502.01861 [pdf, ps, other]

doi 10.1371/journal.pone.0133198

Indexing arbitrary-length $k$-mers in sequencing reads

Authors: Tomasz Kowalski, Szymon Grabowski, Sebastian Deorowicz

Abstract: We propose a lightweight data structure for indexing and querying collections of NGS reads data in main memory. The data structure supports the interface proposed in the pioneering work by Philippe et al. for counting and locating $k$-mers in sequencing reads. Our solution, PgSA (pseudogenome suffix array), based on finding overlapping reads, is competitive with the existing algorithms in space use, query times, or both. The main applications of our index include variant calling, error correction and analysis of reads from RNA-seq experiments.

Submitted 13 February, 2015; v1 submitted 6 February, 2015; originally announced February 2015.

Journal ref: PLOS One, Article no. 0133198 (2015)

arXiv:1304.4986 [pdf, ps, other]

Complexity and polymorphisms for digraph constraint problems under some basic constructions

Authors: Marcel Jackson, Tomasz Kowalski, Todd Niven

Abstract: The role of polymorphisms in determining the complexity of constraint satisfaction problems is well established. In this context we study the stability of CSP complexity and polymorphism properties under some basic graph theoretic constructions. As applications we observe a collapse in the applicability of algorithms for CSPs over directed graphs with both a total source and a total sink: the corresponding CSP is solvable by the "few subpowers algorithm" if and only if it is solvable by a local consistency check algorithm. Moreover, we find that the property of "strict width" and solvability by few subpowers are unstable under first order reductions. The analysis also yields a complete characterisation of the main polymorphism properties for digraphs whose symmetric closure is a complete graph.

Submitted 21 July, 2016; v1 submitted 17 April, 2013; originally announced April 2013.

Comments: 31 pages

MSC Class: 05C20; 05C15; 08B05 ACM Class: F.2.0; G.2.2

Showing 1–9 of 9 results for author: Kowalski, T