-
arXiv:2403.12588
Machine Learning of the Prime Distribution
Abstract: In the present work we use maximum entropy methods to derive several theorems in probabilistic number theory, including a version of the Hardy-Ramanujan Theorem. We also provide a theoretical argument explaining the experimental observations of Yang-Hui He about the learnability of primes, and posit that the Erdős-Kac law would very unlikely be discovered by current machine learning techniques. Nu…
Submitted 2 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.
Comments: 10 pages; parts of arXiv:2308.10817 reworked and amended; author's draft; accepted in PLOS ONE
MSC Class: 11N05
-
A ripple in time: a discontinuity in American history
Abstract: In this note we use the State of the Union Address (SOTU) dataset from Kaggle to make some surprising (and some not so surprising) observations pertaining to the general timeline of American history, and the character and nature of the addresses themselves. Our main approach is using vector embeddings, such as BERT (DistilBERT) and GPT-2. While it is widely believed that BERT (and its variations…
Submitted 4 May, 2024; v1 submitted 2 December, 2023; originally announced December 2023.
Comments: 7 pages, 8 figures; GitHub repository (https://github.com/sashakolpakov/ripple_in_time); Section 3: added comparison to (https://doi.org/10.1016/j.ins.2019.01.040); comments on a misleading accuracy claim in (https://doi.org/10.1002/asi.23283)
ACM Class: I.2.7; I.5.4; H.3.1; H.3.3
-
The Information Geometry of UMAP
Abstract: Although UMAP was derived from Category Theory observations, its underlying mechanisms may be clarified using Information Geometry.
Submitted 25 June, 2024; v1 submitted 3 September, 2023; originally announced September 2023.
Comments: 11 pages, 2 figures, 3 tables; GitHub repository (https://github.com/sashakolpakov/info-geometry-umap)
MSC Class: 53B12; 94A15
-
arXiv:2308.10817
On the impossibility of discovering a formula for primes using AI
Abstract: The present work explores the theoretical limits of Machine Learning (ML) within the framework of Kolmogorov's theory of Algorithmic Probability, which clarifies the notion of entropy as Expected Kolmogorov Complexity and formalizes other fundamental concepts such as Occam's razor via Levin's Universal Distribution. As a fundamental application, we develop Maximum Entropy methods that allow us to…
Submitted 2 June, 2024; v1 submitted 27 July, 2023; originally announced August 2023.
Comments: 29 pages; parts of this manuscript are accepted as a separate paper in PLOS ONE
MSC Class: 11N05
-
Robust affine point matching via quadratic assignment on Grassmannians
Abstract: Robust Affine matching with Grassmannians (RAG) is a new algorithm to perform affine registration of point clouds. The algorithm is based on minimizing the Frobenius distance between two elements of the Grassmannian. For this purpose, an indefinite relaxation of the Quadratic Assignment Problem (QAP) is used, and several approaches to affine feature matching are studied and compared. Experiments d…
Submitted 4 May, 2024; v1 submitted 5 March, 2023; originally announced March 2023.
Comments: 8 pages, 23 figures; GitHub repository at (https://github.com/sashakolpakov/rag); Section IV: added comparison to GrassGraph (https://doi.org/10.1109/TIP.2019.2959722); notably, GrassGraph quickly loses accuracy on our test examples with noise and occlusion
-
An approach to robust ICP initialization
Abstract: In this note, we propose an approach to initialize the Iterative Closest Point (ICP) algorithm to match unlabelled point clouds related by rigid transformations. The method is based on matching the ellipsoids defined by the points' covariance matrices and then testing the various principal half-axes matchings that differ by elements of a finite reflection group. We derive bounds on the robustness…
Submitted 25 June, 2023; v1 submitted 10 December, 2022; originally announced December 2022.
Comments: 9 pages, 18 figures, 1 table; GitHub repository at (https://github.com/sashakolpakov/icp-init)