Showing 1–4 of 4 results for author: Bowern, C

  1. arXiv:2107.02858  [pdf, other]

    cs.CL

    Topic Modeling in the Voynich Manuscript

    Authors: Rachel Sterneck, Annie Polish, Claire Bowern

    Abstract: This article presents the results of investigations using topic modeling of the Voynich Manuscript (Beinecke MS408). Topic modeling is a set of computational methods which are used to identify clusters of subjects within text. We use latent Dirichlet allocation, latent semantic analysis, and nonnegative matrix factorization to cluster Voynich pages into 'topics'. We then compare the topics derived…

    Submitted 6 July, 2021; originally announced July 2021.

    Comments: See https://lingbuzz.net/lingbuzz/006068 for a version that has the Voynich font (and better figure placement), since arxiv does not allow xelatex compilation

  2. arXiv:2010.14697  [pdf, other]

    cs.CL

    Character Entropy in Modern and Historical Texts: Comparison Metrics for an Undeciphered Manuscript

    Authors: Luke Lindemann, Claire Bowern

    Abstract: This paper outlines the creation of three corpora for multilingual comparison and analysis of the Voynich manuscript: a corpus of Voynich texts partitioned by Currier language, scribal hand, and transcription system, a corpus of 294 language samples compiled from Wikipedia, and a corpus of eighteen transcribed historical texts in eight languages. These corpora will be utilized in subsequent work b…

    Submitted 18 May, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: Updated following updates to corpus files (see paper for details)

  3. Phylogenetic signal in phonotactics

    Authors: Jayden L. Macklin-Cordes, Claire Bowern, Erich R. Round

    Abstract: Phylogenetic methods have broad potential in linguistics beyond tree inference. Here, we show how a phylogenetic approach opens the possibility of gaining historical insights from entirely new kinds of linguistic data--in this instance, statistical phonotactics. We extract phonotactic data from 111 Pama-Nyungan vocabularies and apply tests for phylogenetic signal, quantifying the degree to which t…

    Submitted 14 July, 2020; v1 submitted 2 February, 2020; originally announced February 2020.

    Comments: Main text: 32 pages, 17 figures, 1 table. Supplementary Information: 17 pages, 1 figure. Code and data available at http://doi.org/10.5281/zenodo.3936353. This article is in review but not yet accepted for publication in a journal

    ACM Class: J.5

  4. arXiv:1906.05760  [pdf, other]

    cs.CL q-bio.PE

    Semantic Change and Semantic Stability: Variation is Key

    Authors: Claire Bowern

    Abstract: I survey some recent approaches to studying change in the lexicon, particularly change in meaning across phylogenies. I briefly sketch an evolutionary approach to language change and point out some issues in recent approaches to studying semantic change that rely on temporally stratified word embeddings. I draw illustrations from lexical cognate models in Pama-Nyungan to identify meaning classes m…

    Submitted 13 June, 2019; originally announced June 2019.

    Comments: 8 pages