Search | arXiv e-print repository

arXiv:2006.13469 [pdf, other]

Face-to-Music Translation Using a Distance-Preserving Generative Adversarial Network with an Auxiliary Discriminator

Authors: Chelhwon Kim, Andrew Port, Mitesh Patel

Abstract: Learning a map** between two unrelated domains-such as image and audio, without any supervision is a challenging task. In this work, we propose a distance-preserving generative adversarial model to translate images of human faces into an audio domain. The audio domain is defined by a collection of musical note sounds recorded by 10 different instrument families (NSynth \cite{nsynth2017}) and a d… ▽ More Learning a map** between two unrelated domains-such as image and audio, without any supervision is a challenging task. In this work, we propose a distance-preserving generative adversarial model to translate images of human faces into an audio domain. The audio domain is defined by a collection of musical note sounds recorded by 10 different instrument families (NSynth \cite{nsynth2017}) and a distance metric where the instrument family class information is incorporated together with a mel-frequency cepstral coefficients (MFCCs) feature. To enforce distance-preservation, a loss term that penalizes difference between pairwise distances of the faces and the translated audio samples is used. Further, we discover that the distance preservation constraint in the generative adversarial model leads to reduced diversity in the translated audio samples, and propose the use of an auxiliary discriminator to enhance the diversity of the translations while using the distance preservation constraint. We also provide a visual demonstration of the results and numerical analysis of the fidelity of the translations. A video demo of our proposed model's learned translation is available in https://www.dropbox.com/s/the176w9obq8465/face_to_musical_note.mov?dl=0. △ Less

Submitted 24 June, 2020; originally announced June 2020.

Comments: 15 pages, 3 figures

arXiv:2005.13291 [pdf, other]

Deep Sensory Substitution: Noninvasively Enabling Biological Neural Networks to Receive Input from Artificial Neural Networks

Authors: Andrew Port, Chelhwon Kim, Mitesh Patel

Abstract: As is expressed in the adage "a picture is worth a thousand words", when using spoken language to communicate visual information, brevity can be a challenge. This work describes a novel technique for leveraging machine-learned feature embeddings to sonify visual (and other types of) information into a perceptual audio domain, allowing users to perceive this information using only their aural facul… ▽ More As is expressed in the adage "a picture is worth a thousand words", when using spoken language to communicate visual information, brevity can be a challenge. This work describes a novel technique for leveraging machine-learned feature embeddings to sonify visual (and other types of) information into a perceptual audio domain, allowing users to perceive this information using only their aural faculty. The system uses a pretrained image embedding network to extract visual features and embed them in a compact subset of Euclidean space -- this converts the images into feature vectors whose $L^2$ distances can be used as a meaningful measure of similarity. A generative adversarial network (GAN) is then used to find a distance preserving map from this metric space of feature vectors into the metric space defined by a target audio dataset equipped with either the Euclidean metric or a mel-frequency cepstrum-based psychoacoustic distance metric. We demonstrate this technique by sonifying images of faces into human speech-like audio. For both target audio metrics, the GAN successfully found a metric preserving map**, and in human subject tests, users were able to accurately classify audio sonifications of faces. △ Less

Submitted 25 August, 2021; v1 submitted 27 May, 2020; originally announced May 2020.

Comments: 9 pages, 3 figures

arXiv:1903.05181 [pdf, other]

Topological Analysis of Syntactic Structures

Authors: Alexander Port, Taelin Karidi, Matilde Marcolli

Abstract: We use the persistent homology method of topological data analysis and dimensional analysis techniques to study data of syntactic structures of world languages. We analyze relations between syntactic parameters in terms of dimensionality, of hierarchical clustering structures, and of non-trivial loops. We show there are relations that hold across language families and additional relations that are… ▽ More We use the persistent homology method of topological data analysis and dimensional analysis techniques to study data of syntactic structures of world languages. We analyze relations between syntactic parameters in terms of dimensionality, of hierarchical clustering structures, and of non-trivial loops. We show there are relations that hold across language families and additional relations that are family-specific. We then analyze the trees describing the merging structure of persistent connected components for languages in different language families and we show that they partly correlate to historical phylogenetic trees but with significant differences. We also show the existence of interesting non-trivial persistent first homology groups in various language families. We give examples where explicit generators for the persistent first homology can be identified, some of which appear to correspond to homoplasy phenomena, while others may have an explanation in terms of historical linguistics, corresponding to known cases of syntactic borrowing across different language subfamilies. △ Less

Submitted 12 March, 2019; originally announced March 2019.

Comments: 83 pages, LaTeX, 44 figures

MSC Class: 91F20; 55U10; 55N35; 62-07

arXiv:1507.05134 [pdf, other]

Persistent Topology of Syntax

Authors: Alexander Port, Iulia Gheorghita, Daniel Guth, John M. Clark, Crystal Liang, Shival Dasu, Matilde Marcolli

Abstract: We study the persistent homology of the data set of syntactic parameters of the world languages. We show that, while homology generators behave erratically over the whole data set, non-trivial persistent homology appears when one restricts to specific language families. Different families exhibit different persistent homology. We focus on the cases of the Indo-European and the Niger-Congo families… ▽ More We study the persistent homology of the data set of syntactic parameters of the world languages. We show that, while homology generators behave erratically over the whole data set, non-trivial persistent homology appears when one restricts to specific language families. Different families exhibit different persistent homology. We focus on the cases of the Indo-European and the Niger-Congo families, for which we compare persistent homology over different cluster filtering values. We investigate the possible significance, in historical linguistic terms, of the presence of persistent generators of the first homology. In particular, we show that the persistent first homology generator we find in the Indo-European family is not due (as one might guess) to the Anglo-Norman bridge in the Indo-European phylogenetic network, but is related to the position of Ancient Greek and the Hellenic branch within the network. △ Less

Submitted 17 July, 2015; originally announced July 2015.

Comments: 15 pages, 25 jpg figures

MSC Class: 91F20

arXiv:1502.07796 [pdf, other]

Graph Grammars, Insertion Lie Algebras, and Quantum Field Theory

Authors: Matilde Marcolli, Alexander Port

Abstract: Graph grammars extend the theory of formal languages in order to model distributed parallelism in theoretical computer science. We show here that to certain classes of context-free and context-sensitive graph grammars one can associate a Lie algebra, whose structure is reminiscent of the insertion Lie algebras of quantum field theory. We also show that the Feynman graphs of quantum field theories… ▽ More Graph grammars extend the theory of formal languages in order to model distributed parallelism in theoretical computer science. We show here that to certain classes of context-free and context-sensitive graph grammars one can associate a Lie algebra, whose structure is reminiscent of the insertion Lie algebras of quantum field theory. We also show that the Feynman graphs of quantum field theories are graph languages generated by a theory dependent graph grammar. △ Less

Submitted 26 February, 2015; originally announced February 2015.

Comments: 19 pages, LaTeX, 3 jpeg figures

MSC Class: 68Q42; 81T18

Showing 1–5 of 5 results for author: Port, A