-
Binary classification of spoken words with passive phononic metamaterials
Authors:
Tena Dubček,
Daniel Moreno-Garcia,
Thomas Haag,
Parisa Omidvar,
Henrik R. Thomsen,
Theodor S. Becker,
Lars Gebraad,
Christoph Bärlocher,
Fredrik Andersson,
Sebastian D. Huber,
Dirk-Jan van Manen,
Luis Guillermo Villanueva,
Johan O. A. Robertsson,
Marc Serra-Garcia
Abstract:
Mitigating the energy requirements of artificial intelligence requires novel physical substrates for computation. Phononic metamaterials have a vanishingly low power dissipation and hence are a prime candidate for green, always-on computers. However, their use in machine learning applications has not been explored due to the complexity of their design process: Current phononic metamaterials are restricted to simple geometries (e.g. periodic, tapered), and hence do not possess sufficient expressivity to encode machine learning tasks. We design and fabricate a non-periodic phononic metamaterial, directly from data samples, that can distinguish between pairs of spoken words in the presence of a simple readout nonlinearity; hence demonstrating that phononic metamaterials are a viable avenue towards zero-power smart devices.
Submitted 7 July, 2023; v1 submitted 14 November, 2021;
originally announced November 2021.
-
Generative Adversarial Networks for photo to Hayao Miyazaki style cartoons
Authors:
Filip Andersson,
Simon Arvidsson
Abstract:
This paper takes on the problem of transferring the style of cartoon images to real-life photographic images by implementing previous work done by CartoonGAN. We trained a Generative Adversarial Network (GAN) on over 60 000 images from works by Hayao Miyazaki at Studio Ghibli. To evaluate our results, we conducted a qualitative survey comparing our results with two state-of-the-art methods. The 117 survey responses indicated that our model, on average, outranked the state-of-the-art methods on cartoon-likeness.
Submitted 15 May, 2020;
originally announced May 2020.
-
An Automatic System for Acoustic Microphone Geometry Calibration based on Minimal Solvers
Authors:
Simayijiang Zhayida,
Simon Segerblom Rex,
Yubin Kuang,
Fredrik Andersson,
Kalle Åström
Abstract:
In this paper, robust detection, tracking and geometry estimation methods are developed and combined into a system for estimating time differences, microphone positions and sound source movement. No assumptions on the 3D locations of the microphones and sound sources are made. The system is capable of tracking continuously moving sound sources in a reverberant environment. The multi-path components are explicitly tracked and used in the geometry estimation parts. The system is based on matching between pairs of channels using GCC-PHAT. Instead of taking a single maximum at each time instant from each such pair, we select the four strongest local maxima. This produces a set of hypotheses to work with in the subsequent steps, where consistency constraints between the channels and time-continuity constraints are exploited. The paper demonstrates how such detections can be used to estimate microphone positions, sound source movement and room geometry. The methods are tested and verified using real data from several reverberant environments. The evaluation demonstrated accuracy on the order of a few millimeters.
Submitted 7 October, 2016;
originally announced October 2016.
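The GCC-PHAT matching step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`dft`, `gcc_phat`, `strongest_peaks`, `to_lag`), the synthetic two-microphone signal, the circular delays and the naive O(n^2) transform are all assumptions chosen to keep the example short and dependency-free. It shows only the idea of whitening the cross-spectrum and keeping the four strongest local maxima as delay hypotheses rather than a single maximum.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short demo signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat(sig_a, sig_b):
    """Circular GCC-PHAT correlation of two equal-length real signals."""
    spec_a, spec_b = dft(sig_a), dft(sig_b)
    cross = [a * b.conjugate() for a, b in zip(spec_a, spec_b)]
    # PHAT weighting: discard the magnitude and keep only the phase.
    whitened = [c / (abs(c) + 1e-12) for c in cross]
    return [v.real for v in dft(whitened, inverse=True)]


def strongest_peaks(corr, k=4):
    """Indices of the k largest local maxima, treating corr as circular."""
    n = len(corr)
    peaks = [i for i in range(n)
             if corr[i] > corr[i - 1] and corr[i] >= corr[(i + 1) % n]]
    peaks.sort(key=lambda i: corr[i], reverse=True)
    return peaks[:k]


def to_lag(index, n):
    """Map a circular correlation index to a signed lag in samples."""
    return index if index <= n // 2 else index - n


# Hypothetical test signal: microphone A hears the source 5 samples late,
# plus a weaker echo 12 samples late (a crude stand-in for multi-path).
random.seed(0)
n = 128
source = [random.gauss(0.0, 1.0) for _ in range(n)]
mic_b = source
mic_a = [source[(t - 5) % n] + 0.5 * source[(t - 12) % n] for t in range(n)]

corr = gcc_phat(mic_a, mic_b)
lags = [to_lag(i, n) for i in strongest_peaks(corr, k=4)]
print("candidate delays (samples), strongest first:", lags)
```

In this sketch the direct-path delay of 5 samples comes out as the strongest candidate. The remaining candidates are weaker hypotheses that a later stage would have to accept or reject, for instance with the consistency and time-continuity constraints the abstract describes.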