Sound Source Localization for a Source inside a Structure using Ac-CycleGAN
Authors:
Shunsuke Kita,
Choong Sik Park,
Yoshinobu Kajikawa
Abstract:
We propose a method for sound source localization (SSL) for a source inside a structure using Ac-CycleGAN under unpaired data conditions. The proposed method utilizes a large amount of simulated data and a small amount of actual experimental data to locate a sound source inside a structure in a real environment. An Ac-CycleGAN generator contributes to the transformation of simulated data into real data, or vice versa, using unpaired data from both domains. The discriminator of an Ac-CycleGAN model is designed to differentiate between the transformed data generated by the generator and real data, while also predicting the location of the sound source. Vectors representing the frequency spectrum of the accelerometers (FSAs) measured at three points outside the structure are used as input data and the source areas inside the structure are used as labels. The input data vectors are concatenated vertically to form an image. Labels are defined by dividing the interior of the structure into eight areas with one-hot encoding for each area. Thus, the SSL problem is redefined as an image-classification problem to stochastically estimate the location of the sound source. We show that it is possible to estimate the sound source location using the Ac-CycleGAN discriminator for unpaired data across domains. Furthermore, we analyze the discriminative factors for distinguishing the data. The proposed model exhibited an accuracy exceeding 90% when trained on 80% of actual data (12.5% of simulated data). Despite potential imperfections in the domain transformation process carried out by the Ac-CycleGAN generator, the discriminator can effectively distinguish between transferred and real data by selectively utilizing only those features that generate a relatively small transformation error.
Submitted 8 December, 2023;
originally announced December 2023.
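The input encoding described in the Ac-CycleGAN abstract above (three accelerometer spectra stacked into an image, with a one-hot label over eight source areas) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names `stack_spectra` and `one_hot`, the four-bin spectra, and the zero-based area index are all assumptions made for the example, and "concatenated vertically" is interpreted here as one spectrum per image row.

```python
from typing import List, Sequence

NUM_AREAS = 8  # the abstract divides the structure's interior into eight areas


def stack_spectra(spectra: Sequence[Sequence[float]]) -> List[List[float]]:
    """Stack one spectrum per accelerometer as the rows of a 2-D 'image'.

    Assumption: every accelerometer spectrum has the same number of bins.
    """
    if not spectra:
        raise ValueError("need at least one spectrum")
    width = len(spectra[0])
    if any(len(row) != width for row in spectra):
        raise ValueError("all spectra must have the same number of bins")
    return [[float(v) for v in row] for row in spectra]


def one_hot(area_index: int, num_areas: int = NUM_AREAS) -> List[int]:
    """Return a one-hot label for a zero-based source-area index (hypothetical convention)."""
    if not 0 <= area_index < num_areas:
        raise ValueError("area_index out of range")
    return [1 if i == area_index else 0 for i in range(num_areas)]


# Made-up spectra for three measurement points, four frequency bins each.
fsa = [
    [0.1, 0.4, 0.2, 0.0],
    [0.3, 0.1, 0.5, 0.2],
    [0.0, 0.2, 0.1, 0.6],
]
image = stack_spectra(fsa)  # 3 rows (accelerometers) x 4 columns (bins)
label = one_hot(5)          # source assumed to lie in the sixth area

print(len(image), len(image[0]))
print(label)
```

With this encoding, each sample is an (image, label) pair, which is what lets the localization task be posed as image classification.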
Multimodal Personal Ear Authentication Using Smartphones
Authors:
S. Itani,
S. Kita,
Y. Kajikawa
Abstract:
In recent years, biometric authentication technology for smartphones has become widespread, with the mainstream methods being fingerprint authentication and face recognition. However, fingerprint authentication cannot be used when hands are wet, and face recognition cannot be used when a person is wearing a mask. Therefore, we examine a personal authentication system using the pinna as a new approach for biometric authentication on smartphones. Authentication systems based on the acoustic transfer function of the pinna (PRTF: Pinna Related Transfer Function) have been investigated. However, the authentication accuracy decreases due to the positional fluctuation across each measurement. In this paper, we propose multimodal personal authentication on smartphones using PRTF. The pinna image and positional sensor information are used with the PRTF, and the effectiveness of the authentication method is examined. We demonstrate that the proposed authentication system can compensate for the positional changes in each measurement and improve robustness.
Submitted 23 March, 2021;
originally announced March 2021.