-
Microwave dependent quantum transport characteristics in GaN/AlGaN FETs
Authors:
Motoya Shinozaki,
Takaya Abe,
Kazuma Matsumura,
Takumi Aizawa,
Takeshi Kumasaka,
Tomohiro Otsuka
Abstract:
Defects in semiconductors, traditionally seen as detrimental to electronic device performance, have emerged as potential assets in quantum technologies due to their unique quantum properties. This study investigates the interaction between defects and quantum electron transport in GaN/AlGaN field-effect transistors, highlighting the observation of Fano resonances at low temperatures. We observe the resonance spectra and their dependence on gate voltage and magnetic fields. To explain the observed behavior, we propose a possible scenario: a Fano interferometer with finite width. Our findings reveal the potential of semiconductor defects to contribute to the development of quantum information processing, pointing to their role as key components in next-generation quantum devices.
Submitted 17 April, 2024;
originally announced April 2024.
-
Kondo effect in electrostatically defined ZnO quantum dots
Authors:
Kosuke Noro,
Yusuke Kozuka,
Kazuma Matsumura,
Takeshi Kumasaka,
Yoshihiro Fujiwara,
Atsushi Tsukazaki,
Masashi Kawasaki,
Tomohiro Otsuka
Abstract:
Quantum devices such as spin qubits have been extensively investigated in electrostatically confined quantum dots using high-quality semiconductor heterostructures like GaAs and Si. Here, we present the first demonstration of electrostatically forming the quantum dots in ZnO heterostructures. Through the transport measurement, we uncover the distinctive signature of the Kondo effect independent of the even-odd electron number parity, which contrasts with the typical behavior of the Kondo effect in GaAs. By analyzing temperature and magnetic field dependences, we find that the absence of the even-odd parity in the Kondo effect is not straightforwardly interpreted by the considerations developed for conventional semiconductors. We propose that, based on the unique parameters of ZnO, electron correlation likely plays a fundamental role in this observation. Our study not only clarifies the physics of correlated electrons in the quantum dot but also holds promise for applications in quantum devices, leveraging the unique features of ZnO.
Submitted 21 November, 2023;
originally announced November 2023.
-
Wide dynamic range charge sensor operation by high-speed feedback control of radio-frequency reflectometry
Authors:
Yoshihiro Fujiwara,
Motoya Shinozaki,
Kazuma Matsumura,
Kosuke Noro,
Riku Tataka,
Shoichi Sato,
Takeshi Kumasaka,
Tomohiro Otsuka
Abstract:
Semiconductor quantum dots are useful for controlling and observing quantum states and can also be used as sensors for reading out quantum bits and exploring local electronic states in nanostructures. However, challenges remain for the sensor applications, such as the trade-off between sensitivity and dynamic range and the issue of instability due to external disturbances. In this study, we demonstrate proportional-integral-differential feedback control of the radio-frequency reflectometry in GaN nanodevices using a field-programmable gate array. This technique can maintain the operating point of the charge sensor with high sensitivity. The system also realizes a wide dynamic range and high sensor sensitivity through the monitoring of the feedback signal. This method has potential applications in exploring dynamics and instability of electronic and quantum states in nanostructures.
Submitted 11 July, 2023;
originally announced July 2023.
-
ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code
Authors:
Kazuaki Matsumura,
Simon Garcia De Gonzalo,
Antonio J. Peña
Abstract:
Automatic code optimization is a complex process that typically involves the application of multiple discrete algorithms that modify the program structure irreversibly. However, the design of these algorithms is often monolithic, and they require repetitive implementation to perform similar analyses due to the lack of cooperation. To address this issue, modern optimization techniques, such as equality saturation, allow for exhaustive term rewriting at various levels of inputs, thereby simplifying compiler design.
In this paper, we propose equality saturation to optimize sequential codes utilized in directive-based programming for GPUs. Our approach simultaneously realizes less computation, less memory access, and high memory throughput. Our fully-automated framework constructs single-assignment forms from inputs to be entirely rewritten while keeping dependencies and extracts optimal cases. Through practical benchmarks, we demonstrate a significant performance improvement on several compilers. Furthermore, we highlight the advantages of computational reordering and emphasize the significance of memory-access order for modern GPUs.
Submitted 23 June, 2023; v1 submitted 22 June, 2023;
originally announced June 2023.
-
Channel length dependence of the formation of quantum dots in GaN/AlGaN FETs
Authors:
Kazuma Matsumura,
Takaya Abe,
Takahito Kitada,
Takeshi Kumasaka,
Norikazu Ito,
Taketoshi Tanaka,
Ken Nakahara,
Tomohiro Otsuka
Abstract:
Quantum dots can be formed in simple GaN/AlGaN field-effect transistors (FETs) by the disordered potential induced by impurities and defects. Here, we investigate the channel length dependence of the formation of quantum dots. We observe a decrease in the number of formed quantum dots as the FET channel length decreases. A few quantum dots are formed for a gate length of 0.05 μm, and we evaluate the dot parameters and the disordered potential. We also investigate the effects of a thermal cycle and light illumination, and reveal the resulting change in the disordered potential.
Submitted 13 April, 2023;
originally announced April 2023.
-
A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code
Authors:
Kazuaki Matsumura,
Simon Garcia De Gonzalo,
Antonio J. Peña
Abstract:
Various kinds of applications take advantage of GPUs through automation tools that attempt to automatically exploit the available performance of the GPU's parallel architecture. Directive-based programming models, such as OpenACC, are one such method that easily enables parallel computing by just adhering code annotations to code loops. Such abstract models, however, often prevent programmers from making additional low-level optimizations to take advantage of the advanced architectural features of GPUs because the actual generated computation is hidden from the application developer.
This paper describes and implements a novel flexible optimization technique that operates by inserting a code emulator phase to the tail-end of the compilation pipeline. Our tool emulates the generated code using symbolic analysis by substituting dynamic information and thus allowing for further low-level code optimizations to be applied. We implement our tool to support both CUDA and OpenACC directives as the frontend of the compilation pipeline, thus enabling low-level GPU optimizations for OpenACC that were not previously possible. We demonstrate the capabilities of our tool by automating warp-level shuffle instructions that are difficult to use by even advanced GPU programmers. Lastly, evaluating our tool with a benchmark suite and complex application code, we provide a detailed study to assess the benefits of shuffle instructions across four generations of GPU architectures.
Submitted 26 January, 2023;
originally announced January 2023.
-
Near-optimal stochastic MIMO signal detection with a mixture of t-distribution prior
Authors:
Junichiro Hagiwara,
Kazushi Matsumura,
Hiroki Asumi,
Yukiko Kasuga,
Toshihiko Nishimura,
Takanori Sato,
Yasutaka Ogawa,
Takeo Ohgane
Abstract:
Multiple-input multiple-output (MIMO) systems will play a crucial role in future wireless communication, but improving their signal detection performance to increase transmission efficiency remains a challenge. To address this issue, we propose extending the discrete signal detection problem in MIMO systems to a continuous one and applying the Hamiltonian Monte Carlo method, an efficient Markov chain Monte Carlo algorithm. In our previous studies, we have used a mixture of normal distributions for the prior distribution. In this study, we propose using a mixture of t-distributions, which further improves detection performance. Based on our theoretical analysis and computer simulations, the proposed method can achieve near-optimal signal detection with polynomial computational complexity. This high-performance and practical MIMO signal detection could contribute to the development of the 6th-generation mobile network.
Submitted 7 March, 2024; v1 submitted 9 January, 2023;
originally announced January 2023.
-
An Estimation Framework for Passerby Engagement Interacting with Social Robots
Authors:
Taichi Sakaguchi,
Yuki Okafuji,
Kohei Matsumura,
Jun Baba,
Junya Nakanishi
Abstract:
Social robots are expected to be a human labor support technology, and one application of them is an advertising medium in public spaces. When social robots provide information, such as recommended shops, adaptive communication according to the user's state is desired. User engagement, which is also defined as the level of interest in the robot, is likely to play an important role in adaptive communication. Therefore, in this paper, we propose a new framework to estimate user engagement. The proposed method focuses on four unsolved open problems: multi-party interactions, process of state change in engagement, difficulty in annotating engagement, and interaction dataset in the real world. The accuracy of the proposed method for estimating engagement was evaluated using interaction duration. The results show that the interaction duration can be accurately estimated by considering the influence of the behaviors of other people; this also implies that the proposed model accurately estimates the level of engagement during interaction with the robot.
Submitted 6 June, 2022;
originally announced June 2022.
-
Duality Cascades and Affine Weyl Groups
Authors:
Tomohiro Furukawa,
Kazunobu Matsumura,
Sanefumi Moriyama,
Tomoki Nakanishi
Abstract:
Brane configurations in a circle allow subsequent applications of the Hanany-Witten transitions, which are known as duality cascades. By studying the process of duality cascades corresponding to quantum curves with symmetries of Weyl groups, we find a hidden structure of affine Weyl groups. Namely, the fundamental domain of duality cascades consisting of all the final destinations is characterized by the affine Weyl chamber and the duality cascades are realized as translations of the affine Weyl group, where the overall rank in the brane configuration associates to the grading operator of the affine algebra. The structure of the affine Weyl group guarantees the finiteness of the processes and the uniqueness of the endpoint of the duality cascades. In addition to the original duality cascades, we can generalize to the cases with Fayet-Iliopoulos parameters. There we can utilize the Weyl group to analyze the fundamental domain similarly and find that the fundamental domain continues to be the affine Weyl chamber. We further interpret the Weyl group we impose as a "half" of the Hanany-Witten transition.
Submitted 30 May, 2022; v1 submitted 27 December, 2021;
originally announced December 2021.
-
JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization
Authors:
Kazuaki Matsumura,
Simon Garcia De Gonzalo,
Antonio J. Peña
Abstract:
The rapid development in computing technology has paved the way for directive-based programming models to take a principal role in maintaining software portability of performance-critical applications. Such models require minimal engineering effort to enable computational acceleration on multiple architectures, as programmers only need to add meta-information to sequential code. Optimizations for obtaining the best possible efficiency, however, are often challenging. The insertion of directives by the programmer can lead to side effects that limit the compiler optimizations available, which could result in performance degradation. This is exacerbated when targeting multi-GPU systems, as pragmas do not automatically adapt to such systems and require expensive and time-consuming code adjustment by programmers.
This paper introduces JACC, an OpenACC runtime framework which enables the dynamic extension of OpenACC programs by serving as a transparent layer between the program and the compiler. We add a versatile code-translation method for multi-device utilization by which manually-optimized applications can be distributed automatically while keeping original code structure and parallelism. We show in some cases nearly linear scaling on the part of kernel execution with the NVIDIA V100 GPUs. While adaptively using multi-GPUs, the resulting performance improvements amortize the latency of GPU-to-GPU communications.
Submitted 27 April, 2022; v1 submitted 27 October, 2021;
originally announced October 2021.
-
AN5D: Automated Stencil Framework for High-Degree Temporal Blocking on GPUs
Authors:
Kazuaki Matsumura,
Hamid Reza Zohouri,
Mohamed Wahib,
Toshio Endo,
Satoshi Matsuoka
Abstract:
Stencil computation is one of the most widely-used compute patterns in high performance computing applications. Spatial and temporal blocking have been proposed to overcome the memory-bound nature of this type of computation by moving memory pressure from external memory to on-chip memory on GPUs. However, correctly implementing those optimizations while considering the complexity of the architecture and memory hierarchy of GPUs to achieve high performance is difficult. We propose AN5D, an automated stencil framework which is capable of automatically transforming and optimizing stencil patterns in a given C source code, and generating corresponding CUDA code. Parameter tuning in our framework is guided by our performance model. Our novel optimization strategy reduces shared memory and register pressure in comparison to existing implementations, allowing performance scaling up to a temporal blocking degree of 10. We achieve the highest performance reported so far for all evaluated stencil benchmarks on the state-of-the-art Tesla V100 GPU.
Submitted 3 February, 2020; v1 submitted 6 January, 2020;
originally announced January 2020.
-
Double-precision FPUs in High-Performance Computing: an Embarrassment of Riches?
Authors:
Jens Domke,
Kazuaki Matsumura,
Mohamed Wahib,
Haoyu Zhang,
Keita Yashima,
Toshiki Tsuchikawa,
Yohei Tsuji,
Artur Podobas,
Satoshi Matsuoka
Abstract:
Among the (uncontended) common wisdom in High-Performance Computing (HPC) is the applications' need for large amount of double-precision support in hardware. Hardware manufacturers, the TOP500 list, and (rarely revisited) legacy software have without doubt followed and contributed to this view.
In this paper, we challenge that wisdom, and we do so by exhaustively comparing a large number of HPC proxy applications on two processors: Intel's Knights Landing (KNL) and Knights Mill (KNM). Although similar, the KNM and KNL architecturally deviate at one important point: the silicon area devoted to double-precision arithmetic. This fortunate discrepancy allows us to empirically quantify the performance impact of reducing the amount of hardware double-precision arithmetic.
Our analysis shows that this common wisdom might not always be right. We find that the investigated HPC proxy applications do allow for a (significant) reduction in double-precision with little-to-no performance implications. With the advent of a failing of Moore's law, our results partially reinforce the view taken by modern industry (e.g. upcoming Fujitsu ARM64FX) to integrate hybrid-precision hardware units.
Submitted 25 March, 2019; v1 submitted 22 October, 2018;
originally announced October 2018.
-
Acoustic Probing for Estimating the Storage Time and Firmness of Tomatoes and Mandarin Oranges
Authors:
Hidetomo Kataoka,
Takashi Ijiri,
Kohei Matsumura,
Jeremy White,
Akira Hirabayashi
Abstract:
This paper introduces an acoustic probing technique to estimate the storage time and firmness of fruits; we emit an acoustic signal to fruit from a small speaker and capture the reflected signal with a tiny microphone. We collect reflected signals for fruits with various storage times and firmness conditions, using them to train regressors for estimation. To evaluate the feasibility of our acoustic probing, we performed experiments; we prepared 162 tomatoes and 153 mandarin oranges, collected their reflected signals using our developed device and measured their firmness with a fruit firmness tester, for a period of 35 days for tomatoes and 60 days for mandarin oranges. We performed cross validation by using this data set. The average estimation errors of storage time and firmness for tomatoes were 0.89 days and 9.47 g/mm². Those for mandarin oranges were 1.67 days and 15.67 g/mm². The estimation of storage time was sufficiently accurate for casual users to select fruits in their favorite condition at home. In the experiments, we tested four different acoustic probes and found that sweep signals provide highly accurate estimation results.
Submitted 30 April, 2019; v1 submitted 27 September, 2018;
originally announced September 2018.
-
Research Activity Classification based on Time Series Bibliometrics
Authors:
Takahiro Kawamura,
Yasuhiro Yamashita,
Katsuji Matsumura
Abstract:
Bibliometrics such as the number of papers and times cited are often used to compare researchers based on specific criteria. The criteria, however, are different in each research domain and are set by empirical laws. Moreover, there are arguments, such that the simple sum of metric values works to the advantage of elders. Therefore, this paper attempts to constitute features from time series data of bibliometrics, and then classify the researchers according to the features. In detail, time series patterns are extracted from bibliographic data sets, and then a model to classify whether the researchers are "distinguished" or not is created by a machine learning technique. The experiments achieved an F-measure of 80.0% in the classification of 114 researchers in two research domains based on the data sets of Japan Science and Technology Agency and Elsevier's Scopus. In the future, we will conduct verification on a number of researchers in several domains, and then make use of discovering "distinguished" researchers, who are not widely known.
Submitted 4 August, 2017;
originally announced August 2017.
-
Can Neural Networks Recognize Parts?
Authors:
Koji Matsumura,
Y-h. Taguchi
Abstract:
We have demonstrated that neural networks can recognize parts from visual images. Input signals are grayscale photographs of objects consisting of several parts, and output signals are their shapes. After training on a few sets of images, without any supervision, the networks become able to recognize the boundaries between parts.
Submitted 4 October, 2004;
originally announced October 2004.
-
A Toy Model of Flying Snake's Glide
Authors:
Koji Matsumura,
Y-h. Taguchi
Abstract:
We have developed a toy model of a flying snake's glide [J.J. Socha, Nature vol. 418 (2002) 603.] by modifying a model for falling paper. We have found that asymmetric oscillation is key to why the snake can glide. Further investigation of the snake's glide will provide details about how it can glide without wings.
Submitted 17 July, 2003; v1 submitted 20 May, 2003;
originally announced May 2003.