Search | arXiv e-print repository

doi 10.1039/D4EE00911H

Physics-based material parameters extraction from perovskite experiments via Bayesian optimization

Authors: Hualin Zhan, Viqar Ahmad, Azul Mayon, Grace Tabi, Anh Dinh Bui, Zhuofeng Li, Daniel Walter, Hieu Nguyen, Klaus Weber, Thomas White, Kylie Catchpole

Abstract: The ability to extract material parameters of perovskite from quantitative experimental analysis is essential for rational design of photovoltaic and optoelectronic applications. However, the difficulty of this analysis increases significantly with the complexity of the theoretical model and the number of material parameters for perovskite. Here we use Bayesian optimization to develop an analysis… ▽ More The ability to extract material parameters of perovskite from quantitative experimental analysis is essential for rational design of photovoltaic and optoelectronic applications. However, the difficulty of this analysis increases significantly with the complexity of the theoretical model and the number of material parameters for perovskite. Here we use Bayesian optimization to develop an analysis platform that can extract up to 8 fundamental material parameters of an organometallic perovskite semiconductor from a transient photoluminescence experiment, based on a complex full physics model that includes drift-diffusion of carriers and dynamic defect occupation. An example study of thermal degradation reveals that the carrier mobility and trap-assisted recombination coefficient are reduced noticeably, while the defect energy level remains nearly unchanged. The reduced carrier mobility can dominate the overall effect on thermal degradation of perovskite solar cells by reducing the fill factor, despite the opposite effect of the reduced trap-assisted recombination coefficient on increasing the fill factor. In future, this platform can be conveniently applied to other experiments or to combinations of experiments, accelerating materials discovery and optimization of semiconductor materials for photovoltaics and other applications. △ Less

Submitted 29 May, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: The work is published in Energy & Environmental Science (DOI: 10.1039/D4EE00911H). This work is supported by the Australian Centre for Advanced Photovoltaics (ACAP) and received funding from the Australian Renewable Energy Agency (ARENA). H.Z. acknowledges the support of the ACAP Fellowship. H.Z. thanks Pawsey for providing the Nimbus Research Cloud Service

arXiv:2310.09094 [pdf]

doi 10.1109/ISOCC59558.2023.10396509

A RISC-V MCU with adaptive reverse body bias and ultra-low-power retention mode in 22 nm FD-SOI

Authors: Heiner Bauer, Marco Stolba, Stefan Scholze, Dennis Walter, Christian Mayr, Alexander Oefelein, Sebastian Höppner, André Scharfe, Flo Schraut, Holger Eisenreich

Abstract: We present a low-power, energy efficient 32-bit RISC-V microprocessor unit (MCU) in 22 nm FD-SOI. It achieves ultra-low leakage,even at high temperatures, by using an adaptive reverse body biasing aware sign-off approach, a low-power optimized physical implementation, and custom SRAM macros with retention mode. We demonstrate the robustness of the chip with measurements over the full industrial te… ▽ More We present a low-power, energy efficient 32-bit RISC-V microprocessor unit (MCU) in 22 nm FD-SOI. It achieves ultra-low leakage,even at high temperatures, by using an adaptive reverse body biasing aware sign-off approach, a low-power optimized physical implementation, and custom SRAM macros with retention mode. We demonstrate the robustness of the chip with measurements over the full industrial temperature range, from -40 °C to 125 °C. Our results match the state of the art (SOTA) with 4.8 uW / MHz at 50 MHz in active mode and surpass the SOTA in ultra-low-power retention mode. △ Less

Submitted 13 October, 2023; originally announced October 2023.

Comments: accepted at ISOCC 2023

arXiv:2205.06287 [pdf, other]

Adaptive Block Floating-Point for Analog Deep Learning Hardware

Authors: Ayon Basumallik, Darius Bunandar, Nicholas Dronen, Nicholas Harris, Ludmila Levkova, Calvin McCarter, Lakshmi Nair, David Walter, David Widemann

Abstract: Analog mixed-signal (AMS) devices promise faster, more energy-efficient deep neural network (DNN) inference than their digital counterparts. However, recent studies show that DNNs on AMS devices with fixed-point numbers can incur an accuracy penalty because of precision loss. To mitigate this penalty, we present a novel AMS-compatible adaptive block floating-point (ABFP) number representation. We… ▽ More Analog mixed-signal (AMS) devices promise faster, more energy-efficient deep neural network (DNN) inference than their digital counterparts. However, recent studies show that DNNs on AMS devices with fixed-point numbers can incur an accuracy penalty because of precision loss. To mitigate this penalty, we present a novel AMS-compatible adaptive block floating-point (ABFP) number representation. We also introduce amplification (or gain) as a method for increasing the accuracy of the number representation without increasing the bit precision of the output. We evaluate the effectiveness of ABFP on the DNNs in the MLPerf datacenter inference benchmark -- realizing less than $1\%$ loss in accuracy compared to FLOAT32. We also propose a novel method of finetuning for AMS devices, Differential Noise Finetuning (DNF), which samples device noise to speed up finetuning compared to conventional Quantization-Aware Training. △ Less

Submitted 12 May, 2022; originally announced May 2022.

Comments: 13 pages including Appendix, 7 figures, under submission at IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

arXiv:2202.00027 [pdf, other]

doi 10.1051/0004-6361/202243230

Exoplanet Characterization using Conditional Invertible Neural Networks

Authors: Jonas Haldemann, Victor Ksoll, Daniel Walter, Yann Alibert, Ralf S. Klessen, Willy Benz, Ullrich Koethe, Lynton Ardizzone, Carsten Rother

Abstract: The characterization of an exoplanet's interior is an inverse problem, which requires statistical methods such as Bayesian inference in order to be solved. Current methods employ Markov Chain Monte Carlo (MCMC) sampling to infer the posterior probability of planetary structure parameters for a given exoplanet. These methods are time consuming since they require the calculation of a large number of… ▽ More The characterization of an exoplanet's interior is an inverse problem, which requires statistical methods such as Bayesian inference in order to be solved. Current methods employ Markov Chain Monte Carlo (MCMC) sampling to infer the posterior probability of planetary structure parameters for a given exoplanet. These methods are time consuming since they require the calculation of a large number of planetary structure models. To speed up the inference process when characterizing an exoplanet, we propose to use conditional invertible neural networks (cINNs) to calculate the posterior probability of the internal structure parameters. cINNs are a special type of neural network which excel in solving inverse problems. We constructed a cINN using FrEIA, which was then trained on a database of $5.6\cdot 10^6$ internal structure models to recover the inverse map** between internal structure parameters and observable features (i.e., planetary mass, planetary radius and composition of the host star). The cINN method was compared to a Metropolis-Hastings MCMC. For that we repeated the characterization of the exoplanet K2-111 b, using both the MCMC method and the trained cINN. We show that the inferred posterior probability of the internal structure parameters from both methods are very similar, with the biggest differences seen in the exoplanet's water content. Thus cINNs are a possible alternative to the standard time-consuming sampling methods. Indeed, using cINNs allows for orders of magnitude faster inference of an exoplanet's composition than what is possible using an MCMC method, however, it still requires the computation of a large database of internal structures to train the cINN. Since this database is only computed once, we found that using a cINN is more efficient than an MCMC, when more than 10 exoplanets are characterized using the same cINN. △ Less

Submitted 31 January, 2022; originally announced February 2022.

Comments: 15 pages, 13 figures, submitted to Astronomy & Astrophysics

Journal ref: A&A 672, A180 (2023)

arXiv:2105.00789 [pdf, other]

doi 10.1109/TVLSI.2021.3117401

Hardware Implementation of an OPC UA Server for Industrial Field Devices

Authors: Heiner Bauer, Sebastian Höppner, Chris Iatrou, Zohra Charania, Stephan Hartmann, Saif-Ur Rehman, Andreas Dixius, Georg Ellguth, Dennis Walter, Johannes Uhlig, Felix Neumärker, Marc Berthel, Marco Stolba, Florian Kelber, Leon Urbas, Christian Mayr

Abstract: Industrial plants suffer from a high degree of complexity and incompatibility in their communication infrastructure, caused by a wild mix of proprietary technologies. This prevents transformation towards Industry 4.0 and the Industrial Internet of Things. Open Platform Communications Unified Architecture (OPC UA) is a standardized protocol that addresses these problems with uniform and semantic co… ▽ More Industrial plants suffer from a high degree of complexity and incompatibility in their communication infrastructure, caused by a wild mix of proprietary technologies. This prevents transformation towards Industry 4.0 and the Industrial Internet of Things. Open Platform Communications Unified Architecture (OPC UA) is a standardized protocol that addresses these problems with uniform and semantic communication across all levels of the hierarchy. However, its adoption in embedded field devices, such as sensors and actors, is still lacking due to prohibitive memory and power requirements of software implementations. We have developed a dedicated hardware engine that offloads processing of the OPC UA protocol and enables realization of compact and low-power field devices with OPC UA support. As part of a proof-of-concept embedded system we have implemented this engine in a 22 nm FDSOI technology. We measured performance, power consumption, and memory footprint of our test chip and compared it with a software implementation based on open62541 and a Raspberry Pi 2B. Our OPC UA hardware engine is 50 times more energy efficient and only requires 36 KiB of memory. The complete chip consumes only 24 mW under full load, making it suitable for low-power embedded applications. △ Less

Submitted 3 May, 2021; originally announced May 2021.

Comments: 5 pages, 6 figures, 2 tables

Journal ref: IEEE Transactions on Very Large Scale Integration (VLSI) Systems 29 (2021)

arXiv:2103.08392 [pdf, ps, other]

The SpiNNaker 2 Processing Element Architecture for Hybrid Digital Neuromorphic Computing

Authors: Sebastian Höppner, Yexin Yan, Andreas Dixius, Stefan Scholze, Johannes Partzsch, Marco Stolba, Florian Kelber, Bernhard Vogginger, Felix Neumärker, Georg Ellguth, Stephan Hartmann, Stefan Schiefer, Thomas Hocker, Dennis Walter, Genting Liu, Jim Garside, Steve Furber, Christian Mayr

Abstract: This paper introduces the processing element architecture of the second generation SpiNNaker chip, implemented in 22nm FDSOI. On circuit level, the chip features adaptive body biasing for near-threshold operation, and dynamic voltage-and-frequency scaling driven by spiking activity. On system level, processing is centered around an ARM M4 core, similar to the processor-centric architecture of the… ▽ More This paper introduces the processing element architecture of the second generation SpiNNaker chip, implemented in 22nm FDSOI. On circuit level, the chip features adaptive body biasing for near-threshold operation, and dynamic voltage-and-frequency scaling driven by spiking activity. On system level, processing is centered around an ARM M4 core, similar to the processor-centric architecture of the first generation SpiNNaker. To speed operation of subtasks, we have added accelerators for numerical operations of both spiking (SNN) and rate based (deep) neural networks (DNN). PEs communicate via a dedicated, custom-designed network-on-chip. We present three benchmarks showing operation of the whole processor element on SNN, DNN and hybrid SNN/DNN networks. △ Less

Submitted 15 August, 2022; v1 submitted 15 March, 2021; originally announced March 2021.

arXiv:2101.04395 [pdf, ps, other]

Symbolic Loop Compilation for Tightly Coupled Processor Arrays

Authors: Michael Witterauf, Dominik Walter, Frank Hannig, Jürgen Teich

Abstract: Loop compilation for Tightly Coupled Processor Arrays (TCPAs), a class of massively parallel loop accelerators, entails solving NP-hard problems, yet depends on the loop bounds and number of available processing elements (PEs), parameters known only at runtime because of dynamic resource management and input sizes. Therefore, this article proposes a two-phase approach called symbolic loop compilat… ▽ More Loop compilation for Tightly Coupled Processor Arrays (TCPAs), a class of massively parallel loop accelerators, entails solving NP-hard problems, yet depends on the loop bounds and number of available processing elements (PEs), parameters known only at runtime because of dynamic resource management and input sizes. Therefore, this article proposes a two-phase approach called symbolic loop compilation: At compile time, the necessary NP-complete problems are solved and the solutions compiled into a space-efficient symbolic configuration. At runtime, a concrete configuration is generated from the symbolic configuration according to the parameters values. We show that the latter phase, called instantiation, runs in polynomial time with its most complex step, program instantiation, not depending on the number of PEs. As validation, we performed symbolic loop compilation on real-world loops and measured time and space requirements. Our experiments confirm that a symbolic configuration is space-efficient and suited for systems with little memory -- often, a symbolic configuration is smaller than a single concrete configuration -- and that program instantiation scales well with the number of PEs -- for example, when instantiating a symbolic configuration of a matrix-matrix multiplication, the execution time is similar for $4\times 4$ and $32\times 32$ PEs. △ Less

Submitted 12 January, 2021; originally announced January 2021.

ACM Class: C.3; C.1.4; D.3.4

arXiv:1409.8403 [pdf, other]

doi 10.1109/CVPR.2015.7298911

Evaluation of Output Embeddings for Fine-Grained Image Classification

Authors: Zeynep Akata, Scott Reed, Daniel Walter, Honglak Lee, Bernt Schiele

Abstract: Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and cla… ▽ More Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve results. △ Less

Submitted 28 August, 2015; v1 submitted 30 September, 2014; originally announced September 2014.

Comments: @inproceedings {ARWLS15, title = {Evaluation of Output Embeddings for Fine-Grained Image Classification}, booktitle = {IEEE Computer Vision and Pattern Recognition}, year = {2015}, author = {Zeynep Akata and Scott Reed and Daniel Walter and Honglak Lee and Bernt Schiele} }

Showing 1–8 of 8 results for author: Walter, D