Search | arXiv e-print repository

Machine-Learning-Optimized Perovskite Nanoplatelet Synthesis

Authors: Carola Lampe, Ioannis Kouroudis, Milan Harth, Stefan Martin, Alessio Gagliardi, Alexander S. Urban

Abstract: With the demand for renewable energy and efficient devices rapidly increasing, a need arises to find and optimize novel (nano)materials. This can be an extremely tedious process, often relying significantly on trial and error. Machine learning has emerged recently as a powerful alternative; however, most approaches require a substantial amount of data points, i.e., syntheses. Here, we merge three machine-learning models with Bayesian Optimization and are able to dramatically improve the quality of CsPbBr3 nanoplatelets (NPLs) using only approximately 200 total syntheses. The algorithm can predict the resulting PL emission maxima of the NPL dispersions based on the precursor ratios, which lead to previously unobtainable 7 and 8 ML NPLs. Aided by heuristic knowledge, the algorithm should be easily applicable to other nanocrystal syntheses and significantly help to identify interesting compositions and rapidly improve their quality.

Submitted 18 October, 2022; originally announced October 2022.

arXiv:2201.10865 [pdf, ps, other]

On the Issues of TrueDepth Sensor Data for Computer Vision Tasks Across Different iPad Generations

Authors: Steffen Urban, Thomas Lindemeier, David Dobbelstein, Matthias Haenel

Abstract: In 2017 Apple introduced the TrueDepth sensor with the iPhone X release. Although its primary use case is biometric face recognition, the exploitation of accurate depth data for other computer vision tasks like segmentation, portrait image generation and metric 3D reconstruction seems natural and has led to the development of various applications. In this report, we investigate the reliability of TrueDepth data, accessed through two different APIs, on various devices including different iPhone and iPad generations, and reveal two different and significant issues on all tested iPads.

Submitted 8 March, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

Comments: 17 pages

arXiv:2102.13391 [pdf, other]

doi 10.5220/0010211600700079

Point Cloud Upsampling and Normal Estimation using Deep Learning for Robust Surface Reconstruction

Authors: Rajat Sharma, Tobias Schwandt, Christian Kunert, Steffen Urban, Wolfgang Broll

Abstract: The reconstruction of real-world surfaces is in high demand in various applications. Most existing reconstruction approaches use 3D scanners to create point clouds, which are generally sparse and of low density. These point clouds are triangulated and used for visualization in combination with surface normals estimated by geometrical approaches. However, the quality of the reconstruction depends on the density of the point cloud and the estimation of the surface normals. In this paper, we present a novel deep learning architecture for point cloud upsampling that enables subsequent stable and smooth surface reconstruction. A noisy point cloud of low density with corresponding point normals is used to estimate a point cloud with higher density and associated point normals. To this end, we propose a compound loss function that encourages the network to estimate points that lie on a surface, including normals that accurately predict the orientation of the surface. Our results show the benefit of estimating normals together with point positions. The resulting point cloud is smoother, more complete, and the final surface reconstruction is much closer to ground truth.

Submitted 26 February, 2021; originally announced February 2021.

Journal ref: In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP, pages 70-79

arXiv:1711.11059 [pdf, other]

Gaussian Process Neurons Learn Stochastic Activation Functions

Authors: Sebastian Urban, Marcus Basalla, Patrick van der Smagt

Abstract: We propose stochastic, non-parametric activation functions that are fully learnable and individual to each neuron. Complexity and the risk of overfitting are controlled by placing a Gaussian process prior over these functions. The result is the Gaussian process neuron, a probabilistic unit that can be used as the basic building block for probabilistic graphical models that resemble the structure of neural networks. The proposed model can intrinsically handle uncertainties in its inputs and self-estimate the confidence of its predictions. Using variational Bayesian inference and the central limit theorem, a fully deterministic loss function is derived, allowing it to be trained as efficiently as a conventional neural network using mini-batch gradient descent. The posterior distribution of activation functions is inferred from the training data alongside the weights of the network. The proposed model compares favorably to deep Gaussian processes, both in model complexity and efficiency of inference. It can be directly applied to recurrent or convolutional network structures, allowing its use in audio and image processing tasks. As a preliminary empirical evaluation we present experiments on regression and classification tasks, in which our model achieves performance comparable to or better than a Dropout regularized neural network with a fixed activation function. Experiments are ongoing and results will be added as they become available.

Submitted 29 November, 2017; originally announced November 2017.

arXiv:1711.01348 [pdf, ps, other]

Automatic Differentiation for Tensor Algebras

Authors: Sebastian Urban, Patrick van der Smagt

Abstract: Kjolstad et al. proposed a tensor algebra compiler. It takes expressions that define a tensor element-wise, such as $f_{ij}(a,b,c,d) = \exp\left[-\sum_{k=0}^4 \left((a_{ik}+b_{jk})^2\, c_{ii} + d_{i+k}^3 \right) \right]$, and generates the corresponding compute kernel code. For machine learning, especially deep learning, it is often necessary to compute the gradient of a loss function $l(a,b,c,d)=l(f(a,b,c,d))$ with respect to parameters $a,b,c,d$. If tensor compilers are to be applied in this field, it is necessary to derive expressions for the derivatives of element-wise defined tensors, i.e. expressions for $(da)_{ik}=\partial l/\partial a_{ik}$. When the mapping between function indices and argument indices is not 1:1, special attention is required. For the function $f_{ij} (x) = x_i^2$, the derivative of the loss is $(dx)_i=\partial l/\partial x_i=\sum_j (df)_{ij}2x_i$; the sum is necessary because index $j$ does not appear in the indices of $x$. Another example is $f_{i}(x)=x_{ii}^2$, where $x$ is a matrix; here we have $(dx)_{ij}=\delta_{ij}(df)_i2x_{ii}$; the Kronecker delta is necessary because the derivative is zero for off-diagonal elements. Another indexing scheme is used by $f_{ij}(x)=\exp x_{i+j}$; here the correct derivative is $(dx)_{k}=\sum_i (df)_{i,k-i} \exp x_{k}$, where the range of the sum must be chosen appropriately. In this publication we present an algorithm that can handle any case in which the indices of an argument are an arbitrary linear combination of the indices of the function, thus all the above examples can be handled. Sums (and their ranges) and Kronecker deltas are automatically inserted into the derivatives as necessary. Additionally, the indices are transformed, if required (as in the last example). The algorithm outputs a symbolic expression that can be subsequently fed into a tensor algebra compiler. Source code is provided.

Submitted 3 November, 2017; originally announced November 2017.

Comments: Technical Report
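
As a rough illustration of the three index-mapping examples quoted in the abstract above, the following pure-Python sketch hard-codes each derivative formula and provides a finite-difference helper for checking it. This is not the paper's algorithm, which derives such expressions symbolically; the function names (`grad_broadcast`, `grad_diagonal`, `grad_shifted`, `numeric_grad`) and the choice of a loss that is linear in $f$ are assumptions made only for this example.

```python
import math


def grad_broadcast(x, df):
    """f_ij(x) = x_i^2  ->  (dx)_i = sum_j (df)_ij * 2 * x_i."""
    # The sum over j appears because j is not an index of x.
    return [sum(df[i]) * 2.0 * x[i] for i in range(len(x))]


def grad_diagonal(x, df):
    """f_i(x) = x_ii^2  ->  (dx)_ij = delta_ij * (df)_i * 2 * x_ii."""
    n = len(x)
    # Off-diagonal entries of x never enter f, so their derivative is zero.
    return [[df[i] * 2.0 * x[i][i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]


def grad_shifted(x, df):
    """f_ij(x) = exp(x_{i+j})  ->  (dx)_k = sum_i (df)_{i,k-i} * exp(x_k)."""
    rows, cols = len(df), len(df[0])
    out = []
    for k in range(len(x)):
        # Restrict the sum to those i for which j = k - i is a valid column.
        total = sum(df[i][k - i] for i in range(rows) if 0 <= k - i < cols)
        out.append(total * math.exp(x[k]))
    return out


def numeric_grad(loss, x, eps=1e-6):
    """Central finite differences of a scalar loss over a flat list x."""
    grad = []
    for k in range(len(x)):
        up = x[:k] + [x[k] + eps] + x[k + 1:]
        down = x[:k] + [x[k] - eps] + x[k + 1:]
        grad.append((loss(up) - loss(down)) / (2.0 * eps))
    return grad


if __name__ == "__main__":
    # Hypothetical test data; df plays the role of dl/df for a loss
    # l = sum_ij df_ij * f_ij, which is linear in f.
    x = [0.1, 0.2, 0.3, 0.4]
    df = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

    def loss(v):
        return sum(df[i][j] * math.exp(v[i + j])
                   for i in range(2) for j in range(3))

    print(grad_shifted(x, df))
    print(numeric_grad(loss, x))
```

Because the loss is taken to be linear in $f$, the weights `df` coincide with $\partial l/\partial f$, so the analytic and finite-difference gradients should agree up to discretization error.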

arXiv:1610.07804 [pdf, other]

mdBrief - A Fast Online Adaptable, Distorted Binary Descriptor for Real-Time Applications Using Calibrated Wide-Angle Or Fisheye Cameras

Authors: Steffen Urban, Stefan Hinz

Abstract: Fast binary descriptors form the core of many vision-based applications with real-time demands like object detection, Visual Odometry or SLAM. It is commonly assumed that the acquired images, and thus the patches extracted around keypoints, originate from a perspective projection, ignoring image distortion or completely different types of projections such as omnidirectional or fisheye. Usually the deviations from a perfect perspective projection are corrected by undistortion. The latter, however, introduces severe artifacts as the camera's field of view gets larger. In this paper, we propose a distorted and masked version of the BRIEF descriptor for calibrated cameras. Instead of correcting the distortion holistically, we distort the binary tests and thus adapt the descriptor to different image regions.

Submitted 25 October, 2016; originally announced October 2016.

Comments: 18 pages, 3 tables, 14 figures

arXiv:1610.07336 [pdf, other]

MultiCol-SLAM - A Modular Real-Time Multi-Camera SLAM System

Authors: Steffen Urban, Stefan Hinz

Abstract: The basis for most vision-based applications like robotics, self-driving cars and potentially augmented and virtual reality is a robust, continuous estimation of the position and orientation of a camera system w.r.t. the observed environment (scene). In recent years many vision-based systems that perform simultaneous localization and mapping (SLAM) have been presented and released as open source. In this paper, we extend and improve upon a state-of-the-art SLAM to make it applicable to arbitrary, rigidly coupled multi-camera systems (MCS) using the MultiCol model. In addition, we include a performance evaluation on accurate ground truth and compare the robustness of the proposed method to a single camera version of the SLAM system. An open source implementation of the proposed multi-fisheye camera SLAM system can be found online at https://github.com/urbste/MultiCol-SLAM.

Submitted 24 October, 2016; originally announced October 2016.

Comments: 15 pages, 8 figures, 2 tables

arXiv:1607.08112 [pdf, other]

doi 10.5194/isprs-annals-III-3-131-2016

MLPnP - A Real-Time Maximum Likelihood Solution to the Perspective-n-Point Problem

Authors: Steffen Urban, Jens Leitloff, Stefan Hinz

Abstract: In this paper, a statistically optimal solution to the Perspective-n-Point (PnP) problem is presented. Many solutions to the PnP problem are geometrically optimal, but do not consider the uncertainties of the observations. In addition, it would be desirable to have an internal estimation of the accuracy of the estimated rotation and translation parameters of the camera pose. Thus, we propose a novel maximum likelihood solution to the PnP problem that incorporates image observation uncertainties and remains real-time capable at the same time. Further, the presented method is general, as it works with 3D direction vectors instead of 2D image points and is thus able to cope with arbitrary central camera models. This is achieved by projecting (and thus reducing) the covariance matrices of the observations to the corresponding vector tangent space.

Submitted 27 July, 2016; originally announced July 2016.

Comments: Submitted to the ISPRS congress (2016) in Prague. Oral Presentation. Published in ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., III-3, 131-138

arXiv:1605.02688 [pdf, other]

Theano: A Python framework for fast computation of mathematical expressions

Authors: The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, et al. (88 additional authors not shown)

Abstract: Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.

Submitted 9 May, 2016; originally announced May 2016.

Comments: 19 pages, 5 figures

arXiv:1604.03736 [pdf, other]

A Differentiable Transition Between Additive and Multiplicative Neurons

Authors: Wiebke Köpp, Patrick van der Smagt, Sebastian Urban

Abstract: Existing approaches to combine both additive and multiplicative neural units either use a fixed assignment of operations or require discrete optimization to determine what function a neuron should perform. However, this leads to an extensive increase in the computational complexity of the training procedure. We present a novel, parameterizable transfer function based on the mathematical concept of non-integer functional iteration that allows the operation each neuron performs to be smoothly and, most importantly, differentiably adjusted between addition and multiplication. This allows the decision between addition and multiplication to be integrated into the standard backpropagation training procedure.

Submitted 13 April, 2016; originally announced April 2016.

Comments: ICLR 2016 extended abstract

arXiv:1503.05724 [pdf, other]

A Neural Transfer Function for a Smooth and Differentiable Transition Between Additive and Multiplicative Interactions

Authors: Sebastian Urban, Patrick van der Smagt

Abstract: Existing approaches to combine both additive and multiplicative neural units either use a fixed assignment of operations or require discrete optimization to determine what function a neuron should perform. This leads either to an inefficient distribution of computational resources or an extensive increase in the computational complexity of the training procedure. We present a novel, parameterizable transfer function based on the mathematical concept of non-integer functional iteration that allows the operation each neuron performs to be smoothly and, most importantly, differentiably adjusted between addition and multiplication. This allows the decision between addition and multiplication to be integrated into the standard backpropagation training procedure.

Submitted 29 March, 2016; v1 submitted 19 March, 2015; originally announced March 2015.

arXiv:1311.0701 [pdf, other]

On Fast Dropout and its Applicability to Recurrent Networks

Authors: Justin Bayer, Christian Osendorfer, Daniela Korhammer, Nutan Chen, Sebastian Urban, Patrick van der Smagt

Abstract: Recurrent Neural Networks (RNNs) are rich models for the processing of sequential data. Recent work on advancing the state of the art has been focused on the optimization or modelling of RNNs, mostly motivated by addressing the problems of the vanishing and exploding gradients. The control of overfitting has seen considerably less attention. This paper contributes to that by analyzing fast dropout, a recent regularization method for generalized linear models and neural networks, from a back-propagation inspired perspective. We show that fast dropout implements a quadratic form of an adaptive, per-parameter regularizer, which rewards large weights in the light of underfitting, penalizes them for overconfident predictions and vanishes at minima of an unregularized training loss. The derivatives of that regularizer are exclusively based on the training error signal. One consequence of this is the absence of a global weight attractor, which is particularly appealing for RNNs, since the dynamics are not biased towards a certain regime. We positively test the hypothesis that this improves the performance of RNNs on four musical data sets.

Submitted 5 March, 2014; v1 submitted 4 November, 2013; originally announced November 2013.

Comments: The experiments for the Penn Treebank corpus were erroneous and have been stripped from this version

arXiv:1301.2840 [pdf, other]

Unsupervised Feature Learning for low-level Local Image Descriptors

Authors: Christian Osendorfer, Justin Bayer, Sebastian Urban, Patrick van der Smagt

Abstract: Unsupervised feature learning has shown impressive results for a wide range of input modalities, in particular for object classification tasks in computer vision. Using a large amount of unlabeled data, unsupervised feature learning methods are utilized to construct high-level representations that are discriminative enough for subsequently trained supervised classification algorithms. However, it has never yet been quantitatively investigated how well unsupervised learning methods can find low-level representations for image patches without any additional supervision. In this paper we examine the performance of pure unsupervised methods on a low-level correspondence task, a problem that is central to many Computer Vision applications. We find that a special type of Restricted Boltzmann Machines (RBMs) performs comparably to hand-crafted descriptors. Additionally, a simple binarization scheme produces compact representations that perform better than several state-of-the-art descriptors.

Submitted 25 April, 2013; v1 submitted 13 January, 2013; originally announced January 2013.

Showing 1–13 of 13 results for author: Urban, S