Search | arXiv e-print repository

arXiv:2008.10402 [pdf, other]

Numerical Quality Control for DFT-based Materials Databases

Authors: Christian Carbogno, Kristian Sommer Thygesen, Björn Bieniek, Claudia Draxl, Luca M. Ghiringhelli, Andris Gulans, Oliver T. Hofmann, Karsten W. Jacobsen, Sven Lubeck, Jens Jørgen Mortensen, Mikkel Strange, Elisabeth Wruss, Matthias Scheffler

Abstract: Electronic-structure theory is a strong pillar of materials science. Many different computer codes that employ different approaches are used by the community to solve various scientific problems. Still, the precision of different packages has only recently been scrutinized thoroughly, focusing on a specific task, namely selecting a popular density functional, and using unusually high, extremely precise numerical settings for investigating 71 monoatomic crystals. Little is known, however, about method- and code-specific uncertainties that arise under numerical settings that are commonly used in practice. We shed light on this issue by investigating the deviations in total and relative energies as a function of computational parameters. Using typical settings for basis sets and k-grids, we compare results for 71 elemental and 63 binary solids obtained by three different electronic-structure codes that employ fundamentally different strategies. On the basis of the observed trends, we propose a simple, analytical model for the estimation of the errors associated with the basis-set incompleteness. We cross-validate this model using ternary systems obtained from the NOMAD Repository and discuss how our approach enables the comparison of the heterogeneous data present in computational materials databases.

Submitted 31 January, 2022; v1 submitted 24 August, 2020; originally announced August 2020.

Comments: 10 pages, 7 figures

arXiv:2006.07660 [pdf, other]

Convolutional Generation of Textured 3D Meshes

Authors: Dario Pavllo, Graham Spinks, Thomas Hofmann, Marie-Francine Moens, Aurelien Lucchi

Abstract: While recent generative models for 2D images achieve impressive visual results, they clearly lack the ability to perform 3D reasoning. This heavily restricts the degree of control over generated objects as well as the possible applications of such models. In this work, we bridge this gap by leveraging recent advances in differentiable rendering. We design a framework that can generate triangle meshes and associated high-resolution texture maps, using only 2D supervision from single-view natural images. A key contribution of our work is the encoding of the mesh and texture as 2D representations, which are semantically aligned and can be easily modeled by a 2D convolutional GAN. We demonstrate the efficacy of our method on Pascal3D+ Cars and CUB, both in an unconditional setting and in settings where the model is conditioned on class labels, attributes, and text. Finally, we propose an evaluation methodology that assesses the mesh and texture quality separately.

Submitted 23 October, 2020; v1 submitted 13 June, 2020; originally announced June 2020.

Comments: NeurIPS 2020, Oral presentation. Code at https://github.com/dariopavllo/convmesh

arXiv:2005.02851 [pdf, other]

doi 10.1103/PhysRevMaterials.4.055401

Ionic liquid dynamics in nanoporous carbon: A pore-size- and temperature-dependent neutron spectroscopy study on supercapacitor materials

Authors: Mark Busch, Tommy Hofmann, Bernhard Frick, Jan P. Embs, Boris Dyatkin, Patrick Huber

Abstract: The influence of spatial confinement on the thermally excited stochastic cation dynamics of the room-temperature ionic liquid 1-N-butylpyridinium bis-((trifluoromethyl)sulfonyl)imide ([BuPy][Tf_2N]) inside porous carbide-derived carbons with various pore sizes in the sub- to a few nanometer range is investigated by quasi-elastic neutron spectroscopy. Using the potential of fixed window scans, i.e. scanning a sample parameter while observing solely one specific energy transfer value, an overview of the dynamic landscape within a wide temperature range is obtained. It is shown that these data alone already provide a quite comprehensive understanding of the confinement-induced alteration of the molecular mobility in comparison to the bulk. A complementary, more detailed analysis of full energy transfer spectra at selected temperatures reveals two translational diffusive processes on different time scales. Both are considerably slower than in the bulk liquid and show a decrease of the respective self-diffusion coefficients with decreasing nanopore size. Different thermal activation energies for molecular self-diffusion in nanoporous carbons with similar pore size indicate the importance of pore morphology on the molecular mobility, beyond the pure degree of confinement. In spite of the dynamic slowing down we can show that the temperature range of the liquid state upon nanoconfinement is remarkably extended to much lower temperatures, which is beneficial for potential technical applications of such systems.

Submitted 6 May, 2020; originally announced May 2020.

Comments: 13 pages, 9 figures (in press)

Journal ref: Phys. Rev. Materials 4, 055401 (2020) - Editors' Suggestion

arXiv:2005.01836 [pdf, other]

doi 10.1063/5.0020736

Reproducibility of Potential Energy Surfaces of Organic/Metal Interfaces on the Example of PTCDA on Ag(111)

Authors: Lukas Hörmann, Andreas Jeindl, Oliver T. Hofmann

Abstract: Molecular adsorption at organic/metal interfaces depends on a range of mechanisms: covalent bonds, charge transfer, Pauli repulsion and van der Waals (vdW) interactions shape the potential energy surface (PES), making it key to understanding organic/metal interfaces. Describing such interfaces with density functional theory requires carefully selecting the exchange correlation (XC) functional and vdW correction scheme. To explore the reproducibility of the PES with respect to the choice of method, we present a benchmark of common local, semi-local and non-local XC functionals in combination with various vdW corrections. We benchmark these methods using perylenetetracarboxylic dianhydride (PTCDA) on Ag(111), one of the most frequently studied organic/metal interfaces. For each method, we determine the PES using a Gaussian process regression algorithm, which requires only about 50 DFT calculations as input. This allows a detailed analysis of the PESs' features, such as the positions and energies of minima and saddle points. Comparing the results from different combinations of XC functionals and vdW corrections enables us to identify trends and differences between the approaches. PESs for different computation methods are in qualitative agreement, but also display significant quantitative differences. In particular, lateral positions of adsorption geometries agree well with experiment, while adsorption heights, energies and barriers show larger discrepancies.

Submitted 17 August, 2020; v1 submitted 4 May, 2020; originally announced May 2020.

arXiv:2005.00685 [pdf, ps, other]

Reciprocal plasmonic metasurfaces: Theory and applications

Authors: Yanzeng Li, Micheal McLamb, Serang Park, Darrell Childers, Glenn D. Boreman, Tino Hofmann

Abstract: A new configuration for metasurface construction is presented to achieve multi-functional capabilities including perfect absorption, bio/chem sensing, and surface-mode lasing. The reciprocal plasmonic metasurfaces discussed here are composed of two plasmonic surfaces of reciprocal geometries separated by a dielectric spacer. Compared to conventional metasurfaces this simple geometry exhibits an enhanced optical performance. The discussed reciprocal metasurface design further enables effective structural optimization and allows for a simple and scalable fabrication. The physical principle and potential applications of the reciprocal plasmonic metasurfaces are demonstrated using numerical and analytical approaches.

Submitted 1 May, 2020; originally announced May 2020.

Comments: 4 pages, 4 figures

arXiv:2004.03732 [pdf, other]

Tunable cavity-enhanced terahertz frequency-domain optical Hall effect

Authors: Sean Knight, Stefan Schöche, Philipp Kühne, Tino Hofmann, Vanya Darakchieva, Mathias Schubert

Abstract: Presented here is the development and demonstration of a tunable cavity-enhanced terahertz frequency-domain optical Hall effect technique. The cavity consists of at least one fixed and one tunable Fabry-Pérot resonator. The approach is suitable for enhancement of the optical signatures produced by the optical Hall effect in semi-transparent conductive layer structures with plane parallel interfaces. The physical principle is the constructive interference of electric field components that undergo multiple optical Hall effect induced polarization rotations upon multiple light passages through the conductive layer stack. Tuning one of the cavity parameters, such as the external cavity thickness, permits shifting of the frequencies of the constructive interference, and enhancement of the optical signatures produced by the optical Hall effect can be obtained over large spectral regions. A cavity-tuning optical stage and gas flow cell are used as examples of instruments that exploit tuning an external cavity to enhance polarization changes in a reflected terahertz beam. Permanent magnets are used to provide the necessary external magnetic field. Conveniently, the highly reflective surface of a permanent magnet can be used to create the tunable external cavity. The signal enhancement allows the extraction of the free charge carrier properties of thin films, and can eliminate the need for expensive super-conducting magnets. Furthermore, the thickness of the external cavity establishes an additional independent measurement condition, similar to, for example, the magnetic field strength, terahertz frequency, and angle of incidence. A high electron mobility transistor structure and epitaxial graphene are studied as examples. We discuss the theoretical background, instrument design, data acquisition, and data analysis procedures.

Submitted 7 April, 2020; originally announced April 2020.

Comments: 12 pages, 8 figures

arXiv:2004.03341 [pdf, ps, other]

Resultants over principal Artinian rings

Authors: Claus Fieker, Tommy Hofmann, Carlo Sircana

Abstract: The resultant of two univariate polynomials is an invariant of great importance in commutative algebra and widely used in computer algebra systems. Here we present an algorithm to compute it over Artinian principal rings with a modified version of the Euclidean algorithm. Using the same strategy, we show how the reduced resultant and a pair of Bézout coefficients can be computed. Particular attention is devoted to the special case of $\mathbf{Z}/n\mathbf{Z}$, where we perform a detailed analysis of the asymptotic cost of the algorithm. Finally, we illustrate how the algorithms can be exploited to improve ideal arithmetic in number fields and polynomial arithmetic over $p$-adic fields.

Submitted 7 April, 2020; originally announced April 2020.

arXiv:2004.01990 [pdf]

doi 10.1126/science.aaz8727

Efficient Light Funneling based on the non-Hermitian Skin Effect

Authors: Sebastian Weidemann, Mark Kremer, Tobias Helbig, Tobias Hofmann, Alexander Stegmaier, Martin Greiter, Ronny Thomale, Alexander Szameit

Abstract: In the last two decades, the ubiquitous effect of dissipation has proven to entail astonishing non-Hermitian features, rather than just being an inescapable nuisance. As an alternative route to non-Hermiticity, we tailor the anisotropy of a lattice, which constitutes an, up to now, barely exploited degree of freedom. In this case, the appearance of an interface dramatically alters the entire eigenmode spectrum, leading to the exponential localization of all modes at the interface, which goes beyond the expectations for Hermitian systems. This effect is dubbed "non-Hermitian skin effect". We experimentally demonstrate it by studying the propagation of light in a large scale photonic mesh lattice. For arbitrary excitations, we find that light is always transported to the interface, realizing a highly efficient funnel for light.

Submitted 4 April, 2020; originally announced April 2020.

Journal ref: Science (2020)

arXiv:2003.02738 [pdf, other]

BERT as a Teacher: Contextual Embeddings for Sequence-Level Reward

Authors: Florian Schmidt, Thomas Hofmann

Abstract: Measuring the quality of a generated sequence against a set of references is a central problem in many learning frameworks, be it to compute a score, to assign a reward, or to perform discrimination. Despite great advances in model architectures, metrics that scale independently of the number of references are still based on n-gram estimates. We show that the underlying operations, counting words and comparing counts, can be lifted to embedding words and comparing embeddings. An in-depth analysis of BERT embeddings shows empirically that contextual embeddings can be employed to capture the required dependencies while maintaining the necessary scalability through appropriate pruning and smoothing techniques. We cast unconditional generation as a reinforcement learning problem and show that our reward function indeed provides a more effective learning signal than n-gram reward in this challenging setting.

Submitted 5 March, 2020; originally announced March 2020.

arXiv:2003.01652 [pdf, other]

Batch Normalization Provably Avoids Rank Collapse for Randomly Initialised Deep Networks

Authors: Hadi Daneshmand, Jonas Kohler, Francis Bach, Thomas Hofmann, Aurelien Lucchi

Abstract: Randomly initialized neural networks are known to become harder to train with increasing depth, unless architectural enhancements like residual connections and batch normalization are used. We here investigate this phenomenon by revisiting the connection between random initialization in deep networks and spectral instabilities in products of random matrices. Given the rich literature on random matrices, it is not surprising to find that the rank of the intermediate representations in unnormalized networks collapses quickly with depth. In this work we highlight the fact that batch normalization is an effective strategy to avoid rank collapse for both linear and ReLU networks. Leveraging tools from Markov chain theory, we derive a meaningful lower rank bound in deep linear networks. Empirically, we also demonstrate that this rank robustness generalizes to ReLU nets. Finally, we conduct an extensive set of experiments on real-world data sets, which confirm that rank stability is indeed a crucial condition for training modern-day deep neural architectures.

Submitted 11 June, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

arXiv:2002.12332 [pdf, ps, other]

Norm relations and computational problems in number fields

Authors: Jean-François Biasse, Claus Fieker, Tommy Hofmann, Aurel Page

Abstract: For a finite group $G$, we introduce a generalization of norm relations in the group algebra $\mathbb Q[G]$. We give necessary and sufficient criteria for the existence of such relations and apply them to obtain relations between the arithmetic invariants of the subfields of a normal extension of algebraic number fields with Galois group $G$. On the algorithmic side this leads to subfield based algorithms for computing rings of integers, $S$-unit groups and class groups. For the $S$-unit group computation this yields a polynomial time reduction to the corresponding problem in subfields. We compute class groups of large number fields under GRH, and new unconditional values of class numbers of cyclotomic fields.

Submitted 14 July, 2021; v1 submitted 27 February, 2020; originally announced February 2020.

MSC Class: Primary: 11Y16; 20C05; 11R32; Secondary 11R29; 11R04; 11Y40; 11R18; 11R27

arXiv:2001.03144 [pdf, other]

Fabrication of optical components with nm- to mm-scale critical features using three-dimensional direct laser writing

Authors: Y. Li, S. Park, M. McLamb, M. Lata, D. Childers, T. Hofmann

Abstract: A powerful fabrication strategy based on three-dimensional direct laser writing for the rapid prototyping of opto-mechanical components with critical features ranging from several hundred nm to a few mm is demonstrated here. As an example, a simple optical fiber connector with optical and mechanical guides as well as integrated micro-optical elements with nano-structured surfaces is designed and fabricated. In contrast to established three-dimensional direct laser writing, two different polymers are combined in the fabrication process in order to achieve a drastic reduction in fabrication time by substantially reducing the "optical tool path". A good agreement between the as-fabricated connector and nominal dimensions has been obtained. The developed approach allows the rapid prototyping of optomechanical components with multi-scale critical features. It is, therefore, envisioned to substantially accelerate the development cycle by integrating functional mechanical and optical elements in a single component.

Submitted 27 September, 2019; originally announced January 2020.

Comments: 4 pages, 4 figures

Journal ref: IEEE HONET-ICT 2019

arXiv:1912.03161 [pdf, other]

doi 10.1007/978-3-030-58539-6_29

Controlling Style and Semantics in Weakly-Supervised Image Generation

Authors: Dario Pavllo, Aurelien Lucchi, Thomas Hofmann

Abstract: We propose a weakly-supervised approach for conditional image generation of complex scenes where a user has fine control over objects appearing in the scene. We exploit sparse semantic maps to control object shapes and classes, as well as textual descriptions or attributes to control both local and global style. In order to condition our model on textual descriptions, we introduce a semantic attention module whose computational cost is independent of the image resolution. To further augment the controllability of the scene, we propose a two-step generation scheme that decomposes background and foreground. The label maps used to train our model are produced by a large-vocabulary object detector, which enables access to unlabeled data and provides structured instance information. In such a setting, we report better FID scores compared to fully-supervised settings where the model is trained on ground-truth semantic maps. We also showcase the ability of our model to manipulate a scene on complex datasets such as COCO and Visual Genome.

Submitted 21 July, 2020; v1 submitted 6 December, 2019; originally announced December 2019.

Comments: European Conference on Computer Vision (ECCV) 2020, Spotlight. Code at https://github.com/dariopavllo/style-semantics

arXiv:1911.10052 [pdf, other]

doi 10.1039/C9NR07143A

Self-Assembly of Liquid Crystals in Nanoporous Solids for Adaptive Photonic Metamaterials

Authors: Kathrin Sentker, Arda Yildirim, Milena Lippmann, Arne W. Zantop, Florian Bertram, Tommy Hofmann, Oliver H. Seeck, Andriy V. Kityk, Marco G. Mazza, Andreas Schönhals, Patrick Huber

Abstract: Nanoporous media exhibit structures significantly smaller than the wavelengths of visible light and can thus act as photonic metamaterials. Their optical functionality is not determined by the properties of the base materials, but rather by tailored, multiscale structures, in terms of precise pore shape, geometry, and orientation. Embedding liquid crystals in pore space provides additional opportunities to control light-matter interactions at the single-pore, meta-atomic scale. Here, we present temperature-dependent 3D reciprocal space mapping using synchrotron-based X-ray diffraction in combination with high-resolution birefringence experiments on disk-like mesogens (HAT6) imbibed in self-ordered arrays of parallel cylindrical pores 17 to 160 nm across in monolithic anodic aluminium oxide (AAO). In agreement with Monte Carlo computer simulations we observe a remarkably rich self-assembly behaviour, unknown from the bulk state. It encompasses transitions between the isotropic liquid state and discotic stacking in linear columns as well as circular concentric ring formation perpendicular and parallel to the pore axis. These textural transitions underpin an optical birefringence functionality, tuneable in magnitude and in sign from positive to negative via pore size, pore surface-grafting and temperature. Our study demonstrates that the advent of large-scale, self-organised nanoporosity in monolithic solids along with confinement-controllable phase behaviour of liquid-crystalline matter at the single-pore scale provides a reliable and accessible tool to design materials with adjustable optical anisotropy, and thus offers versatile pathways to fine-tune polarisation-dependent light propagation speeds in materials. Such a tailorability is at the core of the emerging field of transformative optics, allowing, e.g., adjustable light absorbers and extremely thin metalenses.

Submitted 22 November, 2019; originally announced November 2019.

Comments: Supporting information (SI) available as ancillary file. The SI movies are available at the repository TORE of Hamburg University of Technology (https://doi.org/10.15480/336.2515). This work is dedicated to Prof. Peter S. Pershan (Harvard University), a pioneer in the field of x-ray scattering studies of soft matter, in particular of liquid surfaces and liquid crystals, on his 85th birthday

Journal ref: Nanoscale 11, 23304 (2019)

arXiv:1910.14616 [pdf, other]

Mixing of Stochastic Accelerated Gradient Descent

Authors: Peiyuan Zhang, Hadi Daneshmand, Thomas Hofmann

Abstract: We study the mixing properties of stochastic accelerated gradient descent (SAGD) on least-squares regression. First, we show that stochastic gradient descent (SGD) and SAGD are simulating the same invariant distribution. Motivated by this, we then establish a mixing rate for SAGD-iterates and compare it with that of SGD-iterates. Theoretically, we prove that the chain of SAGD iterates is geometrically ergodic, using a proper choice of parameters and under regularity assumptions on the input distribution. More specifically, we derive an explicit mixing rate depending on the first 4 moments of the data distribution. By means of illustrative examples, we prove that the SAGD-iterate chain mixes faster than the chain of iterates obtained by SGD. Furthermore, we highlight applications of the established mixing rate in the convergence analysis of SAGD on realizable objectives. The proposed analysis is based on a novel non-asymptotic analysis of products of random matrices. This theoretical result is substantiated and validated by experiments.

Submitted 31 October, 2019; originally announced October 2019.

arXiv:1910.07621 [pdf, other]

Recent Results from Polycrystalline CVD Diamond Detectors

Authors: RD42 Collaboration, L. Bäni, A. Alexopoulos, M. Artuso, F. Bachmair, M. Bartosik, H. Beck, V. Bellini, V. Belyaev, B. Bentele, A. Bes, J. -M. Brom, M. Bruzzi, G. Chiodini, D. Chren, V. Cindro, G. Claus, J. Collot, J. Cumalat, A. Dabrowski, R. D'Alessandro, D. Dauvergne, W. de Boer, C. Dorfer, M. Dünser, et al. (87 additional authors not shown)

Abstract: Diamond is a material in use at many nuclear and high energy facilities due to its inherent radiation tolerance and ease of use. We have characterized detectors based on chemical vapor deposition (CVD) diamond before and after proton irradiation. We present preliminary results of the spatial resolution of unirradiated and irradiated CVD diamond strip sensors. In addition, we measured the pulse height versus particle rate of unirradiated and irradiated polycrystalline CVD (pCVD) diamond pad detectors up to a particle flux of $20\,\mathrm{MHz/cm^2}$ and a fluence up to $4 \times 10^{15}\,n/\mathrm{cm^2}$.

Submitted 16 October, 2019; originally announced October 2019.

Comments: Talk presented at the 2019 Meeting of the Division of Particles and Fields of the American Physical Society (DPF2019), July 29 - August 2, 2019, Northeastern University, Boston, C1907293

arXiv:1910.04826 [pdf, other]

doi 10.1116/1.5122991

Direct Laser Writing of Birefringent Photonic Crystals for the Infrared Spectral Range

Authors: Marc Lata, Yanzeng Li, Serang Park, Micheal McLamb, Tino Hofmann

Abstract: Infrared optical photonic crystals fabricated using direct laser writing, which is based on the two-photon polymerization of suitable monomers, have received substantial interest since the emergence of this process. Two-photon polymerization could be a disruptive technology for the fabrication of all-dielectric photonic crystals in the infrared spectral range, as it allows the synthesis of large scale arrays of uniform structures with arbitrary geometries and arrangements. However, all-dielectric photonic crystals that provide birefringent optical responses in the infrared spectral range have not yet been demonstrated using direct laser writing techniques. Here we explore the form birefringence observed in photonic crystals composed of arrays of subwavelength-sized slanted polymer microwires. The photonic crystals investigated here were fabricated in a single fabrication step using direct laser writing of an infrared transparent photoresist (IP-Dip). A strong contrast of the cross-polarized reflectance of photonic crystals as a function of the in-plane orientation is observed in the mid-infrared spectral range at $\lambda \approx 6.5\,\mu\mathrm{m}$. This observation is indicative of an anisotropic optical behavior. Finite element based techniques corroborate the experimentally observed responses qualitatively.

Submitted 28 September, 2019; originally announced October 2019.

arXiv:1909.13662 [pdf, other]

A Stereolithographically Fabricated Polymethacrylate Broadband THz Absorber

Authors: Serang Park, Zackery Z. Clark, Yanzeng Li, Michael McLamb, Tino Hofmann

Abstract: Additively manufactured THz optics have been introduced as an efficient alternative to their commercial counterparts. Among various additive manufacturing methods, stereolithography provides superior spatial resolution and surface finish. However, examples of stereolithographically fabricated components for THz applications are still scarce. In this paper, we report on the fabrication process and performance of a stereolithographically fabricated broadband absorber for the THz spectral range. Simple THz transmission experiments were carried out for the absorber and bulk reference samples. The experimental results indicated that the fabricated absorber effectively absorbs the incident signal in the investigated THz spectral range.

Submitted 26 September, 2019; originally announced September 2019.

arXiv:1909.12955 [pdf, other]

Diffraction Gratings for Uniform Light Extraction from Light Guides

Authors: Micheal McLamb, Yanzeng Li, Serang Park, Marc Lata, Tino Hofmann

Abstract: A theoretical approach for uniformly extracting light propagating through a light guide plate is developed here. Typically, liquid crystal display modalities utilize a backlit lighting system illuminated from the edge with light emitting diodes. The backlight acts as a light guide plate and is coupled with a diffuser sheet along the surface to uniformly extract the light. Our approach employs sub-micron diffraction gratings and eliminates the need for diffuser sheets, while ensuring the uniform extraction of light. The physical dimensions of the grating are varied along the surface of the light guide plate to control the diffraction efficiency, thus determining the local viewing angle and emitted intensity.

Submitted 27 September, 2019; originally announced September 2019.

arXiv:1909.12698 [pdf, ps, other]

doi 10.1116/1.5122801

THz optical properties of polymethacrylates after thermal annealing

Authors: Serang Park, Yanzeng Li, Daniel B. Fullager, Marc Lata, Philipp Kuehne, Vanya Darakchieva, Tino Hofmann

Abstract: Polymer based stereolithographic additive manufacturing has been established for the rapid and low-cost fabrication of THz optical components due to its ability to construct complex 3D geometries with high resolution. For polymer based or integrated optics, thermal annealing processes are often used to optimize material properties. However, despite the growing interest in THz optics fabricated using stereolithography, the effects of thermal annealing on the THz dielectric properties of polymethacrylates compatible with stereolithography have not yet been studied. In this manuscript, we report on the THz ellipsometric response of thermally annealed polymethacrylates prepared using UV polymerization. Our findings indicate that the investigated polymethacrylates maintain a stable optical response in the THz spectral range from 650 to 950 GHz after thermal annealing at temperatures up to 70 degrees C for several hours.

Submitted 27 September, 2019; originally announced September 2019.

arXiv:1909.10860 [pdf, ps, other]

On the computation of overorders

Authors: Tommy Hofmann, Carlo Sircana

Abstract: The computation of a maximal order of an order in a semisimple algebra over a global field is a classical well-studied problem in algorithmic number theory. In this paper we consider the related problems of computing all minimal overorders as well as all overorders of a given order. We use techniques from algorithmic representation theory and the theory of minimal integral ring extensions to obtain efficient and practical algorithms, whose implementation is publicly available.

Submitted 24 September, 2019; originally announced September 2019.

MSC Class: 11Y40; 11R04

arXiv:1909.01646 [pdf, other]

LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games

Authors: Leonard Adolphs, Thomas Hofmann

Abstract: While Reinforcement Learning (RL) approaches have led to significant achievements in a variety of areas in recent history, natural language tasks have remained mostly unaffected, due to the compositional and combinatorial nature that makes them notoriously hard to optimize. With the emerging field of Text-Based Games (TBGs), researchers try to bridge this gap. Inspired by the success of RL algorithms on Atari games, the idea is to develop new methods in a restricted game world and then gradually move to more complex environments. Previous work in the area of TBGs has mainly focused on solving individual games. We, however, consider the task of designing an agent that not just succeeds in a single game, but performs well across a whole family of games, sharing the same theme. In this work, we present our deep RL agent--LeDeepChef--that shows generalization capabilities to never-before-seen games of the same family with different environments and task descriptions. The agent participated in Microsoft Research's "First TextWorld Problems: A Language and Reinforcement Learning Challenge" and outperformed all but one competitor on the final test set. The games from the challenge all share the same theme, namely cooking in a modern house environment, but differ significantly in the arrangement of the rooms, the presented objects, and the specific goal (recipe to cook).
To build an agent that achieves high scores across a whole family of games, we use an actor-critic framework and prune the action-space by using ideas from hierarchical reinforcement learning and a specialized module trained on a recipe database.

Submitted 4 September, 2019; originally announced September 2019.

arXiv:1908.11658 [pdf, ps, other]

Autoregressive Text Generation Beyond Feedback Loops

Authors: Florian Schmidt, Stephan Mandt, Thomas Hofmann

Abstract: Autoregressive state transitions, where predictions are conditioned on past predictions, are the predominant choice for both deterministic and stochastic sequential models. However, autoregressive feedback exposes the evolution of the hidden state trajectory to potential biases from well-known train-test discrepancies. In this paper, we combine a latent state space model with a CRF observation model. We argue that such autoregressive observation models form an interesting middle ground that expresses local correlations on the word level but keeps the state evolution non-autoregressive. On unconditional sentence generation we show performance improvements compared to RNN and GAN baselines while avoiding some prototypical failure modes of autoregressive models.

Submitted 30 August, 2019; originally announced August 2019.

Comments: emnlp camera ready

arXiv:1908.05519 [pdf, other]

Cosmological N-body simulations: a challenge for scalable generative models

Authors: Nathanaël Perraudin, Ankit Srivastava, Aurelien Lucchi, Tomasz Kacprzak, Thomas Hofmann, Alexandre Réfrégier

Abstract: Deep generative models, such as Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs), have been demonstrated to produce images of high visual quality. However, the existing hardware severely limits the size of the images that can be generated. The rapid growth of high dimensional data in many fields of science therefore poses a significant challenge for generative models. In cosmology, the large-scale, three-dimensional matter distribution, modeled with N-body simulations, plays a crucial role in understanding the evolution of the universe. As these simulations are computationally very expensive, GANs have recently generated interest as a possible method to emulate these datasets, but they have been, so far, mostly limited to two dimensional data. In this work, we introduce a new benchmark for the generation of three dimensional N-body simulations, in order to stimulate new ideas in the machine learning community and move closer to the practical use of generative models in cosmology. As a first benchmark result, we propose a scalable GAN approach for training a generator of N-body three-dimensional cubes. Our technique relies on two key building blocks, (i) splitting the generation of the high-dimensional data into smaller parts, and (ii) using a multi-scale approach that efficiently captures global image features that might otherwise be lost in the splitting process. We evaluate the performance of our model for the generation of N-body samples using various statistical measures commonly used in cosmology.
Our results show that the proposed model produces samples of high visual quality, although the statistical analysis reveals that capturing rare features in the data poses significant problems for the generative models. We make the data, quality evaluation routines, and the proposed GAN architecture publicly available at https://github.com/nperraud/3DcosmoGAN

Submitted 18 December, 2019; v1 submitted 15 August, 2019; originally announced August 2019.

arXiv:1908.02759 [pdf, other]

doi 10.1103/PhysRevResearch.2.023265

Reciprocal skin effect and its realization in a topolectrical circuit

Authors: Tobias Hofmann, Tobias Helbig, Frank Schindler, Nora Salgo, Marta Brzezińska, Martin Greiter, Tobias Kiessling, David Wolf, Achim Vollhardt, Anton Kabaši, Ching Hua Lee, Ante Bilušić, Ronny Thomale, Titus Neupert

Abstract: A system is non-Hermitian when it exchanges energy with its environment and non-reciprocal when it behaves differently upon the interchange of input and response. Within the field of metamaterial research on synthetic topological matter, the skin effect describes the conspiracy of non-Hermiticity and non-reciprocity to yield extensive anomalous localization of all eigenmodes in a (quasi) one-dimensional geometry. Here, we introduce the reciprocal skin effect, which occurs in non-Hermitian but reciprocal systems in two or more dimensions: Eigenmodes with opposite longitudinal momentum exhibit opposite transverse anomalous localization. We experimentally demonstrate the reciprocal skin effect in a passive RLC circuit, suggesting convenient alternative implementations in optical, acoustic, mechanical, and related platforms. Skin mode localization brings forth potential applications in directional and polarization detectors for electromagnetic waves.

Submitted 4 June, 2020; v1 submitted 7 August, 2019; originally announced August 2019.

Comments: 12 pages, 5 figures, accepted manuscript

Journal ref: Phys. Rev. Research 2, 023265 (2020)

arXiv:1907.13299 [pdf, ps, other]

doi 10.1007/s10762-019-00616-x

Terahertz to mid-infrared dielectric properties of polymethacrylates for stereolithographic single layer assembly

Authors: Serang Park, Yanzeng Li, Daniel B. Fullager, Stefan Schöche, Craig M. Herzinger, Glenn D. Boreman, Tino Hofmann

Abstract: The fabrication of terahertz (THz) optics with arbitrary shapes via poly-methacrylate-based stereolithography is very attractive as it may offer a rapid, low-cost avenue towards optimized THz imaging applications. In order to design such THz optical components appropriately, accurate knowledge of the complex dielectric function of the materials used for stereolithographic fabrication is crucial. In this paper we report on the complex dielectric functions of several polymethacrylates frequently used for stereolithographic fabrication. Spectroscopic ellipsometry data sets from the THz to mid-infrared spectral range were obtained from isotropically cross-linked polymethacrylate samples. The data sets were analyzed using stratified layer optical model calculations with parameterized model dielectric functions. While the infrared spectral range is dominated by a number of strong absorption features with Gaussian profiles, these materials are found to exhibit only weak absorption in the THz frequency range. In conclusion, we find that thin transmissive THz optics can be efficiently fabricated using polymethacrylate-based stereolithographic fabrication.

Submitted 30 July, 2019; originally announced July 2019.

Journal ref: J. Infrared, Millimeter, Terahertz Waves, vol. 40, no. 9, pp. 971-979, 2019

arXiv:1907.11562 [pdf, other]

doi 10.1038/s41567-020-0922-9

Observation of bulk boundary correspondence breakdown in topolectrical circuits

Authors: Tobias Helbig, Tobias Hofmann, Stefan Imhof, Mohamed Abdelghany, Tobias Kiessling, Laurens W. Molenkamp, Ching Hua Lee, Alexander Szameit, Martin Greiter, Ronny Thomale

Abstract: The study of the laws of nature has traditionally been pursued in the limit of isolated systems, where energy is conserved. This is not always a valid approximation, however, as the inclusion of features like gain and loss, or periodic driving, qualitatively amends these laws. A contemporary frontier of meta-material research is the challenge open systems pose to the established characterization of topological matter. There, one of the most relied upon principles is the bulk-boundary correspondence (BBC), which intimately relates the properties of the surface states to the topological classification of the bulk. The presence of gain and loss, in combination with the violation of reciprocity, has recently been predicted to affect this principle dramatically. Here, we report the experimental observation of BBC violation in a non-reciprocal topolectric circuit. The circuit admittance spectrum exhibits an unprecedented sensitivity to the presence of a boundary, displaying an extensive admittance mode localization despite a translationally invariant bulk. Intriguingly, we measure a non-local voltage response due to broken BBC. Depending on the AC current feed frequency, the voltage signal accumulates at the left or right boundary, and increases as a function of nodal distance to the current feed.

Submitted 26 July, 2019; originally announced July 2019.

Comments: 36 pages, 9 figures

arXiv:1906.09455 [pdf, other]

Characterization of the soft X-ray spectrometer PEAXIS at BESSY II

Authors: Christian Schulz, Klaus Lieutenant, Jie Xiao, Tommy Hofmann, Deniz Wong, Klaus Habicht

Abstract: The performance of the recently commissioned spectrometer PEAXIS for resonant inelastic soft X-ray scattering (RIXS) and X-ray photoelectron spectroscopy (XPS) and its hosting beamline U41-PEAXIS at the BESSY II synchrotron are characterized. The beamline provides linearly polarized light from 180 eV - 1600 eV allowing for RIXS measurements in the range of 200 eV - 1200 eV. The monochromator optics can be operated in different configurations for the benefit of either high flux, providing up to $10^{12}$ photons/s within the focal spot at the sample, or high energy resolution with a full width at half maximum of <40 meV at an incident photon energy of ~400 eV. This measured total energy resolution of the RIXS spectrometer is in very good agreement with the values predicted theoretically by ray-tracing simulations. PEAXIS features a 5 m long RIXS spectrometer arm that can be continuously rotated about the sample position by 106° within the horizontal photon scattering plane, thus enabling the study of momentum-transfer-dependent excitations. To demonstrate the instrument capabilities, d-d excitations and magnetic excitations have been measured on single-crystalline NiO. Measurements employing a fluid cell demonstrate the vibrational progression in liquid acetone. Planned upgrades of the beamline and the RIXS spectrometer that will further increase the energy resolution by 20 - 30% to ~100 meV at 1000 eV incident photon energy are discussed.

Submitted 22 June, 2019; originally announced June 2019.

Comments: 12 pages, 14 figures

arXiv:1906.08555 [pdf, ps, other]

On Gröbner bases over Dedekind domains

Authors: Tommy Hofmann

Abstract: Gröbner bases are a fundamental tool when studying ideals in multivariate polynomial rings. More recently there has been a growing interest in transferring techniques from the field case to other coefficient rings, most notably Euclidean domains and principal ideal rings. In this paper we will consider multivariate polynomial rings over Dedekind domains. By generalizing methods from the theory of finitely generated projective modules, we show that it is possible to describe Gröbner bases over Dedekind domains in a way similar to the case of principal ideal domains, both from a theoretical and algorithmic point of view.

Submitted 16 April, 2020; v1 submitted 20 June, 2019; originally announced June 2019.

MSC Class: 13P10

arXiv:1906.08543 [pdf, ps, other]

Efficient Gröbner Bases Computation over Principal Ideal Rings

Authors: Christian Eder, Tommy Hofmann

Abstract: In this paper we present a new efficient variant to compute strong Gröbner bases over quotients of principal ideal domains. We show an easy lifting process which allows us to reduce one computation over the quotient $R/nR$ to two computations over $R/aR$ and $R/bR$ where $n = ab$ with coprime $a, b$. Possibly using available factorization algorithms we may thus recursively reduce some strong Gröbner basis computations to Gröbner basis computations over fields for prime factors of $n$, at least for squarefree $n$. Considering now a computation over $R/nR$ we can run a standard Gröbner basis algorithm pretending $R/nR$ to be a field. If we discover a non-invertible leading coefficient $c$, we use this information to try to split $n = ab$ with coprime $a, b$. If no such $c$ is discovered, the returned Gröbner basis is already a strong Gröbner basis for the input ideal over $R/nR$.

Submitted 20 June, 2019; originally announced June 2019.
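
The splitting step mentioned in the abstract above (using a non-invertible leading coefficient $c$ to try to write $n = ab$ with coprime $a, b$) can be illustrated for the special case $R = \mathbb{Z}$. The following is only a minimal sketch under that assumption, not the authors' implementation; the function name `coprime_split` and the gcd-saturation loop are illustrative choices made here.

```python
from math import gcd
from typing import Optional, Tuple


def coprime_split(n: int, c: int) -> Optional[Tuple[int, int]]:
    """Try to split n = a * b with gcd(a, b) == 1, using an element c.

    Hypothetical helper for R = Z. Returns None if c is invertible
    modulo n, or if c only reveals the trivial split (b == 1).
    """
    g = gcd(c % n, n)
    if g == 1:
        return None  # c is a unit modulo n: nothing learned

    # Saturate a: keep absorbing the common part of a and n // a until
    # the two cofactors are coprime. a always divides n, and it grows
    # strictly in each iteration, so the loop terminates.
    a = g
    while True:
        h = gcd(a, n // a)
        if h == 1:
            break
        a *= h

    b = n // a
    if b == 1:
        return None  # every prime factor of n divides c: no coprime split
    return a, b


print(coprime_split(12, 2))  # (4, 3)
print(coprime_split(8, 2))   # None
```

In the first call, $c = 2$ is not invertible modulo 12 and yields the coprime factors 4 and 3; in the second, $n = 8$ is a prime power, so no coprime split exists and the sketch returns `None`.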

arXiv:1906.03156 [pdf, other]

doi 10.1103/PhysRevD.100.063514

Cosmological constraints with deep learning from KiDS-450 weak lensing maps

Authors: Janis Fluri, Tomasz Kacprzak, Aurelien Lucchi, Alexandre Refregier, Adam Amara, Thomas Hofmann, Aurel Schneider

Abstract: Convolutional Neural Networks (CNN) have recently been demonstrated on synthetic data to improve upon the precision of cosmological inference. In particular they have the potential to yield more precise cosmological constraints from weak lensing mass maps than the two-point functions. We present the cosmological results with a CNN from the KiDS-450 tomographic weak lensing dataset, constraining the total matter density $Ω_m$, the fluctuation amplitude $σ_8$, and the intrinsic alignment amplitude $A_{\rm{IA}}$. We use a grid of N-body simulations to generate a training set of tomographic weak lensing maps. We test the robustness of the expected constraints to various effects, such as baryonic feedback, simulation accuracy, different values of $H_0$, or the lightcone projection technique. We train a set of ResNet-based CNNs with varying depths to analyze sets of tomographic KiDS mass maps divided into 20 flat regions, with applied Gaussian smoothing of $σ=2.34$ arcmin. The uncertainties on shear calibration and $n(z)$ error are marginalized in the likelihood pipeline. Following a blinding scheme, we derive constraints of $S_8 = σ_8 (Ω_m/0.3)^{0.5} = 0.777^{+0.038}_{-0.036}$ with our CNN analysis, with $A_{\rm{IA}}=1.398^{+0.779}_{-0.724}$. We compare this result to the power spectrum analysis on the same maps and likelihood pipeline and find an improvement of about $30\%$ for the CNN. We discuss how our results offer excellent prospects for the use of deep learning in future cosmological data analysis.

Submitted 16 September, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

Comments: 22 pages, 15 figures

Journal ref: Phys. Rev. D 100, 063514 (2019)
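
The derived parameter quoted in the abstract above, $S_8 = σ_8 (Ω_m/0.3)^{0.5}$, is a simple combination of the two sampled parameters. As a small illustration of the definition only, the sketch below evaluates it for given inputs; the function name `s8` and the example input values are assumptions chosen here and are not taken from the paper.

```python
def s8(sigma_8: float, omega_m: float, alpha: float = 0.5) -> float:
    """Evaluate S_8 = sigma_8 * (Omega_m / 0.3) ** alpha.

    alpha = 0.5 matches the exponent quoted in the abstract; the
    argument names are illustrative.
    """
    return sigma_8 * (omega_m / 0.3) ** alpha


# At Omega_m = 0.3 the scaling factor is 1, so S_8 equals sigma_8.
print(s8(0.8, 0.3))  # 0.8
```

Because the combination runs along the usual weak lensing degeneracy between $σ_8$ and $Ω_m$, different parameter pairs can give the same $S_8$.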

arXiv:1906.01527 [pdf, other]

Adversarial Training is a Form of Data-dependent Operator Norm Regularization

Authors: Kevin Roth, Yannic Kilcher, Thomas Hofmann

Abstract: We establish a theoretical link between adversarial training and operator norm regularization for deep neural networks. Specifically, we prove that $\ell_p$-norm constrained projected gradient ascent based adversarial training with an $\ell_q$-norm loss on the logits of clean and perturbed inputs is equivalent to data-dependent (p, q) operator norm regularization. This fundamental connection confirms the long-standing argument that a network's sensitivity to adversarial examples is tied to its spectral properties and hints at novel ways to robustify and defend against adversarial attacks. We provide extensive empirical evidence on state-of-the-art network architectures to support our theoretical results.

Submitted 23 October, 2020; v1 submitted 4 June, 2019; originally announced June 2019.

Comments: NeurIPS2020

arXiv:1905.07323

doi 10.1021/acs.chemmater.9b01807

Understanding the correlation between electronic coupling and energetic stability of molecular crystal polymorphs: The instructive case of quinacridone

Authors: Christian Winkler, Andreas Jeindl, Florian Mayer, Oliver T. Hofmann, Ralf Tonner, Egbert Zojer

Abstract: A crucial factor determining charge transport in organic semiconductors is the electronic coupling between the molecular constituents, which is heavily influenced by the relative arrangement of the molecules. This renders quinacridone, with its multiple, structurally fundamentally different polymorphs and their diverse intermolecular interactions, an ideal test case for analyzing the correlation between the electronic coupling in a specific configuration and the configuration's energetic stability. To provide an in-depth analysis of this correlation, starting from the $α$-polymorph of quinacridone, we also construct a coplanar model crystal. This allows us to systematically compare the displacement-dependence of the electronic coupling with that of the total energy. In this way, we identify the combination of Pauli repulsion and orbital rehybridization as the driving force steering the system towards a structure in which the electronic coupling is minimal (especially for the valence band and at small displacements). The general nature of these observations is supported by equivalent trends for an analogous pentacene model system. This underlines that the design of high-performance materials cannot rely on the "natural" assembly of the $π$-conjugated backbones of organic semiconductors into their most stable configurations.
Rather, it must include the incorporation of functional groups that steer crystal packing towards more favorable structures, where aiming for short-axis displacements or realizing comparably large long-axis displacements appear as strategies worth exploring.

Submitted 5 August, 2019; v1 submitted 17 May, 2019; originally announced May 2019.

Comments: This article has been removed by arXiv administrators because the submitter did not have the rights to agree to the license at the time of submission

Journal ref: Chem. Mater. 2019, 31, 17, 7054-7069

arXiv:1904.10183 [pdf, other]

doi 10.1038/s41467-020-17716-1

Imaging nodal knots in momentum space through topolectrical circuits

Authors: Ching Hua Lee, Amanda Sutrisno, Tobias Hofmann, Tobias Helbig, Yuhan Liu, Yee Sin Ang, Lay Kee Ang, Xiao Zhang, Martin Greiter, Ronny Thomale

Abstract: Knots are intricate structures that cannot be unambiguously distinguished with any single topological invariant. Momentum space knots, in particular, have been elusive due to their requisite finely tuned long-ranged hoppings. Even if constructed, probing their intricate linkages and topological "drumhead" surface states will be challenging due to the high precision needed. In this work, we overcome these practical and technical challenges with RLC circuits, transcending existing theoretical constructions which necessarily break reciprocity, by pairing nodal knots with their mirror image partners in a fully reciprocal setting. Our nodal knot circuits can be characterized with impedance measurements that resolve their drumhead states and image their 3D nodal structure. Doing so allows for reconstruction of the Seifert surface and hence knot topological invariants like the Alexander polynomial. We illustrate our approach with large-scale simulations of various nodal knots and an experiment that maps out the topological drumhead region of a Hopf-link.

Submitted 11 July, 2020; v1 submitted 23 April, 2019; originally announced April 2019.

Comments: 37 pages, 14 figures, 14 tables

Journal ref: Nature Communications, 11, 4385 (2020)

arXiv:1902.04818 [pdf, other]

The Odds are Odd: A Statistical Test for Detecting Adversarial Examples

Authors: Kevin Roth, Yannic Kilcher, Thomas Hofmann

Abstract: We investigate conditions under which test statistics exist that can reliably detect examples that have been adversarially manipulated in a white-box attack. These statistics can be easily computed and calibrated by randomly corrupting inputs. They exploit certain anomalies that adversarial attacks introduce, in particular if they follow the paradigm of choosing perturbations optimally under p-norm constraints. Access to the log-odds is the only requirement to defend models. We justify our approach empirically, but also provide conditions under which detectability via the suggested test statistics is guaranteed to be effective. In our experiments, we show that it is even possible to correct test time predictions for adversarial attacks with high accuracy.

Submitted 9 May, 2019; v1 submitted 13 February, 2019; originally announced February 2019.

arXiv:1811.11702 [pdf, other]

doi 10.1016/j.cpc.2019.06.010

SAMPLE: Surface structure search enabled by coarse graining and statistical learning

Authors: Lukas Hörmann, Andreas Jeindl, Alexander T. Egger, Michael Scherbela, Oliver T. Hofmann

Abstract: In this publication we introduce SAMPLE, a structure search approach for commensurate organic monolayers on inorganic substrates. Such monolayers often show rich polymorphism with diverse molecular arrangements in differently shaped unit cells. Determining the different commensurate polymorphs from first principles poses a major challenge due to the large number of possible molecular arrangements. To meet this challenge, SAMPLE employs coarse-grained modeling in combination with Bayesian linear regression to efficiently map the minima of the potential energy surface. In addition, it uses ab initio thermodynamics to generate phase diagrams. Using the example of naphthalene on Cu(111), we comprehensively explain the SAMPLE approach and demonstrate its capabilities by comparing the predicted with the experimentally observed polymorphs.

Submitted 9 August, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

arXiv:1811.06190 [pdf, ps, other]

The conjugacy problem in $GL(n,Z)$

Authors: Bettina Eick, Tommy Hofmann, E. A. O'Brien

Abstract: We present a new algorithm that, given two matrices in $GL(n,Q)$, decides if they are conjugate in $GL(n,Z)$ and, if so, determines a conjugating matrix. We also give an algorithm to construct a generating set for the centraliser in $GL(n,Z)$ of a matrix in $GL(n,Q)$. We do this by reducing these problems respectively to the isomorphism and automorphism group problems for certain modules over rings of the form $\mathcal O_K[y]/(y^l)$, where $\mathcal O_K$ is the maximal order of an algebraic number field and $l \in N$, and then provide algorithms to solve the latter. The algorithms are practical and our implementations are publicly available in Magma.

Submitted 10 May, 2019; v1 submitted 15 November, 2018; originally announced November 2018.

MSC Class: 20C15; 20G30; 20-04

arXiv:1811.05512 [pdf, other]

A domain agnostic measure for monitoring and evaluating GANs

Authors: Paulina Grnarova, Kfir Y Levy, Aurelien Lucchi, Nathanael Perraudin, Ian Goodfellow, Thomas Hofmann, Andreas Krause

Abstract: Generative Adversarial Networks (GANs) have shown remarkable results in modeling complex distributions, but their evaluation remains an unsettled issue. Evaluations are essential for: (i) relative assessment of different models and (ii) monitoring the progress of a single model throughout training. The latter cannot be determined by simply inspecting the generator and discriminator loss curves as they behave non-intuitively. We leverage the notion of duality gap from game theory to propose a measure that addresses both (i) and (ii) at a low computational cost. Extensive experiments show the effectiveness of this measure to rank different GAN models and capture the typical GAN failure scenarios, including mode collapse and non-convergent behaviours. This evaluation metric also provides meaningful monitoring on the progression of the loss during training. It highly correlates with FID on natural image datasets, and with domain specific scores for text, sound and cosmology data where FID is not directly suitable. In particular, our proposed metric requires no labels or a pretrained classifier, making it domain agnostic.

Submitted 15 July, 2020; v1 submitted 13 November, 2018; originally announced November 2018.

arXiv:1809.08687 [pdf, other]

doi 10.1103/PhysRevLett.122.247702

Chiral voltage propagation in a self-calibrated topolectrical Chern circuit

Authors: Tobias Hofmann, Tobias Helbig, Ching Hua Lee, Martin Greiter, Ronny Thomale

Abstract: We propose an electric circuit array with topologically protected uni-directional voltage modes at its boundary. Instead of external bias fields or Floquet engineering, we employ negative impedance converters with current inversion (INICs) to accomplish a non-reciprocal, time-reversal symmetry broken electronic network we call topolectrical Chern circuit (TCC). The TCC features an admittance bulk gap fully tunable via the resistors used in the INICs, along with a chiral voltage boundary mode reminiscent of the Berry flux monopole present in the admittance band structure. The active circuit elements in the TCC can be calibrated to compensate for dissipative loss.

Submitted 7 October, 2018; v1 submitted 23 September, 2018; originally announced September 2018.

Comments: 11 pages double column style, 6 figures; further material added to previous version

Journal ref: Phys. Rev. Lett. 122, 247702 (2019)

arXiv:1809.08621 [pdf, ps, other]

Learning and Evaluating Sparse Interpretable Sentence Embeddings

Authors: Valentin Trifonov, Octavian-Eugen Ganea, Anna Potapenko, Thomas Hofmann

Abstract: Previous research on word embeddings has shown that sparse representations, which can be either learned on top of existing dense embeddings or obtained through model constraints during training time, have the benefit of increased interpretability properties: to some degree, each dimension can be understood by a human and associated with a recognizable feature in the data. In this paper, we transfer this idea to sentence embeddings and explore several approaches to obtain a sparse representation. We further introduce a novel, quantitative and automated evaluation metric for sentence embedding interpretability, based on topic coherence methods. We observe an increase in interpretability compared to dense models, on a dataset of movie dialogs and on the scene descriptions from the MS COCO dataset.

Submitted 25 September, 2018; v1 submitted 23 September, 2018; originally announced September 2018.

Comments: Will be presented at the workshop "Analyzing and interpreting neural networks for NLP", collocated with the EMNLP 2018 conference in Brussels

arXiv:1808.07699 [pdf, other]

End-to-End Neural Entity Linking

Authors: Nikolaos Kolitsas, Octavian-Eugen Ganea, Thomas Hofmann

Abstract: Entity Linking (EL) is an essential task for semantic text understanding and information extraction. Popular methods separately address the Mention Detection (MD) and Entity Disambiguation (ED) stages of EL, without leveraging their mutual dependency. We here propose the first neural end-to-end EL system that jointly discovers and links entities in a text document. The main idea is to consider all possible spans as potential mentions and learn contextual similarity scores over their entity candidates that are useful for both MD and ED decisions. Key components are context-aware mention embeddings, entity embeddings and a probabilistic mention - entity map, without demanding other engineered features. Empirically, we show that our end-to-end method significantly outperforms popular systems on the Gerbil platform when enough training data is available. Conversely, if testing datasets follow different annotation conventions compared to the training set (e.g. queries/ tweets vs news documents), our ED model coupled with a traditional NER system offers the best or second best EL accuracy.

Submitted 29 August, 2018; v1 submitted 23 August, 2018; originally announced August 2018.

Comments: Full paper at CoNLL 2018: Conference on Computational Natural Language Learning

arXiv:1807.09555 [pdf, other]

doi 10.1103/PhysRevB.99.161114

Band structure engineering and reconstruction in electric circuit networks

Authors: Tobias Helbig, Tobias Hofmann, Ching Hua Lee, Ronny Thomale, Stefan Imhof, Laurens W. Molenkamp, Tobias Kiessling

Abstract: We develop an approach to design, engineer, and measure band structures in a synthetic crystal composed of electric circuit elements. Starting from the nodal analysis of a circuit lattice in terms of currents and voltages, our Laplacian formalism for synthetic matter allows us to investigate arbitrary tight-binding models in terms of wave number resolved Laplacian eigenmodes, yielding an admittance band structure of the circuit. For illustration, we model and measure a honeycomb circuit featuring a Dirac cone admittance bulk dispersion as well as flat band admittance edge modes at its bearded and zigzag terminations. We further employ our circuit band analysis to measure a topological phase transition in the topolectrical Su-Schrieffer-Heeger circuit.

Submitted 25 July, 2018; originally announced July 2018.

Comments: 4+e pages, 4 pages supplement, 5 figures

Journal ref: Phys. Rev. B 99, 161114 (2019)

arXiv:1807.08732 [pdf, other]

doi 10.1103/PhysRevD.98.123518

Cosmological constraints from noisy convergence maps through deep learning

Authors: Janis Fluri, Tomasz Kacprzak, Aurelien Lucchi, Alexandre Refregier, Adam Amara, Thomas Hofmann

Abstract: Deep learning is a powerful analysis technique that has recently been proposed as a method to constrain cosmological parameters from weak lensing mass maps. Due to its ability to learn relevant features from the data, it is able to extract more information from the mass maps than the commonly used power spectrum, and thus achieve better precision for cosmological parameter measurement. We explore the advantage of Convolutional Neural Networks (CNN) over the power spectrum for varying levels of shape noise and different smoothing scales applied to the maps. We compare the cosmological constraints from the two methods in the $Ω_M-σ_8$ plane for sets of 400 deg$^2$ convergence maps. We find that, for a shape noise level corresponding to 8.53 galaxies/arcmin$^2$ and the smoothing scale of $σ_s = 2.34$ arcmin, the network is able to generate 45% tighter constraints. For smaller smoothing scale of $σ_s = 1.17$ the improvement can reach $\sim 50 \%$, while for larger smoothing scale of $σ_s = 5.85$, the improvement decreases to 19%. The advantage generally decreases when the noise level and smoothing scales increase. We present a new training strategy to train the neural network with noisy data, as well as considerations for practical applications of the deep learning approach.

Submitted 30 November, 2018; v1 submitted 23 July, 2018; originally announced July 2018.

Comments: 17 pages, 12 figures

Journal ref: Phys. Rev. D 98, 123518 (2018)

arXiv:1806.08631 [pdf, ps, other]

Computing isomorphisms between lattices

Authors: Tommy Hofmann, Henri Johnston

Abstract: Let K be a number field, let A be a finite dimensional semisimple K-algebra and let Lambda be an O_K-order in A. It was shown in previous work that, under certain hypotheses on A, there exists an algorithm that for a given (left) Lambda-lattice X either computes a free basis of X over Lambda or shows that X is not free over Lambda. In the present article, we generalise this by showing that, under weaker hypotheses on A, there exists an algorithm that for two given Lambda-lattices X and Y either computes an isomorphism X -> Y or determines that X and Y are not isomorphic. The algorithm is implemented in Magma for A=Q[G], Lambda=Z[G] and Lambda-lattices X and Y contained in Q[G], where G is a finite group satisfying certain hypotheses. This is used to investigate the Galois module structure of rings of integers and ambiguous ideals of tamely ramified Galois extensions of Q with Galois group isomorphic to Q_8 x C_2, the direct product of the quaternion group of order 8 and the cyclic group of order 2.

Submitted 2 March, 2020; v1 submitted 22 June, 2018; originally announced June 2018.

Comments: 30 pages; v3 revised and accepted version to appear in Mathematics of Computation; v2 has many minor corrections with additional explanation in section 10

MSC Class: 11R33; 11Y40; 16Z05

arXiv:1806.07569 [pdf, other]

A Distributed Second-Order Algorithm You Can Trust

Authors: Celestine Dünner, Aurelien Lucchi, Matilde Gargiani, An Bian, Thomas Hofmann, Martin Jaggi

Abstract: Due to the rapid growth of data and computational resources, distributed optimization has become an active research area in recent years. While first-order methods seem to dominate the field, second-order methods are nevertheless attractive as they potentially require fewer communication rounds to converge. However, there are significant drawbacks that impede their wide adoption, such as the computation and the communication of a large Hessian matrix. In this paper we present a new algorithm for distributed training of generalized linear models that only requires the computation of diagonal blocks of the Hessian matrix on the individual workers. To deal with this approximate information we propose an adaptive approach that - akin to trust-region methods - dynamically adapts the auxiliary model to compensate for modeling errors. We provide theoretical rates of convergence for a wide class of problems including L1-regularized objectives. We also demonstrate that our approach achieves state-of-the-art results on multiple large benchmark datasets.

Submitted 20 June, 2018; originally announced June 2018.

Comments: appearing at ICML 2018 - Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, PMLR 80, 2018

arXiv:1806.04550 [pdf, ps, other]

Deep State Space Models for Unconditional Word Generation

Authors: Florian Schmidt, Thomas Hofmann

Abstract: Autoregressive feedback is considered a necessity for successful unconditional text generation using stochastic sequence models. However, such feedback is known to introduce systematic biases into the training process and it obscures a principle of generation: committing to global information and forgetting local nuances. We show that a non-autoregressive deep state space model with a clear separation of global and local uncertainty can be built from only two ingredients: An independent noise source and a deterministic transition function. Recent advances on flow-based variational inference can be used to train an evidence lower-bound without resorting to annealing, auxiliary losses or similar measures. The result is a highly interpretable generative model on par with comparable auto-regressive models on the task of word generation.

Submitted 28 October, 2018; v1 submitted 12 June, 2018; originally announced June 2018.

Comments: NIPS camera-ready version

arXiv:1805.10694 [pdf, other]

Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization

Authors: Jonas Kohler, Hadi Daneshmand, Aurelien Lucchi, Ming Zhou, Klaus Neymeyr, Thomas Hofmann

Abstract: Normalization techniques such as Batch Normalization have been applied successfully for training deep neural networks. Yet, despite its apparent empirical benefits, the reasons behind the success of Batch Normalization are mostly hypothetical. We here aim to provide a more thorough theoretical understanding from a classical optimization perspective. Our main contribution towards this goal is the identification of various problem instances in the realm of machine learning where, under certain assumptions, Batch Normalization can provably accelerate optimization. We argue that this acceleration is due to the fact that Batch Normalization splits the optimization task into optimizing length and direction of the parameters separately. This allows gradient-based methods to leverage a favourable global structure in the loss landscape that we prove to exist in Learning Halfspace problems and neural network training with Gaussian inputs. We thereby turn Batch Normalization from an effective practical heuristic into a provably converging algorithm for these settings. Furthermore, we substantiate our analysis with empirical evidence that suggests the validity of our theoretical results in a broader context.

Submitted 6 October, 2018; v1 submitted 27 May, 2018; originally announced May 2018.

arXiv:1805.10338 [pdf, other]

Zero-Shot Dual Machine Translation

Authors: Lierni Sestorain, Massimiliano Ciaramita, Christian Buck, Thomas Hofmann

Abstract: Neural Machine Translation (NMT) systems rely on large amounts of parallel data. This is a major challenge for low-resource languages. Building on recent work on unsupervised and semi-supervised methods, we present an approach that combines zero-shot and dual learning. The latter relies on reinforcement learning, to exploit the duality of the machine translation task, and requires only monolingual data for the target language pair. Experiments show that a zero-shot dual system, trained on English-French and English-Spanish, outperforms by large margins a standard NMT system in zero-shot translation performance on Spanish-French (both directions). The zero-shot dual method approaches the performance, within 2.2 BLEU points, of a comparable supervised setting. Our method can obtain improvements also on the setting where a small amount of parallel data for the zero-shot language pair is available. Adding Russian, to extend our experiments to jointly modeling 6 zero-shot translation directions, all directions improve between 4 and 15 BLEU points, again, reaching performance near that of the supervised setting.

Submitted 25 May, 2018; originally announced May 2018.

arXiv:1805.09112 [pdf, other]

Hyperbolic Neural Networks

Authors: Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann

Abstract: Hyperbolic spaces have recently gained momentum in the context of machine learning due to their high capacity and tree-likeliness properties. However, the representational power of hyperbolic geometry is not yet on par with Euclidean geometry, mostly because of the absence of corresponding hyperbolic neural network layers. This makes it hard to use hyperbolic embeddings in downstream tasks. Here, we bridge this gap in a principled manner by combining the formalism of Möbius gyrovector spaces with the Riemannian geometry of the Poincaré model of hyperbolic spaces. As a result, we derive hyperbolic versions of important deep learning tools: multinomial logistic regression, feed-forward and recurrent neural networks such as gated recurrent units. This allows to embed sequential data and perform classification in the hyperbolic space. Empirically, we show that, even if hyperbolic optimization tools are limited, hyperbolic sentence embeddings either outperform or are on par with their Euclidean variants on textual entailment and noisy-prefix recognition tasks.

Submitted 28 June, 2018; v1 submitted 23 May, 2018; originally announced May 2018.

arXiv:1805.08736 [pdf, other]

Adversarially Robust Training through Structured Gradient Regularization

Authors: Kevin Roth, Aurelien Lucchi, Sebastian Nowozin, Thomas Hofmann

Abstract: We propose a novel data-dependent structured gradient regularizer to increase the robustness of neural networks vis-a-vis adversarial perturbations. Our regularizer can be derived as a controlled approximation from first principles, leveraging the fundamental link between training with noise and regularization. It adds very little computational overhead during learning and is simple to implement generically in standard deep learning frameworks. Our experiments provide strong evidence that structured gradient regularization can act as an effective first line of defense against attacks based on low-level signal corruption.

Submitted 22 May, 2018; originally announced May 2018.

Showing 101–150 of 217 results for author: Hofmann, T