Showing 1–41 of 41 results for author: Hermann, K

  1. arXiv:2405.05847  [pdf, other]

    cs.LG cs.CV

    Learned feature representations are biased by complexity, learning order, position, and more

    Authors: Andrew Kyle Lampinen, Stephanie C. Y. Chan, Katherine Hermann

    Abstract: Representation learning, and interpreting learned representations, are key areas of focus in machine learning and neuroscience. Both fields generally use representations as a means to understand or improve a system's computations. In this work, however, we explore surprising dissociations between representation and computation that may pose challenges for such efforts. We create datasets in which…

    Submitted 6 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2401.13306  [pdf, other]

    cs.CR

    POSTER: Towards Secure 5G Infrastructures for Production Systems

    Authors: Martin Henze, Maximilian Ortmann, Thomas Vogt, Osman Ugus, Kai Hermann, Svenja Nohr, Zeren Lu, Sotiris Michaelides, Angela Massonet, Robert H. Schmitt

    Abstract: To meet the requirements of modern production, industrial communication increasingly shifts from wired fieldbus to wireless 5G communication. Besides tremendous benefits, this shift introduces severe novel risks, ranging from limited reliability over new security vulnerabilities to a lack of accountability. To address these risks, we present approaches to (i) prevent attacks through authentication…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted to the poster session of the 22nd International Conference on Applied Cryptography and Network Security (ACNS 2024)

  3. arXiv:2310.16228  [pdf, other]

    cs.LG cs.CV

    On the Foundations of Shortcut Learning

    Authors: Katherine L. Hermann, Hossein Mobahi, Thomas Fel, Michael C. Mozer

    Abstract: Deep-learning models can extract a rich assortment of features from data. Which features a model uses depends not only on predictivity (how reliably a feature indicates train-set labels) but also on availability (how easily the feature can be extracted, or leveraged, from inputs). The literature on shortcut learning has noted examples in which models privilege one feature over another, for example tex…

    Submitted 24 October, 2023; originally announced October 2023.

  4. arXiv:2310.13018  [pdf, other]

    q-bio.NC cs.AI cs.LG cs.NE

    Getting aligned on representational alignment

    Authors: Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine L. Hermann, Kerem Oktar, Klaus Greff, Martin N. Hebart, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert Geirhos, Sherol Chen, Simon Kornblith, Sunayana Rane, Talia Konkle, Thomas P. O'Connell, et al. (5 additional authors not shown)

    Abstract: Biological and artificial information processing systems form representations that they can use to categorize, reason, plan, navigate, and make decisions. How can we measure the extent to which the representations formed by these diverse systems agree? Do similarities in representations then translate into similar behavior? How can a system's representations be modified to better match those of an…

    Submitted 2 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Working paper, changes to be made in upcoming revisions

  5. arXiv:2307.13375  [pdf, other]

    eess.IV cs.CV

    Towards Unifying Anatomy Segmentation: Automated Generation of a Full-body CT Dataset via Knowledge Aggregation and Anatomical Guidelines

    Authors: Alexander Jaus, Constantin Seibold, Kelsey Hermann, Alexandra Walter, Kristina Giske, Johannes Haubold, Jens Kleesiek, Rainer Stiefelhagen

    Abstract: In this study, we present a method for generating automated anatomy segmentation datasets using a sequential process that involves nnU-Net-based pseudo-labeling and anatomy-guided pseudo-label refinement. By combining various fragmented knowledge bases, we generate a dataset of whole-body CT scans with $142$ voxel-level labels for 533 volumes providing comprehensive anatomical coverage which exper…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: 18 pages, 8 figures, 2 tables

  6. arXiv:2307.06017  [pdf]

    cond-mat.mes-hall

    Nanoparticles with Cubic Symmetry: Classification of Polyhedral Shapes

    Authors: Klaus E. Hermann

    Abstract: The shape of crystalline nanoparticles (NP) can often be described by polyhedra with flat facet surfaces. Thus, structural studies of polyhedral bodies can help to describe geometric details of NPs. Here we consider compact polyhedra of cubic point symmetry Oh as simple models. Their surfaces are described by facets with normal vectors along selected directions (a, b, c) together with their symmet…

    Submitted 21 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: 40/56 pages, 18/9 figures (paper/supplement). arXiv admin note: text overlap with arXiv:2209.08919

  7. arXiv:2306.04507  [pdf, other]

    cs.CV cs.LG

    Improving neural network representations using human similarity judgments

    Authors: Lukas Muttenthaler, Lorenz Linhardt, Jonas Dippel, Robert A. Vandermeulen, Katherine Hermann, Andrew K. Lampinen, Simon Kornblith

    Abstract: Deep neural networks have reached human-level performance on many computer vision tasks. However, the objectives used to train these networks enforce only that similar images are embedded at similar locations in the representation space, and do not directly constrain the global structure of the resulting space. Here, we explore the impact of supervising this global structure by linearly aligning i…

    Submitted 26 September, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: Published as a conference paper at NeurIPS 2023

  8. arXiv:2303.17651  [pdf, other]

    cs.CL cs.AI cs.LG

    Self-Refine: Iterative Refinement with Self-Feedback

    Authors: Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark

    Abstract: Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; then, the same LLM provides feedback for its output and uses it…

    Submitted 25 May, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: Code, data, and demo at https://selfrefine.info/

  9. arXiv:2209.08919  [pdf]

    physics.atm-clus physics.comp-ph

    Compact Polyhedra of Cubic Symmetry: Geometrical Analysis and Classification

    Authors: Klaus E. Hermann

    Abstract: Compact polyhedra of cubic point symmetry Oh exhibit surfaces of planar sections (facets) characterized by normal vector families {abc} with up to 48 members each, compatible with Oh symmetry. We focus first on polyhedra confined by facets with normal vectors of one family, {100}, {110}, and {111}, separately. This yields generic polyhedra which serve for the definition of general compact polyhed…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: 43 pages, 21 figures

  10. The Science Performance of JWST as Characterized in Commissioning

    Authors: Jane Rigby, Marshall Perrin, Michael McElwain, Randy Kimble, Scott Friedman, Matt Lallo, René Doyon, Lee Feinberg, Pierre Ferruit, Alistair Glasse, Marcia Rieke, George Rieke, Gillian Wright, Chris Willott, Knicole Colon, Stefanie Milam, Susan Neff, Christopher Stark, Jeff Valenti, Jim Abell, Faith Abney, Yasin Abul-Huda, D. Scott Acton, Evan Adams, David Adler, et al. (601 additional authors not shown)

    Abstract: This paper characterizes the actual science performance of the James Webb Space Telescope (JWST), as determined from the six month commissioning period. We summarize the performance of the spacecraft, telescope, science instruments, and ground system, with an emphasis on differences from pre-launch expectations. Commissioning has made clear that JWST is fully capable of achieving the discoveries f…

    Submitted 10 April, 2023; v1 submitted 12 July, 2022; originally announced July 2022.

    Comments: 5th version as accepted to PASP; 31 pages, 18 figures; https://iopscience.iop.org/article/10.1088/1538-3873/acb293

    Journal ref: PASP 135 048001 (2023)

  11. arXiv:2201.07038  [pdf]

    physics.atm-clus physics.chem-ph

    Polyhedral Metal Nanoparticles with Cubic Lattice: Theory of Structural Properties

    Authors: Klaus E. Hermann

    Abstract: We examine the structure of compact metal nanoparticles (NPs) forming polyhedral sections of face centered (fcc) and body centered (bcc) cubic lattices, which are confined by facets characterized by highly dense {100}, {110}, and {111} monolayers. Together with the constraint that the NPs exhibit the same point symmetry as the ideal cubic lattice, i.e. Oh, different types of generic NPs serve for…

    Submitted 18 March, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: 95 pages, 59 figures

  12. arXiv:2101.04385  [pdf]

    cond-mat.mes-hall

    Structure and Morphology of Crystalline Metal Nanoparticles: Polyhedral Cubic Particles

    Authors: Klaus E. Hermann

    Abstract: We examine nanoparticles (NPs) forming polyhedral sections of the ideal cubic lattice, simple (sc), body centered (bcc), and face centered (fcc) cubic, which are confined by facets characterized by densest and second densest {h k l} monolayers of the lattice. Together with the constraint that the NPs exhibit the same point symmetry as the ideal cubic lattice, i.e. Oh, different types of generic NP…

    Submitted 12 January, 2021; originally announced January 2021.

    Comments: 58 pages, all figures (jpeg) included in the text

  13. arXiv:2009.01418  [pdf, ps, other]

    math.PR math-ph

    Limit theorems and soft edge of freezing random matrix models via dual orthogonal polynomials

    Authors: Sergio Andraus, Kilian Hermann, Michael Voit

    Abstract: $N$-dimensional Bessel and Jacobi processes describe interacting particle systems with $N$ particles and are related to $β$-Hermite, $β$-Laguerre, and $β$-Jacobi ensembles. For fixed $N$ there exist associated weak limit theorems (WLTs) in the freezing regime $β\to\infty$ in the $β$-Hermite and $β$-Laguerre case by Dumitriu and Edelman (2005) with explicit formulas for the covariance matrices $Σ_N…

    Submitted 29 June, 2021; v1 submitted 2 September, 2020; originally announced September 2020.

    Comments: 32 pages, made small improvements and added references

    MSC Class: Primary 60F05; Secondary 60B20; 70F10; 82C22; 33C45; 33C10; 60J60

    Journal ref: J. Math. Phys. 62, 083303 (2021)

  14. arXiv:2006.12433  [pdf, other]

    cs.LG stat.ML

    What shapes feature representations? Exploring datasets, architectures, and training

    Authors: Katherine L. Hermann, Andrew K. Lampinen

    Abstract: In naturalistic learning problems, a model's input contains a wide range of features, some useful for the task at hand, and others not. Of the useful features, which ones does the model use? Of the task-irrelevant features, which ones does the model represent? Answers to these questions are important for understanding the basis of models' decisions, as well as for building models that learn versat…

    Submitted 22 October, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: 22 pages

  15. arXiv:1911.09071  [pdf, other]

    cs.CV cs.LG q-bio.NC

    The Origins and Prevalence of Texture Bias in Convolutional Neural Networks

    Authors: Katherine L. Hermann, Ting Chen, Simon Kornblith

    Abstract: Recent work has indicated that, unlike humans, ImageNet-trained CNNs tend to classify images by texture rather than by shape. How pervasive is this bias, and where does it come from? We find that, when trained on datasets of images with conflicting shape and texture, CNNs learn to classify by shape at least as easily as by texture. What factors, then, produce the texture bias in CNNs trained on Im…

    Submitted 3 November, 2020; v1 submitted 20 November, 2019; originally announced November 2019.

    Comments: NeurIPS'2020

  16. Limit theorems for Jacobi ensembles with large parameters

    Authors: Kilian Hermann, Michael Voit

    Abstract: Consider Jacobi random matrix ensembles with the distributions $$c_{k_1,k_2,k_3}\prod_{1\leq i< j \leq N}\left(x_j-x_i\right)^{k_3}\prod_{i=1}^N \left(1-x_i\right)^{\frac{k_1+k_2}{2}-\frac{1}{2}}\left(1+x_i\right)^{\frac{k_2}{2}-\frac{1}{2}} dx$$ of the eigenvalues on the alcoves $$A:=\{x\in\mathbb R^N| \> -1\leq x_1\le ...\le x_N\leq 1\}.$$ For $(k_1,k_2,k_3)=κ\cdot (a,b,1)$ with $a,b>0$ fixed, w…

    Submitted 15 October, 2020; v1 submitted 20 May, 2019; originally announced May 2019.

    Comments: The presentation of the results is improved, and additional references are added

    MSC Class: 60F05; 60B20; 70F10; 82C22; 33C45; 33C67

    Journal ref: Tunisian J. Math. 3 (2021) 843-860

  17. arXiv:1903.01292  [pdf, other]

    cs.AI cs.CV cs.RO

    The StreetLearn Environment and Dataset

    Authors: Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Denis Teplyashin, Karl Moritz Hermann, Mateusz Malinowski, Matthew Koichi Grimes, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

    Abstract: Navigation is a rich and well-grounded problem domain that drives progress in many different areas of research: perception, planning, memory, exploration, and optimisation in particular. Historically these challenges have been separately considered and solutions built that rely on stationary datasets - for example, recorded trajectories through an environment. These datasets cannot be used for dec…

    Submitted 4 March, 2019; originally announced March 2019.

    Comments: 13 pages, 6 figures, 4 tables. arXiv admin note: text overlap with arXiv:1804.00168

  18. arXiv:1903.00401  [pdf, other]

    cs.AI cs.CL cs.CV

    Learning To Follow Directions in Street View

    Authors: Karl Moritz Hermann, Mateusz Malinowski, Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Raia Hadsell

    Abstract: Navigating and understanding the real world remains a key challenge in machine learning and inspires a great variety of research in areas such as language grounding, planning, navigation and computer vision. We propose an instruction-following task that requires all of the above, and which combines the practicality of simulated environments with the challenges of ambiguous, noisy real world data.…

    Submitted 21 November, 2019; v1 submitted 1 March, 2019; originally announced March 2019.

    Journal ref: AAAI 2020

  19. arXiv:1807.01670  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    Encoding Spatial Relations from Natural Language

    Authors: Tiago Ramalho, Tomáš Kočiský, Frederic Besse, S. M. Ali Eslami, Gábor Melis, Fabio Viola, Phil Blunsom, Karl Moritz Hermann

    Abstract: Natural language processing has made significant inroads into learning the semantics of words through distributional approaches; however, representations learnt via these methods fail to capture certain kinds of information implicit in the real world. In particular, spatial relations are encoded in a way that is inconsistent with human spatial reasoning and lacking invariance to viewpoint changes.…

    Submitted 5 July, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

  20. arXiv:1805.09786  [pdf, other]

    cs.NE

    Hyperbolic Attention Networks

    Authors: Caglar Gulcehre, Misha Denil, Mateusz Malinowski, Ali Razavi, Razvan Pascanu, Karl Moritz Hermann, Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, Nando de Freitas

    Abstract: We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure. A few recent approaches have successfully demonstrated the benefits of imposing hyperbolic geometry on the parameters of shallow networks. We extend this line of work by imposing hyperbolic geometry on the activations of neural networks…

    Submitted 24 May, 2018; originally announced May 2018.

  21. arXiv:1805.09208  [pdf, other]

    stat.ML cs.CL cs.LG

    Pushing the bounds of dropout

    Authors: Gábor Melis, Charles Blundell, Tomáš Kočiský, Karl Moritz Hermann, Chris Dyer, Phil Blunsom

    Abstract: We show that dropout training is best understood as performing MAP estimation concurrently for a family of conditional models whose objectives are themselves lower bounded by the original dropout objective. This discovery allows us to pick any model from this family after training, which leads to a substantial improvement on regularisation-heavy language modelling. The family includes models that…

    Submitted 27 September, 2018; v1 submitted 23 May, 2018; originally announced May 2018.

  22. arXiv:1804.03984  [pdf, other]

    cs.AI cs.CL cs.LG cs.MA

    Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input

    Authors: Angeliki Lazaridou, Karl Moritz Hermann, Karl Tuyls, Stephen Clark

    Abstract: The ability of algorithms to evolve or learn (compositional) communication protocols has traditionally been studied in the language evolution literature through the use of emergent communication tasks. Here we scale up this research by using contemporary deep learning methods and by training reinforcement-learning neural network agents on referential communication games. We extend previous work, i…

    Submitted 11 April, 2018; originally announced April 2018.

    Comments: To appear at ICLR 2018

  23. arXiv:1804.00168  [pdf, other]

    cs.AI

    Learning to Navigate in Cities Without a Map

    Authors: Piotr Mirowski, Matthew Koichi Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

    Abstract: Navigating through unstructured environments is a basic capability of intelligent creatures, and thus is of fundamental interest in the study and development of artificial intelligence. Long-range navigation is a complex cognitive task that relies on developing an internal representation of space, grounded by recognisable landmarks and robust visual processing, that can simultaneously support cont…

    Submitted 9 January, 2019; v1 submitted 31 March, 2018; originally announced April 2018.

    Comments: 17 pages, 16 figures, published at NeurIPS 2018

    Journal ref: Neural Information Processing Systems 2018

  24. arXiv:1802.01802  [pdf]

    physics.chem-ph cond-mat.mtrl-sci

    Interlocking mechanism between molecular gears attached to surfaces

    Authors: Rundong Zhao, Yan-Ling Zhao, Fei Qi, Klaus Hermann, Rui-Qin Zhang, Michel A. Van Hove

    Abstract: While molecular machines play an increasingly significant role in nanoscience research and applications, there remains a shortage of investigations and understanding of the molecular gear (cogwheel), which is an indispensable and fundamental component to drive a larger correlated molecular machine system. Employing ab initio calculations, we investigate model systems consisting of molecules adsorb…

    Submitted 6 February, 2018; originally announced February 2018.

  25. arXiv:1712.07040  [pdf, other]

    cs.CL cs.AI cs.NE

    The NarrativeQA Reading Comprehension Challenge

    Authors: Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, Edward Grefenstette

    Abstract: Reading comprehension (RC), in contrast to information retrieval, requires integrating information and reasoning about events, entities, and their relations across a full document. Question answering is conventionally used to assess RC ability, in both artificial agents and children learning to read. However, existing RC datasets and tasks are dominated by questions that can be solved by selecti…

    Submitted 19 December, 2017; originally announced December 2017.

  26. arXiv:1710.09867  [pdf, other]

    cs.CL cs.AI cs.NE

    Understanding Early Word Learning in Situated Artificial Agents

    Authors: Felix Hill, Stephen Clark, Karl Moritz Hermann, Phil Blunsom

    Abstract: Neural network-based systems can now learn to locate the referents of words and phrases in images, answer questions about visual scenes, and execute symbolic instructions as first-person actors in partially-observable worlds. To achieve this so-called grounded language learning, models must overcome challenges that infants face when learning their first words. While it is notable that models with…

    Submitted 1 October, 2019; v1 submitted 26 October, 2017; originally announced October 2017.

  27. arXiv:1706.06551  [pdf, other]

    cs.CL cs.LG stat.ML

    Grounded Language Learning in a Simulated 3D World

    Authors: Karl Moritz Hermann, Felix Hill, Simon Green, Fumin Wang, Ryan Faulkner, Hubert Soyer, David Szepesvari, Wojciech Marian Czarnecki, Max Jaderberg, Denis Teplyashin, Marcus Wainwright, Chris Apps, Demis Hassabis, Phil Blunsom

    Abstract: We are increasingly surrounded by artificially intelligent technology that takes decisions and executes actions on our behalf. This creates a pressing need for general means to communicate with, instruct and guide artificial agents, with human language the most compelling means for such communication. To achieve this in a scalable fashion, agents must be able to relate language to the world and to…

    Submitted 26 June, 2017; v1 submitted 20 June, 2017; originally announced June 2017.

    Comments: 16 pages, 8 figures

  28. arXiv:1609.09315  [pdf, other]

    cs.CL cs.AI cs.NE

    Semantic Parsing with Semi-Supervised Sequential Autoencoders

    Authors: Tomáš Kočiský, Gábor Melis, Edward Grefenstette, Chris Dyer, Wang Ling, Phil Blunsom, Karl Moritz Hermann

    Abstract: We present a novel semi-supervised approach for sequence transduction and apply it to semantic parsing. The unsupervised component is based on a generative model in which latent sentences generate the unpaired logical forms. We apply this method to a number of semantic parsing tasks focusing on domains with limited access to labelled training data and extend those datasets with synthetically gener…

    Submitted 29 September, 2016; originally announced September 2016.

  29. arXiv:1607.00593  [pdf, ps, other]

    physics.chem-ph

    Intramolecular Torque, an Indicator of the Internal Rotation Direction of Rotor Molecules and Similar Systems

    Authors: Rui-Qin Zhang, Yan-Ling Zhao, Fei Qi, Klaus Hermann, Michel A. Van Hove

    Abstract: Torque is ubiquitous in many molecular systems, including collisions, chemical reactions, vibrations, electronic excitations and especially rotor molecules. We present a straightforward theoretical method based on forces acting on atoms and obtained from atomistic quantum mechanics calculations, to quickly and qualitatively determine whether a molecule or sub-unit thereof has a tendency to rotatio…

    Submitted 3 July, 2016; originally announced July 2016.

    Comments: 11 pages, 4 figures, 1 SI file

  30. arXiv:1603.06744  [pdf, other]

    cs.CL cs.NE

    Latent Predictor Networks for Code Generation

    Authors: Wang Ling, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Andrew Senior, Fumin Wang, Phil Blunsom

    Abstract: Many language generation tasks require the production of text conditioned on both structured and unstructured inputs. We present a novel neural network architecture which generates an output sequence conditioned on an arbitrary number of input functions. Crucially, our approach allows both the choice of conditioning context and the granularity of generation, for example characters or tokens, to be…

    Submitted 8 June, 2016; v1 submitted 22 March, 2016; originally announced March 2016.

  31. arXiv:1509.06664  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Reasoning about Entailment with Neural Attention

    Authors: Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Phil Blunsom

    Abstract: While most approaches to automatically recognizing entailment relations have used classifiers employing hand engineered features derived from complex natural language processing pipelines, in practice their performance has been only slightly better than bag-of-word pair classifiers using only lexical similarity. The only attempt so far to build an end-to-end differentiable neural network for entai…

    Submitted 1 March, 2016; v1 submitted 22 September, 2015; originally announced September 2015.

    Comments: ICLR 2016 camera-ready, 9 pages, 10 figures (incl. subfigures)

    MSC Class: 68T50

    ACM Class: I.2.6; I.2.7

  32. arXiv:1506.03340  [pdf, other]

    cs.CL cs.AI cs.NE

    Teaching Machines to Read and Comprehend

    Authors: Karl Moritz Hermann, Tomáš Kočiský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, Phil Blunsom

    Abstract: Teaching machines to read natural language documents remains an elusive challenge. Machine reading systems can be tested on their ability to answer questions posed on the contents of documents that they have seen, but until now large scale training and test datasets have been missing for this type of evaluation. In this work we define a new methodology that resolves this bottleneck and provides la…

    Submitted 19 November, 2015; v1 submitted 10 June, 2015; originally announced June 2015.

    Comments: Appears in: Advances in Neural Information Processing Systems 28 (NIPS 2015). 14 pages, 13 figures

  33. arXiv:1506.02516  [pdf, other]

    cs.NE cs.CL cs.LG

    Learning to Transduce with Unbounded Memory

    Authors: Edward Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, Phil Blunsom

    Abstract: Recently, strong results have been demonstrated by Deep Recurrent Neural Networks on natural language transduction problems. In this paper we explore the representational power of these models using synthetic grammars designed to exhibit phenomena similar to those found in real transduction problems such as machine translation. These experiments lead us to propose new memory-based recurrent networ…

    Submitted 3 November, 2015; v1 submitted 8 June, 2015; originally announced June 2015.

    Comments: 14 pages, 4 figures, NIPS 2015

    MSC Class: 68T05

    ACM Class: I.5.1; I.2.6; I.2.7

  34. arXiv:1412.1632  [pdf, ps, other]

    cs.CL

    Deep Learning for Answer Sentence Selection

    Authors: Lei Yu, Karl Moritz Hermann, Phil Blunsom, Stephen Pulman

    Abstract: Answer sentence selection is the task of identifying sentences that contain the answer to a given question. This is an important problem in its own right as well as in the larger context of open domain question answering. We propose a novel approach to solving this task via means of distributed representations, and learn to match questions with answers by considering their semantic encoding. This…

    Submitted 4 December, 2014; originally announced December 2014.

    Comments: 9 pages, accepted by NIPS deep learning workshop

  35. arXiv:1411.3146  [pdf, other]

    cs.CL

    Distributed Representations for Compositional Semantics

    Authors: Karl Moritz Hermann

    Abstract: The mathematical representation of semantics is a key issue for Natural Language Processing (NLP). A lot of research has been devoted to finding ways of representing the semantics of individual words in vector spaces. Distributional approaches, meaning distributed representations that exploit co-occurrence statistics of large corpora, have proved popular and successful across a number of tas…

    Submitted 12 November, 2014; originally announced November 2014.

    Comments: DPhil Thesis, University of Oxford, Submitted and accepted in 2014

  36. arXiv:1405.0947  [pdf, other]

    cs.CL

    Learning Bilingual Word Representations by Marginalizing Alignments

    Authors: Tomáš Kočiský, Karl Moritz Hermann, Phil Blunsom

    Abstract: We present a probabilistic model that simultaneously learns alignments and distributed representations for bilingual data. By marginalizing over word alignments the model captures a larger semantic context than prior work relying on hard alignments. The advantage of this approach is demonstrated in a cross-lingual classification task, where we outperform the prior published state of the art.

    Submitted 5 May, 2014; originally announced May 2014.

    Comments: Proceedings of ACL 2014 (Short Papers)

  37. arXiv:1404.7296  [pdf, other]

    cs.CL

    A Deep Architecture for Semantic Parsing

    Authors: Edward Grefenstette, Phil Blunsom, Nando de Freitas, Karl Moritz Hermann

    Abstract: Many successful approaches to semantic parsing build on top of the syntactic analysis of text, and make use of distributional representations or statistical models to match parses to ontology-specific queries. This paper presents a novel deep learning architecture which provides a semantic parsing system through the union of two neural models of language semantics. It allows for the generation of…

    Submitted 29 April, 2014; originally announced April 2014.

    Comments: In Proceedings of the Semantic Parsing Workshop at ACL 2014 (forthcoming)

  38. arXiv:1404.4641  [pdf, other]

    cs.CL

    Multilingual Models for Compositional Distributed Semantics

    Authors: Karl Moritz Hermann, Phil Blunsom

    Abstract: We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of semantically equivalent sentences, while maintaining sufficient distance between those of dissimilar sentences. The models do not rely on word alignments or an…

    Submitted 17 April, 2014; originally announced April 2014.

    Comments: Proceedings of ACL 2014 (Long papers)

  39. arXiv:1312.6173  [pdf, other]

    cs.CL

    Multilingual Distributed Representations without Word Alignment

    Authors: Karl Moritz Hermann, Phil Blunsom

    Abstract: Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not available in discrete representations, distributed representations have proven useful in many NLP tasks. Recent work has shown how compositional semantic represent…

    Submitted 20 March, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

    Comments: To appear at ICLR 2014

  40. arXiv:1306.2158  [pdf, other]

    cs.CL

    "Not not bad" is not "bad": A distributional account of negation

    Authors: Karl Moritz Hermann, Edward Grefenstette, Phil Blunsom

    Abstract: With the increasing empirical success of distributional models of compositional semantics, it is timely to consider the types of textual logic that such models are capable of capturing. In this paper, we address shortcomings in the ability of current models to capture logical operations such as negation. As a solution we propose a tripartite formulation for a continuous vector space representation…

    Submitted 10 June, 2013; originally announced June 2013.

    Comments: 9 pages, to appear in Proceedings of the 2013 Workshop on Continuous Vector Space Models and their Compositionality

    MSC Class: 68T50

    ACM Class: I.2.7

  41. Theory of Alkali Induced Reconstruction of the Cu(100) Surface

    Authors: S. Quassowski, K. Hermann

    Abstract: LEED experiments show that Li adsorbed at Cu(100) surfaces at room temperature induces a (2x1) missing row substrate reconstruction while adsorption at lower temperatures, T=180 K, results in an unreconstructed Cu(100)+c(2x2)-Li overlayer structure. Substrate reconstruction has not been observed for Na nor for K adsorption. In order to study the specific reconstruction behavior of the Li adsorb…

    Submitted 26 September, 1996; originally announced September 1996.

    Comments: 8 pages, 5 figures, submitted to Surf. Rev. Lett