Showing 1–34 of 34 results for author: Lampinen, A

  1. arXiv:2407.06076  [pdf, other]

    cs.CV cs.AI

    Understanding Visual Feature Reliance through the Lens of Complexity

    Authors: Thomas Fel, Louis Bethune, Andrew Kyle Lampinen, Thomas Serre, Katherine Hermann

    Abstract: Recent studies suggest that deep learning models' inductive bias towards favoring simpler features may be one of the sources of shortcut learning. Yet, there has been limited focus on understanding the complexity of the myriad features that models learn. In this work, we introduce a new metric for quantifying feature complexity, based on $\mathscr{V}$-information and capturing whether a feature req…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2405.05847  [pdf, other]

    cs.LG cs.CV

    Learned feature representations are biased by complexity, learning order, position, and more

    Authors: Andrew Kyle Lampinen, Stephanie C. Y. Chan, Katherine Hermann

    Abstract: Representation learning, and interpreting learned representations, are key areas of focus in machine learning and neuroscience. Both fields generally use representations as a means to understand or improve a system's computations. In this work, however, we explore surprising dissociations between representation and computation that may pose challenges for such efforts. We create datasets in which…

    Submitted 6 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2404.10179  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG

    Scaling Instructable Agents Across Many Simulated Worlds

    Authors: SIMA Team, Maria Abi Raad, Arun Ahuja, Catarina Barros, Frederic Besse, Andrew Bolt, Adrian Bolton, Bethanie Brownfield, Gavin Buttimore, Max Cant, Sarah Chakera, Stephanie C. Y. Chan, Jeff Clune, Adrian Collister, Vikki Copeman, Alex Cullum, Ishita Dasgupta, Dario de Cesare, Julia Di Trapani, Yani Donchev, Emma Dunleavy, Martin Engelcke, Ryan Faulkner, Frankie Garcia, Charles Gbadamosi, et al. (68 additional authors not shown)

    Abstract: Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground language in perception and embodied actions, in order to accomplish complex tasks. The Scalable, Instructable, Multiworld Agent (SIMA) project tackles this by training agents to follow free-form instructio…

    Submitted 17 April, 2024; v1 submitted 13 March, 2024; originally announced April 2024.

  4. arXiv:2312.09688  [pdf, other]

    cs.HC cs.RO

    Shaping and Being Shaped by Drones: Supporting Perception-Action Loops

    Authors: Mousa Sondoqah, Fehmi Ben Abdesslem, Kristina Popova, Moira McGregor, Joseph La Delfa, Rachael Garrett, Airi Lampinen, Luca Mottola, Kristina Höök

    Abstract: We report on a three-day challenge during which five teams each programmed a nanodrone to be piloted through an obstacle course using bodily movement, in a 3D transposition of the '80s video-game Pacman. Using a bricolage approach to analyse interviews, field notes, video recordings, and inspection of each team's code revealed how participants were shaping and, in turn, became shaped in bodily way…

    Submitted 15 December, 2023; originally announced December 2023.

  5. arXiv:2312.03656  [pdf, other]

    cs.LG cs.CL

    Interpretability Illusions in the Generalization of Simplified Models

    Authors: Dan Friedman, Andrew Lampinen, Lucas Dixon, Danqi Chen, Asma Ghandeharioun

    Abstract: A common method to study deep learning systems is to use simplified model representations--for example, using singular value decomposition to visualize the model's hidden states in a lower dimensional space. This approach assumes that the results of these simplifications are faithful to the original model. Here, we illustrate an important caveat to this assumption: even if the simplified represent…

    Submitted 5 June, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: ICML 2024

  6. arXiv:2311.17901  [pdf, other]

    cs.CV cs.AI cs.LG

    SODA: Bottleneck Diffusion Models for Representation Learning

    Authors: Drew A. Hudson, Daniel Zoran, Mateusz Malinowski, Andrew K. Lampinen, Andrew Jaegle, James L. McClelland, Loic Matthey, Felix Hill, Alexander Lerchner

    Abstract: We introduce SODA, a self-supervised diffusion model, designed for representation learning. The model incorporates an image encoder, which distills a source view into a compact representation, that, in turn, guides the generation of related novel views. We show that by imposing a tight bottleneck between the encoder and a denoising decoder, and leveraging novel view synthesis as a self-supervised…

    Submitted 29 November, 2023; originally announced November 2023.

  7. arXiv:2310.15940  [pdf, other]

    cs.AI cs.LG

    Combining Behaviors with the Successor Features Keyboard

    Authors: Wilka Carvalho, Andre Saraiva, Angelos Filos, Andrew Kyle Lampinen, Loic Matthey, Richard L. Lewis, Honglak Lee, Satinder Singh, Danilo J. Rezende, Daniel Zoran

    Abstract: The Option Keyboard (OK) was recently proposed as a method for transferring behavioral knowledge across tasks. OK transfers knowledge by adaptively combining subsets of known behaviors using Successor Features (SFs) and Generalized Policy Improvement (GPI). However, it relies on hand-designed state-features and task encodings which are cumbersome to design for every new environment. In this work,…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  8. arXiv:2310.14540  [pdf, other]

    cs.CL cs.AI

    Evaluating Spatial Understanding of Large Language Models

    Authors: Yutaro Yamada, Yihan Bao, Andrew K. Lampinen, Jungo Kasai, Ilker Yildirim

    Abstract: Large language models (LLMs) show remarkable capabilities across a variety of tasks. Despite the models only seeing text in training, several recent studies suggest that LLM representations implicitly capture aspects of the underlying grounded concepts. Here, we explore LLM representations of a particularly salient kind of grounded knowledge -- spatial relationships. We design natural-language nav…

    Submitted 12 April, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: Accepted to TMLR 2024. Our code and data are available at https://github.com/runopti/SpatialEvalLLM, https://huggingface.co/datasets/yyamada/SpatialEvalLLM

  9. arXiv:2310.13018  [pdf, other]

    q-bio.NC cs.AI cs.LG cs.NE

    Getting aligned on representational alignment

    Authors: Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine L. Hermann, Kerem Oktar, Klaus Greff, Martin N. Hebart, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert Geirhos, Sherol Chen, Simon Kornblith, Sunayana Rane, Talia Konkle, Thomas P. O'Connell, et al. (5 additional authors not shown)

    Abstract: Biological and artificial information processing systems form representations that they can use to categorize, reason, plan, navigate, and make decisions. How can we measure the extent to which the representations formed by these diverse systems agree? Do similarities in representations then translate into similar behavior? How can a system's representations be modified to better match those of an…

    Submitted 2 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Working paper, changes to be made in upcoming revisions

  10. arXiv:2308.06035  [pdf, other]

    cs.AI cs.CL

    Multimodality and Attention Increase Alignment in Natural Language Prediction Between Humans and Computational Models

    Authors: Viktor Kewenig, Andrew Lampinen, Samuel A. Nastase, Christopher Edwards, Quitterie Lacome DEstalenx, Akilles Rechardt, Jeremy I Skipper, Gabriella Vigliocco

    Abstract: The potential of multimodal generative artificial intelligence (mAI) to replicate human grounded language understanding, including the pragmatic, context-rich aspects of communication, remains to be clarified. Humans are known to use salient multimodal features, such as visual cues, to facilitate the processing of upcoming words. Correspondingly, multimodal computational models can integrate visua…

    Submitted 2 January, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

    Comments: 20 pages, 4 figures, submitted to Nature Human Behaviour

  11. arXiv:2306.04507  [pdf, other]

    cs.CV cs.LG

    Improving neural network representations using human similarity judgments

    Authors: Lukas Muttenthaler, Lorenz Linhardt, Jonas Dippel, Robert A. Vandermeulen, Katherine Hermann, Andrew K. Lampinen, Simon Kornblith

    Abstract: Deep neural networks have reached human-level performance on many computer vision tasks. However, the objectives used to train these networks enforce only that similar images are embedded at similar locations in the representation space, and do not directly constrain the global structure of the resulting space. Here, we explore the impact of supervising this global structure by linearly aligning i…

    Submitted 26 September, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: Published as a conference paper at NeurIPS 2023

  12. arXiv:2305.16183  [pdf, other]

    cs.LG cs.AI cs.CL

    Passive learning of active causal strategies in agents and language models

    Authors: Andrew Kyle Lampinen, Stephanie C Y Chan, Ishita Dasgupta, Andrew J Nam, Jane X Wang

    Abstract: What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively-trained language models in interactive domains such as tool use. Passive learning is inherently limited. However, we show that purely passive learning can in fact allow an agent to learn generalizable strategies for determining and using causal structures, as long…

    Submitted 2 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Advances in Neural Information Processing Systems (NeurIPS 2023). 10 pages main text

  13. arXiv:2305.08298  [pdf, other]

    cs.CL

    Symbol tuning improves in-context learning in language models

    Authors: Jerry Wei, Le Hou, Andrew Lampinen, Xiangning Chen, Da Huang, Yi Tay, Xinyun Chen, Yifeng Lu, Denny Zhou, Tengyu Ma, Quoc V. Le

    Abstract: We present symbol tuning - finetuning language models on in-context input-label pairs where natural language labels (e.g., "positive/negative sentiment") are replaced with arbitrary symbols (e.g., "foo/bar"). Symbol tuning leverages the intuition that when a model cannot use instructions or natural language labels to figure out a task, it must instead do so by learning the input-label mappings.…

    Submitted 30 December, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023

  14. arXiv:2210.15303  [pdf, other]

    cs.CL cs.AI cs.LG

    Can language models handle recursively nested grammatical structures? A case study on comparing models and humans

    Authors: Andrew Kyle Lampinen

    Abstract: How should we compare the capabilities of language models (LMs) and humans? I draw inspiration from comparative psychology to highlight some challenges. In particular, I consider a case study: processing of recursively nested grammatical structures. Prior work suggests that LMs cannot handle these structures as reliably as humans can. However, the humans were provided with instructions and trainin…

    Submitted 16 February, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

  15. arXiv:2210.05675  [pdf, other]

    cs.CL cs.AI cs.LG

    Transformers generalize differently from information stored in context vs in weights

    Authors: Stephanie C. Y. Chan, Ishita Dasgupta, Junkyung Kim, Dharshan Kumaran, Andrew K. Lampinen, Felix Hill

    Abstract: Transformer models can use two fundamentally different kinds of information: information stored in weights during training, and information provided "in-context" at inference time. In this work, we show that transformers exhibit different inductive biases in how they represent and generalize from the information in these two sources. In particular, we characterize whether they generalize via par…

    Submitted 13 October, 2022; v1 submitted 11 October, 2022; originally announced October 2022.

  16. arXiv:2207.07051  [pdf, other]

    cs.CL cs.AI cs.LG

    Language models show human-like content effects on reasoning tasks

    Authors: Ishita Dasgupta, Andrew K. Lampinen, Stephanie C. Y. Chan, Hannah R. Sheahan, Antonia Creswell, Dharshan Kumaran, James L. McClelland, Felix Hill

    Abstract: Abstract reasoning is a key ability for an intelligent system. Large language models (LMs) achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human abstract reasoning is also imperfect. For example, human reasoning is affected by our real-world knowledge and beliefs, and shows notable "content effects"; humans reason more reliably when the semant…

    Submitted 30 October, 2023; v1 submitted 14 July, 2022; originally announced July 2022.

  17. arXiv:2206.08349  [pdf, other]

    cs.LG cs.AI cs.CL

    Know your audience: specializing grounded language models with listener subtraction

    Authors: Aaditya K. Singh, David Ding, Andrew Saxe, Felix Hill, Andrew K. Lampinen

    Abstract: Effective communication requires adapting to the idiosyncrasies of each communicative context--such as the common ground shared with each partner. Humans demonstrate this ability to specialize to their audience in many contexts, such as the popular game Dixit. We take inspiration from Dixit to formulate a multi-agent image reference game where a (trained) speaker model is rewarded for describing a…

    Submitted 1 May, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: 28 pages, 9 figures

  18. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  19. arXiv:2205.05055  [pdf, other]

    cs.LG cs.AI cs.CL

    Data Distributional Properties Drive Emergent In-Context Learning in Transformers

    Authors: Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill

    Abstract: Large transformer-based models are able to perform in-context few-shot learning, without being explicitly trained for it. This observation raises the question: what aspects of the training regime lead to this emergent behavior? Here, we show that this behavior is driven by the distributions of the training data itself. In-context learning emerges when the training data exhibits particular distribu…

    Submitted 17 November, 2022; v1 submitted 22 April, 2022; originally announced May 2022.

    Comments: Accepted at NeurIPS 2022 (Oral). Code is available at: https://github.com/deepmind/emergent_in_context_learning

  20. arXiv:2204.05080  [pdf, other]

    cs.LG cs.AI

    Semantic Exploration from Language Abstractions and Pretrained Representations

    Authors: Allison C. Tam, Neil C. Rabinowitz, Andrew K. Lampinen, Nicholas A. Roy, Stephanie C. Y. Chan, DJ Strouse, Jane X. Wang, Andrea Banino, Felix Hill

    Abstract: Effective exploration is a challenge in reinforcement learning (RL). Novelty-based exploration methods can suffer in high-dimensional state spaces, such as continuous partially-observable 3D environments. We address this challenge by defining novelty using semantically meaningful state abstractions, which can be found in learned representations shaped by natural language. In particular, we evaluat…

    Submitted 26 April, 2023; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: NeurIPS 2022

  21. arXiv:2204.02329  [pdf, other]

    cs.CL cs.AI cs.LG

    Can language models learn from explanations in context?

    Authors: Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill

    Abstract: Language Models (LMs) can perform new tasks by adapting to a few in-context examples. For humans, explanations that connect examples to task principles can improve learning. We therefore investigate whether explanations of few-shot examples can help LMs. We annotate questions from 40 challenging tasks with answer explanations, and various matched control explanations. We evaluate how different typ…

    Submitted 10 October, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Findings of EMNLP 2022

  22. arXiv:2203.08222  [pdf, other]

    cs.LG

    Zipfian environments for Reinforcement Learning

    Authors: Stephanie C. Y. Chan, Andrew K. Lampinen, Pierre H. Richemond, Felix Hill

    Abstract: As humans and animals learn in the natural world, they encounter distributions of entities, situations and events that are far from uniform. Typically, a relatively small set of experiences are encountered frequently, while many important experiences occur only rarely. The highly-skewed, heavy-tailed nature of reality poses particular learning challenges that humans and animals have met by evolvin…

    Submitted 8 August, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

  23. arXiv:2112.08369  [pdf, other]

    cs.LG cs.AI

    Feature-Attending Recurrent Modules for Generalization in Reinforcement Learning

    Authors: Wilka Carvalho, Andrew Lampinen, Kyriacos Nikiforou, Felix Hill, Murray Shanahan

    Abstract: Many important tasks are defined in terms of objects. To generalize across these tasks, a reinforcement learning (RL) agent needs to exploit the structure that the objects induce. Prior work has either hard-coded object-centric features, used complex object-centric generative models, or updated state using local spatial features. However, these approaches have had limited success in enabling genera…

    Submitted 3 November, 2023; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: TMLR, 2023

  24. arXiv:2112.03753  [pdf, other]

    cs.LG cs.AI stat.ML

    Tell me why! Explanations support learning relational and causal structure

    Authors: Andrew K. Lampinen, Nicholas A. Roy, Ishita Dasgupta, Stephanie C. Y. Chan, Allison C. Tam, James L. McClelland, Chen Yan, Adam Santoro, Neil C. Rabinowitz, Jane X. Wang, Felix Hill

    Abstract: Inferring the abstract relational and causal structure of the world is a major challenge for reinforcement-learning (RL) agents. For humans, language--particularly in the form of explanations--plays a considerable role in overcoming this challenge. Here, we show that language can play a similar role for deep RL agents in complex environments. While agents typically struggle to acquire relational a…

    Submitted 25 May, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

    Comments: ICML 2022; 23 pages

    ACM Class: I.2.6

  25. arXiv:2105.14039  [pdf, other]

    cs.LG cs.AI cs.NE

    Towards mental time travel: a hierarchical memory for reinforcement learning agents

    Authors: Andrew Kyle Lampinen, Stephanie C. Y. Chan, Andrea Banino, Felix Hill

    Abstract: Reinforcement learning agents often forget details of the past, especially after delays or distractor tasks. Agents with common memory architectures struggle to recall and integrate across multiple timesteps of a past event, or even to recall the details of a single timestep that is followed by distractor tasks. To address these limitations, we propose a Hierarchical Chunk Attention Memory (HCAM),…

    Submitted 8 December, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

    Comments: NeurIPS 2021; 10 pages main text; 29 pages total

    ACM Class: I.2.6

    Journal ref: Advances in Neural Information Processing Systems, 2021

  26. arXiv:2102.03406  [pdf, other]

    cs.AI cs.LG

    Symbolic Behaviour in Artificial Intelligence

    Authors: Adam Santoro, Andrew Lampinen, Kory Mathewson, Timothy Lillicrap, David Raposo

    Abstract: The ability to use symbols is the pinnacle of human intelligence, but has yet to be fully replicated in machines. Here we argue that the path towards symbolically fluent artificial intelligence (AI) begins with a reinterpretation of what symbols are, how they come to exist, and how a system behaves when it uses them. We begin by offering an interpretation of symbols as entities whose meaning is es…

    Submitted 21 January, 2022; v1 submitted 5 February, 2021; originally announced February 2021.

  27. arXiv:2006.12433  [pdf, other]

    cs.LG stat.ML

    What shapes feature representations? Exploring datasets, architectures, and training

    Authors: Katherine L. Hermann, Andrew K. Lampinen

    Abstract: In naturalistic learning problems, a model's input contains a wide range of features, some useful for the task at hand, and others not. Of the useful features, which ones does the model use? Of the task-irrelevant features, which ones does the model represent? Answers to these questions are important for understanding the basis of models' decisions, as well as for building models that learn versat…

    Submitted 22 October, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: 22 pages

  28. arXiv:2005.04318  [pdf, other]

    cs.LG cs.AI stat.ML

    Transforming task representations to perform novel tasks

    Authors: Andrew K. Lampinen, James L. McClelland

    Abstract: An important aspect of intelligence is the ability to adapt to a novel task without any direct experience (zero-shot), based on its relationship to previous tasks. Humans can exhibit this cognitive flexibility. By contrast, models that achieve superhuman performance in specific tasks often fail to adapt to even slight task alterations. To address this, we propose a general computational framework…

    Submitted 6 October, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

    Comments: 45 pages

    ACM Class: I.2.0; I.2.6

    Journal ref: PNAS December 29, 2020 117 (52) 32970-32981

  29. arXiv:1910.00571  [pdf, other]

    cs.AI

    Environmental drivers of systematicity and generalization in a situated agent

    Authors: Felix Hill, Andrew Lampinen, Rosalia Schneider, Stephen Clark, Matthew Botvinick, James L. McClelland, Adam Santoro

    Abstract: The question of whether deep neural networks are good at generalising beyond their immediate training experience is of critical importance for learning-based approaches to AI. Here, we consider tests of out-of-sample generalisation that require an agent to respond to never-seen-before instructions by manipulating and positioning objects in a 3D Unity simulated room. We first describe a comparative…

    Submitted 19 February, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

  30. arXiv:1909.12892  [pdf, other]

    cs.LG cs.AI stat.ML

    Automated curricula through setter-solver interactions

    Authors: Sebastien Racaniere, Andrew K. Lampinen, Adam Santoro, David P. Reichert, Vlad Firoiu, Timothy P. Lillicrap

    Abstract: Reinforcement learning algorithms use correlations between policies and rewards to improve agent performance. But in dynamic or sparsely rewarding environments these correlations are often too small, or rewarding events are too infrequent to make learning feasible. Human education instead relies on curricula--the breakdown of tasks into simpler, static challenges with dense rewards--to build up to…

    Submitted 21 January, 2020; v1 submitted 27 September, 2019; originally announced September 2019.

    Journal ref: International Conference on Learning Representations, 2020

  31. arXiv:1905.09950  [pdf, other]

    cs.LG cs.NE stat.ML

    Zero-shot task adaptation by homoiconic meta-mapping

    Authors: Andrew K. Lampinen, James L. McClelland

    Abstract: How can deep learning systems flexibly reuse their knowledge? Toward this goal, we propose a new class of challenges, and a class of architectures that can solve them. The challenges are meta-mappings, which involve systematically transforming task behaviors to adapt to new tasks zero-shot. The key to achieving these challenges is representing the task being performed in such a way that this task…

    Submitted 12 November, 2019; v1 submitted 23 May, 2019; originally announced May 2019.

    Comments: 27 pages

    ACM Class: I.2.0; I.2.6

  32. arXiv:1809.10374  [pdf, other]

    stat.ML cs.LG

    An analytic theory of generalization dynamics and transfer learning in deep linear networks

    Authors: Andrew K. Lampinen, Surya Ganguli

    Abstract: Much attention has been devoted recently to the generalization puzzle in deep learning: large, deep networks can generalize well, but existing theories bounding generalization error are exceedingly loose, and thus cannot explain this striking performance. Furthermore, a major hope is that knowledge may transfer across tasks, so that multi-task learning can improve generalization on individual task…

    Submitted 4 January, 2019; v1 submitted 27 September, 2018; originally announced September 2018.

    Comments: ICLR 2019, 20 pages

    ACM Class: I.2.6; F.m

  33. arXiv:1710.10280  [pdf, other]

    cs.CL cs.LG stat.ML

    One-shot and few-shot learning of word embeddings

    Authors: Andrew K. Lampinen, James L. McClelland

    Abstract: Standard deep learning systems require thousands or millions of examples to learn a concept, and cannot integrate new concepts easily. By contrast, humans have an incredible ability to do one-shot or few-shot learning. For instance, from just hearing a word used in a sentence, humans can infer a great deal about it, by leveraging what the syntax and semantics of the surrounding words tells us. Her…

    Submitted 2 January, 2018; v1 submitted 27 October, 2017; originally announced October 2017.

    Comments: 15 pages, 7 figures, under review as a conference paper at ICLR 2018

    ACM Class: I.2.7

  34. arXiv:1709.10459  [pdf, other]

    cs.CV cs.LG cs.NE

    Improving image generative models with human interactions

    Authors: Andrew Kyle Lampinen, David So, Douglas Eck, Fred Bertsch

    Abstract: GANs provide a framework for training generative models which mimic a data distribution. However, in many cases we wish to train these generative models to optimize some auxiliary objective function within the data it generates, such as making more aesthetically pleasing images. In some cases, these objective functions are difficult to evaluate, e.g. they may require human interaction. Here, we de…

    Submitted 29 September, 2017; originally announced September 2017.