-
Efficient optimization of ODE neuron models using gradient descent
Authors:
Ilenna Simone Jones,
Konrad Paul Kording
Abstract:
Neuroscientists fit morphologically and biophysically detailed neuron simulations to physiological data, often using evolutionary algorithms. However, such gradient-free approaches are computationally expensive, making convergence slow when neuron models have many parameters. Here we introduce a gradient-based algorithm using differentiable ODE solvers that scales well to high-dimensional problems. GPUs make parallel simulations fast and gradient calculations make optimization efficient. We verify the utility of our approach optimizing neuron models with active dendrites with heterogeneously distributed ion channel densities. We find that individually stimulating and recording all dendritic compartments makes such model parameters identifiable. Identification breaks down gracefully as fewer stimulation and recording sites are given. Differentiable neuron models, which should be added to popular neuron simulation packages, promise a new era of optimizable neuron models with many free parameters, a key feature of real neurons.
Submitted 4 July, 2024;
originally announced July 2024.
-
Empirical influence functions to understand the logic of fine-tuning
Authors:
Jordan K. Matelsky,
Lyle Ungar,
Konrad P. Kording
Abstract:
Understanding the process of learning in neural networks is crucial for improving their performance and interpreting their behavior. This can be approximately understood by asking how a model's output is influenced when we fine-tune on a new training sample. There are desiderata for such influences, such as decreasing influence with semantic distance, sparseness, noise invariance, transitive causality, and logical consistency. Here we use the empirical influence measured using fine-tuning to demonstrate how individual training samples affect outputs. We show that these desiderata are violated both for simple convolutional networks and for a modern LLM. We also illustrate how prompting can partially rescue this failure. Our paper presents an efficient and practical way of quantifying how well neural networks learn from fine-tuning stimuli. Our results suggest that popular models cannot generalize or perform logic in the way they appear to.
Submitted 1 June, 2024;
originally announced June 2024.
-
Vision-language models for decoding provider attention during neonatal resuscitation
Authors:
Felipe Parodi,
Jordan Matelsky,
Alejandra Regla-Vargas,
Elizabeth Foglia,
Charis Lim,
Danielle Weinberg,
Konrad Kording,
Heidi Herrick,
Michael Platt
Abstract:
Neonatal resuscitations demand an exceptional level of attentiveness from providers, who must process multiple streams of information simultaneously. Gaze strongly influences decision making; thus, understanding where a provider is looking during neonatal resuscitations could inform provider training, enhance real-time decision support, and improve the design of delivery rooms and neonatal intensive care units (NICUs). Current approaches to quantifying neonatal providers' gaze rely on manual coding or simulations, which limit scalability and utility. Here, we introduce an automated, real-time, deep learning approach capable of decoding provider gaze into semantic classes directly from first-person point-of-view videos recorded during live resuscitations. Combining state-of-the-art, real-time segmentation with vision-language models (CLIP), our low-shot pipeline attains 91% classification accuracy in identifying gaze targets without training. Upon fine-tuning, the performance of our gaze-guided vision transformer exceeds 98% accuracy in gaze classification, approaching human-level precision. This system, capable of real-time inference, enables objective quantification of provider attention dynamics during live neonatal resuscitation. Our approach offers a scalable solution that seamlessly integrates with existing infrastructure for data-scarce gaze analysis, thereby offering new opportunities for understanding and refining clinical decision making.
Submitted 1 April, 2024;
originally announced April 2024.
-
To reverse engineer an entire nervous system
Authors:
Gal Haspel,
Edward S Boyden,
Jeffrey Brown,
George Church,
Netta Cohen,
Christopher Fang-Yen,
Steven Flavell,
Miriam B Goodman,
Anne C Hart,
Oliver Hobert,
Eduardo J Izquierdo,
Konstantinos Kagias,
Shawn Lockery,
Yangning Lu,
Adam Marblestone,
Jordan Matelsky,
Hanspeter Pfister,
Horacio G Rotstein,
Monika Scholz,
Eli Shlizerman,
Quilee Simeon,
Michael A Skuhersky,
Vineet Tiruvadi,
Vivek Venkatachalam,
Guangyu Robert Yang
, et al. (3 additional authors not shown)
Abstract:
A primary goal of neuroscience is to understand how nervous systems, or assemblies of neural circuits, generate and control behavior. Testing and refining our theories of neural control would be greatly facilitated if we could reliably simulate an entire nervous system so we could replicate the brain dynamics in response to any stimuli and different contexts. More fundamentally, reconstructing or modeling a system is an important milestone in understanding it, and so, simulating an entire nervous system is in itself one of the goals, indeed dreams, of systems neuroscience. To do so requires us to identify how each neuron's output depends on its inputs, within some nervous system. This deconstruction, understanding function from input-output pairs, falls into the realm of reverse engineering. Current efforts at reverse engineering the brain focus on the mammalian nervous system, but these brains are complex, allowing only recordings of tiny subsystems. Here we argue that the time is ripe to embark on a concerted effort to reverse engineer a smaller system and that the nematode C. elegans is the ideal candidate system. In particular, the established and growing toolkit of optophysiology techniques can non-invasively capture and control each neuron's activity and scale to hundreds of thousands of experiments, across a large population of animals. Data across populations and behaviors can be combined because across individuals neuronal identities are largely conserved in form and function. Modern machine-learning-based model training should then enable a simulation of C. elegans' impressive breadth of brain states and behaviors. The ability to reverse engineer an entire nervous system will benefit systems neuroscience as well as the design of artificial intelligence systems, enabling fundamental insights as well as new approaches for investigations of progressively larger nervous systems.
Submitted 9 December, 2023; v1 submitted 12 August, 2023;
originally announced August 2023.
-
A large language model-assisted education tool to provide feedback on open-ended responses
Authors:
Jordan K. Matelsky,
Felipe Parodi,
Tony Liu,
Richard D. Lange,
Konrad P. Kording
Abstract:
Open-ended questions are a favored tool among instructors for assessing student understanding and encouraging critical exploration of course material. Providing feedback for such responses is a time-consuming task that can lead to overwhelmed instructors and decreased feedback quality. Many instructors resort to simpler question formats, like multiple-choice questions, which provide immediate feedback but at the expense of personalized and insightful comments. Here, we present a tool that uses large language models (LLMs), guided by instructor-defined criteria, to automate responses to open-ended questions. Our tool delivers rapid personalized feedback, enabling students to quickly test their knowledge and identify areas for improvement. We provide open-source reference implementations both as a web application and as a Jupyter Notebook widget that can be used with instructional coding or math notebooks. With instructor guidance, LLMs hold promise to enhance student learning outcomes and elevate instructional methodologies.
Submitted 25 July, 2023;
originally announced August 2023.
-
Comparing dendritic trees with actual trees
Authors:
Roozbeh Farhoodi,
Phil Wilkes,
Anirudh M. Natarajan,
Samantha Ing-Esteves,
Julie L. Lefebvre,
Mathias Disney,
Konrad P. Kording
Abstract:
Since they became observable, neuron morphologies have been informally compared with biological trees, but they are studied by distinct communities: neuroscientists and ecologists. The apparent structural similarity suggests there may be common quantitative rules and constraints. However, there are also reasons to believe they should be different. For example, while the environments of trees may be relatively simple, neurons are constructed by a complex iterative program where synapses are made and pruned. This complexity may make neurons less self-similar than trees. Here we test this hypothesis by comparing the features of segmented sub-trees with those of the whole tree. We indeed find more self-similarity within actual trees than neurons. At the same time, we find that many other features are somewhat comparable across the two. Investigation of shapes and behaviors promises new ways of conceptualizing the form-function link.
Submitted 4 July, 2023;
originally announced July 2023.
-
Toward Next-Generation Artificial Intelligence: Catalyzing the NeuroAI Revolution
Authors:
Anthony Zador,
Sean Escola,
Blake Richards,
Bence Ölveczky,
Yoshua Bengio,
Kwabena Boahen,
Matthew Botvinick,
Dmitri Chklovskii,
Anne Churchland,
Claudia Clopath,
James DiCarlo,
Surya Ganguli,
Jeff Hawkins,
Konrad Koerding,
Alexei Koulakov,
Yann LeCun,
Timothy Lillicrap,
Adam Marblestone,
Bruno Olshausen,
Alexandre Pouget,
Cristina Savin,
Terrence Sejnowski,
Eero Simoncelli,
Sara Solla,
David Sussillo
, et al. (2 additional authors not shown)
Abstract:
Neuroscience has long been an essential driver of progress in artificial intelligence (AI). We propose that to accelerate progress in AI, we must invest in fundamental research in NeuroAI. A core component of this is the embodied Turing test, which challenges AI animal models to interact with the sensorimotor world at skill levels akin to their living counterparts. The embodied Turing test shifts the focus from those capabilities like game playing and language that are especially well-developed or uniquely human to those capabilities, inherited from over 500 million years of evolution, that are shared with all animals. Building models that can pass the embodied Turing test will provide a roadmap for the next generation of AI.
Submitted 22 February, 2023; v1 submitted 15 October, 2022;
originally announced October 2022.
-
Learning domain-specific causal discovery from time series
Authors:
Xinyue Wang,
Konrad Paul Kording
Abstract:
Causal discovery (CD) from time-varying data is important in neuroscience, medicine, and machine learning. Techniques for CD encompass randomized experiments, which are generally unbiased but expensive, and algorithms such as Granger causality, conditional-independence-based, structural-equation-based, and score-based methods that are only accurate under strong assumptions made by human designers. However, as demonstrated in other areas of machine learning, human expertise is often not entirely accurate and tends to be outperformed in domains with abundant data. In this study, we examine whether we can enhance domain-specific causal discovery for time series using a data-driven approach. Our findings indicate that this procedure significantly outperforms human-designed, domain-agnostic causal discovery methods, such as Mutual Information, VAR-LiNGAM, and Granger Causality on the MOS 6502 microprocessor, the NetSim fMRI dataset, and the Dream3 gene dataset. We argue that, when feasible, the causality field should consider a supervised approach in which domain-specific CD procedures are learned from extensive datasets with known causal relationships, rather than being designed by human specialists. Our findings promise a new approach toward improving CD in neural and medical data and for the broader machine learning community.
Submitted 9 October, 2023; v1 submitted 12 September, 2022;
originally announced September 2022.
-
The neuroconnectionist research programme
Authors:
Adrien Doerig,
Rowan Sommers,
Katja Seeliger,
Blake Richards,
Jenann Ismael,
Grace Lindsay,
Konrad Kording,
Talia Konkle,
Marcel A. J. Van Gerven,
Nikolaus Kriegeskorte,
Tim C. Kietzmann
Abstract:
Artificial Neural Networks (ANNs) inspired by biology are beginning to be widely used to model behavioral and neural data, an approach we call neuroconnectionism. ANNs have been lauded as the current best models of information processing in the brain, but also criticized for failing to account for basic cognitive functions. We propose that arguing about the successes and failures of a restricted set of current ANNs is the wrong approach to assess the promise of neuroconnectionism. Instead, we take inspiration from the philosophy of science, and in particular from Lakatos, who showed that the core of scientific research programmes is often not directly falsifiable, but should be assessed by its capacity to generate novel insights. Following this view, we present neuroconnectionism as a cohesive large-scale research programme centered around ANNs as a computational language for expressing falsifiable theories about brain computation. We describe the core of the programme, the underlying computational framework and its tools for testing specific neuroscientific hypotheses. Taking a longitudinal view, we review past and present neuroconnectionist projects and their responses to challenges, and argue that the research programme is highly progressive, generating new and otherwise unreachable insights into the workings of the brain.
Submitted 8 September, 2022;
originally announced September 2022.
-
Neural Networks as Paths through the Space of Representations
Authors:
Richard D. Lange,
Devin Kwok,
Jordan Matelsky,
Xinyue Wang,
David S. Rolnick,
Konrad P. Kording
Abstract:
Deep neural networks implement a sequence of layer-by-layer operations that are each relatively easy to understand, but the resulting overall computation is generally difficult to understand. We consider a simple hypothesis for interpreting the layer-by-layer construction of useful representations: perhaps the role of each layer is to reformat information to reduce the "distance" to the desired outputs. With this framework, the layer-wise computation implemented by a deep neural network can be viewed as a path through a high-dimensional representation space. We formalize this intuitive idea of a "path" by leveraging recent advances in *metric* representational similarity. We extend existing representational distance methods by computing geodesics, angles, and projections of representations, going beyond mere layer distances. We then demonstrate these tools by visualizing and comparing the paths taken by ResNet and VGG architectures on CIFAR-10. We conclude by sketching additional ways that this kind of representational geometry can be used to understand and interpret network training, and to describe novel kinds of similarities between different models.
Submitted 27 November, 2022; v1 submitted 22 June, 2022;
originally announced June 2022.
-
Nothing makes sense in deep learning, except in the light of evolution
Authors:
Artem Kaznatcheev,
Konrad Paul Kording
Abstract:
Deep Learning (DL) is a surprisingly successful branch of machine learning. The success of DL is usually explained by focusing analysis on a particular recent algorithm and its traits. Instead, we propose that an explanation of the success of DL must look at the population of all algorithms in the field and how they have evolved over time. We argue that cultural evolution is a useful framework to explain the success of DL. In analogy to biology, we use 'development' to mean the process converting the pseudocode or text description of an algorithm into a fully trained model. This includes writing the programming code, compiling and running the program, and training the model. If all parts of the process don't align well, then the resultant model will be useless (if the code runs at all!). This is a constraint. A core component of evolutionary developmental biology is the concept of deconstraints -- these are modifications to the developmental process that avoid complete failure by automatically accommodating changes in other components. We suggest that many important innovations in DL, from neural networks themselves to hyperparameter optimization and AutoGrad, can be seen as developmental deconstraints. These deconstraints can be very helpful to both the particular algorithm in how it handles challenges in implementation and the overall field of DL in how easy it is for new ideas to be generated. We highlight how our perspective can both advance DL and lead to new insights for evolutionary biology.
Submitted 20 May, 2022;
originally announced May 2022.
-
Comparing high-dimensional neural recordings by aligning their low-dimensional latent representations
Authors:
Max Dabagia,
Konrad P Kording,
Eva L Dyer
Abstract:
Many questions in neuroscience involve understanding the responses of large populations of neurons. However, when dealing with large-scale neural activity, interpretation becomes difficult, and comparisons between two animals or across different time points become challenging. One major challenge that we face in modern neuroscience is that of correspondence, e.g. we do not record the exact same neurons at the exact same times. Without some way to link two or more datasets, comparing different collections of neural activity patterns becomes impossible. Here, we describe approaches for leveraging shared latent structure across neural recordings to tackle this correspondence challenge. We review algorithms that map two datasets into a shared space where they can be directly compared, and argue that alignment is key for comparing high-dimensional neural activities across times, subsets of neurons, and individuals.
Submitted 17 May, 2022;
originally announced May 2022.
-
Clustering units in neural networks: upstream vs downstream information
Authors:
Richard D. Lange,
David S. Rolnick,
Konrad P. Kording
Abstract:
It has been hypothesized that some form of "modular" structure in artificial neural networks should be useful for learning, compositionality, and generalization. However, defining and quantifying modularity remains an open problem. We cast the problem of detecting functional modules into the problem of detecting clusters of similar-functioning units. This begs the question of what makes two units functionally similar. For this, we consider two broad families of methods: those that define similarity based on how units respond to structured variations in inputs ("upstream"), and those based on how variations in hidden unit activations affect outputs ("downstream"). We conduct an empirical study quantifying modularity of hidden layer representations of simple feedforward, fully connected networks, across a range of hyperparameters. For each model, we quantify pairwise associations between hidden units in each layer using a variety of both upstream and downstream measures, then cluster them by maximizing their "modularity score" using established tools from network science. We find two surprising results: first, dropout dramatically increased modularity, while other forms of weight regularization had more modest effects. Second, although we observe that there is usually good agreement about clusters within both upstream methods and downstream methods, there is little agreement about the cluster assignments across these two families of methods. This has important implications for representation-learning, as it suggests that finding modular representations that reflect structure in inputs (e.g. disentanglement) may be a distinct goal from learning modular representations that reflect structure in outputs (e.g. compositionality).
Submitted 22 March, 2022;
originally announced March 2022.
-
Prospective Learning: Principled Extrapolation to the Future
Authors:
Ashwin De Silva,
Rahul Ramesh,
Lyle Ungar,
Marshall Hussain Shuler,
Noah J. Cowan,
Michael Platt,
Chen Li,
Leyla Isik,
Seung-Eon Roh,
Adam Charles,
Archana Venkataraman,
Brian Caffo,
Javier J. How,
Justus M Kebschull,
John W. Krakauer,
Maxim Bichuch,
Kaleab Alemayehu Kinfu,
Eva Yezerets,
Dinesh Jayaraman,
Jong M. Shin,
Soledad Villar,
Ian Phillips,
Carey E. Priebe,
Thomas Hartung,
Michael I. Miller
, et al. (18 additional authors not shown)
Abstract:
Learning is a process which can update decision rules, based on past experience, such that future performance improves. Traditionally, machine learning is often evaluated under the assumption that the future will be identical to the past in distribution or change adversarially. But these assumptions can be either too optimistic or pessimistic for many problems in the real world. Real world scenarios evolve over multiple spatiotemporal scales with partially predictable dynamics. Here we reformulate the learning problem to one that centers around this idea of dynamic futures that are partially learnable. We conjecture that certain sequences of tasks are not retrospectively learnable (in which the data distribution is fixed), but are prospectively learnable (in which distributions may be dynamic), suggesting that prospective learning is more difficult in kind than retrospective learning. We argue that prospective learning more accurately characterizes many real world problems that (1) currently stymie existing artificial intelligence solutions and/or (2) lack adequate explanations for how natural intelligences solve them. Thus, studying prospective learning will lead to deeper insights and solutions to currently vexing challenges in both natural and artificial intelligences.
Submitted 13 July, 2023; v1 submitted 18 January, 2022;
originally announced January 2022.
-
Object Based Attention Through Internal Gating
Authors:
Jordan Lei,
Ari S. Benjamin,
Konrad P. Kording
Abstract:
Object-based attention is a key component of the visual system, relevant for perception, learning, and memory. Neurons tuned to features of attended objects tend to be more active than those associated with non-attended objects. There is a rich set of models of this phenomenon in computational neuroscience. However, there is currently a divide between models that successfully match physiological data but can only deal with extremely simple problems and models of attention used in computer vision. For example, attention in the brain is known to depend on top-down processing, whereas self-attention in deep learning does not. Here, we propose an artificial neural network model of object-based attention that captures the way in which attention is both top-down and recurrent. Our attention model works well both on simple test stimuli, such as those using images of handwritten digits, and on more complex stimuli, such as natural images drawn from the COCO dataset. We find that our model replicates a range of findings from neuroscience, including attention-invariant tuning, inhibition of return, and attention-mediated scaling of activity. Understanding object-based attention is both computationally interesting and a key problem for computational neuroscience.
Submitted 8 June, 2021;
originally announced June 2021.
-
A critical reappraisal of predicting suicidal ideation using fMRI
Authors:
Timothy Verstynen,
Konrad Kording
Abstract:
For many psychiatric disorders, neuroimaging offers a potential for revolutionizing diagnosis, and potentially treatment, by providing access to preverbal mental processes. In their study "Machine learning of neural representations of suicide and emotion concepts identifies suicidal youth" [1], Just and colleagues report that a Naive Bayes classifier, trained on voxelwise fMRI responses in human participants during the presentation of words and concepts related to mortality, can predict whether an individual had reported having suicidal ideations with a classification accuracy of 91%. Here we report a reappraisal of the methods employed by the authors, including re-analysis of the same data set, that calls into question the accuracy of the authors' findings. The analysis is a case study in the dangers of overfitting in machine learning.
Submitted 29 October, 2021; v1 submitted 10 March, 2021;
originally announced March 2021.
-
Do biological constraints impair dendritic computation?
Authors:
Ilenna Simone Jones,
Konrad Paul Kording
Abstract:
Computations on the dendritic trees of neurons have important constraints. Voltage-dependent conductances in dendrites are not similar to arbitrary direct-current generation: they are the basis for dendritic nonlinearities, and they do not allow converting positive currents into negative currents. While it has been speculated that the dendritic tree of a neuron can be seen as a multi-layer neural network and it has been shown that such an architecture could be computationally strong, we do not know if that computational strength is preserved under these biological constraints. Here we simulate models of dendritic computation with and without these constraints. We find that dendritic model performance on interesting machine learning tasks is not hurt by these constraints but may benefit from them. Our results suggest that single real dendritic trees may be able to learn a surprisingly broad range of tasks.
Submitted 11 August, 2021; v1 submitted 4 March, 2021;
originally announced March 2021.
-
A Philosophical Understanding of Representation for Neuroscience
Authors:
Ben Baker,
Benjamin Lansdell,
Konrad Kording
Abstract:
Neuroscientists often describe neural activity as a representation of something, or claim to have found evidence for a neural representation. But what do these statements mean? The reasons to call some neural activity a representation and the assumptions that come with this term are not generally made clear from its common uses in neuroscience. Representation is a central concept in philosophy of mind, with a rich history going back to the ancient period. In order to clarify its usage in neuroscience, here we advance a link between the connotations of this term across these disciplines. We draw on a broad range of discourse in philosophy to distinguish three key aspects of representation: correspondence, functional role, and teleology. We argue that each of these aspects is implied by the explanatory role the term plays in neuroscience. However, evidence related to all three aspects is rarely presented or discussed in the course of individual studies that aim to identify representations. Overlooking the significance of all three aspects hinders communication in neuroscience, as it obscures the limitations of experimental paradigms and conceals gaps in our understanding of the phenomena of primary interest. Working from this three-part view, we discuss how to move toward clearer communication about representations in the brain.
Submitted 28 April, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
-
Neuromatch Academy: Teaching Computational Neuroscience with global accessibility
Authors:
Tara van Viegen,
Athena Akrami,
Kate Bonnen,
Eric DeWitt,
Alexandre Hyafil,
Helena Ledmyr,
Grace W. Lindsay,
Patrick Mineault,
John D. Murray,
Xaq Pitkow,
Aina Puce,
Madineh Sedigh-Sarvestani,
Carsen Stringer,
Titipat Achakulvisut,
Elnaz Alikarami,
Melvin Selim Atay,
Eleanor Batty,
Jeffrey C. Erlich,
Byron V. Galbraith,
Yueqi Guo,
Ashley L. Juavinett,
Matthew R. Krause,
Songting Li,
Marius Pachitariu,
Elizabeth Straley
, et al. (10 additional authors not shown)
Abstract:
Neuromatch Academy designed and ran a fully online 3-week Computational Neuroscience summer school for 1757 students with 191 teaching assistants working in virtual inverted (or flipped) classrooms and on small group projects. Fourteen languages, active community management, and low cost allowed for an unprecedented level of inclusivity and universal accessibility.
Submitted 15 December, 2020;
originally announced December 2020.
-
Can Single Neurons Solve MNIST? The Computational Power of Biological Dendritic Trees
Authors:
Ilenna Simone Jones,
Konrad Paul Kording
Abstract:
Physiological experiments have highlighted how the dendrites of biological neurons can nonlinearly process distributed synaptic inputs. This is in stark contrast to units in artificial neural networks that are generally linear apart from an output nonlinearity. If dendritic trees can be nonlinear, biological neurons may have far more computational power than their artificial counterparts. Here we use a simple model where the dendrite is implemented as a sequence of thresholded linear units. We find that such dendrites can readily solve machine learning problems, such as MNIST or CIFAR-10, and that they benefit from having the same input onto several branches of the dendritic tree. This dendrite model is a special case of sparse network. This work suggests that popular neuron models may severely underestimate the computational power enabled by the biological fact of nonlinear dendrites and multiple synapses per pair of neurons. The next generation of artificial neural networks may significantly benefit from these biologically inspired dendritic architectures.
Submitted 2 September, 2020;
originally announced September 2020.
-
Learning to infer in recurrent biological networks
Authors:
Ari S. Benjamin,
Konrad P. Kording
Abstract:
A popular theory of perceptual processing holds that the brain learns both a generative model of the world and a paired recognition model using variational Bayesian inference. Most hypotheses of how the brain might learn these models assume that neurons in a population are conditionally independent given their common inputs. This simplification is likely not compatible with the type of local recurrence observed in the brain. Seeking an alternative that is compatible with complex inter-dependencies yet consistent with known biology, we argue here that the cortex may learn with an adversarial algorithm. Many observable symptoms of this approach would resemble known neural phenomena, including wake/sleep cycles and oscillations that vary in magnitude with surprise, and we describe how further predictions could be tested. We illustrate the idea on recurrent neural networks trained to model image and video datasets. This framework for learning brings variational inference closer to neuroscience and yields multiple testable hypotheses.
Submitted 31 May, 2021; v1 submitted 18 June, 2020;
originally announced June 2020.
-
PDE constraints on smooth hierarchical functions computed by neural networks
Authors:
Khashayar Filom,
Konrad Paul Kording,
Roozbeh Farhoodi
Abstract:
Neural networks are versatile tools for computation, having the ability to approximate a broad range of functions. An important problem in the theory of deep neural networks is expressivity; that is, we want to understand the functions that are computable by a given network. We study real infinitely differentiable (smooth) hierarchical functions implemented by feedforward neural networks via composing simpler functions in two cases:
1) each constituent function of the composition has fewer inputs than the resulting function;
2) constituent functions are in the more specific yet prevalent form of a non-linear univariate function (e.g. tanh) applied to a linear multivariate function.
We establish that in each of these regimes there exist non-trivial algebraic partial differential equations (PDEs), which are satisfied by the computed functions. These PDEs are purely in terms of the partial derivatives and are dependent only on the topology of the network. For compositions of polynomial functions, the algebraic PDEs yield non-trivial equations (of degrees dependent only on the architecture) in the ambient polynomial space that are satisfied on the associated functional varieties. Conversely, we conjecture that such PDE constraints, once accompanied by appropriate non-singularity conditions and perhaps certain inequalities involving partial derivatives, guarantee that the smooth function under consideration can be represented by the network. The conjecture is verified in numerous examples including the case of tree architectures which are of neuroscientific interest. Our approach is a step toward formulating an algebraic description of functional spaces associated with specific neural networks, and may provide new, useful tools for constructing neural networks.
Submitted 13 August, 2021; v1 submitted 18 May, 2020;
originally announced May 2020.
-
MoVi: A Large Multipurpose Motion and Video Dataset
Authors:
Saeed Ghorbani,
Kimia Mahdaviani,
Anne Thaler,
Konrad Kording,
Douglas James Cook,
Gunnar Blohm,
Nikolaus F. Troje
Abstract:
Human movements are both an area of intense study and the basis of many applications such as character animation. For many applications, it is crucial to identify movements from videos or analyze datasets of movements. Here we introduce a new human Motion and Video dataset MoVi, which we make available publicly. It contains 60 female and 30 male actors performing a collection of 20 predefined everyday actions and sports movements, and one self-chosen movement. In five capture rounds, the same actors and movements were recorded using different hardware systems, including an optical motion capture system, video cameras, and inertial measurement units (IMU). For some of the capture rounds, the actors were recorded when wearing natural clothing; for the other rounds they wore minimal clothing. In total, our dataset contains 9 hours of motion capture data, 17 hours of video data from 4 different points of view (including one hand-held camera), and 6.6 hours of IMU data. In this paper, we describe how the dataset was collected and post-processed; we present state-of-the-art estimates of skeletal motions and full-body shape deformations associated with skeletal motion. We discuss examples of potential studies this dataset could enable.
Submitted 3 March, 2020;
originally announced March 2020.
-
Appreciating the variety of goals in computational neuroscience
Authors:
Konrad P. Kording,
Gunnar Blohm,
Paul Schrater,
Kendrick Kay
Abstract:
Within computational neuroscience, informal interactions with modelers often reveal wildly divergent goals. In this opinion piece, we explicitly address the diversity of goals that motivate and ultimately influence modeling efforts. We argue that a wide range of goals can be meaningfully taken to be of highest importance. A simple informal survey conducted on the Internet confirmed the diversity of goals in the community. However, different priorities or preferences of individual researchers can lead to divergent model evaluation criteria. We propose that many disagreements in evaluating the merit of computational research stem from differences in goals and not from the mechanics of constructing, describing, and validating models. We suggest that authors state explicitly their goals when proposing models so that others can judge the quality of the research with respect to its stated goals.
Submitted 8 February, 2020;
originally announced February 2020.
-
End-to-end Training of CNN-CRF via Differentiable Dual-Decomposition
Authors:
Shaofei Wang,
Vishnu Lokhande,
Maneesh Singh,
Konrad Kording,
Julian Yarkony
Abstract:
Modern computer vision (CV) is often based on convolutional neural networks (CNNs) that excel at hierarchical feature extraction. The previous generation of CV approaches was often based on conditional random fields (CRFs) that excel at modeling flexible higher order interactions. As their benefits are complementary they are often combined. However, these approaches generally use mean-field approximations and thus, arguably, did not directly optimize the real problem. Here we revisit dual-decomposition-based approaches to CRF optimization, an alternative to the mean-field approximation. These algorithms can efficiently and exactly solve sub-problems and directly optimize a convex upper bound of the real problem, providing optimality certificates on the way. Our approach uses a novel fixed-point iteration algorithm which enjoys dual-monotonicity, dual-differentiability and high parallelism. The whole system, CRF and CNN can thus be efficiently trained using back-propagation. We demonstrate the effectiveness of our system on semantic image segmentation, showing consistent improvement over baseline models.
Submitted 5 December, 2019;
originally announced December 2019.
-
Spike-based causal inference for weight alignment
Authors:
Jordan Guerguiev,
Konrad P. Kording,
Blake A. Richards
Abstract:
In artificial neural networks trained with gradient descent, the weights used for processing stimuli are also used during backward passes to calculate gradients. For the real brain to approximate gradients, gradient information would have to be propagated separately, such that one set of synaptic weights is used for processing and another set is used for backward passes. This produces the so-called "weight transport problem" for biological models of learning, where the backward weights used to calculate gradients need to mirror the forward weights used to process stimuli. This weight transport problem has been considered so hard that popular proposals for biological learning assume that the backward weights are simply random, as in the feedback alignment algorithm. However, such random weights do not appear to work well for large networks. Here we show how the discontinuity introduced in a spiking system can lead to a solution to this problem. The resulting algorithm is a special case of an estimator used for causal inference in econometrics, regression discontinuity design. We show empirically that this algorithm rapidly makes the backward weights approximate the forward weights. As the backward weights become correct, this improves learning performance over feedback alignment on tasks such as Fashion-MNIST, SVHN, CIFAR-10 and VOC. Our results demonstrate that a simple learning rule in a spiking network can allow neurons to produce the right backward connections and thus solve the weight transport problem.
Submitted 1 February, 2020; v1 submitted 3 October, 2019;
originally announced October 2019.
-
Reverse-Engineering Deep ReLU Networks
Authors:
David Rolnick,
Konrad P. Kording
Abstract:
It has been widely assumed that a neural network cannot be recovered from its outputs, as the network depends on its parameters in a highly nonlinear way. Here, we prove that in fact it is often possible to identify the architecture, weights, and biases of an unknown deep ReLU network by observing only its output. Every ReLU network defines a piecewise linear function, where the boundaries between linear regions correspond to inputs for which some neuron in the network switches between inactive and active ReLU states. By dissecting the set of region boundaries into components associated with particular neurons, we show both theoretically and empirically that it is possible to recover the weights of neurons and their arrangement within the network, up to isomorphism.
Submitted 22 February, 2020; v1 submitted 1 October, 2019;
originally announced October 2019.
-
Movement science needs different pose tracking algorithms
Authors:
Nidhi Seethapathi,
Shaofei Wang,
Rachit Saluja,
Gunnar Blohm,
Konrad P. Kording
Abstract:
Over the last decade, computer science has made progress towards extracting body pose from single camera photographs or videos. This promises to enable movement science to detect disease, quantify movement performance, and take the science out of the lab into the real world. However, current pose tracking algorithms fall short of the needs of movement science; the types of movement data that matter are poorly estimated. For instance, the metrics currently used for evaluating pose tracking algorithms use noisy hand-labeled ground truth data and do not prioritize precision of relevant variables like three-dimensional position, velocity, acceleration, and forces which are crucial for movement science. Here, we introduce the scientific disciplines that use movement data, the types of data they need, and discuss the changes needed to make pose tracking truly transformative for movement science.
Submitted 23 July, 2019;
originally announced July 2019.
-
What does it mean to understand a neural network?
Authors:
Timothy P. Lillicrap,
Konrad P. Kording
Abstract:
We can define a neural network that can learn to recognize objects in less than 100 lines of code. However, after training, it is characterized by millions of weights that contain the knowledge about many object types across visual scenes. Such networks are thus dramatically easier to understand in terms of the code that makes them than the resulting properties, such as tuning or connections. In analogy, we conjecture that rules for development and learning in brains may be far easier to understand than their resulting properties. The analogy suggests that neuroscience would benefit from a focus on learning and development.
Submitted 15 July, 2019;
originally announced July 2019.
-
Reverse engineering neural networks from many partial recordings
Authors:
Elahe Arani,
Sofia Triantafillou,
Konrad P. Kording
Abstract:
Much of neuroscience aims at reverse engineering the brain, but we only record a small number of neurons at a time. We do not currently know if reverse engineering the brain requires us to simultaneously record most neurons or if multiple recordings from smaller subsets suffice. This is made even more important by the development of novel techniques that allow recording from selected subsets of neurons, e.g. using optical techniques. To get at this question, we analyze a neural network, trained on the MNIST dataset, using only partial recordings and characterize the dependency of the quality of our reverse engineering on the number of simultaneously recorded "neurons". We find that reverse engineering of the nonlinear neural network is meaningfully possible if a sufficiently large number of neurons is simultaneously recorded but that this number can be considerably smaller than the number of neurons. Moreover, recording many times from small random subsets of neurons yields surprisingly good performance. Application in neuroscience suggests that, to approximate the I/O function of an actual neural system, we need to record from a much larger number of neurons. The kind of scaling analysis we perform here can, and arguably should, be used to calibrate approaches that can dramatically scale up the size of recorded data sets in neuroscience.
Submitted 2 July, 2019;
originally announced July 2019.
-
Claim Extraction in Biomedical Publications using Deep Discourse Model and Transfer Learning
Authors:
Titipat Achakulvisut,
Chandra Bhagavatula,
Daniel Acuna,
Konrad Kording
Abstract:
Claims are a fundamental unit of scientific discourse. The exponential growth in the number of scientific publications makes automatic claim extraction an important problem for researchers who are overwhelmed by this information overload. Such an automated claim extraction system is useful for both manual and programmatic exploration of scientific knowledge. In this paper, we introduce a new dataset of 1,500 scientific abstracts from the biomedical domain with expert annotations for each sentence indicating whether the sentence presents a scientific claim. We introduce a new model for claim extraction and compare it to several baseline models including rule-based and deep learning techniques. Moreover, we show that using a transfer learning approach with a fine-tuning step allows us to improve performance from a large discourse-annotated dataset. Our final model increases F1-score by over 14 percent points compared to a baseline model without transfer learning. We release a publicly accessible tool for discourse and claims prediction along with an annotation tool. We discuss further applications beyond biomedical literature.
Submitted 16 January, 2020; v1 submitted 1 July, 2019;
originally announced July 2019.
-
Tackling Climate Change with Machine Learning
Authors:
David Rolnick,
Priya L. Donti,
Lynn H. Kaack,
Kelly Kochanski,
Alexandre Lacoste,
Kris Sankaran,
Andrew Slavin Ross,
Nikola Milojevic-Dupont,
Natasha Jaques,
Anna Waldman-Brown,
Alexandra Luccioni,
Tegan Maharaj,
Evan D. Sherwin,
S. Karthik Mukkavilli,
Konrad P. Kording,
Carla Gomes,
Andrew Y. Ng,
Demis Hassabis,
John C. Platt,
Felix Creutzig,
Jennifer Chayes,
Yoshua Bengio
Abstract:
Climate change is one of the greatest challenges facing humanity, and we, as machine learning experts, may wonder how we can help. Here we describe how machine learning can be a powerful tool in reducing greenhouse gas emissions and helping society adapt to a changing climate. From smart grids to disaster management, we identify high impact problems where existing gaps can be filled by machine learning, in collaboration with other fields. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the machine learning community to join the global effort against climate change.
Submitted 5 November, 2019; v1 submitted 10 June, 2019;
originally announced June 2019.
-
Learning to solve the credit assignment problem
Authors:
Benjamin James Lansdell,
Prashanth Ravi Prakash,
Konrad Paul Kording
Abstract:
Backpropagation is driving today's artificial neural networks (ANNs). However, despite extensive research, it remains unclear if the brain implements this algorithm. Among neuroscientists, reinforcement learning (RL) algorithms are often seen as a realistic alternative: neurons can randomly introduce change, and use unspecific feedback signals to observe their effect on the cost and thus approximate their gradient. However, the convergence rate of such learning scales poorly with the number of involved neurons. Here we propose a hybrid learning approach. Each neuron uses an RL-type strategy to learn how to approximate the gradients that backpropagation would provide. We provide proof that our approach converges to the true gradient for certain classes of networks. In both feedforward and convolutional networks, we empirically show that our approach learns to approximate the gradient, and can match the performance of exact gradient-based learning. Learning feedback weights provides a biologically plausible mechanism of achieving good performance, without the need for precise, pre-specified learning rules.
Submitted 22 April, 2020; v1 submitted 3 June, 2019;
originally announced June 2019.
-
Rarely-switching linear bandits: optimization of causal effects for the real world
Authors:
Benjamin Lansdell,
Sofia Triantafillou,
Konrad Kording
Abstract:
Excessively changing policies in many real world scenarios is difficult, unethical, or expensive. After all, doctor guidelines, tax codes, and price lists can only be reprinted so often. We may thus want to only change a policy when it is probable that the change is beneficial. In cases where a policy is a threshold on contextual variables, we can estimate treatment effects for populations lying at the threshold. This allows for a schedule of incremental policy updates that let us optimize a policy while making few detrimental changes. Using this idea, and the theory of linear contextual bandits, we present a conservative policy updating procedure which updates a deterministic policy only when justified. We extend the theory of linear bandits to this rarely-switching case, proving that such procedures share the same regret, up to constant scaling, as the common LinUCB algorithm. However, the algorithm makes far fewer changes to its policy and, of those changes, fewer are detrimental. We provide simulations and an analysis of an infant health well-being causal inference dataset, showing the algorithm efficiently learns a good policy with few changes. Our approach allows efficiently solving problems where changes are to be avoided, with potential applications in medicine, economics and beyond.
Submitted 15 October, 2019; v1 submitted 30 May, 2019;
originally announced May 2019.
-
Quantifying the role of neurons for behavior is a mediation question
Authors:
Ilenna Simone Jones,
Konrad Paul Kording
Abstract:
Many systems neuroscientists want to understand neurons in terms of mediation; we want to understand how neurons are involved in the causal chain from stimulus to behavior. Unfortunately, most tools are inappropriate for that while our language takes mediation for granted. Here we discuss the contrast between our conceptual drive towards mediation and the difficulty of obtaining meaningful evidence.
Submitted 6 May, 2019;
originally announced May 2019.
-
On functions computed on trees
Authors:
Roozbeh Farhoodi,
Khashayar Filom,
Ilenna Simone Jones,
Konrad Paul Kording
Abstract:
Any function can be constructed using a hierarchy of simpler functions through compositions. Such a hierarchy can be characterized by a binary rooted tree. Each node of this tree is associated with a function which takes as inputs two numbers from its children and produces one output. Since thinking about functions in terms of computation graphs is becoming popular, we may want to know which functions can be implemented on a given tree. Here, we describe a set of necessary constraints in the form of a system of non-linear partial differential equations that must be satisfied. Moreover, we prove that these conditions are sufficient in both contexts of analytic and bit-valued functions. In the latter case, we explicitly enumerate discrete functions and observe that there are relatively few. Our point of view allows us to compare different neural network architectures in regard to their function spaces. Our work connects the structure of computation graphs with the functions they can implement and has potential applications to neuroscience and computer science.
Submitted 22 October, 2019; v1 submitted 3 April, 2019;
originally announced April 2019.
-
The lure of misleading causal statements in functional connectivity research
Authors:
David Marc Anton Mehler,
Konrad Paul Kording
Abstract:
As neuroscientists we want to understand how causal interactions or mechanisms within the brain give rise to perception, cognition, and behavior. It is typical to estimate interaction effects from measured activity using statistical techniques such as functional connectivity, Granger Causality, or information flow, whose outcomes are often falsely treated as revealing mechanistic insight. Since these statistical techniques fit models to low-dimensional measurements from brains, they ignore the fact that brain activity is high-dimensional. Here we focus on the obvious confound of common inputs: the countless unobserved variables likely have more influence than the few observed ones. Any given observed correlation can be explained by an infinite set of causal models that take into account the unobserved variables. Therefore, correlations within massively undersampled measurements tell us little about mechanisms. We argue that these mis-inferences of causality from correlation are augmented by an implicit redefinition of words that suggest mechanisms, such as connectivity, causality, and flow.
Submitted 23 October, 2020; v1 submitted 8 December, 2018;
originally announced December 2018.
-
Towards learning-to-learn
Authors:
Benjamin James Lansdell,
Konrad Paul Kording
Abstract:
In good old-fashioned artificial intelligence (GOFAI), humans specified systems that solved problems. Much of the recent progress in AI has come from replacing human insights by learning. However, learning itself is still usually built by humans -- specifically the choice that parameter updates should follow the gradient of a cost function. Yet, in analogy with GOFAI, there is no reason to believe that humans are particularly good at defining such learning systems: we may expect learning itself to be better if we learn it. Recent research in machine learning has started to realize the benefits of that strategy. We should thus expect this to be relevant for neuroscience: how could the correct learning rules be acquired? Indeed, cognitive science has long shown that humans learn-to-learn, which is potentially responsible for their impressive learning abilities. Here we discuss ideas across machine learning, neuroscience, and cognitive science that matter for the principle of learning-to-learn.
Submitted 9 January, 2019; v1 submitted 1 November, 2018;
originally announced November 2018.
-
Measuring and regularizing networks in function space
Authors:
Ari S. Benjamin,
David Rolnick,
Konrad Kording
Abstract:
To optimize a neural network one often thinks of optimizing its parameters, but it is ultimately a matter of optimizing the function that maps inputs to outputs. Since a change in the parameters might serve as a poor proxy for the change in the function, it is of some concern that primacy is given to parameters but that the correspondence has not been tested. Here, we show that it is simple and computationally feasible to calculate distances between functions in an $L^2$ Hilbert space. We examine how typical networks behave in this space, and how parameter $\ell^2$ distances compare to function $L^2$ distances between various points of an optimization trajectory. We find that the two distances are nontrivially related. In particular, the $L^2/\ell^2$ ratio decreases throughout optimization, reaching a steady value around when test error plateaus. We then investigate how the $L^2$ distance could be applied directly to optimization. First, we propose that in multitask learning, one can avoid catastrophic forgetting by directly limiting how much the input/output function changes between tasks. Second, we propose a new learning rule that constrains the distance a network can travel through $L^2$-space in any one update. This allows new examples to be learned in a way that minimally interferes with what has previously been learned. These applications demonstrate how one can measure and regularize function distances directly, without relying on parameters or local approximations like loss curvature.
Submitted 26 June, 2019; v1 submitted 21 May, 2018;
originally announced May 2018.
-
The Roles of Supervised Machine Learning in Systems Neuroscience
Authors:
Joshua I. Glaser,
Ari S. Benjamin,
Roozbeh Farhoodi,
Konrad P. Kording
Abstract:
Over the last several years, the use of machine learning (ML) in neuroscience has been rapidly increasing. Here, we review ML's contributions, both realized and potential, across several areas of systems neuroscience. We describe four primary roles of ML within neuroscience: 1) creating solutions to engineering problems, 2) identifying predictive variables, 3) setting benchmarks for simple models of the brain, and 4) serving itself as a model for the brain. The breadth and ease of its applicability suggest that machine learning should be in the toolbox of most systems neuroscientists.
Submitted 26 November, 2018; v1 submitted 21 May, 2018;
originally announced May 2018.
-
The Social Structure of Consensus in Scientific Review
Authors:
Misha Teplitskiy,
Daniel Acuna,
Aida Elamrani-Raoult,
Konrad Kording,
James Evans
Abstract:
Personal connections between creators and evaluators of scientific works are ubiquitous, and the possibility of bias ever-present. Although connections have been shown to bias prospective judgments of (uncertain) future performance, it is unknown whether such biases occur in the much more concrete task of assessing the scientific validity of already completed work, and if so, why. This study presents evidence that personal connections between authors and reviewers of neuroscience manuscripts are associated with biased judgments and explores the mechanisms driving the effect. Using reviews from 7,981 neuroscience manuscripts submitted to the journal PLOS ONE, which instructs reviewers to evaluate manuscripts only on scientific validity, we find that reviewers favored authors close in the co-authorship network by ~0.11 points on a 1.0 - 4.0 scale for each step of proximity. PLOS ONE's validity-focused review and the substantial amount of favoritism shown by distant vs. very distant reviewers, both of whom should have little to gain from nepotism, point to the central role of substantive disagreements between scientists in different "schools of thought." The results suggest that removing bias from peer review cannot be accomplished simply by recusing the closely-connected reviewers, and highlight the value of recruiting reviewers embedded in diverse professional networks.
Submitted 5 February, 2018;
originally announced February 2018.
-
Efficient Multi-Person Pose Estimation with Provable Guarantees
Authors:
Shaofei Wang,
Konrad Paul Kording,
Julian Yarkony
Abstract:
Multi-person pose estimation (MPPE) in natural images is key to the meaningful use of visual data in many fields including movement science, security, and rehabilitation. In this paper we tackle MPPE with a bottom-up approach, starting with candidate detections of body parts from a convolutional neural network (CNN) and grouping them into people. We formulate the grouping of body part detections into people as a minimum-weight set packing (MWSP) problem where the set of potential people is the power set of body part detections. We model the quality of a hypothesis of a person, which is a set in the MWSP, by an augmented tree-structured Markov random field where variables correspond to body-parts and their state-spaces correspond to the power set of the detections for that part.
We describe a novel algorithm that combines efficiency with provable bounds on this MWSP problem. We employ an implicit column generation strategy where the pricing problem is formulated as a dynamic program. To efficiently solve this dynamic program we exploit the problem structure utilizing a nested Benders decomposition (NBD) exact inference strategy, which we speed up by recycling Benders rows between calls to the pricing problem.
We test our approach on the MPII-Multiperson dataset, showing that our approach obtains comparable results with the state-of-the-art algorithm for joint node labeling and grouping problems, and that NBD achieves considerable speed-ups relative to a naive dynamic programming approach. Typical algorithms that solve joint node labeling and grouping problems use heuristics and thus cannot obtain proofs of optimality. Our approach, in contrast, proves that for over 99 percent of problem instances we find the globally optimal solution and otherwise provide upper/lower bounds.
Submitted 21 November, 2017;
originally announced November 2017.
-
Exploiting skeletal structure in computer vision annotation with Benders decomposition
Authors:
Shaofei Wang,
Konrad Kording,
Julian Yarkony
Abstract:
Many annotation problems in computer vision can be phrased as integer linear programs (ILPs). The use of standard industrial solvers does not exploit the underlying structure of such problems, e.g., the skeleton in pose estimation. Leveraging the underlying structure in conjunction with industrial solvers promises increases in both speed and accuracy. Such structure can be exploited using Benders decomposition, a technique from operations research that solves complex ILPs or mixed integer linear programs by decomposing them into sub-problems that communicate via a master problem. The intuition is that, conditioned on a small subset of the variables, the solution to the remaining variables can be computed easily by taking advantage of properties of the ILP constraint matrix such as block structure. In this paper we apply Benders decomposition to a typical problem in computer vision where we have many sub-ILPs (e.g., partitioning of detections, body-parts) coupled to a master ILP (e.g., constructing skeletons). Dividing inference problems into a master problem and sub-problems motivates the development of a plethora of novel models and inference approaches for the field of computer vision.
Submitted 13 September, 2017;
originally announced September 2017.
-
Machine learning for neural decoding
Authors:
Joshua I. Glaser,
Ari S. Benjamin,
Raeed H. Chowdhury,
Matthew G. Perich,
Lee E. Miller,
Konrad P. Kording
Abstract:
Despite rapid advances in machine learning tools, the majority of neural decoding approaches still use traditional methods. Modern machine learning tools, which are versatile and easy to use, have the potential to significantly improve decoding performance. This tutorial describes how to effectively apply these algorithms for typical decoding problems. We provide descriptions, best practices, and code for applying common machine learning methods, including neural networks and gradient boosting. We also provide detailed comparisons of the performance of various methods at the task of decoding spiking activity in motor cortex, somatosensory cortex, and hippocampus. Modern methods, particularly neural networks and ensembles, significantly outperform traditional approaches, such as Wiener and Kalman filters. Improving the performance of neural decoding algorithms allows neuroscientists to better understand the information contained in a neural population and can help advance engineering applications such as brain machine interfaces.
Submitted 3 July, 2020; v1 submitted 2 August, 2017;
originally announced August 2017.
-
Meaningless comparisons lead to false optimism in medical machine learning
Authors:
Orianna DeMasi,
Konrad Kording,
Benjamin Recht
Abstract:
A new trend in medicine is the use of algorithms to analyze big datasets, e.g. using everything your phone measures about you for diagnostics or monitoring. However, these algorithms are commonly compared against weak baselines, which may contribute to excessive optimism. To assess how well an algorithm works, scientists typically ask how well its output correlates with medically assigned scores. Here we perform a meta-analysis to quantify how the literature evaluates its algorithms for monitoring mental wellbeing. We find that the bulk of the literature ($\sim$77%) uses meaningless comparisons that ignore patient baseline state. For example, having an algorithm that uses phone data to diagnose mood disorders would be useful. However, it is possible to explain over 80% of the variance of some mood measures in the population by simply guessing that each patient has their own average mood - the patient-specific baseline. Thus, an algorithm that just predicts that our mood is like it usually is can explain the majority of variance, but is, obviously, entirely useless. Comparing to the wrong (population) baseline has a massive effect on the perceived quality of algorithms and produces baseless optimism in the field. To solve this problem we propose "user lift", a measure that reduces these systematic errors in the evaluation of personalized medical monitoring.
Submitted 19 July, 2017;
originally announced July 2017.
-
Grand Challenges for Global Brain Sciences
Authors:
Joshua T. Vogelstein,
Katrin Amunts,
Andreas Andreou,
Dora Angelaki,
Giorgio Ascoli,
Cori Bargmann,
Randal Burns,
Corrado Cali,
Frances Chance,
Miyoung Chun,
George Church,
Hollis Cline,
Todd Coleman,
Stephanie de La Rochefoucauld,
Winfried Denk,
Ana Belen Elgoyhen,
Ralph Etienne Cummings,
Alan Evans,
Kenneth Harris,
Michael Hausser,
Sean Hill,
Samuel Inverso,
Chad Jackson,
Viren Jain,
Rob Kass, et al. (37 additional authors not shown)
Abstract:
The next grand challenges for society and science are in the brain sciences. A collection of 60+ scientists from around the world, together with 10+ observers from national, private, and foundations, spent two days together discussing the top challenges that we could solve as a global community in the next decade. We eventually settled on three challenges, spanning anatomy, physiology, and medicine. Addressing all three challenges requires novel computational infrastructure. The group proposed the advent of The International Brain Station (TIBS), to address these challenges, and launch brain sciences to the next level of understanding.
Submitted 27 October, 2016; v1 submitted 23 August, 2016;
originally announced August 2016.
-
Towards an integration of deep learning and neuroscience
Authors:
Adam Marblestone,
Greg Wayne,
Konrad Kording
Abstract:
Neuroscience has focused on the detailed implementation of computation, studying neural codes, dynamics and circuits. In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively uniform initial architectures. Two recent developments have emerged within machine learning that create an opportunity to connect these seemingly divergent perspectives. First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage. Second, cost functions and training procedures have become more complex and are varied across layers and over time. Here we think about the brain in terms of these ideas. We hypothesize that (1) the brain optimizes cost functions, (2) these cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior. Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism. We suggest directions by which neuroscience could seek to refine and test these hypotheses.
Submitted 13 June, 2016;
originally announced June 2016.
-
Quantifying mesoscale neuroanatomy using X-ray microtomography
Authors:
Eva L. Dyer,
William Gray Roncal,
Hugo L. Fernandes,
Doga Gürsoy,
Vincent De Andrade,
Rafael Vescovi,
Kamel Fezzaa,
Xianghui Xiao,
Joshua T. Vogelstein,
Chris Jacobsen,
Konrad P. Körding,
Narayanan Kasthuri
Abstract:
Methods for resolving the 3D microstructure of the brain typically start by thinly slicing and staining the brain, and then imaging each individual section with visible light photons or electrons. In contrast, X-rays can be used to image thick samples, providing a rapid approach for producing large 3D brain maps without sectioning. Here we demonstrate the use of synchrotron X-ray microtomography ($μ$CT) for producing mesoscale $(1~μm^3)$ resolution brain maps from millimeter-scale volumes of mouse brain. We introduce a pipeline for $μ$CT-based brain mapping that combines methods for sample preparation, imaging, automated segmentation of image volumes into cells and blood vessels, and statistical analysis of the resulting brain structures. Our results demonstrate that X-ray tomography promises rapid quantification of large brain volumes, complementing other brain mapping and connectomics efforts.
Submitted 26 July, 2016; v1 submitted 12 April, 2016;
originally announced April 2016.
-
From sample to knowledge: Towards an integrated approach for neuroscience discovery
Authors:
William Gray Roncal,
Eva L Dyer,
Doga Gürsoy,
Konrad Kording,
Narayanan Kasthuri
Abstract:
Imaging methods used in modern neuroscience experiments are quickly producing large amounts of data capable of providing increasing amounts of knowledge about neuroanatomy and function. A great deal of information in these datasets is relatively unexplored and untapped. One of the bottlenecks in knowledge extraction is that often there is no feedback loop between the knowledge produced (e.g., graph, density estimate, or other statistic) and the earlier stages of the pipeline (e.g., acquisition). We thus advocate for the development of sample-to-knowledge discovery pipelines that one can use to optimize acquisition and processing steps with a particular end goal (i.e., piece of knowledge) in mind. We therefore propose that optimization takes place not just within each processing stage but also between adjacent (and non-adjacent) steps of the pipeline. Furthermore, we explore the existing categories of knowledge representation and models to motivate the types of experiments and analysis needed to achieve the ultimate goal. To illustrate this approach, we provide an experimental paradigm to answer questions about large-scale synaptic distributions through a multimodal approach combining X-ray microtomography and electron microscopy.
Submitted 23 January, 2017; v1 submitted 11 April, 2016;
originally announced April 2016.
-
Science Concierge: A fast content-based recommendation system for scientific publications
Authors:
Titipat Achakulvisut,
Daniel E. Acuna,
Tulakan Ruangrong,
Konrad Kording
Abstract:
Finding relevant publications is important for scientists who have to cope with an exponentially increasing volume of scholarly material. Algorithms can help with this task as they do for music, movie, and product recommendations. However, we know little about the performance of these algorithms with scholarly material. Here, we develop an algorithm, and an accompanying Python library, that implements a recommendation system based on the content of articles. Design principles are to adapt to new content, provide near-real time suggestions, and be open source. We tested the library on 15K posters from the Society for Neuroscience Conference 2015. Human curated topics are used to cross validate parameters in the algorithm and produce a similarity metric that maximally correlates with human judgments. We show that our algorithm significantly outperformed suggestions based on keywords. The work presented here promises to make the exploration of scholarly material faster and more accurate.
Submitted 11 May, 2016; v1 submitted 4 April, 2016;
originally announced April 2016.