Showing 1–30 of 30 results for author: Yosinski, J

  1. arXiv:2301.05768  [pdf, other]

    cs.CV

    RxRx1: A Dataset for Evaluating Experimental Batch Correction Methods

    Authors: Maciej Sypetkowski, Morteza Rezanejad, Saber Saberian, Oren Kraus, John Urbanik, James Taylor, Ben Mabey, Mason Victors, Jason Yosinski, Alborz Rezazadeh Sereshkeh, Imran Haque, Berton Earnshaw

    Abstract: High-throughput screening techniques are commonly used to obtain large quantities of data in many fields of biology. It is well known that artifacts arising from variability in the technical execution of different experimental batches within such screens confound these observations and can lead to invalid biological conclusions. It is therefore necessary to account for these batch effects when ana…

    Submitted 13 January, 2023; originally announced January 2023.

  2. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  3. arXiv:2109.07684  [pdf, other]

    cs.CL cs.AI

    Language Models are Few-shot Multilingual Learners

    Authors: Genta Indra Winata, Andrea Madotto, Zhaojiang Lin, Rosanne Liu, Jason Yosinski, Pascale Fung

    Abstract: General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language processing (NLP) tasks and benchmarks when inferring instructions from very few examples. Here, we evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages without a…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: 14 pages

  4. arXiv:2107.07741  [pdf, other]

    cs.LG

    When does loss-based prioritization fail?

    Authors: Niel Teng Hu, Xinyu Hu, Rosanne Liu, Sara Hooker, Jason Yosinski

    Abstract: Not all examples are created equal, but standard deep neural network training protocols treat each training point uniformly. Each example is propagated forward and backward through the network the same amount of times, independent of how much the example contributes to the learning protocol. Recent work has proposed ways to accelerate training by deviating from this uniform treatment. Popular meth…

    Submitted 16 July, 2021; originally announced July 2021.

  5. arXiv:2006.14769  [pdf, other]

    cs.LG cs.AI stat.ML

    Supermasks in Superposition

    Authors: Mitchell Wortsman, Vivek Ramanujan, Rosanne Liu, Aniruddha Kembhavi, Mohammad Rastegari, Jason Yosinski, Ali Farhadi

    Abstract: We present the Supermasks in Superposition (SupSup) model, capable of sequentially learning thousands of tasks without catastrophic forgetting. Our approach uses a randomly initialized, fixed base network and for each task finds a subnetwork (supermask) that achieves good performance. If task identity is given at test time, the correct subnetwork can be retrieved with minimal memory usage. If not…

    Submitted 21 October, 2020; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020 Camera Ready

  6. arXiv:2002.09505  [pdf, other]

    cs.LG cs.AI stat.ML

    Estimating Q(s,s') with Deep Deterministic Dynamics Gradients

    Authors: Ashley D. Edwards, Himanshu Sahni, Rosanne Liu, Jane Hung, Ankit Jain, Rui Wang, Adrien Ecoffet, Thomas Miconi, Charles Isbell, Jason Yosinski

    Abstract: In this paper, we introduce a novel form of value function, $Q(s, s')$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter. In order to derive an optimal policy, we develop a forward dynamics model that learns to make next-state predictions that maximize this value. This formulation decouples actions from values while still…

    Submitted 25 August, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: Accepted into ICML 2020

  7. arXiv:1912.02164  [pdf, other]

    cs.CL cs.AI cs.LG

    Plug and Play Language Models: A Simple Approach to Controlled Text Generation

    Authors: Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, Rosanne Liu

    Abstract: Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data and entailing the significant cost of retraining. We propose a simple alternative: the…

    Submitted 3 March, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

    Comments: ICLR 2020 camera ready

  8. arXiv:1910.08461  [pdf, other]

    cs.LG stat.ML

    First-Order Preconditioning via Hypergradient Descent

    Authors: Ted Moskovitz, Rui Wang, Janice Lan, Sanyam Kapoor, Thomas Miconi, Jason Yosinski, Aditya Rawal

    Abstract: Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a pre-conditioning matrix to the gradient to improve convergence. Unfortunately, such algorithms typically struggle to scale to high-dimensional problems, in part…

    Submitted 27 April, 2020; v1 submitted 18 October, 2019; originally announced October 2019.

  9. arXiv:1909.01440  [pdf, other]

    cs.LG stat.ML

    LCA: Loss Change Allocation for Neural Network Training

    Authors: Janice Lan, Rosanne Liu, Hattie Zhou, Jason Yosinski

    Abstract: Neural networks enjoy widespread use, but many aspects of their training, representation, and operation are poorly understood. In particular, our view into the training process is limited, with a single scalar loss being the most common viewport into this high-dimensional, dynamic process. We propose a new window into training called Loss Change Allocation (LCA), in which credit for changes to the…

    Submitted 3 March, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: NeurIPS 2019 camera ready version

  10. arXiv:1906.01563  [pdf, other]

    cs.NE

    Hamiltonian Neural Networks

    Authors: Sam Greydanus, Misko Dzamba, Jason Yosinski

    Abstract: Even though neural networks enjoy widespread use, they still struggle to learn the basic laws of physics. How might we endow them with better inductive biases? In this paper, we draw inspiration from Hamiltonian mechanics to train models that learn and respect exact conservation laws in an unsupervised manner. We evaluate our models on problems where conservation of energy is important, including…

    Submitted 5 September, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: Conference paper at NeurIPS 2019. Main paper has 8 pages and 5 figures

  11. arXiv:1905.01067  [pdf, other]

    cs.LG cs.CV stat.ML

    Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

    Authors: Hattie Zhou, Janice Lan, Rosanne Liu, Jason Yosinski

    Abstract: The recent "Lottery Ticket Hypothesis" paper by Frankle & Carbin showed that a simple approach to creating sparse networks (keeping the large weights) results in models that are trainable from scratch, but only when starting from the same initial weights. The performance of these networks often exceeds the performance of the non-sparse base model, but for reasons that were not well understood. In…

    Submitted 3 March, 2020; v1 submitted 3 May, 2019; originally announced May 2019.

    Comments: NeurIPS 2019 camera ready version

  12. arXiv:1904.08939  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Understanding Neural Networks via Feature Visualization: A survey

    Authors: Anh Nguyen, Jason Yosinski, Jeff Clune

    Abstract: A neuroscience method to understanding the brain is to find and study the preferred stimuli that highly activate an individual cell or groups of cells. Recent advances in machine learning enable a family of methods to synthesize preferred stimuli that cause a neuron in an artificial or biological brain to fire strongly. Those methods are known as Activation Maximization (AM) or Feature Visualizati…

    Submitted 18 April, 2019; originally announced April 2019.

    Comments: A book chapter in an Interpretable ML book (http://www.interpretable-ml.org/book/)

  13. arXiv:1811.11357  [pdf, other]

    stat.ML cs.LG

    Metropolis-Hastings Generative Adversarial Networks

    Authors: Ryan Turner, Jane Hung, Eric Frank, Yunus Saatci, Jason Yosinski

    Abstract: We introduce the Metropolis-Hastings generative adversarial network (MH-GAN), which combines aspects of Markov chain Monte Carlo and GANs. The MH-GAN draws samples from the distribution implicitly defined by a GAN's discriminator-generator pair, as opposed to standard GANs which draw samples from the distribution defined only by the generator. It uses the discriminator from GAN training to build a…

    Submitted 17 May, 2019; v1 submitted 27 November, 2018; originally announced November 2018.

  14. arXiv:1807.03247  [pdf, other]

    cs.CV cs.LG stat.ML

    An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution

    Authors: Rosanne Liu, Joel Lehman, Piero Molino, Felipe Petroski Such, Eric Frank, Alex Sergeev, Jason Yosinski

    Abstract: Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in…

    Submitted 3 December, 2018; v1 submitted 9 July, 2018; originally announced July 2018.

    Comments: Published in NeurIPS 2018

  15. arXiv:1804.08838  [pdf, other]

    cs.LG cs.NE stat.ML

    Measuring the Intrinsic Dimension of Objective Landscapes

    Authors: Chunyuan Li, Heerad Farkhoor, Rosanne Liu, Jason Yosinski

    Abstract: Many recently trained neural networks employ large numbers of parameters to achieve good performance. One may intuitively use the number of parameters required as a rough gauge of the difficulty of a problem. But how accurate are such notions? How many parameters are really needed? In this paper we attempt to answer this question by training networks not in their native parameter space, but instea…

    Submitted 24 April, 2018; originally announced April 2018.

    Comments: Published in ICLR 2018

  16. arXiv:1803.03453  [pdf, other]

    cs.NE

    The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities

    Authors: Joel Lehman, Jeff Clune, Dusan Misevic, Christoph Adami, Lee Altenberg, Julie Beaulieu, Peter J. Bentley, Samuel Bernard, Guillaume Beslon, David M. Bryson, Patryk Chrabaszcz, Nick Cheney, Antoine Cully, Stephane Doncieux, Fred C. Dyer, Kai Olav Ellefsen, Robert Feldt, Stephan Fischer, Stephanie Forrest, Antoine Frénoy, Christian Gagné, Leni Le Goff, Laura M. Grabowski, Babak Hodjat, Frank Hutter, et al. (28 additional authors not shown)

    Abstract: Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution's creativity is not limited to nature. Indeed, many researchers in the field of digital evolution have observed their evolving algorithms and organisms su…

    Submitted 21 November, 2019; v1 submitted 9 March, 2018; originally announced March 2018.

  17. arXiv:1706.05806  [pdf, other]

    stat.ML cs.LG

    SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability

    Authors: Maithra Raghu, Justin Gilmer, Jason Yosinski, Jascha Sohl-Dickstein

    Abstract: We propose a new technique, Singular Vector Canonical Correlation Analysis (SVCCA), a tool for quickly comparing two representations in a way that is both invariant to affine transform (allowing comparison between different layers and networks) and fast to compute (allowing more comparisons to be calculated than with previous methods). We deploy this tool to measure the intrinsic dimensionality of…

    Submitted 8 November, 2017; v1 submitted 19 June, 2017; originally announced June 2017.

    Comments: Accepted to NIPS 2017, code: https://github.com/google/svcca/ , new plots on Imagenet

  18. arXiv:1612.00005  [pdf, other]

    cs.CV

    Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space

    Authors: Anh Nguyen, Jeff Clune, Yoshua Bengio, Alexey Dosovitskiy, Jason Yosinski

    Abstract: Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network. In this paper we extend this method by introducing…

    Submitted 12 April, 2017; v1 submitted 30 November, 2016; originally announced December 2016.

    Comments: CVPR camera-ready

  19. arXiv:1605.09304  [pdf, other]

    cs.NE cs.AI cs.CV cs.LG

    Synthesizing the preferred inputs for neurons in neural networks via deep generator networks

    Authors: Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, Jeff Clune

    Abstract: Deep neural networks (DNNs) have demonstrated state-of-the-art results on many pattern recognition tasks, especially vision classification problems. Understanding the inner workings of such computational brains is both fascinating basic science that is interesting in its own right - similar to why we study the human brain - and will enable researchers to further improve DNNs. One path to understan…

    Submitted 23 November, 2016; v1 submitted 30 May, 2016; originally announced May 2016.

    Comments: 29 pages, 35 figures, NIPS camera-ready

  20. arXiv:1602.03616  [pdf, other]

    cs.NE cs.CV

    Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks

    Authors: Anh Nguyen, Jason Yosinski, Jeff Clune

    Abstract: We can better understand deep neural networks by identifying which features each of their neurons have learned to detect. To do so, researchers have created Deep Visualization techniques including activation maximization, which synthetically generates inputs (e.g. images) that maximally activate each neuron. A limitation of current techniques is that they assume each neuron detects only one type o…

    Submitted 7 May, 2016; v1 submitted 11 February, 2016; originally announced February 2016.

    Comments: 23 pages (including SI), 24 figures

  21. arXiv:1511.07543  [pdf, other]

    cs.LG cs.NE

    Convergent Learning: Do different neural networks learn the same representations?

    Authors: Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson, John Hopcroft

    Abstract: Recent success in training deep neural networks have prompted active investigation into the features learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by millions of parameters, but valuable because it increases our ability to understand current models and create improved versions of them. In this paper we investi…

    Submitted 28 February, 2016; v1 submitted 23 November, 2015; originally announced November 2015.

    Comments: Published as a conference paper at ICLR 2016

  22. arXiv:1511.07356  [pdf, other]

    cs.CV

    Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation

    Authors: Sina Honari, Jason Yosinski, Pascal Vincent, Christopher Pal

    Abstract: Deep neural networks with alternating convolutional, max-pooling and decimation layers are widely used in state of the art architectures for computer vision. Max-pooling purposefully discards precise spatial information in order to create features that are more robust, and typically organized as lower resolution spatial feature maps. On some tasks, such as whole-image classification, max-pooling d…

    Submitted 17 April, 2016; v1 submitted 23 November, 2015; originally announced November 2015.

    Comments: accepted in CVPR 2016

  23. arXiv:1506.06579  [pdf, other]

    cs.CV cs.LG cs.NE

    Understanding Neural Networks Through Deep Visualization

    Authors: Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, Hod Lipson

    Abstract: Recent years have produced great advances in training large, deep neural networks (DNNs), including notable successes in training convolutional neural networks (convnets) to recognize natural images. However, our understanding of how these models work, especially what computations they perform at intermediate layers, has lagged behind. Progress in the field will be further accelerated by the devel…

    Submitted 22 June, 2015; originally announced June 2015.

    Comments: 12 pages. To appear at ICML Deep Learning Workshop 2015

  24. arXiv:1505.00359  [pdf, other]

    cs.LG

    Can deep learning help you find the perfect match?

    Authors: Harm de Vries, Jason Yosinski

    Abstract: Is he/she my type or not? The answer to this question depends on the personal preferences of the one asking it. The individual process of obtaining a full answer may generally be difficult and time consuming, but often an approximate answer can be obtained simply by looking at a photo of the potential match. Such approximate answers based on visual cues can be produced in a fraction of a second, a…

    Submitted 20 June, 2015; v1 submitted 2 May, 2015; originally announced May 2015.

  25. arXiv:1503.05571  [pdf, other]

    cs.LG

    GSNs: Generative Stochastic Networks

    Authors: Guillaume Alain, Yoshua Bengio, Li Yao, Jason Yosinski, Eric Thibodeau-Laufer, Saizheng Zhang, Pascal Vincent

    Abstract: We introduce a novel training principle for probabilistic models that is an alternative to maximum likelihood. The proposed Generative Stochastic Networks (GSN) framework is based on learning the transition operator of a Markov chain whose stationary distribution estimates the data distribution. Because the transition distribution is a conditional distribution generally involving a small move, it…

    Submitted 23 March, 2015; v1 submitted 18 March, 2015; originally announced March 2015.

    Comments: arXiv admin note: substantial text overlap with arXiv:1306.1091

  26. arXiv:1412.1897  [pdf, other]

    cs.CV cs.AI cs.NE

    Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images

    Authors: Anh Nguyen, Jason Yosinski, Jeff Clune

    Abstract: Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification problems. Given that DNNs are now able to classify objects in images with near-human-level performance, questions naturally arise as to what differences remain between computer and human vision. A recent study revealed that changing an…

    Submitted 2 April, 2015; v1 submitted 5 December, 2014; originally announced December 2014.

    Comments: To appear at CVPR 2015

  27. arXiv:1411.1792  [pdf, other]

    cs.LG cs.NE

    How transferable are features in deep neural networks?

    Authors: Jason Yosinski, Jeff Clune, Yoshua Bengio, Hod Lipson

    Abstract: Many deep neural networks trained on natural images exhibit a curious phenomenon in common: on the first layer they learn features similar to Gabor filters and color blobs. Such first-layer features appear not to be specific to a particular dataset or task, but general in that they are applicable to many datasets and tasks. Features must eventually transition from general to specific by the last l…

    Submitted 6 November, 2014; originally announced November 2014.

    Comments: To appear in Advances in Neural Information Processing Systems 27 (NIPS 2014)

    Journal ref: Advances in Neural Information Processing Systems 27, pages 3320-3328. Dec. 2014

  28. arXiv:1306.1091  [pdf, other]

    cs.LG

    Deep Generative Stochastic Networks Trainable by Backprop

    Authors: Yoshua Bengio, Éric Thibodeau-Laufer, Guillaume Alain, Jason Yosinski

    Abstract: We introduce a novel training principle for probabilistic models that is an alternative to maximum likelihood. The proposed Generative Stochastic Networks (GSN) framework is based on learning the transition operator of a Markov chain whose stationary distribution estimates the data distribution. The transition distribution of the Markov chain is conditional on the previous state, generally involvi…

    Submitted 23 May, 2014; v1 submitted 5 June, 2013; originally announced June 2013.

    Comments: arXiv admin note: text overlap with arXiv:1305.0445. Also published in ICML 2014

  29. arXiv:1304.4889  [pdf, other]

    cs.NE cs.HC

    Hands-free Evolution of 3D-printable Objects via Eye Tracking

    Authors: Nick Cheney, Jeff Clune, Jason Yosinski, Hod Lipson

    Abstract: Interactive evolution has shown the potential to create amazing and complex forms in both 2-D and 3-D settings. However, the algorithm is slow and users quickly become fatigued. We propose that the use of eye tracking for interactive evolution systems will both reduce user fatigue and improve evolutionary success. We describe a systematic method for testing the hypothesis that eye tracking driven…

    Submitted 19 April, 2013; v1 submitted 17 April, 2013; originally announced April 2013.

    Comments: 6 pages, 7 figures

  30. arXiv:1202.4465  [pdf, other]

    cs.RO cs.AI

    MAV Stabilization using Machine Learning and Onboard Sensors

    Authors: Jason Yosinski, Cooper Bills

    Abstract: In many situations, Miniature Aerial Vehicles (MAVs) are limited to using only on-board sensors for navigation. This limits the data available to algorithms used for stabilization and localization, and current control methods are often insufficient to allow reliable hovering in place or trajectory following. In this research, we explore using machine learning to predict the drift (flight path erro…

    Submitted 20 February, 2012; originally announced February 2012.

    Comments: 9 pages, 7 figures