Showing 1–16 of 16 results for author: Goh, G

Searching in archive cs.
  1. arXiv:2406.04093  [pdf, other]

    cs.LG cs.AI

    Scaling and evaluating sparse autoencoders

    Authors: Leo Gao, Tom Dupré la Tour, Henk Tillman, Gabriel Goh, Rajan Troll, Alec Radford, Ilya Sutskever, Jan Leike, Jeffrey Wu

    Abstract: Sparse autoencoders provide a promising unsupervised approach for extracting interpretable features from a language model by reconstructing activations from a sparse bottleneck layer. Since language models learn many concepts, autoencoders need to be very large to recover all relevant features. However, studying the properties of autoencoder scaling is difficult due to the need to balance reconstr…

    Submitted 6 June, 2024; originally announced June 2024.

  2. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  3. arXiv:2103.00020  [pdf, other]

    cs.CV cs.LG

    Learning Transferable Visual Models From Natural Language Supervision

    Authors: Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever

    Abstract: State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstr…

    Submitted 26 February, 2021; originally announced March 2021.

  4. arXiv:2102.12092  [pdf, other]

    cs.CV cs.LG

    Zero-Shot Text-to-Image Generation

    Authors: Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever

    Abstract: Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. These assumptions might involve complex architectures, auxiliary losses, or side information such as object part labels or segmentation masks supplied during training. We describe a simple approach for this task based on a transformer that autoregressively models the text and…

    Submitted 26 February, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

  5. Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution

    Authors: Gary S. W. Goh, Sebastian Lapuschkin, Leander Weber, Wojciech Samek, Alexander Binder

    Abstract: Integrated Gradients as an attribution method for deep neural network models offers simple implementability. However, it suffers from noisiness of explanations which affects the ease of interpretability. The SmoothGrad technique is proposed to solve the noisiness issue and smoothen the attribution maps of any gradient-based attribution method. In this paper, we present SmoothTaylor as a novel theo…

    Submitted 2 September, 2021; v1 submitted 22 April, 2020; originally announced April 2020.

    Comments: 8 pages, 3 figures. Accepted in 25th International Conference on Pattern Recognition, (ICPR) 2020. In Proceedings: pp. 4949-4956

  6. arXiv:2002.01535  [pdf, ps, other]

    cs.CL cs.LG

    Lightweight Convolutional Representations for On-Device Natural Language Processing

    Authors: Shrey Desai, Geoffrey Goh, Arun Babu, Ahmed Aly

    Abstract: The increasing computational and memory complexities of deep neural networks have made it difficult to deploy them on low-resource electronic devices (e.g., mobile phones, tablets, wearables). Practitioners have developed numerous model compression methods to address these concerns, but few have condensed input representations themselves. In this work, we propose a fast, accurate, and lightweight…

    Submitted 4 February, 2020; originally announced February 2020.

    Comments: Accepted to MLSys 2020

  7. arXiv:1911.06876  [pdf, other]

    cs.LG cs.AI stat.ML

    Explanatory Masks for Neural Network Interpretability

    Authors: Lawrence Phillips, Garrett Goh, Nathan Hodas

    Abstract: Neural network interpretability is a vital component for applications across a wide variety of domains. In such cases it is often useful to analyze a network which has already been trained for its specific purpose. In this work, we develop a method to produce explanation masks for pre-trained networks. The mask localizes the most important aspects of each input for prediction of the original netwo…

    Submitted 15 November, 2019; originally announced November 2019.

    Comments: Presented at IJCAI-18 Workshop on Explainable Artificial Intelligence (XAI)

  8. arXiv:1910.03741  [pdf, other]

    cs.LG cs.AI

    Multiple-objective Reinforcement Learning for Inverse Design and Identification

    Authors: Haoran Wei, Mariefel Olarte, Garrett B. Goh

    Abstract: The aim of the inverse chemical design is to develop new molecules with given optimized molecular properties or objectives. Recently, generative deep learning (DL) networks are considered as the state-of-the-art in inverse chemical design and have achieved early success in generating molecular structures with desired properties in the pharmaceutical and material chemistry fields. However, satisfyi…

    Submitted 8 October, 2019; originally announced October 2019.

  9. arXiv:1809.05127  [pdf, other]

    cs.LG cs.AI stat.ML

    IL-Net: Using Expert Knowledge to Guide the Design of Furcated Neural Networks

    Authors: Khushmeen Sakloth, Wesley Beckner, Jim Pfaendtner, Garrett B. Goh

    Abstract: Deep neural networks (DNN) excel at extracting patterns. Through representation learning and automated feature engineering on large datasets, such models have been highly successful in computer vision and natural language applications. Designing optimal network architectures from a principled or rational approach however has been less than successful, with the best successful approaches utilizing…

    Submitted 13 September, 2018; originally announced September 2018.

    Comments: Submitted to peer-reviewed ML conference

  10. arXiv:1808.04456  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Multimodal Deep Neural Networks using Both Engineered and Learned Representations for Biodegradability Prediction

    Authors: Garrett B. Goh, Khushmeen Sakloth, Charles Siegel, Abhinav Vishnu, Jim Pfaendtner

    Abstract: Deep learning algorithms excel at extracting patterns from raw data, and with large datasets, they have been very successful in computer vision and natural language applications. However, in other domains, large datasets on which to learn representations from may not exist. In this work, we develop a novel multimodal CNN-MLP neural network architecture that utilizes both domain-specific feature en…

    Submitted 13 September, 2018; v1 submitted 13 August, 2018; originally announced August 2018.

    Comments: Submitted to a peer-reviewed ML conference

  11. arXiv:1712.02734  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Using Rule-Based Labels for Weak Supervised Learning: A ChemNet for Transferable Chemical Property Prediction

    Authors: Garrett B. Goh, Charles Siegel, Abhinav Vishnu, Nathan O. Hodas

    Abstract: With access to large datasets, deep neural networks (DNN) have achieved human-level accuracy in image and speech recognition tasks. However, in chemistry, data is inherently small and fragmented. In this work, we develop an approach of using rule-based knowledge for training ChemNet, a transferable and generalizable deep neural network for chemical property prediction that learns in a weak-supervi…

    Submitted 18 March, 2018; v1 submitted 7 December, 2017; originally announced December 2017.

    Comments: Submitted to SIGKDD 2018

  12. arXiv:1712.02034  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    SMILES2Vec: An Interpretable General-Purpose Deep Neural Network for Predicting Chemical Properties

    Authors: Garrett B. Goh, Nathan O. Hodas, Charles Siegel, Abhinav Vishnu

    Abstract: Chemical databases store information in text representations, and the SMILES format is a universal standard used in many cheminformatics software. Encoded in each SMILES string is structural information that can be used to predict complex chemical properties. In this work, we develop SMILES2vec, a deep RNN that automatically learns features from SMILES to predict chemical properties, without the n…

    Submitted 18 March, 2018; v1 submitted 5 December, 2017; originally announced December 2017.

    Comments: Submitted to SIGKDD 2018

  13. arXiv:1710.02238  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    How Much Chemistry Does a Deep Neural Network Need to Know to Make Accurate Predictions?

    Authors: Garrett B. Goh, Charles Siegel, Abhinav Vishnu, Nathan O. Hodas, Nathan Baker

    Abstract: The meteoric rise of deep learning models in computer vision research, having achieved human-level accuracy in image recognition tasks is firm evidence of the impact of representation learning of deep neural networks. In the chemistry domain, recent advances have also led to the development of similar CNN models, such as Chemception, that is trained to predict chemical properties using images of m…

    Submitted 18 March, 2018; v1 submitted 5 October, 2017; originally announced October 2017.

    Comments: In Proceedings of 2018 IEEE Winter Conference on Applications of Computer Vision (WACV)

  14. arXiv:1706.06689  [pdf]

    stat.ML cs.AI cs.CE cs.CV cs.LG

    Chemception: A Deep Neural Network with Minimal Chemistry Knowledge Matches the Performance of Expert-developed QSAR/QSPR Models

    Authors: Garrett B. Goh, Charles Siegel, Abhinav Vishnu, Nathan O. Hodas, Nathan Baker

    Abstract: In the last few years, we have seen the transformative impact of deep learning in many applications, particularly in speech recognition and computer vision. Inspired by Google's Inception-ResNet deep convolutional neural network (CNN) for image classification, we have developed "Chemception", a deep CNN for the prediction of chemical properties, using just the images of 2D drawings of molecules. W…

    Submitted 20 June, 2017; originally announced June 2017.

    Comments: Submitted to a chemistry peer-reviewed journal

  15. arXiv:1701.04503  [pdf]

    stat.ML cs.AI cs.CE cs.LG physics.chem-ph

    Deep Learning for Computational Chemistry

    Authors: Garrett B. Goh, Nathan O. Hodas, Abhinav Vishnu

    Abstract: The rise and fall of artificial neural networks is well documented in the scientific literature of both computer science and computational chemistry. Yet almost two decades later, we are now seeing a resurgence of interest in deep learning, a machine learning algorithm based on multilayer neural networks. Within the last few years, we have seen the transformative impact of deep learning in many do…

    Submitted 16 January, 2017; originally announced January 2017.

  16. arXiv:1606.07558  [pdf, ps, other]

    cs.LG

    Satisfying Real-world Goals with Dataset Constraints

    Authors: Gabriel Goh, Andrew Cotter, Maya Gupta, Michael Friedlander

    Abstract: The goal of minimizing misclassification error on a training set is often just one of several real-world goals that might be defined on different datasets. For example, one may require a classifier to also make positive predictions at some specified rate for some subpopulation (fairness), or to achieve a specified empirical recall. Other real-world goals include reducing churn with respect to a pr…

    Submitted 3 May, 2017; v1 submitted 23 June, 2016; originally announced June 2016.