Showing 1–29 of 29 results for author: Goh, G

  1. arXiv:2406.04093  [pdf, other]

    cs.LG cs.AI

    Scaling and evaluating sparse autoencoders

    Authors: Leo Gao, Tom Dupré la Tour, Henk Tillman, Gabriel Goh, Rajan Troll, Alec Radford, Ilya Sutskever, Jan Leike, Jeffrey Wu

    Abstract: Sparse autoencoders provide a promising unsupervised approach for extracting interpretable features from a language model by reconstructing activations from a sparse bottleneck layer. Since language models learn many concepts, autoencoders need to be very large to recover all relevant features. However, studying the properties of autoencoder scaling is difficult due to the need to balance reconstr…

    Submitted 6 June, 2024; originally announced June 2024.

  2. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  3. arXiv:2207.11622  [pdf]

    cond-mat.mtrl-sci

    Unconventional Charge-density-wave Order in a Dilute d-band Semiconductor

    Authors: Huandong Chen, Batyr Ilyas, Boyang Zhao, Emre Ergecen, Josh Mutch, Gwan Yeong Jung, Qian Song, Connor A. Occhialini, Guodong Ren, Sara Shabani, Eric Seewald, Shanyuan Niu, Jiangbin Wu, Nan Wang, Mythili Surendran, Shantanu Singh, Jiang Luo, Sanae Ohtomo, Gemma Goh, Bryan C. Chakoumakos, Simon J. Teat, Brent Melot, Han Wang, Di Xiao, Abhay N. Pasupathy, et al. (5 additional authors not shown)

    Abstract: Electron-lattice coupling effects in low dimensional materials give rise to charge density wave (CDW) order and phase transitions. These phenomena are critical ingredients for superconductivity and predominantly occur in metallic model systems such as doped cuprates, transition metal dichalcogenides, and more recently, in Kagome lattice materials. However, CDW in semiconducting systems, specifical…

    Submitted 23 July, 2022; originally announced July 2022.

    Journal ref: Adv. Mater. 2023, 2303283

  4. arXiv:2201.10958  [pdf, other]

    q-bio.PE

    Two Results about the Sackin and Colless Indices for Phylogenetic Trees and Their Shapes

    Authors: Gary Goh, Michael Fuchs, Louxin Zhang

    Abstract: The Sackin and Colless indices are two widely-used metrics for measuring the balance of trees and for testing evolutionary models in phylogenetics. This short paper contributes two results about the Sackin and Colless indices of trees. One result is the asymptotic analysis of the expected Sackin and Colless indices of a tree shape (which are full binary rooted unlabelled trees) under the uniform m…

    Submitted 18 July, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

    Comments: 10 pages, 1 figure

    MSC Class: 05A16; 05C30; 92D15

  5. arXiv:2103.00020  [pdf, other]

    cs.CV cs.LG

    Learning Transferable Visual Models From Natural Language Supervision

    Authors: Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever

    Abstract: State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstr…

    Submitted 26 February, 2021; originally announced March 2021.

  6. arXiv:2102.12092  [pdf, other]

    cs.CV cs.LG

    Zero-Shot Text-to-Image Generation

    Authors: Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever

    Abstract: Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. These assumptions might involve complex architectures, auxiliary losses, or side information such as object part labels or segmentation masks supplied during training. We describe a simple approach for this task based on a transformer that autoregressively models the text and…

    Submitted 26 February, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

  7. arXiv:2006.07663  [pdf, other]

    stat.ME

    Bayesian causal inference with some invalid instrumental variables

    Authors: Gyuhyeong Goh, Jisang Yu

    Abstract: In observational studies, instrumental variables estimation is greatly utilized to identify causal effects. One of the key conditions for the instrumental variables estimator to be consistent is the exclusion restriction, which indicates that instruments affect the outcome of interest only via the exposure variable of interest. We propose a likelihood-free Bayesian approach to make consistent infe…

    Submitted 13 June, 2020; originally announced June 2020.

  8. arXiv:2005.13719  [pdf, other]

    stat.ME

    Synthetic control method with convex hull restrictions: A Bayesian maximum a posteriori approach

    Authors: Gyuhyeong Goh, Jisang Yu

    Abstract: Synthetic control methods have gained popularity among causal studies with observational data, particularly when estimating the impacts of the interventions that are implemented to a small number of large units. Implementing the synthetic control methods faces two major challenges: a) estimating weights for each control unit to create a synthetic control and b) providing statistical inferences. To…

    Submitted 27 May, 2020; originally announced May 2020.

  9. Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution

    Authors: Gary S. W. Goh, Sebastian Lapuschkin, Leander Weber, Wojciech Samek, Alexander Binder

    Abstract: Integrated Gradients as an attribution method for deep neural network models offers simple implementability. However, it suffers from noisiness of explanations which affects the ease of interpretability. The SmoothGrad technique is proposed to solve the noisiness issue and smoothen the attribution maps of any gradient-based attribution method. In this paper, we present SmoothTaylor as a novel theo…

    Submitted 2 September, 2021; v1 submitted 22 April, 2020; originally announced April 2020.

    Comments: 8 pages, 3 figures. Accepted in 25th International Conference on Pattern Recognition, (ICPR) 2020. In Proceedings: pp. 4949-4956

  10. arXiv:2002.01535  [pdf, ps, other]

    cs.CL cs.LG

    Lightweight Convolutional Representations for On-Device Natural Language Processing

    Authors: Shrey Desai, Geoffrey Goh, Arun Babu, Ahmed Aly

    Abstract: The increasing computational and memory complexities of deep neural networks have made it difficult to deploy them on low-resource electronic devices (e.g., mobile phones, tablets, wearables). Practitioners have developed numerous model compression methods to address these concerns, but few have condensed input representations themselves. In this work, we propose a fast, accurate, and lightweight…

    Submitted 4 February, 2020; originally announced February 2020.

    Comments: Accepted to MLSys 2020

  11. arXiv:1911.06876  [pdf, other]

    cs.LG cs.AI stat.ML

    Explanatory Masks for Neural Network Interpretability

    Authors: Lawrence Phillips, Garrett Goh, Nathan Hodas

    Abstract: Neural network interpretability is a vital component for applications across a wide variety of domains. In such cases it is often useful to analyze a network which has already been trained for its specific purpose. In this work, we develop a method to produce explanation masks for pre-trained networks. The mask localizes the most important aspects of each input for prediction of the original netwo…

    Submitted 15 November, 2019; originally announced November 2019.

    Comments: Presented at IJCAI-18 Workshop on Explainable Artificial Intelligence (XAI)

  12. arXiv:1910.03741  [pdf, other]

    cs.LG cs.AI

    Multiple-objective Reinforcement Learning for Inverse Design and Identification

    Authors: Haoran Wei, Mariefel Olarte, Garrett B. Goh

    Abstract: The aim of the inverse chemical design is to develop new molecules with given optimized molecular properties or objectives. Recently, generative deep learning (DL) networks are considered as the state-of-the-art in inverse chemical design and have achieved early success in generating molecular structures with desired properties in the pharmaceutical and material chemistry fields. However, satisfyi…

    Submitted 8 October, 2019; originally announced October 2019.

  13. arXiv:1811.11950  [pdf, other]

    stat.ME

    Accounting for model uncertainty in multiple imputation under complex sampling

    Authors: Gyuhyeong Goh, Jae Kwang Kim

    Abstract: Multiple imputation provides an effective way to handle missing data. When several possible models are under consideration for the data, the multiple imputation is typically performed under a single-best model selected from the candidate models. This single model selection approach ignores the uncertainty associated with the model selection and so leads to underestimation of the variance of multip…

    Submitted 28 November, 2018; originally announced November 2018.

    Comments: 23 pages, 1 Table

  14. arXiv:1809.05127  [pdf, other]

    cs.LG cs.AI stat.ML

    IL-Net: Using Expert Knowledge to Guide the Design of Furcated Neural Networks

    Authors: Khushmeen Sakloth, Wesley Beckner, Jim Pfaendtner, Garrett B. Goh

    Abstract: Deep neural networks (DNN) excel at extracting patterns. Through representation learning and automated feature engineering on large datasets, such models have been highly successful in computer vision and natural language applications. Designing optimal network architectures from a principled or rational approach however has been less than successful, with the best successful approaches utilizing…

    Submitted 13 September, 2018; originally announced September 2018.

    Comments: Submitted to peer-reviewed ML conference

  15. arXiv:1808.04456  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Multimodal Deep Neural Networks using Both Engineered and Learned Representations for Biodegradability Prediction

    Authors: Garrett B. Goh, Khushmeen Sakloth, Charles Siegel, Abhinav Vishnu, Jim Pfaendtner

    Abstract: Deep learning algorithms excel at extracting patterns from raw data, and with large datasets, they have been very successful in computer vision and natural language applications. However, in other domains, large datasets on which to learn representations from may not exist. In this work, we develop a novel multimodal CNN-MLP neural network architecture that utilizes both domain-specific feature en…

    Submitted 13 September, 2018; v1 submitted 13 August, 2018; originally announced August 2018.

    Comments: Submitted to a peer-reviewed ML conference

  16. arXiv:1807.10873  [pdf, other]

    stat.ME

    Bayesian Sparse Propensity Score Estimation for Unit Nonresponse

    Authors: Hejian Sang, Gyuhyeong Goh, Jae Kwang Kim

    Abstract: Nonresponse weighting adjustment using propensity score is a popular method for handling unit nonresponse. However, including all available auxiliary variables into the propensity model can lead to inefficient and inconsistent estimation, especially with high-dimensional covariates. In this paper, a new Bayesian method using the Spike-and-Slab prior is proposed for sparse propensity score estimati…

    Submitted 27 July, 2018; originally announced July 2018.

    Comments: 38 pages, 3 tables

  17. arXiv:1712.02734  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Using Rule-Based Labels for Weak Supervised Learning: A ChemNet for Transferable Chemical Property Prediction

    Authors: Garrett B. Goh, Charles Siegel, Abhinav Vishnu, Nathan O. Hodas

    Abstract: With access to large datasets, deep neural networks (DNN) have achieved human-level accuracy in image and speech recognition tasks. However, in chemistry, data is inherently small and fragmented. In this work, we develop an approach of using rule-based knowledge for training ChemNet, a transferable and generalizable deep neural network for chemical property prediction that learns in a weak-supervi…

    Submitted 18 March, 2018; v1 submitted 7 December, 2017; originally announced December 2017.

    Comments: Submitted to SIGKDD 2018

  18. arXiv:1712.02034  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    SMILES2Vec: An Interpretable General-Purpose Deep Neural Network for Predicting Chemical Properties

    Authors: Garrett B. Goh, Nathan O. Hodas, Charles Siegel, Abhinav Vishnu

    Abstract: Chemical databases store information in text representations, and the SMILES format is a universal standard used in many cheminformatics software. Encoded in each SMILES string is structural information that can be used to predict complex chemical properties. In this work, we develop SMILES2vec, a deep RNN that automatically learns features from SMILES to predict chemical properties, without the n…

    Submitted 18 March, 2018; v1 submitted 5 December, 2017; originally announced December 2017.

    Comments: Submitted to SIGKDD 2018

  19. arXiv:1710.02238  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    How Much Chemistry Does a Deep Neural Network Need to Know to Make Accurate Predictions?

    Authors: Garrett B. Goh, Charles Siegel, Abhinav Vishnu, Nathan O. Hodas, Nathan Baker

    Abstract: The meteoric rise of deep learning models in computer vision research, having achieved human-level accuracy in image recognition tasks is firm evidence of the impact of representation learning of deep neural networks. In the chemistry domain, recent advances have also led to the development of similar CNN models, such as Chemception, that is trained to predict chemical properties using images of m…

    Submitted 18 March, 2018; v1 submitted 5 October, 2017; originally announced October 2017.

    Comments: In Proceedings of 2018 IEEE Winter Conference on Applications of Computer Vision (WACV)

  20. arXiv:1706.06689  [pdf]

    stat.ML cs.AI cs.CE cs.CV cs.LG

    Chemception: A Deep Neural Network with Minimal Chemistry Knowledge Matches the Performance of Expert-developed QSAR/QSPR Models

    Authors: Garrett B. Goh, Charles Siegel, Abhinav Vishnu, Nathan O. Hodas, Nathan Baker

    Abstract: In the last few years, we have seen the transformative impact of deep learning in many applications, particularly in speech recognition and computer vision. Inspired by Google's Inception-ResNet deep convolutional neural network (CNN) for image classification, we have developed "Chemception", a deep CNN for the prediction of chemical properties, using just the images of 2D drawings of molecules. W…

    Submitted 20 June, 2017; originally announced June 2017.

    Comments: Submitted to a chemistry peer-reviewed journal

  21. arXiv:1701.04503  [pdf]

    stat.ML cs.AI cs.CE cs.LG physics.chem-ph

    Deep Learning for Computational Chemistry

    Authors: Garrett B. Goh, Nathan O. Hodas, Abhinav Vishnu

    Abstract: The rise and fall of artificial neural networks is well documented in the scientific literature of both computer science and computational chemistry. Yet almost two decades later, we are now seeing a resurgence of interest in deep learning, a machine learning algorithm based on multilayer neural networks. Within the last few years, we have seen the transformative impact of deep learning in many do…

    Submitted 16 January, 2017; originally announced January 2017.

  22. arXiv:1606.07558  [pdf, ps, other]

    cs.LG

    Satisfying Real-world Goals with Dataset Constraints

    Authors: Gabriel Goh, Andrew Cotter, Maya Gupta, Michael Friedlander

    Abstract: The goal of minimizing misclassification error on a training set is often just one of several real-world goals that might be defined on different datasets. For example, one may require a classifier to also make positive predictions at some specified rate for some subpopulation (fairness), or to achieve a specified empirical recall. Other real-world goals include reducing churn with respect to a pr…

    Submitted 3 May, 2017; v1 submitted 23 June, 2016; originally announced June 2016.

  23. arXiv:1603.05719  [pdf, other]

    math.OC math.NA

    Efficient evaluation of scaled proximal operators

    Authors: Michael P. Friedlander, Gabriel Goh

    Abstract: Quadratic-support functions [Aravkin, Burke, and Pillonetto; J. Mach. Learn. Res. 14(1), 2013] constitute a parametric family of convex functions that includes a range of useful regularization terms found in applications of convex optimization. We show how an interior method can be used to efficiently compute the proximal operator of a quadratic-support function under different metrics. When the m…

    Submitted 19 December, 2016; v1 submitted 17 March, 2016; originally announced March 2016.

    Comments: 23 pages

    Journal ref: Electronic Transactions on Numerical Analysis, 46:1-22, 2017

  24. arXiv:1401.3061  [pdf]

    physics.ed-ph physics.comp-ph

    Easy Java Simulation, an innovative tool for teachers as designers of gravity-physics computer models

    Authors: Loo Kang Wee, Giam Hwee Goh, Ee-Peow Lim

    Abstract: This paper is on customization of computer models using the Easy Java Simulation authoring toolkit for the Singapore syllabus, based on real astronomical data, supported with literature reviewed researched pedagogical features. These 4 new computer models serves to support the enactment of scientific work that are inquiry centric and evidence based that are more likely to promote enjoyment and ins…

    Submitted 29 January, 2014; v1 submitted 13 January, 2014; originally announced January 2014.

    Comments: 8 pages, 8 figures, MPTL18, 18th Multimedia in Physics Teaching and Learning Conference, MPTL18, Madrid, Spain Day 1: Parallel Session 1: Room PS1. Download simulations https://dl.dropboxusercontent.com/u/44365627/lookangEJSworkspace/export/ejs_model_GField_and_Potential_1D_v8wee.jar https://dl.dropboxusercontent.com/u/44365627/lookangEJSworkspace/export/ejs_model_GFieldandPotential1Dv7EarthMoon.jar https://dl.dropboxusercontent.com/u/44365627/lookangEJSworkspace/export/ejs_model_KeplerSystem3rdLaw09.jar https://dl.dropboxusercontent.com/u/44365627/lookangEJSS/export/ejs_model_EarthAndSatelite.jar

  25. arXiv:1304.5586  [pdf, other]

    math.OC

    Tail bounds for stochastic approximation

    Authors: Michael P. Friedlander, Gabriel Goh

    Abstract: Stochastic-approximation gradient methods are attractive for large-scale convex optimization because they offer inexpensive iterations. They are especially popular in data-fitting and machine-learning applications where the data arrives in a continuous stream, or it is necessary to minimize large sums of functions. It is known that by appropriately decreasing the variance of the error at each iter…

    Submitted 8 January, 2014; v1 submitted 20 April, 2013; originally announced April 2013.

  26. arXiv:1303.0079  [pdf]

    physics.ed-ph physics.comp-ph

    Enabling Gravity Physics by Inquiry using Easy Java Simulation

    Authors: Loo Kang Wee, Giam Hwee Goh, Charles Chew

    Abstract: Studying physics of very large scale like the solar system is difficult in real life, using telescope on clear skies over years. We are probably a world first to create four well designed gravity computer models to serve as powerful pedagogical tools for students active inquiry, based on real data. These models are syllabus customized, free and rapidly prototyped with Open Source Physics researche…

    Submitted 28 February, 2013; originally announced March 2013.

    Comments: 6 pages, 12 figures, 5th redesign pedagogy conference

  27. arXiv:1212.3863  [pdf]

    physics.ed-ph physics.comp-ph

    Geostationary Earth Orbit Satellite Model using Easy Java Simulation

    Authors: Loo Kang Wee, Giam Hwee Goh

    Abstract: We develop an Easy Java Simulation (EJS) model for students to visualize geostationary orbits near Earth, modeled using Java 3D implementation of the EJS 3D library. The simplified physics model is described and simulated using simple constant angular velocity equation. Four computer model design ideas such as 1) simple and realistic 3D view and associated learning to real world, 2) comparative vi…

    Submitted 28 December, 2015; v1 submitted 16 December, 2012; originally announced December 2012.

    Comments: 6 pages, 11 figures, 2013 Physics Education Volume 48 Number 1

    Journal ref: Phys. Educ. 48 72 (2013)

  28. arXiv:1210.3410  [pdf]

    physics.ed-ph physics.comp-ph

    Computer Models Design for Teaching and Learning using Easy Java Simulation

    Authors: Loo Kang Lawrence Wee, Ai Phing Lim, Khoon Song Aloysius Goh, Sze Yee Lye, Tat Leong Lee, Weiming Xu, Giam Hwee Jimmy Goh, Chee Wah Ong, Soo Kok Ng, Ee-Peow Lim, Chew Ling Lim, Wee Leng Joshua Yeo, Matthew Ong, Kenneth Y. T. Lim

    Abstract: We are teachers who have benefited from the Open Source Physics (Brown, 2012; Christian, 2010; Esquembre, 2012) community's work and we would like to share some of the computer models and lesson packages that we have designed and implemented in five schools grade 11 to 12 classes. In a ground-up teacher-leadership (MOE, 2010) approach, we came together to learn, advancing the professionalism (MOE,…

    Submitted 24 October, 2013; v1 submitted 11 October, 2012; originally announced October 2012.

    Comments: 10 pages with 12 pages appendix worksheet, 12 figures, The World Conference on Physics Education 1-6 July 2012 Oral Presentation [PS.02.09.a] Parallel Session 02.09|Date & Time: 02.07.2012 / 13:00 - 14:30|Hall: D403 (3rd Floor)

  29. arXiv:1206.6489  [pdf]

    physics.ed-ph physics.class-ph physics.comp-ph

    Using Tracker as a Pedagogical Tool for Understanding Projectile Motion

    Authors: Loo Kang Wee, Charles Chew, Giam Hwee Goh, Samuel Tan, Tat Leong Lee

    Abstract: This paper reports the use of Tracker as a pedagogical tool in the effective learning and teaching of projectile motion in physics. When computer model building learning processes is supported and driven by video analysis data, this free Open Source Physics (OSP) tool can provide opportunities for students to engage in active inquiry-based learning. We discuss the pedagogical use of Tracker to add…

    Submitted 23 December, 2015; v1 submitted 26 June, 2012; originally announced June 2012.

    Comments: 9 pages, 9 figures; http://iopscience.iop.org/0031-9120/47/4/448