
Showing 1–40 of 40 results for author: Dahl, G

  1. arXiv:2306.13032  [pdf, other]

    math.CO

    Combinatorial Fiedler Theory and Graph Partition

    Authors: Enide Andrade, Geir Dahl

    Abstract: Partition problems in graphs are extremely important in applications, as shown in the Data science and Machine learning literature. One approach is spectral partitioning based on a Fiedler vector, i.e., an eigenvector corresponding to the second smallest eigenvalue $a(G)$ of the Laplacian matrix $L_G$ of the graph $G$. This problem corresponds to the minimization of a quadratic form associated wit…

    Submitted 22 June, 2023; originally announced June 2023.

    MSC Class: 05C50; 15A18; 05C05; 05C40

  2. arXiv:2306.07179  [pdf, other]

    cs.LG stat.ML

    Benchmarking Neural Network Training Algorithms

    Authors: George E. Dahl, Frank Schneider, Zachary Nado, Naman Agarwal, Chandramouli Shama Sastry, Philipp Hennig, Sourabh Medapati, Runa Eschenhagen, Priya Kasimbeg, Daniel Suo, Juhan Bae, Justin Gilmer, Abel L. Peirson, Bilal Khan, Rohan Anil, Mike Rabbat, Shankar Krishnan, Daniel Snider, Ehsan Amid, Kongtao Chen, Chris J. Maddison, Rakshith Vasudev, Michal Badura, Ankush Garg, Peter Mattson

    Abstract: Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules, or data selection schemes) could save time, save computational resources, and lead to better, more accurate, models. Unfortunately, as a communi…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 102 pages, 8 figures, 41 tables

  3. arXiv:2207.14484  [pdf, other]

    cs.LG

    Adaptive Gradient Methods at the Edge of Stability

    Authors: Jeremy M. Cohen, Behrooz Ghorbani, Shankar Krishnan, Naman Agarwal, Sourabh Medapati, Michal Badura, Daniel Suo, David Cardoze, Zachary Nado, George E. Dahl, Justin Gilmer

    Abstract: Very little is known about the training dynamics of adaptive gradient methods like Adam in deep learning. In this paper, we shed light on the behavior of these algorithms in the full-batch and sufficiently large batch settings. Specifically, we empirically demonstrate that during full-batch training, the maximum eigenvalue of the preconditioned Hessian typically equilibrates at a certain numerical…

    Submitted 15 April, 2024; v1 submitted 29 July, 2022; originally announced July 2022.

    Comments: v2 corrects the formula for Adam's preconditioner in Eq 2

  4. arXiv:2207.03084  [pdf, other]

    cs.LG cs.AI stat.ML

    Pre-training helps Bayesian optimization too

    Authors: Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zelda Mariet, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani

    Abstract: Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions to deploy BO successfully. Such domain knowledge often manifests in Gaussian process priors that specify initial beliefs o…

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: ICML2022 Workshop on Adaptive Experimental Design and Active Learning in the Real World. arXiv admin note: substantial text overlap with arXiv:2109.08215

  5. arXiv:2203.10139  [pdf]

    cs.LG cs.AI cs.CV eess.IV

    AI system for fetal ultrasound in low-resource settings

    Authors: Ryan G. Gomes, Bellington Vwalika, Chace Lee, Angelica Willis, Marcin Sieniek, Joan T. Price, Christina Chen, Margaret P. Kasaro, James A. Taylor, Elizabeth M. Stringer, Scott Mayer McKinney, Ntazana Sindano, George E. Dahl, William Goodnight III, Justin Gilmer, Benjamin H. Chi, Charles Lau, Terry Spitz, T Saensuksopa, Kris Liu, Jonny Wong, Rory Pilgrim, Akib Uddin, Greg Corrado, Lily Peng, et al. (4 additional authors not shown)

    Abstract: Despite considerable progress in maternal healthcare, maternal and perinatal deaths remain high in low-to-middle income countries. Fetal ultrasound is an important component of antenatal care, but shortage of adequately trained healthcare workers has limited its adoption. We developed and validated an artificial intelligence (AI) system that uses novice-acquired "blind sweep" ultrasound videos to…

    Submitted 18 March, 2022; originally announced March 2022.

  6. arXiv:2112.08250  [pdf, other]

    cs.LG

    Predicting the utility of search spaces for black-box optimization: a simple, budget-aware approach

    Authors: Setareh Ariafar, Justin Gilmer, Zachary Nado, Jasper Snoek, Rodolphe Jenatton, George E. Dahl

    Abstract: Black box optimization requires specifying a search space to explore for solutions, e.g. a d-dimensional compact space, and this choice is critical for getting the best results at a reasonable budget. Unfortunately, determining a high quality search space can be challenging in many applications. For example, when tuning hyperparameters for machine learning pipelines on a new problem given a limite…

    Submitted 16 December, 2021; v1 submitted 15 December, 2021; originally announced December 2021.

  7. arXiv:2110.04369  [pdf, other]

    cs.LG cs.AI

    A Loss Curvature Perspective on Training Instability in Deep Learning

    Authors: Justin Gilmer, Behrooz Ghorbani, Ankush Garg, Sneha Kudugunta, Behnam Neyshabur, David Cardoze, George Dahl, Zachary Nado, Orhan Firat

    Abstract: In this work, we study the evolution of the loss Hessian across many classification tasks in order to understand the effect the curvature of the loss has on the training dynamics. Whereas prior work has focused on how different learning rates affect the loss Hessian observed during training, we also analyze the effects of model initialization, architectural choices, and common training heuristics…

    Submitted 8 October, 2021; originally announced October 2021.

    Comments: 20 pages, 16 figures

  8. arXiv:2109.08215  [pdf, other]

    cs.LG stat.ML

    Pre-trained Gaussian processes for Bayesian optimization

    Authors: Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zelda Mariet, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani

    Abstract: Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions to deploy BO successfully. Such domain knowledge often manifests in Gaussian process priors that specify initial beliefs o…

    Submitted 6 July, 2022; v1 submitted 16 September, 2021; originally announced September 2021.

  9. arXiv:2104.02145  [pdf, other]

    cs.CL

    What Will it Take to Fix Benchmarking in Natural Language Understanding?

    Authors: Samuel R. Bowman, George E. Dahl

    Abstract: Evaluation for many natural language understanding (NLU) tasks is broken: Unreliable and biased systems score so highly on standard benchmarks that there is little room for researchers who develop better systems to demonstrate their improvements. The recent trend to abandon IID benchmarks in favor of adversarially-constructed, out-of-distribution test sets ensures that current models will perform…

    Submitted 15 October, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

    Comments: Proceedings of NAACL 2021. This revision adds a missing acknowledgment

  10. arXiv:2102.06356  [pdf, other]

    cs.LG stat.ML

    A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes

    Authors: Zachary Nado, Justin M. Gilmer, Christopher J. Shallue, Rohan Anil, George E. Dahl

    Abstract: Recently the LARS and LAMB optimizers have been proposed for training neural networks faster using large batch sizes. LARS and LAMB add layer-wise normalization to the update rules of Heavy-ball momentum and Adam, respectively, and have become popular in prominent benchmarks and deep learning libraries. However, without fair comparisons to standard optimizers, it remains an open question whether L…

    Submitted 9 June, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

  11. arXiv:2101.04150  [pdf, ps, other]

    math.CO

    Sign-restricted matrices of $0$'s, $1$'s, and $-1$'s

    Authors: Richard A. Brualdi, Geir Dahl

    Abstract: We study {\em sign-restricted matrices} (SRMs), a class of rectangular $(0, \pm 1)$-matrices generalizing the alternating sign matrices (ASMs). In an SRM each partial column sum, starting from row 1, equals 0 or 1, and each partial row sum, starting from column 1, is nonnegative. We determine the maximum number of nonzeros in SRMs and characterize the possible row and column sum vectors. Moreover,…

    Submitted 11 January, 2021; originally announced January 2021.

    MSC Class: 05A18; 05B20; 06A07; 15B35; 15B36

  12. arXiv:2101.04148  [pdf, ps, other]

    math.CO

    Convex $(0,1)$-Matrices and Their Epitopes

    Authors: Richard A. Brualdi, Geir Dahl

    Abstract: We investigate $(0,1)$-matrices that are {\em convex}, which means that the ones are consecutive in every row and column. These matrices occur in discrete tomography. The notion of ranked essential sets, known for permutation matrices, is extended to convex sets. We show a number of results for the class $\mc{C}(R,S)$ of convex matrices with given row and column sum vectors $R$ and $S$. Also, it i…

    Submitted 11 January, 2021; originally announced January 2021.

    MSC Class: 05A05; 05B20; 15B36; 52A37

  13. arXiv:2101.04143  [pdf, other]

    math.CO math.OC

    Diagonal Sums of Doubly Stochastic Matrices

    Authors: Richard A. Brualdi, Geir Dahl

    Abstract: Let $Ω_n$ denote the class of $n \times n$ doubly stochastic matrices (each such matrix is entrywise nonnegative and every row and column sum is 1). We study the diagonals of matrices in $Ω_n$. The main question is: which $A \in Ω_n$ are such that the diagonals in $A$ that avoid the zeros of $A$ all have the same sum of their entries. We give a characterization of such matrices, and establish seve…

    Submitted 11 January, 2021; originally announced January 2021.

  14. arXiv:2003.08286  [pdf, ps, other]

    math.CO math.PR

    On Kemeny's constant for trees with fixed order and diameter

    Authors: Lorenzo Ciardo, Geir Dahl, Steve Kirkland

    Abstract: Kemeny's constant $κ(G)$ of a connected graph $G$ is a measure of the expected transit time for the random walk associated with $G$. In the current work, we consider the case when $G$ is a tree, and, in this setting, we provide lower and upper bounds for $κ(G)$ in terms of the order $n$ and diameter $δ$ of $G$ by using two different techniques. The lower bound is given as Kemeny's constant of a pa…

    Submitted 18 March, 2020; originally announced March 2020.

    Comments: 20 pages, 5 figures

    MSC Class: 05C81; 05C50; 60J10; 05C12; 94C15

  15. arXiv:1912.01359  [pdf, other]

    eess.IV cs.CV q-bio.NC

    A deep learning based tool for automatic brain extraction from functional magnetic resonance images in rodents

    Authors: Sidney Pontes-Filho, Annelene Gulden Dahl, Stefano Nichele, Gustavo Borges Moreno e Mello

    Abstract: Removing skull artifacts from functional magnetic images (fMRI) is a well understood and frequently encountered problem. Because the fMRI field has grown mostly due to human studies, many new tools were developed to handle human data. Nonetheless, these tools are not equally useful to handle the data derived from animal studies, especially from rodents. This represents a major problem to the field…

    Submitted 5 December, 2019; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: 5 pages, 2 figures

  16. arXiv:1910.05446  [pdf, other]

    cs.LG stat.ML

    On Empirical Comparisons of Optimizers for Deep Learning

    Authors: Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl

    Abstract: Selecting an optimizer is a central step in the contemporary deep learning pipeline. In this paper, we demonstrate the sensitivity of optimizer comparisons to the hyperparameter tuning protocol. Our findings suggest that the hyperparameter search space may be the single most important factor explaining the rankings obtained by recent empirical comparisons in the literature. In fact, we show that t…

    Submitted 15 June, 2020; v1 submitted 11 October, 2019; originally announced October 2019.

  17. arXiv:1908.03739  [pdf, ps, other]

    math.CO

    Permutation Matrices, Their Discrete Derivatives and Extremal Properties

    Authors: Richard A. Brualdi, Geir Dahl

    Abstract: For a permutation $π$, and the corresponding permutation matrix, we introduce the notion of {\em discrete derivative}, obtained by taking differences of successive entries in $π$. We characterize the possible derivatives of permutations, and consider questions for permutations with certain properties satisfied by the derivative. For instance, we consider permutations with distinct derivatives, and…

    Submitted 10 August, 2019; originally announced August 2019.

    MSC Class: 05B20; 15B48

  18. arXiv:1907.05550  [pdf, other]

    cs.LG

    Faster Neural Network Training with Data Echoing

    Authors: Dami Choi, Alexandre Passos, Christopher J. Shallue, George E. Dahl

    Abstract: In the twilight of Moore's law, GPUs and other specialized hardware accelerators have dramatically sped up neural network training. However, earlier stages of the training pipeline, such as disk I/O and data preprocessing, do not run on accelerators. As accelerators continue to improve, these earlier stages will increasingly become the bottleneck. In this paper, we introduce "data echoing," which…

    Submitted 7 May, 2020; v1 submitted 11 July, 2019; originally announced July 2019.

  19. arXiv:1907.04164  [pdf, other]

    cs.LG stat.ML

    Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model

    Authors: Guodong Zhang, Lala Li, Zachary Nado, James Martens, Sushant Sachdeva, George E. Dahl, Christopher J. Shallue, Roger Grosse

    Abstract: Increasing the batch size is a popular way to speed up neural network training, but beyond some critical batch size, larger batch sizes yield diminishing returns. In this work, we study how the critical batch size changes based on properties of the optimization algorithm, including acceleration and preconditioning, through two different lenses: large scale experiments, and analysis of a simple noi…

    Submitted 28 October, 2019; v1 submitted 9 July, 2019; originally announced July 2019.

    Comments: NeurIPS 2019

  20. arXiv:1811.03600  [pdf, other]

    cs.LG stat.ML

    Measuring the Effects of Data Parallelism on Neural Network Training

    Authors: Christopher J. Shallue, Jaehoon Lee, Joseph Antognini, Jascha Sohl-Dickstein, Roy Frostig, George E. Dahl

    Abstract: Recent hardware developments have dramatically increased the scale of data parallelism available for neural network training. Among the simplest ways to harness next-generation hardware is to increase the batch size in standard mini-batch neural network training algorithms. In this work, we aim to experimentally characterize the effects of increasing the batch size on training time, as measured by…

    Submitted 18 July, 2019; v1 submitted 8 November, 2018; originally announced November 2018.

    Journal ref: Journal of Machine Learning Research 20 (2019) 1-49

  21. arXiv:1808.07910  [pdf, ps, other]

    cs.LG cs.CL stat.ML

    The Importance of Generation Order in Language Modeling

    Authors: Nicolas Ford, Daniel Duckworth, Mohammad Norouzi, George E. Dahl

    Abstract: Neural language models are a critical component of state-of-the-art systems for machine translation, summarization, audio transcription, and other tasks. These language models are almost universally autoregressive in nature, generating sentences one token at a time from left to right. This paper studies the influence of token generation order on model quality via a novel two-pass language model th…

    Submitted 23 August, 2018; originally announced August 2018.

  22. arXiv:1808.06576  [pdf, other]

    q-bio.QM stat.ML

    Peptide-Spectra Matching from Weak Supervision

    Authors: Samuel S. Schoenholz, Sean Hackett, Laura Deming, Eugene Melamud, Navdeep Jaitly, Fiona McAllister, Jonathon O'Brien, George Dahl, Bryson Bennett, Andrew M. Dai, Daphne Koller

    Abstract: As in many other scientific domains, we face a fundamental problem when using machine learning to identify proteins from mass spectrometry data: large ground truth datasets mapping inputs to correct outputs are extremely difficult to obtain. Instead, we have access to imperfect hand-coded models crafted by domain experts. In this paper, we apply deep neural networks to an important step of the pro…

    Submitted 22 August, 2018; v1 submitted 20 August, 2018; originally announced August 2018.

  23. arXiv:1807.06732  [pdf, other]

    cs.LG stat.ML

    Motivating the Rules of the Game for Adversarial Example Research

    Authors: Justin Gilmer, Ryan P. Adams, Ian Goodfellow, David Andersen, George E. Dahl

    Abstract: Advances in machine learning have led to broad deployment of systems with impressive performance on important problems. Nonetheless, these systems can be induced to make errors on data that are surprisingly similar to examples the learned system handles correctly. The existence of these errors raises a variety of questions about out-of-sample generalization and whether bad actors might use such ex…

    Submitted 19 July, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

  24. arXiv:1806.04313  [pdf, other]

    cs.CL cs.LG

    Embedding Text in Hyperbolic Spaces

    Authors: Bhuwan Dhingra, Christopher J. Shallue, Mohammad Norouzi, Andrew M. Dai, George E. Dahl

    Abstract: Natural language text exhibits hierarchical structure in a variety of respects. Ideally, we could incorporate our prior knowledge of this hierarchical structure into unsupervised learning algorithms that work on text data. Recent work by Nickel & Kiela (2017) proposed using hyperbolic instead of Euclidean embedding spaces to represent hierarchical data and demonstrated encouraging results when emb…

    Submitted 11 June, 2018; originally announced June 2018.

    Comments: TextGraphs 2018

  25. arXiv:1806.01261  [pdf, other]

    cs.LG cs.AI stat.ML

    Relational inductive biases, deep learning, and graph networks

    Authors: Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Caglar Gulcehre, Francis Song, Andrew Ballard, Justin Gilmer, George Dahl, Ashish Vaswani, Kelsey Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matt Botvinick, Oriol Vinyals, et al. (2 additional authors not shown)

    Abstract: Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, rema…

    Submitted 17 October, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

  26. arXiv:1805.11803  [pdf, ps, other]

    math.SP math.CO

    New Bounds for the Signless Laplacian Spread

    Authors: Enide Andrade, Geir Dahl, Laura Leal, María Robbiano

    Abstract: Let $G$ be a simple graph. The signless Laplacian spread of $G$ is defined as the maximum distance of pairs of its signless Laplacian eigenvalues. This paper establishes some new bounds, both lower and upper, for the signless Laplacian spread. Several of these bounds depend on invariant parameters of the graph. We also use a minmax principle to find several lower bounds for this spectral invariant…

    Submitted 30 May, 2018; originally announced May 2018.

  27. arXiv:1805.10255  [pdf, other]

    cs.CV cs.AI cs.LG cs.NE

    Parallel Architecture and Hyperparameter Search via Successive Halving and Classification

    Authors: Manoj Kumar, George E. Dahl, Vijay Vasudevan, Mohammad Norouzi

    Abstract: We present a simple and powerful algorithm for parallel black box optimization called Successive Halving and Classification (SHAC). The algorithm operates in $K$ stages of parallel function evaluations and trains a cascade of binary classifiers to iteratively cull the undesirable regions of the search space. SHAC is easy to implement, requires no tuning of its own configuration parameters, is inva…

    Submitted 25 May, 2018; originally announced May 2018.

  28. arXiv:1804.03235  [pdf, other]

    cs.LG cs.AI stat.ML

    Large scale distributed neural network training through online distillation

    Authors: Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E. Dahl, Geoffrey E. Hinton

    Abstract: Techniques such as ensembling and distillation promise model quality improvements when paired with almost any base model. However, due to increased test-time cost (for ensembles) and increased complexity of the training pipeline (for distillation), these techniques are challenging to use in industrial settings. In this paper we explore a variant of distillation which is relatively straightforward…

    Submitted 20 August, 2020; v1 submitted 9 April, 2018; originally announced April 2018.

    Comments: Clarify that implementations should use available parallelism in pseudo-code

  29. arXiv:1704.07752  [pdf, ps, other]

    math.CO

    Alternating Sign Matrices and Hypermatrices, and a Generalization of Latin Squares

    Authors: Richard A. Brualdi, Geir Dahl

    Abstract: An alternating sign matrix, or ASM, is a $(0, \pm 1)$-matrix where the nonzero entries in each row and column alternate in sign. We generalize this notion to hypermatrices: an $n\times n\times n$ hypermatrix $A=[a_{ijk}]$ is an {\em alternating sign hypermatrix}, or ASHM, if each of its planes, obtained by fixing one of the three indices, is an ASM. Several results concerning ASHMs are shown, such…

    Submitted 25 April, 2017; originally announced April 2017.

    Comments: 39 pages

    MSC Class: 05B15; 15B35

  30. arXiv:1704.01212  [pdf, other]

    cs.LG

    Neural Message Passing for Quantum Chemistry

    Authors: Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, George E. Dahl

    Abstract: Supervised learning on molecules has incredible potential to be useful in chemistry, drug discovery, and materials science. Luckily, several promising and closely related neural network models invariant to molecular symmetries have already been described in the literature. These models learn a message passing algorithm and aggregation procedure to compute a function of their entire input graph. At…

    Submitted 12 June, 2017; v1 submitted 4 April, 2017; originally announced April 2017.

    Comments: 14 pages

    ACM Class: I.2.6

  31. arXiv:1703.02442  [pdf, other]

    cs.CV

    Detecting Cancer Metastases on Gigapixel Pathology Images

    Authors: Yun Liu, Krishna Gadepalli, Mohammad Norouzi, George E. Dahl, Timo Kohlberger, Aleksey Boyko, Subhashini Venugopalan, Aleksei Timofeev, Philip Q. Nelson, Greg S. Corrado, Jason D. Hipp, Lily Peng, Martin C. Stumpe

    Abstract: Each year, the treatment decisions for more than 230,000 breast cancer patients in the U.S. hinge on whether the cancer has metastasized away from the breast. Metastasis detection is currently performed by pathologists reviewing large expanses of biological tissues. This process is labor intensive and error-prone. We present a framework to automatically detect and localize tumors as small as 100 x…

    Submitted 7 March, 2017; v1 submitted 3 March, 2017; originally announced March 2017.

    Comments: Fig 1: normal and tumor patches were accidentally reversed - now fixed. Minor grammatical corrections in appendix, section "Image Color Normalization"

    Journal ref: MICCAI Tutorial (2017)

  32. Machine learning prediction errors better than DFT accuracy

    Authors: Felix A. Faber, Luke Hutchison, Bing Huang, Justin Gilmer, Samuel S. Schoenholz, George E. Dahl, Oriol Vinyals, Steven Kearnes, Patrick F. Riley, O. Anatole von Lilienfeld

    Abstract: We investigate the impact of choosing regressors and molecular representations for the construction of fast machine learning (ML) models of thirteen electronic ground-state properties of organic molecules. The performance of each regressor/representation/property combination is assessed using learning curves which report out-of-sample errors as a function of training set size with up to $\sim$117k…

    Submitted 4 June, 2017; v1 submitted 17 February, 2017; originally announced February 2017.

  33. arXiv:1609.02218  [pdf, ps, other]

    physics.atom-ph physics.optics

    Measurement of the Yb I $^1S_0 - ^1P_1$ transition frequency at 399 nm using an optical frequency comb

    Authors: Michaela Kleinert, M. E. Gold Dahl, Scott D. Bergeson

    Abstract: We determine the frequency of the Yb I $^1S_0 - ^1P_1$ transition at 399 nm using an optical frequency comb. Although this transition was measured previously using an optical transfer cavity [D. Das et al., Phys. Rev. A 72, 032506 (2005)], recent work has uncovered significant errors in that method. We compare our result of 751 526 533.49 $\pm$ 0.33 MHz for the Yb-174 isotope with those from the l…

    Submitted 2 December, 2016; v1 submitted 7 September, 2016; originally announced September 2016.

    Comments: http://journals.aps.org/pra/abstract/10.1103/PhysRevA.94.052511

    Journal ref: Phys. Rev. A 94, 052511 (2016)

  34. arXiv:1408.2039  [pdf]

    cs.LG stat.ML

    Incorporating Side Information in Probabilistic Matrix Factorization with Gaussian Processes

    Authors: Ryan Prescott Adams, George E. Dahl, Iain Murray

    Abstract: Probabilistic matrix factorization (PMF) is a powerful method for modeling data associated with pairwise relationships, finding use in collaborative filtering, computational biology, and document analysis, among other areas. In many domains, there are additional covariates that can assist in prediction. For example, when modeling movie ratings, we might know when the rating occurred, where the…

    Submitted 9 August, 2014; originally announced August 2014.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-1-9

  35. arXiv:1406.1231  [pdf, other]

    stat.ML cs.LG cs.NE

    Multi-task Neural Networks for QSAR Predictions

    Authors: George E. Dahl, Navdeep Jaitly, Ruslan Salakhutdinov

    Abstract: Although artificial neural networks have occasionally been used for Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) studies in the past, the literature has of late been dominated by other machine learning techniques such as random forests. However, a variety of new neural net techniques along with successful applications in other domains have renewed interest in network approache…

    Submitted 4 June, 2014; originally announced June 2014.

  36. arXiv:1309.1501  [pdf, ps, other]

    cs.LG cs.CL cs.NE math.OC stat.ML

    Improvements to deep convolutional neural networks for LVCSR

    Authors: Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George E. Dahl, George Saon, Hagen Soltau, Tomas Beran, Aleksandr Y. Aravkin, Bhuvana Ramabhadran

    Abstract: Deep Convolutional Neural Networks (CNNs) are more powerful than Deep Neural Networks (DNN), as they are able to better reduce spectral variation in the input signal. This has also been confirmed experimentally, with CNNs showing improvements in word error rate (WER) between 4-12% relative compared to DNNs across a variety of LVCSR tasks. In this paper, we describe different methods to further imp…

    Submitted 10 December, 2013; v1 submitted 5 September, 2013; originally announced September 2013.

    Comments: 6 pages, 1 figure

    MSC Class: 65K05; 90C15; 90C90

  37. arXiv:1306.0685  [pdf, ps, other]

    math.NA

    Subdivision schemes, network flows and linear optimization

    Authors: Maria Charina, Geir Dahl

    Abstract: We link regularity and smoothness analysis of multivariate vector subdivision schemes with network flow theory and with special linear optimization problems. This connection allows us to prove the existence of what we call optimal difference masks that possess crucial properties unifying the regularity analysis of univariate and multivariate subdivision schemes. We also provide efficient optimizati…

    Submitted 10 February, 2015; v1 submitted 4 June, 2013; originally announced June 2013.

  38. arXiv:1202.5695  [pdf, other]

    cs.LG stat.ML

    Training Restricted Boltzmann Machines on Word Observations

    Authors: George E. Dahl, Ryan P. Adams, Hugo Larochelle

    Abstract: The restricted Boltzmann machine (RBM) is a flexible tool for modeling complex data, however there have been significant computational difficulties in using RBMs to model high-dimensional multinomial observations. In natural language processing applications, words are naturally modeled by K-ary discrete distributions, where K is determined by the vocabulary size and can easily be in the hundreds o…

    Submitted 5 July, 2012; v1 submitted 25 February, 2012; originally announced February 2012.

  39. arXiv:1110.4678  [pdf, ps, other]

    math.OC quant-ph

    Quantum Strategies

    Authors: Gordon B. Dahl, Steven E. Landsburg

    Abstract: We investigate the consequences of allowing players to adopt strategies which take advantage of quantum randomization devices. In games of full information, the resulting equilibria are always correlated equilibria, but not all correlated equilibria appear as quantum equilibria. The classical and quantum theories diverge further in games of private information. In the quantum context, we show that…

    Submitted 20 October, 2011; originally announced October 2011.

    MSC Class: 91A05; 81P45 (Primary)

  40. arXiv:1003.4944  [pdf, other]

    stat.ML cs.LG

    Incorporating Side Information in Probabilistic Matrix Factorization with Gaussian Processes

    Authors: Ryan Prescott Adams, George E. Dahl, Iain Murray

    Abstract: Probabilistic matrix factorization (PMF) is a powerful method for modeling data associated with pairwise relationships, finding use in collaborative filtering, computational biology, and document analysis, among other areas. In many domains, there is additional information that can assist in prediction. For example, when modeling movie ratings, we might know when the rating occurred, where the u…

    Submitted 25 March, 2010; originally announced March 2010.

    Comments: 18 pages, 4 figures, Submitted to UAI 2010