Showing 1–6 of 6 results for author: Dawid, A

Searching in archive cs.
  1. arXiv:2402.03579  [pdf, other]

    cs.LG math.OC

    Deconstructing the Goldilocks Zone of Neural Network Initialization

    Authors: Artem Vysogorets, Anna Dawid, Julia Kempe

    Abstract: The second-order properties of the training loss have a massive impact on the optimization dynamics of deep learning models. Fort & Scherlis (2019) discovered that a large excess of positive curvature and local convexity of the loss Hessian is associated with highly trainable initial points located in a region coined the "Goldilocks zone". Only a handful of subsequent studies touched upon this rel…

    Submitted 4 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  2. arXiv:2306.07104  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Unveiling the Hessian's Connection to the Decision Boundary

    Authors: Mahalakshmi Sabanayagam, Freya Behrens, Urte Adomaityte, Anna Dawid

    Abstract: Understanding the properties of well-generalizing minima is at the heart of deep learning research. On the one hand, the generalization of neural networks has been connected to the decision boundary complexity, which is hard to study in the high-dimensional input space. Conversely, the flatness of a minimum has become a controversial proxy for generalization. In this work, we provide the missing l…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 14 pages, 6 figures + 18-page appendices with 19 figures. Any feedback is very welcome! Code is available at https://github.com/Shmoo137/Hessian-and-Decision-Boundary

  3. arXiv:2306.02572  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Introduction to Latent Variable Energy-Based Models: A Path Towards Autonomous Machine Intelligence

    Authors: Anna Dawid, Yann LeCun

    Abstract: Current automated systems have crucial limitations that need to be addressed before artificial intelligence can reach human-like levels and bring new technological revolutions. Among others, our societies still lack Level 5 self-driving cars, domestic robots, and virtual assistants that learn reliable world models, reason, and plan complex action sequences. In these notes, we summarize the main id…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: 23 pages + 1-page appendix, 11 figures. These notes follow the content of three lectures given by Yann LeCun during the Les Houches Summer School on Statistical Physics and Machine Learning in 2022. Feedback and comments are most welcome!

  4. arXiv:2005.07605  [pdf, ps, other]

    stat.ML cs.LG

    On Learnability under General Stochastic Processes

    Authors: A. Philip Dawid, Ambuj Tewari

    Abstract: Statistical learning theory under independent and identically distributed (iid) sampling and online learning theory for worst case individual sequences are two of the best developed branches of learning theory. Statistical learning under general non-iid stochastic processes is less mature. We provide two natural notions of learnability of a function class under a general stochastic process. We sho…

    Submitted 11 March, 2022; v1 submitted 15 May, 2020; originally announced May 2020.

    Comments: The regression results in the previous version have been made stronger

  5. arXiv:1411.2636  [pdf, ps, other]

    math.ST cs.AI stat.ME

    Bounding the Probability of Causation in Mediation Analysis

    Authors: A. P. Dawid, R. Murtas, M. Musio

    Abstract: Given empirical evidence for the dependence of an outcome variable on an exposure variable, we can typically only provide bounds for the "probability of causation" in the case of an individual who has developed the outcome after being exposed. We show how these bounds can be adapted or improved if further information becomes available. In addition to reviewing existing work on this topic, we provi…

    Submitted 10 November, 2014; originally announced November 2014.

    Comments: 9 pages, 1 figure, 3 tables

    MSC Class: 62A99

    Journal ref: In Topics on Methodological and Applied Statistical Inference, edited by T. Di Battista, E. Moreno and W. Racugno. Springer (2016), 75-84

  6. arXiv:1010.3425  [pdf, ps, other]

    math.ST cs.AI

    Identifying the consequences of dynamic treatment strategies: A decision-theoretic overview

    Authors: A. Philip Dawid, Vanessa Didelez

    Abstract: We consider the problem of learning about and comparing the consequences of dynamic treatment strategies on the basis of observational data. We formulate this within a probabilistic decision-theoretic framework. Our approach is compared with related work by Robins and others: in particular, we show how Robins's 'G-computation' algorithm arises naturally from this decision-theoretic perspective. Ca…

    Submitted 17 October, 2010; originally announced October 2010.

    Comments: 49 pages, 15 figures

    MSC Class: 62C05; 62A01

    Journal ref: Statistics Surveys 2010, Vol. 4, 184-231