Showing 1–8 of 8 results for author: Mazzawi, H

  1. arXiv:2402.05033  [pdf, other]

    cs.LG

    Simulated Overparameterization

    Authors: Hanna Mazzawi, Pranjal Awasthi, Xavi Gonzalvo, Srikumar Ramalingam

    Abstract: In this work, we introduce a novel paradigm called Simulated Overparameterization (SOP). SOP merges the computational efficiency of compact models with the advanced learning proficiencies of overparameterized models. SOP proposes a unique approach to model training and inference, where a model with a significantly larger number of parameters is trained in such a way that a smaller, efficient subset…

    Submitted 7 February, 2024; originally announced February 2024.

  2. arXiv:2306.11903  [pdf, other]

    cs.LG

    Deep Fusion: Efficient Network Training via Pre-trained Initializations

    Authors: Hanna Mazzawi, Xavi Gonzalvo, Michael Wunder, Sammy Jerome, Benoit Dherin

    Abstract: In recent years, deep learning has made remarkable progress in a wide range of domains, with a particularly notable impact on natural language processing tasks. One of the challenges associated with training deep neural networks in the context of LLMs is the need for large amounts of computational resources and time. To mitigate this, network growing algorithms offer potential cost savings, but th…

    Submitted 26 June, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

  3. arXiv:1906.01550  [pdf, other]

    stat.ML cs.LG

    Towards Task and Architecture-Independent Generalization Gap Predictors

    Authors: Scott Yak, Javier Gonzalvo, Hanna Mazzawi

    Abstract: Can we use deep learning to predict when deep learning works? Our results suggest the affirmative. We created a dataset by training 13,500 neural networks with different architectures, on different variations of spiral datasets, and using different optimization parameters. We used this dataset to train task-independent and architecture-independent generalization gap predictors for those neural net…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: 8 pages, 6 figures, 2 tables. To be presented at ICML 2019 "Understanding and Improving Generalization in Deep Learning" Workshop (poster)

  4. arXiv:1905.00080  [pdf, other]

    cs.LG stat.ML

    AdaNet: A Scalable and Flexible Framework for Automatically Learning Ensembles

    Authors: Charles Weill, Javier Gonzalvo, Vitaly Kuznetsov, Scott Yang, Scott Yak, Hanna Mazzawi, Eugen Hotaj, Ghassen Jerfel, Vladimir Macko, Ben Adlam, Mehryar Mohri, Corinna Cortes

    Abstract: AdaNet is a lightweight TensorFlow-based (Abadi et al., 2015) framework for automatically learning high-quality ensembles with minimal expert intervention. Our framework is inspired by the AdaNet algorithm (Cortes et al., 2017) which learns the structure of a neural network as an ensemble of subnetworks. We designed it to: (1) integrate with the existing TensorFlow ecosystem, (2) offer sensible de…

    Submitted 30 April, 2019; originally announced May 2019.

  5. arXiv:1903.06236  [pdf, other]

    cs.LG stat.ML

    Improving Neural Architecture Search Image Classifiers via Ensemble Learning

    Authors: Vladimir Macko, Charles Weill, Hanna Mazzawi, Javier Gonzalvo

    Abstract: Finding the best neural network architecture requires significant time, resources, and human expertise. These challenges are partially addressed by neural architecture search (NAS) which is able to find the best convolutional layer or cell that is then used as a building block for the network. However, once a good building block is found, manual design is still required to assemble the final archi…

    Submitted 14 March, 2019; originally announced March 2019.

  6. arXiv:1502.04137  [pdf, ps, other]

    cs.LG

    Non-Adaptive Learning a Hidden Hypergraph

    Authors: Hasan Abasi, Nader H. Bshouty, Hanna Mazzawi

    Abstract: We give a new deterministic algorithm that non-adaptively learns a hidden hypergraph from edge-detecting queries. All previous non-adaptive algorithms either run in exponential time or have non-optimal query complexity. We give the first polynomial-time non-adaptive algorithm for learning hypergraphs that asks an almost optimal number of queries.

    Submitted 13 February, 2015; originally announced February 2015.

  7. arXiv:1405.0792  [pdf, ps, other]

    cs.LG

    On Exact Learning Monotone DNF from Membership Queries

    Authors: Hasan Abasi, Nader H. Bshouty, Hanna Mazzawi

    Abstract: In this paper, we study the problem of learning a monotone DNF with at most $s$ terms of size (number of variables in each term) at most $r$ ($s$-term $r$-MDNF) from membership queries. This problem is equivalent to the problem of learning a general hypergraph using hyperedge-detecting queries, a problem motivated by applications arising in chemical reactions and genome sequencing. We first pres…

    Submitted 5 May, 2014; originally announced May 2014.

  8. arXiv:1001.0405  [pdf, ps, other]

    cs.LG

    Optimal Query Complexity for Reconstructing Hypergraphs

    Authors: Nader H. Bshouty, Hanna Mazzawi

    Abstract: In this paper we consider the problem of reconstructing a hidden weighted hypergraph of constant rank using additive queries. We prove the following: Let $G$ be a weighted hidden hypergraph of constant rank with $n$ vertices and $m$ hyperedges. For any $m$ there exists a non-adaptive algorithm that finds the edges of the graph and their weights using $O\left(\frac{m\log n}{\log m}\right)$ additive querie…

    Submitted 3 January, 2010; originally announced January 2010.