Showing 1–10 of 10 results for author: Havrilla, A

  1. arXiv:2403.04642  [pdf, other]

    cs.LG

    Teaching Large Language Models to Reason with Reinforcement Learning

    Authors: Alex Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has emerged as a dominant approach for aligning LLM outputs with human preferences. Inspired by the success of RLHF, we study the performance of multiple algorithms that learn from feedback (Expert Iteration, Proximal Policy Optimization (PPO), Return-Conditioned RL) on improving LLM reasoning capabilities. We investigate both spa…

    Submitted 7 March, 2024; originally announced March 2024.

  2. arXiv:2402.10963  [pdf, other]

    cs.CL cs.LG

    GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements

    Authors: Alex Havrilla, Sharath Raparthy, Christoforus Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Roberta Raileanu

    Abstract: State-of-the-art language models can exhibit impressive reasoning refinement capabilities on math, science or coding tasks. However, recent work demonstrates that even the best models struggle to identify when and where to refine without access to external feedback. Outcome-based Reward Models (ORMs), trained to predict correctness of the final answer indicating when to refine, o…

    Submitted 24 June, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  3. arXiv:2402.04004  [pdf, other]

    cs.LG

    Understanding the Effect of Noise in LLM Training Data with Algorithmic Chains of Thought

    Authors: Alex Havrilla, Maia Iyer

    Abstract: During both pretraining and fine-tuning, Large Language Models (LLMs) are trained on trillions of tokens of text of widely varying quality. Both phases of training typically involve heuristically filtering out "low-quality" or noisy training samples, yet little is known quantitatively about how the type or intensity of noise affects downstream performance. In this work, we stud…

    Submitted 8 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  4. arXiv:2401.06144  [pdf, other]

    cs.CV cs.LG

    DFU: scale-robust diffusion model for zero-shot super-resolution image generation

    Authors: Alex Havrilla, Kevin Rojas, Wenjing Liao, Molei Tao

    Abstract: Diffusion generative models have achieved remarkable success in generating images with a fixed resolution. However, existing models have limited ability to generalize to different resolutions when training data at those resolutions are not available. Leveraging techniques from operator learning, we present a novel deep-learning architecture, Dual-FNO UNet (DFU), which approximates the score operat…

    Submitted 22 January, 2024; v1 submitted 30 November, 2023; originally announced January 2024.

  5. arXiv:2307.13692  [pdf, other]

    cs.CL cs.LG

    ARB: Advanced Reasoning Benchmark for Large Language Models

    Authors: Tomohiro Sawada, Daniel Paleka, Alexander Havrilla, Pranav Tadepalli, Paula Vidas, Alexander Kranias, John J. Nay, Kshitij Gupta, Aran Komatsuzaki

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance on various quantitative reasoning and knowledge benchmarks. However, many of these benchmarks are losing utility as LLMs get increasingly high scores, despite not yet reaching expert performance in these domains. We introduce ARB, a novel benchmark composed of advanced reasoning problems in multiple fields. ARB presents a more c…

    Submitted 27 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: Submitted to NeurIPS Datasets and Benchmarks Track

  6. arXiv:2303.09863  [pdf, other]

    stat.ML cs.LG

    Deep Nonparametric Estimation of Intrinsic Data Structures by Chart Autoencoders: Generalization Error and Robustness

    Authors: Hao Liu, Alex Havrilla, Rongjie Lai, Wenjing Liao

    Abstract: Autoencoders have demonstrated remarkable success in learning low-dimensional latent features of high-dimensional data across various applications. Assuming that data are sampled near a low-dimensional manifold, we employ chart autoencoders, which encode data into low-dimensional latent features on a collection of charts, preserving the topology and geometry of the data manifold. Our paper establi…

    Submitted 25 October, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

  7. arXiv:2302.13183  [pdf, other]

    stat.ML cs.LG

    On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds

    Authors: Biraj Dahal, Alex Havrilla, Minshuo Chen, Tuo Zhao, Wenjing Liao

    Abstract: Generative networks have experienced great empirical successes in distribution learning. Many existing experiments have demonstrated that generative networks can generate high-dimensional complex data from a low-dimensional easy-to-sample distribution. However, this phenomenon cannot be justified by existing theories. The widely held manifold hypothesis speculates that real-world data sets, such…

    Submitted 25 February, 2023; originally announced February 2023.

  8. arXiv:2210.07792  [pdf, other]

    cs.CL

    Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning

    Authors: Louis Castricato, Alexander Havrilla, Shahbuland Matiana, Michael Pieler, Anbang Ye, Ian Yang, Spencer Frazier, Mark Riedl

    Abstract: Controlled automated story generation seeks to generate natural language stories satisfying constraints from natural language critiques or preferences. Existing methods to control for story preference utilize prompt engineering which is labor intensive and often inconsistent. They may also use logit-manipulation methods which require annotated datasets to exist for the desired attributes. To addre…

    Submitted 15 December, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

  9. arXiv:2102.09500  [pdf, ps, other]

    math.PR

    Khinchin-type inequalities via Hadamard's factorisation

    Authors: Alex Havrilla, Piotr Nayar, Tomasz Tkocz

    Abstract: We prove Khinchin-type inequalities with sharp constants for type L random variables and all even moments. Our main tool is Hadamard's factorisation theorem from complex analysis, combined with Newton's inequalities for elementary symmetric functions. Besides the case of independent summands, we also treat ferromagnetic dependencies in a nonnegative external magnetic field (thanks to Newman's gene…

    Submitted 11 October, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: Final version. To appear in Int. Math. Res. Not. IMRN

  10. arXiv:1912.13345  [pdf, ps, other]

    math.PR

    Sharp Khinchin-type inequalities for symmetric discrete uniform random variables

    Authors: Alex Havrilla, Tomasz Tkocz

    Abstract: We establish several optimal moment comparison inequalities (Khinchin-type inequalities) for weighted sums of independent identically distributed symmetric discrete random variables which are uniform on sets of consecutive integers. Specifically, we obtain sharp constants for the second moment and any moment of order at least 3 (using convex dominance by Gaussian random variables). In the case of…

    Submitted 7 December, 2020; v1 submitted 31 December, 2019; originally announced December 2019.

    Comments: Revised (exposition shortened; L1-L2 inequality generalised to arbitrary symmetric distributions with large atom at 0; results for even moments will appear elsewhere). 12 pages

    MSC Class: 60E15; 26D15

    Journal ref: Israel J. Math. 246 (2021), no. 1, 281-297