Showing 1–8 of 8 results for author: Sharifnassab, A

  1. arXiv:2405.00747  [pdf, other]

    cs.LG cs.AI

    Soft Preference Optimization: Aligning Language Models to Expert Distributions

    Authors: Arsalan Sharifnassab, Sina Ghiassian, Saber Salehkaleybar, Surya Kanoria, Dale Schuurmans

    Abstract: We propose Soft Preference Optimization (SPO), a method for aligning generative models, such as Large Language Models (LLMs), with human preferences, without the need for a reward model. SPO optimizes model outputs directly over a preference dataset through a natural loss function that integrates preference loss with a regularization term across the model's entire output distribution rather than l…

    Submitted 27 May, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

  2. arXiv:2402.02342  [pdf, other]

    cs.LG cs.AI math.OC

    MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters

    Authors: Arsalan Sharifnassab, Saber Salehkaleybar, Richard Sutton

    Abstract: This paper addresses the challenge of optimizing meta-parameters (i.e., hyperparameters) in machine learning algorithms, a critical factor influencing training efficiency and model performance. Moving away from the computationally expensive traditional meta-parameter search methods, we introduce MetaOptimize framework that dynamically adjusts meta-parameters, particularly step sizes (also known as…

    Submitted 27 May, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  3. arXiv:2401.17401  [pdf, other]

    cs.LG cs.AI

    Step-size Optimization for Continual Learning

    Authors: Thomas Degris, Khurram Javed, Arsalan Sharifnassab, Yuxin Liu, Richard Sutton

    Abstract: In continual learning, a learner has to keep learning from the data over its whole life time. A key issue is to decide what knowledge to keep and what knowledge to let go. In a neural network, this can be implemented by using a step-size vector to scale how much gradient samples change network weights. Common algorithms, like RMSProp and Adam, use heuristics, specifically normalization, to adapt t…

    Submitted 30 January, 2024; originally announced January 2024.

  4. arXiv:2301.13757  [pdf, other]

    cs.LG cs.AI

    Toward Efficient Gradient-Based Value Estimation

    Authors: Arsalan Sharifnassab, Richard Sutton

    Abstract: Gradient-based methods for value estimation in reinforcement learning have favorable stability properties, but they are typically much slower than Temporal Difference (TD) learning methods. We study the root causes of this slowness and show that Mean Square Bellman Error (MSBE) is an ill-conditioned loss function in the sense that its Hessian has large condition-number. To resolve the adverse effe…

    Submitted 23 July, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

  5. arXiv:2108.08677  [pdf, other]

    cs.LG

    Order Optimal Bounds for One-Shot Federated Learning over non-Convex Loss Functions

    Authors: Arsalan Sharifnassab, Saber Salehkaleybar, S. Jamaloddin Golestani

    Abstract: We consider the problem of federated learning in a one-shot setting in which there are $m$ machines, each observing $n$ sample functions from an unknown distribution on non-convex loss functions. Let $F:[-1,1]^d\to\mathbb{R}$ be the expected loss function with respect to this unknown distribution. The goal is to find an estimate of the minimizer of $F$. Based on its observations, each machine gene…

    Submitted 6 February, 2024; v1 submitted 19 August, 2021; originally announced August 2021.

  6. arXiv:1911.00731  [pdf, other]

    cs.LG stat.ML

    Order Optimal One-Shot Distributed Learning

    Authors: Arsalan Sharifnassab, Saber Salehkaleybar, S. Jamaloddin Golestani

    Abstract: We consider distributed statistical optimization in one-shot setting, where there are $m$ machines each observing $n$ i.i.d. samples. Based on its observed samples, each machine then sends an $O(\log(mn))$-length message to a server, at which a parameter minimizing an expected loss is to be estimated. We propose an algorithm called Multi-Resolution Estimator (MRE) whose expected error is no larger…

    Submitted 2 November, 2019; originally announced November 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1905.04634

    Journal ref: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  7. arXiv:1905.04634  [pdf, other]

    cs.LG cs.DC stat.ML

    One-Shot Federated Learning: Theoretical Limits and Algorithms to Achieve Them

    Authors: Saber Salehkaleybar, Arsalan Sharifnassab, S. Jamaloddin Golestani

    Abstract: We consider distributed statistical optimization in one-shot setting, where there are $m$ machines each observing $n$ i.i.d. samples. Based on its observed samples, each machine sends a $B$-bit-long message to a server. The server then collects messages from all machines, and estimates a parameter that minimizes an expected convex loss function. We investigate the impact of communication constrain…

    Submitted 30 December, 2019; v1 submitted 11 May, 2019; originally announced May 2019.

  8. arXiv:1810.09180  [pdf, ps, other]

    cs.NI

    Fluctuation Bounds for the Max-Weight Policy, with Applications to State Space Collapse

    Authors: Arsalan Sharifnassab, John N. Tsitsiklis, S. Jamaloddin Golestani

    Abstract: We consider a multi-hop switched network operating under a Max-Weight (MW) scheduling policy, and show that the distance between the queue length process and a fluid solution remains bounded by a constant multiple of the deviation of the cumulative arrival process from its average. We then exploit this result to prove matching upper and lower bounds for the time scale over which additive state spa…

    Submitted 12 June, 2019; v1 submitted 22 October, 2018; originally announced October 2018.