Showing 1–3 of 3 results for author: Siegel, Z S

  1. arXiv:2407.01502  [pdf, other]

    cs.LG cs.AI

    AI Agents That Matter

    Authors: Sayash Kapoor, Benedikt Stroebl, Zachary S. Siegel, Nitya Nadgir, Arvind Narayanan

    Abstract: AI agents are an exciting new research direction, and agent development is driven by benchmarks. Our analysis of current agent benchmarks and evaluation practices reveals several shortcomings that hinder their usefulness in real-world applications. First, there is a narrow focus on accuracy without attention to other metrics. As a result, SOTA agents are needlessly complex and costly, and the comm…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2312.08566  [pdf, other]

    cs.AI cs.CL cs.RO

    Learning adaptive planning representations with natural language guidance

    Authors: Lionel Wong, Jiayuan Mao, Pratyusha Sharma, Zachary S. Siegel, Jiahai Feng, Noa Korneev, Joshua B. Tenenbaum, Jacob Andreas

    Abstract: Effective planning in the real world requires not only world knowledge, but the ability to leverage that knowledge to build the right representation of the task at hand. Decades of hierarchical planning techniques have used domain-specific temporal action abstractions to support efficient and accurate planning, almost always relying on human priors and domain knowledge to decompose hard tasks into…

    Submitted 13 December, 2023; originally announced December 2023.

  3. arXiv:2206.05794  [pdf, other]

    cs.LG stat.ML

    Characterizing the Implicit Bias of Regularized SGD in Rank Minimization

    Authors: Tomer Galanti, Zachary S. Siegel, Aparna Gupte, Tomaso Poggio

    Abstract: We study the bias of Stochastic Gradient Descent (SGD) to learn low-rank weight matrices when training deep neural networks. Our results show that training neural networks with mini-batch SGD and weight decay causes a bias towards rank minimization over the weight matrices. Specifically, we show, both theoretically and empirically, that this bias is more pronounced when using smaller batch sizes,…

    Submitted 25 October, 2023; v1 submitted 12 June, 2022; originally announced June 2022.