Showing 1–1 of 1 results for author: Chen, B K

  1. arXiv:2406.02847  [pdf, other]

    cs.LG stat.ML

    Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers

    Authors: Brian K Chen, Tianyang Hu, Hui **, Hwee Kuan Lee, Kenji Kawaguchi

    Abstract: In-Context Learning (ICL) has been a powerful emergent property of large language models that has attracted increasing attention in recent years. In contrast to regular gradient-based learning, ICL is highly interpretable and does not require parameter updates. In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias ter…

    Submitted 6 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024