
Showing 1–2 of 2 results for author: Cholak, P

  1. arXiv:2301.10743  [pdf, other]

    cs.LG cs.FL cs.LO

    Tighter Bounds on the Expressivity of Transformer Encoders

    Authors: David Chiang, Peter Cholak, Anand Pillay

    Abstract: Characterizing neural networks in terms of better-understood formal systems has the potential to yield new insights into the power and limitations of these networks. Doing so for transformers remains an active area of research. Bhattamishra and others have shown that transformer encoders are at least as expressive as a certain kind of counter machine, while Merrill and Sabharwal have shown that fi…

    Submitted 13 November, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: Presented at ICML 2023. Typo corrections in Appendix B and Section 8.1

  2. arXiv:2202.12172  [pdf, other]

    cs.LG cs.CL

    Overcoming a Theoretical Limitation of Self-Attention

    Authors: David Chiang, Peter Cholak

    Abstract: Although transformers are remarkably effective for many tasks, there are some surprisingly easy-looking regular languages that they struggle with. Hahn shows that for languages where acceptance depends on a single input symbol, a transformer's classification decisions become less and less confident (that is, with cross-entropy approaching 1 bit per string) as input strings get longer and longer. W…

    Submitted 24 February, 2022; originally announced February 2022.

    Comments: Accepted at ACL 2022