Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Authors:
Mor Geva,
Avi Caciularu,
Kevin Ro Wang,
Yoav Goldberg
Abstract:
Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood. In this work, we make a substantial step towards unveiling this underlying prediction process, by reverse-engineering the operation of the feed-forward network (FFN) layers, one of the building blocks of transformer models. We view the token representation as a changing distribution over the vocabulary, and the output from each FFN layer as an additive update to that distribution. Then, we analyze the FFN updates in the vocabulary space, showing that each update can be decomposed into sub-updates corresponding to single FFN parameter vectors, each promoting concepts that are often human-interpretable. We then leverage these findings for controlling LM predictions, where we reduce the toxicity of GPT2 by almost 50%, and for improving computational efficiency with a simple early exit rule, saving 20% of computation on average.
Submitted 12 October, 2022; v1 submitted 28 March, 2022;
originally announced March 2022.
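The decomposition described in the feed-forward layer abstract above can be illustrated with a toy sketch. This is not the authors' code: the names `E`, `K`, `V`, `m`, the toy sizes, and the GELU activation are assumptions chosen for illustration, and biases and layer normalization are left out so that the vocabulary-space update is exactly linear.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, vocab = 8, 16, 20            # toy sizes (assumed)

E = rng.normal(size=(vocab, d_model))       # hypothetical output embedding matrix
K = rng.normal(size=(d_ff, d_model))        # first FFN matrix, one row per hidden unit
V = rng.normal(size=(d_ff, d_model))        # second FFN matrix, one "value" vector per unit
x = rng.normal(size=d_model)                # token representation entering the FFN


def gelu(z):
    """Tanh approximation of GELU (the activation is an assumption here)."""
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z**3)))


m = gelu(K @ x)                             # one scalar coefficient per hidden unit
sub_updates = m[:, None] * V                # sub-update i is m[i] * V[i]
ffn_out = sub_updates.sum(axis=0)           # FFN output is the sum of sub-updates

logits_before = E @ x                       # representation read out in vocabulary space
logits_after = E @ (x + ffn_out)            # after the additive FFN update
per_sub_logits = sub_updates @ E.T          # each sub-update's effect on every token


def top_tokens(i, k=3):
    """Indices of the k tokens most promoted by value vector i."""
    return np.argsort(-(E @ V[i]))[:k]


print(np.allclose(logits_after, logits_before + per_sub_logits.sum(axis=0)))
print(top_tokens(0))
```

Because the readout `E @ (...)` is linear in this simplified setting, the change in logits splits exactly into one term per value vector, which is what makes it possible to inspect the tokens each vector promotes.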
Study of two-subband population in Fe-doped AlxGa1-xN/GaN heterostructures by persistent photoconductivity effect
Authors:
Ikai Lo,
J. K. Tsai,
M. H. Gau,
Y. L. Chen,
Z. J. Chang,
W. T. Wang,
J. C. Chiang,
K. R. Wang,
Chun-Nan Chen,
T. Aggerstam
Abstract:
The electronic properties of Fe-doped Al0.31Ga0.69N/GaN heterostructures have been studied by Shubnikov-de Haas measurement. Two subbands of the two-dimensional electron gas at the hetero-interface were populated. After low-temperature illumination, the electron density increases from 11.99 × 10^12 cm^-2 to 13.40 × 10^12 cm^-2 for the first subband and from 0.66 × 10^12 cm^-2 to 0.94 × 10^12 cm^-2 for the second subband. The persistent photoconductivity effect (~13% increase) is mostly attributed to the Fe-related deep-donor level in the GaN layer. The second subband starts to populate when the first subband is filled at a density n1 = 9.40 × 10^12 cm^-2. We calculate the energy separation between the first and second subbands to be 105 meV.
Submitted 14 September, 2006;
originally announced September 2006.
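The subband separation quoted in the abstract above can be roughly cross-checked with the textbook relation for a parabolic two-dimensional electron gas: the second subband begins to fill when the Fermi energy of the first reaches it, so E2 - E1 ≈ πħ² n1 / m*. The sketch below is an assumption-laden estimate, not the authors' calculation. The effective mass of 0.22 m0 is a commonly quoted GaN value that the abstract does not state, and the function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J*s
M0 = 9.1093837015e-31        # free-electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C (J per eV)


def subband_separation_meV(n1_cm2, m_eff_ratio=0.22):
    """Estimate E2 - E1 in meV from the density n1 (in cm^-2) at which
    the second subband starts to populate, assuming a parabolic band.

    m_eff_ratio is m*/m0; 0.22 is an assumed value for GaN.
    """
    n1_m2 = n1_cm2 * 1e4                                  # cm^-2 -> m^-2
    energy_joule = math.pi * HBAR**2 * n1_m2 / (m_eff_ratio * M0)
    return energy_joule / E_CHARGE * 1e3                  # J -> meV


estimate = subband_separation_meV(9.40e12)
print(f"{estimate:.1f} meV")
```

With these assumptions the estimate lands within a few meV of the 105 meV reported. The remaining gap is sensitive to the effective mass used and to any nonparabolicity correction, neither of which is given in the abstract.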