Showing 1–8 of 8 results for author: Neo, C

  1. arXiv:2407.01082  [pdf, other]

    cs.CL

    Min P Sampling: Balancing Creativity and Coherence at High Temperature

    Authors: Minh Nguyen, Andrew Baker, Andreas Kirsch, Clement Neo

    Abstract: Large Language Models (LLMs) generate longform text by successively sampling the next token based on the probability distribution of the token vocabulary at each decoding step. Current popular truncation sampling methods such as top-$p$ sampling, also known as nucleus sampling, often struggle to balance coherence and creativity in generating text, particularly when using higher temperatures. To ad…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 8 Pages

  2. arXiv:2402.15055  [pdf, other]

    cs.CL cs.AI cs.LG

    Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions

    Authors: Clement Neo, Shay B. Cohen, Fazl Barez

    Abstract: In this paper, we investigate the interplay between attention heads and specialized "next-token" neurons in the Multilayer Perceptron that predict specific tokens. By prompting an LLM like GPT-4 to explain these model internals, we can elucidate attention mechanisms that activate certain next-token neurons. Our analysis identifies attention heads that recognize contexts relevant to predicting a pa…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 15 pages, 11 figures

  3. arXiv:2402.02619  [pdf, other]

    cs.LG cs.CL

    Increasing Trust in Language Models through the Reuse of Verified Circuits

    Authors: Philip Quirke, Clement Neo, Fazl Barez

    Abstract: Language Models (LMs) are increasingly used for a wide range of prediction tasks, but their training can often neglect rare edge cases, reducing their reliability. Here, we define a stringent standard of trustworthiness whereby the task algorithm and circuit implementation must be verified, accounting for edge cases, with no known failure modes. We show that a model can be trained to meet this sta…

    Submitted 11 July, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: 8 pages, 4 figures, 5 tables

  4. arXiv:2310.08164  [pdf, other]

    cs.LG

    Beyond Training Objectives: Interpreting Reward Model Divergence in Large Language Models

    Authors: Luke Marks, Amir Abdullah, Clement Neo, Rauno Arike, Philip Torr, Fazl Barez

    Abstract: Large language models (LLMs) fine-tuned by reinforcement learning from human feedback (RLHF) are becoming more widely deployed. We coin the term $\textit{Implicit Reward Model}$ (IRM) to refer to the changes that occur to an LLM during RLHF that result in high-reward generations. We interpret IRMs, and measure their divergence from the RLHF reward model used in the fine-tuning process that induced…

    Submitted 7 February, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: 19 pages, 5 figures

  5. Silicon Nanoantenna Mix Arrays for a Trifecta of Quantum Emitter Enhancements

    Authors: Zhaogang Dong, Sergey Gorelik, Ramón Paniagua-Dominguez, Johnathan Yik, Jinfa Ho, Febiana Tjiptoharsono, Emmanuel Lassalle, Soroosh Daqiqeh Rezaei, Darren C. J. Neo, Ping Bai, Arseniy I. Kuznetsov, Joel K. W. Yang

    Abstract: Dielectric nanostructures have demonstrated optical antenna effects due to Mie resonances. Preliminary investigations on dielectric nanoantennas have been carried out for a trifecta of enhancements, i.e., simultaneous enhancements in absorption, emission directionality and radiative decay rates of quantum emitters. However, these investigations are limited by fragile substrates or low Purcell fact…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: 20 pages, 4 figures

    Journal ref: Nano Letters 21, 4853-4860 (2021)

  6. arXiv:2209.15585  [pdf, other]

    physics.ao-ph cs.LG

    Cloud Classification with Unsupervised Deep Learning

    Authors: Takuya Kurihana, Ian Foster, Rebecca Willett, Sydney Jenkins, Kathryn Koenig, Ruby Werman, Ricardo Barros Lourenco, Casper Neo, Elisabeth Moyer

    Abstract: We present a framework for cloud characterization that leverages modern unsupervised deep learning technologies. While previous neural network-based cloud classification models have used supervised learning methods, unsupervised learning allows us to avoid restricting the model to artificial categories based on historical cloud classification schemes and enables the discovery of novel, more detail…

    Submitted 30 September, 2022; originally announced September 2022.

    Comments: 5 pages, 6 figures, Proceedings for Climate Informatics Workshop 2019 Paris

  7. Hybrid Dielectric-Plasmonic Nanoantenna with Multiresonances for Subwavelength Photon Sources

    Authors: Pavel A. Dmitriev, Emmanuel Lassalle, Lu Ding, Zhenying Pan, Darren C. J. Neo, Vytautas Valuckas, Ramón Paniagua-Dominguez, Joel K. W. Yang, Hilmi Volkan Demir, Arseniy I. Kuznetsov

    Abstract: The enhancement of the photoluminescence of quantum dots induced by an optical nanoantenna has been studied considerably, but there is still significant interest in optimizing and miniaturizing such structures, especially when accompanied by an experimental demonstration. Most of the realizations use plasmonic platforms, and some also use all-dielectric nanoantennas, but hybrid dielectric-plasmoni…

    Submitted 28 February, 2023; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: 39 pages, 4 figures

  8. arXiv:1909.04253  [pdf, other]

    cond-mat.soft physics.app-ph physics.flu-dyn

    Mapping micron-scale wetting properties of superhydrophobic surfaces

    Authors: Dan Daniel, Chee Leng Lay, Anqi Sng, Corryl Jing Jun Lee, Darren Chi Jin Neo, Xing Yi Ling, Nikodem Tomczak

    Abstract: There is a huge interest in developing super-repellent surfaces for anti-fouling and heat transfer applications. To characterize the wetting properties of such surfaces, the most common approach is to place a millimetric-sized droplet and measure its contact angles. The adhesion and friction forces can then be indirectly inferred from the Furmidge's relation. While easy to implement, contact angle…

    Submitted 9 September, 2019; originally announced September 2019.