Showing 1–2 of 2 results for author: Vaiana, M

  1. arXiv:2406.19552  [pdf, other]

    cs.CL cs.AI cs.LG

    Rethinking harmless refusals when fine-tuning foundation models

    Authors: Florin Pop, Judd Rosenblatt, Diogo Schwerz de Lucena, Michael Vaiana

    Abstract: In this paper, we investigate the degree to which fine-tuning in Large Language Models (LLMs) effectively mitigates versus merely conceals undesirable behavior. Through the lens of semi-realistic role-playing exercises designed to elicit such behaviors, we explore the response dynamics of LLMs post fine-tuning interventions. Our methodology involves prompting models for Chain-of-Thought (CoT) reas…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: ICLR 2024 AGI Workshop Poster

  2. arXiv:1803.03597  [pdf, other]

    physics.soc-ph cs.SI physics.data-an

    Resolution Limits for Detecting Community Changes in Multilayer Networks

    Authors: Michael Vaiana, Sarah Muldoon

    Abstract: Multilayer networks capture pairwise relationships between the components of complex systems across multiple modes or scales of interactions. An important meso-scale feature of these networks is measured through their community structure, which defines groups of strongly connected nodes that exist within and across network layers. Because interlayer edges can describe relationships between differen…

    Submitted 9 March, 2018; originally announced March 2018.