Skip to main content

Showing 1–2 of 2 results for author: Perkowski, E

Searching in archive cs. Search in all archives.
.
  1. arXiv:2401.01916  [pdf, other

    astro-ph.IM astro-ph.CO astro-ph.GA astro-ph.SR cs.CL cs.LG

    AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets

    Authors: Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charlie O'Neill, Maja Jablonska, Zechang Sun, Michael J. Smith, Huiling Liu, Kevin Schawinski, Kartheik Iyer, Ioana Ciucă for UniverseTBD

    Abstract: We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training. By employing a compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of astronomy corpora -- comprising abstracts, introductions, and conclusions -- we achieve notable improvements in specialized topic comprehension. While general LLMs like… ▽ More

    Submitted 5 January, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: 4 pages, 1 figure, model is available at https://huggingface.co/universeTBD, published in RNAAS

  2. arXiv:2309.06126  [pdf, other

    astro-ph.IM astro-ph.CO astro-ph.GA astro-ph.HE cs.CL cs.LG

    AstroLLaMA: Towards Specialized Foundation Models in Astronomy

    Authors: Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciucă, Charlie O'Neill, Ze-Chang Sun, Maja Jabłońska, Sandor Kruk, Ernest Perkowski, Jack Miller, Jason Li, Josh Peek, Kartheik Iyer, Tomasz Różański, Pranav Khetarpal, Sharaf Zaman, David Brodrick, Sergio J. Rodríguez Méndez, Thang Bui, Alyssa Goodman, Alberto Accomazzi, Jill Naiman, Jesse Cranney, Kevin Schawinski, UniverseTBD

    Abstract: Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than Llama-2, showing marke… ▽ More

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: 6 pages, 3 figures, submitted to IJCNLP-AACL 2023. Comments are welcome. The model can be found on Hugging Face - https://huggingface.co/universeTBD/astrollama