Showing 1–20 of 20 results for author: Smith, M J

Searching in archive cs.
  1. arXiv:2405.14930  [pdf, other]

    astro-ph.IM astro-ph.GA cs.LG

    AstroPT: Scaling Large Observation Models for Astronomy

    Authors: Michael J. Smith, Ryan J. Roberts, Eirini Angeloudi, Marc Huertas-Company

    Abstract: This work presents AstroPT, an autoregressive pretrained transformer developed with astronomical use-cases in mind. The AstroPT models presented here have been pretrained on 8.6 million $512 \times 512$ pixel $grz$-band galaxy postage stamp observations from the DESI Legacy Survey DR8. We train a selection of foundation models of increasing size from 1 million to 2.1 billion parameters, and find t…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 12 pages, 4 figures, 1 table. Code available at https://github.com/Smith42/astroPT

  2. arXiv:2401.01916  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA astro-ph.SR cs.CL cs.LG

    AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets

    Authors: Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charlie O'Neill, Maja Jablonska, Zechang Sun, Michael J. Smith, Huiling Liu, Kevin Schawinski, Kartheik Iyer, Ioana Ciucă for UniverseTBD

    Abstract: We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training. By employing a compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of astronomy corpora -- comprising abstracts, introductions, and conclusions -- we achieve notable improvements in specialized topic comprehension. While general LLMs like…

    Submitted 5 January, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: 4 pages, 1 figure, model is available at https://huggingface.co/universeTBD, published in RNAAS

  3. arXiv:2309.07207  [pdf, other]

    cs.LG physics.geo-ph

    EarthPT: a time series foundation model for Earth Observation

    Authors: Michael J. Smith, Luke Fleming, James E. Geach

    Abstract: We introduce EarthPT -- an Earth Observation (EO) pretrained transformer. EarthPT is a 700 million parameter decoding transformer foundation model trained in an autoregressive self-supervised manner and developed specifically with EO use-cases in mind. We demonstrate that EarthPT is an effective forecaster that can accurately predict future pixel-level surface reflectances across the 400-2300 nm r…

    Submitted 11 January, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: 7 pages, 4 figures, accepted to NeurIPS CCAI workshop at https://www.climatechange.ai/papers/neurips2023/2 . Code available at https://github.com/aspiaspace/EarthPT

  4. arXiv:2302.12537  [pdf, other]

    cs.LG cs.AI

    Why Target Networks Stabilise Temporal Difference Methods

    Authors: Mattie Fellows, Matthew J. A. Smith, Shimon Whiteson

    Abstract: Integral to recent successes in deep reinforcement learning has been a class of temporal difference methods that use infrequently updated target values for policy evaluation in a Markov Decision Process. Yet a complete theoretical explanation for the effectiveness of target networks remains elusive. In this work, we provide an analysis of this popular class of algorithms, to finally answer the que…

    Submitted 11 August, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

    Comments: Found a small error in Appendix (Proposition 1, Appendix B3, penultimate line) that affects results presented in the original submission. These have been fixed and this version is the one accepted at ICML 2023

    Journal ref: ICML 2023

  5. arXiv:2302.08091  [pdf, other]

    cs.CL

    Do We Still Need Clinical Language Models?

    Authors: Eric Lehman, Evan Hernandez, Diwakar Mahajan, Jonas Wulff, Micah J. Smith, Zachary Ziegler, Daniel Nadler, Peter Szolovits, Alistair Johnson, Emily Alsentzer

    Abstract: Although recent advances in scaling large language models (LLMs) have resulted in improvements on many NLP tasks, it remains unclear whether these models trained primarily with general web text are the right tool in highly specialized, safety critical domains such as clinical text. Recent results have suggested that LLMs encode a surprising amount of medical knowledge. This raises an important que…

    Submitted 16 February, 2023; originally announced February 2023.

  6. arXiv:2211.03796  [pdf, other]

    astro-ph.IM cs.LG

    Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy

    Authors: Michael J. Smith, James E. Geach

    Abstract: In this review, we explore the historical development and future prospects of artificial intelligence (AI) and deep learning in astronomy. We trace the evolution of connectionism in astronomy through its three waves, from the early use of multilayer perceptrons, to the rise of convolutional and recurrent neural networks, and finally to the current era of unsupervised and generative deep learning m…

    Submitted 12 May, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

    Comments: 75 pages, 327 references, 32 figures. Review accepted in Royal Society Open Science

  7. arXiv:2111.01713  [pdf, other]

    astro-ph.IM astro-ph.GA cs.LG

    Realistic galaxy image simulation via score-based generative models

    Authors: Michael J. Smith, James E. Geach, Ryan A. Jackson, Nikhil Arora, Connor Stone, Stéphane Courteau

    Abstract: We show that a Denoising Diffusion Probabilistic Model (DDPM), a class of score-based generative model, can be used to produce realistic mock images that mimic observations of galaxies. Our method is tested with Dark Energy Spectroscopic Instrument (DESI) grz imaging of galaxies from the Photometry and Rotation curve OBservations from Extragalactic Surveys (PROBES) sample and galaxies selected fro…

    Submitted 31 January, 2022; v1 submitted 2 November, 2021; originally announced November 2021.

    Comments: 11 pages, 8 figures. Code: https://github.com/smith42/astroddpm . Follow the Twitter bot @ThisIsNotAnApod for DDPM-generated APODs

  8. arXiv:2103.15787  [pdf, other]

    cs.HC

    Meeting in the notebook: a notebook-based environment for micro-submissions in data science collaborations

    Authors: Micah J. Smith, Jürgen Cito, Kalyan Veeramachaneni

    Abstract: Developers in data science and other domains frequently use computational notebooks to create exploratory analyses and prototype models. However, they often struggle to incorporate existing software engineering tooling into these notebook-based workflows, leading to fragile development processes. We introduce Assemblé, a new development environment for collaborative data science projects, in which…

    Submitted 29 March, 2021; originally announced March 2021.

  9. arXiv:2012.07816  [pdf, other]

    cs.LG cs.HC cs.SE

    Enabling Collaborative Data Science Development with the Ballet Framework

    Authors: Micah J. Smith, Jürgen Cito, Kelvin Lu, Kalyan Veeramachaneni

    Abstract: While the open-source software development model has led to successful large-scale collaborations in building software systems, data science projects are frequently developed by individuals or small teams. We describe challenges to scaling data science collaborations and present a conceptual framework and ML programming model to address them. We instantiate these ideas in Ballet, a lightweight fra…

    Submitted 22 October, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

    Journal ref: Proc. ACM Hum.-Comput. Interact. 5, CSCW2, Article 431 (October 2021), 39 pages

  10. arXiv:2010.10777  [pdf, other]

    cs.LG cs.AI

    AutoML to Date and Beyond: Challenges and Opportunities

    Authors: Shubhra Kanti Karmaker Santu, Md. Mahadi Hassan, Micah J. Smith, Lei Xu, ChengXiang Zhai, Kalyan Veeramachaneni

    Abstract: As big data becomes ubiquitous across domains, and more and more stakeholders aspire to make the most of their data, demand for machine learning tools has spurred researchers to explore the possibilities of automated machine learning (AutoML). AutoML tools aim to make machine learning accessible for non-machine learning experts (domain experts), to improve the efficiency of machine learning, and t…

    Submitted 19 May, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: 35 pages, survey article, 3 figures

    ACM Class: I.2

  11. arXiv:2010.00622  [pdf, other]

    astro-ph.IM astro-ph.GA cs.LG

    Pix2Prof: fast extraction of sequential information from galaxy imagery via a deep natural language 'captioning' model

    Authors: Michael J. Smith, Nikhil Arora, Connor Stone, Stéphane Courteau, James E. Geach

    Abstract: We present 'Pix2Prof', a deep learning model that can eliminate any manual steps taken when extracting galaxy profiles. We argue that a galaxy profile of any sort is conceptually similar to a natural language image caption. This idea allows us to leverage image captioning methods from the field of natural language processing, and so we design Pix2Prof as a float sequence 'captioning' model suitabl…

    Submitted 28 April, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

    Comments: Accepted for publication in MNRAS. 10 pages, and 8 figures. Code: https://github.com/Smith42/pix2prof

  12. arXiv:1905.08942  [pdf, other]

    cs.SE cs.LG stat.ML

    The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development

    Authors: Micah J. Smith, Carles Sala, James Max Kanter, Kalyan Veeramachaneni

    Abstract: As machine learning is applied more widely, data scientists often struggle to find or create end-to-end machine learning systems for specific tasks. The proliferation of libraries and frameworks and the complexity of the tasks have led to the emergence of "pipeline jungles" - brittle, ad hoc ML systems. To address these problems, we introduce the Machine Learning Bazaar, a new framework for develo…

    Submitted 7 April, 2020; v1 submitted 21 May, 2019; originally announced May 2019.

    Comments: To appear in SIGMOD '20

    Journal ref: In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (SIGMOD '20). Association for Computing Machinery, New York, NY, USA, 785-800

  13. arXiv:1902.05009  [pdf, other]

    cs.LG cs.HC stat.ML

    ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning

    Authors: Qianwen Wang, Yao Ming, Zhihua **, Qiaomu Shen, Dongyu Liu, Micah J. Smith, Kalyan Veeramachaneni, Huamin Qu

    Abstract: To relieve the pain of manually selecting machine learning algorithms and tuning hyperparameters, automated machine learning (AutoML) methods have been developed to automatically search for good models. Due to the huge model search space, it is impossible to try all models. Users tend to distrust automatic results and increase the search budget as much as they can, thereby undermining the efficien…

    Submitted 13 February, 2019; originally announced February 2019.

    Comments: Published in the ACM Conference on Human Factors in Computing Systems (CHI), 2019, Glasgow, Scotland UK

    Journal ref: In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). Association for Computing Machinery, New York, NY, USA, Paper 681, 1-12

  14. arXiv:1610.08171  [pdf, other]

    cs.LO q-bio.QM

    MELA: Modelling in Ecology with Location Attributes

    Authors: Ludovica Luisa Vissat, Jane Hillston, Glenn Marion, Matthew J. Smith

    Abstract: Ecology studies the interactions between individuals, species and the environment. The ability to predict the dynamics of ecological systems would support the design and monitoring of control strategies and would help to address pressing global environmental issues. It is also important to plan for efficient use of natural resources and maintenance of critical ecosystem services. The mathematical…

    Submitted 26 October, 2016; originally announced October 2016.

    Comments: In Proceedings QAPL'16, arXiv:1610.07696

    Journal ref: EPTCS 227, 2016, pp. 82-97

  15. Adaptive Frequency Cepstral Coefficients for Word Mispronunciation Detection

    Authors: Zhenhao Ge, Sudhendu R. Sharma, Mark J. T. Smith

    Abstract: Systems based on automatic speech recognition (ASR) technology can provide important functionality in computer assisted language learning applications. This is a young but growing area of research motivated by the large number of students studying foreign languages. Here we propose a Hidden Markov Model (HMM)-based method to detect mispronunciations. Exploiting the specific dialog scripting employ…

    Submitted 25 February, 2016; originally announced February 2016.

    Comments: 4th International Congress on Image and Signal Processing (CISP) 2011

  16. arXiv:1602.08128  [pdf, ps, other]

    cs.SD cs.CL cs.LG

    PCA Method for Automated Detection of Mispronounced Words

    Authors: Zhenhao Ge, Sudhendu R. Sharma, Mark J. T. Smith

    Abstract: This paper presents a method for detecting mispronunciations with the aim of improving Computer Assisted Language Learning (CALL) tools used by foreign language learners. The algorithm is based on Principal Component Analysis (PCA). It is hierarchical with each successive step refining the estimate to classify the test word as being either mispronounced or correct. Preprocessing before detection,…

    Submitted 25 February, 2016; originally announced February 2016.

    Comments: SPIE Defense, Security, and Sensing

  17. PCA/LDA Approach for Text-Independent Speaker Recognition

    Authors: Zhenhao Ge, Sudhendu R. Sharma, Mark J. T. Smith

    Abstract: Various algorithms for text-independent speaker recognition have been developed through the decades, aiming to improve both accuracy and efficiency. This paper presents a novel PCA/LDA-based approach that is faster than traditional statistical model-based methods and achieves competitive results. First, the performance based on only PCA and only LDA is measured; then a mixed model, taking advantag…

    Submitted 25 February, 2016; originally announced February 2016.

    Comments: Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series

  18. arXiv:1602.05292  [pdf, other]

    cs.CL cs.AI

    Authorship Attribution Using a Neural Network Language Model

    Authors: Zhenhao Ge, Yufang Sun, Mark J. T. Smith

    Abstract: In practice, training language models for individual authors is often expensive because of limited data resources. In such cases, Neural Network Language Models (NNLMs) generally outperform the traditional non-parametric N-gram models. Here we investigate the performance of a feed-forward NNLM on an authorship attribution problem, with moderate author set size and relatively limited data. We also…

    Submitted 16 February, 2016; originally announced February 2016.

    Comments: Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI'16)

  19. arXiv:1312.6122  [pdf, other]

    physics.soc-ph cond-mat.dis-nn cs.SI physics.data-an

    Shadow networks: Discovering hidden nodes with models of information flow

    Authors: James P. Bagrow, Suma Desu, Morgan R. Frank, Narine Manukyan, Lewis Mitchell, Andrew Reagan, Eric E. Bloedorn, Lashon B. Booker, Luther K. Branting, Michael J. Smith, Brian F. Tivnan, Christopher M. Danforth, Peter S. Dodds, Joshua C. Bongard

    Abstract: Complex, dynamic networks underlie many systems, and understanding these networks is the concern of a great span of important scientific and engineering problems. Quantitative description is crucial for this understanding yet, due to a range of measurement problems, many real network datasets are incomplete. Here we explore how accidentally missing or deliberately hidden nodes may be detected in n…

    Submitted 20 December, 2013; originally announced December 2013.

    Comments: 12 pages, 3 figures

  20. arXiv:1209.6578  [pdf, other]

    cs.LO cs.FL

    Roadmap Document on Stochastic Analysis

    Authors: Bo Friis Nielsen, Flemming Nielson, Henrik Pilegaard, Michael James Andrew Smith, Ender Yüksel, Kebin Zeng, Lijun Zhang

    Abstract: This document was prepared as part of the MT-LAB research centre. The research centre studies the Modelling of Information Technology and is a VKR Centre of Excellence funded for five years by the VILLUM Foundation. You can read more about MT-LAB at its webpage www.MT-LAB.dk. The goal of the document is to serve as an introduction to new PhD students addressing the research goals of MT-LAB. As s…

    Submitted 27 September, 2012; originally announced September 2012.

    Comments: This work has been supported by MT-LAB, a VKR Centre of Excellence for the Modelling of Information Technology