
Showing 1–8 of 8 results for author: Willes, J

  1. arXiv:2406.02969  [pdf, other]

    cs.LG cs.AI cs.CL q-fin.CP q-fin.MF

    Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models

    Authors: Raeid Saqur, Anastasis Kratsios, Florian Krach, Yannick Limmer, Jacob-Junqi Tian, John Willes, Blanka Horvath, Frank Rudzicz

    Abstract: We propose MoE-F -- a formalised mechanism for combining $N$ pre-trained expert Large Language Models (LLMs) in online time-series prediction tasks by adaptively forecasting the best weighting of LLM predictions at every time step. Our mechanism leverages the conditional information in each expert's running performance to forecast the best combination of LLMs for predicting the time series in its…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 29 pages, 5 Appendix sections

    MSC Class: 60J05; 60G35; 68T20; 68T42; 68T50

    ACM Class: I.2.6; I.2.7; G.3

  2. arXiv:2404.11599  [pdf, other]

    cs.LG cs.CV stat.ML

    Variational Bayesian Last Layers

    Authors: James Harrison, John Willes, Jasper Snoek

    Abstract: We introduce a deterministic variational formulation for training Bayesian last layer neural networks. This yields a sampling-free, single-pass model and loss that effectively improves uncertainty estimation. Our variational Bayesian last layer (VBLL) can be trained and evaluated with only quadratic complexity in last layer width, and is thus (nearly) computationally free to add to standard archit…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: International Conference on Learning Representations (ICLR) 2024

  3. arXiv:2312.03140  [pdf, other]

    cs.LG cs.AI cs.CL cs.DC

    FlexModel: A Framework for Interpretability of Distributed Large Language Models

    Authors: Matthew Choi, Muhammad Adil Asif, John Willes, David Emerson

    Abstract: With the growth of large language models, now incorporating billions of parameters, the hardware prerequisites for their training and deployment have seen a corresponding increase. Although existing tools facilitate model parallelization and distributed training, deeper model interactions, crucial for interpretability and responsible AI techniques, still demand thorough knowledge of distributed co…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 14 pages, 8 figures. To appear at the Socially Responsible Language Modelling Research (SoLaR) Workshop, 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  4. arXiv:2308.05711  [pdf, other]

    cs.LG eess.SY

    A Comparison of Classical and Deep Reinforcement Learning Methods for HVAC Control

    Authors: Marshall Wang, John Willes, Thomas Jiralerspong, Matin Moezzi

    Abstract: Reinforcement learning (RL) is a promising approach for optimizing HVAC control. RL offers a framework for improving system performance, reducing energy consumption, and enhancing cost efficiency. We benchmark two popular classical and deep RL methods (Q-Learning and Deep-Q-Networks) across multiple HVAC environments and explore the practical consideration of model hyper-parameter selection and re…

    Submitted 10 August, 2023; originally announced August 2023.

  5. arXiv:2209.12487  [pdf, other]

    cs.CE

    Tartarus: A Benchmarking Platform for Realistic And Practical Inverse Molecular Design

    Authors: AkshatKumar Nigam, Robert Pollice, Gary Tom, Kjell Jorner, John Willes, Luca A. Thiede, Anshul Kundaje, Alan Aspuru-Guzik

    Abstract: The efficient exploration of chemical space to design molecules with intended properties enables the accelerated discovery of drugs, materials, and catalysts, and is one of the most important outstanding challenges in chemistry. Encouraged by the recent surge in computer power and artificial intelligence development, many algorithms have been developed to tackle this problem. However, despite the…

    Submitted 11 October, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: 29+21 pages, 6+19 figures, 6+2 tables

  6. arXiv:2208.08041  [pdf, other]

    cs.CV

    InterTrack: Interaction Transformer for 3D Multi-Object Tracking

    Authors: John Willes, Cody Reading, Steven L. Waslander

    Abstract: 3D multi-object tracking (MOT) is a key problem for autonomous vehicles, required to perform well-informed motion planning in dynamic environments. Particularly for densely occupied scenes, associating existing tracks to new detections remains challenging as existing systems tend to omit critical contextual information. Our proposed solution, InterTrack, introduces the Interaction Transformer for…

    Submitted 6 May, 2023; v1 submitted 16 August, 2022; originally announced August 2022.

    Comments: Accepted to CRV 2023

  7. arXiv:2107.13682  [pdf, other]

    cs.CV

    Bayesian Embeddings for Few-Shot Open World Recognition

    Authors: John Willes, James Harrison, Ali Harakeh, Chelsea Finn, Marco Pavone, Steven Waslander

    Abstract: As autonomous decision-making agents move from narrow operating environments to unstructured worlds, learning systems must move from a closed-world formulation to an open-world and few-shot setting in which agents continuously learn new classes from small amounts of information. This stands in stark contrast to modern machine learning systems that are typically designed with a known set of classes…

    Submitted 5 October, 2022; v1 submitted 28 July, 2021; originally announced July 2021.

  8. Electron-cyclotron maser emission from white-dwarf pairs and white-dwarf planetary systems

    Authors: Andrew J. Willes, Kinwah Wu

    Abstract: By analogy to Jovian radio emissions powered by the electromagnetic interaction between Jupiter and its moons, we propose that close magnetic-nonmagnetic white-dwarf pairs and white-dwarf planetary systems are strong radio sources. A simple model is developed to predict the flux densities of radio emission generated by a loss-cone-driven electron-cyclotron maser. The radio emission from these sy…

    Submitted 27 February, 2003; originally announced February 2003.

    Comments: 13 pages, 5 figures, submitted to MNRAS

    Report number: USYD-03-328

    Journal ref: Mon.Not.Roy.Astron.Soc. 348 (2004) 285