Showing 1–15 of 15 results for author: Paleyes, A

Searching in archive cs.
  1. arXiv:2401.11370

    cs.SE cs.AI eess.SY

    Self-sustaining Software Systems (S4): Towards Improved Interpretability and Adaptation

    Authors: Christian Cabrera, Andrei Paleyes, Neil D. Lawrence

    Abstract: Software systems impact society at different levels as they pervasively solve real-world problems. Modern software systems are often so sophisticated that their complexity exceeds the limits of human comprehension. These systems must respond to changing goals, dynamic data, unexpected failures, and security threats, among other variable factors in real-world environments. Systems' complexity chall…

    Submitted 20 January, 2024; originally announced January 2024.

    Comments: Accepted at The 1st International Workshop New Trends in Software Architecture (SATrends) 2024

  2. arXiv:2311.15691

    cs.LG cs.CR cs.CY

    Automated discovery of trade-off between utility, privacy and fairness in machine learning models

    Authors: Bogdan Ficiu, Neil D. Lawrence, Andrei Paleyes

    Abstract: Machine learning models are deployed as a central component in decision making and policy operations with direct impact on individuals' lives. In order to act ethically and comply with government regulations, these models need to make fair decisions and protect the users' privacy. However, such requirements can come with decrease in models' performance compared to their potentially biased, privacy…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 3rd Workshop on Bias and Fairness in AI (BIAS), ECML 2023

  3. arXiv:2304.11987

    cs.SE cs.AI

    Causal fault localisation in dataflow systems

    Authors: Andrei Paleyes, Neil D. Lawrence

    Abstract: Dataflow computing was shown to bring significant benefits to multiple niches of systems engineering and has the potential to become a general-purpose paradigm of choice for data-driven application development. One of the characteristic features of dataflow computing is the natural access to the dataflow graph of the entire system. Recently it has been observed that these dataflow graphs can be tr…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: Accepted to EuroMLSys'23

  4. arXiv:2303.09552

    cs.SE cs.AI cs.LG

    Dataflow graphs as complete causal graphs

    Authors: Andrei Paleyes, Siyuan Guo, Bernhard Schölkopf, Neil D. Lawrence

    Abstract: Component-based development is one of the core principles behind modern software engineering practices. Understanding of causal relationships between components of a software system can yield significant benefits to developers. Yet modern software design approaches make it difficult to track and discover such relationships at system scale, which leads to growing intellectual debt. In this paper we…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted to 2nd International Conference on AI Engineering - Software Engineering for AI (CAIN 23)

  5. arXiv:2302.08436

    stat.ML cs.LG

    Trieste: Efficiently Exploring The Depths of Black-box Functions with TensorFlow

    Authors: Victor Picheny, Joel Berkeley, Henry B. Moss, Hrvoje Stojic, Uri Granta, Sebastian W. Ober, Artem Artemev, Khurram Ghani, Alexander Goodall, Andrei Paleyes, Sattar Vakili, Sergio Pascual-Diaz, Stratis Markou, Jixiang Qing, Nasrulloh R. B. S Loka, Ivo Couckuyt

    Abstract: We present Trieste, an open-source Python package for Bayesian optimization and active learning benefiting from the scalability and efficiency of TensorFlow. Our library enables the plug-and-play of popular TensorFlow-based models within sequential decision-making loops, e.g. Gaussian processes from GPflow or GPflux, or neural networks from Keras. This modular mindset is central to the package and…

    Submitted 16 February, 2023; originally announced February 2023.

  6. arXiv:2302.04810

    cs.SE cs.AI cs.LG

    Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective

    Authors: Christian Cabrera, Andrei Paleyes, Pierre Thodoroff, Neil D. Lawrence

    Abstract: Machine Learning models are being deployed as parts of real-world systems with the upsurge of interest in artificial intelligence. The design, implementation, and maintenance of such systems are challenged by real-world environments that produce larger amounts of heterogeneous data and users requiring increasingly faster responses with efficient resource consumption. These requirements push preval…

    Submitted 9 October, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Under review

  7. arXiv:2210.14665

    cs.LG cs.SE

    Desiderata for next generation of ML model serving

    Authors: Sherif Akoush, Andrei Paleyes, Arnaud Van Looveren, Clive Cox

    Abstract: Inference is a significant part of ML software infrastructure. Despite the variety of inference frameworks available, the field as a whole can be considered in its early days. This position paper puts forth a range of important qualities that next generation of inference platforms should be aiming for. We present our rationale for the importance of each quality, and discuss ways to achieve it in p…

    Submitted 22 November, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted at NeurIPS 2022 Workshop on Challenges in Deploying and Monitoring Machine Learning Systems

  8. arXiv:2206.13326

    cs.LG

    A penalisation method for batch multi-objective Bayesian optimisation with application in heat exchanger design

    Authors: Andrei Paleyes, Henry B. Moss, Victor Picheny, Piotr Zulawski, Felix Newman

    Abstract: We present HIghly Parallelisable Pareto Optimisation (HIPPO) -- a batch acquisition function that enables multi-objective Bayesian optimisation methods to efficiently exploit parallel processing resources. Multi-Objective Bayesian Optimisation (MOBO) is a very efficient tool for tackling expensive black-box problems. However, most MOBO algorithms are designed as purely sequential strategies, and e…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: ICML 2022 Workshop on Adaptive Experimental Design and Active Learning in the Real World

  9. arXiv:2204.12781

    cs.SE cs.LG

    An Empirical Evaluation of Flow Based Programming in the Machine Learning Deployment Context

    Authors: Andrei Paleyes, Christian Cabrera, Neil D. Lawrence

    Abstract: As use of data driven technologies spreads, software engineers are more often faced with the task of solving a business problem using data-driven methods such as machine learning (ML) algorithms. Deployment of ML within large software systems brings new challenges that are not addressed by standard engineering practices and as a result businesses observe high rate of ML deployment project failures…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: Accepted to CAIN 2022, 1st International Conference on AI Engineering - Software Engineering for AI. arXiv admin note: text overlap with arXiv:2108.04105

  10. arXiv:2110.13293

    cs.LG

    Emulation of physical processes with Emukit

    Authors: Andrei Paleyes, Mark Pullin, Maren Mahsereci, Cliff McCollum, Neil D. Lawrence, Javier Gonzalez

    Abstract: Decision making in uncertain scenarios is an ubiquitous challenge in real world systems. Tools to deal with this challenge include simulations to gather information and statistical emulation to quantify uncertainty. The machine learning community has developed a number of methods to facilitate decision making, but so far they are scattered in multiple different toolkits, and generally rely on a fi…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: Second Workshop on Machine Learning and the Physical Sciences, NeurIPS 2019

  11. arXiv:2108.04105

    cs.SE cs.LG

    Towards better data discovery and collection with flow-based programming

    Authors: Andrei Paleyes, Christian Cabrera, Neil D. Lawrence

    Abstract: Despite huge successes reported by the field of machine learning, such as voice assistants or self-driving cars, businesses still observe very high failure rate when it comes to deployment of ML in production. We argue that part of the reason is infrastructure that was not designed for data-oriented activities. This paper explores the potential of flow-based programming (FBP) for simplifying data…

    Submitted 25 October, 2021; v1 submitted 9 August, 2021; originally announced August 2021.

    Comments: Extended version. Short version is accepted to Data-Centric AI Workshop, NeurIPS 2021

  12. arXiv:2012.15471

    cs.LG

    Good practices for Bayesian Optimization of high dimensional structured spaces

    Authors: Eero Siivola, Javier Gonzalez, Andrei Paleyes, Aki Vehtari

    Abstract: The increasing availability of structured but high dimensional data has opened new opportunities for optimization. One emerging and promising avenue is the exploration of unsupervised methods for projecting structured high dimensional data into low dimensional continuous representations, simplifying the optimization problem and enabling the application of traditional optimization methods. However,…

    Submitted 6 January, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

  13. Challenges in Deploying Machine Learning: a Survey of Case Studies

    Authors: Andrei Paleyes, Raoul-Gabriel Urma, Neil D. Lawrence

    Abstract: In recent years, machine learning has transitioned from a field of academic research interest to a field capable of solving real-world business problems. However, the deployment of machine learning models in production systems can present a number of issues and concerns. This survey reviews published reports of deploying machine learning solutions in a variety of use cases, industries and applicat…

    Submitted 19 May, 2022; v1 submitted 18 November, 2020; originally announced November 2020.

    Comments: v3 accepted to publication at ACM Computer Surveys in 2022; v2 presented at The ML-Retrospectives, Surveys & Meta-Analyses Workshop, NeurIPS 2020

  14. arXiv:2005.11741

    stat.ML cs.LG

    Causal Bayesian Optimization

    Authors: Virginia Aglietti, Xiaoyu Lu, Andrei Paleyes, Javier González

    Abstract: This paper studies the problem of globally optimizing a variable of interest that is part of a causal model in which a sequence of interventions can be performed. This problem arises in biology, operational research, communications and, more generally, in all fields where the goal is to optimize an output metric of a system of interconnected nodes. Our approach combines ideas from causal inference…

    Submitted 26 May, 2020; v1 submitted 24 May, 2020; originally announced May 2020.

  15. arXiv:1905.10862

    stat.ML cs.LG

    Automatic Discovery of Privacy-Utility Pareto Fronts

    Authors: Brendan Avent, Javier Gonzalez, Tom Diethe, Andrei Paleyes, Borja Balle

    Abstract: Differential privacy is a mathematical framework for privacy-preserving data analysis. Changing the hyperparameters of a differentially private algorithm allows one to trade off privacy and utility in a principled way. Quantifying this trade-off in advance is essential to decision-makers tasked with deciding how much privacy can be provided in a particular application while maintaining acceptable…

    Submitted 21 July, 2020; v1 submitted 26 May, 2019; originally announced May 2019.

    Comments: Proceedings on Privacy Enhancing Technologies 2020