Search | arXiv e-print repository

Chat AI: A Seamless Slurm-Native Solution for HPC-Based Services

Authors: Ali Doosthosseini, Jonathan Decker, Hendrik Nolte, Julian M. Kunkel

Abstract: The increasing adoption of large language models (LLMs) has created a pressing need for an efficient, secure and private serving infrastructure, which allows researchers to run open-source or custom fine-tuned LLMs and ensures users that their data remains private and is not stored without their consent. While high-performance computing (HPC) systems equipped with state-of-the-art GPUs are well-suited for training LLMs, their batch scheduling paradigm is not designed to support real-time serving of AI applications. Cloud systems, on the other hand, are well suited for web services but commonly lack access to the computational power of clusters, especially expensive and scarce high-end GPUs, which are required for optimal inference speed. We propose an architecture with an implementation consisting of a web service that runs on a cloud VM with secure access to a scalable backend running a multitude of AI models on HPC systems. By offering a web service using our HPC infrastructure to host LLMs, we leverage the trusted environment of local universities and research centers to offer a private and secure alternative to commercial LLM services. Our solution natively integrates with Slurm, enabling seamless deployment on HPC clusters and is able to run side by side with regular Slurm workloads, while utilizing gaps in the schedule created by Slurm. In order to ensure the security of the HPC system, we use the SSH ForceCommand directive to construct a robust circuit breaker, which prevents successful attacks on the web-facing server from affecting the cluster. We have successfully deployed our system as a production service, and made the source code available at https://github.com/gwdg/chat-ai

Submitted 27 June, 2024; originally announced July 2024.

Comments: 27 pages, 5 figures, 2 tables

arXiv:2308.03861 [pdf, other]

High-Throughput and Accurate 3D Scanning of Cattle Using Time-of-Flight Sensors and Deep Learning

Authors: Gbenga Omotara, Seyed Mohamad Ali Tousi, Jared Decker, Derek Brake, Guilherme N. DeSouza

Abstract: We introduce a high throughput 3D scanning solution specifically designed to precisely measure cattle phenotypes. This scanner leverages an array of depth sensors, i.e. time-of-flight (Tof) sensors, each governed by dedicated embedded devices. The system excels at generating high-fidelity 3D point clouds, thus facilitating an accurate mesh that faithfully reconstructs the cattle geometry on the fly. In order to evaluate the performance of our system, we have implemented a two-fold validation process. Initially, we test the scanner's competency in determining volume and surface area measurements within a controlled environment featuring known objects. Secondly, we explore the impact and necessity of multi-device synchronization when operating a series of time-of-flight sensors. Based on the experimental results, the proposed system is capable of producing high-quality meshes of untamed cattle for livestock studies.

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2305.02697 [pdf, other]

DECICE: Device-Edge-Cloud Intelligent Collaboration Framework

Authors: Julian Kunkel, Christian Boehme, Jonathan Decker, Fabrizio Magugliani, Dirk Pleiter, Bastian Koller, Karthee Sivalingam, Sabri Pllana, Alexander Nikolov, Mujdat Soyturk, Christian Racca, Andrea Bartolini, Adrian Tate, Berkay Yaman

Abstract: DECICE is a Horizon Europe project that is developing an AI-enabled open and portable management framework for automatic and adaptive optimization and deployment of applications in computing continuum encompassing from IoT sensors on the Edge to large-scale Cloud / HPC computing infrastructures. In this paper, we describe the DECICE framework and architecture. Furthermore, we highlight use-cases for framework evaluation: intelligent traffic intersection, magnetic resonance imaging, and emergency response.

Submitted 4 May, 2023; originally announced May 2023.

arXiv:2203.03544 [pdf, other]

Online Adaptable Bug Localization for Rapidly Evolving Software

Authors: Agnieszka Ciborowska, Michael J. Decker, Kostadin Damevski

Abstract: Bug localization aims to reduce debugging time by recommending program elements that are relevant for a specific bug report. To date, researchers have primarily addressed this problem by applying different information retrieval techniques that leverage similarities between a given bug report and source code. However, with modern software development trending towards increased speed of software change and continuous delivery to the user, the current generation of bug localization techniques, which cannot quickly adapt to the latest version of the software, is becoming inadequate. In this paper, we propose a technique for online bug localization, which enables rapidly updatable bug localization models. More specifically, we propose a streaming bug localization technique, based on an ensemble of online topic models, that is able to adapt to both specific (with explicit code mentions) and more abstract bug reports. By using changesets (diffs) as the input instead of a snapshot of the source code, the model naturally integrates defect prediction and co-change information into its prediction. Initial results indicate that the proposed approach improves bug localization performance for 42 out of 56 evaluation projects, with an average MAP improvement of 5.9%.

Submitted 7 March, 2022; originally announced March 2022.

arXiv:2109.00629 [pdf, other]

doi 10.1109/TSE.2021.3098242

An Ensemble Approach for Annotating Source Code Identifiers with Part-of-speech Tags

Authors: Christian D. Newman, Michael J. Decker, Reem S. AlSuhaibani, Anthony Peruma, Satyajit Mohapatra, Tejal Vishnoi, Marcos Zampieri, Mohamed W. Mkaouer, Timothy J. Sheldon, Emily Hill

Abstract: This paper presents an ensemble part-of-speech tagging approach for source code identifiers. Ensemble tagging is a technique that uses machine-learning and the output from multiple part-of-speech taggers to annotate natural language text at a higher quality than the part-of-speech taggers are able to obtain independently. Our ensemble uses three state-of-the-art part-of-speech taggers: SWUM, POSSE, and Stanford. We study the quality of the ensemble's annotations on five different types of identifier names: function, class, attribute, parameter, and declaration statement at the level of both individual words and full identifier names. We also study and discuss the weaknesses of our tagger to promote the future amelioration of these problems through further research. Our results show that the ensemble achieves 75% accuracy at the identifier level and 84-86% accuracy at the word level. This is an increase of +17% points at the identifier level from the closest independent part-of-speech tagger.

Submitted 1 September, 2021; originally announced September 2021.

Comments: 18 pages. arXiv admin note: text overlap with arXiv:2007.08033

Journal ref: in IEEE Transactions on Software Engineering, vol. , no. 01, pp. 1-1, 5555

arXiv:2102.13555 [pdf]

On the Naming of Methods: A Survey of Professional Developers

Authors: Reem S. AlSuhaibani, Christian D. Newman, Michael J. Decker, Michael L. Collard, Jonathan I. Maletic

Abstract: This paper describes the results of a large (+1100 responses) survey of professional software developers concerning standards for naming source code methods. The various standards for source code method names are derived from and supported in the software engineering literature. The goal of the survey is to determine if there is a general consensus among developers that the standards are accepted and used in practice. Additionally, the paper examines factors such as years of experience and programming language knowledge in the context of survey responses. The survey results show that participants very much agree about the importance of various standards and how they apply to names. Additionally, the survey shows that years of experience and the programming language the participants use has almost no effect on their responses.

Submitted 26 February, 2021; originally announced February 2021.

arXiv:2007.08033 [pdf, other]

doi 10.1016/j.jss.2020.110740

On the Generation, Structure, and Semantics of Grammar Patterns in Source Code Identifiers

Authors: Christian D. Newman, Reem S. AlSuhaibani, Michael J. Decker, Anthony Peruma, Dishant Kaushik, Mohamed Wiem Mkaouer, Emily Hill

Abstract: Identifiers make up a majority of the text in code. They are one of the most basic mediums through which developers describe the code they create and understand the code that others create. Therefore, understanding the patterns latent in identifier naming practices and how accurately we are able to automatically model these patterns is vital if researchers are to support developers and automated analysis approaches in comprehending and creating identifiers correctly and optimally. This paper investigates identifiers by studying sequences of part-of-speech annotations, referred to as grammar patterns. This work advances our understanding of these patterns and our ability to model them by 1) establishing common naming patterns in different types of identifiers, such as class and attribute names; 2) analyzing how different patterns influence comprehension; and 3) studying the accuracy of state-of-the-art techniques for part-of-speech annotations, which are vital in automatically modeling identifier naming patterns, in order to establish their limits and paths toward improvement. To do this, we manually annotate a dataset of 1,335 identifiers from 20 open-source systems and use this dataset to study naming patterns, semantics, and tagger accuracy.

Submitted 15 July, 2020; originally announced July 2020.

Comments: 69 pages, 3 figures, 16 tables

Journal ref: Journal of Systems and Software, 2020, 110740, ISSN 0164-1212

arXiv:1902.01769 [pdf, other]

Dungeon Crawl Stone Soup as an Evaluation Domain for Artificial Intelligence

Authors: Dustin Dannenhauer, Michael W. Floyd, Jonathan Decker, David W. Aha

Abstract: Dungeon Crawl Stone Soup is a popular, single-player, free and open-source rogue-like video game with a sufficiently complex decision space that makes it an ideal testbed for research in cognitive systems and, more generally, artificial intelligence. This paper describes the properties of Dungeon Crawl Stone Soup that are conducive to evaluating new approaches of AI systems. We also highlight an ongoing effort to build an API for AI researchers in the spirit of recent game APIs such as MALMO, ELF, and the Starcraft II API. Dungeon Crawl Stone Soup's complexity offers significant opportunities for evaluating AI and cognitive systems, including human user studies. In this paper we provide (1) a description of the state space of Dungeon Crawl Stone Soup, (2) a description of the components for our API, and (3) the potential benefits of evaluating AI agents in the Dungeon Crawl Stone Soup video game.

Submitted 5 February, 2019; originally announced February 2019.

Comments: AAAI-19 Workshop on Games and Simulations for Artificial Intelligence

arXiv:1810.08061 [pdf, ps, other]

AutoGraph: Imperative-style Coding with Graph-based Performance

Authors: Dan Moldovan, James M Decker, Fei Wang, Andrew A Johnson, Brian K Lee, Zachary Nado, D Sculley, Tiark Rompf, Alexander B Wiltschko

Abstract: There is a perceived trade-off between machine learning code that is easy to write, and machine learning code that is scalable or fast to execute. In machine learning, imperative style libraries like Autograd and PyTorch are easy to write, but suffer from high interpretive overhead and are not easily deployable in production or mobile settings. Graph-based libraries like TensorFlow and Theano benefit from whole-program optimization and can be deployed broadly, but make expressing complex models more cumbersome. We describe how the use of staged programming in Python, via source code transformation, offers a midpoint between these two library design patterns, capturing the benefits of both. A key insight is to delay all type-dependent decisions until runtime, via dynamic dispatch. We instantiate these principles in AutoGraph, a software system that improves the programming experience of the TensorFlow library, and demonstrate usability improvements with no loss in performance compared to native TensorFlow graphs. We also show that our system is backend agnostic, and demonstrate targeting an alternate IR with characteristics not found in TensorFlow graphs.

Submitted 26 March, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

arXiv:1803.10228 [pdf, other]

Demystifying Differentiable Programming: Shift/Reset the Penultimate Backpropagator

Authors: Fei Wang, Daniel Zheng, James Decker, Xilun Wu, Grégory M. Essertel, Tiark Rompf

Abstract: Deep learning has seen tremendous success over the past decade in computer vision, machine translation, and gameplay. This success rests in crucial ways on gradient-descent optimization and the ability to learn parameters of a neural network by backpropagating observed errors. However, neural network architectures are growing increasingly sophisticated and diverse, which motivates an emerging quest for even more general forms of differentiable programming, where arbitrary parameterized computations can be trained by gradient descent. In this paper, we take a fresh look at automatic differentiation (AD) techniques, and especially aim to demystify the reverse-mode form of AD that generalizes backpropagation in neural networks. We uncover a tight connection between reverse-mode AD and delimited continuations, which permits implementing reverse-mode AD purely via operator overloading and without any auxiliary data structures. We further show how this formulation of AD can be fruitfully combined with multi-stage programming (staging), leading to a highly efficient implementation that combines the performance benefits of deep learning frameworks based on explicit reified computation graphs (e.g., TensorFlow) with the expressiveness of pure library approaches (e.g., PyTorch).

Submitted 28 August, 2019; v1 submitted 27 March, 2018; originally announced March 2018.

arXiv:1703.08219 [pdf, other]

Flare: Native Compilation for Heterogeneous Workloads in Apache Spark

Authors: Grégory M. Essertel, Ruby Y. Tahboub, James M. Decker, Kevin J. Brown, Kunle Olukotun, Tiark Rompf

Abstract: The need for modern data analytics to combine relational, procedural, and map-reduce-style functional processing is widely recognized. State-of-the-art systems like Spark have added SQL front-ends and relational query optimization, which promise an increase in expressiveness and performance. But how good are these extensions at extracting high performance from modern hardware platforms? While Spark has made impressive progress, we show that for relational workloads, there is still a significant gap compared with best-of-breed query engines. And when stepping outside of the relational world, query optimization techniques are ineffective if large parts of a computation have to be treated as user-defined functions (UDFs). We present Flare: a new back-end for Spark that brings performance closer to the best SQL engines, without giving up the added expressiveness of Spark. We demonstrate order of magnitude speedups both for relational workloads such as TPC-H, as well as for a range of machine learning kernels that combine relational and iterative functional processing. Flare achieves these results through (1) compilation to native code, (2) replacing parts of the Spark runtime system, and (3) extending the scope of optimization and code generation to large classes of UDFs.

Submitted 23 March, 2017; originally announced March 2017.

Showing 1–11 of 11 results for author: Decker, J