Search | arXiv e-print repository

DocuMint: Docstring Generation for Python using Small Language Models

Authors: Bibek Poudel, Adam Cook, Sekou Traore, Shelah Ameli

Abstract: Effective communication, specifically through documentation, is the beating heart of collaboration among contributors in software development. Recent advancements in language models (LMs) have enabled the introduction of a new type of actor in that ecosystem: LM-powered assistants capable of code generation, optimization, and maintenance. Our study investigates the efficacy of small language models (SLMs) for generating high-quality docstrings by assessing accuracy, conciseness, and clarity, benchmarking performance quantitatively through mathematical formulas and qualitatively through human evaluation using Likert scale. Further, we introduce DocuMint, as a large-scale supervised fine-tuning dataset with 100,000 samples. In quantitative experiments, Llama 3 8B achieved the best performance across all metrics, with conciseness and clarity scores of 0.605 and 64.88, respectively. However, under human evaluation, CodeGemma 7B achieved the highest overall score with an average of 8.3 out of 10 across all metrics. Fine-tuning the CodeGemma 2B model using the DocuMint dataset led to significant improvements in performance across all metrics, with gains of up to 22.5% in conciseness. The fine-tuned model and the dataset can be found in HuggingFace and the code can be found in the repository.

Submitted 16 May, 2024; originally announced May 2024.

Comments: 12 pages, 4 figures

arXiv:2308.06815 [pdf, other]

Optimizing the cloud? Don't train models. Build oracles!

Authors: Tiemo Bang, Conor Power, Siavash Ameli, Natacha Crooks, Joseph M. Hellerstein

Abstract: We propose cloud oracles, an alternative to machine learning for online optimization of cloud configurations. Our cloud oracle approach guarantees complete accuracy and explainability of decisions for problems that can be formulated as parametric convex optimizations. We give experimental evidence of this technique's efficacy and share a vision of research directions for expanding its applicability.

Submitted 22 December, 2023; v1 submitted 13 August, 2023; originally announced August 2023.

Comments: Camera-ready publication for CIDR'24: https://www.cidrdb.org/cidr2024/papers/p47-bang.pdf

arXiv:2304.11030 [pdf, other]

Analog Feedback-Controlled Memristor programming Circuit for analog Content Addressable Memory

Authors: Jiaao Yu, Paul-Philipp Manea, Sara Ameli, Mohammad Hizzani, Amro Eldebiky, John Paul Strachan

Abstract: Recent breakthroughs in associative memories suggest that silicon memories are coming closer to human memories, especially for memristive Content Addressable Memories (CAMs) which are capable to read and write in analog values. However, the Program-Verify algorithm, the state-of-the-art memristor programming algorithm, requires frequent switching between verifying and programming memristor conductance, which brings many defects such as high dynamic power and long programming time. Here, we propose an analog feedback-controlled memristor programming circuit that makes use of a novel look-up table-based (LUT-based) programming algorithm. With the proposed algorithm, the programming and the verification of a memristor can be performed in a single-direction sequential process. Besides, we also integrated a single proposed programming circuit with eight analog CAM (aCAM) cells to build an aCAM array. We present SPICE simulations on TSMC 28nm process. The theoretical analysis shows that 1. A memristor conductance within an aCAM cell can be converted to an output boundary voltage in aCAM searching operations and 2. An output boundary voltage in aCAM searching operations can be converted to a programming data line voltage in aCAM programming operations. The simulation results of the proposed programming circuit prove the theoretical analysis and thus verify the feasibility to program memristors without frequently switching between verifying and programming the conductance. Besides, the simulation results of the proposed aCAM array show that the proposed programming circuit can be integrated into a large array architecture.

Submitted 21 April, 2023; originally announced April 2023.

arXiv:2209.13775 [pdf, other]

doi 10.1007/978-3-319-04099-8_13

Development of an Efficient and Flexible Pipeline for Lagrangian Coherent Structure Computation

Authors: Siavash Ameli, Yogin Desai, Shawn C. Shadden

Abstract: The computation of Lagrangian coherent structures (LCS) has become a standard tool for the analysis of advective transport in unsteady flow applications. LCS identification is primarily accomplished by evaluating measures based on the finite-time Cauchy Green (CG) strain tensor over the fluid domain. Sampling the CG tensor requires the advection of large numbers of fluid tracers, which can be computationally intensive, but presents a large degree of data parallelism. Processing can be specialized to parallel computing architectures, but on the other hand, there is compelling need for robust and flexible implementations for end users. Specifically, code that can accommodate analysis of wide-ranging fluid mechanics applications, while using a modular structure that is easily extended or modified, and facilitates visualization is desirable. We discuss the use of Visualization Toolkit (VTK) libraries as a foundation for object-oriented LCS computation, and how this framework can facilitate integration of LCS computation into flow visualization software such as ParaView. We also discuss the development of CUDA GPU kernels for efficient parallel spatial sampling of the flow map, including optimizing these kernels for better utilization.

Submitted 27 September, 2022; originally announced September 2022.

Journal ref: Topological Methods in Data Analysis and Visualization III, Theory, Algorithms, and Applications (2014), pp. 201-215

arXiv:2207.08038 [pdf, other]

doi 10.1016/j.amc.2023.128032

A Singular Woodbury and Pseudo-Determinant Matrix Identities and Application to Gaussian Process Regression

Authors: Siavash Ameli, Shawn C. Shadden

Abstract: We study a matrix that arises from a singular form of the Woodbury matrix identity. We present generalized inverse and pseudo-determinant identities for this matrix, which have direct applications for Gaussian process regression, specifically its likelihood representation and precision matrix. We extend the definition of the precision matrix to the Bott-Duffin inverse of the covariance matrix, preserving properties related to conditional independence, conditional precision, and marginal precision. We also provide an efficient algorithm and numerical analysis for the presented determinant identities and demonstrate their advantages under specific conditions relevant to computing log-determinant terms in likelihood functions of Gaussian process regression.

Submitted 24 April, 2023; v1 submitted 16 July, 2022; originally announced July 2022.

Comments: Instructions for reproducing the data and results of this manuscript are available at https://ameli.github.io/detkit/benchmark.html in the form of a user guide

MSC Class: 15A10; 15-04; 62G08

Journal ref: Applied Mathematics and Computation, vol. 452, 2023, p. 128032

arXiv:2206.09976 [pdf, other]

Noise Estimation in Gaussian Process Regression

Authors: Siavash Ameli, Shawn C. Shadden

Abstract: We develop a computational procedure to estimate the covariance hyperparameters for semiparametric Gaussian process regression models with additive noise. Namely, the presented method can be used to efficiently estimate the variance of the correlated error, and the variance of the noise based on maximizing a marginal likelihood function. Our method involves suitably reducing the dimensionality of the hyperparameter space to simplify the estimation procedure to a univariate root-finding problem. Moreover, we derive bounds and asymptotes of the marginal likelihood function and its derivatives, which are useful to narrowing the initial range of the hyperparameter search. Using numerical examples, we demonstrate the computational advantages and robustness of the presented approach compared to traditional parameter optimization.

Submitted 20 June, 2022; originally announced June 2022.

MSC Class: 62J05; 62H12

Showing 1–6 of 6 results for author: Ameli, S